Nearest Neighbour Based Outlier Detection Techniques

International Journal of Computer Trends and Technology (IJCTT)          
 
© - Issue 2012 by IJCTT Journal
Volume-3 Issue-2                           
Year of Publication : 2012
Authors: Dr. Shuchita Upadhyaya, Karanjit Singh

MLA

Dr. Shuchita Upadhyaya, Karanjit Singh. "Nearest Neighbour Based Outlier Detection Techniques". International Journal of Computer Trends and Technology (IJCTT), V3(2):295-299 Issue 2012. ISSN 2231-2803. www.ijcttjournal.org. Published by Seventh Sense Research Group.

Abstract: Outlier detection is an important research area that forms part of many application domains. Some application domains call for specific detection techniques, while more generic techniques can be applied in a large number of scenarios with good results. This survey provides a structured and comprehensive overview of the research on nearest neighbor based outlier detection, listing the techniques applicable to our area of research. We focus on the underlying approach adopted by each technique and identify the key assumptions each uses to differentiate between normal and outlier behavior. When applying a given technique to a particular domain, these assumptions can serve as guidelines for assessing the effectiveness of the technique in that domain. We present a basic outlier detection technique and then show how the existing techniques in this category are variants of it. This template gives an easier and more succinct understanding of the nearest neighbor based techniques. We further identify the advantages and disadvantages of the various nearest neighbor based techniques, and we discuss their computational complexity, since it is an important issue in our application domain. We hope that this survey will provide a better understanding of the different directions in which research on this topic has been done, and of how techniques developed in this area can be applied in other domains for which they were not originally intended.
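
The abstract does not spell out the basic technique, so the following is only an illustrative sketch of one common nearest neighbor baseline, not necessarily the template used in the paper. It assumes that normal points lie in dense neighborhoods and scores each point by the distance to its k-th nearest neighbor, so a larger score suggests a more outlying point. The function name `knn_outlier_scores`, the parameter `k`, and the sample data are hypothetical choices made for this example.

```python
import math


def knn_outlier_scores(points, k):
    """Score each point by the distance to its k-th nearest neighbour.

    Hypothetical helper for illustration; a larger score means the point
    sits in a sparser region and is more likely to be an outlier.
    """
    if not 1 <= k < len(points):
        raise ValueError("k must satisfy 1 <= k < number of points")
    scores = []
    for i, p in enumerate(points):
        # Brute-force distances from p to every other point.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


# Made-up sample: four points on a unit square plus one distant point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
scores = knn_outlier_scores(points, k=2)

# Index of the point with the largest score, i.e. the strongest outlier candidate.
most_outlying = max(range(len(points)), key=scores.__getitem__)
print(most_outlying, scores)
```

This brute-force version compares every pair of points, so its cost grows quadratically with the number of points, which is the kind of computational concern the abstract raises.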

Keywords: Outlier, Outlier Detection, Nearest Neighbour Concept, Multivariate, Algorithms, Data Mining, Nearest Neighbor based Outlier Detection, K-Nearest Neighbor, LOF Neighborhood, COF Neighborhood