Anomaly Detection Using Pagerank Algorithm

Deepak Shrivastava; Somesh Kumar Dewangan

doi:https://doi.org/10.14445/22312803/IJCTT-V4I9P152

Research Article | Open Access

Volume 4 | Issue 9 | Year 2013 | Article Id. IJCTT-V4I9P152 | DOI : https://doi.org/10.14445/22312803/IJCTT-V4I9P152

Citation :

Deepak Shrivastava, Somesh Kumar Dewangan, "Anomaly Detection Using Pagerank Algorithm," International Journal of Computer Trends and Technology (IJCTT), vol. 4, no. 9, pp. 3247-3254, 2013. Crossref, https://doi.org/10.14445/22312803/IJCTT-V4I9P152

Abstract

Anomaly detection techniques are widely used in a variety of applications. We explore an approach to anomaly detection that combines proximity graphs with the PageRank algorithm. We run a variant of PageRank on a proximity graph whose vertices are the data points, which yields a score quantifying the extent to which each data point is anomalous. Approaches in this direction typically require first forming a density estimate from the training data, a step that is computationally intensive for high-dimensional data sets. Under mild assumptions and with appropriately chosen parameters, we show that PageRank produces point-wise consistent probability density estimates for the data points in an asymptotic sense, with reduced computational effort. As a result, substantial improvements in running time are observed while similar detection performance is maintained. The approach is computationally tractable and scales well to large, high-dimensional data sets.
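The idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the Gaussian-kernel edge weights, the `bandwidth` and `damping` values, and the function name `pagerank_on_proximity_graph` are all assumptions chosen for illustration. The sketch builds a fully connected proximity graph over the data points, turns it into a random walk, and runs standard PageRank by power iteration. Points with low PageRank sit in sparse regions and are treated as candidate anomalies.

```python
import numpy as np


def pagerank_on_proximity_graph(X, bandwidth=1.0, damping=0.85,
                                tol=1e-10, max_iter=1000):
    """Return the PageRank vector of a Gaussian-kernel proximity graph on X.

    Illustrative assumption: low PageRank is read as "more anomalous".
    """
    n = X.shape[0]

    # Pairwise squared Euclidean distances between all data points.
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # clip tiny negative round-off

    # Edge weights of the proximity graph (assumed Gaussian kernel),
    # with self-loops removed.
    W = np.exp(-sq_dists / (2.0 * bandwidth ** 2))
    np.fill_diagonal(W, 0.0)

    # Row-normalise into a transition matrix. A vertex whose weights all
    # underflow to zero jumps uniformly at random instead.
    row_sums = W.sum(axis=1, keepdims=True)
    dangling = row_sums == 0.0
    P = np.where(dangling, 1.0 / n, W / np.where(dangling, 1.0, row_sums))

    # Power iteration for PageRank with uniform teleportation.
    rank = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        new_rank = damping * (P.T @ rank) + (1.0 - damping) / n
        converged = np.abs(new_rank - rank).sum() < tol
        rank = new_rank
        if converged:
            break
    return rank


# Usage: a tight cluster plus one far-away point appended as the last row.
rng = np.random.default_rng(0)
cluster = rng.normal(loc=0.0, scale=1.0, size=(50, 2))
outlier = np.array([[8.0, 8.0]])
data = np.vstack([cluster, outlier])

ranks = pagerank_on_proximity_graph(data)
most_anomalous = int(np.argmin(ranks))  # index of the lowest-PageRank point
```

In this toy setting the isolated point receives almost no incoming probability mass from the cluster, so its PageRank stays close to the teleportation floor `(1 - damping) / n`. The dense `n x n` kernel matrix is used only for brevity; a sparse nearest-neighbor graph would be the natural substitute for large data sets.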

Keywords

Anomaly Detection, Proximity Graph, Personalized Page-Rank
