Applied Constraints on Sequential Pattern Mining with Prefixspan Algorithm

International Journal of Computer Trends and Technology (IJCTT)          
© 2018 by IJCTT Journal
Volume-57 Number-1
Year of Publication : 2018
Authors : Kalpesh A. Kshatriya, Bhavikkumar M. Patel, Hitesh B. Patel
DOI :  10.14445/22312803/IJCTT-V57P108


Kalpesh A. Kshatriya, Bhavikkumar M. Patel, Hitesh B. Patel, "Applied Constraints on Sequential Pattern Mining with Prefixspan Algorithm". International Journal of Computer Trends and Technology (IJCTT) V57(1):44-50, March 2018. ISSN:2231-2803. Published by Seventh Sense Research Group.

Sequential pattern mining is the process of applying data mining techniques to a sequence database in order to discover the correlations that exist among an ordered list of events. Here the PrefixSpan [1] sequential pattern mining algorithm is used to generate sequential patterns from the dataset. Applying any sequential pattern mining algorithm to a large dataset can generate a huge number of sequential patterns, which are hard for users to understand and use [2]. Users are often interested in only a small subset of such patterns, so by incorporating constraints into the sequential pattern mining algorithm we can restrict it from generating so many patterns. Here we study the Item, Duration and Length of Transaction constraints with the PrefixSpan algorithm in order to handle large databases. Fewer sequential patterns are generated when the PrefixSpan algorithm is used with the Item, Duration and Length of Transaction constraints.
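
The idea of pushing constraints into PrefixSpan can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes each sequence is a time-ordered list of (timestamp, item) events with one item per event, and it interprets the three constraints as follows: the item constraint keeps only patterns that contain a given item, the length constraint caps the number of items in a pattern, and the duration constraint caps the time between the first and last matched event. The function name and parameters (prefixspan_constrained, required_item, max_length, max_duration) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def prefixspan_constrained(db, min_support, required_item=None,
                           max_length=None, max_duration=None):
    """Pattern-growth mining with item, length and duration constraints.

    db is a list of sequences; each sequence is a list of (timestamp, item)
    pairs sorted by timestamp. Returns a list of (pattern, support) pairs.
    All constraint semantics here are assumptions made for this sketch.
    """
    results = []

    def mine(prefix, occurrences):
        # occurrences: set of (sequence id, next position, start time),
        # one entry per distinct way the prefix ends in a sequence.
        extensions = defaultdict(set)
        for sid, pos, start in occurrences:
            seq = db[sid]
            for i in range(pos, len(seq)):
                t, item = seq[i]
                first = t if start is None else start
                if max_duration is not None and t - first > max_duration:
                    break  # events are time-ordered, so later ones fail too
                extensions[item].add((sid, i + 1, first))

        for item in sorted(extensions):
            occs = extensions[item]
            support = len({sid for sid, _, _ in occs})
            if support < min_support:
                continue
            pattern = prefix + [item]
            # The item constraint is not anti-monotone, so it only filters
            # the output and does not prune the search.
            if required_item is None or required_item in pattern:
                results.append((pattern, support))
            # The length constraint prunes the search directly.
            if max_length is None or len(pattern) < max_length:
                mine(pattern, occs)

    mine([], {(sid, 0, None) for sid in range(len(db))})
    return results


# Toy sequence database (made-up data, for illustration only).
db = [
    [(1, "a"), (2, "b"), (9, "c")],
    [(1, "a"), (3, "c"), (4, "b")],
    [(1, "b"), (2, "a"), (3, "c")],
]

all_patterns = prefixspan_constrained(db, min_support=2)
constrained = prefixspan_constrained(db, min_support=2, required_item="c",
                                     max_length=2, max_duration=5)
print(len(all_patterns), len(constrained))
```

On this toy database the constrained run returns fewer patterns than the unconstrained one, which is the effect described above. The length and duration limits cut off the search early, while the item constraint only filters what is reported.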

[1] Han, Jiawei, Jian Pei, Behzad Mortazavi-Asl, Helen Pinto, Qiming Chen, Umeshwar Dayal, and M. C. Hsu. "Prefixspan: Mining sequential patterns efficiently by prefix-projected pattern growth." In proceedings of the 17th international conference on data engineering, Chicago pp. 215-224. 2001.
[2] Khan Irfan. "PrefixSpan Algorithm Based on Multiple Constraints for Mining Sequential Patterns." International Journal of Computer Science and Management Research, vol. 1 Issue 5 December 2012.
[3] Jiawei Han, Micheline Kamber, and Jian Pei. "Data Mining: Concepts and Techniques." The Morgan Kaufmann Series in Data Management Systems. 2012.
[4] Agrawal R. and Srikant R., "Mining sequential patterns", Proceedings of the Eleventh International Conference on Data Engineering, pp. 3-14, 6-10 Mar 1995.
[5] Agrawal R., Imielinski T. and Swami A.N., "Mining association rules between sets of items in large databases", In Proceedings of the ACM SIGMOD International Conference on Management of Data, Washington D.C., pp. 207-216, 1993.
[6] Sitanggang, Imas Sukaesih, N. A. Husin, A. Agustina, and Naghmeh Mahmoodian. "Sequential pattern mining on library transaction data." In Information Technology (ITSim), 2010 International Symposium in, vol. 1, pp. 1-4. IEEE, 2010.
[7] Srikant R. and Agrawal R., "Mining sequential patterns: Generalization and performance improvements", Proceedings of the 5th International Conference on Extending Database Technology, 1996, vol. 1057, pp. 3-17.
[8] Han J., Dong G., Mortazavi-Asl B., Chen Q., Dayal U., and Hsu M.-C., "Freespan: Frequent pattern-projected sequential pattern mining", Proceedings of the International Conference on Knowledge Discovery and Data Mining (KDD'00), 2000, pp. 355-359.
[9] S Vijayarani and S Deepa, "Sequential Pattern Mining - A Study". IJCA Proceedings on International Conference on Research Trends in Computer Technologies ICRTCT(1):14-18, February 2013.
[10] Chetna Chand, Amit Thakkar, and Amit Ganatra, "Sequential Pattern Mining: Survey and Current Research Challenges", International Journal of Soft Computing and Engineering (IJSCE) ISSN: 2231-2307, Volume-2, Issue-1, March 2012.
[11] Nizar R. Mabroukeh and C. I. Ezeife, "A Taxonomy of Sequential Pattern Mining Algorithms", ACM Computing Surveys, Vol. 43, No. 1, Article 3, November 2010.
[12] R. Agrawal and R. Srikant, "Fast Algorithms for Mining Association Rules," Proceedings of International Conference Very Large Data Bases (VLDB '94), pp. 487-499, Sept. 1994.
[13] Mallick Bhawna, Deepak Garg, and Preetam Singh Grover. "Constraint-Based Sequential Pattern Mining: A Pattern Growth Algorithm Incorporating Compactness, Length and Monetary." International Arab Journal of Information Technology, Vol. 11, No. 1 (2011).
[14] Chen, Yen-Liang, Mi-Hao Kuo, Shin-Yi Wu, and Kwei Tang. "Discovering recency, frequency, and monetary (RFM) sequential patterns from customers' purchasing data." Electronic Commerce Research and Applications 8, no. 5 (2009): 241-251.
[15] Boghey R. and Singh S., "Sequential Pattern Mining: A Survey on Approaches," IEEE International Conference on Communication Systems and Network Technologies (CSNT), pp. 670-674, Gwalior, 6-8 April 2013.
[16] Pei, Jian, Jiawei Han, and Wei Wang. "Constraint-based sequential pattern mining: the pattern-growth methods." Journal of Intelligent Information Systems 28, no. 2 (2007): 133-160.
[17] Morzy, Tadeusz, Marek Wojciechowski, and Maciej Zakrzewicz. "Efficient Constraint-Based Sequential Pattern Mining Using Dataset Filtering Techniques." In Databases and Information Systems II, pp. 297-309. Springer Netherlands, 2002.

pattern, mining, PrefixSpan, data