Survey Mining High Utility Patterns In One Phase Without Generating Candidates

International Journal of Computer Trends and Technology (IJCTT)          
© 2016 by IJCTT Journal
Volume-41 Number-2
Year of Publication : 2016
Authors : Dr. P.Sengottuvelan, Prof. S. Joseph Gabriel


Dr. P.Sengottuvelan, Prof. S. Joseph Gabriel "Survey Mining High Utility Patterns In One Phase Without Generating Candidates". International Journal of Computer Trends and Technology (IJCTT) V41(2):67-76, November 2016. ISSN:2231-2803. Published by Seventh Sense Research Group.

Abstract -
Utility mining is a new development of data mining technology. Among utility mining problems, utility mining with the itemset share framework is a hard one, as no anti-monotonicity property holds with the interestingness measure. Earlier works on this problem all employ a two-phase, candidate generation approach, with one exception that is however inefficient and not scalable with large databases. The two-phase approach suffers from scalability problems due to the huge number of candidates. This paper proposes a novel algorithm that finds high utility patterns in a single phase without generating candidates. The novelties lie in a high utility pattern growth approach, a look-ahead strategy, and a linear data structure. Concretely, our pattern growth approach is to search a reverse set enumeration tree and to prune the search space by utility upper bounding. We also look ahead to identify high utility patterns without enumeration by way of a closure property and a singleton property. Our linear data structure enables us to compute a tight bound for powerful pruning and to directly identify high utility patterns in an efficient and scalable way, which targets the root cause of the problem with prior algorithms. Extensive experiments on sparse and dense, synthetic and real-world data suggest that our algorithm is up to one to three orders of magnitude more efficient and is more scalable than the state-of-the-art algorithms.
Mining high utility itemsets from a transactional database refers to the discovery of itemsets with high utility, such as profit. Although a number of relevant approaches have been proposed in recent years, they incur the problem of producing a large number of candidate itemsets for high utility itemsets. Such a large number of candidate itemsets degrades mining performance in terms of execution time and space requirement. The situation may become worse when the database contains many long transactions or long high utility itemsets. To overcome these limitations, this paper proposes two algorithms, namely UP-Growth and UP-Growth+, for mining high utility itemsets with a set of effective pruning strategies. The experimental results show that the proposed algorithms, especially UP-Growth+, require much less execution time and less memory when databases contain many long transactions.
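To make the abstract's central difficulty concrete, the following is a minimal Python sketch, not the algorithm proposed in the paper. It computes itemset utility on a small made-up database and enumerates high utility itemsets by brute force. The data, the names `TRANSACTIONS`, `PROFIT`, `utility`, `transaction_weighted_utility`, and `high_utility_itemsets`, and the threshold are all assumptions chosen for illustration. The transaction-weighted utility shown here is the classical upper bound used by two-phase methods; the tighter bound referred to in the abstract is not reproduced.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
TRANSACTIONS = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"a": 1, "b": 1, "c": 3},
    {"b": 4},
]
# Hypothetical unit profit of each item.
PROFIT = {"a": 5, "b": 1, "c": 2}


def utility(itemset, transactions=TRANSACTIONS, profit=PROFIT):
    """Sum quantity * profit of the itemset's items over every
    transaction that contains the whole itemset."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * profit[item] for item in itemset)
    return total


def transaction_weighted_utility(itemset, transactions=TRANSACTIONS, profit=PROFIT):
    """Sum the utility of entire transactions that contain the itemset.
    This overestimates the utility of the itemset and of all its supersets."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(qty * profit[item] for item, qty in t.items())
    return total


def high_utility_itemsets(min_util, transactions=TRANSACTIONS, profit=PROFIT):
    """Brute-force enumeration of all itemsets (exponential in the number
    of items); only meant to show what the mining task returns."""
    items = sorted(profit)
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            u = utility(combo, transactions, profit)
            if u >= min_util:
                result[combo] = u
    return result


if __name__ == "__main__":
    # {b} is below the threshold, yet its superset {a, b} is above it,
    # so utility is not anti-monotone and cannot prune like support does.
    print(utility(("b",)), utility(("a", "b")))
    print(high_utility_itemsets(13))
    # The transaction-weighted utility of {b} bounds every superset of {b}.
    print(transaction_weighted_utility(("b",)))
```

With a threshold of 13 in this toy database, the itemset {b} is not high utility while its superset {a, b} is, so a search cannot discard the supersets of a low utility itemset the way frequent-pattern mining discards the supersets of an infrequent one. This is why utility mining methods prune with an upper bound instead.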

