Clickstream Analysis using Hadoop

International Journal of Computer Trends and Technology (IJCTT)          
 
© 2016 by IJCTT Journal
Volume-34 Number-2
Year of Publication : 2016
Authors : Harshit Makhecha, Dharmendra Singh, Bhagirath Prajapati, Priyanka Puvar
DOI : 10.14445/22312803/IJCTT-V34P115

MLA

Harshit Makhecha, Dharmendra Singh, Bhagirath Prajapati, Priyanka Puvar "Clickstream Analysis using Hadoop". International Journal of Computer Trends and Technology (IJCTT) V34(2):89-92, April 2016. ISSN:2231-2803. www.ijcttjournal.org. Published by Seventh Sense Research Group.

Abstract -
E-Commerce websites generate huge volumes of data because a large number of transactions take place every second, so their inventory must be updated quickly as transactions occur if they are to remain stable in this competitive market. Analyzing web log files has become an important task for E-Commerce companies that want to predict customer behavior. Clickstream data is a very important part of big data marketing, as it shows what customers click on and what they purchase (or do not purchase). The primary focus of the paper is to prepare a web log analysis system that depicts trends based on users' browsing mode using Hadoop MapReduce and that handles heterogeneous query execution on log files.
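
As an illustration of the map and reduce pattern that such a log analysis relies on, the following is a minimal sketch in plain Python rather than the paper's Hadoop implementation. It assumes the web log uses the Common Log Format, which the abstract does not state, and the names `map_line`, `reduce_counts`, `count_page_hits`, and the sample log lines are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: Common Log Format. The paper does not specify its log layout.
LOG_PATTERN = re.compile(
    r'^(\S+) \S+ \S+ \[[^\]]+\] "(\S+) (\S+)[^"]*" (\d{3}) \S+'
)


def map_line(line):
    """Map step: emit (url, 1) for each request line that parses."""
    match = LOG_PATTERN.match(line)
    if match:
        yield match.group(3), 1


def reduce_counts(pairs):
    """Shuffle and reduce step: sum the counts emitted for each URL."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def count_page_hits(lines):
    """Run the map step over every line, then reduce the emitted pairs."""
    pairs = (pair for line in lines for pair in map_line(line))
    return reduce_counts(pairs)


# Hypothetical sample data, for illustration only.
SAMPLE_LOG = [
    '10.0.0.1 - - [01/Apr/2016:10:00:00 +0000] "GET /product/42 HTTP/1.1" 200 512',
    '10.0.0.2 - - [01/Apr/2016:10:00:01 +0000] "GET /cart HTTP/1.1" 200 128',
    '10.0.0.1 - - [01/Apr/2016:10:00:05 +0000] "GET /product/42 HTTP/1.1" 200 512',
    'malformed line',  # skipped by the map step
]

if __name__ == "__main__":
    print(count_page_hits(SAMPLE_LOG))
```

In a real Hadoop job the map and reduce functions would run on separate nodes and the framework would perform the grouping by key. Here everything runs in one process to keep the sketch self-contained.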
