Survey On Leveraging Django and Redis Using Web Scrapping

© 2020 by IJCTT Journal
Volume-68 Issue-4
Year of Publication : 2020
Authors : Abhinav R, Abhinav Raman, Abilash R
DOI :  10.14445/22312803/IJCTT-V68I4P110

How to Cite?

Abhinav R, Abhinav Raman, Abilash R, "Survey On Leveraging Django and Redis Using Web Scrapping," International Journal of Computer Trends and Technology, vol. 68, no. 4, pp. 54-58, 2020. Crossref, https://doi.org/10.14445/22312803/IJCTT-V68I4P110

Abstract
Web scraping, also called web harvesting or web data extraction, is the extraction of data from websites. Web scraping software may access the World Wide Web directly using the Hypertext Transfer Protocol or through a web browser. With advances in web development, a variety of frameworks have come into use, and almost all websites are now dynamic, with their content served from a content management system (CMS). This makes data extraction difficult, since there is no common template to extract against. Hence we use RSS. RSS (originally RDF Site Summary; later, two competing approaches emerged, which used the backronyms Rich Site Summary and Really Simple Syndication, respectively) is a type of web feed that allows users and applications to access updates to websites in a standardized, computer-readable format. This project uses RSS to extract data from websites and serve it to users in a robust and easy way. What differentiates it is server-side caching: users are served almost instantaneously, without repeating the data extraction from the requested site on every request. This is done using Redis and Django.
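
The caching idea described in the abstract can be illustrated with a minimal cache-aside sketch. This is not the authors' implementation: the abstract does not give code, so every name here (TTLCache, parse_rss, get_feed_items, fake_fetch, the sample feed and its URL) is hypothetical. To keep the sketch runnable without a server, an in-memory dictionary with expiry stands in for Redis; in a Django project the same get/set-with-timeout pattern would typically go through Django's cache framework backed by Redis.

```python
import time
import xml.etree.ElementTree as ET

# Hypothetical sample feed; a real one would be fetched over HTTP.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <item><title>First post</title><link>https://example.com/1</link></item>
    <item><title>Second post</title><link>https://example.com/2</link></item>
  </channel>
</rss>"""


class TTLCache:
    """In-memory stand-in for Redis (assumption): get/set with expiry."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key, value, timeout):
        self._store[key] = (value, time.monotonic() + timeout)


def parse_rss(xml_text):
    """Extract the title and link of every <item> in an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    return [
        {"title": item.findtext("title"), "link": item.findtext("link")}
        for item in root.iter("item")
    ]


def get_feed_items(url, cache, fetch, timeout=300):
    """Cache-aside lookup: serve from cache, extract only on a miss."""
    key = "feed:" + url
    items = cache.get(key)
    if items is None:
        items = parse_rss(fetch(url))
        cache.set(key, items, timeout)
    return items


# Usage: a fake fetcher records how often the "site" is actually contacted.
calls = []


def fake_fetch(url):
    calls.append(url)
    return SAMPLE_RSS


cache = TTLCache()
first = get_feed_items("https://example.com/rss", cache, fake_fetch)
second = get_feed_items("https://example.com/rss", cache, fake_fetch)
```

The second call is answered from the cache, so the fetcher runs only once. The timeout of 300 seconds is an arbitrary illustrative value; the right expiry depends on how often the source feed changes.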

Keywords
Web Scraping, Survey
