Important Considerations for Maximizing Performance and Ensuring Uninterrupted Operation with the Cassandra Database

© 2023 by IJCTT Journal
Volume-71 Issue-9
Year of Publication : 2023
Authors : Venugopal Thati
DOI :  10.14445/22312803/IJCTT-V71I9P103

How to Cite?

Venugopal Thati, "Important Considerations for Maximizing Performance and Ensuring Uninterrupted Operation with the Cassandra Database," International Journal of Computer Trends and Technology, vol. 71, no. 9, pp. 15-21, 2023. Crossref, https://doi.org/10.14445/22312803/IJCTT-V71I9P103

Abstract
As enterprises pursue digital transformation and offer applications to customers worldwide, keeping systems online 24/7 with no downtime is becoming increasingly important. These systems must also scale as demand grows and perform well enough to meet business needs. The database is a key component of any enterprise application, and it is harder to scale than front-end web apps or backend services that maintain no state. Distributed NoSQL databases, which promise to solve availability, scalability, and reliability challenges, have grown tremendously over the last decade. Several top-tier companies have successfully built business solutions on these distributed databases, and some of them contribute new features to these databases. This article covers important facets of achieving zero unscheduled downtime, fault tolerance, high availability, and scalability with Cassandra. Cassandra can be self-managed or used as a managed service from public cloud providers. The insights shared in the article come from research and practical experience in self-managing a geographically distributed cluster and designing applications to use the Cassandra database efficiently.

Keywords
Availability, Cassandra, Distributed databases, NoSQL, Scalability.
