International Journal of Computer Trends and Technology

Research Article | Open Access

Volume 72 | Issue 2 | Year 2024 | Article Id. IJCTT-V72I2P114 | DOI: https://doi.org/10.14445/22312803/IJCTT-V72I2P114

Overcoming Challenges in Deploying Large Language Models for Generative AI Use Cases: The Role of Containers and Orchestration


Sriramaraju Sagi

Received: 07 Jan 2024 | Revised: 07 Feb 2024 | Accepted: 19 Feb 2024 | Published: 29 Feb 2024

Citation:

Sriramaraju Sagi, "Overcoming Challenges in Deploying Large Language Models for Generative AI Use Cases: The Role of Containers and Orchestration," International Journal of Computer Trends and Technology (IJCTT), vol. 72, no. 2, pp. 75-81, 2024. Crossref, https://doi.org/10.14445/22312803/IJCTT-V72I2P114

Abstract

This research delves into deploying Large Language Models (LLMs) on converged infrastructure, specifically focusing on container technologies such as Kubernetes and OpenShift for orchestration purposes. It discusses the challenges involved in implementing LLMs, including scalability, performance issues, and security considerations, and argues that containers can effectively address these challenges. Additionally, it explores the benefits of using containers to deploy LLMs, such as scalability, optimized resource utilization, enhanced flexibility, increased portability, and strengthened security measures. Furthermore, it examines the role SUSE Rancher plays in managing containerized applications to ensure both security and scalability. The validation and analysis section assesses a study that uses the FlexPod infrastructure platform to evaluate LLMs across container orchestration platforms, demonstrating the practicality and advantages of integrating FlexPod Datacenter.
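To make the orchestration benefits the abstract describes concrete, the sketch below shows how an LLM inference service might be declared on Kubernetes: a Deployment that pins GPU and memory resources, paired with a HorizontalPodAutoscaler for elastic scaling. This is an illustrative fragment, not a configuration from the paper; the names, image, and resource figures are placeholder assumptions.

```yaml
# Hypothetical Deployment for a containerized LLM inference server.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: llm-inference            # placeholder name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: llm-inference
  template:
    metadata:
      labels:
        app: llm-inference
    spec:
      containers:
        - name: model-server
          image: registry.example.com/llm-server:latest  # placeholder image
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: "4"
              memory: 32Gi
            limits:
              nvidia.com/gpu: 1  # GPU scheduling requires the NVIDIA device plugin
---
# Autoscaler illustrating the elastic-scaling benefit discussed above.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: llm-inference-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: llm-inference
  minReplicas: 2
  maxReplicas: 8
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

Declaring resources and scaling policy in manifests like these is what lets an orchestrator (Kubernetes, OpenShift, or a Rancher-managed cluster) handle the scalability and resource-utilization concerns the paper raises, rather than leaving them to manual operations.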

Keywords

Large Language Models (LLM), Containerization, Scalability, Datacenter, Kubernetes.

References

[1] FlexPod Datacenter with SUSE Rancher for AI Workloads Design Guide, NetApp, Cisco, 2023. [Online]. Available: https://www.cisco.com/c/en/us/td/docs/unified_computing/ucs/UCS_CVDs/flexpod_suse_rancher_design.html
[2] Diaz Jorge-Martinez et al., “Artificial Intelligence-based Kubernetes Container for Scheduling Nodes of Energy Composition,” International Journal of System Assurance Engineering and Management, pp. 1-9, 2021.
[CrossRef] [Google Scholar] [Publisher Link]
[3] Laszlo Toka et al., “Adaptive AI-based Auto-Scaling for Kubernetes,” 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, Melbourne, VIC, Australia, pp. 599-608, 2020.
[CrossRef] [Google Scholar] [Publisher Link]
[4] Brandon Thurgood, and Ruth G. Lennon, “Cloud Computing With Kubernetes Cluster Elastic Scaling,” Proceedings of the 3rd International Conference on Future Networks and Distributed Systems, Paris, France, pp. 1-7, 2019.
[CrossRef] [Google Scholar] [Publisher Link]
[5] Nhat-Minh Dang-Quang, and Myungsik Yoo, “Deep Learning-Based Autoscaling Using Bidirectional Long Short-Term Memory for Kubernetes,” Applied Sciences, vol. 11, no. 9, pp. 1-25, 2021.
[CrossRef] [Google Scholar] [Publisher Link]
[6] Chaoyu Wu, E Haihong, and Meina Song, “An Automatic Artificial Intelligence Training Platform Based on Kubernetes,” Proceedings of the 2020 2nd International Conference on Big Data Engineering and Technology, Singapore, pp. 58-62, 2020.
[CrossRef] [Google Scholar] [Publisher Link]
[7] Chun-Hsiang Lee et al., “Multi-Tenant Machine Learning Platform Based on Kubernetes,” Proceedings of the 2020 6th International Conference on Computing and Artificial Intelligence, pp. 5-12, 2020.
[CrossRef] [Google Scholar] [Publisher Link]