Asynchronous Inference Graph Execution for Model Routing in Machine Learning Systems

Gangadharan Venkataraman

doi:https://doi.org/10.14445/22312803/IJCTT-V72I10P101

Research Article | Open Access

Volume 72 | Issue 10 | Year 2024 | Article Id. IJCTT-V72I10P101 | DOI : https://doi.org/10.14445/22312803/IJCTT-V72I10P101

Received: 16 Aug 2024 | Revised: 20 Sep 2024 | Accepted: 05 Oct 2024 | Published: 22 Oct 2024

Citation:

Gangadharan Venkataraman, "Asynchronous Inference Graph Execution for Model Routing in Machine Learning Systems," International Journal of Computer Trends and Technology (IJCTT), vol. 72, no. 10, pp. 1-4, 2024. Crossref, https://doi.org/10.14445/22312803/IJCTT-V72I10P101

Abstract

This paper presents a routing mechanism for machine learning systems that executes inference graphs asynchronously. The mechanism supports model chaining, champion/challenger evaluation, and traffic splitting, which together enable efficient model deployment strategies. We describe the architecture and implementation of the routing mechanism, along with its application to real-world ML pipelines.
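
The three routing patterns named in the abstract can be illustrated with a minimal Python sketch built on the standard asyncio library. This is not the paper's implementation: the helper names (chain, champion_challenger, traffic_split) and the toy models (tokenize, score_v1, score_v2) are hypothetical and chosen only for illustration, and each "model" is assumed to be an async function that maps a request dict to a response dict.

```python
import asyncio
import random
from typing import Awaitable, Callable, List, Tuple

# Assumption: a "model" is any async callable from a request dict to a response dict.
Model = Callable[[dict], Awaitable[dict]]


def chain(*stages: Model) -> Model:
    """Model chaining: the output of each stage is the input of the next."""
    async def run(payload: dict) -> dict:
        for stage in stages:
            payload = await stage(payload)
        return payload
    return run


def champion_challenger(champion: Model, challenger: Model, log: list) -> Model:
    """Serve the champion's response; run the challenger concurrently and only log it."""
    async def shadow(payload: dict) -> dict:
        # A failing challenger must never affect the served response.
        try:
            return await challenger(payload)
        except Exception as exc:
            return {"error": repr(exc)}

    async def run(payload: dict) -> dict:
        # Both models run concurrently on their own copy of the request.
        # Note: this simple version waits for both before returning.
        served, shadowed = await asyncio.gather(
            champion(dict(payload)), shadow(dict(payload))
        )
        log.append({"champion": served, "challenger": shadowed})
        return served
    return run


def traffic_split(routes: List[Tuple[float, Model]], rng: random.Random) -> Model:
    """Traffic splitting: pick one route per request, proportionally to its weight."""
    weights = [weight for weight, _ in routes]
    models = [model for _, model in routes]

    async def run(payload: dict) -> dict:
        (model,) = rng.choices(models, weights=weights, k=1)
        return await model(payload)
    return run


# Toy stand-ins for real models (hypothetical).
async def tokenize(payload: dict) -> dict:
    return {**payload, "tokens": payload["text"].split()}


async def score_v1(payload: dict) -> dict:
    return {**payload, "score": len(payload["tokens"]), "model": "v1"}


async def score_v2(payload: dict) -> dict:
    return {**payload, "score": 2 * len(payload["tokens"]), "model": "v2"}


async def main() -> None:
    comparisons: list = []
    graph = chain(tokenize, champion_challenger(score_v1, score_v2, comparisons))
    response = await graph({"text": "route this request"})
    print(response["model"], response["score"], len(comparisons))


if __name__ == "__main__":
    asyncio.run(main())
```

Because every router returns another async callable of the same shape, the routers compose freely: a traffic split can sit in front of a chain, or a champion/challenger pair can be one stage of a chain. A production system would likely let the challenger finish in the background rather than waiting for it, which this sketch does not attempt.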

Keywords

Inference Service, Model Routing, Asynchronous Execution, Model Chaining, Champion/Challenger, Traffic Splitting.
