🕵 We’re hiring GPU Performance Engineers to accelerate AI inference at scale for Amazon Bedrock! If you’re passionate about optimizing GPU workloads, building high-performance distributed inference solutions, and unlocking the efficiency of state-of-the-art foundation models, we should talk. Send your CV to kaibud [at] amazon [dot] com.
I am currently a Team Lead (Sr. Applied Scientist) in the AWS Deep Science for Systems and Services team, where I work at the intersection of machine learning and systems. Our team’s goal is to optimize foundation models for inference in Amazon Bedrock: driving higher hardware utilization, lower latency, and lower cost. We develop algorithms (e.g., for quantization, speculative decoding, structured sparsity, and accelerating multi-LoRA inference) and optimize systems (e.g., inference engines like vLLM, kernel tuning, and identifying inference performance bottlenecks) that power GenAI workloads in Amazon Bedrock without compromising model accuracy. Learn more about our team’s recent public work: (Park et al., 2024)(Kübler et al., 2025).
I joined the Amazon Research Lablet Tübingen (part of AWS AI) in 2020, where I developed algorithms and tools to help businesses explain the complex cause-effect relationships underlying their business problems, and led a cross-org effort within Amazon to launch them in production (Budhathoki & Blöbaum, 2022)(Budhathoki, 2021)(Götz & Budhathoki, 2022).
Businesses like Amazon Supply Chain and Amazon Ads actively use those solutions for effect estimation and for root cause analysis of changes and outliers.
Those algorithmic solutions were also open-sourced in the Python DoWhy library under a new package called gcm (Götz & Budhathoki, 2022)(Blöbaum et al., 2023)(Kiciman, 2022). This collaboration with Microsoft Research led to a new GitHub organization, PyWhy, whose mission is to build an open-source ecosystem for causal machine learning (Götz & Budhathoki, 2022)(Kiciman, 2022).
For a brief period, I also led the cross-org science effort within Amazon to deliver bias mitigation solutions for the first family of Amazon’s in-house multimodal foundation models, the Amazon Titan Multimodal Embeddings model and the Amazon Titan Image Generation model, ahead of their re:Invent 2023 release (Kleindessner et al., 2025)(Barth, 2023)(Ali et al., 2023).
In 2020, I received my doctoral degree in Computer Science from Saarland University, having conducted my doctoral research at the Max Planck Institute for Informatics. During my PhD, I also interned at the Amazon Research Lablet Tübingen for 2.5 months in the spring of 2019. Earlier, in 2015, I completed my Master’s degree in Computer Science with honours at Saarland University. Prior to that, I worked as a Software Developer at ImmuneSecurity A/S (now LogPoint) between 2011 and 2013. I completed my Bachelor’s degree in Computer Engineering at the Institute of Engineering, Pulchowk Campus, in Nepal (2006–2010).
I have been fortunate to collaborate with great colleagues across diverse research topics, but the common thread is a customer-centric approach to machine learning. Despite shifts in direction, my work consistently aims to create ML systems that deliver real value for customers. See up-to-date publications on Google Scholar.
Selected artifacts
gcm package in DoWhy, with a major refactoring of the DoWhy codebase