TOPIC #1: Enabling Efficient Serverless Inference Serving for Large Language Models (LLMs) in the Cloud

Requirements: basic understanding of deep learning platforms (e.g., PyTorch, TensorFlow) and serverless computing

Key Paper: ServerlessLLM: Locality-Enhanced Serverless Inference for Large Language Models (OSDI 2024)

Yao Fu, Leyang Xue, Yeqi Huang, Andrei-Octavian Brabete, Dmitrii Ustiugov, et al.

Source Code:

https://github.com/ServerlessLLM/ServerlessLLM

Reference:

TOPIC #2: Enabling Efficient Distributed Training for Large Language Models (LLMs) in the Cloud

Requirements: basic understanding of deep learning platforms (e.g., PyTorch, TensorFlow) and serverless computing

Key Paper: ElasticFlow: An Elastic Serverless Training Platform for Distributed Deep Learning

Diandian Gu, et al. (ASPLOS 2023)