Sharding Large Language Models: Achieving Efficient Distributed Inference

Techniques for loading LLMs onto smaller GPUs and running parallel inference with Hugging Face Accelerate