Petals

Summary
Petals is a system that lets users run and fine-tune large language models on consumer-grade GPUs by joining a network of participants who each serve part of the model. It supports models such as Llama, Mixtral, Falcon, and BLOOM. Users can apply their own fine-tuning and sampling methods, execute custom paths through the model, and inspect its hidden states.
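As a rough illustration of what "serving parts of the model" can mean, the toy sketch below routes a request through servers that each hold a contiguous range of model blocks and records the hidden state after every hop. It is not Petals' actual routing code: the names Server, plan_route, and run_route, the greedy routing rule, and the arithmetic "blocks" are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Hidden = List[float]
Block = Callable[[Hidden], Hidden]


@dataclass
class Server:
    """Hypothetical peer that serves blocks in the range [start, end)."""
    name: str
    start: int
    end: int


def plan_route(servers: List[Server], num_blocks: int) -> List[Tuple[str, int, int]]:
    """Greedy plan: at each position, pick the covering server that reaches furthest."""
    route = []
    pos = 0
    while pos < num_blocks:
        candidates = [s for s in servers if s.start <= pos < s.end]
        if not candidates:
            raise RuntimeError(f"no server covers block {pos}")
        best = max(candidates, key=lambda s: s.end)
        stop = min(best.end, num_blocks)
        route.append((best.name, pos, stop))
        pos = stop
    return route


def run_route(route: List[Tuple[str, int, int]], blocks: List[Block], hidden: Hidden) -> List[Hidden]:
    """Apply each server's span of blocks in order, keeping every intermediate hidden state."""
    trace = [hidden]
    for _name, start, stop in route:
        for block in blocks[start:stop]:
            hidden = block(hidden)
        trace.append(hidden)
    return trace


# Toy model: block i simply adds (i + 1) to every element of the hidden state.
blocks = [lambda h, i=i: [x + (i + 1) for x in h] for i in range(8)]
servers = [Server("A", 0, 3), Server("B", 2, 6), Server("C", 5, 8)]

route = plan_route(servers, num_blocks=8)
trace = run_route(route, blocks, [0.0])
print(route)
print(trace)
```

Because the trace keeps the state after each hop, the same structure also shows how a client could inspect intermediate hidden states or choose a different path through the blocks.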
Rationale
Petals lets users run large language models at home, BitTorrent-style, and fine-tune them for specific tasks. It supports models such as Llama 3, Mixtral, Falcon, and BLOOM, which makes it usable for conversational AI and code generation. Its API is accessed through PyTorch and Hugging Face Transformers.
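A minimal sketch of what access through PyTorch and Transformers might look like is given below. It assumes the petals package is installed and exposes AutoDistributedModelForCausalLM as described in the project's public README; the wrapper generate_with_petals and its parameters are hypothetical, and the model name is left to the caller because the set of models served by a given network changes over time.

```python
def generate_with_petals(model_name: str, prompt: str, max_new_tokens: int = 16) -> str:
    """Generate text from a model served over a Petals network.

    Imports are deferred so this module can be loaded without petals installed.
    Calling the function needs network access and a swarm serving `model_name`.
    """
    # Assumption: these imports match the interface shown in the Petals README.
    from transformers import AutoTokenizer
    from petals import AutoDistributedModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    # Only a small part of the model is loaded locally; the transformer
    # blocks are expected to be executed by remote peers.
    model = AutoDistributedModelForCausalLM.from_pretrained(model_name)

    input_ids = tokenizer(prompt, return_tensors="pt")["input_ids"]
    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0])
```

A call would pass the identifier of a model that the network is currently serving, together with a prompt, for example a chat turn or the start of a function to complete.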
Features
Must Have

Conversational AI

API Access

Fine-Tuning & Custom Models

Other

Code Generation