Quickstart
LocalAI is a free, open-source alternative to OpenAI (and other providers such as Anthropic), functioning as a drop-in replacement REST API for local inference. It lets you run LLMs, generate images, and produce audio, all locally or on-premises on consumer-grade hardware, with support for multiple model families and architectures.
Tip
Security considerations
If you expose LocalAI remotely, make sure the API endpoints are adequately protected, for example by placing them behind an authenticating reverse proxy, or run LocalAI with the API_KEY environment variable set to gate access with an API key. Note that an API key grants full access to all features (there is no role separation), so it should be treated as the equivalent of an admin credential.
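As a minimal sketch, the key can be set when running under Docker (the key value below is a placeholder you choose):

```bash
# Require an API key for all requests; the key grants full access, so keep it secret
docker run -p 8080:8080 -e API_KEY=<your-secret-key> localai/localai:latest
```

Clients then authenticate by sending the key in a standard Authorization: Bearer header, as with the OpenAI API.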
This guide assumes you have already installed LocalAI. If you haven’t installed it yet, see the Installation guide first.
Starting LocalAI
Once installed, start LocalAI. For Docker installations, a typical invocation looks like the following (the all-in-one CPU image shown here is one of several published tags; pick the one matching your hardware):
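```bash
# Start LocalAI with the CPU all-in-one image, exposing the API on port 8080
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-aio-cpu
```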
The API will be available at http://localhost:8080.
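To verify that it is up, you can query the OpenAI-compatible models endpoint:

```bash
# List the models currently available on the server
curl http://localhost:8080/v1/models
```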
Downloading models on start
When starting LocalAI (either via Docker or via the CLI) you can pass a list of models as arguments; they are installed automatically before the API starts. For example (the model name below is illustrative, any entry from the model gallery works):
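```bash
# CLI: download the model (if needed) and start the API with it
local-ai run llama-3.2-1b-instruct:q4_k_m

# Docker: the same model name can be appended to the run command
docker run -ti -p 8080:8080 localai/localai:latest llama-3.2-1b-instruct:q4_k_m
```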
Tip
Automatic Backend Detection: When you install models from the gallery or YAML files, LocalAI automatically detects your system’s GPU capabilities (NVIDIA, AMD, Intel) and downloads the appropriate backend. For advanced configuration options, see GPU Acceleration.
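For instance, gallery models can also be browsed and installed from the CLI; a sketch, assuming the models subcommand of your installed version (check local-ai --help):

```bash
# Browse the model gallery and install an entry (the model name is illustrative)
local-ai models list
local-ai models install llama-3.2-1b-instruct:q4_k_m
```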
For a full list of options, you can run LocalAI with --help or refer to the Linux Installation guide for installer configuration options.
Using LocalAI and the full stack with LocalAGI
LocalAI is part of a family of local-first projects, alongside LocalAGI and LocalRecall.
LocalAGI is a powerful, self-hostable AI agent platform designed for maximum privacy and flexibility, built on top of the full software stack above. It provides a complete drop-in replacement for OpenAI's Responses API with advanced agentic capabilities, running entirely locally on consumer-grade hardware (CPU and GPU).
Quick Start
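A minimal way to bring the stack up, based on the LocalAGI repository's Docker Compose setup (compose file names and profiles may vary between releases):

```bash
# Clone the LocalAGI repository and start the full stack with Docker Compose
git clone https://github.com/mudler/LocalAGI
cd LocalAGI
docker compose up
```

Once started, the web interface is served locally and you can begin interacting with agents from your browser.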
Key Features
- Privacy-Focused: All processing happens locally, ensuring your data never leaves your machine
- Flexible Deployment: Supports CPU, NVIDIA GPU, and Intel GPU configurations
- Multiple Model Support: Compatible with various models from Hugging Face and other sources
- Web Interface: User-friendly chat interface for interacting with AI agents
- Advanced Capabilities: Supports multimodal models, image generation, and more
- Docker Integration: Easy deployment using Docker Compose
Environment Variables
You can customize your LocalAGI setup using the following environment variables:
- MODEL_NAME: Specify the model to use (e.g., gemma-3-12b-it)
- MULTIMODAL_MODEL: Set a custom multimodal model
- IMAGE_MODEL: Configure an image generation model
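For example, a model can be selected at startup by setting the variable inline with the compose invocation (a sketch, assuming the Docker Compose setup shown earlier):

```bash
# Start LocalAGI with a specific model (the value is illustrative)
MODEL_NAME=gemma-3-12b-it docker compose up
```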
For more advanced configuration and API documentation, visit the LocalAGI GitHub repository.
What’s Next?
There is much more to explore with LocalAI! You can run any model from Hugging Face, perform video generation, and also voice cloning. For a comprehensive overview, check out the features section.
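For example, GGUF models can be pulled directly from Hugging Face by URI (the repository and file below are illustrative):

```bash
# Run a model fetched directly from Hugging Face
local-ai run huggingface://TheBloke/phi-2-GGUF/phi-2.Q8_0.gguf
```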
Explore additional resources and community contributions: