# Ollama
Ollama is an open-source tool that allows users to run large language models (LLMs) locally on their machines. It provides a simple interface to download, run, and interact with state-of-the-art language models like LLaMA, Mistral, and others.
---
## Table of Contents

1. [What is Ollama?](#what-is-ollama)
2. [Key Features](#key-features)
3. [How Ollama Works](#how-ollama-works)
4. [Installation](#installation)
5. [Usage](#usage)
6. [Supported Models](#supported-models)
7. [Examples](#examples)
8. [Community and Resources](#community-and-resources)
9. [Contributing](#contributing)
10. [License](#license)
---
## What is Ollama?
Ollama is a tool designed to bring large language models to everyone. It enables users to:
- Run powerful AI models locally on their devices.
- Interact with models through a simple command-line interface (CLI) or API.
- Avoid dependency on cloud services for processing.
Ollama supports models such as LLaMA, Mistral, Phi, GPT-J, GPT-NeoX, and more. It is particularly useful for users who prioritize privacy, control, or offline use.
---
## Key Features
- Local Execution: Models run entirely on your machine.
- Simple CLI: Easy-to-use commands for model interaction.
- Model Management: Download, update, and manage models seamlessly.
- API Integration: Expose model capabilities via a REST API.
- Cross-Platform: Works on Windows, macOS, and Linux.
- Privacy-Focused: No data sent to external servers.
---
## How Ollama Works

1. Model Downloading: Users pull models from repositories such as Hugging Face or the Ollama Hub.
2. Containerization: Models run in containers for isolation and ease of use.
3. Inference: Users query models via the CLI or API.
4. Quantization: Ollama supports quantized models for faster performance on lower-end hardware.
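Step 4 refers to quantization: storing model weights at lower precision to cut memory use and speed up inference on modest hardware. A toy sketch of symmetric 8-bit quantization illustrates the idea (illustrative only; the quantization schemes used by real model formats are more sophisticated):

```python
def quantize_int8(weights):
    # Map floats to the int8 range [-127, 127] using one symmetric scale.
    # The `or 1.0` guards against an all-zero weight list (scale of 0).
    scale = max(abs(w) for w in weights) / 127 or 1.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    # Recover approximate floats from the int8 values.
    return [v * scale for v in q]

q, scale = quantize_int8([0.5, -1.0, 0.25])
approx = dequantize(q, scale)
```

Each weight now fits in one byte instead of four, at the cost of a small rounding error on dequantization.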
---
## Installation
### For Linux/macOS

```shell
curl -fsSL https://ollama.com/install.sh | sh
```
### For Windows
Download the installer from the Ollama website and run it.
---
## Usage
### List Available Models

```shell
ollama list
```
### Start a Model

```shell
ollama serve        # Starts the API server
ollama pull llama2  # Downloads the LLaMA 2 model
```
### Interact with a Model

```shell
ollama run llama2
```
### Use the API
Send HTTP requests to `http://localhost:11434/api/generate`.
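The endpoint accepts a JSON body; as a minimal sketch of its shape (the `model`, `prompt`, and `stream` fields mirror the Python example in the Examples section; `build_payload` is a hypothetical helper for illustration, not part of Ollama):

```python
import json

def build_payload(model: str, prompt: str, stream: bool = False) -> str:
    # Hypothetical helper: serialize the JSON body sent to /api/generate.
    return json.dumps({"model": model, "prompt": prompt, "stream": stream})

body = build_payload("llama2", "What is the meaning of life?")
print(body)
```

With `"stream": false` the server replies with a single JSON object; otherwise it streams newline-delimited JSON chunks.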
---
## Supported Models
Ollama supports a wide range of models, including:
- LLaMA (Meta)
- Mistral (Mistral AI)
- Phi (Microsoft)
- GPT-J (EleutherAI)
- GPT-NeoX (EleutherAI)
- Falcon (TII)
- and many more.
Check the Ollama Hub for the latest list.
---
## Examples
### Generate Text

```shell
ollama run llama2 "Write a poem about artificial intelligence."
```
### Stream Output

The CLI streams responses by default; the REST API also streams unless `"stream": false` is set:

```shell
curl http://localhost:11434/api/generate -d '{"model": "llama2", "prompt": "Explain quantum computing in simple terms."}'
```
### Use with Python

```python
import requests

response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama2",
        "prompt": "What is the meaning of life?",
        "stream": False,
    },
)
print(response.json()["response"])
```
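When streaming is enabled instead, the server sends newline-delimited JSON chunks, each carrying a fragment of the answer. A sketch of reassembling such a stream offline (the `response` and `done` field names match the non-streaming example above; the sample chunks are illustrative, not real server output):

```python
import json

def join_stream(lines):
    # Concatenate the "response" fragments from NDJSON stream chunks,
    # stopping at the chunk marked "done".
    out = []
    for line in lines:
        if not line.strip():
            continue
        chunk = json.loads(line)
        out.append(chunk.get("response", ""))
        if chunk.get("done"):
            break
    return "".join(out)

# Illustrative chunks in the shape the API streams:
sample = [
    '{"response": "Quantum ", "done": false}',
    '{"response": "computing...", "done": true}',
]
print(join_stream(sample))
```

In a real client, the same loop would iterate over the HTTP response body line by line as chunks arrive.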
---
## Community and Resources

- GitHub: [jmorganca/ollama](https://github.com/jmorganca/ollama)
- Documentation: [ollama.com/docs](https://ollama.com/docs)
- Community: join discussions on the [Ollama Forum](https://forum.ollama.com)
---
## Contributing
Contributions are welcome! Check the GitHub repository for guidelines.
---
## License
Ollama is released under the MIT License. See the [LICENSE](https://github.com/jmorganca/ollama/blob/main/LICENSE) file for details.
---
This page was last updated on 2025-04-08.