Ollama

Ollama is an open-source tool that allows users to run large language models (LLMs) locally on their machines. It provides a simple interface to download, run, and interact with state-of-the-art models such as LLaMA and Mistral.

---

What is Ollama?

Ollama is a tool designed to bring large language models to everyone. It enables users to:

  • Run powerful AI models locally on their devices.
  • Interact with models through a simple command-line interface (CLI) or API.
  • Avoid dependency on cloud services for processing.

Ollama supports models such as LLaMA, Mistral, Phi, GPT-J, GPT-NeoX, and more. It is particularly useful for users who prioritize privacy, control, or offline use.

---

Key Features

  • Runs models entirely on local hardware, with no dependency on cloud services.
  • Simple command-line interface plus a local REST API.
  • One-command model downloads from the Ollama model library.
  • Support for quantized models for modest hardware.
  • Available for Linux, macOS, and Windows.

---

How Ollama Works

1. Model Downloading: Users pull models from the Ollama model library; models from sources such as Hugging Face can also be imported.
2. Packaging: Models ship as self-contained, layered bundles, similar to container images, for isolation and ease of use.
3. Inference: Users query models via the CLI or the local API.
4. Quantization: Ollama supports quantized model variants for faster performance on lower-end hardware (see the example below).
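As a sketch of step 4: quantized variants are selected by tag when pulling a model. The tag below is illustrative only; actual tag names vary by model, so check the library listing for what is actually published.

ollama pull llama2:7b-q4_0  # illustrative tag: a 4-bit quantized 7B variant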

---

Installation

For Linux/macOS

curl -fsSL https://ollama.com/install.sh | sh

For Windows

Download the installer from the Ollama website and run it.
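Either way, you can verify the installation by asking the CLI for its version:

ollama --version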

---

Usage

List Available Models

ollama list

Download and Serve a Model

ollama pull llama2  # Downloads the LLaMA 2 model
ollama serve        # Starts the local API server

Interact with a Model

ollama run llama2
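
This opens an interactive chat session in the terminal; type `/bye` to exit.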

Use the API

Send HTTP requests to `http://localhost:11434/api/generate`.
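
For example, a minimal non-streaming request with curl (assuming the llama2 model has already been pulled):

curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Why is the sky blue?",
  "stream": false
}'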

---

Supported Models

Ollama supports a wide range of models, including:

  • LLaMA (Meta)
  • Mistral (Mistral AI)
  • Phi (Microsoft)
  • GPT-J
  • GPT-NeoX
  • Falcon (TII)
  • and many more.

Check the Ollama model library for the latest list.

---

Examples

Generate Text

ollama run llama2 "Write a poem about artificial intelligence."

Stream Output

`ollama run` streams tokens to the terminal by default:

ollama run llama2 "Explain quantum computing in simple terms."

To stream responses over the API instead, set `"stream": true` in the request body.

Use with Python

```python
import requests

# Single, non-streaming generation request to the local Ollama server
response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama2",
        "prompt": "What is the meaning of life?",
        "stream": False,
    },
)

print(response.json()["response"])
```
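
A streaming variant, as a minimal sketch: with `"stream": true` the server returns one JSON object per line until a chunk reports `"done": true`.

```python
import json

import requests

# Stream tokens as they are generated; each response line is a JSON chunk
with requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama2",
        "prompt": "Explain quantum computing in simple terms.",
        "stream": True,
    },
    stream=True,
) as response:
    for line in response.iter_lines():
        if not line:
            continue
        chunk = json.loads(line)
        print(chunk.get("response", ""), end="", flush=True)
        if chunk.get("done"):
            break
```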

---

Community and Resources

  • Official website: https://ollama.com
  • GitHub repository: https://github.com/jmorganca/ollama

---

Contributing

Contributions are welcome! Check the GitHub repository for guidelines.

---

License

Ollama is released under the MIT License. See the [LICENSE](https://github.com/jmorganca/ollama/blob/main/LICENSE) file for details.

---

This page was last updated on 2025-04-08.