Ollama (Local / Self-Hosted)

Ollama is typically used for local or self-hosted inference. In most cases it does not require third-party cloud API keys, but it does require a reachable service endpoint.

1. Deployment Modes

  • Local mode: install and run Ollama on your own device; the service listens on 127.0.0.1:11434 by default.
  • Remote mode: deploy Ollama on your own server and expose a network-reachable endpoint.
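The two modes differ only in which endpoint the client talks to. As a minimal sketch, a client could resolve the endpoint from the `OLLAMA_HOST` environment variable and fall back to the local default; `resolve_endpoint` is a hypothetical helper written for this page, not part of any SDK:

```python
import os

DEFAULT_ENDPOINT = "http://127.0.0.1:11434"  # Ollama's local default address

def resolve_endpoint() -> str:
    """Pick the service endpoint: remote if OLLAMA_HOST is set, local otherwise.

    Hypothetical helper for illustration only.
    """
    host = os.environ.get("OLLAMA_HOST", "").strip()
    if not host:
        return DEFAULT_ENDPOINT            # local mode
    if not host.startswith(("http://", "https://")):
        host = "http://" + host            # accept bare host:port values
    return host.rstrip("/")
```

In local mode the variable is simply left unset; in remote mode it points at your server.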

2. Pre-checks

  • The target model has been pulled on the server (e.g. via `ollama pull`).
  • The endpoint address, port, and network policy (firewall rules, reverse proxy) are configured correctly.
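Both pre-checks can be probed in one call against Ollama's `/api/tags` endpoint, which lists the models pulled on that server. A sketch using only the standard library; `list_pulled_models` is an illustrative name, not an SDK function:

```python
import json
import urllib.error
import urllib.request

def list_pulled_models(endpoint: str, timeout: float = 3.0):
    """Return model names reported by the server, or None if unreachable.

    Queries Ollama's /api/tags endpoint, which lists locally pulled models.
    Illustrative helper, not part of any SDK.
    """
    try:
        with urllib.request.urlopen(endpoint.rstrip("/") + "/api/tags",
                                    timeout=timeout) as resp:
            data = json.load(resp)
        return [m["name"] for m in data.get("models", [])]
    except (urllib.error.URLError, OSError, ValueError):
        return None  # endpoint unreachable, or response not understood
```

A `None` result points at the endpoint/port/network pre-check; a list that lacks your model points at a pull that has not been done yet.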

3. Configure in Mask

  1. Select Ollama as the provider.
  2. Fill in the local or remote service endpoint (e.g. http://127.0.0.1:11434).
  3. Choose an installed model and run a test.
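The test in step 3 can also be reproduced by hand with a single non-streaming request to Ollama's `/api/generate` endpoint. The sketch below only builds the request; the model name `llama3` and the helper name are assumptions for illustration:

```python
import json
import urllib.request

def build_test_request(endpoint: str, model: str) -> urllib.request.Request:
    """Build a minimal non-streaming /api/generate request for a smoke test.

    Illustrative helper; the prompt text is arbitrary.
    """
    payload = {"model": model, "prompt": "Say hello.", "stream": False}
    return urllib.request.Request(
        endpoint.rstrip("/") + "/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Sending it requires a running server with the model pulled, e.g.:
# with urllib.request.urlopen(
#         build_test_request("http://127.0.0.1:11434", "llama3")) as resp:
#     print(json.load(resp)["response"])
```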

4. Common Issues

  • Connection failed: endpoint unreachable, firewall, or closed ports.
  • Model not found: model not downloaded on server side.
  • Slow response: insufficient local compute or excessive concurrency.
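The first two issues can usually be told apart programmatically: a transport error means the endpoint is unreachable, while a successful `/api/tags` response that lacks the model means it was never pulled. A sketch, with made-up category strings and a hypothetical `diagnose` helper:

```python
import json
import urllib.error
import urllib.request

def diagnose(endpoint: str, model: str, timeout: float = 3.0) -> str:
    """Classify the two most common failures. Illustrative helper only."""
    try:
        with urllib.request.urlopen(endpoint.rstrip("/") + "/api/tags",
                                    timeout=timeout) as resp:
            names = [m["name"] for m in json.load(resp).get("models", [])]
    except (urllib.error.URLError, OSError):
        return "connection-failed"  # unreachable endpoint, firewall, closed port
    # Tags are reported as name:tag (e.g. "llama3:latest"), so match the prefix.
    if not any(n == model or n.startswith(model + ":") for n in names):
        return "model-not-found"    # fix on the server: `ollama pull <model>`
    return "ok"
```

Slow responses have no such quick signature; they generally call for checking server load, hardware capacity, and concurrent request counts.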

5. Privacy Note

  • Local Ollama is often the easiest way to keep data on-device.
  • If using remote self-hosted nodes, data is sent to your server; you are responsible for network and storage security.