Ollama (Local / Self-Hosted)
Ollama is typically used for local or self-hosted inference. In most cases it does not require third-party cloud API keys, but it does require a reachable service endpoint.
1. Deployment Modes
- Local mode: install and run Ollama on your own device.
- Remote mode: deploy Ollama on your own server and expose the service endpoint over the network.
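In remote mode, note that Ollama binds only to the loopback address by default; the real `OLLAMA_HOST` environment variable controls this. A minimal sketch, assuming a hypothetical server at 192.168.1.10:

```shell
# On the server: bind Ollama to all interfaces instead of the loopback default
# (OLLAMA_HOST is a real Ollama environment variable):
#   OLLAMA_HOST=0.0.0.0:11434 ollama serve

# On the client side, the endpoint to configure is then:
SERVER_IP="192.168.1.10"   # hypothetical server address; use your own
PORT=11434                 # Ollama's default port
ENDPOINT="http://${SERVER_IP}:${PORT}"
echo "${ENDPOINT}"
```

In local mode no binding change is needed; the endpoint is simply `http://127.0.0.1:11434`.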
2. Pre-checks
- The target model has been pulled on the server (e.g. with `ollama pull`).
- Endpoint address, port, and network policy are configured correctly.
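Both pre-checks can be scripted against Ollama's `/api/tags` endpoint, which lists the models pulled on the server. A sketch; the endpoint and model name are assumptions to adjust for your setup:

```shell
ENDPOINT="http://127.0.0.1:11434"   # adjust for remote mode
MODEL="llama3"                      # hypothetical model name

# /api/tags lists the models available on this Ollama server
if curl -sf --max-time 5 "${ENDPOINT}/api/tags" | grep -q "\"${MODEL}"; then
  STATUS="ok"
  echo "ok: endpoint reachable and ${MODEL} is pulled"
else
  STATUS="missing"
  echo "check failed: endpoint unreachable or model not pulled (try 'ollama pull ${MODEL}')"
fi
```

On the server itself, `ollama list` shows the same information without going through the HTTP API.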
3. Configure in Mask
- Select Ollama as provider.
- Fill in the local or remote service endpoint (Ollama's default is http://127.0.0.1:11434).
- Choose an installed model and run a test.
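Before testing inside Mask, a quick end-to-end request from the command line can rule out client-side misconfiguration. A sketch using Ollama's `/api/generate` endpoint; the endpoint address and model name are assumptions:

```shell
ENDPOINT="http://127.0.0.1:11434"   # same endpoint you configure in Mask
MODEL="llama3"                      # hypothetical; use a model you have pulled

# /api/generate is Ollama's completion endpoint; "stream": false returns one JSON object
if RESPONSE=$(curl -sf --max-time 30 "${ENDPOINT}/api/generate" \
    -d "{\"model\": \"${MODEL}\", \"prompt\": \"Say hello\", \"stream\": false}"); then
  RESULT="passed"
  echo "test passed: ${RESPONSE}"
else
  RESULT="failed"
  echo "test failed: verify the endpoint and model before configuring Mask"
fi
```

If this works but the in-app test does not, the problem is in the Mask configuration rather than the server.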
4. Common Issues
- Connection failed: endpoint unreachable because of a firewall, a closed port, or the service not running.
- Model not found: model not downloaded on server side.
- Slow response: insufficient local compute or excessive concurrency.
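For the first two issues, a connectivity check is the quickest way to narrow things down: a running Ollama server answers its root path with the text "Ollama is running". A sketch, assuming the default local endpoint:

```shell
ENDPOINT="http://127.0.0.1:11434"   # adjust for your deployment

if curl -s --max-time 3 "${ENDPOINT}/" | grep -q "Ollama is running"; then
  REACHABLE="yes"
  echo "server reachable; if a model is still not found, pull it on the server with 'ollama pull <model>'"
else
  REACHABLE="no"
  echo "connection failed: check that 'ollama serve' is running and that the port and firewall allow access"
fi
```

If the server is reachable but responses are slow, the bottleneck is usually compute on the server side, not the network.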
5. Privacy Note
- Local Ollama is often the easiest way to keep data on-device.
- If using remote self-hosted nodes, data is sent to your server; you are responsible for network and storage security.