docs(release): prep v0.2 guidance and config samples

AC:
- README badge shows 0.2.0 and highlights cloud fallback, quotas, web search.
- Configuration docs and sample config cover list TTL, quotas, context window, and updated env guidance.
- Troubleshooting docs explain authentication fallback and rate limit recovery.

Tests:
- Attempted 'cargo xtask lint-docs' (command unavailable: no such command: xtask).
2025-10-24 12:55:17 +02:00
parent 3f6d7d56f6
commit 7e2c6ea037
4 changed files with 44 additions and 7 deletions


@@ -3,7 +3,7 @@
> Terminal-native assistant for running local language models with a comfortable TUI. > Terminal-native assistant for running local language models with a comfortable TUI.
![Status](https://img.shields.io/badge/status-alpha-yellow) ![Status](https://img.shields.io/badge/status-alpha-yellow)
![Version](https://img.shields.io/badge/version-0.1.11-blue) ![Version](https://img.shields.io/badge/version-0.2.0-blue)
![Rust](https://img.shields.io/badge/made_with-Rust-ffc832?logo=rust&logoColor=white) ![Rust](https://img.shields.io/badge/made_with-Rust-ffc832?logo=rust&logoColor=white)
![License](https://img.shields.io/badge/license-AGPL--3.0-blue) ![License](https://img.shields.io/badge/license-AGPL--3.0-blue)
@@ -39,6 +39,13 @@ The refreshed chrome introduces a cockpit-style header with live gradient gauges
- **Non-Blocking UI Loop**: Asynchronous generation tasks and provider health checks run off-thread, keeping the TUI responsive even while streaming long replies. - **Non-Blocking UI Loop**: Asynchronous generation tasks and provider health checks run off-thread, keeping the TUI responsive even while streaming long replies.
- **Guided Setup**: `owlen config doctor` upgrades legacy configs and verifies your environment in seconds. - **Guided Setup**: `owlen config doctor` upgrades legacy configs and verifies your environment in seconds.
## Upgrading to v0.2
- **Local + Cloud resiliency**: Owlen now distinguishes the on-device daemon from Ollama Cloud and gracefully falls back to local if the hosted key is missing or unauthorized. Cloud requests include `Authorization: Bearer <API_KEY>` and reuse the canonical `https://ollama.com` base URL so you no longer hit 401 loops.
- **Context + quota cockpit**: The header shows `context used / window (percentage)` and a second gauge for hourly/weekly cloud token usage. Configure soft limits via `providers.ollama_cloud.hourly_quota_tokens` and `weekly_quota_tokens`; Owlen tracks consumption locally even when the provider omits token counters.
- **Web search tooling**: When cloud is enabled, models can call the `web.search` tool automatically. Toggle availability at runtime with `:web on` / `:web off` if you need a local-only session.
- **Docs & config parity**: Ship-ready config templates now include per-provider `list_ttl_secs` and `default_context_window` values, plus explicit `OLLAMA_API_KEY` guidance. Run `owlen config doctor` after upgrading from v0.1 to normalize legacy keys and receive deprecation warnings for `OLLAMA_CLOUD_API_KEY` and `OWLEN_OLLAMA_CLOUD_API_KEY`.
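The key lookup described above (preferred `OLLAMA_API_KEY`, with the two legacy names still accepted but flagged) could be sketched roughly as follows. This is an illustration only, not Owlen's actual code: `resolve_api_key`, `ResolvedKey`, the lookup order among the legacy names, and the whitespace trimming are all assumptions made for the example.

```rust
/// Environment variable names in lookup order; the first is the preferred one.
/// The relative order of the two legacy names is an assumption.
const KEY_VARS: [&str; 3] = [
    "OLLAMA_API_KEY",
    "OLLAMA_CLOUD_API_KEY",       // legacy
    "OWLEN_OLLAMA_CLOUD_API_KEY", // legacy
];

/// A resolved key plus an optional deprecation warning (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct ResolvedKey {
    pub key: String,
    pub warning: Option<String>,
}

/// Hypothetical helper. `lookup` stands in for `std::env::var(name).ok()`
/// so the logic can be exercised without touching the real environment.
pub fn resolve_api_key<F>(lookup: F) -> Option<ResolvedKey>
where
    F: Fn(&str) -> Option<String>,
{
    for (index, &name) in KEY_VARS.iter().enumerate() {
        let Some(raw) = lookup(name) else { continue };
        // Ignore values that are empty once surrounding whitespace is removed.
        let key = raw.trim().to_string();
        if key.is_empty() {
            continue;
        }
        // Any name other than the first is treated as deprecated.
        let warning = (index > 0)
            .then(|| format!("{name} is deprecated; use {} instead", KEY_VARS[0]));
        return Some(ResolvedKey { key, warning });
    }
    None
}
```

In a real binary the closure would typically be `|name| std::env::var(name).ok()`, and the warning, if present, would be surfaced to the user once at startup.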
## Security & Privacy ## Security & Privacy
Owlen is designed to keep data local by default while still allowing controlled access to remote tooling. Owlen is designed to keep data local by default while still allowing controlled access to remote tooling.


@@ -9,6 +9,8 @@ encrypt_local_data = true
enabled = true enabled = true
provider_type = "ollama" provider_type = "ollama"
base_url = "http://localhost:11434" base_url = "http://localhost:11434"
list_ttl_secs = 60
default_context_window = 8192
[providers.ollama_cloud] [providers.ollama_cloud]
enabled = false enabled = false
@@ -17,6 +19,8 @@ base_url = "https://ollama.com"
api_key_env = "OLLAMA_API_KEY" api_key_env = "OLLAMA_API_KEY"
hourly_quota_tokens = 50000 hourly_quota_tokens = 50000
weekly_quota_tokens = 250000 weekly_quota_tokens = 250000
list_ttl_secs = 60
default_context_window = 8192
[providers.openai] [providers.openai]
enabled = false enabled = false


@@ -126,12 +126,18 @@ This section contains a table for each provider you want to configure. Owlen now
enabled = true enabled = true
provider_type = "ollama" provider_type = "ollama"
base_url = "http://localhost:11434" base_url = "http://localhost:11434"
list_ttl_secs = 60
default_context_window = 8192
[providers.ollama_cloud] [providers.ollama_cloud]
enabled = false enabled = false
provider_type = "ollama_cloud" provider_type = "ollama_cloud"
base_url = "https://ollama.com" base_url = "https://ollama.com"
api_key_env = "OLLAMA_API_KEY" api_key_env = "OLLAMA_API_KEY"
hourly_quota_tokens = 50000
weekly_quota_tokens = 250000
list_ttl_secs = 60
default_context_window = 8192
[providers.openai] [providers.openai]
enabled = false enabled = false
@@ -158,6 +164,15 @@ api_key_env = "ANTHROPIC_API_KEY"
- `api_key` / `api_key_env` (string, optional) - `api_key` / `api_key_env` (string, optional)
Authentication material. Prefer `api_key_env` to reference an environment variable so secrets remain outside of the config file. Authentication material. Prefer `api_key_env` to reference an environment variable so secrets remain outside of the config file.
- `list_ttl_secs` (integer, default: `60`)
Time-to-live for the cached model list used by the picker. Increase it to reduce background traffic or decrease it if you rotate models frequently.
- `default_context_window` (integer, optional)
Expected maximum prompt length (tokens) for the provider. Owlen uses this to render the context usage gauge and warn when you approach the limit.
- `hourly_quota_tokens` / `weekly_quota_tokens` (integer, optional)
Soft limits that drive the cloud usage gauge and `:limits` readout. Owlen tracks actual usage locally and compares it to these thresholds to raise 80% / 95% toasts.
- `extra` (table, optional) - `extra` (table, optional)
Any additional, provider-specific parameters can be added here. Any additional, provider-specific parameters can be added here.
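To make the 80% / 95% thresholds for `hourly_quota_tokens` and `weekly_quota_tokens` concrete, here is a minimal sketch of how locally tracked usage might be classified. The names `QuotaLevel` and `quota_level` are hypothetical, and treating the thresholds as inclusive (exactly 80% already warns) is an assumption; the docs only say that the toasts fire at those percentages.

```rust
/// Severity of quota consumption (hypothetical type for illustration).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum QuotaLevel {
    Normal,
    Warning,  // at or above 80% of the soft limit
    Critical, // at or above 95% of the soft limit
}

/// Classifies usage against an optional soft limit.
/// Returns `None` when no quota is configured (or it is zero),
/// mirroring the "remove the keys to disable tracking" behaviour.
pub fn quota_level(used_tokens: u64, quota_tokens: Option<u64>) -> Option<QuotaLevel> {
    let quota = quota_tokens.filter(|q| *q > 0)?;
    // Compare used * 100 against quota * threshold in integer arithmetic
    // to avoid floating-point rounding; u128 rules out overflow.
    let used = u128::from(used_tokens) * 100;
    let quota = u128::from(quota);
    Some(if used >= quota * 95 {
        QuotaLevel::Critical
    } else if used >= quota * 80 {
        QuotaLevel::Warning
    } else {
        QuotaLevel::Normal
    })
}
```

With the sample value `hourly_quota_tokens = 50000`, this sketch would warn from 40,000 tokens and turn critical from 47,500 tokens.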
@@ -179,13 +194,15 @@ base_url = "https://ollama.com"
api_key_env = "OLLAMA_API_KEY" api_key_env = "OLLAMA_API_KEY"
hourly_quota_tokens = 50000 hourly_quota_tokens = 50000
weekly_quota_tokens = 250000 weekly_quota_tokens = 250000
list_ttl_secs = 60
default_context_window = 8192
``` ```
Requests target the same `/api/chat` endpoint documented by Ollama and automatically include the API key using a `Bearer` authorization header. If you prefer not to store the key in the config file, either rely on `api_key_env` or export the `OLLAMA_API_KEY` environment variable manually (legacy names `OLLAMA_CLOUD_API_KEY` and `OWLEN_OLLAMA_CLOUD_API_KEY` continue to work, but Owlen will emit a warning). Owlen normalises the base URL automatically—it enforces HTTPS, trims trailing slashes, and accepts both `https://ollama.com` and `https://api.ollama.com` without rewriting the host. Requests target the same `/api/chat` endpoint documented by Ollama and automatically include the API key using a `Bearer` authorization header. If you prefer not to store the key in the config file, either rely on `api_key_env` or export the `OLLAMA_API_KEY` environment variable manually (legacy names `OLLAMA_CLOUD_API_KEY` and `OWLEN_OLLAMA_CLOUD_API_KEY` continue to work, but Owlen will emit a warning). Owlen normalises the base URL automatically—it enforces HTTPS, trims trailing slashes, and accepts both `https://ollama.com` and `https://api.ollama.com` without rewriting the host.
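The base URL normalisation described above can be pictured with a small sketch. It is not Owlen's implementation: `normalize_base_url` is a hypothetical name, and reading "enforces HTTPS" as "replace whatever scheme was given with `https`" (rather than rejecting non-HTTPS input) is an assumption. A production version would more likely use a proper URL parser.

```rust
/// Hypothetical helper: forces the `https` scheme and trims trailing slashes.
/// The host is left exactly as given, so `ollama.com` and `api.ollama.com`
/// both pass through without being rewritten.
pub fn normalize_base_url(input: &str) -> String {
    let trimmed = input.trim();
    // Drop any existing scheme; what follows "://" is host plus optional path.
    let rest = match trimmed.split_once("://") {
        Some((_scheme, rest)) => rest,
        None => trimmed,
    };
    format!("https://{}", rest.trim_end_matches('/'))
}
```

Under these assumptions, `http://ollama.com/` and `https://ollama.com` normalise to the same string, while `https://api.ollama.com/` keeps its host.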
The quota fields are optional and purely informational—they are never sent to the provider. Owlen uses them to display hourly/weekly token usage in the chat header, emit pre-limit toasts at 80% and 95%, and power the `:limits` command. Adjust the numbers to reflect the soft limits on your account or remove the keys altogether if you do not want usage tracking. The quota fields are optional and purely informational—they are never sent to the provider. Owlen uses them to display hourly/weekly token usage in the chat header, emit pre-limit toasts at 80% and 95%, and power the `:limits` command. Adjust the numbers to reflect the soft limits on your account or remove the keys altogether if you do not want usage tracking.
If your deployment exposes the web search endpoint under a different path, set `web_search_endpoint` in the same table. The default (`/api/web_search`) matches the Ollama Cloud REST API documented in the web retrieval guide.citeturn4open0 If your deployment exposes the web search endpoint under a different path, set `web_search_endpoint` in the same table. The default (`/api/web_search`) matches the Ollama Cloud REST API documented in the web retrieval guide.
> **Tip:** If the official `ollama signin` flow fails on Linux v0.12.3, follow the [Linux Ollama sign-in workaround](#linux-ollama-sign-in-workaround-v0123) in the troubleshooting guide to copy keys from a working machine or register them manually. > **Tip:** If the official `ollama signin` flow fails on Linux v0.12.3, follow the [Linux Ollama sign-in workaround](#linux-ollama-sign-in-workaround-v0123) in the troubleshooting guide to copy keys from a working machine or register them manually.


@@ -55,10 +55,19 @@ If Owlen is not behaving as you expect, there might be an issue with your config
If you see `Auth` errors when using the hosted service: If you see `Auth` errors when using the hosted service:
1. Run `owlen cloud setup` to register your API key (with `--api-key` for non-interactive use). 1. Run `owlen cloud setup` to register your API key (with `--api-key` for non-interactive use).
2. Use `owlen cloud status` to verify Owlen can authenticate against [Ollama Cloud](https://docs.ollama.com/cloud). 2. Use `owlen cloud status` to verify Owlen can authenticate against [Ollama Cloud](https://docs.ollama.com/cloud) with the canonical `https://ollama.com` base URL. Override the endpoint via `providers.ollama_cloud.base_url` only if your account is pointed at a custom region.
3. Ensure `providers.ollama.api_key` is set **or** export `OLLAMA_API_KEY` (legacy: `OLLAMA_CLOUD_API_KEY` / `OWLEN_OLLAMA_CLOUD_API_KEY`) when encryption is disabled. With `privacy.encrypt_local_data = true`, the key lives in the encrypted vault and is loaded automatically. 3. Ensure `providers.ollama_cloud.api_key` is set **or** export `OLLAMA_API_KEY` (legacy: `OLLAMA_CLOUD_API_KEY` / `OWLEN_OLLAMA_CLOUD_API_KEY`) when encryption is disabled. With `privacy.encrypt_local_data = true`, the key lives in the encrypted vault and is loaded automatically.
4. Confirm the key has access to the requested models. 4. Confirm the key has access to the requested models. Recent accounts scope access per workspace; visit <https://ollama.com/models> while signed in to double-check the SKU name.
5. Avoid pasting extra quotes or whitespace into the config file—`owlen config doctor` will normalise the entry for you. 5. Owlen disables the cloud provider after consecutive 401/403 responses, posts a toast, and falls back to the last healthy local provider so you can keep chatting. Re-run `owlen cloud setup` and flip back with `:provider ollama_cloud` once the key is valid again.
6. Avoid pasting extra quotes or whitespace into the config file—`owlen config doctor` will normalise the entry for you.
## Ollama Cloud Rate Limits (HTTP 429)
If the hosted API returns `HTTP 429 Too Many Requests`, Owlen keeps the provider enabled but surfaces a rate-limit toast and replays your message against the local provider so you do not lose work. To recover:
1. Check the cockpit header or run `:limits` to see your locally tracked hourly/weekly totals. When either bar crosses 80% Owlen warns you; 95% triggers a critical toast.
2. Raise or remove the soft quotas (`providers.ollama_cloud.hourly_quota_tokens`, `weekly_quota_tokens`) if your vendor allotment is higher, or pause cloud usage until the next window resets.
3. If you need the cloud-only model, retry after the provider's cooling-off period (Ollama currently resets the rate window hourly for most SKUs). Adjust `list_ttl_secs` upward if automated refreshes are consuming too many tokens.
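The two failure paths in this guide (consecutive 401/403 responses disable the cloud provider, while a 429 keeps it enabled and replays the message locally) can be summarised as a small decision function. The sketch below is illustrative only: `CloudHealth`, `CloudAction`, and the `max_auth_failures` threshold are hypothetical, since the docs do not state how many consecutive failures trigger the fallback.

```rust
/// What the client should do after a cloud response (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum CloudAction {
    Continue,           // normal response, keep using the cloud provider
    AuthFailed,         // 401/403, but still below the failure threshold
    DisableAndFallBack, // repeated 401/403: disable cloud, switch to local
    ReplayLocally,      // 429: keep cloud enabled, replay the message locally
}

/// Tracks consecutive authentication failures for the cloud provider.
pub struct CloudHealth {
    auth_failures: u32,
    max_auth_failures: u32, // assumed threshold; not specified in the docs
}

impl CloudHealth {
    pub fn new(max_auth_failures: u32) -> Self {
        Self { auth_failures: 0, max_auth_failures }
    }

    /// Maps an HTTP status code to the next action.
    pub fn on_status(&mut self, status: u16) -> CloudAction {
        match status {
            401 | 403 => {
                self.auth_failures += 1;
                if self.auth_failures >= self.max_auth_failures {
                    CloudAction::DisableAndFallBack
                } else {
                    CloudAction::AuthFailed
                }
            }
            other => {
                // Any non-auth response breaks the "consecutive" streak.
                self.auth_failures = 0;
                if other == 429 {
                    CloudAction::ReplayLocally
                } else {
                    CloudAction::Continue
                }
            }
        }
    }
}
```

Keeping the streak counter separate from the 429 path reflects the distinction drawn above: rate limiting is treated as temporary, whereas repeated authentication failures need user action (`owlen cloud setup`) before the provider is re-enabled.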
### Linux Ollama Sign-In Workaround (v0.12.3) ### Linux Ollama Sign-In Workaround (v0.12.3)