diff --git a/README.md b/README.md
index ba6f14b..4d92237 100644
--- a/README.md
+++ b/README.md
@@ -3,7 +3,7 @@
 > Terminal-native assistant for running local language models with a comfortable TUI.
 
 ![Status](https://img.shields.io/badge/status-alpha-yellow)
-![Version](https://img.shields.io/badge/version-0.1.11-blue)
+![Version](https://img.shields.io/badge/version-0.2.0-blue)
 ![Rust](https://img.shields.io/badge/made_with-Rust-ffc832?logo=rust&logoColor=white)
 ![License](https://img.shields.io/badge/license-AGPL--3.0-blue)
 
@@ -39,6 +39,13 @@ The refreshed chrome introduces a cockpit-style header with live gradient gauges
 - **Non-Blocking UI Loop**: Asynchronous generation tasks and provider health checks run off-thread, keeping the TUI responsive even while streaming long replies.
 - **Guided Setup**: `owlen config doctor` upgrades legacy configs and verifies your environment in seconds.
 
+## Upgrading to v0.2
+
+- **Local + Cloud resiliency**: Owlen now distinguishes the on-device daemon from Ollama Cloud and gracefully falls back to local if the hosted key is missing or unauthorized. Cloud requests include `Authorization: Bearer ` and reuse the canonical `https://ollama.com` base URL so you no longer hit 401 loops.
+- **Context + quota cockpit**: The header shows `context used / window (percentage)` and a second gauge for hourly/weekly cloud token usage. Configure soft limits via `providers.ollama_cloud.hourly_quota_tokens` and `weekly_quota_tokens`; Owlen tracks consumption locally even when the provider omits token counters.
+- **Web search tooling**: When cloud is enabled, models can call the `web.search` tool automatically. Toggle availability at runtime with `:web on` / `:web off` if you need a local-only session.
+- **Docs & config parity**: Ship-ready config templates now include per-provider `list_ttl_secs` and `default_context_window` values, plus explicit `OLLAMA_API_KEY` guidance. Run `owlen config doctor` after upgrading from v0.1 to normalize legacy keys and receive deprecation warnings for `OLLAMA_CLOUD_API_KEY` and `OWLEN_OLLAMA_CLOUD_API_KEY`.
+
 ## Security & Privacy
 
 Owlen is designed to keep data local by default while still allowing controlled access to remote tooling.
diff --git a/config.toml b/config.toml
index f472662..a236c50 100644
--- a/config.toml
+++ b/config.toml
@@ -9,6 +9,8 @@ encrypt_local_data = true
 enabled = true
 provider_type = "ollama"
 base_url = "http://localhost:11434"
+list_ttl_secs = 60
+default_context_window = 8192
 
 [providers.ollama_cloud]
 enabled = false
@@ -17,6 +19,8 @@ base_url = "https://ollama.com"
 api_key_env = "OLLAMA_API_KEY"
 hourly_quota_tokens = 50000
 weekly_quota_tokens = 250000
+list_ttl_secs = 60
+default_context_window = 8192
 
 [providers.openai]
 enabled = false
diff --git a/docs/configuration.md b/docs/configuration.md
index b204668..9cd7a5c 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -126,12 +126,18 @@ This section contains a table for each provider you want to configure. Owlen now
 enabled = true
 provider_type = "ollama"
 base_url = "http://localhost:11434"
+list_ttl_secs = 60
+default_context_window = 8192
 
 [providers.ollama_cloud]
 enabled = false
 provider_type = "ollama_cloud"
 base_url = "https://ollama.com"
 api_key_env = "OLLAMA_API_KEY"
+hourly_quota_tokens = 50000
+weekly_quota_tokens = 250000
+list_ttl_secs = 60
+default_context_window = 8192
 
 [providers.openai]
 enabled = false
@@ -158,6 +164,15 @@ api_key_env = "ANTHROPIC_API_KEY"
 - `api_key` / `api_key_env` (string, optional)
   Authentication material. Prefer `api_key_env` to reference an environment variable so secrets remain outside of the config file.
 
+- `list_ttl_secs` (integer, default: `60`)
+  Time-to-live for the cached model list used by the picker. Increase it to reduce background traffic or decrease it if you rotate models frequently.
+
+- `default_context_window` (integer, optional)
+  Expected maximum prompt length (tokens) for the provider. Owlen uses this to render the context usage gauge and warn when you approach the limit.
+
+- `hourly_quota_tokens` / `weekly_quota_tokens` (integer, optional)
+  Soft limits that drive the cloud usage gauge and `:limits` readout. Owlen tracks actual usage locally and compares it to these thresholds to raise 80% / 95% toasts.
+
 - `extra` (table, optional)
   Any additional, provider-specific parameters can be added here.
 
@@ -179,13 +194,15 @@ base_url = "https://ollama.com"
 api_key_env = "OLLAMA_API_KEY"
 hourly_quota_tokens = 50000
 weekly_quota_tokens = 250000
+list_ttl_secs = 60
+default_context_window = 8192
 ```
 
 Requests target the same `/api/chat` endpoint documented by Ollama and automatically include the API key using a `Bearer` authorization header. If you prefer not to store the key in the config file, either rely on `api_key_env` or export the `OLLAMA_API_KEY` environment variable manually (legacy names `OLLAMA_CLOUD_API_KEY` and `OWLEN_OLLAMA_CLOUD_API_KEY` continue to work, but Owlen will emit a warning). Owlen normalises the base URL automatically—it enforces HTTPS, trims trailing slashes, and accepts both `https://ollama.com` and `https://api.ollama.com` without rewriting the host.
 
-The quota fields are optional and purely informational—they are never sent to the provider. Owlen uses them to display hourly/weekly token usage in the chat header, emit pre-limit toasts at 80 % and 95 %, and power the `:limits` command. Adjust the numbers to reflect the soft limits on your account or remove the keys altogether if you do not want usage tracking.
+The quota fields are optional and purely informational—they are never sent to the provider. Owlen uses them to display hourly/weekly token usage in the chat header, emit pre-limit toasts at 80% and 95%, and power the `:limits` command. Adjust the numbers to reflect the soft limits on your account or remove the keys altogether if you do not want usage tracking.
 
-If your deployment exposes the web search endpoint under a different path, set `web_search_endpoint` in the same table. The default (`/api/web_search`) matches the Ollama Cloud REST API documented in the web retrieval guide.citeturn4open0
+If your deployment exposes the web search endpoint under a different path, set `web_search_endpoint` in the same table. The default (`/api/web_search`) matches the Ollama Cloud REST API documented in the web retrieval guide.
 
 > **Tip:** If the official `ollama signin` flow fails on Linux v0.12.3, follow the [Linux Ollama sign-in workaround](#linux-ollama-sign-in-workaround-v0123) in the troubleshooting guide to copy keys from a working machine or register them manually.
 
diff --git a/docs/troubleshooting.md b/docs/troubleshooting.md
index 918b801..12664b8 100644
--- a/docs/troubleshooting.md
+++ b/docs/troubleshooting.md
@@ -55,10 +55,19 @@ If Owlen is not behaving as you expect, there might be an issue with your config
 If you see `Auth` errors when using the hosted service:
 
 1. Run `owlen cloud setup` to register your API key (with `--api-key` for non-interactive use).
-2. Use `owlen cloud status` to verify Owlen can authenticate against [Ollama Cloud](https://docs.ollama.com/cloud).
-3. Ensure `providers.ollama.api_key` is set **or** export `OLLAMA_API_KEY` (legacy: `OLLAMA_CLOUD_API_KEY` / `OWLEN_OLLAMA_CLOUD_API_KEY`) when encryption is disabled. With `privacy.encrypt_local_data = true`, the key lives in the encrypted vault and is loaded automatically.
-4. Confirm the key has access to the requested models.
-5. Avoid pasting extra quotes or whitespace into the config file—`owlen config doctor` will normalise the entry for you.
+2. Use `owlen cloud status` to verify Owlen can authenticate against [Ollama Cloud](https://docs.ollama.com/cloud) with the canonical `https://ollama.com` base URL. Override the endpoint via `providers.ollama_cloud.base_url` only if your account is pointed at a custom region.
+3. Ensure `providers.ollama_cloud.api_key` is set **or** export `OLLAMA_API_KEY` (legacy: `OLLAMA_CLOUD_API_KEY` / `OWLEN_OLLAMA_CLOUD_API_KEY`) when encryption is disabled. With `privacy.encrypt_local_data = true`, the key lives in the encrypted vault and is loaded automatically.
+4. Confirm the key has access to the requested models. Recent accounts scope access per workspace; visit while signed in to double-check the SKU name.
+5. Owlen disables the cloud provider after consecutive 401/403 responses, posts a toast, and falls back to the last healthy local provider so you can keep chatting. Re-run `owlen cloud setup` and flip back with `:provider ollama_cloud` once the key is valid again.
+6. Avoid pasting extra quotes or whitespace into the config file—`owlen config doctor` will normalise the entry for you.
+
+## Ollama Cloud Rate Limits (HTTP 429)
+
+If the hosted API returns `HTTP 429 Too Many Requests`, Owlen keeps the provider enabled but surfaces a rate-limit toast and replays your message against the local provider so you do not lose work. To recover:
+
+1. Check the cockpit header or run `:limits` to see your locally tracked hourly/weekly totals. When either bar crosses 80% Owlen warns you; 95% triggers a critical toast.
+2. Raise or remove the soft quotas (`providers.ollama_cloud.hourly_quota_tokens`, `weekly_quota_tokens`) if your vendor allotment is higher, or pause cloud usage until the next window resets.
+3. If you need the cloud-only model, retry after the provider’s cooling-off period (Ollama currently resets the rate window hourly for most SKUs). Adjust `list_ttl_secs` upward if automated refreshes are consuming too many tokens.
 
 ### Linux Ollama Sign-In Workaround (v0.12.3)
 