docs(release): prep v0.2 guidance and config samples

AC:
- README badge shows 0.2.0 and highlights cloud fallback, quotas, web search.
- Configuration docs and sample config cover list TTL, quotas, context window, and updated env guidance.
- Troubleshooting docs explain authentication fallback and rate limit recovery.

Tests:
- Attempted 'cargo xtask lint-docs' (command unavailable: no such command: xtask).
2025-10-24 12:55:17 +02:00
parent 3f6d7d56f6
commit 7e2c6ea037
4 changed files with 44 additions and 7 deletions
--- a/README.md
+++ b/README.md
@@ -3,7 +3,7 @@
 > Terminal-native assistant for running local language models with a comfortable TUI.

 ![Status](https://img.shields.io/badge/status-alpha-yellow)
-![Version](https://img.shields.io/badge/version-0.1.11-blue)
+![Version](https://img.shields.io/badge/version-0.2.0-blue)
 ![Rust](https://img.shields.io/badge/made_with-Rust-ffc832?logo=rust&logoColor=white)
 ![License](https://img.shields.io/badge/license-AGPL--3.0-blue)

@@ -39,6 +39,13 @@ The refreshed chrome introduces a cockpit-style header with live gradient gauges
 - **Non-Blocking UI Loop**: Asynchronous generation tasks and provider health checks run off-thread, keeping the TUI responsive even while streaming long replies.
 - **Guided Setup**: `owlen config doctor` upgrades legacy configs and verifies your environment in seconds.

+## Upgrading to v0.2
+
+- **Local + Cloud resiliency**: Owlen now distinguishes the on-device daemon from Ollama Cloud and gracefully falls back to local if the hosted key is missing or unauthorized. Cloud requests include `Authorization: Bearer <API_KEY>` and reuse the canonical `https://ollama.com` base URL so you no longer hit 401 loops.
+- **Context + quota cockpit**: The header shows `context used / window (percentage)` and a second gauge for hourly/weekly cloud token usage. Configure soft limits via `providers.ollama_cloud.hourly_quota_tokens` and `weekly_quota_tokens`; Owlen tracks consumption locally even when the provider omits token counters.
+- **Web search tooling**: When cloud is enabled, models can call the `web.search` tool automatically. Toggle availability at runtime with `:web on` / `:web off` if you need a local-only session.
+- **Docs & config parity**: Ship-ready config templates now include per-provider `list_ttl_secs` and `default_context_window` values, plus explicit `OLLAMA_API_KEY` guidance. Run `owlen config doctor` after upgrading from v0.1 to normalize legacy keys and receive deprecation warnings for `OLLAMA_CLOUD_API_KEY` and `OWLEN_OLLAMA_CLOUD_API_KEY`.
+
 ## Security & Privacy

 Owlen is designed to keep data local by default while still allowing controlled access to remote tooling.
--- a/config.toml
+++ b/config.toml
@@ -9,6 +9,8 @@ encrypt_local_data = true
 enabled = true
 provider_type = "ollama"
 base_url = "http://localhost:11434"
+list_ttl_secs = 60
+default_context_window = 8192

 [providers.ollama_cloud]
 enabled = false
@@ -17,6 +19,8 @@ base_url = "https://ollama.com"
 api_key_env = "OLLAMA_API_KEY"
 hourly_quota_tokens = 50000
 weekly_quota_tokens = 250000
+list_ttl_secs = 60
+default_context_window = 8192

 [providers.openai]
 enabled = false
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -126,12 +126,18 @@ This section contains a table for each provider you want to configure. Owlen now
 enabled = true
 provider_type = "ollama"
 base_url = "http://localhost:11434"
+list_ttl_secs = 60
+default_context_window = 8192

 [providers.ollama_cloud]
 enabled = false
 provider_type = "ollama_cloud"
 base_url = "https://ollama.com"
 api_key_env = "OLLAMA_API_KEY"
+hourly_quota_tokens = 50000
+weekly_quota_tokens = 250000
+list_ttl_secs = 60
+default_context_window = 8192

 [providers.openai]
 enabled = false
@@ -158,6 +164,15 @@ api_key_env = "ANTHROPIC_API_KEY"
 -   `api_key` / `api_key_env` (string, optional)
    Authentication material. Prefer `api_key_env` to reference an environment variable so secrets remain outside of the config file.

+-   `list_ttl_secs` (integer, default: `60`)
+    Time-to-live for the cached model list used by the picker. Increase it to reduce background traffic or decrease it if you rotate models frequently.
+
+-   `default_context_window` (integer, optional)
+    Expected maximum prompt length (tokens) for the provider. Owlen uses this to render the context usage gauge and warn when you approach the limit.
+
+-   `hourly_quota_tokens` / `weekly_quota_tokens` (integer, optional)
+    Soft limits that drive the cloud usage gauge and `:limits` readout. Owlen tracks actual usage locally and compares it to these thresholds to raise 80% / 95% toasts.
+
 -   `extra` (table, optional)
    Any additional, provider-specific parameters can be added here.

@@ -179,13 +194,15 @@ base_url = "https://ollama.com"
 api_key_env = "OLLAMA_API_KEY"
 hourly_quota_tokens = 50000
 weekly_quota_tokens = 250000
+list_ttl_secs = 60
+default_context_window = 8192
 ```

 Requests target the same `/api/chat` endpoint documented by Ollama and automatically include the API key using a `Bearer` authorization header. If you prefer not to store the key in the config file, either rely on `api_key_env` or export the `OLLAMA_API_KEY` environment variable manually (legacy names `OLLAMA_CLOUD_API_KEY` and `OWLEN_OLLAMA_CLOUD_API_KEY` continue to work, but Owlen will emit a warning). Owlen normalises the base URL automatically—it enforces HTTPS, trims trailing slashes, and accepts both `https://ollama.com` and `https://api.ollama.com` without rewriting the host.

-The quota fields are optional and purely informational—they are never sent to the provider. Owlen uses them to display hourly/weekly token usage in the chat header, emit pre-limit toasts at 80 % and 95 %, and power the `:limits` command. Adjust the numbers to reflect the soft limits on your account or remove the keys altogether if you do not want usage tracking.
+The quota fields are optional and purely informational—they are never sent to the provider. Owlen uses them to display hourly/weekly token usage in the chat header, emit pre-limit toasts at 80% and 95%, and power the `:limits` command. Adjust the numbers to reflect the soft limits on your account or remove the keys altogether if you do not want usage tracking.

-If your deployment exposes the web search endpoint under a different path, set `web_search_endpoint` in the same table. The default (`/api/web_search`) matches the Ollama Cloud REST API documented in the web retrieval guide.citeturn4open0
+If your deployment exposes the web search endpoint under a different path, set `web_search_endpoint` in the same table. The default (`/api/web_search`) matches the Ollama Cloud REST API documented in the web retrieval guide.

 > **Tip:** If the official `ollama signin` flow fails on Linux v0.12.3, follow the [Linux Ollama sign-in workaround](#linux-ollama-sign-in-workaround-v0123) in the troubleshooting guide to copy keys from a working machine or register them manually.
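
Note on the base URL paragraph in this hunk: the documented normalisation (enforce HTTPS, trim trailing slashes, leave the host untouched) can be illustrated with a small sketch. This is not Owlen's actual code. The function name `normalize_cloud_base_url` is hypothetical, and the sketch assumes the rule applies only to the cloud provider, since the local daemon uses `http://localhost:11434`.

```rust
/// Hypothetical helper illustrating the documented behaviour for the
/// cloud provider's `base_url`; Owlen's real implementation may differ.
fn normalize_cloud_base_url(raw: &str) -> String {
    let trimmed = raw.trim();

    // Drop whatever scheme was supplied (if any); HTTPS is enforced below.
    let without_scheme = match trimmed.split_once("://") {
        Some((_scheme, rest)) => rest,
        None => trimmed,
    };

    // Trim trailing slashes but keep the host exactly as written, so
    // `ollama.com` and `api.ollama.com` both pass through unchanged.
    let host_and_path = without_scheme.trim_end_matches('/');

    format!("https://{host_and_path}")
}

fn main() {
    for raw in ["http://ollama.com/", "https://api.ollama.com//", "ollama.com"] {
        println!("{raw} -> {}", normalize_cloud_base_url(raw));
    }
}
```

The host is deliberately never rewritten, which matches the statement that both `https://ollama.com` and `https://api.ollama.com` are accepted as-is.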

--- a/docs/troubleshooting.md
+++ b/docs/troubleshooting.md
@@ -55,10 +55,19 @@ If Owlen is not behaving as you expect, there might be an issue with your config
 If you see `Auth` errors when using the hosted service:

 1. Run `owlen cloud setup` to register your API key (with `--api-key` for non-interactive use).
-2. Use `owlen cloud status` to verify Owlen can authenticate against [Ollama Cloud](https://docs.ollama.com/cloud).
-3. Ensure `providers.ollama.api_key` is set **or** export `OLLAMA_API_KEY` (legacy: `OLLAMA_CLOUD_API_KEY` / `OWLEN_OLLAMA_CLOUD_API_KEY`) when encryption is disabled. With `privacy.encrypt_local_data = true`, the key lives in the encrypted vault and is loaded automatically.
-4. Confirm the key has access to the requested models.
-5. Avoid pasting extra quotes or whitespace into the config file—`owlen config doctor` will normalise the entry for you.
+2. Use `owlen cloud status` to verify Owlen can authenticate against [Ollama Cloud](https://docs.ollama.com/cloud) with the canonical `https://ollama.com` base URL. Override the endpoint via `providers.ollama_cloud.base_url` only if your account is pointed at a custom region.
+3. Ensure `providers.ollama_cloud.api_key` is set **or** export `OLLAMA_API_KEY` (legacy: `OLLAMA_CLOUD_API_KEY` / `OWLEN_OLLAMA_CLOUD_API_KEY`) when encryption is disabled. With `privacy.encrypt_local_data = true`, the key lives in the encrypted vault and is loaded automatically.
+4. Confirm the key has access to the requested models. Recent accounts scope access per workspace; visit <https://ollama.com/models> while signed in to double-check the SKU name.
+5. Owlen disables the cloud provider after consecutive 401/403 responses, posts a toast, and falls back to the last healthy local provider so you can keep chatting. Re-run `owlen cloud setup` and flip back with `:provider ollama_cloud` once the key is valid again.
+6. Avoid pasting extra quotes or whitespace into the config file—`owlen config doctor` will normalise the entry for you.
+
+## Ollama Cloud Rate Limits (HTTP 429)
+
+If the hosted API returns `HTTP 429 Too Many Requests`, Owlen keeps the provider enabled but surfaces a rate-limit toast and replays your message against the local provider so you do not lose work. To recover:
+
+1. Check the cockpit header or run `:limits` to see your locally tracked hourly/weekly totals. When either bar crosses 80% Owlen warns you; 95% triggers a critical toast.
+2. Raise or remove the soft quotas (`providers.ollama_cloud.hourly_quota_tokens`, `weekly_quota_tokens`) if your vendor allotment is higher, or pause cloud usage until the next window resets.
+3. If you need the cloud-only model, retry after the provider’s cooling-off period (Ollama currently resets the rate window hourly for most SKUs). Adjust `list_ttl_secs` upward if automated refreshes are consuming too many tokens.

 ### Linux Ollama Sign-In Workaround (v0.12.3)
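
Note on the quota wording in this commit: the 80% / 95% toasts described for `hourly_quota_tokens` and `weekly_quota_tokens` amount to comparing locally tracked usage against an optional soft limit. The sketch below shows one way that comparison could look. `QuotaLevel` and `quota_level` are hypothetical names, and treating the thresholds as inclusive (`>=`) is an assumption; the docs only say a warning appears when a bar "crosses" 80%.

```rust
/// Hypothetical severity levels for the cloud usage gauge.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum QuotaLevel {
    Normal,
    Warning,  // assumed: usage at or above 80% of the soft limit
    Critical, // assumed: usage at or above 95% of the soft limit
}

/// Returns `None` when no quota is configured (the keys are optional),
/// otherwise the level implied by locally tracked token usage.
fn quota_level(used_tokens: u64, quota_tokens: Option<u64>) -> Option<QuotaLevel> {
    // A missing or zero quota means usage tracking is effectively off.
    let quota = quota_tokens.filter(|q| *q > 0)?;

    // Integer percentage, widened to u128 so the multiplication cannot overflow.
    let percent = (used_tokens as u128 * 100) / quota as u128;

    Some(if percent >= 95 {
        QuotaLevel::Critical
    } else if percent >= 80 {
        QuotaLevel::Warning
    } else {
        QuotaLevel::Normal
    })
}

fn main() {
    // 50_000 is the sample `hourly_quota_tokens` value from config.toml.
    let hourly_quota = Some(50_000);
    for used in [10_000_u64, 40_000, 47_500] {
        println!("{used} tokens -> {:?}", quota_level(used, hourly_quota));
    }
    // With the quota keys removed, no level is reported.
    println!("no quota -> {:?}", quota_level(10_000, None));
}
```

Because the quotas are purely informational and never sent to the provider, a check like this only drives the gauge and toasts; it does not block requests.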