Files
owlen/agents-2025-10-23.md

89 lines
12 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
## fix(config): align ollama cloud defaults with current api host ✅
- **Severity:** High older defaults used the non-canonical base host and bespoke env var, forcing manual tweaks before cloud requests succeeded (`crates/owlen-core/src/config.rs`, `config.toml`, `docs/configuration.md`).
- **Resolution (2025-10-23):** Canonical base URL now defaults to `https://ollama.com` per the latest Ollama Cloud REST guide, with automatic normalisation of `https://api.ollama.com` as a legacy alias. Configuration templating, CLI/TUI helpers, and generated docs reference the vendor-provided `OLLAMA_API_KEY`, while still accepting `OLLAMA_CLOUD_API_KEY` and `OWLEN_OLLAMA_CLOUD_API_KEY` with a one-time warning.
- **Verification:** Added unit coverage for base URL normalisation, adjusted config/env migration tests, and re-ran `cargo test -p owlen-core -p owlen-providers` to confirm env fallbacks continue to work.
## fix(provider/ollama): keep stream whitespace intact
- **Severity:** High `generate_stream` trims every line so assistants lose leading newlines and indentation (`crates/owlen-providers/src/ollama/shared.rs:135` and :161).
- **Issue:** Markdown blocks and code fences arrive malformed, forcing the UI to patch formatting after the fact.
- **Plan:** Replace the naive `trim()` with CR/LF stripping that preserves user-visible whitespace, keep incomplete line fragments buffered, and add defensive logging around JSON decode failures.
- **Acceptance Criteria:** Streaming chunks can start with whitespace (code blocks render correctly); end-of-stream metadata still lands in the final message; no regressions for local or cloud daemons.
- **Test Notes:** Add unit tests that feed `\r\n`-terminated chunks with leading spaces, plus an integration test using recorded `samples.json`; run `cargo test -p owlen-providers`.
## feat(provider/ollama): wire tool calling and response metadata
- **Severity:** High Ollama models never see tool descriptors because `prepare_chat_request` drops them and `supports_tools` is hard-coded false (`crates/owlen-core/src/providers/ollama.rs:825` and :1258).
- **Issue:** The agent pipeline cannot hand off `web.search` or other tools; chat stalls when the model requests a function call.
- **Plan:** Send tool schemas to Ollama, surface tool_call deltas back into `ChatResponse`, mark capable models as tool-ready, and ensure session handling replays tool results before resuming generation.
- **Acceptance Criteria:** Models that emit tool calls loop through the MCP tool registry without manual intervention; disabling tools blanks them out; streaming transcripts keep tool boundaries.
- **Test Notes:** Add provider tests covering `tool_calls` frames, extend session streaming tests with mock tool responses, and run `cargo test -p owlen-core`.
## feat(usage): track cloud token quotas and expose :limits
- **Severity:** Medium there is no hourly/weekly accounting or `:limits` command even though the UI promises it (`crates/owlen-core/src/session.rs`, `crates/owlen-tui/src/chat_app.rs:6880` area, `crates/owlen-tui/src/ui.rs:2280` area).
- **Issue:** Users cannot see or cap consumption, and the header lacks the cloud usage line from the spec.
- **Plan:** Introduce a persisted usage ledger (rolling hour/week windows) keyed by provider, surface quotas from config, add `:limits` to the palette, and render an extra header row with colored percentages plus 80%/95% toasts.
- **Acceptance Criteria:** Usage survives restart, `:limits` prints hourly and weekly tallies, header updates after each completion, and quota breaches raise a toast once per band.
- **Test Notes:** Write time-travel unit tests for the tracker, snapshot the new header rendering, add a TUI command test, and run `cargo test -p owlen-core -p owlen-tui`.
## feat(tool/web): route searches through provider and gate on cloud availability
- **Severity:** High `web_search` hits DuckDuckGo directly (`crates/owlen-core/src/tools/web_search.rs:14`) and the registry exposes it even when the cloud provider is disabled (`crates/owlen-core/src/session.rs:245` range).
- **Issue:** In restricted environments the tool fails silently, and the model keeps saying it lacks internet access despite the stated goal.
- **Plan:** Replace the DuckDuckGo client with a provider-aware adapter that calls the configured cloud search endpoint, hide the tool unless the active provider advertises web search support, and surface clean errors for 401/429 responses.
- **Acceptance Criteria:** With a valid cloud key the assistant auto-invokes `web.search`; without cloud access the tool is absent and the model does not ask for it; detailed errors reach the toast system.
- **Test Notes:** Add Wiremock-backed integration tests for happy path, 401, and 429; assert tool registration toggles via session tests; run `cargo test -p owlen-core --tests tools`.
## feat(ui): modernize chat chrome and usage surfacing
- **Severity:** Medium the current TUI still uses heavy bordered blocks and text badges (`crates/owlen-tui/src/ui.rs:200` onward) and lacks the cockpit-style progress bars requested for context and usage.
- **Issue:** Visual hierarchy is dated, the context badge is text-only, and screenshots (`images/layout.png`) no longer match the roadmap.
- **Plan:** Refresh the header with gradient bars for context and cloud usage, adopt lighter glassmorphism-inspired panels, tighten spacing, update themes, and regenerate screenshots for docs.
- **Acceptance Criteria:** Header shows context and usage bars with color thresholds, panels use the new styling without breaking vim keybinds, and updated screenshots appear in docs.
- **Test Notes:** Update golden snapshot tests for the layout, add UI smoke tests for narrow terminals, and capture new PNGs via the existing screenshot script.
## docs(release): prep v0.2 guidance and config samples
- **Severity:** Medium public docs still advertise v0.1.11 and outdated setup steps (`README.md:6`, `docs/troubleshooting.md`, `AGENTS.md` samples out of sync with actual config schema).
- **Issue:** Users follow stale instructions, miss new keys (hourly/weekly quotas), and cannot reconcile the config example with generated files.
- **Plan:** Bump version strings to v0.2, document the new config sections (quotas, list TTL), add troubleshooting for cloud auth and rate limits, and note the revised env naming.
- **Acceptance Criteria:** README, docs, and sample configs match the implemented features; upgrade notes explain migration steps; release checklist reflects the new DoD.
- **Test Notes:** Run link checker over `docs/`, lint Markdown with `cargo xtask lint-docs`, and have CI confirm the changelog entry.
## feat(commands): expose `:web on|off` toggle in both TUI and CLI
- **Severity:** Medium there is no runtime switch for the web-search tool (`crates/owlen-tui/src/commands/registry.rs`, `crates/owlen-tui/src/chat_app.rs`), so users must hand-edit config to disable remote lookups when on metered links.
- **Issue:** Lack of a fast toggle contradicts the DoD requirement for obvious tooling controls and encourages unsafe defaults when privacy-sensitive users need a local-only session.
- **Plan:** Add `:web on|off` and `owlen providers web --enable/--disable` commands that mutate `config.tools.web_search.enabled`, broadcast the change to active sessions, and synchronise the header indicator/toast copy.
- **Acceptance Criteria:** Toggling via command takes effect immediately (tool descriptors disappear/appear for the next request), the setting persists to disk, and both CLI+TUI help list the new flags.
- **Test Notes:** Extend command registry unit tests, add an integration test that toggles and asserts tool exposure, and validate persistence with a temporary config dir.
## feat(config): align generated config with provider sections & env fallbacks
- **Severity:** Medium the sample `config.toml` and `README` snippets still use flat `[providers.ollama_cloud]` keys with `api_key_env = "OLLAMA_CLOUD_API_KEY"` and base URL `https://ollama.com` (`config.toml:16`, `README.md:119`), diverging from `crates/owlen-core/src/config.rs`.
- **Issue:** Onboarding breaks because the UI and doctor expect the `[providers.ollama_cloud]` schema with `OLLAMA_API_KEY` as the default (while warning on legacy names), so fresh installs require manual correction.
- **Plan:** Regenerate the CLI templated config, update documentation snippets, implement migration warnings for the legacy env name, and ensure doctor explains both env vars with precedence.
- **Acceptance Criteria:** `owlen config init` writes the new schema, doctor reports actionable guidance when the legacy env is present, and README/config docs match the generated file.
- **Test Notes:** Snapshot regeneration output, add config migration tests covering legacy envs, and run `cargo test -p owlen-core config` suite.
## refactor(errors): typed provider failures mapped to structured toasts
- **Severity:** High the TUI currently inspects error strings for `"429"` / `"rate limit"` substrings (`crates/owlen-tui/src/chat_app.rs:10377`) and loses context when providers change wording.
- **Issue:** String matching misses localized messages and risks leaking raw responses into the UI, preventing reliable recovery and localisation.
- **Plan:** Introduce a typed error enum in `owlen-core` (`Error::Unauthorized`, `::RateLimited`, `::Unavailable`, etc.), propagate it through providers, and map the variants to toast/banners with deterministic copy.
- **Acceptance Criteria:** Cloud 401/429 cases surface the new typed variants, the UI no longer string-parses errors, logs redact secrets, and analytics tests assert the mapping.
- **Test Notes:** Extend provider unit tests for HTTP status translation, add UI tests that inject each error variant, and fuzz unknown error text to ensure graceful fallback.
## test(integration): cover local/cloud/search tool happy and unhappy paths
- **Severity:** Medium there are no Wiremock-backed integration cases validating multi-provider flows; regressions slip through (`crates/owlen-core/tests` lacks HTTP stubs).
- **Issue:** Streaming/tool-call regressions and auth handling bugs reach users because only unit tests exist.
- **Plan:** Create integration tests that spin up stub servers for `/api/tags`, `/api/chat`, and `/v1/web/search`, covering 200/401/429 responses, tool-call loops, and rate-limit recovery.
- **Acceptance Criteria:** CI runs the new suite, failures are deterministic, and each scenario asserts provider status transitions plus persisted usage updates.
- **Test Notes:** Use `wiremock` for HTTP stubbing, reuse sample transcripts from `samples.json`, and document the fixtures under `tests/fixtures/`.
## chore(deps/ui): upgrade to Ratatui 0.29+ and enable gradient widgets
- **Severity:** Medium we pin `ratatui = 0.28` and `crossterm = 0.28` (`Cargo.toml`), missing gradient fills, flexible layout APIs, and native mouse handling released in 2025.
- **Issue:** Without the newer widgets we hand-roll gradients/badges and duplicate layout math, increasing maintenance debt.
- **Plan:** Bump Ratatui/Crossterm, adopt the `Gauge` gradient support for context/usage bars, replace custom layout helpers with the new `Flex` API, and ensure theme palettes cover extended colour spaces.
- **Acceptance Criteria:** Build succeeds with the upgraded stack, header widgets use the new gradient gauges, and CI snapshots reflect the refreshed rendering.
- **Test Notes:** Run `cargo test -p owlen-tui`, update golden snapshots, and manually verify mouse/resize handling in the demo app.
## chore(release): bump workspace metadata to v0.2
- **Severity:** Medium the workspace version, crates, and README badges still say 0.1.11 (`Cargo.toml`, `README.md:8`), undermining the v0.2 delivery narrative.
- **Issue:** Packaging scripts (AUR, PKGBUILD) and CLI `--version` output disagree with the roadmap, confusing users and automated installers.
- **Plan:** Update `[workspace.package].version`, propagate the bump to crate manifests, adjust badges/screenshots, and refresh packaging metadata (PKGBUILD, docs/CHANGELOG).
- **Acceptance Criteria:** `owlen --version` prints 0.2.0, README badges align, and release notes enumerate the v0.2 features.
- **Test Notes:** Run `cargo xtask lint-release`, build the Arch package locally, and smoke-test binary version output.