Files
owlen/agents-2025-10-23.md

12 KiB
Raw Blame History

fix(config): align ollama cloud defaults with current api host

  • Severity: High older defaults used the non-canonical base host and bespoke env var, forcing manual tweaks before cloud requests succeeded (crates/owlen-core/src/config.rs, config.toml, docs/configuration.md).
  • Resolution (2025-10-23): Canonical base URL now defaults to https://ollama.com per the latest Ollama Cloud REST guide, with automatic normalisation of https://api.ollama.com as a legacy alias. Configuration templating, CLI/TUI helpers, and generated docs reference the vendor-provided OLLAMA_API_KEY, while still accepting OLLAMA_CLOUD_API_KEY and OWLEN_OLLAMA_CLOUD_API_KEY with a one-time warning.
  • Verification: Added unit coverage for base URL normalisation, adjusted config/env migration tests, and re-ran cargo test -p owlen-core -p owlen-providers to confirm env fallbacks continue to work.

fix(provider/ollama): keep stream whitespace intact

  • Severity: High generate_stream trims every line so assistants lose leading newlines and indentation (crates/owlen-providers/src/ollama/shared.rs:135 and :161).
  • Issue: Markdown blocks and code fences arrive malformed, forcing the UI to patch formatting after the fact.
  • Plan: Replace the naive trim() with CR/LF stripping that preserves user-visible whitespace, keep incomplete line fragments buffered, and add defensive logging around JSON decode failures.
  • Acceptance Criteria: Streaming chunks can start with whitespace (code blocks render correctly); end-of-stream metadata still lands in the final message; no regressions for local or cloud daemons.
  • Test Notes: Add unit tests that feed \r\n-terminated chunks with leading spaces, plus an integration test using recorded samples.json; run cargo test -p owlen-providers.

feat(provider/ollama): wire tool calling and response metadata

  • Severity: High Ollama models never see tool descriptors because prepare_chat_request drops them and supports_tools is hard-coded false (crates/owlen-core/src/providers/ollama.rs:825 and :1258).
  • Issue: The agent pipeline cannot hand off web.search or other tools; chat stalls when the model requests a function call.
  • Plan: Send tool schemas to Ollama, surface tool_call deltas back into ChatResponse, mark capable models as tool-ready, and ensure session handling replays tool results before resuming generation.
  • Acceptance Criteria: Models that emit tool calls loop through the MCP tool registry without manual intervention; disabling tools blanks them out; streaming transcripts keep tool boundaries.
  • Test Notes: Add provider tests covering tool_calls frames, extend session streaming tests with mock tool responses, and run cargo test -p owlen-core.

feat(usage): track cloud token quotas and expose :limits

  • Severity: Medium there is no hourly/weekly accounting or :limits command even though the UI promises it (crates/owlen-core/src/session.rs, crates/owlen-tui/src/chat_app.rs:6880 area, crates/owlen-tui/src/ui.rs:2280 area).
  • Issue: Users cannot see or cap consumption, and the header lacks the cloud usage line from the spec.
  • Plan: Introduce a persisted usage ledger (rolling hour/week windows) keyed by provider, surface quotas from config, add :limits to the palette, and render an extra header row with colored percentages plus 80%/95% toasts.
  • Acceptance Criteria: Usage survives restart, :limits prints hourly and weekly tallies, header updates after each completion, and quota breaches raise a toast once per band.
  • Test Notes: Write time-travel unit tests for the tracker, snapshot the new header rendering, add a TUI command test, and run cargo test -p owlen-core -p owlen-tui.

feat(tool/web): route searches through provider and gate on cloud availability

  • Severity: High web_search hits DuckDuckGo directly (crates/owlen-core/src/tools/web_search.rs:14) and the registry exposes it even when the cloud provider is disabled (crates/owlen-core/src/session.rs:245 range).
  • Issue: In restricted environments the tool fails silently, and the model keeps saying it lacks internet access despite the stated goal.
  • Plan: Replace the DuckDuckGo client with a provider-aware adapter that calls the configured cloud search endpoint, hide the tool unless the active provider advertises web search support, and surface clean errors for 401/429 responses.
  • Acceptance Criteria: With a valid cloud key the assistant auto-invokes web.search; without cloud access the tool is absent and the model does not ask for it; detailed errors reach the toast system.
  • Test Notes: Add Wiremock-backed integration tests for happy path, 401, and 429; assert tool registration toggles via session tests; run cargo test -p owlen-core --tests tools.

feat(ui): modernize chat chrome and usage surfacing

  • Severity: Medium the current TUI still uses heavy bordered blocks and text badges (crates/owlen-tui/src/ui.rs:200 onward) and lacks the cockpit-style progress bars requested for context and usage.
  • Issue: Visual hierarchy is dated, the context badge is text-only, and screenshots (images/layout.png) no longer match the roadmap.
  • Plan: Refresh the header with gradient bars for context and cloud usage, adopt lighter glassmorphism-inspired panels, tighten spacing, update themes, and regenerate screenshots for docs.
  • Acceptance Criteria: Header shows context and usage bars with color thresholds, panels use the new styling without breaking vim keybinds, and updated screenshots appear in docs.
  • Test Notes: Update golden snapshot tests for the layout, add UI smoke tests for narrow terminals, and capture new PNGs via the existing screenshot script.

docs(release): prep v0.2 guidance and config samples

  • Severity: Medium public docs still advertise v0.1.11 and outdated setup steps (README.md:6, docs/troubleshooting.md, AGENTS.md samples out of sync with actual config schema).
  • Issue: Users follow stale instructions, miss new keys (hourly/weekly quotas), and cannot reconcile the config example with generated files.
  • Plan: Bump version strings to v0.2, document the new config sections (quotas, list TTL), add troubleshooting for cloud auth and rate limits, and note the revised env naming.
  • Acceptance Criteria: README, docs, and sample configs match the implemented features; upgrade notes explain migration steps; release checklist reflects the new DoD.
  • Test Notes: Run link checker over docs/, lint Markdown with cargo xtask lint-docs, and have CI confirm the changelog entry.

feat(commands): expose :web on|off toggle in both TUI and CLI

  • Severity: Medium there is no runtime switch for the web-search tool (crates/owlen-tui/src/commands/registry.rs, crates/owlen-tui/src/chat_app.rs), so users must hand-edit config to disable remote lookups when on metered links.
  • Issue: Lack of a fast toggle contradicts the DoD requirement for obvious tooling controls and encourages unsafe defaults when privacy-sensitive users need a local-only session.
  • Plan: Add :web on|off and owlen providers web --enable/--disable commands that mutate config.tools.web_search.enabled, broadcast the change to active sessions, and synchronise the header indicator/toast copy.
  • Acceptance Criteria: Toggling via command takes effect immediately (tool descriptors disappear/appear for the next request), the setting persists to disk, and both CLI+TUI help list the new flags.
  • Test Notes: Extend command registry unit tests, add an integration test that toggles and asserts tool exposure, and validate persistence with a temporary config dir.

feat(config): align generated config with provider sections & env fallbacks

  • Severity: Medium the sample config.toml and README snippets still use flat [providers.ollama_cloud] keys with api_key_env = "OLLAMA_CLOUD_API_KEY" and base URL https://ollama.com (config.toml:16, README.md:119), diverging from crates/owlen-core/src/config.rs.
  • Issue: Onboarding breaks because the UI and doctor expect the [providers.ollama_cloud] schema with OLLAMA_API_KEY as the default (while warning on legacy names), so fresh installs require manual correction.
  • Plan: Regenerate the CLI templated config, update documentation snippets, implement migration warnings for the legacy env name, and ensure doctor explains both env vars with precedence.
  • Acceptance Criteria: owlen config init writes the new schema, doctor reports actionable guidance when the legacy env is present, and README/config docs match the generated file.
  • Test Notes: Snapshot regeneration output, add config migration tests covering legacy envs, and run cargo test -p owlen-core config suite.

refactor(errors): typed provider failures mapped to structured toasts

  • Severity: High the TUI currently inspects error strings for "429" / "rate limit" substrings (crates/owlen-tui/src/chat_app.rs:10377) and loses context when providers change wording.
  • Issue: String matching misses localized messages and risks leaking raw responses into the UI, preventing reliable recovery and localisation.
  • Plan: Introduce a typed error enum in owlen-core (Error::Unauthorized, ::RateLimited, ::Unavailable, etc.), propagate it through providers, and map the variants to toast/banners with deterministic copy.
  • Acceptance Criteria: Cloud 401/429 cases surface the new typed variants, the UI no longer string-parses errors, logs redact secrets, and analytics tests assert the mapping.
  • Test Notes: Extend provider unit tests for HTTP status translation, add UI tests that inject each error variant, and fuzz unknown error text to ensure graceful fallback.

test(integration): cover local/cloud/search tool happy and unhappy paths

  • Severity: Medium there are no Wiremock-backed integration cases validating multi-provider flows; regressions slip through (crates/owlen-core/tests lacks HTTP stubs).
  • Issue: Streaming/tool-call regressions and auth handling bugs reach users because only unit tests exist.
  • Plan: Create integration tests that spin up stub servers for /api/tags, /api/chat, and /v1/web/search, covering 200/401/429 responses, tool-call loops, and rate-limit recovery.
  • Acceptance Criteria: CI runs the new suite, failures are deterministic, and each scenario asserts provider status transitions plus persisted usage updates.
  • Test Notes: Use wiremock for HTTP stubbing, reuse sample transcripts from samples.json, and document the fixtures under tests/fixtures/.

chore(deps/ui): upgrade to Ratatui 0.29+ and enable gradient widgets

  • Severity: Medium we pin ratatui = 0.28 and crossterm = 0.28 (Cargo.toml), missing gradient fills, flexible layout APIs, and native mouse handling released in 2025.
  • Issue: Without the newer widgets we hand-roll gradients/badges and duplicate layout math, increasing maintenance debt.
  • Plan: Bump Ratatui/Crossterm, adopt the Gauge gradient support for context/usage bars, replace custom layout helpers with the new Flex API, and ensure theme palettes cover extended colour spaces.
  • Acceptance Criteria: Build succeeds with the upgraded stack, header widgets use the new gradient gauges, and CI snapshots reflect the refreshed rendering.
  • Test Notes: Run cargo test -p owlen-tui, update golden snapshots, and manually verify mouse/resize handling in the demo app.

chore(release): bump workspace metadata to v0.2

  • Severity: Medium the workspace version, crates, and README badges still say 0.1.11 (Cargo.toml, README.md:8), undermining the v0.2 delivery narrative.
  • Issue: Packaging scripts (AUR, PKGBUILD) and CLI --version output disagree with the roadmap, confusing users and automated installers.
  • Plan: Update [workspace.package].version, propagate the bump to crate manifests, adjust badges/screenshots, and refresh packaging metadata (PKGBUILD, docs/CHANGELOG).
  • Acceptance Criteria: owlen --version prints 0.2.0, README badges align, and release notes enumerate the v0.2 features.
  • Test Notes: Run cargo xtask lint-release, build the Arch package locally, and smoke-test binary version output.