owlen/agents-2025-10-25.md
vikingowl 3c6e689de9 docs(mcp): benchmark leading client ecosystems
Acceptance-Criteria:
- docs cite the MCP identifier regex and enumerate the combined connector bundle.
- legacy dotted identifiers are removed from the plan in favour of compliant names.

Test-Notes:
- docs-only change; no automated tests required.
2025-10-25 06:31:05 +02:00


fix(markdown): restore ratatui bold assertions after API change

  • Severity: High — cargo test and cargo clippy -- -D warnings fail because Style::contains no longer exists in ratatui 0.29, breaking owlen-markdown test builds (crates/owlen-markdown/src/lib.rs:254 and :268).
  • Issue: The tests still rely on the old Style::contains helper instead of checking the add_modifier / sub_modifier bitflags exposed by the current ratatui Style API.
  • Plan: Replace the obsolete calls with span.style.add_modifier.contains(Modifier::BOLD) (and the italic equivalent if added later), add a regression test covering italic detection, and run cargo test + cargo clippy --all-targets.
  • Acceptance Criteria: Workspace tests and clippy finish cleanly; markdown heading and inline-code styling still render bold markers in the snapshot helper.
  • Test Notes: cargo test -p owlen-markdown, cargo clippy -p owlen-markdown --tests -- -D warnings.

fix(commands): surface :limits and :web toggles in command palette & help

  • Severity: Medium — users cannot discover the new quota tooling because CommandSpec omits :limits and the help footer still lacks the entry (crates/owlen-tui/src/commands/mod.rs, crates/owlen-tui/src/ui.rs:4230 region).
  • Issue: The command dispatcher handles limits/web but the suggestion catalog and help overlay stayed on the pre-v0.2 list, so completions and docs are stale.
  • Plan: Add CommandSpec entries for limits, web on, web off, and web status, and ensure the help overlay's shortcuts table advertises both :limits and the web toggle; expand the suggestion unit test to guard the set.
  • Acceptance Criteria: Typing :li suggests limits; F1 help shows both the quota readout and the web toggle; no regressions in existing command palette behaviour.
  • Test Notes: cargo test -p owlen-tui tests::suggestions_prioritize_agent_start, plus a new snapshot/assertion for the updated command list.
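A minimal sketch of the catalog additions and the prefix matching that the acceptance criteria describe. The `CommandSpec` struct here is a hypothetical stand-in (the real type in crates/owlen-tui/src/commands/mod.rs may have different fields), and the `suggest` helper and description strings are assumptions for illustration only.

```rust
/// Hypothetical stand-in for Owlen's `CommandSpec`; the real struct may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CommandSpec {
    pub keyword: &'static str,
    pub description: &'static str,
}

/// The four entries named in the plan. Descriptions are illustrative wording.
pub const NEW_COMMANDS: &[CommandSpec] = &[
    CommandSpec { keyword: "limits", description: "Show quota usage" },
    CommandSpec { keyword: "web on", description: "Enable web tools" },
    CommandSpec { keyword: "web off", description: "Disable web tools" },
    CommandSpec { keyword: "web status", description: "Show web tool state" },
];

/// Return every keyword that starts with the typed input, in catalog order.
/// A leading ':' is ignored and matching is case-insensitive on the input.
pub fn suggest(catalog: &[CommandSpec], input: &str) -> Vec<&'static str> {
    let needle = input
        .trim_start_matches(':')
        .trim_start()
        .to_ascii_lowercase();
    catalog
        .iter()
        .filter(|spec| spec.keyword.starts_with(needle.as_str()))
        .map(|spec| spec.keyword)
        .collect()
}
```

With this shape, `suggest(NEW_COMMANDS, ":li")` yields only `limits`, which is the behaviour the expanded suggestion test would guard.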

docs(mcp): benchmark leading MCP client ecosystems

  • Severity: High. Owlen's docs still reference web.search and a narrow local catalog, while contemporary Model Context Protocol deployments require spec-compliant identifiers and broad connector bundles spanning filesystem, terminal, browser, fetch, git, python, notebooks, sequential reasoning, search, storage, and productivity SaaS. Without parity guidance we keep setting expectations that real deployments do not meet, and interoperability suffers.
  • Issue: The protocol enforces ^[A-Za-z0-9_-]{1,64}$ tool identifiers, yet our samples still show dotted aliases, and we never document the published connector stacks covering automation, web access, code, data stores, and collaboration that power the reference implementations. Owlen lacks a migration path to that combined bundle.
  • Plan:
    1. Publish a naming quick-reference describing the regex, why dotted IDs fail validation, and how to qualify tools while staying under the 64-character limit.
    2. Add a parity matrix enumerating the combined connector bundle (filesystem, terminal, browser, fetch, git, python, notebook, sequential_thinking, brave_search, firecrawl, puppeteer, memory_bank, slack, notion, github, google_drive, s3, sqlite, qdrant, stripe, sentry) with columns for binaries, auth scopes, environment variables, and Owlen preset commands.
    3. Update README/CHANGELOG/config samples to walk through enabling the preset (install command, scopes, toggles) and note that legacy aliases are removed in favor of spec-compliant identifiers.
    4. Document troubleshooting for validation failures, missing dependencies, and audit outputs produced by the new preset installer so teams can reconcile gaps quickly.
  • Acceptance Criteria: Docs cite the spec naming rule, list the parity connectors with actionable setup steps, and remove references to dotted or deprecated tool names.
  • Test Notes: Run doc link checks and dry-run the install/audit commands referenced in the samples to confirm validation passes without warnings.
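For the naming quick-reference, the rule ^[A-Za-z0-9_-]{1,64}$ can be shown as a small std-only check, which also makes it obvious why a dotted ID such as web.search fails. This is a sketch rather than Owlen's actual validator; `validate_tool_name`, `NameError`, and the replacement name web_search are hypothetical.

```rust
/// Upper bound taken from the rule `^[A-Za-z0-9_-]{1,64}$`.
pub const MAX_TOOL_NAME_LEN: usize = 64;

/// Hypothetical error type describing which part of the rule was violated.
#[derive(Debug, PartialEq, Eq)]
pub enum NameError {
    Empty,
    InvalidChar(char),
    TooLong(usize),
}

/// Check a tool identifier against the character class and length bound.
pub fn validate_tool_name(name: &str) -> Result<(), NameError> {
    if name.is_empty() {
        return Err(NameError::Empty);
    }
    // Only ASCII letters, digits, '_' and '-' are allowed; '.' is not.
    let first_bad = name
        .chars()
        .find(|c| !(c.is_ascii_alphanumeric() || *c == '_' || *c == '-'));
    if let Some(bad) = first_bad {
        return Err(NameError::InvalidChar(bad));
    }
    // Every character is ASCII at this point, so byte length equals char count.
    if name.len() > MAX_TOOL_NAME_LEN {
        return Err(NameError::TooLong(name.len()));
    }
    Ok(())
}
```

Under this check, `validate_tool_name("web.search")` returns `Err(NameError::InvalidChar('.'))`, while an underscore form such as web_search passes.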

feat(mcp): enforce spec-compliant registry & cross-client parity

  • Severity: High. Contemporary MCP deployments reject dotted tool names and rely on spec-compliant, qualified identifiers, so Owlen must enforce the same rules or bundled connectors will fail at runtime.
  • Issue: Tool registration still accepts invalid identifiers, lacks telemetry on legacy aliases, and provides no helpers for industry qualification schemes. We also diverge from current SDK adapters, risking drift whenever those implementations evolve.
  • Plan:
    1. Harden ToolRegistry validation (regex enforcement, structured warnings, telemetry) and reject dotted names outright.
    2. Ship qualification utilities to generate {server}__{tool} identifiers while keeping within the 64-character limit.
    3. Reuse or vendor the definitive MCP SDK adapters for filesystem, terminal, git/browser, python, notebook, observability, and productivity connectors.
    4. Add conformance tests that simulate mainstream client handshake flows (WireMock fixtures), assert naming compliance, and verify telemetry events.
    5. Provide scripts/telemetry dashboards to identify and immediately remove any in-tree tools that remain non-compliant.
  • Acceptance Criteria: Invalid identifiers are rejected by default, telemetry confirms no legacy aliases remain, reference connectors load without manual tweaks, and test suites cover qualification rules.
  • Test Notes: cargo test -p owlen-core tool_registry::validates_names; integration suites for reference handshake flows; telemetry snapshot tests for alias usage.
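The qualification utility from step 2 could look roughly like the sketch below. It assumes the helper rejects a combination that exceeds 64 characters instead of truncating it (the plan does not say which), and the names `qualify`, `is_valid_part`, and the example tool create_issue are hypothetical.

```rust
/// Length bound from the identifier rule.
pub const MAX_LEN: usize = 64;
/// Separator used by the `{server}__{tool}` scheme.
pub const SEPARATOR: &str = "__";

/// True when the part is non-empty and uses only `[A-Za-z0-9_-]`.
fn is_valid_part(part: &str) -> bool {
    !part.is_empty()
        && part
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '-')
}

/// Build `{server}__{tool}`.
/// Returns `None` if either part is invalid or the result would exceed 64
/// characters. Rejecting (rather than truncating) is an assumption here.
pub fn qualify(server: &str, tool: &str) -> Option<String> {
    if !is_valid_part(server) || !is_valid_part(tool) {
        return None;
    }
    let qualified = format!("{}{}{}", server, SEPARATOR, tool);
    if qualified.len() <= MAX_LEN {
        Some(qualified)
    } else {
        None
    }
}
```

Rejecting over-long names keeps the mapping from qualified identifier back to server and tool unambiguous; a truncating variant would need a collision strategy that the plan does not describe.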

feat(mcp): bundle reference toolchains & audit legacy tools

  • Severity: High. Achieving parity now means shipping the union of the published MCP connector bundles while pruning weaker in-app equivalents.
  • Issue: Owlen offers no preset installer, its existing tools may lag the external bundles, and there's no automated audit to retire subpar implementations.
  • Plan:
    1. Build combined presets (standard, extended, full) that install the documented connector set with capability metadata (network, filesystem, auth scopes).
    2. Implement :tools install <preset> and :tools audit commands to materialize presets, diff existing tools, and delete redundant or non-compliant entries automatically (with confirmation).
    3. Add health checks and smoke tests for each connector (filesystem, HTTP/browser, git, python, notebook, observability, productivity SaaS) to ensure parity behavior.
    4. Immediately remove or rewrite any Owlen-native tool that fails the audit.
  • Acceptance Criteria: Presets install without manual edits, audits leave no legacy tools enabled, smoke tests cover each connector, and release notes reflect the pared-down toolset.
  • Test Notes: Integration harness invoking the shared SDK shims; golden transcripts per connector; CI audit report artifacts.
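The diffing step behind :tools audit might be sketched as follows. The `AuditReport` shape, its three buckets, and the `audit` function are assumptions for illustration; the plan only states that the audit diffs existing tools against a preset and flags redundant or non-compliant entries.

```rust
use std::collections::BTreeSet;

/// Hypothetical audit result; names in each bucket are sorted alphabetically.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct AuditReport {
    /// In the preset but not installed.
    pub missing: Vec<String>,
    /// Installed but failing the `[A-Za-z0-9_-]{1,64}` naming rule.
    pub non_compliant: Vec<String>,
    /// Installed and compliant, but not part of the preset.
    pub extra: Vec<String>,
}

fn is_compliant(name: &str) -> bool {
    (1..=64).contains(&name.len())
        && name
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '-')
}

/// Compare the installed tool names against a preset's expected names.
pub fn audit(preset: &[&str], installed: &[&str]) -> AuditReport {
    let wanted: BTreeSet<&str> = preset.iter().copied().collect();
    let have: BTreeSet<&str> = installed.iter().copied().collect();
    let mut report = AuditReport::default();

    for name in wanted.difference(&have) {
        report.missing.push((*name).to_string());
    }
    for name in &have {
        if !is_compliant(name) {
            report.non_compliant.push((*name).to_string());
        } else if !wanted.contains(name) {
            report.extra.push((*name).to_string());
        }
    }
    report
}
```

A confirmation prompt, as the plan requires before deletion, would sit on top of the `non_compliant` and `extra` lists; it is left out here.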

feat(ui/chat): polish header gauges and message chrome

  • Severity: Medium — the “cockpit” header hardcodes a 60/40 split and flat gauge bars, while message cards share a single background, leaving the refreshed styling half-done (crates/owlen-tui/src/ui.rs:360-720, :1995-2230).
  • Plan:
    1. Introduce responsive gauge layouts that collapse into stacked cards below ~80 columns and add subtle divider lines instead of raw paragraphs.
    2. Apply rounded blocks with translucent overlays per message bubble (alternate colors for user/assistant/tool) while keeping the glass palette shadow wrapper.
    3. Surface provider status chips (/⚠️) next to the header's provider label and show textual thresholds for 60%/85% to satisfy non-color users.
  • Acceptance Criteria: Header adapts between ≥100 cols (two-column gauges) and <80 cols (stacked cards); message list renders alternating glass cards without clipping; provider status chip updates as health changes.
  • Test Notes: Add a ratatui snapshot test capturing header + first messages at 100×35 and 80×24; manual check with cargo run -p owlen-tui -- --mock.
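The breakpoint logic can be kept separate from rendering so it is easy to unit-test. The sketch below assumes a single breakpoint at 80 columns, which satisfies both stated criteria (two-column at 100 or more, stacked below 80); the behaviour between 80 and 99 columns is not specified above, so treating it as two-column is an assumption, as are the names `HeaderLayout` and `header_layout`.

```rust
/// Hypothetical layout modes for the chat header gauges.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HeaderLayout {
    /// Gauges side by side.
    TwoColumn,
    /// Gauges collapsed into stacked cards.
    Stacked,
}

/// Width (in terminal columns) below which the gauges stack.
pub const STACK_BELOW_COLS: u16 = 80;

/// Pick a layout from the available terminal width.
/// Widths of 80 to 99 columns are treated as two-column here; that range
/// is an assumption, since only the >=100 and <80 cases are specified.
pub fn header_layout(width: u16) -> HeaderLayout {
    if width < STACK_BELOW_COLS {
        HeaderLayout::Stacked
    } else {
        HeaderLayout::TwoColumn
    }
}
```

The render code would then branch on the returned mode instead of hardcoding the 60/40 split.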

test(tui): add golden streaming flows for chat + tool calls

  • Severity: Medium — current integration tests cover controller logic but never assert rendered frames, leaving regressions in modernized UI undetected (crates/owlen-tui/tests only exercise business logic).
  • Issue: Without framebuffer snapshots we can't guarantee the new glass header, usage gauges, or tool call badges survive refactors.
  • Plan: Use ratatui::backend::TestBackend with deterministic seeds to render the chat screen for: idle state, streaming assistant reply, tool call request/response, and quota toast. Store fixtures with insta (or similar) under crates/owlen-tui/tests/snapshots.
  • Acceptance Criteria: Failing snapshots catch layout regressions; CI runs them under cargo test -p owlen-tui --tests.
  • Test Notes: New tests render_chat_idle_snapshot, render_chat_tool_call_snapshot.

feat(accessibility): add high-contrast & reduced-chrome display modes

  • Severity: Medium — accessibility guidance for terminals recommends controllable contrast, focus outlines, and non-color status cues, which the current glass theme only partially addresses (themes/ palette + render_chat_header logic).
  • Issue: Color-only indicators (“Cloud usage pending”, gradient bars) conflict with guidelines for low-vision users; no toggle exists for simplified chrome.
  • Plan:
    1. Introduce a ui.accessibility config block with high_contrast and reduced_chrome flags that swap to the bundled grayscale theme and suppress glass shadows.
    2. Add header tooltips/legends with textual quota bands and expose a :accessibility command to cycle presets.
    3. Document screen-reader friendly workflows (keyboard-only focus hints) drawing on current industry guidance.
  • Acceptance Criteria: Toggling accessibility mode updates colors without restarting; header shows textual quota bands; documentation explains the new options.
  • Test Notes: Unit-test config defaults; snapshot tests covering high-contrast mode; manual verification with OWLEN_ACCESSIBILITY=high-contrast.
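A rough sketch of the config flags, a preset cycle for the :accessibility command, and the textual quota bands. The flag names high_contrast and reduced_chrome come from the plan and the 60%/85% thresholds from the header task above; the cycle order, the band labels, and whether each threshold is inclusive are assumptions.

```rust
/// Mirrors the proposed `ui.accessibility` block; both flags default to off.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct AccessibilityConfig {
    pub high_contrast: bool,
    pub reduced_chrome: bool,
}

impl AccessibilityConfig {
    /// Advance to the next preset.
    /// Assumed order: off -> high contrast -> high contrast + reduced chrome -> off.
    pub fn next(self) -> Self {
        match (self.high_contrast, self.reduced_chrome) {
            (false, false) => Self { high_contrast: true, reduced_chrome: false },
            (true, false) => Self { high_contrast: true, reduced_chrome: true },
            _ => Self::default(),
        }
    }
}

/// Text label for a usage percentage, so status does not rely on color.
/// Labels are illustrative; thresholds are treated as inclusive lower bounds.
pub fn quota_band(percent: u8) -> &'static str {
    match percent {
        0..=59 => "normal",
        60..=84 => "elevated",
        _ => "critical",
    }
}
```

Keeping the band labels in one function lets the header legend and any snapshot test share the same text.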