From 4ce4ac0b0eab400e9b46b6edbb18bc50e3946fd7 Mon Sep 17 00:00:00 2001 From: vikingowl Date: Sat, 18 Oct 2025 04:52:07 +0200 Subject: [PATCH] docs(agents): rewrite AGENTS.md with detailed v0.2 execution plan MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Replaces the original overview with a comprehensive execution plan for Owlen v0.2, including: - Provider health checks and resilient model listing for Ollama and Ollama Cloud - Cloud key‑gating, rate‑limit handling, and usage tracking - Multi‑provider model registry and UI aggregation - Session pipeline refactor for tool calls and partial updates - Robust streaming JSON parser - UI header displaying context usage percentages - Token usage tracker with hourly/weekly limits and toasts - New web.search tool wrapper and related commands - Expanded command set (`:provider`, `:model`, `:limits`, `:web`) and config sections - Additional documentation, testing guidelines, and release notes for v0.2. --- AGENTS.md | 1042 +++++++++++++---------------------------------------- 1 file changed, 249 insertions(+), 793 deletions(-) diff --git a/AGENTS.md b/AGENTS.md index 0274323..1b3359f 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -1,798 +1,254 @@ -# AGENTS.md - AI Agent Instructions for Owlen Development +# AGENTS.md — Owlen v0.2 Execution Plan -This document provides comprehensive context and guidelines for AI agents (Claude, GPT-4, etc.) working on the Owlen codebase. +**Focus:** Ollama (local) ✅, Ollama Cloud (with API key) ✅, context/limits UI ✅, seamless web-search tool ✅ +**Style:** Medium-sized conventional commits, each with acceptance criteria & test notes. +**Definition of Done (DoD):** -## Project Overview - -**Owlen** is a local-first, terminal-based AI assistant built in Rust using the Ratatui TUI framework. It implements a Model Context Protocol (MCP) architecture for modular tool execution and supports both local (Ollama) and cloud LLM providers. 
- -**Core Philosophy:** -- **Local-first**: Prioritize local LLMs (Ollama) with cloud as fallback -- **Privacy-focused**: No telemetry, user data stays on device -- **MCP-native**: All operations through MCP servers for modularity -- **Terminal-native**: Vim-style modal interaction in a beautiful TUI - -**Current Status:** v1.0 - MCP-only architecture (Phase 10 complete) - -## Architecture - -### Project Structure - -``` -owlen/ -├── crates/ -│ ├── owlen-core/ # Core types, config, provider traits -│ ├── owlen-tui/ # Ratatui-based terminal interface -│ ├── owlen-cli/ # Command-line interface -│ ├── owlen-ollama/ # Ollama provider implementation -│ ├── owlen-mcp-llm-server/ # LLM inference as MCP server -│ ├── owlen-mcp-client/ # MCP client library -│ ├── owlen-mcp-server/ # Base MCP server framework -│ ├── owlen-mcp-code-server/ # Code execution in Docker -│ └── owlen-mcp-prompt-server/ # Prompt management server -├── docs/ # Documentation -├── themes/ # TUI color themes -└── .agents/ # Agent development plans -``` - -### Key Technologies - -- **Language**: Rust 1.83+ -- **TUI**: Ratatui with Crossterm backend -- **Async Runtime**: Tokio -- **Config**: TOML (serde) -- **HTTP Client**: reqwest -- **LLM Providers**: Ollama (primary), with extensibility for OpenAI/Anthropic -- **Protocol**: JSON-RPC 2.0 over STDIO/HTTP/WebSocket - -## Current Features (v1.0) - -### Core Capabilities - -1. **MCP Architecture** (Phase 3-10 complete) - - All LLM interactions via MCP servers - - Local and remote MCP client support - - STDIO, HTTP, WebSocket transports - - Automatic failover with health checks - -2. **Provider System** - - Ollama (local and cloud) - - Configurable per-provider settings - - API key management with env variable expansion - - Model switching via TUI (`:m` command) - -3. **Agentic Loop** (ReAct pattern) - - THOUGHT → ACTION → OBSERVATION cycle - - Tool discovery and execution - - Configurable iteration limits - - Emergency stop (Ctrl+C) - -4. 
**Mode System** - - Chat mode: Limited tool availability - - Code mode: Full tool access - - Tool filtering by mode - - Runtime mode switching - -5. **Session Management** - - Auto-save conversations - - Session persistence with encryption - - Description generation - - Session timeout management - -6. **Security** - - Docker sandboxing for code execution - - Tool whitelisting - - Permission prompts for dangerous operations - - Network isolation options - -### TUI Features - -- Vim-style modal editing (Normal, Insert, Visual, Command modes) -- Multi-panel layout (conversation, status, input) -- Syntax highlighting for code blocks -- Theme system (10+ built-in themes) -- Scrollback history (configurable limit) -- Word wrap and visual selection - -## Development Guidelines - -### Code Style - -1. **Rust Best Practices** - - Use `rustfmt` (pre-commit hook enforced) - - Run `cargo clippy` before commits - - Prefer `Result` over `panic!` for errors - - Document public APIs with `///` comments - -2. **Error Handling** - - Use `owlen_core::Error` enum for all errors - - Chain errors with context (`.map_err(|e| Error::X(format!(...)))`) - - Never unwrap in library code (tests OK) - -3. **Async Patterns** - - All I/O operations must be async - - Use `tokio::spawn` for background tasks - - Prefer `tokio::sync::mpsc` for channels - - Always set timeouts for network operations - -4. **Testing** - - Unit tests in same file (`#[cfg(test)] mod tests`) - - Use mock implementations from `test_utils` modules - - Integration tests in `crates/*/tests/` - - All public APIs must have tests - -### File Organization - -**When editing existing files:** -1. Read the entire file first (use `Read` tool) -2. Preserve existing code style and formatting -3. Update related tests in the same commit -4. Keep changes atomic and focused - -**When creating new files:** -1. Check `crates/owlen-core/src/` for similar modules -2. Follow existing module structure -3. 
Add to `lib.rs` with appropriate visibility -4. Document module purpose with `//!` header - -### Configuration - -**Config file**: `~/.config/owlen/config.toml` - -Example structure: -```toml -[general] -default_provider = "ollama" -default_model = "llama3.2:latest" -enable_streaming = true - -[mcp] -# MCP is always enabled in v1.0+ - -[providers.ollama] -provider_type = "ollama" -base_url = "http://localhost:11434" - -[providers.ollama-cloud] -provider_type = "ollama-cloud" -base_url = "https://ollama.com" -api_key = "$OLLAMA_API_KEY" - -[ui] -theme = "default_dark" -word_wrap = true - -[security] -enable_sandboxing = true -allowed_tools = ["web_search", "code_exec"] -``` - -### Common Tasks - -#### Adding a New Provider - -1. Create `crates/owlen-{provider}/` crate -2. Implement `owlen_core::provider::Provider` trait -3. Add to `owlen_core::router::ProviderRouter` -4. Update config schema in `owlen_core::config` -5. Add tests with `MockProvider` pattern -6. Document in `docs/provider-implementation.md` - -#### Adding a New MCP Server - -1. Create `crates/owlen-mcp-{name}-server/` crate -2. Implement JSON-RPC 2.0 protocol handlers -3. Define tool descriptors with JSON schemas -4. Add sandboxing/security checks -5. Register in `mcp_servers` config array -6. Document tool capabilities - -#### Adding a TUI Feature - -1. Modify `crates/owlen-tui/src/chat_app.rs` -2. Update keybinding handlers -3. Extend UI rendering in `draw()` method -4. Add to help screen (`?` command) -5. Test with different terminal sizes -6. Ensure theme compatibility - -## Feature Parity Roadmap - -Based on analysis of OpenAI Codex and Claude Code, here are prioritized features to implement: - -### Phase 11: MCP Client Enhancement (HIGHEST PRIORITY) - -**Goal**: Full MCP client capabilities to access ecosystem tools - -**Features:** -1. 
**MCP Server Management** - - `owlen mcp add/list/remove` commands - - Three config scopes: local, project (`.mcp.json`), user - - Environment variable expansion in config - - OAuth 2.0 authentication for remote servers - -2. **MCP Resource References** - - `@github:issue://123` syntax - - `@postgres:schema://users` syntax - - Auto-completion for resources - -3. **MCP Prompts as Slash Commands** - - `/mcp__github__list_prs` - - Dynamic command registration - -**Implementation:** -- Extend `owlen-mcp-client` crate -- Add `.mcp.json` parsing to `owlen-core::config` -- Update TUI command parser for `@` and `/mcp__` syntax -- Add OAuth flow to TUI - -**Files to modify:** -- `crates/owlen-mcp-client/src/lib.rs` -- `crates/owlen-core/src/config.rs` -- `crates/owlen-tui/src/command_parser.rs` - -### Phase 12: Approval & Sandbox System (HIGHEST PRIORITY) - -**Goal**: Safe agentic behavior with user control - -**Features:** -1. **Three-tier Approval Modes** - - `suggest`: Approve ALL file writes and shell commands (default) - - `auto-edit`: Auto-approve file changes, prompt for shell - - `full-auto`: Auto-approve everything (requires Git repo) - -2. **Platform-specific Sandboxing** - - Linux: Docker with network isolation - - macOS: Apple Seatbelt (`sandbox-exec`) - - Windows: AppContainer or Job Objects - -3. 
**Permission Management** - - `/permissions` command in TUI - - Tool allowlist (e.g., `Edit`, `Bash(git commit:*)`) - - Stored in `.owlen/settings.json` (project) or `~/.owlen.json` (user) - -**Implementation:** -- New `owlen-core::approval` module -- Extend `owlen-core::sandbox` with platform detection -- Update `owlen-mcp-code-server` to use new sandbox -- Add permission storage to config system - -**Files to create:** -- `crates/owlen-core/src/approval.rs` -- `crates/owlen-core/src/sandbox/linux.rs` -- `crates/owlen-core/src/sandbox/macos.rs` -- `crates/owlen-core/src/sandbox/windows.rs` - -### Phase 13: Project Documentation System (HIGH PRIORITY) - -**Goal**: Massive usability improvement with project context - -**Features:** -1. **OWLEN.md System** - - `OWLEN.md` at repo root (checked into git) - - `OWLEN.local.md` (gitignored, personal) - - `~/.config/owlen/OWLEN.md` (global) - - Support nested OWLEN.md in monorepos - -2. **Auto-generation** - - `/init` command to generate project-specific OWLEN.md - - Analyze codebase structure - - Detect build system, test framework - - Suggest common commands - -3. **Live Updates** - - `#` command to add instructions to OWLEN.md - - Context-aware insertion (relevant section) - -**Contents of OWLEN.md:** -- Common bash commands -- Code style guidelines -- Testing instructions -- Core files and utilities -- Known quirks/warnings - -**Implementation:** -- New `owlen-core::project_doc` module -- File discovery algorithm (walk up directory tree) -- Markdown parser for sections -- TUI commands: `/init`, `#` - -**Files to create:** -- `crates/owlen-core/src/project_doc.rs` -- `crates/owlen-tui/src/commands/init.rs` - -### Phase 14: Non-Interactive Mode (HIGH PRIORITY) - -**Goal**: Enable CI/CD integration and automation - -**Features:** -1. **Headless Execution** - ```bash - owlen exec "fix linting errors" --approval-mode auto-edit - owlen --quiet "update CHANGELOG" --json - ``` - -2. 
**Environment Variables** - - `OWLEN_QUIET_MODE=1` - - `OWLEN_DISABLE_PROJECT_DOC=1` - - `OWLEN_APPROVAL_MODE=full-auto` - -3. **JSON Output** - - Structured output for parsing - - Exit codes for success/failure - - Progress events on stderr - -**Implementation:** -- New `owlen-cli` subcommand: `exec` -- Extend `owlen-core::session` with non-interactive mode -- Add JSON serialization for results -- Environment variable parsing in config - -**Files to modify:** -- `crates/owlen-cli/src/main.rs` -- `crates/owlen-core/src/session.rs` - -### Phase 15: Multi-Provider Expansion (HIGH PRIORITY) - -**Goal**: Support cloud providers while maintaining local-first - -**Providers to add:** -1. OpenAI (GPT-4, o1, o4-mini) -2. Anthropic (Claude 3.5 Sonnet, Opus) -3. Google (Gemini Ultra, Pro) -4. Mistral AI - -**Configuration:** -```toml -[providers.openai] -api_key = "${OPENAI_API_KEY}" -model = "o4-mini" -enabled = true - -[providers.anthropic] -api_key = "${ANTHROPIC_API_KEY}" -model = "claude-3-5-sonnet" -enabled = true -``` - -**Runtime Switching:** -``` -:model ollama/starcoder -:model openai/o4-mini -:model anthropic/claude-3-5-sonnet -``` - -**Implementation:** -- Create `owlen-openai`, `owlen-anthropic`, `owlen-google` crates -- Implement `Provider` trait for each -- Add runtime model switching to TUI -- Maintain Ollama as default - -**Files to create:** -- `crates/owlen-openai/src/lib.rs` -- `crates/owlen-anthropic/src/lib.rs` -- `crates/owlen-google/src/lib.rs` - -### Phase 16: Custom Slash Commands (MEDIUM PRIORITY) - -**Goal**: User and team-defined workflows - -**Features:** -1. **Command Directories** - - `~/.owlen/commands/` (user, available everywhere) - - `.owlen/commands/` (project, checked into git) - - Support `$ARGUMENTS` keyword - -2. **Example Structure** - ```markdown - # .owlen/commands/fix-github-issue.md - Please analyze and fix GitHub issue: $ARGUMENTS. - 1. Use `gh issue view` to get details - 2. Implement changes - 3. Write and run tests - 4. 
Create PR - ``` - -3. **TUI Integration** - - Auto-complete for custom commands - - Help text from command files - - Parameter validation - -**Implementation:** -- New `owlen-core::commands` module -- Command discovery and parsing -- Template expansion -- TUI command registration - -**Files to create:** -- `crates/owlen-core/src/commands.rs` -- `crates/owlen-tui/src/commands/custom.rs` - -### Phase 17: Plugin System (MEDIUM PRIORITY) - -**Goal**: One-command installation of tool collections - -**Features:** -1. **Plugin Structure** - ```json - { - "name": "github-workflow", - "version": "1.0.0", - "commands": [ - {"name": "pr", "file": "commands/pr.md"} - ], - "mcp_servers": [ - { - "name": "github", - "command": "${OWLEN_PLUGIN_ROOT}/bin/github-mcp" - } - ] - } - ``` - -2. **Installation** - ```bash - owlen plugin install github-workflow - owlen plugin list - owlen plugin remove github-workflow - ``` - -3. **Discovery** - - `~/.owlen/plugins/` directory - - Git repository URLs - - Plugin registry (future) - -**Implementation:** -- New `owlen-core::plugins` module -- Plugin manifest parser -- Installation/removal logic -- Sandboxing for plugin code - -**Files to create:** -- `crates/owlen-core/src/plugins.rs` -- `crates/owlen-cli/src/commands/plugin.rs` - -### Phase 18: Extended Thinking Modes (MEDIUM PRIORITY) - -**Goal**: Progressive computation budgets for complex tasks - -**Modes:** -- `think` - basic extended thinking -- `think hard` - increased computation -- `think harder` - more computation -- `ultrathink` - maximum budget - -**Implementation:** -- Extend `owlen-core::types::ChatParameters` -- Add thinking mode to TUI commands -- Configure per-provider max tokens - -**Files to modify:** -- `crates/owlen-core/src/types.rs` -- `crates/owlen-tui/src/command_parser.rs` - -### Phase 19: Git Workflow Automation (MEDIUM PRIORITY) - -**Goal**: Streamline common Git operations - -**Features:** -1. Auto-commit message generation -2. PR creation via `gh` CLI -3. 
Rebase conflict resolution -4. File revert operations -5. Git history analysis - -**Implementation:** -- New `owlen-mcp-git-server` crate -- Tools: `commit`, `create_pr`, `rebase`, `revert`, `history` -- Integration with TUI commands - -**Files to create:** -- `crates/owlen-mcp-git-server/src/lib.rs` - -### Phase 20: Enterprise Features (LOW PRIORITY) - -**Goal**: Team and enterprise deployment support - -**Features:** -1. **Managed Configuration** - - `/etc/owlen/managed-mcp.json` (Linux) - - Restrict user additions with `useEnterpriseMcpConfigOnly` - -2. **Audit Logging** - - Log all file writes and shell commands - - Structured JSON logs - - Tamper-proof storage - -3. **Team Collaboration** - - Shared OWLEN.md across team - - Project-scoped MCP servers in `.mcp.json` - - Approval policy enforcement - -**Implementation:** -- Extend `owlen-core::config` with managed settings -- New `owlen-core::audit` module -- Enterprise deployment documentation - -## Testing Requirements - -### Test Coverage Goals - -- **Unit tests**: 80%+ coverage for `owlen-core` -- **Integration tests**: All MCP servers, providers -- **TUI tests**: Key workflows (not pixel-perfect) - -### Test Organization - -```rust -#[cfg(test)] -mod tests { - use super::*; - use crate::provider::test_utils::MockProvider; - use crate::mcp::test_utils::MockMcpClient; - - #[test] - fn test_feature() { - // Setup - let provider = MockProvider::new(); - - // Execute - let result = provider.chat(request).await; - - // Assert - assert!(result.is_ok()); - } -} -``` - -### Running Tests - -```bash -cargo test --all # All tests -cargo test --lib -p owlen-core # Core library tests -cargo test --test integration # Integration tests -``` - -## Documentation Standards - -### Code Documentation - -1. **Module-level** (`//!` at top of file): - ```rust - //! Brief module description - //! - //! Detailed explanation of module purpose, - //! key types, and usage examples. - ``` - -2. 
**Public APIs** (`///` above items): - ```rust - /// Brief description - /// - /// # Arguments - /// * `arg1` - Description - /// - /// # Returns - /// Description of return value - /// - /// # Errors - /// When this function returns an error - /// - /// # Example - /// ``` - /// let result = function(arg); - /// ``` - pub fn function(arg: Type) -> Result { - // implementation - } - ``` - -3. **Private items**: Optional, use for complex logic - -### User Documentation - -Location: `docs/` directory - -Files to maintain: -- `architecture.md` - System design -- `configuration.md` - Config reference -- `migration-guide.md` - Version upgrades -- `troubleshooting.md` - Common issues -- `provider-implementation.md` - Adding providers -- `faq.md` - Frequently asked questions - -## Git Workflow - -### Branch Strategy - -- `main` - stable releases only -- `dev` - active development (default) -- `feature/*` - new features -- `fix/*` - bug fixes -- `docs/*` - documentation only - -### Commit Messages - -Follow conventional commits: - -``` -type(scope): brief description - -Detailed explanation of changes. - -Breaking changes, if any. - -🤖 Generated with [Claude Code](https://claude.com/claude-code) - -Co-Authored-By: Claude -``` - -Types: `feat`, `fix`, `docs`, `refactor`, `test`, `chore` - -### Pre-commit Hooks - -Automatically run: -- `cargo fmt` (formatting) -- `cargo check` (compilation) -- `cargo clippy` (linting) -- YAML/TOML validation -- Trailing whitespace removal - -## Performance Guidelines - -### Optimization Priorities - -1. **Startup time**: < 500ms cold start -2. **First token latency**: < 2s for local models -3. **Memory usage**: < 100MB base, < 500MB with conversation -4. 
**Responsiveness**: TUI redraws < 16ms (60 FPS) - -### Profiling - -```bash -cargo build --release --features profiling -valgrind --tool=callgrind target/release/owlen -kcachegrind callgrind.out.* -``` - -### Async Performance - -- Avoid blocking in async contexts -- Use `tokio::spawn` for CPU-intensive work -- Set timeouts on all network operations -- Cancel tasks on shutdown - -## Security Considerations - -### Threat Model - -**Trusted:** -- User's local machine -- User-installed Ollama models -- User configuration files - -**Untrusted:** -- MCP server responses -- Web search results -- Code execution output -- Cloud LLM responses - -### Security Measures - -1. **Input Validation** - - Sanitize all MCP tool arguments - - Validate JSON schemas strictly - - Escape shell commands - -2. **Sandboxing** - - Docker for code execution - - Network isolation - - Filesystem restrictions - -3. **Secrets Management** - - Never log API keys - - Use environment variables - - Encrypt sensitive config fields - -4. **Dependency Auditing** - ```bash - cargo audit - cargo deny check - ``` - -## Debugging Tips - -### Enable Debug Logging - -```bash -OWLEN_DEBUG_OLLAMA=1 owlen # Ollama requests -RUST_LOG=debug owlen # All debug logs -RUST_BACKTRACE=1 owlen # Stack traces -``` - -### Common Issues - -1. **Timeout on Ollama** - - Check `ollama ps` for loaded models - - Increase timeout in config - - Restart Ollama service - -2. **MCP Server Not Found** - - Verify `mcp_servers` config - - Check server binary exists - - Test server manually with STDIO - -3. **TUI Rendering Issues** - - Test in different terminals - - Check terminal size (`tput cols; tput lines`) - - Verify theme compatibility - -## Contributing - -### Before Submitting PR - -1. Run full test suite: `cargo test --all` -2. Check formatting: `cargo fmt -- --check` -3. Run linter: `cargo clippy -- -D warnings` -4. Update documentation if API changed -5. Add tests for new features -6. 
Update CHANGELOG.md - -### PR Description Template - -```markdown -## Summary -Brief description of changes - -## Type of Change -- [ ] Bug fix -- [ ] New feature -- [ ] Breaking change -- [ ] Documentation update - -## Testing -Describe tests performed - -## Checklist -- [ ] Tests added/updated -- [ ] Documentation updated -- [ ] CHANGELOG.md updated -- [ ] No clippy warnings -``` - -## Resources - -### External Documentation - -- [Ratatui Docs](https://ratatui.rs/) -- [Tokio Tutorial](https://tokio.rs/tokio/tutorial) -- [MCP Specification](https://modelcontextprotocol.io/) -- [Ollama API](https://github.com/ollama/ollama/blob/main/docs/api.md) - -### Internal Documentation - -- `.agents/new_phases.md` - 10-phase migration plan (completed) -- `docs/phase5-mode-system.md` - Mode system design -- `docs/migration-guide.md` - v0.x → v1.0 migration - -### Community - -- GitHub Issues: Bug reports and feature requests -- GitHub Discussions: Questions and ideas -- AUR Package: `owlen-git` (Arch Linux) - -## Version History - -- **v1.0.0** (current) - MCP-only architecture, Phase 10 complete -- **v0.2.0** - Added web search, code execution servers -- **v0.1.0** - Initial release with Ollama support - -## License - -Owlen is open source software. See LICENSE file for details. +* Local Ollama works out-of-the-box (no key required). +* Ollama Cloud is available **only when** an API key is present; no more 401 loops; clear fallback to local. +* Chat header shows **context used / context window** and **%**. +* Cloud usage shows **hourly / weekly token usage** (tracked locally; limits configurable). +* “Internet?” → the model/tooling performs web search automatically; no “I can’t access the internet” replies. --- -**Last Updated**: 2025-10-11 -**Maintained By**: Owlen Development Team -**For AI Agents**: Follow these guidelines when modifying Owlen codebase. 
Prioritize MCP client enhancement (Phase 11) and approval system (Phase 12) for feature parity with Codex/Claude Code while maintaining local-first philosophy.
+## Release Assumptions
+
+* Provider config has **separate entries** for local (“ollama”) and cloud (“ollama-cloud”), both optional.
+* **Cloud base URL** is config-driven (default `[Inference] https://api.ollama.com`), **local** default `http://localhost:11434`.
+* Cloud requests include `Authorization: Bearer <API_KEY>`.
+* Token counts (`prompt_eval_count`, `eval_count`) are read from provider responses **if available**; else fallback to local estimation.
+* Rate-limit quotas unknown → **user-configurable** (`hourly_quota_tokens`, `weekly_quota_tokens`); we still track actual usage.
+
+---
+
+## Commit Plan (Conventional Commits)
+
+### 1) feat(provider/ollama): health checks, resilient model listing, friendly errors
+
+* **Why:** Local should “just work”; no brittle failures if model not pulled or daemon is down.
+* **Implement:**
+
+  * `GET /api/tags` with timeout & retry; on failure → user-friendly banner (“Ollama not reachable on localhost:11434”).
+  * If chat hits “model not found” → suggest `ollama pull <model>` (do **not** auto-pull).
+  * Pre-chat healthcheck endpoint with short timeout.
+* **AC:** With no Ollama, UI shows actionable error, app stays usable. With Ollama up, models list renders within TTL.
+* **Tests:** Mock 200/500/timeout; simulate missing model.
+
+### 2) feat(provider/ollama-cloud): separate provider with auth header & key-gating
+
+* **Why:** Prevent 401s; enable cloud only when key is present.
+* **Implement:**
+
+  * New provider “ollama-cloud”.
+  * Build `reqwest::Client` with default headers incl. `Authorization: Bearer <API_KEY>`.
+  * Base URL from config (default `[Inference] https://api.ollama.com`).
+  * If no key: provider not registered; cloud models hidden.
+* **AC:** With key → cloud models list & chat work; without key → no cloud provider listed.
+* **Tests:** No key (hidden), invalid key (401 handled), valid key (happy path). + +### 3) fix(provider/ollama-cloud): handle 401/429 gracefully, auto-degrade + +* **Why:** Avoid dead UX where cloud is selectable but unusable. +* **Implement:** + + * Intercept 401/403 → mark provider “unauthorized”, show toast “Cloud key invalid; using local”. + * Intercept 429 → toast “Cloud rate limit hit; retry later”; keep provider enabled. + * Auto-fallback to last good local provider for the active chat (don’t lose message). +* **AC:** Selecting cloud with bad key never traps the user; one click recovers to local. +* **Tests:** Simulated 401/429 mid-stream & at start. + +### 4) feat(models/registry): multi-provider registry + aggregated model list + +* **Why:** Show local & cloud side-by-side, clearly labeled. +* **Implement:** + + * Registry holds N providers (local, cloud). + * Aggregate lists with `provider_tag` = `ollama` / `ollama-cloud`. + * De-dupe identical names by appending provider tag in UI (“qwen3:8b · local”, “qwen3:8b-cloud · cloud”). +* **AC:** Model picker groups by provider; switching respects selection. +* **Tests:** Only local; only cloud; both; de-dup checks. + +### 5) refactor(session): message pipeline to support tool-calls & partial updates + +* **Why:** Prepare for search tool loop and robust streaming. +* **Implement:** + + * Centralize stream parser → yields either `assistant_text` chunks **or** `tool_call` messages. + * Buffer + flush on `done` or on explicit tool boundary. +* **AC:** No regressions in normal chat; tool call boundary events visible to controller. +* **Tests:** Stream with mixed content, malformed frames, newline-split JSON. + +### 6) fix(streaming): tolerant JSON lines parser & end-of-stream semantics + +* **Why:** Avoid UI stalls on minor protocol hiccups. +* **Implement:** + + * Accept `\n`/`\r\n` delimiters; ignore empty lines; robust `done: true` handling; final metrics extraction. 
+* **AC:** Long completions never hang; final token counts captured. +* **Tests:** Fuzz malformed frames. + +### 7) feat(ui/header): context usage indicator (value + %), colorized thresholds + +* **Why:** Transparency: “2560 / 8192 (31%)”. +* **Implement:** + + * Track last `prompt_eval_count` (or estimate) as “context used”. + * Context window: from model metadata (if available) else config default per provider/model family. + * Header shows: `Model · Context 2.6k / 8k (33%)`. Colors: <60% normal, 60–85% warn, >85% danger. +* **AC:** Live updates after each assistant turn. +* **Tests:** Various windows & counts; edge 0%/100%. + +### 8) feat(usage): token usage tracker (hourly/weekly), local store + +* **Why:** Cloud limits visibility even without official API. +* **Implement:** + + * Append per-provider token use after each response: prompt+completion. + * Rolling sums: last 60m, last 7d. + * Persist to JSON/SQLite in app data dir; prune daily. + * Configurable quotas: `providers.ollama_cloud.hourly_quota_tokens`, `weekly_quota_tokens`. +* **AC:** `:limits` shows totals & percentages; survives restart. +* **Tests:** Time-travel tests; persistence. + +### 9) feat(ui/usage): surface hourly/weekly usage + threshold toasts + +* **Why:** Prevent surprises; act before you hit the wall. +* **Implement:** + + * Header second line (when cloud active): `Cloud usage · 12k/50k (hour) · 90k/250k (week)` with % colors. + * Toast at 80% & 95% thresholds. +* **AC:** Visuals match tracker; toasts fire once per threshold band. +* **Tests:** Threshold crossings; reset behavior. + +### 10) feat(tools): generic tool-calling loop (function-call adapter) + +* **Why:** “Use the web” without whining. +* **Implement:** + + * Internal tool schema: `{ name, parameters(JSONSchema), invoke(args)->ToolResult }`. + * Provider adapter: if model emits a tool call → pause stream, invoke tool, append `tool` message, resume. + * Guardrails: max tool hops (e.g., 3), max tokens per hop. 
* **AC:** Model can call tools mid-answer; transcript shows tool result context to model (not necessarily to user).
+* **Tests:** Single & chained tool calls; loop cutoff.
+
+### 11) feat(tool/web.search): HTTP search wrapper for cloud mode; fallback stub
+
+* **Why:** Default “internet” capability.
+* **Implement:**
+
+  * Tool name: `web.search` with params `{ query: string }`.
+  * Cloud path: call provider’s search endpoint (configurable path); include API key.
+  * Fallback (no cloud): disabled; model will not see the tool (prevents “I can’t” messaging).
+* **AC:** Asking “what happened today in Rust X?” triggers one search call and integrates snippets into the answer.
+* **Tests:** Happy path, 401/key missing (tool hidden), network failure (tool error surfaced to model).
+
+### 12) feat(commands): `:provider`, `:model`, `:limits`, `:web on|off`
+
+* **Why:** Fast control for power users.
+* **Implement:**
+
+  * `:provider` lists/switches active provider.
+  * `:model` lists/switches models under active provider.
+  * `:limits` shows tracked usage.
+  * `:web on|off` toggles exposing `web.search` to the model.
+* **AC:** Commands work in both TUI and CLI modes.
+* **Tests:** Command parsing & side-effects.
+
+### 13) feat(config): explicit sections & env fallbacks
+
+* **Why:** Clear separation + easy secrets management.
+* **Implement (example):**
+
+  ```toml
+  [providers.ollama]
+  base_url = "http://localhost:11434"
+  list_ttl_secs = 60
+  default_context_window = 8192
+
+  [providers.ollama_cloud]
+  enabled = true
+  base_url = "https://api.ollama.com" # [Inference] default
+  api_key = "" # read from env OWLEN_OLLAMA_CLOUD_API_KEY if empty
+  hourly_quota_tokens = 50000 # configurable; visual only
+  weekly_quota_tokens = 250000 # configurable; visual only
+  list_ttl_secs = 60
+  ```
+* **AC:** Empty key → cloud disabled; env overrides work.
+* **Tests:** Env vs file precedence.
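The empty-key fallback in commit 13 reduces to one resolution rule, sketched below. The function signature and the env-var name are assumptions for illustration; the env value would come from `std::env::var("OWLEN_OLLAMA_CLOUD_API_KEY").ok()` at the call site.

```rust
// Sketch of key resolution (commit 13): a non-empty config value wins,
// otherwise fall back to the environment; no key at all means the cloud
// provider stays hidden.

fn resolve_cloud_key(file_value: &str, env_value: Option<&str>) -> Option<String> {
    let file = file_value.trim();
    if !file.is_empty() {
        return Some(file.to_string()); // explicit config beats the environment
    }
    env_value
        .map(str::trim)
        .filter(|v| !v.is_empty())
        .map(str::to_string)
}

fn main() {
    assert_eq!(resolve_cloud_key("", None), None); // no key -> cloud disabled
    assert_eq!(resolve_cloud_key("", Some("sk-env")).as_deref(), Some("sk-env"));
    assert_eq!(resolve_cloud_key("sk-file", Some("sk-env")).as_deref(), Some("sk-file"));
    println!("ok");
}
```

Returning `Option<String>` (rather than an empty string) makes the "provider not registered when key is absent" behavior from commit 2 a simple `if let` at registry construction time.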
+ +### 14) docs(readme): setup & troubleshooting for local + cloud + +* **Why:** Reduce support pings. +* **Include:** + + * Local prerequisites; healthcheck tips; “model not found” guidance. + * Cloud key setup; common 401/429 causes; privacy note. + * Context/limits UI explainer; tool behavior & toggles. + * Upgrade notes from v0.1 → v0.2. +* **AC:** New users can configure both in <5 min. + +### 15) test(integration): local-only, cloud-only, mixed; auth & rate-limit sims + +* **Why:** Prevent regressions. +* **Implement:** + + * Wiremock stubs for cloud: `/tags`, `/chat`, search endpoint, 401/429 cases. + * Snapshot test for tool-call transcript roundtrip. +* **AC:** Green suite in CI; reproducible. + +### 16) refactor(errors): typed provider errors + UI toasts + +* **Why:** Replace generic “something broke”. +* **Implement:** Error enum: `Unavailable`, `Unauthorized`, `RateLimited`, `Timeout`, `Protocol`. Map to toasts/banners. +* **AC:** Each known failure surfaces a precise message; logs redact secrets. + +### 17) perf(models): cache model lists with TTL; invalidate on user action + +* **Why:** Reduce network churn. +* **AC:** Re-open picker doesn’t re-fetch until TTL or manual refresh. + +### 18) chore(release): bump to v0.2; changelog; package metadata + +* **AC:** `--version` shows v0.2; changelog includes above bullets. + +--- + +## Missing Features vs. Codex / Claude-Code (Queued for v0.3+) + +* **Multi-vendor providers:** OpenAI & Anthropic provider crates with function-calling parity and structured outputs. +* **Agentic coding ops:** Safe file edits, `git` ops, shell exec with approval queue. +* **“Thinking/Plan” pane:** First-class rendering of plan/reflect steps; optional auto-approve. +* **Plugin system:** Tool discovery & permissions; marketplace later. +* **IDE integration:** Minimal LSP/bridge (VS Code/JetBrains) to run Owlen as backend. +* **Retrieval/RAG:** Local project indexing; selective context injection with token budgeter. 
+
+---
+
+## Acceptance Checklist (Release Gate)
+
+* [ ] Local Ollama: healthcheck OK, models list OK, chat OK.
+* [ ] Cloud appears only with valid key; no 401 loops; 429 shows toast.
+* [ ] Header shows context value and %; updates after each turn.
+* [ ] `:limits` shows hourly/weekly tallies; persists across restarts.
+* [ ] Asking for current info triggers `web.search` (cloud on) and returns an answer without “I can’t access the internet”.
+* [ ] Docs updated; sample config included; tests green.
+
+---
+
+## Notes & Caveats
+
+* **Base URLs & endpoints:** Kept **config-driven**. Defaults are `[Inference]`. If your cloud endpoint differs, set it in `config.toml`.
+* **Quotas:** No official token quota API assumed; we **track locally** and let you configure limits for UI display.
+* **Tooling:** If a given model doesn’t emit tool calls, `web.search` won’t be visible to it. You can force exposure via `:web on`.
+
+---
+
+## Lightly Opinionated Guidance (aka sparring stance)
+
+* Don’t auto-pull models on errors—teach users where control lives.
+* Keep cloud totally opt-in; hide it on misconfig to reduce paper cuts.
+* Treat tool-calling like `sudo`: short leash, clear logs, obvious toggles.
+* Make the header a cockpit, not a billboard: model · context · usage; that’s it.
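For reference, the hourly/weekly tallies that back the `:limits` checklist item (commit 8) reduce to rolling-window sums over locally persisted usage events. A minimal sketch with assumed type and field names:

```rust
// Sketch of the rolling-window usage tracker (commit 8). The real tracker
// would persist events to JSON/SQLite and prune daily; here we only show
// the window arithmetic.
use std::time::{Duration, SystemTime};

struct UsageEvent {
    at: SystemTime,
    tokens: u64, // prompt + completion tokens for one response
}

/// Sum of tokens spent within `window` before `now`.
fn rolling_sum(events: &[UsageEvent], now: SystemTime, window: Duration) -> u64 {
    events
        .iter()
        .filter(|e| now.duration_since(e.at).map_or(false, |age| age <= window))
        .map(|e| e.tokens)
        .sum()
}

fn main() {
    let now = SystemTime::now();
    let events = vec![
        UsageEvent { at: now - Duration::from_secs(30 * 60), tokens: 12_000 },   // 30 min ago
        UsageEvent { at: now - Duration::from_secs(2 * 3600), tokens: 8_000 },   // 2 h ago
        UsageEvent { at: now - Duration::from_secs(8 * 86_400), tokens: 5_000 }, // >1 week ago
    ];
    let hour = rolling_sum(&events, now, Duration::from_secs(3600));
    let week = rolling_sum(&events, now, Duration::from_secs(7 * 86_400));
    assert_eq!(hour, 12_000); // only the 30-min event counts
    assert_eq!(week, 20_000); // week window drops the 8-day-old event
    println!("ok");
}
```

Because the sums are recomputed from raw events, the 80%/95% threshold toasts in commit 9 can be derived on every update without separate counters to keep in sync.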