From 4ce4ac0b0eab400e9b46b6edbb18bc50e3946fd7 Mon Sep 17 00:00:00 2001 From: vikingowl Date: Sat, 18 Oct 2025 04:52:07 +0200 Subject: [PATCH] docs(agents): rewrite AGENTS.md with detailed v0.2 execution plan MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Replaces the original overview with a comprehensive execution plan for Owlen v0.2, including: - Provider health checks and resilient model listing for Ollama and Ollama Cloud - Cloud key‑gating, rate‑limit handling, and usage tracking - Multi‑provider model registry and UI aggregation - Session pipeline refactor for tool calls and partial updates - Robust streaming JSON parser - UI header displaying context usage percentages - Token usage tracker with hourly/weekly limits and toasts - New web.search tool wrapper and related commands - Expanded command set (`:provider`, `:model`, `:limits`, `:web`) and config sections - Additional documentation, testing guidelines, and release notes for v0.2. --- AGENTS.md | 1042 +++++++++++++---------------------------------------- 1 file changed, 249 insertions(+), 793 deletions(-) diff --git a/AGENTS.md b/AGENTS.md index 0274323..1b3359f 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -1,798 +1,254 @@ -# AGENTS.md - AI Agent Instructions for Owlen Development +# AGENTS.md — Owlen v0.2 Execution Plan -This document provides comprehensive context and guidelines for AI agents (Claude, GPT-4, etc.) working on the Owlen codebase. +**Focus:** Ollama (local) ✅, Ollama Cloud (with API key) ✅, context/limits UI ✅, seamless web-search tool ✅ +**Style:** Medium-sized conventional commits, each with acceptance criteria & test notes. +**Definition of Done (DoD):** -## Project Overview - -**Owlen** is a local-first, terminal-based AI assistant built in Rust using the Ratatui TUI framework. It implements a Model Context Protocol (MCP) architecture for modular tool execution and supports both local (Ollama) and cloud LLM providers. 
- -**Core Philosophy:** -- **Local-first**: Prioritize local LLMs (Ollama) with cloud as fallback -- **Privacy-focused**: No telemetry, user data stays on device -- **MCP-native**: All operations through MCP servers for modularity -- **Terminal-native**: Vim-style modal interaction in a beautiful TUI - -**Current Status:** v1.0 - MCP-only architecture (Phase 10 complete) - -## Architecture - -### Project Structure - -``` -owlen/ -├── crates/ -│ ├── owlen-core/ # Core types, config, provider traits -│ ├── owlen-tui/ # Ratatui-based terminal interface -│ ├── owlen-cli/ # Command-line interface -│ ├── owlen-ollama/ # Ollama provider implementation -│ ├── owlen-mcp-llm-server/ # LLM inference as MCP server -│ ├── owlen-mcp-client/ # MCP client library -│ ├── owlen-mcp-server/ # Base MCP server framework -│ ├── owlen-mcp-code-server/ # Code execution in Docker -│ └── owlen-mcp-prompt-server/ # Prompt management server -├── docs/ # Documentation -├── themes/ # TUI color themes -└── .agents/ # Agent development plans -``` - -### Key Technologies - -- **Language**: Rust 1.83+ -- **TUI**: Ratatui with Crossterm backend -- **Async Runtime**: Tokio -- **Config**: TOML (serde) -- **HTTP Client**: reqwest -- **LLM Providers**: Ollama (primary), with extensibility for OpenAI/Anthropic -- **Protocol**: JSON-RPC 2.0 over STDIO/HTTP/WebSocket - -## Current Features (v1.0) - -### Core Capabilities - -1. **MCP Architecture** (Phase 3-10 complete) - - All LLM interactions via MCP servers - - Local and remote MCP client support - - STDIO, HTTP, WebSocket transports - - Automatic failover with health checks - -2. **Provider System** - - Ollama (local and cloud) - - Configurable per-provider settings - - API key management with env variable expansion - - Model switching via TUI (`:m` command) - -3. **Agentic Loop** (ReAct pattern) - - THOUGHT → ACTION → OBSERVATION cycle - - Tool discovery and execution - - Configurable iteration limits - - Emergency stop (Ctrl+C) - -4. 
**Mode System** - - Chat mode: Limited tool availability - - Code mode: Full tool access - - Tool filtering by mode - - Runtime mode switching - -5. **Session Management** - - Auto-save conversations - - Session persistence with encryption - - Description generation - - Session timeout management - -6. **Security** - - Docker sandboxing for code execution - - Tool whitelisting - - Permission prompts for dangerous operations - - Network isolation options - -### TUI Features - -- Vim-style modal editing (Normal, Insert, Visual, Command modes) -- Multi-panel layout (conversation, status, input) -- Syntax highlighting for code blocks -- Theme system (10+ built-in themes) -- Scrollback history (configurable limit) -- Word wrap and visual selection - -## Development Guidelines - -### Code Style - -1. **Rust Best Practices** - - Use `rustfmt` (pre-commit hook enforced) - - Run `cargo clippy` before commits - - Prefer `Result` over `panic!` for errors - - Document public APIs with `///` comments - -2. **Error Handling** - - Use `owlen_core::Error` enum for all errors - - Chain errors with context (`.map_err(|e| Error::X(format!(...)))`) - - Never unwrap in library code (tests OK) - -3. **Async Patterns** - - All I/O operations must be async - - Use `tokio::spawn` for background tasks - - Prefer `tokio::sync::mpsc` for channels - - Always set timeouts for network operations - -4. **Testing** - - Unit tests in same file (`#[cfg(test)] mod tests`) - - Use mock implementations from `test_utils` modules - - Integration tests in `crates/*/tests/` - - All public APIs must have tests - -### File Organization - -**When editing existing files:** -1. Read the entire file first (use `Read` tool) -2. Preserve existing code style and formatting -3. Update related tests in the same commit -4. Keep changes atomic and focused - -**When creating new files:** -1. Check `crates/owlen-core/src/` for similar modules -2. Follow existing module structure -3. 
Add to `lib.rs` with appropriate visibility -4. Document module purpose with `//!` header - -### Configuration - -**Config file**: `~/.config/owlen/config.toml` - -Example structure: -```toml -[general] -default_provider = "ollama" -default_model = "llama3.2:latest" -enable_streaming = true - -[mcp] -# MCP is always enabled in v1.0+ - -[providers.ollama] -provider_type = "ollama" -base_url = "http://localhost:11434" - -[providers.ollama-cloud] -provider_type = "ollama-cloud" -base_url = "https://ollama.com" -api_key = "$OLLAMA_API_KEY" - -[ui] -theme = "default_dark" -word_wrap = true - -[security] -enable_sandboxing = true -allowed_tools = ["web_search", "code_exec"] -``` - -### Common Tasks - -#### Adding a New Provider - -1. Create `crates/owlen-{provider}/` crate -2. Implement `owlen_core::provider::Provider` trait -3. Add to `owlen_core::router::ProviderRouter` -4. Update config schema in `owlen_core::config` -5. Add tests with `MockProvider` pattern -6. Document in `docs/provider-implementation.md` - -#### Adding a New MCP Server - -1. Create `crates/owlen-mcp-{name}-server/` crate -2. Implement JSON-RPC 2.0 protocol handlers -3. Define tool descriptors with JSON schemas -4. Add sandboxing/security checks -5. Register in `mcp_servers` config array -6. Document tool capabilities - -#### Adding a TUI Feature - -1. Modify `crates/owlen-tui/src/chat_app.rs` -2. Update keybinding handlers -3. Extend UI rendering in `draw()` method -4. Add to help screen (`?` command) -5. Test with different terminal sizes -6. Ensure theme compatibility - -## Feature Parity Roadmap - -Based on analysis of OpenAI Codex and Claude Code, here are prioritized features to implement: - -### Phase 11: MCP Client Enhancement (HIGHEST PRIORITY) - -**Goal**: Full MCP client capabilities to access ecosystem tools - -**Features:** -1. 
**MCP Server Management** - - `owlen mcp add/list/remove` commands - - Three config scopes: local, project (`.mcp.json`), user - - Environment variable expansion in config - - OAuth 2.0 authentication for remote servers - -2. **MCP Resource References** - - `@github:issue://123` syntax - - `@postgres:schema://users` syntax - - Auto-completion for resources - -3. **MCP Prompts as Slash Commands** - - `/mcp__github__list_prs` - - Dynamic command registration - -**Implementation:** -- Extend `owlen-mcp-client` crate -- Add `.mcp.json` parsing to `owlen-core::config` -- Update TUI command parser for `@` and `/mcp__` syntax -- Add OAuth flow to TUI - -**Files to modify:** -- `crates/owlen-mcp-client/src/lib.rs` -- `crates/owlen-core/src/config.rs` -- `crates/owlen-tui/src/command_parser.rs` - -### Phase 12: Approval & Sandbox System (HIGHEST PRIORITY) - -**Goal**: Safe agentic behavior with user control - -**Features:** -1. **Three-tier Approval Modes** - - `suggest`: Approve ALL file writes and shell commands (default) - - `auto-edit`: Auto-approve file changes, prompt for shell - - `full-auto`: Auto-approve everything (requires Git repo) - -2. **Platform-specific Sandboxing** - - Linux: Docker with network isolation - - macOS: Apple Seatbelt (`sandbox-exec`) - - Windows: AppContainer or Job Objects - -3. 
**Permission Management** - - `/permissions` command in TUI - - Tool allowlist (e.g., `Edit`, `Bash(git commit:*)`) - - Stored in `.owlen/settings.json` (project) or `~/.owlen.json` (user) - -**Implementation:** -- New `owlen-core::approval` module -- Extend `owlen-core::sandbox` with platform detection -- Update `owlen-mcp-code-server` to use new sandbox -- Add permission storage to config system - -**Files to create:** -- `crates/owlen-core/src/approval.rs` -- `crates/owlen-core/src/sandbox/linux.rs` -- `crates/owlen-core/src/sandbox/macos.rs` -- `crates/owlen-core/src/sandbox/windows.rs` - -### Phase 13: Project Documentation System (HIGH PRIORITY) - -**Goal**: Massive usability improvement with project context - -**Features:** -1. **OWLEN.md System** - - `OWLEN.md` at repo root (checked into git) - - `OWLEN.local.md` (gitignored, personal) - - `~/.config/owlen/OWLEN.md` (global) - - Support nested OWLEN.md in monorepos - -2. **Auto-generation** - - `/init` command to generate project-specific OWLEN.md - - Analyze codebase structure - - Detect build system, test framework - - Suggest common commands - -3. **Live Updates** - - `#` command to add instructions to OWLEN.md - - Context-aware insertion (relevant section) - -**Contents of OWLEN.md:** -- Common bash commands -- Code style guidelines -- Testing instructions -- Core files and utilities -- Known quirks/warnings - -**Implementation:** -- New `owlen-core::project_doc` module -- File discovery algorithm (walk up directory tree) -- Markdown parser for sections -- TUI commands: `/init`, `#` - -**Files to create:** -- `crates/owlen-core/src/project_doc.rs` -- `crates/owlen-tui/src/commands/init.rs` - -### Phase 14: Non-Interactive Mode (HIGH PRIORITY) - -**Goal**: Enable CI/CD integration and automation - -**Features:** -1. **Headless Execution** - ```bash - owlen exec "fix linting errors" --approval-mode auto-edit - owlen --quiet "update CHANGELOG" --json - ``` - -2. 
**Environment Variables** - - `OWLEN_QUIET_MODE=1` - - `OWLEN_DISABLE_PROJECT_DOC=1` - - `OWLEN_APPROVAL_MODE=full-auto` - -3. **JSON Output** - - Structured output for parsing - - Exit codes for success/failure - - Progress events on stderr - -**Implementation:** -- New `owlen-cli` subcommand: `exec` -- Extend `owlen-core::session` with non-interactive mode -- Add JSON serialization for results -- Environment variable parsing in config - -**Files to modify:** -- `crates/owlen-cli/src/main.rs` -- `crates/owlen-core/src/session.rs` - -### Phase 15: Multi-Provider Expansion (HIGH PRIORITY) - -**Goal**: Support cloud providers while maintaining local-first - -**Providers to add:** -1. OpenAI (GPT-4, o1, o4-mini) -2. Anthropic (Claude 3.5 Sonnet, Opus) -3. Google (Gemini Ultra, Pro) -4. Mistral AI - -**Configuration:** -```toml -[providers.openai] -api_key = "${OPENAI_API_KEY}" -model = "o4-mini" -enabled = true - -[providers.anthropic] -api_key = "${ANTHROPIC_API_KEY}" -model = "claude-3-5-sonnet" -enabled = true -``` - -**Runtime Switching:** -``` -:model ollama/starcoder -:model openai/o4-mini -:model anthropic/claude-3-5-sonnet -``` - -**Implementation:** -- Create `owlen-openai`, `owlen-anthropic`, `owlen-google` crates -- Implement `Provider` trait for each -- Add runtime model switching to TUI -- Maintain Ollama as default - -**Files to create:** -- `crates/owlen-openai/src/lib.rs` -- `crates/owlen-anthropic/src/lib.rs` -- `crates/owlen-google/src/lib.rs` - -### Phase 16: Custom Slash Commands (MEDIUM PRIORITY) - -**Goal**: User and team-defined workflows - -**Features:** -1. **Command Directories** - - `~/.owlen/commands/` (user, available everywhere) - - `.owlen/commands/` (project, checked into git) - - Support `$ARGUMENTS` keyword - -2. **Example Structure** - ```markdown - # .owlen/commands/fix-github-issue.md - Please analyze and fix GitHub issue: $ARGUMENTS. - 1. Use `gh issue view` to get details - 2. Implement changes - 3. Write and run tests - 4. 
Create PR - ``` - -3. **TUI Integration** - - Auto-complete for custom commands - - Help text from command files - - Parameter validation - -**Implementation:** -- New `owlen-core::commands` module -- Command discovery and parsing -- Template expansion -- TUI command registration - -**Files to create:** -- `crates/owlen-core/src/commands.rs` -- `crates/owlen-tui/src/commands/custom.rs` - -### Phase 17: Plugin System (MEDIUM PRIORITY) - -**Goal**: One-command installation of tool collections - -**Features:** -1. **Plugin Structure** - ```json - { - "name": "github-workflow", - "version": "1.0.0", - "commands": [ - {"name": "pr", "file": "commands/pr.md"} - ], - "mcp_servers": [ - { - "name": "github", - "command": "${OWLEN_PLUGIN_ROOT}/bin/github-mcp" - } - ] - } - ``` - -2. **Installation** - ```bash - owlen plugin install github-workflow - owlen plugin list - owlen plugin remove github-workflow - ``` - -3. **Discovery** - - `~/.owlen/plugins/` directory - - Git repository URLs - - Plugin registry (future) - -**Implementation:** -- New `owlen-core::plugins` module -- Plugin manifest parser -- Installation/removal logic -- Sandboxing for plugin code - -**Files to create:** -- `crates/owlen-core/src/plugins.rs` -- `crates/owlen-cli/src/commands/plugin.rs` - -### Phase 18: Extended Thinking Modes (MEDIUM PRIORITY) - -**Goal**: Progressive computation budgets for complex tasks - -**Modes:** -- `think` - basic extended thinking -- `think hard` - increased computation -- `think harder` - more computation -- `ultrathink` - maximum budget - -**Implementation:** -- Extend `owlen-core::types::ChatParameters` -- Add thinking mode to TUI commands -- Configure per-provider max tokens - -**Files to modify:** -- `crates/owlen-core/src/types.rs` -- `crates/owlen-tui/src/command_parser.rs` - -### Phase 19: Git Workflow Automation (MEDIUM PRIORITY) - -**Goal**: Streamline common Git operations - -**Features:** -1. Auto-commit message generation -2. PR creation via `gh` CLI -3. 
Rebase conflict resolution -4. File revert operations -5. Git history analysis - -**Implementation:** -- New `owlen-mcp-git-server` crate -- Tools: `commit`, `create_pr`, `rebase`, `revert`, `history` -- Integration with TUI commands - -**Files to create:** -- `crates/owlen-mcp-git-server/src/lib.rs` - -### Phase 20: Enterprise Features (LOW PRIORITY) - -**Goal**: Team and enterprise deployment support - -**Features:** -1. **Managed Configuration** - - `/etc/owlen/managed-mcp.json` (Linux) - - Restrict user additions with `useEnterpriseMcpConfigOnly` - -2. **Audit Logging** - - Log all file writes and shell commands - - Structured JSON logs - - Tamper-proof storage - -3. **Team Collaboration** - - Shared OWLEN.md across team - - Project-scoped MCP servers in `.mcp.json` - - Approval policy enforcement - -**Implementation:** -- Extend `owlen-core::config` with managed settings -- New `owlen-core::audit` module -- Enterprise deployment documentation - -## Testing Requirements - -### Test Coverage Goals - -- **Unit tests**: 80%+ coverage for `owlen-core` -- **Integration tests**: All MCP servers, providers -- **TUI tests**: Key workflows (not pixel-perfect) - -### Test Organization - -```rust -#[cfg(test)] -mod tests { - use super::*; - use crate::provider::test_utils::MockProvider; - use crate::mcp::test_utils::MockMcpClient; - - #[test] - fn test_feature() { - // Setup - let provider = MockProvider::new(); - - // Execute - let result = provider.chat(request).await; - - // Assert - assert!(result.is_ok()); - } -} -``` - -### Running Tests - -```bash -cargo test --all # All tests -cargo test --lib -p owlen-core # Core library tests -cargo test --test integration # Integration tests -``` - -## Documentation Standards - -### Code Documentation - -1. **Module-level** (`//!` at top of file): - ```rust - //! Brief module description - //! - //! Detailed explanation of module purpose, - //! key types, and usage examples. - ``` - -2. 
**Public APIs** (`///` above items): - ```rust - /// Brief description - /// - /// # Arguments - /// * `arg1` - Description - /// - /// # Returns - /// Description of return value - /// - /// # Errors - /// When this function returns an error - /// - /// # Example - /// ``` - /// let result = function(arg); - /// ``` - pub fn function(arg: Type) -> Result { - // implementation - } - ``` - -3. **Private items**: Optional, use for complex logic - -### User Documentation - -Location: `docs/` directory - -Files to maintain: -- `architecture.md` - System design -- `configuration.md` - Config reference -- `migration-guide.md` - Version upgrades -- `troubleshooting.md` - Common issues -- `provider-implementation.md` - Adding providers -- `faq.md` - Frequently asked questions - -## Git Workflow - -### Branch Strategy - -- `main` - stable releases only -- `dev` - active development (default) -- `feature/*` - new features -- `fix/*` - bug fixes -- `docs/*` - documentation only - -### Commit Messages - -Follow conventional commits: - -``` -type(scope): brief description - -Detailed explanation of changes. - -Breaking changes, if any. - -🤖 Generated with [Claude Code](https://claude.com/claude-code) - -Co-Authored-By: Claude -``` - -Types: `feat`, `fix`, `docs`, `refactor`, `test`, `chore` - -### Pre-commit Hooks - -Automatically run: -- `cargo fmt` (formatting) -- `cargo check` (compilation) -- `cargo clippy` (linting) -- YAML/TOML validation -- Trailing whitespace removal - -## Performance Guidelines - -### Optimization Priorities - -1. **Startup time**: < 500ms cold start -2. **First token latency**: < 2s for local models -3. **Memory usage**: < 100MB base, < 500MB with conversation -4. 
**Responsiveness**: TUI redraws < 16ms (60 FPS) - -### Profiling - -```bash -cargo build --release --features profiling -valgrind --tool=callgrind target/release/owlen -kcachegrind callgrind.out.* -``` - -### Async Performance - -- Avoid blocking in async contexts -- Use `tokio::spawn` for CPU-intensive work -- Set timeouts on all network operations -- Cancel tasks on shutdown - -## Security Considerations - -### Threat Model - -**Trusted:** -- User's local machine -- User-installed Ollama models -- User configuration files - -**Untrusted:** -- MCP server responses -- Web search results -- Code execution output -- Cloud LLM responses - -### Security Measures - -1. **Input Validation** - - Sanitize all MCP tool arguments - - Validate JSON schemas strictly - - Escape shell commands - -2. **Sandboxing** - - Docker for code execution - - Network isolation - - Filesystem restrictions - -3. **Secrets Management** - - Never log API keys - - Use environment variables - - Encrypt sensitive config fields - -4. **Dependency Auditing** - ```bash - cargo audit - cargo deny check - ``` - -## Debugging Tips - -### Enable Debug Logging - -```bash -OWLEN_DEBUG_OLLAMA=1 owlen # Ollama requests -RUST_LOG=debug owlen # All debug logs -RUST_BACKTRACE=1 owlen # Stack traces -``` - -### Common Issues - -1. **Timeout on Ollama** - - Check `ollama ps` for loaded models - - Increase timeout in config - - Restart Ollama service - -2. **MCP Server Not Found** - - Verify `mcp_servers` config - - Check server binary exists - - Test server manually with STDIO - -3. **TUI Rendering Issues** - - Test in different terminals - - Check terminal size (`tput cols; tput lines`) - - Verify theme compatibility - -## Contributing - -### Before Submitting PR - -1. Run full test suite: `cargo test --all` -2. Check formatting: `cargo fmt -- --check` -3. Run linter: `cargo clippy -- -D warnings` -4. Update documentation if API changed -5. Add tests for new features -6. 
Update CHANGELOG.md - -### PR Description Template - -```markdown -## Summary -Brief description of changes - -## Type of Change -- [ ] Bug fix -- [ ] New feature -- [ ] Breaking change -- [ ] Documentation update - -## Testing -Describe tests performed - -## Checklist -- [ ] Tests added/updated -- [ ] Documentation updated -- [ ] CHANGELOG.md updated -- [ ] No clippy warnings -``` - -## Resources - -### External Documentation - -- [Ratatui Docs](https://ratatui.rs/) -- [Tokio Tutorial](https://tokio.rs/tokio/tutorial) -- [MCP Specification](https://modelcontextprotocol.io/) -- [Ollama API](https://github.com/ollama/ollama/blob/main/docs/api.md) - -### Internal Documentation - -- `.agents/new_phases.md` - 10-phase migration plan (completed) -- `docs/phase5-mode-system.md` - Mode system design -- `docs/migration-guide.md` - v0.x → v1.0 migration - -### Community - -- GitHub Issues: Bug reports and feature requests -- GitHub Discussions: Questions and ideas -- AUR Package: `owlen-git` (Arch Linux) - -## Version History - -- **v1.0.0** (current) - MCP-only architecture, Phase 10 complete -- **v0.2.0** - Added web search, code execution servers -- **v0.1.0** - Initial release with Ollama support - -## License - -Owlen is open source software. See LICENSE file for details. +* Local Ollama works out-of-the-box (no key required). +* Ollama Cloud is available **only when** an API key is present; no more 401 loops; clear fallback to local. +* Chat header shows **context used / context window** and **%**. +* Cloud usage shows **hourly / weekly token usage** (tracked locally; limits configurable). +* “Internet?” → the model/tooling performs web search automatically; no “I can’t access the internet” replies. --- -**Last Updated**: 2025-10-11 -**Maintained By**: Owlen Development Team -**For AI Agents**: Follow these guidelines when modifying Owlen codebase. 
Prioritize MCP client enhancement (Phase 11) and approval system (Phase 12) for feature parity with Codex/Claude Code while maintaining local-first philosophy.
+## Release Assumptions
+
+* Provider config has **separate entries** for local (“ollama”) and cloud (“ollama-cloud”), both optional.
+* **Cloud base URL** is config-driven (default `[Inference] https://api.ollama.com`), **local** default `http://localhost:11434`.
+* Cloud requests include `Authorization: Bearer <API_KEY>`.
+* Token counts (`prompt_eval_count`, `eval_count`) are read from provider responses **if available**; else fallback to local estimation.
+* Rate-limit quotas unknown → **user-configurable** (`hourly_quota_tokens`, `weekly_quota_tokens`); we still track actual usage.
+
+---
+
+## Commit Plan (Conventional Commits)
+
+### 1) feat(provider/ollama): health checks, resilient model listing, friendly errors
+
+* **Why:** Local should “just work”; no brittle failures if model not pulled or daemon is down.
+* **Implement:**
+
+  * `GET /api/tags` with timeout & retry; on failure → user-friendly banner (“Ollama not reachable on localhost:11434”).
+  * If chat hits “model not found” → suggest `ollama pull <model>` (do **not** auto-pull).
+  * Pre-chat healthcheck endpoint with short timeout.
+* **AC:** With no Ollama, UI shows actionable error, app stays usable. With Ollama up, models list renders within TTL.
+* **Tests:** Mock 200/500/timeout; simulate missing model.
+
+### 2) feat(provider/ollama-cloud): separate provider with auth header & key-gating
+
+* **Why:** Prevent 401s; enable cloud only when key is present.
+* **Implement:**
+
+  * New provider “ollama-cloud”.
+  * Build `reqwest::Client` with default headers incl. `Authorization: Bearer <API_KEY>`.
+  * Base URL from config (default `[Inference] https://api.ollama.com`).
+  * If no key: provider not registered; cloud models hidden.
+* **AC:** With key → cloud models list & chat work; without key → no cloud provider listed.
+* **Tests:** No key (hidden), invalid key (401 handled), valid key (happy path). + +### 3) fix(provider/ollama-cloud): handle 401/429 gracefully, auto-degrade + +* **Why:** Avoid dead UX where cloud is selectable but unusable. +* **Implement:** + + * Intercept 401/403 → mark provider “unauthorized”, show toast “Cloud key invalid; using local”. + * Intercept 429 → toast “Cloud rate limit hit; retry later”; keep provider enabled. + * Auto-fallback to last good local provider for the active chat (don’t lose message). +* **AC:** Selecting cloud with bad key never traps the user; one click recovers to local. +* **Tests:** Simulated 401/429 mid-stream & at start. + +### 4) feat(models/registry): multi-provider registry + aggregated model list + +* **Why:** Show local & cloud side-by-side, clearly labeled. +* **Implement:** + + * Registry holds N providers (local, cloud). + * Aggregate lists with `provider_tag` = `ollama` / `ollama-cloud`. + * De-dupe identical names by appending provider tag in UI (“qwen3:8b · local”, “qwen3:8b-cloud · cloud”). +* **AC:** Model picker groups by provider; switching respects selection. +* **Tests:** Only local; only cloud; both; de-dup checks. + +### 5) refactor(session): message pipeline to support tool-calls & partial updates + +* **Why:** Prepare for search tool loop and robust streaming. +* **Implement:** + + * Centralize stream parser → yields either `assistant_text` chunks **or** `tool_call` messages. + * Buffer + flush on `done` or on explicit tool boundary. +* **AC:** No regressions in normal chat; tool call boundary events visible to controller. +* **Tests:** Stream with mixed content, malformed frames, newline-split JSON. + +### 6) fix(streaming): tolerant JSON lines parser & end-of-stream semantics + +* **Why:** Avoid UI stalls on minor protocol hiccups. +* **Implement:** + + * Accept `\n`/`\r\n` delimiters; ignore empty lines; robust `done: true` handling; final metrics extraction. 
+* **AC:** Long completions never hang; final token counts captured. +* **Tests:** Fuzz malformed frames. + +### 7) feat(ui/header): context usage indicator (value + %), colorized thresholds + +* **Why:** Transparency: “2560 / 8192 (31%)”. +* **Implement:** + + * Track last `prompt_eval_count` (or estimate) as “context used”. + * Context window: from model metadata (if available) else config default per provider/model family. + * Header shows: `Model · Context 2.6k / 8k (33%)`. Colors: <60% normal, 60–85% warn, >85% danger. +* **AC:** Live updates after each assistant turn. +* **Tests:** Various windows & counts; edge 0%/100%. + +### 8) feat(usage): token usage tracker (hourly/weekly), local store + +* **Why:** Cloud limits visibility even without official API. +* **Implement:** + + * Append per-provider token use after each response: prompt+completion. + * Rolling sums: last 60m, last 7d. + * Persist to JSON/SQLite in app data dir; prune daily. + * Configurable quotas: `providers.ollama_cloud.hourly_quota_tokens`, `weekly_quota_tokens`. +* **AC:** `:limits` shows totals & percentages; survives restart. +* **Tests:** Time-travel tests; persistence. + +### 9) feat(ui/usage): surface hourly/weekly usage + threshold toasts + +* **Why:** Prevent surprises; act before you hit the wall. +* **Implement:** + + * Header second line (when cloud active): `Cloud usage · 12k/50k (hour) · 90k/250k (week)` with % colors. + * Toast at 80% & 95% thresholds. +* **AC:** Visuals match tracker; toasts fire once per threshold band. +* **Tests:** Threshold crossings; reset behavior. + +### 10) feat(tools): generic tool-calling loop (function-call adapter) + +* **Why:** “Use the web” without whining. +* **Implement:** + + * Internal tool schema: `{ name, parameters(JSONSchema), invoke(args)->ToolResult }`. + * Provider adapter: if model emits a tool call → pause stream, invoke tool, append `tool` message, resume. + * Guardrails: max tool hops (e.g., 3), max tokens per hop. 
* **AC:** Model can call tools mid-answer; transcript shows tool result context to model (not necessarily to user).
+* **Tests:** Single & chained tool calls; loop cutoff.
+
+### 11) feat(tool/web.search): HTTP search wrapper for cloud mode; fallback stub
+
+* **Why:** Default “internet” capability.
+* **Implement:**
+
+  * Tool name: `web.search` with params `{ query: string }`.
+  * Cloud path: call provider’s search endpoint (configurable path); include API key.
+  * Fallback (no cloud): disabled; model will not see the tool (prevents “I can’t” messaging).
+* **AC:** Asking “what happened today in Rust X?” triggers one search call and integrates snippets into the answer.
+* **Tests:** Happy path, 401/key missing (tool hidden), network failure (tool error surfaced to model).
+
+### 12) feat(commands): `:provider`, `:model`, `:limits`, `:web on|off`
+
+* **Why:** Fast control for power users.
+* **Implement:**
+
+  * `:provider` lists/switches active provider.
+  * `:model` lists/switches models under active provider.
+  * `:limits` shows tracked usage.
+  * `:web on|off` toggles exposing `web.search` to the model.
+* **AC:** Commands work in both TUI and CLI modes.
+* **Tests:** Command parsing & side-effects.
+
+### 13) feat(config): explicit sections & env fallbacks
+
+* **Why:** Clear separation + easy secrets management.
+* **Implement (example):**
+
+  ```toml
+  [providers.ollama]
+  base_url = "http://localhost:11434"
+  list_ttl_secs = 60
+  default_context_window = 8192
+
+  [providers.ollama_cloud]
+  enabled = true
+  base_url = "https://api.ollama.com" # [Inference] default
+  api_key = "" # read from env OWLEN_OLLAMA_CLOUD_API_KEY if empty
+  hourly_quota_tokens = 50000 # configurable; visual only
+  weekly_quota_tokens = 250000 # configurable; visual only
+  list_ttl_secs = 60
+  ```
+* **AC:** Empty key → cloud disabled; env overrides work.
+* **Tests:** Env vs file precedence.
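The empty-key fallback in commit 13 reduces to one resolution rule, sketched below. The function signature and the env-var name are assumptions for illustration; the env value would come from `std::env::var("OWLEN_OLLAMA_CLOUD_API_KEY").ok()` at the call site.

```rust
// Sketch of key resolution (commit 13): a non-empty config value wins,
// otherwise fall back to the environment; no key at all means the cloud
// provider stays hidden.

fn resolve_cloud_key(file_value: &str, env_value: Option<&str>) -> Option<String> {
    let file = file_value.trim();
    if !file.is_empty() {
        return Some(file.to_string()); // explicit config beats the environment
    }
    env_value
        .map(str::trim)
        .filter(|v| !v.is_empty())
        .map(str::to_string)
}

fn main() {
    assert_eq!(resolve_cloud_key("", None), None); // no key -> cloud disabled
    assert_eq!(resolve_cloud_key("", Some("sk-env")).as_deref(), Some("sk-env"));
    assert_eq!(resolve_cloud_key("sk-file", Some("sk-env")).as_deref(), Some("sk-file"));
    println!("ok");
}
```

Returning `Option<String>` (rather than an empty string) makes the "provider not registered when key is absent" behavior from commit 2 a simple `if let` at registry construction time.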
+ +### 14) docs(readme): setup & troubleshooting for local + cloud + +* **Why:** Reduce support pings. +* **Include:** + + * Local prerequisites; healthcheck tips; “model not found” guidance. + * Cloud key setup; common 401/429 causes; privacy note. + * Context/limits UI explainer; tool behavior & toggles. + * Upgrade notes from v0.1 → v0.2. +* **AC:** New users can configure both in <5 min. + +### 15) test(integration): local-only, cloud-only, mixed; auth & rate-limit sims + +* **Why:** Prevent regressions. +* **Implement:** + + * Wiremock stubs for cloud: `/tags`, `/chat`, search endpoint, 401/429 cases. + * Snapshot test for tool-call transcript roundtrip. +* **AC:** Green suite in CI; reproducible. + +### 16) refactor(errors): typed provider errors + UI toasts + +* **Why:** Replace generic “something broke”. +* **Implement:** Error enum: `Unavailable`, `Unauthorized`, `RateLimited`, `Timeout`, `Protocol`. Map to toasts/banners. +* **AC:** Each known failure surfaces a precise message; logs redact secrets. + +### 17) perf(models): cache model lists with TTL; invalidate on user action + +* **Why:** Reduce network churn. +* **AC:** Re-open picker doesn’t re-fetch until TTL or manual refresh. + +### 18) chore(release): bump to v0.2; changelog; package metadata + +* **AC:** `--version` shows v0.2; changelog includes above bullets. + +--- + +## Missing Features vs. Codex / Claude-Code (Queued for v0.3+) + +* **Multi-vendor providers:** OpenAI & Anthropic provider crates with function-calling parity and structured outputs. +* **Agentic coding ops:** Safe file edits, `git` ops, shell exec with approval queue. +* **“Thinking/Plan” pane:** First-class rendering of plan/reflect steps; optional auto-approve. +* **Plugin system:** Tool discovery & permissions; marketplace later. +* **IDE integration:** Minimal LSP/bridge (VS Code/JetBrains) to run Owlen as backend. +* **Retrieval/RAG:** Local project indexing; selective context injection with token budgeter. 
+
+---
+
+## Acceptance Checklist (Release Gate)
+
+* [ ] Local Ollama: healthcheck OK, models list OK, chat OK.
+* [ ] Cloud appears only with valid key; no 401 loops; 429 shows toast.
+* [ ] Header shows context value and %; updates after each turn.
+* [ ] `:limits` shows hourly/weekly tallies; persists across restarts.
+* [ ] Asking for current info triggers `web.search` (cloud on) and returns an answer without “I can’t access the internet”.
+* [ ] Docs updated; sample config included; tests green.
+
+---
+
+## Notes & Caveats
+
+* **Base URLs & endpoints:** Kept **config-driven**. Defaults are `[Inference]`. If your cloud endpoint differs, set it in `config.toml`.
+* **Quotas:** No official token quota API assumed; we **track locally** and let you configure limits for UI display.
+* **Tooling:** If a given model doesn’t emit tool calls, `web.search` won’t be visible to it. You can force exposure via `:web on`.
+
+---
+
+## Lightly Opinionated Guidance (aka sparring stance)
+
+* Don’t auto-pull models on errors—teach users where control lives.
+* Keep cloud totally opt-in; hide it on misconfig to reduce paper cuts.
+* Treat tool-calling like `sudo`: short leash, clear logs, obvious toggles.
+* Make the header a cockpit, not a billboard: model · context · usage; that’s it.
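For reference, the hourly/weekly tallies that back the `:limits` checklist item (commit 8) reduce to rolling-window sums over locally persisted usage events. A minimal sketch with assumed type and field names:

```rust
// Sketch of the rolling-window usage tracker (commit 8). The real tracker
// would persist events to JSON/SQLite and prune daily; here we only show
// the window arithmetic.
use std::time::{Duration, SystemTime};

struct UsageEvent {
    at: SystemTime,
    tokens: u64, // prompt + completion tokens for one response
}

/// Sum of tokens spent within `window` before `now`.
fn rolling_sum(events: &[UsageEvent], now: SystemTime, window: Duration) -> u64 {
    events
        .iter()
        .filter(|e| now.duration_since(e.at).map_or(false, |age| age <= window))
        .map(|e| e.tokens)
        .sum()
}

fn main() {
    let now = SystemTime::now();
    let events = vec![
        UsageEvent { at: now - Duration::from_secs(30 * 60), tokens: 12_000 },   // 30 min ago
        UsageEvent { at: now - Duration::from_secs(2 * 3600), tokens: 8_000 },   // 2 h ago
        UsageEvent { at: now - Duration::from_secs(8 * 86_400), tokens: 5_000 }, // >1 week ago
    ];
    let hour = rolling_sum(&events, now, Duration::from_secs(3600));
    let week = rolling_sum(&events, now, Duration::from_secs(7 * 86_400));
    assert_eq!(hour, 12_000); // only the 30-min event counts
    assert_eq!(week, 20_000); // week window drops the 8-day-old event
    println!("ok");
}
```

Because the sums are recomputed from raw events, the 80%/95% threshold toasts in commit 9 can be derived on every update without separate counters to keep in sync.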