fix: resolve all compilation errors and clippy warnings

This commit fixes 12 categories of errors across the codebase:

- Fix owlen-mcp-llm-server build target conflict by renaming lib.rs to main.rs
- Resolve ambiguous glob re-exports in owlen-core by using explicit exports
- Add Default derive to MockMcpClient and MockProvider test utilities
- Remove unused imports from owlen-core test files
- Fix needless borrows in test file arguments
- Improve Config initialization style in mode_tool_filter tests
- Make AgentExecutor::parse_response public for testing
- Remove non-existent max_tool_calls field from AgentConfig usage
- Fix AgentExecutor::new calls to use correct 3-argument signature
- Fix AgentResult field access in agent tests
- Use Debug formatting instead of Display for AgentResult
- Remove unnecessary default() calls on unit structs

All changes ensure the project compiles cleanly with:

- cargo check --all-targets ✓
- cargo clippy --all-targets -- -D warnings ✓
- cargo test --no-run ✓

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
docs: add comprehensive AGENTS.md for AI agent development
2025-10-11 00:49:32 +02:00 · 2025-10-11 00:37:04 +02:00
11 changed files with 853 additions and 65 deletions
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -0,0 +1,798 @@
 # AGENTS.md - AI Agent Instructions for Owlen Development
 This document provides comprehensive context and guidelines for AI agents (Claude, GPT-4, etc.) working on the Owlen codebase.
 ## Project Overview
 **Owlen** is a local-first, terminal-based AI assistant built in Rust using the Ratatui TUI framework. It implements a Model Context Protocol (MCP) architecture for modular tool execution and supports both local (Ollama) and cloud LLM providers.
 **Core Philosophy:**
 - **Local-first**: Prioritize local LLMs (Ollama) with cloud as fallback
 - **Privacy-focused**: No telemetry, user data stays on device
 - **MCP-native**: All operations through MCP servers for modularity
 - **Terminal-native**: Vim-style modal interaction in a beautiful TUI
 **Current Status:** v1.0 - MCP-only architecture (Phase 10 complete)
 ## Architecture
 ### Project Structure
 ```
 owlen/
 ├── crates/
 │   ├── owlen-core/          # Core types, config, provider traits
 │   ├── owlen-tui/           # Ratatui-based terminal interface
 │   ├── owlen-cli/           # Command-line interface
 │   ├── owlen-ollama/        # Ollama provider implementation
 │   ├── owlen-mcp-llm-server/    # LLM inference as MCP server
 │   ├── owlen-mcp-client/        # MCP client library
 │   ├── owlen-mcp-server/        # Base MCP server framework
 │   ├── owlen-mcp-code-server/   # Code execution in Docker
 │   └── owlen-mcp-prompt-server/ # Prompt management server
 ├── docs/                    # Documentation
 ├── themes/                  # TUI color themes
 └── .agents/                 # Agent development plans
 ```
 ### Key Technologies
 - **Language**: Rust 1.83+
 - **TUI**: Ratatui with Crossterm backend
 - **Async Runtime**: Tokio
 - **Config**: TOML (serde)
 - **HTTP Client**: reqwest
 - **LLM Providers**: Ollama (primary), with extensibility for OpenAI/Anthropic
 - **Protocol**: JSON-RPC 2.0 over STDIO/HTTP/WebSocket
 ## Current Features (v1.0)
 ### Core Capabilities
 1. **MCP Architecture** (Phase 3-10 complete)
   - All LLM interactions via MCP servers
   - Local and remote MCP client support
   - STDIO, HTTP, WebSocket transports
   - Automatic failover with health checks
 2. **Provider System**
   - Ollama (local and cloud)
   - Configurable per-provider settings
   - API key management with env variable expansion
   - Model switching via TUI (`:m` command)
 3. **Agentic Loop** (ReAct pattern)
   - THOUGHT → ACTION → OBSERVATION cycle
   - Tool discovery and execution
   - Configurable iteration limits
   - Emergency stop (Ctrl+C)
 4. **Mode System**
   - Chat mode: Limited tool availability
   - Code mode: Full tool access
   - Tool filtering by mode
   - Runtime mode switching
 5. **Session Management**
   - Auto-save conversations
   - Session persistence with encryption
   - Description generation
   - Session timeout management
 6. **Security**
   - Docker sandboxing for code execution
   - Tool whitelisting
   - Permission prompts for dangerous operations
   - Network isolation options
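The THOUGHT → ACTION → OBSERVATION cycle depends on splitting each LLM reply into its labelled parts. The following is a minimal sketch of that parsing step, not the actual `AgentExecutor::parse_response`. It assumes one `LABEL: value` pair per line; `ParsedStep`, `parse_step`, and the `FINAL_ANSWER:` label are hypothetical names chosen for illustration.

```rust
/// Illustrative parse result (hypothetical; not Owlen's `LlmResponse`).
#[derive(Debug, PartialEq)]
enum ParsedStep {
    /// The model asked for a tool; `input` is kept as raw, unparsed JSON text.
    ToolCall { thought: String, action: String, input: String },
    /// The model finished and produced an answer.
    Final { thought: String, answer: String },
}

/// Parse a reply made of `LABEL: value` lines.
/// The label names are assumptions made for this sketch.
fn parse_step(text: &str) -> Result<ParsedStep, String> {
    let mut thought = String::new();
    let mut action: Option<String> = None;
    let mut input = String::new();
    let mut answer: Option<String> = None;

    for line in text.lines() {
        let line = line.trim();
        if let Some(rest) = line.strip_prefix("THOUGHT:") {
            thought = rest.trim().to_string();
        } else if let Some(rest) = line.strip_prefix("ACTION_INPUT:") {
            input = rest.trim().to_string();
        } else if let Some(rest) = line.strip_prefix("ACTION:") {
            action = Some(rest.trim().to_string());
        } else if let Some(rest) = line.strip_prefix("FINAL_ANSWER:") {
            answer = Some(rest.trim().to_string());
        }
    }

    // A final answer ends the loop; otherwise a tool call is required.
    match (answer, action) {
        (Some(answer), _) => Ok(ParsedStep::Final { thought, answer }),
        (None, Some(action)) => Ok(ParsedStep::ToolCall { thought, action, input }),
        (None, None) => Err("reply contains neither ACTION nor FINAL_ANSWER".to_string()),
    }
}
```

Returning an error for unlabelled replies lets the caller decide whether to retry or stop, which fits the configurable iteration limit described above.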
 ### TUI Features
 - Vim-style modal editing (Normal, Insert, Visual, Command modes)
 - Multi-panel layout (conversation, status, input)
 - Syntax highlighting for code blocks
 - Theme system (10+ built-in themes)
 - Scrollback history (configurable limit)
 - Word wrap and visual selection
 ## Development Guidelines
 ### Code Style
 1. **Rust Best Practices**
   - Use `rustfmt` (pre-commit hook enforced)
   - Run `cargo clippy` before commits
   - Prefer `Result` over `panic!` for errors
   - Document public APIs with `///` comments
 2. **Error Handling**
   - Use `owlen_core::Error` enum for all errors
   - Chain errors with context (`.map_err(|e| Error::X(format!(...)))`)
   - Never unwrap in library code (tests OK)
 3. **Async Patterns**
   - All I/O operations must be async
   - Use `tokio::spawn` for background tasks
   - Prefer `tokio::sync::mpsc` for channels
   - Always set timeouts for network operations
 4. **Testing**
   - Unit tests in same file (`#[cfg(test)] mod tests`)
   - Use mock implementations from `test_utils` modules
   - Integration tests in `crates/*/tests/`
   - All public APIs must have tests
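The error-handling rules above (chain with context, never unwrap in library code) can be sketched as follows. The `Error` enum here is a stand-in for `owlen_core::Error`; its `Config` variant and the `read_config` helper are assumptions made for illustration only.

```rust
use std::fmt;
use std::fs;

/// Stand-in for `owlen_core::Error`; the `Config` variant is hypothetical.
#[derive(Debug)]
enum Error {
    Config(String),
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::Config(msg) => write!(f, "config error: {msg}"),
        }
    }
}

impl std::error::Error for Error {}

/// Read a config file, attaching the path as context instead of unwrapping.
fn read_config(path: &str) -> Result<String, Error> {
    fs::read_to_string(path)
        .map_err(|e| Error::Config(format!("failed to read {path}: {e}")))
}
```

The caller receives both the failing path and the underlying I/O error text, which is usually enough to diagnose a problem without a backtrace.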
 ### File Organization
 **When editing existing files:**
 1. Read the entire file first (use `Read` tool)
 2. Preserve existing code style and formatting
 3. Update related tests in the same commit
 4. Keep changes atomic and focused
 **When creating new files:**
 1. Check `crates/owlen-core/src/` for similar modules
 2. Follow existing module structure
 3. Add to `lib.rs` with appropriate visibility
 4. Document module purpose with `//!` header
 ### Configuration
 **Config file**: `~/.config/owlen/config.toml`
 Example structure:
 ```toml
 [general]
 default_provider = "ollama"
 default_model = "llama3.2:latest"
 enable_streaming = true
 [mcp]
 # MCP is always enabled in v1.0+
 [providers.ollama]
 provider_type = "ollama"
 base_url = "http://localhost:11434"
 [providers.ollama-cloud]
 provider_type = "ollama-cloud"
 base_url = "https://ollama.com"
 api_key = "$OLLAMA_API_KEY"
 [ui]
 theme = "default_dark"
 word_wrap = true
 [security]
 enable_sandboxing = true
 allowed_tools = ["web_search", "code_exec"]
 ```
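The `api_key = "$OLLAMA_API_KEY"` entry relies on environment variable expansion. A minimal sketch of how such a value could be resolved is shown below. The function name `expand_env_value` and the lookup closure are assumptions, and the sketch only handles values that consist entirely of `$NAME` or `${NAME}`; Owlen's real expansion rules may differ.

```rust
/// Expand a config value that consists entirely of `$NAME` or `${NAME}`.
/// Values without a leading `$` are returned unchanged.
/// `lookup` abstracts the environment so the function is easy to test.
fn expand_env_value<F>(value: &str, lookup: F) -> Result<String, String>
where
    F: Fn(&str) -> Option<String>,
{
    let Some(rest) = value.strip_prefix('$') else {
        return Ok(value.to_string());
    };
    // Accept both `${NAME}` and `$NAME`.
    let name = rest
        .strip_prefix('{')
        .and_then(|inner| inner.strip_suffix('}'))
        .unwrap_or(rest);
    if name.is_empty() || !name.chars().all(|c| c.is_ascii_alphanumeric() || c == '_') {
        return Err(format!("invalid variable reference: {value}"));
    }
    lookup(name).ok_or_else(|| format!("environment variable {name} is not set"))
}
```

In production code the lookup would typically be `|name| std::env::var(name).ok()`. Returning an error for an unset variable, rather than an empty string, avoids silently sending an empty API key.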
 ### Common Tasks
 #### Adding a New Provider
 1. Create `crates/owlen-{provider}/` crate
 2. Implement `owlen_core::provider::Provider` trait
 3. Add to `owlen_core::router::ProviderRouter`
 4. Update config schema in `owlen_core::config`
 5. Add tests with `MockProvider` pattern
 6. Document in `docs/provider-implementation.md`
 #### Adding a New MCP Server
 1. Create `crates/owlen-mcp-{name}-server/` crate
 2. Implement JSON-RPC 2.0 protocol handlers
 3. Define tool descriptors with JSON schemas
 4. Add sandboxing/security checks
 5. Register in `mcp_servers` config array
 6. Document tool capabilities
 #### Adding a TUI Feature
 1. Modify `crates/owlen-tui/src/chat_app.rs`
 2. Update keybinding handlers
 3. Extend UI rendering in `draw()` method
 4. Add to help screen (`?` command)
 5. Test with different terminal sizes
 6. Ensure theme compatibility
 ## Feature Parity Roadmap
 Based on analysis of OpenAI Codex and Claude Code, here are prioritized features to implement:
 ### Phase 11: MCP Client Enhancement (HIGHEST PRIORITY)
 **Goal**: Full MCP client capabilities to access ecosystem tools
 **Features:**
 1. **MCP Server Management**
   - `owlen mcp add/list/remove` commands
   - Three config scopes: local, project (`.mcp.json`), user
   - Environment variable expansion in config
   - OAuth 2.0 authentication for remote servers
 2. **MCP Resource References**
   - `@github:issue://123` syntax
   - `@postgres:schema://users` syntax
   - Auto-completion for resources
 3. **MCP Prompts as Slash Commands**
   - `/mcp__github__list_prs`
   - Dynamic command registration
 **Implementation:**
 - Extend `owlen-mcp-client` crate
 - Add `.mcp.json` parsing to `owlen-core::config`
 - Update TUI command parser for `@` and `/mcp__` syntax
 - Add OAuth flow to TUI
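To make the `@server:kind://id` resource syntax concrete, here is a rough sketch of how the TUI command parser might split a token such as `@github:issue://123`. `ResourceRef` and `parse_resource_ref` are hypothetical names, and the field names are assumptions rather than part of any existing Owlen API.

```rust
/// Parts of an MCP resource reference (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
struct ResourceRef<'a> {
    server: &'a str,
    kind: &'a str,
    id: &'a str,
}

/// Split `@server:kind://id` into its three parts.
/// Returns `None` when the token does not match that shape.
fn parse_resource_ref(token: &str) -> Option<ResourceRef<'_>> {
    let rest = token.strip_prefix('@')?;
    let (server, uri) = rest.split_once(':')?;
    let (kind, id) = uri.split_once("://")?;
    if server.is_empty() || kind.is_empty() || id.is_empty() {
        return None;
    }
    Some(ResourceRef { server, kind, id })
}
```

Borrowing from the input avoids allocation while the user is typing, which matters if the same function also feeds auto-completion.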
 **Files to modify:**
 - `crates/owlen-mcp-client/src/lib.rs`
 - `crates/owlen-core/src/config.rs`
 - `crates/owlen-tui/src/command_parser.rs`
 ### Phase 12: Approval & Sandbox System (HIGHEST PRIORITY)
 **Goal**: Safe agentic behavior with user control
 **Features:**
 1. **Three-tier Approval Modes**
   - `suggest`: Prompt for approval before ALL file writes and shell commands (default)
   - `auto-edit`: Auto-approve file changes, prompt for shell
   - `full-auto`: Auto-approve everything (requires Git repo)
 2. **Platform-specific Sandboxing**
   - Linux: Docker with network isolation
   - macOS: Apple Seatbelt (`sandbox-exec`)
   - Windows: AppContainer or Job Objects
 3. **Permission Management**
   - `/permissions` command in TUI
   - Tool allowlist (e.g., `Edit`, `Bash(git commit:*)`)
   - Stored in `.owlen/settings.json` (project) or `~/.owlen.json` (user)
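The three approval tiers reduce to a small decision table. The sketch below is one possible shape for the planned `approval` module. The type and method names (`ApprovalMode`, `Operation`, `requires_prompt`) are assumptions, and the Git-repository requirement for `full-auto` is deliberately left out.

```rust
/// Approval tiers from the list above (hypothetical type names).
#[derive(Debug, Clone, Copy, PartialEq)]
enum ApprovalMode {
    Suggest,
    AutoEdit,
    FullAuto,
}

/// The two operation classes the tiers distinguish.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Operation {
    FileWrite,
    ShellCommand,
}

impl ApprovalMode {
    /// Parse the mode names used on the command line and in config.
    fn parse(name: &str) -> Option<Self> {
        match name {
            "suggest" => Some(Self::Suggest),
            "auto-edit" => Some(Self::AutoEdit),
            "full-auto" => Some(Self::FullAuto),
            _ => None,
        }
    }

    /// Whether the user must confirm `op` before it runs.
    fn requires_prompt(self, op: Operation) -> bool {
        match (self, op) {
            (Self::Suggest, _) => true,
            (Self::AutoEdit, Operation::ShellCommand) => true,
            (Self::AutoEdit, Operation::FileWrite) => false,
            (Self::FullAuto, _) => false,
        }
    }
}
```

Spelling out every `(mode, operation)` pair keeps the match exhaustive, so adding a new operation class later forces a compile-time decision for each tier.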
 **Implementation:**
 - New `owlen-core::approval` module
 - Extend `owlen-core::sandbox` with platform detection
 - Update `owlen-mcp-code-server` to use new sandbox
 - Add permission storage to config system
 **Files to create:**
 - `crates/owlen-core/src/approval.rs`
 - `crates/owlen-core/src/sandbox/linux.rs`
 - `crates/owlen-core/src/sandbox/macos.rs`
 - `crates/owlen-core/src/sandbox/windows.rs`
 ### Phase 13: Project Documentation System (HIGH PRIORITY)
 **Goal**: Improve usability by giving the assistant persistent, project-specific context
 **Features:**
 1. **OWLEN.md System**
   - `OWLEN.md` at repo root (checked into git)
   - `OWLEN.local.md` (gitignored, personal)
   - `~/.config/owlen/OWLEN.md` (global)
   - Support nested OWLEN.md in monorepos
 2. **Auto-generation**
   - `/init` command to generate project-specific OWLEN.md
   - Analyze codebase structure
   - Detect build system, test framework
   - Suggest common commands
 3. **Live Updates**
   - `#` command to add instructions to OWLEN.md
   - Context-aware insertion (relevant section)
 **Contents of OWLEN.md:**
 - Common bash commands
 - Code style guidelines
 - Testing instructions
 - Core files and utilities
 - Known quirks/warnings
 **Implementation:**
 - New `owlen-core::project_doc` module
 - File discovery algorithm (walk up directory tree)
 - Markdown parser for sections
 - TUI commands: `/init`, `#`
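The "walk up directory tree" discovery step can be sketched with the standard library alone. The function name `discover_project_docs` is an assumption. The sketch collects every `OWLEN.md` between the starting directory and the filesystem root, nearest first, which is one plausible way to support nested files in monorepos.

```rust
use std::path::{Path, PathBuf};

/// Collect every `OWLEN.md` from `start` up to the filesystem root.
/// The nearest file comes first, so callers can give it the highest priority.
fn discover_project_docs(start: &Path) -> Vec<PathBuf> {
    start
        .ancestors() // yields `start`, then its parent, and so on up to the root
        .map(|dir| dir.join("OWLEN.md"))
        .filter(|candidate| candidate.is_file())
        .collect()
}
```

`OWLEN.local.md` and the global `~/.config/owlen/OWLEN.md` would need separate handling; this sketch covers only the checked-in files.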
 **Files to create:**
 - `crates/owlen-core/src/project_doc.rs`
 - `crates/owlen-tui/src/commands/init.rs`
 ### Phase 14: Non-Interactive Mode (HIGH PRIORITY)
 **Goal**: Enable CI/CD integration and automation
 **Features:**
 1. **Headless Execution**
   ```bash
   owlen exec "fix linting errors" --approval-mode auto-edit
   owlen --quiet "update CHANGELOG" --json
   ```
 2. **Environment Variables**
   - `OWLEN_QUIET_MODE=1`
   - `OWLEN_DISABLE_PROJECT_DOC=1`
   - `OWLEN_APPROVAL_MODE=full-auto`
 3. **JSON Output**
   - Structured output for parsing
   - Exit codes for success/failure
   - Progress events on stderr
 **Implementation:**
 - New `owlen-cli` subcommand: `exec`
 - Extend `owlen-core::session` with non-interactive mode
 - Add JSON serialization for results
 - Environment variable parsing in config
 **Files to modify:**
 - `crates/owlen-cli/src/main.rs`
 - `crates/owlen-core/src/session.rs`
 ### Phase 15: Multi-Provider Expansion (HIGH PRIORITY)
 **Goal**: Support cloud providers while keeping the local-first default
 **Providers to add:**
 1. OpenAI (GPT-4, o1, o4-mini)
 2. Anthropic (Claude 3.5 Sonnet, Opus)
 3. Google (Gemini Ultra, Pro)
 4. Mistral AI
 **Configuration:**
 ```toml
 [providers.openai]
 api_key = "${OPENAI_API_KEY}"
 model = "o4-mini"
 enabled = true
 [providers.anthropic]
 api_key = "${ANTHROPIC_API_KEY}"
 model = "claude-3-5-sonnet"
 enabled = true
 ```
 **Runtime Switching:**
 ```
 :model ollama/starcoder
 :model openai/o4-mini
 :model anthropic/claude-3-5-sonnet
 ```
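The `:model provider/model` commands above imply a small parsing step. A minimal sketch follows, assuming the provider is everything before the first `/`. `ModelSpec` and `parse_model_command` are hypothetical names, not existing TUI code.

```rust
/// Target of a `:model` command (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
struct ModelSpec {
    provider: String,
    model: String,
}

/// Parse `:model <provider>/<model>`.
/// Returns `None` for other commands or malformed arguments.
fn parse_model_command(input: &str) -> Option<ModelSpec> {
    let rest = input.trim().strip_prefix(":model")?;
    // Require whitespace after the command name so `:modelx` is rejected.
    if !rest.starts_with(char::is_whitespace) {
        return None;
    }
    // Split at the first `/` only, so model names may contain further slashes.
    let (provider, model) = rest.trim().split_once('/')?;
    if provider.is_empty() || model.is_empty() {
        return None;
    }
    Some(ModelSpec {
        provider: provider.to_string(),
        model: model.to_string(),
    })
}
```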
 **Implementation:**
 - Create `owlen-openai`, `owlen-anthropic`, `owlen-google` crates
 - Implement `Provider` trait for each
 - Add runtime model switching to TUI
 - Maintain Ollama as default
 **Files to create:**
 - `crates/owlen-openai/src/lib.rs`
 - `crates/owlen-anthropic/src/lib.rs`
 - `crates/owlen-google/src/lib.rs`
 ### Phase 16: Custom Slash Commands (MEDIUM PRIORITY)
 **Goal**: User and team-defined workflows
 **Features:**
 1. **Command Directories**
   - `~/.owlen/commands/` (user, available everywhere)
   - `.owlen/commands/` (project, checked into git)
   - Support `$ARGUMENTS` keyword
 2. **Example Structure**
   ```markdown
   # .owlen/commands/fix-github-issue.md
   Please analyze and fix GitHub issue: $ARGUMENTS.
   1. Use `gh issue view` to get details
   2. Implement changes
   3. Write and run tests
   4. Create PR
   ```
 3. **TUI Integration**
   - Auto-complete for custom commands
   - Help text from command files
   - Parameter validation
 **Implementation:**
 - New `owlen-core::commands` module
 - Command discovery and parsing
 - Template expansion
 - TUI command registration
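Command discovery and template expansion can be illustrated with two small helpers. Both names (`command_name`, `expand_command_template`) are assumptions. The sketch derives the slash-command name from the markdown file name and substitutes `$ARGUMENTS` literally, without any escaping.

```rust
use std::path::Path;

/// Derive a command name from a file such as `.owlen/commands/fix-github-issue.md`.
/// Returns `None` for files that are not markdown.
fn command_name(path: &Path) -> Option<String> {
    if path.extension()?.to_str()? != "md" {
        return None;
    }
    Some(path.file_stem()?.to_str()?.to_string())
}

/// Replace every `$ARGUMENTS` occurrence with the text typed after the command.
fn expand_command_template(template: &str, arguments: &str) -> String {
    template.replace("$ARGUMENTS", arguments.trim())
}
```

Because the substitution is purely textual, the expanded prompt should still be treated as untrusted input wherever it later reaches a shell or tool call.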
 **Files to create:**
 - `crates/owlen-core/src/commands.rs`
 - `crates/owlen-tui/src/commands/custom.rs`
 ### Phase 17: Plugin System (MEDIUM PRIORITY)
 **Goal**: One-command installation of tool collections
 **Features:**
 1. **Plugin Structure**
   ```json
   {
     "name": "github-workflow",
     "version": "1.0.0",
     "commands": [
       {"name": "pr", "file": "commands/pr.md"}
     ],
     "mcp_servers": [
       {
         "name": "github",
         "command": "${OWLEN_PLUGIN_ROOT}/bin/github-mcp"
       }
     ]
   }
   ```
 2. **Installation**
   ```bash
   owlen plugin install github-workflow
   owlen plugin list
   owlen plugin remove github-workflow
   ```
 3. **Discovery**
   - `~/.owlen/plugins/` directory
   - Git repository URLs
   - Plugin registry (future)
 **Implementation:**
 - New `owlen-core::plugins` module
 - Plugin manifest parser
 - Installation/removal logic
 - Sandboxing for plugin code
 **Files to create:**
 - `crates/owlen-core/src/plugins.rs`
 - `crates/owlen-cli/src/commands/plugin.rs`
 ### Phase 18: Extended Thinking Modes (MEDIUM PRIORITY)
 **Goal**: Progressive computation budgets for complex tasks
 **Modes:**
 - `think` - basic extended thinking
 - `think hard` - increased computation
 - `think harder` - more computation
 - `ultrathink` - maximum budget
 **Implementation:**
 - Extend `owlen-core::types::ChatParameters`
 - Add thinking mode to TUI commands
 - Configure per-provider max tokens
 **Files to modify:**
 - `crates/owlen-core/src/types.rs`
 - `crates/owlen-tui/src/command_parser.rs`
 ### Phase 19: Git Workflow Automation (MEDIUM PRIORITY)
 **Goal**: Streamline common Git operations
 **Features:**
 1. Auto-commit message generation
 2. PR creation via `gh` CLI
 3. Rebase conflict resolution
 4. File revert operations
 5. Git history analysis
 **Implementation:**
 - New `owlen-mcp-git-server` crate
 - Tools: `commit`, `create_pr`, `rebase`, `revert`, `history`
 - Integration with TUI commands
 **Files to create:**
 - `crates/owlen-mcp-git-server/src/lib.rs`
 ### Phase 20: Enterprise Features (LOW PRIORITY)
 **Goal**: Team and enterprise deployment support
 **Features:**
 1. **Managed Configuration**
   - `/etc/owlen/managed-mcp.json` (Linux)
   - Restrict user additions with `useEnterpriseMcpConfigOnly`
 2. **Audit Logging**
   - Log all file writes and shell commands
   - Structured JSON logs
   - Tamper-proof storage
 3. **Team Collaboration**
   - Shared OWLEN.md across team
   - Project-scoped MCP servers in `.mcp.json`
   - Approval policy enforcement
 **Implementation:**
 - Extend `owlen-core::config` with managed settings
 - New `owlen-core::audit` module
 - Enterprise deployment documentation
 ## Testing Requirements
 ### Test Coverage Goals
 - **Unit tests**: 80%+ coverage for `owlen-core`
 - **Integration tests**: All MCP servers, providers
 - **TUI tests**: Key workflows (not pixel-perfect)
 ### Test Organization
 ```rust
 #[cfg(test)]
 mod tests {
    use super::*;
    use crate::provider::test_utils::MockProvider;
    use crate::mcp::test_utils::MockMcpClient;
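    #[tokio::test]
    async fn test_feature() {
        // Setup: `MockProvider` is a unit struct, so no constructor call is needed
        let provider = MockProvider;
        // Execute: `request` stands for a `ChatRequest` built for this test
        let result = provider.chat(request).await;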
        // Assert
        assert!(result.is_ok());
    }
 }
 ```
 ### Running Tests
 ```bash
 cargo test --all                    # All tests
 cargo test --lib -p owlen-core      # Core library tests
 cargo test --test integration       # Integration tests
 ```
 ## Documentation Standards
 ### Code Documentation
 1. **Module-level** (`//!` at top of file):
   ```rust
   //! Brief module description
   //!
   //! Detailed explanation of module purpose,
   //! key types, and usage examples.
   ```
 2. **Public APIs** (`///` above items):
   ```rust
   /// Brief description
   ///
   /// # Arguments
   /// * `arg1` - Description
   ///
   /// # Returns
   /// Description of return value
   ///
   /// # Errors
   /// When this function returns an error
   ///
   /// # Example
   /// ```
   /// let result = function(arg);
   /// ```
   pub fn function(arg: Type) -> Result<Output> {
       // implementation
   }
   ```
 3. **Private items**: Optional, use for complex logic
 ### User Documentation
 Location: `docs/` directory
 Files to maintain:
 - `architecture.md` - System design
 - `configuration.md` - Config reference
 - `migration-guide.md` - Version upgrades
 - `troubleshooting.md` - Common issues
 - `provider-implementation.md` - Adding providers
 - `faq.md` - Frequently asked questions
 ## Git Workflow
 ### Branch Strategy
 - `main` - stable releases only
 - `dev` - active development (default)
 - `feature/*` - new features
 - `fix/*` - bug fixes
 - `docs/*` - documentation only
 ### Commit Messages
 Follow conventional commits:
 ```
 type(scope): brief description
 Detailed explanation of changes.
 Breaking changes, if any.
 🤖 Generated with [Claude Code](https://claude.com/claude-code)
 Co-Authored-By: Claude <noreply@anthropic.com>
 ```
 Types: `feat`, `fix`, `docs`, `refactor`, `test`, `chore`
 ### Pre-commit Hooks
 Automatically run:
 - `cargo fmt` (formatting)
 - `cargo check` (compilation)
 - `cargo clippy` (linting)
 - YAML/TOML validation
 - Trailing whitespace removal
 ## Performance Guidelines
 ### Optimization Priorities
 1. **Startup time**: < 500ms cold start
 2. **First token latency**: < 2s for local models
 3. **Memory usage**: < 100MB base, < 500MB with conversation
 4. **Responsiveness**: TUI redraws < 16ms (60 FPS)
 ### Profiling
 ```bash
 cargo build --release --features profiling
 valgrind --tool=callgrind target/release/owlen
 kcachegrind callgrind.out.*
 ```
 ### Async Performance
 - Avoid blocking in async contexts
 - Use `tokio::task::spawn_blocking` for CPU-intensive work
 - Set timeouts on all network operations
 - Cancel tasks on shutdown
 ## Security Considerations
 ### Threat Model
 **Trusted:**
 - User's local machine
 - User-installed Ollama models
 - User configuration files
 **Untrusted:**
 - MCP server responses
 - Web search results
 - Code execution output
 - Cloud LLM responses
 ### Security Measures
 1. **Input Validation**
   - Sanitize all MCP tool arguments
   - Validate JSON schemas strictly
   - Escape shell commands
 2. **Sandboxing**
   - Docker for code execution
   - Network isolation
   - Filesystem restrictions
 3. **Secrets Management**
   - Never log API keys
   - Use environment variables
   - Encrypt sensitive config fields
 4. **Dependency Auditing**
   ```bash
   cargo audit
   cargo deny check
   ```
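The "filesystem restrictions" measure usually includes rejecting paths that escape an allowed root, as the `write_outside_root_is_rejected` integration test suggests. The sketch below shows one lexical way to do that. The function name `resolve_within_root` is hypothetical, and the check does not follow symlinks, so a real implementation would need an additional check on the canonicalized path.

```rust
use std::path::{Component, Path, PathBuf};

/// Resolve `requested` against `root` lexically.
/// Returns `None` for absolute paths and for paths that climb above `root`.
fn resolve_within_root(root: &Path, requested: &str) -> Option<PathBuf> {
    let mut resolved = root.to_path_buf();
    let mut depth = 0usize; // number of components currently below `root`
    for component in Path::new(requested).components() {
        match component {
            Component::Normal(part) => {
                resolved.push(part);
                depth += 1;
            }
            Component::CurDir => {}
            Component::ParentDir => {
                if depth == 0 {
                    return None; // `..` would leave the root
                }
                resolved.pop();
                depth -= 1;
            }
            // Absolute paths and drive prefixes are never allowed.
            Component::RootDir | Component::Prefix(_) => return None,
        }
    }
    Some(resolved)
}
```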
 ## Debugging Tips
 ### Enable Debug Logging
 ```bash
 OWLEN_DEBUG_OLLAMA=1 owlen          # Ollama requests
 RUST_LOG=debug owlen                # All debug logs
 RUST_BACKTRACE=1 owlen              # Stack traces
 ```
 ### Common Issues
 1. **Timeout on Ollama**
   - Check `ollama ps` for loaded models
   - Increase timeout in config
   - Restart Ollama service
 2. **MCP Server Not Found**
   - Verify `mcp_servers` config
   - Check server binary exists
   - Test server manually with STDIO
 3. **TUI Rendering Issues**
   - Test in different terminals
   - Check terminal size (`tput cols; tput lines`)
   - Verify theme compatibility
 ## Contributing
 ### Before Submitting PR
 1. Run full test suite: `cargo test --all`
 2. Check formatting: `cargo fmt -- --check`
 3. Run linter: `cargo clippy -- -D warnings`
 4. Update documentation if API changed
 5. Add tests for new features
 6. Update CHANGELOG.md
 ### PR Description Template
 ```markdown
 ## Summary
 Brief description of changes
 ## Type of Change
 - [ ] Bug fix
 - [ ] New feature
 - [ ] Breaking change
 - [ ] Documentation update
 ## Testing
 Describe tests performed
 ## Checklist
 - [ ] Tests added/updated
 - [ ] Documentation updated
 - [ ] CHANGELOG.md updated
 - [ ] No clippy warnings
 ```
 ## Resources
 ### External Documentation
 - [Ratatui Docs](https://ratatui.rs/)
 - [Tokio Tutorial](https://tokio.rs/tokio/tutorial)
 - [MCP Specification](https://modelcontextprotocol.io/)
 - [Ollama API](https://github.com/ollama/ollama/blob/main/docs/api.md)
 ### Internal Documentation
 - `.agents/new_phases.md` - 10-phase migration plan (completed)
 - `docs/phase5-mode-system.md` - Mode system design
 - `docs/migration-guide.md` - v0.x → v1.0 migration
 ### Community
 - GitHub Issues: Bug reports and feature requests
 - GitHub Discussions: Questions and ideas
 - AUR Package: `owlen-git` (Arch Linux)
 ## Version History
 - **v1.0.0** (current) - MCP-only architecture, Phase 10 complete
 - **v0.2.0** - Added web search, code execution servers
 - **v0.1.0** - Initial release with Ollama support
 ## License
 Owlen is open source software. See LICENSE file for details.
 ---
 **Last Updated**: 2025-10-11
 **Maintained By**: Owlen Development Team
 **For AI Agents**: Follow these guidelines when modifying the Owlen codebase. Prioritize MCP client enhancement (Phase 11) and the approval system (Phase 12) for feature parity with Codex/Claude Code while maintaining the local-first philosophy.
--- a/crates/owlen-cli/tests/agent_tests.rs
+++ b/crates/owlen-cli/tests/agent_tests.rs
@@ -82,10 +82,9 @@ async fn test_agent_single_tool_scenario() {
        model: "llama3.2".to_string(),
        temperature: Some(0.7),
        max_tokens: None,
-        max_tool_calls: 10,
    };
-    let executor = AgentExecutor::new(provider, mcp_client, config, None);
+    let executor = AgentExecutor::new(provider, mcp_client, config);
    // Simple query that should complete in one tool call
    let result = executor
@@ -93,9 +92,12 @@ async fn test_agent_single_tool_scenario() {
        .await;
    match result {
-        Ok(answer) => {
-            assert!(!answer.is_empty(), "Answer should not be empty");
-            println!("Agent answer: {}", answer);
+        Ok(agent_result) => {
+            assert!(
+                !agent_result.answer.is_empty(),
+                "Answer should not be empty"
+            );
+            println!("Agent answer: {}", agent_result.answer);
        }
        Err(e) => {
            // It's okay if this fails due to LLM not following format
@@ -116,10 +118,9 @@ async fn test_agent_multi_step_workflow() {
        model: "llama3.2".to_string(),
        temperature: Some(0.5), // Lower temperature for more consistent behavior
        max_tokens: None,
-        max_tool_calls: 20,
    };
-    let executor = AgentExecutor::new(provider, mcp_client, config, None);
+    let executor = AgentExecutor::new(provider, mcp_client, config);
    // Query requiring multiple steps: list -> read -> analyze
    let result = executor
@@ -127,9 +128,9 @@ async fn test_agent_multi_step_workflow() {
        .await;
    match result {
-        Ok(answer) => {
+        Ok(agent_result) => {
-            assert!(!answer.is_empty());
+            assert!(!agent_result.answer.is_empty());
-            println!("Multi-step answer: {}", answer);
+            println!("Multi-step answer: {:?}", agent_result);
        }
        Err(e) => {
            println!("Multi-step test skipped: {}", e);
@@ -148,10 +149,9 @@ async fn test_agent_iteration_limit() {
        model: "llama3.2".to_string(),
        temperature: Some(0.7),
        max_tokens: None,
-        max_tool_calls: 5,
    };
-    let executor = AgentExecutor::new(provider, mcp_client, config, None);
+    let executor = AgentExecutor::new(provider, mcp_client, config);
    // Complex query that would require many iterations
    let result = executor
@@ -186,14 +186,13 @@ async fn test_agent_tool_budget_enforcement() {
    let mcp_client = Arc::clone(&provider) as Arc<RemoteMcpClient>;
    let config = AgentConfig {
-        max_iterations: 20,
+        max_iterations: 3, // Very low iteration limit to enforce budget
        model: "llama3.2".to_string(),
        temperature: Some(0.7),
        max_tokens: None,
-        max_tool_calls: 3, // Very low tool call budget
    };
-    let executor = AgentExecutor::new(provider, mcp_client, config, None);
+    let executor = AgentExecutor::new(provider, mcp_client, config);
    // Query that would require many tool calls
    let result = executor
@@ -238,7 +237,7 @@ fn create_test_executor() -> AgentExecutor {
    let mcp_client = Arc::clone(&provider) as Arc<RemoteMcpClient>;
    let config = AgentConfig::default();
-    AgentExecutor::new(provider, mcp_client, config, None)
+    AgentExecutor::new(provider, mcp_client, config)
 }
 #[test]
@@ -248,7 +247,7 @@ fn test_agent_config_defaults() {
    assert_eq!(config.max_iterations, 10);
    assert_eq!(config.model, "ollama");
    assert_eq!(config.temperature, Some(0.7));
-    assert_eq!(config.max_tool_calls, 20);
+    // max_tool_calls field removed - agent now tracks iterations instead
 }
 #[test]
@@ -258,12 +257,10 @@ fn test_agent_config_custom() {
        model: "custom-model".to_string(),
        temperature: Some(0.5),
        max_tokens: Some(2000),
-        max_tool_calls: 30,
    };
    assert_eq!(config.max_iterations, 15);
    assert_eq!(config.model, "custom-model");
    assert_eq!(config.temperature, Some(0.5));
    assert_eq!(config.max_tokens, Some(2000));
-    assert_eq!(config.max_tool_calls, 30);
 }
--- a/crates/owlen-core/src/agent.rs
+++ b/crates/owlen-core/src/agent.rs
@@ -235,7 +235,7 @@ impl AgentExecutor {
    }
    /// Parse LLM response into structured format
-    fn parse_response(&self, text: &str) -> Result<LlmResponse> {
+    pub fn parse_response(&self, text: &str) -> Result<LlmResponse> {
        let lines: Vec<&str> = text.lines().collect();
        let mut thought = String::new();
        let mut action = String::new();
@@ -370,8 +370,8 @@ mod tests {
    #[test]
    fn test_parse_tool_call() {
        let executor = AgentExecutor {
-            llm_client: Arc::new(MockProvider::new()),
+            llm_client: Arc::new(MockProvider),
-            tool_client: Arc::new(MockMcpClient::new()),
+            tool_client: Arc::new(MockMcpClient),
            config: AgentConfig::default(),
        };
@@ -399,8 +399,8 @@ ACTION_INPUT: {"query": "Rust programming language"}
    #[test]
    fn test_parse_final_answer() {
        let executor = AgentExecutor {
-            llm_client: Arc::new(MockProvider::new()),
+            llm_client: Arc::new(MockProvider),
-            tool_client: Arc::new(MockMcpClient::new()),
+            tool_client: Arc::new(MockMcpClient),
            config: AgentConfig::default(),
        };
--- a/crates/owlen-core/src/lib.rs
+++ b/crates/owlen-core/src/lib.rs
@@ -34,10 +34,15 @@ pub use credentials::*;
 pub use encryption::*;
 pub use formatting::*;
 pub use input::*;
-pub use mcp::*;
+// Export MCP types but exclude test_utils to avoid ambiguity
+pub use mcp::{
+    client, factory, failover, permission, protocol, remote_client, LocalMcpClient, McpServer,
+    McpToolCall, McpToolDescriptor, McpToolResponse,
+};
 pub use mode::*;
 pub use model::*;
-pub use provider::*;
+// Export provider types but exclude test_utils to avoid ambiguity
+pub use provider::{ChatStream, Provider, ProviderConfig, ProviderRegistry};
 pub use router::*;
 pub use sandbox::*;
 pub use session::*;
--- a/crates/owlen-core/src/mcp.rs
+++ b/crates/owlen-core/src/mcp.rs
@@ -149,14 +149,9 @@ pub mod test_utils {
    use super::*;
    /// Mock MCP client for testing
+    #[derive(Default)]
    pub struct MockMcpClient;
-    impl MockMcpClient {
-        pub fn new() -> Self {
-            Self
-        }
-    }
    #[async_trait]
    impl McpClient for MockMcpClient {
        async fn list_tools(&self) -> Result<Vec<McpToolDescriptor>> {
--- a/crates/owlen-core/src/provider.rs
+++ b/crates/owlen-core/src/provider.rs
@@ -181,14 +181,9 @@ pub mod test_utils {
    use crate::types::{ChatRequest, ChatResponse, Message, ModelInfo, Role};
    /// Mock provider for testing
+    #[derive(Default)]
    pub struct MockProvider;
-    impl MockProvider {
-        pub fn new() -> Self {
-            Self
-        }
-    }
    #[async_trait::async_trait]
    impl Provider for MockProvider {
        fn name(&self) -> &str {
--- a/crates/owlen-core/tests/file_server.rs
+++ b/crates/owlen-core/tests/file_server.rs
@@ -1,6 +1,5 @@
 use owlen_core::mcp::client::McpClient;
 use owlen_core::mcp::remote_client::RemoteMcpClient;
-use owlen_core::mcp::McpToolCall;
+use owlen_core::McpToolCall;
 use std::fs::File;
 use std::io::Write;
 use tempfile::tempdir;
@@ -22,7 +21,7 @@ async fn remote_file_server_read_and_list() {
        .join("../..")
        .join("Cargo.toml");
    let build_status = std::process::Command::new("cargo")
-        .args(&["build", "-p", "owlen-mcp-server", "--manifest-path"])
+        .args(["build", "-p", "owlen-mcp-server", "--manifest-path"])
        .arg(manifest_path)
        .status()
        .expect("failed to run cargo build for MCP server");
--- a/crates/owlen-core/tests/file_write.rs
+++ b/crates/owlen-core/tests/file_write.rs
@@ -1,13 +1,12 @@
 use owlen_core::mcp::client::McpClient;
 use owlen_core::mcp::remote_client::RemoteMcpClient;
-use owlen_core::mcp::McpToolCall;
+use owlen_core::McpToolCall;
 use tempfile::tempdir;
 #[tokio::test]
 async fn remote_write_and_delete() {
    // Build the server binary first
    let status = std::process::Command::new("cargo")
-        .args(&["build", "-p", "owlen-mcp-server"])
+        .args(["build", "-p", "owlen-mcp-server"])
        .status()
        .expect("failed to build MCP server");
    assert!(status.success());
@@ -42,7 +41,7 @@ async fn remote_write_and_delete() {
 async fn write_outside_root_is_rejected() {
    // Build server (already built in previous test, but ensure it exists)
    let status = std::process::Command::new("cargo")
-        .args(&["build", "-p", "owlen-mcp-server"])
+        .args(["build", "-p", "owlen-mcp-server"])
        .status()
        .expect("failed to build MCP server");
    assert!(status.success());
--- a/crates/owlen-core/tests/mode_tool_filter.rs
+++ b/crates/owlen-core/tests/mode_tool_filter.rs
@@ -42,14 +42,16 @@ impl Tool for EchoTool {
 #[tokio::test]
 async fn test_tool_allowed_in_chat_mode() {
    // Build a config where the `echo` tool is explicitly allowed in chat.
-    let mut cfg = Config::default();
-    cfg.modes = ModeConfig {
-        chat: ModeToolConfig {
-            allowed_tools: vec!["echo".to_string()],
-        },
-        code: ModeToolConfig {
-            allowed_tools: vec!["*".to_string()],
+    let cfg = Config {
+        modes: ModeConfig {
+            chat: ModeToolConfig {
+                allowed_tools: vec!["echo".to_string()],
+            },
+            code: ModeToolConfig {
+                allowed_tools: vec!["*".to_string()],
+            },
        },
+        ..Default::default()
    };
    let cfg = Arc::new(Mutex::new(cfg));
@@ -70,17 +72,18 @@ async fn test_tool_allowed_in_chat_mode() {
 #[tokio::test]
 async fn test_tool_not_allowed_in_any_mode() {
    // Config that does NOT list `echo` in either mode.
-    let mut cfg = Config::default();
-    cfg.modes = ModeConfig {
-        chat: ModeToolConfig {
-            allowed_tools: vec!["web_search".to_string()],
-        },
-        code: ModeToolConfig {
-            allowed_tools: vec!["*".to_string()], // allow all in code
+    let cfg = Config {
+        modes: ModeConfig {
+            chat: ModeToolConfig {
+                allowed_tools: vec!["web_search".to_string()],
+            },
+            code: ModeToolConfig {
+                // Strict denial - only web_search allowed
+                allowed_tools: vec!["web_search".to_string()],
+            },
        },
+        ..Default::default()
    };
-    // Remove the wildcard for code to simulate strict denial.
-    cfg.modes.code.allowed_tools = vec!["web_search".to_string()];
    let cfg = Arc::new(Mutex::new(cfg));
    let ui: Arc<dyn UiController> = Arc::new(NoOpUiController);
--- a/crates/owlen-mcp-llm-server/Cargo.toml
+++ b/crates/owlen-mcp-llm-server/Cargo.toml
@@ -12,9 +12,6 @@ serde_json = "1.0"
 anyhow = "1.0"
 tokio-stream = "0.1"
-[lib]
-path = "src/lib.rs"
 [[bin]]
 name = "owlen-mcp-llm-server"
-path = "src/lib.rs"
+path = "src/main.rs"
--- a/crates/owlen-mcp-llm-server/src/main.rs
+++ b/crates/owlen-mcp-llm-server/src/main.rs