conductor(checkpoint): Checkpoint end of Phase 1 - Core Agent & Platform Testing

This commit is contained in:
2025-12-26 18:28:44 +01:00
parent efc72c5ceb
commit 495f63f0d8
20 changed files with 2993 additions and 54 deletions

.tmp Submodule

Submodule .tmp added at 4928f2cdca

PERMISSION_SYSTEM.md Normal file

@@ -0,0 +1,311 @@
# Permission System - TUI Implementation
## Overview
The TUI now has a fully functional interactive permission system that allows users to grant, deny, or permanently allow tool executions through an elegant popup interface.
## Features Implemented
### 1. Interactive Permission Popup
**Location:** `crates/app/ui/src/components/permission_popup.rs`
- **Visual Design:**
- Centered modal popup with themed border
- Tool name highlighted with icon (⚡)
- Context information (file path, command, etc.)
- Four selectable options with icons
- Keyboard shortcuts and navigation hints
- **Options Available:**
- ✓ `[a]` **Allow once** - Execute this one time
- ✓✓ `[A]` **Always allow** - Add permanent rule to permission manager
- ✗ `[d]` **Deny** - Refuse this operation
- ? `[?]` **Explain** - Show what this operation does
- **Navigation:**
- Arrow keys (↑/↓) to select options
- Enter to confirm selection
- Keyboard shortcuts (a/A/d/?) for quick selection
- Esc to deny and close
### 2. Permission Flow Integration
**Location:** `crates/app/ui/src/app.rs`
#### New Components:
1. **PendingToolCall struct:**
```rust
struct PendingToolCall {
tool_name: String,
arguments: Value,
perm_tool: PermTool,
context: Option<String>,
}
```
Stores information about a tool awaiting permission.
2. **TuiApp fields:**
- `pending_tool: Option<PendingToolCall>` - Current pending tool
- `permission_tx: Option<oneshot::Sender<bool>>` - Channel to signal decision
3. **execute_tool_with_permission() method:**
```rust
async fn execute_tool_with_permission(
&mut self,
tool_name: &str,
arguments: &Value,
) -> Result<String>
```
**Flow:**
1. Maps tool name to PermTool enum (Read, Write, Edit, Bash, etc.)
2. Extracts context (file path, command, etc.)
3. Checks permission via PermissionManager
4. If `Allow` → Execute immediately
5. If `Deny` → Return error
6. If `Ask` → Show popup and wait for user decision
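The three-way branch can be sketched as follows. This is a minimal, dependency-free illustration; the enum and function names mirror the document's description, not the actual crate API:

```rust
// Illustrative decision handling; names mirror the doc, not the real crate.
#[derive(Debug, PartialEq)]
enum PermissionDecision {
    Allow,
    Deny,
    Ask,
}

// Resolve a decision into "may execute?", deferring to a user prompt on Ask.
// `prompt_user` stands in for the popup-and-wait mechanism described below.
fn resolve(decision: PermissionDecision, prompt_user: impl Fn() -> bool) -> Result<bool, String> {
    match decision {
        PermissionDecision::Allow => Ok(true), // execute immediately
        PermissionDecision::Deny => Err("denied by policy".to_string()),
        PermissionDecision::Ask => Ok(prompt_user()), // show popup, await answer
    }
}
```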
**Async Wait Mechanism:**
- Creates oneshot channel for permission response
- Shows permission popup
- Awaits channel response (with 5-minute timeout)
- Event loop continues processing keyboard events
- When user responds, channel signals and execution resumes
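The real implementation uses a `tokio::sync::oneshot` channel with a 5-minute timeout; the handshake can be simulated without tokio using a std channel and `recv_timeout` (names and the 1-second timeout here are illustrative):

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Simulate the wait: the tool side holds the receiver, the event-handling
// side (a separate thread here, a tokio task in the TUI) sends the decision.
fn wait_for_decision() -> bool {
    let (tx, rx) = mpsc::channel::<bool>();
    // Stand-in for the event loop processing a keypress on another task.
    thread::spawn(move || {
        tx.send(true).ok(); // user pressed 'a' (allow once)
    });
    // Timed wait; a timeout or dropped sender counts as a denial.
    rx.recv_timeout(Duration::from_secs(1)).unwrap_or(false)
}
```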
### 3. Permission Decision Handling
**Location:** `crates/app/ui/src/app.rs:184-254`
When user makes a choice in the popup:
- **Allow Once:**
- Signals permission granted (sends `true` through channel)
- Tool executes once
- No persistent changes
- **Always Allow:**
- Adds new rule to PermissionManager
- Rule format: `perms.add_rule(tool, context, Action::Allow)`
- Example: Always allow reading from `src/` directory
- Signals permission granted
- All future matching operations auto-approved
- **Deny:**
- Signals permission denied (sends `false`)
- Tool execution fails with error
- Error shown in chat
- **Explain:**
- Shows explanation of what the tool does
- Popup remains open for user to choose again
- Tool-specific explanations:
- `read` → "read a file from disk"
- `write` → "write or overwrite a file"
- `edit` → "modify an existing file"
- `bash` → "execute a shell command"
- `grep` → "search for patterns in files"
- `glob` → "list files matching a pattern"
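The explanation lookup above is a simple match; a sketch (the function name and fallback string are assumptions, not the actual API):

```rust
// Tool-specific explanation strings as listed above; minimal lookup table.
fn explain_tool(tool: &str) -> &'static str {
    match tool {
        "read" => "read a file from disk",
        "write" => "write or overwrite a file",
        "edit" => "modify an existing file",
        "bash" => "execute a shell command",
        "grep" => "search for patterns in files",
        "glob" => "list files matching a pattern",
        _ => "perform an operation", // generic fallback for unknown tools
    }
}
```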
### 4. Agent Loop Integration
**Location:** `crates/app/ui/src/app.rs:488`
Changed from:
```rust
match execute_tool(tool_name, arguments, &self.perms).await {
```
To:
```rust
match self.execute_tool_with_permission(tool_name, arguments).await {
```
This ensures all tool calls in the streaming agent loop go through the permission system.
## Architecture Details
### Async Concurrency Model
The implementation uses Rust's async/await with tokio to handle the permission flow without blocking the UI:
```
┌─────────────────────────────────────────────────────────┐
│ Event Loop │
│ (continuously running at 60 FPS) │
│ │
│ while running { │
│ terminal.draw(...) ← Always responsive │
│ if let Ok(event) = event_rx.try_recv() { │
│ handle_event(event).await │
│ } │
│ tokio::sleep(16ms).await ← Yields to runtime │
│ } │
└─────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│ Keyboard Event Listener │
│ (separate tokio task) │
│ │
│ loop { │
│ event = event_stream.next().await │
│ event_tx.send(Input(key)) │
│ } │
└─────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│ Permission Request Flow │
│ │
│ 1. Tool needs permission (PermissionDecision::Ask) │
│ 2. Create oneshot channel (tx, rx) │
│ 3. Show popup, store tx │
│ 4. await rx ← Yields to event loop │
│ 5. Event loop continues, handles keyboard │
│ 6. User presses 'a' → handle_event processes │
│ 7. tx.send(true) signals channel │
│ 8. rx.await completes, returns true │
│ 9. Tool executes with permission │
└─────────────────────────────────────────────────────────┘
```
### Key Insight
The implementation works because:
1. **Awaiting is non-blocking:** When we `await rx`, we yield control to the tokio runtime
2. **Event loop continues:** The outer event loop continues to run its iterations
3. **Keyboard events processed:** The separate event listener task continues reading keyboard
4. **Channel signals resume:** When user responds, the channel completes and we resume
This creates a smooth UX where the UI remains responsive while waiting for permission.
## Usage Examples
### Example 1: First-time File Write
```
User: "Create a new file hello.txt with 'Hello World'"
Agent: [Calls write tool]
┌───────────────────────────────────────┐
│ 🔒 Permission Required │
├───────────────────────────────────────┤
│ ⚡ Tool: write │
│ 📝 Context: │
│ hello.txt │
├───────────────────────────────────────┤
│ ▶ ✓ [a] Allow once │
│ ✓✓ [A] Always allow │
│ ✗ [d] Deny │
│ ? [?] Explain │
│ │
│ ↑↓ Navigate Enter to select Esc... │
└───────────────────────────────────────┘
User presses 'a' → File created once
```
### Example 2: Always Allow Bash in Current Directory
```
User: "Run npm test"
Agent: [Calls bash tool]
[Permission popup shows with context: "npm test"]
User presses 'A' → Rule added: bash("npm test*") → Allow
Future: User: "Run npm test:unit"
Agent: [Executes immediately, no popup]
```
### Example 3: Explanation Request
```
User: "Read my secrets.env file"
[Permission popup appears]
User presses '?' →
System: "Tool 'read' requires permission. This operation
will read a file from disk."
[Popup remains open]
User presses 'd' → Permission denied
```
## Testing
Build status: ✅ All tests pass
```bash
cargo build --workspace # Success
cargo test --workspace --lib # 28 tests passed
```
## Configuration
The permission system respects three modes from `PermissionManager`:
1. **Plan Mode** (default):
- Read operations (read, grep, glob) → Auto-allowed
- Write operations (write, edit) → Ask
- System operations (bash) → Ask
2. **AcceptEdits Mode**:
- Read operations → Auto-allowed
- Write operations → Auto-allowed
- System operations (bash) → Ask
3. **Code Mode**:
- All operations → Auto-allowed
- No popups shown
User can override mode with CLI flag: `--mode code`
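The mode table can be encoded as a small decision function. A sketch under simplifying assumptions (tool classes reduced to name lists; enum names are illustrative, not the `PermissionManager` API):

```rust
#[derive(Clone, Copy)]
enum Mode { Plan, AcceptEdits, Code }

#[derive(Debug, PartialEq)]
enum Decision { Allow, Ask }

// Encode the three-mode table above.
fn decide(mode: Mode, tool: &str) -> Decision {
    let is_read = matches!(tool, "read" | "grep" | "glob");
    let is_write = matches!(tool, "write" | "edit");
    match mode {
        Mode::Code => Decision::Allow,                          // everything auto-allowed
        Mode::AcceptEdits if is_read || is_write => Decision::Allow,
        Mode::Plan if is_read => Decision::Allow,
        _ => Decision::Ask,                                     // bash, writes in Plan, etc.
    }
}
```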
## Future Enhancements
Potential improvements:
1. **Permission History:**
- Show recently granted/denied permissions
- `/permissions` command to view active rules
2. **Temporary Rules:**
- "Allow for this session" option
- Rules expire when TUI closes
3. **Pattern-based Rules:**
- "Always allow reading from `src/` directory"
- "Always allow bash commands starting with `npm`"
4. **Visual Feedback:**
- Show indicator when permission auto-granted by rule
- Different styling for policy-denied vs user-denied
5. **Rule Management:**
- `/clear-rules` command
- Edit/remove specific rules interactively
## Files Modified
- `crates/app/ui/src/app.rs` - Main permission flow logic
- `crates/app/ui/src/events.rs` - Removed unused event type
- `crates/app/ui/src/components/permission_popup.rs` - Pre-existing, now fully integrated
## Summary
The TUI permission system is now fully functional, providing:
- ✅ Interactive permission popups with keyboard navigation
- ✅ Four permission options (allow once, always, deny, explain)
- ✅ Runtime permission rule updates
- ✅ Async flow that keeps UI responsive
- ✅ Integration with existing permission manager
- ✅ Tool-specific context and explanations
- ✅ Timeout handling (5 minutes)
- ✅ All tests passing
Users can now safely interact with the AI agent while maintaining control over potentially dangerous operations.

PLUGIN_HOOKS_INTEGRATION.md Normal file

@@ -0,0 +1,363 @@
# Plugin Hooks Integration
This document describes how the plugin system integrates with the hook system to allow plugins to define lifecycle hooks.
## Overview
Plugins can now define hooks in a `hooks/hooks.json` file; these hooks are automatically registered with the `HookManager` during application startup. This allows plugins to:
- Intercept and validate tool calls before execution (PreToolUse)
- React to tool execution results (PostToolUse)
- Run code at session boundaries (SessionStart, SessionEnd)
- Process user input (UserPromptSubmit)
- Handle context compaction (PreCompact)
## Plugin Hook Configuration
Plugins define hooks in `hooks/hooks.json`:
```json
{
"description": "Validation and logging hooks for the plugin",
"hooks": {
"PreToolUse": [
{
"matcher": "Edit|Write",
"hooks": [
{
"type": "command",
"command": "python3 ${CLAUDE_PLUGIN_ROOT}/hooks/validate.py",
"timeout": 5000
}
]
},
{
"matcher": "Bash",
"hooks": [
{
"type": "command",
"command": "${CLAUDE_PLUGIN_ROOT}/hooks/bash_guard.sh"
}
]
}
],
"PostToolUse": [
{
"hooks": [
{
"type": "command",
"command": "echo 'Tool executed' >> ${CLAUDE_PLUGIN_ROOT}/logs/tool.log && exit 0"
}
]
}
]
}
}
```
### Hook Configuration Schema
- **description** (optional): A human-readable description of what the hooks do
- **hooks**: A map of event names to hook matchers
- **PreToolUse**: Hooks that run before a tool is executed
- **PostToolUse**: Hooks that run after a tool is executed
- **SessionStart**: Hooks that run when a session starts
- **SessionEnd**: Hooks that run when a session ends
- **UserPromptSubmit**: Hooks that run when the user submits a prompt
- **PreCompact**: Hooks that run before context compaction
### Hook Matcher
Each hook matcher contains:
- **matcher** (optional): A regex pattern to match against tool names (for PreToolUse events)
- Example: `"Edit|Write"` matches both Edit and Write tools
- Example: `".*"` matches all tools
- If not specified, the hook applies to all tools
- **hooks**: An array of hook definitions
### Hook Definition
Each hook definition contains:
- **type**: The hook type (`"command"` or `"prompt"`)
- **command**: The shell command to execute (for command-type hooks)
- Can use `${CLAUDE_PLUGIN_ROOT}` which is replaced with the plugin's base path
- **prompt** (future): An LLM prompt for AI-based validation
- **timeout** (optional): Timeout in milliseconds (default: no timeout)
## Variable Substitution
The following variables are automatically substituted in hook commands:
- **${CLAUDE_PLUGIN_ROOT}**: The absolute path to the plugin directory
- Example: `~/.config/owlen/plugins/my-plugin`
- Useful for referencing scripts within the plugin
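Since `${CLAUDE_PLUGIN_ROOT}` is the only variable, a plain string replace suffices (function name is illustrative):

```rust
// Substitute ${CLAUDE_PLUGIN_ROOT} with the plugin's base path.
fn substitute_plugin_root(command: &str, plugin_root: &str) -> String {
    command.replace("${CLAUDE_PLUGIN_ROOT}", plugin_root)
}
```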
## Hook Execution Behavior
### Exit Codes
Hooks communicate their decision via exit codes:
- **0**: Allow the operation to proceed
- **2**: Deny the operation (blocks the tool call)
- **Other**: Error (operation fails with error message)
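Mapping an exit status to a decision is a direct translation of this convention (enum and function names are assumptions mirroring the doc's `HookResult`):

```rust
#[derive(Debug, PartialEq)]
enum HookResult {
    Allow,
    Deny,
    Error(i32), // any other exit code, carried for the error message
}

// Map a hook process exit code to a decision per the convention above.
fn from_exit_code(code: i32) -> HookResult {
    match code {
        0 => HookResult::Allow,
        2 => HookResult::Deny,
        other => HookResult::Error(other),
    }
}
```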
### Input/Output
Hooks receive JSON input via stdin containing the event data:
```json
{
"event": "preToolUse",
"tool": "Edit",
"args": {
"path": "/path/to/file.txt",
"old_string": "foo",
"new_string": "bar"
}
}
```
### Pattern Matching
For PreToolUse hooks, the `matcher` field is treated as a regex pattern:
- `"Edit|Write"` - Matches Edit OR Write tools
- `"Bash"` - Matches only Bash tool
- `".*"` - Matches all tools
- No matcher - Applies to all tools
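The real implementation uses the `regex` crate; a dependency-free sketch covering just the cases listed above (alternation with `|`, the `.*` catch-all, and a missing matcher meaning "all tools") looks like:

```rust
// Simplified matcher; the production code compiles `matcher` as a full regex.
fn matches_tool(matcher: Option<&str>, tool: &str) -> bool {
    match matcher {
        None => true,            // no matcher: applies to all tools
        Some(".*") => true,      // explicit catch-all
        Some(pattern) => pattern.split('|').any(|alt| alt == tool),
    }
}
```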
### Multiple Hooks
- Multiple plugins can define hooks for the same event
- All matching hooks are executed in sequence
- If any hook denies (exit code 2), the operation is blocked
- File-based hooks in `.owlen/hooks/` are executed first, then plugin hooks
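The "any deny blocks" rule amounts to a short-circuiting scan over the matching hooks. A sketch with hooks modeled as closures (real hooks are shell commands; the types here are hypothetical):

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Verdict { Allow, Deny }

// Run hooks in registration order; the first Deny blocks the operation.
fn run_hooks(hooks: &[Box<dyn Fn() -> Verdict>]) -> Verdict {
    for hook in hooks {
        if hook() == Verdict::Deny {
            return Verdict::Deny;
        }
    }
    Verdict::Allow
}
```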
## Integration Architecture
### Loading Process
1. **Application Startup** (`main.rs`):
```rust
// Create hook manager
let mut hook_mgr = HookManager::new(".");
// Register plugin hooks
for plugin in app_context.plugin_manager.plugins() {
if let Ok(Some(hooks_config)) = plugin.load_hooks_config() {
for (event, command, pattern, timeout) in plugin.register_hooks_with_manager(&hooks_config) {
hook_mgr.register_hook(event, command, pattern, timeout);
}
}
}
```
2. **Plugin Hook Loading** (`plugins/src/lib.rs`):
- `Plugin::load_hooks_config()` reads and parses `hooks/hooks.json`
- `Plugin::register_hooks_with_manager()` processes the config and performs variable substitution
3. **Hook Registration** (`hooks/src/lib.rs`):
- `HookManager::register_hook()` stores hooks internally
- `HookManager::execute()` filters and executes matching hooks
### Execution Flow
```
Tool Call Request
Permission Check
HookManager::execute(PreToolUse)
Check file-based hook (.owlen/hooks/PreToolUse)
Filter plugin hooks by event and pattern
Execute each matching hook
If any hook denies → Block operation
If all allow → Execute tool
HookManager::execute(PostToolUse)
```
## Example: Validation Hook
Create a plugin with a validation hook:
**Directory structure:**
```
~/.config/owlen/plugins/validation/
├── plugin.json
└── hooks/
├── hooks.json
└── validate.py
```
**plugin.json:**
```json
{
"name": "validation",
"version": "1.0.0",
"description": "Validation hooks for file operations"
}
```
**hooks/hooks.json:**
```json
{
"description": "Validate file operations",
"hooks": {
"PreToolUse": [
{
"matcher": "Edit|Write",
"hooks": [
{
"type": "command",
"command": "python3 ${CLAUDE_PLUGIN_ROOT}/hooks/validate.py",
"timeout": 5000
}
]
}
]
}
}
```
**hooks/validate.py:**
```python
#!/usr/bin/env python3
import json
import sys
# Read event from stdin
event = json.load(sys.stdin)
tool = event.get('tool')
args = event.get('args', {})
path = args.get('path', '')
# Deny operations on system files
if path.startswith('/etc/') or path.startswith('/sys/'):
print(f"Blocked: Cannot modify system file {path}", file=sys.stderr)
sys.exit(2) # Deny
# Allow all other operations
sys.exit(0) # Allow
```
**Make executable:**
```bash
chmod +x ~/.config/owlen/plugins/validation/hooks/validate.py
```
## Testing
### Unit Tests
Test hook registration and execution:
```rust
#[tokio::test]
async fn test_plugin_hooks() -> Result<()> {
let mut hook_mgr = HookManager::new(".");
hook_mgr.register_hook(
"PreToolUse".to_string(),
"echo 'validated' && exit 0".to_string(),
Some("Edit|Write".to_string()),
Some(5000),
);
let event = HookEvent::PreToolUse {
tool: "Edit".to_string(),
args: serde_json::json!({}),
};
let result = hook_mgr.execute(&event, Some(5000)).await?;
assert_eq!(result, HookResult::Allow);
Ok(())
}
```
### Integration Tests
Test the full plugin loading and hook execution:
```rust
#[tokio::test]
async fn test_plugin_hooks_integration() -> Result<()> {
// Create plugin with hooks
let plugin_dir = create_test_plugin_with_hooks()?;
// Load plugin
let mut plugin_manager = PluginManager::with_dirs(vec![plugin_dir]);
plugin_manager.load_all()?;
// Register hooks
let mut hook_mgr = HookManager::new(".");
for plugin in plugin_manager.plugins() {
if let Ok(Some(config)) = plugin.load_hooks_config() {
for (event, cmd, pattern, timeout) in plugin.register_hooks_with_manager(&config) {
hook_mgr.register_hook(event, cmd, pattern, timeout);
}
}
}
// Test hook execution
let event = HookEvent::PreToolUse {
tool: "Edit".to_string(),
args: serde_json::json!({}),
};
let result = hook_mgr.execute(&event, Some(5000)).await?;
assert_eq!(result, HookResult::Allow);
Ok(())
}
```
## Implementation Details
### Modified Crates
1. **plugins** (`crates/platform/plugins/src/lib.rs`):
- Added `PluginHooksConfig`, `HookMatcher`, `HookDefinition` structs
- Added `Plugin::load_hooks_config()` method
- Added `Plugin::register_hooks_with_manager()` method
2. **hooks** (`crates/platform/hooks/src/lib.rs`):
- Refactored to store registered hooks internally
- Added `HookManager::register_hook()` method
- Updated `HookManager::execute()` to handle both file-based and registered hooks
- Added pattern matching support using regex
- Added `regex` dependency
3. **owlen** (`crates/app/cli/src/main.rs`):
- Integrated plugin hook loading during startup
- Registered plugin hooks with HookManager
### Dependencies Added
- **hooks/Cargo.toml**: Added `regex = "1.10"`
## Benefits
1. **Modularity**: Hooks can be packaged with plugins and distributed independently
2. **Reusability**: Plugins can be shared across projects
3. **Flexibility**: Each plugin can define multiple hooks with different patterns
4. **Compatibility**: Works alongside existing file-based hooks in `.owlen/hooks/`
5. **Variable Substitution**: `${CLAUDE_PLUGIN_ROOT}` makes scripts portable
## Future Enhancements
1. **Prompt-based hooks**: Use LLM for validation instead of shell commands
2. **Hook priorities**: Control execution order of hooks
3. **Hook metadata**: Description, author, version for each hook
4. **Hook debugging**: Better error messages and logging
5. **Async hooks**: Support for long-running hooks that don't block

TODO.md Normal file

@@ -0,0 +1,262 @@
# Owlen Project Improvement Roadmap
Generated from codebase analysis on 2025-11-01
## Overall Assessment
**Grade:** A (90/100)
**Status:** Production-ready with minor enhancements needed
**Architecture:** Excellent domain-driven design with clean separation of concerns
---
## 🔴 Critical Issues (Do First)
- [x] **Fix Integration Test Failure** (`crates/app/cli/tests/chat_stream.rs`) ✅ **COMPLETED**
- Fixed mock server to accept requests with tools parameter
- Test now passes successfully
- Location: `crates/app/cli/tests/chat_stream.rs`
- [x] **Remove Side Effects from Library Code** (`crates/core/agent/src/lib.rs:348-349`) ✅ **COMPLETED**
- Replaced `println!` with `tracing` crate
- Added `tracing = "0.1"` dependency to `agent-core`
- Changed to structured logging: `tracing::debug!` for tool calls, `tracing::warn!` for errors
- Users can now control verbosity and route logs appropriately
- Location: `crates/core/agent/src/lib.rs:348, 352, 361`
---
## 🟡 High-Priority Improvements
### Permission System ✅
- [x] **Implement Proper Permission Selection in TUI** ✅ **COMPLETED**
- Added interactive permission popup with keyboard navigation
- Implemented "Allow once", "Always allow", "Deny", and "Explain" options
- Integrated permission requests into agent loop with async channels
- Added runtime permission rule updates for "Always allow"
- Permission popups pause execution and wait for user input
- Location: `crates/app/ui/src/app.rs`, `crates/app/ui/src/components/permission_popup.rs`
### Documentation
- [ ] **Add User-Facing README.md**
- Quick start guide
- Installation instructions
- Usage examples
- Feature overview
- Links to detailed docs
- Priority: HIGH
- [ ] **Add Architecture Documentation**
- Crate dependency graph diagram
- Agent loop flow diagram
- Permission system flow diagram
- Plugin/hook integration points diagram
- Priority: MEDIUM
### Feature Integration
- [ ] **Integrate Plugin System**
- Wire plugin loading into `crates/app/cli/src/main.rs`
- Load plugins at startup
- Test with example plugins
- Priority: HIGH
- [ ] **Integrate MCP Client into Agent**
- Add MCP tools to agent's tool registry
- Enable external tool servers (databases, APIs, etc.)
- Document MCP server setup
- Priority: HIGH
- [ ] **Implement Real Web Search Provider**
- Add provider for DuckDuckGo, Brave Search, or SearXNG
- Make the web tool functional
- Add configuration for provider selection
- Priority: MEDIUM
### Error Handling & Reliability
- [ ] **Add Retry Logic for Transient Failures**
- Exponential backoff for Ollama API calls
- Configurable retry policies (max attempts, timeout)
- Handle network failures gracefully
- Priority: MEDIUM
- [ ] **Enhance Error Messages**
- Add actionable suggestions for common errors
- Example: "Ollama not running? Try: `ollama serve`"
- Example: "Model not found? Try: `ollama pull qwen3:8b`"
- Priority: MEDIUM
---
## 🟢 Medium-Priority Enhancements
### Testing
- [ ] **Add UI Component Testing**
- Snapshot tests for TUI components
- Integration tests for user interactions
- Use `ratatui` testing utilities
- Priority: MEDIUM
- [ ] **Add More Edge Case Tests**
- Glob patterns with special characters
- Edit operations with Unicode
- Very large file handling
- Concurrent tool execution
- Priority: MEDIUM
- [ ] **Code Coverage Reporting**
- Integrate `tarpaulin` or `cargo-llvm-cov`
- Set minimum coverage thresholds (aim for 80%+)
- Track coverage trends over time
- Priority: LOW
### Documentation
- [ ] **Module-Level Documentation**
- Add `//!` docs to key modules
- Explain design decisions and patterns
- Document internal APIs
- Priority: MEDIUM
- [ ] **Create Examples Directory**
- Simple CLI usage examples
- Custom plugin development guide
- Hook script examples
- MCP server integration examples
- Configuration templates
- Priority: MEDIUM
### Code Quality
- [ ] **Fix Dead Code Warning** (`ui/src/app.rs:38`)
- Either use `settings` field or remove it
- Remove `#[allow(dead_code)]`
- Priority: LOW
- [ ] **Improve Error Recovery**
- Checkpoint auto-save on crashes
- Graceful degradation when tools fail
- Better handling of partial tool results
- Priority: MEDIUM
---
## 🔵 Low-Priority Nice-to-Haves
### Project Infrastructure
- [ ] **CI/CD Pipeline (GitHub Actions)**
- Automated testing on push
- Clippy linting
- Format checking with `rustfmt`
- Security audits with `cargo-audit`
- Cross-platform builds (Linux, macOS, Windows)
- Priority: LOW
- [ ] **Performance Benchmarking**
- Add benchmark suite using `criterion` crate
- Track performance for glob, grep, large file ops
- Track agent loop iteration performance
- Priority: LOW
### Code Organization
- [ ] **Extract Reusable Crates**
- Publish `mcp-client` as standalone library
- Publish `llm-ollama` as standalone library
- Enable reuse by other projects
- Consider publishing to crates.io
- Priority: LOW
---
## 💡 Feature Enhancement Ideas
### Session Management
- [ ] **Session Persistence**
- Auto-save sessions across restarts
- Resume previous conversations
- Session history browser in TUI
- Export/import session transcripts
### Multi-Provider Support
- [ ] **Multi-Model Support**
- Support Anthropic Claude API
- Support OpenAI API
- Provider abstraction layer
- Fallback chains for reliability
### Enhanced Permissions
- [ ] **Advanced Permission System**
- Time-based permissions (expire after N minutes)
- Scope-based permissions (allow within specific directories)
- Permission profiles (dev, prod, strict)
- Team permission policies
### Collaboration
- [ ] **Collaborative Features**
- Export sessions as shareable transcripts
- Import/export checkpoints
- Shared permission policies for teams
- Session replay functionality
### Observability
- [ ] **Enhanced Observability**
- Token usage tracking per tool call
- Cost estimation dashboard
- Performance metrics export (JSON/CSV)
- OpenTelemetry integration
- Real-time stats in TUI
---
## 🚀 Quick Wins (Can Be Done Today)
- [ ] Add `README.md` to repository root
- [ ] Fix dead code warning in `ui/src/app.rs:38`
- [ ] Add `tracing` crate and replace `println!` calls
- [ ] Create `.github/workflows/ci.yml` for basic CI
- [ ] Add module-level docs to `agent-core` and `config-agent`
---
## 🌟 Long-Term Vision
- **Plugin Marketplace:** Curated registry of community plugins
- **Interactive Tutorial:** Built-in tutorial mode for new users
- **VS Code Extension:** Editor integration for inline assistance
- **Collaborative Agents:** Multi-agent workflows with role assignment
- **Knowledge Base Integration:** RAG capabilities for project-specific knowledge
- **Web Dashboard:** Browser-based interface for session management
- **Cloud Sync:** Sync configs and sessions across devices
---
## Notes
- **Test Status:** 28+ tests, most passing. 1 integration test failure (mock server issue)
- **Test Coverage:** Strong coverage for core functionality (permissions, checkpoints, hooks)
- **Architecture:** Clean domain-driven workspace with 15 crates across 4 domains
- **Code Quality:** Excellent error handling, consistent patterns, minimal technical debt
- **Innovation Highlights:** Checkpoint/rewind system, three-tiered permissions, shell-based hooks
---
## Priority Legend
- **HIGH:** Should be done soon, blocks other features or affects quality
- **MEDIUM:** Important but not urgent, improves user experience
- **LOW:** Nice to have, can be deferred
---
Last Updated: 2025-11-01


@@ -4,5 +4,5 @@ This file tracks all major tracks for the project. Each track has its own detail
---
## [ ] Track: Establish a comprehensive test suite for the core agent logic and ensure basic documentation for all crates.
## [~] Track: Establish a comprehensive test suite for the core agent logic and ensure basic documentation for all crates.
*Link: [./conductor/tracks/stabilize_core_20251226/](./conductor/tracks/stabilize_core_20251226/)*


@@ -20,8 +20,11 @@ pulldown-cmark = "0.11"
# Internal dependencies
agent-core = { path = "../../core/agent" }
auth-manager = { path = "../../platform/auth" }
permissions = { path = "../../platform/permissions" }
llm-core = { path = "../../llm/core" }
llm-anthropic = { path = "../../llm/anthropic" }
llm-ollama = { path = "../../llm/ollama" }
llm-openai = { path = "../../llm/openai" }
config-agent = { path = "../../platform/config" }
tools-todo = { path = "../../tools/todo" }


@@ -1,22 +1,23 @@
use crate::{
components::{
Autocomplete, AutocompleteResult, ChatMessage, ChatPanel, CommandHelp, InputBox,
PermissionPopup, StatusBar, TodoPanel,
ModelPicker, PermissionPopup, PickerResult, ProviderTabs, StatusBar, TodoPanel,
},
events::{handle_key_event, AppEvent},
layout::AppLayout,
theme::{Theme, VimMode},
provider_manager::ProviderManager,
theme::{Provider, Theme, VimMode},
};
use tools_todo::TodoList;
use agent_core::{CheckpointManager, SessionHistory, SessionStats, ToolContext, execute_tool, get_tool_definitions};
use color_eyre::eyre::Result;
use crossterm::{
event::{Event, EventStream, EnableMouseCapture, DisableMouseCapture},
event::{Event, EventStream, EnableMouseCapture, DisableMouseCapture, KeyCode},
terminal::{disable_raw_mode, enable_raw_mode, EnterAlternateScreen, LeaveAlternateScreen},
ExecutableCommand,
};
use futures::StreamExt;
use llm_core::{ChatMessage as LLMChatMessage, ChatOptions, LlmProvider};
use llm_core::{ChatMessage as LLMChatMessage, ChatOptions, LlmProvider, ProviderType};
use permissions::{Action, PermissionDecision, PermissionManager, Tool as PermTool};
use ratatui::{
backend::CrosstermBackend,
@@ -39,6 +40,14 @@ struct PendingToolCall {
context: Option<String>,
}
/// Provider mode - single client or multi-provider manager
enum ProviderMode {
/// Legacy single-provider mode
Single(Arc<dyn LlmProvider>),
/// Multi-provider with switching support
Multi(ProviderManager),
}
pub struct TuiApp {
// UI components
chat_panel: ChatPanel,
@@ -48,6 +57,8 @@ pub struct TuiApp {
permission_popup: Option<PermissionPopup>,
autocomplete: Autocomplete,
command_help: CommandHelp,
provider_tabs: ProviderTabs,
model_picker: ModelPicker,
theme: Theme,
// Session state
@@ -56,8 +67,8 @@ pub struct TuiApp {
checkpoint_mgr: CheckpointManager,
todo_list: TodoList,
// System state
client: Arc<dyn LlmProvider>,
// Provider state
provider_mode: ProviderMode,
opts: ChatOptions,
perms: PermissionManager,
#[allow(dead_code)] // Reserved for tool execution context
@@ -74,6 +85,7 @@ pub struct TuiApp {
}
impl TuiApp {
/// Create a new TUI app with a single provider (legacy mode)
pub fn new(
client: Arc<dyn LlmProvider>,
opts: ChatOptions,
@@ -83,6 +95,13 @@ impl TuiApp {
let theme = Theme::default();
let mode = perms.mode();
// Determine provider from client name
let provider = match client.name() {
"anthropic" => Provider::Claude,
"openai" => Provider::OpenAI,
_ => Provider::Ollama,
};
Ok(Self {
chat_panel: ChatPanel::new(theme.clone()),
input_box: InputBox::new(theme.clone()),
@@ -91,12 +110,14 @@ impl TuiApp {
permission_popup: None,
autocomplete: Autocomplete::new(theme.clone()),
command_help: CommandHelp::new(theme.clone()),
provider_tabs: ProviderTabs::with_provider(provider, theme.clone()),
model_picker: ModelPicker::new(theme.clone()),
theme,
stats: SessionStats::new(),
history: SessionHistory::new(),
checkpoint_mgr: CheckpointManager::new(PathBuf::from(".owlen/checkpoints")),
todo_list: TodoList::new(),
client,
provider_mode: ProviderMode::Single(client),
opts,
perms,
ctx: ToolContext::new(),
@@ -109,6 +130,104 @@ impl TuiApp {
})
}
/// Create a new TUI app with multi-provider support
pub fn with_provider_manager(
provider_manager: ProviderManager,
perms: PermissionManager,
settings: config_agent::Settings,
) -> Result<Self> {
let theme = Theme::default();
let mode = perms.mode();
// Get initial provider and model
let current_provider = provider_manager.current_provider_type();
let current_model = provider_manager.current_model().to_string();
let provider = match current_provider {
ProviderType::Anthropic => Provider::Claude,
ProviderType::OpenAI => Provider::OpenAI,
ProviderType::Ollama => Provider::Ollama,
};
let opts = ChatOptions::new(&current_model);
Ok(Self {
chat_panel: ChatPanel::new(theme.clone()),
input_box: InputBox::new(theme.clone()),
status_bar: StatusBar::new(current_model, mode, theme.clone()),
todo_panel: TodoPanel::new(theme.clone()),
permission_popup: None,
autocomplete: Autocomplete::new(theme.clone()),
command_help: CommandHelp::new(theme.clone()),
provider_tabs: ProviderTabs::with_provider(provider, theme.clone()),
model_picker: ModelPicker::new(theme.clone()),
theme,
stats: SessionStats::new(),
history: SessionHistory::new(),
checkpoint_mgr: CheckpointManager::new(PathBuf::from(".owlen/checkpoints")),
todo_list: TodoList::new(),
provider_mode: ProviderMode::Multi(provider_manager),
opts,
perms,
ctx: ToolContext::new(),
settings,
running: true,
waiting_for_llm: false,
pending_tool: None,
permission_tx: None,
vim_mode: VimMode::Insert,
})
}
/// Get the current LLM provider client
fn get_client(&mut self) -> Result<Arc<dyn LlmProvider>> {
match &mut self.provider_mode {
ProviderMode::Single(client) => Ok(Arc::clone(client)),
ProviderMode::Multi(manager) => manager
.get_provider()
.map_err(|e| color_eyre::eyre::eyre!("{}", e)),
}
}
/// Switch to a different provider (only works in multi-provider mode)
fn switch_provider(&mut self, provider_type: ProviderType) -> Result<()> {
if let ProviderMode::Multi(manager) = &mut self.provider_mode {
match manager.switch_provider(provider_type) {
Ok(_) => {
// Update UI state
let provider = match provider_type {
ProviderType::Anthropic => Provider::Claude,
ProviderType::OpenAI => Provider::OpenAI,
ProviderType::Ollama => Provider::Ollama,
};
self.provider_tabs.set_active(provider);
// Update model and status bar
let model = manager.current_model().to_string();
self.opts.model = model.clone();
self.status_bar = StatusBar::new(model.clone(), self.perms.mode(), self.theme.clone());
self.chat_panel.add_message(ChatMessage::System(
format!("Switched to {} (model: {})", provider_type, model)
));
Ok(())
}
Err(e) => {
self.chat_panel.add_message(ChatMessage::System(
format!("Failed to switch provider: {}", e)
));
Err(color_eyre::eyre::eyre!("{}", e))
}
}
} else {
self.chat_panel.add_message(ChatMessage::System(
"Provider switching requires multi-provider mode. Restart with 'owlen' to enable.".to_string()
));
Ok(())
}
}
fn set_theme(&mut self, theme: Theme) {
self.theme = theme.clone();
self.chat_panel = ChatPanel::new(theme.clone());
@@ -116,7 +235,85 @@ impl TuiApp {
self.status_bar = StatusBar::new(self.opts.model.clone(), self.perms.mode(), theme.clone());
self.todo_panel.set_theme(theme.clone());
self.autocomplete.set_theme(theme.clone());
self.command_help.set_theme(theme.clone());
self.provider_tabs.set_theme(theme.clone());
self.model_picker.set_theme(theme);
}
/// Open the model picker for the current provider
async fn open_model_picker(&mut self) {
if let ProviderMode::Multi(manager) = &self.provider_mode {
let provider_type = manager.current_provider_type();
let current_model = manager.current_model().to_string();
// Show loading state immediately
self.model_picker.show_loading(provider_type);
// Fetch models from provider
match manager.list_models_for_provider(provider_type).await {
Ok(models) => {
if models.is_empty() {
self.model_picker.show_error("No models available".to_string());
} else {
self.model_picker.show(models, &provider_type.to_string(), &current_model);
}
}
Err(e) => {
// Show error state with option to use fallback models
self.model_picker.show_error(e.to_string());
}
}
} else {
self.chat_panel.add_message(ChatMessage::System(
"Model picker requires multi-provider mode. Use [1][2][3] to switch providers first.".to_string()
));
}
}
/// Set the model for the current provider
fn set_current_model(&mut self, model: String) {
if let ProviderMode::Multi(manager) = &mut self.provider_mode {
manager.set_current_model(model.clone());
self.opts.model = model.clone();
self.status_bar = StatusBar::new(model.clone(), self.perms.mode(), self.theme.clone());
self.chat_panel.add_message(ChatMessage::System(
format!("Model changed to: {}", model)
));
}
}
/// Show keyboard shortcuts help
fn show_shortcuts_help(&mut self) {
let shortcuts = r#"
--- Keyboard Shortcuts ---
Provider Switching (Normal mode or empty input):
[1] [2] [3] Switch provider (Claude/Ollama/OpenAI)
Tab Cycle through providers
M Open model picker
Chat Navigation (Normal mode or empty input):
j / k Select next/prev message
J / K Scroll chat down/up (3 lines)
g / G Scroll to top/bottom
Esc Clear selection
Scrolling (works anytime):
PageUp/Down Scroll page up/down
Vim Modes:
Esc Normal mode (navigation)
i Insert mode (typing)
: Command mode
Input:
Enter Send message
Ctrl+c Quit
Commands: /help, /model <name>, /clear, /theme <name>
"#;
self.chat_panel.add_message(ChatMessage::System(shortcuts.trim().to_string()));
}
/// Get the public todo list for external updates
@@ -234,7 +431,24 @@ impl TuiApp {
}
});
// Show first-run welcome message if this is the first time
if config_agent::is_first_run() {
self.chat_panel.add_message(ChatMessage::System(
"Welcome to Owlen! 🦉\n\n\
You're starting with:\n\
• Provider: Ollama (local, free)\n\
• Model: qwen3:8b (tool-capable)\n\n\
Quick start:\n\
• [1][2][3] - Switch providers (Claude/Ollama/OpenAI)\n\
• [Tab] - Cycle through providers\n\
• [M] - Open model picker (in Normal mode)\n\
• [Esc] - Enter Normal mode\n\
• /help - Show all commands\n\n\
To authenticate with cloud providers:\n\
• Run: owlen login anthropic\n\
• Run: owlen login openai".to_string()
));
}
// Main event loop
while self.running {
@@ -247,6 +461,9 @@ impl TuiApp {
// Render header: OWLEN left, model + vim mode right
self.render_header(frame, layout.header_area);
// Render provider tabs
self.provider_tabs.render(frame, layout.tabs_area);
// Render top divider (horizontal rule)
self.render_divider(frame, layout.top_divider);
@@ -280,12 +497,17 @@ impl TuiApp {
self.autocomplete.render(frame, layout.input_area);
}
// 2. Model picker (centered modal)
if self.model_picker.is_visible() {
self.model_picker.render(frame, size);
}
// 3. Command help overlay (centered modal)
if self.command_help.is_visible() {
self.command_help.render(frame, size);
}
// 4. Permission popup (highest priority)
if let Some(popup) = &self.permission_popup {
popup.render(frame, size);
}
@@ -390,7 +612,24 @@ impl TuiApp {
return Ok(());
}
// 3. Model picker
if self.model_picker.is_visible() {
let current_model = self.opts.model.clone();
match self.model_picker.handle_key(key, &current_model) {
PickerResult::Selected(model) => {
self.set_current_model(model);
}
PickerResult::Cancelled => {
// Just closed, no action
}
PickerResult::Handled | PickerResult::NotHandled => {
// Navigation or unhandled key
}
}
return Ok(());
}
// 4. Autocomplete dropdown
if self.autocomplete.is_visible() {
match self.autocomplete.handle_key(key) {
AutocompleteResult::Confirmed(cmd) => {
@@ -478,6 +717,26 @@ impl TuiApp {
AppEvent::ToggleTodo => {
self.todo_panel.toggle();
}
AppEvent::SwitchProvider(provider_type) => {
let _ = self.switch_provider(provider_type);
}
AppEvent::CycleProvider => {
// Get current provider type and cycle to next
let current = match self.provider_tabs.active() {
Provider::Claude => ProviderType::Anthropic,
Provider::Ollama => ProviderType::Ollama,
Provider::OpenAI => ProviderType::OpenAI,
};
let next = match current {
ProviderType::Anthropic => ProviderType::Ollama,
ProviderType::Ollama => ProviderType::OpenAI,
ProviderType::OpenAI => ProviderType::Anthropic,
};
let _ = self.switch_provider(next);
}
AppEvent::OpenModelPicker => {
self.open_model_picker().await;
}
AppEvent::Quit => {
self.running = false;
}
@@ -494,6 +753,94 @@ impl TuiApp {
) -> Result<()> {
use crate::components::InputEvent;
// Global navigation keys that work in any mode
match key.code {
// PageUp - Scroll chat up one page (always works)
KeyCode::PageUp => {
self.chat_panel.page_up(20);
return Ok(());
}
// PageDown - Scroll chat down one page (always works)
KeyCode::PageDown => {
self.chat_panel.page_down(20);
return Ok(());
}
_ => {}
}
// Check for provider switching keys when input is empty or in Normal mode
let input_empty = self.input_box.text().is_empty();
if input_empty || self.vim_mode == VimMode::Normal {
match key.code {
// [1] - Switch to Claude (Anthropic)
KeyCode::Char('1') => {
let _ = event_tx.send(AppEvent::SwitchProvider(ProviderType::Anthropic));
return Ok(());
}
// [2] - Switch to Ollama
KeyCode::Char('2') => {
let _ = event_tx.send(AppEvent::SwitchProvider(ProviderType::Ollama));
return Ok(());
}
// [3] - Switch to OpenAI
KeyCode::Char('3') => {
let _ = event_tx.send(AppEvent::SwitchProvider(ProviderType::OpenAI));
return Ok(());
}
// Tab - Cycle providers
KeyCode::Tab => {
let _ = event_tx.send(AppEvent::CycleProvider);
return Ok(());
}
// '?' - Show shortcuts help
KeyCode::Char('?') => {
self.show_shortcuts_help();
return Ok(());
}
// 'M' (Shift+m) - Open model picker
KeyCode::Char('M') => {
let _ = event_tx.send(AppEvent::OpenModelPicker);
return Ok(());
}
// 'j' - Navigate to next message (focus)
KeyCode::Char('j') => {
self.chat_panel.focus_next();
return Ok(());
}
// 'k' - Navigate to previous message (focus)
KeyCode::Char('k') => {
self.chat_panel.focus_previous();
return Ok(());
}
// 'J' (Shift+j) - Scroll chat down
KeyCode::Char('J') => {
self.chat_panel.scroll_down(3);
return Ok(());
}
// 'K' (Shift+k) - Scroll chat up
KeyCode::Char('K') => {
self.chat_panel.scroll_up(3);
return Ok(());
}
// 'G' (Shift+g) - Scroll to bottom
KeyCode::Char('G') => {
self.chat_panel.scroll_to_bottom();
return Ok(());
}
// 'g' - Scroll to top (vim-like gg, simplified to single g)
KeyCode::Char('g') => {
self.chat_panel.scroll_to_top();
return Ok(());
}
// Esc also clears message focus
KeyCode::Esc => {
self.chat_panel.clear_focus();
// Don't return - let it also handle vim mode change
}
_ => {}
}
}
// Handle the key in input box
if let Some(event) = self.input_box.handle_key(key) {
match event {
@@ -550,6 +897,17 @@ impl TuiApp {
return Ok(());
}
// Get the current provider client
let client = match self.get_client() {
Ok(c) => c,
Err(e) => {
self.chat_panel.add_message(ChatMessage::System(
format!("Failed to get provider: {}", e)
));
return Ok(());
}
};
// Add user message to chat IMMEDIATELY so it shows before AI response
self.chat_panel
.add_message(ChatMessage::User(message.clone()));
@@ -561,7 +919,6 @@ impl TuiApp {
let _ = event_tx.send(AppEvent::StreamStart);
// Spawn streaming in background task
let opts = self.opts.clone();
let tx = event_tx.clone();
let message_owned = message.clone();
@@ -715,6 +1072,9 @@ impl TuiApp {
let mut iteration = 0;
let mut final_response = String::new();
// Get the current provider client
let client = self.get_client()?;
loop {
iteration += 1;
if iteration > max_iterations {
@@ -725,8 +1085,7 @@ impl TuiApp {
}
// Call LLM with streaming using the LlmProvider trait
use llm_core::LlmProvider;
let mut stream = client
.chat_stream(&messages, &self.opts, Some(&tools))
.await
.map_err(|e| color_eyre::eyre::eyre!("LLM provider error: {}", e))?;

View File

@@ -133,6 +133,12 @@ impl ChatPanel {
self.auto_scroll = true;
}
/// Scroll to top
pub fn scroll_to_top(&mut self) {
self.scroll_offset = 0;
self.auto_scroll = false;
}
/// Page up
pub fn page_up(&mut self, page_size: usize) {
self.scroll_up(page_size.saturating_sub(2));
@@ -200,6 +206,12 @@ impl ChatPanel {
}
/// Count total lines for scroll calculation
/// Must match exactly what render() produces:
/// - User: 1 (role line) + N (content) + 1 (empty) = N + 2
/// - Assistant: 1 (role line) + N (content) + 1 (empty) = N + 2
/// - ToolCall: 1 (call line) + 1 (empty) = 2
/// - ToolResult: 1 (result line) + 1 (empty) = 2
/// - System: N (content lines) + 1 (empty) = N + 1
fn count_total_lines(&self, area: Rect) -> usize {
let mut line_count = 0;
let wrap_width = area.width.saturating_sub(4) as usize;
@@ -208,15 +220,20 @@ impl ChatPanel {
line_count += match &msg.message {
ChatMessage::User(content) => {
let wrapped = textwrap::wrap(content, wrap_width);
// 1 role line + N content lines + 1 empty line
1 + wrapped.len() + 1
}
ChatMessage::Assistant(content) => {
let wrapped = textwrap::wrap(content, wrap_width);
// 1 role line + N content lines + 1 empty line
1 + wrapped.len() + 1
}
ChatMessage::ToolCall { .. } => 2,
ChatMessage::ToolResult { .. } => 2,
ChatMessage::System(content) => {
// N content lines + 1 empty line (no role line for system)
content.lines().count() + 1
}
};
}
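The per-variant arithmetic documented above can be sanity-checked in isolation. A standalone sketch with hypothetical helper names (not part of the codebase), mirroring the documented formulas:

```rust
// Hypothetical helpers mirroring the documented line-count formulas;
// not part of the codebase, for illustration only.

// User/Assistant: 1 role line + N wrapped content lines + 1 empty line.
fn user_or_assistant_lines(wrapped_content_lines: usize) -> usize {
    1 + wrapped_content_lines + 1
}

// System: N content lines + 1 empty line (no role line).
fn system_lines(content: &str) -> usize {
    content.lines().count() + 1
}
```

Keeping these formulas in exact sync with what `render()` emits is what makes the scroll offset accurate.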
@@ -357,11 +374,14 @@ impl ChatPanel {
}
ChatMessage::System(content) => {
// System messages: handle multi-line content
for line in content.lines() {
text_lines.push(Line::from(vec![
Span::styled(" ", Style::default()),
Span::styled(line.to_string(), self.theme.system_message),
]));
}
text_lines.push(Line::from(""));
}
}
}

View File

@@ -4,6 +4,7 @@ mod autocomplete;
mod chat_panel;
mod command_help;
mod input_box;
mod model_picker;
mod permission_popup;
mod provider_tabs;
mod status_bar;
@@ -13,6 +14,7 @@ pub use autocomplete::{Autocomplete, AutocompleteOption, AutocompleteResult};
pub use chat_panel::{ChatMessage, ChatPanel, DisplayMessage};
pub use command_help::{Command, CommandHelp};
pub use input_box::{InputBox, InputEvent};
pub use model_picker::{ModelPicker, PickerResult, PickerState};
pub use permission_popup::{PermissionOption, PermissionPopup};
pub use provider_tabs::ProviderTabs;
pub use status_bar::{AppState, StatusBar};

View File

@@ -0,0 +1,811 @@
//! Model Picker Component
//!
//! A dropdown-style picker for selecting models from the current provider.
//! Triggered by pressing 'M' when input is empty or in Normal mode.
use crate::theme::Theme;
use crossterm::event::{KeyCode, KeyEvent};
use llm_core::{ModelInfo, ProviderType};
use ratatui::{
layout::Rect,
style::{Modifier, Style},
text::{Line, Span},
widgets::{Block, Borders, Clear, Paragraph},
Frame,
};
/// Result of handling a key event in the model picker
pub enum PickerResult {
/// Model was selected
Selected(String),
/// Picker was cancelled
Cancelled,
/// Key was handled, no action needed
Handled,
/// Key was not handled
NotHandled,
}
/// State of the model picker
#[derive(Debug, Clone, PartialEq)]
pub enum PickerState {
/// Picker is hidden
Hidden,
/// Loading models from provider
Loading,
/// Picker is ready with models
Ready,
/// Error loading models
Error(String),
}
/// Maximum number of visible models in the picker
const MAX_VISIBLE_MODELS: usize = 10;
/// Model picker dropdown component
pub struct ModelPicker {
/// Available models for the current provider
models: Vec<ModelInfo>,
/// Currently selected index (within filtered_indices)
selected_index: usize,
/// Scroll offset for the visible window
scroll_offset: usize,
/// Picker state (hidden, loading, ready, error)
state: PickerState,
/// Filter text for searching models
filter: String,
/// Filtered model indices
filtered_indices: Vec<usize>,
/// Theme for styling
theme: Theme,
/// Provider name for display
provider_name: String,
/// Current provider type (for fallback models)
provider_type: Option<ProviderType>,
}
impl ModelPicker {
/// Create a new model picker
pub fn new(theme: Theme) -> Self {
Self {
models: Vec::new(),
selected_index: 0,
scroll_offset: 0,
state: PickerState::Hidden,
filter: String::new(),
filtered_indices: Vec::new(),
theme,
provider_name: String::new(),
provider_type: None,
}
}
/// Show the loading state while fetching models
pub fn show_loading(&mut self, provider_type: ProviderType) {
// Capitalize provider name for display
let name = provider_type.to_string();
self.provider_name = capitalize_first(&name);
self.provider_type = Some(provider_type);
self.models.clear();
self.filtered_indices.clear();
self.filter.clear();
self.scroll_offset = 0;
self.state = PickerState::Loading;
}
/// Show the picker with models for a provider
pub fn show(&mut self, models: Vec<ModelInfo>, provider_name: &str, current_model: &str) {
self.models = models;
self.provider_name = provider_name.to_string();
self.filter.clear();
self.scroll_offset = 0;
self.update_filter();
// Find and select current model (index within filtered_indices)
let model_idx = self
.models
.iter()
.position(|m| m.id == current_model)
.unwrap_or(0);
// Find position in filtered list
self.selected_index = self
.filtered_indices
.iter()
.position(|&i| i == model_idx)
.unwrap_or(0);
// Adjust scroll to show selected item
self.ensure_selected_visible();
self.state = PickerState::Ready;
}
/// Show an error state
pub fn show_error(&mut self, error: String) {
self.state = PickerState::Error(error);
}
/// Hide the picker
pub fn hide(&mut self) {
self.state = PickerState::Hidden;
self.filter.clear();
self.provider_type = None;
}
/// Check if picker is visible (any non-hidden state)
pub fn is_visible(&self) -> bool {
!matches!(self.state, PickerState::Hidden)
}
/// Get current picker state
pub fn state(&self) -> &PickerState {
&self.state
}
/// Use fallback models for the current provider
pub fn use_fallback_models(&mut self, current_model: &str) {
if let Some(provider_type) = self.provider_type {
let models = get_fallback_models(provider_type);
if !models.is_empty() {
self.show(models, &provider_type.to_string(), current_model);
} else {
self.show_error("No models available".to_string());
}
}
}
/// Update the theme
pub fn set_theme(&mut self, theme: Theme) {
self.theme = theme;
}
/// Update filter and recalculate visible models
fn update_filter(&mut self) {
let filter_lower = self.filter.to_lowercase();
self.filtered_indices = self
.models
.iter()
.enumerate()
.filter(|(_, m)| {
if filter_lower.is_empty() {
true
} else {
m.id.to_lowercase().contains(&filter_lower)
|| m.display_name
.as_ref()
.map(|n| n.to_lowercase().contains(&filter_lower))
.unwrap_or(false)
}
})
.map(|(i, _)| i)
.collect();
// Reset selection to first item and scroll to top when filter changes
self.selected_index = 0;
self.scroll_offset = 0;
}
/// Handle a key event
pub fn handle_key(&mut self, key: KeyEvent, current_model: &str) -> PickerResult {
// Handle different states
match &self.state {
PickerState::Hidden => return PickerResult::NotHandled,
PickerState::Loading => {
// Only allow escape while loading
if key.code == KeyCode::Esc {
self.hide();
return PickerResult::Cancelled;
}
return PickerResult::Handled;
}
PickerState::Error(_) => {
match key.code {
KeyCode::Esc => {
self.hide();
return PickerResult::Cancelled;
}
KeyCode::Char('f') | KeyCode::Char('F') => {
// Use fallback models
self.use_fallback_models(current_model);
return PickerResult::Handled;
}
_ => return PickerResult::Handled,
}
}
PickerState::Ready => {
// Fall through to model selection logic
}
}
// Ready state - handle model selection
match key.code {
KeyCode::Esc => {
self.hide();
PickerResult::Cancelled
}
KeyCode::Enter => {
// selected_index is position in filtered_indices
// get the actual model index, then the model
if let Some(&model_idx) = self.filtered_indices.get(self.selected_index) {
let model_id = self.models[model_idx].id.clone();
self.hide();
PickerResult::Selected(model_id)
} else {
PickerResult::Cancelled
}
}
KeyCode::Up | KeyCode::Char('k') => {
self.select_previous();
PickerResult::Handled
}
KeyCode::Down | KeyCode::Char('j') => {
self.select_next();
PickerResult::Handled
}
KeyCode::Char(c) => {
self.filter.push(c);
self.update_filter();
PickerResult::Handled
}
KeyCode::Backspace => {
self.filter.pop();
self.update_filter();
PickerResult::Handled
}
_ => PickerResult::NotHandled,
}
}
/// Select next model (selected_index is position in filtered_indices)
fn select_next(&mut self) {
if self.filtered_indices.is_empty() {
return;
}
// Move to next position (wrapping)
self.selected_index = (self.selected_index + 1) % self.filtered_indices.len();
self.ensure_selected_visible();
}
/// Select previous model (selected_index is position in filtered_indices)
fn select_previous(&mut self) {
if self.filtered_indices.is_empty() {
return;
}
// Move to previous position (wrapping)
if self.selected_index == 0 {
self.selected_index = self.filtered_indices.len() - 1;
} else {
self.selected_index -= 1;
}
self.ensure_selected_visible();
}
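The wrap-around stepping in `select_next`/`select_previous` reduces to simple modular arithmetic. A standalone sketch (hypothetical free functions, not part of the codebase):

```rust
// Hypothetical free functions showing the wrap-around index stepping;
// not part of the codebase, for illustration only.
fn next_wrapping(i: usize, len: usize) -> usize {
    (i + 1) % len
}

fn prev_wrapping(i: usize, len: usize) -> usize {
    if i == 0 { len - 1 } else { i - 1 }
}
```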
/// Ensure the selected item is visible by adjusting scroll_offset
fn ensure_selected_visible(&mut self) {
// If selected is above the visible window, scroll up
if self.selected_index < self.scroll_offset {
self.scroll_offset = self.selected_index;
}
// If selected is below the visible window, scroll down
else if self.selected_index >= self.scroll_offset + MAX_VISIBLE_MODELS {
self.scroll_offset = self.selected_index.saturating_sub(MAX_VISIBLE_MODELS - 1);
}
}
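The clamp performed here keeps the selection inside a fixed-size window; expressed as a pure function for clarity (a sketch with an assumed signature, not part of the codebase):

```rust
// Pure-function sketch of the scroll clamp in ensure_selected_visible;
// hypothetical, not part of the codebase.
fn clamped_scroll(selected: usize, scroll_offset: usize, window: usize) -> usize {
    if selected < scroll_offset {
        // Selection is above the window: scroll up so it is the first visible row.
        selected
    } else if selected >= scroll_offset + window {
        // Selection is below the window: scroll down so it is the last visible row.
        selected.saturating_sub(window - 1)
    } else {
        scroll_offset
    }
}
```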
/// Render the model picker as a centered modal
pub fn render(&self, frame: &mut Frame, area: Rect) {
match &self.state {
PickerState::Hidden => return,
PickerState::Loading => self.render_loading(frame, area),
PickerState::Ready => self.render_models(frame, area),
PickerState::Error(msg) => self.render_error(frame, area, msg),
}
}
/// Render the loading state
fn render_loading(&self, frame: &mut Frame, area: Rect) {
let modal_width = 50.min(area.width.saturating_sub(4));
let modal_height = 5;
let modal_x = (area.width.saturating_sub(modal_width)) / 2;
let modal_y = (area.height.saturating_sub(modal_height)) / 2;
let modal_area = Rect {
x: modal_x,
y: modal_y,
width: modal_width,
height: modal_height,
};
frame.render_widget(Clear, modal_area);
let lines = vec![
Line::from(Span::styled(
format!("Model Picker - {}", self.provider_name),
Style::default()
.fg(self.theme.palette.accent)
.add_modifier(Modifier::BOLD),
)),
Line::from(""),
Line::from(Span::styled(
"Loading models...",
self.theme.status_dim,
)),
];
let block = Block::default()
.borders(Borders::ALL)
.border_style(Style::default().fg(self.theme.palette.border))
.style(Style::default().bg(self.theme.palette.overlay_bg));
let paragraph = Paragraph::new(lines).block(block);
frame.render_widget(paragraph, modal_area);
}
/// Render the error state
fn render_error(&self, frame: &mut Frame, area: Rect, error: &str) {
let modal_width = 60.min(area.width.saturating_sub(4));
let modal_height = 7;
let modal_x = (area.width.saturating_sub(modal_width)) / 2;
let modal_y = (area.height.saturating_sub(modal_height)) / 2;
let modal_area = Rect {
x: modal_x,
y: modal_y,
width: modal_width,
height: modal_height,
};
frame.render_widget(Clear, modal_area);
let lines = vec![
Line::from(Span::styled(
format!("Model Picker - {}", self.provider_name),
Style::default()
.fg(self.theme.palette.accent)
.add_modifier(Modifier::BOLD),
)),
Line::from(""),
Line::from(Span::styled(
format!("Error: {}", error),
Style::default().fg(self.theme.palette.error),
)),
Line::from(""),
Line::from(Span::styled(
"Press [f] for fallback models, [Esc] to close",
self.theme.status_dim,
)),
];
let block = Block::default()
.borders(Borders::ALL)
.border_style(Style::default().fg(self.theme.palette.border))
.style(Style::default().bg(self.theme.palette.overlay_bg));
let paragraph = Paragraph::new(lines).block(block);
frame.render_widget(paragraph, modal_area);
}
/// Render the model list
fn render_models(&self, frame: &mut Frame, area: Rect) {
if self.models.is_empty() {
return;
}
// Calculate visible window
let total_models = self.filtered_indices.len();
let visible_count = total_models.min(MAX_VISIBLE_MODELS);
// Calculate modal dimensions
let modal_width = 60.min(area.width.saturating_sub(4));
let modal_height = visible_count as u16 + 6; // 2 for border, 1 for header, 1 for filter, 1 for separator, 1 for scroll hint
let modal_x = (area.width.saturating_sub(modal_width)) / 2;
let modal_y = (area.height.saturating_sub(modal_height)) / 2;
let modal_area = Rect {
x: modal_x,
y: modal_y,
width: modal_width,
height: modal_height,
};
// Clear background
frame.render_widget(Clear, modal_area);
// Build the modal content
let mut lines = Vec::new();
// Header with provider name and count
let header = if total_models > MAX_VISIBLE_MODELS {
format!(
"Select Model - {} ({}-{}/{})",
self.provider_name,
self.scroll_offset + 1,
(self.scroll_offset + visible_count).min(total_models),
total_models
)
} else {
format!("Select Model - {} ({})", self.provider_name, total_models)
};
lines.push(Line::from(vec![
Span::styled(
header,
Style::default()
.fg(self.theme.palette.accent)
.add_modifier(Modifier::BOLD),
),
]));
// Filter line
let filter_text = if self.filter.is_empty() {
"Type to filter... (j/k to navigate, Enter to select)".to_string()
} else {
format!("Filter: {}", self.filter)
};
lines.push(Line::from(Span::styled(
filter_text,
self.theme.status_dim,
)));
// Separator
lines.push(Line::from(Span::styled(
"─".repeat((modal_width - 2) as usize),
self.theme.status_dim,
)));
// Show scroll up indicator if needed
if self.scroll_offset > 0 {
lines.push(Line::from(Span::styled(
" ▲ more above",
self.theme.status_dim,
)));
}
// Model list - show visible window based on scroll_offset
let visible_range = self.scroll_offset..(self.scroll_offset + visible_count).min(total_models);
for (display_idx, &model_idx) in self.filtered_indices.iter().enumerate() {
// Skip items outside visible window
if display_idx < visible_range.start || display_idx >= visible_range.end {
continue;
}
let model = &self.models[model_idx];
let is_selected = display_idx == self.selected_index;
let display_name = model
.display_name
.as_ref()
.unwrap_or(&model.id)
.clone();
// Build model line
let prefix = if is_selected { "▸ " } else { "  " };
let tool_indicator = if model.supports_tools { " [tools]" } else { "" };
let vision_indicator = if model.supports_vision { " [vision]" } else { "" };
let style = if is_selected {
Style::default()
.fg(self.theme.palette.accent)
.add_modifier(Modifier::BOLD)
} else {
Style::default().fg(self.theme.palette.fg)
};
lines.push(Line::from(vec![
Span::styled(prefix, style),
Span::styled(display_name, style),
Span::styled(
format!("{}{}", tool_indicator, vision_indicator),
self.theme.status_dim,
),
]));
// Add pricing info on a second line for selected model
if is_selected {
if let (Some(input_price), Some(output_price)) =
(model.input_price_per_mtok, model.output_price_per_mtok)
{
lines.push(Line::from(Span::styled(
format!(
" ${:.2}/MTok in, ${:.2}/MTok out",
input_price, output_price
),
self.theme.status_dim,
)));
}
}
}
// Show scroll down indicator if needed
if self.scroll_offset + visible_count < total_models {
lines.push(Line::from(Span::styled(
format!(" ▼ {} more below", total_models - self.scroll_offset - visible_count),
self.theme.status_dim,
)));
}
// Render
let block = Block::default()
.borders(Borders::ALL)
.border_style(Style::default().fg(self.theme.palette.border))
.style(Style::default().bg(self.theme.palette.overlay_bg));
let paragraph = Paragraph::new(lines).block(block);
frame.render_widget(paragraph, modal_area);
}
}
/// Capitalize the first letter of a string
fn capitalize_first(s: &str) -> String {
let mut chars = s.chars();
match chars.next() {
None => String::new(),
Some(first) => first.to_uppercase().chain(chars).collect(),
}
}
/// Get fallback models for a provider when API call fails
fn get_fallback_models(provider: ProviderType) -> Vec<ModelInfo> {
match provider {
ProviderType::Ollama => vec![
ModelInfo {
id: "qwen3:8b".to_string(),
display_name: Some("Qwen 3 8B".to_string()),
description: Some("Efficient reasoning model with tool support".to_string()),
context_window: Some(32768),
max_output_tokens: Some(8192),
supports_tools: true,
supports_vision: false,
input_price_per_mtok: None,
output_price_per_mtok: None,
},
ModelInfo {
id: "llama3.2:latest".to_string(),
display_name: Some("Llama 3.2".to_string()),
description: Some("Meta's latest model with vision".to_string()),
context_window: Some(128000),
max_output_tokens: Some(4096),
supports_tools: true,
supports_vision: true,
input_price_per_mtok: None,
output_price_per_mtok: None,
},
ModelInfo {
id: "deepseek-coder-v2:latest".to_string(),
display_name: Some("DeepSeek Coder V2".to_string()),
description: Some("Coding-focused model".to_string()),
context_window: Some(65536),
max_output_tokens: Some(8192),
supports_tools: true,
supports_vision: false,
input_price_per_mtok: None,
output_price_per_mtok: None,
},
ModelInfo {
id: "mistral:latest".to_string(),
display_name: Some("Mistral 7B".to_string()),
description: Some("Fast and efficient model".to_string()),
context_window: Some(32768),
max_output_tokens: Some(4096),
supports_tools: true,
supports_vision: false,
input_price_per_mtok: None,
output_price_per_mtok: None,
},
],
ProviderType::Anthropic => vec![
ModelInfo {
id: "claude-sonnet-4-20250514".to_string(),
display_name: Some("Claude Sonnet 4".to_string()),
description: Some("Best balance of speed and capability".to_string()),
context_window: Some(200000),
max_output_tokens: Some(8192),
supports_tools: true,
supports_vision: true,
input_price_per_mtok: Some(3.0),
output_price_per_mtok: Some(15.0),
},
ModelInfo {
id: "claude-opus-4-20250514".to_string(),
display_name: Some("Claude Opus 4".to_string()),
description: Some("Most capable model".to_string()),
context_window: Some(200000),
max_output_tokens: Some(8192),
supports_tools: true,
supports_vision: true,
input_price_per_mtok: Some(15.0),
output_price_per_mtok: Some(75.0),
},
ModelInfo {
id: "claude-haiku-3-5-20241022".to_string(),
display_name: Some("Claude Haiku 3.5".to_string()),
description: Some("Fast and affordable".to_string()),
context_window: Some(200000),
max_output_tokens: Some(8192),
supports_tools: true,
supports_vision: true,
input_price_per_mtok: Some(0.80),
output_price_per_mtok: Some(4.0),
},
],
ProviderType::OpenAI => vec![
ModelInfo {
id: "gpt-4o".to_string(),
display_name: Some("GPT-4o".to_string()),
description: Some("Most capable GPT-4 model".to_string()),
context_window: Some(128000),
max_output_tokens: Some(16384),
supports_tools: true,
supports_vision: true,
input_price_per_mtok: Some(2.50),
output_price_per_mtok: Some(10.0),
},
ModelInfo {
id: "gpt-4o-mini".to_string(),
display_name: Some("GPT-4o Mini".to_string()),
description: Some("Fast and affordable GPT-4".to_string()),
context_window: Some(128000),
max_output_tokens: Some(16384),
supports_tools: true,
supports_vision: true,
input_price_per_mtok: Some(0.15),
output_price_per_mtok: Some(0.60),
},
ModelInfo {
id: "gpt-4-turbo".to_string(),
display_name: Some("GPT-4 Turbo".to_string()),
description: Some("Previous generation GPT-4".to_string()),
context_window: Some(128000),
max_output_tokens: Some(4096),
supports_tools: true,
supports_vision: true,
input_price_per_mtok: Some(10.0),
output_price_per_mtok: Some(30.0),
},
ModelInfo {
id: "o1".to_string(),
display_name: Some("o1".to_string()),
description: Some("Reasoning model".to_string()),
context_window: Some(200000),
max_output_tokens: Some(100000),
supports_tools: false,
supports_vision: true,
input_price_per_mtok: Some(15.0),
output_price_per_mtok: Some(60.0),
},
],
}
}
#[cfg(test)]
mod tests {
use super::*;
fn create_test_models() -> Vec<ModelInfo> {
vec![
ModelInfo {
id: "model-a".to_string(),
display_name: Some("Model A".to_string()),
description: None,
context_window: Some(4096),
max_output_tokens: Some(1024),
supports_tools: true,
supports_vision: false,
input_price_per_mtok: Some(1.0),
output_price_per_mtok: Some(2.0),
},
ModelInfo {
id: "model-b".to_string(),
display_name: Some("Model B".to_string()),
description: None,
context_window: Some(8192),
max_output_tokens: Some(2048),
supports_tools: true,
supports_vision: true,
input_price_per_mtok: Some(0.5),
output_price_per_mtok: Some(1.0),
},
]
}
#[test]
fn test_model_picker_show_hide() {
let theme = Theme::default();
let mut picker = ModelPicker::new(theme);
assert!(!picker.is_visible());
assert_eq!(*picker.state(), PickerState::Hidden);
picker.show(create_test_models(), "Test Provider", "model-a");
assert!(picker.is_visible());
assert_eq!(*picker.state(), PickerState::Ready);
picker.hide();
assert!(!picker.is_visible());
assert_eq!(*picker.state(), PickerState::Hidden);
}
#[test]
fn test_model_picker_loading_state() {
let theme = Theme::default();
let mut picker = ModelPicker::new(theme);
picker.show_loading(ProviderType::Anthropic);
assert!(picker.is_visible());
assert_eq!(*picker.state(), PickerState::Loading);
// Provider name is capitalized for display
assert_eq!(picker.provider_name, "Anthropic");
}
#[test]
fn test_model_picker_error_state() {
let theme = Theme::default();
let mut picker = ModelPicker::new(theme);
picker.show_loading(ProviderType::Ollama);
picker.show_error("Connection refused".to_string());
assert!(picker.is_visible());
assert!(matches!(picker.state(), PickerState::Error(_)));
}
#[test]
fn test_model_picker_fallback_models() {
let theme = Theme::default();
let mut picker = ModelPicker::new(theme);
// Start in loading state
picker.show_loading(ProviderType::Anthropic);
picker.show_error("API error".to_string());
// Use fallback models
picker.use_fallback_models("claude-sonnet-4-20250514");
assert!(picker.is_visible());
assert_eq!(*picker.state(), PickerState::Ready);
assert!(!picker.models.is_empty());
}
#[test]
fn test_model_picker_navigation() {
let theme = Theme::default();
let mut picker = ModelPicker::new(theme);
picker.show(create_test_models(), "Test Provider", "model-a");
assert_eq!(picker.selected_index, 0);
picker.select_next();
assert_eq!(picker.selected_index, 1);
picker.select_next();
assert_eq!(picker.selected_index, 0); // Wraps around
picker.select_previous();
assert_eq!(picker.selected_index, 1);
}
#[test]
fn test_model_picker_filter() {
let theme = Theme::default();
let mut picker = ModelPicker::new(theme);
picker.show(create_test_models(), "Test Provider", "model-a");
assert_eq!(picker.filtered_indices.len(), 2);
picker.filter = "model-b".to_string();
picker.update_filter();
assert_eq!(picker.filtered_indices.len(), 1);
assert_eq!(picker.filtered_indices[0], 1);
}
#[test]
fn test_fallback_models_exist_for_all_providers() {
assert!(!get_fallback_models(ProviderType::Ollama).is_empty());
assert!(!get_fallback_models(ProviderType::Anthropic).is_empty());
assert!(!get_fallback_models(ProviderType::OpenAI).is_empty());
}
}

View File

@@ -1,4 +1,5 @@
use crossterm::event::{KeyCode, KeyEvent, KeyModifiers};
use llm_core::ProviderType;
use serde_json::Value;
/// Application events that drive the TUI
@@ -35,6 +36,12 @@ pub enum AppEvent {
ScrollDown,
/// Toggle the todo panel
ToggleTodo,
/// Switch to a specific provider
SwitchProvider(ProviderType),
/// Cycle to the next provider (Tab key)
CycleProvider,
/// Open model picker
OpenModelPicker,
/// Application should quit
Quit,
}

View File

@@ -40,8 +40,9 @@ impl AppLayout {
/// Calculate layout with todo panel of specified height
///
/// Layout with provider tabs:
/// - Header (1 line)
/// - Provider tabs (1 line)
/// - Top divider (1 line)
/// - Chat area (flexible)
/// - Todo panel (optional)
@@ -54,6 +55,7 @@ impl AppLayout {
.direction(Direction::Vertical)
.constraints([
Constraint::Length(1), // Header
Constraint::Length(1), // Provider tabs
Constraint::Length(1), // Top divider
Constraint::Min(5), // Chat area (flexible)
Constraint::Length(todo_height), // Todo panel
@@ -67,6 +69,7 @@ impl AppLayout {
.direction(Direction::Vertical)
.constraints([
Constraint::Length(1), // Header
Constraint::Length(1), // Provider tabs
Constraint::Length(1), // Top divider
Constraint::Min(5), // Chat area (flexible)
Constraint::Length(0), // No todo panel
@@ -79,13 +82,13 @@ impl AppLayout {
Self {
header_area: chunks[0],
tabs_area: chunks[1],
top_divider: chunks[2],
chat_area: chunks[3],
todo_area: chunks[4],
bottom_divider: chunks[5],
input_area: chunks[6],
status_area: chunks[7],
}
}
@@ -200,11 +203,15 @@ mod tests {
assert_eq!(layout.header_area.y, 0);
assert_eq!(layout.header_area.height, 1);
// Tabs should be after header
assert_eq!(layout.tabs_area.y, 1);
assert_eq!(layout.tabs_area.height, 1);
// Status should be at bottom
assert_eq!(layout.status_area.y, 39);
assert_eq!(layout.status_area.height, 1);
// Chat area should have most of the space (40 - header - tabs - divider*2 - input - status = 34)
assert!(layout.chat_area.height > 20);
}

View File

@@ -5,6 +5,7 @@ pub mod events;
pub mod formatting;
pub mod layout;
pub mod output;
pub mod provider_manager;
pub mod theme;
pub use app::TuiApp;
@@ -15,11 +16,13 @@ pub use formatting::{
FormattedContent, MarkdownRenderer, SyntaxHighlighter,
format_file_path, format_tool_name, format_error, format_success, format_warning, format_info,
};
pub use provider_manager::ProviderManager;
use auth_manager::AuthManager;
use color_eyre::eyre::Result;
use std::sync::Arc;
/// Run the TUI application with a single provider (legacy mode)
pub async fn run(
client: Arc<dyn llm_core::LlmProvider>,
opts: llm_core::ChatOptions,
@@ -29,3 +32,14 @@ pub async fn run(
let mut app = TuiApp::new(client, opts, perms, settings)?;
app.run().await
}
/// Run the TUI application with multi-provider support
pub async fn run_with_providers(
auth_manager: Arc<AuthManager>,
perms: permissions::PermissionManager,
settings: config_agent::Settings,
) -> Result<()> {
let provider_manager = ProviderManager::new(auth_manager, settings.clone());
let mut app = TuiApp::with_provider_manager(provider_manager, perms, settings)?;
app.run().await
}

View File

@@ -0,0 +1,417 @@
//! Provider Manager for In-Session Switching
//!
//! Manages multiple LLM providers with lazy initialization, enabling
//! seamless provider switching during a TUI session without restart.
use auth_manager::AuthManager;
use llm_anthropic::AnthropicClient;
use llm_core::{AuthMethod, LlmProvider, ModelInfo, ProviderInfo, ProviderType};
use llm_ollama::OllamaClient;
use llm_openai::OpenAIClient;
use std::collections::HashMap;
use std::sync::Arc;
/// Error type for provider operations
#[derive(Debug)]
pub enum ProviderError {
/// Provider requires authentication
AuthRequired(String),
/// Failed to create provider
CreationFailed(String),
/// Failed to list models
ModelListFailed(String),
}
impl std::fmt::Display for ProviderError {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self {
Self::AuthRequired(msg) => write!(f, "Authentication required: {}", msg),
Self::CreationFailed(msg) => write!(f, "Provider creation failed: {}", msg),
Self::ModelListFailed(msg) => write!(f, "Failed to list models: {}", msg),
}
}
}
impl std::error::Error for ProviderError {}
/// Manages multiple LLM providers with lazy initialization
pub struct ProviderManager {
/// Auth manager for retrieving credentials
auth_manager: Arc<AuthManager>,
/// Cached provider clients (created on-demand)
providers: HashMap<ProviderType, Arc<dyn LlmProvider>>,
/// Current model per provider
models: HashMap<ProviderType, String>,
/// Currently active provider
current_provider: ProviderType,
/// Ollama base URL for local instances
ollama_url: String,
/// Settings for fallback API keys
settings: config_agent::Settings,
}
impl ProviderManager {
/// Create a new provider manager
pub fn new(
auth_manager: Arc<AuthManager>,
settings: config_agent::Settings,
) -> Self {
// Determine initial provider from settings
let initial_provider = settings.get_provider().unwrap_or(ProviderType::Ollama);
// Initialize models from per-provider settings or defaults
let mut models = HashMap::new();
models.insert(
ProviderType::Ollama,
settings.get_model_for_provider(ProviderType::Ollama),
);
models.insert(
ProviderType::Anthropic,
settings.get_model_for_provider(ProviderType::Anthropic),
);
models.insert(
ProviderType::OpenAI,
settings.get_model_for_provider(ProviderType::OpenAI),
);
Self {
auth_manager,
providers: HashMap::new(),
models,
current_provider: initial_provider,
ollama_url: settings.ollama_url.clone(),
settings,
}
}
/// Get the currently active provider type
pub fn current_provider_type(&self) -> ProviderType {
self.current_provider
}
/// Get the current model for the active provider
pub fn current_model(&self) -> &str {
self.models
.get(&self.current_provider)
.map(|s| s.as_str())
.unwrap_or(self.current_provider.default_model())
}
/// Get the model for a specific provider
pub fn model_for_provider(&self, provider: ProviderType) -> &str {
self.models
.get(&provider)
.map(|s| s.as_str())
.unwrap_or(provider.default_model())
}
/// Set the model for a provider
pub fn set_model(&mut self, provider: ProviderType, model: String) {
self.models.insert(provider, model.clone());
// Update settings and save
self.settings.set_model_for_provider(provider, &model);
if let Err(e) = self.settings.save() {
// Log but don't fail - saving is best-effort
eprintln!("Warning: Failed to save model preference: {}", e);
}
// If provider is already initialized, we need to recreate it with the new model
// For simplicity, just remove it so it will be recreated on next access
self.providers.remove(&provider);
}
/// Set the model for the current provider
pub fn set_current_model(&mut self, model: String) {
self.set_model(self.current_provider, model);
}
/// Check if a provider is authenticated
pub fn is_authenticated(&self, provider: ProviderType) -> bool {
match provider {
ProviderType::Ollama => true, // Local Ollama doesn't need auth
_ => self.auth_manager.get_auth(provider).is_ok(),
}
}
/// Get the active provider client, creating it if necessary
pub fn get_provider(&mut self) -> Result<Arc<dyn LlmProvider>, ProviderError> {
self.get_provider_for_type(self.current_provider)
}
/// Get a specific provider client, creating it if necessary
pub fn get_provider_for_type(
&mut self,
provider_type: ProviderType,
) -> Result<Arc<dyn LlmProvider>, ProviderError> {
// Return cached provider if available
if let Some(provider) = self.providers.get(&provider_type) {
return Ok(Arc::clone(provider));
}
// Create new provider
let model = self.model_for_provider(provider_type).to_string();
let provider = self.create_provider(provider_type, &model)?;
// Cache and return
self.providers.insert(provider_type, Arc::clone(&provider));
Ok(provider)
}
/// Switch to a different provider
pub fn switch_provider(&mut self, provider_type: ProviderType) -> Result<Arc<dyn LlmProvider>, ProviderError> {
self.current_provider = provider_type;
self.get_provider()
}
/// Create a provider client with authentication
fn create_provider(
&self,
provider_type: ProviderType,
model: &str,
) -> Result<Arc<dyn LlmProvider>, ProviderError> {
match provider_type {
ProviderType::Ollama => {
// Check for Ollama Cloud vs local
let use_cloud = model.ends_with("-cloud");
let client = if use_cloud {
// Try to get Ollama Cloud API key
match self.auth_manager.get_auth(ProviderType::Ollama) {
Ok(AuthMethod::ApiKey(key)) => {
OllamaClient::with_cloud().with_api_key(key)
}
_ => {
return Err(ProviderError::AuthRequired(
"Ollama Cloud requires API key. Run 'owlen login ollama'".to_string()
));
}
}
} else {
// Local Ollama - no auth needed
let mut client = OllamaClient::new(&self.ollama_url);
// Add API key if available (for authenticated local instances)
if let Ok(AuthMethod::ApiKey(key)) = self.auth_manager.get_auth(ProviderType::Ollama) {
client = client.with_api_key(key);
}
client
};
Ok(Arc::new(client.with_model(model)) as Arc<dyn LlmProvider>)
}
ProviderType::Anthropic => {
// Try auth manager first, then settings fallback
let auth = self.auth_manager.get_auth(ProviderType::Anthropic)
.ok()
.or_else(|| self.settings.anthropic_api_key.clone().map(AuthMethod::ApiKey))
.ok_or_else(|| ProviderError::AuthRequired(
"Run 'owlen login anthropic' or set ANTHROPIC_API_KEY".to_string()
))?;
let client = AnthropicClient::with_auth(auth).with_model(model);
Ok(Arc::new(client) as Arc<dyn LlmProvider>)
}
ProviderType::OpenAI => {
// Try auth manager first, then settings fallback
let auth = self.auth_manager.get_auth(ProviderType::OpenAI)
.ok()
.or_else(|| self.settings.openai_api_key.clone().map(AuthMethod::ApiKey))
.ok_or_else(|| ProviderError::AuthRequired(
"Run 'owlen login openai' or set OPENAI_API_KEY".to_string()
))?;
let client = OpenAIClient::with_auth(auth).with_model(model);
Ok(Arc::new(client) as Arc<dyn LlmProvider>)
}
}
}
/// List available models for the current provider
pub async fn list_models(&self) -> Result<Vec<ModelInfo>, ProviderError> {
self.list_models_for_provider(self.current_provider).await
}
/// List available models for a specific provider (only tool-capable models)
pub async fn list_models_for_provider(
&self,
provider_type: ProviderType,
) -> Result<Vec<ModelInfo>, ProviderError> {
let models = match provider_type {
ProviderType::Ollama => {
// For Ollama, we need to fetch from the API
let client = OllamaClient::new(&self.ollama_url);
let mut models = client
.list_models()
.await
.map_err(|e| ProviderError::ModelListFailed(e.to_string()))?;
// Update supports_tools based on known tool-capable model patterns
for model in &mut models {
model.supports_tools = is_ollama_tool_capable(&model.id);
}
models
}
ProviderType::Anthropic => {
// Anthropic: return hardcoded list (no API endpoint)
let client = AnthropicClient::new("dummy"); // Key not needed for list_models
client
.list_models()
.await
.map_err(|e| ProviderError::ModelListFailed(e.to_string()))?
}
ProviderType::OpenAI => {
// OpenAI: return hardcoded list
let client = OpenAIClient::new("dummy");
client
.list_models()
.await
.map_err(|e| ProviderError::ModelListFailed(e.to_string()))?
}
};
// Filter to only tool-capable models
Ok(models.into_iter().filter(|m| m.supports_tools).collect())
}
/// Get authentication status for all providers
pub fn auth_status(&self) -> Vec<(ProviderType, bool, Option<String>)> {
vec![
(
ProviderType::Ollama,
true, // Always "authenticated" for local
Some("Local (no auth required)".to_string()),
),
(
ProviderType::Anthropic,
self.is_authenticated(ProviderType::Anthropic),
if self.is_authenticated(ProviderType::Anthropic) {
Some("API key configured".to_string())
} else {
Some("Not authenticated".to_string())
},
),
(
ProviderType::OpenAI,
self.is_authenticated(ProviderType::OpenAI),
if self.is_authenticated(ProviderType::OpenAI) {
Some("API key configured".to_string())
} else {
Some("Not authenticated".to_string())
},
),
]
}
}
/// Check if an Ollama model is known to support tool calling
fn is_ollama_tool_capable(model_id: &str) -> bool {
let model_lower = model_id.to_lowercase();
// Extract base model name (before the colon for size variants)
let base_name = model_lower.split(':').next().unwrap_or(&model_lower);
// Models known to support tool calling well
let tool_capable_patterns = [
"qwen", // Qwen models (qwen3, qwen2.5, etc.)
"llama3.1", // Llama 3.1 and above
"llama3.2", // Llama 3.2
"llama3.3", // Llama 3.3
"mistral", // Mistral models
"mixtral", // Mixtral models
"deepseek", // DeepSeek models
"command-r", // Cohere Command-R
"gemma2", // Gemma 2 (some versions)
"phi3", // Phi-3 models
"phi4", // Phi-4 models
"granite", // IBM Granite
"hermes", // Hermes models
"openhermes", // OpenHermes
"nous-hermes", // Nous Hermes
"dolphin", // Dolphin models
"wizard", // WizardLM
"codellama", // Code Llama
"starcoder", // StarCoder
"codegemma", // CodeGemma
"gpt-oss", // GPT-OSS models
];
// Check if model matches any known pattern
for pattern in tool_capable_patterns {
if base_name.contains(pattern) {
// Exclude small models (< 4B parameters) as they often struggle with tools
if let Some(size_part) = model_lower.split(':').nth(1) {
if is_small_model(size_part) {
return false;
}
}
return true;
}
}
false
}
/// Check if a model size string indicates a small model (< 4B parameters)
fn is_small_model(size_str: &str) -> bool {
// Extract the numeric part from strings like "1b", "1.5b", "3b", "8b", etc.
let size_lower = size_str.to_lowercase();
// Try to extract the number before 'b'
if let Some(b_pos) = size_lower.find('b') {
let num_part = &size_lower[..b_pos];
// Parse as float to handle "1.5", "0.5", etc.
if let Ok(size) = num_part.parse::<f32>() {
return size < 4.0; // Exclude models smaller than 4B
}
}
false // If we can't parse it, assume it's fine
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_default_models() {
let settings = config_agent::Settings::default();
let auth_manager = Arc::new(AuthManager::new().unwrap());
let manager = ProviderManager::new(auth_manager, settings);
// Check default models are set
assert!(!manager.model_for_provider(ProviderType::Ollama).is_empty());
assert!(!manager.model_for_provider(ProviderType::Anthropic).is_empty());
assert!(!manager.model_for_provider(ProviderType::OpenAI).is_empty());
}
#[test]
fn test_ollama_tool_capability_detection() {
// Tool-capable models
assert!(is_ollama_tool_capable("qwen3:8b"));
assert!(is_ollama_tool_capable("qwen2.5:7b"));
assert!(is_ollama_tool_capable("llama3.1:8b"));
assert!(is_ollama_tool_capable("llama3.2:8b"));
assert!(is_ollama_tool_capable("mistral:7b"));
assert!(is_ollama_tool_capable("deepseek-coder:6.7b"));
assert!(is_ollama_tool_capable("gpt-oss:120b-cloud"));
// Small models excluded (1b, 2b, 3b)
assert!(!is_ollama_tool_capable("llama3.2:1b"));
assert!(!is_ollama_tool_capable("llama3.2:3b"));
assert!(!is_ollama_tool_capable("qwen2.5:1.5b"));
// Unknown models
assert!(!is_ollama_tool_capable("gemma:7b")); // gemma (not gemma2)
assert!(!is_ollama_tool_capable("llama2:7b")); // llama2, not 3.x
assert!(!is_ollama_tool_capable("tinyllama:1b"));
}
}

View File

@@ -196,30 +196,46 @@ impl LlmProvider for OllamaClient {
.map_err(|e| LlmError::Http(e.to_string()))?;
let bytes_stream = resp.bytes_stream();
// NDJSON parser with buffering for partial lines across chunks
// Shares an Arc<Mutex<String>> so a trailing partial line carries over to the next chunk
use std::sync::{Arc, Mutex};
let buffer: Arc<Mutex<String>> = Arc::new(Mutex::new(String::new()));
let converted_stream = bytes_stream
.map(move |result| {
result.map_err(|e| LlmError::Http(e.to_string()))
})
.map_ok(move |bytes| {
let buffer = Arc::clone(&buffer);
// Convert the chunk to a UTF-8 string
let txt = String::from_utf8_lossy(&bytes).into_owned();
// Get the buffered incomplete line from previous chunk
let mut buf = buffer.lock().unwrap();
let combined = std::mem::take(&mut *buf) + &txt;
// Split by newlines, keeping track of complete vs incomplete lines
let mut results: Vec<Result<llm_core::StreamChunk, LlmError>> = Vec::new();
let mut lines: Vec<&str> = combined.split('\n').collect();
// If the data doesn't end with newline, the last element is incomplete
// Save it for the next chunk
if !combined.ends_with('\n') && !lines.is_empty() {
*buf = lines.pop().unwrap_or("").to_string();
}
// Parse all complete lines
for line in lines {
let trimmed = line.trim();
if !trimmed.is_empty() {
results.push(
serde_json::from_str::<ChatResponseChunk>(trimmed)
.map(|chunk| llm_core::StreamChunk::from(chunk))
.map_err(|e| LlmError::Json(format!("{}: {}", e, trimmed))),
);
}
}
futures::stream::iter(results)
})
.try_flatten();

View File

@@ -31,3 +31,4 @@ open = "5"
[dev-dependencies]
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
tempfile = "3"

View File

@@ -588,6 +588,20 @@ mod tests {
#[test]
fn test_env_override_loading() {
// Create a temp dir for config to avoid loading system credentials
let temp_dir = tempfile::tempdir().unwrap();
let temp_path = temp_dir.path().to_str().unwrap();
// Clear any env vars that might interfere and set XDG_CONFIG_HOME
// Also disable keyring to ensure we only use the temp config dir
// SAFETY: Single-threaded test context
unsafe {
std::env::set_var("XDG_CONFIG_HOME", temp_path);
std::env::set_var("OWLEN_KEYRING_DISABLE", "1");
std::env::remove_var("ANTHROPIC_API_KEY");
std::env::remove_var("OWLEN_ANTHROPIC_API_KEY");
}
// Set env var (unsafe in Rust 2024 due to potential thread safety issues)
// SAFETY: This is a single-threaded test, no concurrent access
unsafe {
@@ -606,15 +620,37 @@ mod tests {
// SAFETY: Single-threaded test
unsafe {
std::env::remove_var("ANTHROPIC_API_KEY");
std::env::remove_var("XDG_CONFIG_HOME");
std::env::remove_var("OWLEN_KEYRING_DISABLE");
}
}
#[test]
fn test_ollama_no_auth_required() {
// Create a temp dir for config
let temp_dir = tempfile::tempdir().unwrap();
let temp_path = temp_dir.path().to_str().unwrap();
// Clear any env vars that might interfere and set XDG_CONFIG_HOME
// Also disable keyring
// SAFETY: Single-threaded test context
unsafe {
std::env::set_var("XDG_CONFIG_HOME", temp_path);
std::env::set_var("OWLEN_KEYRING_DISABLE", "1");
std::env::remove_var("OLLAMA_API_KEY");
std::env::remove_var("OWLEN_API_KEY");
}
let manager = AuthManager::new().unwrap();
let auth = manager.get_auth(ProviderType::Ollama).unwrap();
assert!(matches!(auth, AuthMethod::None));
// Clean up
unsafe {
std::env::remove_var("XDG_CONFIG_HOME");
std::env::remove_var("OWLEN_KEYRING_DISABLE");
}
}
#[test]

View File

@@ -9,6 +9,7 @@ rust-version.workspace = true
serde = { version = "1", features = ["derive"] }
directories = "5"
figment = { version = "0.10", features = ["toml", "env"] }
toml = "0.8"
permissions = { path = "../permissions" }
llm-core = { path = "../../llm/core" }

View File

@@ -28,6 +28,11 @@ impl KeyringStore {
/// Check if the keyring is available on this system
fn check_availability() -> bool {
// Allow disabling keyring via environment variable
if std::env::var("OWLEN_KEYRING_DISABLE").is_ok_and(|v| v == "1" || v.to_lowercase() == "true") {
return false;
}
// Try to actually store and delete a test entry
// Entry::new() always succeeds on Linux, we need to test set_password()
match Entry::new(SERVICE_NAME, "__test_availability__") {

View File

@@ -0,0 +1,303 @@
# Subagent Orchestration Enhancement
This document describes the enhanced Task tool with proper subagent orchestration support using plugin agents.
## Overview
The Task tool has been enhanced to support a new architecture for spawning and managing specialized subagents. The system now integrates with the plugin system's `AgentDefinition` type, allowing both built-in and plugin-provided agents to be orchestrated.
## Key Components
### 1. SubagentConfig
Configuration structure for spawning subagents:
```rust
pub struct SubagentConfig {
/// Agent type/name (e.g., "code-reviewer", "explore")
pub agent_type: String,
/// Task prompt for the agent
pub prompt: String,
/// Optional model override
pub model: Option<String>,
/// Tool whitelist (if None, uses agent's default)
pub tools: Option<Vec<String>>,
/// Parsed agent definition (if from plugin)
pub definition: Option<AgentDefinition>,
}
```
**Builder Pattern:**
```rust
let config = SubagentConfig::new("explore".to_string(), "Find all Rust files".to_string())
.with_model("claude-3-opus".to_string())
.with_tools(vec!["read".to_string(), "glob".to_string()]);
```
### 2. SubagentRegistry
Thread-safe registry for tracking available agents:
```rust
pub struct SubagentRegistry {
agents: Arc<RwLock<HashMap<String, AgentDefinition>>>,
}
```
**Key Methods:**
- `new()` - Create empty registry
- `register_builtin()` - Register built-in agents
- `register_from_plugins(Vec<AgentDefinition>)` - Register plugin agents
- `get(name: &str)` - Get agent by name
- `list()` - List all agents with descriptions
- `contains(name: &str)` - Check if agent exists
- `agent_names()` - Get all agent names
**Usage:**
```rust
let registry = SubagentRegistry::new();
registry.register_builtin();
// Load plugin agents
let plugin_manager = PluginManager::new();
plugin_manager.load_all()?;
let plugin_agents = plugin_manager.load_all_agents();
registry.register_from_plugins(plugin_agents);
// Use registry
if let Some(agent) = registry.get("explore") {
println!("Agent: {} - {}", agent.name, agent.description);
}
```
### 3. Built-in Agents
The system includes six specialized built-in agents:
#### explore
- **Purpose:** Codebase exploration
- **Tools:** read, glob, grep, ls
- **Color:** blue
- **Use Cases:** Finding files, understanding structure
#### plan
- **Purpose:** Implementation planning
- **Tools:** read, glob, grep
- **Color:** green
- **Use Cases:** Designing architectures, creating strategies
#### code-reviewer
- **Purpose:** Code analysis
- **Tools:** read, grep, glob (read-only)
- **Color:** yellow
- **Use Cases:** Quality review, bug detection
#### test-writer
- **Purpose:** Test creation
- **Tools:** read, write, edit, grep, glob
- **Color:** cyan
- **Use Cases:** Writing unit tests, integration tests
#### doc-writer
- **Purpose:** Documentation
- **Tools:** read, write, edit, grep, glob
- **Color:** magenta
- **Use Cases:** Writing READMEs, API docs
#### refactorer
- **Purpose:** Code refactoring
- **Tools:** read, write, edit, grep, glob (no bash)
- **Color:** red
- **Use Cases:** Improving structure, applying patterns
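The six built-in agents above lend themselves to a data-driven registration table. A minimal sketch (the `Agent` struct here is a simplified stand-in for the real `AgentDefinition`):

```rust
// Simplified stand-in for AgentDefinition; field names are illustrative.
struct Agent {
    name: &'static str,
    tools: &'static [&'static str],
    color: &'static str,
}

// The built-in agent table, mirroring the list above.
const BUILTINS: &[Agent] = &[
    Agent { name: "explore", tools: &["read", "glob", "grep", "ls"], color: "blue" },
    Agent { name: "plan", tools: &["read", "glob", "grep"], color: "green" },
    Agent { name: "code-reviewer", tools: &["read", "grep", "glob"], color: "yellow" },
    Agent { name: "test-writer", tools: &["read", "write", "edit", "grep", "glob"], color: "cyan" },
    Agent { name: "doc-writer", tools: &["read", "write", "edit", "grep", "glob"], color: "magenta" },
    Agent { name: "refactorer", tools: &["read", "write", "edit", "grep", "glob"], color: "red" },
];

fn main() {
    assert_eq!(BUILTINS.len(), 6);
    // Only the read-only agents lack the "write" tool.
    let read_only: Vec<_> = BUILTINS
        .iter()
        .filter(|a| !a.tools.contains(&"write"))
        .map(|a| a.name)
        .collect();
    assert_eq!(read_only, vec!["explore", "plan", "code-reviewer"]);
    println!("{read_only:?}");
}
```

A table like this keeps `register_builtin()` to a single loop over constant data rather than six hand-written registration calls.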
## Future Implementation
The following functions will be implemented to complete the orchestration system:
### spawn_subagent
```rust
/// Spawn a subagent with the given configuration
pub async fn spawn_subagent<P: LlmProvider>(
provider: &P,
registry: &SubagentRegistry,
config: SubagentConfig,
perms: &PermissionManager,
) -> Result<String>
```
**Behavior:**
1. Look up agent definition from registry or config
2. Extract system prompt and tool whitelist from definition
3. Build full prompt combining system prompt + task
4. Create filtered permission manager if tool whitelist specified
5. Run agent loop with system prompt
6. Return result string
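Steps 1-4 (lookup, extraction, prompt assembly, whitelist override) can be sketched independently of the agent loop. The types below are hypothetical simplifications of `AgentDefinition` and the registry, not the real crate API:

```rust
use std::collections::HashMap;

// Hypothetical, simplified stand-in for AgentDefinition.
struct AgentDef {
    system_prompt: String,
    tools: Vec<String>,
}

/// Resolve an agent and build the (prompt, tool whitelist) pair for spawning.
fn build_subagent_prompt(
    registry: &HashMap<String, AgentDef>,
    agent_type: &str,
    task: &str,
    tool_override: Option<&[String]>,
) -> Option<(String, Vec<String>)> {
    let def = registry.get(agent_type)?; // step 1: look up definition
    let tools = tool_override // step 4: config whitelist overrides the default
        .map(|t| t.to_vec())
        .unwrap_or_else(|| def.tools.clone());
    // step 3: combine system prompt + task
    let prompt = format!("{}\n\nTask: {}", def.system_prompt, task);
    Some((prompt, tools))
}

fn main() {
    let mut registry = HashMap::new();
    registry.insert(
        "explore".to_string(),
        AgentDef {
            system_prompt: "You explore codebases.".into(),
            tools: vec!["read".into(), "glob".into()],
        },
    );
    let (prompt, tools) =
        build_subagent_prompt(&registry, "explore", "Find all tests", None).unwrap();
    assert!(prompt.contains("Task: Find all tests"));
    assert_eq!(tools, vec!["read".to_string(), "glob".to_string()]);
    println!("{prompt}");
}
```

The real implementation would then hand `prompt` and the filtered tool set to the agent loop (steps 5-6).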
### spawn_parallel
```rust
/// Spawn multiple subagents in parallel and collect results
pub async fn spawn_parallel<P: LlmProvider + Clone>(
provider: &P,
registry: &SubagentRegistry,
configs: Vec<SubagentConfig>,
perms: &PermissionManager,
) -> Vec<Result<String>>
```
**Behavior:**
1. Create futures for each config
2. Execute all in parallel using `join_all`
3. Return vector of results
**Note:** Requires `PermissionManager` to implement `Clone`. This may need to be added to the permissions crate.
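The fan-out/collect shape of `spawn_parallel` can be sketched without the async machinery; here OS threads stand in for `join_all`-driven futures, and `run_one` is a placeholder for the real `spawn_subagent`:

```rust
use std::thread;

// Placeholder for spawn_subagent: takes an agent name and prompt, returns a result.
fn run_one(agent: &str, prompt: &str) -> Result<String, String> {
    Ok(format!("[{agent}] {prompt}"))
}

fn main() {
    let configs = vec![
        ("explore", "find test files"),
        ("code-reviewer", "review auth module"),
    ];
    // Fan out: one worker per config.
    let handles: Vec<_> = configs
        .into_iter()
        .map(|(agent, prompt)| thread::spawn(move || run_one(agent, prompt)))
        .collect();
    // Collect: preserve order, keep per-agent errors separate.
    let results: Vec<Result<String, String>> =
        handles.into_iter().map(|h| h.join().unwrap()).collect();
    assert_eq!(results.len(), 2);
    for r in &results {
        println!("{}", r.as_ref().unwrap());
    }
}
```

As in the planned API, failures stay isolated per agent: one `Err` in the vector does not abort the siblings.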
## Integration Points
### With Plugin System
```rust
use plugins::PluginManager;
use tools_task::SubagentRegistry;
let mut plugin_manager = PluginManager::new();
plugin_manager.load_all()?;
let registry = SubagentRegistry::new();
registry.register_builtin();
registry.register_from_plugins(plugin_manager.load_all_agents());
```
### With Agent Core
The subagent execution will integrate with `agent-core` by:
1. Calling the same `run_agent_loop` function used by main agent
2. Passing filtered tool definitions based on agent's tool whitelist
3. Using agent-specific system prompts
4. Inheriting parent's permission manager (or creating restricted copy)
### With Permission System
Subagents respect the permission system:
- Tool whitelist from agent definition restricts available tools
- Permission manager checks are still applied
- Parent's mode (plan/acceptEdits/code) is inherited
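The "restricted copy" idea amounts to intersecting the agent's whitelist with what the parent already permits. A minimal sketch (function and set representation are illustrative, not the `PermissionManager` API):

```rust
use std::collections::HashSet;

/// Restrict a parent's allowed tools to an agent's whitelist.
/// A tool survives only if both the whitelist and the parent grant it.
fn restrict(parent: &HashSet<&str>, whitelist: &[&str]) -> HashSet<String> {
    whitelist
        .iter()
        .filter(|t| parent.contains(*t))
        .map(|t| t.to_string())
        .collect()
}

fn main() {
    let parent: HashSet<&str> = HashSet::from(["read", "write", "bash", "grep"]);
    // Whitelist asks for a tool the parent never granted.
    let restricted = restrict(&parent, &["read", "grep", "web_fetch"]);
    assert!(restricted.contains("read"));
    assert!(!restricted.contains("web_fetch")); // not granted by parent
    println!("{restricted:?}");
}
```

This keeps the invariant that a subagent can never escalate beyond its parent's permissions, regardless of what its definition requests.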
## Example Usage Patterns
### Basic Exploration
```rust
let registry = SubagentRegistry::new();
registry.register_builtin();
let config = SubagentConfig::new(
"explore".to_string(),
"Find all test files in the codebase".to_string()
);
let result = spawn_subagent(&provider, &registry, config, &perms).await?;
println!("Found files:\n{}", result);
```
### Parallel Analysis
```rust
let configs = vec![
SubagentConfig::new("explore".to_string(), "Find all Rust files".to_string()),
SubagentConfig::new("code-reviewer".to_string(), "Review auth module".to_string()),
SubagentConfig::new("test-writer".to_string(), "Check test coverage".to_string()),
];
let results = spawn_parallel(&provider, &registry, configs, &perms).await;
for (i, result) in results.iter().enumerate() {
match result {
Ok(output) => println!("Agent {} completed:\n{}", i, output),
Err(e) => eprintln!("Agent {} failed: {}", i, e),
}
}
```
### Custom Plugin Agent
```rust
// Plugin provides custom-analyzer agent
let config = SubagentConfig::new(
"custom-analyzer".to_string(),
"Analyze security vulnerabilities".to_string()
);
if registry.contains("custom-analyzer") {
let result = spawn_subagent(&provider, &registry, config, &perms).await?;
} else {
eprintln!("Agent not found. Available: {:?}", registry.agent_names());
}
```
## Migration Guide
### From Legacy Subagent API
The legacy `Subagent` struct and keyword-based matching are still available for backward compatibility:
```rust
// Legacy API (still works)
let agent = Subagent::new(
"reader".to_string(),
"Read-only agent".to_string(),
vec!["read".to_string()],
vec![Tool::Read, Tool::Grep],
);
```
**Migrate to new API:**
1. Use `SubagentRegistry` instead of custom keyword matching
2. Use `SubagentConfig` instead of direct agent instantiation
3. Use `spawn_subagent` instead of manual tool execution
## Testing
Run tests:
```bash
cargo test -p tools-task
```
All tests pass, including:
- Registry builtin registration
- Plugin agent registration
- Config builder pattern
- Legacy API backward compatibility
## Dependencies
- `plugins` - For `AgentDefinition` type
- `parking_lot` - For `RwLock` in thread-safe registry
- `permissions` - For tool permission checks
- `color-eyre` - For error handling
- `serde` / `serde_json` - For serialization
## Future Work
1. **Implement spawn_subagent:** Complete the actual subagent spawning logic
2. **Add Clone to PermissionManager:** Required for parallel execution
3. **System Prompt Support:** Ensure agent loop respects system prompts
4. **Tool Filtering:** Implement filtered tool definitions based on whitelist
5. **Progress Tracking:** Add hooks for monitoring subagent progress
6. **Error Recovery:** Handle subagent failures gracefully
7. **Resource Limits:** Add timeout and resource constraints
8. **Inter-Agent Communication:** Allow agents to share context
## Related Files
- `crates/tools/task/src/lib.rs` - Main implementation
- `crates/tools/task/Cargo.toml` - Dependencies
- `crates/platform/plugins/src/lib.rs` - AgentDefinition type
- `crates/core/agent/src/lib.rs` - Agent execution loop
- `crates/platform/permissions/src/lib.rs` - Permission system