
# Owlen Architecture
This document provides a high-level overview of the Owlen architecture. Its purpose is to help developers understand how the different parts of the application fit together.
## Core Concepts
The architecture is designed to be modular and extensible, centered around a few key concepts:
- **Providers**: Connect to various LLM APIs (Ollama, OpenAI, etc.).
- **Session**: Manages the conversation history and state.
- **TUI**: The terminal user interface, built with `ratatui`.
- **Events**: A system for handling user input and other events.
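As a rough sketch, the `Provider` abstraction can be pictured as an async trait along these lines (the method names and signatures here are illustrative assumptions, not the actual `owlen-core` API):
```rust
use async_trait::async_trait;

/// Illustrative shape of the provider abstraction; the real trait in
/// `owlen-core` may differ in names and signatures.
#[async_trait]
pub trait Provider {
    /// Send the conversation so far (as role/content pairs) to the LLM API.
    async fn chat(&self, messages: Vec<(String, String)>) -> anyhow::Result<String>;
    /// List the model names this provider can serve.
    async fn list_models(&self) -> anyhow::Result<Vec<String>>;
}
```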
## Component Interaction
A simplified diagram of how components interact:
```
[User Input] -> [Event Loop] -> [Session Controller] -> [Provider]
      ^                                                     |
      |                                                     v
[TUI Renderer] <------------------------------------ [API Response]
```
1. **User Input**: The user interacts with the TUI, generating events (e.g., key presses).
2. **Event Loop**: The main event loop in `owlen-tui` captures these events.
3. **Session Controller**: The event is processed, and if it's a prompt, the session controller sends a request to the current provider.
4. **Provider**: The provider formats the request for the specific LLM API and sends it.
5. **API Response**: The LLM API returns a response.
6. **TUI Renderer**: The response is processed, the session state is updated, and the TUI is re-rendered to display the new information.
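In code, this cycle reduces to something like the following sketch (every name here is a hypothetical stand-in for the real `owlen-tui` types):
```rust
// Hypothetical event loop illustrating the flow above; the real loop in
// owlen-tui is asynchronous and handles more event variants.
loop {
    match event_handler.next().await? {
        // A key press may complete a prompt, which goes to the provider.
        Event::Key(key) => {
            if let Some(prompt) = app.handle_key(key) {
                session.send_to_provider(prompt).await?;
            }
        }
        // Provider responses arrive as asynchronous messages.
        Event::Message(response) => session.apply_response(response),
        // Ticks only trigger the redraw below.
        Event::Tick => {}
    }
    tui.draw(&app)?; // re-render after every state change
}
```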
## Crate Breakdown
- `owlen-core`: Defines the core traits and data structures, like `Provider` and `Session`. Also contains the MCP client implementation.
- `owlen-tui`: Contains all the logic for the terminal user interface, including event handling and rendering.
- `owlen-cli`: The command-line entry point, responsible for parsing arguments and starting the TUI.
- `owlen-mcp-llm-server`: MCP server that wraps Ollama providers and exposes them via the Model Context Protocol.
- `owlen-mcp-server`: Generic MCP server for file operations and resource management.
- `owlen-ollama`: Direct Ollama provider implementation (legacy, used only by MCP servers).
## MCP Architecture (Phase 10)
As of Phase 10, Owlen uses an **MCP-only architecture** in which all LLM interactions go through the Model Context Protocol:
```
[TUI/CLI] -> [RemoteMcpClient] -> [MCP LLM Server] -> [Ollama Provider] -> [Ollama API]
```
### Benefits of MCP Architecture
1. **Separation of Concerns**: The TUI/CLI never directly instantiates provider implementations.
2. **Process Isolation**: LLM interactions run in a separate process, improving stability.
3. **Extensibility**: New providers can be added by implementing MCP servers.
4. **Multi-Transport**: Supports STDIO, HTTP, and WebSocket transports.
5. **Tool Integration**: MCP servers can expose tools (file operations, web search, etc.) to the LLM.
### MCP Communication Flow
1. **Client Creation**: `RemoteMcpClient::new()` spawns an MCP server binary via STDIO.
2. **Initialization**: Client sends `initialize` request to establish protocol version.
3. **Tool Discovery**: Client calls `tools/list` to discover available LLM operations.
4. **Chat Requests**: Client calls the `generate_text` tool with chat parameters.
5. **Streaming**: Server sends progress notifications during generation, then final response.
6. **Response Handling**: Client skips notifications and returns the final text to the caller.
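The wire traffic is ordinary JSON-RPC 2.0. A sketch of the first frames, built with `serde_json` (the method names come from the MCP specification; the `generate_text` argument shape is an assumption for illustration):
```rust
use serde_json::json;

// Step 2: establish the protocol version.
let initialize = json!({
    "jsonrpc": "2.0", "id": 1, "method": "initialize",
    "params": { "protocolVersion": "2024-11-05" }
});
// Step 3: discover the server's tools (generate_text should be among them).
let list_tools = json!({ "jsonrpc": "2.0", "id": 2, "method": "tools/list" });
// Step 4: invoke the LLM through the generate_text tool.
let chat = json!({
    "jsonrpc": "2.0", "id": 3, "method": "tools/call",
    "params": {
        "name": "generate_text",
        // Argument shape is assumed for this example.
        "arguments": { "model": "llama3.1", "messages": [
            { "role": "user", "content": "Hello" }
        ] }
    }
});
```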
### Cloud Provider Support
For Ollama Cloud providers, the MCP server accepts an `OLLAMA_URL` environment variable:
```rust
use std::collections::HashMap;

let env_vars = HashMap::from([
    ("OLLAMA_URL".to_string(), "https://cloud-provider-url".to_string()),
]);
let config = McpServerConfig {
    command: "path/to/owlen-mcp-llm-server".to_string(),
    env: env_vars,
    transport: "stdio".to_string(),
    // remaining fields elided; assumes McpServerConfig implements Default
    ..Default::default()
};
let client = RemoteMcpClient::new_with_config(&config)?;
```
## Session Management
The session management system is responsible for tracking the state of a conversation. The two main structs are:
- **`Conversation`**: Found in `owlen-core`, this struct holds the messages of a single conversation, the model being used, and other metadata. It is a simple data container.
- **`SessionController`**: This is the high-level controller that manages the active conversation. It handles:
- Storing and retrieving conversation history via the `ConversationManager`.
- Managing the context that is sent to the LLM provider.
- Switching between different models.
- Sending requests to the provider and handling the responses (both streaming and complete).
When a user sends a message, the `SessionController` adds the message to the current `Conversation`, sends the updated message list to the `Provider`, and then adds the provider's response to the `Conversation`.
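Put together, the send path looks roughly like this (hypothetical method names; the real controller also handles streaming responses):
```rust
// Hypothetical sketch of the send path described above.
impl SessionController {
    pub async fn send_message(&mut self, text: String) -> anyhow::Result<()> {
        // 1. Record the user's message in the active conversation.
        self.conversation.push(Message::user(text));
        // 2. Send the accumulated context to the current provider.
        let reply = self.provider.chat(self.conversation.messages()).await?;
        // 3. Record the provider's response.
        self.conversation.push(Message::assistant(reply));
        Ok(())
    }
}
```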
## Event Flow
The event flow is managed by the `EventHandler` in `owlen-tui`. It operates in a loop, waiting for events and dispatching them to the active application (`ChatApp` or `CodeApp`).
1. **Event Source**: Events are primarily generated by `crossterm` from user keyboard input. Asynchronous events, like responses from a `Provider`, are also fed into the event system via a `tokio::sync::mpsc` channel.
2. **`EventHandler::next()`**: The main application loop calls this method to wait for the next event.
3. **Event Enum**: Events are defined in the `owlen_tui::events::Event` enum. This includes `Key` events, `Tick` events (for UI updates), and `Message` events (for async provider data).
4. **Dispatch**: The application's `run` method matches on the `Event` type and calls the appropriate handler function (e.g., `dispatch_key_event`).
5. **State Update**: The handler function updates the application state based on the event. For example, a key press might change the `InputMode` or modify the text in the input buffer.
6. **Re-render**: After the state is updated, the UI is re-rendered to reflect the changes.
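A simplified shape of the enum (the variants are those named above; their payload types are assumptions for illustration):
```rust
// Simplified; the real enum lives in owlen_tui::events.
pub enum Event {
    /// A key press captured by crossterm.
    Key(crossterm::event::KeyEvent),
    /// Periodic timer driving UI updates.
    Tick,
    /// Asynchronous data from a Provider, delivered over tokio::sync::mpsc.
    Message(String),
}
```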
## TUI Rendering Pipeline
The TUI is rendered on each iteration of the main application loop in `owlen-tui`. The process is as follows:
1. **`tui.draw()`**: The main loop calls this method, passing the current application state.
2. **`Terminal::draw()`**: This method, from `ratatui`, takes a closure that receives a `Frame`.
3. **UI Composition**: Inside the closure, the UI is built by composing `ratatui` widgets. The root UI is defined in `owlen_tui::ui::render`, which builds the main layout and calls other functions to render specific components (like the chat panel, input box, etc.).
4. **State-Driven Rendering**: Each rendering function takes the current application state as an argument. It uses this state to decide what and how to render. For example, the border color of a panel might change if it is focused.
5. **Buffer and Diff**: `ratatui` does not draw directly to the terminal. Instead, it renders the widgets to an in-memory buffer. It then compares this buffer to the previous buffer and only sends the necessary changes to the terminal. This is highly efficient and prevents flickering.
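In `ratatui` terms, one iteration of the pipeline boils down to a single `Terminal::draw` call (the `render` signature is assumed here):
```rust
// One frame: compose widgets from the current state into ratatui's
// back buffer, then let ratatui diff it against the previous buffer
// and flush only the changes to the terminal.
terminal.draw(|frame| {
    owlen_tui::ui::render(frame, &app_state);
})?;
```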