Owlen Architecture

This document provides a high-level overview of the Owlen architecture. Its purpose is to help developers understand how the different parts of the application fit together.

Core Concepts

The architecture is designed to be modular and extensible, centered around a few key concepts:

  • Providers: Connect to various LLM APIs (Ollama, OpenAI, etc.).
  • Session: Manages the conversation history and state.
  • TUI: The terminal user interface, built with ratatui.
  • Events: A system for handling user input and other events.

Component Interaction

A simplified diagram of how components interact:

[User Input] -> [Event Loop] -> [Session Controller] -> [Provider]
      ^                                                     |
      |                                                     v
[TUI Renderer] <------------------------------------ [API Response]

  1. User Input: The user interacts with the TUI, generating events (e.g., key presses).
  2. Event Loop: The main event loop in owlen-tui captures these events.
  3. Session Controller: The event is processed, and if it's a prompt, the session controller sends a request to the current provider.
  4. Provider: The provider formats the request for the specific LLM API and sends it.
  5. API Response: The LLM API returns a response.
  6. TUI Renderer: The response is processed, the session state is updated, and the TUI is re-rendered to display the new information.

Crate Breakdown

  • owlen-core: Defines the LlmProvider abstraction, routing, configuration, session state, encryption, and the MCP client layer. This crate is UI-agnostic and must not depend on concrete providers, terminals, or blocking I/O.
  • owlen-tui: Hosts all terminal UI behaviour (event loop, rendering, input modes) while delegating business logic and provider access back to owlen-core.
  • owlen-cli: Small entry point that parses command-line options, resolves configuration, selects providers, and launches either the TUI or headless agent flows by calling into owlen-core.
  • owlen-mcp-llm-server: Runs concrete providers (e.g., Ollama) behind an MCP boundary, exposing them as generate_text tools. This crate owns provider-specific wiring and process sandboxing.
  • owlen-mcp-server: Generic MCP server for file operations and resource management.
  • owlen-ollama: Direct Ollama provider implementation (legacy, used only by MCP servers).

Boundary Guidelines

  • owlen-core: The dependency ceiling for most crates. Keep it free of terminal logic, CLIs, or provider-specific HTTP clients. New features should expose traits or data types here and let other crates supply concrete implementations.
  • owlen-cli: Only orchestrates startup/shutdown. Avoid adding business logic; when a new command needs behaviour, implement it in owlen-core or another library crate and invoke it from the CLI.
  • owlen-mcp-llm-server: The only crate that should directly talk to Ollama (or other provider processes). TUI/CLI code communicates with providers exclusively through MCP clients in owlen-core.
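
The split described above (abstractions in owlen-core, concrete implementations elsewhere) can be sketched as follows. This is a minimal illustration, not Owlen's actual API: the trait methods, the ask helper, and EchoProvider are all hypothetical, and the sketch is synchronous for brevity, whereas a real provider trait would likely be async.

```rust
// Hypothetical sketch: names and signatures are assumptions, not Owlen's real API.

/// What a core crate might expose: an abstraction only, with no HTTP or terminal code.
pub trait LlmProvider {
    /// Short identifier for the provider.
    fn name(&self) -> &str;
    /// Produce a completion for the given prompt.
    fn generate(&self, prompt: &str) -> Result<String, String>;
}

/// Core-side logic written purely against the trait.
pub fn ask(provider: &dyn LlmProvider, prompt: &str) -> Result<String, String> {
    if prompt.trim().is_empty() {
        return Err("empty prompt".to_string());
    }
    provider.generate(prompt)
}

/// What a separate crate might supply: a concrete implementation.
/// This one only echoes the prompt so the example stays self-contained.
pub struct EchoProvider;

impl LlmProvider for EchoProvider {
    fn name(&self) -> &str {
        "echo"
    }

    fn generate(&self, prompt: &str) -> Result<String, String> {
        Ok(format!("echo: {prompt}"))
    }
}
```

Because ask only sees &dyn LlmProvider, swapping EchoProvider for an MCP-backed implementation would not require changes on the core side.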

MCP Architecture (Phase 10)

As of Phase 10, Owlen uses an MCP-only architecture in which all LLM interactions go through the Model Context Protocol:

[TUI/CLI] -> [RemoteMcpClient] -> [MCP LLM Server] -> [Ollama Provider] -> [Ollama API]

Benefits of MCP Architecture

  1. Separation of Concerns: The TUI/CLI never directly instantiates provider implementations.
  2. Process Isolation: LLM interactions run in a separate process, improving stability.
  3. Extensibility: New providers can be added by implementing MCP servers.
  4. Multi-Transport: Supports STDIO, HTTP, and WebSocket transports.
  5. Tool Integration: MCP servers can expose tools (file operations, web search, etc.) to the LLM.

MCP Communication Flow

  1. Client Creation: RemoteMcpClient::new() spawns an MCP server binary and communicates with it over STDIO.
  2. Initialization: Client sends initialize request to establish protocol version.
  3. Tool Discovery: Client calls tools/list to discover available LLM operations.
  4. Chat Requests: Client calls the generate_text tool with chat parameters.
  5. Streaming: Server sends progress notifications during generation, then final response.
  6. Response Handling: Client skips notifications and returns the final text to the caller.
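
Steps 5 and 6 can be sketched as a filter over the messages the server emits for one request. The ServerMessage type and final_text function below are hypothetical stand-ins rather than Owlen's real types. The method names in CLIENT_REQUESTS come from the MCP specification, where tools are invoked through tools/call.

```rust
// Hypothetical sketch of the client side of steps 2-6; type names are assumptions.

/// JSON-RPC methods the client issues, in order. The last one carries the
/// tool name (here it would be `generate_text`) in its parameters.
pub const CLIENT_REQUESTS: [&str; 3] = ["initialize", "tools/list", "tools/call"];

/// Messages a server might emit while handling one `generate_text` call.
#[derive(Debug, Clone, PartialEq)]
pub enum ServerMessage {
    /// Progress notification: has no request id and expects no reply.
    Progress { partial: String },
    /// Final response, correlated with the request by id.
    Response { id: u64, text: String },
}

/// Skip progress notifications and return the text of the response
/// whose id matches `request_id`, if one arrives.
pub fn final_text<I>(messages: I, request_id: u64) -> Option<String>
where
    I: IntoIterator<Item = ServerMessage>,
{
    messages.into_iter().find_map(|msg| match msg {
        ServerMessage::Response { id, text } if id == request_id => Some(text),
        _ => None,
    })
}
```

A streaming UI would surface the Progress payloads instead of discarding them. This sketch only models the non-streaming path described in step 6.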

Cloud Provider Support

For Ollama Cloud providers, the MCP server accepts an OLLAMA_URL environment variable:

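use std::collections::HashMap;

// Point the spawned MCP server at a remote Ollama endpoint.
let env_vars = HashMap::from([
    ("OLLAMA_URL".to_string(), "https://cloud-provider-url".to_string()),
]);
let config = McpServerConfig {
    command: "path/to/owlen-mcp-llm-server",
    env: env_vars, // passed to the server process as environment variables
    transport: "stdio",
    ...
};
// Spawn the server with this configuration and connect to it.
let client = RemoteMcpClient::new_with_config(&config)?;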

Vim Mode State Machine

The TUI follows a Vim-inspired modal workflow. Maintaining the transitions keeps keyboard handling predictable:

  • Normal → Insert: triggered by keys such as i, a, or o; pressing Esc returns to Normal.
  • Normal → Visual: v enters visual selection; Esc or completing a selection returns to Normal.
  • Normal → Command: : opens command mode; executing a command or cancelling with Esc returns to Normal.
  • Normal → Auxiliary modes: ? (help), :provider, :model, and similar commands open transient overlays that always exit back to Normal once dismissed.
  • Insert/Visual/Command → Normal: pressing Esc always restores the neutral state.

The status line shows the active mode (for example, “Normal mode • Press F1 for help”), which doubles as a quick regression check during manual testing.
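
The transitions listed above fit in a single pure function, which makes them easy to test without a terminal. The sketch below is an assumption about shape only: the Mode and Key enums and next_mode are hypothetical names, and the auxiliary overlays are left out.

```rust
// Hypothetical sketch of the modal transitions; not Owlen's actual types.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    Normal,
    Insert,
    Visual,
    Command,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Key {
    Char(char),
    Esc,
    Enter,
}

/// Return the mode that follows `mode` after `key` is pressed.
pub fn next_mode(mode: Mode, key: Key) -> Mode {
    match (mode, key) {
        // Normal is the hub: every other mode is entered from here.
        (Mode::Normal, Key::Char('i' | 'a' | 'o')) => Mode::Insert,
        (Mode::Normal, Key::Char('v')) => Mode::Visual,
        (Mode::Normal, Key::Char(':')) => Mode::Command,
        // Executing a command returns to Normal.
        (Mode::Command, Key::Enter) => Mode::Normal,
        // Esc always restores the neutral state.
        (_, Key::Esc) => Mode::Normal,
        // Any other key leaves the mode unchanged.
        (current, _) => current,
    }
}
```

Keeping the function free of side effects means a regression such as "v switches modes while typing in Insert" can be caught by a one-line assertion.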

Session Management

The session management system is responsible for tracking the state of a conversation. The two main structs are:

  • Conversation: Found in owlen-core, this struct holds the messages of a single conversation, the model being used, and other metadata. It is a simple data container.
  • SessionController: This is the high-level controller that manages the active conversation. It handles:
    • Storing and retrieving conversation history via the ConversationManager.
    • Managing the context that is sent to the LLM provider.
    • Switching between different models.
    • Sending requests to the provider and handling the responses (both streaming and complete).

When a user sends a message, the SessionController adds the message to the current Conversation, sends the updated message list to the Provider, and then adds the provider's response to the Conversation.
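
The send path described above might look roughly like this. It is a simplified, synchronous sketch: the field names, the Provider trait signature, and CountingProvider are assumptions made for illustration, and the ConversationManager and streaming are omitted.

```rust
// Hypothetical sketch of the send path; not Owlen's actual struct definitions.

#[derive(Debug, Clone, PartialEq)]
pub enum Role {
    User,
    Assistant,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Message {
    pub role: Role,
    pub content: String,
}

/// Plain data container: the model in use plus the message history.
#[derive(Debug)]
pub struct Conversation {
    pub model: String,
    pub messages: Vec<Message>,
}

pub trait Provider {
    fn chat(&self, model: &str, messages: &[Message]) -> Result<String, String>;
}

pub struct SessionController<P: Provider> {
    provider: P,
    conversation: Conversation,
}

impl<P: Provider> SessionController<P> {
    pub fn new(provider: P, model: &str) -> Self {
        let conversation = Conversation { model: model.to_string(), messages: Vec::new() };
        Self { provider, conversation }
    }

    /// Append the user message, send the full history, then append the reply.
    /// On error the user message stays in the history (a choice made for this sketch).
    pub fn send(&mut self, text: &str) -> Result<&str, String> {
        self.conversation.messages.push(Message { role: Role::User, content: text.to_string() });
        let reply = self.provider.chat(&self.conversation.model, &self.conversation.messages)?;
        self.conversation.messages.push(Message { role: Role::Assistant, content: reply });
        Ok(self.conversation.messages.last().expect("reply was just pushed").content.as_str())
    }

    pub fn conversation(&self) -> &Conversation {
        &self.conversation
    }
}

/// Test double that reports how many messages it received.
pub struct CountingProvider;

impl Provider for CountingProvider {
    fn chat(&self, _model: &str, messages: &[Message]) -> Result<String, String> {
        Ok(format!("received {} message(s)", messages.len()))
    }
}
```

Sending two prompts through CountingProvider shows that the second request carries the whole history: the provider sees three messages (user, assistant, user) on the second call.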

Event Flow

The event flow is managed by the EventHandler in owlen-tui. It operates in a loop, waiting for events and dispatching them to the active application (ChatApp or CodeApp).

  1. Event Source: Events are primarily generated by crossterm from user keyboard input. Asynchronous events, like responses from a Provider, are also fed into the event system via a tokio::sync::mpsc channel.
  2. EventHandler::next(): The main application loop calls this method to wait for the next event.
  3. Event Enum: Events are defined in the owlen_tui::events::Event enum. This includes Key events, Tick events (for UI updates), and Message events (for async provider data).
  4. Dispatch: The application's run method matches on the Event type and calls the appropriate handler function (e.g., dispatch_key_event).
  5. State Update: The handler function updates the application state based on the event. For example, a key press might change the InputMode or modify the text in the input buffer.
  6. Re-render: After the state is updated, the UI is re-rendered to reflect the changes.
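
The loop above can be sketched with the standard library alone. To stay self-contained, this example substitutes std::sync::mpsc for the tokio channel and a bare char for crossterm's key event. The variant payloads, AppState, dispatch, and run are illustrative assumptions, not the definitions in owlen_tui::events.

```rust
// Hypothetical sketch of the event loop using std channels in place of tokio.
use std::sync::mpsc;

#[derive(Debug, Clone, PartialEq)]
pub enum Event {
    /// A key press (a real implementation would carry a crossterm key event).
    Key(char),
    /// Periodic tick for UI updates.
    Tick,
    /// Asynchronous data arriving from a provider.
    Message(String),
}

#[derive(Debug, Default)]
pub struct AppState {
    pub input: String,
    pub transcript: Vec<String>,
    pub ticks: u64,
}

/// Steps 4 and 5: match on the event type and update the state.
pub fn dispatch(state: &mut AppState, event: Event) {
    match event {
        Event::Key(c) => state.input.push(c),
        Event::Tick => state.ticks += 1,
        Event::Message(text) => state.transcript.push(text),
    }
}

/// Steps 2 to 6: wait for events until every sender has been dropped.
pub fn run(rx: mpsc::Receiver<Event>) -> AppState {
    let mut state = AppState::default();
    while let Ok(event) = rx.recv() {
        dispatch(&mut state, event);
        // A real loop would re-render the UI here.
    }
    state
}
```

Because keyboard input and provider responses arrive on the same channel, the handler code does not need to know which thread or task produced an event.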

TUI Rendering Pipeline

The TUI is rendered on each iteration of the main application loop in owlen-tui. The process is as follows:

  1. tui.draw(): The main loop calls this method, passing the current application state.
  2. Terminal::draw(): This method, from ratatui, takes a closure that receives a Frame.
  3. UI Composition: Inside the closure, the UI is built by composing ratatui widgets. The root UI is defined in owlen_tui::ui::render, which builds the main layout and calls other functions to render specific components (like the chat panel, input box, etc.).
  4. State-Driven Rendering: Each rendering function takes the current application state as an argument. It uses this state to decide what and how to render. For example, the border color of a panel might change if it is focused.
  5. Buffer and Diff: ratatui does not draw directly to the terminal. Instead, it renders the widgets to an in-memory buffer, compares that buffer to the previous frame's buffer, and sends only the changed cells to the terminal. This keeps terminal writes small and avoids the flicker that a full redraw can cause.
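
The diffing idea in step 5 can be illustrated with a toy buffer. This is a simplified stand-in and not ratatui's implementation: cells here are single characters with no styling, and Buffer, put_str, and diff are names chosen for this sketch.

```rust
// Toy illustration of frame diffing; not ratatui's actual Buffer type.

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Buffer {
    pub width: usize,
    pub cells: Vec<char>,
}

impl Buffer {
    /// Create a buffer filled with spaces.
    pub fn blank(width: usize, height: usize) -> Self {
        Buffer { width, cells: vec![' '; width * height] }
    }

    /// Write `text` starting at column `x`, row `y`, clipping at the buffer edges.
    pub fn put_str(&mut self, x: usize, y: usize, text: &str) {
        for (i, ch) in text.chars().enumerate() {
            let idx = y * self.width + x + i;
            if x + i < self.width && idx < self.cells.len() {
                self.cells[idx] = ch;
            }
        }
    }
}

/// Return `(x, y, new_char)` for every cell that differs between two frames.
/// Only these cells would need to be written to the terminal.
pub fn diff(prev: &Buffer, next: &Buffer) -> Vec<(usize, usize, char)> {
    assert_eq!(prev.width, next.width, "frames must have the same width");
    prev.cells
        .iter()
        .zip(&next.cells)
        .enumerate()
        .filter(|(_, (old, new))| old != new)
        .map(|(i, (_, new))| (i % next.width, i / next.width, *new))
        .collect()
}
```

Changing "hello" to "help!" on a five-column row yields two updates, at columns 3 and 4, rather than a redraw of the whole row.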

The command palette and other modal helpers expose lightweight state structs in owlen_tui::state. These components keep business logic (suggestion filtering, selection state, etc.) independent from rendering, which in turn makes them straightforward to unit test.
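
A state struct of this kind might look like the sketch below. PaletteState, its methods, and the prefix-matching rule are assumptions made for illustration, and the command names used in the usage note are placeholders. The point is that filtering and selection can be exercised without constructing any widgets.

```rust
// Hypothetical sketch of a render-independent palette state; not Owlen's actual struct.

#[derive(Debug)]
pub struct PaletteState {
    commands: Vec<String>,
    query: String,
    selected: usize,
}

impl PaletteState {
    pub fn new(commands: &[&str]) -> Self {
        Self {
            commands: commands.iter().map(|c| c.to_string()).collect(),
            query: String::new(),
            selected: 0,
        }
    }

    /// Update the filter text and reset the selection to the first match.
    pub fn set_query(&mut self, query: &str) {
        self.query = query.to_lowercase();
        self.selected = 0;
    }

    /// Commands whose name starts with the query, ignoring case.
    pub fn suggestions(&self) -> Vec<&str> {
        self.commands
            .iter()
            .map(String::as_str)
            .filter(|c| c.to_lowercase().starts_with(self.query.as_str()))
            .collect()
    }

    /// Move the selection down by one, wrapping around at the end.
    pub fn select_next(&mut self) {
        let count = self.suggestions().len();
        if count > 0 {
            self.selected = (self.selected + 1) % count;
        }
    }

    /// The currently highlighted suggestion, if any command matches.
    pub fn selected(&self) -> Option<&str> {
        let items = self.suggestions();
        items.get(self.selected).copied()
    }
}
```

With the commands model, provider, help, and history, the query "H" leaves help and history as suggestions, and select_next cycles between them.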