Owly News Summariser - Project Roadmap

This document outlines the strategic approach for transforming the project through three phases: Python-to-Rust backend migration, CLI application addition, and Vue-to-Dioxus frontend migration.

Project Structure Strategy

Current Phase: Axum API Setup

owly-news-summariser/
├── src/
│   ├── main.rs              # Entry point (will evolve)
│   ├── db.rs                # Database connection & SQLx setup
│   ├── api.rs               # API module declaration
│   ├── api/                 # API-specific modules (no mod.rs needed)
│   │   ├── routes.rs        # Route definitions
│   │   ├── middleware.rs    # Custom middleware
│   │   └── handlers.rs      # Request handlers & business logic
│   ├── models.rs            # Models module declaration
│   ├── models/              # Data models & database entities
│   │   ├── user.rs
│   │   ├── article.rs
│   │   └── summary.rs
│   ├── services.rs          # Services module declaration
│   ├── services/            # Business logic layer
│   │   ├── news_service.rs
│   │   ├── summary_service.rs
│   │   └── scraping_service.rs  # Article content extraction
│   └── config.rs            # Configuration management
├── migrations/              # SQLx migrations (managed by SQLx CLI)
├── frontend/                # Keep existing Vue frontend for now
├── config.toml              # Configuration file with AI settings
└── Cargo.toml

Phase 2: Multi-Binary Structure (API + CLI)

owly-news-summariser/
├── src/
│   ├── lib.rs               # Shared library code
│   ├── bin/
│   │   ├── server.rs        # API server binary
│   │   └── cli.rs           # CLI application binary
│   ├── [same module structure as Phase 1]
├── migrations/
├── frontend/
├── completions/             # Shell completion scripts
│   ├── owly.bash
│   ├── owly.zsh
│   └── owly.fish
├── config.toml
└── Cargo.toml               # Updated for multiple binaries
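
The multi-binary layout maps onto Cargo.toml roughly as follows (a sketch; the binary and library names shown here are assumptions, not settled decisions):

```toml
[package]
name = "owly-news-summariser"
version = "0.1.0"
edition = "2021"

# Shared library used by both binaries
[lib]
name = "owly"
path = "src/lib.rs"

[[bin]]
name = "owly-server"
path = "src/bin/server.rs"

[[bin]]
name = "owly"
path = "src/bin/cli.rs"
```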

Phase 3: Full Rust Stack

owly-news-summariser/
├── src/
│   ├── [same structure as Phase 2]
├── migrations/
├── frontend-dioxus/         # New Dioxus frontend
├── frontend/                # Legacy Vue (to be removed)
├── completions/
├── config.toml
└── Cargo.toml

Core Features & Architecture

Article Processing Workflow

Hybrid Approach: RSS Feeds + Manual Submissions with Configurable AI

  1. Article Collection

    • RSS feed monitoring and batch processing
    • Manual article URL submission
    • Store original content and metadata in database
  2. Content Processing Pipeline

    • Fetch RSS articles → scrape full content → store in DB
    • Display articles immediately with "Awaiting summary" status
    • Optional AI summarization with configurable prompts
    • Support for AI-free mode (content storage only)
    • Background async summarization with status updates
    • Support for re-summarization without re-fetching
  3. AI Configuration System

    • Toggle AI summarization on/off globally
    • Configurable AI prompts from config file
    • Support for different AI providers (Ollama, OpenAI, etc.)
    • Temperature and context settings
    • Fallback modes when AI is unavailable
  4. Database Schema

    Articles: id, title, url, source_type, rss_content, full_content, 
             summary, summary_status, ai_enabled, created_at, updated_at
    Sources: RSS feeds, Manual submissions with priority levels
    Config: AI settings, prompts, provider configurations
    
  5. Processing Priority System

    • High: Manually submitted articles (immediate processing)
    • Normal: Recent RSS articles
    • Low: RSS backlog
    • Skip: AI-disabled articles (content-only storage)
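
These tiers can be modelled as a plain enum whose derived ordering drives queue sorting; a minimal stdlib-only sketch (names are illustrative):

```rust
/// Processing priority for queued articles. The derived `Ord` follows
/// declaration order, so `High` sorts before `Normal`, and so on.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Priority {
    High,   // manually submitted articles: process immediately
    Normal, // recent RSS articles
    Low,    // RSS backlog
    Skip,   // AI-disabled articles: content-only storage
}

fn main() {
    // (priority, article id) pairs arriving in arbitrary order.
    let mut queue = vec![(Priority::Low, 1u64), (Priority::High, 2), (Priority::Normal, 3)];
    // Sort ascending so the highest-priority work comes first.
    queue.sort_by_key(|&(p, _)| p);
    let order: Vec<u64> = queue.iter().map(|&(_, id)| id).collect();
    println!("{:?}", order); // [2, 3, 1]
}
```

Because the ordering follows declaration order, adding a new tier later only means inserting a variant in the right position.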

Step-by-Step Process

Phase 1: Axum API Implementation

Step 1: Core Infrastructure Setup

  • Set up database connection pooling with SQLx
  • Enhanced Configuration System:
    • Extend config.toml with AI settings
    • AI provider configurations (Ollama, OpenAI, local models)
    • Customizable AI prompts and templates
    • Global AI enable/disable toggle
    • Processing batch sizes and timeouts
  • Establish error handling patterns with anyhow
  • Set up logging infrastructure

Step 2: Data Layer

  • Design database schema with article source tracking and AI settings
  • Create SQLx migrations using sqlx migrate add
  • Implement article models with RSS/manual source types
  • Add AI configuration storage and retrieval
  • Create database access layer with proper async patterns
  • Use SQLx's compile-time checked queries
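
The article model from the schema sketch could look roughly like this (a stdlib-only sketch; in the real data layer these would be sqlx-mapped types with proper datetime columns):

```rust
/// Where an article came from (source_type in the schema sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SourceType {
    Rss,
    Manual,
}

/// Lifecycle of the summary field.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SummaryStatus {
    AwaitingSummary,
    InProgress,
    Completed,
    Failed,
    Skipped, // AI disabled for this article
}

#[derive(Debug)]
struct Article {
    id: i64,
    title: String,
    url: String,
    source_type: SourceType,
    rss_content: Option<String>,
    full_content: Option<String>,
    summary: Option<String>,
    summary_status: SummaryStatus,
    ai_enabled: bool,
    created_at: i64, // unix seconds here; real code would use a datetime type
    updated_at: i64,
}

impl Article {
    /// A manually submitted article starts life awaiting its summary.
    fn new_manual(id: i64, title: &str, url: &str, now: i64) -> Self {
        Article {
            id,
            title: title.to_string(),
            url: url.to_string(),
            source_type: SourceType::Manual,
            rss_content: None,
            full_content: None,
            summary: None,
            summary_status: SummaryStatus::AwaitingSummary,
            ai_enabled: true,
            created_at: now,
            updated_at: now,
        }
    }
}

fn main() {
    let a = Article::new_manual(1, "Example", "https://example.com/a", 1_700_000_000);
    println!("{:?} {:?}", a.source_type, a.summary_status);
}
```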

Step 3: Content Processing Services

  • Implement RSS feed fetching and parsing
  • Create web scraping service for full article content
  • Flexible AI Service:
    • Configurable AI summarization with prompt templates
    • Support for multiple AI providers
    • AI-disabled mode for content-only processing
    • Retry logic and fallback strategies
    • Custom prompt injection from config
  • Build background job system for async summarization
  • Add content validation and error handling
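
The background summarization queue can be sketched with a plain channel and a worker thread (illustrative only; the real service would run async tasks that call the AI provider and update summary_status in the database):

```rust
use std::sync::mpsc;
use std::thread;

/// A unit of background work: summarize the article with this id.
struct SummarizeJob {
    article_id: u64,
}

/// Queue jobs on a channel and drain them on a worker thread,
/// returning the ids in the order they were processed.
fn run_jobs(ids: &[u64]) -> Vec<u64> {
    let (tx, rx) = mpsc::channel::<SummarizeJob>();
    let worker = thread::spawn(move || {
        let mut done = Vec::new();
        for job in rx {
            done.push(job.article_id);
        }
        done
    });
    for &id in ids {
        tx.send(SummarizeJob { article_id: id }).expect("worker alive");
    }
    drop(tx); // close the channel so the worker loop ends
    worker.join().expect("worker panicked")
}

fn main() {
    let processed = run_jobs(&[1, 2, 3]);
    println!("processed {} articles", processed.len());
}
```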

Step 4: API Layer Architecture

  • Article Management Routes:
    • GET /api/articles - List all articles with status
    • POST /api/articles - Submit manual article URL
    • GET /api/articles/:id - Get specific article
    • POST /api/articles/:id/summary - Trigger/re-trigger summary
    • POST /api/articles/:id/toggle-ai - Enable/disable AI for article
  • RSS Feed Management:
    • GET /api/feeds - List RSS feeds
    • POST /api/feeds - Add RSS feed
    • POST /api/feeds/:id/sync - Manual feed sync
  • Configuration Management:
    • GET /api/config - Get current configuration
    • POST /api/config/ai - Update AI settings
    • POST /api/config/prompts - Update AI prompts
  • Summary Management:
    • GET /api/summaries - List summaries
    • WebSocket/SSE for real-time summary updates
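
A framework-agnostic sketch of the route table above, treating :id segments as wildcards (handler names are illustrative; the real code would register these on an axum Router):

```rust
/// True when a pattern like "/api/articles/:id" matches a concrete path;
/// segments starting with ':' match any single path segment.
fn matches(pattern: &str, path: &str) -> bool {
    let ps: Vec<&str> = pattern.split('/').filter(|s| !s.is_empty()).collect();
    let xs: Vec<&str> = path.split('/').filter(|s| !s.is_empty()).collect();
    ps.len() == xs.len()
        && ps.iter().zip(&xs).all(|(p, x)| p.starts_with(':') || p == x)
}

/// Look up the handler name for a (method, path) pair.
fn route(method: &str, path: &str) -> Option<&'static str> {
    const ROUTES: &[(&str, &str, &str)] = &[
        ("GET", "/api/articles", "list_articles"),
        ("POST", "/api/articles", "submit_article"),
        ("GET", "/api/articles/:id", "get_article"),
        ("POST", "/api/articles/:id/summary", "trigger_summary"),
        ("POST", "/api/articles/:id/toggle-ai", "toggle_ai"),
        ("GET", "/api/feeds", "list_feeds"),
        ("POST", "/api/feeds", "add_feed"),
        ("POST", "/api/feeds/:id/sync", "sync_feed"),
        ("GET", "/api/config", "get_config"),
        ("POST", "/api/config/ai", "update_ai_config"),
        ("POST", "/api/config/prompts", "update_prompts"),
        ("GET", "/api/summaries", "list_summaries"),
    ];
    ROUTES
        .iter()
        .find(|(m, p, _)| *m == method && matches(p, path))
        .map(|(_, _, h)| *h)
}

fn main() {
    println!("{:?}", route("POST", "/api/articles/7/summary")); // Some("trigger_summary")
}
```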

Step 5: Frontend Integration Features

  • Article list with status indicators and AI toggle
  • "Add Article" form with URL input, validation, and AI option
  • AI configuration panel with prompt editing
  • Real-time status updates for processing articles
  • Bulk article import functionality with AI settings
  • Summary regeneration controls with prompt modification
  • Global AI enable/disable toggle in settings

Step 6: Integration & Testing

  • Test all API endpoints thoroughly
  • Test AI-enabled and AI-disabled workflows
  • Ensure the Vue frontend works seamlessly with the new Rust backend
  • Performance testing and optimization
  • Deploy and monitor

Phase 2: CLI Application Addition

Step 1: Restructure for Multiple Binaries

  • Move API code to src/bin/server.rs
  • Create src/bin/cli.rs for CLI application
  • Keep shared logic in src/lib.rs
  • Update Cargo.toml to support multiple binaries
  • Add clap with completion feature

Step 2: CLI Architecture with Auto-completion

  • Shell Completion Support:
    • Generate bash, zsh, and fish completion scripts
    • owly completion bash > /etc/bash_completion.d/owly
    • owly completion zsh > ~/.zsh/completions/_owly
    • Dynamic completion for article IDs, feed URLs, etc.
  • Enhanced CLI Commands:
    • owly add-url <url> [--no-ai] - Add single article for processing
    • owly add-feed <rss-url> [--no-ai] - Add RSS feed
    • owly sync [--no-ai] - Sync all RSS feeds
    • owly process [--ai-only|--no-ai] - Process pending articles
    • owly list [--status pending|completed|failed] - List articles with filtering
    • owly summarize <id> [--prompt <custom-prompt>] - Re-summarize specific article
    • owly config ai [--enable|--disable] - Configure AI settings
    • owly config prompt [--set <prompt>|--reset] - Manage AI prompts
    • owly completion <shell> - Generate shell completions
  • Reuse existing services and models from the API
  • Create CLI-specific output formatting with rich progress indicators
  • Implement batch processing capabilities with progress bars
  • Support piping URLs for bulk processing
  • Interactive prompt editing mode
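
A stdlib-only sketch of parsing a few of these subcommands (the real CLI would declare them with clap's derive API, which is also what generates the completion scripts):

```rust
/// A few of the subcommands above as a plain enum.
#[derive(Debug, PartialEq)]
enum Command {
    AddUrl { url: String, no_ai: bool },
    Sync { no_ai: bool },
    Completion { shell: String },
}

/// Match argv-style arguments against the subcommand grammar.
fn parse(args: &[&str]) -> Option<Command> {
    match args {
        ["add-url", url, rest @ ..] => Some(Command::AddUrl {
            url: url.to_string(),
            no_ai: rest.contains(&"--no-ai"),
        }),
        ["sync", rest @ ..] => Some(Command::Sync {
            no_ai: rest.contains(&"--no-ai"),
        }),
        ["completion", shell] => Some(Command::Completion {
            shell: shell.to_string(),
        }),
        _ => None,
    }
}

fn main() {
    println!("{:?}", parse(&["add-url", "https://example.com/post", "--no-ai"]));
}
```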

Step 3: Shared Core Logic

  • Extract common functionality into library crates
  • Ensure both API and CLI can use the same business logic
  • Implement proper configuration management for both contexts
  • Share article processing pipeline between web and CLI
  • Unified AI service layer for both interfaces

Phase 3: Dioxus Frontend Migration

Step 1: Parallel Development

  • Create new frontend-dioxus/ directory
  • Keep existing Vue frontend running during development
  • Set up Dioxus project structure with proper routing

Step 2: Component Architecture

  • Design reusable Dioxus components
  • Core Components:
    • ArticleList with status filtering and AI indicators
    • AddArticleForm with URL validation and AI toggle
    • SummaryDisplay with regeneration controls and prompt editing
    • FeedManager for RSS feed management with AI settings
    • StatusIndicator for processing states
    • ConfigPanel for AI settings and prompt management
    • AIToggle for per-article AI control
  • Implement state management (similar to Pinia in Vue)
  • Create API client layer for communication with Rust backend
  • WebSocket integration for real-time updates

Step 3: Feature Parity & UX Enhancements

  • Port Vue components to Dioxus incrementally
  • Enhanced UX Features:
    • Smart URL validation with preview
    • Batch article import with drag-and-drop and AI options
    • Advanced filtering and search (by AI status, processing state)
    • Processing queue management with AI priority
    • Summary comparison tools
    • AI prompt editor with syntax highlighting
    • Configuration import/export functionality
  • Implement proper error handling and loading states
  • Progressive Web App features

Step 4: Final Migration

  • Switch production traffic to Dioxus frontend
  • Remove Vue frontend after thorough testing
  • Optimize bundle size and performance

Key Strategic Considerations

1. Modern Rust Practices

  • Use modern module structure without mod.rs files
  • Leverage SQLx's built-in migration and connection management
  • Follow Rust 2018+ edition conventions

2. Maintain Backward Compatibility

  • Keep API contracts stable during Vue-to-Dioxus transition
  • Use feature flags for gradual rollouts
  • Support legacy configurations during migration

3. Shared Code Architecture

  • Design your core business logic to be framework-agnostic
  • Use workspace structure for better code organization
  • Consider extracting domain logic into separate crates
  • AI service abstraction for provider flexibility

4. Content Processing Strategy

  • Resilient Pipeline: Store original content for offline re-processing
  • Progressive Enhancement: Show articles immediately, summaries when ready
  • Flexible AI Integration: Support both AI-enabled and AI-disabled workflows
  • Error Recovery: Graceful handling of scraping/summarization failures
  • Rate Limiting: Respect external API limits and website politeness
  • Configuration-Driven: All AI behavior controlled via config files

5. CLI User Experience

  • Shell Integration: Auto-completion for all major shells
  • Interactive Mode: Guided prompts for complex operations
  • Batch Processing: Efficient handling of large article sets
  • Progress Indicators: Clear feedback for long-running operations
  • Flexible Output: JSON, table, and human-readable formats
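
The three output formats could share one render path; a minimal sketch (the row layout is illustrative, and real code would serialize JSON with serde rather than by hand):

```rust
/// Output formats matching the [cli] config section.
enum OutputFormat {
    Table,
    Json,
    Compact,
}

/// Render (id, title) rows in the requested format.
fn render(rows: &[(u64, &str)], fmt: &OutputFormat) -> String {
    match fmt {
        OutputFormat::Table => rows
            .iter()
            .map(|(id, title)| format!("{:>4}  {}", id, title))
            .collect::<Vec<_>>()
            .join("\n"),
        OutputFormat::Json => {
            let items: Vec<String> = rows
                .iter()
                .map(|(id, title)| format!("{{\"id\":{},\"title\":\"{}\"}}", id, title))
                .collect();
            format!("[{}]", items.join(","))
        }
        OutputFormat::Compact => rows
            .iter()
            .map(|(id, _)| id.to_string())
            .collect::<Vec<_>>()
            .join(","),
    }
}

fn main() {
    let rows = [(1, "Rust article"), (2, "Axum article")];
    println!("{}", render(&rows, &OutputFormat::Compact)); // 1,2
}
```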

6. Testing Strategy

  • Unit tests for business logic
  • Integration tests for API endpoints
  • End-to-end tests for the full stack
  • CLI integration tests with completion validation
  • Content scraping reliability tests
  • AI service mocking and fallback testing

7. Configuration Management

  • Layered Configuration:
    • Default settings
    • Config file overrides
    • Environment variable overrides
    • Runtime API updates
  • Support for different deployment scenarios
  • Proper secrets management
  • Hot-reloading of AI prompts and settings
  • Configuration validation and migration
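
The layering can be implemented as field-by-field overrides applied in precedence order; a sketch with illustrative field names (real code would deserialize the TOML layers with serde):

```rust
/// Fully resolved AI configuration.
#[derive(Debug, Clone, PartialEq)]
struct AiConfig {
    enabled: bool,
    provider: String,
    temperature: f64,
}

/// One configuration layer; `None` means "no opinion, defer to lower layers".
#[derive(Debug, Default, Clone)]
struct AiOverride {
    enabled: Option<bool>,
    provider: Option<String>,
    temperature: Option<f64>,
}

impl AiConfig {
    fn defaults() -> Self {
        AiConfig { enabled: true, provider: "ollama".into(), temperature: 0.1 }
    }

    /// Apply one layer on top of self; later layers win field by field.
    fn apply(mut self, layer: &AiOverride) -> Self {
        if let Some(e) = layer.enabled { self.enabled = e; }
        if let Some(p) = &layer.provider { self.provider = p.clone(); }
        if let Some(t) = layer.temperature { self.temperature = t; }
        self
    }
}

fn main() {
    // defaults < config file < environment, applied left to right.
    let file = AiOverride { provider: Some("openai".into()), ..Default::default() };
    let env = AiOverride { temperature: Some(0.3), ..Default::default() };
    let cfg = AiConfig::defaults().apply(&file).apply(&env);
    println!("{} {}", cfg.provider, cfg.temperature); // openai 0.3
}
```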

8. Database Strategy

  • Use SQLx migrations for schema evolution (sqlx migrate add/run)
  • Leverage compile-time checked queries with SQLx macros
  • Implement proper connection pooling and error handling
  • Let SQLx handle what it does best; don't reinvent the wheel
  • Design for both read-heavy (web UI) and write-heavy (batch processing) patterns
  • Store AI configuration and prompt history

Enhanced Configuration File Structure

[server]
host = "127.0.0.1"
port = 8090

[ai]
enabled = true
provider = "ollama"  # ollama, openai, local
timeout_seconds = 120
temperature = 0.1
max_tokens = 1000

[ai.ollama]
host = "http://localhost:11434"
model = "llama2"
context_size = 8192

[ai.openai]
api_key_env = "OPENAI_API_KEY"
model = "gpt-3.5-turbo"

[ai.prompts]
default = """
Analyze this article and provide a concise summary in JSON format:
{
  "title": "article title",
  "summary": "brief summary in 2-3 sentences",
  "key_points": ["point1", "point2", "point3"]
}

Article URL: {url}
Article Title: {title}
Article Content: {content}
"""

news = """Custom prompt for news articles..."""
technical = """Custom prompt for technical articles..."""

[processing]
batch_size = 10
max_concurrent = 5
retry_attempts = 3
priority_manual = true

[cli]
default_output = "table"  # table, json, compact
auto_confirm = false
show_progress = true

What SQLx Handles for You

  • Migrations: Use sqlx migrate add <name> to create migration files and the sqlx::migrate!() macro to embed them in the binary
  • Connection Pooling: Built-in SqlitePool with configuration options
  • Query Safety: Compile-time checked queries prevent SQL injection and typos
  • Type Safety: Automatic Rust type mapping from database types

Future Enhancements (Post Phase 3)

Advanced Features

  • Machine Learning: Article classification and topic clustering
  • Analytics Dashboard: Reading patterns and trending topics
  • Content Deduplication: Identify and merge similar articles
  • Multi-language Support: Translation and summarization
  • API Rate Limiting: User quotas and fair usage policies
  • Advanced AI Features: Custom model training, prompt optimization

Extended Integrations

  • Browser Extension: One-click article addition (far future)
  • Mobile App: Native iOS/Android apps using Rust core
  • Social Features: Shared reading lists and collaborative summaries
  • Export Features: PDF, EPUB, and other format exports
  • Third-party Integrations: Pocket, Instapaper, read-later services
  • AI Provider Ecosystem: Support for more AI services and local models

Scalability Improvements

  • Microservices: Split processing services for horizontal scaling
  • Message Queues: Redis/RabbitMQ for high-throughput processing
  • Caching Layer: Redis for frequently accessed summaries
  • CDN Integration: Static asset and summary caching
  • Multi-database: PostgreSQL for production, advanced analytics
  • Distributed AI: Load balancing across multiple AI providers