[update] updated ROADMAP.md with new project architecture details, enhanced phase descriptions, and added configuration/system design elements

2025-08-06 11:46:43 +02:00
parent f22259b863
commit aa520efb82


@@ -6,7 +6,6 @@ This document outlines the strategic approach for transforming the project throu
### Current Phase: Axum API Setup
```
owly-news-summariser/
├── src/
│ ├── main.rs # Entry point (will evolve)
@@ -24,15 +23,16 @@ owly-news-summariser/
│ ├── services.rs # Services module declaration
│ ├── services/ # Business logic layer
│ │ ├── news_service.rs
│ │ ├── summary_service.rs
│ │ └── scraping_service.rs # Article content extraction
│ └── config.rs # Configuration management
├── migrations/ # SQLx migrations (managed by SQLx CLI)
├── frontend/ # Keep existing Vue frontend for now
├── config.toml # Configuration file with AI settings
└── Cargo.toml
```
### Phase 2: Multi-Binary Structure (API + CLI)
```
owly-news-summariser/
├── src/
│ ├── lib.rs # Shared library code
@@ -42,49 +42,131 @@ owly-news-summariser/
│ ├── [same module structure as Phase 1]
├── migrations/
├── frontend/
├── completions/ # Shell completion scripts
│ ├── owly.bash
│ ├── owly.zsh
│ └── owly.fish
├── config.toml
└── Cargo.toml # Updated for multiple binaries
```
### Phase 3: Full Rust Stack
```
owly-news-summariser/
├── src/
│ ├── [same structure as Phase 2]
├── migrations/
├── frontend-dioxus/ # New Dioxus frontend
├── frontend/ # Legacy Vue (to be removed)
├── completions/
├── config.toml
└── Cargo.toml
```
## Core Features & Architecture
### Article Processing Workflow
**Hybrid Approach: RSS Feeds + Manual Submissions with Configurable AI**
1. **Article Collection**
- RSS feed monitoring and batch processing
- Manual article URL submission
- Store original content and metadata in database
2. **Content Processing Pipeline**
- Fetch RSS articles → scrape full content → store in DB
- Display articles immediately with "Awaiting summary" status
- Optional AI summarization with configurable prompts
- Support for AI-free mode (content storage only)
- Background async summarization with status updates
- Support for re-summarization without re-fetching
3. **AI Configuration System**
- Toggle AI summarization on/off globally
- Configurable AI prompts from config file
- Support for different AI providers (Ollama, OpenAI, etc.)
- Temperature and context settings
- Fallback modes when AI is unavailable
4. **Database Schema**
```
Articles: id, title, url, source_type, rss_content, full_content,
summary, summary_status, ai_enabled, created_at, updated_at
Sources: RSS feeds, Manual submissions with priority levels
Config: AI settings, prompts, provider configurations
```
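The `Articles` row above might map to Rust roughly as follows; this is a sketch with plain std types (with SQLx the struct would typically derive `sqlx::FromRow`), and the `SummaryStatus` variant names are assumptions, not the project's actual code:

```rust
// Sketch of the Articles row; status variants and string encodings are
// illustrative assumptions based on the schema fields above.
#[derive(Debug, Clone, Copy, PartialEq)]
enum SummaryStatus {
    Awaiting,  // shown as "Awaiting summary" in the UI
    Completed,
    Failed,
    Skipped,   // AI disabled for this article: content-only storage
}

impl SummaryStatus {
    fn as_str(&self) -> &'static str {
        match self {
            SummaryStatus::Awaiting => "awaiting",
            SummaryStatus::Completed => "completed",
            SummaryStatus::Failed => "failed",
            SummaryStatus::Skipped => "skipped",
        }
    }
}

#[derive(Debug)]
struct Article {
    id: i64,
    title: String,
    url: String,
    source_type: String,          // "rss" or "manual"
    rss_content: Option<String>,  // raw feed item, kept for re-processing
    full_content: Option<String>, // scraped article body
    summary: Option<String>,
    summary_status: SummaryStatus,
    ai_enabled: bool,
}
```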
5. **Processing Priority System**
- High: Manually submitted articles (immediate processing)
- Normal: Recent RSS articles
- Low: RSS backlog
- Skip: AI-disabled articles (content-only storage)
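The four tiers above can be encoded as an ordered enum so a job queue sorts naturally; `derive(Ord)` on a C-like enum orders variants by declaration order, so `High` compares smallest. The type and variant names are illustrative:

```rust
// Processing priority for the article queue; sorting ascending puts
// manually submitted articles first and AI-disabled articles last.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Priority {
    High,   // manually submitted articles (immediate processing)
    Normal, // recent RSS articles
    Low,    // RSS backlog
    Skip,   // AI-disabled: content-only storage, never summarized
}
```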
## Step-by-Step Process
### Phase 1: Axum API Implementation
**Step 1: Core Infrastructure Setup**
- Set up database connection pooling with SQLx
- Create configuration management system (environment variables, config files)
- **Enhanced Configuration System**:
- Extend config.toml with AI settings
- AI provider configurations (Ollama, OpenAI, local models)
- Customizable AI prompts and templates
- Global AI enable/disable toggle
- Processing batch sizes and timeouts
- Establish error handling patterns with `anyhow`
- Set up logging infrastructure
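The environment-variable layer of the configuration system could look like the sketch below: hard-coded defaults with `OWLY_AI_*` overrides applied on top. The prefix, field names, and defaults are assumptions for illustration; a real implementation would also read `config.toml` before applying the environment:

```rust
use std::env;

// Minimal layered-config sketch: code defaults, then environment
// overrides. The config.toml layer is elided here.
#[derive(Debug, Clone, PartialEq)]
struct AiConfig {
    enabled: bool,
    provider: String,
    timeout_seconds: u64,
}

impl AiConfig {
    fn defaults() -> Self {
        AiConfig { enabled: true, provider: "ollama".into(), timeout_seconds: 120 }
    }

    // Apply OWLY_AI_* environment overrides on top of current values.
    fn with_env_overrides(mut self) -> Self {
        if let Ok(v) = env::var("OWLY_AI_ENABLED") {
            self.enabled = v == "1" || v.eq_ignore_ascii_case("true");
        }
        if let Ok(v) = env::var("OWLY_AI_PROVIDER") {
            self.provider = v;
        }
        if let Ok(v) = env::var("OWLY_AI_TIMEOUT") {
            if let Ok(n) = v.parse() {
                self.timeout_seconds = n;
            }
        }
        self
    }
}
```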
**Step 2: Data Layer**
- Design database schema with article source tracking and AI settings
- Create SQLx migrations using `sqlx migrate add`
- Implement article models with RSS/manual source types
- Add AI configuration storage and retrieval
- Create database access layer with proper async patterns
- Use SQLx's compile-time checked queries
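A compile-time checked query against the schema sketched earlier might look like this; the table and column names follow that sketch, and SQLx with the `sqlite` feature plus a `DATABASE_URL` (or offline query data) at build time are assumed:

```rust
use sqlx::SqlitePool;

// Row shape for the query below; sqlx::query_as! checks the SQL and the
// field types against the live schema at compile time.
struct PendingArticle {
    id: i64,
    title: String,
    url: String,
}

async fn pending_articles(pool: &SqlitePool) -> anyhow::Result<Vec<PendingArticle>> {
    let rows = sqlx::query_as!(
        PendingArticle,
        r#"SELECT id, title, url FROM articles
           WHERE summary_status = 'awaiting' AND ai_enabled = 1
           ORDER BY created_at DESC"#
    )
    .fetch_all(pool)
    .await?;
    Ok(rows)
}
```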
**Step 3: Content Processing Services**
- Implement RSS feed fetching and parsing
- Create web scraping service for full article content
- **Flexible AI Service**:
- Configurable AI summarization with prompt templates
- Support for multiple AI providers
- AI-disabled mode for content-only processing
- Retry logic and fallback strategies
- Custom prompt injection from config
- Build background job system for async summarization
- Add content validation and error handling
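Custom prompt injection from the config file can be as simple as substituting the `{url}`, `{title}`, and `{content}` placeholders used by the prompt templates in the configuration section; this sketch assumes flat string placeholders rather than a templating crate:

```rust
// Fill the {url}, {title} and {content} placeholders used by the
// config-file prompt templates before sending the prompt to a provider.
fn render_prompt(template: &str, url: &str, title: &str, content: &str) -> String {
    template
        .replace("{url}", url)
        .replace("{title}", title)
        .replace("{content}", content)
}
```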
**Step 4: API Layer Architecture**
- **Article Management Routes**:
- `GET /api/articles` - List all articles with status
- `POST /api/articles` - Submit manual article URL
- `GET /api/articles/:id` - Get specific article
- `POST /api/articles/:id/summary` - Trigger/re-trigger summary
- `POST /api/articles/:id/toggle-ai` - Enable/disable AI for article
- **RSS Feed Management**:
- `GET /api/feeds` - List RSS feeds
- `POST /api/feeds` - Add RSS feed
- `POST /api/feeds/:id/sync` - Manual feed sync
- **Configuration Management**:
- `GET /api/config` - Get current configuration
- `POST /api/config/ai` - Update AI settings
- `POST /api/config/prompts` - Update AI prompts
- **Summary Management**:
- `GET /api/summaries` - List summaries
- WebSocket/SSE for real-time summary updates
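A trimmed Axum route table for a few of the endpoints above could be sketched as follows; the stub handlers and axum 0.7-style `:id` path captures are assumptions standing in for the real service layer:

```rust
use axum::{http::StatusCode, routing::{get, post}, Router};

// Stub handlers standing in for the real service layer.
async fn list_articles() -> &'static str { "[]" }
async fn submit_article() -> StatusCode { StatusCode::ACCEPTED }
async fn trigger_summary() -> StatusCode { StatusCode::ACCEPTED }

fn api_router() -> Router {
    Router::new()
        .route("/api/articles", get(list_articles).post(submit_article))
        .route("/api/articles/:id/summary", post(trigger_summary))
}
```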
**Step 5: Frontend Integration Features**
- Article list with status indicators and AI toggle
- "Add Article" form with URL input, validation, and AI option
- AI configuration panel with prompt editing
- Real-time status updates for processing articles
- Bulk article import functionality with AI settings
- Summary regeneration controls with prompt modification
- Global AI enable/disable toggle in settings
**Step 6: Integration & Testing**
- Test all API endpoints thoroughly
- Test AI-enabled and AI-disabled workflows
- Ensure Vue frontend works seamlessly with new Rust backend
- Performance testing and optimization
- Deploy and monitor
@@ -96,17 +178,36 @@ owly-news-summariser/
- Create `src/bin/cli.rs` for CLI application
- Keep shared logic in `src/lib.rs`
- Update Cargo.toml to support multiple binaries
- Add clap with completion feature
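The multi-binary Cargo.toml could be sketched as below; the binary names, paths, and dependency versions are assumptions based on the CLI examples in this document:

```toml
# Two binaries sharing one library crate (src/lib.rs).
[package]
name = "owly-news-summariser"
version = "0.1.0"
edition = "2021"

[[bin]]
name = "owly-api"
path = "src/bin/api.rs"

[[bin]]
name = "owly"            # the CLI invoked as `owly ...` below
path = "src/bin/cli.rs"

[dependencies]
clap = { version = "4", features = ["derive"] }
clap_complete = "4"      # shell completion generation
```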
**Step 2: CLI Architecture with Auto-completion**
- **Shell Completion Support**:
- Generate bash, zsh, and fish completion scripts
- `owly completion bash > /etc/bash_completion.d/owly`
- `owly completion zsh > ~/.zsh/completions/_owly`
- Dynamic completion for article IDs, feed URLs, etc.
- **Enhanced CLI Commands**:
- `owly add-url <url> [--no-ai]` - Add single article for processing
- `owly add-feed <rss-url> [--no-ai]` - Add RSS feed
- `owly sync [--no-ai]` - Sync all RSS feeds
- `owly process [--ai-only|--no-ai]` - Process pending articles
- `owly list [--status pending|completed|failed]` - List articles with filtering
- `owly summarize <id> [--prompt <custom-prompt>]` - Re-summarize specific article
- `owly config ai [--enable|--disable]` - Configure AI settings
- `owly config prompt [--set <prompt>|--reset]` - Manage AI prompts
- `owly completion <shell>` - Generate shell completions
- Reuse existing services and models from the API
- Create CLI-specific output formatting with rich progress indicators
- Implement batch processing capabilities with progress bars
- Support piping URLs for bulk processing
- Interactive prompt editing mode
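A subset of the command surface above, sketched with clap's derive API; clap 4 with the `derive` feature plus `clap_complete` are assumed, and the handler bodies are placeholders:

```rust
use clap::{CommandFactory, Parser, Subcommand};
use clap_complete::{generate, Shell};

#[derive(Parser)]
#[command(name = "owly")]
struct Cli {
    #[command(subcommand)]
    command: Command,
}

#[derive(Subcommand)]
enum Command {
    /// Add a single article for processing
    AddUrl { url: String, #[arg(long)] no_ai: bool },
    /// Sync all RSS feeds
    Sync { #[arg(long)] no_ai: bool },
    /// Generate shell completions (bash, zsh, fish, ...)
    Completion { shell: Shell },
}

fn main() {
    let cli = Cli::parse();
    match cli.command {
        Command::Completion { shell } => {
            // Writes the completion script to stdout, e.g.
            // `owly completion bash > /etc/bash_completion.d/owly`.
            let mut cmd = Cli::command();
            generate(shell, &mut cmd, "owly", &mut std::io::stdout());
        }
        Command::AddUrl { url, no_ai } => println!("add {url} (ai: {})", !no_ai),
        Command::Sync { no_ai } => println!("sync (ai: {})", !no_ai),
    }
}
```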
**Step 3: Shared Core Logic**
- Extract common functionality into library crates
- Ensure both API and CLI can use the same business logic
- Implement proper configuration management for both contexts
- Share article processing pipeline between web and CLI
- Unified AI service layer for both interfaces
### Phase 3: Dioxus Frontend Migration
@@ -117,13 +218,30 @@ owly-news-summariser/
**Step 2: Component Architecture**
- Design reusable Dioxus components
- **Core Components**:
- ArticleList with status filtering and AI indicators
- AddArticleForm with URL validation and AI toggle
- SummaryDisplay with regeneration controls and prompt editing
- FeedManager for RSS feed management with AI settings
- StatusIndicator for processing states
- ConfigPanel for AI settings and prompt management
- AIToggle for per-article AI control
- Implement state management (similar to Pinia in Vue)
- Create API client layer for communication with Rust backend
- WebSocket integration for real-time updates
**Step 3: Feature Parity & UX Enhancements**
- Port Vue components to Dioxus incrementally
- Ensure UI/UX consistency
- **Enhanced UX Features**:
- Smart URL validation with preview
- Batch article import with drag-and-drop and AI options
- Advanced filtering and search (by AI status, processing state)
- Processing queue management with AI priority
- Summary comparison tools
- AI prompt editor with syntax highlighting
- Configuration import/export functionality
- Implement proper error handling and loading states
- Progressive Web App features
**Step 4: Final Migration**
- Switch production traffic to Dioxus frontend
@@ -140,28 +258,107 @@ owly-news-summariser/
### 2. Maintain Backward Compatibility
- Keep API contracts stable during Vue-to-Dioxus transition
- Use feature flags for gradual rollouts
- Support legacy configurations during migration
### 3. Shared Code Architecture
- Design your core business logic to be framework-agnostic
- Use workspace structure for better code organization
- Consider extracting domain logic into separate crates
- AI service abstraction for provider flexibility
### 4. Content Processing Strategy
- **Resilient Pipeline**: Store original content for offline re-processing
- **Progressive Enhancement**: Show articles immediately, summaries when ready
- **Flexible AI Integration**: Support both AI-enabled and AI-disabled workflows
- **Error Recovery**: Graceful handling of scraping/summarization failures
- **Rate Limiting**: Respect external API limits and website politeness
- **Configuration-Driven**: All AI behavior controlled via config files
### 5. CLI User Experience
- **Shell Integration**: Auto-completion for all major shells
- **Interactive Mode**: Guided prompts for complex operations
- **Batch Processing**: Efficient handling of large article sets
- **Progress Indicators**: Clear feedback for long-running operations
- **Flexible Output**: JSON, table, and human-readable formats
### 6. Testing Strategy
- Unit tests for business logic
- Integration tests for API endpoints
- End-to-end tests for the full stack
- CLI integration tests with completion validation
- Content scraping reliability tests
- AI service mocking and fallback testing
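The AI-mocking bullet could rest on a provider trait like the one below, so tests swap in a mock and exercise the fallback path without a live model; the trait and names are illustrative, not the project's actual abstraction:

```rust
// Illustrative provider abstraction for tests: a mock implementation
// can simulate both success and an unavailable provider.
trait Summarizer {
    fn summarize(&self, content: &str) -> Result<String, String>;
}

struct MockSummarizer {
    fail: bool,
}

impl Summarizer for MockSummarizer {
    fn summarize(&self, content: &str) -> Result<String, String> {
        if self.fail {
            Err("provider unavailable".into())
        } else {
            Ok(format!("summary of {} bytes", content.len()))
        }
    }
}

// Fallback under test: on provider failure, keep the article with no
// summary instead of failing the whole pipeline.
fn summarize_or_skip(s: &dyn Summarizer, content: &str) -> Option<String> {
    s.summarize(content).ok()
}
```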
### 7. Configuration Management
- **Layered Configuration**:
- Default settings
- Config file overrides
- Environment variable overrides
- Runtime API updates
- Support for different deployment scenarios
- Proper secrets management
- Hot-reloading of AI prompts and settings
- Configuration validation and migration
### 8. Database Strategy
- Use SQLx migrations for schema evolution (`sqlx migrate add/run`)
- Leverage compile-time checked queries with SQLx macros
- Implement proper connection pooling and error handling
- Let SQLx handle what it does best - don't reinvent the wheel
- Design for both read-heavy (web UI) and write-heavy (batch processing) patterns
- Store AI configuration and prompt history
## Enhanced Configuration File Structure
```toml
[server]
host = "127.0.0.1"
port = 8090
[ai]
enabled = true
provider = "ollama" # ollama, openai, local
timeout_seconds = 120
temperature = 0.1
max_tokens = 1000
[ai.ollama]
host = "http://localhost:11434"
model = "llama2"
context_size = 8192
[ai.openai]
api_key_env = "OPENAI_API_KEY"
model = "gpt-3.5-turbo"
[ai.prompts]
default = """
Analyze this article and provide a concise summary in JSON format:
{
"title": "article title",
"summary": "brief summary in 2-3 sentences",
"key_points": ["point1", "point2", "point3"]
}
Article URL: {url}
Article Title: {title}
Article Content: {content}
"""
news = """Custom prompt for news articles..."""
technical = """Custom prompt for technical articles..."""
[processing]
batch_size = 10
max_concurrent = 5
retry_attempts = 3
priority_manual = true
[cli]
default_output = "table" # table, json, compact
auto_confirm = false
show_progress = true
```
## What SQLx Handles for You
@@ -169,3 +366,30 @@ owly-news-summariser/
- **Connection Pooling**: Built-in `SqlitePool` with configuration options
- **Query Safety**: Compile-time checked queries prevent SQL injection and typos
- **Type Safety**: Automatic Rust type mapping from database types
## Future Enhancements (Post Phase 3)
### Advanced Features
- **Machine Learning**: Article classification and topic clustering
- **Analytics Dashboard**: Reading patterns and trending topics
- **Content Deduplication**: Identify and merge similar articles
- **Multi-language Support**: Translation and summarization
- **API Rate Limiting**: User quotas and fair usage policies
- **Advanced AI Features**: Custom model training, prompt optimization
### Extended Integrations
- **Browser Extension**: One-click article addition (far future)
- **Mobile App**: Native iOS/Android apps using Rust core
- **Social Features**: Shared reading lists and collaborative summaries
- **Export Features**: PDF, EPUB, and other format exports
- **Third-party Integrations**: Pocket, Instapaper, read-later services
- **AI Provider Ecosystem**: Support for more AI services and local models
### Scalability Improvements
- **Microservices**: Split processing services for horizontal scaling
- **Message Queues**: Redis/RabbitMQ for high-throughput processing
- **Caching Layer**: Redis for frequently accessed summaries
- **CDN Integration**: Static asset and summary caching
- **Multi-database**: PostgreSQL for production, advanced analytics
- **Distributed AI**: Load balancing across multiple AI providers