Files
owly-news/backend-rust/ROADMAP.md

23 KiB

Owly News Summariser - Project Roadmap

This document outlines the strategic approach for transforming the project through three phases: Python-to-Rust backend migration, CLI application addition, and Vue-to-Dioxus frontend migration.

Project Structure Strategy

Current Phase: Axum API Setup


owly-news-summariser/
├── src/
│   ├── main.rs              # Entry point (will evolve)
│   ├── db.rs                # Database connection & SQLx setup
│   ├── api.rs               # API module declaration
│   ├── api/                 # API-specific modules (no mod.rs needed)
│   │   ├── routes.rs        # Route definitions
│   │   ├── middleware.rs    # Custom middleware
│   │   └── handlers.rs      # Request handlers & business logic
│   ├── models.rs            # Models module declaration
│   ├── models/              # Data models & database entities
│   │   ├── user.rs
│   │   ├── article.rs
│   │   ├── summary.rs
│   │   ├── tag.rs           # Tag models and relationships
│   │   ├── analytics.rs     # Analytics and statistics models
│   │   └── settings.rs      # User settings and preferences
│   ├── services.rs          # Services module declaration
│   ├── services/            # Business logic layer
│   │   ├── news_service.rs
│   │   ├── summary_service.rs
│   │   ├── scraping_service.rs  # Article content extraction
│   │   ├── tagging_service.rs   # AI-powered tagging
│   │   ├── analytics_service.rs # Reading stats and analytics
│   │   └── sharing_service.rs   # Article sharing functionality
│   └── config.rs            # Configuration management
├── migrations/              # SQLx migrations (managed by SQLx CLI)
├── frontend/                # Keep existing Vue frontend for now
├── config.toml              # Configuration file with AI settings
└── Cargo.toml

Phase 2: Multi-Binary Structure (API + CLI)


owly-news-summariser/
├── src/
│   ├── lib.rs               # Shared library code
│   ├── bin/
│   │   ├── server.rs        # API server binary
│   │   └── cli.rs           # CLI application binary
│   ├── [same module structure as Phase 1]
├── migrations/
├── frontend/
├── completions/             # Shell completion scripts
│   ├── owly.bash
│   ├── owly.zsh
│   └── owly.fish
├── config.toml
└── Cargo.toml               # Updated for multiple binaries

Phase 3: Full Rust Stack


owly-news-summariser/
├── src/
│   ├── [same structure as Phase 2]
├── migrations/
├── frontend-dioxus/         # New Dioxus frontend
├── frontend/                # Legacy Vue (to be removed)
├── completions/
├── config.toml
└── Cargo.toml

Core Features & Architecture

Article Processing & Display Workflow

Hybrid Approach: RSS Feeds + Manual Submissions with Smart Content Management

  1. Article Collection

    • RSS feed monitoring and batch processing
    • Manual article URL submission
    • Store original content and metadata in database
  2. Content Processing Pipeline

    • Fetch RSS articles → scrape full content → store in DB
    • Compact Article Display:
      • Title (primary display)
      • RSS description text
      • Tags (visual indicators)
      • Time posted (from RSS)
      • Time added (when added to system)
      • Action buttons: [Full Article] [Summary] [Source]
    • On-Demand Content Loading:
      • Full Article: Display complete scraped content
      • Summary: Show AI-generated summary
      • Source: Open original URL in new tab
    • Background async processing with status updates
    • Support for re-processing without re-fetching
  3. Intelligent Tagging System

    • Automatic Tag Generation: AI analyzes content and assigns relevant tags
    • Geographic & Source Tags: AI-generated location tags (countries, regions, cities) and publication source tags
    • Content Category Tags: Technology, Politics, Business, Sports, Health, etc.
    • Visual Tag Display: Color-coded tags in compact article view with hierarchical display
    • Tag Filtering: Quick filtering by clicking tags with smart suggestions
    • Custom Tags: User-defined tags and categories
    • Tag Confidence: Visual indicators for AI vs manual tags
    • Tag Migration: Automatic conversion of legacy country filters to geographic tags
  4. Analytics & Statistics System

    • Reading Analytics:
      • Articles read vs added
      • Reading time tracking
      • Most read categories and tags
      • Reading patterns over time
    • Content Analytics:
      • Source reliability and quality metrics
      • Tag usage statistics
      • Processing success rates
      • Content freshness tracking
    • Performance Metrics:
      • AI processing times
      • Scraping success rates
      • User engagement patterns
  5. Advanced Filtering System

    • Multi-Criteria Filtering:
      • By tags (single or multiple with AND/OR logic)
      • By geographic tags (country, region, city with hierarchical filtering)
      • By content categories and topics
      • By date ranges (posted, added, read)
      • By processing status (pending, completed, failed)
      • By content availability (scraped, summary, RSS-only)
      • By read/unread status
    • Smart Filter Migration: Automatic conversion of legacy country filters to tag-based equivalents
    • Saved Filter Presets:
      • Custom filter combinations
      • Quick access to frequent searches
      • Geographic preset templates (e.g., "European Tech News", "US Politics")
    • Smart Suggestions: Filter suggestions based on usage patterns and tag relationships
  6. Settings & Management System

    • User Preferences:
      • Default article view mode
      • Tag display preferences with geographic hierarchy settings
      • Reading tracking settings
      • Notification preferences
    • System Settings:
      • AI configuration (via API and config file)
      • Processing settings
      • Display customization
      • Export preferences
    • Content Management:
      • Bulk operations (mark read, delete, retag)
      • Archive old articles
      • Export/import functionality
      • Legacy data migration tools
  7. Article Sharing System

    • Multiple Share Formats:
      • Clean text format with title, summary, and source link
      • Markdown format for developers
      • Rich HTML format for email/web
      • JSON format for API integration
    • Copy to Clipboard: One-click formatted sharing
    • Share Templates: Customizable sharing formats
    • Privacy Controls: Control what information is included in shares
  8. Database Schema

Articles: id, title, url, source_type, rss_content, full_content,
summary, processing_status, published_at, added_at, read_at,
read_count, reading_time, ai_enabled, created_at, updated_at
Tags: id, name, category, description, color, usage_count, parent_id, created_at
ArticleTags: article_id, tag_id, confidence_score, ai_generated, created_at
ReadingStats: user_id, article_id, read_at, reading_time, completion_rate
FilterPresets: id, name, filter_criteria, user_id, created_at
Settings: key, value, category, user_id, updated_at
ShareTemplates: id, name, format, template_content, created_at
LegacyMigration: old_filter_type, old_value, new_tag_ids, migrated_at

Step-by-Step Process

Phase 1: Axum API Implementation

Step 1: Core Infrastructure Setup

  • Set up database connection pooling with SQLx
  • Enhanced Configuration System:
    • Extend config.toml with comprehensive settings
    • AI provider configurations with separate summary/tagging settings
    • Display preferences and UI customization
    • Analytics and tracking preferences
    • Sharing templates and formats
    • Filter and search settings
    • Geographic tagging preferences
  • Establish error handling patterns with anyhow
  • Set up logging and analytics infrastructure

Step 2: Data Layer

  • Design comprehensive database schema with analytics and settings support
  • Create SQLx migrations for all tables including analytics and user preferences
  • Implement hierarchical tag system with geographic and content categories
  • Add legacy migration support for country filters
  • Implement article models with reading tracking and statistics
  • Add settings and preferences data layer
  • Create analytics data models and aggregation queries
  • Implement sharing templates and format management
  • Use SQLx's compile-time checked queries

Step 3: Enhanced Services Layer

  • Content Processing Services:
    • RSS feed fetching and parsing
    • Web scraping with quality tracking
    • AI services for summary and tagging
  • Enhanced Tagging Service:
    • Geographic location detection and tagging
    • Content category classification
    • Hierarchical tag relationships
    • Legacy filter migration logic
  • Analytics Service:
    • Reading statistics collection and aggregation
    • Content performance metrics
    • User behavior tracking
    • Trend analysis and insights
  • Settings Management Service:
    • User preference handling
    • System configuration management
    • Real-time settings updates
  • Sharing Service:
    • Multiple format generation
    • Template processing
    • Privacy-aware content filtering
  • Advanced Filtering Service:
    • Complex query building with geographic hierarchy
    • Filter preset management
    • Search optimization
    • Legacy filter migration

Step 4: Comprehensive API Layer

  • Article Management Routes:
    • GET /api/articles - List articles with compact display data
    • POST /api/articles - Submit manual article URL
    • GET /api/articles/:id - Get basic article info
    • GET /api/articles/:id/full - Get complete scraped content
    • GET /api/articles/:id/summary - Get AI summary
    • POST /api/articles/:id/read - Mark as read and track reading time
    • POST /api/articles/:id/share - Generate shareable content
  • Analytics Routes:
    • GET /api/analytics/dashboard - Main analytics dashboard data
    • GET /api/analytics/reading-stats - Personal reading statistics
    • GET /api/analytics/content-stats - Content and source analytics
    • GET /api/analytics/trends - Trending topics and patterns
    • GET /api/analytics/export - Export analytics data
  • Enhanced Filtering & Search Routes:
    • GET /api/filters/presets - Get saved filter presets
    • POST /api/filters/presets - Save new filter preset
    • GET /api/search/suggestions - Get search and filter suggestions
    • POST /api/search - Advanced search with multiple criteria
    • POST /api/filters/migrate - Migrate legacy country filters to tags
  • Settings Routes:
    • GET /api/settings - Get all user settings
    • PUT /api/settings - Update user settings
    • GET /api/settings/system - Get system configuration
    • PUT /api/settings/system - Update system settings (admin)
  • Enhanced Tag Management Routes:
    • GET /api/tags - List tags with usage statistics and hierarchy
    • GET /api/tags/geographic - Get geographic tag hierarchy
    • GET /api/tags/trending - Get trending tags
    • POST /api/tags/:id/follow - Follow/unfollow tag for notifications
    • GET /api/tags/categories - Get tag categories and relationships
  • Sharing Routes:
    • GET /api/share/templates - Get sharing templates
    • POST /api/share/templates - Create custom sharing template
    • POST /api/articles/:id/share/:format - Generate share content

Step 5: Enhanced Frontend Features

  • Compact Article Display:
    • Card-based layout with title, RSS excerpt, tags, and timestamps
    • Action buttons for Full Article, Summary, and Source
    • Hierarchical tag display with geographic and category indicators
    • Reading status and progress indicators
  • Advanced Analytics Dashboard:
    • Reading statistics with charts and trends
    • Content source performance metrics
    • Tag usage and trending topics with geographic breakdowns
    • Personal reading insights and goals
  • Comprehensive Filtering Interface:
    • Multi-criteria filter builder with geographic hierarchy
    • Saved filter presets with quick access
    • Smart filter suggestions based on tag relationships
    • Visual filter indicators and clear actions
    • Legacy filter migration interface
  • Settings Management Panel:
    • User preference configuration
    • AI and processing settings
    • Display and UI customization
    • Export/import functionality
  • Enhanced Sharing System:
    • Quick share buttons with format selection
    • Copy-to-clipboard functionality
    • Custom sharing templates
    • Preview before sharing

Step 6: Integration & Testing

  • Test all API endpoints with comprehensive coverage
  • Test analytics collection and aggregation
  • Test enhanced filtering and search functionality
  • Test legacy filter migration
  • Validate settings persistence and real-time updates
  • Test sharing functionality across different formats
  • Performance testing with large datasets and hierarchical tags
  • Deploy and monitor

Phase 2: CLI Application Addition

Step 1: Restructure for Multiple Binaries

  • Move API code to src/bin/server.rs
  • Create src/bin/cli.rs for CLI application
  • Keep shared logic in src/lib.rs
  • Update Cargo.toml to support multiple binaries

Step 2: Enhanced CLI with Analytics and Management

  • Core Commands:
    • owly list [--filters] [--format table|json|compact] - List articles
    • owly show <id> [--content|--summary] - Display specific article
    • owly read <id> - Mark article as read and open in pager
    • owly open <id> - Open source URL in browser
  • Analytics Commands:
    • owly stats [--period day|week|month|year] - Show reading statistics
    • owly trends [--tags|--sources|--topics|--geo] - Display trending content
    • owly analytics export [--format csv|json] - Export analytics data
  • Management Commands:
    • owly settings [--get key] [--set key=value] - Manage settings
    • owly filters [--list|--save name|--load name] - Manage filter presets
    • owly cleanup [--old|--unread|--failed] - Clean up articles
    • owly migrate [--from-country-filters] - Migrate legacy data
  • Enhanced Filtering Commands:
    • owly filter [--tag] [--geo] [--category] - Advanced filtering with geographic support
    • owly tags [--list|--hierarchy|--geo] - Tag management with geographic display
  • Sharing Commands:
    • owly share <id> [--format text|markdown|html] - Generate share content
    • owly export <id> [--template name] [--output file] - Export article

Step 3: Advanced CLI Features

  • Interactive filtering and search with geographic hierarchy
  • Real-time analytics display with charts (using ASCII graphs)
  • Bulk operations with progress indicators
  • Settings management with validation
  • Shell completion for all commands and parameters
  • Legacy data migration tools

Phase 3: Dioxus Frontend Migration

Step 1: Component Architecture

  • Core Display Components:
    • ArticleCard - Compact article display with action buttons
    • ArticleViewer - Full article content display
    • SummaryViewer - AI summary display
    • TagCloud - Interactive tag display with geographic hierarchy
    • GeographicMap - Visual geographic filtering interface
  • Analytics Components:
    • AnalyticsDashboard - Main analytics overview
    • ReadingStats - Personal reading statistics
    • TrendChart - Trending topics and patterns
    • ContentMetrics - Source and content analytics
    • GeographicAnalytics - Location-based content insights
  • Enhanced Filtering Components:
    • FilterBuilder - Advanced filter creation interface with geographic support
    • FilterPresets - Saved filter management
    • SearchBar - Smart search with suggestions
    • GeographicFilter - Hierarchical location filtering
    • MigrationTool - Legacy filter migration interface
  • Settings Components:
    • SettingsPanel - User preference management
    • SystemConfig - System-wide configuration
    • ExportImport - Data export/import functionality
  • Sharing Components:
    • ShareDialog - Sharing interface with format options
    • ShareTemplates - Custom template management

Step 2: Enhanced UX Features

  • Smart Article Display:
    • Lazy loading for performance
    • Infinite scroll with virtualization
    • Quick preview on hover
    • Keyboard navigation support
  • Advanced Analytics:
    • Interactive charts and graphs with geographic data
    • Customizable dashboard widgets
    • Goal setting and progress tracking
    • Comparison and trend analysis
  • Intelligent Filtering:
    • Auto-complete for filters with geographic suggestions
    • Visual filter builder with map integration
    • Filter combination suggestions based on tag relationships
    • Saved search notifications
  • Seamless Sharing:
    • One-click sharing with clipboard integration
    • Live preview of shared content
    • Social media format optimization
    • Batch sharing capabilities

Key Strategic Considerations

1. Performance & Scalability

  • Efficient Data Loading: Lazy loading and pagination for large datasets
  • Optimized Queries: Indexed database queries for filtering and analytics with hierarchical tag support
  • Caching Strategy: Smart caching for frequently accessed content and tag hierarchies
  • Real-time Updates: WebSocket integration for live analytics

2. User Experience Focus

  • Progressive Disclosure: Show essential info first, details on demand
  • Responsive Design: Optimized for mobile and desktop
  • Accessibility: Full keyboard navigation and screen reader support
  • Customization: User-configurable interface and behavior
  • Smooth Migration: Seamless transition from country-based to tag-based filtering

3. Analytics & Insights

  • Privacy-First: User control over data collection and retention
  • Actionable Insights: Meaningful statistics that guide reading habits
  • Performance Metrics: System health and efficiency tracking
  • Trend Analysis: Pattern recognition for content and behavior with geographic context

4. Content Management

  • Flexible Display: Multiple view modes for different use cases
  • Smart Organization: AI-assisted content categorization with geographic awareness
  • Bulk Operations: Efficient management of large article collections
  • Data Integrity: Reliable content processing and error handling
  • Legacy Support: Smooth migration from existing country-based filtering

Enhanced Configuration File Structure

[server]
host = '127.0.0.1'
port = 8090

[display]
default_view = "compact"  # compact, full, summary
articles_per_page = 50
show_reading_time = true
show_word_count = false
highlight_unread = true
theme = "auto"  # light, dark, auto

[analytics]
enabled = true
track_reading_time = true
track_scroll_position = true
retention_days = 365  # How long to keep detailed analytics
aggregate_older_data = true

[filtering]
enable_smart_suggestions = true
max_recent_filters = 10
auto_save_filters = true
default_sort = "added_desc"  # added_desc, published_desc, title_asc
enable_geographic_hierarchy = true
auto_migrate_country_filters = true

[sharing]
default_format = "text"
include_summary = true
include_tags = true
include_source = true
copy_to_clipboard = true

[sharing.templates.text]
format = """
📰 {title}

{summary}

🏷️ Tags: {tags}
🌍 Location: {geographic_tags}
🔗 Source: {url}
📅 Published: {published_at}

Shared via Owly News Summariser
"""

[sharing.templates.markdown]
format = """
# {title}

{summary}

**Tags:** {tags}  
**Location:** {geographic_tags}  
**Source:** [{url}]({url})  
**Published:** {published_at}

---
*Shared via Owly News Summariser*
"""

[ai]
enabled = true
provider = "ollama"
timeout_seconds = 120

[ai.summary]
enabled = true
temperature = 0.1
max_tokens = 1000

[ai.tagging]
enabled = true
temperature = 0.3
max_tokens = 200
max_tags_per_article = 10
min_confidence_threshold = 0.7
enable_geographic_tagging = true
enable_category_tagging = true
geographic_hierarchy_levels = 3  # country, region, city

[scraping]
timeout_seconds = 30
max_retries = 3
max_content_length = 50000
respect_robots_txt = true
rate_limit_delay_ms = 1000

[processing]
batch_size = 10
max_concurrent = 5
retry_attempts = 3
priority_manual = true
auto_mark_read_on_view = false

[migration]
auto_convert_country_filters = true
preserve_legacy_data = true
migration_batch_size = 100

[cli]
default_output = "table"
pager_command = "less"
show_progress = true
auto_confirm_bulk = false
show_geographic_hierarchy = true

Migration Strategy for Country-Based Filtering

Automatic Migration Process

  1. Data Analysis: Scan existing country filter data and RSS feed origins
  2. Tag Generation: Create geographic tags for each country with hierarchical structure
  3. Filter Conversion: Convert country-based filters to tag-based equivalents
  4. User Notification: Inform users about the migration and new capabilities
  5. Gradual Rollout: Maintain backward compatibility during transition period

Enhanced Geographic Features

  • Hierarchical Display: Country → Region → City tag hierarchy
  • Visual Map Integration: Interactive geographic filtering via map interface
  • Smart Suggestions: Related location and content suggestions
  • Multi-Level Filtering: Filter by specific cities, regions, or broader geographic areas
  • Source Intelligence: AI detection of article geographic relevance beyond RSS origin

Future Enhancements (Post Phase 3)

Advanced Analytics

  • Machine Learning Insights: Content recommendation based on reading patterns and geographic preferences
  • Predictive Analytics: Trending topic prediction with geographic context
  • Behavioral Analysis: Reading habit optimization suggestions
  • Comparative Analytics: Benchmark against reading goals and regional averages

Enhanced Content Management

  • Smart Collections: AI-curated article collections with geographic themes
  • Reading Lists: Planned reading with progress tracking
  • Content Relationships: Related article suggestions with geographic relevance
  • Advanced Search: Full-text search with relevance scoring and geographic weighting

Social & Collaboration Features

  • Reading Groups: Shared reading lists and discussions with geographic focus
  • Social Sharing: Integration with social platforms
  • Collaborative Tagging: Community-driven content organization
  • Reading Challenges: Gamification of reading habits with geographic themes

Integration & Extensibility

  • Browser Extension: Seamless article saving and reading
  • Mobile Apps: Native iOS/Android applications with location awareness
  • API Ecosystem: Third-party integrations and plugins
  • Webhook System: Real-time notifications and integrations with geographic filtering