# Owly News Summariser - Project Roadmap
This document outlines the plan for evolving the project in three phases: migrating the backend from Python to Rust, adding a CLI application, and migrating the frontend from Vue to Dioxus.
## Project Structure Strategy
### Phase 1 (Current): Axum API Setup
```
owly-news-summariser/
├── src/
│   ├── main.rs # Entry point (will evolve)
│   ├── db.rs # Database connection & SQLx setup
│   ├── api.rs # API module declaration
│   ├── api/ # API-specific modules (no mod.rs needed)
│   │   ├── routes.rs # Route definitions
│   │   ├── middleware.rs # Custom middleware
│   │   └── handlers.rs # Request handlers & business logic
│   ├── models.rs # Models module declaration
│   ├── models/ # Data models & database entities
│   │   ├── user.rs
│   │   ├── article.rs
│   │   ├── summary.rs
│   │   ├── tag.rs # Tag models and relationships
│   │   ├── analytics.rs # Analytics and statistics models
│   │   └── settings.rs # User settings and preferences
│   ├── services.rs # Services module declaration
│   ├── services/ # Business logic layer
│   │   ├── news_service.rs
│   │   ├── summary_service.rs
│   │   ├── scraping_service.rs # Article content extraction
│   │   ├── tagging_service.rs # AI-powered tagging
│   │   ├── analytics_service.rs # Reading stats and analytics
│   │   └── sharing_service.rs # Article sharing functionality
│   └── config.rs # Configuration management
├── migrations/ # SQLx migrations (managed by SQLx CLI)
├── frontend/ # Keep existing Vue frontend for now
├── config.toml # Configuration file with AI settings
└── Cargo.toml
```
### Phase 2: Multi-Binary Structure (API + CLI)
```
owly-news-summariser/
├── src/
│   ├── lib.rs # Shared library code
│   ├── bin/
│   │   ├── server.rs # API server binary
│   │   └── cli.rs # CLI application binary
│   └── [same module structure as Phase 1]
├── migrations/
├── frontend/
├── completions/ # Shell completion scripts
│   ├── owly.bash
│   ├── owly.zsh
│   └── owly.fish
├── config.toml
└── Cargo.toml # Updated for multiple binaries
```
### Phase 3: Full Rust Stack
```
owly-news-summariser/
├── src/
│   └── [same structure as Phase 2]
├── migrations/
├── frontend-dioxus/ # New Dioxus frontend
├── frontend/ # Legacy Vue (to be removed)
├── completions/
├── config.toml
└── Cargo.toml
```
## Core Features & Architecture
### Article Processing & Display Workflow
**Hybrid Approach: RSS Feeds + Manual Submissions with Smart Content Management**
1. **Article Collection**
   - RSS feed monitoring and batch processing
   - Manual article URL submission
   - Store original content and metadata in database
2. **Content Processing Pipeline**
   - Fetch RSS articles → scrape full content → store in DB
   - **Compact Article Display**:
     - Title (primary display)
     - RSS description text
     - Tags (visual indicators)
     - Time posted (from RSS)
     - Time added (when added to system)
     - Action buttons: [Full Article] [Summary] [Source]
   - **On-Demand Content Loading**:
     - Full Article: Display complete scraped content
     - Summary: Show AI-generated summary
     - Source: Open original URL in new tab
   - Background async processing with status updates
   - Support for re-processing without re-fetching
3. **Intelligent Tagging System**
   - **Automatic Tag Generation**: AI analyzes content and assigns relevant tags
   - **Geographic & Source Tags**: AI-generated location tags (countries, regions, cities) and publication source tags
   - **Content Category Tags**: Technology, Politics, Business, Sports, Health, etc.
   - **Visual Tag Display**: Color-coded tags in compact article view with hierarchical display
   - **Tag Filtering**: Quick filtering by clicking tags with smart suggestions
   - **Custom Tags**: User-defined tags and categories
   - **Tag Confidence**: Visual indicators for AI vs manual tags
   - **Tag Migration**: Automatic conversion of legacy country filters to geographic tags
4. **Analytics & Statistics System**
   - **Reading Analytics**:
     - Articles read vs added
     - Reading time tracking
     - Most read categories and tags
     - Reading patterns over time
   - **Content Analytics**:
     - Source reliability and quality metrics
     - Tag usage statistics
     - Processing success rates
     - Content freshness tracking
   - **Performance Metrics**:
     - AI processing times
     - Scraping success rates
     - User engagement patterns
5. **Advanced Filtering System**
   - **Multi-Criteria Filtering**:
     - By tags (single or multiple with AND/OR logic)
     - By geographic tags (country, region, city with hierarchical filtering)
     - By content categories and topics
     - By date ranges (posted, added, read)
     - By processing status (pending, completed, failed)
     - By content availability (scraped, summary, RSS-only)
     - By read/unread status
   - **Smart Filter Migration**: Automatic conversion of legacy country filters to tag-based equivalents
   - **Saved Filter Presets**:
     - Custom filter combinations
     - Quick access to frequent searches
     - Geographic preset templates (e.g., "European Tech News", "US Politics")
   - **Smart Suggestions**: Filter suggestions based on usage patterns and tag relationships
6. **Settings & Management System**
   - **User Preferences**:
     - Default article view mode
     - Tag display preferences with geographic hierarchy settings
     - Reading tracking settings
     - Notification preferences
   - **System Settings**:
     - AI configuration (via API and config file)
     - Processing settings
     - Display customization
     - Export preferences
   - **Content Management**:
     - Bulk operations (mark read, delete, retag)
     - Archive old articles
     - Export/import functionality
     - Legacy data migration tools
7. **Article Sharing System**
   - **Multiple Share Formats**:
     - Clean text format with title, summary, and source link
     - Markdown format for developers
     - Rich HTML format for email/web
     - JSON format for API integration
   - **Copy to Clipboard**: One-click formatted sharing
   - **Share Templates**: Customizable sharing formats
   - **Privacy Controls**: Control what information is included in shares
8. **Database Schema**
```
Articles: id, title, url, source_type, rss_content, full_content,
summary, processing_status, published_at, added_at, read_at,
read_count, reading_time, ai_enabled, created_at, updated_at
Tags: id, name, category, description, color, usage_count, parent_id, created_at
ArticleTags: article_id, tag_id, confidence_score, ai_generated, created_at
ReadingStats: user_id, article_id, read_at, reading_time, completion_rate
FilterPresets: id, name, filter_criteria, user_id, created_at
Settings: key, value, category, user_id, updated_at
ShareTemplates: id, name, format, template_content, created_at
LegacyMigration: old_filter_type, old_value, new_tag_ids, migrated_at
```
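
The `ArticleTags` columns above, together with the `min_confidence_threshold` and `max_tags_per_article` settings in the configuration section, suggest how AI tag candidates could be trimmed before they are stored. The following is a minimal sketch under stated assumptions: the struct mirrors the `ArticleTags` columns, while the function name `select_tags`, the field types, and the policy that manual tags bypass the threshold and sort first are illustrative choices, not decisions made by this roadmap.

```rust
/// Mirrors the `ArticleTags` columns (without `created_at`); field types are assumed.
#[derive(Debug, Clone, PartialEq)]
pub struct ArticleTag {
    pub article_id: i64,
    pub tag_id: i64,
    pub confidence_score: f64,
    pub ai_generated: bool,
}

/// Hypothetical helper: keeps at most `max_tags` tags for one article.
/// Assumed policy: manual tags always pass; AI tags must reach `min_confidence`.
pub fn select_tags(
    mut candidates: Vec<ArticleTag>,
    min_confidence: f64,
    max_tags: usize,
) -> Vec<ArticleTag> {
    // Drop AI-generated tags below the confidence threshold.
    candidates.retain(|t| !t.ai_generated || t.confidence_score >= min_confidence);
    // Manual tags first (false sorts before true), then highest confidence first.
    candidates.sort_by(|a, b| {
        a.ai_generated
            .cmp(&b.ai_generated)
            .then(b.confidence_score.total_cmp(&a.confidence_score))
    });
    // Enforce the per-article cap.
    candidates.truncate(max_tags);
    candidates
}
```

With the values from the configuration section, a call would look like `select_tags(candidates, 0.7, 10)`.
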
## Step-by-Step Process
### Phase 1: Axum API Implementation
**Step 1: Core Infrastructure Setup**
- Set up database connection pooling with SQLx
- **Enhanced Configuration System**:
  - Extend config.toml with comprehensive settings
  - AI provider configurations with separate summary/tagging settings
  - Display preferences and UI customization
  - Analytics and tracking preferences
  - Sharing templates and formats
  - Filter and search settings
  - Geographic tagging preferences
- Establish error handling patterns with `anyhow`
- Set up logging and analytics infrastructure
**Step 2: Data Layer**
- Design comprehensive database schema with analytics and settings support
- Create SQLx migrations for all tables including analytics and user preferences
- Implement hierarchical tag system with geographic and content categories
- Add legacy migration support for country filters
- Implement article models with reading tracking and statistics
- Add settings and preferences data layer
- Create analytics data models and aggregation queries
- Implement sharing templates and format management
- Use SQLx's compile-time checked queries
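
The hierarchical tag system relies on the `parent_id` column of the `Tags` table. As a rough illustration of how the geographic hierarchy could be walked in memory, the sketch below resolves a tag's path from the root and checks whether a tag sits under a given ancestor. The in-memory `HashMap`, the reduced `Tag` struct, and the function names `tag_path` and `is_within` are assumptions made for this example; a real implementation would more likely resolve the hierarchy in SQL.

```rust
use std::collections::HashMap;

/// Reduced view of a `Tags` row; only the fields needed for hierarchy walks.
#[derive(Debug, Clone)]
pub struct Tag {
    pub id: i64,
    pub name: String,
    pub parent_id: Option<i64>,
}

/// Returns the tag names from the root down to `id`, e.g. country, region, city.
/// The walk stops after `tags.len()` steps so a cyclic `parent_id` cannot loop forever.
pub fn tag_path(tags: &HashMap<i64, Tag>, id: i64) -> Vec<String> {
    let mut path = Vec::new();
    let mut current = tags.get(&id);
    while let Some(tag) = current {
        path.push(tag.name.clone());
        if path.len() >= tags.len() {
            break;
        }
        current = tag.parent_id.and_then(|p| tags.get(&p));
    }
    path.reverse();
    path
}

/// True if `id` is `ancestor_id` itself or lies below it in the hierarchy.
/// This is the check a "filter by country includes its cities" rule would need.
pub fn is_within(tags: &HashMap<i64, Tag>, id: i64, ancestor_id: i64) -> bool {
    let mut current = Some(id);
    for _ in 0..=tags.len() {
        match current {
            Some(c) if c == ancestor_id => return true,
            Some(c) => current = tags.get(&c).and_then(|t| t.parent_id),
            None => return false,
        }
    }
    false
}
```
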
**Step 3: Enhanced Services Layer**
- **Content Processing Services**:
  - RSS feed fetching and parsing
  - Web scraping with quality tracking
  - AI services for summary and tagging
- **Enhanced Tagging Service**:
  - Geographic location detection and tagging
  - Content category classification
  - Hierarchical tag relationships
  - Legacy filter migration logic
- **Analytics Service**:
  - Reading statistics collection and aggregation
  - Content performance metrics
  - User behavior tracking
  - Trend analysis and insights
- **Settings Management Service**:
  - User preference handling
  - System configuration management
  - Real-time settings updates
- **Sharing Service**:
  - Multiple format generation
  - Template processing
  - Privacy-aware content filtering
- **Advanced Filtering Service**:
  - Complex query building with geographic hierarchy
  - Filter preset management
  - Search optimization
  - Legacy filter migration
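
The filtering service has to support tag filters with AND/OR logic. One possible in-memory shape for that rule is sketched below; the names `TagMatch` and `TagFilter` are hypothetical, and treating an empty filter as "match everything" is an assumption. In practice the same logic would probably be pushed down into the SQL query rather than evaluated per article.

```rust
use std::collections::HashSet;

/// How multiple tags in one filter are combined.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TagMatch {
    /// AND: the article must carry every listed tag.
    All,
    /// OR: the article must carry at least one listed tag.
    Any,
}

/// Hypothetical filter criterion for the "by tags" case.
#[derive(Debug, Clone)]
pub struct TagFilter {
    pub tags: Vec<String>,
    pub mode: TagMatch,
}

impl TagFilter {
    /// Checks one article's tag set against the filter.
    pub fn matches(&self, article_tags: &HashSet<String>) -> bool {
        // Assumption: a filter without tags does not restrict the result.
        if self.tags.is_empty() {
            return true;
        }
        match self.mode {
            TagMatch::All => self.tags.iter().all(|t| article_tags.contains(t)),
            TagMatch::Any => self.tags.iter().any(|t| article_tags.contains(t)),
        }
    }
}
```
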
**Step 4: Comprehensive API Layer**
- **Article Management Routes**:
  - `GET /api/articles` - List articles with compact display data
  - `POST /api/articles` - Submit manual article URL
  - `GET /api/articles/:id` - Get basic article info
  - `GET /api/articles/:id/full` - Get complete scraped content
  - `GET /api/articles/:id/summary` - Get AI summary
  - `POST /api/articles/:id/read` - Mark as read and track reading time
  - `POST /api/articles/:id/share` - Generate shareable content
- **Analytics Routes**:
  - `GET /api/analytics/dashboard` - Main analytics dashboard data
  - `GET /api/analytics/reading-stats` - Personal reading statistics
  - `GET /api/analytics/content-stats` - Content and source analytics
  - `GET /api/analytics/trends` - Trending topics and patterns
  - `GET /api/analytics/export` - Export analytics data
- **Enhanced Filtering & Search Routes**:
  - `GET /api/filters/presets` - Get saved filter presets
  - `POST /api/filters/presets` - Save new filter preset
  - `GET /api/search/suggestions` - Get search and filter suggestions
  - `POST /api/search` - Advanced search with multiple criteria
  - `POST /api/filters/migrate` - Migrate legacy country filters to tags
- **Settings Routes**:
  - `GET /api/settings` - Get all user settings
  - `PUT /api/settings` - Update user settings
  - `GET /api/settings/system` - Get system configuration
  - `PUT /api/settings/system` - Update system settings (admin)
- **Enhanced Tag Management Routes**:
  - `GET /api/tags` - List tags with usage statistics and hierarchy
  - `GET /api/tags/geographic` - Get geographic tag hierarchy
  - `GET /api/tags/trending` - Get trending tags
  - `POST /api/tags/:id/follow` - Follow/unfollow tag for notifications
  - `GET /api/tags/categories` - Get tag categories and relationships
- **Sharing Routes**:
  - `GET /api/share/templates` - Get sharing templates
  - `POST /api/share/templates` - Create custom sharing template
  - `POST /api/articles/:id/share/:format` - Generate share content
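
The `:format` segment of the last sharing route has to be validated before any content is generated. The sketch below shows one way to model the four share formats named earlier (text, Markdown, HTML, JSON) as an enum parsed from the path segment. It deliberately leaves out the Axum handler itself; the `ShareFormat` name, the case-insensitive parsing, and the `content_type` helper are assumptions made for this example.

```rust
use std::str::FromStr;

/// The share formats listed under "Multiple Share Formats".
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ShareFormat {
    Text,
    Markdown,
    Html,
    Json,
}

impl FromStr for ShareFormat {
    type Err = String;

    /// Parses the `:format` path segment; matching is case-insensitive (assumed).
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.to_ascii_lowercase().as_str() {
            "text" => Ok(ShareFormat::Text),
            "markdown" => Ok(ShareFormat::Markdown),
            "html" => Ok(ShareFormat::Html),
            "json" => Ok(ShareFormat::Json),
            other => Err(format!("unsupported share format: {}", other)),
        }
    }
}

impl ShareFormat {
    /// Media type a handler could send in the `Content-Type` response header.
    pub fn content_type(self) -> &'static str {
        match self {
            ShareFormat::Text => "text/plain",
            ShareFormat::Markdown => "text/markdown",
            ShareFormat::Html => "text/html",
            ShareFormat::Json => "application/json",
        }
    }
}
```

A handler could map the `Err` case to a `400 Bad Request` response, so an unknown format never reaches the sharing service.
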
**Step 5: Enhanced Frontend Features**
- **Compact Article Display**:
  - Card-based layout with title, RSS excerpt, tags, and timestamps
  - Action buttons for Full Article, Summary, and Source
  - Hierarchical tag display with geographic and category indicators
  - Reading status and progress indicators
- **Advanced Analytics Dashboard**:
  - Reading statistics with charts and trends
  - Content source performance metrics
  - Tag usage and trending topics with geographic breakdowns
  - Personal reading insights and goals
- **Comprehensive Filtering Interface**:
  - Multi-criteria filter builder with geographic hierarchy
  - Saved filter presets with quick access
  - Smart filter suggestions based on tag relationships
  - Visual filter indicators and clear actions
  - Legacy filter migration interface
- **Settings Management Panel**:
  - User preference configuration
  - AI and processing settings
  - Display and UI customization
  - Export/import functionality
- **Enhanced Sharing System**:
  - Quick share buttons with format selection
  - Copy-to-clipboard functionality
  - Custom sharing templates
  - Preview before sharing
**Step 6: Integration & Testing**
- Test all API endpoints with comprehensive coverage
- Test analytics collection and aggregation
- Test enhanced filtering and search functionality
- Test legacy filter migration
- Validate settings persistence and real-time updates
- Test sharing functionality across different formats
- Performance testing with large datasets and hierarchical tags
- Deploy and monitor
### Phase 2: CLI Application Addition
**Step 1: Restructure for Multiple Binaries**
- Move API code to `src/bin/server.rs`
- Create `src/bin/cli.rs` for CLI application
- Keep shared logic in `src/lib.rs`
- Update Cargo.toml to support multiple binaries
**Step 2: Enhanced CLI with Analytics and Management**
- **Core Commands**:
  - `owly list [--filters] [--format table|json|compact]` - List articles
  - `owly show <id> [--content|--summary]` - Display specific article
  - `owly read <id>` - Mark article as read and open in pager
  - `owly open <id>` - Open source URL in browser
- **Analytics Commands**:
  - `owly stats [--period day|week|month|year]` - Show reading statistics
  - `owly trends [--tags|--sources|--topics|--geo]` - Display trending content
  - `owly analytics export [--format csv|json]` - Export analytics data
- **Management Commands**:
  - `owly settings [--get key] [--set key=value]` - Manage settings
  - `owly filters [--list|--save name|--load name]` - Manage filter presets
  - `owly cleanup [--old|--unread|--failed]` - Clean up articles
  - `owly migrate [--from-country-filters]` - Migrate legacy data
- **Enhanced Filtering Commands**:
  - `owly filter [--tag] [--geo] [--category]` - Advanced filtering with geographic support
  - `owly tags [--list|--hierarchy|--geo]` - Tag management with geographic display
- **Sharing Commands**:
  - `owly share <id> [--format text|markdown|html]` - Generate share content
  - `owly export <id> [--template name] [--output file]` - Export article
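
To make the command shapes above more concrete, the following sketch parses three of the core commands (`list`, `show`, `open`) into a typed `Command` value. It is hand-rolled on the standard library only to keep the example dependency-free; the real CLI would more likely use an argument-parsing crate. The enum names, the `u64` article id, and the error messages are assumptions, and the function expects the arguments after the program name.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum OutputFormat {
    Table,
    Json,
    Compact,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Command {
    List { format: OutputFormat },
    Show { id: u64, summary: bool },
    Open { id: u64 },
}

/// Parses the `<id>` argument; the numeric id type is an assumption.
fn parse_id(arg: Option<&str>) -> Result<u64, String> {
    let raw = arg.ok_or_else(|| "missing article id".to_string())?;
    raw.parse::<u64>()
        .map_err(|e| format!("invalid article id {:?}: {}", raw, e))
}

/// Parses the arguments that follow the program name, e.g. `["list", "--format", "json"]`.
pub fn parse_args(args: &[&str]) -> Result<Command, String> {
    let (name, rest) = args.split_first().ok_or("missing command")?;
    match *name {
        "list" => {
            // Default matches `default_output = "table"` in the `[cli]` config section.
            let mut format = OutputFormat::Table;
            let mut it = rest.iter();
            while let Some(arg) = it.next() {
                match *arg {
                    "--format" => {
                        format = match it.next().copied() {
                            Some("table") => OutputFormat::Table,
                            Some("json") => OutputFormat::Json,
                            Some("compact") => OutputFormat::Compact,
                            other => return Err(format!("invalid --format value: {:?}", other)),
                        };
                    }
                    unknown => return Err(format!("unknown option for list: {}", unknown)),
                }
            }
            Ok(Command::List { format })
        }
        "show" => {
            let id = parse_id(rest.first().copied())?;
            let mut summary = false;
            for flag in rest.iter().skip(1) {
                match *flag {
                    "--content" => summary = false,
                    "--summary" => summary = true,
                    unknown => return Err(format!("unknown option for show: {}", unknown)),
                }
            }
            Ok(Command::Show { id, summary })
        }
        "open" => Ok(Command::Open { id: parse_id(rest.first().copied())? }),
        other => Err(format!("unknown command: {}", other)),
    }
}
```
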
**Step 3: Advanced CLI Features**
- Interactive filtering and search with geographic hierarchy
- Real-time analytics display with charts (using ASCII graphs)
- Bulk operations with progress indicators
- Settings management with validation
- Shell completion for all commands and parameters
- Legacy data migration tools
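
For the ASCII graphs mentioned above, a horizontal bar chart is probably the simplest form that works in any terminal. The sketch below scales each value against the largest one and pads the labels to a common width. The function name `ascii_bar_chart`, the `#` bar character, and the line layout are illustrative assumptions.

```rust
/// Renders one line per entry: padded label, a bar scaled to `max_width`, then the raw value.
pub fn ascii_bar_chart(data: &[(&str, u32)], max_width: usize) -> String {
    // The largest value gets the full bar width; everything else is scaled relative to it.
    let max_value = data.iter().map(|(_, v)| *v).max().unwrap_or(0);
    // Pad labels so that all bars start in the same column.
    let label_width = data.iter().map(|(l, _)| l.chars().count()).max().unwrap_or(0);

    let mut out = String::new();
    for (label, value) in data {
        let bar_len = if max_value == 0 {
            0
        } else {
            (*value as usize * max_width) / max_value as usize
        };
        out.push_str(&format!(
            "{:<width$} | {} {}\n",
            label,
            "#".repeat(bar_len),
            value,
            width = label_width
        ));
    }
    out
}
```

A command such as `owly stats --period week` could feed per-day read counts into this function and print the result.
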
### Phase 3: Dioxus Frontend Migration
**Step 1: Component Architecture**
- **Core Display Components**:
  - `ArticleCard` - Compact article display with action buttons
  - `ArticleViewer` - Full article content display
  - `SummaryViewer` - AI summary display
  - `TagCloud` - Interactive tag display with geographic hierarchy
  - `GeographicMap` - Visual geographic filtering interface
- **Analytics Components**:
  - `AnalyticsDashboard` - Main analytics overview
  - `ReadingStats` - Personal reading statistics
  - `TrendChart` - Trending topics and patterns
  - `ContentMetrics` - Source and content analytics
  - `GeographicAnalytics` - Location-based content insights
- **Enhanced Filtering Components**:
  - `FilterBuilder` - Advanced filter creation interface with geographic support
  - `FilterPresets` - Saved filter management
  - `SearchBar` - Smart search with suggestions
  - `GeographicFilter` - Hierarchical location filtering
  - `MigrationTool` - Legacy filter migration interface
- **Settings Components**:
  - `SettingsPanel` - User preference management
  - `SystemConfig` - System-wide configuration
  - `ExportImport` - Data export/import functionality
- **Sharing Components**:
  - `ShareDialog` - Sharing interface with format options
  - `ShareTemplates` - Custom template management
**Step 2: Enhanced UX Features**
- **Smart Article Display**:
  - Lazy loading for performance
  - Infinite scroll with virtualization
  - Quick preview on hover
  - Keyboard navigation support
- **Advanced Analytics**:
  - Interactive charts and graphs with geographic data
  - Customizable dashboard widgets
  - Goal setting and progress tracking
  - Comparison and trend analysis
- **Intelligent Filtering**:
  - Auto-complete for filters with geographic suggestions
  - Visual filter builder with map integration
  - Filter combination suggestions based on tag relationships
  - Saved search notifications
- **Seamless Sharing**:
  - One-click sharing with clipboard integration
  - Live preview of shared content
  - Social media format optimization
  - Batch sharing capabilities
## Key Strategic Considerations
### 1. Performance & Scalability
- **Efficient Data Loading**: Lazy loading and pagination for large datasets
- **Optimized Queries**: Indexed database queries for filtering and analytics with hierarchical tag support
- **Caching Strategy**: Smart caching for frequently accessed content and tag hierarchies
- **Real-time Updates**: WebSocket integration for live analytics
### 2. User Experience Focus
- **Progressive Disclosure**: Show essential info first, details on demand
- **Responsive Design**: Optimized for mobile and desktop
- **Accessibility**: Full keyboard navigation and screen reader support
- **Customization**: User-configurable interface and behavior
- **Smooth Migration**: Seamless transition from country-based to tag-based filtering
### 3. Analytics & Insights
- **Privacy-First**: User control over data collection and retention
- **Actionable Insights**: Meaningful statistics that guide reading habits
- **Performance Metrics**: System health and efficiency tracking
- **Trend Analysis**: Pattern recognition for content and behavior with geographic context
### 4. Content Management
- **Flexible Display**: Multiple view modes for different use cases
- **Smart Organization**: AI-assisted content categorization with geographic awareness
- **Bulk Operations**: Efficient management of large article collections
- **Data Integrity**: Reliable content processing and error handling
- **Legacy Support**: Smooth migration from existing country-based filtering
## Enhanced Configuration File Structure
```toml
[server]
host = '127.0.0.1'
port = 8090
[display]
default_view = "compact" # compact, full, summary
articles_per_page = 50
show_reading_time = true
show_word_count = false
highlight_unread = true
theme = "auto" # light, dark, auto
[analytics]
enabled = true
track_reading_time = true
track_scroll_position = true
retention_days = 365 # How long to keep detailed analytics
aggregate_older_data = true
[filtering]
enable_smart_suggestions = true
max_recent_filters = 10
auto_save_filters = true
default_sort = "added_desc" # added_desc, published_desc, title_asc
enable_geographic_hierarchy = true
auto_migrate_country_filters = true
[sharing]
default_format = "text"
include_summary = true
include_tags = true
include_source = true
copy_to_clipboard = true
[sharing.templates.text]
format = """
📰 {title}
{summary}
🏷️ Tags: {tags}
🌍 Location: {geographic_tags}
🔗 Source: {url}
📅 Published: {published_at}
Shared via Owly News Summariser
"""
[sharing.templates.markdown]
format = """
# {title}
{summary}
**Tags:** {tags}
**Location:** {geographic_tags}
**Source:** [{url}]({url})
**Published:** {published_at}
---
*Shared via Owly News Summariser*
"""
[ai]
enabled = true
provider = "ollama"
timeout_seconds = 120
[ai.summary]
enabled = true
temperature = 0.1
max_tokens = 1000
[ai.tagging]
enabled = true
temperature = 0.3
max_tokens = 200
max_tags_per_article = 10
min_confidence_threshold = 0.7
enable_geographic_tagging = true
enable_category_tagging = true
geographic_hierarchy_levels = 3 # country, region, city
[scraping]
timeout_seconds = 30
max_retries = 3
max_content_length = 50000
respect_robots_txt = true
rate_limit_delay_ms = 1000
[processing]
batch_size = 10
max_concurrent = 5
retry_attempts = 3
priority_manual = true
auto_mark_read_on_view = false
[migration]
auto_convert_country_filters = true
preserve_legacy_data = true
migration_batch_size = 100
[cli]
default_output = "table"
pager_command = "less"
show_progress = true
auto_confirm_bulk = false
show_geographic_hierarchy = true
```
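
The `[sharing.templates.*]` entries above use `{placeholder}` fields such as `{title}` and `{url}`. A small single-pass renderer is enough to fill them in, as sketched below. The function name `render_template` and the choice to leave unknown placeholders untouched are assumptions for this example; the roadmap does not specify how templates are processed.

```rust
use std::collections::HashMap;

/// Replaces each `{key}` in `template` with the matching entry from `fields`.
/// Substituted values are never re-scanned, so a title containing `{url}` stays literal.
pub fn render_template(template: &str, fields: &HashMap<&str, String>) -> String {
    let mut out = String::with_capacity(template.len());
    let mut rest = template;
    while let Some(open) = rest.find('{') {
        // Copy everything before the placeholder unchanged.
        out.push_str(&rest[..open]);
        let after = &rest[open + 1..];
        match after.find('}') {
            Some(close) => {
                let key = &after[..close];
                match fields.get(key) {
                    Some(value) => out.push_str(value),
                    // Unknown placeholder: keep it verbatim so the gap stays visible.
                    None => {
                        out.push('{');
                        out.push_str(key);
                        out.push('}');
                    }
                }
                rest = &after[close + 1..];
            }
            None => {
                // Unmatched '{': copy the remainder as-is and stop.
                out.push('{');
                out.push_str(after);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    out
}
```

Rendering in one pass avoids the double-substitution problem that chained `str::replace` calls would have when article text happens to contain brace-delimited words.
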
## Migration Strategy for Country-Based Filtering
### Automatic Migration Process
1. **Data Analysis**: Scan existing country filter data and RSS feed origins
2. **Tag Generation**: Create geographic tags for each country with hierarchical structure
3. **Filter Conversion**: Convert country-based filters to tag-based equivalents
4. **User Notification**: Inform users about the migration and new capabilities
5. **Gradual Rollout**: Maintain backward compatibility during transition period
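
Steps 2 and 3 of the process above can be pictured as a lookup-or-create pass over the legacy filters. The sketch below keeps everything in memory and produces records shaped like the `LegacyMigration` table. The `"country"` filter-type string, the normalisation by trimming and lowercasing, and the way new tag ids are allocated are all assumptions for illustration; a real migration would run against the database inside a transaction.

```rust
use std::collections::HashMap;

/// A legacy filter as it might be read from the old data (shape assumed).
#[derive(Debug, Clone)]
pub struct LegacyFilter {
    pub filter_type: String,
    pub value: String,
}

/// Mirrors the `LegacyMigration` columns, without `migrated_at`.
#[derive(Debug, Clone, PartialEq)]
pub struct MigrationRecord {
    pub old_filter_type: String,
    pub old_value: String,
    pub new_tag_ids: Vec<i64>,
}

/// Converts country filters to geographic tag ids, creating tags on first use.
/// `country_tags` maps a normalised country name to its tag id and is updated in place.
pub fn migrate_country_filters(
    filters: &[LegacyFilter],
    country_tags: &mut HashMap<String, i64>,
) -> Vec<MigrationRecord> {
    // Assumed id allocation: continue after the highest existing tag id.
    let mut next_id = country_tags.values().copied().max().unwrap_or(0) + 1;
    let mut records = Vec::new();

    // Only country filters are migrated; other filter types are left untouched.
    for filter in filters.iter().filter(|f| f.filter_type == "country") {
        let key = filter.value.trim().to_lowercase();
        let tag_id = *country_tags.entry(key).or_insert_with(|| {
            let id = next_id;
            next_id += 1;
            id
        });
        records.push(MigrationRecord {
            old_filter_type: filter.filter_type.clone(),
            old_value: filter.value.clone(),
            new_tag_ids: vec![tag_id],
        });
    }
    records
}
```

Keeping the original `old_value` in each record is in line with `preserve_legacy_data = true` in the configuration file, since it lets the conversion be audited or reversed later.
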
### Enhanced Geographic Features
- **Hierarchical Display**: Country → Region → City tag hierarchy
- **Visual Map Integration**: Interactive geographic filtering via map interface
- **Smart Suggestions**: Related location and content suggestions
- **Multi-Level Filtering**: Filter by specific cities, regions, or broader geographic areas
- **Source Intelligence**: AI detection of article geographic relevance beyond RSS origin
## Future Enhancements (Post Phase 3)
### Advanced Analytics
- **Machine Learning Insights**: Content recommendation based on reading patterns and geographic preferences
- **Predictive Analytics**: Trending topic prediction with geographic context
- **Behavioral Analysis**: Reading habit optimization suggestions
- **Comparative Analytics**: Benchmark against reading goals and regional averages
### Enhanced Content Management
- **Smart Collections**: AI-curated article collections with geographic themes
- **Reading Lists**: Planned reading with progress tracking
- **Content Relationships**: Related article suggestions with geographic relevance
- **Advanced Search**: Full-text search with relevance scoring and geographic weighting
### Social & Collaboration Features
- **Reading Groups**: Shared reading lists and discussions with geographic focus
- **Social Sharing**: Integration with social platforms
- **Collaborative Tagging**: Community-driven content organization
- **Reading Challenges**: Gamification of reading habits with geographic themes
### Integration & Extensibility
- **Browser Extension**: Seamless article saving and reading
- **Mobile Apps**: Native iOS/Android applications with location awareness
- **API Ecosystem**: Third-party integrations and plugins
- **Webhook System**: Real-time notifications and integrations with geographic filtering