[refactor] rename and simplify ProgressManager to FileProgress, enhance caching logic, update Hugging Face API integration, and clean up unused comments
Some checks failed
CI / build (push) Has been cancelled
This commit is contained in:
150 README.md
@@ -1,122 +1,68 @@
# PolyScribe

PolyScribe is a fast, local-first CLI for transcribing audio/video and merging existing JSON transcripts. It uses whisper-rs under the hood, can discover and download Whisper models automatically, and supports CPU and optional GPU backends (CUDA, ROCm/HIP, Vulkan).
Local-first transcription and plugins.

Key features
- Transcribe audio and common video files using ffmpeg for audio extraction.
- Merge multiple JSON transcripts, or merge and also keep per-file outputs.
- Model management: interactive downloader and non-interactive updater with hash verification.
- GPU backend selection at runtime; auto-detects available accelerators.
- Clean outputs (JSON and SRT), speaker naming prompts, and useful logging controls.
## Features

Prerequisites
- Rust toolchain (rustup recommended)
- ffmpeg available on PATH
- Optional for GPU acceleration at runtime: CUDA, ROCm/HIP, or Vulkan drivers (match your build features)
- **Local-first**: Works offline with downloaded models
- **Multiple backends**: CPU, CUDA, ROCm/HIP, and Vulkan support
- **Plugin system**: Extensible via JSON-RPC plugins
- **Model management**: Automatic download and verification of Whisper models
- **Manifest caching**: Local cache for Hugging Face model manifests to reduce network requests

Installation
- Build from source (CPU-only by default):
  - rustup install stable
  - rustup default stable
  - cargo build --release
  - Binary path: ./target/release/polyscribe
- GPU builds (optional): build with features
  - CUDA: cargo build --release --features gpu-cuda
  - HIP: cargo build --release --features gpu-hip
  - Vulkan: cargo build --release --features gpu-vulkan
## Model Management

Quickstart
1) Download a model (first run can prompt you):
   - ./target/release/polyscribe models download
   - In the interactive picker, use Up/Down to navigate, Space to toggle selections, and Enter to confirm. Models are grouped by base (e.g., tiny, base, small).
PolyScribe automatically manages Whisper models from Hugging Face:

2) Transcribe a file:
   - ./target/release/polyscribe -v -o output my_audio.mp3
This writes JSON and SRT into the output directory with a date prefix.
```bash
# Download models interactively
polyscribe models download

Shell completions and man page
- Completions: ./target/release/polyscribe completions <bash|zsh|fish|powershell|elvish> > polyscribe.<ext>
- Then install into your shell’s completion directory.
- Man page: ./target/release/polyscribe man > polyscribe.1 (then copy to your manpath)
# Update existing models
polyscribe models update

Model locations
- Development (debug builds): ./models next to the project.
- Packaged/release builds: $XDG_DATA_HOME/polyscribe/models or ~/.local/share/polyscribe/models.
- Override via env var: POLYSCRIBE_MODELS_DIR=/path/to/models.
- Force a specific model file via env var: WHISPER_MODEL=/path/to/model.bin.
# Clear manifest cache (force fresh fetch)
polyscribe models clear-cache
```

Most-used CLI flags and subcommands
- -o, --output FILE_OR_DIR: Output path base (date prefix added). If omitted, JSON prints to stdout.
- -m, --merge: Merge all inputs into one output; otherwise one output per input.
- --merge-and-separate: Write both merged output and separate per-input outputs (requires -o dir).
- --set-speaker-names: Prompt for a speaker label per input file.
- Subcommands:
  - models update: Verify/update local models by size/hash against the upstream manifest.
  - models download: Interactive model list + multi-select download.
- --language LANG: Language code hint (e.g., en, de). English-only models reject non-en hints.
- --gpu-backend [auto|cpu|cuda|hip|vulkan]: Select backend (auto by default).
- --gpu-layers N: Offload N layers to GPU when supported.
- -v/--verbose (repeatable): Increase log verbosity. -vv shows very detailed logs.
- -q/--quiet: Suppress non-error logs (stderr); does not silence stdout results.
- --no-interaction: Never prompt; suitable for CI.
### Manifest Caching

Minimal usage examples
- Transcribe an audio file to JSON/SRT:
  - ./target/release/polyscribe -o output samples/podcast_clip.mp3
- Merge multiple transcripts into one:
  - ./target/release/polyscribe -m -o output merged input/a.json input/b.json
- Update local models non-interactively (good for CI):
  - ./target/release/polyscribe models update --no-interaction -q
- Download models interactively:
  - ./target/release/polyscribe models download
The Hugging Face model manifest is cached locally to avoid repeated network requests:

Troubleshooting & docs
- docs/faq.md – common issues and solutions (missing ffmpeg, GPU selection, model paths)
- docs/usage.md – complete CLI reference and workflows
- docs/development.md – build, run, and contribute locally
- docs/design.md – architecture overview and decisions
- docs/release-packaging.md – packaging notes for distributions
- CONTRIBUTING.md – PR checklist and CI workflow
- **Default TTL**: 24 hours
- **Cache location**: `$XDG_CACHE_HOME/polyscribe/manifest/` (or platform equivalent)
- **Environment variables**:
  - `POLYSCRIBE_NO_CACHE_MANIFEST=1`: Disable caching
  - `POLYSCRIBE_MANIFEST_TTL_SECONDS=3600`: Set custom TTL (in seconds)
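
A minimal standalone sketch of the freshness rule described above, mirroring the logic this commit adds in polyscribe-core (`load_cached_manifest`); the environment variable names are the ones listed, everything else is illustrative:

```rust
// Sketch only: bypass the cache when POLYSCRIBE_NO_CACHE_MANIFEST is set,
// otherwise compare the cache age against the TTL (default 24 h, overridable
// via POLYSCRIBE_MANIFEST_TTL_SECONDS).
use std::env;
use std::time::{SystemTime, UNIX_EPOCH};

const DEFAULT_TTL_SECONDS: u64 = 24 * 60 * 60;

fn cache_is_fresh(fetched_at: u64) -> bool {
    if env::var("POLYSCRIBE_NO_CACHE_MANIFEST").is_ok() {
        return false; // caching disabled entirely
    }
    let ttl = env::var("POLYSCRIBE_MANIFEST_TTL_SECONDS")
        .ok()
        .and_then(|s| s.parse::<u64>().ok())
        .unwrap_or(DEFAULT_TTL_SECONDS);
    let now = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_secs())
        .unwrap_or(0);
    now.saturating_sub(fetched_at) <= ttl
}

fn main() {
    let now = SystemTime::now().duration_since(UNIX_EPOCH).unwrap().as_secs();
    println!("a manifest fetched right now is fresh: {}", cache_is_fresh(now));
}
```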
CI status:
## Installation

License
-------
This project is licensed under the MIT License — see the LICENSE file for details.
```bash
cargo install --path .
```

---
## Usage

Workspace layout
- This repo is a Cargo workspace using resolver = "3".
- Members:
  - crates/polyscribe-core — types, errors, config service, core helpers.
  - crates/polyscribe-protocol — PSP/1 serde types for NDJSON over stdio.
  - crates/polyscribe-host — plugin discovery/runner, progress forwarding.
  - crates/polyscribe-cli — the CLI, using host + core.
  - plugins/polyscribe-plugin-tubescribe — stub plugin used for verification.
```bash
# Transcribe audio/video
polyscribe transcribe input.mp4

Build and run
- Build all: cargo build --workspace --all-targets
- CLI help: cargo run -p polyscribe-cli -- --help
# Merge multiple transcripts
polyscribe transcribe --merge input1.json input2.json

Plugins
- Build and link the example plugin into your XDG data plugin dir:
  - make -C plugins/polyscribe-plugin-tubescribe link
  - This creates a symlink at: $XDG_DATA_HOME/polyscribe/plugins/polyscribe-plugin-tubescribe (defaults to ~/.local/share on Linux).
- Discover installed plugins:
  - cargo run -p polyscribe-cli -- plugins list
- Show a plugin's capabilities:
  - cargo run -p polyscribe-cli -- plugins info tubescribe
- Run a plugin command (JSON-RPC over NDJSON via stdio):
  - cargo run -p polyscribe-cli -- plugins run tubescribe generate_metadata --json '{"input":{"kind":"text","summary":"hello world"}}'
# Use specific GPU backend
polyscribe transcribe --gpu-backend cuda input.mp4
```

Verification commands
- The above commands are used for acceptance; expected behavior:
  - plugins list shows "tubescribe" once linked.
  - plugins info tubescribe prints JSON capabilities.
  - plugins run ... prints progress events and a JSON result.
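
For orientation, a rough sketch of what one NDJSON exchange might look like. The real PSP/1 types live in crates/polyscribe-protocol and may differ; the field names used here (`type`, `message`, `percent`, `value`) are hypothetical and only illustrate the one-JSON-object-per-line idea:

```rust
// Hypothetical PSP/1-style line types, for illustration only.
use serde::{Deserialize, Serialize};

#[derive(Debug, Serialize, Deserialize)]
#[serde(tag = "type", rename_all = "snake_case")]
enum PluginLine {
    Progress { message: String, percent: Option<f32> }, // hypothetical fields
    Result { value: serde_json::Value },                // hypothetical fields
}

fn main() -> serde_json::Result<()> {
    // Each element stands for one line of the plugin's stdout.
    let lines = [
        r#"{"type":"progress","message":"generating metadata","percent":42.0}"#,
        r#"{"type":"result","value":{"title":"hello world"}}"#,
    ];
    for line in lines {
        let event: PluginLine = serde_json::from_str(line)?;
        println!("{event:?}");
    }
    Ok(())
}
```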
## Development

Notes
- No absolute paths are hardcoded; config and plugin dirs respect XDG on Linux and platform equivalents via the `directories` crate.
- Plugins must be non-interactive (no TTY prompts). All interaction stays in the host/CLI.
- Config files are written atomically and support env overrides: POLYSCRIBE__SECTION__KEY=value.
```bash
# Build
cargo build

# Run tests
cargo test

# Run with verbose logging
cargo run -- --verbose transcribe input.mp4
```
@@ -103,6 +103,8 @@ pub enum ModelsCmd {
    Update,
    /// Interactive multi-select downloader
    Download,
    /// Clear the cached Hugging Face manifest
    ClearCache,
}

#[derive(Debug, Subcommand)]
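A small illustrative check of how the new variant surfaces on the command line: clap's derive renames subcommand variants to kebab-case, so `ClearCache` becomes the `clear-cache` subcommand used in the README examples. The standalone `Cli` wrapper below exists only for this sketch; the real CLI nests these variants under `models`:

```rust
use clap::{Parser, Subcommand};

#[derive(Debug, Parser)]
struct Cli {
    #[command(subcommand)]
    cmd: ModelsCmd,
}

#[derive(Debug, Subcommand)]
enum ModelsCmd {
    Update,
    Download,
    ClearCache,
}

fn main() {
    // First element plays the role of argv[0].
    let cli = Cli::parse_from(["models", "clear-cache"]);
    assert!(matches!(cli.cmd, ModelsCmd::ClearCache));
    println!("parsed: {:?}", cli.cmd);
}
```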
@@ -3,14 +3,14 @@ mod cli;
use anyhow::{Context, Result, anyhow};
use clap::{CommandFactory, Parser};
use cli::{Cli, Commands, GpuBackend, ModelsCmd, PluginsCmd};
use polyscribe_core::models; // Added: call into core models
use polyscribe_core::{config::ConfigService, ui::progress::ProgressReporter};
use polyscribe_core::models;
use polyscribe_core::ui::progress::ProgressReporter;
use polyscribe_host::PluginManager;
use tokio::io::AsyncWriteExt;
use tracing_subscriber::EnvFilter;

fn init_tracing(quiet: bool, verbose: u8) {
    let level = if quiet {
    let log_level = if quiet {
        "error"
    } else {
        match verbose {
@@ -20,7 +20,7 @@ fn init_tracing(quiet: bool, verbose: u8) {
        }
    };

    let filter = EnvFilter::try_from_default_env().unwrap_or_else(|_| EnvFilter::new(level));
    let filter = EnvFilter::try_from_default_env().unwrap_or_else(|_| EnvFilter::new(log_level));
    tracing_subscriber::fmt()
        .with_env_filter(filter)
        .with_target(false)
@@ -35,24 +35,17 @@ async fn main() -> Result<()> {

    init_tracing(args.quiet, args.verbose);

    // Propagate UI flags to core so ui facade can apply policy
    polyscribe_core::set_quiet(args.quiet);
    polyscribe_core::set_no_interaction(args.no_interaction);
    polyscribe_core::set_verbose(args.verbose);
    polyscribe_core::set_no_progress(args.no_progress);

    let _cfg = ConfigService::load_or_default().context("loading configuration")?;

    match args.command {
        Commands::Transcribe {
            output: _output,
            merge: _merge,
            merge_and_separate: _merge_and_separate,
            language: _language,
            set_speaker_names: _set_speaker_names,
            gpu_backend,
            gpu_layers,
            inputs,
            ..
        } => {
            polyscribe_core::ui::info("starting transcription workflow");
            let mut progress = ProgressReporter::new(args.no_interaction);
@@ -94,27 +87,35 @@ async fn main() -> Result<()> {
                    .context("running downloader")?;
                    polyscribe_core::ui::success("Model download complete.");
                }
                ModelsCmd::ClearCache => {
                    polyscribe_core::ui::info("clearing manifest cache");
                    tokio::task::spawn_blocking(models::clear_manifest_cache)
                        .await
                        .map_err(|e| anyhow!("blocking task join error: {e}"))?
                        .context("clearing cache")?;
                    polyscribe_core::ui::success("Manifest cache cleared.");
                }
            }
            Ok(())
        }

        Commands::Plugins { cmd } => {
            let pm = PluginManager;
            let plugin_manager = PluginManager;

            match cmd {
                PluginsCmd::List => {
                    let list = pm.list().context("discovering plugins")?;
                    let list = plugin_manager.list().context("discovering plugins")?;
                    for item in list {
                        polyscribe_core::ui::info(item.name);
                    }
                    Ok(())
                }
                PluginsCmd::Info { name } => {
                    let info = pm
                    let info = plugin_manager
                        .info(&name)
                        .with_context(|| format!("getting info for {}", name))?;
                    let s = serde_json::to_string_pretty(&info)?;
                    polyscribe_core::ui::info(s);
                    let info_json = serde_json::to_string_pretty(&info)?;
                    polyscribe_core::ui::info(info_json);
                    Ok(())
                }
                PluginsCmd::Run {
@@ -123,7 +124,7 @@ async fn main() -> Result<()> {
                    json,
                } => {
                    let payload = json.unwrap_or_else(|| "{}".to_string());
                    let mut child = pm
                    let mut child = plugin_manager
                        .spawn(&name, &command)
                        .with_context(|| format!("spawning plugin {name} {command}"))?;

@@ -134,7 +135,7 @@ async fn main() -> Result<()> {
                        .context("writing JSON payload to plugin stdin")?;
                    }

                    let status = pm.forward_stdio(&mut child).await?;
                    let status = plugin_manager.forward_stdio(&mut child).await?;
                    if !status.success() {
                        polyscribe_core::ui::error(format!(
                            "plugin returned non-zero exit code: {}",
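The stdin write itself is elided in the hunk above; a minimal tokio sketch of the pattern (using `cat` as a stand-in for the plugin binary; only the `AsyncWriteExt::write_all`/`shutdown` usage mirrors the surrounding code, the rest is illustrative):

```rust
use tokio::io::AsyncWriteExt;
use tokio::process::Command;

#[tokio::main]
async fn main() -> std::io::Result<()> {
    let payload = r#"{"input":{"kind":"text","summary":"hello world"}}"#;
    let mut child = Command::new("cat") // stand-in for the plugin executable
        .stdin(std::process::Stdio::piped())
        .spawn()?;
    if let Some(mut stdin) = child.stdin.take() {
        stdin.write_all(payload.as_bytes()).await?;
        stdin.shutdown().await?; // close stdin so the child sees EOF
    }
    let status = child.wait().await?;
    println!("exit: {status}");
    Ok(())
}
```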
@@ -1,12 +1,14 @@
// SPDX-License-Identifier: MIT
// Move original build.rs behavior into core crate

fn main() {
    // Only run special build steps when gpu-vulkan feature is enabled.
    let vulkan_enabled = std::env::var("CARGO_FEATURE_GPU_VULKAN").is_ok();
    println!("cargo:rerun-if-changed=extern/whisper.cpp");
    if !vulkan_enabled {
        println!(
            "cargo:warning=gpu-vulkan feature is disabled; skipping Vulkan-dependent build steps."
        );
        return;
    }
    println!("cargo:rerun-if-changed=extern/whisper.cpp");
    println!(
        "cargo:warning=Building with gpu-vulkan: ensure Vulkan SDK/loader are installed. Future versions will compile whisper.cpp via CMake."
    );
@@ -1,7 +1,5 @@
|
||||
// SPDX-License-Identifier: MIT
|
||||
// Copyright (c) 2025 <COPYRIGHT HOLDER>. All rights reserved.
|
||||
|
||||
//! Transcription backend selection and implementations (CPU/GPU) used by PolyScribe.
|
||||
use crate::OutputEntry;
|
||||
use crate::prelude::*;
|
||||
use crate::{decode_audio_to_pcm_f32_ffmpeg, find_model_file};
|
||||
@@ -9,27 +7,17 @@ use anyhow::{Context, anyhow};
|
||||
use std::env;
|
||||
use std::path::Path;
|
||||
|
||||
// Re-export a public enum for CLI parsing usage
|
||||
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
|
||||
/// Kind of transcription backend to use.
|
||||
pub enum BackendKind {
|
||||
/// Automatically detect the best available backend (CUDA > HIP > Vulkan > CPU).
|
||||
Auto,
|
||||
/// Pure CPU backend using whisper-rs.
|
||||
Cpu,
|
||||
/// NVIDIA CUDA backend (requires CUDA runtime available at load time and proper feature build).
|
||||
Cuda,
|
||||
/// AMD ROCm/HIP backend (requires hip/rocBLAS libraries available and proper feature build).
|
||||
Hip,
|
||||
/// Vulkan backend (experimental; requires Vulkan loader/SDK and feature build).
|
||||
Vulkan,
|
||||
}
|
||||
|
||||
/// Abstraction for a transcription backend.
|
||||
pub trait TranscribeBackend {
|
||||
/// Backend kind implemented by this type.
|
||||
fn kind(&self) -> BackendKind;
|
||||
/// Transcribe the given audio and return transcript entries.
|
||||
fn transcribe(
|
||||
&self,
|
||||
audio_path: &Path,
|
||||
@@ -40,15 +28,13 @@ pub trait TranscribeBackend {
|
||||
) -> Result<Vec<OutputEntry>>;
|
||||
}
|
||||
|
||||
fn check_lib(_names: &[&str]) -> bool {
|
||||
fn is_library_available(_names: &[&str]) -> bool {
|
||||
#[cfg(test)]
|
||||
{
|
||||
// During unit tests, avoid touching system libs to prevent loader crashes in CI.
|
||||
false
|
||||
}
|
||||
#[cfg(not(test))]
|
||||
{
|
||||
// Disabled runtime dlopen probing to avoid loader instability; rely on environment overrides.
|
||||
false
|
||||
}
|
||||
}
|
||||
@@ -57,7 +43,7 @@ fn cuda_available() -> bool {
|
||||
if let Ok(x) = env::var("POLYSCRIBE_TEST_FORCE_CUDA") {
|
||||
return x == "1";
|
||||
}
|
||||
check_lib(&[
|
||||
is_library_available(&[
|
||||
"libcudart.so",
|
||||
"libcudart.so.12",
|
||||
"libcudart.so.11",
|
||||
@@ -70,26 +56,22 @@ fn hip_available() -> bool {
|
||||
if let Ok(x) = env::var("POLYSCRIBE_TEST_FORCE_HIP") {
|
||||
return x == "1";
|
||||
}
|
||||
check_lib(&["libhipblas.so", "librocblas.so"])
|
||||
is_library_available(&["libhipblas.so", "librocblas.so"])
|
||||
}
|
||||
|
||||
fn vulkan_available() -> bool {
|
||||
if let Ok(x) = env::var("POLYSCRIBE_TEST_FORCE_VULKAN") {
|
||||
return x == "1";
|
||||
}
|
||||
check_lib(&["libvulkan.so.1", "libvulkan.so"])
|
||||
is_library_available(&["libvulkan.so.1", "libvulkan.so"])
|
||||
}
|
||||
|
||||
/// CPU-based transcription backend using whisper-rs.
|
||||
#[derive(Default)]
|
||||
pub struct CpuBackend;
|
||||
/// CUDA-accelerated transcription backend for NVIDIA GPUs.
|
||||
#[derive(Default)]
|
||||
pub struct CudaBackend;
|
||||
/// ROCm/HIP-accelerated transcription backend for AMD GPUs.
|
||||
#[derive(Default)]
|
||||
pub struct HipBackend;
|
||||
/// Vulkan-based transcription backend (experimental/incomplete).
|
||||
#[derive(Default)]
|
||||
pub struct VulkanBackend;
|
||||
|
||||
@@ -135,25 +117,13 @@ impl TranscribeBackend for VulkanBackend {
|
||||
}
|
||||
}
|
||||
|
||||
/// Result of choosing a transcription backend.
|
||||
pub struct SelectionResult {
|
||||
/// The constructed backend instance to perform transcription with.
|
||||
pub struct BackendSelection {
|
||||
pub backend: Box<dyn TranscribeBackend + Send + Sync>,
|
||||
/// Which backend kind was ultimately selected.
|
||||
pub chosen: BackendKind,
|
||||
/// Which backend kinds were detected as available on this system.
|
||||
pub detected: Vec<BackendKind>,
|
||||
}
|
||||
|
||||
/// Select an appropriate backend based on user request and system detection.
|
||||
///
|
||||
/// If `requested` is `BackendKind::Auto`, the function prefers CUDA, then HIP,
|
||||
/// then Vulkan, falling back to CPU when no GPU backend is detected. When a
|
||||
/// specific GPU backend is requested but unavailable, an error is returned with
|
||||
/// guidance on how to enable it.
|
||||
///
|
||||
/// Set `verbose` to true to print detection/selection info to stderr.
|
||||
pub fn select_backend(requested: BackendKind, verbose: bool) -> Result<SelectionResult> {
|
||||
pub fn select_backend(requested: BackendKind, verbose: bool) -> Result<BackendSelection> {
|
||||
let mut detected = Vec::new();
|
||||
if cuda_available() {
|
||||
detected.push(BackendKind::Cuda);
|
||||
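The body of `select_backend` is largely elided in this diff; a sketch of the documented policy (Auto prefers CUDA, then HIP, then Vulkan, else CPU; a specifically requested but undetected GPU backend is an error), written as a free function for clarity rather than the crate's actual code:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BackendKind { Auto, Cpu, Cuda, Hip, Vulkan }

fn choose(requested: BackendKind, detected: &[BackendKind]) -> Result<BackendKind, String> {
    match requested {
        // Auto: first detected GPU backend in preference order, else CPU.
        BackendKind::Auto => Ok([BackendKind::Cuda, BackendKind::Hip, BackendKind::Vulkan]
            .into_iter()
            .find(|k| detected.contains(k))
            .unwrap_or(BackendKind::Cpu)),
        BackendKind::Cpu => Ok(BackendKind::Cpu),
        // Explicit request honored only when detected.
        other if detected.contains(&other) => Ok(other),
        other => Err(format!("{other:?} requested but not detected; install the runtime and build with the matching gpu-* feature")),
    }
}

fn main() {
    assert_eq!(choose(BackendKind::Auto, &[BackendKind::Vulkan]), Ok(BackendKind::Vulkan));
    assert_eq!(choose(BackendKind::Auto, &[]), Ok(BackendKind::Cpu));
    assert!(choose(BackendKind::Cuda, &[]).is_err());
}
```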
@@ -171,7 +141,7 @@ pub fn select_backend(requested: BackendKind, verbose: bool) -> Result<Selection
|
||||
BackendKind::Cuda => Box::new(CudaBackend),
|
||||
BackendKind::Hip => Box::new(HipBackend),
|
||||
BackendKind::Vulkan => Box::new(VulkanBackend),
|
||||
BackendKind::Auto => Box::new(CpuBackend), // placeholder for Auto
|
||||
BackendKind::Auto => Box::new(CpuBackend),
|
||||
}
|
||||
};
|
||||
|
||||
@@ -222,14 +192,13 @@ pub fn select_backend(requested: BackendKind, verbose: bool) -> Result<Selection
|
||||
crate::dlog!(1, "Selected backend: {:?}", chosen);
|
||||
}
|
||||
|
||||
Ok(SelectionResult {
|
||||
Ok(BackendSelection {
|
||||
backend: instantiate_backend(chosen),
|
||||
chosen,
|
||||
detected,
|
||||
})
|
||||
}
|
||||
|
||||
// Internal helper: transcription using whisper-rs with CPU/GPU (depending on build features)
|
||||
#[allow(clippy::too_many_arguments)]
|
||||
pub(crate) fn transcribe_with_whisper_rs(
|
||||
audio_path: &Path,
|
||||
@@ -268,7 +237,6 @@ pub(crate) fn transcribe_with_whisper_rs(
|
||||
.ok_or_else(|| anyhow!("Model path not valid UTF-8: {}", model_path.display()))?;
|
||||
|
||||
if crate::verbose_level() < 2 {
|
||||
// Some builds of whisper/ggml expect these env vars; harmless if unknown
|
||||
unsafe {
|
||||
std::env::set_var("GGML_LOG_LEVEL", "0");
|
||||
std::env::set_var("WHISPER_PRINT_PROGRESS", "0");
|
||||
|
@@ -1,101 +1,104 @@
|
||||
use crate::prelude::*;
|
||||
use directories::ProjectDirs;
|
||||
// SPDX-License-Identifier: MIT
|
||||
|
||||
use serde::{Deserialize, Serialize};
|
||||
use std::{fs, path::PathBuf};
|
||||
use std::env;
|
||||
use std::path::PathBuf;
|
||||
|
||||
const ENV_PREFIX: &str = "POLYSCRIBE";
|
||||
|
||||
/// Configuration for the Polyscribe application
|
||||
///
|
||||
/// Contains paths to models and plugins directories that can be customized
|
||||
/// through configuration files or environment variables.
|
||||
#[derive(Debug, Clone, Serialize, Deserialize, Default)]
|
||||
pub struct Config {
|
||||
/// Directory path where ML models are stored
|
||||
pub models_dir: Option<PathBuf>,
|
||||
/// Directory path where plugins are stored
|
||||
pub plugins_dir: Option<PathBuf>,
|
||||
}
|
||||
|
||||
// Default is derived
|
||||
|
||||
/// Service for managing Polyscribe configuration
|
||||
///
|
||||
/// Provides functionality to load, save, and access configuration settings
|
||||
/// from disk or environment variables.
|
||||
pub struct ConfigService;
|
||||
|
||||
impl ConfigService {
|
||||
/// Loads configuration from disk or returns default values if not found
|
||||
///
|
||||
/// This function attempts to read the configuration file from disk. If the file
|
||||
/// doesn't exist or can't be parsed, it falls back to default values.
|
||||
/// Environment variable overrides are then applied to the configuration.
|
||||
pub fn load_or_default() -> Result<Config> {
|
||||
let mut cfg = Self::read_disk().unwrap_or_default();
|
||||
Self::apply_env_overrides(&mut cfg)?;
|
||||
Ok(cfg)
|
||||
pub const ENV_NO_CACHE_MANIFEST: &'static str = "POLYSCRIBE_NO_CACHE_MANIFEST";
|
||||
pub const ENV_MANIFEST_TTL_SECONDS: &'static str = "POLYSCRIBE_MANIFEST_TTL_SECONDS";
|
||||
pub const ENV_MODELS_DIR: &'static str = "POLYSCRIBE_MODELS_DIR";
|
||||
pub const ENV_USER_AGENT: &'static str = "POLYSCRIBE_USER_AGENT";
|
||||
pub const ENV_HTTP_TIMEOUT_SECS: &'static str = "POLYSCRIBE_HTTP_TIMEOUT_SECS";
|
||||
pub const ENV_HF_REPO: &'static str = "POLYSCRIBE_HF_REPO";
|
||||
pub const ENV_CACHE_FILENAME: &'static str = "POLYSCRIBE_MANIFEST_CACHE_FILENAME";
|
||||
|
||||
pub const DEFAULT_USER_AGENT: &'static str = "polyscribe/0.1";
|
||||
pub const DEFAULT_DOWNLOADER_UA: &'static str = "polyscribe-model-downloader/1";
|
||||
pub const DEFAULT_HF_REPO: &'static str = "ggerganov/whisper.cpp";
|
||||
pub const DEFAULT_CACHE_FILENAME: &'static str = "hf_manifest_whisper_cpp.json";
|
||||
pub const DEFAULT_HTTP_TIMEOUT_SECS: u64 = 8;
|
||||
pub const DEFAULT_MANIFEST_CACHE_TTL_SECONDS: u64 = 24 * 60 * 60;
|
||||
|
||||
pub fn project_dirs() -> Option<directories::ProjectDirs> {
|
||||
directories::ProjectDirs::from("dev", "polyscribe", "polyscribe")
|
||||
}
|
||||
|
||||
/// Saves the configuration to disk
|
||||
///
|
||||
/// This function serializes the configuration to TOML format and writes it
|
||||
/// to the standard configuration directory for the application.
|
||||
/// Returns an error if writing fails or if project directories cannot be determined.
|
||||
pub fn save(cfg: &Config) -> Result<()> {
|
||||
let Some(dirs) = Self::dirs() else {
|
||||
return Err(Error::Other("unable to get project dirs".into()));
|
||||
};
|
||||
let cfg_dir = dirs.config_dir();
|
||||
fs::create_dir_all(cfg_dir)?;
|
||||
let path = cfg_dir.join("config.toml");
|
||||
let s = toml::to_string_pretty(cfg)?;
|
||||
fs::write(path, s)?;
|
||||
Ok(())
|
||||
}
|
||||
|
||||
fn read_disk() -> Option<Config> {
|
||||
let dirs = Self::dirs()?;
|
||||
let path = dirs.config_dir().join("config.toml");
|
||||
let s = fs::read_to_string(path).ok()?;
|
||||
toml::from_str(&s).ok()
|
||||
}
|
||||
|
||||
fn apply_env_overrides(cfg: &mut Config) -> Result<()> {
|
||||
// POLYSCRIBE__SECTION__KEY format reserved for future nested config.
|
||||
if let Ok(v) = std::env::var(format!("{ENV_PREFIX}_MODELS_DIR")) {
|
||||
cfg.models_dir = Some(PathBuf::from(v));
|
||||
}
|
||||
if let Ok(v) = std::env::var(format!("{ENV_PREFIX}_PLUGINS_DIR")) {
|
||||
cfg.plugins_dir = Some(PathBuf::from(v));
|
||||
}
|
||||
Ok(())
|
||||
}
|
||||
|
||||
/// Returns the standard project directories for the application
|
||||
///
|
||||
/// This function creates a ProjectDirs instance with the appropriate
|
||||
/// organization and application names for Polyscribe.
|
||||
/// Returns None if the project directories cannot be determined.
|
||||
pub fn dirs() -> Option<ProjectDirs> {
|
||||
ProjectDirs::from("dev", "polyscribe", "polyscribe")
|
||||
}
|
||||
|
||||
/// Returns the default directory path for storing ML models
|
||||
///
|
||||
/// This function determines the standard data directory for the application
|
||||
/// and appends a 'models' subdirectory to it.
|
||||
/// Returns None if the project directories cannot be determined.
|
||||
pub fn default_models_dir() -> Option<PathBuf> {
|
||||
Self::dirs().map(|d| d.data_dir().join("models"))
|
||||
Self::project_dirs().map(|d| d.data_dir().join("models"))
|
||||
}
|
||||
|
||||
/// Returns the default directory path for storing plugins
|
||||
///
|
||||
/// This function determines the standard data directory for the application
|
||||
/// and appends a 'plugins' subdirectory to it.
|
||||
/// Returns None if the project directories cannot be determined.
|
||||
pub fn default_plugins_dir() -> Option<PathBuf> {
|
||||
Self::dirs().map(|d| d.data_dir().join("plugins"))
|
||||
Self::project_dirs().map(|d| d.data_dir().join("plugins"))
|
||||
}
|
||||
|
||||
pub fn manifest_cache_dir() -> Option<PathBuf> {
|
||||
Self::project_dirs().map(|d| d.cache_dir().join("manifest"))
|
||||
}
|
||||
|
||||
pub fn bypass_manifest_cache() -> bool {
|
||||
env::var(Self::ENV_NO_CACHE_MANIFEST).is_ok()
|
||||
}
|
||||
|
||||
pub fn manifest_cache_ttl_seconds() -> u64 {
|
||||
env::var(Self::ENV_MANIFEST_TTL_SECONDS)
|
||||
.ok()
|
||||
.and_then(|s| s.parse::<u64>().ok())
|
||||
.unwrap_or(Self::DEFAULT_MANIFEST_CACHE_TTL_SECONDS)
|
||||
}
|
||||
|
||||
pub fn manifest_cache_filename() -> String {
|
||||
env::var(Self::ENV_CACHE_FILENAME)
|
||||
.unwrap_or_else(|_| Self::DEFAULT_CACHE_FILENAME.to_string())
|
||||
}
|
||||
|
||||
pub fn models_dir(cfg: Option<&Config>) -> Option<PathBuf> {
|
||||
if let Ok(env_dir) = env::var(Self::ENV_MODELS_DIR) {
|
||||
if !env_dir.is_empty() {
|
||||
return Some(PathBuf::from(env_dir));
|
||||
}
|
||||
}
|
||||
if let Some(c) = cfg {
|
||||
if let Some(dir) = c.models_dir.clone() {
|
||||
return Some(dir);
|
||||
}
|
||||
}
|
||||
Self::default_models_dir()
|
||||
}
|
||||
|
||||
pub fn user_agent() -> String {
|
||||
env::var(Self::ENV_USER_AGENT).unwrap_or_else(|_| Self::DEFAULT_USER_AGENT.to_string())
|
||||
}
|
||||
|
||||
pub fn downloader_user_agent() -> String {
|
||||
env::var(Self::ENV_USER_AGENT).unwrap_or_else(|_| Self::DEFAULT_DOWNLOADER_UA.to_string())
|
||||
}
|
||||
|
||||
pub fn http_timeout_secs() -> u64 {
|
||||
env::var(Self::ENV_HTTP_TIMEOUT_SECS)
|
||||
.ok()
|
||||
.and_then(|s| s.parse::<u64>().ok())
|
||||
.unwrap_or(Self::DEFAULT_HTTP_TIMEOUT_SECS)
|
||||
}
|
||||
|
||||
pub fn hf_repo() -> String {
|
||||
env::var(Self::ENV_HF_REPO).unwrap_or_else(|_| Self::DEFAULT_HF_REPO.to_string())
|
||||
}
|
||||
|
||||
pub fn hf_api_base_for(repo: &str) -> String {
|
||||
format!("https://huggingface.co/api/models/{}", repo)
|
||||
}
|
||||
|
||||
pub fn manifest_cache_path() -> Option<PathBuf> {
|
||||
let dir = Self::manifest_cache_dir()?;
|
||||
Some(dir.join(Self::manifest_cache_filename()))
|
||||
}
|
||||
}
|
||||
|
||||
#[derive(Debug, Clone, Serialize, Deserialize, Default)]
|
||||
pub struct Config {
|
||||
pub models_dir: Option<PathBuf>,
|
||||
pub plugins_dir: Option<PathBuf>,
|
||||
}
|
||||
|
@@ -1,38 +1,26 @@
use thiserror::Error;

/// Error types for the polyscribe-core crate.
#[derive(Debug, Error)]
///
/// This enum represents various error conditions that can occur during
/// operations in this crate, including I/O errors, serialization/deserialization
/// errors, and environment variable access errors.
pub enum Error {
    #[error("I/O error: {0}")]
    /// Represents an I/O error that occurred during file or stream operations
    Io(#[from] std::io::Error),

    #[error("serde error: {0}")]
    /// Represents a JSON serialization or deserialization error
    Serde(#[from] serde_json::Error),

    #[error("toml error: {0}")]
    /// Represents a TOML deserialization error
    Toml(#[from] toml::de::Error),

    #[error("toml ser error: {0}")]
    /// Represents a TOML serialization error
    TomlSer(#[from] toml::ser::Error),

    #[error("env var error: {0}")]
    /// Represents an error that occurred during environment variable access
    EnvVar(#[from] std::env::VarError),

    #[error("http error: {0}")]
    /// Represents an HTTP client error from reqwest
    Http(#[from] reqwest::Error),

    #[error("other: {0}")]
    /// Represents a general error condition with a custom message
    Other(String),
}
@@ -1,14 +1,8 @@
|
||||
// SPDX-License-Identifier: MIT
|
||||
// Copyright (c) 2025 <COPYRIGHT HOLDER>. All rights reserved.
|
||||
|
||||
#![forbid(elided_lifetimes_in_paths)]
|
||||
#![forbid(unused_must_use)]
|
||||
#![deny(missing_docs)]
|
||||
#![warn(clippy::all)]
|
||||
//! PolyScribe library: business logic and core types.
|
||||
//!
|
||||
//! This crate exposes the reusable parts of the PolyScribe CLI as a library.
|
||||
//! The binary entry point (main.rs) remains a thin CLI wrapper.
|
||||
|
||||
use std::sync::atomic::{AtomicBool, AtomicU8, Ordering};
|
||||
|
||||
@@ -22,56 +16,44 @@ use std::process::Command;
|
||||
#[cfg(unix)]
|
||||
use libc::{O_WRONLY, close, dup, dup2, open};
|
||||
|
||||
/// Global runtime flags
|
||||
static QUIET: AtomicBool = AtomicBool::new(false);
|
||||
static NO_INTERACTION: AtomicBool = AtomicBool::new(false);
|
||||
static VERBOSE: AtomicU8 = AtomicU8::new(0);
|
||||
static NO_PROGRESS: AtomicBool = AtomicBool::new(false);
|
||||
|
||||
/// Set quiet mode: when true, non-interactive logs should be suppressed.
|
||||
pub fn set_quiet(enabled: bool) {
|
||||
QUIET.store(enabled, Ordering::Relaxed);
|
||||
}
|
||||
/// Return current quiet mode state.
|
||||
pub fn is_quiet() -> bool {
|
||||
QUIET.load(Ordering::Relaxed)
|
||||
}
|
||||
|
||||
/// Set non-interactive mode: when true, interactive prompts must be skipped.
|
||||
pub fn set_no_interaction(enabled: bool) {
|
||||
NO_INTERACTION.store(enabled, Ordering::Relaxed);
|
||||
}
|
||||
/// Return current non-interactive state.
|
||||
pub fn is_no_interaction() -> bool {
|
||||
NO_INTERACTION.load(Ordering::Relaxed)
|
||||
}
|
||||
|
||||
/// Set verbose level (0 = normal, 1 = verbose, 2 = super-verbose)
|
||||
pub fn set_verbose(level: u8) {
|
||||
VERBOSE.store(level, Ordering::Relaxed);
|
||||
}
|
||||
/// Get current verbose level.
|
||||
pub fn verbose_level() -> u8 {
|
||||
VERBOSE.load(Ordering::Relaxed)
|
||||
}
|
||||
|
||||
/// Disable interactive progress indicators (bars/spinners)
|
||||
pub fn set_no_progress(enabled: bool) {
|
||||
NO_PROGRESS.store(enabled, Ordering::Relaxed);
|
||||
}
|
||||
/// Return current no-progress state
|
||||
pub fn is_no_progress() -> bool {
|
||||
NO_PROGRESS.load(Ordering::Relaxed)
|
||||
}
|
||||
|
||||
/// Check whether stdin is connected to a TTY. Used to avoid blocking prompts when not interactive.
|
||||
pub fn stdin_is_tty() -> bool {
|
||||
use std::io::IsTerminal as _;
|
||||
std::io::stdin().is_terminal()
|
||||
}
|
||||
|
||||
/// A guard that temporarily redirects stderr to /dev/null on Unix when quiet mode is active.
|
||||
/// No-op on non-Unix or when quiet is disabled. Restores stderr on drop.
|
||||
pub struct StderrSilencer {
|
||||
#[cfg(unix)]
|
||||
old_stderr_fd: i32,
|
||||
@@ -81,7 +63,6 @@ pub struct StderrSilencer {
|
||||
}
|
||||
|
||||
impl StderrSilencer {
|
||||
/// Activate stderr silencing if quiet is set and on Unix; otherwise returns a no-op guard.
|
||||
pub fn activate_if_quiet() -> Self {
|
||||
if !is_quiet() {
|
||||
return Self {
|
||||
@@ -95,7 +76,6 @@ impl StderrSilencer {
|
||||
Self::activate()
|
||||
}
|
||||
|
||||
/// Activate stderr silencing unconditionally (used internally); no-op on non-Unix.
|
||||
pub fn activate() -> Self {
|
||||
#[cfg(unix)]
|
||||
unsafe {
|
||||
@@ -107,7 +87,6 @@ impl StderrSilencer {
|
||||
devnull_fd: -1,
|
||||
};
|
||||
}
|
||||
// Open /dev/null for writing
|
||||
let devnull_cstr = std::ffi::CString::new("/dev/null").unwrap();
|
||||
let devnull_fd = open(devnull_cstr.as_ptr(), O_WRONLY);
|
||||
if devnull_fd < 0 {
|
||||
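The dup/dup2 calls and the Drop-time restore are elided in this diff; a sketch of the underlying Unix fd dance with illustrative names (the crate stores the saved descriptors in the guard and restores them in `Drop` instead of inline):

```rust
#[cfg(unix)]
fn silence_stderr_example() {
    use std::ffi::CString;
    unsafe {
        let devnull = CString::new("/dev/null").unwrap();
        let devnull_fd = libc::open(devnull.as_ptr(), libc::O_WRONLY);
        if devnull_fd < 0 {
            return; // could not open /dev/null; leave stderr untouched
        }
        let saved_stderr = libc::dup(2);      // keep the original stderr
        libc::dup2(devnull_fd, 2);            // point fd 2 at /dev/null
        eprintln!("this line is swallowed");  // goes to /dev/null
        libc::dup2(saved_stderr, 2);          // restore stderr (Drop in the crate)
        libc::close(saved_stderr);
        libc::close(devnull_fd);
    }
}
```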
@@ -154,7 +133,6 @@ impl Drop for StderrSilencer {
|
||||
}
|
||||
}
|
||||
|
||||
/// Run the given closure with stderr temporarily silenced (Unix-only). Returns the closure result.
|
||||
pub fn with_suppressed_stderr<F, T>(f: F) -> T
|
||||
where
|
||||
F: FnOnce() -> T,
|
||||
@@ -165,13 +143,11 @@ where
|
||||
result
|
||||
}
|
||||
|
||||
/// Log an error line (always printed).
|
||||
#[macro_export]
|
||||
macro_rules! elog {
|
||||
($($arg:tt)*) => {{ $crate::ui::error(format!($($arg)*)); }}
|
||||
}
|
||||
|
||||
/// Log an informational line using the UI helper unless quiet mode is enabled.
|
||||
#[macro_export]
|
||||
macro_rules! ilog {
|
||||
($($arg:tt)*) => {{
|
||||
@@ -179,7 +155,6 @@ macro_rules! ilog {
|
||||
}}
|
||||
}
|
||||
|
||||
/// Log a debug/trace line when verbose level is at least the given level (u8).
|
||||
#[macro_export]
|
||||
macro_rules! dlog {
|
||||
($lvl:expr, $($arg:tt)*) => {{
|
||||
@@ -187,44 +162,28 @@ macro_rules! dlog {
|
||||
}}
|
||||
}
|
||||
|
||||
/// Backward-compatibility: map old qlog! to ilog!
|
||||
#[macro_export]
|
||||
macro_rules! qlog {
|
||||
($($arg:tt)*) => {{ $crate::ilog!($($arg)*); }}
|
||||
}
|
||||
|
||||
pub mod backend;
|
||||
/// Configuration handling for PolyScribe
|
||||
pub mod config;
|
||||
pub mod models;
|
||||
// Use the file-backed ui.rs module, which also declares its own `progress` submodule.
|
||||
/// Error definitions for the PolyScribe library
|
||||
pub mod error;
|
||||
pub mod ui;
|
||||
pub use error::Error;
|
||||
pub mod prelude;
|
||||
|
||||
/// Transcript entry for a single segment.
|
||||
#[derive(Debug, serde::Serialize, Clone)]
|
||||
pub struct OutputEntry {
|
||||
/// Sequential id in output ordering.
|
||||
pub id: u64,
|
||||
/// Speaker label associated with the segment.
|
||||
pub speaker: String,
|
||||
/// Start time in seconds.
|
||||
pub start: f64,
|
||||
/// End time in seconds.
|
||||
pub end: f64,
|
||||
/// Text content.
|
||||
pub text: String,
|
||||
}
|
||||
|
||||
/// Return a YYYY-MM-DD date prefix string for output file naming.
|
||||
pub fn date_prefix() -> String {
|
||||
Local::now().format("%Y-%m-%d").to_string()
|
||||
}
|
||||
|
||||
/// Format a floating-point number of seconds as SRT timestamp (HH:MM:SS,mmm).
|
||||
pub fn format_srt_time(seconds: f64) -> String {
|
||||
let total_ms = (seconds * 1000.0).round() as i64;
|
||||
let ms = total_ms % 1000;
|
||||
@@ -235,7 +194,6 @@ pub fn format_srt_time(seconds: f64) -> String {
|
||||
format!("{hour:02}:{min:02}:{sec:02},{ms:03}")
|
||||
}
|
||||
|
||||
/// Render a list of transcript entries to SRT format.
|
||||
pub fn render_srt(entries: &[OutputEntry]) -> String {
|
||||
let mut srt = String::new();
|
||||
for (index, entry) in entries.iter().enumerate() {
|
||||
@@ -256,7 +214,6 @@ pub fn render_srt(entries: &[OutputEntry]) -> String {
|
||||
srt
|
||||
}
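The hour/minute/second arithmetic and the loop body of `render_srt` are elided between the hunks above; a self-contained sketch of the presumed steps (integer division of the rounded millisecond count, then the standard 1-based SRT block layout) with a worked example:

```rust
fn format_srt_time(seconds: f64) -> String {
    let total_ms = (seconds * 1000.0).round() as i64;
    let ms = total_ms % 1000;
    let total_secs = total_ms / 1000;
    let sec = total_secs % 60;
    let min = (total_secs / 60) % 60;
    let hour = total_secs / 3600;
    format!("{hour:02}:{min:02}:{sec:02},{ms:03}")
}

fn render_block(index: usize, start: f64, end: f64, text: &str) -> String {
    // SRT indices are 1-based; each block ends with a blank line.
    format!("{}\n{} --> {}\n{}\n\n", index + 1, format_srt_time(start), format_srt_time(end), text.trim())
}

fn main() {
    // 3661.5 s rounds to 3_661_500 ms, i.e. 01:01:01,500.
    assert_eq!(format_srt_time(3661.5), "01:01:01,500");
    print!("{}", render_block(0, 0.0, 2.5, "hello world"));
}
```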
|
||||
|
||||
/// Determine the default models directory, honoring POLYSCRIBE_MODELS_DIR override.
|
||||
pub fn models_dir_path() -> PathBuf {
|
||||
if let Ok(env_val) = env::var("POLYSCRIBE_MODELS_DIR") {
|
||||
let env_path = PathBuf::from(env_val);
|
||||
@@ -284,7 +241,6 @@ pub fn models_dir_path() -> PathBuf {
|
||||
PathBuf::from("models")
|
||||
}
|
||||
|
||||
/// Normalize a language identifier to a short ISO code when possible.
|
||||
pub fn normalize_lang_code(input: &str) -> Option<String> {
|
||||
let mut lang = input.trim().to_lowercase();
|
||||
if lang.is_empty() || lang == "auto" || lang == "c" || lang == "posix" {
|
||||
@@ -356,9 +312,7 @@ pub fn normalize_lang_code(input: &str) -> Option<String> {
|
||||
Some(code.to_string())
|
||||
}
|
||||
|
||||
/// Find the Whisper model file path to use.
|
||||
pub fn find_model_file() -> Result<PathBuf> {
|
||||
// 1) Explicit override via environment
|
||||
if let Ok(path) = env::var("WHISPER_MODEL") {
|
||||
let p = PathBuf::from(path);
|
||||
if !p.exists() {
|
||||
@@ -378,7 +332,6 @@ pub fn find_model_file() -> Result<PathBuf> {
|
||||
return Ok(p);
|
||||
}
|
||||
|
||||
// 2) Resolve models directory and ensure it exists and is a directory
|
||||
let models_dir = models_dir_path();
|
||||
if models_dir.exists() && !models_dir.is_dir() {
|
||||
return Err(anyhow!(
|
||||
@@ -394,7 +347,6 @@ pub fn find_model_file() -> Result<PathBuf> {
|
||||
)
|
||||
})?;
|
||||
|
||||
// 3) Gather candidate .bin files (regular files only), prefer largest
|
||||
let mut candidates = Vec::new();
|
||||
for entry in std::fs::read_dir(&models_dir)
|
||||
.with_context(|| format!("Failed to read models dir: {}", models_dir.display()))?
|
||||
@@ -402,7 +354,6 @@ pub fn find_model_file() -> Result<PathBuf> {
|
||||
let entry = entry?;
|
||||
let path = entry.path();
|
||||
|
||||
// Only consider .bin files
|
||||
let is_bin = path
|
||||
.extension()
|
||||
.and_then(|s| s.to_str())
|
||||
@@ -411,7 +362,6 @@ pub fn find_model_file() -> Result<PathBuf> {
|
||||
continue;
|
||||
}
|
||||
|
||||
// Only consider regular files
|
||||
let md = match std::fs::metadata(&path) {
|
||||
Ok(m) if m.is_file() => m,
|
||||
_ => continue,
|
||||
@@ -421,7 +371,6 @@ pub fn find_model_file() -> Result<PathBuf> {
|
||||
}
|
||||
|
||||
if candidates.is_empty() {
|
||||
// 4) Fallback to known tiny English model if present
|
||||
let fallback = models_dir.join("ggml-tiny.en.bin");
|
||||
if fallback.is_file() {
|
||||
return Ok(fallback);
|
||||
@@ -439,19 +388,16 @@ pub fn find_model_file() -> Result<PathBuf> {
|
||||
Ok(path)
|
||||
}
|
||||
|
||||
/// Decode an audio file into PCM f32 samples using ffmpeg (ffmpeg executable required).
|
||||
pub fn decode_audio_to_pcm_f32_ffmpeg(audio_path: &Path) -> Result<Vec<f32>> {
|
||||
let in_path = audio_path
|
||||
.to_str()
|
||||
.ok_or_else(|| anyhow!("Audio path must be valid UTF-8: {}", audio_path.display()))?;
|
||||
|
||||
// Use a raw f32le file to match the -f f32le output format.
|
||||
let tmp_raw = std::env::temp_dir().join("polyscribe_tmp_input.f32le");
|
||||
let tmp_raw_str = tmp_raw
|
||||
.to_str()
|
||||
.ok_or_else(|| anyhow!("Temp path not valid UTF-8: {}", tmp_raw.display()))?;
|
||||
|
||||
// ffmpeg -i input -f f32le -ac 1 -ar 16000 -y /tmp/tmp.f32le
|
||||
let status = Command::new("ffmpeg")
|
||||
.arg("-hide_banner")
|
||||
.arg("-loglevel")
|
||||
@@ -480,10 +426,8 @@ pub fn decode_audio_to_pcm_f32_ffmpeg(audio_path: &Path) -> Result<Vec<f32>> {
|
||||
let raw = std::fs::read(&tmp_raw)
|
||||
.with_context(|| format!("Failed to read temp PCM file: {}", tmp_raw.display()))?;
|
||||
|
||||
// Best-effort cleanup of the temp file
|
||||
let _ = std::fs::remove_file(&tmp_raw);
|
||||
|
||||
// Interpret raw bytes as f32 little-endian
|
||||
if raw.len() % 4 != 0 {
|
||||
return Err(anyhow!("Decoded PCM file length not multiple of 4: {}", raw.len()).into());
|
||||
}
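The remainder of the function is elided in this hunk; a minimal sketch of the presumed final step, interpreting the raw bytes as little-endian f32 samples:

```rust
fn bytes_to_f32_le(raw: &[u8]) -> Vec<f32> {
    // Assumes raw.len() is a multiple of 4, as checked above.
    raw.chunks_exact(4)
        .map(|b| f32::from_le_bytes([b[0], b[1], b[2], b[3]]))
        .collect()
}

fn main() {
    let raw = 0.5f32.to_le_bytes();
    assert_eq!(bytes_to_f32_le(&raw), vec![0.5]);
}
```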
|
||||
|
@@ -1,9 +1,6 @@
|
||||
// SPDX-License-Identifier: MIT
|
||||
//! Model management for PolyScribe: discovery, download, and verification.
|
||||
//! Fetches the live file table from Hugging Face, using size and sha256
|
||||
//! data for verification. Falls back to scraping the repository tree page
|
||||
//! if the JSON API is unavailable or incomplete. No built-in manifest.
|
||||
|
||||
use crate::config::ConfigService;
|
||||
use crate::prelude::*;
|
||||
use anyhow::{Context, anyhow};
|
||||
use chrono::{DateTime, Utc};
|
||||
@@ -12,13 +9,13 @@ use reqwest::blocking::Client;
|
||||
use reqwest::header::{
|
||||
ACCEPT_RANGES, CONTENT_LENGTH, CONTENT_RANGE, ETAG, IF_RANGE, LAST_MODIFIED, RANGE,
|
||||
};
|
||||
use serde::Deserialize;
|
||||
use serde::{Deserialize, Serialize};
|
||||
use sha2::{Digest, Sha256};
|
||||
use std::collections::BTreeSet;
|
||||
use std::fs::{self, File, OpenOptions};
|
||||
use std::io::{Read, Write};
|
||||
use std::path::{Path, PathBuf};
|
||||
use std::time::{Duration, Instant};
|
||||
use std::time::{Duration, Instant, SystemTime, UNIX_EPOCH};
|
||||
|
||||
fn format_size_mb(size: Option<u64>) -> String {
|
||||
match size {
|
||||
@@ -35,7 +32,6 @@ fn format_size_gib(bytes: u64) -> String {
|
||||
format!("{gib:.2} GiB")
|
||||
}
|
||||
|
||||
// Short date formatter (RFC -> yyyy-mm-dd)
|
||||
fn short_date(s: &str) -> String {
|
||||
DateTime::parse_from_rfc3339(s)
|
||||
.ok()
|
||||
@@ -43,12 +39,10 @@ fn short_date(s: &str) -> String {
|
||||
.unwrap_or_else(|| s.to_string())
|
||||
}
|
||||
|
||||
// Free disk space using libc::statvfs (already in Cargo)
|
||||
fn free_space_bytes_for_path(path: &Path) -> Result<u64> {
|
||||
use libc::statvfs;
|
||||
use std::ffi::CString;
|
||||
|
||||
// use parent dir or current dir if none
|
||||
let dir = if path.is_dir() {
|
||||
path
|
||||
} else {
|
||||
@@ -66,9 +60,7 @@ fn free_space_bytes_for_path(path: &Path) -> Result<u64> {
|
||||
}
|
||||
}
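The `statvfs` call itself is elided in this hunk; a sketch of the usual computation (available blocks `f_bavail` times fragment size `f_frsize`), with an illustrative error mapping rather than the crate's:

```rust
#[cfg(unix)]
fn free_space_bytes(dir: &std::path::Path) -> std::io::Result<u64> {
    use std::ffi::CString;
    use std::os::unix::ffi::OsStrExt;

    let c_path = CString::new(dir.as_os_str().as_bytes())
        .map_err(|e| std::io::Error::new(std::io::ErrorKind::InvalidInput, e))?;
    let mut stats: libc::statvfs = unsafe { std::mem::zeroed() };
    let rc = unsafe { libc::statvfs(c_path.as_ptr(), &mut stats) };
    if rc != 0 {
        return Err(std::io::Error::last_os_error());
    }
    Ok(stats.f_bavail as u64 * stats.f_frsize as u64)
}
```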
|
||||
|
||||
// Minimal mirror note shown in single-line style
|
||||
fn mirror_label(url: &str) -> &'static str {
|
||||
// Very light heuristic; replace with your actual mirror selection if you have it
|
||||
if url.contains("eu") {
|
||||
"EU mirror"
|
||||
} else if url.contains("us") {
|
||||
@@ -78,7 +70,6 @@ fn mirror_label(url: &str) -> &'static str {
|
||||
}
|
||||
}
|
||||
|
||||
// Perform a HEAD to get size/etag/last-modified and fill what we can
|
||||
type HeadMeta = (Option<u64>, Option<String>, Option<String>, bool);
|
||||
|
||||
fn head_entry(client: &Client, url: &str) -> Result<HeadMeta> {
|
||||
@@ -107,39 +98,27 @@ fn head_entry(client: &Client, url: &str) -> Result<HeadMeta> {
|
||||
Ok((len, etag, last_mod, ranges_ok))
|
||||
}
|
||||
|
||||
/// Represents a downloadable Whisper model artifact.
|
||||
#[derive(Debug, Clone)]
|
||||
#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
|
||||
struct ModelEntry {
|
||||
/// Display name and local short name (informational; may equal stem of file)
|
||||
name: String,
|
||||
/// Remote file name (with extension)
|
||||
file: String,
|
||||
/// Remote URL
|
||||
url: String,
|
||||
/// Expected file size (optional)
|
||||
size: Option<u64>,
|
||||
/// Expected SHA-256 in hex (optional)
|
||||
sha256: Option<String>,
|
||||
/// New: last modified timestamp string if available
|
||||
last_modified: Option<String>,
|
||||
/// New: parsed base and variant for 2-step UI
|
||||
base: String,
|
||||
variant: String,
|
||||
}
|
||||
|
||||
// -------- Hugging Face API integration --------
|
||||
|
||||
#[derive(Debug, Deserialize)]
|
||||
struct HfModelInfo {
|
||||
// Returned sometimes at /api/models/{repo}
|
||||
siblings: Option<Vec<HfFile>>,
|
||||
// Returned when using `?expand=files`
|
||||
files: Option<Vec<HfFile>>,
|
||||
}
|
||||
|
||||
#[derive(Debug, Deserialize)]
|
||||
struct HfLfsInfo {
|
||||
// Sometimes an "oid" like "sha256:<hex>"
|
||||
oid: Option<String>,
|
||||
size: Option<u64>,
|
||||
sha256: Option<String>,
|
||||
@@ -147,53 +126,33 @@ struct HfLfsInfo {
|
||||
|
||||
#[derive(Debug, Deserialize)]
|
||||
struct HfFile {
|
||||
// Relative filename within repo (e.g., "ggml-tiny.bin")
|
||||
rfilename: String,
|
||||
// Size reported at top-level for non-LFS files; often present
|
||||
size: Option<u64>,
|
||||
// Some entries include sha256 at top level
|
||||
sha256: Option<String>,
|
||||
// LFS metadata with size and possibly sha256 embedded
|
||||
lfs: Option<HfLfsInfo>,
|
||||
// New: last modified timestamp provided by HF API on expanded files
|
||||
#[serde(rename = "lastModified")]
|
||||
last_modified: Option<String>,
|
||||
}
|
||||
|
||||
fn parse_base_variant(display_name: &str) -> (String, String) {
|
||||
// display_name is name without ggml-/gguf- and without .bin
|
||||
// Examples:
|
||||
// - "tiny" -> base=tiny, variant=default
|
||||
// - "tiny.en" -> base=tiny, variant=en
|
||||
// - "base" -> base=base, variant=default
|
||||
// - "large-v2" -> base=large, variant=v2
|
||||
// - "large-v3" -> base=large, variant=v3
|
||||
// - "medium" -> base=medium, variant=default
|
||||
let mut variant = "default".to_string();
|
||||
|
||||
// Split off dot-based suffix (e.g., ".en")
|
||||
let mut head = display_name;
|
||||
if let Some((h, rest)) = display_name.split_once('.') {
|
||||
head = h;
|
||||
// if there is more than one dot, just keep everything after first as variant
|
||||
variant = rest.to_string();
|
||||
}
|
||||
|
||||
// Handle hyphenated versions like large-v2
|
||||
if let Some((b, v)) = head.split_once('-') {
|
||||
return (b.to_string(), v.to_string());
|
||||
}
|
||||
|
||||
(head.to_string(), variant)
|
||||
}
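Worked examples of the mapping documented in the comments above, written as an illustrative test module (not part of the diff):

```rust
#[cfg(test)]
mod base_variant_examples {
    use super::parse_base_variant;

    #[test]
    fn splits_base_and_variant() {
        // Mirrors the cases listed in the comments: no suffix -> "default",
        // dot suffix -> language variant, hyphen suffix -> version variant.
        assert_eq!(parse_base_variant("tiny"), ("tiny".into(), "default".into()));
        assert_eq!(parse_base_variant("tiny.en"), ("tiny".into(), "en".into()));
        assert_eq!(parse_base_variant("large-v3"), ("large".into(), "v3".into()));
    }
}
```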
|
||||
|
||||
/// Build a manifest by calling the Hugging Face API for a repo.
|
||||
/// Prefers the plain API URL, then retries with `?expand=files` if needed.
|
||||
fn hf_repo_manifest_api(repo: &str) -> Result<Vec<ModelEntry>> {
|
||||
let client = Client::builder().user_agent("polyscribe/0.1").build()?;
|
||||
let client = Client::builder()
|
||||
.user_agent(ConfigService::user_agent())
|
||||
.build()?;
|
||||
|
||||
// 1) Try the plain API you specified
|
||||
let base = format!("https://huggingface.co/api/models/{}", repo);
|
||||
let base = ConfigService::hf_api_base_for(repo);
|
||||
let resp = client.get(&base).send()?;
|
||||
let mut entries = if resp.status().is_success() {
|
||||
let info: HfModelInfo = resp.json()?;
|
||||
@@ -202,7 +161,6 @@ fn hf_repo_manifest_api(repo: &str) -> Result<Vec<ModelEntry>> {
|
||||
Vec::new()
|
||||
};
|
||||
|
||||
// 2) If empty, try with expand=files (some repos require this for full file listing)
|
||||
if entries.is_empty() {
|
||||
let url = format!("{base}?expand=files");
|
||||
let resp2 = client.get(&url).send()?;
|
||||
@@ -228,7 +186,6 @@ fn hf_info_to_entries(repo: &str, info: HfModelInfo) -> Result<Vec<ModelEntry>>
|
||||
continue;
|
||||
}
|
||||
|
||||
// Derive a simple display name from the file stem
|
||||
let stem = fname.strip_suffix(".bin").unwrap_or(&fname).to_string();
|
||||
let name_no_prefix = stem
|
||||
.strip_prefix("ggml-")
|
||||
@@ -236,7 +193,6 @@ fn hf_info_to_entries(repo: &str, info: HfModelInfo) -> Result<Vec<ModelEntry>>
|
||||
.unwrap_or(&stem)
|
||||
.to_string();
|
||||
|
||||
// Prefer explicit sha256; else try to parse from LFS oid "sha256:<hex>"
|
||||
let sha_from_lfs = f.lfs.as_ref().and_then(|l| {
|
||||
l.sha256.clone().or_else(|| {
|
||||
l.oid
|
||||
@@ -268,12 +224,11 @@ fn hf_info_to_entries(repo: &str, info: HfModelInfo) -> Result<Vec<ModelEntry>>
|
||||
Ok(out)
|
||||
}
|
||||
|
||||
// -------- HTML scraping fallback (tree view) --------
|
||||
|
||||
/// Scrape the repository tree page when the API doesn't return a usable list.
|
||||
/// Note: sizes and hashes are generally unavailable in this path.
|
||||
fn scrape_tree_manifest(repo: &str) -> Result<Vec<ModelEntry>> {
|
||||
let client = Client::builder().user_agent("polyscribe/0.1").build()?;
|
||||
let client = Client::builder()
|
||||
.user_agent(ConfigService::user_agent())
|
||||
.build()?;
|
||||
|
||||
let url = format!("https://huggingface.co/{}/tree/main?recursive=1", repo);
|
||||
let resp = client.get(&url).send()?;
|
||||
@@ -282,10 +237,6 @@ fn scrape_tree_manifest(repo: &str) -> Result<Vec<ModelEntry>> {
|
||||
}
|
||||
let html = resp.text()?;
|
||||
|
||||
// Extract .bin paths from links. Match both blob/main and resolve/main.
|
||||
// Example matches:
|
||||
// - /{repo}/blob/main/ggml-base.en.bin
|
||||
// - /{repo}/resolve/main/ggml-base.en.bin
|
||||
let mut files = BTreeSet::new();
|
||||
for mat in html.match_indices(".bin") {
|
||||
let end = mat.0 + 4;
|
||||
@@ -346,13 +297,8 @@ fn scrape_tree_manifest(repo: &str) -> Result<Vec<ModelEntry>> {
|
||||
Ok(out)
|
||||
}
|
||||
|
||||
// -------- Metadata enrichment via HEAD (size/hash/last-modified) --------
|
||||
|
||||
fn parse_sha_from_header_value(s: &str) -> Option<String> {
|
||||
// Common HF patterns:
|
||||
// - ETag: "SHA256:<hex>"
|
||||
// - X-Linked-ETag: "SHA256:<hex>"
|
||||
// - Sometimes weak etags: W/"SHA256:<hex>"
|
||||
let lower = s.to_ascii_lowercase();
|
||||
if let Some(idx) = lower.find("sha256:") {
|
||||
let tail = &lower[idx + "sha256:".len()..];
|
||||
@@ -365,14 +311,13 @@ fn parse_sha_from_header_value(s: &str) -> Option<String> {
|
||||
}
|
||||
|
||||
fn enrich_entry_via_head(entry: &mut ModelEntry) -> Result<()> {
|
||||
// If we already have everything, nothing to do
|
||||
if entry.size.is_some() && entry.sha256.is_some() && entry.last_modified.is_some() {
|
||||
return Ok(());
|
||||
}
|
||||
|
||||
let client = Client::builder()
|
||||
.user_agent("polyscribe/0.1")
|
||||
.timeout(Duration::from_secs(8))
|
||||
.user_agent(ConfigService::user_agent())
|
||||
.timeout(Duration::from_secs(ConfigService::http_timeout_secs()))
|
||||
.build()?;
|
||||
|
||||
let mut head_url = entry.url.clone();
|
||||
@@ -397,7 +342,6 @@ fn enrich_entry_via_head(entry: &mut ModelEntry) -> Result<()> {
|
||||
let mut filled_sha = false;
|
||||
let mut filled_lm = false;
|
||||
|
||||
// Content-Length
|
||||
if entry.size.is_none()
|
||||
&& let Some(sz) = resp
|
||||
.headers()
|
||||
@@ -409,7 +353,6 @@ fn enrich_entry_via_head(entry: &mut ModelEntry) -> Result<()> {
|
||||
filled_size = true;
|
||||
}
|
||||
|
||||
// SHA256 from headers if available
|
||||
if entry.sha256.is_none() {
|
||||
let _ = resp
|
||||
.headers()
|
||||
@@ -433,7 +376,6 @@ fn enrich_entry_via_head(entry: &mut ModelEntry) -> Result<()> {
|
||||
}
|
||||
}
|
||||
|
||||
// Last-Modified
|
||||
if entry.last_modified.is_none() {
|
||||
let _ = resp
|
||||
.headers()
|
||||
@@ -477,28 +419,204 @@ fn enrich_entry_via_head(entry: &mut ModelEntry) -> Result<()> {
|
||||
Ok(())
|
||||
}
|
||||
|
||||
// -------- Online manifest (API first, then scrape) --------
|
||||
#[derive(Debug, Serialize, Deserialize)]
|
||||
struct CachedManifest {
|
||||
fetched_at: u64,
|
||||
etag: Option<String>,
|
||||
last_modified: Option<String>,
|
||||
entries: Vec<ModelEntry>,
|
||||
}
|
||||
|
||||
fn get_cache_dir() -> Result<PathBuf> {
|
||||
Ok(ConfigService::manifest_cache_dir()
|
||||
.ok_or_else(|| anyhow!("could not determine platform directories"))?)
|
||||
}
|
||||
|
||||
fn get_cached_manifest_path() -> Result<PathBuf> {
|
||||
let cache_dir = get_cache_dir()?;
|
||||
Ok(cache_dir.join(ConfigService::manifest_cache_filename()))
|
||||
}
|
||||
|
||||
fn should_bypass_cache() -> bool {
|
||||
ConfigService::bypass_manifest_cache()
|
||||
}
|
||||
|
||||
fn get_cache_ttl() -> u64 {
|
||||
ConfigService::manifest_cache_ttl_seconds()
|
||||
}
|
||||
|
||||
fn load_cached_manifest() -> Option<CachedManifest> {
|
||||
if should_bypass_cache() {
|
||||
return None;
|
||||
}
|
||||
|
||||
let cache_path = get_cached_manifest_path().ok()?;
|
||||
if !cache_path.exists() {
|
||||
return None;
|
||||
}
|
||||
|
||||
let cache_file = File::open(cache_path).ok()?;
|
||||
let cached: CachedManifest = serde_json::from_reader(cache_file).ok()?;
|
||||
|
||||
let now = SystemTime::now().duration_since(UNIX_EPOCH).ok()?.as_secs();
|
||||
|
||||
let ttl = get_cache_ttl();
|
||||
if now.saturating_sub(cached.fetched_at) > ttl {
|
||||
crate::dlog!(
|
||||
1,
|
||||
"Cache expired (age: {}s, TTL: {}s)",
|
||||
now.saturating_sub(cached.fetched_at),
|
||||
ttl
|
||||
);
|
||||
return None;
|
||||
}
|
||||
|
||||
crate::dlog!(
|
||||
1,
|
||||
"Using cached manifest (age: {}s)",
|
||||
now.saturating_sub(cached.fetched_at)
|
||||
);
|
||||
Some(cached)
|
||||
}
|
||||
|
||||
fn save_manifest_to_cache(
|
||||
entries: &[ModelEntry],
|
||||
etag: Option<&str>,
|
||||
last_modified: Option<&str>,
|
||||
) -> Result<()> {
|
||||
if should_bypass_cache() {
|
||||
return Ok(());
|
||||
}
|
||||
|
||||
let cache_dir = get_cache_dir()?;
|
||||
fs::create_dir_all(&cache_dir)?;
|
||||
|
||||
let cache_path = get_cached_manifest_path()?;
|
||||
let now = SystemTime::now()
|
||||
.duration_since(UNIX_EPOCH)
|
||||
.map_err(|_| anyhow!("system time error"))?
|
||||
.as_secs();
|
||||
|
||||
let cached = CachedManifest {
|
||||
fetched_at: now,
|
||||
etag: etag.map(|s| s.to_string()),
|
||||
last_modified: last_modified.map(|s| s.to_string()),
|
||||
entries: entries.to_vec(),
|
||||
};
|
||||
|
||||
let cache_file = OpenOptions::new()
|
||||
.create(true)
|
||||
.write(true)
|
||||
.truncate(true)
|
||||
.open(&cache_path)
|
||||
.with_context(|| format!("opening cache file {}", cache_path.display()))?;
|
||||
|
||||
serde_json::to_writer_pretty(cache_file, &cached)
|
||||
.with_context(|| "serializing cached manifest")?;
|
||||
|
||||
crate::dlog!(1, "Saved manifest to cache: {} entries", entries.len());
|
||||
Ok(())
|
||||
}
|
||||
|
||||
fn fetch_manifest_with_cache() -> Result<Vec<ModelEntry>> {
|
||||
let cached = load_cached_manifest();
|
||||
|
||||
let client = Client::builder()
|
||||
.user_agent(ConfigService::user_agent())
|
||||
.build()?;
|
||||
let repo = ConfigService::hf_repo();
|
||||
let base_url = ConfigService::hf_api_base_for(&repo);
|
||||
|
||||
let mut req = client.get(&base_url);
|
||||
if let Some(ref cached) = cached {
|
||||
if let Some(ref etag) = cached.etag {
|
||||
req = req.header("If-None-Match", format!("\"{}\"", etag));
|
||||
} else if let Some(ref last_mod) = cached.last_modified {
|
||||
req = req.header("If-Modified-Since", last_mod);
|
||||
}
|
||||
}
|
||||
|
||||
let resp = req.send()?;
|
||||
|
||||
if resp.status().as_u16() == 304 {
|
||||
if let Some(cached) = cached {
|
||||
crate::dlog!(1, "Manifest not modified, using cache");
|
||||
return Ok(cached.entries);
|
||||
}
|
||||
}
|
||||
|
||||
if !resp.status().is_success() {
|
||||
return Err(anyhow!("HF API {} for {}", resp.status(), base_url).into());
|
||||
}
|
||||
|
||||
let etag = resp
|
||||
.headers()
|
||||
.get(ETAG)
|
||||
.and_then(|v| v.to_str().ok())
|
||||
.map(|s| s.trim_matches('"').to_string());
|
||||
|
||||
let last_modified = resp
|
||||
.headers()
|
||||
.get(LAST_MODIFIED)
|
||||
.and_then(|v| v.to_str().ok())
|
||||
.map(|s| s.to_string());
|
||||
|
||||
let info: HfModelInfo = resp.json()?;
|
||||
let mut entries = hf_info_to_entries(&repo, info)?;
|
||||
|
||||
if entries.is_empty() {
|
||||
let url = format!("{}?expand=files", base_url);
|
||||
let resp2 = client.get(&url).send()?;
|
||||
if !resp2.status().is_success() {
|
||||
return Err(anyhow!("HF API {} for {}", resp2.status(), url).into());
|
||||
}
|
||||
let info: HfModelInfo = resp2.json()?;
|
||||
entries = hf_info_to_entries(&repo, info)?;
|
||||
}
|
||||
|
||||
if entries.is_empty() {
|
||||
return Err(anyhow!("HF API returned no usable .bin files").into());
|
||||
}
|
||||
|
||||
let _ = save_manifest_to_cache(&entries, etag.as_deref(), last_modified.as_deref());
|
||||
|
||||
Ok(entries)
|
||||
}

/// Returns the current manifest (online only).
fn current_manifest() -> Result<Vec<ModelEntry>> {
    let started = Instant::now();
    crate::dlog!(1, "Fetching HF manifest…");

    // 1) Load from API, else scrape
    let mut list = match hf_repo_manifest_api("ggerganov/whisper.cpp") {
    let mut list = match fetch_manifest_with_cache() {
        Ok(list) if !list.is_empty() => {
            crate::dlog!(1, "Manifest loaded from HF API ({} entries)", list.len());
            crate::dlog!(
                1,
                "Manifest loaded from HF API with cache ({} entries)",
                list.len()
            );
            list
        }
        _ => {
            crate::ilog!("Falling back to scraping the repository tree page");
            let scraped = scrape_tree_manifest("ggerganov/whisper.cpp")?;
            crate::dlog!(1, "Manifest loaded via scrape ({} entries)", scraped.len());
            scraped
            crate::ilog!("Cache failed, falling back to direct API");
            let repo = ConfigService::hf_repo();
            let list = match hf_repo_manifest_api(&repo) {
                Ok(list) if !list.is_empty() => {
                    crate::dlog!(1, "Manifest loaded from HF API ({} entries)", list.len());
                    list
                }
                _ => {
                    crate::ilog!("Falling back to scraping the repository tree page");
                    let scraped = scrape_tree_manifest(&repo)?;
                    crate::dlog!(1, "Manifest loaded via scrape ({} entries)", scraped.len());
                    scraped
                }
            };

            let _ = save_manifest_to_cache(&list, None, None);
            list
        }
    };

    // 2) Enrich missing metadata so the UI can show sizes and hashes
    let mut need_enrich = 0usize;
    for m in &list {
        if m.size.is_none() || m.sha256.is_none() || m.last_modified.is_none() {
@@ -532,8 +650,6 @@ fn current_manifest() -> Result<Vec<ModelEntry>> {
    Ok(list)
}

/// Pick the best local Whisper model in the given directory.
/// Heuristic: choose the largest .bin file by size. Returns None if none found.
pub fn pick_best_local_model(dir: &Path) -> Option<PathBuf> {
    let rd = fs::read_dir(dir).ok()?;
    rd.flatten()
@@ -549,39 +665,23 @@ pub fn pick_best_local_model(dir: &Path) -> Option<PathBuf> {
        .map(|(_, p)| p)
}
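The filtering body is elided by the hunk marker above; the following sketch spells out the documented "largest .bin wins" heuristic in full, using only the standard library. The helper name is illustrative.

```rust
use std::fs;
use std::path::{Path, PathBuf};

// Sketch: pick the largest *.bin file in `dir`, mirroring the heuristic
// documented on pick_best_local_model. Returns None when nothing matches.
fn largest_bin_in(dir: &Path) -> Option<PathBuf> {
    fs::read_dir(dir)
        .ok()?
        .flatten()
        .filter_map(|entry| {
            let path = entry.path();
            let is_bin = path
                .extension()
                .map(|ext| ext.eq_ignore_ascii_case("bin"))
                .unwrap_or(false);
            if !is_bin {
                return None;
            }
            let size = entry.metadata().ok()?.len();
            Some((size, path))
        })
        .max_by_key(|(size, _)| *size)
        .map(|(_, path)| path)
}
```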

/// Returns the directory where models should be stored based on platform conventions.
fn resolve_models_dir() -> Result<PathBuf> {
    let dirs = directories::ProjectDirs::from("dev", "polyscribe", "polyscribe")
        .ok_or_else(|| anyhow!("could not determine platform directories"))?;
    let data_dir = dirs.data_dir().join("models");
    Ok(data_dir)
    Ok(ConfigService::models_dir(None)
        .ok_or_else(|| anyhow!("could not determine models directory"))?)
}

// Example of a non-interactive path ensuring a given model by name exists, with improved copy.
// Wire this into CLI flags as needed.
/// Ensures a model is available by name, downloading it if necessary.
/// This is a non-interactive version that doesn't prompt the user.
///
/// # Arguments
/// * `name` - Name of the model to ensure is available
///
/// # Returns
/// * `Result<PathBuf>` - Path to the downloaded model file on success
pub fn ensure_model_available_noninteractive(name: &str) -> Result<PathBuf> {
    let entry = find_manifest_entry(name)?.ok_or_else(|| anyhow!("unknown model: {name}"))?;

    // Resolve destination file path; ensure XDG path (or your existing logic)
    let dir = resolve_models_dir()?; // implement or reuse your existing directory resolver
    let dir = resolve_models_dir()?;
    fs::create_dir_all(&dir).ok();
    let dest = dir.join(&entry.file);

    // If already matches, early return
    if file_matches(&dest, entry.size, entry.sha256.as_deref())? {
        crate::ui::info(format!("Already up to date: {}", dest.display()));
        return Ok(dest);
    }

    // Single-line header
    let base = &entry.base;
    let variant = &entry.variant;
    let size_str = format_size_mb(entry.size);
@@ -596,9 +696,16 @@ pub fn ensure_model_available_noninteractive(name: &str) -> Result<PathBuf> {
    Ok(dest)
}
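A hedged usage sketch of the non-interactive path: resolve a model by name before transcription. The crate path and the model name `base.en` are assumptions based on the prelude and manifest naming, not verified API.

```rust
use polyscribe_core::models::ensure_model_available_noninteractive;
use polyscribe_core::prelude::Result;

// Sketch: resolve (and if needed download) a model by name, then hand the
// path to the transcription backend. "base.en" is an illustrative name; any
// manifest display name, file stem, or file name accepted by
// find_manifest_entry should work.
fn prepare_model() -> Result<()> {
    let model_path = ensure_model_available_noninteractive("base.en")?;
    println!("using model at {}", model_path.display());
    Ok(())
}
```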

pub fn clear_manifest_cache() -> Result<()> {
    let cache_path = get_cached_manifest_path()?;
    if cache_path.exists() {
        fs::remove_file(&cache_path)?;
        crate::dlog!(1, "Cleared manifest cache");
    }
    Ok(())
}

fn find_manifest_entry(name: &str) -> Result<Option<ModelEntry>> {
    // Accept either manifest display name, file stem, or direct file name.
    // Normalize: strip ".bin" for comparisons and also handle input that already includes it.
    let wanted_name = name
        .strip_suffix(".bin")
        .unwrap_or(name)
@@ -622,10 +729,6 @@ fn find_manifest_entry(name: &str) -> Result<Option<ModelEntry>> {
    Ok(None)
}

// Return true if the file at `path` matches expected size and/or sha256 (when provided).
// - If sha256 is provided, verify it (preferred).
// - Else if size is provided, check size.
// - If neither provided, return false (cannot verify).
fn file_matches(path: &Path, size: Option<u64>, sha256_hex: Option<&str>) -> Result<bool> {
    if !path.exists() {
        return Ok(false);
@@ -655,21 +758,14 @@ fn file_matches(path: &Path, size: Option<u64>, sha256_hex: Option<&str>) -> Res
    Ok(false)
}
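The verification body is elided above; this is a minimal sketch of the sha256 comparison the comment describes, assuming the `sha2` crate. The project's actual hashing dependency and helper names may differ.

```rust
use std::fs::File;
use std::io::Read;
use std::path::Path;

use sha2::{Digest, Sha256};

// Sketch: stream a file through SHA-256 and compare against an expected
// lowercase hex digest, as described in the file_matches comment above.
fn sha256_matches(path: &Path, expected_hex: &str) -> std::io::Result<bool> {
    let mut file = File::open(path)?;
    let mut hasher = Sha256::new();
    let mut buf = [0u8; 64 * 1024];
    loop {
        let n = file.read(&mut buf)?;
        if n == 0 {
            break;
        }
        hasher.update(&buf[..n]);
    }
    let digest = hasher.finalize();
    let actual_hex: String = digest.iter().map(|b| format!("{b:02x}")).collect();
    Ok(actual_hex.eq_ignore_ascii_case(expected_hex))
}
```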

// Download with:
// - Free-space preflight (size * 1.1 overhead).
// - Resume via Range if .part exists and server supports it.
// - Atomic write: download to .part (temp) then rename.
// - Checksum verification when available.
// - Single-line progress UI.
fn download_with_progress(dest_path: &Path, entry: &ModelEntry) -> Result<()> {
    let url = &entry.url;
    let client = Client::builder()
        .user_agent("polyscribe-model-downloader/1")
        .user_agent(ConfigService::downloader_user_agent())
        .build()?;

    crate::ui::info(format!("Resolving source: {} ({})", mirror_label(url), url));

    // HEAD for size/etag/ranges
    let (mut total_len, remote_etag, _remote_last_mod, ranges_ok) =
        head_entry(&client, url).context("probing remote file")?;

@@ -710,9 +806,6 @@ fn download_with_progress(dest_path: &Path, entry: &ModelEntry) -> Result<()> {
        .open(&part_path)
        .with_context(|| format!("opening {}", part_path.display()))?;

    // Build request:
    // - Fresh download: plain GET (no If-None-Match).
    // - Resume: Range + optional If-Range with ETag.
    let mut req = client.get(url);
    if ranges_ok && resume_from > 0 {
        req = req.header(RANGE, format!("bytes={resume_from}-"));
@@ -729,30 +822,21 @@ fn download_with_progress(dest_path: &Path, entry: &ModelEntry) -> Result<()> {
    let start = Instant::now();
    let mut resp = req.send()?.error_for_status()?;

    // Defensive: if server returns 304 but we don't have a valid cached copy, retry without conditionals.
    if resp.status().as_u16() == 304 && resume_from == 0 {
        // Fresh download must not be conditional; redo as plain GET
        let req2 = client.get(url);
        resp = req2.send()?.error_for_status()?;
    }

    // If server ignored RANGE and returned full body, reset partial
    let is_partial_response = resp.headers().get(CONTENT_RANGE).is_some();
    if resume_from > 0 && !is_partial_response {
        // Server did not honor range → start over
        drop(part_file);
        fs::remove_file(&part_path).ok();
        // Reset local accounting; we also reinitialize the progress bar below
        // and reopen the part file. No need to re-read this variable afterwards.
        let _ = 0; // avoid unused-assignment lint for resume_from

        // Plain GET without conditional headers
        let req2 = client.get(url);
        resp = req2.send()?.error_for_status()?;
        bar.stop("restarting");
        bar = crate::ui::BytesProgress::start(pb_total, "Downloading", 0);

        // Reopen the part file since we dropped it
        part_file = OpenOptions::new()
            .create(true)
            .read(true)
@@ -842,10 +926,6 @@ fn download_with_progress(dest_path: &Path, entry: &ModelEntry) -> Result<()> {
    Ok(())
}
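The streaming and rename tail is elided; below is a small sketch of the atomic-write step named in the comment block above (stream into `<dest>.part`, verify, then rename). The helper name is illustrative.

```rust
use std::ffi::OsString;
use std::fs;
use std::path::{Path, PathBuf};

// Sketch: finish an atomic download. The caller has already streamed all
// bytes into "<dest>.part"; renaming within the same directory makes the
// final file appear either complete or not at all.
fn finalize_download(dest_path: &Path) -> std::io::Result<()> {
    // Derive "<dest>.part" by appending to the full file name.
    let mut part_name: OsString = dest_path.as_os_str().to_owned();
    part_name.push(".part");
    let part_path = PathBuf::from(part_name);

    // Size / sha256 verification of `part_path` happens before this point.
    fs::rename(&part_path, dest_path)?;
    Ok(())
}
```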

/// Run an interactive model downloader UI (2-step):
/// 1) Choose model base (tiny, small, base, medium, large)
/// 2) Choose model type/variant specific to that base
/// Displays meta info (size and last updated). Does not show raw ggml filenames.
pub fn run_interactive_model_downloader() -> Result<()> {
    use crate::ui;

@@ -877,7 +957,6 @@ pub fn run_interactive_model_downloader() -> Result<()> {

    ui::intro("PolyScribe model downloader");

    // Build Select items for bases with counts and size ranges
    let mut base_labels: Vec<String> = Vec::new();
    for base in &ordered_bases {
        let variants = &by_base[base];
@@ -904,7 +983,6 @@ pub fn run_interactive_model_downloader() -> Result<()> {
    let base_idx = ui::prompt_select("Choose a model base", &base_refs)?;
    let chosen_base = ordered_bases[base_idx].clone();

    // Prepare variant list for chosen base
    let mut variants = by_base.remove(&chosen_base).unwrap_or_default();
    variants.sort_by(|a, b| {
        let rank = |v: &str| match v {
@@ -917,7 +995,6 @@ pub fn run_interactive_model_downloader() -> Result<()> {
            .then_with(|| a.variant.cmp(&b.variant))
    });

    // Build Multi-Select items for variants
    let mut variant_labels: Vec<String> = Vec::new();
    for m in &variants {
        let size = format_size_mb(m.size.as_ref().copied());
@@ -953,7 +1030,6 @@ pub fn run_interactive_model_downloader() -> Result<()> {

    ui::println_above_bars("Downloading selected models...");

    // Setup multi-progress when multiple items are selected
    let labels: Vec<String> = picks
        .iter()
        .map(|&i| {
@@ -961,12 +1037,12 @@ pub fn run_interactive_model_downloader() -> Result<()> {
            format!("{} ({})", m.name, format_size_mb(m.size))
        })
        .collect();
    let mut pm = ui::progress::ProgressManager::default_for_files(labels.len());
    let mut pm = ui::progress::FileProgress::default_for_files(labels.len());
    pm.init_files(&labels);

    for (bar_idx, idx) in picks.into_iter().enumerate() {
        let picked = variants[idx].clone();
        pm.set_per_message(bar_idx, "downloading");
        pm.set_file_message(bar_idx, "downloading");
        let _path = ensure_model_available_noninteractive(&picked.name)?;
        pm.mark_file_done(bar_idx);
        ui::success(format!("Ready: {}", picked.name));
@@ -977,9 +1053,6 @@ pub fn run_interactive_model_downloader() -> Result<()> {
    Ok(())
}

/// Verify/update local models by comparing with the online manifest.
/// - If a model file exists and matches expected size/hash (when provided), it is kept.
/// - If missing or mismatched, it will be downloaded.
pub fn update_local_models() -> Result<()> {
    use crate::ui;
    use std::collections::HashMap;
@@ -990,7 +1063,6 @@ pub fn update_local_models() -> Result<()> {

    ui::info("Checking locally available models, then verifying against the online manifest…");

    // Index manifest by filename and by stem/display name for matching.
    let mut by_file: HashMap<String, ModelEntry> = HashMap::new();
    let mut by_stem_or_name: HashMap<String, ModelEntry> = HashMap::new();
    for m in manifest {
@@ -1007,7 +1079,6 @@ pub fn update_local_models() -> Result<()> {
    let mut updated = 0usize;
    let mut up_to_date = 0usize;

    // Enumerate only local .bin files.
    let rd = fs::read_dir(&dir).with_context(|| format!("reading models dir {}", dir.display()))?;
    let entries: Vec<_> = rd.flatten().collect();

@@ -1034,7 +1105,6 @@ pub fn update_local_models() -> Result<()> {
        let file_lc = file_name.to_ascii_lowercase();
        let stem_lc = file_lc.strip_suffix(".bin").unwrap_or(&file_lc).to_string();

        // Try to find a matching manifest entry for this local file.
        let mut manifest_entry = by_file
            .get(&file_lc)
            .or_else(|| by_stem_or_name.get(&stem_lc))
@@ -1048,24 +1118,20 @@ pub fn update_local_models() -> Result<()> {
            continue;
        };

        // Enrich metadata before verification (helps when API lacked size/hash)
        let _ = enrich_entry_via_head(&mut m);

        // Determine target filename from manifest; if different, download to the canonical name.
        let target_path = if m.file.eq_ignore_ascii_case(&file_name) {
            path.clone()
        } else {
            dir.join(&m.file)
        };

        // If the target already exists and matches (size/hash when available), it is up-to-date.
        if target_path.exists() && file_matches(&target_path, m.size, m.sha256.as_deref())? {
            crate::dlog!(1, "OK: {}", target_path.display());
            up_to_date += 1;
            continue;
        }

        // If the current file is the same as the target and mismatched, remove before re-download.
        if target_path == path && target_path.exists() {
            crate::ilog!("Updating {}", file_name);
            let _ = fs::remove_file(&target_path);
@@ -1088,3 +1154,76 @@ pub fn update_local_models() -> Result<()> {

    Ok(())
}

#[cfg(test)]
mod tests {
    use super::*;
    use std::env;

    #[test]
    fn test_cache_bypass_environment() {
        unsafe {
            env::remove_var(ConfigService::ENV_NO_CACHE_MANIFEST);
        }
        assert!(!should_bypass_cache());

        unsafe {
            env::set_var(ConfigService::ENV_NO_CACHE_MANIFEST, "1");
        }
        assert!(should_bypass_cache());

        unsafe {
            env::remove_var(ConfigService::ENV_NO_CACHE_MANIFEST);
        }
    }

    #[test]
    fn test_cache_ttl_environment() {
        unsafe {
            env::remove_var(ConfigService::ENV_MANIFEST_TTL_SECONDS);
        }
        assert_eq!(
            get_cache_ttl(),
            ConfigService::DEFAULT_MANIFEST_CACHE_TTL_SECONDS
        );

        unsafe {
            env::set_var(ConfigService::ENV_MANIFEST_TTL_SECONDS, "3600");
        }
        assert_eq!(get_cache_ttl(), 3600);

        unsafe {
            env::remove_var(ConfigService::ENV_MANIFEST_TTL_SECONDS);
        }
    }

    #[test]
    fn test_cached_manifest_serialization() {
        let entries = vec![ModelEntry {
            name: "test".to_string(),
            file: "test.bin".to_string(),
            url: "https://example.com/test.bin".to_string(),
            size: Some(1024),
            sha256: Some("abc123".to_string()),
            last_modified: Some("2023-01-01T00:00:00Z".to_string()),
            base: "test".to_string(),
            variant: "default".to_string(),
        }];

        let cached = CachedManifest {
            fetched_at: 1234567890,
            etag: Some("etag123".to_string()),
            last_modified: Some("2023-01-01T00:00:00Z".to_string()),
            entries: entries.clone(),
        };

        let json = serde_json::to_string(&cached).unwrap();
        let deserialized: CachedManifest = serde_json::from_str(&json).unwrap();

        assert_eq!(deserialized.fetched_at, cached.fetched_at);
        assert_eq!(deserialized.etag, cached.etag);
        assert_eq!(deserialized.last_modified, cached.last_modified);
        assert_eq!(deserialized.entries.len(), entries.len());
        assert_eq!(deserialized.entries[0].name, entries[0].name);
    }
}
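For orientation, a hedged sketch of the freshness rule these tests exercise: a cached manifest counts as fresh while less than the TTL has elapsed since `fetched_at`. The helper below is illustrative; the real check lives in the elided cache helpers.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

// Sketch: decide whether a CachedManifest is still fresh, given its
// `fetched_at` Unix timestamp and a TTL in seconds (as returned by
// get_cache_ttl in the tests above).
fn cache_is_fresh(fetched_at: u64, ttl_seconds: u64) -> bool {
    let now = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_secs())
        .unwrap_or(0);
    now.saturating_sub(fetched_at) < ttl_seconds
}
```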

@@ -1,16 +1,7 @@
// rust
//! Commonly used exports for convenient glob-imports in binaries and tests.
//! Usage: `use polyscribe_core::prelude::*;`

pub use crate::backend::*;
pub use crate::config::*;
pub use crate::error::Error;
pub use crate::models::*;

// If you frequently use UI helpers across binaries/tests, export them too.
// Keep this lean to avoid pulling UI everywhere unintentionally.
#[allow(unused_imports)]
pub use crate::ui::*;

/// A convenient alias for `std::result::Result` with the error type defaulting to [`Error`].
pub type Result<T, E = Error> = std::result::Result<T, E>;
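A short usage example matching the doc comment above; `run` is an illustrative function name.

```rust
// Pulls in Error, the Result alias, and the backend/config/model re-exports.
use polyscribe_core::prelude::*;

fn run() -> Result<()> {
    // `Result` here is the prelude alias with `Error` as the default error type.
    Ok(())
}
```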

@@ -1,62 +1,46 @@
// SPDX-License-Identifier: MIT
// Copyright (c) 2025 <COPYRIGHT HOLDER>. All rights reserved.

//! UI helpers powered by cliclack for interactive console experiences.
//! Centralizes prompts, logging, and progress primitives.

/// Progress indicators and reporting tools for displaying task completion.
pub mod progress;

use std::io;
use std::io::IsTerminal;

/// Log an informational message.
pub fn info(msg: impl AsRef<str>) {
    let m = msg.as_ref();
    let _ = cliclack::log::info(m);
}

/// Log a warning message.
pub fn warn(msg: impl AsRef<str>) {
    let m = msg.as_ref();
    let _ = cliclack::log::warning(m);
}

/// Log an error message.
pub fn error(msg: impl AsRef<str>) {
    let m = msg.as_ref();
    let _ = cliclack::log::error(m);
}

/// Log a success message.
pub fn success(msg: impl AsRef<str>) {
    let m = msg.as_ref();
    let _ = cliclack::log::success(m);
}

/// Log a note message with a prompt and a message.
pub fn note(prompt: impl AsRef<str>, message: impl AsRef<str>) {
    let _ = cliclack::note(prompt.as_ref(), message.as_ref());
}

/// Print a short intro header.
pub fn intro(title: impl AsRef<str>) {
    let _ = cliclack::intro(title.as_ref());
}

/// Print a short outro footer.
pub fn outro(msg: impl AsRef<str>) {
    let _ = cliclack::outro(msg.as_ref());
}

/// Print a line that should appear above any progress indicators.
pub fn println_above_bars(line: impl AsRef<str>) {
    let _ = cliclack::log::info(line.as_ref());
}

/// Prompt for input on stdin using cliclack's input component.
/// Returns default if provided and user enters empty string.
/// In non-interactive workflows, callers should skip prompt based on their flags.
pub fn prompt_input(prompt: &str, default: Option<&str>) -> io::Result<String> {
    if crate::is_no_interaction() || !crate::stdin_is_tty() {
        return Ok(default.unwrap_or("").to_string());
@@ -68,7 +52,6 @@ pub fn prompt_input(prompt: &str, default: Option<&str>) -> io::Result<String> {
    q.interact().map_err(|e| io::Error::other(e.to_string()))
}

/// Present a single-choice selector and return the selected index.
pub fn prompt_select(prompt: &str, items: &[&str]) -> io::Result<usize> {
    if crate::is_no_interaction() || !crate::stdin_is_tty() {
        return Err(io::Error::other("interactive prompt disabled"));
@@ -80,7 +63,6 @@ pub fn prompt_select(prompt: &str, items: &[&str]) -> io::Result<usize> {
    sel.interact().map_err(|e| io::Error::other(e.to_string()))
}

/// Present a multi-choice selector and return indices of selected items.
pub fn prompt_multi_select(
    prompt: &str,
    items: &[&str],
@@ -106,17 +88,14 @@ pub fn prompt_multi_select(
    ms.interact().map_err(|e| io::Error::other(e.to_string()))
}

/// Confirm prompt with default, respecting non-interactive mode.
pub fn prompt_confirm(prompt: &str, default: bool) -> io::Result<bool> {
    if crate::is_no_interaction() || !crate::stdin_is_tty() {
        return Ok(default);
    }
    let mut q = cliclack::confirm(prompt);
    // If `cliclack::confirm` lacks default, we simply ask; caller can handle ESC/cancel if needed.
    q.interact().map_err(|e| io::Error::other(e.to_string()))
}

/// Read a secret/password without echoing, respecting non-interactive mode.
pub fn prompt_password(prompt: &str) -> io::Result<String> {
    if crate::is_no_interaction() || !crate::stdin_is_tty() {
        return Err(io::Error::other(
@@ -127,7 +106,6 @@ pub fn prompt_password(prompt: &str) -> io::Result<String> {
    q.interact().map_err(|e| io::Error::other(e.to_string()))
}

/// Input with validation closure; on non-interactive returns default or error when no default.
pub fn prompt_input_validated<F>(
    prompt: &str,
    default: Option<&str>,
@@ -151,18 +129,12 @@ where
        .map_err(|e| io::Error::other(e.to_string()))
}
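A hedged usage sketch of the prompt helpers above; the labels and option strings are illustrative, and the module path assumes the `polyscribe_core` crate seen in the prelude. In non-interactive runs the helpers fall back to the documented defaults or errors.

```rust
use polyscribe_core::ui;

// Sketch: drive a simple interactive flow with the helpers above.
fn choose_backend() -> std::io::Result<()> {
    ui::intro("PolyScribe setup");

    let options = ["cpu", "cuda", "vulkan"];
    let idx = ui::prompt_select("Choose a backend", &options)?;
    ui::info(format!("selected: {}", options[idx]));

    if ui::prompt_confirm("Save this as the default?", true)? {
        ui::success("Default saved");
    }

    ui::outro("Done");
    Ok(())
}
```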

/// A simple spinner wrapper built on top of `cliclack::spinner()`.
///
/// This wrapper provides a minimal API with start/stop/success/error methods
/// to standardize spinner usage across the project.
pub struct Spinner(cliclack::ProgressBar);

impl Spinner {
    /// Creates and starts a new spinner with the provided status text.
    pub fn start(text: impl AsRef<str>) -> Self {
        if crate::is_no_progress() || crate::is_no_interaction() || !std::io::stderr().is_terminal()
        {
            // Fallback: no spinner, but log start
            let _ = cliclack::log::info(text.as_ref());
            let s = cliclack::spinner();
            Self(s)
@@ -172,7 +144,6 @@ impl Spinner {
            Self(s)
        }
    }
    /// Stops the spinner with a submitted/completed style and message.
    pub fn stop(self, text: impl AsRef<str>) {
        let s = self.0;
        if crate::is_no_progress() {
@@ -181,17 +152,14 @@ impl Spinner {
            s.stop(text.as_ref());
        }
    }
    /// Marks the spinner as successfully finished (alias for `stop`).
    pub fn success(self, text: impl AsRef<str>) {
        let s = self.0;
        // cliclack progress bar uses `stop` for successful completion styling
        if crate::is_no_progress() {
            let _ = cliclack::log::success(text.as_ref());
        } else {
            s.stop(text.as_ref());
        }
    }
    /// Marks the spinner as failed with an error style and message.
    pub fn error(self, text: impl AsRef<str>) {
        let s = self.0;
        if crate::is_no_progress() {
@@ -202,11 +170,9 @@ impl Spinner {
        }
    }
}
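A small usage sketch of the spinner wrapper; the messages and the crate path are illustrative.

```rust
use polyscribe_core::ui::Spinner;

// Sketch: wrap a long-running step in a spinner; Spinner::start already
// honors --no-progress and non-interactive mode, so callers don't need to.
fn convert_audio() {
    let spinner = Spinner::start("Extracting audio with ffmpeg…");
    // ... do the work ...
    spinner.success("Audio extracted");
}
```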

/// Byte-count progress bar that respects `--no-progress` and TTY state.
pub struct BytesProgress(Option<cliclack::ProgressBar>);

impl BytesProgress {
    /// Start a new progress bar with a total and initial position.
    pub fn start(total: u64, text: &str, initial: u64) -> Self {
        if crate::is_no_progress()
            || crate::is_no_interaction()
@@ -224,14 +190,12 @@ impl BytesProgress {
        Self(Some(b))
    }

    /// Increment by delta bytes.
    pub fn inc(&mut self, delta: u64) {
        if let Some(b) = self.0.as_mut() {
            b.inc(delta);
        }
    }

    /// Stop with a message.
    pub fn stop(mut self, text: &str) {
        if let Some(b) = self.0.take() {
            b.stop(text);
@@ -240,7 +204,6 @@ impl BytesProgress {
        }
    }

    /// Mark as error with a message.
    pub fn error(mut self, text: &str) {
        if let Some(b) = self.0.take() {
            b.error(text);

@@ -1,125 +1,109 @@
// SPDX-License-Identifier: MIT
// Copyright (c) 2025 <COPYRIGHT HOLDER>. All rights reserved.

use std::io::IsTerminal as _;

/// Manages a set of per-file progress bars plus a top aggregate bar using cliclack.
pub struct ProgressManager {
pub struct FileProgress {
    enabled: bool,
    per: Vec<cliclack::ProgressBar>,
    total: Option<cliclack::ProgressBar>,
    file_bars: Vec<cliclack::ProgressBar>,
    total_bar: Option<cliclack::ProgressBar>,
    completed: usize,
    total_len: usize,
    total_file_count: usize,
}

impl ProgressManager {
    /// Create a new manager with the given enabled flag.
impl FileProgress {
    pub fn new(enabled: bool) -> Self {
        Self {
            enabled,
            per: Vec::new(),
            total: None,
            file_bars: Vec::new(),
            total_bar: None,
            completed: 0,
            total_len: 0,
            total_file_count: 0,
        }
    }

    /// Create a manager that enables bars when `n > 1`, stderr is a TTY, and not quiet.
    pub fn default_for_files(n: usize) -> Self {
        let enabled = n > 1
    pub fn default_for_files(file_count: usize) -> Self {
        let enabled = file_count > 1
            && std::io::stderr().is_terminal()
            && !crate::is_quiet()
            && !crate::is_no_progress();
        Self::new(enabled)
    }

    /// Initialize bars for the given file labels. If disabled or single file, no-op.
    pub fn init_files(&mut self, labels: &[String]) {
        self.total_len = labels.len();
        self.total_file_count = labels.len();
        if !self.enabled || labels.len() <= 1 {
            // No bars in single-file mode or when disabled
            self.enabled = false;
            return;
        }
        // Aggregate bar at the top
        let total = cliclack::progress_bar(labels.len() as u64);
        total.start("Total");
        self.total = Some(total);
        // Per-file bars (100% scale for each)
        self.total_bar = Some(total);
        for label in labels {
            let pb = cliclack::progress_bar(100);
            pb.start(label);
            self.per.push(pb);
            self.file_bars.push(pb);
        }
    }

    /// Returns true when bars are enabled (multi-file TTY mode).
    pub fn is_enabled(&self) -> bool {
        self.enabled
    }

    /// Update a per-file bar message.
    pub fn set_per_message(&mut self, idx: usize, message: &str) {
    pub fn set_file_message(&mut self, idx: usize, message: &str) {
        if !self.enabled {
            return;
        }
        if let Some(pb) = self.per.get_mut(idx) {
        if let Some(pb) = self.file_bars.get_mut(idx) {
            pb.set_message(message);
        }
    }

    /// Update a per-file bar percent (0..=100).
    pub fn set_per_percent(&mut self, idx: usize, percent: u64) {
    pub fn set_file_percent(&mut self, idx: usize, percent: u64) {
        if !self.enabled {
            return;
        }
        if let Some(pb) = self.per.get_mut(idx) {
        if let Some(pb) = self.file_bars.get_mut(idx) {
            let p = percent.min(100);
            pb.set_message(format!("{p}%"));
        }
    }

    /// Mark a file as finished (set to 100% and update total counter).
    pub fn mark_file_done(&mut self, idx: usize) {
        if !self.enabled {
            return;
        }
        if let Some(pb) = self.per.get_mut(idx) {
        if let Some(pb) = self.file_bars.get_mut(idx) {
            pb.stop("done");
        }
        self.completed += 1;
        if let Some(total) = &mut self.total {
        if let Some(total) = &mut self.total_bar {
            total.inc(1);
            if self.completed >= self.total_len {
            if self.completed >= self.total_file_count {
                total.stop("all done");
            }
        }
    }

    /// Finish the aggregate bar with a custom message.
    pub fn finish_total(&mut self, message: &str) {
        if !self.enabled {
            return;
        }
        if let Some(total) = &mut self.total {
        if let Some(total) = &mut self.total_bar {
            total.stop(message);
        }
    }
}
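A usage sketch of `FileProgress` as the model downloader drives it above; the labels and loop body are illustrative.

```rust
use polyscribe_core::ui::progress::FileProgress;

// Sketch: one bar per file plus an aggregate bar; bars are skipped
// automatically for single files, quiet mode, or non-TTY stderr.
fn download_all(labels: Vec<String>) {
    let mut pm = FileProgress::default_for_files(labels.len());
    pm.init_files(&labels);

    for (idx, label) in labels.iter().enumerate() {
        pm.set_file_message(idx, "downloading");
        // ... download `label` here ...
        let _ = label;
        pm.mark_file_done(idx);
    }

    pm.finish_total("all files processed");
}
```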

/// A simple reporter for displaying progress messages using cliclack logging.
#[derive(Debug)]
pub struct ProgressReporter {
    non_interactive: bool,
}

impl ProgressReporter {
    /// Creates a new progress reporter.
    pub fn new(non_interactive: bool) -> Self {
        Self { non_interactive }
    }

    /// Displays a progress step message.
    pub fn step(&mut self, message: &str) {
        if self.non_interactive {
            let _ = cliclack::log::info(format!("[..] {message}"));
@@ -128,7 +112,6 @@ impl ProgressReporter {
        }
    }

    /// Displays a completion message.
    pub fn finish_with_message(&mut self, message: &str) {
        if self.non_interactive {
            let _ = cliclack::log::info(format!("[ok] {message}"));

@@ -1,7 +1,10 @@
use anyhow::{Context, Result};
use serde::Deserialize;
use std::process::Stdio;
use std::{env, fs, os::unix::fs::PermissionsExt, path::{Path, PathBuf}};
use std::{
    env, fs,
    os::unix::fs::PermissionsExt,
    path::Path,
};
use tokio::{
    io::{AsyncBufReadExt, BufReader},
    process::{Child as TokioChild, Command},
@@ -20,20 +23,17 @@ impl PluginManager {
    pub fn list(&self) -> Result<Vec<PluginInfo>> {
        let mut plugins = Vec::new();

        // 1) Scan PATH entries for executables starting with "polyscribe-plugin-"
        if let Ok(path) = env::var("PATH") {
            for dir in env::split_paths(&path) {
                scan_dir_for_plugins(&dir, &mut plugins);
            }
        }

        // 2) Scan XDG data dir: $XDG_DATA_HOME/polyscribe/plugins or platform equiv
        if let Some(dirs) = directories::ProjectDirs::from("dev", "polyscribe", "polyscribe") {
            let plugin_dir = dirs.data_dir().join("plugins");
            scan_dir_for_plugins(&plugin_dir, &mut plugins);
        }

        // 3) De-duplicate by binary path
        plugins.sort_by(|a, b| a.path.cmp(&b.path));
        plugins.dedup_by(|a, b| a.path == b.path);
        Ok(plugins)
@@ -93,11 +93,9 @@ fn is_executable(path: &Path) -> bool {
    {
        if let Ok(meta) = fs::metadata(path) {
            let mode = meta.permissions().mode();
            // if any execute bit is set
            return mode & 0o111 != 0;
        }
    }
    // Fallback for non-unix (treat files as candidates)
    true
}

@@ -119,9 +117,3 @@ fn scan_dir_for_plugins(dir: &Path, out: &mut Vec<PluginInfo>) {
    }
}

#[allow(dead_code)]
#[derive(Debug, Deserialize)]
struct Capability {
    command: String,
    summary: String,
}
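A hedged sketch of the naming rule the scanner applies (the `polyscribe-plugin-` prefix mentioned in the comment above); the helper name is illustrative and not part of the manager's API.

```rust
use std::path::Path;

// Sketch: a candidate plugin is any executable whose file name starts with
// the "polyscribe-plugin-" prefix; the remainder is the plugin's short name.
fn plugin_name_from_path(path: &Path) -> Option<String> {
    let file_name = path.file_name()?.to_str()?;
    let rest = file_name.strip_prefix("polyscribe-plugin-")?;
    (!rest.is_empty()).then(|| rest.to_string())
}
```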

@@ -1,5 +1,4 @@
// SPDX-License-Identifier: MIT
// Stub plugin: tubescribe

use anyhow::{Context, Result};
use clap::Parser;
@@ -36,7 +35,6 @@ fn main() -> Result<()> {
        serve_once()?;
        return Ok(());
    }
    // Default: show capabilities (friendly behavior if run without flags)
    let caps = psp::Capabilities {
        name: "tubescribe".to_string(),
        version: env!("CARGO_PKG_VERSION").to_string(),
@@ -49,14 +47,12 @@ fn main() -> Result<()> {
}

fn serve_once() -> Result<()> {
    // Read exactly one line (one request)
    let stdin = std::io::stdin();
    let mut reader = BufReader::new(stdin.lock());
    let mut line = String::new();
    reader.read_line(&mut line).context("failed to read request line")?;
    let req: psp::JsonRpcRequest = serde_json::from_str(line.trim()).context("invalid JSON-RPC request")?;

    // Simulate doing some work with progress
    emit(&psp::StreamItem::progress(5, Some("start".into()), Some("initializing".into())))?;
    std::thread::sleep(std::time::Duration::from_millis(50));
    emit(&psp::StreamItem::progress(25, Some("probe".into()), Some("probing sources".into())))?;
@@ -65,7 +61,6 @@ fn serve_once() -> Result<()> {
    std::thread::sleep(std::time::Duration::from_millis(50));
    emit(&psp::StreamItem::progress(90, Some("finalize".into()), Some("finalizing".into())))?;

    // Handle method and produce result
    let result = match req.method.as_str() {
        "generate_metadata" => {
            let title = "Canned title";
@@ -78,7 +73,6 @@ fn serve_once() -> Result<()> {
            })
        }
        other => {
            // Unknown method
            let err = psp::StreamItem::err(req.id.clone(), -32601, format!("Method not found: {}", other), None);
            emit(&err)?;
            return Ok(());
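To round out the stub: a hedged sketch of the single request line a host might write to the plugin's stdin. Field names follow plain JSON-RPC 2.0 and the `params` payload is invented for illustration; the actual `psp` wire types are defined in the protocol crate.

```rust
use serde_json::json;

// Sketch: one request per line on the plugin's stdin; the plugin answers with
// progress items and a final result on stdout, as serve_once above shows.
fn example_request_line() -> String {
    let request = json!({
        "jsonrpc": "2.0",
        "id": 1,
        "method": "generate_metadata",
        "params": { "input": "episode-01.wav" }
    });
    format!("{request}\n")
}
```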