workflow-miner/docs/research-atuin.md
vikingowl 48923450f8 docs: add architecture plan and research notes
Initial project documentation for workflow-miner — a Rust CLI + zsh
plugin that mines recurring command workflows from Atuin shell history.
2026-02-22 08:41:50 +01:00

Research: Atuin Shell History

What is Atuin

Atuin is a shell history replacement written in Rust. Instead of the flat ~/.zsh_history text file, it stores commands in SQLite with rich metadata. It has roughly 25k GitHub stars, 200k+ users, and 220M+ synced history entries.

  • Shells: zsh, bash, fish, nushell, xonsh (PowerShell has tier-2 support)
  • Architecture: Rust binary + SQLite at ~/.local/share/atuin/history.db
  • Hooks into shell via preexec/precmd (zsh) or equivalent
  • Doesn't replace existing history file — runs alongside it
  • Optional e2e encrypted cross-machine sync (self-hostable or cloud)

SQLite Schema

Database location: ~/.local/share/atuin/history.db (configurable via db_path in ~/.config/atuin/config.toml, respects $XDG_DATA_HOME).
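
The default-location rule above can be sketched as a small pure function. This is a minimal sketch, not Atuin's own code: default_history_db is a hypothetical helper, it ignores any db_path override in config.toml (a real tool should check that first), and it assumes an empty XDG_DATA_HOME counts as unset, as the XDG Base Directory spec says.

```rust
use std::path::PathBuf;

/// Hypothetical helper: resolve the default location of Atuin's history
/// database from the two environment values that matter.
///
/// The values are passed in instead of being read from the environment here,
/// which keeps the logic easy to test. `db_path` from config.toml is NOT
/// consulted by this sketch.
fn default_history_db(xdg_data_home: Option<&str>, home: &str) -> PathBuf {
    let data_dir = match xdg_data_home {
        // A non-empty XDG_DATA_HOME wins.
        Some(dir) if !dir.is_empty() => PathBuf::from(dir),
        // Unset or empty: fall back to ~/.local/share.
        _ => PathBuf::from(home).join(".local").join("share"),
    };
    data_dir.join("atuin").join("history.db")
}
```

A caller would typically pass std::env::var("XDG_DATA_HOME").ok().as_deref() and the value of HOME.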

Database engine: SQLite in WAL journal mode with synchronous set to NORMAL.

Tables

CREATE TABLE IF NOT EXISTS history (
    id        TEXT PRIMARY KEY,
    timestamp INTEGER NOT NULL,    -- Unix nanoseconds (i64)
    duration  INTEGER NOT NULL,    -- Command duration in nanoseconds
    exit      INTEGER NOT NULL,    -- Exit code
    command   TEXT NOT NULL,       -- The shell command string
    cwd       TEXT NOT NULL,       -- Working directory
    session   TEXT NOT NULL,       -- Terminal session ID
    hostname  TEXT NOT NULL,       -- Machine hostname
    deleted_at INTEGER,            -- Soft-delete timestamp (nullable)
    UNIQUE(timestamp, cwd, command)
);

Indexes

CREATE INDEX IF NOT EXISTS idx_history_timestamp ON history(timestamp);
CREATE INDEX IF NOT EXISTS idx_history_command ON history(command);
CREATE INDEX IF NOT EXISTS idx_history_command_timestamp ON history(command, timestamp);

Rust History struct

| Field      | Rust Type                       | SQLite Column                   |
|------------|---------------------------------|---------------------------------|
| id         | String                          | id TEXT                         |
| timestamp  | chrono::DateTime<Utc>           | timestamp INTEGER (nanoseconds) |
| duration   | i64                             | duration INTEGER                |
| exit       | i64                             | exit INTEGER                    |
| command    | String                          | command TEXT                    |
| cwd        | String                          | cwd TEXT                        |
| session    | String                          | session TEXT                    |
| hostname   | String                          | hostname TEXT                   |
| deleted_at | Option<chrono::DateTime<Utc>>   | deleted_at INTEGER (nullable)   |
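
A miner reading the database directly does not need Atuin's own types. The sketch below mirrors the columns with raw integers and converts the nanosecond timestamp using only the standard library. HistoryRow and its methods are hypothetical names for illustration, not part of Atuin.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Hypothetical plain-data mirror of one `history` row, keeping the raw
/// SQLite integers instead of Atuin's `History` struct.
#[derive(Debug, Clone, PartialEq)]
struct HistoryRow {
    id: String,
    timestamp_ns: i64,          // Unix time in nanoseconds
    duration_ns: i64,           // command duration in nanoseconds
    exit: i64,                  // exit code
    command: String,
    cwd: String,
    session: String,
    hostname: String,
    deleted_at_ns: Option<i64>, // None while the row is live
}

impl HistoryRow {
    /// Convert the nanosecond timestamp to a `SystemTime`.
    /// Negative (pre-1970) values are not handled and yield `None`.
    fn started_at(&self) -> Option<SystemTime> {
        if self.timestamp_ns < 0 {
            return None;
        }
        Some(UNIX_EPOCH + Duration::from_nanos(self.timestamp_ns as u64))
    }

    /// Whether the row has been soft-deleted.
    fn is_deleted(&self) -> bool {
        self.deleted_at_ns.is_some()
    }
}
```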

Migration history

Five migrations live in crates/atuin-client/migrations/:

  1. 20210422143411_create_history.sql — initial table + indexes
  2. 20220505083406_create-events.sql — events table (later dropped)
  3. 20220806155627_interactive_search_index.sql — compound index on (command, timestamp)
  4. 20230315220114_drop-events.sql — dropped events table
  5. 20230319185725_deleted_at.sql — added soft-delete column

Key observations for workflow mining

  • Session grouping: The session column groups commands by terminal session — essential for sequential pattern mining.
  • Timestamp ordering: Nanosecond precision allows precise ordering within sessions.
  • CWD tracking: cwd provides project/directory context.
  • No pipeline decomposition: cat foo | grep bar is stored as a single command string.
  • UNIQUE constraint on (timestamp, cwd, command) prevents exact duplicates.
  • Existing indexes on command and (command, timestamp) help frequency analysis.
  • Record store is a separate database (record.db) with encrypted sync data — irrelevant for local mining.
  • WAL mode: Safe for concurrent read access from a separate process.
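
Because pipelines are stored as one string, a miner that wants per-stage commands has to split them itself. The function below is a deliberately naive sketch of that step: split_pipeline is a hypothetical helper, it understands only single quotes, double quotes, and backslash escapes, it keeps || inside a stage, and it does not handle subshells, here-docs, or |&.

```rust
/// Split a command line into pipeline stages on unquoted `|`.
///
/// This is not a real shell parser; see the simplifications listed in the
/// surrounding text. Each returned stage is trimmed of outer whitespace.
fn split_pipeline(command: &str) -> Vec<String> {
    let mut stages = Vec::new();
    let mut current = String::new();
    let mut quote: Option<char> = None; // the quote character we are inside, if any
    let mut chars = command.chars().peekable();

    while let Some(c) = chars.next() {
        match (quote, c) {
            // Outside quotes or inside double quotes, a backslash escapes
            // the next character; copy both through unchanged.
            (None, '\\') | (Some('"'), '\\') => {
                current.push(c);
                if let Some(next) = chars.next() {
                    current.push(next);
                }
            }
            // Closing quote.
            (Some(q), _) if c == q => {
                quote = None;
                current.push(c);
            }
            // Opening quote.
            (None, '\'') | (None, '"') => {
                quote = Some(c);
                current.push(c);
            }
            // `||` is a logical OR, not a pipe: keep both characters.
            (None, '|') if chars.peek() == Some(&'|') => {
                chars.next();
                current.push_str("||");
            }
            // An unquoted single `|` ends the current stage.
            (None, '|') => {
                stages.push(current.trim().to_string());
                current.clear();
            }
            _ => current.push(c),
        }
    }
    stages.push(current.trim().to_string());
    stages
}
```

With this sketch, cat foo | grep bar yields the two stages cat foo and grep bar, while a | inside quotes stays part of its stage.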

A baseline query for mining reads all live rows in session order:

SELECT command, timestamp, cwd, session, duration, exit
FROM history
WHERE deleted_at IS NULL
ORDER BY session, timestamp;
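
As a first step toward sequential pattern mining, rows from a query like the one above can be grouped by session and scanned for pairs of consecutive commands. This is a minimal sketch under stated assumptions: Row and count_bigrams are hypothetical names, rows are already loaded into memory, and a "pattern" is just a bigram of exact command strings with no normalization.

```rust
use std::collections::HashMap;

/// One result row, reduced to the fields this sketch needs.
struct Row<'a> {
    session: &'a str,
    timestamp_ns: i64,
    command: &'a str,
}

/// Count how often each pair of consecutive commands occurs within a
/// session. Pairs never span two sessions.
fn count_bigrams(rows: &[Row]) -> HashMap<(String, String), usize> {
    // Group by session first, then sort each group by timestamp so the
    // result does not depend on the caller's ORDER BY.
    let mut by_session: HashMap<&str, Vec<(i64, &str)>> = HashMap::new();
    for row in rows {
        by_session
            .entry(row.session)
            .or_default()
            .push((row.timestamp_ns, row.command));
    }

    let mut counts: HashMap<(String, String), usize> = HashMap::new();
    for commands in by_session.values_mut() {
        commands.sort_by_key(|(ts, _)| *ts);
        // Slide a window of two over the ordered commands.
        for pair in commands.windows(2) {
            let key = (pair[0].1.to_string(), pair[1].1.to_string());
            *counts.entry(key).or_insert(0) += 1;
        }
    }
    counts
}
```

Sorting the resulting map by count gives the most frequent two-step sequences. Longer workflows would need a larger window or a proper sequence-mining algorithm.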

Atuin Desktop

Atuin Desktop is a separate product (open beta, recently open-sourced under Apache 2.0): an executable runbook editor with CRDT-based collaboration. Think "Notion but the code blocks run." Its autocomplete draws from synced history, but runbooks are authored manually rather than extracted automatically from history.

Source references