docs: refresh agent + user docs
199
AGENTS.md
@@ -1,159 +1,72 @@
|
||||
# StudIP Sync Agent
|
||||
|
||||
## Goal
|
||||
This document equips future agents with the current mental model for the `studip-sync` CLI so new work can focus on real gaps instead of rediscovering context.
|
||||
|
||||
Implement a command-line tool in Rust that performs a **one-way sync** of files from Stud.IP (JSON:API at Uni Trier) to the local filesystem. [web:68]
|
||||
The local directory structure must be: `<semester>/<course>/<studip-folders>/<files>`. [web:88]
|
||||
## Mission & Constraints
|
||||
|
||||
## Environment
|
||||
- Goal: one-way sync of documents from Stud.IP’s JSON:API (Uni Trier) to the local filesystem using Rust (async `tokio`, `reqwest`, `serde`, TOML config/state).
|
||||
- Target platform: Linux (Arch) following the XDG base directory spec.
|
||||
- Binary name must stay `studip-sync`; code must stay `cargo fmt` + `cargo clippy --all-targets --all-features -- -D warnings` + `cargo test` clean.
|
||||
- All configuration/state is TOML. The config file (mode `0600`) stores the Base64-encoded Basic auth string; state caches user/semester/course/file metadata.
|
||||
- Directory layout requirement: `<download_root>/<semester_key>/<course>/<studip-folder-hierarchy>/<files>`. Never upload to Stud.IP; pruning is opt-in via `--prune`.
|
||||
|
||||
- Target OS: Linux (Arch, follow XDG base directory conventions). [web:135]
|
||||
- Language: Rust (2021 edition).
|
||||
- Use `reqwest` + `tokio` for HTTP and async, `serde` for JSON and TOML, and standard Rust CLI patterns. [web:111][web:131]
|
||||
- Build as a single binary named `studip-sync`.
|
||||
> **Rust edition note:** The crate currently targets Rust 2024 even though the original brief called for 2021. Keep this divergence in mind if MSRV compatibility matters.
|
||||
|
||||
## Code Quality: Formatting and Linting
|
||||
## Repository Map
|
||||
|
||||
- Use `rustfmt` as the standard formatter for all Rust code; code must be kept `cargo fmt` clean. [web:144][web:148]
|
||||
- Use `clippy` as the linter; the project must pass `cargo clippy --all-targets --all-features -- -D warnings` with no warnings. [web:144][web:149][web:159]
|
||||
- Add a `rustfmt.toml` and (optionally) a `clippy.toml` where needed, but prefer default settings to stay idiomatic. [web:144][web:151]
|
||||
- If CI is present, include steps that run `cargo fmt --all -- --check`, `cargo clippy --all-targets --all-features -- -D warnings`, and `cargo test`. [web:147][web:150][web:159]
|
||||
| Path | Purpose |
|
||||
| --- | --- |
|
||||
| `src/main.rs` | Minimal entry point that parses CLI args and drives async runtime. |
|
||||
| `src/cli.rs` | All subcommand implementations plus sync logic, prompt helpers, pruning, naming utilities, and state updates. |
|
||||
| `src/config.rs` | Multi-profile TOML config loader/saver; enforces 0600 perms on write. |
|
||||
| `src/state.rs` | TOML cache schema for user/semesters/courses/files plus helpers to read/write/mutate per profile. |
|
||||
| `src/paths.rs` | Resolves XDG-compliant config/data dirs with optional overrides. |
|
||||
| `src/studip_client.rs` | Thin JSON:API client (Basic auth header, pagination helper, download streaming). |
|
||||
| `src/semesters.rs` | Converts human semester titles (“WiSe 2024/25”) into stable keys (`ws2425`). |
|
||||
| `src/logging.rs` | Tracing subscriber setup with quiet/debug/json/verbosity knobs. |
|
||||
| `docs/studip/` | Offline copy of Stud.IP JSON:API docs for reference (no code). |
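
The `src/semesters.rs` row above describes turning human semester titles into stable keys. A minimal sketch of that idea follows, assuming titles start with "WiSe"/"Wintersemester" or "SoSe"/"Sommersemester" and contain the year digits. The function name `semester_key_sketch` is hypothetical; the real `semesters::infer_key` may use a different signature and handle more cases.

```rust
/// Derive a short key such as `ws2425` or `ss25` from a semester title.
/// Hypothetical helper for illustration, not the project's actual code.
fn semester_key_sketch(title: &str) -> Option<String> {
    // Keep only the last two digits of a year string ("2024" -> "24").
    fn last_two(s: &str) -> &str {
        &s[s.len().saturating_sub(2)..]
    }

    let lower = title.to_lowercase();
    // Assumption: winter titles start with "wi", summer titles with "so".
    let is_winter = if lower.starts_with("wi") {
        true
    } else if lower.starts_with("so") {
        false
    } else {
        return None;
    };

    // Collect runs of ASCII digits: "WiSe 2024/25" -> ["2024", "25"].
    let numbers: Vec<&str> = title
        .split(|c: char| !c.is_ascii_digit())
        .filter(|s| !s.is_empty())
        .collect();

    match (is_winter, numbers.as_slice()) {
        (true, [start, end]) => Some(format!("ws{}{}", last_two(start), last_two(end))),
        (false, [year]) => Some(format!("ss{}", last_two(year))),
        _ => None,
    }
}
```

Returning `Option` lets callers decide how to handle titles that do not match either pattern, for example by falling back to the semester's start year.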
|
||||
|
||||
## API
|
||||
## Runtime Flow
|
||||
|
||||
- Base URL: configurable, default `https://studip.uni-trier.de`. [web:68]
|
||||
- JSON:API root: `<base_url>/jsonapi.php/v1`. [web:68]
|
||||
- Authentication: HTTP Basic (username/password), encoded once as base64 and stored in TOML config. [web:1][web:118]
|
||||
- Use JSON:API routes such as:
|
||||
- `GET /users/me` to resolve the current user and related links (courses, folders, file-refs). [web:68][web:106]
|
||||
- `GET /users/{user_id}/courses` to list enrolled courses. [web:85]
|
||||
- Course-specific routes for folders and documents/file-refs, using the documented JSON:API routes for Stud.IP (e.g. `/courses/{course_id}/documents`). [web:88][web:93]
|
||||
1. `studip-sync auth` collects credentials (CLI flags, env, or interactive prompts), Base64-encodes `username:password`, and persists it in the active profile.
|
||||
2. `studip-sync list-courses` builds a `StudipClient`, resolves/caches the user ID via `/users/me`, paginates `/users/{id}/courses`, fetches missing semesters, upserts course metadata into `state.toml`, and prints a table sorted by semester/title.
|
||||
3. `studip-sync sync`:
|
||||
- Resolves download root (`config.download_root` or `$XDG_DATA_HOME/studip-sync/downloads`) and ensures directories exist unless `--dry-run`.
|
||||
- Refreshes course + semester info, then for each course performs a depth-first walk: `/courses/{id}/folders` ➜ `/folders/{id}/file-refs` ➜ `/folders/{id}/folders`. Pagination is handled by `fetch_all_pages`.
|
||||
- Normalizes path components and uses `NameRegistry` to avoid collisions, guaranteeing human-readable yet unique names.
|
||||
- Checks file state (size, modified timestamp, checksum) against `state.toml` to skip unchanged files; downloads stream to `*.part` before rename.
|
||||
- Records remote metadata and local path hints in state. `--dry-run` reports actions without touching disk; `--prune` (only when `--dry-run` is not set) deletes stray files and directories, walking the tree with `walkdir`.
|
||||
4. HTTP errors propagate via `anyhow`, but 401/403 currently surface as generic failures—production UX should point users to `studip-sync auth`.
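
The depth-first walk in step 3 can be illustrated on an in-memory tree. This is only a sketch: `FolderNode` and `collect_file_paths` are hypothetical names, and the real tool fetches each level from `/folders/{id}/file-refs` and `/folders/{id}/folders` instead of reading a prebuilt structure.

```rust
use std::path::{Path, PathBuf};

/// In-memory stand-in for a Stud.IP folder (hypothetical shape).
struct FolderNode {
    name: String,
    files: Vec<String>,
    subfolders: Vec<FolderNode>,
}

/// Depth-first walk: record the files of a folder first, then recurse
/// into each subfolder, mirroring the folder hierarchy in the path.
fn collect_file_paths(folder: &FolderNode, parent: &Path, out: &mut Vec<PathBuf>) {
    let here = parent.join(&folder.name);
    for file in &folder.files {
        out.push(here.join(file));
    }
    for child in &folder.subfolders {
        collect_file_paths(child, &here, out);
    }
}
```

Calling it with `parent` set to `<download_root>/<semester_key>/<course>` yields one local target path per file.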
|
||||
|
||||
### Field notes (2025-02-16)
|
||||
## Configuration & State
|
||||
|
||||
- `/users/me` returns the canonical user ID (`cbcee42edfea…`), full profile attributes, and relationship URLs (courses, folders, file-refs, etc.). Cache the `id` immediately so later runs can skip this discovery call unless credentials change.
|
||||
- `/users/{id}/courses` is paginated via `meta.page { offset, limit, total }` and `links.first/last` (e.g. `/jsonapi.php/v1/users/.../courses?page[offset]=0&page[limit]=30`). Default limit is 30; loop by bumping `offset` until `offset >= total`. Each course provides `start-semester`/`end-semester` relationships to semester IDs, course numbers, and titles.
|
||||
- `/semesters/{id}` exposes only human strings like `"WiSe 2024/25"` plus ISO start/end timestamps—no canonical short keys. Derive keys such as `ws2425` from the title or `start` year and cache the mapping `semester_id → key` in `state.toml`.
|
||||
- `/courses/{id}/folders` lists folder nodes with attributes (`folder-type`, `is-empty`, mkdate/chdate) and nested relationships: follow `/folders/{folder_id}/folders` recursively for subfolders, because `meta.count` only reports a child count.
|
||||
- `/folders/{id}/file-refs` is the primary listing for downloadable files. Each `file-ref` has attributes (`name`, `filesize`, `mkdate`, `chdate`, MIME, `is-downloadable`), relationships back to the parent folder/course, and a `meta.download-url` like `/sendfile.php?...`. Prepend the configured base URL before downloading.
|
||||
- `/files/{id}` only repeats size/timestamp data and links back to `file-refs`; it does **not** expose checksums. Track change detection via `(file-ref id, filesize, chdate)` and/or compute local hashes.
|
||||
- File/folder listings share the same JSON:API pagination scheme. Always honor the `meta.page` counts and `links.first/last/next` to avoid missing entries in large folders.
|
||||
- Config path: `${XDG_CONFIG_HOME:-~/.config}/studip-sync/config.toml`. Example keys: `base_url`, `jsonapi_path`, `basic_auth_b64`, `download_root`, `max_concurrent_downloads`.
|
||||
- State path: `${XDG_DATA_HOME:-~/.local/share}/studip-sync/state.toml`.
|
||||
- `profiles.<name>.user_id` caches `/users/me`.
|
||||
- `profiles.<name>.semesters.<key>` stores semester IDs/titles/keys.
|
||||
- `profiles.<name>.courses.<id>` keeps display names + `last_sync`.
|
||||
- `profiles.<name>.files.<file_ref_id>` remembers size, checksum, timestamps, and the last local path to avoid redundant downloads.
|
||||
- Multiple profiles are supported; `--profile` switches, otherwise the config’s `default_profile` is used.
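
The offset/limit pagination described in the field notes above can be sketched independently of HTTP. In this sketch the `fetch` closure stands in for one request with `page[offset]` and `page[limit]`, and `Page` and `collect_all_pages` are hypothetical names; the project's own helper is `fetch_all_pages`, whose signature is not shown here.

```rust
/// One page of a JSON:API listing plus the reported `meta.page.total`.
/// Hypothetical type for illustration.
struct Page<T> {
    items: Vec<T>,
    total: usize,
}

/// Collect every item by bumping `offset` by `limit` until it reaches `total`.
fn collect_all_pages<T, E>(
    limit: usize,
    mut fetch: impl FnMut(usize, usize) -> Result<Page<T>, E>,
) -> Result<Vec<T>, E> {
    let mut all = Vec::new();
    let mut offset = 0;
    loop {
        let page = fetch(offset, limit)?;
        let received = page.items.len();
        all.extend(page.items);
        offset += limit;
        // Stop once the offset passes the total; the empty-page check
        // guards against looping forever if the server misreports `total`.
        if offset >= page.total || received == 0 {
            break;
        }
    }
    Ok(all)
}
```

With the default limit of 30 and 65 courses, this issues three requests at offsets 0, 30, and 60.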
|
||||
|
||||
## Configuration (TOML, including paths)
|
||||
## Development Workflow
|
||||
|
||||
All configuration and state in this project must use **TOML**. [web:131]
|
||||
1. Install a recent Rust toolchain (`rustup toolchain install stable` if needed).
|
||||
2. Lint/test loop:
|
||||
```bash
cargo fmt --all -- --check
cargo clippy --all-targets --all-features -- -D warnings
cargo test
```
|
||||
3. Use `cargo run -- <subcommand>` for manual verification (e.g., `cargo run -- auth`, `cargo run -- list-courses --refresh`, `cargo run -- sync --dry-run`).
|
||||
4. Keep dependencies minimal; avoid logging sensitive strings (`basic_auth_b64`, plaintext passwords).
|
||||
|
||||
- Primary config file: XDG-compliant, e.g. `~/.config/studip-sync/config.toml`. [web:131][web:135]
|
||||
- Example `config.toml` keys:
|
||||
```
base_url = "https://studip.uni-trier.de"
jsonapi_path = "/jsonapi.php/v1"

# Authorization header value without the "Basic " prefix, base64("username:password").
basic_auth_b64 = "..."

# Local base directory for synced files.
download_root = "/home/<user>/StudIP"

# Maximum concurrent HTTP downloads.
max_concurrent_downloads = 3
```

## Known Gaps / Backlog

- `ConfigProfile::max_concurrent_downloads` is defined but unused; downloads happen sequentially. Introduce a bounded task queue if concurrency is needed.
- `SyncArgs::since` exists but is not wired into any API calls; ideal future work would leverage Stud.IP filters or local timestamps.
- No automated tests (unit/integration) are present; critical helpers like `semesters::infer_key`, `normalize_component`, and state transitions should gain coverage.
- Error UX for auth failures could be clearer (detect 401/403 and prompt users to re-run `auth`).
- There is no CI config; if one is added, ensure it runs fmt/clippy/test.
- Verify long-term compatibility with Rust 2024 or document the minimum supported version explicitly.
|
||||
|
||||
- The `download_root` directory determines where the tool creates `semester/course/folders/files`. [web:68]
|
||||
- The config file must be created with mode `0600` and never contain anything except necessary settings and the base64-encoded credential. [web:118][web:122]
|
||||
|
||||
### Credentials and auth
|
||||
|
||||
- On first run (or when running `studip-sync auth`), prompt interactively for username and password. [web:118]
|
||||
- Construct `username:password`, base64-encode it, and store the result as `basic_auth_b64` in `config.toml`. [web:1][web:118]
|
||||
- At runtime, send `Authorization: Basic <basic_auth_b64>` on all JSON:API requests. [web:1][web:68]
|
||||
- Never log or print the password, `basic_auth_b64`, or full `Authorization` header. [web:118][web:128]
|
||||
- On HTTP `401` or `403` from a known-good endpoint like `/users/me`, treat this as auth failure:
|
||||
- Non-interactive runs: exit with a non-zero code and a clear message asking the user to run `studip-sync auth`. [web:118]
|
||||
- Interactive runs: optionally prompt again and update `basic_auth_b64`.
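
The credential encoding described in this list can be sketched as follows. To stay dependency-free the sketch hand-rolls standard Base64 (RFC 4648, with padding); that is an assumption made for illustration, and a real implementation would more likely use an established crate. The function names are hypothetical.

```rust
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Standard Base64 with `=` padding, written out to avoid external crates.
fn base64_encode(input: &[u8]) -> String {
    let mut out = String::with_capacity((input.len() + 2) / 3 * 4);
    for chunk in input.chunks(3) {
        // Pack up to three bytes into a 24-bit group, zero-filling the rest.
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;
        out.push(ALPHABET[(n >> 18) as usize & 63] as char);
        out.push(ALPHABET[(n >> 12) as usize & 63] as char);
        // Missing input bytes are represented by '=' padding.
        out.push(if chunk.len() > 1 { ALPHABET[(n >> 6) as usize & 63] as char } else { '=' });
        out.push(if chunk.len() > 2 { ALPHABET[n as usize & 63] as char } else { '=' });
    }
    out
}

/// Value stored as `basic_auth_b64`: base64("username:password").
fn basic_auth_b64(username: &str, password: &str) -> String {
    base64_encode(format!("{username}:{password}").as_bytes())
}

/// Header value sent with each JSON:API request. Never log this string.
fn authorization_header(basic_auth_b64: &str) -> String {
    format!("Basic {basic_auth_b64}")
}
```

Keeping the `Basic ` prefix out of the stored value matches the config comment, so the header is assembled only at request time.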
|
||||
|
||||
## State (TOML as well)
|
||||
|
||||
- State file must also be TOML, stored under XDG data dir, e.g. `~/.local/share/studip-sync/state.toml`. [web:131][web:135]
|
||||
- State is non-secret cached data:
|
||||
|
||||
```

user_id = "cbcee42edfea9232fecc3e414ef79d06"

[semesters."ws2526"]
id = "830eb86ad41d8f695d016647d557218a"
title = "Wintersemester 2025/26"

[semesters."ss25"]
id = "..."
title = "Sommersemester 2025"

[courses."830eb86a-...-course-id"]
name = "Rechnerstrukturen - Übung"
semester_key = "ws2526"
last_sync = "2025-11-14T12:34:56Z"

```
|
||||
|
||||
- The tool should:
|
||||
- Cache `user_id` after the first successful `/users/me` call. [web:68][web:106]
|
||||
- Cache semester IDs and human-readable keys (`ws2526`, `ss25`) after discovering them via JSON:API. [web:68]
|
||||
- Optionally store course and last-sync metadata to reduce API calls (e.g. using `filter[since]` if supported). [web:88][web:93]
|
||||
|
||||
## Directory structure
|
||||
|
||||
- All downloads must go under `download_root`, respecting:
|
||||
`download_root/<semester_key>/<course_name>/<studip_folder_path>/<file>`.
|
||||
- `semester_key` is resolved from the state file (`ws2526`, `ss25`, etc.). [web:68]
|
||||
- `course_name` and Stud.IP folder/file names should be normalized to safe filesystem paths (handle spaces, umlauts, and special characters) while staying human-readable. [web:68][web:104]
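
One possible way to build the layout above is sketched below. It assumes that umlauts and spaces can stay as they are on a UTF-8 Linux filesystem and that only path separators, control characters, and the special names `.` and `..` need replacing. Both function names are hypothetical, and the project's own `normalize_component` may apply different rules.

```rust
use std::path::{Path, PathBuf};

/// Make a single name safe to use as one path component.
/// Assumption: separators and control characters become '_'.
fn normalize_component_sketch(name: &str) -> String {
    let cleaned: String = name
        .trim()
        .chars()
        .map(|c| if c == '/' || c == '\\' || c.is_control() { '_' } else { c })
        .collect();
    // Empty names and the special entries "." and ".." are not usable as-is.
    if cleaned.is_empty() || cleaned == "." || cleaned == ".." {
        "_".to_string()
    } else {
        cleaned
    }
}

/// Assemble `download_root/<semester_key>/<course_name>/<folders...>/<file>`.
fn local_path(
    download_root: &Path,
    semester_key: &str,
    course_name: &str,
    folder_path: &[&str],
    file_name: &str,
) -> PathBuf {
    let mut path = download_root.join(semester_key);
    path.push(normalize_component_sketch(course_name));
    for folder in folder_path {
        path.push(normalize_component_sketch(folder));
    }
    path.push(normalize_component_sketch(file_name));
    path
}
```

Because every component is normalized before it is pushed, a remote name containing `/` or `..` cannot escape `download_root`.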
|
||||
|
||||
## Sync semantics
|
||||
|
||||
- One-way sync: Stud.IP → local filesystem only; never upload or modify data on Stud.IP. [web:68]
|
||||
- Default behavior:
|
||||
- Create directories and download new or changed files under `download_root`.
|
||||
- Never delete local files by default.
|
||||
- Provide optional flags:
|
||||
- `--prune`: delete local files that no longer exist on Stud.IP.
|
||||
- `--dry-run`: print planned actions (creates/downloads/deletes) without modifying the filesystem.
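
The interaction between the default behavior and `--prune` can be sketched as a pure planning step over two sets of relative paths. This is a simplified illustration with hypothetical names: it only distinguishes missing and stray files and ignores changed files. In a `--dry-run` the caller would print the returned plan instead of executing it.

```rust
use std::collections::BTreeSet;

/// A planned filesystem action (hypothetical type).
#[derive(Debug, PartialEq)]
enum Action {
    Download(String),
    Delete(String),
}

/// Plan downloads for files that exist remotely but not locally.
/// Deletions of local-only files are planned only when `prune` is set.
fn plan_actions(remote: &BTreeSet<String>, local: &BTreeSet<String>, prune: bool) -> Vec<Action> {
    let mut actions: Vec<Action> = remote
        .difference(local)
        .cloned()
        .map(Action::Download)
        .collect();
    if prune {
        actions.extend(local.difference(remote).cloned().map(Action::Delete));
    }
    actions
}
```

Separating planning from execution keeps the "never delete by default" rule in one place and makes it straightforward to test.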
|
||||
|
||||
## Minimizing API usage and load
|
||||
|
||||
- Use cached `user_id` and semester mappings from `state.toml` to avoid repeated discovery calls. [web:68]
|
||||
- When listing course documents, use JSON:API pagination and any available filters (e.g. `filter[since]`) supported by Stud.IP’s document routes. [web:88][web:93]
|
||||
- Avoid re-downloading unchanged files by checking JSON:API attributes such as ID, size, and modification time against the stored state. [web:93][web:106]
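
The skip-unchanged check could look roughly like this. The sketch assumes the cached state is keyed by file-ref ID and compares size and change timestamp, as the bullet above suggests. `FileFingerprint` and `needs_download` are hypothetical names, and the timestamp is kept as an opaque string.

```rust
use std::collections::HashMap;

/// Attributes used to decide whether a file changed (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
struct FileFingerprint {
    size: u64,
    chdate: String,
}

/// A download is needed when the file-ref is unknown, the local file is
/// missing, or the remote size/timestamp differ from the cached values.
fn needs_download(
    state: &HashMap<String, FileFingerprint>,
    file_ref_id: &str,
    remote: &FileFingerprint,
    local_exists: bool,
) -> bool {
    match state.get(file_ref_id) {
        Some(cached) if local_exists => cached != remote,
        _ => true,
    }
}
```

Checking `local_exists` as well means a file deleted by hand is fetched again even though the cached metadata still matches.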
|
||||
|
||||
## CLI interface
|
||||
|
||||
- Binary name: `studip-sync`.
|
||||
- Subcommands:
|
||||
- `studip-sync auth`: set or update credentials; writes `config.toml`.
|
||||
- `studip-sync sync`: perform sync from Stud.IP to `download_root`.
|
||||
- `studip-sync list-courses`: list known courses with semester keys and IDs from state (refreshing if needed).
|
||||
- Use standard exit codes:
|
||||
- `0` on success.
|
||||
- Non-zero on errors (auth failure, network error, JSON parse error, filesystem failure). [web:118]
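
A small sketch of how the error classes listed above might map to messages and exit codes. The `SyncError` enum and the use of a single non-zero code (`1`) are assumptions for illustration; the document only requires `0` on success and non-zero on failure.

```rust
/// Error classes named in the exit-code list (hypothetical enum).
#[derive(Debug, PartialEq)]
enum SyncError {
    Auth(u16),
    Network(String),
    Parse(String),
    Filesystem(String),
}

/// Treat 401/403 as an auth failure and any other non-2xx status as a
/// network-level error.
fn check_status(status: u16) -> Result<(), SyncError> {
    match status {
        200..=299 => Ok(()),
        401 | 403 => Err(SyncError::Auth(status)),
        other => Err(SyncError::Network(format!("unexpected HTTP status {other}"))),
    }
}

/// Human-readable message; auth failures point the user at `studip-sync auth`.
fn user_message(err: &SyncError) -> String {
    match err {
        SyncError::Auth(status) => format!(
            "authentication failed (HTTP {status}); run `studip-sync auth` to update credentials"
        ),
        SyncError::Network(msg) => format!("network error: {msg}"),
        SyncError::Parse(msg) => format!("could not parse response: {msg}"),
        SyncError::Filesystem(msg) => format!("filesystem error: {msg}"),
    }
}

/// 0 on success, 1 on any error (a single non-zero code is an assumption).
fn exit_code(result: &Result<(), SyncError>) -> u8 {
    if result.is_ok() { 0 } else { 1 }
}
```

In `main`, the returned value could be passed to `std::process::ExitCode::from` after printing `user_message` to stderr.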
|
||||
|
||||
## Performance & safety
|
||||
|
||||
- Limit concurrent HTTP requests (configurable via `max_concurrent_downloads`, default 3). [web:68]
|
||||
- Stream file downloads directly to disk; do not load entire files into memory. [web:88]
|
||||
- Handle HTTP and I/O errors gracefully with clear messages and without panicking.
|
||||
- Keep dependencies minimal and use idiomatic Rust project structuring for maintainability. [web:136][web:137]
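
Streaming to disk without buffering the whole file can be sketched with the standard library alone. This is a synchronous illustration under stated assumptions: the project itself is async (`tokio`, `reqwest`), `write_via_part_file` is a hypothetical name, and any `Read` source stands in for the HTTP response body. Data goes to a `.part` file first and is renamed only after the copy succeeds, so an interrupted download never leaves a truncated file at the final path.

```rust
use std::fs::{self, File};
use std::io::{self, Read};
use std::path::{Path, PathBuf};

/// Copy `body` to `<dest>.part` in chunks, then rename it to `dest`.
/// Returns the number of bytes written.
fn write_via_part_file(mut body: impl Read, dest: &Path) -> io::Result<u64> {
    // Build "<dest>.part" by appending to the full file name.
    let mut part_name = dest.as_os_str().to_owned();
    part_name.push(".part");
    let part = PathBuf::from(part_name);

    if let Some(parent) = dest.parent() {
        fs::create_dir_all(parent)?;
    }

    let mut file = File::create(&part)?;
    // `io::copy` streams through a fixed-size buffer instead of loading
    // the entire body into memory.
    let written = io::copy(&mut body, &mut file)?;
    file.sync_all()?;

    // Rename only after the full body has been written.
    fs::rename(&part, dest)?;
    Ok(written)
}
```

On the same filesystem the rename replaces the destination in one step, which is why the `.part` file lives next to the final file instead of in a temp directory.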
|
||||
|
||||
## Extensibility
|
||||
|
||||
- Internally, separate concerns into modules:
|
||||
- `config` (TOML load/save for config and state).
|
||||
- `studip_client` (JSON:API HTTP client).
|
||||
- `sync` (sync logic and directory mapping).
|
||||
- `cli` (argument parsing, subcommands). [web:136][web:137]
|
||||
- Represent core entities as Rust types: `Semester`, `Course`, `Folder`, `FileRef`. [web:68][web:93]
|
||||
- Design so that a future `MoodleProvider` can implement the same internal traits (e.g. `LmsProvider`) without changing the CLI surface.
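
A rough sketch of what such a provider abstraction might look like. Everything here is an assumption for illustration: the trait methods, the field names, and the synchronous signatures (the real client is async). `Semester` is omitted for brevity, only top-level folders are visited, and `FakeProvider` exists only to show that calling code depends on the trait alone.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Course { id: String, name: String, semester_key: String }

#[derive(Debug, Clone, PartialEq)]
struct Folder { id: String, name: String }

#[derive(Debug, Clone, PartialEq)]
struct FileRef { id: String, name: String, size: u64 }

/// Hypothetical provider trait that a Stud.IP or Moodle backend could implement.
trait LmsProvider {
    type Error;
    fn courses(&self) -> Result<Vec<Course>, Self::Error>;
    fn folders(&self, course_id: &str) -> Result<Vec<Folder>, Self::Error>;
    fn file_refs(&self, folder_id: &str) -> Result<Vec<FileRef>, Self::Error>;
}

/// Provider-agnostic logic: total size of all files in top-level folders.
fn total_bytes<P: LmsProvider>(provider: &P) -> Result<u64, P::Error> {
    let mut total = 0;
    for course in provider.courses()? {
        for folder in provider.folders(&course.id)? {
            for file in provider.file_refs(&folder.id)? {
                total += file.size;
            }
        }
    }
    Ok(total)
}

/// Fixed in-memory data, used only to demonstrate the trait.
struct FakeProvider;

impl LmsProvider for FakeProvider {
    type Error = String;

    fn courses(&self) -> Result<Vec<Course>, String> {
        Ok(vec![Course {
            id: "c1".into(),
            name: "Rechnerstrukturen".into(),
            semester_key: "ws2526".into(),
        }])
    }

    fn folders(&self, _course_id: &str) -> Result<Vec<Folder>, String> {
        Ok(vec![Folder { id: "f1".into(), name: "Uebung".into() }])
    }

    fn file_refs(&self, _folder_id: &str) -> Result<Vec<FileRef>, String> {
        Ok(vec![
            FileRef { id: "r1".into(), name: "blatt1.pdf".into(), size: 1200 },
            FileRef { id: "r2".into(), name: "blatt2.pdf".into(), size: 800 },
        ])
    }
}
```

Because `total_bytes` is generic over `LmsProvider`, swapping in another backend would not require changes to the CLI layer.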
|
||||
Keep this guide updated whenever major flow or architecture changes land so that future agents can jump straight into implementation work.
|
||||
|
||||