# studip-sync
`studip-sync` is a Rust CLI that performs a one-way sync of Stud.IP course materials (via the Uni Trier JSON:API) into a local directory tree. The tool persists config/state in TOML, talks to the API with HTTP Basic auth, and keeps the local filesystem organized as `<download_root>/<semester>/<course>/<folder>/<files>`.
## Key Features
- `init-config` writes a ready-to-edit config template (respecting `--download-root` and `--force` to overwrite).
- `auth` subcommand stores Base64-encoded credentials per profile (passwords are never logged).
- `list-courses` fetches `/users/me`, paginates enrolled courses, infers semester keys, caches the metadata, and prints a concise table.
- `sync` traverses every course folder/file tree, normalizes names (Unicode NFKD + transliteration so `Ökologie/ß/œ` becomes `Oekologie/ss/oe`), streams downloads to disk, tracks checksums/remote timestamps, and supports `--dry-run`, `--prune`, and `--since <semester|date>` filters (e.g., `--since ws2526` or `--since 01032024`).
- XDG-compliant config (`~/.config/studip-sync/config.toml`) and state (`~/.local/share/studip-sync/state.toml`) files, both stored as TOML.
- Extensive logging controls: `--quiet`, `--verbose/-v`, `--debug`, and `--json`.
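
The name normalization mentioned for `sync` can be illustrated with a small, standard-library-only sketch. It is not the tool's actual code: `sanitize_segment` is a hypothetical helper, the set of replaced characters is an assumption, and full Unicode NFKD normalization is left out because it needs an external crate. The sketch only transliterates precomposed umlauts and ligatures and replaces characters that are awkward in file names.

```rust
use std::path::PathBuf;

/// Hypothetical helper: transliterate a few precomposed characters and
/// replace characters that are unsafe in file names (assumed set).
fn sanitize_segment(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for ch in input.chars() {
        match ch {
            'ä' => out.push_str("ae"),
            'ö' => out.push_str("oe"),
            'ü' => out.push_str("ue"),
            'Ä' => out.push_str("Ae"),
            'Ö' => out.push_str("Oe"),
            'Ü' => out.push_str("Ue"),
            'ß' => out.push_str("ss"),
            'œ' => out.push_str("oe"),
            'Œ' => out.push_str("Oe"),
            // Path separators and other problematic characters become '_'.
            '/' | '\\' | ':' | '*' | '?' | '"' | '<' | '>' | '|' => out.push('_'),
            // Control characters are dropped entirely.
            c if c.is_control() => {}
            c => out.push(c),
        }
    }
    let trimmed = out.trim();
    if trimmed.is_empty() {
        // Never return an empty path segment.
        String::from("_")
    } else {
        trimmed.to_string()
    }
}

fn main() {
    // Build <download_root>/<semester>/<course>/<folder>/<file>.
    let path = PathBuf::from("StudIP")
        .join("ws2526")
        .join(sanitize_segment("Ökologie"))
        .join(sanitize_segment("Übungen"))
        .join(sanitize_segment("Blatt 1/2.pdf"));
    println!("{}", path.display());
}
```

A real implementation would normalize first (so that decomposed forms such as `O` plus a combining diaeresis are handled too) and would also need to keep sibling names unique.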
## Directory Layout & Data Files
- Config lives under `${XDG_CONFIG_HOME:-~/.config}/studip-sync/config.toml`. Override this with `--config-dir` if you want the config somewhere else.
- State is cached in `${XDG_DATA_HOME:-~/.local/share}/studip-sync/state.toml`. `--data-dir` changes only this location (plus anything else the tool stores under the data directory, such as the default downloads folder). Use it when you want the state cache on a different disk while keeping the config where it is.
- `download_root` determines where files land. If omitted, it falls back to `<data-dir>/downloads`, so moving the data dir automatically relocates the default downloads; setting `download_root` explicitly decouples it from the data dir. Each path segment is sanitized to keep names human-readable yet filesystem-safe.
- Semester entries cached in `state.toml` include start/end timestamps so CLI filters such as `--since ws2526` know when a term begins. `list-courses --refresh` also re-fetches any cached semester still missing those timestamps.
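
The path rules above can be sketched as a pure function. This is an illustration under stated assumptions, not the tool's implementation: `resolve_paths`, `xdg_base`, and `Paths` are hypothetical names, and the sketch assumes that `--config-dir` and `--data-dir` replace the whole `studip-sync` directory rather than only the XDG base.

```rust
use std::env;
use std::path::PathBuf;

/// Mirrors the shell idiom `${VAR:-$HOME/<fallback>}`.
fn xdg_base(env_value: Option<&str>, home: &str, fallback: &str) -> PathBuf {
    match env_value {
        Some(v) if !v.is_empty() => PathBuf::from(v),
        _ => PathBuf::from(home).join(fallback),
    }
}

#[derive(Debug)]
struct Paths {
    config_file: PathBuf,
    state_file: PathBuf,
    download_root: PathBuf,
}

/// Hypothetical resolver; flag semantics are an assumption (see lead-in).
fn resolve_paths(
    xdg_config_home: Option<&str>,
    xdg_data_home: Option<&str>,
    home: &str,
    config_dir_flag: Option<&str>,
    data_dir_flag: Option<&str>,
    download_root: Option<&str>,
) -> Paths {
    let config_dir = match config_dir_flag {
        Some(dir) => PathBuf::from(dir),
        None => xdg_base(xdg_config_home, home, ".config").join("studip-sync"),
    };
    let data_dir = match data_dir_flag {
        Some(dir) => PathBuf::from(dir),
        None => xdg_base(xdg_data_home, home, ".local/share").join("studip-sync"),
    };
    // An explicit download_root wins; otherwise it follows the data dir.
    let download_root = match download_root {
        Some(root) => PathBuf::from(root),
        None => data_dir.join("downloads"),
    };
    Paths {
        config_file: config_dir.join("config.toml"),
        state_file: data_dir.join("state.toml"),
        download_root,
    }
}

fn main() {
    let home = env::var("HOME").unwrap_or_else(|_| String::from("."));
    let xdg_config = env::var("XDG_CONFIG_HOME").ok();
    let xdg_data = env::var("XDG_DATA_HOME").ok();
    let paths = resolve_paths(
        xdg_config.as_deref(),
        xdg_data.as_deref(),
        &home,
        None,
        None,
        None,
    );
    println!("{:#?}", paths);
}
```

Passing the environment values as arguments keeps the resolver easy to test without mutating the process environment.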
## Getting Started
1. **Prerequisites**: Install a recent Rust toolchain (Rust 1.75+ recommended; note that a crate built on the 2024 edition needs Rust 1.85 or newer) and ensure you can reach `https://studip.uni-trier.de`.
2. **Build & validate**: From the repo root, run:
```bash
cargo fmt --all -- --check
cargo clippy --all-targets --all-features -- -D warnings
cargo test
```
3. **First run**:
```bash
# Optionally scaffold a config template (fails if one already exists unless --force is passed)
cargo run -- init-config --download-root "$HOME/StudIP"
# Store credentials (prompts for username/password by default)
cargo run -- auth
# Inspect courses and cache semester data
cargo run -- list-courses --refresh
# Perform a dry-run sync to see planned actions
cargo run -- sync --dry-run
# Run the real sync (omit --dry-run); add --prune to delete stray files
cargo run -- sync --prune
```
Use `--profile`, `--config-dir`, or `--data-dir` when working with multiple identities or non-standard paths.
## Configuration Reference
Example `config.toml`:
```toml
default_profile = "default"
[profiles.default]
base_url = "https://studip.uni-trier.de"
jsonapi_path = "/jsonapi.php/v1"
basic_auth_b64 = "base64(username:password)"
download_root = "/home/alex/StudIP"
max_concurrent_downloads = 3 # placeholder for future concurrency control
```
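- The file is written with `0600` permissions. Never commit credentials: `auth` manages them interactively, through `--username`/`--password`, or through the `STUDIP_SYNC_USERNAME`/`STUDIP_SYNC_PASSWORD` environment variables. Base64 is an encoding, not encryption, so anyone who can read the file can recover the password.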
- Multiple profiles can be added under `[profiles.<name>]`; pass `--profile <name>` when invoking the CLI to switch.
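
To make the `basic_auth_b64` value concrete, here is a minimal sketch of how `username:password` maps to the stored string and to an HTTP Basic `Authorization` header. The function names are hypothetical, the encoder is hand-written only to keep the example dependency-free (a real project would more likely use a Base64 crate), and it is an assumption that the tool builds the header by prefixing the stored value with `Basic `.

```rust
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Standard Base64 with '=' padding, processing three input bytes at a time.
fn base64_encode(input: &[u8]) -> String {
    let mut out = String::with_capacity((input.len() + 2) / 3 * 4);
    for chunk in input.chunks(3) {
        let b0 = chunk[0] as u32;
        let b1 = chunk.get(1).copied().unwrap_or(0) as u32;
        let b2 = chunk.get(2).copied().unwrap_or(0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;
        out.push(ALPHABET[((n >> 18) & 63) as usize] as char);
        out.push(ALPHABET[((n >> 12) & 63) as usize] as char);
        if chunk.len() > 1 {
            out.push(ALPHABET[((n >> 6) & 63) as usize] as char);
        } else {
            out.push('=');
        }
        if chunk.len() > 2 {
            out.push(ALPHABET[(n & 63) as usize] as char);
        } else {
            out.push('=');
        }
    }
    out
}

/// Hypothetical helper: the value stored as `basic_auth_b64`.
fn basic_auth_b64(username: &str, password: &str) -> String {
    base64_encode(format!("{}:{}", username, password).as_bytes())
}

/// Hypothetical helper: the header value sent with each request.
fn authorization_header(b64: &str) -> String {
    format!("Basic {}", b64)
}

fn main() {
    let stored = basic_auth_b64("user", "pass");
    println!("basic_auth_b64 = \"{}\"", stored);
    println!("Authorization: {}", authorization_header(&stored));
}
```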
## CLI Reference
| Subcommand | Description | Helpful flags |
| --- | --- | --- |
| `init-config` | Write a default config template (fails if config exists unless forced). | `--force`, `--download-root` |
| `auth` | Collect username/password, encode them, and save them to the active profile. | `--non-interactive`, `--username`, `--password` |
| `list-courses` | List cached or freshly fetched courses with semester keys and IDs. | `--refresh` |
| `sync` | Download files for every enrolled course into the local tree. | `--dry-run`, `--prune`, `--since <semester key \| DDMMYY \| DDMMYYYY \| RFC3339>` |

Global flags: `--quiet`, `--debug`, `--json`, `-v/--verbose` (stackable), `--config-dir`, `--data-dir` (state + default downloads), `--profile`.
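
The accepted `--since` forms can be told apart with a small classifier. The sketch below is illustrative only: `Since` and `parse_since` are hypothetical names, two-digit years are assumed to mean 20YY, semester keys are assumed to look like `ws` plus four digits or `ss` plus two digits, and RFC 3339 values are recognized by shape but not fully parsed or validated.

```rust
#[derive(Debug, PartialEq)]
enum Since {
    /// Semester key such as "ws2526" or "ss25".
    Semester(String),
    /// Date given as DDMMYY or DDMMYYYY.
    Date { day: u32, month: u32, year: u32 },
    /// RFC 3339 timestamp, kept as text in this sketch.
    Rfc3339(String),
}

fn all_digits(s: &str) -> bool {
    !s.is_empty() && s.chars().all(|c| c.is_ascii_digit())
}

/// Hypothetical parser for the `--since` argument.
fn parse_since(arg: &str) -> Result<Since, String> {
    let lower = arg.to_ascii_lowercase();

    // Semester keys: "ws" + 4 digits or "ss" + 2 digits (assumed shapes).
    if let Some(rest) = lower.strip_prefix("ws") {
        if rest.len() == 4 && all_digits(rest) {
            return Ok(Since::Semester(lower));
        }
    }
    if let Some(rest) = lower.strip_prefix("ss") {
        if rest.len() == 2 && all_digits(rest) {
            return Ok(Since::Semester(lower));
        }
    }

    // Numeric dates: DDMMYY or DDMMYYYY.
    if all_digits(arg) && (arg.len() == 6 || arg.len() == 8) {
        let day: u32 = arg[0..2].parse().map_err(|_| "bad day".to_string())?;
        let month: u32 = arg[2..4].parse().map_err(|_| "bad month".to_string())?;
        let raw_year: u32 = arg[4..].parse().map_err(|_| "bad year".to_string())?;
        // Assumption: two-digit years refer to the 2000s.
        let year = if arg.len() == 6 { 2000 + raw_year } else { raw_year };
        if (1..=31).contains(&day) && (1..=12).contains(&month) {
            return Ok(Since::Date { day, month, year });
        }
        return Err(format!("invalid date: {}", arg));
    }

    // Loose RFC 3339 shape check: YYYY-MM-DDTHH:MM:SS plus an offset.
    let b = arg.as_bytes();
    if b.len() >= 20 && b[4] == b'-' && b[7] == b'-' && (b[10] == b'T' || b[10] == b't') {
        return Ok(Since::Rfc3339(arg.to_string()));
    }

    Err(format!("unrecognized --since value: {}", arg))
}

fn main() {
    for arg in ["ws2526", "01032024", "010324", "2024-03-01T00:00:00Z", "soon"] {
        println!("{:>22} -> {:?}", arg, parse_since(arg));
    }
}
```

A semester key would still need to be resolved to that term's cached start timestamp before it can be compared with a file's `chdate`.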
## Sync Behavior
1. Resolve user ID (cached in `state.toml`) and fetch current courses.
2. Cache missing semesters via `/semesters/{id}` and infer keys like `ws2425` / `ss25`. When `list-courses --refresh` is run, already-known semesters that never recorded a `start` timestamp are re-fetched so `--since` filters have the data they need.
3. For each course:
   - Walk folders using the JSON:API pagination helpers; fetch nested folders via `/folders/{id}/folders`.
   - List file refs via `/folders/{id}/file-refs`, normalize filenames (including transliteration of umlauts/ligatures like `ä→ae`, `Ö→Oe`, `ß→ss`, `œ→oe`), and ensure unique siblings through a `NameRegistry`.
   - Skip downloads when the local file exists and matches the stored checksum / size / remote `chdate`.
   - Stream downloads to `*.part`, hash contents on the fly, then rename atomically to the final path.
4. Maintain a set of remote files so `--prune` can remove local files that no longer exist remotely (and optionally delete now-empty directories). When `--since` is provided, files whose remote `chdate` precedes the resolved timestamp (semester start or explicit date) are skipped; newer files continue through the regular checksum/size logic.
5. `--dry-run` prints planned work but never writes to disk.
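
The per-file decisions in the steps above can be sketched as a pure planning function plus a set difference for pruning. This is a simplified illustration, not the tool's code: all type and function names are hypothetical, `chdate` is assumed to be a Unix timestamp in seconds, the checksum comparison is omitted, and it is an assumption that files skipped by `--since` still count as "remote" so that `--prune` leaves them alone.

```rust
use std::collections::HashSet;

struct RemoteFile {
    path: String,
    size: u64,
    chdate: i64, // assumed: Unix seconds
}

struct LocalRecord {
    size: u64,
    chdate: i64,
    exists_on_disk: bool,
}

#[derive(Debug, PartialEq)]
enum Action {
    Download,
    SkipUnchanged,
    SkipOlderThanSince,
}

/// Decide what to do with one remote file.
fn plan(remote: &RemoteFile, local: Option<&LocalRecord>, since: Option<i64>) -> Action {
    // Files changed before the --since cutoff are skipped outright.
    if let Some(cutoff) = since {
        if remote.chdate < cutoff {
            return Action::SkipOlderThanSince;
        }
    }
    match local {
        Some(rec)
            if rec.exists_on_disk && rec.size == remote.size && rec.chdate == remote.chdate =>
        {
            Action::SkipUnchanged
        }
        _ => Action::Download,
    }
}

/// Local paths that no longer exist remotely are candidates for --prune.
fn prune_candidates(local_paths: &[String], remote_paths: &HashSet<String>) -> Vec<String> {
    local_paths
        .iter()
        .filter(|p| !remote_paths.contains(*p))
        .cloned()
        .collect()
}

fn main() {
    let remote = RemoteFile {
        path: String::from("ws2526/Oekologie/Skript.pdf"),
        size: 1024,
        chdate: 1_700_000_000,
    };
    let known = LocalRecord { size: 1024, chdate: 1_700_000_000, exists_on_disk: true };

    // A dry run would print the planned action instead of performing it.
    println!("{} -> {:?}", remote.path, plan(&remote, Some(&known), None));

    let remote_set = HashSet::from([remote.path.clone()]);
    let local = vec![remote.path.clone(), String::from("ws2526/Oekologie/alt.pdf")];
    println!("prune: {:?}", prune_candidates(&local, &remote_set));
}
```

Keeping the decision separate from the I/O makes `--dry-run` straightforward: the same plan is computed, and only the execution step is skipped.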
## Development Notes
- The HTTP client limits itself to GETs with Basic auth; non-success responses are surfaced verbatim via `anyhow`.
- All downloads currently run sequentially; `ConfigProfile::max_concurrent_downloads` is in place for a future bounded task executor.
- Offline JSON:API documentation lives under `docs/studip/` to keep this repo usable without network access.
## Roadmap / Known Gaps
1. Implement real concurrent downloads that honor `max_concurrent_downloads`.
2. Wire `--since` into Stud.IP filters (if available) or local heuristics to reduce API load.
3. Add unit/integration tests (`semesters::infer_key`, naming helpers, pruning) and consider fixtures for Stud.IP responses.
4. Improve auth failure UX by detecting 401/403 and prompting the user to re-run `studip-sync auth`.
5. Evaluate whether the crate should target Rust 2021 (per the original requirement) or explicitly document Rust 2024 as the minimum supported version.