19 Commits

Author SHA1 Message Date
0997d4befa feat(auth): D1 non-breaking security foundations
- CORS: rewrite middleware with Vary: Origin, regex origin patterns,
  startup validation, and prod boot-fail on empty allowlist; shared
  CORSConfig exported for CSRF reuse
- CSRF: new Origin/Referer check middleware sharing CORS allowlist;
  Bearer-token clients exempt; mounts globally after CORS
- Argon2id: new password package with PHC format, bcrypt dispatch, and
  NeedsRehash; lazy upgrade on login in auth service
- Rate limiting: add RateLimitByKey with custom key function; apply
  per-route limits to /auth/login, /refresh, /2fa/verify,
  /auth/magic-link, and /auth/password
- apierror: add CSRFMismatch and RefreshReuse error constructors
- Migrations: 000027 (session model schema columns for D2/D3),
  000028 (TOTP secret_v2 column + totp_backup_codes table)
- cmd/totp-encrypt: one-shot job to encrypt existing TOTP secrets
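
The Origin/Referer check with the Bearer exemption can be sketched roughly as below. This is a minimal illustration, assuming the shared allowlist is a slice of compiled regex patterns; `compilePatterns`, `allowedOrigin`, and `csrfOK` are hypothetical names, not the repository's API, and the example.com pattern is a placeholder.

```go
package main

import (
	"fmt"
	"net/url"
	"regexp"
	"strings"
)

// compilePatterns compiles allowlist patterns and panics on an invalid one,
// loosely mirroring a fail-at-startup validation (hypothetical helper).
func compilePatterns(patterns ...string) []*regexp.Regexp {
	out := make([]*regexp.Regexp, 0, len(patterns))
	for _, p := range patterns {
		out = append(out, regexp.MustCompile(p))
	}
	return out
}

// allowedOrigin reports whether origin matches any allowlist pattern.
func allowedOrigin(origin string, allow []*regexp.Regexp) bool {
	for _, re := range allow {
		if re.MatchString(origin) {
			return true
		}
	}
	return false
}

// csrfOK applies an Origin/Referer check to a state-changing request.
// Bearer-token requests are exempt because browsers do not attach the
// Authorization header on their own.
func csrfOK(authorization, origin, referer string, allow []*regexp.Regexp) bool {
	if strings.HasPrefix(authorization, "Bearer ") {
		return true
	}
	if origin != "" {
		return allowedOrigin(origin, allow)
	}
	// Fall back to the Referer's scheme://host when Origin is absent.
	u, err := url.Parse(referer)
	if err != nil || u.Scheme == "" || u.Host == "" {
		return false
	}
	return allowedOrigin(u.Scheme+"://"+u.Host, allow)
}

func main() {
	allow := compilePatterns(`^https://([a-z0-9-]+\.)?example\.com$`)
	fmt.Println(csrfOK("", "https://app.example.com", "", allow))
	fmt.Println(csrfOK("", "https://evil.test", "", allow))
	fmt.Println(csrfOK("Bearer abc", "", "", allow))
}
```

Anchoring each pattern with `^` and `$` matters in a sketch like this: an unanchored pattern would also match an attacker origin that merely contains the allowed host as a substring.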
2026-04-26 11:54:37 +02:00
69c6453e26 feat(similarity): confidence calibration anchors + Ronneburg failure-case fixtures (B4-B5)
- Add confidence scale (0.95-1.00 / 0.70-0.90 / 0.50-0.70 / 0.00-0.50)
  with four annotated few-shot examples to the similarity system prompt
- Add two Ronneburg real-world pairs to similarity.json: descriptive-prefix
  swap and low-trigram-overlap rename, both expected same=true
2026-04-25 18:00:39 +02:00
2f32d4b954 chore: remove Mistral/Ollama legacy references after Gemini migration
Rename mistral.go → llm_enricher.go and mistral_test.go →
llm_enricher_test.go; update all test function names and stale model
strings (mistral-large-latest → gemini-2.5-flash-lite); drop Ollama
block from .env; mark superseded planning specs; update provider
references in planning docs and CLAUDE.md to Google Gemini.
2026-04-25 17:31:58 +02:00
3ddfd87408 feat(ai): migrate to Google Gemini 2.5 Flash-Lite, drop Mistral/Ollama
Replace the Mistral + Ollama AI stack with a single Google Gemini provider
backed by google.golang.org/genai. API key moves from env/Helm to the DB
(AES-256-GCM, key derived from JWT_SECRET via HKDF) so it can be rotated
via the admin UI without a pod restart.

New:
- pkg/crypto/secretbox — AES-256-GCM encrypt/decrypt for secrets at rest
- pkg/ai/gemini — GeminiProvider with grounding, structured output, usage
  recording, and hot-reload (Reinitialize swaps client under mutex)
- pkg/ai/usage — UsageRecorder interface + UsageEvent struct
- domain/settings/store — DB-backed settings (model, grounding toggle, key)
- domain/settings/usage — UsageRepo implementing UsageRecorder; ai_usage table
- migrations 000021 (system_settings) + 000022 (ai_usage)
- settings API: GET /ai, POST /ai/key, POST /ai/model, POST /ai/grounding,
  GET /ai/usage
- admin UI: 4-card settings page — provider status, model selector, grounding
  toggle with quota, usage rollups + recent-calls table

Removed:
- pkg/ai/ollama, mistral_provider, ratelimiter (+ tests)
- Helm AI_API_KEY, AI_PROVIDER, AI_MODEL_COMPLEX, AI_AGENT_DISCOVERY,
  AI_RATE_LIMIT_RPS env vars

Call sites set Grounded+CallType: research (true/"research"), enrich Pass B
(true/"enrich_b"), similarity (false/"similarity"). Integration test updated
to use a stub ai.Provider instead of a fake Ollama HTTP server.
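
For illustration, the at-rest encryption described above (AES-256-GCM with a key derived from a master secret via HKDF) might look like the following sketch. It is not the code in pkg/crypto/secretbox: `deriveKey`, `seal`, and `open` are hypothetical names, the info label "ai-api-key" is an assumption, and the HKDF here is a hand-rolled single-block HKDF-SHA256 (RFC 5869) to keep the example stdlib-only.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"errors"
	"fmt"
)

// deriveKey is a single-block HKDF-SHA256: one extract step and one expand
// round, which yields exactly the 32 bytes AES-256 needs.
func deriveKey(secret, salt []byte, info string) []byte {
	ext := hmac.New(sha256.New, salt)
	ext.Write(secret)
	prk := ext.Sum(nil)

	exp := hmac.New(sha256.New, prk)
	exp.Write([]byte(info))
	exp.Write([]byte{0x01})
	return exp.Sum(nil)
}

func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// seal encrypts plaintext and returns nonce || ciphertext || tag.
func seal(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// open reverses seal and fails if the data was tampered with or the key is wrong.
func open(key, sealed []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(sealed) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, ct := sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ct, nil)
}

func main() {
	key := deriveKey([]byte("jwt-secret-placeholder"), nil, "ai-api-key")
	sealed, err := seal(key, []byte("gemini-key-placeholder"))
	if err != nil {
		panic(err)
	}
	plain, err := open(key, sealed)
	fmt.Println(string(plain), err)
}
```

One consequence of deriving the key from JWT_SECRET, under this scheme, is that rotating JWT_SECRET makes previously stored ciphertexts undecryptable unless they are re-encrypted first.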
2026-04-25 09:54:49 +02:00
24e072b63d feat(ai): pluggable provider interface, Ollama + Mistral impls, migrate Pass2 sites
Replaces the Mistral-only ai.Client with an ai.Provider interface backed by
Ollama and Mistral implementations. Migrates enrichment + similarity callers
to ai.Provider.Chat. Research endpoint returns 501 until commit 2 reinstates
it on the new orchestrator.
2026-04-24 16:35:18 +02:00
126cc58cbf refactor(discovery-eval): share JSON helpers, trim narration, tighten signatures
- Extract readJSONFile + writeJSONAtomic in cache.go; category cache
  reuses them (saveCategoryCache is one line, loadCategoryCache uses
  the standard load-or-empty shape).
- Drop dead errMsg param from scoreCategoryResult (always "").
- Wrap writeCategoryReport errors with context for consistency.
- Wrap runSimilarityMode / runCategoryMode's 5 per-mode flags into an
  evalConfig struct so params don't drift.
- Promote validModes to a package-level var.
- Remove redundant cache = new...() fallback after load* (both load
  helpers already return a non-nil empty cache on error).
- Strip narrating / diff-referencing comments per CLAUDE.md; keep the
  one genuine WHY on normalizeCategory (divergence from normalize.Name).

Net -54 lines across 4 files; go build + go vet + tests green.
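
A rough sketch of what a shared pair of JSON helpers with atomic writes could look like. The function names come from the list above, but the signatures and bodies are assumptions, not the code in cache.go.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// writeJSONAtomic marshals v, writes it to a temp file in the target
// directory, and renames it over path, so readers never see a partial file.
func writeJSONAtomic(path string, v any) error {
	data, err := json.MarshalIndent(v, "", "  ")
	if err != nil {
		return err
	}
	tmp, err := os.CreateTemp(filepath.Dir(path), filepath.Base(path)+".tmp-*")
	if err != nil {
		return err
	}
	// Clean up the temp file on failure; after a successful rename the
	// name no longer exists and the error from Remove is ignored.
	defer os.Remove(tmp.Name())
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

// readJSONFile loads path into v.
func readJSONFile(path string, v any) error {
	data, err := os.ReadFile(path)
	if err != nil {
		return err
	}
	return json.Unmarshal(data, v)
}

func main() {
	dir, err := os.MkdirTemp("", "eval-cache")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "cache.json")
	if err := writeJSONAtomic(path, map[string]string{"k": "v"}); err != nil {
		panic(err)
	}
	var got map[string]string
	fmt.Println(readJSONFile(path, &got), got)
}
```

Creating the temp file in the same directory as the target keeps the rename on one filesystem, which is what makes the final step atomic on POSIX systems.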
2026-04-24 12:59:06 +02:00
88d0ae9d96 feat(discovery): category eval mode for the LLM enricher
Ship 2 MR 5b. Extends discovery-eval with a second mode that grades
MistralLLMEnricher's category output against labelled ground truth.
Accuracy + per-label confusion matrix so mix-ups between similar
categories (mittelaltermarkt vs ritterfest, weihnachtsmarkt vs
kirchweih) are visible at a glance.

Usage:
  -mode similarity  — existing MR 5 path, unchanged.
  -mode category    — new: scrapes quellen URLs, asks LLM for
                       {category, opening_hours, description},
                       scores category only.

Structure
- main.go: split into runSimilarityMode + runCategoryMode. Both
  share ai.Client construction and the ctx timeout (bumped to 15min
  for category mode since scraping adds I/O). Mode dispatched on
  -mode flag; unknown modes exit 2.
- category.go: fixture / cache / run / metrics / report — parallel
  to the similarity files, not shared because the data shapes differ
  enough that generics would add more noise than they save. Cache
  key is sha256(markt_name_lower|stadt_lower|year|model); separate
  from SimilarityPairKey since that one takes two rows.
- fixtures/category.json: 10 hand-labelled DACH-market rows
  exercising the categories we expect the LLM to produce —
  mittelaltermarkt, weihnachtsmarkt, ritterfest, ritterturnier,
  handwerkermarkt, schlossfest, kirchweih. Each row lists a quelle
  URL the enricher will scrape live (first run only; cache takes
  over after).
- normalizeCategory: strips casing + German umlauts + the -märkte
  plural drift so a correctly-categorised row doesn't get scored
  wrong for cosmetic LLM output variation.

Metrics: Accuracy + per-label confusion matrix. Confusion format is
`want → predictions` with `!` markers on off-diagonal predictions —
readable in a terminal, machine-parseable in the JSON report.
Mismatches are listed at the end with want/got pairs so operators
can spot prompt failures and patch either the prompt or the fixture.

Threshold gate reads accuracy (not F1): category is multi-class, so
precision/recall don't have a single-label meaning.

Tests: normalisation edge cases (casing, umlaut, plural, trimming),
scoring drift tolerance, metrics counts + confusion matrix shape,
errors excluded from confusion, cache round-trip + model scoping,
missing/corrupt file handling.

.gitignore adds .cat-eval-cache.json and cat-eval-report.json.

Follow-ups (MR 5c / later): opening_hours and description scoring.
Both need fuzzier matching (regex structure vs LLM judge) which is
its own design problem.
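
The cache key formula above, sha256(markt_name_lower|stadt_lower|year|model), can be sketched as follows. `categoryCacheKey` is a hypothetical name, and the hex encoding of the digest is an assumption; the fixture values are placeholders.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strconv"
	"strings"
)

// categoryCacheKey hashes the lowercased market name and city together with
// the year and model, joined by "|". Including the model means a model
// change never reuses stale cached answers.
func categoryCacheKey(marktName, stadt string, year int, model string) string {
	raw := strings.Join([]string{
		strings.ToLower(marktName),
		strings.ToLower(stadt),
		strconv.Itoa(year),
		model,
	}, "|")
	sum := sha256.Sum256([]byte(raw))
	return hex.EncodeToString(sum[:])
}

func main() {
	a := categoryCacheKey("Striezelmarkt", "Dresden", 2026, "model-a")
	b := categoryCacheKey("STRIEZELMARKT", "dresden", 2026, "model-a")
	c := categoryCacheKey("Striezelmarkt", "Dresden", 2026, "model-b")
	fmt.Println(a == b, a == c)
}
```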
2026-04-24 12:44:26 +02:00
cf5408ab66 feat(discovery): eval harness for the AI similarity classifier
Ship 2 MR 5. Adds a CLI that measures MistralSimilarityClassifier
against a labelled fixture: precision, recall, F1, accuracy, plus a
confidence calibration table so we can tell whether "90% confident"
verdicts are actually right 90% of the time.

Usage: go run ./backend/cmd/discovery-eval -fixture ... -cache ...
-threshold 0.8 -report eval-report.json.

Structure
- main.go: arg parsing + wiring (ai.Client, classifier, cache,
  metrics). The work happens in realMain() which returns an exit code
  — keeps defers running on error paths.
- fixture.go: parses labelled pairs JSON. Fixture authors only need to
  fill in name/stadt/year; name_normalized falls back to name when
  omitted.
- cache.go: file-backed map keyed by SimilarityPairKey + model string.
  Symmetric (a,b) == (b,a). Atomic writes (temp file + rename) so a
  crashed run cannot corrupt the cache. Corrupt-file load returns an
  empty usable cache and reports the parse error.
- run.go: executes each pair through the classifier, populating the
  cache. Individual classify errors are downgraded to "not correct"
  and logged — the run always finishes so the operator sees whatever
  data is available.
- metrics.go: confusion matrix, P/R/F1/accuracy, per-confidence-bucket
  calibration ([0, 0.5), [0.5, 0.75), [0.75, 0.9), [0.9, 1.0]).
  Prints human summary + surfaces highest-confidence mismatches
  first (most actionable for prompt iteration). Optional JSON report.
- Threshold gate: -threshold N exits non-zero when F1<N. Default 0
  (gating disabled until we have a baseline F1).

Fixture: seeds 15 hand-crafted DACH-market pairs covering the edge
cases we actually care about — umlaut drift (Straßburg/Strassburg),
year difference on a recurring series, word-reordering, distinct
events at the same venue, historical proper names (Striezelmarkt),
same city with multiple distinct Christmas markets. Operator extends
over time; each pair carries a `note` explaining the case it locks.

.gitignore adds .eval-cache.json and eval-report.json — neither
should land in the repo.

Tests cover metrics edge cases (all correct, imbalanced,
no-positive-predictions-no-NaN, calibration bucket assignment,
cache accounting, empty input) and cache behaviour (round-trip,
symmetric lookup, model-scoped invalidation, missing/corrupt file
handling, atomic-write leaves no temp files).

Out of scope for MR 5: enrichment field accuracy (fuzzy text
scoring is its own problem — tracked for a follow-up), CI wiring
(needs a baseline F1 first).
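
As a sketch of the metrics described in this commit, the snippet below computes precision, recall, and F1 with zero-denominator guards (the "no-positive-predictions-no-NaN" case) and assigns a confidence to one of the four calibration buckets. The type and function names are hypothetical, not those in metrics.go.

```go
package main

import "fmt"

// counts is a binary confusion matrix.
type counts struct{ TP, FP, TN, FN int }

// ratio returns num/den, or 0 when den is 0, so metrics never become NaN.
func ratio(num, den int) float64 {
	if den == 0 {
		return 0
	}
	return float64(num) / float64(den)
}

func (c counts) precision() float64 { return ratio(c.TP, c.TP+c.FP) }
func (c counts) recall() float64    { return ratio(c.TP, c.TP+c.FN) }
func (c counts) accuracy() float64  { return ratio(c.TP+c.TN, c.TP+c.TN+c.FP+c.FN) }

// f1 is the harmonic mean of precision and recall, 0 when both are 0.
func (c counts) f1() float64 {
	p, r := c.precision(), c.recall()
	if p+r == 0 {
		return 0
	}
	return 2 * p * r / (p + r)
}

// bucket maps a confidence to the index of its calibration bucket:
// 0: [0, 0.5), 1: [0.5, 0.75), 2: [0.75, 0.9), 3: [0.9, 1.0].
func bucket(conf float64) int {
	switch {
	case conf < 0.5:
		return 0
	case conf < 0.75:
		return 1
	case conf < 0.9:
		return 2
	default:
		return 3
	}
}

func main() {
	c := counts{TP: 3, FP: 1, TN: 5, FN: 1}
	fmt.Println(c.precision(), c.recall(), c.f1(), c.accuracy())
	fmt.Println(bucket(0.42), bucket(0.8), bucket(0.95))
}
```

A calibration table then compares, per bucket, the share of correct verdicts against the bucket's confidence range; a well-calibrated classifier is right about 90% of the time or more in the top bucket.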
2026-04-24 12:26:18 +02:00
3add4fb7ad refactor(discovery): remove Mistral Pass 0 path; /crawl is canonical
Deletes agent_client.go, agent_client_test.go, and the discovery-compare
diagnostic CLI. Removes Tick/PickBuckets/processOneBucket/processBucketResponse
from Service; renames NewServiceWithCrawler to NewService. Drops BatchSize,
ForwardMonths, AgentDiscovery config fields and their env reads. PickStaleBuckets
and UpdateBucketQueried removed from Repository interface (no callers). Stats
hardcodes forwardMonths=12. /tick route removed; /crawl is now the only machine
path, still protected by requireTickToken middleware.
2026-04-18 17:42:30 +02:00
7c8a8c6419 fix(discovery): review follow-ups — konfidenz signal, end-date default, determinism, rate-limit=0
- Service.Crawl derives Konfidenz from merged source count + rank instead of
  hardcoded "mittel". Two+ sources -> "hoch"; single curated source ->
  "mittel"; single suendenfrei (prose regex) -> "niedrig".
- New AgentStatus constant "crawler" replaces "bestaetigt" for crawler rows
  so the validator's agent-specific rules don't fire on them and operators
  can filter the queue by origin. Added Konfidenz* and AgentStatus*
  constants to model.go.
- Default EndDatum to StartDatum when a source reports a single date
  (festival_alarm one-day events, suendenfrei lines without a "bis" range).
  Avoids Service.Accept rejecting nil-EndDatum rows.
- Sort PerSource names before assembling raw events for merge — makes
  merged output order deterministic across runs.
- NewHandler: manualRateLimitPerHour <= 0 now explicitly disables the
  rate limit (previously silently floored to 1/hour). Documented behavior
  for all three cases in a constructor comment.
- Added four new tests for Service.Crawl failure/quality paths:
  LinkCheckFailed, DedupedQueue, EndDatum default, multi-source Konfidenz.
- Documented the substring-match approximation in
  cmd/discovery-compare/main.go's groupCrawlerByBucket — diagnostic-only,
  not safe for production routing.
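
The Konfidenz rule from the first bullet can be illustrated with the sketch below. It assumes that "rank" reduces to whether the single source is a low-rank prose source such as suendenfrei; `deriveKonfidenz` and `lowRankSources` are hypothetical, and the zero-source fallback is an assumption as well.

```go
package main

import "fmt"

const (
	KonfidenzHoch    = "hoch"
	KonfidenzMittel  = "mittel"
	KonfidenzNiedrig = "niedrig"
)

// lowRankSources lists sources parsed from prose via regex (assumed set).
var lowRankSources = map[string]bool{"suendenfrei": true}

// deriveKonfidenz maps the merged, distinct source names of an event to a
// confidence level: two or more sources agree -> hoch; one curated source
// -> mittel; one low-rank source (or none) -> niedrig.
func deriveKonfidenz(sources []string) string {
	switch {
	case len(sources) >= 2:
		return KonfidenzHoch
	case len(sources) == 1 && !lowRankSources[sources[0]]:
		return KonfidenzMittel
	default:
		return KonfidenzNiedrig
	}
}

func main() {
	fmt.Println(deriveKonfidenz([]string{"festival_alarm", "suendenfrei"}))
	fmt.Println(deriveKonfidenz([]string{"festival_alarm"}))
	fmt.Println(deriveKonfidenz([]string{"suendenfrei"}))
}
```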
2026-04-18 16:35:26 +02:00
c5a4bc441c feat(cmd): discovery-compare CLI for pre-cutover coverage verification
2026-04-18 16:08:48 +02:00
580b9d5e3c feat: add admin panel, market submissions, and email notifications
- Admin CRUD endpoints for markets with role-based middleware
- Anonymous market submission with Cloudflare Turnstile verification
- SMTP email notifications on new submissions (LogSender fallback)
- Market status workflow (pending/approved/rejected) with admin notes
- Nullable location column for submissions without coordinates
- CLI tool for promoting users to admin role
- Slug generation package extracted from seed
- Rate limiting on submission endpoint (3/hour per IP)
- Mailpit added to docker-compose for local email testing
2026-02-27 11:03:44 +01:00
cb2e8c4cde fix(seed): gofmt struct field alignment
2026-02-22 19:13:53 +01:00
78046848f2 feat(seed): enrich markets data with real web-researched info
- Reduced from 311 to 272 markets (removed 39 unverified entries)
- All 272 markets now have real descriptions sourced from official websites
- Added admission prices (adult/child cents + notes) where available
- Added street addresses and corrected venue names
- Added opening hours for markets where published
- Updated websites to full https:// URLs
- Updated seedMarket struct with new fields: Street, Description,
  AdmissionAdultCents, AdmissionChildCents, AdmissionNotes
- Seed INSERT now uses description/admission from JSON instead of
  generating template descriptions; falls back to generated desc if empty
- Added jsonString() helper for SQL-safe JSON encoding
2026-02-22 19:09:21 +01:00
549df60f09 fix(seed): resolve lint issues (noctx, goconst)
2026-02-22 11:41:33 +01:00
993bab1218 fix(seed): skip seeding if database already contains markets
2026-02-22 11:39:26 +01:00
ffaee89243 feat(seed): add 311 real medieval markets for 2026
Scrape marktkalendarium.de for 2026 market data and replace the
placeholder seed with real events across DE/AT/CH.

- Embed 311 markets as JSON with name, dates, city, zip, venue,
  organizer and website
- Geocode coordinates via Nominatim with caching and rate limiting
- Auto-derive Bundesland from postal code prefix
- Generate descriptions based on event type keywords
- Support DATABASE_URL env var for direct production seeding
2026-02-22 11:38:14 +01:00
3145dba255 fix(lint): resolve all golangci-lint v2 issues
- Disable revive exported/package-comments rules (style, not correctness)
- Use errors.Is instead of == for pgx.ErrNoRows comparisons
- Use errors.As instead of type assertion on validator errors
- Use http.NewRequestWithContext instead of client.Get (noctx)
- Check resp.Body.Close error return (errcheck)
- Run gofmt on files with formatting drift
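
To show why the errors.Is and errors.As changes matter, here is a small sketch using a stand-in sentinel instead of pgx.ErrNoRows and a stand-in validation error type. All names are hypothetical; the point is only that `==` and a plain type assertion stop working once an error is wrapped with %w.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoRows stands in for pgx.ErrNoRows in this sketch.
var errNoRows = errors.New("no rows in result set")

// validationError stands in for a validator library's error type.
type validationError struct{ Field string }

func (e *validationError) Error() string { return "invalid field: " + e.Field }

// findMarket simulates a repository call that wraps the sentinel with context.
func findMarket(slug string) error {
	return fmt.Errorf("find market %q: %w", slug, errNoRows)
}

// wrapValidation simulates a handler wrapping a validation failure.
func wrapValidation(field string) error {
	return fmt.Errorf("bind request: %w", &validationError{Field: field})
}

// isNotFound sees through wrapping, unlike err == errNoRows.
func isNotFound(err error) bool { return errors.Is(err, errNoRows) }

// invalidField extracts the typed error from anywhere in the chain,
// unlike a direct type assertion on err.
func invalidField(err error) (string, bool) {
	var ve *validationError
	if errors.As(err, &ve) {
		return ve.Field, true
	}
	return "", false
}

func main() {
	err := findMarket("striezelmarkt")
	fmt.Println(err == errNoRows, isNotFound(err))

	_, direct := wrapValidation("email").(*validationError)
	field, ok := invalidField(wrapValidation("email"))
	fmt.Println(direct, field, ok)
}
```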
2026-02-22 10:09:46 +01:00
a1d93f7a8e feat: implement MVP backend API
Go backend with Gin, pgx, Valkey (go-valkey), and PostGIS.

Domains:
- Market search with PostGIS geo-queries (ST_DWithin, ST_Distance),
  German full-text search (tsvector + ILIKE fallback for compound words),
  date range filtering, pagination, and slug-based detail endpoint
- Auth with email+password (bcrypt), JWT access tokens (15min),
  session tokens (30d, dual Valkey+Postgres storage), OAuth
  (Google/GitHub/Facebook), magic links, and TOTP 2FA
- User profile with CRUD, soft-delete (30d grace), and restore

Infrastructure:
- 6 database migrations (users, sessions, oauth_accounts, magic_links,
  markets with PostGIS+FTS, totp_secrets)
- Middleware: recovery, request ID, structured logging (slog), CORS,
  per-IP rate limiting, JWT auth
- Seed data: 10 medieval markets across DACH region
- Docker Compose (PostgreSQL 17 with PostGIS + Valkey 8), multi-stage Dockerfile,
  Woodpecker CI pipeline, Kubernetes manifests
- Justfile, golangci-lint config, env example
2026-02-18 05:52:20 +01:00