Pass 1 and Pass 2 now detect Mistral web_search rate limits (shared with
the Pass 0 CronJob) and return a proper HTTP 429 with Retry-After: 60
instead of a generic 500 "AI research failed". Pass 2 is enrichment-only,
so rate limits there fall through with the Pass 1 results intact.
- pkg/ai: new shared IsRateLimit helper + DefaultRetryAfterSeconds=60.
discovery/service.go drops its local copy and imports the shared one.
- apierror.TooManyRequests now accepts an optional custom message so the
response body can include "try again in ~60s".
- market/research.go: respondRateLimited helper (sketched below) sets
Retry-After, downgrades the log line from ERROR to WARN (rate limits are
expected state, not a fault), and returns 429 with a structured
rate_limited code the admin UI can key off of.
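
A minimal sketch of the helper, assuming net/http and log/slog; the
response body shape and logger wiring are assumptions, while the helper
name, WARN level, Retry-After header, 429 status, and rate_limited code
come from the notes above.

    package market

    import (
        "encoding/json"
        "fmt"
        "log/slog"
        "net/http"
    )

    // Mirrors pkg/ai's DefaultRetryAfterSeconds; redeclared here so the
    // sketch compiles on its own.
    const defaultRetryAfterSeconds = 60

    func respondRateLimited(w http.ResponseWriter, log *slog.Logger, err error) {
        // Rate limits are expected state, not a fault: WARN, not ERROR.
        log.Warn("ai research rate limited", "err", err)

        w.Header().Set("Retry-After", fmt.Sprintf("%d", defaultRetryAfterSeconds))
        w.Header().Set("Content-Type", "application/json")
        w.WriteHeader(http.StatusTooManyRequests)
        // Structured code the admin UI keys off of.
        _ = json.NewEncoder(w).Encode(map[string]string{
            "code":    "rate_limited",
            "message": fmt.Sprintf("AI provider rate limited, try again in ~%ds", defaultRetryAfterSeconds),
        })
    }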
Pass 0 agents produce schema-valid but semantically wrong output: markets
claimed in the wrong bundesland, status 'bestaetigt' with a hinweis about
Vorjahresdaten, etc. The schema alone can't catch these. This validator
does, as a blocking gate before InsertDiscovered.
Checks (Pass 0 scope):
- bundesland_mismatch: agent's bundesland must equal bucket.region, with
a light normalizer for CH 'Kanton X' prefix so Phase B can refine the
Schweiz seed without a signature break.
- status_hinweis_inconsistent: if agent_status=='bestaetigt' AND hinweis
contains 'vorjahr' (case-insensitive), the agent contradicted itself.
Errors drop the market (counted as summary.validation_failed); warnings
would get merged into hinweis, but no warning-level checks exist yet at
Pass 0 scope (the hook is reserved as a placeholder). A sketch of the
checks follows.
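
A sketch of both checks; the market field names are illustrative, while
the check names, the 'Kanton' normalization, and the case-insensitive
'vorjahr' rule come from the notes above.

    package discovery

    import "strings"

    // pass0Market carries only the fields these checks read (illustrative).
    type pass0Market struct {
        Bundesland  string
        AgentStatus string
        Hinweis     string
    }

    // validationError names the failed check for summary.validation_failed.
    type validationError struct{ check string }

    func (e validationError) Error() string { return e.check }

    // normalizeRegion strips the Swiss "Kanton " prefix so the agent's
    // "Kanton Zug" compares equal to a bucket region of "Zug".
    func normalizeRegion(s string) string {
        return strings.TrimPrefix(strings.TrimSpace(s), "Kanton ")
    }

    // validatePass0 is the blocking gate run before InsertDiscovered.
    func validatePass0(m pass0Market, bucketRegion string) error {
        if normalizeRegion(m.Bundesland) != normalizeRegion(bucketRegion) {
            return validationError{"bundesland_mismatch"}
        }
        // 'bestaetigt' contradicts a hinweis that mentions Vorjahresdaten.
        if m.AgentStatus == "bestaetigt" &&
            strings.Contains(strings.ToLower(m.Hinweis), "vorjahr") {
            return validationError{"status_hinweis_inconsistent"}
        }
        return nil
    }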
Phase B (research agent) checks will extend this file: oeffnungszeiten
dedup, start_datum window coverage, full quellen liveness for Pass 1.
Pass 0 splits every month into two halves (H1 = days 1-15, H2 = 16-EOM)
so each agent call fits within Mistral's 4096 max_tokens budget. The
response schema picks up richer per-market signals, and dead URLs
returned by the agent are filtered out before they land in the admin
queue.
DB:
- 000015: add halbmonat char(2) to discovery_buckets, widen unique key,
backfill existing rows as H1 + insert H2 siblings (624 → 1248 rows).
- 000016: rename discovered_markets.extraktion → konfidenz with
best-effort value mapping (verbatim→hoch, abgeleitet→mittel); add
agent_status column.
Backend:
- model: Bucket gains Halbmonat; Pass0Bucket same. Pass0Market renames
Extraktion → Konfidenz and adds AgentStatus (JSON tag "status").
DiscoveredMarket mirrors both fields; queue-lifecycle Status column
stays distinct from agent-reported AgentStatus.
- repository: all SELECT/INSERT touched to use the new columns; picker
orders by year_month, halbmonat so H1 runs before H2 in the same
month.
- agent client: prompt now injects halbmonat and recherche_datum (today)
so the agent has explicit date context.
- link verification: the new LinkChecker (sketched below) does concurrent
HEAD (GET fallback on 405) with a 5s timeout. FilterURLs runs before
InsertDiscovered: markets whose quellen all fail are dropped and counted
as link_check_failed in TickSummary. Failing website URLs are cleared but
don't block insert.
- Service.linkChecker is a narrow interface so tests inject a noop
stub instead of hitting the network.
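
A sketch of the checker; the HEAD-then-GET-on-405 behavior and the 5s
timeout come from the bullet above, the rest of the shape is illustrative.

    package discovery

    import (
        "context"
        "net/http"
        "sync"
        "time"
    )

    // LinkChecker verifies source URLs before they reach the admin queue.
    type LinkChecker struct {
        client *http.Client
    }

    func NewLinkChecker() *LinkChecker {
        return &LinkChecker{client: &http.Client{Timeout: 5 * time.Second}}
    }

    // alive tries HEAD first and falls back to GET when the server answers 405.
    func (c *LinkChecker) alive(ctx context.Context, url string) bool {
        for _, method := range []string{http.MethodHead, http.MethodGet} {
            req, err := http.NewRequestWithContext(ctx, method, url, nil)
            if err != nil {
                return false
            }
            resp, err := c.client.Do(req)
            if err != nil {
                return false
            }
            resp.Body.Close()
            if resp.StatusCode == http.StatusMethodNotAllowed {
                continue // HEAD not supported here: retry with GET
            }
            return resp.StatusCode < 400
        }
        return false
    }

    // FilterURLs checks all URLs concurrently and keeps only the live ones.
    func (c *LinkChecker) FilterURLs(ctx context.Context, urls []string) []string {
        results := make([]bool, len(urls))
        var wg sync.WaitGroup
        for i, u := range urls {
            wg.Add(1)
            go func(i int, u string) {
                defer wg.Done()
                results[i] = c.alive(ctx, u)
            }(i, u)
        }
        wg.Wait()
        live := make([]string, 0, len(urls))
        for i, u := range urls {
            if results[i] {
                live = append(live, u)
            }
        }
        return live
    }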
Web:
- DiscoveredMarket type gains konfidenz + agent_status, drops extraktion.
- The queue's "Extraktion" column is renamed "Konfidenz", with
three-level coloring (hoch=emerald, mittel=amber, niedrig=red, else
neutral).
- A small pill next to markt_name surfaces agent_status when it's not
"bestaetigt" (red for "abgesagt", amber for "unklar" and
"vorjahr_unbestaetigt") so risky entries are obvious before accepting.
Expanding any row in the discovery queue now reveals:
- Quellen as clickable URLs (was just a count)
- Hinweis if the agent emitted one
- Inline edit form for markt_name, stadt, bundesland, start/end date,
and website — the fields the Pass 0 agent gets wrong most often
Backend:
- PATCH /admin/discovery/queue/:id applies a partial update to pending
entries via a COALESCE-based SQL update (sketched after this list). Only
fields that were set are written.
- Service recomputes name_normalized when markt_name or stadt change so
dedup stays consistent after edits.
- Status check ensures only 'pending' entries are mutable.
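
A sketch of the COALESCE pattern, assuming pgx, with pointer fields
standing in for "not provided"; the date and website columns are omitted
for brevity.

    package discovery

    import (
        "context"

        "github.com/jackc/pgx/v5"
    )

    // queuePatch uses pointers so an unset field encodes as SQL NULL.
    type queuePatch struct {
        MarktName  *string
        Stadt      *string
        Bundesland *string
    }

    func updatePending(ctx context.Context, conn *pgx.Conn, id int64, p queuePatch) error {
        // COALESCE($n, col) keeps the stored value when the arg is NULL,
        // so only fields the client actually set are written; the status
        // guard makes only 'pending' entries mutable.
        _, err := conn.Exec(ctx, `
            UPDATE discovered_markets
            SET markt_name = COALESCE($2, markt_name),
                stadt      = COALESCE($3, stadt),
                bundesland = COALESCE($4, bundesland)
            WHERE id = $1 AND status = 'pending'`,
            id, p.MarktName, p.Stadt, p.Bundesland)
        return err
    }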
Web:
- Row state $expandedId holds at most one open drawer at a time.
- Dates round-trip through <input type="date"> using the shared
dateInputValue helper; form action converts back to RFC3339 for Go.
- Existing Accept/Reject buttons untouched — workflow is edit-then-accept.
Rate limits (Mistral web_search 429) used to be counted as hard errors,
marking the bucket as queried and bumping the Errors (24h) strip, even
though the right behavior is to wait and try again later.
Backend:
- isRateLimit() matches "rate limit" / "status 429" in the error string.
- On a persistent rate limit after one 10s retry: leave last_queried_at
unchanged (the bucket stays eligible for the next tick) and abort the
remainder of this tick, since Mistral's web_search budget is shared and
there's no point hammering more buckets in the same batch.
- TickSummary gains a rate_limited counter; Errors stays for real
failures. A sketch of the loop follows.
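
A sketch of the control flow; isRateLimit's match strings, the single 10s
retry, the untouched last_queried_at, and the batch abort come from the
notes above, while the service plumbing is illustrative.

    package discovery

    import (
        "context"
        "strings"
        "time"
    )

    type bucket struct{ ID int64 }

    type tickSummary struct {
        Errors      int
        RateLimited int
    }

    // Dependencies as function fields so the sketch compiles on its own.
    type service struct {
        queryBucket func(context.Context, bucket) error
        markQueried func(context.Context, bucket)
    }

    // isRateLimit matches the provider's 429 purely on the error string.
    func isRateLimit(err error) bool {
        if err == nil {
            return false
        }
        s := strings.ToLower(err.Error())
        return strings.Contains(s, "rate limit") || strings.Contains(s, "status 429")
    }

    func (s *service) tick(ctx context.Context, buckets []bucket) tickSummary {
        var sum tickSummary
        for _, b := range buckets {
            err := s.queryBucket(ctx, b)
            if isRateLimit(err) {
                time.Sleep(10 * time.Second) // one retry after 10s
                err = s.queryBucket(ctx, b)
            }
            if isRateLimit(err) {
                // Still limited: leave last_queried_at untouched so the
                // bucket stays eligible, and abort the rest of the batch
                // because the web_search budget is shared.
                sum.RateLimited++
                return sum
            }
            if err != nil {
                sum.Errors++ // real failure
            }
            s.markQueried(ctx, b) // sets last_queried_at
        }
        return sum
    }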
Frontend:
- Dates: RFC3339 → 'DD.MM.YYYY' German format, range rendered as
'DD.MM.YYYY – DD.MM.YYYY'.
- Queue table: cell horizontal padding, uppercase compact headers,
scrollable on narrow viewports, dark-mode variants on every color
(emerald/amber badges, link color, reject button), and a single Region
column that folds bundesland||land together (Land was always
'Deutschland' for DACH anyway).
Without snake_case JSON tags, Go serializes fields under their PascalCase
names (ID, MarktName, etc.), but the Svelte frontend reads snake_case.
Every row.id on the client was undefined, which made Svelte 5 see
identical undefined keys across the {#each queue as row (row.id)} loop
and throw each_key_duplicate.
Adds explicit snake_case tags to Bucket, DiscoveredMarket, and
RejectedDiscovery to match what the TypeScript types already expect.
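
The fix in miniature, with only a representative subset of fields:

    package discovery

    // Without tags, encoding/json emits the exported Go names verbatim,
    // e.g. {"ID": 7, "MarktName": "..."}; with explicit snake_case tags
    // the wire format matches the TypeScript types and the row.id key.
    type DiscoveredMarket struct {
        ID        int64  `json:"id"`
        MarktName string `json:"markt_name"`
        Stadt     string `json:"stadt"`
    }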
The Service listens on port 80 (targeting container port 8080), but the
CronJob was curling :8080 directly, a port the Service doesn't expose, so
every tick timed out after ~135s with "Could not connect to server".
Switch to {{ .Values.service.port }} so the template always tracks the
actual Service port.
A nil Go slice marshals as JSON null, not [], which crashed the Svelte
page's .length access on fresh installs where no discovery tick had
happened yet. Reproduced in production: /admin/discovery returned 500
because data.queue was null and {queue.length} dereferenced it.
Backend: initialize every returning slice in repository.go via
make([]T, 0) so zero rows serialize as [] consistently. Also applies to
PickStaleBuckets, ListSeriesByCity, and Stats.RecentErrors.
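
The underlying encoding/json behavior, as a self-contained illustration
rather than the repository code:

    package main

    import (
        "encoding/json"
        "fmt"
    )

    func main() {
        var nilSlice []int      // what a zero-row query used to return
        empty := make([]int, 0) // what the repository returns now

        a, _ := json.Marshal(nilSlice)
        b, _ := json.Marshal(empty)
        fmt.Println(string(a), string(b)) // prints: null []
    }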
Web: coalesce data.queue / data.stats.recent_errors at the top of the
Svelte script with `?? []` so future nil-slice regressions don't take
the whole page down.
Surfaces CronJob health signals without needing kubectl: last tick time
(rendered amber when older than 6h), buckets due now, errors in the last
24h (with an expandable list of the most recent failing buckets), and
queue size.
Also wires the previously-orphaned /admin/discovery route into the admin
sidebar next to Märkte.
- backend: new GET /admin/discovery/stats endpoint; Stats + BucketError
types (shapes sketched below); repository Stats() aggregates four
counters plus the top 5 failing buckets.
- web: +page.server.ts fetches stats in parallel with queue;
+page.svelte renders a 4-card strip above the queue table.
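
Illustrative shapes for the payload; the four counters and RecentErrors
appear in these notes, while the exact field names and JSON tags are
assumptions.

    package discovery

    import "time"

    type BucketError struct {
        Region    string    `json:"region"`
        YearMonth string    `json:"year_month"`
        Message   string    `json:"message"`
        At        time.Time `json:"at"`
    }

    type Stats struct {
        LastTickAt   *time.Time    `json:"last_tick_at"` // nil until the first tick
        BucketsDue   int           `json:"buckets_due"`
        Errors24h    int           `json:"errors_24h"`
        QueueSize    int           `json:"queue_size"`
        RecentErrors []BucketError `json:"recent_errors"` // top 5 failing buckets
    }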
Previous deploys emitted 4 warnings on the discovery-tick Pod template
against the restricted:latest policy. Today these are only warnings; if
the namespace moves to enforcement, admission will reject the Pod and the
ticks will silently stop.
Pod-level: runAsNonRoot, runAsUser/runAsGroup 100 (curlimages/curl's
built-in non-root UID), seccompProfile RuntimeDefault.
Container-level: allowPrivilegeEscalation false, capabilities drop ALL.
pgx cannot implicitly encode an int argument as text for the
`$1 || ' month'` concatenation pattern (error: "unable to encode 12 into
text format for text (OID 25): cannot find encode plan"). Multiplying a
known interval by the int parameter (`interval '1 month' * $1`) works
directly and is semantically equivalent; see the sketch below.
Discovered during the T19 smoke test — the tick endpoint returned 500
on every call before this fix.
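
Both forms side by side in a hypothetical picker query (the real
PickStaleBuckets query differs in shape); months stays a plain Go int.

    package discovery

    import (
        "context"

        "github.com/jackc/pgx/v5"
    )

    func pickStaleIDs(ctx context.Context, conn *pgx.Conn, months int) ([]int64, error) {
        // Fails: ... < now() - ($1 || ' month')::interval
        //   pgx would have to encode the int as text (OID 25) and refuses.
        // Works: multiply a known interval by the int parameter instead.
        rows, err := conn.Query(ctx, `
            SELECT id FROM discovery_buckets
            WHERE last_queried_at IS NULL
               OR last_queried_at < now() - interval '1 month' * $1`,
            months)
        if err != nil {
            return nil, err
        }
        defer rows.Close()

        ids := make([]int64, 0)
        for rows.Next() {
            var id int64
            if err := rows.Scan(&id); err != nil {
                return nil, err
            }
            ids = append(ids, id)
        }
        return ids, rows.Err()
    }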
Adds a batch/v1 CronJob that POSTs to /api/v1/admin/discovery/tick on a
configurable schedule (default every 4h). Wires DISCOVERY_TOKEN into the
ci-secrets Secret and projects discovery/AI env vars into the backend
Deployment.
Construct discoveryRepo, discoveryAgent, discoveryService, and
discoveryHandler in registerRoutes(); register all 4 discovery routes on
/api/v1, with a bearer-token guard on /tick and an admin-session guard on
the queue-management endpoints. A sketch of the wiring follows.
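
A sketch of that wiring, assuming a chi router; the handler shape and
guard implementations are invented, and the non-tick paths are the ones
that appear elsewhere in these notes.

    package server

    import (
        "net/http"
        "os"

        "github.com/go-chi/chi/v5"
    )

    // Handler methods as plain http.HandlerFuncs, purely for the sketch.
    type discoveryHandler struct {
        Tick, Stats, ListQueue, PatchQueue http.HandlerFunc
    }

    func registerDiscoveryRoutes(r chi.Router, h discoveryHandler) {
        r.Route("/api/v1/admin/discovery", func(r chi.Router) {
            // Machine-to-machine: the CronJob authenticates via bearer token.
            r.With(bearerToken(os.Getenv("DISCOVERY_TOKEN"))).Post("/tick", h.Tick)

            // Humans: queue management sits behind the admin session.
            r.Group(func(r chi.Router) {
                r.Use(adminSession)
                r.Get("/stats", h.Stats)
                r.Get("/queue", h.ListQueue)
                r.Patch("/queue/{id}", h.PatchQueue)
            })
        })
    }

    func bearerToken(token string) func(http.Handler) http.Handler {
        return func(next http.Handler) http.Handler {
            return http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
                if req.Header.Get("Authorization") != "Bearer "+token {
                    http.Error(w, "unauthorized", http.StatusUnauthorized)
                    return
                }
                next.ServeHTTP(w, req)
            })
        }
    }

    // Stand-in: the real guard validates the admin session.
    func adminSession(next http.Handler) http.Handler { return next }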
Normalizes market names for dedup matching: lowercase, umlaut expansion,
punctuation stripping, whitespace collapse, and leading/trailing filler
word removal. Stripping is guarded so edge fillers are preserved when the
remaining content is purely numeric (e.g. 'Markt 2026' stays
'markt 2026'); a sketch follows.
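
A sketch of the pipeline; the filler list here is a tiny illustrative
subset of the real one, and 'ß' is folded alongside the umlauts.

    package discovery

    import (
        "regexp"
        "strings"
    )

    var (
        umlauts = strings.NewReplacer("ä", "ae", "ö", "oe", "ü", "ue", "ß", "ss")
        punct   = regexp.MustCompile(`[^\p{L}\p{N} ]+`)
        spaces  = regexp.MustCompile(` +`)
        numeric = regexp.MustCompile(`^[\p{N} ]+$`)
        // Tiny illustrative subset of the real filler list.
        fillers = map[string]bool{"der": true, "die": true, "das": true, "markt": true}
    )

    // NormalizeName reduces a market name to its dedup key.
    func NormalizeName(name string) string {
        s := strings.ToLower(name)
        s = umlauts.Replace(s)
        s = punct.ReplaceAllString(s, " ")
        s = strings.TrimSpace(spaces.ReplaceAllString(s, " "))

        // Strip leading/trailing fillers, but keep them when the remaining
        // content is purely numeric: "Markt 2026" must stay "markt 2026".
        words := strings.Fields(s)
        for len(words) > 0 && fillers[words[0]] {
            words = words[1:]
        }
        for len(words) > 0 && fillers[words[len(words)-1]] {
            words = words[:len(words)-1]
        }
        rest := strings.Join(words, " ")
        if rest == "" || numeric.MatchString(rest) {
            return s // guard tripped: preserve the edge fillers
        }
        return rest
    }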
The home page dropped `plz` server-side, so /markets was called with
radius but no center (unfiltered) and the PLZ input rendered empty
after reload. +page.server.ts now reads plz, geocodes via /geocode,
and echoes plz back in searchParams for form rehydration.
Relaxes the /geocode DTO + guard to accept a PLZ without a city, since
Nominatim already supports postal-only lookups. URL lat/lon (the GPS
flow) take priority over plz when both are present; geocode failures fall
through to no geo-filter so the page always renders.
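
A sketch of the relaxed guard, with assumed field names; previously City
was required unconditionally.

    package geo

    import "errors"

    // geocodeQuery mirrors the relaxed DTO: city is optional once a PLZ
    // is present, since Nominatim resolves postal-only lookups.
    type geocodeQuery struct {
        PLZ  string `json:"plz"`
        City string `json:"city"`
    }

    func (q geocodeQuery) validate() error {
        if q.PLZ == "" && q.City == "" {
            return errors.New("plz or city required")
        }
        return nil
    }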
Post-process adapter-node output into a single self-contained
build/bundle.mjs (614 KB) via esbuild, then ship it on a fresh
alpine:3.21 base with just the node binary copied in. Drops the
node_modules + package manager baggage that comes with node:25-alpine.
- Add esbuild devDep + `bundle` script (scripts/bundle.mjs)
- Dockerfile: drop `deps` stage; final stage is alpine + node binary +
bundle + static client assets
- Uncompressed image: 177 MB -> 149 MB (-16%)
- Verified: /, /healthz, static assets all respond identically;
outbound TLS to api.marktvogt.de works via node's built-in CA bundle