# BincioActivity — Context for Claude

## What this project is

BincioActivity is a federated, open-source, self-hosted activity stats platform (think personal Strava). Two-stage pipeline:

1. **`bincio extract`** (Python): GPX/FIT/TCX → BAS JSON data store
2. **`bincio render`** (Astro/Node): BAS data store → static website

The BAS (BincioActivity Schema) JSON files are the federation protocol. Anyone can publish their data as BAS JSON and others can include it.

## Key design decisions

- **No database, no server** — everything is static files
- **Python with uv** for the extract stage
- **Astro + Svelte + Tailwind + MapLibre GL + Observable Plot** for the site
- **Haversine** (not geopy) for distance calculations (10x faster)
- **Worker initializer pattern** for ProcessPoolExecutor — large shared data (strava_lookup dict, known_hashes frozenset) is sent once per worker via `initializer=`, not once per task
- **BAS activity IDs** always use UTC with Z suffix for URL safety
- **TCX files** from Garmin use both `http://` and `https://` namespace URIs — parser handles both

## User's data

- Source: `~/src/cycling_data_davide/`
  - `activities/` — Strava export (GPX, FIT, TCX, all with .gz variants)
  - `Karoo_2026/` — recent Karoo device FIT files
  - `Karoo/` — older Karoo FIT files
  - `activities.csv` — Strava metadata (names, descriptions, gear)
- Extracted output: `~/bincio_data/` (or `/tmp/bincio_test/` for testing)
- ~3,200 input files → ~2,082 unique activities after dedup
- Date range: 2014–2026

## Project structure

```
bincio/                        Python package
  extract/
    models.py                  DataPoint, ParsedActivity, LapData
    parsers/                   GPX, FIT, TCX parsers + factory
    sport.py                   sport name normalisation
    metrics.py                 haversine-based stats computation (single pass)
    timeseries.py              downsample to 1Hz, build BAS timeseries object
    simplify.py                RDP track simplification → GeoJSON
    dedup.py                   exact (hash) + near-duplicate detection
    strava_csv.py              Strava activities.csv importer
    writer.py                  BAS JSON + GeoJSON writer
    config.py                  extract_config.yaml loader
    cli.py                     `bincio extract` CLI
  render/
    cli.py                     `bincio render` CLI (symlinks data, runs astro build/dev)
schema/
  bas-v1.schema.json           JSON Schema for BAS
  SCHEMA.md                    Human-readable BAS spec
site/                          Astro project
  src/
    layouts/Base.astro
    pages/
      index.astro              Activity feed (loads index.json client-side)
      activity/[id].astro      Single activity (SSG, loads detail JSON client-side)
      stats/index.astro        Heatmap + year totals
    components/
      ActivityFeed.svelte      Card grid, sport filter, pagination
      ActivityDetail.svelte    Map + stats + charts wrapper
      ActivityMap.svelte       MapLibre GL (gradient track, linked hover dot)
      ActivityCharts.svelte    Observable Plot (elevation/speed/HR/cadence tabs)
      StatsView.svelte         Yearly heatmap + totals
    lib/
      types.ts                 BAS TypeScript types
      format.ts                formatDistance, formatDuration, sportIcon, etc.
```

## How to run

```bash
# Extract
cd ~/src/bincio_activity
uv run bincio extract --input ~/src/cycling_data_davide/activities --output /tmp/bincio_test

# Site dev server
cd site
ln -sf /tmp/bincio_test public/data   # symlink data
BINCIO_DATA_DIR=/tmp/bincio_test npm run dev

# Tests
uv run pytest
```

## MapLibre GL + Vite/Astro — known gotchas

Learnt the hard way during debugging (March 2026):

- **`maplibregl.workerUrl = ...` is the v3 API and silently no-ops in v4+.** The v5 API is `maplibregl.setWorkerUrl(url)`, but you don't need it at all in a normal Vite environment — MapLibre handles the blob worker automatically.
- **`optimizeDeps: { exclude: ['maplibre-gl'] }` breaks tile loading.** It prevents Vite from converting MapLibre's UMD bundle to ESM. The UMD bundle uses AMD `define()` internally; served raw, the tile worker blob fails silently → black map, no tiles. The correct setting is `include: ['maplibre-gl']`.
- **`build.target: 'es2022'` (and `optimizeDeps.esbuildOptions.target`) is required.** MapLibre's dependencies use ES2022 class field syntax.
  If esbuild downgrades it, helpers like `__publicField` aren't available inside the serialised worker blob scope → tile loading fails. This is a known upstream issue (maplibre-gl-js #6680).
- **Use static imports, not dynamic `await import('maplibre-gl')`, when possible.** With `client:only="svelte"` in Astro, SSR never runs for the component so there is no `window is not defined` risk. Static import lets Vite pre-bundle correctly.
- **Use `client:only="svelte"` (not `client:load`) for the activity detail page.** `client:load` does SSR + hydration; complex interactive components with MapLibre can hit hydration mismatch issues. `client:only` mounts fresh on the client only.
- **MapLibre v5 requires explicit `center` and `zoom` in the Map constructor.** v4 silently defaulted to `center: [0,0], zoom: 0`. v5 leaves internal projection state undefined → `Cannot read properties of undefined (reading 'lng')` crashes on any operation that touches coordinates (markers, resize, render). Always pass `center` and `zoom` even if you plan to `fitBounds` later.
- **MapLibre v5 requires `setLngLat()` on markers before `.addTo(map)`.** v4 tolerated markers without coordinates. v5 calls `Marker._update()` inside `addTo()`, which needs valid lngLat → same `'lng'` crash. Set a dummy `[0, 0]` if the real position arrives later (e.g. hover markers).

## Observable Plot — known gotchas

- **Curve names are hyphenated, not camelCase.** Use `"monotone-x"`, not `"monotoneX"`. Plot uses its own curve name registry (not raw d3 identifiers). Wrong names throw `unknown curve` at runtime.

The working `astro.config.mjs` Vite section (this covers the MapLibre settings above, not Plot):

```js
vite: {
  optimizeDeps: {
    include: ['maplibre-gl'],
    esbuildOptions: { target: 'es2022' },
  },
  build: { target: 'es2022' },
},
```

## StatsView heatmap — colour intensity scaling

Two approaches have been tried. The **active one is percentile-based** (preferred for now).

### Option A — Linear / max-relative (simpler, currently inactive)

```ts
$: maxDailyKm = Math.max(...[...byDate.values()].map(v => v / 1000), 1);

// inside cellColors loop:
const km = total / 1000;
const intensity = Math.min(0.12 + (km / maxDailyKm) * 0.88, 1.0);
```

- Busiest day = full brightness; all others scale linearly against it.
- Intuitive: you can visually read "this day was ~50% of my biggest day".
- Downside: one outlier (e.g. a 250 km day) compresses everything else into near-darkness. Cross-sport comparison is unfair (10 km run vs 10 km cycling look very different even when filtered to a single sport).
- Legend shows actual max km: `More (X km max)`.

### Option B — Percentile rank (active)

```ts
$: sortedDaily = [...byDate.values()].sort((a, b) => a - b);

function pctRank(value: number, sorted: number[]): number {
  if (!sorted.length) return 0;
  let lo = 0, hi = sorted.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (sorted[mid] <= value) lo = mid + 1;
    else hi = mid;
  }
  return lo / sorted.length;
}

// inside cellColors loop:
const intensity = 0.12 + pctRank(total, sortedDaily) * 0.88;
```

- Each day is ranked against all other active days; the busiest = 1.0, and the laziest active day lands just above 0.12 (`pctRank` counts values `<=` the day's total, so with n active days it returns 1/n for the lowest, not 0). The colour scale spreads evenly regardless of km gaps.
- GitHub-contribution-graph style: easy to see "busy vs quiet" relative to your own habits.
- Downside: absolute effort is not visible. A 5 km walk and a 200 km ride can look the same if they're both 95th-percentile days for their respective sports.
- Legend says `More (percentile · max X km)` to hint at both dimensions.

### Shared infrastructure

- Blended colours: in "All" sport view, each cell's RGB is a weighted average of sport colours by distance that day.
- `applyIntensity(hex, t)`: lerps from zinc-800 (#27272a = 39,39,42) to the target colour, so dim cells fade into the background rather than going black.
- `$: cellColors = Map` — precomputed reactively so Svelte detects the dependency change when the sport filter or scale method changes (plain function calls with static args don't trigger Svelte re-renders).

## Activity sidecar edits — design spec

Users edit activities via **sidecar markdown files** that live alongside BAS JSON in the data dir. No database, no server — consistent with the project's static-files-only philosophy.

### File naming

```
~/bincio_data/
  2024-05-15T10:30:00Z_cycling.json    ← immutable extract output (never touched)
  2024-05-15T10:30:00Z_cycling.md      ← user edits (sidecar)
```

Same stem as the JSON, `.md` extension. `bincio extract` never writes `.md` files, so re-running extract is always safe and will never clobber user edits. Note that the Decided section below moves sidecars into an `edits/` subdirectory instead of co-locating them with the JSON; the same-stem rule is unchanged.

### Sidecar format

YAML frontmatter + optional Markdown body:

```markdown
---
title: "Epic climb up Monte Grappa"
sport: cycling            # override detected sport
hide_stats: [cadence]     # suppress specific stat panels in detail view
highlight: true           # pin/feature in feed (shown first, maybe badged)
private: false            # exclude from public feed
gear: "Trek Domane"       # freeform gear note
---

Rode with Marco and Giulia. Legs felt great after the rest week...
```

- All frontmatter keys are optional; omit means "keep extracted value"
- The Markdown body becomes the activity's `description`, rendered as HTML in the detail page
- `hide_stats` takes stat panel names: `elevation`, `speed`, `heart_rate`, `cadence`, `power`

### Where overrides are applied: the render stage

The **render stage** (`bincio render`) is the right place — not extract, not the browser.

- Extract → clean BAS JSON (immutable)
- Render → merges sidecars → Astro build consumes enriched data

A `bincio.render.merge` module walks the data dir, finds `*.md` sidecars, and produces either enriched JSON files or a separate `overrides/index.json` that Astro reads at build time.
The site never needs to fetch a `.md` file at runtime — all merging is build-time, keeping the static-first guarantee.

### Federation angle

Sidecars work for *remote* activities too: if you include someone else's BAS feed, you can write local `.md` sidecars for their activity IDs. Your render stage applies your overrides on top of their data. This is a natural extension of the local case.

### Editing UX: `bincio edit --serve`

A separate FastAPI server (`bincio edit --serve`, default port 4041) handles all writes. The static site and Astro are untouched — no hybrid mode, no dead-code API routes in prod.

**How it works:**

```
bincio edit --serve --data ~/bincio_data   # starts on :4041
```

- Serves a bundled Svelte UI (single compiled HTML, reuses existing Svelte investment)
- `GET /api/activity/{id}` — returns merged BAS JSON + existing sidecar fields
- `POST /api/activity/{id}` — writes `edits/{id}.md` to the data dir
- `POST /api/activity/{id}/images` — multipart upload → `edits/images/{id}/{filename}`
- The Astro dev server's file watcher picks up `.md` writes → incremental rebuild

**Edit UI features:**

- Title text input (pre-filled from BAS JSON)
- Sport dropdown (pre-filled, shows all known sport types)
- Markdown textarea for description, with minimal toolbar (bold, italic, link, image insert)
- Live markdown preview panel
- `hide_stats` checkbox group: elevation, speed, heart_rate, cadence, power
- `highlight` toggle (feature in feed)
- `private` toggle (suppress from feed at render time)
- Image drag-and-drop zone → uploads to `edits/images/{id}/`, inserts `![]()` into textarea
- Save button → POST to API → success toast

**Workflow (typical):**

1. User browses the Astro dev server on :4040
2. Activity detail page has an "Edit" button (rendered only when `PUBLIC_EDIT_URL` env var is set)
3. Button links to `:4041/edit/{id}` — opens the FastAPI-served edit UI
4. User fills in form, saves → sidecar written → Astro rebuilds → refreshing :4040 shows changes

The `PUBLIC_EDIT_URL` env var in `.env` controls whether the Edit button appears; leave it unset for production builds, set to `http://localhost:4041` for local dev.

### Image storage

```
~/bincio_data/
  edits/
    2024-05-15T10:30:00Z_cycling.md
    images/
      2024-05-15T10:30:00Z_cycling/
        col-summit.jpg
        group-photo.jpg
```

Images are referenced in the markdown body with relative paths: `![Summit](col-summit.jpg)`. The render stage resolves relative image paths against `edits/images/{id}/` and copies them to `site/public/images/activities/{id}/` so they're served from the static site.

### Decided

- **Sidecar location**: `edits/` subdirectory (not co-located with JSON) — cleaner, easier to backup/sync just your customisations independently of the extracted data
- **`private: true`**: suppresses from `index.json` at render time (not client-side hide) — safer for public hosting
- **`highlight`**: visual badge in feed + sorted before non-highlighted activities
- **Edit UI**: `bincio edit --serve` FastAPI server (Option B) — not integrated into Astro

## Known issues / next steps

- `bincio render` Python CLI is a stub — site is built via `npm run build` directly
- Activity IDs in existing test data still use `+0000` format (pre-fix); re-run extract to get `Z` format
- Some activities appear with both untitled and titled IDs (near-dedup timing race)
- Stats page heatmap month labels are embedded in the week-column flex grid (fixed March 2026); `getWeeks` uses `localISO()` not `toISOString()` to avoid UTC/local date mismatch
- Federation (remote data sources) not yet implemented in site
- Friends pages (`/friends/{handle}/`) not yet implemented
- `bincio render` should automate: symlink data → `astro build`
- The `site/.env` file is gitignored — document the setup for new users
- Add `--workers` benchmark: on 8 cores, ~7 min for 3,200 activities first run

## What "good" looks like (not yet done)

- [ ] `bincio render` Python CLI wraps `astro build` properly
- [ ] Friends/federation pages in site
- [ ] Personal records page
- [ ] Activity search / full-text filter in feed
- [ ] Map thumbnail in activity cards (SVG path from GeoJSON)
- [ ] GitHub Actions template for auto-publish
- [ ] Karoo/Garmin Connect importers beyond Strava
- [ ] `bincio.render.merge` module: walk `edits/`, parse sidecars, produce enriched data for Astro
- [ ] `bincio render --watch` incremental rebuild on sidecar changes
- [ ] Sidecar `.md` format: title, sport, description, hide_stats, highlight, private, images
- [ ] `bincio edit --serve` FastAPI server with Svelte edit UI (port 4041)
- [ ] Edit button on activity detail pages (visible when `PUBLIC_EDIT_URL` env var set)
- [ ] Image upload → `edits/images/{id}/`, render stage copies to `public/images/activities/{id}/`
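
The `bincio.render.merge` item in the checklist could start from something like the sketch below. It is an illustration under stated assumptions, not the project's implementation: the function names (`parse_sidecar`, `merge_activity`, `build_feed`) are hypothetical, the frontmatter parser only handles the flat `key: value` subset shown in the sidecar example (a real module would more likely use a YAML library), and overlaying sidecar fields onto top-level activity keys is a guess at how they map onto BAS JSON.

```python
from __future__ import annotations


def _strip_comment(line: str) -> str:
    """Drop a trailing `# comment` unless the `#` sits inside quotes."""
    quote = None
    for i, ch in enumerate(line):
        if ch in "\"'":
            quote = None if quote == ch else (quote or ch)
        elif ch == "#" and quote is None and (i == 0 or line[i - 1].isspace()):
            return line[:i]
    return line


def _parse_value(raw: str):
    """Parse the small value subset in the sidecar example: lists, booleans, strings."""
    raw = raw.strip()
    if raw.startswith("[") and raw.endswith("]"):
        inner = raw[1:-1].strip()
        return [_parse_value(item) for item in inner.split(",")] if inner else []
    if raw in ("true", "false"):
        return raw == "true"
    if len(raw) >= 2 and raw[0] == raw[-1] and raw[0] in "\"'":
        return raw[1:-1]
    return raw


def parse_sidecar(text: str) -> tuple[dict, str]:
    """Split a sidecar into (frontmatter fields, Markdown body)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text.strip()  # no frontmatter: the whole file is the body
    closing = [i for i, ln in enumerate(lines) if i > 0 and ln.strip() == "---"]
    if not closing:
        return {}, text.strip()
    end = closing[0]
    fields = {}
    for ln in lines[1:end]:
        ln = _strip_comment(ln).strip()
        if ":" not in ln:
            continue
        key, _, value = ln.partition(":")
        fields[key.strip()] = _parse_value(value)
    return fields, "\n".join(lines[end + 1:]).strip()


def merge_activity(activity: dict, fields: dict, body: str) -> dict:
    """Overlay sidecar fields on a copy of the extracted activity.

    Assumption: sidecar keys map directly onto top-level activity keys.
    Omitted keys keep the extracted value; a non-empty body becomes `description`.
    """
    merged = {**activity, **fields}
    if body:
        merged["description"] = body
    return merged


def build_feed(activities: list[dict]) -> list[dict]:
    """Drop private activities, then list highlighted ones first."""
    visible = [a for a in activities if not a.get("private", False)]
    # sorted() is stable, so the incoming order is kept within each group.
    return sorted(visible, key=lambda a: not a.get("highlight", False))
```

Filtering `private` activities inside `build_feed` means they are gone before any feed file is written, which matches the decision to suppress them from `index.json` at render time rather than hiding them client-side.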