# BincioActivity — Context for Claude

## What this project is

BincioActivity is a federated, open-source, self-hosted activity stats platform (think personal Strava).

Two-stage pipeline:

1. **`bincio extract`** (Python): GPX/FIT/TCX → BAS JSON data store
2. **`bincio render`** (Astro/Node): BAS data store → static website

The BAS (BincioActivity Schema) JSON files are the federation protocol. Anyone can publish their data as BAS JSON and others can include it.

## Key design decisions

- **No database, no server** — everything is static files
- **Python with uv** for the extract stage
- **Astro + Svelte + Tailwind + MapLibre GL + Observable Plot** for the site
- **Haversine** (not geopy) for distance calculations (10x faster)
- **Worker initializer pattern** for ProcessPoolExecutor — large shared data (strava_lookup dict, known_hashes frozenset) is sent once per worker via `initializer=`, not once per task
- **BAS activity IDs** always use UTC with Z suffix for URL safety
- **TCX files** from Garmin use both `http://` and `https://` namespace URIs — parser handles both

## User's data

- Source: `~/src/cycling_data_davide/`
  - `activities/` — Strava export (GPX, FIT, TCX, all with .gz variants)
  - `Karoo_2026/` — recent Karoo device FIT files
  - `Karoo/` — older Karoo FIT files
  - `activities.csv` — Strava metadata (names, descriptions, gear)
- Extracted output: `~/bincio_data/` (or `/tmp/bincio_test/` for testing)
- ~3,200 input files → ~2,082 unique activities after dedup
- Date range: 2014–2026

## Project structure

```
bincio/                       Python package
  extract/
    models.py                 DataPoint, ParsedActivity, LapData
    parsers/                  GPX, FIT, TCX parsers + factory
    sport.py                  sport name normalisation
    metrics.py                haversine-based stats computation (single pass)
    timeseries.py             downsample to 1Hz, build BAS timeseries object
    simplify.py               RDP track simplification → GeoJSON
    dedup.py                  exact (hash) + near-duplicate detection
    strava_csv.py             Strava activities.csv importer
    writer.py                 BAS JSON + GeoJSON writer
    config.py                 extract_config.yaml loader
    cli.py                    `bincio extract` CLI
  render/
    cli.py                    `bincio render` CLI (symlinks data, runs astro build/dev)
schema/
  bas-v1.schema.json          JSON Schema for BAS
  SCHEMA.md                   Human-readable BAS spec
site/                         Astro project
  src/
    layouts/Base.astro
    pages/
      index.astro             Activity feed (loads index.json client-side)
      activity/[id].astro     Single activity (SSG, loads detail JSON client-side)
      stats/index.astro       Heatmap + year totals
    components/
      ActivityFeed.svelte     Card grid, sport filter, pagination
      ActivityDetail.svelte   Map + stats + charts wrapper
      ActivityMap.svelte      MapLibre GL (gradient track, linked hover dot)
      ActivityCharts.svelte   Observable Plot (elevation/speed/HR/cadence/power tabs)
      StatsView.svelte        Yearly heatmap + totals
    lib/
      types.ts                BAS TypeScript types
      format.ts               formatDistance, formatDuration, sportIcon, etc.
```

## How to run

```bash
# Extract
cd ~/src/bincio_activity
uv run bincio extract --input ~/src/cycling_data_davide/activities --output /tmp/bincio_test

# Site dev server
cd site
ln -sf /tmp/bincio_test public/data   # symlink data
BINCIO_DATA_DIR=/tmp/bincio_test npm run dev

# Tests
uv run pytest
```

## MapLibre GL + Vite/Astro — known gotchas

Learnt the hard way during debugging (March 2026):

- **`maplibregl.workerUrl = ...` is the v3 API and silently no-ops in v4+.** The v5 API is `maplibregl.setWorkerUrl(url)`, but you don't need it at all in a normal Vite environment — MapLibre handles the blob worker automatically.
- **`optimizeDeps: { exclude: ['maplibre-gl'] }` breaks tile loading.** It prevents Vite from converting MapLibre's UMD bundle to ESM. The UMD bundle uses AMD `define()` internally; served raw, the tile worker blob fails silently → black map, no tiles. The correct setting is `include: ['maplibre-gl']`.
- **`build.target: 'es2022'` (and `optimizeDeps.esbuildOptions.target`) is required.** MapLibre's dependencies use ES2022 class field syntax. If esbuild downgrades it, helpers like `__publicField` aren't available inside the serialised worker blob scope → tile loading fails. This is a known upstream issue (maplibre-gl-js #6680).
- **Use static imports, not dynamic `await import('maplibre-gl')`, when possible.** With `client:only="svelte"` in Astro, SSR never runs for the component so there is no `window is not defined` risk. Static import lets Vite pre-bundle correctly.
- **Use `client:only="svelte"` (not `client:load`) for the activity detail page.** `client:load` does SSR + hydration; complex interactive components with MapLibre can hit hydration mismatch issues. `client:only` mounts fresh on the client only.
- **MapLibre v5 requires explicit `center` and `zoom` in the Map constructor.** v4 silently defaulted to `center: [0,0], zoom: 0`. v5 leaves internal projection state undefined → `Cannot read properties of undefined (reading 'lng')` crashes on any operation that touches coordinates (markers, resize, render). Always pass `center` and `zoom` even if you plan to `fitBounds` later.
- **MapLibre v5 requires `setLngLat()` on markers before `.addTo(map)`.** v4 tolerated markers without coordinates. v5 calls `Marker._update()` inside `addTo()`, which needs valid lngLat → same `'lng'` crash. Set a dummy `[0, 0]` if the real position arrives later (e.g. hover markers).

## Observable Plot — known gotchas

- **Curve names are hyphenated, not camelCase.** Use `"monotone-x"`, not `"monotoneX"`. Plot uses its own curve name registry (not raw d3 identifiers). Wrong names throw `unknown curve` at runtime.

The working `astro.config.mjs` Vite section (it applies to the MapLibre settings above):

```js
vite: {
  optimizeDeps: {
    include: ['maplibre-gl'],
    esbuildOptions: { target: 'es2022' },
  },
  build: { target: 'es2022' },
},
```

## StatsView heatmap — colour intensity scaling

Two approaches have been tried. The **active one is percentile-based** (preferred for now).

### Option A — Linear / max-relative (simpler, currently inactive)

```ts
$: maxDailyKm = Math.max(...[...byDate.values()].map(v => v / 1000), 1);

// inside cellColors loop:
const km = total / 1000;
const intensity = Math.min(0.12 + (km / maxDailyKm) * 0.88, 1.0);
```

- Busiest day = full brightness; all others scale linearly against it.
- Intuitive: you can visually read "this day was ~50% of my biggest day".
- Downside: one outlier (e.g. a 250 km day) compresses everything else into near-darkness. Cross-sport comparison is unfair (10 km run vs 10 km cycling look very different even when filtered to a single sport).
- Legend shows actual max km: `More (X km max)`.

### Option B — Percentile rank (active)

```ts
$: sortedDaily = [...byDate.values()].sort((a, b) => a - b);

// Fraction of active days whose total is <= value (binary search on the sorted list).
function pctRank(value: number, sorted: number[]): number {
  if (!sorted.length) return 0;
  let lo = 0, hi = sorted.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (sorted[mid] <= value) lo = mid + 1;
    else hi = mid;
  }
  return lo / sorted.length;
}

// inside cellColors loop:
const intensity = 0.12 + pctRank(total, sortedDaily) * 0.88;
```

- Each day is ranked against all other active days; the laziest active day gets an intensity just above 0.12 (its rank is 1/n, not 0), the busiest = 1.0. The colour scale spreads evenly regardless of km gaps.
- GitHub-contribution-graph style: easy to see "busy vs quiet" relative to your own habits.
- Downside: absolute effort is not visible. A 5 km walk and a 200 km ride can look the same if they're both 95th-percentile days for their respective sports.
- Legend says `More (percentile · max X km)` to hint at both dimensions.

### Shared infrastructure

- Blended colours: in "All" sport view, each cell's RGB is a weighted average of sport colours by distance that day.
- `applyIntensity(hex, t)`: lerps from zinc-800 (#27272a = 39,39,42) to the target colour, so dim cells fade into the background rather than going black.
- `$: cellColors = Map` — precomputed reactively so Svelte detects the dependency change when the sport filter or scale method changes (plain function calls with static args don't trigger Svelte re-renders).

## ActivityCharts — controls and athlete zones

`ActivityCharts.svelte` renders Observable Plot charts for the activity detail page.

### Chart controls

- **Metric tabs**: Elevation · Speed · Heart Rate · Cadence · Power
- **Chart type toggle** (right-aligned): `↗ Line` | `▭ Hist`
- **X-axis toggle** (line mode only, shown when speed data present): `Time` | `Dist`
  - Distance is integrated from `speed_kmh` at 1 Hz — no extra data needed.
- **Histogram controls** (visible only in histogram mode):
  - **Dual range slider** — trims the x domain; two overlapping range inputs with CSS track highlight.
  - **Bins slider** — exact bin count using explicit evenly-spaced thresholds (not d3's "nice" count, which ignores narrow ranges).

### Athlete zones

Zones are configured in `extract_config.yaml` under `athlete:` and written into `index.json` at extract time (`owner.athlete`). The Astro activity page reads them from the index and passes them down: `[id].astro` → `ActivityDetail` → `ActivityCharts`.

When viewing HR or Power in histogram mode, zone boundaries are drawn as dashed vertical rules with Z1–Z5/Z7 labels at the top of the chart. Labels and rules are clipped to the current trim range automatically.

Zone color palettes:

- HR (5 zones): `#60a5fa #4ade80 #facc15 #fb923c #f87171`
- Power (7 zones): `#60a5fa #34d399 #facc15 #fb923c #f87171 #c084fc #f43f5e`

### Zone calculation reference (Coggan)

| Zone | HR (% max HR) | Power (% FTP) |
|------|---------------|---------------|
| Z1   | < 55%         | < 55%         |
| Z2   | 55–75%        | 55–75%        |
| Z3   | 75–87%        | 75–90%        |
| Z4   | 87–93%        | 90–105%       |
| Z5   | > 93%         | 105–120%      |
| Z6   | —             | 120–150%      |
| Z7   | —             | > 150%        |

## Activity sidecar edits — design spec

Users edit activities via **sidecar markdown files** that live alongside BAS JSON in the data dir. No database, no server — consistent with the project's static-files-only philosophy.

### File naming

```
~/bincio_data/
  2024-05-15T10:30:00Z_cycling.json   ← immutable extract output (never touched)
  2024-05-15T10:30:00Z_cycling.md     ← user edits (sidecar)
```

Same stem as the JSON, `.md` extension. `bincio extract` never writes `.md` files, so re-running extract is always safe and will never clobber user edits.

### Sidecar format

YAML frontmatter + optional Markdown body:

```markdown
---
title: "Epic climb up Monte Grappa"
sport: cycling          # override detected sport
hide_stats: [cadence]   # suppress specific stat panels in detail view
highlight: true         # pin/feature in feed (shown first, maybe badged)
private: false          # exclude from public feed
gear: "Trek Domane"     # freeform gear note
---

Rode with Marco and Giulia. Legs felt great after the rest week...
```

- All frontmatter keys are optional; omit means "keep extracted value"
- The Markdown body becomes the activity's `description`, rendered as HTML in the detail page
- `hide_stats` takes stat panel names: `elevation`, `speed`, `heart_rate`, `cadence`, `power`

### Where overrides are applied: the render stage

The **render stage** (`bincio render`) is the right place — not extract, not the browser.

- Extract → clean BAS JSON (immutable)
- Render → merges sidecars → Astro build consumes enriched data

A `bincio.render.merge` module walks the data dir, finds `*.md` sidecars, and produces either enriched JSON files or a separate `overrides/index.json` that Astro reads at build time. The site never needs to fetch a `.md` file at runtime — all merging is build-time, keeping the static-first guarantee.

### Federation angle

Sidecars work for *remote* activities too: if you include someone else's BAS feed, you can write local `.md` sidecars for their activity IDs. Your render stage applies your overrides on top of their data. This is a natural extension of the local case.
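
The build-time merge described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual `bincio.render.merge` code: `split_sidecar` and `apply_overrides` are hypothetical names, the frontmatter is assumed to have been parsed into a dict by a YAML library already, and the activity is treated as a flat dict.

```python
from copy import deepcopy


def split_sidecar(text: str) -> tuple[str, str]:
    """Split a sidecar into (frontmatter_text, markdown_body).

    Returns an empty frontmatter string when the file has no leading
    '---' block. Parsing the YAML itself is left to a YAML library.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text.strip()
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            front = "\n".join(lines[1:i])
            body = "\n".join(lines[i + 1:]).strip()
            return front, body
    # Opening '---' without a closing one: treat everything as body.
    return "", text.strip()


def apply_overrides(activity: dict, frontmatter: dict, body: str) -> dict:
    """Return a copy of `activity` with sidecar overrides layered on top.

    Keys absent from the frontmatter keep their extracted value, and the
    input dict (the immutable extract output) is never mutated.
    """
    merged = deepcopy(activity)
    merged.update(frontmatter)
    if body:
        # The Markdown body becomes the activity description.
        merged["description"] = body
    return merged
```

Returning a copy rather than mutating in place mirrors the rule that extract output stays untouched; the same function would apply unchanged to activities pulled from a remote BAS feed.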

### Editing UX: drawer in Astro + `bincio edit` write API

The edit UI is a **slide-in drawer** (`EditDrawer.svelte`) in the Astro site. The drawer fetches from and POSTs to the `bincio edit` FastAPI server (write API only — the server no longer serves its own HTML UI).

**How it works:**

```
bincio render --serve        # Astro dev server, port 4321
bincio edit --data-dir ~/…   # write API only, port 4041
```

- Edit button appears on the activity detail page **only when `PUBLIC_EDIT_URL` is set** in `site/.env`
- Clicking Edit opens the drawer in the same page — no navigation, no copy-pasting IDs
- Drawer fetches `GET /api/activity/{id}` to pre-fill, `POST /api/activity/{id}` to save
- After save: server runs `merge_all()` automatically → Astro serves updated data immediately on refresh
- Closing the drawer applies `title` + `description` changes optimistically to the local page state (no full reload required to see the text change)

**`PUBLIC_EDIT_URL` as feature flag:**

- **Unset** → no Edit button, no drawer. Works as a normal static site. Safe for public hosting.
- **Set** (e.g. `http://localhost:4041`) → editing enabled. Lives in `site/.env` (gitignored). Each deployment opts in explicitly.

**Edit server API (`bincio edit --data-dir`):**

- `GET /api/activity/{id}` — current values (sidecar overrides layered on BAS JSON)
- `POST /api/activity/{id}` — write sidecar `.md`, trigger `merge_all()`
- `POST /api/activity/{id}/images` — multipart upload → `edits/images/{id}/{filename}`
- `DELETE /api/activity/{id}/images/{filename}` — remove uploaded image

**Edit drawer features:**

- Title, sport dropdown, gear
- Markdown textarea for description (images inserted as `![name](filename)` references)
- Image drag-and-drop zone with chip list + delete
- Hide stat panels (elevation, speed, heart_rate, cadence, power) — toggle buttons
- Highlight flag (★ — sorts to top of feed, visual badge)
- Private flag (⊘ — suppressed from index at render time)

### Image storage and serving

```
~/bincio_data/
  edits/
    2024-05-15T10:30:00Z_cycling.md
    images/
      2024-05-15T10:30:00Z_cycling/
        col-summit.jpg
        group-photo.jpg
```

Images are referenced in the markdown body with relative paths: `![Summit](col-summit.jpg)`. `merge_all()` symlinks `edits/images/{id}/` → `_merged/activities/images/{id}/` so images are served at `data/activities/images/{id}/{filename}` by the Astro dev server. `ActivityDetail.svelte` rewrites relative image paths to this URL when rendering markdown.

**Note:** browsers cannot display `.HEIC` files. Convert to JPEG/PNG first: `sips -s format jpeg photo.HEIC --out photo.jpg` (macOS).

### Decided

- **Sidecar location**: `edits/` subdirectory — cleaner, easier to backup/sync independently
- **Merge output**: `data/_merged/` — extracted data stays pristine; `public/data` → `_merged/`
- **`private: true`**: suppressed from `index.json` at render time (not client-side hide)
- **`highlight`**: sorts to top of feed; visual badge TBD
- **Edit UI**: drawer in Astro site, `bincio edit` is a pure write API (no HTML serving)

## Athlete page — design plan

### Goal

A `/athlete` page (and `/athlete/edit` drawer) giving the user:

1. **Performance analytics** — power curve (MMP), best efforts, optionally fitness/freshness
2. **Profile editing** — zones, gear (bikes/shoes), personal data — no YAML editing required

### Mean Maximal Power (MMP) curve

For every duration D, the MMP is the highest average power sustained over any contiguous D-second window across all activities. Plotted on a log-scale x-axis.

**Key features:**

- **Time range filter**: all-time, last 30/90/365 days, or user-defined seasons
- **Season overlay**: multiple seasons plotted on the same chart for comparison (e.g. "2023 vs 2024 vs 2025" — this is the primary use case)
- **Durations**: a fixed log-scale set, e.g.: `1, 2, 5, 10, 15, 20, 30, 60, 120, 180, 300, 600, 1200, 1800, 3600` seconds
- **Null handling**: if an activity is shorter than duration D, it contributes nothing to that point. No interpolation. The curve simply ends where data runs out.
- **Modelled curve overlay** (future): 2-parameter Critical Power model fitted to the data; shows predicted W for any duration, even beyond recorded efforts.

**Where to compute:**

At **extract time**, each activity gets an `mmp` array:

```json
"mmp": [[1, 850], [5, 720], [30, 580], [300, 340], [3600, 210]]
```

Each pair is `[duration_s, avg_watts]`. Only activities with power data get this field.

The site then takes the **element-wise max** across all activities (filtered by date range). This keeps the site fully static — no server needed to render the curve.

Computing MMP per activity is O(n × D) where n = timeseries length, D = number of duration points (~15). At 1 Hz, a 2-hour ride is 7200 points × 15 durations = trivial. Use a sliding window approach: for each duration d, maintain a running sum and advance the window one sample at a time.
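
A minimal sketch of the sliding-window computation and the element-wise max, under stated assumptions: `mmp_curve` and `best_of` are hypothetical names (nothing in `metrics.py` is implied), the power series is assumed to be contiguous 1 Hz samples with no gaps or nulls, and averages are rounded to whole watts.

```python
def mmp_curve(power: list[int], durations: list[int]) -> list[list[int]]:
    """Mean maximal power per duration for one activity.

    `power` is assumed to be a gap-free 1 Hz series in watts.
    Returns [[duration_s, avg_watts], ...]; durations longer than the
    activity are skipped (no interpolation).
    """
    n = len(power)
    curve = []
    for d in durations:
        if d > n:
            continue  # activity too short for this duration
        window = sum(power[:d])  # running sum over the first window
        best = window
        for i in range(d, n):
            # Slide by one sample: add the new one, drop the oldest.
            window += power[i] - power[i - d]
            if window > best:
                best = window
        curve.append([d, round(best / d)])
    return curve


def best_of(curves: list[list[list[int]]]) -> dict[int, int]:
    """Element-wise max across per-activity curves, keyed by duration."""
    best: dict[int, int] = {}
    for curve in curves:
        for d, watts in curve:
            if d not in best or watts > best[d]:
                best[d] = watts
    return best
```

Each duration costs one pass over the series, which gives the O(n × D) bound above. The `best_of` step is shown in Python only for illustration; on the site it would run client-side over the `mmp` arrays of the activities in the selected date range.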

**Season definition** (user-configurable):

```yaml
athlete:
  seasons:
    - name: "2025"
      start: "2025-01-01"
      end: "2025-12-31"
    - name: "2024"
      start: "2024-01-01"
      end: "2024-12-31"
```

If no seasons defined, the UI offers fixed presets (last 30d / 90d / 365d / all-time).

### Athlete profile editing — reusing edit infrastructure

Same pattern as activity editing:

```
bincio edit --data-dir ~/bincio_data   # same server, new endpoints
```

New API endpoints:

- `GET /api/athlete` — current athlete config (zones, gear, display name)
- `POST /api/athlete` — write `edits/athlete.yaml`, trigger `merge_all()`

`edits/athlete.yaml` format:

```yaml
display_name: "Davide"
handle: "brutsalvadi"
max_hr: 190
ftp_w: 210
hr_zones:
  - [0, 104]
  - [104, 142]
  - [142, 165]
  - [165, 176]
  - [176, 999]
power_zones:
  - [0, 115]
  # ...
gear:
  bikes:
    - name: "Trek Domane"
      type: cycling
      notes: "Road endurance"
  shoes:
    - name: "Asics GT-2000"
      type: running
seasons:
  - name: "2025"
    start: "2025-01-01"
    end: "2025-12-31"
```

The server reads `extract_config.yaml` as base defaults, applies `edits/athlete.yaml` overrides on top, and writes back to `edits/athlete.yaml` on POST. The `extract_config.yaml` is never written by the server — it stays as the authoritative static config.

`merge_all()` also writes athlete data into `_merged/athlete.json` which the site reads.

### AthleteDrawer.svelte (profile editing)

Reuses the same drawer pattern as `EditDrawer.svelte`:

- Number inputs for `max_hr`, `ftp_w`
- Zone editor: table of rows `[lo, hi]` with + / − buttons; auto-fills `lo` from previous `hi`
- Gear list: add/remove bikes and shoes; name + type + notes fields
- Season list: add/remove date ranges with names

### Site page: `/athlete`

Two tabs or sections:

1. **Performance** — MMP curve chart (Observable Plot, log x-axis), date range selector
2. **Profile** — display of current zones, gear list; Edit button opens AthleteDrawer

The MMP chart uses `index.json`'s `activities` array (already loaded by the feed) — filter to power-having activities, pull their `mmp` arrays, take element-wise max per season.

### Implementation order

1. Add `mmp` computation to `metrics.py` and writer
2. Add `mmp` field to BAS schema and `types.ts`
3. Add `/api/athlete` GET+POST to the edit server
4. `merge_all()` writes `_merged/athlete.json`
5. Astro page `site/src/pages/athlete/index.astro`
6. `MmpChart.svelte` — Observable Plot line, log-scale x, multi-season overlay
7. `AthleteDrawer.svelte` — zones + gear editing form
8. Season config in `extract_config.yaml` / `edits/athlete.yaml`

## Known issues / next steps

- `bincio render` Python CLI is a stub — site is built via `npm run build` directly
- Activity IDs in existing test data still use `+0000` format (pre-fix); re-run extract to get `Z` format
- Some activities appear with both untitled and titled IDs (near-dedup timing race)
- Stats page heatmap month labels are embedded in the week-column flex grid (fixed March 2026); `getWeeks` uses `localISO()` not `toISOString()` to avoid UTC/local date mismatch
- Federation (remote data sources) not yet implemented in site
- Friends pages (`/friends/{handle}/`) not yet implemented
- `bincio render` should automate: symlink data → `astro build`
- The `site/.env` file is gitignored — document the setup for new users
- Add `--workers` benchmark: on 8 cores, ~7 min for 3,200 activities first run

## What "good" looks like (not yet done)

- [ ] `bincio render` Python CLI wraps `astro build` properly
- [ ] Friends/federation pages in site
- [ ] Athlete page: MMP power curve with season overlay
- [ ] Athlete page: profile editor (zones, gear, seasons) via AthleteDrawer
- [ ] MMP computation at extract time → `mmp` field in BAS JSON
- [ ] Personal records page (best efforts: 5km, 10km, etc.)
- [ ] Activity search / full-text filter in feed
- [ ] Map thumbnail in activity cards (SVG path from GeoJSON)
- [ ] GitHub Actions template for auto-publish
- [ ] Karoo/Garmin Connect importers beyond Strava
- [x] `bincio.render.merge` — sidecar parser, `_merged/` output, private filter, highlight sort
- [x] `bincio edit` FastAPI write API (GET/POST activity, image upload/delete, triggers merge)
- [x] `EditDrawer.svelte` — slide-in edit UI in the Astro site (no separate HTML from server)
- [x] `PUBLIC_EDIT_URL` feature flag — unset = no edit UI, set = drawer enabled
- [x] Markdown rendering in activity description with image path rewriting
- [x] `hide_stats` support in activity detail stats panel
- [x] ActivityCharts power tab (elevation/speed/HR/cadence/power)
- [x] Chart type toggle: line ↔ histogram
- [x] X-axis toggle: time ↔ distance (integrated from speed)
- [x] Histogram dual range slider + bins slider (exact thresholds)
- [x] Athlete zones in `extract_config.yaml` → `index.json` → chart overlays
- [x] StatsView heatmap click-to-pin tooltip (Esc / click-outside to dismiss)
- [ ] `bincio render --watch` incremental rebuild on sidecar/data changes
- [ ] Highlight badge in activity feed cards
- [ ] Image format warning (HEIC → JPEG conversion hint in the upload UI)
- [ ] HR / power zone defaults from `max_hr` / `ftp_w` when explicit zones not set
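
For the last item above (zone defaults from `max_hr` / `ftp_w`), one possible sketch derives `[lo, hi]` pairs from the Coggan percentages listed earlier in this file. Everything here is an assumption rather than existing code: `default_zones` is a hypothetical name, boundaries are truncated to whole units, and `999` as the open-ended upper bound simply mirrors the HR example in `edits/athlete.yaml` (a larger sentinel may suit power).

```python
# Upper bounds of each zone as integer percentages, taken from the
# Coggan reference table; the final zone is open-ended.
HR_PCT = [55, 75, 87, 93]                  # Z1..Z4 upper bounds, % of max HR
POWER_PCT = [55, 75, 90, 105, 120, 150]    # Z1..Z6 upper bounds, % of FTP
OPEN_END = 999  # assumed sentinel for "no upper limit"


def default_zones(reference: int, upper_pcts: list[int],
                  open_end: int = OPEN_END) -> list[list[int]]:
    """Build contiguous [lo, hi] zone pairs from max_hr or ftp_w."""
    # Integer arithmetic avoids float rounding surprises; values are
    # truncated to whole bpm / watts.
    bounds = [reference * pct // 100 for pct in upper_pcts]
    zones = []
    lo = 0
    for hi in bounds:
        zones.append([lo, hi])
        lo = hi  # each zone starts where the previous one ends
    zones.append([lo, open_end])
    return zones
```

With `max_hr: 190` this yields the same five HR pairs as the `edits/athlete.yaml` example, and with `ftp_w: 210` the first power zone is `[0, 115]`, matching the one power zone shown there.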