bincio-activity/CLAUDE.md
2026-03-30 09:05:18 +02:00


BincioActivity — Context for Claude

What this project is

BincioActivity is a federated, open-source, self-hosted activity stats platform (think personal Strava). Two-stage pipeline:

  1. bincio extract (Python): GPX/FIT/TCX → BAS JSON data store
  2. bincio render (Astro/Node): BAS data store → static website

The BAS (BincioActivity Schema) JSON files are the federation protocol. Anyone can publish their data as BAS JSON and others can include it.

Key design decisions

  • No database, no server — everything is static files
  • Python with uv for the extract stage
  • Astro + Svelte + Tailwind + MapLibre GL + Observable Plot for the site
  • Haversine (not geopy) for distance calculations (10x faster)
  • Worker initializer pattern for ProcessPoolExecutor — large shared data (strava_lookup dict, known_hashes frozenset) is sent once per worker via initializer=, not once per task
  • BAS activity IDs always use UTC with Z suffix for URL safety
  • TCX files from Garmin use both http:// and https:// namespace URIs — parser handles both
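
The worker initializer pattern from the list above can be sketched as follows. This is a minimal illustration, not the code in cli.py: the names `init_worker`, `process_file`, and `run`, and the shape of the task tuples, are assumptions made for the example.

```python
from concurrent.futures import ProcessPoolExecutor

# Per-process globals, filled in once by the initializer (hypothetical names).
_strava_lookup: dict = {}
_known_hashes: frozenset = frozenset()


def init_worker(strava_lookup: dict, known_hashes: frozenset) -> None:
    """Runs once in each worker process and stores the large shared data."""
    global _strava_lookup, _known_hashes
    _strava_lookup = strava_lookup
    _known_hashes = known_hashes


def process_file(task: tuple) -> tuple:
    """Per-file task: only the small (filename, file_hash) tuple is pickled per call."""
    filename, file_hash = task
    if file_hash in _known_hashes:
        return filename, "skipped (already extracted)"
    title = _strava_lookup.get(filename, {}).get("name", "untitled")
    return filename, title


def run(tasks: list, strava_lookup: dict, known_hashes: frozenset, workers: int = 2) -> list:
    """Ship the shared data once per worker via initializer=, then map the tasks."""
    with ProcessPoolExecutor(
        max_workers=workers,
        initializer=init_worker,
        initargs=(strava_lookup, known_hashes),
    ) as pool:
        return list(pool.map(process_file, tasks))
```

Calling `run(...)` from a script should happen under an `if __name__ == "__main__":` guard so that it also works with the spawn start method.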

User's data

  • Source: ~/src/cycling_data_davide/
    • activities/ — Strava export (GPX, FIT, TCX, all with .gz variants)
    • Karoo_2026/ — recent Karoo device FIT files
    • Karoo/ — older Karoo FIT files
    • activities.csv — Strava metadata (names, descriptions, gear)
  • Extracted output: ~/bincio_data/ (or /tmp/bincio_test/ for testing)
  • ~3,200 input files → ~2,082 unique activities after dedup
  • Date range: 2014–2026

Project structure

bincio/                    Python package
  extract/
    models.py              DataPoint, ParsedActivity, LapData
    parsers/               GPX, FIT, TCX parsers + factory
    sport.py               sport name normalisation
    metrics.py             haversine-based stats computation (single pass)
    timeseries.py          downsample to 1Hz, build BAS timeseries object
    simplify.py            RDP track simplification → GeoJSON
    dedup.py               exact (hash) + near-duplicate detection
    strava_csv.py          Strava activities.csv importer
    writer.py              BAS JSON + GeoJSON writer
    config.py              extract_config.yaml loader
    cli.py                 `bincio extract` CLI
  render/
    cli.py                 `bincio render` CLI (symlinks data, runs astro build/dev)
schema/
  bas-v1.schema.json       JSON Schema for BAS
SCHEMA.md                  Human-readable BAS spec
site/                      Astro project
  src/
    layouts/Base.astro
    pages/
      index.astro           Activity feed (loads index.json client-side)
      activity/[id].astro   Single activity (SSG, loads detail JSON client-side)
      stats/index.astro     Heatmap + year totals
    components/
      ActivityFeed.svelte   Card grid, sport filter, pagination
      ActivityDetail.svelte Map + stats + charts wrapper
      ActivityMap.svelte    MapLibre GL (gradient track, linked hover dot)
      ActivityCharts.svelte Observable Plot (elevation/speed/HR/cadence/power tabs)
      StatsView.svelte      Yearly heatmap + totals
    lib/
      types.ts              BAS TypeScript types
      format.ts             formatDistance, formatDuration, sportIcon, etc.
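
metrics.py is described above as haversine-based. A minimal sketch of that distance calculation is shown here; the function names and the Earth radius constant are assumptions for the example, not taken from the bincio source.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres (assumed constant)


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def track_distance_m(points: list) -> float:
    """Sum the distances between consecutive (lat, lon) pairs in a single pass."""
    return sum(haversine_m(*a, *b) for a, b in zip(points, points[1:]))
```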

How to run

# Extract
cd ~/src/bincio_activity
uv run bincio extract --input ~/src/cycling_data_davide/activities --output /tmp/bincio_test

# Site dev server
cd site
ln -sf /tmp/bincio_test public/data   # symlink data
BINCIO_DATA_DIR=/tmp/bincio_test npm run dev

# Tests
uv run pytest

MapLibre GL + Vite/Astro — known gotchas

Learnt the hard way during debugging (March 2026):

  • maplibregl.workerUrl = ... is the v3 API and silently no-ops in v4+. The v5 API is maplibregl.setWorkerUrl(url), but you don't need it at all in a normal Vite environment — MapLibre handles the blob worker automatically.

  • optimizeDeps: { exclude: ['maplibre-gl'] } breaks tile loading. It prevents Vite from converting MapLibre's UMD bundle to ESM. The UMD bundle uses AMD define() internally; served raw, the tile worker blob fails silently → black map, no tiles. The correct setting is include: ['maplibre-gl'].

  • build.target: 'es2022' (and optimizeDeps.esbuildOptions.target) is required. MapLibre's dependencies use ES2022 class field syntax. If esbuild downgrades it, helpers like __publicField aren't available inside the serialised worker blob scope → tile loading fails. This is a known upstream issue (maplibre-gl-js #6680).

  • Use static imports, not dynamic await import('maplibre-gl'), when possible. With client:only="svelte" in Astro, SSR never runs for the component, so there is no "window is not defined" risk. A static import lets Vite pre-bundle correctly.

  • Use client:only="svelte" (not client:load) for the activity detail page. client:load does SSR + hydration; complex interactive components with MapLibre can hit hydration mismatch issues. client:only mounts fresh on the client only.

  • MapLibre v5 requires explicit center and zoom in the Map constructor. v4 silently defaulted to center: [0,0], zoom: 0. v5 leaves internal projection state undefined → Cannot read properties of undefined (reading 'lng') crashes on any operation that touches coordinates (markers, resize, render). Always pass center and zoom even if you plan to fitBounds later.

  • MapLibre v5 requires setLngLat() on markers before .addTo(map). v4 tolerated markers without coordinates. v5 calls Marker._update() inside addTo(), which needs valid lngLat → same 'lng' crash. Set a dummy [0, 0] if the real position arrives later (e.g. hover markers).

Observable Plot — known gotchas

  • Curve names are hyphenated, not camelCase. Use "monotone-x", not "monotoneX". Plot uses its own curve name registry (not raw d3 identifiers). Wrong names throw unknown curve at runtime.

The working astro.config.mjs Vite section:

vite: {
  optimizeDeps: {
    include: ['maplibre-gl'],
    esbuildOptions: { target: 'es2022' },
  },
  build: { target: 'es2022' },
},

StatsView heatmap — colour intensity scaling

Two approaches have been tried. The active one is percentile-based (preferred for now).

Option A — Linear / max-relative (simpler, currently inactive)

$: maxDailyKm = Math.max(...[...byDate.values()].map(v => v / 1000), 1);
// inside cellColors loop:
const km = total / 1000;
const intensity = Math.min(0.12 + (km / maxDailyKm) * 0.88, 1.0);
  • Busiest day = full brightness; all others scale linearly against it.
  • Intuitive: you can visually read "this day was ~50% of my biggest day".
  • Downside: one outlier (e.g. a 250 km day) compresses everything else into near-darkness. Cross-sport comparison is unfair (10 km run vs 10 km cycling look very different even when filtered to a single sport).
  • Legend shows actual max km: More (X km max).

Option B — Percentile rank (active)

$: sortedDaily = [...byDate.values()].sort((a, b) => a - b);

// Fraction of entries in the ascending array that are <= value (binary search).
function pctRank(value: number, sorted: number[]): number {
  if (!sorted.length) return 0;
  let lo = 0, hi = sorted.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (sorted[mid] <= value) lo = mid + 1;
    else hi = mid;
  }
  return lo / sorted.length;
}

// inside cellColors loop:
const intensity = 0.12 + pctRank(total, sortedDaily) * 0.88;
  • Each day is ranked against all active days: the busiest day gets intensity 1.0 and the quietest gets just above the 0.12 floor (0.12 + 0.88/n for n active days, because pctRank counts the day itself). The colour scale spreads evenly regardless of km gaps.
  • GitHub-contribution-graph style: easy to see "busy vs quiet" relative to your own habits.
  • Downside: absolute effort is not visible. A 5 km walk and a 200 km ride can look the same if they're both 95th-percentile days for their respective sports.
  • Legend says More (percentile · max X km) to hint at both dimensions.

Shared infrastructure

  • Blended colours: in "All" sport view, each cell's RGB is a weighted average of sport colours by distance that day.
  • applyIntensity(hex, t): lerps from zinc-800 (#27272a = 39,39,42) to the target colour, so dim cells fade into the background rather than going black.
  • $: cellColors = Map<string, string> — precomputed reactively so Svelte detects the dependency change when the sport filter or scale method changes (plain function calls with static args don't trigger Svelte re-renders).
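
The blending and fading described above can be sketched as follows. The site code is TypeScript/Svelte; Python is used here only for illustration, and the function names and the RGB-tuple signature of `apply_intensity` are assumptions (the real `applyIntensity` takes a hex string).

```python
ZINC_800 = (39, 39, 42)  # #27272a, the background colour that dim cells fade towards


def hex_to_rgb(hex_colour: str) -> tuple:
    """Convert '#rrggbb' to an (r, g, b) tuple of ints."""
    h = hex_colour.lstrip("#")
    return int(h[0:2], 16), int(h[2:4], 16), int(h[4:6], 16)


def blend_by_distance(day: dict, sport_colours: dict) -> tuple:
    """Weighted average of sport colours, weighted by metres per sport that day."""
    total = sum(day.values())
    if total <= 0:
        return ZINC_800
    channels = [0.0, 0.0, 0.0]
    for sport, metres in day.items():
        rgb = hex_to_rgb(sport_colours[sport])
        for i in range(3):
            channels[i] += rgb[i] * metres / total
    return tuple(round(c) for c in channels)


def apply_intensity(rgb: tuple, t: float) -> str:
    """Linear interpolation from zinc-800 (t = 0) to the target colour (t = 1)."""
    mixed = (round(b + (c - b) * t) for b, c in zip(ZINC_800, rgb))
    return "rgb({}, {}, {})".format(*mixed)
```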

ActivityCharts — controls and athlete zones

ActivityCharts.svelte renders Observable Plot charts for the activity detail page.

Chart controls

  • Metric tabs: Elevation · Speed · Heart Rate · Cadence · Power
  • Chart type toggle (right-aligned): ↗ Line | ▭ Hist
  • X-axis toggle (line mode only, shown when speed data present): Time | Dist
    • Distance is integrated from speed_kmh at 1 Hz — no extra data needed.
  • Histogram controls (visible only in histogram mode):
    • Dual range slider — trims the x domain; two overlapping <input type="range"> with CSS track highlight.
    • Bins slider — exact bin count using explicit evenly-spaced thresholds (not d3's "nice" count, which ignores narrow ranges).
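
Two of the computations above are small enough to sketch. The component itself is Svelte/TypeScript; Python is used here only for illustration, and the function names are hypothetical. Treating missing speed samples as zero is an assumption.

```python
from typing import List, Optional


def cumulative_distance_km(speed_kmh: List[Optional[float]]) -> List[float]:
    """Integrate 1 Hz speed samples: each sample covers one second, i.e. 1/3600 h."""
    out, total = [], 0.0
    for v in speed_kmh:
        total += (v or 0.0) / 3600.0  # missing sample counted as standing still (assumption)
        out.append(total)
    return out


def even_thresholds(lo: float, hi: float, bins: int) -> List[float]:
    """Return bins + 1 evenly spaced edges covering [lo, hi] exactly."""
    if bins < 1:
        raise ValueError("bins must be at least 1")
    step = (hi - lo) / bins
    return [lo + i * step for i in range(bins + 1)]
```

Passing explicit edges like these, rather than a requested bin count, is what makes the bin count exact for narrow trim ranges.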

Athlete zones

Zones are configured in extract_config.yaml under athlete: and written into index.json at extract time (owner.athlete). The Astro activity page reads them from the index and passes them down: [id].astro → ActivityDetail → ActivityCharts.

When viewing HR or Power in histogram mode, zone boundaries are drawn as dashed vertical rules with Z1–Z5 (HR) or Z1–Z7 (power) labels at the top of the chart. Labels and rules are clipped to the current trim range automatically.

Zone color palettes:

  • HR (5 zones): #60a5fa #4ade80 #facc15 #fb923c #f87171
  • Power (7 zones): #60a5fa #34d399 #facc15 #fb923c #f87171 #c084fc #f43f5e

Zone calculation reference (Coggan)

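| Zone | HR (% max HR) | Power (% FTP) |
|------|---------------|---------------|
| Z1   | < 55%         | < 55%         |
| Z2   | 55–75%        | 55–75%        |
| Z3   | 75–87%        | 75–90%        |
| Z4   | 87–93%        | 90–105%       |
| Z5   | > 93%         | 105–120%      |
| Z6   |               | 120–150%      |
| Z7   |               | > 150%        |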
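
As a sketch of how default zones could be derived from max_hr or ftp_w using the percentages in the table: the function name, the `[lo, hi]` pair layout, and the 999 open-ended sentinel for power are assumptions. Floor rounding is chosen because it reproduces the hr_zones values shown for max_hr: 190 in the edits/athlete.yaml example later in this document.

```python
# Upper bounds of each zone as integer percentages, taken from the table.
HR_PCT = [55, 75, 87, 93]                # Z1..Z4 upper bounds, % of max HR (Z5 is open-ended)
POWER_PCT = [55, 75, 90, 105, 120, 150]  # Z1..Z6 upper bounds, % of FTP (Z7 is open-ended)

OPEN_ENDED = 999  # sentinel upper bound for the last zone (assumed for power as well)


def default_zones(reference: int, upper_pct: list) -> list:
    """Build [lo, hi] pairs from a reference value (max HR in bpm or FTP in watts)."""
    bounds = [reference * pct // 100 for pct in upper_pct]  # integer floor
    lows = [0] + bounds
    highs = bounds + [OPEN_ENDED]
    return [[lo, hi] for lo, hi in zip(lows, highs)]
```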

Activity sidecar edits — design spec

Users edit activities via sidecar markdown files that live alongside BAS JSON in the data dir. No database, no server — consistent with the project's static-files-only philosophy.

File naming

~/bincio_data/
  2024-05-15T10:30:00Z_cycling.json   ← immutable extract output (never touched)
  2024-05-15T10:30:00Z_cycling.md     ← user edits (sidecar)

Same stem as the JSON, .md extension. bincio extract never writes .md files, so re-running extract is always safe and will never clobber user edits.
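
The stems above follow the UTC-with-Z convention. A minimal sketch of producing such a stem from a timezone-aware start time is shown here; the function name and the exact `<timestamp>_<sport>` composition are inferred from the example filenames, not from the writer code.

```python
from datetime import datetime, timedelta, timezone


def activity_stem(start: datetime, sport: str) -> str:
    """Format a start time as UTC with a Z suffix, followed by the sport name."""
    if start.tzinfo is None:
        raise ValueError("start time must be timezone-aware")
    utc = start.astimezone(timezone.utc)
    return f"{utc.strftime('%Y-%m-%dT%H:%M:%SZ')}_{sport}"


# A ride that started at 12:30 local time in a UTC+2 zone.
local_start = datetime(2024, 5, 15, 12, 30, tzinfo=timezone(timedelta(hours=2)))
stem = activity_stem(local_start, "cycling")
```

Formatting with `strftime` instead of `isoformat()` avoids the `+00:00` offset form, which is the kind of suffix the Z convention is meant to replace.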

Sidecar format

YAML frontmatter + optional Markdown body:

---
title: "Epic climb up Monte Grappa"
sport: cycling           # override detected sport
hide_stats: [cadence]    # suppress specific stat panels in detail view
highlight: true          # pin/feature in feed (shown first, maybe badged)
private: false           # exclude from public feed
gear: "Trek Domane"      # freeform gear note
---

Rode with Marco and Giulia. Legs felt great after the rest week...
  • All frontmatter keys are optional; omit means "keep extracted value"
  • The Markdown body becomes the activity's description, rendered as HTML in the detail page
  • hide_stats takes stat panel names: elevation, speed, heart_rate, cadence, power

Where overrides are applied: the render stage

The render stage (bincio render) is the right place — not extract, not the browser.

  • Extract → clean BAS JSON (immutable)
  • Render → merges sidecars → Astro build consumes enriched data

A bincio.render.merge module walks the data dir, finds *.md sidecars, and produces either enriched JSON files or a separate overrides/index.json that Astro reads at build time. The site never needs to fetch a .md file at runtime — all merging is build-time, keeping the static-first guarantee.

Federation angle

Sidecars work for remote activities too: if you include someone else's BAS feed, you can write local .md sidecars for their activity IDs. Your render stage applies your overrides on top of their data. This is a natural extension of the local case.

Editing UX: drawer in Astro + bincio edit write API

The edit UI is a slide-in drawer (EditDrawer.svelte) in the Astro site. The drawer fetches from and POSTs to the bincio edit FastAPI server (write API only — the server no longer serves its own HTML UI).

How it works:

bincio render --serve          # Astro dev server, port 4321
bincio edit --data-dir ~/…     # write API only, port 4041
  • Edit button appears on the activity detail page only when PUBLIC_EDIT_URL is set in site/.env
  • Clicking Edit opens the drawer in the same page — no navigation, no copy-pasting IDs
  • Drawer fetches GET /api/activity/{id} to pre-fill, POST /api/activity/{id} to save
  • After save: server runs merge_all() automatically → Astro serves updated data immediately on refresh
  • Closing the drawer applies title + description changes optimistically to the local page state (no full reload required to see the text change)

PUBLIC_EDIT_URL as feature flag:

  • Unset → no Edit button, no drawer. Works as a normal static site. Safe for public hosting.
  • Set (e.g. http://localhost:4041) → editing enabled. Lives in site/.env (gitignored). Each deployment opts in explicitly.

Edit server API (bincio edit --data-dir <dir>):

  • GET /api/activity/{id} — current values (sidecar overrides layered on BAS JSON)
  • POST /api/activity/{id} — write sidecar .md, trigger merge_all()
  • POST /api/activity/{id}/images — multipart upload → edits/images/{id}/{filename}
  • DELETE /api/activity/{id}/images/{filename} — remove uploaded image

Edit drawer features:

  • Title, sport dropdown, gear
  • Markdown textarea for description (images inserted as ![name](filename) references)
  • Image drag-and-drop zone with chip list + delete
  • Hide stat panels (elevation, speed, heart_rate, cadence, power) — toggle buttons
  • Highlight flag (★ — sorts to top of feed, visual badge)
  • Private flag (⊘ — suppressed from index at render time)

Image storage and serving

~/bincio_data/
  edits/
    2024-05-15T10:30:00Z_cycling.md
    images/
      2024-05-15T10:30:00Z_cycling/
        col-summit.jpg
        group-photo.jpg

Images are referenced in the markdown body with relative paths: ![Summit](col-summit.jpg). merge_all() symlinks edits/images/{id}/ → _merged/activities/images/{id}/ so images are served at data/activities/images/{id}/{filename} by the Astro dev server. ActivityDetail.svelte rewrites relative image paths to this URL when rendering markdown.
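
The path rewriting can be sketched as a single regex substitution. The real rewrite lives in ActivityDetail.svelte; Python is used here only for illustration, and the function name, the pattern, and the rule for skipping absolute URLs are assumptions.

```python
import re

# Matches ![alt](src) where src contains no whitespace or closing parenthesis.
_IMAGE = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)\)")


def rewrite_image_paths(markdown: str, activity_id: str,
                        base: str = "data/activities/images") -> str:
    """Prefix relative image sources with the per-activity image directory."""
    def repl(match: re.Match) -> str:
        alt, src = match.group(1), match.group(2)
        if "://" in src or src.startswith("/"):
            return match.group(0)  # absolute URLs and rooted paths are left alone
        return f"![{alt}]({base}/{activity_id}/{src})"

    return _IMAGE.sub(repl, markdown)
```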

Note: browsers cannot display .HEIC files. Convert to JPEG/PNG first: sips -s format jpeg photo.HEIC --out photo.jpg (macOS).

Decided

  • Sidecar location: edits/ subdirectory — cleaner, easier to backup/sync independently
  • Merge output: data/_merged/ (extracted data stays pristine); public/data → _merged/
  • private: true: suppressed from index.json at render time (not client-side hide)
  • highlight: sorts to top of feed; visual badge TBD
  • Edit UI: drawer in Astro site, bincio edit is a pure write API (no HTML serving)
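
The private and highlight decisions above can be sketched as a render-time index filter. The function name is hypothetical, and the newest-first ordering within each group is an assumption; it relies on IDs starting with a UTC timestamp, so they sort chronologically as strings.

```python
def build_index(activities: list) -> list:
    """Drop private activities, then put highlighted ones first (newest first within each group)."""
    public = [a for a in activities if not a.get("private", False)]
    newest_first = sorted(public, key=lambda a: a["id"], reverse=True)
    # sorted() is stable, so the date order survives inside each highlight group.
    return sorted(newest_first, key=lambda a: not a.get("highlight", False))
```

Because private entries are removed before the index is written, they never reach the browser at all, which is the point of filtering at render time rather than hiding client-side.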

Athlete page — design plan

Goal

A /athlete page (and /athlete/edit drawer) giving the user:

  1. Performance analytics — power curve (MMP), best efforts, optionally fitness/freshness
  2. Profile editing — zones, gear (bikes/shoes), personal data — no YAML editing required

Mean Maximal Power (MMP) curve

For every duration D, the MMP is the highest average power sustained over any contiguous D-second window across all activities. Plotted on a log-scale x-axis.

Key features:

  • Time range filter: all-time, last 30/90/365 days, or user-defined seasons
  • Season overlay: multiple seasons plotted on the same chart for comparison (e.g. "2023 vs 2024 vs 2025" — this is the primary use case)
  • Durations: a fixed log-scale set, e.g.: 1, 2, 5, 10, 15, 20, 30, 60, 120, 180, 300, 600, 1200, 1800, 3600 seconds
  • Null handling: if an activity is shorter than duration D, it contributes nothing to that point. No interpolation. The curve simply ends where data runs out.
  • Modelled curve overlay (future): 2-parameter Critical Power model fitted to the data; shows predicted W for any duration, even beyond recorded efforts.
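
For the future modelled overlay, one common way to fit the 2-parameter Critical Power model P(t) = CP + W'/t is to regress work (P × t) linearly against t, since work = CP × t + W'. The sketch below assumes that approach and hypothetical function names; it is not a decided design. In practice the fit is usually restricted to efforts of a few minutes up to roughly 20 minutes, which is also an assumption here.

```python
def fit_critical_power(mmp: list) -> tuple:
    """Least-squares fit of work = CP * t + W' over (duration_s, avg_watts) pairs.

    Returns (cp_watts, w_prime_joules).
    """
    if len(mmp) < 2:
        raise ValueError("need at least two points to fit the model")
    ts = [float(t) for t, _ in mmp]
    work = [t * p for t, p in mmp]  # joules done in each best effort
    n = len(ts)
    mean_t = sum(ts) / n
    mean_w = sum(work) / n
    sxx = sum((t - mean_t) ** 2 for t in ts)
    if sxx == 0:
        raise ValueError("durations must not all be equal")
    sxy = sum((t - mean_t) * (w - mean_w) for t, w in zip(ts, work))
    cp = sxy / sxx                 # slope: critical power in watts
    w_prime = mean_w - cp * mean_t  # intercept: anaerobic work capacity in joules
    return cp, w_prime


def predict_watts(cp: float, w_prime: float, duration_s: float) -> float:
    """Predicted sustainable power for a duration under the fitted model."""
    return cp + w_prime / duration_s
```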

Where to compute:

At extract time, each activity gets an mmp array:

"mmp": [[1, 850], [5, 720], [30, 580], [300, 340], [3600, 210]]

Each pair is [duration_s, avg_watts]. Only activities with power data get this field.

The site then takes the element-wise max across all activities (filtered by date range). This keeps the site fully static — no server needed to render the curve.

Computing MMP per activity is O(n × D) where n = timeseries length, D = number of duration points (~15). At 1 Hz, a 2-hour ride is 7,200 points × 15 durations = 108,000 window updates, which is trivial. Use a sliding window approach: for each duration d, maintain a running sum and advance the window one sample at a time.
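
The sliding window approach can be sketched as follows. The function name is hypothetical, and the sketch assumes the power series is already at 1 Hz with gaps filled (for example with zeros); rounding to whole watts is also an assumption, chosen to match the integer values in the example mmp array.

```python
DURATIONS_S = [1, 2, 5, 10, 15, 20, 30, 60, 120, 180, 300, 600, 1200, 1800, 3600]


def mmp_curve(power_w: list, durations: list = DURATIONS_S) -> list:
    """Return [[duration_s, avg_watts], ...] for every duration the activity can cover."""
    n = len(power_w)
    curve = []
    for d in durations:
        if d > n:
            continue  # activity shorter than d: contributes nothing, no interpolation
        window = sum(power_w[:d])  # running sum of the first d samples
        best = window
        for i in range(d, n):
            window += power_w[i] - power_w[i - d]  # slide one sample to the right
            if window > best:
                best = window
        curve.append([d, round(best / d)])
    return curve
```

Each duration costs one pass over the samples, which gives the O(n × D) total described above.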

Season definition (user-configurable):

athlete:
  seasons:
    - name: "2025"
      start: "2025-01-01"
      end:   "2025-12-31"
    - name: "2024"
      start: "2024-01-01"
      end:   "2024-12-31"

If no seasons defined, the UI offers fixed presets (last 30d / 90d / 365d / all-time).

Athlete profile editing — reusing edit infrastructure

Same pattern as activity editing:

bincio edit --data-dir ~/bincio_data    # same server, new endpoints

New API endpoints:

  • GET /api/athlete — current athlete config (zones, gear, display name)
  • POST /api/athlete — write edits/athlete.yaml, trigger merge_all()

edits/athlete.yaml format:

display_name: "Davide"
handle: "brutsalvadi"
max_hr: 190
ftp_w: 210
hr_zones:
  - [0,   104]
  - [104, 142]
  - [142, 165]
  - [165, 176]
  - [176, 999]
power_zones:
  - [0,   115]
  # ...
gear:
  bikes:
    - name: "Trek Domane"
      type: cycling
      notes: "Road endurance"
  shoes:
    - name: "Asics GT-2000"
      type: running
seasons:
  - name: "2025"
    start: "2025-01-01"
    end:   "2025-12-31"

The server reads extract_config.yaml as base defaults, applies edits/athlete.yaml overrides on top, and writes back to edits/athlete.yaml on POST. The extract_config.yaml is never written by the server — it stays as the authoritative static config.
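
The layering can be sketched with plain dicts. This assumes the base values sit under the `athlete:` key of extract_config.yaml, that edits/athlete.yaml holds top-level keys as in the example above, and that an edited key replaces the base value wholesale (a shallow merge); the function name is hypothetical.

```python
import copy


def effective_athlete(extract_config: dict, edits: dict) -> dict:
    """Base defaults from extract_config.yaml with edits/athlete.yaml layered on top."""
    merged = copy.deepcopy(extract_config.get("athlete", {}))
    merged.update(copy.deepcopy(edits))  # shallow: an edited key replaces the whole value
    return merged
```

Deep copies keep the in-memory base config untouched, mirroring the rule that extract_config.yaml is never written by the server.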

merge_all() also writes athlete data into _merged/athlete.json which the site reads.

AthleteDrawer.svelte (profile editing)

Reuses the same drawer pattern as EditDrawer.svelte:

  • Number inputs for max_hr, ftp_w
  • Zone editor: table of rows [lo, hi] with + / − buttons; auto-fills lo from previous hi
  • Gear list: add/remove bikes and shoes; name + type + notes fields
  • Season list: add/remove date ranges with names

Site page: /athlete

Two tabs or sections:

  1. Performance — MMP curve chart (Observable Plot, log x-axis), date range selector
  2. Profile — display of current zones, gear list; Edit button opens AthleteDrawer

The MMP chart uses index.json's activities array (already loaded by the feed) — filter to power-having activities, pull their mmp arrays, take element-wise max per season.
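
The per-season element-wise max can be sketched as follows. The chart itself would be TypeScript; Python is used here only for illustration. The function name is hypothetical, and taking the activity date from the first 10 characters of the ID (a UTC date, which may differ from the local date near midnight) is an assumption.

```python
def season_mmp(activities: list, start: str, end: str) -> list:
    """Best watts per duration across activities whose date falls in [start, end] (ISO dates)."""
    best = {}
    for act in activities:
        day = act["id"][:10]  # 'YYYY-MM-DD' prefix of the UTC timestamp in the ID
        if not (start <= day <= end):
            continue
        for duration_s, watts in act.get("mmp", []):  # activities without power have no mmp
            if watts > best.get(duration_s, 0):
                best[duration_s] = watts
    return [[d, best[d]] for d in sorted(best)]
```

Calling it once per configured season yields one series per season for the overlay.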

Implementation order

  1. Add mmp computation to metrics.py and writer
  2. Add mmp field to BAS schema and types.ts
  3. Add /api/athlete GET+POST to the edit server
  4. merge_all() writes _merged/athlete.json
  5. Astro page site/src/pages/athlete/index.astro
  6. MmpChart.svelte — Observable Plot line, log-scale x, multi-season overlay
  7. AthleteDrawer.svelte — zones + gear editing form
  8. Season config in extract_config.yaml / edits/athlete.yaml

Known issues / next steps

  • bincio render Python CLI is a stub — site is built via npm run build directly
  • Activity IDs in existing test data still use +0000 format (pre-fix); re-run extract to get Z format
  • Some activities appear with both untitled and titled IDs (near-dedup timing race)
  • Stats page heatmap month labels are embedded in the week-column flex grid (fixed March 2026); getWeeks uses localISO() not toISOString() to avoid UTC/local date mismatch
  • Federation (remote data sources) not yet implemented in site
  • Friends pages (/friends/{handle}/) not yet implemented
  • bincio render should automate: symlink data → astro build
  • The site/.env file is gitignored — document the setup for new users
  • Add --workers benchmark: on 8 cores, ~7 min for 3,200 activities first run

What "good" looks like (not yet done)

  • bincio render Python CLI wraps astro build properly
  • Friends/federation pages in site
  • Athlete page: MMP power curve with season overlay
  • Athlete page: profile editor (zones, gear, seasons) via AthleteDrawer
  • MMP computation at extract time → mmp field in BAS JSON
  • Personal records page (best efforts: 5km, 10km, etc.)
  • Activity search / full-text filter in feed
  • Map thumbnail in activity cards (SVG path from GeoJSON)
  • GitHub Actions template for auto-publish
  • Karoo/Garmin Connect importers beyond Strava
  • bincio.render.merge — sidecar parser, _merged/ output, private filter, highlight sort
  • bincio edit FastAPI write API (GET/POST activity, image upload/delete, triggers merge)
  • EditDrawer.svelte — slide-in edit UI in the Astro site (no separate HTML from server)
  • PUBLIC_EDIT_URL feature flag — unset = no edit UI, set = drawer enabled
  • Markdown rendering in activity description with image path rewriting
  • hide_stats support in activity detail stats panel
  • ActivityCharts power tab (elevation/speed/HR/cadence/power)
  • Chart type toggle: line ↔ histogram
  • X-axis toggle: time ↔ distance (integrated from speed)
  • Histogram dual range slider + bins slider (exact thresholds)
  • Athlete zones in extract_config.yaml → index.json → chart overlays
  • StatsView heatmap click-to-pin tooltip (Esc / click-outside to dismiss)
  • bincio render --watch incremental rebuild on sidecar/data changes
  • Highlight badge in activity feed cards
  • Image format warning (HEIC → JPEG conversion hint in the upload UI)
  • HR / power zone defaults from max_hr / ftp_w when explicit zones not set