bincio-activity/CLAUDE.md
2026-03-29 10:37:08 +02:00

BincioActivity — Context for Claude

What this project is

BincioActivity is a federated, open-source, self-hosted activity stats platform (think personal Strava). Two-stage pipeline:

  1. bincio extract (Python): GPX/FIT/TCX → BAS JSON data store
  2. bincio render (Astro/Node): BAS data store → static website

The BAS (BincioActivity Schema) JSON files are the federation protocol. Anyone can publish their data as BAS JSON and others can include it.

Key design decisions

  • No database, no server — everything is static files
  • Python with uv for the extract stage
  • Astro + Svelte + Tailwind + MapLibre GL + Observable Plot for the site
  • Haversine (not geopy) for distance calculations (10x faster)
  • Worker initializer pattern for ProcessPoolExecutor — large shared data (strava_lookup dict, known_hashes frozenset) is sent once per worker via initializer=, not once per task
  • BAS activity IDs always use UTC with Z suffix for URL safety
  • TCX files from Garmin use both http:// and https:// namespace URIs — parser handles both

User's data

  • Source: ~/src/cycling_data_davide/
    • activities/ — Strava export (GPX, FIT, TCX, all with .gz variants)
    • Karoo_2026/ — recent Karoo device FIT files
    • Karoo/ — older Karoo FIT files
    • activities.csv — Strava metadata (names, descriptions, gear)
  • Extracted output: ~/bincio_data/ (or /tmp/bincio_test/ for testing)
  • ~3,200 input files → ~2,082 unique activities after dedup
  • Date range: 2014 to 2026

Project structure

bincio/                    Python package
  extract/
    models.py              DataPoint, ParsedActivity, LapData
    parsers/               GPX, FIT, TCX parsers + factory
    sport.py               sport name normalisation
    metrics.py             haversine-based stats computation (single pass)
    timeseries.py          downsample to 1Hz, build BAS timeseries object
    simplify.py            RDP track simplification → GeoJSON
    dedup.py               exact (hash) + near-duplicate detection
    strava_csv.py          Strava activities.csv importer
    writer.py              BAS JSON + GeoJSON writer
    config.py              extract_config.yaml loader
    cli.py                 `bincio extract` CLI
  render/
    cli.py                 `bincio render` CLI (currently a stub; intended to symlink data and run astro build/dev)
schema/
  bas-v1.schema.json       JSON Schema for BAS
SCHEMA.md                  Human-readable BAS spec
site/                      Astro project
  src/
    layouts/Base.astro
    pages/
      index.astro           Activity feed (loads index.json client-side)
      activity/[id].astro   Single activity (SSG, loads detail JSON client-side)
      stats/index.astro     Heatmap + year totals
    components/
      ActivityFeed.svelte   Card grid, sport filter, pagination
      ActivityDetail.svelte Map + stats + charts wrapper
      ActivityMap.svelte    MapLibre GL (gradient track, linked hover dot)
      ActivityCharts.svelte Observable Plot (elevation/speed/HR/cadence tabs)
      StatsView.svelte      Yearly heatmap + totals
    lib/
      types.ts              BAS TypeScript types
      format.ts             formatDistance, formatDuration, sportIcon, etc.
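
For orientation, a sketch of what a haversine-based, single-pass stats computation like the one attributed to metrics.py could look like. This is an illustration under assumptions, not the contents of that file: the function names, the `(lat, lon, elevation)` tuple shape and the returned keys are all hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to 1.0 so float rounding can never push asin out of its domain.
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def track_stats(points):
    """Accumulate distance and elevation gain in one pass over the track.

    points: iterable of (lat, lon, elevation_m or None) tuples (assumed shape).
    """
    distance = 0.0
    gain = 0.0
    prev = None
    for lat, lon, ele in points:
        if prev is not None:
            distance += haversine_m(prev[0], prev[1], lat, lon)
            if ele is not None and prev[2] is not None and ele > prev[2]:
                gain += ele - prev[2]
        prev = (lat, lon, ele)
    return {"distance_m": distance, "elevation_gain_m": gain}
```

Only `math` is needed, which is part of why a hand-rolled haversine can be much cheaper than a geodesic library call per point; the spherical approximation trades a small accuracy loss for that speed.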

How to run

# Extract
cd ~/src/bincio_activity
uv run bincio extract --input ~/src/cycling_data_davide/activities --output /tmp/bincio_test

# Site dev server
cd site
ln -sfn /tmp/bincio_test public/data   # symlink data (-n replaces an existing link instead of nesting inside it)
BINCIO_DATA_DIR=/tmp/bincio_test npm run dev

# Tests
uv run pytest

MapLibre GL + Vite/Astro — known gotchas

Learnt the hard way during debugging (March 2026):

  • maplibregl.workerUrl = ... is the v3 API and silently no-ops in v4+. The replacement since v4 is maplibregl.setWorkerUrl(url), but you don't need it at all in a normal Vite environment: MapLibre handles the blob worker automatically.

  • optimizeDeps: { exclude: ['maplibre-gl'] } breaks tile loading. It prevents Vite from converting MapLibre's UMD bundle to ESM. The UMD bundle uses AMD define() internally; served raw, the tile worker blob fails silently → black map, no tiles. The correct setting is include: ['maplibre-gl'].

  • build.target: 'es2022' (and optimizeDeps.esbuildOptions.target) is required. MapLibre's dependencies use ES2022 class field syntax. If esbuild downgrades it, helpers like __publicField aren't available inside the serialised worker blob scope → tile loading fails. This is a known upstream issue (maplibre-gl-js #6680).

  • Use static imports, not dynamic await import('maplibre-gl'), when possible. With client:only="svelte" in Astro, SSR never runs for the component, so there is no risk of a "window is not defined" error. A static import lets Vite pre-bundle correctly.

  • Use client:only="svelte" (not client:load) for the activity detail page. client:load does SSR + hydration; complex interactive components with MapLibre can hit hydration mismatch issues. client:only mounts fresh on the client only.

  • MapLibre v5 requires explicit center and zoom in the Map constructor. v4 silently defaulted to center: [0,0], zoom: 0. v5 leaves internal projection state undefined → Cannot read properties of undefined (reading 'lng') crashes on any operation that touches coordinates (markers, resize, render). Always pass center and zoom even if you plan to fitBounds later.

  • MapLibre v5 requires setLngLat() on markers before .addTo(map). v4 tolerated markers without coordinates. v5 calls Marker._update() inside addTo(), which needs valid lngLat → same 'lng' crash. Set a dummy [0, 0] if the real position arrives later (e.g. hover markers).

Observable Plot — known gotchas

  • Curve names are hyphenated, not camelCase. Use "monotone-x", not "monotoneX". Plot uses its own curve name registry (not raw d3 identifiers). Wrong names throw an "unknown curve" error at runtime.

The working astro.config.mjs Vite section (covers the MapLibre settings described above):

vite: {
  optimizeDeps: {
    include: ['maplibre-gl'],
    esbuildOptions: { target: 'es2022' },
  },
  build: { target: 'es2022' },
},

Known issues / next steps

  • bincio render Python CLI is a stub — site is built via npm run build directly
  • Activity IDs in existing test data still use +0000 format (pre-fix); re-run extract to get Z format
  • Some activities appear with both untitled and titled IDs (near-dedup timing race)
  • Stats page heatmap month labels use absolute positioning and may misalign
  • Federation (remote data sources) not yet implemented in site
  • Friends pages (/friends/{handle}/) not yet implemented
  • bincio render should automate: symlink data → astro build
  • The site/.env file is gitignored — document the setup for new users
  • Add a --workers benchmark: on 8 cores, a first run over ~3,200 input files takes ~7 min
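
The +0000 versus Z difference mentioned in the list above can be sketched as follows. This is a hedged illustration only: the exact BAS ID layout is not shown here, so the timestamp format string and both helper names (`utc_id_timestamp`, `migrate_legacy_suffix`) are assumptions. Re-running extract remains the documented way to regenerate IDs.

```python
from datetime import datetime, timedelta, timezone


def utc_id_timestamp(dt: datetime) -> str:
    """Format an aware datetime as UTC with a literal Z suffix."""
    if dt.tzinfo is None:
        raise ValueError("naive datetime: attach a timezone first")
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def migrate_legacy_suffix(ts: str) -> str:
    """Rewrite a trailing +0000 or +00:00 offset as Z; leave other input alone."""
    for suffix in ("+0000", "+00:00"):
        if ts.endswith(suffix):
            return ts[: -len(suffix)] + "Z"
    return ts


# Example: a local time two hours ahead of UTC.
local = datetime(2026, 3, 29, 10, 37, 8, tzinfo=timezone(timedelta(hours=2)))
print(utc_id_timestamp(local))  # 2026-03-29T08:37:08Z
```

A literal "+" is awkward in URLs because it can be decoded as a space in query strings, which is one plausible reason to prefer the Z form.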

What "good" looks like (not yet done)

  • bincio render Python CLI wraps astro build properly
  • Friends/federation pages in site
  • Personal records page
  • Activity search / full-text filter in feed
  • Map thumbnail in activity cards (SVG path from GeoJSON)
  • GitHub Actions template for auto-publish
  • Karoo/Garmin Connect importers beyond Strava