Files
bincio-activity/CLAUDE.md
T
2026-03-29 10:37:08 +02:00

168 lines
7.1 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# BincioActivity — Context for Claude
## What this project is
BincioActivity is a federated, open-source, self-hosted activity stats platform
(think personal Strava). Two-stage pipeline:
1. **`bincio extract`** (Python): GPX/FIT/TCX → BAS JSON data store
2. **`bincio render`** (Astro/Node): BAS data store → static website
The BAS (BincioActivity Schema) JSON files are the federation protocol.
Anyone can publish their data as BAS JSON and others can include it.
## Key design decisions
- **No database, no server** — everything is static files
- **Python with uv** for the extract stage
- **Astro + Svelte + Tailwind + MapLibre GL + Observable Plot** for the site
- **Haversine** (not geopy) for distance calculations (10x faster)
- **Worker initializer pattern** for ProcessPoolExecutor — large shared data
(strava_lookup dict, known_hashes frozenset) is sent once per worker via
`initializer=`, not once per task
- **BAS activity IDs** always use UTC with Z suffix for URL safety
- **TCX files** from Garmin use both `http://` and `https://` namespace URIs —
parser handles both
## User's data
- Source: `~/src/cycling_data_davide/`
- `activities/` — Strava export (GPX, FIT, TCX, all with .gz variants)
- `Karoo_2026/` — recent Karoo device FIT files
- `Karoo/` — older Karoo FIT files
- `activities.csv` — Strava metadata (names, descriptions, gear)
- Extracted output: `~/bincio_data/` (or `/tmp/bincio_test/` for testing)
- ~3,200 input files → ~2,082 unique activities after dedup
- Date range: 20142026
## Project structure
```
bincio/ Python package
extract/
models.py DataPoint, ParsedActivity, LapData
parsers/ GPX, FIT, TCX parsers + factory
sport.py sport name normalisation
metrics.py haversine-based stats computation (single pass)
timeseries.py downsample to 1Hz, build BAS timeseries object
simplify.py RDP track simplification → GeoJSON
dedup.py exact (hash) + near-duplicate detection
strava_csv.py Strava activities.csv importer
writer.py BAS JSON + GeoJSON writer
config.py extract_config.yaml loader
cli.py `bincio extract` CLI
render/
cli.py `bincio render` CLI (symlinks data, runs astro build/dev)
schema/
bas-v1.schema.json JSON Schema for BAS
SCHEMA.md Human-readable BAS spec
site/ Astro project
src/
layouts/Base.astro
pages/
index.astro Activity feed (loads index.json client-side)
activity/[id].astro Single activity (SSG, loads detail JSON client-side)
stats/index.astro Heatmap + year totals
components/
ActivityFeed.svelte Card grid, sport filter, pagination
ActivityDetail.svelte Map + stats + charts wrapper
ActivityMap.svelte MapLibre GL (gradient track, linked hover dot)
ActivityCharts.svelte Observable Plot (elevation/speed/HR/cadence tabs)
StatsView.svelte Yearly heatmap + totals
lib/
types.ts BAS TypeScript types
format.ts formatDistance, formatDuration, sportIcon, etc.
```
## How to run
```bash
# Extract
cd ~/src/bincio_activity
uv run bincio extract --input ~/src/cycling_data_davide/activities --output /tmp/bincio_test
# Site dev server
cd site
ln -sf /tmp/bincio_test public/data # symlink data
BINCIO_DATA_DIR=/tmp/bincio_test npm run dev
# Tests
uv run pytest
```
## MapLibre GL + Vite/Astro — known gotchas
Learnt the hard way during debugging (March 2026):
- **`maplibregl.workerUrl = ...` is the v3 API and silently no-ops in v4+.**
The v5 API is `maplibregl.setWorkerUrl(url)`, but you don't need it at all in a
normal Vite environment — MapLibre handles the blob worker automatically.
- **`optimizeDeps: { exclude: ['maplibre-gl'] }` breaks tile loading.**
It prevents Vite from converting MapLibre's UMD bundle to ESM. The UMD bundle
uses AMD `define()` internally; served raw, the tile worker blob fails silently →
black map, no tiles. The correct setting is `include: ['maplibre-gl']`.
- **`build.target: 'es2022'` (and `optimizeDeps.esbuildOptions.target`) is required.**
MapLibre's dependencies use ES2022 class field syntax. If esbuild downgrades it,
helpers like `__publicField` aren't available inside the serialised worker blob
scope → tile loading fails. This is a known upstream issue (maplibre-gl-js #6680).
- **Use static imports, not dynamic `await import('maplibre-gl')`, when possible.**
With `client:only="svelte"` in Astro, SSR never runs for the component so there is
no `window is not defined` risk. Static import lets Vite pre-bundle correctly.
- **Use `client:only="svelte"` (not `client:load`) for the activity detail page.**
`client:load` does SSR + hydration; complex interactive components with MapLibre
can hit hydration mismatch issues. `client:only` mounts fresh on the client only.
- **MapLibre v5 requires explicit `center` and `zoom` in the Map constructor.**
v4 silently defaulted to `center: [0,0], zoom: 0`. v5 leaves internal projection
state undefined → `Cannot read properties of undefined (reading 'lng')` crashes
on any operation that touches coordinates (markers, resize, render). Always pass
`center` and `zoom` even if you plan to `fitBounds` later.
- **MapLibre v5 requires `setLngLat()` on markers before `.addTo(map)`.**
v4 tolerated markers without coordinates. v5 calls `Marker._update()` inside
`addTo()`, which needs valid lngLat → same `'lng'` crash. Set a dummy `[0, 0]`
if the real position arrives later (e.g. hover markers).
## Observable Plot — known gotchas
- **Curve names are hyphenated, not camelCase.**
Use `"monotone-x"`, not `"monotoneX"`. Plot uses its own curve name registry
(not raw d3 identifiers). Wrong names throw `unknown curve` at runtime.
The working `astro.config.mjs` Vite section:
```js
vite: {
optimizeDeps: {
include: ['maplibre-gl'],
esbuildOptions: { target: 'es2022' },
},
build: { target: 'es2022' },
},
```
## Known issues / next steps
- `bincio render` Python CLI is a stub — site is built via `npm run build` directly
- Activity IDs in existing test data still use `+0000` format (pre-fix); re-run extract to get `Z` format
- Some activities appear with both untitled and titled IDs (near-dedup timing race)
- Stats page heatmap month labels use absolute positioning and may misalign
- Federation (remote data sources) not yet implemented in site
- Friends pages (`/friends/{handle}/`) not yet implemented
- `bincio render` should automate: symlink data → `astro build`
- The `site/.env` file is gitignored — document the setup for new users
- Add `--workers` benchmark: on 8 cores, ~7 min for 3,200 activities first run
## What "good" looks like (not yet done)
- [ ] `bincio render` Python CLI wraps `astro build` properly
- [ ] Friends/federation pages in site
- [ ] Personal records page
- [ ] Activity search / full-text filter in feed
- [ ] Map thumbnail in activity cards (SVG path from GeoJSON)
- [ ] GitHub Actions template for auto-publish
- [ ] Karoo/Garmin Connect importers beyond Strava