diff --git a/CLAUDE.md b/CLAUDE.md new file mode 100644 index 0000000..d4e741c --- /dev/null +++ b/CLAUDE.md @@ -0,0 +1,161 @@ +# BincioActivity — Context for Claude + +## What this project is + +BincioActivity is a federated, open-source, self-hosted activity stats platform +(think personal Strava). Two-stage pipeline: + +1. **`bincio extract`** (Python): GPX/FIT/TCX → BAS JSON data store +2. **`bincio render`** (Astro/Node): BAS data store → static website + +The BAS (BincioActivity Schema) JSON files are the federation protocol. +Anyone can publish their data as BAS JSON and others can include it. + +## Key design decisions + +- **No database, no server** — everything is static files +- **Python with uv** for the extract stage +- **Astro + Svelte + Tailwind + MapLibre GL + Observable Plot** for the site +- **Haversine** (not geopy) for distance calculations (10x faster) +- **Worker initializer pattern** for ProcessPoolExecutor — large shared data + (strava_lookup dict, known_hashes frozenset) is sent once per worker via + `initializer=`, not once per task +- **BAS activity IDs** always use UTC with Z suffix for URL safety +- **TCX files** from Garmin use both `http://` and `https://` namespace URIs — + parser handles both + +## User's data + +- Source: `~/src/cycling_data_davide/` + - `activities/` — Strava export (GPX, FIT, TCX, all with .gz variants) + - `Karoo_2026/` — recent Karoo device FIT files + - `Karoo/` — older Karoo FIT files + - `activities.csv` — Strava metadata (names, descriptions, gear) +- Extracted output: `~/bincio_data/` (or `/tmp/bincio_test/` for testing) +- ~3,200 input files → ~2,082 unique activities after dedup +- Date range: 2014–2026 + +## Project structure + +``` +bincio/ Python package + extract/ + models.py DataPoint, ParsedActivity, LapData + parsers/ GPX, FIT, TCX parsers + factory + sport.py sport name normalisation + metrics.py haversine-based stats computation (single pass) + timeseries.py downsample to 1Hz, build BAS 
timeseries object + simplify.py RDP track simplification → GeoJSON + dedup.py exact (hash) + near-duplicate detection + strava_csv.py Strava activities.csv importer + writer.py BAS JSON + GeoJSON writer + config.py extract_config.yaml loader + cli.py `bincio extract` CLI + render/ + cli.py `bincio render` CLI (symlinks data, runs astro build/dev) +schema/ + bas-v1.schema.json JSON Schema for BAS +SCHEMA.md Human-readable BAS spec +site/ Astro project + src/ + layouts/Base.astro + pages/ + index.astro Activity feed (loads index.json client-side) + activity/[id].astro Single activity (SSG, loads detail JSON client-side) + stats/index.astro Heatmap + year totals + components/ + ActivityFeed.svelte Card grid, sport filter, pagination + ActivityDetail.svelte Map + stats + charts wrapper + ActivityMap.svelte MapLibre GL (gradient track, linked hover dot) + ActivityCharts.svelte Observable Plot (elevation/speed/HR/cadence tabs) + StatsView.svelte Yearly heatmap + totals + lib/ + types.ts BAS TypeScript types + format.ts formatDistance, formatDuration, sportIcon, etc. +``` + +## How to run + +```bash +# Extract +cd ~/src/bincio_activity +uv run bincio extract --input ~/src/cycling_data_davide/activities --output /tmp/bincio_test + +# Site dev server +cd site +ln -sf /tmp/bincio_test public/data # symlink data +BINCIO_DATA_DIR=/tmp/bincio_test npm run dev + +# Tests +uv run pytest +``` + +## MapLibre GL + Vite/Astro — known gotchas + +Learnt the hard way during debugging (March 2026): + +- **`maplibregl.workerUrl = ...` is the v3 API and silently no-ops in v4+.** + The v5 API is `maplibregl.setWorkerUrl(url)`, but you don't need it at all in a + normal Vite environment — MapLibre handles the blob worker automatically. + +- **`optimizeDeps: { exclude: ['maplibre-gl'] }` breaks tile loading.** + It prevents Vite from converting MapLibre's UMD bundle to ESM. 
The UMD bundle + uses AMD `define()` internally; served raw, the tile worker blob fails silently → + black map, no tiles. The correct setting is `include: ['maplibre-gl']`. + +- **`build.target: 'es2022'` (and `optimizeDeps.esbuildOptions.target`) is required.** + MapLibre's dependencies use ES2022 class field syntax. If esbuild downgrades it, + helpers like `__publicField` aren't available inside the serialised worker blob + scope → tile loading fails. This is a known upstream issue (maplibre-gl-js #6680). + +- **Use static imports, not dynamic `await import('maplibre-gl')`, when possible.** + With `client:only="svelte"` in Astro, SSR never runs for the component so there is + no `window is not defined` risk. Static import lets Vite pre-bundle correctly. + +- **Use `client:only="svelte"` (not `client:load`) for the activity detail page.** + `client:load` does SSR + hydration; complex interactive components with MapLibre + can hit hydration mismatch issues. `client:only` mounts fresh on the client only. + +- **MapLibre v5 requires explicit `center` and `zoom` in the Map constructor.** + v4 silently defaulted to `center: [0,0], zoom: 0`. v5 leaves internal projection + state undefined → `Cannot read properties of undefined (reading 'lng')` crashes + on any operation that touches coordinates (markers, resize, render). Always pass + `center` and `zoom` even if you plan to `fitBounds` later. + +- **MapLibre v5 requires `setLngLat()` on markers before `.addTo(map)`.** + v4 tolerated markers without coordinates. v5 calls `Marker._update()` inside + `addTo()`, which needs valid lngLat → same `'lng'` crash. Set a dummy `[0, 0]` + if the real position arrives later (e.g. hover markers). 
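The last two gotchas (explicit `center`/`zoom`, and `setLngLat()` before `addTo()`) can be guarded with small helpers. This is a minimal sketch, not code from this repo: `MapInit`, `withExplicitView`, and `initialMarkerPosition` are hypothetical names, and the real MapLibre calls are left as comments so the block has no dependencies.

```typescript
// Hypothetical helpers; not part of MapLibre GL or of this repository.
type LngLat = [number, number];

interface MapInit {
  container: string;
  style: string;
  center?: LngLat;
  zoom?: number;
}

// Always hand the Map constructor an explicit center and zoom,
// even when fitBounds() will reposition the view later.
function withExplicitView(opts: MapInit): Required<MapInit> {
  return { ...opts, center: opts.center ?? [0, 0], zoom: opts.zoom ?? 0 };
}

// Position to set on a marker before addTo(map): the real one if it is
// already known, otherwise a dummy [0, 0] to be replaced on first hover.
function initialMarkerPosition(known: LngLat | null): LngLat {
  return known ?? [0, 0];
}

// Intended usage with the real library (assumes `maplibregl` and `styleUrl` exist):
//   const map = new maplibregl.Map(withExplicitView({ container: 'map', style: styleUrl }));
//   const hover = new maplibregl.Marker()
//     .setLngLat(initialMarkerPosition(null))
//     .addTo(map);
```

Keeping the defaults in one place means a component that forgets `center` or `zoom` still gets a defined view instead of the `'lng'` crash.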
+ +The working `astro.config.mjs` Vite section: +```js +vite: { + optimizeDeps: { + include: ['maplibre-gl'], + esbuildOptions: { target: 'es2022' }, + }, + build: { target: 'es2022' }, +}, +``` + +## Known issues / next steps + +- `bincio render` Python CLI is a stub — site is built via `npm run build` directly +- Activity IDs in existing test data still use `+0000` format (pre-fix); re-run extract to get `Z` format +- Some activities appear with both untitled and titled IDs (near-dedup timing race) +- Stats page heatmap month labels use absolute positioning and may misalign +- Federation (remote data sources) not yet implemented in site +- Friends pages (`/friends/{handle}/`) not yet implemented +- `bincio render` should automate: symlink data → `astro build` +- The `site/.env` file is gitignored — document the setup for new users +- Add `--workers` benchmark: on 8 cores, ~7 min for 3,200 activities first run + +## What "good" looks like (not yet done) + +- [ ] `bincio render` Python CLI wraps `astro build` properly +- [ ] Friends/federation pages in site +- [ ] Personal records page +- [ ] Activity search / full-text filter in feed +- [ ] Map thumbnail in activity cards (SVG path from GeoJSON) +- [ ] GitHub Actions template for auto-publish +- [ ] Karoo/Garmin Connect importers beyond Strava diff --git a/README.md b/README.md index e69de29..2aa8a9a 100644 --- a/README.md +++ b/README.md @@ -0,0 +1,162 @@ +# BincioActivity + +A federated, open-source, self-hosted activity stats platform. +Own your data. Share what you want. Follow friends by URL. + +## What it is + +BincioActivity turns a folder of GPX/FIT/TCX files into a beautiful, modern +static website — no database, no server required. It can run from the local +filesystem, GitHub Pages, or any static host. + +**Federation**: anyone can "follow" a friend's data by adding a URL to their +config. Friends' activities appear in your site, attributed to them. 
+ +## Quick start + +```bash +# Install +pip install bincio # or: uv add bincio + +# Extract your activities +cp extract_config.example.yaml extract_config.yaml +# edit extract_config.yaml with your paths +bincio extract + +# Build the site (requires Node ≥ 20) +cd site && npm install +BINCIO_DATA_DIR=~/bincio_data npm run build +# open site/dist/index.html +``` + +## Two stages + +### Stage 1 — Extract (`bincio extract`) + +Reads GPX, FIT, TCX files (including `.gz` compressed) and writes a +BincioActivity Schema (BAS) data store: plain JSON + GeoJSON files. + +``` +bincio extract # uses extract_config.yaml +bincio extract --input ~/rides --output ~/bincio_data +bincio extract --file ride.gpx # single file → stdout +bincio extract --since 2025-01-01 # incremental +``` + +Supported sources: +- GPX (generic, Garmin extensions) +- FIT (Garmin, Hammerhead Karoo) +- TCX (including Garmin's https:// namespace variant) +- All of the above gzip-compressed (`.gz`) +- Strava bulk export (`activities.csv` carries titles and descriptions) + +### Stage 2 — Render (`bincio render`) + +Generates a static site from the BAS data store using Astro. 
+ +``` +cd site +BINCIO_DATA_DIR=~/bincio_data npm run dev # development +BINCIO_DATA_DIR=~/bincio_data npm run build # production build → site/dist/ +``` + +## Configuration + +### extract_config.yaml + +```yaml +owner: + handle: yourname + display_name: Your Name + +input: + dirs: + - ~/Activities/gpx + - ~/Activities/fit + metadata_csv: ~/strava_export/activities.csv # optional Strava metadata + +output: + dir: ~/bincio_data + +default_privacy: public # public | blur_start | no_gps | private + +track: + rdp_epsilon: 0.0001 # GPS simplification (~11m at equator) + +incremental: true # skip already-processed files +``` + +### site/.env + +``` +BINCIO_DATA_DIR=/path/to/bincio_data +``` + +## The BincioActivity Schema (BAS) + +The data store is a directory of plain JSON files: + +``` +bincio_data/ + index.json ← activity feed + owner manifest + activities/ + {id}.json ← full activity with timeseries + {id}.geojson ← simplified GPS track +``` + +See `SCHEMA.md` for the full specification. The schema is versioned and +published as a standalone document so anyone can write importers in any +language. + +## Federation + +Add a friend's `index.json` URL to your `site_config.yaml`: + +```yaml +data_sources: + - type: local + path: ~/bincio_data + - type: remote + handle: alice + url: https://alice.github.io/bincio/index.json +``` + +At build time the renderer fetches their public data and renders it under +`/friends/alice/`. 
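As a rough illustration of that build-time step, the sketch below picks a route prefix for each data source and resolves a friend's relative BAS paths against their `index.json` URL. The names `DataSource`, `routePrefix`, and `resolveRemote` are hypothetical, serving local data from `/` is an assumption, and the actual renderer may work differently.

```typescript
// Hypothetical types mirroring the data_sources entries in site_config.yaml.
type DataSource =
  | { type: 'local'; path: string }
  | { type: 'remote'; handle: string; url: string };

// Remote sources render under /friends/{handle}/; local data is assumed to live at the root.
function routePrefix(src: DataSource): string {
  return src.type === 'remote' ? `/friends/${src.handle}/` : '/';
}

// BAS paths such as "activities/{id}.json" are relative to index.json,
// so they can be resolved with the standard URL constructor.
function resolveRemote(indexUrl: string, relativePath: string): string {
  return new URL(relativePath, indexUrl).toString();
}

const alice: DataSource = {
  type: 'remote',
  handle: 'alice',
  url: 'https://alice.github.io/bincio/index.json',
};

const prefix = routePrefix(alice);
const detail = resolveRemote(alice.url, 'activities/example-id.json');
```

Resolving against the index URL rather than the site origin keeps federation working when a friend hosts their data under a subdirectory, as on a GitHub Pages project site.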
+ +## Privacy + +Privacy is enforced at the data layer — activities never leave your control: + +| Level | GPS track | Stats visible | +|---|---|---| +| `public` | Full track | Yes | +| `blur_start` | First/last 200 m removed | Yes | +| `no_gps` | Not published | Yes | +| `private` | Not published | Not in index | + +## Tech stack + +| Layer | Technology | +|---|---| +| Extract | Python 3.12, click, fitdecode, gpxpy, lxml, rdp | +| Site framework | Astro (static generation) | +| UI components | Svelte 5 | +| Styling | Tailwind CSS | +| Charts | Observable Plot | +| Maps | MapLibre GL + OpenFreeMap tiles | +| Package manager (Python) | uv | +| Package manager (Node) | npm | + +## Development + +```bash +# Python +uv sync +uv run pytest +uv run bincio --help + +# Site +cd site && npm install +BINCIO_DATA_DIR=/tmp/bincio_test npm run dev +``` diff --git a/bincio/extract/simplify.py b/bincio/extract/simplify.py index d4f9206..e4dac55 100644 --- a/bincio/extract/simplify.py +++ b/bincio/extract/simplify.py @@ -25,6 +25,33 @@ def simplify_track( return [p for (p, _, _), keep in zip(gps_pts, mask) if keep] +def preview_coords( + points: list[DataPoint], + max_points: int = 20, +) -> list[list[float]] | None: + """Return a small list of [lat, lon] pairs for card thumbnail rendering. + + Uses a coarser RDP pass, then subsamples to at most max_points. + Returns None if there is no GPS data. 
+ """ + gps = [(p.lat, p.lon) for p in points if p.lat is not None and p.lon is not None] + if len(gps) < 2: + return None + + # Coarse RDP (larger epsilon = fewer points) + coords = [[lon, lat] for lat, lon in gps] + mask = rdp(coords, epsilon=0.001, return_mask=True) + reduced = [gps[i] for i, keep in enumerate(mask) if keep] + + # Subsample if still too many + if len(reduced) > max_points: + step = (len(reduced) - 1) / max(max_points - 1, 1) + reduced = [reduced[round(i * step)] for i in range(max_points)] + # i=0 maps to the first point and i=max_points-1 to the last, so the cap holds + + return [[round(lat, 5), round(lon, 5)] for lat, lon in reduced] + + def build_geojson( points: list[DataPoint], activity_id: str, diff --git a/bincio/extract/writer.py b/bincio/extract/writer.py index 4376480..8295ceb 100644 --- a/bincio/extract/writer.py +++ b/bincio/extract/writer.py @@ -7,7 +7,7 @@ from pathlib import Path from bincio.extract.metrics import ComputedMetrics from bincio.extract.models import LapData, ParsedActivity -from bincio.extract.simplify import build_geojson +from bincio.extract.simplify import build_geojson, preview_coords from bincio.extract.timeseries import build_timeseries @@ -119,6 +119,8 @@ def build_summary( "privacy": privacy, "detail_url": f"activities/{activity_id}.json", "track_url": f"activities/{activity_id}.geojson" if has_gps else None, + # Small track preview for card thumbnails — no separate fetch needed + "preview_coords": preview_coords(activity.points) if has_gps else None, } diff --git a/bincio/render/cli.py b/bincio/render/cli.py index 16f8ca3..49cbb87 100644 --- a/bincio/render/cli.py +++ b/bincio/render/cli.py @@ -1,4 +1,10 @@ -"""bincio render — CLI command (stub, Astro stage TBD).""" +"""bincio render — build or serve the Astro static site.""" + +import os +import subprocess +import sys +from pathlib import Path +from typing import Optional import click from rich.console import Console @@ -6,13 +12,139 @@ from rich.console import Console console = Console() +def 
_find_site_dir(explicit: Optional[str]) -> Path: + """Locate the Astro project directory.""" + if explicit: + p = Path(explicit).expanduser().resolve() + if not (p / "package.json").exists(): + raise click.UsageError(f"No package.json found in --site-dir {p}") + return p + + # Search upward from cwd: ./site, ../site (for when cwd is bincio_data/) + for candidate in [Path.cwd() / "site", Path.cwd().parent / "site"]: + if (candidate / "package.json").exists(): + return candidate + + raise click.UsageError( + "Could not find the Astro site directory. " + "Run from the project root or pass --site-dir." + ) + + +def _find_data_dir(explicit: Optional[str], config_path: Optional[str]) -> Path: + """Resolve the BAS data directory.""" + if explicit: + return Path(explicit).expanduser().resolve() + + if config_path and Path(config_path).exists(): + import yaml + raw = yaml.safe_load(Path(config_path).read_text()) + out = raw.get("output", {}).get("dir") + if out: + return Path(out).expanduser().resolve() + + # Default: ./bincio_data next to cwd + default = Path.cwd() / "bincio_data" + if default.exists(): + return default + + raise click.UsageError( + "Could not find the BAS data directory. " + "Run `bincio extract` first, or pass --data-dir." + ) + + +def _ensure_npm(site: Path) -> None: + """Run `npm install` if node_modules is missing or stale.""" + if not (site / "node_modules").exists(): + console.print("Running [cyan]npm install[/cyan]…") + subprocess.run(["npm", "install"], cwd=site, check=True) + + +def _link_data(site: Path, data: Path) -> None: + """Symlink the BAS data store into site/public/data.""" + public_data = site / "public" / "data" + if public_data.is_symlink(): + if public_data.resolve() == data: + return # already correct + public_data.unlink() + elif public_data.exists(): + console.print( + f"[yellow]Warning:[/yellow] {public_data} exists and is not a symlink — " + "remove it manually if you want bincio to manage it." 
+ ) + return + public_data.symlink_to(data) + console.print(f"Linked data: [cyan]{data}[/cyan] → [cyan]{public_data}[/cyan]") + + @click.command() -@click.option("--config", "config_path", default="site_config.yaml") -@click.option("--out", "out_dir", default="./site/dist") -@click.option("--serve", is_flag=True, help="Start dev server with hot reload.") +@click.option("--config", "config_path", default=None, + help="Path to extract_config.yaml (reads output.dir from it).") +@click.option("--data-dir", default=None, + help="BAS data store directory (output of bincio extract).") +@click.option("--site-dir", default=None, + help="Astro project directory (default: ./site).") +@click.option("--out", "out_dir", default=None, + help="Build output directory (default: site/dist).") +@click.option("--serve", is_flag=True, + help="Start dev server with hot reload instead of building.") @click.option("--deploy", default=None, metavar="TARGET", - help="Deploy target: 'github'.") -def render(config_path: str, out_dir: str, serve: bool, deploy: str | None) -> None: - """Generate static site from BAS data store (Astro stage — coming soon).""" - console.print("[yellow]bincio render is not yet implemented.[/yellow]") - console.print("The web renderer (Astro + MapLibre + Observable Plot) is next.") + help="Deploy after build. 
Currently supports: github.") +def render( + config_path: Optional[str], + data_dir: Optional[str], + site_dir: Optional[str], + out_dir: Optional[str], + serve: bool, + deploy: Optional[str], +) -> None: + """Build (or serve) the BincioActivity static site from a BAS data store.""" + + site = _find_site_dir(site_dir) + data = _find_data_dir(data_dir, config_path) + + console.print(f"Site: [cyan]{site}[/cyan]") + console.print(f"Data: [cyan]{data}[/cyan]") + + _ensure_npm(site) + _link_data(site, data) + + env = {**os.environ, "BINCIO_DATA_DIR": str(data)} + + if serve: + console.print("Starting [cyan]astro dev[/cyan]…") + subprocess.run(["npm", "run", "dev"], cwd=site, env=env) + return + + # Build + cmd = ["npm", "run", "build"] + if out_dir: + # Pass outDir via Astro CLI flag + cmd = ["npx", "astro", "build", "--outDir", str(Path(out_dir).resolve())] + + console.print("Running [cyan]astro build[/cyan]…") + result = subprocess.run(cmd, cwd=site, env=env) + if result.returncode != 0: + console.print("[red]Build failed.[/red]") + sys.exit(result.returncode) + + dist = Path(out_dir).resolve() if out_dir else site / "dist" + console.print(f"\n[green]Build complete.[/green] Output: [cyan]{dist}[/cyan]") + + if deploy == "github": + _deploy_github(site, dist) + + +def _deploy_github(site: Path, dist: Path) -> None: + """Push dist/ to the gh-pages branch.""" + console.print("Deploying to [cyan]GitHub Pages[/cyan]…") + # Requires npx gh-pages or git subtree push + result = subprocess.run( + ["npx", "gh-pages", "-d", str(dist)], + cwd=site, + ) + if result.returncode != 0: + console.print( + "[yellow]Tip:[/yellow] install gh-pages with `npm install -g gh-pages`" + ) diff --git a/extract_config.yaml b/extract_config.yaml new file mode 100644 index 0000000..2f608bc --- /dev/null +++ b/extract_config.yaml @@ -0,0 +1,32 @@ +owner: + handle: brutsalvadi + display_name: Bru + +input: + dirs: + - ~/src/cycling_data_davide/activities + - ~/src/cycling_data_davide/Karoo_2026 + - 
~/src/cycling_data_davide/Karoo + # Strava bulk export metadata — provides names, descriptions, gear + metadata_csv: ~/src/cycling_data_davide/activities.csv + +output: + dir: ~/bincio_data + +default_privacy: public + +sensors: + heart_rate: true + cadence: true + temperature: true + power: true + +track: + simplify: rdp + rdp_epsilon: 0.0001 # ~11m at equator + timeseries_hz: 1 # 1 sample/second max + +classifier: + enabled: false # ML activity type classifier (requires scikit-learn extra) + +incremental: true # skip files whose hash hasn't changed since last run diff --git a/site/astro.config.mjs b/site/astro.config.mjs index 6af1db6..8190b24 100644 --- a/site/astro.config.mjs +++ b/site/astro.config.mjs @@ -7,4 +7,11 @@ export default defineConfig({ output: "static", // When hosting at a subdirectory (e.g. GitHub Pages project site), set: // base: "/repo-name", + vite: { + optimizeDeps: { + include: ['maplibre-gl'], + esbuildOptions: { target: 'es2022' }, + }, + build: { target: 'es2022' }, + }, }); diff --git a/site/src/components/ActivityCharts.svelte b/site/src/components/ActivityCharts.svelte new file mode 100644 index 0000000..8839a2b --- /dev/null +++ b/site/src/components/ActivityCharts.svelte @@ -0,0 +1,163 @@ + + + +
{detail.description}
+ {/if} +{s.value}
+{s.label}
+{detail.gear}
+Gear
+{error}
+{:else if detail?.timeseries && detail.timeseries.t.length > 0} +| Lap | +Distance | +Time | +Avg speed | +Avg HR | +
|---|---|---|---|---|
| #{lap.index + 1} | +{formatDistance(lap.distance_m)} | +{formatDuration(lap.duration_s)} | +{formatSpeed(lap.avg_speed_kmh)} | +{lap.avg_hr_bpm ? `${lap.avg_hr_bpm} bpm` : '—'} | +
Could not load activities: {error}
+{:else if filtered.length === 0} +No activities found.
+{:else} + + + {#if hasMore} +{year}
+{formatDistance(t?.dist ?? 0)}
+{t?.count ?? 0} activities
+