# Changelog
## [Unreleased] — 2026-03-31
### Security fixes
- **Path traversal prevention** (`edit/server.py`) — all routes now validate `activity_id` against `[a-zA-Z0-9\-]+` regex via `_check_id()`; invalid IDs return 400
- **Path traversal in `delete_image`** — `filename` parameter now stripped to basename via `Path(filename).name` before use in filesystem paths
- **Path traversal in `upload_activity`** — uploaded `file.filename` stripped to basename via `Path(file.filename).name`
- **XSS in activity description** (`ActivityDetail.svelte`) — `marked()` output now wrapped in `DOMPurify.sanitize()` before `{@html}` rendering
- **CORS restricted** (`edit/server.py`) — `allow_origins=["*"]` replaced with `allow_origin_regex` matching `localhost` origins only
- **YAML injection in `hide_stats`** — values filtered against a `STAT_PANELS` allowlist before writing to YAML frontmatter
- **Regex injection in `deleteImage`** (`EditDrawer.svelte`) — filename special characters escaped before `RegExp` construction
### Bug fixes — data
- **MMP sliding window on non-contiguous data** (`metrics.py`) — power series now built as a dense 1 Hz array with gaps zero-filled (standard GoldenCheetah/WKO approach); recording pauses no longer inflate MMP values
- **Best-effort times on non-contiguous data** (`metrics.py`) — speed series uses same zero-fill; pauses count as 0 km/h so windows cannot span them silently
- **Activity ID collision** (`writer.py`) — when two activities share the same start-time + title, the second is disambiguated with a 6-character source hash suffix; re-extracting the same file is idempotent
- **Misaligned lat/lon arrays** (`ActivityMap.svelte`) — lat and lon were filtered for nulls independently; now filtered as pairs so indices always stay aligned
- **Falsy `0.0` speed check** (`metrics.py:89-90`, `parsers/fit.py:89`) — `if avg_speed_kmh` / `if speed_raw` replaced with `is not None`; 0.0 is no longer silently dropped
- **TCX timestamps with numeric timezone offsets** (`parsers/tcx.py`) — `+02:00`-style offsets now parsed correctly and converted to UTC; previously crashed with `ValueError`
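
The zero-fill fix for MMP and best-effort windows can be illustrated with a short sketch. The names `densify` and `best_mean` are hypothetical and the input shape (a mapping from elapsed second to value) is an assumption; the code in `metrics.py` may be organised differently.

```python
from __future__ import annotations


def densify(samples: dict[int, float], fill: float = 0.0) -> list[float]:
    """Build a dense 1 Hz series from {elapsed_second: value}, zero-filling gaps."""
    if not samples:
        return []
    start, end = min(samples), max(samples)
    return [samples.get(t, fill) for t in range(start, end + 1)]


def best_mean(series: list[float], window_s: int) -> float | None:
    """Highest mean over any run of window_s consecutive seconds, in O(n)."""
    if window_s <= 0 or len(series) < window_s:
        return None
    total = sum(series[:window_s])
    best = total
    for i in range(window_s, len(series)):
        # Slide the window one second: add the new sample, drop the oldest.
        total += series[i] - series[i - window_s]
        best = max(best, total)
    return best / window_s
```

With samples at seconds 0, 1, 5 and 6 (a three-second pause in between), a 4 s window over the dense series averages 150 W, whereas packing the four recorded samples back to back would report 300 W.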
### Bug fixes — frontend
- **Backdrop dismiss fires `saved` event** (`EditDrawer.svelte`) — backdrop click and ×-button now dispatch `close` instead of `saved`, preventing unsaved data from overwriting the displayed title/description
- **No error handling in `uploadImages`** (`EditDrawer.svelte`) — wrapped upload loop in try/catch/finally so a network error clears the `uploading` spinner and surfaces an error message instead of locking the UI
- **Stats page pagination** (`StatsView.svelte`) — heatmap now shows 4 years per page with ← Newer / Older → controls; `?page=` persisted in URL
### Schema
- **Writer output now matches schema** (`bas-v1.schema.json`) — `mmp`, `best_efforts`, `best_climb_m`, `preview_coords`, and `custom` are all declared in the schema; previously `additionalProperties: false` caused validation failures
- **`skiing` added to sport enum** — was produced by the extractor but missing from the schema definition
- **Sub-sport enum extended** — `nordic`, `alpine`, `open_water`, `pool` added to schema
- **Activity ID format corrected in SCHEMA.md** — examples updated from `+0200` offset to `Z` UTC suffix (matching actual code behaviour since v0.1.0)
### Navigation
- **URL state persistence** — filter and tab state is now stored in the URL query string so the browser back button always restores the exact view you left
- Activity feed (`/`): `?sport=cycling` — sport filter survives back navigation
- Stats page (`/stats/`): `?sport=cycling` — same
- Athlete page (`/athlete/`): `?tab=records` — active tab survives back navigation
- Records tab (`/athlete/?tab=records`): `?sport=cycling` — sport filter within records also persisted; full URL example: `/athlete/?tab=records&sport=cycling`
- All use `history.replaceState` (not `pushState`) so clicking filters does not pollute the history stack — back always goes to the previous *page*, not the previous filter state
- Default values are omitted from the URL for cleanliness (`sport=all` and the default tab are never written)
### Sport classification
- **Sub-sport detection** — `normalise_sub_sport()` in `sport.py` infers sub_sport from raw sport type strings
- CamelCase Strava types handled correctly (`MountainBikeRide` → `cycling / mountain`, `GravelRide` → `cycling / gravel`, `AlpineSki` → `skiing / alpine`, `NordicSki` → `skiing / nordic`, etc.)
- All parsers (Strava importer, GPX, TCX) now populate `sub_sport`; FIT parser was already correct
- Sub-sport shown as a secondary pill on activity detail page: **🚴 Cycling** + **MTB**
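
A rough sketch of how CamelCase Strava types might be mapped to a sport and sub-sport pair. The table holds only the four mappings named above, and the `classify` function and its key normalisation are hypothetical; the actual signature and lookup in `normalise_sub_sport()` are not shown in this changelog.

```python
from __future__ import annotations

# Only the four mappings named in the changelog; the real table is larger.
_STRAVA_TYPES: dict[str, tuple[str, str]] = {
    "mountainbikeride": ("cycling", "mountain"),
    "gravelride": ("cycling", "gravel"),
    "alpineski": ("skiing", "alpine"),
    "nordicski": ("skiing", "nordic"),
}


def classify(raw_type: str) -> tuple[str, str] | None:
    """Return (sport, sub_sport) for a raw type string, or None if unknown."""
    # Lower-casing makes CamelCase and snake_case spellings hit the same key.
    key = raw_type.strip().lower().replace("_", "").replace(" ", "")
    return _STRAVA_TYPES.get(key)
```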
### Developer experience
- **`--dev N` flag** on `bincio extract` — samples N files evenly across the full file list (date + format diversity) and writes to `/tmp/bincio_dev/`; `incremental` is disabled automatically
- **`--dev N` flag** on `bincio import strava` — imports only the N most recent activities to `/tmp/bincio_dev/`
- Dev loop: `bincio extract --dev 50 && bincio import strava --dev 50 && bincio render --serve --data-dir /tmp/bincio_dev`
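
One way to sample N files evenly across a list, as `--dev N` is described above. This is a sketch under the assumption that the file list is already sorted; `sample_evenly` is a hypothetical name and the selection rule used by `bincio extract` may differ.

```python
from __future__ import annotations

from typing import Sequence, TypeVar

T = TypeVar("T")


def sample_evenly(items: Sequence[T], n: int) -> list[T]:
    """Pick n items spread across the whole sequence, always keeping both ends."""
    if n <= 0:
        return []
    if n >= len(items):
        return list(items)
    if n == 1:
        return [items[0]]
    last = len(items) - 1
    # Integer arithmetic avoids float rounding and guarantees distinct indices,
    # because consecutive picks are more than one position apart.
    return [items[i * last // (n - 1)] for i in range(n)]
```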
### Data ingestion
- **`bincio import strava`** — OAuth2 Strava importer (`bincio/import_/strava.py` + `bincio/import_/cli.py`)
- One-shot local OAuth2 callback server (port 8976); opens browser, receives code, exchanges for tokens
- Tokens saved to `~/.config/bincio/strava.json`; auto-refreshed on expiry (6h TTL)
- Fetches paginated activity list with `after=` timestamp for efficient incremental runs
- Per activity: `GET /activities/{id}/streams` → `_strava_to_parsed()` → `compute()` → `write_activity()`
- `_patch_from_summary()`: fills `None` metrics from Strava summary when sensors are missing (manual entries, indoor rides)
- Sync state persisted in `data_dir/_strava_sync.json` (imported IDs + last sync timestamp)
- Rate limit tracking via `X-RateLimit-Usage`; warns at 85% of 15-min window; auto-retries on 429
- Credentials read from (in order): CLI flags → env vars → `extract_config.yaml` under `import.strava`
- Install: `uv sync --extra strava`
- **Web file upload** — `POST /api/upload` in `bincio/edit/server.py`
- Accepts FIT/GPX/TCX (`.gz` variants too); 409 if activity already exists
- Runs full extract pipeline inline: `parse_file()` → `compute()` → `write_activity()` → `merge_all()`
- Staged to `data_dir/_uploads/` during processing; cleaned up in `finally`
- `↑` button in site nav, gated behind `PUBLIC_EDIT_URL`; drag-and-drop modal; auto-redirects on success
- **`extract_config.yaml` is now gitignored** — safe to store credentials under `import.strava`
- `StravaConfig` dataclass added to `bincio/extract/config.py`; parsed from `import.strava:` block
- `extract_config.example.yaml` is the tracked template
- **Theme-aware heatmap** (`StatsView.svelte`) — `applyIntensity()` now lerps from the correct
background colour in both dark (zinc-800 `#27272a`) and light (zinc-200 `#e4e4e7`) modes;
`emptyColor` and `baseRgb` reactive to `data-theme` via `MutationObserver`
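
The Strava rate-limit warning listed under `bincio import strava` above could be computed roughly as follows. The changelog names only `X-RateLimit-Usage`; pairing it with `X-RateLimit-Limit`, reading the first comma-separated field as the 15-minute window, and the function names are assumptions of this sketch.

```python
from __future__ import annotations

from typing import Mapping

WARN_FRACTION = 0.85  # threshold quoted in the changelog


def short_window_usage(headers: Mapping[str, str]) -> float | None:
    """Fraction of the 15-minute quota used, or None if headers are unusable."""
    try:
        # Both headers are assumed to look like "<15-min>,<daily>".
        used = int(headers["X-RateLimit-Usage"].split(",")[0])
        limit = int(headers["X-RateLimit-Limit"].split(",")[0])
    except (KeyError, ValueError):
        return None
    if limit <= 0:
        return None
    return used / limit


def should_warn(headers: Mapping[str, str]) -> bool:
    """True once usage reaches the warning threshold."""
    fraction = short_window_usage(headers)
    return fraction is not None and fraction >= WARN_FRACTION
```

A plain `dict` lookup is case-sensitive, so an HTTP client's case-insensitive header mapping would be the safer input in practice.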
### Athlete page
- **`/athlete` page** — three-tab layout: Power Curve · Records · Profile
- **Mean Maximal Power (MMP) curve** — computed at extract time for each activity with power data
- Sliding-window O(n) algorithm over 1 Hz power timeseries; 15 standard durations (1 s → 1 h)
- Multi-curve overlay with range selector: All time / Last 365 d / Last 90 d / user-defined seasons
- Log-scale x-axis via Observable Plot; FTP reference line; per-point tooltips
- Seasons configurable in `extract_config.yaml` under `athlete.seasons`
- **Personal records (Records tab)** — sport-specific best efforts computed via sliding window
- Running: 400 m, 1 km, 1 mile, 5 km, 10 km, half marathon, marathon
- Cycling: 5 km, 10 km, 20 km, 50 km, 100 km
- Swimming: 100 m, 200 m, 500 m, 1 km, 2 km
- Table shows time, pace (running) or speed (cycling/swimming), date, activity link
- Hiking / Walking: longest distance and most elevation gain
- **Best climbs** — top 10 biggest single climbs (Kadane's algorithm on 1 Hz elevation deltas); ranked table with elevation, date, activity link
- **Profile tab** — max HR, FTP, HR zones, power zones
- **`bincio edit` athlete API** (`GET /api/athlete`, `POST /api/athlete`) — reads/writes `edits/athlete.yaml`
- **`AthleteDrawer.svelte`** — slide-in profile editor (gated behind `PUBLIC_EDIT_URL`)
- Max HR and FTP number inputs
- HR and power zone tables: changing a zone's upper bound auto-cascades to the next zone's lower bound
- Season list: name + date range, add/remove rows
- **`athlete.json`** — written at extract time; contains pre-aggregated MMP curves and records; symlinked into `_merged/` by `merge_all()`
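
The best-climb entry above relies on Kadane's algorithm over elevation deltas. A minimal sketch, assuming a plain list of 1 Hz elevation samples in metres; `best_climb` is an illustrative name and not necessarily how `_best_climb()` is written.

```python
from __future__ import annotations


def best_climb(elevation_m: list[float]) -> float:
    """Largest net elevation gain over any contiguous stretch of the track."""
    best = 0.0
    running = 0.0
    for prev, cur in zip(elevation_m, elevation_m[1:]):
        # Kadane's step: extend the current climb, or restart at zero once
        # accumulated descent has cancelled it out.
        running = max(0.0, running + (cur - prev))
        best = max(best, running)
    return best
```

For the profile 100, 110, 105, 130, 90 the result is 30 m: the small dip to 105 does not end the climb because the running sum stays positive.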
### Extraction pipeline
- **MMP computation** — `compute_mmp()` added to `metrics.py`; stored in both detail JSON and index summary (enables client-side season filtering without extra fetches)
- **Best-effort computation** — `compute_best_efforts()` two-pointer sliding window on 1 Hz speed; `_best_climb()` Kadane's on elevation deltas
- **`write_athlete_json()`** — aggregates MMP and records from all summaries into `athlete.json`
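
The two-pointer sliding window mentioned for best efforts can be sketched as below. It assumes a dense 1 Hz speed series in km/h with non-negative values; the function name and return convention are hypothetical rather than taken from `compute_best_efforts()`.

```python
from __future__ import annotations


def best_effort_seconds(speed_kmh: list[float], target_m: float) -> int | None:
    """Shortest time in seconds in which target_m metres were covered."""
    # cum[i] is the distance covered during the first i seconds.
    cum = [0.0]
    for v in speed_kmh:
        cum.append(cum[-1] + v / 3.6)  # km/h to metres per second

    best = None
    start = 0
    for end in range(1, len(cum)):
        # Advance the left pointer while the window still covers the target.
        # Distance never decreases, so start never has to move backwards.
        while start + 1 < end and cum[end] - cum[start + 1] >= target_m:
            start += 1
        if cum[end] - cum[start] >= target_m:
            elapsed = end - start
            if best is None or elapsed < best:
                best = elapsed
    return best
```

Because pauses appear as zero-speed seconds in the dense series, a window that spans a pause is charged for the paused time rather than skipping over it.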
### Scripts
- **`scripts/backfill.py`** — backfills `mmp`, `best_efforts`, and `best_climb_m` into existing activity JSONs from already-extracted 1 Hz timeseries; no FIT re-parsing needed (~20 s for 2500 activities)
---
## [0.1.0] — 2026-03-29
### Extraction pipeline
- **Parallel extraction** — activities now processed with `ProcessPoolExecutor`; large shared state (Strava lookup, known hashes) sent once per worker via `initializer=` rather than once per task
- **TCX parser fixes** — handles both `http://` and `https://` Garmin namespace URIs
- **Sport classification overhaul**
- FIT parser now reads sport from the `session` frame as fallback when no separate `sport` frame is present (fixes Karoo and Strava-generated FIT files)
- Strava CSV `Activity Type` used as authoritative override when present
- Expanded sport mapping: e-bike variants (`ebikeride`, `e_bike_ride`), `ride`, `run`, date-prefix stripping, and more
- Skiing added as first-class sport: `cycling` | `running` | `hiking` | `walking` | `swimming` | `skiing` | `other`
- Nordic sub-sport: FIT sub_sport values `cross_country_skiing`, `nordic_skiing`, `skate_skiing`, `backcountry_skiing` → `"nordic"`
- **Distance calculation fix** — when a FIT device records `distance = 0.0` (not `null`), the extractor now falls back to haversine-computed GPS distance instead of using the zero value directly; fixes skiing activities that had valid tracks and speeds but showed 0 km
- **`metadata_csv` is fully optional** — omitting it from config works cleanly; only needed for Strava bulk exports
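
The distance fallback described above can be sketched with a standard haversine sum. The function names, the `(lat, lon)` tuple layout, and the mean Earth radius constant are assumptions; only the behaviour (treat a recorded `0.0` like a missing value and use GPS distance instead) comes from the changelog.

```python
from __future__ import annotations

import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, a common approximation


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def track_distance_m(points: list[tuple[float, float]]) -> float:
    """Sum of haversine distances between consecutive (lat, lon) points."""
    return sum(haversine_m(*a, *b) for a, b in zip(points, points[1:]))


def resolve_distance_m(device_m: float | None, points: list[tuple[float, float]]) -> float:
    """Prefer the device distance; fall back to GPS when it is None or 0.0."""
    if device_m is not None and device_m > 0:
        return device_m
    return track_distance_m(points)
```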
### Site — maps & charts
- **MapLibre GL map** fully working on the activity detail page
- Static import + `optimizeDeps.include` (not `exclude`) fixes silent tile worker failure
- `build.target: 'es2022'` required for MapLibre's ES2022 class field syntax
- MapLibre v5 requires explicit `center`/`zoom` in Map constructor and `setLngLat()` before `addTo()`
- **Observable Plot charts** (elevation, speed, HR, cadence) working
- Switched from dynamic `await import()` to static import — fixes unreliable Svelte reactivity
- Curve name is `"monotone-x"` not `"monotoneX"`
- **Power chart** added as fifth tab alongside elevation/speed/HR/cadence
- **HR and power zone histograms** — configurable zone boundaries via `athlete.hr_zones` / `athlete.power_zones` in `extract_config.yaml`; histogram x-axis capped at actual data max so sentinel values (`999`, `9999`) don't stretch the axis
- **Adjustable trim range** on histograms
### Site — activity feed
- **SVG track thumbnails** on feed cards — drawn from `preview_coords` (no extra fetch)
- **Sport filter bar** — pill buttons for All / Cycling / Running / Hiking / Walking / Swimming / Skiing / Other
### Site — stats page
- **Sport filter bar** — same pill UI as the feed; all stats and heatmap reflect the selected sport
- **Heatmap colour improvements**
- Blended colours in "All" mode: each cell's RGB is a weighted average of sport colours by distance
- Percentile-based intensity scaling (active): each day ranked against all active days, spreading colour evenly regardless of km outliers; configurable back to linear/max-relative (documented in CLAUDE.md)
- `applyIntensity()` lerps from zinc-800 background to full sport colour — dim cells fade into the background rather than going black
- `$: cellColors` precomputed as a reactive `Map<string, string>` — fixes Svelte not re-rendering cells when filter changes
- **Month label fix** — labels embedded in the week-column flex grid (no more absolute-positioning bugs); `getWeeks()` uses local date formatting (`localISO()`) instead of `toISOString()` to avoid UTC/local mismatch that produced a spurious "Dec" label at column 0
- **Cell tooltips** — hovering a cell shows a floating card with date, and for each activity: name, sport, distance, duration; each activity is a clickable link to its detail page; 120 ms grace period when moving from cell to tooltip
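
The percentile scaling and background lerp described for the heatmap can be illustrated in Python, although the real `applyIntensity()` lives in `StatsView.svelte`. The exact ranking formula, the function names, and the data shapes below are assumptions made for this sketch.

```python
from __future__ import annotations

from bisect import bisect_right

ZINC_800 = (0x27, 0x27, 0x2A)  # dark-mode background, #27272a


def percentile_intensity(day_km: dict[str, float]) -> dict[str, float]:
    """Map each active day to a rank-based intensity in (0, 1]."""
    active = sorted(km for km in day_km.values() if km > 0)
    out: dict[str, float] = {}
    for day, km in day_km.items():
        if km <= 0:
            continue  # inactive days keep the empty-cell colour
        # Share of active days whose distance is at most this day's distance.
        out[day] = bisect_right(active, km) / len(active)
    return out


def lerp_rgb(bg: tuple[int, int, int], fg: tuple[int, int, int], t: float) -> tuple[int, ...]:
    """Blend from bg (t=0) to fg (t=1), clamping t to [0, 1]."""
    t = min(1.0, max(0.0, t))
    return tuple(round(b + (f - b) * t) for b, f in zip(bg, fg))
```

Ranking by percentile means a single 200 km day does not push every other day toward the background colour, which is the outlier problem that linear scaling has.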
### Site — activity editing (`bincio edit`)
- **`bincio edit` write API** — FastAPI server (`--data-dir`, default port 4041)
- `GET /api/activity/{id}` — current values with sidecar overrides applied
- `POST /api/activity/{id}` — writes sidecar `.md`, triggers `merge_all()`
- `POST /api/activity/{id}/images` — multipart image upload
- `DELETE /api/activity/{id}/images/{filename}`
- **Activity sidecar system** (`bincio/render/merge.py`)
- Sidecars live in `edits/` alongside extracted data (never co-mingled with immutable BAS JSON)
- Fields: `title`, `sport`, `description`, `hide_stats`, `highlight`, `private`, `gear`
- `merge_all()` produces `_merged/` output; `public/data` → `_merged/` at runtime
- **`EditDrawer.svelte`** — slide-in drawer in the Astro site (no separate HTML from the server)
- Opens in-page via Edit button; only rendered when `PUBLIC_EDIT_URL` env var is set
- Title, sport dropdown, gear, markdown description textarea
- Image drag-and-drop with chip list + delete
- Hide-stats toggle buttons (elevation, speed, heart_rate, cadence, power)
- Highlight and private flags
- Optimistic local update on save — title and description update immediately without reload
- **Photo gallery + lightbox** on activity detail page — keyboard navigation (←/→/Esc), filename + counter overlay
- **Markdown descriptions** rendered with `marked`; local relative images suppressed from inline rendering (shown in gallery instead)
### Documentation
- **README** rewritten — philosophy statement front and centre, clear two-stage architecture diagram, quick start
- **CHEATSHEET.md** added — daily workflow, all CLI commands, config reference, privacy table, patching snippets, diagnostic scripts, key files table
- **CLAUDE.md** updated — MapLibre GL v5 gotchas, Observable Plot curve names, heatmap colour scaling approaches (linear vs percentile), sidecar/edit architecture decisions
- **`extract_config.example.yaml`** cleaned up — personal paths removed, `metadata_csv` commented out with explanation
### Infrastructure
- `publish.sh` — builds and pushes static site to GitHub Pages via orphan branch