Commit Graph

46 Commits

Author SHA1 Message Date
Davide Scaini 0b266d208c Strip pre-2000 leading points to prevent epoch-zero start time and absurd duration 2026-05-12 23:11:33 +02:00
Davide Scaini 867da767eb Add sub_sport editing to activity edit drawer 2026-05-12 23:01:12 +02:00
Davide Scaini 8f028101c7 Fix elevation gain inflation from device no-fix leading zeros
Apple Watch and similar devices record exactly 0.0 for elevation while
waiting for barometric/GPS lock, then jump to the real altitude. The
hysteresis accumulator was seeding from 0.0, counting the full jump as
ascent. Fix: detect a leading near-zero run followed by a large jump
and seed the accumulator from the first real value instead.

Applied in both _elevation() (fresh extractions) and
recalculate_elevation_hysteresis() (recompute path). Added a bulk
admin endpoint POST /api/admin/users/{handle}/recompute-elevation and
corresponding button to fix existing stored activities.
2026-05-10 16:21:24 +02:00
Davide Scaini c077fceba6 fix: validate activity file exists before treating cached hash as known
The dedup cache (.bincio_cache.json) persists source hashes across runs.
If the activities/ directory is wiped (--fresh in another session, or macOS
clearing /tmp after a restart) while the cache file survives at the user dir
level, re-running extract skips all files as "already extracted" and leaves
activities/ empty. Now only hashes whose corresponding .json file is present
on disk are treated as known, so missing files are always re-extracted.
2026-04-25 09:54:38 +02:00
Davide Scaini df496a017f fix: refine hysteresis recalculation with MA pre-smoothing and lower thresholds
- dem.py: pre-smooth elevation with 30s moving average before hysteresis
  in recalculate_elevation_hysteresis(); thresholds drop from 5m/10m to
  1m (barometric) / 3m (GPS) — accurate after noise is smoothed out
- dem.py: widen DEM median-filter window 45s → 60s
- dem.py: rename response key source → altitude_source for consistency
- writer.py: write altitude_source into detail JSON at extract time
- tests/test_dem.py: 21 unit tests for pure functions and file-level hysteresis
- tests/test_edit_server.py: 11 TestClient API tests for both recalculate endpoints
- add httpx as dev dependency (required by FastAPI TestClient)
2026-04-22 10:57:28 +02:00
Davide Scaini ebac3f50f4 fix: DEM elevation overcounting and add hysteresis-only recalculation button
- dem.py: apply 45s median filter before hysteresis to suppress SRTM
  tile-boundary steps that were accumulating through the 5m threshold;
  raise DEM hysteresis threshold from 5m to 10m
- dem.py: back up elevation_m as elevation_m_original in timeseries
  before the first DEM overwrite, so original sensor data is preserved
- dem.py: add recalculate_elevation_hysteresis() — recomputes gain/loss
  from original recorded elevation (reads elevation_m_original if a DEM
  run already replaced elevation_m) using source-aware thresholds
  (5m barometric, 10m GPS/unknown); does not touch the elevation array
- edit/server.py, serve/server.py: split /recalculate-elevation into
  two endpoints: /recalculate-elevation/dem and
  /recalculate-elevation/hysteresis
- EditDrawer.svelte: replace single DEM button with two side-by-side
  buttons — "Recalculate (hysteresis)" (fast, offline) and
  "Recalculate (DEM)" (SRTM lookup)
2026-04-20 21:41:23 +02:00
Davide Scaini 1940e2409b feat: DEM-based elevation recalculation via edit drawer button
Adds a "Recalculate from terrain map (DEM)" button to the activity edit
drawer. On click it queries an Open-Elevation-compatible API to replace
GPS altitude with SRTM terrain data, applies 5m hysteresis, and updates
the activity's elevation stats and timeseries chart in place.

- bincio/extract/dem.py: lookup_elevations() (batched HTTP POST) +
  recalculate_elevation() (subsample → DEM → interpolate → hysteresis →
  patch activity JSON, timeseries JSON, index.json)
- POST /api/activity/{id}/recalculate-elevation on both serve and edit
  servers; serve endpoint is auth-gated and triggers merge + rebuild
- --dem-url flag (also DEM_URL env var) on bincio serve and bincio edit;
  logged at startup; missing URL returns a clear 503 with setup instructions
- /api/me response gains dem_configured bool
- EditDrawer: button with loading state, shows new ↑/↓ values on success
2026-04-20 20:45:06 +02:00
Davide Scaini 872651f471 metrics: replace naive elevation accumulation with hysteresis dead-band
GPS jitter and barometric quantization noise caused systematic overestimation
of elevation gain — in extreme cases 100% of reported gain was sub-1m noise.

Implements source-aware hysteresis: elevation is only committed when it
deviates from the last committed value by ≥5m (barometric) or ≥10m (GPS/GPX/TCX).

- ParsedActivity gains `altitude_source` field ("barometric"/"gps"/"unknown")
- FIT parser sets "barometric" when enhanced_altitude is present, else "gps"
- GPX and TCX parsers always set "gps"
- metrics._elevation() uses the threshold matching the source
- 5 new parametric tests covering flat GPS noise, threshold differences, and real climbs
2026-04-20 20:29:20 +02:00
Davide Scaini cd1cdca33b extract: auto-detect gzip by magic bytes, not just .gz extension
Files compressed with gzip but named without .gz (e.g. activity.gpx
containing gzip data) now decompress transparently.
2026-04-16 18:49:01 +02:00
Davide Scaini 290eef6c72 metrics: guard against corrupted time streams causing OOM
Strava originals with absolute Unix timestamps stored as elapsed-second
offsets produce a t_max of ~1.6 billion. compute_mmp and compute_best_efforts
both create dense 1Hz arrays via range(t_min, t_max+1), which for a 1.6B
span allocates 44+ GB and OOM-kills the process. Add a >1-week sanity
check and return None early for corrupt streams.

Root cause: old Strava activities (seen from 1970-epoch start_date)
where the time stream contains absolute Unix timestamps instead of
elapsed seconds.
2026-04-15 14:06:20 +02:00
Davide Scaini e6bb6e61a2 fix elevation_gain_m null for modern Garmin FIT files; fix map flash
FIT parser: try enhanced_altitude before altitude. Barometric altimeters
on modern Garmins (Edge 540, 840, etc.) write enhanced_altitude in
record messages and total_ascent in lap messages. The old code read only
altitude, producing null elevation_m per point → null elevation_gain_m
at the activity root while laps had correct values from total_ascent.

ActivityMap: use preview_coords (passed from ActivitySummary) to
initialise the map at the activity's location on mount, eliminating the
flash of world-view before the async detail JSON / bbox arrives.
2026-04-13 19:18:37 +02:00
Davide Scaini 5ad3aee8f6 rename privacy "private" → "unlisted"; enable GPS for unlisted
- "unlisted" = not shown in the public feed, but GPS track, timeseries
  and detail JSON are all accessible by direct URL (security by obscurity)
- "private" accepted as legacy alias everywhere (backward compat with
  existing data on disk)
- New writes from Strava sync / ZIP upload / sidecar use "unlisted"
- Only "no_gps" now suppresses the GPS track
- isUnlisted() helper in format.ts used by all Svelte/Astro components
- SCHEMA.md and CLAUDE.md document the privacy model and the distinction
  between "unlisted" and "no_gps"
2026-04-13 18:49:20 +02:00
Davide Scaini f003fdd89f garmin sync first attempt 2026-04-12 15:36:21 +02:00
Davide Scaini 6c431e8821 Here's what was built and why each decision was made:
Key at data_dir.parent/.garmin_key — nginx serves location /data/ { alias /var/bincio/data/; } so
  anything inside that dir is reachable. The key lives one level up at /var/bincio/.garmin_key,
  outside nginx's reach.

  Two-layer storage — garmin_creds.json holds the encrypted email+password (needed for re-login when
  tokens expire); garmin_session/ holds the garth OAuth tokens in plain JSON (short-lived, not the
  user's actual password).

  test_login() — called by the connect endpoint before saving anything, so credentials are only
  persisted if they actually work.

  get_client() — tries the session first (fast, no network), falls back to full re-login
  transparently. The caller never needs to think about whether the session is fresh.
2026-04-12 15:12:20 +02:00
Davide Scaini 01db4eb9ae ingest activities.csv 2026-04-11 08:13:27 +02:00
Davide Scaini bc30e0a2fc option to keep all activities private from strava zip, fix copy of register link 2026-04-10 22:51:29 +02:00
Davide Scaini 3b8bc159c5 upload strava zip 2026-04-10 22:01:44 +02:00
Davide Scaini 3e4ff4019b limit number of workers 2026-04-10 18:13:49 +02:00
Davide Scaini cf414a08ad fix strava import? 2026-04-10 18:13:32 +02:00
Davide Scaini ae883a7dba fix: rebuild athlete.json on every ingest; remove bincio-extract references from UI 2026-04-10 15:47:50 +02:00
Davide Scaini 6d3673b2f7 1. Image upload size limit — _MAX_IMAGE_BYTES = 10 MB in both serve/server.py and edit/server.py
2. Image MIME type whitelist — _ALLOWED_IMAGE_TYPES blocks SVG XSS in both servers
  3. Filename collision safety — _unique_image_name() helper in both servers
  4. OAuth CSRF — state token generated in edit/server.py auth-url, stored in _oauth_states, validated and discarded in callback; strava_api.auth_url() accepts optional state param
  5. Error message leak — upload processing errors now return generic "Processing failed" instead of exception type/message
  6. Handle injection in subprocess — _trigger_rebuild now asserts handle matches _VALID_HANDLE before passing to subprocess
2026-04-10 13:56:39 +02:00
Davide Scaini 469a5954cc "keep data on the server" opt-in/out 2026-04-10 13:01:21 +02:00
Davide Scaini 084c652fdd fixing stuff after splitting jsons 2026-04-09 15:27:00 +02:00
Davide Scaini 8118f6f316 1 — Timeseries split
- writer.py: timeseries is now written to {id}.timeseries.json as a separate file. The detail JSON gets a timeseries_url field instead. finalize_pending and cleanup_pending handle the extra file.
  - merge.py (merge_one): symlinks the .timeseries.json file alongside the detail JSON. merge_all already handles it transparently (the .timeseries.json stem doesn't match any activity
  ID in to_merge, so it falls through to the symlink branch).
  - types.ts: timeseries is now timeseries?: Timeseries | null, and timeseries_url?: string | null added.
  - dataloader.ts: new loadTimeseries(url, detailUrl, base) function that resolves paths correctly in both single- and multi-user modes (uses the fetched detail URL's directory as the base).
  - ActivityDetail.svelte: loads timeseries separately after detail loads; uses detail.timeseries for IDB activities (embedded) or fetches via detail.timeseries_url for server activities. Charts show a pulse placeholder while loading.

 2 — GZip

  - GZipMiddleware (min 1 KB) added to both bincio/serve/server.py and bincio/edit/server.py — all API JSON responses are now gzip-compressed.
  - For static files (the big timeseries JSONs), nginx should be configured with gzip on; gzip_types application/json application/geo+json; — no code change needed on the server side.

  Net effect: opening an activity page now fetches ~1.4 KB (detail) instead of ~586 KB. The timeseries fetches ~60–150 KB gzip-compressed shortly after (it loads concurrently with the map rendering).
2026-04-09 14:01:02 +02:00
Davide Scaini 7dcb1e6dd0 refactor: extract/ingest facade, merge_one, deduplicate ops constants
- Add bincio/extract/ingest.py as a facade over the extract internals (ingest_parsed, strava_sync), reducing coupling from 6+ imports to one
  - Add merge_one() to merge.py — fast single-activity path for interactive edits (rewrites one file + index, skips full directory rebuild)
  - Rewrite edit/ops.py to delegate to the new facade; fix broken run_strava_sync return (was referencing undefined locals)
  - Remove duplicated SPORTS, STAT_PANELS, VALID_ACTIVITY_ID from edit/server.py — now imported from ops.py
2026-04-09 12:05:01 +02:00
Davide Scaini 98c42dc443 unify single user and multi user behaviour 2026-04-09 08:59:40 +02:00
Davide Scaini 5bf0f3636c local conversion 2026-04-06 22:25:57 +02:00
Davide Scaini 17f36889f3 sync strava data from web ui 2026-04-06 12:38:41 +02:00
Davide Scaini bd5831c2fd second pass. low 2026-04-01 19:00:28 +02:00
Davide Scaini 3d364c3992 second pass. medium 2026-04-01 11:05:00 +02:00
Davide Scaini 94369606a4 second pass at issues. critical ones. 2026-04-01 10:58:45 +02:00
Davide Scaini 81438231b4 fix low level issues 2026-03-31 23:22:12 +02:00
Davide Scaini 8f91503cf7 fix mid level issues. updated changelog 2026-03-31 23:00:39 +02:00
Davide Scaini f8abab2c23 fix high priority issues 2026-03-31 22:53:50 +02:00
Davide Scaini 77c30150b0 fix ride types subclasses (?) to be tested 2026-03-30 22:55:53 +02:00
Davide Scaini 877472e620 trying to get sub label showed properly 2026-03-30 20:09:01 +02:00
Davide Scaini d806072546 improve configs, update docs 2026-03-30 13:30:43 +02:00
Davide Scaini a6a81f9421 personal records tab into athlete page 2026-03-30 10:53:51 +02:00
Davide Scaini ec6175b143 athlete page first draft 2026-03-30 09:05:18 +02:00
Davide Scaini 4537273de9 get default hr and power zones from config file 2026-03-29 22:06:22 +02:00
Davide Scaini e71e8783ab added skiing 2026-03-29 10:51:26 +02:00
Davide Scaini fa4e91b645 fix distance calculation 2026-03-29 10:50:31 +02:00
Davide Scaini 643d092acd fix activities' types 2026-03-29 10:37:08 +02:00
Davide Scaini 3441079913 map now working 2026-03-28 19:34:22 +01:00
Davide Scaini 5d58126d2f parallelizing extraction, fix tcx files 2026-03-28 14:30:53 +01:00
Davide Scaini 38c5423aeb backend: initial commit 2026-03-28 13:59:36 +01:00