Commit Graph

155 Commits

Author SHA1 Message Date
Davide Scaini 872651f471 metrics: replace naive elevation accumulation with hysteresis dead-band
GPS jitter and barometric quantization noise caused systematic overestimation
of elevation gain — in extreme cases 100% of reported gain was sub-1m noise.

Implements source-aware hysteresis: elevation is only committed when it
deviates from the last committed value by ≥5m (barometric) or ≥10m (GPS/GPX/TCX).

- ParsedActivity gains `altitude_source` field ("barometric"/"gps"/"unknown")
- FIT parser sets "barometric" when enhanced_altitude is present, else "gps"
- GPX and TCX parsers always set "gps"
- metrics._elevation() uses the threshold matching the source
- 5 new parametric tests covering flat GPS noise, threshold differences, and real climbs
2026-04-20 20:29:20 +02:00
Davide Scaini 631e814d64 fix: load all year shards into combined feed, not just top 2 2026-04-20 16:14:40 +02:00
Davide Scaini db7047f210 perf: combined feed index for multi-user global feed
Instead of the browser resolving 20+ user shards recursively (~27 MB),
generate a pre-sorted feed.json at merge time with 50 activities per
page. The global feed loads one ~30 KB file on first paint; "Load more"
fetches subsequent pages (feed-2.json, feed-3.json, etc.).

Per-user profile pages still use year-sharded loadIndexPaged as before.
2026-04-20 15:31:35 +02:00
Davide Scaini cea1dbc2fb ops: fix data/ triple-duplication costing ~24 GB on VPS
astro build resolves the public/data symlink and copies all activity JSON
into dist/; rsync then copied that to the webroot — but nginx already serves
/data/ directly from /var/bincio/data/ via alias, so both copies were dead
weight. Freed 36 GB → 14 GB on the live server.

- post-receive hook: prune dist/data/ before rsync, add --exclude=data/
- docs: update manual rebuild command and nginx comment to match
- serve/server.py: _mb() now uses lstat() to count symlinks at face value
  rather than following them to targets, so admin storage panel no longer
  double-counts _merged/ (which is mostly symlinks into activities/)
2026-04-19 23:34:55 +02:00
Davide Scaini 5227b30456 fix: EditDrawer correctly reads and labels unlisted privacy
- serve/server.py GET adds private:bool to the response (true when
  privacy is "unlisted" or legacy "private") so EditDrawer can read it
- edit/server.py GET: same fix for the single-user edit server
- EditDrawer: fall back to d.privacy if d.private is absent; rename
  "Private" toggle label to "Unlisted"
2026-04-19 22:58:09 +02:00
Davide Scaini 8575a7015b fix: delete activity removes it from index.json; detail page uses lazy load
delete_activity now updates data_dir/index.json so merge_all no longer
re-adds the summary for a deleted activity, preventing the broken
"Activity not found" state after deletion.

ActivityDetailLoader switches from loadIndex (all year shards) to
loadIndexPaged (first year shard only) + direct file fallback, so
opening an activity detail page no longer downloads the entire history.
2026-04-19 22:31:20 +02:00
Davide Scaini cada2bcb03 perf: year-shard index.json to cut initial load from MBs to ~1 year
merge_all/_merged/index.json is now a shard manifest; activities are
split into index-{year}.json files. The feed loads only the most-recent
year on first paint (~200 activities instead of all of them). Older
years are fetched lazily when the user clicks "Load older activities".

Also strips best_efforts / best_climb_m / source from shard files —
these fields are aggregation inputs only, never read by the feed UI.
2026-04-19 22:21:10 +02:00
Davide Scaini cd1cdca33b extract: auto-detect gzip by magic bytes, not just .gz extension
Files compressed with gzip but named without .gz (e.g. activity.gpx
containing gzip data) now decompress transparently.
2026-04-16 18:49:01 +02:00
Davide Scaini 395182649b improve docs 2026-04-15 23:07:52 +02:00
Davide Scaini 87a69bcc8b settings: add nav visibility prefs and per-user Strava credentials
- user_prefs table in db.py with get/set helpers
- GET/PUT /api/me/prefs endpoints for bulk pref management
- GET/PUT/DELETE /api/me/strava-credentials; PUT preserves existing
  secret when client_secret field is left blank
- _strava_creds() helper resolves per-user → instance fallback across
  all five Strava endpoints
- Settings page: Navigation card (hide Feed/Community/About toggles)
  and Strava credentials card
- Base.astro: ids on feed/community/about nav links; applies
  nav_hide_* prefs after login
2026-04-15 20:37:42 +02:00
Davide Scaini 4fd5ba428e settings: add self-service user settings page
API endpoints (all auth-gated to the logged-in user):
- GET  /api/me/storage        — per-category disk breakdown
- DELETE /api/me/originals    — free originals/ dir (post-extraction cleanup)
- DELETE /api/me/activities   — wipe all activity data (password confirm)
- DELETE /api/me              — delete account + all data (password confirm)
- PUT  /api/me/display-name   — update display name
- PUT  /api/me/password       — change password (requires current password)

Page at /settings/:
- Storage card: activities / originals / Strava originals / photos / total
  with one-click 'Delete original files' when originals exist
- Profile card: display name field with inline save
- Password card: change password form
- Danger zone: delete all activities or delete account (both require
  password confirmation in a modal before proceeding)

Nav: 'Settings' link appears in the top bar after login (same as Admin).
2026-04-15 20:24:04 +02:00
Davide Scaini 764da09130 upload: add overwrite option to replace existing activities
When 'Overwrite existing activities' is checked, duplicate activities are
re-extracted and replaced instead of silently skipped:
- Deletes {id}.json, .geojson, .timeseries.json from activities/ and _merged/
- Removes the stale index summary and dedup cache entry
- Ingests the new file fresh via ingest_parsed
- Reports 'overwritten' (↺) status in the SSE stream vs 'imported' (↓)
- done event includes 'overwritten' count; UI shows it alongside 'added'
2026-04-15 20:17:32 +02:00
Davide Scaini a33fea91cf admin: mark ghost users (no DB account) and add Delete dir button
- /api/admin/disk now includes in_db flag per user (true if account exists in DB)
- Ghost users (directory exists, no DB account) show amber 'ghost' badge and only
  Diag + Delete dir buttons (no Re-extract, Rebuild, Reset pwd, Reset data)
- DELETE /api/admin/users/{handle}/directory wipes the entire directory and updates
  the root manifest; refuses if the account still exists in the DB
- Wires up rmdir-btn with a window.confirm before calling the new endpoint
2026-04-15 14:58:54 +02:00
Davide Scaini 290eef6c72 metrics: guard against corrupted time streams causing OOM
Strava originals with absolute Unix timestamps stored as elapsed-second
offsets produce a t_max of ~1.6 billion. compute_mmp and compute_best_efforts
both create dense 1Hz arrays via range(t_min, t_max+1), which for a 1.6B
span allocates 44+ GB and OOM-kills the process. Add a >1-week sanity
check and return None early for corrupt streams.

Root cause: old Strava activities (seen from 1970-epoch start_date)
where the time stream contains absolute Unix timestamps instead of
elapsed seconds.
2026-04-15 14:06:20 +02:00
Davide Scaini 25d80c8132 reextract: process in batches of 100 to bound subprocess memory
One Python process for 2015 activities exhausts all RAM + swap on a
cheap VPS. Split into sequential batches of 100: each subprocess handles
100 activities and exits, returning all memory to the OS before the
next batch starts. The server chains batches in the SSE event_stream
and triggers a single rebuild when all batches complete.
2026-04-15 10:08:55 +02:00
Davide Scaini a67b237161 reextract: reclaim RSS with malloc_trim + gc.collect every 50 activities
CPython's allocator holds freed memory in arenas and doesn't return it to
the OS, causing RSS to grow throughout the 2015-activity loop even when
each iteration's objects are freed. Call gc.collect() + malloc_trim(0)
every 50 activities to return freed pages to the kernel and keep RSS flat.
2026-04-15 09:58:16 +02:00
Davide Scaini 062ade28d3 reextract: use venv bincio script, not uv, to spawn subprocess
uv is unreliable in systemd environments where PATH omits ~/.local/bin.
Use sys.executable's parent directory to find the venv's bincio script
directly — this always works since the server itself runs from the venv.
2026-04-15 09:50:50 +02:00
Davide Scaini 1a563012e2 reextract-originals: run as subprocess to avoid OOM
The in-process approach loaded all 2015 Strava originals into the server
process memory, causing OOM kills. Now spawns `bincio reextract-originals`
as a child process; heavy work runs in an isolated Python interpreter that
exits when done, freeing all memory.

Also adds `bincio reextract-originals` as a standalone CLI command that
prints JSON-lines progress to stdout — useful for running directly on the
VPS via SSH for large backlogs.
2026-04-15 09:42:31 +02:00
Davide Scaini 6890892654 trying to fix building of activities that fails because of OOM 2026-04-15 09:30:22 +02:00
Davide Scaini b01b00698c rewrite reextract with queue-based thread/async bridge
The per-call run_in_executor pattern caused network errors.
New approach: one thread runs the entire extraction loop and puts
SSE strings into an asyncio.Queue via call_soon_threadsafe; the
async generator drains the queue. This is the correct pattern for
background-thread + SSE streaming in FastAPI.
2026-04-15 09:14:18 +02:00
Davide Scaini 10dd1185b9 fix reextract: async generator + run_in_executor, imports at endpoint level
The sync generator was failing with a network error because Starlette's
iterate_in_threadpool doesn't properly propagate exceptions from sync
generators — the connection resets with no body.

Fix: convert event_stream to an async generator (Starlette handles these
natively without thread wrapping), move imports to the endpoint function
scope so failures raise HTTPException before the stream starts, and run
CPU-intensive work (parse + write) via loop.run_in_executor so the
async generator can actually yield between activities.
2026-04-15 09:05:29 +02:00
Davide Scaini 378cba85ad fix re-extract: add heartbeat yield, batch index writes, handle HTTP errors in UI
- Generator now yields a 'status' event immediately so the client can
  distinguish 'working' from 'failed silently before first event'
- Batch mode: call write_activity per file but write index.json and
  athlete.json only once at the end (was O(n²) — 2015 rewrites)
- JS: check r.ok before reading the body stream; show HTTP error detail
  instead of staying stuck at 'Starting…'
- Handle 'status' event type in the progress log
2026-04-15 08:57:21 +02:00
Davide Scaini 89b92397cf add re-extract from Strava originals endpoint and improve diag
- POST /api/admin/users/{handle}/reextract-originals: reads stored
  originals/strava/*.json and re-runs strava_to_parsed + ingest_parsed
  without hitting the Strava API; streams SSE progress; calls merge_all
  and rebuild on completion
- GET /api/admin/users/{handle}/diag: now shows _merged/activities/
  file counts, a sample of filenames in activities/ (with symlink flag),
  and lists pending_files by name
- Admin page: Re-extract button per user with live SSE progress modal
2026-04-15 08:08:57 +02:00
Davide Scaini 1e30f85bdc add structured logging and admin diagnostics to serve
- bincio.serve logger wired into uvicorn output: rebuild steps, upload
  errors, strava-zip progress all now appear in the server log
- _trigger_rebuild: capture stdout/stderr, log errors instead of silently
  discarding; exceptions logged with traceback instead of swallowed
- upload handler: log per-file errors with traceback; include error detail
  in the SSE event sent back to the browser
- strava-zip handler: log imported/error counts on completion
- GET /api/admin/users/{handle}/diag: snapshot of a user's data dir
  (file counts, sizes, index activity counts, pending uploads)
- POST /api/admin/users/{handle}/rebuild-sync: blocking rebuild that
  returns full stdout/stderr — for debugging without SSH log access
- Admin page: Diag button per user opens a modal showing the diag JSON
2026-04-14 22:53:31 +02:00
Davide Scaini 13643479ef add password reset via admin-generated one-time code
db.py: reset_codes table (code, handle, created_by, created_at,
expires_at, used_at); create_reset_code() invalidates any prior unused
code for the same handle; use_reset_code() validates handle match,
expiry (24 h), and single-use; change_password() updates the hash.

server.py: POST /api/admin/users/{handle}/reset-password-code (admin)
returns a code; POST /api/auth/reset-password (public) validates the
code + handle and sets the new password.

Admin page: "Reset pwd" button per user — shows the code inline on
click (monospace, click-to-copy).
/reset-password/ page: handle + code + new password form.
Login page: "Forgot password?" link.
2026-04-14 21:58:50 +02:00
Davide Scaini d2ba96c26a fix admin delete to wipe originals/edits/geojson; rename button to Reset data
The old DELETE /api/admin/users/{handle}/activities only removed *.json
files and _merged/, leaving originals/ (Strava FIT files) and edits/
untouched — causing the 968 MB disk usage after a delete.

_wipe_user_activities() now removes activities/, edits/, originals/,
_merged/, index.json, athlete.json, and .bincio_cache.json. Admin page
button renamed to "Reset data" with updated confirmation text.
2026-04-13 20:10:15 +02:00
Davide Scaini a75dfa160b F14: add per-activity delete (DELETE /api/activity/{id} + drawer button)
Server endpoint removes the activity JSON, GeoJSON, timeseries, sidecar
edit, and images directory. Also purges the dedup cache entry so the
file can be re-uploaded if needed. Runs merge_all + rebuild afterwards.

EditDrawer: two-click delete button (click once → "Confirm delete?",
click again → deletes). On success, dispatches 'deleted' event.
ActivityDetail navigates back to the feed on delete.
2026-04-13 19:35:40 +02:00
Davide Scaini e6bb6e61a2 fix elevation_gain_m null for modern Garmin FIT files; fix map flash
FIT parser: try enhanced_altitude before altitude. Barometric altimeters
on modern Garmins (Edge 540, 840, etc.) write enhanced_altitude in
record messages and total_ascent in lap messages. The old code read only
altitude, producing null elevation_m per point → null elevation_gain_m
at the activity root while laps had correct values from total_ascent.

ActivityMap: use preview_coords (passed from ActivitySummary) to
initialise the map at the activity's location on mount, eliminating the
flash of world-view before the async detail JSON / bbox arrives.
2026-04-13 19:18:37 +02:00
Davide Scaini e7eefa345e F17: replace merge_all with merge_one in upload_image and delete_image
Single-activity writes now trigger a fast merge_one instead of a full
user rebuild. post_activity was fixed earlier; this completes the fix
for upload_image and delete_image endpoints.
2026-04-13 19:03:46 +02:00
Davide Scaini 5ad3aee8f6 rename privacy "private" → "unlisted"; enable GPS for unlisted
- "unlisted" = not shown in the public feed, but GPS track, timeseries
  and detail JSON are all accessible by direct URL (security by obscurity)
- "private" accepted as legacy alias everywhere (backward compat with
  existing data on disk)
- New writes from Strava sync / ZIP upload / sidecar use "unlisted"
- Only "no_gps" now suppresses the GPS track
- isUnlisted() helper in format.ts used by all Svelte/Astro components
- SCHEMA.md and CLAUDE.md document the privacy model and the distinction
  between "unlisted" and "no_gps"
2026-04-13 18:49:20 +02:00
Davide Scaini 1587d1cdf3 - brut: _merged/index.json has 586 activities — the count when merge_all last ran. The SSE rebuild bug (already fixed) meant it never re-ran after the full Strava sync
added 3256 more.
  - danilo: _merged/ is 8 KB — basically empty. merge_all likely ran concurrently (multiple file uploads trigger multiple rebuilds without a lock in --no-build mode),
  causing a race where shutil.rmtree(merged_acts) from one run wiped what another run was writing.

  Two fixes: serialize --no-build rebuilds with the same lock, and add a "Rebuild" button to the admin page.

 Root causes fixed:
  1. merge_all race condition — --no-build rebuilds now hold _rebuild_lock, same as full builds
  2. The SSE rebuild-trigger bug (already fixed earlier) was brut's original cause
2026-04-13 12:35:05 +02:00
Davide Scaini 7b37f45180 Bug fixed — temp ZIPs now go to /tmp/ (system temp) and are always deleted in a finally block, so they can't leak. A startup hook also auto-cleans any leftovers on
next server restart.

  Admin page now shows:
  - Overall disk bar (used/free/%)
  - Per-user table: total, activities (with file count), originals (with Strava breakdown), merged, images
  - A mini bar per user showing relative size
  - Red ⚠ warning if orphaned temp ZIPs are still present for a user
  - Delete activities button (reloads sizes after)
2026-04-13 12:24:59 +02:00
Davide Scaini 7e526c14e1 fix commit d659b90cd9 2026-04-12 19:55:13 +02:00
Davide Scaini d659b90cd9 - DELETE /api/admin/users/{handle}/activities — deletes all activities/*.json, wipes _merged/ and
index.json, then triggers a rebuild. Admin-only.
  - /admin/ page — lists all users, each with a "Delete activities" button. Clicking asks for
  confirmation in a <dialog> before firing the request. Button shows "Deleted (N)" or an error inline.
  - "Admin" nav link — appears in the top-right for admins only, hidden for everyone else.
2026-04-12 17:46:28 +02:00
Davide Scaini f003fdd89f garmin sync first attempt 2026-04-12 15:36:21 +02:00
Davide Scaini 6c431e8821 Here's what was built and why each decision was made:
Key at data_dir.parent/.garmin_key — nginx serves location /data/ { alias /var/bincio/data/; } so
  anything inside that dir is reachable. The key lives one level up at /var/bincio/.garmin_key,
  outside nginx's reach.

  Two-layer storage — garmin_creds.json holds the encrypted email+password (needed for re-login when
  tokens expire); garmin_session/ holds the garth OAuth tokens in plain JSON (short-lived, not the
  user's actual password).

  test_login() — called by the connect endpoint before saving anything, so credentials are only
  persisted if they actually work.

  get_client() — tries the session first (fast, no network), falls back to full re-login
  transparently. The caller never needs to think about whether the session is fresh.
2026-04-12 15:12:20 +02:00
Davide Scaini e80231b442 fix strava sync rebuild: trigger before yielding done event, not after 2026-04-12 14:55:33 +02:00
Davide Scaini 0b569b727c trigger site rebuild after new user registration so profile pages exist immediately 2026-04-11 14:19:40 +02:00
Davide Scaini 5de9967127 fix upload 500: add missing _file_suffix to serve server; fix iOS file picker accept types 2026-04-11 14:12:54 +02:00
Davide Scaini ef5b06c5b3 trigger rebuild after activities upload 2026-04-11 08:47:27 +02:00
Davide Scaini 82830222ba For users uploading:
- POST /api/upload now returns text/event-stream instead of JSON
  - Per-file progress events stream back as each file is processed: ↓ 3/47 (6%) — morning_ride.fit
  - Final done event shows the summary: "12 added, 35 duplicates"
  - The Vite proxy is configured to stream this properly (no buffering)

  For the admin:
  - New GET /api/admin/jobs endpoint (admin-only) returns the list of active upload jobs, each with
  user, started_at, total, done, current (filename being processed)
  - A pulsing amber badge appears in the nav bar for admins when any user has an active upload running
   — it shows e.g. "2 uploads running" with a tooltip listing each user's progress (@alice: 12/50
  files)
  - Polls every 5 seconds, disappears automatically when all jobs finish
2026-04-11 08:33:21 +02:00
Davide Scaini 01db4eb9ae ingest activities.csv 2026-04-11 08:13:27 +02:00
Davide Scaini cbd5a98cd3 - merge.py: keep private activities in _merged/index.json instead of
stripping them; privacy filtering is now done client-side
      - ActivityFeed: detect logged-in user via bincio:me event; show private
        activities only when viewing your own profile; private cards get a lock
        badge
2026-04-10 23:16:38 +02:00
Davide Scaini bc30e0a2fc option to keep all activities private from strava zip, fix copy of register link 2026-04-10 22:51:29 +02:00
Davide Scaini da622131fd upload zip archive from strava 2026-04-10 22:26:11 +02:00
Davide Scaini 3b8bc159c5 upload strava zip 2026-04-10 22:01:44 +02:00
Davide Scaini 9fd088c693 - "Last sync: never": The old blocking sync was killed by nginx at 120s before save_token was reached. The activities made it to disk (ingestion happens per-activity as it goes), but the token's
last_sync_at timestamp was never written. After deploying, do a soft reset — it'll set last_sync_at to your most recent activity's timestamp so the next sync only fetches newer ones.
  - Reset 404: Added POST /api/strava/reset to serve/server.py. The soft reset now looks in _merged/index.json first (multi-user path), falling back to index.json.
2026-04-10 18:34:53 +02:00
Davide Scaini 816f103b4c fix: write empty index.json for new users at registration so shard resolves immediately 2026-04-10 18:20:35 +02:00
Davide Scaini 3e4ff4019b limit number of workers 2026-04-10 18:13:49 +02:00
Davide Scaini cf414a08ad fix strava import? 2026-04-10 18:13:32 +02:00