Strava originals with absolute Unix timestamps stored as elapsed-second
offsets produce a t_max of ~1.6 billion. compute_mmp and compute_best_efforts
both create dense 1Hz arrays via range(t_min, t_max+1), which for a 1.6B
span allocates 44+ GB and OOM-kills the process. Add a sanity check that
treats any span longer than one week as corrupt and returns None early.
Root cause: old Strava activities (seen with a 1970-epoch start_date)
whose time stream contains absolute Unix timestamps instead of
elapsed seconds.
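The guard can be sketched as below. This is a minimal illustration, not the actual compute_mmp or compute_best_efforts code; the function name dense_time_axis and the constant MAX_SPAN_S are hypothetical.

```python
from typing import Optional, Sequence

# Assumed threshold: no single activity should span more than one week.
MAX_SPAN_S = 7 * 24 * 3600


def dense_time_axis(times: Sequence[int]) -> Optional[range]:
    """Return the 1 Hz time axis for a stream, or None if it looks corrupt."""
    if not times:
        return None
    t_min, t_max = min(times), max(times)
    if t_max - t_min > MAX_SPAN_S:
        # Most likely absolute Unix timestamps stored as elapsed offsets.
        return None
    # range() is lazy; the memory cost only appears once it is materialised,
    # so the check must run before any dense array is built from it.
    return range(t_min, t_max + 1)
```

Callers are expected to skip the activity when None comes back rather than building the dense array.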
A single Python process handling all 2015 activities exhausts all RAM +
swap on a cheap VPS. Split the work into sequential batches of 100: each
subprocess handles
100 activities and exits, returning all memory to the OS before the
next batch starts. The server chains batches in the SSE event_stream
and triggers a single rebuild when all batches complete.
CPython's allocator and the underlying glibc malloc tend to keep freed
memory instead of returning it to the OS, causing RSS to grow throughout the
2015-activity loop even when each iteration's objects are freed. Call
gc.collect() + malloc_trim(0) every 50 activities to return freed pages to
the kernel and keep RSS flat.
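One way to make that call from Python is through ctypes, sketched below. malloc_trim is a glibc extension, so the sketch assumes Linux with glibc and quietly does nothing elsewhere; the helper name release_memory is hypothetical.

```python
import ctypes
import ctypes.util
import gc


def release_memory() -> bool:
    """Collect garbage, then ask glibc to hand free heap pages back to the OS.

    Returns True only if malloc_trim reported that memory was released.
    """
    gc.collect()
    try:
        libc = ctypes.CDLL(ctypes.util.find_library("c") or "libc.so.6")
        return bool(libc.malloc_trim(0))
    except (OSError, AttributeError):
        # Not glibc (e.g. musl or macOS): malloc_trim is unavailable.
        return False
```

In the extraction loop this would be called on every 50th iteration, for example `if i % 50 == 0: release_memory()`.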
uv is unreliable in systemd environments where PATH omits ~/.local/bin.
Use sys.executable's parent directory to find the venv's bincio script
directly — this always works since the server itself runs from the venv.
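The lookup amounts to a path join next to the running interpreter. A minimal sketch, with the helper name venv_script being hypothetical:

```python
import sys
from pathlib import Path
from typing import Optional


def venv_script(name: str, executable: Optional[str] = None) -> Path:
    """Return the path of a console script installed beside the interpreter."""
    # sys.executable is e.g. <venv>/bin/python, so its parent is <venv>/bin.
    return Path(executable or sys.executable).parent / name
```

The server would then pass `str(venv_script("bincio"))` as the first element of the subprocess argument list instead of relying on PATH.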
The in-process approach loaded all 2015 Strava originals into the server
process memory, causing OOM kills. Now spawns `bincio reextract-originals`
as a child process; heavy work runs in an isolated Python interpreter that
exits when done, freeing all memory.
Also adds `bincio reextract-originals` as a standalone CLI command that
prints JSON-lines progress to stdout — useful for running directly on the
VPS via SSH for large backlogs.
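Consuming JSON-lines progress from a child process can look roughly like this. The sketch uses a tiny stand-in child (CHILD) instead of the real `bincio reextract-originals` command, and the field names done and total are assumptions, not the actual output schema.

```python
import json
import subprocess
import sys
from typing import Dict, Iterator, List

# Stand-in child: prints three JSON lines, one per "activity".
CHILD = (
    "import json\n"
    "for i in range(1, 4):\n"
    "    print(json.dumps({'done': i, 'total': 3}), flush=True)\n"
)


def stream_progress(cmd: List[str]) -> Iterator[Dict]:
    """Run cmd and yield one parsed JSON object per stdout line."""
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True) as proc:
        assert proc.stdout is not None
        for line in proc.stdout:
            line = line.strip()
            if line:
                yield json.loads(line)
    # Leaving the with-block waits for the child, so returncode is set here.
    if proc.returncode != 0:
        raise RuntimeError(f"child exited with status {proc.returncode}")
```

Because the heavy work lives in the child, all of its memory goes back to the OS as soon as it exits.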
The per-call run_in_executor pattern caused network errors.
New approach: one thread runs the entire extraction loop and puts
SSE strings into an asyncio.Queue via call_soon_threadsafe; the
async generator drains the queue. This is the correct pattern for
background-thread + SSE streaming in FastAPI.
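A self-contained sketch of that thread-plus-queue pattern, using only asyncio and threading. The event text and the names event_stream and collect are illustrative; in the server the async generator would be handed to a streaming response.

```python
import asyncio
import threading
from typing import AsyncIterator, List, Sequence


async def event_stream(items: Sequence[str]) -> AsyncIterator[str]:
    loop = asyncio.get_running_loop()
    queue: asyncio.Queue = asyncio.Queue()
    done = object()  # sentinel marking the end of the stream

    def worker() -> None:
        try:
            for i, item in enumerate(items, 1):
                # Heavy per-item work would happen here, off the event loop.
                msg = f"data: {i}/{len(items)} {item}\n\n"
                # asyncio.Queue is not thread-safe, so schedule the put
                # on the loop's own thread.
                loop.call_soon_threadsafe(queue.put_nowait, msg)
        finally:
            loop.call_soon_threadsafe(queue.put_nowait, done)

    threading.Thread(target=worker, daemon=True).start()
    while True:
        msg = await queue.get()
        if msg is done:
            break
        yield msg


async def collect(items: Sequence[str]) -> List[str]:
    return [msg async for msg in event_stream(items)]
```

The sentinel is sent from a finally block so the consumer terminates even if the worker raises.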
The sync generator was failing with a network error because Starlette's
iterate_in_threadpool doesn't properly propagate exceptions from sync
generators — the connection resets with no body.
Fix: convert event_stream to an async generator (Starlette handles these
natively without thread wrapping), move imports to the endpoint function
scope so failures raise HTTPException before the stream starts, and run
CPU-intensive work (parse + write) via loop.run_in_executor so the
async generator can actually yield between activities.
- Generator now yields a 'status' event immediately so the client can
distinguish 'working' from 'failed silently before first event'
- Batch mode: call write_activity per file but write index.json and
athlete.json only once at the end (was O(n²) — 2015 rewrites)
- JS: check r.ok before reading the body stream; show HTTP error detail
instead of staying stuck at 'Starting…'
- Handle 'status' event type in the progress log
- POST /api/admin/users/{handle}/reextract-originals: reads stored
originals/strava/*.json and re-runs strava_to_parsed + ingest_parsed
without hitting the Strava API; streams SSE progress; calls merge_all
and rebuild on completion
- GET /api/admin/users/{handle}/diag: now shows _merged/activities/
file counts, a sample of filenames in activities/ (with symlink flag),
and lists pending_files by name
- Admin page: Re-extract button per user with live SSE progress modal
- bincio.serve logger wired into uvicorn output: rebuild steps, upload
errors, strava-zip progress all now appear in the server log
- _trigger_rebuild: capture stdout/stderr, log errors instead of silently
discarding; exceptions logged with traceback instead of swallowed
- upload handler: log per-file errors with traceback; include error detail
in the SSE event sent back to the browser
- strava-zip handler: log imported/error counts on completion
- GET /api/admin/users/{handle}/diag: snapshot of a user's data dir
(file counts, sizes, index activity counts, pending uploads)
- POST /api/admin/users/{handle}/rebuild-sync: blocking rebuild that
returns full stdout/stderr — for debugging without SSH log access
- Admin page: Diag button per user opens a modal showing the diag JSON
vis.js requires a pixel-sized container — flex:1 is ignored.
Use position:fixed toolbar + JS-measured height for the graph div,
stored as window._network for resize handling.
getStaticPaths now returns [] — all /activity/{id}/ URLs are served by
the activity/index.html shell via nginx try_files and hydrated by
ActivityDetailLoader. Pre-rendering thousands of pages was exhausting
server RAM and killing the build. The dynamic loader already handles
public, unlisted, and local activities identically.
db.py: reset_codes table (code, handle, created_by, created_at,
expires_at, used_at); create_reset_code() invalidates any prior unused
code for the same handle; use_reset_code() validates handle match,
expiry (24 h), and single-use; change_password() updates the hash.
server.py: POST /api/admin/users/{handle}/reset-password-code (admin)
returns a code; POST /api/auth/reset-password (public) validates the
code + handle and sets the new password.
Admin page: "Reset pwd" button per user — shows the code inline on
click (monospace, click-to-copy).
/reset-password/ page: handle + code + new password form.
Login page: "Forgot password?" link.
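The reset-code rules listed for db.py can be sketched with sqlite3 as follows. The column names come from the description above; the SQL, the function signatures, and the choice to invalidate old codes by stamping used_at are assumptions rather than the actual implementation.

```python
import secrets
import sqlite3
import time
from typing import Optional

TTL_S = 24 * 3600  # codes expire after 24 h


def open_db() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE reset_codes (code TEXT PRIMARY KEY, handle TEXT, "
        "created_by TEXT, created_at REAL, expires_at REAL, used_at REAL)"
    )
    return db


def create_reset_code(db, handle: str, created_by: str,
                      now: Optional[float] = None) -> str:
    now = time.time() if now is None else now
    # Invalidate any prior unused code for the same handle.
    db.execute(
        "UPDATE reset_codes SET used_at = ? WHERE handle = ? AND used_at IS NULL",
        (now, handle),
    )
    code = secrets.token_urlsafe(8)
    db.execute(
        "INSERT INTO reset_codes VALUES (?, ?, ?, ?, ?, NULL)",
        (code, handle, created_by, now, now + TTL_S),
    )
    return code


def use_reset_code(db, code: str, handle: str,
                   now: Optional[float] = None) -> bool:
    now = time.time() if now is None else now
    # One UPDATE checks handle match, expiry and single use at once.
    cur = db.execute(
        "UPDATE reset_codes SET used_at = ? WHERE code = ? AND handle = ? "
        "AND used_at IS NULL AND expires_at > ?",
        (now, code, handle, now),
    )
    return cur.rowcount == 1
```

Only when use_reset_code returns True would the endpoint go on to call change_password().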
The old DELETE /api/admin/users/{handle}/activities only removed *.json
files and _merged/, leaving originals/ (Strava FIT files) and edits/
untouched, which is why 968 MB of disk usage remained after a delete.
_wipe_user_activities() now removes activities/, edits/, originals/,
_merged/, index.json, athlete.json, and .bincio_cache.json. Admin page
button renamed to "Reset data" with updated confirmation text.
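A minimal sketch of such a wipe, assuming the listed names sit directly under the user's data directory. The public function name and return value are illustrative, not the real _wipe_user_activities().

```python
import shutil
from pathlib import Path
from typing import List

WIPE_DIRS = ("activities", "edits", "originals", "_merged")
WIPE_FILES = ("index.json", "athlete.json", ".bincio_cache.json")


def wipe_user_activities(user_dir: Path) -> List[str]:
    """Remove activity data for one user; return the names actually removed."""
    removed = []
    for name in WIPE_DIRS:
        path = user_dir / name
        if path.is_dir():
            shutil.rmtree(path)
            removed.append(name)
    for name in WIPE_FILES:
        path = user_dir / name
        if path.exists():
            path.unlink()
            removed.append(name)
    return removed
```

Anything not on the two lists is left in place.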
Server endpoint removes the activity JSON, GeoJSON, timeseries, sidecar
edit, and images directory. Also purges the dedup cache entry so the
file can be re-uploaded if needed. Runs merge_all + rebuild afterwards.
EditDrawer: two-click delete button (click once → "Confirm delete?",
click again → deletes). On success, dispatches 'deleted' event.
ActivityDetail navigates back to the feed on delete.
FIT parser: try enhanced_altitude before altitude. Barometric altimeters
on modern Garmins (Edge 540, 840, etc.) write enhanced_altitude in
record messages and total_ascent in lap messages. The old code read only
altitude, producing null elevation_m per point → null elevation_gain_m
at the activity root while laps had correct values from total_ascent.
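The field preference can be expressed as a small helper. This sketch assumes each record message has already been decoded into a dict; the function name is hypothetical.

```python
from typing import Mapping, Optional


def record_elevation(record: Mapping[str, object]) -> Optional[float]:
    """Prefer enhanced_altitude and fall back to altitude."""
    for key in ("enhanced_altitude", "altitude"):
        value = record.get(key)
        if value is not None:
            return float(value)
    return None
```

Checking for None rather than truthiness keeps a legitimate elevation of 0 m from being skipped.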
ActivityMap: use preview_coords (passed from ActivitySummary) to
initialise the map at the activity's location on mount, eliminating the
flash of world-view before the async detail JSON / bbox arrives.
Single-activity writes now trigger a fast merge_one instead of a full
user rebuild. post_activity was fixed earlier; this completes the fix
for upload_image and delete_image endpoints.
- "unlisted" = not shown in the public feed, but GPS track, timeseries
and detail JSON are all accessible by direct URL (security by obscurity)
- "private" accepted as legacy alias everywhere (backward compat with
existing data on disk)
- New writes from Strava sync / ZIP upload / sidecar use "unlisted"
- Only "no_gps" now suppresses the GPS track
- isUnlisted() helper in format.ts used by all Svelte/Astro components
- SCHEMA.md and CLAUDE.md document the privacy model and the distinction
between "unlisted" and "no_gps"
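The privacy rules above reduce to two predicates. The real helper is isUnlisted() in format.ts; the following is a Python mirror written only to illustrate the model, and has_gps_track is a hypothetical name.

```python
from typing import Optional

# "private" is kept as a legacy alias for "unlisted".
UNLISTED_VALUES = frozenset({"unlisted", "private"})


def is_unlisted(privacy: Optional[str]) -> bool:
    """True when the activity must be hidden from the public feed."""
    return privacy in UNLISTED_VALUES


def has_gps_track(privacy: Optional[str]) -> bool:
    """Only "no_gps" suppresses the GPS track."""
    return privacy != "no_gps"
```

An unlisted activity therefore still has a GPS track, which matches the note that it stays reachable by direct URL.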
If the index-based lookup fails (shard fetch silently failed, stale
index state, etc.), try fetching the activity detail file directly from
each user shard's _merged/activities/ directory. This makes private
activities and newly-synced activities accessible even when the index
resolution fails.
Also add console.error logging when shards fail in resolveShards to
help diagnose root causes.
added 3256 more.
- danilo: _merged/ is 8 KB — basically empty. merge_all likely ran concurrently (multiple file uploads trigger multiple rebuilds without a lock in --no-build mode),
causing a race where shutil.rmtree(merged_acts) from one run wiped what another run was writing.
Two fixes: serialize --no-build rebuilds with the same lock, and add a "Rebuild" button to the admin page.
Root causes fixed:
1. merge_all race condition — --no-build rebuilds now hold _rebuild_lock, same as full builds
2. The SSE rebuild-trigger bug (already fixed earlier) was the original cause of brut's issue
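Serialising rebuilds behind one lock can be sketched as below. The class RebuildRunner and its counters are hypothetical and exist only to make the effect observable; the server itself is described as using a module-level _rebuild_lock.

```python
import threading
import time
from typing import Callable


class RebuildRunner:
    """Runs rebuild callables one at a time, however many threads ask."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.active = 0
        self.max_active = 0

    def run(self, work: Callable[[], None]) -> None:
        with self._lock:
            # Inside the lock no other rebuild can be deleting or writing
            # the merged directory at the same time.
            self.active += 1
            self.max_active = max(self.max_active, self.active)
            try:
                work()
            finally:
                self.active -= 1
```

Several uploads finishing together then queue up their rebuilds instead of racing on the rmtree.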
next server restart.
Admin page now shows:
- Overall disk bar (used/free/%)
- Per-user table: total, activities (with file count), originals (with Strava breakdown), merged, images
- A mini bar per user showing relative size
- Red ⚠ warning if orphaned temp ZIPs are still present for a user
- Delete activities button (reloads sizes after)
index.json, then triggers a rebuild. Admin-only.
- /admin/ page — lists all users, each with a "Delete activities" button. Clicking asks for
confirmation in a <dialog> before firing the request. Button shows "Deleted (N)" or an error inline.
- "Admin" nav link — appears in the top-right for admins only, hidden for everyone else.
Key at data_dir.parent/.garmin_key — nginx serves location /data/ { alias /var/bincio/data/; } so
anything inside that dir is reachable. The key lives one level up at /var/bincio/.garmin_key,
outside nginx's reach.
Two-layer storage — garmin_creds.json holds the encrypted email+password (needed for re-login when
tokens expire); garmin_session/ holds the garth OAuth tokens in plain JSON (short-lived, not the
user's actual password).
test_login() — called by the connect endpoint before saving anything, so credentials are only
persisted if they actually work.
get_client() — tries the session first (fast, no network), falls back to full re-login
transparently. The caller never needs to think about whether the session is fresh.
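The session-first behaviour of get_client() follows a simple fallback shape, sketched here with plain callables. resume_session and full_login are hypothetical stand-ins and do not reflect the garth API or the project's real signature.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def get_client(resume_session: Callable[[], T], full_login: Callable[[], T]) -> T:
    """Try the stored session first; fall back to a full re-login."""
    try:
        return resume_session()
    except Exception:
        # Missing or expired tokens: log in again with the stored credentials.
        return full_login()
```

A production version would likely catch a narrower exception type so that unrelated bugs are not masked by the fallback.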
- POST /api/upload now returns text/event-stream instead of JSON
- Per-file progress events stream back as each file is processed: ↓ 3/47 (6%) — morning_ride.fit
- Final done event shows the summary: "12 added, 35 duplicates"
- The Vite proxy is configured to stream this properly (no buffering)
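A per-file progress event in text/event-stream framing might be built like this. The JSON field names are assumptions; only the `data: ...` line followed by a blank line is dictated by the SSE format.

```python
import json


def progress_event(done: int, total: int, filename: str) -> str:
    """Format one SSE message describing upload progress."""
    pct = 100 * done // total if total else 0
    payload = {
        "type": "progress",
        "done": done,
        "total": total,
        "pct": pct,
        "file": filename,
    }
    # Each SSE message is a "data:" line terminated by an empty line.
    return f"data: {json.dumps(payload)}\n\n"
```

For the example in the list above, `progress_event(3, 47, "morning_ride.fit")` carries done=3, total=47 and pct=6.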
For the admin:
- New GET /api/admin/jobs endpoint (admin-only) returns the list of active upload jobs, each with
user, started_at, total, done, current (filename being processed)
- A pulsing amber badge appears in the nav bar for admins when any user has an active upload running
— it shows e.g. "2 uploads running" with a tooltip listing each user's progress (@alice: 12/50
files)
- Polls every 5 seconds, disappears automatically when all jobs finish
stripping them; privacy filtering is now done client-side
- ActivityFeed: detect logged-in user via bincio:me event; show private
activities only when viewing your own profile; private cards get a lock
badge
container div, causing the layout to collapse to zero height for a moment.
The browser then scrolls to keep the viewport anchored, but since the page
got shorter it jumps to the top. When the new SVG is appended, the page is
taller again but the scroll position was already reset.
Fix: give the chart container a min-height matching the chart height (220px)
so it never collapses.