diff --git a/CHANGELOG.md b/CHANGELOG.md
index d608c04..81b5d7c 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,5 +1,65 @@
 # Changelog
 
+## [Unreleased] — 2026-04-20
+
+### Improvement — Elevation gain accuracy (hysteresis accumulation)
+
+The previous algorithm accumulated every positive elevation delta between
+consecutive track points, counting GPS jitter and barometric quantization
+noise as real climbing. This consistently overestimated gain — in extreme
+cases by more than 100% on flat coastal routes.
+
+The new algorithm uses **hysteresis dead-band accumulation**: elevation is
+only committed when it changes by more than a source-specific threshold from
+the last committed value. GPS noise is suppressed without losing real climbs.
+
+- **`bincio/extract/models.py`** — `ParsedActivity` gains an `altitude_source`
+  field (`"barometric"` / `"gps"` / `"unknown"`)
+- **`bincio/extract/parsers/fit.py`** — detects whether any record frame used
+  `enhanced_altitude` (barometric altimeter) vs `altitude` (GPS-derived) and
+  sets `altitude_source` accordingly
+- **`bincio/extract/parsers/gpx.py`**, **`tcx.py`** — both set
+  `altitude_source = "gps"`
+- **`bincio/extract/metrics.py`** — `_elevation()` replaced with hysteresis
+  accumulator; thresholds: **5 m** for barometric, **10 m** for GPS/unknown
+- **`tests/test_metrics.py`** — 5 new parametric tests: flat GPS noise
+  suppression, barometric vs GPS threshold difference, real climb approximation,
+  unknown-treated-as-gps invariant
+
+### New feature — DEM-based elevation recalculation from the edit drawer
+
+A new **"Recalculate from terrain map (DEM)"** button in the activity edit
+drawer replaces noisy GPS altitude with SRTM terrain data, then re-applies
+hysteresis accumulation to compute corrected gain/loss.
+
+This is the recommended fix for activities that still show inaccurate
+elevation after the hysteresis improvement (e.g. activities extracted before
+the change and not yet re-extracted, or uploads whose source file had severe
+GPS altitude noise).
+
+How it works:
+1. The server subsamples the activity's 1 Hz GPS track (one point every 10 s)
+2. Queries an Open-Elevation-compatible API for terrain elevation
+3. Linearly interpolates DEM elevation back to every GPS-valid second
+4. Applies 5 m hysteresis to compute the corrected gain and loss
+5. Writes the updated `elevation_m` array to the timeseries (chart updates)
+6. Patches `elevation_gain_m` / `elevation_loss_m` in the activity JSON and
+   `index.json` summary
+
+- **`bincio/extract/dem.py`** (new) — `lookup_elevations()` (batched HTTP POST,
+  Open-Elevation wire format) + `recalculate_elevation()` (full pipeline above)
+- **`POST /api/activity/{id}/recalculate-elevation`** — on both `bincio serve`
+  (auth-gated, triggers `merge_one` + rebuild) and `bincio edit` (no auth)
+- **`bincio serve --dem-url URL`** / **`bincio edit --dem-url URL`** — override
+  the default DEM endpoint (also read from `DEM_URL` env var)
+- Default DEM endpoint: **`https://api.open-elevation.com`** — works out of the
+  box with no configuration
+- **`GET /api/me`** response gains `dem_configured: bool`
+- **`EditDrawer.svelte`** — button with spinner, shows `↑ Xm ↓ Ym` on success
+  or an inline error (e.g. if the DEM API is unreachable)
+
+---
+
 ## [Unreleased] — 2026-04-16
 
 ### New feature — Self-service user settings page
diff --git a/docs/elevation.md b/docs/elevation.md
new file mode 100644
index 0000000..8dfb2d2
--- /dev/null
+++ b/docs/elevation.md
@@ -0,0 +1,312 @@
+# Elevation gain calculation — problem analysis and roadmap
+
+## The problem
+
+Bincio's current algorithm naively accumulates every positive elevation delta between
+consecutive track points. This **always overestimates** real climbing because it treats
+sensor noise as genuine ascent. The overestimation ranges from insignificant on long
+mountain rides to catastrophic on flat routes where 100% of the reported gain is noise.
+
+---
+
+## Current algorithm (`metrics.py:_elevation`)
+
+```python
+def _elevation(pts):
+    elevations = [p.elevation_m for p in pts if p.elevation_m is not None]
+    gain = loss = 0.0
+    for a, b in zip(elevations, elevations[1:]):
+        diff = b - a
+        if diff > 0:
+            gain += diff
+        else:
+            loss += diff
+    return gain, loss
+```
+
+Every positive step — including 0.1m GPS jitter, barometric quantization steps of
+0.2m, and random-walk noise — is counted as climbing. There is no filtering,
+smoothing, or minimum-step threshold of any kind.
+
+---
+
+## Root causes of overestimation
+
+### 1. GPS-derived altitude noise
+
+GPS units measure altitude from satellite trilateration. This is inherently less
+accurate than horizontal positioning: typical GPS altitude error is ±5–15m, and the
+error follows a correlated random walk. On a flat route, the track oscillates above
+and below the true elevation, and the positive half of those oscillations accumulates
+as phantom climbing.
+
+**Characteristic signature:** elevation range far smaller than reported gain; nearly
+100% of deltas are sub-1m; median step size is 0.0m.
+
+### 2. Barometric altimeter quantization
+
+Devices with a barometric sensor report higher-quality data, but they still apply
+internal smoothing and quantise the output to fixed increments (commonly 0.2m or
+0.4m). The device holds the reading steady for several seconds, then steps to the
+next quantised value. Small-but-real oscillations at the quantisation boundary
+(e.g. hovering between 128.2m and 128.4m while essentially flat) produce repeated
+tiny up/down steps that accumulate.
+
+**Characteristic signature:** many repeated identical elevations (device holding);
+most transitions are 0.0m, 0.2m or 0.4m; significant fraction of gain from sub-1m
+steps even on a real climb.
+
+### 3. High sampling rate amplifying both effects
+
+At 1 Hz, both GPS and barometric sensors produce more noise steps per meter of
+real climbing than at lower rates. Downsampling to 1 Hz (as the timeseries writer
+does) does not eliminate noise already present in the source data.
+
+---
+
+## Case studies
+
+### Activity 1 — diego_p, 2026-04-11T051441Z (Wahoo ELEMNT, GPX)
+
+- **URL:** https://bincio.org/activity/2026-04-11T051441Z/
+- **Bincio reports:** 353.6m gain
+- **Correct estimate:** ~150–160m (Wahoo device's own reading, which applies internal
+  thresholding)
+- **Excess:** ~200m (~130% overestimate; 56% of the reported figure)
+
+| Metric | Value |
+|--------|-------|
+| Points | 15,721 |
+| Elevation range | −10.6m to +5.4m (16m total span) |
+| Median \|delta\| | 0.000m |
+| Zero-change steps | 12,358 (79%) |
+| Sub-0.5m steps | 3,339 (21%) |
+| Steps ≥ 1m | 0 (0%) |
+| Gain from sub-1m steps | 353.6m (100% of total) |
+
+**Diagnosis:** This is a flat coastal route. The GPS altitude range is only 16m.
+Every single metre of reported gain is sub-1m GPS jitter — no real climbing is
+recorded at all. Even a 1m threshold would produce exactly 0m gain, which is wrong
+in the other direction (the route may have minor real undulation). The Wahoo device's
+own algorithm uses internal hysteresis to report ~153m.
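
As a rough illustration of the arithmetic behind this diagnosis, the sketch below builds a synthetic flat track (not the Wahoo file) that alternates between 0.0 m and 0.3 m, then compares naive accumulation with a 1 m per-step threshold. The function names and the synthetic data are hypothetical and chosen only for this example.

```python
def naive_gain(elevations):
    """Sum every positive step between consecutive samples (the naive method)."""
    return sum(b - a for a, b in zip(elevations, elevations[1:]) if b > a)


def per_step_threshold_gain(elevations, min_step_m):
    """Sum only the positive steps that are at least min_step_m tall."""
    return sum(
        b - a for a, b in zip(elevations, elevations[1:]) if b - a >= min_step_m
    )


# Synthetic flat track: 1,001 samples alternating between 0.0 m and 0.3 m,
# i.e. 500 upward steps of 0.3 m and 500 downward steps of 0.3 m.
flat_jitter = [0.3 * (i % 2) for i in range(1001)]

span_m = max(flat_jitter) - min(flat_jitter)
naive_m = naive_gain(flat_jitter)
thresholded_m = per_step_threshold_gain(flat_jitter, min_step_m=1.0)

print(f"elevation span: {span_m:.1f} m")
print(f"naive gain: {naive_m:.1f} m")
print(f"gain with 1 m per-step threshold: {thresholded_m:.1f} m")
```

On this toy series the naive sum reaches roughly 150 m although the track never leaves a 0.3 m band, while the per-step threshold discards every step and reports zero. Those are the same two failure directions the diagnosis describes.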
+
+### Activity 2 — m4xw3ll__, 2026-04-14T161945Z (Bryton Rider, FIT)
+
+- **URL:** http://95.216.55.151/activity/2026-04-14T161945Z/
+- **Bincio reports:** 1285.2m gain
+- **Correct estimate:** ~885m (Strava / device reading)
+- **Excess:** ~400m (45% overestimate)
+
+| Metric | Value |
+|--------|-------|
+| Points | 6,583 |
+| Elevation range | 0.0m to 454.0m (454m total span) |
+| Median \|delta\| | 0.000m |
+| Zero-change steps | 4,077 (62%) |
+| Sub-0.5m steps | 769 (12%) |
+| 0.5–1m steps | 647 (10%) |
+| 1–2m steps | 921 (14%) |
+| 2m+ steps | 168 (3%) |
+| Gain from sub-1m steps | 484.0m (38% of total) |
+
+**Diagnosis:** Real climbing exists (0–454m) but 38% of the reported gain comes from
+sub-1m barometric quantization noise. The Bryton records elevation at ≈0.2m
+increments. At quantization boundaries the device oscillates, producing repeated
+tiny up/down steps. A simple 1m threshold gives 801m (10% below truth); 2m gives
+only 221m (too aggressive). Pure threshold-based filtering doesn't work well here.
+
+---
+
+## Alternative algorithms
+
+### A. Simple threshold
+
+Only count a step if it exceeds `min_step_m`:
+
+```python
+gain += diff if diff >= min_step_m else 0
+```
+
+**Pros:** trivial to implement, zero overhead.
+**Cons:** flat/hiking routes with gradual slopes produce many steps < threshold
+that together represent real climbing. A 2m threshold already loses about 75% of real
+gain on the Bryton activity (221m against ~885m). Per-device tuning is impractical.
+
+---
+
+### B. Hysteresis / dead-band accumulation
+
+Track a "committed" elevation. Only commit a new elevation when it differs from
+the last committed value by more than `threshold_m`. Accumulate from committed to
+committed only.
+
+```python
+def _elevation_hysteresis(elevations, threshold_m=10.0):
+    gain = loss = 0.0
+    committed = elevations[0]  # assumes a non-empty series
+    for e in elevations[1:]:
+        diff = e - committed
+        if abs(diff) >= threshold_m:
+            if diff > 0:
+                gain += diff
+            else:
+                loss += abs(diff)
+            committed = e  # move the reference only after a committed change
+    return gain, loss
+```
+
+**Pros:** naturally handles both GPS drift and barometric quantization; used by
+Strava (proprietary variant), RideWithGPS (10m default), and GPSies (5m).
+**Cons:** threshold choice is critical and device-dependent. On a genuine 8m climb
+followed by descent, a 10m threshold records zero. Needs to be lower for cycling
+than hiking (slopes are smoother, sensors better).
+
+**Results on our case studies with threshold=10m:**
+- Wahoo flat (correct ~153m): would likely produce 0–30m. Fixes the gross overcount
+  but may undercount real minor undulation.
+- Bryton climb (correct ~885m): would need evaluation on the raw data.
+
+---
+
+### C. Moving-average pre-smoothing
+
+Apply a sliding-window mean or Gaussian blur to the elevation series, then
+accumulate naively.
+
+```python
+import statistics
+
+def smooth(elevations, window=30):
+    half = window // 2
+    out = []
+    for i in range(len(elevations)):
+        lo, hi = max(0, i - half), min(len(elevations), i + half + 1)
+        out.append(statistics.mean(elevations[lo:hi]))
+    return out
+
+gain, loss = _elevation(smooth(elevations, window=30))  # _elevation taking plain floats
+```
+
+**Pros:** easy to implement; smoothing removes high-frequency noise while preserving
+long-wavelength terrain.
+**Cons:** loses real short climbs (e.g. a 20m ramp over 20 seconds is averaged to
+near-flat). Window size needs tuning per sample rate. Edge effects at start/end.
+
+---
+
+### D. Savitzky-Golay filter
+
+A polynomial least-squares smoothing filter that better preserves peaks and
+troughs than a simple moving average. Available in `scipy.signal.savgol_filter`
+(numpy is already an indirect dependency, though used nowhere critical; scipy is
+not, so adding it is a dependency choice).
+
+**Pros:** better terrain shape preservation than moving average; standard in
+scientific signal processing.
+**Cons:** requires scipy; harder to implement without it; window/order tuning still
+required.
+
+---
+
+### E. Kalman filter (device-class-aware)
+
+A Kalman filter can be tuned with separate process noise and measurement noise
+parameters for GPS vs barometric data. This is what high-end cycling computers do
+internally.
+
+**Pros:** theoretically optimal; can be parameterised per device class.
+**Cons:** significantly more complex; requires knowing the device class (GPS-only vs
+barometric); still requires parameter tuning.
+
+---
+
+### F. Source-aware strategy
+
+Use different algorithms depending on the file type and whether the device reported
+enhanced (barometric) altitude:
+
+- **FIT file with `enhanced_altitude` field**: barometric data, use hysteresis 5m
+- **FIT file with GPS altitude only**: treat as GPS, use hysteresis 10–15m or
+  discard altitude entirely and use a DEM lookup
+- **GPX with `<ele>` tag**: assume GPS unless `<extensions>` contains barometric
+  fields; use hysteresis 10–15m
+- **Strava-enriched data**: Strava's API provides corrected `altitude` arrays; use
+  as-is with hysteresis 2m to catch quantization
+
+---
+
+## What Strava/Garmin/others do
+
+| Platform | Method |
+|---|---|
+| Strava | Proprietary; replaces raw altitude with DEM-corrected data for GPS-only devices; applies internal smoothing before accumulation |
+| Garmin Connect | Uses enhanced\_altitude (barometric), applies Kalman filter on-device; Connect re-applies server-side smoothing |
+| Wahoo | On-device hysteresis (≈3m threshold); the GPX file contains already-smoothed altitude |
+| RideWithGPS | 10m hysteresis by default, configurable |
+| Komoot | DEM correction + smoothing |
+| TrainingPeaks | Configurable threshold (5m default) |
+
+Strava's approach of DEM (Digital Elevation Model) correction is the gold standard
+for GPS-only tracks: replace the noisy GPS altitude entirely with the ground truth
+from a 30m-resolution DEM such as SRTM. This requires an additional data source
+(e.g. the Open-Elevation API or a locally hosted SRTM tile set) but completely
+eliminates GPS altitude noise.
+
+---
+
+## Recommended fix
+
+Given the two failure modes observed:
+
+### Short term — ✅ Implemented (2026-04-20)
+
+**Hysteresis accumulation** with source-aware thresholds, applied at extract time:
+
+| Source | Threshold |
+|---|---|
+| FIT with `enhanced_altitude` (barometric) | 5 m |
+| FIT with GPS altitude | 10 m |
+| GPX | 10 m |
+| TCX | 10 m |
+
+`ParsedActivity.altitude_source` is set by each parser (`"barometric"` / `"gps"` /
+`"unknown"`). `_elevation()` in `metrics.py` selects the threshold from this value.
+
+New activities extracted after this change benefit automatically. Existing activities
+require re-extraction from source files.
+
+### Medium term — ✅ Implemented (2026-04-20)
+
+**On-demand DEM correction** via the edit drawer, using the Open-Elevation API
+(SRTM30 data):
+
+1. GPS track subsampled to one point per 10 s to minimise API calls.
+2. Terrain elevation fetched via `POST https://api.open-elevation.com/api/v1/lookup`
+   in batches of 512.
+3. DEM elevation linearly interpolated back to the full 1 Hz series.
+4. 5 m hysteresis applied to the interpolated series.
+5. Timeseries and activity JSON patched in place; chart and stats update immediately.
+
+Implementation: `bincio/extract/dem.py` + `POST /api/activity/{id}/recalculate-elevation`
+on both servers. Default endpoint: `https://api.open-elevation.com`; override with
+`--dem-url` or `DEM_URL` env var.
+
+This is the recommended fix for activities uploaded before the hysteresis improvement,
+or any activity where GPS noise is severe.
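
The five steps above can be sketched end to end without any network access. This is a minimal illustration under stated assumptions, not the code in `bincio/extract/dem.py`: the names `recalculate`, `interpolate`, `hysteresis_gain_loss` and `fake_dem_lookup` are hypothetical, every track point is assumed to have a valid GPS fix, and the DEM service is replaced by a stand-in function so the example stays self-contained.

```python
from typing import Callable, List, Sequence, Tuple

LatLon = Tuple[float, float]


def interpolate(idx: Sequence[int], vals: Sequence[float], n: int) -> List[float]:
    """Linearly interpolate values known at sorted indices idx onto 0..n-1."""
    if len(idx) == 1:
        return [vals[0]] * n
    out, j = [], 0
    for i in range(n):
        # Advance to the segment [idx[j], idx[j + 1]] that contains i.
        while j + 2 < len(idx) and i > idx[j + 1]:
            j += 1
        t = (i - idx[j]) / (idx[j + 1] - idx[j])
        out.append(vals[j] + t * (vals[j + 1] - vals[j]))
    return out


def hysteresis_gain_loss(elevations: Sequence[float], threshold_m: float):
    """Dead-band accumulation: commit only changes of at least threshold_m."""
    gain = loss = 0.0
    if not elevations:
        return gain, loss
    committed = elevations[0]
    for e in elevations[1:]:
        diff = e - committed
        if abs(diff) >= threshold_m:
            if diff > 0:
                gain += diff
            else:
                loss -= diff
            committed = e
    return gain, loss


def recalculate(
    track: Sequence[LatLon],
    lookup: Callable[[List[LatLon]], List[float]],
    step_s: int = 10,
    threshold_m: float = 5.0,
):
    """Subsample, look up terrain elevation, interpolate back, accumulate."""
    n = len(track)
    if n == 0:
        return [], 0.0, 0.0
    idx = list(range(0, n, step_s))
    if idx[-1] != n - 1:
        idx.append(n - 1)  # keep the final point so nothing is extrapolated
    dem = lookup([track[i] for i in idx])
    series = interpolate(idx, dem, n)
    gain, loss = hysteresis_gain_loss(series, threshold_m)
    return series, gain, loss


def fake_dem_lookup(points: List[LatLon]) -> List[float]:
    """Stand-in for the DEM service: terrain rises 1 m per 0.001 deg latitude."""
    return [round(lat * 1000.0, 3) for lat, _lon in points]


# A 61-second track heading north over the synthetic slope (0 m up to 60 m).
track = [(i * 0.001, 8.0) for i in range(61)]
series, gain, loss = recalculate(track, fake_dem_lookup)
print(len(series), gain, loss)
```

In a real deployment the `lookup` callable would wrap the batched HTTP request to the configured DEM endpoint. Passing it in as a parameter is a design choice made for this sketch so the interpolation and accumulation logic can be exercised offline.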
+
+---
+
+## Implementation status
+
+| File | Status |
+|---|---|
+| `bincio/extract/models.py` | ✅ `altitude_source` field added |
+| `bincio/extract/parsers/fit.py` | ✅ detects `enhanced_altitude` vs GPS |
+| `bincio/extract/parsers/gpx.py` | ✅ sets `altitude_source = "gps"` |
+| `bincio/extract/parsers/tcx.py` | ✅ sets `altitude_source = "gps"` |
+| `bincio/extract/metrics.py` | ✅ hysteresis `_elevation()` with source-aware threshold |
+| `bincio/extract/dem.py` | ✅ `lookup_elevations()` + `recalculate_elevation()` |
+| `bincio/serve/server.py` | ✅ `POST /api/activity/{id}/recalculate-elevation` |
+| `bincio/edit/server.py` | ✅ same endpoint (single-user) |
+| `site/src/components/EditDrawer.svelte` | ✅ "Recalculate from terrain map" button |
+| `tests/test_metrics.py` | ✅ 5 parametric tests |
diff --git a/docs/reference/cli.md b/docs/reference/cli.md
index f737a1d..3219f36 100644
--- a/docs/reference/cli.md
+++ b/docs/reference/cli.md
@@ -116,6 +116,7 @@ uv run bincio edit [OPTIONS]
 | `--port PORT` | `4041` | Bind port |
 | `--strava-client-id ID` | from config | Strava OAuth client ID |
 | `--strava-client-secret SECRET` | from config | Strava OAuth client secret |
+| `--dem-url URL` | `https://api.open-elevation.com` | Open-Elevation-compatible API for the "Recalculate elevation" button (also `DEM_URL` env var) |
 
 Set `PUBLIC_EDIT_URL=http://localhost:4041` in `site/.env` to enable the Edit button and Upload ↑ button in the site.
 
@@ -164,6 +165,7 @@ uv run bincio serve [OPTIONS]
 | `--site-dir DIR` | — | Astro site dir — enables post-write incremental rebuilds |
 | `--host HOST` | `127.0.0.1` | Bind address (keep on localhost; nginx proxies from outside) |
 | `--port PORT` | `4041` | Bind port |
+| `--dem-url URL` | `https://api.open-elevation.com` | Open-Elevation-compatible API for the "Recalculate elevation" button (also `DEM_URL` env var) |
 
 Requires `bincio init` to have been run first. Handles auth, user management, and write operations.
 nginx is responsible for serving static files and proxying `/api/*` to this server.
diff --git a/docs/user-guide.md b/docs/user-guide.md
index ec2de54..66b3faf 100644
--- a/docs/user-guide.md
+++ b/docs/user-guide.md
@@ -73,6 +73,21 @@ Click **Edit** on any activity to:
 
 Changes save instantly. The site rebuilds in the background.
 
+### Recalculating elevation from terrain data
+
+If an activity shows an unrealistic elevation gain (common with GPS-only devices on flat
+routes, or with older Garmin/Wahoo files), the edit drawer has a
+**"Recalculate from terrain map (DEM)"** button.
+
+Clicking it replaces the recorded GPS altitude with SRTM terrain data from the
+[Open-Elevation API](https://open-elevation.com) and recomputes the gain and loss. The
+elevation chart and the summary stats both update. This usually brings the numbers in
+line with what Strava or your device's app reports.
+
+> **Note:** The correction requires a GPS track (activities marked *No GPS* cannot be
+> corrected). The DEM has ~30 m horizontal resolution, so very short or indoor activities
+> are not meaningfully improved.
+
 ### Photo gallery
 
 Upload photos for an activity. They appear in a lightbox on the activity detail page. The server stores them in your data directory.