docs: update changelog, CLI reference, user guide, and elevation notes

- CHANGELOG: document hysteresis elevation fix and DEM recalculation feature
- docs/reference/cli.md: add --dem-url to bincio edit and bincio serve tables
- docs/user-guide.md: document "Recalculate from terrain map" button in edit drawer
- docs/elevation.md: mark both short-term and medium-term fixes as implemented
Davide Scaini
2026-04-20 21:18:50 +02:00
parent 0c659db6cb
commit 2b7a37ed41
4 changed files with 389 additions and 0 deletions
+60
@@ -1,5 +1,65 @@
# Changelog
## [Unreleased] — 2026-04-20
### Improvement — Elevation gain accuracy (hysteresis accumulation)
The previous algorithm accumulated every positive elevation delta between
consecutive track points, counting GPS jitter and barometric quantization
noise as real climbing. This consistently overestimated gain: on the flat
coastal route analysed in `docs/elevation.md`, the reported 353.6 m was more
than double the device's own ~153 m.
The new algorithm uses **hysteresis dead-band accumulation**: elevation is
only committed when it changes by more than a source-specific threshold from
the last committed value. GPS noise is suppressed without losing real climbs.
- **`bincio/extract/models.py`** — `ParsedActivity` gains an `altitude_source`
field (`"barometric"` / `"gps"` / `"unknown"`)
- **`bincio/extract/parsers/fit.py`** — detects whether any record frame used
`enhanced_altitude` (barometric altimeter) vs `altitude` (GPS-derived) and
sets `altitude_source` accordingly
- **`bincio/extract/parsers/gpx.py`**, **`tcx.py`** — both set
`altitude_source = "gps"`
- **`bincio/extract/metrics.py`** — `_elevation()` replaced with hysteresis
accumulator; thresholds: **5 m** for barometric, **10 m** for GPS/unknown
- **`tests/test_metrics.py`** — 5 new parametric tests: flat GPS noise
suppression, barometric vs GPS threshold difference, real climb approximation,
unknown-treated-as-gps invariant
### New feature — DEM-based elevation recalculation from the edit drawer
A new **"Recalculate from terrain map (DEM)"** button in the activity edit
drawer replaces noisy GPS altitude with SRTM terrain data, then re-applies
hysteresis accumulation to compute corrected gain/loss.
This is the recommended fix for activities that still show inaccurate
elevation after the hysteresis improvement (e.g. activities recorded before
re-extracting from sources, or uploads where the source file had severe GPS
altitude noise).
How it works:
1. The server subsamples the activity's 1 Hz GPS track (one point every 10 s)
2. Queries an Open-Elevation-compatible API for terrain elevation
3. Linearly interpolates DEM elevation back to every GPS-valid second
4. Applies 5 m hysteresis to compute the corrected gain and loss
5. Writes the updated `elevation_m` array to the timeseries (chart updates)
6. Patches `elevation_gain_m` / `elevation_loss_m` in the activity JSON and
`index.json` summary
- **`bincio/extract/dem.py`** (new) — `lookup_elevations()` (batched HTTP POST,
Open-Elevation wire format) + `recalculate_elevation()` (full pipeline above)
- **`POST /api/activity/{id}/recalculate-elevation`** — on both `bincio serve`
(auth-gated, triggers `merge_one` + rebuild) and `bincio edit` (no auth)
- **`bincio serve --dem-url URL`** / **`bincio edit --dem-url URL`** — override
the default DEM endpoint (also read from `DEM_URL` env var)
- Default DEM endpoint: **`https://api.open-elevation.com`** — works out of the
box with no configuration
- **`GET /api/me`** response gains `dem_configured: bool`
- **`EditDrawer.svelte`** — button with spinner, shows `↑ Xm ↓ Ym` on success
or an inline error (e.g. if the DEM API is unreachable)
---
## [Unreleased] — 2026-04-16
### New feature — Self-service user settings page
+312
@@ -0,0 +1,312 @@
# Elevation gain calculation — problem analysis and roadmap
## The problem
Bincio's current algorithm naively accumulates every positive elevation delta between
consecutive track points. This **always overestimates** real climbing because it treats
sensor noise as genuine ascent. The overestimation ranges from insignificant on long
mountain rides to catastrophic on flat routes where 100% of the reported gain is noise.
---
## Current algorithm (`metrics.py:_elevation`)
```python
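def _elevation(pts):
    # Keep only points that actually carry an elevation sample.
    elevations = [p.elevation_m for p in pts if p.elevation_m is not None]
    gain = loss = 0.0
    for a, b in zip(elevations, elevations[1:]):
        diff = b - a
        if diff > 0:
            gain += diff  # every positive step counts, however small
        else:
            loss += diff  # note: accumulates as a non-positive number
    return gain, loss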
```
Every positive step — including 0.1m GPS jitter, barometric quantization steps of
0.2m, and random-walk noise — is counted as climbing. There is no filtering,
smoothing, or minimum-step threshold of any kind.
---
## Root causes of overestimation
### 1. GPS-derived altitude noise
GPS units derive altitude from satellite trilateration. This is inherently less
accurate than horizontal positioning: typical GPS altitude error is ±5–15m, and the
error follows a correlated random walk. On a flat route, the track oscillates above
and below the true elevation, and the positive half of those oscillations accumulates
as phantom climbing.
**Characteristic signature:** elevation range far smaller than reported gain; nearly
100% of deltas are sub-1m; median step size is 0.0m.
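The effect can be reproduced on synthetic data. The following is a minimal sketch, not a measurement: the flat 5 m track, the ±0.3 m sinusoidal wobble standing in for correlated GPS error, and the helper name `naive_gain` are all assumptions made for illustration.

```python
import math


def naive_gain(elevations):
    """Sum every positive step, as the naive accumulator does."""
    return sum(b - a for a, b in zip(elevations, elevations[1:]) if b > a)


# Hypothetical flat track: true elevation 5 m plus a slow +/-0.3 m wobble,
# sampled once per second for one hour.
track = [5.0 + 0.3 * math.sin(0.5 * i) for i in range(3600)]

span = max(track) - min(track)
phantom_gain = naive_gain(track)
print(f"span = {span:.2f} m, naive gain = {phantom_gain:.0f} m")
```

The elevation span stays below 0.6 m, yet the accumulated gain is several hundred times larger, which matches the signature described above.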
### 2. Barometric altimeter quantization
Devices with a barometric sensor report higher-quality data, but they still apply
internal smoothing and quantise the output to fixed increments (commonly 0.2m or
0.4m). The device holds the reading steady for several seconds, then steps to the
next quantised value. Small-but-real oscillations at the quantisation boundary
(e.g. hovering between 128.2m and 128.4m while essentially flat) produce repeated
tiny up/down steps that accumulate.
**Characteristic signature:** many repeated identical elevations (device holding);
most transitions are 0.0m, 0.2m or 0.4m; significant fraction of gain from sub-1m
steps even on a real climb.
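A small simulation illustrates the boundary effect. It is only a sketch under stated assumptions: the 0.2 m increment is taken from the text above, while the ±0.02 m oscillation around 128.3 m and the names `quantise` and `naive_gain` are invented for the example.

```python
import math

STEP_M = 0.2  # assumed quantisation increment of the device


def quantise(x, step=STEP_M):
    """Snap a reading to the nearest multiple of `step`."""
    return round(x / step) * step


def naive_gain(elevations):
    """Sum every positive step, as the naive accumulator does."""
    return sum(b - a for a, b in zip(elevations, elevations[1:]) if b > a)


# True elevation hovers +/-0.02 m around 128.3 m, the midpoint between the
# 128.2 m and 128.4 m quantisation levels.
true_elev = [128.3 + 0.02 * math.sin(0.1 * i) for i in range(3600)]
reported = [quantise(e) for e in true_elev]

true_span = max(true_elev) - min(true_elev)
phantom = naive_gain(reported)
print(f"true span = {true_span:.2f} m, reported gain = {phantom:.1f} m")
```

A 4 cm real oscillation turns into repeated 0.2 m steps, so an hour of essentially flat riding reports more than 11 m of climbing.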
### 3. High sampling rate amplifying both effects
At 1 Hz, both GPS and barometric sensors produce more noise steps per meter of
real climbing than at lower rates. Downsampling to 1 Hz (as the timeseries writer
does) does not eliminate noise already present in the source data.
---
## Case studies
### Activity 1 — diego_p, 2026-04-11T051441Z (Wahoo ELEMNT, GPX)
- **URL:** https://bincio.org/activity/2026-04-11T051441Z/
- **Bincio reports:** 353.6m gain
- **Correct estimate:** ~150–160m (Wahoo device's own reading, which applies internal
thresholding)
- **Excess:** ~200m (about 57% of the reported gain; a ~130% overestimate relative to the device reading)
| Metric | Value |
|--------|-------|
| Points | 15,721 |
| Elevation range | −10.6m to +5.4m (16m total span) |
| Median \|delta\| | 0.000m |
| Zero-change steps | 12,358 (79%) |
| Sub-0.5m steps | 3,339 (21%) |
| Steps ≥ 1m | 0 (0%) |
| Gain from sub-1m steps | 353.6m (100% of total) |
**Diagnosis:** This is a flat coastal route. The GPS altitude range is only 16m.
Every single metre of reported gain is sub-1m GPS jitter — no real climbing is
recorded at all. Even a 1m threshold would produce exactly 0m gain, which is wrong
in the other direction (the route may have minor real undulation). The Wahoo device's
own algorithm uses internal hysteresis to report ~153m.
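The rows of the table above can be derived from a plain list of elevations. This is a hedged sketch rather than the script that produced the figures: `step_profile` and its key names are hypothetical, and it assumes "sub-0.5m steps" excludes zero-change steps, as the table's separate rows suggest.

```python
import statistics


def step_profile(elevations):
    """Summarise step sizes in the style of the case-study tables."""
    deltas = [b - a for a, b in zip(elevations, elevations[1:])]
    gain_total = sum(d for d in deltas if d > 0)
    gain_sub_1m = sum(d for d in deltas if 0 < d < 1.0)
    return {
        "points": len(elevations),
        "span_m": max(elevations) - min(elevations),
        "median_abs_delta_m": statistics.median(abs(d) for d in deltas),
        "zero_steps": sum(1 for d in deltas if d == 0),
        "sub_half_m_steps": sum(1 for d in deltas if 0 < abs(d) < 0.5),
        "steps_ge_1m": sum(1 for d in deltas if abs(d) >= 1.0),
        "gain_total_m": gain_total,
        "gain_sub_1m_m": gain_sub_1m,
    }


# Tiny made-up series: a device holding at 2.0 m with small excursions.
profile = step_profile([2.0, 2.0, 2.2, 2.0, 2.0, 2.4, 2.0, 2.0])
for key, value in profile.items():
    print(f"{key}: {value}")
```

When `gain_sub_1m_m` equals `gain_total_m` and `steps_ge_1m` is zero, as in Activity 1, all of the reported gain comes from sub-metre steps.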
### Activity 2 — m4xw3ll__, 2026-04-14T161945Z (Bryton Rider, FIT)
- **URL:** http://95.216.55.151/activity/2026-04-14T161945Z/
- **Bincio reports:** 1285.2m gain
- **Correct estimate:** ~885m (Strava / device reading)
- **Excess:** ~400m (45% overestimate)
| Metric | Value |
|--------|-------|
| Points | 6,583 |
| Elevation range | 0.0m to 454.0m (454m total span) |
| Median \|delta\| | 0.000m |
| Zero-change steps | 4,077 (62%) |
| Sub-0.5m steps | 769 (12%) |
| 0.5–1m steps | 647 (10%) |
| 1–2m steps | 921 (14%) |
| 2m+ steps | 168 (3%) |
| Gain from sub-1m steps | 484.0m (38% of total) |
**Diagnosis:** Real climbing exists (0–454m) but 38% of the reported gain comes from
sub-1m barometric quantization noise. The Bryton records elevation at ≈0.2m
increments. At quantization boundaries the device oscillates producing repeated
tiny up/down steps. A simple 1m threshold gives 801m (10% below truth); 2m gives
only 221m (too aggressive). Pure threshold-based filtering doesn't work well here.
---
## Alternative algorithms
### A. Simple threshold
Only count a step if it exceeds `min_step_m`:
```python
gain += diff if diff >= min_step_m else 0
```
**Pros:** trivial to implement, zero overhead.
**Cons:** flat/hiking routes with gradual slopes produce many steps < threshold
that together represent real climbing. On the Bryton activity a 1m threshold already
loses about 10% of real gain, and a 2m threshold keeps only 221m of ~885m (about
75% lost). Requires per-device tuning that is impractical.
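The failure on gradual slopes is easy to show with a synthetic climb. A minimal sketch, assuming a made-up ramp of 0.5 m per sample; `threshold_gain` is a hypothetical name for the per-step rule in the snippet above.

```python
def threshold_gain(elevations, min_step_m):
    """Count a positive step only when it is at least `min_step_m`."""
    return sum(
        b - a
        for a, b in zip(elevations, elevations[1:])
        if b - a >= min_step_m
    )


# Hypothetical steady climb: 0.5 m per sample for 100 samples (50 m in total).
climb = [0.5 * i for i in range(101)]

kept_no_threshold = threshold_gain(climb, 0.0)
kept_1m = threshold_gain(climb, 1.0)
print(kept_no_threshold, kept_1m)
```

Every individual step is below 1 m, so the 1 m threshold discards the entire 50 m climb even though none of it is noise.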
---
### B. Hysteresis / dead-band accumulation
Track a "committed" elevation and commit a new one only when it differs from
the last committed value by at least `threshold_m`. Gain and loss are accumulated
only between committed values.
```python
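def _elevation_hysteresis(elevations, threshold_m=10.0):
    gain = loss = 0.0
    if not elevations:
        return gain, loss  # nothing to accumulate
    committed = elevations[0]  # last elevation we trusted
    for e in elevations[1:]:
        diff = e - committed
        # Ignore movement inside the dead band around the committed value.
        if abs(diff) >= threshold_m:
            if diff > 0:
                gain += diff
            else:
                loss += abs(diff)
            committed = e
    return gain, loss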
```
**Pros:** naturally handles both GPS drift and barometric quantization; used by
Strava (proprietary variant), RideWithGPS (10m default), and GPSies (5m).
**Cons:** threshold choice is critical and device-dependent. On a genuine 8m climb
followed by descent, a 10m threshold records zero. Needs to be lower for cycling
than hiking (slopes are smoother, sensors better).
**Results on our case studies with threshold=10m:**
- Wahoo flat (correct ~153m): would likely produce 0–30m. Fixes the gross overcount
but may undercount real minor undulation.
- Bryton climb (correct ~885m): would need evaluation on the raw data.
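The 8m case from the cons can be traced by hand. The sketch below restates the dead-band accumulator under a hypothetical name (`hysteresis_gain_loss`) so that it runs on its own, and applies it to a made-up profile that rises 8 m in 1 m steps and then returns to zero.

```python
def hysteresis_gain_loss(elevations, threshold_m):
    """Dead-band accumulation: commit only moves of at least `threshold_m`."""
    gain = loss = 0.0
    if not elevations:
        return gain, loss
    committed = elevations[0]
    for e in elevations[1:]:
        diff = e - committed
        if abs(diff) >= threshold_m:
            if diff > 0:
                gain += diff
            else:
                loss += -diff
            committed = e
    return gain, loss


# Made-up profile: climb 0 -> 8 m in 1 m steps, then descend back to 0 m.
profile = [float(i) for i in range(9)] + [float(i) for i in range(7, -1, -1)]

at_10m = hysteresis_gain_loss(profile, 10.0)
at_5m = hysteresis_gain_loss(profile, 5.0)
print(at_10m, at_5m)
```

With a 10 m threshold the climb never leaves the dead band and nothing is recorded. With 5 m the accumulator commits once on the way up and once on the way down, reporting 5 m of the real 8 m, so a lower threshold reduces the undercount without removing it.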
---
### C. Moving-average pre-smoothing
Apply a sliding-window mean or Gaussian blur to the elevation series, then
accumulate naively.
```python
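import statistics

def smooth(elevations, window=30):
    half = window // 2
    out = []
    for i, e in enumerate(elevations):
        # Centred window, truncated at the start and end of the series.
        lo, hi = max(0, i - half), min(len(elevations), i + half + 1)
        out.append(statistics.mean(elevations[lo:hi]))
    return out

# Naive accumulation over the smoothed list of floats. `_elevation` above takes
# point objects, so the smoothed values are summed directly here.
smoothed = smooth(elevations, window=30)
gain = sum(b - a for a, b in zip(smoothed, smoothed[1:]) if b > a)
loss = sum(a - b for a, b in zip(smoothed, smoothed[1:]) if b < a)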
```
**Pros:** easy to implement; smoothing removes high-frequency noise while preserving
long-wavelength terrain.
**Cons:** loses real short climbs (e.g. a 20m ramp over 20 seconds is averaged to
near-flat). Window size needs tuning per sample rate. Edge effects at start/end.
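The loss of short climbs can be quantified on a synthetic bump. This is an illustrative sketch only: the bump (10 s up to 20 m, 10 s back down, flat on either side) is made up, and `smooth` and `naive_gain` are local stand-ins for the functions discussed above.

```python
import statistics


def smooth(elevations, window=30):
    """Centred moving average, truncated at the ends of the series."""
    half = window // 2
    return [
        statistics.fmean(elevations[max(0, i - half): i + half + 1])
        for i in range(len(elevations))
    ]


def naive_gain(elevations):
    """Sum every positive step."""
    return sum(b - a for a, b in zip(elevations, elevations[1:]) if b > a)


# 40 s flat, 10 s climbing 2 m/s to 20 m, 10 s descending, 40 s flat.
bump = (
    [0.0] * 40
    + [2.0 * k for k in range(1, 11)]
    + [20.0 - 2.0 * k for k in range(1, 11)]
    + [0.0] * 40
)

raw_gain = naive_gain(bump)
smoothed_gain = naive_gain(smooth(bump, window=30))
print(f"raw = {raw_gain:.1f} m, smoothed = {smoothed_gain:.2f} m")
```

The 30-sample window spreads the 20 m bump over its flat surroundings, and only about a third of the real gain survives.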
---
### D. Savitzky-Golay filter
A polynomial least-squares smoothing filter that better preserves peaks and
troughs than a simple moving average. Available as `scipy.signal.savgol_filter`.
Note that scipy depends on numpy, not the other way round, so having numpy installed
does not provide scipy; adding it is a new dependency choice.
**Pros:** better terrain shape preservation than moving average; standard in
scientific signal processing.
**Cons:** requires scipy; harder to implement without it; window/order tuning still
required.
---
### E. Kalman filter (device-class-aware)
A Kalman filter can be tuned with separate process noise and measurement noise
parameters for GPS vs barometric data. This is what high-end cycling computers do
internally.
**Pros:** optimal in theory when the noise model is linear and Gaussian; can be parameterised per device class.
**Cons:** significantly more complex; requires knowing the device class (GPS-only vs
barometric); still requires parameter tuning.
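For a sense of the complexity involved, here is a minimal scalar Kalman filter with a random-walk model for true elevation. It is a sketch, not what any device is known to run: the function name, the per-class variances in `NOISE`, and the alternating test signal are all assumptions chosen for illustration.

```python
def kalman_smooth(measurements, process_var, measurement_var):
    """Scalar Kalman filter; state is elevation, modelled as a random walk."""
    x = measurements[0]        # state estimate (m)
    p = measurement_var        # variance of the estimate (m^2)
    out = [x]
    for z in measurements[1:]:
        p += process_var                  # predict: uncertainty grows
        k = p / (p + measurement_var)     # Kalman gain
        x += k * (z - x)                  # update with the innovation
        p *= 1.0 - k                      # uncertainty shrinks after update
        out.append(x)
    return out


# Hypothetical noise settings per device class (variances in m^2), not tuned.
NOISE = {
    "gps": {"process_var": 0.01, "measurement_var": 25.0},
    "barometric": {"process_var": 0.01, "measurement_var": 0.25},
}

# Flat route at 100 m with +/-3 m alternating measurement error.
raw = [100.0 + 3.0 * (-1) ** i for i in range(200)]
filtered = kalman_smooth(raw, **NOISE["gps"])
tail = filtered[100:]
print(f"raw span = {max(raw) - min(raw):.1f} m, "
      f"filtered span = {max(tail) - min(tail):.3f} m")
```

The filter itself is short, but the two variances per device class are exactly the parameters that would have to be tuned and validated against real recordings.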
---
### F. Source-aware strategy
Use different algorithms depending on the file type and whether the device reported
enhanced (barometric) altitude:
- **FIT file with `enhanced_altitude` field**: barometric data, use hysteresis 5m
- **FIT file with GPS altitude only**: treat as GPS, use hysteresis 10–15m or
discard altitude entirely and use a DEM lookup
- **GPX with `<ele>` tag**: assume GPS unless `<extensions>` contains barometric
fields; use hysteresis 10–15m
- **Strava-enriched data**: Strava's API provides corrected `altitude` arrays; use
as-is with hysteresis 2m to catch quantization
---
## What Strava/Garmin/others do
| Platform | Method |
|---|---|
| Strava | Proprietary; replaces raw altitude with DEM-corrected data for GPS-only devices; applies internal smoothing before accumulation |
| Garmin Connect | Uses enhanced\_altitude (barometric), applies Kalman filter on-device; Connect re-applies server-side smoothing |
| Wahoo | On-device hysteresis (≈3m threshold); the GPX file contains already-smoothed altitude |
| RideWithGPS | 10m hysteresis by default, configurable |
| Komoot | DEM correction + smoothing |
| TrainingPeaks | Configurable threshold (5m default) |
Strava's approach of DEM (Digital Elevation Model) correction is the gold standard
for GPS-only tracks: replace the noisy GPS altitude entirely with the ground truth
from a 30m-resolution DEM such as SRTM. This requires an additional data source
(e.g. the Open-Elevation API or a locally hosted SRTM tile set) but completely
eliminates GPS altitude noise.
---
## Recommended fix
Given the two failure modes observed:
### Short term — ✅ Implemented (2026-04-20)
**Hysteresis accumulation** with source-aware thresholds, applied at extract time:
| Source | Threshold |
|---|---|
| FIT with `enhanced_altitude` (barometric) | 5 m |
| FIT with GPS altitude | 10 m |
| GPX | 10 m |
| TCX | 10 m |
`ParsedActivity.altitude_source` is set by each parser (`"barometric"` / `"gps"` /
`"unknown"`). `_elevation()` in `metrics.py` selects the threshold from this value.
New activities extracted after this change benefit automatically. Existing activities
require re-extraction from source files.
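The selection logic can be sketched as a lookup followed by dead-band accumulation. The threshold values come from the table above; the names `THRESHOLD_M`, `threshold_for` and `hysteresis_gain_loss`, and the fallback for unrecognised values, are assumptions and may differ from what `metrics.py` actually does.

```python
THRESHOLD_M = {"barometric": 5.0, "gps": 10.0, "unknown": 10.0}


def threshold_for(altitude_source):
    """Pick the hysteresis threshold for a parser-reported altitude source."""
    # Assumption: anything unrecognised falls back to the GPS value.
    return THRESHOLD_M.get(altitude_source, THRESHOLD_M["gps"])


def hysteresis_gain_loss(elevations, threshold_m):
    """Dead-band accumulation: commit only moves of at least `threshold_m`."""
    gain = loss = 0.0
    if not elevations:
        return gain, loss
    committed = elevations[0]
    for e in elevations[1:]:
        diff = e - committed
        if abs(diff) >= threshold_m:
            if diff > 0:
                gain += diff
            else:
                loss += -diff
            committed = e
    return gain, loss


# The same made-up 7 m bump, interpreted under both sources.
bump = [0.0, 7.0, 0.0]
baro = hysteresis_gain_loss(bump, threshold_for("barometric"))
gps = hysteresis_gain_loss(bump, threshold_for("gps"))
print(baro, gps)
```

A 7 m rise is counted when the source is barometric and ignored when it is GPS, which is the intended trade-off: trust the better sensor with smaller features.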
### Medium term — ✅ Implemented (2026-04-20)
**On-demand DEM correction** via the edit drawer, using the Open-Elevation API
(SRTM30 data):
1. GPS track subsampled to one point per 10 s to minimise API calls.
2. Terrain elevation fetched via `POST https://api.open-elevation.com/api/v1/lookup`
in batches of 512.
3. DEM elevation linearly interpolated back to the full 1 Hz series.
4. 5 m hysteresis applied to the interpolated series.
5. Timeseries and activity JSON patched in place; chart and stats update immediately.
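Steps 1 and 3 can be sketched without any network access. This is not the code in `dem.py`: the function names are hypothetical, and always keeping the final point (so that the tail of the track can be interpolated) is an assumption.

```python
def subsample_indices(n_points, every=10):
    """Indices of every `every`-th point, always including the last one."""
    idx = list(range(0, n_points, every))
    if n_points and idx[-1] != n_points - 1:
        idx.append(n_points - 1)
    return idx


def interpolate_back(idx, dem_values, n_points):
    """Linearly interpolate DEM samples taken at `idx` onto 0..n_points-1."""
    if len(idx) == 1:
        return [float(dem_values[0])] * n_points
    out = [0.0] * n_points
    pairs = list(zip(idx, dem_values))
    for (i0, e0), (i1, e1) in zip(pairs, pairs[1:]):
        for i in range(i0, i1 + 1):
            out[i] = e0 + (e1 - e0) * (i - i0) / (i1 - i0)
    return out


# A 25-second track needs only four DEM lookups.
idx = subsample_indices(25, every=10)
dem = [100.0, 110.0, 105.0, 105.0]  # made-up terrain elevations at idx
full = interpolate_back(idx, dem, 25)
print(idx, full[5], full[15], full[24])
```

The interpolated series has one value per second again, so the 5 m hysteresis step runs on it exactly as it would on recorded data.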
Implementation: `bincio/extract/dem.py` + `POST /api/activity/{id}/recalculate-elevation`
on both servers. Default endpoint: `https://api.open-elevation.com`; override with
`--dem-url` or `DEM_URL` env var.
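The batched lookup can be sketched as follows. The endpoint path and batch size come from the steps above; the JSON shapes follow the Open-Elevation request and response format as commonly documented, and the function names are hypothetical. The requests are built but deliberately not sent.

```python
import json
import urllib.request

BATCH_SIZE = 512  # batch size quoted in the steps above


def build_lookup_requests(points, base_url="https://api.open-elevation.com"):
    """Build (but do not send) one POST request per batch of (lat, lon)."""
    requests = []
    for start in range(0, len(points), BATCH_SIZE):
        batch = points[start:start + BATCH_SIZE]
        body = json.dumps(
            {"locations": [{"latitude": lat, "longitude": lon}
                           for lat, lon in batch]}
        ).encode("utf-8")
        requests.append(
            urllib.request.Request(
                f"{base_url}/api/v1/lookup",
                data=body,
                headers={"Content-Type": "application/json"},
                method="POST",
            )
        )
    return requests


def parse_lookup_response(raw_bytes):
    """Extract elevations from {"results": [{"elevation": ...}, ...]}."""
    return [float(r["elevation"]) for r in json.loads(raw_bytes)["results"]]


reqs = build_lookup_requests([(45.0, 7.0)] * 1000)
sample = b'{"results": [{"latitude": 45.0, "longitude": 7.0, "elevation": 312}]}'
print(len(reqs), reqs[0].full_url, parse_lookup_response(sample))
```

Passing a different `base_url` mirrors what `--dem-url` and `DEM_URL` are for: pointing the same wire format at a self-hosted instance.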
This is the recommended fix for activities uploaded before the hysteresis improvement,
or any activity where GPS noise is severe.
---
## Implementation status
| File | Status |
|---|---|
| `bincio/extract/models.py` | ✅ `altitude_source` field added |
| `bincio/extract/parsers/fit.py` | ✅ detects `enhanced_altitude` vs GPS |
| `bincio/extract/parsers/gpx.py` | ✅ sets `altitude_source = "gps"` |
| `bincio/extract/parsers/tcx.py` | ✅ sets `altitude_source = "gps"` |
| `bincio/extract/metrics.py` | ✅ hysteresis `_elevation()` with source-aware threshold |
| `bincio/extract/dem.py` | ✅ `lookup_elevations()` + `recalculate_elevation()` |
| `bincio/serve/server.py` | ✅ `POST /api/activity/{id}/recalculate-elevation` |
| `bincio/edit/server.py` | ✅ same endpoint (single-user) |
| `site/src/components/EditDrawer.svelte` | ✅ "Recalculate from terrain map" button |
| `tests/test_metrics.py` | ✅ 5 parametric tests |
+2
@@ -116,6 +116,7 @@ uv run bincio edit [OPTIONS]
| `--port PORT` | `4041` | Bind port |
| `--strava-client-id ID` | from config | Strava OAuth client ID |
| `--strava-client-secret SECRET` | from config | Strava OAuth client secret |
| `--dem-url URL` | `https://api.open-elevation.com` | Open-Elevation-compatible API for the "Recalculate elevation" button (also `DEM_URL` env var) |
Set `PUBLIC_EDIT_URL=http://localhost:4041` in `site/.env` to enable the Edit button and Upload ↑ button in the site.
@@ -164,6 +165,7 @@ uv run bincio serve [OPTIONS]
| `--site-dir DIR` | — | Astro site dir — enables post-write incremental rebuilds |
| `--host HOST` | `127.0.0.1` | Bind address (keep on localhost; nginx proxies from outside) |
| `--port PORT` | `4041` | Bind port |
| `--dem-url URL` | `https://api.open-elevation.com` | Open-Elevation-compatible API for the "Recalculate elevation" button (also `DEM_URL` env var) |
Requires `bincio init` to have been run first. Handles auth, user management, and write operations. nginx is responsible for serving static files and proxying `/api/*` to this server.
+15
@@ -73,6 +73,21 @@ Click **Edit** on any activity to:
Changes save instantly. The site rebuilds in the background.
### Recalculating elevation from terrain data
If an activity shows an unrealistic elevation gain (common with GPS-only devices on flat
routes, or with older Garmin/Wahoo files), the edit drawer has a
**"Recalculate from terrain map (DEM)"** button.
Clicking it replaces the recorded GPS altitude with SRTM terrain data from the
[Open-Elevation API](https://open-elevation.com) and recomputes the gain and loss. The
elevation chart and the summary stats both update. This usually brings the numbers in
line with what Strava or your device's app reports.
> **Note:** The correction requires a GPS track (activities marked *No GPS* cannot be
> corrected). The DEM has ~30 m horizontal resolution, so very short or indoor activities
> are not meaningfully improved.
### Photo gallery
Upload photos for an activity. They appear in a lightbox on the activity detail page. The server stores them in your data directory.