docs: mobile app — Pyodide/hidden-WebView extraction model, algorithm-travels-to-data pattern

Davide Scaini
2026-04-24 10:18:49 +02:00
parent e952d9bdc1
commit 61479fe554
+101 -90
@@ -33,6 +33,11 @@ device alongside the extracted BAS JSON. This means:
re-extracts with the full Python pipeline.
- No data is ever locked into a proprietary representation.
**The algorithm travels to the data — not the other way around.** When internet is
available, the app downloads a fresh copy of the extraction algorithm from bincio.org
and runs it locally. Your activity files never touch the server. Only the Python
wheel (the code) is downloaded; the data stays on device.
**Sync is optional and explicit.** Connecting to a Bincio instance (bincio.org or
self-hosted) adds cloud backup, the web feed, and the ability to share activities.
The app never silently overwrites local data. Sync is user-initiated.
@@ -49,12 +54,12 @@ Several pieces of the mobile app are already implemented or proven:
| Piece | Where | Notes |
|---|---|---|
| BAS schema | `docs/schema.md` | The on-device data format — identical to the server format |
| In-browser FIT/GPX/TCX parsing | `site/src/pages/convert/` | Pyodide + the Python extractor running in a browser tab. Proves local extraction works. Not portable to mobile (Pyodide is 30 MB, browser-only). |
| Pyodide-based extraction | `site/src/pages/convert/` | FIT/GPX/TCX parsing via CPython→WASM running in the browser. **This is the proof of concept for mobile extraction** — a hidden WebView in the app uses the exact same mechanism. |
| Bincio wheel | served at `/bincio-0.1.0-py3-none-any.whl` | The extraction code packaged as a pure-Python wheel. Already downloaded and run by the `/convert/` page. |
| Local activity storage | `site/src/pages/convert/` | IndexedDB + service worker in the web app. Proves the concept; the mobile app uses SQLite instead. |
| Content-addressed dedup | `bincio/extract/dedup.py` | `source_hash` (SHA-256 of raw file) prevents duplicates on upload |
| Sync-ready REST API | `bincio/serve/server.py` | Login, upload, activity detail, index.json — the sync primitives are already there |
| Settings persistence | `bincio/serve/db.py` | `settings` table (key/value) for instance URL, auth token, sync preferences |
| Elevation algorithms | `bincio/extract/metrics.py`, `bincio/extract/dem.py` | Hysteresis and DEM correction — need a TypeScript port for offline use |
---
@@ -67,7 +72,7 @@ Several pieces of the mobile app are already implemented or proven:
- TypeScript-first, large ecosystem
- `expo-sqlite` (v2+) — fast on-device SQLite with WAL mode
- File picking from device storage: `expo-document-picker`
- Direct filesystem access (important for Karoo): `expo-file-system`
- Direct filesystem access (critical for Karoo): `expo-file-system`
- Maps: MapLibre React Native (`@maplibre/maplibre-react-native`) — same tile
standard as the web app, self-hostable
- Background tasks: `expo-background-fetch` / `expo-task-manager`
@@ -78,57 +83,85 @@ Several pieces of the mobile app are already implemented or proven:
| Option | Reason for skipping |
|---|---|
| Capacitor + Svelte | WebView performance is poor for map-heavy activity detail; Pyodide can't run on mobile |
| Flutter | Dart is a new language to learn; no practical advantage over RN for this use case |
| Capacitor + Svelte | WebView performance is poor for map-heavy activity detail; same hidden-WebView trick for Pyodide applies either way |
| Flutter | Dart is a new language; no practical advantage for this use case |
| PWA | iOS limits background sync, local storage quotas, and filesystem access — not viable for an activity logger |
---
## Extraction: hybrid model
## Extraction: Pyodide in a hidden WebView
Python (the server's extraction engine) cannot run on mobile without a specialised
runtime. Rather than fully porting the extraction to TypeScript, the app uses a
**tiered extraction model**:
This is the core technical insight. The `/convert/` page already demonstrates that
the full Python extraction pipeline can run in a browser via **Pyodide** (CPython
compiled to WebAssembly). A React Native app can host a hidden `WebView` component
running the exact same code. No rewrite required.
### Tier 1 — On-device TypeScript extraction (always available, offline)
### How the /convert/ page does it today
A TypeScript extraction library (`bincio-extract-ts`) runs entirely on the device:
```
Browser tab
└── Pyodide (CPython → WASM, ~30 MB)
├── lxml (pre-compiled in Pyodide — XML/GPX parsing)
├── fitdecode (pure Python — FIT parsing)
├── gpxpy (pure Python — GPX parsing)
├── pyyaml (pure Python)
└── bincio wheel (pure Python — metrics, hysteresis, writers)
fetched from: /bincio-0.1.0-py3-none-any.whl
```
- **FIT parsing**: `@garmin/fitsdk` or `fit-file-parser` (mature JS libraries)
- **GPX/TCX parsing**: standard XML parsing (`fast-xml-parser`)
- **Metrics**: distance (Haversine), moving time, speed, HR/power averages, lap splits
- **Elevation**: direct port of the hysteresis algorithm from `metrics.py`
All dependencies are either pre-compiled in Pyodide or **pure Python with no C
extensions**. This is the key: there is nothing to recompile for mobile.
This produces a valid BAS JSON that the app can display immediately. It is the
default path and works with no network.
### How the mobile app does it
### Tier 2 — Server-assisted extraction (when an instance is reachable)
```
React Native app
└── Hidden WebView (WKWebView on iOS, Chrome WebView on Android)
└── Same Pyodide environment as the /convert/ page
├── Pyodide runtime (cached on device after first download)
├── lxml, fitdecode, gpxpy, pyyaml (cached)
└── bincio wheel (fetched from bincio.org on startup / version check)
When a Bincio instance is configured and online, the app can delegate extraction
to the server:
Data flow:
1. App reads FIT file bytes from device filesystem
2. Sends bytes to WebView via postMessage
3. WebView writes bytes to Pyodide's virtual FS
4. Python runs the extraction → BAS JSON dict
5. WebView sends JSON back via postMessage
6. App stores BAS JSON in SQLite, original file on disk
```
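Steps 3 to 5 of the data flow above could look roughly like this on the Python side. This is a minimal sketch, not the actual bincio code: the message fields (`id`, `filename`, `data_b64`) and the `extract` callable are hypothetical stand-ins, and the bytes are assumed to arrive base64-encoded because WebView `postMessage` carries strings.

```python
import base64
import json
import tempfile
from pathlib import Path
from typing import Callable


def handle_extract_message(message: str, extract: Callable[[Path], dict]) -> str:
    """Handle one extraction request from the app and return the JSON reply.

    `extract` is a placeholder for whatever entry point the bincio wheel
    exposes; it takes a file path and returns a BAS detail dict.
    """
    request = json.loads(message)
    raw = base64.b64decode(request["data_b64"])

    # Step 3: write the bytes to the (virtual) filesystem.
    with tempfile.TemporaryDirectory() as tmp:
        # Keep only the base name so the request cannot escape the temp dir.
        path = Path(tmp) / Path(request["filename"]).name
        path.write_bytes(raw)
        try:
            # Step 4: run the extraction and get a BAS JSON dict.
            detail = extract(path)
            reply = {"id": request["id"], "ok": True, "detail": detail}
        except Exception as exc:
            # Report failures to the app instead of crashing the WebView.
            reply = {"id": request["id"], "ok": False, "error": str(exc)}

    # Step 5: the caller sends this string back via postMessage.
    return json.dumps(reply)
```

Echoing the request `id` in the reply lets the app match responses to requests when several files are queued against a warm WebView.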
1. Send the raw file to `POST /api/extract` (a new stateless endpoint — processes
the file and returns BAS JSON, does **not** store anything).
2. The server runs the full Python pipeline: FIT `enhanced_altitude` detection,
source-aware hysteresis, DEM correction, power metrics, laps.
3. The app stores the returned BAS JSON locally and marks it as server-extracted.
**Data never leaves the device.** The only network traffic is:
- Pyodide runtime (CDN or bundled, ~30 MB, cached)
- Common packages (CDN or bundled, cached)
- The bincio wheel from bincio.org (~50 KB, updated on version bump)
This gives full extraction quality without maintaining two implementations of every
algorithm. The original file is always stored locally, so the app can re-extract
via the server at any time (e.g. after a DEM correction improvement is deployed).
### Algorithm updates without app store releases
### Re-extraction
The bincio wheel is versioned and served from bincio.org. On app startup (or
periodically), the app checks the current wheel version:
Because the original file is always on device, the app can re-run either tier at
any time:
```
GET https://bincio.org/bincio-latest.whl (or a version manifest endpoint)
```
- **Re-extract offline**: apply an updated TypeScript algorithm to an existing
original file.
- **Re-extract via server**: send the original file to the server for higher-quality
processing (e.g. after connecting to an instance for the first time).
If a new version is available, the wheel is downloaded and cached. The next
extraction uses the updated algorithm. Improvements to hysteresis thresholds,
DEM correction, lap detection, or any other metric are live on all devices
within hours of deployment — **no App Store submission required**.
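The version check could be as simple as the comparison below. It is a sketch under assumptions: versions are taken to be plain dotted integers such as `0.1.0`, the cached value is the `wheel_version` setting, and the function names are illustrative. It is written in Python for consistency with the extraction code, although the app would perform this check in TypeScript.

```python
from typing import Optional, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '0.10.2' into (0, 10, 2) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def needs_update(cached: Optional[str], latest: str) -> bool:
    """Return True when the wheel should be downloaded.

    `cached` is the version stored on device (None on first run);
    `latest` is the version advertised by the server.
    """
    if cached is None:
        return True
    return parse_version(latest) > parse_version(cached)
```

Comparing integer tuples rather than raw strings avoids treating `0.10.0` as older than `0.9.0`.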
This means extraction quality improves automatically as algorithms improve, without
any data migration.
### Performance
- **First extraction after install**: ~5-8 s (Pyodide startup + package load)
- **Subsequent extractions (warm WebView)**: ~1-3 s per activity
- **Pyodide memory footprint**: ~100-150 MB RAM while active; the WebView can
be suspended between extractions
- **Wheel size**: the bincio extract code is ~50 KB; Pyodide + packages ~30 MB
(downloaded once, cached on device)
For batch import (many files at once), the WebView is kept warm across
extractions, making the per-file cost just the Python execution time (~0.5-1 s
per typical activity).
---
@@ -143,23 +176,21 @@ Bincio Mobile
│ ├── Sync screen — configure instance URL, push/pull
│ └── Settings screen — account, preferences, storage info
├── Extraction Engine (TypeScript — Tier 1)
│ ├── FIT parser — wraps @garmin/fitsdk
│ ├── GPX parser — XML → BAS points
│ ├── TCX parser — XML → BAS points
│ ├── Metrics — port of metrics.py (distance, elevation, HR, power)
│ └── Hysteresis — port of _hysteresis_gain_loss + _moving_average
├── Extraction Engine (Pyodide in hidden WebView)
│ ├── WebView host — manages lifecycle, message passing
│ ├── Wheel cache — versioned bincio wheel stored on device
│ └── Python runtime — Pyodide + fitdecode + gpxpy + lxml
│ identical to the /convert/ page on the web
├── Local Store (expo-sqlite)
│ ├── activities — BAS detail JSON + indexed summary columns
│ ├── timeseries — 1 Hz arrays as JSON blob per activity
│ ├── geojson — simplified GPS track per activity
│ ├── originals — original file paths (or blobs) per activity
│ ├── originals — original file paths per activity
│ └── settings — instance_url, handle, auth_token, sync prefs
└── Sync Layer
├── Auth — POST /api/auth/login → session token
├── Extract (Tier 2) — POST /api/extract → BAS JSON, no server storage
└── Sync Layer (optional)
├── Auth — POST /api/auth/login → Bearer token
├── Push — POST /api/upload (original file)
└── Pull — GET index.json + activity/{id}.json + timeseries
```
@@ -175,9 +206,8 @@ source_hash TEXT NOT NULL, -- SHA-256 of original file (dedup key)
detail_json TEXT NOT NULL, -- full BAS detail JSON blob
timeseries_json TEXT, -- 1 Hz arrays (loaded lazily)
geojson TEXT, -- simplified GPS track
original_path TEXT, -- path to original file in app storage
extraction_tier INTEGER, -- 1 = TypeScript, 2 = server-extracted
synced_at INTEGER, -- unix timestamp of last push to remote (NULL = unsynced)
original_path TEXT NOT NULL, -- path to original file in app storage
synced_at INTEGER, -- unix timestamp of last push to remote
origin TEXT NOT NULL, -- "local" | "remote"
created_at INTEGER NOT NULL
@@ -194,6 +224,7 @@ value TEXT NOT NULL
| `handle` | `brutsalvadi` |
| `session_token` | `abc123…` |
| `last_sync_at` | `2026-04-24T10:00:00Z` |
| `wheel_version` | `0.1.0` |
| `auto_import_path` | `/sdcard/Karoo/Rides/` (Android only) |
---
@@ -202,16 +233,16 @@ value TEXT NOT NULL
Devices like the **Karoo 2** run Android and write FIT files directly to the
filesystem (e.g. `/sdcard/Karoo/Rides/`). The app can monitor this directory and
auto-import new files as rides complete, with no manual export step and no Hammerhead
(or Garmin, Wahoo, etc.) cloud sync required.
auto-import new files as rides complete: no manual export step, no Hammerhead
cloud sync, no Garmin Connect, no Strava required.
On Karoo specifically:
- Install the Bincio Android APK directly.
- Install the Bincio Android APK directly (sideload or via a store).
- Configure `auto_import_path` to point at the Karoo's ride directory.
- When a new FIT file appears, the app imports it automatically (Tier 1 extraction),
stores the original file, and shows the ride in the feed.
- When WiFi is available and an instance is configured, rides can be pushed to the
instance (Tier 2 extraction for higher quality, or just raw upload).
- When a new FIT file appears, the app imports it automatically (Pyodide
extraction), stores the original file, and shows the ride in the feed.
- When WiFi is available and an instance is configured, rides can be pushed to
the instance for web access and backup.
This makes Bincio a complete replacement for Hammerhead's own sync infrastructure
for users who want full control of their data.
@@ -220,8 +251,7 @@ for users who want full control of their data.
## Sync protocol
Sync is a two-way, hash-based diff — no custom server protocol needed beyond
the existing REST API.
Sync is a two-way, hash-based diff — no custom server protocol needed.
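A hash-based diff reduces to two set differences. The sketch below assumes both sides can list the `source_hash` of every activity they hold (locally from SQLite, remotely from the instance's index); `plan_sync` is an illustrative name, not an existing function.

```python
from typing import Iterable, Set, Tuple


def plan_sync(
    local_hashes: Iterable[str], remote_hashes: Iterable[str]
) -> Tuple[Set[str], Set[str]]:
    """Return (to_push, to_pull) given the hashes known on each side."""
    local, remote = set(local_hashes), set(remote_hashes)
    to_push = local - remote  # on device only: upload the original file
    to_pull = remote - local  # on server only: download the activity
    return to_push, to_pull
```

Hashes present on both sides need no work, so repeated syncs with no new activities transfer nothing.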
### Push (local → server)
@@ -240,34 +270,15 @@ the existing REST API.
- `GET {instance_url}/activities/{id}.geojson` → `geojson`
4. Insert with `origin = "remote"`, `synced_at = now()`.
Note: pulled activities don't have a local original file. If re-extraction is
needed (e.g. for a DEM correction), the original must be uploaded to the instance
first so the server can serve it back.
### Conflict handling
Activities are immutable once created. The `source_hash` prevents double-counting
if the same file is imported on two devices before sync, whichever copy arrives at
the server first wins; the duplicate is rejected with a 409.
---
## New server endpoint needed: `POST /api/extract`
A stateless extraction endpoint: accepts a raw FIT/GPX/TCX file, runs the full
Python extraction pipeline, returns BAS JSON. Does not write anything to disk.
```
POST /api/extract
Content-Type: multipart/form-data
file: <raw activity file>
200 OK
{
"detail": { ...BAS detail JSON... },
"timeseries": { ...1 Hz arrays... },
"geojson": { ...simplified track... }
}
```
No authentication required (the server is just a compute service here — the result
is not stored). Rate limiting and file size cap apply.
Activities are immutable once created. The `source_hash` prevents double-counting:
if the same file is imported on two devices before sync, whichever arrives at the
server first wins; the duplicate is rejected with 409.
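The first-arrival-wins rule can be illustrated with an in-memory stand-in for the server. This is only a sketch: `ActivityStore` is hypothetical, and the `201` success status is an assumption, since the text above only specifies `409` for duplicates.

```python
import hashlib
from typing import Dict


def source_hash(raw: bytes) -> str:
    """SHA-256 of the raw activity file, used as the dedup key."""
    return hashlib.sha256(raw).hexdigest()


class ActivityStore:
    """Toy model of the server's dedup behaviour (not the real server code)."""

    def __init__(self) -> None:
        self._owner_by_hash: Dict[str, str] = {}

    def upload(self, raw: bytes, device: str) -> int:
        """Return an HTTP-style status: 201 if stored, 409 if a duplicate."""
        key = source_hash(raw)
        if key in self._owner_by_hash:
            return 409  # the same file already arrived from some device
        self._owner_by_hash[key] = device
        return 201  # assumed success status
```

Because the key depends only on the file bytes, it does not matter which device uploads first; the second upload of identical bytes is always rejected.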
---
@@ -304,8 +315,8 @@ The token is stored in the `settings` table and sent as
| Phase | Scope |
|---|---|
| **0 — Foundation** | Expo project scaffold, SQLite store, settings screen, file picker, display a BAS JSON read from disk |
| **1 — Import** | TypeScript FIT/GPX/TCX parser + metrics engine (Tier 1), local feed, activity detail with map and chart, original file storage |
| **2 — Karoo integration** | Auto-import from a watched directory, Android-specific file access |
| **3 — Sync** | `POST /api/extract` endpoint, Bearer token auth, push/pull sync with an instance |
| **4 — Polish** | Offline map tiles, share sheet, home screen widget, performance |
| **Future** | Live recording, Bluetooth sensors, full Garmin/Wahoo replacement |
| **1 — Import** | Hidden WebView + Pyodide extraction, wheel download and caching, local feed, activity detail with map and chart, original file storage |
| **2 — Karoo integration** | Auto-import from a watched directory, Android-specific filesystem access |
| **3 — Sync** | Bearer token auth, push/pull sync with a Bincio instance |
| **4 — Polish** | Offline map tiles, share sheet, home screen widget, batch import performance |
| **Future** | Live recording, Bluetooth/ANT+ sensors, full Garmin/Wahoo/Hammerhead replacement |