docs: mobile app — Pyodide/hidden-WebView extraction model, algorithm-travels-to-data pattern

Davide Scaini
2026-04-24 10:18:49 +02:00
parent e952d9bdc1
commit 61479fe554
+101 -90
@@ -33,6 +33,11 @@ device alongside the extracted BAS JSON. This means:
 re-extracts with the full Python pipeline.
 - No data is ever locked into a proprietary representation.
+**The algorithm travels to the data — not the other way around.** When internet is
+available, the app downloads a fresh copy of the extraction algorithm from bincio.org
+and runs it locally. Your activity files never touch the server. Only the Python
+wheel (the code) is downloaded; the data stays on device.
 **Sync is optional and explicit.** Connecting to a Bincio instance (bincio.org or
 self-hosted) adds cloud backup, the web feed, and the ability to share activities.
 The app never silently overwrites local data. Sync is user-initiated.
@@ -49,12 +54,12 @@ Several pieces of the mobile app are already implemented or proven:
 | Piece | Where | Notes |
 |---|---|---|
 | BAS schema | `docs/schema.md` | The on-device data format — identical to the server format |
-| In-browser FIT/GPX/TCX parsing | `site/src/pages/convert/` | Pyodide + the Python extractor running in a browser tab. Proves local extraction works. Not portable to mobile (Pyodide is 30 MB, browser-only). |
+| Pyodide-based extraction | `site/src/pages/convert/` | FIT/GPX/TCX parsing via CPython→WASM running in the browser. **This is the proof of concept for mobile extraction** — a hidden WebView in the app uses the exact same mechanism. |
+| Bincio wheel | served at `/bincio-0.1.0-py3-none-any.whl` | The extraction code packaged as a pure-Python wheel. Already downloaded and run by the `/convert/` page. |
 | Local activity storage | `site/src/pages/convert/` | IndexedDB + service worker in the web app. Proves the concept; the mobile app uses SQLite instead. |
 | Content-addressed dedup | `bincio/extract/dedup.py` | `source_hash` (SHA-256 of raw file) prevents duplicates on upload |
 | Sync-ready REST API | `bincio/serve/server.py` | Login, upload, activity detail, index.json — the sync primitives are already there |
 | Settings persistence | `bincio/serve/db.py` | `settings` table (key/value) for instance URL, auth token, sync preferences |
-| Elevation algorithms | `bincio/extract/metrics.py`, `bincio/extract/dem.py` | Hysteresis and DEM correction — need a TypeScript port for offline use |
 ---
@@ -67,7 +72,7 @@ Several pieces of the mobile app are already implemented or proven:
 - TypeScript-first, large ecosystem
 - `expo-sqlite` (v2+) — fast on-device SQLite with WAL mode
 - File picking from device storage: `expo-document-picker`
-- Direct filesystem access (important for Karoo): `expo-file-system`
+- Direct filesystem access (critical for Karoo): `expo-file-system`
 - Maps: MapLibre React Native (`@maplibre/maplibre-react-native`) — same tile
 standard as the web app, self-hostable
 - Background tasks: `expo-background-fetch` / `expo-task-manager`
@@ -78,57 +83,85 @@ Several pieces of the mobile app are already implemented or proven:
 | Option | Reason for skipping |
 |---|---|
-| Capacitor + Svelte | WebView performance is poor for map-heavy activity detail; Pyodide can't run on mobile |
+| Capacitor + Svelte | WebView performance is poor for map-heavy activity detail; same hidden-WebView trick for Pyodide applies either way |
-| Flutter | Dart is a new language to learn; no practical advantage over RN for this use case |
+| Flutter | Dart is a new language; no practical advantage for this use case |
 | PWA | iOS limits background sync, local storage quotas, and filesystem access — not viable for an activity logger |
 ---
-## Extraction: hybrid model
-Python (the server's extraction engine) cannot run on mobile without a specialised
-runtime. Rather than fully porting the extraction to TypeScript, the app uses a
-**tiered extraction model**:
-### Tier 1 — On-device TypeScript extraction (always available, offline)
-A TypeScript extraction library (`bincio-extract-ts`) runs entirely on the device:
-- **FIT parsing**: `@garmin/fitsdk` or `fit-file-parser` (mature JS libraries)
-- **GPX/TCX parsing**: standard XML parsing (`fast-xml-parser`)
-- **Metrics**: distance (Haversine), moving time, speed, HR/power averages, lap splits
-- **Elevation**: direct port of the hysteresis algorithm from `metrics.py`
-This produces a valid BAS JSON that the app can display immediately. It is the
-default path and works with no network.
-### Tier 2 — Server-assisted extraction (when an instance is reachable)
-When a Bincio instance is configured and online, the app can delegate extraction
-to the server:
+## Extraction: Pyodide in a hidden WebView
+This is the core technical insight. The `/convert/` page already demonstrates that
+the full Python extraction pipeline can run in a browser via **Pyodide** (CPython
+compiled to WebAssembly). A React Native app can host a hidden `WebView` component
+running the exact same code. No rewrite required.
+### How the /convert/ page does it today
+```
+Browser tab
+└── Pyodide (CPython → WASM, ~30 MB)
+├── lxml (pre-compiled in Pyodide — XML/GPX parsing)
+├── fitdecode (pure Python — FIT parsing)
+├── gpxpy (pure Python — GPX parsing)
+├── pyyaml (pure Python)
+└── bincio wheel (pure Python — metrics, hysteresis, writers)
+fetched from: /bincio-0.1.0-py3-none-any.whl
+```
+All dependencies are either pre-compiled in Pyodide or **pure Python with no C
+extensions**. This is the key: there is nothing to recompile for mobile.
+### How the mobile app does it
+```
+React Native app
+└── Hidden WebView (WKWebView on iOS, Chrome WebView on Android)
+└── Same Pyodide environment as the /convert/ page
+├── Pyodide runtime (cached on device after first download)
+├── lxml, fitdecode, gpxpy, pyyaml (cached)
+└── bincio wheel (fetched from bincio.org on startup / version check)
+Data flow:
+1. App reads FIT file bytes from device filesystem
+2. Sends bytes to WebView via postMessage
+3. WebView writes bytes to Pyodide's virtual FS
+4. Python runs the extraction → BAS JSON dict
+5. WebView sends JSON back via postMessage
+6. App stores BAS JSON in SQLite, original file on disk
+```
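
Steps 3 to 5 of the data flow above could look roughly like this on the Python side. This is only a sketch under stated assumptions: `extract_to_bas` is a hypothetical stand-in for whatever entry point the bincio wheel actually exposes, `handle_extract_message` is an illustrative name, and a temporary directory plays the role of Pyodide's in-memory filesystem (ordinary file calls work there too).

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def extract_to_bas(path: Path) -> dict:
    """Hypothetical stand-in for the real bincio extractor (not its actual API)."""
    raw = path.read_bytes()
    return {"source_file": path.name, "size_bytes": len(raw)}


def handle_extract_message(
    filename: str,
    payload: bytes,
    extractor: Callable[[Path], dict] = extract_to_bas,
) -> str:
    """Write the received bytes to the filesystem, run the extractor,
    and return a JSON string that can be posted back to the host app."""
    with tempfile.TemporaryDirectory() as workdir:
        # Keep only the base name so a message cannot write outside workdir.
        target = Path(workdir) / Path(filename).name
        target.write_bytes(payload)
        try:
            result = {"ok": True, "bas": extractor(target)}
        except Exception as exc:
            # Report the failure to the host instead of crashing the WebView.
            result = {"ok": False, "error": str(exc)}
    return json.dumps(result)


# Usage: the bytes here are placeholders, not a valid FIT file.
reply = json.loads(handle_extract_message("ride.fit", b"\x0e\x10fake-fit-bytes"))
```

Returning an `ok` flag rather than raising keeps the message protocol symmetric: the host app always receives one JSON reply per request and can decide how to surface errors.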
-1. Send the raw file to `POST /api/extract` (a new stateless endpoint — processes
-the file and returns BAS JSON, does **not** store anything).
-2. The server runs the full Python pipeline: FIT `enhanced_altitude` detection,
-source-aware hysteresis, DEM correction, power metrics, laps.
-3. The app stores the returned BAS JSON locally and marks it as server-extracted.
-This gives full extraction quality without maintaining two implementations of every
-algorithm. The original file is always stored locally, so the app can re-extract
-via the server at any time (e.g. after a DEM correction improvement is deployed).
-### Re-extraction
-Because the original file is always on device, the app can re-run either tier at
-any time:
-- **Re-extract offline**: apply an updated TypeScript algorithm to an existing
-original file.
-- **Re-extract via server**: send the original file to the server for higher-quality
-processing (e.g. after connecting to an instance for the first time).
+**Data never leaves the device.** The only network traffic is:
+- Pyodide runtime (CDN or bundled, ~30 MB, cached)
+- Common packages (CDN or bundled, cached)
+- The bincio wheel from bincio.org (~50 KB, updated on version bump)
+### Algorithm updates without app store releases
+The bincio wheel is versioned and served from bincio.org. On app startup (or
+periodically), the app checks the current wheel version:
+```
+GET https://bincio.org/bincio-latest.whl (or a version manifest endpoint)
+```
+If a new version is available, the wheel is downloaded and cached. The next
+extraction uses the updated algorithm. Improvements to hysteresis thresholds,
+DEM correction, lap detection, or any other metric are live on all devices
+within hours of deployment — **no App Store submission required**.
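
The version check described above can be sketched as a small comparison between the cached `wheel_version` setting and whatever version the server advertises. The function names are illustrative, and the source does not specify how the latest version is obtained, so it is simply passed in as a string here. Python is used for illustration; the app itself would presumably do this in TypeScript.

```python
from typing import Optional, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '0.1.0' into (0, 1, 0). Handles plain dotted integers only."""
    return tuple(int(part) for part in version.split("."))


def needs_update(cached: Optional[str], latest: str) -> bool:
    """True when no wheel is cached yet or the server advertises a newer one."""
    if cached is None:
        return True
    return parse_version(latest) > parse_version(cached)


def wheel_filename(version: str) -> str:
    """File name following the pattern of the wheel the /convert/ page fetches."""
    return f"bincio-{version}-py3-none-any.whl"
```

Comparing parsed tuples rather than raw strings matters once a component reaches two digits: as strings, `"0.10.0"` sorts before `"0.9.0"`.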
-This means extraction quality improves automatically as algorithms improve, without
-any data migration.
+### Performance
+- **First extraction after install**: ~5–8 s (Pyodide startup + package load)
+- **Subsequent extractions (warm WebView)**: ~1–3 s per activity
+- **Pyodide memory footprint**: ~100–150 MB RAM while active; the WebView can
+be suspended between extractions
+- **Wheel size**: the bincio extract code is ~50 KB; Pyodide + packages ~30 MB
+(downloaded once, cached on device)
+For batch import (many files at once), the WebView is kept warm across
+extractions, making the per-file cost just the Python execution time (~0.5–1 s
+per typical activity).
 ---
@@ -143,23 +176,21 @@ Bincio Mobile
 │ ├── Sync screen — configure instance URL, push/pull
 │ └── Settings screen — account, preferences, storage info
-├── Extraction Engine (TypeScript — Tier 1)
-│ ├── FIT parser — wraps @garmin/fitsdk
-│ ├── GPX parser — XML → BAS points
-│ ├── TCX parser — XML → BAS points
-│ ├── Metrics — port of metrics.py (distance, elevation, HR, power)
-│ └── Hysteresis — port of _hysteresis_gain_loss + _moving_average
+├── Extraction Engine (Pyodide in hidden WebView)
+│ ├── WebView host — manages lifecycle, message passing
+│ ├── Wheel cache — versioned bincio wheel stored on device
+│ └── Python runtime — Pyodide + fitdecode + gpxpy + lxml
+│ identical to the /convert/ page on the web
 ├── Local Store (expo-sqlite)
 │ ├── activities — BAS detail JSON + indexed summary columns
 │ ├── timeseries — 1 Hz arrays as JSON blob per activity
 │ ├── geojson — simplified GPS track per activity
-│ ├── originals — original file paths (or blobs) per activity
+│ ├── originals — original file paths per activity
 │ └── settings — instance_url, handle, auth_token, sync prefs
-└── Sync Layer
-├── Auth — POST /api/auth/login → session token
-├── Extract (Tier 2) — POST /api/extract → BAS JSON, no server storage
+└── Sync Layer (optional)
+├── Auth — POST /api/auth/login → Bearer token
 ├── Push — POST /api/upload (original file)
 └── Pull — GET index.json + activity/{id}.json + timeseries
 ```
@@ -175,9 +206,8 @@ source_hash TEXT NOT NULL, -- SHA-256 of original file (dedup key)
 detail_json TEXT NOT NULL, -- full BAS detail JSON blob
 timeseries_json TEXT, -- 1 Hz arrays (loaded lazily)
 geojson TEXT, -- simplified GPS track
-original_path TEXT, -- path to original file in app storage
-extraction_tier INTEGER, -- 1 = TypeScript, 2 = server-extracted
-synced_at INTEGER, -- unix timestamp of last push to remote (NULL = unsynced)
+original_path TEXT NOT NULL, -- path to original file in app storage
+synced_at INTEGER, -- unix timestamp of last push to remote
 origin TEXT NOT NULL, -- "local" | "remote"
 created_at INTEGER NOT NULL
@@ -194,6 +224,7 @@ value TEXT NOT NULL
 | `handle` | `brutsalvadi` |
 | `session_token` | `abc123…` |
 | `last_sync_at` | `2026-04-24T10:00:00Z` |
+| `wheel_version` | `0.1.0` |
 | `auto_import_path` | `/sdcard/Karoo/Rides/` (Android only) |
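
A key/value table like this can be exercised with a few lines of SQLite. The sketch below uses Python's built-in `sqlite3` module purely for illustration (the app would go through `expo-sqlite`), and it assumes `key` is the primary key, which the excerpt does not show.

```python
import sqlite3
from typing import Optional


def init_settings(conn: sqlite3.Connection) -> None:
    # Assumed schema: only "value TEXT NOT NULL" is visible in the source.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS settings ("
        "key TEXT PRIMARY KEY, value TEXT NOT NULL)"
    )


def set_setting(conn: sqlite3.Connection, key: str, value: str) -> None:
    """Insert the key, or overwrite its value if it already exists."""
    conn.execute(
        "INSERT OR REPLACE INTO settings (key, value) VALUES (?, ?)",
        (key, value),
    )


def get_setting(conn: sqlite3.Connection, key: str) -> Optional[str]:
    row = conn.execute(
        "SELECT value FROM settings WHERE key = ?", (key,)
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
init_settings(conn)
set_setting(conn, "wheel_version", "0.1.0")
set_setting(conn, "wheel_version", "0.2.0")  # a later wheel update overwrites it
```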
 ---
@@ -202,16 +233,16 @@ value TEXT NOT NULL
 Devices like the **Karoo 2** run Android and write FIT files directly to the
 filesystem (e.g. `/sdcard/Karoo/Rides/`). The app can monitor this directory and
-auto-import new files as rides complete, with no manual export step and no Hammerhead
-(or Garmin, Wahoo, etc.) cloud sync required.
+auto-import new files as rides complete, no manual export step, no Hammerhead
+cloud sync, no Garmin Connect, no Strava required.
 On Karoo specifically:
-- Install the Bincio Android APK directly.
+- Install the Bincio Android APK directly (sideload or via a store).
 - Configure `auto_import_path` to point at the Karoo's ride directory.
-- When a new FIT file appears, the app imports it automatically (Tier 1 extraction),
-stores the original file, and shows the ride in the feed.
-- When WiFi is available and an instance is configured, rides can be pushed to the
-instance (Tier 2 extraction for higher quality, or just raw upload).
+- When a new FIT file appears, the app imports it automatically (Pyodide
+extraction), stores the original file, and shows the ride in the feed.
+- When WiFi is available and an instance is configured, rides can be pushed to
+the instance for web access and backup.
 This makes Bincio a complete replacement for Hammerhead's own sync infrastructure
 for users who want full control of their data.
@@ -220,8 +251,7 @@ for users who want full control of their data.
 ## Sync protocol
-Sync is a two-way, hash-based diff — no custom server protocol needed beyond
-the existing REST API.
+Sync is a two-way, hash-based diff — no custom server protocol needed.
 ### Push (local → server)
@@ -240,34 +270,15 @@ the existing REST API.
 - `GET {instance_url}/activities/{id}.geojson` → `geojson`
 4. Insert with `origin = "remote"`, `synced_at = now()`.
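
Reduced to its core, a two-way hash-based diff is a pair of set differences. The sketch below assumes both sides can be keyed by `source_hash`; whether the server's `index.json` actually exposes that field is not confirmed by this excerpt, and `plan_sync` is an illustrative name.

```python
from typing import Dict, Set


def plan_sync(local_hashes: Set[str], remote_hashes: Set[str]) -> Dict[str, Set[str]]:
    """Decide what to push and what to pull from the two sets of hashes."""
    return {
        "push": local_hashes - remote_hashes,  # only on the device
        "pull": remote_hashes - local_hashes,  # only on the server
    }


plan = plan_sync({"aaa", "bbb"}, {"bbb", "ccc"})
```

Activities present on both sides appear in neither set, so a repeated sync with no new rides produces an empty plan.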
-Note: pulled activities don't have a local original file. If re-extraction is
-needed (e.g. for a DEM correction), the original must be uploaded to the instance
-first so the server can serve it back.
 ### Conflict handling
-Activities are immutable once created. The `source_hash` prevents double-counting
-if the same file is imported on two devices before sync, whichever copy arrives at
-the server first wins; the duplicate is rejected with a 409.
+Activities are immutable once created. The `source_hash` prevents double-counting:
+if the same file is imported on two devices before sync, whichever arrives at the
+server first wins; the duplicate is rejected with 409.
----
-## New server endpoint needed: `POST /api/extract`
-A stateless extraction endpoint: accepts a raw FIT/GPX/TCX file, runs the full
-Python extraction pipeline, returns BAS JSON. Does not write anything to disk.
-```
-POST /api/extract
-Content-Type: multipart/form-data
-file: <raw activity file>
-200 OK
-{
-"detail": { ...BAS detail JSON... },
-"timeseries": { ...1 Hz arrays... },
-"geojson": { ...simplified track... }
-}
-```
-No authentication required (the server is just a compute service here — the result
-is not stored). Rate limiting and file size cap apply.
 ---
@@ -304,8 +315,8 @@ The token is stored in the `settings` table and sent as
 | Phase | Scope |
 |---|---|
 | **0 — Foundation** | Expo project scaffold, SQLite store, settings screen, file picker, display a BAS JSON read from disk |
-| **1 — Import** | TypeScript FIT/GPX/TCX parser + metrics engine (Tier 1), local feed, activity detail with map and chart, original file storage |
-| **2 — Karoo integration** | Auto-import from a watched directory, Android-specific file access |
-| **3 — Sync** | `POST /api/extract` endpoint, Bearer token auth, push/pull sync with an instance |
-| **4 — Polish** | Offline map tiles, share sheet, home screen widget, performance |
-| **Future** | Live recording, Bluetooth sensors, full Garmin/Wahoo replacement |
+| **1 — Import** | Hidden WebView + Pyodide extraction, wheel download and caching, local feed, activity detail with map and chart, original file storage |
+| **2 — Karoo integration** | Auto-import from a watched directory, Android-specific filesystem access |
+| **3 — Sync** | Bearer token auth, push/pull sync with a Bincio instance |
+| **4 — Polish** | Offline map tiles, share sheet, home screen widget, batch import performance |
+| **Future** | Live recording, Bluetooth/ANT+ sensors, full Garmin/Wahoo/Hammerhead replacement |