bincio-activity/SCHEMA.md
2026-04-01 11:05:00 +02:00

BincioActivity Schema (BAS) — v1.0

The BincioActivity Schema defines how activity data is stored and shared as plain JSON files. It is the federation protocol: if you publish a BAS-compliant data store, any BincioActivity instance can read it.

Any tool — in any language — can produce BAS-compliant JSON without using the bincio Python package. The schema is the contract; the package is one implementation.


Files

A BAS data store is a directory (or URL prefix) with this structure:

{store_root}/
  index.json                      ← user manifest and activity feed
  index_{year}.json               ← optional yearly shards (large datasets)
  activities/
    {id}.json                     ← full activity detail
    {id}.geojson                  ← simplified GPS track (optional)

All files are UTF-8 JSON. All timestamps are ISO 8601 with timezone offset. All distances are in metres. All speeds are in km/h. All durations are in seconds. null means "not recorded / not available".


index.json

The entry point for a data store.

{
  "bas_version": "1.0",
  "owner": {
    "handle": "brutsalvadi",
    "display_name": "Bru",
    "avatar_url": null
  },
  "generated_at": "2026-03-28T10:00:00Z",
  "shards": [
    { "year": 2024, "url": "index_2024.json", "count": 312 }
  ],
  "activities": [ ... ]
}

Fields

Field               Type         Required  Description
bas_version         string       yes       Schema version. Currently "1.0".
owner.handle        string       yes       URL-safe identifier, e.g. "brutsalvadi".
owner.display_name  string       yes       Human-readable name.
owner.avatar_url    string|null  no        Absolute URL to an avatar image.
generated_at        string       yes       ISO 8601 timestamp of when this file was generated.
shards              array        no        Pointers to yearly shard files. See below.
activities          array        yes       Array of Activity Summary objects. May be empty.

index.json should contain all activities when the total count is under ~5,000. Above that, use yearly shards and keep only the most recent 200 activities inline in index.json for fast feed rendering.
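A producer could apply the sharding rule above roughly as follows. This is a minimal sketch, not the bincio implementation: the function name `build_index_parts`, the exact threshold of 5,000 (the text only says "~5,000"), the choice to bucket by the calendar year written in `started_at`, and the `index_{year}.json` shard URLs are all assumptions made for illustration. It also assumes timestamps carry a numeric offset such as `+02:00`.

```python
from collections import defaultdict
from datetime import datetime

SHARD_THRESHOLD = 5000  # the text says "~5,000"; the exact cut-off is an assumption
INLINE_RECENT = 200     # most recent activities kept inline once sharding is on


def build_index_parts(summaries):
    """Split Activity Summary dicts into (inline activities, shard pointers, shard contents)."""
    # Sort newest first by comparing real instants, not raw strings.
    ordered = sorted(
        summaries,
        key=lambda a: datetime.fromisoformat(a["started_at"]),
        reverse=True,
    )
    if len(ordered) < SHARD_THRESHOLD:
        # Small store: everything stays inline, no shards needed.
        return ordered, [], {}

    by_year = defaultdict(list)
    for activity in ordered:
        # Assumption: a shard's year is the calendar year written in started_at.
        by_year[datetime.fromisoformat(activity["started_at"]).year].append(activity)

    shards = [
        {"year": year, "url": f"index_{year}.json", "count": len(items)}
        for year, items in sorted(by_year.items())
    ]
    return ordered[:INLINE_RECENT], shards, dict(by_year)
```

The first return value would go into `activities`, the second into `shards`, and each entry of the third would be written to its own shard file.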

Shard object

Field  Type     Description
year   integer  Calendar year covered by this shard.
url    string   Relative or absolute URL to the shard file.
count  integer  Number of activities in the shard.
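On the consumer side, reading a complete feed means loading index.json and then following each shard pointer. The sketch below assumes a store on the local filesystem with relative shard URLs; `load_all_summaries` is a hypothetical helper name. The layout of a shard file is not spelled out here, so the code accepts either a bare array of summaries or an object with an `activities` array; that tolerance is an assumption.

```python
import json
from pathlib import Path


def load_all_summaries(store_root):
    """Return every Activity Summary in a local BAS store, de-duplicated by id."""
    root = Path(store_root)
    index = json.loads((root / "index.json").read_text(encoding="utf-8"))

    # Inline activities come first; shards may repeat the most recent ones.
    by_id = {a["id"]: a for a in index.get("activities", [])}

    for shard in index.get("shards") or []:
        # Only relative URLs are handled in this sketch.
        shard_doc = json.loads((root / shard["url"]).read_text(encoding="utf-8"))
        # Assumption: a shard is either a bare list or {"activities": [...]}.
        items = shard_doc["activities"] if isinstance(shard_doc, dict) else shard_doc
        for activity in items:
            by_id.setdefault(activity["id"], activity)

    return list(by_id.values())
```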

Activity Summary object

Appears in index.json (and yearly shard files). Contains only the fields needed to render an activity card in a feed — no timeseries, no full track.

{
  "id": "2024-06-01T073012Z-morning-ride",
  "title": "Morning Ride",
  "sport": "cycling",
  "sub_sport": "road",
  "started_at": "2024-06-01T07:30:12+02:00",
  "distance_m": 42300.0,
  "duration_s": 5400,
  "moving_time_s": 5100,
  "elevation_gain_m": 620.0,
  "avg_speed_kmh": 28.2,
  "max_speed_kmh": 52.1,
  "avg_hr_bpm": 148,
  "max_hr_bpm": 178,
  "avg_cadence_rpm": 88,
  "avg_power_w": null,
  "source": "strava_export",
  "privacy": "public",
  "detail_url": "activities/2024-06-01T073012Z-morning-ride.json",
  "track_url": "activities/2024-06-01T073012Z-morning-ride.geojson"
}

Fields

Field             Type          Required  Description
id                string        yes       Unique identifier. See Activity ID section.
title             string        yes       Human-readable name. May be auto-generated if not in source.
sport             string        yes       One of: cycling, running, hiking, walking, swimming, skiing, other.
sub_sport         string|null   no        e.g. road, mountain, gravel, indoor, trail, track, nordic, alpine, open_water, pool.
started_at        string        yes       ISO 8601 timestamp with timezone.
distance_m        number|null   no        Total distance in metres.
duration_s        integer|null  no        Total elapsed time in seconds.
moving_time_s     integer|null  no        Time in motion (stopped periods excluded).
elevation_gain_m  number|null   no        Cumulative positive elevation in metres.
avg_speed_kmh     number|null   no        Average speed over moving time.
max_speed_kmh     number|null   no        Maximum instantaneous speed.
avg_hr_bpm        integer|null  no        Average heart rate.
max_hr_bpm        integer|null  no        Maximum heart rate.
avg_cadence_rpm   integer|null  no        Average cadence (rpm for cycling, spm for running).
avg_power_w       integer|null  no        Average power in watts.
source            string|null   no        Origin of data. See Source values.
privacy           string        yes       One of: public, blur_start, no_gps, private.
mmp               array|null    no        Mean Maximal Power curve: [[duration_s, avg_watts], ...].
best_efforts      array|null    no        Best efforts by distance: [[distance_km, time_s], ...].
best_climb_m      number|null   no        Best single climb in metres (Kadane's algorithm).
detail_url        string|null   no        Relative or absolute URL to the full activity JSON.
track_url         string|null   no        Relative or absolute URL to the GeoJSON track. null if privacy is no_gps.
preview_coords    array|null    no        Simplified track preview: [[lon, lat], ...] for card thumbnails.
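The table names Kadane's algorithm for best_climb_m but does not say how it is applied. One plausible reading, sketched here as an assumption, is to run Kadane's maximum-sum scan over the differences between consecutive elevation samples, so that small dips inside a long climb do not end it. The function name and the handling of null samples (skipped) are illustrative choices.

```python
def best_climb_m(elevation_m):
    """Largest net elevation gain over any contiguous stretch, in metres."""
    # Assumption: null samples are simply skipped.
    samples = [e for e in elevation_m if e is not None]
    if len(samples) < 2:
        return None  # not enough data to measure a climb

    best = current = 0.0
    for previous, following in zip(samples, samples[1:]):
        # Kadane step: extend the running climb, or restart it at zero
        # once the accumulated gain has been fully lost.
        current = max(0.0, current + (following - previous))
        best = max(best, current)
    return best
```

For the profile `[100, 120, 115, 150, 90]` the deltas are 20, -5, 35, -60, and the best climb is 50 m (from 100 up to 150, absorbing the 5 m dip).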

Activity ID

The canonical ID format is:

{started_at_compact}[-{slug}]

Where started_at_compact is the start timestamp with the colons removed from the time part (the hyphens in the date are kept), e.g. 2024-06-01T073012Z, and slug is an optional URL-safe lowercase form of the title (spaces → hyphens, non-ASCII stripped).

Example: 2024-06-01T073012Z-morning-ride

IDs must be unique within a data store. When a title is unavailable, the timestamp alone is sufficient: 2024-06-01T073012Z.
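A minimal sketch of ID generation follows. The helper names are hypothetical. To reproduce the example in this document (started_at `2024-06-01T07:30:12+02:00`, ID `2024-06-01T073012Z-morning-ride`), the sketch keeps the clock time exactly as written and appends `Z`; whether an implementation should convert to UTC first is not stated here, so treat that as an assumption. Collapsing any run of non-alphanumeric characters into a single hyphen is also an illustrative choice.

```python
import re


def slugify(title):
    """URL-safe lowercase slug: non-ASCII stripped, separators become hyphens."""
    ascii_title = title.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_title.lower()).strip("-")


def make_activity_id(started_at, title=None):
    """Build '{started_at_compact}[-{slug}]' from an ISO 8601 timestamp."""
    # Keep 'YYYY-MM-DDTHH:MM:SS' as written, then drop the colons.
    # Assumption: the clock time is not converted to UTC before adding 'Z'.
    compact = started_at[:19].replace(":", "") + "Z"
    slug = slugify(title) if title else ""
    return f"{compact}-{slug}" if slug else compact
```

Uniqueness within the store is still the producer's responsibility; two activities with the same start second and title would need a disambiguating suffix, which this sketch does not add.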

Source values

Value           Description
strava_export   Strava bulk data export
garmin_connect  Garmin Connect bulk export
wahoo           Wahoo ELEMNT / SYSTM export
komoot          Komoot GPX export
gpx_file        Generic GPX file
fit_file        Generic FIT file
tcx_file        Generic TCX file
karoo           Hammerhead Karoo device export
manual          Manually created

Privacy levels

Level       GPS track published       Timeseries lat/lon  Stats in index
public      Full track                Included            Yes
blur_start  First/last 200 m removed  Trimmed             Yes
no_gps      Not published             Not included        Yes
private     Not published             Not included        No (not in index at all)
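The blur_start row can be sketched as a trim over the GPS samples. The table does not say how the 200 m is measured; this example assumes it is distance travelled along the track (not a radius around the start point) and uses the haversine formula on a spherical Earth. Function names are hypothetical, and the inputs are assumed to contain no null coordinates.

```python
import math

EARTH_RADIUS_M = 6371000.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in metres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def blur_start_bounds(lat, lon, trim_m=200.0):
    """Return slice bounds (first, stop) that drop the first and last trim_m of track."""
    # Cumulative distance travelled at each sample.
    travelled = [0.0]
    for i in range(1, len(lat)):
        travelled.append(travelled[-1] + haversine_m(lat[i - 1], lon[i - 1], lat[i], lon[i]))
    total = travelled[-1]

    keep = [i for i, d in enumerate(travelled) if d >= trim_m and total - d >= trim_m]
    # A track shorter than 2 * trim_m keeps nothing.
    return (keep[0], keep[-1] + 1) if keep else (0, 0)
```

The same bounds would then be applied to the track coordinates and to the lat/lon timeseries so the published files stay consistent with each other.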

activities/{id}.json

Full activity record. Extends the Summary with timeseries and metadata.

{
  "bas_version": "1.0",
  "id": "2024-06-01T073012Z-morning-ride",
  "title": "Morning Ride",
  "description": "Easy morning spin before work.",
  "sport": "cycling",
  "sub_sport": "road",
  "started_at": "2024-06-01T07:30:12+02:00",
  "distance_m": 42300.0,
  "duration_s": 5400,
  "moving_time_s": 5100,
  "elevation_gain_m": 620.0,
  "elevation_loss_m": 615.0,
  "avg_speed_kmh": 28.2,
  "max_speed_kmh": 52.1,
  "avg_hr_bpm": 148,
  "max_hr_bpm": 178,
  "avg_cadence_rpm": 88,
  "avg_power_w": null,
  "max_power_w": null,
  "gear": "Canyon Ultimate CF SL",
  "device": "Hammerhead Karoo 2",
  "bbox": [9.1234, 45.4321, 9.5678, 45.8765],
  "start_latlng": [45.4321, 9.1234],
  "end_latlng": [45.4321, 9.1235],
  "laps": [],
  "timeseries": {
    "t": [0, 1, 2],
    "lat": [45.4321, 45.4322, 45.4323],
    "lon": [9.1234, 9.1235, 9.1236],
    "elevation_m": [120.0, 120.5, 121.0],
    "speed_kmh": [0.0, 15.2, 22.4],
    "hr_bpm": [null, 142, 145],
    "cadence_rpm": [null, 85, 88],
    "power_w": [null, null, null],
    "temperature_c": [null, null, null]
  },
  "source": "karoo",
  "source_file": "13957.activity.abc123.fit",
  "source_hash": "sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
  "strava_id": null,
  "privacy": "public",
  "custom": {}
}

Additional fields (beyond Summary)

Field             Type          Required  Description
description       string|null   no        Free-text description.
elevation_loss_m  number|null   no        Cumulative negative elevation.
max_power_w       integer|null  no        Maximum power in watts.
gear              string|null   no        Equipment used (bike name, shoe model…).
device            string|null   no        Recording device (e.g. "Garmin Edge 530").
bbox              array|null    no        [min_lon, min_lat, max_lon, max_lat]. Null if no GPS.
start_latlng      array|null    no        [lat, lon] of activity start.
end_latlng        array|null    no        [lat, lon] of activity end.
laps              array         yes       Array of Lap objects. Empty array if no laps.
timeseries        object        yes       Parallel arrays of sensor data. See below.
source_file       string|null   no        Original filename (basename only, no path).
source_hash       string|null   no        sha256:{hex} of the original raw file bytes. Used for deduplication.
strava_id         string|null   no        Strava activity ID if origin is a Strava export.
custom            object        yes       Free dict for plugin-computed fields. Must be present, may be {}.

Timeseries object

Parallel arrays, all the same length. Index i corresponds to t[i] seconds after the activity start.

Key            Type          Unit     Description
t              int[]         seconds  Seconds since started_at. Always present.
lat            float[]|null  degrees  Latitude. null if no GPS or privacy=no_gps.
lon            float[]|null  degrees  Longitude. null if no GPS or privacy=no_gps.
elevation_m    float[]       metres   Elevation. Array of nulls if unavailable.
speed_kmh      float[]       km/h     Speed. Array of nulls if unavailable.
hr_bpm         int[]         bpm      Heart rate. Array of nulls if no HR sensor.
cadence_rpm    int[]         rpm/spm  Cadence. Array of nulls if unavailable.
power_w        int[]         watts    Power. Array of nulls if no power meter.
temperature_c  float[]       °C       Temperature. Array of nulls if unavailable.

Timeseries are downsampled to at most 1 sample per second. The exact downsampling strategy is implementation-defined; linear interpolation and nearest-neighbour selection are both acceptable.

lat and lon arrays are either both present (both non-null arrays) or both null. Treat null the same as an array of nulls.
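A consumer that prefers one record per sample can zip the parallel arrays, as in the sketch below. `iter_samples` is a hypothetical helper, not part of the bincio package. It enforces the two rules stated above: every array has the same length as t, and lat and lon are either both arrays or both null.

```python
def iter_samples(timeseries):
    """Yield one dict per sample from a BAS timeseries object."""
    n = len(timeseries["t"])  # t is always present and sets the length

    if (timeseries.get("lat") is None) != (timeseries.get("lon") is None):
        raise ValueError("lat and lon must both be arrays or both be null")

    columns = {}
    for key, values in timeseries.items():
        if values is None:
            values = [None] * n  # null is treated as an array of nulls
        if len(values) != n:
            raise ValueError(f"{key} has {len(values)} samples, expected {n}")
        columns[key] = values

    for i in range(n):
        yield {key: column[i] for key, column in columns.items()}
```

Because this is a generator, validation errors surface when iteration starts rather than when the function is called.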

Lap object

{
  "index": 0,
  "started_at": "2024-06-01T07:30:12+02:00",
  "duration_s": 1800,
  "distance_m": 21150.0,
  "elevation_gain_m": 310.0,
  "avg_speed_kmh": 28.2,
  "avg_hr_bpm": 145,
  "avg_power_w": null
}

activities/{id}.geojson

Simplified GPS track for map rendering. Omitted entirely when privacy is no_gps or private.

{
  "type": "Feature",
  "geometry": {
    "type": "LineString",
    "coordinates": [
      [9.1234, 45.4321, 120.0],
      [9.1235, 45.4322, 120.5]
    ]
  },
  "properties": {
    "id": "2024-06-01T073012Z-morning-ride",
    "speeds": [0.0, 15.2],
    "simplification": "rdp",
    "rdp_epsilon": 0.0001,
    "point_count_original": 7200,
    "point_count_simplified": 843
  }
}

Coordinates are [longitude, latitude, elevation_metres] per GeoJSON spec. The speeds property is a parallel array to coordinates — one speed value per point — used for gradient coloring on the map.
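The example above labels its simplification as "rdp" (Ramer-Douglas-Peucker). A compact sketch of that technique is shown below, returning the indices that survive so that speeds can be filtered with the same indices and stay parallel to coordinates. Treating lon/lat as planar coordinates and interpreting rdp_epsilon in degrees are assumptions; the function names are hypothetical.

```python
import math


def _distance_to_line(p, a, b):
    """Planar distance from point p to the line through a and b (lon/lat only)."""
    (x, y), (x1, y1), (x2, y2) = p[:2], a[:2], b[:2]
    dx, dy = x2 - x1, y2 - y1
    if dx == 0 and dy == 0:
        return math.hypot(x - x1, y - y1)
    return abs(dy * (x - x1) - dx * (y - y1)) / math.hypot(dx, dy)


def rdp_indices(coords, epsilon):
    """Indices of the points kept by Ramer-Douglas-Peucker simplification."""
    if len(coords) < 3:
        return list(range(len(coords)))
    keep = {0, len(coords) - 1}
    stack = [(0, len(coords) - 1)]
    while stack:
        first, last = stack.pop()
        worst, worst_d = None, epsilon
        for i in range(first + 1, last):
            d = _distance_to_line(coords[i], coords[first], coords[last])
            if d > worst_d:
                worst, worst_d = i, d
        if worst is not None:
            # The farthest point is significant: keep it and refine both halves.
            keep.add(worst)
            stack.append((first, worst))
            stack.append((worst, last))
    return sorted(keep)


def simplified_feature(activity_id, coords, speeds, epsilon=0.0001):
    """Build a track Feature whose speeds array stays parallel to coordinates."""
    keep = rdp_indices(coords, epsilon)
    return {
        "type": "Feature",
        "geometry": {"type": "LineString", "coordinates": [coords[i] for i in keep]},
        "properties": {
            "id": activity_id,
            "speeds": [speeds[i] for i in keep],
            "simplification": "rdp",
            "rdp_epsilon": epsilon,
            "point_count_original": len(coords),
            "point_count_simplified": len(keep),
        },
    }
```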


Deduplication

Activities from different sources (e.g. a Strava export and a Karoo export) may represent the same real-world ride. Producers should detect and handle duplicates before writing the data store.

Exact duplicate

Two files with the same source_hash are byte-for-byte identical. Only one should be processed; the other is silently skipped.
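Computing source_hash and skipping exact duplicates needs only the standard library. The sketch below follows the `sha256:{hex}` format used in this document; the helper names and the "first file seen wins" policy are assumptions.

```python
import hashlib


def source_hash(raw):
    """Return 'sha256:{hex}' for the raw bytes of a source file."""
    return "sha256:" + hashlib.sha256(raw).hexdigest()


def drop_exact_duplicates(files):
    """Keep the first (name, hash) per distinct content; later copies are skipped.

    `files` is an iterable of (name, raw_bytes) pairs.
    """
    seen = set()
    kept = []
    for name, raw in files:
        digest = source_hash(raw)
        if digest in seen:
            continue  # byte-for-byte identical to a file already processed
        seen.add(digest)
        kept.append((name, digest))
    return kept
```

Hashing the raw bytes, rather than the parsed activity, means two exports of the same ride in different formats will not match here; those are handled by the near-duplicate rule.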

Near-duplicate (same ride, different source)

Two activities are considered near-duplicates if:

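  • their start times differ by less than 5 minutes: |a.started_at - b.started_at| < 5 minutes, and
  • their distances differ by less than 5% of the larger one: |a.distance_m - b.distance_m| / max(a.distance_m, b.distance_m) < 5%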

When a near-duplicate is detected:

  1. One is kept as the canonical record (priority: FIT > GPX > TCX, then prefer the source with more sensor channels).
  2. The duplicate is written with "duplicate_of": "{canonical_id}" and "privacy": "private" so it is excluded from feeds but remains auditable.
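The near-duplicate test and the two steps above can be sketched as follows. Several details are assumptions made for illustration: the file format is inferred from the extension of source_file, a "sensor channel" is counted as any timeseries array other than t with at least one non-null value, activities without a recorded distance are never matched, and two zero-distance activities count as matching. Timestamps are assumed to carry a numeric offset such as `+02:00`.

```python
from datetime import datetime, timedelta

FORMAT_PRIORITY = {"fit": 0, "gpx": 1, "tcx": 2}  # FIT > GPX > TCX


def is_near_duplicate(a, b):
    """Apply the 5-minute / 5% rule to two activity dicts."""
    gap = abs(datetime.fromisoformat(a["started_at"]) - datetime.fromisoformat(b["started_at"]))
    if gap >= timedelta(minutes=5):
        return False
    da, db = a.get("distance_m"), b.get("distance_m")
    if da is None or db is None:
        return False  # assumption: cannot compare without distances
    longest = max(da, db)
    if longest == 0:
        return True  # assumption: both distances are zero, so they match
    return abs(da - db) / longest < 0.05


def _format_rank(activity):
    # Assumption: the format is taken from the source_file extension.
    extension = (activity.get("source_file") or "").rsplit(".", 1)[-1].lower()
    return FORMAT_PRIORITY.get(extension, len(FORMAT_PRIORITY))


def _channel_count(activity):
    series = activity.get("timeseries") or {}
    return sum(
        1 for key, values in series.items()
        if key != "t" and values and any(v is not None for v in values)
    )


def mark_duplicate(a, b):
    """Pick the canonical record and flag the other one; returns (canonical, duplicate)."""
    canonical, duplicate = sorted((a, b), key=lambda x: (_format_rank(x), -_channel_count(x)))
    duplicate["duplicate_of"] = canonical["id"]
    duplicate["privacy"] = "private"  # excluded from feeds, still auditable
    canonical.setdefault("duplicate_of", None)
    return canonical, duplicate
```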

Deduplication metadata in detail record

{
  "source_hash": "sha256:e3b0c...",
  "duplicate_of": null
}

Field         Type         Description
source_hash   string|null  sha256:{hex} of original file bytes.
duplicate_of  string|null  ID of the canonical activity, if this is a duplicate.

Versioning

The bas_version field allows consumers to handle schema evolution. Consumers should:

  • Reject files with a major version higher than they support.
  • Accept and ignore unknown fields (forward compatibility).
  • Treat missing optional fields as null (backward compatibility).
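The three consumer rules above might look like this in practice. This is a sketch under assumptions: `SUPPORTED_MAJOR`, the function names, and the particular subset of optional summary fields listed are illustrative, and the version string is assumed to have the form "major.minor".

```python
SUPPORTED_MAJOR = 1

# Illustrative subset of the optional Activity Summary fields.
OPTIONAL_SUMMARY_FIELDS = (
    "sub_sport", "distance_m", "duration_s", "moving_time_s",
    "elevation_gain_m", "avg_speed_kmh", "avg_power_w",
    "detail_url", "track_url",
)


def check_bas_version(document):
    """Raise ValueError if the document's major version is newer than supported."""
    major = int(str(document["bas_version"]).split(".")[0])
    if major > SUPPORTED_MAJOR:
        raise ValueError(f"unsupported bas_version {document['bas_version']!r}")


def normalise_summary(raw):
    """Fill missing optional fields with None; unknown fields pass through untouched."""
    return {**{field: None for field in OPTIONAL_SUMMARY_FIELDS}, **raw}
```

A newer minor version such as "1.7" passes the check; any fields it adds are carried along but ignored by code that does not know them.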

Current version: 1.0


Changelog

Version  Date        Changes
1.0      2026-03-28  Initial release.