metrics: guard against corrupted time streams causing OOM
Some Strava originals store absolute Unix timestamps in the time stream where elapsed-second offsets are expected, which produces a t_max of ~1.6 billion. compute_mmp and compute_best_efforts both build dense 1 Hz arrays via range(t_min, t_max + 1); for a 1.6B-second span this allocates 44+ GB and the process is OOM-killed.

Add a sanity check to both functions: if t_max - t_min exceeds one week, treat the stream as corrupt and return early (None from compute_mmp, (None, None) from compute_best_efforts).

Root cause: old Strava activities (seen with a 1970-epoch start_date) whose time stream contains absolute Unix timestamps instead of elapsed seconds.
@@ -131,6 +131,10 @@ def compute_mmp(pts: list[DataPoint], started_at: datetime) -> Optional[list[lis
 
     t_min = min(sparse)
     t_max = max(sparse)
+    # Guard against corrupted time data (e.g. absolute Unix timestamps stored as
+    # elapsed offsets, which can make t_max astronomically large and OOM the process).
+    if t_max - t_min > 7 * 24 * 3600:  # > 1 week → corrupted stream
+        return None
     power_1hz: list[int] = [sparse.get(t, 0) for t in range(t_min, t_max + 1)]
 
     n = len(power_1hz)
@@ -190,6 +194,10 @@ def compute_best_efforts(
 
     t_min = min(sparse_speed)
     t_max = max(sparse_speed)
+    # Guard against corrupted time data (e.g. absolute Unix timestamps stored as
+    # elapsed offsets, which can make t_max astronomically large and OOM the process).
+    if t_max - t_min > 7 * 24 * 3600:  # > 1 week → corrupted stream
+        return None, None
     speed_1hz: list[float] = [sparse_speed.get(t, 0.0) for t in range(t_min, t_max + 1)]
     ele_1hz: list[Optional[float]] = [sparse_ele.get(t) for t in range(t_min, t_max + 1)]
 
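The span check added to both functions can be illustrated in isolation. The sketch below is not code from this repository: `densify_1hz` and `MAX_SPAN_S` are hypothetical names, and the empty-stream check is an extra assumption. It only shows how checking `t_max - t_min` before building the dense list avoids the huge allocation.

```python
from typing import Optional

# One week in seconds; the same threshold the commit uses.
MAX_SPAN_S = 7 * 24 * 3600


def densify_1hz(sparse: dict[int, int], max_span_s: int = MAX_SPAN_S) -> Optional[list[int]]:
    """Expand a sparse {second: value} map into a dense 1 Hz list.

    Hypothetical helper: returns None for an empty stream or when the
    time span is implausibly large, instead of allocating the list.
    """
    if not sparse:
        return None
    t_min = min(sparse)
    t_max = max(sparse)
    # Check the span BEFORE calling range(); the comprehension below
    # allocates one slot per second between t_min and t_max.
    if t_max - t_min > max_span_s:
        return None
    return [sparse.get(t, 0) for t in range(t_min, t_max + 1)]


healthy = {0: 200, 1: 210, 3: 190}
corrupt = {0: 200, 1_600_000_000: 210}  # absolute Unix timestamp mixed in

print(densify_1hz(healthy))  # [200, 210, 0, 190]
print(densify_1hz(corrupt))  # None
```

Because the check looks at the span rather than at the absolute value of t_max, a stream made up entirely of absolute timestamps covering a short ride would pass it; that case does not trigger the large allocation, since the dense list is sized by the span.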