Refactor: extract shared image upload utilities into bincio/shared/images.py

ALLOWED_IMAGE_TYPES, MAX_IMAGE_BYTES, and unique_image_name() were
duplicated identically in both the edit and serve servers. Centralising
them means a single change point for any future extension (e.g. adding
image/avif support).

Tests added in tests/test_shared_images.py cover no-collision, single
and chained collisions, no-suffix filenames, and constant values.
This commit is contained in:
Davide Scaini
2026-05-13 23:13:08 +02:00
parent cd97e4cc87
commit e61d05fc41
6 changed files with 579 additions and 28 deletions
+497
View File
@@ -0,0 +1,497 @@
# Refactoring Plan
Branch: `refactoring`
Approach: test-first — each step starts with tests that prove correctness of the current behaviour, then the refactor makes those tests pass against the new structure.
---
## Step 1 — Extract shared image utilities
### Problem
`_ALLOWED_IMAGE_TYPES`, `_MAX_IMAGE_BYTES`, and `_unique_image_name()` are defined identically in two files:
| File | Lines |
|---|---|
| `bincio/edit/server.py` | 4658 |
| `bincio/serve/server.py` | 337357 |
Any change (e.g. adding `image/avif`) must be made in both places.
### Target
New module: `bincio/shared/images.py`
```python
# bincio/shared/images.py
from pathlib import Path
ALLOWED_IMAGE_TYPES: frozenset[str] = frozenset({
"image/jpeg", "image/png", "image/webp", "image/gif"
})
MAX_IMAGE_BYTES: int = 10 * 1024 * 1024 # 10 MB
def unique_image_name(directory: Path, filename: str) -> str:
"""Return a filename that does not collide with existing files in directory."""
stem, suffix = Path(filename).stem, Path(filename).suffix
candidate = filename
counter = 1
while (directory / candidate).exists():
candidate = f"{stem}_{counter}{suffix}"
counter += 1
return candidate
```
### Test plan
**New file**: `tests/test_shared_images.py`
Write these tests first (they will fail until the module exists):
```python
# tests/test_shared_images.py
from pathlib import Path
import pytest
from bincio.shared.images import ALLOWED_IMAGE_TYPES, MAX_IMAGE_BYTES, unique_image_name
def test_constants():
assert "image/jpeg" in ALLOWED_IMAGE_TYPES
assert "image/png" in ALLOWED_IMAGE_TYPES
assert "image/webp" in ALLOWED_IMAGE_TYPES
assert "image/gif" in ALLOWED_IMAGE_TYPES
assert "image/avif" not in ALLOWED_IMAGE_TYPES # guard against accidental expansion
assert MAX_IMAGE_BYTES == 10 * 1024 * 1024
def test_unique_name_no_collision(tmp_path):
assert unique_image_name(tmp_path, "photo.jpg") == "photo.jpg"
def test_unique_name_single_collision(tmp_path):
(tmp_path / "photo.jpg").touch()
assert unique_image_name(tmp_path, "photo.jpg") == "photo_1.jpg"
def test_unique_name_multiple_collisions(tmp_path):
(tmp_path / "photo.jpg").touch()
(tmp_path / "photo_1.jpg").touch()
assert unique_image_name(tmp_path, "photo.jpg") == "photo_2.jpg"
def test_unique_name_no_suffix(tmp_path):
(tmp_path / "photo").touch()
assert unique_image_name(tmp_path, "photo") == "photo_1"
def test_unique_name_preserves_case(tmp_path):
assert unique_image_name(tmp_path, "MyPhoto.PNG") == "MyPhoto.PNG"
```
### Implementation steps
1. Create `bincio/shared/__init__.py` (empty).
2. Create `bincio/shared/images.py` with the public constants and function (no leading underscore).
3. In `bincio/edit/server.py`: replace lines 4658 with:
```python
from bincio.shared.images import ALLOWED_IMAGE_TYPES as _ALLOWED_IMAGE_TYPES
from bincio.shared.images import MAX_IMAGE_BYTES as _MAX_IMAGE_BYTES
from bincio.shared.images import unique_image_name as _unique_image_name
```
4. In `bincio/serve/server.py`: same replacement at lines 337357.
5. Run `pytest tests/test_shared_images.py` — all pass.
6. Run full test suite to confirm no regressions.
---
## Step 2 — Extract embedded HTML template from `edit/server.py`
### Problem
`bincio/edit/server.py` contains 285 lines of static HTML/CSS/JS as a Python string literal (`_HTML`, lines 63347). This:
- Makes the file 30% larger than it needs to be.
- Prevents syntax highlighting and linting of the HTML/JS.
- Cannot be tested in isolation (template substitution is done inline by the route handler).
### Target
New file: `bincio/edit/templates/edit.html`
The template already uses three placeholder tokens:
- `__SITE_URL__` — replaced with `site_url` at request time
- `__SPORT_OPTIONS__` — replaced with generated `<option>` tags
- `__STAT_CHECKBOXES__` — replaced with generated `<label>` tags
Extract a helper that loads and renders the template:
```python
# bincio/edit/server.py — replaces the _HTML string and inline render
from pathlib import Path as _Path
_TEMPLATE_PATH = _Path(__file__).parent / "templates" / "edit.html"
def _render_edit_html(activity_id: str) -> str:
template = _TEMPLATE_PATH.read_text(encoding="utf-8")
sport_options = "\n".join(
f'<option value="{s}">{s.title()}</option>' for s in SPORTS
)
stat_checkboxes = "\n".join(
f'<label class="check-item"><input type="checkbox" data-stat="{k}"> {v}</label>'
for k, v in STAT_PANELS.items()
)
return (
template
.replace("__SITE_URL__", site_url)
.replace("__SPORT_OPTIONS__", sport_options)
.replace("__STAT_CHECKBOXES__", stat_checkboxes)
)
```
### Test plan
**Extend** `tests/test_edit_server.py` with a new section:
```python
# tests/test_edit_server.py — new tests for template loading
import bincio.edit.server as edit_server
def test_edit_ui_returns_html(tmp_path):
"""GET /edit/<id> returns 200 with an HTML body containing the form."""
edit_server.data_dir = tmp_path
activities = tmp_path / "activities"
activities.mkdir()
(activities / "run-001.json").write_text('{"id":"run-001"}')
resp = CLIENT.get("/edit/run-001")
assert resp.status_code == 200
assert resp.headers["content-type"].startswith("text/html")
assert '<form id="form"' in resp.text
def test_edit_ui_injects_site_url(tmp_path):
"""Template placeholder __SITE_URL__ is replaced with the configured site_url."""
edit_server.data_dir = tmp_path
edit_server.site_url = "http://localhost:1234"
(tmp_path / "activities").mkdir(exist_ok=True)
(tmp_path / "activities" / "run-001.json").write_text('{"id":"run-001"}')
resp = CLIENT.get("/edit/run-001")
assert "http://localhost:1234" in resp.text
assert "__SITE_URL__" not in resp.text
def test_edit_ui_no_unresolved_placeholders(tmp_path):
"""No placeholder tokens remain after rendering."""
edit_server.data_dir = tmp_path
(tmp_path / "activities").mkdir(exist_ok=True)
(tmp_path / "activities" / "run-001.json").write_text('{"id":"run-001"}')
resp = CLIENT.get("/edit/run-001")
for token in ("__SITE_URL__", "__SPORT_OPTIONS__", "__STAT_CHECKBOXES__"):
assert token not in resp.text, f"Unresolved placeholder: {token}"
def test_edit_template_file_exists():
"""The template file is present on disk (guards against accidental deletion)."""
from pathlib import Path
template = Path(edit_server.__file__).parent / "templates" / "edit.html"
assert template.exists(), f"Template not found at {template}"
```
### Implementation steps
1. Create `bincio/edit/templates/` directory.
2. Move the HTML content of `_HTML` into `bincio/edit/templates/edit.html` verbatim (keep the `__PLACEHOLDER__` tokens as-is).
3. Delete the `_HTML = """..."""` string literal from `server.py` (lines 63347).
4. Add `_render_edit_html()` helper as shown above.
5. Update the `/edit/{activity_id}` route handler to call `_render_edit_html(activity_id)` instead of the inline `_HTML.replace(...)` chain.
6. Run `pytest tests/test_edit_server.py` — all pass (new + existing).
---
## Step 3 — Split `serve/server.py` into APIRouter modules
### Problem
`bincio/serve/server.py` is 3,230 lines containing ~60 routes across 10 logical domains, all shared dependencies, all Pydantic models, and all background task machinery. It cannot be meaningfully reviewed, tested in isolation, or understood at a glance.
### Target structure
```
bincio/serve/
├── server.py # ~150 lines: app factory, middleware, router registration, startup
├── deps.py # module-level globals + shared FastAPI dependency functions
├── models.py # all Pydantic request/response models
├── tasks.py # background workers: site-rebuild, rebuild-for-handle, jobs registry
├── db.py # unchanged
└── routers/
├── __init__.py
├── auth.py # /api/auth/*, /api/register, /api/invites
├── me.py # /api/me/*
├── admin.py # /api/admin/*
├── activities.py # /api/activity/*, /api/activities/*
├── uploads.py # /api/upload/*
├── segments.py # /api/segments/*
├── strava.py # /api/strava/*
├── garmin.py # /api/garmin/*
├── ideas.py # /api/ideas/*, /api/feedback
└── feed.py # /api/feed, /api/stats, /api/me (read-only), /api/wheel/*
```
### `deps.py` — shared state and dependency functions
All module-level globals that multiple routers need move here. The CLI sets them on this module before uvicorn starts.
```python
# bincio/serve/deps.py
from __future__ import annotations
import os
import re
from pathlib import Path
from typing import Optional
import sqlite3
from fastapi import Cookie, HTTPException
from bincio.serve.db import User, get_session, get_user
# ── Module-level state (set by CLI) ──────────────────────────────────────────
data_dir: Path | None = None
site_dir: Path | None = None
webroot: Path | None = None
strava_client_id: str = ""
strava_client_secret: str = ""
public_url: str = ""
dem_url: str = "https://api.open-elevation.com"
sync_secret: str = ""
garmin_key: bytes | None = None
_db: sqlite3.Connection | None = None
# ── Constants ─────────────────────────────────────────────────────────────────
VALID_HANDLE = re.compile(r'^[a-z0-9][a-z0-9_-]{0,29}$')
SESSION_COOKIE = "bincio_session"
COOKIE_MAX_AGE = 30 * 86400
SESSION_DOMAIN = os.environ.get("SESSION_DOMAIN") or None
# ── Dependency functions ───────────────────────────────────────────────────────
def get_data_dir() -> Path:
if data_dir is None:
raise HTTPException(500, "Server not configured")
return data_dir
def get_db() -> sqlite3.Connection:
global _db
if _db is None:
from bincio.serve.db import open_db
_db = open_db(get_data_dir())
return _db
def get_current_user(bincio_session: Optional[str] = Cookie(default=None)) -> User:
if not bincio_session:
raise HTTPException(401, "Not authenticated")
sess = get_session(get_db(), bincio_session)
if sess is None:
raise HTTPException(401, "Invalid or expired session")
user = get_user(get_db(), sess.handle)
if user is None or user.suspended:
raise HTTPException(401, "Account not found or suspended")
return user
def require_admin(bincio_session: Optional[str] = Cookie(default=None)) -> User:
user = get_current_user(bincio_session)
if not user.is_admin:
raise HTTPException(403, "Admin required")
return user
```
### `models.py` — Pydantic models
All `class *Request(BaseModel)` and `class *Response(BaseModel)` definitions move to `bincio/serve/models.py`. Routers import from there.
### `tasks.py` — background workers
Move `_site_rebuild_worker`, `_site_rebuild_event`, `_rebuild_for_handle`, `_active_jobs`, `_jobs_lock`, `_job_start`, `_job_update`, `_job_finish` to `bincio/serve/tasks.py`. These import from `deps.py` for `webroot` and `site_dir`.
### Router example — `routers/auth.py`
```python
# bincio/serve/routers/auth.py
from fastapi import APIRouter, Cookie, Depends, HTTPException, Request, Response
from bincio.serve import deps
from bincio.serve.models import LoginRequest, LoginResponse, ...
router = APIRouter()
@router.post("/api/auth/login", response_model=LoginResponse)
async def login(body: LoginRequest, request: Request, response: Response):
db = deps.get_db()
...
```
Main `server.py` becomes:
```python
# bincio/serve/server.py (~150 lines)
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from fastapi.middleware.gzip import GZipMiddleware
from bincio.serve.routers import auth, me, admin, activities, uploads, segments, strava, garmin, ideas, feed
app = FastAPI(title="BincioActivity Serve")
app.add_middleware(GZipMiddleware, minimum_size=1024)
app.add_middleware(CORSMiddleware, ...)
for router in [auth.router, me.router, admin.router, activities.router,
uploads.router, segments.router, strava.router, garmin.router,
ideas.router, feed.router]:
app.include_router(router)
@app.on_event("startup")
async def _on_startup() -> None: ...
```
### Test plan
**Philosophy**: write tests that call routes through the full app (via `TestClient`), so they remain valid before and after the split. Each router gets its own test file.
#### Before starting the split
Write `tests/serve/test_auth_router.py` (and equivalents for each router). These tests pass against the current monolith. After the split they must still pass — this is the regression guard.
```python
# tests/serve/test_auth_router.py
import pytest
from fastapi.testclient import TestClient
from bincio.serve.server import app
import bincio.serve.deps as deps
@pytest.fixture()
def client(tmp_path):
deps.data_dir = tmp_path
deps._db = None
return TestClient(app, raise_server_exceptions=False)
def test_login_missing_body(client):
r = client.post("/api/auth/login", json={})
assert r.status_code == 422
def test_login_wrong_password(client):
r = client.post("/api/auth/login", json={"handle": "nobody", "password": "x"})
assert r.status_code in (401, 404)
def test_logout_unauthenticated(client):
r = client.post("/api/auth/logout")
assert r.status_code == 401
def test_register_invite_required(client):
r = client.post("/api/register", json={"handle": "alice", "password": "pass1234", "invite": ""})
assert r.status_code in (400, 403)
```
Write similar test files for `me`, `admin`, `activities`, `uploads` before touching any production code.
#### After the split
Run the full test suite. No test should fail — the tests are route-level and do not import from the monolith's internals.
#### Additional tests enabled by the split
Once routers are isolated, add unit tests for dependency functions in `deps.py`:
```python
# tests/serve/test_deps.py
import pytest
from fastapi import HTTPException
import bincio.serve.deps as deps
def test_get_data_dir_raises_when_unset():
deps.data_dir = None
with pytest.raises(HTTPException) as exc:
deps.get_data_dir()
assert exc.value.status_code == 500
def test_get_current_user_raises_without_cookie(tmp_path):
deps.data_dir = tmp_path
deps._db = None
with pytest.raises(HTTPException) as exc:
deps.get_current_user(bincio_session=None)
assert exc.value.status_code == 401
```
### Implementation steps (do not start until all pre-split tests are green)
1. Create `bincio/serve/deps.py` — move globals, constants, `get_db`, `get_current_user`, `require_admin`, `get_data_dir`. Update `serve/cli.py` to set `deps.*` instead of `server.*`.
2. Create `bincio/serve/models.py` — move all Pydantic models. Update imports in `server.py`.
3. Create `bincio/serve/tasks.py` — move `_site_rebuild_worker`, `_site_rebuild_event`, `_rebuild_for_handle`, jobs registry. Import from `deps`.
4. Create `bincio/serve/routers/__init__.py` (empty).
5. Extract one router at a time in this order (least-coupled first):
- `feed.py` (read-only, minimal deps)
- `auth.py` (depends only on db, no user dir operations)
- `me.py` (depends on current user + user dir)
- `ideas.py`
- `segments.py`
- `strava.py`
- `garmin.py`
- `activities.py`
- `uploads.py`
- `admin.py` (most complex, depends on tasks)
6. After each router extraction: run the full test suite before proceeding.
7. Once all routers are extracted, reduce `server.py` to the app factory and middleware only.
---
## Step 4 — Narrow broad `except Exception:` catches
### Problem
`bincio/serve/server.py` has ~20 bare `except Exception:` or `except Exception as exc:` clauses. Most are in route handlers, not background threads. Broad catches hide bugs and prevent the right exception type from propagating to logs.
### Inventory (file: `bincio/serve/server.py`)
| Line | Context | Acceptable? | Proposed fix |
|---|---|---|---|
| 204 | Reading per-user strava creds JSON from disk | No | `(OSError, json.JSONDecodeError, KeyError, ValueError)` |
| 228 | Cleanup of orphaned `tmp*.zip` files at startup | Yes (best-effort) | Keep, add inline comment |
| 407 | Background site-rebuild thread top-level | Yes (last-resort guard) | Keep — already calls `log.exception()` |
| 443 | Background rebuild-for-handle thread | Yes (same) | Keep |
| 554 | Extracting preview coords from geojson | No | `(KeyError, TypeError, IndexError)` |
| 652 | Background BAS upload processing | Yes (background task) | Keep, ensure `log.exception()` is called |
| 694 | Background raw upload processing | Yes | Keep, ensure `log.exception()` |
| 753 | Subprocess result parsing in raw upload | No | `(subprocess.SubprocessError, OSError, json.JSONDecodeError)` |
| 767 | Same function, inner block | No | Same as above |
| 1287 | Reading activity JSON in admin diag endpoint | No | `(OSError, json.JSONDecodeError)` |
| 1346 | Parsing timeseries JSON in admin diag | No | `(OSError, json.JSONDecodeError, KeyError)` |
| 1354 | Parsing merged JSON in admin diag | No | Same |
| 1486 | Strava admin sync — per-user iteration | Yes (don't abort whole sync) | Keep, ensure `log.exception()` |
| 1512 | Strava admin sync — inner loop | No | `StravaError, requests.RequestException` |
| 1529 | Same | No | Same |
| 1567 | Garmin sync iteration | Yes (don't abort) | Keep, ensure `log.exception()` |
| 1678 | Deleting user directory | No | `OSError` |
| 1741 | Reading strava creds file in user endpoint | No | `(OSError, json.JSONDecodeError)` |
| 1769 | Writing strava creds file | No | `OSError` |
| 1782 | Same | No | `OSError` |
**Rule of thumb applied here**: background thread top-level handlers and "don't abort whole batch" iterators keep `except Exception` but must call `log.exception()`. All route-handler catches should be narrowed to the specific exception types that the called code can raise.
### Test plan
For each narrowed catch, write a targeted test that injects the specific failure and verifies the correct HTTP response code is returned (not a 500).
```python
# tests/serve/test_error_handling.py
def test_strava_creds_file_malformed_json(client, tmp_path):
"""Malformed strava_credentials.json is silently ignored; global creds used."""
user_dir = tmp_path / "alice"
user_dir.mkdir()
(user_dir / "strava_credentials.json").write_text("not json")
# Should not 500; should fall back to global (empty) creds
# Test via /api/strava/status once auth is in place
...
def test_activity_geojson_missing_geometry(client, tmp_path, authenticated_session):
"""Activity with malformed geojson does not crash the upload endpoint."""
...
```
### Implementation steps
1. Audit each line in the inventory table above.
2. For each "No" row: replace `except Exception` with the specific types listed.
3. For each "Yes — keep" row: verify `log.exception()` (not `log.error()` or bare `pass`) is called so the full traceback appears in logs.
4. Run full test suite + new `test_error_handling.py` tests.
---
## Progress tracker
| # | Step | Status |
|---|---|---|
| 1 | Extract shared image utilities → `bincio/shared/images.py` | Done |
| 2 | Extract HTML template → `bincio/edit/templates/edit.html` | Not started |
| 3 | Split `serve/server.py` into `deps.py` + `routers/*` | Not started |
| 4 | Narrow broad `except Exception:` catches | Not started |
> **Note on dependency pinning**: not included. `uv.lock` already pins every dependency (including transitives) to exact versions, which is strictly stronger than switching `>=` to `~=` in `pyproject.toml`. The lockfile is the right mechanism for this concern.