# CLAUDE.md

Guidance for Claude Code working in this repo.

## What this project is

A web app that turns a **handheld, high-framerate, low-resolution focus-sweep video** into a 3D model via **depth-from-focus**. As the camera's focal plane sweeps linearly through the scene, the frame in which each pixel is *sharpest* encodes that pixel's depth.

User-facing flow (six steps):
1. **Stabilize** — optical flow registers every frame to a common reference, removing handheld wobble so the stack is pixel-aligned.
2. **Focus masks** — dump aligned frames; per frame compute a local sharpness measure and threshold it to a mask of in-focus pixels.
3. **Trim** — user marks the front (near) and back (far) frames of the linear portion of the sweep with a dual-handle slider.
4. **Build geometry** — X/Y = pixel coords, Z = (sharpest-frame-index) × a user-scalable step. Each in-focus pixel becomes a vertex (PLY point cloud) and is also stitched into a triangulated height-field surface (STL).
5. **Render** — display the vertices in an interactive in-browser 3D viewer.
6. **Export** — download the result as STL or PLY.

The full approved implementation plan lives at
`/home/japhy/.claude/plans/wise-painting-curry.md` — read it for design rationale and build order.

## Stack & key decisions

- **Backend:** Python + FastAPI + OpenCV (`opencv-python-headless`) + NumPy. Heavy CV runs server-side.
- **Frontend:** Vite + React + TypeScript + three.js (`@react-three/fiber`, `@react-three/drei`).
- **Output:** emit **both** a PLY point cloud and a triangulated STL surface.
- **Stabilization registers to a fixed reference frame** (correct for focus stacking — static scene, only the focal plane moves), *not* trajectory smoothing.
- The 3D viewer shows the **point cloud** (step 5 "display the vertices"); STL is for export/print.
- Job-based backend: each upload gets a `job_id` and a working dir under `backend/data/jobs/<id>/`. Long steps (stabilize, generate) run in a background thread that updates a progress value the frontend polls.

## Intended layout

```
backend/
  requirements.txt
  app/
    main.py            # FastAPI app + routes
    jobs.py            # JobStore: per-job dirs, status/progress
    pipeline/
      video.py         # probe + read frames (OpenCV VideoCapture)
      stabilize.py     # optical-flow registration to a reference frame
      focus.py         # focus measure, per-frame masks, depth-from-focus
      geometry.py      # build PLY (points) + STL (triangulated heightfield)
  data/jobs/<id>/      # input.mp4, aligned/, output.ply, output.stl (gitignored)
frontend/              # Vite + React + TS
  src/
    api.ts             # backend client
    App.tsx            # stage machine: Upload → Stabilize → Trim → Result
    components/{DropZone,ProgressBar,TrimControls,Viewer}.tsx
```

## API surface (backend)

- `POST /api/jobs` — multipart video → `{job_id, meta}` (fps, frame count, w, h)
- `POST /api/jobs/{id}/stabilize` — start stabilization (background thread)
- `GET  /api/jobs/{id}/status` — `{status, progress, frame_count}` for polling
- `GET  /api/jobs/{id}/frames/{idx}` — aligned frame JPEG (filmstrip/preview)
- `GET  /api/jobs/{id}/focus/{idx}` — in-focus mask overlay (optional preview)
- `POST /api/jobs/{id}/generate` — body `{front, back, z_step, focus_threshold, downsample}` → `{vertex_count, bbox}`
- `GET  /api/jobs/{id}/points` — decimated Float32/JSON vertices for the viewer
- `GET  /api/jobs/{id}/download/{ply|stl}` — file download

Status flow: `uploaded → stabilizing → stabilized → generating → ready`.

## CV pipeline notes

- **Stabilize:** `goodFeaturesToTrack` on the reference (middle) frame → `calcOpticalFlowPyrLK` into each frame → `estimateAffinePartial2D` → `warpAffine` onto the reference grid. Fallback for low-texture frames: dense `DISOpticalFlow` → global translation.
- **Focus / depth:** per frame `lap = cv2.Laplacian(gray, CV_32F)`, local energy `cv2.boxFilter(lap*lap, ksize)`. Stream the trimmed `[front, back]` range keeping a per-pixel running **argmax** (`best_idx`) and **max** (`best_val`) — avoids holding the full H×W×N stack. `in_focus_mask = best_val > focus_threshold`; `depth = best_idx - front`.
- **Geometry:** PLY = one vertex per in-focus pixel `(x·xy_scale, y·xy_scale, depth·z_step)`, hand-rolled binary writer (no extra dep). STL = height-field; for each 2×2 cell whose 4 corners are all in-focus, emit 2 triangles with normals, hand-rolled binary writer. Holes where there's no valid depth are expected.

## Dev commands

Backend (port 8000):
```bash
cd backend
python3 -m venv venv && source venv/bin/activate
pip install -r requirements.txt
uvicorn app.main:app --reload --port 8000
```

Frontend (port 5173, proxies `/api` → :8000):
```bash
cd frontend
npm install
npm run dev
```

## Verification

- **Backend unit:** a script generates a synthetic focus-sweep clip (a textured band that is in-focus in a row that moves linearly per frame; rest Gaussian-blurred), runs the pipeline directly, and asserts `vertex_count > 0` and monotonic recovered depth across the band. Confirms depth-from-focus before any UI.
- **File validity:** check binary STL header/triangle-count consistency and PLY point count match.
- **End-to-end:** start both servers, drag a focus-sweep video onto the home page → watch stabilization progress → set front/back + z-step → Generate → orbit the point cloud → export STL and PLY. The `/run` skill can drive the app.

## Conventions

- Keep the CV in `backend/app/pipeline/` as pure functions operating on NumPy arrays / file paths, decoupled from FastAPI, so they're unit-testable without the server.
- No external mesh libs for I/O — PLY and STL writers are hand-rolled and binary.
- `backend/data/` is runtime working state — gitignore it.