Commit 782ef529 authored by Andrey Filippov

CLAUDE: initial import — DNN training/eval/export (migrated from imagej-elphel-internal/c5p_dnn)

L1 RawFCN + L2 ConvGRU(torus), synthetic data gen, training/eval, infer_server,
and export_torchscript.py (self-contained TorchScript for native LibTorch inference).
GPLv3 (Elphel norm); headers on all .py/.sh; LICENSE = GPLv3. runs/ checkpoints untracked.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
__pycache__/
*.pyc
runs/
*.venv/
export_venv/
*.tif
# C5P DNN front-end — design & decisions
CUAS real-time detector DNN front-end: an **all-convolutional (FCN)** per-pixel target estimator
that replaces the C5P matched-filter velocity bank. Trained on synthetic Gaussian-noise patches,
deployed inside `CuasDetectRT` (ImageJ) via ONNX Runtime (Java, CPU EP). Produces, per pixel, a
velocity posterior over an 11×11 grid (±1.25 px/frame, 0.25 step) + detection confidence `s` +
sub-pixel (dx,dy) offset.
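
The velocity grid and a softmax-centroid readout can be sketched in plain Python. This is an illustration under stated assumptions, not the deployed decoder: `VEL_RADIUS`, `VEL_DECIMATE`, `cell_velocity`, `softmax_centroid` and `gaussian_logits` are hypothetical names (the first two mirror the `vel_radius 5` / `vel_decimate 4` defaults), the `logits[iy][ix]` ordering is assumed, and the Gaussian-shaped logits are made up for the demonstration.

```python
import math

VEL_RADIUS = 5     # cells on each side of zero -> 2*5+1 = 11 cells per axis
VEL_DECIMATE = 4   # cells per px/frame -> 0.25 px/frame step, range +/-1.25
N_CELLS = 2 * VEL_RADIUS + 1

def cell_velocity(ix, iy):
    """Velocity (px/frame) represented by grid cell (ix, iy), indices 0..10."""
    return ((ix - VEL_RADIUS) / VEL_DECIMATE, (iy - VEL_RADIUS) / VEL_DECIMATE)

def softmax_centroid(logits):
    """Softmax over an 11x11 logit grid (logits[iy][ix]), then the mean velocity."""
    peak = max(max(row) for row in logits)          # subtract max for stability
    weights = [[math.exp(v - peak) for v in row] for row in logits]
    total = sum(sum(row) for row in weights)
    vx = vy = 0.0
    for iy in range(N_CELLS):
        for ix in range(N_CELLS):
            cvx, cvy = cell_velocity(ix, iy)
            vx += weights[iy][ix] * cvx
            vy += weights[iy][ix] * cvy
    return vx / total, vy / total

def gaussian_logits(vx0, vy0, sigma):
    """Illustrative logits: a Gaussian bump in velocity space centred on (vx0, vy0)."""
    grid = []
    for iy in range(N_CELLS):
        row = []
        for ix in range(N_CELLS):
            cvx, cvy = cell_velocity(ix, iy)
            row.append(-((cvx - vx0) ** 2 + (cvy - vy0) ** 2) / (2 * sigma ** 2))
        grid.append(row)
    return grid

sharp = softmax_centroid(gaussian_logits(1.0, 0.0, sigma=0.1))
broad = softmax_centroid(gaussian_logits(1.0, 0.0, sigma=1.0))
print("sharp posterior centroid:", sharp)
print("broad posterior centroid:", broad)
```

With the broad posterior the centroid lands well below the true 1.0 px/frame, because the grid cuts off the tail beyond 1.25 while the tail toward zero is kept. That is the same direction as the low-SNR gain loss discussed in decision 3.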
## Files
- `synth.py` — on-the-fly training patches. `generate_sample` (center / off-center / noise classes),
half-cosine bump targets, per-frame constant velocity sampled on a disk, SNR-swept.
- `model.py` — `RawFCN` (24×24×N → 1×1×124 via valid convs + 2 maxpools, slid densely as an FCN);
`fcn_loss` (det BCE + velocity soft-target CE + offset MSE); `vel_bias_loss` (batch-moment de-bias).
- `train.py` — training loop + ONNX export (dynamic H/W axes). The CLI defaults ARE the "weighted" recipe.
- `velocity_bias.py` — gain-vs-SNR diagnostic (predicted vs true velocity, fit per SNR bin).
- `gen_synth_cuas.py` — builds the full-frame `*-CUAS-SYNTHETIC-CUAS.tiff` velocity-reference grid
for in-ImageJ testing (radial layout, one velocity cell per target). Writes a `.groundtruth.json`.
- `compare_dnn_truth.py` — compares saved DNN output vs the real UAS_log.csv (offset/velocity/time).
- `baselines.py`, `export_onnx.py`, `make_testvec.py`, `viz_*.py`, `extract_B.py` — support/diagnostics.
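
`RawFCN` shrinks a patch to a single output pixel using only valid 3×3 convolutions and two max-pools. The size arithmetic can be checked with a small sketch. The helper `size_chain` and the layer strings are hypothetical; the 32-patch layer order follows the chain quoted in decision 9, while the 24-patch order is an assumed arrangement (one that reaches 1×1 with two pools on even sizes) and may differ from the actual `model.py`.

```python
def size_chain(patch, layers):
    """Spatial size after each layer.

    'c' = valid 3x3 convolution (size - 2), 'p' = 2x2 max-pool (size // 2).
    Pooling an odd size is rejected, since it would shift the patch centre.
    """
    sizes = [patch]
    for layer in layers:
        size = sizes[-1]
        if layer == "c":
            size -= 2
        elif layer == "p":
            if size % 2:
                raise ValueError(f"max-pool applied to odd size {size}")
            size //= 2
        else:
            raise ValueError(f"unknown layer code {layer!r}")
        sizes.append(size)
    return sizes

# 32-patch: 6 conv3 + 2 pool, as listed in decision 9.
print(size_chain(32, "ccpccpcc"))
# 24-patch: assumed layout with 5 conv3 + 2 pool.
print(size_chain(24, "ccpccpc"))
```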
## Key decisions & findings (2026-06-15)
1. **Runtime temporal depth N.** `CuasDnnInfer` reads N from the ONNX input shape `[B,N,H,W]` and
exposes `getNFrames()`; `CuasDetectRT` uses it instead of a hardcoded 8, so 8- and 9-frame models swap
purely by changing `curt_dnn_model`. (Committed: imagej-elphel `9d06cce7`.)
2. **Velocity training DOMAIN vs output grid.** `synth.py` confines training velocity to a *disk* of
radius `vmax_px`. It was 1.0, but the output grid runs to ±1.25 (radius-5 cells), so every cell with
|v|>1.0 was untrained → underestimate + asymmetric ghosts at the corners. Diagnosed via the synthetic
grid (the diagonal (1.0,1.0)=|v|1.414 cell got projected onto the trained boundary → argmax (0.75,0.75)).
Fix: train with **vmax_px ≈ 1.4** (covers the ±1.25 on-axis with margin; only super-physical diagonal
corners stay untrained). `velocity_bias.py` couldn't catch this — it samples the same disk.
3. **Velocity SCALE de-bias (`vel_bias_loss`).** Softmax-centroid velocity shrinks as confidence drops
(centroid of a noise-broadened, grid-bounded posterior regresses toward 0): gain 0.97 clean → 0.76 at
SNR=1. **The recurrent layer reduces variance, not bias** — so the bias must be removed in the DNN.
`vel_bias_loss`: per equal-population bin of a conditioning var, pooled least-squares gain-through-origin
of predicted vs true velocity, penalize `(gain−1)²`. This pins the MEAN scale to 1 per bin WITHOUT
penalizing per-sample variance (variance is information-limited; left for the recurrent to average).
Result: gain → ~1.0 across SNR (RMSE preserved at low SNR — the intended signature).
   - `--bias_by snr` (default): clean training label; the correction is baked into the weights, so the
     conditioning variable need not exist at inference. **Best for the in-loss term.**
   - `--bias_by s` (confidence): cause-agnostic. It corrects on the net's own uncertainty regardless of WHY
     it's low (Gaussian noise vs clutter vs close targets), so it may transfer to non-Gaussian degraders on
     REAL data. Near-identical to snr on Gaussian noise; the real-data A/B (tile-streak, two close targets) is the
     only discriminator. If it doesn't transfer, augment synth with those factors.
4. **Half-pixel registration.** Training referenced the patch center at `(W−1)/2 = 11.5` (even patch),
but deployment (`CuasDnnInfer.inferROI`) puts the ROI/output pixel at index `half = P/2 = 12`
(`ix = cx + i − half`, patch spans `[cx−12, cx+11]`). The 0.5 gap was a systematic ½-px position bias.
Fix: set `cx0 = cy0 = P/2` in `generate_sample` (matches the deployment reference; even patch kept).
The 24-patch asymmetry (12 left / 11 right) is exactly what deployment already imposes, so training now
matches it — no odd-25-patch architecture change needed.
5. **Temporal sync (exact).** The DNN output is anchored at the **newest** frame of its N-frame window
(`window[0] = framesD[newest]`, training labels frame i=0 = newest). The output is tagged
`ts[newest] + " f"+newest`, so `f<n>` IS the motion frame (= level-slice index). No ±-frame slack.
6. **Radial synthetic grid math.** `gen_synth_cuas.py` radial layout: cell (vx,vy) node = `center + 30·(vx,vy)`,
velocity `(vx/4, vy/4)` px/frame, `pos = node + v·t`. So the **effective grid spacing = 30 + t/4 px**
(breathes outward 0.25 px/frame). Clean integer-pixel grids occur at `t ≡ 0 (mod 4)`: t=8→32px, 12→33px, …
Between them, sub-pixel offsets fan out per velocity. center = (320.5, 256.5) on a 640×512 frame.
7. **ONNX deploy convention.** torch exports external-data ONNX (`model.onnx` + `model.onnx.data`); the
`.onnx` references the sidecar by its export-time name, so the pair **cannot be renamed flat** (renaming
loads the wrong sidecar → size-mismatch error). Deploy each model in **its own subdir** with the canonical
`model.onnx` / `model.onnx.data`. Cache: `~/.cache/c5p_dnn/<name>/model.onnx`.
8. **Ghostbuster (untrained-corner velocities).** The velocity grid (radius `vel_radius`=5 cells,
corners to R≈7) extends past the trained disk (R = `vmax_px`·`vel_decimate` ≈ 5.6 at vmax 1.4), so
the untrained corner cells emit spurious velocity sidelobes (ghosts) of non-trivial strength (field
value up to ~0.09 vs ~0.15 real) that would confuse the recurrent. `CuasDetectRT.dnnGhostbust` zeros
velocity cells with cell-radius > `curt_dnn_vmax·vel_decimate`; if a pixel's PEAK lands in that region
the whole detection is discarded (field=0, **s=0**). Applied to the DNN field + offset before save and
the recurrent feed. **`curt_dnn_vmax` (px/frame, default 1.4) must match the loaded model's training
`vmax_px`** (PM models = 1.4, `m9_base` = 1.0); too low over-masks trained cells, too high leaves ghosts.
(imagej-elphel `0bd16311`.)
9. **Larger attention area (32-patch) — kills trajectory-alias ghosts (T4).** Widened the receptive
field to a 32-patch: `RawFCN(patch=32)` = 6 conv3 + 2 pool (32→30→28→14→12→10→5→3→1; pools on even
sizes so the cx0=P/2 centering holds), ~119k params. Grows off-center suppression reach
`off_max = P/2-margin-1` from 9→13 px, covering the alias reach `vmax·(N-1) ≈ 11.2 px`. On the real
synthetic grid the ghost field dropped **0.16 → ~0.003** (~50×, essentially gone). NB: the single-
static `ghost_probe.py` did NOT reproduce the ghost (both 24/32 suppress a lone static off-center
target) — the real alias needs the multi-target/conditioning context — but the wider RF fixed it
regardless. Java: `inferROI` patch is configurable via `CuasDnnInfer.setPatch`; new param
**`curt_dnn_patch`** (default 24; set 32 for this model) — MUST match the loaded model. Deployed
`~/.cache/c5p_dnn/m9_p32_s/model.onnx`. IMPORTANT: `curt_dnn_vmax` (ghostbuster) must be ≥ the max
real-target velocity you want to KEEP, not just the training vmax — e.g. (4,4)=|v|1.414 (cell-R 5.657)
needs vmax≥1.42, so use 1.5 (rmax 6.0) to keep (4,4) while still killing far corners (5,5)=7.07.
Training disk was 1.4, so (4,4) is marginally extrapolated; a retrain at vmax_px≈1.45–1.5 trains the
diagonal corners cleanly.
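
The radial-grid arithmetic from decision 6 can be reproduced in a few lines. This is a sketch of the stated formulas only, not the code of `gen_synth_cuas.py`; the names `CENTER`, `NODE_PITCH`, `target_pos` and `grid_spacing` are hypothetical.

```python
CENTER = (320.5, 256.5)   # grid centre on a 640x512 frame
NODE_PITCH = 30.0         # px between neighbouring velocity-cell nodes at t = 0
VEL_DECIMATE = 4          # cells per px/frame

def target_pos(vx_cell, vy_cell, t):
    """Position at frame t of the target assigned to velocity cell (vx_cell, vy_cell)."""
    node = (CENTER[0] + NODE_PITCH * vx_cell, CENTER[1] + NODE_PITCH * vy_cell)
    v = (vx_cell / VEL_DECIMATE, vy_cell / VEL_DECIMATE)   # px/frame
    return (node[0] + v[0] * t, node[1] + v[1] * t)

def grid_spacing(t):
    """Effective spacing between neighbouring targets at frame t."""
    return NODE_PITCH + t / VEL_DECIMATE

for t in (0, 8, 10, 12):
    dx = target_pos(1, 0, t)[0] - target_pos(0, 0, t)[0]
    print(f"t={t:2d}  spacing={grid_spacing(t):6.2f}  neighbour dx={dx:6.2f}")
```

The spacing is an integer number of pixels only when `t` is a multiple of 4; at `t=10` it is 32.5 px, so the sub-pixel offset of each target depends on its cell index.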
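
The ghostbuster rule from decisions 8 and 9 (zero every velocity cell whose cell-radius exceeds `curt_dnn_vmax·vel_decimate`, and discard the detection if its peak lies there) can be sketched as follows. This is a plain-Python illustration, not the Java `CuasDetectRT.dnnGhostbust`; the function names, the dict-based field layout and the example field values are assumptions.

```python
import math

VEL_RADIUS = 5     # grid spans cell offsets -5..+5 on each axis
VEL_DECIMATE = 4   # cells per px/frame

def kept(cx, cy, vmax):
    """True if the cell at integer offset (cx, cy) from the grid centre survives the mask."""
    return math.hypot(cx, cy) <= vmax * VEL_DECIMATE

def ghostbust(field, s, vmax):
    """field: {(cx, cy): value} over the 11x11 grid. Returns (masked field, s)."""
    peak = max(field, key=field.get)
    if not kept(*peak, vmax):
        # The peak sits in the masked region: discard the whole detection.
        return {cell: 0.0 for cell in field}, 0.0
    return {cell: (v if kept(*cell, vmax) else 0.0) for cell, v in field.items()}, s

cells = [(cx, cy) for cx in range(-VEL_RADIUS, VEL_RADIUS + 1)
                  for cy in range(-VEL_RADIUS, VEL_RADIUS + 1)]
field = {cell: 0.0 for cell in cells}
field[(4, 4)] = 0.15   # real target at (1.0, 1.0) px/frame, cell-radius 5.657
field[(5, 5)] = 0.09   # corner ghost, cell-radius 7.07

for vmax in (1.4, 1.5):
    masked, s = ghostbust(field, s=0.8, vmax=vmax)
    print(vmax, "s =", s, "(4,4) =", masked[(4, 4)], "(5,5) =", masked[(5, 5)])
```

At `vmax=1.4` the (4,4) peak falls just outside the mask radius of 5.6 and the whole detection is dropped, while at `vmax=1.5` (radius 6.0) it is kept and only the (5,5) ghost is zeroed. This is the reason decision 9 recommends 1.5 when (4,4) targets must survive.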
## Training recipe (DGX GB10, NGC pytorch:25.10-py3, ~35 it/s, ~3 min / 6000 steps)
```
python train.py --steps 6000 --nframes 9 --vmax 1.4 --w_bias 10 [--bias_by s] --out runs/<name>
```
Defaults (frac_pos 0.4, frac_off 0.4, w_vel 1.0, w_off 0.3, snr 1 8, patch 24, vel_radius 5, vel_decimate 4,
sigma_v 0.9) = the weighted recipe. Sync scripts to `elphel@192.168.0.62:~/c5p_dnn/`, run in the container.
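
The `--w_bias` term weights `vel_bias_loss` (decision 3). Its arithmetic, a gain-through-origin fit per equal-population bin penalized by `(gain−1)²`, can be sketched in plain Python. This is not the training code, which would need differentiable tensor operations. The function names are hypothetical, averaging the penalty over bins and pooling velocity components into one flat list are assumptions, and the shrinkage model `1 − 0.24/snr` is invented purely to produce a bias for the demonstration.

```python
import random

def gain_through_origin(pred, true):
    """Least-squares slope g of pred = g * true (no intercept)."""
    num = sum(p * t for p, t in zip(pred, true))
    den = sum(t * t for t in true)
    return num / den

def vel_bias_penalty(pred, true, cond, n_bins=4):
    """Mean of (gain - 1)^2 over equal-population bins of the conditioning variable."""
    order = sorted(range(len(cond)), key=lambda i: cond[i])
    size = len(order) // n_bins
    penalties = []
    for b in range(n_bins):
        # The last bin absorbs any remainder.
        idx = order[b * size:(b + 1) * size] if b < n_bins - 1 else order[b * size:]
        gain = gain_through_origin([pred[i] for i in idx], [true[i] for i in idx])
        penalties.append((gain - 1.0) ** 2)
    return sum(penalties) / n_bins

random.seed(0)
true = [random.uniform(-1.4, 1.4) for _ in range(4000)]
snr = [random.uniform(1.0, 8.0) for _ in range(4000)]
shrunk = [t * (1.0 - 0.24 / s) for t, s in zip(true, snr)]   # scale bias, no noise
noisy = [t + random.gauss(0.0, 0.3) for t in true]           # noise, no scale bias

print("penalty, biased scale :", vel_bias_penalty(shrunk, true, snr))
print("penalty, unbiased noise:", vel_bias_penalty(noisy, true, snr))
```

The second case illustrates the point made in decision 3: zero-mean per-sample scatter leaves the pooled gain near 1, so it is barely penalized, whereas a systematic scale error is.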
## Deployed models (`~/.cache/c5p_dnn/`) — "perfect-match" set (half-pixel fix + vmax 1.4 + de-bias)
- `m9_pm/model.onnx` — 9-frame, snr-de-bias
- `m9_pm_s/model.onnx` — 9-frame, s-de-bias
(supersede `m9_dbias*`, which lacked the half-pixel fix; `m9_base` = 9-frame no-de-bias baseline.)
## Open items
- Real-data A/B: snr vs s de-bias on the tile-streak and two-close-targets cases (the s-transfer test).
- Connect the DNN field to the recurrent layer (weights, as-is vs splat); tune re-sharpen.
- (Done — ghostbuster, decision 8.) Corner ghost sidelobes masked at readout.
- Explore in-network recurrent (extra layers / memory) — cf. Andrey's predecessor 2-stage Siamese
tile-disparity CNN (spatial neighbour context; multi-scale 1×1/3×3/5×5 loss).
# C5P DNN v2 — Two-stage MF + learned Hough-vote (2026-06-18, Andrey + Claude)
Redesign agreed after the v1 findings: softmax competition (not 121-resolution) kills ghosts;
the ghostbuster is structurally backwards (clips near-limit reals, keeps under-read fast ghosts);
the trajectory-alias velocity ramp is **structure, not noise**; low-SNR needs **spatial** evidence
integration (the spatial analog of the temporal recurrent). This architecture turns the alias into
the detector.
## Core idea
A true target (head H at newest frame, velocity V, tail T = H − V·(N−1)) makes the per-pixel MF field
fire a **predictable pattern** around H: a neighbor pixel P reports (tail-anchored, exact for tail-only)
`V_P = V + (P − H)/(N−1)` — slope 1/(N−1), the ramp we measured. The invariant: every pixel of the
target back-projects to the **same tail**: `P − V_P·(N−1) = T`. So aliased neighbors don't scatter —
they all point at one place. Accumulating those (s-weighted) votes recovers the target from many weak
detections (~√N gain) — far more noise-resilient than one pixel's s.
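
The invariant can be verified numerically: substituting `V_P = V + (P − H)/(N−1)` into `P − V_P·(N−1)` gives `H − V·(N−1) = T` for every `P`. A minimal check, with hypothetical function names and example values chosen only for illustration:

```python
def neighbor_velocity(P, H, V, n_frames):
    """Tail-anchored velocity reported by pixel P for a target with head H, velocity V."""
    return tuple(v + (p - h) / (n_frames - 1) for p, h, v in zip(P, H, V))

def back_project(P, V_P, n_frames):
    """Tail position implied by pixel P and the velocity it reports."""
    return tuple(p - vp * (n_frames - 1) for p, vp in zip(P, V_P))

N = 9
H = (100.0, 50.0)    # head at the newest frame
V = (1.0, -0.5)      # px/frame
T = tuple(h - v * (N - 1) for h, v in zip(H, V))   # true tail

for dx, dy in [(0, 0), (1, 0), (-2, 3), (4, -4)]:
    P = (H[0] + dx, H[1] + dy)
    V_P = neighbor_velocity(P, H, V, N)
    print(P, V_P, back_project(P, V_P, N), "tail:", T)
```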
## Stage 1 — local matched filter (per-pixel, neighbor-agnostic)
- In: N-frame conditioned patch. Out per pixel: continuous **(Vx, Vy, s)** + K latent "secret-message" channels.
- **No velocity grid** — continuous V (the v1 grid-softmax's competition role moves to Stage 2). Resolves the
grid-vs-reg tension: continuous velocity here, competition there.
- **Velocity range ±2.5–3 from the start** (Andrey 2026-06-18: higher range is needed/beneficial; design wide now to
avoid a narrow→wide re-iteration). Continuous V → NO grid-resolution penalty (unlike v1's wider grid: decimate-2
step-0.5 gave centroid RMSE 0.06 vs 0.03 — here it's just a wider tanh bound). Wider per-DNN range ⇒ FEWER pyramid
levels to stack.
- RF ≈ trajectory reach `vmax·(N−1)` + bump. At vmax 3: **N=9 → reach 24 → patch ~52**; N=5 → reach 12 → patch ~28.
Lock N with the range (T5 coupling) — **lean N=9** (temporal depth = the low-SNR lever; the bigger RF is CNN-shared,
so amortized). Still drops v1's neighbor-suppression margin → smaller than a v1-style net at the same vmax.
- Training: **center-positives only** (report the trajectory through me) + noise negatives (det=0). **Drop the
off-center negatives** — we WANT the alias ramp, not suppression. Trained on causal MB (RT bias absorbed, per
the −0.36→+0.02 result). CNN-shared features (cheaper than literal per-pixel MF, like the 3d3 coarse layer).
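
The RF sizing above follows from simple arithmetic. In the sketch below, `bump_px=4` is an assumption chosen so the result matches the quoted "~52" and "~28"; the source only says "reach + bump". The function names are hypothetical.

```python
def trajectory_reach(vmax, n_frames):
    """Head-to-tail distance (px) at the velocity limit."""
    return vmax * (n_frames - 1)

def patch_estimate(vmax, n_frames, bump_px=4):
    """Rough patch size: the reach on both sides of the centre plus room for the bump."""
    return 2 * trajectory_reach(vmax, n_frames) + bump_px

for n_frames in (9, 5):
    print(n_frames, trajectory_reach(3, n_frames), patch_estimate(3, n_frames))
```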
## Stage 2 — learned Hough vote (spatial competition + aggregation)
- In: the Stage-1 (Vx,Vy,s)[+latent] field. Each pixel casts an s-weighted vote at `T_P = P − V_P·(N−1)`
through a **learned vote kernel**; accumulate → soft-argmax → per-target detection. Recover head = T + V·(N−1).
- **Physics as init + soft bound, not hard-coded:** seed toward the tail-anchored ramp, clamp the deviation to
the measured spread (the real all-frames MF spread is NARROWER than tail-only, and MB shifts it); let it learn
the rest.
- **Subsumes both** the v1 velocity-softmax competition AND the ghostbuster: a ghost has one voter → loses; a real
target gets coherent votes → wins. No output-velocity clipping (that was backwards).
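
A non-learned version of the vote makes the mechanism concrete. The sketch below replaces the learned vote kernel and soft-argmax with hard nearest-pixel binning and a plain argmax, so it shows only the accumulation principle, not the proposed Stage 2. The names, the 5×5 neighborhood, and the `s` values are illustrative assumptions.

```python
from collections import defaultdict

def tail_vote(px, py, vx, vy, n_frames):
    """Back-project one pixel's reported velocity to its implied tail position."""
    return px - vx * (n_frames - 1), py - vy * (n_frames - 1)

def accumulate_votes(field, n_frames):
    """field: iterable of (px, py, vx, vy, s). Returns {(tx, ty): summed s}."""
    acc = defaultdict(float)
    for px, py, vx, vy, s in field:
        tx, ty = tail_vote(px, py, vx, vy, n_frames)
        acc[(round(tx), round(ty))] += s   # hard binning stands in for the learned kernel
    return acc

N = 9
H, V = (100, 50), (1.0, -0.5)
field = []
for dx in range(-2, 3):
    for dy in range(-2, 3):
        # Tail-anchored alias ramp: V_P = V + (P - H)/(N - 1), weak confidence per pixel.
        field.append((H[0] + dx, H[1] + dy,
                      V[0] + dx / (N - 1), V[1] + dy / (N - 1), 0.2))
# A lone ghost that is stronger than any single real voter.
field.append((130, 70, 0.25, 0.25, 0.5))

acc = accumulate_votes(field, N)
winner = max(acc, key=acc.get)
print("winning tail:", winner, "weight:", acc[winner])
print("ghost tail weight:", acc[(128, 68)])
```

All 25 weak voters agree on the tail, so their summed weight exceeds that of the single stronger ghost, which has no supporting neighbors.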
## Loss schedule — dictated reference → end-to-end (the agreed methodology)
Structure lives in the ARCHITECTURE (differentiable Stage1→Stage2); "dictatorship" is just the aux-loss weight:
1. **Reference (hard):** deep supervision — aux loss pins Stage-1 (Vx,Vy,s) to the MF targets + Stage-2 final loss.
Interpretable, data-efficient, and the permanent **debugging substrate** (v1 was only debuggable because we could
read intermediates).
2. **Anneal:** lower the Stage-1 aux weight; let the K latent channels carry richer inter-stage features.
3. **End-to-end:** final loss only (+ small aux as regularizer). **Compare to the reference**; keep the gain if it
beats it, else the reference stands.
Training order: Stage 1 alone (freezable) → Stage 2 on its field → end-to-end.
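
One way to express the three phases is a single schedule for the Stage-1 aux-loss weight. The sketch below is hypothetical throughout: the linear anneal, the step counts, and the `w_ref` / `w_floor` values are placeholders, not agreed settings.

```python
def aux_weight(step, ref_steps, anneal_steps, w_ref=1.0, w_floor=0.05):
    """Stage-1 aux-loss weight: held, then linearly annealed, then a small floor."""
    if step < ref_steps:                       # 1. reference (hard supervision)
        return w_ref
    if step < ref_steps + anneal_steps:        # 2. anneal
        frac = (step - ref_steps) / anneal_steps
        return w_ref + frac * (w_floor - w_ref)
    return w_floor                             # 3. end-to-end, aux as a regularizer

def total_loss(final_loss, aux_loss, step, ref_steps, anneal_steps):
    """Stage-2 final loss plus the weighted Stage-1 aux loss."""
    return final_loss + aux_weight(step, ref_steps, anneal_steps) * aux_loss

for step in (0, 100, 150, 200, 500):
    print(step, aux_weight(step, ref_steps=100, anneal_steps=100))
```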
## Pairs with the recurrent (T3)
Stage-2 vote = spatial weak-evidence integration; recurrent = temporal. Same principle; eventually one framework.
## To decide at implementation
N and RF size; K latent channels; vote-accumulator sub-pixel resolution; whether Stage 2 is conv/attention/explicit
scatter-add; how the head-recovery (T + V·(N−1)) feeds the output/recurrent. Build the reference first, measure
against v1 (m9_p32_grid121_cblur etc.) on the same synthetic harness.
GNU GENERAL PUBLIC LICENSE
Version 3, 29 June 2007
Copyright (C) 2007 Free Software Foundation, Inc. <http://fsf.org/>
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
Preamble
The GNU General Public License is a free, copyleft license for
software and other kinds of works.
The licenses for most software and other practical works are designed
to take away your freedom to share and change the works. By contrast,
the GNU General Public License is intended to guarantee your freedom to
share and change all versions of a program--to make sure it remains free
software for all its users. We, the Free Software Foundation, use the
GNU General Public License for most of our software; it applies also to
any other work released this way by its authors. You can apply it to
your programs, too.
When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
them if you wish), that you receive source code or can get it if you
want it, that you can change the software or use pieces of it in new
free programs, and that you know you can do these things.
To protect your rights, we need to prevent others from denying you
these rights or asking you to surrender the rights. Therefore, you have
certain responsibilities if you distribute copies of the software, or if
you modify it: responsibilities to respect the freedom of others.
For example, if you distribute copies of such a program, whether
gratis or for a fee, you must pass on to the recipients the same
freedoms that you received. You must make sure that they, too, receive
or can get the source code. And you must show them these terms so they
know their rights.
Developers that use the GNU GPL protect your rights with two steps:
(1) assert copyright on the software, and (2) offer you this License
giving you legal permission to copy, distribute and/or modify it.
For the developers' and authors' protection, the GPL clearly explains
that there is no warranty for this free software. For both users' and
authors' sake, the GPL requires that modified versions be marked as
changed, so that their problems will not be attributed erroneously to
authors of previous versions.
Some devices are designed to deny users access to install or run
modified versions of the software inside them, although the manufacturer
can do so. This is fundamentally incompatible with the aim of
protecting users' freedom to change the software. The systematic
pattern of such abuse occurs in the area of products for individuals to
use, which is precisely where it is most unacceptable. Therefore, we
have designed this version of the GPL to prohibit the practice for those
products. If such problems arise substantially in other domains, we
stand ready to extend this provision to those domains in future versions
of the GPL, as needed to protect the freedom of users.
Finally, every program is threatened constantly by software patents.
States should not allow patents to restrict development and use of
software on general-purpose computers, but in those that do, we wish to
avoid the special danger that patents applied to a free program could
make it effectively proprietary. To prevent this, the GPL assures that
patents cannot be used to render the program non-free.
The precise terms and conditions for copying, distribution and
modification follow.
TERMS AND CONDITIONS
0. Definitions.
"This License" refers to version 3 of the GNU General Public License.
"Copyright" also means copyright-like laws that apply to other kinds of
works, such as semiconductor masks.
"The Program" refers to any copyrightable work licensed under this
License. Each licensee is addressed as "you". "Licensees" and
"recipients" may be individuals or organizations.
To "modify" a work means to copy from or adapt all or part of the work
in a fashion requiring copyright permission, other than the making of an
exact copy. The resulting work is called a "modified version" of the
earlier work or a work "based on" the earlier work.
A "covered work" means either the unmodified Program or a work based
on the Program.
To "propagate" a work means to do anything with it that, without
permission, would make you directly or secondarily liable for
infringement under applicable copyright law, except executing it on a
computer or modifying a private copy. Propagation includes copying,
distribution (with or without modification), making available to the
public, and in some countries other activities as well.
To "convey" a work means any kind of propagation that enables other
parties to make or receive copies. Mere interaction with a user through
a computer network, with no transfer of a copy, is not conveying.
An interactive user interface displays "Appropriate Legal Notices"
to the extent that it includes a convenient and prominently visible
feature that (1) displays an appropriate copyright notice, and (2)
tells the user that there is no warranty for the work (except to the
extent that warranties are provided), that licensees may convey the
work under this License, and how to view a copy of this License. If
the interface presents a list of user commands or options, such as a
menu, a prominent item in the list meets this criterion.
1. Source Code.
The "source code" for a work means the preferred form of the work
for making modifications to it. "Object code" means any non-source
form of a work.
A "Standard Interface" means an interface that either is an official
standard defined by a recognized standards body, or, in the case of
interfaces specified for a particular programming language, one that
is widely used among developers working in that language.
The "System Libraries" of an executable work include anything, other
than the work as a whole, that (a) is included in the normal form of
packaging a Major Component, but which is not part of that Major
Component, and (b) serves only to enable use of the work with that
Major Component, or to implement a Standard Interface for which an
implementation is available to the public in source code form. A
"Major Component", in this context, means a major essential component
(kernel, window system, and so on) of the specific operating system
(if any) on which the executable work runs, or a compiler used to
produce the work, or an object code interpreter used to run it.
The "Corresponding Source" for a work in object code form means all
the source code needed to generate, install, and (for an executable
work) run the object code and to modify the work, including scripts to
control those activities. However, it does not include the work's
System Libraries, or general-purpose tools or generally available free
programs which are used unmodified in performing those activities but
which are not part of the work. For example, Corresponding Source
includes interface definition files associated with source files for
the work, and the source code for shared libraries and dynamically
linked subprograms that the work is specifically designed to require,
such as by intimate data communication or control flow between those
subprograms and other parts of the work.
The Corresponding Source need not include anything that users
can regenerate automatically from other parts of the Corresponding
Source.
The Corresponding Source for a work in source code form is that
same work.
2. Basic Permissions.
All rights granted under this License are granted for the term of
copyright on the Program, and are irrevocable provided the stated
conditions are met. This License explicitly affirms your unlimited
permission to run the unmodified Program. The output from running a
covered work is covered by this License only if the output, given its
content, constitutes a covered work. This License acknowledges your
rights of fair use or other equivalent, as provided by copyright law.
You may make, run and propagate covered works that you do not
convey, without conditions so long as your license otherwise remains
in force. You may convey covered works to others for the sole purpose
of having them make modifications exclusively for you, or provide you
with facilities for running those works, provided that you comply with
the terms of this License in conveying all material for which you do
not control copyright. Those thus making or running the covered works
for you must do so exclusively on your behalf, under your direction
and control, on terms that prohibit them from making any copies of
your copyrighted material outside their relationship with you.
Conveying under any other circumstances is permitted solely under
the conditions stated below. Sublicensing is not allowed; section 10
makes it unnecessary.
3. Protecting Users' Legal Rights From Anti-Circumvention Law.
No covered work shall be deemed part of an effective technological
measure under any applicable law fulfilling obligations under article
11 of the WIPO copyright treaty adopted on 20 December 1996, or
similar laws prohibiting or restricting circumvention of such
measures.
When you convey a covered work, you waive any legal power to forbid
circumvention of technological measures to the extent such circumvention
is effected by exercising rights under this License with respect to
the covered work, and you disclaim any intention to limit operation or
modification of the work as a means of enforcing, against the work's
users, your or third parties' legal rights to forbid circumvention of
technological measures.
4. Conveying Verbatim Copies.
You may convey verbatim copies of the Program's source code as you
receive it, in any medium, provided that you conspicuously and
appropriately publish on each copy an appropriate copyright notice;
keep intact all notices stating that this License and any
non-permissive terms added in accord with section 7 apply to the code;
keep intact all notices of the absence of any warranty; and give all
recipients a copy of this License along with the Program.
You may charge any price or no price for each copy that you convey,
and you may offer support or warranty protection for a fee.
5. Conveying Modified Source Versions.
You may convey a work based on the Program, or the modifications to
produce it from the Program, in the form of source code under the
terms of section 4, provided that you also meet all of these conditions:
a) The work must carry prominent notices stating that you modified
it, and giving a relevant date.
b) The work must carry prominent notices stating that it is
released under this License and any conditions added under section
7. This requirement modifies the requirement in section 4 to
"keep intact all notices".
c) You must license the entire work, as a whole, under this
License to anyone who comes into possession of a copy. This
License will therefore apply, along with any applicable section 7
additional terms, to the whole of the work, and all its parts,
regardless of how they are packaged. This License gives no
permission to license the work in any other way, but it does not
invalidate such permission if you have separately received it.
d) If the work has interactive user interfaces, each must display
Appropriate Legal Notices; however, if the Program has interactive
interfaces that do not display Appropriate Legal Notices, your
work need not make them do so.
A compilation of a covered work with other separate and independent
works, which are not by their nature extensions of the covered work,
and which are not combined with it such as to form a larger program,
in or on a volume of a storage or distribution medium, is called an
"aggregate" if the compilation and its resulting copyright are not
used to limit the access or legal rights of the compilation's users
beyond what the individual works permit. Inclusion of a covered work
in an aggregate does not cause this License to apply to the other
parts of the aggregate.
6. Conveying Non-Source Forms.
You may convey a covered work in object code form under the terms
of sections 4 and 5, provided that you also convey the
machine-readable Corresponding Source under the terms of this License,
in one of these ways:
a) Convey the object code in, or embodied in, a physical product
(including a physical distribution medium), accompanied by the
Corresponding Source fixed on a durable physical medium
customarily used for software interchange.
b) Convey the object code in, or embodied in, a physical product
(including a physical distribution medium), accompanied by a
written offer, valid for at least three years and valid for as
long as you offer spare parts or customer support for that product
model, to give anyone who possesses the object code either (1) a
copy of the Corresponding Source for all the software in the
product that is covered by this License, on a durable physical
medium customarily used for software interchange, for a price no
more than your reasonable cost of physically performing this
conveying of source, or (2) access to copy the
Corresponding Source from a network server at no charge.
c) Convey individual copies of the object code with a copy of the
written offer to provide the Corresponding Source. This
alternative is allowed only occasionally and noncommercially, and
only if you received the object code with such an offer, in accord
with subsection 6b.
d) Convey the object code by offering access from a designated
place (gratis or for a charge), and offer equivalent access to the
Corresponding Source in the same way through the same place at no
further charge. You need not require recipients to copy the
Corresponding Source along with the object code. If the place to
copy the object code is a network server, the Corresponding Source
may be on a different server (operated by you or a third party)
that supports equivalent copying facilities, provided you maintain
clear directions next to the object code saying where to find the
Corresponding Source. Regardless of what server hosts the
Corresponding Source, you remain obligated to ensure that it is
available for as long as needed to satisfy these requirements.
e) Convey the object code using peer-to-peer transmission, provided
you inform other peers where the object code and Corresponding
Source of the work are being offered to the general public at no
charge under subsection 6d.
A separable portion of the object code, whose source code is excluded
from the Corresponding Source as a System Library, need not be
included in conveying the object code work.
A "User Product" is either (1) a "consumer product", which means any
tangible personal property which is normally used for personal, family,
or household purposes, or (2) anything designed or sold for incorporation
into a dwelling. In determining whether a product is a consumer product,
doubtful cases shall be resolved in favor of coverage. For a particular
product received by a particular user, "normally used" refers to a
typical or common use of that class of product, regardless of the status
of the particular user or of the way in which the particular user
actually uses, or expects or is expected to use, the product. A product
is a consumer product regardless of whether the product has substantial
commercial, industrial or non-consumer uses, unless such uses represent
the only significant mode of use of the product.
"Installation Information" for a User Product means any methods,
procedures, authorization keys, or other information required to install
and execute modified versions of a covered work in that User Product from
a modified version of its Corresponding Source. The information must
suffice to ensure that the continued functioning of the modified object
code is in no case prevented or interfered with solely because
modification has been made.
If you convey an object code work under this section in, or with, or
specifically for use in, a User Product, and the conveying occurs as
part of a transaction in which the right of possession and use of the
User Product is transferred to the recipient in perpetuity or for a
fixed term (regardless of how the transaction is characterized), the
Corresponding Source conveyed under this section must be accompanied
by the Installation Information. But this requirement does not apply
if neither you nor any third party retains the ability to install
modified object code on the User Product (for example, the work has
been installed in ROM).
The requirement to provide Installation Information does not include a
requirement to continue to provide support service, warranty, or updates
for a work that has been modified or installed by the recipient, or for
the User Product in which it has been modified or installed. Access to a
network may be denied when the modification itself materially and
adversely affects the operation of the network or violates the rules and
protocols for communication across the network.
Corresponding Source conveyed, and Installation Information provided,
in accord with this section must be in a format that is publicly
documented (and with an implementation available to the public in
source code form), and must require no special password or key for
unpacking, reading or copying.
7. Additional Terms.
"Additional permissions" are terms that supplement the terms of this
License by making exceptions from one or more of its conditions.
Additional permissions that are applicable to the entire Program shall
be treated as though they were included in this License, to the extent
that they are valid under applicable law. If additional permissions
apply only to part of the Program, that part may be used separately
under those permissions, but the entire Program remains governed by
this License without regard to the additional permissions.
When you convey a copy of a covered work, you may at your option
remove any additional permissions from that copy, or from any part of
it. (Additional permissions may be written to require their own
removal in certain cases when you modify the work.) You may place
additional permissions on material, added by you to a covered work,
for which you have or can give appropriate copyright permission.
Notwithstanding any other provision of this License, for material you
add to a covered work, you may (if authorized by the copyright holders of
that material) supplement the terms of this License with terms:
a) Disclaiming warranty or limiting liability differently from the
terms of sections 15 and 16 of this License; or
b) Requiring preservation of specified reasonable legal notices or
author attributions in that material or in the Appropriate Legal
Notices displayed by works containing it; or
c) Prohibiting misrepresentation of the origin of that material, or
requiring that modified versions of such material be marked in
reasonable ways as different from the original version; or
d) Limiting the use for publicity purposes of names of licensors or
authors of the material; or
e) Declining to grant rights under trademark law for use of some
trade names, trademarks, or service marks; or
f) Requiring indemnification of licensors and authors of that
material by anyone who conveys the material (or modified versions of
it) with contractual assumptions of liability to the recipient, for
any liability that these contractual assumptions directly impose on
those licensors and authors.
All other non-permissive additional terms are considered "further
restrictions" within the meaning of section 10. If the Program as you
received it, or any part of it, contains a notice stating that it is
governed by this License along with a term that is a further
restriction, you may remove that term. If a license document contains
a further restriction but permits relicensing or conveying under this
License, you may add to a covered work material governed by the terms
of that license document, provided that the further restriction does
not survive such relicensing or conveying.
If you add terms to a covered work in accord with this section, you
must place, in the relevant source files, a statement of the
additional terms that apply to those files, or a notice indicating
where to find the applicable terms.
Additional terms, permissive or non-permissive, may be stated in the
form of a separately written license, or stated as exceptions;
the above requirements apply either way.
8. Termination.
You may not propagate or modify a covered work except as expressly
provided under this License. Any attempt otherwise to propagate or
modify it is void, and will automatically terminate your rights under
this License (including any patent licenses granted under the third
paragraph of section 11).
However, if you cease all violation of this License, then your
license from a particular copyright holder is reinstated (a)
provisionally, unless and until the copyright holder explicitly and
finally terminates your license, and (b) permanently, if the copyright
holder fails to notify you of the violation by some reasonable means
prior to 60 days after the cessation.
Moreover, your license from a particular copyright holder is
reinstated permanently if the copyright holder notifies you of the
violation by some reasonable means, this is the first time you have
received notice of violation of this License (for any work) from that
copyright holder, and you cure the violation prior to 30 days after
your receipt of the notice.
Termination of your rights under this section does not terminate the
licenses of parties who have received copies or rights from you under
this License. If your rights have been terminated and not permanently
reinstated, you do not qualify to receive new licenses for the same
material under section 10.
9. Acceptance Not Required for Having Copies.
You are not required to accept this License in order to receive or
run a copy of the Program. Ancillary propagation of a covered work
occurring solely as a consequence of using peer-to-peer transmission
to receive a copy likewise does not require acceptance. However,
nothing other than this License grants you permission to propagate or
modify any covered work. These actions infringe copyright if you do
not accept this License. Therefore, by modifying or propagating a
covered work, you indicate your acceptance of this License to do so.
10. Automatic Licensing of Downstream Recipients.
Each time you convey a covered work, the recipient automatically
receives a license from the original licensors, to run, modify and
propagate that work, subject to this License. You are not responsible
for enforcing compliance by third parties with this License.
An "entity transaction" is a transaction transferring control of an
organization, or substantially all assets of one, or subdividing an
organization, or merging organizations. If propagation of a covered
work results from an entity transaction, each party to that
transaction who receives a copy of the work also receives whatever
licenses to the work the party's predecessor in interest had or could
give under the previous paragraph, plus a right to possession of the
Corresponding Source of the work from the predecessor in interest, if
the predecessor has it or can get it with reasonable efforts.
You may not impose any further restrictions on the exercise of the
rights granted or affirmed under this License. For example, you may
not impose a license fee, royalty, or other charge for exercise of
rights granted under this License, and you may not initiate litigation
(including a cross-claim or counterclaim in a lawsuit) alleging that
any patent claim is infringed by making, using, selling, offering for
sale, or importing the Program or any portion of it.
11. Patents.
A "contributor" is a copyright holder who authorizes use under this
License of the Program or a work on which the Program is based. The
work thus licensed is called the contributor's "contributor version".
A contributor's "essential patent claims" are all patent claims
owned or controlled by the contributor, whether already acquired or
hereafter acquired, that would be infringed by some manner, permitted
by this License, of making, using, or selling its contributor version,
but do not include claims that would be infringed only as a
consequence of further modification of the contributor version. For
purposes of this definition, "control" includes the right to grant
patent sublicenses in a manner consistent with the requirements of
this License.
Each contributor grants you a non-exclusive, worldwide, royalty-free
patent license under the contributor's essential patent claims, to
make, use, sell, offer for sale, import and otherwise run, modify and
propagate the contents of its contributor version.
In the following three paragraphs, a "patent license" is any express
agreement or commitment, however denominated, not to enforce a patent
(such as an express permission to practice a patent or covenant not to
sue for patent infringement). To "grant" such a patent license to a
party means to make such an agreement or commitment not to enforce a
patent against the party.
If you convey a covered work, knowingly relying on a patent license,
and the Corresponding Source of the work is not available for anyone
to copy, free of charge and under the terms of this License, through a
publicly available network server or other readily accessible means,
then you must either (1) cause the Corresponding Source to be so
available, or (2) arrange to deprive yourself of the benefit of the
patent license for this particular work, or (3) arrange, in a manner
consistent with the requirements of this License, to extend the patent
license to downstream recipients. "Knowingly relying" means you have
actual knowledge that, but for the patent license, your conveying the
covered work in a country, or your recipient's use of the covered work
in a country, would infringe one or more identifiable patents in that
country that you have reason to believe are valid.
If, pursuant to or in connection with a single transaction or
arrangement, you convey, or propagate by procuring conveyance of, a
covered work, and grant a patent license to some of the parties
receiving the covered work authorizing them to use, propagate, modify
or convey a specific copy of the covered work, then the patent license
you grant is automatically extended to all recipients of the covered
work and works based on it.
A patent license is "discriminatory" if it does not include within
the scope of its coverage, prohibits the exercise of, or is
conditioned on the non-exercise of one or more of the rights that are
specifically granted under this License. You may not convey a covered
work if you are a party to an arrangement with a third party that is
in the business of distributing software, under which you make payment
to the third party based on the extent of your activity of conveying
the work, and under which the third party grants, to any of the
parties who would receive the covered work from you, a discriminatory
patent license (a) in connection with copies of the covered work
conveyed by you (or copies made from those copies), or (b) primarily
for and in connection with specific products or compilations that
contain the covered work, unless you entered into that arrangement,
or that patent license was granted, prior to 28 March 2007.
Nothing in this License shall be construed as excluding or limiting
any implied license or other defenses to infringement that may
otherwise be available to you under applicable patent law.
12. No Surrender of Others' Freedom.
If conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot convey a
covered work so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you may
not convey it at all. For example, if you agree to terms that obligate you
to collect a royalty for further conveying from those to whom you convey
the Program, the only way you could satisfy both those terms and this
License would be to refrain entirely from conveying the Program.
13. Use with the GNU Affero General Public License.
Notwithstanding any other provision of this License, you have
permission to link or combine any covered work with a work licensed
under version 3 of the GNU Affero General Public License into a single
combined work, and to convey the resulting work. The terms of this
License will continue to apply to the part which is the covered work,
but the special requirements of the GNU Affero General Public License,
section 13, concerning interaction through a network will apply to the
combination as such.
14. Revised Versions of this License.
The Free Software Foundation may publish revised and/or new versions of
the GNU General Public License from time to time. Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.
Each version is given a distinguishing version number. If the
Program specifies that a certain numbered version of the GNU General
Public License "or any later version" applies to it, you have the
option of following the terms and conditions either of that numbered
version or of any later version published by the Free Software
Foundation. If the Program does not specify a version number of the
GNU General Public License, you may choose any version ever published
by the Free Software Foundation.
If the Program specifies that a proxy can decide which future
versions of the GNU General Public License can be used, that proxy's
public statement of acceptance of a version permanently authorizes you
to choose that version for the Program.
Later license versions may give you additional or different
permissions. However, no additional obligations are imposed on any
author or copyright holder as a result of your choosing to follow a
later version.
15. Disclaimer of Warranty.
THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
16. Limitation of Liability.
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
SUCH DAMAGES.
17. Interpretation of Sections 15 and 16.
If the disclaimer of warranty and limitation of liability provided
above cannot be given local legal effect according to their terms,
reviewing courts shall apply local law that most closely approximates
an absolute waiver of all civil liability in connection with the
Program, unless a warranty or assumption of liability accompanies a
copy of the Program in return for a fee.
END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest
to attach them to the start of each source file to most effectively
state the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.
imagej-elphel
Copyright (C) 2017 Elphel
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
Also add information on how to contact you by electronic and paper mail.
If the program does terminal interaction, make it output a short
notice like this when it starts in an interactive mode:
imagej-elphel Copyright (C) 2017 Elphel
This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
This is free software, and you are welcome to redistribute it
under certain conditions; type `show c' for details.
The hypothetical commands `show w' and `show c' should show the appropriate
parts of the General Public License. Of course, your program's commands
might be different; for a GUI interface, you would use an "about box".
You should also get your employer (if you work as a programmer) or school,
if any, to sign a "copyright disclaimer" for the program, if necessary.
For more information on this, and how to apply and follow the GNU GPL, see
<http://www.gnu.org/licenses/>.
The GNU General Public License does not permit incorporating your program
into proprietary programs. If your program is a subroutine library, you
may consider it more useful to permit linking proprietary applications with
the library. If this is what you want to do, use the GNU Lesser General
Public License instead of this License. But first, please read
<http://www.gnu.org/philosophy/why-not-lgpl.html>.
# imagej_elphel_dnn
DNN companion to [imagej-elphel](https://git.elphel.com/Elphel/imagej-elphel) for tile-processor
motion detection / ranging. GPLv3.
## Layers
- **L1 — `RawFCN`** (`model.py`): fully-convolutional patch net `[B,N,P,P] -> [B,C,1,1]`, slid densely over
the full frame (no FC layers). Per-tile detection logit + velocity field. Trained on synthetic sequences.
- **L2 — `Layer2Net` / ConvGRU on a torus** (`layer2.py`): learned track-before-detect over the L1 field.
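To make the "slid densely" idea concrete, here is a minimal NumPy sketch. It assumes a hypothetical linear patch scorer (`patch_score`) in place of the real `RawFCN`, a single output channel, and made-up sizes. It only illustrates that evaluating a patch function at every valid position yields an `(H-P+1) x (W-P+1)` map, which is what a network without FC layers computes in one pass. Any pooling or striding in the real net would change the output stride and is not modelled here.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def patch_score(patch, weights):
    """Hypothetical stand-in for a patch net: [N,P,P] -> scalar."""
    return float((patch * weights).sum())


def dense_scores(frames, weights):
    """Apply the linear patch scorer at every valid position of an [N,H,W] stack."""
    P = weights.shape[-1]
    # windows has shape [N, H-P+1, W-P+1, P, P] (a view, no copy)
    windows = sliding_window_view(frames, (P, P), axis=(-2, -1))
    # contract over the frame axis and both patch axes -> [H-P+1, W-P+1]
    return np.einsum("nhwij,nij->hw", windows, weights)


rng = np.random.default_rng(0)
N, P, H, W = 4, 24, 40, 56            # illustrative sizes only
frames = rng.standard_normal((N, H, W))
weights = rng.standard_normal((N, P, P))

out = dense_scores(frames, weights)
# Reference: score a single patch cut out by hand at row 3, column 5.
ref = patch_score(frames[:, 3:3 + P, 5:5 + P], weights)
print(out.shape, np.isclose(out[3, 5], ref))
```

The dense map and the hand-cut patch agree at the same position, which is the property that lets a patch-trained network be applied to a full frame without retraining.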
## Workflow
- Training / eval / synthetic data generation: `train.py`, `layer2_train*.py`, `gen_synth_cuas.py`, `synth.py`, eval scripts.
- Inference (dev/remote): `infer_server.py` (+ `run_infer_server.sh`) — PyTorch server.
- **Deployment export:** `export_torchscript.py` produces a self-contained TorchScript `.pt` (weights + graph),
loaded natively (no Python) by LibTorch in [tile_processor_gpu](https://git.elphel.com/Elphel/tile_processor_gpu)
and bundled as a resource in imagej-elphel (`resources/cuas_dnn/`). Build once on a dev box (PyTorch);
deployment needs only the NVIDIA driver, the LibTorch runtime, and the bundled `.pt`.
Checkpoints (`runs/`) are training outputs and are not tracked.
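The export round trip can be sketched as follows. This is not the content of `export_torchscript.py`: `TinyFCN` is a hypothetical stand-in for the real model, and whether the actual script traces or scripts the network is not stated here (the sketch uses `torch.jit.trace`). It shows that one archive holds both graph and weights and that a traced all-convolutional net accepts a larger frame than the patch it was traced on.

```python
import io

import torch
import torch.nn as nn


class TinyFCN(nn.Module):
    """Hypothetical all-convolutional patch net (not the real RawFCN)."""

    def __init__(self, n_frames=4, n_out=3):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(n_frames, 8, kernel_size=3),  # valid conv: 5x5 -> 3x3
            nn.ReLU(),
            nn.Conv2d(8, n_out, kernel_size=3),     # valid conv: 3x3 -> 1x1
        )

    def forward(self, x):
        return self.body(x)


torch.manual_seed(0)
model = TinyFCN().eval()

# Trace on a single 5x5 patch; the recorded graph contains only convs and a ReLU.
traced = torch.jit.trace(model, torch.zeros(1, 4, 5, 5))

# Serialize graph + weights into one archive (a file path works the same way).
buf = io.BytesIO()
torch.jit.save(traced, buf)
buf.seek(0)

# Reload from the archive alone; the Python class is not needed by the loader.
loaded = torch.jit.load(buf, map_location="cpu")

frame = torch.randn(1, 4, 12, 16)      # larger than the traced patch
with torch.no_grad():
    y_ref = model(frame)
    y_loaded = loaded(frame)

print(tuple(y_loaded.shape), torch.allclose(y_ref, y_loaded, atol=1e-6))
```

On the C++ side, LibTorch reads the same kind of archive with `torch::jit::load`, which is why no Python is needed at deployment.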
# baselines.py - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
"""
Phase-correlation and matched-filter velocity baselines for the C5P DNN benchmark. # By Claude on 06/13/2026
Pure numpy (no torch) so they can be unit-tested anywhere. Both estimate (vx,vy) px/frame
from an [N,H,W] patch stack (frame 0 = newest; target near center). These are the bars the
network must match (PC) and beat (matched filter = our current linear front-end).
"""
import numpy as np
def _parabolic(c, l, r):
    """Sub-sample peak offset in [-0.5,0.5] from 3 samples: c (center, the max), l (left), r (right)."""
    d = (l - r)
    den = 2.0 * (l - 2.0 * c + r)
    return (d / den) if abs(den) > 1e-12 else 0.0
def pc_shift(a, b):
    """Phase correlation: estimate the shift (sx,sy) of `a` relative to `b` (sub-pixel).
    Positive sx means `a`'s content is at larger x than `b`'s."""
    H, W = a.shape
    Fa = np.fft.fft2(a); Fb = np.fft.fft2(b)
    R = Fa * np.conj(Fb)
    R /= (np.abs(R) + 1e-9)  # whiten -> phase correlation (fat zero eps)
    r = np.fft.ifft2(R).real
    py, px = np.unravel_index(np.argmax(r), r.shape)
    # sub-pixel parabolic on each axis (wrap neighbors)
    dx = _parabolic(r[py, px], r[py, (px - 1) % W], r[py, (px + 1) % W])
    dy = _parabolic(r[py, px], r[(py - 1) % H, px], r[(py + 1) % H, px])
    sx = (px + dx); sy = (py + dy)
    if sx > W / 2: sx -= W
    if sy > H / 2: sy -= H
    return sx, sy
def pc_velocity(frames):
    """NAIVE PC (kept for contrast, DO NOT use as the bar): per-pair whitening then spatial
    shift-averaging - amplifies noise, fails at low SNR. See pc_velocity_fd for the real one."""
    N = frames.shape[0]
    sx = sy = 0.0
    for i in range(N - 1):
        dx, dy = pc_shift(frames[i], frames[i + 1])
        sx += dx; sy += dy
    return sx / (N - 1), sy / (N - 1)
def pc_velocity_fd(frames, baseline=1, fatzero=1e-2):
    """Andrey's coherent PC: SUM the cross-power spectra of same-baseline pairs in the # By Claude on 06/13/2026
    frequency domain BEFORE normalization, normalize once (fat-zero regularized), ifft ->
    one correlation surface from all pairs jointly. For constant velocity every consecutive
    pair encodes the same per-frame shift, so they add coherently (~(N-1)x SNR) and a single
    whitening then localizes the peak. baseline>1 uses longer-baseline pairs (bigger, better-
    resolved displacement for slow targets - the 'increase pairs/baseline when speed is low').
    Returns (vx,vy) px/frame. (Masked iterative + tracking-camera refinement not included
    here - this is the core FD-combined estimate; refinement would push it further.)"""
    N, H, W = frames.shape
    F = np.fft.fft2(frames, axes=(-2, -1))  # [N,H,W] complex
    R = np.zeros((H, W), dtype=complex)
    for i in range(N - baseline):
        R += F[i] * np.conj(F[i + baseline])  # combine in FD BEFORE normalization
    R /= (np.abs(R) + fatzero * np.abs(R).max())  # single whitening (fat zero)
    r = np.fft.ifft2(R).real
    py, px = np.unravel_index(np.argmax(r), r.shape)
    dx = _parabolic(r[py, px], r[py, (px - 1) % W], r[py, (px + 1) % W])
    dy = _parabolic(r[py, px], r[(py - 1) % H, px], r[(py + 1) % H, px])
    sx = px + dx; sy = py + dy
    if sx > W / 2: sx -= W
    if sy > H / 2: sy -= H
    return sx / baseline, sy / baseline  # peak shift = v*baseline
def _bump(cx, cy, H, W, radial=False):
    ys = np.arange(H)[:, None] - cy; xs = np.arange(W)[None, :] - cx
    if radial:
        rr = np.sqrt(xs * xs + ys * ys)
        return np.where(rr < 1.5, np.cos(np.pi / 3.0 * rr), 0.0)
    bx = np.where(np.abs(xs) < 1.5, np.cos(np.pi / 3.0 * np.abs(xs)), 0.0)
    by = np.where(np.abs(ys) < 1.5, np.cos(np.pi / 3.0 * np.abs(ys)), 0.0)
    return bx * by
def mf_velocity(frames, vel_radius=5, vel_decimate=4, radial=False, parabolic=True):
    """Matched-filter velocity (= the C5P statistic at the patch center): correlate the
    stack with the swept bump for each velocity cell, argmax over the grid (+ parabolic
    sub-cell). Returns (vx,vy) px/frame and the peak response."""
    N, H, W = frames.shape
    cx0 = (W - 1) / 2.0; cy0 = (H - 1) / 2.0
    vdim = 2 * vel_radius + 1
    resp = np.empty((vdim, vdim), dtype=np.float64)
    for iy, vyc in enumerate(range(-vel_radius, vel_radius + 1)):
        for ix, vxc in enumerate(range(-vel_radius, vel_radius + 1)):
            vx = vxc / vel_decimate; vy = vyc / vel_decimate
            s = 0.0
            for i in range(N):
                t = _bump(cx0 - vx * i, cy0 - vy * i, H, W, radial)
                s += float((frames[i] * t).sum())
            resp[iy, ix] = s
    iy, ix = np.unravel_index(np.argmax(resp), resp.shape)
    vyc, vxc = iy - vel_radius, ix - vel_radius
    if parabolic and 0 < ix < vdim - 1 and 0 < iy < vdim - 1:
        vxc += _parabolic(resp[iy, ix], resp[iy, ix - 1], resp[iy, ix + 1])
        vyc += _parabolic(resp[iy, ix], resp[iy - 1, ix], resp[iy + 1, ix])
    return vxc / vel_decimate, vyc / vel_decimate, float(resp[iy, ix])
if __name__ == "__main__":
    import synth
    rng = np.random.default_rng(7)
    print("velRMSE px/fr (PCnaive = bad strawman, PCfd = Andrey's coherent FD-combined, MF = matched filter)")
    for snr in [2.0, 3.0, 5.0, 8.0, 100.0]:
        en = []; efd = []; efd4 = []; emf = []
        for _ in range(200):
            f, lab = synth.generate_sample(rng, snr=snr, target=True)
            vxn, vyn = pc_velocity(f)
            vxf, vyf = pc_velocity_fd(f, baseline=1)
            vxf4, vyf4 = pc_velocity_fd(f, baseline=4)  # longer baseline for slow targets
            vxm, vym, _ = mf_velocity(f)
            en.append(np.hypot(vxn - lab["vx"], vyn - lab["vy"]))
            efd.append(np.hypot(vxf - lab["vx"], vyf - lab["vy"]))
            efd4.append(np.hypot(vxf4 - lab["vx"], vyf4 - lab["vy"]))
            emf.append(np.hypot(vxm - lab["vx"], vym - lab["vy"]))
        rms = lambda e: np.sqrt(np.mean(np.square(e)))
        print(f"snr={snr:6.1f} PCnaive={rms(en):.4f} PCfd(b1)={rms(efd):.4f} "
              f"PCfd(b4)={rms(efd4):.4f} MF={rms(emf):.4f}")
# build_combo3.py - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
"""Build the standard combo3_hyper.tif from per-type eval tiffs. By Claude on 06/24/2026.
The locked L2 viewing standard (Andrey, 2026-06-24): ImageJ hyperstack axes TZCYX,
C=type / Z=frame / T=series, grayscale, each 32x32 frame 2x2-tiled to 64x64 (seam at
center cross), per-slice labels "TYPE sN fF". C=type keeps ImageJ's display range
per-channel so scrubbing frame(Z)/series(T) holds a common contrast.
Channel order matches gap_eval.py/clean_eval_fwhm.py outputs:
gap evals (5 types): L2_det, L1_s, input, truth, signal
clean/easy (3 types): L2_det, L1_s, truth
Reads whatever subset is present in --dir, in that fixed order.
Usage: build_combo3.py --dir runs/l1views/mhc_eval --T 128
build_combo3.py --dir runs/l1views/mhc_easy --T 120
"""
import argparse, os, numpy as np, tifffile
ORDER = ["L2_det", "L1_s", "input", "truth", "signal"] # fixed C order
def tile2x2(a):  # (...,32,32) -> (...,64,64), seam at center cross
    return np.tile(a, (1, 1, 2, 2)) if a.ndim == 4 else np.tile(a, (2, 2))
def main():
    ap = argparse.ArgumentParser()
    ap.add_argument("--dir", required=True)
    ap.add_argument("--T", type=int, default=128, help="frames per series")
    ap.add_argument("--out", default=None)
    a = ap.parse_args()
    # Default filename = folder basename, so open windows get distinct/descriptive ImageJ
    # titles (e.g. mhgb_train12_hyper.tif), not a generic combo3_hyper.tif. By Claude 06/24/2026.
    out = a.out or os.path.join(a.dir, os.path.basename(os.path.normpath(a.dir)) + "_hyper.tif")
    types = [t for t in ORDER if os.path.exists(os.path.join(a.dir, f"{t}.tif"))]
    if not types:
        raise SystemExit(f"no per-type tiffs found in {a.dir}")
    stacks = []
    for t in types:
        s = tifffile.imread(os.path.join(a.dir, f"{t}.tif")).astype(np.float32)  # (T*Z,32,32)
        nser = s.shape[0] // a.T
        s = s.reshape(nser, a.T, s.shape[1], s.shape[2])  # (series,frame,32,32)
        s = tile2x2(s)  # (series,frame,64,64)
        stacks.append(s)
    nser, nfr = stacks[0].shape[0], stacks[0].shape[1]
    C = len(types)
    vol = np.stack(stacks, axis=2)  # (T=series, Z=frame, C=type, Y, X)
    # ImageJ Labels: C fastest, then Z(frame), then T(series)
    labels = [f"{types[c]} s{t} f{z}" for t in range(nser) for z in range(nfr) for c in range(C)]
    tifffile.imwrite(out, vol, imagej=True,
                     metadata={"axes": "TZCYX", "mode": "grayscale", "Labels": labels})
    print(f"wrote {out} TZCYX={vol.shape} types={types} series={nser} frames={nfr}")
if __name__ == "__main__":
    main()
# clean_eval_fwhm.py - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
"""Clean FWHM eval: trained L2 vs L1, on NO-GAP data, bright frames only. By Claude 06/23/2026"""
import argparse, numpy as np, torch, synth, layer2_data as L1D
from layer2 import Layer2Net
def fwhm_at(field, cx, cy, G):
    # roll truth to center, measure half-max width of center row & col (sub-pixel, clean interp)
    f=np.roll(np.roll(field,int(round(G//2-cy)),0),int(round(G//2-cx)),1); c=G//2
    def w(line):
        pk=float(line[c])
        if pk<=1e-6: return np.nan
        h=pk/2
        # right crossing
        i=c
        while i<len(line)-1 and line[i]>=h: i+=1
        r=(i-1)+(line[i-1]-h)/(line[i-1]-line[i]) if line[i-1]>line[i] else float(i-1)
        # left
        i=c
        while i>0 and line[i]>=h: i-=1
        l=(i+1)-(line[i+1]-h)/(line[i+1]-line[i]) if line[i+1]>line[i] else float(i+1)
        return r-l
    return np.nanmean([w(f[c,:]),w(f[:,c])])
ap=argparse.ArgumentParser()
ap.add_argument("--l1",default="runs/weighted9_pm/model.pt")
ap.add_argument("--model",required=True); ap.add_argument("--out",required=True)
ap.add_argument("--T",type=int,default=120); ap.add_argument("--G",type=int,default=32)
a=ap.parse_args(); import os; os.makedirs(a.out,exist_ok=True)
dev="cuda" if torch.cuda.is_available() else "cpu"
net1,N,_=L1D._load_l1(a.l1,dev)
ck=torch.load(a.model,map_location=dev); ar=ck["args"]
net=Layer2Net(ch_in=3,ch_hidden=ar["ch"],grid=a.G,vmax=ar["vmax"]).to(dev)
net.load_state_dict(ck["model"]); net.eval()
rng=np.random.default_rng(777)
# CLEAN no-gap data (matches training)
frames,pos,vel,present=L1D.render_run(rng,T=a.T,G=a.G,vmax=ar["vmax"],snr=ar["snr"],gaps=False)
seq=L1D.gen_field_sequence(net1,frames,pos,a.G,N,dev)
with torch.no_grad():
    det,_=net(torch.from_numpy(seq[None]).to(dev))
    l2=torch.sigmoid(det)[0,:,0].cpu().numpy()
l1s=seq[:,0]
fl1,fl2,fp_away=[],[],[]
for t in range(a.T):
    if not present[t]: continue
    cx,cy=pos[t]; ci,cj=int(round(cy))%a.G,int(round(cx))%a.G
    if l1s[t,ci,cj]>0.5: fl1.append(fwhm_at(l1s[t],cx,cy,a.G)) # L1 bright frames
    if l2[t].max()>0.5:
        fl2.append(fwhm_at(l2[t],cx,cy,a.G))
        m=np.ones((a.G,a.G),bool); yy,xx=np.ogrid[:a.G,:a.G]
        m[(((xx-cj+a.G/2)%a.G-a.G/2)**2+((yy-ci+a.G/2)%a.G-a.G/2)**2)<=9]=False
        fp_away.append(float(l2[t][m].max()))
synth.save_tiff_stack(l1s,f"{a.out}/L1_s.tif"); synth.save_tiff_stack(l2,f"{a.out}/L2_det.tif")
tr=np.zeros((a.T,a.G,a.G),np.float32)
for t in range(a.T):
    if present[t]: tr[t]=L1D.halfcos_bump_torus(pos[t,0],pos[t,1],a.G)
synth.save_tiff_stack(tr,f"{a.out}/truth.tif")
print(f"=== CLEAN no-gap eval of {a.model} ===")
print(f"present {int(present.sum())}/{a.T} L1 locked {len(fl1)} L2 locked {len(fl2)}")
print(f"FWHM @ target (bright frames): L1 ~ {np.nanmean(fl1):.2f} px L2 ~ {np.nanmean(fl2):.2f} px")
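if fp_away: # np.max raises ValueError on an empty list (no L2-locked frames)
    print(f"L2 max-FP away-from-target (locked frames): mean {np.mean(fp_away):.2f} max {np.max(fp_away):.2f}")
else:
    print("L2 max-FP away-from-target (locked frames): n/a (no L2-locked frames)")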
#!/usr/bin/env python3
# compare_dnn_truth.py - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
"""
compare_dnn_truth.py - compare the DNN -DNN-OFFSET output against the eyeballed
linear ground-truth track of the real target (DJI Mini 4 Pro @ ~600 m).
Reads the ImageJ hyperstack *-DNN...-OFFSET.tiff (channels dx, dy, s over the
save ROI, one Z per scene), parses the per-scene timestamp labels, and for each
scene reports: max detection s, the sub-pixel detected position (px+dx, py+dy)
at the peak pixel, the interpolated truth position, and the position error.
This is a first-cut scaffold: it INTROSPECTS the TIFF first (shape, imagej
metadata, labels) and then parses with the assumptions marked # VERIFY below.
If the axis order / label format differ, flip the marked lines - the structure
is otherwise correct.
Deps: pip install tifffile numpy (matplotlib optional for --plot)
Usage: python3 compare_dnn_truth.py /path/to/dir_or_file [--plot] [--sthr 0.3]
"""
import sys, glob, re, argparse
import numpy as np
# ----- ground truth from the UAS flight log (smoothed + interpolated) + constant offset -----
# log cols: timestamp(full s), px, py, range(m). px,py UPDATE at ~5 fps (held between samples ->
# a staircase) while logged finer, so we SMOOTH (moving average) then interpolate to scene times.
# The drone curves (NOT constant velocity), hence the log rather than a linear fit. A constant
# calibration offset aligns log->image: image_truth(t) = smooth_log(t) + (OFFSET_X, OFFSET_Y).
OFFSET_X, OFFSET_Y = -4.0, +2.0 # log->image; refine (log(410.1,306.1) vs eyeball(406,308))
SMOOTH_WIN = 9 # moving-average window in log samples (~0.3 s) to de-staircase the 5 fps updates
ROI_X0, ROI_Y0, ROI_W, ROI_H = 395, 300, 70, 20
VEL_RADIUS = 5 # fan is (2*VR+1)^2 cells (curt_vel_radius)
VEL_STEP = 0.25 # px/level-frame per grid cell (= 1/curt_vel_decimate, VD4)
SEARCH_R = 3 # px window around the log truth: measure the DNN on the TARGET, not the y~312 clutter
_LOG = {'t': None, 'x': None, 'y': None}
def read_meta(path):
    """All <key>value</key> props from the ImageJ Info XML (TIFF ImageDescription): curt_* etc. By Claude on 06/16/2026"""
    import tifffile, xml.etree.ElementTree as ET
    info = None
    try:
        with tifffile.TiffFile(path) as tf:
            md = tf.imagej_metadata or {}
            info = md.get('Info')
            if info is None:
                try:
                    info = tf.pages[0].tags['ImageDescription'].value
                except Exception:
                    info = None
    except Exception:
        info = None
    props = {}
    if info:
        try:
            for ch in ET.fromstring(info):
                props[ch.tag] = ch.text
        except Exception:
            pass
    return props

def _smooth(v, w):
    """Edge-safe moving average (pad with edge values, no zero-bias at the ends)."""
    if w <= 1 or v.size < w:
        return v
    p = w // 2
    return np.convolve(np.pad(v, p, mode='edge'), np.ones(w) / w, mode='valid')[:v.size]

def load_log(path):
    import csv
    ts, xs, ys = [], [], []
    with open(path, newline='') as f:
        rd = csv.reader(f); next(rd)  # skip header
        for row in rd:
            if len(row) < 3:
                continue
            ts.append(float(row[0])); xs.append(float(row[1])); ys.append(float(row[2]))
    o = np.argsort(ts)
    t = np.array(ts)[o]
    _LOG['t'], _LOG['x'], _LOG['y'] = t, _smooth(np.array(xs)[o], SMOOTH_WIN), _smooth(np.array(ys)[o], SMOOTH_WIN)
    print(f" log: {len(t)} samples, t {t[0]:.3f}..{t[-1]:.3f}, smooth_win={SMOOTH_WIN}, offset=({OFFSET_X:+.1f},{OFFSET_Y:+.1f})")

def truth_xy(t, margin=0.3):
    """Smoothed-log truth (x, y) at full-second timestamp t, + constant offset. None outside log window."""
    if t is None or _LOG['t'] is None or t < _LOG['t'][0] - margin or t > _LOG['t'][-1] + margin:
        return None
    return (float(np.interp(t, _LOG['t'], _LOG['x'])) + OFFSET_X,
            float(np.interp(t, _LOG['t'], _LOG['y'])) + OFFSET_Y)
def parse_ts(label):
    """ '1773135468_500748-0 f3' -> 1773135468.500748 (FULL seconds + fractional part, as in the log)."""
    if label is None:
        return None
    m = re.search(r'(\d{6,})_(\d+)', str(label))  # search (labels are prefixed 'dx:'/'<vtitle>:'); ts seconds are long
    if not m:
        # no '_fraction' part: fall back to the bare integer seconds
        m2 = re.search(r'(\d{6,})', str(label))
        return float(m2.group(1)) if m2 else None
    sec = int(m.group(1))  # FULL seconds, matches the log's full timestamp
    frac = float('0.' + m.group(2))
    return sec + frac
def load_offset(path):
    import tifffile
    with tifffile.TiffFile(path) as tf:
        arr = tf.asarray()
        ij = tf.imagej_metadata or {}
    labels = ij.get('Labels')
    print(f" shape={arr.shape} dtype={arr.dtype} ij(channels={ij.get('channels')}, "
          f"slices={ij.get('slices')}, frames={ij.get('frames')})")
    if labels:
        print(f" first labels: {labels[:6]}")
    return arr, labels
def to_channels_scenes(arr):
    """Normalize the array to (C, n_scene, H, W): C=3 (dx,dy,s) local or C=5 remote. # VERIFY axis order."""
    a = np.asarray(arr)
    if a.ndim == 4:
        # find the channel axis: prefer size 3 (dx,dy,s), then size 5 (remote full-frame);
        # the other non-spatial axis is scenes
        lead = a.shape[:-2]
        axc = next((i for i, s in enumerate(lead) if s == 3), None)
        if axc is None:
            axc = next((i for i, s in enumerate(lead) if s == 5), 0)
        a = np.moveaxis(a, axc, 0)  # -> (C, scenes, H, W)
    elif a.ndim == 3:
        # (3*scenes, H, W) flattened: ImageJ interleaves... assume channel-major blocks # VERIFY
        n = a.shape[0]
        assert n % 3 == 0, f"page count {n} not divisible by 3"
        a = a.reshape(3, n // 3, a.shape[1], a.shape[2])
    else:
        raise ValueError(f"unexpected ndim {a.ndim}")
    return a  # (C, nsc, H, W); the caller maps channels to dx, dy, s
def load_hyper(path):
    """Read -DNN-...-HYPER-RECT -> velocity fan (nvel, nsc, H, W); drops the leading MAX-over-v slice."""
    import tifffile
    with tifffile.TiffFile(path) as tf:
        a = np.asarray(tf.asarray()); ij = tf.imagej_metadata or {}
    labels = ij.get('Labels')
    nvel = (2 * VEL_RADIUS + 1) ** 2
    axv = next((i for i, s in enumerate(a.shape) if s in (nvel, nvel + 1)), 0)  # VERIFY velocity axis
    a = np.moveaxis(a, axv, 0)
    if a.shape[0] == nvel + 1:
        a = a[1:]  # drop the leading MAX-over-v slice
    print(f" hyper: fan shape {a.shape} (nvel,nsc,H,W)")
    return a, labels

def fan_vel(f):
    """(argmax_vx, argmax_vy, centroid_vx, centroid_vy, s) from one pixel's fan, in px/level-frame."""
    f = np.clip(np.nan_to_num(np.asarray(f, float)), 0, None)
    tot = f.sum()
    if tot <= 0:
        return (np.nan, np.nan, np.nan, np.nan, 0.0)
    n = 2 * VEL_RADIUS + 1
    ix = np.arange(f.size)
    vx = (ix % n - VEL_RADIUS) * VEL_STEP   # column index -> vx
    vy = (ix // n - VEL_RADIUS) * VEL_STEP  # row index -> vy
    k = int(np.argmax(f))
    return (vx[k], vy[k], float((vx * f).sum() / tot), float((vy * f).sum() / tot), float(tot))

def log_speed(t, dt):
    """Log velocity (px per level-frame of dt seconds) at full-second t, from the smoothed-log slope."""
    if _LOG['t'] is None or t is None:
        return (np.nan, np.nan)
    gx = np.gradient(_LOG['x'], _LOG['t']); gy = np.gradient(_LOG['y'], _LOG['t'])  # px/s
    return (float(np.interp(t, _LOG['t'], gx) * dt), float(np.interp(t, _LOG['t'], gy) * dt))
def main():
    # ROI_* are reassigned below from metadata/filename, so declare them global; without this
    # they become function locals and the "using default" branch reads an unbound name.
    global ROI_X0, ROI_Y0, ROI_W, ROI_H
    import os
    ap = argparse.ArgumentParser()
    ap.add_argument('path', help='dir (uses newest *-DNN*-OFFSET.tiff) or a tiff file')
    ap.add_argument('--sthr', type=float, default=0.3, help='detection threshold on s')
    ap.add_argument('--plot', action='store_true')
    args = ap.parse_args()
    path = args.path
    if not path.lower().endswith(('.tif', '.tiff')):
        cands = sorted(glob.glob(f"{path}/*-DNN*OFFSET*.tif*"), key=os.path.getmtime)
        if not cands:
            sys.exit(f"no *-DNN*-OFFSET*.tiff in {path}")
        path = cands[-1]
    print(f"file: {path}")
    # Run config from image metadata (all curt_*), preferring it; else the -ROIx_y_w_h filename tag; else default. By Claude on 06/16/2026
    meta = read_meta(path)
    if meta:
        print("metadata: " + ", ".join(f"{k}={meta[k]}" for k in
              ('curt_dnn_model','curt_dnn_patch','curt_dnn_vmax','curt_save_select') if k in meta))
    nums = re.findall(r'-?\d+', meta.get('curt_save_select', '')) if meta else []
    if len(nums) >= 4:
        ROI_X0, ROI_Y0, ROI_W, ROI_H = map(int, nums[:4])
        print(f"ROI from metadata: {ROI_X0},{ROI_Y0},{ROI_W},{ROI_H}")
    else:
        mroi = re.search(r'ROI(\d+)_(\d+)_(\d+)_(\d+)', os.path.basename(path))
        if mroi:
            ROI_X0, ROI_Y0, ROI_W, ROI_H = map(int, mroi.groups())
            print(f"ROI from filename: {ROI_X0},{ROI_Y0},{ROI_W},{ROI_H}")
        else:
            print(f"ROI not in metadata/filename; using default {ROI_X0},{ROI_Y0},{ROI_W},{ROI_H}")
    logp = os.path.join(os.path.dirname(os.path.abspath(path)), 'UAS_log.csv')
    if not os.path.exists(logp):
        sys.exit(f"no UAS_log.csv next to {path}")
    load_log(logp)
    arr, labels = load_offset(path)
    chs = to_channels_scenes(arr)
    # remote (DGX) -OFFSET is full-frame (5 channels at 512x640); local is ROI-sized (3 channels).
    # Crop a full-frame array to the metadata ROI so the ROI-relative indexing below works
    # for both. By Claude on 06/20/2026
    if (chs.shape[-2] != ROI_H) or (chs.shape[-1] != ROI_W):
        chs = chs[:, :, ROI_Y0:ROI_Y0 + ROI_H, ROI_X0:ROI_X0 + ROI_W]
        print(f" cropped full-frame -OFFSET to ROI -> {chs.shape}")
    # channel order: 3-ch local -OFFSET = {dx,dy,s}; 5-ch remote -OFFSET = {s,Vx,Vy,dx,dy} (s-first). By Claude on 06/20/2026
    if chs.shape[0] >= 5:
        s, dx, dy = chs[0], chs[3], chs[4]  # each (nsc, H, W)
    else:
        dx, dy, s = chs[0], chs[1], chs[2]
    nsc = s.shape[0]
    ts = [parse_ts(labels[i]) if labels and i < len(labels) else None for i in range(nsc)]
    print(f"\nposition: peak s within +/-{SEARCH_R}px of the log truth (excludes distant clutter)")
    print(f"{'t':>14} {'truth(x,y)':>16} {'s@tgt':>7} {'det(x,y)':>16} {'err_px':>7}")
    errs = []; det_pts = []
    for i in range(nsc):
        t = ts[i]; gt = truth_xy(t)
        si = np.nan_to_num(s[i], nan=-1.0)
        if gt is not None:
            # search only a small window around the log truth
            gr = int(round(gt[1] - ROI_Y0)); gc = int(round(gt[0] - ROI_X0))
            r0, r1 = max(0, gr - SEARCH_R), min(si.shape[0], gr + SEARCH_R + 1)
            c0, c1 = max(0, gc - SEARCH_R), min(si.shape[1], gc + SEARCH_R + 1)
            if r1 <= r0 or c1 <= c0:
                continue
            lr, lc = np.unravel_index(int(np.argmax(si[r0:r1, c0:c1])), (r1 - r0, c1 - c0))
            r, c = r0 + lr, c0 + lc
        else:
            r, c = np.unravel_index(int(np.argmax(si)), si.shape)
        smax = float(s[i][r, c]) if np.isfinite(s[i][r, c]) else 0.0
        det_x = ROI_X0 + c + (dx[i, r, c] if np.isfinite(dx[i, r, c]) else 0.0)
        det_y = ROI_Y0 + r + (dy[i, r, c] if np.isfinite(dy[i, r, c]) else 0.0)
        if gt is not None:
            e = ((det_x - gt[0])**2 + (det_y - gt[1])**2) ** 0.5
            if smax >= args.sthr:
                errs.append(e); det_pts.append((t, det_x, det_y))
            print(f"{t:14.3f} ({gt[0]:6.2f},{gt[1]:6.2f}) {smax:7.3f} ({det_x:6.2f},{det_y:6.2f}) {e:7.2f}")
        else:
            print(f"{(t or float('nan')):14.3f} {'(out)':>16} {smax:7.3f}")
    if errs:
        errs = np.array(errs)
        print(f"\ndetected {len(errs)}/{nsc} scenes (s>= {args.sthr}); "
              f"pos err mean={errs.mean():.2f} median={np.median(errs):.2f} max={errs.max():.2f} px")
    # time-offset / trend check: is the position error a velocity-dependent trend (-> log<->image clock
    # offset, i.e. calibration) rather than random DNN error? Search dt that minimizes residual RMS.
    if len(det_pts) >= 4:
        tp = np.array([p[0] for p in det_pts]); dxp = np.array([p[1] for p in det_pts]); dyp = np.array([p[2] for p in det_pts])
        ex = np.array([dxp[k] - truth_xy(tp[k])[0] for k in range(len(tp))])
        ey = np.array([dyp[k] - truth_xy(tp[k])[1] for k in range(len(tp))])
        rms0 = float(np.sqrt(np.mean(ex**2 + ey**2)))
        sx = float(np.polyfit(tp, ex, 1)[0]); sy = float(np.polyfit(tp, ey, 1)[0])  # px/s trend
        best = (0.0, rms0)
        for dtc in np.arange(-0.5, 0.5001, 1.0/60):
            gs = [truth_xy(t + dtc) for t in tp]
            if any(g is None for g in gs):
                continue
            r = float(np.sqrt(np.mean([(dxp[k]-gs[k][0])**2 + (dyp[k]-gs[k][1])**2 for k in range(len(tp))])))
            if r < best[1]:
                best = (float(dtc), r)
        print(f"\ntime-offset/trend check ({len(tp)} scenes, s>= {args.sthr}):")
        print(f" err_x: mean={ex.mean():+.2f}px trend={sx:+.2f}px/s | err_y: mean={ey.mean():+.2f}px trend={sy:+.2f}px/s")
        print(f" RMS@dt=0={rms0:.2f}px ; best dt={best[0]:+.3f}s -> RMS={best[1]:.2f}px "
              f"({'time-offset likely' if best[1] < 0.7*rms0 else 'no strong time-offset -> mostly random/DNN'})")
    # --- velocity from the -DNN-...-HYPER-RECT fan (argmax + centroid) vs the log slope ---
    hpath = path.replace('OFFSET', 'HYPER-RECT')  # match the offset file's level + synth/real
    if os.path.exists(hpath):
        fan, hlabels = load_hyper(hpath)
        _, nschh, H, W = fan.shape  # fan = (nvel, nsc, H, W); scenes is axis 1. Fixed by Claude 06/20/2026
        hts = [parse_ts(hlabels[i]) if hlabels and i < len(hlabels) else None for i in range(nschh)]
        good = [t for t in hts if t is not None]
        dt = float(np.median(np.diff(good))) if len(good) > 1 else (8.0 / 60.0)
        print(f"\nvelocity from {os.path.basename(hpath)} (frame dt={dt:.4f}s; px/level-frame)")
        print(f"{'t':>14} {'argmax(vx,vy)':>16} {'centroid(vx,vy)':>18} {'log(vx,vy)':>16}")
        for i in range(nschh):
            smap = fan[:, i].reshape(fan.shape[0], H, W).sum(0)  # (H,W) detection map
            gt = truth_xy(hts[i])
            if gt is not None:
                gr = int(round(gt[1] - ROI_Y0)); gc = int(round(gt[0] - ROI_X0))
                r0, r1 = max(0, gr - SEARCH_R), min(H, gr + SEARCH_R + 1)
                c0, c1 = max(0, gc - SEARCH_R), min(W, gc + SEARCH_R + 1)
                if r1 <= r0 or c1 <= c0:
                    continue
                lr, lc = np.unravel_index(int(np.argmax(smap[r0:r1, c0:c1])), (r1 - r0, c1 - c0))
                rr, cc = r0 + lr, c0 + lc
            else:
                rr, cc = np.unravel_index(int(np.argmax(smap)), smap.shape)
            ax, ay, cx, cy, sval = fan_vel(fan[:, i, rr, cc])
            lvx, lvy = log_speed(hts[i], dt)
            tt = hts[i] if hts[i] else float('nan')
            print(f"{tt:14.3f} ({ax:+.2f},{ay:+.2f}) ({cx:+.2f},{cy:+.2f}) ({lvx:+.2f},{lvy:+.2f})")
    if args.plot:
        import matplotlib.pyplot as plt
        tv = np.array([t if t else np.nan for t in ts])
        sv = np.array([float(np.nanmax(s[i])) for i in range(nsc)])
        plt.figure(); plt.plot(tv, sv, 'o-'); plt.axhline(args.sthr, ls='--', c='r')
        plt.xlabel('t (s)'); plt.ylabel('max s'); plt.title('DNN visibility(t)'); plt.grid(True)
        plt.show()

if __name__ == '__main__':
    main()
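The time-offset check in `compare_dnn_truth.py` can be illustrated in isolation. The sketch below is not the script's code: it uses a hypothetical 1-D curved track (`truth_x`), synthetic detections that lag it by an assumed 0.1 s, and the same idea of scanning a grid of candidate `dt` values for the smallest residual RMS. All names and numbers here are made up for illustration.

```python
import numpy as np

def truth_x(t):
    """Hypothetical curved truth track (px) versus time (s); stands in for the smoothed log."""
    return 400.0 + 12.0 * t + 3.0 * t ** 2

def best_time_offset(t_det, x_det, truth, span=0.5, step=1.0 / 60):
    """Scan dt in [-span, span] and return (dt, rms) minimising RMS of x_det - truth(t_det + dt)."""
    best_dt = 0.0
    best_rms = float(np.sqrt(np.mean((x_det - truth(t_det)) ** 2)))
    for dt in np.arange(-span, span + 1e-9, step):
        rms = float(np.sqrt(np.mean((x_det - truth(t_det + dt)) ** 2)))
        if rms < best_rms:
            best_dt, best_rms = float(dt), rms
    return best_dt, best_rms

rng = np.random.default_rng(0)
t_det = np.linspace(0.0, 4.0, 30)          # detection times, s
true_lag = 0.1                             # assumed clock offset between log and image, s
x_det = truth_x(t_det + true_lag) + 0.05 * rng.standard_normal(t_det.size)

rms0 = float(np.sqrt(np.mean((x_det - truth_x(t_det)) ** 2)))
dt, rms = best_time_offset(t_det, x_det, truth_x)
print(f"RMS@dt=0 = {rms0:.2f} px ; best dt = {dt:+.3f} s -> RMS = {rms:.2f} px")
```

Because the target moves, a clock offset shows up as an error that grows with speed; shifting the truth in time removes it, while random detector error would not shrink under any `dt`.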
#!/usr/bin/env python3
# dense_check.py - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
"""Validate full-res shift-and-stitch == per-pixel patch inference. By Claude on 06/20/2026.
Run inside the NGC container (needs model.pt + model.py).
The stride-4 RawFCN, run densely on a whole frame, emits a 1/4-res grid. To recover the
full-res per-input-pixel field (what the Java CPU inferROI produces, CuasDnnInfer.java:111),
we run the dense net on the S*S=16 (S=4) sub-pixel shifts of the zero-padded frame and
interleave the outputs. This proves that reconstruction bit-matches the per-pixel reference
and pins the padding/offset alignment before it goes into the server + Java."""
import sys
import torch
import torch.nn.functional as F
from model import RawFCN
def load(run_dir):
    ck = torch.load(run_dir + "/model.pt", map_location="cpu", weights_only=False)
    a = ck.get("args", {}) or {}
    m = RawFCN(n_frames=a.get("nframes", 8), vel_radius=a.get("vel_radius", 5),
               patch=a.get("patch", 24), velocity_mode=a.get("velocity_mode", "grid"),
               vmax=a.get("vmax", 1.4))
    m.load_state_dict(ck["model"])
    m.eval().cuda()
    return m, a.get("nframes", 8), a.get("patch", 24)

@torch.no_grad()
def per_pixel(m, x, P, region):
    # Reference (mirrors CPU inferROI): for each pixel in region, run the net on its P×P
    # patch (zero-padded at borders), keep the channel vector. All patches batched -> 1 forward.
    # x: [N, H, W] on GPU. region = (y0, x0, h, w).
    N, H, W = x.shape
    half = P // 2
    y0, x0, h, w = region
    xp = F.pad(x, (half, half, half, half))  # [N, H+P, W+P] zero border (== CPU fill)
    patches = torch.empty(h * w, N, P, P, device=x.device, dtype=x.dtype)  # [npix, N, P, P]
    for q in range(h * w):
        cy, cx = y0 + q // w, x0 + q % w  # pixel center in ORIGINAL coords
        patches[q] = xp[:, cy:cy + P, cx:cx + P]  # xp[cy:cy+P] == original rows [cy-half, cy+half)
    out = m(patches)  # [npix, C, 1, 1]
    return out[:, :, 0, 0].reshape(h, w, -1)  # [h, w, C]

@torch.no_grad()
def shift_stitch(m, x, P, S=4):
    # Pad by half so a valid dense pass aligns output cell (oi,oj) to input pixel (S*oi, S*oj).
    # Then 16 shifts (sy,sx in 0..S-1) interleave into the full-res [C,H,W] field.
    N, H, W = x.shape
    half = P // 2
    xp = F.pad(x, (half, half, half, half))  # [N, H+P, W+P]
    C = m.out_ch
    full = torch.zeros(C, H, W, device=x.device, dtype=x.dtype)  # [C, H, W] full-res field
    for sy in range(S):
        for sx in range(S):
            xs = xp[:, sy:, sx:]  # shift the padded frame by (sy,sx)
            y = m(xs[None])[0]  # [C, oH, oW] 1/4-res dense grid
            oh = min(y.shape[1], (H - sy + S - 1) // S)  # cells that land inside [0,H)/[0,W)
            ow = min(y.shape[2], (W - sx + S - 1) // S)
            full[:, sy::S, sx::S] = y[:, :oh, :ow]  # interleave at stride S
    return full.permute(1, 2, 0)  # [H, W, C]

def compare(m, P, N, tag):
    # one alignment check under the current backend/precision settings
    torch.manual_seed(0)
    H, W = 40, 48
    dt = next(m.parameters()).dtype  # match model precision (fp32 / fp64)
    x = torch.randn(N, H, W, dtype=dt).cuda()  # [N, H, W] random conditioned stack
    region = (P // 2, P // 2, 8, 8)  # interior region (y0,x0,h,w)
    ref = per_pixel(m, x, P, region)  # [8, 8, C]
    full = shift_stitch(m, x, P)  # [H, W, C]
    y0, x0, h, w = region
    sub = full[y0:y0 + h, x0:x0 + w]  # [8, 8, C]
    d = (ref - sub).abs().max().item()
    rng = ref.abs().max().item()
    print(f" [{tag}] max|diff|={d:.3e} out~{rng:.2f} rel={d/max(rng,1e-12):.1e} "
          f"{'MATCH' if d < 1e-4 else 'mismatch'}")

if __name__ == "__main__":
    run = sys.argv[1] if len(sys.argv) > 1 else "runs/weighted9_pm_s"
    m, N, P = load(run)
    # 1) default: cuDNN on, fp32 -> small-patch and dense convs may pick different algos
    torch.backends.cudnn.enabled = True
    compare(m, P, N, "cuDNN fp32")
    # 2) cuDNN OFF: both paths use the same aten kernels -> isolates algorithm-choice noise
    torch.backends.cudnn.enabled = False
    compare(m, P, N, "cuDNN OFF fp32")
    # 3) fp64: near-exact arithmetic -> proves the ALIGNMENT (geometry) is right
    torch.backends.cudnn.enabled = True
    compare(m.double(), P, N, "fp64")
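The shift-and-stitch geometry that `dense_check.py` validates can be reproduced without PyTorch or a trained model. The sketch below is an illustration only: `dense_net` is a hypothetical stand-in for a stride-S fully-convolutional net (a valid P×P mean sampled every S pixels), not `RawFCN`, and the frame size is arbitrary. It shows that interleaving the S×S shifted dense passes gives the same field as evaluating every zero-padded P×P patch separately.

```python
import numpy as np

def dense_net(x, P, S):
    """Toy stride-S 'network': valid PxP mean, evaluated every S pixels (hypothetical stand-in)."""
    oh = (x.shape[0] - P) // S + 1
    ow = (x.shape[1] - P) // S + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = x[S * i:S * i + P, S * j:S * j + P].mean()
    return out

def per_pixel(x, P):
    """Reference: one zero-padded PxP patch per input pixel."""
    half = P // 2
    xp = np.pad(x, half)                      # zero border, as in the per-pixel reference path
    H, W = x.shape
    out = np.empty((H, W))
    for r in range(H):
        for c in range(W):
            out[r, c] = xp[r:r + P, c:c + P].mean()
    return out

def shift_stitch(x, P, S):
    """Run the dense net on the S*S shifts of the padded frame and interleave the outputs."""
    half = P // 2
    xp = np.pad(x, half)
    H, W = x.shape
    full = np.zeros((H, W))
    for sy in range(S):
        for sx in range(S):
            y = dense_net(xp[sy:, sx:], P, S)
            oh = (H - sy + S - 1) // S        # cells that land inside [0, H)
            ow = (W - sx + S - 1) // S        # cells that land inside [0, W)
            full[sy::S, sx::S] = y[:oh, :ow]
    return full

rng = np.random.default_rng(0)
x = rng.standard_normal((13, 10))             # sizes deliberately not multiples of S
ref = per_pixel(x, P=8)
full = shift_stitch(x, P=8, S=4)
max_diff = float(np.abs(ref - full).max())
print(f"max|diff| = {max_diff:.3e}")
```

Output cell `(i, j)` of the pass shifted by `(sy, sx)` covers padded rows `sy + S*i .. sy + S*i + P`, which is exactly the patch of input pixel `(sy + S*i, sx + S*j)`; that is the alignment the real check pins down.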
# diag_clean.py - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
"""Why is learned S jittery on clean input? Compare learned S vs EXACT MF path-sum. By Claude 06/18/2026
Usage: python diag_clean.py [model.pt] (default /work/runs/stage1_mfs/model.pt)"""
import sys, numpy as np, torch
import synth, stage2 as S2
from model import RawFCN
dev = "cuda" if torch.cuda.is_available() else "cpu"
N, vmax, P = 9, 2.8, 52; half, Nm1, HW = P // 2, N - 1, 140
ckpt = sys.argv[1] if len(sys.argv) > 1 else "/work/runs/stage1_mfs/model.pt"
print("model:", ckpt)
s1 = RawFCN(n_frames=N, patch=P, velocity_mode="reg", vmax=vmax).to(dev)
s1.load_state_dict(torch.load(ckpt, map_location=dev)["model"]); s1.eval()
def render(V, amp, noise, seed=0):
    rng = np.random.default_rng(seed)
    fr = rng.standard_normal((N, HW, HW)).astype(np.float32) if noise else np.zeros((N, HW, HW), np.float32)
    Hx = HW / 2 + V * Nm1 * 0.5; Hy = HW / 2; nb = 4; subs = np.arange(nb) * (1.0 / nb)
    for i in range(N):
        acc = np.zeros((HW, HW))
        for ss in subs:  # average nb sub-frame positions -> motion blur within the frame
            acc += synth.halfcos_bump(Hx - V * (i + ss), Hy, HW, HW)
        fr[i] += (amp * acc / nb).astype(np.float32)
    return fr, Hx, Hy

def rough(a):
    return float(np.std(np.diff(a)))  # pixel-to-pixel roughness (0 = perfectly smooth)

for noise, lbl in [(False, "CLEAN (no noise)"), (True, "NOISY amp=5")]:
    fr, Hx, Hy = render(1.0, 5, noise)
    s_t, vx_t, vy_t = S2.stage1_dense(s1, fr, dev=dev, mf_s=True)
    mf = S2.mf_sum(torch.from_numpy(fr).to(dev), vx_t, vy_t, half, N)  # EXACT MF along the net's own velocity
    s = s_t.cpu().numpy(); vx = vx_t.cpu().numpy(); mfn = mf.cpu().numpy()
    yf = int(round(Hy - half)); hxf = Hx - half; xs = np.arange(int(hxf) - 14, int(hxf) + 15)
    print(f"\n===== {lbl} =====")
    print(" x-off | S_learned Vx_learned S_exactMF")
    for x in xs[::2]:
        print(" %+4d | %8.2f %+6.3f %8.2f" % (x - hxf, s[yf, x], vx[yf, x], mfn[yf, x]))
    print(" roughness(diff-std): S_learned=%.3f S_exactMF=%.3f Vx_learned=%.4f"
          % (rough(s[yf, xs]), rough(mfn[yf, xs]), rough(vx[yf, xs])))
# eval_mfs.py - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
"""Eval the MF-S v2 pipeline (Stage-1 native-S weight + retrained refine). By Claude on 06/18/2026"""
import numpy as np, torch
import stage2 as S2
from model import RawFCN
dev="cuda" if torch.cuda.is_available() else "cpu"; N,vmax,HW=9,2.8,140; Nm1=N-1; half=26
s1=RawFCN(n_frames=N,patch=52,velocity_mode="reg",vmax=vmax).to(dev)
s1.load_state_dict(torch.load("/work/runs/stage1_mfs/model.pt",map_location=dev)["model"]); s1.eval()
net=S2.VoteRefine().to(dev); net.load_state_dict(torch.load("/work/runs/stage2_mfs/model.pt",map_location=dev)["model"]); net.eval()
def peaks(p,th,r=2):
    # local maxima above th: pixel equals the max of its (2r+1)x(2r+1) neighbourhood (1e-6 tolerance)
    out=[]; H_,W_=p.shape
    for y in range(r,H_-r):
        for x in range(r,W_-r):
            if p[y,x]>th and p[y,x]>=p[y-r:y+r+1,x-r:x+r+1].max()-1e-6: out.append((x,y,p[y,x]))
    return out

for TH in (0.5,0.7):
    ndet=0;ntot=0;errs=[];gh=[];rng=np.random.default_rng(123)
    for t in range(15):
        fr,tg=S2.gen_field(rng,HW,4,N,vmax,[3.0,8.0]); s,vx,vy=S2.stage1_dense(s1,fr,dev=dev,mf_s=True)
        aS,aVx,aVy=S2.vote_scatter(s,vx,vy,Nm1); nrm=aS.max().clamp(min=1e-6); aS=aS/nrm;aVx=aVx/nrm;aVy=aVy/nrm
        with torch.no_grad(): p=torch.sigmoid(net(aS,aVx,aVy)[0]).cpu().numpy()
        F_=p.shape[0]; pk=peaks(p,TH)
        # truth positions in output-field coords; keep only targets that fall inside the field
        tt=[((hx-tvx*Nm1)-half,(hy-tvy*Nm1)-half) for hx,hy,tvx,tvy in tg]; tt=[(x,y) for x,y in tt if 0<=x<F_ and 0<=y<F_]
        for tx,ty in tt:
            ntot+=1; near=[np.hypot(px-tx,py-ty) for px,py,pv in pk if np.hypot(px-tx,py-ty)<8]
            if near: ndet+=1; errs.append(min(near))
        for px,py,pv in pk:
            if all(np.hypot(px-tx,py-ty)>=8 for tx,ty in tt): gh.append(pv)
    # max(ntot,1) guards the percentage when no target landed inside the field
    print("th=%.2f: det %d/%d (%.0f%%) locerr %.2f | TRUE ghosts(>=8px) %d max %.3f"%(TH,ndet,ntot,100*ndet/max(ntot,1),np.median(errs) if errs else -1,len(gh),max(gh) if gh else 0))
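The scoring logic in `eval_mfs.py` (local-maximum peak picking, then matching peaks to truth within an 8 px gate and counting the rest as ghosts) can be tried on a hand-made map. The sketch below is a simplified illustration under assumed inputs: the probability map is built from hypothetical Gaussian blobs rather than from `S2.stage1_dense` and `VoteRefine`, and `score` is a made-up helper name.

```python
import numpy as np

def local_peaks(p, th, r=2):
    """(x, y, value) for pixels above th that are the maximum of their (2r+1)^2 neighbourhood."""
    out = []
    H, W = p.shape
    for y in range(r, H - r):
        for x in range(r, W - r):
            if p[y, x] > th and p[y, x] >= p[y - r:y + r + 1, x - r:x + r + 1].max():
                out.append((x, y, float(p[y, x])))
    return out

def score(peaks, targets, gate=8.0):
    """Return (n_detected, nearest-peak errors, ghost peak values beyond the gate of every target)."""
    errs = []
    for tx, ty in targets:
        near = [np.hypot(px - tx, py - ty) for px, py, _ in peaks
                if np.hypot(px - tx, py - ty) < gate]
        if near:
            errs.append(float(min(near)))
    ghosts = [pv for px, py, pv in peaks
              if all(np.hypot(px - tx, py - ty) >= gate for tx, ty in targets)]
    return len(errs), errs, ghosts

# Hypothetical 64x64 probability map: two blobs on real targets, one spurious blob.
yy, xx = np.mgrid[0:64, 0:64]

def blob(cx, cy, amp, sigma=1.5):
    return amp * np.exp(-((xx - cx) ** 2 + (yy - cy) ** 2) / (2 * sigma ** 2))

p = blob(20, 30, 0.9) + blob(45, 12, 0.8) + blob(50, 50, 0.6)
targets = [(20, 30), (45, 12)]

pk = local_peaks(p, th=0.5)
ndet, errs, ghosts = score(pk, targets)
print(f"det {ndet}/{len(targets)}  locerr {np.median(errs):.2f}  ghosts {len(ghosts)}")
```

Raising the threshold above the spurious blob's amplitude removes the ghost without losing either target, which is the trade-off the two `TH` values in the script probe.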
# export_onnx.py - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
"""Export an existing model.pt checkpoint to ONNX (no retrain). # By Claude on 06/13/2026
Usage: python export_onnx.py /work/runs/weighted/model.pt"""
import sys, torch
from model import RawFCN
ckpt = sys.argv[1]
ck = torch.load(ckpt, map_location="cpu")
a = ck.get("args", {}) or {}
N = a.get("nframes", 8); P = a.get("patch", 24); vr = a.get("vel_radius", 5)
# build the model with the same arguments the checkpoint was trained with (as dense_check.py does),
# so the state dict matches for non-default patch / velocity_mode / vmax
m = RawFCN(n_frames=N, vel_radius=vr, patch=P,
           velocity_mode=a.get("velocity_mode", "grid"), vmax=a.get("vmax", 1.4))
m.load_state_dict(ck["model"]); m.eval()
onnx_path = ckpt[:-3] + ".onnx" if ckpt.endswith(".pt") else ckpt + ".onnx"
dummy = torch.zeros(1, N, P, P)
torch.onnx.export(m, dummy, onnx_path,
                  input_names=["frames"], output_names=["out"],
                  dynamic_axes={"frames": {0: "B", 2: "H", 3: "W"}, "out": {0: "B", 2: "Hout", 3: "Wout"}},
                  opset_version=17)
print(f"exported {onnx_path} (frames[B,{N},H,W] -> out[B,{m.out_ch},Hout,Wout])")
# export_refine.py - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
"""Export a trained Stage-2 VoteRefine checkpoint to a self-contained ONNX. By Claude on 06/18/2026
CuasStage2Infer feeds a single [1,3,H,W] tensor (accS, accVx, accVy stacked) and reads [1,3,H,W]
(det logit + Vx,Vy). So we export the inner conv stack (VoteRefine.net), which is exactly that
map. dynamo=False keeps weights INLINE (no external .onnx.data) - Java's ORT load wants one file.
Usage: python export_refine.py /work/runs/stage2_mfs/model.pt
"""
import sys, torch
from stage2 import VoteRefine
ckpt = sys.argv[1]
ck = torch.load(ckpt, map_location="cpu")
net = VoteRefine()
net.load_state_dict(ck["model"]); net.eval()
onnx_path = ckpt[:-3] + ".onnx" if ckpt.endswith(".pt") else ckpt + ".onnx"
dummy = torch.zeros(1, 3, 64, 64) # [B,3,H,W]; H,W dynamic below
torch.onnx.export(net.net, dummy, onnx_path,
                  input_names=["acc"], output_names=["out"],
                  dynamic_axes={"acc": {0: "B", 2: "H", 3: "W"}, "out": {0: "B", 2: "H", 3: "W"}},
                  opset_version=17, dynamo=False)
print(f"exported {onnx_path} (acc[B,3,H,W] -> out[B,3,H,W])")
#!/usr/bin/env python3
# extract_B.py - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
# By Claude on 06/12/2026. Extract the measured 4D ambiguity function B from a
# synthetic-run -C5P-RECT (or -RECT) rendering, verify it against ground truth,
# and solve the regularized 4D deconvolution D (Wiener) with a
# condensation-vs-noise-gain sweep.
#
# RECT layout: ROI pixel (px,py) -> 12x12 block at [1+py*12 : 12+py*12,
# 1+px*12 : 12+px*12] (11x11 velocity cells + 1px NaN gap grid).
import argparse, json
import numpy as np
import tifffile
def main():
    ap = argparse.ArgumentParser()
    ap.add_argument('--rect', required=True, help='-RECT.tiff rendering (synthetic run)')
    ap.add_argument('--synth', required=True, help='the -CUAS-SYNTHETIC-CUAS.tiff (for frame index mapping)')
    ap.add_argument('--gt', required=True, help='ground truth json')
    ap.add_argument('--roi', default='160,96,320,320', help='SYNTH_ROI x,y,w,h')
    ap.add_argument('--srad', type=int, default=8, help='spatial PSF half-extent, px')
    ap.add_argument('--slice', type=int, default=-1, help='RECT slice (0-based); -1 = auto: newest frame t%%4==0, mid-file')
    args = ap.parse_args()
    with open(args.gt) as f:
        gt = json.load(f)
    vrad = gt['vel_radius']; vdim = 2*vrad + 1
    rx, ry, rw, rh = [int(v) for v in args.roi.split(',')]
    with tifffile.TiffFile(args.synth) as tf:
        synth_labels = tf.imagej_metadata['Labels'][1:]  # frame t = index
    with tifffile.TiffFile(args.rect) as tf:
        rect_labels = tf.imagej_metadata['Labels']
        nsl = len(tf.pages)
        # map each rect slice to frame index of its newest data
        frames = [synth_labels.index(l) for l in rect_labels]
        sl = args.slice
        if sl < 0:  # newest frame on the 4-frame phase grid (all targets pixel-centered)
            cands = [i for i, t in enumerate(frames) if t % 4 == 0 and 20 <= t]
            sl = cands[len(cands)//2] if cands else nsl//2
        t = frames[sl]
        img = tf.asarray(key=sl)
    print('rect slices %d, using slice %d (label %s, frame t=%d)' % (nsl, sl, rect_labels[sl], t))
    srad = args.srad
    sdim = 2*srad + 1
    psfs = {}
    for tg in gt['targets']:
        vx_c, vy_c = tg['v_cells']
        vx, vy = tg['v_pix_per_frame']
        x = tg['node'][0] + vx*t
        y = tg['node'][1] + vy*t
        px = int(np.floor(x)) - rx
        py = int(np.floor(y)) - ry
        if not (srad <= px < rw-srad and srad <= py < rh-srad):
            continue
        # extract [sdim, sdim, vdim, vdim] patch (spatial dy, dx, vel vy, vx)
        P = np.zeros((sdim, sdim, vdim, vdim), np.float32)
        for dy in range(-srad, srad+1):
            for dx in range(-srad, srad+1):
                bx = px+dx; by = py+dy
                P[dy+srad, dx+srad] = img[by*12:by*12+11, bx*12:bx*12+11]
        psfs[(vx_c, vy_c)] = P
    print('extracted %d target PSFs' % len(psfs))
    # --- sanity: blob center at (dp=0, v=v_true)? and +-v symmetry of the static one
    P0 = psfs.get((0, 0))
    if P0 is not None:
        c = P0[srad, srad]
        iy, ix = np.unravel_index(np.argmax(c), c.shape)
        print('static target: center-pixel argmax at vel (%+d,%+d) (expect 0,0), peak %.2f'
              % (ix-vrad, iy-vrad, c[iy, ix]))
        row = c[vrad]
        asym = np.abs(row - row[::-1]).max() / row.max()
        print('static target vx row:', np.round(row, 1))
        print(' max +-vx asymmetry: %.3f (relative)' % asym)
    # --- shift-invariance in velocity: align central targets' PSFs and compare
    keys = [k for k in psfs if max(abs(k[0]), abs(k[1])) <= 2]
    if not keys:
        raise SystemExit('no central targets (|v_cells| <= 2) inside the ROI; cannot build B')
    aligned = []
    kk = np.arange(vdim)
    for (i, j) in keys:
        P = psfs[(i, j)]
        A = np.roll(np.roll(P, -j, axis=2), -i, axis=3)  # shift v_true to center
        # mask cells rolled across the velocity border: A[..., k, m] holds P[..., k+j, m+i],
        # so a cell is valid only when both source indices lie inside the grid
        # (rolling an all-True mask would leave it all True and mask nothing)
        M = (((kk + j >= 0) & (kk + j < vdim))[:, None] &
             ((kk + i >= 0) & (kk + i < vdim))[None, :])
        aligned.append((A, M))
    ref = aligned[len(aligned)//2][0]
    devs = []
    for A, M in aligned:
        d = (np.abs(A - ref)[:, :, M]).max() / ref.max()
        devs.append(d)
    print('velocity shift-invariance over %d central targets: max rel deviation %.3f'
          % (len(keys), max(devs)))
# --- average PSF (central targets) = B; Wiener D sweep
B = np.mean([A for A, _ in aligned], axis=0).astype(np.float64)
B /= B.max()
# desired G: separable half-cosine, +-1 cell in all four dims
g1 = np.array([0.5, 1.0, 0.5])
G = np.zeros_like(B)
cs, cv = srad, vrad
for a in range(-1, 2):
for b in range(-1, 2):
for cc in range(-1, 2):
for d in range(-1, 2):
G[cs+a, cs+b, cv+cc, cv+d] = g1[a+1]*g1[b+1]*g1[cc+1]*g1[d+1]
Bf = np.fft.fftn(np.fft.ifftshift(B))
Gf = np.fft.fftn(np.fft.ifftshift(G))
print('\nlambda resid noise_gain out_extent(cells>10%)')
for lam in (1e-4, 1e-3, 1e-2, 0.03, 0.1, 0.3):
Df = np.conj(Bf)*Gf / (np.abs(Bf)**2 + lam)
D = np.real(np.fft.fftshift(np.fft.ifftn(Df)))
out = np.real(np.fft.fftshift(np.fft.ifftn(Df*Bf)))
resid = np.sqrt(((out-G)**2).sum()/ (G**2).sum())
ngain = np.sqrt((D**2).sum())
extent = int((out > 0.1*out.max()).sum())
print('%7.0e %7.3f %8.3f %d (G itself: %d)'
% (lam, resid, ngain, extent, int((G > 0.1).sum())))
np.save(args.rect + '.B.npy', B)
print('\nsaved averaged B to', args.rect + '.B.npy')
if __name__ == '__main__':
main()
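The lambda sweep above trades residual against noise gain. A minimal 2D sketch of the same regularized (Wiener-style) inverse follows, assuming a made-up Gaussian blur `B` and the same half-cosine target `G`; the array size, the blur width and the helper name `wiener_kernel` are illustrative assumptions, not values from the real 4D PSF.

```python
import numpy as np

def wiener_kernel(B, G, lam):
    """Return (D, out): the regularized inverse kernel D and the shaped response out = D (*) B."""
    Bf = np.fft.fftn(np.fft.ifftshift(B))
    Gf = np.fft.fftn(np.fft.ifftshift(G))
    Df = np.conj(Bf) * Gf / (np.abs(Bf) ** 2 + lam)
    D = np.real(np.fft.fftshift(np.fft.ifftn(Df)))
    out = np.real(np.fft.fftshift(np.fft.ifftn(Df * Bf)))
    return D, out

n = 17                                   # assumed (odd) grid size
x = np.arange(n) - n // 2
b1 = np.exp(-x ** 2 / (2 * 2.0 ** 2))    # assumed blur: Gaussian, sigma = 2 cells
B = np.outer(b1, b1)
B /= B.max()
g1 = np.zeros(n)
g1[n // 2 - 1:n // 2 + 2] = [0.5, 1.0, 0.5]
G = np.outer(g1, g1)                     # desired response: separable half-cosine

results = []
for lam in (1e-4, 1e-2, 0.3):
    D, out = wiener_kernel(B, G, lam)
    resid = np.sqrt(((out - G) ** 2).sum() / (G ** 2).sum())
    ngain = np.sqrt((D ** 2).sum())
    results.append((lam, resid, ngain))
    print('%7.0e resid %.3f noise_gain %.3f' % (lam, resid, ngain))
```

Per frequency, the error term is `lam*Gf/(|Bf|^2+lam)` and the kernel magnitude is `|Bf||Gf|/(|Bf|^2+lam)`, so a larger lambda always raises the residual and lowers the noise gain.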
# gap_eval.py - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
"""Gap-L2 eval on HELD-OUT modulated synthetic: sharpness + coast + hallucination. By Claude 06/23/2026"""
import argparse, numpy as np, torch, synth, layer2_data as L1D
from layer2 import Layer2Net
def fwhm_at(field,cx,cy,G):
    f=np.roll(np.roll(field,int(round(G//2-cy)),0),int(round(G//2-cx)),1); c=G//2
    def w(l):
        pk=float(l[c])
        if pk<=1e-6: return np.nan
        h=pk/2; i=c
        while i<len(l)-1 and l[i]>=h: i+=1
        r=(i-1)+(l[i-1]-h)/(l[i-1]-l[i]) if l[i-1]>l[i] else float(i-1)
        i=c
        while i>0 and l[i]>=h: i-=1
        L=(i+1)-(l[i+1]-h)/(l[i+1]-l[i]) if l[i+1]>l[i] else float(i+1)
        return r-L
    return np.nanmean([w(f[c,:]),w(f[:,c])])
ap=argparse.ArgumentParser()
ap.add_argument("--l1",default="runs/weighted9_pm/model.pt"); ap.add_argument("--model",required=True)
ap.add_argument("--out",required=True); ap.add_argument("--T",type=int,default=128); ap.add_argument("--G",type=int,default=32)
ap.add_argument("--seeds",default="3000,3001,3002,3003,3004,3005")
ap.add_argument("--train_n",type=int,default=0,help="if >0: reproduce first N TRAINING sequences (one rng seed 0, sequential, == build_cache) instead of --seeds")
a=ap.parse_args(); import os; os.makedirs(a.out,exist_ok=True)
dev="cuda" if torch.cuda.is_available() else "cpu"
net1,N,_=L1D._load_l1(a.l1,dev)
ck=torch.load(a.model,map_location=dev); ar=ck["args"]
net=Layer2Net(ch_in=3,ch_hidden=ar["ch"],grid=a.G,vmax=ar["vmax"]).to(dev); net.load_state_dict(ck["model"]); net.eval()
G=a.G; seeds=[int(s) for s in a.seeds.split(",")]
# train_n mode: ONE rng(0) generating N sequences sequentially == build_cache training data. By Claude 06/23
train_rng = np.random.default_rng(0) if a.train_n>0 else None
items = list(range(a.train_n)) if a.train_n>0 else seeds
INs,L1s,L2s,TRs,SGs=[],[],[],[],[]
fl1,fl2,s_bright,s_gap,fp_dark=[],[],[],[],[]
for sd in items:
    rng = train_rng if a.train_n>0 else np.random.default_rng(sd)
    fr,pos,vel,pres,sig=L1D.render_run(rng,T=a.T,G=G,vmax=1.4,snr=6.0,gaps=True,bp_lo=6,bp_hi=18,duty_offset=0.2,starter_len=8,return_signal=True)
    seq=L1D.gen_field_sequence(net1,fr,pos,G,N,dev)
    with torch.no_grad():
        det,_=net(torch.from_numpy(seq[None]).to(dev)); l2=torch.sigmoid(det)[0,:,0].cpu().numpy()
    l1s=seq[:,0]
    tr=np.zeros((a.T,G,G),np.float32)
    for t in range(a.T):
        if pres[t]: tr[t]=L1D.halfcos_bump_torus(pos[t,0],pos[t,1],G)
    INs.append(fr);L1s.append(l1s);L2s.append(l2);TRs.append(tr);SGs.append(np.broadcast_to(sig[:,None,None],(a.T,G,G)).astype(np.float32))
    for t in range(a.T):
        if not pres[t]: continue
        cx,cy=pos[t]; ci,cj=int(round(cy))%G,int(round(cx))%G
        l1v=l1s[t,ci,cj]; l2v=float(l2[t,max(0,ci-1):ci+2,max(0,cj-1):cj+2].max())
        if sig[t]>0.5: # BRIGHT frame
            s_bright.append(l2v)
            if l1v>0.5: fl1.append(fwhm_at(l1s[t],cx,cy,G))
            if l2[t].max()>0.5: fl2.append(fwhm_at(l2[t],cx,cy,G))
        else: # GAP frame (present, L1 starved)
            s_gap.append(l2v)
            m=np.ones((G,G),bool); yy,xx=np.ogrid[:G,:G]
            m[(((xx-cj+G/2)%G-G/2)**2+((yy-ci+G/2)%G-G/2)**2)<=9]=False
            fp_dark.append(float(l2[t][m].max()))
def cat(L): return np.concatenate(L,0)
for nm,st in [("input",INs),("L1_s",L1s),("L2_det",L2s),("truth",TRs),("signal",SGs)]:
    synth.save_tiff_stack(cat(st).astype(np.float32),f"{a.out}/{nm}.tif")
_src = f"TRAINING seqs 0..{a.train_n-1}" if a.train_n>0 else f"held-out seeds {seeds}"
print(f"=== GAP-L2 eval ({a.model}) {_src} ===")
print(f"FWHM @ target (bright): L1 ~ {np.nanmean(fl1):.2f} L2 ~ {np.nanmean(fl2):.2f} px")
print(f"COAST L2 s@truth: BRIGHT frames ~ {np.mean(s_bright):.2f} GAP frames ~ {np.mean(s_gap):.2f} (want gap HIGH = coasting)")
print(f"HALLUCINATION L2 max away-from-target on GAP frames: mean {np.mean(fp_dark):.2f} max {np.max(fp_dark):.2f}")
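The sharpness figure printed above is a full width at half maximum found by walking outward from the peak and linearly interpolating the half-height crossing. A minimal 1D sketch of that measurement follows, assuming a profile whose peak sits at index `c`; the name `fwhm_1d` and the test profiles are illustrative, not part of the evaluation script.

```python
import numpy as np

def fwhm_1d(profile, c):
    """FWHM of a 1D profile peaked at index c, with linear interpolation at half height."""
    pk = float(profile[c])
    if pk <= 1e-6:
        return float('nan')              # no usable peak
    h = pk / 2
    i = c
    while i < len(profile) - 1 and profile[i] >= h:
        i += 1                           # first sample below half height on the right
    if profile[i - 1] > profile[i]:
        right = (i - 1) + (profile[i - 1] - h) / (profile[i - 1] - profile[i])
    else:
        right = float(i - 1)
    i = c
    while i > 0 and profile[i] >= h:
        i -= 1                           # first sample below half height on the left
    if profile[i + 1] > profile[i]:
        left = (i + 1) - (profile[i + 1] - h) / (profile[i + 1] - profile[i])
    else:
        left = float(i + 1)
    return right - left

triangle = np.array([0.0, 0.0, 0.5, 1.0, 0.5, 0.0, 0.0])
x = np.arange(41) - 20
gauss = np.exp(-x ** 2 / (2 * 3.0 ** 2))  # sigma = 3 cells, analytic FWHM = 2.3548 * sigma

w_tri = fwhm_1d(triangle, 3)
w_gauss = fwhm_1d(gauss, 20)
print('triangle FWHM %.3f, gaussian FWHM %.3f' % (w_tri, w_gauss))
```

Linear interpolation slightly overestimates the width of a convex flank, so the Gaussian result lands a little above the analytic 2.3548*sigma.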
#!/usr/bin/env python3
# gen_synth_cuas.py - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
# By Claude on 06/11/2026, design by Andrey Filippov (2026-06-11).
#
# Generate a synthetic *-CUAS-MERGED-CUAS.tiff for the 4D-deconvolution experiment:
# ideal point targets ("straight line segments" in x,y,t) on a grid, one fine
# velocity per grid cell, structurally identical to a real merged file (same
# 640x512 float32, same slice count, timestamps borrowed verbatim from the real
# file) so it runs through CuasDetectRT unchanged, with the same pyramid levels.
#
# Layout (the items below describe --layout tiled; the CLI default is --layout radial: one
# target per velocity cell at center + step*(vx,vy), moving outward, no restarts):
# - grid step (CLI default 30 px, meant for radial; use 40 for tiled): each cell holds ONE
#   target so no other target enters the conv5d kernel attention area (direct-kernel spatial
#   reach is 8 px at VR5/VD4/NH6; travel budget per segment = step/2 - reach).
#   Step 30 leaves only a 7 px budget -> the fastest velocity (1.25 px/frame
#   per axis) cannot complete one clean 6-frame window; step 40 gives 12 px
#   -> 9 frames, clean windows for every velocity.
# - velocity per cell: vx_cell = (col % 11) - 5, vy_cell = (row % 11) - 5
#   (all 121 fine velocities in the top-left 11x11 block, repeating beyond).
#   Pixel velocity = cell / vel_decimate px/frame.
# - motion: position = node + v * (frame - segment_start); each target restarts
#   (jumps back to its node) when its per-axis travel would exceed the budget.
#   Restart frames are recorded in the ground-truth JSON - analysis can select
#   windows that do not span a restart.
# - target rendering: sum-normalized spot at the fractional position - constant
#   total flux at any sub-pixel phase. Default shape is the 3-px half-cosine bump
#   (--shape halfcosine); --shape gaussian gives a Gaussian spot (sigma default 0.7 px).
# - slice 1 is "average" (temporal mean; skipped by the importer, kept for
#   human browsing); the rest reuse the real file's timestamp labels.
#
# Usage:
# python gen_synth_cuas.py --ref <real-CUAS-MERGED-CUAS.tiff> --out <out.tiff>
# Ground truth is written next to the output as <out>.groundtruth.json
import argparse, json, os
import numpy as np
import tifffile
def main():
    ap = argparse.ArgumentParser()
    ap.add_argument('--ref', required=True, help='real *-CUAS-MERGED-CUAS.tiff (labels/shape source)')
    ap.add_argument('--out', required=True, help='output synthetic tiff path')
    ap.add_argument('--layout', choices=['radial','tiled'], default='radial',
                    help='radial (default): one target per velocity cell at center + step*(vx,vy), velocities point outward from image center - distances only grow, no restarts, full-duration tracks. tiled: velocity by (col%%11,row%%11) over the whole frame with per-target restart jumps.')
    ap.add_argument('--step', type=int, default=30, help='grid step, px (default 30 radial / use 40 for tiled)')
    ap.add_argument('--center-x', type=float, default=None, help='radial velocity-0 node X (default image center). Set to a ROI corner so the full +velocity sweep (0..+vmax) lands inside the ROI instead of radiating out of it. By Claude 06/19/2026')
    ap.add_argument('--center-y', type=float, default=None, help='radial velocity-0 node Y (default image center)')
    ap.add_argument('--vel-radius', type=int, default=5, help='fine velocity radius, cells (VR)')
    ap.add_argument('--vel-decimate', type=int, default=4, help='velocity decimation (VD): px/frame = cell/VD')
    ap.add_argument('--vel-step', type=float, default=None, help='explicit target velocity step, px/frame: v_pix = cell*vel_step (overrides cell/vel_decimate). For LEV-emulation set to v_LEV3/4 so cells 1,2,4 = LEV1,LEV2,LEV3. By Claude 06/19/2026')
    ap.add_argument('--kernel-reach', type=int, default=8, help='conv kernel spatial reach, px (direct kernel: 8 at VR5/VD4/NH6)')
    ap.add_argument('--amplitude', type=float, default=100.0, help='target total flux, counts')
    ap.add_argument('--shape', choices=['halfcosine','gaussian'], default='halfcosine',
                    help='target spot shape: halfcosine (default) = canonical 3-px half-period half-cosine bump (matches matched-filter kernel, G, condition()); gaussian = sigma-defined')
    ap.add_argument('--sigma', type=float, default=0.7, help='Gaussian sigma, px (shape=gaussian only)')
    ap.add_argument('--background', type=float, default=0.0, help='constant background level')
    ap.add_argument('--background-from', default=None,
                    help='tiff to use as per-frame background (e.g. the REAL merged file: synthetic targets are ADDED to real clutter); overrides --background')
    ap.add_argument('--bg-decimate-average', type=int, default=1, help='each output bg frame = mean of N consecutive --background-from frames (noise-floor / level control: N=2^L gives the LEV-L noise floor, std down sqrt(N)). By Claude 06/19/2026')
    ap.add_argument('--peak', type=float, default=0.0,
                    help='if >0: set amplitude so a pixel-centered target PEAK equals this (halfcosine: amplitude = 4*peak)')
    ap.add_argument('--nframes', type=int, default=0, help='limit number of frames (0 = same as ref)')
    ap.add_argument('--motion-blur', action='store_true', help='causal/RT motion blur (Andrey naive method): each frame = average of sub-frame target positions trailing into the past; streak ~|v|*blur_frac, centroid lags ~0.5*blur_frac*|v| (RT registration bias) // By Claude on 06/17/2026')
    ap.add_argument('--blur-frac', type=float, default=1.0, help='MB averaging window in frames (1.0 = non-overlap decimation)')
    ap.add_argument('--blur-subs', type=int, default=4, help='sub-frames per averaging window (the 4-8 subdivision)')
    args = ap.parse_args()
    with tifffile.TiffFile(args.ref) as tf:
        labels = list(tf.imagej_metadata['Labels'])
        nslices, height, width = tf.series[0].shape
    assert labels[0] == 'average' and len(labels) == nslices
    ts_labels = labels[1:]
    navg = max(1, args.bg_decimate_average) # noise-floor / level control: each output frame averages navg real frames // By Claude 06/19/2026
    avail = len(ts_labels) // navg
    nframes = avail if args.nframes <= 0 else min(args.nframes, avail)
    ts_labels = ts_labels[::navg][:nframes]
    step = args.step
    vrad = args.vel_radius
    vdim = 2 * vrad + 1
    vstep = args.vel_step if args.vel_step is not None else 1.0 / args.vel_decimate # px/frame per velocity cell // By Claude 06/19/2026
    budget = step / 2.0 - args.kernel_reach
    assert budget > 0, 'grid step too small for kernel reach'
    targets = []
    if args.layout == 'radial':
        # one target per fine velocity cell, at center + step*(vx_cell, vy_cell), moving
        # outward: pure expansion, pairwise distances only grow - no restarts ever
        cx = (args.center_x if args.center_x is not None else width // 2 + 0.5) # velocity-0 node (phase 0); ROI corner -> +sweep lands in-ROI
        cy = (args.center_y if args.center_y is not None else height // 2 + 0.5)
        tid = 0
        for vy_cell in range(-vrad, vrad + 1):
            for vx_cell in range(-vrad, vrad + 1):
                targets.append({
                    'id': tid,
                    'node': [cx + step * vx_cell, cy + step * vy_cell],
                    'v_cells': [vx_cell, vy_cell],
                    'v_pix_per_frame': [vx_cell * vstep, vy_cell * vstep],
                    'restart_period_frames': 0, # never
                })
                tid += 1
    else: # tiled
        cols = width // step
        rows = height // step
        for row in range(rows):
            for col in range(cols):
                vx_cell = (col % vdim) - vrad
                vy_cell = (row % vdim) - vrad
                vx = vx_cell * vstep # px/frame
                vy = vy_cell * vstep
                node_x = col * step + step / 2.0 + 0.5 # +0.5: start exactly on a pixel center (phase 0)
                node_y = row * step + step / 2.0 + 0.5
                vmax = max(abs(vx), abs(vy))
                period = int(np.floor(budget / vmax)) if vmax > 0 else 0 # 0 = static, never restarts
                targets.append({
                    'id': row * cols + col,
                    'node': [node_x, node_y],
                    'v_cells': [vx_cell, vy_cell],
                    'v_pix_per_frame': [vx, vy],
                    'restart_period_frames': period,
                })
    if args.peak > 0: # peak-referenced amplitude: peak = amplitude * splat_peak/splat_sum at phase 0
        xs = np.arange(-3, 4) + 0.0
        prof = (np.where(np.abs(xs) < 1.5, np.cos(np.pi/3.0*np.abs(xs)), 0.0) if args.shape == 'halfcosine'
                else np.exp(-xs**2/(2*args.sigma**2)))
        args.amplitude = args.peak * (prof.sum()**2) / (prof.max()**2)
        print('peak %.3g -> amplitude (total flux) %.4g' % (args.peak, args.amplitude))
    # sum-normalized splat: half-cosine bump (canonical 3-px half-period) or Gaussian
    def splat_1d(coords, center):
        if args.shape == 'halfcosine':
            d = np.abs(coords - center)
            return np.where(d < 1.5, np.cos(np.pi / 3.0 * d), 0.0)
        return np.exp(-((coords - center) ** 2) / (2 * args.sigma ** 2))
    rad = 2 if args.shape == 'halfcosine' else int(np.ceil(3 * args.sigma)) + 1
    if args.background_from:
        with tifffile.TiffFile(args.background_from) as tf:
            bg_labels = tf.imagej_metadata['Labels']
            first = 1 if bg_labels[0] == 'average' else 0
            need = nframes * navg
            raw = tf.asarray()[first:first+need].astype(np.float32)
        assert raw.shape[0] == need, 'need %d bg frames, have %d (reduce --nframes or --bg-decimate-average)' % (need, raw.shape[0])
        if navg > 1:
            raw = raw.reshape(nframes, navg, height, width).mean(axis=1) # LEV-log2(navg) noise floor: std down sqrt(navg) // By Claude 06/19/2026
        data = raw.copy()
        assert data.shape == (nframes, height, width)
        print('background: %d output frames, each = mean of %d real frames (%d total) from %s'
              % (nframes, navg, need, args.background_from))
    else:
        data = np.full((nframes, height, width), args.background, dtype=np.float32)
    for t in range(nframes):
        frame = data[t]
        for tg in targets:
            vx, vy = tg['v_pix_per_frame']
            period = tg['restart_period_frames']
            dt = t if period == 0 else (t % period)
            x = tg['node'][0] + vx * dt
            y = tg['node'][1] + vy * dt
            # causal/RT motion blur: average sub-frame positions trailing into the past (naive method)
            if args.motion_blur and (vx or vy):
                subs = np.arange(args.blur_subs) * (args.blur_frac / args.blur_subs) # s in [0, blur_frac)
                pxs = [x - vx * sb for sb in subs]; pys = [y - vy * sb for sb in subs]
            else:
                pxs = [x]; pys = [y]
            ix0 = max(int(np.floor(min(pxs))) - rad, 0)
            ix1 = min(int(np.floor(max(pxs))) + rad + 1, width)
            iy0 = max(int(np.floor(min(pys))) - rad, 0)
            iy1 = min(int(np.floor(max(pys))) + rad + 1, height)
            if ix0 >= ix1 or iy0 >= iy1:
                continue
            xs = np.arange(ix0, ix1) + 0.5 # pixel centers
            ys = np.arange(iy0, iy1) + 0.5
            spot = np.zeros((iy1 - iy0, ix1 - ix0), dtype=np.float64)
            for sx, sy in zip(pxs, pys):
                spot += np.outer(splat_1d(ys, sy), splat_1d(xs, sx)) # equal-weight trailing average
            spot /= len(pxs)
            s = spot.sum()
            if s > 0:
                frame[iy0:iy1, ix0:ix1] += (args.amplitude / s) * spot.astype(np.float32)
    avg = data.mean(axis=0, keepdims=True)
    stack = np.concatenate([avg, data], axis=0)
    out_labels = ['average'] + ts_labels
    # abspath: a bare output filename has an empty dirname, which os.makedirs rejects
    os.makedirs(os.path.dirname(os.path.abspath(args.out)), exist_ok=True)
    # axes 'ZYX' is essential: without it tifffile declares the planes as CHANNELS
    # (channels=498 composite) and ImageJ gives each its own display range - slices
    # then LOOK differently scaled although pixel data is identical
    tifffile.imwrite(
        args.out, stack, imagej=True,
        metadata={'axes': 'ZYX', 'Labels': out_labels})
    gt = {
        'ref': args.ref,
        'layout': args.layout,
        'width': width, 'height': height, 'nframes': nframes,
        'grid_step': step, 'vel_radius': vrad, 'vel_decimate': args.vel_decimate,
        'vel_step_px': vstep, 'bg_decimate_average': navg,
        'kernel_reach': args.kernel_reach, 'travel_budget_px': budget,
        'amplitude_total_flux': args.amplitude, 'shape': args.shape, 'sigma_px': args.sigma,
        'background': args.background, 'background_from': args.background_from, 'peak': args.peak,
        'motion': ('pos = node + v*t, no restarts (radial expansion)' if args.layout == 'radial'
                   else 'pos = node + v*(t % restart_period); restart_period 0 = static'),
        'targets': targets,
    }
    with open(args.out + '.groundtruth.json', 'w') as f:
        json.dump(gt, f, indent=1)
    print('wrote', args.out, stack.shape, 'and ground truth json;',
          len(targets), 'targets, layout', args.layout, 'step', step)
if __name__ == '__main__':
    main()
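The generator above renders each target as a sum-normalized splat, so the total flux stays fixed at any sub-pixel phase, and `--peak` maps to `amplitude = 4*peak` for the half-cosine shape. A minimal sketch of both properties follows; the 9x9 canvas and the helper names `halfcos_1d` and `render_spot` are illustrative assumptions, not part of the script.

```python
import numpy as np

def halfcos_1d(coords, center):
    """3-px half-period half-cosine bump sampled at pixel-center coordinates."""
    d = np.abs(coords - center)
    return np.where(d < 1.5, np.cos(np.pi / 3.0 * d), 0.0)

def render_spot(cx, cy, amplitude, size=9):
    """Render one target on a size x size canvas with total flux == amplitude."""
    centers = np.arange(size) + 0.5      # pixel centers, as in the generator
    spot = np.outer(halfcos_1d(centers, cy), halfcos_1d(centers, cx))
    return amplitude * spot / spot.sum()

centered = render_spot(4.5, 4.5, 100.0)  # phase 0: target exactly on a pixel center
shifted = render_spot(4.8, 4.3, 100.0)   # arbitrary sub-pixel phase

print('flux centered %.3f shifted %.3f' % (centered.sum(), shifted.sum()))
print('centered peak %.3f (amplitude/4)' % centered.max())
```

At phase 0 the 1D samples are 0.5, 1, 0.5 (sum 2), so the 2D sum is 4 and the peak pixel carries a quarter of the flux.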
#!/usr/bin/env python3
# ghost_probe.py - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
"""Trajectory-alias ghost probe. # By Claude on 06/16/2026
A STATIC bright target placed at offset d from the patch center (= the output/ROI pixel) should be
SUPPRESSED (det s -> 0): it's a neighbour's target, not a fast target through me. The failure mode
(seen in the clean synthetic grid) is the net hallucinating a FAST velocity whose tail lands on the
static blob -> a ghost detection at d in the untrained off-center band. This sweeps d and prints s +
the argmax velocity, so we can compare the 24-patch (off_max=9) vs 32-patch (off_max=13) models:
the wider RF + extended off-center suppression should drive s->~0 out to a larger d."""
import argparse, numpy as np, torch, synth
from model import RawFCN
def sigmoid(z): return 1.0 / (1.0 + np.exp(-z))
if __name__ == "__main__":
    ap = argparse.ArgumentParser()
    ap.add_argument("ck")
    ap.add_argument("--nframes", type=int, default=9)
    ap.add_argument("--patch", type=int, default=24)
    ap.add_argument("--vel_radius", type=int, default=5)
    ap.add_argument("--vel_decimate", type=int, default=4)
    ap.add_argument("--amp", type=float, default=8.0) # static target peak (training: snr*bump, snr 1..8)
    a = ap.parse_args()
    dev = "cuda" if torch.cuda.is_available() else "cpu"
    ck = torch.load(a.ck, map_location=dev)
    m = RawFCN(n_frames=a.nframes, vel_radius=a.vel_radius, patch=a.patch).to(dev)
    m.load_state_dict(ck["model"]); m.eval()
    P = a.patch; cx = P / 2.0; cy = P / 2.0 # deployment reference = P/2
    n = 2 * a.vel_radius + 1; step = 1.0 / a.vel_decimate
    ix = np.arange(n * n); vxc = (ix % n - a.vel_radius) * step; vyc = (ix // n - a.vel_radius) * step
    print(f"model {a.ck} patch={a.patch} amp={a.amp} off_max={P/2 - 2 - 1:.0f}px")
    print(f" STATIC target at offset d along +x; want s->0 as d grows (ghost = high s + fast v)")
    print(f"{'d':>4} {'s':>7} {'argmax v (cells)':>17} {'|v| px':>7}")
    for d in range(0, P // 2):
        frames = np.empty((a.nframes, P, P), dtype=np.float32)
        bump = (a.amp * synth.halfcos_bump(cx + d, cy, P, P)).astype(np.float32)
        for i in range(a.nframes):
            frames[i] = bump # static: identical every frame
        with torch.no_grad():
            out = m(torch.from_numpy(frames[None]).to(dev))[0, :, 0, 0].cpu().numpy()
        s = sigmoid(out[0]); k = int(np.argmax(out[1:1 + n * n]))
        vx, vy = vxc[k], vyc[k]
        print(f"{d:4d} {s:7.3f} ({vx:+.2f},{vy:+.2f}) {np.hypot(vx, vy):6.3f}")
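The probe above turns the argmax over the flattened velocity channels into a velocity in px/frame. A minimal sketch of that index mapping follows, assuming the same row-major layout (vx varies fastest) and the default 11x11 grid; the helper name `velocity_grid` is illustrative.

```python
import numpy as np

def velocity_grid(vel_radius=5, vel_decimate=4):
    """Map flat velocity-cell index k to (vx, vy) in px/frame; vx varies fastest."""
    n = 2 * vel_radius + 1
    k = np.arange(n * n)
    step = 1.0 / vel_decimate
    vx = (k % n - vel_radius) * step
    vy = (k // n - vel_radius) * step
    return vx, vy

vx, vy = velocity_grid()
center = int(np.argmin(np.hypot(vx, vy)))  # flat index of the zero-velocity cell
print('cells:', len(vx), 'zero-velocity index:', center)
print('first cell (%+.2f,%+.2f), last cell (%+.2f,%+.2f)' % (vx[0], vy[0], vx[-1], vy[-1]))
```

A ghost in the probe output shows up as a high `s` together with an argmax far from the zero-velocity index.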
#!/usr/bin/env python3
# infer_server.py - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
"""DGX remote-inference server for the CUAS RawFCN (stateful, batched). By Claude on 06/20/2026.
Java uploads the SUBAVG+LoG-conditioned (optionally synth-mixed) stack ONCE; the DGX builds the
temporal pyramid (0.5*(now+prev), as Java's temporalAverageLReLU), then serves BATCHED full-res
inference: one INFER does a whole range of scenes of a level (chunked for GPU memory), shift-and-
stitch -> on-GPU GHOSTBUSTER (== CuasDetectRT.dnnGhostbust) -> decode -> returns the full-frame
offset {dx,dy,s,Vx,Vy} + the ROI-only 121-cell softmax*s, per scene. A debug READBACK returns one
conditioned pyramid frame. Runs inside nvcr.io/nvidia/pytorch:25.10-py3.
Protocol (big-endian, matches Java DataInput/OutputStream). Each request: int32 cmd, then:
  UPLOAD  (1): int32 T,H,W ; T*H*W float32 (conditioned stack)
               -> reply: int32 n_levels ; n_levels x int32 frames_per_level ; int32 N ; float64 build_ms
  INFER   (2): int32 level, start, count, stride, roi_x, roi_y, roi_w, roi_h ; float64 rmax_cells ;
               int32 l2_enable, l2_reset ; float64 noise_scale
               (count scenes: newest = start + j*stride, j=0..count-1; rmax_cells<=0 disables ghostbuster;
               noise_scale<=0 -> server fallback sqrt(2)^(level-NOISE_REF_LEVEL))
               -> reply: float64 gpu_ms ; int32 H,W,count,nvel,rh,rw
               count*5*H*W float32 (offset5 dx,dy,s,Vx,Vy, plane-major per scene)
               count*rh*rw*nvel float32 (ROI softmax*s, pixel-major per scene)
  READBACK(3): int32 level, frame -> reply: int32 H,W ; H*W float32
  STATUS  (4): no payload -> reply: int32 len + UTF-8 L1 run dir ; int32 len + UTF-8 L2 run dir (empty = L1-only)
  BYE     (0): close
"""
import argparse, os, socket, struct, time
from datetime import datetime
import numpy as np
import torch
import torch.nn.functional as F
from model import RawFCN
CMD_BYE, CMD_UPLOAD, CMD_INFER, CMD_READBACK = 0, 1, 2, 3
CMD_STATUS = 4 # report loaded L1/L2 model paths so the client can detect a model change. By Claude on 06/24/2026
GPU_CHUNK = 16 # scenes processed per batched GPU pass (memory vs utilization)
VEL_DECIMATE = 4 # velocity-grid cells per px/level-frame (Java curt_vel_decimate); L2 was trained on Vx,Vy in px/frame (cells/4). By Claude on 06/22/2026
AGE_THR = 0.2 # L2 track-age death threshold: a cell with det<=AGE_THR "dies" (age 0). Raised 0.01->0.2 so the
# weak noise halo dies and the 5x5 max-pool can't dilate age across gaps. By Claude on 06/24/2026
AGE_K = 0.5 # ancestor gate: a 5x5 previous-frame neighbor may pass its age only if its det >= AGE_K * (local
# max det in that 5x5) - blocks a weak-but-old straggler from seeding age. By Claude on 06/24/2026
NOISE_REF_LEVEL = 3 # the net is calibrated to ~LEV3's absolute noise (low-contrast signals tested mainly on LEV3).
# The pyramid averages 2 frames/level so sigma drops sqrt(2)/level; scale each level's L1 input by
# sqrt(2)^(level-REF) to put every level at LEV3's absolute noise (uniform FP). By Claude on 06/24/2026
def load_l2(run_dir, device):
    # Optional Layer-2 (track-before-detect) recurrent net; FCN so it runs on any H,W. By Claude on 06/22/2026
    from layer2 import Layer2Net
    ck = torch.load(os.path.join(run_dir, "model.pt"), map_location="cpu", weights_only=False)
    a = ck.get("args", {}) or {}
    m = Layer2Net(ch_in=3, ch_hidden=a.get("ch", 24), grid=a.get("G", 32), vmax=a.get("vmax", 1.4))
    m.load_state_dict(ck["model"]); m.eval().to(device)
    print(f"loaded L2 {run_dir}/model.pt: ch_hidden={a.get('ch',24)} vmax={a.get('vmax',1.4)}", flush=True)
    return m
def load_model(run_dir, device):
    ck = torch.load(os.path.join(run_dir, "model.pt"), map_location="cpu", weights_only=False)
    a = ck.get("args", {}) or {}
    kw = dict(n_frames=a.get("nframes", 8), vel_radius=a.get("vel_radius", 5),
              patch=a.get("patch", 24), velocity_mode=a.get("velocity_mode", "grid"),
              vmax=a.get("vmax", 1.4))
    if a.get("ch") is not None:
        kw["ch"] = tuple(a["ch"])
    m = RawFCN(**kw)
    m.load_state_dict(ck["model"])
    m.eval().to(device)
    print(f"loaded {run_dir}/model.pt: N={kw['n_frames']} patch={kw['patch']} vr={kw['vel_radius']} "
          f"mode={kw['velocity_mode']} out_ch={m.out_ch}", flush=True)
    return m, kw["n_frames"], kw["patch"], kw["vel_radius"]
def recvall(conn, n):
    buf = bytearray()
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("short read")
        buf += chunk
    return bytes(buf)
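The wire format described in the module docstring is plain big-endian framing, so a request can be built and parsed with `struct` and a `>f4` NumPy dtype. A minimal sketch of the UPLOAD request follows, without any socket; the helper names `pack_upload` and `parse_upload` are illustrative assumptions, not functions of the server or the Java client.

```python
import struct
import numpy as np

CMD_UPLOAD = 1

def pack_upload(stack):
    """Client side: int32 cmd, int32 T,H,W, then T*H*W big-endian float32."""
    T, H, W = stack.shape
    header = struct.pack(">iiii", CMD_UPLOAD, T, H, W)
    return header + stack.astype(">f4").tobytes()

def parse_upload(buf):
    """Server side: read the header, then decode the payload to native float32."""
    cmd, T, H, W = struct.unpack(">iiii", buf[:16])
    payload = buf[16:16 + T * H * W * 4]
    data = np.frombuffer(payload, dtype=">f4").astype(np.float32).reshape(T, H, W)
    return cmd, data

stack = np.arange(2 * 3 * 4, dtype=np.float32).reshape(2, 3, 4)
msg = pack_upload(stack)
cmd, back = parse_upload(msg)
print('message bytes:', len(msg), 'cmd:', cmd, 'shape:', back.shape)
```

On a real socket the header and payload would each be read with a loop like `recvall`, since `recv` may return fewer bytes than requested.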
def build_pyramid(log, n_levels_max=8):
    # Replicates Java's pyramid (CuasDetectRT temporalAverageLReLU, linear). log: [T,H,W].
    levels = [0.5 * (log[1:] + log[:-1])] # [T-1,H,W]
    while len(levels) < n_levels_max:
        prev = levels[-1]
        nl = len(prev) // 2 - 1
        if nl < 1:
            break
        idx = torch.arange(nl, device=log.device)
        levels.append(0.5 * (prev[2 * idx + 2] + prev[2 * idx]))
    return levels
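The pyramid recurrence above can be checked in NumPy: each level averages two non-overlapping frames of the previous one, so frame 0 of level L is the mean of the first 2^(L+1) input frames. A minimal sketch follows, assuming random input frames; `build_pyramid_np` is an illustrative NumPy port, not part of the server.

```python
import numpy as np

def build_pyramid_np(frames, n_levels_max=8):
    """NumPy port of the temporal pyramid; frames: [T,H,W]."""
    levels = [0.5 * (frames[1:] + frames[:-1])]        # level 0: adjacent pairs, [T-1,H,W]
    while len(levels) < n_levels_max:
        prev = levels[-1]
        nl = len(prev) // 2 - 1
        if nl < 1:
            break
        idx = np.arange(nl)
        levels.append(0.5 * (prev[2 * idx + 2] + prev[2 * idx]))
    return levels

frames = np.random.default_rng(0).standard_normal((64, 4, 4))
levels = build_pyramid_np(frames)
sizes = [len(level) for level in levels]
print('frames per level:', sizes)
print('level-1 frame 0 equals mean of 4 inputs:',
      np.allclose(levels[1][0], frames[:4].mean(axis=0)))
```

For white noise this doubling of the averaging window per level is what gives the sqrt(2) noise drop per level that the `NOISE_REF_LEVEL` scaling compensates.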
@torch.no_grad()
def shift_stitch(m, x, P, S=4):
    # x: [B,N,H,W] -> full-res field [B,C,H,W] (validated == per-pixel in fp64, dense_check.py).
    B, N, H, W = x.shape
    half = P // 2
    xp = F.pad(x, (half, half, half, half)) # [B,N,H+P,W+P]
    full = torch.zeros(B, m.out_ch, H, W, device=x.device, dtype=x.dtype)
    for sy in range(S):
        for sx in range(S):
            y = m(xp[:, :, sy:, sx:]) # [B,C,oH,oW]
            oh = min(y.shape[2], (H - sy + S - 1) // S)
            ow = min(y.shape[3], (W - sx + S - 1) // S)
            full[:, :, sy::S, sx::S] = y[:, :, :oh, :ow]
    return full # [B,C,H,W]
@torch.no_grad()
def decode(field, vr, roi, rmax_cells):
    # field: [B,C,H,W]. On-GPU ghostbuster (== CuasDetectRT.dnnGhostbust) + decode.
    vdim = 2 * vr + 1
    nvel = vdim * vdim
    B, C, H, W = field.shape
    x0, y0, rw, rh = roi
    s = field[:, 0].sigmoid() # [B,H,W]
    p = field[:, 1:1 + nvel].softmax(1) # [B,nvel,H,W]
    k = torch.arange(nvel, device=field.device)
    cx = (k % vdim - vr).to(field.dtype) # [nvel] vx cell coord
    cy = (k // vdim - vr).to(field.dtype) # [nvel] vy cell coord
    if rmax_cells > 0: # ghostbuster
        corner = (cx * cx + cy * cy) > (rmax_cells * rmax_cells) # [nvel] untrained corner cells
        ghost = corner[p.argmax(1)] # [B,H,W] peak lands in a corner -> whole pixel is a ghost
        p = p * (~corner).to(p.dtype)[None, :, None, None] # zero corner cells everywhere
        keep = (~ghost).to(p.dtype) # [B,H,W]
        p = p * keep[:, None] # zero all cells at ghost pixels
        s = s * keep # s=0 at ghost pixels
    psum = p.sum(1).clamp_min(1e-12) # [B,H,W] (normalize the centroid)
    vx = (p * cx[None, :, None, None]).sum(1) / psum # [B,H,W] velocity centroid (cells)
    vy = (p * cy[None, :, None, None]).sum(1) / psum
    dx, dy = field[:, 1 + nvel], field[:, 1 + nvel + 1] # [B,H,W]
    offset5 = torch.stack([dx, dy, s, vx, vy], 1) # [B,5,H,W] -> -OFFSET
    roi_field = (p * s[:, None])[:, :, y0:y0 + rh, x0:x0 + rw] # [B,nvel,rh,rw] softmax*s over ROI
    return offset5, roi_field.permute(0, 2, 3, 1).contiguous(), nvel # [B,5,H,W], [B,rh,rw,nvel]
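The ghostbuster rule in `decode` can be followed on a single pixel: if the posterior peak falls in a corner cell outside radius `rmax_cells`, the pixel is dropped; otherwise corner cells are zeroed and the velocity is the centroid of what remains. A minimal NumPy sketch follows; the helper name `ghostbust_pixel` and the value `rmax_cells=5.5` are illustrative assumptions, not taken from the Java client.

```python
import numpy as np

def ghostbust_pixel(p, s, vr, rmax_cells):
    """p: [nvel] velocity posterior of one pixel, s: its detection score.
    Returns (s, vx, vy) with the velocity centroid in cells; all zeros for a ghost."""
    vdim = 2 * vr + 1
    k = np.arange(vdim * vdim)
    cx = (k % vdim - vr).astype(float)
    cy = (k // vdim - vr).astype(float)
    corner = (cx * cx + cy * cy) > rmax_cells ** 2   # cells outside the allowed disk
    if corner[np.argmax(p)]:
        return 0.0, 0.0, 0.0                         # peak in a corner: whole pixel is a ghost
    p = np.where(corner, 0.0, p)                     # drop corner mass, keep the rest
    psum = max(p.sum(), 1e-12)
    return s, float((p * cx).sum() / psum), float((p * cy).sum() / psum)

vr, nvel = 5, 121
ghost_p = np.zeros(nvel); ghost_p[120] = 1.0         # peak at cell (+5,+5): r^2 = 50
real_p = np.zeros(nvel); real_p[5 * 11 + 7] = 1.0    # peak at cell (+2, 0)

ghost_out = ghostbust_pixel(ghost_p, 0.9, vr, rmax_cells=5.5)
real_out = ghostbust_pixel(real_p, 0.9, vr, rmax_cells=5.5)
print('ghost pixel ->', ghost_out)
print('real pixel  ->', real_out)
```

The server applies the same rule to every pixel at once with tensor masks instead of the per-pixel branch used here.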
def serve(run_dir, host, port, l2_run=None):
    device = "cuda" if torch.cuda.is_available() else "cpu"
    torch.backends.cudnn.benchmark = True
    torch.set_grad_enabled(False) # inference-only server; L2 recurrence (m2.cell/decode) isn't @no_grad'd. By Claude 06/22/2026
    m, N, P, vr = load_model(run_dir, device)
    m2 = load_l2(l2_run, device) if l2_run else None # optional Layer-2; None -> L1-only (old way). By Claude 06/22/2026
    print(f"device={device} gpu={torch.cuda.get_device_name(0) if device=='cuda' else 'cpu'} "
          f"patch={P} N={N} vr={vr} L2={'on('+l2_run+')' if m2 is not None else 'off'}", flush=True)
    _ = shift_stitch(m, torch.zeros(1, N, 64, 64, device=device), P) # warm-up
    if device == "cuda":
        torch.cuda.synchronize()
    print("warm-up done", flush=True)
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen(4)
    print(f"listening on {host}:{port} (N={N}, batched full-res shift-and-stitch + ghostbuster)", flush=True)
    pyr = None
    while True:
        conn, addr = srv.accept()
        conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        print(f"{datetime.now():%H:%M:%S} client {addr}", flush=True)
        h_l2 = None # Layer-2 recurrent hidden state [1,ch,H,W]; persists across INFER chunks, reset on l2_reset. By Claude 06/22/2026
        age_l2 = None # L2 track-age field [1,1,H,W]; sprev_l2 = previous-frame L2 det; carried+reset like h_l2. By Claude 06/24/2026
        sprev_l2 = None
        try:
            while True:
                cmd = struct.unpack(">i", recvall(conn, 4))[0]
                if cmd == CMD_BYE:
                    break
                if cmd == CMD_STATUS:
                    # Reply with loaded L1 + L2 model paths (len-prefixed UTF-8); empty L2 = L1-only.
                    # Lets the Java client detect a model change and relaunch. By Claude on 06/24/2026
                    b1 = run_dir.encode("utf-8")
                    b2 = (l2_run or "").encode("utf-8")
                    conn.sendall(struct.pack(">i", len(b1)) + b1 + struct.pack(">i", len(b2)) + b2)
                    continue
                if cmd == CMD_UPLOAD:
                    T, H, W = struct.unpack(">iii", recvall(conn, 12))
                    data = recvall(conn, T * H * W * 4)
                    log = torch.from_numpy(np.frombuffer(data, dtype=">f4").astype(np.float32)
                                           .reshape(T, H, W)).to(device)
                    t0 = time.perf_counter()
                    pyr = build_pyramid(log)
                    if device == "cuda":
                        torch.cuda.synchronize()
                    bms = (time.perf_counter() - t0) * 1e3
                    nl = len(pyr)
                    print(f"{datetime.now():%H:%M:%S} UPLOAD T={T} {H}x{W} -> {nl} levels "
                          f"{[len(l) for l in pyr]} build={bms:.1f}ms ({T*H*W*4/1e6:.1f}MB)", flush=True)
                    conn.sendall(struct.pack(">i", nl) + b"".join(struct.pack(">i", len(l)) for l in pyr)
                                 + struct.pack(">id", N, bms))
                elif cmd == CMD_INFER:
                    level, start, count, stride, rx, ry, rw, rh = struct.unpack(">iiiiiiii", recvall(conn, 32))
                    rmax = struct.unpack(">d", recvall(conn, 8))[0]
                    l2_enable, l2_reset = struct.unpack(">ii", recvall(conn, 8)) # By Claude 06/22/2026
                    noise_scale = struct.unpack(">d", recvall(conn, 8))[0] # per-level L1-input noise scale from Java (single source of truth); <=0 -> server fallback. By Claude 06/24/2026
                    use_l2 = bool(l2_enable) and (m2 is not None)
                    # Per-level noise normalization: scale this level's L1 input to LEV3's absolute noise so all
                    # levels sit in the net's trained regime (uniform FP across levels). LEV3 -> 1.0, lower/noisier
                    # levels scale down, higher levels up. Independent of the age filter. By Claude on 06/24/2026
                    if noise_scale <= 0.0: # fallback only: Java didn't send one -> theoretical sqrt(2)^(level-ref)
                        noise_scale = 2.0 ** ((level - NOISE_REF_LEVEL) / 2.0)
                    lev = pyr[level] * noise_scale # [Tl,H,W]
                    H, W = lev.shape[1], lev.shape[2]
                    nvel = (2 * vr + 1) ** 2
                    o5_gpu, rf_gpu = [], []
                    # Time PURE GPU compute (shift-and-stitch + decode), continuous over the whole range -
                    # the production throughput. Results stay on-GPU (prod feeds Layer 2 there); the D2H copy
                    # below is dev-only and NOT timed. By Claude on 06/20/2026
                    if device == "cuda":
                        ev0 = torch.cuda.Event(enable_timing=True); ev1 = torch.cuda.Event(enable_timing=True)
                        torch.cuda.synchronize(); ev0.record()
                    else:
                        t0 = time.perf_counter()
                    for c0 in range(0, count, GPU_CHUNK):
                        b = min(GPU_CHUNK, count - c0)
                        # newest-first windows (channel 0 = newest), matching the Java order
                        wins = torch.stack([lev[(start + (c0 + j) * stride) - N + 1:
                                                (start + (c0 + j) * stride) + 1].flip(0) for j in range(b)]) # [b,N,H,W]
                        field = shift_stitch(m, wins, P) # [b,C,H,W]
                        o5, rf, nv = decode(field, vr, (rx, ry, rw, rh), rmax) # L1: ghostbusted offset5 + ROI
                        nvel = nv
                        if use_l2:
                            # Layer-2 (track-before-detect) over the scene/time axis. Feed the FULL
                            # (non-ghostbusted) field as (s, Vx/vd, Vy/vd) px/level-frame; carry the recurrent
                            # hidden state across chunks (reset on l2_reset at the level's first chunk). Output
                            # replaces offset5 with {L1 dx, L1 dy, L2 det, L2 Vx*vd, L2 Vy*vd} (vel back to cells
                            # so Java's existing /vel_decimate viz scaling -> px/level-frame). By Claude 06/22/2026
                            ong, _, _ = decode(field, vr, (rx, ry, rw, rh), 0.0) # no ghostbuster (L2 gets full field)
                            l2in = torch.stack([ong[:, 2], ong[:, 3] / VEL_DECIMATE, ong[:, 4] / VEL_DECIMATE], 1) # [b,3,H,W]
                            # FPN-bad margins arrive as NaN; the recurrent circular conv would otherwise spread
                            # NaN inward by the kernel radius every frame ("eating" the borders). Sanitize the
                            # input so NaN can never seed/propagate through the hidden state. By Claude 06/22/2026
                            l2in = torch.nan_to_num(l2in, nan=0.0, posinf=0.0, neginf=0.0)
                            Hf, Wf = l2in.shape[2], l2in.shape[3]
                            if (h_l2 is None) or (h_l2.shape[2] != Hf) or (h_l2.shape[3] != Wf) or (l2_reset and c0 == 0):
                                h_l2 = torch.zeros(1, m2.ch_hidden, Hf, Wf, device=device, dtype=field.dtype)
                                age_l2 = torch.zeros(1, 1, Hf, Wf, device=device, dtype=field.dtype) # track age, carried+reset like h_l2
                                sprev_l2 = torch.zeros(1, 1, Hf, Wf, device=device, dtype=field.dtype) # previous-frame L2 det
                            dets, vxs, vys, ages = [], [], [], []
                            for j in range(b): # forward in time, carry hidden + age
                                h_l2 = m2.cell(l2in[j:j+1], h_l2)
                                dlog, vel = m2.decode(h_l2) # [1,1,H,W],[1,2,H,W]
                                s = torch.sigmoid(dlog[:, 0:1]) # [1,1,H,W] current L2 det
                                # AGE (track-before-detect persistence): die where det<=AGE_THR, else 1 + oldest age among
                                # 5x5 PREVIOUS-frame neighbors that are themselves STRONG (det >= AGE_K * local-max det) -
                                # so a weak-but-old straggler can't seed age; the raised AGE_THR stops the noise halo from
# dilating age across gaps. Level-uniform 5x5 (pyramid keeps ~const px/level-frame). By Claude 06/24/2026
maxS = F.max_pool2d(sprev_l2, 5, 1, 2) # local max prev-det in 5x5
elig = (sprev_l2 >= AGE_K * maxS) & (sprev_l2 > AGE_THR) # strong AND alive ancestors
prev = torch.where(elig, age_l2, torch.zeros_like(age_l2)) # only strong ancestors pass age
age_l2 = torch.where(s > AGE_THR, F.max_pool2d(prev, 5, 1, 2) + 1.0, torch.zeros_like(age_l2))
sprev_l2 = s
dets.append(s[:, 0]); ages.append(age_l2[:, 0]); vxs.append(vel[:, 0]); vys.append(vel[:, 1])
l2vx = torch.cat(vxs, 0) * VEL_DECIMATE; l2vy = torch.cat(vys, 0) * VEL_DECIMATE
o5 = torch.stack([o5[:, 0], o5[:, 1], torch.cat(dets, 0), l2vx, l2vy, torch.cat(ages, 0)], 1) # +L2 age (6th); keep L1 dx,dy
o5_gpu.append(o5); rf_gpu.append(rf) # keep on GPU (rf = L1 ROI reference even when L2 on)
if device == "cuda":
ev1.record(); torch.cuda.synchronize(); gms = ev0.elapsed_time(ev1)
else:
gms = (time.perf_counter() - t0) * 1e3
allo = torch.cat(o5_gpu, 0).cpu().numpy().astype(">f4") # [count,nch,H,W] nch=5 (L1) or 6 (L2:+age) D2H untimed
nch = allo.shape[1] # channel count sent in the header (was hardcoded 5)
allr = torch.cat(rf_gpu, 0).cpu().numpy().astype(">f4") # [count,rh,rw,nvel]
print(f"{datetime.now():%H:%M:%S} INFER lev={level} {count} scenes (f{start}..,stride {stride}) "
f"ROI={rw}x{rh} ghost={rmax:.1f} nscale={noise_scale:.3f} L2={'on' if use_l2 else 'off'}{'(reset)' if (use_l2 and l2_reset) else ''} "
f"gpu={gms:.1f}ms ({(allo.nbytes+allr.nbytes)/1e6:.1f}MB out)", flush=True)
conn.sendall(struct.pack(">diiiiiii", gms, H, W, count, nch, nvel, rh, rw)) # +nch (offset channels). By Claude 06/24/2026
conn.sendall(allo.tobytes())
conn.sendall(allr.tobytes())
elif cmd == CMD_READBACK:
level, frame = struct.unpack(">ii", recvall(conn, 8))
fr = pyr[level][frame].cpu().numpy().astype(">f4")
print(f"{datetime.now():%H:%M:%S} READBACK lev={level} f={frame}", flush=True)
conn.sendall(struct.pack(">ii", fr.shape[0], fr.shape[1]) + fr.tobytes())
except (ConnectionError, struct.error, IndexError) as ex:
print(f"client {addr} closed/err: {ex}", flush=True)
finally:
conn.close()
if __name__ == "__main__":
ap = argparse.ArgumentParser()
ap.add_argument("--run", default="runs/weighted9_pm_s")
ap.add_argument("--l2run", default=None, help="optional Layer-2 run dir (model.pt); omit for L1-only")
ap.add_argument("--host", default="0.0.0.0")
ap.add_argument("--port", type=int, default=5577)
args = ap.parse_args()
serve(args.run, args.host, args.port, l2_run=args.l2run)
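The INFER exchange above reads its request fields in several `recvall` calls and answers with one packed header followed by two big-endian float32 payloads. A minimal client-side sketch of that framing follows, using only the field order and struct codes visible in the server code. The helper names (`pack_infer_request`, `unpack_infer_header`, `payload_sizes`) and the sample values are hypothetical, and the command code that precedes the request body is not shown in this section, so it is left out.

```python
import struct

# Formats mirror the server's unpack/pack calls, concatenated in wire order.
# ">" means big-endian with standard sizes and no padding.
INFER_REQ_FMT = ">8id2id"  # level,start,count,stride,rx,ry,rw,rh | rmax | l2_enable,l2_reset | noise_scale
INFER_HDR_FMT = ">d7i"     # gms | H,W,count,nch,nvel,rh,rw


def pack_infer_request(level, start, count, stride, roi, rmax,
                       l2_enable=False, l2_reset=False, noise_scale=0.0):
    """Build the INFER request body (hypothetical helper; command code not included).

    noise_scale <= 0 asks the server to use its own fallback scale.
    """
    rx, ry, rw, rh = roi
    return struct.pack(INFER_REQ_FMT, level, start, count, stride, rx, ry, rw, rh,
                       rmax, int(l2_enable), int(l2_reset), noise_scale)


def unpack_infer_header(buf):
    """Decode the reply header that precedes the two float32 payloads."""
    gms, H, W, count, nch, nvel, rh, rw = struct.unpack(INFER_HDR_FMT, buf)
    return dict(gms=gms, H=H, W=W, count=count, nch=nch, nvel=nvel, rh=rh, rw=rw)


def payload_sizes(hdr):
    """Byte counts of the offset stack [count,nch,H,W] and ROI stack [count,rh,rw,nvel]."""
    off = hdr["count"] * hdr["nch"] * hdr["H"] * hdr["W"] * 4
    roi = hdr["count"] * hdr["rh"] * hdr["rw"] * hdr["nvel"] * 4
    return off, roi


# Illustrative values only.
req = pack_infer_request(3, 8, 16, 1, (0, 0, 32, 32), rmax=2.0)
hdr = unpack_infer_header(struct.pack(INFER_HDR_FMT, 12.5, 128, 160, 16, 5, 121, 32, 32))
```

Reading `nch` from the header rather than assuming 5 lets a client handle both the L1-only reply and the 6-channel reply that carries the L2 age.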
# l1_samples.py - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
"""Multi-sample L1-output viewer, 2x2-tiled (64x64) only. By Claude on 06/23/2026
Runs frozen L1 over several synthetic gap-runs and writes 2x2-tiled stacks so Andrey can scrub L1
behavior across samples in Fiji (seam through the center cross => torus continuity visible; no need
for the single 32x32). Per quantity, pages = nsamples*T concatenated. Channels: input, L1 s, truth
marker, signal (1=bump rendered / 0=gap). The point: SEE how L1's s-field clears noise near the
target when present, and how noise returns in a gap (coast-a-gap = higher noise)."""
import argparse
import numpy as np
import torch
import synth
import layer2_data as L1D
def main():
ap = argparse.ArgumentParser()
ap.add_argument("--l1", default="runs/weighted9_pm/model.pt")
ap.add_argument("--T", type=int, default=64); ap.add_argument("--G", type=int, default=32)
ap.add_argument("--nsamples", type=int, default=6)
ap.add_argument("--vmax", type=float, default=1.4); ap.add_argument("--snr", type=float, default=6.0)
ap.add_argument("--gaps", action="store_true")
ap.add_argument("--bp_lo", type=int, default=3); ap.add_argument("--bp_hi", type=int, default=9)
ap.add_argument("--duty_offset", type=float, default=-0.3); ap.add_argument("--starter_len", type=int, default=8)
ap.add_argument("--out", default="runs/l1_samples")
a = ap.parse_args()
import os; os.makedirs(a.out, exist_ok=True)
dev = "cuda" if torch.cuda.is_available() else "cpu"
net, N, meta = L1D._load_l1(a.l1, dev)
G, T = a.G, a.T
rkw = dict(gaps=a.gaps, bp_lo=a.bp_lo, bp_hi=a.bp_hi, duty_offset=a.duty_offset, starter_len=a.starter_len)
def tile2x2(st): # [T,G,G] -> [T,2G,2G]
return np.tile(st, (1, 2, 2))
inputs, sfields, truths, signals = [], [], [], []
print(f"L1 {a.l1} N={N}; {a.nsamples} samples T={T} gaps={a.gaps}", flush=True)
for k in range(a.nsamples):
rng = np.random.default_rng(100 + k) # distinct, reproducible per sample
frames, pos, vel, present, signal = L1D.render_run(rng, T=T, G=G, vmax=vmax_(a), snr=a.snr,
return_signal=True, **rkw)
seq = L1D.gen_field_sequence(net, frames, pos, G, N, dev) # [T,3,G,G]
truth = np.zeros((T, G, G), np.float32)
for t in range(T):
if present[t]:
truth[t] = L1D.halfcos_bump_torus(pos[t, 0], pos[t, 1], G)
inputs.append(tile2x2(frames))
sfields.append(tile2x2(seq[:, 0]))
truths.append(tile2x2(truth))
signals.append(tile2x2(np.broadcast_to(signal[:, None, None], (T, G, G)).astype(np.float32)))
ng = int((present > 0).sum()); ngap = int(((present > 0) & (signal < 0.5)).sum())
print(f" sample {k}: present {ng}/{T}, gap {ngap}", flush=True)
# concatenate samples along the page axis -> one scrubbable stack per quantity
synth.save_tiff_stack(np.concatenate(inputs, 0), f"{a.out}/input_2x2.tif")
synth.save_tiff_stack(np.concatenate(sfields, 0), f"{a.out}/s_2x2.tif")
synth.save_tiff_stack(np.concatenate(truths, 0), f"{a.out}/truth_2x2.tif")
synth.save_tiff_stack(np.concatenate(signals, 0), f"{a.out}/signal_2x2.tif")
print(f"wrote {a.out}/{{input,s,truth,signal}}_2x2.tif "
f"({a.nsamples}x{T}={a.nsamples*T} pages, {2*G}x{2*G}, 32-bit; sample k = pages [k*{T},(k+1)*{T}))", flush=True)
def vmax_(a):
return a.vmax
if __name__ == "__main__":
main()
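The viewer above relies on `tile2x2` to make torus continuity visible: each G x G frame is repeated twice along both axes, so the wrap seam ends up in the middle of the tiled image. A small sketch of that property, assuming only NumPy; the toy sizes `T=3, G=4` are illustrative and not taken from the script.

```python
import numpy as np


def tile2x2(st):
    """[T,G,G] -> [T,2G,2G]: repeat every frame twice along y and x."""
    return np.tile(st, (1, 2, 2))


T, G = 3, 4  # toy sizes, for illustration only
st = np.arange(T * G * G, dtype=np.float32).reshape(T, G, G)
tiled = tile2x2(st)

# Every tiled pixel (y, x) equals the source pixel (y % G, x % G), i.e. the
# tiled stack is the periodic extension of the torus frame.
yy, xx = np.meshgrid(np.arange(2 * G), np.arange(2 * G), indexing="ij")
periodic = np.array_equal(tiled, st[:, yy % G, xx % G])
```

Because the tiling is an exact periodic extension, a discontinuity along the center cross of the tiled stack indicates a seam problem in the data itself.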
# l2_fp_analysis.py - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
"""L2 false-positive + P_d analysis vs UAS flight-log truth, per pyramid level. By Claude on 06/24/2026.
PLUMBING/CORRECTNESS first pass (not final numbers). For one sequence dir (the center's vNNN dir), it:
- finds every *-OFFSET-<model>.tiff (level from "-LEVn-"; an untagged legacy file = level 0),
- reads only the s-channel pages (label starts with "s:") one at a time (memory-safe; LEV0 is ~3GB),
- matches each frame to the UAS truth in *-UAS_DATA.tsv by timestamp,
- applies the clean-sky ROI geometry (see project_l2_fp_measurement), counts FP "blobs" (local maxima
of s above threshold, numpy-only) per pixel-hectare in the HIGH/LOW sky zones, excluding a disk around
the UAS truth on IN-FoV frames, and scores P_d (UAS detected within that disk),
- sweeps the s-threshold and prints a per-level table.
UAS not always on screen is handled by the TSV `status` (IN FoV / OUT OF FoV / no entry): P_d only on IN-FoV
frames; other frames contribute to FP only. A sequence with no IN-FoV frames -> pure FP (P_d = n/a).
Run: /home/elphel/.venvs/c5p/bin/python l2_fp_analysis.py --dir <.../center/vNNN> [--model mexhat_gaps_boost40]
"""
import argparse, glob, os, re, numpy as np, tifffile
# --- clean-sky FP geometry (per-sequence; the 620->560m UAS clip family) -----------------------------
ROI = (42, 45, 555, 270) # clean ROI x,y,w,h -> x in [42,597), y in [45,315)
SIGN = (27, 200, 123, 135) # bomb-sign + rotation artefacts exclusion x,y,w,h (right edge 150: artefacts reach x=149). By Claude 06/25/2026
SKY_SPLIT = 230 # high sky y<230, low sky y>=230
HECTARE = 100.0 * 100.0 # 10000 px
def build_zone_masks(H, W):
base = np.zeros((H, W), bool)
x0, y0, w, h = ROI
base[y0:y0 + h, x0:x0 + w] = True
sx, sy, sw, sh = SIGN
base[sy:sy + sh, sx:sx + sw] = False
high = base.copy(); high[SKY_SPLIT:, :] = False
low = base.copy(); low[:SKY_SPLIT, :] = False
return high, low
def local_maxima(s, thr):
"""Boolean map of 3x3 local maxima with s > thr. The test is non-strict (s >= neighborhood max), so an isolated peak gets one mark, while a flat plateau marks every tied pixel. numpy-only."""
from numpy.lib.stride_tricks import sliding_window_view
sp = np.pad(s, 1, mode="constant", constant_values=-np.inf)
nmax = sliding_window_view(sp, (3, 3)).max(axis=(2, 3)) # 3x3 neighborhood max
return (s >= nmax) & (s > thr)
def norm_ts(label):
"""'s:1773135520_851518-0 f8' -> '1773135520.851518'; a bare '1773135519.534413' is returned unchanged (bare ts string)."""
t = label.replace("_", ".")
m = re.search(r"\d{5,10}\.\d{6}", t)
return m.group(0) if m else t
def load_truth(tsv_path):
"""ts (float, rounded to 6 decimals) -> (status, px, py). Only rows with a parseable ts."""
truth = {}
with open(tsv_path) as f:
f.readline() # skip the header row
for line in f:
c = line.rstrip("\n").split("\t")
if len(c) < 5:
continue
ts = norm_ts(c[1])
try:
key = round(float(ts), 6)
except ValueError:
continue
status = c[2]
px = float(c[3]) if c[3] else np.nan
py = float(c[4]) if c[4] else np.nan
truth[key] = (status, px, py)
return truth
def level_of(path):
m = re.search(r"-LEV(\d+)-", os.path.basename(path))
return int(m.group(1)) if m else 0 # untagged legacy file = level 0
def analyze_level(tiff_path, truth, thresholds, disk_r, dbg=False):
"""Return per-threshold dict of accumulated FP blobs / valid hectares / P_d counts."""
disk_r2 = disk_r * disk_r
acc = {t: dict(fp_hi=0, fp_lo=0, ha_hi=0.0, ha_lo=0.0, pd_hit=0, pd_tot=0) for t in thresholds}
matched = 0
unmatched = 0
with tifffile.TiffFile(tiff_path) as tf:
labels = (tf.imagej_metadata or {}).get("Labels") or []
H, W = tf.pages[0].shape
high0, low0 = build_zone_masks(H, W)
yy, xx = np.ogrid[:H, :W]
for i, page in enumerate(tf.pages):
lab = labels[i] if i < len(labels) else ""
if not lab.startswith("s:"): # s-channel pages only
continue
key = None
try:
key = round(float(norm_ts(lab)), 6)
except ValueError:
pass
tr = truth.get(key)
if tr is None:
unmatched += 1
else:
matched += 1
s = np.asarray(page.asarray(), dtype=np.float32)
valid = np.isfinite(s)
s = np.where(valid, s, -np.inf)
# target disk (only on IN-FoV frames)
in_fov = (tr is not None) and (tr[0] == "IN FoV") and np.isfinite(tr[1]) and np.isfinite(tr[2])
if in_fov:
px, py = tr[1], tr[2]
disk = ((xx - px) ** 2 + (yy - py) ** 2) <= disk_r2
else:
disk = np.zeros((H, W), bool)
hi = high0 & valid
lo = low0 & valid
for t in thresholds:
peaks = local_maxima(s, t)
# FP = peaks in zone, outside target disk
acc[t]["fp_hi"] += int(np.count_nonzero(peaks & hi & ~disk))
acc[t]["fp_lo"] += int(np.count_nonzero(peaks & lo & ~disk))
acc[t]["ha_hi"] += np.count_nonzero(hi) / HECTARE
acc[t]["ha_lo"] += np.count_nonzero(lo) / HECTARE
if in_fov:
acc[t]["pd_tot"] += 1
if np.any((s > t) & disk):
acc[t]["pd_hit"] += 1
return acc, matched, unmatched
def main():
ap = argparse.ArgumentParser()
ap.add_argument("--dir", required=True, help="center vNNN dir with -OFFSET tiffs + -UAS_DATA.tsv")
ap.add_argument("--model", default="mexhat_gaps_boost40")
ap.add_argument("--disk", type=float, default=6.0, help="target-disk radius (px) for P_d / FP exclusion")
ap.add_argument("--thr", default="0.1,0.15,0.2,0.25,0.3,0.35,0.4,0.45,0.5,0.6")
a = ap.parse_args()
thresholds = [float(x) for x in a.thr.split(",")]
tsvs = glob.glob(os.path.join(a.dir, "*-UAS_DATA.tsv"))
truth = load_truth(tsvs[0]) if tsvs else {}
print(f"truth: {'<none>' if not tsvs else os.path.basename(tsvs[0])} rows={len(truth)} "
f"IN-FoV={sum(1 for v in truth.values() if v[0]=='IN FoV')}")
tiffs = sorted(glob.glob(os.path.join(a.dir, f"*-OFFSET-{a.model}.tiff")), key=level_of)
if not tiffs:
raise SystemExit(f"no *-OFFSET-{a.model}.tiff in {a.dir}")
seq = os.path.basename(tiffs[0]).split("-SUBAVG")[0] # center name, for cross-sequence concat
rows = [] # reusable summary rows (raw counts + derived), one per (level, threshold)
for f in tiffs:
lev = level_of(f)
acc, matched, unmatched = analyze_level(f, truth, thresholds, a.disk)
print(f"\n=== LEV{lev} (frames matched={matched} unmatched={unmatched}) {os.path.basename(f)[:48]}... ===")
print(f"{'thr':>5} {'FP/ha_hi':>9} {'FP/ha_lo':>9} {'P_d':>6}")
for t in thresholds:
d = acc[t]
fph = d["fp_hi"] / d["ha_hi"] if d["ha_hi"] > 0 else float("nan")
fpl = d["fp_lo"] / d["ha_lo"] if d["ha_lo"] > 0 else float("nan")
pd = (d["pd_hit"] / d["pd_tot"]) if d["pd_tot"] > 0 else float("nan")
print(f"{t:5.2f} {fph:9.3f} {fpl:9.3f} {pd:6.2f}")
rows.append((seq, lev, t, d["fp_hi"], d["ha_hi"], fph, d["fp_lo"], d["ha_lo"], fpl,
d["pd_hit"], d["pd_tot"], pd, matched, unmatched))
# reusable summary CSV in the same dir (raw counts kept so densities/P_d can be re-aggregated
# across sequences without re-reading the tiffs). By Claude on 06/24/2026
out = os.path.join(a.dir, f"{seq}-L2FP-{a.model}.csv")
with open(out, "w") as fo:
fo.write(f"# L2 FP/P_d summary model={a.model} disk_r={a.disk}px "
f"ROI={ROI} SIGN={SIGN} sky_split_y={SKY_SPLIT} hectare={int(HECTARE)}px\n")
fo.write("seq,level,thr,fp_hi,ha_hi,fp_per_ha_hi,fp_lo,ha_lo,fp_per_ha_lo,pd_hit,pd_tot,pd,matched,unmatched\n")
for r in rows:
fo.write(",".join(str(x) for x in r) + "\n")
print(f"\nsummary -> {out}")
if __name__ == "__main__":
main()
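The FP metric above counts local maxima of the s-map inside a sky zone and divides by the zone area in pixel-hectares (10000 px). A self-contained sketch of that count on a toy map, assuming NumPy 1.20 or later for `sliding_window_view`. The peak positions, the all-true zone mask, and the 0.3 threshold are made up for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def local_maxima(s, thr):
    """3x3 local maxima above thr (non-strict: tied plateau pixels all mark)."""
    sp = np.pad(s, 1, mode="constant", constant_values=-np.inf)
    nmax = sliding_window_view(sp, (3, 3)).max(axis=(2, 3))
    return (s >= nmax) & (s > thr)


# Toy s-map: two blobs above threshold, one below.
s = np.zeros((100, 200), np.float32)
s[20, 30] = 0.9
s[21, 30] = 0.5   # shoulder of the same blob, not a separate peak
s[60, 150] = 0.4
s[80, 10] = 0.2   # below threshold

zone = np.ones_like(s, dtype=bool)        # illustrative zone: whole frame
peaks = local_maxima(s, 0.3)
n_fp = int(np.count_nonzero(peaks & zone))
area_ha = np.count_nonzero(zone) / 10000.0  # pixel-hectares
fp_per_ha = n_fp / area_ha
```

Keeping the raw count and the area separate, as the summary CSV does, lets densities be re-aggregated across sequences by summing counts and areas before dividing.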
# layer2.py - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
"""C5P Layer-2 (track-before-detect) — minimal circular-ConvGRU on a torus. By Claude on 06/21/2026
Layer 1 (frozen RawFCN) emits, per level-frame, a dense stride-4 field {s, Vx, Vy, dx, dy}.
Layer 2 is a RECURRENT net whose hidden state is the running 4D track memory (x, y, vx, vy),
fed a target-following 32x32 slice of that field one frame at a time. This first cut is the
SIMPLEST viable version (per Andrey 06/21):
- plain circular ConvGRU (NO explicit velocity-advection warp yet — added as a 2nd step;
the conv recurrence still learns local motion implicitly),
- dense Gaussian-bump readout (det map + Vx,Vy maps; supervise with a bump at truth),
- single target, free-orbit (absolute position = torus-local + winding offset, tracked
OUTSIDE the net; not needed for this module's forward/backward).
Torus rationale: xy is a PERIODIC 32x32 grid (Conv2d padding_mode='circular'). With the target
drift over a window staying << 32 cells, the single target "lives in infinite space" on a tiny
fixed array — no border code, translation-equivariant everywhere, trivial to batch. vx,vy are
NOT periodic (bounded by vmax; velocity does not wrap).
UNITS: the field grid is stride-4, so one torus cell = 4 scene px. Vx,Vy channels and the
velocity readout are kept in Layer-1 units (px/level-frame); vmax≈1.4 px/frame => ~0.35 cells/
frame => ~2.8 cells over N=8 (<< 32, the R<<G condition the torus relies on). The /4 conversion
to cells only matters once we add the advection warp.
Run the smoke test: python layer2.py
"""
import argparse
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
# ---------------------------------------------------------------------------
# Recurrent cell
# ---------------------------------------------------------------------------
class ConvGRUCellTorus(nn.Module):
"""One ConvGRU step with circular (toroidal) padding on the xy grid. By Claude on 06/21/2026
Standard ConvGRU:
z = sigmoid(Wz . [x, h]) update gate [B, Ch, G, G]
r = sigmoid(Wr . [x, h]) reset gate [B, Ch, G, G]
n = tanh (Wn . [x, r*h]) candidate state [B, Ch, G, G]
h'= (1 - z) * h + z * n new hidden [B, Ch, G, G]
All convs are k x k with padding_mode='circular' so the 32x32 grid wraps both axes.
"""
def __init__(self, ch_in, ch_hidden, k=3):
super().__init__()
pad = k // 2
cat = ch_in + ch_hidden # concat of input + hidden along channels
# one conv per gate; circular pad makes the receptive field wrap the torus edges
self.conv_z = nn.Conv2d(cat, ch_hidden, k, padding=pad, padding_mode='circular')
self.conv_r = nn.Conv2d(cat, ch_hidden, k, padding=pad, padding_mode='circular')
self.conv_n = nn.Conv2d(cat, ch_hidden, k, padding=pad, padding_mode='circular')
def forward(self, x, h):
# x: [B, Cin, G, G] h: [B, Ch, G, G] -> h_new: [B, Ch, G, G]
xh = torch.cat([x, h], dim=1) # [B, Cin+Ch, G, G]
z = torch.sigmoid(self.conv_z(xh)) # [B, Ch, G, G] update gate
r = torch.sigmoid(self.conv_r(xh)) # [B, Ch, G, G] reset gate
xrh = torch.cat([x, r * h], dim=1) # [B, Cin+Ch, G, G] reset-masked hidden
n = torch.tanh(self.conv_n(xrh)) # [B, Ch, G, G] candidate
return (1.0 - z) * h + z * n # [B, Ch, G, G] new hidden
# ---------------------------------------------------------------------------
# Layer-2 net
# ---------------------------------------------------------------------------
class Layer2Net(nn.Module):
"""Recurrent track-before-detect over a torus field sequence. By Claude on 06/21/2026
forward(seq) consumes T frames of the Layer-1 field slice and returns, per frame, a dense
det logit + (Vx,Vy) over the torus. Hidden state starts at 0 (no track) and accumulates
evidence across frames — the recurrence IS the track filter.
"""
def __init__(self, ch_in=3, ch_hidden=24, grid=32, vmax=1.4, k=3):
super().__init__()
self.ch_in = ch_in # field channels fed in: s, Vx, Vy
self.ch_hidden = ch_hidden # hidden track-memory channels
self.grid = grid # torus side G (cells); one cell = 4 scene px
self.vmax = vmax # velocity readout bound, px/level-frame (matches Layer-1 training vmax)
self.cell = ConvGRUCellTorus(ch_in, ch_hidden, k=k)
# readout head: hidden -> det(1) + raw Vx,Vy(2); 1x1 conv = per-cell decode
self.head = nn.Conv2d(ch_hidden, 1 + 2, 1)
def init_hidden(self, B, device, dtype):
# zero hidden = "no track yet"; [B, Ch, G, G]
return torch.zeros(B, self.ch_hidden, self.grid, self.grid, device=device, dtype=dtype)
def decode(self, h):
# h: [B, Ch, G, G] -> det_logit [B, 1, G, G], vel [B, 2, G, G] bounded to +-vmax
o = self.head(h) # [B, 3, G, G]
det = o[:, 0:1] # [B, 1, G, G] raw logit
vel = self.vmax * torch.tanh(o[:, 1:3]) # [B, 2, G, G] px/level-frame
return det, vel
def forward(self, seq, h=None):
# seq: [B, T, Cin, G, G] -> det [B, T, 1, G, G], vel [B, T, 2, G, G]
B, T = seq.shape[0], seq.shape[1]
if h is None:
h = self.init_hidden(B, seq.device, seq.dtype)
dets, vels = [], []
for t in range(T): # BPTT unrolls this loop
h = self.cell(seq[:, t], h) # [B, Ch, G, G] recurrent update
det, vel = self.decode(h) # per-frame readout
dets.append(det)
vels.append(vel)
det = torch.stack(dets, dim=1) # [B, T, 1, G, G]
vel = torch.stack(vels, dim=1) # [B, T, 2, G, G]
return det, vel
# ---------------------------------------------------------------------------
# Dense Gaussian-bump supervision (single target)
# ---------------------------------------------------------------------------
def bump_target(pos_xy, grid, sigma=1.0, device="cpu"):
"""Toroidal Gaussian bump at (sub-cell) position pos_xy. By Claude on 06/21/2026
pos_xy: [B, T, 2] (x, y) in torus cells (may be fractional / out of [0,G) — wraps).
Returns det bump [B, T, 1, G, G] in [0,1]. Distance uses the WRAPPED (toroidal) metric so
a target near the edge still gets a single round bump that straddles the seam.
"""
B, T = pos_xy.shape[0], pos_xy.shape[1]
coord = torch.arange(grid, device=device).float() # [G]
gy, gx = torch.meshgrid(coord, coord, indexing='ij') # [G, G] each
gx = gx[None, None]; gy = gy[None, None] # [1,1,G,G] broadcast over B,T
px = pos_xy[..., 0][..., None, None] # [B, T, 1, 1]
py = pos_xy[..., 1][..., None, None] # [B, T, 1, 1]
# wrapped (toroidal) coordinate difference: nearest image around the G-periodic grid
dx = (gx - px + grid / 2) % grid - grid / 2 # [B, T, G, G] in [-G/2, G/2)
dy = (gy - py + grid / 2) % grid - grid / 2
g = torch.exp(-(dx * dx + dy * dy) / (2 * sigma * sigma)) # [B, T, G, G]
return g[:, :, None] # [B, T, 1, G, G]
def layer2_loss(det_logit, vel, det_t, vel_t, support=0.3, pos_weight=20.0):
"""Detection BCE (sparse bump -> pos_weight) + velocity MSE on the bump support. By Claude 06/21
det_logit: [B,T,1,G,G] raw det_t: [B,T,1,G,G] in [0,1]
vel: [B,T,2,G,G] vel_t: [B,T,2,G,G] (px/level-frame; only used where det_t>support)
"""
pw = torch.tensor(pos_weight, device=det_logit.device)
l_det = F.binary_cross_entropy_with_logits(det_logit, det_t, pos_weight=pw)
m = (det_t > support) # [B,T,1,G,G] bump core mask
if m.any():
m2 = m.expand_as(vel) # [B,T,2,G,G]
l_vel = F.mse_loss(vel[m2], vel_t[m2])
else:
l_vel = vel.sum() * 0.0
return l_det + 0.3 * l_vel, {"det": float(l_det.detach()), "vel": float(l_vel.detach())}
# ---------------------------------------------------------------------------
# Smoke test: fake Layer-1-like field, single target on a wrapping straight line.
# Verifies the module trains end-to-end (forward + BPTT + loss) BEFORE real Layer-1 fields.
# This is NOT the real training data — that comes in the next step (trajectory-sequence gen).
# ---------------------------------------------------------------------------
def fake_field_batch(rng, B, T, grid, vmax, sigma=1.0, snr=4.0, device="cpu"):
"""Build a toy 'Layer-1 field' sequence + truth. By Claude on 06/21/2026
A single target starts at a random torus cell, moves at constant (vx,vy) px/frame
(=> (vx,vy)/4 cells/frame), wrapping. The s-channel is a noisy Gaussian bump at the target;
Vx,Vy channels carry the true velocity over the bump (+ noise), 0 elsewhere. Returns:
seq [B,T,3,G,G] (s, Vx, Vy)
pos [B,T,2] target (x,y) in cells
veltru [B,T,2] true (Vx,Vy) px/level-frame
"""
seq = torch.zeros(B, T, 3, grid, grid, device=device)
pos = torch.zeros(B, T, 2, device=device)
veltru = torch.zeros(B, T, 2, device=device)
for b in range(B):
x0 = rng.uniform(0, grid); y0 = rng.uniform(0, grid)
ang = rng.uniform(0, 2 * np.pi); spd = rng.uniform(0.3, 1.0) * vmax
vx = spd * np.cos(ang); vy = spd * np.sin(ang) # px/level-frame
for t in range(T):
cx = (x0 + vx / 4.0 * t) # cells (stride-4 => /4)
cy = (y0 + vy / 4.0 * t)
pos[b, t, 0] = cx % grid; pos[b, t, 1] = cy % grid
veltru[b, t, 0] = vx; veltru[b, t, 1] = vy
# s channel: noisy toroidal bump at the target; vel channels: truth over the bump
bump = bump_target(pos[b:b+1], grid, sigma, device) # [1,T,1,G,G]
bump = bump[0, :, 0] # [T,G,G]
noise = torch.from_numpy(rng.standard_normal((T, grid, grid)).astype(np.float32)).to(device)
seq[b, :, 0] = (snr * bump + noise).clamp(min=0.0) # s >= 0, SNR-scaled signal in noise
core = (bump > 0.3).float() # [T,G,G]
seq[b, :, 1] = vx * core; seq[b, :, 2] = vy * core
return seq, pos, veltru
def smoke_test(steps=400, B=16, T=8, grid=32, vmax=1.4, device=None):
"""Overfit the toy generator a few hundred steps; det peak should sharpen, vel MSE drop."""
device = device or ("cuda" if torch.cuda.is_available() else "cpu")
rng = np.random.default_rng(0)
net = Layer2Net(ch_in=3, ch_hidden=24, grid=grid, vmax=vmax).to(device)
opt = torch.optim.Adam(net.parameters(), 2e-3)
nparams = sum(p.numel() for p in net.parameters())
print(f"Layer2Net: {nparams} params, grid={grid}, ch_hidden=24, device={device}", flush=True)
for step in range(1, steps + 1):
seq, pos, veltru = fake_field_batch(rng, B, T, grid, vmax, device=device)
det_t = bump_target(pos, grid, sigma=1.0, device=device) # [B,T,1,G,G]
vel_t = torch.zeros(B, T, 2, grid, grid, device=device)
core = (det_t[:, :, 0] > 0.3) # [B,T,G,G]
for c in range(2):
vel_t[:, :, c][core] = veltru[..., c][..., None, None].expand(-1, -1, grid, grid)[core]
det_logit, vel = net(seq)
loss, comp = layer2_loss(det_logit, vel, det_t, vel_t)
opt.zero_grad(); loss.backward(); opt.step()
if step % 50 == 0 or step == 1:
with torch.no_grad():
p = torch.sigmoid(det_logit)
peak = float(p[det_t > 0.3].mean())
bg = float(p[det_t < 0.05].max())
print(f"step {step:4d} det {comp['det']:.4f} vel {comp['vel']:.4f} "
f"peak(s@truth) {peak:.3f} max-bg {bg:.3f}", flush=True)
print("smoke test done.", flush=True)
if __name__ == "__main__":
ap = argparse.ArgumentParser()
ap.add_argument("--steps", type=int, default=400)
ap.add_argument("--grid", type=int, default=32)
ap.add_argument("--vmax", type=float, default=1.4)
a = ap.parse_args()
smoke_test(steps=a.steps, grid=a.grid, vmax=a.vmax)
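The bump supervision in this module depends on the wrapped (toroidal) coordinate difference `(g - p + G/2) % G - G/2`. A one-dimensional NumPy sketch of that expression follows, showing that a target sitting on the seam produces a single bump straddling it and that positions are equivalent modulo G. The helper names `wrapped_delta` and `bump_1d` are hypothetical and only restate the formula used in `bump_target`.

```python
import numpy as np


def wrapped_delta(coord, pos, G):
    """Nearest-image difference on a G-periodic axis; values lie in [-G/2, G/2)."""
    return (coord - pos + G / 2) % G - G / 2


def bump_1d(pos, G, sigma=1.0):
    """Gaussian bump on a G-periodic axis, centered at (possibly fractional) pos."""
    d = wrapped_delta(np.arange(G, dtype=np.float64), pos, G)
    return np.exp(-d * d / (2 * sigma * sigma))


G = 32
seam = bump_1d(31.5, G)        # target halfway between cell 31 and cell 0
inside = bump_1d(3.25, G)
wrapped = bump_1d(3.25 + G, G)  # same target after one full winding
```

Cells 31 and 0 are each half a cell from the target, so they receive equal weight, and `wrapped` matches `inside`. This is why the absolute winding offset can be tracked outside the net.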
# layer2_data.py - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
"""C5P Layer-2 v1 training-data generator + torus border-crossing verifier. By Claude on 06/21/2026
v1 = absolute minimum, to debug the infrastructure and the TORUS BORDER CROSSING (Andrey 06/21):
- one long, steady, STRAIGHT-LINE run; strong target; fade-in onset; single class;
- periodic 32x32-px scene torus (target position mod 32) + iid Gaussian noise;
- frozen Layer-1 (weighted9_pm, patch=24, N=9, GRID mode) run DENSELY at full resolution
(stride-1) via circular-unfold -> a 32x32 wrapped field {s, Vx, Vy} per level-frame;
- L2 ingests the full 32x32 field sequence (free-orbit; absolute pos = truth-only).
Full-res rationale (vs stride-4 8x8): pixel-grid == torus-grid, so the truth bump and readout
sit at the actual pixel — no stride-4 / dx,dy reconstruction to get wrong while we verify the
seam. Cost is trivial at 32x32. Torus >= L1 patch (32 >= 24) so a wrap-slice holds the target
ONCE (no L1 self-alias).
L1 dense eval: circular-pad the periodic 32x32 scene by (12,11) per axis, unfold 24x24 stride-1
-> 32x32 patches; output pixel j <-> target at scene index j (matches training cx0=P/2=12, so no
half-pixel offset). Grid head decoded: s=sigmoid(det); (Vx,Vy)=softmax-centroid of the 11x11
velocity grid / vel_decimate (px/level-frame).
Run the verifier: python layer2_data.py --l1 runs/weighted9_pm/model.pt
"""
import argparse
import numpy as np
import torch
import torch.nn.functional as F
import synth
from model import RawFCN
VEL_RADIUS = 5 # weighted9_pm: 11x11 velocity grid
VEL_DECIMATE = 4 # 4 cells = 1 px/level-frame
L1_PATCH = 24
L1_HALF = L1_PATCH // 2 # 12; patch for output j spans original [j-12, j+11]
# ---------------------------------------------------------------------------
# Periodic scene rendering (numpy)
# ---------------------------------------------------------------------------
def halfcos_bump_torus(cx, cy, G):
"""Half-cosine bump centered at (cx,cy) on a G x G TORUS (wraps both axes). By Claude 06/21
Separable cos(pi/3|d|) for |d|<1.5 (same shape as synth.halfcos_bump), but distance uses the
wrapped (toroidal) metric so a bump near the seam straddles it as one round bump."""
xs = np.arange(G)[None, :] - cx
ys = np.arange(G)[:, None] - cy
dx = (xs + G / 2) % G - G / 2 # wrapped delta in [-G/2, G/2)
dy = (ys + G / 2) % G - G / 2
bx = np.where(np.abs(dx) < 1.5, np.cos(np.pi / 3.0 * np.abs(dx)), 0.0)
by = np.where(np.abs(dy) < 1.5, np.cos(np.pi / 3.0 * np.abs(dy)), 0.0)
return (bx * by).astype(np.float32) # [G,G]
def gauss_blob_torus(cx, cy, G, sigma):
    """Broad isotropic Gaussian blob on a G x G TORUS (wraps both axes). By Claude 06/22
    Used for PERSISTENT CLUTTER: a wider-than-target scene feature that L1 lights up as a standing
    low-freq cloud. Distance uses the wrapped (toroidal) metric (round blob across the seam)."""
    xs = np.arange(G)[None, :] - cx
    ys = np.arange(G)[:, None] - cy
    dx = (xs + G / 2) % G - G / 2
    dy = (ys + G / 2) % G - G / 2
    return np.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma)).astype(np.float32) # [G,G]
def _sample_velocity(rng, vmax):
    """One (vx,vy) on the annulus 0.3*vmax < |v| <= vmax (px/level-frame). By Claude 06/22
    Rejection sampling from the bounding square; falls back to (0.7*vmax, 0) after 50 misses."""
    for _ in range(50):
        vx = rng.uniform(-vmax, vmax); vy = rng.uniform(-vmax, vmax)
        r2 = vx * vx + vy * vy
        if (0.3 * vmax) ** 2 < r2 <= vmax * vmax:
            return float(vx), float(vy)
    return 0.7 * vmax, 0.0
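# Illustration (not part of the original module): a quick check of how often the rejection test in
# _sample_velocity accepts a proposal. The expected rate is the annulus area over the square area,
# pi * (1 - 0.3^2) / 4; the sample size and seed below are arbitrary choices.

```python
import numpy as np

vmax = 1.4
rng = np.random.default_rng(0)
v = rng.uniform(-vmax, vmax, size=(200_000, 2))         # proposals from the bounding square
r2 = (v ** 2).sum(axis=1)
accept = ((0.3 * vmax) ** 2 < r2) & (r2 <= vmax ** 2)   # same test as _sample_velocity
expected = np.pi * (1.0 - 0.3 ** 2) / 4.0               # annulus area / square area
p_fallback = (1.0 - expected) ** 50                     # chance that all 50 draws miss
print(f"acceptance {accept.mean():.3f} vs {expected:.3f}; fallback probability {p_fallback:.1e}")
```

# With roughly 7 in 10 proposals accepted, the deterministic fallback after 50 misses is effectively
# never reached.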
def _bandpass_envelope(rng, T, bp_lo, bp_hi, offset):
    """Per-frame target amplitude multiplier from a limited band-pass filter. By Claude on 06/23/2026
    white noise -> band-pass [bp_lo,bp_hi] cyc/seq -> unit-std -> +offset -> ReLU.
    Band-pass removes flicker (HF) and DC (mean->0 so it crosses zero => GAPS where ReLU clips to 0);
    the offset sets the duty cycle (~fraction present ≈ Phi(offset)). NOT capped: the amplitude is the
    natural filter output, so it occasionally runs high (>1) and often sits moderate; random is
    random (Andrey 06/23). The target bump is simply scaled by this per frame; everything else in the
    scene stays exactly like L1 training (no clutter, single target)."""
    w = rng.standard_normal(T)
    Fw = np.fft.rfft(w)
    fbin = np.arange(Fw.shape[0]) # frequency in cycles per sequence
    Fw[(fbin < bp_lo) | (fbin > bp_hi)] = 0.0 # band-pass
    x = np.fft.irfft(Fw, n=T).astype(np.float32)
    x /= (x.std() + 1e-8) # unit std -> offset is the duty knob
    return np.maximum(0.0, x + offset).astype(np.float32) # ReLU: 0 in gaps, uncapped when present
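# Illustration (not part of the original module): an empirical check of the "duty ≈ Phi(offset)"
# claim in the docstring above. The function below re-implements the same steps in float64; the
# name bandpass_envelope, the 500-run sample and the seed are assumptions made for this sketch.

```python
import math
import numpy as np


def bandpass_envelope(rng, T, bp_lo, bp_hi, offset):
    """White noise -> band-pass [bp_lo, bp_hi] cycles/sequence -> unit std -> +offset -> ReLU."""
    spec = np.fft.rfft(rng.standard_normal(T))
    k = np.arange(spec.shape[0])
    spec[(k < bp_lo) | (k > bp_hi)] = 0.0
    x = np.fft.irfft(spec, n=T)
    x /= x.std() + 1e-8
    return np.maximum(0.0, x + offset)


def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


rng = np.random.default_rng(0)
offset = -0.3
env = np.stack([bandpass_envelope(rng, 64, 3, 9, offset) for _ in range(500)])
duty = float((env > 0).mean())     # fraction of frames with the target drawn
print(f"measured duty {duty:.3f}, Phi(offset) {phi(offset):.3f}")
```

# The match is approximate: each sequence is normalized by its own sample std, so the per-frame
# value is only close to Gaussian. A single T=64 run can deviate noticeably from the average.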
def render_run(rng, T=64, G=32, vmax=1.4, snr=6.0, noise_prefix=(8, 24), fade=(4, 10),
p_abrupt=0.35, p_maneuver=0.0, turn_sigma=0.07, # p_maneuver=0: CONSTANT velocity for now;
# add maneuvering as a SEPARATE later step (one challenge at a time, Andrey 06/23)
gaps=True, bp_lo=3, bp_hi=9, duty_offset=-0.3, starter_len=8,
p_death=0.0, return_signal=False): # p_death=0: targets IMMORTAL (always coast/hope);
# mechanism kept (gated) to re-enable "voluntary death" later. Andrey 06/23
"""One L2 training run: an L1-INPUT scene (same distribution L1 was trained on) of a SINGLE sharp
target, whose amplitude is multiplied PER FRAME by a band-pass envelope. By Claude 06/23 (v3)
The synthetic scene here is the L1 INPUT (raw scene -> frozen L1 -> L1 output field -> L2). It must
stay in L1's training distribution, so: iid-N(0,1) background + ONE half-cosine target, NO clutter,
no "bad"/extra targets (Andrey 06/23). The ONLY L2-specific change is the per-frame amplitude
multiplier from a limited band-pass filter, which creates fades and hard zero-streaks (gaps).
Returns:
frames [T,G,G] iid-N(0,1) background + the per-frame-amplitude-scaled target bump
pos [T,2] target (x,y) in px mod G; NaN when the target is truth-ABSENT
vel [T,2] per-frame instantaneous (vx,vy) px/level-frame (0 where absent)
present [T] 0/1 TRUTH-present flag (the supervision target) — 1 THROUGH gaps, 0 after death
The two masks:
- TRUTH-present (`present`): what L2 must report. 1 from onset, stays 1 across signal gaps
(the target keeps moving), drops to 0 only at death (permanent disappearance).
- rendered-SIGNAL (internal `signal[t]`): whether the target bump is actually drawn into
`frames`. During a GAP the amplitude envelope is 0 while present stays 1 -> the frozen 9-frame
L1 window is starved -> L1's field goes dark THERE -> the ConvGRU must COAST on hidden state
+ velocity to keep firing at the (moved) truth position.
"""
frames = rng.standard_normal((T, G, G)).astype(np.float32) # iid N(0,1) background (as L1 training)
pos = np.full((T, 2), np.nan, np.float32)
vel = np.zeros((T, 2), np.float32)
present = np.zeros(T, np.float32)
# (clutter / "bad targets" removed 06/23 — the L1-input scene is a SINGLE target on noise.)
# --- target trajectory: onset, abrupt/fade, straight/maneuvering, gaps, optional death
onset = int(rng.integers(noise_prefix[0], noise_prefix[1] + 1))
nfade = 1 if rng.random() < p_abrupt else int(rng.integers(fade[0], fade[1] + 1))
maneuver = rng.random() < p_maneuver
vx, vy = _sample_velocity(rng, vmax)
cx = rng.uniform(0, G); cy = rng.uniform(0, G) # onset position (sub-pixel)
# death: with some prob the target permanently leaves after it has had time to lock; the tail
# is supervised ABSENT so L2 must RELEASE a dead track. Death-absence is long (to end of run);
# gaps are short (envelope zero-streaks). The net learns "coast a few frames, then release" from the
# contrast: an absence that ends quickly == gap (still present); one that never ends == death.
death_t = T
if rng.random() < p_death:
earliest = onset + nfade + 8 # only after a real lock window
if earliest < T - 2:
death_t = int(rng.integers(earliest, T))
# amplitude envelope: smooth fades + hard zero-streaks (gaps) from ONE band-pass process.
# Full amplitude through a "starter" window (clean acquire) then modulated; env==0 => target
# bump not drawn => the frozen 9-frame L1 window is starved => the ConvGRU must COAST on hidden
# state + velocity. By Claude on 06/22/2026.
signal = np.zeros(T, np.float32)
env = np.zeros(T, np.float32)
if gaps:
env_bp = _bandpass_envelope(rng, T, bp_lo, bp_hi, duty_offset) # [T] >=0, uncapped
acquire_end = onset + nfade + starter_len # clean full-SNR acquire (no gaps)
env[onset:death_t] = env_bp[onset:death_t]
env[onset:min(death_t, acquire_end)] = 1.0 # starter clamp: focus on MAINTAIN
else:
env[onset:death_t] = 1.0 # no gaps: full amplitude (v1-like)
for t in range(onset, T):
k = t - onset
cxw = cx % G; cyw = cy % G
if t < death_t:
pos[t] = (cxw, cyw); vel[t] = (vx, vy); present[t] = 1.0
base = snr * min(1.0, (k + 1) / nfade) # onset fade-in (nfade=1 => abrupt)
amp = base * env[t] # envelope: fades + GAPS(env=0) + camels
signal[t] = 1.0 if env[t] > 1e-6 else 0.0 # 0 in a gap => L1 starved => coast
if amp > 0:
frames[t] += amp * halfcos_bump_torus(cxw, cyw, G)
# advance the target (it keeps MOVING through gaps; truth pos stays correct)
cx += vx; cy += vy
if maneuver: # smooth heading/speed random walk
ang = np.arctan2(vy, vx) + rng.normal(0.0, turn_sigma)
spd = float(np.hypot(vx, vy)) * float(np.exp(rng.normal(0.0, 0.05)))
spd = min(max(spd, 0.3 * vmax), vmax)
vx = spd * np.cos(ang); vy = spd * np.sin(ang)
if return_signal:
return frames, pos, vel, present, signal # signal[t]=1 if bump rendered, 0 in a gap
return frames, pos, vel, present
# ---------------------------------------------------------------------------
# Frozen Layer-1 dense eval at full resolution (torch, GPU)
# ---------------------------------------------------------------------------
def l1_field_torus(net, window9, G, dev):
    """Run frozen grid-mode L1 over a periodic 9-frame window -> 32x32 {s,Vx,Vy}. By Claude 06/21
    window9: [N,G,G] (N=9, newest first to match training). Returns field [3,G,G] = (s, Vx, Vy),
    Vx,Vy in px/level-frame. Stride-1 dense via circular-pad + unfold (output j <-> scene idx j)."""
    N = window9.shape[0]
    x = torch.from_numpy(window9[None]).to(dev) # [1,N,G,G]
    # circular pad (left=12, right=11) both axes so a 24-patch centers on every output pixel
    xp = F.pad(x, (L1_HALF, L1_PATCH - 1 - L1_HALF, L1_HALF, L1_PATCH - 1 - L1_HALF), mode='circular')
    cols = F.unfold(xp, kernel_size=L1_PATCH) # [1, N*24*24, G*G]
    L = cols.shape[-1] # = G*G = 1024
    cols = cols.reshape(N, L1_PATCH, L1_PATCH, L).permute(3, 0, 1, 2).contiguous() # [L,N,24,24]
    with torch.no_grad():
        out = net(cols)[:, :, 0, 0] # [L, 124]
        det, vel, off = net.split(out.unsqueeze(-1).unsqueeze(-1)) # det[L,1,1], vel[L,121,1,1]...
    s = torch.sigmoid(det.reshape(L)) # [L] confidence
    vdim = net.vdim
    p = torch.softmax(vel.reshape(L, vdim * vdim), dim=1).reshape(L, vdim, vdim) # [L, vy, vx]
    # cell offsets follow the loaded net's grid (vdim = 2*radius+1), not the module constant
    cells = torch.arange(vdim, device=dev).float() - (vdim - 1) // 2 # -5..5 for the 11x11 grid
    pvx = (p.sum(1) * cells).sum(1) / VEL_DECIMATE # px/level-frame (vx inner dim)
    pvy = (p.sum(2) * cells).sum(1) / VEL_DECIMATE
    field = torch.stack([s, pvx, pvy], 0).reshape(3, G, G) # [3,G,G]
    return field.cpu().numpy()
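# Illustration (not part of the original module): the alignment claim "output j <-> scene idx j"
# checked on a toy grid. NumPy's np.pad(mode="wrap") and sliding_window_view stand in here for the
# torch F.pad(mode='circular') + F.unfold pair used above; the toy sizes G=8, P=4 are assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

G, P = 8, 4                    # toy sizes; the module uses G=32, P=24
half = P // 2                  # left pad = half, right pad = P - 1 - half
scene = np.arange(G * G, dtype=np.float32).reshape(G, G)
padded = np.pad(scene, ((half, P - 1 - half), (half, P - 1 - half)), mode="wrap")
patches = sliding_window_view(padded, (P, P))   # [G, G, P, P], one patch per output pixel

# Output pixel (j, i) sees scene[j, i] at patch index (half, half): no half-pixel shift.
aligned = np.array_equal(patches[:, :, half, half], scene)
# The patch for output (0, 0) starts `half` pixels before the origin, wrapped around the torus.
wraps = patches[0, 0, 0, 0] == scene[-half, -half]
print(aligned, wraps)
```

# The asymmetric (half, P-1-half) padding is what keeps the output grid the same size as the scene
# for an even patch size.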
def gen_field_sequence(net, frames, pos, G, N, dev):
    """Slide the N-frame L1 window over a run -> field seq [T,3,G,G] aligned to each frame.
    Frame t uses window [t, t-1, ..., t-(N-1)] (newest first), clamped at the start. By Claude 06/21"""
    T = frames.shape[0]
    seq = np.zeros((T, 3, G, G), np.float32)
    for t in range(T):
        idx = [max(0, t - i) for i in range(N)] # newest first; clamp pre-roll
        seq[t] = l1_field_torus(net, frames[idx], G, dev)
    return seq
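# Illustration (not part of the original module): the window indexing used by gen_field_sequence,
# pulled out into a hypothetical helper so the clamped pre-roll is easy to see.

```python
def window_indices(t, n):
    """Frame indices for the n-frame window ending at t, newest first, clamped at frame 0."""
    return [max(0, t - i) for i in range(n)]


print(window_indices(3, 9))    # [3, 2, 1, 0, 0, 0, 0, 0, 0]  (frame 0 repeated as pre-roll)
print(window_indices(12, 9))   # [12, 11, 10, 9, 8, 7, 6, 5, 4]
```

# For t < n-1 the window holds repeated copies of frame 0, so the first few fields are computed on
# input that L1 may not have seen in training; with onset >= 8 frames this falls in the noise prefix.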
# ---------------------------------------------------------------------------
# Border-crossing verifier
# ---------------------------------------------------------------------------
def wrapped_err(a, b, G):
    """Euclidean distance between 2-D points a and b on a G x G torus (shortest wrapped path)."""
    d = (a - b + G / 2) % G - G / 2
    return float(np.hypot(d[0], d[1]))
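# Illustration (not part of the original module): the seam-crossing test that verify() applies to
# consecutive peak positions, as a standalone sketch. The helper names wrapped_dist and
# crossed_seam are hypothetical; the 4 px margin is the value used in verify().

```python
import numpy as np


def wrapped_dist(a, b, G):
    """Euclidean distance between points a and b on a G x G torus."""
    d = (np.asarray(a, float) - np.asarray(b, float) + G / 2) % G - G / 2
    return float(np.hypot(d[0], d[1]))


def crossed_seam(peak, prev_peak, G, margin=4.0):
    """True when the raw jump is much larger than the wrapped jump (same test as verify)."""
    raw = float(np.hypot(peak[0] - prev_peak[0], peak[1] - prev_peak[1]))
    return raw - wrapped_dist(peak, prev_peak, G) > margin


G = 32
print(crossed_seam((0, 10), (31, 10), G))    # True: raw jump 31 px, wrapped jump 1 px
print(crossed_seam((15, 10), (14, 10), G))   # False: ordinary 1 px step
```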
def verify(l1_path, T=64, G=32, vmax=1.4, snr=6.0, seed=0):
dev = "cuda" if torch.cuda.is_available() else "cpu"
net, N, a = _load_l1(l1_path, dev) # same checkpoint loading as display_run
print(f"L1 {l1_path}: patch={a.get('patch',24)} N={N} grid; dev={dev}", flush=True)
rng = np.random.default_rng(seed)
frames, pos, vel, present = render_run(rng, T=T, G=G, vmax=vmax, snr=snr)
seq = gen_field_sequence(net, frames, pos, G, N, dev) # [T,3,G,G]
vp = vel[present > 0] if (present > 0).any() else vel[:1] # per-frame now ([T,2]); summarize
print(f"velocity truth (px/level-frame, mean over present): vx={vp[:,0].mean():+.3f} "
f"vy={vp[:,1].mean():+.3f} |v|~{np.hypot(vp[:,0],vp[:,1]).mean():.3f} (per-frame; maneuver varies)",
flush=True)
print("frame present truth(x,y) L1 s-peak(x,y) pos-err s@peak Vx,Vy@peak(decoded) cross?", flush=True)
prev_peak = None
perr_present = []
for t in range(T):
s = seq[t, 0]
pj = int(np.argmax(s)); py, px = divmod(pj, G)
speak = s[py, px]
vxp, vyp = seq[t, 1, py, px], seq[t, 2, py, px]
cross = ""
if prev_peak is not None:
# seam-crossing flag: peak jumped across the wrap (large raw jump, small wrapped jump)
raw = np.hypot(px - prev_peak[0], py - prev_peak[1])
wrp = wrapped_err(np.array([px, py], float), np.array(prev_peak, float), G)
if raw - wrp > 4: cross = " <-- SEAM"
prev_peak = (px, py)
if present[t]:
perr = wrapped_err(np.array([px, py], float), pos[t], G)
perr_present.append(perr)
tru = f"({pos[t,0]:5.1f},{pos[t,1]:5.1f})"
else:
perr = float('nan'); tru = " -- "
print(f"{t:4d} {int(present[t])} {tru} ({px:2d},{py:2d}) "
f"{perr:5.2f} {speak:.3f} {vxp:+.3f},{vyp:+.3f}{cross}", flush=True)
pe = np.array(perr_present)
print(f"\nposition error over present frames: mean {np.nanmean(pe):.2f} px max {np.nanmax(pe):.2f} px "
f"(over {len(pe)} frames)", flush=True)
print("PASS: L1 field peak tracks the target across the seam." if np.nanmean(pe) < 2.0
else "CHECK: peak/truth offset > 2px — inspect alignment (half-pixel? velocity sign?).", flush=True)
return seq, frames, pos, vel, present
def _load_l1(l1_path, dev):
    """Load the frozen grid-mode L1 checkpoint in eval mode. Returns (net, N, args dict)."""
    ck = torch.load(l1_path, map_location=dev)
    a = ck.get("args", {})
    N = a.get("nframes", 9)
    net = RawFCN(n_frames=N, patch=a.get("patch", 24), velocity_mode="grid",
                 vel_radius=a.get("vel_radius", VEL_RADIUS)).to(dev)
    net.load_state_dict(ck["model"]); net.eval()
    return net, N, a
def display_run(l1_path, T=120, G=32, vmax=1.4, snr=6.0, seed=0, out="runs/l2_l1view", render_kw=None):
"""Run L1 over a long run and write scrubbable TIFF stacks, PNG montages and a GIF. By Claude 06/21
Outputs (in `out/`): input.tif, s.tif, Vx.tif, Vy.tif, truth.tif, signal.tif (T-page 32-bit float,
ImageJ), a 2x2-tiled `*_2x2.tif` copy of each stack except signal, montage.png (12 evenly-spaced
s-frames with the truth position overlaid), montage_2x2.png and watch.gif.
render_kw forwards the gap-envelope knobs; prints a per-frame present/signal/L1-s@truth table so
the L1 stage can be VERIFIED on gap data IN ISOLATION before trusting L2. By Claude 06/23"""
import os
os.makedirs(out, exist_ok=True)
dev = "cuda" if torch.cuda.is_available() else "cpu"
net, N, a = _load_l1(l1_path, dev)
rng = np.random.default_rng(seed)
frames, pos, vel, present, signal = render_run(rng, T=T, G=G, vmax=vmax, snr=snr,
return_signal=True, **(render_kw or {}))
seq = gen_field_sequence(net, frames, pos, G, N, dev) # [T,3,G,G] = s,Vx,Vy
vp = vel[present > 0] if (present > 0).any() else vel[:1] # [.,2] per-frame velocity
vmean = float(np.hypot(vp[:, 0], vp[:, 1]).mean())
ncross = int(round((vmean * present.sum()) / G)) # rough seam-crossing count
print(f"L1 {l1_path}: patch={a.get('patch',24)} N={N}; T={T}, |v|~{vmean:.3f} px/fr, "
f"~{ncross} seam crossings, target present {int(present.sum())}/{T} frames", flush=True)
truth = np.zeros((T, G, G), np.float32) # truth marker (target-shape bump)
for t in range(T):
if present[t]:
truth[t] = halfcos_bump_torus(pos[t, 0], pos[t, 1], G)
synth.save_tiff_stack(frames, f"{out}/input.tif") # raw periodic scene
synth.save_tiff_stack(seq[:, 0], f"{out}/s.tif") # L1 confidence field
synth.save_tiff_stack(seq[:, 1], f"{out}/Vx.tif") # decoded Vx (px/level-frame)
synth.save_tiff_stack(seq[:, 2], f"{out}/Vy.tif")
synth.save_tiff_stack(truth, f"{out}/truth.tif") # truth position (compare vs s)
synth.save_tiff_stack(np.broadcast_to(signal[:, None, None], (T, G, G)).astype(np.float32),
f"{out}/signal.tif") # 1=bump rendered, 0=GAP (L1 starved)
print(f"wrote {out}/{{input,s,Vx,Vy,truth,signal}}.tif ({T} pages each, 32-bit float, ImageJ)", flush=True)
# --- L1 VERIFICATION on gap data: per-frame present / signal / L1 s@truth + away-FP. The check:
# in GAP frames (signal=0, present=1) L1 s@truth should COLLAPSE (L1 starved); when signal
# returns it should re-lock. If L1 does NOT go dark in gaps, the L2 gap result is meaningless.
# By Claude on 06/23/2026.
print("\nframe pres sig L1_s@truth L1_max_away note", flush=True)
ng = int((present > 0).sum()); ngap = int(((present > 0) & (signal < 0.5)).sum())
for t in range(T):
if not present[t]:
continue
cx, cy = pos[t]
ci, cj = int(round(cy)) % G, int(round(cx)) % G
s_at = float(seq[t, 0, ci, cj])
mask = np.ones((G, G), bool) # away = outside a 3px disk of truth
yy, xx = np.ogrid[:G, :G]
dwin = ((((xx - cj + G/2) % G - G/2)**2 + ((yy - ci + G/2) % G - G/2)**2) <= 9)
mask[dwin] = False
s_away = float(seq[t, 0][mask].max())
note = "GAP -> want s@truth low" if signal[t] < 0.5 else ""
# print sparsely: every gap frame + a few present frames
if signal[t] < 0.5 or t % 8 == 0:
print(f"{t:4d} {int(present[t])} {int(signal[t])} {s_at:6.3f} {s_away:6.3f} {note}", flush=True)
print(f"\nsummary: present {ng}/{T} frames, of which {ngap} are GAP frames (signal=0).", flush=True)
print("VERIFY: gap-frame s@truth should be markedly LOWER than non-gap s@truth.", flush=True)
# 2x2-tiled stacks: the torus seam now runs through the CENTER CROSS (x=G, y=G) of the 2Gx2G
# image -> any seam discontinuity is glaring there. Clean torus => invisible cross. By Claude 06/21
def tile2x2(st): return np.tile(st, (1, 2, 2)) # [T,G,G] -> [T,2G,2G]
for nm, st in [("input", frames), ("s", seq[:, 0]), ("Vx", seq[:, 1]),
("Vy", seq[:, 2]), ("truth", truth)]:
synth.save_tiff_stack(tile2x2(st), f"{out}/{nm}_2x2.tif")
print(f"wrote {out}/*_2x2.tif (2Gx2G tiled; seam = center cross at {G},{G})", flush=True)
_tiled_montage(tile2x2(seq[:, 0]), G, f"{out}/montage_2x2.png")
import matplotlib; matplotlib.use("Agg"); import matplotlib.pyplot as plt
idx = np.linspace(0, T - 1, 12).astype(int)
fig, axs = plt.subplots(3, 4, figsize=(13, 10))
for ax, t in zip(axs.ravel(), idx):
ax.imshow(seq[t, 0], vmin=0, vmax=1, cmap="magma", origin="upper")
if present[t]:
ax.plot(pos[t, 0], pos[t, 1], "c+", ms=12, mew=2) # truth position (x,y)
ax.set_title(f"f{t} {'tgt' if present[t] else 'noise'}", fontsize=9)
ax.set_xticks([]); ax.set_yticks([])
fig.suptitle(f"L1 confidence field s — {T}-frame run (cyan + = truth) |v|~{vmean:.2f}px/fr")
fig.tight_layout(); fig.savefig(f"{out}/montage.png", dpi=90); plt.close(fig)
print(f"wrote {out}/montage.png", flush=True)
_save_gif(frames, seq[:, 0], pos, present, f"{out}/watch.gif") # [input | s] + truth, animated
print(f"wrote {out}/watch.gif ({T} frames, [input | L1 s] side-by-side, cyan + = truth)", flush=True)
return seq, frames, pos, vel, present
def _tiled_montage(s2, G, path):
"""8 frames of the 2x2-tiled s field with seam guide-lines at the center cross. By Claude 06/21
Continuity across the cyan guide lines == clean torus (no seam artifact)."""
import matplotlib; matplotlib.use("Agg"); import matplotlib.pyplot as plt
T = s2.shape[0]; idx = np.linspace(0, T - 1, 8).astype(int)
fig, axs = plt.subplots(2, 4, figsize=(13, 7))
for ax, t in zip(axs.ravel(), idx):
ax.imshow(s2[t], vmin=0, vmax=1, cmap="magma", origin="upper")
ax.axvline(G - 0.5, color="cyan", lw=0.6, alpha=0.6) # seam (center cross)
ax.axhline(G - 0.5, color="cyan", lw=0.6, alpha=0.6)
ax.set_title(f"f{t}", fontsize=9); ax.set_xticks([]); ax.set_yticks([])
fig.suptitle("2x2-tiled L1 s — seam = cyan center cross; continuous across it => clean torus")
fig.tight_layout(); fig.savefig(path, dpi=90); plt.close(fig)
print(f"wrote {path}", flush=True)
def _save_gif(frames, sfield, pos, present, path, up=8, dur=120):
"""Animated [input | s] with cyan truth marker, upscaled x`up` for watchability. By Claude 06/21"""
import matplotlib
from PIL import Image, ImageDraw
T, G, _ = frames.shape
fn = (frames - frames.min()) / (np.ptp(frames) + 1e-9) # global gray-normalize input
gray = matplotlib.colormaps["gray"]; mag = matplotlib.colormaps["magma"]
pages = []
for t in range(T):
li = (gray(fn[t])[..., :3] * 255).astype(np.uint8) # [G,G,3] input
ls = (mag(np.clip(sfield[t], 0, 1))[..., :3] * 255).astype(np.uint8) # [G,G,3] s in [0,1]
sep = np.full((G, 2, 3), 60, np.uint8)
row = np.concatenate([li, sep, ls], axis=1) # [G, 2G+2, 3]
im = Image.fromarray(row).resize((row.shape[1] * up, G * up), Image.NEAREST)
if present[t]:
d = ImageDraw.Draw(im)
for x0 in (0, (G + 2) * up): # mark both panels
cx = x0 + (pos[t, 0] + 0.5) * up; cy = (pos[t, 1] + 0.5) * up # +0.5: pixel j is centered at (j+0.5)*up after upscaling
d.line([(cx - 6, cy), (cx + 6, cy)], fill=(0, 255, 255), width=2)
d.line([(cx, cy - 6), (cx, cy + 6)], fill=(0, 255, 255), width=2)
pages.append(im)
pages[0].save(path, save_all=True, append_images=pages[1:], duration=dur, loop=0)
if __name__ == "__main__":
ap = argparse.ArgumentParser()
ap.add_argument("--mode", choices=["verify", "display"], default="verify")
ap.add_argument("--l1", default="runs/weighted9_pm/model.pt")
ap.add_argument("--T", type=int, default=64); ap.add_argument("--G", type=int, default=32)
ap.add_argument("--vmax", type=float, default=1.4); ap.add_argument("--snr", type=float, default=6.0)
ap.add_argument("--seed", type=int, default=0); ap.add_argument("--out", default="runs/l2_l1view")
# gap-envelope knobs for the L1-on-gaps verification (display mode). By Claude 06/23
ap.add_argument("--gaps", action="store_true",
help="display mode only: render band-pass amplitude gaps (off unless given; verify mode uses render_run's default gaps=True)")
ap.add_argument("--bp_lo", type=int, default=3); ap.add_argument("--bp_hi", type=int, default=9)
ap.add_argument("--duty_offset", type=float, default=-0.3); ap.add_argument("--starter_len", type=int, default=8)
a = ap.parse_args()
if a.mode == "verify":
verify(a.l1, T=a.T, G=a.G, vmax=a.vmax, snr=a.snr, seed=a.seed)
else:
render_kw = dict(gaps=a.gaps, bp_lo=a.bp_lo, bp_hi=a.bp_hi,
duty_offset=a.duty_offset, starter_len=a.starter_len)
display_run(a.l1, T=a.T, G=a.G, vmax=a.vmax, snr=a.snr, seed=a.seed, out=a.out, render_kw=render_kw)
# layer2_eval.py - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
"""C5P Layer-2 testing/visualization on same-generator input. By Claude 06/22/2026
Loads a trained Layer2Net + frozen L1, generates a same-generator test sequence (render_run ->
frozen-L1 field), runs L2, and writes 32-bit float TIFF stacks AND 2x2-tiled versions (seam =
center cross, same check we used for L1) so the L2 output can be watched the same way:
L1_s = L1 input confidence field
L2_det = L2 track-before-detect output (sigmoid)
L2_Vx, L2_Vy = L2 velocity field (px/level-frame); read at the L2_det peak for the velocity-error report
truth = target-shape bump at truth position
Also prints per-frame lock / FP metrics. Run on DGX:
python layer2_eval.py --l2 runs/l2_v1/model.pt --T 120 --out runs/l2_v1/test
"""
import argparse
import numpy as np
import torch
import synth
import layer2_data as L1D
from layer2 import Layer2Net
def main():
ap = argparse.ArgumentParser()
ap.add_argument("--l1", default="runs/weighted9_pm/model.pt")
ap.add_argument("--l2", default="runs/l2_v1/model.pt")
ap.add_argument("--T", type=int, default=120); ap.add_argument("--seed", type=int, default=777)
ap.add_argument("--snr", type=float, default=6.0); ap.add_argument("--out", default="runs/l2_v1/test")
a = ap.parse_args()
import os; os.makedirs(a.out, exist_ok=True)
dev = "cuda" if torch.cuda.is_available() else "cpu"
net1, N, _ = L1D._load_l1(a.l1, dev)
ck = torch.load(a.l2, map_location=dev); la = ck.get("args", {})
G = la.get("G", 32); vmax = la.get("vmax", 1.4)
net2 = Layer2Net(ch_in=3, ch_hidden=la.get("ch", 24), grid=G, vmax=vmax).to(dev)
net2.load_state_dict(ck["model"]); net2.eval()
print(f"L1={a.l1} (N={N}) L2={a.l2} (ch={la.get('ch',24)}, G={G}) dev={dev}", flush=True)
rng = np.random.default_rng(a.seed)
frames, pos, vel, present = L1D.render_run(rng, T=a.T, G=G, vmax=vmax, snr=a.snr)
seq = L1D.gen_field_sequence(net1, frames, pos, G, N, dev) # [T,3,G,G]
with torch.no_grad():
det_logit, velo = net2(torch.from_numpy(seq[None]).to(dev))
l2det = torch.sigmoid(det_logit)[0, :, 0].cpu().numpy() # [T,G,G]
l2vx = velo[0, :, 0].cpu().numpy(); l2vy = velo[0, :, 1].cpu().numpy() # [T,G,G] px/level-frame
truth = np.zeros((a.T, G, G), np.float32)
for t in range(a.T):
if present[t]:
truth[t] = L1D.halfcos_bump_torus(pos[t, 0], pos[t, 1], G)
L1_s = seq[:, 0]
stacks = {"L1_s": L1_s, "L2_det": l2det, "L2_Vx": l2vx, "L2_Vy": l2vy, "truth": truth}
for nm, st in stacks.items():
synth.save_tiff_stack(st, f"{a.out}/{nm}.tif")
synth.save_tiff_stack(np.tile(st, (1, 2, 2)), f"{a.out}/{nm}_2x2.tif")
print(f"wrote {a.out}/{{L1_s,L2_det,L2_Vx,L2_Vy,truth}}{{,_2x2}}.tif ({a.T} pages, 32-bit float)", flush=True)
# velocity accuracy: L2 (Vx,Vy) read at the detected peak vs truth velocity
verr = []
for t in range(a.T):
if not present[t]:
continue
pj = int(np.argmax(l2det[t])); py, px = divmod(pj, G)
verr.append((l2vx[t, py, px] - vel[t, 0], l2vy[t, py, px] - vel[t, 1])) # vel is per-frame [T,2]
verr = np.array(verr[3:]) # skip pre-lock
vp = vel[present > 0]
print(f"truth vel (mean over present) = ({vp[:,0].mean():+.3f},{vp[:,1].mean():+.3f}) px/level-frame; "
f"L2 vel@peak error: mean |dV| {np.hypot(verr[:,0],verr[:,1]).mean():.3f} "
f"(bias {verr[:,0].mean():+.3f},{verr[:,1].mean():+.3f})", flush=True)
L1D._tiled_montage(np.tile(l2det, (1, 2, 2)), G, f"{a.out}/L2_det_2x2.png")
# L1 -> L2 -> truth comparison montage (the cloud->clean-blob transformation)
import matplotlib; matplotlib.use("Agg"); import matplotlib.pyplot as plt
idx = np.linspace(0, a.T - 1, 7).astype(int)
fig, axs = plt.subplots(3, 7, figsize=(15, 6.5))
for j, t in enumerate(idx):
for r, (img, lab) in enumerate([(L1_s[t], "L1 s"), (l2det[t], "L2 det"), (truth[t], "truth")]):
axs[r, j].imshow(img, vmin=0, vmax=1, cmap="magma", origin="upper")
axs[r, j].set_xticks([]); axs[r, j].set_yticks([])
if j == 0: axs[r, j].set_ylabel(lab, fontsize=11)
axs[0, j].set_title(f"f{t} {'tgt' if present[t] else 'noise'}", fontsize=9)
fig.suptitle("L1 input (clouds) -> L2 detection (clean) -> truth")
fig.tight_layout(); fig.savefig(f"{a.out}/compare.png", dpi=90); plt.close(fig)
print(f"wrote {a.out}/compare.png", flush=True)
# per-frame metrics: s@truth (lock) vs HONEST FP = max bg excluding a radius-R disk around truth
yy, xx = np.mgrid[0:G, 0:G]; R = 4.0
print("\nframe present s@truth FP(>4px from tgt) note", flush=True)
locked = None; onset = int(np.argmax(present)) if present.any() else 0
fp_present = []
for t in range(a.T):
core = truth[t] > 0.3
sat = float(l2det[t][core].mean()) if core.any() else float('nan')
if present[t]:
dx = (xx - pos[t, 0] + G / 2) % G - G / 2; dy = (yy - pos[t, 1] + G / 2) % G - G / 2
far = (dx * dx + dy * dy) > R * R # exclude target neighborhood
else:
far = np.ones((G, G), bool)
fp = float(l2det[t][far].max()) if far.any() else 0.0
if present[t]: fp_present.append(fp)
note = ""
if present[t] and locked is None and sat > 0.5:
locked = t; note = f"<- LOCK (+{t - onset} fr)"
if t % 8 == 0 or note:
print(f"{t:4d} {'tgt' if present[t] else 'noise':5s} {sat:6.3f} {fp:6.3f} {note}", flush=True)
print(f"\nlock frame: {locked} (onset {onset}); FP on present frames (first 2 skipped): "
f"mean {np.mean(fp_present[2:] or [0]):.3f} max {np.max(fp_present[2:] or [0]):.3f}", flush=True)
if __name__ == "__main__":
main()
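# Illustration (not part of the original script): the "honest FP" mask from the per-frame metrics
# above as a standalone sketch. Responses within R px of the truth position, measured in the
# wrapped metric, are excluded so a target sitting on the seam is not counted as a false positive.
# The helper name far_mask and the toy detection map are assumptions.

```python
import numpy as np


def far_mask(pos, G, R=4.0):
    """Boolean [G, G] mask of pixels farther than R from pos=(x, y) in the wrapped metric."""
    yy, xx = np.mgrid[0:G, 0:G]
    dx = (xx - pos[0] + G / 2) % G - G / 2
    dy = (yy - pos[1] + G / 2) % G - G / 2
    return (dx * dx + dy * dy) > R * R


G = 32
det = np.zeros((G, G))
det[31, 31] = 0.9          # response just across the seam from a target at (0.5, 0.5)
det[16, 16] = 0.2          # background response far from the target
far = far_mask((0.5, 0.5), G)
fp = float(det[far].max())
print(fp)                  # 0.2: the seam-adjacent target response is not counted as FP
```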
# layer2_gapcheck.py - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
"""C5P Layer-2 gap-coast diagnostic — the direct test for the wild-test failure. By Claude 06/22/2026
The 2026-06-22 wild test found L2 drops the track the instant the L1 signal does (v1 had no gaps in
training, so the recurrence never learned to coast). This script builds a DETERMINISTIC run with a
single, explicit signal GAP in the middle of an otherwise clean straight track — no clutter, no
death, no maneuver — runs frozen L1 then trained L2, and reports, frame by frame:
L1 s@truth — frozen-L1 confidence AT the (moving) truth position. COLLAPSES inside the gap.
L2 s@truth — trained-L2 confidence AT the truth position. Should COAST (stay high).
L2 pos-err — wrapped distance from L2 peak to truth (px). Should stay small in the gap.
PASS = L2 holds the track through the gap while L1 has gone dark. That is the memory doing its job.
Run on DGX: python layer2_gapcheck.py --l2 runs/l2_v2/model.pt --gap 18 26
"""
import argparse
import numpy as np
import torch
import synth
import layer2_data as L1D
from layer2 import Layer2Net
def render_gap(T, G, vmax, snr, gap, onset=8, seed=0):
    """Clean straight track with ONE explicit signal gap [gap0,gap1). By Claude 06/22
    Returns frames[T,G,G], pos[T,2] (truth, defined from onset), vel[2], present[T], gapmask[T].
    The target keeps MOVING through the gap (truth pos advances); only the rendered bump is removed,
    so the frozen 9-frame L1 window is starved while truth-present stays 1."""
    rng = np.random.default_rng(seed)
    frames = rng.standard_normal((T, G, G)).astype(np.float32)
    pos = np.full((T, 2), np.nan, np.float32)
    present = np.zeros(T, np.float32)
    gapmask = np.zeros(T, bool); gapmask[gap[0]:gap[1]] = True
    vx, vy = L1D._sample_velocity(rng, vmax)
    x0 = rng.uniform(0, G); y0 = rng.uniform(0, G)
    for t in range(onset, T):
        k = t - onset
        cx = (x0 + vx * k) % G; cy = (y0 + vy * k) % G
        pos[t] = (cx, cy); present[t] = 1.0
        if not gapmask[t]: # gap => bump NOT rendered (L1 starves)
            frames[t] += snr * L1D.halfcos_bump_torus(cx, cy, G)
    return frames, pos, np.array([vx, vy], np.float32), present, gapmask
def main():
ap = argparse.ArgumentParser()
ap.add_argument("--l1", default="runs/weighted9_pm/model.pt")
ap.add_argument("--l2", default="runs/l2_v2/model.pt")
ap.add_argument("--T", type=int, default=40); ap.add_argument("--gap", type=int, nargs=2, default=[18, 26])
ap.add_argument("--snr", type=float, default=6.0); ap.add_argument("--seed", type=int, default=0)
ap.add_argument("--out", default="runs/l2_v2/gapcheck")
a = ap.parse_args()
import os; os.makedirs(a.out, exist_ok=True)
dev = "cuda" if torch.cuda.is_available() else "cpu"
net1, N, _ = L1D._load_l1(a.l1, dev)
ck = torch.load(a.l2, map_location=dev); la = ck.get("args", {})
G = la.get("G", 32); vmax = la.get("vmax", 1.4)
net2 = Layer2Net(ch_in=3, ch_hidden=la.get("ch", 24), grid=G, vmax=vmax).to(dev)
net2.load_state_dict(ck["model"]); net2.eval()
print(f"L1={a.l1} (N={N}) L2={a.l2} (ch={la.get('ch',24)}, G={G}) gap={a.gap} dev={dev}", flush=True)
frames, pos, vel, present, gap = render_gap(a.T, G, vmax, a.snr, a.gap, seed=a.seed)
seq = L1D.gen_field_sequence(net1, frames, pos, G, N, dev) # [T,3,G,G]
with torch.no_grad():
det_logit, _ = net2(torch.from_numpy(seq[None]).to(dev))
l2 = torch.sigmoid(det_logit)[0, :, 0].cpu().numpy() # [T,G,G]
l1s = seq[:, 0]
truth = np.zeros((a.T, G, G), np.float32)
for t in range(a.T):
if present[t]: truth[t] = L1D.halfcos_bump_torus(pos[t, 0], pos[t, 1], G)
for nm, st in [("L1_s", l1s), ("L2_det", l2), ("truth", truth)]:
synth.save_tiff_stack(st, f"{a.out}/{nm}.tif")
print(f"wrote {a.out}/{{L1_s,L2_det,truth}}.tif ({a.T} pages)\n", flush=True)
print("frame present in_gap L1 s@truth L2 s@truth L2 pos-err", flush=True)
pre, ingap = [], []
for t in range(a.T):
core = truth[t] > 0.3
l1v = float(l1s[t][core].mean()) if core.any() else float('nan')
l2v = float(l2[t][core].mean()) if core.any() else float('nan')
pj = int(np.argmax(l2[t])); py, px = divmod(pj, G)
perr = L1D.wrapped_err(np.array([px, py], float), pos[t], G) if present[t] else float('nan')
flag = "GAP" if gap[t] else ("tgt" if present[t] else "noise")
print(f"{t:4d} {int(present[t])} {flag:5s} {l1v:7.3f} {l2v:7.3f} {perr:6.2f}", flush=True)
if present[t] and not gap[t] and t < a.gap[0]:
pre.append((l1v, l2v))
if gap[t]:
ingap.append((l1v, l2v, perr))
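pre = np.array(pre); ingap = np.array(ingap)
if pre.size == 0 or ingap.size == 0: raise SystemExit("gapcheck: --gap must lie inside [onset, T) with at least one present frame before it")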
l1_drop = ingap[:, 0].mean(); l2_hold = ingap[:, 1].mean(); perr_gap = np.nanmean(ingap[:, 2])
print(f"\nbefore gap (locked): L1 s@truth {pre[:,0].mean():.3f} L2 s@truth {pre[:,1].mean():.3f}", flush=True)
print(f"inside gap : L1 s@truth {l1_drop:.3f} L2 s@truth {l2_hold:.3f} L2 pos-err {perr_gap:.2f}px", flush=True)
# PASS: L1 collapsed in the gap (signal truly gone) AND L2 coasted (held high, near truth)
coasted = (l1_drop < 0.5) and (l2_hold > 0.5) and (perr_gap < 4.0)
print("COAST PASS — L2 holds the track through the gap while L1 goes dark." if coasted
else "COAST FAIL — L2 did not coast (still follows the live L1 signal). Train w/ stronger gaps.",
flush=True)
if __name__ == "__main__":
main()
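# Illustration (not part of the original script): the coast criterion from main() as a pure
# function, using the same thresholds (0.5 on s@truth, 4 px on position error). As a variation,
# this sketch reports the case where L1 never went dark as INCONCLUSIVE instead of folding it into
# FAIL; that split and the name coast_verdict are assumptions, not the script's behavior.

```python
import math


def coast_verdict(l1_gap, l2_gap, perr_gap, s_thr=0.5, perr_thr=4.0):
    """Classify a gap test from per-frame in-gap values (thresholds as in main())."""
    if not l1_gap:
        return "NO GAP FRAMES"
    l1_drop = sum(l1_gap) / len(l1_gap)
    l2_hold = sum(l2_gap) / len(l2_gap)
    finite = [e for e in perr_gap if not math.isnan(e)]
    perr = sum(finite) / len(finite) if finite else float("nan")
    if l1_drop >= s_thr:
        return "INCONCLUSIVE"      # L1 never went dark, so coasting was not actually tested
    return "PASS" if (l2_hold > s_thr and perr < perr_thr) else "FAIL"


print(coast_verdict([0.10, 0.05], [0.90, 0.80], [0.5, 1.0]))   # PASS
print(coast_verdict([0.10], [0.20], [10.0]))                   # FAIL
print(coast_verdict([0.80], [0.90], [0.5]))                    # INCONCLUSIVE
```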
# layer2_train.py - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
"""C5P Layer-2 v1 training: track-before-detect recurrent over frozen-L1 fields. By Claude 06/22/2026
(a) first end-to-end L2 train on the CLEAN steady runs (Andrey 06/21). Layer 1 is FROZEN, so the
field sequences are PRE-COMPUTED ONCE into a cache (stage2.py pattern), then Layer2Net trains fast
on the cache via BPTT. Supervision per frame:
- target PRESENT -> det = toroidal Gaussian bump at truth pos; (Vx,Vy) at the bump support;
- target ABSENT (noise prefix) -> det = 0 everywhere == the false-positive-suppression signal.
The net must (i) stay quiet on the noise prefix + clutter clouds, (ii) lock onto the faded-in
target as coherent evidence accumulates (survival), (iii) hold + report velocity across seams.
Step rate = raw per-level-frame (decimation / stride-2-overlap parked for later). Run on DGX:
python layer2_train.py --l1 runs/weighted9_pm/model.pt --nseq 128 --T 48 --steps 4000 --out runs/l2_v1
"""
import argparse
import numpy as np
import torch
import torch.nn.functional as F
import synth
import layer2_data as L1D
from layer2 import Layer2Net, bump_target, layer2_loss
def build_cache(net, N, nseq, T, G, vmax, snr, dev, seed=0, render_kw=None):
    """Pre-compute nseq frozen-L1 field sequences + truth. By Claude 06/22
    Returns a list of tuples (seq[T,3,G,G], pos[T,2], vel[T,2], present[T]), all torch on dev.
    vel is per-frame: make_targets below indexes the stacked tensor as [B,T,2].
    render_kw overrides render_run realism knobs (e.g. --v1 zeroes gaps/clutter). By Claude 06/22"""
    rng = np.random.default_rng(seed)
    render_kw = render_kw or {}
    cache = []
    for k in range(nseq):
        frames, pos, vel, present = L1D.render_run(rng, T=T, G=G, vmax=vmax, snr=snr, **render_kw)
        seq = L1D.gen_field_sequence(net, frames, pos, G, N, dev)  # [T,3,G,G]
        cache.append((torch.from_numpy(seq).to(dev),
                      torch.from_numpy(np.nan_to_num(pos)).float().to(dev),
                      torch.from_numpy(vel).to(dev),
                      torch.from_numpy(present).to(dev)))
        if (k + 1) % 32 == 0:
            print(f" cache {k+1}/{nseq}", flush=True)
    return cache


def make_targets(batch, G, dev, sigma=1.5):
    """Stack a minibatch -> (seq, det_t, vel_t, present). By Claude 06/22
    det_t: bump at truth, zeroed on absent frames. vel_t: per-frame velocity broadcast over the GxG grid."""
    seq = torch.stack([b[0] for b in batch], 0)      # [B,T,3,G,G]
    pos = torch.stack([b[1] for b in batch], 0)      # [B,T,2]
    vel = torch.stack([b[2] for b in batch], 0)      # [B,T,2] per-frame (v2: maneuver)
    present = torch.stack([b[3] for b in batch], 0)  # [B,T] truth-present (1 thru gaps)
    det_t = bump_target(pos, G, sigma=sigma, device=dev)  # [B,T,1,G,G]
    det_t = det_t * present[:, :, None, None, None]       # zero where no target (prefix/death)
    B, T = present.shape
    vel_t = vel[:, :, :, None, None].expand(B, T, 2, G, G).contiguous()  # [B,T,2,G,G]
    return seq, det_t, vel_t, present


def evaluate(net, batch, G, dev, sigma=1.5):
    """Per-seq lock/FP metrics over a held-out batch. By Claude 06/22
    sigma must match training supervision so the 'core' mask matches the bump. By Claude 06/22"""
    seq, det_t, vel_t, present = make_targets(batch, G, dev, sigma=sigma)
    with torch.no_grad():
        det_logit, vel = net(seq)
    p = torch.sigmoid(det_logit)[:, :, 0]  # [B,T,G,G]
    pres = present.bool()
    core = det_t[:, :, 0] > 0.3            # [B,T,G,G] truth bump core
    peak = float(p[core].mean()) if core.any() else 0.0  # s at truth (want ->1)
    absent = ~pres                          # noise-prefix frames
    bg_absent = float(p[absent].amax()) if absent.any() else 0.0  # worst FP on noise (want ->0)
    # background on PRESENT frames, away from the target (clutter-cloud FP)
    bg_present = float((p * (~core) * pres[:, :, None, None]).amax())
    return peak, bg_absent, bg_present


def main():
    ap = argparse.ArgumentParser()
    ap.add_argument("--l1", default="runs/weighted9_pm/model.pt")
    ap.add_argument("--nseq", type=int, default=128); ap.add_argument("--T", type=int, default=48)
    ap.add_argument("--G", type=int, default=32); ap.add_argument("--vmax", type=float, default=1.4)
    ap.add_argument("--snr", type=float, default=6.0); ap.add_argument("--steps", type=int, default=4000)
    ap.add_argument("--bs", type=int, default=8); ap.add_argument("--lr", type=float, default=2e-3)
    ap.add_argument("--ch", type=int, default=24); ap.add_argument("--out", default="runs/l2_v1")
    # --- loss / supervision knobs (Option A artifact check). By Claude 06/22 ---
    ap.add_argument("--sigma", type=float, default=1.5, help="supervision bump sigma (A: try 0.7-0.8)")
    ap.add_argument("--pos_weight", type=float, default=30.0, help="BCE positive weight (A: try 3-8 AFTER sigma)")
    ap.add_argument("--v1", action="store_true", help="v1-equivalent data: no gaps/maneuver/abrupt/death/clutter (isolate the sigma sweep)")
    a = ap.parse_args()
    import os; os.makedirs(a.out, exist_ok=True)
    dev = "cuda" if torch.cuda.is_available() else "cpu"
    # --v1 zeroes the v2 realism knobs so Option A varies ONLY the supervision width. By Claude 06/22
    render_kw = dict(p_gap=0.0, p_maneuver=0.0, p_abrupt=0.0, p_death=0.0, clutter_n=(0, 0)) if a.v1 else {}
    net1, N, _ = L1D._load_l1(a.l1, dev)
    print(f"L1 {a.l1} N={N}; precomputing {a.nseq} field sequences (T={a.T}, G={a.G}) "
          f"sigma={a.sigma} pos_weight={a.pos_weight} v1={a.v1}...", flush=True)
    cache = build_cache(net1, N, a.nseq, a.T, a.G, a.vmax, a.snr, dev, render_kw=render_kw)
    nval = max(8, a.nseq // 8); val = cache[:nval]; train = cache[nval:]
    print(f"cache: {len(train)} train / {len(val)} val sequences", flush=True)
    net = Layer2Net(ch_in=3, ch_hidden=a.ch, grid=a.G, vmax=a.vmax).to(dev)
    opt = torch.optim.Adam(net.parameters(), a.lr)
    nparams = sum(p.numel() for p in net.parameters())
    print(f"Layer2Net {nparams} params; training {a.steps} steps bs={a.bs}", flush=True)
    rng = np.random.default_rng(1)
    for step in range(1, a.steps + 1):
        idx = rng.integers(0, len(train), a.bs)
        seq, det_t, vel_t, present = make_targets([train[i] for i in idx], a.G, dev, sigma=a.sigma)
        det_logit, vel = net(seq)
        loss, comp = layer2_loss(det_logit, vel, det_t, vel_t, pos_weight=a.pos_weight)
        opt.zero_grad(); loss.backward(); opt.step()
        if step % 250 == 0 or step == 1:
            peak, bga, bgp = evaluate(net, val, a.G, dev, sigma=a.sigma)
            print(f"step {step:5d} det {comp['det']:.4f} vel {comp['vel']:.4f} | "
                  f"val: s@truth {peak:.3f} max-FP(noise) {bga:.3f} max-FP(clutter) {bgp:.3f}", flush=True)
    torch.save({"model": net.state_dict(), "args": vars(a)}, f"{a.out}/model.pt")
    print(f"saved {a.out}/model.pt", flush=True)
    # eval viz: run trained L2 on a fresh sequence, dump tiffs to watch track-before-detect
    rng2 = np.random.default_rng(777)
    frames, pos, vel, present = L1D.render_run(rng2, T=120, G=a.G, vmax=a.vmax, snr=a.snr)
    seq = L1D.gen_field_sequence(net1, frames, pos, a.G, N, dev)
    with torch.no_grad():
        det_logit, velo = net(torch.from_numpy(seq[None]).to(dev))
    l2s = torch.sigmoid(det_logit)[0, :, 0].cpu().numpy()  # [T,G,G] L2 detection
    truth = np.zeros((120, a.G, a.G), np.float32)
    for t in range(120):
        if present[t]:
            truth[t] = L1D.halfcos_bump_torus(pos[t, 0], pos[t, 1], a.G)
    synth.save_tiff_stack(seq[:, 0], f"{a.out}/L1_s.tif")   # L1 input field
    synth.save_tiff_stack(l2s, f"{a.out}/L2_det.tif")       # L2 track-before-detect output
    synth.save_tiff_stack(truth, f"{a.out}/truth.tif")
    print(f"wrote {a.out}/{{L1_s,L2_det,truth}}.tif (120 pages) — compare L2_det vs L1_s vs truth", flush=True)


if __name__ == "__main__":
    main()
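# Illustrative sketch (not part of the repository): what a "toroidal Gaussian bump" supervision
# target can look like on a GxG grid that wraps at the edges. bump_target itself lives in
# layer2.py and is not shown here, so this is an assumption about its shape, not a copy of it.
# The function name toroidal_gaussian is hypothetical.

```python
import numpy as np


def toroidal_gaussian(cx, cy, G, sigma=1.5):
    """Unit-peak Gaussian centered at (cx, cy) on a GxG torus (x = columns, y = rows)."""
    idx = np.arange(G, dtype=np.float64)
    # signed distance to the center along each axis, wrapped into [-G/2, G/2)
    dx = (idx - cx + G / 2.0) % G - G / 2.0
    dy = (idx - cy + G / 2.0) % G - G / 2.0
    return np.exp(-(dy[:, None] ** 2 + dx[None, :] ** 2) / (2.0 * sigma ** 2))


# A target sitting exactly on the corner: the bump continues across all four seams.
bump = toroidal_gaussian(0.0, 0.0, G=32)
print(bump[0, 0], bump[0, 1], bump[0, 31])
```

# Zeroing such a map on absent frames (multiplying by the present flag) gives the
# "det = 0 everywhere" false-positive-suppression signal described in the module docstring.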
# layer2_train_A.py - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
"""C5P Layer-2 Option A: supervision-width sweep + OUTPUT-FWHM measurement. By Claude 06/22/2026
Same FROZEN-L1 v1 (no-gap) training data as runs/l2_v1 — the ONLY variables are the
supervision bump width (--sigma) and the BCE positive weight (--pos_weight). Question this
answers: can L2's *output* detection blob be made as tight as L1's (~FWHM 2 px), or does it
stay fat regardless of how sharp we supervise (which would mean per-pixel BCE can't
concentrate mass -> motivates the spatial-softmax readout, Option B)?
FWHM = 2.355 * sigma -> L1-like FWHM ~2 px == --sigma ~0.85
Runs on DGX in the NGC torch container:
python layer2_train_A.py --l1 runs/weighted9_pm/model.pt --nseq 128 --T 48 --steps 4000 \
--sigma 1.5 --pos_weight 30 --out runs/l2_A_s15
python layer2_train_A.py ... --sigma 0.85 --pos_weight 30 --out runs/l2_A_s085
"""
import argparse
import numpy as np
import torch
import torch.nn.functional as F
import synth
import layer2_data as L1D
from layer2 import Layer2Net, bump_target, layer2_loss
def build_cache(net, N, nseq, T, G, vmax, snr, dev, seed=0, render_kw=None):
    """Pre-compute nseq frozen-L1 field sequences + truth. By Claude 06/22
    render_kw forwards gaps/band-pass knobs (default gaps=False = no-gap, apples-to-apples). By Claude 06/23"""
    rng = np.random.default_rng(seed)
    rkw = dict(gaps=False); rkw.update(render_kw or {})
    cache = []
    for k in range(nseq):
        frames, pos, vel, present, signal = L1D.render_run(rng, T=T, G=G, vmax=vmax, snr=snr,
                                                           return_signal=True, **rkw)
        seq = L1D.gen_field_sequence(net, frames, pos, G, N, dev)  # [T,3,G,G]
        cache.append((torch.from_numpy(seq).to(dev),
                      torch.from_numpy(np.nan_to_num(pos)).float().to(dev),
                      torch.from_numpy(vel).to(dev),
                      torch.from_numpy(present).to(dev),
                      torch.from_numpy(signal).to(dev)))  # signal: 1=bump rendered, 0=GAP
        if (k + 1) % 32 == 0:
            print(f" cache {k+1}/{nseq}", flush=True)
    return cache


def make_targets(batch, G, dev, sigma=1.5):
    """Stack a minibatch -> (seq, det_t, vel_t, present, pos, signal). By Claude 06/22
    det_t: bump at truth, zeroed on absent frames. vel_t: per-seq [B,2] or per-frame [B,T,2]
    velocity, broadcast to [B,T,2,G,G]."""
    seq = torch.stack([b[0] for b in batch], 0)      # [B,T,3,G,G]
    pos = torch.stack([b[1] for b in batch], 0)      # [B,T,2]
    vel = torch.stack([b[2] for b in batch], 0)      # [B,2] (v1) or [B,T,2] (v2 per-frame)
    present = torch.stack([b[3] for b in batch], 0)  # [B,T]
    signal = torch.stack([b[4] for b in batch], 0)   # [B,T] 1=bump rendered, 0=GAP
    det_t = bump_target(pos, G, sigma=sigma, device=dev)  # [B,T,1,G,G]
    det_t = det_t * present[:, :, None, None, None]       # zero where no target
    B, T = present.shape
    if vel.dim() == 2:  # per-seq -> broadcast over T
        vel_t = vel[:, None, :, None, None].expand(B, T, 2, G, G).contiguous()
    else:               # per-frame (maneuver) [B,T,2]
        vel_t = vel[:, :, :, None, None].expand(B, T, 2, G, G).contiguous()
    return seq, det_t, vel_t, present, pos, signal


def mexhat_loss(det_logit, vel, pos_t, present, vel_t, G,
                sig_core=0.85, sig_wide=1.5, w_center=30.0, w_ring=15.0, w_bg=1.0,
                signal=None, gap_boost=1.0):
    """Center-surround (LoG / Mexican-hat) weighted dense BCE. By Claude on 06/22/2026
    Encourage firing in the FWHM~2 core (sig_core); HARD-suppress the surrounding ring
    (sig_wide minus core = the 2<FWHM<4 moat); light background floor elsewhere. A spatial
    weight map replaces the scalar pos_weight (so it both fixes the imbalance AND carves the
    skirt). We KNOW the true target is FWHM~2, so the ring is known-empty. Velocity MSE on core.
    gap_boost (Andrey 06/23): weight GAP frames (present but L1-starved, signal==0 -> the net must
    COAST from memory) by gap_boost x relative to easy-following frames, so the loss prioritizes
    getting the hard remembered frames right (ground truth tells us exactly which they are)."""
    pres = present[:, :, None, None, None]
    core = bump_target(pos_t, G, sigma=sig_core, device=det_logit.device) * pres  # [B,T,1,G,G] peak 1
    wide = bump_target(pos_t, G, sigma=sig_wide, device=det_logit.device) * pres
    ring = (wide - core).clamp_min(0.0)           # annulus (the moat)
    target = core
    W = w_bg + w_center * core + w_ring * ring    # Mexican-hat per-pixel weight
    if signal is not None and gap_boost != 1.0:   # per-FRAME boost on coast (gap) frames
        gap = (present > 0.5) & (signal < 0.5)    # [B,T] present AND no L1 signal
        fw = 1.0 + (gap_boost - 1.0) * gap.float()  # [B,T]
        W = W * fw[:, :, None, None, None]        # broadcast frame weight onto the map
    bce = F.binary_cross_entropy_with_logits(det_logit, target, reduction='none')
    l_det = (W * bce).mean()
    m = (core[:, :, 0] > 0.3)                     # core disk for velocity supervision
    if m.any():
        l_vel = F.mse_loss(vel[m[:, :, None].expand_as(vel)], vel_t[m[:, :, None].expand_as(vel)])
    else:
        l_vel = vel.sum() * 0.0
    return l_det + 0.3 * l_vel, {"det": float(l_det.detach()),
                                 "vel": float(l_vel.detach() if torch.is_tensor(l_vel) else l_vel)}


def evaluate(net, batch, G, dev, sigma=1.5):
    """Per-seq lock/FP metrics over a held-out batch. By Claude 06/22
    sigma matches the training supervision so the 'core' mask matches the bump. By Claude 06/22"""
    seq, det_t, vel_t, present, _, _ = make_targets(batch, G, dev, sigma=sigma)
    with torch.no_grad():
        det_logit, vel = net(seq)
    p = torch.sigmoid(det_logit)[:, :, 0]  # [B,T,G,G]
    pres = present.bool()
    core = det_t[:, :, 0] > 0.3            # [B,T,G,G] truth bump core
    peak = float(p[core].mean()) if core.any() else 0.0  # s at truth (want ->1)
    absent = ~pres                          # noise-prefix frames
    bg_absent = float(p[absent].amax()) if absent.any() else 0.0  # worst FP on noise (want ->0)
    # background on PRESENT frames, away from the target (clutter-cloud FP)
    bg_present = float((p * (~core) * pres[:, :, None, None]).amax())
    return peak, bg_absent, bg_present
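

# Illustrative sketch (not part of the repository): the per-pixel weight that mexhat_loss builds,
# W = w_bg + w_center*core + w_ring*ring, evaluated along a radial cut through the target.
# It assumes bump_target is a unit-peak Gaussian in the distance r from the truth position;
# mexhat_weight_profile is a hypothetical helper name. Defaults mirror mexhat_loss.

```python
import numpy as np


def mexhat_weight_profile(r, sig_core=0.85, sig_wide=1.5,
                          w_center=30.0, w_ring=15.0, w_bg=1.0):
    """Radial profile of the Mexican-hat BCE weight at distance r (px) from the target."""
    r = np.asarray(r, dtype=np.float64)
    core = np.exp(-r ** 2 / (2.0 * sig_core ** 2))   # BCE target: where the net should fire
    wide = np.exp(-r ** 2 / (2.0 * sig_wide ** 2))
    ring = np.clip(wide - core, 0.0, None)           # annulus between core and wide bump
    weight = w_bg + w_center * core + w_ring * ring
    return weight, core, ring


r = np.arange(0.0, 8.0, 0.5)
W, core, ring = mexhat_weight_profile(r)
for ri, wi, ci, gi in zip(r, W, core, ring):
    print(f"r={ri:4.1f}  core={ci:6.3f}  ring={gi:6.3f}  W={wi:7.3f}")
```

# In the ring the BCE target (core) is already near zero while W is still well above w_bg, so
# firing there is penalized much harder than firing in the far background.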
# --- Output-blob width measurement (the Option A deliverable). By Claude 06/22 -------------
def _half_max_width(line):
    """FWHM (px) of a 1-D profile with its peak at the CENTER index. Linear-interpolated
    half-maximum crossings on each side. Returns NaN if the center is not a real peak."""
    n = len(line); c = n // 2
    pk = float(line[c])
    if pk <= 1e-6:
        return float("nan")
    half = pk / 2.0
    # walk right until below half, interpolate the crossing between j-1 (>= half) and j (< half)
    j = c
    while j < n - 1 and line[j] >= half:
        j += 1
    right = (j - 1) + (line[j - 1] - half) / (line[j - 1] - line[j]) if line[j - 1] > line[j] else float(j - 1)
    # walk left: mirror of the right side, crossing between i (< half) and i+1 (>= half)
    i = c
    while i > 0 and line[i] >= half:
        i -= 1
    left = (i + 1) - (line[i + 1] - half) / (line[i + 1] - line[i]) if line[i + 1] > line[i] else float(i + 1)
    return right - left


def peak_fwhm_at(field, cx, cy, G):
    """Mean FWHM (px) of the blob at toroidal truth (cx,cy): roll truth to grid center so the
    bump can't straddle the seam, then average the half-max widths of the center row & column."""
    f = np.roll(np.roll(field, int(round(G // 2 - cy)), axis=0), int(round(G // 2 - cx)), axis=1)
    c = G // 2
    wx = _half_max_width(f[c, :])
    wy = _half_max_width(f[:, c])
    return np.nanmean([wx, wy])
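

# Illustrative sketch (not part of the repository): a standalone sanity check of the FWHM = 2.355*sigma
# rule from the module docstring. It re-implements a half-maximum walk with linear interpolation
# (fwhm_1d is a hypothetical name, and it skips the NaN/flat-profile guards) and measures a
# sampled Gaussian for the two sigmas used in this file.

```python
import numpy as np


def fwhm_1d(line):
    """FWHM (px) of a 1-D profile peaked at its center index, via linear interpolation."""
    n = len(line)
    c = n // 2
    half = line[c] / 2.0
    j = c
    while j < n - 1 and line[j] >= half:      # first sample below half, to the right
        j += 1
    right = (j - 1) + (line[j - 1] - half) / (line[j - 1] - line[j])
    i = c
    while i > 0 and line[i] >= half:          # first sample below half, to the left
        i -= 1
    left = (i + 1) - (line[i + 1] - half) / (line[i + 1] - line[i])
    return right - left


x = np.arange(33) - 16                         # peak at the center index 16
results = {}
for sigma in (0.85, 1.5):
    g = np.exp(-x ** 2 / (2.0 * sigma ** 2))
    expected = 2.0 * np.sqrt(2.0 * np.log(2.0)) * sigma   # exact Gaussian FWHM
    results[sigma] = (fwhm_1d(g), expected)
    print(f"sigma={sigma}: measured {results[sigma][0]:.3f} px, expected {expected:.3f} px")
```

# On a 1 px grid the interpolated value differs slightly from the analytic one, since the
# Gaussian is not linear between samples.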


def main():
    ap = argparse.ArgumentParser()
    ap.add_argument("--l1", default="runs/weighted9_pm/model.pt")
    ap.add_argument("--nseq", type=int, default=128); ap.add_argument("--T", type=int, default=48)
    ap.add_argument("--G", type=int, default=32); ap.add_argument("--vmax", type=float, default=1.4)
    ap.add_argument("--snr", type=float, default=6.0); ap.add_argument("--steps", type=int, default=4000)
    ap.add_argument("--bs", type=int, default=8); ap.add_argument("--lr", type=float, default=2e-3)
    ap.add_argument("--ch", type=int, default=24); ap.add_argument("--out", default="runs/l2_A")
    # Option A knobs (this is the whole experiment):
    ap.add_argument("--sigma", type=float, default=1.5, help="supervision bump sigma; FWHM=2.355*sigma (try 0.85 for ~2px)")
    ap.add_argument("--pos_weight", type=float, default=30.0, help="BCE positive weight (try lower AFTER sigma)")
    # Mexican-hat (LoG center-surround) weighting — replaces pos_weight when --mexhat is set.
    ap.add_argument("--mexhat", action="store_true", help="center-surround weighted BCE (encourage core, suppress ring)")
    ap.add_argument("--mh_center", type=float, default=30.0); ap.add_argument("--mh_ring", type=float, default=15.0)
    ap.add_argument("--mh_bg", type=float, default=1.0)
    ap.add_argument("--mh_sig_core", type=float, default=0.85); ap.add_argument("--mh_sig_wide", type=float, default=1.5)
    # gap-envelope knobs (default off = no-gap, apples-to-apples with the sharpening test). By Claude 06/23
    ap.add_argument("--gaps", action="store_true", help="train on band-pass amplitude gaps (coast challenge)")
    ap.add_argument("--bp_lo", type=int, default=6); ap.add_argument("--bp_hi", type=int, default=18)
    ap.add_argument("--duty_offset", type=float, default=0.2); ap.add_argument("--starter_len", type=int, default=8)
    ap.add_argument("--gap_boost", type=float, default=1.0, help="weight GAP (coast) frames Nx vs easy following (Andrey: 30-50)")
    a = ap.parse_args()
    import os; os.makedirs(a.out, exist_ok=True)
    dev = "cuda" if torch.cuda.is_available() else "cpu"
    render_kw = dict(gaps=a.gaps, bp_lo=a.bp_lo, bp_hi=a.bp_hi, duty_offset=a.duty_offset, starter_len=a.starter_len)
    net1, N, _ = L1D._load_l1(a.l1, dev)
    print(f"L1 {a.l1} N={N}; mexhat={a.mexhat} gaps={a.gaps} (bp[{a.bp_lo},{a.bp_hi}] off={a.duty_offset}); "
          f"precomputing {a.nseq} seqs (T={a.T}, G={a.G})...", flush=True)
    cache = build_cache(net1, N, a.nseq, a.T, a.G, a.vmax, a.snr, dev, render_kw=render_kw)
    nval = max(8, a.nseq // 8); val = cache[:nval]; train = cache[nval:]
    print(f"cache: {len(train)} train / {len(val)} val sequences", flush=True)
    net = Layer2Net(ch_in=3, ch_hidden=a.ch, grid=a.G, vmax=a.vmax).to(dev)
    opt = torch.optim.Adam(net.parameters(), a.lr)
    nparams = sum(p.numel() for p in net.parameters())
    print(f"Layer2Net {nparams} params; training {a.steps} steps bs={a.bs}", flush=True)
    rng = np.random.default_rng(1)
    for step in range(1, a.steps + 1):
        idx = rng.integers(0, len(train), a.bs)
        seq, det_t, vel_t, present, pos_t, signal = make_targets([train[i] for i in idx], a.G, dev, sigma=a.sigma)
        det_logit, vel = net(seq)
        if a.mexhat:
            loss, comp = mexhat_loss(det_logit, vel, pos_t, present, vel_t, a.G,
                                     sig_core=a.mh_sig_core, sig_wide=a.mh_sig_wide,
                                     w_center=a.mh_center, w_ring=a.mh_ring, w_bg=a.mh_bg,
                                     signal=signal, gap_boost=a.gap_boost)
        else:
            loss, comp = layer2_loss(det_logit, vel, det_t, vel_t, pos_weight=a.pos_weight)
        opt.zero_grad(); loss.backward(); opt.step()
        if step % 250 == 0 or step == 1:
            peak, bga, bgp = evaluate(net, val, a.G, dev, sigma=a.sigma)
            print(f"step {step:5d} det {comp['det']:.4f} vel {comp['vel']:.4f} | "
                  f"val: s@truth {peak:.3f} max-FP(noise) {bga:.3f} max-FP(clutter) {bgp:.3f}", flush=True)
    torch.save({"model": net.state_dict(), "args": vars(a)}, f"{a.out}/model.pt")
    print(f"saved {a.out}/model.pt", flush=True)
    # eval viz + FWHM: run trained L2 on a fresh sequence, dump tiffs, measure output blob width
    rng2 = np.random.default_rng(777)
    frames, pos, vel, present = L1D.render_run(rng2, T=120, G=a.G, vmax=a.vmax, snr=a.snr)
    seq = L1D.gen_field_sequence(net1, frames, pos, a.G, N, dev)
    with torch.no_grad():
        det_logit, velo = net(torch.from_numpy(seq[None]).to(dev))
    l2s = torch.sigmoid(det_logit)[0, :, 0].cpu().numpy()  # [T,G,G] L2 detection
    truth = np.zeros((120, a.G, a.G), np.float32)
    for t in range(120):
        if present[t]:
            truth[t] = L1D.halfcos_bump_torus(pos[t, 0], pos[t, 1], a.G)
    synth.save_tiff_stack(seq[:, 0], f"{a.out}/L1_s.tif")   # L1 input field
    synth.save_tiff_stack(l2s, f"{a.out}/L2_det.tif")       # L2 track-before-detect output
    synth.save_tiff_stack(truth, f"{a.out}/truth.tif")
    # FWHM + argmax pos-MAE at the target on LOCKED frames (a shared metric across all rungs).
    fw_l1, fw_l2, perr = [], [], []
    for t in range(120):
        if not present[t]:
            continue
        fw_l1.append(peak_fwhm_at(seq[t, 0], pos[t, 0], pos[t, 1], a.G))
        if l2s[t].max() > 0.5:  # only where L2 actually locked
            fw_l2.append(peak_fwhm_at(l2s[t], pos[t, 0], pos[t, 1], a.G))
            pj = int(np.argmax(l2s[t])); pyk, pxk = divmod(pj, a.G)  # L2 peak cell
            dx = (pxk - pos[t, 0] + a.G / 2) % a.G - a.G / 2
            dy = (pyk - pos[t, 1] + a.G / 2) % a.G - a.G / 2
            perr.append(float(np.hypot(dx, dy)))
    mu_l1 = float(np.nanmean(fw_l1)) if fw_l1 else float("nan")
    mu_l2 = float(np.nanmean(fw_l2)) if fw_l2 else float("nan")
    mu_perr = float(np.mean(perr)) if perr else float("nan")
    mode = "MEXICAN-HAT" if a.mexhat else f"plain BCE pos_weight={a.pos_weight}"
    print(f"wrote {a.out}/{{L1_s,L2_det,truth}}.tif (120 pages)", flush=True)
    print(f"=== dense result ({mode}, sigma_core={a.mh_sig_core if a.mexhat else a.sigma}) ===", flush=True)
    print(f" output blob FWHM @ target: L1_in ~ {mu_l1:.2f} px L2_out ~ {mu_l2:.2f} px", flush=True)
    print(f" pos-MAE(argmax) ~ {mu_perr:.2f} px (L2 locked on {len(fw_l2)}/{len(fw_l1)} present frames)", flush=True)


if __name__ == "__main__":
    main()
# layer2_train_P.py - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
"""C5P Layer-2 PARAMETRIC training: position+presence head, known-width output. By Claude 06/22/2026
Trains Layer2NetP (layer2p.py): softmax "where" -> sub-pixel position + scalar presence + expected
velocity. No dense sigmoid bump, no pos_weight on a spatial map -> the wrong-skirt failure mode is
impossible and output FWHM = 2.355*sigma_where BY CONSTRUCTION (Andrey 06/22: known width 2.0).
First run is a SHAKEDOWN on the pristine v1 (no-gap) cache already on the DGX -- isolate "does the
new readout lock + localize + separate presence" before adding gap-coast difficulty. The gap
(memory) run comes next, on the v2 generator (per-frame velocity), once synced.
Run on DGX (NGC torch container):
python layer2_train_P.py --l1 runs/weighted9_pm/model.pt --nseq 128 --T 48 --steps 4000 \
--out runs/l2_P_v1shake
"""
import argparse
import numpy as np
import torch
import synth
import layer2_data as L1D
from layer2p import Layer2NetP, layer2p_loss, render_blob, _wrap
def build_cache(net, N, nseq, T, G, vmax, snr, dev, seed=0, render_kw=None):
    """Pre-compute nseq frozen-L1 field sequences + truth. By Claude 06/22
    render_kw forwards the gap-envelope / realism knobs to render_run."""
    rng = np.random.default_rng(seed)
    render_kw = render_kw or {}
    cache = []
    for k in range(nseq):
        frames, pos, vel, present = L1D.render_run(rng, T=T, G=G, vmax=vmax, snr=snr, **render_kw)
        seq = L1D.gen_field_sequence(net, frames, pos, G, N, dev)
        cache.append((torch.from_numpy(seq).to(dev),
                      torch.from_numpy(np.nan_to_num(pos)).float().to(dev),  # [T,2] truth pos
                      torch.from_numpy(vel).to(dev),                         # [2] per-seq velocity
                      torch.from_numpy(present).to(dev)))                    # [T]
        if (k + 1) % 32 == 0:
            print(f" cache {k+1}/{nseq}", flush=True)
    return cache


def make_targets(batch, G, dev):
    """Stack -> (seq, pos_t[B,T,2], vel_t[B,T,2], present[B,T]). By Claude 06/22
    v1 velocity is per-seq [2] -> broadcast across T; v2 per-frame velocity [T,2] is passed through."""
    seq = torch.stack([b[0] for b in batch], 0)      # [B,T,3,G,G]
    pos_t = torch.stack([b[1] for b in batch], 0)    # [B,T,2]
    vel = torch.stack([b[2] for b in batch], 0)      # [B,2] (v1) or [B,T,2] (v2 gaps)
    present = torch.stack([b[3] for b in batch], 0)  # [B,T]
    B, T = present.shape
    if vel.dim() == 2:  # per-seq -> broadcast over T
        vel_t = vel[:, None, :].expand(B, T, 2).contiguous()
    else:               # per-frame (maneuver) -> as-is
        vel_t = vel.contiguous()
    return seq, pos_t, vel_t, present


def evaluate(net, batch, G, dev):
    """Position MAE (px, present frames) + presence separation (on vs off). By Claude 06/22"""
    seq, pos_t, vel_t, present = make_targets(batch, G, dev)
    with torch.no_grad():
        logP, pos, pres_logit, vel = net(seq)
    pres_p = torch.sigmoid(pres_logit)      # [B,T]
    m = present.bool()
    dxy = _wrap(pos - pos_t, G)             # [B,T,2] toroidal error
    perr = torch.sqrt((dxy ** 2).sum(-1))   # [B,T] distance, cells
    pos_mae = float(perr[m].mean()) if m.any() else float("nan")        # localize accuracy (want small)
    pp_on = float(pres_p[m].mean()) if m.any() else float("nan")        # presence on present (want ->1)
    pp_off = float(pres_p[~m].mean()) if (~m).any() else float("nan")   # presence on absent (want ->0)
    return pos_mae, pp_on, pp_off


def main():
    ap = argparse.ArgumentParser()
    ap.add_argument("--l1", default="runs/weighted9_pm/model.pt")
    ap.add_argument("--nseq", type=int, default=128); ap.add_argument("--T", type=int, default=48)
    ap.add_argument("--G", type=int, default=32); ap.add_argument("--vmax", type=float, default=1.4)
    ap.add_argument("--snr", type=float, default=6.0); ap.add_argument("--steps", type=int, default=4000)
    ap.add_argument("--bs", type=int, default=8); ap.add_argument("--lr", type=float, default=2e-3)
    ap.add_argument("--ch", type=int, default=24); ap.add_argument("--out", default="runs/l2_P")
    # parametric-loss knobs (no spatial pos_weight by design):
    ap.add_argument("--sigma_where", type=float, default=0.85, help="where-target sigma; render FWHM=2.355*it (~2px)")
    ap.add_argument("--w_pos", type=float, default=0.2)
    ap.add_argument("--w_pres", type=float, default=1.0)
    ap.add_argument("--w_vel", type=float, default=0.3)
    ap.add_argument("--pres_pos_weight", type=float, default=2.0, help="SCALAR presence BCE weight (not spatial)")
    # gap-envelope knobs (forwarded to render_run). Without --gaps it's the v1-like no-gap shakedown.
    ap.add_argument("--gaps", action="store_true", help="band-pass amplitude gaps (the coast/memory test)")
    ap.add_argument("--bp_lo", type=int, default=3)
    ap.add_argument("--bp_hi", type=int, default=9, help="HF cutoff ~3-5*bp_lo")
    ap.add_argument("--duty_offset", type=float, default=-0.3, help="more negative => more/longer gaps")
    ap.add_argument("--starter_len", type=int, default=8, help="clean full-SNR acquire (yesterday locked +6)")
    a = ap.parse_args()
    import os; os.makedirs(a.out, exist_ok=True)
    dev = "cuda" if torch.cuda.is_available() else "cpu"
    render_kw = dict(gaps=a.gaps, bp_lo=a.bp_lo, bp_hi=a.bp_hi,
                     duty_offset=a.duty_offset, starter_len=a.starter_len)
    net1, N, _ = L1D._load_l1(a.l1, dev)
    print(f"L1 {a.l1} N={N}; PARAMETRIC head, render FWHM={2.355*a.sigma_where:.2f}px; "
          f"gaps={a.gaps} bp[{a.bp_lo},{a.bp_hi}] off={a.duty_offset}; "
          f"precomputing {a.nseq} seqs (T={a.T}, G={a.G})...", flush=True)
    cache = build_cache(net1, N, a.nseq, a.T, a.G, a.vmax, a.snr, dev, render_kw=render_kw)
    nval = max(8, a.nseq // 8); val = cache[:nval]; train = cache[nval:]
    print(f"cache: {len(train)} train / {len(val)} val sequences", flush=True)
    net = Layer2NetP(ch_in=3, ch_hidden=a.ch, grid=a.G, vmax=a.vmax).to(dev)
    opt = torch.optim.Adam(net.parameters(), a.lr)
    nparams = sum(p.numel() for p in net.parameters())
    print(f"Layer2NetP {nparams} params; training {a.steps} steps bs={a.bs}", flush=True)
    rng = np.random.default_rng(1)
    for step in range(1, a.steps + 1):
        idx = rng.integers(0, len(train), a.bs)
        seq, pos_t, vel_t, present = make_targets([train[i] for i in idx], a.G, dev)
        logP, pos, pres_logit, vel = net(seq)
        loss, comp = layer2p_loss(logP, pos, pres_logit, vel, pos_t, present, vel_t, a.G,
                                  sigma_where=a.sigma_where, w_pos=a.w_pos, w_pres=a.w_pres,
                                  w_vel=a.w_vel, pres_pos_weight=a.pres_pos_weight)
        opt.zero_grad(); loss.backward(); opt.step()
        if step % 250 == 0 or step == 1:
            mae, pon, poff = evaluate(net, val, a.G, dev)
            print(f"step {step:5d} where {comp['where']:.3f} pos {comp['pos']:.3f} "
                  f"pres {comp['pres']:.3f} vel {comp['vel']:.3f} | "
                  f"val: pos-MAE {mae:.2f}px presence on/off {pon:.2f}/{poff:.2f}", flush=True)
    torch.save({"model": net.state_dict(), "args": vars(a)}, f"{a.out}/model.pt")
    print(f"saved {a.out}/model.pt", flush=True)
    # eval viz: run on a fresh sequence, RENDER the known-width blob, dump tiffs + metrics.
    rng2 = np.random.default_rng(777)
    frames, pos, vel, present = L1D.render_run(rng2, T=120, G=a.G, vmax=a.vmax, snr=a.snr)
    seq = L1D.gen_field_sequence(net1, frames, pos, a.G, N, dev)
    with torch.no_grad():
        logP, pp, pres_logit, velo = net(torch.from_numpy(seq[None]).to(dev))
    pos_pred = pp[0].cpu().numpy()                        # [T,2]
    pres_p = torch.sigmoid(pres_logit)[0].cpu().numpy()   # [T]
    l2render = render_blob(pos_pred, pres_p, a.G, sigma=a.sigma_where)  # [T,G,G] FWHM 2 by construction
    truth = np.zeros((120, a.G, a.G), np.float32)
    for t in range(120):
        if present[t]:
            truth[t] = L1D.halfcos_bump_torus(pos[t, 0], pos[t, 1], a.G)
    synth.save_tiff_stack(seq[:, 0], f"{a.out}/L1_s.tif")
    synth.save_tiff_stack(l2render.astype(np.float32), f"{a.out}/L2P_render.tif")
    synth.save_tiff_stack(truth, f"{a.out}/truth.tif")
    # metrics: position MAE on present frames where the net says present (locked), presence stats.
    perr, locked = [], 0
    for t in range(120):
        if not present[t]:
            continue
        if pres_p[t] > 0.5:
            locked += 1
            d = _wrap(torch.tensor(pos_pred[t] - np.array([pos[t, 0], pos[t, 1]])), a.G).numpy()
            perr.append(float(np.sqrt((d ** 2).sum())))
    mae = float(np.mean(perr)) if perr else float("nan")
    npres = int(present.sum())
    print(f"wrote {a.out}/{{L1_s,L2P_render,truth}}.tif (120 pages)", flush=True)
    print(f"=== Parametric result (render FWHM {2.355*a.sigma_where:.2f}px BY CONSTRUCTION) ===", flush=True)
    print(f" position MAE @ locked frames: {mae:.2f} px (locked {locked}/{npres} present frames)", flush=True)
    print(f" presence prob present~{pres_p[present.astype(bool)].mean():.2f} "
          f"absent~{pres_p[~present.astype(bool)].mean():.2f}", flush=True)


if __name__ == "__main__":
    main()
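# Illustrative sketch (not part of the repository): the toroidal position error used by the
# metrics above. _wrap is imported from layer2p.py and is not shown in this file, so the body
# below is an assumption modelled on the explicit (d + G/2) % G - G/2 expression in
# layer2_train_A.py. wrap_delta and toroidal_distance are hypothetical names.

```python
import numpy as np


def wrap_delta(d, G):
    """Map a coordinate difference onto the shortest signed offset on a size-G torus."""
    return (np.asarray(d, dtype=np.float64) + G / 2.0) % G - G / 2.0


def toroidal_distance(a, b, G):
    """Euclidean distance between (x, y) points a and b with wrap-around on both axes."""
    dx, dy = wrap_delta(np.asarray(a, float) - np.asarray(b, float), G)
    return float(np.hypot(dx, dy))


G = 32
# Prediction just left of the seam, truth just right of it: 1 px apart, not 31.
d_seam = toroidal_distance([31.5, 0.0], [0.5, 0.0], G)
d_plain = float(np.hypot(31.5 - 0.5, 0.0))
print(d_seam, d_plain)
```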
# layer2p.py - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
"""C5P Layer-2 PARAMETRIC head (position + presence), known-width readout. By Claude on 06/22/2026
Endpoint of the "we KNOW the target is FWHM 2.0" decision (Andrey 06/22). Instead of the dense
sigmoid-bump detector (Layer2Net), the net emits, per frame, ONLY:
- a spatial-SOFTMAX "where" map over the GxG torus -> sub-pixel position (toroidal centroid),
- a scalar "whether" presence logit,
- an expected velocity (softmax-weighted Vx,Vy).
The detection image, if needed, is RENDERED as a fixed FWHM-2 blob at the predicted position,
scaled by presence -> the output width is correct BY CONSTRUCTION; a "wrong skirt" cannot occur.
Why softmax (not sigmoid/BCE): a softmax map's total mass is FIXED at 1, so the net cannot lower
its loss by spreading into a skirt -- it MUST concentrate. That removes BOTH the fat-skirt failure
of the dense head AND the rare-positive class imbalance that forced pos_weight=30 (Andrey 06/22:
"it does not spread wide, so ratio of positives is not a concern"). Single-target by construction.
Reuses ConvGRUCellTorus + the toroidal Gaussian from layer2.py; the deployed dense Layer2Net is
left untouched (the inference server keeps loading it + runs/l2_v1). By Claude on 06/22/2026.
"""
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from layer2 import ConvGRUCellTorus, bump_target
def _torus_centroid(P, G):
    """Sub-pixel toroidal centroid of a probability map. By Claude on 06/22/2026
    P: [B, G, G] (>=0, sums to 1 over the GxG grid). Returns pos [B, 2] = (x, y) in cells, in
    [0, G). Uses the CIRCULAR mean (map each index to an angle, average sin/cos, atan2 back) so a
    peak straddling the wrap seam averages to the seam -- NOT to the middle of the grid as a plain
    weighted mean would. x = last axis (columns), y = first spatial axis (rows)."""
    dev = P.device
    ang = (2.0 * np.pi / G) * torch.arange(G, device=dev, dtype=P.dtype)  # [G] cell index -> angle
    cos_x = torch.cos(ang)[None, None, :]  # [1,1,G] over columns (x)
    sin_x = torch.sin(ang)[None, None, :]
    cos_y = torch.cos(ang)[None, :, None]  # [1,G,1] over rows (y)
    sin_y = torch.sin(ang)[None, :, None]
    Cx = (P * cos_x).sum(dim=(-1, -2)); Sx = (P * sin_x).sum(dim=(-1, -2))  # [B] each
    Cy = (P * cos_y).sum(dim=(-1, -2)); Sy = (P * sin_y).sum(dim=(-1, -2))
    x = torch.atan2(Sx, Cx) % (2.0 * np.pi) * (G / (2.0 * np.pi))  # [B] in [0,G)
    y = torch.atan2(Sy, Cy) % (2.0 * np.pi) * (G / (2.0 * np.pi))
    return torch.stack([x, y], dim=-1)  # [B, 2]
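The difference between the circular mean and a plain weighted mean is easiest to see on a 1-D ring. The sketch below is a stdlib-only illustration, not the module's code: `ring_centroid` and `plain_centroid` are hypothetical helper names, and the 8-cell ring with half the mass in cell 7 and half in cell 0 is a made-up example.

```python
import math

def ring_centroid(p):
    """Circular mean of a 1-D probability list on a ring of len(p) cells."""
    G = len(p)
    c = sum(w * math.cos(2.0 * math.pi * i / G) for i, w in enumerate(p))
    s = sum(w * math.sin(2.0 * math.pi * i / G) for i, w in enumerate(p))
    # atan2 returns (-pi, pi]; the modulo maps it to [0, 2*pi) before scaling to cells.
    return (math.atan2(s, c) % (2.0 * math.pi)) * G / (2.0 * math.pi)

def plain_centroid(p):
    """Ordinary weighted mean of the cell index (ignores the wrap)."""
    return sum(i * w for i, w in enumerate(p))

# Peak straddling the wrap seam: half the mass in the last cell, half in the first.
p = [0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5]
seam = ring_centroid(p)     # lands between cell 7 and cell 0
middle = plain_centroid(p)  # lands in the middle of the ring
print(seam, middle)
```

The circular mean places the centroid at 7.5 (on the seam), while the plain mean reports 3.5, the point on the ring farthest from the actual mass.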
class Layer2NetP(nn.Module):
    """Recurrent track-before-detect with a PARAMETRIC (position + presence) readout. By Claude 06/22
    forward(seq) -> per frame: log-softmax where-map [B,T,G,G], position [B,T,2] (cells, sub-pixel),
    presence logit [B,T], velocity [B,T,2] (px/level-frame). Same ConvGRU recurrence as Layer2Net.
    """
    def __init__(self, ch_in=3, ch_hidden=24, grid=32, vmax=1.4, k=3):
        super().__init__()
        self.ch_hidden = ch_hidden
        self.grid = grid
        self.vmax = vmax
        self.cell = ConvGRUCellTorus(ch_in, ch_hidden, k=k)
        self.score_head = nn.Conv2d(ch_hidden, 1, 1)  # -> spatial logits, softmaxed to "where"
        self.vel_head = nn.Conv2d(ch_hidden, 2, 1)    # -> per-cell velocity field
        self.pres_head = nn.Linear(ch_hidden, 1)      # -> scalar "whether" from pooled hidden
    def init_hidden(self, B, device, dtype):
        return torch.zeros(B, self.ch_hidden, self.grid, self.grid, device=device, dtype=dtype)
    def decode(self, h):
        # h: [B, Ch, G, G]
        B, _, G, _ = h.shape
        score = self.score_head(h).view(B, G * G)             # [B, G*G] spatial logits
        logP = F.log_softmax(score, dim=1).view(B, G, G)      # [B, G, G] log "where" map (exp sums to 1)
        P = logP.exp()                                        # [B, G, G] probability map
        pos = _torus_centroid(P, G)                           # [B, 2] sub-pixel (x,y) in cells
        pooled = h.mean(dim=(-1, -2))                         # [B, Ch] global context for presence
        pres_logit = self.pres_head(pooled)[:, 0]             # [B] "is a target present this frame"
        vel_field = self.vmax * torch.tanh(self.vel_head(h))  # [B, 2, G, G] bounded velocity field
        vel = (P[:, None] * vel_field).sum(dim=(-1, -2))      # [B, 2] expected velocity under "where"
        return logP, pos, pres_logit, vel
    def forward(self, seq, h=None):
        # seq: [B, T, Cin, G, G]
        B, T = seq.shape[0], seq.shape[1]
        if h is None:
            h = self.init_hidden(B, seq.device, seq.dtype)
        logPs, poss, press, vels = [], [], [], []
        for t in range(T):  # BPTT unrolls this loop
            h = self.cell(seq[:, t], h)
            logP, pos, pres, vel = self.decode(h)
            logPs.append(logP); poss.append(pos); press.append(pres); vels.append(vel)
        return (torch.stack(logPs, 1),  # [B,T,G,G] log where-map
                torch.stack(poss, 1),   # [B,T,2] position (cells)
                torch.stack(press, 1),  # [B,T] presence logit
                torch.stack(vels, 1))   # [B,T,2] velocity
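The claim that a softmax where-map "cannot lower its loss by spreading" can be checked numerically on a 1-D ring. This is a hedged stdlib sketch, not the training code: `ring_gaussian` and `cross_entropy` are hypothetical helpers, and the ring size, target center and the wider sigma of 3.0 are arbitrary illustration values. The truth target uses sigma 0.85, as in the loss defaults.

```python
import math

def ring_gaussian(center, G, sigma):
    """Normalized Gaussian on a ring of G cells, using the shortest wrapped distance."""
    w = []
    for i in range(G):
        d = (i - center + G / 2.0) % G - G / 2.0
        w.append(math.exp(-0.5 * (d / sigma) ** 2))
    total = sum(w)
    return [v / total for v in w]

def cross_entropy(q, p):
    """Soft cross-entropy -sum(q * log p); p is strictly positive here."""
    return -sum(qi * math.log(pi) for qi, pi in zip(q, p))

G = 32
q = ring_gaussian(10.3, G, sigma=0.85)                          # sharp truth target
ce_sharp = cross_entropy(q, q)                                  # prediction matches the target
ce_wide = cross_entropy(q, ring_gaussian(10.3, G, sigma=3.0))   # same center, fat skirt
ce_flat = cross_entropy(q, [1.0 / G] * G)                       # fully spread map
print(ce_sharp, ce_wide, ce_flat)
```

Because the predicted map always sums to 1, any mass moved into a skirt is taken from the peak, so the cross-entropy grows as the prediction widens; the flat map costs exactly log(G).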
def _wrap(d, G):
    """Toroidal signed difference into [-G/2, G/2). By Claude on 06/22/2026"""
    return (d + G / 2.0) % G - G / 2.0
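A quick check of the wrap arithmetic on plain floats shows which end of the interval is closed. The standalone `wrap` below repeats the same expression for scalars; the grid size 32 is just the default used elsewhere in this file.

```python
def wrap(d, G):
    """Signed shortest difference on a ring of G cells."""
    return (d + G / 2.0) % G - G / 2.0

G = 32
print(wrap(31.0, G))    # one cell short of a full turn -> -1.0
print(wrap(-31.0, G))   # -> 1.0
print(wrap(16.0, G))    # exactly half a turn -> -16.0
print(wrap(-16.0, G))   # -> -16.0
```

With a positive modulus the remainder lies in [0, G), so the result lies in [-G/2, G/2): a difference of exactly half the grid maps to -G/2, never +G/2.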
def layer2p_loss(logP, pos, pres_logit, vel, pos_t, present, vel_t, G,
                 sigma_where=0.85, w_pos=0.2, w_pres=1.0, w_vel=0.3, pres_pos_weight=2.0):
    """Parametric loss. By Claude on 06/22/2026
      logP       [B,T,G,G] log-softmax where-map     pos     [B,T,2] predicted (x,y) cells
      pres_logit [B,T]     presence logit            vel     [B,T,2] predicted velocity
      pos_t      [B,T,2]   truth position (cells)    present [B,T]   1=target present
      vel_t      [B,T,2]   truth velocity
    Terms (NO pos_weight on a spatial map -- softmax already fixes total mass, so no skirt / no
    class imbalance; the only scalar weight is on the per-frame presence BCE):
      - WHERE : soft cross-entropy of the softmax map vs a SHARP toroidal-Gaussian truth target
                Q (sigma_where ~ FWHM 2). Concentrates one peak at truth; cannot win by spreading.
      - POS   : toroidal MSE on the sub-pixel centroid (sub-cell refinement of WHERE).
      - PRES  : BCE on the per-frame presence scalar (present vs absent/gap frames).
      - VEL   : MSE of expected velocity, on present frames only.
    WHERE/POS/VEL are masked to present frames; PRES is supervised every frame."""
    B, T = present.shape
    pres = present.float()
    m = present.bool()
    # WHERE: build a sharp normalized target distribution Q at truth, cross-entropy = -sum Q*logP.
    Q = bump_target(pos_t, G, sigma=sigma_where, device=logP.device)[:, :, 0]  # [B,T,G,G] (peak 1)
    Q = Q / Q.sum(dim=(-1, -2), keepdim=True).clamp_min(1e-8)                  # normalize -> sums to 1
    ce = -(Q * logP).sum(dim=(-1, -2))                                         # [B,T] cross-entropy
    l_where = ce[m].mean() if m.any() else logP.sum() * 0.0
    # POS: sub-pixel toroidal MSE on present frames.
    dxy = _wrap(pos - pos_t, G)  # [B,T,2]
    l_pos = (dxy[m] ** 2).mean() if m.any() else pos.sum() * 0.0
    # PRES: per-frame presence BCE (scalar imbalance handled by a small pos_weight, NOT spatial).
    pw = torch.tensor(pres_pos_weight, device=pres_logit.device)
    l_pres = F.binary_cross_entropy_with_logits(pres_logit, pres, pos_weight=pw)
    # VEL: expected-velocity MSE on present frames.
    l_vel = F.mse_loss(vel[m], vel_t[m]) if m.any() else vel.sum() * 0.0
    total = l_where + w_pos * l_pos + w_pres * l_pres + w_vel * l_vel
    return total, {"where": float(l_where.detach()), "pos": float(l_pos.detach()),
                   "pres": float(l_pres.detach()), "vel": float(l_vel.detach())}
def render_blob(pos, pres, G, sigma=0.85):
    """Render the KNOWN-width detection image from (position, presence). By Claude on 06/22/2026
    pos [T,2] cells, pres [T] in [0,1] -> [T,G,G] = pres * toroidal Gaussian(FWHM=2.355*sigma) at pos.
    Width is fixed here, so the output FWHM is correct by construction."""
    T = pos.shape[0]
    p = torch.as_tensor(pos, dtype=torch.float32).view(1, T, 2)
    g = bump_target(p, G, sigma=sigma, device="cpu")[0, :, 0].numpy()  # [T,G,G] peak 1
    return g * np.asarray(pres, dtype=np.float32)[:, None, None]
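The link between the default `sigma=0.85` and the "FWHM 2" target is the standard Gaussian relation FWHM = 2*sqrt(2*ln 2)*sigma. A small stdlib check (the helper names `fwhm` and `gaussian` are illustrative only):

```python
import math

FWHM_PER_SIGMA = 2.0 * math.sqrt(2.0 * math.log(2.0))  # ~2.3548

def fwhm(sigma):
    """Full width at half maximum of a Gaussian with standard deviation sigma."""
    return FWHM_PER_SIGMA * sigma

def gaussian(r, sigma):
    """Unit-peak Gaussian profile at radius r."""
    return math.exp(-0.5 * (r / sigma) ** 2)

sigma = 0.85
width = fwhm(sigma)                  # about 2.0 cells
half = gaussian(width / 2.0, sigma)  # profile value at half the FWHM from the center
print(width, half)
```

The profile drops to one half of its peak at `width / 2` from the center, which is what makes a blob rendered with sigma 0.85 approximately two cells wide.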
# make_testvec.py - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
"""Save a fixed (input, raw-output) pair for Java-vs-PyTorch verification. # By Claude on 06/13/2026
python make_testvec.py /work/runs/weighted/model.pt /work/runs/weighted/testvec"""
import sys, numpy as np, torch
import synth
from model import RawFCN
ck = torch.load(sys.argv[1], map_location="cpu"); a = ck["args"]
N = a.get("nframes", 8); P = a.get("patch", 24); vr = a.get("vel_radius", 5)
# Build the net with the checkpoint's patch size so load_state_dict also matches when patch != 24.
m = RawFCN(n_frames=N, vel_radius=vr, patch=P); m.load_state_dict(ck["model"]); m.eval()
rng = np.random.default_rng(999)
f, lab = synth.generate_sample(rng, N=N, H=P, W=P, snr=6.0, place="center")
with torch.no_grad():
    out = m(torch.from_numpy(f[None])).reshape(-1).numpy()  # raw network output ([124] for the default grid head)
f.astype('<f4').tofile(sys.argv[2] + "_in.bin") # [N,H,W] row-major LE float32
out.astype('<f4').tofile(sys.argv[2] + "_out.bin") # [124] LE float32
print(f"testvec N={N} P={P} outlen={out.size} true vx={lab['vx']:+.3f} vy={lab['vy']:+.3f} "
f"det_logit={out[0]:.4f}")
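The two `.bin` files are headerless little-endian float32 dumps, so a reader has to know the shape in advance. The sketch below shows one way to round-trip such a file with the standard library only; it is not the Java verifier. `write_f32le`, `read_f32le`, `at`, the toy shape and the temporary file name are all assumptions made for illustration.

```python
import os
import struct
import tempfile

def write_f32le(path, values):
    """Write a flat list of floats as raw little-endian float32 (no header)."""
    with open(path, "wb") as fh:
        fh.write(struct.pack("<%df" % len(values), *values))

def read_f32le(path):
    """Read a raw little-endian float32 file back into a flat list."""
    size = os.path.getsize(path)
    if size % 4:
        raise ValueError("not a whole number of float32 values: %d bytes" % size)
    with open(path, "rb") as fh:
        return list(struct.unpack("<%df" % (size // 4), fh.read()))

N, H, W = 2, 3, 3  # toy shape; the real input file is [N, H, W] with the checkpoint's values
vals = [0.25 * i for i in range(N * H * W)]
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy_in.bin")
    write_f32le(path, vals)
    flat = read_f32le(path)

def at(flat, n, y, x):
    """Row-major [N, H, W] lookup into the flat list."""
    return flat[(n * H + y) * W + x]

print(len(flat), at(flat, 1, 2, 1))
```

A reader in another language needs the same two facts: little-endian 4-byte floats and row-major `[N, H, W]` ordering.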
# model.py - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
"""
All-convolutional (FCN) target estimator for the C5P DNN experiment. # By Claude on 06/13/2026
Per Andrey's minimal-segment / siamese-reuse insight: train the per-location operator on
small patches (one receptive field), deploy the SAME weights slid over the full frame.
So there are NO fully-connected layers - the patch-net maps [B, N, P, P] -> [B, C, 1, 1]
on a P=24 patch, and the identical net run on a larger image yields a dense grid (FCN).
Per-patch output channels C = 1 (detection logit) + Vdim*Vdim (velocity logits, default
121) + 2 (sub-pixel dx,dy offset). At inference the full P(x,y,vx,vy) field is:
x,y <- convolution position + (dx,dy) offset head
vx,vy<- softmax(velocity logits) per location
detection <- sigmoid(det logit) per location.
Raw branch: learned conv encoder on the conditioned frames (frames as input channels).
Whitened branch (added next): same head, but the first layer is the FIXED matched-filter
conv (frozen) - so the comparison is learned-front-end vs analytical-front-end, same back.
"""
import torch
import torch.nn as nn
import torch.nn.functional as F
class RawFCN(nn.Module):
    """Patch -> [det, vel(Vdim^2), offset(2)]. Valid convs + maxpool reduce P=24 -> 1x1,
    so it slides as an FCN over larger inputs (output grid downsampled by the pool stride)."""
    def __init__(self, n_frames=8, vel_radius=5, patch=24, ch=None, velocity_mode="grid", vmax=1.4):
        super().__init__()
        self.vel_radius = vel_radius
        self.vdim = 2 * vel_radius + 1
        self.patch = patch
        self.velocity_mode = velocity_mode  # "grid"=121-cell softmax (legacy) | "reg"=continuous Vx,Vy,logvar. By Claude on 06/17/2026
        self.vmax = vmax  # reg velocity bound (px/frame): v = vmax*tanh(raw) -> no grid, no corners
        # Valid 3x3 convs + 2x2 maxpools reduce patch -> 1x1 (RF = patch). Pools always act on EVEN
        # sizes so the RF stays centered (preserves the cx0 = P/2 half-pixel alignment). By Claude 06/16/2026:
        # patch=32 widens the attention area so off-center suppression reaches the full alias distance
        # (off_max = P/2-margin-1 = 13 > vmax*(N-1) = 11.2 px) -> trains away the trajectory-alias ghosts.
        if patch == 24:
            if ch is None: ch = (32, 32, 48, 48, 64)
            c0, c1, c2, c3, c4 = ch
            self.features = nn.Sequential(
                nn.Conv2d(n_frames, c0, 3), nn.ReLU(inplace=True),  # 24->22
                nn.Conv2d(c0, c1, 3), nn.ReLU(inplace=True),        # 22->20
                nn.MaxPool2d(2),                                    # 20->10
                nn.Conv2d(c1, c2, 3), nn.ReLU(inplace=True),        # 10->8
                nn.Conv2d(c2, c3, 3), nn.ReLU(inplace=True),        # 8->6
                nn.MaxPool2d(2),                                    # 6->3
                nn.Conv2d(c3, c4, 3), nn.ReLU(inplace=True),        # 3->1
            )
            last = c4
        elif patch == 32:
            if ch is None: ch = (32, 32, 48, 48, 64, 64)
            c0, c1, c2, c3, c4, c5 = ch
            self.features = nn.Sequential(
                nn.Conv2d(n_frames, c0, 3), nn.ReLU(inplace=True),  # 32->30
                nn.Conv2d(c0, c1, 3), nn.ReLU(inplace=True),        # 30->28
                nn.MaxPool2d(2),                                    # 28->14
                nn.Conv2d(c1, c2, 3), nn.ReLU(inplace=True),        # 14->12
                nn.Conv2d(c2, c3, 3), nn.ReLU(inplace=True),        # 12->10
                nn.MaxPool2d(2),                                    # 10->5
                nn.Conv2d(c3, c4, 3), nn.ReLU(inplace=True),        # 5->3
                nn.Conv2d(c4, c5, 3), nn.ReLU(inplace=True),        # 3->1
            )
            last = c5
        elif patch == 52:
            # v2 Stage-1 MF at the wider velocity range (vmax~2.5-3, N=9 -> reach vmax*(N-1)=24 -> RF 52). By Claude on 06/18/2026
            # 52->50->48->(pool)24->22->20->(pool)10->8->6->(pool)3->1 : RF=52, 7 conv3 + 3 pool.
            if ch is None: ch = (32, 32, 48, 48, 64, 64, 96)
            c0, c1, c2, c3, c4, c5, c6 = ch
            self.features = nn.Sequential(
                nn.Conv2d(n_frames, c0, 3), nn.ReLU(inplace=True),  # 52->50
                nn.Conv2d(c0, c1, 3), nn.ReLU(inplace=True),        # 50->48
                nn.MaxPool2d(2),                                    # 48->24
                nn.Conv2d(c1, c2, 3), nn.ReLU(inplace=True),        # 24->22
                nn.Conv2d(c2, c3, 3), nn.ReLU(inplace=True),        # 22->20
                nn.MaxPool2d(2),                                    # 20->10
                nn.Conv2d(c3, c4, 3), nn.ReLU(inplace=True),        # 10->8
                nn.Conv2d(c4, c5, 3), nn.ReLU(inplace=True),        # 8->6
                nn.MaxPool2d(2),                                    # 6->3
                nn.Conv2d(c5, c6, 3), nn.ReLU(inplace=True),        # 3->1
            )
            last = c6
        else:
            raise ValueError("RawFCN: unsupported patch %d (use 24, 32, or 52)" % patch)
        # grid: det(1) + Vdim^2 vel logits + offset(2); reg: det(1) + Vx,Vy(2) + logvar(1) + offset(2). By Claude on 06/17/2026
        self.out_ch = (1 + self.vdim * self.vdim + 2) if (velocity_mode == "grid") else (1 + 2 + 1 + 2)
        self.head = nn.Conv2d(last, self.out_ch, 1)  # 1x1 -> per-location output
    def forward(self, x):
        # x: [B, N, P, P] -> feat [B, last, Hf, Wf] -> out [B, C, Hf, Wf]  (last = c4, c5 or c6 by patch)
        out = self.head(self.features(x))
        if self.velocity_mode == "reg":
            # bound Vx,Vy to +-vmax (no grid, no corners); det/logvar/offset stay raw. By Claude on 06/17/2026
            v = self.vmax * torch.tanh(out[:, 1:3])
            out = torch.cat([out[:, 0:1], v, out[:, 3:]], dim=1)
        return out
    def split(self, out):  # grid mode: (det_logit [B,*], vel_logits [B,Vdim^2,*], off [B,2,*])
        det = out[:, 0]
        vel = out[:, 1:1 + self.vdim * self.vdim]
        off = out[:, 1 + self.vdim * self.vdim:]
        return det, vel, off
    def split_reg(self, out):  # reg mode: (det_logit [B,*], vel [B,2,*], logvar [B,*], off [B,2,*]). By Claude on 06/17/2026
        return out[:, 0], out[:, 1:3], out[:, 3], out[:, 4:6]
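The size comments in the three `RawFCN` stacks follow from two rules: a valid 3x3 convolution removes 2 pixels, and a 2x2 max-pool halves the size (floor division). The sketch below replays that arithmetic without PyTorch. The string encoding of the layer stacks (`"c"` for conv, `"p"` for pool) and the 120-pixel test frame are illustration choices, not part of the model code.

```python
def out_size(n, layers):
    """Spatial size after valid 3x3 convs ('c') and 2x2 stride-2 max-pools ('p')."""
    for op in layers:
        n = n - 2 if op == "c" else n // 2
    return n

# Layer order of the three supported patch sizes, read off the nn.Sequential definitions.
STACKS = {24: "ccpccpc", 32: "ccpccpcc", 52: "ccpccpccpc"}

for patch, layers in STACKS.items():
    stride = 2 ** layers.count("p")  # output grid spacing in input pixels
    print(patch, "->", out_size(patch, layers), "stride", stride)

# Sliding the patch-24 net over a 120x120 frame yields a 25x25 output grid.
print(out_size(120, STACKS[24]))
```

Each stack reduces its own patch size to exactly 1x1, and on a larger input the output grid has one cell per 4 input pixels (two pools) or per 8 input pixels (three pools).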
def fcn_loss(out, model, det_t, vel_soft_t, off_t, det_w=None, w_vel=1.0, w_off=1.0):
    """Combined loss for center-supervised patches (output is [B,C,1,1]).
      det_t     : [B]         0/1 detection labels
      vel_soft_t: [B, Vdim^2] soft P(vx,vy) target (positives only; ignored for negatives)
      off_t     : [B, 2]      (dx,dy) target (positives only)
      det_w     : [B] or None per-sample detection-loss weight - heavier on near-miss off-center
                  negatives (confusability ~ PSF overlap with center). None = all 1.
    Returns (total, dict of components)."""
    det_logit, vel_logits, off = model.split(out)
    det_logit = det_logit.reshape(det_logit.shape[0])         # [B]
    vel_logits = vel_logits.reshape(vel_logits.shape[0], -1)  # [B, Vdim^2]
    off = off.reshape(off.shape[0], 2)                        # [B, 2]
    pos = (det_t > 0.5)
    # detection: BCE over all samples (optionally per-sample weighted by confusability)
    l_det = F.binary_cross_entropy_with_logits(det_logit, det_t, weight=det_w)
    # velocity: cross-entropy to the soft target, positives only (KL up to a const)
    if pos.any():
        logp = F.log_softmax(vel_logits[pos], dim=1)
        l_vel = -(vel_soft_t[pos] * logp).sum(dim=1).mean()
        l_off = F.mse_loss(off[pos], off_t[pos])
    else:
        l_vel = vel_logits.sum() * 0.0
        l_off = off.sum() * 0.0
    total = l_det + w_vel * l_vel + w_off * l_off
    return total, {"det": l_det.item(), "vel": l_vel.detach().item(), "off": l_off.detach().item()}
def reg_loss(out, model, det_t, vx_t, vy_t, off_t, det_w=None, w_vel=1.0, w_off=1.0,
             w_bias=0.0, bin_var=None, n_bins=4, mfsum_t=None, w_mfs=0.02):
    """Loss for the continuous-velocity (reg) head. By Claude on 06/17/2026
    Velocity = heteroscedastic isotropic-Gaussian NLL: 0.5*||v-vtrue||^2 * exp(-logvar) + logvar
    (const dropped) -> learns BOTH velocity and its uncertainty sigma=exp(logvar/2). Plus det BCE,
    offset MSE, and the batch-moment de-bias (pin per-bin mean gain to 1; bin_var = snr or s).
    vx_t,vy_t,off_t : [B] / [B,2] tensors (px/frame, px).
    MF-S mode (option a, Andrey 2026-06-18): if mfsum_t is given, channel 0 is no longer a det
    LOGIT but a direct REGRESSION of the matched-filter path-sum S (sum of clean signal along the
    trajectory) -> MSE, RAW output (no sigmoid; Java reads it raw). S is then the informative vote
    weight: full at the true head, fading off-center, ~0 on noise - the same quantity the Hough
    vote needs, so voteScatter weights by S directly (no separate path-sum pass)."""
    det_logit, v, logvar, off = model.split_reg(out)
    det_logit = det_logit.reshape(det_logit.shape[0])
    v = v.reshape(v.shape[0], 2); logvar = logvar.reshape(logvar.shape[0]); off = off.reshape(off.shape[0], 2)
    pos = (det_t > 0.5)
    if mfsum_t is not None:
        l_det = w_mfs * F.mse_loss(det_logit, mfsum_t)  # channel 0 = MF path-sum regression (raw)
    else:
        l_det = F.binary_cross_entropy_with_logits(det_logit, det_t, weight=det_w)
    if pos.any():
        dvx = v[pos, 0] - vx_t[pos]; dvy = v[pos, 1] - vy_t[pos]
        sq = dvx * dvx + dvy * dvy
        l_vel = (0.5 * sq * torch.exp(-logvar[pos]) + logvar[pos]).mean()  # heteroscedastic NLL
        l_off = F.mse_loss(off[pos], off_t[pos])
    else:
        l_vel = v.sum() * 0.0; l_off = off.sum() * 0.0
    # de-bias: per equal-population bin of bin_var, pooled LSQ gain through origin -> (gain-1)^2
    l_bias = v.sum() * 0.0; nb = 0
    if (w_bias > 0) and (bin_var is not None) and (int(pos.sum()) >= 4 * n_bins):
        vp = v[pos]; tvx = vx_t[pos]; tvy = vy_t[pos]; b = bin_var[pos]
        q = torch.linspace(0, 1, n_bins + 1, device=b.device, dtype=b.dtype)
        edges = torch.quantile(b, q); edges[0] = edges[0] - 1e-4; edges[-1] = edges[-1] + 1e-4
        for i in range(n_bins):
            m = (b >= edges[i]) & (b < edges[i + 1])
            if int(m.sum()) < 4:
                continue
            num = (vp[m, 0] * tvx[m] + vp[m, 1] * tvy[m]).sum()
            den = (tvx[m] * tvx[m] + tvy[m] * tvy[m]).sum() + 1e-6
            l_bias = l_bias + (num / den - 1.0) ** 2; nb += 1
        if nb > 0:
            l_bias = l_bias / nb
    total = l_det + w_vel * l_vel + w_off * l_off + w_bias * l_bias
    return total, {"det": l_det.item(), "vel": l_vel.detach().item(), "off": l_off.detach().item(),
                   "bias": float(l_bias.detach()) if nb > 0 else 0.0}
def vel_bias_loss(out, model, vx_true, vy_true, det_t, bin_var, n_bins=4, vel_decimate=4):
    """Batch-moment de-biasing term (positives only). By Claude on 06/15/2026
    Per equal-population bin of `bin_var` (quantile edges), the pooled least-squares gain
    through the origin of the predicted softmax-centroid velocity vs the true velocity:
        gain_bin = sum(pred . true) / sum(true . true)
    penalized as (gain_bin - 1)^2, averaged over bins. Pins the MEAN velocity scale to 1 in
    every bin - removing the systematic regime-dependent shrink and the ~0.97 clean bias -
    WITHOUT penalizing per-sample scatter (variance is information-limited; left for the
    recurrent layer to average out; only the bias, which the recurrent cannot fix, is removed).
    bin_var : [B] quantity to bin by. For this IN-LOSS term, true SNR is the cleaner label
      (uniform coverage; no coupling with the simultaneously-trained det head; no s-saturation)
      - and the conditioning var need NOT exist at inference since the correction is baked into
      the weights. Confidence s=sigmoid(det) is the right variable for a POST-HOC gain(s)
      calibration instead (the only signal available at runtime). Membership uses bin_var
      directly (detach s before passing); the gradient flows only through the velocity centroid.
    vx_true, vy_true, det_t : [B] tensors (px/frame, px/frame, 0/1)."""
    _, vel_logits, _ = model.split(out)
    vel_logits = vel_logits.reshape(vel_logits.shape[0], -1)
    pos = det_t > 0.5
    if int(pos.sum()) < 4 * n_bins:
        return out.sum() * 0.0
    vdim = model.vdim
    p = torch.softmax(vel_logits[pos], dim=1).reshape(-1, vdim, vdim)  # [Npos, vy, vx]
    cells = torch.arange(vdim, device=p.device, dtype=p.dtype) - model.vel_radius
    pvx = (p.sum(1) * cells).sum(1)  # predicted centroid, cells (vx inner)
    pvy = (p.sum(2) * cells).sum(1)
    tvx = vx_true[pos] * vel_decimate  # true, cells
    tvy = vy_true[pos] * vel_decimate
    b = bin_var[pos]
    q = torch.linspace(0, 1, n_bins + 1, device=b.device, dtype=b.dtype)
    edges = torch.quantile(b, q)  # equal-population bins, robust to skew
    edges[0] = edges[0] - 1e-4; edges[-1] = edges[-1] + 1e-4
    loss = pvx.sum() * 0.0; nb = 0
    for i in range(n_bins):
        m = (b >= edges[i]) & (b < edges[i + 1])
        if int(m.sum()) < 4:
            continue
        num = (pvx[m] * tvx[m] + pvy[m] * tvy[m]).sum()
        den = (tvx[m] * tvx[m] + tvy[m] * tvy[m]).sum() + 1e-6
        loss = loss + (num / den - 1.0) ** 2
        nb += 1
    return loss / nb if nb > 0 else loss
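Two properties of the velocity losses above can be checked with a few lines of plain Python. First, the pooled gain through the origin sees a uniform shrink but is blind to scatter perpendicular to the true velocity. Second, for a fixed squared error the heteroscedastic term `0.5*sq*exp(-logvar) + logvar` is minimized at `logvar = log(sq/2)`. This is an illustrative sketch only: `pooled_gain`, `nll`, the four hand-picked velocity pairs and the value `sq = 0.08` are assumptions, not data from the project.

```python
import math

def pooled_gain(pred, true):
    """Least-squares gain through the origin: sum(pred . true) / sum(true . true)."""
    num = sum(px * tx + py * ty for (px, py), (tx, ty) in zip(pred, true))
    den = sum(tx * tx + ty * ty for tx, ty in true)
    return num / den

true = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
# Every prediction is shrunk to 0.9x along the true direction, plus perpendicular scatter.
pred = [(0.9, 0.3), (-0.9, -0.2), (0.4, 0.9), (-0.1, -0.9)]
gain = pooled_gain(pred, true)
penalty = (gain - 1.0) ** 2  # only the 10% shrink is penalized, not the scatter
print(gain, penalty)

def nll(sq, logvar):
    """Per-sample velocity term: 0.5 * ||v - v_true||^2 * exp(-logvar) + logvar."""
    return 0.5 * sq * math.exp(-logvar) + logvar

sq = 0.08  # made-up squared velocity error, (px/frame)^2
grid = [-6.0 + 0.001 * i for i in range(12001)]  # brute-force search over logvar in [-6, 6]
best = min(grid, key=lambda lv: nll(sq, lv))
print(best, math.log(sq / 2.0))
```

The second result says that, for this loss form, the learned variance settles at half the mean squared 2-D error, that is, the per-axis error variance.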
#!/usr/bin/env python3
# nettest.py - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
"""Raw-socket throughput tester (stdlib only). By Claude on 06/20/2026.
server: nettest.py server [port]
client: nettest.py client HOST PORT MB DIR (DIR=up client->server | down server->client)
Receiver times wall-clock from first to last byte and reports MB/s + Gbit/s (no encryption,
single bulk stream -> clean link throughput)."""
import socket, struct, sys, time
CHUNK = 1 << 20 # 1 MiB
def recv_timed(conn, nbytes):
    """Receive up to nbytes; return (bytes received, seconds from first to last chunk)."""
    got = 0
    buf = bytearray(CHUNK)
    t0 = None
    while got < nbytes:
        n = conn.recv_into(buf, min(CHUNK, nbytes - got))
        if not n:
            break  # peer closed early
        if t0 is None:
            t0 = time.perf_counter()
        got += n
    dt = time.perf_counter() - t0 if t0 is not None else 0.0
    return got, dt
def send_all(conn, nbytes):
    block = b"\0" * CHUNK
    sent = 0
    while sent < nbytes:
        sent += conn.send(block[:min(CHUNK, nbytes - sent)])
    return sent
def server(port):
    s = socket.socket()
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind(("0.0.0.0", port))
    s.listen(1)
    print(f"nettest server on :{port}", flush=True)
    while True:
        c, a = s.accept()
        c.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        try:
            hdr = c.recv(16, socket.MSG_WAITALL)  # wait for the full 16-byte header
            if len(hdr) < 16:
                continue  # peer closed early; the socket is closed in finally
            direction, nbytes = struct.unpack(">qq", hdr)  # 0=up(server recvs) 1=down(server sends)
            if direction == 0:
                got, dt = recv_timed(c, nbytes)
                mbps = got / 1e6 / dt if dt else 0
                print(f" UP recv {got/1e6:.0f}MB {dt*1e3:.1f}ms = {mbps:.0f} MB/s ({mbps*8/1000:.2f} Gbit/s)", flush=True)
                c.sendall(struct.pack(">d", dt))
            else:
                send_all(c, nbytes)
        finally:
            c.close()
def client(host, port, mb, direction):
    nbytes = mb * (1 << 20)  # the MB argument is counted in MiB; rates below are decimal MB/s
    c = socket.socket()
    c.connect((host, port))
    c.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    d = 0 if direction == "up" else 1
    c.sendall(struct.pack(">qq", d, nbytes))
    if d == 0:
        send_all(c, nbytes)
        dt, = struct.unpack(">d", c.recv(8, socket.MSG_WAITALL))  # wait for all 8 bytes of the reply
        mbps = nbytes / 1e6 / dt if dt else 0
        print(f"UP client->server {nbytes/1e6:.0f}MB: {mbps:.0f} MB/s = {mbps*8/1000:.2f} Gbit/s (server-timed)")
    else:
        got, dt = recv_timed(c, nbytes)
        mbps = got / 1e6 / dt if dt else 0
        print(f"DOWN server->client {got/1e6:.0f}MB: {mbps:.0f} MB/s = {mbps*8/1000:.2f} Gbit/s")
    c.close()
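if __name__ == "__main__":
    if len(sys.argv) >= 2 and sys.argv[1] == "server":
        server(int(sys.argv[2]) if len(sys.argv) > 2 else 5578)
    elif len(sys.argv) == 6 and sys.argv[1] == "client":
        client(sys.argv[2], int(sys.argv[3]), int(sys.argv[4]), sys.argv[5])
    else:
        sys.exit(__doc__)  # wrong or missing arguments: print the usage text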
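The wire format used by the tester is small enough to check in isolation: a 16-byte big-endian header of two signed 64-bit integers (direction, byte count), and for uploads an 8-byte big-endian double carrying the server-side elapsed time. The sketch below exercises the header and the rate conversion without opening a socket; `HEADER`, `rates`, the 256 MiB size and the 0.25 s duration are illustrative assumptions.

```python
import struct

HEADER = struct.Struct(">qq")  # direction (0=up, 1=down), number of payload bytes

def rates(nbytes, seconds):
    """Return (MB/s, Gbit/s) using decimal megabytes, as the script's print lines do."""
    mb_per_s = nbytes / 1e6 / seconds
    return mb_per_s, mb_per_s * 8 / 1000

hdr = HEADER.pack(0, 256 * (1 << 20))     # "up" transfer of 256 MiB
direction, nbytes = HEADER.unpack(hdr)
mb_s, gbit_s = rates(nbytes, 0.25)        # pretend the transfer took 0.25 s
print(len(hdr), direction, nbytes, round(mb_s, 1), round(gbit_s, 2))
```

Note the unit mix: the client's size argument is multiplied by 2^20 (MiB), while the reported rates divide by 10^6, so 256 MiB is reported as about 268 MB.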
# partial_votes.py - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
"""Single-target partial-vote visualization. By Claude on 2026-06-19 (design Andrey)
One target (Vx,Vy via args, default 1,0), one timestamp, no interference. Runs frozen Stage 1
(mf_s) -> per-pixel (Vx,Vy,S). Writes an ImageJ hyperstack TIFF:
slice 0 = TOTAL vote (sum of all contributions = accS)
slice 1..K = each voter pixel's S-weighted splat at its tail, slice-labeled "X{dx}:Y{dy}:S={s}"
(dx,dy = pixel offset from the true head; voters sorted by descending S, capped)
Plus the vote-weighted readout at the peak: (Vx,Vy) = sum(vote*V)/sum(vote), head = tail + V*(N-1).
Usage: python partial_votes.py [model.pt] [out.tif] [vx] [vy] [amp] [noise0/1]
"""
import sys, numpy as np, torch
import synth, stage2 as S2
from model import RawFCN
import tifffile
dev = "cuda" if torch.cuda.is_available() else "cpu"
N, vmax, P = 9, 2.8, 52; half, Nm1, HW = P // 2, N - 1, 120; F_ = HW - P + 1
ckpt = sys.argv[1] if len(sys.argv) > 1 else "/work/runs/stage1_mfs2/model.pt"
out = sys.argv[2] if len(sys.argv) > 2 else "/work/runs/partial_votes.tif"
Vx = float(sys.argv[3]) if len(sys.argv) > 3 else 1.0
Vy = float(sys.argv[4]) if len(sys.argv) > 4 else 0.0
amp = float(sys.argv[5]) if len(sys.argv) > 5 else 5.0
noise = (len(sys.argv) > 6 and sys.argv[6] == "1")
THR_FRAC, MAXV = 0.15, 220 # voter gate (frac of max S) and slice cap
s1 = RawFCN(n_frames=N, patch=P, velocity_mode="reg", vmax=vmax).to(dev)
s1.load_state_dict(torch.load(ckpt, map_location=dev)["model"]); s1.eval()
# one target, head centered, causal MB (blur_frac 1.0, matches training)
rng = np.random.default_rng(0)
fr = rng.standard_normal((N, HW, HW)).astype(np.float32) if noise else np.zeros((N, HW, HW), np.float32)
Hx = Hy = HW / 2.0; subs = np.arange(4) * 0.25
for i in range(N):
    acc = np.zeros((HW, HW))
    for ss in subs: acc += synth.halfcos_bump(Hx - Vx * (i + ss), Hy - Vy * (i + ss), HW, HW)
    fr[i] += (amp * acc / 4).astype(np.float32)
s_t, vx_t, vy_t = S2.stage1_dense(s1, fr, dev=dev, mf_s=True)
accS, accVx, accVy = S2.vote_scatter(s_t, vx_t, vy_t, Nm1)
s, vx, vy = s_t.cpu().numpy(), vx_t.cpu().numpy(), vy_t.cpu().numpy()
accS_n, accVx_n, accVy_n = accS.cpu().numpy(), accVx.cpu().numpy(), accVy.cpu().numpy()
hxf, hyf = Hx - half, Hy - half # true head in field coords
# voters (s above gate), strongest first
thr = THR_FRAC * float(s.max())
vox = sorted([(i, j) for i in range(F_) for j in range(F_) if s[i, j] > thr], key=lambda p: -s[p])
if len(vox) > MAXV: vox = vox[:MAXV]
slices = [accS_n.astype(np.float32)]; labels = ["TOTAL VOTE (sum)"]
for (i, j) in vox:
    sl = np.zeros((F_, F_), np.float32); sval = float(s[i, j])
    tx, ty = j - vx[i, j] * Nm1, i - vy[i, j] * Nm1
    x0, y0 = int(np.floor(tx)), int(np.floor(ty)); fx, fy = tx - x0, ty - y0
    for dx in (0, 1):
        for dy in (0, 1):
            xi, yi = x0 + dx, y0 + dy
            if 0 <= xi < F_ and 0 <= yi < F_:
                sl[yi, xi] += sval * (1 - fx if dx == 0 else fx) * (1 - fy if dy == 0 else fy)
    slices.append(sl)
    labels.append("X%+d:Y%+d:S=%.2f" % (j - hxf, i - hyf, sval))
tifffile.imwrite(out, np.stack(slices).astype(np.float32), imagej=True, metadata={"Labels": labels})
# vote-weighted readout at the peak
pk = np.unravel_index(accS_n.argmax(), accS_n.shape); Sp = accS_n[pk]
vcx, vcy = accVx_n[pk] / Sp, accVy_n[pk] / Sp
print("voters: %d (gate S>%.2f), wrote %d slices -> %s" % (len(vox), thr, len(slices), out))
print("peak tail @field (x=%d,y=%d) vote-sum S=%.1f" % (pk[1], pk[0], Sp))
print("vote-weighted velocity: Vx=%.3f Vy=%.3f (true %.2f,%.2f)" % (vcx, vcy, Vx, Vy))
print("=> head = tail + V*(N-1) = field (x=%.1f,y=%.1f) true head field (x=%.1f,y=%.1f)" %
(pk[1] + vcx * Nm1, pk[0] + vcy * Nm1, hxf, hyf))
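Each voter deposits its weight S at the sub-pixel tail position `head - V*(N-1)` with a bilinear splat over the four surrounding cells. The following stdlib sketch reproduces that step for a single voter. `splat` is a hypothetical helper, the numbers are made up, and unlike the script it does no bounds clipping.

```python
import math

def splat(tx, ty, s):
    """Bilinear deposit of weight s at (tx, ty); returns {(yi, xi): weight}."""
    x0, y0 = math.floor(tx), math.floor(ty)
    fx, fy = tx - x0, ty - y0
    out = {}
    for dx in (0, 1):
        for dy in (0, 1):
            w = s * (fx if dx else 1 - fx) * (fy if dy else 1 - fy)
            out[(y0 + dy, x0 + dx)] = w
    return out

N = 9
head_x, head_y, vx, vy, s = 40.0, 30.0, 1.25, -0.5, 2.0
# Tail = head - V*(N-1); reading back head = tail + V*(N-1) inverts this.
tail_x, tail_y = head_x - vx * (N - 1), head_y - vy * (N - 1)
votes = splat(tail_x + 0.25, tail_y, s)  # quarter-pixel offset to show the split
print(tail_x, tail_y, votes)
```

The four weights always sum to S, so the total vote image conserves each voter's weight apart from contributions that fall outside the field.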
#!/usr/bin/env bash
# run_infer_server.sh - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
# Start/stop the CUAS DGX inference server (PyTorch RawFCN, cuDNN) in the NGC container.
# By Claude on 06/20/2026. Run on the DGX (elphel@192.168.0.62).
# start|stop|logs|status env: RUN=runs/<model> RUN2=runs/<l2> PORT=5577
set -euo pipefail
NAME=cuas_infer
IMG=nvcr.io/nvidia/pytorch:25.10-py3
CODE=/home/elphel/c5p_dnn
RUN="${RUN:-runs/weighted9_pm_s}"
RUN2="${RUN2:-}" # optional Layer-2 run dir; empty -> L1-only. By Claude 06/22/2026
PORT="${PORT:-5577}"
case "${1:-start}" in
  start)
    docker rm -f "$NAME" >/dev/null 2>&1 || true
    L2ARG=""; [ -n "$RUN2" ] && L2ARG="--l2run $RUN2"
    docker run -d --name "$NAME" --gpus all --network host \
      -v "$CODE":/work -w /work "$IMG" \
      python infer_server.py --run "$RUN" $L2ARG --port "$PORT" >/dev/null
    echo "started $NAME (run=$RUN l2=${RUN2:-off} port=$PORT)"; sleep 3; docker logs "$NAME"
    ;;
  stop) docker rm -f "$NAME" >/dev/null 2>&1 && echo "stopped" || echo "not running" ;;
  logs) docker logs --tail 60 "$NAME" ;;
  status) docker ps --filter "name=$NAME" --format "{{.Names}} {{.Status}}" ;;
  *) echo "usage: $0 {start|stop|logs|status} (env: RUN=, RUN2=, PORT=)"; exit 1 ;;
esac
#!/usr/bin/env bash
# run_l2A.sh - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
# Option A: baseline sigma=1.5 vs FWHM~2 target sigma=0.85, SAME v1 data. By Claude 06/22/2026
set -uo pipefail
cd /work
COMMON="--l1 runs/weighted9_pm/model.pt --nseq 128 --T 48 --steps 4000 --pos_weight 30"
echo "=== Option A run 1: sigma=1.5 (baseline) $(date) ==="
python layer2_train_A.py $COMMON --sigma 1.5 --out runs/l2_A_s15 2>&1 | tee runs/l2_A_s15.log
echo "=== Option A run 2: sigma=0.85 (FWHM~2 target) $(date) ==="
python layer2_train_A.py $COMMON --sigma 0.85 --out runs/l2_A_s085 2>&1 | tee runs/l2_A_s085.log
echo "=== Option A DONE $(date) ==="
echo "----- FWHM summary -----"
grep -h "output blob FWHM" runs/l2_A_s15.log runs/l2_A_s085.log
#!/usr/bin/env bash
# run_l2_chain.sh - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
# Overnight chain: parametric L2 (position+presence) trained WITH band-pass gaps, swept over
# HF x duty-offset to find the best coast/maintain regime. By Claude on 06/22/2026.
# Each run is isolated (own out dir + log); a failing run does NOT abort the rest. Run on the DGX:
# tmux new-session -d -s l2chain "docker exec cuas_infer bash /work/run_l2_chain.sh"
set -uo pipefail
cd /work
L1="runs/weighted9_pm/model.pt"
COMMON="--l1 $L1 --nseq 128 --T 64 --steps 4000 --gaps --starter_len 8"
run () { # run <name> <extra args...>
  local name="$1"; shift
  echo "=== $name $(date) ==="
  python layer2_train_P.py $COMMON "$@" --out "runs/$name" 2>&1 | tee "runs/$name.log" \
    || echo "!! $name FAILED (continuing)"
}
# HF x offset sweep (LF=3 fixed). gentle camels (HF 9-12), deeper offset => longer clean gaps.
run l2P_h9_o2 --bp_hi 9 --duty_offset -0.2
run l2P_h9_o4 --bp_hi 9 --duty_offset -0.4
run l2P_h12_o3 --bp_hi 12 --duty_offset -0.3
run l2P_h12_o5 --bp_hi 12 --duty_offset -0.5
# one longer high-step run on the conservative-gaps setting for a quality model
run l2P_h9_o3_long --bp_hi 9 --duty_offset -0.3 --steps 8000
echo "=== L2 CHAIN DONE $(date) ==="
echo "----- summary (position MAE + presence separation per run) -----"
for f in runs/l2P_*.log; do
echo "## ${f##*/}"
grep -hE "position MAE|presence prob" "$f" 2>/dev/null || echo " (no result line — check $f)"
done
#!/usr/bin/env bash
# run_l2_dense.sh - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
# Relaunch the two dense rungs (bug fixed). By Claude 06/23/2026
set -uo pipefail
cd /work
DENSE="--l1 runs/weighted9_pm/model.pt --nseq 128 --T 48 --steps 4000 --sigma 0.85"
echo "=== seq1: dense sigma=0.85 plain BCE $(date) ==="
python layer2_train_A.py $DENSE --out runs/seq1_dense_s085 2>&1 | tee runs/seq1_dense_s085.log || echo "!! seq1 FAILED"
echo "=== seq2: dense Mexican-hat $(date) ==="
python layer2_train_A.py $DENSE --mexhat --out runs/seq2_dense_mexhat 2>&1 | tee runs/seq2_dense_mexhat.log || echo "!! seq2 FAILED"
echo "=== DENSE DONE $(date) ==="
for f in runs/seq1_dense_s085.log runs/seq2_dense_mexhat.log; do echo "## ${f##*/}"; grep -hE "dense result|FWHM @ target|pos-MAE" "$f"; done
#!/usr/bin/env bash
# run_l2_seq.sh - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
# Overnight ablation ladder — separate single-change models to evaluate & compare. By Claude 06/22/2026.
# seq1 dense, sigma=0.85, plain BCE (sharper target only)
# seq2 dense, sigma=0.85, + Mexican-hat (center-surround / known-empty ring)
# seq4 parametric + gap modulation ("fancy")
# (#3 parametric-no-gap = the shakedown, run separately.) Each run is isolated, own out dir + log;
# a failing run does NOT abort the rest. Run on the DGX:
# tmux new-session -d -s l2seq "docker exec cuas_infer bash /work/run_l2_seq.sh"
set -uo pipefail
cd /work
DENSE="--l1 runs/weighted9_pm/model.pt --nseq 128 --T 48 --steps 4000 --sigma 0.85"
echo "=== seq1: dense sigma=0.85 plain BCE $(date) ==="
python layer2_train_A.py $DENSE --out runs/seq1_dense_s085 2>&1 | tee runs/seq1_dense_s085.log || echo "!! seq1 FAILED"
echo "=== seq2: dense Mexican-hat (core 0.85) $(date) ==="
python layer2_train_A.py $DENSE --mexhat --out runs/seq2_dense_mexhat 2>&1 | tee runs/seq2_dense_mexhat.log || echo "!! seq2 FAILED"
echo "=== seq4: parametric + gaps (fancy) $(date) ==="
python layer2_train_P.py --l1 runs/weighted9_pm/model.pt --nseq 128 --T 64 --steps 4000 \
  --gaps --bp_hi 9 --duty_offset -0.3 --starter_len 8 --out runs/seq4_param_gaps \
  2>&1 | tee runs/seq4_param_gaps.log || echo "!! seq4 FAILED"
echo "=== L2 SEQUENCE DONE $(date) ==="
echo "----- comparison summary -----"
for f in runs/seq1_dense_s085.log runs/seq2_dense_mexhat.log runs/seq4_param_gaps.log; do
echo "## ${f##*/}"
grep -hE "FWHM @ target|pos-MAE|presence prob|dense result|Parametric result" "$f" 2>/dev/null || echo " (no result line — check $f)"
done
# shake.py - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
"""C5P v2 diagnostic: "shake the stack". By Claude on 06/18/2026
Andrey's intuition: looking through a noisy stack, shake tilt (velocity) and x0,y0 and watch what
persists. Maps the matched-filter path-sum S(dx, dvx) around a horizontal target and around a noise
patch. Expectation: a target shows a central peak (all N frames coincide) with a FAN of weaker
ridges (single-frame aliases, slope dvx=dx/i) all crossing AT the head -> the head is the
convergence of many ridges (robust), not a lone max. Noise: no coherent crossing.
"""
import numpy as np, synth
N = 9; HW = 140
def render_h(V, amp, noise, blur_frac=1.0, nb=4, seed=1):
    rng = np.random.default_rng(seed)
    frames = rng.standard_normal((N, HW, HW)).astype(np.float32) if noise else np.zeros((N, HW, HW), np.float32)
    Hx = HW / 2 + V * (N - 1) * 0.5; Hy = HW / 2
    subs = np.arange(nb) * (blur_frac / nb)
    for i in range(N):
        acc = np.zeros((HW, HW))
        for ss in subs: acc += synth.halfcos_bump(Hx - V * (i + ss), Hy, HW, HW)
        frames[i] += (amp * acc / nb).astype(np.float32)
    return frames, Hx, Hy
def bilin(img, x, y):
    ix = int(np.floor(x)); iy = int(np.floor(y)); fx = x - ix; fy = y - iy
    if ix < 0 or ix >= HW - 1 or iy < 0 or iy >= HW - 1: return 0.0
    return float((1-fy)*((1-fx)*img[iy,ix]+fx*img[iy,ix+1]) + fy*((1-fx)*img[iy+1,ix]+fx*img[iy+1,ix+1]))
def landscape(frames, x0, y0, V, DX, DVX):
    L = np.zeros((len(DX), len(DVX)))
    for a, dx in enumerate(DX):
        for b, dvx in enumerate(DVX):
            L[a, b] = sum(bilin(frames[i], x0 + dx - (V + dvx) * i, y0) for i in range(N))
    return L
CH = " .:-=+*#%@"
def show(L, DX, DVX, title):
    mx = L.max()
    print(title + " (peak path-sum %.1f; rows=dx, cols=dvx %.1f..%.1f)" % (mx, DVX[0], DVX[-1]))
    for a, dx in enumerate(DX):
        row = "".join(CH[min(9, max(0, int(round(9 * L[a, b] / mx))))] if mx > 0 else " " for b in range(len(DVX)))
        print(" dx=%+3d |%s|%s" % (dx, row, " <- dx=0" if dx == 0 else ""))
    z = int(np.argmin(np.abs(DVX)))
    # pad by the 9-character row prefix (" dx=%+3d |") so the caret sits under the dvx=0 column
    print(" " * (9 + z) + "^dvx=0 (ramp ridge: dvx=dx/8)")
DX = np.arange(-12, 13); DVX = np.arange(-1.4, 1.45, 0.1)
for amp in (5, 3, 2):
    fr, Hx, Hy = render_h(1.0, amp, True)
    show(landscape(fr, Hx, Hy, 1.0, DX, DVX), DX, DVX, "\n===== TARGET V=1.0 amp=%d (noisy) =====" % amp)
    show(landscape(fr, 30, 30, 1.0, DX, DVX), DX, DVX, "----- NOISE patch (same stack, off-target) -----")
# stage2.py - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
"""C5P DNN v2 — Stage 2: learned Hough-vote refinement. By Claude on 06/18/2026
Stage 1 (frozen RawFCN reg, patch 52) emits a per-pixel (Vx,Vy,s) field; around a target it forms
the alias ramp V_P = V_true + (P-H)/(N-1), so every pixel back-projects to the SAME tail
T = P - V_P*(N-1). Stage 2 splats each pixel's s-weighted vote at T (differentiable bilinear
scatter), then a small conv REFINES the accumulator into a clean tail-detection + consensus velocity.
A real target gets many coherent votes -> sharp peak; a ghost/alias has no consensus. This replaces
the v1 velocity-softmax competition AND the (backwards) ghostbuster. Vote target = tail; head = tail + V*(N-1).
Training: Stage 1 FROZEN. Fields are PRE-COMPUTED once (dense Stage-1 over each field) and cached, then
the refine net trains fast on the cache (deep-supervision reference; e2e + latent channels come later).
"""
import argparse, numpy as np, torch, torch.nn as nn, torch.nn.functional as F
import synth
from model import RawFCN
def gen_field(rng, HW, ntgt, N, vmax, snr_rng, blur_frac=1.0, nb=4, margin=50):
    """Render ntgt causal-MB targets (random head/velocity) + Gaussian noise. By Claude on 06/18/2026
    Returns frames [N,HW,HW], list of (hx,hy,vx,vy). Heads kept in [margin, HW-margin]."""
    snr = float(np.exp(rng.uniform(np.log(snr_rng[0]), np.log(snr_rng[1]))))
    frames = rng.standard_normal((N, HW, HW)).astype(np.float32)
    subs = np.arange(nb) * (blur_frac / nb)
    tgts = []
    for _ in range(ntgt):
        for _try in range(50):
            vx = rng.uniform(-vmax, vmax); vy = rng.uniform(-vmax, vmax)
            if vx*vx + vy*vy > vmax*vmax: continue
            hx = rng.uniform(margin, HW-margin); hy = rng.uniform(margin, HW-margin)
            xs = [hx - vx*i for i in range(N)]; ys = [hy - vy*i for i in range(N)]
            if min(xs) >= 3 and max(xs) <= HW-4 and min(ys) >= 3 and max(ys) <= HW-4: break
        else:
            continue  # no valid placement in 50 tries: skip this target (field may hold fewer than ntgt)
        for i in range(N):
            acc = np.zeros((HW, HW), np.float64)
            for ss in subs: acc += synth.halfcos_bump(hx - vx*(i+ss), hy - vy*(i+ss), HW, HW)
            frames[i] += (snr * acc / nb).astype(np.float32)
        tgts.append((hx, hy, vx, vy))
    return frames, tgts
def stage1_dense(net, frames, P=52, dev="cuda", chunk=8192, mf_s=False):
    """Run frozen Stage 1 stride-1 over a field -> (s,Vx,Vy) maps at field-res (HW-P+1). By Claude on 06/18/2026
    mf_s=True (option a): channel 0 is the RAW matched-filter path-sum S (clamp>=0), not a det
    logit -> S is itself the informative vote weight, so no separate mf_sum() pass is needed."""
    N, HW, _ = frames.shape
    x = torch.from_numpy(frames[None]).to(dev) # [1,N,HW,HW]
    cols = F.unfold(x, kernel_size=P) # [1, N*P*P, L]
    L = cols.shape[-1]; F_ = HW - P + 1
    cols = cols.reshape(N, P, P, L).permute(3, 0, 1, 2).contiguous() # [L,N,P,P]
    outs = []
    with torch.no_grad():
        for b in range(0, L, chunk):
            o = net(cols[b:b+chunk]) # [bs,6,1,1]
            outs.append(o[:, :, 0, 0])
    o = torch.cat(outs, 0) # [L,6]
    s = o[:, 0].clamp(min=0.0) if mf_s else torch.sigmoid(o[:, 0]); v = o[:, 1:3]
    return s.reshape(F_, F_), v[:, 0].reshape(F_, F_), v[:, 1].reshape(F_, F_)
def mf_sum(frames, vx, vy, half, N):
    """Matched-filter response = sum of data along each pixel's trajectory (Andrey 2026-06-18).
    Informative even WITHOUT noise (full path-sum at the true head, partial at aliases) and kills
    noise-consensus (no real data along a spurious path -> ~0). frames [N,HW,HW]; vx,vy [F,F] (field
    coords; scene = field + half). Returns [F,F] = max(sum_i frames[i] @ (field+half - V*i), 0)."""
    dev = frames.device; HW = frames.shape[-1]; F_ = vx.shape[0]
    fi, fj = torch.meshgrid(torch.arange(F_, device=dev).float(), torch.arange(F_, device=dev).float(), indexing='ij')
    acc = torch.zeros(F_, F_, device=dev)
    for i in range(N):
        sx = (fj + half - vx * i); sy = (fi + half - vy * i)
        grid = torch.stack([2 * sx / (HW - 1) - 1, 2 * sy / (HW - 1) - 1], dim=-1)[None]
        acc = acc + F.grid_sample(frames[i][None, None], grid, align_corners=True, padding_mode='zeros')[0, 0]
    return acc.clamp(min=0.0)
def vote_scatter(s, vx, vy, Nm1):
    """Bilinear vote splat at T = P - V*(N-1); s = vote WEIGHT (MF path-sum), unnormalized. By Claude on 06/18/2026
    Returns 3 accumulators [F,F]: (sum w, sum w*vx, sum w*vy). Normalize per-field afterward."""
    F_ = s.shape[0]; dev = s.device
    ys, xs = torch.meshgrid(torch.arange(F_, device=dev).float(),
                            torch.arange(F_, device=dev).float(), indexing='ij')
    tx = xs - vx * Nm1; ty = ys - vy * Nm1
    accS = torch.zeros(F_*F_, device=dev); accVx = torch.zeros_like(accS); accVy = torch.zeros_like(accS)
    x0 = torch.floor(tx); y0 = torch.floor(ty)
    for dx in (0, 1):
        for dy in (0, 1):
            xi = (x0 + dx); yi = (y0 + dy)
            wx = (1 - (tx - x0)) if dx == 0 else (tx - x0)
            wy = (1 - (ty - y0)) if dy == 0 else (ty - y0)
            w = (s * wx * wy) # s = MF path-sum weight (already informative; no squaring) // By Claude 06/18
            valid = (xi >= 0) & (xi < F_) & (yi >= 0) & (yi < F_)
            idx = (yi.clamp(0, F_-1) * F_ + xi.clamp(0, F_-1)).long().reshape(-1)
            wv = (w * valid).reshape(-1)
            accS.scatter_add_(0, idx, wv)
            accVx.scatter_add_(0, idx, (wv * vx.reshape(-1)))
            accVy.scatter_add_(0, idx, (wv * vy.reshape(-1)))
    return accS.reshape(F_, F_), accVx.reshape(F_, F_), accVy.reshape(F_, F_)
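# --- Illustrative sketch (not part of the original commit) ---------------------------------
# A minimal numeric check of the alias-ramp identity stated in the module docstring, assuming
# the idealized Stage-1 field V_P = V_true + (P - H)/(N - 1). Under that assumption every
# pixel P back-projects to the same tail T = P - V_P*(N-1) = H - V_true*(N-1), which is why
# the votes splatted by vote_scatter() pile up at one point. `alias_ramp_tails` is a
# hypothetical helper name; it uses NumPy only and makes no claim about what the trained
# Stage 1 actually emits. The import is repeated so the sketch runs standalone.
import numpy as np
def alias_ramp_tails(head, v_true, pixels, n_frames=9):
    """Return the back-projected tail [K,2] for each pixel under the ideal alias ramp."""
    head = np.asarray(head, dtype=np.float64)        # H, shape [2]
    v_true = np.asarray(v_true, dtype=np.float64)    # V_true, shape [2]
    pixels = np.asarray(pixels, dtype=np.float64)    # P, shape [K,2]
    nm1 = n_frames - 1
    v_p = v_true + (pixels - head) / nm1             # aliased velocity seen at each P
    return pixels - v_p * nm1                        # T = P - V_P*(N-1)
# Usage (hypothetical values): alias_ramp_tails((60, 60), (1.0, -0.5), [[58, 61], [63, 57]])
# gives the same row for every pixel, equal to H - V_true*(N-1).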
class VoteRefine(nn.Module):
    """Conv refine on the 3-channel vote accumulator -> tail-detection logit + consensus velocity.
    Reference Stage 2: vote (geometric, physics) is fixed; the conv learns to sharpen/threshold."""
    def __init__(self, ch=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(True),
            nn.Conv2d(ch, 3, 1)) # det logit + (Vx,Vy) consensus
    def forward(self, accS, accVx, accVy):
        a = torch.stack([accS, accVx, accVy], 0)[None] # [1,3,F,F]
        return self.net(a)[0] # [3,F,F]
def tail_label(tgts, F_, P=52, N=9, sigma=1.5):
    """Gaussian tail-detection target map (field coords) + velocity maps. By Claude on 06/18/2026"""
    Nm1 = N - 1; half = P // 2
    det = np.zeros((F_, F_), np.float32); vx = np.zeros_like(det); vy = np.zeros_like(det)
    ys, xs = np.mgrid[0:F_, 0:F_]
    for hx, hy, tvx, tvy in tgts:
        tx = (hx - tvx*Nm1) - half; ty = (hy - tvy*Nm1) - half # tail in field coords
        g = np.exp(-((xs-tx)**2 + (ys-ty)**2) / (2*sigma*sigma)).astype(np.float32)
        det = np.maximum(det, g); m = g > 0.3
        vx[m] = tvx; vy[m] = tvy
    return det, vx, vy
if __name__ == "__main__":
    ap = argparse.ArgumentParser()
    ap.add_argument("--stage1", default="runs/stage1_mf/model.pt")
    ap.add_argument("--nframes", type=int, default=9); ap.add_argument("--vmax", type=float, default=2.8)
    ap.add_argument("--HW", type=int, default=120); ap.add_argument("--ntgt", type=int, default=4)
    ap.add_argument("--nfields", type=int, default=384); ap.add_argument("--steps", type=int, default=3000)
    ap.add_argument("--snr", type=float, nargs=2, default=[2.0, 8.0]); ap.add_argument("--out", default="runs/stage2")
    ap.add_argument("--mf_s", action="store_true") # Stage 1 emits the MF path-sum directly as S (option a) -> vote weight = S, no mf_sum() pass // By Claude 06/18
    a = ap.parse_args()
    import os; os.makedirs(a.out, exist_ok=True)
    dev = "cuda" if torch.cuda.is_available() else "cpu"
    s1 = RawFCN(n_frames=a.nframes, patch=52, velocity_mode="reg", vmax=a.vmax).to(dev)
    s1.load_state_dict(torch.load(a.stage1, map_location=dev)["model"]); s1.eval()
    rng = np.random.default_rng(0); Nm1 = a.nframes - 1; half = 52 // 2
    print(f"precomputing {a.nfields} Stage-1 fields + MF-sum votes (HW={a.HW}, ntgt={a.ntgt})...", flush=True)
    cache = []
    for k in range(a.nfields):
        nt = 0 if (k % 4 == 0) else a.ntgt # 25% noise-only fields: learn to suppress noise-consensus // By Claude 06/18
        fr, tg = gen_field(rng, a.HW, nt, a.nframes, a.vmax, a.snr)
        s, vx, vy = stage1_dense(s1, fr, dev=dev, mf_s=a.mf_s)
        F_ = s.shape[0]
        # option a: S already IS the MF path-sum (learned, denoised) -> use it as the vote weight
        # directly; legacy path computes the explicit data path-sum from the frames.
        w = s if a.mf_s else mf_sum(torch.from_numpy(fr).to(dev), vx, vy, half, a.nframes)
        accS, accVx, accVy = vote_scatter(w, vx, vy, Nm1)
        nrm = accS.max().clamp(min=1e-6) # per-field normalize (regime-invariant)
        accS = accS / nrm; accVx = accVx / nrm; accVy = accVy / nrm
        det_t, tvx, tvy = tail_label(tg, F_, N=a.nframes)
        cache.append((accS.detach(), accVx.detach(), accVy.detach(),
                      torch.from_numpy(det_t).to(dev), torch.from_numpy(tvx).to(dev), torch.from_numpy(tvy).to(dev)))
        if (k+1) % 64 == 0: print(f" {k+1}/{a.nfields}", flush=True)
    net = VoteRefine().to(dev); opt = torch.optim.Adam(net.parameters(), 1e-3)
    print("training Stage-2 refine...", flush=True)
    for step in range(1, a.steps+1):
        accS, accVx, accVy, det_t, tvx, tvy = cache[np.random.randint(len(cache))]
        out = net(accS, accVx, accVy) # [3,F,F]
        l_det = F.binary_cross_entropy_with_logits(out[0], det_t, pos_weight=torch.tensor(8.0, device=out.device)) # tail-Gaussians are sparse
        m = det_t > 0.3
        l_vel = (F.mse_loss(out[1][m], tvx[m]) + F.mse_loss(out[2][m], tvy[m])) if m.any() else out.sum()*0
        loss = l_det + 0.3*l_vel
        opt.zero_grad(); loss.backward(); opt.step()
        if step % 300 == 0:
            with torch.no_grad():
                p = torch.sigmoid(out[0]); peakhit = float(p[m].mean()) if m.any() else 0
                bg = float(p[~m].max())
            print(f"step {step:5d} det {l_det.item():.4f} vel {float(l_vel):.4f} "
                  f"tail-s(peak) {peakhit:.3f} max-bg {bg:.3f}", flush=True)
    torch.save({"model": net.state_dict(), "args": vars(a)}, f"{a.out}/model.pt")
    print(f"saved {a.out}/model.pt", flush=True)
# stage2_eval3.py - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
import numpy as np, torch
import stage2 as S2
from model import RawFCN
dev="cuda" if torch.cuda.is_available() else "cpu"; N,vmax,HW=9,2.8,140; Nm1=N-1; half=26
s1=RawFCN(n_frames=N,patch=52,velocity_mode="reg",vmax=vmax).to(dev)
s1.load_state_dict(torch.load("/work/runs/stage1_mf/model.pt",map_location=dev)["model"]); s1.eval()
net=S2.VoteRefine().to(dev); net.load_state_dict(torch.load("/work/runs/stage2c/model.pt",map_location=dev)["model"]); net.eval()
def peaks(p,th,r=2):
    out=[]; H_,W_=p.shape
    for y in range(r,H_-r):
        for x in range(r,W_-r):
            if p[y,x]>th and p[y,x]>=p[y-r:y+r+1,x-r:x+r+1].max()-1e-6: out.append((x,y,p[y,x]))
    return out
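# --- Illustrative sketch (not part of the original commit) ---------------------------------
# peaks() above visits every pixel in a Python loop. The same local-maximum test can be
# sketched with NumPy windows, assuming p is a 2-D NumPy array and r >= 1. `peaks_vectorized`
# is a hypothetical name; it is meant to return the same (x, y, value) tuples in the same
# row-major order, but it has not been run against the real Stage-2 output.
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view
def peaks_vectorized(p, th, r=2):
    k = 2 * r + 1
    if p.shape[0] < k or p.shape[1] < k:
        return []                                    # no interior pixels, same as the loop version
    # max over each (k x k) neighborhood, one value per interior pixel: [H-2r, W-2r]
    local_max = sliding_window_view(p, (k, k)).max(axis=(-2, -1))
    core = p[r:p.shape[0] - r, r:p.shape[1] - r]     # interior pixels, aligned with local_max
    ys, xs = np.nonzero((core > th) & (core >= local_max - 1e-6))
    return [(int(x) + r, int(y) + r, core[y, x]) for y, x in zip(ys, xs)]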
for TH in (0.5,0.7):
    ndet=0;ntot=0;errs=[];gh=[];rng=np.random.default_rng(123)
    for t in range(15):
        fr,tg=S2.gen_field(rng,HW,4,N,vmax,[3.0,8.0]); s,vx,vy=S2.stage1_dense(s1,fr,dev=dev)
        w=S2.mf_sum(torch.from_numpy(fr).to(dev),vx,vy,half,N)
        aS,aVx,aVy=S2.vote_scatter(w,vx,vy,Nm1); nrm=aS.max().clamp(min=1e-6); aS=aS/nrm;aVx=aVx/nrm;aVy=aVy/nrm
        with torch.no_grad(): p=torch.sigmoid(net(aS,aVx,aVy)[0]).cpu().numpy()
        F_=p.shape[0]; pk=peaks(p,TH)
        tt=[((hx-tvx*Nm1)-half,(hy-tvy*Nm1)-half) for hx,hy,tvx,tvy in tg]; tt=[(x,y) for x,y in tt if 0<=x<F_ and 0<=y<F_]
        for tx,ty in tt:
            ntot+=1; near=[np.hypot(px-tx,py-ty) for px,py,pv in pk if np.hypot(px-tx,py-ty)<8]
            if near: ndet+=1; errs.append(min(near))
        for px,py,pv in pk:
            if all(np.hypot(px-tx,py-ty)>=8 for tx,ty in tt): gh.append(pv)
    print("th=%.2f: det %d/%d (%.0f%%) locerr %.2f | TRUE ghosts(>8px) %d max %.3f"%(TH,ndet,ntot,100*ndet/ntot,np.median(errs) if errs else -1,len(gh),max(gh) if gh else 0))
# synth.py - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
"""
Synthetic single-target generator for the C5P DNN experiment. # By Claude on 06/13/2026
Goal: produce unlimited exactly-labeled training data for a network that maps a short
spatio-temporal patch sequence -> P(x, y, vx, vy) (the full 4D target posterior, one shot
from N frames - strictly more than phase correlation, which is pairwise and velocity-only).
Model of one sample (matches the validated pipeline conventions):
- One target: a canonical half-cosine bump of peak amplitude `snr` (noise sigma = 1, so
snr = peak SNR), at sub-pixel position (x0, y0) in the NEWEST frame (frame index 0),
moving at constant velocity (vx, vy) px/frame. Frame i (i = 0 newest .. N-1 oldest)
has the bump centered at (x0 - vx*i, y0 - vy*i) [the target was at earlier positions
in the past], matching the C5P window convention (window[0] = newest).
- i.i.d. Gaussian noise, sigma = 1, added to every pixel of every frame. Pure Gaussian
at this stage on purpose: PC's optimality is derived for Gaussian noise, so "matches PC
under Gaussian noise" is a meaningful, falsifiable target; real-clutter comes later.
Labels: (x0, y0, vx, vy) continuous. Velocity is in px/frame; the output grid maps cells
to px/frame via vel_decimate (4 cells = 1 px/frame at decimate=4, as in the pipeline).
Bump: separable half-cosine cos(pi/3 * |d|) for |d| < 1.5 (same shape as
TemporalKernelGenerator.halfcos and the existing synthetic generator). Set radial=True for
an isotropic bump (rotationally symmetric) if we switch the whole system that way.
"""
import numpy as np
def halfcos_bump(cx, cy, H, W, radial=False):
    """Sample the canonical half-cosine bump centered at (cx, cy) on an HxW grid.
    Separable cos(pi/3|dx|)*cos(pi/3|dy|) (default, matches the system) or radial."""
    ys = np.arange(H)[:, None] - cy
    xs = np.arange(W)[None, :] - cx
    if radial:
        r = np.sqrt(xs * xs + ys * ys)
        b = np.where(r < 1.5, np.cos(np.pi / 3.0 * r), 0.0)
    else:
        bx = np.where(np.abs(xs) < 1.5, np.cos(np.pi / 3.0 * np.abs(xs)), 0.0)
        by = np.where(np.abs(ys) < 1.5, np.cos(np.pi / 3.0 * np.abs(ys)), 0.0)
        b = bx * by
    return b
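# --- Illustrative sketch (not part of the original commit) ---------------------------------
# Worked check of the bump width, assuming only the 1-D profile cos(pi/3*|d|) for |d| < 1.5
# used above: it reaches half maximum at |d| = 1 (cos(pi/3) = 0.5), so its FWHM is 2 px.
# A Gaussian with the same FWHM has sigma = 2 / (2*sqrt(2*ln 2)) ~= 0.85 px, which appears
# consistent with the "sigma=0.85 (FWHM~2 target)" setting in the L2 run scripts.
# `halfcos_fwhm` and `gaussian_sigma_for_fwhm` are hypothetical helper names.
# The import is repeated so the sketch runs standalone.
import numpy as np
def halfcos_fwhm(step=1e-4):
    """Numerically measure the full width at half maximum of the 1-D half-cosine profile."""
    d = np.arange(-1.5, 1.5 + step, step)
    prof = np.where(np.abs(d) < 1.5, np.cos(np.pi / 3.0 * np.abs(d)), 0.0)
    above = d[prof >= 0.5 * prof.max()]              # samples at or above half maximum
    return float(above[-1] - above[0])
def gaussian_sigma_for_fwhm(fwhm):
    """Sigma of a Gaussian with the given FWHM: fwhm / (2*sqrt(2*ln 2))."""
    return float(fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0))))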
def generate_sample(rng, N=8, H=24, W=24, vmax_px=1.0, snr=5.0, radial=False, snr_log=False,
                    place="center", off_range=0.5, off_min=1.0, off_max=None,
                    margin=2.0, target=None, motion_blur=False, blur_frac=1.0):
    """One labeled training patch for the fully-convolutional (FCN) regime. # By Claude on 06/13/2026
    The patch is ONE receptive field; supervision is for its CENTER output pixel only.
    Per-pixel output is (Vx, Vy, s) - position is the pixel grid (conv location), so the
    center output's job is: is a target trajectory centered on ME at t0 (newest frame)?
      place='center'    -> POSITIVE (det=1): target t0 within +-off_range px of patch center
                           (the cell this output owns). Train Vx,Vy on it.
      place='offcenter' -> NEGATIVE-WITH-TARGET (det=0): a real target owned by a NEIGHBOR
                           pixel - t0 offset off_min..off_max from center - whose evidence
                           reaches this receptive field. Teaches the center output to SUPPRESS
                           competing line segments that don't pass through it at t0. (Vx,Vy NOT
                           trained - det=0 masks the velocity loss; stored only for viz.)
      place='none'      -> NOISE negative (det=0): pure Gaussian noise.
    target= True/False kept for back-compat (-> 'center'/'none').
    """
    if target is not None:
        place = "center" if target else "none"
    if isinstance(snr, (tuple, list)):
        # log-uniform spans low->high (keeps the high-SNR/noiseless sharpness anchor while covering
        # low SNR) - fixes the reg head's high-SNR overshoot; linear otherwise. By Claude on 06/17/2026
        snr = (float(np.exp(rng.uniform(np.log(snr[0]), np.log(snr[1])))) if snr_log
               else rng.uniform(snr[0], snr[1]))
    # Reference the patch center at index P/2 (= deployment's `half` in CuasDnnInfer.inferROI,
    # where patch index half maps to the output/ROI pixel), NOT the geometric (W-1)/2. The 0.5
    # gap between (W-1)/2=11.5 and P/2=12 was the systematic half-pixel registration bias; aligning
    # the training reference to the deployment reference removes it (even patch kept). By Claude on 06/15/2026
    cx0 = W / 2.0; cy0 = H / 2.0
    if off_max is None:
        off_max = min(W, H) / 2.0 - margin - 1.0 # neighbor targets out to the RF reach
    if place == "none":
        frames = rng.standard_normal((N, H, W)).astype(np.float32)
        return frames, {"det": 0.0, "place": place, "x0": np.nan, "y0": np.nan,
                        "vx": np.nan, "vy": np.nan, "dx": np.nan, "dy": np.nan, "snr": snr, "mfsum": 0.0}
    # pick t0 offset from center per class, velocity on the disk, retry until the whole
    # trajectory (t0 - v*i, i=0..N-1) stays inside the patch within `margin`.
    for _ in range(200):
        vx = rng.uniform(-vmax_px, vmax_px); vy = rng.uniform(-vmax_px, vmax_px)
        if vx * vx + vy * vy > vmax_px * vmax_px:
            continue
        if place == "center":
            dx = rng.uniform(-off_range, off_range); dy = rng.uniform(-off_range, off_range)
        else: # offcenter: t0 owned by a neighbor (annulus off_min..off_max around center)
            ang = rng.uniform(0, 2 * np.pi); rad = rng.uniform(off_min, off_max)
            dx = rad * np.cos(ang); dy = rad * np.sin(ang)
        x0 = cx0 + dx; y0 = cy0 + dy
        xs = [x0 - vx * i for i in range(N)]; ys = [y0 - vy * i for i in range(N)]
        bmx = blur_frac * abs(vx) if motion_blur else 0.0 # causal streak extends older by up to blur_frac*|v|
        bmy = blur_frac * abs(vy) if motion_blur else 0.0
        if (min(xs) - bmx >= margin and max(xs) + bmx <= W - 1 - margin and
                min(ys) - bmy >= margin and max(ys) + bmy <= H - 1 - margin):
            break
    # NOTE: there is no for-else here; if all 200 draws are rejected, the most recent
    # (unvalidated) values are used as-is.
    frames = np.empty((N, H, W), dtype=np.float32)
    # motion blur, RT/CAUSAL model (Andrey 2026-06-17): a decimated frame averages the finest
    # sub-frames AT and BEFORE it (trailing), as realtime must - data[i] = mean of sub-frames at
    # i, i+1/sub, ... i+(blur_frac - 1/sub) in the OLDER direction (larger index = older here).
    # The streak is ~|v|*blur_frac long AND its centroid lags by ~0.5*blur_frac*|v| (the RT bias:
    # the target's apparent position is OLDER than its label time). Flux-conserving (peak drops).
    # blur_frac = averaging window in frames (1.0 = non-overlap decimation, 4 sub-steps). By Claude on 06/17/2026
    nb = max(2, int(round(4 * blur_frac))) if motion_blur else 1
    subs = np.arange(nb) * (blur_frac / nb) if nb > 1 else np.array([0.0]) # causal: s in [0, blur_frac)
    mfsum = 0.0 # clean-signal matched-filter path-sum = label for the MF-like S head (option a). By Claude on 06/18/2026
    for i in range(N):
        if motion_blur and (vx or vy):
            acc = np.zeros((H, W), dtype=np.float32)
            for ss in subs:
                acc += halfcos_bump(x0 - vx * (i + ss), y0 - vy * (i + ss), H, W, radial=radial)
            sig = snr * acc / nb
        else:
            sig = snr * halfcos_bump(x0 - vx * i, y0 - vy * i, H, W, radial=radial)
        frames[i] = sig + rng.standard_normal((H, W))
        # accumulate the clean signal sampled (bilinear) along the trajectory tap (x0-vx*i, y0-vy*i):
        # this is exactly "sum of data along the trajectory" with the noise removed (its expectation).
        tx = x0 - vx * i; ty = y0 - vy * i
        ix = int(np.floor(tx)); iy = int(np.floor(ty)); fx = tx - ix; fy = ty - iy
        if 0 <= ix < W - 1 and 0 <= iy < H - 1:
            mfsum += float((1 - fy) * ((1 - fx) * sig[iy, ix] + fx * sig[iy, ix + 1])
                           + fy * ((1 - fx) * sig[iy + 1, ix] + fx * sig[iy + 1, ix + 1]))
    det = 1.0 if place == "center" else 0.0
    return frames, {"det": det, "place": place, "x0": x0, "y0": y0,
                    "vx": vx, "vy": vy, "dx": dx, "dy": dy, "snr": snr, "mfsum": mfsum}
def pick_place(rng, frac_pos=0.4, frac_off=0.4):
    """Three-way class draw: center-positive / off-center-negative / noise-negative. # By Claude on 06/13/2026
    The off-center class (a target owned by a neighbor pixel) is what stops the net from
    collapsing to the matched filter - it must learn to SUPPRESS competing line segments."""
    u = rng.random()
    if u < frac_pos: return "center"
    if u < frac_pos + frac_off: return "offcenter"
    return "none"
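# --- Illustrative sketch (not part of the original commit) ---------------------------------
# Worked arithmetic for the causal-blur comment in generate_sample above, assuming the same
# sub-step schedule subs = arange(nb) * (blur_frac / nb). Averaging shifted copies of a
# symmetric bump puts the streak centroid at the mean shift, so the lag behind the label
# position is mean(subs)*|v| = blur_frac*(nb-1)/(2*nb)*|v|. That is 0.375*|v| for nb=4,
# blur_frac=1, and it approaches the quoted ~0.5*blur_frac*|v| only as nb grows.
# `causal_blur_lag` is a hypothetical helper name; pixel sampling effects are ignored.
# The import is repeated so the sketch runs standalone.
import numpy as np
def causal_blur_lag(speed, blur_frac=1.0, nb=None):
    """Centroid lag (px) of the nb-sub-step causal average for a target moving at `speed` px/frame."""
    if nb is None:
        nb = max(2, int(round(4 * blur_frac)))       # same default as generate_sample with motion_blur=True
    subs = np.arange(nb) * (blur_frac / nb)          # causal sub-frame offsets in [0, blur_frac)
    return float(subs.mean() * abs(speed))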
def generate_batch(rng, B, frac_pos=0.4, frac_off=0.4, **kw):
    """Batch of B FCN training patches, three classes (center / offcenter / noise). # By Claude on 06/13/2026"""
    N = kw.get("N", 8); H = kw.get("H", 24); W = kw.get("W", 24)
    frames = np.empty((B, N, H, W), dtype=np.float32)
    keys = ("det", "x0", "y0", "vx", "vy", "dx", "dy", "snr", "mfsum")
    code = {"none": 0, "center": 1, "offcenter": 2}
    labels = {k: np.empty(B, dtype=np.float32) for k in keys}
    labels["place_code"] = np.empty(B, dtype=np.int64) # 0 none / 1 center / 2 offcenter
    for b in range(B):
        f, lab = generate_sample(rng, place=pick_place(rng, frac_pos, frac_off), **kw)
        frames[b] = f
        for k in keys:
            labels[k][b] = lab[k]
        labels["place_code"][b] = code[lab["place"]]
    return frames, labels
def soft_target_vel(label, vel_radius=5, vel_decimate=4, sigma_v=0.9):
"""Soft P(vx,vy) target for the CENTER output pixel: Gaussian bump at the true velocity # By Claude on 06/13/2026
in cell space (cell = vel_decimate*px/frame). Shape [vdim,vdim] (vy outer, vx inner -
matches v_out_idx convention). Normalized to sum 1. Use only for positives."""
vdim = 2 * vel_radius + 1
vcx = label["vx"] * vel_decimate
vcy = label["vy"] * vel_decimate
vyc = (np.arange(vdim)[:, None] - vel_radius) - vcy
vxc = (np.arange(vdim)[None, :] - vel_radius) - vcx
vel = np.exp(-(vxc * vxc + vyc * vyc) / (2 * sigma_v * sigma_v))
s = vel.sum()
return (vel / s).astype(np.float32) if s > 0 else vel.astype(np.float32)
def save_tiff_stack(frames, path):
"""Save [N,H,W] float frames as a multi-page 32-bit TIFF (opens in ImageJ)."""
from PIL import Image
imgs = [Image.fromarray(np.asarray(f, dtype=np.float32), mode="F") for f in frames]
imgs[0].save(path, save_all=True, append_images=imgs[1:])


if __name__ == "__main__":
    import sys, os
    out = sys.argv[1] if len(sys.argv) > 1 else "/tmp/c5p_dnn_samples"
    os.makedirs(out, exist_ok=True)
    rng = np.random.default_rng(12345)
    # positives at several SNRs (target sub-pixel near patch center)
    for snr in [2.0, 3.0, 5.0, 8.0]:
        frames, lab = generate_sample(rng, snr=snr, target=True)
        p = f"{out}/pos_snr{snr:.0f}.tif"
        save_tiff_stack(frames, p)
        print(f"POS snr={snr:.0f} center-offset dx={lab['dx']:+.2f} dy={lab['dy']:+.2f} "
              f"vx={lab['vx']:+.3f} vy={lab['vy']:+.3f} px/fr -> {p}")
    # off-center negative (target owned by a neighbor pixel - the key new class)
    foff, loff = generate_sample(rng, place="offcenter", snr=8.0)
    save_tiff_stack(foff, f"{out}/offcenter.tif")
    print(f"OFFCENTER det={loff['det']} at t0-offset dx={loff['dx']:+.2f} dy={loff['dy']:+.2f} "
          f"(|off|={np.hypot(loff['dx'],loff['dy']):.2f}) vx={loff['vx']:+.3f} vy={loff['vy']:+.3f} -> {out}/offcenter.tif")
    # a noise negative
    fneg, lneg = generate_sample(rng, place="none")
    save_tiff_stack(fneg, f"{out}/noise.tif")
    print(f"NOISE det={lneg['det']} -> {out}/noise.tif")
    # batch + velocity soft-target sanity (three-class mix)
    fb, lb = generate_batch(rng, 200, frac_pos=0.4, frac_off=0.4)
    print(f"batch frames={fb.shape} positives(det=1)={lb['det'].mean():.2f} (≈frac_pos)")
    f, lab = generate_sample(rng, snr=5.0, target=True)
    t = soft_target_vel(lab)
    print(f"soft_target_vel shape={t.shape} sum={t.sum():.4f} "
          f"argmax(vy,vx)={np.unravel_index(t.argmax(), t.shape)} "
          f"(true cells vy={lab['vy']*4:+.2f} vx={lab['vx']*4:+.2f})")
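
The clean-signal path sum `mfsum` in `generate_sample` samples the rendered signal bilinearly at each trajectory tap. The block below is a minimal standalone sketch of that one sampling step, not the module's own code. The names `bilinear_tap` and `plane` are hypothetical, and returning 0.0 for an out-of-range tap is an assumption that mirrors the `if 0 <= ix < W - 1 ...` guard, which simply skips such taps.

```python
import numpy as np


def bilinear_tap(img, tx, ty):
    """Bilinearly sample img[H, W] at the fractional position (tx, ty).

    Returns 0.0 when the 2x2 neighborhood is not fully inside the image,
    which corresponds to skipping the tap in a running sum.
    """
    H, W = img.shape
    ix = int(np.floor(tx)); iy = int(np.floor(ty))
    fx = tx - ix; fy = ty - iy
    if not (0 <= ix < W - 1 and 0 <= iy < H - 1):
        return 0.0
    top = (1 - fx) * img[iy, ix] + fx * img[iy, ix + 1]
    bot = (1 - fx) * img[iy + 1, ix] + fx * img[iy + 1, ix + 1]
    return float((1 - fy) * top + fy * bot)


# A linear ramp is reproduced exactly by bilinear interpolation,
# which makes it a convenient self-check.
yy, xx = np.mgrid[0:8, 0:8]
plane = (2.0 * xx + 3.0 * yy).astype(np.float32)
inside = bilinear_tap(plane, 2.25, 4.5)   # 2*2.25 + 3*4.5
outside = bilinear_tap(plane, 7.5, 4.5)   # needs column 8, which does not exist
```

Summing such taps along `(x0 - vx*i, y0 - vy*i)` for `i = 0..N-1` gives the noise-free matched-filter path sum that the comment in `generate_sample` describes.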
#!/usr/bin/env python3
# test_infer_client.py - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
"""Stateful batched benchmark client for infer_server.py. By Claude on 06/20/2026.
UPLOAD -> INFER(a chunk of scenes) -> READBACK; reports timing + transfer sizes.
Usage: test_infer_client.py [host] [port] [T] [H] [W] [roi_w] [roi_h] [count]"""
import socket, struct, sys, time
from datetime import datetime
import numpy as np
host = sys.argv[1] if len(sys.argv) > 1 else "127.0.0.1"
port = int(sys.argv[2]) if len(sys.argv) > 2 else 5577
T = int(sys.argv[3]) if len(sys.argv) > 3 else 40
H = int(sys.argv[4]) if len(sys.argv) > 4 else 512
W = int(sys.argv[5]) if len(sys.argv) > 5 else 640
RW = int(sys.argv[6]) if len(sys.argv) > 6 else 70
RH = int(sys.argv[7]) if len(sys.argv) > 7 else 20
COUNT = int(sys.argv[8]) if len(sys.argv) > 8 else 16
CMD_BYE, CMD_UPLOAD, CMD_INFER, CMD_READBACK = 0, 1, 2, 3
s = socket.socket(); s.connect((host, port)); s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
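def rd(n):
    b = b""
    while len(b) < n:
        chunk = s.recv(n - len(b))
        if not chunk:  # recv() returns b"" once the peer has closed; avoid spinning forever
            raise ConnectionError(f"socket closed after {len(b)} of {n} bytes")
        b += chunk
    return b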
# --- UPLOAD ---
stack = np.random.randn(T, H, W).astype(">f4").tobytes()
t = time.perf_counter()
s.sendall(struct.pack(">iiii", CMD_UPLOAD, T, H, W)); s.sendall(stack)
nl = struct.unpack(">i", rd(4))[0]
levs = [struct.unpack(">i", rd(4))[0] for _ in range(nl)]
N, bms = struct.unpack(">id", rd(12))
print(f"{datetime.now():%H:%M:%S} UPLOAD {T}x{H}x{W} ({T*H*W*4/1e6:.1f}MB up) -> {nl} levels {levs} N={N} "
f"build={bms:.1f}ms total={(time.perf_counter()-t)*1e3:.0f}ms")
# --- INFER a chunk of scenes at level 0 (newest = start + j*stride) ---
start = N - 1 # first valid newest in level 0
count = min(COUNT, levs[0] - start)
for it in range(2):  # twice: it0 includes autotune
    t = time.perf_counter()
    s.sendall(struct.pack(">iiiiiiiii", CMD_INFER, 0, start, count, 1, 200, 250, RW, RH))
    s.sendall(struct.pack(">d", 1.4 * 4))  # rmax_cells = vmax*vel_decimate
    gms, oh, ow, cnt, nvel, rh, rw = struct.unpack(">diiiiii", rd(32))
    o5 = rd(cnt * 5 * oh * ow * 4)
    rf = rd(cnt * rh * rw * nvel * 4)
    rt = (time.perf_counter() - t) * 1e3
    down = (len(o5) + len(rf)) / 1e6
    print(f"{datetime.now():%H:%M:%S} INFER it{it}: {cnt} scenes -> offset5[{cnt},5,{oh},{ow}] + roi[{cnt},{rh},{rw},{nvel}] "
          f"{down:.1f}MB down | gpu={gms:.1f}ms ({gms/cnt:.1f}ms/scene) roundtrip={rt:.1f}ms ({rt/cnt:.1f}ms/scene)")
# --- READBACK (debug) ---
s.sendall(struct.pack(">iii", CMD_READBACK, 0, 0))
fh, fw = struct.unpack(">ii", rd(8)); _ = rd(fh * fw * 4)
print(f"{datetime.now():%H:%M:%S} READBACK lev0 f0 -> [{fh},{fw}]")
s.sendall(struct.pack(">i", CMD_BYE)); s.close()
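
The client relies on two conventions: fixed-size big-endian headers packed with `struct`, and a read loop that keeps calling `recv` until the requested byte count has arrived. The following is a small self-contained sketch of both, using a local `socket.socketpair()` in place of the real server. The helper name `recv_exact` is hypothetical, and only the `UPLOAD` header layout (`>iiii` = command, T, H, W) is taken from the script above.

```python
import socket
import struct


def recv_exact(sock, n):
    """Read exactly n bytes from sock, or raise if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:  # empty bytes means the other side closed the connection
            raise ConnectionError(f"socket closed after {len(buf)} of {n} bytes")
        buf += chunk
    return bytes(buf)


CMD_UPLOAD = 1
a, b = socket.socketpair()  # stand-in for the client/server TCP connection

# Sender side: one big-endian header of four int32 values (16 bytes).
a.sendall(struct.pack(">iiii", CMD_UPLOAD, 40, 512, 640))

# Receiver side: read the full header before unpacking it.
cmd, T, H, W = struct.unpack(">iiii", recv_exact(b, struct.calcsize(">iiii")))

# Closing the sender makes a further read fail loudly instead of hanging.
a.close()
try:
    recv_exact(b, 4)
    closed_detected = False
except ConnectionError:
    closed_detected = True
b.close()
```

Using `struct.calcsize` for the read length keeps the byte count tied to the format string, so a header change cannot silently desynchronize the stream.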
# train.py - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
"""
Train the C5P FCN on synthetic Gaussian-noise patches. # By Claude on 06/13/2026
Phase 1: all-synthetic, pure Gaussian noise, SNR-swept. The benchmark question is whether
this matches phase correlation (velocity) and localizes (x,y) at low SNR. Runs inside the
NGC PyTorch container on the DGX Spark (GB10).
Usage (inside container, with the project dir mounted at /work):
python /work/train.py --steps 20000 --batch 256 --snr 1 8 --out /work/runs/raw1
"""
import argparse, os, time
import numpy as np
import torch
import synth
from model import RawFCN, fcn_loss, vel_bias_loss, reg_loss


def batched_soft_vel(vx, vy, vel_radius=5, vel_decimate=4, sigma_v=0.9):
    """[B] vx,vy (px/frame) -> [B, vdim*vdim] soft P(vx,vy) targets (vy outer, vx inner)."""
    vdim = 2 * vel_radius + 1
    cells = np.arange(vdim) - vel_radius  # [vdim]
    vcx = (vx * vel_decimate)[:, None, None]  # [B,1,1]
    vcy = (vy * vel_decimate)[:, None, None]
    dvx = cells[None, None, :] - vcx  # [B,1,vdim]
    dvy = cells[None, :, None] - vcy  # [B,vdim,1]
    g = np.exp(-(dvx * dvx + dvy * dvy) / (2 * sigma_v * sigma_v))  # [B,vdim,vdim]
    g = np.nan_to_num(g)  # negatives have NaN vx/vy
    s = g.reshape(g.shape[0], -1).sum(1, keepdims=True)
    flat = g.reshape(g.shape[0], -1)
    return np.divide(flat, s, out=np.zeros_like(flat), where=s > 0).astype(np.float32)


def make_batch(rng, B, dev, frac_pos=0.4, frac_off=0.4,
               off_w_near=1.5, off_w_far=0.3, off_w_tau=2.5, **kw):
    vr = kw.pop("vel_radius", 5); vd = kw.pop("vel_decimate", 4); sv = kw.pop("sigma_v", 0.9)
    frames, lab = synth.generate_batch(rng, B, frac_pos=frac_pos, frac_off=frac_off, **kw)
    vel_soft = batched_soft_vel(lab["vx"], lab["vy"], vr, vd, sv)
    off = np.stack([np.nan_to_num(lab["dx"]), np.nan_to_num(lab["dy"])], axis=1).astype(np.float32)
    # per-sample detection-loss weight: 1 for center/noise, confusability-weighted for off-center
    det_w = np.ones(frames.shape[0], dtype=np.float32)
    isoff = lab["place_code"] == 2
    if isoff.any():
        offd = np.hypot(lab["dx"][isoff], lab["dy"][isoff])
        det_w[isoff] = offcenter_weight(offd, w_near=off_w_near, w_far=off_w_far, tau=off_w_tau)
    x = torch.from_numpy(frames).to(dev)  # [B,N,P,P]
    return (x,
            torch.from_numpy(lab["det"]).to(dev),
            torch.from_numpy(vel_soft).to(dev),
            torch.from_numpy(off).to(dev),
            torch.from_numpy(det_w).to(dev),
            lab)


def offcenter_weight(off, off_min=1.0, w_near=2.0, w_far=0.3, tau=2.5):
    """Detection-loss weight for off-center negatives: heaviest at the immediate neighbors # By Claude on 06/13/2026
    (confusability ~ PSF overlap with center), decaying with crossing distance to a floor."""
    return (w_far + (w_near - w_far) * np.exp(-np.maximum(off - off_min, 0.0) / tau)).astype(np.float32)


def s_by_class(out, model, lab, near_r=2.0):
    """Mean confidence s=sigmoid(det), split center / off-NEAR / off-FAR / noise. # By Claude on 06/13/2026
    The decisive number is off-NEAR (|off|<=near_r): if the net drives it to 0 it learned
    fine spatial discrimination (NOT the MF); if it stays high, a small net can't resolve
    near-misses -> evidence for a deeper CNN."""
    det, _, _ = model.split(out)
    s = torch.sigmoid(det.reshape(det.shape[0])).detach().cpu().numpy()
    pc = lab["place_code"]
    off = np.hypot(np.nan_to_num(lab["dx"]), np.nan_to_num(lab["dy"]))

    def mean_of(m):
        return float(s[m].mean()) if m.any() else float("nan")

    isoff = pc == 2
    return [mean_of(pc == 1), mean_of(isoff & (off <= near_r)),
            mean_of(isoff & (off > near_r)), mean_of(pc == 0)]  # ctr, offNear, offFar, noise


def vel_err_px(out, model, lab, vel_decimate=4):
    """Expected-velocity error (px/frame) on positives: softmax-weighted velocity centroid
    vs the true (vx,vy). A quick training-quality readout (full PC benchmark is separate)."""
    pos = lab["det"] > 0.5
    if pos.sum() == 0:
        return float("nan")
    if model.velocity_mode == "reg":  # T7: direct (Vx,Vy) // By Claude on 06/17/2026
        _, v, _, _ = model.split_reg(out)
        v = v.reshape(v.shape[0], 2).detach().cpu().numpy()
        e = np.sqrt((v[pos, 0] - lab["vx"][pos]) ** 2 + (v[pos, 1] - lab["vy"][pos]) ** 2)
        return float(e.mean())
    det, vel, _ = model.split(out)
    vdim = model.vdim
    p = torch.softmax(vel.reshape(vel.shape[0], -1), dim=1).reshape(-1, vdim, vdim)
    cells = torch.arange(vdim, device=p.device) - model.vel_radius
    evx = (p.sum(1) * cells).sum(1) / vel_decimate  # [B] px/frame
    evy = (p.sum(2) * cells).sum(1) / vel_decimate
    evx = evx.detach().cpu().numpy(); evy = evy.detach().cpu().numpy()
    e = np.sqrt((evx[pos] - lab["vx"][pos]) ** 2 + (evy[pos] - lab["vy"][pos]) ** 2)
    return float(e.mean())


def main():
    ap = argparse.ArgumentParser()
    ap.add_argument("--steps", type=int, default=20000)
    ap.add_argument("--batch", type=int, default=256)
    ap.add_argument("--lr", type=float, default=1e-3)
    ap.add_argument("--snr", type=float, nargs=2, default=[1.0, 8.0])
    ap.add_argument("--frac_pos", type=float, default=0.4)  # center-positive
    ap.add_argument("--frac_off", type=float, default=0.4)  # off-center negative (the key class)
    ap.add_argument("--w_vel", type=float, default=1.0)  # velocity loss weight (raise to emphasize velocity)
    ap.add_argument("--w_bias", type=float, default=1.0)  # batch-moment de-biasing weight (per-SNR mean scale -> 1); 0 disables // By Claude on 06/15/2026
    ap.add_argument("--bias_bins", type=int, default=4)  # number of bins for the de-biasing term // By Claude on 06/15/2026
    ap.add_argument("--bias_by", choices=["snr", "s"], default="snr")  # de-bias conditioning var: snr (clean label, in-loss) or s=sigmoid(det) // By Claude on 06/15/2026
    ap.add_argument("--vmax", type=float, default=1.0)  # training velocity disk radius, px/frame (was hardcoded 1.0; raise to cover the grid +-1.25 on-axis) // By Claude on 06/15/2026
    ap.add_argument("--velocity_mode", choices=["grid", "reg"], default="grid")  # grid=121-cell softmax | reg=continuous Vx,Vy,logvar (T7) // By Claude on 06/17/2026
    ap.add_argument("--snr_log", action="store_true")  # sample SNR log-uniform over [snr_lo,snr_hi] (span high for sharpness anchor) // By Claude on 06/17/2026
    ap.add_argument("--mf_s", action="store_true")  # (reg) channel 0 regresses the MF path-sum S (option a) instead of det BCE -> informative vote weight // By Claude on 06/18/2026
    ap.add_argument("--w_mfs", type=float, default=0.02)  # MF-S regression loss weight (balances large path-sum MSE vs velocity NLL) // By Claude on 06/18/2026
    ap.add_argument("--motion_blur", action="store_true")  # render moving targets motion-blurred (streak ~|v|*blur_frac) - higher pyramid levels (T2) // By Claude on 06/17/2026
    ap.add_argument("--blur_frac", type=float, default=1.0)  # motion-blur window length in frames (1.0 non-overlap decimation; 2.0 ~ 50%-overlap) // By Claude on 06/17/2026
    ap.add_argument("--w_off", type=float, default=0.3)  # sub-pixel offset loss weight (low: position precision is secondary)
    ap.add_argument("--off_w_near", type=float, default=1.5)  # near-miss suppression weight (lower if +-1px ambiguity is acceptable)
    ap.add_argument("--off_w_far", type=float, default=0.3)
    ap.add_argument("--off_w_tau", type=float, default=2.5)
    ap.add_argument("--nframes", type=int, default=8)
    ap.add_argument("--patch", type=int, default=24)
    ap.add_argument("--vel_radius", type=int, default=5)
    ap.add_argument("--vel_decimate", type=int, default=4)
    ap.add_argument("--sigma_v", type=float, default=0.9)
    ap.add_argument("--seed", type=int, default=0)
    ap.add_argument("--log_every", type=int, default=200)
    ap.add_argument("--out", type=str, default="/tmp/c5p_run")
    args = ap.parse_args()
    os.makedirs(args.out, exist_ok=True)
    dev = "cuda" if torch.cuda.is_available() else "cpu"
    print(f"device={dev} torch={torch.__version__} steps={args.steps} batch={args.batch} "
          f"snr={args.snr} out={args.out}", flush=True)
    rng = np.random.default_rng(args.seed)
    torch.manual_seed(args.seed)
    model = RawFCN(n_frames=args.nframes, vel_radius=args.vel_radius, patch=args.patch,
                   velocity_mode=args.velocity_mode, vmax=args.vmax).to(dev)
    opt = torch.optim.Adam(model.parameters(), lr=args.lr)
    nparam = sum(p.numel() for p in model.parameters())
    print(f"model params={nparam} out_ch={model.out_ch}", flush=True)
    bkw = dict(N=args.nframes, H=args.patch, W=args.patch, snr=tuple(args.snr),
               vmax_px=args.vmax, snr_log=args.snr_log,
               motion_blur=args.motion_blur, blur_frac=args.blur_frac,
               vel_radius=args.vel_radius, vel_decimate=args.vel_decimate, sigma_v=args.sigma_v)
    print(f"vmax_px={args.vmax} w_bias={args.w_bias} de-bias: {args.bias_bins} {args.bias_by}-bins (equal-population)", flush=True)
    csv_path = f"{args.out}/losses.csv"
    csv = open(csv_path, "w")
    csv.write("step,det,vel,off,bias,velRMSE\n"); csv.flush()
    t0 = time.time()
    run = {"det": 0.0, "vel": 0.0, "off": 0.0, "bias": 0.0}
    for step in range(1, args.steps + 1):
        x, det_t, vel_t, off_t, det_w, lab = make_batch(rng, args.batch, dev,
                                                        frac_pos=args.frac_pos, frac_off=args.frac_off,
                                                        off_w_near=args.off_w_near, off_w_far=args.off_w_far,
                                                        off_w_tau=args.off_w_tau, **bkw)
        out = model(x)
        if args.velocity_mode == "reg":  # T7 continuous head // By Claude on 06/17/2026
            vx_t = torch.from_numpy(lab["vx"]).to(dev); vy_t = torch.from_numpy(lab["vy"]).to(dev)
            bin_var = (torch.from_numpy(lab["snr"]).to(dev) if args.bias_by == "snr"
                       else torch.sigmoid(model.split_reg(out)[0].reshape(-1)).detach())
            mfsum_t = torch.from_numpy(lab["mfsum"]).to(dev) if args.mf_s else None
            loss, comp = reg_loss(out, model, det_t, vx_t, vy_t, off_t, det_w=det_w,
                                  w_vel=args.w_vel, w_off=args.w_off, w_bias=args.w_bias,
                                  bin_var=bin_var, n_bins=args.bias_bins,
                                  mfsum_t=mfsum_t, w_mfs=args.w_mfs)
        else:
            loss, comp = fcn_loss(out, model, det_t, vel_t, off_t, det_w=det_w,
                                  w_vel=args.w_vel, w_off=args.w_off)
            comp["bias"] = 0.0
            if args.w_bias > 0:  # By Claude on 06/15/2026
                vx_t = torch.from_numpy(lab["vx"]).to(dev); vy_t = torch.from_numpy(lab["vy"]).to(dev)
                if args.bias_by == "snr":
                    bin_var = torch.from_numpy(lab["snr"]).to(dev)
                else:  # bin by the network's own confidence s
                    bin_var = torch.sigmoid(model.split(out)[0].reshape(-1)).detach()
                lb = vel_bias_loss(out, model, vx_t, vy_t, det_t, bin_var, args.bias_bins, args.vel_decimate)
                loss = loss + args.w_bias * lb
                comp["bias"] = lb.detach().item()
        opt.zero_grad(); loss.backward(); opt.step()
        for k in run: run[k] += comp[k]
        if step % args.log_every == 0:
            n = args.log_every
            verr = vel_err_px(out, model, lab, args.vel_decimate)
            sc, son, sof, sn = s_by_class(out, model, lab)
            sps = step / (time.time() - t0)
            print(f"step {step:6d} det {run['det']/n:.4f} vel {run['vel']/n:.4f} "
                  f"off {run['off']/n:.4f} bias {run['bias']/n:.4f} velRMSE {verr:.4f}px/fr "
                  f"s[ctr/offN/offF/noise]={sc:.2f}/{son:.2f}/{sof:.2f}/{sn:.2f} {sps:.0f} it/s", flush=True)
            csv.write(f"{step},{run['det']/n:.5f},{run['vel']/n:.5f},{run['off']/n:.5f},{run['bias']/n:.5f},{verr:.5f}\n")
            csv.flush()
            run = {k: 0.0 for k in run}
    csv.close()
    torch.save({"model": model.state_dict(), "args": vars(args)}, f"{args.out}/model.pt")
    print(f"saved {args.out}/model.pt losses->{csv_path}", flush=True)
    # ONNX export - the single artifact for BOTH deploy phases: ORT-Java (array-fed test) and # By Claude on 06/13/2026
    # TensorRT (zero-copy CUDA prod). Dynamic H/W axes so the all-conv FCN slides over any frame.
    try:
        model.eval()
        dummy = torch.zeros(1, args.nframes, args.patch, args.patch, device=dev)
        onnx_path = f"{args.out}/model.onnx"
        torch.onnx.export(
            model, dummy, onnx_path,
            input_names=["frames"], output_names=["out"],
            dynamic_axes={"frames": {0: "B", 2: "H", 3: "W"}, "out": {0: "B", 2: "Hout", 3: "Wout"}},
            opset_version=17)
        print(f"exported {onnx_path} (input frames[B,{args.nframes},H,W] -> out[B,{model.out_ch},Hout,Wout])", flush=True)
    except Exception as e:
        print(f"ONNX export skipped: {type(e).__name__}: {e}", flush=True)


if __name__ == "__main__":
    main()
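
`batched_soft_vel` builds a Gaussian soft target on the velocity grid, and `vel_err_px` decodes a grid posterior back to px/frame through its centroid. The sketch below checks, in plain NumPy and under the default grid settings (`vel_radius=5`, `vel_decimate=4`, `sigma_v=0.9`), that the two are consistent: decoding a soft target recovers the sub-cell velocity it was built from. The function names `soft_target` and `centroid_velocity` are illustrative and are not part of `train.py`.

```python
import numpy as np


def soft_target(vx, vy, vel_radius=5, vel_decimate=4, sigma_v=0.9):
    """Gaussian bump on the [vdim, vdim] grid (vy outer, vx inner), sum = 1."""
    cells = np.arange(2 * vel_radius + 1) - vel_radius
    dvy = cells[:, None] - vy * vel_decimate
    dvx = cells[None, :] - vx * vel_decimate
    g = np.exp(-(dvx * dvx + dvy * dvy) / (2 * sigma_v * sigma_v))
    return g / g.sum()


def centroid_velocity(p, vel_radius=5, vel_decimate=4):
    """Probability-weighted mean velocity in px/frame."""
    cells = np.arange(p.shape[0]) - vel_radius
    evx = (p.sum(axis=0) * cells).sum() / vel_decimate  # marginal over vy
    evy = (p.sum(axis=1) * cells).sum() / vel_decimate  # marginal over vx
    return evx, evy


# A velocity that falls between grid cells (the grid step is 0.25 px/frame).
p = soft_target(vx=0.30, vy=-0.55)
evx, evy = centroid_velocity(p)
iy, ix = np.unravel_index(p.argmax(), p.shape)
argmax_vx, argmax_vy = (ix - 5) / 4, (iy - 5) / 4  # snaps to the nearest cell
```

The argmax can only return cell centers, while the centroid interpolates between them. That is why the training readout and `velocity_bias.py` report the centroid alongside the argmax.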
#!/usr/bin/env python3
# velocity_bias.py - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
"""Velocity-bias diagnostic. # By Claude on 06/15/2026
Predicted vs true per-frame velocity across SNR, on clean CENTER targets, to decide whether the
underestimate is SYSTEMATIC (clean/no-noise also < true) or SNR-DEPENDENT (only low SNR hedges
toward 0 as the fan broadens). Fits predicted = gain*true + off for argmax and softmax-centroid;
gain<1 = underestimate. Pools vx and vy (symmetric)."""
import argparse, numpy as np, torch, synth
from model import RawFCN


def softmax(z):
    z = z - z.max(); e = np.exp(z); return e / e.sum()


if __name__ == "__main__":
    ap = argparse.ArgumentParser()
    ap.add_argument("ck")
    ap.add_argument("--nframes", type=int, default=9)
    ap.add_argument("--patch", type=int, default=24)  # must match the model's training patch (24 or 32) # By Claude on 06/16/2026
    ap.add_argument("--vel_radius", type=int, default=5)
    ap.add_argument("--vel_decimate", type=int, default=4)
    ap.add_argument("--m", type=int, default=1500)
    ap.add_argument("--seed", type=int, default=0)
    ap.add_argument("--velocity_mode", choices=["grid", "reg"], default="grid")  # T7 // By Claude on 06/17/2026
    ap.add_argument("--vmax", type=float, default=1.4)  # reg bound (must match training)
    ap.add_argument("--vmax_px", type=float, default=None)  # test-target velocity disk (defaults to vmax); raise for higher-velocity models // By Claude on 06/17/2026
    ap.add_argument("--motion_blur", action="store_true")  # test on motion-blurred targets (match blur-trained models) // By Claude on 06/17/2026
    ap.add_argument("--blur_frac", type=float, default=1.0)
    a = ap.parse_args()
    vmax_px = a.vmax_px if a.vmax_px is not None else a.vmax
    dev = "cuda" if torch.cuda.is_available() else "cpu"
    ck = torch.load(a.ck, map_location=dev)
    m = RawFCN(n_frames=a.nframes, vel_radius=a.vel_radius, patch=a.patch,
               velocity_mode=a.velocity_mode, vmax=a.vmax).to(dev)
    m.load_state_dict(ck["model"]); m.eval()
    n = 2 * a.vel_radius + 1; step = 1.0 / a.vel_decimate
    ix = np.arange(n * n); vxc = (ix % n - a.vel_radius) * step; vyc = (ix // n - a.vel_radius) * step
    rng = np.random.default_rng(a.seed)
    print(f"model {a.ck} nframes={a.nframes} vel grid +/-{a.vel_radius*step:.2f}px step {step} m={a.m}/snr")
    print(f"{'snr':>6} {'pairs':>6} {'argmax gain':>12} {'off':>7} {'cen gain':>10} {'off':>7} {'cenRMSE':>8}")
    for snr in [100.0, 8.0, 4.0, 2.0, 1.0]:
        tv, pa, pc, sig = [], [], [], []
        for _ in range(a.m):
            f, lab = synth.generate_sample(rng, N=a.nframes, H=a.patch, W=a.patch, snr=snr, place="center",
                                           vmax_px=vmax_px, motion_blur=a.motion_blur, blur_frac=a.blur_frac)
            x = torch.from_numpy(f[None]).float().to(dev)
            with torch.no_grad():
                out = m(x)[0, :, 0, 0].cpu().numpy()
            tv += [lab["vx"], lab["vy"]]
            if a.velocity_mode == "reg":  # out = [det, Vx, Vy, logvar, dx, dy]
                pvx, pvy = float(out[1]), float(out[2])
                pa += [pvx, pvy]; pc += [pvx, pvy]; sig.append(float(np.exp(0.5 * out[3])))
            else:
                vel = softmax(out[1:1 + n * n]); s = vel.sum(); k = int(np.argmax(vel))
                pa += [vxc[k], vyc[k]]
                pc += [(vxc * vel).sum() / s, (vyc * vel).sum() / s]
        tv, pa, pc = np.array(tv), np.array(pa), np.array(pc)
        ga, ba = np.polyfit(tv, pa, 1); gc, bc = np.polyfit(tv, pc, 1)
        rmse = float(np.sqrt(np.mean((pc - tv) ** 2)))
        sigstr = f" sigma={np.mean(sig):.3f}" if sig else ""
        print(f"{snr:6.0f} {len(tv):6d} {ga:12.3f} {ba:+7.2f} {gc:10.3f} {bc:+7.2f} {rmse:8.3f}{sigstr}")
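
The diagnostic reduces each SNR bin to a straight-line fit, predicted = gain*true + off, where gain < 1 indicates a velocity underestimate. The following sketch illustrates how that fit reads on made-up data. The shrink factor 0.8, the offset 0.05, and the noise level are invented for the illustration and are not measured results from any model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "true" per-axis velocities, uniform within +-1 px/frame.
true_v = rng.uniform(-1.0, 1.0, size=2000)

# Hypothetical estimator that shrinks toward zero, with a small constant
# offset and some scatter. These numbers are assumptions for the demo only.
pred_v = 0.8 * true_v + 0.05 + rng.normal(0.0, 0.02, size=2000)

# Degree-1 polyfit returns (slope, intercept) = (gain, off).
gain, off = np.polyfit(true_v, pred_v, 1)

# RMSE against truth mixes the bias (gain, off) with the scatter,
# which is why the script reports it next to the fitted gain.
rmse = float(np.sqrt(np.mean((pred_v - true_v) ** 2)))
print(f"gain={gain:.3f} off={off:+.3f} rmse={rmse:.3f}")
```

A gain that stays below 1 even at very high SNR would point to a systematic bias, while a gain that approaches 1 as SNR rises points to an SNR-dependent hedge toward zero, which is the distinction the docstring sets out to make.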
# viz_results.py - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
"""
Visualize a trained C5P FCN's OUTPUT on test patches. # By Claude on 06/13/2026
Loads a model.pt, runs a GxG grid of test samples (mixed classes), and writes a velocity-
block image like our -POST: each cell is the 11x11 softmax P(vx,vy) for that sample, 1-px
NaN gaps. Three pages: [0] the net's P(vx,vy)*s (confidence-gated, dark = no detection);
[1] the true-velocity marker (1.0 = center target, 0.5 = off-center target, blank = noise);
[2] the raw P(vx,vy). Pages [0] and [1] are adjacent so you can eyeball pred vs truth with
a single scroll. Prints per-cell s + pred/true v.
Run inside the container:
python /work/viz_results.py /work/runs/weighted/model.pt /work/runs/weighted/results
"""
import sys
import numpy as np
import torch
import synth
from model import RawFCN


def main():
    ckpt_path = sys.argv[1]
    out = sys.argv[2] if len(sys.argv) > 2 else ckpt_path.rsplit("/", 1)[0] + "/results"
    grid = int(sys.argv[3]) if len(sys.argv) > 3 else 10
    ck = torch.load(ckpt_path, map_location="cpu")
    a = ck["args"]
    vr = a.get("vel_radius", 5); vd = a.get("vel_decimate", 4)
    N = a.get("nframes", 8); P = a.get("patch", 24)
    # pass patch=P so the network is built for the checkpoint's training patch size (as train.py does)
    model = RawFCN(n_frames=N, vel_radius=vr, patch=P); model.load_state_dict(ck["model"]); model.eval()
    vdim = 2 * vr + 1
    cell = vdim + 1
    side = grid * cell - 1
    pred_img = np.full((side, side), np.nan, dtype=np.float32)  # raw net P(vx,vy)
    sw_img = np.full((side, side), np.nan, dtype=np.float32)  # P(vx,vy) * s (confidence-gated)
    true_img = np.full((side, side), np.nan, dtype=np.float32)  # true velocity one-hot
    rng = np.random.default_rng(123)
    print("cell place s pred(vx,vy) true(vx,vy)")
    for r in range(grid):
        for c in range(grid):
            place = synth.pick_place(rng, 0.5, 0.3)
            f, lab = synth.generate_sample(rng, N=N, H=P, W=P, snr=(2.0, 8.0), place=place)
            with torch.no_grad():
                o = model(torch.from_numpy(f[None]))  # [1,C,1,1]
            det, vel, _ = model.split(o)
            s = float(torch.sigmoid(det).reshape(-1)[0])
            pv = torch.softmax(vel.reshape(1, -1), 1).reshape(vdim, vdim).numpy()
            y0 = r * cell; x0 = c * cell
            pred_img[y0:y0 + vdim, x0:x0 + vdim] = pv
            sw_img[y0:y0 + vdim, x0:x0 + vdim] = pv * s  # dark unless the net is confident
            # true-velocity marker, intensity encodes CLASS so FP are readable in-image:
            # center target -> 1.0 dot, offcenter target -> 0.5 dot, noise -> blank.
            tb = np.zeros((vdim, vdim), np.float32)
            if place in ("center", "offcenter"):
                cy = int(round(vr + lab["vy"] * vd)); cx = int(round(vr + lab["vx"] * vd))
                if 0 <= cy < vdim and 0 <= cx < vdim:
                    tb[cy, cx] = 1.0 if place == "center" else 0.5
            true_img[y0:y0 + vdim, x0:x0 + vdim] = tb
            cells = np.arange(vdim) - vr
            evx = (pv.sum(0) * cells).sum() / vd; evy = (pv.sum(1) * cells).sum() / vd
            tv = (f"{lab['vx']:+.2f},{lab['vy']:+.2f}" if place != "none" else " - ")
            print(f"({r},{c}) {place:9s} {s:.2f} {evx:+.2f},{evy:+.2f} {tv}")
    # order so the two comparison pages are ADJACENT (single scrollwheel toggle 1<->2):
    stack = np.stack([sw_img, true_img, pred_img])
    path = out if out.endswith(".tif") else out + "-velblocks.tif"
    synth.save_tiff_stack(stack, path)
    print(f"\nwrote {path} 3 pages: [0]=net P(vx,vy)*s (dark=no detection), "
          f"[1]=truth (1.0=center target, 0.5=offcenter target, blank=noise), [2]=raw P(vx,vy). "
          f"grid={grid}x{grid} cell={vdim}x{vdim}, step=0.25px/fr, center=0")


if __name__ == "__main__":
    main()
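
Both visualization scripts lay out per-sample blocks on a grid separated by 1-px NaN gaps, so ImageJ renders visible separators. The sketch below isolates that tiling arithmetic (`cell = block + gap`, `side = grid * cell - gap`). The helper name `tile_with_gaps` is hypothetical, and square blocks are assumed for brevity.

```python
import numpy as np


def tile_with_gaps(blocks, grid, gap=1):
    """Tile blocks[grid*grid, n, n] row-major into one image with NaN separators."""
    n = blocks.shape[-1]
    cell = n + gap                 # pitch between block origins
    side = grid * cell - gap       # no trailing gap after the last block
    img = np.full((side, side), np.nan, dtype=np.float32)
    for r in range(grid):
        for c in range(grid):
            y0, x0 = r * cell, c * cell
            img[y0:y0 + n, x0:x0 + n] = blocks[r * grid + c]
    return img


# Four 3x3 blocks on a 2x2 grid give a 7x7 image with one NaN row and column.
blocks = np.arange(4 * 3 * 3, dtype=np.float32).reshape(4, 3, 3)
img = tile_with_gaps(blocks, grid=2)
```

With `n = 11` (the velocity grid) and `grid = 10` the same arithmetic gives the 119x119 pages written by `viz_results.py`.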
# viz_trainingdata.py - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
"""
Visualize a subset of the training data as a time-sliced grid. # By Claude on 06/13/2026
Output: a multi-page 32-bit TIFF where
- page (slice) index = time/frame (0 = newest), scrub it to watch targets move;
- each page is a GxG grid of independent training samples (one patch per cell),
1-px NaN separators between cells (renders as a grid in ImageJ, like -RECT).
Plus a printed label table per cell (det / vx,vy / dx,dy / snr) so you can correlate
what you see with the ground truth - the (b) "is the training data right?" check.
Usage:
python viz_trainingdata.py /tmp/c5p_train_viz --grid 8 --snr 1 8 --frac_pos 0.4 --frac_off 0.4
"""
import argparse
import numpy as np
import synth


def make_grid_stack(rng, grid=8, N=8, P=24, snr=(1.0, 8.0),
                    frac_pos=0.4, frac_off=0.4, radial=False):
    """Returns (stack [N, gh, gw] with NaN gaps, labels list[grid*grid]).
    Three classes mixed: center-positive / off-center-negative / noise - so you can SEE the
    off-center targets the net must learn to suppress (det=0 despite a target in the patch)."""
    gap = 1
    cell = P + gap
    gh = grid * cell - gap
    gw = grid * cell - gap
    stack = np.full((N, gh, gw), np.nan, dtype=np.float32)
    labels = []
    for r in range(grid):
        for c in range(grid):
            place = synth.pick_place(rng, frac_pos, frac_off)
            frames, lab = synth.generate_sample(rng, N=N, H=P, W=P, snr=snr,
                                                place=place, radial=radial)
            y0 = r * cell; x0 = c * cell
            stack[:, y0:y0 + P, x0:x0 + P] = frames
            lab["cell"] = (r, c)
            labels.append(lab)
    return stack, labels


if __name__ == "__main__":
    ap = argparse.ArgumentParser()
    ap.add_argument("out")
    ap.add_argument("--grid", type=int, default=8)
    ap.add_argument("--nframes", type=int, default=8)
    ap.add_argument("--patch", type=int, default=24)
    ap.add_argument("--snr", type=float, nargs=2, default=[1.0, 8.0])
    ap.add_argument("--frac_pos", type=float, default=0.4)
    ap.add_argument("--frac_off", type=float, default=0.4)
    ap.add_argument("--radial", action="store_true")
    ap.add_argument("--seed", type=int, default=0)
    args = ap.parse_args()
    rng = np.random.default_rng(args.seed)
    stack, labels = make_grid_stack(rng, args.grid, args.nframes, args.patch,
                                    tuple(args.snr), args.frac_pos, args.frac_off, args.radial)
    path = args.out if args.out.endswith(".tif") else args.out + ".tif"
    synth.save_tiff_stack(stack, path)
    print(f"grid={args.grid}x{args.grid} {args.nframes} time slices patch={args.patch} "
          f"-> {path} (size {stack.shape[2]}x{stack.shape[1]})")
    print("cell(r,c) class det vx vy |off| snr")
    for lab in labels:
        r, c = lab["cell"]
        pl = lab["place"]
        if pl == "none":
            print(f" ({r},{c}) noise 0 - - - (~{lab['snr']:.1f})")
        else:
            off = np.hypot(lab["dx"], lab["dy"])
            print(f" ({r},{c}) {pl:9s} {int(lab['det'])} {lab['vx']:+.3f} {lab['vy']:+.3f} "
                  f"{off:.2f} {lab['snr']:.1f}")
# vote_1d.py - part of imagej_elphel_dnn (Elphel DNN: tile-processor motion detection / ranging)
#
# Copyright (C) 2026 Elphel, Inc.
#
# -----------------------------------------------------------------------------
# imagej_elphel_dnn is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
"""C5P v2 diagnostic: 1-D vote-contribution profile for a single horizontal target. By Claude 06/18/2026
Andrey's question: a disk pixel's local MF can lock the bright target at ANY temporal layer i, not
only the oldest (i=N-1). The velocity it reports is V_i = V_true + (P-H)/i and it back-projects to
T_i = H - V_true*(N-1) - (P-H)*[(N-1)/i - 1]
so the per-target vote is a FAN, not a delta; the i=N-1 term (T-8) is the densest ("closest wrong
solution") but real bright points at other layers smear it. This script renders ONE horizontal
target (vy=0; the y-axis is symmetric, so 1-D along x is enough), runs frozen Stage 1, and shows,
along the head row: the velocity ramp V_P(x), where each pixel votes (tail), and the resulting
1-D vote histogram for three weightings (s, s^2, MF path-sum). It quantifies the fan vs SNR/velocity.
"""
import numpy as np, torch, torch.nn.functional as F
import synth, stage2 as S2
from model import RawFCN
dev = "cuda" if torch.cuda.is_available() else "cpu"
N, vmax, P = 9, 2.8, 52
half, Nm1 = P // 2, N - 1
HW = 140
F_ = HW - P + 1
s1 = RawFCN(n_frames=N, patch=P, velocity_mode="reg", vmax=vmax).to(dev)
s1.load_state_dict(torch.load("/work/runs/stage1_mf/model.pt", map_location=dev)["model"]); s1.eval()
def render_h(V, amp, noise, blur_frac=1.0, nb=4, seed=0):
    """One horizontal target, track centered in the field, velocity (V,0), causal motion blur."""
    rng = np.random.default_rng(seed)
    frames = rng.standard_normal((N, HW, HW)).astype(np.float32) if noise else np.zeros((N, HW, HW), np.float32)
    Hx = HW / 2.0 + V * Nm1 * 0.5  # head x: center the whole track in the field
    Hy = HW / 2.0
    subs = np.arange(nb) * (blur_frac / nb)  # sub-frame sample times for the motion blur
    for i in range(N):
        acc = np.zeros((HW, HW), np.float64)
        for ss in subs:
            acc += synth.halfcos_bump(Hx - V * (i + ss), Hy, HW, HW)
        frames[i] += (amp * acc / nb).astype(np.float32)
    return frames, Hx, Hy
def bar(v, vmax_, width=40):
    n = int(round(width * v / vmax_)) if vmax_ > 0 else 0
    return "#" * n
def analyze(V, amp, noise, nseed=1, win=24):
    """Print the head-row ramp + the 1-D vote histogram (weights s, s^2, MF-sum)."""
    txf_true = (HW / 2.0 + V * Nm1 * 0.5 - V * Nm1) - half  # true tail, field x
    hxf = (HW / 2.0 + V * Nm1 * 0.5) - half  # head, field x
    yf = int(round(HW / 2.0 - half))  # head row, field y
    xs_win = np.arange(max(0, int(hxf) - win), min(F_, int(hxf) + win + 1))
    # accumulate vote histograms over seeds; keep seed-0 fields for the per-pixel table
    BINS = np.arange(-12, 21)  # tail offset (field x) relative to true tail
    hist = {k: np.zeros(len(BINS)) for k in ("s", "s2", "mf")}
    tab = None
    for seed in range(nseed):
        fr, Hx, Hy = render_h(V, amp, noise, seed=seed)
        s, vx, vy = S2.stage1_dense(s1, fr, dev=dev)
        mf = S2.mf_sum(torch.from_numpy(fr).to(dev), vx, vy, half, N)
        s = s.cpu().numpy(); vx = vx.cpu().numpy(); vy = vy.cpu().numpy(); mf = mf.cpu().numpy()
        sv = s[yf, xs_win]; vxv = vx[yf, xs_win]; mfv = mf[yf, xs_win]
        tail = xs_win - vxv * Nm1  # where each pixel votes (field x)
        toff = tail - txf_true  # offset from the true tail
        for name, w in (("s", sv), ("s2", sv * sv), ("mf", mfv)):
            for t, wv in zip(toff, w):
                b = int(round(t)) - BINS[0]
                if 0 <= b < len(BINS):
                    hist[name][b] += wv
        if seed == 0:
            tab = (xs_win - hxf, vxv, vy[yf, xs_win], toff, sv, mfv)
    print(f"\n===== V={V} amp={amp} noise={noise} (disk |dx|<{(vmax-abs(V))*Nm1:.1f}px; nseed={nseed}) =====")
    dx, vxv, vyv, toff, sv, mfv = tab
    print(" per-pixel (head row, seed0): dx=x-head V_P vy tailΔ(=tail-trueT) s MF")
    for j in range(0, len(dx), 2):
        if sv[j] > 0.05 or abs(dx[j]) < 6:
            print(f" dx={dx[j]:+5.1f} V_P={vxv[j]:+5.2f} vy={vyv[j]:+4.2f} tailΔ={toff[j]:+6.2f} s={sv[j]:.3f} MF={mfv[j]:6.2f}")
    for name in ("s2", "mf"):
        h = hist[name] / max(nseed, 1); mx = h.max()
        # stats: peak offset, weighted centroid, weighted std, concentration within +/-1.5 of true tail
        if h.sum() > 0:
            c = (BINS * h).sum() / h.sum()
            sd = np.sqrt(((BINS - c) ** 2 * h).sum() / h.sum())
            conc = h[np.abs(BINS) <= 1.5].sum() / h.sum()
            pk = BINS[h.argmax()]
        else:
            c = sd = conc = pk = 0
        print(f" -- vote histogram (weight={name}): peak@Δ{pk:+d} centroidΔ{c:+.2f} width(σ){sd:.2f} conc(±1.5){conc*100:.0f}%")
        for b, hv in zip(BINS, h):
            mark = " <-trueT" if b == 0 else ""
            print(f" Δ{b:+3d} |{bar(hv, mx):40s}| {hv:7.2f}{mark}")
if __name__ == "__main__":
    analyze(V=1.0, amp=5, noise=False)
    analyze(V=2.5, amp=5, noise=False)
    analyze(V=1.0, amp=5, noise=True, nseed=16)
    analyze(V=2.5, amp=5, noise=True, nseed=16)
    analyze(V=2.5, amp=2, noise=True, nseed=16)
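# Illustrative sketch, not part of the original script: the fan described in the vote_1d.py
# module docstring can be checked numerically without the trained model. The helper name
# `fan_tail_offsets` and its argument names are hypothetical. It assumes the docstring's
# noise-free 1-D geometry: a pixel at position P locks the target as seen in layer i
# (at H - V_true*i), reports V_i = V_true + (P-H)/i, and back-projects over N-1 frames.

```python
import numpy as np


def fan_tail_offsets(head_x, pixel_x, v_true, n_frames):
    """Per-layer reported velocity and tail offset for one pixel (1-D, noise-free).

    Hypothetical helper for illustration only; it evaluates the docstring
    formulas directly and does not run Stage 1.
    """
    i = np.arange(1, n_frames)                    # layers 1..N-1 (i=0 excluded: division by zero)
    d = float(pixel_x) - float(head_x)            # (P - H) in the docstring
    v_i = v_true + d / i                          # V_i = V_true + (P-H)/i
    tail_i = pixel_x - v_i * (n_frames - 1)       # where this pixel's vote lands
    true_tail = head_x - v_true * (n_frames - 1)  # H - V_true*(N-1)
    return i, v_i, tail_i - true_tail             # offset = -(P-H)*[(N-1)/i - 1]


# Example with the script's N=9: a pixel 3 px ahead of the head, V_true = 1 px/frame.
layers, v_rep, d_tail = fan_tail_offsets(head_x=70.0, pixel_x=73.0, v_true=1.0, n_frames=9)
for li, vi, ti in zip(layers, v_rep, d_tail):
    print(f"i={li}  V_i={vi:+.3f}  tail offset={ti:+.3f}")
```

# Under these assumptions only the i=N-1 layer lands on the true tail, for every pixel;
# the younger layers land progressively farther away, which is the fan that the vote
# histogram in analyze() measures.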