- 27 Jun, 2026 1 commit
-
-
Andrey Filippov authored
Piece 1 of the native-JNA DNN path (no Python server). Adds export_l2_torchscript.py, which wraps a trained Layer2Net's cell and head into a single-step L2Step module, forward(x,h)->(h_new,det,vel). This is exactly infer_server's per-scene recurrence (h=cell(x,h); det,vel=decode(h)), so the C++ side only carries h and calls the module once per scene. Size-agnostic (circular pad + 1x1 head), so it runs on the full field. Validated: scripted==eager exact (0.0); C++ LibTorch (libtorch_probe/l2_probe) loads it on Blackwell CUDA and replays the recurrence with hidden-state match 9.5e-7. This required disabling the TorchScript JIT fuser: nvrtc element-wise fusion fails on the Blackwell -arch, and production wants no runtime nvrtc. Disabling the fuser folds into the native lib startup in piece 2. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
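The single-step contract described in this commit (the caller carries h and invokes the step once per scene) can be illustrated without PyTorch. The sketch below is a toy in plain Python: `l2_step`, `replay`, and `max_abs_diff` are hypothetical names, and the arithmetic inside `l2_step` is an arbitrary stand-in for the real cell and head, not Layer2Net. It only shows the calling pattern and the kind of max-absolute-difference check the commit's validation figures suggest.

```python
def l2_step(x, h):
    """Toy single step: returns (h_new, det, vel).

    The math here is a placeholder for cell(x, h) and decode(h);
    it is NOT the trained Layer2Net.
    """
    h_new = [0.5 * hi + 0.5 * xi for xi, hi in zip(x, h)]  # stand-in for cell(x, h)
    det = sum(h_new) / len(h_new)                          # stand-in for the det head
    vel = max(h_new) - min(h_new)                          # stand-in for the vel head
    return h_new, det, vel


def replay(scenes, h0):
    """Carry the hidden state across scenes, one step call per scene."""
    h = h0
    outputs = []
    for x in scenes:
        h, det, vel = l2_step(x, h)
        outputs.append((det, vel))
    return h, outputs


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two equal-length vectors."""
    return max(abs(p - q) for p, q in zip(a, b))
```

A native caller would follow the same shape as `replay`: keep h between calls and compare its final hidden state against a reference with something like `max_abs_diff`.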
-
- 26 Jun, 2026 6 commits
-
-
Andrey Filippov authored
infer_server decode batched GPU_CHUNK scenes (p=[chunk,121,H,W]); a chunk of 16 takes ~2.5GB, which runs out of memory on the 16GB 5060 Ti (shared with Eyesis JCuda). Make GPU_CHUNK env-tunable (default 16 for big GPUs); run_infer_local.sh sets GPU_CHUNK=4 (+ expandable_segments) so local L1+L2 fits. By Claude 06/27/2026. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
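A minimal sketch of an env-tunable chunk size, assuming the value is read once from the GPU_CHUNK environment variable with a default of 16 as the commit states. The helper names (`gpu_chunk`, `chunks`, `decode_bytes`) are hypothetical and are not taken from infer_server. The 4-byte element size assumes float32, which the commit does not state.

```python
import os


def gpu_chunk(default=16):
    """Read GPU_CHUNK from the environment; fall back to `default` if unset or invalid."""
    raw = os.environ.get("GPU_CHUNK", "")
    try:
        value = int(raw)
    except ValueError:
        return default
    return value if value > 0 else default


def chunks(scenes, size):
    """Yield consecutive slices of `scenes`, each holding at most `size` items."""
    for start in range(0, len(scenes), size):
        yield scenes[start:start + size]


def decode_bytes(chunk, height, width, channels=121, bytes_per_value=4):
    """Size in bytes of a [chunk, channels, H, W] tensor (float32 assumed)."""
    return chunk * channels * height * width * bytes_per_value
```

Because the tensor size is linear in the chunk dimension, dropping from 16 to 4 cuts that tensor to a quarter for any H and W.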
-
Andrey Filippov authored
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
-
Andrey Filippov authored
Run the PyTorch L1+L2 server on the workstation 5060 Ti; Java pipeline points at 127.0.0.1:5577. Verified: server loads L1(weighted9_pm_s)+L2(l2_v1) on CUDA, warm-up + pyramid build OK. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
-
Andrey Filippov authored
Per Andrey: future DNN companions to imagej-elphel (all PyTorch, not Java) live here too. README + per-file header tagline generalized accordingly. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
-
Andrey Filippov authored
Validated in PoC: L1 (weighted9_pm_s) -> TorchScript -> C++/CUDA on Blackwell matches PyTorch (7.6e-4). Writes raw-f32 reference vectors for the native probe. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
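As an illustration of what a raw-f32 reference vector could look like on disk, here is a sketch that writes and reads a headerless float32 file using only the standard library. The little-endian byte order, the absence of any header, and the function names `write_raw_f32` and `read_raw_f32` are assumptions for this example; the commit does not specify the actual layout.

```python
import struct


def write_raw_f32(path, values):
    """Write `values` as consecutive little-endian float32 with no header.

    Returns the number of bytes written.
    """
    data = struct.pack("<%df" % len(values), *values)
    with open(path, "wb") as f:
        f.write(data)
    return len(data)


def read_raw_f32(path):
    """Read a headerless little-endian float32 file back into a list of floats."""
    with open(path, "rb") as f:
        data = f.read()
    count = len(data) // 4  # 4 bytes per float32
    return list(struct.unpack("<%df" % count, data[:count * 4]))
```

A format like this is easy for a C++ probe to load with a single binary read into a float buffer, provided both sides agree on the element count and byte order.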
-
Andrey Filippov authored
L1 RawFCN + L2 ConvGRU(torus), synthetic data gen, training/eval, infer_server, and export_torchscript.py (self-contained TorchScript for native LibTorch inference). GPLv3 (Elphel norm); headers on all .py/.sh; LICENSE = GPLv3. runs/ checkpoints untracked. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
-