    CLAUDE: native LibTorch L1+L2 inference shim (libtpdnn.so) for JNA · 9202dd30
    Andrey Filippov authored
    Piece 2 of the native-JNA DNN path. tp_dnn.cpp is a C-ABI port of infer_server.py's
    hot path so the Java client can run L1+L2 in-process instead of over TCP:
      tpdnn_init/upload/infer/free  (+ num_levels/level_frames)
    faithfully reproducing build_pyramid, the 16x shift-and-stitch full-res recovery,
    decode (ghostbuster + velocity centroid), and the L2 ConvGRU recurrence + track-age.
    Loads the TorchScript models from imagej_elphel_dnn (export_torchscript /
    export_l2_torchscript). Disables the TorchScript JIT fuser at init (nvrtc element-wise
    fusion fails on Blackwell; production wants no runtime nvrtc).
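The four entry points named above follow the usual opaque-handle shape for a C ABI that JNA can bind. The following is only a minimal sketch of that shape, not the contents of tp_dnn.cpp: the function names come from the commit message, but every signature, the `TpdnnCtx` struct, and the placeholder "inference" (which just doubles the input) are assumptions made for illustration. The real shim holds TorchScript modules and CUDA tensors instead.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical context object; stands in for the models and tensors
// that the real shim would keep alive between calls.
struct TpdnnCtx {
    int width;
    int height;
    std::vector<float> frame;  // last uploaded frame (host memory in this sketch)
};

extern "C" {

// Assumed signature: returns an opaque handle, or nullptr on failure.
// C++ exceptions must not propagate across the C boundary, so they are caught here.
void* tpdnn_init(int width, int height) {
    if (width <= 0 || height <= 0) return nullptr;
    try {
        TpdnnCtx* ctx = new TpdnnCtx{width, height, {}};
        ctx->frame.assign(static_cast<std::size_t>(width) * height, 0.0f);
        return ctx;
    } catch (...) {
        return nullptr;
    }
}

// Assumed signature: copies n floats into the context; 0 on success, -1 on bad arguments.
int tpdnn_upload(void* handle, const float* data, int n) {
    TpdnnCtx* ctx = static_cast<TpdnnCtx*>(handle);
    if (!ctx || !data || n < 0 ||
        static_cast<std::size_t>(n) != ctx->frame.size()) return -1;
    std::copy(data, data + n, ctx->frame.begin());
    return 0;
}

// Assumed signature: writes n floats to out; 0 on success, -1 on bad arguments.
// Placeholder computation only: out[i] = 2 * frame[i].
int tpdnn_infer(void* handle, float* out, int n) {
    TpdnnCtx* ctx = static_cast<TpdnnCtx*>(handle);
    if (!ctx || !out || n < 0 ||
        static_cast<std::size_t>(n) != ctx->frame.size()) return -1;
    for (int i = 0; i < n; ++i) out[i] = 2.0f * ctx->frame[i];
    return 0;
}

// Releases the context; passing nullptr is a no-op.
void tpdnn_free(void* handle) {
    delete static_cast<TpdnnCtx*>(handle);
}

}  // extern "C"
```

Keeping the handle as `void*` and returning integer status codes means the Java side only needs a JNA `Pointer`, primitive arrays and ints, with no C++ types or exceptions leaking through the interface.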
    
    Validated: native vs the running Python server (same CUDA) max|diff| offset5=0,
    roi=0 — bit-for-bit. (Oracle dump_ref.py + driver tpdnn_test.cpp, scratch.)
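The max|diff| check above can be illustrated with a small helper. This is a sketch under assumptions, not the scratch driver tpdnn_test.cpp: the names `max_abs_diff` and `bitwise_equal` are hypothetical, and both buffers are assumed to have already been copied to host memory as `float`. The second helper is included because max|diff| = 0 alone does not distinguish +0.0 from -0.0, while a byte comparison does.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstring>
#include <limits>
#include <vector>

// Largest absolute element-wise difference between two buffers.
// Returns -1.0 on a size mismatch and +infinity if any difference is NaN,
// so that NaNs cannot silently pass as "no difference".
double max_abs_diff(const std::vector<float>& a, const std::vector<float>& b) {
    if (a.size() != b.size()) return -1.0;
    double worst = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        double d = std::fabs(static_cast<double>(a[i]) - static_cast<double>(b[i]));
        if (std::isnan(d)) return std::numeric_limits<double>::infinity();
        worst = std::max(worst, d);
    }
    return worst;
}

// Strict bit-for-bit comparison of the underlying bytes.
bool bitwise_equal(const std::vector<float>& a, const std::vector<float>& b) {
    if (a.size() != b.size()) return false;
    if (a.empty()) return true;
    return std::memcmp(a.data(), b.data(), a.size() * sizeof(float)) == 0;
}
```

A result of exactly 0 from `max_abs_diff` on every compared output, as reported for offset5 and roi, is what rules out even rounding-level divergence between the native path and the Python reference.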
    
    Built standalone via build_dnn.sh (g++ + libtorch 2.7.1+cu128, ABI=1), separate
    from the nvcc-built libtileproc.so; fetch_libtorch.sh pulls the pinned libtorch.
    Context unification + zero-copy kernel<->tensor sharing is a later step.
    Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Name
.gitignore
build_dnn.sh
build_lib.sh
build_probe.sh
fetch_libtorch.sh
tp_dnn.cpp
tp_jna.cpp
tp_nvrtc_probe.cpp