Commit 9202dd30 authored by Andrey Filippov

CLAUDE: native LibTorch L1+L2 inference shim (libtpdnn.so) for JNA

Piece 2 of the native-JNA DNN path. tp_dnn.cpp is a C-ABI port of infer_server.py's
hot path so the Java client can run L1+L2 in-process instead of over TCP:
  tpdnn_init/upload/infer/free  (+ num_levels/level_frames)
faithfully reproducing build_pyramid, the 16x shift-and-stitch full-res recovery,
decode (ghostbuster + velocity centroid), and the L2 ConvGRU recurrence + track-age.
Loads the TorchScript models from imagej_elphel_dnn (export_torchscript /
export_l2_torchscript). Disables the TorchScript JIT fuser at init (nvrtc element-wise
fusion fails on Blackwell; production wants no runtime nvrtc).

Validated: native vs the running Python server (same CUDA) max|diff| offset5=0,
roi=0 — bit-for-bit. (Oracle dump_ref.py + driver tpdnn_test.cpp, scratch.)

Built standalone via build_dnn.sh (g++ + libtorch 2.7.1+cu128, ABI=1), separate
from the nvcc-built libtileproc.so; fetch_libtorch.sh pulls the pinned libtorch.
Context unification + zero-copy kernel<->tensor sharing is a later step.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
parent 7540202f
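The commit message reports max|diff| = 0 between the native shim and the Python server. A stricter check is a byte-wise comparison of raw output dumps. The sketch below assumes both sides can write their outputs to plain binary files; the helper name `same_bits` and the file names are hypothetical, since the commit does not describe the dump format used by dump_ref.py or tpdnn_test.cpp.

```shell
# Sketch: succeed only when two raw dumps are byte-identical.
# Stricter than max|diff| == 0, which would also accept +0.0 vs -0.0.
# Assumes both files already exist; paths are supplied by the caller.
same_bits() {
    cmp -s "$1" "$2"
}
```

Hypothetical usage: `same_bits ref_roi.bin native_roi.bin && echo "bit-for-bit"`.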
libtileproc.so
libtpdnn.so
tp_nvrtc_probe
*.o
#!/usr/bin/env bash
# Build libtpdnn.so — native LibTorch L1+L2 inference shim for JNA (no Python server).
# Separate from libtileproc.so (nvcc kernels): this links libtorch (g++). The two .so's unify
# their CUDA context later (zero-copy kernel<->tensor). By Claude on 2026-06-26.
#
# Requires libtorch 2.7.1+cu128 (matches the TorchScript export torch version). Set LIBTORCH to
# its root (default /home/elphel/git/libtorch). Run jna/fetch_libtorch.sh to obtain it.
set -e
cd "$(dirname "$0")"
LIBTORCH="${LIBTORCH:-/home/elphel/git/libtorch}"
[ -d "$LIBTORCH/include/torch" ] || { echo "libtorch not found at $LIBTORCH (set LIBTORCH= or run fetch_libtorch.sh)"; exit 1; }
g++ -std=gnu++17 -O3 -DNDEBUG -fPIC --shared \
-D_GLIBCXX_USE_CXX11_ABI=1 \
-I"$LIBTORCH/include" -I"$LIBTORCH/include/torch/csrc/api/include" \
tp_dnn.cpp \
-o libtpdnn.so \
-L"$LIBTORCH/lib" -Wl,-rpath,"$LIBTORCH/lib" \
-ltorch -ltorch_cpu -ltorch_cuda -lc10 -lc10_cuda
echo "built ./libtpdnn.so (LIBTORCH=$LIBTORCH)"
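Since the Java side binds to the library by symbol name through JNA, a post-build check that the entry points are actually exported can catch a missing `extern "C"` early. The following is a minimal sketch, not part of the commit. It assumes the slash-separated names in the commit message expand to `tpdnn_init`, `tpdnn_upload`, `tpdnn_infer` and `tpdnn_free`, and that they are exported unmangled; the helper `missing_exports` is hypothetical.

```shell
# Sketch: report expected tpdnn_* entry points that are absent from nm output.
# The symbol names are inferred from the commit message (assumption).
EXPECTED_EXPORTS="tpdnn_init tpdnn_upload tpdnn_infer tpdnn_free"

# Reads `nm -D --defined-only` output on stdin and prints each missing symbol.
# The body runs in a subshell so its variables do not leak into the caller.
missing_exports() (
    symbols="$(awk '{print $NF}')"   # last column of each nm line is the symbol name
    for name in $EXPECTED_EXPORTS; do
        printf '%s\n' "$symbols" | grep -qx -- "$name" || echo "$name"
    done
)
```

It could be run after the build as `nm -D --defined-only libtpdnn.so | missing_exports`; empty output means all four names were found.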
#!/bin/bash
# Fetch + extract the pinned libtorch (cu128 / CUDA 12.8, Blackwell sm_120) from mirror.elphel.com.
# Runtime dependency for native DNN inference (L1/L2 via TorchScript). NOT in git (~3.8 GB zip / ~GB extracted).
# Default extract location: /home/elphel/git/libtorch (native build uses -DCMAKE_PREFIX_PATH=<that>).
# By Claude on 06/27/2026.
set -euo pipefail
LT_ZIP="libtorch-cxx11-abi-shared-with-deps-2.7.1-cu128.zip"
LT_URL="https://mirror.elphel.com/libtorch/${LT_ZIP}"
PARENT="${1:-/home/elphel/git}" # libtorch extracts to $PARENT/libtorch
DEST="$PARENT/libtorch"
if [ -f "$DEST/build-version" ]; then
echo "libtorch already present: $DEST ($(cat "$DEST/build-version"))"; exit 0
fi
mkdir -p "$PARENT"; cd "$PARENT"
echo "Downloading $LT_URL ..."
# NOTE: mirror.elphel.com WAF returns 406 to curl's default UA -> use a browser UA.
curl -fSL -A "Mozilla/5.0 (X11; Linux x86_64)" "$LT_URL" -o "$LT_ZIP"
echo "Extracting -> $DEST ..."
unzip -q -o "$LT_ZIP" # extracts top-level ./libtorch/
rm -f "$LT_ZIP"
echo "libtorch ready: $DEST ($(cat "$DEST/build-version"))"
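The fetch script treats any existing `build-version` file as proof that libtorch is present, while the build comments require 2.7.1+cu128 specifically. A sketch of a stricter check follows. It assumes `build-version` holds a single version string such as `2.7.1+cu128`; the helper `check_libtorch_version` is hypothetical and not part of the commit.

```shell
# Sketch: compare <libtorch root>/build-version against a pinned version string.
# Assumption: the file contains just the version, e.g. "2.7.1+cu128".
# Exit status: 0 on match, 1 on mismatch, 2 when the file is missing.
check_libtorch_version() (
    root="$1"
    expected="$2"
    [ -f "$root/build-version" ] || { echo "no build-version under $root" >&2; exit 2; }
    found="$(tr -d '[:space:]' < "$root/build-version")"   # drop trailing newline
    [ "$found" = "$expected" ] || { echo "libtorch $found, expected $expected" >&2; exit 1; }
)
```

For example, `check_libtorch_version /home/elphel/git/libtorch "2.7.1+cu128"` could gate the build so that a stale libtorch left at the default location is reported rather than silently linked.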