- 25 Jun, 2026 3 commits
-
-
Andrey Filippov authored
Add tp_convert_direct_selftest to the JNA shim. It mirrors the convert path of TpHostGpu allTests (setImageKernels/setImgBuffers/setCltBuffers/setTasks + calc_reverse_distortions -> rot_derivs -> calculate_tiles_offsets [CDP] -> convert_direct), reusing the harness runtime-API host helpers (tp_utils/tp_files/TpParams/tp_paths) for ALL allocation and porting only the launches to driver-API cuLaunchKernel against the NVRTC module. Reads the CLT back and compares it to the clt/aux_chnN.clt golden data. build_lib.sh: nvcc with -std=c++17 (so static constexpr TpParams members become inline), -Isrc + cuda-samples Common (helper_cuda.h), --pre-include algorithm. Validated on RTX 5060 Ti via Java->JNA: num_active_tiles=5120 (all), max|CLT-golden| = 0.1085 over peaks of 12260 -> relative ~8.85e-6 (float32 NVRTC-vs-nvcc variation). First CDP (calculate_tiles_offsets) and 17-arg pointer-of-pointers convert_direct launch executing natively on Blackwell. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
-
Andrey Filippov authored
Add TpInstance to the JNA shim: device buffers (gpu_geometry_correction, gpu_rByRDist, gpu_rot_deriv, gpu_correction_vector) + setters (HtoD), the two pure-geometry launches (calcReverseDistortionTable {16,1,1}/{3,3,3}, calc_rot_deriv {num_cams,1,1}/{3,3,3}), and readback getters. Launches use driver-API cuLaunchKernel against the NVRTC module (mirroring GpuQuad.execCalcReverseDistortions / execRotDerivs, no JCuda). build_lib.sh builds libtileproc.so. Validated via Java->JNA against tile_processor_gpu/clt reference data on the RTX 5060 Ti: rByRDist matches clt/*.rbyrdist to ~1e-7 (aux 16-cam and main), rot_deriv rows are orthogonal to ~1e-10 (scaled-rotation structure, det~zoom^3). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
-
Andrey Filippov authored
Add the libtileproc shim (tp_jna.cpp: extern "C" tp_create_module/num_functions/last_error/destroy) + standalone tp_nvrtc_probe.cpp + build_probe.sh. NVRTC-compiles the kernels (+ JCUDA defines) -> cuLink(libcudadevrt, CDP) -> module -> 19 functions, validated on the RTX 5060 Ti (sm_120 via compute_90 PTX + driver JIT). Build artifacts are gitignored. Part of the JCuda->JNA migration. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
-
- 03 Dec, 2025 2 commits
-
-
Andrey Filippov authored
-
Andrey Filippov authored
-
- 28 Sep, 2025 1 commit
-
-
Andrey Filippov authored
-
- 22 Jul, 2025 1 commit
-
-
Andrey Filippov authored
-
- 15 Apr, 2025 2 commits
-
-
Andrey Filippov authored
-
Andrey Filippov authored
-
- 13 Apr, 2025 1 commit
-
-
Andrey Filippov authored
-
- 12 Apr, 2025 3 commits
-
-
Andrey Filippov authored
-
Andrey Filippov authored
-
Andrey Filippov authored
-
- 10 Apr, 2025 2 commits
-
-
Andrey Filippov authored
-
Andrey Filippov authored
-
- 09 Apr, 2025 1 commit
-
-
Andrey Filippov authored
-
- 08 Apr, 2025 1 commit
-
-
Andrey Filippov authored
-
- 07 Apr, 2025 1 commit
-
-
Andrey Filippov authored
-
- 06 Apr, 2025 1 commit
-
-
Andrey Filippov authored
-
- 03 Apr, 2025 1 commit
-
-
Andrey Filippov authored
-
- 01 Apr, 2025 2 commits
-
-
Andrey Filippov authored
-
Andrey Filippov authored
-
- 31 Mar, 2025 1 commit
-
-
Andrey Filippov authored
-
- 26 Mar, 2025 1 commit
-
-
Andrey Filippov authored
-
- 18 Feb, 2025 1 commit
-
-
Andrey Filippov authored
-
- 13 Feb, 2025 1 commit
-
-
Andrey Filippov authored
-
- 08 Feb, 2024 1 commit
-
-
Andrey Filippov authored
-
- 27 Nov, 2022 1 commit
-
-
Andrey Filippov authored
-
- 21 Nov, 2022 2 commits
-
-
Andrey Filippov authored
-
Andrey Filippov authored
-
- 20 Nov, 2022 2 commits
-
-
Andrey Filippov authored
-
Andrey Filippov authored
-
- 19 Nov, 2022 2 commits
-
-
Andrey Filippov authored
-
Andrey Filippov authored
-
- 16 Nov, 2022 1 commit
-
-
Andrey Filippov authored
-
- 14 Nov, 2022 1 commit
-
-
Andrey Filippov authored
-
- 13 Nov, 2022 1 commit
-
-
Andrey Filippov authored
-
- 10 Aug, 2022 1 commit
-
-
Andrey Filippov authored
-
- 16 Jun, 2022 2 commits
-
-
Andrey Filippov authored
tiles)
-
Andrey Filippov authored
-