CLAUDE: FIX convert_direct deconvolution in JNA — pass (kernels_hor, kernels_vert), not (0, *)
This is the production-mismatch bug (RMSE ~1.7 vs JCUDA, invariant to FPN/row-col/MB). convert_direct gates
the deconvolution kernels on `kernels_hor>0` (TileProcessor.cu:2782-2783): with kernels_hor=0 it passes
NULL kernels, so NO deconvolution is applied. tp_proc_exec_convert_direct hardcoded kh=0 (copied from the harness,
whose golden was itself made with no deconvolution), so JNA skipped aberration deconvolution while
production GpuQuad passes (kernels_hor, kernels_vert)=(82,66) and applies it.
Fix: add kernels_vert to TpProc (= kern_tiles/(kernels_hor*num_colors)); exec passes
(no_kernels?0:kernels_hor, no_kernels?0:kernels_vert). tp_proc_convert_selftest now uses no_kernels=1
to keep matching the NO-deconv harness golden (StageProc still PASS: CLT 0.1085 / RBG 0.0201 / corr 2e-5).
Production (GpuQuadJna, no_kernels=false) now applies deconvolution and therefore matches JCUDA. .so-only change.
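
Rough sketch of the selection logic described above, for reviewers. This is not the actual TpProc code: the struct TpProcSketch, the helper names, and num_colors=3 in the usage note are assumptions for illustration only. Only the field names, the kernels_vert formula, the no_kernels ternaries, and the `kernels_hor>0` gate come from this message.

```c
#include <assert.h>

/* Hypothetical stand-in for the TpProc fields named in this commit. */
typedef struct {
    int kern_tiles;    /* total number of kernel tiles */
    int kernels_hor;   /* kernel tiles per row */
    int kernels_vert;  /* kernel tiles per column (the newly added field) */
    int num_colors;    /* color components per kernel tile position */
} TpProcSketch;

/* kernels_vert = kern_tiles / (kernels_hor * num_colors), as stated in the fix. */
static int derive_kernels_vert(int kern_tiles, int kernels_hor, int num_colors)
{
    assert(kernels_hor > 0 && num_colors > 0);
    return kern_tiles / (kernels_hor * num_colors);
}

/* Choose the (kh, kv) pair handed to convert_direct:
 * (0, 0) when no_kernels is set, the real dimensions otherwise. */
static void select_kernel_dims(const TpProcSketch *p, int no_kernels,
                               int *kh, int *kv)
{
    *kh = no_kernels ? 0 : p->kernels_hor;
    *kv = no_kernels ? 0 : p->kernels_vert;
}

/* Mirrors the gate described for convert_direct: kernels are used only if kh > 0. */
static int deconvolution_enabled(int kh)
{
    return kh > 0;
}
```

With the production dimensions (82,66) and an assumed num_colors=3, kern_tiles=16236 gives kernels_vert=66; no_kernels=0 selects (82,66) and enables deconvolution, while no_kernels=1 selects (0,0) and reproduces the harness behavior.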
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>