    CLAUDE: FIX convert_direct deconvolution in JNA — pass (kernels_hor, kernels_vert), not (0, *) · 0399a26d
    Andrey Filippov authored
    THE production-mismatch bug (RMSE ~1.7 vs JCUDA, invariant to FPN/row-col/MB). convert_direct gates
    the deconvolution kernels on `kernels_hor>0` (TileProcessor.cu:2782-2783): with kernels_hor=0 it passes
    NULL kernels -> NO deconvolution. tp_proc_exec_convert_direct hardcoded kh=0 (copied from the harness,
    whose golden was itself made with no deconvolution), so JNA skipped aberration deconvolution while
    production GpuQuad passes (kernels_hor, kernels_vert)=(82,66) and applies it.
    
    Fix: add kernels_vert to TpProc (= kern_tiles/(kernels_hor*num_colors)); exec passes
    (no_kernels?0:kernels_hor, no_kernels?0:kernels_vert). tp_proc_convert_selftest now uses no_kernels=1
    to keep matching the NO-deconv harness golden (StageProc still PASS: CLT 0.1085 / RBG 0.0201 / corr 2e-5).
    Production (GpuQuadJna no_kernels=false) now applies deconvolution = matches JCUDA. .so-only change.
    Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
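The fix described in the commit message can be illustrated with a minimal C sketch. It is not the code from the repository: `TpProcSketch`, `derive_kernels_vert`, `select_kernel_dims`, and `gate_kernels` are hypothetical names, and only the field names `kern_tiles`, `kernels_hor`, `kernels_vert`, `num_colors`, and `no_kernels` come from the message. The gate function only models the behavior the message attributes to convert_direct (NULL kernels when `kernels_hor` is 0); the float kernel type is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the TpProc fields named in the commit message. */
typedef struct {
    int kern_tiles;   /* total kernel tiles across all colors */
    int kernels_hor;  /* kernel tiles per row */
    int num_colors;   /* number of color channels */
    int kernels_vert; /* derived: kernel tile rows */
    int no_kernels;   /* nonzero: run without deconvolution */
} TpProcSketch;

/* kernels_vert = kern_tiles / (kernels_hor * num_colors); 0 if undefined. */
static int derive_kernels_vert(int kern_tiles, int kernels_hor, int num_colors)
{
    assert(kern_tiles >= 0);
    if (kernels_hor <= 0 || num_colors <= 0)
        return 0;
    return kern_tiles / (kernels_hor * num_colors);
}

/* Picks the (kernels_hor, kernels_vert) pair to hand to convert_direct.
 * Before the fix the first value was always 0, which disabled deconvolution. */
static void select_kernel_dims(const TpProcSketch *p, int *kh, int *kv)
{
    *kh = p->no_kernels ? 0 : p->kernels_hor;
    *kv = p->no_kernels ? 0 : p->kernels_vert;
}

/* Model of the gate described in the message: with kernels_hor == 0 the
 * kernels pointer becomes NULL and no deconvolution is applied. */
static const float *gate_kernels(const float *kernels, int kernels_hor)
{
    return kernels_hor > 0 ? kernels : NULL;
}
```

With the production dimensions (82, 66) and an assumed `num_colors` of 3, `kern_tiles` would be 82 * 66 * 3, `derive_kernels_vert` returns 66, and `select_kernel_dims` yields (82, 66) unless `no_kernels` is set, in which case it yields (0, 0) and the gate returns NULL.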
Name
eclipse_setup
jna
src
.gitignore
LICENSE
README.md