- 04 Jul, 2026 9 commits
-
-
Andrey Filippov authored
Phase A2/B building block: consolidate the 16 per-sensor CLT channels into ONE averaged TD channel (average images BEFORE correlation - multiply averages, not average products). Per-tile granularity: sum sensors that have the tile (first element NaN = absent), count, divide; count plane returned as the weight; a stray in-tile NaN poisons the whole result tile (fail-visible). Not available on GPU (combine_inter only sums correlation PRODUCTS) - this CPU implementation + get/setCltData D2H/H2D is the A2 bridge and the bit oracle for the future clt_average_sensors kernel.
  - CuasTD.validateConsolidation(): linearity oracle - imclt(TD-avg) must equal pixel-average of per-sensor imclt renders (same GPU imclt both sides); prints count-plane stats + max|diff|/RMS, saves -CUAS-TDAVG-CHECK 3-slice stack, restores original TD. Wired into the curt_cond_test branch after perSensorFromRawJp4 (uses its raw-jp4 16-sensor TD).
  - GpuQuadJna.getCltData() override added (base derefs null gpu_clt_h on JNA shells - the known un-overridden-accessor class); uses tp_proc_get_clt.
mvn compile clean.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
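The per-tile rule above (sum the sensors that have the tile, count, divide, return the count as weight) can be sketched in a few lines. This is only an illustration: the class name, the `[sensor][tile][element]` layout and the explicit NaN fill of a poisoned tile are assumptions, not the CuasTD code.

```java
// Hypothetical sketch; class name and array layout are assumptions, not the CuasTD API.
final class TdSensorAverage {
    /**
     * Averages per-sensor tiles into one channel.
     * clt[sensor][tile][element]; a tile whose first element is NaN is absent.
     * count[tile] receives the number of contributing sensors (the weight plane).
     */
    static float[][] average(float[][][] clt, int[] count) {
        int numTiles = clt[0].length;
        int tileLen = clt[0][0].length;
        float[][] avg = new float[numTiles][tileLen];
        for (int tile = 0; tile < numTiles; tile++) {
            int n = 0;
            for (float[][] sensor : clt) {
                float[] t = sensor[tile];
                if (Float.isNaN(t[0])) continue;        // this sensor does not have the tile
                for (int i = 0; i < tileLen; i++) avg[tile][i] += t[i];
                n++;
            }
            count[tile] = n;
            boolean poisoned = (n == 0);
            for (int i = 0; i < tileLen; i++) {
                avg[tile][i] /= n;                       // 0/0 gives NaN when no sensor contributed
                poisoned |= Float.isNaN(avg[tile][i]);   // a stray in-tile NaN survives the sum
            }
            // Assumption: make the failure visible by invalidating the whole tile.
            if (poisoned) java.util.Arrays.fill(avg[tile], Float.NaN);
        }
        return avg;
    }
}
```

Averaging before correlation keeps the operation linear, which is what a check of the validateConsolidation() kind relies on: the render of the average should match the average of the renders.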
-
Andrey Filippov authored
Findings from the FILT150 run: (1) scattered rank-150 selection starved the per-scene neighbor consolidation (min_str_neib/eig_str_neib) - only ~57 tiles with accidental neighbors were measured per scene (neighbors-vs-measured corr 0.78); (2) roll degraded (RMS 0.106 vs 0.059 mrad, bias +0.073) - the selection carried only 11% of the full set's roll information. deriveSelection() stage 2 now picks disjoint 3x3 CLUSTERS of gate-passing tiles (>=CLUSTER_MIN_ELIGIBLE=6 of 9), round-robin from three pools: LEFTMOST, RIGHTMOST (per Andrey - edge tiles have the most roll influence), BEST-QUALITY (median member fmax), until the tile budget is filled; scattered best tiles fill any remainder. Offline simulation on the real calibration: 24 clusters (8/8/8), 150 tiles, mean 3.97 in-selection neighbors, roll info +48% vs scattered rank-150. Measurement code untouched (oracle identical). mvn compile clean. Co-Authored-By:Claude Fable 5 <noreply@anthropic.com>
-
Andrey Filippov authored
Per Andrey: (1) calibration reuse is now automatic - if -POSE-RT-MAXDXY exists it is used (filtered run), else a full run generates it; new curt_pose_recalc flag forces regeneration (replaces the backwards curt_pose_use_filt enable). Matches the FPN reuse pattern. (2) MAXDXY stores NaN instead of +inf for NaN-in-any-scene tiles - deriveSelection rejects NaN and +inf identically (non-finite), and NaN keeps the TIFF viewable in ImageJ (+inf broke min/max autoscaling). mvn compile clean (Eyesis closed). Co-Authored-By:Claude Fable 5 <noreply@anthropic.com>
-
Andrey Filippov authored
Replaces the absolute curt_pose_max_dxy with the scale-free scheme (Andrey's histogram rule formalized + robustness for worse footage):
  - Calibration artifact -POSE-RT-MAXDXY.tiff: per-tile max-over-scenes residual, +inf where any scene NaN (auto-reject, mergeable across runs by max), NaN where unmeasured. Saved only from FULL-selection runs (a filtered run never shrinks coverage). Continuous statistic persisted, boolean selection derived at load - policy can change without re-measuring.
  - deriveSelection(): stage 1 outlier gate: keep max <= median + k*NMAD of finite per-tile maxes (curt_pose_dxy_k=0.75; on the reference footage: MBEN gate 0.477 keeps 595, degraded NOMB self-adapts to 0.728 keeps 626 - same ~65%); stage 2 rank-N budget: keep curt_pose_num_tiles=150 best (threshold-free).
  - curt_pose_use_filt now loads MAXDXY and derives; missing -> full run generates it (FPN-style reuse pattern).
Importance-greedy (3x3 information matrix) ranking = next step. mvn compile clean.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
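A minimal sketch of the two-stage selection described above, under stated assumptions: NMAD is taken here as 1.4826 x median(|x - median|), which may differ from the normalization used in deriveSelection(), and the class and method names are hypothetical.

```java
// Hypothetical sketch of "median + k*NMAD gate, then rank-N budget"; not the project code.
final class DxySelectionSketch {
    static double median(double[] sorted) {
        int n = sorted.length;
        if (n == 0) return Double.NaN;
        return (n % 2 == 1) ? sorted[n / 2] : 0.5 * (sorted[n / 2 - 1] + sorted[n / 2]);
    }

    /** Stage 1 threshold over the finite per-tile maxima (NaN and +inf are ignored). */
    static double gate(double[] maxDxy, double k) {
        double[] finite = java.util.Arrays.stream(maxDxy).filter(Double::isFinite).sorted().toArray();
        final double med = median(finite);
        double[] dev = java.util.Arrays.stream(finite).map(v -> Math.abs(v - med)).sorted().toArray();
        return med + k * 1.4826 * median(dev); // 1.4826: assumed NMAD normalization
    }

    /** Stage 2: of the gate-passing tiles keep at most numTiles with the smallest maxima. */
    static boolean[] select(double[] maxDxy, double k, int numTiles) {
        final double g = gate(maxDxy, k);
        java.util.List<Integer> pass = new java.util.ArrayList<>();
        for (int i = 0; i < maxDxy.length; i++) {
            if (Double.isFinite(maxDxy[i]) && maxDxy[i] <= g) pass.add(i);
        }
        pass.sort(java.util.Comparator.comparingDouble(i -> maxDxy[i]));
        boolean[] selected = new boolean[maxDxy.length];
        for (int j = 0; j < Math.min(numTiles, pass.size()); j++) selected[pass.get(j)] = true;
        return selected;
    }
}
```

Because the gate is derived from the data's own median and spread, it moves with the footage quality, while the rank-N stage needs no threshold at all.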
-
Andrey Filippov authored
Per Andrey: a selected tile is BAD if its measured dxy is NaN in any scene or exceeds curt_pose_max_dxy (absolute, default 0.25 pix) in at least one scene. Survivors saved as -POSE-RT-RELIABLE-FILT.tiff; curt_pose_use_filt loads it on a next run and ANDs with the strength selection (two-pass workflow: full run calibrates the selection, subsequent runs use ~191 clean tiles instead of 1074 on the reference footage; kept-tile mean dxy 0.087 pix). Co-Authored-By:Claude Fable 5 <noreply@anthropic.com>
-
Andrey Filippov authored
Exercises the existing MB machinery in the RT iterator: when imp.mb_en is ON, per-scene blur vectors from OpticalFlow.getMotionBlur (FD-based rates) send interCorrPair down the setInterTasksMotionBlur/interCorrTDMotionBlur path - convert_direct runs twice, the second run subtracting the shifted+scaled copy via negative TpTask.scale (LWIR bolometer exponential-tail removal). mb_en OFF keeps the single-run path, giving a one-checkbox A/B. Same getMotionBlur usage as offline setInitialOrientationsCuas (stored truth was produced WITH MB on). mvn compile clean (Eyesis closed). Co-Authored-By:Claude Fable 5 <noreply@anthropic.com>
-
Andrey Filippov authored
Per Andrey: (1) fitted-vs-stored deltas reported in PIXELS (the informative unit), using the same scales as the LMA par_scales - az/tilt = focal/pixelSize, roll = distortionRadius/pixelSize; mrad kept secondary. (2) A 'CuasPoseRT scene i (of N) <timestamp> Done/FAILED' line after each fit so the unlabeled LMA iteration prints above it are attributable to a scene (SYSTEM_OUT-01.log had iterations but no index/timestamp). Per-scene line also shows dstored in pix. Verified with standalone javac (Eyesis live - no mvn). Co-Authored-By:Claude Fable 5 <noreply@anthropic.com>
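The angle-to-pixel conversion named above is a plain multiplication by the stated scale. The sketch below assumes that focal length, distortion radius and pixel size are given in the same length unit; the class and parameter names are hypothetical.

```java
// Hypothetical helper; units (all lengths in mm, angles in radians) are an assumption.
final class PoseDeltaUnits {
    /** az/tilt delta in pixels: angle * focal / pixelSize. */
    static double azTiltToPix(double deltaRad, double focalMm, double pixelSizeMm) {
        return deltaRad * focalMm / pixelSizeMm;
    }

    /** roll delta in pixels at the distortion radius: angle * distortionRadius / pixelSize. */
    static double rollToPix(double deltaRad, double distortionRadiusMm, double pixelSizeMm) {
        return deltaRad * distortionRadiusMm / pixelSizeMm;
    }
}
```

With these scales a roll error and an azimuth error of equal pixel size displace an edge-of-field feature by a comparable amount, which is why pixels are the more informative unit here.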
-
Andrey Filippov authored
Per Andrey: characterize the per-scene measurement, incl. the eigenvector data. -POSE-RT-HYPER (80x64, z=scenes, t=components, make_hyper layout): dx, dy, strength, dxy=|dx,dy| from vector_XYS; sqrt_l0, sqrt_l1 (peak-ellipse half-axes, pix), elong=sqrt(l1/l0) (linear-feature indicator), eig0_ang (precise-axis direction, [0,PI)) from coord_motion eigen {eig_x,eig_y,l0,l1} - NaN unless imp.eig_use. Data = last LMA cycle's coord_motion via the existing coord_motion_rslt out-param. -POSE-RT-RELIABLE = tile selection mask. Verified with standalone javac against target/classes (Eyesis live - no mvn; Eclipse rebuilds on restart). Co-Authored-By:Claude Fable 5 <noreply@anthropic.com>
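The derived components listed above follow directly from the per-tile measurement. A rough sketch, assuming the axis direction is taken from the (eig_x, eig_y) vector and folded into [0, PI); the class name and the output ordering are illustrative only.

```java
// Hypothetical sketch; how eig0_ang is actually derived in the project is an assumption.
final class HyperComponentsSketch {
    /** Returns {dxy, sqrt_l0, sqrt_l1, elong, eig0_ang}. */
    static double[] components(double dx, double dy, double eigX, double eigY, double l0, double l1) {
        double ang = Math.atan2(eigY, eigX);   // direction in [-PI, PI]
        if (ang < 0) ang += Math.PI;           // an axis has no sign: fold into [0, PI]
        if (ang >= Math.PI) ang -= Math.PI;    // and map PI itself to 0
        return new double[] {
            Math.hypot(dx, dy),                // dxy = |dx,dy|
            Math.sqrt(l0),                     // half-axis, pix
            Math.sqrt(l1),                     // half-axis, pix
            Math.sqrt(l1 / l0),                // elong: 1 = round peak, large = linear feature
            ang
        };
    }
}
```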
-
Andrey Filippov authored
Top-level scene iterator re-generating per-scene 3-angle poses against the persistent virtual-center reference, RT-style: ascending time order, zero-order prediction seeding (fit anchored to the center, prediction only warm-starts the LMA), single pass on the final combo DSI (no refinement pass - it only existed because disparity arrived after initial orientations offline). Measurement engine = proven Interscene.adjustPairsLMAInterscene (reference GPU data set once); phase B will swap it for the lean TD-average x virtual-center path with GPU argmax+eigen kernels, keeping this iterator + CSV as the oracle.
  - new cuas/rt/CuasPoseRT.testPoseSequence(): reference prep (strength > curt_pose_str tile selection, setReferenceGPU with center CLT), stored-pose seed/truth from center ErsCorrection scenes_poses, per-scene fit with 3-angle param_select (XYZ locked), ERS dt from pose finite differences (disable_ers), MB off, coast-on-failure; writes -POSE-RT-TEST.csv + fitted-vs-stored summary
  - params curt_pose_test (bool) + curt_pose_str (1.0) - 6 plumbing sites
  - OpticalFlow curt_en branch: curt_pose_test runs INSTEAD of detection
Build: mvn compile clean. Runtime validation pending (Eclipse/Eyesis run on sequence 1773135476_186641, truth = re-adjusted stored poses).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
-
- 03 Jul, 2026 5 commits
-
-
Andrey Filippov authored
CLAUDE: self-documenting comments: TpTask bridge role, differential rectification, offset composition Comment-only (no code change; mvn compile clean). Documents, from Andrey's explanation: TpTask as the Java<->CUDA work-list bridge; the per-sensor xy offset as the differential-rectification composed shift (factory kernel offset + misalignment + disparity + relative pose) split integer/fractional; historic host-side vs current GPU-side geometry fill; updateTasks() D2H; disp_dist[cam][4] = d(x,y)/d(disp,ndisp) Jacobian consumed by Corr2dLMA and lazy-eye/ERS. Co-Authored-By:Claude Fable 5 <noreply@anthropic.com>
-
Andrey Filippov authored
New curt_calib parameter (CUAS RT dialog, saved/restored) runs/bypasses the per-sensor photometric (re)calibration as the first step of the CUAS RT processing flow, before detection (no longer tied to the diagnostic). Extracted CuasMotion.rtPhotometricCalibration() = convertFromData() (upload + own uniform grid convert, split out of perSensorFromData) + fit + apply/ save. Production step converts and calibrates without saving stacks; the curt_cond_test diagnostic (replaces detection) keeps the raw-vs-conditioned stack compare and makes the calibration step save -CUAS-PERSENSOR[-ADJ]. Co-Authored-By:Claude Fable 5 <noreply@anthropic.com>
-
Andrey Filippov authored
The virtual -CENTER INTERFRAME corr-xml only carries poses/velocities, so saving the recalculated 16+16 lwir offsets/scales there was futile. Follow the established photometric machinery (runPhotometric()/photoEach()) and the top-menu save/restore convention instead: set the new values on master_CLT (immediate use), quadCLTs[ref_index] (physical photometric owner, its <scene>-INTERFRAME.corr-xml is saved) and quadCLT_main (applied to next sequences and saved in the main configuration file). Co-Authored-By:Claude Fable 5 <noreply@anthropic.com>
-
Andrey Filippov authored
curt_cond_test rework: both PERSENSOR stacks now converted with the test's own uniform sensor-domain task grid (scale 1.0) instead of leftover GPU state (MB secondary tasks with negative fractional scales made 'raw' renders = -1/6 x input; leftover virtual-view grid lost the same border ROI on every sensor). perSensorFromRawJp4 no longer overwrites the scene's conditioned image_data.
GpuQuadJna.setBayerImages(force,center) restored the base-class skip-guard via a native-side jna_bayer_set flag (gpuTileProcessor is null in JNA shell instances): previously every execConvertDirect unconditionally re-pulled quadCLT.getResetImageData(), silently clobbering explicit uploads, which made the raw baseline bit-identical to the conditioned render.
CuasMotion.perSensorLinearFit(): per-sensor a+b*x photometric fit over safe tiles (weak strength<0.5 or far disparity<1 from -INTER-INTRA-LMA, inner rect, 8x8 tile->pixel map) against the cross-sensor mean, gauge keep_averages (mean offset 0, mean scale 1), 3-sigma outlier rejection. Validated on 1773135476_186641: sensor-mean spread 1353->5 counts, cross-sensor RMS 358->17 (inliers), b in 0.83..1.11.
CuasMotion.applyLwirLinearCalibration(): folds the fit into the 16+16 lwir offsets/scales (scale'=b*scale, offset'=offset-a/scale'), updates the center instance + photometric_scene provenance, saves -INTERFRAME.corr-xml. Applied the standard way at load, they compensate the remaining per-sensor mismatch of the raw /jp4/ tiffs.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
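The fold of a per-sensor fit (a, b) into the stored calibration uses the two update formulas quoted above. The sketch below only transcribes them; the class name is hypothetical and how the offsets/scales are then applied to the raw data is left to the existing machinery.

```java
// Hypothetical transcription of scale' = b*scale, offset' = offset - a/scale'.
final class LwirCalibrationFold {
    /** Returns {scale', offset'} for one sensor. */
    static double[] fold(double scale, double offset, double a, double b) {
        double newScale = b * scale;
        double newOffset = offset - a / newScale; // note: divides by the NEW scale
        return new double[] { newScale, newOffset };
    }
}
```

An identity fit (a=0, b=1) leaves both values unchanged, which is a cheap sanity check when wiring the step into a processing flow.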
-
Andrey Filippov authored
Add make_hyper parameter (code-selected, not in settings): 0 - flat stack (old behavior, all pre-existing call sites); >0 - transpose to hyperstack [sensors][avgs+timestamps][pixel], z (top slider) - timestamps, t (bottom slider) - sensor channels; 2 - insert per-timestamp average of all used sensors as first channel (17th), computed from final conditioned slices. Per-sensor full + center-fraction averages now work in individual mode (pre-calculated merged-only average falls back to slice computation with a warning instead of AIOOBE). Number of average frames stays variable (0/1/2); fopen paths bit-identical by design. Verified on 495-scene CUAS sequence: INDIVIDUAL debug hyperstack matches the MERGED convention - to be used as oracle for RT conditioning. Co-Authored-By:Claude Fable 5 <noreply@anthropic.com>
-
- 02 Jul, 2026 2 commits
-
-
Andrey Filippov authored
-
Andrey Filippov authored
Add a curt_cond_test path (boolean at the top of the CUAS-RT dialog) that, inside the curt_en branch, renders the 16 per-sensor images and saves them to the -CENTER instance for calibration inspection:
  - CuasMotion.perSensorAveragesFromTD: imclt the per-sensor TD, print the 16-sensor average spread, save -CUAS-PERSENSOR (16-slice stack, per-slice avg labels). Saves via the -CENTER instance, not gpuQuad.getQuadCLT().
  - CuasMotion.perSensorFromRawJp4: read RAW /jp4/ per-sensor (oracle getJp4Tiff, one thread/sensor), force-H2D (bypass the "GPU mem already correct" verify), execConvertDirect from raw, save -CUAS-PERSENSOR-RAW (uncorrected baseline; calibration stays a separate "cheat"). RT-seed for the future SATA raw stream; GPU port later with Java as oracle.
Fix the NaN border on the RT SUBAVG-CONV2D product:
  - CuasDetectRT subtract-average -> NaN-tolerant union (average only non-NaN scenes per pixel), matching the oracle -CUAS-MERGED-CUAS; the plain sum NaN-propagated (one missing scene poisoned the pixel in every frame -> thick border after LoG).
  - CuasRTUtils.convolve2DLReLU -> NaN-aware (NaN out only if the center is NaN; substitute the center value for NaN taps), so the LoG can't bloom a thin border into a thick NaN frame.
  - Add -SUBAVG-PRELOG save (post-subtract-avg, pre-LoG) for bisecting.
Compiles (mvn -DskipTests clean package). WIP: the raw-path values/edge and the in-memory-vs-file (MERGED-CUAS) divergence are still under review; the ~28px edge residual is traced to the temporal subtract-average at the rotation-swept composite edge. See ANDREY_CONTINUE.md open items.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
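Both NaN rules above are easy to state in code. The sketch below is an illustration only: names are hypothetical, the LReLU stage of convolve2DLReLU is omitted, the kernel is applied without flipping (equivalent for a symmetric LoG), and treating out-of-frame taps like NaN taps is an assumption.

```java
// Hypothetical sketch of the NaN-tolerant average and the NaN-aware filter.
final class NanTolerantOps {
    /** Per-pixel mean over scenes[scene][pixel], using only the non-NaN scenes. */
    static float[] nanMean(float[][] scenes) {
        int n = scenes[0].length;
        float[] mean = new float[n];
        for (int p = 0; p < n; p++) {
            double sum = 0;
            int cnt = 0;
            for (float[] s : scenes) {
                if (!Float.isNaN(s[p])) { sum += s[p]; cnt++; }
            }
            mean[p] = (cnt > 0) ? (float) (sum / cnt) : Float.NaN;
        }
        return mean;
    }

    /** Output is NaN only where the center is NaN; NaN taps are replaced by the center value. */
    static float[] filterNanAware(float[] img, int width, int height, float[] kernel, int ksize) {
        int r = ksize / 2;
        float[] out = new float[img.length];
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                float center = img[y * width + x];
                if (Float.isNaN(center)) { out[y * width + x] = Float.NaN; continue; }
                double acc = 0;
                for (int ky = -r; ky <= r; ky++) {
                    for (int kx = -r; kx <= r; kx++) {
                        int yy = y + ky, xx = x + kx;
                        boolean inside = yy >= 0 && yy < height && xx >= 0 && xx < width;
                        float v = inside ? img[yy * width + xx] : Float.NaN; // assumption: off-frame = NaN
                        if (Float.isNaN(v)) v = center;
                        acc += kernel[(ky + r) * ksize + (kx + r)] * v;
                    }
                }
                out[y * width + x] = (float) acc;
            }
        }
        return out;
    }
}
```

Substituting the center value makes a missing tap contribute nothing to a zero-sum kernel response, so a thin NaN border stays thin after filtering.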
-
- 01 Jul, 2026 1 commit
-
-
Andrey Filippov authored
Piece 1 of the RT conditioning migration (design: internal handoff 2026-06-30_rt_conditioning_design.md):
  - cuas/rt/CuasConditioning: lean, self-contained per-sensor conditioning for the TP/RT path - Row/Col denoise (on/off, optional HPF of the 1-D avg profile) then photometric scales2*(raw-C0)^2 + scale*(raw-C0) - FPN (bit-matches the current additive path when scales2=0). Bypasses the heavy QuadCLT conditioning path.
  - CuasMotion.perSensorAveragesFromTD(GpuQuad, use_reference): memory-lean render of all 16 per-sensor from TD; per-sensor average + spread = calibration-quality gauge.
Building blocks only; full test wiring (raw jp4 -> condition -> convert_direct -> renderSceneSequence per-sensor averages) + Eyesis invocation entry still pending.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
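The photometric step quoted above is a per-pixel quadratic followed by the FPN subtraction. A minimal sketch with hypothetical names (the row/column denoise that precedes it is not shown):

```java
// Hypothetical per-pixel form of scales2*(raw-C0)^2 + scale*(raw-C0) - FPN.
final class PhotometricSketch {
    static float condition(float raw, float c0, float scale, float scales2, float fpn) {
        float d = raw - c0;                       // remove the per-sensor offset first
        return scales2 * d * d + scale * d - fpn; // reduces to scale*d - fpn when scales2 == 0
    }
}
```

With scales2=0 the quadratic term vanishes and the expression is the purely linear form, which is the case the commit describes as bit-matching the current additive path.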
-
- 27 Jun, 2026 3 commits
-
-
Andrey Filippov authored
Adds an OFF-by-default profile that dependency:unpacks the libtorch native runtime (org.pytorch:libtorch-cxx11-cu128:2.7.1:zip, cu128) from mirror.elphel.com/maven-dependencies into target/libtorch-dist for the native DNN backend (libtpdnn.so / CuasDnnLocal) on a deployment box. The default build never downloads the 3.8GB zip. Artifact published to the mirror in maven layout (server-side copy of the existing zip) via tile_processor_gpu/jna/publish_libtorch_mirror.sh. Verified: zip + .pom reachable at the computed maven URL (HTTP 200, 3.78GB), profile parses (mvn -Plibtorch validate OK). Full unpack deferred (redundant on this box - libtorch already extracted); exercises on first deployment machine via `mvn -Plibtorch generate-resources`. Co-Authored-By:Claude Opus 4.8 (1M context) <noreply@anthropic.com>
-
Andrey Filippov authored
Bundles the exported TorchScript models + their .meta.json sidecars under src/main/resources/cuas_dnn/<name>/ so CuasDnnLocal runs with no local model dir (deployment needs no PyTorch/dev tooling - just the .so + libtorch runtime): weighted9_pm_s/model.ts.pt (+.meta.json) L1 (N=9,P=24,vr=5,out_ch=124) mexhat_gaps_boost40/model.l2.ts.pt (+.meta.json) L2 (ch_hidden=24,vmax=1.4) Validated: CuasDnnLocal bundled-resource path (curt_dnn_local_dir empty) extracts from the jar and matches the server oracle EXACTLY (offset5=0.0, roi=0.0). Co-Authored-By:Claude Opus 4.8 (1M context) <noreply@anthropic.com>
-
Andrey Filippov authored
Piece 3 of the native-JNA DNN path. Adds a local backend that runs the SAME L1+L2 inference as CuasDnnRemote but in-process via LibTorch (libtpdnn.so / JNA), so the CUAS pipeline runs without the DGX or any Python server:
  - CuasDnnBackend : shared interface (upload/getNFrames/inferBatch->BatchResult/close)
  - TpDnnJna : JNA Library binding libtpdnn.so's C-ABI
  - CuasDnnLocal : wraps it; reads N/P/vr/l2_ch from each model's bundled .meta.json (single source of truth), float[][]<->float[], builds BatchResult
  - CuasDnnRemote : now implements CuasDnnBackend (signatures unchanged)
  - CuasDetectRT : DNN path gate now fires on (curt_dnn_remote || curt_dnn_local); backend = local? CuasDnnLocal : CuasDnnRemote; ensureServer skipped when local; local-CPU-ORT gate also excludes curt_dnn_local (no double-run). runDnnRemote loop unchanged.
  - IntersceneMatchParameters: curt_dnn_local (flag) + curt_dnn_local_dir (model dir override; empty = bundled /cuas_dnn resource) + GUI labels/persist.
Validated: full Java->JNA->libtpdnn vs the Python-server oracle = EXACT (offset5=0.0, roi=0.0, nch=6). mvn -DskipTests package OK. Runtime: -Djna.library.path=<dir with libtpdnn.so>; libtpdnn.so finds libtorch via its rpath. Model resolution mirrors CuasDnnRemote's bundled-vs-override scheme.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
-
- 26 Jun, 2026 14 commits
-
-
Andrey Filippov authored
Migration validated (JNA CUAS targets match JCuda). Cleanup:
  - Removed all TEMP debug probes (-Dtp.dbg.corrpair, probeClt, saveTDRender, the one-shot DBG/PROBE blocks in GpuQuadJna + CuasMotion). Real fixes kept (rectilinear port, num_pairs=3, setCorrIndicesTdData, imclt ref_scene, num_corr_tiles propagation f6dcc90f).
  - Proactive sweep for the f6dcc90f bug-class (JNA override drops a base side-effect field write): getCorrComboIndices/getCorr2DCombo propagate num_corr_combo_tiles, setCorrIndicesTdData propagates num_corr_tiles, getTextureIndices propagates num_texture_tiles; those fields made protected. These four are LATENT (no live consumer on the validated CUAS path) and are marked NOT-YET-TESTED inline.
Java-only. mvn compile clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
-
Andrey Filippov authored
Root cause of the CORR2D-all-NaN / 0-targets: the inter-correlation actually works (probe showed num_corr_tiles=8850 = 4425 tiles x (1 sensor + 1 sum)), but the TD readback dropped it. Base GpuQuad.getCorrIndices() sets the num_corr_tiles field ("also sets num_corr_tiles"); GpuQuadJna.getCorrIndices() read the native count locally and returned the array WITHOUT setting the field. So TDCorrTile.getFromGpu (num_tiles = getNumCorrTiles()/num_pairs) and base getCorrTilesTd (uses the field directly) saw a stale 0 -> built 0 tiles -> empty target sequence -> null ROUND_ONE image -> saveImagePlusInModelDirectory NPE (the misplaced-null-guard latent bug is just the messenger). Fix: GpuQuadJna.getCorrIndices() sets num_corr_tiles = n (native count); field made protected so the subclass can. Java-only. Co-Authored-By:Claude Opus 4.8 (1M context) <noreply@anthropic.com>
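The bug class described above (an override returns the right data but drops a side-effect field write that other code depends on) can be reproduced in miniature. All class names below are hypothetical stand-ins, not GpuQuad or GpuQuadJna.

```java
// Hypothetical miniature of the "override drops a base side-effect" bug class.
class BaseQuadSketch {
    protected int numCorrTiles;                       // consumers read this field later

    protected int[] readIndices() { return new int[] {0, 1, 2, 3}; }

    int[] getCorrIndices() {
        int[] idx = readIndices();
        numCorrTiles = idx.length;                    // the easily missed side effect
        return idx;
    }

    int getNumCorrTiles() { return numCorrTiles; }
}

class NativeQuadBuggy extends BaseQuadSketch {
    @Override
    int[] getCorrIndices() {
        return readIndices();                         // correct array, stale field (stays 0)
    }
}

class NativeQuadFixed extends BaseQuadSketch {
    @Override
    int[] getCorrIndices() {
        int[] idx = readIndices();
        numCorrTiles = idx.length;                    // restore the base-class contract
        return idx;
    }
}
```

The buggy variant passes any check that inspects only the returned array, which is why the failure surfaced several calls downstream as an empty tile set.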
-
Andrey Filippov authored
Post-mortem showed both CLT buffers loaded but inter-correlation -> 0 tiles. index_inter_correlate selects by __popc(sel_sensors); static reading says sel_sensors should be 1 (single-cam rectilinear), so a runtime value differs. - GpuQuadJna.execCorr2D_inter_TD: one-shot print sel_sensors/popc/num_cams/ num_colors/scales + the returned num_corr_tiles. - saveTDRender: makeArrays NPE'd on null titles (derefs titles[i]); pass a non-null titles[] so the render saves instead of crashing the run. TEMP — remove with the rest of the -Dtp.dbg.corrpair probe. Co-Authored-By:Claude Opus 4.8 (1M context) <noreply@anthropic.com>
-
Andrey Filippov authored
Single sacrificial run -> generous logging without spam: - GpuQuadJna: probe BOTH first ref convert (gpu_clt_ref) AND first scene convert (gpu_clt) — NaN%/nonzero/range each (probeClt helper). - CuasMotion.correlatePair one-shot: log targets_mv / tp_ref,tp_img counts / erase_cltr,erase_clt / fpixels null-ness, plus TD-correlation read-back stats (tile count + NaN% of TD values) alongside the DBG-REF/DBG-IMG renders. All gated/one-shot; no native change (reads via existing tp_proc_get_clt). TEMP — remove with the rest of the -Dtp.dbg.corrpair probe. Co-Authored-By:Claude Opus 4.8 (1M context) <noreply@anthropic.com>
-
Andrey Filippov authored
(1) GpuQuadJna.execImcltRbgAll now passes ref_scene -> tp_proc_exec_imclt(use_ref) so renderFromTD(true) renders gpu_clt_ref, not gpu_clt (TpJna binding updated). (2) TEMP post-mortem in CuasMotion.correlatePair (gate -Dtp.dbg.corrpair=1, one-shot): after the inter-correlation, SAVE ref (gpu_clt_ref) + scene (gpu_clt) CLT renders to the model dir via saveImagePlusInModelDirectory (persist past the later crash; no window flood). DBG-REF blank/NaN => reference not loaded => explains the all-NaN CORR2D (inter corr needs both images). REMOVE after fix. Needs the tile_processor_gpu imclt use_ref commit + libtileproc.so rebuild. Co-Authored-By:Claude Opus 4.8 (1M context) <noreply@anthropic.com>
-
Andrey Filippov authored
Diagnostic for the CORR2D-all-NaN numeric divergence: after the first ref_scene=true convert, read back gpu_clt_ref (tp_proc_get_clt use_ref=1) and print NaN%/nonzero/min/max once. Confirms whether the reference convert populates gpu_clt_ref at all (vs scene gpu_clt which is correct -> SOURCE). Prints one "PROBE gpu_clt_ref[cam0]: ..." line to System.out (captured in the per-scene -SYSTEM_OUT.log). TEMP — remove after the divergence is fixed. Co-Authored-By:Claude Opus 4.8 (1M context) <noreply@anthropic.com>
-
Andrey Filippov authored
Third JNA-mode gap on the CUAS oracle path: TDCorrTile.convertTDtoPD re-uploads host-selected TD correlation tiles via GpuQuad.setCorrIndicesTdData, which was not overridden -> base JCuda cuMemcpyHtoD on a null gpu_corr_indices -> NPE. Adds the GpuQuadJna override (ensureRbgCorr() then delegate to the new native tp_proc_set_corr_indices_td) + the TpJna binding. Gap-finder over the full CUAS TD path (CuasMotion + TDCorrTile) confirms this was the LAST GPU-touching un-overridden method; the rest are pure config getters. Needs the matching tile_processor_gpu commit (native fn) + libtileproc.so rebuild. mvn compile clean. Co-Authored-By:Claude Opus 4.8 (1M context) <noreply@anthropic.com>
-
Andrey Filippov authored
Second JNA-mode failure after the Phase-1 NPE fix: cudaErrorIllegalAddress at tp_utils.cu:142 (image upload), actually a DEFERRED fault from the inter-scene correlate2D_inter kernel writing out of bounds. Root cause: GpuQuadJna.ensureRbgCorr sized the native correlation buffers via Correlation2d.getNumPairs(num_cams). For the rectilinear single-camera config num_cams=1 -> getNumPairs(1)=0 -> tp_proc_setup_rbg_corr allocates zero-size gpu_corrs_td / gpu_corrs / gpu_corr_indices, so the inter-scene correlation wrote past them -> illegal address, surfacing (sticky) at the next CUDA call. Fix: mirror the JCuda oracle, whose rectilinear ctor hardcodes num_pairs=3 (GpuQuad.java:732) for exactly the inter-scene case -> int num_pairs = rectilinear ? 3 : Correlation2d.getNumPairs(num_cams); Java-only; libtileproc.so untouched. mvn compile clean. Co-Authored-By:Claude Opus 4.8 (1M context) <noreply@anthropic.com>
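The sizing issue above comes down to one expression. The sketch assumes the pair count is the usual n*(n-1)/2 over sensors, which is consistent with getNumPairs(1)=0 in the message but is not copied from Correlation2d; the class name is hypothetical.

```java
// Hypothetical sketch of the buffer-sizing rule; allPairs() is an assumed stand-in.
final class NumPairsSketch {
    /** Number of distinct sensor pairs; 0 for a single camera. */
    static int allPairs(int numCams) {
        return numCams * (numCams - 1) / 2;
    }

    /** Rectilinear single-camera case is hardcoded to 3, as in the fix above. */
    static int numPairs(boolean rectilinear, int numCams) {
        return rectilinear ? 3 : allPairs(numCams);
    }
}
```

A zero-size allocation does not fail at allocation time, so the out-of-bounds write only shows up as a sticky error at a later, unrelated CUDA call.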
-
Andrey Filippov authored
Fixes the whole bug-class behind the -Dtp.backend=jna NPE in GpuQuad.setLpfRbg (CuasRanging.detectTargets -> CuasMotion -> setRectilinearReferenceTD): the rectilinear single-camera GpuQuad was built via the raw JCuda ctor, bypassing the backend factory, so in JNA mode it got a null gpuTileProcessor.
  - GpuQuad.createRectilinear(): backend-aware factory parallel to create(). JCUDA branch is byte-for-byte the legacy ctor (oracle path untouched); JNA branch builds a clean rectilinear GpuQuadJna. New no-alloc rectilinear ctor (num_cams=1, no kernels/geometry).
  - GpuQuadJna: rectilinear ctor + shared initNative(); the two overrides the gap-finder predicted -- reAllocateClt (no-op; native CLT pre-sized in setup) and singular setBayerImage (-> tp_proc_set_image). execConvertDirect already guarded on the rectilinear flag.
  - CuasMotion:452 routed through createRectilinear (CUAS rectilinear now JNA-capable).
  - ComboMatch:899 fail-loud UnsupportedOperationException in JNA mode (orthomosaic, wider unported surface, off the current path -- stays JCuda).
Java-only; libtileproc.so untouched. mvn compile clean. JCuda legacy frozen as oracle; core convert_direct flag-soup cleanup deferred to Phase 2.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
-
Andrey Filippov authored
Root-level, Andrey-facing, single predictable "what to do right now" file the agent overwrites before session exit, so lost tmux scrollback never costs the restart plan. Local-only; gitignored alongside CLAUDE.md/AGENTS.md/MEMORY.md. Co-Authored-By:Claude Opus 4.8 (1M context) <noreply@anthropic.com>
-
Andrey Filippov authored
just debug
-
Andrey Filippov authored
Completes the oracle GPU surface. The reliable gap finder is comm -23 <(ImageDtt gpuQuad.* calls) <(GpuQuadJna overrides), not the gpuTrace dump (only ~14 methods are instrumented, so e.g. getFlatTextures was invisible in the trace though it is on the path). Overrides (delegating to the new tp_proc_* texture API):
  - execTextures: builds weights[3]/params[5], forwards calc_textures/calc_extra/linescan/dust/keep flags. Implements the production (USE_DS_DP) behavior.
  - getTextureIndices: reads kernel-built count + packed indices.
  - getExtra: reshapes diff_rgb_combo (texture_indices order) into [num_cams*(num_colors+1)][tilesX*tilesY] keyed by ntile -- identical to base.
  - getFlatTextures: de-pitches gpu_textures -- identical to base.
TpJna.java: bindings for tp_proc_exec_textures/get_texture_indices/get_diff_rgb_combo/get_textures. Edits only -- not mvn-compiled (Eyesis run was live). Signatures match base @Override; referenced fields are public final / public static.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
-
Andrey Filippov authored
gpuTrace now prints each Class.method ONCE (was per-call -> spammy). Oracle JCuda trace showed it uses the TD-correlation readback path: getCorrIndices / getCorrTdData / getCorrComboIndices / eraseGpuCorrs (un-overridden -> would NPE in JNA). Override them via the new native tp_proc_get_corr_indices / get_corr_combo_indices / get_corr_td (DtoH) + tp_proc_erase_corrs. mvn -DskipTests compile clean. Co-Authored-By:Claude Opus 4.8 (1M context) <noreply@anthropic.com>
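A print-once trace of this kind can be sketched with a concurrent set keyed by Class.method. The helper below is an assumption about shape only (static helper, boolean return for testability); the -Dtp.trace=1 gate and the "[GPUTRACE] " prefix are taken from the commit messages.

```java
// Hypothetical print-once trace helper; not the GpuQuad.gpuTrace implementation.
final class GpuTraceSketch {
    private static final java.util.Set<String> SEEN =
            java.util.concurrent.ConcurrentHashMap.newKeySet();

    static boolean enabled() {
        return "1".equals(System.getProperty("tp.trace")); // off unless -Dtp.trace=1
    }

    /** Prints "[GPUTRACE] Class.method" on the first call only; returns true if it printed. */
    static boolean trace(Object self, String method) {
        if (!enabled()) return false;
        // The runtime class is used, so a subclass falling through to a base method is visible.
        String key = self.getClass().getSimpleName() + "." + method;
        if (!SEEN.add(key)) return false;                  // already reported
        System.out.println("[GPUTRACE] " + key);
        return true;
    }
}
```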
-
Andrey Filippov authored
Add GpuQuad.gpuTrace(m) printing "[GPUTRACE] "+getClass().getSimpleName()+"."+m (off unless -Dtp.trace=1). Instrument the un-overridden GPU methods (potential oracle gaps): getCltData, presentCltData, eraseGpuCorrs, execCorr2D (bundled), readbackTasks, setFullFrameImages, getCorrTdData, getCorrIndices, getCorrComboIndices, getExtra, getTextureIndices, getRBGA, execRBGA, execTextures. Since GpuQuadJna extends GpuQuad, the trace prints "GpuQuad.X" under JCuda and "GpuQuadJna.X" if a JNA run falls through to one (= coverage gap) -> reveals oracle's real GPU usage before any NPE. Co-Authored-By:Claude Opus 4.8 (1M context) <noreply@anthropic.com>
-
- 25 Jun, 2026 6 commits
-
-
Andrey Filippov authored
updateTasks is called all over ImageDtt right after execSetTilesOffsets (reads gpu_ftasks back to rebuild TpTask[] with computed centerXY/disp_dist) -> tp_proc_get_tasks (DtoH). getWH returns full frame (base returns null gpu_clt_wh). Proactive (locating same-cause base-method derefs of null JCuda fields). Co-Authored-By:Claude Opus 4.8 (1M context) <noreply@anthropic.com>
-
Andrey Filippov authored
convertCenterClt -> setComboToTD -> setCltData pushes the restored center CLT to GPU; base getCltSize/ getNumTiles deref null gpu_clt_wh. Override to full-frame dims; setCltData -> new tp_proc_set_clt (HtoD per-cam slice, inverse of tp_proc_get_clt). Fixes the third JNA NPE (getCltSize:1211). Co-Authored-By:Claude Opus 4.8 (1M context) <noreply@anthropic.com>
-
Andrey Filippov authored
updateQuadCLT sets quadCLT + flag-only resetGeometryCorrection*(); skips the base's gpuTileProcessor.bayer_set clear (N/A natively - bayer re-uploaded each convert). resetBayer no-op. Fixes the second JNA NPE (updateQuadCLT:263). Co-Authored-By:Claude Opus 4.8 (1M context) <noreply@anthropic.com>
-
Andrey Filippov authored
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
-
Andrey Filippov authored
setLpfRbg flattens the 4x64 r/b/g/m arrays -> "lpf_data"; setLpfCorr -> const_name (lpf_corr / lpf_rb_corr). Uploads to the native module's constant memory, matching JCUDA. mvn clean. Co-Authored-By:Claude Opus 4.8 (1M context) <noreply@anthropic.com>
-
Andrey Filippov authored
GpuQuad.create(gpuTileProcessor, quadCLT, debug) returns the JCuda GpuQuad by default, or the native GpuQuadJna when -Dtp.backend=jna (srcdir/devrt overridable via -Dtp.jna.srcdir / -Dtp.jna.devrt). Routed all 32 main+aux `new GpuQuad(...)` 3-arg sites in Eyesis_Correction.java through the factory. JCUDA remains the default (behavior identical when the property is unset). mvn -DskipTests compile clean. Migration now fully implemented + compiling end-to-end (Step 1 native TpProc API, Step 2 GpuQuadJna full CUAS surface, Step 3 selector). Ready for the JCUDA-vs-JNA comparison + incremental troubleshooting. Co-Authored-By:Claude Opus 4.8 (1M context) <noreply@anthropic.com>
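The selector pattern above (legacy backend by default, alternative backend when a system property is set) looks roughly as follows. The interface and class names are hypothetical; only the tp.backend property name and its jna value come from the commit message.

```java
// Hypothetical sketch of a property-selected backend factory.
interface BackendSketch {
    String name();
}

final class BackendFactorySketch {
    /** Returns the default backend unless -Dtp.backend=jna is set. */
    static BackendSketch create() {
        String sel = System.getProperty("tp.backend", "jcuda");
        if ("jna".equalsIgnoreCase(sel)) {
            return () -> "jna";      // stand-in for the native implementation
        }
        return () -> "jcuda";        // default path, unchanged when the property is unset
    }
}
```

Routing every construction site through one factory is what makes the later failures informative: any site still calling a constructor directly bypasses the selection.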
-