Commit 96fc22bc authored by Andrey Filippov

new method refineMotionVectors() implemented by Claude and related docs

parent 03a9cc77
......@@ -556,6 +556,123 @@ This section captures the latest validated state before pausing Global LMA work
4. Run `Aux Build Series`.
5. Inspect console + CSV/TIFF outputs; archive debug artifacts as needed.
## `refineMotionVectors()` — Two-Stage Motion Vector Refinement
### Background: MCLT, Phase Correlation, and Noise Integration
The pipeline detects sub-pixel UAS targets using the GPU Tile Processor (TP), which operates
on fixed 16×16 50%-overlapped tiles using a 2D Modified Complex Lapped Transform (MCLT). MCLT
is structurally similar to FFT: convolution and correlation are multiplication in the Transform
Domain (TD). Critically, fractional-pixel image shifts in the pixel domain (PD) correspond to
phase rotations (multiplication by complex unit vectors) in the TD. Integer shifts select which
16×16 window is loaded; the residual fractional shift (−0.5 to +0.5 px) is applied as TD phase
rotation. Both are handled by the GPU kernel.
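The integer/fractional split and the shift theorem described above can be sketched as follows. This is an illustrative fragment, not pipeline code: the class name `ShiftSplit` and both method names are hypothetical, and the sign convention of the phase angle depends on the transform's forward/inverse definition.

```java
// Hedged sketch: split a desired sub-pixel shift into the integer part that
// selects the 16x16 window and the residual fractional part (-0.5..+0.5 px)
// that is applied as a phase rotation in the Transform Domain.
public class ShiftSplit {
    /** Returns {integerShift, fractionalShift}, fractional in [-0.5, 0.5). */
    public static double[] split(double shift) {
        long ishift = Math.round(shift);  // integer pixels: window selection
        double frac = shift - ishift;     // residual: TD phase rotation
        return new double[]{ishift, frac};
    }

    /**
     * Phase-rotation angle for frequency bin k of an N-point transform,
     * per the Fourier shift theorem (sign convention assumed here).
     */
    public static double phaseAngle(int k, int n, double frac) {
        return -2.0 * Math.PI * k * frac / n;
    }
}
```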
Pairwise tile correlation uses Phase Correlation: each TD component is normalized by its
amplitude (retaining only phase), so the inverse-transformed result is a near-delta-function
for noiseless data. Noise mitigation uses two mechanisms:
1. **Fat zero:** a positive constant added to the normalization denominator, trading between
pure phase correlation (sharp but noise-fragile) and classical correlation (noise-robust but
blunt). Controlled by `cuas_fat_zero` / `cuas_rng_fz`.
2. **TD pre-averaging:** for a linearly moving target, pairwise TD products with the same
temporal offset carry the same target phase shift. Accumulating N such pairs before
normalization averages down noise by √N while the signal remains coherent. This is done in
TD, before normalization — pairs like (0,8),(1,9),(2,10),…,(7,15) are summed, then
normalized and IDCT'd. `getTargetsFromCorr2d()` implements this.
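The two mechanisms combine as accumulate-then-normalize. The sketch below is a minimal per-bin illustration, not the pipeline's `TDCorrTile` code: the class and method names are hypothetical, and real MCLT data is stored differently. With `fatZero = 0` the result is pure phase correlation; a large `fatZero` approaches classical correlation.

```java
// Hedged sketch: accumulate pairwise TD products (conj(A)*B per bin) over N
// pairs, then normalize each bin by (amplitude + fatZero). Accumulating
// before normalization averages noise down by ~sqrt(N) while the coherent
// target phase survives.
public class TdAccumulate {
    /** Sum element-wise complex products conj(a)*b into acc, one pair at a time. */
    public static void accumulatePair(double[] accRe, double[] accIm,
                                      double[] aRe, double[] aIm,
                                      double[] bRe, double[] bIm) {
        for (int k = 0; k < accRe.length; k++) {
            accRe[k] += aRe[k] * bRe[k] + aIm[k] * bIm[k]; // Re(conj(a)*b)
            accIm[k] += aRe[k] * bIm[k] - aIm[k] * bRe[k]; // Im(conj(a)*b)
        }
    }

    /** Normalize each accumulated bin by its amplitude plus the fat zero. */
    public static double[] normalize(double[] accRe, double[] accIm, double fatZero) {
        double[] out = new double[2 * accRe.length]; // re,im interleaved
        for (int k = 0; k < accRe.length; k++) {
            double amp = Math.hypot(accRe[k], accIm[k]) + fatZero;
            out[2 * k]     = accRe[k] / amp;
            out[2 * k + 1] = accIm[k] / amp;
        }
        return out;
    }
}
```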
The current motion-vector scan integrates the **full 16×16 tile** for each pairwise product.
For a sub-pixel target occupying only ~2–4 px, this wastes most of the integration area on
background noise, limiting achievable SNR.
### Purpose of `refineMotionVectors()`
After the first (non-centered) pass gives approximate motion vectors (MV) and the
virtual-tracking-camera pass (`shiftAndRenderAccumulate()`) locates the target centroid
within each tile, we know where in each tile the target sits. The new method exploits this
knowledge to re-run the pairwise correlation with a spatial mask that zeros out pixels far
from the known target center. This:
- Reduces the effective noise integration area from 16×16 to a small window of radius ~r1
  (default ~6 px),
- Optionally increases the number of accumulated pairs (`recalc_mv_boost`, default 4×),
- Produces a refined MV estimate, which feeds back into the next centered-accumulation pass
for improved ranging and target localization.
The method is called (when `recalc_mv=true`) **before**
`CuasMotion.shiftAndRenderAccumulate()` at line 8203, replacing the motion vectors that
`extended_scan` / `targets_nonoverlap` carry.
`recalc_mv` is disabled for `slow_mode` (background suppression path) because the slow
path uses a different temporal pre-filter and the refinement is only meaningful for the fast
(target-motion) path.
### Parameters
```java
boolean recalc_mv = clt_parameters.imp.cuas_recalc_mv && !slow_mode;
// When true, re-correlate with spatial mask to refine MVs before centered accumulation.
double recalc_mv_boost = clt_parameters.imp.cuas_recalc_mv_boost; // default ~4.0
// Multiply the default number of correlation pairs used in the refinement step.
// More pairs → better SNR; practical ceiling is limited by sequence length.
double recalc_mv_r0 = clt_parameters.imp.cuas_recalc_mv_r0; // default ~2.0 (pixels)
// Raised-cosine mask inner radius. For r ≤ r0: weight = 1.0
double recalc_mv_r1 = clt_parameters.imp.cuas_recalc_mv_r1; // default ~6.0 (pixels)
// Raised-cosine mask outer radius. For r ≥ r1: weight = 0.0
// For r0 < r < r1: weight = 0.5 * (cos(π*(r−r0)/(r1−r0)) + 1)
```
The mask is centered on the known target fractional position within the tile (from the
prior centroid/LMA fit). Units are pixels (not tile indices).
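The raised-cosine weight defined by the comments above can be written as a standalone function. This is a sketch mirroring the formula in the parameter block (the class name is hypothetical; the implemented method inlines this computation rather than calling a helper):

```java
// Hedged sketch of the raised-cosine mask profile: full weight inside r0,
// zero beyond r1, half-cosine taper in between. r, r0, r1 are in pixels,
// with r measured from the known target centre within the tile.
public class RaisedCosineMask {
    public static double weight(double r, double r0, double r1) {
        if (r <= r0) return 1.0;
        if (r >= r1) return 0.0;
        return 0.5 * (Math.cos(Math.PI * (r - r0) / (r1 - r0)) + 1.0);
    }
}
```

With the defaults r0 = 2, r1 = 6, a pixel 4 px from the centre gets weight 0.5, the taper midpoint.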
### Pipeline position
```
[non-centered pass] → approximate MV in target_sequence_multi / extended_scan
[refineMotionVectors()] ← NEW (optional, recalc_mv=true, fast mode only)
apply spatial mask around target center
re-run TD pairwise accumulation with boosted pairs
return refined MV → overwrite extended_scan (or produce new vector_field)
[shiftAndRenderAccumulate()] ← existing, line 8203
virtual tracking camera, long exposure
[getAccumulatedCoordinates() / LMA]
```
### Open questions (to resolve before implementation)
1. **Where is the target center at this point in the code?** The centroid/LMA result lives
in `target_sequence_multi`. Which field indices (`RSLT_X`, `RSLT_Y`, etc.) hold the
fractional pixel offset within the tile for each keyframe?
2. **What does the mask multiply?** The source `fpixels_tum` float array, or the TD
tiles already computed by the GPU? If the GPU has already done the TD transform by this
point, we need to apply the mask in PD before re-loading, or apply it in TD (which is a
convolution — more complex). Is the mask applied in Java before calling the GPU, or does
it need a new GPU kernel path?
3. **Pair geometry for the boosted refinement:** `recalc_mv_boost` scales the number of
pairs. Does this mean reusing the same `cuas_corr_offset` with more `(frame0+d,
frame1+d)` pairs, or changing the offset too? Are there enough frames in the sequence
to support 4× more pairs without running past the sequence boundary?
4. **Return value:** Should `refineMotionVectors()` return a new `double[][][]` vector
field (same structure as `extended_scan`) with refined `RSLT_VX/VY`, or modify the
existing `target_sequence_multi` in-place?
5. **Tiles with no known target:** For tiles where the first pass found no valid target
(`target_sequence_multi[nseq][ntile] == null` or score below threshold), should
   refinement be skipped for that tile (keeping the original MV), or should the tile be
   dropped entirely (set to null)?
6. **New `IntersceneMatchParameters` fields:** Are `cuas_recalc_mv`,
`cuas_recalc_mv_boost`, `cuas_recalc_mv_r0`, `cuas_recalc_mv_r1` already added to
`IntersceneMatchParameters.java`, or do they need to be added as part of this work?
### Next TODO (priority order)
1. Performance pass: identify current bottlenecks and low-hanging optimizations.
- **Two-stage motion vector calculation:** Repeat correlation with a tight mask (R=4) around located targets to significantly decrease the noise integration area and eliminate motion blur.
......
......@@ -2272,6 +2272,201 @@ public class CuasMotion {
return targets;
}
// By Claude on 05/05/2026
/**
* Refine motion vectors using a raised-cosine spatial mask around each known target.
* Reduces noise integration area from the full 16x16 tile to a small circle of radius r1.
* Re-runs pairwise phase correlation on the masked source frames (boosted pair count) and
* adds the resulting differential displacement to RSLT_VX/VY in targets_nonoverlap in-place.
* Called between shiftAndRenderAccumulate() and getAccumulatedCoordinatesMulti() so that
* getAccumulatedCoordinatesMulti() uses the refined vectors for target localisation.
* Only acts on tiles where RSLT_X/Y is already set (i.e., a prior centred pass ran).
*/
public static void refineMotionVectors(
CLTParameters clt_parameters,
boolean batch_mode,
CuasMotion cuasMotion,
double recalc_mv_boost, // scale factor for number of correlation pairs (~4.0)
double recalc_mv_r0, // inner radius in pixels: weight = 1.0
double recalc_mv_r1, // outer radius in pixels: weight = 0.0
float [][] fpixels, // source frames [nframes][width*height]
double [][][] targets_nonoverlap, // [nseq][ntile][] in-place RSLT_VX/VY update
int frame0, // frame index for nseq=0
int corr_inc, // frame step between scan positions
int half_accum_range, // corr_pairs/2
boolean smooth,
int corr_offset, // inter-frame distance for correlation pairs
int debugLevel) {
final int tileSize = GPUTileProcessor.DTT_SIZE; // 8 pixels
final int tilesX = cuasMotion.tilesX;
final int width = cuasMotion.gpu_max_width;
final int height = cuasMotion.gpu_max_height;
final int nframes = fpixels.length;
final double fat_zero = clt_parameters.imp.cuas_fat_zero;
final double cent_radius = clt_parameters.imp.cuas_cent_radius;
final int n_recenter = clt_parameters.imp.cuas_n_recenter;
final double rstr = clt_parameters.imp.cuas_rstr;
// Boosted pair count, centred on frame_center
final int corr_pairs_ref = (int) Math.round(2 * half_accum_range * recalc_mv_boost);
// Hard-coded debug selectors: set >= 0 to enable per-scan/per-tile visualisation
final int dbg_nseq = -1;
final int dbg_tile = -1;
// Show the mask weight profile once
if ((dbg_nseq >= 0 || dbg_tile >= 0) && debugLevel >= 0) {
int r1i = (int) Math.ceil(recalc_mv_r1);
int msize = 2 * r1i + 1;
float[] mask_img = new float[msize * msize];
for (int dy = -r1i; dy <= r1i; dy++) {
for (int dx = -r1i; dx <= r1i; dx++) {
double r = Math.sqrt((double)(dx * dx + dy * dy));
double w;
if (r <= recalc_mv_r0) w = 1.0;
else if (r >= recalc_mv_r1) w = 0.0;
else w = 0.5 * (Math.cos(Math.PI * (r - recalc_mv_r0) / (recalc_mv_r1 - recalc_mv_r0)) + 1.0);
mask_img[(dy + r1i) * msize + (dx + r1i)] = (float) w;
}
}
new ImagePlus("refineMotionVectors-mask-r0_" + recalc_mv_r0 + "-r1_" + recalc_mv_r1,
new FloatProcessor(msize, msize, mask_img)).show();
}
final int r1i = (int) Math.ceil(recalc_mv_r1) + 1; // mask half-extent in pixels (+1 margin for fractional-centre rounding)
float[][] fpixels_masked = new float[nframes][]; // lazy-allocated per nseq
for (int nseq = 0; nseq < targets_nonoverlap.length; nseq++) {
// Skip scan positions where no target has a known centroid yet
boolean has_centered = false;
for (int ntile = 0; ntile < targets_nonoverlap[nseq].length; ntile++) {
double[] t = targets_nonoverlap[nseq][ntile];
if (t != null && !Double.isNaN(t[CuasMotionLMA.RSLT_X])) {
has_centered = true;
break;
}
}
if (!has_centered) continue;
int frame_center = frame0 + nseq * corr_inc;
// Centre the boosted correlation window on frame_center
int frame0_ref = frame_center - corr_pairs_ref / 2;
int frame1_ref = frame0_ref + corr_offset;
// Determine the frame range actually accessed by correlatePairs
// (pairs where frame0+dframe>=0 AND frame1+dframe<nframes)
int fmin_alloc = Math.max(0, frame0_ref);
int fmax_alloc = Math.min(nframes - 1, frame1_ref + corr_pairs_ref - 1);
if (fmin_alloc > fmax_alloc) continue; // no valid pairs in range
// Reset masked-frame array for this scan, allocate zero arrays for the window
for (int f = 0; f < nframes; f++) fpixels_masked[f] = null;
for (int f = fmin_alloc; f <= fmax_alloc; f++) {
fpixels_masked[f] = new float[width * height];
}
// Apply the raised-cosine mask around each target, tracking its motion
for (int ntile = 0; ntile < targets_nonoverlap[nseq].length; ntile++) {
double[] target = targets_nonoverlap[nseq][ntile];
if (target == null || Double.isNaN(target[CuasMotionLMA.RSLT_X])) continue;
int tileX = ntile % tilesX;
int tileY = ntile / tilesX;
// Target centre in image pixels at frame_center
double cx0 = tileX * tileSize + tileSize / 2.0 + target[CuasMotionLMA.RSLT_X];
double cy0 = tileY * tileSize + tileSize / 2.0 + target[CuasMotionLMA.RSLT_Y];
double vx = target[CuasMotionLMA.RSLT_VX]; // pixels per corr_offset frames
double vy = target[CuasMotionLMA.RSLT_VY];
for (int f = fmin_alloc; f <= fmax_alloc; f++) {
double offset_scale = (double)(f - frame_center) / corr_offset;
double cx = cx0 + vx * offset_scale;
double cy = cy0 + vy * offset_scale;
int icx = (int) Math.round(cx);
int icy = (int) Math.round(cy);
double dcx = cx - icx;
double dcy = cy - icy;
float[] src = fpixels[f];
float[] dst = fpixels_masked[f];
for (int dy = -r1i; dy <= r1i; dy++) {
int py = icy + dy;
if (py < 0 || py >= height) continue;
for (int dx = -r1i; dx <= r1i; dx++) {
int px = icx + dx;
if (px < 0 || px >= width) continue;
double r = Math.sqrt((dx - dcx) * (dx - dcx) + (dy - dcy) * (dy - dcy));
if (r >= recalc_mv_r1) continue;
double w = (r <= recalc_mv_r0) ? 1.0 :
0.5 * (Math.cos(Math.PI * (r - recalc_mv_r0) / (recalc_mv_r1 - recalc_mv_r0)) + 1.0);
dst[py * width + px] += (float)(w * src[py * width + px]);
}
}
}
}
// Debug: show the masked source frame at frame_center for this scan
if (nseq == dbg_nseq && debugLevel >= 0) {
int f_show = Math.max(fmin_alloc, Math.min(fmax_alloc, frame_center));
new ImagePlus("refineMotionVectors-masked-nseq" + nseq + "-f" + f_show,
new FloatProcessor(width, height, fpixels_masked[f_show].clone())).show();
// Also show the same frame unmasked for comparison
new ImagePlus("refineMotionVectors-source-nseq" + nseq + "-f" + f_show,
new FloatProcessor(width, height, fpixels[f_show].clone())).show();
}
TDCorrTile[] tdCorrTiles = cuasMotion.correlatePairs(
clt_parameters, // CLTParameters clt_parameters
fpixels_masked, // float [][] fpixels
frame0_ref, // int frame0
frame1_ref, // int frame1
corr_pairs_ref, // int frame_len
1, // int corr_ra_step (direct accumulation, no rolling avg)
smooth, // boolean smooth
batch_mode, // boolean batch_mode
null, // String dbg_suffix
debugLevel);
double scale_fat_zero = 1.0 / recalc_mv_boost;
double[][] corr_tiles_pd = cuasMotion.convertTDtoPD(
tdCorrTiles,
0xFE,
fat_zero * scale_fat_zero,
debugLevel);
double[][] vector_field = TDCorrTile.getMismatchVector(
corr_tiles_pd,
-rstr, // rmax < 0: no relative-strength filter (same convention as getTargetsFromCorr2d)
cent_radius,
n_recenter,
true);
// Debug: show 2D correlation tile for the debug tile
if (nseq == dbg_nseq && dbg_tile >= 0 && dbg_tile < corr_tiles_pd.length
&& corr_tiles_pd[dbg_tile] != null && debugLevel >= 0) {
int corr_side = 2 * GPUTileProcessor.DTT_SIZE - 1; // 15
float[] corr_img = new float[corr_side * corr_side];
for (int i = 0; i < corr_img.length; i++) corr_img[i] = (float) corr_tiles_pd[dbg_tile][i];
new ImagePlus("refineMotionVectors-corr2d-nseq" + nseq + "-tile" + dbg_tile,
new FloatProcessor(corr_side, corr_side, corr_img)).show();
}
// Add differential MV to targets_nonoverlap in-place
for (int ntile = 0; ntile < targets_nonoverlap[nseq].length; ntile++) {
double[] target = targets_nonoverlap[nseq][ntile];
if (target == null || Double.isNaN(target[CuasMotionLMA.RSLT_X])) continue;
if (vector_field[ntile] == null) continue;
if (ntile == dbg_tile && debugLevel >= 0) {
System.out.printf("refineMotionVectors(): nseq=%d ntile=%d before: VX=%.4f VY=%.4f delta: dVX=%.4f dVY=%.4f%n",
nseq, ntile,
target[CuasMotionLMA.RSLT_VX], target[CuasMotionLMA.RSLT_VY],
vector_field[ntile][INDX_VX], vector_field[ntile][INDX_VY]);
}
target[CuasMotionLMA.RSLT_VX] += vector_field[ntile][INDX_VX];
target[CuasMotionLMA.RSLT_VY] += vector_field[ntile][INDX_VY];
}
}
}
/**
* Convert Transform-domain 2D correlation to phase correlation,
......@@ -8089,6 +8284,11 @@ public class CuasMotion {
if (debugLevel > -4) {
System.out.println("\n========================== Starting centered iterations =============================.\n");
}
// By Claude on 05/05/2026 — optional MV refinement parameters for centered loop
boolean recalc_mv = clt_parameters.imp.cuas_recalc_mv && !slow_mode;
double recalc_mv_boost = clt_parameters.imp.cuas_recalc_mv_boost;
double recalc_mv_r0 = clt_parameters.imp.cuas_recalc_mv_r0;
double recalc_mv_r1 = clt_parameters.imp.cuas_recalc_mv_r1;
// second pass, using non-centered targets
int niter_lim = niter + num_cycles;
int iter_show1 = iter_show + niter;
......@@ -8243,7 +8443,24 @@ public class CuasMotion {
// targets_new will contain motion vectors, centroid, and LMA results combined
//save_filtered_high
// By Claude on 05/05/2026 — re-correlate with spatial mask around known target
if (recalc_mv) {
refineMotionVectors(
clt_parameters, // CLTParameters clt_parameters,
batch_mode, // boolean batch_mode,
cuasMotion, // CuasMotion cuasMotion,
recalc_mv_boost, // double recalc_mv_boost,
recalc_mv_r0, // double recalc_mv_r0,
recalc_mv_r1, // double recalc_mv_r1,
fpixels_tum, // float [][] fpixels,
targets_nonoverlap, // double [][][] targets_nonoverlap,
frame0, // int frame0,
corr_inc, // int corr_inc,
half_accum_range, // int half_accum_range,
smooth, // boolean smooth,
corr_offset, // int corr_offset,
debugLevel); // int debugLevel)
}
float [][] accum_debug = save_filtered_high? new float [num_corr_samples][]:null; //fpixels_accumulated.length]
boolean keep_failed = false; // keep failed targets
......