docs: Update roadmap with Rong-Rong meeting and FPGA student timeline

Commit cf0298e1 authored May 01, 2026 by Andrey Filippov

Showing with 9 additions and 2 deletions

docs/project-details.md +2 -2

src/main/java/com/elphel/imagej/cuas/CuasMotion.java +7 -0

--- a/docs/project-details.md
+++ b/docs/project-details.md
@@ -562,13 +562,13 @@ This section captures the latest validated state before pausing Global LMA work
   - **Adaptive Integration Windows:** Implement a competitive selection mechanism (e.g., 1x vs 4x integration) where the pipeline automatically selects the motion vector that yields the highest Contrast-to-Noise Ratio (CNR), rather than relying on hardcoded thresholds.
   - **Fat Zero Auto-Scaling:** Automatically scale `cuas_rng_fz` (fat zero) down proportionally to the length of the temporal integration (longer integrations lower the stochastic noise floor, allowing closer-to-pure phase correlation).
   - **Acceleration Compensation:** Refine the "Virtual Moving Camera" model to handle non-linear motion (like U-turns) that currently "squash" correlation peaks during long integrations.
-   - **Hybrid Classical/DNN Tracking (U of U FOPEN Collaboration):** Export unnormalized MCLT correlation hyperstacks to train an Attention/SSM-based neural network. The goal is to replace the LMA fitter with a DNN regression head `(X, Y, vX, vY)` that applies learned "soft masks" to 16x16 macroblocks, running inference via TorchScript/DJL inside the Java pipeline. See `03_UU_RongRong_Hybrid_DNN_Architecture.md` for details.
+   - **Hybrid Classical/DNN Tracking (U of U FOPEN Collaboration):** Export unnormalized MCLT correlation hyperstacks to train an Attention/SSM-based neural network. The goal is to replace the LMA fitter with a DNN regression head `(X, Y, vX, vY)` that applies learned "soft masks" to 16x16 macroblocks, running inference via TorchScript/DJL inside the Java pipeline. See `03_UU_RongRong_Hybrid_DNN_Architecture.md` for details. **Meeting scheduled with Rong-Rong team for Thursday, May 7th @ 2 PM at Elphel.**
   - *Note: These deeper algorithmic optimizations are intentionally deferred. The strategy is to establish a working baseline first, expose the necessary low-bandwidth tile metrics via the MCP server, and then allow AI agents (Codex, Claude, Gemini) to autonomously sweep, analyze, and optimize these specific sub-problems.*
 3. **FPGA / Hardware Teaming Roadmap (U of U Collaboration):**
   - **MCP for GTKWave:** Develop a Model Context Protocol (MCP) bridge to allow LLMs to natively analyze `.vcd` files. This will enable natural language querying of simulation waveform data (e.g., "Find the memory arbiter hang").
   - **Cocotb Integration:** Revive the Python-based simulation-to-hardware workflow. The goal is to ensure that testbenches used in Icarus Verilog remain perfectly valid through physical hardware testing and eventual C-code kernel driver development.
   - **GPU Top-Level Dispatcher (Human Latency Reduction):** Investigate moving the GPU pipeline orchestration from Java (JCuda sequential calls) into a single C++ "Master Dispatcher" kernel. By hollowing out the Java loops and decision logic and placing them into a single `.cu` file that calls mathematical modules as `__device__` functions, we eliminate the need to duplicate scheduling code across C++ and Java. This ensures that the production ImageJ environment uses the exact same orchestration logic as the development/Nsight environment, reducing human effort and convergence-translation errors.
-   - **Agent-Assisted Onboarding:** Leverage agents to bridge the gap for "occasional" users (like graduate students) by guiding them through the specialized hardware/Verilog knowledge base.
+   - **Agent-Assisted Onboarding:** Leverage agents to bridge the gap for "occasional" users (like graduate students) by guiding them through the specialized hardware/Verilog knowledge base. **Collaboration with U of U FPGA students starts after exams end in May.**
 4. Batch replay of this quarter+global stage on previously processed data; classify failures and choose representative/challenging short test sequences.
 5. Algorithm improvement for occlusion handling:
   - Predict likely-occluded tiles from depth/disparity behavior.

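The "Adaptive Integration Windows" and "Fat Zero Auto-Scaling" roadmap items in the hunk above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical and do not exist in `CuasMotion`, the fat zero is assumed to scale inversely with the number of integrated frames, and each candidate integration length is assumed to have already produced one CNR value.

```java
// Hypothetical sketch; none of these names exist in the imagej-elphel code base.
final class IntegrationSketch {
    private IntegrationSketch() {}

    /**
     * Assumed rule: fat zero shrinks in inverse proportion to the number of
     * temporally integrated frames (baseFatZero applies to a 1x integration).
     */
    static double scaledFatZero(double baseFatZero, int integrationLength) {
        if (integrationLength < 1) {
            throw new IllegalArgumentException("integrationLength must be >= 1");
        }
        return baseFatZero / integrationLength;
    }

    /**
     * Competitive selection: returns the index of the candidate with the
     * highest CNR, skipping NaN entries. Returns -1 if no candidate is usable.
     */
    static int selectByCnr(double[] cnr) {
        int best = -1;
        for (int i = 0; i < cnr.length; i++) {
            if (Double.isNaN(cnr[i])) {
                continue; // a failed candidate never wins
            }
            if (best < 0 || cnr[i] > cnr[best]) {
                best = i;
            }
        }
        return best;
    }
}
```

With candidates ordered as {1x, 4x}, `selectByCnr` picks whichever integration gave the stronger peak, and `scaledFatZero` would supply the fat zero used when computing each candidate.
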
--- a/src/main/java/com/elphel/imagej/cuas/CuasMotion.java
+++ b/src/main/java/com/elphel/imagej/cuas/CuasMotion.java
@@ -8618,6 +8618,13 @@ public class CuasMotion {
 													lma_rslts[CuasMotionLMA.RSLT_FAIL] = CuasMotionLMA.FAIL_FAR;
 													break try_failures; // below horizon line
 												}
+												// see if it is completely outside the tile (offset of DTT_SIZE-1 or more)
+												if ((Math.abs(x) >= (GPUTileProcessor.DTT_SIZE-1)) ||
+														(Math.abs(y) >= (GPUTileProcessor.DTT_SIZE-1))) {
+													lma_rslts[CuasMotionLMA.RSLT_FAIL] = CuasMotionLMA.FAIL_FAR;
+													break try_failures; // completely outside
+												}
 											}
 											failed = false; // all tests passed
 										}
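
The check added in this hunk can be read in isolation as the predicate below. It is a minimal sketch, not the code from `CuasMotion`: the class name `TileBoundsSketch` and the method `isOutsideTile` are hypothetical, `x` and `y` are assumed to be `double` offsets, and `DTT_SIZE` is assumed to be 8 here (the real value comes from `GPUTileProcessor.DTT_SIZE`).

```java
// Hypothetical stand-alone version of the bounds test added in the hunk above.
final class TileBoundsSketch {
    // Assumption: stands in for GPUTileProcessor.DTT_SIZE.
    static final int DTT_SIZE = 8;

    private TileBoundsSketch() {}

    /**
     * True when either offset magnitude reaches DTT_SIZE - 1, matching the
     * condition in the diff. NaN offsets compare false and so are NOT
     * reported as outside by this test.
     */
    static boolean isOutsideTile(double x, double y) {
        final double limit = DTT_SIZE - 1;
        return (Math.abs(x) >= limit) || (Math.abs(y) >= limit);
    }
}
```

Under these assumptions an offset of exactly 7.0 on either axis is rejected, while (6.99, -6.99) still passes. Because the comparison is false for NaN, a caller that can produce NaN offsets would need a separate check.
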