Commit 6da5747a authored by Andrey Filippov

Moved project docs to the git-controlled directory

parent 20be6135
# Project Details (Draft)
## Overview
This document is a structured, living description of the data layout and processing pipeline for the current active parts of the `imagej-elphel` project. It consolidates earlier notes and the latest additions into a coherent reference.
## Data Layout and List Files
The program starts with a list file that defines the root and source directories plus scene sequences. Example list entry:
```
1763232231_345331 1, -2, -1, 0, 0 # NC32-LY 13:43:51 #nc34a
```
The first token is a scene-sequence timestamp (directory name). The next two numbers define the scene index range to process; negative values count from the end of the sequence. For example, with 499 subfolders (0..498), `1, -2` means process scenes 1..497.
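The negative-index convention can be sketched as follows (class and method names here are illustrative, not from the imagej-elphel sources):

```java
// Sketch of the list-file range convention: negative indices count from the
// end of the sequence (-1 is the last scene, -2 the one before it).
// Class and method names are illustrative, not from the actual codebase.
public class SceneRange {
    /** Resolve a possibly-negative scene index against numScenes subfolders (0..numScenes-1). */
    public static int resolve(int index, int numScenes) {
        return (index >= 0) ? index : numScenes + index;
    }

    /** Returns {firstScene, lastScene}, inclusive. */
    public static int[] resolveRange(int first, int last, int numScenes) {
        return new int[] { resolve(first, numScenes), resolve(last, numScenes) };
    }

    public static void main(String[] args) {
        int[] r = resolveRange(1, -2, 499); // 499 subfolders: 0..498
        System.out.println(r[0] + ".." + r[1]); // prints "1..497"
    }
}
```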
Each subdirectory in the sequence is a "scene" (a set of 16 simultaneously captured LWIR images). Sensors are arranged on a 220 mm diameter circle, evenly spaced at 360/16 degrees. Sensor 0 is at the top; 1..15 are clockwise when viewed from behind the camera (same direction as the camera looks). Sensors 16..19 are RGB cameras and are not used in the current pipeline.
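The sensor geometry described above can be sketched numerically. The coordinate convention (X right, Y up, as seen from behind the camera) is an assumption for illustration:

```java
// Sketch of the stated sensor layout: 16 LWIR sensors on a 220 mm diameter
// circle, sensor 0 at the top, 1..15 clockwise as seen from behind the camera.
// The X-right / Y-up convention is an assumption, not from the codebase.
public class SensorLayout {
    static final int NUM_SENSORS = 16;
    static final double DIAMETER_MM = 220.0;

    /** Returns {x, y} of a sensor center in mm, viewed from behind the camera. */
    public static double[] sensorXY(int sensor) {
        double r = DIAMETER_MM / 2.0;
        double a = 2.0 * Math.PI * sensor / NUM_SENSORS; // clockwise from top, 360/16 deg steps
        return new double[] { r * Math.sin(a), r * Math.cos(a) };
    }
}
```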
The raw LWIR data under:
```
/media/elphel/btrfs-data/lwir16-proc/NC/scenes/
```
is copied once into model directories (as `jp4` subfolders). Example:
```
/media/elphel/btrfs-data/lwir16-proc/NC/scenes/1763232231_345331/1763232231_362003
```
is copied to:
```
/media/elphel/btrfs-data/lwir16-proc/NC/models/1763232117-1763234145/1763232231_362003/jp4
```
IMU data for the scene is stored as:
```
/media/elphel/btrfs-data/lwir16-proc/NC/models/1763232117-1763234145/1763232231_362003/1763232231_362003-ims.corr-xml
```
The IMU source path currently comes from `CORRECTION_PARAMETERS.getImsSourceDirectory()` (edited in "Setup CLT Batch parameters") and is not yet exposed in `*.list` files.
Scene sequences are captured at ~60 Hz for ~8 seconds, followed by ~3 seconds pause due to bandwidth constraints.
## Model Directories, Versions, and Linked Views
The parent directory for processed scenes is large (the latest set contains ~85K model directories). Most contain only `jp4` and IMS data. Some contain additional results and are considered special:
- Reference scenes with computed data for their segment.
- Index scenes (latest timestamp per sequence) that include `*-ground_planes.csv`. The first column contains timestamps of reference scenes in the segment.
Processing results are versioned. The version directory is defined by `EyesisCorrectionParameters.x3dModelVersion` (configured in “Setup CLT Batch parameters”). A single model directory may contain multiple versions.
For manual browsing, the program creates a directory with symlinks to special model directories. Example:
```
/media/elphel/btrfs-data/lwir16-proc/NC/models/1763232117-1763234145
```
## Core Processing Pipeline (Segment-Based)
Scene sequences are processed separately in a loop such as:
```
for (int nseq = 0; nseq < num_seq; nseq++) {
    long start_time_seq = System.nanoTime();
    System.out.println("\nSTARTED PROCESSING SCENE SEQUENCE " + nseq + " (last is " + (num_seq - 1) + ")\n");
    ...
}
```
Segments are defined around a reference scene, with overlap and matching constraints. Scenes are processed in reverse order (latest first) for historical reasons.
### Segment Bounds
The program finds how many preceding scenes can be included in the segment, based on overlap and the number of good matching tiles. Scenes can be "partially valid" if overlap is sufficient for translation but not for zoom or rotation fitting.
In the current application (Cessna, 25–150 m AGL, 30–40 m/s), the reference scene is near the center of the segment. Earliest and latest scenes may not overlap with each other but must overlap with the reference. The segment is continuous with no gaps. The earliest scene is determined by overlap with the reference scene (`EstimateSceneRange.getPrescanRange()`).
### Initial Depth Map (Intrascene)
Depth is computed per tile (80x64 tiles corresponding to 640x512 images) in `OpticalFlow.buildRefDSI()`. This uses intrascene correlation (parallax across sensors) and GPU/CUDA kernels under:
```
/home/elphel/git/imagej-elphel/src/main/resources/kernels
```
Java code uses JCuda (`/home/elphel/git/imagej-elphel/src/main/java/com/elphel/imagej/gpu`). Kernels are debugged via Nvidia NSight using a C++ testbench. The external CUDA repo is:
```
/home/elphel/git/tile_processor_gpu/
https://git.elphel.com/Elphel/tile_processor_gpu/tree/lwir16_2
```
Because baselines are narrow, keypoints are not used. Instead, the pipeline uses 2D phase correlation between 16x16 tiles. There are 120 sensor pairs. Pairwise results are combined by:
- Baseline-rotated/scaled correlations.
- LMA fit to all 120 pairs (more accurate, more expensive).
- DNN approach (from the arXiv paper, not yet implemented for the current camera).
Argmax accuracy is improved by an iterative approach. After each disparity estimate, a fractional shift is applied in the frequency domain and correlation is repeated. Residual disparity converges to zero without assuming correlation shape. The phase-correlation step uses a "fat zero" (small constant in the denominator) to balance noise sensitivity.
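Both ideas can be illustrated with a minimal sketch: fat-zero normalization of a single cross-spectrum bin, and the iterative argmax loop that repeatedly applies the current fractional shift and re-correlates. All names are illustrative; the real pipeline operates on 2-D CLT tiles on the GPU:

```java
import java.util.function.DoubleUnaryOperator;

/** Illustrative sketch only; the real pipeline works on 2-D CLT tiles on the GPU. */
public class DisparityRefinement {
    /** Phase-normalize one complex cross-spectrum bin (re, im) with fat zero fz:
     *  fz = 0 gives pure phase correlation (sharp peak, noise-sensitive);
     *  larger fz keeps some magnitude weighting (blunter peak, more robust). */
    public static double[] phaseNormalize(double re, double im, double fz) {
        double d = Math.sqrt(re * re + im * im) + fz;
        return new double[] { re / d, im / d };
    }

    /** Iterative argmax refinement: apply the current fractional shift,
     *  re-correlate, and accumulate the residual until it converges to ~0.
     *  correlateResidual is a stand-in for the real shift+correlate step. */
    public static double refineDisparity(DoubleUnaryOperator correlateResidual,
                                         double initial, double eps, int maxIter) {
        double disparity = initial;
        for (int i = 0; i < maxIter; i++) {
            double residual = correlateResidual.applyAsDouble(disparity);
            disparity += residual;
            if (Math.abs(residual) < eps) break;
        }
        return disparity;
    }
}
```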
The initial disparity map is saved as:
```
.../v89/<timestamp>-DSI_MAIN.tiff
```
### Pose Estimation (Interscene)
The pipeline estimates per-scene pose relative to the reference scene. Positions are stored in `*-egomotion.csv` columns 3–5 and orientation in columns 6–8. IMS provides hints. Matching is done with interscene correlation on per-tile motion vectors.
Workflow:
- Build initial disparity map.
- Match consecutive scenes to estimate poses.
- Refine disparity map with interscene accumulation.
- Repeat with more LMA parameters enabled.
Outputs include 3D scene models, DNN training data, and auxiliary files. State is persisted in:
- `*-DSI_MAIN.tiff`
- `*-INTER-INTRA-LMA.tiff` (refined disparity, overwritten per iteration)
- `*-INTERFRAME.corr-xml` (poses, overwritten per iteration)
Iteration counters:
- `EYESIS_DCT_AUX.num_orient`
- `EYESIS_DCT_AUX.num_accum`
Used in loops like:
```
while ((master_CLT.getNumOrient() < min_num_orient) || (master_CLT.getNumAccum() < min_num_interscene))
```
### ERS and Motion Blur
Electronic Rolling Shutter (ERS) is handled during pose fitting. Each step computes per-tile motion vectors, then uses LMA to fit pose and optional velocities. Since motion blur can exceed 5 px, images are pre-aligned using IMS predictions. When IMS is unavailable, a spiral search is used. Matching starts from scenes closest to the reference, then expands outward.
#### GPU-side motion-blur handling and TD correlation flow
Current interscene matching for a segment uses a GPU-centric transform-domain (TD) pipeline:
1) The reference scene (all 16 LWIR images) is transferred to GPU memory and converted to TD once.
2) Factory-calibration aberration correction is applied in TD by convolution with the calibration kernels.
3) Motion-blur compensation is applied in the TD workflow for both the reference and matched scenes:
- microbolometer blur is modeled as a first-order LPF in time;
- correction subtracts a scaled copy of the same image shifted by motion vectors (tile-granular approximation);
- task definitions are packed in Java (`TpTask`) and passed as float task arrays to GPU kernels.
For each matched scene, the 16 images are TD-transformed and geometrically aligned to the reference (tile by tile), using integer tile placement plus fractional subpixel shift through TD phase rotation. Pairwise products with cached reference TD are computed per sensor pair, then all 16 channels are accumulated in TD before phase-only normalization. A small "fat zero" term in normalization controls SNR vs peak sharpness tradeoff. After inverse transform, the peak argmax gives tile motion vectors (`mvX`, `mvY`), and peak magnitude is used as sample strength/weight for subsequent LMA.
This "accumulate in TD before normalization" step is important for low-contrast LWIR data because it improves robustness relative to per-channel independent normalization.
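A minimal sketch of the accumulate-then-normalize order for one transform-domain bin (names are illustrative; the real code runs in CUDA kernels on CLT tiles):

```java
// Sketch of accumulate-then-normalize for one TD bin: pairwise cross-spectra
// are summed over all sensor pairs first, and the fat-zero normalization is
// applied once to the sum. Weak per-pair signals reinforce each other, while
// uncorrelated noise partially cancels before normalization.
public class TdAccumulate {
    /** pairsRe/pairsIm: per-pair cross-spectrum values for one TD bin. */
    public static double[] accumulateAndNormalize(double[] pairsRe, double[] pairsIm, double fatZero) {
        double sumRe = 0, sumIm = 0;
        for (int p = 0; p < pairsRe.length; p++) { // accumulate BEFORE normalizing
            sumRe += pairsRe[p];
            sumIm += pairsIm[p];
        }
        double mag = Math.sqrt(sumRe * sumRe + sumIm * sumIm) + fatZero;
        return new double[] { sumRe / mag, sumIm / mag };
    }
}
```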
### FPN Mitigation
Uncooled LWIR sensors have fixed-pattern noise (FPN). If interscene pixel shifts are too small, correlation may lock to FPN. Scenes with expected shifts below thresholds (`IntersceneMatchParameters.fpn_min_offset`, `min_offset`) are skipped and added to `fpn_list`. Later, each skipped scene is matched to a farther scene using `Interscene.getFPNPairs`, ensuring pixel shifts exceed tile size and avoid FPN dominance.
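The guard can be sketched as a simple threshold pass (names here are illustrative stand-ins for `IntersceneMatchParameters.fpn_min_offset` and the deferred-pairing step in `Interscene.getFPNPairs`):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the FPN guard: scenes whose predicted interscene pixel shift is
// below a threshold are deferred (added to an fpn_list) and later paired with
// a farther scene so the shift exceeds tile size. Names are illustrative.
public class FpnGuard {
    /** Returns indices of scenes whose expected shift (px) is below minOffset. */
    public static List<Integer> collectFpnList(double[] expectedShiftPx, double minOffset) {
        List<Integer> fpnList = new ArrayList<>();
        for (int i = 0; i < expectedShiftPx.length; i++) {
            if (expectedShiftPx[i] < minOffset) fpnList.add(i); // too close: risk of FPN lock
        }
        return fpnList;
    }
}
```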
## Differential Image Rectification (TP/CLT)
TPNET uses "differential image rectification" (US10638109B2). Instead of full rectification to a pinhole model, the pipeline computes a virtual average camera and rectifies only the small differential distortions of each sensor relative to that average. Local shifts are typically <2 pixels.
CLT (Complex Lapped Transform) is used instead of FFT. CLT has perfect reconstruction and allows lossless fractional pixel shifts via phase rotations. This enables sub-pixel disparities (~0.01–0.05 px) without upsampling or resampling. Multiple operations (aberration correction and phase correlation) are combined in the transform domain for GPU efficiency.
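The lossless fractional shift can be illustrated with a plain 1-D DFT (the actual pipeline uses CLT, but the phase-rotation principle is the same): shifting by `dx` pixels multiplies frequency bin `k` by `exp(-i*2*pi*k*dx/N)`.

```java
// 1-D DFT illustration of a fractional shift as a pure phase rotation.
// The real pipeline applies the equivalent rotation to CLT coefficients.
public class PhaseShift {
    /** Rotate the phase of complex spectrum (re, im) in place to shift by dx pixels. */
    public static void shiftSpectrum(double[] re, double[] im, double dx) {
        int n = re.length;
        for (int k = 0; k < n; k++) {
            int kk = (k <= n / 2) ? k : k - n; // signed frequency
            double phi = -2.0 * Math.PI * kk * dx / n;
            double c = Math.cos(phi), s = Math.sin(phi);
            double r = re[k] * c - im[k] * s;
            im[k]    = re[k] * s + im[k] * c;
            re[k] = r;
        }
    }
}
```

Because the shift is a unitary phase rotation, shifting by `dx` and then `-dx` restores the spectrum exactly; no resampling loss is introduced.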
## Down-View Aerial Applications (SfM / SLAM)
Downward-looking aerial views reduce depth variation, which makes separating translation from rotation harder. Lateral motion and azimuth rotation both appear as X shifts; vertical motion and tilt appear as Y shifts. Traditional intrascene disparity becomes insufficient for high altitudes, so the workflow was adjusted.
### Multicopter (Mine Detection) workflow
1) Initial intrascene depth map
2) Initial segment pose estimation
3) Depth refinement via interscene accumulation
4) Pose refinement with more LMA parameters
5) SfM refinement, repeated multiple times
6) Calibration (IMS vs camera frame)
7) Outputs
### Foliage Penetration: SLAM approach
At high altitudes, pure intrascene depth fails for long baselines. The pipeline uses SfM simultaneously with pose estimation in `EstimateSceneRange.scanSfmIMS()`. Disparity is refined as scenes get farther from the reference, with weights proportional to baseline distance. Short baselines skip SfM (`min_sfm_meters`). Long baselines may filter low-confidence tiles (`sfma_filt_meters`). Background tiles are preferred when FG/BG confusion occurs.
### Current Limitations
- Disparity offset compensation is still incomplete. In navigation, far objects allowed treating minimal disparity as infinity; for down-view this is not possible. Current correction attempts use:
- `OpticalFlow.getImgImsScale()`
- `OpticalFlow.getImsDisparityCorrection()`
- `QuadCLT.offsetDSI()`
- `QuadCLT.setDispInfinityRef()`
- `OpticalFlow.scaleImgXYZ()`
This fails in some low-altitude forest cases. The correction should be per-sequence, not per-segment, for better accuracy.
- IMS/camera orientation offset (`Interscene.getQuaternionCorrection()`) is disabled and should also become per-sequence. `QuaternionLma` should be restored to use linear motion plus rotations. Angular samples should be weighted (those closer to the reference are more accurate) and time consistency ensured. Angular data is in `*-egomotion.csv` columns 6–8.
## Latest Additions
### Segment freezing with `keep_segments`
Index scenes (`*-index`) contain `*-INTERFRAME.corr-xml` with keys like:
```
<entry key="EYESIS_DCT_AUX.refscenes_<timestamp>"></entry>
```
The new parameter:
```
IntersceneMatchParameters.keep_segments = true
```
If true, and refscenes exist in the index scene’s `*-INTERFRAME.corr-xml`, those reference scenes are reused instead of recomputing segment splits. This preserves segment definitions when re-running with new parameters or calibrations.
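A sketch of the reuse path, assuming the `*-corr-xml` files are `java.util.Properties` XML (consistent with the `<entry key=...>` example above); the helper name is illustrative:

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Sketch of the keep_segments reuse path: if the index scene's
// *-INTERFRAME.corr-xml already lists reference scenes, reuse them instead of
// recomputing segment splits. Assumes the file is java.util.Properties XML;
// the class/method names are illustrative, not from the codebase.
public class KeepSegments {
    static final String REFSCENES_PREFIX = "EYESIS_DCT_AUX.refscenes_";

    /** Returns reference-scene timestamps found in a corr-xml, or an empty list. */
    public static List<String> loadRefScenes(String corrXmlPath) throws IOException {
        Properties props = new Properties();
        try (FileInputStream in = new FileInputStream(corrXmlPath)) {
            props.loadFromXML(in);
        }
        List<String> refScenes = new ArrayList<>();
        for (String key : props.stringPropertyNames()) {
            if (key.startsWith(REFSCENES_PREFIX)) {
                refScenes.add(key.substring(REFSCENES_PREFIX.length()));
            }
        }
        return refScenes;
    }
}
```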
### Elevation histograms for quick classification
Files like:
```
.../v88/<timestamp>-elevations_histogram.csv
```
contain cumulative histograms of tile elevations relative to a fitted ground plane. Columns:
- `AGL`: single value in row 2, altitude above ground.
- `ELEVATION`: elevation value for histogram bin.
- `FRACTION_LOWER`: fraction of tiles below that elevation.
Parameters in `IntersceneMatchParameters`:
```
fgnd_hist_gen = true
fgnd_hist_suffix = "-elevations_histogram.csv"
fgnd_hist_frac = 0.03
fgnd_hist_bins = 100
```
`fgnd_hist_frac` defines outlier trimming; min/max elevation are computed from the remaining tiles. These histograms enable fast, low-bandwidth MCP classification by reading small CSVs instead of large images. Example classification features:
- AGL per segment
- "Forestation ratio" (fraction of tiles above a height threshold, e.g. 1 m)
- Maximum elevation at a given FRACTION_LOWER (e.g. 0.9)
Suggested aggregation: for each scene sequence, compute average/min/max AGL and forestation ratio across reference scenes.
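The features above can be computed directly from the cumulative `(ELEVATION, FRACTION_LOWER)` columns; a sketch follows (linear interpolation between bins is an assumption, and the names are illustrative):

```java
// Sketch of classification features from the cumulative elevation histogram:
// forestation ratio = 1 - FRACTION_LOWER at a height threshold, and the
// elevation at a given FRACTION_LOWER by inverse lookup. Linear interpolation
// between bins is an assumption for illustration.
public class ElevationFeatures {
    /** Fraction of tiles above a height threshold ("forestation ratio"). */
    public static double forestationRatio(double[] elevation, double[] fractionLower, double thresholdM) {
        return 1.0 - interpolate(elevation, fractionLower, thresholdM);
    }

    /** Elevation at a given FRACTION_LOWER (e.g. 0.9), by inverse interpolation. */
    public static double elevationAtFraction(double[] elevation, double[] fractionLower, double frac) {
        return interpolate(fractionLower, elevation, frac);
    }

    private static double interpolate(double[] x, double[] y, double xv) {
        if (xv <= x[0]) return y[0];
        for (int i = 1; i < x.length; i++) {
            if (xv <= x[i]) {
                double t = (xv - x[i - 1]) / (x[i] - x[i - 1]);
                return y[i - 1] + t * (y[i] - y[i - 1]);
            }
        }
        return y[y.length - 1];
    }
}
```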
### Global LMA objective and correlation modes (implementation notes)
The global LMA stage is intended to solve one joint optimization problem, not separate ones:
- Correlation residuals for all selected scene/reference pairs (currently scene-to-center, later also 1/4 and 3/4 references).
- Inter-scene LPF/curvature penalties.
- Optional pull-to-target penalties.
The practical requirement is to keep these terms explicit and inspectable in one place in the code:
- `prepareLMA()` should build one combined `y_vector` and one combined `w_vector`.
- Correlation rows should be concatenated first, then LPF rows, then pull rows.
- `getFxDerivs()` should contain visible blocks for each term group in that same order.
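A minimal sketch of that concatenation contract (all names are illustrative stand-ins for the real `prepareLMA()` internals):

```java
// Sketch of the combined-vector layout: correlation rows first, then LPF rows,
// then pull rows, identically ordered in y_vector and w_vector, so that
// getFxDerivs() can address each term group by a fixed offset.
public class GlobalLmaVectors {
    public final double[] y;    // combined measurement vector
    public final double[] w;    // combined weight vector
    public final int lpfOffset;  // index where LPF rows start
    public final int pullOffset; // index where pull rows start

    public GlobalLmaVectors(double[] yCorr, double[] wCorr,
                            double[] yLpf,  double[] wLpf,
                            double[] yPull, double[] wPull) {
        lpfOffset  = yCorr.length;
        pullOffset = lpfOffset + yLpf.length;
        y = concat(yCorr, yLpf, yPull);
        w = concat(wCorr, wLpf, wPull);
    }

    private static double[] concat(double[]... parts) {
        int n = 0;
        for (double[] p : parts) n += p.length;
        double[] out = new double[n];
        int pos = 0;
        for (double[] p : parts) {
            System.arraycopy(p, 0, out, pos, p.length);
            pos += p.length;
        }
        return out;
    }
}
```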
#### Correlation residual representation
The single-scene pair LMA supports two measurement modes:
1. Direct XY mode:
- Two residual components per tile are used as-is (`dx`, `dy`).
2. Eigenvector mode (preferred in current data):
- Correlation peak shape (ellipse) is used to weight directional certainty.
- Residuals and derivatives are transformed into eigen directions.
- Direction perpendicular to linear texture carries most information; along-line direction is de-emphasized.
Current development/testing mostly uses mode 2 (eigenvector mode), but mode 1 should remain supported and periodically checked to avoid bit-rot.
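The eigen-direction transform of mode 2 can be sketched for one tile. A symmetric 2x2 peak-shape matrix `[[a, b], [b, c]]` is assumed, and the names are illustrative:

```java
// Sketch of the eigenvector measurement mode: the residual (dx, dy) is
// rotated into the principal axes of the correlation-peak ellipse, so the
// across-texture component can be weighted more than the along-texture one.
// Assumes a symmetric 2x2 shape matrix [[a, b], [b, c]]; names illustrative.
public class EigenResidual {
    /** Returns {r_major, r_minor}: residual components along the eigen
     *  directions of [[a, b], [b, c]]. */
    public static double[] toEigenDirections(double dx, double dy, double a, double b, double c) {
        double theta = 0.5 * Math.atan2(2.0 * b, a - c); // rotation to principal axes
        double cs = Math.cos(theta), sn = Math.sin(theta);
        return new double[] { cs * dx + sn * dy, -sn * dx + cs * dy };
    }
}
```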
#### Why this structure matters
Keeping all residual segments in one explicit Jacobian path makes it easier to:
- Verify convergence term-by-term.
- Diagnose whether LPF/pull are dominating correlation, or vice versa.
- Add new residual segments incrementally without losing traceability of `y - f(x)` composition.
## Parked State Snapshot (2026-02-25)
This section captures the latest validated state before pausing Global LMA work for C-UAS field-test preparation.
### What is validated now
- Quarter references are selected and used in Global LMA flow.
- Quarter `-INTER-INTRA-LMA.tiff` prerequisites are generated before Global LMA when missing.
- Force regeneration path is working (`force quarter regeneration = true`).
- SfM-only quarter generation path is running and bypassing pose/LMA adjustment in the quarter build stage.
- Regenerated quarter files looked reasonable on manual visual inspection.
- Full run continues into Global LMA and finishes normally.
### Last known-good config for quarter-force + sfm-only
- `/media/elphel/btrfs-data/lwir16-proc/NC/config/MI250-v89-nc_site_37A-FORCE-SFM-XYAT-lpf_XYZ100-ATR3-freeze-RMS1e-4-ratio100-quat4-tw0.0005-cw1.0-noperseries-regularization0.8-transl-rot-nocombo-no_out-preseries_only.corr-xml`
### Key files from the validated run
- Quarter 1:
`/media/elphel/btrfs-data/lwir16-proc/NC/models/1763232117-1763234145/1763233239_481130/v89/1763233239_481130-INTER-INTRA-LMA.tiff`
- Quarter 3:
`/media/elphel/btrfs-data/lwir16-proc/NC/models/1763232117-1763234145/1763233241_531813/v89/1763233241_531813-INTER-INTRA-LMA.tiff`
- Baseline copy for comparison:
`/media/elphel/btrfs-data/lwir16-proc/NC/models/1763232117-1763234145/1763233240_531480/debug01/46`
### Current run pattern (stable for automation)
1. Restore config.
2. Optional parameter changes in `Setup CLT` (`Inter-Global-LMA` section).
3. Save config revision.
4. Run `Aux Build Series`.
5. Inspect console + CSV/TIFF outputs; archive debug artifacts as needed.
### Next TODO (priority order)
1. Performance pass: identify current bottlenecks and low-hanging optimizations.
2. Batch replay of this quarter+global stage on previously processed data; classify failures and choose representative/challenging short test sequences.
3. Algorithm improvement for occlusion handling:
- Predict likely-occluded tiles from depth/disparity behavior.
- Conditionally zero tile weights even when correlation strength is high, if non-occluded data is sufficient.
- Keep enough constraints for small-overlap scenes where joint XYZATR fitting can become underconstrained.
4. Add global-LMA-style disparity/depth refinement stage:
- Start from center + quarter maps.
- Reuse pairwise correlations and include occlusion-aware weighting.
- Revisit FG/BG disparity slices in `-INTER-INTRA-LMA.tiff` (currently mostly pass-through/copy).
5. Deeper threading/performance tuning after correctness is stable.
# Project Summary (imagej-elphel)
## Context
- Java ImageJ 1.x plugin evolved over ~10 years; code is large and entangled due to shifting applications.
- Original target: vehicle navigation in complete darkness (DARPA “Invisible Headlights”).
- Later applications: drone‑based mine detection, long‑range drone detection, and most recently foliage penetration from a Cessna.
- Hardware is now stable: 16‑sensor LWIR camera; processing runs in batch mode with list files.
## Current workflow
- Main batch entry path: `Eyesis_Correction.runMenuCommand()` → `buildSeries()` → `TwoQuadCLT.buildSeriesTQ()` → `OpticalFlow.buildSeries()`.
- Batch inputs are `.list` files such as:
`/media/elphel/btrfs-data/lwir16-proc/NC/lists/nc_site_32A.list`
Example line:
`1763232778_911052 1, -2, -1, 0, 0` (start at scene 1, skip last 2; other params unused here)
- Scenes live under:
`/media/elphel/btrfs-data/lwir16-proc/NC/scenes/` (timestamp folders; “_” used instead of “.”)
## Output data
- Batch processing produces many per‑reference outputs (images, XML, CSV) per 8‑second sequence.
- Calibration files of interest:
`*-FIELD_CALIBRATION.corr-xml`
Example:
`/home/elphel/lwir16-proc/NC/models/1763232117-1763234145/1763232535_880101/1763232535_880101-FIELD_CALIBRATION.corr-xml`
- These contain per‑sensor azimuth/tilt/roll corrections (radians), e.g.:
`EYESIS_DCT_AUX.extrinsic_corr_azimuth9`, `EYESIS_DCT_AUX.extrinsic_corr_roll8`, etc.
## Immediate goals
1) Scan data directories for anomalies (missing files, bad CSVs, etc.).
2) Aggregate calibration data across sequences (LY “lazy eye”):
- Extract azimuth/tilt/roll per sensor from `*-FIELD_CALIBRATION.corr-xml`.
- Build CSV for Calc to visualize sensor drift (notably sensors 3 and 8).
3) Prepare data for university DNN training and scene classification.
4) Generate map screenshots from GPS data (later; network/GUI dependent).
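Goal (2) can be sketched as follows, assuming the `*-FIELD_CALIBRATION.corr-xml` files are `java.util.Properties` XML (consistent with the example keys above); class and method names are illustrative:

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Properties;

// Sketch of the LY ("lazy eye") aggregation: pull per-sensor extrinsic
// corrections out of a *-FIELD_CALIBRATION.corr-xml and format one CSV row
// per file for Calc. Assumes Properties XML input; names are illustrative.
public class LazyEyeCsv {
    static final int NUM_SENSORS = 16;
    static final String[] ANGLES = { "azimuth", "tilt", "roll" };

    /** One CSV row: timestamp, then azimuth0..15, tilt0..15, roll0..15 (radians). */
    public static String csvRow(String timestamp, Properties cal) {
        StringBuilder sb = new StringBuilder(timestamp);
        for (String angle : ANGLES) {
            for (int s = 0; s < NUM_SENSORS; s++) {
                String v = cal.getProperty("EYESIS_DCT_AUX.extrinsic_corr_" + angle + s, "");
                sb.append(',').append(v);
            }
        }
        return sb.toString();
    }

    public static String csvRowFromFile(String timestamp, String path) throws IOException {
        Properties cal = new Properties();
        try (FileInputStream in = new FileInputStream(path)) {
            cal.loadFromXML(in);
        }
        return csvRow(timestamp, cal);
    }
}
```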
## Global LMA pause/resume note
- Current Global LMA/quarter-generation parked state and next-step backlog are documented in:
`attic/CODEX/HOWTOs/project-details.md` under `Parked State Snapshot (2026-02-25)`.
## MCP thoughts (high‑level)
- MCP should be thin and flexible, with a small set of batch tools and a generic parameter map.
- Avoid GUI automation; use batch‑mode entry points and Properties‑based parameter IO.
- A minimal MVP is feasible: one batch command + parameter map + status/logs.
## Logging
- Added `LogTee` helper (in repo) to tee `System.out/err` to per‑scene files without changing existing printlns.
## Notes
- The code is expected to change; tooling should allow rapid iteration with minimal refactor.