Commit 09adc66b authored by Andrey Filippov

docs: Update project-details.md with CUAS and Agent workflow context

parent 736b9069
@@ -3,6 +3,21 @@
## Overview
This document is a structured, living description of the data layout and processing pipeline for the current active parts of the `imagej-elphel` project. It consolidates earlier notes and the latest additions into a coherent reference.
Active development is currently driven by a USAF contract for CUAS (Counter-Unmanned Aircraft Systems) detection and a subcontract with the University of Utah for FOPEN (Foliage Penetration). The CUAS application uses the LWIR-16 camera mounted on a gimbal and relies on rotational motion for FPN (Fixed-Pattern Noise) mitigation.
## Autonomous Agent Workflow (MCP & Daemon Integration)
The processing pipeline takes significant time (often hours) for large sequences. To make such runs practical without constant human supervision, the program is wrapped with an MCP (Model Context Protocol) interface, an MCP server, and a lifecycle daemon.
This architecture enables autonomous, overnight operation by AI agents (Codex, Claude, Gemini). The expected agent lifecycle loop is:
1. Execute the program via MCP.
2. Configure processing parameters (e.g., adjusting FPN thresholds, tuning LMA weights).
3. Launch the processing job.
4. Wait for completion.
5. Analyze the resulting CSVs, histograms, or TIFF profiles.
6. Re-run with refined parameters or directly edit the Java source code to fix issues.
This workflow has been successfully demonstrated on the FOPEN datasets and is currently being adapted to CUAS parameter-optimization sweeps that maximize the number of valid targets detected.
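A minimal sketch of this lifecycle loop is shown below, in Java to match the project. The `McpToolClient` interface and the tool names (`set_parameters`, `run_pipeline`, `job_status`, `get_result_csv`) are hypothetical stand-ins; the actual MCP server's tools are not described in this document, so this only illustrates the configure → run → wait → analyze → refine cycle.

```java
import java.util.Map;

/**
 * Illustrative agent loop around the processing pipeline.
 * McpToolClient and all tool names are hypothetical placeholders,
 * not the real MCP server interface.
 */
public class AgentLoopSketch {

    /** Hypothetical minimal view of an MCP client: one tool call, one string result. */
    interface McpToolClient {
        String callTool(String toolName, Map<String, String> arguments) throws Exception;
    }

    public static void main(String[] args) throws Exception {
        McpToolClient mcp = (tool, arguments) -> "done"; // stub so the sketch compiles and runs

        double fpnThreshold = 0.5; // example tunable; the value is made up
        for (int iteration = 0; iteration < 3; iteration++) {
            // 1-2. configure processing parameters for this run
            mcp.callTool("set_parameters",
                    Map.of("fpn_threshold", Double.toString(fpnThreshold)));

            // 3. launch the (hours-long) processing job
            String jobId = mcp.callTool("run_pipeline", Map.of("list_file", "scenes.list"));

            // 4. poll the daemon until it reports completion
            while (!"done".equals(mcp.callTool("job_status", Map.of("job", jobId)))) {
                Thread.sleep(60_000); // check once a minute
            }

            // 5. pull result CSVs / histograms and decide how to refine parameters
            String csv = mcp.callTool("get_result_csv", Map.of("job", jobId));
            fpnThreshold = refine(csv, fpnThreshold); // 6. re-run with refined parameters
        }
    }

    /** Placeholder analysis step: a real agent would parse the CSV and adjust thresholds. */
    static double refine(String csv, double previous) {
        return previous * 0.9;
    }
}
```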
## Data Layout and List Files
The program starts with a list file that defines the root and source directories plus scene sequences. Example list entry:
@@ -195,15 +210,24 @@ These notes describe the practical difference between:
### CUAS mode (gimbal-rotation, mostly parked after 2025-10-27)
- Camera is fixed in ground frame while gimbal rotates view direction around a fixed axis.
- A virtual reference scene is used (`center_CLT = QuadCLT.restoreCenterClt(...)`) rather than a captured reference frame.
- Typical scan: about one full revolution per ~3 seconds (~0.33 Hz), with a rotation radius around ~3 angular degrees.
- Resulting inter-frame shift remains small (roughly <10% of image size; see the quick check after this list), so all scenes in a long sequence (~500 scenes) can overlap a single virtual center.
- Full-sequence processing is therefore feasible without segment splitting logic used in moving-camera mode.
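A quick check on the <10% figure, assuming it refers to each frame's offset from the virtual center and using the 640-pixel-wide, 32° HFoV sensors described below: a ~3° rotation radius corresponds to roughly 3/32 × 640 ≈ 60 pixels, i.e. just under 10% of the frame width, so every scene still shares most of its field of view with the virtual center reference.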
### Why CUAS uses rotation and deep disparity integration
The primary target class for CUAS detection is extremely difficult: targets are simultaneously low-contrast, physically sub-pixel (e.g., under 0.2 pixels across, appearing slightly larger only because of lens aberrations), and exhibit very slow apparent motion (e.g., a drone flying directly toward or away from the camera).
In this regime, the uncooled microbolometer's Fixed-Pattern Noise (FPN) becomes the dominant limiting factor. Because FPN slowly fluctuates over time, it cannot be cleanly removed by subtracting a single static "dark frame". With a completely stationary camera, the sub-pixel target signal would be drowned out by this slowly drifting FPN.
By rotating the camera at ~1/3 rps on the gimbal, we force the target and the physical background to move rapidly across the sensor array in image coordinates, while the FPN stays pinned to fixed sensor pixels. This deliberate mechanical decoupling greatly improves the separability of the target from the noise floor.
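As a toy illustration of why the decoupling helps (this is not the pipeline's actual FPN algorithm), the sketch below estimates FPN as a per-pixel temporal median over many frames: once the scene sweeps across the array, each pixel's median converges toward its fixed offset plus the average background, so subtracting it removes most of the FPN without suppressing moving content.

```java
import java.util.Arrays;

/**
 * Toy illustration of FPN separation under camera rotation: scene content moves
 * across the array while FPN stays pinned to pixels, so a per-pixel temporal
 * median over many frames approximates the FPN (plus mean background) and can be
 * subtracted. NOT the project's actual mitigation algorithm.
 */
public class FpnMedianSketch {

    /** Per-pixel temporal median over a stack of frames [frame][pixel]. */
    static float[] temporalMedian(float[][] frames) {
        int numPixels = frames[0].length;
        float[] median = new float[numPixels];
        float[] column = new float[frames.length];
        for (int p = 0; p < numPixels; p++) {
            for (int f = 0; f < frames.length; f++) {
                column[f] = frames[f][p];
            }
            Arrays.sort(column);
            median[p] = column[frames.length / 2];
        }
        return median;
    }

    /** Subtract the per-pixel FPN estimate from one frame. */
    static float[] subtractFpn(float[] frame, float[] fpnEstimate) {
        float[] cleaned = new float[frame.length];
        for (int p = 0; p < frame.length; p++) {
            cleaned[p] = frame[p] - fpnEstimate[p];
        }
        return cleaned;
    }

    public static void main(String[] args) {
        // Synthetic example: 100 frames of 4 pixels, fixed per-pixel offsets plus a moving "scene".
        float[][] frames = new float[100][4];
        float[] fpn = {2f, -1f, 0.5f, 3f};
        for (int f = 0; f < frames.length; f++) {
            for (int p = 0; p < 4; p++) {
                frames[f][p] = fpn[p] + (float) Math.sin(0.3 * f + p);
            }
        }
        float[] estimate = temporalMedian(frames);
        System.out.println(Arrays.toString(subtractFpn(frames[0], estimate)));
    }
}
```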
Because the target footprint is so small, obtaining accurate disparity (ranging) requires massive data integration. The pipeline employs a two-step process:
1. **Sensor Consolidation:** Combine data from the 16 separate LWIR cameras (each 640×512, 32° HFoV, distributed on a 220 mm diameter circle).
2. **Virtual Long-Exposure Tracking:** Implement a virtual "tracking" camera that follows the target using piecewise-linear motion models.
By accumulating hundreds of these 16-image scenes over several seconds, the pipeline pushes disparity resolution well beyond what a single frame allows. In favorable conditions (such as the recent Eagle Mountain test), this yields disparity resolution below **0.02 pixels**. Because of the 220 mm baseline across the 16 cameras, a 0.02 pixel disparity error translates to roughly a 10% ranging error at 1000 meters.
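As a rough consistency check (assuming the 32° HFoV maps onto the 640-pixel width, so the focal length is about f ≈ 320 / tan(16°) ≈ 1100 px): disparity at range Z over baseline B is d = f·B/Z ≈ 1100 × 0.22 m / 1000 m ≈ 0.25 px, and since the relative range error equals the relative disparity error, ΔZ/Z ≈ Δd/d ≈ 0.02 / 0.25 ≈ 8%, i.e. on the order of the quoted 10% at 1000 meters.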
Current efforts are focused on achieving similar detection and ranging performance on recent data from Latvia (Selonia NATO range), with an emphasis on faster target detection. Hardware anomalies during acquisition (including a laptop failure linked to ground loops/CAT5 connections in a heavy EW-jamming environment) resulted in noisy sequences, but valid drone (DJI Mini 4 Pro) and bird tracks are present and are being actively processed.
### FPN behavior and mitigation in CUAS
- FPN is not perfectly static over long periods.
......