Commit 978388ba authored by Andrey Filippov's avatar Andrey Filippov

CLAUDE: Add perspective remap — inverse-shift TERRAIN-DISP to per-scene perspective images

perspective_remap.py: new module that inverts the terrain-reference warp
in TERRAIN-DISP-MERGED frames using ELEV_GND depth (64×80 tiles, tilted
plane, no NaN). Produces natural-parallax perspective images per scene,
valid only in the overlap with the reference camera's ground footprint.

elphel_to_colmap_perspective.py: new CLI script (parallel to the original
elphel_to_colmap.py, which is kept unchanged) that adds --elev_gnd and
applies the remap before writing COLMAP output.

Tested on 185-frame NC dataset: valid-pixel min=0.10, mean=0.54, max=1.00.
SciPy added to Dependencies in README.
Co-authored-by: 's avatarClaude <claude@elphel.com>
parent 925d0f34
......@@ -27,7 +27,7 @@ Converts Elphel imagej-elphel segment outputs to COLMAP / OpenMVS format.
points3D.txt empty (Phase 1 will add the 80×64 sparse seed)
```
### Usage
### Usage — standard (terrain-referenced images)
```bash
python3 scripts/elphel_to_colmap.py \
......@@ -38,6 +38,35 @@ python3 scripts/elphel_to_colmap.py \
Optional: `--hfov 32.0` (degrees, default 32).
This writes the raw TERRAIN-DISP frames as-is. The images are
terrain-referenced (zero parallax for flat ground), which is correct
for FOPEN analysis but incompatible with OpenMVS feature matching.
### Usage — perspective remap (recommended for OpenMVS / 3D reconstruction)
```bash
python3 scripts/elphel_to_colmap_perspective.py \
--tiff /path/to/<ts>-TERRAIN-DISP-MERGED.tiff \
--xml /path/to/<ts>-INTERFRAME.corr-xml \
--elev_gnd /path/to/<ts>-ELEV_GND.tiff \
--out /tmp/colmap_perspective
```
Additionally requires `*-ELEV_GND.tiff` (64×80 float32 ground-depth map,
produced alongside TERRAIN-DISP-MERGED by imagej-elphel).
This applies an inverse-shift remap (via `src/elphel_to_mvs/perspective_remap.py`)
to convert each terrain-referenced frame into a proper per-scene perspective
image. Regions outside the reference camera's ground footprint are filled
with NaN (saved as 0 in the PNG). Expected valid-pixel fractions:
- Reference scene: 100 %
- Central frames: ~97 %
- End-of-sequence frames: ~10–18 %
Natural terrain parallax is restored (~5–6 px/frame at 175 m AGL),
making the images geometrically consistent with the supplied camera poses.
### Full pipeline (one command)
```bash
......@@ -55,7 +84,7 @@ See the comment block at the top of the script for build/install instructions.
### Dependencies
Python 3.8+, Pillow, NumPy.
Python 3.8+, Pillow, NumPy, SciPy.
OpenMVS (CLI tools only — `InterfaceCOLMAP`, `DensifyPointCloud`, etc.).
### Coordinate-convention notes
......
#!/usr/bin/env python3
# -----------------------------------------------------------------------------
# elphel_to_colmap_perspective.py — Elphel LWIR-16 → COLMAP with perspective remap
#
# Copyright (C) 2026 Elphel, Inc.
# -----------------------------------------------------------------------------
#
# elphel_to_colmap_perspective.py is free software: you can redistribute it
# and/or modify it under the terms of the GNU General Public License as
# published by the Free Software Foundation, either version 3 of the License,
# or (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
"""
Phase 0 pose+camera-model bridge with perspective remap.
This is a companion to elphel_to_colmap.py that additionally applies an
inverse-shift remap to convert the terrain-referenced TERRAIN-DISP-MERGED
frames into proper perspective images in each scene's own camera frame.
Why this is needed
------------------
imagej-elphel's TERRAIN-DISP-MERGED.tiff warps every frame to the reference
camera's coordinate system so that flat-terrain features appear at identical
pixel positions across all frames. This is correct for FOPEN analysis but
incompatible with OpenMVS, which expects natural-parallax perspective images.
The remap uses ELEV_GND.tiff (64×80 tile-grid, fitted tilted plane through
ground-visible pixels, no NaN) to recover the per-scene perspective images.
Valid pixels cover only the overlap with the reference camera's ground patch
(~97 % for central frames, ~18 % for end-of-sequence frames). NaN fills the
non-overlapping portions — OpenMVS feature detectors simply skip those.
This script is a Python-only workaround. The preferred long-term solution
is a new imagej-elphel export mode that renders each scene directly from its
own camera position.
Usage
-----
python3 scripts/elphel_to_colmap_perspective.py \\
--tiff PATH/<ts>-TERRAIN-DISP-MERGED.tiff \\
--xml PATH/<ts>-INTERFRAME.corr-xml \\
--elev_gnd PATH/<ts>-ELEV_GND.tiff \\
--out /tmp/colmap_perspective
# All three input files are normally in the same v88/ directory:
BASE=/home/elphel/lwir16-proc/NC/linked/.../1763232146_700470/v88
TS=1763232146_700470
python3 scripts/elphel_to_colmap_perspective.py \\
--tiff $BASE/$TS-TERRAIN-DISP-MERGED.tiff \\
--xml $BASE/$TS-INTERFRAME.corr-xml \\
--elev_gnd $BASE/$TS-ELEV_GND.tiff \\
--out /tmp/colmap_perspective
Output layout
-------------
<out>/
images/ 16-bit PNG per scene (named <timestamp>.png)
Perspective view from scene i's camera; NaN regions → 0
sparse/0/
cameras.txt single PINHOLE camera
images.txt per-image poses (quaternion + translation)
points3D.txt empty (Phase 0; sparse seed added in Phase 1)
"""
import sys
import argparse
import numpy as np
from pathlib import Path
sys.path.insert(0, str(Path(__file__).resolve().parent.parent / "src"))
from elphel_to_mvs.read_interframe_xml import parse as parse_xml
from elphel_to_mvs.read_tiff_stack import read_scene_slices
from elphel_to_mvs.export_colmap import write_colmap_model
from elphel_to_mvs.perspective_remap import read_elev_gnd, remap_scenes
def main():
ap = argparse.ArgumentParser(
description="Elphel → COLMAP Phase 0 bridge with perspective remap",
formatter_class=argparse.RawDescriptionHelpFormatter,
)
ap.add_argument("--tiff", required=True,
help="*-TERRAIN-DISP-MERGED.tiff")
ap.add_argument("--xml", required=True,
help="*-INTERFRAME.corr-xml")
ap.add_argument("--elev_gnd", required=True,
help="*-ELEV_GND.tiff (64×80 ground-depth map)")
ap.add_argument("--out", required=True,
help="Output directory (will be created)")
ap.add_argument("--hfov", type=float, default=32.0,
help="Camera horizontal FoV in degrees (default 32)")
args = ap.parse_args()
# ── Load poses ────────────────────────────────────────────────────────────
print(f"Reading poses from {args.xml}")
poses, ref_ts = parse_xml(args.xml)
print(f" {len(poses)} scene poses (reference: {ref_ts})")
# ── Load raw TERRAIN-DISP frames ──────────────────────────────────────────
print(f"Reading TERRAIN-DISP stack from {args.tiff}")
scenes, gmin, gmax = read_scene_slices(args.tiff)
print(f" {len(scenes)} timestamp slices "
f"(global float range [{gmin:.2f}, {gmax:.2f}])")
# ── Load ELEV_GND ─────────────────────────────────────────────────────────
print(f"Reading ELEV_GND from {args.elev_gnd}")
elev_gnd = read_elev_gnd(args.elev_gnd)
print(f" Shape {elev_gnd.shape} "
f"depth range [{-elev_gnd.max():.1f}, {-elev_gnd.min():.1f}] m AGL")
# ── Compute focal length (same formula as export_colmap.py) ──────────────
# Image width taken from the first scene frame
_, first_arr = scenes[0]
height, width = first_arr.shape
fl = width / (2.0 * np.tan(np.radians(args.hfov / 2.0)))
print(f" Focal length: {fl:.1f} px ({width}×{height} px, HFoV={args.hfov}°)")
# ── Apply perspective remap ───────────────────────────────────────────────
print("Applying perspective remap ...")
scenes_perspective = remap_scenes(scenes, poses, fl, elev_gnd, verbose=True)
# ── Write COLMAP model ────────────────────────────────────────────────────
# gmin/gmax from the original stack are still valid (same LWIR intensity
# range; the remap only changes spatial mapping, not radiometry).
write_colmap_model(
output_dir=args.out,
scenes=scenes_perspective,
poses=poses,
global_min=gmin,
global_max=gmax,
width=width,
height=height,
hfov_deg=args.hfov,
)
if __name__ == "__main__":
main()
# -----------------------------------------------------------------------------
# perspective_remap.py — Undo terrain-reference warp from TERRAIN-DISP frames
#
# Copyright (C) 2026 Elphel, Inc.
# -----------------------------------------------------------------------------
#
# perspective_remap.py is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -----------------------------------------------------------------------------
"""
Convert TERRAIN-DISP frames (terrain-referenced, in the reference camera
coordinate system) into proper perspective images in each scene's own frame.
Background
----------
imagej-elphel's renderSceneSequence() applies a per-tile translation that
warps every frame to the reference camera view using the SfM-refined ground
disparity. The resulting TERRAIN-DISP-MERGED.tiff has zero parallax for
flat-terrain features — incorrect for OpenMVS, which requires natural-parallax
perspective images.
This module inverts that per-tile translation using ELEV_GND.tiff (the
best available ground-depth estimate: 64×80 tile grid, tilted-plane fit
through ground-visible pixels, no NaN) to recover the perspective view for
each scene, valid only in the region that overlaps the reference camera patch.
Derivation (COLMAP world frame, nadir camera, terrain at depth Z m AGL)
------------------------------------------------------------------------
Forward render (what renderSceneSequence does):
TERRAIN_DISP_i[tx, ty] = scene_i_sensor[tx − x_i·fl/Z, ty + y_i·fl/Z]
where x_i, y_i are the scene camera's X, Y displacement from the reference
(metres, Elphel convention: X = flight direction, Y = cross-track).
Inverse (source coordinates in TERRAIN_DISP for output pixel (col, row)):
src_col = col + x_i · fl / Z_ground(col, row)
src_row = row − y_i · fl / Z_ground(col, row)
ELEV_GND stores depth as a negative value (terrain is below the camera in
the Elphel Z convention), so Z_ground = −ELEV_GND > 0.
This is a temporary Python workaround. The correct long-term fix is a new
imagej-elphel export mode that renders each scene from its own camera
position without the terrain-reference warp.
"""
from pathlib import Path
import numpy as np
from PIL import Image
from scipy.ndimage import map_coordinates
def read_elev_gnd(elev_gnd_path):
"""
Read ELEV_GND.tiff — single-frame 64×80 float32 ground-depth map.
Values are negative (terrain below camera in Elphel Z convention).
Returns a (64, 80) float32 array.
"""
img = Image.open(Path(elev_gnd_path))
arr = np.array(img, dtype=np.float32)
if arr.ndim != 2:
raise ValueError(f"ELEV_GND expected 2-D, got shape {arr.shape}")
return arr
def _build_depth_map(elev_gnd, img_height, img_width):
"""
Upsample the 64×80 ELEV_GND tile grid to full image resolution.
Each tile covers an 8×8 pixel block; depth is constant within the tile.
Returns a (img_height, img_width) float32 array of positive AGL depths.
"""
tile_h, tile_w = elev_gnd.shape # (64, 80)
factor_h = img_height // tile_h # 8
factor_w = img_width // tile_w # 8
Z_ground = -elev_gnd # positive depth (metres)
Z_map = np.repeat(np.repeat(Z_ground, factor_h, axis=0), factor_w, axis=1)
# Pad to exact image size if image dimensions aren't exact multiples
Z_map = Z_map[:img_height, :img_width]
return Z_map
def remap_to_perspective(arr, x_i, y_i, fl, elev_gnd):
"""
Convert one TERRAIN-DISP frame to a perspective image in scene i's frame.
Parameters
----------
arr : (H, W) float32 — TERRAIN-DISP frame (reference-camera coords)
x_i : float — scene i X displacement from reference (metres)
y_i : float — scene i Y displacement from reference (metres)
fl : float — focal length (pixels)
elev_gnd : (64, 80) float32 — ELEV_GND map (negative values)
Returns
-------
(H, W) float32 — perspective image in scene i's own camera coordinates.
NaN where scene i's ground patch does not overlap the reference patch.
"""
H, W = arr.shape
Z_map = _build_depth_map(elev_gnd, H, W) # positive AGL metres
# Per-pixel source coordinates in the TERRAIN-DISP frame
rows_out, cols_out = np.mgrid[0:H, 0:W].astype(np.float64)
src_rows = rows_out - y_i * fl / Z_map
src_cols = cols_out + x_i * fl / Z_map
# Interpolate data and a validity mask separately so that NaN in the input
# (from tiles with no disparity data) and out-of-bounds reads both yield NaN.
valid = np.isfinite(arr).astype(np.float32)
filled = np.where(valid, arr, 0.0).astype(np.float64)
data_out = map_coordinates(filled, [src_rows, src_cols],
order=1, mode='constant', cval=0.0)
valid_out = map_coordinates(valid, [src_rows, src_cols],
order=1, mode='constant', cval=0.0)
result = np.where(valid_out > 0.5, data_out, np.nan).astype(np.float32)
return result
def remap_scenes(scenes, poses, fl, elev_gnd, verbose=True):
"""
Apply perspective remap to every scene in a TERRAIN-DISP stack.
Parameters
----------
scenes : list of (timestamp_str, float32 array) — from read_scene_slices()
poses : dict timestamp_str -> (x, y, z, az, tilt, roll)
fl : float — focal length (pixels)
elev_gnd : (64, 80) float32 — from read_elev_gnd()
verbose : bool
Returns
-------
list of (timestamp_str, float32 array) — remapped scenes (same order,
only timestamps that have a pose entry are included).
"""
remapped = []
skipped = 0
for ts, arr in scenes:
if ts not in poses:
skipped += 1
continue
x_i, y_i = poses[ts][0], poses[ts][1]
out = remap_to_perspective(arr, x_i, y_i, fl, elev_gnd)
remapped.append((ts, out))
if verbose:
valid_fracs = []
for _, out in remapped:
valid_fracs.append(np.isfinite(out).mean())
print(f" Remapped {len(remapped)} scenes (skipped {skipped} without pose)")
if valid_fracs:
print(f" Valid-pixel fraction: "
f"min={min(valid_fracs):.2f} "
f"mean={np.mean(valid_fracs):.2f} "
f"max={max(valid_fracs):.2f}")
return remapped
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment