CLAUDE: Add perspective remap — inverse-shift TERRAIN-DISP to per-scene perspective images

perspective_remap.py: new module that inverts the terrain-reference warp in TERRAIN-DISP-MERGED frames using ELEV_GND depth (64×80 tiles, tilted plane, no NaN). Produces natural-parallax perspective images per scene, valid only in the overlap with the reference camera's ground footprint. elphel_to_colmap_perspective.py: new CLI script (parallel to the original elphel_to_colmap.py, which is kept unchanged) that adds --elev_gnd and applies the remap before writing COLMAP output. Tested on 185-frame NC dataset: valid-pixel min=0.10, mean=0.54, max=1.00. SciPy added to Dependencies in README. Co-authored-by: Claude <claude@elphel.com>

CLAUDE: Add perspective remap — inverse-shift TERRAIN-DISP to per-scene perspective images
perspective_remap.py: new module that inverts the terrain-reference warp in TERRAIN-DISP-MERGED frames using ELEV_GND depth (64×80 tiles, tilted plane, no NaN). Produces natural-parallax perspective images per scene, valid only in the overlap with the reference camera's ground footprint. elphel_to_colmap_perspective.py: new CLI script (parallel to the original elphel_to_colmap.py, which is kept unchanged) that adds --elev_gnd and applies the remap before writing COLMAP output. Tested on 185-frame NC dataset: valid-pixel min=0.10, mean=0.54, max=1.00. SciPy added to Dependencies in README. Co-authored-by: Claude <claude@elphel.com>
978388ba · Andrey Filippov · 925d0f34 · 978388ba · 978388ba · 978388ba
Commit 978388ba authored May 15, 2026 by Andrey Filippov
Showing with 348 additions and 2 deletions

README.md README.md +31 -2

elphel_to_colmap_perspective.py scripts/elphel_to_colmap_perspective.py +147 -0

perspective_remap.py src/elphel_to_mvs/perspective_remap.py +170 -0

No files found.
--- a/README.md
+++ b/README.md
@@ -27,7 +27,7 @@ Converts Elphel imagej-elphel segment outputs to COLMAP / OpenMVS format.
    points3D.txt   empty (Phase 1 will add the 80×64 sparse seed)
 ```

-### Usage
+### Usage — standard (terrain-referenced images)

 ```bash
 python3 scripts/elphel_to_colmap.py \
@@ -38,6 +38,35 @@ python3 scripts/elphel_to_colmap.py \

 Optional: `--hfov 32.0` (degrees, default 32).

+This writes the raw TERRAIN-DISP frames as-is.  The images are
+terrain-referenced (zero parallax for flat ground), which is correct
+for FOPEN analysis but incompatible with OpenMVS feature matching.
+
+### Usage — perspective remap (recommended for OpenMVS / 3D reconstruction)
+
+```bash
+python3 scripts/elphel_to_colmap_perspective.py \
+  --tiff     /path/to/<ts>-TERRAIN-DISP-MERGED.tiff \
+  --xml      /path/to/<ts>-INTERFRAME.corr-xml \
+  --elev_gnd /path/to/<ts>-ELEV_GND.tiff \
+  --out      /tmp/colmap_perspective
+```
+
+Additionally requires `*-ELEV_GND.tiff` (64×80 float32 ground-depth map,
+produced alongside TERRAIN-DISP-MERGED by imagej-elphel).
+
+This applies an inverse-shift remap (via `src/elphel_to_mvs/perspective_remap.py`)
+to convert each terrain-referenced frame into a proper per-scene perspective
+image.  Regions outside the reference camera's ground footprint are filled
+with NaN (saved as 0 in the PNG).  Expected valid-pixel fractions:
+
+- Reference scene: 100 %
+- Central frames:  ~97 %
+- End-of-sequence frames: ~10–18 %
+
+Natural terrain parallax is restored (~5–6 px/frame at 175 m AGL),
+making the images geometrically consistent with the supplied camera poses.
+
 ### Full pipeline (one command)

 ```bash
@@ -55,7 +84,7 @@ See the comment block at the top of the script for build/install instructions.

 ### Dependencies

-Python 3.8+, Pillow, NumPy.
+Python 3.8+, Pillow, NumPy, SciPy.
 OpenMVS (CLI tools only — `InterfaceCOLMAP`, `DensifyPointCloud`, etc.).

 ### Coordinate-convention notes

--- a/scripts/elphel_to_colmap_perspective.py
+++ b/scripts/elphel_to_colmap_perspective.py
+#!/usr/bin/env python3
+# -----------------------------------------------------------------------------
+# elphel_to_colmap_perspective.py — Elphel LWIR-16 → COLMAP with perspective remap
+#
+# Copyright (C) 2026 Elphel, Inc.
+# -----------------------------------------------------------------------------
+#
+# elphel_to_colmap_perspective.py is free software: you can redistribute it
+# and/or modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation, either version 3 of the License,
+# or (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+# -----------------------------------------------------------------------------
+"""
+Phase 0 pose+camera-model bridge with perspective remap.
+
+This is a companion to elphel_to_colmap.py that additionally applies an
+inverse-shift remap to convert the terrain-referenced TERRAIN-DISP-MERGED
+frames into proper perspective images in each scene's own camera frame.
+
+Why this is needed
+------------------
+imagej-elphel's TERRAIN-DISP-MERGED.tiff warps every frame to the reference
+camera's coordinate system so that flat-terrain features appear at identical
+pixel positions across all frames.  This is correct for FOPEN analysis but
+incompatible with OpenMVS, which expects natural-parallax perspective images.
+
+The remap uses ELEV_GND.tiff (64×80 tile-grid, fitted tilted plane through
+ground-visible pixels, no NaN) to recover the per-scene perspective images.
+Valid pixels cover only the overlap with the reference camera's ground patch
+(~97 % for central frames, ~18 % for end-of-sequence frames).  NaN fills the
+non-overlapping portions — OpenMVS feature detectors simply skip those.
+
+This script is a Python-only workaround.  The preferred long-term solution
+is a new imagej-elphel export mode that renders each scene directly from its
+own camera position.
+
+Usage
+-----
+  python3 scripts/elphel_to_colmap_perspective.py \\
+      --tiff     PATH/<ts>-TERRAIN-DISP-MERGED.tiff \\
+      --xml      PATH/<ts>-INTERFRAME.corr-xml \\
+      --elev_gnd PATH/<ts>-ELEV_GND.tiff \\
+      --out      /tmp/colmap_perspective
+
+  # All three input files are normally in the same v88/ directory:
+  BASE=/home/elphel/lwir16-proc/NC/linked/.../1763232146_700470/v88
+  TS=1763232146_700470
+  python3 scripts/elphel_to_colmap_perspective.py \\
+      --tiff     $BASE/$TS-TERRAIN-DISP-MERGED.tiff \\
+      --xml      $BASE/$TS-INTERFRAME.corr-xml \\
+      --elev_gnd $BASE/$TS-ELEV_GND.tiff \\
+      --out      /tmp/colmap_perspective
+
+Output layout
+-------------
+  <out>/
+    images/          16-bit PNG per scene (named <timestamp>.png)
+                     Perspective view from scene i's camera; NaN regions → 0
+    sparse/0/
+      cameras.txt    single PINHOLE camera
+      images.txt     per-image poses (quaternion + translation)
+      points3D.txt   empty (Phase 0; sparse seed added in Phase 1)
+"""
+
+import sys
+import argparse
+import numpy as np
+from pathlib import Path
+
+sys.path.insert(0, str(Path(__file__).resolve().parent.parent / "src"))
+
+from elphel_to_mvs.read_interframe_xml import parse as parse_xml
+from elphel_to_mvs.read_tiff_stack     import read_scene_slices
+from elphel_to_mvs.export_colmap       import write_colmap_model
+from elphel_to_mvs.perspective_remap   import read_elev_gnd, remap_scenes
+
+
+def main():
+    ap = argparse.ArgumentParser(
+        description="Elphel → COLMAP Phase 0 bridge with perspective remap",
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+    )
+    ap.add_argument("--tiff",     required=True,
+                    help="*-TERRAIN-DISP-MERGED.tiff")
+    ap.add_argument("--xml",      required=True,
+                    help="*-INTERFRAME.corr-xml")
+    ap.add_argument("--elev_gnd", required=True,
+                    help="*-ELEV_GND.tiff  (64×80 ground-depth map)")
+    ap.add_argument("--out",      required=True,
+                    help="Output directory (will be created)")
+    ap.add_argument("--hfov",     type=float, default=32.0,
+                    help="Camera horizontal FoV in degrees (default 32)")
+    args = ap.parse_args()
+
+    # ── Load poses ────────────────────────────────────────────────────────────
+    print(f"Reading poses from {args.xml}")
+    poses, ref_ts = parse_xml(args.xml)
+    print(f"  {len(poses)} scene poses  (reference: {ref_ts})")
+
+    # ── Load raw TERRAIN-DISP frames ──────────────────────────────────────────
+    print(f"Reading TERRAIN-DISP stack from {args.tiff}")
+    scenes, gmin, gmax = read_scene_slices(args.tiff)
+    print(f"  {len(scenes)} timestamp slices  "
+          f"(global float range [{gmin:.2f}, {gmax:.2f}])")
+
+    # ── Load ELEV_GND ─────────────────────────────────────────────────────────
+    print(f"Reading ELEV_GND from {args.elev_gnd}")
+    elev_gnd = read_elev_gnd(args.elev_gnd)
+    print(f"  Shape {elev_gnd.shape}  "
+          f"depth range [{-elev_gnd.max():.1f}, {-elev_gnd.min():.1f}] m AGL")
+
+    # ── Compute focal length (same formula as export_colmap.py) ──────────────
+    # Image width taken from the first scene frame
+    _, first_arr = scenes[0]
+    height, width = first_arr.shape
+    fl = width / (2.0 * np.tan(np.radians(args.hfov / 2.0)))
+    print(f"  Focal length: {fl:.1f} px  ({width}×{height} px, HFoV={args.hfov}°)")
+
+    # ── Apply perspective remap ───────────────────────────────────────────────
+    print("Applying perspective remap ...")
+    scenes_perspective = remap_scenes(scenes, poses, fl, elev_gnd, verbose=True)
+
+    # ── Write COLMAP model ────────────────────────────────────────────────────
+    # gmin/gmax from the original stack are still valid (same LWIR intensity
+    # range; the remap only changes spatial mapping, not radiometry).
+    write_colmap_model(
+        output_dir=args.out,
+        scenes=scenes_perspective,
+        poses=poses,
+        global_min=gmin,
+        global_max=gmax,
+        width=width,
+        height=height,
+        hfov_deg=args.hfov,
+    )
+
+
+if __name__ == "__main__":
+    main()
--- a/src/elphel_to_mvs/perspective_remap.py
+++ b/src/elphel_to_mvs/perspective_remap.py
+# -----------------------------------------------------------------------------
+# perspective_remap.py — Undo terrain-reference warp from TERRAIN-DISP frames
+#
+# Copyright (C) 2026 Elphel, Inc.
+# -----------------------------------------------------------------------------
+#
+# perspective_remap.py is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+# -----------------------------------------------------------------------------
+"""
+Convert TERRAIN-DISP frames (terrain-referenced, in the reference camera
+coordinate system) into proper perspective images in each scene's own frame.
+
+Background
+----------
+imagej-elphel's renderSceneSequence() applies a per-tile translation that
+warps every frame to the reference camera view using the SfM-refined ground
+disparity.  The resulting TERRAIN-DISP-MERGED.tiff has zero parallax for
+flat-terrain features — incorrect for OpenMVS, which requires natural-parallax
+perspective images.
+
+This module inverts that per-tile translation using ELEV_GND.tiff (the
+best available ground-depth estimate: 64×80 tile grid, tilted-plane fit
+through ground-visible pixels, no NaN) to recover the perspective view for
+each scene, valid only in the region that overlaps the reference camera patch.
+
+Derivation (COLMAP world frame, nadir camera, terrain at depth Z m AGL)
+------------------------------------------------------------------------
+Forward render (what renderSceneSequence does):
+    TERRAIN_DISP_i[tx, ty] = scene_i_sensor[tx − x_i·fl/Z,  ty + y_i·fl/Z]
+
+where x_i, y_i are the scene camera's X, Y displacement from the reference
+(metres, Elphel convention: X = flight direction, Y = cross-track).
+
+Inverse (source coordinates in TERRAIN_DISP for output pixel (col, row)):
+    src_col = col + x_i · fl / Z_ground(col, row)
+    src_row = row − y_i · fl / Z_ground(col, row)
+
+ELEV_GND stores depth as a negative value (terrain is below the camera in
+the Elphel Z convention), so Z_ground = −ELEV_GND > 0.
+
+This is a temporary Python workaround.  The correct long-term fix is a new
+imagej-elphel export mode that renders each scene from its own camera
+position without the terrain-reference warp.
+"""
+
+from pathlib import Path
+import numpy as np
+from PIL import Image
+from scipy.ndimage import map_coordinates
+
+
+def read_elev_gnd(elev_gnd_path):
+    """
+    Read ELEV_GND.tiff — single-frame 64×80 float32 ground-depth map.
+
+    Values are negative (terrain below camera in Elphel Z convention).
+    Returns a (64, 80) float32 array.
+    """
+    img = Image.open(Path(elev_gnd_path))
+    arr = np.array(img, dtype=np.float32)
+    if arr.ndim != 2:
+        raise ValueError(f"ELEV_GND expected 2-D, got shape {arr.shape}")
+    return arr
+
+
+def _build_depth_map(elev_gnd, img_height, img_width):
+    """
+    Upsample the 64×80 ELEV_GND tile grid to full image resolution.
+
+    Each tile covers an 8×8 pixel block; depth is constant within the tile.
+    Returns a (img_height, img_width) float32 array of positive AGL depths.
+    """
+    tile_h, tile_w = elev_gnd.shape          # (64, 80)
+    factor_h = img_height // tile_h          # 8
+    factor_w = img_width  // tile_w          # 8
+    Z_ground = -elev_gnd                     # positive depth (metres)
+    Z_map = np.repeat(np.repeat(Z_ground, factor_h, axis=0), factor_w, axis=1)
+    # Pad to exact image size if image dimensions aren't exact multiples
+    Z_map = Z_map[:img_height, :img_width]
+    return Z_map
+
+
+def remap_to_perspective(arr, x_i, y_i, fl, elev_gnd):
+    """
+    Convert one TERRAIN-DISP frame to a perspective image in scene i's frame.
+
+    Parameters
+    ----------
+    arr      : (H, W) float32  — TERRAIN-DISP frame (reference-camera coords)
+    x_i      : float  — scene i X displacement from reference (metres)
+    y_i      : float  — scene i Y displacement from reference (metres)
+    fl       : float  — focal length (pixels)
+    elev_gnd : (64, 80) float32  — ELEV_GND map (negative values)
+
+    Returns
+    -------
+    (H, W) float32  — perspective image in scene i's own camera coordinates.
+    NaN where scene i's ground patch does not overlap the reference patch.
+    """
+    H, W = arr.shape
+    Z_map = _build_depth_map(elev_gnd, H, W)          # positive AGL metres
+
+    # Per-pixel source coordinates in the TERRAIN-DISP frame
+    rows_out, cols_out = np.mgrid[0:H, 0:W].astype(np.float64)
+    src_rows = rows_out - y_i * fl / Z_map
+    src_cols = cols_out + x_i * fl / Z_map
+
+    # Interpolate data and a validity mask separately so that NaN in the input
+    # (from tiles with no disparity data) and out-of-bounds reads both yield NaN.
+    valid = np.isfinite(arr).astype(np.float32)
+    filled = np.where(valid, arr, 0.0).astype(np.float64)
+
+    data_out  = map_coordinates(filled, [src_rows, src_cols],
+                                order=1, mode='constant', cval=0.0)
+    valid_out = map_coordinates(valid,  [src_rows, src_cols],
+                                order=1, mode='constant', cval=0.0)
+
+    result = np.where(valid_out > 0.5, data_out, np.nan).astype(np.float32)
+    return result
+
+
+def remap_scenes(scenes, poses, fl, elev_gnd, verbose=True):
+    """
+    Apply perspective remap to every scene in a TERRAIN-DISP stack.
+
+    Parameters
+    ----------
+    scenes   : list of (timestamp_str, float32 array)  — from read_scene_slices()
+    poses    : dict  timestamp_str -> (x, y, z, az, tilt, roll)
+    fl       : float  — focal length (pixels)
+    elev_gnd : (64, 80) float32  — from read_elev_gnd()
+    verbose  : bool
+
+    Returns
+    -------
+    list of (timestamp_str, float32 array)  — remapped scenes (same order,
+    only timestamps that have a pose entry are included).
+    """
+    remapped = []
+    skipped = 0
+    for ts, arr in scenes:
+        if ts not in poses:
+            skipped += 1
+            continue
+        x_i, y_i = poses[ts][0], poses[ts][1]
+        out = remap_to_perspective(arr, x_i, y_i, fl, elev_gnd)
+        remapped.append((ts, out))
+
+    if verbose:
+        valid_fracs = []
+        for _, out in remapped:
+            valid_fracs.append(np.isfinite(out).mean())
+        print(f"  Remapped {len(remapped)} scenes  (skipped {skipped} without pose)")
+        if valid_fracs:
+            print(f"  Valid-pixel fraction: "
+                  f"min={min(valid_fracs):.2f}  "
+                  f"mean={np.mean(valid_fracs):.2f}  "
+                  f"max={max(valid_fracs):.2f}")
+    return remapped