Personal list. Because the relevant research is advancing fast and branching out widely, from now on I'll only add papers that meet my own needs.
- CroCo: Self-Supervised Pre-training for 3D Vision Tasks by Cross-View Completion [NeurIPS 2022] [croco]
- CroCo v2: Improved Cross-view Completion Pre-training for Stereo Matching and Optical Flow [ICCV 2023] [croco]
- 3D-Consistent Image Inpainting with Diffusion Models [arXiv 2024] [croco-diff]
- Alligat0R: Pre-Training through Co-Visibility Segmentation for Relative Camera Pose Regression [arXiv 2025] []
- Cameras as Rays: Pose Estimation via Ray Diffusion [ICLR 2024] [RayDiffusion]
- Reloc3r: Large-Scale Training of Relative Camera Pose Regression for Generalizable, Fast, and Accurate Visual Localization [arXiv 2024] [reloc3r]
- Visual Geometry Grounded Deep Structure From Motion [CVPR 2024] [vggsfm]
- Grounding Image Matching in 3D with MASt3R [arXiv 2024] [mast3r]
- MASt3R-SfM: a Fully-Integrated Solution for Unconstrained Structure-from-Motion [arXiv 2024] [mast3r]
- MV-DUSt3R+: Single-Stage Scene Reconstruction from Sparse Views In 2 Seconds [arXiv 2024] [mv-dust3rp]
- Continuous 3D Perception Model with Persistent State [arXiv 2025] [cut3r]
- Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass [arXiv 2025] [fast3r-3d]
- Light3R-SfM: Towards Feed-forward Structure-from-Motion [arXiv 2025] []
- MUSt3R: Multi-view Network for Stereo 3D Reconstruction [arXiv 2025] [must3r]
- PE3R: Perception-Efficient 3D Reconstruction [arXiv 2025] [pe3r]
- VGGT: Visual Geometry Grounded Transformer [CVPR 2025] [vggt]
- Pow3R: Empowering Unconstrained 3D Reconstruction with Camera and Scene Priors [CVPR 2025] []
- Matrix3D: Large Photogrammetry Model All-in-One [CVPR 2025] [ml-matrix3d]
- MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion [arXiv 2024] [monst3r]
- MegaSaM: Accurate, Fast, and Robust Structure and Motion from Casual Dynamic Videos [arXiv 2024] [mega-sam]
- Driv3R: Learning Dense 4D Reconstruction for Autonomous Driving [arXiv 2024] [Driv3R]
- Geo4D: Leveraging Video Generators for Geometric 4D Scene Reconstruction [arXiv 2025] [Geo4D]
- SLAM3R: Real-Time Dense Scene Reconstruction from Monocular RGB Videos [arXiv 2024] [SLAM3R]
- MASt3R-SLAM: Real-Time Dense SLAM with 3D Reconstruction Priors [arXiv 2024] [mast3r-slam]
- Splatt3R: Zero-shot Gaussian Splatting from Uncalibrated Image Pairs [arXiv 2024] [splatt3r]
- No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images [arXiv 2024] [NoPoSplat]
- PreF3R: Pose-Free Feed-Forward 3D Gaussian Splatting from Variable-length Image Sequence [arXiv 2024] [PreF3R]
- SPARS3R: Semantic Prior Alignment and Regularization for Sparse 3D Reconstruction [arXiv 2024] [SPARS3R]