Computing Differential Properties of 3-D Shapes from Stereoscopic Images without 3-D Model

Abstract

We are considering the problem of recovering the three-dimensional geometry of a scene from binocular stereo disparity. Once a dense disparity map has been computed from a stereo pair of images, one often needs to calculate some local differential properties of the corresponding 3-D surface such as orientation or curvatures. The usual approach is to build a 3-D reconstruction of the surface(s) from which all shape properties will then be derived without ever going back to the original images. In this paper, we depart from this paradigm and propose to use the images directly to compute the shape properties. We thus propose a new method extending the classical correlation method to estimate accurately both the disparity and its derivatives directly from the image data. We then relate those derivatives to differential properties of the surface such as orientation and curvatures.

We present the results of the reconstruction and of the estimation of the surface orientation and curvatures on some stereo pairs of real images.
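To make the idea of estimating the disparity and its derivatives together more concrete, here is a minimal 1-D sketch along a single scanline. It is not the algorithm from the paper: it assumes a rectified pair, a disparity that varies linearly inside the correlation window, and a brute-force search over candidate values. The function names (`ssd_affine`, `best_affine_disparity`) and the candidate grids are made up for illustration.

```python
import numpy as np

def ssd_affine(left, right, x, half, d0, d1):
    """Sum of squared differences between the left window centred at x and
    the right scanline sampled with a disparity d0 + d1 * u that varies
    linearly with the offset u inside the window (illustrative model)."""
    u = np.arange(-half, half + 1)
    # A left pixel at x + u is assumed to match the right pixel at x + u - d(u).
    xs = x + u - (d0 + d1 * u)
    right_vals = np.interp(xs, np.arange(right.size), right)
    return float(np.sum((left[x + u] - right_vals) ** 2))

def best_affine_disparity(left, right, x, half, d0_candidates, d1_candidates):
    """Exhaustive search for the (disparity, disparity derivative) pair
    with the lowest SSD score; returns (d0, d1)."""
    best = min(
        (ssd_affine(left, right, x, half, d0, d1), d0, d1)
        for d0 in d0_candidates
        for d1 in d1_candidates
    )
    return best[1], best[2]
```

Setting `d1_candidates` to `[0.0]` reduces this to a classical fixed-window search, which is a convenient way to see how much a slanted surface penalizes the constant-disparity assumption.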

This paper actually made it into CVPR '94, but here's a full-length version (728K) that was released as an INRIA technical report.

An Example: Hervé's face

This example demonstrates how accurate stereo vision can be on real images (and live objects). From a single stereo pair (left and right images), we recover the 3-D geometry of the face both as range and orientation data.

The first important step was calibration. Since monocular calibration methods were not accurate enough for our enhanced correlation method to work properly, we used Zhengyou Zhang's robust method to estimate the fundamental matrix (which enables a projective reconstruction of the scene), and then recovered the Euclidean geometry, i.e. the projection matrices, using a calibration grid.
Here's a VRML model (490K) of the reconstructed head.
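As a reminder of how the fundamental matrix and the projection matrices mentioned above relate, here is a small sketch that builds F from two known 3x4 projection matrices using the textbook formula F = [e2]x P2 P1^+ (e2 is the epipole in the second image, P1^+ the pseudo-inverse of P1). This goes in the opposite direction from what was done for the face: there, F was estimated from image matches with Zhang's robust method, which is not reproduced here. The helper names are hypothetical.

```python
import numpy as np

def skew(v):
    """3x3 cross-product matrix, so that skew(v) @ w == np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def fundamental_from_projections(P1, P2):
    """Fundamental matrix F such that x2^T F x1 = 0 for matching
    homogeneous image points x1 (view 1) and x2 (view 2)."""
    # The optical centre of camera 1 is the null vector of P1.
    _, _, vt = np.linalg.svd(P1)
    C1 = vt[-1]
    # Its image in camera 2 is the epipole e2.
    e2 = P2 @ C1
    F = skew(e2) @ P2 @ np.linalg.pinv(P1)
    # F is only defined up to scale; normalise for convenience.
    return F / np.linalg.norm(F)
```

Projecting any 3-D point with both matrices and evaluating `x2 @ F @ x1` should give a value close to zero, which is a quick sanity check on a pair of calibrated projection matrices.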

This (232K) is a small MPEG-1 animation showing the 3-D reconstruction (look at those false matches floating around and those holes in the face where correlation failed), and here (852K) is another one showing a subsampling of the field of normals obtained directly from image intensity data.
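For readers wondering how a normal can come out of disparity derivatives without building a 3-D model first, here is a minimal sketch for the simplest possible setup. It assumes an ideal rectified pinhole pair with focal length `f` (in pixels), principal point `(cx, cy)`, and the convention d = f B / Z for baseline B; under those assumptions a 3-D plane maps to a plane in (u, v, d) space, and the normal follows from the disparity and its first derivatives. The general calibrated case treated in the paper is more involved, and the function name is made up for illustration.

```python
import numpy as np

def normal_from_disparity(u, v, d, d_u, d_v, f, cx, cy):
    """Unit surface normal (camera frame, sign not fixed) at pixel (u, v)
    from the disparity d and its image derivatives d_u, d_v.
    Assumes a rectified pinhole pair with d = f * B / Z (illustrative)."""
    n = np.array([
        f * d_u,
        f * d_v,
        d - (u - cx) * d_u - (v - cy) * d_v,
    ])
    return n / np.linalg.norm(n)
```

Note that the baseline does not appear: it only scales the disparity and its derivatives by the same factor, which cancels in the normalisation.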

More examples

Images were taken using our monocular stereo system. The first reconstruction was computed using a classical stereo algorithm, the second one using our enhanced correlation method. The reconstruction was subsampled in both directions before building the VRML model. For better viewing with VRweb, turn off lighting calculations on texturing, and switch between "Smooth shading" and "Texturing" display.

Richard Szeliski: Stereo pair, Reconstruction 1, Reconstruction 2 (subsampling is 2x2). These are not as good as Hervé's reconstruction, mainly because the face is smaller in the image. Thanks to Veit Schenk for the pictures.
A close-up of my face, Stereo pair, Reconstruction 1, Reconstruction 2 (subsampling is 4x4).

I would really like to see the results of other stereo algorithms on the same pair of images, so please send me an email if you have something interesting.

Frederic Devernay