Depth estimation
Monocular depth estimation
Monocular depth estimation is a computer vision task that involves predicting the depth information of a scene from a single image. In other words, it is the process of estimating the distance of objects in a scene from a single camera viewpoint.
Monocular depth estimation has various applications, including 3D reconstruction, augmented reality, autonomous driving, and robotics. It is a challenging task as it requires the model to understand the complex relationships between objects in the scene and the corresponding depth information, which can be affected by factors such as lighting conditions, occlusion, and texture.
In this guide you’ll learn how to:
create a depth estimation pipeline
run depth estimation inference by hand
Before you begin, make sure you have all the necessary libraries installed:
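A minimal install sketch, assuming the guide relies on Transformers with PyTorch and Pillow as the image backend (the exact package list is an assumption, since the original command is not shown):

```shell
# Install the libraries used in this guide; -q keeps pip output quiet.
# torch and pillow are assumed dependencies for the model and image handling.
pip install -q transformers torch pillow
```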
Depth estimation pipeline
The simplest way to try out inference with a model supporting depth estimation is to use the corresponding pipeline(). Instantiate a pipeline from a checkpoint on the BOINC AI Hub:
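A sketch of instantiating the pipeline. The checkpoint name `Intel/dpt-large` is an example choice, not necessarily the one the original guide used; any checkpoint tagged for depth estimation can be substituted:

```python
from transformers import pipeline

# "Intel/dpt-large" is one example depth-estimation checkpoint;
# swap in any other checkpoint that supports the task.
checkpoint = "Intel/dpt-large"
depth_estimator = pipeline("depth-estimation", model=checkpoint)
```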
Next, choose an image to analyze:
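For example, you can load a test image from a URL with Pillow. The COCO image URL below is a stand-in for whatever photo the original guide used; substitute any RGB image:

```python
import requests
from PIL import Image

# A publicly hosted example image (from the COCO validation set);
# replace the URL with any photo you want to analyze.
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
```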
Pass the image to the pipeline.
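A self-contained sketch of this step (recreating the pipeline and example image from the previous steps so the snippet runs on its own; the checkpoint and image URL are example choices):

```python
import requests
from PIL import Image
from transformers import pipeline

depth_estimator = pipeline("depth-estimation", model="Intel/dpt-large")
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# The pipeline accepts a PIL image directly.
predictions = depth_estimator(image)
```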
The pipeline returns a dictionary with two entries. The first one, called predicted_depth, is a tensor whose values give the depth in meters for each pixel. The second one, depth, is a PIL image that visualizes the depth estimation result.
Let’s take a look at the visualized result:
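Since the depth entry is already a PIL image, you can display it in a notebook or save it to disk. A self-contained sketch (checkpoint and image URL are example choices carried over from the earlier steps):

```python
import requests
from PIL import Image
from transformers import pipeline

depth_estimator = pipeline("depth-estimation", model="Intel/dpt-large")
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
predictions = depth_estimator(image)

# "depth" is a ready-made PIL visualization; save it (or call .show()
# to open it in an image viewer).
predictions["depth"].save("depth_visualization.png")
```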
Depth estimation inference by hand
Now that you’ve seen how to use the depth estimation pipeline, let’s see how we can replicate the same result by hand.
Start by loading the model and associated processor from a checkpoint on the BOINC AI Hub. Here we’ll use the same checkpoint as before:
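A sketch using the Auto classes, again assuming the `Intel/dpt-large` example checkpoint:

```python
from transformers import AutoImageProcessor, AutoModelForDepthEstimation

# Same example checkpoint as in the pipeline section.
checkpoint = "Intel/dpt-large"
image_processor = AutoImageProcessor.from_pretrained(checkpoint)
model = AutoModelForDepthEstimation.from_pretrained(checkpoint)
```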
Prepare the image input for the model using the image_processor, which takes care of the necessary image transformations such as resizing and normalization:
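A self-contained sketch of the preprocessing step (the checkpoint and image URL are example choices):

```python
import requests
from PIL import Image
from transformers import AutoImageProcessor

image_processor = AutoImageProcessor.from_pretrained("Intel/dpt-large")
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# The processor resizes and normalizes the image and returns a
# PyTorch tensor of shape (batch, channels, height, width).
pixel_values = image_processor(images=image, return_tensors="pt").pixel_values
```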
Pass the prepared inputs through the model:
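A self-contained sketch of the forward pass, repeating the loading steps above so it runs on its own (checkpoint and image URL remain example choices):

```python
import requests
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForDepthEstimation

checkpoint = "Intel/dpt-large"
image_processor = AutoImageProcessor.from_pretrained(checkpoint)
model = AutoModelForDepthEstimation.from_pretrained(checkpoint)
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
pixel_values = image_processor(images=image, return_tensors="pt").pixel_values

# Inference only, so gradients are unnecessary.
with torch.no_grad():
    outputs = model(pixel_values)
    predicted_depth = outputs.predicted_depth
```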
Visualize the results:
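One way to visualize the raw output is to upsample it back to the original image size, rescale it to the 0-255 range, and convert it to an 8-bit grayscale image. A self-contained sketch under the same example assumptions as the previous steps:

```python
import numpy as np
import requests
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForDepthEstimation

checkpoint = "Intel/dpt-large"
image_processor = AutoImageProcessor.from_pretrained(checkpoint)
model = AutoModelForDepthEstimation.from_pretrained(checkpoint)
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
pixel_values = image_processor(images=image, return_tensors="pt").pixel_values

with torch.no_grad():
    predicted_depth = model(pixel_values).predicted_depth

# Upsample the raw prediction back to the original image size.
# PIL's size is (width, height), so reverse it for (height, width).
prediction = torch.nn.functional.interpolate(
    predicted_depth.unsqueeze(1),
    size=image.size[::-1],
    mode="bicubic",
    align_corners=False,
).squeeze()

# Scale to 0-255 and convert to an 8-bit grayscale image.
output = prediction.numpy()
formatted = (output * 255 / np.max(output)).astype("uint8")
depth = Image.fromarray(formatted)
```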