3D Instance Segmentation using Deep Learning

Abstract

Endowing machines with the ability to perceive the real world in three dimensions, as humans do, is a fundamental and longstanding topic in Artificial Intelligence. One important goal is to understand the geometric structure and semantics of a 3D environment from various types of visual input, such as images or point clouds acquired by 2D/3D sensors. Traditional approaches usually rely on handcrafted features to estimate the shape and semantics of objects or scenes. However, they struggle with visual occlusions and generalize poorly to novel objects and scenarios. In contrast, deep neural networks trained on large-scale real-world 3D data aim to learn general and robust representations for understanding scenes and the objects within them. To achieve these aims, from object-level shape estimation from single or multiple views to scene-level semantic understanding, this research makes three key contributions. In Chapter 3, we start by estimating the full 2D shape of a small, detailed defect, treated as an object, from a single image. To recover a dense 2D detection with geometric details, we propose an architecture with a bounding box that learns feasible geometric priors from small-scale 2D defect repositories. In Chapter 4, we extend our study to 3D instance segmentation, which is used to detect multiple objects in an indoor environment with an RGB-D sensor. First, Mask R-CNN is applied to the RGB images captured by the sensor to perform 2D instance segmentation. The segmented object regions are then combined with the sensor's depth image to produce a segmented depth region for each individual object. Finally, the depth points are converted to 3D coordinates, which yields the 3D instance segmentation. Experimental results show that the proposed algorithm performs well in tests with zoomed-in and zoomed-out views of the scene. As a result, the proposed 3D instance segmentation algorithm can be applied to an intelligent robot to enhance its cognitive capability in the real world.
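
The last step of the Chapter 4 pipeline, turning the depth pixels inside one 2D instance mask into 3D coordinates, can be illustrated with a minimal sketch. The abstract does not state the camera model, so this assumes a standard pinhole model with known intrinsics; the function name `masked_depth_to_points` and the parameters `fx`, `fy`, `cx`, `cy`, and `depth_scale` are hypothetical and not taken from the article. The mask stands in for one instance mask as it might come out of Mask R-CNN.

```python
import numpy as np


def masked_depth_to_points(depth, mask, fx, fy, cx, cy, depth_scale=1.0):
    """Back-project the depth pixels of one instance mask to 3D camera coordinates.

    Assumption (not from the article): a pinhole camera with focal lengths
    fx, fy and principal point (cx, cy), all in pixels. depth_scale converts
    raw depth values to the desired unit (for example 0.001 for mm to m).
    """
    depth = np.asarray(depth, dtype=np.float64) * depth_scale
    mask = np.asarray(mask, dtype=bool)

    # Keep only pixels that belong to the instance and have a valid depth;
    # many RGB-D sensors report 0 where no depth could be measured.
    valid = mask & (depth > 0)
    v, u = np.nonzero(valid)  # v: row index, u: column index

    z = depth[v, u]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy

    # One row per pixel: (x, y, z) in the camera frame.
    return np.stack([x, y, z], axis=1)
```

Calling this function once per instance mask would give one point set per detected object. The RGB and depth images are assumed to be registered to each other pixel for pixel, which the sketch does not check.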
