ASD-Conv: Monocular 3D object detection network based on Asymmetrical Segmentation Depth-aware Convolution

Xingyuan Yu (Wuhan University); Neng Du (Wuhan University); Ge Gao (Wuhan University)*; Fan Wen (Wuhan University)


In the field of 3D object recognition, monocular 3D recognition technology is a valuable recognition technology. Compared with binocular technology and lidar technology, its cost is lower. In this paper, based on the existing monocular 3D recognition network, we propose an asymmetrical segmentation depth-aware network: ASD-Conv Network, which is used to better obtain the depth information of monocular images, so as to obtain better recognition results. Compared with other monocular recognition networks, ASD-Conv network performs special segmentation on the image, which can better obtain the depth distribution of the image, and has made a good breakthrough and improvement in the image recognition tasks of 2D, BEV and 3D. The improved algorithm proposed in this paper can improve the detection accuracy while maintaining a certain real-time performance. Experimental results show that compared with the current model, the proposed monocular 3D object detection algorithm based on D-ASDConv has an average improvement rate of 2.82%(AP) in large object detection and the highest average improvement rate of 2.01%(AP) in small object detection on Kitti dataset. The algorithm can effectively learn more advanced features of spatial perception, and the detection results of monocular images are more accurate.