UniVoxel: A Novel Framework for 3-D Object Detection in Autonomous Vehicles With Multimodal Voxel Representation

Kaiqi Liu, Yuanyuan Deng, Jiaxun Tong*, Wei Li

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-reviewed

Abstract

Fusing camera and LiDAR information is an effective way to achieve robust 3-D object detection. However, current 3-D multimodal methods typically rely on independent branches to extract features from each sensor separately, which leaves complementary information underutilized. In this article, a multimodal detector named UniVoxel is proposed, built on a query-based detection paradigm. UniVoxel integrates inputs from the different modalities into a voxel representation for fusion. Specifically, a semantic-guided query generator (SQG) is proposed, in which low-level voxel features are used to adaptively sample multiscale image features, producing unified multimodal voxel features. These multimodal voxel features contain both the geometric and semantic information of the voxels and ensure that the model focuses on the regions of interest (RoIs). Meanwhile, to maximize the use of complementary information, a fusion voxel encoder (FVE) is introduced to update the multimodal voxels through interaction with the multiscale semantic information from different cameras. Extensive experiments are conducted on the nuScenes dataset. With the proposed framework, object detection precision improves on both the validation set and the test set.

Original language: English
Pages (from-to): 33142-33152
Number of pages: 11
Journal: IEEE Sensors Journal
Volume: 25
Issue number: 17
Publication status: Published - 2025
Externally published
