S3M: Semantic Segmentation Sparse Mapping for UAVs with RGB-D Camera

Thanh Nguyen Canh

Van-Truong Nguyen

Xiem HoangVan

Armagan Elibol

Nak Young Chong

School of Information Science, JAIST, Japan
VNU University of Engineering and Technology, Vietnam
SII, 2024.

[Paper]

[Code]

This paper presents a novel approach to address challenges in semantic information extraction and utilization within UAV operations. Our system integrates state-of-the-art visual SLAM to estimate a comprehensive 6-DoF pose and advanced object segmentation methods at the back end. To improve the computational and storage efficiency of the framework, we adopt OctoMap, a streamlined voxel-based 3D map representation, to build a working system. Furthermore, a fusion algorithm is incorporated to associate the semantic information of each frame from the front-end SLAM task with the corresponding points. By leveraging semantic information, our framework enhances the UAV's ability to perceive and navigate through indoor spaces, addressing challenges in pose estimation accuracy and uncertainty reduction. We validate the efficacy of the proposed system in Gazebo simulations and successfully embed our approach into a Jetson Xavier AGX unit for real-world applications.

Paper

Thanh Nguyen Canh, Van-Truong Nguyen, Xiem HoangVan, Armagan Elibol, Nak Young Chong

S3M: Semantic Segmentation Sparse Mapping for UAVs with RGB-D Camera

SII 2024.

[pdf]

Overview and Results

Overview

Proposed S3M SLAM Architecture: The system is composed of three units: full 6-DoF pose estimation of the drone, a 3D semantic segmentation branch, and a semantic fusion scheme.

3D Pose Estimation.

Structure and method for semantic extraction

To fuse semantic information from multiple views, we introduce a semantic fusion scheme.
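As a rough illustration of what fusing per-frame labels from multiple views into a voxel map can look like, here is a minimal sketch. It is not the paper's fusion scheme: it assumes a simple confidence-weighted vote per voxel, and the class `SemanticVoxelMap`, its methods, the label strings, and the 0.1 m resolution are all hypothetical names and values chosen for the example. A plain dictionary stands in for the OctoMap data structure.

```python
import math
from collections import defaultdict


class SemanticVoxelMap:
    """Hypothetical voxel grid that accumulates label votes across views."""

    def __init__(self, resolution=0.1):
        if resolution <= 0:
            raise ValueError("resolution must be positive")
        self.resolution = resolution  # voxel edge length in metres (assumed)
        # voxel index -> {label: accumulated confidence}
        self.scores = defaultdict(lambda: defaultdict(float))

    def voxel_key(self, point):
        # Map a 3D point (x, y, z) to the integer index of its voxel.
        return tuple(math.floor(c / self.resolution) for c in point)

    def integrate(self, points, labels, confidences=None):
        # Add one view: each labelled 3D point votes for its label
        # in the voxel it falls into, weighted by its confidence.
        if confidences is None:
            confidences = [1.0] * len(points)
        for point, label, conf in zip(points, labels, confidences):
            self.scores[self.voxel_key(point)][label] += conf

    def label_at(self, point):
        # Return the label with the highest accumulated score, or None
        # if no observation has fallen into this voxel yet.
        votes = self.scores.get(self.voxel_key(point))
        if not votes:
            return None
        return max(votes, key=votes.get)


# Three views observe the same voxel with conflicting labels.
fused = SemanticVoxelMap(resolution=0.1)
fused.integrate([(0.02, 0.03, 0.04)], ["chair"], [0.6])
fused.integrate([(0.05, 0.05, 0.05)], ["table"], [0.9])
fused.integrate([(0.01, 0.01, 0.01)], ["chair"], [0.5])
fused_label = fused.label_at((0.03, 0.03, 0.03))
```

In this example the two weaker "chair" observations together outweigh the single "table" observation, which is the point of accumulating evidence over views rather than trusting any single frame.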

Experiments

Comparison of the trajectories of ORB-SLAM2, our system, and the ground truth in the X-Y and X-Z planes.

Comparison of Relative Pose Error (RPE) between ORB-SLAM2 and our system.
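For readers unfamiliar with the metric, the sketch below computes the translational RPE as it is commonly defined: the relative motion over a fixed frame interval is compared between the ground-truth and estimated trajectories, and the translational parts of the residuals are summarized as an RMSE. This is an illustrative assumption about the metric's form, not the evaluation script used for the results here. Poses are assumed to be 4x4 homogeneous rigid transforms stored as nested lists, and the function names and the `delta` parameter are hypothetical.

```python
import math


def mat_mul(a, b):
    # Product of two 4x4 matrices stored as nested lists.
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def rigid_inverse(t):
    # Inverse of a rigid transform [R t; 0 1] is [R^T, -R^T t; 0 1].
    rot_t = [[t[j][i] for j in range(3)] for i in range(3)]
    trans = [-sum(rot_t[i][k] * t[k][3] for k in range(3)) for i in range(3)]
    return [rot_t[i] + [trans[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def translational_rpe_rmse(gt_poses, est_poses, delta=1):
    # RMSE of the translational relative pose error over pose pairs
    # that are `delta` frames apart.
    if len(gt_poses) != len(est_poses):
        raise ValueError("trajectories must have the same length")
    errors = []
    for i in range(len(gt_poses) - delta):
        gt_rel = mat_mul(rigid_inverse(gt_poses[i]), gt_poses[i + delta])
        est_rel = mat_mul(rigid_inverse(est_poses[i]), est_poses[i + delta])
        residual = mat_mul(rigid_inverse(gt_rel), est_rel)
        errors.append(math.sqrt(sum(residual[k][3] ** 2 for k in range(3))))
    if not errors:
        raise ValueError("need more than `delta` poses")
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def translation(x, y, z):
    # Helper for the example: a pure translation with identity rotation.
    return [[1.0, 0.0, 0.0, x],
            [0.0, 1.0, 0.0, y],
            [0.0, 0.0, 1.0, z],
            [0.0, 0.0, 0.0, 1.0]]


# Ground truth moves 1.0 m per frame along X; the estimate moves 1.1 m.
gt = [translation(float(i), 0.0, 0.0) for i in range(3)]
est = [translation(1.1 * i, 0.0, 0.0) for i in range(3)]
rpe = translational_rpe_rmse(gt, est)
```

Because RPE compares relative motions rather than absolute positions, a constant per-frame drift such as the one in this example yields the same error at every step instead of an error that grows along the trajectory.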

3D visual representation of the obtained semantic maps.

Code

[github]

Citation

1. Canh T. N., Nguyen V.-T., HoangVan X., Elibol A., Chong N. Y. S3M: Semantic Segmentation Sparse Mapping for UAVs with RGB-D Camera. IEEE/SICE International Symposium on System Integration (SII), 2024.

@inproceedings{canh2024s3m,
  author    = {Canh, Thanh Nguyen and Nguyen, Van-Truong and HoangVan, Xiem and Elibol, Armagan and Chong, Nak Young},
  title     = {{S3M: Semantic Segmentation Sparse Mapping for UAVs with RGB-D Camera}},
  booktitle = {2024 IEEE/SICE International Symposium on System Integration (SII)},
  year      = {2024},
  address   = {Vietnam},
  month     = {Jan},
  doi       = {10.1109/SII58957.2024.10417379}
}

Acknowledgements

We gratefully acknowledge support from the Asian Office of Aerospace Research and Development under Grant/Cooperative Agreement Award No. FA2386-22-1-4042.
This webpage template was borrowed from https://akanazawa.github.io/cmr/.