Hier-SLAM: Scaling-up Semantics in SLAM with a Hierarchically Categorical Gaussian Splatting (ICRA'25)
Authors: Boying Li, Zhixi Cai, Yuan-Fang Li, Ian Reid, and Hamid Rezatofighi
[Paper] | [Video]
We propose Hier-SLAM: an LLM-assisted, fast, semantic 3D Gaussian Splatting SLAM method featuring a novel hierarchical categorical representation, which enables accurate global 3D semantic mapping, scaling-up capability, and explicit semantic prediction in the 3D world.
Clone the repository and set up the Conda environment:
git clone https://github.com/LeeBY68/Hier-SLAM.git
cd Hier-SLAM
conda create -n hierslam python=3.10
conda activate hierslam
conda install gcc=10 gxx=10 -c conda-forge
conda install -c "nvidia/label/cuda-11.6.0" cuda-toolkit
conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.6 -c pytorch -c conda-forge
pip install -r requirements.txt
# Compile the Semantic-3DGS:
cd hierslam-diff-gaussian-rasterization-w-depth
pip install ./
The Replica dataset is a synthetic indoor dataset. Our method uses the same sequences provided by previous works, including NICE-SLAM and iMAP (the same RGB and depth sequences with the same trajectories), to ensure a fair comparison with visual SLAM methods.
Since these sequences do not include per-frame semantic ground truth, we rendered and generated the semantic ground truth from the synthetic Replica dataset.
- To automatically download the Replica RGBD sequences, run the following script to download the data originally generated via NICE-SLAM:
bash bash_scripts/download_replica.sh
- Download the corresponding per-frame semantic ground truth we rendered from the following link: Replica_Semantic_Tree
- The generated hierarchical tree file info_semantic_tree.json is located under the Replica directory. The tree is created from the entire set of semantic classes in the Replica dataset (info_semantic.json, provided by the official Replica release). Copy info_semantic_tree.json into each sequence folder.
- After downloading the RGB, depth, poses, semantic ground truth, and the tree file, the final directory structure for Replica should be as follows:
[Final Replica Structure]
DATAROOT
└── Replica
    └── room0
        ├── results
        │   ├── depth000000.png
        │   ├── depth000001.png
        │   ├── ...
        │   ├── frame000000.png
        │   ├── frame000001.png
        │   └── ...
        ├── semantic_class
        │   ├── semantic_class_0.png
        │   ├── semantic_class_1.png
        │   └── ...
        ├── traj.txt
        └── info_semantic_tree.json
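Before launching a run, it can help to confirm a sequence folder matches this layout. Below is a minimal sketch of such a check; the helper name `missing_entries` is hypothetical and not part of the repo:

```python
import tempfile
from pathlib import Path

# Hypothetical helper (not part of the repo): verify that a Replica
# sequence folder contains the entries the layout above expects.
REQUIRED = ["results", "semantic_class", "traj.txt", "info_semantic_tree.json"]

def missing_entries(seq_dir):
    """Return the required entries absent from one sequence folder."""
    seq = Path(seq_dir)
    return [name for name in REQUIRED if not (seq / name).exists()]

# Demo on an empty temporary folder: everything is reported missing.
demo = tempfile.mkdtemp()
print(missing_entries(demo))
```

Running it against `DATAROOT/Replica/room0` instead of the demo folder should return an empty list once the downloads above are in place.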
The ScanNet dataset is a real-world RGB-D video dataset.
- To use it, follow the official data download procedure provided on the ScanNet website. After downloading, extract the color and depth frames from the .sens files using the provided reader script.
- For semantic labels, we use the label-filt files due to their higher quality:
unzip [seq_name]_2d-label-filt.zip
- We provide the generated hierarchical tree files for ScanNet, all generated from the semantic classes in the ScanNet dataset (scannetv2-labels.combined.tsv, provided by the official ScanNet release):
  - scannetv2-labels.combined.tree.tsv: a tree generated from the NYU40 semantic classes.
  - scannetv2-labels.combined.tree-large.tsv: a large tree generated from the full set of original ScanNet semantic classes, covering up to 550 unique labels derived from the 'id' and 'category' columns in scannetv2-labels.combined.tsv.
You can download both hierarchical tree files from the following link: ScanNet_Tree
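The tree files map each raw ScanNet label id to a path through the class hierarchy. A minimal loading sketch is below; the exact column names of the tree TSVs are an assumption (here `id` plus a slash-separated `tree` column), so adapt them to the real header:

```python
import csv, io

# Sketch of loading a hierarchical label file. The real
# scannetv2-labels.combined.tree.tsv columns may differ; we assume an
# 'id' column plus one slash-separated hierarchy string per row.
SAMPLE_TSV = """id\tcategory\ttree
3\tchair\tfurniture/seating/chair
7\ttable\tfurniture/surface/table
"""

def load_tree(tsv_text):
    """Map raw label id -> list of hierarchy levels, coarse to fine."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {int(row["id"]): row["tree"].split("/") for row in reader}

tree = load_tree(SAMPLE_TSV)
print(tree[3])  # ['furniture', 'seating', 'chair']
```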
- After downloading the RGB, depth, poses, semantics, and the tree files, the final directory structure for ScanNet should be as follows:
[Final ScanNet Structure]
DATAROOT
└── scannet
    └── scene0000_00
        ├── color
        │   ├── 0.jpg
        │   ├── 1.jpg
        │   └── ...
        ├── depth
        │   ├── 0.png
        │   ├── 1.png
        │   └── ...
        ├── label-filt
        │   ├── 0.png
        │   ├── 1.png
        │   └── ...
        ├── intrinsic
        ├── pose
        │   ├── 0.txt
        │   ├── 1.txt
        │   └── ...
        ├── scannetv2-labels.combined.tree.tsv
        └── scannetv2-labels.combined.tree-large.tsv
To run Hier-SLAM on the Replica dataset with the default hierarchical semantic setting, use:
python scripts/hierslam.py configs/replica/hierslam_semantic_run.py
You can also try different configurations:
To run Hier-SLAM without semantics (visual-only Hier-SLAM) on Replica, use:
python scripts/hierslam.py configs/replica/hierslam_nosemantic_run.py
To run Hier-SLAM with flat (one-hot) semantic encoding:
- Modify the number of semantic categories in the CUDA config file:
// In hierslam-diff-gaussian-rasterization-w-depth/cuda_rasterizer/config.h
#define NUM_SEMANTIC 102
- Reinstall the CUDA extension:
conda activate hierslam
cd hierslam-diff-gaussian-rasterization-w-depth
pip install ./
cd ..
- Run the following command:
python scripts/hierslam.py configs/replica/hierslam_semantic_flat_run.py
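The flat setting above gives each of the 102 Replica classes its own channel, which is why NUM_SEMANTIC jumps to 102. The hierarchical representation instead concatenates a small one-hot per tree level, so the channel count grows with tree depth and branching rather than with the number of classes. A sketch of the difference (the level sizes are hypothetical, not the paper's actual tree):

```python
# Illustration (not the repo's implementation): flat one-hot vs. a
# hierarchical code that concatenates one small one-hot per tree level.
NUM_CLASSES = 102            # flat: one channel per class
LEVEL_SIZES = [8, 8, 8]      # hypothetical tree: 3 levels, 8 branches each

def flat_encode(class_id, num_classes=NUM_CLASSES):
    vec = [0] * num_classes
    vec[class_id] = 1
    return vec

def hier_encode(path, level_sizes=LEVEL_SIZES):
    """path = branch index chosen at each tree level, coarse to fine."""
    vec = []
    for idx, size in zip(path, level_sizes):
        one_hot = [0] * size
        one_hot[idx] = 1
        vec += one_hot
    return vec

print(len(flat_encode(5)))          # 102 channels for 102 classes
print(len(hier_encode([1, 3, 6])))  # 24 channels, yet 8*8*8 = 512 codes
```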
To run Hier-SLAM on the ScanNet dataset with the default hierarchical semantic setting:
- Modify the number of semantic categories in the CUDA config file:
// In hierslam-diff-gaussian-rasterization-w-depth/cuda_rasterizer/config.h
#define NUM_SEMANTIC 16
- Reinstall the CUDA extension:
conda activate hierslam
cd hierslam-diff-gaussian-rasterization-w-depth
pip install ./
cd ..
- Run the following command:
python scripts/hierslam.py configs/scannet/hierslam_semantic_run.py
You can also try different configurations:
To run Hier-SLAM without semantics (visual-only Hier-SLAM) on ScanNet, use:
python scripts/hierslam.py configs/scannet/hierslam_nosemantic_run.py
To run Hier-SLAM with the scaling-up semantic encoding:
- Modify the number of semantic categories in the CUDA config file:
// In hierslam-diff-gaussian-rasterization-w-depth/cuda_rasterizer/config.h
#define NUM_SEMANTIC 74
- Reinstall the CUDA extension:
conda activate hierslam
cd hierslam-diff-gaussian-rasterization-w-depth
pip install ./
cd ..
- Run the following command:
python scripts/hierslam.py configs/scannet/hierslam_semantic_large_run.py
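The scaling-up run covers up to 550 ScanNet labels with only 74 channels. The intuition: a hierarchical code's channel cost is the sum of per-level widths, while its capacity is their product. The arithmetic below illustrates this; the level widths chosen are hypothetical, not the actual tree's:

```python
from math import prod

# Illustrative arithmetic (hypothetical level widths, not the paper's):
# concatenated per-level one-hots cost sum(widths) channels but can
# distinguish prod(widths) distinct leaf classes.
widths = [10, 32, 32]
channels = sum(widths)    # 74, matching NUM_SEMANTIC above
capacity = prod(widths)   # 10240 distinct codes, far more than 550 labels

print(channels, capacity, capacity >= 550)
```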
Refer to LLM_tree/readme.md for details on tree generation using LLMs.
Once a sequence completes, run the following command to evaluate:
python scripts/eval_novel_view.py configs/replica/hierslam_semantic_run.py
- Subset-classes evaluation: in configs/scannet/hierslam_semantic_run.py, set eval_gt_transfer = True to evaluate only the classes visible in each frame.
- We recommend using the full set of semantic classes (eval_gt_transfer = False) for a standard semantic evaluation. The subset option is provided to maintain consistency with previous works and ensure fair comparisons.
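The two modes differ only in which classes enter the mean IoU. A toy sketch of that difference (hypothetical helper functions, not the repo's evaluation code):

```python
# Sketch of the two evaluation modes: mean IoU over a fixed class list
# vs. only over classes present in each frame's ground truth (the
# eval_gt_transfer = True behaviour). Not the repo's actual eval code.
def per_class_iou(pred, gt, cls):
    inter = sum(1 for p, g in zip(pred, gt) if p == cls and g == cls)
    union = sum(1 for p, g in zip(pred, gt) if p == cls or g == cls)
    return inter / union if union else None  # class absent everywhere

def mean_iou(pred, gt, classes, subset_only=False):
    if subset_only:  # keep only classes visible in this frame's gt
        classes = [c for c in classes if c in set(gt)]
    ious = [per_class_iou(pred, gt, c) for c in classes]
    ious = [i for i in ious if i is not None]
    return sum(ious) / len(ious)

gt   = [0, 0, 1, 1]   # toy per-pixel labels for one frame
pred = [0, 1, 1, 1]
print(mean_iou(pred, gt, classes=[0, 1, 2], subset_only=True))
```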
To export the reconstructed global 3D semantic world to a .PLY file, run:
python scripts/export_ply_semantic_tree.py configs/replica/hierslam_semantic_run.py
We recommend using MeshLab or Blender to visualize the resulting PLY files.
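For reference, the exported format is a standard point cloud with per-vertex colours. A minimal ASCII PLY writer in that spirit (a sketch with a hypothetical two-class palette, not the repo's exporter) looks like this:

```python
# Minimal ASCII PLY writer (illustrative sketch, not the repo's
# export_ply_semantic_tree.py): dump points with per-vertex colours
# from a semantic colour map so the file opens in MeshLab or Blender.
def write_semantic_ply(path, points, labels, palette):
    with open(path, "w") as f:
        f.write("ply\nformat ascii 1.0\n")
        f.write(f"element vertex {len(points)}\n")
        f.write("property float x\nproperty float y\nproperty float z\n")
        f.write("property uchar red\nproperty uchar green\nproperty uchar blue\n")
        f.write("end_header\n")
        for (x, y, z), lab in zip(points, labels):
            r, g, b = palette[lab]
            f.write(f"{x} {y} {z} {r} {g} {b}\n")

palette = {0: (255, 0, 0), 1: (0, 255, 0)}   # hypothetical class colours
write_semantic_ply("demo.ply",
                   [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
                   [0, 1], palette)
```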
To visualize the reconstructed semantic map and estimated camera poses, run:
python viz_scripts/online_recon_sem_replica.py configs/replica/hierslam_semantic_run.py --flag_semantic
- Add --flag_semantic to enable semantic visualization.
- Omit --flag_semantic to display the RGB reconstruction instead.
We thank the authors for releasing the code for their awesome works: 3DGS, SplaTAM, GauStudio, Gaussian Grouping, Feature 3DGS, and many other inspiring works in the community.
If you find our work useful, please cite:
@inproceedings{li2025hier,
title={Hier-SLAM: Scaling-up Semantics in SLAM with a Hierarchically Categorical Gaussian Splatting},
author={Li, Boying and Cai, Zhixi and Li, Yuan-Fang and Reid, Ian and Rezatofighi, Hamid},
booktitle={IEEE International Conference on Robotics and Automation (ICRA)},
year={2025}
}