Figures: scene segmentation using Fast-SAM (left) and semantic segmentation using PanopticFCN (right).
A framework for real-time (semantic) segmentation of images (video frames), driven by the given parameters and configurations. The main use case of this repository is vS-Graphs, where the camera output seen by the robot is sent to this package to be segmented.
The framework currently covers two main modules: FastSAM for real-time scene segmentation, and PanopticFCN for real-time scene segmentation and semantic object detection.
Here you can see the benchmarking results of the work using different libraries.
Create a new workspace and clone the repo into its `src` folder. If the repository is going to be used with vS-Graphs, it is recommended to clone it into the same `src` folder where vS-Graphs resides, as vS-Graphs depends on it. Accordingly, you can use the command below:
git clone --recurse-submodules git@github.com:snt-arg/scene_segment_ros.git
After cloning the repository, you may add a command like `alias sourcecsr='source ~/workspace/[PATH]/devel/setup.bash'` to your `.bashrc` file.
Install the required Python libraries for running this program using the command below:
pip install -r src/requirements.txt
Download one of the model checkpoints (PanopticFCN or Fast-SAM) from the repository and put it in the `/include` folder. Remember to specify its path in the `config/cfg_[model].yaml` file (e.g., `include/[model].pth`).
Note that the model files must be placed in the `/include` directory: for better integration of the different frameworks, the absolute path of this folder is read to locate a model.
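
As a rough illustration of how a relative entry such as `include/[model].pth` could be turned into an absolute path, here is a minimal sketch. The function name `resolve_model_path` and its arguments are hypothetical and not part of this package; the check that the file stays under `include/` is also an assumption about the intended behavior.

```python
from pathlib import Path


def resolve_model_path(package_root: str, model_path: str) -> Path:
    """Return the absolute path of a model file stored under the package.

    Both arguments are hypothetical names: `package_root` is the package
    directory, `model_path` is the value written in cfg_[model].yaml.
    """
    root = Path(package_root).resolve()
    candidate = (root / model_path).resolve()
    # Assumption: models must live inside <package_root>/include, so reject
    # paths that escape it (e.g. "../other/model.pth").
    include_dir = root / "include"
    if include_dir not in candidate.parents:
        raise ValueError(f"{model_path!r} is not inside {include_dir}")
    return candidate
```

For example, `resolve_model_path("/home/user/ws/src/pkg", "include/model.pth")` would return `/home/user/ws/src/pkg/include/model.pth`, where both paths are placeholders.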
The next step is to install the required frameworks:
- For FastSAM, you need to install the required OpenAI modules using `pip install git+https://github.com/openai/CLIP.git`.
- For PanopticFCN, you need to install Detectron2 using `git clone https://github.com/facebookresearch/detectron2.git` and then `python -m pip install -e detectron2`.
The system has a separate configuration for each of the segmentation libraries, accessible from the `/config` folder. The table below describes these configurations in detail.
Main Category | Parameter | Default | Description |
---|---|---|---|
`image_params` | `image_params` | 640 | width of the input image |
`ros_topics` | `raw_image_topic` | `/img` | raw image topic |
`ros_topics` | `segmented_image_topic` | - | segmented image topic (custom Msg) |
`ros_topics` | `segmented_image_vis` | - | segmented image topic (visualization) |
`model_params` | `model_name` | - | name of the model |
`model_params` | `model_path` | - | path of the model file |
`model_params` | `model_config` | - | path of the model's specific configs |
`model_params` | `point_prompt` | [[0, 0]] | a point for segmentation |
`model_params` | `box_prompt` | [[0, 0, 0, 0]] | boxes for segmentation |
`model_params` | `text_prompt` | - | text prompt (e.g., "a dog") |
`model_params` | `point_label` | [0] | 0: background, 1: foreground |
`model_params` | `iou` | 0.9 | annots filtering threshold |
`model_params` | `conf` | 0.4 | object confidence threshold |
`model_params` | `contour` | False | draw contours |
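
To make the `model_params` defaults more concrete, the sketch below mirrors them in a small Python dataclass and runs a few sanity checks. This is only an illustration: the class `ModelParams` and the function `validate` are hypothetical names, and the checks (thresholds within [0, 1], one label per point, four values per box) are assumptions about sensible values rather than rules enforced by the package.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ModelParams:
    """Defaults copied from the table; '-' entries become None here."""
    model_name: Optional[str] = None
    model_path: Optional[str] = None
    model_config: Optional[str] = None
    point_prompt: List[List[int]] = field(default_factory=lambda: [[0, 0]])
    box_prompt: List[List[int]] = field(default_factory=lambda: [[0, 0, 0, 0]])
    text_prompt: Optional[str] = None
    point_label: List[int] = field(default_factory=lambda: [0])
    iou: float = 0.9
    conf: float = 0.4
    contour: bool = False


def validate(params: ModelParams) -> List[str]:
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    # Assumption: both thresholds are fractions in [0, 1].
    for name in ("iou", "conf"):
        value = getattr(params, name)
        if not 0.0 <= value <= 1.0:
            problems.append(f"{name} must be in [0, 1], got {value}")
    # Assumption: every prompt point carries exactly one label.
    if len(params.point_label) != len(params.point_prompt):
        problems.append("point_label needs one entry per point in point_prompt")
    if any(label not in (0, 1) for label in params.point_label):
        problems.append("point_label entries must be 0 (background) or 1 (foreground)")
    if any(len(box) != 4 for box in params.box_prompt):
        problems.append("each box in box_prompt needs four values")
    return problems
```

Calling `validate(ModelParams())` on the defaults returns an empty list, while a value such as `iou=1.5` is reported as a problem.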
To get only the classes that you want (such as `wall` or `floor`), you need to know the identifier of the class in Detectron2 (link) and set the `category_id` in the configuration file. For instance, the array `[43, 52]` can detect `[floors, walls]` in the scene.
Note: For an exhaustive list of class labels for the COCO panoptic dataset, refer to this.
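
The filtering idea can be sketched as follows. The example assumes the layout of Detectron2's panoptic output, where a per-pixel map of segment ids is accompanied by a list of dictionaries carrying `id` and `category_id`. Plain nested lists stand in for tensors, the function name `keep_categories` is hypothetical, and using 0 as the "removed" value is an assumption. This is not the package's actual implementation.

```python
from typing import Dict, List, Sequence


def keep_categories(
    panoptic_seg: List[List[int]],
    segments_info: Sequence[Dict],
    category_ids: Sequence[int],
) -> List[List[int]]:
    """Zero out every pixel whose segment is not in `category_ids`."""
    wanted = set(category_ids)
    # Segment ids whose class is one of the requested categories.
    kept_segments = {
        seg["id"] for seg in segments_info if seg["category_id"] in wanted
    }
    return [
        [seg_id if seg_id in kept_segments else 0 for seg_id in row]
        for row in panoptic_seg
    ]


# Toy 2x3 "image": segments 1 and 2 belong to the wanted classes, 3 does not.
panoptic_seg = [[1, 1, 2], [3, 3, 2]]
segments_info = [
    {"id": 1, "category_id": 43},
    {"id": 2, "category_id": 52},
    {"id": 3, "category_id": 0},
]
filtered = keep_categories(panoptic_seg, segments_info, [43, 52])
# filtered == [[1, 1, 2], [0, 0, 2]]
```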
You can run the launch files below (accessible from the `/launch` folder):
- YOSO: `roslaunch segmenter_ros segmenter_yoso.launch`
- FastSAM: `roslaunch segmenter_ros segmenter_fastSAM.launch`
- PanopticFCN: `roslaunch segmenter_ros segmenter_pFCN.launch`
- MaskDino: `roslaunch segmenter_ros segmenter_maskDino.launch`
- SegFormer: `roslaunch segmenter_ros segmenter_segFormer.launch`

The package subscribes to the following topic:

Topic | Description |
---|---|
`/vs_graphs/keyframe_image` | for providing input KeyFrames to be segmented (for normal frames, use the `/camera/color/image_raw` topic) |
---- keyFrameId (UInt64) | the ID of the KeyFrame sent to be processed |
---- keyFrameImage (Image) | the content of the KeyFrame image sent to be processed |

The package publishes the following topic:

Topic | Description |
---|---|
`/camera/color/image_segment` | the output semantically segmented topic |
---- keyFrameId (UInt64) | the ID of the processed KeyFrame |
---- segmentedImage (Image) | the segmented image |
---- segmentedImageUncertainty (Image) | the segmented image uncertainty values |
---- segmentedImageProbability (PointCloud2) | the segmented image probability values |
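
Since both the input and the output messages carry a `keyFrameId`, a consumer can pair each segmented image with the KeyFrame it came from. The sketch below shows one possible way to do that in plain Python, without any ROS dependency. The class `KeyFrameMatcher`, its methods, and the `max_pending` limit are hypothetical and only illustrate the idea.

```python
from typing import Any, Dict, Optional, Tuple


class KeyFrameMatcher:
    """Pairs segmentation results with the KeyFrames that produced them."""

    def __init__(self, max_pending: int = 100) -> None:
        self._pending: Dict[int, Any] = {}
        self._max_pending = max_pending

    def on_keyframe(self, key_frame_id: int, image: Any) -> None:
        """Remember a KeyFrame that was sent out for segmentation."""
        if len(self._pending) >= self._max_pending:
            # Drop the oldest entry; dicts preserve insertion order.
            self._pending.pop(next(iter(self._pending)))
        self._pending[key_frame_id] = image

    def on_segmented(
        self, key_frame_id: int, segmented: Any
    ) -> Optional[Tuple[Any, Any]]:
        """Return (original, segmented) if the id is known, else None."""
        if key_frame_id not in self._pending:
            return None
        return self._pending.pop(key_frame_id), segmented
```

In a ROS node, `on_keyframe` would be called from the callback of the input topic and `on_segmented` from the callback of the output topic, using the `keyFrameId` field of each message.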
- Removing unnecessary logs
- A new repository to contain all ROS messages (similar to moveit_msgs)