Closed
Description
In the current design, rai_tts and rai_asr share a sounddevice connector that is implemented in rai_core. rai_core should not depend on sounddevice (or any other SDConnector dependency) by default, so merging the two packages into one that owns the connector improves dependency separation.
The tts and asr packages were originally separated because of their differing model dependencies. To keep the package size small, models should be optional: the user specifies which models to install, e.g.:
pip install "rai_s2s[faster_whisper,kokoro_tts]"
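One way to keep the model backends optional at runtime is to import them lazily and fail with an actionable message when the matching extra is missing. The sketch below is illustrative only: `require_backend` and `EXTRA_TO_MODULE` are hypothetical names, and the module name mapped to each extra is an assumption rather than the package's actual layout.

```python
import importlib
import importlib.util
from types import ModuleType
from typing import Mapping

# Hypothetical mapping from an extras name to the importable module it provides.
# The module names on the right are assumptions made for illustration.
EXTRA_TO_MODULE = {
    "faster_whisper": "faster_whisper",
    "kokoro_tts": "kokoro",
}


def require_backend(
    extra: str, mapping: Mapping[str, str] = EXTRA_TO_MODULE
) -> ModuleType:
    """Import the module behind an optional extra, or explain how to install it."""
    if extra not in mapping:
        raise ValueError(f"Unknown extra: {extra!r}")
    module_name = mapping[extra]
    # find_spec returns None when a top-level module cannot be located.
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(
            f"Backend {module_name!r} is not installed. "
            f'Install it with: pip install "rai_s2s[{extra}]"'
        )
    return importlib.import_module(module_name)
```

An agent could call `require_backend("faster_whisper")` only when that model is actually selected, so that a base install without extras still imports cleanly.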
Refactors:
- refactor rai_asr and rai_tts into one package, rai_s2s, sharing the sounddevice connector
- rai_s2s should install sounddevice by default
- move sounddevice connector from rai_core to rai_s2s
Agent implementation:
- ASRAgent
- TTSAgent
- S2SAgent using bidirectional sounddevice stream, compatible with ReActAgent
Docs:
- README.md
- S2SAgent docs, including compatibility with ReActAgent and any limitations (if it is bound to ROS 2)
- ASRAgent docs
- TTSAgent docs
Misc:
- optionally (if feasible): the agents should not depend directly on connectors but expose an API that is compatible with them (see the example snippet below)
- resolve "Speech Recognition" and "Text to speech" unavailable in Streamlit configurator despite installed s2s #563
from abc import ABC, abstractmethod

from rai.communication import HRIMessage
from rai.communication.ros2 import ROS2HRIConnector, ROS2HRIMessage


class S2SAgent(ABC):
    def __init__(self, **audio_kwargs):
        pass

    def tts_callback(self, message: HRIMessage):
        # process input
        pass

    @abstractmethod
    def send(self, message: HRIMessage):
        # method implemented by subclass with concrete connectors
        pass


class ROS2S2SAgent(S2SAgent):
    def __init__(self, from_human: str, to_human: str, **audio_kwargs):
        super().__init__(**audio_kwargs)
        self.in_topic = from_human
        self.out_topic = to_human
        self.connector = ROS2HRIConnector()
        # route incoming messages on the input topic to the TTS callback
        self.connector.register_callback(
            callback=self.tts_callback, source=self.in_topic
        )

    def send(self, message: HRIMessage):
        # convert the generic message into its ROS 2 counterpart before sending
        msg = ROS2HRIMessage(
            text=message.text,
            images=message.images,
            audios=message.audios,
            communication_id=message.communication_id,
            seq_no=message.seq_no,
            seq_end=message.seq_end,
        )
        self.connector.send_message(target=self.out_topic, message=msg)
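A practical consequence of this split is that the base agent can be exercised without ROS 2 or any connector installed. The following is a minimal, self-contained sketch of the same pattern. `Message`, `BaseSpeechAgent`, `InMemoryAgent`, and `handle` are hypothetical stand-ins, not part of rai, and the uppercase transform is only a placeholder for real speech processing.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List


@dataclass
class Message:
    """Hypothetical stand-in for HRIMessage, reduced to a text field."""

    text: str


class BaseSpeechAgent(ABC):
    """Holds the processing logic and knows nothing about any connector."""

    def handle(self, message: Message) -> None:
        # Placeholder processing step; a real agent would run ASR/TTS here.
        reply = Message(text=message.text.strip().upper())
        self.send(reply)

    @abstractmethod
    def send(self, message: Message) -> None:
        """Deliver a message; implemented by transport-specific subclasses."""


class InMemoryAgent(BaseSpeechAgent):
    """Transport-free subclass that records outgoing messages in a list."""

    def __init__(self) -> None:
        self.sent: List[Message] = []

    def send(self, message: Message) -> None:
        self.sent.append(message)


agent = InMemoryAgent()
agent.handle(Message(text="  hello "))
print(agent.sent[0].text)  # prints "HELLO"
```

Because only the subclass touches the transport, a ROS 2 variant and an in-memory variant can share the same processing code, and the in-memory one can serve as a test double.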