gitaskhub

Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.

Stars · 10,507
Language · Python
License · Apache-2.0
Ask anything about this repo to start.
Full explanation on explaingit →

By chatting or signing in you agree to the Terms and chat-message logging (revocable in History).