A modular RL library to fine-tune language models to human preferences