A production-minded FastAPI sidecar for serving Gemma 4 31B on vLLM with Gemma 4 Multi-Token Prediction (MTP) speculative decoding.