Changelog Update

Apr 9, 2025
Google/Gemini API, API, v2.0-generate
  • Released veo-2.0-generate-001, a generally available (GA) text- and image-to-video model capable of generating detailed and artistically nuanced videos. To learn more, see the Veo docs.
  • Released gemini-2.0-flash-live-001, a public preview version of the Live API model with billing enabled.

    • Enhanced Session Management and Reliability

      • Session Resumption: Keep sessions alive across temporary network disruptions. The API now supports server-side session state storage (for up to 24 hours) and provides handles (session_resumption) to reconnect and resume where you left off.
      • Longer Sessions via Context Compression: Enable extended interactions beyond previous time limits. Configure context window compression with a sliding window mechanism to automatically manage context length, preventing abrupt terminations due to context limits.
      • Graceful Disconnect Notification: Receive a GoAway server message indicating when a connection is about to close, allowing for graceful handling before termination.
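
As a rough illustration of the session features above, the sketch below builds a setup message that opts in to session resumption and sliding-window compression, and tracks the resumption handle and GoAway notice from server messages. The JSON field names (`sessionResumption`, `contextWindowCompression`, `slidingWindow`, `sessionResumptionUpdate`, `newHandle`, `resumable`, `goAway`) are assumptions about the wire format, not confirmed by this changelog, and `build_setup` and `SessionTracker` are hypothetical helpers. No network connection is made.

```python
import json
from typing import Optional

MODEL = "models/gemini-2.0-flash-live-001"


def build_setup(handle: Optional[str] = None, compress: bool = True) -> dict:
    """Build the first client message for a new or resumed session."""
    # Assumed field names: sessionResumption, contextWindowCompression, slidingWindow.
    resumption = {"handle": handle} if handle else {}
    setup = {"model": MODEL, "sessionResumption": resumption}
    if compress:
        # An empty slidingWindow object is assumed to select default compression settings.
        setup["contextWindowCompression"] = {"slidingWindow": {}}
    return {"setup": setup}


class SessionTracker:
    """Remembers the latest resumption handle and whether the server is closing."""

    def __init__(self) -> None:
        self.handle: Optional[str] = None
        self.closing = False

    def on_server_message(self, raw: str) -> None:
        msg = json.loads(raw)
        update = msg.get("sessionResumptionUpdate")  # assumed field name
        if update and update.get("resumable") and update.get("newHandle"):
            self.handle = update["newHandle"]
        if "goAway" in msg:  # assumed JSON key for the GoAway server message
            self.closing = True
```

A client could feed every incoming message to `on_server_message`, and once `closing` is set, reconnect and send `build_setup(tracker.handle)` to pick up where the previous connection left off.
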
    • More Control over Interaction Dynamics

      • Configurable Voice Activity Detection (VAD): Choose sensitivity levels or disable automatic VAD entirely and use new client events (activityStart, activityEnd) for manual turn control.
      • Configurable Interruption Handling: Decide whether user input should interrupt the model's response.
      • Configurable Turn Coverage: Choose whether the API processes all audio and video input continuously or only captures it when the end-user is detected speaking.
      • Configurable Media Resolution: Optimize for quality or token usage by selecting the resolution for input media.
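
The interaction controls above could be combined roughly as follows: a configuration fragment that disables automatic VAD and sets interruption and turn-coverage behavior, plus a generator that brackets audio chunks with activityStart and activityEnd events for manual turn control. Apart from `activityStart` and `activityEnd`, every key and enum string here (`realtimeInputConfig`, `automaticActivityDetection`, `activityHandling`, `turnCoverage`, the `audio` payload shape, and the MIME type) is an assumption, and both helper functions are hypothetical.

```python
import base64
from typing import Iterable, Iterator


def build_realtime_config(manual_vad: bool, allow_interrupt: bool, all_input: bool) -> dict:
    """Return a setup fragment controlling VAD, interruption, and turn coverage."""
    # All keys and enum strings below are assumed, not taken from the changelog.
    return {
        "realtimeInputConfig": {
            "automaticActivityDetection": {"disabled": manual_vad},
            "activityHandling": (
                "START_OF_ACTIVITY_INTERRUPTS" if allow_interrupt else "NO_INTERRUPTION"
            ),
            "turnCoverage": (
                "TURN_INCLUDES_ALL_INPUT" if all_input else "TURN_INCLUDES_ONLY_ACTIVITY"
            ),
        }
    }


def manual_turn(chunks: Iterable[bytes]) -> Iterator[dict]:
    """Yield client messages for one user turn when automatic VAD is disabled."""
    yield {"realtimeInput": {"activityStart": {}}}
    for chunk in chunks:
        data = base64.b64encode(chunk).decode("ascii")
        # Assumed payload shape for a raw 16 kHz PCM audio chunk.
        yield {"realtimeInput": {"audio": {"data": data, "mimeType": "audio/pcm;rate=16000"}}}
    yield {"realtimeInput": {"activityEnd": {}}}
```

With manual VAD, the client decides where a turn begins and ends (for example, from a push-to-talk button), so the start and end events must always be sent as a pair around the audio.
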

    • Richer Output and Features

      • Expanded Voice & Language Options: Choose from two new voices and 30 new languages for audio output. The output language is now configurable within speechConfig.
      • Text Streaming: Receive text responses incrementally as they are generated, enabling faster display to the user.
      • Token Usage Reporting: Gain insights into usage with detailed token counts provided in the usageMetadata field of server messages, broken down by modality and prompt or response phases.
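
To make the token usage reporting concrete, here is a minimal sketch that sums token counts per phase and modality across a stream of already-parsed server messages. Only `usageMetadata` comes from the changelog; the nested keys (`promptTokensDetails`, `responseTokensDetails`, `modality`, `tokenCount`) are assumptions about the message shape, and `tally_usage` is a hypothetical helper.

```python
from collections import Counter
from typing import Iterable

# (phase label, assumed usageMetadata key holding the per-modality breakdown)
PHASES = (("prompt", "promptTokensDetails"), ("response", "responseTokensDetails"))


def tally_usage(messages: Iterable[dict]) -> Counter:
    """Sum token counts keyed by (phase, modality) over parsed server messages."""
    totals: Counter = Counter()
    for msg in messages:
        usage = msg.get("usageMetadata")
        if not usage:
            continue  # not every server message carries usage data
        for phase, key in PHASES:
            for detail in usage.get(key, []):
                totals[(phase, detail["modality"])] += detail["tokenCount"]
    return totals
```

Keeping the tally keyed by both phase and modality makes it easy to see, for instance, whether audio input or text output dominates the cost of a long session.
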