> ## Documentation Index
> Fetch the complete documentation index at: https://docs.agipower.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# Generate Content

# Google Vertex AI API: Generate Content

<Tip title="💡 Troubleshooting">
  Encountering errors? See the [API Error Codes Reference](/en/guide/advanced/error-codes) for a complete list of error types and troubleshooting steps.
</Tip>

### Non-streaming:

```
POST https://api.agipower.ai/v1/publishers/{provider}/models/{model}:generateContent
```

### Streaming:

```
POST https://api.agipower.ai/v1/publishers/{provider}/models/{model}:streamGenerateContent
```

AGIPower supports the Google Vertex AI API via the Gen AI SDK. For detailed request parameters and response schemas, see the [official Google Vertex AI documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/inference).

## Path parameters

### provider `string` **Required**

Model provider (e.g., google).

### model `string` **Required**

Model name (e.g., gemini-2.5-pro).

## Request headers

### Authorization `string` **Required**

Bearer token authentication.

### Content-Type `string` **Required**

Default: `application/json`

## Request body

The request body is JSON.

### contents `array&lt;Content&gt;` **Required**

Current conversation content (single turn / multi-turn history + the current input).

#### role `string` **Optional**

The content producer (defaults to user).

* user: Indicates the message is sent by a human, typically user-generated.
* model: Indicates the message is generated by the model.

#### parts `Part[]` **Required**

At least 1 Part.

#### part

* text `string`**Optional**

  Text prompt or code snippet.

* inlineData `Blob`**Optional**

  Inline data as raw bytes.

  * mimeType `string` **Optional**

    The media type of the file specified in the data or fileUri field. Acceptable values include:

    * application/pdf
    * audio/mpeg
    * audio/mp3
    * audio/wav
    * image/png
    * image/jpeg
    * image/webp
    * text/plain
    * video/mov
    * video/mpeg
    * video/mp4
    * video/mpg
    * video/avi
    * video/wmv
    * video/mpegps
    * video/flv
    * video/x-ms-wmv

  * data `bytes` **Optional**

    Inline data as raw bytes.

* fileData `FileData`**Optional**

  Data stored in a file.

  * mimeType `string` **Optional**
  * fileUri `string` **Optional**

    The URI or URL of the file to include in the prompt.

* functionCall `FunctionCall`**Optional**

  Contains a string representing the FunctionDeclaration.name field, plus a structured JSON object with all parameters of the function call predicted by the model.

  * name `string` **Optional**

    The name of the function to call.

  * args `Record&lt;string,any&gt;` **Optional**

    Function parameters and values represented as a JSON object.

* functionResponse `FunctionResponse`**Optional**

  The output of a FunctionCall. Contains a string representing the FunctionDeclaration.name field and a structured JSON object containing any output of the function call. It is used as context for the model.

  * name `string` **Optional**

    The name of the function to call.

  * response `Record&lt;string,any&gt;` **Optional**

    Function response represented as a JSON object.

* videoMetadata `VideoMetadata` **Optional**

  For video inputs: the start and end offsets (duration format), and the frame rate.

  * startOffset `number` **Optional**

    Video start offset (duration format).

  * endOffset `number` **Optional**

    Video end offset (duration format).

  * fps `number` **Optional**

    Video frame rate.

* mediaResolution `enum`**Optional**

  Controls how input media is processed. If specified, this configuration overrides the mediaResolution setting in generationConfig. LOW reduces the number of tokens per image/video, which may cause loss of detail, but allows longer videos to be included in the context. Supported values: `HIGH`, `MEDIUM`, `LOW`.

### cachedContent `string` **Optional**

Cached content resource name (used as context):

* `projects/\{project\}/locations/\{location\}/cachedContents/\{cachedContent\}`

### tools `array&lt;Tool&gt;` **Optional**

List of tools (e.g., function calling, retrieval, search, code execution, etc.).

### toolConfig `ToolConfig` **Optional**

Tool configuration (shared across all tools in this request).

### safetySettings `array&lt;SafetySetting&gt;` **Optional**

Per-request safety settings (applies to candidates).

#### category `string` **Required**

The safety category to configure a threshold for.

* `HARM_CATEGORY_UNSPECIFIED`: Unspecified harm category.
* `HARM_CATEGORY_HATE_SPEECH`: Harm category is hate speech.
* `HARM_CATEGORY_HARASSMENT`: Harm category is harassment.
* `HARM_CATEGORY_SEXUALLY_EXPLICIT`: Harm category is sexually explicit content.
* `HARM_CATEGORY_DANGEROUS_CONTENT`: Harm category is dangerous content.

#### threshold `string` **Required**

Threshold for blocking responses that fall into the specified safety category based on probability.

* `OFF`: Turn off safety settings when all categories are disabled.
* `BLOCK_NONE`: Block nothing.
* `BLOCK_ONLY_HIGH`: Block only high-threshold content (i.e., block less).
* `BLOCK_MEDIUM_AND_ABOVE`: Block medium-threshold and above content.
* `BLOCK_LOW_AND_ABOVE`: Block low-threshold and above content (i.e., block more).
* `HARM_BLOCK_THRESHOLD_UNSPECIFIED`: Unspecified harm block threshold.

#### method `string` **Optional**

Specifies whether the threshold applies to probability scores or severity scores. If not specified, the threshold applies to probability scores.

* `HARM_BLOCK_METHOD_UNSPECIFIED`: Unspecified harm block method.
* `SEVERITY`: Harm block method uses both probability and severity scores.
* `PROBABILITY`: Harm block method uses probability scores.

### generationConfig `GenerationConfig` **Optional**

Generation parameters (controls sampling, length, stop conditions, structured output, logprobs, audio timestamps, thinking, media processing quality, etc.).

#### temperature `number` **Optional**

Controls randomness/diversity in the output. Lower values are more deterministic and “test-like”; higher values are more creative/diverse. `0` tends to always pick the highest-probability token, so it is closer to deterministic (but small variations can still occur). If replies are too templated/too short, try increasing it; if you see anomalies such as “runaway generation,” try raising it to at least `0.1`. Ranges/defaults vary by model (for example, some Gemini Flash models commonly use `0.0~2.0` with a default of `1.0`).

#### topP `number` **Optional**

Nucleus sampling threshold: the model samples only from the smallest set of tokens whose cumulative probability reaches `topP`. Lower values are more conservative/less random; higher values are more diverse. Range `0.0~1.0` (model defaults may vary). In general, it is recommended to primarily tune **either temperature or topP**, not both significantly.

#### topK `integer` **Optional**

Top-K sampling threshold: the model samples only from the `topK` tokens with the highest probability. For example, `topK=40` means each step selects the next token only from the top 40 candidates. Smaller values are more conservative; larger values are more diverse. Model defaults may vary.

#### candidateCount `integer` **Optional**

Number of returned candidates (response variations). **Output tokens for all candidates are billed** (inputs are typically billed once). Multi-candidate is usually a preview capability and typically only supported by `generateContent` (not `streamGenerateContent`), and different models constrain the range/max (e.g., some support `1~8`).

#### maxOutputTokens `integer` **Optional**

Maximum number of output tokens to limit response length; tokens can be roughly understood as \~4 characters in English. Smaller values produce shorter output; larger values allow longer output.

#### stopSequences `array&lt;string&gt;` **Optional**

Stop sequence list: generation stops immediately when any stop sequence is encountered in the output, and is truncated at the first occurrence; **case-sensitive**. The list can contain up to **5** elements.

#### presencePenalty `number` **Optional**

Presence penalty: penalizes tokens that have already appeared in the “generated text,” increasing the probability of generating new/different content. Range `-2.0 ~ <2.0`.

#### frequencyPenalty `number` **Optional**

Frequency penalty: penalizes tokens that are repeated, reducing the probability of repetitive output. Range `-2.0 ~ <2.0`.

#### seed `integer` **Optional**

Random seed: with a fixed seed, the model will “try” to return the same result for repeated requests, but **does not guarantee full determinism**; changes in model version or parameters (e.g., temperature) may also cause differences. If omitted, a random seed is used by default.

#### responseMimeType `string` **Optional**

Specifies the MIME type of candidate outputs. Common supported values:

* `text/plain` (default): plain text output
* `application/json`: JSON output (for structured output / JSON mode)
* `text/x.enum`: for classification tasks, outputs enum values defined by `responseSchema`

Note: If you want to constrain structured output using `responseSchema`, you must set `responseMimeType` to a value **other than** `text/plain` (e.g., `application/json`).

#### responseSchema `Schema` **Optional**

Schema for structured output: constrains candidate text to conform to this schema (for “controlled generation output / JSON Schema” scenarios). When using this field, you must set `responseMimeType` to a supported non-`text/plain` type (e.g., `application/json`).

#### Schema object fields

* `type` `enum (Type)` **Required**

  Data type. Supported values:

  * `STRING`: String type; supports constraints such as `enum`, `format` (e.g., `date-time`, `email`, `byte`), `minLength`, `maxLength`, `pattern`, etc.
  * `INTEGER`: Integer type; supports `format` (e.g., `int32`, `int64`), `minimum`, `maximum`, etc.
  * `NUMBER`: Floating-point number type; supports `format` (e.g., `float`, `double`), `minimum`, `maximum`, etc.
  * `BOOLEAN`: Boolean type.
  * `ARRAY`: Array type; supports `items`, `minItems`, `maxItems`, etc.
  * `OBJECT`: Object type; supports `properties`, `required`, `propertyOrdering`, `nullable`, etc.

* `format` `string` **Optional**

  Additional format info. For `NUMBER`: `float`, `double`; for `INTEGER`: `int32`, `int64`; for `STRING`: `email`, `byte`, `date`, `date-time`, `password`, etc.

* `description` `string` **Optional**

  Text description of the property/field to help the model understand what to generate.

* `enum` `array&lt;string&gt;` **Optional**

  List of allowed enum values; the model can only choose one. Typically used with `type: STRING`.

* `items` `Schema` **Optional**

  When `type` is `ARRAY`, specifies the schema for array elements.

* `properties` `map&lt;string, Schema&gt;` **Optional**

  When `type` is `OBJECT`, defines each property and its schema.

* `required` `array&lt;string&gt;` **Optional**

  List of required properties in the object.

* `propertyOrdering` `array&lt;string&gt;` **Optional**

  Specifies the output order of object properties.

* `nullable` `boolean` **Optional**

  Whether `null` is allowed.

* `minimum` `number` **Optional**

  When `type` is `INTEGER` or `NUMBER`, the minimum allowed value.

* `maximum` `number` **Optional**

  When `type` is `INTEGER` or `NUMBER`, the maximum allowed value.

* `minItems` `integer` **Optional**

  When `type` is `ARRAY`, the minimum number of elements.

* `maxItems` `integer` **Optional**

  When `type` is `ARRAY`, the maximum number of elements.

* `minLength` `integer` **Optional**

  When `type` is `STRING`, the minimum string length.

* `maxLength` `integer` **Optional**

  When `type` is `STRING`, the maximum string length.

* `pattern` `string` **Optional**

  When `type` is `STRING`, a regex constraint. Example: `"^[\\w\\s,.-]+$"`.

* `title` `string` **Optional**

  Title for the schema.

* `anyOf` `array&lt;Schema&gt;` **Optional**

  Union/conditional types: the value must satisfy at least one schema in `anyOf`. `oneOf` is also interpreted with the semantics of `anyOf`.

> **Note**: Excessive schema complexity (very long property names, too many enum values, overly deep nesting, etc.) may cause `InvalidArgument: 400` errors. Keep schemas as simple as possible. Circular references are not supported (only limited expansion is allowed in non-required properties).

#### responseJsonSchema `object` **Optional**

A JSON Schema alternative to `responseSchema`. When this field is set, you must omit `responseSchema`, and set `responseMimeType` to `application/json`. Accepts standard JSON Schema syntax directly.

#### responseLogprobs `boolean` **Optional**

Whether to return log probabilities for output tokens. When set to `true`, the response includes per-token log probability details in `logprobsResult`. **You must enable this parameter before using the `logprobs` field.**

#### logprobs `integer` **Optional**

Returns log probabilities for the **top candidate tokens** at each generation step. Range `1~20`. **Requires `responseLogprobs=true` to use this field**, and the token selected by the model is not necessarily the top candidate token.

#### responseModalities `array&lt;enum&gt;` **Optional**

Specifies which modalities to return in the response. Only some Gemini models support multimodal output. If not set, defaults to text only. Supported values:

* `TEXT`: Text output (default).
* `IMAGE`: Image output; when used, must also include `TEXT`, i.e., `["TEXT", "IMAGE"]`. Only supported by some models (e.g., `gemini-2.5-flash-image`, `gemini-3-pro-image-preview`).
* `AUDIO`: Audio output; mainly for Live API (real-time streaming) scenarios.

#### audioTimestamp `boolean` **Optional**

Audio timestamp understanding: timestamp understanding for **audio-only files** (preview). Only supported by some models (e.g., some Gemini Flash models).

#### thinkingConfig `ThinkingConfig` **Optional**

“Thinking” (internal reasoning) configuration for Gemini 2.5 and later. Setting this field on models that do not support thinking will return an error.

* `thinkingBudget` `integer` **Optional**

  Token budget for thinking (applies to **Gemini 2.5**). Model-specific ranges:

  * Gemini 2.5 Pro: `128` \~ `32768` (thinking cannot be disabled)
  * Gemini 2.5 Flash: `0` \~ `24576` (`0` = disable thinking)
  * Gemini 2.5 Flash Lite: `512` \~ `24576`

  Set to `-1` for dynamic thinking (the model adjusts budget based on request complexity). If omitted, the model controls it automatically.

  > **Note**: Cannot be used together with `thinkingLevel`. `thinkingBudget` applies only to Gemini 2.5.

* `thinkingLevel` `enum (ThinkingLevel)` **Optional**

  Controls internal reasoning intensity (recommended for **Gemini 3**). Supported values:

  * `THINKING_LEVEL_UNSPECIFIED`: Unspecified; use the model’s default dynamic behavior.

  * `MINIMAL`: Use as few thinking tokens as possible (Gemini 3 Flash only).

  * `LOW`: Low reasoning intensity; suitable for simple tasks.

  * `MEDIUM`: Medium reasoning intensity (Gemini 3 Flash only).

  * `HIGH`: High reasoning intensity (Gemini 3 default); suitable for complex tasks like math, code, and logical analysis.

  > **Note**: Cannot be used together with `thinkingBudget`. `thinkingLevel` applies only to Gemini 3.

* `includeThoughts` `boolean` **Optional**

  Whether to return a summary of the thinking process in the response. When set to `true`, a thought summary is returned when available. This is best-effort: even if enabled, thoughts are not guaranteed to be returned.

#### mediaResolution `enum (MediaResolution)` **Optional**

Controls the token resolution when processing input media (images/videos/PDFs), balancing response quality and token usage. Higher resolution lets the model perceive more detail but consumes more tokens. Supported values:

* `MEDIA_RESOLUTION_UNSPECIFIED`: Unspecified; use the model default (token counts differ across model generations).
* `MEDIA_RESOLUTION_LOW`: Low resolution; fewer tokens, faster, cheaper.
* `MEDIA_RESOLUTION_MEDIUM`: Medium resolution; balance between quality and cost.
* `MEDIA_RESOLUTION_HIGH`: High resolution; more tokens, finer detail.
* `MEDIA_RESOLUTION_ULTRA_HIGH`: Ultra-high resolution (Gemini 3 only).

> Gemini 3 reference image token counts: `ULTRA_HIGH` ≈ 2240, `HIGH` ≈ 1120, `MEDIUM` ≈ 560.

#### speechConfig `SpeechConfig` **Optional**

Speech generation configuration; used when `responseModalities` includes `AUDIO`.

* `voiceConfig` `VoiceConfig` **Optional**

  Single-voice configuration (mutually exclusive with `multiSpeakerVoiceConfig`).

  * `prebuiltVoiceConfig` `PrebuiltVoiceConfig` **Optional**

    Prebuilt voice configuration.

    * `voiceName` `string` **Optional**

      Prebuilt voice name (e.g., `Kore`, `Puck`, `Charon`, etc.).

  * `replicatedVoiceConfig` `ReplicatedVoiceConfig` **Optional**

    Replicated voice configuration (clone a voice from an audio sample).

    * `mimeType` `string` **Optional**

      MIME type of the voice sample. Currently only `audio/wav` is supported (16-bit signed little-endian, 24kHz sample rate).

    * `voiceSample` `string(bytes)` **Optional**

      Custom voice sample (base64-encoded).

* `multiSpeakerVoiceConfig` `MultiSpeakerVoiceConfig` **Optional**

  Multi-speaker voice configuration (mutually exclusive with `voiceConfig`); for multi-role TTS scenarios.

  * `speakerVoiceConfigs` `array&lt;SpeakerVoiceConfig&gt;`

    List of voice configurations for each speaker.

    * `speaker` `string`: Speaker name.
    * `voiceConfig` `VoiceConfig`: Voice configuration used by this speaker (same structure as above).

* `languageCode` `string` **Optional**

  Language code for speech output (e.g., `en-US`).

#### routingConfig `RoutingConfig` **Optional**

Routing configuration (Vertex AI): routes requests to a specific model. `autoMode` and `manualMode` are mutually exclusive.

> **Note**: This field is deprecated. Google recommends using `modelConfig` instead.

* `autoMode` `AutoRoutingMode` **Optional**

  Auto routing: routing is determined by a pretrained routing model plus user preferences.

  * `modelRoutingPreference` `enum (ModelRoutingPreference)` **Optional**

    Routing preference. Supported values include `BALANCED`, etc.

* `manualMode` `ManualRoutingMode` **Optional**

  Manual routing: explicitly specify the target model.

  * `modelName` `string` **Optional**

    Target model name (e.g., `gemini-1.5-pro-001`).

#### imageConfig `ImageConfig` **Optional**

Image generation configuration; used when `responseModalities` includes `IMAGE`.

* `aspectRatio` `string` **Optional**

  Aspect ratio of generated images. Supported values: `1:1`, `2:3`, `3:2`, `3:4`, `4:3`, `9:16`, `16:9`, `21:9`; some models also support `4:5`, `5:4`.

* `imageSize` `string` **Optional**

  Size of generated images. Supported values: `1K`, `2K`, `4K`. Default is `1K`.

* `outputCompressionQuality` `number` **Optional**

  Output compression quality for generated images (applies only to `image/jpeg`).

* `outputMimeType` `string` **Optional**

  MIME type of generated images.

#### enableAffectiveDialog `boolean` **Optional**

Whether to enable affective dialogue: when enabled, the model detects the user’s emotion and adjusts the response style accordingly.

### systemInstruction `Content` **Optional**

System instruction (guides the model’s overall behavior; recommended to use only text in `parts`, with each part as a separate paragraph).

## Response (non-streaming)

The official response structure is as follows:

```ts theme={null}
{
  "candidates": [
    {
      "content": {
        "parts": [
          {
            "text": string
          }
        ]
      },
      "finishReason": enum (FinishReason),
      "safetyRatings": [
        {
          "category": enum (HarmCategory),
          "probability": enum (HarmProbability),
          "blocked": boolean
        }
      ],
      "citationMetadata": {
        "citations": [
          {
            "startIndex": integer,
            "endIndex": integer,
            "uri": string,
            "title": string,
            "license": string,
            "publicationDate": {
              "year": integer,
              "month": integer,
              "day": integer
            }
          }
        ]
      },
      "avgLogprobs": double,
      "logprobsResult": {
        "topCandidates": [
          {
            "candidates": [
              {
                "token": string,
                "logProbability": float
              }
            ]
          }
        ],
        "chosenCandidates": [
          {
            "token": string,
            "logProbability": float
          }
        ]
      }
    }
  ],
  "usageMetadata": {
    "promptTokenCount": integer,
    "candidatesTokenCount": integer,
    "totalTokenCount": integer

    // (possible extended stats)
    // "cachedContentTokenCount": integer,
    // "thoughtsTokenCount": integer,
    // "toolUsePromptTokenCount": integer,
    // "promptTokensDetails": [...],
    // "candidatesTokensDetails": [...],
    // "toolUsePromptTokensDetails": [...]
  },
  "modelVersion": string,
  "createTime": string,
  "responseId": string
}
```

### candidates `array&lt;Candidate&gt;`

List of candidate results returned for this generation.

#### Candidate.content `object`

Candidate content.

* content.parts `array`\
  Array of content parts.
  * parts\[].text `string`\
    Generated text.

#### Candidate.finishReason `enum (FinishReason)`

Why the model stopped generating tokens; if empty, it indicates generation has not stopped yet.

Common values (as listed officially):

* `FINISH_REASON_STOP`: Natural stopping point or matched a stop sequence
* `FINISH_REASON_MAX_TOKENS`: Reached the requested max token limit
* `FINISH_REASON_SAFETY`: Stopped for safety reasons (if output is blocked by the filter, `Candidate.content` is empty)
* `FINISH_REASON_RECITATION`: Stopped due to flagged unauthorized recitation
* `FINISH_REASON_BLOCKLIST`: Contains blocked terms
* `FINISH_REASON_PROHIBITED_CONTENT`: Contains prohibited content (e.g., CSAM)
* `FINISH_REASON_IMAGE_PROHIBITED_CONTENT`: An image in the prompt contains prohibited content
* `FINISH_REASON_NO_IMAGE`: The prompt should include an image but none was provided
* `FINISH_REASON_SPII`: Contains sensitive personally identifiable information (SPII)
* `FINISH_REASON_MALFORMED_FUNCTION_CALL`: Function call is malformed and cannot be parsed
* `FINISH_REASON_OTHER`: Other reasons
* `FINISH_REASON_UNSPECIFIED`: Unspecified

#### Candidate.safetyRatings `array&lt;SafetyRating&gt;`

Array of safety ratings.

* safetyRatings\[].category `enum (HarmCategory)`\
  Safety category (e.g., `HARM_CATEGORY_SEXUALLY_EXPLICIT`, `HARM_CATEGORY_HATE_SPEECH`, `HARM_CATEGORY_HARASSMENT`, `HARM_CATEGORY_DANGEROUS_CONTENT`).
* safetyRatings\[].probability `enum (HarmProbability)`\
  Harm probability level: `NEGLIGIBLE` / `LOW` / `MEDIUM` / `HIGH`, etc.
* safetyRatings\[].blocked `boolean`\
  Indicates whether the model input or output was blocked.

#### Candidate.citationMetadata `object`

Citation information (when the output includes citations).

* citationMetadata.citations `array&lt;Citation&gt;`
  * citations\[].startIndex `integer`\
    Start position of the citation in `content`, measured in **bytes** of the UTF-8 response.
  * citations\[].endIndex `integer`\
    End position of the citation in `content`, also measured in bytes.
  * citations\[].uri `string`\
    Source URL (official docs describe it as url/URL; in the example structure, the field name is `uri`).
  * citations\[].title `string`\
    Source title.
  * citations\[].license `string`\
    Associated license.
  * citations\[].publicationDate `object`\
    Publication date (valid formats: `YYYY` / `YYYY-MM` / `YYYY-MM-DD`).
    * publicationDate.year `integer`
    * publicationDate.month `integer`
    * publicationDate.day `integer`

#### Candidate.avgLogprobs `double`

Average log probability for the candidate.

#### Candidate.logprobsResult `object`

Returns top candidate tokens (`topCandidates`) and the actually selected tokens (`chosenCandidates`) for each step.

* logprobsResult.topCandidates `array`
  * topCandidates\[].candidates `array`
    * candidates\[].token `string`: token (character/word/phrase, etc.)
    * candidates\[].logProbability `float`: log probability (confidence) of the token
* logprobsResult.chosenCandidates `array`
  * chosenCandidates\[].token `string`
  * chosenCandidates\[].logProbability `float`

### usageMetadata `object`

Token usage statistics.

* usageMetadata.promptTokenCount `integer`\
  Number of tokens in the request.
* usageMetadata.candidatesTokenCount `integer`\
  Number of tokens in the response.
* usageMetadata.totalTokenCount `integer`\
  Total tokens for request + response.
* (May appear) thoughtsTokenCount / toolUsePromptTokenCount / cachedContentTokenCount and per-modality details.

> Note: The official docs add that for billing purposes, in Gemini 3 Pro and later models, tokens consumed for processing “document inputs” are billed as image tokens.

### modelVersion `string`

Model and version used for generation (example: `gemini-2.0-flash-lite-001`).
Below is the **Vertex AI `streamGenerateContent` (SSE streaming) response body for each chunk**.\
**Key point**: In streaming mode, multiple chunks are returned, and **each chunk’s JSON structure is still `GenerateContentResponse`**. Intermediate chunks often have an empty `finishReason`; only the final chunk provides termination info such as `finishReason`/`finishMessage`.

### createTime `string`

Server receive time (RFC3339 Timestamp).

### responseId `string`

Response identifier.

## Response (streaming: response body of each stream chunk)

```ts theme={null}
{
  "candidates": [
    {
      "index": integer,
      "content": {
        "role": string,
        "parts": [
          {
            "thought": boolean,
            "thoughtSignature": string, // bytes(base64)
            "mediaResolution": {
              "level": enum,
              "numTokens": integer
            },

            // Union field data (only one of the following fields appears at a time)
            "text": string,
            "inlineData": { "mimeType": string, "data": string, "displayName": string },
            "fileData": { "mimeType": string, "fileUri": string, "displayName": string },
            "functionCall": {
              "id": string,
              "name": string,
              "args": object,
              "partialArgs": [
                {
                  "jsonPath": string,
                  "stringValue": string,
                  "numberValue": number,
                  "boolValue": boolean,
                  "nullValue": string,
                  "willContinue": boolean
                }
              ],
              "willContinue": boolean
            },
            "functionResponse": {
              "id": string,
              "name": string,
              "response": object,
              "parts": [
                {
                  "inlineData": { /* bytes blob */ },
                  "fileData": { /* file ref */ }
                }
              ],
              "scheduling": enum,
              "willContinue": boolean
            },
            "executableCode": { "language": enum, "code": string },
            "codeExecutionResult": { "outcome": enum, "output": string },

            // Union field metadata (only when inlineData/fileData is a video)
            "videoMetadata": {
              "startOffset": string,
              "endOffset": string,
              "fps": number
            }
          }
        ]
      },
      "avgLogprobs": number,
      "logprobsResult": {
        "topCandidates": [
          {
            "candidates": [
              { "token": string, "tokenId": integer, "logProbability": number }
            ]
          }
        ],
        "chosenCandidates": [
          { "token": string, "tokenId": integer, "logProbability": number }
        ]
      },
      "finishReason": enum,
      "safetyRatings": [
        { "category": enum, "probability": enum, "blocked": boolean }
      ],
      "citationMetadata": {
        "citations": [
          {
            "startIndex": integer,
            "endIndex": integer,
            "uri": string,
            "title": string,
            "license": string,
            "publicationDate": { "year": integer, "month": integer, "day": integer }
          }
        ]
      },
      "groundingMetadata": {
        "webSearchQueries": [ string ],
        "retrievalQueries": [ string ],
        "groundingChunks": [
          {
            "web": { "uri": string, "title": string, "domain": string },
            "retrievedContext": object,
            "maps": object
          }
        ],
        "groundingSupports": [ object ],
        "sourceFlaggingUris": [ object ],
        "searchEntryPoint": { "renderedContent": string, "sdkBlob": string },
        "retrievalMetadata": { "googleSearchDynamicRetrievalScore": number },
        "googleMapsWidgetContextToken": string
      },
      "urlContextMetadata": {
        "urlMetadata": [
          { "retrievedUrl": string, "urlRetrievalStatus": enum }
        ]
      },
      "finishMessage": string
    }
  ],
  "modelVersion": string,
  "createTime": string,
  "responseId": string,
  "promptFeedback": {
    "blockReason": enum,
    "blockReasonMessage": string,
    "safetyRatings": [
      { "category": enum, "probability": enum, "blocked": boolean }
    ]
  },
  "usageMetadata": {
    "promptTokenCount": integer,
    "candidatesTokenCount": integer,
    "totalTokenCount": integer

    // (possible extended stats)
    // "cachedContentTokenCount": integer,
    // "thoughtsTokenCount": integer,
    // "toolUsePromptTokenCount": integer,
    // "promptTokensDetails": [...],
    // "candidatesTokensDetails": [...],
    // "toolUsePromptTokensDetails": [...]
  }
}
```

> `promptFeedback`: **Returned only in the first stream chunk**, and appears only when **no candidates are generated due to policy violations**.\
> `finishMessage`: Returned only when `finishReason` is present.

### candidates `array&lt;Candidate&gt;`

List of candidate results in this chunk.

#### Candidate.index `integer`

Candidate index (starting from 0).

#### Candidate.content `object`

Candidate content (multiple parts).

##### content.role `string`

Producer role: typically `'user'` or `'model'`.

##### content.parts `array&lt;Part&gt;`

Array of content parts; each part is a “single-type” data block (text / inlineData / functionCall …).

#### Part `object`（content.parts\[]）

#### Part.thought `boolean`

Whether this is a “thought/reasoning” part.

#### Part.thoughtSignature `string(bytes)`

Reusable signature for the thought (base64).

#### Part.mediaResolution `PartMediaResolution`

Input media resolution (affects media tokenization).

* mediaResolution.level `enum (PartMediaResolutionLevel)`: LOW / MEDIUM / HIGH / ULTRA\_HIGH / UNSPECIFIED.
* mediaResolution.numTokens `integer`: Expected media token sequence length.

#### Part.data (Union) **Only one is present per part**

##### Part.text `string`

Text content (the most common location for streaming incremental output).

##### Part.inlineData `Blob`

Inline binary data (base64).

* inlineData.mimeType `string`: IANA MIME Type.
* inlineData.data `string(bytes)`: base64 bytes.
* inlineData.displayName `string`: Optional display name (returned only in some scenarios).

##### Part.fileData `FileData`

Reference to an external file (e.g., GCS).

* fileData.mimeType `string`: IANA MIME Type.
* fileData.fileUri `string`: File URI.
* fileData.displayName `string`: Optional display name.

##### Part.functionCall `FunctionCall`

A function call predicted by the model.

* functionCall.id `string`: Function call id (used to match with functionResponse).
* functionCall.name `string`: Function name.
* functionCall.args `object`: Function arguments (JSON object).
* functionCall.partialArgs `array&lt;PartialArg&gt;`: **Streaming function-argument deltas** (available in some APIs/modes).
* functionCall.willContinue `boolean`: Whether additional deltas for this FunctionCall will follow.

###### PartialArg (for functionCall.partialArgs)

* jsonPath `string`: Path to the parameter being streamed incrementally (RFC 9535).
* stringValue / numberValue / boolValue / nullValue: The delta value for this chunk (one of four).
* willContinue `boolean`: Whether more deltas will follow for this jsonPath.

##### Part.functionResponse `FunctionResponse`

Structure used when you send tool execution results back to the model (may also appear in responses in some modes).

* functionResponse.id `string`: Corresponding functionCall.id.
* functionResponse.name `string`: Function name (matches functionCall.name).
* functionResponse.response `object`: Function result (JSON object; conventionally output/error).
* functionResponse.parts `array&lt;FunctionResponsePart&gt;`: Multi-part function response (can include files/inline data).
* functionResponse.scheduling `enum (FunctionResponseScheduling)`: SILENT / WHEN\_IDLE / INTERRUPT / …
* functionResponse.willContinue `boolean`: Whether more response fragments will follow.

##### Part.executableCode `ExecutableCode`

Code generated by the model for a code-execution tool.

* executableCode.language `enum (Language)`: e.g., PYTHON.
* executableCode.code `string`: Code string.

##### Part.codeExecutionResult `CodeExecutionResult`

Code execution result.

* codeExecutionResult.outcome `enum (Outcome)`: OUTCOME\_OK / OUTCOME\_FAILED / OUTCOME\_DEADLINE\_EXCEEDED.
* codeExecutionResult.output `string`: stdout or error message.

#### Part.metadata (Union)

##### Part.videoMetadata `VideoMetadata`

Metadata used only when the part carries video data.

* videoMetadata.startOffset `string`: Start offset.
* videoMetadata.endOffset `string`: End offset.
* videoMetadata.fps `number`: Frame rate.

### Candidate.avgLogprobs `number`

Candidate average logprob (length-normalized).

### Candidate.logprobsResult `LogprobsResult`

Logprobs details.

* logprobsResult.topCandidates `array&lt;LogprobsResultTopCandidates&gt;`: Per-step top-token lists.
  * topCandidates\[].candidates `array&lt;LogprobsResultCandidate&gt;`: Sorted by logProbability descending.
* logprobsResult.chosenCandidates `array&lt;LogprobsResultCandidate&gt;`: The final sampled/selected token per step.
* LogprobsResultCandidate.token / tokenId / logProbability: token, tokenId, log probability.

### Candidate.finishReason `enum (FinishReason)`

Why generation stopped; if empty, it means “not stopped yet”.

Common values (example enums):

* `STOP` / `MAX_TOKENS` / `SAFETY` / `RECITATION` / `BLOCKLIST` / `PROHIBITED_CONTENT` / `SPII` / `MALFORMED_FUNCTION_CALL` / `OTHER` / `FINISH_REASON_UNSPECIFIED` …

### Candidate.safetyRatings `array&lt;SafetyRating&gt;`

Safety ratings for the candidate output (at most one entry per category).

* safetyRatings\[].category `enum (HarmCategory)`: e.g., HATE\_SPEECH / SEXUALLY\_EXPLICIT / DANGEROUS\_CONTENT / HARASSMENT / CIVIC\_INTEGRITY …
* safetyRatings\[].probability `enum (HarmProbability)`: NEGLIGIBLE / LOW / MEDIUM / HIGH …
* safetyRatings\[].blocked `boolean`: Whether it was filtered due to this rating.

> Some APIs/SDKs may also provide finer-grained fields such as probabilityScore / severity / severityScore, which may not appear in all Vertex REST outputs.

### Candidate.citationMetadata `CitationMetadata`

Citation information.

* citationMetadata.citations `array&lt;Citation&gt;`
  * citations\[].startIndex `integer`: Citation start position
  * citations\[].endIndex `integer`: Citation end position
  * citations\[].uri `string`: Source URL/URI
  * citations\[].title `string`: Source title
  * citations\[].license `string`: License
  * citations\[].publicationDate `\{year,month,day\}`: Publication date

### Candidate.groundingMetadata `GroundingMetadata`

Retrieval/evidence source metadata returned when grounding is enabled.

* groundingMetadata.webSearchQueries `string[]`: Queries used for Google Search.
* groundingMetadata.retrievalQueries `string[]`: Queries actually executed by the retrieval tool.
* groundingMetadata.groundingChunks `array&lt;GroundingChunk&gt;`: Evidence chunks.
  * groundingChunks\[].web `\{uri,title,domain\}`: Web evidence.
  * groundingChunks\[].retrievedContext / maps: Other evidence sources (object structure depends on the source).
* groundingMetadata.searchEntryPoint `\{renderedContent,sdkBlob\}`: Search entry-point info.
* groundingMetadata.retrievalMetadata `\{googleSearchDynamicRetrievalScore\}`: Retrieval-related metadata.
* Plus sourceFlaggingUris / googleMapsWidgetContextToken, etc. (when using Google Maps grounding).

### Candidate.urlContextMetadata `UrlContextMetadata`

URL retrieval metadata returned when the model uses the `urlContext` tool.

* urlContextMetadata.urlMetadata `array&lt;UrlMetadata&gt;`: URL list.
  * urlMetadata\[].retrievedUrl `string`: The URL that was actually retrieved.
  * urlMetadata\[].urlRetrievalStatus `enum (UrlRetrievalStatus)`: SUCCESS / ERROR / PAYWALL / UNSAFE / UNSPECIFIED.

### Candidate.finishMessage `string`

A more detailed explanation of `finishReason` (returned only when `finishReason` is present).

### modelVersion `string`

Model version used for this generation.

### createTime `string`

Server receive time (RFC3339 Timestamp).

### responseId `string`

Response identifier.

### promptFeedback `object`

Prompt content-filtering result: **only appears in the first stream chunk and only when there are no candidates due to violations**.

* promptFeedback.blockReason `enum (BlockedReason)`: Blocking reason.
* promptFeedback.blockReasonMessage `string`: Human-readable reason (may not be supported in all environments).
* promptFeedback.safetyRatings `array&lt;SafetyRating&gt;`: Safety ratings for the prompt.

### usageMetadata `object`

Token usage.

* usageMetadata.promptTokenCount `integer`: Prompt token count.
* usageMetadata.candidatesTokenCount `integer`: Total candidate output token count.
* usageMetadata.totalTokenCount `integer`: Total token count.
* (May appear) thoughtsTokenCount / toolUsePromptTokenCount / cachedContentTokenCount and per-modality details.

<Card title="POST /api/vertex-ai/v1">
  ```TypeScript theme={null}
  import { GoogleGenAI } from "@google/genai";

  const client = GoogleGenAI({
    apiKey: "$AGIPower_API_KEY",
    vertexai: true,
    httpOptions: {
      baseUrl: "https://api.agipower.ai",
      apiVersion: "v1",
    },
  });

  const response = await client.models.generateContent({
    model: "google/gemini-2.5-pro",
    contents: "How does AI work?",
  });
  console.log(response);
  ```

  ```Python theme={null}
  from google import genai
  from google.genai import types

  client = genai.Client(
      api_key="$AGIPower_API_KEY",
      vertexai=True,
      http_options=types.HttpOptions(
          api_version='v1',
          base_url='https://api.agipower.ai'
      ),
  )

  response = client.models.generate_content(
      model="google/gemini-2.5-pro",
      contents="How does AI work?"
  )
  print(response.text)
  ```
</Card>

<Card title="API Response">
  ```json theme={null}
  {
    "candidates": [
      {
        "content": {
          "role": "model",
          "parts": [
            {
              "text": "Of course. This is a fantastic question. Let's break down how AI works using a simple analogy and then add the technical details.\n\n### The Simple Analogy: Teaching a Child to Recognize a Cat\n\nImagine you're teaching a very young child what a \"cat\" is. You don't write down a long list of rules like \"a cat has pointy ears, four legs, a tail, and whiskers.\" Why? Because some cats have folded ears, some might be missing a leg, and a dog also fits that description.\n\nInstead, you do this:\n\n1.  **Show Examples:** You show the child hundreds of pictures. You point and say, \"That's a cat.\" \"That's also a cat.\" \"This is *not* a cat; it's a dog.\"\n2.  **Let Them Guess:** You show them a new picture and ask, \"Is this a cat?\"\n3.  **Give Feedback:** If they're right, you say \"Yes, good job!\" If they're wrong, you say \"No, that's a fox.\"\n\nOver time, the child's brain, without being told the specific rules, starts to recognize the *patterns* that make a cat a cat. They build an internal, intuitive understanding.\n\n**AI works in almost the exact same way.** It's a system designed to learn patterns from data without being explicitly programmed with rules.\n\n---\n\n### The Core Components of How AI Works\n\nNow, let's replace the child with a computer program. The process has three key ingredients:\n\n#### 1. Data (The Pictures)\n\nThis is the most critical ingredient. AI is fueled by data. For our example, this would be a massive dataset of thousands or millions of images, each one labeled by a human: \"cat,\" \"dog,\" \"hamster,\" etc.\n\n*   **More Data is Better:** The more examples the AI sees, the better it gets at identifying the patterns.\n*   **Good Data is Crucial:** The data must be accurate and diverse. If you only show it pictures of black cats, it will struggle to recognize a white cat.\n\n#### 2. Model / Algorithm (The Child's Brain)\n\nThis is the mathematical framework that learns from the data. Think of it as the \"engine\" that finds the patterns. When you hear terms like **\"Neural Network,\"** this is what they're referring to.\n\nA neural network is inspired by the human brain. It's made of interconnected digital \"neurons\" organized in layers.\n\n*   **Input Layer:** Takes in the raw data (e.g., the pixels of an image).\n*   **Hidden Layers:** This is where the magic happens. Each layer recognizes increasingly complex patterns. The first layer might learn to spot simple edges and colors. The next might combine those to recognize shapes like ears and tails. A deeper layer might combine those shapes to recognize a \"cat face.\"\n*   **Output Layer:** Gives the final answer (e.g., a probability score: \"95% chance this is a cat, 3% dog, 2% fox\").\n\n#### 3. The Training Process (Learning from Feedback)\n\nThis is where the **Model** learns from the **Data**. It's an automated version of showing pictures and giving feedback.\n\n1.  **Prediction (The Guess):** The model is given an input (an image of a cat) and makes a random guess. Early on, its internal settings are all random, so its guess will be wild—it might say \"50% car, 50% dog.\"\n2.  **Compare (Check the Answer):** The program compares its prediction to the correct label (\"cat\"). It then calculates its \"error\" or \"loss\"—a measure of how wrong it was.\n3.  **Adjust (Learn):** This is the key step. The algorithm uses a mathematical process (often called **\"backpropagation\"** and **\"gradient descent\"**) to slightly adjust the millions of internal connections in the neural network. The adjustments are tiny, but they are specifically designed to make the model's guess *less wrong* the next time it sees that same image.\n4.  **Repeat:** This process is repeated **millions or billions of times** with all the data. Each time, the model gets a little less wrong. Over many cycles, these tiny adjustments cause the network to get incredibly accurate at recognizing the patterns it's being shown.\n\nAfter training is complete, you have a **\"trained model.\"** You can now give it brand new data it has never seen before, and it will be able to make accurate predictions.\n\n---\n\n### Major Types of AI Learning\n\nWhile the above is the most common method, there are three main ways AI learns:\n\n**1. Supervised Learning (Learning with an Answer Key)**\nThis is the \"cat\" example we just used. The AI is \"supervised\" because it's trained on data that is already labeled with the correct answers.\n*   **Examples:** Spam filters (emails labeled \"spam\" or \"not spam\"), predicting house prices (houses with known prices), language translation.\n\n**2. Unsupervised Learning (Finding Patterns on its Own)**\nThis is like giving the AI a giant pile of data with *no labels* and asking it to \"find interesting patterns.\" The AI might group the data into clusters based on hidden similarities.\n*   **Examples:** Customer segmentation (finding groups of customers with similar buying habits), identifying anomalies in a computer network.\n\n**3. Reinforcement Learning (Learning through Trial and Error)**\nThis is how you train an AI to play a game or control a robot. The AI takes an action in an environment and receives a reward or a penalty. Its goal is to maximize its total reward over time.\n*   **Examples:** An AI learning to play chess (it gets a reward for winning the game), a robot learning to walk (it gets a reward for moving forward without falling), self-driving car simulations.\n\n### Summary\n\nSo, \"How does AI work?\"\n\n**At its core, modern AI is a system that learns to recognize incredibly complex patterns by processing vast amounts of data, making guesses, and correcting its errors over and over again until it becomes highly accurate.**\n\nIt's less about being \"intelligent\" in a human sense and more about being a phenomenally powerful pattern-matching machine."
            }
          ]
        },
        "finishReason": "STOP",
        "avgLogprobs": -0.4167558059635994
      }
    ],
    "usageMetadata": {
      "promptTokenCount": 5,
      "candidatesTokenCount": 1353,
      "totalTokenCount": 2794,
      "trafficType": "ON_DEMAND",
      "promptTokensDetails": [
        {
          "modality": "TEXT",
          "tokenCount": 5
        }
      ],
      "candidatesTokensDetails": [
        {
          "modality": "TEXT",
          "tokenCount": 1353
        }
      ],
      "thoughtsTokenCount": 1436
    },
    "modelVersion": "google/gemini-2.5-pro",
    "createTime": "2026-01-29T08:40:38.791866Z",
    "responseId": "Bh17abqqMOSS4_UPqqeqoAc"
  }
  ```
</Card>