# Create Chat Completion

<Tip title="💡 Troubleshooting">
  Encountering errors? See the [API Error Codes Reference](/en/guide/advanced/error-codes) for a complete list of error types and troubleshooting steps.
</Tip>

```
POST https://api.agipower.ai/v1/chat/completions
```

The Create Chat Completion endpoint is compatible with OpenAI’s [Create Chat Completion](https://platform.openai.com/docs/api-reference/chat/create) API, and is used to run inference for conversational LLMs.

The parameters below list all options that *may* be supported by different models. Parameter support varies by model; for the exact parameters supported by a given model, see that model’s detail page.

## Request headers

### Authorization `string` **Required**

Bearer token authentication, in the form `Bearer <token>`.

### Content-Type `string` **Required**

The request content type. The default is `application/json`.
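
As a minimal sketch, the endpoint and the two required headers could be combined using only Python's standard library. The `AGIPOWER_API_KEY` environment variable name and the `build_request` helper are illustrative assumptions, not part of the API. The request is built but not sent.

```python
import json
import os
import urllib.request

API_URL = "https://api.agipower.ai/v1/chat/completions"


def build_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request with the two required headers."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical environment variable name; falls back to a placeholder.
api_key = os.environ.get("AGIPOWER_API_KEY", "YOUR_API_KEY")
payload = {
    "model": "openai/gpt-5",
    "messages": [{"role": "user", "content": "Hello!"}],
}
request = build_request(payload, api_key)

# To actually send it:
# with urllib.request.urlopen(request) as response:
#     completion = json.load(response)
```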

## Request

### messages `array` **Required**

Prompts provided to the model as a list of chat messages. Depending on model capabilities, supported message modalities may differ (e.g., text, images, audio, video). For details, refer to each model provider’s documentation.

Each element in `messages` represents one chat message and consists of `role` and `content`:

#### Developer message `object`

Instructions provided by the developer. The model should follow these instructions regardless of what the user says. In o1 and newer models, the `developer` message replaces the previous `system` message.

* content `string or array` **Required**

  The content of the Developer message.

  * Text content `string`

    The content of the Developer message.

  * Array of content parts `array`

    An array of content parts with defined types. For Developer messages, only the `text` type is supported.

    * text `string` **Required**

      Text content.

    * type `string` **Required**

      The type of the content part.

* role `string` **Required**

  The role of the message author; in this case, `developer`.

* name `string` **Optional**

  An optional participant name. Helps the model distinguish between participants with the same role.

#### System message `object`

Instructions provided by the developer. The model should follow these instructions regardless of what the user says. In o1 and newer models, you should use `developer` messages for this purpose.

* content `string or array` **Required**

  The content of the System message.

  * Text content `string`

    The content of the System message.

  * Array of content parts `array`

    An array of content parts with defined types. For System messages, only the `text` type is supported.

    * text `string` **Required**

      Text content.

    * type `string` **Required**

      The type of the content part.

* role `string` **Required**

  The role of the message author; in this case, `system`.

* name `string` **Optional**

  An optional participant name. Helps the model distinguish between participants with the same role.

#### User message `object`

A message sent to the model by the end user. In most chat scenarios, this is the only role you need.

* content `string or array` **Required**

  The content of the User message.

  * Text content `string` **Required**

    Plain text content (the most common usage).

  * Array of content parts `array` **Required**

    An array of multimodal content parts. Depending on model capabilities, it can include content types such as text, images, audio, etc. Common types include:

    * Text part
      * type `string` **Required**, always `text`
      * text `string` **Required**, the text content

    * Image part (multimodal models only)
      * type `string` **Required**, `image_url`
      * image\_url `object` **Required**
        * url `string` **Required**, an image URL or a base64 Data URL
        * detail `string` **Optional**, typical values: `low` / `high` / `auto`, used to control image parsing fidelity

    * Audio part (audio-input models only)
      * type `string` **Required**, `input_audio`
      * input\_audio `object` **Required**
        * data `string` **Required**, base64-encoded audio file content
        * format `string` **Required**, e.g. `wav`, `mp3`

    * File part (File content part; models that support file input only)\
      Used to provide an entire file as context to the model (e.g., PDF, Office documents).
      * type `string` **Required**, always `file`
      * file `object` **Required**
        * file\_id `string` **Optional**
          * The file ID obtained via the file upload endpoint. This is the recommended way to reference a file.
        * file\_data `string` **Optional**
          * Base64-encoded file data, for sending file content directly in the request body
        * filename `string` **Optional**
          * The filename, used to hint the file type to the model or to display it in the console

* role `string` **Required**

  The author role of the message; in this case, `user`.

* name `string` **Optional**

  An optional participant name. Helps the model distinguish between participants with the same role.
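
A User message that mixes a text part with an image part might be assembled as in the sketch below. The helper names `text_part` and `image_part` are hypothetical, and the placeholder bytes stand in for a real image file.

```python
import base64


def text_part(text: str) -> dict:
    """Build a `text` content part."""
    return {"type": "text", "text": text}


def image_part(image_bytes: bytes, mime_type: str = "image/png", detail: str = "auto") -> dict:
    """Wrap raw image bytes as an `image_url` content part using a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime_type};base64,{encoded}", "detail": detail},
    }


# Placeholder bytes stand in for the contents of a real image file.
fake_image = b"\x89PNG placeholder"
user_message = {
    "role": "user",
    "content": [text_part("What is in this picture?"), image_part(fake_image)],
}
```

Whether the image part is accepted depends on the selected model's capabilities.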

#### Assistant message `object`

A reply message sent to the user by the model during the conversation. You can include these historical assistant messages in new requests so the model can continue reasoning with the full context.

* content `string or array` Optional

  The content of the Assistant message. **Required unless `tool_calls` or the (deprecated) `function_call` is set.**

  * Text content `string`

    Plain-text assistant message content.

  * Array of content parts `array`

    An array of content parts with defined types. It can contain one or more `text` parts, or **exactly one** `refusal` part.

    * Text content part `object` (text content part)
      * type `string` **Required**\
        The type of the content part.

      * text `string` **Required**\
        Text content.

    * Refusal content part `object` (refusal content part)
      * type `string` **Required**\
        The type of the content part.

      * refusal `string` **Required**\
        The refusal message generated by the model.

* refusal `string or null` Optional

  The assistant’s refusal message content.

* role `string` **Required**

  The author role of the message; in this case, `assistant`.

* name `string` Optional

  Optional participant name. Helps the model distinguish between participants with the same role.

* audio `object or null` Optional

  Data about a **previous model audio response**, which can be referenced in subsequent turns.

  * id `string` **Required**

    The unique identifier of the previous audio response.

* tool\_calls `array` Optional
  * Function tool call `object`
    * id `string` **Required**

      The tool call ID, used to match `tool_call_id` in subsequent Tool messages.

    * type `string` **Required**

      The tool type. Currently only `function` is supported.

    * function `object` **Required**
      * name `string` **Required**

        The name of the function to call.

      * arguments `string` **Required**

        Function call arguments as a JSON string (generated by the model).\
        Note: The model is not guaranteed to generate strictly valid JSON and may include parameters not defined in the function schema. Validate on the application side before invoking.

  * Custom tool call `object`
    * id `string` **Required**

      The tool call ID, used to match `tool_call_id` in subsequent Tool messages.

    * type `string` **Required**

      The tool type. Always `custom`.

    * custom `object` **Required**
      * name `string` **Required**

        The name of the custom tool to call.

      * input `string` **Required**

        The input for the custom tool call, generated by the model.

* function\_call `object or null` (deprecated) Optional

  Replaced by `tool_calls` and retained only for backward compatibility. Indicates the function name and arguments the model suggests calling.

  * name `string` **Required**\
    The name of the function to call.

  * arguments `string` **Required**\
    Function call arguments as a JSON string (generated by the model). You must still validate on the application side before actually invoking.

* reasoning `string` Optional

  The assistant message’s reasoning text. When reasoning is enabled, the model’s reasoning content will appear in this field. In multi-turn conversations, you can pass this back to maintain continuity.

* reasoning\_details `array` Optional (**required for multi-turn tool-calling scenarios**)

  An array with detailed reasoning information. **In multi-turn tool-calling scenarios with reasoning enabled, you must pass this field back in full—especially the `signature` field—otherwise subsequent turns will not work properly.**

  Each element includes:

  * type `string` **Required**

    The reasoning content type, e.g. `reasoning.text`.

  * text `string` **Required**

    The reasoning text content.

  * signature `string` **Required**

    A signed credential for the reasoning content. **This is the key field for maintaining reasoning context across turns and must be passed back unchanged.** The signature is generated by the model to verify integrity and continuity.

  * format `string` **Optional**

    Signature format identifier, e.g. `anthropic-claude-v1`.

  * index `number` **Optional**

    Index of the reasoning segment.
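
One way to honor the pass-back requirement is to copy the returned assistant message into the conversation history without modifying it. The sketch below assumes the response message is already parsed into a dict. The helper name, the choice of fields to echo, and the sample values are illustrative only.

```python
import copy

# Fields this reference lists for Assistant messages that are echoed back.
ASSISTANT_FIELDS = ("content", "refusal", "tool_calls", "reasoning", "reasoning_details")


def assistant_message_for_history(response_message: dict) -> dict:
    """Copy a returned assistant message so it can be appended to `messages`."""
    message = {"role": "assistant"}
    for field in ASSISTANT_FIELDS:
        if response_message.get(field) is not None:
            # Deep copy so later edits cannot alter `signature` or other nested values.
            message[field] = copy.deepcopy(response_message[field])
    return message


# Made-up response message for illustration.
returned = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {"name": "get_weather", "arguments": "{\"city\": \"Paris\"}"},
    }],
    "reasoning_details": [{
        "type": "reasoning.text",
        "text": "Need the forecast.",
        "signature": "opaque-signature-value",
        "index": 0,
    }],
}
history_entry = assistant_message_for_history(returned)
```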

#### Tool message `object`

A message used to return the execution result of an external tool (function) call back to the model.

* content `string or array` **Required**

  The content of the tool execution result, typically text or structured data (serialized to a string).

  * Text content `string`

    The content of the Tool message.

  * Array of content parts `array`

    An array of content parts with defined types. For Tool messages, only the `text` type is supported.

    * text `string` **Required**

      Text content.

    * type `string` **Required**

      The type of the content part.

* role `string` **Required**

  The author role of the message; in this case, `tool`.

* tool\_call\_id `string` **Required**

  Corresponds to an `assistant` message’s `tool_calls[i].id`, used to associate this tool result with that call.

* name `string` **Optional**

  The tool name (usually matches the function name declared in `tools`).
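
The sketch below shows one possible way to turn a function tool call into a Tool message. Because the model's `arguments` string is not guaranteed to be valid JSON, it is parsed defensively. `get_weather` and `LOCAL_TOOLS` are hypothetical stand-ins for application code.

```python
import json


def get_weather(city: str) -> dict:
    """Hypothetical local tool; a real one would call a weather service."""
    return {"city": city, "forecast": "sunny"}


LOCAL_TOOLS = {"get_weather": get_weather}


def run_tool_call(tool_call: dict) -> dict:
    """Execute one function tool call and return the matching Tool message."""
    name = tool_call["function"]["name"]
    try:
        arguments = json.loads(tool_call["function"]["arguments"])
        if not isinstance(arguments, dict):
            raise TypeError("arguments must decode to a JSON object")
        result = LOCAL_TOOLS[name](**arguments)
    except (json.JSONDecodeError, KeyError, TypeError) as error:
        # Report the problem to the model instead of crashing the application.
        result = {"error": f"{type(error).__name__}: {error}"}
    return {
        "role": "tool",
        "tool_call_id": tool_call["id"],  # ties the result to the original call
        "name": name,
        "content": json.dumps(result),
    }


good_call = {
    "id": "call_1",
    "type": "function",
    "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
}
tool_message = run_tool_call(good_call)
```

Returning the error as tool content is a design choice: it lets the model see what went wrong and possibly retry with corrected arguments.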

<Info title="Function message `object` **Deprecated by the official spec and not supported**" />

### model `string` **Required**

The model ID for this inference request, in the format `<provider>/<model_name>`, e.g. `openai/gpt-5`. You can find it on each model’s detail page.

### max\_completion\_tokens `integer or null` **Optional**

Limits the length of the model’s generated content, including reasoning. If omitted, the model’s default limit is used. The maximum generation length for each model is available on its detail page.

### temperature `number` **Optional**

* Default: `1`
* AGIPower does not enforce a range; values in `[0, 2]` are recommended.

Sampling temperature to control randomness: higher values yield more randomness; lower values yield more deterministic output. Typically tuned as an alternative to `top_p`.

### top\_p `number` **Optional**

* Default: `1`

Nucleus sampling parameter: the model samples only from the most likely tokens whose cumulative probability mass is within `top_p`. For example, `top_p = 0.1` means only the tokens that make up the top 10% of probability mass are considered.
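
To illustrate the idea, the toy function below keeps the most likely tokens until their cumulative probability reaches `top_p`. It is a conceptual sketch with a made-up distribution, not a description of how any particular model implements sampling.

```python
def nucleus(probabilities: dict[str, float], top_p: float) -> list[str]:
    """Return the most likely tokens whose cumulative probability first reaches top_p."""
    kept, cumulative = [], 0.0
    for token, probability in sorted(probabilities.items(), key=lambda item: -item[1]):
        kept.append(token)
        cumulative += probability
        if cumulative >= top_p:
            break
    return kept


next_token_probs = {"cat": 0.5, "dog": 0.3, "fish": 0.2}  # made-up distribution
narrow = nucleus(next_token_probs, top_p=0.1)  # only "cat" is kept
wide = nucleus(next_token_probs, top_p=0.7)    # "cat" and "dog" are kept
```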

### n `integer or null` **Optional**

Number of candidate responses to return. Currently only `n=1` is supported.

### frequency\_penalty `number or null` **Optional**

* Default: `0`
* Range: `-2.0` to `2.0`

Penalizes tokens that have appeared frequently. Higher values reduce repetition and can help avoid mechanical echoing.

### presence\_penalty `number or null` **Optional**

* Default: `0`
* Range: `-2.0` to `2.0`

Penalizes tokens based on whether they have appeared at all. Higher values encourage introducing new topics and reduce repeatedly discussing the same content.

### stop `string | array | null` **Optional**

* Default: `null`
* Up to 4 stop sequences

When the generated output matches any stop sequence, the model stops generating and the stop sequence is not included in the response. Some newer reasoning models (e.g. `o3`, `o4-mini`) do not support this parameter.

### logit\_bias `object` **Optional**

* Default: `null`

Used to fine-tune sampling probabilities for specific tokens. Keys are token IDs (integers) from the tokenizer; values are biases between `-100` and `100`.

* Positive: increase the chance of selecting the token
* Negative: decrease the chance of selecting the token
* Extreme values (e.g. `-100` or `100`): effectively ban the token or force its selection
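
A small helper can check the range before a request is sent. In this sketch the token IDs are invented placeholders (real IDs depend on the model's tokenizer), and the keys are converted to strings because JSON object keys are always strings.

```python
def make_logit_bias(biases: dict[int, float]) -> dict[str, float]:
    """Validate biases and convert token IDs to the string keys JSON requires."""
    for token_id, bias in biases.items():
        if not -100 <= bias <= 100:
            raise ValueError(f"bias for token {token_id} must be between -100 and 100")
    return {str(token_id): bias for token_id, bias in biases.items()}


# Hypothetical token IDs: suppress one token, gently favor another.
logit_bias = make_logit_bias({1234: -100, 5678: 25})
```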

### logprobs `boolean or null` **Optional**

* Default: `false`

Whether to include log probabilities for output tokens in the response.

### top\_logprobs `integer` **Optional**

Specifies the **number of most likely tokens** to return at each position (0–20), each with its logprob.

### tools `array` **Optional**

Declares a list of tools the model can call in this conversation. Each element can be a custom tool or a function tool (a function defined via JSON Schema).

### tool\_choice `string or object` **Optional**

Controls the model’s tool-usage strategy (see the [OpenAI API reference](https://platform.openai.com/docs/api-reference/chat)):

* `"none"`: do not call any tools
* `"auto"`: let the model decide whether and which tools to call
* `"required"`: the model must call at least one tool in this turn
* Specify a single tool: `{"type": "function", "function": {"name": "my_function"}}`
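
Putting `tools` and `tool_choice` together, a request body might look like the sketch below. The tool definition layout follows the OpenAI-compatible function tool format, which is an assumption here because this page does not spell it out. `get_weather` is a hypothetical function.

```python
import json

# Hypothetical function tool in the OpenAI-style layout (assumed, see above).
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {  # JSON Schema describing the function arguments
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

payload = {
    "model": "openai/gpt-5",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": [weather_tool],
    # Force this specific tool; "none", "auto", and "required" are the string options.
    "tool_choice": {"type": "function", "function": {"name": "get_weather"}},
}
body = json.dumps(payload)
```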

### parallel\_tool\_calls `boolean` **Optional**

* Default: `true`

Whether to allow the model to call multiple tools (functions) **in parallel** within a single response.

### reasoning\_effort `string` **Optional** (reasoning models)

Controls how much effort a **reasoning model** puts into thinking: `none`, `minimal`, `low`, `medium`, `high`, `xhigh`, etc. Defaults and supported ranges vary by model.

### verbosity `string` **Optional**

* Default: `"medium"`

Constrains output verbosity: `low` (concise), `medium` (balanced), `high` (more detailed).

### web\_search\_options `object` **Optional**

Configures the behavior of the **web search tool**, enabling the model to proactively retrieve up-to-date information from the internet before answering.

### metadata `object` **Optional**

Allows up to 16 key-value pairs as structured business metadata for logging, retrieval, or querying in management UIs.

### stream `boolean or null` **Optional**

* Default: `false`

Whether to enable **streaming output** (Server-Sent Events). When `true`, results are returned as an event stream in chunks.

### stream\_options `object` **Optional**

Only effective when `stream: true`, used to configure streaming behavior, such as whether to include usage information at the end of the stream.
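
The sketch below shows one way to consume a streamed response once the raw lines are available. It assumes the OpenAI-compatible chunk shape (`choices[0].delta.content`), the `[DONE]` end marker, and the `include_usage` option name, none of which are specified on this page. The sample lines are fabricated for illustration.

```python
import json
from typing import Iterable, Iterator

# Request fields that turn streaming on; `include_usage` is an assumed option name.
stream_fields = {"stream": True, "stream_options": {"include_usage": True}}


def iter_stream_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded JSON chunks from Server-Sent Events `data:` lines."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # assumed OpenAI-style end-of-stream marker
            return
        yield json.loads(data)


def collect_text(chunks: Iterable[dict]) -> str:
    """Concatenate the assumed `choices[i].delta.content` fragments."""
    parts = []
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            parts.append(choice.get("delta", {}).get("content") or "")
    return "".join(parts)


sample_lines = [
    'data: {"choices": [{"index": 0, "delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "lo"}}]}',
    "",
    "data: [DONE]",
]
text = collect_text(iter_stream_chunks(sample_lines))
```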

### provider `object` **Optional**

Used to configure routing and failover across multiple model providers (e.g., OpenAI, Anthropic, Google) for this request.\
If not specified, the project’s or model’s default routing strategy is used.

#### routing `object` **Required**

Routing policy configuration that determines how requests are selected and distributed among multiple providers.

##### type `string` **Required**

Routing type. Supported values:

* `priority`\
  Select providers by priority order: try the first, then the next on failure (can be used with fallback).
* `round_robin`\
  Round-robin distribution: evenly distribute request traffic across providers.
* `least_latency`\
  Lowest-latency first: choose the currently fastest provider based on historical/real-time stats.

##### primary\_factor `string` **Optional**

The primary consideration when multiple providers are available. For example:

* `cost`\
  Prefer lower-cost providers
* `speed`\
  Prefer faster-responding providers
* `quality`\
  Prefer higher-quality providers (e.g., stronger models / more stable service)

Actual behavior works in conjunction with `type`. For example, when `type = "priority"`, `primary_factor` mainly affects the priority sorting logic.

##### providers `array` **Required**

The list of model providers that can participate in routing. Example: `["openai", "anthropic", "google"]`

#### fallback `string` **Optional**

Failover strategy. When the currently selected provider fails (e.g., timeout, insufficient quota, service unavailable), how to automatically switch:

* `"true"`: Enable automatic failover. When the current provider is unavailable, automatically try other available providers in the list according to the routing policy.
* `"false"`: Disable failover. If the current provider call fails, return an error immediately without trying other providers.
* `"<provider_name>"`: Explicitly specify a fixed fallback provider, e.g. `"anthropic"`:
  * Use the provider selected by the primary routing policy first.
  * If it fails, switch to the specified fallback provider.
  * If both the primary and the fallback provider fail, return an error.
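
A `provider` object combining these fields could be built as in the sketch below. The `make_provider_config` helper and its validation are illustrative assumptions. Only the field names and values come from the descriptions above.

```python
ROUTING_TYPES = {"priority", "round_robin", "least_latency"}


def make_provider_config(providers, routing_type="priority", primary_factor=None, fallback=None) -> dict:
    """Assemble a `provider` object, checking the fields marked as required."""
    if routing_type not in ROUTING_TYPES:
        raise ValueError(f"unsupported routing type: {routing_type}")
    if not providers:
        raise ValueError("`providers` must list at least one provider")
    routing = {"type": routing_type, "providers": list(providers)}
    if primary_factor is not None:
        routing["primary_factor"] = primary_factor
    config = {"routing": routing}
    if fallback is not None:
        config["fallback"] = fallback  # a string: "true", "false", or a provider name
    return config


payload = {
    "model": "openai/gpt-5",
    "messages": [{"role": "user", "content": "Hello!"}],
    "provider": make_provider_config(
        ["openai", "anthropic", "google"],
        routing_type="priority",
        primary_factor="cost",
        fallback="anthropic",
    ),
}
```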

### model\_routing\_config `object` **Optional**

Used to configure selection and routing across different models **within the same provider** for this request (e.g., how to choose between `gpt-4o` and `gpt-4-turbo`).

If not specified, the project or SDK default model selection strategy is used (e.g., default model, default task-type mapping, etc.).

#### available\_models `array` **Required**

A list of **model names** available for routing or as candidates.

#### preference `string` **Optional**

Preferred model name.

#### task\_info `object` **Optional**

Task metadata used to decide the specific model or parameters **based on task type and complexity**.

Fields:

##### task\_type `string` **Required**

Task type, expressing what the request is for, to support routing or automatic parameter selection.

* Example supported values:
  * `"chat"` — conversational tasks (multi-turn chat, assistant Q\&A)
  * `"completion"` — general text generation/completion
  * `"embedding"` — vectorization/semantic embedding
* Purpose:
  * Set different default models or quota policies by task type
  * Work with `complexity` to decide whether to use stronger models

##### complexity `string` **Optional**

Task complexity, describing the difficulty or importance of the request.

* Supported values:
  * `"low"` — simple tasks (short answers, simple rewrites)
  * `"medium"` — moderate complexity (general Q\&A, basic code, routine analysis)
  * `"high"` — high complexity (long-document analysis, complex programming, large-scale reasoning)
* Purpose:
  * Choose models at different tiers based on complexity (e.g., cheaper models for low complexity; stronger models for high complexity)
  * Also used to control timeouts, retry strategies, etc.

##### additional\_properties `object` **Optional**

Task-related extension fields, as free-form key-value pairs.

#### additional\_properties `object` **Optional**

Extension fields for the model routing configuration itself, used to attach extra control information beyond the standard structure.
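
As a rough sketch, a `model_routing_config` for a simple chat task might look like this. The model names and the contents of `additional_properties` are placeholders, and whether names need a provider prefix is not specified on this page.

```python
model_routing_config = {
    "available_models": ["gpt-4o", "gpt-4-turbo"],  # placeholder model names
    "preference": "gpt-4o",
    "task_info": {
        "task_type": "chat",
        "complexity": "low",
        # Free-form key-value pairs; this key is purely illustrative.
        "additional_properties": {"team": "support"},
    },
}

payload = {
    "model": "openai/gpt-5",
    "messages": [{"role": "user", "content": "Summarize this ticket in one line."}],
    "model_routing_config": model_routing_config,
}
```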

### reasoning `object` **Optional**

Used to configure behaviors related to the reasoning process (chain-of-thought / reasoning trace), including whether to enable it, depth/length controls, and whether to expose reasoning content externally.

If not specified, the system or model uses its default reasoning strategy.

#### enabled `boolean` **Required**

Whether to enable explicit reasoning.

* `true`: the model uses (and, when allowed, outputs) more detailed reasoning steps
* `false`: the model provides only a conclusion (or minimizes explicit reasoning)

#### effort `string` **Optional**

Reasoning effort level, balancing **depth / granularity** against **cost / latency**.

* Supported values:
  * `"low"` — lightweight reasoning: faster answers, fewer details
  * `"medium"` — moderate reasoning: a balanced choice for most tasks
  * `"high"` — deep reasoning: more detailed analysis, higher token usage and latency
* Typical usage:
  * Latency-sensitive online services: prefer `"low"` or `"medium"`
  * Mission-critical correctness: prefer `"high"`

#### max\_tokens `number` **Optional**

Maximum token limit for the reasoning process (not the final answer).

#### exclude `boolean` **Optional**

Whether to **exclude reasoning content from the user-visible response**.

* `false`:
  * Reasoning can be returned alongside the final answer (e.g., during debugging/tool development)
* `true`:
  * Reasoning is used internally only and not exposed to the user (typical production setting)
* Purpose:
  * Meet security/compliance requirements (do not expose chain-of-thought)
  * In development/debugging, set to `false` to observe the model’s reasoning and iterate on prompts/policies

#### usage `object` **Optional**

Usage statistics

##### include `boolean` **Required**

Whether to include usage statistics in the response
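
For example, a production-style request that asks for deep reasoning but keeps the reasoning text out of the response could set the fields as follows. The token budget is an arbitrary illustrative number.

```python
reasoning = {
    "enabled": True,
    "effort": "high",    # deeper analysis, higher token usage and latency
    "max_tokens": 2048,  # illustrative cap on reasoning tokens, not the final answer
    "exclude": True,     # keep the reasoning out of the user-visible response
}

payload = {
    "model": "openai/gpt-5",
    "messages": [{"role": "user", "content": "Plan a database migration."}],
    "reasoning": reasoning,
}
```

During development, switching `exclude` to `False` would surface the reasoning for inspection.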

### response\_format `object` **Optional**

An object that specifies the required output format for the model.

Set to `{ "type": "json_schema", "json_schema": {...} }` to enable structured outputs and ensure the model matches the JSON Schema you provide.

Set to `{ "type": "json_object" }` to enable legacy JSON mode and ensure the generated message is valid JSON. For models that support it, `json_schema` is preferred.

#### Text `object`

* type `string` **Required**\
  The type of the response format being defined. Always `text`.

#### JSON schema `object`

* json\_schema `object` **Required**\
  The JSON Schema that defines the response format.
  * name `string` **Required**\
    The name of the response format. May contain only a-z, A-Z, 0-9, underscores, and dashes; maximum length is 64 characters.
  * schema `object` **Optional**\
    The schema for the response format, described as a JSON Schema object.
  * strict `boolean` **Optional**\
    Whether to strictly follow the JSON schema.
  * description `string` **Optional**\
    A description of the response format’s purpose. The model uses this description to determine how to respond in that format.
* type `string` **Required**\
  The type of the response format being defined. Always `json_schema`.

#### JSON object `object`

* type `string` **Required**\
  The type of the response format being defined. Always `json_object`.
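
A structured-output request and a defensive parse of the reply could be sketched as follows. The schema, the `city_weather` name, and the helper functions are hypothetical. The check on required keys is a lightweight stand-in for full JSON Schema validation.

```python
import json
import re

response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "city_weather",
        "description": "Weather summary for a single city.",
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "temperature_c": {"type": "number"},
            },
            "required": ["city", "temperature_c"],
            "additionalProperties": False,
        },
    },
}


def is_valid_format_name(name: str) -> bool:
    """Check the documented name rule: letters, digits, underscores, dashes; at most 64."""
    return re.fullmatch(r"[A-Za-z0-9_-]{1,64}", name) is not None


def parse_structured_reply(content: str) -> dict:
    """Decode the reply and confirm the keys the schema marks as required."""
    data = json.loads(content)
    required = response_format["json_schema"]["schema"]["required"]
    missing = [key for key in required if key not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return data


# Made-up message content, as it might appear in `choices[0].message.content`.
reply = parse_structured_reply('{"city": "Paris", "temperature_c": 18.5}')
```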

### Unsupported fields

| Field name               | Type          | Supported                                                   | Description                                                                                   |
| ------------------------ | ------------- | ----------------------------------------------------------- | --------------------------------------------------------------------------------------------- |
| audio                    | object/null   | ❌ Not supported                                             | Audio output parameters                                                                       |
| modalities               | array         | ❌ Not supported                                             | Output modality types                                                                         |
| functions                | array         | ❌ Not supported                                             | Deprecated; this parameter is not accepted                                                    |
| function\_call           | string/object | ❌ Not supported                                             | Deprecated; this parameter is not accepted                                                    |
| prompt\_cache\_key       | string        | ❌ Not supported                                             | Prompt cache key                                                                              |
| prompt\_cache\_retention | string        | ❌ Not supported                                             | Cache retention policy                                                                        |
| safety\_identifier       | string        | ❌ Not supported                                             | Safety identifier                                                                             |
| store                    | bool/null     | ❌ Not supported                                             | Store this conversation                                                                       |
| service\_tier            | string        | ❌ Not supported                                             | Service tier                                                                                  |
| prediction               | object        | ❌ Not supported                                             | Predicted outputs configuration                                                               |
| seed                     | int/null      | ❌ Not supported                                             | Random seed for sampling; deprecated                                                          |
| user                     | string        | ❌ Not supported                                             | Legacy user identifier; now primarily replaced by `safety_identifier` and `prompt_cache_key`. |
| max\_tokens              | int/null      | ❌ Not supported                                             | Deprecated; replaced by `max_completion_tokens`                                               |

## Response

### Non-streaming: returns a “complete chat completion object”

When `stream: false` (or omitted), the endpoint returns a complete **chat.completion** object. Its fields are described below.

***

### Top-level field: `choices`

#### choices `array`

A list of chat completion choices. Its length corresponds to `n` in the request. Currently only `n = 1` is supported, so it typically contains a single element.

***

### choices\[i] object

#### finish\_reason `string`

The reason the model stopped generating tokens. Common values include:

* `stop`: reached a natural stopping point or hit a stop sequence
* `length`: reached the maximum token limit specified in the request
* `content_filter`: content was omitted due to a content filter
* `tool_calls`: the model called a tool (`tool_calls`)
* `function_call`: the model called a function (legacy, deprecated)
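
Client code typically branches on `finish_reason`. The mapping below is one possible sketch. The action labels it returns are invented for illustration and are not part of the API.

```python
def next_action(choice: dict) -> str:
    """Map a choice's `finish_reason` to a hypothetical client-side action label."""
    reason = choice.get("finish_reason")
    if reason in ("tool_calls", "function_call"):
        return "run_tools"   # execute the requested tools, then send Tool messages
    if reason == "length":
        return "truncated"   # output hit the token limit; consider raising it
    if reason == "content_filter":
        return "filtered"    # content was omitted by a content filter
    if reason == "stop":
        return "done"        # natural stop or a stop sequence was hit
    return "unknown"         # tolerate values not listed above


action = next_action({"index": 0, "finish_reason": "tool_calls"})
```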

#### index `integer`

The index of this choice in the `choices` array, starting from 0.

#### logprobs `object`

Log probability information for this choice, used to inspect the probability distribution for each output token. Present only when `logprobs`-related parameters are set in the request.

***

### choices\[i].logprobs.content

#### content `array`

A list of “message content tokens” with log probability information. Each element describes a token and its candidate tokens:

* bytes `array`\
  A list of integers representing the UTF‑8 bytes for the token. In some languages or for emojis, a character may consist of multiple tokens; merging these bytes reconstructs the correct text. If the token has no byte representation, this is `null`.

* logprob `number`\
  The log probability of the token. If the token is not among the top 20 most likely tokens, `-9999.0` is typically used to indicate “extremely unlikely”.

* token `string`\
  The text representation of the current output token.

* top\_logprobs `array`\
  A list of the most likely candidate tokens and their log probabilities at this position. In rare cases, the returned count may be smaller than requested.
  * bytes `array`\
    UTF‑8 bytes for the candidate token; `null` if not available.
  * logprob `number`\
    The log probability of the candidate token.
  * token `string`\
    The text of the candidate token.
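
Because one character can span several tokens, text is best rebuilt by joining the `bytes` arrays before decoding. The sketch below uses fabricated entries in which an emoji's four UTF-8 bytes are split across two tokens. It also converts a log probability to a plain probability.

```python
import math


def decode_tokens(entries: list[dict]) -> str:
    """Rebuild text by concatenating each token's UTF-8 bytes before decoding."""
    raw = bytearray()
    for entry in entries:
        if entry.get("bytes") is not None:  # `bytes` may be null
            raw.extend(entry["bytes"])
    return raw.decode("utf-8")


def probability(logprob: float) -> float:
    """Convert a natural-log probability back to a probability in [0, 1]."""
    return math.exp(logprob)


# Fabricated entries: the emoji U+1F642 is split across the last two tokens.
entries = [
    {"token": "Hi", "logprob": -0.1, "bytes": [72, 105]},
    {"token": "\\xf0\\x9f", "logprob": -0.5, "bytes": [240, 159]},
    {"token": "\\x99\\x82", "logprob": -0.2, "bytes": [153, 130]},
]
text = decode_tokens(entries)
confidences = [probability(entry["logprob"]) for entry in entries]
```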

***

### choices\[i].logprobs.refusal

#### refusal `array`

A list of “refusal content tokens” with log probability information. When the model outputs a refusal, this is used to inspect token probabilities for the refusal text.

* bytes `array`\
  UTF‑8 bytes for the refusal token; `null` if not available.
* logprob `number`\
  Log probability of the refusal token; typically `-9999.0` when not in the top 20.
* token `string`\
  The text of a token within the refusal content.
* top\_logprobs `array`\
  The most likely candidate refusal tokens at this position.
  * bytes `array`\
    UTF‑8 bytes for the candidate refusal token.
  * logprob `number`\
    The log probability of the candidate refusal token.
  * token `string`\
    The text of the candidate token within the refusal content.

***

### choices\[i].message

#### message `object`

The complete chat completion message generated by the model.

***

### choices\[i].message fields

#### reasoning `string` (AGIPower extension)

Reasoning text content, used to display the model’s thought process or intermediate analysis. Whether it is actually returned depends on the model and the reasoning configuration in the request.

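#### reasoning\_details `array` (AGIPower extension)

Detailed reasoning entries, typically more complete or detailed than `reasoning`, that can serve as the primary carrier for chain-of-thought. Each element follows the structure described for `reasoning_details` under the Assistant message; in multi-turn tool-calling scenarios, pass the array back unchanged.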

#### content `string`

The main message content, typically the model’s natural-language reply to the user. Some multimodal models may return structured content, but overall it follows the OpenAI chat format.

#### refusal `string or null`

If the model refuses to fulfill the user request in this turn, this contains the refusal message text; otherwise `null`.

#### role `string`

The author role. For a model reply, it is `"assistant"`.

#### annotations `array`

A list of annotations. When using tools such as web search, it can carry URL citations and similar references.

* type `string`\
  The annotation type; currently always `url_citation`.
* url\_citation `object`\
  URL citation details when using web search.
  * end\_index `integer`\
    The index of the last character of this citation within the message `content`.
  * start\_index `integer`\
    The index of the first character of this citation within the message `content`.
  * title `string`\
    The title of the web resource.
  * url `string`\
    The URL of the web resource.

#### audio `object`

When audio output modality is requested, this object contains the model’s audio response data.

* data `string`\
  Base64-encoded audio bytes generated by the model, in the requested format.
* expires\_at `integer`\
  Unix timestamp (seconds) after which this audio response is no longer available on the server for subsequent multi-turn conversations.
* id `string`\
  The unique identifier of this audio response.
* transcript `string`\
  The transcript (transcribed text) corresponding to the audio content.

#### function\_call `object`

Deprecated function-call field, replaced by `tool_calls` and retained only for backward compatibility. Indicates the function name and parameters the model suggests calling.

* arguments `string`\
  Function arguments as a JSON string. Note that the model is not guaranteed to produce strictly valid JSON and may include fields not defined in the schema; you should parse and validate before invocation.
* name `string`\
  The function name to call.

#### tool\_calls `array`

The new tool-call list. Each element describes one tool call, which can be a “function tool call” or a “custom tool call”. Models may call multiple tools in parallel within a single response.

* id `string`\
  Unique ID of the tool call, used to match `tool_call_id` in subsequent `tool` messages.
* type `string`\
  Tool type. The current standard is `function`; AGIPower may support other types such as `custom` in extensions.
* function `object`\
  When `type = "function"`, indicates the function the model calls.
  * arguments `string`\
    Function call arguments as a JSON string. The model may not always generate valid JSON and may include fields not defined in the schema; validate before invocation.
  * name `string`\
    The function name to call.
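
Because `arguments` is not guaranteed to be valid JSON, it helps to parse defensively before invoking anything. The following is a minimal sketch under that assumption; `parse_tool_calls` and the sample message are hypothetical names used only for illustration.

```python
import json


def parse_tool_calls(message: dict) -> list:
    """Parse function tool calls, recording an error instead of raising on bad JSON."""
    results = []
    for call in message.get("tool_calls") or []:
        if call.get("type") != "function":
            continue  # other tool types (e.g. custom) are out of scope for this sketch
        fn = call["function"]
        args, error = None, None
        try:
            args = json.loads(fn["arguments"])
            if not isinstance(args, dict):
                args, error = None, "arguments must be a JSON object"
        except json.JSONDecodeError as exc:
            error = f"invalid JSON: {exc.msg}"
        results.append({"id": call["id"], "name": fn["name"], "arguments": args, "error": error})
    return results


# Hypothetical assistant message: one valid call and one with truncated JSON.
sample_message = {
    "role": "assistant",
    "tool_calls": [
        {"id": "call_1", "type": "function",
         "function": {"name": "search_city_weather", "arguments": '{"city": "Shanghai"}'}},
        {"id": "call_2", "type": "function",
         "function": {"name": "search_city_weather", "arguments": '{"city": '}},
    ],
}

parsed_calls = parse_tool_calls(sample_message)
for item in parsed_calls:
    print(item["id"], item["arguments"], item["error"])
```

A call with a non-empty `error` can be reported back to the model in a `tool` message rather than executed. Checking the parsed arguments against the tool's declared schema would be a natural next step.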

***

### Top-level fields: metadata and usage

#### created `integer`

Unix timestamp (seconds) when the chat completion was created.

#### id `string`

Unique identifier of this chat completion.

#### model `string`

Model identifier used for this completion, e.g. `openai/gpt-5`.

#### object `string`

Object type. For non-streaming responses, this is always `chat.completion`.

#### service\_tier `string`

The service tier/type used to process the request. AGIPower does not constrain values; if the upstream model returns this field, it will be passed through.

#### system\_fingerprint `string`

Backend configuration fingerprint for this request, used to identify the underlying service version or cluster. Passed through if provided upstream.

#### usage `object`

Usage statistics for this request, including token counts for prompts and completions.

* completion\_tokens `integer`\
  Number of tokens used in the generated completion.

* prompt\_tokens `integer`\
  Number of tokens used in the input prompt (messages, etc.).

* total\_tokens `integer`\
  Total tokens used (`prompt_tokens + completion_tokens`).

* completion\_tokens\_details `object`\
  Further breakdown of completion tokens.
  * accepted\_prediction\_tokens `integer`\
    When using Predicted Outputs, the number of predicted tokens that actually appeared in the completion. Typically unused by current models.
  * audio\_tokens `integer`\
    Tokens consumed by audio output generated by the model.
  * reasoning\_tokens `integer`\
    Tokens generated for the reasoning process (even if not fully shown to the user).
  * rejected\_prediction\_tokens `integer`\
    When using Predicted Outputs, the number of predicted tokens that did not appear in the completion; these tokens still count toward billing and context-window limits. Typically unused.

* prompt\_tokens\_details `object`\
  Breakdown of prompt tokens.
  * audio\_tokens `integer`\
    Tokens consumed by audio input in the prompt.
  * cached\_tokens `integer`\
    Tokens matched via prompt caching.
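
The breakdown fields can be combined into a few derived numbers. The sketch below assumes that `reasoning_tokens` is counted inside `completion_tokens` and `cached_tokens` inside `prompt_tokens`, as in the OpenAI API; confirm this for the model you use. `summarize_usage` is a hypothetical helper, and the sample values mirror the example response on this page.

```python
def summarize_usage(usage: dict) -> dict:
    """Derive a few convenience numbers from a `usage` object."""
    completion_details = usage.get("completion_tokens_details") or {}
    prompt_details = usage.get("prompt_tokens_details") or {}
    reasoning = completion_details.get("reasoning_tokens", 0)
    cached = prompt_details.get("cached_tokens", 0)
    return {
        # total_tokens is documented as prompt_tokens + completion_tokens
        "total_is_consistent": usage["total_tokens"]
        == usage["prompt_tokens"] + usage["completion_tokens"],
        # Assumption: reasoning tokens are part of completion_tokens
        "visible_completion_tokens": usage["completion_tokens"] - reasoning,
        # Assumption: cached tokens are part of prompt_tokens
        "uncached_prompt_tokens": usage["prompt_tokens"] - cached,
    }


sample_usage = {
    "completion_tokens": 629,
    "prompt_tokens": 13,
    "total_tokens": 642,
    "completion_tokens_details": {"reasoning_tokens": 384},
    "prompt_tokens_details": {"cached_tokens": 0},
}

summary = summarize_usage(sample_usage)
print(summary)
```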

***

### Streaming: returns multiple “chat completion chunk objects”

When `stream: true`, the endpoint returns a sequence of **chat.completion.chunk** objects via SSE (Server-Sent Events). Clients should consume the chunks in order and concatenate their incremental fields. The fields are described below in the same order as for the non-streaming response above.
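
A minimal sketch of reading such a stream without an SDK follows. It assumes the OpenAI-style wire format, where each event is a `data: <json>` line and the stream ends with `data: [DONE]`; this page does not spell that format out, so treat it as an assumption. `iter_chunks` and the sample lines are illustrative.

```python
import json
from typing import Iterable, Iterator


def iter_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield parsed chunk objects from decoded SSE text lines."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed end-of-stream sentinel
            break
        yield json.loads(payload)


# Hypothetical stream body, already split into text lines.
sample_lines = [
    'data: {"object": "chat.completion.chunk", "choices": [{"index": 0, "delta": {"content": "Hi"}}]}',
    "",
    ": keep-alive",
    'data: {"object": "chat.completion.chunk", "choices": [{"index": 0, "delta": {"content": "!"}}]}',
    "",
    "data: [DONE]",
]

chunks = list(iter_chunks(sample_lines))
print(len(chunks))
```

In practice the lines would come from the HTTP client's streaming response body, decoded to text.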

***

### Top-level field: `choices` (streamed chunks)

#### choices `array`

A list of completion choices. If `n > 1`, it can contain multiple elements. When `stream_options: {"include_usage": true}` is set, the final chunk may have an empty `choices` array and carry only `usage` information.

***

### choices\[i] (Chunk) object

#### delta `object`

Incremental content produced by the streaming model response—i.e., what is “new” compared to previous chunks.

* reasoning `string` (AGIPower extension)\
  Incremental reasoning text, used to stream reasoning information chunk by chunk.

* reasoning\_content `string` (AGIPower extension)\
  Incremental fragment of the main reasoning text, typically used together with `reasoning` to reconstruct the full reasoning text.

* content `string`\
  Incremental message content for this chunk. The client should concatenate `content` across chunks to build the full reply.

* function\_call `object` (deprecated)\
  Legacy incremental function-call information, replaced by `tool_calls` but still parseable.
  * arguments `string`\
    Incremental JSON fragment of function arguments; must be concatenated across chunks before parsing.
  * name `string`\
    The function name to call; typically appears in the first chunk of the call.

* refusal `string`\
  Incremental refusal message fragment for this chunk.

* role `string`\
  The author role for this message, typically `"assistant"` in the first chunk.

* tool\_calls `array`\
  Incremental tool-call information list.

  For each incremental tool-call element:

  * index `integer`\
    The position of this tool call within the `tool_calls` array.

  * function `object`\
    Incremental information for a function tool call.
    * arguments `string`\
      Incremental fragment of the JSON string for function-call arguments; must be concatenated across chunks before parsing.
    * name `string`\
      The function name to call; typically provided at the start of the tool call.

  * id `string`\
    The tool call ID; typically provided on first appearance for later association with `tool` messages.

  * type `string`\
    The tool type; currently only `function` is supported.
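
The concatenation rules above can be sketched as a small accumulator: `content` fragments are joined in order, and tool-call fragments are grouped by `index` before their `arguments` string is parsed. The helper name and the sample chunks are hypothetical, and the sketch follows only the first choice.

```python
import json


def accumulate_stream(chunks):
    """Rebuild the first choice's text and tool calls from streamed deltas."""
    content_parts = []
    calls = {}  # tool-call index -> partially assembled call
    for chunk in chunks:
        for choice in chunk.get("choices") or []:
            if choice.get("index", 0) != 0:
                continue  # this sketch only follows the first choice
            delta = choice.get("delta") or {}
            if delta.get("content"):
                content_parts.append(delta["content"])
            for tc in delta.get("tool_calls") or []:
                slot = calls.setdefault(tc["index"], {"id": None, "name": None, "arguments": ""})
                if tc.get("id"):
                    slot["id"] = tc["id"]
                fn = tc.get("function") or {}
                if fn.get("name"):
                    slot["name"] = fn["name"]
                slot["arguments"] += fn.get("arguments") or ""
    return "".join(content_parts), [calls[i] for i in sorted(calls)]


# Hypothetical chunks: two text fragments, then one tool call split in two.
sample_chunks = [
    {"choices": [{"index": 0, "delta": {"role": "assistant", "content": "Checking"}}]},
    {"choices": [{"index": 0, "delta": {"content": " the weather."}}]},
    {"choices": [{"index": 0, "delta": {"tool_calls": [
        {"index": 0, "id": "call_1", "type": "function",
         "function": {"name": "search_city_weather", "arguments": '{"ci'}}]}}]},
    {"choices": [{"index": 0, "delta": {"tool_calls": [
        {"index": 0, "function": {"arguments": 'ty": "Shanghai"}'}}]}}]},
]

text, tool_calls = accumulate_stream(sample_chunks)
arguments = json.loads(tool_calls[0]["arguments"])  # parse only after the stream ends
print(text, tool_calls[0]["name"], arguments)
```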

#### finish\_reason `string or null`

Why generation stopped for this chunk:

* `stop`: natural end or hit a stop sequence
* `length`: reached the maximum generation token limit
* `content_filter`: content was filtered
* `tool_calls`: tool call triggered
* `function_call`: legacy function call triggered
* `null`: not finished yet; more chunks will follow
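
One way a client might branch on these values is sketched below. The action labels returned by `next_action` are arbitrary names chosen for this example, not API values.

```python
def next_action(finish_reason):
    """Map a chunk's finish_reason to a hypothetical client-side action label."""
    if finish_reason is None:
        return "keep_reading"  # more chunks will follow
    if finish_reason in ("tool_calls", "function_call"):
        return "run_tools"  # execute the requested call(s), then send a follow-up request
    if finish_reason == "length":
        return "handle_truncation"  # output hit the generation token limit
    if finish_reason == "content_filter":
        return "handle_filtered"
    return "done"  # "stop": natural end or stop sequence


actions = [next_action(r) for r in (None, "tool_calls", "length", "stop")]
print(actions)
```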

#### index `integer`

The index of this choice in the `choices` array.

#### logprobs `object`

Log probability structure for the current chunk, same as non-streaming `logprobs`, but only for the “new” tokens.

***

### choices\[i].logprobs.content (streaming)

#### content `array`

A list of “message content tokens” newly generated in the current chunk.

* bytes `array`\
  UTF‑8 bytes for the current token.
* logprob `number`\
  Log probability for the current token; `-9999.0` if not in the top 20 most likely tokens.
* token `string`\
  Text representation of the current output token.
* top\_logprobs `array`\
  Candidate tokens most likely at this position.
  * bytes `array`\
    UTF‑8 bytes for the candidate token.
  * logprob `number`\
    Log probability for the candidate token.
  * token `string`\
    Text of the candidate token.
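
Two common uses of these fields are converting `logprob` to a probability and rebuilding text from `bytes` when a multi-byte character is split across tokens. The sketch below illustrates both; the helper names and sample entries are hypothetical, and it assumes `bytes` may be missing or null for some tokens.

```python
import math


def token_probability(entry: dict) -> float:
    """Convert a log probability to a probability in [0, 1]."""
    return math.exp(entry["logprob"])


def decode_token_bytes(entries: list) -> str:
    """Join the UTF-8 bytes of consecutive tokens, then decode once."""
    raw = bytearray()
    for entry in entries:
        raw.extend(entry.get("bytes") or [])  # assumption: bytes may be absent or null
    return raw.decode("utf-8")


# Hypothetical entries: the character "é" (bytes 195, 169) is split across two tokens.
sample_logprobs = [
    {"token": "caf", "logprob": 0.0, "bytes": [99, 97, 102], "top_logprobs": []},
    {"token": "\\xc3", "logprob": -0.5, "bytes": [195], "top_logprobs": []},
    {"token": "\\xa9", "logprob": -0.25, "bytes": [169], "top_logprobs": []},
]

decoded = decode_token_bytes(sample_logprobs)
probabilities = [token_probability(e) for e in sample_logprobs]
print(decoded, probabilities)
```

Decoding token by token would fail on the second entry, which is why the bytes are joined first. The `-9999.0` sentinel maps to a probability of `0.0`.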

***

### choices\[i].logprobs.refusal (streaming)

#### refusal `array`

A list of “refusal content tokens” newly generated in the current chunk.

* bytes `array`\
  UTF‑8 bytes for the refusal token.
* logprob `number`\
  Log probability for the refusal token; `-9999.0` if not in the top 20 most likely tokens.
* token `string`\
  Text of a token within the refusal content.
* top\_logprobs `array`\
  Candidate refusal tokens most likely at this position.
  * bytes `array`\
    UTF‑8 bytes for the candidate refusal token.
  * logprob `number`\
    Log probability for the candidate refusal token.
  * token `string`\
    Text of the candidate token.

***

### Other top-level streaming fields

#### created `integer`

Unix timestamp (seconds) when the chat completion was created. The value is the same for all chunks in a stream.

#### id `string`

Unique identifier of the chat completion. All chunks in the same stream share the same `id`.

#### model `string`

Model name used for this completion.

#### object `string`

Object type. For streaming responses, this is always `chat.completion.chunk`.

#### service\_tier `string`

The service tier/type used to process the request. Passed through if provided upstream.

#### system\_fingerprint `string`

Fingerprint of the backend configuration used for this request. Although marked Deprecated by some upstream providers, AGIPower still preserves and passes through this field.

***

### usage `object` (included only in the final chunk)

When `stream_options: {"include_usage": true}` is set, the final chunk includes the `usage` object; its structure is the same as the non-streaming response.

* completion\_tokens `integer`\
  Number of tokens used in the completion.

* prompt\_tokens `integer`\
  Number of tokens used in the prompt.

* total\_tokens `integer`\
  Total tokens used in this request.

* completion\_tokens\_details `object`\
  Completion token breakdown.
  * accepted\_prediction\_tokens `integer`\
    Number of predicted tokens accepted in the completion.
  * audio\_tokens `integer`\
    Tokens related to model-generated audio.
  * reasoning\_tokens `integer`\
    Tokens used by the model for reasoning.
  * rejected\_prediction\_tokens `integer`\
    Number of predicted tokens not used but still counted toward usage.

* prompt\_tokens\_details `object`\
  Prompt token breakdown.
  * audio\_tokens `integer`\
    Audio input tokens in the prompt.
  * cached\_tokens `integer`\
    Tokens matched via caching.
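
Since the usage-bearing chunk may arrive with an empty `choices` array, a consumer should not assume `choices[0]` exists on every chunk. A minimal sketch with a hypothetical helper and hand-written sample chunks:

```python
def collect_text_and_usage(chunks):
    """Return the concatenated reply text and the usage object, if one was streamed."""
    text_parts, usage = [], None
    for chunk in chunks:
        if chunk.get("usage") is not None:
            usage = chunk["usage"]  # expected only on the final chunk
        for choice in chunk.get("choices") or []:  # may be empty on the final chunk
            text_parts.append((choice.get("delta") or {}).get("content") or "")
    return "".join(text_parts), usage


# Hypothetical stream: two content chunks followed by a usage-only chunk.
sample_stream = [
    {"choices": [{"index": 0, "delta": {"content": "Hel"}, "finish_reason": None}], "usage": None},
    {"choices": [{"index": 0, "delta": {"content": "lo"}, "finish_reason": "stop"}], "usage": None},
    {"choices": [], "usage": {"prompt_tokens": 13, "completion_tokens": 2, "total_tokens": 15}},
]

reply, final_usage = collect_text_and_usage(sample_stream)
print(reply, final_usage)
```

With the OpenAI Python SDK used in the request examples on this page, the option would typically be passed as `stream=True, stream_options={"include_usage": True}`.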

<Card title="POST /v1/chat/completions">
  ```TypeScript theme={null}
  import OpenAI from "openai";

  const openai = new OpenAI({
    baseURL: 'https://api.agipower.ai/v1',
    apiKey: '<AGIPower_API_KEY>',
  });

  async function main() {
    const completion = await openai.chat.completions.create({
      model: "openai/gpt-5",
      messages: [
        {
          role: "user",
          content: "What is the meaning of life?",
        },
      ],
    });

    console.log(completion.choices[0].message);
  }

  main();
  ```

  ```Python theme={null}
  from openai import OpenAI

  client = OpenAI(
      base_url="https://api.agipower.ai/v1",
      api_key="<AGIPower_API_KEY>",
  )

  completion = client.chat.completions.create(
      model="openai/gpt-5",
      messages=[
          {
              "role": "user",
              "content": "What is the meaning of life?"
          }
      ]
  )

  print(completion.choices[0].message.content)
  ```

  ```cURL theme={null}
  curl https://api.agipower.ai/v1/chat/completions \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $AGIPower_API_KEY" \
    -d '{
      "model": "openai/gpt-5",
      "messages": [
        {
          "role": "user",
          "content": "What is the meaning of life?"
        }
      ]
    }'
  ```
</Card>

<Card title="API Response">
  ```json theme={null}
  {
    "id": "dc41ec9a378d43a497ca2daff171ceb0",
    "model": "openai/gpt-5",
    "choices": [
      {
        "finish_reason": "stop",
        "message": {
          "role": "assistant",
          "content": "There isn’t a single, objective answer. Different traditions offer different meanings, and most people end up constructing their own.\n\n- Religious: To know or serve God, live virtuously, and love others.\n- Existential/humanist: Life has no built‑in meaning; you create it through choices, authenticity, and responsibility.\n- Scientific-naturalist: There’s no cosmic purpose; meaning comes from conscious experience—relationships, curiosity, creativity, and contribution.\n- Eudaimonic (Aristotle): Flourish by developing virtues, using your strengths, and living in accord with reason and values.\n- Eastern philosophies: Reduce suffering, cultivate compassion, and see through the illusion of a separate self.\n\nA practical way to find meaning:\n- Clarify your values (what you’d stand for even if it’s hard).\n- Invest in relationships and service.\n- Learn and create; pursue mastery in something that matters to you.\n- Contribute beyond yourself—help, build, protect, or heal.\n- Savor and be present; cultivate gratitude and awe.\n\nA simple summary many find helpful: Love well, learn continuously, and leave the world a little better than you found it.",
          "refusal": null,
          "annotations": [],
          "reasoning": "**Considering the meaning of life**\n\nI need to answer concisely but thoughtfully. The question is philosophical, so I should present various perspectives: religious, existential, scientific, and personal. It might be useful to suggest a practical framework for finding meaning, focusing on relationships, personal growth, and contributions. While a general response is appropriate, I should clarify that there’s no single objective answer. I can mention common themes like connection, creativity, and love, and propose questions for reflection. A nice one-liner could be about creating meaning through conscious engagement.",
          "reasoning_details": [
            {
              "index": "0",
              "format": "openai-responses-v1",
              "type": "reasoning.summary",
              "summary": "**Considering the meaning of life**\n\nI need to answer concisely but thoughtfully. The question is philosophical, so I should present various perspectives: religious, existential, scientific, and personal. It might be useful to suggest a practical framework for finding meaning, focusing on relationships, personal growth, and contributions. While a general response is appropriate, I should clarify that there’s no single objective answer. I can mention common themes like connection, creativity, and love, and propose questions for reflection. A nice one-liner could be about creating meaning through conscious engagement."
            },
            {
              "id": "rs_0639a0762f01111400696766d7af48819388646c9544e1107c",
              "index": "0",
              "format": "openai-responses-v1",
              "type": "reasoning.encrypted",
              "data": "gAAAAABpZ2br9iURFxvdEjmaRGKcjutfnC2dVpSTQxh8Vjel9pkdkU6b6sX_JjARvh4aU-hI9c4ZfGjWAze2FfWqfvNyGN55ljlnX9wHRTK6OR9VWyezo7PoXDS4uJPV62OjA5DvDrj6KZeMcxUnEo54XORRqgGbqCR6R0Pv1q2YoFfJZh0gVBdakKDTlm4JEb6o5hIEg9b1jh1mNxu-SyCxuIecmE_ZsDYphWyLu3S1jPM-ieNTJ97GLfiefbqk-SostjrIKpiVtrGMU0cHS7FYk01X260lXAAf54jqdMzF8Haw08m0zs0vTABPfP3WK5RCOlHd_EuEsabuZoZXwqyWkAA9G3l0i-0xlXnPNZlXwcUlfqZto6aszy-XPPUDXfpIZqEEpcF2ikXSdTSTOMxAtSb2Q1lUnI4rN45-dOonjJ_VltIHXJCf9c-wbF3d-9ymPDwhib4VnlNTbH03I6SK-_PebVkTF1efcaL5MonE0_lypsNn4ZF-T3wpp1jGTke5mMv8qjChJYUaO5C7eGugmM6pvxnAFBr375Wic-rh1wlBrPEtmXPLVO-TqCGNddB-Vrg0HVblXOphr1gPXcuE8VpGw40PtiT9YqYDaAlZRLZpxJfB9hAxtKDfgqh5f5TqfrXjuUJSeT6sQPgCv4vHulpwSWKNOh5PpCvW5FS1HHvPXW1d5WERDl_dngxRWU4NuIi0MlSLV5kd_oTOOM4AVRSYK0TA4o8YpAZVlVYGVp9b5Vs1rhVl56ga_iOBfiRw16Tb7nO7V-vcwrBQLOYiFixuE0Em5UAEaLp_wxP12QqoRSRezFTHkNT9ietR03Z38H8SzwbPoPB2XiI9pe5KxJGQ2cccdS9s5o4_Btj8kp9q9n2rqFg0Cuv-WChnzhgX8u5zrk1cAqCNhr5uul-RdJLWCz9IH35oOe14umu8ymaN4D1x1VTY5uPef7OrjYyYXqTQa-CMUFqw3qShwBftlZDfF6rLMgKUiEBP93ERFNBIMoBIn-BVEdi5yjImIUkH_q1iVyhtQTEHUh7TMF7_i2vWZUB-NXIPs9Zqt76pH-tKukLWvDrHqeajwvtt9d6X4xks9oGzepnWmL2nyFggLD24R8-59Sc5dco-Ssr91TfUpm8VrJXqUTtcMcWuCoY0i93MT8ty5Bc0hYQ23-vzZdyS0Rm6dO26HDXrvZ9TGL4uW_QXNBX6q51qlQ_xr4m51JU8Wul_You9-M03dO99LkdljtF5nKsnZNdiWGRnF9oFmokdHFAqfBM6KjLZUUkDsVG6hLElejg89t0kymwUJfao21MMCb56E2G6QtUOx4vf8F3myDFhOX3zrAAhoJ-Bw7rK3s2esbnDBn96ZzKoyGOLHm54kQM2_Rs9qQdjflxZ4WKhXoEJwz9H1uHILBMVbrl1aTu_ReYb8xJPVR5oB7Ky_1GPoeG82QntVExCJDZpb4fAqpzFzuV6B7GsVF6Z0cyeyPi3TGEjxSLxYqGWVMBSEsokx8USEET0T7ytiHpVQ4cOr2eimLzDp-hJbZKGEufU6Tnh9RZA2-0Q87X57RaoAydY6brj9S3tTAy2Iz8m_-qEGLXjUr6ffDg3lNMGQhFvN-YAWbdmidbZfCVQR1Oc6A6-ayowaHpyUeff7PxQFXaQ7k3P0W7p1N3VLTjC3lNk2gSPyq_6MvLmxXOlGLj_50Q1OLAFn0bK7knhFf8t7gS7MjOXMQl9PiSbtQL9URHrPeMYKjpQGa84rOnZzC8G9RXvzKatVHB0NpKO02DeTY4hzsMw-Wj73-ZpBSSiyOlTpuVVNxma83krKqMqU_9kX09mNWB6UKrm9v7RxFuOjyVd5x35iodmPUbaXbzqETubPRzVKedLAhaYVTZp1J_qWvLVPoSImyFrM0IPB2Jy5ksqqAbbDjTy3l6Jp3pNu-IhiACVA1JlxRQ67Esb7JaK3ZakR3ExWSPDgxonqX8YvS6dr0UM2
tjpOnurQc5NUSYBwo9vHzQxbWVuBATJaSUqe0IrJKPyvErRoEtFGjKZ8CvZagw1-MfD0KTLAmzR3hYAXKADsMRibEXf8-SPUrnuvm4OsRj1Gg7jl4k_ITYjOiRLzBMvVVxxRFfAhR7BFYBC1H0dClGTy4yxPKDNUR9HctiuQFO2-Q4Sw4dEqnTYSwCJS4Zaw5DHvqbDh9JK3AKdatRHHImqOxxtUxiJ8IaQcd2n_CaNbIekuuqUclwnjW8IJquTAPDJX0MhsyBY3nXJMVfeyCFO0D0g8OcvCH_9pFrsGgpTb7DFloDeTfCFUfY0GGGtfuhSL3qDggFAurf9H3cN73dOW5wujFOTGAbWG8aHf2Rok_H06fcg4zJSu5TnHkoJjdyc5n_NIo1RATiKwNkSFHwc_2-RnrnmOVl4125ufyqqrvuENapGWm8xGySQW1Zb39AKdUpBr4zEgU_M3PR6D0ujubsJLncgO8X6DwQ47QlGjPYmnjG_-q3O3plr-ShFJQOZqBvSgtdcqQBu0LK8I3vLXjHkQweUsVRzxlbwOYFMjmYOFWzxq2gP86-4TldrnOsUw0afewm0s_d6N8t2F_mvEgmJ5fPA3KXIQ7Fjaqxt_KUgqZqA4j3wGaAqI89QUc2HwU7bVFrLvLa019bJMj4az7WYmw1ajorD0C8dB2tLMjGdVHul_oEod0vyoCt-7I7qxZhkoW24ULSsmtPpSu0zV_gK0runwxjx1csxkHQP-MeoJry_F_D2jhgEmeJjamddbyT2TcQ7S3FS3uNDQyl6agzXq3rRdX9VlUatq9LpUCqL6U7WrA8JlEyFSJVm9W0pYaqjPiHiP47twkjl3txuKraV-Wkg4TrjlcMM3IqkMcAvySekuZGbIhjRscByTmDL-sESsMVG5dV8NU33HwnL9wLyZZ416JF927SfRTkF7DRrl-PRVX-lLNtmoXXSFCBdMfiUhvfWLR7r44ZxMRJCLacN1dw49XDyzANSfRmQySGmWhYUjUej6bLy9bdL5HP21O1u_9XUFWc_boI0a7tphBlMiUBGV7jAKlN9QrMAJVUBamHM3GmabbmVpFrvnuYd5bD_iJN0BY6cZb9lWDs6P6yHip8SoMO9VM8ykcdTfLOqp_IhlUkD3eZ0cSObuPHPs4HfiFHlG6qLLBtT_ytUeIDc5VMjA_6i0mKm85HhqWdB_MWoqE-aSPpAEtmQTLPUyyxpYrMYtWJ_OUqBxiU3CiV9G1QS8oU2gMq60w0OCDoy1F-oxnOLpJIrDhnDTAXlYnbFlYkEAIb9QDn7UDfitHrPqaUwShDHX7XXVbuYYJMIJs2XXnOViviNn5SbVkSDPyt4xi-UfPKpcTJCmmOSvZn-fs3BdO7oGdZC8UmBM6sVmgxOPL361DcEs6fsLKhqKwVLqDS-CYmT811dqja2CcnTmHIQrO6Wg_hEi5C1YW0iA1stpw461VDh86rHRslJSIn6kDJ9W_X-3vsTUpk62jUs6Bv1KkoyhcojCvgXtDr7ff5mTqTbzX9d76yVwW97xqA86SgntP-N6cNE2GcBKaXea32gjGskvFDV5w7-DGoxeZrNM1Ur5-S3ADFDE-A2mrQCxbm66xcB8KNK181k3QWLrlrKWKNMCZLgkFxuXbD2plxgPDWaqaJxFoDibjHHS94JXhBMu3KB6_CziqK7irU3OHsqEGc7ZDHS4araDurJUlr_UhH4UTsS9pOsxF5XniWdyNBdr6CKSrSC0SIw9YUi39X9CLp5mzWspRssOwUhd1ECVkLgOF8yv5g="
            }
          ]
        },
        "index": 0,
        "logprobs": {
          "content": [],
          "refusal": null
        }
      }
    ],
    "usage": {
      "completion_tokens": 629,
      "prompt_tokens": 13,
      "total_tokens": 642,
      "completion_tokens_details": {
        "reasoning_tokens": 384
      },
      "prompt_tokens_details": {
        "cached_tokens": 0
      }
    },
    "created": 1768384213,
    "object": "chat.completion",
    "service_tier": "default"
  }
  ```
</Card>

## Multi-turn tool-calling scenarios: pass back reasoning\_details and signature

When you use an Anthropic model such as Claude Opus 4.5 with `reasoning` enabled in a multi-turn tool-calling conversation, you **must pass back the previous turn’s `reasoning_details` (including the `signature` field) in full**. Otherwise, subsequent turns will not work properly.

<Warning title="Important">
  * **Background**: Claude Opus 4.5 natively uses the Anthropic Messages protocol, and AGIPower converts the Chat Completion protocol to the Messages protocol.
  * **Issue**: When reasoning is enabled, the second turn in a tool-calling conversation must include the reasoning `signature`, which is used to verify the integrity and continuity of the reasoning content.
  * **Solution**: Pass back the previous assistant message’s `reasoning_details` field in full and unmodified, together with its `reasoning` text.
</Warning>

### Request example

```json theme={null}
{
  "model": "anthropic/claude-sonnet-4.5",
  "messages": [
    {
      "content": "Today is August 15, 2025. What is the weather like in Shanghai today?",
      "role": "user"
    },
    {
      "role": "assistant",
      "tool_calls": [
        {
          "function": {
            "name": "search_city_weather",
            "arguments": "{\"city\":\"上海\"}"
          },
          "id": "toolu_bdrk_01S7xyqV3GYLJYrvBC5SwtPP",
          "type": "function"
        }
      ],
      "content": "",
      "reasoning": "用户想知道2025年8月15日上海的天气情况。我需要使用search_city_weather函数来查询。\n\n参数：\n- city: \"上海\"\n- date: \"2025-08-15\"",
      "reasoning_details": [
        {
          "type": "reasoning.text",
          "text": "用户想知道2025年8月15日上海的天气情况。我需要使用search_city_weather函数来查询。\n\n参数：\n- city: \"上海\"\n- date: \"2025-08-15\"",
          "signature": "EscCCkgICxABGAIqQF3ngnbIR+15nndalNEqnr7vq0v0Hyvle+twPh2SCMpMmNKf1oXiRPsjZG6Z46M69x06wks+4jm4N4FO3RH2mkgSDLChkfyKfk3ZndjatxoMi+H4ghd4hlGd+MRVIjBLKGRIcRwXS09pK50C2/ygvhnTlVMPkcARYG3nXV2ZWr2IPRHzY9XAK6QBJeVrmcsqrAGoL7TTMBUsMqMkfXlcRYABi+OPDht/9BOPKnV1k0RIWnnqzLfx4MQ/WSvTALBchQkYbXtO2v1nn5EhG/b9FZ+ZjUK0pAObWxv8aAIK47N1cTK+OB+iByPvlFb2vi0gX7xVOQXrmR5FLH03/JzmtqLpjgX/uYCYHddOvZzTx65STtajQ94FVKS35XkmHlbOIXqi4j1FIAioP4oqvDXqlZOMh8IKMJypT2I3vF2eGAE=",
          "format": "anthropic-claude-v1",
          "index": 0
        }
      ]
    },
    {
      "role": "tool",
      "tool_call_id": "toolu_bdrk_01S7xyqV3GYLJYrvBC5SwtPP",
      "content": "{\"city\":\"上海市\",\"date\":\"2025-08-15\",\"week\":\"1\",\"dayweather\":\"多云\",\"nightweather\":\"多云\",\"daywind\":\"东南\",\"nightwind\":\"东南\",\"daypower\":\"1-3\",\"nightpower\":\"1-3\",\"daytemp_float\":\"35.0\",\"nighttemp_float\":\"28.0\"}"
    }
  ],
  "stream": false,
  "tools": [
    {
      "function": {
        "name": "search_city_weather",
        "description": "Search for a city's weather",
        "parameters": {
          "type": "object",
          "properties": {
            "city": {
              "type": "string",
              "description": "City name"
            },
            "date": {
              "type": "string",
              "description": "Date in yyyy-mm-dd format"
            }
          },
          "required": ["city", "date"],
          "additionalProperties": false
        }
      },
      "type": "function"
    }
  ],
  "reasoning": {
    "enabled": true
  }
}
```

### Key field notes

| Field                           | Description                                                                          |
| ------------------------------- | ------------------------------------------------------------------------------------ |
| `reasoning`                     | Reasoning text in the assistant message; passing it back is optional                 |
| `reasoning_details`             | Detailed reasoning array; **must be passed back in full**                            |
| `reasoning_details[].signature` | **Most critical field**; the signature of the reasoning content; pass back unchanged |
| `reasoning_details[].format`    | Signature format identifier, e.g. `anthropic-claude-v1`                              |
| `reasoning_details[].type`      | Reasoning content type, e.g. `reasoning.text`                                        |

### Workflow

1. **First request**: The user asks a question. The model returns an assistant message containing `tool_calls`, as well as `reasoning` and `reasoning_details` (including `signature`).
2. **Execute tools**: Your application executes the tool call and obtains the result.
3. **Second request**: Pass back the previous assistant message (**including `reasoning` and `reasoning_details`**) together with the tool execution result (tool message).
4. **Model response**: The model generates the final answer based on the complete context.
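
The workflow above can be sketched as a helper that builds the `messages` array for the second request. It appends the assistant message exactly as received, so `reasoning_details` and its `signature` are carried over untouched. The function name, the placeholder signature, and the sample tool result are hypothetical; a real signature comes from the first response.

```python
import json


def build_followup_messages(history: list, assistant_message: dict, tool_results: dict) -> list:
    """Return the messages for the second request.

    `tool_results` maps each tool_call id to the result produced by your application.
    """
    messages = list(history)
    # Pass the assistant message back as-is; do not edit reasoning_details or signature.
    messages.append(assistant_message)
    for call in assistant_message.get("tool_calls") or []:
        messages.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(tool_results[call["id"]], ensure_ascii=False),
        })
    return messages


# Hypothetical first-turn data; the signature below is a placeholder, not a real value.
history = [{"role": "user", "content": "What is the weather like in Shanghai today?"}]
assistant_message = {
    "role": "assistant",
    "content": "",
    "tool_calls": [{
        "id": "toolu_example",
        "type": "function",
        "function": {"name": "search_city_weather", "arguments": '{"city": "Shanghai"}'},
    }],
    "reasoning": "I should look up the weather.",
    "reasoning_details": [{
        "type": "reasoning.text",
        "text": "I should look up the weather.",
        "signature": "placeholder-signature",
        "format": "anthropic-claude-v1",
        "index": 0,
    }],
}

followup = build_followup_messages(
    history, assistant_message, {"toolu_example": {"dayweather": "cloudy", "daytemp_float": "35.0"}}
)
print(json.dumps(followup, indent=2))
```

The resulting list is sent as `messages` in the second request, alongside the same `tools` and `reasoning` settings as the first.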
