POST /v1/chat/completions

Create Chat Completion
curl --request POST \
  --url https://api.example.com/v1/chat/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "messages": [
    {
      "content": "<string>",
      "role": "<string>",
      "name": "<string>"
    }
  ],
  "model": "<string>",
  "frequency_penalty": 0,
  "logit_bias": {},
  "logprobs": false,
  "top_logprobs": 0,
  "max_tokens": 123,
  "max_completion_tokens": 123,
  "n": 1,
  "presence_penalty": 0,
  "response_format": {
    "type": "text",
    "json_schema": {
      "name": "<string>",
      "description": "<string>",
      "schema": {},
      "strict": true
    }
  },
  "seed": 0,
  "stop": [],
  "stream": false,
  "stream_options": {
    "include_usage": true,
    "continuous_usage_stats": false
  },
  "temperature": 123,
  "top_p": 123,
  "tools": [
    {
      "function": {
        "name": "<string>",
        "description": "<string>",
        "parameters": {}
      },
      "type": "function"
    }
  ],
  "tool_choice": "none",
  "reasoning_effort": "low",
  "include_reasoning": true,
  "parallel_tool_calls": true,
  "user": "<string>",
  "use_beam_search": false,
  "top_k": 123,
  "min_p": 123,
  "repetition_penalty": 123,
  "length_penalty": 1,
  "stop_token_ids": [],
  "include_stop_str_in_output": false,
  "ignore_eos": false,
  "min_tokens": 0,
  "skip_special_tokens": true,
  "spaces_between_special_tokens": true,
  "truncate_prompt_tokens": 0,
  "prompt_logprobs": 123,
  "allowed_token_ids": [
    123
  ],
  "bad_words": [
    "<string>"
  ],
  "echo": false,
  "add_generation_prompt": true,
  "continue_final_message": false,
  "add_special_tokens": false,
  "documents": [
    {}
  ],
  "chat_template": "<string>",
  "chat_template_kwargs": {},
  "mm_processor_kwargs": {},
  "structured_outputs": {
    "json": "<string>",
    "regex": "<string>",
    "choice": [
      "<string>"
    ],
    "grammar": "<string>",
    "json_object": true,
    "disable_fallback": false,
    "disable_any_whitespace": false,
    "disable_additional_properties": false,
    "whitespace_pattern": "<string>",
    "structural_tag": "<string>",
    "_backend": "<string>",
    "_backend_was_auto": false
  },
  "priority": 0,
  "request_id": "<string>",
  "logits_processors": [
    "<string>"
  ],
  "return_tokens_as_token_ids": true,
  "return_token_ids": true,
  "cache_salt": "<string>",
  "kv_transfer_params": {},
  "vllm_xargs": {}
}
'
{
  "model": "<string>",
  "choices": [
    {
      "index": 123,
      "message": {
        "role": "<string>",
        "content": "<string>",
        "refusal": "<string>",
        "annotations": {
          "type": "<string>",
          "url_citation": {
            "end_index": 123,
            "start_index": 123,
            "title": "<string>",
            "url": "<string>"
          }
        },
        "audio": {
          "id": "<string>",
          "data": "<string>",
          "expires_at": 123,
          "transcript": "<string>"
        },
        "function_call": {
          "name": "<string>",
          "arguments": "<string>"
        },
        "tool_calls": [
          {
            "function": {
              "name": "<string>",
              "arguments": "<string>"
            },
            "id": "<string>",
            "type": "function"
          }
        ],
        "reasoning": "<string>",
        "reasoning_content": "<string>"
      },
      "logprobs": {
        "content": [
          {
            "token": "<string>",
            "logprob": -9999,
            "bytes": [
              123
            ],
            "top_logprobs": [
              {
                "token": "<string>",
                "logprob": -9999,
                "bytes": [
                  123
                ]
              }
            ]
          }
        ]
      },
      "finish_reason": "stop",
      "stop_reason": 123,
      "token_ids": [
        123
      ]
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "total_tokens": 0,
    "completion_tokens": 0,
    "prompt_tokens_details": {
      "cached_tokens": 123
    }
  },
  "id": "<string>",
  "object": "chat.completion",
  "created": 123,
  "service_tier": "auto",
  "system_fingerprint": "<string>",
  "prompt_logprobs": [
    {}
  ],
  "prompt_token_ids": [
    123
  ],
  "kv_transfer_params": {}
}

Authorizations

Authorization
string
header
Required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

application/json
messages
(ChatCompletionDeveloperMessageParam · object | ChatCompletionSystemMessageParam · object | ChatCompletionUserMessageParam · object | ChatCompletionAssistantMessageParam · object | ChatCompletionToolMessageParam · object | ChatCompletionFunctionMessageParam · object | CustomChatCompletionMessageParam · object | Message · object)[]
Required

Developer-provided instructions that the model should follow, regardless of messages sent by the user. With o1 models and newer, developer messages replace the previous system messages.
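
As a sketch, a request body with a developer instruction and a user turn can be assembled like this (the model name is a placeholder for whatever your deployment serves):

```python
import json

payload = {
    "model": "my-model",  # placeholder; use a model served by your deployment
    "messages": [
        # With o1 models and newer, the "developer" role replaces the older
        # "system" role for instructions the model must follow.
        {"role": "developer", "content": "Answer in one sentence."},
        {"role": "user", "content": "What is speculative decoding?"},
    ],
}

# This string is what the `--data` argument carries in the curl example above.
body = json.dumps(payload)
```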

model
string | null
frequency_penalty
number | null
Default: 0
logit_bias
Logit Bias · object
logprobs
boolean | null
Default: false
top_logprobs
integer | null
Default: 0
max_tokens
integer | null
Deprecated
max_completion_tokens
integer | null
n
integer | null
Default: 1
presence_penalty
number | null
Default: 0
response_format
ResponseFormat · object
seed
integer | null
Required range: -9223372036854776000 <= x <= 9223372036854776000
stop
Default: []
stream
boolean | null
Default: false
stream_options
StreamOptions · object
temperature
number | null
top_p
number | null
tools
ChatCompletionToolsParam · object[] | null
tool_choice
Default: none
Allowed value: "none"
reasoning_effort
enum<string> | null
Available options:
low,
medium,
high
include_reasoning
boolean
Default: true
parallel_tool_calls
boolean | null
Default: true
user
string | null
top_k
integer | null
min_p
number | null
repetition_penalty
number | null
length_penalty
number
Default: 1
stop_token_ids
integer[] | null
include_stop_str_in_output
boolean
Default: false
ignore_eos
boolean
Default: false
min_tokens
integer
Default: 0
skip_special_tokens
boolean
Default: true
spaces_between_special_tokens
boolean
Default: true
truncate_prompt_tokens
integer | null
Required range: x >= -1
prompt_logprobs
integer | null
allowed_token_ids
integer[] | null
bad_words
string[]
echo
boolean
Default: false

If true, the new message will be prepended with the last message if they belong to the same role.

add_generation_prompt
boolean
Default: true

If true, the generation prompt will be added to the chat template. This is a parameter used by chat template in tokenizer config of the model.

continue_final_message
boolean
Default: false

If this is set, the chat will be formatted so that the final message in the chat is open-ended, without any EOS tokens. The model will continue this message rather than starting a new one. This allows you to "prefill" part of the model's response for it. Cannot be used at the same time as add_generation_prompt.
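
The prefill behaviour described above can be sketched as a request body; note that add_generation_prompt must be turned off, since the two options cannot be combined (the model name is a placeholder):

```python
import json

# The final assistant message is left open-ended and the model continues it,
# which lets you "prefill" the start of the response.
payload = {
    "model": "my-model",  # placeholder
    "messages": [
        {"role": "user", "content": "List the first three primes."},
        {"role": "assistant", "content": "The first three primes are"},
    ],
    "continue_final_message": True,
    # Cannot be used together with continue_final_message, so disable it.
    "add_generation_prompt": False,
}
body = json.dumps(payload)
```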

add_special_tokens
boolean
Default: false

If true, special tokens (e.g. BOS) will be added to the prompt on top of what is added by the chat template. For most models, the chat template takes care of adding the special tokens so this should be set to false (as is the default).

documents
Documents · object[] | null

A list of dicts representing documents that will be accessible to the model if it is performing RAG (retrieval-augmented generation). If the template does not support RAG, this argument will have no effect. We recommend that each document should be a dict containing "title" and "text" keys.
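
A minimal sketch of the recommended document shape (contents are illustrative):

```python
# Each document is a plain dict; "title" and "text" are the recommended keys.
documents = [
    {"title": "Install guide", "text": "pip install vllm"},
    {"title": "Serving", "text": "vllm serve <model> starts an HTTP server."},
]

# Every entry carries both recommended keys.
assert all({"title", "text"} <= set(doc) for doc in documents)
```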

chat_template
string | null

A Jinja template to use for this conversion. As of transformers v4.44, the default chat template is no longer allowed, so you must provide a chat template if the tokenizer does not define one.

chat_template_kwargs
Chat Template Kwargs · object

Additional keyword args to pass to the template renderer. Will be accessible by the chat template.

mm_processor_kwargs
Mm Processor Kwargs · object

Additional kwargs to pass to the HF processor.

structured_outputs
StructuredOutputsParams · object

Additional kwargs for structured outputs

priority
integer
Default: 0

The priority of the request (lower means earlier handling; default: 0). Any priority other than 0 will raise an error if the served model does not use priority scheduling.

request_id
string

The request_id associated with this request. If the caller does not set it, a random UUID will be generated. This id is used throughout the inference process and returned in the response.

logits_processors
(string | LogitsProcessorConstructor · object)[] | null

A list of either qualified names of logits processors, or constructor objects, to apply when sampling. A constructor is a JSON object with a required 'qualname' field specifying the qualified name of the processor class/factory, and optional 'args' and 'kwargs' fields containing positional and keyword arguments. For example: {'qualname': 'my_module.MyLogitsProcessor', 'args': [1, 2], 'kwargs': {'param': 'value'}}.
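
The example constructor from the description, assembled in Python (the module and class paths are hypothetical):

```python
import json

# A constructor object can be mixed with plain qualified-name strings
# in the same list.
logits_processors = [
    "my_module.SimpleProcessor",                    # qualified name only
    {
        "qualname": "my_module.MyLogitsProcessor",  # required field
        "args": [1, 2],                             # optional positional args
        "kwargs": {"param": "value"},               # optional keyword args
    },
]
body = json.dumps({"logits_processors": logits_processors})
```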

return_tokens_as_token_ids
boolean | null

If specified with 'logprobs', tokens are represented as strings of the form 'token_id:{token_id}' so that tokens that are not JSON-encodable can be identified.
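
A small helper for decoding that representation, as a sketch (the 'token_id:' prefix is taken from the description above):

```python
def parse_token(token: str) -> int:
    """Extract the numeric id from a 'token_id:{token_id}' string."""
    prefix = "token_id:"
    if not token.startswith(prefix):
        raise ValueError(f"unexpected token format: {token!r}")
    return int(token[len(prefix):])

# A token that is not JSON-encodable as text can still be identified by id.
token_id = parse_token("token_id:50256")
```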

return_token_ids
boolean | null

If specified, the result will include token IDs alongside the generated text. In streaming mode, prompt_token_ids is included only in the first chunk, and token_ids contains the delta tokens for each chunk. This is useful for debugging or when you need to map generated text back to input tokens.

cache_salt
string | null

If specified, the prefix cache will be salted with the provided string to prevent an attacker from guessing prompts in multi-user environments. The salt should be random, protected from access by third parties, and long enough to be unpredictable (e.g., 43 characters base64-encoded, corresponding to 256 bits).
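
One way to generate a salt with the suggested properties, using only the standard library:

```python
import base64
import secrets

# 32 random bytes = 256 bits of entropy; URL-safe base64 without the
# trailing "=" padding yields the 43-character length mentioned above.
salt = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode()
```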

kv_transfer_params
Kv Transfer Params · object

KVTransfer parameters used for disaggregated serving.

vllm_xargs
Vllm Xargs · object

Additional request parameters with (list of) string or numeric values, used by custom extensions.

Response

Successful Response

model
string
Required
choices
ChatCompletionResponseChoice · object[]
Required
usage
UsageInfo · object
Required
id
string
object
string
Default: chat.completion
Allowed value: "chat.completion"
created
integer
service_tier
enum<string> | null
Available options:
auto,
default,
flex,
scale,
priority
system_fingerprint
string | null
prompt_logprobs
(object | null)[] | null
prompt_token_ids
integer[] | null
kv_transfer_params
Kv Transfer Params · object

KVTransfer parameters.