OpenAI-Compatible API
Chat Completions
OpenAI-compatible chat completions endpoint
POST
Chat Completions
Documentation Index
Fetch the complete documentation index at: https://student-213fb9fc.mintlify.app/llms.txt
Use this file to discover all available pages before exploring further.
Overview
Create a chat completion using the OpenAI-compatible API format. Works with any OpenAI SDK, LangChain, LlamaIndex, and tools like Cursor or Continue.Request
Headers
Bearer token:
Bearer YOUR_API_KEYBody
Model identifier. See available models. Example:
"gemma3:27b", "deepseek-v3.2".Array of message objects:
role—"system","user", or"assistant"content— message text (or array of content objects for vision)
If
true, returns a stream of text/event-stream Server-Sent Events.Sampling temperature between 0 and 2. Higher = more random, lower = more focused.
Maximum number of tokens to generate. If unset, uses model default.
Nucleus sampling probability mass. Use with
temperature not both.Up to 4 sequences where the model will stop generating tokens.
List of tool definitions for function calling. Each tool has
type: "function" and a function object with name, description, and parameters (JSON Schema).Controls how the model responds to tools. Values:
"none", "auto", or {"type": "function", "function": {"name": "..."}}Set to
{"type": "json_object"} to enable JSON mode.Response
Unique identifier for this completion.
"chat.completion" or "chat.completion.chunk" for streaming.Unix timestamp when the completion was created.
The model used.
Array of completion choices. Usually one unless
n > 1:index— choice indexmessage.role—"assistant"message.content— generated textfinish_reason—"stop","length", or"tool_calls"
Token usage statistics:
prompt_tokenscompletion_tokenstotal_tokens
