Ollama-Compatible API
Generate
Generate a text completion for a given prompt (Ollama-compatible)
POST
Overview
Generate a response for a given prompt with a provided model. This is the basic text completion endpoint, compatible with the Ollama /api/generate format.
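A minimal sketch of how a request to this endpoint might be assembled, using only the Python standard library. Several things here are assumptions rather than facts stated on this page: the base URL is a placeholder, the /api/generate path and the field names model, prompt and stream are taken from Ollama's format, and the header names Authorization and Content-Type are the conventional ones for a Bearer token and a JSON body.

```python
import json
import urllib.request

# Hypothetical base URL; replace with the real host of your deployment.
BASE_URL = "https://api.example.com"


def build_generate_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a non-streaming generate request."""
    # Field names follow Ollama's /api/generate format (assumed to apply here).
    body = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url=f"{BASE_URL}/api/generate",  # path assumed from the Ollama format
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed header name
            "Content-Type": "application/json",  # assumed header name
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_generate_request("YOUR_API_KEY", "gemma3:27b", "Why is the sky blue?")
    print(req.get_method(), req.full_url)
    # Sending requires a valid key and host:
    # with urllib.request.urlopen(req) as resp:
    #     print(json.load(resp))
```

Building the request separately from sending it makes the payload easy to inspect before any network call is made.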
Request
Headers
Bearer token: Bearer YOUR_API_KEY
Content type: application/json
Body
The model name to use. See available models. Example: "gemma3:27b", "deepseek-v3.2", "kimi-k2:1t"
The prompt to generate a response for.
If true, responses are streamed as they are generated. If false, the complete output is returned in a single response.
System message to set the behavior of the assistant.
Model parameter overrides. Supports the following fields:
temperature (float): sampling temperature (0–2)
top_p (float): nucleus sampling
top_k (integer): top-k sampling
num_predict (integer): max tokens to generate
stop (array of strings): stop sequences
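The override fields listed above can be collected into a single dictionary before being attached to the request body. The helper below is an illustrative sketch, not part of this API: make_options is a hypothetical name, and the assumption that the resulting dictionary is sent under a field called options comes from Ollama's format. The only validation it performs is the 0–2 temperature range given above.

```python
from typing import Optional, Sequence


def make_options(
    temperature: Optional[float] = None,
    top_p: Optional[float] = None,
    top_k: Optional[int] = None,
    num_predict: Optional[int] = None,
    stop: Optional[Sequence[str]] = None,
) -> dict:
    """Return a dict holding only the overrides that were explicitly set."""
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    candidates = {
        "temperature": temperature,
        "top_p": top_p,
        "top_k": top_k,
        "num_predict": num_predict,
        "stop": list(stop) if stop is not None else None,
    }
    # Drop unset fields so the server-side defaults remain in effect.
    return {name: value for name, value in candidates.items() if value is not None}


if __name__ == "__main__":
    # "options" is the field name used by Ollama's format (assumed here).
    body = {"options": make_options(temperature=0.7, num_predict=128)}
    print(body)
```

Omitting unset fields, rather than sending nulls, avoids relying on how the server treats null values.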
The context returned from a previous request, used to keep a short conversational memory.
If true, no formatting is applied to the prompt. Use only when applying your own custom prompt template.
Response
The model used for generation.
ISO 8601 timestamp of when the response was generated.
The generated text. Empty if streaming is in progress.
true when generation is complete.
Reason generation stopped. One of: stop, length, error.
An encoding of the conversation for use in the next request (to keep memory).
Total time in nanoseconds.
Number of tokens generated.
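The response fields described above could be consumed roughly as follows when streaming is enabled. This sketch assumes that streamed output arrives as one JSON object per line, as in Ollama, and that the fields are named response, done, done_reason, context, total_duration and eval_count as in Ollama's format; none of these names is confirmed by this page. collect_stream is a hypothetical helper.

```python
import json
from typing import Iterable


def collect_stream(lines: Iterable[str]) -> dict:
    """Join streamed text chunks and summarize the final chunk's metadata."""
    text_parts = []
    final = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank keep-alive lines
        chunk = json.loads(line)
        text_parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            final = chunk
            break
    # Total time is reported in nanoseconds; convert to seconds.
    seconds = final.get("total_duration", 0) / 1e9
    tokens = final.get("eval_count", 0)
    return {
        "text": "".join(text_parts),
        "done_reason": final.get("done_reason"),
        "context": final.get("context"),  # pass back in the next request to keep memory
        "seconds": seconds,
        # Rough rate only: total time may include more than token generation.
        "tokens_per_second": tokens / seconds if seconds else None,
    }


if __name__ == "__main__":
    sample = [
        '{"response": "Hel", "done": false}',
        '{"response": "lo", "done": false}',
        '{"response": "", "done": true, "done_reason": "stop", '
        '"context": [1, 2], "total_duration": 2000000000, "eval_count": 10}',
    ]
    print(collect_stream(sample))
```

The returned context value can be sent with the next request to carry the conversation forward, as described in the request section.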
