Overview
AJ STUDIOZ Cloud Infra provides text embedding endpoints compatible with both the Ollama and OpenAI APIs. Embeddings are vector representations of text used for semantic search, RAG (retrieval-augmented generation), clustering, and classification.
Ollama-Compatible Embeddings
curl https://api.ajstudioz.co.in/api/embeddings \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemma3:4b",
    "prompt": "The quick brown fox jumps over the lazy dog"
  }'

{
  "embedding": [0.12, -0.43, 0.89, 0.23, ...]
}
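The same request can be sketched in Python with only the standard library. This is a minimal sketch, assuming the response has the single-field shape shown above; the helper names `build_embedding_request`, `parse_embedding`, and `fetch_embedding` are illustrative and not part of any SDK.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "https://api.ajstudioz.co.in/api/embeddings"


def build_embedding_request(prompt: str, model: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request mirroring the curl call."""
    payload = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_embedding(body: bytes) -> list[float]:
    """Extract the vector from a response shaped like {"embedding": [...]}."""
    return json.loads(body)["embedding"]


def fetch_embedding(prompt: str, model: str = "gemma3:4b", api_key: str = "YOUR_API_KEY") -> list[float]:
    """Send the request and return the embedding vector."""
    request = build_embedding_request(prompt, model, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_embedding(response.read())
```

Calling `fetch_embedding("The quick brown fox jumps over the lazy dog", api_key=...)` with a real key should return the vector as a plain Python list. Keeping request construction and response parsing in separate functions makes both easy to check without network access.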
OpenAI-Compatible Embeddings
from openai import OpenAI

client = OpenAI(
    base_url="https://api.ajstudioz.co.in/v1",
    api_key="YOUR_API_KEY"
)

response = client.embeddings.create(
    model="gemma3:4b",
    input="The quick brown fox jumps over the lazy dog"
)

embedding = response.data[0].embedding
print(f"Embedding dimension: {len(embedding)}")
print(f"First 5 values: {embedding[:5]}")
Batch Embeddings
texts = [
    "Machine learning is a subset of AI",
    "Deep learning uses neural networks",
    "Natural language processing handles text",
    "Computer vision analyzes images"
]

# Passing a list returns one embedding per input text
response = client.embeddings.create(
    model="gemma3:4b",
    input=texts
)

embeddings = [item.embedding for item in response.data]
print(f"Generated {len(embeddings)} embeddings")
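For inputs too large to send in one call, the list can be split into smaller requests. The sketch below assumes a `client` object like the one created above; the `batch_size` default of 64 is an arbitrary illustration, not a documented limit of this API, and `chunked` and `embed_in_batches` are hypothetical helper names.

```python
from typing import Iterator


def chunked(items: list[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def embed_in_batches(client, texts: list[str], model: str = "gemma3:4b",
                     batch_size: int = 64) -> list[list[float]]:
    """Embed `texts` with one request per chunk, preserving input order."""
    vectors: list[list[float]] = []
    for batch in chunked(texts, batch_size):
        response = client.embeddings.create(model=model, input=batch)
        # Sort by `index` so each vector lines up with its input text.
        ordered = sorted(response.data, key=lambda item: item.index)
        vectors.extend(item.embedding for item in ordered)
    return vectors
```

With the `texts` list above, `embed_in_batches(client, texts, batch_size=2)` would issue two requests and return four vectors in the original order.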
Building a RAG Pipeline
from openai import OpenAI
import numpy as np

client = OpenAI(
    base_url="https://api.ajstudioz.co.in/v1",
    api_key="YOUR_API_KEY"
)

def get_embedding(text: str) -> list[float]:
    """Return the embedding vector for a single text."""
    response = client.embeddings.create(model="gemma3:4b", input=text)
    return response.data[0].embedding

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    a, b = np.array(a), np.array(b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Your knowledge base
documents = [
    "AJ STUDIOZ Cloud Infra hosts 30+ AI models",
    "The Kimi K2 model has over 1 trillion parameters",
    "Gemma 3 is made by Google DeepMind",
    "DeepSeek V3.2 excels at coding tasks",
]

# Create embeddings for all documents
doc_embeddings = [get_embedding(doc) for doc in documents]

# Query
query = "Which models are best for programming?"
query_embedding = get_embedding(query)

# Find the most similar document
similarities = [cosine_similarity(query_embedding, doc_emb) for doc_emb in doc_embeddings]
best_idx = int(np.argmax(similarities))
print(f"Most relevant: {documents[best_idx]}")
print(f"Similarity: {similarities[best_idx]:.4f}")

# Use the retrieved document as context in a RAG prompt
context = documents[best_idx]
response = client.chat.completions.create(
    model="deepseek-v3.2",
    messages=[
        {"role": "system", "content": f"Use this context to answer: {context}"},
        {"role": "user", "content": query}
    ]
)
print(response.choices[0].message.content)
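The pipeline above keeps only the single best match. A common variation is to retrieve the top few documents and pass them all as context. The following is a minimal sketch of that idea, not part of the original pipeline; `top_k` is a hypothetical helper name, and it assumes every vector has the same dimension and is non-zero.

```python
import numpy as np


def top_k(query_embedding: list[float], doc_embeddings: list[list[float]],
          k: int = 3) -> list[tuple[int, float]]:
    """Return (document index, cosine similarity) pairs, best match first."""
    query = np.asarray(query_embedding, dtype=float)
    docs = np.asarray(doc_embeddings, dtype=float)
    # Normalise to unit length so a dot product equals cosine similarity.
    query = query / np.linalg.norm(query)
    docs = docs / np.linalg.norm(docs, axis=1, keepdims=True)
    scores = docs @ query
    # argsort is ascending, so reverse it and keep the first k indices.
    order = np.argsort(scores)[::-1][:k]
    return [(int(i), float(scores[i])) for i in order]
```

The retrieved documents can then be joined into one context string, for example `"\n".join(documents[i] for i, _ in top_k(query_embedding, doc_embeddings, k=2))`. Computing all similarities in one matrix product avoids a Python-level loop over documents, which starts to matter as the knowledge base grows.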
Available Embedding Models
Any text-capable model can be used for embeddings. The following models are recommended:
| Model | Best for |
|---|---|
| gemma3:4b | Fast, general-purpose embeddings |
| gemma3:12b | Higher-quality embeddings |
| ministral-3:3b | Ultra-fast, low latency |
| deepseek-v3.2 | Code + text, high quality |