Overview

AJ STUDIOZ Cloud Infra provides text embedding endpoints compatible with both the Ollama and OpenAI APIs. Embeddings are vector representations of text used for semantic search, RAG (retrieval-augmented generation), clustering, and classification.

Ollama-Compatible Embeddings

curl https://api.ajstudioz.co.in/api/embeddings \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemma3:4b",
    "prompt": "The quick brown fox jumps over the lazy dog"
  }'

Example response (vector truncated):

{
  "embedding": [0.12, -0.43, 0.89, 0.23, ...]
}
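
The same request can also be made from Python without extra dependencies. The following is a minimal sketch using only the standard library. The helper names `build_embedding_request` and `fetch_embedding` are illustrative and not part of any SDK, and the code assumes the response carries the vector under an `embedding` key, as in the example response.

```python
import json
import urllib.request

# Endpoint taken from the curl example in this section.
API_URL = "https://api.ajstudioz.co.in/api/embeddings"


def build_embedding_request(
    prompt: str, api_key: str, model: str = "gemma3:4b"
) -> urllib.request.Request:
    """Build (but do not send) the POST request for the Ollama-compatible endpoint."""
    body = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def fetch_embedding(prompt: str, api_key: str, model: str = "gemma3:4b") -> list[float]:
    """Send the request and return the vector.

    Assumption: the JSON response contains an "embedding" key.
    """
    request = build_embedding_request(prompt, api_key, model)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["embedding"]
```

Calling `fetch_embedding("The quick brown fox", "YOUR_API_KEY")` would then return the vector as a plain list of floats. Keeping request construction separate from sending makes the payload easy to inspect without touching the network.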

OpenAI-Compatible Embeddings

from openai import OpenAI

client = OpenAI(
    base_url="https://api.ajstudioz.co.in/v1",
    api_key="YOUR_API_KEY"
)

response = client.embeddings.create(
    model="gemma3:4b",
    input="The quick brown fox jumps over the lazy dog"
)

embedding = response.data[0].embedding
print(f"Embedding dimension: {len(embedding)}")
print(f"First 5 values: {embedding[:5]}")

Batch Embeddings

texts = [
    "Machine learning is a subset of AI",
    "Deep learning uses neural networks",
    "Natural language processing handles text",
    "Computer vision analyzes images"
]

response = client.embeddings.create(
    model="gemma3:4b",
    input=texts
)

embeddings = [item.embedding for item in response.data]
print(f"Generated {len(embeddings)} embeddings")
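
For larger corpora it may help to send the texts in fixed-size batches rather than in one request. The sketch below is one way to do that. The names `chunked` and `embed_in_batches`, the injected `embed_batch` callable, and the default `batch_size=64` are all illustrative choices; the maximum batch size the service accepts is not stated here.

```python
from typing import Callable, Iterator, Sequence

Vector = list[float]


def chunked(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def embed_in_batches(
    texts: Sequence[str],
    embed_batch: Callable[[list[str]], list[Vector]],
    batch_size: int = 64,  # hypothetical default, not a documented limit
) -> list[Vector]:
    """Embed `texts` batch by batch, preserving input order."""
    vectors: list[Vector] = []
    for batch in chunked(texts, batch_size):
        result = embed_batch(list(batch))
        if len(result) != len(batch):
            raise RuntimeError("embedding count does not match batch size")
        vectors.extend(result)
    return vectors


# Possible wiring with the OpenAI client from this page (not executed here).
# Sorting by `index` keeps vectors aligned with the inputs:
#
# def embed_batch(batch):
#     response = client.embeddings.create(model="gemma3:4b", input=batch)
#     ordered = sorted(response.data, key=lambda item: item.index)
#     return [item.embedding for item in ordered]
```

Passing the embedding call in as a function keeps the batching logic independent of the client, so it can be exercised with a stub before any real requests are made.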

Building a RAG Pipeline

from openai import OpenAI
import numpy as np

client = OpenAI(
    base_url="https://api.ajstudioz.co.in/v1",
    api_key="YOUR_API_KEY"
)

def get_embedding(text: str) -> list[float]:
    response = client.embeddings.create(model="gemma3:4b", input=text)
    return response.data[0].embedding

def cosine_similarity(a: list[float], b: list[float]) -> float:
    a, b = np.array(a), np.array(b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Your knowledge base
documents = [
    "AJ STUDIOZ Cloud Infra hosts 30+ AI models",
    "The Kimi K2 model has over 1 trillion parameters",
    "Gemma 3 is made by Google DeepMind",
    "DeepSeek V3.2 excels at coding tasks",
]

# Create embeddings for all documents
doc_embeddings = [get_embedding(doc) for doc in documents]

# Query
query = "Which models are best for programming?"
query_embedding = get_embedding(query)

# Find most similar
similarities = [cosine_similarity(query_embedding, doc_emb) for doc_emb in doc_embeddings]
best_idx = int(np.argmax(similarities))  # index of the highest-scoring document

print(f"Most relevant: {documents[best_idx]}")
print(f"Similarity: {similarities[best_idx]:.4f}")

# Use in a RAG prompt
context = documents[best_idx]
response = client.chat.completions.create(
    model="deepseek-v3.2",
    messages=[
        {"role": "system", "content": f"Use this context to answer: {context}"},
        {"role": "user", "content": query}
    ]
)
print(response.choices[0].message.content)
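
Retrieving only the single best document can miss relevant context. A common variant keeps the top k matches instead. The sketch below shows that ranking step in pure Python; `top_k` and its `k` parameter are hypothetical names, and the 2-D toy vectors stand in for real embeddings.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between a and b; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: list[float], doc_vecs: list[list[float]], k: int = 2
) -> list[tuple[int, float]]:
    """Return (document index, similarity) pairs for the k most similar documents."""
    scored = [(i, cosine_similarity(query_vec, vec)) for i, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors in place of real embeddings.
toy_docs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
matches = top_k([1.0, 0.1], toy_docs, k=2)
print([index for index, _ in matches])  # [0, 2]
```

In a pipeline like the one above, the selected documents could be joined, for example with newlines, before being placed in the system prompt.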

Available Embedding Models

Any text-capable model can generate embeddings. Recommended models:

| Model | Best for |
| --- | --- |
| gemma3:4b | Fast, general-purpose embeddings |
| gemma3:12b | Higher-quality embeddings |
| ministral-3:3b | Ultra-fast, low latency |
| deepseek-v3.2 | Code + text, high quality |