Overview
AJ STUDIOZ Cloud Infra enforces rate limits to ensure fair access and platform stability. Limits apply per API key and reset on a rolling or monthly basis, depending on your plan.
Rate Limit Headers
Every API response includes headers showing your current usage:
| Header | Description |
|---|---|
| X-RateLimit-Limit | Maximum requests allowed in the window |
| X-RateLimit-Remaining | Requests remaining in the current window |
| X-RateLimit-Reset | Unix timestamp when the window resets |
| Retry-After | Seconds to wait when rate limited (429 responses only) |
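
The headers in the table can be read into a small structure before deciding whether to send the next request. The sketch below is illustrative only: `RateLimitStatus`, `parse_rate_limit_headers`, and `seconds_until_reset` are hypothetical helper names, and it assumes the header values are plain integers (with `Retry-After` given in seconds, as the table states).

```python
import time
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitStatus:
    """Hypothetical container for the documented rate limit headers."""
    limit: int                    # X-RateLimit-Limit
    remaining: int                # X-RateLimit-Remaining
    reset_at: int                 # X-RateLimit-Reset (Unix timestamp)
    retry_after: Optional[float]  # Retry-After (429 responses only)


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitStatus:
    """Parse the rate limit headers from a response's header mapping.

    Assumption: the values are plain integers. With a plain dict the lookup
    is case-sensitive; most HTTP clients expose a case-insensitive mapping.
    """
    retry_after = headers.get("Retry-After")
    return RateLimitStatus(
        limit=int(headers["X-RateLimit-Limit"]),
        remaining=int(headers["X-RateLimit-Remaining"]),
        reset_at=int(headers["X-RateLimit-Reset"]),
        retry_after=float(retry_after) if retry_after is not None else None,
    )


def seconds_until_reset(status: RateLimitStatus, now: Optional[float] = None) -> float:
    """Return how long until the window resets, never less than zero."""
    current = time.time() if now is None else now
    return max(0.0, float(status.reset_at) - current)
```

A caller might check `status.remaining` after each response and, when it reaches zero, wait `seconds_until_reset(status)` before sending more requests.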
Plan Limits
| Plan | Requests / min | Tokens / day | Concurrent Requests |
|---|---|---|---|
| Free | 10 | 100K | 2 |
| Developer | 60 | 1M | 10 |
| Pro | 200 | 10M | 30 |
| Enterprise | Custom | Custom | Custom |
Token limits apply across all models. Larger models consume more tokens per request.
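
One way to stay under a plan's requests-per-minute limit is to throttle on the client side. The sketch below is a simple sliding-window limiter; `SlidingWindowLimiter` is a hypothetical helper, and whether the server itself counts requests in a sliding or fixed window is not stated here, so treat this as a conservative approximation rather than a mirror of the server's behavior. It is not thread-safe.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Hypothetical client-side limiter: at most max_requests per window."""

    def __init__(
        self,
        max_requests: int,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._sleep = sleep
        self._sent: Deque[float] = deque()  # timestamps of recent requests

    def acquire(self) -> None:
        """Block until one more request fits in the current window."""
        while True:
            now = self._clock()
            # Drop timestamps that have left the window.
            while self._sent and now - self._sent[0] >= self.window_seconds:
                self._sent.popleft()
            if len(self._sent) < self.max_requests:
                self._sent.append(now)
                return
            # Wait until the oldest recorded request leaves the window.
            self._sleep(self.window_seconds - (now - self._sent[0]))
```

For example, on the Developer plan (60 requests per minute) you could create `SlidingWindowLimiter(60)` and call `acquire()` before each API request.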
Rate Limit Errors
When you exceed your rate limit, the API returns a 429 Too Many Requests response. The Retry-After header on that response gives the number of seconds to wait before retrying.
Handling Rate Limits
Python (with retry)
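A minimal sketch of a manual retry loop, using only the standard library. It waits for the number of seconds in Retry-After when that header is present and falls back to exponential backoff otherwise. The names `request_with_retry` and `send` are hypothetical: `send` stands in for whatever function performs your API call, and it is assumed to return a `(status_code, headers, body)` tuple.

```python
import time
from typing import Callable, Mapping, Tuple

# Assumed response shape for this sketch: (status_code, headers, body).
Response = Tuple[int, Mapping[str, str], str]


def request_with_retry(
    send: Callable[[], Response],
    max_retries: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() and retry while the API answers 429."""
    for attempt in range(max_retries + 1):
        status, headers, body = send()
        if status != 429:
            return status, headers, body
        if attempt == max_retries:
            break  # out of retries; do not sleep again
        # Exponential backoff used when Retry-After is missing or malformed.
        delay = base_delay * 2 ** attempt
        retry_after = headers.get("Retry-After")
        if retry_after is not None:
            try:
                delay = float(retry_after)
            except ValueError:
                pass  # keep the backoff delay
        sleep(delay)
    raise RuntimeError(f"Still rate limited after {max_retries} retries")
```

Passing `sleep` as a parameter keeps the loop easy to test without real waiting; in normal use the default `time.sleep` applies.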
Python (with tenacity)
Best Practices
- Batch requests: combine multiple prompts where possible instead of making individual calls
- Stream responses: use streaming to get faster first tokens without increasing rate limit usage
- Cache results: cache identical prompts/responses to avoid redundant API calls
- Use smaller models for dev: use gemma3:4b or gemma3:12b during development to save quota
- Monitor headers: track X-RateLimit-Remaining to proactively back off before hitting limits
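
The caching practice above can be sketched as a small in-memory cache keyed on the model and prompt. `PromptCache` and the `generate` callable are hypothetical; `generate` stands in for whatever function actually calls the API. This assumes identical prompts should return identical responses, which may not hold if you rely on sampling randomness.

```python
import hashlib
from typing import Callable, Dict


class PromptCache:
    """Hypothetical in-memory cache for identical (model, prompt) pairs."""

    def __init__(self, generate: Callable[[str, str], str]) -> None:
        self._generate = generate         # function that performs the API call
        self._store: Dict[str, str] = {}  # cache key -> cached response
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(model: str, prompt: str) -> str:
        # Hash model and prompt together; the NUL byte separates the fields.
        return hashlib.sha256(f"{model}\x00{prompt}".encode("utf-8")).hexdigest()

    def get(self, model: str, prompt: str) -> str:
        """Return a cached response, calling the API only on a miss."""
        key = self._key(model, prompt)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = self._generate(model, prompt)
        self._store[key] = result
        return result
```

Each cache hit avoids one request and its tokens, so repeated prompts during development do not count against your quota.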
