Rate Limit Overview

API requests are rate limited to ensure fair usage and system stability. Limits are enforced at two levels: per IP address and per user account.

Current Limits

Dimension          Limit
Per IP address     60 requests / minute
Per user account   600 requests / minute

When either limit is exceeded, the API returns HTTP status 429 Too Many Requests. Limits may vary by plan; contact support to request higher limits.

Rate Limit Exceeded Response

When a rate limit is exceeded, the API responds with:
{
  "success": false,
  "message": "Rate limit exceeded. Please retry later.",
  "error_code": "RATE_LIMIT_EXCEEDED"
}
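
As a minimal sketch, a client could recognize this response by checking the HTTP status and then reading the documented fields from the parsed body. The helper name `describeRateLimit` and the fallback values are illustrative assumptions, not part of the API.

```javascript
// Hypothetical helper: returns null unless the status is 429, otherwise
// an object with the error code and message taken from the JSON body.
// The fallback strings are assumptions for bodies that fail to parse.
function describeRateLimit(status, body) {
  if (status !== 429) return null;
  const code = body && body.error_code ? body.error_code : "UNKNOWN";
  const message = body && body.message ? body.message : "Too Many Requests";
  return { code, message };
}

// Example with the body shown above.
const info = describeRateLimit(429, {
  success: false,
  message: "Rate limit exceeded. Please retry later.",
  error_code: "RATE_LIMIT_EXCEEDED",
});
console.log(info.code);
```

With `fetch`, the body can be read defensively via `await response.json().catch(() => null)` before it is passed to the helper, so a non-JSON 429 response still produces a usable result.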

Handling Rate Limits

Implement Exponential Backoff

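// Resolve after `ms` milliseconds.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

async function apiRequestWithBackoff(url, options, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, options);

    // Any status other than 429 (success or another error) goes back to the caller.
    if (response.status !== 429) {
      return response;
    }

    // Rate limited: wait 1s, 2s, 4s, ... before the next attempt.
    const waitTime = 2 ** attempt * 1000;
    console.log(`Rate limited. Retrying in ${waitTime}ms...`);
    await sleep(waitTime);
  }

  throw new Error('Max retries exceeded due to rate limiting');
}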
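
The best practices in this guide recommend adding jitter to the backoff. One common way to do that is "full jitter", where each wait is a random value between zero and the exponential ceiling, so that many clients do not retry at the same instant. The sketch below is one possible variant, not a prescribed implementation: the names `backoffDelayMs`, `wait`, and `fetchWithJitteredBackoff`, the 1-second base, and the 30-second cap are all assumptions.

```javascript
// Full-jitter delay: a random integer in [0, min(capMs, baseMs * 2^attempt)).
// `random` is injectable so the function can be tested deterministically.
function backoffDelayMs(attempt, baseMs = 1000, capMs = 30000, random = Math.random) {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Resolve after `ms` milliseconds.
const wait = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Retries only on 429; every other response is returned to the caller.
async function fetchWithJitteredBackoff(url, options, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, options);
    if (response.status !== 429) return response;
    // No point in waiting after the final attempt.
    if (attempt < maxRetries - 1) {
      await wait(backoffDelayMs(attempt));
    }
  }
  throw new Error("Max retries exceeded due to rate limiting");
}
```

The cap keeps the wait bounded if `maxRetries` is raised, and the random factor trades a predictable delay for a lower chance of synchronized retries.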

Best Practices

  • Implement exponential backoff with jitter for retries
  • Cache responses when possible to reduce redundant requests
  • Batch operations where supported (e.g. batch document removal)
  • Implement client-side request queuing to smooth out traffic spikes
  • Monitor request volume and log 429 responses for observability
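
Client-side queuing from the list above can be sketched as a sliding-window limiter that delays a request once the window is full. This is only an illustration: the class name `SlidingWindowLimiter` and its methods are hypothetical, the default of 60 requests per 60 seconds mirrors the per-IP limit, and how the server actually measures its window is not documented here, so a margin below the published limit may be safer.

```javascript
// Hypothetical client-side limiter: allows at most `maxRequests` calls
// in any rolling `windowMs` period. `now` is injectable for testing.
class SlidingWindowLimiter {
  constructor(maxRequests = 60, windowMs = 60000, now = Date.now) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
    this.now = now;
    this.timestamps = []; // send times of requests still inside the window
  }

  // Milliseconds until another request may be sent (0 means "send now").
  msUntilAllowed() {
    const t = this.now();
    while (this.timestamps.length && t - this.timestamps[0] >= this.windowMs) {
      this.timestamps.shift(); // forget requests that left the window
    }
    if (this.timestamps.length < this.maxRequests) return 0;
    return this.timestamps[0] + this.windowMs - t;
  }

  // Records a request and returns true if a slot is free, else returns false.
  tryAcquire() {
    if (this.msUntilAllowed() > 0) return false;
    this.timestamps.push(this.now());
    return true;
  }

  // Waits for a free slot, then runs `task` and returns its result.
  async schedule(task) {
    while (!this.tryAcquire()) {
      const ms = this.msUntilAllowed();
      await new Promise((resolve) => setTimeout(resolve, ms));
    }
    return task();
  }
}
```

A caller would wrap each request, for example `limiter.schedule(() => fetch(url, options))`, so that bursts are spread out locally instead of being rejected with 429.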

Storage Quotas

Resource               Limit
Total storage          100 GB (varies by plan)
Max file size          500 MB
Projects               Unlimited
Memories per project   Unlimited
API keys per project   10
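
A client can check a file against the maximum file size before uploading, which avoids spending a request on a file that would be rejected. The sketch below is an assumption-laden example: the names are hypothetical, and because this page does not say whether "500 MB" is decimal or binary, it uses the smaller decimal value (500,000,000 bytes) so the check stays conservative.

```javascript
// Assumption: 1 MB = 1,000,000 bytes. The API may instead mean 1024 * 1024,
// in which case this check is slightly stricter than necessary.
const MAX_FILE_SIZE_BYTES = 500 * 1000 * 1000;

// Returns true when a file of `sizeBytes` is over the limit.
function exceedsMaxFileSize(sizeBytes, limitBytes = MAX_FILE_SIZE_BYTES) {
  return sizeBytes > limitBytes;
}
```

In a browser, `file.size` from a `File` object can be passed in directly; in Node.js, the `size` field returned by `fs.statSync(path)` serves the same purpose.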

Requesting Higher Limits

To request higher rate limits, contact support with:
  • Current usage patterns and peak request volume
  • Expected future usage
  • Use case description
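
Peak request volume can be estimated from client-side request logs. The following sketch assumes the request times are available as millisecond timestamps and counts requests per calendar minute; the function name `peakRequestsPerMinute` is hypothetical, and the fixed one-minute buckets are a simplification that may differ from how the server measures its window.

```javascript
// Returns the highest number of requests that fell into any single
// calendar minute. `timestampsMs` is an array of epoch milliseconds.
function peakRequestsPerMinute(timestampsMs) {
  const counts = new Map(); // minute index -> request count
  let peak = 0;
  for (const ts of timestampsMs) {
    const minute = Math.floor(ts / 60000);
    const count = (counts.get(minute) || 0) + 1;
    counts.set(minute, count);
    if (count > peak) peak = count;
  }
  return peak;
}
```

Comparing this figure with the limits listed under Current Limits gives a concrete number to include in a support request.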

Next Steps

  • API Overview: return to the API overview
  • Error Handling: handle rate limit errors