Rate Limits

The DocBit AI API implements rate limiting to ensure fair usage and system stability.

Rate Limit Tiers

Tier         | Requests/Minute | Requests/Hour | Concurrent
Standard     | 60              | 1,000         | 10
Professional | 120             | 5,000         | 25
Enterprise   | Custom          | Custom        | Custom
Contact your account manager for tier upgrades.

How Limits Apply

Rate limits are applied at the API key level, not per organization or user:
Your API Key
├── Org: acme (shares limit)
├── Org: beta (shares limit)
└── Org: gamma (shares limit)
All requests using the same API key count against the same limit.

Rate Limit Headers

Every response includes rate limit information:
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 45
X-RateLimit-Reset: 1640995200
Header                | Description
X-RateLimit-Limit     | Maximum requests per window
X-RateLimit-Remaining | Requests remaining in the current window
X-RateLimit-Reset     | Unix timestamp when the limit resets
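
A minimal sketch of reading these headers with the Fetch API; the URL and auth header below are placeholders, not the documented endpoint:

// Read the rate limit headers from any response.
// The endpoint URL and Authorization header are illustrative placeholders.
const response = await fetch('https://api.example.com/v1/documents', {
  headers: { Authorization: `Bearer ${apiKey}` },
});

const limit = parseInt(response.headers.get('X-RateLimit-Limit'), 10);
const remaining = parseInt(response.headers.get('X-RateLimit-Remaining'), 10);
const resetAt = new Date(parseInt(response.headers.get('X-RateLimit-Reset'), 10) * 1000);

console.log(`${remaining}/${limit} requests left; window resets at ${resetAt.toISOString()}`);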

Handling Rate Limits

When you exceed the rate limit, you’ll receive:
HTTP/1.1 429 Too Many Requests
Retry-After: 30

{
  "error": "Rate limit exceeded"
}

Retry-After Header

The Retry-After header indicates how many seconds to wait:
// Promise-based sleep helper used throughout the examples on this page
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

if (response.status === 429) {
  // Fall back to 60 seconds if the header is missing or unparsable
  const retryAfter = parseInt(response.headers['retry-after'], 10) || 60;
  await sleep(retryAfter * 1000);
  return retry(); // re-issue the original request
}

Best Practices

Implement Exponential Backoff

async function requestWithBackoff(fn, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      if (error.response?.status === 429) {
        const retryAfter = error.response.headers['retry-after'];
        const waitTime = retryAfter 
          ? parseInt(retryAfter) * 1000 
          : Math.pow(2, attempt) * 1000;
        
        console.log(`Rate limited, waiting ${waitTime}ms...`);
        await sleep(waitTime);
        continue;
      }
      throw error;
    }
  }
  throw new Error('Max retries exceeded');
}
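
For example, any call can be wrapped in the helper; the chat call below mirrors the client usage shown later on this page:

const reply = await requestWithBackoff(() => DocBitAIClient.chat(message));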

Python Implementation

import time
from requests.exceptions import HTTPError

def request_with_backoff(fn, max_retries=3):
    for attempt in range(max_retries):
        try:
            return fn()
        except HTTPError as e:
            if e.response.status_code == 429:
                retry_after = e.response.headers.get('Retry-After')
                wait_time = int(retry_after) if retry_after else (2 ** attempt)
                print(f'Rate limited, waiting {wait_time}s...')
                time.sleep(wait_time)
                continue
            raise
    raise Exception('Max retries exceeded')

Queue Requests

For high-volume integrations, implement a request queue:
class RequestQueue {
  constructor(maxPerMinute = 60) {
    this.queue = [];
    this.processing = false;
    this.interval = 60000 / maxPerMinute;
  }
  
  async add(request) {
    return new Promise((resolve, reject) => {
      this.queue.push({ request, resolve, reject });
      this.process();
    });
  }
  
  async process() {
    if (this.processing || this.queue.length === 0) return;
    this.processing = true;
    
    while (this.queue.length > 0) {
      const { request, resolve, reject } = this.queue.shift();
      
      try {
        const result = await request();
        resolve(result);
      } catch (error) {
        reject(error);
      }
      
      await sleep(this.interval);
    }
    
    this.processing = false;
  }
}

// Usage
const queue = new RequestQueue(60); // 60 req/min

await queue.add(() => DocBitAIClient.chat(msg1));
await queue.add(() => DocBitAIClient.chat(msg2));

Endpoint-Specific Limits

Some endpoints have additional limits:
Endpoint          | Additional Limit
/documents/upload | 10 uploads/minute
/ai/chat/stream   | 30 streams/minute
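
To stay under these caps, one approach is to reuse the RequestQueue class above with a separate queue per endpoint; uploadDocument and streamChat below are placeholder functions, not part of the documented client:

// One queue per endpoint, sized to that endpoint's stricter limit
const uploadQueue = new RequestQueue(10); // /documents/upload: 10 uploads/minute
const streamQueue = new RequestQueue(30); // /ai/chat/stream: 30 streams/minute

await uploadQueue.add(() => uploadDocument(file));
await streamQueue.add(() => streamChat(prompt));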

Monitoring Usage

Track your API usage to avoid hitting limits:
class RateLimitMonitor {
  constructor() {
    this.remaining = null;
    this.resetTime = null;
  }
  
  // Call with the (lowercase-keyed) headers of each response
  update(headers) {
    this.remaining = parseInt(headers['x-ratelimit-remaining'], 10);
    this.resetTime = parseInt(headers['x-ratelimit-reset'], 10);
  }
  
  shouldThrottle() {
    if (this.remaining === null) return false;
    return this.remaining < 5; // Buffer of 5 requests
  }
  
  // Seconds until the current window resets
  getWaitTime() {
    if (!this.resetTime) return 0;
    return Math.max(0, this.resetTime - Date.now() / 1000);
  }
}
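
A possible usage pattern, combining the monitor with the Fetch API and the sleep helper defined earlier (fetchWithMonitoring is an illustrative wrapper, not part of the client library):

const monitor = new RateLimitMonitor();

async function fetchWithMonitoring(url, options) {
  // Pause until the window resets when we're close to the limit
  if (monitor.shouldThrottle()) {
    await sleep(monitor.getWaitTime() * 1000);
  }
  
  const response = await fetch(url, options);
  // fetch exposes header names in lowercase, matching the keys used in update()
  monitor.update(Object.fromEntries(response.headers));
  return response;
}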

Bulk Operations

For bulk operations, pace your requests:
async function uploadBulk(documents) {
  const results = [];
  
  for (const doc of documents) {
    results.push(await uploadDocument(doc));
    
    // /documents/upload allows 10 uploads/minute, so space uploads about 6 seconds apart
    await sleep(6000);
  }
  
  return results;
}

Getting Higher Limits

If you need higher rate limits:
  1. Contact support at [email protected]
  2. Describe your use case and expected volume
  3. We’ll work with you on an appropriate tier

FAQ

Are limits per organization or per API key?
Limits are per API key. All organizations and users sharing an API key share the same limit.

Do failed requests count against my limit?
Yes, all requests count against limits regardless of response status.

Can I be notified before I hit a limit?
Monitor the rate limit headers in responses. We don’t currently offer webhooks for limit warnings.

What if I expect a traffic spike?
Implement queuing and backoff. Contact us about burst capacity for planned high-traffic events.