Rate Limits
The DocBit AI API implements rate limiting to ensure fair usage and system stability.
Rate Limit Tiers
| Tier | Requests/Minute | Requests/Hour | Concurrent |
|---|---|---|---|
| Standard | 60 | 1,000 | 10 |
| Professional | 120 | 5,000 | 25 |
| Enterprise | Custom | Custom | Custom |
How Limits Apply
Rate limits are applied at the API key level, not per organization or user.
Rate Limit Headers
Every response includes rate limit information:
| Header | Description |
|---|---|
| X-RateLimit-Limit | Maximum requests per window |
| X-RateLimit-Remaining | Requests remaining |
| X-RateLimit-Reset | Unix timestamp when the limit resets |
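As a rough illustration of reading these headers on the client, the sketch below parses them into a small record. It assumes the three header values arrive as plain integers; `RateLimitStatus` and `parse_rate_limit_headers` are hypothetical names for this example, not part of the DocBit AI API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Mapping


@dataclass(frozen=True)
class RateLimitStatus:
    limit: int          # from X-RateLimit-Limit
    remaining: int      # from X-RateLimit-Remaining
    reset_at: datetime  # from X-RateLimit-Reset, as a timezone-aware UTC datetime


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitStatus:
    """Parse the three rate limit headers (assumes integer values)."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    reset_epoch = int(lowered["x-ratelimit-reset"])
    return RateLimitStatus(
        limit=int(lowered["x-ratelimit-limit"]),
        remaining=int(lowered["x-ratelimit-remaining"]),
        reset_at=datetime.fromtimestamp(reset_epoch, tz=timezone.utc),
    )
```

The function raises `KeyError` or `ValueError` if a header is missing or malformed, which keeps a silent parsing failure from being mistaken for available capacity.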
Handling Rate Limits
When you exceed the rate limit, the API rejects the request with a rate limit error.
Retry-After Header
The Retry-After header indicates how many seconds to wait before retrying.
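A minimal sketch of reading that value is shown here. It assumes the header carries a whole number of seconds, as described above; the helper name `retry_after_seconds` and the fallback `default` are illustrative choices, not documented behavior.

```python
from typing import Mapping


def retry_after_seconds(headers: Mapping[str, str], default: float = 1.0) -> float:
    """Return the wait time from Retry-After, or `default` if absent or unparseable."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("retry-after")
    if raw is None:
        return default
    try:
        seconds = int(raw.strip())
    except ValueError:
        # Not an integer number of seconds; fall back rather than guess.
        return default
    # Never return a negative wait.
    return float(max(seconds, 0))
```

A caller would typically pass the result to `time.sleep` before sending the request again.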
Best Practices
Implement Exponential Backoff
Python Implementation
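The following is a sketch of exponential backoff, not an official client. It makes several assumptions: `send` is a hypothetical callable that performs one request and returns `(status, headers, body)`; rate-limited responses use the conventional HTTP 429 status, which this page does not confirm; and the `Retry-After` header name is matched exactly as written. The random jitter is an addition that helps keep many clients from retrying in lockstep.

```python
import random
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], bytes]

# Assumption: the conventional "Too Many Requests" status; not confirmed by this page.
RATE_LIMITED_STATUS = 429


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Delay for a 0-based attempt: base * 2**attempt, capped at `cap` seconds."""
    return min(cap, base * (2 ** attempt))


def call_with_backoff(
    send: Callable[[], Response],
    max_retries: int = 5,
    base: float = 1.0,
    cap: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    rng: Callable[[], float] = random.random,
) -> Response:
    """Call `send`, retrying rate-limited responses with growing delays."""
    for attempt in range(max_retries + 1):
        status, headers, body = send()
        if status != RATE_LIMITED_STATUS:
            return status, headers, body
        if attempt == max_retries:
            break
        # Jitter scales the backoff delay to between 50% and 100% of its value.
        delay = backoff_delay(attempt, base, cap) * (0.5 + rng() / 2)
        # If the server asks for a longer wait, honor it.
        retry_after = headers.get("Retry-After")
        if retry_after is not None and retry_after.isdigit():
            delay = max(delay, float(retry_after))
        sleep(delay)
    raise RuntimeError(f"still rate limited after {max_retries} retries")
```

With the defaults, the un-jittered delays grow as 1, 2, 4, 8, and 16 seconds. The `sleep` and `rng` parameters exist so the retry schedule can be exercised in tests without real waiting.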
Queue Requests
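One possible shape for a client-side queue is sketched here: a single worker thread takes jobs in FIFO order and leaves a fixed gap between them. `RequestQueue` is a hypothetical helper for illustration, and each job is assumed to be a zero-argument callable that performs one API request.

```python
import queue
import threading
import time
from concurrent.futures import Future
from typing import Any, Callable


class RequestQueue:
    """FIFO queue drained by one worker thread at a fixed pace."""

    def __init__(self, requests_per_minute: float) -> None:
        self._interval = 60.0 / requests_per_minute
        self._jobs: "queue.Queue" = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, job: Callable[[], Any]) -> Future:
        """Enqueue a job; the returned Future receives its result or exception."""
        future: Future = Future()
        self._jobs.put((job, future))
        return future

    def close(self) -> None:
        """Finish the jobs already queued, then stop the worker."""
        self._jobs.put(None)  # sentinel telling the worker to exit
        self._worker.join()

    def _run(self) -> None:
        while True:
            item = self._jobs.get()
            if item is None:
                return
            job, future = item
            started = time.monotonic()
            try:
                future.set_result(job())
            except Exception as exc:
                future.set_exception(exc)
            # Sleep out the rest of the interval so job starts stay spaced apart.
            remaining = self._interval - (time.monotonic() - started)
            if remaining > 0:
                time.sleep(remaining)
```

Because only one worker runs, this sketch also keeps a single request in flight at a time. A real integration might use several workers, up to the concurrent limit of its tier.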
For high-volume integrations, implement a request queue so that requests are sent at a controlled rate instead of all at once.
Endpoint-Specific Limits
Some endpoints have additional limits:
| Endpoint | Additional Limit |
|---|---|
| /documents/upload | 10 uploads/minute |
| /ai/chat/stream | 30 streams/minute |
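To stay under these per-endpoint limits on the client side, one option is a small sliding-window counter per path, sketched below. The limits come from the table above. The rolling 60-second window is an assumption, since this page does not say whether the server uses fixed or rolling windows, and `EndpointLimiter` is a hypothetical name.

```python
import time
from collections import deque
from typing import Callable, Deque, Dict, Mapping

# Per-minute limits taken from the table above.
ENDPOINT_LIMITS_PER_MINUTE: Dict[str, int] = {
    "/documents/upload": 10,
    "/ai/chat/stream": 30,
}


class EndpointLimiter:
    """Tracks recent calls per endpoint over a rolling window (an assumption)."""

    def __init__(
        self,
        limits: Mapping[str, int] = ENDPOINT_LIMITS_PER_MINUTE,
        window: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._limits = dict(limits)
        self._window = window
        self._clock = clock
        self._sent: Dict[str, Deque[float]] = {path: deque() for path in self._limits}

    def try_acquire(self, path: str) -> bool:
        """Return True and record the call if `path` is under its limit."""
        limit = self._limits.get(path)
        if limit is None:
            # No endpoint-specific limit; the tier-wide limits still apply.
            return True
        now = self._clock()
        sent = self._sent[path]
        # Drop timestamps that have aged out of the window.
        while sent and now - sent[0] >= self._window:
            sent.popleft()
        if len(sent) >= limit:
            return False
        sent.append(now)
        return True
```

A caller would check `try_acquire("/documents/upload")` before each upload and delay or queue the request when it returns `False`.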
Monitoring Usage
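A lightweight way to watch usage from the client is to record the rate limit headers from each response and flag when the remaining quota runs low. The sketch below assumes integer header values; `UsageMonitor` and its 10% warning threshold are illustrative choices rather than anything DocBit AI prescribes.

```python
import logging
from typing import Mapping, Optional

logger = logging.getLogger("ratelimit")


class UsageMonitor:
    """Remembers the latest limit and remaining values and flags low quota."""

    def __init__(self, warn_fraction: float = 0.1) -> None:
        self.warn_fraction = warn_fraction
        self.limit: Optional[int] = None
        self.remaining: Optional[int] = None

    def record(self, headers: Mapping[str, str]) -> bool:
        """Update from one response's headers; return True if quota is low."""
        lowered = {name.lower(): value for name, value in headers.items()}
        try:
            limit = int(lowered["x-ratelimit-limit"])
            remaining = int(lowered["x-ratelimit-remaining"])
        except (KeyError, ValueError):
            # Headers missing or malformed: keep the previous values.
            return False
        self.limit, self.remaining = limit, remaining
        low = limit > 0 and remaining <= limit * self.warn_fraction
        if low:
            logger.warning("rate limit nearly exhausted: %d of %d left", remaining, limit)
        return low
```

Calling `record` after every response gives a running view of usage. The boolean result can be used to slow down before the limit is actually reached.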
Track your API usage to avoid hitting limits.
Bulk Operations
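A simple pacing loop for a batch of items might look like the sketch below. `paced_map` is a hypothetical helper, `func` stands in for whatever performs one API call per item, and the rate passed in should sit at or below the per-minute limit of your tier.

```python
import time
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def paced_map(
    func: Callable[[T], R],
    items: Iterable[T],
    requests_per_minute: float,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[R]:
    """Apply `func` to each item, spacing call starts at least one interval apart."""
    interval = 60.0 / requests_per_minute
    next_start = clock()
    for item in items:
        wait = next_start - clock()
        if wait > 0:
            sleep(wait)
        # Schedule the next call one interval after this one starts.
        next_start = max(next_start, clock()) + interval
        yield func(item)
```

For example, `list(paced_map(upload_one, documents, requests_per_minute=50))` would process a batch at just under the Standard tier's 60 requests per minute, where `upload_one` and `documents` are placeholders for your own function and data.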
For bulk operations, pace your requests.
Getting Higher Limits
If you need higher rate limits:
- Contact support at [email protected]
- Describe your use case and expected volume
- We’ll work with you on an appropriate tier
FAQ
Are limits per user or per API key?
Limits are per API key. All organizations and users sharing an API key share the same limit.
Do failed requests count against limits?
Yes, all requests count against limits regardless of response status.
Can I get real-time limit notifications?
Monitor the rate limit headers in responses. We don’t currently offer webhooks for limit warnings.
What happens during traffic spikes?
Implement queuing and backoff. Contact us about burst capacity for planned high-traffic events.