Documents API
The Documents API allows you to upload documents for indexing and querying.
Endpoints
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/documents/upload | Upload a document |
Upload Document
Upload a document for processing and indexing. Documents are processed asynchronously.
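Request Headers
| Header | Description |
|---|---|
| Authorization | API key with the ApiKey prefix |
| X-External-Org-Id | Your client’s organization identifier |
| X-External-User-Id | The user uploading the document |
| X-External-Roles | JSON array of the user’s roles |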
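As a minimal sketch, the four request headers (`Authorization`, `X-External-Org-Id`, `X-External-User-Id`, `X-External-Roles`) could be assembled as follows. The helper name `build_headers` and the sample values are illustrative and not part of the API. The point worth noticing is that the roles are serialized as a JSON array inside a single header value.

```python
import json


def build_headers(api_key, org_id, user_id, roles):
    """Assemble the request headers for a Documents API call (illustrative helper)."""
    return {
        # The key is sent with the literal "ApiKey " prefix.
        "Authorization": f"ApiKey {api_key}",
        "X-External-Org-Id": org_id,
        "X-External-User-Id": user_id,
        # Roles travel as a JSON array serialized into one header value.
        "X-External-Roles": json.dumps(list(roles)),
    }


headers = build_headers("sk_example", "acme", "admin-1", ["admin"])
print(headers["X-External-Roles"])  # prints ["admin"]
```

Using `json.dumps` avoids hand-escaping quotes, which is easy to get wrong when the role list is built dynamically.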
Request Body (multipart/form-data)
| Field | Description |
|---|---|
| file | The document file to upload |
| aclRoles | Roles that can access this document. Omit for organization-wide access. |
| folderPath | Optional folder path (e.g., /hr/policies) |
Supported File Types
| Type | Extensions | Max Size |
|---|---|---|
| PDF | .pdf | 100 MB |
| Word | .docx, .doc | 100 MB |
| Excel | .xlsx, .xls | 100 MB |
| Images | .png, .jpg, .jpeg | 20 MB |
| Text | .txt | 10 MB |
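Checking a file against these limits before sending it can save a round trip that would end in a 400 or 413. The sketch below is one way to do that on the client. The function name `check_upload` is hypothetical, and two details are assumptions rather than documented behavior: that "MB" means 1024 × 1024 bytes, and that extensions are matched case-insensitively.

```python
import os

MB = 1024 * 1024  # assumption: the documented limits are binary megabytes

# Extension -> maximum size in bytes, copied from the supported file types table.
MAX_SIZE_BY_EXTENSION = {
    ".pdf": 100 * MB,
    ".docx": 100 * MB, ".doc": 100 * MB,
    ".xlsx": 100 * MB, ".xls": 100 * MB,
    ".png": 20 * MB, ".jpg": 20 * MB, ".jpeg": 20 * MB,
    ".txt": 10 * MB,
}


def check_upload(filename, size_bytes):
    """Return None if the file looks acceptable, else a reason string."""
    # Assumption: extension matching is case-insensitive.
    ext = os.path.splitext(filename)[1].lower()
    limit = MAX_SIZE_BY_EXTENSION.get(ext)
    if limit is None:
        return f"unsupported file type: {ext or '(none)'}"
    if size_bytes > limit:
        return f"{ext} files are limited to {limit // MB} MB"
    return None
```

The server remains the authority on what it accepts; a local check like this only catches the obvious cases early.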
Example Request
curl -X POST https://api.docbit.ai/api/documents/upload \
-H "Authorization: ApiKey sk_yourpartner_abc123..." \
-H "X-External-Org-Id: acme" \
-H "X-External-User-Id: admin-1" \
-H "X-External-Roles: [\"admin\"]" \
-F "file=@document.pdf" \
-F "aclRoles=employee" \
-F "aclRoles=all-staff" \
-F "folderPath=/hr/policies"
Response
{
"documentId": "550e8400-e29b-41d4-a716-446655440000",
"status": "Processing"
}
Response Fields
| Field | Type | Description |
|---|---|---|
| documentId | string | Unique document identifier |
| status | string | Processing status |
Document Status Values
| Status | Description |
|---|---|
| Pending | Upload received, waiting to process |
| Processing | Being extracted, chunked, and indexed |
| Indexed | Ready for querying |
| Failed | Processing failed |
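A client may find it convenient to turn the upload response into a typed value so that an unexpected status fails loudly instead of being passed along as a bare string. The following is a sketch only: `DocumentStatus`, `UploadResult`, and `parse_upload_response` are illustrative names, and treating `Indexed` and `Failed` as final states is an assumption based on their descriptions.

```python
from dataclasses import dataclass
from enum import Enum


class DocumentStatus(Enum):
    PENDING = "Pending"
    PROCESSING = "Processing"
    INDEXED = "Indexed"
    FAILED = "Failed"

    @property
    def is_terminal(self):
        # Assumption: Indexed and Failed are final; the others may still change.
        return self in (DocumentStatus.INDEXED, DocumentStatus.FAILED)


@dataclass(frozen=True)
class UploadResult:
    document_id: str
    status: DocumentStatus


def parse_upload_response(payload):
    """Convert the JSON body of an upload response into an UploadResult."""
    return UploadResult(
        document_id=payload["documentId"],
        # Raises ValueError if the status is not one of the four listed values.
        status=DocumentStatus(payload["status"]),
    )
```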
Setting Access Control
Restrict to Specific Roles
# Only HR team can access
-F "aclRoles=hr"
# HR and Finance can access
-F "aclRoles=hr" \
-F "aclRoles=finance"
Organization-Wide Access
Omit aclRoles to allow all users in the organization:
# Everyone in the org can access
curl -X POST .../documents/upload \
-F "file=@document.pdf"
# No aclRoles = public to org
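In client code, the same rule (one `aclRoles` field per role, no field at all for organization-wide access) can be expressed by building the form fields as a list of pairs. This is a sketch with a hypothetical helper name, `build_form_fields`; the resulting list can be passed as the `data=` argument of `requests.post`, which accepts a list of tuples.

```python
def build_form_fields(acl_roles=None, folder_path=None):
    """Return (name, value) pairs for the non-file multipart fields."""
    fields = []
    # One aclRoles entry per role; no entries means organization-wide access.
    for role in acl_roles or []:
        fields.append(("aclRoles", role))
    if folder_path:
        fields.append(("folderPath", folder_path))
    return fields


print(build_form_fields(["hr", "finance"]))
# prints [('aclRoles', 'hr'), ('aclRoles', 'finance')]
```

Leaving the field out entirely, rather than sending an empty value, mirrors the curl example where `aclRoles` is simply omitted.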
Folder Organization
Use folderPath to organize documents:
# Upload to /hr/policies folder
-F "folderPath=/hr/policies"
# Upload to /projects/alpha folder
-F "folderPath=/projects/alpha"
Folders are created automatically if they don’t exist.
Processing Pipeline
After upload, documents go through:
- Extraction - Text is extracted from the document
- Chunking - Text is split into searchable segments
- Embedding - Chunks are converted to vectors
- Indexing - Vectors are indexed for search
This typically takes 10-60 seconds depending on document size.
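Because processing is asynchronous, a client usually waits for a final status before querying the document. This page does not describe how to look up a document's status, so the sketch below takes a caller-supplied `fetch_status` function and only shows the waiting logic. The function name, the 120-second default timeout, and the 2-second interval are all assumptions chosen to sit comfortably above the typical 10-60 second range.

```python
import time

# Assumption: these two statuses are final and will not change again.
TERMINAL_STATUSES = {"Indexed", "Failed"}


def wait_until_processed(fetch_status, timeout=120.0, interval=2.0,
                         sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status() until it returns a final status or time runs out.

    fetch_status is a caller-supplied function returning one of the documented
    status strings; how it obtains the status is outside this sketch.
    """
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"still {status!r} after {timeout} seconds")
        sleep(interval)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised in tests without real delays.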
Code Examples
JavaScript
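const axios = require('axios');
const FormData = require('form-data');
const fs = require('fs');
// Read the key from the environment (the variable name is illustrative)
const API_KEY = process.env.DOCBIT_API_KEY;
async function uploadDocument(filePath, aclRoles = []) {
  const form = new FormData();
  form.append('file', fs.createReadStream(filePath));
  // One aclRoles field per role; an empty list means organization-wide access
  aclRoles.forEach(role => {
    form.append('aclRoles', role);
  });
  const response = await axios.post(
    'https://api.docbit.ai/api/documents/upload',
    form,
    {
      headers: {
        'Authorization': `ApiKey ${API_KEY}`,
        'X-External-Org-Id': 'acme',
        'X-External-User-Id': 'admin',
        'X-External-Roles': '["admin"]',
        // Adds the multipart Content-Type header with its boundary
        ...form.getHeaders()
      }
    }
  );
  return response.data;
}
// Upload HR document
uploadDocument('./hr-policy.pdf', ['hr'])
  .then(result => console.log('Document ID:', result.documentId))
  .catch(err => console.error('Upload failed:', err.message));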
Python
import os
import requests
# Read the key from the environment (the variable name is illustrative)
API_KEY = os.environ['DOCBIT_API_KEY']
def upload_document(file_path, acl_roles=None):
    # Keep the file open for the duration of the request
    with open(file_path, 'rb') as f:
        files = {'file': f}
        data = {}
        if acl_roles:
            # A list value is sent as one aclRoles field per role
            data['aclRoles'] = acl_roles
        response = requests.post(
            'https://api.docbit.ai/api/documents/upload',
            files=files,
            data=data,
            headers={
                'Authorization': f'ApiKey {API_KEY}',
                'X-External-Org-Id': 'acme',
                'X-External-User-Id': 'admin',
                'X-External-Roles': '["admin"]'
            }
        )
    # Raise an exception for 4xx and 5xx responses
    response.raise_for_status()
    return response.json()
# Upload HR document
result = upload_document('./hr-policy.pdf', acl_roles=['hr'])
print(f"Document ID: {result['documentId']}")
Error Responses
| Status | Error | Description |
|---|---|---|
| 400 | No file provided | File not included in request |
| 400 | Unsupported file type | File format not supported |
| 413 | Request entity too large | File exceeds size limit |
| 401 | Invalid API key | Authentication failed |
| 429 | Rate limit exceeded | Too many requests |
See Error Codes for complete error documentation.
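Of the errors listed, only 429 is likely to succeed on a retry; the others point to a problem with the request itself. The sketch below retries on 429 and raises on anything else. Everything here is illustrative: `send` is a caller-supplied function returning `(status_code, body)`, `UploadError` and `upload_with_retry` are hypothetical names, and the doubling delay is an assumed backoff policy, not one documented by the API.

```python
import time

# Human-readable hints for the statuses listed in the error table.
ERROR_HINTS = {
    400: "bad request: missing file or unsupported file type",
    401: "authentication failed: check the API key",
    413: "file exceeds the size limit for its type",
    429: "rate limit exceeded",
}


class UploadError(Exception):
    """Raised when an upload fails and should not be retried further."""


def upload_with_retry(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() and retry only on HTTP 429, doubling the delay each time.

    send is a caller-supplied function returning (status_code, body).
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status == 429 and attempt < max_attempts - 1:
            # Assumed policy: wait 1s, 2s, 4s, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
            continue
        if status >= 400:
            raise UploadError(ERROR_HINTS.get(status, f"HTTP {status}"))
        return body
```

Wrapping the actual HTTP call in `send` keeps the retry policy independent of the HTTP library in use.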