Documents API

The Documents API allows you to upload documents for indexing and querying.

Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | /api/documents/upload | Upload a document |

Upload Document

Upload a document for processing and indexing. Documents are processed asynchronously.

Headers

Authorization (string, required): API key with ApiKey prefix
X-External-Org-Id (string, required): Your client’s organization identifier
X-External-User-Id (string, required): The user uploading the document
X-External-Roles (string, required): JSON array of the user’s roles
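
The required headers above can be assembled in one place so the roles array is always serialized as valid JSON. The following is a minimal sketch; the helper name `build_headers` and the sample key are illustrative and not part of the API.

```python
import json


def build_headers(api_key, org_id, user_id, roles):
    """Return the four required request headers as a dict (hypothetical helper)."""
    return {
        "Authorization": f"ApiKey {api_key}",
        "X-External-Org-Id": org_id,
        "X-External-User-Id": user_id,
        # The roles header carries a JSON array encoded as a string.
        "X-External-Roles": json.dumps(list(roles)),
    }


headers = build_headers("sk_example", "acme", "admin-1", ["admin"])
print(headers["X-External-Roles"])  # ["admin"]
```

Using `json.dumps` avoids hand-escaping the quotes, which is easy to get wrong when the role list is built dynamically.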

Request Body (multipart/form-data)

file (file, required): The document file to upload
aclRoles (array, optional): Roles that can access this document. Omit for organization-wide access.
folderPath (string, optional): Folder path (e.g., /hr/policies)

Supported File Types

| Type | Extensions | Max Size |
|------|------------|----------|
| PDF | .pdf | 100 MB |
| Word | .docx, .doc | 100 MB |
| Excel | .xlsx, .xls | 100 MB |
| Images | .png, .jpg, .jpeg | 20 MB |
| Text | .txt | 10 MB |
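
A client can check a file against these limits before sending it, which avoids a wasted upload that would be rejected. This is a sketch only: the function name `check_upload` is hypothetical, and it assumes 1 MB means 1024 × 1024 bytes, which the table does not specify.

```python
import os

MB = 1024 * 1024  # assumption: the limits use binary megabytes

# Size limits per extension, copied from the supported file types table.
MAX_SIZE_BY_EXTENSION = {
    ".pdf": 100 * MB,
    ".docx": 100 * MB,
    ".doc": 100 * MB,
    ".xlsx": 100 * MB,
    ".xls": 100 * MB,
    ".png": 20 * MB,
    ".jpg": 20 * MB,
    ".jpeg": 20 * MB,
    ".txt": 10 * MB,
}


def check_upload(filename, size_bytes):
    """Return None if the file looks acceptable, otherwise a reason string."""
    ext = os.path.splitext(filename)[1].lower()
    limit = MAX_SIZE_BY_EXTENSION.get(ext)
    if limit is None:
        return f"unsupported file type: {ext or '(none)'}"
    if size_bytes > limit:
        return f"file too large: {size_bytes} bytes exceeds {limit} bytes"
    return None


print(check_upload("policy.pdf", 5 * MB))   # None
print(check_upload("photo.png", 25 * MB))   # file too large: ...
print(check_upload("archive.zip", 1 * MB))  # unsupported file type: .zip
```

The server remains the authority on what it accepts; a local check like this only catches the obvious cases early.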

Example Request

curl -X POST https://api.docbit.ai/api/documents/upload \
  -H "Authorization: ApiKey sk_yourpartner_abc123..." \
  -H "X-External-Org-Id: acme" \
  -H "X-External-User-Id: admin-1" \
  -H "X-External-Roles: [\"admin\"]" \
  -F "file=@./hr-policy.pdf" \
  -F "aclRoles=employee" \
  -F "aclRoles=all-staff" \
  -F "folderPath=/hr/policies"

Response

{
  "documentId": "550e8400-e29b-41d4-a716-446655440000",
  "status": "Processing"
}

Response Fields

| Field | Type | Description |
|-------|------|-------------|
| documentId | string | Unique document identifier |
| status | string | Processing status |

Document Status Values

| Status | Description |
|--------|-------------|
| Pending | Upload received, waiting to process |
| Processing | Being extracted, chunked, and indexed |
| Indexed | Ready for querying |
| Failed | Processing failed |

Setting Access Control

Restrict to Specific Roles

# Only HR team can access
-F "aclRoles=hr"

# HR and Finance can access
-F "aclRoles=hr" \
-F "aclRoles=finance"

Organization-Wide Access

Omit aclRoles to allow all users in the organization:
# Everyone in the org can access
curl -X POST .../documents/upload \
  -F "file=@./hr-policy.pdf"
  # No aclRoles = public to org

Folder Organization

Use folderPath to organize documents:
# Upload to /hr/policies folder
-F "folderPath=/hr/policies"

# Upload to /projects/alpha folder
-F "folderPath=/projects/alpha"
Folders are created automatically if they don’t exist.
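
The access control and folder options above map onto form fields in a simple way: one `aclRoles` field per role, and at most one `folderPath` field. The sketch below builds that field list; `build_form_fields` is a hypothetical helper, not part of the API.

```python
def build_form_fields(acl_roles=None, folder_path=None):
    """Return multipart form fields as a list of (name, value) tuples."""
    fields = []
    # One aclRoles entry per role; no entries means organization-wide access.
    for role in acl_roles or []:
        fields.append(("aclRoles", role))
    if folder_path:
        fields.append(("folderPath", folder_path))
    return fields


print(build_form_fields(["hr", "finance"], "/hr/policies"))
# [('aclRoles', 'hr'), ('aclRoles', 'finance'), ('folderPath', '/hr/policies')]
```

A list of tuples like this can be passed as the `data` argument of `requests.post` alongside `files`, which preserves the repeated `aclRoles` entries.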

Processing Pipeline

After upload, documents go through:
  1. Extraction - Text is extracted from the document
  2. Chunking - Text is split into searchable segments
  3. Embedding - Chunks are converted to vectors
  4. Indexing - Vectors are indexed for search
This typically takes 10-60 seconds depending on document size.
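
Because processing is asynchronous, a client usually waits until the document reaches Indexed or Failed before querying it. This page does not document a status endpoint, so the sketch below takes a caller-supplied `fetch_status` function (hypothetical) that returns the current status string. The timeout and interval defaults are assumptions chosen around the 10-60 second estimate above.

```python
import time

TERMINAL_STATUSES = {"Indexed", "Failed"}


def wait_until_processed(fetch_status, timeout=120.0, interval=5.0,
                         sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status() until a terminal status is returned or time runs out."""
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"document still {status!r} after {timeout} seconds")
        sleep(interval)


# Simulated status sequence standing in for real status lookups.
simulated = iter(["Pending", "Processing", "Indexed"])
print(wait_until_processed(lambda: next(simulated), interval=0))  # Indexed
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real delays.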

Code Examples

JavaScript

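const fs = require('fs');
const axios = require('axios');
const FormData = require('form-data');

// API key read from the environment (the variable name is illustrative)
const API_KEY = process.env.DOCBIT_API_KEY;

async function uploadDocument(filePath, aclRoles = []) {
  const form = new FormData();
  form.append('file', fs.createReadStream(filePath));

  // Repeat the aclRoles field once per role; send none for organization-wide access
  aclRoles.forEach(role => {
    form.append('aclRoles', role);
  });

  const response = await axios.post(
    'https://api.docbit.ai/api/documents/upload',
    form,
    {
      headers: {
        'Authorization': `ApiKey ${API_KEY}`,
        'X-External-Org-Id': 'acme',
        'X-External-User-Id': 'admin',
        'X-External-Roles': '["admin"]',
        // Adds the multipart Content-Type header with its boundary
        ...form.getHeaders()
      }
    }
  );

  return response.data;
}

// Upload HR document (wrapped in a function because CommonJS has no top-level await)
async function main() {
  const result = await uploadDocument('./hr-policy.pdf', ['hr']);
  console.log('Document ID:', result.documentId);
}

main().catch(err => console.error('Upload failed:', err.message));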

Python

import os

import requests

# API key read from the environment (the variable name is illustrative)
API_KEY = os.environ['DOCBIT_API_KEY']

def upload_document(file_path, acl_roles=None):
    with open(file_path, 'rb') as f:
        files = {'file': f}
        data = {}

        # A list value is sent as one aclRoles field per role;
        # omit it for organization-wide access
        if acl_roles:
            data['aclRoles'] = acl_roles

        response = requests.post(
            'https://api.docbit.ai/api/documents/upload',
            files=files,
            data=data,
            headers={
                'Authorization': f'ApiKey {API_KEY}',
                'X-External-Org-Id': 'acme',
                'X-External-User-Id': 'admin',
                'X-External-Roles': '["admin"]'
            }
        )
        # Raise an exception on 4xx/5xx responses
        response.raise_for_status()
        return response.json()

# Upload HR document
result = upload_document('./hr-policy.pdf', acl_roles=['hr'])
print(f"Document ID: {result['documentId']}")

Error Responses

| Status | Error | Description |
|--------|-------|-------------|
| 400 | No file provided | File not included in request |
| 400 | Unsupported file type | File format not supported |
| 413 | Request entity too large | File exceeds size limit |
| 401 | Invalid API key | Authentication failed |
| 429 | Rate limit exceeded | Too many requests |
See Error Codes for complete error documentation.
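
Of the errors listed above, only 429 is worth retrying as-is; the others need a change to the request first. The sketch below encodes that decision and a simple exponential backoff schedule. The backoff values are an assumption for illustration; this page does not specify retry guidance, and the helper names are hypothetical.

```python
# Short client-side hints per status code, based on the error table.
ERROR_HINTS = {
    400: "Check that a supported file is attached as the 'file' field.",
    401: "Check the API key and its ApiKey prefix.",
    413: "The file exceeds the size limit for its type.",
    429: "Too many requests; wait before retrying.",
}


def should_retry(status_code):
    """Only rate-limit errors are retried without changing the request."""
    return status_code == 429


def backoff_delays(max_retries, base=1.0, cap=30.0):
    """Return the wait in seconds before each retry: base, 2*base, 4*base, ... up to cap."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(max_retries)]


print(should_retry(429), should_retry(413))  # True False
print(backoff_delays(4))                     # [1.0, 2.0, 4.0, 8.0]
```

A caller would sleep for each delay in turn between attempts and give up once the list is exhausted.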