Rate Limits

Documenso implements rate limiting to ensure fair usage and maintain service quality. All API endpoints are subject to rate limits, applied per API token for authenticated requests and per IP address for file uploads and anonymous traffic.

Rate Limit Overview

API v1

100 requests per minute, per API token

API v2

100 requests per minute, per API token

tRPC API

100 requests per minute, per API token

File Upload

20 uploads per minute, per IP address

Rate Limit Implementation

Documenso uses a distributed, database-backed rate limiting system that works across multiple server instances.

How It Works

Request Arrives

When your API request arrives, the rate limiter identifies you by:

IP Address: For global/anonymous limits
User/Token ID: For authenticated limits

Time Bucket Calculation

The current time is divided into fixed windows (buckets):

Source: packages/lib/server-only/rate-limit/rate-limit.ts

export const getBucket = (windowMs: number): Date => {
  const now = Date.now();
  return new Date(now - (now % windowMs));
};

For a 1-minute window, every request from 10:00:00.000 through 10:00:59.999 falls into the same bucket. A request at 10:01:00.000 starts a new bucket with a fresh counter.
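To make the bucket arithmetic concrete, here is a small worked example. It uses a hypothetical variant of getBucket, called bucketAt here, that takes the timestamp as an argument so the result is deterministic; the name and the extra parameter are assumptions for illustration and are not part of the Documenso source.

```typescript
// Hypothetical variant of getBucket that accepts the timestamp explicitly,
// which makes the calculation reproducible.
const bucketAt = (nowMs: number, windowMs: number): Date =>
  new Date(nowMs - (nowMs % windowMs));

const ONE_MINUTE_MS = 60_000;

// Months are zero-based in Date.UTC, so 2 means March.
const startOfMinute = bucketAt(Date.UTC(2026, 2, 4, 10, 0, 0), ONE_MINUTE_MS);
const endOfMinute = bucketAt(Date.UTC(2026, 2, 4, 10, 0, 59, 999), ONE_MINUTE_MS);
const nextMinute = bucketAt(Date.UTC(2026, 2, 4, 10, 1, 0), ONE_MINUTE_MS);

console.log(startOfMinute.toISOString()); // 2026-03-04T10:00:00.000Z
console.log(endOfMinute.toISOString());   // 2026-03-04T10:00:00.000Z
console.log(nextMinute.toISOString());    // 2026-03-04T10:01:00.000Z
```

The first two timestamps map to the same bucket, while the third starts a new one.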

Counter Increment

The system atomically increments a counter in the database for your bucket:

Creates a new counter if this is your first request in the window
Increments existing counter if you’ve made requests before

Limit Check

Your request count is compared against the limit:

Under limit: Request proceeds normally
Over limit: Request is rejected with 429 status
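The increment and limit-check steps above can be sketched with an in-memory map standing in for the database. This is a simplified illustration, not the Documenso implementation: checkLimit, CheckResult, and the counters map are hypothetical names, and a real deployment needs the atomic database update described above to work across server instances.

```typescript
// In-memory stand-in for the database table (assumption for illustration).
const counters = new Map<string, number>();

interface CheckResult {
  isLimited: boolean;
  remaining: number;
  reset: Date;
}

// Hypothetical helper: count one request for `key` and compare it to `max`.
function checkLimit(key: string, max: number, windowMs: number, nowMs: number): CheckResult {
  const bucketStart = nowMs - (nowMs % windowMs);
  const counterKey = `${key}:${bucketStart}`;

  // First request in the window creates the counter; later ones increment it.
  const count = (counters.get(counterKey) ?? 0) + 1;
  counters.set(counterKey, count);

  return {
    isLimited: count > max,
    remaining: Math.max(0, max - count),
    reset: new Date(bucketStart + windowMs),
  };
}

// With a limit of 3 per minute, the fourth request in the same minute is rejected.
const demoNow = Date.UTC(2026, 2, 4, 10, 0, 0);
const demoResults = [1, 2, 3, 4].map(() => checkLimit('id:demo', 3, 60_000, demoNow));
console.log(demoResults.map((result) => result.isLimited)); // [ false, false, false, true ]
```

Because the counter key includes the bucket start, a request in the next window begins again at a count of one.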

Bucket Configuration

Source: packages/lib/server-only/rate-limit/rate-limits.ts

export const apiV1RateLimit = createRateLimit({
  action: 'api.v1',
  max: 100,        // Maximum requests per window
  window: '1m',    // Time window (1 minute)
});

export const apiV2RateLimit = createRateLimit({
  action: 'api.v2',
  max: 100,
  window: '1m',
});

export const fileUploadRateLimit = createRateLimit({
  action: 'api.file-upload',
  max: 20,
  window: '1m',
});
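The window values in these configurations ('1m', and elsewhere '15m' and '1h') are duration strings. How createRateLimit parses them is not shown here, so the following is only a sketch of one plausible conversion to milliseconds. The parseWindow name and the 's' and 'd' units are assumptions.

```typescript
// Milliseconds per unit. Only 'm' and 'h' appear in the documented configs;
// 's' and 'd' are included as assumptions.
const UNIT_MS: Record<string, number> = {
  s: 1_000,
  m: 60_000,
  h: 3_600_000,
  d: 86_400_000,
};

// Hypothetical parser for window strings such as '1m' or '15m'.
function parseWindow(spec: string): number {
  const match = /^(\d+)([smhd])$/.exec(spec);
  if (!match) {
    throw new Error(`Unsupported window: ${spec}`);
  }
  return Number(match[1]) * UNIT_MS[match[2]];
}

console.log(parseWindow('1m'));  // 60000
console.log(parseWindow('15m')); // 900000
console.log(parseWindow('1h'));  // 3600000
```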

Rate Limit Headers

Every API response includes rate limit information in the headers:

X-RateLimit-Limit

number

The maximum number of requests allowed in the current window

X-RateLimit-Remaining

number

The number of requests remaining in the current window

X-RateLimit-Reset

timestamp

The time when the current rate limit window resets (ISO 8601 format)

Example Response Headers

HTTP/1.1 200 OK
Content-Type: application/json
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 87
X-RateLimit-Reset: 2026-03-04T10:01:00.000Z
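A client can turn these three headers into a typed value before acting on them. The sketch below assumes the header names and formats documented above; readRateLimit, RateLimitInfo, and HeaderReader are hypothetical names. HeaderReader is a minimal interface that the Headers object returned by fetch also satisfies.

```typescript
// Minimal shape of a header container; fetch's Headers object matches it.
interface HeaderReader {
  get(name: string): string | null;
}

interface RateLimitInfo {
  limit: number;
  remaining: number;
  reset: Date;
}

// Hypothetical helper: returns null when any header is missing or malformed.
function readRateLimit(headers: HeaderReader): RateLimitInfo | null {
  const limit = Number.parseInt(headers.get('X-RateLimit-Limit') ?? '', 10);
  const remaining = Number.parseInt(headers.get('X-RateLimit-Remaining') ?? '', 10);
  const reset = new Date(headers.get('X-RateLimit-Reset') ?? '');

  if (Number.isNaN(limit) || Number.isNaN(remaining) || Number.isNaN(reset.getTime())) {
    return null;
  }
  return { limit, remaining, reset };
}

// Stub built from the example response above (unlike Headers, it is case-sensitive).
const sampleValues: Record<string, string> = {
  'X-RateLimit-Limit': '100',
  'X-RateLimit-Remaining': '87',
  'X-RateLimit-Reset': '2026-03-04T10:01:00.000Z',
};
const sampleHeaders: HeaderReader = { get: (name) => sampleValues[name] ?? null };

const parsedInfo = readRateLimit(sampleHeaders);
console.log(parsedInfo?.remaining); // 87
```

With a real response, pass response.headers directly to readRateLimit.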

Rate Limit Exceeded

When you exceed the rate limit, you’ll receive a 429 Too Many Requests response:

{
  "error": "Rate limit exceeded",
  "message": "Too many requests. Please try again later."
}

When you are rate limited, wait until the reset time before sending more requests. The Retry-After header gives the number of seconds to wait.
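One way to decide how long to pause after a 429 is to prefer Retry-After and fall back to X-RateLimit-Reset. The helper below is a sketch under that assumption; the waitMs name and the one-second default are illustrative choices, not documented behavior.

```typescript
// Hypothetical helper: milliseconds to wait after a 429 response.
function waitMs(retryAfter: string | null, resetIso: string | null, nowMs: number): number {
  // Retry-After is documented above as a number of seconds.
  const seconds = Number.parseInt(retryAfter ?? '', 10);
  if (!Number.isNaN(seconds) && seconds >= 0) {
    return seconds * 1000;
  }

  // Fall back to the window reset timestamp (ISO 8601).
  const resetMs = Date.parse(resetIso ?? '');
  if (!Number.isNaN(resetMs)) {
    return Math.max(0, resetMs - nowMs);
  }

  // Neither header is usable: arbitrary one-second default (assumption).
  return 1000;
}

const fifteenSecondsBeforeReset = Date.UTC(2026, 2, 4, 10, 0, 45);
console.log(waitMs('30', null, fifteenSecondsBeforeReset)); // 30000
console.log(waitMs(null, '2026-03-04T10:01:00.000Z', fifteenSecondsBeforeReset)); // 15000
```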

Handling Rate Limits

1. Monitor Rate Limit Headers

Always check rate limit headers in your responses:

const response = await fetch('https://app.documenso.com/api/v2/envelopes', {
  headers: {
    'Authorization': 'Bearer api_YOUR_TOKEN'
  }
});

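// Header values arrive as strings; a missing header yields NaN or an invalid Date.
const rateLimit = {
  limit: parseInt(response.headers.get('X-RateLimit-Limit') ?? '', 10),
  remaining: parseInt(response.headers.get('X-RateLimit-Remaining') ?? '', 10),
  reset: new Date(response.headers.get('X-RateLimit-Reset') ?? '')
};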

console.log(`Requests remaining: ${rateLimit.remaining}/${rateLimit.limit}`);

if (rateLimit.remaining < 10) {
  console.warn('Approaching rate limit!');
}

2. Implement Exponential Backoff

When you receive a 429 response, wait for the time given in the Retry-After header if it is present; otherwise, back off exponentially between retries:

async function makeRequestWithRetry(url, options, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, options);
    
    if (response.status === 429) {
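      // Prefer the server-provided Retry-After (in seconds); otherwise back off exponentially.
      const retryAfter = parseInt(response.headers.get('Retry-After') ?? '', 10);
      const waitTime = Number.isNaN(retryAfter)
        ? Math.pow(2, attempt) * 1000
        : retryAfter * 1000;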
      
      console.log(`Rate limited. Waiting ${waitTime}ms before retry...`);
      await new Promise(resolve => setTimeout(resolve, waitTime));
      continue;
    }
    
    return response;
  }
  
  throw new Error('Max retries exceeded');
}

3. Batch Operations

Reduce API calls by batching operations when possible:

// Good: Single request with array
await fetch(`https://app.documenso.com/api/v1/documents/${docId}/fields`, {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer api_YOUR_TOKEN',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify([
    { type: 'SIGNATURE', recipientId: 1, pageNumber: 1, pageX: 100, pageY: 200, pageWidth: 200, pageHeight: 50 },
    { type: 'DATE', recipientId: 1, pageNumber: 1, pageX: 100, pageY: 300, pageWidth: 150, pageHeight: 30 },
    { type: 'NAME', recipientId: 1, pageNumber: 1, pageX: 100, pageY: 350, pageWidth: 200, pageHeight: 30 }
  ])
});

// Bad: One request per field
for (const field of fields) {
  await fetch(`https://app.documenso.com/api/v1/documents/${docId}/fields`, {
    method: 'POST',
    headers: {
      'Authorization': 'Bearer api_YOUR_TOKEN',
      'Content-Type': 'application/json'
    },
    body: JSON.stringify(field)
  });
}

4. Cache Responses

Cache API responses when the data doesn’t change frequently:

const cache = new Map();
const CACHE_TTL = 5 * 60 * 1000; // 5 minutes

async function getEnvelopeWithCache(envelopeId) {
  const cacheKey = `envelope:${envelopeId}`;
  const cached = cache.get(cacheKey);
  
  if (cached && Date.now() - cached.timestamp < CACHE_TTL) {
    return cached.data;
  }
  
  const response = await fetch(
    `https://app.documenso.com/api/v2/envelopes/${envelopeId}`,
    { headers: { 'Authorization': 'Bearer api_YOUR_TOKEN' } }
  );
  
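  // Do not cache error responses (for example, a 429 body).
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }

  const data = await response.json();
  cache.set(cacheKey, { data, timestamp: Date.now() });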
  
  return data;
}

Multi-Tier Rate Limiting

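Documenso implements both per-identifier and global (IP-based) rate limits. The abridged excerpt below shows the order of the checks: the IP counter is incremented and compared against globalMax first, then the identifier counter is incremented and compared against max.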

Source: packages/lib/server-only/rate-limit/rate-limit.ts

export const createRateLimit = (config: RateLimitConfig) => {
  return {
    async check(params: CheckParams): Promise<RateLimitCheckResult> {
      // Check IP against globalMax (if set)
      const ipResult = await prisma.rateLimit.upsert({
        where: {
          key_action_bucket: {
            key: `ip:${params.ip}`,
            action: config.action,
            bucket,
          },
        },
        create: { count: 1 },
        update: { count: { increment: 1 } },
      });
      
      if (config.globalMax && ipResult.count > config.globalMax) {
        return { isLimited: true, ... };
      }
      
      // Check identifier against max (if provided)
      if (params.identifier) {
        const identifierResult = await prisma.rateLimit.upsert({
          where: {
            key_action_bucket: {
              key: `id:${params.identifier}`,
              action: config.action,
              bucket,
            },
          },
          create: { count: 1 },
          update: { count: { increment: 1 } },
        });
        
        if (identifierResult.count > config.max) {
          return { isLimited: true, ... };
        }
      }
    }
  };
};

Example: Authentication Rate Limits

Some endpoints use stricter limits:

export const loginRateLimit = createRateLimit({
  action: 'auth.login',
  max: 10,        // 10 attempts per identifier
  globalMax: 50,  // 50 attempts per IP
  window: '15m',  // 15 minute window
});

export const forgotPasswordRateLimit = createRateLimit({
  action: 'auth.forgot-password',
  max: 3,         // 3 attempts per email
  globalMax: 20,  // 20 attempts per IP
  window: '1h',   // 1 hour window
});
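To see why both tiers matter, the following sketch replays the login limits above (10 per identifier, 50 per IP, 15-minute window) against an in-memory counter. It illustrates the check order only and is not the Documenso implementation: checkTiers, TierConfig, bump, the IP address, and the email addresses are all hypothetical.

```typescript
interface TierConfig {
  max: number;        // per-identifier limit
  globalMax?: number; // per-IP limit
  windowMs: number;
}

// In-memory stand-in for the database counters (assumption for illustration).
const tierCounters = new Map<string, number>();

function bump(key: string): number {
  const count = (tierCounters.get(key) ?? 0) + 1;
  tierCounters.set(key, count);
  return count;
}

// Returns true when the request should be rejected.
function checkTiers(config: TierConfig, ip: string, identifier: string | undefined, nowMs: number): boolean {
  const bucket = nowMs - (nowMs % config.windowMs);

  // The IP tier is counted and checked first.
  const ipCount = bump(`ip:${ip}:${bucket}`);
  if (config.globalMax !== undefined && ipCount > config.globalMax) {
    return true;
  }

  // The identifier tier is checked only when an identifier is provided.
  if (identifier !== undefined) {
    const idCount = bump(`id:${identifier}:${bucket}`);
    if (idCount > config.max) {
      return true;
    }
  }
  return false;
}

// One IP rotating through six accounts never exceeds 10 attempts per account,
// but it still hits the per-IP limit of 50.
const loginLimits: TierConfig = { max: 10, globalMax: 50, windowMs: 15 * 60_000 };
const fixedNow = Date.UTC(2026, 2, 4, 10, 0, 0);
let firstBlockedAttempt = 0;

for (let attempt = 1; attempt <= 54; attempt++) {
  const email = `user${attempt % 6}@example.com`;
  const blocked = checkTiers(loginLimits, '203.0.113.7', email, fixedNow);
  if (blocked && firstBlockedAttempt === 0) {
    firstBlockedAttempt = attempt;
  }
}

console.log(firstBlockedAttempt); // 51
```

The per-identifier limit alone would not stop this pattern, because no single account sees more than nine attempts.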

Bypassing Rate Limits (Development Only)

Only for development/testing environments. Never use in production.

You can bypass rate limits in development by setting an environment variable:

.env

DANGEROUS_BYPASS_RATE_LIMITS=true

When enabled, all rate limit checks return success:

if (process.env.DANGEROUS_BYPASS_RATE_LIMITS === 'true') {
  return {
    isLimited: false,
    remaining: config.max,
    limit: config.max,
    reset,
  };
}

Rate Limit Best Practices

Respect Rate Limits

Always check X-RateLimit-Remaining headers
Implement throttling before hitting limits
Use exponential backoff when rate limited
Monitor your usage patterns
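Throttling before the limit is reached can be as simple as spacing requests by the time left in the window divided by the remaining quota. The helper below is a sketch of that idea using the X-RateLimit-Remaining and X-RateLimit-Reset values; paceDelayMs is a hypothetical name.

```typescript
// Hypothetical helper: delay before the next request so that the remaining
// quota is spread evenly over the time left in the current window.
function paceDelayMs(remaining: number, resetMs: number, nowMs: number): number {
  const msUntilReset = Math.max(0, resetMs - nowMs);
  if (remaining <= 0) {
    return msUntilReset; // quota used up: wait for the next window
  }
  return Math.floor(msUntilReset / remaining);
}

const resetAt = Date.parse('2026-03-04T10:01:00.000Z');
const nowAt = Date.parse('2026-03-04T10:00:30.000Z');

console.log(paceDelayMs(10, resetAt, nowAt)); // 3000
console.log(paceDelayMs(0, resetAt, nowAt));  // 30000
```

With 10 requests left and 30 seconds until the reset, waiting 3 seconds between requests uses the quota without triggering a 429.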

Optimize API Usage

Batch operations when possible
Cache responses appropriately
Use webhooks instead of polling
Only fetch data when needed

Handle Errors Gracefully

Implement retry logic with backoff
Queue requests during rate limit periods
Log rate limit incidents for analysis
Alert on repeated rate limiting

Distribute Load

Use multiple API tokens for high-volume applications
Distribute requests across time windows
Implement request queuing systems
Consider upgrading to higher tiers (if available)
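For a queue that distributes requests across time windows, one simple approach is to plan send times at a fixed interval derived from the limit. The sketch below assumes the 100 requests per minute limit listed in the overview; planSendTimes is a hypothetical name, and a production queue would also react to 429 responses.

```typescript
// Hypothetical planner: space `count` requests evenly so that no more than
// `maxPerWindow` of them fall within any span of `windowMs`.
function planSendTimes(count: number, maxPerWindow: number, windowMs: number, startMs: number): number[] {
  const intervalMs = Math.ceil(windowMs / maxPerWindow);
  return Array.from({ length: count }, (_, index) => startMs + index * intervalMs);
}

// 250 requests at 100 per minute: one request every 600 ms.
const sendTimes = planSendTimes(250, 100, 60_000, 0);

console.log(sendTimes[1]);   // 600
console.log(sendTimes[249]); // 149400
```

Each offset can be passed to setTimeout, or used by a worker that sleeps until the next planned time.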

Monitoring Rate Limits

Implement monitoring to track rate limit usage:

Rate Limit Monitoring

class RateLimitMonitor {
  constructor() {
    this.metrics = {
      totalRequests: 0,
      rateLimitedRequests: 0,
      lowestRemaining: Infinity
    };
  }
  
  recordRequest(response) {
    this.metrics.totalRequests++;
    
    if (response.status === 429) {
      this.metrics.rateLimitedRequests++;
    }
    
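    // Skip responses that do not carry the header (parseInt returns NaN).
    const remaining = parseInt(response.headers.get('X-RateLimit-Remaining') ?? '', 10);
    if (!Number.isNaN(remaining) && remaining < this.metrics.lowestRemaining) {
      this.metrics.lowestRemaining = remaining;
    }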
  }
  
  getMetrics() {
    return {
      ...this.metrics,
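      // Avoid dividing by zero before any request has been recorded.
      rateLimitPercentage: this.metrics.totalRequests === 0
        ? '0.00'
        : (this.metrics.rateLimitedRequests / this.metrics.totalRequests * 100).toFixed(2)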
    };
  }
}

Next Steps

API Reference

Webhooks