Documenso rate-limits all API endpoints to ensure fair usage and maintain service quality. Requests are counted per API token or per IP address, depending on the endpoint.

Rate Limit Overview

API v1

100 requests per minute, per API token

API v2

100 requests per minute, per API token

tRPC API

100 requests per minute, per API token

File Upload

20 uploads per minute, per IP address

Rate Limit Implementation

Documenso uses a distributed, database-backed rate limiting system that works across multiple server instances.

How It Works

1. Request Arrives

When your API request arrives, the rate limiter identifies you by:
  • IP Address: For global/anonymous limits
  • User/Token ID: For authenticated limits

2. Time Bucket Calculation

The current time is divided into fixed windows (buckets):
Source: packages/lib/server-only/rate-limit/rate-limit.ts
export const getBucket = (windowMs: number): Date => {
  const now = Date.now();
  return new Date(now - (now % windowMs));
};
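For a 1-minute window, every request that arrives from 10:00:00.000 through 10:00:59.999 falls into the same bucket, which starts at 10:00:00. The counter starts over when the next bucket begins at 10:01:00.

3. Counter Increment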

The system atomically increments a counter in the database for your bucket:
  • Creates a new counter if this is your first request in the window
  • Increments existing counter if you’ve made requests before

4. Limit Check

Your request count is compared against the limit:
  • Under limit: Request proceeds normally
  • Over limit: Request is rejected with 429 status
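
The four steps above can be sketched as an in-memory fixed-window counter. This is an illustration only: Documenso keeps its counters in the database, while createInMemoryLimiter, bucketStart, and the explicit nowMs parameter are hypothetical names introduced here so the example runs without a database or a real clock.

```typescript
// Hypothetical in-memory illustration of the fixed-window steps.
// The real limiter persists counters in the database; this Map never
// evicts old buckets, which is acceptable only for a short demo.

type CheckResult = {
  isLimited: boolean;
  remaining: number;
  reset: Date;
};

// Step 2: round a timestamp down to the start of its window.
const bucketStart = (nowMs: number, windowMs: number): number =>
  nowMs - (nowMs % windowMs);

const createInMemoryLimiter = (max: number, windowMs: number) => {
  const counts = new Map<string, number>();

  return (key: string, nowMs: number): CheckResult => {
    const bucket = bucketStart(nowMs, windowMs);
    const counterKey = `${key}:${bucket}`;

    // Step 3: start the counter at 1 or increment the existing one.
    const count = (counts.get(counterKey) ?? 0) + 1;
    counts.set(counterKey, count);

    // Step 4: compare the count against the limit.
    return {
      isLimited: count > max,
      remaining: Math.max(0, max - count),
      reset: new Date(bucket + windowMs),
    };
  };
};

// Demo with a deliberately small limit of 2 requests per minute.
const check = createInMemoryLimiter(2, 60_000);
const start = Date.UTC(2026, 2, 4, 10, 0, 30); // 2026-03-04 10:00:30 UTC

const first = check("id:token-a", start);
const second = check("id:token-a", start + 1_000);
const third = check("id:token-a", start + 2_000); // third request in the bucket
const nextWindow = check("id:token-a", start + 30_000); // 10:01:00, new bucket
```

In this demo the third call is over the limit, and the call at 10:01:00 is counted in a fresh bucket.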

Bucket Configuration

Source: packages/lib/server-only/rate-limit/rate-limits.ts
export const apiV1RateLimit = createRateLimit({
  action: 'api.v1',
  max: 100,        // Maximum requests per window
  window: '1m',    // Time window (1 minute)
});

export const apiV2RateLimit = createRateLimit({
  action: 'api.v2',
  max: 100,
  window: '1m',
});

export const fileUploadRateLimit = createRateLimit({
  action: 'api.file-upload',
  max: 20,
  window: '1m',
});
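
The window values above are short duration strings. As a rough illustration of what they mean in milliseconds, the sketch below converts them and derives the average request spacing that keeps a client at the limit. parseWindow and minSpacingMs are hypothetical helpers written for this page, not Documenso's own parser, and they assume only the s, m, h, and d suffixes.

```typescript
// Hypothetical helper: converts a window string such as '1m' to milliseconds.
// The supported suffixes are an assumption made for this example.
const parseWindow = (value: string): number => {
  const match = /^(\d+)([smhd])$/.exec(value);
  if (match === null) {
    throw new Error(`Unsupported window format: ${value}`);
  }

  const amount = Number(match[1]);
  switch (match[2]) {
    case "s":
      return amount * 1_000;
    case "m":
      return amount * 60_000;
    case "h":
      return amount * 3_600_000;
    case "d":
      return amount * 86_400_000;
    default:
      throw new Error(`Unsupported window unit in: ${value}`);
  }
};

// Average spacing between requests that uses the whole allowance
// without exceeding it.
const minSpacingMs = (max: number, window: string): number =>
  parseWindow(window) / max;
```

With the values listed above, 100 requests per '1m' averages one request every 600 ms, and 20 uploads per '1m' averages one upload every 3 seconds.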

Rate Limit Headers

Every API response includes rate limit information in the headers:
  • X-RateLimit-Limit (number): The maximum number of requests allowed in the current window
  • X-RateLimit-Remaining (number): The number of requests remaining in the current window
  • X-RateLimit-Reset (timestamp): The time when the current rate limit window resets (ISO 8601 format)

Example Response Headers

HTTP/1.1 200 OK
Content-Type: application/json
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 87
X-RateLimit-Reset: 2026-03-04T10:01:00.000Z

Rate Limit Exceeded

When you exceed the rate limit, you’ll receive a 429 Too Many Requests response:
{
  "error": "Rate limit exceeded",
  "message": "Too many requests. Please try again later."
}
When rate limited, wait until the reset time before making additional requests. The Retry-After header indicates how many seconds to wait.

Handling Rate Limits

1. Monitor Rate Limit Headers

Always check rate limit headers in your responses:
const response = await fetch('https://app.documenso.com/api/v2/envelopes', {
  headers: {
    'Authorization': 'Bearer api_YOUR_TOKEN'
  }
});

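// headers.get() returns null when a header is absent. The fallbacks below
// turn a missing header into NaN / Invalid Date instead of a misleading value
// (new Date(null) would silently become 1970-01-01).
const rateLimit = {
  limit: parseInt(response.headers.get('X-RateLimit-Limit') ?? '', 10),
  remaining: parseInt(response.headers.get('X-RateLimit-Remaining') ?? '', 10),
  reset: new Date(response.headers.get('X-RateLimit-Reset') ?? NaN)
};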

console.log(`Requests remaining: ${rateLimit.remaining}/${rateLimit.limit}`);

if (rateLimit.remaining < 10) {
  console.warn('Approaching rate limit!');
}
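
Beyond logging a warning, the same headers can drive client-side pacing. The sketch below spreads the remaining requests evenly over the time left in the current window. pacingDelayMs and RateLimitSnapshot are hypothetical names, and the sketch assumes X-RateLimit-Reset has already been parsed into a Date as shown above.

```typescript
type RateLimitSnapshot = {
  remaining: number; // parsed X-RateLimit-Remaining
  reset: Date; // parsed X-RateLimit-Reset
};

// Returns how long to wait before the next request so that the remaining
// allowance lasts until the window resets.
const pacingDelayMs = (snapshot: RateLimitSnapshot, nowMs: number): number => {
  const resetMs = snapshot.reset.getTime();

  // Missing or unparsable headers: no pacing information, do not delay.
  if (Number.isNaN(resetMs) || !Number.isFinite(snapshot.remaining)) {
    return 0;
  }

  const msUntilReset = Math.max(0, resetMs - nowMs);

  // Nothing left in this window: wait for the next one.
  if (snapshot.remaining <= 0) {
    return msUntilReset;
  }

  return Math.floor(msUntilReset / snapshot.remaining);
};

// Example: 10 requests left with 30 seconds until reset.
const exampleDelay = pacingDelayMs(
  { remaining: 10, reset: new Date("2026-03-04T10:01:00.000Z") },
  Date.UTC(2026, 2, 4, 10, 0, 30),
);
```

Here exampleDelay is 3000, that is, one request every three seconds until the window resets.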

2. Implement Exponential Backoff

When you receive a 429 response, implement exponential backoff:
async function makeRequestWithRetry(url, options, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, options);
    
    if (response.status === 429) {
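      // Prefer the server's Retry-After value (in seconds); fall back to
      // exponential backoff when the header is missing or not a number.
      const retryAfter = parseInt(response.headers.get('Retry-After') ?? '', 10);
      const waitTime = Number.isNaN(retryAfter)
        ? Math.pow(2, attempt) * 1000
        : retryAfter * 1000;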
      
      console.log(`Rate limited. Waiting ${waitTime}ms before retry...`);
      await new Promise(resolve => setTimeout(resolve, waitTime));
      continue;
    }
    
    return response;
  }
  
  throw new Error('Max retries exceeded');
}
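
Two refinements are common in practice and are sketched below as pure helpers. First, HTTP allows Retry-After to be either a number of seconds or an HTTP date, so retryAfterMs handles both. Second, backoffMs caps the exponential delay and adds random jitter so that many clients do not retry in lockstep. Both function names, the 1-second base, and the 30-second cap are assumptions for illustration, not part of the Documenso API.

```typescript
// Converts a Retry-After header value to milliseconds.
// Returns null when the header is absent or cannot be interpreted.
const retryAfterMs = (header: string | null, nowMs: number): number | null => {
  if (header === null) return null;

  const value = header.trim();
  if (value === "") return null;

  // Form 1: delay in whole seconds, e.g. "30".
  if (/^\d+$/.test(value)) return Number(value) * 1000;

  // Form 2: HTTP date, e.g. "Wed, 04 Mar 2026 10:01:00 GMT".
  const dateMs = Date.parse(value);
  return Number.isNaN(dateMs) ? null : Math.max(0, dateMs - nowMs);
};

// Exponential backoff with a cap and "equal jitter": half of the delay is
// fixed and the other half is random. `random` is injectable for testing.
const backoffMs = (
  attempt: number,
  random: () => number = Math.random,
  baseMs = 1000,
  capMs = 30_000,
): number => {
  const exponential = Math.min(capMs, baseMs * 2 ** attempt);
  const half = exponential / 2;
  return Math.floor(half + random() * half);
};

// Picks the server's hint when available, otherwise the jittered backoff.
const nextWaitMs = (header: string | null, attempt: number, nowMs: number): number =>
  retryAfterMs(header, nowMs) ?? backoffMs(attempt);
```

Inside a retry loop such as makeRequestWithRetry, nextWaitMs(response.headers.get('Retry-After'), attempt, Date.now()) could replace the inline wait-time calculation.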

3. Batch Operations

Reduce API calls by batching operations when possible:
// Good: Single request with array
await fetch(`https://app.documenso.com/api/v1/documents/${docId}/fields`, {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer api_YOUR_TOKEN',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify([
    { type: 'SIGNATURE', recipientId: 1, pageNumber: 1, pageX: 100, pageY: 200, pageWidth: 200, pageHeight: 50 },
    { type: 'DATE', recipientId: 1, pageNumber: 1, pageX: 100, pageY: 300, pageWidth: 150, pageHeight: 30 },
    { type: 'NAME', recipientId: 1, pageNumber: 1, pageX: 100, pageY: 350, pageWidth: 200, pageHeight: 30 }
  ])
});

// Bad: Multiple requests
for (const field of fields) {
  await fetch(`https://app.documenso.com/api/v1/documents/${docId}/fields`, {
    method: 'POST',
    body: JSON.stringify(field)
  });
}

4. Cache Responses

Cache API responses when the data doesn’t change frequently:
const cache = new Map();
const CACHE_TTL = 5 * 60 * 1000; // 5 minutes

async function getEnvelopeWithCache(envelopeId) {
  const cacheKey = `envelope:${envelopeId}`;
  const cached = cache.get(cacheKey);
  
  if (cached && Date.now() - cached.timestamp < CACHE_TTL) {
    return cached.data;
  }
  
  const response = await fetch(
    `https://app.documenso.com/api/v2/envelopes/${envelopeId}`,
    { headers: { 'Authorization': 'Bearer api_YOUR_TOKEN' } }
  );
  
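  // Do not cache error responses such as 429.
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }

  const data = await response.json();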
  cache.set(cacheKey, { data, timestamp: Date.now() });
  
  return data;
}

Multi-Tier Rate Limiting

Documenso implements both per-identifier and global (IP-based) rate limits:
Source: packages/lib/server-only/rate-limit/rate-limit.ts
export const createRateLimit = (config: RateLimitConfig) => {
  return {
    async check(params: CheckParams): Promise<RateLimitCheckResult> {
      // Check IP against globalMax (if set)
      const ipResult = await prisma.rateLimit.upsert({
        where: {
          key_action_bucket: {
            key: `ip:${params.ip}`,
            action: config.action,
            bucket,
          },
        },
        create: { count: 1 },
        update: { count: { increment: 1 } },
      });
      
      if (config.globalMax && ipResult.count > config.globalMax) {
        return { isLimited: true, ... };
      }
      
      // Check identifier against max (if provided)
      if (params.identifier) {
        const identifierResult = await prisma.rateLimit.upsert({
          where: {
            key_action_bucket: {
              key: `id:${params.identifier}`,
              action: config.action,
              bucket,
            },
          },
          create: { count: 1 },
          update: { count: { increment: 1 } },
        });
        
        if (identifierResult.count > config.max) {
          return { isLimited: true, ... };
        }
      }
    }
  };
};

Example: Authentication Rate Limits

Some endpoints use stricter limits:
export const loginRateLimit = createRateLimit({
  action: 'auth.login',
  max: 10,        // 10 attempts per identifier
  globalMax: 50,  // 50 attempts per IP
  window: '15m',  // 15 minute window
});

export const forgotPasswordRateLimit = createRateLimit({
  action: 'auth.forgot-password',
  max: 3,         // 3 attempts per email
  globalMax: 20,  // 20 attempts per IP
  window: '1h',   // 1 hour window
});
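
To make the interaction of the two tiers concrete, the sketch below replays the auth.login numbers (max: 10, globalMax: 50) against an in-memory counter. It is a simplified, hypothetical model: createTwoTierCounter is not a Documenso function, it ignores time windows, and it mirrors only the order of the checks shown in the excerpt above (IP first, then identifier).

```typescript
type TwoTierConfig = { max: number; globalMax?: number };
type TierResult = { isLimited: boolean; tier: "ip" | "identifier" | null };

// Hypothetical in-memory model of the two-tier check (single window only).
const createTwoTierCounter = (config: TwoTierConfig) => {
  const counts = new Map<string, number>();

  const increment = (key: string): number => {
    const next = (counts.get(key) ?? 0) + 1;
    counts.set(key, next);
    return next;
  };

  return (ip: string, identifier?: string): TierResult => {
    // Tier 1: the IP counter is always incremented and compared
    // against globalMax when one is configured.
    const ipCount = increment(`ip:${ip}`);
    if (config.globalMax !== undefined && ipCount > config.globalMax) {
      return { isLimited: true, tier: "ip" };
    }

    // Tier 2: the identifier counter is compared against max.
    if (identifier !== undefined) {
      const idCount = increment(`id:${identifier}`);
      if (idCount > config.max) {
        return { isLimited: true, tier: "identifier" };
      }
    }

    return { isLimited: false, tier: null };
  };
};

const checkLogin = createTwoTierCounter({ max: 10, globalMax: 50 });
const ip = "203.0.113.7"; // documentation-only example address

// Eleven attempts for the same account: the last one trips the identifier tier.
let lastAttempt = checkLogin(ip, "alice@example.com");
for (let i = 1; i < 11; i++) {
  lastAttempt = checkLogin(ip, "alice@example.com");
}

// A different account from the same IP is still allowed (IP count is 12 of 50).
const otherAccount = checkLogin(ip, "bob@example.com");
```

The identifier tier protects a single account from guessing attacks, while the IP tier bounds the total attempts one address can make across many accounts.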

Bypassing Rate Limits (Development Only)

Only for development/testing environments. Never use in production.
You can bypass rate limits in development by setting an environment variable:
.env
DANGEROUS_BYPASS_RATE_LIMITS=true
When enabled, all rate limit checks return success:
if (process.env.DANGEROUS_BYPASS_RATE_LIMITS === 'true') {
  return {
    isLimited: false,
    remaining: config.max,
    limit: config.max,
    reset,
  };
}
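
Because the flag disables every check, a deployment may want to fail fast if it is ever set in production. The guard below is a hypothetical sketch, not something Documenso ships; it assumes the conventional NODE_ENV variable identifies a production environment.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical startup guard: throws if the bypass flag is enabled while
// NODE_ENV (an assumed convention) is set to 'production'.
const assertRateLimitsEnforced = (env: Env): void => {
  const bypassEnabled = env["DANGEROUS_BYPASS_RATE_LIMITS"] === "true";

  if (bypassEnabled && env["NODE_ENV"] === "production") {
    throw new Error(
      "DANGEROUS_BYPASS_RATE_LIMITS must not be enabled in production",
    );
  }
};

// Typical use, once at application startup:
//   assertRateLimitsEnforced(process.env);
```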

Rate Limit Best Practices

  • Always check X-RateLimit-Remaining headers
  • Implement throttling before hitting limits
  • Use exponential backoff when rate limited
  • Monitor your usage patterns
  • Batch operations when possible
  • Cache responses appropriately
  • Use webhooks instead of polling
  • Only fetch data when needed
  • Implement retry logic with backoff
  • Queue requests during rate limit periods
  • Log rate limit incidents for analysis
  • Alert on repeated rate limiting
  • Use multiple API tokens for high-volume applications
  • Distribute requests across time windows
  • Implement request queuing systems
  • Consider upgrading to higher tiers (if available)
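
As one way to act on "distribute requests across time windows", the sketch below computes evenly spaced start offsets for a batch of requests so that no window receives more than the limit. spreadOffsetsMs is a hypothetical helper; its defaults of 100 requests per 60 seconds match the API limits listed earlier.

```typescript
// Returns the delay (in ms from now) at which each of `count` requests
// should be sent so that at most `max` of them fall into any one window.
const spreadOffsetsMs = (
  count: number,
  max = 100,
  windowMs = 60_000,
): number[] => {
  if (max <= 0 || windowMs <= 0) {
    throw new Error("max and windowMs must be positive");
  }

  // Rounding up keeps the rate at or below the limit when the
  // division is not exact.
  const spacing = Math.ceil(windowMs / max);
  return Array.from({ length: count }, (_, i) => i * spacing);
};

// 250 requests at 100 per minute: one every 600 ms, finishing after ~2.5 minutes.
const offsets = spreadOffsetsMs(250);
```

Each offset can be passed to setTimeout to schedule the corresponding request. Because network latency can shift arrival times across a bucket boundary, a lower max than the published limit leaves some headroom.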

Monitoring Rate Limits

Implement monitoring to track rate limit usage:
class RateLimitMonitor {
  constructor() {
    this.metrics = {
      totalRequests: 0,
      rateLimitedRequests: 0,
      lowestRemaining: Infinity
    };
  }
  
  recordRequest(response) {
    this.metrics.totalRequests++;
    
    if (response.status === 429) {
      this.metrics.rateLimitedRequests++;
    }
    
    const remaining = parseInt(response.headers.get('X-RateLimit-Remaining'));
    if (remaining < this.metrics.lowestRemaining) {
      this.metrics.lowestRemaining = remaining;
    }
  }
  
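  getMetrics() {
    const { totalRequests, rateLimitedRequests } = this.metrics;
    return {
      ...this.metrics,
      // Share of requests rejected with 429, as a string with two decimals.
      // Guard against dividing by zero before any request is recorded.
      rateLimitPercentage: totalRequests === 0
        ? '0.00'
        : (rateLimitedRequests / totalRequests * 100).toFixed(2)
    };
  }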
}

Next Steps

API Reference

Explore all available API endpoints

Webhooks

Use webhooks to reduce polling and API calls