A 429 error with the message “Concurrency limit reached for requests” means you’re sending too many concurrent requests to the Serverless Inference API.
Why this happens
Serverless Inference enforces concurrency limits to ensure fair usage and service stability. When the number of simultaneous requests from your account exceeds the allowed limit, additional requests are rejected with a 429 status code.
What you can do
- Reduce concurrent requests
  - Implement request queuing or throttling in your application
  - Use exponential backoff when retrying failed requests
- Increase your limits
  - Review your plan’s concurrency limits and upgrade if needed
Quotas & Rate Limits
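The throttling and backoff suggestions above can be combined in a few lines. The following is a minimal sketch, not an official client: `ConcurrencyLimitError`, `call_with_backoff`, and `run_all` are hypothetical names, and the defaults (4 concurrent requests, 5 retries, 0.5 s base delay) are placeholder assumptions rather than documented limits. It assumes your own request function raises `ConcurrencyLimitError` when the API returns a 429.

```python
import asyncio
import random


class ConcurrencyLimitError(Exception):
    """Hypothetical stand-in for a 429 'Concurrency limit reached' response."""


async def call_with_backoff(send, semaphore, max_retries=5,
                            base_delay=0.5, max_delay=8.0):
    """Await `send()` under a concurrency cap, retrying 429s with backoff."""
    for attempt in range(max_retries + 1):
        try:
            # Throttle: the semaphore caps how many requests are in flight.
            async with semaphore:
                return await send()
        except ConcurrencyLimitError:
            if attempt == max_retries:
                raise  # out of retries; let the caller handle it
        # Exponential backoff with full jitter. The semaphore is already
        # released here, so a sleeping task does not block other requests.
        cap = min(max_delay, base_delay * 2 ** attempt)
        await asyncio.sleep(random.uniform(0, cap))


async def run_all(jobs, max_concurrent=4, **backoff):
    """Run every job (a zero-argument coroutine function) with a shared cap.

    `max_concurrent` is an assumed value; set it at or below your plan's limit.
    """
    semaphore = asyncio.Semaphore(max_concurrent)
    tasks = [call_with_backoff(job, semaphore, **backoff) for job in jobs]
    return await asyncio.gather(*tasks)
```

To use it, wrap each request in a zero-argument coroutine function that raises `ConcurrencyLimitError` on a 429, then call `asyncio.run(run_all(jobs, max_concurrent=...))`. Results come back in the same order as `jobs`. The jitter spreads retries out so that many clients hitting the limit at once do not all retry at the same moment.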