Learn about pricing, limits, and other important usage information before using Serverless Inference.

Pricing

For detailed model pricing information, visit Serverless Inference pricing.

Purchase more credits

Serverless Inference credits come with Free, Pro, and Academic plans for a limited time. Enterprise availability may vary. When credits run out:
  • Free accounts must activate pay-as-you-go inference (under the Billing tab) or upgrade to a paid plan to continue using Serverless Inference
  • Pro plan users are billed for overages monthly, based on model-specific pricing
  • Enterprise accounts should contact their account executive

Account tiers and default usage caps

Each account tier has a default spending cap to help manage costs and prevent unexpected charges. W&B requires prepayment for paid Inference access. To adjust your cap, contact your account executive or support.
Account Tier | Default Cap | How to Change Limit
Free | $100/month | Upgrade to Pro or Enterprise
Pro | $6,000/month | Contact your account executive or support for manual review
Enterprise | $700,000/year | Contact your account executive or support for manual review

Concurrency limits

If you exceed the rate limit, the API returns a 429 response with the message "Concurrency limit reached for requests". To fix this error, reduce the number of concurrent requests. For detailed troubleshooting, see Concurrency limits. W&B applies rate limits per W&B project. For example, if a team has 3 projects, each project has its own rate limit quota.
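One way to reduce concurrent requests is to cap the number of worker threads on the client and to wait before retrying a request that returns 429. The following is a minimal sketch under stated assumptions, not an official W&B client: `send` is a hypothetical callable that you supply, which issues one request and returns the HTTP status code and response body. The function names, retry count, and delays are illustrative only.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple

# Hypothetical signature (an assumption for this sketch): send one request
# for a prompt and return (HTTP status code, response body).
SendFn = Callable[[str], Tuple[int, str]]


def call_with_backoff(
    send: SendFn,
    prompt: str,
    max_retries: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Retry while the response is 429, waiting longer after each attempt."""
    for attempt in range(max_retries + 1):
        status, body = send(prompt)
        if status != 429:
            # Any other status (success or a different error) goes back to the caller.
            return status, body
        if attempt < max_retries:
            # Exponential backoff with random jitter so retries do not align.
            sleep(base_delay * (2 ** attempt) * (1 + random.random()))
    raise RuntimeError("still receiving 429 after all retries")


def run_bounded(
    send: SendFn, prompts: List[str], max_concurrency: int = 2
) -> List[Tuple[int, str]]:
    """Process prompts with at most max_concurrency requests in flight."""
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        # pool.map returns results in the same order as the input prompts.
        return list(pool.map(lambda p: call_with_backoff(send, p), prompts))
```

Because limits apply per project, a suitable `max_concurrency` depends on how many other clients share the same project. Start low and raise it only while 429 responses stay rare.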

Geographic restrictions

The Inference service is only available from supported geographic locations. For more information, see the Terms of Service.

Next steps