
The API for AI phone calls
Independent AI evaluations lab
Make AI phone calls from a single API call.
We build independent, contamination-proof benchmarks that measure real-world performance. LLM Stats is the most complete LLM leaderboard. We maintain the most complete archive of LLM benchmark results and run our own independent evaluations, which go beyond the classic benchmarks already present in the training data of most models. Our mission: become the biggest community dedicated to AI transparency.
CallingBox pivoted completely from an API service for making AI phone calls to an independent AI evaluation lab that builds benchmarks and runs LLM performance tests: an entirely different product, market, and problem space.
Auto-optimizer for AI agents