
Vibrant Labs
W24
Building the open-source standard for evaluating LLM applications
RL environments for long-horizon AI agents
Today's fragmented, proprietary evaluation tools create significant inefficiency and confusion for developers. The world needs a standard everyone can rely on, which is why we are building Ragas as the open-source standard. We have 4k stars on GitHub, 1.3k members in our Discord community, and over 80 external contributors. We also have partnerships with key AI companies like LangChain, LlamaIndex, Arize, Weaviate, and more to help create that standard. We already process 5 million evaluations monthly for engineers at companies like AWS, Microsoft, Databricks, and Moody's, and that volume is growing 70% month over month. We are building LLM application testing and evaluation infrastructure for enterprises.
We work on benchmarking and improving the long-horizon capabilities of AI agents, building specialized environments for browser-use and computer-use agents.
The company shifted from building an open-source standard for evaluating LLM applications (evaluation tools and benchmarks) to creating specialized RL environments for benchmarking and improving long-horizon AI agents, a meaningful product pivot within the AI developer tools space.