AI Agent Persona

Relari (YC W24)

Relari (YC W24) is a testing and simulation stack for GenAI systems, offering open-source evaluation and synthetic data tools for LLM pipelines.

Science & Research, Businesses, Open source

www.relari.ai

Agent framework

By Relari

Relari is a San Francisco-based startup (YC W24, founded 2023) that provides a testing and simulation platform for teams building LLM-powered applications. It was founded by Pasquale Antonante (PhD from MIT, background in fault detection in complex AI systems) and Yi Zhang, drawing inspiration from simulation pipelines used in the autonomous vehicle industry to ensure safety and performance.

The company is the maintainer of continuous-eval, an open-source modular evaluation framework available on GitHub. It covers text generation, code generation, retrieval-augmented generation (RAG), agent tool use, and classification use cases. The framework allows teams to mix deterministic, semantic, and LLM-based metrics, measuring each component of a pipeline independently rather than treating the system as a black box.

Beyond the open-source library, Relari's cloud platform generates custom synthetic datasets tailored to a specific use case, enabling teams to stress-test their AI pipeline at scale. Its evaluators are designed to be close-to-human, with claimed 90%+ alignment with human evaluators, making automated testing more trustworthy as a proxy for real-world quality.

Relari also maintains a separate open-source framework called Agent Contracts, a contract-driven development toolkit that uses natural language to define, verify, and certify AI agent behavior across critical scenarios. This positions Relari in the agent reliability and governance space as agentic AI systems become more prevalent.

As of 2026, Relari remains an active, small team (approximately 3 people) with reported revenue of $330K. It has not been acquired or shut down and continues to develop both its open-source tools and cloud platform.

Key features:
- Open-source continuous-eval framework with 30+ metrics for RAG, code gen, classification, and agent tool use
- Synthetic dataset generation tailored to specific GenAI use cases for large-scale stress testing
- LLM-based evaluators with claimed 90%+ alignment with human evaluators
- Modular pipeline evaluation: each stage measured independently with tailored metrics
- Mix of deterministic, semantic, and LLM-based metrics in a single evaluation run
- Agent Contracts framework for defining, verifying, and certifying AI agent behavior in natural language
- Root-cause analysis tooling to pinpoint which pipeline component causes failures
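
To make the idea of measuring each pipeline stage independently more concrete, here is a minimal sketch in plain Python. It does not use the continuous-eval API. The function names, sample fields, and the two deterministic metrics (set-overlap precision/recall for a retriever, token-overlap F1 for a generator) are illustrative assumptions, not Relari's implementation.

```python
from collections import Counter


def retrieval_precision_recall(retrieved, relevant):
    """Deterministic retriever metric: overlap between retrieved and relevant doc IDs."""
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return {"precision": precision, "recall": recall}


def token_f1(answer, reference):
    """Deterministic generator metric: F1 over lowercased whitespace tokens."""
    answer_tokens = answer.lower().split()
    reference_tokens = reference.lower().split()
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(answer_tokens) & Counter(reference_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(answer_tokens)
    recall = overlap / len(reference_tokens)
    return 2 * precision * recall / (precision + recall)


def evaluate_pipeline(sample):
    """Score each stage separately so a failure can be traced to one component."""
    return {
        "retriever": retrieval_precision_recall(
            sample["retrieved_ids"], sample["relevant_ids"]
        ),
        "generator": {
            "token_f1": token_f1(sample["answer"], sample["reference_answer"])
        },
    }


# Hypothetical sample: one RAG query with ground-truth labels for both stages.
sample = {
    "retrieved_ids": ["doc1", "doc4", "doc7"],
    "relevant_ids": ["doc1", "doc2"],
    "answer": "Paris is the capital of France",
    "reference_answer": "The capital of France is Paris",
}

report = evaluate_pipeline(sample)
print(report)
```

In this made-up sample the generator scores well while the retriever's precision is low, which is the kind of per-stage signal that root-cause analysis relies on. A semantic or LLM-based metric could be added as another entry in the per-stage dictionary without changing the overall structure.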

Relari AI Agent Pricing

Pricing information for the AI agent from Relari is not available.

Relari (YC W24) by Relari | AI Agents | o-mega