Skip to main content

Why Merit exists

You can assess AI agent and workflow quality using two major approaches.
  • One is to treat them as AI models and measure quality using metrics and datasets.
  • The other is to treat them as software and write automated tests case-by-case.
Merit was built for developers who want both worlds in one framework.

Why developers choose Merit

Merit vs evals

  • native Python syntax instead of custom DSL
  • small granular predicates instead of bloated evaluators
  • cases, assertions, metrics are defined in code, not abstracted away
  • fully composable, no need for going all-in from the beginning

Merit vs pytest

  • access OTEL traces for assertions within the test
  • collect failed assertions into metrics
  • use semantic predicates to assert natural language
  • aggregate results for thousands of iterated cases

Get started

Quick Start

Write your first merit in 5 minutes

Merit Functions

Discovery, parametrization, and dependency injection

Semantic Predicates

LLM-powered assertions for natural language