Why Merit exists
You can assess AI agent and workflow quality using two major approaches.
- One is to treat them as AI models and measure quality using metrics and datasets.
- The other is to treat them as software and write automated tests case-by-case.
Why developers choose Merit
Merit vs evals
- native Python syntax instead of a custom DSL
- small granular predicates instead of bloated evaluators
- cases, assertions, and metrics are defined in code, not abstracted away
- fully composable, so there is no need to go all-in from the beginning
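To make "small granular predicates" and "cases defined in code" concrete, here is a minimal sketch in plain Python. It does not use Merit's API: every name in it (`Case`, `contains_keyword`, `within_length`, `fake_agent`, `run_cases`) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical names throughout; this is plain Python, not Merit's API.


@dataclass
class Case:
    """One test case: an input prompt and a keyword the answer should mention."""
    prompt: str
    expected_keyword: str


def contains_keyword(output: str, keyword: str) -> bool:
    """Small predicate: is the keyword present, ignoring case?"""
    return keyword.lower() in output.lower()


def within_length(output: str, max_chars: int) -> bool:
    """Small predicate: is the output at most max_chars characters long?"""
    return len(output) <= max_chars


def fake_agent(prompt: str) -> str:
    """Stand-in for a real agent call, so the sketch runs offline."""
    return f"Echo: {prompt}"


# Cases live in ordinary code, next to the predicates that check them.
CASES = [
    Case(prompt="refund policy", expected_keyword="refund"),
    Case(prompt="shipping times", expected_keyword="shipping"),
]


def run_cases(agent: Callable[[str], str], cases: List[Case]) -> List[bool]:
    """Run each case and combine the predicates with ordinary boolean logic."""
    results = []
    for case in cases:
        output = agent(case.prompt)
        passed = contains_keyword(output, case.expected_keyword) and within_length(output, 200)
        results.append(passed)
    return results


results = run_cases(fake_agent, CASES)
```

Because each predicate is an ordinary function, it can be reused on its own, combined with others, or dropped into an existing test suite one check at a time.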
Merit vs pytest
- access OTEL traces for assertions within the test
- collect failed assertions into metrics
- use semantic predicates to make assertions about natural-language output
- aggregate results for thousands of iterated cases
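The ideas of collecting failed assertions into metrics and aggregating over many cases can be sketched in plain Python as follows. This is an illustration of the concept only, not Merit's API: `check_all`, `aggregate`, `CHECKS`, and `OUTPUTS` are hypothetical names, and the checks are deliberately trivial.

```python
from collections import Counter
from typing import Callable, Dict, List

# Hypothetical names throughout; this is plain Python, not Merit's API.


def check_all(output: str, checks: Dict[str, Callable[[str], bool]]) -> List[str]:
    """Run every named check and return the names of the ones that failed.

    Unlike a bare `assert`, this does not stop at the first failure.
    """
    return [name for name, check in checks.items() if not check(output)]


def aggregate(outputs: List[str], checks: Dict[str, Callable[[str], bool]]) -> dict:
    """Summarize results across all outputs: pass rate and failures per check."""
    failures = Counter()
    passed = 0
    for output in outputs:
        failed = check_all(output, checks)
        failures.update(failed)
        if not failed:
            passed += 1
    total = len(outputs)
    return {
        "total": total,
        "pass_rate": passed / total if total else 0.0,
        "failures_by_check": dict(failures),
    }


CHECKS = {
    "non_empty": lambda text: bool(text.strip()),
    "under_50_chars": lambda text: len(text) <= 50,
}

# Stand-ins for agent outputs collected over many iterated cases.
OUTPUTS = ["ok", "", "x" * 80, "fine answer"]

summary = aggregate(OUTPUTS, CHECKS)
```

The summary reports how often each check fails across the whole run, which is the kind of signal a single pass/fail test result does not provide.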