What are AI Predicates?

AI Predicates are special async assertions that use LLMs to evaluate complex criteria that traditional assertions can’t handle:
  • Semantic fact checking - Not just string matching
  • Hallucination detection - Find unsupported claims
  • Style and structure matching - Beyond regex
  • Policy compliance - Natural language rules

Quick Example

from merit.predicates import has_facts, has_unsupported_facts

async def merit_chatbot_accuracy():
    response = chatbot("Tell me about Paris")
    
    # Check facts are present (semantic matching)
    assert await has_facts(response, "Paris is the capital of France")
    
    # Check no hallucinations
    assert not await has_unsupported_facts(
        response,
        "Paris is the capital of France. The Eiffel Tower is located in Paris."
    )

Available Predicates

  • Fact Checking - e.g. has_facts, has_unsupported_facts
  • Topics & Policy - e.g. has_topics
  • Style & Structure - e.g. matches_writing_style
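
The predicates shown on this page can be imported together; this sketch mirrors the imports used in the examples below (the full catalog may include more):
from merit.predicates import (
    has_facts,
    has_unsupported_facts,
    has_topics,
    matches_writing_style,
)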

Setup

AI Predicates call the Merit cloud service for evaluation. You need to configure:
  1. Merit API credentials (for predicate evaluation)
  2. LLM provider credentials (used by the Merit service)

Merit API Configuration

Set these environment variables:
.env
# Required
MERIT_API_BASE_URL=https://api.appmerit.com
MERIT_API_KEY=your_merit_api_key_here

# Optional - for debugging
MERIT_DEBUGGING_MODE=false
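
To catch missing credentials early, you can verify the variables before running tests. This is a minimal sketch that assumes only the two required variable names above:
import os

# Fail fast if the required Merit credentials are not set
for var in ("MERIT_API_BASE_URL", "MERIT_API_KEY"):
    if not os.environ.get(var):
        raise RuntimeError(f"Missing required environment variable: {var}")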

Get API Access

Contact us to get your Merit API key. Merit automatically loads .env files from your project directory.
How it works: You call Merit’s API with your API key. The Merit service handles all LLM provider interactions internally - you don’t need your own OpenAI/Anthropic keys for predicates.

For Error Analyzer (Different Setup)

If you’re using Merit Analyzer (not predicates), see the Analyzer documentation - it uses your own LLM credentials (bring your own key).

How It Works

When you call an AI predicate:
  1. Request - Merit sends the actual and reference text to the Merit API
  2. Evaluation - An LLM evaluates based on the predicate type
  3. Response - Returns a boolean result with confidence score
  4. Assertion - Use in standard Python assert statements
async def merit_example():
    # Makes API call to evaluate
    result = await has_facts(actual, reference)
    
    # Returns boolean for assertion
    assert result  # True or False

Predicate Results

Predicates return boolean values for assertions, but you can access detailed results:
from merit.predicates import has_facts

async def merit_detailed_results():
    response = chatbot("Tell me about Paris")
    
    # Get detailed result
    result = await has_facts(response, "Paris is the capital of France")
    
    # Access metadata (when using predicates directly)
    print(f"Passed: {result.passed}")
    print(f"Confidence: {result.confidence}")
    print(f"Reasoning: {result.reasoning}")
In an assert statement, the result evaluates as a simple boolean. Keep a reference to the returned PredicateResult object, as above, when you need the metadata.

Async Context

All AI predicates are async and must be called with await:
# ✅ Correct - async test with await
async def merit_correct():
    assert await has_facts(text, reference)

# ❌ Wrong - missing async/await
def merit_wrong():
    assert has_facts(text, reference)  # Asserts the coroutine object itself, which is always truthy - the check never runs

Cost Considerations

AI predicates make LLM API calls:
  • ~1-2 cents per assertion (depending on provider and model)
  • Use strategically for complex checks
  • Regular assertions are free and instant
Example strategy:
async def merit_chatbot_comprehensive():
    response = chatbot("Tell me about Paris")
    
    # Free, fast assertions first
    assert len(response) > 10
    assert "Paris" in response
    assert not response.isupper()
    
    # AI assertions for complex checks
    assert await has_facts(response, "Paris is the capital of France")

Combining Predicates

Use multiple predicates together:
async def merit_comprehensive_check():
    response = generate_article("Paris travel guide")
    
    # Must contain key facts
    assert await has_facts(response, "Paris is the capital of France")
    
    # Must not hallucinate
    assert not await has_unsupported_facts(response, source_text)
    
    # Must cover required topics
    assert await has_topics(response, "accommodation, transportation, attractions")
    
    # Must follow style guide
    assert await matches_writing_style(response, style_guide_example)
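
Since each predicate is an independent API call, the checks above can also run concurrently. A minimal sketch using asyncio.gather (assuming source_text and style_guide_example are defined as in the example above):
import asyncio

async def merit_concurrent_check():
    response = generate_article("Paris travel guide")
    
    # Start all four evaluations at once instead of awaiting them one at a time
    facts, unsupported, topics, style = await asyncio.gather(
        has_facts(response, "Paris is the capital of France"),
        has_unsupported_facts(response, source_text),
        has_topics(response, "accommodation, transportation, attractions"),
        matches_writing_style(response, style_guide_example),
    )
    
    assert facts
    assert not unsupported
    assert topics
    assert style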

Error Handling

TODO: Document error handling for API failures. Expected behavior:
async def merit_with_error_handling():
    try:
        assert await has_facts(response, reference)
    except APIError as e:  # APIError's import path is not yet documented (see TODO)
        # Handle API failures (retry, skip, or fail the test explicitly)
        pass

Performance Tips

  1. Cache results - Don’t re-evaluate the same text (see the sketch after this list)
  2. Batch tests - Run multiple tests together
  3. Use strict mode wisely - Lenient mode is faster
  4. Regular assertions first - Filter obvious failures cheaply
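
Because predicates take plain strings, caching can be as simple as memoizing results by input text. A minimal sketch (cached_has_facts is a hypothetical helper, not part of the Merit API):
from merit.predicates import has_facts

# Hypothetical helper, not part of Merit: memoize predicate results
# so identical (actual, reference) pairs are only evaluated once.
_cache = {}

async def cached_has_facts(actual, reference):
    key = (actual, reference)
    if key not in _cache:
        _cache[key] = await has_facts(actual, reference)
    return _cache[key]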

Next Steps