
Setup Function

init_tracing

Initialize OpenTelemetry tracing with streaming file export. Signature:
def init_tracing(
    *,
    service_name: str = "merit",
    trace_content: bool | None = None,
    output_path: Path | str = ".merit/traces.jsonl",
) -> None
Parameters:
| Name | Type | Default | Description |
| --- | --- | --- | --- |
| service_name | str | "merit" | Service name in trace metadata |
| trace_content | bool \| None | None | Content capture toggle. Note: today Merit uses MERIT_TRACE_CONTENT to control SUT span input/output attributes; trace_content is not currently applied to OpenAI/Anthropic client instrumentation. |
| output_path | Path \| str | ".merit/traces.jsonl" | File path for trace export |
Returns: None
Side Effects:
  • Sets up OpenTelemetry tracer provider
  • Instruments OpenAI and Anthropic clients
  • Creates/truncates output file
Important: Must be called before instantiating LLM clients to ensure instrumentation captures all calls.
Example:
from merit import init_tracing

# Basic setup
init_tracing()

# Custom configuration
init_tracing(
    service_name="my-ai-system",
    output_path="traces/run_001.jsonl"
)

# Now create LLM clients - they'll be automatically traced
from openai import OpenAI
client = OpenAI()  # All calls traced

from anthropic import Anthropic
claude = Anthropic()  # All calls traced
Environment Variables:
  • MERIT_TRACE_CONTENT: Set to "false" to avoid recording SUT input/output content (sut.input.*, sut.output). This does not currently guarantee LLM client instrumentation content is redacted.

Context Manager

trace_step

Create a custom span for tracing application logic. Signature:
@contextmanager
def trace_step(
    name: str,
    attributes: dict[str, Any] | None = None
) -> Iterator[Span]
Parameters:
| Name | Type | Default | Description |
| --- | --- | --- | --- |
| name | str | - | Name of the span |
| attributes | dict[str, Any] \| None | None | Optional attributes to attach to span |
Yields: Span - OpenTelemetry span object
Example:
from merit import trace_step

async def merit_agent_pipeline(agent):
    with trace_step("retrieve_context", {"query": "user question"}):
        context = agent.retrieve("user question")

    with trace_step("generate_response", {"context_length": len(context)}) as span:
        response = await agent.generate(context)
        span.set_attribute("response_length", len(response))

    assert response
Nested Spans:
from merit import trace_step

def complex_pipeline():
    with trace_step("pipeline"):
        with trace_step("stage_1"):
            result_1 = process_stage_1()

        with trace_step("stage_2"):
            result_2 = process_stage_2(result_1)

        with trace_step("stage_3"):
            return process_stage_3(result_2)

Classes

TraceContext

Provides access to trace data for the current test execution.
Injection: TraceContext is automatically injected when a merit function declares trace_context as a parameter. It enables querying child spans, LLM calls, and setting custom attributes on the test span.
Properties:
| Name | Type | Description |
| --- | --- | --- |
| trace_id | str | The trace ID for this test's span (32 hex characters) |
| span_id | str | The span ID for this test's span (16 hex characters) |
| is_enabled | bool | Whether tracing is currently enabled |
Methods:
| Method | Returns | Description |
| --- | --- | --- |
| get_child_spans() | list[ReadableSpan] | Get all spans created during this test's execution |
| get_llm_calls() | list[ReadableSpan] | Get spans from LLM API calls (OpenAI, Anthropic) |
| get_sut_spans(name=None) | list[ReadableSpan] | Get spans from @merit.sut decorated functions, optionally filtered by name |
| set_attribute(key, value) | None | Set a custom attribute on the test span |
Example:
import merit

@merit.sut
def my_agent(prompt: str) -> str:
    with merit.trace_step("retrieve"):
        docs = retrieve_docs(prompt)

    with merit.trace_step("generate"):
        return generate_response(docs)

def merit_agent_workflow(my_agent, trace_context):
    """Use trace_context to assert on execution flow."""
    result = my_agent("What is Python?")

    # Set custom attributes on test span
    trace_context.set_attribute("response.length", len(result))

    # Get all child spans
    all_spans = trace_context.get_child_spans()
    assert len(all_spans) >= 2, "Expected at least 2 trace steps"

    # Get SUT-specific spans
    sut_spans = trace_context.get_sut_spans(name="my_agent")
    assert len(sut_spans) == 1
    assert sut_spans[0].name == "sut.my_agent"

    # Get LLM calls (if any)
    llm_calls = trace_context.get_llm_calls()
    for call in llm_calls:
        model = call.attributes.get("llm.model")
        print(f"LLM call used model: {model}")

    # Check trace IDs for correlation
    print(f"Trace ID: {trace_context.trace_id}")
    print(f"Span ID: {trace_context.span_id}")
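Spans returned by get_llm_calls() can also be aggregated, for example to total token usage across a test. A minimal sketch: total_tokens is a hypothetical helper (not part of Merit), and it assumes the instrumentor records usage under the OpenTelemetry GenAI semantic-convention keys gen_ai.usage.input_tokens / gen_ai.usage.output_tokens; verify the actual keys against a span in your own trace file first.

```python
from typing import Iterable, Mapping

def total_tokens(attr_dicts: Iterable[Mapping]) -> dict:
    """Sum token usage over LLM span attribute dicts.

    Assumes OpenTelemetry GenAI semantic-convention attribute names;
    missing keys count as zero.
    """
    totals = {"input": 0, "output": 0}
    for attrs in attr_dicts:
        totals["input"] += int(attrs.get("gen_ai.usage.input_tokens", 0) or 0)
        totals["output"] += int(attrs.get("gen_ai.usage.output_tokens", 0) or 0)
    return totals

# In a merit test you might call it as:
#   usage = total_tokens(s.attributes or {} for s in trace_context.get_llm_calls())
```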
Filtering SUT Spans:
def merit_multiple_suts(agent_a, agent_b, trace_context):
    """Filter SUT spans by name when multiple SUTs are used."""
    result_a = agent_a("query 1")
    result_b = agent_b("query 2")

    # Get spans for specific SUT
    agent_a_spans = trace_context.get_sut_spans(name="agent_a")
    agent_b_spans = trace_context.get_sut_spans(name="agent_b")

    assert len(agent_a_spans) == 1
    assert len(agent_b_spans) == 1
Conditional Logic Based on Tracing:
def merit_tracing_enabled_only(my_agent, trace_context):
    """This test requires tracing (run with `merit test --trace`)."""
    result = my_agent("test query")

    spans = trace_context.get_child_spans()
    assert len(spans) > 0, "Expected trace spans"
    assert result is not None

Utility Functions

get_tracer

Get an OpenTelemetry tracer instance for creating custom spans. Signature:
def get_tracer(name: str = "merit") -> Tracer
Parameters:
| Name | Type | Default | Description |
| --- | --- | --- | --- |
| name | str | "merit" | Tracer name |
Returns: Tracer - OpenTelemetry tracer instance
Example:
from merit.tracing import get_tracer

tracer = get_tracer("my-component")

def custom_function():
    with tracer.start_as_current_span("custom_operation") as span:
        span.set_attribute("custom_key", "custom_value")
        # Your code here
        result = do_work()
        span.set_attribute("result_size", len(result))
        return result

clear_traces

Clear the trace output file. Signature:
def clear_traces() -> None
Parameters: None
Returns: None
Example:
from merit.tracing import clear_traces, init_tracing

# Setup tracing
init_tracing(output_path=".merit/traces.jsonl")

# Run some tests...
merit_run_1()

# Clear traces before next run
clear_traces()

# Run more tests with fresh trace file
merit_run_2()

set_trace_output_path

Change the trace output path for the current exporter. Signature:
def set_trace_output_path(output_path: Path | str) -> None
Parameters:
| Name | Type | Description |
| --- | --- | --- |
| output_path | Path \| str | New file path for trace export |
Returns: None
Example:
from merit.tracing import set_trace_output_path, init_tracing

# Initial setup
init_tracing(output_path=".merit/traces.jsonl")

# Change output path mid-run
set_trace_output_path("traces/experiment_2.jsonl")

get_span_collector

Get the current span collector instance for accessing collected spans. Signature:
def get_span_collector() -> InMemorySpanCollector | None
Parameters: None
Returns: InMemorySpanCollector | None - The active span collector, or None if tracing is not enabled
Example:
from merit.tracing import get_span_collector

def merit_advanced_tracing(my_agent, trace_context):
    """Access raw OpenTelemetry spans for the current test (requires `--trace`)."""
    result = my_agent("test query")

    collector = get_span_collector()
    if collector:
        spans = collector.get_spans(trace_context.trace_id)
        print(f"Total spans in this test trace: {len(spans)}")

    assert result
Note: Most tests should use trace_context parameter instead, which provides a cleaner API scoped to the current test. get_span_collector() is useful for advanced scenarios requiring access to all spans.

InMemorySpanCollector

Internal class that collects and stores OpenTelemetry spans during test execution.
Purpose: This is an advanced/internal API used by Merit's tracing system. Most users should use TraceContext instead.
Methods:
| Method | Returns | Description |
| --- | --- | --- |
| get_spans(trace_id) | list[ReadableSpan] | Get all spans for a specific trace ID |
| clear(trace_id) | None | Clear spans for a specific trace ID |
| clear_all() | None | Clear all collected spans |
Example:
from merit.tracing import get_span_collector

def merit_inspect_current_trace(trace_context):
    """Inspect raw spans for the current test (requires `--trace`)."""
    collector = get_span_collector()
    if not collector:
        return

    spans = collector.get_spans(trace_context.trace_id)
    llm_spans = [s for s in spans if s.name.startswith(("openai.", "anthropic.", "gen_ai."))]
    print(f"Total spans: {len(spans)}")
    print(f"Total LLM spans: {len(llm_spans)}")
When to use:
  • Custom test runners or frameworks built on Merit
  • Advanced trace analysis across multiple tests
  • Performance profiling and debugging
When NOT to use:
  • Regular test assertions (use trace_context parameter)
  • Single-test trace inspection (use trace_context.get_child_spans())

Automatic Tracing

LLM Client Instrumentation

When init_tracing() is called, Merit automatically instruments:
  • OpenAI - openai package
  • Anthropic - anthropic package
All LLM calls are captured with:
  • Request parameters (model, temperature, messages, etc.)
  • Response/content details depend on the underlying OpenTelemetry instrumentor configuration (Merit does not currently toggle this via trace_content)
  • Timing information
  • Token usage
  • Error details
Example:
from merit import init_tracing
from openai import OpenAI

# Enable tracing
init_tracing()

# Create client - automatically instrumented
client = OpenAI()

# This call is automatically traced
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello"}]
)

# Trace includes:
# - Model name: "gpt-4"
# - Messages: [{"role": "user", "content": "Hello"}]
# - Response content: "Hi there!"
# - Tokens used
# - Latency

SUT Tracing

Functions and classes decorated with @sut are automatically traced:
from merit import sut

@sut
async def my_agent(prompt: str) -> str:
    # This entire function execution is traced as "sut.my_agent"
    return await llm.generate(prompt)

@sut
class RAGSystem:
    def __call__(self, query: str) -> str:
        # Traced as "sut.rag_system"
        context = self.retrieve(query)
        return self.generate(context)

async def merit_test(my_agent, rag_system):
    # Both calls create spans with input/output
    result1 = await my_agent("Hello")
    result2 = rag_system("Question")
Captured Information:
  • Input arguments (args and kwargs)
  • Output values
  • Execution time
  • Nested LLM calls (as child spans)
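The captured input/output can be read back from a SUT span's attributes. A sketch, using the Merit-added attribute keys listed under "Trace File Format" (sut.input.args, sut.input.kwargs, sut.output); summarize_sut_span is a hypothetical helper, and how the values are serialized may vary by Merit version, so treat this as illustrative.

```python
def summarize_sut_span(attributes: dict) -> dict:
    """Pull the Merit-added input/output attributes out of a SUT span.

    Keys follow the "Trace File Format" section; absent keys (e.g. when
    MERIT_TRACE_CONTENT=false) come back as None.
    """
    return {
        "args": attributes.get("sut.input.args"),
        "kwargs": attributes.get("sut.input.kwargs"),
        "output": attributes.get("sut.output"),
    }

# e.g. in a test:
#   for span in trace_context.get_sut_spans():
#       print(summarize_sut_span(dict(span.attributes or {})))
```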

Usage Patterns

Basic Setup

from merit import init_tracing, sut

# Initialize tracing
init_tracing(output_path=".merit/traces.jsonl")

@sut
async def chatbot(prompt: str) -> str:
    from openai import OpenAI
    client = OpenAI()
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content

async def merit_chatbot(chatbot):
    response = await chatbot("Hello")
    assert "hello" in response.lower()
Trace Structure:
merit_chatbot
└── sut.chatbot
    └── openai.chat.completions.create

Custom Steps

from merit import init_tracing, trace_step, sut

init_tracing()

@sut
class RAGPipeline:
    def __call__(self, query: str) -> str:
        with trace_step("retrieve", {"query": query}):
            docs = self.retrieve(query)

        with trace_step("rerank", {"doc_count": len(docs)}):
            top_docs = self.rerank(docs, query)

        with trace_step("generate"):
            return self.generate(query, top_docs)

def merit_rag(rag_pipeline):
    result = rag_pipeline("What is Python?")
    assert result
Trace Structure:
merit_rag
└── sut.rag_pipeline
    ├── retrieve
    ├── rerank
    └── generate
        └── openai.chat.completions.create

Debugging with Traces

from merit import init_tracing, trace_step
import json

# Enable tracing
init_tracing(output_path="debug_traces.jsonl")

# Run merit
# ... merit functions execute ...

# Analyze traces
with open("debug_traces.jsonl") as f:
    traces = [json.loads(line) for line in f]

for trace in traces:
    # Each line is an OpenTelemetry span serialized via `ReadableSpan.to_json()`.
    name = trace.get("name", "<unknown>")
    attributes = trace.get("attributes") or {}

    print(name)
    if isinstance(attributes, dict) and "llm.model" in attributes:
        print(f"  Model: {attributes['llm.model']}")

Privacy Controls

from merit import init_tracing
import os

# Disable content capture for sensitive data
os.environ["MERIT_TRACE_CONTENT"] = "false"

init_tracing()

# Now traces capture:
# - Timing information
# - Model names
# - Token counts
# - Parameter counts
# But NOT:
# - SUT input/output values (`sut.input.*`, `sut.output`)
#
# Note: Merit does not currently guarantee LLM client instrumentation content is redacted.

CI/CD Integration

from merit import init_tracing
import os

# Trace to different files per run
run_id = os.environ.get("CI_RUN_ID", "local")
init_tracing(
    service_name=f"merit-{run_id}",
    output_path=f"traces/run_{run_id}.jsonl"
)

# Run merit suite
# ...

# Upload traces to analysis platform
# upload_traces(f"traces/run_{run_id}.jsonl")

Trace File Format

Traces are exported as JSONL (JSON Lines). Each line is a complete OpenTelemetry span serialized via ReadableSpan.to_json(). Because the exact shape can vary by OpenTelemetry version and installed instrumentations, inspect a line directly in your trace file. Merit-added attributes to look for (may be absent when MERIT_TRACE_CONTENT=false):
  • merit.sut / merit.sut.name
  • sut.input.args / sut.input.kwargs / sut.input.count
  • sut.output / sut.output.type
{
  "name": "sut.chatbot",
  "attributes": {
    "...": "span attributes (varies by instrumentation)"
  }
}
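Given that each line is one serialized span, SUT spans can be picked out of a trace file with a few lines of Python. A sketch, assuming only that each non-empty line parses as JSON with a top-level "name" field; load_sut_spans is a hypothetical helper, and the rest of the span layout varies by OpenTelemetry SDK version, so print one line from your own file to confirm its shape.

```python
import json

def load_sut_spans(lines):
    """Parse JSONL span lines and keep spans whose name starts with 'sut.'."""
    spans = [json.loads(line) for line in lines if line.strip()]
    return [s for s in spans if s.get("name", "").startswith("sut.")]

# with open(".merit/traces.jsonl") as f:
#     for span in load_sut_spans(f):
#         print(span["name"], (span.get("attributes") or {}).get("merit.sut.name"))
```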

CLI Integration

The Merit CLI automatically enables tracing when the --trace flag is used:
# Enable tracing
merit test --trace

# Custom output path
merit test --trace --trace-output my_traces.jsonl

# Disable content (only metadata)
MERIT_TRACE_CONTENT=false merit test --trace