Setup Function
init_tracing
Initialize OpenTelemetry tracing with streaming file export.
Signature:
def init_tracing(
    *,
    service_name: str = "merit",
    trace_content: bool | None = None,
    output_path: Path | str = ".merit/traces.jsonl",
) -> None
Parameters:
| Name | Type | Default | Description |
|---|---|---|---|
| service_name | str | "merit" | Service name in trace metadata |
| trace_content | bool \| None | None | Content capture toggle. Note: today Merit uses MERIT_TRACE_CONTENT to control SUT span input/output attributes; trace_content is not currently applied to OpenAI/Anthropic client instrumentation. |
| output_path | Path \| str | ".merit/traces.jsonl" | File path for trace export |
Returns: None
Side Effects:
- Sets up OpenTelemetry tracer provider
- Instruments OpenAI and Anthropic clients
- Creates/truncates output file
Important: Must be called before instantiating LLM clients to ensure instrumentation captures all calls.
Example:
from merit import init_tracing
# Basic setup
init_tracing()
# Custom configuration
init_tracing(
    service_name="my-ai-system",
    output_path="traces/run_001.jsonl"
)
# Now create LLM clients - they'll be automatically traced
from openai import OpenAI
client = OpenAI() # All calls traced
from anthropic import Anthropic
claude = Anthropic() # All calls traced
Environment Variables:
MERIT_TRACE_CONTENT: Set to "false" to avoid recording SUT input/output content (sut.input.*, sut.output). This does not currently guarantee LLM client instrumentation content is redacted.
Context Manager
trace_step
Create a custom span for tracing application logic.
Signature:
@contextmanager
def trace_step(
    name: str,
    attributes: dict[str, Any] | None = None
) -> Iterator[Span]
Parameters:
| Name | Type | Default | Description |
|---|---|---|---|
| name | str | - | Name of the span |
| attributes | dict[str, Any] \| None | None | Optional attributes to attach to span |
Yields: Span - OpenTelemetry span object
Example:
from merit import trace_step
async def merit_agent_pipeline(agent):
    with trace_step("retrieve_context", {"query": "user question"}):
        context = agent.retrieve("user question")
    with trace_step("generate_response", {"context_length": len(context)}) as span:
        response = await agent.generate(context)
        span.set_attribute("response_length", len(response))
    assert response
Nested Spans:
from merit import trace_step
def complex_pipeline():
    with trace_step("pipeline"):
        with trace_step("stage_1"):
            result_1 = process_stage_1()
        with trace_step("stage_2"):
            result_2 = process_stage_2(result_1)
        with trace_step("stage_3"):
            return process_stage_3(result_2)
Classes
TraceContext
Provides access to trace data for the current test execution.
Injection:
TraceContext is automatically injected when a merit function declares trace_context as a parameter. It enables querying child spans, LLM calls, and setting custom attributes on the test span.
Properties:
| Name | Type | Description |
|---|---|---|
| trace_id | str | The trace ID for this test’s span (32 hex characters) |
| span_id | str | The span ID for this test’s span (16 hex characters) |
| is_enabled | bool | Whether tracing is currently enabled |
Methods:
| Method | Returns | Description |
|---|---|---|
| get_child_spans() | list[ReadableSpan] | Get all spans created during this test’s execution |
| get_llm_calls() | list[ReadableSpan] | Get spans from LLM API calls (OpenAI, Anthropic) |
| get_sut_spans(name=None) | list[ReadableSpan] | Get spans from @merit.sut decorated functions, optionally filtered by name |
| set_attribute(key, value) | None | Set a custom attribute on the test span |
Example:
import merit
@merit.sut
def my_agent(prompt: str) -> str:
    with merit.trace_step("retrieve"):
        docs = retrieve_docs(prompt)
    with merit.trace_step("generate"):
        return generate_response(docs)

def merit_agent_workflow(my_agent, trace_context):
    """Use trace_context to assert on execution flow."""
    result = my_agent("What is Python?")
    # Set custom attributes on test span
    trace_context.set_attribute("response.length", len(result))
    # Get all child spans
    all_spans = trace_context.get_child_spans()
    assert len(all_spans) >= 2, "Expected at least 2 trace steps"
    # Get SUT-specific spans
    sut_spans = trace_context.get_sut_spans(name="my_agent")
    assert len(sut_spans) == 1
    assert sut_spans[0].name == "sut.my_agent"
    # Get LLM calls (if any)
    llm_calls = trace_context.get_llm_calls()
    for call in llm_calls:
        model = call.attributes.get("llm.model")
        print(f"LLM call used model: {model}")
    # Check trace IDs for correlation
    print(f"Trace ID: {trace_context.trace_id}")
    print(f"Span ID: {trace_context.span_id}")
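The 32- and 16-character lengths listed for trace_id and span_id match OpenTelemetry's 128-bit trace IDs and 64-bit span IDs rendered as zero-padded hex. A minimal sketch of checking that format, for example before using the IDs to correlate with an external log store. The helper names are hypothetical and not part of Merit, and lowercase hex is an assumption based on how OpenTelemetry usually formats IDs.

```python
import re

# 128-bit trace IDs -> 32 hex characters; 64-bit span IDs -> 16 hex characters.
TRACE_ID_RE = re.compile(r"[0-9a-f]{32}")
SPAN_ID_RE = re.compile(r"[0-9a-f]{16}")


def is_valid_trace_id(value: str) -> bool:
    """Hypothetical helper: 32 lowercase hex chars and not all zeros."""
    return bool(TRACE_ID_RE.fullmatch(value)) and int(value, 16) != 0


def is_valid_span_id(value: str) -> bool:
    """Hypothetical helper: 16 lowercase hex chars and not all zeros."""
    return bool(SPAN_ID_RE.fullmatch(value)) and int(value, 16) != 0


def format_trace_id(raw: int) -> str:
    """Render an integer trace ID as 32 zero-padded hex characters."""
    return format(raw, "032x")


def format_span_id(raw: int) -> str:
    """Render an integer span ID as 16 zero-padded hex characters."""
    return format(raw, "016x")


print(format_trace_id(255))
print(is_valid_span_id("0" * 16))
```

The all-zero check reflects the OpenTelemetry convention that an ID of zero is invalid.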
Filtering SUT Spans:
def merit_multiple_suts(agent_a, agent_b, trace_context):
    """Filter SUT spans by name when multiple SUTs are used."""
    result_a = agent_a("query 1")
    result_b = agent_b("query 2")
    # Get spans for specific SUT
    agent_a_spans = trace_context.get_sut_spans(name="agent_a")
    agent_b_spans = trace_context.get_sut_spans(name="agent_b")
    assert len(agent_a_spans) == 1
    assert len(agent_b_spans) == 1
Conditional Logic Based on Tracing:
def merit_tracing_enabled_only(my_agent, trace_context):
    """This test requires tracing (run with `merit test --trace`)."""
    result = my_agent("test query")
    spans = trace_context.get_child_spans()
    assert len(spans) > 0, "Expected trace spans"
    assert result is not None
Utility Functions
get_tracer
Get an OpenTelemetry tracer instance for creating custom spans.
Signature:
def get_tracer(name: str = "merit") -> Tracer
Parameters:
| Name | Type | Default | Description |
|---|---|---|---|
| name | str | "merit" | Tracer name |
Returns: Tracer - OpenTelemetry tracer instance
Example:
from merit.tracing import get_tracer
tracer = get_tracer("my-component")
def custom_function():
    with tracer.start_as_current_span("custom_operation") as span:
        span.set_attribute("custom_key", "custom_value")
        # Your code here
        result = do_work()
        span.set_attribute("result_size", len(result))
        return result
clear_traces
Clear the trace output file.
Signature:
def clear_traces() -> None
Parameters: None
Returns: None
Example:
from merit.tracing import clear_traces, init_tracing
# Setup tracing
init_tracing(output_path=".merit/traces.jsonl")
# Run some tests...
merit_run_1()
# Clear traces before next run
clear_traces()
# Run more tests with fresh trace file
merit_run_2()
set_trace_output_path
Change the trace output path for the current exporter.
Signature:
def set_trace_output_path(output_path: Path | str) -> None
Parameters:
| Name | Type | Description |
|---|---|---|
| output_path | Path \| str | New file path for trace export |
Returns: None
Example:
from merit.tracing import set_trace_output_path, init_tracing
# Initial setup
init_tracing(output_path=".merit/traces.jsonl")
# Change output path mid-run
set_trace_output_path("traces/experiment_2.jsonl")
get_span_collector
Get the current span collector instance for accessing collected spans.
Signature:
def get_span_collector() -> InMemorySpanCollector | None
Parameters: None
Returns: InMemorySpanCollector | None - The active span collector, or None if tracing is not enabled
Example:
from merit.tracing import get_span_collector
def merit_advanced_tracing(my_agent, trace_context):
    """Access raw OpenTelemetry spans for the current test (requires `--trace`)."""
    result = my_agent("test query")
    collector = get_span_collector()
    if collector:
        spans = collector.get_spans(trace_context.trace_id)
        print(f"Total spans in this test trace: {len(spans)}")
    assert result
Note: Most tests should use the trace_context parameter instead, which provides a cleaner API scoped to the current test. get_span_collector() is useful for advanced scenarios that need access to spans from every trace, not just the current test's.
InMemorySpanCollector
Internal class that collects and stores OpenTelemetry spans during test execution.
Purpose: This is an advanced/internal API used by Merit’s tracing system. Most users should use TraceContext instead.
Methods:
| Method | Returns | Description |
|---|---|---|
| get_spans(trace_id) | list[ReadableSpan] | Get all spans for a specific trace ID |
| clear(trace_id) | None | Clear spans for a specific trace ID |
| clear_all() | None | Clear all collected spans |
Example:
from merit.tracing import get_span_collector
def merit_inspect_current_trace(trace_context):
    """Inspect raw spans for the current test (requires `--trace`)."""
    collector = get_span_collector()
    if not collector:
        return
    spans = collector.get_spans(trace_context.trace_id)
    llm_spans = [s for s in spans if s.name.startswith(("openai.", "anthropic.", "gen_ai."))]
    print(f"Total spans: {len(spans)}")
    print(f"Total LLM spans: {len(llm_spans)}")
When to use:
- Custom test runners or frameworks built on Merit
- Advanced trace analysis across multiple tests
- Performance profiling and debugging
When NOT to use:
- Regular test assertions (use the trace_context parameter)
- Single-test trace inspection (use trace_context.get_child_spans())
Automatic Tracing
LLM Client Instrumentation
When init_tracing() is called, Merit automatically instruments:
- OpenAI - openai package
- Anthropic - anthropic package
All LLM calls are captured with:
- Request parameters (model, temperature, messages, etc.)
- Response content, depending on the underlying OpenTelemetry instrumentor configuration (Merit does not currently toggle this via trace_content)
- Timing information
- Token usage
- Error details
Example:
from merit import init_tracing
from openai import OpenAI
# Enable tracing
init_tracing()
# Create client - automatically instrumented
client = OpenAI()
# This call is automatically traced
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello"}]
)
# Trace includes:
# - Model name: "gpt-4"
# - Messages and response content, if the underlying instrumentor is configured to capture them
# - Tokens used
# - Latency
SUT Tracing
Functions and classes decorated with @sut are automatically traced:
from merit import sut
@sut
async def my_agent(prompt: str) -> str:
    # This entire function execution is traced as "sut.my_agent"
    return await llm.generate(prompt)

@sut
class RAGSystem:
    def __call__(self, query: str) -> str:
        # Traced as "sut.rag_system"
        context = self.retrieve(query)
        return self.generate(context)

# Must be async because it awaits my_agent
async def merit_test(my_agent, rag_system):
    # Both calls create spans with input/output
    result1 = await my_agent("Hello")
    result2 = rag_system("Question")
Captured Information:
- Input arguments (args and kwargs)
- Output values
- Execution time
- Nested LLM calls (as child spans)
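The span names used in this reference (sut.my_agent, sut.rag_system, sut.rag_pipeline) suggest that class names are converted to snake_case before the "sut." prefix is added. How Merit actually derives the name is not documented here, so the following is only a sketch of one conversion that reproduces those names. `sut_span_name` is a hypothetical helper, not a Merit function.

```python
import re

# Insert an underscore at lower->upper boundaries ("ChatBot" -> "Chat_Bot")
# and between an acronym and the following word ("RAGSystem" -> "RAG_System").
_WORD_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])")


def sut_span_name(obj_name: str) -> str:
    """Hypothetical: guess the span name for a function or class name."""
    return "sut." + _WORD_BOUNDARY.sub("_", obj_name).lower()


for candidate in ("my_agent", "RAGSystem", "RAGPipeline"):
    print(candidate, "->", sut_span_name(candidate))
```

If a span cannot be found by its expected name, calling trace_context.get_sut_spans() without a name and printing each span's name shows what was actually recorded.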
Usage Patterns
Basic Setup
from merit import init_tracing, sut
# Initialize tracing
init_tracing(output_path=".merit/traces.jsonl")
@sut
async def chatbot(prompt: str) -> str:
    from openai import OpenAI
    client = OpenAI()
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content

async def merit_chatbot(chatbot):
    response = await chatbot("Hello")
    assert "hello" in response.lower()
Trace Structure:
merit_chatbot
└── sut.chatbot
    └── openai.chat.completions.create
Custom Steps
from merit import init_tracing, trace_step, sut
init_tracing()
@sut
class RAGPipeline:
    def __call__(self, query: str) -> str:
        with trace_step("retrieve", {"query": query}):
            docs = self.retrieve(query)
        with trace_step("rerank", {"doc_count": len(docs)}):
            top_docs = self.rerank(docs, query)
        with trace_step("generate"):
            return self.generate(query, top_docs)

def merit_rag(rag_pipeline):
    result = rag_pipeline("What is Python?")
    assert result
Trace Structure:
merit_rag
└── sut.rag_pipeline
    ├── retrieve
    ├── rerank
    └── generate
        └── openai.chat.completions.create
Debugging with Traces
from merit import init_tracing
import json
# Enable tracing
init_tracing(output_path="debug_traces.jsonl")
# Run merit
# ... merit functions execute ...
# Analyze traces
with open("debug_traces.jsonl") as f:
    traces = [json.loads(line) for line in f]
for trace in traces:
    # Each line is an OpenTelemetry span serialized via `ReadableSpan.to_json()`.
    name = trace.get("name", "<unknown>")
    attributes = trace.get("attributes") or {}
    print(name)
    if isinstance(attributes, dict) and "llm.model" in attributes:
        print(f"  Model: {attributes['llm.model']}")
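Beyond printing span names, the same file can be aggregated to see where time goes. The sketch below counts spans per name and sums their durations. It assumes each serialized span carries "start_time" and "end_time" as ISO 8601 strings ending in "Z", which is what ReadableSpan.to_json() typically emits; check a line of your own trace file before relying on it. `summarize_spans` is a hypothetical helper, and the sample lines are made up for the demonstration.

```python
import json
from collections import defaultdict
from datetime import datetime


def _parse_ts(value: str) -> datetime:
    # Assumed timestamp shape: 2024-01-01T00:00:00.000000Z
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize_spans(lines):
    """Return {span name: (count, total seconds)} for an iterable of JSONL lines."""
    totals = defaultdict(lambda: [0, 0.0])
    for line in lines:
        line = line.strip()
        if not line:
            continue
        span = json.loads(line)
        name = span.get("name", "<unknown>")
        totals[name][0] += 1
        start, end = span.get("start_time"), span.get("end_time")
        if start and end:
            totals[name][1] += (_parse_ts(end) - _parse_ts(start)).total_seconds()
    return {name: (count, seconds) for name, (count, seconds) in totals.items()}


# Made-up sample lines standing in for a real trace file.
sample = [
    '{"name": "sut.chatbot", "start_time": "2024-01-01T00:00:00.000000Z", "end_time": "2024-01-01T00:00:01.500000Z"}',
    '{"name": "sut.chatbot", "start_time": "2024-01-01T00:00:02.000000Z", "end_time": "2024-01-01T00:00:02.500000Z"}',
    '{"name": "retrieve"}',
]
summary = summarize_spans(sample)
for name, (count, seconds) in sorted(summary.items()):
    print(f"{name}: {count} span(s), {seconds:.3f}s total")
```

To run it against a real file, pass the open file object, for example `summarize_spans(open("debug_traces.jsonl"))`.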
Privacy Controls
from merit import init_tracing
import os
# Disable content capture for sensitive data
os.environ["MERIT_TRACE_CONTENT"] = "false"
init_tracing()
# Now traces capture:
# - Timing information
# - Model names
# - Token counts
# - Parameter counts
# But NOT:
# - SUT input/output values (`sut.input.*`, `sut.output`)
#
# Note: Merit does not currently guarantee LLM client instrumentation content is redacted.
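Because MERIT_TRACE_CONTENT does not guarantee that LLM client content is redacted, one option is to post-process the trace file before sharing it. The sketch below replaces the values of selected attributes in each serialized span. The exact keys come from the Merit attributes listed in this reference; the prefixes are placeholders for whatever keys your installed instrumentation uses, and are an assumption rather than a list Merit defines.

```python
import json

# Merit attribute keys that carry content, per this reference.
SENSITIVE_KEYS = {"sut.input.args", "sut.input.kwargs", "sut.output"}

# Placeholder prefixes (assumption): replace with the keys that actually
# carry prompt/response content in your own trace file.
SENSITIVE_PREFIXES = ("gen_ai.prompt", "gen_ai.completion")


def is_sensitive(key: str) -> bool:
    return key in SENSITIVE_KEYS or key.startswith(SENSITIVE_PREFIXES)


def redact_span(span: dict) -> dict:
    """Return a copy of a span dict with sensitive attribute values replaced."""
    attributes = span.get("attributes") or {}
    cleaned = {
        key: "[REDACTED]" if is_sensitive(key) else value
        for key, value in attributes.items()
    }
    return {**span, "attributes": cleaned}


def redact_lines(lines):
    """Yield redacted JSONL lines from an iterable of JSONL lines."""
    for line in lines:
        if line.strip():
            yield json.dumps(redact_span(json.loads(line)))


# Made-up span for demonstration.
example = {
    "name": "sut.chatbot",
    "attributes": {
        "sut.output": "secret answer",
        "sut.output.type": "str",
        "gen_ai.prompt.0.content": "secret question",
        "llm.model": "gpt-4",
    },
}
redacted = redact_span(example)
print(json.dumps(redacted, indent=2))
```

Metadata such as sut.output.type is kept because only the exact key sut.output is matched, not everything that starts with it.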
CI/CD Integration
from merit import init_tracing
import os
# Trace to different files per run
run_id = os.environ.get("CI_RUN_ID", "local")
init_tracing(
    service_name=f"merit-{run_id}",
    output_path=f"traces/run_{run_id}.jsonl"
)
# Run merit suite
# ...
# Upload traces to analysis platform
# upload_traces(f"traces/run_{run_id}.jsonl")
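Before uploading, a CI job may want to confirm that the trace file contains usable spans. The sketch below is one way to do that with the standard library only. `check_trace_file` is a hypothetical helper, and treating "a JSON object with a name field" as a valid span is an assumption about the export format.

```python
import json


def check_trace_file(lines):
    """Return (valid_span_count, malformed_line_numbers) for JSONL lines."""
    valid, malformed = 0, []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            malformed.append(number)
            continue
        # Assumption: every exported span is a JSON object with a "name".
        if isinstance(record, dict) and "name" in record:
            valid += 1
        else:
            malformed.append(number)
    return valid, malformed


# Made-up lines standing in for a trace file.
valid, malformed = check_trace_file(['{"name": "a"}', "not json", "", "[1]"])
print(f"{valid} valid span(s), malformed lines: {malformed}")
```

In a pipeline, open the file for the current run, pass it to `check_trace_file`, and fail the job when the count is zero or any line is malformed.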
Traces are exported as JSONL (JSON Lines). Each line is a complete OpenTelemetry span serialized via ReadableSpan.to_json().
Because the exact shape can vary by OpenTelemetry version and installed instrumentations, inspect a line directly in your trace file.
Merit-added attributes to look for (may be absent when MERIT_TRACE_CONTENT=false):
- merit.sut / merit.sut.name
- sut.input.args / sut.input.kwargs / sut.input.count
- sut.output / sut.output.type
{
  "name": "sut.chatbot",
  "attributes": {
    "...": "span attributes (varies by instrumentation)"
  }
}
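The trace structures shown under Usage Patterns can be rebuilt from the exported file by linking each span to its parent. This sketch assumes every serialized span has a "context" object with a "span_id" and a top-level "parent_id" (null for roots), which is the shape ReadableSpan.to_json() usually produces. Verify this against a line of your own file. `build_tree` and `render` are hypothetical helpers, and the sample spans are made up.

```python
import json
from collections import defaultdict


def _span_id(span):
    return (span.get("context") or {}).get("span_id")


def build_tree(spans):
    """Group span dicts by parent span ID; roots are stored under None."""
    ids = {_span_id(s) for s in spans} - {None}
    children = defaultdict(list)
    for span in spans:
        parent = span.get("parent_id")
        # Spans whose parent is missing from the file are treated as roots.
        children[parent if parent in ids else None].append(span)
    return children


def render(children, parent=None, depth=0):
    """Return indented span names, depth-first, ordered by start_time."""
    lines = []
    ordered = sorted(children.get(parent, []), key=lambda s: s.get("start_time", ""))
    for span in ordered:
        lines.append("  " * depth + span.get("name", "<unknown>"))
        sid = _span_id(span)
        if sid is not None:
            lines.extend(render(children, sid, depth + 1))
    return lines


# Made-up sample; child spans are listed before their parent on purpose,
# since the order of lines in a real file is not assumed here.
sample_lines = [
    '{"name": "retrieve", "context": {"span_id": "0x02"}, "parent_id": "0x01", "start_time": "2024-01-01T00:00:01.000000Z"}',
    '{"name": "generate", "context": {"span_id": "0x03"}, "parent_id": "0x01", "start_time": "2024-01-01T00:00:02.000000Z"}',
    '{"name": "sut.rag_pipeline", "context": {"span_id": "0x01"}, "parent_id": null, "start_time": "2024-01-01T00:00:00.000000Z"}',
]
spans = [json.loads(line) for line in sample_lines]
tree_lines = render(build_tree(spans))
print("\n".join(tree_lines))
```

For a file containing several test runs, group spans by their trace ID first and render each group separately.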
CLI Integration
The Merit CLI automatically enables tracing when the --trace flag is used:
# Enable tracing
merit test --trace
# Custom output path
merit test --trace --trace-output my_traces.jsonl
# Disable content (only metadata)
MERIT_TRACE_CONTENT=false merit test --trace