
Why SemanticTest?

Testing AI systems is hard: responses are non-deterministic, tool usage has to be validated, and semantic meaning matters more than exact text matching. SemanticTest addresses this with:

Composable Blocks

Build complex test scenarios using simple, reusable blocks for HTTP, parsing, validation, and AI evaluation

Pipeline Architecture

Data flows through named slots, making tests readable and maintainable

LLM Judge

Evaluate responses semantically using AI instead of exact text matching

JSON Test Definitions

Version-controllable, readable test definitions that anyone can understand

Quick Example

Here’s a simple test that validates an API response semantically:
{
  "name": "User API Test",
  "tests": [{
    "id": "get-user",
    "pipeline": [
      {
        "id": "request",
        "block": "HttpRequest",
        "input": {
          "url": "https://api.example.com/users/1",
          "method": "GET"
        },
        "output": "response"
      },
      {
        "id": "judge",
        "block": "LLMJudge",
        "input": {
          "text": "${response.body}",
          "expected": {
            "expectedBehavior": "Should return user information with name and email"
          }
        },
        "output": "validation"
      }
    ],
    "assertions": {
      "response.status": 200,
      "validation.score": { "gt": 0.7 }
    }
  }]
}
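
In the test above, each pipeline step writes its result to a named slot (`"output": "response"`), later steps read it back with `${response.body}`, and the assertions address slot values by dotted path. The following is a minimal sketch of how that kind of slot lookup and assertion check could work. It is not SemanticTest's actual implementation: `resolvePath`, `interpolate`, and `checkAssertion` are hypothetical names, the slot values are made up, and only the literal and `gt` forms shown in the example are handled.

```typescript
// Hypothetical sketch: illustrates named slots, not SemanticTest's real code.
type Slots = Record<string, unknown>;

// Walk a dotted path such as "response.status" through the slot store.
function resolvePath(slots: Slots, path: string): unknown {
  let current: unknown = slots;
  for (const key of path.split(".")) {
    if (current === null || typeof current !== "object") return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

// Replace each ${path} placeholder with the referenced slot value.
// Unknown paths are left untouched so the gap stays visible.
function interpolate(template: string, slots: Slots): string {
  return template.replace(/\$\{([^}]+)\}/g, (match: string, path: string) => {
    const value = resolvePath(slots, path.trim());
    if (value === undefined) return match;
    return typeof value === "string" ? value : JSON.stringify(value);
  });
}

// An expectation is either a literal (strict equality) or { gt: number }.
type Expectation = string | number | boolean | { gt: number };

function checkAssertion(actual: unknown, expected: Expectation): boolean {
  if (typeof expected === "object") {
    return typeof actual === "number" && actual > expected.gt;
  }
  return actual === expected;
}

// Made-up slot contents standing in for what the two steps would produce.
const slots: Slots = {
  response: { status: 200, body: '{"name":"Ada","email":"ada@example.com"}' },
  validation: { score: 0.85 },
};

const judgeInput = interpolate("${response.body}", slots);
const passed =
  checkAssertion(resolvePath(slots, "response.status"), 200) &&
  checkAssertion(resolvePath(slots, "validation.score"), { gt: 0.7 });
```

Keeping slot resolution separate from assertion checking means the same path syntax can serve both step inputs and the final `assertions` map.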

What Makes It Different?

Instead of exact text matching, SemanticTest uses an LLM judge to compare the meaning of a response with the expected behavior, so “2:00 PM”, “2 PM”, “14:00”, and “two in the afternoon” all count as the same answer.
Test tool calls, streaming responses, multi-turn conversations, and non-deterministic outputs with confidence.
Mix and match 8 built-in blocks or create your own custom blocks. Each block does one thing well.
100% open source, runs locally, works with any LLM provider. You control your data and costs.
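
To make the semantic comparison described above more concrete, here is a rough sketch of the two pieces any LLM-based judge needs: a prompt that states the expected behavior, and a parser that turns the model's reply into a numeric score that can be compared against a threshold such as `0.7`. Everything here is an assumption for illustration: the prompt wording, the reply shape, and the names `buildJudgePrompt` and `parseJudgeReply` are hypothetical and not taken from SemanticTest. The call to the model itself is left out so the sketch stays provider-neutral.

```typescript
// Hypothetical sketch: prompt wording and reply shape are assumptions.
interface JudgeResult {
  score: number; // 0 (does not match) to 1 (fully matches the expected behavior)
  reasoning: string;
}

// Build the grading prompt sent to whichever LLM provider is configured.
function buildJudgePrompt(text: string, expectedBehavior: string): string {
  return [
    "You are grading a response against an expected behavior.",
    `Expected behavior: ${expectedBehavior}`,
    `Response: ${text}`,
    'Reply with JSON only: {"score": <number from 0 to 1>, "reasoning": "<one sentence>"}',
  ].join("\n");
}

// Turn the raw model reply into a clamped score plus reasoning.
function parseJudgeReply(raw: string): JudgeResult {
  // Models sometimes wrap JSON in prose, so take the outermost {...} span.
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end <= start) {
    throw new Error("no JSON object in judge reply");
  }
  const parsed = JSON.parse(raw.slice(start, end + 1)) as Partial<JudgeResult>;
  if (typeof parsed.score !== "number" || Number.isNaN(parsed.score)) {
    throw new Error("judge reply has no numeric score");
  }
  return {
    score: Math.min(1, Math.max(0, parsed.score)), // clamp to [0, 1]
    reasoning: typeof parsed.reasoning === "string" ? parsed.reasoning : "",
  };
}

// A made-up reply, standing in for real model output.
const prompt = buildJudgePrompt("The meeting is at 14:00.", "Should say the meeting is at 2 PM");
const reply = 'Sure. {"score": 0.9, "reasoning": "Both times refer to 14:00."}';
const result = parseJudgeReply(reply);
const passes = result.score > 0.7;
```

Because the judge returns a score rather than a yes/no answer, the pass threshold stays in the test definition, where it can be tuned per test.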

Use Cases

AI Agent Testing

Test AI agents that use tools, make decisions, and maintain conversations

API Testing

Traditional REST API testing with powerful validation and semantic checks

Streaming Responses

Parse and validate streaming SSE responses from OpenAI, Vercel AI SDK, and more

Integration Testing

Test complex workflows with multiple API calls, data transformations, and validations
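
For the streaming use case above, it can help to see what parsing a server-sent events (SSE) stream involves. The sketch below assumes OpenAI-style chat completion chunks, where each event carries a `data:` line with JSON and the stream ends with `data: [DONE]`; other SDKs use different payload shapes. `collectSseText` is a hypothetical helper, not a SemanticTest block, and the sample stream is made up.

```typescript
// Hypothetical sketch: assumes OpenAI-style chat completion chunks.
type ChatChunk = { choices?: { delta?: { content?: string } }[] };

// Concatenate the text deltas from a complete SSE transcript.
function collectSseText(stream: string): string {
  let text = "";
  // SSE events are separated by a blank line.
  for (const event of stream.split(/\r?\n\r?\n/)) {
    for (const line of event.split(/\r?\n/)) {
      if (!line.startsWith("data:")) continue; // skip comments and other fields
      const payload = line.slice("data:".length).trim();
      if (payload === "[DONE]") return text; // end-of-stream sentinel
      if (payload === "") continue;
      const chunk = JSON.parse(payload) as ChatChunk;
      text += chunk.choices?.[0]?.delta?.content ?? "";
    }
  }
  return text;
}

// A made-up two-chunk stream.
const sample = [
  'data: {"choices":[{"delta":{"content":"2:00"}}]}',
  "",
  'data: {"choices":[{"delta":{"content":" PM"}}]}',
  "",
  "data: [DONE]",
  "",
].join("\n");

const streamed = collectSseText(sample);
```

Once the deltas are joined into a single string, that string can be judged semantically like any non-streamed response body.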

Next Steps

Quickstart

Get started in 30 seconds

Core Concepts

Understand how SemanticTest works

Browse Blocks

Explore all available building blocks

View Examples

Learn from real-world examples