Why SemanticTest?

Testing AI systems is hard. Responses are non-deterministic, tool usage must be validated, and semantic meaning matters more than exact text matching. SemanticTest is designed to solve exactly these problems.

Quick Example

Here’s a simple test that validates an API response semantically:
{
  "name": "User API Test",
  "tests": [{
    "id": "get-user",
    "pipeline": [
      {
        "id": "request",
        "block": "HttpRequest",
        "input": {
          "url": "https://api.example.com/users/1",
          "method": "GET"
        },
        "output": "response"
      },
      {
        "id": "judge",
        "block": "LLMJudge",
        "input": {
          "text": "${response.body}",
          "expected": {
            "expectedBehavior": "Should return user information with name and email"
          }
        },
        "output": "validation"
      }
    ],
    "assertions": {
      "response.status": 200,
      "validation.score": { "gt": 0.7 }
    }
  }]
}
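The `assertions` map above pairs a dotted path into the pipeline's outputs with either a literal value (strict equality, as with `response.status`) or an operator object such as `{ "gt": 0.7 }`. As a rough mental model (not SemanticTest's actual implementation; `resolvePath`, `checkAssertion`, and `runAssertions` are illustrative names), assertion evaluation might look like this:

```javascript
// Resolve a dotted path such as "response.status" against the test context.
function resolvePath(context, path) {
  return path
    .split(".")
    .reduce((obj, key) => (obj == null ? undefined : obj[key]), context);
}

// An assertion is either a literal (checked with strict equality) or an
// operator object such as { "gt": 0.7 }.
function checkAssertion(actual, expected) {
  if (expected !== null && typeof expected === "object") {
    if ("gt" in expected) return actual > expected.gt;
    // Other operators (lt, contains, ...) would be handled here.
  }
  return actual === expected;
}

// Every assertion must pass for the test to pass.
function runAssertions(context, assertions) {
  return Object.entries(assertions).every(
    ([path, expected]) => checkAssertion(resolvePath(context, path), expected)
  );
}

// A context mirroring the pipeline outputs in the example above.
const context = {
  response: { status: 200 },
  validation: { score: 0.85 },
};

const ok = runAssertions(context, {
  "response.status": 200,
  "validation.score": { gt: 0.7 },
});
console.log(ok); // true
```

Because each step writes its result under its `output` key, later steps and assertions can reference any earlier result by path.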

What Makes It Different?

Instead of exact text matching, SemanticTest uses AI to understand the meaning of responses. “2:00 PM”, “2 PM”, “14:00”, and “two in the afternoon” are all semantically equivalent.

Test tool calls, streaming responses, multi-turn conversations, and non-deterministic outputs with confidence.

Mix and match 8 built-in blocks or create your own custom blocks. Each block does one thing well.

100% open source, runs locally, works with any LLM provider. You control your data and costs.
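A custom block, conceptually, is just a named unit that takes the step's resolved `input` and returns the value stored under its `output` key. The exact block API is not shown on this page, so the shape below (a name plus an async `run(input)` function) is an assumption for illustration only:

```javascript
// Hypothetical custom block sketch -- the real SemanticTest block interface
// may differ. "Uppercase" does one thing: uppercase the incoming text.
const UppercaseBlock = {
  name: "Uppercase",
  // Receives the resolved "input" object from a pipeline step and returns
  // the object that the runner would store under the step's "output" key.
  async run(input) {
    return { text: String(input.text).toUpperCase() };
  },
};

// Standalone usage, outside any runner:
UppercaseBlock.run({ text: "hello" }).then((out) => console.log(out.text)); // HELLO
```

Keeping each block this small is what makes them composable: a pipeline is just a sequence of such units wired together by their `input` and `output` keys.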

Use Cases

AI Agent Testing

Test AI agents that use tools, make decisions, and maintain conversations

API Testing

Traditional REST API testing with powerful validation and semantic checks

Streaming Responses

Parse and validate streaming SSE responses from OpenAI, Vercel AI SDK, and more

Integration Testing

Test complex workflows with multiple API calls, data transformations, and validations
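On the streaming use case: SSE responses arrive as newline-delimited `data:` events, each carrying a JSON payload in OpenAI-style streams. A minimal sketch of extracting and reassembling those payloads (`parseSSE` is an illustrative helper, not part of SemanticTest's API):

```javascript
// Extract the JSON payloads from a raw SSE stream, skipping the
// non-JSON "[DONE]" sentinel that OpenAI-style streams end with.
function parseSSE(raw) {
  return raw
    .split("\n")
    .filter((line) => line.startsWith("data: ") && !line.includes("[DONE]"))
    .map((line) => JSON.parse(line.slice("data: ".length)));
}

// A tiny OpenAI-style chat completion stream.
const stream = [
  'data: {"choices":[{"delta":{"content":"Hel"}}]}',
  'data: {"choices":[{"delta":{"content":"lo"}}]}',
  "data: [DONE]",
].join("\n");

// Reassemble the streamed text from the content deltas.
const text = parseSSE(stream)
  .map((evt) => evt.choices[0].delta.content)
  .join("");
console.log(text); // Hello
```

A streaming-aware test would typically validate the reassembled text semantically, the same way the LLMJudge step validates a plain HTTP body above.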

Next Steps

I