Why AI APIs Use Streaming
Modern AI APIs stream responses using Server-Sent Events (SSE) instead of waiting for the complete response:
Without Streaming
User: "Explain quantum computing"
[5 second wait...]
AI: "Quantum computing is a type of computation that..."
With Streaming
User: "Explain quantum computing"
AI: "Quantum"
AI: " computing"
AI: " is"
AI: " a"
AI: " type"
...
Benefits:
⚡ Faster perceived response time
📱 Better user experience (see response forming)
🔄 Can cancel long responses early
🛠️ Get tool calls before full response completes
AI APIs send responses as SSE streams:
data: {"id":"1","choices":[{"delta":{"content":"Hello"}}]}
data: {"id":"1","choices":[{"delta":{"content":" there"}}]}
data: {"id":"1","choices":[{"delta":{"content":"!"}}]}
data: [DONE]
Each data: line is a chunk. Combined: “Hello there!”
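To make the chunking concrete, here is a minimal Python sketch of how the `data:` lines above could be combined into the final text. The function name `join_sse_text` is hypothetical and this is not StreamParser's actual implementation; it only assumes the OpenAI-style `choices[].delta.content` shape shown in the sample.

```python
import json


def join_sse_text(raw: str) -> str:
    """Concatenate delta.content from OpenAI-style SSE lines (illustrative only)."""
    parts = []
    for line in raw.splitlines():
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)


stream = (
    'data: {"id":"1","choices":[{"delta":{"content":"Hello"}}]}\n\n'
    'data: {"id":"1","choices":[{"delta":{"content":" there"}}]}\n\n'
    'data: {"id":"1","choices":[{"delta":{"content":"!"}}]}\n\n'
    "data: [DONE]\n\n"
)
print(join_sse_text(stream))  # Hello there!
```

Whitespace inside each JSON string (such as the leading space in " there") is preserved, which is why the chunks join into a correctly spaced sentence.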
StreamParser Block
StreamParser extracts text and tool calls from SSE streams:
{
  "block": "StreamParser",
  "input": "${response.body}",
  "config": {
    "format": "sse-openai"
  },
  "output": {
    "text": "aiMessage",
    "toolCalls": "aiTools"
  }
}
Output:
{
  aiMessage: "I'll search for that information for you.",
  aiTools: [
    {
      name: "search_database",
      args: { query: "user data", limit: 10 }
    }
  ]
}
StreamParser supports multiple streaming formats:
| Format | Provider | Description |
| --- | --- | --- |
| sse-openai | OpenAI | ChatGPT API, Azure OpenAI |
| sse-vercel | Vercel AI SDK | Next.js AI applications |
| sse | Generic | Standard SSE format |
| text | Any | Plain text (no parsing) |
Basic Usage
Parse OpenAI Stream
{
  "pipeline": [
    {
      "block": "HttpRequest",
      "input": {
        "url": "https://api.openai.com/v1/chat/completions",
        "method": "POST",
        "headers": {
          "Authorization": "Bearer ${env.OPENAI_API_KEY}",
          "Content-Type": "application/json"
        },
        "body": {
          "model": "gpt-4o-mini",
          "messages": [{
            "role": "user",
            "content": "Hello!"
          }],
          "stream": true
        }
      },
      "output": "response"
    },
    {
      "block": "StreamParser",
      "input": "${response.body}",
      "config": {
        "format": "sse-openai"
      },
      "output": {
        "text": "aiMessage"
      }
    }
  ],
  "assertions": {
    "response.status": 200,
    "aiMessage": { "minLength": 1 }
  }
}
Parse Vercel AI SDK Stream
{
  "pipeline": [
    {
      "block": "HttpRequest",
      "input": {
        "url": "${YOUR_API}/chat",
        "method": "POST",
        "body": {
          "message": "Hello!"
        }
      },
      "output": "response"
    },
    {
      "block": "StreamParser",
      "input": "${response.body}",
      "config": {
        "format": "sse-vercel"
      },
      "output": {
        "text": "aiMessage",
        "toolCalls": "tools"
      }
    }
  ]
}
Text Only
{
  "output": {
    "text": "aiMessage"
  }
}
Combines all text chunks into a single string.
Tools Only
{
  "output": {
    "toolCalls": "tools"
  }
}
Extracts all function/tool calls from the stream.
Both Text and Tools
{
  "output": {
    "text": "aiMessage",
    "toolCalls": "tools"
  }
}
With Metadata
{
  "output": {
    "text": "aiMessage",
    "toolCalls": "tools",
    "metadata": "streamMeta"
  }
}
Metadata includes:
{
  format: "sse-openai",
  totalChunks: 45,
  totalTools: 3
}
Real-World Examples
1. Test ChatGPT-Style Interface
{
  "name": "Chat API Streaming Test",
  "context": {
    "OPENAI_URL": "https://api.openai.com/v1/chat/completions",
    "API_KEY": "${env.OPENAI_API_KEY}"
  },
  "tests": [{
    "id": "test-streaming-response",
    "pipeline": [
      {
        "id": "call-chat-api",
        "block": "HttpRequest",
        "input": {
          "url": "${OPENAI_URL}",
          "method": "POST",
          "headers": {
            "Authorization": "Bearer ${API_KEY}",
            "Content-Type": "application/json"
          },
          "body": {
            "model": "gpt-4o-mini",
            "messages": [{
              "role": "user",
              "content": "Explain the benefits of TypeScript in 2-3 sentences"
            }],
            "stream": true
          }
        },
        "output": "response"
      },
      {
        "id": "parse-stream",
        "block": "StreamParser",
        "input": "${response.body}",
        "config": {
          "format": "sse-openai"
        },
        "output": {
          "text": "aiMessage",
          "metadata": "streamMeta"
        }
      },
      {
        "id": "validate-content",
        "block": "ValidateContent",
        "input": {
          "from": "aiMessage",
          "as": "text"
        },
        "config": {
          "contains": ["TypeScript"],
          "minLength": 50
        },
        "output": "contentCheck"
      },
      {
        "id": "validate-quality",
        "block": "LLMJudge",
        "input": {
          "text": "${aiMessage}",
          "expected": {
            "expectedBehavior": "Explains TypeScript benefits concisely (2-3 sentences) mentioning type safety or developer experience"
          }
        },
        "output": "qualityCheck"
      }
    ],
    "assertions": {
      "response.status": 200,
      "streamMeta.totalChunks": { "gt": 0 },
      "contentCheck.passed": true,
      "qualityCheck.score": { "gte": 0.8 }
    }
  }]
}
2. Test Streaming with Tool Calls
{
  "tests": [{
    "id": "test-streaming-with-tools",
    "pipeline": [
      {
        "block": "HttpRequest",
        "input": {
          "url": "${OPENAI_URL}",
          "method": "POST",
          "headers": {
            "Authorization": "Bearer ${API_KEY}"
          },
          "body": {
            "model": "gpt-4o-mini",
            "messages": [{
              "role": "user",
              "content": "Search for users with premium subscription"
            }],
            "tools": [
              {
                "type": "function",
                "function": {
                  "name": "search_users",
                  "description": "Search users database",
                  "parameters": {
                    "type": "object",
                    "properties": {
                      "filter": { "type": "string" },
                      "limit": { "type": "number" }
                    }
                  }
                }
              }
            ],
            "stream": true
          }
        },
        "output": "response"
      },
      {
        "block": "StreamParser",
        "input": "${response.body}",
        "config": {
          "format": "sse-openai"
        },
        "output": {
          "text": "aiMessage",
          "toolCalls": "tools"
        }
      },
      {
        "block": "ValidateTools",
        "input": {
          "from": "tools",
          "as": "toolCalls"
        },
        "config": {
          "expected": ["search_users"],
          "validateArgs": {
            "search_users": {
              "filter": "premium"
            }
          }
        },
        "output": "toolValidation"
      }
    ],
    "assertions": {
      "toolValidation.passed": true,
      "tools[0].name": "search_users"
    }
  }]
}
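In OpenAI-style streams, a tool call typically arrives in pieces: the function name in one chunk and the JSON arguments as string fragments spread over later chunks, keyed by `index`. The Python sketch below shows one way such fragments could be merged into the `{name, args}` shape used on this page. `collect_tool_calls` is a hypothetical helper, not StreamParser's implementation, and the sample chunks are made up for illustration.

```python
import json


def collect_tool_calls(chunks: list[dict]) -> list[dict]:
    """Merge streamed tool-call fragments into {name, args} dicts (sketch)."""
    partial: dict[int, dict] = {}
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            for frag in choice.get("delta", {}).get("tool_calls") or []:
                slot = partial.setdefault(
                    frag.get("index", 0), {"name": "", "arguments": ""}
                )
                fn = frag.get("function", {})
                slot["name"] += fn.get("name") or ""
                slot["arguments"] += fn.get("arguments") or ""
    calls = []
    for index in sorted(partial):
        slot = partial[index]
        # Arguments only become valid JSON once every fragment has arrived.
        args = json.loads(slot["arguments"]) if slot["arguments"] else {}
        calls.append({"name": slot["name"], "args": args})
    return calls


# Hypothetical parsed chunks: the arguments string is split mid-word.
chunks = [
    {"choices": [{"delta": {"tool_calls": [
        {"index": 0, "function": {"name": "search_users", "arguments": ""}}]}}]},
    {"choices": [{"delta": {"tool_calls": [
        {"index": 0, "function": {"arguments": '{"filter": "pre'}}]}}]},
    {"choices": [{"delta": {"tool_calls": [
        {"index": 0, "function": {"arguments": 'mium", "limit": 5}'}}]}}]},
]
print(collect_tool_calls(chunks))
```

Because the arguments are only parseable after the last fragment, a stream that is cut off mid-call would make `json.loads` raise here; a production parser would need to decide how to report that case.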
3. Test Vercel AI SDK Streaming
{
  "tests": [{
    "id": "test-vercel-ai-stream",
    "pipeline": [
      {
        "block": "HttpRequest",
        "input": {
          "url": "${YOUR_NEXTJS_API}/api/chat",
          "method": "POST",
          "headers": {
            "Content-Type": "application/json"
          },
          "body": {
            "messages": [{
              "role": "user",
              "content": "Recommend a laptop under $1000"
            }]
          }
        },
        "output": "response"
      },
      {
        "block": "StreamParser",
        "input": "${response.body}",
        "config": {
          "format": "sse-vercel"
        },
        "output": {
          "text": "aiMessage",
          "toolCalls": "tools",
          "metadata": "meta"
        }
      },
      {
        "block": "LLMJudge",
        "input": {
          "text": "${aiMessage}",
          "expected": {
            "expectedBehavior": "Recommends specific laptop models under $1000 with reasons"
          }
        },
        "output": "judgement"
      }
    ],
    "assertions": {
      "response.status": 200,
      "aiMessage": { "minLength": 20 },
      "judgement.score": { "gte": 0.8 }
    }
  }]
}
4. Test Partial Response Quality
Test response quality even if the stream is cut short (note that the mock stream has no data: [DONE] line):
{
  "tests": [{
    "id": "test-partial-response",
    "pipeline": [
      {
        "block": "MockData",
        "config": {
          "data": {
            "streamData": "data: {\"choices\":[{\"delta\":{\"content\":\"The\"}}]}\n\ndata: {\"choices\":[{\"delta\":{\"content\":\" capital\"}}]}\n\ndata: {\"choices\":[{\"delta\":{\"content\":\" of\"}}]}\n\n"
          }
        },
        "output": "mock"
      },
      {
        "block": "StreamParser",
        "input": "${mock.streamData}",
        "config": {
          "format": "sse-openai"
        },
        "output": {
          "text": "partial"
        }
      }
    ],
    "assertions": {
      "partial": "The capital of"
    }
  }]
}
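Hand-escaping an SSE body inside a JSON string is error-prone. As a convenience, a fixture like the one above could be generated rather than typed. The sketch below is one possible approach in Python; `build_sse` is a hypothetical helper and is not part of SemanticTest.

```python
import json


def build_sse(contents: list[str], done: bool = False) -> str:
    """Build an OpenAI-style SSE body from text chunks (hypothetical helper)."""
    events = [
        "data: "
        + json.dumps({"choices": [{"delta": {"content": c}}]}, separators=(",", ":"))
        for c in contents
    ]
    if done:
        events.append("data: [DONE]")
    # Each SSE event is terminated by a blank line.
    return "".join(event + "\n\n" for event in events)


# No [DONE] sentinel: this simulates a stream that was cut short.
partial_stream = build_sse(["The", " capital", " of"])

# json.dumps applies the string escaping needed to paste this into a MockData config.
print(json.dumps({"streamData": partial_stream}, indent=2))
```

Passing `done=True` produces a complete stream, so the same helper can generate both the truncated and the well-formed variant of a test case.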
Response Time
Streaming doesn't reduce the total response time, but it improves perceived speed because the first token arrives much sooner:
// Total time: same
Traditional: [████████████] 5s
Streaming:   [████████████] 5s
// Time to first token: much faster
Traditional: [............] wait 5s
Streaming:   [█...........] 0.2s first chunk
To test total response time, assert on response.duration:
{
  "pipeline": [
    {
      "block": "HttpRequest",
      "output": "response"
    }
  ],
  "assertions": {
    "response.duration": { "lt": 10000 } // Under 10s total
  }
}
For time-to-first-chunk testing, you’d need custom timing logic (not built-in yet).
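As an illustration of what such custom timing logic might look like outside the pipeline, here is a small Python sketch that measures time to first chunk and total time over any chunk iterator. The names `time_stream` and `fake_stream` are hypothetical, and the fake stream stands in for a real streaming HTTP body.

```python
import time
from typing import Iterable, Iterator


def time_stream(chunks: Iterable[bytes]) -> dict:
    """Consume a chunk iterator and record first-chunk and total latency in ms."""
    start = time.perf_counter()
    first_chunk_ms = None
    count = 0
    for _ in chunks:
        if first_chunk_ms is None:
            first_chunk_ms = (time.perf_counter() - start) * 1000
        count += 1
    total_ms = (time.perf_counter() - start) * 1000
    return {"first_chunk_ms": first_chunk_ms, "total_ms": total_ms, "chunks": count}


def fake_stream() -> Iterator[bytes]:
    """Stand-in for a streaming HTTP body: three chunks, 50 ms apart."""
    for text in (b"data: one\n\n", b"data: two\n\n", b"data: [DONE]\n\n"):
        time.sleep(0.05)
        yield text


timing = time_stream(fake_stream())
print(timing)
```

With a real client, the same function could be fed the response's chunk iterator, for example `iter_lines()` on a `requests` response opened with `stream=True`. The clock here starts when iteration begins, so it should be started before the request is sent if connection time is meant to be included.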
Error Handling
Incomplete Streams
{
  "pipeline": [
    {
      "block": "StreamParser",
      "input": "${response.body}",
      "output": {
        "text": "aiMessage",
        "metadata": "meta"
      }
    }
  ],
  "assertions": {
    "aiMessage": { "minLength": 1 }
  }
}
StreamParser handles common issues:
Missing data: prefix
Invalid JSON in chunks
Incomplete tool call objects
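How StreamParser handles these cases internally is not described here. Purely as an illustration of the idea, the Python sketch below shows a lenient parser that accepts lines without the `data:` prefix and skips chunks whose JSON cannot be parsed, counting them instead of failing. `parse_lenient` and its return shape are assumptions made for this example.

```python
import json


def parse_lenient(raw: str) -> dict:
    """Extract text from an SSE body, skipping lines that cannot be parsed."""
    text_parts, skipped = [], 0
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        # Tolerate a missing "data:" prefix by treating the whole line as payload.
        payload = line[5:].strip() if line.startswith("data:") else line
        if payload == "[DONE]":
            break
        try:
            chunk = json.loads(payload)
        except json.JSONDecodeError:
            skipped += 1  # malformed or truncated chunk: count it and move on
            continue
        if not isinstance(chunk, dict):
            skipped += 1
            continue
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                text_parts.append(content)
    return {"text": "".join(text_parts), "skipped": skipped}


messy = (
    'data: {"choices":[{"delta":{"content":"Order"}}]}\n\n'
    '{"choices":[{"delta":{"content":" confirmed"}}]}\n\n'  # missing prefix
    'data: {"choices":[{"delta":{"content":" tod\n\n'  # cut off mid-chunk
)
print(parse_lenient(messy))  # {'text': 'Order confirmed', 'skipped': 1}
```

Reporting the number of skipped chunks keeps the leniency visible, so a test can still fail when too much of the stream was unreadable.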
Timeout Handling
{
  "block": "HttpRequest",
  "input": {
    "url": "${STREAMING_API}",
    "timeout": 30000 // 30s timeout for long streams
  }
}
Combining with Validation Blocks
StreamParser → ValidateContent → LLMJudge
{
  "pipeline": [
    {
      "block": "HttpRequest",
      "output": "response"
    },
    {
      "id": "parse",
      "block": "StreamParser",
      "input": "${response.body}",
      "output": {
        "text": "aiMessage",
        "toolCalls": "tools"
      }
    },
    {
      "id": "check-keywords",
      "block": "ValidateContent",
      "input": {
        "from": "aiMessage",
        "as": "text"
      },
      "config": {
        "contains": ["order", "confirmed"],
        "minLength": 20
      },
      "output": "keywordCheck"
    },
    {
      "id": "check-tools",
      "block": "ValidateTools",
      "input": {
        "from": "tools",
        "as": "toolCalls"
      },
      "config": {
        "expected": ["create_order"]
      },
      "output": "toolCheck"
    },
    {
      "id": "check-quality",
      "block": "LLMJudge",
      "input": {
        "text": "${aiMessage}",
        "expected": {
          "expectedBehavior": "Professional order confirmation with next steps"
        }
      },
      "output": "qualityCheck"
    }
  ],
  "assertions": {
    "keywordCheck.passed": true,
    "toolCheck.passed": true,
    "qualityCheck.score": { "gte": 0.85 }
  }
}
Streaming vs Non-Streaming
Streaming
Pros:
Better UX (see response forming)
Can cancel long responses
Get tool calls early
Cons:
More complex to parse
Harder to debug
Can’t easily inspect full response
When to use:
User-facing chat interfaces
Long-form content generation
Real-time feedback needed
Non-Streaming
Pros:
Simpler to handle
Easier to debug
Can inspect full response
Cons:
Worse UX (wait for full response)
Can’t cancel
Higher perceived latency
When to use:
Background processing
Batch operations
Testing/development
Best Practices
1. Always Check Parse Errors
{
  "output": {
    "metadata": "meta"
  }
}
2. Set Appropriate Timeouts
{
  "block": "HttpRequest",
  "input": {
    "timeout": 30000 // 30s for typical responses; use 120000 (2 min) for long generations
  }
}
Streaming can take longer than traditional requests.
3. Validate Both Structure and Semantics
{
  "pipeline": [
    { "block": "StreamParser" },
    { "block": "ValidateContent" }, // Structure (fast, free)
    { "block": "LLMJudge" } // Semantics (slower, costs $)
  ]
}
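The reasoning behind this ordering is that a cheap structural check can reject obviously bad output before any money is spent on an LLM judge. The Python sketch below illustrates that gating idea in isolation. The functions `structural_check`, `validate`, and `stub_judge` are hypothetical, and whether SemanticTest itself skips later blocks after a failure is not stated here.

```python
from typing import Callable


def structural_check(text: str, contains: list[str], min_length: int) -> bool:
    """Cheap check: required substrings and a minimum length."""
    return len(text) >= min_length and all(term in text for term in contains)


def validate(text: str, judge: Callable[[str], float], threshold: float = 0.8) -> dict:
    """Run the free structural check first; call the paid judge only if it passes."""
    if not structural_check(text, contains=["order", "confirmed"], min_length=20):
        return {"passed": False, "stage": "structure", "score": None}
    score = judge(text)
    return {"passed": score >= threshold, "stage": "semantics", "score": score}


judge_calls = []


def stub_judge(text: str) -> float:
    """Stand-in for an LLM call; records each invocation and returns a fixed score."""
    judge_calls.append(text)
    return 0.9


print(validate("Too short", stub_judge))
print(validate("Your order is confirmed and ships tomorrow.", stub_judge))
print(len(judge_calls))  # 1
```

Only the second message reaches the judge, so the failing case costs nothing beyond a string comparison.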
4. Test Edge Cases
Empty streams
Incomplete streams (connection drops)
Very long responses (10,000+ tokens)
Multiple tool calls in one stream
Mixed text and tool calls
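5. Extract Only What You Need
Map only the outputs your assertions use. If a test checks text alone, leave toolCalls and metadata out of the output block.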
Debugging Streams
Problem: Empty aiMessage
Check:
Is stream format correct?
Look at raw response.body
{
  "assertions": {
    "response.body": true // Print raw stream
  }
}
Problem: No tool calls extracted
Check:
Correct format? (sse-openai vs sse-vercel)
Are tool calls in the response?
Check metadata.totalTools
{
  "output": {
    "toolCalls": "tools",
    "metadata": "meta"
  },
  "assertions": {
    "meta.totalTools": { "gt": 0 }
  }
}
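When it is unclear which format a response uses, looking at the first non-empty line of the raw body is often enough to tell. The Python sketch below is a rough heuristic for that, mapping a raw body to one of the format names listed on this page. `guess_format` is a hypothetical debugging aid, not a SemanticTest feature, and it can misclassify unusual streams.

```python
import json
import re


def guess_format(raw: str) -> str:
    """Guess a StreamParser format name from the first non-empty line (heuristic)."""
    first = next((ln.strip() for ln in raw.splitlines() if ln.strip()), "")
    if re.match(r"^[0-9a-z]:", first):
        return "sse-vercel"  # single-character type code, e.g. 0:"Hello" or 9:{...}
    if first.startswith("data:"):
        payload = first[5:].strip()
        try:
            parsed = json.loads(payload)
        except json.JSONDecodeError:
            return "sse"  # data: lines carrying plain text
        if isinstance(parsed, dict) and "choices" in parsed:
            return "sse-openai"
        return "sse"
    return "text"


print(guess_format('data: {"id":"1","choices":[{"delta":{"content":"Hello"}}]}'))
print(guess_format('0:"Hello"\n0:" world"'))
print(guess_format("data: Any text content here"))
```

If the guess differs from the `format` value in the StreamParser config, that mismatch is a likely cause of empty text or missing tool calls.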
Problem: Parse errors
Common causes:
Wrong format specified
Malformed JSON in chunks
Mixed stream formats
Solution:
{
  "config": {
    "format": "sse" // Try generic SSE format
  }
}
Stream Format Examples
OpenAI format (sse-openai):
data: {"id":"1","choices":[{"delta":{"content":"Hello"},"index":0}]}
data: {"id":"1","choices":[{"delta":{"tool_calls":[{"function":{"name":"search"}}]},"index":0}]}
data: [DONE]
Vercel AI SDK format (sse-vercel):
0:"Hello"
0:" world"
9:{"toolCallId":"123","toolName":"search"}
Generic SSE format (sse):
data: Any text content here
data: More content
data: Even more
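The Vercel-style lines differ from the `data:` formats: each line is a short type code, a colon, and a JSON value. Based only on the sample lines on this page (`0:` carrying text and `9:` carrying a tool call), a parser for that shape could be sketched in Python as follows. `parse_vercel_stream` is a hypothetical name, and other type codes that may appear in real streams are simply ignored here.

```python
import json


def parse_vercel_stream(raw: str) -> dict:
    """Split `code:json` lines into text and tool calls (illustrative sketch)."""
    text_parts, tool_calls = [], []
    for line in raw.splitlines():
        code, sep, payload = line.strip().partition(":")
        if not sep:
            continue  # blank or unrecognised line
        try:
            value = json.loads(payload)
        except json.JSONDecodeError:
            continue  # not a JSON payload; skip it
        if code == "0" and isinstance(value, str):
            text_parts.append(value)  # text part
        elif code == "9" and isinstance(value, dict):
            tool_calls.append(value)  # tool call part
    return {"text": "".join(text_parts), "toolCalls": tool_calls}


sample = '0:"Hello"\n0:" world"\n9:{"toolCallId":"123","toolName":"search"}\n'
print(parse_vercel_stream(sample))
```

Splitting on the first colon only is deliberate: the JSON payload itself contains colons, so `partition` keeps it intact.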
Next Steps
Multi-Turn Conversations: Test conversational AI flows
StreamParser Reference: Complete StreamParser documentation
Tool Call Validation: Validate AI tool/function calls
AI Chat Example: Full streaming chat test example