Variant Assignment
Variant assignment is the process of deciding which Variant a specific user or request should be routed to for a given Experiment.

- Mechanism: AgentTest’s backend assigns variants primarily based on the `trafficPercent` you define for each variant within an active experiment. When your application calls the assignment endpoint, the system uses this distribution to pick a variant.
- Endpoint: This is handled by the `POST /experiment/:slug/assign` endpoint. Your application sends a `userId` and optional `context`, and AgentTest responds with the `key` and `payload` of the assigned variant.
- Stateless Backend: The core assignment logic in the backend is stateless. It doesn’t store a persistent record of which user was assigned to which variant over time (unless you explicitly log this information via the logging endpoint). Each call to `/assign` is typically independent.
- User Consistency: If you need a user to consistently see the same variant across multiple requests or sessions, your client-side application or SDK will need to handle this. This often involves caching the assigned variant’s `key` and `payload` for a specific `userId` for a certain duration or until the session ends.
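The `trafficPercent`-based mechanism described above can be pictured as a cumulative-distribution pick. The sketch below is illustrative only, not AgentTest’s actual backend code; the `Variant` shape and `pickVariant` helper are hypothetical.

```typescript
// Hypothetical variant shape; `key` and `trafficPercent` mirror the fields
// described above. This is a sketch, not AgentTest's implementation.
interface Variant {
  key: string;
  trafficPercent: number; // percentages across an experiment should sum to 100
}

// Pick a variant by walking the cumulative traffic distribution.
// `roll` is a number in [0, 100); injecting it keeps the function testable.
function pickVariant(variants: Variant[], roll: number): Variant {
  let cumulative = 0;
  for (const v of variants) {
    cumulative += v.trafficPercent;
    if (roll < cumulative) return v;
  }
  // Fall back to the last variant (guards against rounding drift).
  return variants[variants.length - 1];
}

// In production the roll would come from a RNG, e.g.:
// pickVariant(variants, Math.random() * 100)
```

With a 50/50 split, rolls below 50 land in the first variant and the rest in the second, so over many requests the distribution converges on the configured percentages.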
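Because the backend is stateless, per-user consistency is handled client-side. A minimal sketch, assuming a hypothetical `getVariant` helper: the response shape (`key`, `payload`) follows the docs above, but the base URL and exact request schema are placeholders.

```typescript
// Hypothetical response shape; `key` and `payload` follow the assign
// endpoint described above.
interface VariantAssignment {
  key: string;
  payload: Record<string, unknown>;
}

type AssignFn = (slug: string, userId: string) => Promise<VariantAssignment>;

// Default transport: POST /experiment/:slug/assign (base URL is a placeholder).
const httpAssign: AssignFn = async (slug, userId) => {
  const res = await fetch(`https://agenttest.example.com/experiment/${slug}/assign`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ userId }),
  });
  return (await res.json()) as VariantAssignment;
};

// Cache assignments per (slug, userId) so the same user keeps seeing the
// same variant for the lifetime of this process/session.
const assignmentCache = new Map<string, VariantAssignment>();

async function getVariant(
  slug: string,
  userId: string,
  assign: AssignFn = httpAssign,
): Promise<VariantAssignment> {
  const cacheKey = `${slug}:${userId}`;
  const cached = assignmentCache.get(cacheKey);
  if (cached) return cached;
  const assignment = await assign(slug, userId);
  assignmentCache.set(cacheKey, assignment);
  return assignment;
}
```

An in-memory `Map` gives session-level consistency in a single process; for consistency across sessions you would persist the same cache entry (e.g. in a cookie or key-value store) with an expiry matching the experiment’s lifetime.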
Result Logging
After a variant has been assigned and your AI agent or prompt workflow has executed using the variant’s `payload`, it’s crucial to log the outcome.
- Purpose: Logging captures the data needed to compare the performance of different variants. Without logs, you can’t measure which variant is better.
- Endpoint: This is done using the `POST /experiment/:slug/log` endpoint.
- What is Logged (a “Result”): A log entry, often referred to as a “Result,” typically stores:
  - The `variantKey` that was used.
  - The `input` provided to the agent/workflow (e.g., user query).
  - The `output` generated by the agent/workflow (e.g., LLM response).
  - Custom `metrics` you define (e.g., latency in milliseconds, token count, user rating, conversion event).
  - Any relevant `context` (e.g., session ID, environment).
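Putting those fields together, a log request might be built like this. The `ResultLog` shape and `buildLogRequest` helper are hypothetical; only the endpoint path comes from the documentation above.

```typescript
// Hypothetical "Result" shape mirroring the fields listed above; the exact
// request schema is an assumption.
interface ResultLog {
  variantKey: string;
  input: string;
  output: string;
  metrics?: Record<string, number>;
  context?: Record<string, unknown>;
}

// Build the request for POST /experiment/:slug/log; kept pure so it is easy
// to inspect without making a network call.
function buildLogRequest(slug: string, result: ResultLog) {
  return {
    url: `/experiment/${slug}/log`,
    method: "POST" as const,
    body: JSON.stringify(result),
  };
}

// Example usage after the agent has produced a response:
const request = buildLogRequest("greeting-prompt", {
  variantKey: "B",
  input: "How do I reset my password?",
  output: "You can reset it from the account settings page.",
  metrics: { latencyMs: 840, totalTokens: 212 },
  context: { sessionId: "sess-123", environment: "production" },
});
```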
- Importance: These logs form the dataset for your A/B test analysis. By collecting inputs, outputs, and metrics for each variant, you can compare their performance on key indicators, understand user interactions, and ultimately make data-driven decisions about which prompts, models, or configurations are most effective.
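As a small example of the analysis these logs enable, the sketch below averages one numeric metric per variant across logged results. The `LoggedResult` shape is an assumption based on the fields listed above.

```typescript
// Minimal logged-result shape for analysis; field names follow the log
// fields described above.
interface LoggedResult {
  variantKey: string;
  metrics: Record<string, number>;
}

// Average a single numeric metric (e.g. latencyMs) per variant.
function averageMetricByVariant(
  results: LoggedResult[],
  metric: string,
): Map<string, number> {
  const sums = new Map<string, { total: number; count: number }>();
  for (const r of results) {
    const value = r.metrics[metric];
    if (value === undefined) continue; // skip results missing this metric
    const entry = sums.get(r.variantKey) ?? { total: 0, count: 0 };
    entry.total += value;
    entry.count += 1;
    sums.set(r.variantKey, entry);
  }
  const averages = new Map<string, number>();
  for (const [key, { total, count }] of sums) {
    averages.set(key, total / count);
  }
  return averages;
}
```

In practice you would pair a comparison like this with a significance test before declaring a winning variant, but even simple per-variant aggregates make regressions in latency or cost visible quickly.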
