🦞 A/B Testing Your AI Agent Can Actually Use
Built-in experiments with declarative HTML variants, no SDK, and an API your agent can drive end-to-end. Create tests, check significance, ship winners.
You want to test whether “Start Free Trial” converts better than “Sign Up.” With Optimizely, that’s a dashboard, an SDK, a snippet, and a 15-minute setup. Then you forget to check the results for two weeks.
What if your AI agent could handle the whole thing?
A/B Testing, Agent-First
Agent Analytics has built-in experiments. The entire lifecycle — create, implement, monitor, decide — is API-driven. No dashboard UI needed. Your AI agent can run it end-to-end, or you can do it yourself with a few HTML attributes.
Here’s what makes it different:
- Declarative variants — add data-aa-experiment attributes to your HTML. No JavaScript. No SDK.
- Anti-flicker built in — two lines in <head>. Users see the correct variant instantly.
- Agent-driven lifecycle — create experiments via CLI, check results via API. Your agent monitors significance and recommends when to ship.
- Goal-based tracking — tie each experiment to a conversion event. “Does headline B lead to more signups?” — answered with data.
How It Works
Step 1: Create an experiment
Tell your agent:
“Create an A/B test on my-site for the signup CTA. Test ‘Sign Up’ vs ‘Start Free Trial’. Goal: signup event.”
Or do it yourself:
npx @agent-analytics/cli experiments create my-site \
--name signup_cta --variants control,new_cta --goal signup
Step 2: Add variants to your HTML
This is the good part. No JavaScript SDK. Just HTML attributes:
<h1 data-aa-experiment="signup_cta"
data-aa-variant-new_cta="Start Free Trial">
Sign Up
</h1>
The original content is the control. The data-aa-variant-new_cta attribute defines what the variant shows instead. The tracker handles assignment and rendering automatically.
Want to test multiple elements together? Same experiment name = same variant:
<h1 data-aa-experiment="hero_test"
data-aa-variant-b="Ship Faster With AI">
Build Better Products
</h1>
<p data-aa-experiment="hero_test"
data-aa-variant-b="Your agent handles analytics while you code">
Track what matters across all your projects
</p>
Want three variants? Add more attributes:
<h1 data-aa-experiment="cta_test"
data-aa-variant-b="Try it free"
data-aa-variant-c="Get started now">
Sign up today
</h1>
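Conceptually, the tracker scans the page for these attributes, picks one variant per experiment, and swaps in that variant’s content. A rough sketch of the idea (illustrative only; the real tracker also handles sticky assignment and exposure tracking for you):
function applyVariants(assignVariant) {
  document.querySelectorAll('[data-aa-experiment]').forEach(function (el) {
    var experiment = el.getAttribute('data-aa-experiment');

    // The element's original content is the implicit control;
    // each data-aa-variant-* attribute names an alternative.
    var variants = ['control'];
    Array.prototype.forEach.call(el.attributes, function (attr) {
      if (attr.name.indexOf('data-aa-variant-') === 0) {
        variants.push(attr.name.slice('data-aa-variant-'.length));
      }
    });

    // A deterministic assignment keyed by experiment name, so every
    // element sharing the same experiment shows the same variant.
    var chosen = assignVariant(experiment, variants);
    if (chosen !== 'control') {
      el.textContent = el.getAttribute('data-aa-variant-' + chosen);
    }
  });

  // Then reveal the page (the aa-loading class from Step 3 below).
  document.documentElement.classList.remove('aa-loading');
}

// e.g. applyVariants(function (name, variants) { return variants[1] || 'control'; });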
Step 3: Prevent flicker
Add this in <head> before the tracker loads — it hides experiment elements until the variant is applied:
<style>
.aa-loading [data-aa-experiment] {
visibility: hidden !important;
}
</style>
<script>
document.documentElement.classList.add('aa-loading');
setTimeout(function(){
document.documentElement.classList.remove('aa-loading');
}, 3000);
</script>
The tracker removes aa-loading after applying variants. The 3-second timeout is a safety fallback — if the tracker doesn’t load, content becomes visible anyway. No broken pages.
Step 4: Track your goal
Track the conversion event the same way you track any event:
<a href="/signup"
onclick="window.aa?.track('signup', {method: 'email'})">
Sign Up
</a>
The experiment automatically ties exposures to goal completions. An $experiment_exposure event is tracked once per session — you don’t need to do anything extra.
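Prefer not to use inline handlers? The same documented track call can be attached from a script instead:
// Equivalent to the inline onclick above, wired up with an event listener.
document.querySelector('a[href="/signup"]')?.addEventListener('click', function () {
  window.aa?.track('signup', { method: 'email' });
});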
Step 5: Check results
Ask your agent:
“How’s the signup CTA experiment doing?”
Or check directly:
npx @agent-analytics/cli experiments get exp_abc123
You get back probability_best, lift, and a recommendation. The system uses Bayesian statistics — it needs about 100 exposures per variant before results are meaningful.
signup_cta experiment results:
control — 2.1% conversion | baseline
new_cta — 3.8% conversion | +81% lift | 96% probability best
Recommendation: Ship new_cta ✅
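Curious where probability_best comes from? One standard Bayesian approach puts a Beta posterior over each variant’s conversion rate and estimates, by sampling, how often the variant beats control. A rough sketch of that math (illustrative only, not necessarily the exact model Agent Analytics uses):
// Standard normal sample (Box-Muller).
function randn() {
  var u = 1 - Math.random(), v = Math.random();
  return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}

// Gamma(shape, 1) sample (Marsaglia-Tsang), used to build Beta samples.
function randGamma(shape) {
  if (shape < 1) return randGamma(shape + 1) * Math.pow(Math.random(), 1 / shape);
  var d = shape - 1 / 3, c = 1 / Math.sqrt(9 * d);
  while (true) {
    var x, v;
    do { x = randn(); v = 1 + c * x; } while (v <= 0);
    v = v * v * v;
    if (Math.log(Math.random()) < 0.5 * x * x + d * (1 - v + Math.log(v))) return d * v;
  }
}

// Beta(a, b) sample via two Gammas.
function randBeta(a, b) {
  var x = randGamma(a);
  return x / (x + randGamma(b));
}

// P(variant beats control) from raw counts, using Beta(1 + conversions,
// 1 + non-conversions) posteriors for each arm.
function probabilityBest(control, variant, draws) {
  draws = draws || 20000;
  var wins = 0;
  for (var i = 0; i < draws; i++) {
    var pC = randBeta(1 + control.conversions, 1 + control.exposures - control.conversions);
    var pV = randBeta(1 + variant.conversions, 1 + variant.exposures - variant.conversions);
    if (pV > pC) wins++;
  }
  return wins / draws;
}

// Example: 2.1% vs 3.8% over 1,000 exposures per arm lands around 0.98.
probabilityBest({ exposures: 1000, conversions: 21 },
                { exposures: 1000, conversions: 38 });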
The Agent Growth Loop
Here’s where it gets powerful. With an AI agent running your experiments:
- Create — agent creates an experiment based on a hypothesis
- Monitor — agent checks significance daily via API
- Decide — when significance hits threshold, agent recommends shipping the winner
- Repeat — agent suggests the next test based on results
Your agent can automate this entire cycle with OpenClaw’s scheduling. Set it up once, and your conversion rate improves while you sleep.
“The hero_headline experiment on saas-app hit 96% significance. Variant B converts at 3.8% vs control’s 2.1%. Recommend: Ship variant B.”
That’s your agent, reporting in chat at 9am. You approve, it ships the winner, and starts the next test.
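A minimal version of that daily monitor step could be as simple as the script below. It shells out to the documented experiments get command and assumes the output resembles the Step 5 summary; treat the ID, threshold, and parsing as placeholders for your setup.
// monitor-experiments.js: run daily via cron or OpenClaw's scheduler.
const { execSync } = require('child_process');

const EXPERIMENT_ID = 'exp_abc123';  // the experiment ID used in Step 5
const THRESHOLD = 0.95;              // ship once probability_best clears this

const output = execSync(
  'npx @agent-analytics/cli experiments get ' + EXPERIMENT_ID,
  { encoding: 'utf8' }
);

// Assumption: the summary contains a line like "96% probability best".
const match = output.match(/(\d+(?:\.\d+)?)%\s*probability best/i);
const probabilityBest = match ? Number(match[1]) / 100 : null;

if (probabilityBest !== null && probabilityBest >= THRESHOLD) {
  console.log('Significance reached (' + match[1] + '%). Recommend shipping the winner.');
} else {
  console.log('Not significant yet. Checking again tomorrow.\n' + output);
}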
Why Not Optimizely / VWO / Google Optimize?
Those tools are built for marketing teams clicking through dashboards. They work great — if you have someone checking them.
| | Traditional A/B Tools | Agent Analytics |
|---|---|---|
| Setup | SDK + snippet + dashboard config | HTML attributes |
| Create test | Dashboard UI, multiple steps | One CLI command or API call |
| Check results | Log in, navigate to experiment | Your agent checks via API |
| Ship winner | Manual — someone has to remember | Agent recommends, you approve |
| Cost | $50-500+/month | Built into Agent Analytics |
For Complex Cases
If you need more than text swaps — changing layouts, showing different components, or applying conditional logic — use the programmatic API:
var variant = window.aa?.experiment('signup_cta', ['control', 'new_cta']);
if (variant === 'new_cta') {
document.querySelector('.cta').textContent = 'Start Free Trial';
document.querySelector('.hero-img').src = '/trial-hero.png';
}
The experiment() function is deterministic — the same user always gets the same variant. The inline variant list doubles as a fallback, so the call works even if the config endpoint hasn’t loaded yet.
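Deterministic assignment like this is usually done by hashing a stable visitor id together with the experiment name into a bucket. A sketch of the general idea (not necessarily how Agent Analytics implements it):
// FNV-1a hash of "visitorId:experimentName", mapped onto the variant list.
function pickVariant(visitorId, experimentName, variants) {
  var key = visitorId + ':' + experimentName;
  var hash = 0x811c9dc5;
  for (var i = 0; i < key.length; i++) {
    hash ^= key.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return variants[hash % variants.length];
}

// The same visitor and experiment always map to the same variant:
pickVariant('visitor_42', 'signup_cta', ['control', 'new_cta']);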
Get Started
- Sign up at agentanalytics.sh — experiments are included
- Read the docs — A/B testing reference
- Install the skill — ask your agent to install from ClawHub
- Self-host — it’s open source on GitHub
Previously: Set Up Agent Analytics with OpenClaw (5 Minutes)