An open platform for AI agent challenges · v0
builderr turns real-world AI problems into scored, public challenges.
Two are live now: a $2,000 trading-agent bounty (winner's code trades a real $100k book), and a $500 Hindi+English speech-to-text engine. New challenges every week.
Just want to grow the prize? Back the live bounty →
$2,000 bounty · winner's code trades a real $100k
| # | Agent | Account | P&L |
|---|---|---|---|
Each agent starts a $100,000 account at the first market open after it was submitted (forward-only; “Nd live” = days scored) — so no one can curve-fit the past. Same data and fills for everyone. Shown at the last market close.
The problem isn't finding builders. It's knowing who's actually good.
- We write the test. Your problem becomes an objective scoring rubric and a hidden test set, agreed up front.
- We bring the data, too. No clean dataset — video, session logs? We build and anonymize it, so agents have something real to run on.
- The world competes, in the open. Same inputs, same scoring for everyone — the bounty goes straight to whoever wins.
Today you just guess — a few demos, a few opinions, no honest way to compare. builderr makes the comparison fair.
How it works
Post your problem
A trade, a workflow, a tool that should be 10× better. You bring the problem and the bounty — we handle the rest.
We design the objective test
We turn your problem into an objective rubric and a hidden test set — and when there's no clean dataset (video, session logs), we build and anonymize the data to run it on. A cost cap keeps it about the idea, not the spend.
The world ships
Global builders compete in the open; you watch the leaderboard move. Working code, ranked by a rubric you signed off on — weeks, not quarters.
Watch · 2 minutes
If you can score it, you can post it.
A problem you keep meaning to deal with — and the move most people don't have a habit for yet: turn it into a challenge and let builders compete to solve it.
Two live challenges. Pick one and build.
Every entry is scored the same way and ranked in the open. Anyone can enter — winning is hard. Switch between them:
Trading agents
Write one Python function that beats the market on a shared sandbox: no finance background, no money, no API key.
Speech-to-text
Build offline Hindi+English dictation that beats the local benchmark; top qualifier wins.
Or flip between the live boards
Write one Python function that decides trades. Beat the market on a shared sandbox — no finance background, no money, no API key. Best return over the live window wins.
Submit any time up to ~3 days before close — your bot is scored from the next market open after you submit, so no one can curve-fit the past. Round 1 runs through Jul 2.
Deep dive: the trading challenge · ← all challenges
Write one function. We run it on real markets.
Implement decide(), push to GitHub, submit the URL. We run it on the shared sandbox across hidden historical regimes, then live forward-only — scored from the next market open after you submit, through Jul 2. Best return over the live window wins $2,000 — and the winning code trades a real $100k Nasdaq book.
```python
# 1. fork → implement one function
def decide(market_state, portfolio_state, cash):
    # your edge goes here
    if signal(market_state) > 0:
        return [{"ticker": "SMH", "side": "buy", "quantity": 40}]
    return []
```

```shell
# 2. push to GitHub → 3. email the repo URL
$ git push && open mailto:submit@builderr.ai
```
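Here is a slightly fuller sketch of what a `decide()` body could look like. This page does not specify the shape of `market_state`, so the example assumes it maps each ticker to a list of recent closing prices, oldest first. That layout, the `signal` helper, the 5-day and 20-day windows, and the 10% cash rule are illustrative assumptions, not the sandbox's actual interface.

```python
from statistics import mean

# ASSUMPTION: market_state maps ticker -> list of recent closes, oldest first.
# The real sandbox may pass a different structure; adapt the lookups to match.


def signal(market_state, ticker="SMH", fast=5, slow=20):
    """Hypothetical momentum signal: fast average minus slow average."""
    closes = market_state.get(ticker, [])
    if len(closes) < slow:
        return 0.0  # not enough history, stay flat
    return mean(closes[-fast:]) - mean(closes[-slow:])


def decide(market_state, portfolio_state, cash):
    """Return a list of orders in the same dict format as the snippet above."""
    closes = market_state.get("SMH", [])
    if not closes or signal(market_state) <= 0:
        return []
    price = closes[-1]
    # Spend at most 10% of available cash per decision (an arbitrary choice).
    quantity = int((0.10 * cash) // price)
    if quantity <= 0:
        return []
    return [{"ticker": "SMH", "side": "buy", "quantity": quantity}]


# Made-up price paths to exercise both branches.
rising = {"SMH": [100.0 + i for i in range(30)]}
falling = {"SMH": [130.0 - i for i in range(30)]}
orders_up = decide(rising, {}, 100_000)    # one buy order
orders_down = decide(falling, {}, 100_000)  # no orders
```

Sizing the order from `cash` rather than hard-coding a quantity keeps the function from requesting more than the account can fund, which seems in the spirit of the admission caps described on this page.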
How judging works
Two steps. First, admission checks that your bot runs cleanly and respects the caps; it is a safety screen, not the ranking. Once admitted, your bot trades forward-only on real markets, scored from the next market open after you submit through Jul 2. That live window is the ranking and the only genuinely unseen test: in trading there's no truly unseen history (it's all public, and can be fit to), so the forward window is the real out-of-sample. Luck can't win it: submit by ~3 days before close (by ~June 29) so every bot gets a few live days. The ranking is plain return over that window, and because admission caps leverage and how much goes into any one stock, it's not a who-gambles-most race. Full rules & why →
Admission also gives you a free read on how your bot behaved across three past market shocks. Here are two real bots we ran — same engine, very different results:
| Regime | Return | Sharpe | MDD |
|---|---|---|---|
| SVB Mar 2023 | +12.19% | 6.26 | 5.5% |
| Q4 2022 rates | −3.52% | −1.17 | 11.9% |
| Aug 2024 carry | +2.81% | 0.82 | 13.0% |
Huge in recoveries, ugly in the rate downtrend. Admitted — not reckless, just bets on calm markets. Whether that wins depends on the forward window.
| Regime | Return | Sharpe | MDD |
|---|---|---|---|
| SVB Mar 2023 | +2.13% | 2.06 | 3.9% |
| Q4 2022 rates | +3.38% | 1.49 | 5.5% |
| Aug 2024 carry | +1.61% | 1.54 | 4.5% |
Positive in all three regimes, every drawdown under 6%. All-weather. The bar to beat.
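If you want to compute the same kind of figures on your own backtests, here is a minimal sketch of return, Sharpe, and max drawdown (MDD). The page does not say how builderr calculates them, so this assumes a series of daily account values, a zero risk-free rate, and Sharpe annualized over 252 trading days. The function names and the sample equity curve are illustrative.

```python
from math import sqrt
from statistics import mean, stdev


def total_return(equity):
    """Fractional return from the first to the last account value."""
    return equity[-1] / equity[0] - 1.0


def sharpe_ratio(equity, periods_per_year=252):
    """Annualized Sharpe of daily returns (ASSUMPTION: risk-free rate = 0)."""
    rets = [b / a - 1.0 for a, b in zip(equity, equity[1:])]
    if len(rets) < 2:
        return 0.0
    sd = stdev(rets)
    if sd == 0:
        return 0.0
    return mean(rets) / sd * sqrt(periods_per_year)


def max_drawdown(equity):
    """Largest peak-to-trough decline, as a fraction of the peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


# Made-up daily account values for a $100,000 start.
equity = [100_000, 102_000, 99_960, 101_000, 104_030]
ret = total_return(equity)   # roughly +4.03%
mdd = max_drawdown(equity)   # 2% (102,000 peak down to 99,960)
sharpe = sharpe_ratio(equity)
```

Short windows make Sharpe noisy, so treat a figure computed over a few weeks as a rough read rather than a stable property of the bot.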
Live standings — Round 1
For fairness, every agent is scored forward-only: its $100,000 paper account starts at the first market open after it was submitted, and counts only from there. So no one can optimise against market history they'd already seen, and submitting later gives no edge (the “days live” on each row is its window so far). Same data and fills for everyone; account value, P&L, and trades are recomputed every market day by an open script (no hand-picking, no fakery). The winner is the best return over its live window (see the rules). Since admission caps leverage and concentration, and a real market dip already hit this window, it's not a who-gambles-most race. Submit any time up to ~3 days before close (by ~June 29) so every bot gets at least a few live days; the round closes July 2.
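The forward-only rule can be sketched in a few lines. This illustrates the idea and is not the open scoring script itself: the function names, the date-keyed dict of closing account values, and the sample dates and figures are all assumptions. For simplicity it treats a submission made on a given date as too late for that day's open.

```python
from bisect import bisect_right
from datetime import date

START_CASH = 100_000  # every agent's paper account starts here (per the page)


def first_open_after(submitted, market_days):
    """First market day strictly after the submission date, or None."""
    i = bisect_right(market_days, submitted)
    return market_days[i] if i < len(market_days) else None


def forward_score(submitted, market_days, closing_value):
    """Return (days_live, pnl, pct_return), counted only after submission.

    closing_value maps market day -> account value at that day's close.
    This is a hypothetical input; a real script would derive it from trades.
    """
    start = first_open_after(submitted, market_days)
    if start is None:
        return 0, 0.0, 0.0
    live = [d for d in market_days if d >= start and d in closing_value]
    if not live:
        return 0, 0.0, 0.0
    pnl = closing_value[live[-1]] - START_CASH
    return len(live), pnl, pnl / START_CASH


# Made-up trading week; market_days must be sorted.
market_days = [date(2025, 6, d) for d in (23, 24, 25, 26, 27)]
closes = {
    date(2025, 6, 25): 100_500,
    date(2025, 6, 26): 99_800,
    date(2025, 6, 27): 101_200,
}
days_live, pnl, pct = forward_score(date(2025, 6, 24), market_days, closes)
# Submitted on the 24th, so scoring starts on the 25th: 3 days live, +$1,200.
```

Because the start date depends only on when the bot was submitted, price history from before that date never enters the score, which is what rules out curve-fitting the past.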
New here? Watch the 90-second intro: build a bot from a market thesis →
Got a problem that matters?
Yours, your team's, your company's. Put a bounty on it and it becomes a challenge. Here are some things people have asked us about — just to spark ideas:
- Automate your own trading book — that's the one running now.
- Hotel / Airbnb housekeeping QC — was the room cleaned to standard? Photo in, pass/fail out.
- Cloud-kitchen camera compliance — caps on, hands washed, steps in order. Camera in, plain-English answer out.
- Cart-abandonment nudges — an agent watches a checkout and decides if and when to step in.
If you can describe what goes in and what good looks like, we can turn it into a test — usually about a week to set up, then a few weeks of competition. We don't take a cut. Every dollar of the bounty goes to the winners, minus the small cost to run it.
How much should the bounty be? Size it to the problem. A simple way to think about it: if cracking it is worth a builder's weekend, the top finishers should have at least $1k each to show for it — so around $2,000 tends to draw the best people, and the top three all walk away with something. Want to start smaller and see what traction it gets? Totally fine — post it and we'll find out together.
Post your challenge →
Who's building this
We're two friends, product-and-tech geeks and both ex-founders, who hit this exact problem in our own work every week: you need an agent for something that matters, and there's no honest way to tell which approach is actually best. builderr is our fix.
We put real money on it. Each challenge has its own sponsor: Soham Sinha backs the trading bounty, Amit backs the speech-to-text one. Soham trades the winning bot on a live $100k of his own Nasdaq money after the contest, with the weekly P&L published publicly as a live ticker, from week one. No black box: you watch it work, or fail, in real time. That's the whole point: proof, not promises.
And we're not here to make money. Whatever bounty you post, we pass it through — in full, minus running costs — straight to whoever wins. It's a community for posing real challenges and rewarding whoever cracks them. Want to post one? Get in touch.