How are trading agents judged on builderr?

Two steps: (1) Hidden validation checks — a quick safety check that your bot runs cleanly and respects the caps across three past market periods. This only qualifies the bot; it is not the score. (2) Live market scoring — a $100k paper account, scored from the entrant's first scored market session, ranked by return over that entrant's live window. Live market data decides the leaderboard.

Do I need to know finance or be a quant to enter?

No. A solid coding background and a few hours is enough. The getting-started guide gives you working strategy templates you can copy and adapt, plus four reference bots that already clear admission.

How is the $1,000 trading prize split, and does builderr take a cut?

Round 2 is split 60/25/15 among the top three entrants who beat Arnav: $600 / $250 / $150. If backers grow the pool, the extra splits in that same ratio. builderr takes nothing — the whole bounty goes to the winners, minus the small cost of running the servers.

Is the $100,000 real, and is it safe to let a stranger's bot trade it?

Yes, it is real. The winning bot goes through a safety pipeline first: freeze the exact commit, human code review, a paper-trading burn-in, a hard risk overlay (max position, max daily loss, leverage cap), and a kill switch that auto-flattens. It ramps to the full $100k as it proves out, with weekly P&L posted publicly. This is a public experiment, not investment advice.

Do I have to make my code public to submit?

No. You can submit a public repo, a private repo via a read-only deploy key (we can clone but never push, and you delete the key after), or run in endpoint mode where your code never leaves your server. You keep ownership either way.

Rules & FAQ

Q: Can you make money building AI agents?

Yes. builderr runs public bounties. The current AI trading-agent round has a $1,000 prize pool split 60/25/15 among the top three entrants who beat Arnav, the Round 1 winner ($600 / $250 / $150), and the winner's code also trades a real $100,000 Nasdaq account. It is free to enter and open to anyone, anywhere.

Q: Where can I enter an AI trading bot competition, and is it free?

Yes, it is free. builderr is running Trading Round 2 now. Build a bot, submit one Python function, and compete for a $1,000 bounty by beating Arnav, the Round 1 winner. No entry fee, no brokerage login, no API key. Start at builderr.ai/trading-v0.

Q: Who can enter the builderr challenge?

Anyone, anywhere — solo or as a pair. The only catch: if you happen to know the exact hidden validation windows for trading, you can still play but you are exhibition-only, with no prize.

The whole thing in plain terms — what you do, how you're judged, and why each rule exists. If something here feels arbitrary, email us; we'd rather explain it than have you wonder.

How every builderr challenge works

Every challenge runs on the same fair-play rules, no matter the task:

Hidden tests, unless the challenge says otherwise. Each challenge states the scoring surface up front. Trading is the exception: hidden validation checks qualify bots, then live market data decides the score.
A closed rubric, agreed up front. One objective score, written down before the round and not moved while it's running.
A cost cap. A fixed compute box, so the winner is the best idea, not the biggest bill.
Same inputs, same scoring for everyone. Identical data and execution for every entry.
We don't take a cut. The whole bounty goes to the winners, minus the small cost to run it. If nothing clears the bar, the prize rolls over — we never reward noise.
No arbitrary bans. We only ever flag clear bad faith, we tell you why first, and you get to respond.

Below is the per-challenge scoring for each live challenge: trading agents, local dictation, and kitchen CCTV.

Builder points

Builder profiles compare performance across different challenges. Each competition can award at most 100 points total, without replacing the challenge-specific prize ranking.

Qualifier rule: you must beat the published benchmark for that competition.
Top-10 split: qualified entries receive 30 / 20 / 15 / 10 / 8 / 6 / 4 / 3 / 2 / 2 points.
Max 100: unused points are not awarded if fewer than 10 entries beat the benchmark.
Points make different challenge types comparable over time. They do not move the live leaderboard or change who wins a specific bounty.
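A minimal sketch of the points rule above, under a few assumptions that the rules do not spell out: scores are "higher is better", "beat" means strictly greater than the benchmark, and ties are broken alphabetically. The function name and the input shape are made up for illustration.

```python
POINTS = [30, 20, 15, 10, 8, 6, 4, 3, 2, 2]  # top-10 split from the rules above

def builder_points(scores, benchmark):
    """Return {entrant: points} for one competition.

    Assumptions for this sketch: `scores` maps entrant name to a score where
    higher is better, "beat" means strictly greater than the benchmark, and
    ties are broken alphabetically.
    """
    qualified = [(name, s) for name, s in scores.items() if s > benchmark]
    qualified.sort(key=lambda item: (-item[1], item[0]))
    # zip stops at the shorter list, so unused points are simply not awarded
    return {name: pts for (name, _), pts in zip(qualified, POINTS)}

example = builder_points({"ana": 0.12, "bo": 0.05, "cy": 0.20, "di": 0.07}, benchmark=0.06)
print(example)  # {'cy': 30, 'ana': 20, 'di': 15}
```

Here only three entries beat the benchmark, so 65 of the 100 points are handed out and the rest go unawarded.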

Trading agents — scoring & rules

What this is

A trading-bot competition. You write a bot, we run it on real market data, and Round 2 is a fresh benchmark challenge: beat Arnav, the Round 1 winner. The $1,000 prize pool is split 60/25/15 among the top 3 qualifying entrants ($600 / $250 / $150), and if backers grow the pool, the extra splits in that same ratio. The winner's code also runs on a real $100k Nasdaq account after the contest, with the weekly P&L posted publicly via a live ticker.

Brand new to this? Start at the getting-started guide — it has 4 working example bots you can copy and beat.

How you're judged (2 steps, and why)

Qualification — hidden validation checks

The moment you submit, we run your bot against 3 hidden past market periods (a crash, a slow decline, a vol spike). If it runs cleanly and respects the caps (no crazy leverage, no >50% blow-up), you're in. This only qualifies the bot; it is not your score.

Why: We deliberately don't judge your skill here. Past windows are public data anyone can fit to, so judging skill on them would just reward bots tuned to history. Qualification only keeps out the reckless and the broken.
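To make the ">50% blow-up" idea concrete, here is a rough sketch that reads it as a peak-to-trough drawdown of more than 50% on the account value. That reading is an assumption, and so are the function names; the real admission check may measure it differently.

```python
def max_drawdown(equity):
    """Largest peak-to-trough drop, as a fraction of the peak.

    `equity` is a list of account values in time order (assumed input shape).
    """
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst

def passes_blowup_check(equity, limit=0.50):
    # Assumption: a drawdown of exactly 50% still passes; only ">50%" fails.
    return max_drawdown(equity) <= limit

calm = [100_000, 90_000, 95_000]
wild = [100_000, 120_000, 54_000, 80_000]
print(passes_blowup_check(calm), passes_blowup_check(wild))  # True False
```

Note that the second account ends at $80k, down only 20% from the start, but it fell 55% from its peak along the way, which is what this reading of the rule catches.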

Live market scoring — the actual ranking

Every qualified bot trades a $100k paper account forward-only — starting at its first scored market session, scored only from there, same data, same fills. Ranked by return over that live window.

Why: In trading there's no truly unseen history — it's all public and can be fit to, so a historical holdout would not be a real test. Live market data is the part nobody can know in advance. Luck is handled by qualification caps on leverage and concentration, real market moves inside the live window, and the submit-by cutoff.
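A small sketch of "ranked by return over that live window", assuming plain (not annualized) return from each bot's first scored session to its latest one. The data layout and names are hypothetical.

```python
def live_return(equity):
    """Simple return over the live window: last value over first value, minus 1."""
    return equity[-1] / equity[0] - 1.0

def rank_by_return(curves):
    """Return bot names ordered from best to worst live return.

    `curves` maps bot name to its daily paper-account values, starting at its
    first scored market session (assumed input shape). Bots that entered later
    just have shorter lists.
    """
    return sorted(curves, key=lambda name: live_return(curves[name]), reverse=True)

curves = {
    "early_bot": [100_000, 101_000, 103_000],
    "bumpy_bot": [100_000, 99_000, 104_000],
    "late_bot": [100_000, 100_500],
}
print(rank_by_return(curves))  # ['bumpy_bot', 'early_bot', 'late_bot']
```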

The rules, plainly — timeline, who can enter, what gets you DQ'd

You're treating this like real money, so here's the unambiguous version. Short story if you're worried about getting banned: don't time-travel, respect the caps, and don't be malicious — everything else is fair game.

Timeline — when it opens and closes

Round 2 runs Jul 7 – Sep 4, 2026. The board updates from the latest committed market run. Your bot is scored from its first scored market session after your submission email lands, so no one can fit the past. Newer bots simply have fewer live days.
Why forward-only: if a bot were scored over days that had already happened, its author could optimise against history they'd already seen. So every bot counts only from its first scored market session — no one can fit the past, and entering late gives no edge. Newer bots simply have fewer live days; the winner is the best return over its live window, and admission's caps plus the submit-by cutoff mean a short lucky streak can't win.
Benchmark: Arnav starts from the same Round 2 opening window and is the line to beat for prize money and builder points.
Throughout, a daily leaderboard. At the close, winners are announced from the live market standings. Miss a round? The next one follows.

Who can enter

Anyone, anywhere — solo or as a pair.
The one catch: if you happen to know the exact hidden validation windows, you can still play but you're exhibition-only (no prize). It wouldn't be fair otherwise.

What gets you disqualified — the whole list

Lookahead / time-travel — using data your bot couldn't have had live. Caught by code review of the top entries. Instant DQ.
Breaking the hard caps — gross leverage over 1.5× or a single position over 30% for more than 5 days. The engine auto-flattens and you're out.
A >50% blow-up in admission — you simply won't be admitted.
Malicious or abusive code, or gaming with multiple identities.
Clear bad faith we didn't list. The organizers keep the right to disqualify obvious attempts to exploit, game, or undermine the competition that aren't covered above — judged case by case. It's a safety valve for bad actors, used narrowly and in good faith, never to second-guess an honest bot.
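The hard caps in the list above can be sketched roughly like this. The wording leaves room for interpretation, so this sketch picks one reading and labels it: the 1.5× gross leverage cap applies on any single day, while the 30% position cap is only breached after more than 5 consecutive days over the line. All names and the snapshot format are assumptions, not the engine's actual code.

```python
def gross_leverage(positions, equity):
    """Sum of absolute dollar exposures divided by account equity."""
    return sum(abs(v) for v in positions.values()) / equity

def largest_weight(positions, equity):
    """Biggest single position as a fraction of equity."""
    return max((abs(v) for v in positions.values()), default=0.0) / equity

def breaches_caps(days, max_leverage=1.5, max_weight=0.30, grace_days=5):
    """`days` is a list of (positions, equity) snapshots, one per trading day.

    Assumed reading: leverage over the cap fails immediately; a position over
    30% fails only once it has lasted more than `grace_days` days in a row.
    The streak counts any oversized position, not one specific ticker.
    """
    streak = 0
    for positions, equity in days:
        if gross_leverage(positions, equity) > max_leverage:
            return True
        streak = streak + 1 if largest_weight(positions, equity) > max_weight else 0
        if streak > grace_days:
            return True
    return False

heavy_day = ({"AAPL": 35_000, "MSFT": 20_000}, 100_000)  # AAPL is 35% of equity
levered_day = ({"AAPL": 90_000, "MSFT": -70_000}, 100_000)  # gross leverage 1.6
print(breaches_caps([heavy_day] * 5))  # False
print(breaches_caps([heavy_day] * 6))  # True
print(breaches_caps([levered_day]))    # True
```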

What does NOT get you disqualified

Losing, or finishing low. A losing bot is welcome — that's how you learn.
A “fair-weather” bot that's soft in a crash. Admission isn't a skill gate; the live market score decides.
Using legitimate external data — news, prices, your own models.
A dead-simple bot.
Revising up to 4 times before the cutoff.
Changing your whole strategy between rounds.

Our promise: no arbitrary bans

The discretion above is for clear bad faith only — never a whim, and never to penalize an honest, losing, or simple bot. If we ever flag your entry, we'll tell you exactly why and give you a chance to respond before any final call. No silent bans, and we never move the goalposts on a round that's already running.

Why the rules are what they are

Can't someone win by betting big and getting lucky?

No, and that's by design — but not through a fancy metric. Admission caps how much you can borrow (1.5x gross) and how much rides on any one stock (under 30%), so no one can put the whole account on a single coin-flip. The ranking is plain return over your live forward window — and that window already included a real market dip, so a bot that only works in calm markets gets caught. The submit-by cutoff also guarantees every bot a minimum number of live days, so a one-day spike can't win.

Why the leverage and position-size caps?

So everyone plays the same game. Without a cap, the winner is just whoever bet the most — that's a casino, not a test of skill. Caps keep it about the strategy.

Why let bots use the open internet?

Real trading bots use real-world data — news, prices, whatever. Pretending otherwise would be fake. The only hard rule is no time-travel (see lookahead, below).

Why the trade limits and minimum hold — and what this is NOT?

This is a test of strategy, not a speed or spending race. A 60-second minimum hold, a cap of 50 trades/day, and a fixed compute box mean your servers, colocation, and reaction time don't matter — everyone gets the same fills. It is NOT high-frequency trading, not tick-scalping, not 'whoever trades most or spends most on infra wins.' Decisions are daily-resolution; a calm, robust strategy that survives bad weeks is exactly what the score rewards. Build for the idea, not the machine.
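A rough sketch of how the two trade limits above could be checked before an order goes out. It assumes "minimum hold" means at least 60 seconds since the last trade in the same ticker, and that the 50-trade cap counts all of a bot's trades in one day. Both readings, and every name below, are assumptions for illustration.

```python
from datetime import datetime, timedelta

MIN_HOLD = timedelta(seconds=60)   # minimum hold from the rules
MAX_TRADES_PER_DAY = 50            # daily trade cap from the rules

def order_allowed(now, ticker, last_trade_time, trades_today):
    """Return True if a new order in `ticker` respects both limits.

    `last_trade_time` maps ticker to the datetime of its most recent trade,
    and `trades_today` is the count of trades already made today
    (both are assumed bookkeeping, not a real engine API).
    """
    if trades_today >= MAX_TRADES_PER_DAY:
        return False
    previous = last_trade_time.get(ticker)
    return previous is None or now - previous >= MIN_HOLD

t0 = datetime(2026, 7, 7, 14, 30, 0)
last = {"AAPL": t0}
print(order_allowed(t0 + timedelta(seconds=30), "AAPL", last, trades_today=3))  # False
print(order_allowed(t0 + timedelta(seconds=60), "AAPL", last, trades_today=3))  # True
```

Since decisions are daily-resolution, a typical bot never gets close to either limit; the check mostly matters for bots that would otherwise churn.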

Local dictation — scoring & rules

How you win the local dictation challenge

Build a local, private, multilingual Wispr Flow-style dictation tool. Round 1 scores Hindi+English on an offline laptop. $500 to the winner. Round 1 runs Jun 18 – Aug 2, 2026.

Plain English should work. The leading submissions already beat the open-source baselines we tested on sample clips.
Hindi+English is the gate. Beat RambleFix on the hidden run: write what was actually said instead of translating everything to English.
Final only: 70 + 30. Final meaning and facts are worth 70 points; time from stopping to the final paste is worth 30. Early drafts are not scored.
Offline only. Scored with the network off, on a fixed machine — no cloud, no API keys.
Shippable. Only commercial-friendly model licenses, so the winning tool can be released for free.

Every entry is shown on the board. Beat the benchmark and you're highlighted as a qualifier; the $500 goes to the top qualifier. If no entry clears the bar, no prize is awarded — it rolls to the next round. Full benchmark table and rules are on the challenge page.

Kitchen CCTV monitor — scoring & rules

What this is

A live $300 challenge for agents that answer hidden questions from public fixed-camera kitchen or restaurant videos. Round 1 runs Jul 12 – Sep 9, 2026. The practical goal: help a restaurant or cloud kitchen use existing CCTV to check hygiene, handoff flow, sealing steps, bottlenecks, and what the camera could not show.

Public sample videos and sample questions show the shape. Scoring uses hidden questions and an answer key, so entrants cannot hard-code the visible examples.

How you're judged

80 points - answer accuracy: hidden yes/no, multiple choice, counts, visible states, workflow events, timestamps, durations, and not-visible cases.
15 points - cost efficiency: the run must stay inside the hard runtime, frame, and model/API budget; lower cost earns credit only when accuracy holds.
5 points - reproducibility: one-command run, logged frame sampling/model calls, deterministic or near-deterministic outputs.

Timestamp questions get full credit within 2 seconds and partial credit within 5 seconds, unless the final rubric marks the event as a longer span.
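The timestamp rule can be sketched as a simple tolerance check. The size of the partial credit is not stated, so the 0.5 below is a placeholder assumption, as is treating "within" as inclusive. Events the rubric marks as longer spans are not handled here.

```python
def timestamp_credit(predicted_s, truth_s, partial=0.5):
    """Credit for one timestamp answer, in the range 0.0 to 1.0.

    Full credit within 2 seconds and partial credit within 5 seconds, per the
    rule above. `partial=0.5` is an assumed value, not from the rubric.
    """
    error = abs(predicted_s - truth_s)
    if error <= 2.0:
        return 1.0
    if error <= 5.0:
        return partial
    return 0.0

print(timestamp_credit(63.0, 61.5))  # 1.0
print(timestamp_credit(65.0, 61.0))  # 0.5
print(timestamp_credit(70.0, 61.0))  # 0.0
```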

Budget and baseline

Hard budget: 25 minutes wall-clock for the full eval, $0.30 estimated model/API cost per 60 minutes of source video, and about 1,500 sampled frames per 60 minutes or an equivalent frame budget stated by the evaluator.
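As a quick worked example of the budget arithmetic, the sketch below scales the per-60-minute limits linearly with video length. Linear scaling is an assumption (the evaluator may state an equivalent frame budget instead), and the function name is made up.

```python
def eval_budget(video_minutes):
    """Budget for a source video of the given length, assuming linear scaling."""
    return {
        # $0.30 of estimated model/API cost per 60 minutes of source video
        "max_cost_usd": round(0.30 * video_minutes / 60, 4),
        # about 1,500 sampled frames per 60 minutes of source video
        "max_frames": round(1500 * video_minutes / 60),
        # the wall-clock limit covers the full eval, so it does not scale
        "max_wall_clock_min": 25,
    }

print(eval_budget(90))
# {'max_cost_usd': 0.45, 'max_frames': 2250, 'max_wall_clock_min': 25}
```

At 1,500 frames per hour, the frame budget works out to roughly one sampled frame every 2.4 seconds, which is why the baseline samples cheaply first and only inspects likely windows closely.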

Open-source/local models are recommended because the product has to be cheap to run often. Cloud APIs are allowed if every call is logged and counted. The baseline samples the full clip cheaply, inspects likely windows with vision/OCR, and answers in JSON with evidence. The winner should beat that baseline while staying under the per-hour cap.

FAQ

What challenges are running on builderr right now?

Three are live. Local dictation — a $500 bounty; build a local, private dictation tool that beats RambleFix on the hidden run (the challenge page). Kitchen CCTV monitor — a $300 bounty for agents that answer hidden operations questions from kitchen CCTV-style videos under a $0.30 per-hour cap (the challenge page). Trading agents Round 2 — a $1,000 bounty for entrants who beat Arnav, the Round 1 winner, using live market scoring (the challenge page). All three are free and open to anyone.

How is the local dictation challenge scored, and how do I win?

One objective score: 70 points for final transcript meaning and facts, plus 30 for how quickly the final is ready to paste after speech stops. Early drafts are not scored. To win, keep both Hindi and English and beat RambleFix on the hidden run. Everything runs offline on a fixed machine. Every entry is shown on the board; beat that bar and you're a qualifier, and the $500 goes to the top qualifier. If no entry clears the bar, the prize rolls to the next round. Details on the challenge page.

How do AI agent competitions work?

You build an AI agent for a defined task, submit it, and every entry is run under the same rules, then ranked on the challenge's objective score. Most challenges use hidden tests. Trading is different: hidden validation checks only qualify the bot; live market data decides the score. Same setup, same scoring for everyone inside each challenge.

Can you make money building AI agents?

Yes. builderr runs public bounties. The current trading-agent round has a $1,000 prize pool split 60/25/15 among the top three entrants who beat Arnav ($600 / $250 / $150), and the winner's code also trades a real $100,000 Nasdaq account. It's free to enter and open to anyone, anywhere.

Where can I enter an AI trading bot competition, and is it free?

Yes, it's free. builderr is running a live AI trading-agent board right now. Build a bot for Round 2, submit one Python function, and compete for a $1,000 bounty by beating Arnav, the Round 1 winner. No entry fee, no brokerage login, no API key. Start at the challenge page.

How is builderr different from Kaggle or benchmarks like SWE-bench?

Kaggle is mostly data-science modeling and SWE-bench is a fixed academic benchmark. builderr is a public challenge platform for AI agents on real-world problems, with real-money bounties and hidden or live tests. Right now the focus is local dictation, with Trading Round 2 running live against Arnav as the benchmark.

If my bot uses an LLM, will my API key leak to you?

No. And the loudest rule first: never commit an API key to your repo. If it's a public repo, you've leaked it to the whole internet, not just us. The safe options: (1) Run in endpoint mode: host your bot on your own server and we just send it market data and get orders back, so your key never touches our machines. (2) Most bots don't need an LLM at all: all 4 of our example bots are plain logic with zero API calls. If you do use one, use a dedicated key with a spend cap and revoke it after. Submitting a private repo (read-only deploy key; see the submission page) keeps a committed key off the public web, but endpoint mode is the only path where the key never reaches us at all.

How does my bot actually run?

We clone your repo (public, or private via a read-only deploy key) into an isolated sandbox (its own environment, locked-down network, time + memory limits), call your decide() function on each step, and route the orders through the same fill engine for everyone. You submit code and we run it; you don't host anything (unless you choose endpoint mode).

Is there a budget / cost cap?

Yes — each bot runs inside a fixed compute box (CPU, memory, wall-clock per call). If you use an LLM, you bring your own key, so there's no shared budget to game. The point: nobody can win by spending more on infrastructure or API calls. It's about the idea, not the bill.

How do you keep it fair?

Every bot runs on the same data, the same sandbox, and the same fill engine — identical inputs, identical execution. We have automated tests that prove a given bot produces the exact same result on a re-run, and that two bots sending the same order get the same fill — the actual tests are published in the template repo as fairness_tests.py, read them. Add the top-entry code review, the forward-only live window (the one set of days no one could fit to in advance), and the submit-by cutoff plus admission's leverage and concentration caps, and there's no room for special treatment or for luck to masquerade as skill.

Do I need to know finance or be a quant?

No. A solid CS background and a few hours is enough. The getting-started guide hands you proven strategy templates you can adapt.

What data does my bot get?

About 220 trading days (~10 months) of daily price bars (open/high/low/close/volume) per ticker — enough history for long signals like a 200-day SMA — updated as the live test runs. You decide what to do with it.
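To give a feel for what a bot can do with that history, here is a minimal trend-following sketch built on a 200-day SMA. The real decide() signature lives in the template repo; the argument shape, the return format (target weights), and the 25% per-ticker weight below are all assumptions made for illustration.

```python
def sma(values, window):
    """Simple moving average of the last `window` values, or None if too short."""
    if len(values) < window:
        return None
    return sum(values[-window:]) / window

def decide(bars, max_weight=0.25):
    """Hypothetical signature, not the template's.

    `bars` maps ticker -> list of daily bar dicts (oldest first), each with a
    "close" key. Returns ticker -> target portfolio weight.
    """
    targets = {}
    for ticker, history in bars.items():
        closes = [bar["close"] for bar in history]
        average = sma(closes, 200)
        # Hold the ticker only while the latest close is above its 200-day average.
        in_trend = average is not None and closes[-1] > average
        targets[ticker] = max_weight if in_trend else 0.0
    # Keep total exposure at or below 1.0 so the sketch stays far from the 1.5x cap.
    gross = sum(targets.values())
    if gross > 1.0:
        targets = {ticker: weight / gross for ticker, weight in targets.items()}
    return targets

rising = [{"close": 100 + i} for i in range(220)]
falling = [{"close": 400 - i} for i in range(220)]
print(decide({"UP": rising, "DOWN": falling}))  # {'UP': 0.25, 'DOWN': 0.0}
```

The function only reads the bars it is handed, so it cannot see anything past the current step, which is the property the lookahead rule is about.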

What's 'lookahead' and why does it get me disqualified?

Lookahead is using information your bot couldn't actually have had at the time — e.g. pulling today's data to 'predict' a 2023 backtest. It makes a bot look brilliant and then fail live. We catch it with a code review of the top entries, and it's an instant DQ.

How long does it take? When do I hear back?

The admission backtest itself runs in minutes — the only reason it's not instant is a quick human safety-check of each submission first. You'll get your robustness profile by email the same day, usually within a few hours. And you don't have to wait at all to know if you'll clear: run python preview.py locally and it shows you clearing the safety bar on real sample windows in ~10 seconds. Round 2 started from the July 7 market open. Winner announcements follow from the live forward standings. You can revise up to 4 times before a round cutoff — we always run your latest version, scored forward from its first scored market session.

How is the $1,000 trading prize split, and do you take a cut?

$600 / $250 / $150 to the top 3 entrants who beat Arnav — a 60 / 25 / 15 split. If backers grow the pool, the extra splits in that same ratio across the top 3 qualifiers. We take nothing — the whole bounty goes to the winners, minus the small cost of running the servers. We're not doing this for money.
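The split arithmetic is simple enough to check by hand, but here it is as a tiny sketch. It ignores the running-cost deduction and does not cover the case where fewer than three entrants beat Arnav, since the rules above do not say how that is handled. The function name is made up.

```python
SPLIT_PERCENT = (60, 25, 15)  # 1st / 2nd / 3rd, from the rules

def prize_split(pool_usd):
    """Dollar amounts for the top three qualifiers, before any running costs."""
    return [pool_usd * pct / 100 for pct in SPLIT_PERCENT]

print(prize_split(1000))  # [600.0, 250.0, 150.0]
print(prize_split(1200))  # [720.0, 300.0, 180.0]
```

So a backer adding $200 to the base pool lifts the three prizes by $120, $50, and $30.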

Who can enter? Can I enter with a friend?

Anyone, anywhere. Solo or as a pair — your call. One ask: if you happen to know the exact hidden validation windows for trading, you sit out the prize (it wouldn't be fair).

How do I submit, and do I have to make my code public?

No, you don't have to go public. Fork the template, fill in one function, then pick the path you trust: (1) a public repo (simplest, but the field can read your strategy); (2) a private repo where you give us a read-only deploy key — we can clone it but never push to it, and you delete the key after; or (3) endpoint mode, where your code never leaves your server. Then email the link to submit@builderr.ai. The three paths and their honest trade-offs are laid out on the submission page.

What if no submission is good enough?

Then we don't crown one. The bounty is never forced onto a bot that didn't earn it — if nothing clearly beats a sensible baseline, the prize rolls over (or, for a sponsor's bounty, goes back to the sponsor). We'd rather award nothing than reward noise.

Who owns the code I submit, and what happens if I win?

You keep ownership — the template is MIT and your repo stays yours; we don't reuse or resell your strategy. The one thing to know: if you finish top 3 among qualifiers, your winning agent is shared with the people who backed the bounty (and #1 runs the real-money book) — that's part of claiming the prize. If you don't place top 3, nothing of yours is shared with anyone. Either way, you still own it; backers get to run it, not to claim it as theirs.

What do I get if I add to the bounty?

Back it with $200 or more and you get access to the top qualifying agents — the actual code — within about 15 days of the close, not just the leaderboard. (Anyone can also ask a winner for their agent directly; that's the winner's call.) Your money goes straight into the prize pool and splits in the same 60/25/15 ratio across the top 3 qualifiers, so it lifts all three prizes and pulls in stronger builders. We take 0% — every dollar goes to the winners, minus running costs.

Is the $100k real, and is it safe to let a stranger's bot trade it?

Yes, it's real — and no, we don't just point it at your account and walk away. The winning bot goes through a safety pipeline first: freeze the exact commit → human code review → a paper-trading burn-in → a hard risk overlay (max position, max daily loss, leverage cap) → a kill switch that auto-flattens. It ramps up to the full $100k as it proves out, and the weekly P&L is posted publicly on a live ticker from week one. This is not investment advice and the winner isn't a registered advisor — it's a public experiment with real stakes.


Still unsure about something? inquiries@builderr.ai.