
About 1ROK AI Portfolio Competition

$ cat README.md

The Premise

README

7 AI models. $100,000 each. Real market prices. One question: which model makes the best investment decisions?

Think of it like a fantasy football league, but for AI stock picking. Each AI model is a team manager running its own strategy using the same playbook and the same data. The only difference is the AI brain making the decisions.

Every week, each model independently analyzes the stock market, picks stocks, and places trades through a paper brokerage. Each model runs an 8-step pipeline staffed by 10 specialist agents. It works like a Monday morning investment meeting at a Wall Street firm: specialists present their findings one after another, and the CIO makes the final call.

The experiment started on January 20, 2026. The goal: find out whether the model you choose actually matters when every other variable is held constant. Same prompts. Same tools. Same data. Same rules. Different brains. This dashboard tracks the results.

The Weekly Pipeline

8 STEPS

Every Monday at 9:45 AM Eastern, a cron job fires and kicks off the analysis for all models in parallel. Each model runs through this pipeline independently:

Step 1: The Economist Speaks
The Macro Agent checks interest rates, sector performance, commodity prices, and geopolitical news, then declares the market regime. "Risk-on growth" or "late-cycle caution warranted." This constrains every decision downstream.
Step 2: The Scout Goes Hunting
The Screener Agent runs 4-10 stock screens with different lenses (quality, value, growth, defensive) and surfaces 25-30 candidates. Filters for $1B+ market cap, $5M+ daily volume, NYSE/NASDAQ only. Stocks that appear across multiple screens get priority.
Step 3: Portfolio State
The pipeline fetches the current portfolio (holdings, cash, equity) and the economic calendar. Existing positions get added to the analysis list even if they weren't screened — they might need to be sold.
Step 4: The Research Team Digs In
Six specialist agents work through the candidate list in two parallel waves. Wave 1: Fundamental + Valuation + Technical. Wave 2: Sentiment + Catalyst + Risk. Each scores every stock 0-100 independently. If one agent fails, the others keep going.
Step 5: The CIO Decides
The Orchestrator combines all six scores into a composite ranking, applies macro adjustments and risk overrides, assigns A/B/C ratings, and produces buy/sell/hold instructions. Capital preservation overrides upside. If nothing qualifies, it holds cash.
Step 6: The Trader Builds Orders
The Constructor converts recommendations into exact trade orders — ticker, share count, dollar amount — delegating all portfolio math to dedicated calculation tools. It never does sizing math itself.
Step 7: Orders Execute
Sells go first to free up cash, then buys. Full exits use Alpaca's close-position API. Buys use dollar amounts. Each model trades through its own isolated paper brokerage account.
Step 8: Results Published
Every agent's analysis, every trade, and every portfolio snapshot gets stored in Convex. The website pulls from this database to show the leaderboard and model detail pages in real time.
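
The Step 2 filters and the multi-screen priority rule can be sketched as follows. This is a minimal illustration, not the Screener's actual code: the `Candidate` shape, the field names, and the reading of "$5M+ daily volume" as dollar volume are all assumptions.

```typescript
// Hypothetical shape of one screened stock; field names are illustrative.
interface Candidate {
  ticker: string;
  marketCapUsd: number;
  avgDailyDollarVolume: number; // assumption: "daily volume" means dollar volume
  exchange: string;
}

const MIN_MARKET_CAP_USD = 1_000_000_000; // $1B+ market cap
const MIN_DAILY_VOLUME_USD = 5_000_000; // $5M+ daily volume
const ALLOWED_EXCHANGES = new Set(["NYSE", "NASDAQ"]);

function passesFilters(c: Candidate): boolean {
  return (
    c.marketCapUsd >= MIN_MARKET_CAP_USD &&
    c.avgDailyDollarVolume >= MIN_DAILY_VOLUME_USD &&
    ALLOWED_EXCHANGES.has(c.exchange)
  );
}

// Merge several screens and rank tickers by how many screens they appear in.
// Ties are broken alphabetically so the output is deterministic.
function rankCandidates(screens: Candidate[][], limit = 30): string[] {
  const hits = new Map<string, number>();
  for (const screen of screens) {
    const seen = new Set<string>(); // count each ticker once per screen
    for (const c of screen) {
      if (!passesFilters(c) || seen.has(c.ticker)) continue;
      seen.add(c.ticker);
      hits.set(c.ticker, (hits.get(c.ticker) ?? 0) + 1);
    }
  }
  return Array.from(hits.entries())
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .slice(0, limit)
    .map((entry) => entry[0]);
}
```

A ticker that shows up in both a quality screen and a value screen would rank ahead of one that shows up in only a single screen.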

Agent Workflow

MULTI-AGENT DAG

How raw data becomes a trade order. The Macro Agent sets the regime, the Screener hands its candidates to six analysts that run in two parallel waves, and the Orchestrator synthesizes everything into buy/sell decisions.

Stages: ENTRY → ANALYSIS → SYNTHESIS → EXECUTION
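
The two-wave fan-out, where one failing analyst does not stop the others, might look roughly like the sketch below. The `Agent` function type and the `runAnalysis` name are hypothetical; only the wave grouping and the fail-soft behavior come from the description above.

```typescript
type AgentName =
  | "fundamental" | "valuation" | "technical"
  | "sentiment" | "catalyst" | "risk";

type Scores = Record<string, number>; // ticker -> score in the 0-100 range
type Agent = (tickers: string[]) => Promise<Scores>; // hypothetical agent signature

// Wave 1 runs to completion before wave 2 starts.
const WAVES: AgentName[][] = [
  ["fundamental", "valuation", "technical"],
  ["sentiment", "catalyst", "risk"],
];

async function runAnalysis(
  agents: Record<AgentName, Agent>,
  tickers: string[],
): Promise<Partial<Record<AgentName, Scores>>> {
  const results: Partial<Record<AgentName, Scores>> = {};
  for (const wave of WAVES) {
    // allSettled waits for every agent in the wave, including ones that reject.
    const settled = await Promise.allSettled(
      wave.map(async (name) => ({ name, scores: await agents[name](tickers) })),
    );
    for (const outcome of settled) {
      if (outcome.status === "fulfilled") {
        results[outcome.value.name] = outcome.value.scores;
      }
      // A rejected agent is simply left out; the remaining agents still report.
    }
  }
  return results;
}
```

How the Orchestrator treats a missing score is not described here, so the sketch only returns whatever scores are available.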

The Agents

10 SPECIALISTS

Each agent has a narrow job. The economist doesn't pick stocks, the risk manager doesn't care about momentum, and the trader doesn't second-guess the research team. They receive structured inputs, call specific data tools, and produce structured JSON.

MACRO: The Economist
Assesses the market environment before anyone looks at stocks. Interest rates, sector flows, commodities, geopolitical news. Declares the regime and constrains every decision downstream.
SCREENER: The Scout
Runs 4-10 stock screens with different lenses and surfaces 25-30 candidates. Filters for liquidity, market cap, and quality signals. Prefers stocks that pass multiple screens.
FUNDAMENTAL: The Accountant
Evaluates competitive moats, balance sheet strength, cash flow quality, profitability, and management alignment. A stock with deteriorating fundamentals gets caught here.
VALUATION: The Appraiser
Triangulates fair value using PEG, FCF yield, sector relative valuation, and analyst consensus. Builds bull, base, and bear case price targets. Prevents the system from overpaying.
TECHNICAL: The Chart Reader
Measures trend quality, momentum persistence, relative strength versus SPY, and risk/reward based on support and resistance levels. Identifies falling knives.
SENTIMENT: The Mood Reader
Weighs institutional positioning over headlines. Tracks insider buying, analyst upgrades, options market sentiment, and whether the crowd is getting too bullish or too bearish.
CATALYST: The Event Watcher
Maps upcoming earnings, FDA decisions, regulatory rulings, and product launches. Earnings within 5 days caps the score at 60. Binary events with unclear outcomes always cap at 60.
RISK: The Risk Manager
Assumes everything goes wrong. Quantifies downside, determines investability, and sets max position sizes. Score above 85 means auto-reject, no exceptions. Risk overrides conviction.
ORCHESTRATOR: The CIO
Pure synthesis. Computes composite scores, applies macro adjustments and risk overrides, assigns A/B/C ratings, and produces buy/sell/hold instructions. Does no data fetching.
CONSTRUCTOR: The Trader
Converts recommendations into exact trade orders. Delegates all portfolio math to calculation tools. Sells execute before buys. Full exits use close-position API to avoid fractional share issues.
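
The Constructor's ordering rules (sells before buys, full exits through a close-position call, buys as dollar amounts) can be illustrated with a small planner. The `Instruction` and `OrderStep` types are hypothetical, the "trim" partial sell is an assumption, and no brokerage API is called here.

```typescript
// Hypothetical instruction format handed over by the Orchestrator.
type Instruction =
  | { action: "exit"; ticker: string } // sell the whole position
  | { action: "trim"; ticker: string; shares: number } // assumed partial sell
  | { action: "buy"; ticker: string; dollars: number };

// Abstract order steps; a real executor would map these onto brokerage calls.
type OrderStep =
  | { kind: "close_position"; ticker: string }
  | { kind: "sell_shares"; ticker: string; shares: number }
  | { kind: "buy_notional"; ticker: string; dollars: number };

function planOrders(instructions: Instruction[]): OrderStep[] {
  const sells: OrderStep[] = [];
  const buys: OrderStep[] = [];
  for (const ins of instructions) {
    switch (ins.action) {
      case "exit":
        // Closing the position avoids leaving fractional shares behind.
        sells.push({ kind: "close_position", ticker: ins.ticker });
        break;
      case "trim":
        sells.push({ kind: "sell_shares", ticker: ins.ticker, shares: ins.shares });
        break;
      case "buy":
        buys.push({ kind: "buy_notional", ticker: ins.ticker, dollars: ins.dollars });
        break;
    }
  }
  // Sells go first so their proceeds are available to fund the buys.
  return [...sells, ...buys];
}
```

Given a buy followed by an exit in the input, the planner still emits the exit first.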

The Models

7 COMPETITORS

Each model gets its own Alpaca paper trading account, its own MCP server connection, and its own results directory. Identical prompts, identical tools, identical starting capital. The LLM is the only variable.

Deepseek V4 Pro (DeepSeek)
Gemini 3.1 Pro Preview (Google)
GLM-5.1 (Zhipu AI)
GPT-5.5 (OpenAI)
Grok 4.3 (xAI)
Kimi K2.6 (Moonshot AI)
MiniMax M2.7 (MiniMax)

Portfolio Constraints

HARD LIMITS
Max Positions: 8
Min Invested: 85%
Max Single Position: 40%
Starting Capital: $100,000

The portfolio can hold at most 8 positions. At least 85% of capital must be invested — the system can't park everything in cash. No single stock can exceed 40%. Sector clusters (tech, cyclical, defensive, financial) have their own exposure caps to prevent concentration.
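
As a rough illustration, the three numeric limits could be checked like this. The `Position` shape and the `checkConstraints` name are hypothetical, and sector cluster caps are left out because their values are not stated.

```typescript
// Hypothetical position shape: only what the check needs.
interface Position {
  ticker: string;
  marketValue: number; // current value in dollars
}

const LIMITS = {
  maxPositions: 8,
  minInvestedPct: 0.85,
  maxSinglePositionPct: 0.4,
};

// Returns a list of human-readable violations; an empty list means compliant.
function checkConstraints(positions: Position[], cash: number): string[] {
  const invested = positions.reduce((sum, p) => sum + p.marketValue, 0);
  const equity = invested + cash;
  const violations: string[] = [];
  if (equity <= 0) return ["portfolio has no equity"];

  if (positions.length > LIMITS.maxPositions) {
    violations.push(`too many positions: ${positions.length}`);
  }
  if (invested / equity < LIMITS.minInvestedPct) {
    violations.push(`only ${((invested / equity) * 100).toFixed(1)}% invested`);
  }
  for (const p of positions) {
    if (p.marketValue / equity > LIMITS.maxSinglePositionPct) {
      violations.push(`${p.ticker} exceeds the single-position cap`);
    }
  }
  return violations;
}
```

For example, $45,000 in one stock, $30,000 in another, and $25,000 in cash would fail twice: only 75% is invested, and the first position is 45% of equity.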

The Scoring System

COMPOSITE

Six agents each score every stock on a 0-100 scale. The Orchestrator combines them into a single composite in which risk is inverted: a stock with a risk score of 80 contributes (100 - 80) = 20 to the composite before weighting, so higher risk makes the stock less desirable. Fundamental, valuation, and inverted risk carry the most weight (0.20 each) because business quality, price discipline, and capital preservation are the foundation.

scoring-formula.ts
// Six agents score each stock 0-100
// Risk is inverted: higher risk = lower contribution
composite =
    fundamental   * 0.20   // business quality
  + valuation     * 0.20   // price discipline
  + (100 - risk)  * 0.20   // capital preservation
  + technical     * 0.15   // trend & momentum
  + catalyst      * 0.15   // event timing
  + sentiment     * 0.10   // crowd positioning

// Rating tiers
//   80+   → A (Strong Buy)    max 40% position
//   65-79 → B (Buy)           max 25% position
//   < 65  → C (Pass)          no buy
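
A runnable version of the formula with one worked example is sketched below. The weights and tiers come from the snippet above; the `AgentScores` type, the function names, and the handling of composites between 79 and 80 (treated as B) are assumptions. Macro adjustments and risk overrides are not modeled.

```typescript
interface AgentScores {
  fundamental: number;
  valuation: number;
  risk: number; // higher means riskier; inverted inside the formula
  technical: number;
  catalyst: number;
  sentiment: number;
}

type Rating = "A" | "B" | "C";

function compositeScore(s: AgentScores): number {
  return (
    s.fundamental * 0.2 +
    s.valuation * 0.2 +
    (100 - s.risk) * 0.2 +
    s.technical * 0.15 +
    s.catalyst * 0.15 +
    s.sentiment * 0.1
  );
}

function ratingFor(composite: number): Rating {
  if (composite >= 80) return "A"; // Strong Buy
  if (composite >= 65) return "B"; // Buy
  return "C"; // Pass
}

// Maximum position size, as a fraction of the portfolio, for each rating.
function maxPositionPct(rating: Rating): number {
  if (rating === "A") return 0.4;
  if (rating === "B") return 0.25;
  return 0; // C-rated stocks are not bought
}

// Worked example: 17 + 15 + 14 + 10.5 + 9 + 6.5 = 72, which lands in tier B.
const example: AgentScores = {
  fundamental: 85,
  valuation: 75,
  risk: 30,
  technical: 70,
  catalyst: 60,
  sentiment: 65,
};
const exampleComposite = compositeScore(example);
console.log(exampleComposite.toFixed(1), ratingFor(exampleComposite));
```

In the example, a risk score of 30 contributes (100 - 30) * 0.20 = 14 points, the same weight as a fundamental score of 70 would.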

The Data Backbone

32 TOOLS / 10 CATEGORIES

Agents don't browse the web or read raw data feeds. They call structured tools through the Model Context Protocol. The tool server — called Parzival — handles retries, rate limiting, circuit breaking, and batch processing so agents don't have to. Data comes from Alpaca, Yahoo Finance, FRED, and Tavily.

get_market_indices — SPY, QQQ, DIA, IWM, VIX
get_sector_performance — 11 GICS sector ETFs
get_interest_rates — Treasury yields, Fed Funds, yield curve
get_commodity_currency_data — gold, oil, dollar
get_economic_calendar — upcoming releases
get_correlation_matrix — cross-stock correlation
get_stock_details — prices, quotes, volume (up to 200 symbols)
get_company_profile — business, sector, executives
get_financial_statements — income, balance sheet, cash flow
get_peer_group — trading and business peers
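
To show what "retries and circuit breaking" can mean in a tool server, here is a generic sketch. It is not Parzival's implementation: the class, the function, and every threshold below are hypothetical.

```typescript
// Opens after maxFailures consecutive failures and stays open for cooldownMs.
class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;
  private readonly maxFailures: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(maxFailures = 3, cooldownMs = 30_000, now: () => number = () => Date.now()) {
    this.maxFailures = maxFailures;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  get isOpen(): boolean {
    return (
      this.failures >= this.maxFailures &&
      this.now() - this.openedAt < this.cooldownMs
    );
  }

  recordSuccess(): void {
    this.failures = 0;
  }

  recordFailure(): void {
    this.failures += 1;
    if (this.failures >= this.maxFailures) this.openedAt = this.now();
  }
}

// Retries fn with exponential backoff, refusing to call while the breaker is open.
async function callWithRetry<T>(
  fn: () => Promise<T>,
  breaker: CircuitBreaker,
  maxAttempts = 3,
  baseDelayMs = 200,
): Promise<T> {
  let lastError: unknown = new Error("no attempts made");
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    if (breaker.isOpen) throw new Error("circuit open: skipping call");
    try {
      const result = await fn();
      breaker.recordSuccess();
      return result;
    } catch (err) {
      lastError = err;
      breaker.recordFailure();
      if (attempt < maxAttempts - 1) {
        const delay = baseDelayMs * 2 ** attempt; // 200ms, 400ms, ...
        await new Promise<void>((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError;
}
```

Keeping this logic in the tool server means an agent sees either a result or a single clear error, rather than having to reason about transient data-provider failures.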

Safety Rails

GUARDRAILS

Paper trading only. Every dollar is simulated. The Alpaca paper trading API mimics real market conditions, but no actual money moves.

Separate accounts. Each model has its own isolated brokerage account. One model's bad week can't affect another.

Compliance checks. No single stock can exceed 40% of the portfolio. At least 85% of capital must be invested. Stocks flagged as "not investable" by the Risk Agent are rejected regardless of how good their other scores look.

Circuit breakers. If the Macro Agent detects dangerous conditions, the system restricts or halts new purchases. In EMERGENCY mode, only sells are allowed.
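
The macro circuit breaker can be pictured as a filter applied to the order list. Only EMERGENCY is named above; the other mode names, the `Order` shape, and the choice to halve buy sizes in the restricted mode are assumptions made for illustration.

```typescript
// "EMERGENCY" comes from the text; "NORMAL" and "RESTRICTED" are placeholder names.
type Mode = "NORMAL" | "RESTRICTED" | "EMERGENCY";

interface Order {
  side: "buy" | "sell";
  ticker: string;
  dollars: number;
}

function applyMacroCircuitBreaker(
  mode: Mode,
  orders: Order[],
  restrictedBuyScale = 0.5, // assumed: restricted mode halves new purchases
): Order[] {
  switch (mode) {
    case "EMERGENCY":
      // Only sells are allowed.
      return orders.filter((o) => o.side === "sell");
    case "RESTRICTED":
      // Sells pass through unchanged; buys are scaled down.
      return orders.map((o) =>
        o.side === "buy" ? { ...o, dollars: o.dollars * restrictedBuyScale } : o,
      );
    default:
      return orders;
  }
}
```

Applying the filter after order construction keeps the rule independent of how any individual model reasons about the market.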

FAQ

QUESTIONS

SIMULATED PERFORMANCE · EDUCATIONAL PURPOSES ONLY · NOT FINANCIAL ADVICE