Baz Cosmopoulos

Quantitative Trading • Derivatives Research • Market Microstructure

University of Michigan '27 | Financial Math & Data Science

bcosm@umich.edu

Hey, I'm Baz. I'm a junior at UMich who loves coding and trading, but also just a chill guy.

Experience Timeline

Competitions

IMC Prosperity Global Trading Challenge

2025

41st / 12,626 (Top 0.33%).

Michigan Quant Conference — Market Making Competition

2025

2nd place (IMC Trading + Old Mission-sponsored).

Summer 2026 (Incoming)

Quantitative Trading Intern

SCALP Trade

Incoming quantitative trading intern for Summer 2026.

Feb 2026 – Present

Quantitative Research Intern

Gnosis

Researching prediction market alpha.

Prague, Czechia · On-site

Mar 2025 – Present

Founder & Solo Full-Stack Engineer

DemandEngine

Built a data pipeline ingesting 5k+ Reddit/HN posts daily, extracting 32k pain-point signals to generate 1-2k unique SaaS ideas.
Deployed a DeepSeek-7B LLM with a twin-pass ranking system, achieving a 95% manual QA pass rate for generated ideas.
Engineered a scalable React + Django/FastAPI stack on GCP/Vercel, handling 100 concurrent users with p99 latency under 1s.

Michigan Finance and Mathematics Society

1 yr 1 mo

Co-President

Jan 2026 – Present · 2 mos

Ann Arbor, Michigan

VP External

May 2025 – Jan 2026 · 9 mos

Chicago Trek and sponsors

Quantitative Sports Betting Lead

Feb 2025 – Jul 2025 · 6 mos

Ann Arbor, MI · On-site

Led development of a multi-factor NCAA basketball point-spread strategy; backtested ROI +9.8%, Sharpe 2.96 over 15k+ bets.

Sep 2024 – Nov 2024

Structural Analysis Intern

Rocket Lab

Reduced full-satellite modal analysis compute time by over 95% (1.5 hours to <5 mins) by developing a reduced-order composite fuel tank model.
Closed a critical thermal analysis risk item for a preliminary design review by verifying satellite structural integrity under orbital temperature gradients.
Established a standardized test workflow for composite inserts and used chi-square analysis to determine B-basis load allowables per NASA standards.

May 2024 – Jul 2024

Building Physics Intern

Harris

Co-authored the first ASHRAE paper on modular-build emissions.
Designed a data center thermal-storage cooling system.

Jan 2024 – Apr 2024

Mechanical Engineering Co-op

Copeland

Simulated compressor part collisions in NX Nastran and automated lab-test analytics in Python to verify long-term component reliability.

Sep 2023 – Jan 2024

Mechanical Team Member

Michigan Robotic Submarine

Designed and CFD-optimized a torpedo launcher, bumping RoboSub scoring potential by 2k+ points while integrating cleanly into the hull.

May 2023 – Jul 2023

Software Engineer

Stealth Startup

Shipped bilingual FastPitch TTS and LoRA-tuned RAG LLMs, wiring the stack into a fully conversational voice customer-service platform.

Jun 2021 – Aug 2021

Engineering Research Intern

EMTECH

Co-authored an ESA CubeSat subsystem guide and delivered thermal/electrical analyses for the mission's hardware simulator; the work was later used in ESA's Space Rider mission.

Featured Quant Projects

Featured Project

Deep Reinforcement Learning Hedging Agent

View on GitHub →

Options-based Deep RL hedging for a 10k-share SPY book, trained on rBergomi (GPU) and deployed in Backtrader. Beats classical delta-hedging on volatility, drawdowns, and cost.

23.8%

↓ Volatility

96.0%

↓ Trading costs

36.3%

↓ Max drawdown

24.1

× Efficiency

95%

Win rate

100k

rBergomi GPU

Backtests on SPY (2008-2023) using historical underlying + options; costs: $0.65/contract + 10 bps slippage. See repo for exact config.

Figure: Headline reductions vs delta-hedge (bar chart: volatility, drawdown, trading costs).

Figure: Pareto grid, risk proxy vs cost (scatter plot: mean cost vs mean absolute PnL risk proxy, log scale).

Figure: Pareto frontier from the rBergomi-HedgeRL experiments.

▶ rBergomi Demo

Interactive Rough Volatility Simulation

Check out the rBergomi stochastic volatility model that powers the training environment for the Deep RL hedging agent. This demo generates realistic option pricing paths with the same mathematical model used to train the agent that outperforms classical delta-hedging. Adjust parameters to see how market dynamics change in real time.

Basic parameters: number of paths, time steps.

Advanced parameters: ξ (xi), forward variance level; H, Hurst parameter (roughness); η (eta), vol of vol; ρ (rho), correlation; S₀, initial price; r, risk-free rate.
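For readers without the interactive demo, here is a minimal NumPy sketch of how rBergomi paths can be generated from the parameters listed above. It is not the repo's CuPy/CUDA simulator: the function name `simulate_rbergomi`, the default values, the flat forward variance `xi`, and the variance-matched kernel weights are all assumptions chosen for brevity.

```python
import numpy as np

def simulate_rbergomi(n_paths=2000, n_steps=64, T=1.0, xi=0.04, H=0.1,
                      eta=1.5, rho=-0.7, S0=100.0, r=0.0, seed=0):
    """Return (t, S, v) from a simple variance-matched rBergomi discretization."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    t = np.linspace(0.0, T, n_steps + 1)

    # Brownian increments: dW drives the variance, dW_perp is independent noise.
    dW = rng.standard_normal((n_paths, n_steps)) * np.sqrt(dt)
    dW_perp = rng.standard_normal((n_paths, n_steps)) * np.sqrt(dt)

    # Kernel weight for a lag of k steps, chosen so Var(W_hat_t) = t^(2H) exactly.
    k = np.arange(1, n_steps + 1)
    w = np.sqrt((k ** (2 * H) - (k - 1) ** (2 * H)) * dt ** (2 * H - 1))

    # Volterra process W_hat_t = sqrt(2H) * int_0^t (t - s)^(H - 1/2) dW_s.
    W_hat = np.zeros((n_paths, n_steps + 1))
    for i in range(1, n_steps + 1):
        W_hat[:, i] = dW[:, :i] @ w[:i][::-1]

    # Variance process with flat forward variance xi (assumption).
    v = xi * np.exp(eta * W_hat - 0.5 * eta ** 2 * t ** (2 * H))

    # Spot driven by a Brownian motion with correlation rho to dW.
    dZ = rho * dW + np.sqrt(1.0 - rho ** 2) * dW_perp
    log_inc = (r - 0.5 * v[:, :-1]) * dt + np.sqrt(v[:, :-1]) * dZ
    log_S = np.concatenate(
        [np.zeros((n_paths, 1)), np.cumsum(log_inc, axis=1)], axis=1)
    return t, S0 * np.exp(log_S), v
```

Lower `H` makes the variance paths rougher, larger `eta` widens their distribution, and negative `rho` produces the usual equity skew. The Python loop over time steps is O(n_steps²) per path, which is fine for a demo but not for 100k training paths.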

What this is

A Deep RL agent that hedges a 10,000-share SPY position with liquid long-dated ATM calls & puts. The policy is Recurrent PPO (LSTM) trained on 100k rBergomi rough-volatility paths (GPU), then validated in Backtrader on historical SPY underlying + options. Goal: minimize PnL volatility and drawdown with materially lower cost than classical delta-hedging.

How it works

Simulator: rBergomi paths + ATM option prices (CuPy/CUDA).

Policy: RecurrentPPO (LSTM).

State (13 dims): SPY, ATM call/put mids, current call/put holdings, portfolio Δ & Γ, time-to-go, vol & lagged changes.

Actions (continuous, 2): call_qty ∈ [−100,+100], put_qty ∈ [−100,+100] per step (clipped/rounded to contracts).

Reward: −(|ΔPnL| (or MSE/CVaR) + λ × transaction_costs), so maximizing reward minimizes risk plus weighted costs; position limits enforced.

Training: HPO with Optuna, then long final train; evaluation over multiple seeds with VecNormalize.
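The reward described above can be sketched as a per-step function. This is an illustration, not the code in src/env/hedging_env.py: the name `step_reward` and its signature are hypothetical, and the CVaR variant is left out because it needs a batch of outcomes rather than a single step.

```python
def step_reward(pnl_change, transaction_costs, lam=0.001, loss_type="abs"):
    """Negative of (risk term + lam * costs) for one environment step.

    pnl_change: change in hedged portfolio PnL over the step, in dollars.
    transaction_costs: commission plus slippage paid this step, in dollars.
    """
    if loss_type == "abs":
        risk = abs(pnl_change)
    elif loss_type == "mse":
        risk = pnl_change ** 2
    else:
        raise ValueError(f"unsupported loss_type: {loss_type!r}")
    # The agent maximizes reward, so both risk and cost enter with a minus sign.
    return -(risk + lam * transaction_costs)
```

A larger `lam` makes trading more expensive relative to PnL swings, which is the under-hedge / over-hedge trade-off the Pareto plots explore.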

Backtesting setup

Engine: Backtrader (daily).
Data: data/spy_underlying.csv, data/spy_options.csv (real options, ATM selection with fallbacks).
Costs: $0.65/contract commission + 10 bps slippage on notional; contract multiplier 100; rolling near-expiry; daily rebalance.
Constraints: ±200 contracts/type outstanding; ±100 contracts/trade; 30-calendar-day tenor target.
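The cost model and trade limits above fit in a few lines. This sketch assumes that "notional" means premium notional (contracts × option price × multiplier); the helper names `clip_trade` and `trade_cost` are hypothetical and do not come from the repo.

```python
CONTRACT_MULTIPLIER = 100        # shares per option contract
COMMISSION_PER_CONTRACT = 0.65   # dollars
SLIPPAGE_BPS = 10                # basis points on traded notional
MAX_TRADE = 100                  # contracts per trade, per option type
MAX_POSITION = 200               # contracts outstanding, per option type

def clip_trade(current_position, desired_trade):
    """Round to whole contracts, then enforce per-trade and position limits."""
    trade = int(round(desired_trade))
    trade = max(-MAX_TRADE, min(MAX_TRADE, trade))
    new_position = max(-MAX_POSITION,
                       min(MAX_POSITION, current_position + trade))
    return new_position - current_position

def trade_cost(n_contracts, option_price):
    """Commission plus slippage in dollars for one option trade."""
    qty = abs(n_contracts)
    notional = qty * option_price * CONTRACT_MULTIPLIER
    return qty * COMMISSION_PER_CONTRACT + notional * SLIPPAGE_BPS / 10_000
```

For example, a position of 150 contracts with a desired trade of +80 is cut to +50 by the ±200 position limit, and trading 50 contracts at a $4.00 premium costs $32.50 in commission plus $20.00 in slippage.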

Results (vs classical delta-hedge)

Volatility: −23.8% (0.35 vs 0.46).
Max drawdown: −36.3% (−0.85 vs −1.33).
Trading costs: −96.0% ($83,168 vs $2,087,810).
Hedging efficiency: 12.02× (vs 0.48).
Win rate across seeds: 95%.

(See repo tables for full metrics: downside deviation, ulcer/pain indices, equity stability.)

Why this matters (for trading)

Frames hedging as a cost-aware control problem (risk + frictions), not “hedge everything, every step.”
Produces an explicit risk/cost Pareto surface you can choose from based on liquidity and risk budget.
Same idea generalizes to inventory-aware execution / market making: state → action → costs, with constraints.

Failure modes & gotchas

Sim-to-real: rBergomi misspecification, discrete strikes, and real microstructure can break “optimal” actions.
λ sensitivity: mis-setting λ can under-hedge (cheap but risky) or over-hedge (stable but too expensive).
Liquidity shocks: wide markets and gap risk can push slippage beyond the backtest cost model.
Ops details: contract rounding, roll/assignment, missing chains, and stale quotes need robust fallbacks.

Reproduce quickly

pip install -r requirements.txt && pip install -e .
# Hyperparameter search (default 10 trials)
hedgerl train --hpo
# Final training with your chosen loss/weights
hedgerl train --loss_type abs --w 0.05 --lam 0.001
# Backtest (single or looped seeds)
hedgerl backtest
hedgerl backtest --loop 5
# Explore Pareto frontier (risk ↔ cost)
hedgerl generate-pareto

Repo map (just the bits recruiters care about)

src/sim/rbergomi_sim.py (GPU rough-vol paths & ATM pricing) • src/env/hedging_env.py (state/action/reward) • src/agents/train_ppo.py (RecurrentPPO + Optuna) • src/backtester/* (Backtrader, model wrapper, delta baseline) • model_files/* (exported weights + normalization)

Notes & limitations

Sim-to-real gaps (liquidity, microstructure) remain; ATM selection uses robust mids with BS fallback; commission & slippage are parameterized; single-asset scope (multi-asset on roadmap).

Hoops-Spread — NCAA Point-Spread Alpha

View on GitHub →

End-to-end XGBoost spread model with Boruta→SHAP feature selection, a 50+ subreddit cascading sentiment pipeline (VADER→Flair→DistilBERT+sarcasm), and walk-forward backtests with half-Kelly sizing.

+9.75%

ROI (½-Kelly)

63.2%

Hit rate

15,276

Bets

2.96

Sharpe

+0.213

Mean CLV

4.32u

Max DD

Window: Seasons 2008–2022 walk-forward (train on prior seasons, starting 2007). Half-Kelly staking. Strict chronological splits; sentiment windows lag tip-off by 24h. Out-of-sample only.

Figure: Equity curve (walk-forward, OOS), market-signal vs fundamental profiles.

Figure: Drawdown over time, market-signal vs fundamental profiles.

Figure: CLV distribution, market-signal vs fundamental profiles.

What this is

A reproducible NCAA point-spread pipeline that learns a cover probability and sizes stakes with half-Kelly. Two profiles: Market-signal (includes opening total as a weak market prior) and Fundamental (excludes it).

How it works

Features: pace/efficiency/SOS, travel & altitude, rolling team stats, 50+ subreddit sentiment EMAs.
Sentiment: cascading VADER → Flair → DistilBERT (+ sarcasm); escalate only on uncertainty; dedup + EMAs.
Selection: Boruta-SHAP; Model: Optuna-tuned XGBoost; SHAP attribution for interpretability.
Backtest: walk-forward refits on 30-day horizons, half-Kelly, bankroll & VaR tracking; CLV and bootstrap CIs.
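Half-Kelly sizing from a cover probability can be sketched as follows. The −110 default price is an assumption (a common spread price, not stated in the source), and `half_kelly_fraction` is a hypothetical name rather than a function from the repo.

```python
def half_kelly_fraction(p_cover, american_odds=-110):
    """Fraction of bankroll to stake under half-Kelly; 0 when there is no edge."""
    # Net payout per unit staked (b) from American odds.
    if american_odds < 0:
        b = 100.0 / -american_odds
    else:
        b = american_odds / 100.0
    # Full Kelly: f = (b * p - q) / b, with q = 1 - p.
    full_kelly = (b * p_cover - (1.0 - p_cover)) / b
    return max(0.0, 0.5 * full_kelly)
```

At −110, a 55% cover probability gives a full-Kelly fraction of 5.5% and a half-Kelly stake of 2.75% of bankroll; anything at or below the 52.4% break-even rate returns zero.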

Results

Market-signal: ROI +9.75% (95% CI +9.61%–+9.89%), Hit 63.2%, Bets 15,276 (~56% of games), Sharpe 2.96, Max DD 4.32u, Mean CLV +0.213, CAGR 23.1%, Turnover 11.5.
Fundamental (baseline): ROI +1.62% (95% CI +1.50%–+1.75%), Hit 59.4%, Bets 20,284 (~75% of games), Sharpe 2.86, Max DD 4.89u, Mean CLV +0.150, CAGR 11.6%, Turnover 15.3.
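Confidence intervals like the ones quoted above can be produced with a percentile bootstrap over bets. This sketch resamples bets independently, which ignores any time dependence between them; the resampling scheme actually used in the repo may differ, and `bootstrap_roi_ci` is a hypothetical name.

```python
import numpy as np

def bootstrap_roi_ci(profits, stakes, n_boot=1000, alpha=0.05, seed=0):
    """Return (roi, lower, upper) using a percentile bootstrap over bets."""
    profits = np.asarray(profits, dtype=float)
    stakes = np.asarray(stakes, dtype=float)
    rng = np.random.default_rng(seed)
    n = len(profits)
    # Each row is one resample of bet indices, drawn with replacement.
    idx = rng.integers(0, n, size=(n_boot, n))
    rois = profits[idx].sum(axis=1) / stakes[idx].sum(axis=1)
    lower, upper = np.quantile(rois, [alpha / 2, 1 - alpha / 2])
    return profits.sum() / stakes.sum(), float(lower), float(upper)
```

Here `profits` is the net result of each bet and `stakes` the amount risked, so ROI is total profit over total staked. The index matrix has `n_boot × n` entries, so resample in chunks for very large bet histories.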

Plots (walk-forward, OOS)

Figure: Equity curve.
Figure: Drawdown.
Figure: Year-by-year ROI.
Figure: Rolling Sharpe (60 betting days).
Figure: CLV distribution.

Reproduce quickly

pip install -e .
# Train only
hoops-spread modeling
# Backtest only
hoops-spread backtest
# Train + Backtest (expects features in data/features)
hoops-spread all

Early upstream data/sentiment orchestration lives in /wip; the finalized modeling and backtesting code is production-ready.

Repo map (fast tour)

hoops_spread/modeling/* • hoops_spread/backtesting/* • config/boruta_features_sentiment.txt • /wip/* (upstream data + sentiment).

Notes & limitations

Reality checks: limits, line moves, stale openers, and book rules can dominate realized ROI; CLV is the primary signal of edge. This backtest assumes execution near the modeled line with a fixed edge threshold. Upstream collection is being consolidated into a single DAG; injury/refs/travel feeds are on the roadmap.

Hybrid Monte Carlo Options Pricer

Revamp in progress • View on GitHub →

Modular pricing engine that generates rBergomi-style paths and evaluates early exercise with four complementary methods, plus an optional Torch-based Bayesian meta-model for post-processing and uncertainty.

rBergomi • Asymptotics • LSM • Dual/Martingale • Branching • OpenMP • CUDA-optional • Torch BNN

83K

Paths/sec

6.32×

Speedup @ 8 threads

0.29% / 0.12%

BS error (C / P)

Figure: LSM convergence (American put), price versus number of Monte Carlo paths.

Figure: OpenMP scaling, speedup versus threads.

Figure: Black-Scholes sanity check, absolute percent error for call and put.

Mini demo: Black-Scholes price + Greeks (client-side)

Inputs: Spot (S), Strike (K), Time (T, years), Vol (σ, %), Rate (r, %), Dividend (q, %), Option type.
Outputs: Price, Delta, Gamma, Vega / 1%, Theta / day, Rho / 1%.
Plot: price curve vs spot (K, σ, r, q, T held fixed).

Benchmarks from v1 (MSVC Release, OpenMP on an 8c/16t Windows box); revamp in progress. Mini demo uses analytic Black-Scholes (client-side), not the full engine.
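The analytic formulas behind the mini demo are short enough to show in full. This is a Python sketch rather than the page's client-side code; it takes volatility and rates as decimals instead of percentages, and the 365-day convention for theta is an assumption.

```python
from math import erf, exp, log, pi, sqrt

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def norm_pdf(x):
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)

def black_scholes(S, K, T, sigma, r, q=0.0, option_type="call"):
    """European price and Greeks with continuous dividend yield q."""
    if option_type not in ("call", "put"):
        raise ValueError("option_type must be 'call' or 'put'")
    sign = 1.0 if option_type == "call" else -1.0
    sqrt_T = sqrt(T)
    d1 = (log(S / K) + (r - q + 0.5 * sigma ** 2) * T) / (sigma * sqrt_T)
    d2 = d1 - sigma * sqrt_T
    df_q, df_r = exp(-q * T), exp(-r * T)

    price = sign * (S * df_q * norm_cdf(sign * d1)
                    - K * df_r * norm_cdf(sign * d2))
    delta = sign * df_q * norm_cdf(sign * d1)
    gamma = df_q * norm_pdf(d1) / (S * sigma * sqrt_T)
    vega = S * df_q * norm_pdf(d1) * sqrt_T
    theta = (-S * df_q * norm_pdf(d1) * sigma / (2.0 * sqrt_T)
             - sign * r * K * df_r * norm_cdf(sign * d2)
             + sign * q * S * df_q * norm_cdf(sign * d1))
    rho = sign * K * T * df_r * norm_cdf(sign * d2)
    return {
        "price": price,
        "delta": delta,
        "gamma": gamma,
        "vega_per_1pct": vega / 100.0,    # per 1 vol point
        "theta_per_day": theta / 365.0,   # calendar-day convention (assumption)
        "rho_per_1pct": rho / 100.0,      # per 1 percentage point in r
    }
```

With S = K = 100, T = 1, σ = 20%, r = 5%, and q = 0, the call is worth about 10.45 and the put about 5.57, which satisfies put-call parity.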

What this is

A modular American-style options pricer built around Monte Carlo path generation under rough volatility. The engine compares and combines multiple early-exercise estimators and can feed their outputs into a Torch-based Bayesian meta-model to quantify prediction uncertainty, useful for research or downstream screening.

Methods implemented

Asymptotic analysis: boundary-style early-exercise approximations for fast screening.
Branching processes: upper/lower bounds via randomized tree exploration.
LSM (Longstaff–Schwartz): regression of continuation values across paths.
Martingale/duality optimization: variance-reduced bounds on the American price.

(Methods are implemented per standard literature, e.g., Keller Meeting on American Option Pricing, 2005.)
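Of the four methods, LSM is the easiest to show compactly. The sketch below prices an American put under plain geometric Brownian motion, not the engine's rough-volatility paths, and uses a quadratic polynomial basis; both are simplifications made here for illustration, and `lsm_american_put` is a hypothetical name.

```python
import numpy as np

def lsm_american_put(S0=36.0, K=40.0, r=0.06, sigma=0.2, T=1.0,
                     n_steps=50, n_paths=20000, seed=0):
    """Longstaff-Schwartz estimate of an American put price under GBM."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    disc = np.exp(-r * dt)

    # S[:, i] is the simulated price at time (i + 1) * dt.
    z = rng.standard_normal((n_paths, n_steps))
    increments = (r - 0.5 * sigma ** 2) * dt + sigma * np.sqrt(dt) * z
    S = S0 * np.exp(np.cumsum(increments, axis=1))

    # Start from the payoff at maturity and step backwards in time.
    cash = np.maximum(K - S[:, -1], 0.0)
    for i in range(n_steps - 2, -1, -1):
        cash *= disc  # discount future cash flows back to the current date
        exercise = np.maximum(K - S[:, i], 0.0)
        itm = exercise > 0.0
        if itm.sum() > 3:
            # Regress discounted future cash flows on spot, ITM paths only.
            coeffs = np.polyfit(S[itm, i], cash[itm], 2)
            continuation = np.polyval(coeffs, S[itm, i])
            idx = np.where(itm)[0][exercise[itm] > continuation]
            cash[idx] = exercise[idx]

    # Discount the final step and allow immediate exercise at time 0.
    return max(K - S0, float(cash.mean() * disc))
```

The defaults mirror a commonly used test case (S0 = 36, K = 40, r = 6%, σ = 20%, T = 1), where the estimate should land well above the 4.0 intrinsic value. Because the same paths are used for regression and valuation, the estimate carries a small bias in addition to Monte Carlo noise.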

Rough-vol paths (rBergomi flavor)

Fractional Gaussian noise with FFT acceleration; automatic estimation of H, vol-of-vol, and correlation from recent returns, then forward-variance construction for path simulation.
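One standard way to generate fractional Gaussian noise with an FFT is circulant embedding (Davies-Harte). The source does not say which FFT scheme the engine uses, so treat this as an illustrative stand-in; `fgn_paths` is a hypothetical name.

```python
import numpy as np

def fgn_paths(n_paths, n_steps, H, seed=0):
    """Unit-variance fractional Gaussian noise via circulant embedding."""
    rng = np.random.default_rng(seed)

    # Autocovariance of fGn at lags 0..n_steps.
    k = np.arange(n_steps + 1, dtype=float)
    gamma = 0.5 * (np.abs(k + 1) ** (2 * H) - 2 * k ** (2 * H)
                   + np.abs(k - 1) ** (2 * H))

    # First row of the circulant matrix that embeds the covariance matrix.
    row = np.concatenate([gamma, gamma[-2:0:-1]])
    m = len(row)  # equals 2 * n_steps

    # Eigenvalues of the circulant matrix; clip tiny negative round-off.
    lam = np.clip(np.fft.fft(row).real, 0.0, None)

    # Colour complex white noise in the frequency domain, then transform back.
    noise = (rng.standard_normal((n_paths, m))
             + 1j * rng.standard_normal((n_paths, m)))
    y = np.fft.fft(np.sqrt(lam / m) * noise, axis=1)
    return y.real[:, :n_steps]
```

To get increments of fractional Brownian motion on a grid with spacing `dt`, multiply the output by `dt ** H`; cumulative sums then give the fBm path. The cost is O(n log n) per path instead of the O(n²) or worse of a Cholesky factorization.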

Bayesian meta-model (optional)

Torch/LibTorch model (supports MC-Dropout) to post-process pricer outputs and produce a mean prediction with uncertainty bands. Runs on CPU by default; GPU/CUDA is optional.

Benchmarks (v1)

Throughput: ~83K paths/sec @ 50K paths, 252 steps (FFT fGn).
OpenMP scaling: 6.32× at 8 threads (50K paths, 252 steps).
Validation: European MC vs Black-Scholes error 0.29% call / 0.12% put (100K paths).

Numbers from MSVC Release on an 8c/16t Windows system; see repo reports/benchmarks.md for full tables.

Implementation notes

C++ core with OpenMP for parallel batches; LibTorch for the BNN; CLI pipeline to augment option datasets (no public demo).

Current status

Actively being revamped; v2 results pending. The benchmarks/plots shown above are from v1 for an honest baseline.

Published Research

Trader Behavior in 2024 Election Prediction Markets

An analysis of retail and institutional impact on Kalshi prediction markets. I segment traders with a Gaussian Mixture Model, then run price impact regressions, power-law fits, persistence tests, and mispricing event studies to understand who moves price and when flow becomes predictive.

Read paper (PDF) → • View data & code on GitHub →

Headline numbers

1,697

Inst. median size

Retail median size

-0.154

KH retail slope

Flow → ΔP

Retail predictive (KH)

Numbers are from PRES-2024 Kalshi trade logs (retail vs institutional via GMM). Slopes are log-log fits; Granger results use differenced stationary series.
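A log-log slope of this kind is an ordinary least-squares line through log(flow) versus log(|ΔP|). The sketch below shows the mechanics only; the function name `loglog_slope` and the choice to drop non-positive observations are assumptions, not the paper's exact preprocessing.

```python
import numpy as np

def loglog_slope(flow, abs_dp):
    """Fit log|dP| = intercept + slope * log(flow); return (slope, intercept)."""
    flow = np.asarray(flow, dtype=float)
    abs_dp = np.asarray(abs_dp, dtype=float)
    # Logs are undefined at zero, so keep strictly positive pairs only.
    mask = (flow > 0) & (abs_dp > 0)
    slope, intercept = np.polyfit(np.log(flow[mask]), np.log(abs_dp[mask]), 1)
    return float(slope), float(intercept)
```

A slope below 1 means price impact grows less than proportionally with flow, and a negative slope means larger flow goes with smaller absolute price changes.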

Key findings

→ Retail flow can be predictive: In KH, retail net flow Granger-causes subsequent price changes across multiple lags.

→ Institutional impact is often contemporaneous: In DJT, institutional flow has stronger same-window price impact vs retail.

→ Markets are resilient: Median price-impact ratios drop below 1 across horizons, consistent with mean-reversion / absorption.

→ Mispricing corrections are nuanced: “First institutional trade” is frequently not purely corrective; behavior depends on mispricing definition (MA% vs z-score).

Evidence plots

Density plot of log trade sizes for institutional versus retail labels — Trade size density (log scale).

Log-log scatter and fitted lines for flow versus absolute price change in the KH market by label — Flow vs |ΔP| power law (KH).

Log-log scatter and fitted lines for flow versus absolute price change in the DJT market by label — Flow vs |ΔP| power law (DJT).

Subplots of price persistence metrics across horizons in the KH market by label — Price persistence vs horizon (KH).

Subplots of price persistence metrics across horizons in the DJT market by label — Price persistence vs horizon (DJT).

Trading implications

Segmented flow features: “net flow” is not one thing; labeling matters. Retail vs institutional flow can behave differently and carry different predictive content.
Event definitions matter: the same market can look “corrective” or “momentum-driven” depending on mispricing definition; robust signals need multiple lenses.
Impact decays: persistence tests suggest impact mean-reverts across horizons, which informs holding periods and risk controls.

Methods (high level)

Trader labeling: Gaussian Mixture Model on log trade size (plus robustness checks).
Impact: regressions of signed flow vs contemporaneous ΔP; power-law fits on log-log scale.
Persistence: post-trade impact ratios across horizons (median/CI).
Mispricing events: MA% and z-score definitions; flow decomposition into corrective vs counter-flow.
Predictability: bivariate Granger causality on stationary series; event-level tests as a robustness check.
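The trader-labeling step can be illustrated with a two-component Gaussian mixture on log trade size, fitted by EM. This is a hand-rolled NumPy stand-in, not the paper's code: the two-component choice, the quantile initialization, and the name `fit_gmm_1d` are assumptions, and a library implementation would normally be used in practice.

```python
import numpy as np

def fit_gmm_1d(x, n_iter=200, tol=1e-8):
    """Fit a 2-component 1-D Gaussian mixture; return (means, variances, weights, labels)."""
    x = np.asarray(x, dtype=float)
    mu = np.quantile(x, [0.25, 0.75])   # start the components apart
    var = np.full(2, x.var())
    weights = np.array([0.5, 0.5])
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation.
        dens = (weights * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var)
                / np.sqrt(2.0 * np.pi * var))
        total = dens.sum(axis=1, keepdims=True)
        resp = dens / total
        # M-step: re-estimate weights, means, and variances.
        nk = resp.sum(axis=0)
        weights = nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-9
        # Stop once the log-likelihood no longer improves.
        ll = float(np.log(total).sum())
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return mu, var, weights, resp.argmax(axis=1)
```

Pass `np.log(trade_size)` as `x`; the component with the larger mean would then be read as the institutional cluster and the other as retail.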

Reproduce

Full code, data pulls, and notebooks live in the repo. The paper PDF is linked above for the narrative and figure references.

Limitations

No full order book / depth (trade prints only), so impact is measured on mid/last-trade proxies.
Prediction markets have unique microstructure (fees, contract increments, liquidity), so the results transfer to equities conceptually rather than one-to-one.
Some VAR residual autocorrelation issues; conclusions emphasize robust bivariate results and descriptive evidence.
