Causal Inference
The best free intro to causal inference. DiD, IV, RDD, synthetic control — all with Python code you can run. No fluff.
DAGs that don't suck. Nick Huntington-Klein explains causality like he's at a whiteboard, not a lectern. Free, modern, and actually enjoyable to read.
Scott Cunningham's guide — reads like a conversation, not a textbook. Code in Stata, R, and Python.
Synthetic data tutorials that feel like a game. Generate fake users, apply treatment, challenge yourself to recover the effect using Random Forest.
Microsoft's library for ML-based causal inference. Tutorials on heterogeneous treatment effects, double ML, and more.
Microsoft's causal inference library. Great for understanding causal graphs and robustness checks.
Applied Causal Inference Powered by ML and AI. Double/debiased ML, heterogeneous effects, and modern methods from the pioneers.
Aleksander Molak's hands-on guide. Graph-based causal inference, discovery algorithms, and Python implementations. Perfect for engineers.
Bayesian causal inference done right. MCMC, probabilistic programming, and causal models from the PyMC team.
Synthetic controls that feel like scikit-learn. Built on PyMC, perfect for quasi-experiments when you can't randomize.
Stats, causal inference, experiment pitfalls. Frank, often funny, always insightful. The most-read stats blog for good reason.
Applied econometrics notes and code by Apoorva Lal covering modern causal inference methods.
Jonathan Roth et al.'s comprehensive guide to modern difference-in-differences. Staggered timing, pre-trends, and new estimators.
Stanford GSB's ML and Causal Inference short course. Videos, slides, and tutorials on causal forests, treatment effects, and policy evaluation.
Experimentation
Practical guides on experiment design, stats engines, CUPED, and common pitfalls. Written by practitioners.
Simulates a social graph in Python to show why network effects ruin experiments, then walks through cluster-based fixes.
Why treating one user affects another in marketplaces. Clear diagrams that explain interference by treating the marketplace as a living organism, not a static dataset.
A modern take on SUTVA violations. Why you randomize time, not users, in marketplaces. The 30-minute window trick explained.
When your p-value calculation has billions of rows. The engineering reality of running causal inference at Uber scale.
Documentation that reads like a tutorial. Exact logic to split time into windows and washout periods for network A/B tests in Python.
The author closest in style to Matheus Facure. Explains CUPED not as a statistical theorem, but as a simple linear regression adjustment in 3 lines of pandas.
Short, punchy script that simulates 1,000 experiments to show that variance actually goes down. Very satisfying to run.
Deep dives on sequential testing, variance reduction, and experiment analysis. Free stats engine docs.
Sample size calculators, stats explainers, and practical A/B testing guides. Bookmark-worthy tools.
The article that taught the industry why peeking at p-values is a sin. Short, lethal, career-saving.
Metric hierarchies, North Star metrics, and building data-informed products. The definitive framework for product metrics.
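The CUPED entries above describe the method as a regression adjustment on a pre-experiment metric, checked by simulating many experiments. A minimal sketch of that idea, using only the Python standard library rather than pandas. Everything here is an assumption for illustration: the data is synthetic, and the function names, sample sizes, and the true effect of 0.5 are hypothetical, not taken from any of the listed resources.

```python
import random
import statistics


def cuped_adjust(y, x):
    """Return y adjusted by the pre-experiment covariate x.

    theta is the slope from regressing y on x, so the adjustment is
    y - theta * (x - mean(x)).
    """
    mean_x = statistics.fmean(x)
    mean_y = statistics.fmean(y)
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (len(x) - 1)
    theta = cov_xy / statistics.variance(x)
    return [yi - theta * (xi - mean_x) for xi, yi in zip(x, y)]


def simulate_experiment(n_users, true_effect, rng):
    """Generate one synthetic A/B test (hypothetical data-generating process)."""
    pre = [rng.gauss(10, 3) for _ in range(n_users)]        # pre-period metric
    treated = [rng.random() < 0.5 for _ in range(n_users)]  # random assignment
    post = [
        0.8 * p + rng.gauss(0, 1) + (true_effect if t else 0.0)
        for p, t in zip(pre, treated)
    ]
    return pre, treated, post


def diff_in_means(values, treated):
    """Treatment mean minus control mean."""
    t = [v for v, flag in zip(values, treated) if flag]
    c = [v for v, flag in zip(values, treated) if not flag]
    return statistics.fmean(t) - statistics.fmean(c)


TRUE_EFFECT = 0.5  # assumed effect size for the simulation
raw_estimates, cuped_estimates = [], []
for seed in range(200):
    rng = random.Random(seed)
    pre, treated, post = simulate_experiment(2000, TRUE_EFFECT, rng)
    raw_estimates.append(diff_in_means(post, treated))
    cuped_estimates.append(diff_in_means(cuped_adjust(post, pre), treated))

raw_var = statistics.variance(raw_estimates)
cuped_var = statistics.variance(cuped_estimates)
print(f"variance of raw estimates:   {raw_var:.5f}")
print(f"variance of CUPED estimates: {cuped_var:.5f}")
```

Both estimators target the same effect; the adjusted one should simply scatter less across the simulated experiments, because the pre-period metric explains most of the noise in the post-period metric.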
Coding
High-quality computational economics in Python and Julia. Dynamic programming, time series, asset pricing — all with code.
Arthur Turrell's practical guide. Python basics through advanced workflows — built specifically for econ researchers.
Command line, Git, debugging, shell scripting. The CS skills they don't teach in econ PhD programs but you absolutely need.
Software design principles for ML applications. Go from messy notebooks to maintainable, modular code with OOP essentials and refactoring guides.
Kevin Sheppard's comprehensive intro for economists. NumPy, pandas, statsmodels, and econometric applications.
ML & Data Science
Train a model in 10 minutes. Understand why it works later. Jeremy Howard's famous 'code first, theory second' approach to deep learning.
The 'I actually finished this one' ML book. Free with runnable Python code. Math when you need it, intuition first.
The more rigorous sibling of ISLR. Hastie, Tibshirani, and Friedman's classic on statistical learning theory. Free PDF from Stanford.
Rob Hyndman's free book on time series forecasting. ARIMA, ETS, hierarchical forecasting — with R code.
Sutton & Barto's classic. The definitive RL textbook, free from the authors.
OpenAI's intro to deep RL. Clear explanations of policy gradients, Q-learning, and working code.
DeepMind's David Silver teaches RL from scratch. The lectures that trained a generation of RL researchers.
Andrew Ng's foundational ML course on Coursera. Covers supervised/unsupervised learning, neural networks, and best practices.
Andrew Ng's deep learning sequence. CNNs, RNNs, transformers, and practical AI project structuring.
Pricing & Demand
Talluri & van Ryzin's comprehensive textbook. Dynamic pricing, capacity allocation, overbooking — the bible of RM.
Exploration-exploitation trade-off using a restaurant menu example. Simulate greedy vs Thompson Sampling to see which makes more money.
Don't let 'Stanford' scare you. The accompanying Python code is the industry standard for bandit algorithms from scratch.
The single best piece of data journalism in tech. Interactive, animated tour of how they combine styles, logistics, and feedback loops.
Matteo Courthoud's code-first walkthrough of the BLP demand model. The 'Brave and True' for structural economics.
QuantEcon tutorial on estimating demand curves with Python. Turns PhD-level IO into a solvable Problem().solve().
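The bandit entries above compare a greedy policy with Thompson Sampling. A minimal sketch of that comparison, assuming three menu items with hypothetical order probabilities and a reward of 1 per order; the probabilities, function name, and run counts are illustrative assumptions, not taken from the listed tutorials.

```python
import random


def run_policy(policy, order_probs, n_rounds, rng):
    """Simulate one policy on Bernoulli arms and return the total reward."""
    k = len(order_probs)
    wins = [0] * k    # orders observed per menu item
    losses = [0] * k  # non-orders observed per menu item
    total = 0
    for _ in range(n_rounds):
        if policy == "greedy":
            # Pure greedy: try each item once, then always pick the best observed rate.
            untried = [a for a in range(k) if wins[a] + losses[a] == 0]
            if untried:
                arm = untried[0]
            else:
                arm = max(range(k), key=lambda a: wins[a] / (wins[a] + losses[a]))
        else:
            # Thompson Sampling: sample each Beta(wins + 1, losses + 1) posterior
            # and play the item with the largest sampled rate.
            draws = [rng.betavariate(wins[a] + 1, losses[a] + 1) for a in range(k)]
            arm = max(range(k), key=lambda a: draws[a])
        reward = 1 if rng.random() < order_probs[arm] else 0
        wins[arm] += reward
        losses[arm] += 1 - reward
        total += reward
    return total


ORDER_PROBS = [0.05, 0.10, 0.30]  # hypothetical chance a diner orders each item
N_RUNS, N_ROUNDS = 100, 1000

avg_reward = {}
for policy in ("greedy", "thompson"):
    totals = [
        run_policy(policy, ORDER_PROBS, N_ROUNDS, random.Random(seed))
        for seed in range(N_RUNS)
    ]
    avg_reward[policy] = sum(totals) / N_RUNS

print(avg_reward)
```

In this setup the greedy policy often locks onto a weak item after an unlucky first try of the best one, while Thompson Sampling keeps exploring in proportion to its remaining uncertainty.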
Auctions & Market Design
Python simulations demonstrating Revenue Equivalence. Code to solve BNE when closed-form solutions don't exist.
Gale-Shapley stable matching in Python. How to match riders to drivers, students to schools, organs to patients.
GSP auctions, quality scores, AdRank — how Google/Meta ad auctions actually work. Chapter 21.
Roughgarden's famous course. Why Bitcoin is an auction, how selfish routing breaks the internet, and Price of Anarchy — explained for coders.
Stanford GSB's market design expert. Course materials on auctions, matching markets, and platform economics. Advised Airbnb, LinkedIn, Google.
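The Gale-Shapley entry above covers stable matching. A minimal sketch of the deferred-acceptance algorithm, assuming equal numbers of proposers and receivers with complete preference lists; the rider and driver names are hypothetical.

```python
def gale_shapley(proposer_prefs, receiver_prefs):
    """Proposer-optimal stable matching via deferred acceptance.

    Both arguments map each agent to a list of agents on the other side,
    ordered from most to least preferred. Returns {proposer: receiver}.
    """
    # rank[r][p] = position of proposer p in receiver r's list (lower is better)
    rank = {
        r: {p: i for i, p in enumerate(prefs)}
        for r, prefs in receiver_prefs.items()
    }
    next_choice = {p: 0 for p in proposer_prefs}  # next list index to propose to
    engaged = {}                                  # receiver -> current proposer
    free = list(proposer_prefs)

    while free:
        p = free.pop()
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        if r not in engaged:
            engaged[r] = p                 # receiver was unmatched: accept
        elif rank[r][p] < rank[r][engaged[r]]:
            free.append(engaged[r])        # receiver trades up, old partner is free
            engaged[r] = p
        else:
            free.append(p)                 # rejected: p proposes again later
    return {p: r for r, p in engaged.items()}


def is_stable(matching, proposer_prefs, receiver_prefs):
    """Check that no proposer and receiver both prefer each other to their partners."""
    partner_of = {r: p for p, r in matching.items()}
    for p, prefs in proposer_prefs.items():
        for r in prefs:
            if r == matching[p]:
                break  # every remaining receiver is less preferred by p
            r_prefs = receiver_prefs[r]
            if r_prefs.index(p) < r_prefs.index(partner_of[r]):
                return False  # (p, r) is a blocking pair
    return True


# Hypothetical preferences for three riders and three drivers.
riders = {
    "r1": ["d1", "d2", "d3"],
    "r2": ["d1", "d3", "d2"],
    "r3": ["d2", "d1", "d3"],
}
drivers = {
    "d1": ["r2", "r1", "r3"],
    "d2": ["r1", "r3", "r2"],
    "d3": ["r3", "r2", "r1"],
}

matching = gale_shapley(riders, drivers)
print(matching, is_stable(matching, riders, drivers))
```

Swapping the two arguments gives the driver-optimal matching instead, which is the design choice worth thinking about when deciding which side of a marketplace proposes.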
Industry Blogs
1000+ engineers, millions of users, zero guesswork. Real case studies on how Netflix actually runs A/B tests at scale.
Surge pricing, marketplace design, causal inference at scale. See how economists tackle real problems at Uber.
Experimentation platform design, interference in experiments, and applied causal inference from Airbnb's data science team.
How do you recommend songs to 500M users? Personalization, search, and ML at audio scale.
Marketplace economics, delivery optimization, and experimentation. Great posts on real-time pricing and logistics.
Rideshare economics, causal inference, and marketplace experiments. Practical posts from Lyft's economics and data science teams.
How Lyft combines causal inference with forecasting for pricing. The bridge between 'what happened' and 'what will happen'.
Efficiency isn't speed—it's an economic equilibrium. A masterclass in defining the objective function for marketplace optimization.
Demand forecasting, inventory optimization, and personalization. Unique blend of fashion retail + serious data science.
Large-scale experimentation, ML infrastructure, and data discovery at Facebook scale. Posts on causal inference and data tools.
41K concurrent A/B tests on 700M members. Network effects, inclusive product testing, and experimentation infrastructure.
Visual discovery engine, recommendation systems, and ML at scale. Feature engineering and personalization.
Marketplace balancing, delivery optimization, demand forecasting. Making on-demand grocery profitable.
Payment economics, fraud detection ML, financial data infrastructure. Building economic infrastructure for the internet.
Travel marketplace experimentation at massive scale. A/B testing, recommender systems, and pricing optimization.
Research from Amazon's scientists. Causal inference, supply chain optimization, pricing, and forecasting.
Economics & Strategy
Platform economics and digital strategy from a leading IO economist. Network effects, two-sided markets, and antitrust.
The gold standard for tech strategy analysis. Aggregation theory, platform dynamics, and business model breakdowns.
The firm as a mechanism design problem. How zero marginal costs and network effects rewrote the economics of tech companies.
Tech economics, AI, innovation, growth. Deep dives with data, accessible to non-specialists. The economist's tech newsletter.
Innovation economics research summaries. Academic findings on R&D, productivity, and science — in plain language.
Platform competition and network effects. Academics summarizing their own research on marketplaces and big tech.
The econ blog. Tech, innovation, markets — prolific, eclectic, and influential. A daily must-read.
Tech market trends and strategic analysis. What's happening in tech and why it matters.
Ex-Patreon/Reforge. Growth loops, product strategy, and how to structure product analytics.
SQL
Interactive SQL lessons from basic to advanced. Great for learning JOINs, window functions, and subqueries with a real database.
Learn SQL with interactive exercises. No setup required — run queries right in the browser. Perfect for beginners.
50 essential SQL problems to master for interviews. CTEs, window functions, and common patterns used at FAANG.
Modern in-process SQL database. Runs on your laptop, reads Parquet directly, and is perfect for analytics. The new pandas killer.
LeetCode
Curated LeetCode roadmap organized by pattern. Video explanations that actually make sense. The modern way to prep for coding interviews.
The 75 most important LeetCode problems. Arrays, strings, trees, graphs, DP — if you can solve these, you can handle any interview.
14 patterns to solve any coding interview question. Two pointers, sliding window, BFS/DFS, and more — with Python templates.
Automation
The best free Python book for non-programmers. Web scraping, Excel automation, file management — practical skills for data work.
The industrial-strength web scraping framework for Python. Build spiders, handle anti-bot measures, and scale to millions of pages.
Practical guide to scraping with BeautifulSoup and requests. Parse HTML, handle pagination, and extract structured data.
Statistics
Introduction to probability and statistics with Python. Exploratory data analysis, hypothesis testing, and Bayesian methods.
Richard McElreath's Bayesian approach to statistics. PyMC3 translations available. The book that changed how many think about inference.
Survey science, conjoint analysis, and quantitative UX research. Statistical rigor for product research.
Monographs
Stefan Wager's Stanford STATS 361 notes. Causal forests, HTE, and prescriptive policy learning — the exact tools used in tech companies.
Cosma Shalizi's 600-page CMU manuscript. Geometry of regression, DAGs, heavy-tailed distributions. Mercilessly rigorous yet chatty.
Hardt & Recht's living book grounding ML in decision theory. Why models fail in the real world and how to think about prediction as action.
Bronstein et al.'s proto-book unifying CNNs, GNNs, and Transformers under symmetry and group theory. The GNN bible.
Michael Betancourt's Stan case studies. HMC geometry, divergences, hierarchical modeling — the Bayesian technical standard.
Lattimore & Szepesvári's definitive reference. UCB, Thompson Sampling, Contextual Bandits — experimentation beyond A/B testing.
Agarwal, Jiang, Kakade's working draft. Statistical learning theory of RL — 'How many samples to learn this policy?'
Guido Imbens' Nobel lectures. Matrix completion for synthetic controls, the rigorous foundation for tech metrics.
Belloni & Chernozhukov on Post-Double Selection Lasso. Use ML to select controls while still getting valid p-values.
Bruce Hansen's modern reference. Asymptotics, cluster-robust SEs, panel data — the technical backbone.
Ariel Rubinstein's free lecture notes. Game theory, rational choice, bargaining — for mechanism design work.
Alvin Roth's Nobel lecture on matching markets. Stability, unraveling — essential for marketplace economics.
Technical leadership, glue work, and navigating org complexity. Essential for senior ICs in tech.
Martin Kleppmann's engineering bible. Stream vs batch, distributed systems, and data infrastructure.
Optimization
The bible of convex optimization — free online, universally cited. Covers LP, QP, SDP, and more.
Boyd's legendary lectures on convex optimization. The gold standard for learning optimization theory.
University of Melbourne's course on constraint programming, local search, and MIP. Covers MiniZinc modeling language.
Hands-on convex optimization in Python. Learn to model and solve real problems with CVXPY.
Agentic AI
Andrew Ng on the four agentic design patterns — reflection, tool use, planning, multi-agent. Start here.
Build agents from scratch with Harrison Chase. State machines, tool calling, and human-in-the-loop.
Design patterns from Claude's creators. The best practitioner guide on when and how to build agents.
Official docs for the industry-standard agent framework. Graphs, state, persistence, and deployment.