Letter 104: Using Domain Knowledge for your Prediction Model

A prediction model is a tool anyone can create; domain knowledge is what makes it excel

Mar 17, 2026


Last week I walked you through how to vibe code a prediction model from scratch.

The response was great and a bunch of people have started building their own models, which is awesome to see.

Some of the questions I’ve gotten this week are along the lines of “what should I try and predict?” and “are you meant to just follow the model blindly once it’s up and running?”

So I thought I’d write a bit more about the concept of domain knowledge since it answers both these questions + more.

Domain knowledge is a layer that sits (or should sit) both at the foundation of and on top of any model you build. It’s the thing that separates someone who has a model from someone who has a good model and uses it well.

This is the stuff you know about your area of expertise that no dataset fully captures. Context, nuance, edge cases, etc. Things that are hard to quantify but easy to recognize if you’ve spent thousands of hours in a space.

I think understanding how and when to apply your domain knowledge is one of the most important skills you develop as you work with prediction models. And it’s something I’ve been thinking about a lot as I continue to refine my Dota 2 model and track real bets.

My model, by the way, is continuing to prove quite the profitable little thing. Here are the latest results. Still early days, but my confidence in it is slowly but surely growing:

137 bets and profitable. I won’t feel too comfortable until we’re at 500 bets, and probably not really comfortable until we hit 1000+, but… we’re on our way.
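To put rough numbers on why 137 bets is still a small sample, here is a minimal sketch of how the uncertainty around a hit rate shrinks as the bet count grows. It uses a simple normal approximation, and the 55% hit rate is an assumed figure for illustration only, not a result from the model above. The function name `hit_rate_margin` is likewise made up for this example.

```python
import math


def hit_rate_margin(n_bets: int, hit_rate: float, z: float = 1.96) -> float:
    """Half-width of a normal-approximation ~95% interval for a hit rate."""
    return z * math.sqrt(hit_rate * (1 - hit_rate) / n_bets)


# Hypothetical hit rate, chosen only to illustrate the effect of sample size.
ASSUMED_HIT_RATE = 0.55

for n in (137, 500, 1000):
    margin = hit_rate_margin(n, ASSUMED_HIT_RATE)
    print(f"{n:>5} bets: hit rate pinned down to about +/- {margin:.1%}")
```

The margin shrinks with the square root of the number of bets, so going from 137 to 1000 bets narrows it considerably but not sevenfold. Real bets at varying odds and stakes are messier than this, so treat it as intuition rather than a proper significance test.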

Anyway. Back to domain knowledge. Here’s what we’ll cover today:

1. What domain knowledge actually is
2. Where domain knowledge helps you build a better model
3. When to trust yourself over the model
4. When to trust the model over yourself
5. This applies beyond betting on esports
6. Final thoughts

1. What domain knowledge actually is

Domain knowledge is everything you know about a subject that you’ve accumulated through experience, observation, and participation. It’s the stuff that lives in your head and is hard to put into a spreadsheet or JSON file or bit of Python code.

For me and Dota 2, that’s knowledge which comes from 20+ years of playing the game and thousands of hours watching professional matches. Some examples of domain knowledge:

Knowing that the meta shifts considerably when new patches drop, and that some teams (and players) perform better than others depending on the patch notes/changes
Knowing when a team has a stand-in player replacing one of their regular players due to visa issues (or other issues)
Knowing which games “don’t matter”: a team that has gone 0-4 in the group stage has a 0% chance of making it to the playoffs even if they win every match from now on, but they still have to play their matches, so they might not try as hard or they might try more experimental things than usual (the flip side: a team that has already secured their spot might also be more experimental)
Knowing when a team just played the second-longest bo3 series in history, still has a bo5 to play, and is running on low sleep and fumes at the end of a 2.5-month trip away from home (this happened this past weekend)

None of that is in my model’s training data. You can sort of come up with ways to add versions of these into the model, but a) you still need to know to look for them in the first place (something I doubt most non-Dota fans would know to do) and b) a lot of the time the information is very difficult or impossible to scrape, and it applies to such a small % of matches that even trying to include it harms the overall model.
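For the cases that can be encoded, a rough sketch of what “adding a version of this into the model” might look like is below. Everything here is hypothetical: the `MatchContext` fields, the function names, and the feature names are invented for illustration and are not taken from the actual model. The elimination check is also deliberately simplified; real playoff qualification usually depends on other teams’ results and tiebreakers.

```python
from dataclasses import dataclass


@dataclass
class MatchContext:
    """Hand-entered context for one team in one match (all fields hypothetical)."""
    has_standin: bool
    wins: int
    matches_left: int
    wins_needed_for_playoffs: int


def is_eliminated(ctx: MatchContext) -> bool:
    # Simplified: eliminated if winning every remaining match still falls short.
    return ctx.wins + ctx.matches_left < ctx.wins_needed_for_playoffs


def has_clinched(ctx: MatchContext) -> bool:
    # Simplified: clinched once the required win count is already reached.
    return ctx.wins >= ctx.wins_needed_for_playoffs


def context_features(ctx: MatchContext) -> dict[str, int]:
    """Turn domain knowledge into 0/1 columns a model could train on."""
    return {
        "has_standin": int(ctx.has_standin),
        "dead_rubber": int(is_eliminated(ctx) or has_clinched(ctx)),
    }


# A team at 0 wins with 3 matches left that needs 4 wins to advance.
print(context_features(MatchContext(False, wins=0, matches_left=3,
                                    wins_needed_for_playoffs=4)))
```

The code is the easy part. The hard part is that someone still has to know these flags matter and fill them in by hand for every match, which is exactly the problem described above.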

The model sees numbers. Win rates, hero matchups, recent form, historical performance. It does a good job with those numbers. But it doesn’t actually watch the games, it doesn’t watch pre-match and post-match interviews, and it doesn’t understand, for lack of a better word, the vibes.

And yes, I’m using the word vibes unironically here, because sometimes that’s what it comes down to. You watch a team play and something feels off, so you look into it and see: oh, yeah, their coach actually isn’t with them for this tournament because of X, Y, or Z. So they’re not drafting as well as they normally would. That explains my vibe!

That’s domain knowledge.

The specifics will vary depending on the type of thing you’re trying to predict, but the principle is the same. Domain knowledge is either a) things you know that most others don’t and which you can feed into your model, or b) things you know that can’t reasonably be put into any model, but that might influence how much you want to rely on your model’s predictions.
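Case b) can be pictured as a layer that sits on top of the model’s output rather than inside it. The sketch below is one assumed way to express that, not a description of how any real bets are sized: the function name, the idea of halving the stake per red flag, and the example flags are all invented for illustration.

```python
def adjusted_stake(base_stake: float, red_flags: list[str],
                   cut_per_flag: float = 0.5) -> float:
    """Scale the model-suggested stake down once for every red flag noted.

    The model's prediction is left untouched; only how much we rely on it
    changes. cut_per_flag is an arbitrary, hypothetical setting.
    """
    return base_stake * (cut_per_flag ** len(red_flags))


# No concerns: follow the model as-is.
print(adjusted_stake(10.0, []))

# Things the model cannot see, noted by hand before placing the bet.
print(adjusted_stake(10.0, ["coach absent", "team exhausted after long series"]))
```

Writing the flags down, even in something this crude, has a side benefit: later you can check whether your overrides actually helped or whether the model was right all along.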

Let’s look at these in a bit more detail now.

Letters from a Zeneca
