Markov Chains and You!

This is an introductory article to a new series I'll be trying out. I love having ChatGPT conversations about stuff I don't understand or want to understand better, because it's gotten to the point where it's pretty damn accurate for most topics, and I pretty much get to ask as many stupid questions as I want. But then I got to thinking — maybe other people can learn from this too. So, I get to learn about new things I'm curious about, and I also get a free topic to write about for my readers? Sign me up!

That brings us to our pilot topic: Markov chains.

Markov chains are a mathematical system for describing the probability of moving from one "state" to another.

To put it in an example: let's say that, every day, I have three choices. In the afternoon, I could go get a coffee from a local coffee shop. Or, I could be more adventurous and try a cafe downtown. Alternatively, I can be lazy and stay in. Every single day, there is a…

  • 30% chance I leave home to get a coffee
  • 10% chance I go to a downtown cafe
  • 60% chance I just stay home (we refer to this as “the state going into itself”)

The idea here is that what I did earlier in the day doesn't really matter — only what I did immediately before, the "state" I'm currently in, if you will. As you might predict, you can keep chaining these states together… for example, if I did go to the downtown cafe, you could say there's a 90% chance I head back home afterward and a 10% chance I walk around downtown. When you see a Markov chain visualized, it becomes pretty obvious what they're all about:

From Jaro Education. This example uses states for the weather: if it is sunny today, there is a 10% chance it will be cloudy tomorrow.
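The coffee example can be simulated in a few lines. Here's a minimal sketch in Python — the probabilities out of "home" come straight from the list above, while the probabilities out of the other states (like always heading home after the coffee shop) are assumptions I made up to complete the chain:

```python
import random

# Transition probabilities out of each state. The "home" row is from the
# example above; the other rows are made-up assumptions to close the chain.
transitions = {
    "home":          {"coffee_shop": 0.30, "downtown_cafe": 0.10, "home": 0.60},
    "coffee_shop":   {"home": 1.00},                        # assumed
    "downtown_cafe": {"home": 0.90, "walk_downtown": 0.10},
    "walk_downtown": {"home": 1.00},                        # assumed
}

def step(state):
    """Pick the next state using only the current state's probabilities."""
    options = transitions[state]
    return random.choices(list(options), weights=list(options.values()))[0]

random.seed(42)
state = "home"
chain = [state]
for _ in range(10):
    state = step(state)
    chain.append(state)
print(" -> ".join(chain))
```

Notice that `step()` only ever looks at the current state — the entire history before it is irrelevant, which is exactly the Markov property.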

This is all well and good, as Markov chains are at a surface level very easy and intuitive to understand. But an immediate red flag should be up for you right now:

How the hell do you actually use this?

Sure, it's easy to bullshit the idea of me having a 30% chance of getting a coffee, or a 10% chance of it being sunny, but if we're going to apply this we need to know how we're getting those probabilities in the first place. In other words, we need to dive deeper…

The Basic Markov Model

Every example I’m going to use from here on out is real — real examples are preferable, don’t you think?

In particular, I’m going to cover the field of quantitative finance. I’m a quant guy, and my reason for wanting to learn Markov chains is based on this — plus, 90% of the practical use cases for Markov models (or models that utilize Markov chains) are in the quantitative finance field.

Let's say we want to build a Markov model to gain a good idea of what the next market cycle will be. Generally speaking, there are four market cycles:

  • High Return, High Volatility (shortened to HRHV from here on out)
  • High Return, Low Volatility (shortened to HRLV)
  • Low Return, High Volatility (LRHV)
  • Low Return, Low Volatility (LRLV)

First, we need to gather the probabilities of each of these things happening. We can do that like so:

  1. Gather a lot of historical return data of the S&P 500. Let’s say we’re trying to “predict” the next month for now — so all the data we’ll get is monthly
  2. Define what counts as R and what counts as V. HR vs. LR is easy — it's whether the S&P 500 was positive or negative that month. For HV vs. LV, we'll say it's LV if the S&P 500's standard deviation that month was at or below its historical average (which we can calculate from our data), and HV otherwise.
  3. Based on our definitions, we make a call: was the given historical month HRHV, HRLV, LRHV, or LRLV?
  4. Do tallies. A lot of tallies. We want to count:
    1. The number of times HRHV goes into HRLV
    2. The number of times HRHV → LRHV
    3. HRHV → LRLV…
    4. …etc. etc. (though don't forget to tally the number of times HRHV falls into itself, AKA HRHV → HRHV!)
  5. Do the probability counts on each of these, such as (HRHV → HRLV) / (total transitions out of HRHV). Dividing by the transitions out of that particular state makes each state's outgoing probabilities sum to 100%.
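Steps 4 and 5 are simple enough to sketch directly. The month labels below are made up for illustration — in practice they'd come from step 3's classification of real S&P 500 data:

```python
from collections import Counter, defaultdict

# Hypothetical monthly regime labels (in reality: derived from S&P 500 data).
months = ["HRHV", "HRLV", "HRLV", "LRHV", "HRHV", "HRHV", "LRLV", "HRLV", "HRHV"]

# Step 4: tally every transition, including states falling into themselves.
tallies = defaultdict(Counter)
for current, nxt in zip(months, months[1:]):
    tallies[current][nxt] += 1

# Step 5: divide each tally by the total transitions out of that state,
# so each state's outgoing probabilities sum to 100%.
probs = {
    state: {nxt: count / sum(counts.values()) for nxt, count in counts.items()}
    for state, counts in tallies.items()
}
print(probs["HRHV"])
```

In this toy sequence, HRHV transitions out three times (once each into HRLV, HRHV, and LRLV), so each of those gets a probability of 1/3.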

And there you have it! We’re done with our Basic Markov Model. In a way, all we did is just basic elementary school probability: we tallied up transition states, then divided them by a total to get a percent chance. You lay all this out and you can draw a very similar visual to our weather example. We did it!

…except, I don’t know any elementary schoolers who are rich quants. As it turns out, the Basic Markov Model is a little too basic, and doesn’t do all that great a job at predicting next month’s market. So, what are all those hedge fund guys doing different?

The Logistic Markov Model

This is where things get interesting.

Okay, technically a logistic regression equation isn’t the same thing as a Markov model, but if we gather a bunch of related logregs together, we could still build one!

In case you need a refresher, a logistic regression function (shortened to logreg — not a big fan of writing long words over and over) is a special type of function which always spits out a probability between 0 and 1. It does this by taking a weighted sum of its inputs and squashing it through the logistic ("sigmoid") function σ(z) = 1 / (1 + e^(−z)). In our case, a logreg might look like y = σ(b₀ + bx), where:

  • y is the probability of the given transition happening
  • b₀ is an intercept, and b is a coefficient for a variable
  • x is the variable

Now, we have a problem — so far we haven’t really gathered any variables. What would a variable look like in our example?

Well, if we’re trying to find the probability of market states, one thing we could use is common indicators for market states. Some examples include the beta of the S&P 500 (volatility), the unemployment rate (return), and the Copper-to-Gold ratio (a bit of both). So a sample logreg here might look like:

P(HRHV→HRLV) = σ(b₁[S&P Beta] + b₂[Unemployment] + b₃[Copper-to-Gold]), where σ is the logistic function that squashes the weighted sum into a probability

For our [indicators], we simply plug in the indicator's value at the given time. For our coefficients, we gather a lot of data on past market transitions and what these indicator values were at the time, then fit what are called one-vs-all logregs for every possible market transition — we can do this by putting the data into a logreg optimizer, which you can find in pretty much any spreadsheet or stats software (I don't want to dwell too much on how these optimizations are done, since we're beginning to stray away from Markov models). At the end of the day, these probabilities generate Markov chains similar to our previous two examples! These ones hold a bit more weight because they draw on numerous "informing" sources, rather than just the returns and volatility themselves, which can be driven by things we don't know about…
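Evaluating one of these logregs is just a weighted sum pushed through the logistic function. Every number below — the intercept, the coefficients, and the indicator readings — is made up purely for illustration; in practice the coefficients would come from the optimizer fit on historical data:

```python
import math

def sigmoid(z):
    # The logistic function: squashes any real number into a probability (0, 1).
    return 1 / (1 + math.exp(-z))

# Hypothetical fitted coefficients for the HRHV -> HRLV one-vs-all logreg.
intercept = -1.5
coefs = {"sp_beta": 0.8, "unemployment": -0.3, "copper_gold": 1.2}

# Today's (made-up) indicator readings.
indicators = {"sp_beta": 1.1, "unemployment": 4.0, "copper_gold": 0.25}

z = intercept + sum(coefs[k] * indicators[k] for k in coefs)
p_hrhv_to_hrlv = sigmoid(z)
print(f"P(HRHV -> HRLV) = {p_hrhv_to_hrlv:.3f}")
```

No matter what the indicators read, the sigmoid guarantees the output lands strictly between 0 and 1 — which is what lets us treat it as a transition probability.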

Actually, the more we think about it, there’s a lot we don’t understand about how market transitions work. Is there a Markov model we can use when we just want to go in blind?

As it turns out, this is the Markov model the quants are actually using.

The Hidden Markov Model

Things are about to get a lot more complicated, so strap in.

Okay, let's admit our faults and say we don't really know when a market is doing well or not. We originally defined high returns as "anything over 0%" and high volatility as "anything above the S&P average", but in practice this can be way off. What if a 1% return is too weak to start a market rally? What if the average standard deviation increases — or falls — over time? At the end of the day, perhaps what we really want to do is make as few assumptions as possible.

This is what the hidden Markov model (HMM) is for.

To be honest, explaining what an HMM is abstractly is just going to make it more confusing. I'm instead going to go straight into an example, which will give you a good idea of what's new — and what's the same — about an HMM.

  1. First, we get a lot of historical data (noticing a trend?). This can still be months, and it can still be S&P 500 returns and standard deviations.
  2. This time, we’re not going to assume what the buckets are — we’re just going to assume how many. Well, we know the return can go up or down, and we know the standard deviation can go up or down, so sticking with 4 seems like a good idea.
  3. We feed our data — and our number of buckets — into an HMM algorithm. At this point my brain's too fried and my article's too long to explain how this works, though if you want an extra credit assignment you can read more about the most popular HMM fitting algorithm, Baum–Welch, here. Not gonna lie, I don't know if this one exists in Microsoft Excel — you might need to go looking for some code. Fortunately for us, the concept of what this algorithm does is pretty simple:
    1. The algorithm says: "You've given us four buckets and a list of data, so we're going to look at all this data and see if we can find four distributions in it." In other words, it might notice that for certain months, the average return is a lot lower than the average return in other months — and vice versa for volatility. So it will start to split up the data based on that.
    2. This algorithm will give us more precise definitions for our buckets. For example, while our basic HRHV was just >0% return and >avg% volatility, the real HRHV might be something like >3.24% return and >1.22% volatility, as shown by the data.
  4. Fortunately, our transition probabilities are calculated pretty much the same way. Once the "hidden states" (aka buckets) are discovered, the algorithm just calculates transitions the old-fashioned way: tallying up the number of transitions and dividing them by the total out of each state. We can also choose to intervene and do a logreg on these new states in case we want to be extra sure.

It's worth knowing that the algorithm estimates the emission distributions (called "emissions" — the probability of observing a given return/volatility in each hidden state) and the transition probabilities iteratively, like any good algorithm. Our logreg optimizer did this too. But this is a less important detail for us if we just plan on running an existing implementation rather than writing our own.
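To make this concrete without dragging in a full Baum–Welch implementation, here's a heavily simplified stand-in in plain Python: a crude k-means-style clustering discovers the four buckets from synthetic (return, volatility) data, and then transitions between the inferred states are tallied the old-fashioned way. A real HMM fits emissions and transitions jointly, so treat this strictly as a sketch of the idea — the data, centers, and initialization are all made up:

```python
import random

random.seed(0)

# Synthetic (return%, volatility%) pairs drawn around four separated regimes.
centers = [(3.0, 2.0), (3.0, 0.5), (-3.0, 2.0), (-3.0, 0.5)]
months = [(cx + random.gauss(0, 0.2), cy + random.gauss(0, 0.1))
          for _ in range(50) for cx, cy in [random.choice(centers)]]

def nearest(point, cents):
    """Index of the closest centroid to this (return, vol) point."""
    return min(range(len(cents)),
               key=lambda i: (point[0] - cents[i][0])**2
                           + (point[1] - cents[i][1])**2)

# Crude k-means: we only assume HOW MANY buckets (4), then let the data
# refine where they sit. (Initialized at the true centers for simplicity.)
cents = centers[:]
for _ in range(10):
    labels = [nearest(m, cents) for m in months]
    for i in range(4):
        pts = [m for m, lab in zip(months, labels) if lab == i]
        if pts:
            cents[i] = (sum(p[0] for p in pts) / len(pts),
                        sum(p[1] for p in pts) / len(pts))

# Tally transitions over the inferred hidden states, then normalize per state.
counts = [[0] * 4 for _ in range(4)]
for a, b in zip(labels, labels[1:]):
    counts[a][b] += 1
trans = [[c / sum(row) if sum(row) else 0.0 for c in row] for row in counts]
print("discovered centers:", [tuple(round(v, 2) for v in c) for c in cents])
```

The data-driven centers here play the role of the "real HRHV might be >3.24% return and >1.22% volatility" discovery from the steps above: the bucket boundaries come out of the data instead of being assumed up front.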

So, there you have it — and yes, if you go on the quant finance arXiv right now, you'll find HMMs everywhere! This is sort of the final boss for Markov models as it stands right now. Had a lot of fun writing this, and cementing the (hopefully correct) knowledge that ChatGPT taught me. Look forward to more of these in the future!

