In the previous post on this topic, we discussed the challenges and statistics involved in selecting a pair of stocks for statistical arbitrage. We saw how, by using co-integration tests, we can say with a certain level of confidence that **the spread between the two stocks is a stationary signal**. In other words, this signal is mean reverting. The spread is defined as:

**Spread = log(a) – n log(b)**, where ‘a’ and ‘b’ are the prices of stocks A and B respectively. For each share of stock A bought, you sell *n* shares of stock B.

*n* is calculated by regressing the prices of stocks A and B.
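As a rough illustration of the definition above, the sketch below estimates *n* as the ordinary least squares slope of log(a) on log(b) and then builds the spread. Regressing log prices (rather than raw prices) and including an intercept are assumptions made here, the function names `hedge_ratio` and `spread` are hypothetical, and the prices are made up so that the true ratio is known to be 2.

```python
import math


def hedge_ratio(prices_a, prices_b):
    """OLS slope of log(a) on log(b), with an intercept (an assumption)."""
    x = [math.log(p) for p in prices_b]
    y = [math.log(p) for p in prices_a]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    return cov_xy / var_x


def spread(prices_a, prices_b, n):
    """Spread = log(a) - n * log(b), one value per time interval."""
    return [math.log(a) - n * math.log(b) for a, b in zip(prices_a, prices_b)]


# Made-up prices: a = 3 * b^2, so log(a) = log(3) + 2 * log(b) exactly.
prices_b = [50.0, 51.0, 49.5, 52.0, 53.0, 51.5]
prices_a = [3.0 * b ** 2 for b in prices_b]

n = hedge_ratio(prices_a, prices_b)
s = spread(prices_a, prices_b, n)
```

With real prices the relationship is noisy, so the spread fluctuates around its mean instead of staying constant as it does in this toy data.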

Having established that the spread defined above is mean reverting, we now need to identify the extreme points, or threshold levels, that trigger trading orders when the signal crosses them.

To identify these threshold levels, a statistical construct called the **z-score** is widely used in pair trading.

**What is z-score?**

Simply put, given normally distributed raw data points, the z-score is calculated so that the new distribution is a normal distribution with mean 0 and standard deviation 1. Having such a distribution ~ N(0, 1) is very useful for creating threshold levels. For example, in pair trading we have a distribution of the spread between the prices of stocks A and B. We can convert these raw spread values into z-scores as explained below. It is then easy to create threshold levels for the new distribution, such as 1.5 sigma, 2 sigma, 2.5 sigma, and so on.

#### How to calculate z-score?

z = (x – mean) / standard deviation, where x is a raw data point and z is the z-score.
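A small worked example may make the formula concrete. The sketch below applies it to a handful of made-up raw values; the helper name `z_scores` is hypothetical, and the use of the population standard deviation (`statistics.pstdev`) rather than the sample standard deviation is an assumption.

```python
import statistics


def z_scores(values):
    """Apply z = (x - mean) / standard deviation to every raw data point."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # population std; an assumption
    return [(x - mean) / std for x in values]


# Made-up raw data with mean 5 and population standard deviation 2.
raw = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
z = z_scores(raw)
```

Here the raw value 9 lies two standard deviations above the mean, so its z-score is 2, and the raw value 2 maps to -1.5.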

The mean and standard deviation can be rolling statistics computed over a period of ‘*t*’ days, minutes, or other time intervals.

**Moving average**

We divide the data into subsets of size ‘t’, where ‘t’ specifies the fixed time period over which the average is calculated. For example, to calculate the moving average of the prices of stock A with ‘t’ equal to 10 days, we start by calculating the average over the first 10 days in the dataset. We then calculate the moving average on the 10th day, the 11th day, the 12th day, and so on. Each new average drops the oldest data point and adds the newest one, which is why the average is called moving or rolling. The moving average and standard deviation are calculated with ‘t’ as 10 days in the table below.

The moving average for 1-08-2001, the 11th entry, would not take into account the first data point, that is, the price of stock A on 18-07-2001.
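The rolling calculation described above can be sketched in a few lines. This is a minimal illustration using made-up prices, not the data from the article; the function name `rolling_stats` is hypothetical, and the sample standard deviation (`statistics.stdev`) is an assumed choice.

```python
import statistics


def rolling_stats(values, t):
    """Rolling mean and standard deviation over the last t points.

    Entries are None until t data points are available.
    """
    means, stds = [], []
    for i in range(len(values)):
        if i + 1 < t:
            means.append(None)
            stds.append(None)
            continue
        window = values[i + 1 - t : i + 1]  # the t most recent points
        means.append(statistics.mean(window))
        stds.append(statistics.stdev(window))  # sample std; an assumption
    return means, stds


# Twelve made-up daily prices: 1.0, 2.0, ..., 12.0.
prices = [float(p) for p in range(1, 13)]
means, stds = rolling_stats(prices, t=10)
```

The first nine entries are `None`, the 10th entry averages days 1 to 10, and the 11th entry averages days 2 to 11, so the first data point no longer contributes.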

Using these concepts of moving averages and z-score we create the entry points for pair trading.

**Defining Entry points**

- Spread = log(a) – *n* log(b). Let us call it *s*.
- Calculate the z-score of ‘s’, using the rolling mean and standard deviation for a time period of ‘t’ intervals. Save this as z.
- Define the threshold as, for example, 1.5-sigma or 2-sigma. Tune this parameter according to the backtesting results, taking care not to overfit to the data.
- When the z-score crosses the upper threshold, go SHORT:
  - Sell stock A
  - Buy stock B
- When the z-score crosses the lower threshold, go LONG:
  - Buy stock A
  - Sell stock B
- Maintain the hedge ratio to calculate the stock quantity
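The entry rules above can be sketched as follows. This is a minimal illustration under stated assumptions: the z-score series is a made-up list (with `None` where the rolling statistics are not yet available), a "cross" is taken to mean that the previous value was inside the threshold and the current one is outside it, and the names `entry_signals` and `order_quantities` are hypothetical.

```python
def entry_signals(z, threshold=2.0):
    """Mark the intervals where z crosses the upper or lower threshold."""
    signals = [None] * len(z)
    for i in range(1, len(z)):
        prev, cur = z[i - 1], z[i]
        if prev is None or cur is None:
            continue  # rolling statistics not available yet
        if prev <= threshold < cur:
            signals[i] = "SHORT"  # sell stock A, buy stock B
        elif prev >= -threshold > cur:
            signals[i] = "LONG"  # buy stock A, sell stock B
    return signals


def order_quantities(signal, qty_a, n):
    """Signed share quantities (positive = buy), n shares of B per share of A."""
    side = 1 if signal == "LONG" else -1
    return {"A": side * qty_a, "B": -side * n * qty_a}


# Made-up z-scores of the spread, for illustration only.
z = [None, 0.2, 1.0, 1.8, 2.1, 0.5, -1.6, -1.7, 0.0]
signals = entry_signals(z, threshold=1.5)
orders = order_quantities("SHORT", qty_a=100, n=1.5)
```

With a 1.5-sigma threshold, this series produces a SHORT signal at the fourth interval and a LONG signal at the seventh. Signalling only on the crossing, rather than on every interval spent beyond the threshold, avoids re-entering the same trade repeatedly.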

**Defining Exit points**

**STOP LOSS**

**STOP LOSS** is defined for scenarios where the expected outcome does not happen. For example, if we chose entry signals at 2-sigma, we are expecting that the spread will revert to the mean from this threshold. However, it is possible that the spread continues to blow up. Say it reaches 2.5-sigma and you incur losses. To prevent further losses, you place a Stop Loss at, say, 3-sigma.

In addition to placing a pre-defined stop loss criterion such as 3-sigma or an extreme variation from the mean, you can monitor the co-integration of the pair. If the co-integration breaks down while the pair position is open, the strategy warrants cutting the positions, since the basic hypothesis is nullified.
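The two stop loss conditions can be combined into a single check, sketched below. The function name `stop_loss_hit` is hypothetical, and the co-integration status is passed in as a plain boolean: how it is re-tested while the position is open is outside the scope of this sketch.

```python
def stop_loss_hit(position, z, stop=3.0, cointegrated=True):
    """Return True if an open position should be cut.

    position: "LONG" or "SHORT" on the spread.
    z: current z-score of the spread.
    stop: stop loss level in sigmas (3-sigma in the example above).
    cointegrated: result of a co-integration re-check (assumed given).
    """
    if not cointegrated:
        return True  # the basic hypothesis no longer holds
    if position == "LONG":
        return z <= -stop  # spread kept falling after a lower-threshold entry
    if position == "SHORT":
        return z >= stop  # spread kept rising after an upper-threshold entry
    raise ValueError(f"unknown position: {position!r}")


still_open = stop_loss_hit("SHORT", 2.5)
stopped_out = stop_loss_hit("SHORT", 3.1)
```

A SHORT position entered at 2-sigma stays open at 2.5-sigma and is cut once the z-score reaches 3-sigma or the pair stops being co-integrated.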

**TAKE PROFIT**

**TAKE PROFIT** is defined for scenarios where you take profit before prices move in the other direction. For instance, say you are LONG on the spread, that is, you have bought stock A and sold stock B as per the definition of spread in this article. The expectation is that the spread will revert to its mean, that is, that the z-score will revert to 0. In a profitable situation, the z-score would be approaching zero or be very close to it. You can set the Take Profit condition as the point where the z-score crosses zero for the first time after reverting from the threshold levels.

There can be many ways of defining take profits depending on your risk appetite and backtesting results.
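One of these definitions, exiting the first time the z-score crosses zero after entry, can be sketched as below. The function name `take_profit_index` and the z-score values are made up for illustration.

```python
def take_profit_index(z, entry_index, position):
    """Index of the first interval after entry where z has crossed zero.

    Returns None if the z-score has not reverted to zero yet.
    """
    for i in range(entry_index + 1, len(z)):
        if position == "LONG" and z[i] >= 0:
            return i  # spread rose back to its rolling mean
        if position == "SHORT" and z[i] <= 0:
            return i  # spread fell back to its rolling mean
    return None


# Made-up z-scores following a LONG entry at index 0 (z below -2).
z_long = [-2.2, -1.4, -0.6, 0.1, 0.4]
exit_at = take_profit_index(z_long, entry_index=0, position="LONG")
```

A more conservative variant could exit at a small non-zero level such as 0.5-sigma before the mean is reached, which trades some profit per trade for a higher chance of closing the position.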

**Next Step**

To implement these concepts, you can start with any two stocks, say Microsoft and Apple, and work on a simple pair trading strategy. If you are interested in implementing this trading strategy in Python, you can get started here.