## Pair Trading – Statistical Arbitrage On Cash Stocks

This article is the final project submitted by the author as a part of his coursework in Executive Programme in Algorithmic Trading (EPAT™) at QuantInsti™. Do check our Projects page and have a look at what our students are building.

### About the Author

Jonathan has a strong knowledge of mathematical programming and has worked as a process optimization engineer for 3 years. He started to get involved in trading as a hobby, especially in algorithmic trading due to his passion for math but eventually, it became his full-time job. Jonathan enrolled for Executive Programme in Algorithmic Trading (EPAT™) in November 2016 and found his space in the world on quantitative analysis in finance. Currently, he is taking several courses online in subjects related to Artificial Intelligence and its applications in finance and is about to start an online portal in Financial Engineering to share his experience as a Quant Trader.

### Project Objective

The objective of this project is to model a statistical arbitrage trading strategy and quantitatively analyze the modeling results. Motivation relies on diversifying investment throughout five sectors, aka Technology, Financial, Services, Consumer Goods and Industrial Goods. Furthermore, some stocks, generally in the same sector, move in tandem because prices are affected by the same market events. However, the noise might make them temporarily deviate from the usual pattern and a trader can take advantage of this apparent deviation with the expectation that the stocks will eventually return to their long-term relationship.

Within each sector, stocks were selected based on high liquidity, small bid/ask spread and ability to short the stock. However, it is possible to consider other stocks for further analysis. Once the stock universe is defined, pairs can be formed. Every day as we want to enter a position, all the pairs in the universe are evaluated and the top pairs are selected per some criteria.

### Trading Strategy Idea

As the universe of pairs is already defined, correlation analysis should be performed for all possible pairs to filter out pairs which have suitable properties for executing statistical arbitrage. With this correlation test, we are looking for a measurement of the relationship between two stock prices. The logic of the strategy is: for any pair that is correlated (from the universe established), if the pair ratio diverges from a certain threshold, then we short the stock that is expensive and buy the cheap stock. Once they converge to the mean, we close the position and profit from the reversal.

The strategy triggers new orders whenever the pair ratio of the prices of the stocks on the universe of filtered pairs diverges from the mean. To ensure the convenience of trading at this point, the pair must be cointegrated. If the pair ratio is cointegrated, the ratio is mean reverting and the greater the dispersion from its mean, the higher the probability of a reversal, which makes the trade more attractive. This analysis allows in determining the stability of the long-term relationship. Spread time series is tested for stationarity by the Augmented Dickey-Fuller (ADF) test. In other words, if pair stocks are cointegrated, it suggests that the mean and variance of this correlation remains constant over time. There is, however, a major issue which makes this simple strategy difficult to implement in practice: long term relationship can break down, and the spread can move from one equilibrium to another.

A training period of minimum 1-year data is chosen for out-of-sample test and the capital allocated to each sector is decided based on a minimum variance portfolio approach. Each sector is traded independently. Yahoo finance has been used for testing this strategy.  To perform the backtesting for each pair, data for the period 1-Jan-2009 to 31-Dec-2014 has been used.

### Strategy Details

You can read the complete project work of the author including the Python codes for Pairs Trading by downloading the Ebook provided below.

Highlights from the project include:

• Pair Trading – Statistical Arbitrage on Cash Stocks
• Strategy
• Code Details and In-Sample Backtesting
• Analyzing Model Output
• Monte Carlo Analysis and much more…

### Next Step

If you want to learn various aspects of Algorithmic trading then check out the Executive Programme in Algorithmic Trading (EPAT™). The course covers training modules like Statistics & Econometrics, Financial Computing & Technology, and Algorithmic & Quantitative Trading. EPAT™ equips you with the required skill sets to build a promising career in algorithmic trading. Enroll now!

## Project Work: Segregating Human and HFT Algo Orders

### Introduction

Market price is determined by the participants. The trending and range bound markets are the moods of the participants. By analyzing the participants we can have a wide variety of conclusions. Consider the two extreme participants, HFT Algos and humans.

Now let us analyze human oriented orders. Humans cannot replace the orders hundreds of times per second, nor can they adjust the price in fractions so as to profit from them. On the other hand, the Algos can replace the orders even in a fraction of a second and they try to make profit from even the small edges.

The idea of the study is to segregate the human and machine orders based on the minimum time to replace the order before the execution of the order. If the minimum replace time is more than threshold time Ts, then the order is considered as a human order else HFT algo order. The effectiveness of the segregation is then determined with the sample data that has the human and HFT algo that has the human and HFT algo order details.

## Strategy using Trend-following Indicators: MACD, ST and ADX

This article is the final project submitted by the author as a part of his coursework in Executive Programme in Algorithmic Trading (EPAT™) at QuantInsti™. Do check our Projects page and have a look at what our students are building.

### About the Author

Gopal, a management professional, has over 2 decades of experience in IT industry with a strong Global Delivery background, with passion in Quantitative Finance and gadgets.  He is highly process driven with a focus on achieving quality using automation. He heads delivery at PreludeSys India Ltd. Gopal holds an MBA from Symbiosis Centre For Distance Learning. He has successfully completed the course work for the Executive Programme in Algorithmic Trading (EPAT™) in November 2016.

## Implementing Pairs Trading Using Kalman Filter

This article is the final project submitted by the author as a part of his coursework in Executive Programme in Algorithmic Trading (EPAT™) at QuantInsti. Do check our Projects page and have a look at what our students are building.

### Introduction

Some stocks move in tandem because the same market events affect their prices. However, idiosyncratic noise might make them temporarily deviate from the usual pattern and a trader could take advantage of this apparent deviation with the expectation that the stocks will eventually return to their long term relationship. Two stocks with such a relationship form a “pair”. We have talked about the statistics behind pairs trading in a previous article.

This article describes a trading strategy based on such stock pairs. The rest of the article is organized as follows. We will be talking about the basics of trading an individual pair, the overall strategy that chooses which pairs to trade and present some preliminary results. In the end, we will describe possible strategies for improving the results.

Pair trading

Let us consider two stocks, x and y, such that

y = \alpha + \beta x + e

\alpha and \beta are constants and e is white noise. The parameters {\alpha, \beta} could be obtained from a linear regression of prices of the two stocks with the resulting spread

e_{t} = y_{t} – (\alpha + \beta x_{t})

Let the standard deviation of this spread be \sigma_{t}. The z-score of this spread is

z_{t} = e_{t}/\sigma_{t}

### Trading Strategy

The trading strategy is that when the z-score is above a threshold, say 2, the spread can be shorted, i.e. sell 1 unit of y and buy \beta units of x. we expect that the relationship between x and y will hold in the future and eventually the z-score will come down to zero and even go negative and then the position could be closed. By selling the spread when it is high and closing out the position when it is low, the strategy hopes to be statistically profitable. Conversely, if the z-score is below a lower threshold say -2, the strategy will go long the spread, i.e. buy 1 unit of y and sell \beta units of x and when the z score rises to zero or above the position can be closed realizing a profit.

There are a couple of issues which make this simple strategy difficult to implement in practice:

1. The constants \alpha and \beta are not constants in practice and vary over time. They are not market observables and hence have to be estimated with some estimates being more profitable than others.
2. The long term relationship can break down, the spread can move from one equilibrium to another such that the changing {\alpha,\beta} gives an “open short” signal and the spread keeps rising to a new equilibrium such that when the “close long” signal come the spread is above the entry value resulting in a loss.

Both of these facts are unavoidable and the strategy has to account for them.

### Determining Parameters

The parameters {\alpha, \beta} can be estimated from the intercept and slope of a linear regression of the prices of y against the prices of x. Note that linear regression is not reversible, i.e. the parameters are not the inverse of regressing x against y. So the pairs (x,y) is not the same as (y,x). While most authors use ordinary least squares regression, some use total least squares since they assume that the prices have some intraday noise as well. However, the main issue with this approach is that we have to pick an arbitrary lookback window.

In this paper, we have used Kalman filter which is related to an exponential moving average. This is an adaptive filter which updates itself iteratively and produces \alpha, \beta, e and \sigma simultaneously. We use the python package pykalman which has the EM method that calibrates the covariance matrices over the training period.

Another question that comes up is whether to regress prices or returns. The latter strategy requires holding equal dollar amount in both long and short positions, i.e. the portfolio would have to be rebalanced every day increasing transaction cost, slippage, and bid/ask spread. Hence we have chosen to use prices which is justified in the next subsection.

### Stability of the Long Term Relationship

The stability of the long term relationship is determined by determining if the pairs are co-integrated. Note that even if the pairs are not co-integrated outright, they might be for the proper choice of the leverage ratio. Once the parameters have been estimated as above, the spread time series e_{t} is tested for stationarity by the augmented Dickey Fuller (ADF) test. In python, we obtain this from the adfuller function in the statsmodels module. The result gives the t-statistics for different confidence levels. We found that not many pairs were being chosen at the 1% confidence level, so we chose 10% as our threshold.

One drawback is that to perform the ADF test we have to choose a lookback period which reintroduces the parameter we avoided using the Kalman filter.

### Choosing Sectors and Stocks

The trading strategy deploys an initial amount of capital. To diversify the investment five sectors will be chosen: financials, biotechnology, automotive etc. A training period will be chosen and the capital allocated to each sector is decided based on a minimum variance portfolio approach. Apart from the initial investment, each sector is traded independently and hence the discussion below is limited to a single sector, namely financials.

Within the financial sector, we choose about n = 47 names based on large market capitalization. We are looking for stocks with high liquidity, small bid/ask spread, ability to short the stocks etc.  Once the stock universe is defined we can form n (n-1) pairs, since as mentioned above (x,y) is not the same as (y,x). In our financial portfolio, we would like to maintain up to five pairs at any given time. On any day that we want to enter into a position (for example the starting date) we run a screen on all the n (n-1) pairs and select the top pair(s) according to some criteria some of which are discussed next.

### Choosing Pairs

For each pair, the signal is obtained from the Kalman filter and we check if |e| > nz \sigma, where nz is the z-score threshold to be optimized. This ensures that this pair has an entry point. We perform this test first since this is inexpensive. If the pair has an entry point, then we choose a lookback period and perform the ADF test.

The main goal of this procedure is not only to determine the list of pairs which meets the standards but rank them according to some metrics which relates to the expected profitability of the pairs.

Once the ranking is done we enter into the positions corresponding to the top pairs until we have a total of five pairs in our portfolio.

### Results

In the following, we calibrated the Kalman filter over Cal11 and then used the calibrated parameters to trade in Cal12. In the following, we kept only one stock-pair in the portfolio.

In the tests shown we kept the maximum allowed drawdown per trade to 9%, but allowed a maximum loss of 6% in one strategy and only 1% in the other. As we see from above the performance improves with the tightening of the maximum allowed loss per trade. The Sharpe ratio (assuming zero index) was 0.64 and 0.81 respectively while the total P&L was 9.14% and 14%.

The thresholds were chosen based on the simulation in the training period.

### Future Work

1. Develop better screening criterion to identify the pairs with the best potentials. I already have several ideas and this will be ongoing research.
2. Optimize the lookback window and the buy/sell Z-score thresholds.
3. Gather more detailed statistics in the training period. At present, I am gathering statistics of only the top 5 (based on my selection criteria). However, in future, I should record statistics of all pairs that pass. This will indicate which trades are most profitable.
4. In the training period, I am measuring profitability by the total P&L of the trade, from entry till the exit signal is reached. However, I should also record max profit so that I could determine an earlier exit threshold.
5. Run the simulation for several years, i.e. calibrate one year and then test the next year. This will generate several year’s worths of out-of-sample tests. Another window to optimize is the length of the training period and how frequently the Kalman filter has to be recalibrated.
6. Expand the methodology to other sectors beyond financials.
7. Explore other filters instead of just Kalman filter.

### Next Steps

If you are a coder or a tech professional looking to start your own automated trading desk. Learn automated trading from live Interactive lectures by daily-practitioners. Executive Programme in Algorithmic Trading (EPAT™) covers training modules like Statistics & Econometrics, Financial Computing & Technology, and Algorithmic & Quantitative Trading. Enroll now!

Note:

The work presented in the article has been developed by the author, Mr. Dyutiman Das. The underlying codes which form the basis for this article are not being shared with the readers. For readers who are interested in further readings on implementing pairs trading using Kalman Filter, please find the article below.

Link: Statistical Arbitrage Using the Kalman Filter by Jonathan Kinlay

## Pairs Trading on ETF – EPAT Project Work

This article is the final project submitted by the author as part of his coursework in Executive Programme in Algorithmic Trading (EPAT™) at QuantInsti. You can check out our Projects page and have a look at what our students are building after reading this article.

### About the Author

Edmund Ho did his Bachelors in commerce from University of British Columbia, He completed his Masters in Investment Management from Hong Kong University of Science and Technology. Edmund was enrolled in the 27th Batch of EPAT™, and this report is part of his final project work.

### Project Summary

ETFs are very popular for pairs trading simply because it eliminates the firm-specific factors.   On top of that, most of the ETFs are short-able so we don’t have to worry about the short constraint.   In this project, we are trying to build a portfolio using 3 ETF pairs in oil (USO vs XLE), technology (XLK vs IYW), and financial sectors (XLF vs PSCF).

Over the long run, the overall performance of the miners is highly correlated with the commodities.  In short term, they may have divergences due to individual company’s performance or overall equity market performance, and hence the short term arbitrage opportunities may exist.  In technology sector, we attempt to seek mispricing on both large cap technology ETFs. Last, we attempt to see if arbitrage opportunity exists between the large and mid-cap financial ETFs.

### Pair 1 – Oil Sector USO vs XLE

#### Cointegration Test

The above charts were generated in R Studio.   The in sample data generated between Jan 1st, 2011 and Dec 31, 2014.

First, we plot the price for the pairs and it gives us an impression that both price series are quite similar.  Then we perform the regression analysis for USO vs XLE (return USO = Beta * Return XLE + Residual) and find the beta or hedge ratio to be 0.7493.  Next, we apply the hedge ratio to generate the spread returns.  We can see the spread returns deviate closely around 0, which shows the characteristic of cointegrating pattern.  At last, we apply the Augmented Dickey-Fuller test with a confidence level at 0.2 and check if the pairs pass the ADF test.  The results are as follow:

###### Augmented Dickey-Fuller Test

data:(spread) Dickey-Fuller = -3.0375, Lag order = 0, p-value = 0.1391

alternative hypothesis: stationary

[1] “The spread is likely Cointegrated with a pvalue of 0.139136738842547”

With the p-value of 0.1391, the pairs satisfied the cointegration test, and we will go ahead and back-test the pairs in next section.

#### Strategy Back-Testing

The above back-testing result were generated in R Studio.  The back-testing period was using the in-sample data similar to the cointegration test.  Our trading strategy is relatively simple as follow:

• If the spread is greater than +/- 1.5 standard deviations of its rolling lookback period of 120 days’ standard deviation, then we go short / long accordingly.
• At all time, only 1 open position
• Close the long/short position when the spread reverts to its mean/moving average.

The above back-testing result were generated in R-Studio using the PerformanceAnalystics package.  During the in sample back-testing period, the strategy achieved a cumulated return of 121.03% where the SPY (S&P500) had a cumulative return of 61.78%.  This translated into an annualized return of 22% and 12.82%, respectively.  In terms of risk analysis, the strategy had much lower annualized standard deviation of 11.63% vs. 15.35% in SPY.  The worst drawdown percentage for the strategy was 6.39% vs. 19.42% in SPY.  The annualized Sharpe ratio was superior in our strategy at 1.89 vs. 0.835 in SPY.  Please note that all of the above calculations did NOT factor into transaction cost.

#### Out of sample test

For the out of sample period between Jan 1st 2015 and Dec 31, 2015, the pairs did not pass the ADF test suggested by a high p-Value at 0.3845.  The phenomenon could be explained by the sharp decline in cruel oil price where the equity market was persisting in an uptrend.  If we look at the spread returns, they first seem to be cointegrating around 0 but with a much larger deviation suggested by the chart below.

The actual spread obviously did not suggest a cointegrating pattern as indicated by the high p-Value.  Next, we will go ahead and back-test the same strategy using the out of sample data despite the pair fails the cointegration test.   The hedge ratio was found to be 1.1841.  The key back-testing results were generated in R-Studio as follow:

USO and XLE Stat Arb          SPYAnnualized Return           0.09862893 -0.007623957

USO and XLE Stat Arb          SPYCumulative Return           0.09821892 -0.007593818

USO and XLE Stat Arb         SPYAnnualized Sharpe Ratio (Rf=0%)            0.5632756 -0.04884633

USO and XLE Stat Arb       SPYAnnualized Standard Deviation            0.1750989 0.1560804

USO and XLE Stat Arb       SPYWorst Drawdown            0.1706643 0.1228571

At a first glance, the strategy seems outperform the SPY in all aspects, but due to the lookback period which was set same as in-sample back-testing data (120 days for consistency), this strategy only had 1 trade during the out of sample period, which may not reflect the situation going forward.  However, this shows that it is not necessary to have a perfect cointegrating pairs in order to extract profit opportunities.  In reality, only a few perfect pairs would pass the test.

### Pair 2 – Large Cap Technology XLK vs. IYW

#### Cointegration Test

It should not be a surprise that XLK and IYW has a strong linear relationship as demonstrated in the regression analysis with a hedge ratio at 0.903.  The two large cap technology ETFs are very similar in nature except for its size, volume, expense ratio, etc.  However, if we take a closer look at the actual return spreads, it doesn’t seem to satisfy the cointegration test.  If we run the ADF test, the result shows they are not likely to be cointegrated with a p-Value at 0.5043.

Augmented Dickey-Fuller Test

data: (spread) Dickey-Fuller = -2.1748, Lag order = 0, p-value = 0.5043

alternative hypothesis: stationary

[1] “The spread is likely NOT Cointegrated with a pvalue of 0.504319921216107”

The purpose to run the strategy in this pair is to see if there is any mispricing (short term deviation) in the pair in order to profit from it.  In the USO and XLE example, we observed that profit opportunity may still exist despite the pair failing the cointegration test.  Here, we will go ahead to test the pair and see if any profit opportunity exists.

#### Back-testing Result

 XLK and IYW Stat Arb       SPYAnnualized Return         -0.003581305 0.1282006 XLK and IYW Stat Arb       SPYCumulative Return          -0.01420635 0.6177882 XLK and IYW Stat Arb       SPYAnnualized Sharpe Ratio (Rf=0%)           -0.2839318 0.8347157 XLK and IYW Stat Arb      SPYAnnualized Standard Deviation           0.01261326 0.153586 XLK and IYW Stat Arb       SPYWorst Drawdown           0.02235892 0.1942388

The back-testing results illustrated that the strategy performed very poorly during the back-testing period between Jan 1st 2011 and Dec 31, 2014.  This demonstrates that the two ETFs are highly correlated to each other and it’s very hard to extract profit opportunity from them.  In order for a statistical arbitrage strategy to work, we need a pair with some volatility in their spreads but they should eventually show a mean-reverting pattern.  In next section, we will perform the same analysis on the Financial ETFs.

### Pair 3 – Financial Sectors XLF vs. PSCF

Cointegration Test

In this pair, we attempt to seek a trading opportunity between the large financial cap ETF XLF and the small financial cap ETF PSCF.  From the price series they both show very similar pattern.  In terms of regression analysis, they obviously show a strong correlation with a hedge ratio of 0.9682.  The spread return also illustrates some cointegrating pattern with the spread deviating around 0.  The ADF test with test value set at 80% confidence shows the pair is likely to be cointegrated with a p-value at 0.1026.

Augmented Dickey-Fuller Test

data:(spread)

Dickey-Fuller = -3.1238, Lag order = 0, p-value = 0.1026

alternative hypothesis: stationary

[1] “The spread is likely Cointegrated with a pvalue of 0.102608136882834”

#### Back-testing Result

XLF and PSCF Stat Arb       SPYAnnualized Return            0.01212355 0.1282006

XLF and PSCF Stat Arb       SPYCumulative Return            0.04923268 0.6177882

XLF and PSCF Stat Arb       SPYAnnualized Sharpe Ratio (Rf=0%)             0.1942203 0.8347157

XLF and PSCF Stat Arb      SPYAnnualized Standard Deviation            0.06242163 0.153586

XLF and PSCF Stat Arb       SPYWorst Drawdown            0.07651392 0.1942388

Although the pair satisfied the cointegration test with a low p-value, the back-testing results demonstrated a below average performance when we compare to the index return.

### Conclusion

In this project, we chose 3 different pairs of ETFs to back test our simple mean-reverting strategy.  The back test results show superior performance on USO/XLE, but not on the other two pairs.  We can conclude that in order for the pair trading strategy to work, we do not need a pair that shows strong linear relationship, but a long term mean reverting pattern is essential to obtain a decent result.  In the pair XLK/IYW, we attempt to seek for mispricing between the two issuers, however, in an efficient ETF market in US, mispricing on such big ETFs is very rare, hence this strategy performs very poorly on this pair.  On the other hand, the correlation and cointegration test on the pair XLF/PSCF illustrate the pair is an ideal candidate to trade the statistical arbitrage strategy, however, the back-testing results show the other way around.  Any statistical arbitrage strategy we are essentially playing with the volatility, but if there is not enough volatility around the spreads to begin with, like the pair in XLK/IYW, the profit opportunity is trivial.   In the pair USO/XLE, the volatility around the spreads is ideal and the cointegration test shows the pair has a mean-reverting pattern, therefore it is not a surprise this pair prevails in the back-testing results.

### Next Step

Read about other strategies in this article on Algorithmic Trading Strategy Paradigms. If you also want to learn more about Algorithmic Trading, then click here.

• Project_Download.rar
• Project_Cointegration_Test.R
• Project_Backtest_Test.R

## Implementing Pairs Trading/Statistical Arbitrage Strategy In FX Markets : EPAT Project Work

This article is the final project submitted by the author as a part of his coursework in Executive Programme in Algorithmic Trading (EPAT™) at QuantInsti. Do check our Projects page and have a look at what our students are building.

### About the Author

Harish Maranani did his Bachelors in Technology in Electronics and Communications Engineering from Acharya Nagarjuna University, MBA Finance from Staffordshire University (UK), Certificate in Quantitative Finance (CQF), and Master of Science in Mathematical and Computational Finance from New Jersey Institute of Technology, Newark, USA. Harish was enrolled in the 27th Batch of EPAT™, and this report is part of his final project work.

Aim: To implement pairs trading/statistical arbitrage strategy in currencies.

Pairs Chosen: EURINR, USDINR, GBPINR, AUDINR, CADINR, JPYINR

Frequency: Daily

Time Period: 2011/4/21 to 2013/5/22

Implemented using: Python.

Pair Selection Criteria for FX Markets:

• The time series data for the above-chosen currency pairs is imported from quandl.
• Co-integration Test is carried out on all possible pair combinations viz. EURINR-USDINR, EURINR-GBPINR etc.
• Selecting Co-integrated pairs whose t-static value is less than 5% critical value of -2.8.
• Slicing the pairs which meet the co-integration condition for further analysis.
• To further test for confirmation of co-integration, CADF test is carried out on the sliced pairs from the pool.
• Z-score is calculated for each selected pair combination and the strategy is applied.
• Profit/loss, equity curve, maximum drawdown, are calculated/tabulated/plotted.
• Consider two currency pairs EUR/INR and USD/INR. Here the base currencies are EUR and USD respectively and the counter currency is INR.

Preliminary Test:

• In order to find the pairs of currencies that are co-integrated, a preliminary test through coint(x,y) from statsmodels.tsa.stattools is carried out and their respective pvalues, tstatic are plotted below.
• The t-static values that are displayed below are the ones that passed the co-integration test. i.e the t-static values smaller than the 5% critical value of -2.86.

Below is the list of pairs whose T-static values are less than the 5% critical value of -2.86:

• [‘EURINR/USDINR: -3.89372142826’,
• ‘EURINR/GBPINR: -3.04457063111’,
• ‘EURINR/CADINR: -3.16044058632’,
• ‘USDINR/AUDINR: -3.14784526027’,
• ‘USDINR/CADINR: -3.19434173492’,
• ‘GBPINR/CADINR: -3.86588509209’,
• ‘AUDINR/CADINR: -3.10827352646’]

Below is the plot of p-values of the co-integrated pairs:

Before rejecting null hypothesis to confirm the prices are mean-reverting, we shall conduct Co-Integrated Augmented Dickey-Fuller (CADF) test to confirm the same for the above sliced pairs out from the whole set of currencies. Below are the Results and plots.

We shall consider the 4 co-integrated pairs based on T-Static Values for CADF testing.

The following are the 4 Co-integrated pairs:

EURINR/USDINR:  -3.89372142826

GBPINR/CADINR:  -3.86588509209

USDINR/CADINR:  -3.19434173492

EURINR/CADINR:  -3.16044058632

#### EURINR/USDINR

TIME SERIES PLOTS OF EURINR/USDINR

From the above graph, it is visibly evident that the prices are co-integrated, however, to statistically confirm the same, the below set of tests/procedures are implemented.

Creating a scatter plot of the prices, to see the relationship is broadly linear.

Given the above residual plot, it is relatively stationary.

##### Co-integrated Augmented Dickey-Fuller Test Results

Co-integrated Augmented Dickey-Fuller (CADF) test determines the optimal hedge ratio by performing a linear regression against the two-time series and then tests for stationarity under the linear combination.

Implementing in python gives the following result:

(-3.0420602182962395,

0.03114885626164075,

1L,

652L,

{'1%': -3.440419374623044,

'10%': -2.5691361169972526,

'5%': -2.8659830798370352},

852.99818965061797)

Given the above results, the t-static to be -3.04 less than 5% critical value of -2.8, we can reject the null hypothesis and can confirm that the prices are mean-reverting.

#### GBPINR/CADINR

Below are the time series, scatter and residual plots of GBPINR/CADINR

CADF Test results:

(-3.3637522231183872,

0.012258395060108089,

2L,

651L,

{'1%': -3.440434903803665,

'10%': -2.569139761751388,

'5%': -2.865989920612213},

-179.04749802146216)

Given the above results, the t-static to be -3.36 smaller than 5% critical value of -2.8, we can reject the null hypothesis and can confirm that the prices are mean-reverting.

#### USDINR/CADINR

CADF Test results

Given the above results, the t-static to be -2.93 smaller than 5% critical value of -2.8, we can reject the null hypothesis and can confirm that the prices are mean-reverting.

(-2.9344605252608607,

0.041484961304201866,

1L,

652L,

{'1%': -3.440419374623044,

'10%': -2.5691361169972526,

'5%': -2.8659830798370352},

-99.577663481220952)

#### USDINR/AUDINR

Below are the results from CADF test:

(-3.2595055880757768,
0.016788501512565262,
4L,
649L,
{'1%': -3.440466106307706,
'10%': -2.5691470850496558,
'5%': -2.8660036655537744},
381.77145926378489)

With the t-static value of -3.25 smaller than the 5% critical value of -2.86, we can reject the null hypothesis and can confirm that the pair is co-integrating.

Now that we have found the co-integrated pairs in the form of following pairs with t-static values:

• EURINR/USDINR: -3.04
• GBPINR/CADINR: -3.363
• USDINR/CADINR: -2.934
• USDINR/AUDINR: -3.259

Next Step would be to calculate the Z-score of the price ratio for 30day moving average and 30day standard deviation:

• Calculating price ratios and creating a new column ratio in the data frames (df, df1, df2, df4) of the above currency pairs respectively.

Below is the snapshot of the data frames:

df:

Df1:

Calculation of Z-score of the price ratio for the 30-day window of moving average and standard deviation:

• Below are the plots of z-scores for the above co-integrated pairs with their respective price ratios:

From the above Z-Score plots of the selected pairs, Z-score is exhibiting mean reverting behavior within 2 standard deviations.

Building a Trading Strategy:

• When z-score touches +2 short the pair and close the position when it reverts back to +1
• When z-score touches -2 long the pair and close the position when it reverts back to -1.
• Only one position is held at a single instance of time.

Equity Curve:

Plotting the equity curve with the starting capital of 100 INR equally divided among 4 pairs.

With 100 INR initial Capital, equity ended at 114.05.

Cumulative profit to be 14% without any leverage. With 10 times leverage (ideal for FX trading), the profits can be seen at 140%. Below are the important performance metrics of the strategy.

 Profit percentage Without Leverage 14.0514144897 % Profit percentage with 10 times leverage 140.514144897 % Number of Positive Trades 59 Number of Negative Trades 23 Hit Ratio 71.9512195122 % Average Positive Trade 0.46886657456220338 Average Negative Trade -0.59181362649660851 Average Profit/Average Loss 0.792253766338 Maximum Drawdown -5.1832506579 %

The above graph shows the maximum drawdown points marked with red dots and the value is added in the above table.

### Instructions for Implementation

• Please run the IPython notebook named harish_stat_arb.ipynb for the confirmation of results and plots.
• Another option is to run the python script harish_quantinsti_final_project_code.py on any python IDE to confirm the results and graph.
• Use the below code for exporting the final dataframe to an excel file.
writer = pd.ExcelWriter('pairs_final.xlsx',engine = 'xlsxwriter')

pairs.to_excel(writer,'Sheet5')

writer.save()

### Conclusion

Though the strategy has generated 140% returns over the backtest period of 2 years, the following factors should be considered in order to evaluate a more accurate performance of the strategy.

• The model has ignored the slippage and commissions.
• The model ignored the bid-ask spread while placing buy or sell orders.

### Bibliography

• Statistical Arbitrage lecture Quantinsti, Nitesh Khandelwal.
• Pairs Trading, Ganapathy Vidyamurthy, Wiley Finance.
• Successful Algorithmic trading, Michael Halls-Moore.

### Next Step

Read about other strategies in this article on Algorithmic Trading Strategy Paradigms. If you also want to learn more about Algorithmic Trading, then click here.

## Shorting at High: Algo Trading Strategy in R

Milind began his career in Gridstone Research, building earnings models and writing earnings notes for NYSE listed companies, covering Technology and REITs sectors. Milind has also worked at CRISIL and Deutsche Bank, where he was involved in modeling of Structured Finance deals covering Asset Backed Securities (ABS), and Collateralized Debt Obligations (CDOs) for the US and EMEA region.

Milind holds a MBA in Finance from the University of Mumbai, and a Bachelor’s degree in Physics from St. Xavier’s College, Mumbai.

### Ideation

The Executive Programme in Algorithmic Trading (EPAT) exposed me to all the requisite subjects needed to learn algorithmic trading. As part of the EPAT project work I tried coding many strategies. Since I am a novice to algorithmic trading I wanted to code the simplest, and the most basic strategies. Although simple and basic, one should not underestimate the power of such strategies, as they can generate good returns.

“Shorting at High” was one of the strategies that I formulated for my project work. This post explains the strategy in brief and the coding part. I welcome the readers to give suggestions, improvise or to use the strategy.

### Strategy in brief

The strategy is to short the best stocks which cross the set percentage threshold on the upside (say 8%-9%) during intraday trading. The expectation for the shorted stocks is to fall by an amount as predicted in the metrics sheet which is generated upon executing the code. (more…)

## Pair Trading Strategy and Backtesting using Quantstrat

### A Recent Webinar Presentation by Marco Nicolas Dibo

This insightful webinar on pairs trading and sourcing data covers the basics of pair trading strategy followed by two examples. In the first example, Marco covers the pairs trading strategy for different stocks traded on the same exchange, and in the second example, Marco has illustrated the pairs strategy for different commodity futures traded on different exchanges. Marco also details the different data sources including Quandl which can be used for creating trading strategies.

This article is the final project submitted by the author as a part of his coursework in Executive Programme in Algorithmic Trading (EPAT) at QuantInsti. Do check our Projects page and have a look at what our students are building.

### Author

Marco has spent his career as a trader and portfolio manager, with a particular focus in equity and derivatives markets. He specializes in quantitative finance and algorithmic trading and currently serves as head of the Quantitative Trading Desk and Vice-president of Argentina Valores S.A. Marco is also Co-Founder and CEO of Quanticko Trading SA, a firm devoted to the development of high frequency trading strategies and trading software. Marco holds a BS in Economics and an MSc in Finance from the University of San Andrés.

### Introduction

One of my favorite classes during EPAT was the one on statistical arbitrage, so the pair trading strategy seemed a nice idea for me. My strategy triggers new orders when the pair ratio of the prices of the stocks diverge from the mean. But in order to work, we first have to test for the pair to be cointegrated. If the pair ratio is cointegrated, the ratio is mean-reverting and the greater the dispersion from its mean, the higher the probability of a reversal, which makes the trade more attractive. I chose the following pair of stocks:

• Bank of America (BAC)
• Citigroup (C)

The idea is the following: If we find two stocks that are correlated (they correspond to the same sector), and the pair ratio diverges from a certain threshold, we short the stock that is expensive and buy the one that is cheap. Once they converge to the mean, we close the positions and profit from the reversal.

## Authors

Maxime Fages
Maxime’s career spanned across the strategic aspects of value and risk, with a particular focus on trading behaviors and market microstructure over the past few years. He embraced a quantitative angle in M&A, fund management or currently corporate strategy and has always been an avid open-source software user. Maxime holds an MBA from Insead and an MSc, Engineering from Ecole Nationale Superieure D’Arts et Metiers; he is currently Strategy Director APAC at the CME Group.

Derek began his career on the floor of the CBOT then moved upstairs to focus on proprietary trading and strategy development. He manages global multi-strategy portfolios, focusing in the futures and options space. He is currently the Deputy Director of Systematic Trading at Foretrade Investment Co Ltd.

### Ideation

By the end of the Executive Programme in Algorithmic Trading (EPAT) lectures, Derek and I were spending a significant amount of time exchanging views over a variety of media. We discussed ideas for a project, and the same themes were getting us excited. First, we were interested in dealing with Futures rather than cash instruments. Second, we both had a solid experience using R for quantitative research and were interested in getting our hands dirty on the execution side of things, especially on the implementation of event-driven strategies in Python (which neither of us knew before the EPAT program). Third, we had spent hours discussing and assessing the performance of Machine Learning for trading applications and were pretty eager to try our ideas out. Finally, we were very interested in practical architecture design, particularly in what was the best way to manage the variable resource needs of any Machine Learning framework (training vs. evaluating).

The scope of our project, therefore, came about naturally: developing a fully cloud-based automated trading system that would leverage on simple, fast mean-reverting or trend-following execution algorithms and call on Machine learning technology to switch between these.

## Dispersion Strategy Based on Correlation of Stocks and Volatility of Index

### Introduction

This article examines profits from trading using the dispersion strategy based on the correlation of stocks, volatility of Index. Dispersion helps the trader take a view on volatility only (assuming that correlation mean reverts) and, therefore, it is made sure that delta risk is hedged by buying or selling futures. In this strategy, both long and short positions are built on volatility and with more strategies available nowadays it is better to use strategies which take advantage of relative values rather than absolutes. This limits the amount of money at risk in one direction. (more…)