Implementing Pairs Trading/Statistical Arbitrage Strategy In FX Markets : EPAT Project Work


This article is the final project submitted by the author as a part of his coursework in Executive Programme in Algorithmic Trading (EPAT™) at QuantInsti. Do check our Projects page and have a look at what our students are building.

About the Author


Harish Maranani did his Bachelors in Technology in Electronics and Communications Engineering from Acharya Nagarjuna University, MBA Finance from Staffordshire University (UK), Certificate in Quantitative Finance (CQF), and Master of Science in Mathematical and Computational Finance from New Jersey Institute of Technology, Newark, USA. Harish was enrolled in the 27th Batch of EPAT™, and this report is part of his final project work.

Aim: To implement pairs trading/statistical arbitrage strategy in currencies.


Frequency: Daily

Time Period: 2011/4/21 to 2013/5/22

Implemented using: Python.

Pair Selection Criteria for FX Markets:

  • The time series data for the chosen currency pairs is imported from Quandl.
  • A co-integration test is carried out on all possible pair combinations, e.g. EURINR-USDINR, EURINR-GBPINR, etc.
  • Co-integrated pairs are selected: those whose t-statistic is below the 5% critical value of -2.86.
  • The pairs that meet the co-integration condition are sliced out for further analysis.
  • To further confirm co-integration, the CADF test is carried out on the sliced pairs.
  • The z-score is calculated for each selected pair combination and the strategy is applied.
  • Profit/loss, the equity curve, and maximum drawdown are calculated, tabulated, and plotted.
  • Consider two currency pairs EUR/INR and USD/INR. Here the base currencies are EUR and USD respectively and the counter currency is INR.

Preliminary Test:

  • To find the currency pairs that are co-integrated, a preliminary test is run with coint(x, y) from statsmodels.tsa.stattools, and the resulting p-values and t-statistics are plotted below.
  • The t-statistics displayed below are the ones that passed the co-integration test, i.e. those smaller than the 5% critical value of -2.86.


Below is the list of pairs whose t-statistics are less than the 5% critical value of -2.86:

  • EURINR/USDINR: -3.89372142826
  • EURINR/GBPINR: -3.04457063111
  • EURINR/CADINR: -3.16044058632
  • USDINR/AUDINR: -3.14784526027
  • USDINR/CADINR: -3.19434173492
  • GBPINR/CADINR: -3.86588509209
  • AUDINR/CADINR: -3.10827352646

Below is the plot of p-values of the co-integrated pairs:

Before rejecting the null hypothesis and concluding that the prices are mean-reverting, we conduct the Co-integrated Augmented Dickey-Fuller (CADF) test on the pairs sliced out from the full set of currencies. The results and plots are below.

We shall consider the 4 co-integrated pairs with the most negative t-statistics for CADF testing.

The following are the 4 Co-integrated pairs:

EURINR/USDINR:  -3.89372142826

GBPINR/CADINR:  -3.86588509209

USDINR/CADINR:  -3.19434173492

EURINR/CADINR:  -3.16044058632




From the above graph, the prices appear to be co-integrated. To confirm this statistically, the tests and procedures below are implemented.

A scatter plot of the prices is created to check whether the relationship is broadly linear.

The residual plot above looks relatively stationary.

Co-integrated Augmented Dickey-Fuller Test Results

The Co-integrated Augmented Dickey-Fuller (CADF) test determines the optimal hedge ratio by performing a linear regression between the two time series, and then tests the resulting linear combination for stationarity.
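The two stages described above can be sketched in plain Python. This is a minimal illustration, not the code used for the reported results: the function names are hypothetical, and the stationarity step uses a Dickey-Fuller regression with a constant and no lagged differences, which is a simplification of the augmented test. In practice a library routine such as statsmodels.tsa.stattools.adfuller would typically be applied to the spread instead.

```python
def ols(x, y):
    """Least-squares fit of y = intercept + slope * x; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def df_tstat(series):
    """t-statistic of b in  s[t] - s[t-1] = a + b * s[t-1] + e[t]  (assumed lag-free form)."""
    lagged = series[:-1]
    diffs = [cur - prev for prev, cur in zip(series, series[1:])]
    a, b = ols(lagged, diffs)
    n = len(diffs)
    rss = sum((d - a - b * l) ** 2 for l, d in zip(lagged, diffs))
    mean_l = sum(lagged) / n
    sxx = sum((l - mean_l) ** 2 for l in lagged)
    std_err = ((rss / (n - 2)) / sxx) ** 0.5
    return b / std_err


def cadf(x, y):
    """Stage 1: hedge ratio from regressing y on x. Stage 2: test the spread."""
    intercept, hedge_ratio = ols(x, y)
    spread = [yi - hedge_ratio * xi - intercept for xi, yi in zip(x, y)]
    return hedge_ratio, df_tstat(spread)
```

Calling cadf(prices_x, prices_y) on two equal-length price lists returns the estimated hedge ratio and the test statistic, which would then be compared with the chosen critical value.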

Implementing this in Python gives the following result:

Given the above results, with a t-statistic of -3.04, which is less than the 5% critical value of -2.86, we can reject the null hypothesis and conclude that the prices are mean-reverting.



Below are the time series, scatter and residual plots of GBPINR/CADINR




CADF Test results:

Given the above results, with a t-statistic of -3.36, which is smaller than the 5% critical value of -2.86, we can reject the null hypothesis and conclude that the prices are mean-reverting.



CADF Test results

Given the above results, with a t-statistic of -2.93, which is smaller than the 5% critical value of -2.86, we can reject the null hypothesis and conclude that the prices are mean-reverting.



Below are the results from CADF test:

With a t-statistic of -3.25, which is smaller than the 5% critical value of -2.86, we can reject the null hypothesis and conclude that the pair is co-integrated.

We have now confirmed the following co-integrated pairs, listed with their CADF t-statistics:

  • EURINR/USDINR: -3.04
  • GBPINR/CADINR: -3.363
  • USDINR/CADINR: -2.934
  • USDINR/AUDINR: -3.259

The next step is to calculate the z-score of the price ratio, using a 30-day moving average and a 30-day standard deviation:

  • Price ratios are calculated and stored in a new column, ratio, in the data frames (df, df1, df2, df4) of the above currency pairs respectively.

Below is the snapshot of the data frames:





Calculation of Z-score of the price ratio for the 30-day window of moving average and standard deviation:

  • Below are the plots of z-scores for the above co-integrated pairs with their respective price ratios:
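The z-score calculation can be sketched as follows. This is an illustration under stated assumptions rather than the notebook's code: the function name is hypothetical, the window includes the current observation, and the sample standard deviation is used.

```python
from statistics import mean, stdev


def rolling_zscore(ratio, window=30):
    """z-score of each price ratio against its trailing window.

    Entries before the first full window are None.
    """
    z = [None] * len(ratio)
    for i in range(window - 1, len(ratio)):
        recent = ratio[i - window + 1 : i + 1]  # current value plus the previous window-1
        sd = stdev(recent)                      # sample standard deviation (assumption)
        z[i] = (ratio[i] - mean(recent)) / sd if sd > 0 else 0.0
    return z
```

For a pair such as EURINR/USDINR, ratio would be the list of daily EURINR prices divided by the corresponding USDINR prices.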





The z-score plots above show that the z-score of each selected pair exhibits mean-reverting behavior within 2 standard deviations.

Building a Trading Strategy:

  • When the z-score touches +2, short the pair and close the position when it reverts back to +1.
  • When the z-score touches -2, go long the pair and close the position when it reverts back to -1.
  • Only one position is held at any one time.
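The rules above amount to a small state machine. The sketch below is one possible reading of them, with a hypothetical function name; it assumes a position is not re-opened on the same bar on which it is closed.

```python
def positions_from_zscore(z, entry=2.0, exit_level=1.0):
    """Map z-scores to positions: -1 short the pair, +1 long the pair, 0 flat."""
    position = 0
    out = []
    for value in z:
        if value is None:            # z-score not yet available
            out.append(position)
            continue
        if position == 0:
            if value >= entry:       # z touches +2: short the pair
                position = -1
            elif value <= -entry:    # z touches -2: long the pair
                position = 1
        elif position == -1 and value <= exit_level:
            position = 0             # short closed once z reverts to +1
        elif position == 1 and value >= -exit_level:
            position = 0             # long closed once z reverts to -1
        out.append(position)
    return out
```

Because the function tracks a single position variable, it can never hold more than one position at a time, which matches the third rule.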

Equity Curve:

Plotting the equity curve with the starting capital of 100 INR equally divided among 4 pairs.


With 100 INR of initial capital, equity ended at 114.05.

The cumulative profit is about 14% without any leverage. With 10 times leverage (ideal for FX trading), the profit would be about 140%. Below are the important performance metrics of the strategy.


  • Profit percentage without leverage: 14.0514144897 %
  • Profit percentage with 10 times leverage: 140.514144897 %
  • Number of positive trades: 59
  • Number of negative trades: 23
  • Hit ratio: 71.9512195122 %
  • Average positive trade: 0.46886657456220338
  • Average negative trade: -0.59181362649660851
  • Average profit / average loss: 0.792253766338
  • Maximum drawdown: -5.1832506579 %



The above graph shows the maximum drawdown points marked with red dots and the value is added in the above table.
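Metrics of this kind can be computed from a list of per-trade profits and an equity curve. The sketch below uses hypothetical function names and assumes the hit ratio is the share of positive trades among all trades and that drawdown is measured relative to the running peak of the equity curve.

```python
def trade_stats(trade_pnls):
    """Hit ratio (%), average win, average loss, and their ratio."""
    if not trade_pnls:
        raise ValueError("no trades supplied")
    wins = [p for p in trade_pnls if p > 0]
    losses = [p for p in trade_pnls if p < 0]
    avg_win = sum(wins) / len(wins) if wins else 0.0
    avg_loss = sum(losses) / len(losses) if losses else 0.0
    return {
        "hit_ratio_pct": 100.0 * len(wins) / len(trade_pnls),
        "avg_win": avg_win,
        "avg_loss": avg_loss,
        "win_loss_ratio": avg_win / abs(avg_loss) if losses else float("inf"),
    }


def max_drawdown(equity):
    """Largest peak-to-trough decline, as a negative fraction of the peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = min(worst, value / peak - 1.0)
    return worst
```

With 59 positive and 23 negative trades, the hit ratio is 59 / 82, or about 71.95%, which agrees with the figure reported above.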

Instructions for Implementation

  • Run the IPython notebook harish_stat_arb.ipynb to confirm the results and plots.
  • Alternatively, run the Python script in any Python IDE to confirm the results and graphs.
  • Use the code below to export the final dataframe to an Excel file.


Though the strategy generated returns of 140% (with 10 times leverage) over the two-year backtest period, the following factors should be considered to evaluate its performance more accurately.

  • The model ignores slippage and commissions.
  • The model ignores the bid-ask spread when placing buy or sell orders.




Shorting at High: Algo Trading Strategy in R


By Milind Paradkar

Milind began his career at Gridstone Research, building earnings models and writing earnings notes for NYSE-listed companies, covering the Technology and REIT sectors. Milind has also worked at CRISIL and Deutsche Bank, where he was involved in modeling Structured Finance deals covering Asset Backed Securities (ABS) and Collateralized Debt Obligations (CDOs) for the US and EMEA regions.

Milind holds an MBA in Finance from the University of Mumbai, and a Bachelor’s degree in Physics from St. Xavier’s College, Mumbai.


The Executive Programme in Algorithmic Trading (EPAT) exposed me to all the requisite subjects needed to learn algorithmic trading. As part of the EPAT project work I tried coding many strategies. Since I am a novice to algorithmic trading I wanted to code the simplest, and the most basic strategies. Although simple and basic, one should not underestimate the power of such strategies, as they can generate good returns.

“Shorting at High” was one of the strategies that I formulated for my project work. This post explains the strategy in brief and the coding part. I welcome readers to give suggestions, improve on the strategy, or use it.

Strategy in brief

The strategy is to short the best stocks which cross the set percentage threshold on the upside (say 8%-9%) during intraday trading. The shorted stocks are expected to fall by the amount predicted in the metrics sheet, which is generated upon executing the code.

Pair Trading Strategy and Backtesting using Quantstrat

A Recent Webinar Presentation by Marco Nicolas Dibo


This insightful webinar on pairs trading and sourcing data covers the basics of pair trading strategy followed by two examples. In the first example, Marco covers the pairs trading strategy for different stocks traded on the same exchange, and in the second example, Marco has illustrated the pairs strategy for different commodity futures traded on different exchanges. Marco also details the different data sources including Quandl which can be used for creating trading strategies.

By Marco Nicolas Dibo

This article is the final project submitted by the author as a part of his coursework in Executive Programme in Algorithmic Trading (EPAT) at QuantInsti. Do check our Projects page and have a look at what our students are building.


Marco has spent his career as a trader and portfolio manager, with a particular focus on equity and derivatives markets. He specializes in quantitative finance and algorithmic trading and currently serves as head of the Quantitative Trading Desk and Vice-president of Argentina Valores S.A. Marco is also Co-Founder and CEO of Quanticko Trading SA, a firm devoted to the development of high frequency trading strategies and trading software. Marco holds a BS in Economics and an MSc in Finance from the University of San Andrés.


One of my favorite classes during EPAT was the one on statistical arbitrage, so the pair trading strategy seemed a nice idea for me. My strategy triggers new orders when the ratio of the two stock prices diverges from its mean. For this to work, we first have to test whether the pair is co-integrated. If it is, the pair ratio is mean-reverting, and the greater its dispersion from the mean, the higher the probability of a reversal, which makes the trade more attractive. I chose the following pair of stocks:

  • Bank of America (BAC)
  • Citigroup (C)

The idea is the following: if we find two stocks that are correlated (they belong to the same sector), and the pair ratio diverges beyond a certain threshold, we short the stock that is expensive and buy the one that is cheap. Once they converge to the mean, we close the positions and profit from the reversal.

Development of Cloud-Based Automated Trading System with Machine Learning


This article is the final project submitted by the authors as a part of their coursework in Executive Programme in Algorithmic Trading (EPAT) at QuantInsti.


Maxime Fages
Maxime’s career spanned across the strategic aspects of value and risk, with a particular focus on trading behaviors and market microstructure over the past few years. He embraced a quantitative angle in M&A, fund management or currently corporate strategy and has always been an avid open-source software user. Maxime holds an MBA from Insead and an MSc, Engineering from Ecole Nationale Superieure D’Arts et Metiers; he is currently Strategy Director APAC at the CME Group.

Derek began his career on the floor of the CBOT then moved upstairs to focus on proprietary trading and strategy development. He manages global multi-strategy portfolios, focusing in the futures and options space. He is currently the Deputy Director of Systematic Trading at Foretrade Investment Co Ltd.


By the end of the Executive Programme in Algorithmic Trading (EPAT) lectures, Derek and I were spending a significant amount of time exchanging views over a variety of media. We discussed ideas for a project, and the same themes were getting us excited. First, we were interested in dealing with Futures rather than cash instruments. Second, we both had a solid experience using R for quantitative research and were interested in getting our hands dirty on the execution side of things, especially on the implementation of event-driven strategies in Python (which neither of us knew before the EPAT program). Third, we had spent hours discussing and assessing the performance of Machine Learning for trading applications and were pretty eager to try our ideas out. Finally, we were very interested in practical architecture design, particularly in what was the best way to manage the variable resource needs of any Machine Learning framework (training vs. evaluating).

The scope of our project, therefore, came about naturally: developing a fully cloud-based automated trading system that would leverage on simple, fast mean-reverting or trend-following execution algorithms and call on Machine learning technology to switch between these.



Dispersion Strategy Based on Correlation of Stocks and Volatility of Index


By Nitin Aggarwal and Jasbir Singh

This article is the final project submitted by the authors as a part of their coursework in Executive Programme in Algorithmic Trading (EPAT) at QuantInsti. Do check our Projects page and have a look at what our students are building.


This article examines profits from trading using the dispersion strategy based on the correlation of stocks and the volatility of the index. Dispersion helps the trader take a view on volatility only (assuming that correlation mean reverts); the delta risk is therefore hedged by buying or selling futures. In this strategy, both long and short positions are built on volatility, and with more strategies available nowadays it is better to use strategies that take advantage of relative values rather than absolutes. This limits the amount of money at risk in one direction.

EPAT Final Project by Jacques Joubert – Statistical Arbitrage Strategy in R


By Jacques Joubert

This article is the final project submitted by the author as a part of his coursework in Executive Programme in Algorithmic Trading (EPAT) at QuantInsti. Do check our Projects page and have a look at what our students are building.


Those of you who have been following my blog posts for the last 6 months will know that I have taken part in the Executive Programme in Algorithmic Trading offered by QuantInsti.

It’s been a journey and this article serves as a report on my final project focusing on statistical arbitrage, coded in R. This article is a combination of my class notes and my source code.

I uploaded everything to GitHub in order to welcome readers to contribute, improve, use, or work on this project. It will also form part of my Open Source Hedge Fund project on my blog, QuantsPortal.

I would like to say a special thank you to the team at QuantInsti. Thank you for all the revisions of my final project, for going out of your way to help me learn, and the very high level of client services.

History of Statistical Arbitrage

Statistical arbitrage was first developed and used in the mid-1980s by Nunzio Tartaglia’s quantitative group at Morgan Stanley.

  • Pair trading is a “contrarian strategy” designed to harness the mean-reverting behavior of the pair ratio.
  • David Shaw, founder of D. E. Shaw & Co., left Morgan Stanley and started his own “quant” trading firm in the late 1980s, dealing mainly in pair trading.

What is Pair Trading?

Statistical arbitrage trading, or pairs trading as it is commonly known, is defined as trading one financial instrument or basket of financial instruments against another, in most cases to create a value-neutral basket.

It is the idea that a co-integrated pair is mean reverting in nature. There is a spread between the instruments and the further it deviates from its mean, the greater the probability of a reversal.

Note however that statistical arbitrage is not a risk free strategy. Say for example that you have entered positions for a pair and then the spread picks up a trend rather than mean reverting.

The Concept

Step 1: Find 2 related securities

Find two securities in the same sector or industry; they should have similar market capitalization and average traded volume.

An example of this is Anglo Gold and Harmony Gold.

Step 2: Calculate the spread

In the code to follow I use the pair ratio to indicate the spread. It is simply the price of asset A divided by the price of asset B.

Step 3: Calculate the mean, standard deviation, and z-score of the pair ratio / spread.
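In symbols, Steps 2 and 3 can be written as below. The notation is introduced here only for illustration, and two details are assumptions: the statistics are taken over a trailing window of n observations, and the sample form of the standard deviation is used.

```latex
r_t = \frac{P^{A}_t}{P^{B}_t}, \qquad
\mu_t = \frac{1}{n}\sum_{i=0}^{n-1} r_{t-i}, \qquad
\sigma_t = \sqrt{\frac{1}{n-1}\sum_{i=0}^{n-1}\left(r_{t-i}-\mu_t\right)^{2}}, \qquad
z_t = \frac{r_t-\mu_t}{\sigma_t}
```

Here P^A_t and P^B_t are the prices of assets A and B on day t, r_t is the pair ratio, and z_t is the z-score used for the trading signals.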

Step 4: Test for co-integration

In the code to follow I use the Augmented Dickey-Fuller test (ADF test) to test for co-integration. I set up three tests, each with a different number of observations (120, 90, 60); all three tests have to reject the null hypothesis that the pair is not co-integrated.
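The three-window requirement can be sketched as follows. The project itself is coded in R; Python is used here purely for illustration. The function names are hypothetical, and several details are assumptions: each test uses the most recent observations of the pair ratio, the regression is a lag-free Dickey-Fuller form rather than the full augmented test, and the default critical value of -2.86 is only a placeholder for whatever value the user supplies.

```python
def df_tstat(series):
    """t-statistic of b in  s[t] - s[t-1] = a + b * s[t-1] + e[t]."""
    lagged = series[:-1]
    diffs = [cur - prev for prev, cur in zip(series, series[1:])]
    n = len(diffs)
    mean_l = sum(lagged) / n
    mean_d = sum(diffs) / n
    sxx = sum((l - mean_l) ** 2 for l in lagged)
    slope = sum((l - mean_l) * (d - mean_d) for l, d in zip(lagged, diffs)) / sxx
    intercept = mean_d - slope * mean_l
    rss = sum((d - intercept - slope * l) ** 2 for l, d in zip(lagged, diffs))
    return slope / ((rss / (n - 2)) / sxx) ** 0.5


def passes_all_windows(ratio, windows=(120, 90, 60), critical_value=-2.86):
    """True only if every window rejects the null of no mean reversion."""
    if len(ratio) < max(windows):
        return False  # not enough history for the longest test
    return all(df_tstat(ratio[-w:]) < critical_value for w in windows)
```

Requiring all three windows to reject makes the filter stricter than any single test, which is consistent with the later observation that few pairs pass it.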

Step 5: Generate trading signals

Trading signals are based on the z-score, given that the pair passes the test for co-integration. In my project, I used a z-score of 1, as I noticed that other algorithms I was competing with were using very low parameters. (I would have preferred a z-score of 2, as it better matches the literature; however, it is less profitable.)

Step 6: Process transactions based on signals

Step 7: Reporting

R markdown for my project

Import packages and set directory

The first step is always to import the packages needed.

This strategy will be run on shares listed on the Johannesburg Stock Exchange (JSE); because of this I won’t be using the quantmod package to pull data from Yahoo Finance. Instead, I have already obtained and cleaned the data, which I stored in a SQL database and moved to CSV files on the Desktop.

I added all the pairs used in the strategy to a folder which I now set to be the working directory.

Functions that will be called from within other functions (No user interaction)

Next: Create all the functions that will be needed. The functions below will be called from within other functions so you don’t need to worry about the arguments.


The AddColumns function is used to add columns to the data frame that will be needed to store variables.


The PrepareData function calculates the pair ratio and the log10 prices of the pair. It also calls the AddColumns function within it.


The GenerateRowValue function calculates the mean, standard deviation and the z-score for a given row in the data frame.

The GenerateSignal function creates a long, short, or close signal based on the z-score. You can manually change the z-score. I have set it to 1 and -1 for entry signals and any z-score between 0.5 and -0.5 will create a close/exit signal.
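The thresholds described above can be illustrated with a short Python sketch (the original function is written in R and is not reproduced here). The function name, the "hold" label for the zone between the bands, and the direction convention (a high z-score means shorting the ratio) are assumptions made for this example.

```python
def generate_signal(z, entry=1.0, exit_band=0.5):
    """Classify a z-score into a long, short, close, or hold signal."""
    if z >= entry:
        return "short"   # ratio is rich relative to its mean
    if z <= -entry:
        return "long"    # ratio is cheap relative to its mean
    if -exit_band < z < exit_band:
        return "close"   # ratio is back near its mean
    return "hold"        # between the bands: no new signal
```

Raising entry to 2.0 would give the more conservative threshold that the author mentions as better matching the literature.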


The GenerateTransactions function is responsible for setting the entry and exit prices for the respective long and short positions needed to create a pair.

Note: QuantInsti taught us a very specific way of backtesting a trading strategy. They used Excel to teach strategies, and when I coded this strategy I used a large part of the Excel methodology.

Going forward, however, I would explore other ways of storing variables. One of the great things about this method is that you can pull the entire data frame and analyse why a trade was made and all the details pertaining to it.


GetReturnsDaily calculates the daily returns on each position and then calculates the total returns and adds slippage.
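As a rough Python illustration of this kind of calculation (not the author's R function), the sketch below assumes equal capital on each leg, a position of +1 meaning long A and short B, and slippage in basis points charged once on every day the position changes. All names are hypothetical.

```python
def pair_daily_returns(price_a, price_b, position, slippage_bps=5.0):
    """Daily pair returns net of slippage.

    position[t] is the position chosen at the close of day t
    (+1 long A / short B, -1 the reverse, 0 flat); it earns the
    return from day t to day t + 1.
    """
    cost = slippage_bps / 10_000.0
    returns = []
    held = 0  # position carried into the current day
    for t in range(len(position)):
        if t == 0:
            gross = 0.0
        else:
            ret_a = price_a[t] / price_a[t - 1] - 1.0
            ret_b = price_b[t] / price_b[t - 1] - 1.0
            gross = held * (ret_a - ret_b)   # long leg minus short leg
        if position[t] != held:
            gross -= cost                    # charge slippage on a position change
        returns.append(gross)
        held = position[t]
    return returns
```

Increasing slippage_bps shows how quickly costs erode a strategy whose gross daily returns are small.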


The next two functions are used to generate reports. A report includes the following:

Charting:
  1. An equity curve
  2. A drawdown curve
  3. A daily returns bar chart

Statistics:
  1. Annual returns
  2. Annualized Sharpe ratio
  3. Maximum drawdown

Table:
  1. Top 5 drawdowns and their duration
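Two of the report statistics can be sketched in Python as follows. This is not the author's R reporting code: the function names are hypothetical, 252 trading days per year is assumed, and the Sharpe ratio is computed with a zero risk-free rate.

```python
from statistics import mean, stdev


def annualized_return(daily_returns, periods_per_year=252):
    """Geometric annualized return from a list of daily returns."""
    growth = 1.0
    for r in daily_returns:
        growth *= 1.0 + r
    return growth ** (periods_per_year / len(daily_returns)) - 1.0


def annualized_sharpe(daily_returns, periods_per_year=252):
    """Mean over standard deviation of daily returns, scaled to a year."""
    return mean(daily_returns) / stdev(daily_returns) * periods_per_year ** 0.5
```

Both functions take the same list of daily returns, so they can be applied directly to the output of a backtest for a single pair or a portfolio.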

Note: If you have some extra time, you can break this function down further into smaller functions in order to reduce the lines of code and improve usability. Less code = fewer bugs.

Functions that the user will pass parameters to

The next two functions are the only functions that the user should fiddle with.


BacktestPair is used when you want to run a backtest on a trading pair (the pair is passed in via the CSV file)

Functions arguments:

  • pairData = the CSV file data
  • mean = the number of observations used to calculate the mean of the spread.
  • slippage = the number of basis points that act as brokerage as well as slippage
  • adfTest = a boolean value – if the backtest should test for co-integration
  • criticalValue = Critical Value used in the ADF Test to test for co-integration
  • generateReport = a boolean value – if a report must be generated

BacktestPortfolio accepts a vector of CSV files and then generates an equally weighted portfolio.

Functions arguments:

  • names = an atomic vector of CSV file names, example: c(‘DsyLib.csv’, ‘OldSanlam.csv’)
  • mean = the number of observations used to calculate the mean of the spread.
  • leverage = how much leverage you want to apply to the portfolio
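The equal weighting and leverage can be illustrated with a short Python sketch. It is not the R implementation: the function name is hypothetical, and it assumes that each pair's daily returns are already computed and aligned by date, and that leverage simply scales the averaged return.

```python
def portfolio_returns(pair_returns, leverage=1.0):
    """Equally weighted, leveraged daily returns across several pairs.

    pair_returns is a list of per-pair daily return lists.
    """
    n_pairs = len(pair_returns)
    n_days = min(len(r) for r in pair_returns)  # truncate to the shortest history
    return [
        leverage * sum(r[t] for r in pair_returns) / n_pairs
        for t in range(n_days)
    ]
```

With leverage set to 3, as in the portfolios tested later, each day's averaged return is multiplied by three, losses included.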

Running Backtests

Now we can start testing strategies using our code.


Pure arbitrage on the JSE

When starting this project, the main focus was on using statistical arbitrage to find co-integrated pairs and then trade them. However, I very quickly realized that the same code could be used to trade shares that have both a primary and a secondary listing on the same exchange.

If both listings are found on the same exchange, it opens the door for a pure arbitrage strategy due to both listings referring to the same asset. Therefore you don’t need to test for co-integration.

There are two very obvious examples on the JSE.

First Example Investec:

Primary = Investec Ltd : Secondary = Investec PLC

Investec In-Sample Test (2005-01-01 – 2012-11-23)

Test the following parameters

  • The Investec ltd / plc pair
  • mean = 35
  • Set adfTest = F (don’t test for co-integration)
  • Leverage of x3


Investec Out-of-Sample Test (2012-11-23 – 2015-11-23)

Note: if you increase the slippage, you will very quickly kiss profits goodbye.



Second Example Mondi:

Primary = Mondi Ltd : Secondary = Mondi PLC

Mondi In-Sample Test (2008-01-01 – 2012-11-23)

Test the following parameters

  • The Mondi ltd / plc pair
  • mean = 35
  • Set adfTest = F (don’t test for co-integration)
  • Leverage of x3

data <- read.csv('mondi.csv')
mondi <- BacktestPair(data, 35, generateReport = F, adfTest = F)

Mondi Out-of-Sample Test (2012-11-23 – 2015-11-23)

Note: In all of my testing I found that the further down the timeline my data was, the harder it was to make profits on the end of day data. I tested this same strategy on intraday data and it has a higher return profile.


Statistical Arbitrage on the JSE


Next, we will look at a pair trading strategy.

Typically a pair consists of 2 shares that:

  • Share a market sector
  • Have a similar market cap
  • Similar business model and clients
  • Are co-integrated

In all of the portfolios below I use 3x leverage

Construction Portfolio

In-sample test (2005-01-01 – 2012-11-01)
Out-of-sample test (2012-11-23 – 2015-11-23)

Insurance Portfolio

In-sample test (2005-01-01 – 2012-11-01)
Out-of-sample test (2012-11-23 – 2015-11-23)

General Retail Portfolio

In-sample test (2005-01-01 – 2012-11-01)
Out-of-sample test (2012-11-23 – 2015-11-23)


At the end of all my testing, and trust me – there is a lot more testing I did than what is in this report, I came to the conclusion that the Pure Arbitrage Strategy has great hope in being used as a strategy using real money, but the Pair Trading Strategy on portfolios of stocks in a given sector is strained and not likely to be used in production in its current form.

There are many things that I think could be added to improve the performance. Going forward I will investigate using Kalman filters.

More on the Pure Arbitrage Trading Strategy:

I have only found two shares that have dual listings on the same exchange. This means that we can’t allocate large sums of money to the strategy, as it would have a high market impact; however, we could use multiple exchanges and increase the number of shares used.

More on the Pair Trading Strategy:
  1. The number of observations used in the ADF tests is largely to blame. The problem is that a test for co-integration has to be done in order to make a claim for statistical arbitrage; however, by using 120, 90, and 60 as parameters to the three tests, it is very difficult to find pairs that match the criteria and that will continue in this form for the near future. (Kalman filtering may be useful here.)
  2. I haven’t spent a lot of time changing the different parameters, like the number of observations in the mean calculation. (This requires further exploration.)
  3. From the above sector portfolios, we can see that the early years are very profitable, but the further down the timeline we go, the lower the returns get. I have spoken to a few people in the industry as well as friends doing stat arb projects at the University of Cape Town; local lore has it that in 2009 Goldman switched on their stat arb package for JSE-listed securities.
  4. The same is noticed with other portfolios that I didn’t include in this report but that are in the R code file.
  5. I believe that this is due to large institutions using the same bread and butter strategy. You will note (if you spend enough time testing all the strategies) that in 2009 there seems to be a sudden shift in the data to lower returns.
  6. I feel that the end of day data I am using is limiting me and if I were to test the strategy on intraday data then profits would be higher. (I ran one test on intraday data on Mondi and the results were much higher, but I am still to test it on sector portfolios)
  7. This is one of the simpler statistical arbitrage strategies and I believe that if we were to improve the way we calculate the spread and change some of the entry and exit rules, the strategy would become more profitable.

If you made it to the end of this article, I thank you and hope that it added some value. This is the first time that I am using Github, so I am looking forward to seeing if there are any new contributors to the project.

Github repository:
