Architecture of IBrokers R Implementation in Interactive Brokers API

By Milind Paradkar

In our last post on using the IBrokers package, we introduced our readers to some of the basic functions from the IBrokers package which are used to retrieve market data, view account info, and execute/modify orders via R. This post covers the structure of the IBrokers package, which will enable R users to build their custom trading strategies and get them executed via Interactive Brokers Trader Workstation (TWS).

Overview of the Interactive Brokers API Architecture

Before we explain the underlying structure of the IBrokers package, let us look at an overview of the Interactive Brokers API architecture. Interactive Brokers provides an API program which can be run on Windows, Linux, and macOS. The API makes a connection to the IB TWS. The TWS, in turn, is connected to the IB data centers, and thus all communication is routed via the TWS.

The IBrokers R package enables a user to write a strategy in R and have it executed via the IB TWS. Below is the flow structure diagram.
Overview of the Interactive Brokers API Architecture

Getting data from the TWS

To retrieve data from the IB TWS, the IBrokers R package includes five important functions.

  • reqContractDetails: retrieves detailed product information.
  • reqMktData: retrieves real-time market data.
  • reqMktDepth: retrieves real-time order book data.
  • reqRealTimeBars: retrieves real-time OHLC data.
  • reqHistoricalData: retrieves historical data.

In addition to these functions, there are helper functions which enable a user to create the above-mentioned data functions easily. These helper functions include:

  • twsContract: create a general Contract object.
  • twsEquity/twsSTK: wrapper to create equity Contract objects.
  • twsOption/twsOPT: wrapper to create option Contract objects.
  • twsFuture/twsFUT: wrapper to create futures Contract objects.
  • twsFOP: wrapper to create futures options Contract objects.
  • twsCurrency/twsCASH: wrapper to create currency Contract objects.


Real-time Data Model Structure

When a data function is used to access market data streams, the streams received through the TWS API follow a fixed path that sorts each incoming message into the relevant message type. Shown below is the list of arguments of the reqMktData function.

Example: Arguments of the reqMktData function

Source: Algorithmic Trading in R – Malcolm Sherrington

Real-time Data Model

In the following sections, we will see how this data model works and how the arguments of the real-time data functions (e.g. reqMktData) can be customized to create user-defined automated trading programs in R.

Using the CALLBACK Argument

The data functions like reqMktData, reqMktDepth, and reqRealTimeBars all have a special CALLBACK argument. By default, this argument calls the twsCALLBACK function from the IBrokers package.

The general logic of the twsCALLBACK function is to receive the header to each incoming message from the TWS. This is then passed to the processMsg function, along with the eWrapper object. The eWrapper object can maintain state data (prices) and has functions for managing all incoming message types from the TWS. Once the processMsg call returns, another cycle of the infinite loop occurs.

In the example of the incoming messages shown below, we have circled a single message in green (1 6 1 4 140.76 1 0). The first digit (i.e. 1) is the header and the remaining numbers (i.e. 6 1 4 140.76 1 0) constitute the body of the message.

Incoming messages from the reqMktData function call

Each message received will invoke the appropriately named eWrapper callback, depending on the message type. By default, when nothing is specified, the code calls the default method, which prints the results to the screen via cat.

Example with Default Method:

Default Method



Setting CALLBACK = NULL will send raw message-level data to cat, which in turn uses its file argument either to write the data to standard output or to redirect it to an open connection, a file, or a pipe.

Example with CALLBACK argument set to NULL:

CALLBACK argument set to NULL


Callbacks, via CALLBACK and eventWrapper, are designed to allow for R-level processing of the real-time data stream. A callback lets you customize the output (i.e. the incoming results), which can then be used to create automated trading programs in R based on user-defined criteria.

Internal code of the twsCALLBACK function

Inside of the CALLBACK (i.e. twsCALLBACK function) is a loop that fetches the incoming message type and calls the processMsg function at each new message.

Internal code snippet of the twsCALLBACK function

The processMsg Function

The processMsg function internally is a series of if-else statements that branch according to a known incoming message type. A snippet of the internal code structure of the processMsg function is shown below.

Internal code snippet of the processMsg function

The eWrapper Closure

Creating the eWrapper closure in twsCALLBACK using the eWrapper function

The eWrapper function creates an eWrapper closure to allow for custom incoming message management. The eWrapper closure contains a list of functions to manage all incoming message types. Each message type has a corresponding function in the eWrapper designed to handle its particular details.

List of functions contained in the eWrapper closure

The data environment is .Data, with accessor methods get.Data, assign.Data, and remove.Data. These methods can be called from the closure object as eWrapper$get.Data, eWrapper$assign.Data, etc. By creating an instance of eWrapper, which is done by calling it as a function, one can then modify any or all of the particular methods embedded in the object.
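To make the closure idea concrete, here is a minimal sketch in Python rather than R. It is only an analogue of the pattern, not the IBrokers code: make_wrapper, tick_price, and store_last_price are hypothetical names, and the position of the price within the message body is assumed from the sample message shown earlier.

```python
def make_wrapper():
    """Build a wrapper 'closure': private state plus replaceable handlers."""
    data = {}  # plays the role of the .Data environment

    def get_data(key):
        return data.get(key)

    def assign_data(key, value):
        data[key] = value

    def remove_data(key):
        data.pop(key, None)

    def tick_price(body):
        # Default behaviour: just print the message body.
        print("tickPrice", *body)

    return {
        "get_data": get_data,
        "assign_data": assign_data,
        "remove_data": remove_data,
        "tick_price": tick_price,
    }


wrapper = make_wrapper()


def store_last_price(body):
    # Custom handler: remember the price instead of printing it.
    # Assumption: the price is the fourth field of the body, as in the
    # sample body "6 1 4 140.76 1 0".
    wrapper["assign_data"]("last_price", float(body[3]))


wrapper["tick_price"] = store_last_price  # override a single method
wrapper["tick_price"]("6 1 4 140.76 1 0".split())
print(wrapper["get_data"]("last_price"))
```

The point of the design is that state and handlers travel together, so replacing one handler changes how one message type is treated without touching the rest.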

Summarizing the Internal Structure of the IBrokers Package

We have seen above how the internal structure of the IBrokers package works. To summarize the entire mechanism, it can be depicted as shown below:

Request to TWS for data -> twsCALLBACK -> processMsg -> eWrapper
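The chain above can be sketched in a few lines of Python. This is an illustration of the control flow only, under stated assumptions: callback_loop, process_msg, and the handler names are hypothetical, the header-to-handler mapping is illustrative, and a finite list of strings stands in for the live TWS connection that the real loop reads from.

```python
def process_msg(header, body, wrapper):
    """Branch on the message header and call the matching handler."""
    if header == "1":
        wrapper["tick_price"](body)
    elif header == "2":
        wrapper["tick_size"](body)
    else:
        wrapper["unknown"](header, body)


def callback_loop(messages, wrapper):
    """Read each message, split the header from the body, and dispatch it."""
    for raw in messages:
        header, *body = raw.split()
        process_msg(header, body, wrapper)


seen = []
wrapper = {
    "tick_price": lambda body: seen.append(("price", float(body[3]))),
    "tick_size": lambda body: seen.append(("size", body)),
    "unknown": lambda header, body: seen.append(("unknown", header)),
}

# The real loop is infinite; two sample messages are enough to show the path.
callback_loop(["1 6 1 4 140.76 1 0", "99 1 2"], wrapper)
print(seen)
```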

Real-Time Data Model

We will use the snapShotTest code example published by Jeff Ryan. The code below modifies the twsCALLBACK function. This modified callback is used as an argument to the reqMktData function. The output using the modified callback is more convenient to read than the normal output when we use the reqMktData function.

Another change in the snapShotTest code is to record any error messages from the IB API in a separate file. (Under the default method, the eWrapper prints such error messages to the console.) To do this, we create a different wrapper using eWrapper(debug=NULL). Once it is constructed, we can assign its errorMessage() function to the eWrapper we intend to use.

simple trade logic

We then apply a simple trade logic which generates a buy signal if the last bid price is greater than a pre-specified threshold value. One can similarly tweak the logic of the twsCALLBACK to create custom callbacks based on one’s requirement of trading strategy.
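As a rough illustration of this rule, the sketch below is written in Python and is not the snapShotTest code. The function names are hypothetical, and place_order is a stand-in supplied by the caller; in a live setup it would submit the order to TWS.

```python
def buy_signal(last_bid, threshold):
    """Return True when the latest bid is above the threshold."""
    return last_bid > threshold


def on_bid(last_bid, threshold, place_order):
    # place_order is a hypothetical callable provided by the caller.
    if buy_signal(last_bid, threshold):
        place_order("BUY")
        return True
    return False


orders = []
on_bid(140.76, 140.50, orders.append)  # above the threshold: order recorded
on_bid(140.10, 140.50, orders.append)  # below the threshold: nothing happens
print(orders)
```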

custom callbacks

Order getting filled in the IB Trader Workstation (TWS)


To conclude, this post gave a detailed overview of the architecture of the IBrokers package, which is the R implementation of the Interactive Brokers API. Interactive Brokers, in collaboration with QuantInsti™, hosted a webinar, “Trading using R on Interactive Brokers”, which was held on 2nd March 2017 and conducted by Anil Yadav, Director at QuantInsti™. You can click on the link provided above to learn more about the IBrokers package.

Next Step

If you want to learn various aspects of Algorithmic trading then check out the Executive Programme in Algorithmic Trading (EPAT™). The course covers training modules like Statistics & Econometrics, Financial Computing & Technology, and Algorithmic & Quantitative Trading. EPAT™ equips you with the required skill sets to be a successful trader. Enroll now!


Setting-Up an Algo Trading Desk


By Apoorva Singh

You need domain knowledge, skilled resources, technology & infrastructure in the form of hardware and software for setting up any business or start-up. The requirements, especially in terms of regulations, infrastructure and cost estimates can vary depending on the country you plan to set up your desk in but overall, things will fall under this umbrella. This blog will give you an overview of the requirements for setting up an algorithmic trading desk or firm.

Requirements for setting up an Algorithmic Trading desk

  1. Registering your company: The first step is to register your firm. You can register your trading firm (for proprietary trading) as a Company, Partnership, LLP or even as an Individual. If, however, you want to set up a Hedge Fund with investors, other approvals from regulators (e.g. SEBI in India and MAS in Singapore) are also required, and the compliance rules and regulations are generally much stricter.
  2. Capital required for Trading and for Operations: Broadly speaking, the trading capital required for High-Frequency Trading (HFT) is usually less than that required for Low-Frequency Trading (LFT). LFT is scalable and can absorb much more trading capital. But the capital required for trading operations is typically far higher for HFT than for LFT, given the infrastructure and technology requirements of HFT.
  3. Trading Paradigm: You need to decide on the trading philosophy you’ll adopt. The most common trading philosophies include execution-based strategies, where the focus is on getting the best price for execution rather than on Alpha. Then there are high-frequency strategies, which are extremely latency sensitive and mainly include market making, scalping, and arbitrage. Finally, there are market sentiment based, machine learning based and news based trading algorithms, which can be relatively less sensitive to latency than HFT.
  4. Access to Market: Exchanges offer different kinds of memberships: clearing members, trading members, trading cum clearing members, professional clearing members, etc. If you don’t want to go for direct membership with the exchange, you can also go through a broker. This involves fewer compliance rules and regulatory requirements. However, the flip side is that you have to pay brokerage, and most HFT strategies are highly sensitive to transaction cost.
  5. Infrastructure Requirements: The main focus areas under this head are Colocation, Hardware, Network Equipment and Network Lines.

a) Colocation: Colocation means that your server is in the same premises and on the same local area network as that of the exchange. Most exchanges provide a colocation facility now. Where an exchange does not provide one, there are vendors who provide colocation or proximity hosting facilities. A significant percentage of the orders received by exchanges are now generated by algorithms, with most of these orders coming from co-located servers.

b) Hardware: Many leading companies produce servers required for an Algorithmic Trading setup. Customizable hardware for high-frequency trading is also available, which can be modified as per the requirement to improve performance. Given fast changes in technology, the present scenario requires servers to be changed and updated almost every year, or at most every two years.

c) Network Equipment: This mainly includes Routers/Modems, Switches, Network Interface Controllers (NICs) and FPGAs. For routers and modems, you need to check version compatibility with exchanges. NICs are basically Ethernet cards which help your computer to get connected to a network. FPGA stands for Field-Programmable Gate Array. It is basically an integrated circuit containing an array of programmable logic blocks that can be configured to perform complex operations.

d) Network Lines: Network lines can be broadly categorised into the five categories below:

i. Trading Lease Line: Used for sending out orders to the exchange. Different lines provide different bandwidth for messages to be sent and are priced accordingly.

ii. Market Data Lease Line: This line is used to receive market data from the exchanges or your data provider. There are two main formats in which exchanges send market data: Tick By Tick or Snapshot Data (for example, on NSE).

-> Tick By Tick (TBT): Tick data is a collection of sequential “ticks”, each of which carries the latest quote, trade, price, and volume information. You can also subscribe to a bucket feed, which filters data for the specific instruments requested.

-> Snapshot Data: A Snapshot Data feed contains trade quotations and other related information on the trading of different instruments, generated at regular intervals of time.

iii. Lines between Exchanges: These are point-to-point lines between exchanges which can assist with SOR. Smart Order Routing (SOR) lets you send orders to different exchanges, in effect helping you to pick up the liquidity available on different exchanges at the most effective price.

iv. Between Premises and Exchange: In India, you cannot have internet access in the colocation area, so there is a dedicated line between the colocation premises and your facility. The cost of this line depends on the distance.

v. Test Connectivity: Exchanges provide test markets where you can test your trading algorithms. For instance, in India, NSE provides two test markets: the Normal test market and the Dedicated test market. Some global exchanges like CME also provide internet VPNs for test connectivity.


  6. Algorithmic Trading Platform: An algorithmic trading platform has three main parts:

a) Market Data Adapter: The MDA is used to receive data from the exchange and convert it to the format which our trading system understands.

b) Complex Events Processing Engine: The CEP is the brain of the system, and the main strategy logic lies here.

c) Order Routing System: The CEP sends instructions to the ORS, which converts the order into a format the exchange understands. FIX is the most widely used format, though some exchanges also have their own native formats. When an exchange supports both, the native format may be preferred for faster connectivity, since the FIX conversion might be applied in an additional layer. However, using the exchange’s native format can also involve dedicated maintenance effort.

The latency of various platforms varies from system to system and so does the price.
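The three parts described above can be pictured as a small pipeline. The Python sketch below is purely illustrative: the comma-separated feed line, the buy rule, the order fields, and the key=value output string are all made up for the example, and the output is not FIX or any exchange’s native format.

```python
def market_data_adapter(raw):
    """MDA: convert a raw feed line into the system's internal format."""
    symbol, price = raw.split(",")
    return {"symbol": symbol, "price": float(price)}


def cep_engine(tick, limit):
    """CEP: strategy logic. Here, buy when the price is below a limit."""
    if tick["price"] < limit:
        return {
            "symbol": tick["symbol"],
            "side": "BUY",
            "qty": 100,
            "price": tick["price"],
        }
    return None


def order_routing_system(order):
    """ORS: turn the instruction into an exchange-facing string."""
    return "|".join(f"{key}={value}" for key, value in order.items())


tick = market_data_adapter("ABC,99.5")
order = cep_engine(tick, limit=100.0)
if order is not None:
    print(order_routing_system(order))
```

Keeping the three stages as separate functions mirrors the platform layout: the strategy logic never needs to know the exchange’s wire format.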

  7. Backtesting: Backtesting is a historical simulation of an algorithmic trading strategy to see its performance on past data. Most ATPs come with backtesting platforms which can be used to obtain simulated results in terms of profit & loss, risk and performance statistics over the duration of the backtested data, which help to quantify the strategy’s return on risk. Next, we test the strategy in the “Test markets” which we have already discussed briefly in the previous section. Market tests ensure that no technical glitches occur while connecting to the market through the strategy.
  8. Risk Management: Risk management generally involves more focus on Market Risk monitoring. But in the case of High-Frequency trading, Operational Risk is much more important. Failure of technology, network, or data streams can be disastrous. You need to have multiple levels of checks for data, starting from the socket level, to capture any anomalies and stop the strategy instantly if something is wrong. A matter of seconds can lead to huge losses, which makes it important to react very fast and disconnect within a few milliseconds or less if things go wrong.
  9. Conformance and Empanelment: In India, you need the exchange’s approval before you take a strategy live. The process involves participating in a mock session to give a demo of your strategy to the exchange. If all required conditions are satisfied, then the strategy can be taken live. Some exchanges like CME don’t require each strategy to be tested separately; they just test Trading Systems and grant access.
  10. Audit & Compliance: All HFT firms in India have to undergo a half-yearly audit. Auditing can only be done by certified auditors listed on the exchange’s website. For the audit, you are required to maintain order logs, trade logs, control parameters, etc. for the past few years. Other global exchanges like CME require similar data to be saved for the past few years for audit purposes.
  11. Team: And last but not least, you need a team of professionals to come together to run your desk. Broadly speaking, Traders/Strategists, IT professionals, Network managers, Risk Managers, HR and Legal teams need to work together. But to start with, IT professionals and Traders/Strategists should be sufficient. A small team of 3-5 Traders and IT professionals, along with Support Staff, i.e. a total of about 7-10 people, can constitute an algorithmic trading firm. In the case of start-ups, a single person can don multiple hats, taking responsibility for several tasks, and a team of 4-5 members can start.

The trading philosophy and frequency of trading you choose will alter the infrastructure and skill requirements significantly. Selecting the trading philosophy can be a crucial decision for your set-up and will require enough research about different paradigms, markets and the regulations around them.

Next Step

Learn by doing and know more about trading strategy paradigms, different programming languages that can be used for trading and the advantages of algorithmic trading over traditional trading techniques by checking out the self-paced certification courses on Quantra!

If you want to learn various aspects of Algorithmic trading then check out the Executive Programme in Algorithmic Trading (EPAT™). The course covers training modules like Statistics & Econometrics, Financial Computing & Technology, and Algorithmic & Quantitative Trading. EPAT™ equips you with the required skill sets to build a promising career in algorithmic trading. Enroll now!




Mean Reversion in Time Series

By Devang Singh

Time series data is simply a collection of observations generated over time. For example, the speed of a race car at each second, daily temperature, weekly sales figures, stock returns per minute, etc. In the financial markets, a time series tracks the movement of specific data points, such as a security’s price over a specified period of time, with data points recorded at regular intervals. A time series can be generated for any variable that is changing over time. Time series analysis comprises techniques for analyzing time series data in an attempt to extract useful statistics and identify characteristics of the data. Time series forecasting is the use of a mathematical model to predict future values based on previously observed values in the time series data.

The graph shown below represents the daily closing price of Aluminium futures over a period of 93 trading days, which is a time series.

Mean Reversion

Mean reversion is the theory which suggests that prices, returns, or various economic indicators tend to move to the historical average or mean over time. This theory has led to many trading strategies which involve the purchase or sale of a financial instrument whose recent performance has differed greatly from its historical average without any apparent reason. For example, suppose the price of gold increases on average by INR 10 every day, and one day the price of gold increases by INR 40 without any significant news or factor behind this rise. By the mean reversion principle, we can expect the price of gold to fall in the coming days such that the average change in the price of gold remains the same. In such a case, the mean reversionist would sell gold, speculating that the price will fall in the coming days, and would make a profit by buying back the same amount of gold at a lower price.

A mean-reverting time series has been plotted below. The horizontal black line represents the mean, and the blue curve is the time series, which tends to revert to the mean.

A collection of random variables is defined to be a stochastic or random process. A stochastic process is said to be stationary if its mean and variance are time invariant (constant over time). A stationary time series will be mean reverting in nature, i.e. it will tend to return to its mean, and fluctuations around the mean will have roughly equal amplitudes. A stationary time series will not drift too far away from its mean because of its finite constant variance. A non-stationary time series, on the contrary, will have a time-varying variance or a time-varying mean or both, and will not tend to revert to its mean.

In the financial industry, traders take advantage of stationary time series by placing orders when the price of a security deviates considerably from its historical mean, speculating that the price will revert to its mean. They start by testing for stationarity in a time series. Financial data points, such as prices, are often non-stationary, i.e. they have means and variances that change over time. Non-stationary data tends to be unpredictable and is difficult to model or forecast directly.

A non-stationary time series can be converted into a stationary time series by either differencing or detrending the data. A random walk (the movements of an object or changes in a variable that follow no discernible pattern or trend) can be transformed into a stationary series by differencing (computing the difference between Yt and Yt-1). The disadvantage of this process is that it results in losing one observation each time the difference is computed. A non-stationary time series with a deterministic trend can be converted into a stationary time series by detrending (removing the trend). Detrending does not result in loss of observations.

A linear combination of two non-stationary time series can also result in a stationary, mean-reverting time series. Time series (integrated of at least order 1) that can be linearly combined to produce a stationary time series are said to be cointegrated.
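The differencing step mentioned above is easy to see on simulated data. The sketch below builds a random walk with the Python standard library (the seed, length, and shock size are arbitrary choices for the illustration) and takes its first difference.

```python
import random

random.seed(0)

# Random walk: each value is the previous value plus a random shock.
walk = [0.0]
for _ in range(199):
    walk.append(walk[-1] + random.gauss(0, 1))

# First difference: Y_t - Y_{t-1}. The result is one observation shorter.
diff = [current - previous for previous, current in zip(walk, walk[1:])]

print(len(walk), len(diff))
```

The differenced series is just the sequence of shocks, which is why it is stationary even though the walk itself is not.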

Shown below is a plot of a non-stationary time series with a deterministic trend (Yt = α + βt + εt) represented by the blue curve and its detrended stationary time series (Yt – βt = α + εt) represented by the red curve.
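A small simulation shows the same idea in code. This is a sketch under assumptions: the values of α and β, the noise level, and the use of a least-squares estimate of the slope are choices made for the illustration, not taken from the plot.

```python
import random

random.seed(1)
alpha, beta = 5.0, 0.3  # assumed values for the illustration
n = 200
t = list(range(n))
y = [alpha + beta * ti + random.gauss(0, 1) for ti in t]

# Estimate the slope by ordinary least squares.
t_mean = sum(t) / n
y_mean = sum(y) / n
beta_hat = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y)) / sum(
    (ti - t_mean) ** 2 for ti in t
)

# Remove the trend: Y_t - beta_hat * t. No observations are lost.
detrended = [yi - beta_hat * ti for ti, yi in zip(t, y)]

print(round(beta_hat, 2), len(detrended))
```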


Trading Strategies based on Mean Reversion

One of the simplest mean reversion related trading strategies is to find the average price over a specified period, followed by determining a high-low range around the average value from where the price tends to revert to the mean. The trading signals will be generated when these ranges are crossed: a sell order is placed when the range is crossed on the upper side and a buy order when the range is crossed on the lower side. The trader takes contrarian positions, i.e. goes against the movement of prices (or trend), expecting the price to revert to the mean.

This strategy looks too good to be true, and it is: it faces severe obstacles. First, the lookback period of the moving average might contain a few abnormal prices which are not characteristic of the dataset, and this will cause the moving average to misrepresent the security’s trend or the reversal of a trend. Secondly, it might be evident that the security is overpriced as per the trader’s statistical analysis, yet he cannot be sure that other traders have made the exact same analysis. If other traders don’t see the security as overpriced, they will continue buying it, which will push the price even higher. The strategy would result in losses if such a situation arises.
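A minimal sketch of such a band rule is shown below. It makes several assumptions that the description above leaves open: the range is taken to be a multiple of the standard deviation of the lookback window, and band_signal, lookback, and width are hypothetical names and defaults.

```python
from statistics import mean, stdev


def band_signal(prices, lookback=20, width=2.0):
    """Return 'SELL', 'BUY', or 'HOLD' for the latest price."""
    window = prices[-lookback - 1:-1]  # history, excluding the latest price
    centre = mean(window)
    band = width * stdev(window)
    last = prices[-1]
    if last > centre + band:
        return "SELL"  # range crossed on the upper side
    if last < centre - band:
        return "BUY"  # range crossed on the lower side
    return "HOLD"


history = [100 + (i % 5) for i in range(20)]  # oscillates between 100 and 104
print(band_signal(history + [110]))
print(band_signal(history + [94]))
print(band_signal(history + [102]))
```

Note that the window deliberately excludes the latest price, so an abnormal latest print does not distort the band it is being compared against.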

Pairs Trading is another strategy that relies on the principle of mean reversion. Two co-integrated securities are identified; the spread between the prices of these securities would be stationary and hence mean reverting in nature. An extended version of Pairs Trading is called Statistical Arbitrage, where many co-integrated pairs are identified and split into buy and sell baskets based on the spreads of each pair.

The first step in a Pairs Trading or Stat Arb model is to identify a pair of co-integrated securities. One of the commonly used tests for checking co-integration between a pair of securities is the Augmented Dickey-Fuller Test (ADF Test). It tests the null hypothesis of a unit root being present in a time series sample. A time series which has a unit root, i.e. 1 is a root of the series’ characteristic equation, is non-stationary.

The augmented Dickey-Fuller statistic, also known as the t-statistic, is typically a negative number. The more negative it is, the stronger the rejection of the null hypothesis that there is a unit root at some level of confidence, which would imply that the time series is stationary. The t-statistic is compared with a critical value parameter; if the t-statistic is less than the critical value parameter, then the test is positive and the null hypothesis is rejected.

Co-integration check – ADF Test

Consider the Python code shown below for checking co-integration:

We start by importing the relevant libraries, followed by fetching financial data for two securities using the quandl.get() function. The Quandl library makes financial and economic data available directly in Python. In this example, we have fetched data for Aluminium and Lead futures from MCX. We then print the first five rows of the fetched data using the head() function, in order to view the data being pulled by the code.

Using the statsmodels.api library, we compute the Ordinary Least Squares regression on the closing prices of the commodity pair and store the result of the regression in the variable named ‘result’. Next, using the statsmodels.tsa.stattools library, we run the adfuller test by passing the residual of the regression as the input and store the result of this computation in the array “c_t”. This array contains values like the t-statistic, p-value, and critical value parameters. Here, we consider a significance level of 0.1 (90% confidence level). “c_t[0]” carries the t-statistic, “c_t[1]” contains the p-value and “c_t[4]” stores a dictionary containing critical value parameters for different confidence levels.

For co-integration we consider two conditions: first, whether the t-statistic is less than or equal to the critical value parameter (c_t[0] <= c_t[4][‘10%’]), and second, whether the p-value is less than or equal to the significance level (c_t[1] <= 0.1). If both these conditions are true, we print that the “Pair of securities is co-integrated”, else we print that the “Pair of securities is not co-integrated”.


To know more about Pairs Trading, Statistical Arbitrage and the ADF test, you can check out the self-paced online certification course on “Statistical Arbitrage Trading” offered jointly by QuantInsti and MCX to learn how to trade Statistical Arbitrage strategies using Python and Excel.

Other Links

Statistics behind pairs trading –

ADF test using excel –

Next Step

If you want to learn various aspects of Algorithmic trading then check out the Executive Programme in Algorithmic Trading (EPAT™). The course covers training modules like Statistics & Econometrics, Financial Computing & Technology, and Algorithmic & Quantitative Trading. EPAT™ equips you with the required skill sets to build a promising career in algorithmic trading. Enroll now!


Strategy using Trend-following Indicators: MACD, ST and ADX

By Gopal Ananthanarayanan

This article is the final project submitted by the author as a part of his coursework in Executive Programme in Algorithmic Trading (EPAT™) at QuantInsti™. Do check our Projects page and have a look at what our students are building.

‘Trend is a friend’ goes a famous saying. According to the pundits, the success rate in the stock market is 5%. Of course, everyone wants to be in that 5%! To be in that 5% bracket is difficult, if not impossible. All we need is an appropriate strategy (or strategies). Even with a strategy in place, it is important to understand whether the market condition will help the strategy. The decisive factor in being a successful trader is being current in the market.

My strategy is based on training that I had a couple of years ago from a respected and well-known technical analyst. I have used that strategy and enhanced it based on the feedback that I received from my mentor Mr. Abhishek Kulkarni.

Motivation for using this strategy

I have been following charts for a few years now. As part of my learning, I have used several technical indicators. I am fascinated by how technical analysts use these indicators in response to the market.

As my strength is programming, I have used a strategy which is not too heavy on concepts. During the assignments stage, I used R to implement the pairs trading strategy.

Details about the strategy

The strategy that I have arrived at works on trending markets – both bear and bull. It uses a couple of technical indicators to identify the momentum and trades both on the long and the short side of the market.

The technical indicators that I have used are Moving Average Convergence Divergence (MACD) indicator, a trend-following momentum indicator and Super Trend (ST) indicator, a trend following indicator.


A brief note on the use of MACD and ST indicators

Moving Average Convergence Divergence (MACD)

MACD is calculated using two exponential moving averages (EMA) – short term and long term. An exponential moving average of MACD is used as a signal line to indicate the upward or downward momentum.

There are two entry points to be considered while using MACD. The first is when the MACD line crosses the signal line. The second is when MACD is in positive territory, which implies that the shorter moving average is above the chosen longer moving average.
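For readers who want to see the calculation, here is a plain-Python sketch of MACD and the two entry conditions. The 12/26/9 periods are commonly used defaults and are an assumption here, as is the choice to seed each EMA with the first value; TA-Lib, which the strategy itself uses, seeds its averages differently, so its numbers will not match this sketch exactly. The function names are hypothetical.

```python
def ema(values, span):
    """Exponential moving average with smoothing factor 2 / (span + 1)."""
    k = 2.0 / (span + 1)
    out = [values[0]]
    for value in values[1:]:
        out.append(value * k + out[-1] * (1 - k))
    return out


def macd(prices, fast=12, slow=26, signal=9):
    """Return the MACD line and its signal line."""
    macd_line = [f - s for f, s in zip(ema(prices, fast), ema(prices, slow))]
    return macd_line, ema(macd_line, signal)


def entry_points(macd_line, signal_line):
    """Check the two entry conditions on the latest bar.

    Returns (crossed_up, positive): whether MACD has just crossed above
    the signal line, and whether MACD is in positive territory.
    """
    crossed_up = (
        macd_line[-2] <= signal_line[-2] and macd_line[-1] > signal_line[-1]
    )
    positive = macd_line[-1] > 0
    return crossed_up, positive


prices = [float(i) for i in range(1, 61)]  # a steadily rising series
macd_line, signal_line = macd(prices)
print(entry_points(macd_line, signal_line))
```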

SuperTrend (ST)

The SuperTrend indicator, which is a trend-following indicator, has been used to determine whether the price is in an upward or downward trend. If the price is above the indicator line, the line acts as a level of support. If the price is below the indicator line, the line acts as a level of resistance.

I have optimized the strategy by managing the position size through a weightage of 1 unit for MACD and 2 units for SuperTrend. I expect that the MACD will provide quick entry and exit positions.  When used with SuperTrend, it will provide more clarity to run the trend.

Choice of stocks

There is no stock-specific criterion for using this strategy.  However, for any strategy to work efficiently, liquid stocks are preferred. Hence, the focus of this strategy is on Nifty 50 stocks.

Data usage

I have backtested this strategy on a daily time frame working off the daily data downloaded from Yahoo.

I have tested my strategy from the year 2012 onwards.  Obviously, some of the stocks I have used did not have data from 2012.  That is a known caveat, with which I backtested my strategy.

I have used Python for implementing my strategy along with packages like NumPy, pandas, Matplotlib, and TA-Lib. TA-Lib helps in calculating MACD with the necessary parameters.

As there is no readily available method to calculate SuperTrend price points, I have coded the method and used the same in the program.

When I presented this topic to my mentor Abhishek, he suggested I improve the strategy by using the Average Directional Index (ADX) indicator. ADX is another readily available method in the TA-Lib package to determine whether the price is trending.

The strategy to BUY/SELL based on MACD and ST will be done only when the stock is trending. Trending of stocks will be decided based on ADX. Any BUY/SELL decision will be made only when the ADX data is above the pre-determined threshold. The moment ADX moves below the threshold, all the open positions will be closed when the market opens the next day.

Implementation of the strategy

The Python packages that I have used in this strategy include:

I created a method to calculate the SuperTrend indicator.  One can optimize this further by passing the data in an array.  However, for simplicity, I am passing each record to identify the upper band, lower band, and the SuperTrend.

If the period decided for backtesting is very much in the past, there is a chance that some stocks might not have been traded in the market during that time frame. This is a known caveat.

The following Python code gets the latest Nifty 50 list and converts the output into a dataframe.

To set the dates for data retrieval and retrieve the historical data from Yahoo Finance:

After going through the data and analyzing it, I discovered that the ‘Adjusted Close’ data provided by Yahoo is skewed when there is a corporate action on the stock.  To handle this, I have made a small adjustment programmatically. If there is a huge variance in the daily return – say the variance is less than 0.75 or above 1.50, then I update the ‘Adjusted Close’ data with ‘Close’ data.

The following Python code gets the technical indicators data into a dataframe for further processing.

Though Average True Range (ATR) indicator is not used directly in the strategy, it is needed to calculate the SuperTrend. I have used the following code to get that into the dataframe.
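For readers without TA-Lib at hand, ATR can be sketched in plain Python as follows. This assumes Wilder smoothing and a 14-period default, which is one common definition; the output is not guaranteed to match TA-Lib value for value, and the function names are hypothetical.

```python
def true_range(highs, lows, closes):
    """True range per bar; the first bar has no previous close, so it uses high - low."""
    tr = [highs[0] - lows[0]]
    for i in range(1, len(closes)):
        tr.append(max(highs[i] - lows[i],
                      abs(highs[i] - closes[i - 1]),
                      abs(lows[i] - closes[i - 1])))
    return tr


def atr(highs, lows, closes, period=14):
    """Wilder-smoothed ATR; entries before the first full window are None."""
    tr = true_range(highs, lows, closes)
    out = [None] * len(tr)
    if len(tr) < period:
        return out
    out[period - 1] = sum(tr[:period]) / period  # seed with a simple average
    for i in range(period, len(tr)):
        out[i] = (out[i - 1] * (period - 1) + tr[i]) / period
    return out


# Hypothetical three-bar example with a 2-period ATR
print(atr([10, 11, 12], [8, 9, 10], [9, 10, 11], period=2))  # [None, 2.0, 2.0]
```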

Now calculate SuperTrend and add that to the dataframe.
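A minimal sketch of one widely used SuperTrend formulation is given below. It is not the method from the project file: the multiplier of 3, the choice to start on the upper band, and the function name `supertrend` are all assumptions made for illustration. It takes a precomputed ATR list, with None where the ATR is not yet available.

```python
def supertrend(highs, lows, closes, atr_values, multiplier=3.0):
    """Return (trend_line, direction); direction is 1 for uptrend, -1 for downtrend."""
    n = len(closes)
    upper = [None] * n
    lower = [None] * n
    trend = [None] * n
    direction = [0] * n
    for i in range(n):
        if atr_values[i] is None:
            continue  # not enough history for ATR yet
        mid = (highs[i] + lows[i]) / 2.0
        basic_upper = mid + multiplier * atr_values[i]
        basic_lower = mid - multiplier * atr_values[i]
        if i == 0 or upper[i - 1] is None:
            # First usable bar: assume a downtrend unless price is above the upper band
            upper[i], lower[i] = basic_upper, basic_lower
            direction[i] = 1 if closes[i] > basic_upper else -1
        else:
            # Bands only tighten unless the previous close broke through them
            if basic_upper < upper[i - 1] or closes[i - 1] > upper[i - 1]:
                upper[i] = basic_upper
            else:
                upper[i] = upper[i - 1]
            if basic_lower > lower[i - 1] or closes[i - 1] < lower[i - 1]:
                lower[i] = basic_lower
            else:
                lower[i] = lower[i - 1]
            if direction[i - 1] == -1:
                direction[i] = 1 if closes[i] > upper[i] else -1
            else:
                direction[i] = -1 if closes[i] < lower[i] else 1
        # In an uptrend the line is the lower band (support), else the upper band
        trend[i] = lower[i] if direction[i] == 1 else upper[i]
    return trend, direction


closes = [10, 11, 12, 20, 21]            # hypothetical prices with a breakout
highs = [c + 1 for c in closes]
lows = [c - 1 for c in closes]
line, direction = supertrend(highs, lows, closes, [2.0] * 5, multiplier=1.0)
print(direction)  # [-1, -1, -1, 1, 1]
```

A change in `direction` from -1 to 1 (or the reverse) is the SuperTrend crossover that the signal logic later relies on.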

To identify the crossover, I have prepared the dataframe with previous periods' data for each day’s trading data – 2 periods of data are needed for MACD and ST to avoid back-testing bias. For ADX, the previous day’s data is enough.

The next step generates the MACD and ST trading signals and adds the signals to the dataframe. (I have posted detailed inline comments, followed by the for loop code.)

ADX indicator provides an important data point for the strategy to determine whether the stock is trending. BUY or SELL will happen only when ADX is above the threshold. The moment ADX drops below the threshold all open positions will be closed.


When there is a MACD crossover or an ST crossover, ADX is used to decide on trading the stock. Two variables are used in the strategy for each indicator. One is a signal (macdsig/supersig) and the other is a strategy (macdstr/superstr).

When there is a positive crossover of MACD / ST and ADX is trending, then the signal and strategy are set to ‘1’. When there is a negative crossover of MACD / ST and ADX is trending, then the signal and strategy are set to ‘-1’.

When there is no crossover, the signal is set to ‘0’, but strategy variable is used to decide whether to continue the trade or close the trade (no matter buy or sell). This strategy variable is set to continue the same value from the previous day when the ADX is above the threshold (trending). If not trending, the strategy value is set to 0, thus giving a complete signal to close all the open trades.
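The signal and strategy rules above can be sketched as a small function. The threshold of 25 is an assumed value (the text only says the threshold is pre-determined), the function name is hypothetical, and a crossover that occurs while ADX is not trending is treated here as no signal, which is an interpretation of the rules rather than something the text states outright.

```python
def signal_and_strategy(crossover, adx, threshold=25.0):
    """crossover: +1 / -1 / 0 per day; adx: ADX value per day (already lagged by the caller).

    Returns (signals, strategies) following the rules described in the text.
    """
    signals, strategies = [], []
    previous_strategy = 0
    for cross, strength in zip(crossover, adx):
        trending = strength > threshold
        if cross != 0 and trending:
            signal, strategy = cross, cross          # fresh crossover while trending
        elif trending:
            signal, strategy = 0, previous_strategy  # no crossover: carry the trade forward
        else:
            signal, strategy = 0, 0                  # not trending: close everything
        signals.append(signal)
        strategies.append(strategy)
        previous_strategy = strategy
    return signals, strategies


sig, strat = signal_and_strategy([1, 0, 0, -1, 0, 1], [30, 30, 20, 30, 30, 10])
print(sig)    # [1, 0, 0, -1, 0, 0]
print(strat)  # [1, 1, 0, -1, -1, 0]
```

The same function can be run once with MACD crossovers and once with SuperTrend crossovers to produce the macdsig/macdstr and supersig/superstr pairs.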

The daily return is calculated for the MACD crossover and the ST crossover separately. Based on the analysis, and to get near-to-accurate returns, I have used ‘Adjusted Close’ and ‘Close’ data alternately based on the daily activity.

After all the data preparatory work is completed, the actual processing of the strategy and signal will be implemented.

I have included a column in the dataframe for output with the commission, to help find the profit post-commission.  I have considered 1% for the first trade and 2% when there is a counter trade.

Here the processing of the signals and strategy is implemented. MACD and ST are separately processed, but the same logic is used. While processing, separate columns in the dataframe are used to calculate the accurate profit/loss.

The logic used is:

  • When the signal and the strategy are the same and positive, BUY.
    • If the running trade is SOLD, then cover and buy.
    • If not, just buy.
  • When the signal and the strategy are the same and negative, SELL.
    • If the running trade is BOUGHT, then sell and go short.
    • If not, just go short.
  • When the signal and the strategy are both 0 (no signal), close the open positions.
    • If the running trade is BOUGHT, then sell.
    • If it is SOLD, then cover.
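The order logic above can be expressed as a small state transition. This is a sketch with hypothetical names: positions are encoded as 1 (BOUGHT), -1 (SOLD), and 0 (flat), and a repeated signal in the direction already held is treated as a no-op, which is an assumption the text does not cover.

```python
def next_action(signal, strategy, position):
    """Return (actions, new_position) for one day.

    position: 1 = running trade is BOUGHT, -1 = SOLD (short), 0 = no open trade.
    """
    if signal == 1 and strategy == 1:
        if position == -1:
            return ["COVER", "BUY"], 1
        if position == 0:
            return ["BUY"], 1
        return [], 1                      # already long: nothing to do (assumption)
    if signal == -1 and strategy == -1:
        if position == 1:
            return ["SELL", "SHORT"], -1
        if position == 0:
            return ["SHORT"], -1
        return [], -1                     # already short: nothing to do (assumption)
    if signal == 0 and strategy == 0:
        if position == 1:
            return ["SELL"], 0
        if position == -1:
            return ["COVER"], 0
        return [], 0
    return [], position                   # no signal but strategy still active: hold


print(next_action(1, 1, -1))   # (['COVER', 'BUY'], 1)
print(next_action(0, 0, 1))    # (['SELL'], 0)
```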

All the other calculations are done at each row level and stored in additional columns in the dataframe – Cumulative returns, Annualized returns, Annualized standard deviation, Annualized Sharpe ratio. These are calculated for MACD and ST separately.
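A minimal sketch of these summary measures, computed from a list of simple daily returns, is shown below. The 252 trading days per year, the zero risk-free rate, and the function name `performance` are assumptions for illustration; the project file may define these quantities differently.

```python
import math


def performance(daily_returns, periods_per_year=252, risk_free_rate=0.0):
    """Summarise simple daily returns (needs at least two observations)."""
    n = len(daily_returns)
    growth = 1.0
    for r in daily_returns:
        growth *= 1.0 + r                       # compound the daily returns
    mean = sum(daily_returns) / n
    variance = sum((r - mean) ** 2 for r in daily_returns) / (n - 1)
    annualized_return = growth ** (periods_per_year / n) - 1.0
    annualized_std = math.sqrt(variance * periods_per_year)
    if annualized_std > 0:
        sharpe = (annualized_return - risk_free_rate) / annualized_std
    else:
        sharpe = float("nan")
    return {
        "cumulative_return": growth - 1.0,
        "annualized_return": annualized_return,
        "annualized_std": annualized_std,
        "sharpe": sharpe,
    }


stats = performance([0.01, -0.01, 0.02, 0.0])   # hypothetical daily returns
print(round(stats["cumulative_return"], 6))     # 0.019898
```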

The combined calculation:

Print the same on the console or file depending on where it is routed.

Additionally, the trade success details are also calculated for further analysis of the strategy – CAGR, Success ratio of trades, and Average profit to loss.
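These trade-level measures can be sketched as follows, assuming a list of per-trade profit and loss figures plus the start value, end value, and length of the backtest in years. The definitions used (CAGR from start and end value, success ratio as winning trades over all trades, average profit divided by average loss) are common conventions assumed here, and the function name is hypothetical.

```python
def trade_statistics(trade_pnls, start_value, end_value, years):
    """Return CAGR, success ratio, and average profit to average loss."""
    wins = [p for p in trade_pnls if p > 0]
    losses = [p for p in trade_pnls if p < 0]
    cagr = (end_value / start_value) ** (1.0 / years) - 1.0
    success_ratio = len(wins) / len(trade_pnls) if trade_pnls else None
    if wins and losses:
        avg_profit_to_loss = (sum(wins) / len(wins)) / abs(sum(losses) / len(losses))
    else:
        avg_profit_to_loss = None           # undefined without both wins and losses
    return {
        "cagr": cagr,
        "success_ratio": success_ratio,
        "avg_profit_to_loss": avg_profit_to_loss,
    }


# Hypothetical example: four trades, 100 growing to 121 over two years
summary = trade_statistics([10, -5, 20, -5], 100.0, 121.0, 2)
print(summary["success_ratio"], summary["avg_profit_to_loss"])  # 0.5 3.0
```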

To print out the data:

To plot the chart using Matplotlib:

It took a lot of time to complete the strategy as
(1) I was super busy at work
(2) Had to undergo a cornea transplant surgery.

I was frequently in touch with my mentor Abhishek, sharing the work-in-progress code. He has been kind enough to give feedback regularly. The Quantinsti™ team was encouraging and provided guidance with various sample projects.

I do not think this experience as a blog will be complete without showing the output.

Going through this project and coding this strategy was a great learning experience. Stack Overflow and the pandas documentation were my favorite websites for technical know-how. I have tried to keep the code simple and straightforward. I have optimized the code and improved it by reducing the looping over the dataframe to the minimum. I have kept the code dynamic, so that any code written in between to include new columns in the dataframe will not affect the other parts of the program.

I am sure the Python Pundits can polish the code further. I appreciate your time reading this strategy. If any of you find a simpler way to do what I have done, please feel free to write to me.

Next Step

If you want to learn various aspects of Algorithmic trading then check out the Executive Programme in Algorithmic Trading (EPAT™). The course covers training modules like Statistics & Econometrics, Financial Computing & Technology, and Algorithmic & Quantitative Trading. EPAT™ equips you with the required skill sets to build a promising career in algorithmic trading. Enroll now!




Forecasting Stock Returns using ARIMA model

By Milind Paradkar

“Prediction is very difficult, especially about the future”. Many of you must have come across this famous quote by Niels Bohr, a Danish physicist. Prediction is the theme of this blog post. In this post, we will cover the popular ARIMA forecasting model to predict returns on a stock and demonstrate a step-by-step process of ARIMA modeling using R programming.

What is a forecasting model in Time Series?

Forecasting involves predicting values for a variable using its historical data points, or it can also involve predicting the change in one variable given the change in the value of another variable. Forecasting approaches are primarily categorized into qualitative forecasting and quantitative forecasting. Time series forecasting falls under the category of quantitative forecasting, wherein statistical principles and concepts are applied to the historical data of a variable to forecast the future values of the same variable. Some time series forecasting techniques used include:

  • Autoregressive Models (AR)
  • Moving Average Models (MA)
  • Seasonal Regression Models
  • Distributed Lags Models


What is Autoregressive Integrated Moving Average (ARIMA)?

ARIMA stands for Autoregressive Integrated Moving Average. ARIMA is also known as the Box-Jenkins approach. Box and Jenkins claimed that non-stationary data can be made stationary by differencing the series, Yt. The general model for Yt is written as,

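$$Y_t = \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \dots + \phi_p Y_{t-p} + \epsilon_t + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} + \dots + \theta_q \epsilon_{t-q}$$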

Where, Yt is the differenced time series value, ϕ and θ are unknown parameters and ϵ are independent identically distributed error terms with zero mean. Here, Yt is expressed in terms of its past values and the current and past values of error terms.

The ARIMA model combines three basic methods:

  • AutoRegression (AR) – In auto-regression the values of a given time series data are regressed on their own lagged values, which is indicated by the “p” value in the model.
  • Differencing (I-for Integrated) – This involves differencing the time series data to remove the trend and convert a non-stationary time series to a stationary one. This is indicated by the “d” value in the model. If d = 1, it looks at the difference between two time series entries, if d = 2 it looks at the differences of the differences obtained at d =1, and so forth.
  • Moving Average (MA) – The moving average nature of the model is represented by the “q” value which is the number of lagged values of the error term.

This model is called Autoregressive Integrated Moving Average or ARIMA(p,d,q) of Yt.  We will follow the steps enumerated below to build our model.

Step 1: Testing and Ensuring Stationarity

To model a time series with the Box-Jenkins approach, the series has to be stationary. A stationary time series means a time series without trend, one having a constant mean and variance over time, which makes it easy for predicting values.

Testing for stationarity – We test for stationarity using the Augmented Dickey-Fuller (ADF) unit root test. The null hypothesis of the ADF test is that the series has a unit root, that is, it is non-stationary. If the p-value from the test is less than 0.05 (5%), we reject the null hypothesis and treat the time series as stationary. If the p-value is greater than 0.05, we cannot reject the presence of a unit root and treat the series as a non-stationary process.

Differencing – To convert a non-stationary process to a stationary process, we apply the differencing method. Differencing a time series means finding the differences between consecutive values of a time series data. The differenced values form a new time series dataset which can be tested to uncover new correlations or other interesting statistical properties.

We can apply the differencing method consecutively more than once, giving rise to the “first differences”, “second order differences”, etc.
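As a small illustration of first and second order differences, here is a sketch in plain Python (the function name `difference` is hypothetical; in R the same operation is available through diff()).

```python
def difference(series, d=1):
    """Apply first differencing d times; each pass shortens the series by one."""
    out = list(series)
    for _ in range(d):
        out = [later - earlier for earlier, later in zip(out, out[1:])]
    return out


squares = [1, 4, 9, 16, 25]          # a series with a clear non-linear trend
print(difference(squares, d=1))      # [3, 5, 7, 9]  (first differences still trend)
print(difference(squares, d=2))      # [2, 2, 2]     (second differences are constant)
```

Here the trend disappears only after differencing twice, which is the situation where d = 2 would be chosen.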

We apply the appropriate differencing order (d) to make a time series stationary before we can proceed to the next step.

Step 2: Identification of p and q

In this step, we identify the appropriate order of Autoregressive (AR) and Moving average (MA) processes by using the Autocorrelation function (ACF) and Partial Autocorrelation function (PACF).  Please refer to our blog, “Starting out with Time Series” for an explanation of ACF and PACF functions.

Identifying the p order of AR model

For AR models, the ACF will dampen exponentially and the PACF will be used to identify the order (p) of the AR model. If we have one significant spike at lag 1 on the PACF, then we have an AR model of the order 1, i.e. AR(1). If we have significant spikes at lag 1, 2, and 3 on the PACF, then we have an AR model of the order 3, i.e. AR(3).

Identifying the q order of MA model

For MA models, the PACF will dampen exponentially and the ACF plot will be used to identify the order of the MA process. If we have one significant spike at lag 1 on the ACF, then we have an MA model of the order 1, i.e. MA(1). If we have significant spikes at lag 1, 2, and 3 on the ACF, then we have an MA model of the order 3, i.e. MA(3).
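The spike-counting rule can be sketched as a helper that counts how many leading lags exceed a significance bound. The bound of 1.96 divided by the square root of the number of observations is the usual approximate 95% band drawn on ACF and PACF plots; it is an assumption here, as is the function name. Reading the plots by eye remains the procedure the text describes.

```python
import math


def suggested_order(correlations, n_obs, z=1.96):
    """Count consecutive significant spikes starting at lag 1.

    correlations: ACF (for q) or PACF (for p) values at lag 1, 2, 3, ...
    n_obs: number of observations in the series used to compute them.
    """
    bound = z / math.sqrt(n_obs)
    order = 0
    for value in correlations:
        if abs(value) > bound:
            order += 1
        else:
            break                    # stop at the first insignificant lag
    return order


# Hypothetical PACF values from a series of 100 observations (bound is 0.196)
print(suggested_order([0.5, 0.3, 0.25, 0.05, 0.3], n_obs=100))  # 3, suggesting AR(3)
```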

Step 3: Estimation and Forecasting

Once we have determined the parameters (p,d,q), we fit the ARIMA model on a training data set and then use the fitted model to forecast the values of the test data set using a forecasting function. In the end, we cross-check whether our forecasted values are in line with the actual values.

Building ARIMA model using R programming

Now, let us follow the steps explained to build an ARIMA model in R. There are a number of packages available for time series analysis and forecasting. We load the relevant R package for time series analysis and pull the stock data from Yahoo Finance.

In the next step, we compute the logarithmic returns of the stock as we want the ARIMA model to forecast the log returns and not the stock price. We also plot the log return series using the plot function.
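For reference, the log return between two consecutive prices is the natural logarithm of their ratio. A minimal Python sketch with hypothetical prices follows (the post itself performs this step in R).

```python
import math


def log_returns(prices):
    """Log return for each consecutive pair of prices."""
    return [math.log(later / earlier) for earlier, later in zip(prices, prices[1:])]


prices = [100.0, 110.0, 99.0]        # hypothetical closing prices
returns = log_returns(prices)
# Log returns add up across time: their sum equals log(last price / first price)
print(abs(sum(returns) - math.log(99.0 / 100.0)) < 1e-12)
```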

Next, we call the ADF test on the returns series data to check for stationarity. The p-value of 0.01 from the ADF test tells us that the series is stationary. If the series were to be non-stationary, we would have first differenced the returns series to make it stationary.

In the next step, we fix a breakpoint which will be used to split the returns dataset into two parts further down the code.

We truncate the original returns series till the breakpoint, and call the ACF and PACF functions on this truncated series.

We can observe these plots and arrive at the Autoregressive (AR) order and Moving Average (MA) order.

We know that for AR models, the ACF will dampen exponentially and the PACF plot will be used to identify the order (p) of the AR model. For MA models, the PACF will dampen exponentially and the ACF plot will be used to identify the order (q) of the MA model. From these plots let us select AR order = 2 and MA order = 2. Thus, our ARIMA parameters will be (2,0,2).

Our objective is to forecast the entire returns series from breakpoint onwards. We will make use of the For Loop statement in R and within this loop we will forecast returns for each data point from the test dataset.

In the code given below, we first initialize a series which will store the actual returns and another series to store the forecasted returns.  In the For Loop, we first form the training dataset and the test dataset based on the dynamic breakpoint.

We call the arima function on the training dataset for which the order specified is (2, 0, 2). We use this fitted model to forecast the next data point by using the forecast.Arima function. The function is set at a 99% confidence level. One can use the confidence level argument to enhance the model. We will be using the forecasted point estimate from the model. The “h” argument in the forecast function indicates the number of values that we want to forecast, in this case, the next day's return.

We can use the summary function to confirm the results of the ARIMA model are within acceptable limits. In the last part, we append every forecasted return and the actual return to the forecasted returns series and the actual returns series respectively.

Before we move to the last part of the code, let us check the results of the ARIMA model for a sample data point from the test dataset.

From the coefficients obtained, the return equation can be written as:

Yt = 0.6072*Y(t-1)  – 0.8818*Y(t-2) – 0.5447ε(t-1) + 0.8972ε(t-2)
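To make the equation concrete, the following sketch plugs past returns and past residuals into it. The coefficients are the ones quoted above; the input values are hypothetical and are not taken from the post's data, so the result is not the forecast reported in the output. Any intercept term is left out, as in the equation.

```python
def arma_one_step(ar_coefs, ma_coefs, past_values, past_errors):
    """One-step forecast: sum of AR terms on past values plus MA terms on past errors.

    past_values and past_errors are ordered most recent first, i.e.
    past_values[0] is Y(t-1) and past_errors[0] is the residual at t-1.
    """
    ar_part = sum(c * y for c, y in zip(ar_coefs, past_values))
    ma_part = sum(c * e for c, e in zip(ma_coefs, past_errors))
    return ar_part + ma_part


forecast = arma_one_step(
    ar_coefs=[0.6072, -0.8818],      # coefficients from the fitted model above
    ma_coefs=[-0.5447, 0.8972],
    past_values=[0.01, -0.02],       # hypothetical Y(t-1), Y(t-2)
    past_errors=[0.005, -0.004],     # hypothetical residuals at t-1, t-2
)
print(round(forecast, 7))            # 0.0173957
```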

The standard errors of the coefficients are also reported, and these need to be within acceptable limits. The Akaike information criterion (AIC) score is useful for comparing candidate ARIMA models fitted to the same data: the lower the AIC score, the better the model. We can also view the ACF plot of the residuals; a good ARIMA model will have its residual autocorrelations below the threshold limit. The forecasted point return is -0.001326978, which is given in the last row of the output.


Let us check the accuracy of the ARIMA model by comparing the forecasted returns versus the actual returns. The last part of the code computes this accuracy information.

If the sign of the forecasted return equals the sign of the actual return, we assign it a positive accuracy score. The accuracy percentage of the model comes to around 55%, which looks like a decent number. One can try running the model for other possible combinations of (p,d,q) or instead use the auto.arima function, which selects the parameters automatically.
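The sign-based accuracy measure can be sketched as below. The input values are hypothetical, the function names are illustrative, and a zero return is treated as its own sign, which is an assumption.

```python
def sign(x):
    """Return 1, -1, or 0 according to the sign of x."""
    return (x > 0) - (x < 0)


def directional_accuracy(forecasts, actuals):
    """Percentage of periods where forecast and actual return share the same sign."""
    hits = sum(1 for f, a in zip(forecasts, actuals) if sign(f) == sign(a))
    return 100.0 * hits / len(actuals)


forecasts = [0.01, -0.02, 0.03, -0.01]   # hypothetical forecasted returns
actuals = [0.02, 0.01, 0.01, -0.03]      # hypothetical realised returns
print(directional_accuracy(forecasts, actuals))  # 75.0
```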


To conclude, in this post we covered the ARIMA model and applied it to forecasting stock price returns using the R programming language. We also cross-checked our forecasted results against the actual returns. In our upcoming posts, we will cover other time series forecasting techniques and try them in Python/R programming languages.


Using IBrokers package to implement R in Interactive Brokers API

By Milind Paradkar

In our previous article, we covered IBridgePy written by Dr. Hui Liu. IBridgePy is a wrapper for Interactive Brokers’ C++ API that allows one to trade in Interactive Brokers (IB) using Python. The article covered the key topics from the webinar, “Trading with Interactive Brokers using Python”, which was conducted by Dr. Hui Liu and hosted by QuantInsti™. You can watch the recording of Dr. Hui Liu’s webinar here.

Another popular programming language that is widely used by algorithmic developers for data analysis, coding strategies, and backtesting is R. Interactive Brokers, in collaboration with QuantInsti™, hosted a webinar, “Trading using R on Interactive Brokers”, which was held on 2nd March 2017 at 10:00 AM ET.  The webinar was conducted by Anil Yadav, Director at QuantInsti™. Anil is also an Algo strategy advisor at iRageCapital, one of the leading HFT firms in India, and has managed a portfolio of equity futures using R and Interactive Brokers. The webinar was very well received by the audience and it covered all the relevant topics to get one trading in live markets using R!

The webinar covered the following topics:

  • Installing R-studio IDE
  • Reference sheet for the IBrokers Package
  • TWS configuration
  • Understanding the structure – CallBack, eWrapper, ProcessMessage
  • Viewing account information details in R
  • Downloading historical data in R
  • Printing real-time data on R console
  • Sending predefined order using R script
  • Sending event based order using R script

The webinar includes a sample strategy and also mentions the warnings and pitfalls that a developer needs to be aware of when trading on Interactive Brokers using R.

This post gives a brief on the IBrokers package for all the R coders/enthusiasts, Interactive Brokers’ clients, and wannabe traders.

About the Interactive Brokers’ APIs

Before we cover the IBrokers package, here is a short brief on the application program interfaces (APIs) offered by Interactive Brokers (IB) for trading programmatically. IB is an international brokerage firm which specializes in electronic execution in products ranging from equities to bonds, options to futures, and Forex, all from a single account. Trader Workstation (TWS) is Interactive Brokers’ widely used desktop trading platform. Interactive Brokers provides its API in several languages and technologies (Java, .NET, C++, ActiveX, DDE), which can be used to link to one’s system and trade on an IB account. The API allows you to connect through either the TWS or the IB Gateway.

  1. Connecting through the TWS requires that you have the application running, but also allows you to test and confirm that your API orders are working correctly.
  2. Connecting through the IB Gateway allows you to use the API without a large graphical user interface (GUI) application running, but does not provide an interface for you to test and confirm API activity.

Some of the key uses of the API include:

  1. To retrieve real-time data from the TWS
  2. To programmatically execute the orders (view, modify, and submit orders to be executed)
  3. Access to account information, connection status.
  4. Access to contract details and news bulletins from the TWS.


Exploring the IBrokers package

The IBrokers package, authored by Jeffrey Ryan, is a pure R implementation of the TWS API and lets you trade in Interactive Brokers using R. We will cover some of the important functions from the package in this post.

To install the IBrokers package, follow the standard installation function in R, i.e. install.packages().


Some important functions from IBrokers package:

twsConnect Function

This function is used to establish, check or terminate a connection to TWS. The function returns a twsConnection object for use in subsequent TWS API calls.


twsConnectionTime Function

This function provides the time when the connection to the TWS was made.


reqAccountUpdates Function

This function is used to request and view account details from Interactive Brokers.



twsContract Function

This function creates, tests, or coerces a twsContract for use in API calls. The value returned by the function is a twsContract object.


Similarly, there is a twsCurrency function which is a wrapper to twsContract to make ‘currency/FX’ contracts easier to specify.

reqMktData Function

This function allows for streaming market data to be handled in R.








reqHistoricalData Function

This function is used to request historical data from TWS.





placeOrder Function

This function is used to place or cancel an order to the TWS.


These were some of the key functions from the IBrokers package. One can formulate a strategy using other statistical packages in R and then use the functions from the IBrokers package to retrieve market data, view account info, and execute/modify orders via R.


Starting Out with Time Series

Time series analysis and forecasting find wide usage in the financial markets across assets like stocks, F&O, Forex, and Commodities. As such, it becomes pertinent for aspiring quants to have sound knowledge in time series forecasting. In this post, we will introduce the basic concepts of time series and illustrate how to create time series plots and analysis in R programming language.

Time series defined

A time series is a sequence of observations over time, which are usually spaced at regular intervals of time. For example:

  • Daily stock prices for the last 5 years
  • 1-minute stock price data for the last 90 days
  • Quarterly revenues of a company over the last 10 years
  • Monthly car sales of an automaker for the last 3 years
  • Annual unemployment rate of a state in the last 50 years

Univariate time series and Multivariate time series

A univariate time series refers to the set of observations over time of a single variable. Correspondingly, a multivariate time series refers to the set of observations over time of several variables.

Time Series Analysis and Forecasting

In time series analysis, the objective is to apply/develop models which are able to describe the given time series with a fair amount of accuracy. On the other hand, time series forecasting involves forecasting the future values of a given time series using the past observed values. There are various models that are used for forecasting and the viability of a particular model used for forecasting is determined by its performance at predicting the future values.

Some examples of time series forecasting:

  • Forecasting the closing price of a stock every day
  • Forecasting the quarterly revenues of a company
  • Forecasting the monthly number of cars sold.

Plotting a time series

A plot of a time series data gives a clear picture of the spread over the given time period. It becomes easy for a human eye to detect any seasonality or abnormality in a given time series.


Plotting a time series in R

To plot a time series in R, we first need to read the data in R. If the data is available in a CSV file or in an Excel file, we can read the data in R using the function or the read.xlsx() function respectively. Once the data has been read, we can create a time series plot by using the plot.ts() function. See the example given below.

We will use the time series data set from the Time Series Data Library (TSDL) created by Rob Hyndman. We will plot the monthly closings of the Dow-Jones industrial index, Aug. 1968 – Oct. 1992. Save the dataset in your current R working directory with the name monthly-closings-of-the-dowjones.csv


Decomposing time series

A time series generally consists of a trend component and an irregular (noise) component, and can also have a seasonal component in the case of a seasonal time series. Decomposing a time series means separating the original time series into these components.

Trend – The increasing or decreasing values in a given time series.

Seasonal – The repeating cycle over a specific period (day, week, month, etc.) in a given time series.

Irregular (Noise) – The random irregularity of values in a given time series.

Why do we need to decompose a time series?

As mentioned in the above paragraph, a time series might include a seasonal component or an irregular component. In such a case, we would not get a true picture of the trending property of the time series. Hence, we need to separate out the seasonality effect and/or the noise which will give us a clear picture, and help in further analysis.

How do we decompose a time series?

There are two structures which can be used for decomposing a given time series.

  1. Additive decomposition – If the seasonal variation is relatively constant over time, we can use the additive structure for decomposing a given time series. The additive structure is given as –

Xt = Trend + Random + Seasonal

  2. Multiplicative decomposition – If the seasonal variation is increasing over time, we can use the multiplicative structure for decomposing a time series. The multiplicative structure is given as –

Xt = Trend * Random * Seasonal

Decomposing a time series in R

To decompose a non-seasonal time series in R, we can use a smoothing method for calculating the moving average of a given time series. We can use the SMA() function from the TTR package to smooth out the time series.
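The smoothing idea can be illustrated with a plain Python simple moving average. This is a sketch of the technique only, with a hypothetical function name; it leaves the first positions empty until a full window is available.

```python
def simple_moving_average(series, n):
    """Trailing n-period average; positions without a full window are None."""
    out = [None] * len(series)
    for i in range(n - 1, len(series)):
        window = series[i - n + 1:i + 1]
        out[i] = sum(window) / n
    return out


print(simple_moving_average([1, 2, 3, 4, 5], 3))  # [None, None, 2.0, 3.0, 4.0]
```

A larger window gives a smoother estimate of the trend at the cost of reacting more slowly to changes.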

To decompose a seasonal time series in R, we can use the decompose() function. This function estimates the trend, seasonal, and irregular (noise) components of a given time series. The decompose function is given as –

decompose(x, type = c("additive", "multiplicative"), filter = NULL)

x – A time series
type – The type of seasonal component. Can be abbreviated
filter – A vector of filter coefficients in reverse time order (as for AR or MA coefficients), used for filtering out the seasonal component. If NULL, a moving average with the symmetric window is performed.

When we use the decompose function, we specify through the type argument whether the decomposition is additive or multiplicative.
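To show what an additive decomposition involves, here is a minimal Python sketch of the classical moving-average approach: estimate the trend with a centred moving average, average the detrended values by position in the cycle to get the seasonal component, and treat what is left as the irregular component. It is an illustration under these assumptions, with a hypothetical function name, and is not guaranteed to reproduce the output of R's decompose() exactly.

```python
def decompose_additive(series, period):
    """Return (trend, seasonal, remainder); needs at least two full periods of data."""
    n = len(series)
    half = period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = series[t - half:t + half + 1]
        if period % 2 == 1:
            trend[t] = sum(window) / period
        else:
            # Even period: half weight on the two end points of the window
            trend[t] = (0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]) / period
    detrended = [None if trend[t] is None else series[t] - trend[t] for t in range(n)]
    means = []
    for k in range(period):
        values = [detrended[t] for t in range(k, n, period) if detrended[t] is not None]
        means.append(sum(values) / len(values))
    centre = sum(means) / period              # make the seasonal figures sum to zero
    figures = [m - centre for m in means]
    seasonal = [figures[t % period] for t in range(n)]
    remainder = [None if trend[t] is None else series[t] - trend[t] - seasonal[t]
                 for t in range(n)]
    return trend, seasonal, remainder


# Hypothetical series: linear trend t plus a two-step seasonal pattern (+1, -1)
series = [1.0, 0.0, 3.0, 2.0, 5.0, 4.0, 7.0, 6.0]
trend, seasonal, remainder = decompose_additive(series, period=2)
print(trend[1:7])      # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(seasonal[:2])    # [1.0, -1.0]
```

For a multiplicative structure, the same steps apply with division in place of subtraction.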


Stationary and non-stationary time series

A stationary time series is one where the mean and the variance are both constant over time, or one whose properties do not depend on the time at which the series is observed. Thus, the time series is a flat series without trend, with constant variance over time, a constant mean, a constant autocorrelation, and no seasonality. This makes a stationary time series easy to predict. On the other hand, a non-stationary time series is one where either the mean or the variance or both are not constant over time.

There are different tests that we can use to check whether a given time series is stationary. These include the Autocorrelation function (ACF), Partial autocorrelation function (PACF), Ljung-Box test, Augmented Dickey–Fuller (ADF) t-statistic test, and the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test.

Let us test our sample time series with the Autocorrelation function (ACF), Partial autocorrelation function (PACF) to check if it is stationary.

Autocorrelation function (ACF) – The autocorrelation function checks for correlation between two different data points of a time series separated by a lag “h”. For example, for lag 1, the ACF will check for correlation between points #1 and #2, #2 and #3, etc. Similarly, for lag 3, the ACF function will check between points #1 and #4, #2 and #5, #3 and #6, etc.

R code for ACF –


Partial autocorrelation function (PACF) – In some cases, the effect of autocorrelation at smaller lags will have an influence on the estimate of autocorrelation at longer lags. For example, a strong lag one autocorrelation can cause an autocorrelation with lag three. The Partial Autocorrelation Function (PACF) removes the effect of shorter lag autocorrelation from the correlation estimate at longer lags.

R code for PACF


The values of ACF and PACF each vary between plus and minus one. When the values are closer to plus or minus one, it indicates a strong correlation. If the time series is stationary, the ACF will drop to zero relatively quickly, while the ACF of a non-stationary time series will decrease slowly. From the ACF graph, we can conclude that the given time series is non-stationary.
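To make the two functions concrete, the sketch below computes the sample ACF directly from its definition and derives the PACF from it with the Durbin-Levinson recursion. It is a plain Python illustration with hypothetical function names, not a replacement for the acf() and pacf() functions in R, whose plots also include significance bands.

```python
def acf(series, max_lag):
    """Sample autocorrelation at lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    out = []
    for h in range(max_lag + 1):
        num = sum((series[t] - mean) * (series[t + h] - mean) for t in range(n - h))
        out.append(num / denom)
    return out


def pacf(series, max_lag):
    """Partial autocorrelation at lags 0..max_lag via the Durbin-Levinson recursion."""
    rho = acf(series, max_lag)
    out = [1.0]
    phi_prev = []                       # coefficients of the previous, smaller AR fit
    for k in range(1, max_lag + 1):
        if k == 1:
            phi_kk = rho[1]
            phi = [phi_kk]
        else:
            num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
            den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
            phi_kk = num / den
            phi = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)]
            phi.append(phi_kk)
        out.append(phi_kk)
        phi_prev = phi
    return out


data = [1.0, 2.0, 3.0, 4.0, 5.0]        # hypothetical short series
print([round(v, 4) for v in acf(data, 2)])   # [1.0, 0.4, -0.1]
print([round(v, 4) for v in pacf(data, 2)])  # [1.0, 0.4, -0.3095]
```

The lag 2 partial autocorrelation differs from the lag 2 autocorrelation because the influence of lag 1 has been removed.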


In this post, we gave an overview of time series, plotting time series data, and decomposition of a time series into its constituent components using the R programming language. We were also introduced to the concept of stationary and non-stationary time series and the tests which can be carried out to check if a given time series is stationary. In our upcoming post, we will continue with the concept of stationary time series and see how to convert a non-stationary time series into a stationary time series.


[WEBINAR] How to Use Financial Market Data for Fundamental and Quantitative Analysis

QuantInsti will be hosting a one-of-a-kind webinar with three leading experts from across the globe. Register for the webinar to learn to trade fundamentals profitably, understand the challenges surrounding high-frequency data analysis, discover the opportunities and gotchas in futures trading, and view a live demonstration of a step-by-step tutorial on one of the most popular trading strategies, the pairs trading strategy!

Don’t miss out on this opportunity to learn from the market practitioners themselves.



Decoding the Black Box running Trading Systems


What is a Trading System?

A “trading system”, more commonly referred to as a “trading strategy”, is nothing but a set of rules which, when applied to the given input data, generate entry and exit signals (buy/sell).

Although formulating a trading strategy seems like an easy task, in reality, it is not! Creating a profitable trading strategy requires exhaustive quantitative research, and the brains behind a quantitative trading strategy are known as “Quants” in the algorithmic trading world. We can define a quant as a professional employed by a quantitative trading firm who applies advanced mathematical and statistical models with the sole objective to create an alpha-seeking strategy.

By an alpha-seeking strategy, we mean a profitable trading strategy that can consistently generate returns that are independent of the direction of the overall market.

For those outside the algorithmic trading world, the work of quants and the quantitative trading strategies appear opaque and complex, hence the term “Black Box” Trading. In this post, we will attempt to unravel the black box, and try to decipher the mechanics of black box trading.

How do Trading Systems operate?

Any trading system, conceptually, is nothing more than a computational block that interacts with the exchange on two different streams.

  1. Receives market data
  2. Sends order requests and receives replies from the exchange.

The market data received typically informs the system of the latest order book. It might contain additional information such as the volume traded so far and the last traded price and quantity for a scrip. However, to make a decision on the data, the trader might need to look at old values or derive certain parameters from history. To cater to that, a conventional system has a historical database to store the market data, along with tools to use that database. The analysis also involves a study of the trader’s past trades, so a second database stores the trading decisions. Last but not least, a GUI lets the trader view all this information on the screen.

The entire trading system can now be broken down into:

  • The exchange(s) – the external world
  • The server
    • Market Data receiver
    • Store market data
    • Store orders generated by the user
  • Application
    • Take inputs from the user including the trading decisions
    • Interface for viewing the information including the data and orders
    • An order manager sending orders to the exchange
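
The breakdown above can be sketched in a few lines of Python. This is only an illustrative sketch, not a production design: the class names (`Tick`, `Order`, `Server`, `FakeExchange`, `Application`) are hypothetical, and the exchange is replaced by an in-memory stand-in that simply acknowledges every order.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Tick:
    symbol: str
    last_price: float
    last_qty: int


@dataclass
class Order:
    symbol: str
    side: str  # "BUY" or "SELL"
    qty: int
    price: float


@dataclass
class Server:
    """Receives market data and stores both the data and the orders."""
    ticks: Dict[str, List[Tick]] = field(default_factory=dict)
    orders: List[Order] = field(default_factory=list)

    def on_market_data(self, tick: Tick) -> None:
        # Historical store: keep every tick, grouped by symbol.
        self.ticks.setdefault(tick.symbol, []).append(tick)

    def store_order(self, order: Order) -> None:
        self.orders.append(order)

    def price_history(self, symbol: str) -> List[float]:
        return [t.last_price for t in self.ticks.get(symbol, [])]


class FakeExchange:
    """Hypothetical stand-in for the external world; acknowledges everything."""

    def __init__(self) -> None:
        self.received: List[Order] = []

    def send(self, order: Order) -> str:
        self.received.append(order)
        return "ACK"


class Application:
    """Takes the trader's decision, records it, and routes it to the exchange."""

    def __init__(self, server: Server, exchange: FakeExchange) -> None:
        self.server = server
        self.exchange = exchange

    def submit(self, order: Order) -> str:
        self.server.store_order(order)    # store the trading decision
        return self.exchange.send(order)  # order manager sends it out
```

Keeping the stores on the server side and the order routing in the application mirrors the split in the list; a real system would add persistence, a viewing interface, and handling of exchange replies other than a plain acknowledgement.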

What you call a Trading System is actually a CEP System

CEP stands for Complex Event Processing. This lengthy term may sound convoluted, but once you understand complex events and the components that make up a CEP system, you will appreciate this clear-box system.

A complex event is nothing but a set of incoming events, such as stock trends, market movements, and news. Complex event processing means performing computational operations on complex events in a short time. The operations can include detecting complex patterns and building correlations and relationships, such as causality and timing, between many incoming events.

CEP systems process events in real time and this is a key feature of a CEP system. The faster the processing of events, the better a CEP system is. For example, if a trading system is designed to detect a profit-making opportunity for the next 1 second, but the time taken by the CEP system exceeds this threshold, then the trading system won’t be able to make any profits.

The CEP system comprises four parts: a CEP engine, CEP rules, CEP WS, and a CEP result interface. The two primary components of any CEP system are the CEP engine and the set of CEP rules. The CEP engine processes incoming events based on the CEP rules. These rules, and the events that go as input to the CEP engine, are determined by the trading system (trading strategy) applied.
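
To make the engine-and-rules idea concrete, here is a minimal sketch, assuming a rule is just a condition over the most recent events plus the signal to emit when the condition holds. The names `Event`, `Rule`, `CepEngine`, and `three_rising` are hypothetical, and the prices are made up for illustration.

```python
from collections import deque
from typing import Callable, Deque, List, NamedTuple


class Event(NamedTuple):
    symbol: str
    price: float


class Rule(NamedTuple):
    name: str
    condition: Callable[[List[Event]], bool]  # evaluated on recent events
    signal: str                               # emitted when the condition holds


class CepEngine:
    """Keeps a sliding window of events and evaluates every rule on each new event."""

    def __init__(self, rules: List[Rule], window: int = 3) -> None:
        self.rules = list(rules)
        self.window: Deque[Event] = deque(maxlen=window)

    def on_event(self, event: Event) -> List[str]:
        self.window.append(event)
        recent = list(self.window)
        return [rule.signal for rule in self.rules if rule.condition(recent)]


def three_rising(events: List[Event]) -> bool:
    """Pattern: three consecutive strictly rising prices."""
    prices = [e.price for e in events]
    return len(prices) == 3 and prices[0] < prices[1] < prices[2]


engine = CepEngine([Rule("momentum", three_rising, "BUY")], window=3)
# Illustrative prices only: the rule fires on the third event and not on the fourth.
signals = [engine.on_event(Event("XYZ", p)) for p in (100.0, 101.0, 102.5, 102.0)]
```

The engine itself knows nothing about trading; swapping the rule set changes the strategy, which is the separation the paragraph above describes.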

Most of a quant’s work is concentrated in this CEP system block. A quant spends most of the time formulating trading strategies and performing rigorous backtesting, optimization, and position-sizing, among other things. This is done to ensure the viability of the trading strategy in real markets. No single strategy can guarantee everlasting profits. Hence, quants are required to come up with new strategies on a regular basis to maintain an edge in the markets.

A number of popular trading systems are widely used in current markets, including momentum strategies, statistical arbitrage, and market making. See our blog on Algorithmic Trading Strategies, Paradigms and Modelling Ideas to know more about these trading systems.

Order Management in Automated Trading Systems

The signals generated by an algorithmic system can be executed either manually or in an automated way. When the signals are executed in an automated manner, we can call the entire system an “Automated trading system”. Automation of the orders is done by the “Order Manager” module. The order manager module comprises different execution strategies which execute the buy/sell orders based on a pre-defined logic. Popular execution strategies include VWAP (volume-weighted average price) and TWAP (time-weighted average price). Processes such as order routing, order encoding, and transmission also form part of this module. See our blog on Order Management System (OMS) to know more about these processes.
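
As a rough illustration of the two execution benchmarks named above, the sketch below splits a parent order into equal child orders (the simplest form of TWAP scheduling) and computes a volume-weighted average price from a list of trades. The function names `twap_slices` and `vwap` are hypothetical, and real execution algorithms handle timing, randomization, and market impact that this sketch ignores.

```python
from typing import List, Tuple


def twap_slices(total_qty: int, n_slices: int) -> List[int]:
    """Split a parent order into n near-equal child orders, one per time interval."""
    if n_slices <= 0:
        raise ValueError("n_slices must be positive")
    base, extra = divmod(total_qty, n_slices)
    # The first `extra` slices carry one additional unit so the total is preserved.
    return [base + (1 if i < extra else 0) for i in range(n_slices)]


def vwap(trades: List[Tuple[float, int]]) -> float:
    """Volume-weighted average price of (price, quantity) trades."""
    total_qty = sum(qty for _, qty in trades)
    if total_qty == 0:
        raise ValueError("no volume traded")
    return sum(price * qty for price, qty in trades) / total_qty
```

For example, `twap_slices(1000, 3)` yields three child orders whose quantities sum back to 1000, and `vwap` can be used afterwards to compare the achieved fill prices against the market’s volume-weighted price over the same period.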

Risk management in Automated Trading Systems

Since automated trading systems work without any human intervention, it becomes pertinent to have thorough risk checks to ensure that the trading systems perform as designed. The absence of risk checks, or faulty risk management, can lead to enormous irrecoverable losses for a quantitative firm, as seen in the past. Thus, a risk management system (RMS) forms a very critical component of any automated trading system. There are two places where risk management is handled in algo trading systems:

Within the application – We need to ensure that the trader does not set wrong parameters. The application should not allow a trader to set grossly incorrect values or make fat-finger errors.

Before generating an order in OMS – Before the order flows out of the system, we need to make sure it goes through some risk management system. This is where the most critical risk management check happens. See our blog on “Changing trends in trading risk management” to know more about risk management aspects and risk handling in an automated trading system.
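
A pre-trade check of the kind described above might look like the following sketch. The limit names and the three checks (order quantity, order value, and distance from a reference price) are assumptions chosen for illustration; the actual checks and thresholds depend on the firm and the exchange.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RiskLimits:
    max_order_qty: int          # largest quantity allowed in a single order
    max_order_value: float      # largest notional value (qty * price) allowed
    max_price_deviation: float  # allowed distance from reference price, as a fraction


def check_order(qty: int, price: float, reference_price: float,
                limits: RiskLimits) -> List[str]:
    """Return the reasons an order should be rejected; an empty list means it passes."""
    reasons = []
    if qty <= 0 or qty > limits.max_order_qty:
        reasons.append("quantity outside limit")
    if qty * price > limits.max_order_value:
        reasons.append("order value above limit")
    # Fat-finger guard: a price far from the last known price is likely a typo.
    if abs(price - reference_price) > limits.max_price_deviation * reference_price:
        reasons.append("price too far from reference")
    return reasons
```

The same function can serve both places mentioned above: the application can call it when the trader enters parameters, and the order manager can call it again just before an order leaves the system.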

High-Frequency Trading Systems

Building an automated trading system involves high costs and resources. Building such a system in-house may not be feasible for some quant firms. Such firms can opt for institutional automated trading platforms which allow for high-frequency trading, execution and order management across equities, foreign exchange, options, and futures. These platforms allow their clients to completely control and customize their proprietary algorithms while maintaining the confidentiality of their trading strategies.

Popular Automated Trading Systems

Building an entire automated trading system can be beyond the scope of an individual retail trader. Traders who want to explore the algorithmic way of trading can opt for automated trading systems that are available in the markets on a subscription basis. A trader can subscribe to these automated systems and use the algorithmic trading strategies that they make available to their users. We have highlighted some of the popular automated trading systems in our blog, “Top Algo Trading Platforms in India”. Traders who know programming can formulate and backtest their strategies in programming languages like Python and R.

Build Your Own Algorithmic Trading Systems

By now, you must have realized that “Black Box” trading is not as complex as it sounds. Wannabe traders can learn to build their own algorithmic trading strategies and trade profitably in the markets. The following steps can serve as a rough guideline for building an algorithmic trading strategy:

  1. Ideation or strategy hypothesis – Come up with a trading idea which you believe would be profitable in live markets. The idea can be based on your market observations or can be borrowed from trading books, research papers, trading blogs, trading forums or any other source.
  2. Get the required data – To test your idea you would require historical data. You can get this data from sites like Google Finance or Yahoo Finance, or from a paid data vendor.
  3. Strategy writing – Once you have the data, you can start coding your strategy, for which you can use tools like Excel, Python or R.
  4. Backtesting your strategy – Once coded, you need to test whether your trading idea gives good returns on the historical data. Backtesting would involve optimization of inputs, setting profit targets and stop loss, position-sizing etc.
  5. Paper trading your strategy – After the backtesting step, you need to paper trade your strategy first. This means testing your strategy on a simulator which simulates market conditions. Some brokers provide platforms for paper trading your strategy.
  6. Taking your strategy live – If the strategy is profitable after paper trading, you can take it live. You can open an account with a suitable broker that provides the algorithmic trading facility.
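
Steps 3 and 4 can be sketched with a toy example. The code below backtests a simple moving-average crossover rule in plain Python; the rule, the window lengths, and the function names `sma` and `backtest_crossover` are assumptions chosen for illustration, and any prices passed in would need to come from a real data source. It ignores transaction costs, slippage, and position-sizing.

```python
from typing import List, Optional


def sma(prices: List[float], window: int) -> List[Optional[float]]:
    """Simple moving average; None until `window` prices are available."""
    out: List[Optional[float]] = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(prices[i + 1 - window:i + 1]) / window)
    return out


def backtest_crossover(prices: List[float], fast: int = 2, slow: int = 3) -> float:
    """Hold one unit while the fast SMA is above the slow SMA, else stay flat.

    The signal computed at bar t is applied to the price change from t to t+1,
    so the backtest never uses information from the future.
    Returns the total profit or loss in price units.
    """
    fast_ma = sma(prices, fast)
    slow_ma = sma(prices, slow)
    pnl = 0.0
    for t in range(len(prices) - 1):
        if slow_ma[t] is None:
            continue  # not enough history yet
        if fast_ma[t] > slow_ma[t]:
            pnl += prices[t + 1] - prices[t]
    return pnl
```

Applying the signal only to the following bar is the important design choice here: a backtest that trades on the same bar used to compute the signal overstates performance.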

The number of exchanges that allow algorithmic trading for professional, as well as retail traders, has been growing with each passing year, and more and more traders are turning to algorithmic trading. We hope that this article was insightful for our readers and would encourage them to upgrade their way of trading. So what are you waiting for? Go Algo!!

Next Step

If you want to learn various aspects of Algorithmic trading then check out the Executive Programme in Algorithmic Trading (EPAT™). The course covers training modules like Statistics & Econometrics, Financial Computing & Technology, and Algorithmic & Quantitative Trading. EPAT™ equips you with the required skill sets to build a promising career in algorithmic trading. If you are interested in exploring self-paced trading courses you can also visit Quantra where we have listed short courses like “Getting started with Algorithmic Trading” and “Python for Trading”.



[WEBINAR] Trading in Live Markets using R

Trading using R on Interactive Brokers

The session will cover the following aspects:

  1. Installing RStudio IDE
  2. Reference sheet for the IBrokers package
  3. TWS configuration
  4. Viewing account information details in R
  5. Downloading historical data in R
  6. Printing real-time data on R console
  7. Sending predefined order using R script
  8. Sending event based order using R script

We will cover points 6-8 through a trading strategy.

