Architecture of IBrokers R Implementation in Interactive Brokers API

By Milind Paradkar

In our last post on using the IBrokers package, we introduced our readers to some of the basic functions from the IBrokers package which are used to retrieve market data, view account info, and execute/modify orders via R. This post will cover the structure of the IBrokers package, which will enable R users to build their custom trading strategies and get them executed via Interactive Brokers Trader Workstation (TWS).

Overview of the Interactive Brokers API Architecture

Before we explain the underlying structure of the IBrokers package, let us take an overview of the Interactive Brokers API architecture. Interactive Brokers provides its API program which can be run on Windows, Linux, and MacOS. The API makes a connection to the IB TWS. The TWS, in turn, is connected to the IB data centers and thus, all the communication is routed via the TWS.

The IBrokers R package enables a user to write a strategy in R and helps get it executed via the IB TWS. Below is the flow structure diagram.

Getting data from the TWS

To retrieve data from the IB TWS, the IBrokers R package includes five important functions.

  • reqContractDetails: retrieves detailed product information.
  • reqMktData: retrieves real-time market data.
  • reqMktDepth: retrieves real-time order book data.
  • reqRealTimeBars: retrieves real-time OHLC data.
  • reqHistoricalData: retrieves historical data.

In addition to these functions, there are helper functions which enable a user to create the above-mentioned data functions easily. These helper functions include:

  • twsContract: create a general Contract object.
  • twsEquity/twsSTK: wrapper to create equity Contract objects.
  • twsOption/twsOPT: wrapper to create option Contract objects.
  • twsFuture/twsFUT: wrapper to create futures Contract objects.
  • twsFutureOpt/twsFOP: wrapper to create futures options Contract objects.
  • twsCurrency/twsCASH: wrapper to create currency Contract objects.



Real-time Data Model Structure

When a data function is used to access market data streams, the data streams received by the TWS API follow a certain path which enables them to be bucketed into the relevant message type. Shown below is the list of arguments of the reqMktData function.

Example: Arguments of the reqMktData function

Source: Algorithmic Trading in R – Malcolm Sherrington


In the following sections, we will see how this data model works and how the arguments of the real-time data functions (e.g. reqMktData) can be customized to create user-defined automated trading programs in R.

Using the CALLBACK Argument

The data functions like reqMktData, reqMktDepth, and reqRealTimeBars all have a special CALLBACK argument. By default, this argument calls the twsCALLBACK function from the IBrokers package.

The general logic of the twsCALLBACK function is to receive the header to each incoming message from the TWS. This is then passed to the processMsg function, along with the eWrapper object. The eWrapper object can maintain state data (prices) and has functions for managing all incoming message types from the TWS. Once the processMsg call returns, another cycle of the infinite loop occurs.

In the example of the incoming messages shown below, we have circled a single message in green (1 6 1 4 140.76 1 0). The first digit (i.e. 1) is the header and the remaining numbers (i.e. 6 1 4 140.76 1 0) constitute the body of the message.
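The header/body split described above can be sketched in a few lines. This is only an illustration in Python of the printed form of the message; the function name split_message is hypothetical and is not part of the IBrokers package, and the sketch assumes the fields arrive as whitespace-separated text.

```python
def split_message(raw):
    """Split a whitespace-separated raw message into (header, body)."""
    fields = raw.split()
    if not fields:
        raise ValueError("empty message")
    # The first field identifies the message type; the rest is the payload
    return fields[0], fields[1:]


header, body = split_message("1 6 1 4 140.76 1 0")
print(header)  # 1
print(body)    # ['6', '1', '4', '140.76', '1', '0']
```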

Incoming messages from the reqMktData function call

Each message received will invoke the appropriately named eWrapper callback, depending on the message type. By default when nothing is specified, the code will call the default method for printing the results to the screen via cat.

Example with Default Method:




Setting CALLBACK = NULL will send raw message-level data to cat, which in turn will use the file argument of that function to either return the data to the standard output or redirect it via an open connection, a file, or a pipe.

Example with CALLBACK argument set to NULL:



Callbacks, via CALLBACK and eventWrapper, are designed to allow for R level processing of the real-time data stream. Callback helps to customize the output (i.e. incoming results) which can be used to create automated trading programs in R based on the user-defined criteria.

Internal code of the twsCALLBACK function

Inside of the CALLBACK (i.e. twsCALLBACK function) is a loop that fetches the incoming message type and calls the processMsg function at each new message.

Internal code snippet of the twsCALLBACK function

The ProcessMsg Function

The processMsg function internally is a series of if-else statements that branch according to a known incoming message type. A snippet of the internal code structure of the processMsg function is shown below.

Internal code snippet of the processMsg function

The eWrapper Closure

Creating eWrapper closure in twsCALLBACK using the eWrapper function

The eWrapper function creates an eWrapper closure to allow for custom incoming message management. The eWrapper closure contains a list of functions to manage all incoming message types. Each message has a corresponding function in the eWrapper designed to handle the particular details of each incoming message type.

List of functions contained in the eWrapper Closure

The data environment is .Data, with accessor methods get.Data, assign.Data, and remove.Data. These methods can be called from the closure object eWrapper$get.Data, eWrapper$assign.Data, etc. By creating an instance of eWrapper, accomplished by calling it as a function call, one can then modify any or all the particular methods embedded in the object.

Summarizing the Internal Structure of the IBrokers Package

We have seen above how the internal structure of the IBrokers package works. To summarize the entire mechanism, it can be depicted as shown below:

Request to TWS for data -> twsCALLBACK -> processMsg -> eWrapper
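To make the flow above concrete, here is a minimal conceptual sketch in Python. It is not the IBrokers source code: the names make_wrapper, process_msg, callback, tick_price, and unhandled are hypothetical, the message strings are made up in the printed format shown earlier, and treating the fourth body field as the price is an assumption taken from that sample message.

```python
def make_wrapper():
    """Build a wrapper: a private data store plus one handler per message type."""
    data = {}  # plays the role of the wrapper's data environment

    def assign_data(key, value):
        data[key] = value

    def get_data(key):
        return data.get(key)

    def tick_price(body):
        # Assumption: the fourth body field carries the price
        assign_data("last_price", float(body[3]))

    def unhandled(body):
        # Count messages this sketch has no dedicated handler for
        assign_data("unhandled", (get_data("unhandled") or 0) + 1)

    return {
        "get_data": get_data,
        "assign_data": assign_data,
        "tick_price": tick_price,
        "unhandled": unhandled,
    }


def process_msg(header, body, wrapper):
    """Branch on the message type and call the matching wrapper function."""
    if header == "1":
        wrapper["tick_price"](body)
    else:
        wrapper["unhandled"](body)


def callback(messages, wrapper):
    """Loop over incoming messages, handing each one to process_msg."""
    for raw in messages:
        fields = raw.split()
        process_msg(fields[0], fields[1:], wrapper)


wrapper = make_wrapper()
callback(["1 6 1 4 140.76 1 0", "2 6 1 5 300"], wrapper)
print(wrapper["get_data"]("last_price"))  # 140.76
print(wrapper["get_data"]("unhandled"))   # 1
```

Replacing a single entry of the wrapper (for example tick_price) changes how that message type is handled without touching the loop, which is the idea behind customizing the eWrapper.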

Real-Time Data Model

We will use the snapShotTest code example published by Jeff Ryan. The code below modifies the twsCALLBACK function. This modified callback is used as an argument to the reqMktData function. The output using the modified callback is more convenient to read than the normal output when we use the reqMktData function.

Another change in the snapShotTest code is to record any error messages from the IB API to a separate file. (Under the default method, the eWrapper prints such error messages to the console.) To do this, we create a different wrapper using eWrapper(debug=NULL). Once we construct this, we can assign its errorMessage() function to the eWrapper we want to use.


We then apply a simple trade logic which generates a buy signal if the last bid price is greater than a pre-specified threshold value. One can similarly tweak the logic of the twsCALLBACK to create custom callbacks based on one’s requirement of trading strategy.
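The threshold rule itself is small enough to sketch on its own. The Python snippet below is only an illustration of the logic, with made-up bid prices and a hypothetical function name; it does not place any order.

```python
def first_buy_signal(bids, threshold):
    """Return the index of the first bid above the threshold, or None."""
    for i, bid in enumerate(bids):
        if bid > threshold:
            return i
    return None


bids = [140.70, 140.76, 140.81]  # made-up bid prices
print(first_buy_signal(bids, 140.75))  # 1
```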


Order getting filled in the IB Trader Workstation (TWS)



To conclude, the post gave a detailed overview of the architecture of the IBrokers package, which is the R implementation of the Interactive Brokers API. Interactive Brokers in collaboration with QuantInsti™ hosted a webinar, “Trading using R on Interactive Brokers”, which was held on 2nd March 2017 and conducted by Anil Yadav, Director at QuantInsti™. You can click on the link provided above to learn more about the IBrokers package.

Next Step

If you want to learn various aspects of Algorithmic trading then check out the Executive Programme in Algorithmic Trading (EPAT™). The course covers training modules like Statistics & Econometrics, Financial Computing & Technology, and Algorithmic & Quantitative Trading. EPAT™ equips you with the required skill sets to be a successful trader. Enroll now!


Starting Out with Time Series

Time series analysis and forecasting find wide usage in the financial markets across assets like stocks, F&O, Forex, and Commodities. As such, it becomes pertinent for aspiring quants to have sound knowledge in time series forecasting. In this post, we will introduce the basic concepts of time series and illustrate how to create time series plots and analysis in R programming language.

Time series defined

A time series is a sequence of observations over time, which are usually spaced at regular intervals of time. For example:

  • Daily stock prices for the last 5 years
  • 1-minute stock price data for the last 90 days
  • Quarterly revenues of a company over the last 10 years
  • Monthly car sales of an automaker for the last 3 years
  • Annual unemployment rate of a state in the last 50 years

Univariate time series and Multivariate time series

A univariate time series refers to the set of observations over time of a single variable. Correspondingly, a multivariate time series refers to the set of observations over time of several variables.

Time Series Analysis and Forecasting

In time series analysis, the objective is to apply/develop models which are able to describe the given time series with a fair amount of accuracy. On the other hand, time series forecasting involves forecasting the future values of a given time series using the past observed values. There are various models that are used for forecasting and the viability of a particular model used for forecasting is determined by its performance at predicting the future values.

Some examples of time series forecasting:

  • Forecasting the closing price of a stock every day
  • Forecasting the quarterly revenues of a company
  • Forecasting the monthly number of cars sold.

Plotting a time series

A plot of a time series data gives a clear picture of the spread over the given time period. It becomes easy for a human eye to detect any seasonality or abnormality in a given time series.


Plotting a time series in R

To plot a time series in R, we first need to read the data in R. If the data is available in a CSV file or in an Excel file, we can read the data in R using the read.csv() function or the read.xlsx() function respectively. Once the data has been read, we can create a time series plot by using the plot.ts() function. See the example given below.

We will use the time series data set from the Time Series Data Library (TSDL) created by Rob Hyndman. We will plot the monthly closings of the Dow-Jones industrial index, Aug. 1968 – Oct. 1992. Save the dataset in your current R working directory with the name monthly-closings-of-the-dowjones.csv


Decomposing time series

A time series generally comprises a trend component and an irregular (noise) component, and can also have a seasonal component in the case of a seasonal time series. Decomposing a time series means separating the original time series into these components.

Trend – The increasing or decreasing values in a given time series.

Seasonal – The repeating cycle over a specific period (day, week, month, etc.) in a given time series.

Irregular (Noise) – The random variation (irregularity) of values in a given time series.

Why do we need to decompose a time series?

As mentioned in the above paragraph, a time series might include a seasonal component or an irregular component. In such a case, we would not get a true picture of the trending property of the time series. Hence, we need to separate out the seasonality effect and/or the noise which will give us a clear picture, and help in further analysis.

How do we decompose a time series?

There are two structures which can be used for decomposing a given time series.

  1. Additive decomposition – If the seasonal variation is relatively constant over time, we can use the additive structure for decomposing a given time series. The additive structure is given as –

Xt = Trend + Random + Seasonal

  2. Multiplicative decomposition – If the seasonal variation is increasing over time, we can use the multiplicative structure for decomposing a time series. The multiplicative structure is given as –

Xt = Trend * Random * Seasonal

Decomposing a time series in R

To decompose a non-seasonal time series in R, we can use a smoothing method for calculating the moving average of a given time series. We can use the SMA() function from the TTR package to smooth out the time series.
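As an illustration of what a simple moving average computes, here is a minimal Python sketch. It is not the TTR implementation: the function name sma is hypothetical, and this version returns only the averages of complete windows.

```python
def sma(values, n):
    """Simple moving average over windows of length n (complete windows only)."""
    if n <= 0 or n > len(values):
        raise ValueError("n must be between 1 and len(values)")
    # Average each run of n consecutive observations
    return [sum(values[i - n + 1:i + 1]) / n for i in range(n - 1, len(values))]


print(sma([1, 2, 3, 4, 5], 3))  # [2.0, 3.0, 4.0]
```

A larger n smooths more aggressively, which makes the trend easier to see at the cost of reacting more slowly to changes.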

To decompose a seasonal time series in R, we can use the decompose() function. This function estimates the trend, seasonal, and irregular (noise) components of a given time series. The decompose function is given as –

decompose(x, type = c("additive", "multiplicative"), filter = NULL)

x – A time series
type – The type of seasonal component. Can be abbreviated
filter – A vector of filter coefficients in reverse time order (as for AR or MA coefficients), used for filtering out the seasonal component. If NULL, a moving average with the symmetric window is performed.

When we use the decompose function, we specify through the type argument whether the seasonal component is additive or multiplicative; the default is additive.
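To show what an additive decomposition does step by step, here is a minimal Python sketch of the classical approach: estimate the trend with a centered moving average, average the detrended values at each position of the cycle to get the seasonal figure, and treat the remainder as noise. The function name decompose_additive and the synthetic series are assumptions made for illustration; this is not R's decompose() source code.

```python
def decompose_additive(x, period):
    """Classical additive decomposition into (trend, seasonal, noise) lists."""
    n = len(x)
    if period < 2 or n < 2 * period:
        raise ValueError("need at least two full periods of data")
    half = period // 2
    # Centered moving-average weights; an even period needs half weights at both ends
    if period % 2 == 0:
        weights = [0.5] + [1.0] * (period - 1) + [0.5]
    else:
        weights = [1.0] * period
    trend = [None] * n  # undefined at the edges, where no full window exists
    for t in range(half, n - half):
        window = x[t - half:t + half + 1]
        trend[t] = sum(w * v for w, v in zip(weights, window)) / period
    # Average the detrended values at each position of the cycle
    sums = [0.0] * period
    counts = [0] * period
    for t in range(n):
        if trend[t] is not None:
            sums[t % period] += x[t] - trend[t]
            counts[t % period] += 1
    figure = [s / c for s, c in zip(sums, counts)]
    mean = sum(figure) / period
    figure = [f - mean for f in figure]  # center the seasonal figure on zero
    seasonal = [figure[t % period] for t in range(n)]
    noise = [None if trend[t] is None else x[t] - trend[t] - seasonal[t]
             for t in range(n)]
    return trend, seasonal, noise


# Synthetic quarterly series: linear trend plus a fixed seasonal pattern
pattern = [1, -1, 2, -2]
series = [t + pattern[t % 4] for t in range(12)]
trend, seasonal, noise = decompose_additive(series, 4)
print(seasonal[:4])
```

On this synthetic series the recovered seasonal figure matches the pattern that was added, and the noise is zero wherever the trend is defined.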


Stationary and non-stationary time series

A stationary time series is one whose mean and variance are both constant over time; equivalently, its properties do not depend on the time at which the series is observed. Such a series looks flat: it has no trend, no seasonality, a constant mean, a constant variance, and a constant autocorrelation structure. This makes a stationary time series relatively easy to predict. On the other hand, a non-stationary time series is one where the mean, the variance, or both are not constant over time.

There are different tests that can be used to check whether a given time series is stationary. These include the Autocorrelation function (ACF), Partial autocorrelation function (PACF), Ljung-Box test, Augmented Dickey–Fuller (ADF) t-statistic test, and the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test.

Let us test our sample time series with the Autocorrelation function (ACF), Partial autocorrelation function (PACF) to check if it is stationary.

Autocorrelation function (ACF) – The autocorrelation function checks for correlation between two different data points of a time series separated by a lag “h”. For example, for lag 1, the ACF will check for correlation between points #1 and #2, #2 and #3, etc. Similarly, for lag 3, the ACF will check between points #1 and #4, #2 and #5, #3 and #6, etc.
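The pairing of points described above can be written out directly. The following Python sketch computes the usual sample autocorrelation at a single lag; the function name acf_at_lag and the toy series are assumptions for illustration, and it is not the code behind R's acf() function.

```python
def acf_at_lag(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    if lag < 0 or lag >= n:
        raise ValueError("lag must be between 0 and len(x) - 1")
    mean = sum(x) / n
    # Total variation around the mean (the lag-0 term)
    denom = sum((v - mean) ** 2 for v in x)
    # Pair point t with point t + lag, as in #1 and #2, #2 and #3, ...
    num = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return num / denom


trending = list(range(1, 11))  # a steadily rising toy series
print(acf_at_lag(trending, 0))  # 1.0
print(acf_at_lag(trending, 1))  # approximately 0.7
```

For a trending series like this one the values decline only gradually as the lag grows, which is the slow decay associated with non-stationarity.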

R code for ACF –


Partial autocorrelation function (PACF) – In some cases, the effect of autocorrelation at smaller lags will have an influence on the estimate of autocorrelation at longer lags. For example, a strong lag one autocorrelation can cause an autocorrelation with lag three. The Partial Autocorrelation Function (PACF) removes the effect of shorter lag autocorrelation from the correlation estimate at longer lags.

R code for PACF


The values of ACF and PACF each vary between plus and minus one. Values closer to plus or minus one indicate a strong correlation. If the time series is stationary, the ACF will drop to zero relatively quickly, while the ACF of a non-stationary time series will decrease slowly. From the ACF graph, we can conclude that the given time series is non-stationary.


In this post, we gave an overview of time series, plotting time series data, and decomposition of a time series into its constituent components using the R programming language. We were also introduced to the concept of stationary and non-stationary time series and the tests which can be carried out to check if a given time series is stationary. In our upcoming post, we will continue with the concept of stationary time series and see how to convert a non-stationary time series into a stationary time series. For further reference, you might like to go through the following:




Sentiment Analysis on News Articles using Python

Know how to perform sentiment analysis on news articles using Python Programming Language

by Milind Paradkar

In our previous post on sentiment analysis we briefly explained sentiment analysis within the context of trading, and also provided a model code in R. The R model was applied to an earnings call conference transcript of an NSE-listed company, and the output of the model was compared with the quarterly earnings numbers and with a chart of the one-month stock price movement after the earnings call date. QuantInsti also conducted a webinar on “Quantitative Trading Using Sentiment Analysis” where Rajib Ranjan Borah, Director & Co-founder, iRageCapital and QuantInsti, covered important aspects of the topic in detail; it is a must watch for all enthusiasts wanting to learn and apply quantitative trading strategies using sentiment analysis.

Taking these initiatives on sentiment analysis forward, in this blog post we attempt to build a Python model to perform sentiment analysis on news articles that are published on a financial markets portal. We will build a basic model to extract the polarity (positive or negative) of the news articles.

In Rajib’s webinar, one of the slides details the sensitivity of different sectors to company and sectoral news. In the slide, the Pharma sector ranks at the top as the most sensitive sector, and in this blog we will apply our sentiment analysis model to specific news articles pertaining to select Indian Pharma companies. We will determine the polarity, and then check how the market reacted to this news. For our sample model, we have taken ten Indian Pharma companies that make up the NIFTY Pharma index.

Building the Model

Now, let us dive straight in and build our model. We use the following Python libraries to build the model:

  • Requests
  • Beautiful Soup
  • Pattern

Step 1: Create a list of the news section URL of the component companies

We identify the component companies of the NIFTY Pharma index, and create a dictionary in Python which contains the company names as the keys, while the dictionary values comprise the respective company abbreviations used by the financial portal site to form the news section URL. Using this dictionary we create a Python list of the news section URLs for all the component companies.
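A minimal sketch of this step is shown below. The company names, abbreviations, and URL pattern are placeholders invented for illustration; the actual portal and its abbreviations are not given in this post.

```python
# Placeholder company names and portal abbreviations (not real values)
companies = {
    "Company A": "CA01",
    "Company B": "CB02",
}

# Hypothetical URL pattern for a company's news section
BASE_URL = "https://portal.example.com/company-news/{abbr}"

news_urls = [BASE_URL.format(abbr=abbr) for abbr in companies.values()]
print(news_urls)
```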

Step 2: Extract the relevant news articles web-links from the company’s news section page

Using the Python list of the news section URLs, we run a Python For loop which pings the portal with every URL in our Python list. We use the requests.get function from the Python requests library (which is a simple HTTP library). The requests module allows you to send HTTP/1.1 requests. One can add headers, form data, multipart files, and parameters with simple Python dictionaries, and also access the response data in the same way.

The text of the response object is then applied to create a Beautiful Soup object. Beautiful Soup is a Python library for pulling data out of HTML and XML files. It works with a given parser to provide for ways of navigating, searching, and modifying the parse tree.

HTML parsing basically means taking in the HTML code and extracting relevant information like the title of the page, paragraphs in the page, headings, links, bold text etc.

The news section webpage on the financial portal site contains 20 news articles per page. We target only the first page of the news section, and our objective is to extract the links for all the news articles that appear on the first page using the parsed HTML. We inspect the HTML, and use the find_all method in the code to search for a tag that has the CSS class name as “arial11_summ”. This enables us to extract all the 20 web-links.
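The link extraction can be sketched without any network access by parsing a small HTML string. The snippet below uses Python's standard html.parser module as a stand-in for Beautiful Soup's find_all, and it assumes the class “arial11_summ” sits on the anchor tags themselves; the sample HTML and the class name LinkCollector are made up for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values of <a> tags that carry a given CSS class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "a" and self.css_class in classes and attrs.get("href"):
            self.links.append(attrs["href"])


# Made-up HTML standing in for the first page of a news section
sample_html = """
<div><a class="arial11_summ" href="/news/usfda-approval">USFDA approval</a></div>
<div><a class="other" href="/ads/1">Advertisement</a></div>
<div><a class="arial11_summ" href="/news/q2-results">Q2 results</a></div>
"""

collector = LinkCollector("arial11_summ")
collector.feed(sample_html)
print(collector.links)  # ['/news/usfda-approval', '/news/q2-results']
```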

Fortunes of the R&D intensive Indian Pharma sector are driven by sales in the US market and by approvals/rejections of new drugs by the US Food and Drug Administration (USFDA). Hence, we will select only those news articles pertaining to the USFDA and the US market. Using keywords like “US”, “USA”, and “USFDA” in an If statement nested within the Python For loop, we get our final list of all the relevant news articles.
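One way to express the keyword check is sketched below. Matching on whole words with a regular expression is a design choice of this sketch (so that “US” does not match inside a word such as “BUSINESS”); the original code may have used a plain substring test. The function name and headlines are made up.

```python
import re

# Match the keywords only as whole, upper-case words
KEYWORDS = re.compile(r"\b(US|USA|USFDA)\b")


def is_relevant(title):
    """True if the headline mentions one of the target keywords."""
    return bool(KEYWORDS.search(title))


titles = [
    "Company A gets USFDA nod for generic drug",
    "Company B to expand domestic BUSINESS",
    "Company C sees strong sales in US market",
]
relevant = [t for t in titles if is_relevant(t)]
print(relevant)
```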

Step 3: Remove the duplicate news articles based on news title

It may happen that the financial portal publishes important news articles pertaining to the overall pharma sector on every pharma company’s news section webpage. Hence, it becomes necessary to weed out the duplicate news articles that appear in our Python list before we run our sentiment analysis model. We call the set function on our Python list which we generated in Step 2 to give us a list with no duplicate news articles.
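The sketch below removes duplicates by title while keeping the first URL seen for each title. It is a small variation on calling set directly: a set of titles tracks what has been seen, so the original order of the list is preserved. The article data and function name are made up.

```python
def dedupe_by_title(articles):
    """Keep the first (title, url) pair for each distinct title, in order."""
    seen = set()
    unique = []
    for title, url in articles:
        if title not in seen:
            seen.add(title)
            unique.append((title, url))
    return unique


articles = [
    ("USFDA tightens norms", "/company-a/news/1"),
    ("Company A wins approval", "/company-a/news/2"),
    ("USFDA tightens norms", "/company-b/news/7"),  # same story, other page
]
print(dedupe_by_title(articles))
```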

Step 4: Extract the main text from the selected news articles

In this step we run a Python For loop and, for every news article URL, call requests.get() on the URL and then convert the text of the response object into a Beautiful Soup object. Finally, we extract the main text using the find and get_text methods from the Beautiful Soup module.

Step 5: Pre-processing the extracted text

We will use the n-grams function from the Pattern module to pre-process our extracted text. The ngrams() function returns a list of n-grams (i.e., tuples of n successive words) from the given string. Since we are building a simple model, we use a value of one for the n argument in the n-grams function. The Pattern module contains other useful functions for pre-processing like parse, tokenize, tag etc. which can be explored to conduct an in-depth analysis.
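To show what an n-gram list looks like, here is a plain-Python stand-in. It is not the Pattern implementation and handles punctuation more crudely (it only splits on whitespace); the function name simple_ngrams is hypothetical.

```python
def simple_ngrams(text, n=1):
    """Return tuples of n successive whitespace-separated words."""
    words = text.split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


print(simple_ngrams("drug approval received", 1))
# [('drug',), ('approval',), ('received',)]
print(simple_ngrams("drug approval received", 2))
# [('drug', 'approval'), ('approval', 'received')]
```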

Step 6: Compute the Sentiment analysis score using a simple dictionary approach

To compute the overall polarity of a news article we use the dictionary method. In this approach a list of positive/negative words helps determine the polarity of a given text. This dictionary is created using words that are specific to the Pharma sector. The code checks for positive/negative matching words from the dictionary against the processed text from the news article.
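A minimal sketch of the dictionary approach follows. The word lists here are tiny placeholders chosen for illustration, not the Pharma-specific dictionary used by the model, and the function name polarity_scores is hypothetical.

```python
# Placeholder word lists; the real dictionary is sector-specific
POSITIVE = {"approval", "growth", "nod"}
NEGATIVE = {"warning", "recall", "ban"}


def polarity_scores(words):
    """Count dictionary matches; returns (positive_score, negative_score)."""
    # Normalize case and strip surrounding punctuation before matching
    cleaned = [w.lower().strip(".,;:!?\"'()") for w in words]
    positive = sum(w in POSITIVE for w in cleaned)
    negative = sum(w in NEGATIVE for w in cleaned)
    return positive, negative


text = "USFDA issues warning letter; recall expected despite sales growth"
print(polarity_scores(text.split()))  # (1, 2)
```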

Step 7: Create a Python list of model output

The final output from the model is populated in a Python list. The list contains the URL, positive score and the negative score for each of the selected news articles on which we conducted sentiment analysis.

Final Output


Step 8: Plot NIFTY vs NIFTY Pharma returns

Shown below is a plot of NIFTY vs NIFTY Pharma for the months of October-November 2016. In our NIFTY Pharma plot we have drawn arrows highlighting some of the press releases on which we ran our sentiment analysis model. The impact of the uncertainty regarding the US Presidential election results, and the negative news for the Indian Pharma sector emanating from the US, is clearly visible on NIFTY Pharma as it fell substantially from the highs made in late October 2016. Thus, our attempt to gauge the direction of the Pharma Index using the sentiment analysis model in Python is giving us accurate results (more or less).



Next Step:

One can build more robust sentiment models using other approaches and trade profitably. As a next step we would recommend watching QuantInsti’s webinar on “Quantitative Trading Using Sentiment Analysis” by Rajib Ranjan Borah. Watch it by clicking on the video below:


Also, catch our other exciting Python trading blogs and if you are interested in knowing more about our EPAT course feel free to contact our QuantInsti team by clicking here.


  • Download.rar
    • Sentiment Analysis of News Article – Python Code
    • dict(1).csv
    • Nifty and Nifty Pharma(1).csv
    • Pharma vs



Popular Python Trading Platform for Algorithmic Trading


by Apoorva Singh

In one of our recent articles, we talked about the most popular backtesting platforms for quantitative trading. Here we share the most widely used Python trading platforms and libraries for quantitative trading.

Python is a free, open-source, cross-platform language with a rich library for almost every task imaginable and a specialized research environment. Python is an excellent choice for automated trading when the trading frequency is low/medium, i.e. for trades which do not last less than a few seconds. It has multiple APIs/libraries that can be linked to make it optimal and cheaper, and to allow greater exploratory development of multiple trade ideas.


RExcel Tutorial – Leveraging the Power of R in Excel


By Milind Paradkar

How many times has MS Excel given you a hard time while building complex models or importing that extra-large data set into the spreadsheet? As a trader, I would love to see crisp formulas in my worksheets and, more importantly, I want my models to be less prone to errors when I am trading in the live market.

What if I tell you that there is a microlith that can tear through these shortcomings and leverage the power of R in Excel in a hassle-free and non-tedious manner?


Friends, let me introduce you to RExcel, an add-in which allows you to use R functionalities in MS Excel. In a nutshell, we can do the following with this R Excel plugin:

  • Use R functions via cell formula/macros
  • Run R scripts through Excel
  • Transfer data between R and Excel.

Developed by Erich Neuwirth, RExcel works on Microsoft Windows with Excel 2003, 2007, 2010 and 2013. It uses the statconnDCOM server and the rcom package to access R from Excel.

Before you start using RExcel, you will need the following:

  • A suitable version of R
  • A matching version of rscproxy
  • statconnDCOM or rcom with statconnDCOM

You can find the link to install these in your system at the end of this article along with the download link.

Let’s come back to our tutorial now. There are three ways of using RExcel –

  • Worksheet functions
  • Macro mode
  • Scratchpad mode.

I will illustrate each of these modes with examples.

Method I: Using RExcel worksheet functions

As the name suggests, these functions call R functions in Excel worksheet cells. The list of functions includes:

  • RApply
  • RCall
  • REval
  • RExec
  • Other argument modifier functions.

You can refer to the help and documentation link in RExcel help tab to see the complete list of worksheet functions.

How do we use these functions?

Let me demonstrate with some examples.

1) Calculating the mean:

Let’s say that I wish to calculate the mean of the OHLC prices for the two stocks using the RApply function.

How do I do that?

I will use RApply, which allows us to call any R function as an Excel worksheet function. We call the mean function and apply it over the OHLC prices.

Pretty simple, isn’t it?


2) Defining functions:

Now, if I want to define my custom function and apply it using a given set of arguments, then I can do that as well. In this example, I round off the mean price to the nearest Rs. 0.05 using the function shown below. The minimum tick for NSE stocks is Rs. 0.05.
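The rounding rule itself can be sketched as follows. This is an illustration in Python rather than the R function used in the worksheet, and the function name round_to_tick is hypothetical.

```python
def round_to_tick(price, tick=0.05):
    """Round a price to the nearest multiple of the tick size."""
    # Count whole ticks, then convert back and trim floating-point noise
    return round(round(price / tick) * tick, 2)


print(round_to_tick(101.37))  # 101.35
print(round_to_tick(101.38))  # 101.4
```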


3) Applying functions over a range for quick execution:

If we use multiple RApply calls then it slows down the computation considerably. To overcome this we can use Excel array formulas instead of multiple RApply’s, and speed up the computation. The example below illustrates the rounding using the array function. Check the link to learn how array formulas are written in Excel.


Method II: Connecting R via Macros

RExcel has provided VBA procedures and functions for us to connect R via Macros. However, prior to starting in Macros you need to set the reference to RExcelVBAlib from the References tab (In the VBA window, see in Tools -> References).

Let me take a couple of examples to illustrate the macro method.

Example: Running an R script and generating the output in excel.

I am going to run an R script called “Top Gainers of the day.R” from the “RunRScript” macro (code shown below). When I execute this, the R script generates a list of top 5 NSE stock gainers of that day.

How does it do this?

It does this by sorting the percentage price change for all the given stocks in a descending order, and stores the top 5 in the “TopGainers_df” dataframe. We will run the macro and print the dataframe in our excel worksheet.
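The sorting step can be sketched independently of RExcel. The snippet below is a Python illustration with made-up tickers and percentage changes; it is not the “Top Gainers of the day.R” script, and the function name top_gainers is hypothetical.

```python
def top_gainers(pct_changes, n=5):
    """Return the n (ticker, change) pairs with the largest percentage change."""
    ranked = sorted(pct_changes.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


# Made-up percentage price changes for six placeholder tickers
changes = {"AAA": 1.2, "BBB": -0.4, "CCC": 3.1, "DDD": 0.7, "EEE": 2.5, "FFF": -1.9}
print(top_gainers(changes))
# [('CCC', 3.1), ('EEE', 2.5), ('AAA', 1.2), ('DDD', 0.7), ('BBB', -0.4)]
```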


The commands mentioned in the above macro have the following meaning:

  1. The RInterface.StartRServer starts the R server.
  2. The RInterface.RRun executes the command string that follows.
  3. The RInterface.RunRFile executes the R script mentioned in the quotes.
  4. The RInterface.GetDataframe command is used to retrieve the output in Excel. This command takes two arguments, the name of the dataframe variable, and the location in Excel where we want to print the output.
  5. Finally, the RInterface.StopRServer command stops the R server.

The output printed in Excel upon running the macro is shown below.

RExcel Output from the R script printed in the Excel sheet


How do I call R functions in Macro?

There is another way of using R functions in macros. I have given macro code below where I have stored a list of stocks with their percentage price change on Sheet3 of the workbook.

I have used the RInterface.PutDataframe command to assign this range as a dataframe to R. Then I called the arrange function from the dplyr package, and got the top 5 NSE stock gainers of the day.

Finally, I use the RInterface.GetDataframe to print this dataframe onto sheet2 of the workbook.

Thus, upon running this “Arrange” macro I was able to produce the same result as obtained in the first example.


These macros can be attached to menu items or toolbar items for easy execution. Once again, I will advise you to refer to the help and documentation link in RExcel help tab to see the complete list of procedures and functions available.

Method III: Using the Scratchpad method

In this method I will write the R expressions on an Excel sheet, and execute it using the buttons in the RExcel Menu. One needs to initiate R connection by selecting the “Start R” link from the RExcel Menu.


  1. We select the range (I3:I5) shown below
  2. Then click “Run R” from the RExcel menu
  3. Next, we select an empty cell (M3)
  4. Select “Get R Value” and when prompted, indicate the cell (I5 in this case) containing the final expression.
  5. The output from the expression gets printed in this empty cell.

I just used R’s cbind function and generated the output in Excel.


The scratchpad method can be applied to scalars, vectors, data frames or matrices. Other operations can also be done using this method; they are covered in the help and documentation link in the RExcel help tab.

To Conclude

Combining the power of R with Excel can surely simplify things for traders using R in Excel, and as a result provide them with more firepower to backtest their strategies and execute them on MS Excel.

Next Step

If you’re a trader interested in learning various aspects of Algorithmic trading, check out the Executive Programme in Algorithmic Trading (EPAT). The course covers training modules like Statistics & Econometrics, Financial Computing & Technology, and Algorithmic & Quantitative Trading. Most of all, the course will surely equip you with the required skillsets to be a successful algo trader.




Sources & References:

For Installation of R, R(D)COM server and RExcel go to:


IBPy Tutorial to implement Python in Interactive Brokers API

How to implement Python in Interactive Brokers API using IBPy

I hope you had a great time attending our webinar on Trading with Interactive Brokers using Python. I thought it would be a good idea to give you a brief insight into the Interactive Brokers API and into using IBPy to implement Python in IB’s TWS. As we proceed, you will need an Interactive Brokers demo account and IBPy. Towards the end of this article, you will be running a simple order routing program using the Interactive Brokers API.

For those of you who are already aware of Interactive Brokers (IB) and its interface, you can very well understand why I prefer IB over other available online brokerages. However, for those who have not used IB, this would be the first question that comes to mind:

Why Interactive Brokers?

When there are online brokerages like Fidelity, Capital One Investing, and Firstrade, why should one use Interactive Brokers?



Interactive Brokers is my first choice because of 5 simple reasons:

  1. International Investing in more than 100 markets
  2. Commission rates that are highly competitive
  3. Low margin rates
  4. Very friendly user interface
  5. Vast selection of order types

Among the five points mentioned above, the most important and impressive ones for any beginner are points 2 and 4. The Interactive Brokers API can be used in a professional context even by those who are completely new to it. The API’s connectivity with Java, C++ and Python is very impressive as well.

Enough said; it is time to move to the next step. I can understand that most of you must already be eager to try your hand at the Interactive Brokers API. After all, nobody could say no to something that is both very friendly and lucrative. You can easily set up your account on Interactive Brokers by going to their website. There is an option wherein you can opt for a free trial package.

Algorithmic traders prefer Interactive Brokers due to its relatively straightforward API. In this article, I will be telling you how to automate trades by implementing Python in the Interactive Brokers API using a bridge called IBPy.

Interactive Brokers IBPy

Since Interactive Brokers offers a platform to an incredibly wide spectrum of traders, its GUI consists of a myriad of features. This standalone application is called Trader Workstation, or TWS. Apart from the Trader Workstation, Interactive Brokers also has an IB Gateway. This application provides access to the IB servers without the full TWS graphical interface. Algo traders usually prefer using this over the GUI.

What is IbPy?

IbPy is a third-party implementation of the API used for accessing the Interactive Brokers on-line trading system. IbPy implements functionality that the Python programmer can use to connect to IB, request stock ticker data, submit orders for stocks and futures, and more.

The purpose of IbPy is to wrap the native API, which is written in Java, in such a way that it can be called from Python. Two of the most significant packages in IBPy are ib.ext and ib.opt; ib.opt builds on the functionality of ib.ext. Through IBPy, the API executes orders and fetches real-time market data feeds. The architecture essentially follows a client-server model.

Implementation of IB in Python

First of all, you must have an Interactive Brokers account and a Python workspace to install IBPy, and thereafter, you can use it for your coding purposes.

Installing IBPy

As I mentioned earlier, IBPy is a Python implementation of the Java-based Interactive Brokers API. IBPy makes the development of algo trading systems in Python a less cumbersome process. For this reason, I will be using it as the base for all interaction with the Interactive Brokers TWS. Here I am presuming that you have Python 2.7 installed on your system; if not, you may download it from here:

Installing On Ubuntu


IBPy can be acquired from its GitHub repository.

The following code will be needed on an Ubuntu system:

Creation of subdirectory

Download IBPy

Great! You have installed IBPy on your Ubuntu system.

Installing IBPy on Windows

Go to the GitHub repository and download the file from:

Unzip the downloaded file. Move this folder to the directory where you have installed Python so that it can recognize this package:


Now, open the setup folder in the Windows command prompt and type the following command:

After this, you will have to get your Trader Workstation (TWS) in operation.

Installing Trader Workstation

Interactive Brokers Trader Workstation, or TWS, is the GUI that lets all registered users of Interactive Brokers trade on their systems. Don’t worry: even if you do not have prior knowledge of programming or coding, TWS will let you do the trading work. You can download the TWS installer from Interactive Brokers’ website and run it on your system.

You can download the TWS Demo from here:

Important Note

In the older versions of TWS, the user would get to choose between two different programs. The first was the TWS of Interactive Brokers and the second was the IB Gateway, which I have already talked about. Although they are different applications, they can only be installed together.

The IB Gateway needs less processing power since it does not have a full graphical user interface like the Trader Workstation. However, results and other data are displayed on the IB Gateway as plain, code-like text, making it less friendly for users who do not have enough coding knowledge.

You may use either of the two interfaces for your work on Interactive Brokers. The function of both is the same, i.e. to relay information between your system and the Interactive Brokers server. Needless to say, the Python app will get exactly the same messages from the Interactive Brokers server either way.

Installation Walk-through

Once you download the application, you will find the executable file at the bottom of your browser. Click on Run when prompted with a security warning.

  • Now, click on Next.
  • Click on finish to complete your installation.
  • Click on the desktop icon and start the TWS application.

IBPy implementation in TWS

Since I am going to use a demo account, click on “No User name?”

  • Enter your email address and click on Login:

IBPy implementation in TWS

Configuration of Interactive Brokers Panel

The journey so far has been pretty easy, hasn’t it? It is great if you agreed with me on that one. After installing the TWS and/or IB Gateway, we have to make some changes in the configurations before implementing our strategies on Interactive Brokers’ servers. The software will connect to the server properly only once these settings are changed.


  • Go to API settings in TWS

Setting preferences for IBPY on Interactive Brokers TWS


  • Check Enable ActiveX and Socket Clients
  • Set Socket port to an unused port
  • Set the Master API client ID to 100
  • Create a Trusted IP Address and set to

Global preferences for Interactive Brokers API using IBPy

Running the first program

So, all done with the configuration?

Great! We are now ready to run our first program.

Before you start typing in the code, make sure that you have started TWS (or IB Gateway). I often get questions as to why an error message appears when the code is run. As I mentioned in the previous section, your system is connected to the Interactive Brokers server through TWS or IB Gateway. So, if you haven’t started it, you are bound to get an exception, no matter how smartly you have developed your code.
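One way to catch this early is to test whether anything is listening on the API socket port before running the trading code. The sketch below uses only the standard library; the function name tws_is_listening is hypothetical, and the default port 7496 is an assumption, so use whatever socket port you set in the API settings.

```python
import socket


def tws_is_listening(host="127.0.0.1", port=7496, timeout=1.0):
    """Return True if something accepts TCP connections on host:port.

    The default port is an assumption; use the socket port configured
    in the TWS / IB Gateway API settings.
    """
    try:
        # create_connection raises OSError (refused, timed out, ...) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling tws_is_listening() before connecting lets the script print a clear "start TWS first" message instead of failing later with an exception. It only confirms that a port is open, not that the program behind it is actually TWS.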

Let’s start working on the coding step-by-step.

Open Spyder (Start – All Programs – Anaconda2 – Spyder)

I will be entering my code in the Spyder console.

Spyder interface for IBPy

1) We start by importing necessary modules for our code:

Connection connects the API with IB, while message acts as a courier between the server and the system: it retrieves messages from the Interactive Brokers server.

Just like every transaction in the real world involves some kind of contract or agreement, we have Contract here as well. All orders on Interactive Brokers are made using a contract.

2) Making the contract function

The contract function has the following parameters:

  • Symbol
  • Security Type
  • Exchange
  • Primary Exchange
  • Currency

The values of these parameters must be set accordingly.
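A minimal sketch of such a contract function follows. To keep it runnable without IBPy installed, it uses a stand-in Contract class; with IBPy you would import Contract from ib.ext.Contract instead. The function name make_contract and its argument names are assumptions chosen for illustration.

```python
class Contract(object):
    """Stand-in for ib.ext.Contract.Contract so the sketch runs without IBPy."""


def make_contract(symbol, sec_type, exch, prim_exch, curr):
    # Field names follow IBPy's convention of prefixing attributes with m_
    contract = Contract()
    contract.m_symbol = symbol
    contract.m_secType = sec_type
    contract.m_exchange = exch
    contract.m_primaryExch = prim_exch
    contract.m_currency = curr
    return contract


# The values used later in this article: Apple stock, smart routing, US dollars
cont = make_contract("AAPL", "STK", "SMART", "SMART", "USD")
```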

3) Setting the Order Function

The order function allows us to place orders of different types. It has the following parameters:

  • Order Type
  • Total Quantity
  • Market Action (Buy or sell)

If our order has a set price, we code it in the following way:

Otherwise, the conditional statement sets up the order as a simple market order without any set price.
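The conditional described here can be sketched as below. As before, a stand-in Order class keeps the example runnable without IBPy (with IBPy you would import Order from ib.ext.Order), and the function name make_order is an assumption.

```python
class Order(object):
    """Stand-in for ib.ext.Order.Order so the sketch runs without IBPy."""


def make_order(action, quantity, price=None):
    order = Order()
    order.m_action = action            # "BUY" or "SELL"
    order.m_totalQuantity = quantity
    if price is not None:
        # A price was given: limit order at that price
        order.m_orderType = "LMT"
        order.m_lmtPrice = price
    else:
        # No price given: plain market order
        order.m_orderType = "MKT"
    return order


limit_order = make_order("SELL", 100, 100.0)
market_order = make_order("SELL", 100)
```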

The client id and port should be the same as those you set in the Global preferences.

4) Initiating Connection to API

Establish the connection to TWS.


Assign error handling function.


Assign server messages handling function.


Create AAPL contract and send order

In the above line, AAPL is Apple Inc. and STK is the security type (stock). The exchange and primary exchange have been set to SMART. When we set these two parameters to SMART, we are using Interactive Brokers’ smart routing system, which lets the algo find the best route to carry out the trade. And of course, the currency has been set to USD.

We wish to sell 100 shares of AAPL.

Our order is to sell 100 shares at a price of $100.

We have placed an order on IB TWS with the following parameters:

  • Order id
  • Contract
  • offer

“Always remember that the order id should be unique.”
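One simple way to keep order ids unique within a session is to hand them out from a counter. This is only a sketch: the class name OrderIdSource is hypothetical, and the starting value is an assumption that in practice must be higher than any id already used with your TWS session.

```python
import itertools


class OrderIdSource(object):
    """Hands out strictly increasing order ids, never the same one twice."""

    def __init__(self, start):
        self._counter = itertools.count(start)

    def next_id(self):
        return next(self._counter)


ids = OrderIdSource(start=100)   # starting value is an assumption
first_id = ids.next_id()
second_id = ids.next_id()
```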

5) Disconnecting

And finally, you need to disconnect:

Yes, you are done with your first order on the Interactive Brokers API using basic Python coding. Keep in mind that the demo account that you are using might not give you all the privileges of a paid account.

Running the Code

Click on the Green colored ‘Play’ button or simply press F5 in Spyder. On your TWS Demo system, you will get a popup regarding your order. Click on OK.

IBPy confirmation


You can see the final output on the bottom right side of Interactive Brokers TWS panel.

Output of algo using IBPy

Just in case you want to have a look at the complete code at one go, here it is:

Next Step

I am sure that you have all run your code and made your first transaction using the Interactive Brokers API and IBPy. We can see the output on TWS, where you will be selling 100 shares of Apple. This is a very generic and simple type of automated execution using the Interactive Brokers API.

You can watch QuantInsti’s webinar on Trading with Interactive Brokers using Python, where Dr. Hui Liu explains how to use another wrapper called IBridgePy. Dr. Hui Liu is one of the pioneers in the field. So, if you wish to know how to implement algo strategies in the live market using Python on the Interactive Brokers API, you should definitely check out the videos from our recently concluded webinar. To know more about algorithmic trading, enrol for EPAT.



How to Check Data Quality Using R


How to check data quality

By Milind Paradkar

Do You Use Clean Data?

Always go for clean data! Why is it that experienced traders/authors stress this point in their trading articles/books so often? As a novice trader, you might be using the freely available data from sources like Google or Yahoo Finance. Do such sources provide accurate, quality data?

We decided to do a quick check and took a sample of 143 stocks listed on the National Stock Exchange of India Ltd (NSE). For these stocks, we downloaded the 1-minute intraday data for the period 1/08/2016 – 19/08/2016. The aim was to check whether Google Finance captured every 1-minute bar during this period for each of the 143 stocks.

NSE’s trading session starts at 9:15 and ends at 15:30 IST, thus comprising 375 minutes. For 14 trading sessions, we should have 14 × 375 = 5250 data points for each of these stocks. We wrote a simple R script to perform the check.
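The original check was written in R. The arithmetic and the idea of finding missing bars can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes each bar is stamped with the minute at which it starts (09:15 through 15:29); a feed that stamps bars at the end of the minute would need the range shifted by one.

```python
from datetime import date, datetime, time, timedelta

SESSION_OPEN = time(9, 15)
SESSION_CLOSE = time(15, 30)


def minutes_per_session():
    """Length of one NSE session in minutes."""
    day = date(2016, 8, 1)  # any date works; only the clock times matter
    delta = datetime.combine(day, SESSION_CLOSE) - datetime.combine(day, SESSION_OPEN)
    return int(delta.total_seconds() // 60)


def expected_bar_times(day):
    """Every 1-minute bar timestamp expected for one session (start-stamped)."""
    start = datetime.combine(day, SESSION_OPEN)
    return [start + timedelta(minutes=i) for i in range(minutes_per_session())]


def missing_bars(day, observed):
    """Expected bar timestamps that do not appear in the observed data."""
    seen = set(observed)
    return [t for t in expected_bar_times(day) if t not in seen]


bars_per_day = minutes_per_session()
expected_total = bars_per_day * 14   # 14 trading sessions in the sample period
```

Comparing expected_total with the number of rows actually downloaded for a stock gives the shortfall, and missing_bars pinpoints which minutes are absent on a given day.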

Here is our finding: out of the 143 stocks scanned, 89 stocks had fewer than 5250 data points. That’s more than 60% of our sample set! The table shown below lists 10 of those 89 stocks.


Let’s take the case of PAGEIND. Google Finance captured only 4348 1-minute data points for the stock, thus missing 902 points!

Example – Missing the 1306 minute bar on 20160801:

Missing the 1306 minute bar on 20160801

Example – Missing the 1032 minute bar on 20160802:

Missing the 1032 minute bar on 20160802

If a trader is running an intraday strategy which generates buy/sell signals based on 1-minute bars, the strategy is bound to give some false signals.

As can be seen from the quick check above, data quality from free sources or from cheap data vendors is not guaranteed. Many of the cheap data vendors source their data from Yahoo Finance and provide it to their clients. A poor data feed is a big issue faced by many traders, and you will find many of them complaining about it on various trading forums.

Backtesting a trading strategy using such data will give false results. If you are using the data in live trading and there is a server problem with Google or Yahoo Finance, the data feed will be delayed. As a trader, you don’t want to be in a position where you have an open trade, and the data feed stops or is delayed. When trading with real money, one is always advised to use quality data from reliable data vendors. After all, Data is Everything!

Next Step

If you’re a retail trader interested in learning various aspects of Algorithmic trading, check out the Executive Programme in Algorithmic Trading (EPAT). The course covers training modules like Statistics & Econometrics, Financial Computing & Technology, and Algorithmic & Quantitative Trading. The course equips you with the required skillsets to be a successful trader.

Download Data Files

  • Do You Use Clean Data.rar
    • 15 Day Intraday Historical
    • F&O Stock List.csv
    • R code – Google_Data_Quality_Check.txt
    • R code – Stock price data.txt



Vectorised Backtesting in Excel

Backtesting in Excel

By Jacques Joubert

Those of you who know me as a blogger might find this post a little unorthodox compared with my traditional style of writing. However, in the spirit of evolution, and inspired by a friend of mine, Stuart Reid, I will be following some of the tips suggested in the following blog post.

Being a student in the EPAT program I was excited to learn the methodology that others make use of when it comes to backtesting. As usual, we start off in Excel and then migrate to R.

Having previously written a blog series on backtesting on Excel and then moving to R, I was very interested to see a slightly different method used by the QuantInsti team.


Importing CSV Data in Zipline for Backtesting

Importing CSV Data in Zipline for Backtesting

By Priyanka Sah

In our previous article, an introduction to the Zipline package in Python, we created an algorithm for a moving average crossover strategy. Recall that Zipline is a Python library for trading applications and is used to create an event-driven system that can support both backtesting and live trading.

In the previous article, we also learned how to implement Moving Average Crossover strategy on Zipline. The strategy code in Zipline reads data from Yahoo directly, performs the backtest and plots the results. We recommend that you brush up a few essential concepts, covered in the previous post, before going further:

  1. Installation (how to install Zipline on local)
  2. Structure (format to write a code in Zipline)

In this article, we will take a step further and learn to backtest on Zipline using data from different sources. We will learn to:

  • Import and backtest on OHLC data in CSV format
  • Import and use data from Google Finance for research/analysis
  • Calculate and print backtesting results such as PnL, number of trades, etc


The post serves as a guide for serious quants and DIY Algo traders who want to make use of Python or Zipline packages independently for backtesting and hypothesis testing of their trading ideas. In this post, we will assume that the data is from the US markets. It is possible to use other markets’ data sets for analysis with some edits and additions in the code. We will share the same in a later post.

The Parts of the code on Zipline – what we have learned already

Part 1 - Code screenshot

The problem with the existing method?

Zipline provides an inbuilt function “load_bars_from_yahoo()” that fetches data from Yahoo for a given date range and uses that data for all the calculations. Though very easy to use, this function only works with Yahoo data. Using this function, we cannot backtest on other data sets such as:

  1. Commodities data, which Yahoo does not provide
  2. Simulated data sets created and saved in CSV format

We have been using this inbuilt function so far to load stock data in Python IDE and work further with it. To be able to read csv or any other data type in Zipline, we need to understand how Zipline works and why usual methods to import data do not work here!

Zipline accepts the data in Panel form. To understand how Zipline treats and understands data, we must learn a little bit about data structures in pandas.

Data Structures in Pandas

Pandas structures data in three forms essentially: Series (1D), Data Frame (2D), Panel (3D)

  1. Series:

It is a one-dimensional labeled array capable of holding any data type (integers, strings, floating point numbers, Python objects, etc.). The axis labels are collectively referred to as the index.

The basic method to create a Series is to call:

s = pd.Series(data, index=index)

A series accepts different kinds of objects such as a Python dictionary, ndarray, a scalar value (like 5).

  2. Data Frame:

It is a two-dimensional labeled data structure with rows and columns. Columns can be of different types or same.

It is one of the most commonly used pandas objects and accepts different types of inputs such as Dict of 1D ndarrays, lists, dicts, or Series; 2-D numpy.ndarray; Structured or record ndarray; a Series.

  3. Panel:

A Panel is a less commonly used data structure, but it handles three-dimensional data efficiently.

The three axes are named as below:

  1. items: axis 0, each item corresponds to a DataFrame contained inside
  2. major_axis: axis 1, it is the index (rows) of each of the DataFrames
  3. minor_axis: axis 2, it is the columns of each of the DataFrames
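To make the three axes concrete, here is a small sketch that models the same layout with plain nested dictionaries instead of a real Panel (later pandas releases removed the Panel class, so this avoids depending on it). All names and prices below are made up for illustration.

```python
# items -> major_axis (dates) -> minor_axis (fields); values are made up
panel_like = {
    "SPY": {
        "2011-01-03": {"Open": 10.0, "Close": 10.5},
        "2011-01-04": {"Open": 10.5, "Close": 10.2},
    },
    "AAPL": {
        "2011-01-03": {"Open": 20.0, "Close": 20.4},
        "2011-01-04": {"Open": 20.4, "Close": 20.1},
    },
}


def axes(panel):
    """Return the labels along the three axes of the nested structure."""
    items = sorted(panel)
    major_axis = sorted({day for frame in panel.values() for day in frame})
    minor_axis = sorted({field for frame in panel.values()
                         for row in frame.values() for field in row})
    return items, major_axis, minor_axis


items, major_axis, minor_axis = axes(panel_like)
# One value is addressed by item, then row (date), then column (field)
spy_close = panel_like["SPY"]["2011-01-04"]["Close"]
```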

Zipline only understands data structure in the Panel format.

While it is easy to import CSV data in pandas as a DataFrame, it is not possible to do the same in Zipline directly. However, we have found a workaround for this problem:


This is a powerful technique that lets you bring in data from different sources. You can:

  • Import OHLC data in CSV format into Zipline (we will show how)
  • Read data from online sources other than Yahoo that pandas can connect to (we will show how)
  • Read data from Quandl in Zipline (this is left as an exercise for you!)

Let us get started with the three steps!

  1. Import the data in python

We can use any method to import the data as a Dataframe or just import the data and convert it into a Dataframe. Here, we will use two methods to fetch data: DataReader & read_csv function.

Use DataReader to read data from Google

pandas provides a function, DataReader, which allows the user to specify the date range and the source. You can use Yahoo, Google or any other supported data source.

This is what the DataFrame looks like when you print the first 6 rows:


Use read_csv function to import a CSV file

pandas provides another function, read_csv, that reads a CSV file from a specified location. Please note that the CSV must be in the proper format so that it works correctly when used by a strategy algorithm in Zipline.

Format of CSV file:

The first column is “Date”, the second is “Open”, the third is “High”, the fourth is “Low”, the fifth is “Close”, the sixth is “Volume” and the seventh is “Adj Close”. None of the columns should be blank or have missing values.
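Before handing a file to the strategy, it can help to check that it really follows this layout. The sketch below does so with the standard csv module; the function name check_ohlc_csv is hypothetical and the sample rows are made-up values, not real prices.

```python
import csv
import io

EXPECTED_COLUMNS = ["Date", "Open", "High", "Low", "Close", "Volume", "Adj Close"]


def check_ohlc_csv(handle):
    """Return a list of problems found in the CSV; an empty list means it looks fine."""
    reader = csv.reader(handle)
    header = next(reader, None)
    if header != EXPECTED_COLUMNS:
        return ["unexpected header: %r" % (header,)]
    problems = []
    for line_no, row in enumerate(reader, start=2):
        # Flag rows with the wrong number of cells or any empty cell
        if len(row) != len(EXPECTED_COLUMNS) or any(cell.strip() == "" for cell in row):
            problems.append("line %d has a blank or missing value" % line_no)
    return problems


# Made-up sample: the second data row is missing its Volume
sample = io.StringIO(
    "Date,Open,High,Low,Close,Volume,Adj Close\n"
    "2011-01-03,10.0,10.6,9.9,10.5,1000,10.5\n"
    "2011-01-04,10.5,10.7,10.1,10.2,,10.2\n"
)
problems = check_ohlc_csv(sample)
```

For a file on disk, you would pass an open file handle instead of the in-memory sample.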

Reading CSV file:

Note in the code above:

  • Name of the stock is “SPY”
  • We are already in the directory where the CSV file “SPY.csv” is saved, else you need to specify the path as well.

  2. Convert DataFrame to Panel

The data imported in Python IDE by aforementioned methods is saved as a Dataframe. Now we need to convert it into Panel format and modify major and minor axis.

Zipline accepts [‘Open’, ‘High’, ‘Low’, ‘Close’, ‘Volume’, ‘Price’] data as the minor axis and ‘Date’ as the major axis in UTC time format. If the Date is not already in UTC format, we convert it using “tz_localize(pytz.utc)”.
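Localizing means attaching the UTC time zone to timestamps that carry no zone information, without shifting the clock time. The standard-library sketch below mirrors that idea on plain datetime objects; it is an illustration of the concept, not the pandas call itself, and the function name localize_utc is hypothetical.

```python
from datetime import datetime, timezone


def localize_utc(naive_dates):
    """Attach UTC to naive datetimes, leaving the clock time unchanged."""
    localized = []
    for d in naive_dates:
        if d.tzinfo is not None:
            # Already zone-aware values need conversion, not localization
            raise ValueError("already timezone-aware: %r" % (d,))
        localized.append(d.replace(tzinfo=timezone.utc))
    return localized


dates = [datetime(2011, 1, 3), datetime(2011, 1, 4)]
utc_dates = localize_utc(dates)
```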

Now ‘panel’ is the dataset ‘data’ saved in the Panel format. This is what the Panel format looks like:

Panel format

  3. Use this new data structure Panel to run your strategy

We use this new data structure ‘Panel’ to run our strategy with no changes in the “initialize” or “handle_data” sections. The strategy logic and code remain the same. We just plug in the new data structure while running the strategy.

That’s it! Now you can easily run the previously explained Moving Average Crossover strategy on a CSV data file! Go on, give it a try!

You can also fetch US data from Quandl and try generating signals on it.

Backtesting on Zipline

In the previous post, we backtested a simple Moving Crossover strategy and plotted cash and PnL for each trading day. Now, we will calculate PnL and the total number of trades for the entire trading period.

Recall that the results are automatically saved in ‘perf_manual’. Using the same, we can calculate any performance ratios or numbers that we need.


Looks like this strategy lost more than 50% of initial capital!
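The PnL and trade-count calculation described above can be sketched with plain Python lists. This is an illustration only: the function names are hypothetical, the daily values are made up, and the sketch assumes you have extracted the daily portfolio values and position sizes from the results; a trade is counted each time the position size changes.

```python
def total_pnl(portfolio_values, capital_base):
    """Profit or loss over the whole period, in currency units."""
    return portfolio_values[-1] - capital_base


def count_trades(positions):
    """Count position changes, starting from a flat (zero) position."""
    trades = 0
    previous = 0
    for current in positions:
        if current != previous:
            trades += 1
        previous = current
    return trades


capital_base = 100000.0                               # the default initial capital
values = [100000.0, 99000.0, 101500.0, 98000.0]       # made-up daily portfolio values
positions = [0, 50, 50, 0]                            # made-up daily share counts

pnl = total_pnl(values, capital_base)
pnl_pct = 100.0 * pnl / capital_base
n_trades = count_trades(positions)
```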

To change the initial capital and other parameters to optimize your backtesting results, you need to initialize the TradingAlgorithm() accordingly. ‘capital_base’ is used to define the initial cash, ‘data_frequency’ is used to define the data frequency. For example:

(By default the capital is 100000.0.)

Go through the official documentation of TradingAlgorithm() function to try and learn more!

Next Step

If you are serious about writing advanced trading strategies and executing them through Python, read more about our Executive Programme in Algorithmic Trading. Over 250 hours of intensive training, with customized learning solutions, interactions with industry experts, traders, quants and two months of practical project work under Algo & HFT traders is what you get at throw-away prices! The new batch is starting from 27th August! Enroll now!

Download Data File

  • mac_excel_ zipline.txt



Introduction to Zipline in Python

Zipline in Python

By Priyanka Sah


Python has emerged as one of the most popular languages for programmers in financial trading, due to its ease of availability, user-friendliness, and the presence of sufficient scientific libraries like Pandas, NumPy, PyAlgoTrade, Pybacktest and more.

Python serves as an excellent choice for automated trading when the trading frequency is low/medium, i.e. for trades which do not last less than a few seconds. It has multiple APIs/Libraries that can be linked to make it optimal, cheaper and allow greater exploratory development of multiple trade ideas.


It is due to these reasons that Python has a very interactive online community of users, who share, reshare, and critically review each other’s work or codes. The two current popular web-based backtesting systems are Quantopian and QuantConnect.

Quantopian makes use of Python (and Zipline) while QuantConnect utilises C#. Both provide a wealth of historical data. Quantopian currently supports live trading with Interactive Brokers, while QuantConnect is working towards live trading.

Zipline is a Python library for trading applications that powers the Quantopian service mentioned above. It is an event-driven system that supports both backtesting and live trading.

In this article, we will learn how to install Zipline and then how to implement Moving Average Crossover strategy and calculate P&L, Portfolio value etc.

This article is divided into the following four sections:

  • Benefits of Zipline
  • Installation (how to install Zipline on local)
  • Structure (format to write code in Zipline)
  • Coding Moving average crossover strategy with Zipline

Benefits of Zipline

  • Ease of use
  • Zipline comes “batteries included” as many common statistics like moving average and linear regression can be readily accessed from within a user-written algorithm.
  • Input of historical data and output of performance statistics are based on Pandas DataFrames to integrate nicely into the existing PyData ecosystem
  • Statistic and machine learning libraries like matplotlib, scipy, statsmodels, and sklearn support development, analysis, and visualization of state-of-the-art trading systems

Installation


Using pip

Assuming you have all required non-Python dependencies, you can install Zipline with pip via:

Using conda

Another way to install Zipline is via the conda package manager, which comes as part of Anaconda or can be installed via pip install conda.

Once set up, you can install Zipline from the Quantopian channel:


Basic structure

Zipline provides a particular structure for the code, which includes defining a few functions that run the algorithm over a dataset, as described below.

So, first we have to import the functions we need in the code. Every Zipline algorithm consists of two functions you have to define:

  • initialize(context)
  • handle_data(context, data)

Before the start of the algorithm, Zipline calls the initialize() function and passes in a context variable. context is a persistent namespace that allows you to store variables you need to access from one algorithm iteration to the next.

After the algorithm has been initialized, Zipline calls the handle_data() function once for each event. At every call, it passes the same context variable and an event frame called data containing the current trading bar with open, high, low, and close (OHLC) prices as well as volume for each stock.

All functions commonly used in the algorithm can be found in the zipline.api module. Here we are using order(arg1, arg2), which takes two arguments: a security object and a number specifying how many shares you would like to order (if negative, order() will sell/short shares). In this case, we want to order 10 shares of Apple at each iteration.

The second method, record(), allows you to save the value of a variable at each iteration. You provide it with a name for the variable together with the variable itself. After the algorithm has finished running, you can access all the variables you recorded; we will learn how to do that.

To run the algorithm, you need to call TradingAlgorithm(), which takes two arguments: the initialize function and the handle_data function. Then call the run method, passing as its argument the data on which the algorithm will run (data is a pandas DataFrame that stores the stock prices).

run() first calls the initialize() function, and then streams the historical stock price day-by-day through handle_data(). After each call to handle_data(), we instruct Zipline to order 10 shares of AAPL.
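The calling pattern described above can be imitated in a few lines of plain Python. This is a toy model, not Zipline itself: the run function, the Context class, and the bar dictionaries are stand-ins invented for illustration, and the two lines inside handle_data stand in for Zipline's order() and record().

```python
class Context(object):
    """Plain namespace standing in for Zipline's context object."""


def run(initialize, handle_data, bars):
    """Call initialize once, then handle_data once per bar, sharing one context."""
    context = Context()
    initialize(context)
    for bar in bars:
        handle_data(context, bar)
    return context


def initialize(context):
    context.shares_ordered = 0
    context.recorded_closes = []


def handle_data(context, data):
    context.shares_ordered += 10                    # stands in for order(security, 10)
    context.recorded_closes.append(data["close"])   # stands in for record(...)


bars = [{"close": 10.0}, {"close": 10.5}, {"close": 10.2}]   # made-up daily bars
result = run(initialize, handle_data, bars)
```

Because the same context object is passed to every call, values set in initialize() and updated in handle_data() persist across iterations, which is the behaviour the text describes.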

How to code Moving average crossover strategy with Zipline

Moving Averages

A moving average is the simple average of a security’s price over a defined number of time periods.


Moving average crossovers are a common way for traders to use moving averages. A crossover occurs when a faster moving average (i.e. a shorter-period moving average) crosses a slower moving average (i.e. a longer-period moving average). A cross above the slower average is considered a bullish crossover, and a cross below is considered a bearish crossover.
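As a plain-Python illustration of these two definitions (independent of Zipline), the sketch below computes a simple moving average and reports where the faster average crosses the slower one. The function names and the price series are made up for the example.

```python
def sma(prices, window):
    """Simple moving average; None until `window` prices are available."""
    averages = []
    for i in range(len(prices)):
        if i + 1 < window:
            averages.append(None)
        else:
            averages.append(sum(prices[i + 1 - window:i + 1]) / float(window))
    return averages


def crossovers(prices, short_window, long_window):
    """Return (index, 'bullish' | 'bearish') for each crossover of the two averages."""
    fast = sma(prices, short_window)
    slow = sma(prices, long_window)
    signals = []
    for i in range(1, len(prices)):
        if None in (fast[i - 1], slow[i - 1], fast[i], slow[i]):
            continue  # not enough history yet
        if fast[i - 1] <= slow[i - 1] and fast[i] > slow[i]:
            signals.append((i, "bullish"))
        elif fast[i - 1] >= slow[i - 1] and fast[i] < slow[i]:
            signals.append((i, "bearish"))
    return signals


prices = [5, 4, 3, 2, 3, 4, 5, 4, 3, 2]   # made-up prices: fall, rise, fall
signals = crossovers(prices, short_window=2, long_window=3)
```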

Now we will learn how to implement this strategy using Zipline. We start by importing the libraries and initializing the variables that will be used in the algorithm.

The code is divided into 5 parts

  • Initialization
  • Initialize method
  • handle_data method
  • Strategy logic
  • Run Algo



load_bars_from_yahoo() is the function that takes the stock and the time period for which you want to fetch the data. Here I am using SPY between 2011 and 2012; you can change this according to your needs.

Initialize method

Now we define the initialize function, in which we set the stock that we are dealing with; in our case, it is SPY.

handle_data method

handle_data() contains all the operations we want to perform; it is the main code of our algorithm. We need to calculate moving averages for different windows, and Zipline provides an inbuilt function mavg() that takes an integer defining the window size.

Zipline also automatically calculates current_price, portfolio_value, etc., so we can simply read these variables. In this algorithm, I have calculated current_positions, price, cash, portfolio_value, and the PnL.

Strategy logic

Now comes the logic that places a buy or sell order depending on a condition that compares the moving averages.

  1. If the short moving average is greater than the longer one and your current_positions is 0, calculate the number of shares and place an order.
  2. If the short moving average is smaller than the longer one and your current_positions is not 0, sell all the shares that you currently hold.
  3. If neither condition is satisfied, do nothing; just record the variables you need to save.
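The three conditions above can be written as one small decision function. This is a sketch outside Zipline: the function name decide is hypothetical, and sizing the buy as the whole number of shares that the available cash can afford is an assumption, since the text does not say how the number of shares is calculated.

```python
def decide(short_mavg, long_mavg, current_positions, cash, price):
    """Return (action, shares) following the three rules of the strategy."""
    if short_mavg > long_mavg and current_positions == 0:
        # Assumption: buy as many whole shares as the cash allows
        return ("buy", int(cash // price))
    if short_mavg < long_mavg and current_positions != 0:
        # Exit: sell everything currently held
        return ("sell", current_positions)
    # Neither rule fired: hold and just record variables
    return ("hold", 0)


entry = decide(short_mavg=11.0, long_mavg=10.0, current_positions=0, cash=1000.0, price=30.0)
exit_ = decide(short_mavg=9.0, long_mavg=10.0, current_positions=33, cash=10.0, price=29.0)
idle = decide(short_mavg=11.0, long_mavg=10.0, current_positions=33, cash=10.0, price=31.0)
```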


For running this algorithm, you need the following code:


You can also plot the graph using the plot() method.

Graph for the strategy

graph of moving crossover strategy using zipline

Snapshot of the screen using Zipline

Snapshot of screen in Zipline


We hope that you found this introduction to Zipline, and the implementation of a strategy with it, useful. In our next article, we will show you how to import and backtest data in CSV format using Zipline. For building technical indicators using Python, here are a few examples.

If you are a coder or a tech professional looking to start your own automated trading desk, learn automated trading from live interactive lectures by daily practitioners. The Executive Programme in Algorithmic Trading covers training modules like Statistics & Econometrics, Financial Computing & Technology, and Algorithmic & Quantitative Trading. Enroll now!
