Basic Statistics for Trading Strategies (Part 1) – Historical Data Analysis


Basic Statistics for Trading Strategies - Part 1

by Anupriya Gupta

This series of posts is to get our readers to start using statistics and data analysis while trading. This post tries to answer the basic question: “how do you analyse stock’s historical data and use it for strategy building?” In this post, we will focus on using Excel to do the analysis. In the subsequent post(s), We will go through the basic codes in R which you can use to try out the similar analysis in R.

Summary Statistics

Dataset: MARUTI SUZUKI India Limited- Daily data from Jan 01, 2013 to Dec 31, 2013. Public link to download the corresponding dataset:

This is a time series data set with daily closing prices and volumes for Maruti. We’ll base our analysis on the closing prices for this stock. We’ll just touch upon the very basic statistical properties for the daily stock prices in this post, which would be followed by calculating beta and a simple moving average for this stock later.

Mean: Mean is the average of a data set. The average of the daily closing price for Maruti in 2013 was 1522.47. However, for the last day of the year, 31-Dec-2013, the closing price was 1763.9. In Excel, you can use the function to calculate the mean.

Mode: Mode gives us the most frequently occurring value in the data set. This measure doesn’t really make much sense for closing prices or other continuous data. A mode is especially useful when you want to plot histograms and visualize the frequency distribution. In excel, you can calculate the mode for a given dataset using a function.

Median: Median gives the middle value of the increasingly sorted data set. It is especially useful in circumstances when the dataset has a few values at the extreme ends. For example, the salary in a company might vary between Rs. 2000 to Rs. 1,00,00,000. The mean would be influenced by a few extreme situations and might get skewed negatively or positively. To avoid this, a median is used which gives the middle point of the distribution. In excel, you can calculate median for a given dataset using a function.

Standard Deviation: Volatility of the stock can be calculated using standard deviation. The volatility of Maruti stock in given period is 141.26. The standard deviation can be for the population as well as for the sample. In quantitative trading strategy modelling, you’ll be using sample standard deviation most of the times. Excel formula for calculating the sample standard deviation is.

Range: Range simply gives the difference between the min and max values of the data set. Minimum and maximum values for Maruti closing prices in 2013 are 1236.35 and 1809.65. However, it makes more sense to consider the intraday high and low as well to get the high and low for the stock in a year which gives us the range as 1126 to 1830.

The graph below shows all the measures in a graph so that you get a sense of all the values.
Trading Range













You can notice how the interval (mean -2*SD, mean + 2*SD) captures most of the data points. Statistically speaking 95.46% data is within that interval assuming a normal distribution.

Moving average and standard deviations are used to create Bollinger bands too, which is one of the many ways of predicting the range for the stock price technical analysis and you can also form various trading strategies based on Bollinger bands.

In part 2 of this series, we will try to understand distributions. It will also try to answer the basic question: “why is statistics necessary for strategy building?”

In part 3 of this series, we will try to understand the relationship between a stock and a market index. The terms we will understand are regression, correlation and co-integration.

Leave a Reply

Your email address will not be published. Required fields are marked *