By Mandeep Kaur
In our previous blog on time series, “Time Series Analysis: An Introduction In Python”, we saw how to get time series data from online sources and analyse it, including plotting, calculating moving averages and even forecasting. In this blog, we will discuss some important tools for analysing time series data. These tools are useful to traders when designing and backtesting trading strategies.
Traders deal with large amounts of historical data and need to analyse such time series. The tools covered here are used to prepare the data before the required analysis. We will mainly focus on how to deal with the dates and frequency of a time series. We will also discuss indexing, slicing and shifting operations on time series. For this blog, we will make extensive use of the ‘datetime’ library.
Let us begin with importing this library in our program.
#Importing the required modules
from datetime import datetime
from datetime import timedelta
Basic Tools for Date and Time
To begin with, let us save the current date and time in a variable ‘current_time’. The code below does this.
#Saving and displaying the current date and time
current_time = datetime.now()
current_time
Output:
datetime.datetime(2018, 2, 14, 9, 52, 20, 625404)
We can compute the difference between two dates using datetime.
#Calculating the difference between two dates (14/02/2018 and 01/01/2018 09:15AM)
delta = datetime(2018,2,14)-datetime(2018,1,1,9,15)
delta
Output:
datetime.timedelta(43, 53100)
We can read the whole days and the leftover seconds of the difference using:
#Whole days in the difference
delta.days
Output:
43
#Seconds left over after the whole days (not the total number of seconds)
delta.seconds
Output:
53100
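Note that ‘delta.seconds’ holds only the part of the difference that remains after the whole days are counted. If the full duration in seconds is needed, the standard ‘total_seconds’ method of timedelta can be used. A minimal sketch, reusing the same two dates:

```python
from datetime import datetime

delta = datetime(2018, 2, 14) - datetime(2018, 1, 1, 9, 15)

# .days and .seconds are two components of the same duration
print(delta.days)      # 43
print(delta.seconds)   # 53100

# total_seconds() expresses the whole duration in seconds:
# 43 days * 86400 seconds + 53100 seconds
total = delta.total_seconds()
print(total)           # 3768300.0
```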
If we want to shift a date, we can use the ‘timedelta’ class, which we have already imported.
#Shift a date using timedelta
my_date = datetime(2018,2,10)
#Shift the date by 10 days
my_date + timedelta(10)
Output:
datetime.datetime(2018, 2, 20, 0, 0)
We can also use multiples of a timedelta object.
#Using multiples of a timedelta object
my_date - 2*timedelta(10)
Output:
datetime.datetime(2018, 1, 21, 0, 0)
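The first positional argument of timedelta is a number of days, but the class also accepts keyword arguments such as ‘weeks’, ‘hours’ and ‘minutes’. The sketch below is an illustration only; the shift of one week, nine hours and fifteen minutes is an arbitrary choice, not a value from the examples above.

```python
from datetime import datetime, timedelta

my_date = datetime(2018, 2, 10)

# Keyword arguments make the unit of each shift explicit
shifted = my_date + timedelta(weeks=1, hours=9, minutes=15)
print(shifted)   # 2018-02-17 09:15:00
```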
We have seen the ‘datetime’ and ‘timedelta’ data types of the datetime module. The table below briefly describes the main data types that are useful when analysing time series.
Data Type | Description |
date | Stores calendar dates (year, month and day) using the Gregorian calendar |
time | Stores time as hours, minutes, seconds and microseconds |
datetime | Stores both date and time (as already discussed in examples) |
timedelta | Stores the difference between two datetime values (as already discussed in examples) |
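The ‘date’ and ‘time’ types from the table have not appeared in the examples so far. As a small sketch (the values are chosen only for illustration), they can be created on their own and joined into a datetime with ‘datetime.combine’:

```python
from datetime import date, time, datetime

d = date(2018, 2, 14)   # calendar date only
t = time(9, 15)         # time of day only

# Combine the two into a single datetime value
dt = datetime.combine(d, t)
print(dt)               # 2018-02-14 09:15:00

# A datetime can be split back into its date and time parts
print(dt.date(), dt.time())
```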
Conversion Between Strings And datetime
We can convert a datetime value to a string and save it in a string variable. The reverse is also possible: a string that represents a date can be converted to the datetime data type.
#Converting datetime to string
my_date1 = datetime(2018,2,14)
str(my_date1)
Output:
'2018-02-14 00:00:00'
We can use the ‘strptime’ method to convert a string to datetime.
#Converting a string to datetime
datestr = '2018-02-14'
datetime.strptime(datestr, '%Y-%m-%d')
Output:
datetime.datetime(2018, 2, 14, 0, 0)
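When a layout other than the default is needed, the ‘strftime’ method formats a datetime with the same format codes that ‘strptime’ uses for parsing. The day/month/year layout below is an assumed example, not a format taken from the data above.

```python
from datetime import datetime

my_date1 = datetime(2018, 2, 14)

# Format a datetime as day/month/year
formatted = my_date1.strftime('%d/%m/%Y')
print(formatted)   # 14/02/2018

# Parse a string that carries both a date and a time
parsed = datetime.strptime('14/02/2018 09:15', '%d/%m/%Y %H:%M')
print(parsed)      # 2018-02-14 09:15:00
```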
We can also use pandas to handle dates. Let us first import pandas.
#Importing pandas
import pandas as pd
The ‘to_datetime’ function in pandas converts date strings to dates.
#Using pandas to parse dates
datestrs = ['1/14/2018', '2/14/2018']
pd.to_datetime(datestrs)
Output:
DatetimeIndex(['2018-01-14', '2018-02-14'], dtype='datetime64[ns]', freq=None)
In pandas, a missing or NA date-time value is represented as NaT (Not a Time).
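As a short sketch of how NaT appears in practice, we can append a missing entry (None) to the list of date strings before parsing. The explicit ‘format’ argument is an addition made here so that the month/day order is unambiguous.

```python
import pandas as pd

datestrs = ['1/14/2018', '2/14/2018']

# The missing entry (None) is parsed as NaT
idx = pd.to_datetime(datestrs + [None], format='%m/%d/%Y')
print(idx)

# pd.isna flags the NaT entry as missing
missing = pd.isna(idx)
print(missing)
```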
Indexing And Slicing Of A Time Series
To understand various operations on a time series, let us create a time series using random numbers.
#Creating a time series with random numbers
import numpy as np
dates = [datetime(2011, 1, 2), datetime(2011, 1, 5), datetime(2011, 1, 7),
datetime(2011, 1, 8), datetime(2011, 1, 10), datetime(2011, 1, 12)]
ts = pd.Series(np.random.randn(6), index=dates)
ts
Output:
2011-01-02    0.888329
2011-01-05   -0.152267
2011-01-07    0.854689
2011-01-08    0.680432
2011-01-10    0.123229
2011-01-12   -1.503613
dtype: float64
The elements of this time series can be accessed through the index, as with any other pandas series.
ts['01/02/2011'] or ts['20110102'] will give the same output, 0.888329
Slicing is also similar to slicing any other pandas series.
#Slicing the time series
ts[datetime(2011,1,7):]
Output:
2011-01-07    0.854689
2011-01-08    0.680432
2011-01-10    0.123229
2011-01-12   -1.503613
dtype: float64
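Slices can also be written with date strings, and the bounds do not have to be present in the index as long as the index is sorted. The sketch below rebuilds the series with fixed values 0.0 to 5.0 instead of random numbers (an assumption made here so the result is reproducible) and uses ‘.loc’ for label-based selection.

```python
from datetime import datetime
import numpy as np
import pandas as pd

dates = [datetime(2011, 1, 2), datetime(2011, 1, 5), datetime(2011, 1, 7),
         datetime(2011, 1, 8), datetime(2011, 1, 10), datetime(2011, 1, 12)]
# Fixed values instead of random numbers, for a reproducible illustration
ts = pd.Series(np.arange(6, dtype=float), index=dates)

# Both ends of a label-based slice are inclusive;
# 2011-01-06 itself is not in the index, which is fine for a sorted index
window = ts.loc['2011-01-06':'2011-01-10']
print(window)

# A partial date string selects every entry in that month
january = ts.loc['2011-01']
print(len(january))
```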
Duplicate Indices in Time Series
Sometimes your time series may contain duplicated indices. Consider the below time series.
#Creating a time series with duplicated indices
datesdup = [datetime(2018, 1, 1), datetime(2018, 1, 2), datetime(2018, 1, 2),
datetime(2018, 1, 2), datetime(2018, 1, 3)]
dup_ts = pd.Series(np.random.randn(5), index=datesdup)
dup_ts
Output:
2018-01-01   -0.471411
2018-01-02    0.667770
2018-01-02   -0.010174
2018-01-02   -0.699517
2018-01-03   -0.611886
dtype: float64
In the above time series, we can see that ‘2018-01-02’ appears three times. We can check for duplicates using the ‘is_unique’ property of the index.
dup_ts.index.is_unique
Output:
False
We can aggregate the records with the same index using ‘groupby’ functionality.
grouped=dup_ts.groupby(level=0)
We can now use mean, count or the sum of those records as per our requirement.
grouped.mean()
Output:
2018-01-01   -0.471411
2018-01-02   -0.013973
2018-01-03   -0.611886
dtype: float64
grouped.count()
Output:
2018-01-01    1
2018-01-02    3
2018-01-03    1
dtype: int64
grouped.sum()
Output:
2018-01-01   -0.471411
2018-01-02   -0.041920
2018-01-03   -0.611886
dtype: float64
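Aggregation is one way to handle duplicates. Another option, sketched below as an alternative rather than as part of the approach above, is to keep only the first record for each timestamp using ‘index.duplicated’. The values 1.0 to 5.0 are placeholders chosen so the result is easy to follow.

```python
from datetime import datetime
import pandas as pd

datesdup = [datetime(2018, 1, 1), datetime(2018, 1, 2), datetime(2018, 1, 2),
            datetime(2018, 1, 2), datetime(2018, 1, 3)]
# Placeholder values instead of random numbers
dup_ts = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0], index=datesdup)

# duplicated(keep='first') marks every repeat after the first occurrence
is_repeat = dup_ts.index.duplicated(keep='first')

# Keep only the rows that are not repeats
first_only = dup_ts[~is_repeat]
print(first_only)
print(first_only.index.is_unique)   # True
```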
Data Shifting
We can shift the values of a time series along its index using the ‘shift’ method. The index itself stays the same, and the positions left empty by the shift are filled with NaN.
#Shifting the time series
ts.shift(2)
Output:
2011-01-02         NaN
2011-01-05         NaN
2011-01-07    0.888329
2011-01-08   -0.152267
2011-01-10    0.854689
2011-01-12    0.680432
dtype: float64
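A common use of ‘shift’ in trading is to compare each value with the previous one, for example to compute simple returns. The sketch below uses a small made-up ‘prices’ series (hypothetical data, not from this blog). It also shows a negative shift, and the ‘freq’ argument, which moves the index instead of the values.

```python
import numpy as np
import pandas as pd

# Hypothetical daily prices, for illustration only
prices = pd.Series([100.0, 102.0, 101.0, 105.0],
                   index=pd.date_range('2011-01-03', periods=4, freq='D'))

# Simple return: today's price relative to the previous day's price
returns = prices / prices.shift(1) - 1
print(returns)

# A negative shift pulls future values back (next day's price)
next_price = prices.shift(-1)
print(next_price)

# With freq, the values stay put and the index moves forward by one day
moved = prices.shift(1, freq='D')
print(moved)
```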
Summary
In this blog, we have seen some basic functionality that is useful when analysing a time series. We have seen how to work with dates and convert them from one format to another. We have also covered indexing, slicing, handling duplicate indices and shifting.
Next Step
If you want to learn various aspects of Algorithmic trading then check out the Executive Programme in Algorithmic Trading (EPAT™). The course covers training modules like Statistics & Econometrics, Financial Computing & Technology, and Algorithmic & Quantitative Trading. EPAT™ equips you with the required skill sets to be a successful trader. Enroll now!
Disclaimer: All investments and trading in the stock market involve risk. Any decision to place trades in the financial markets, including trading in stock or options or other financial instruments, is a personal decision that should only be made after thorough research, including a personal risk and financial assessment and the engagement of professional assistance to the extent you believe necessary. The trading strategies or related information mentioned in this article are for informational purposes only.