Time Series Analysis: Working With Date-Time Data In Python

By Mandeep Kaur

In our previous blog on time series, “Time Series Analysis: An Introduction In Python”, we saw how to get time series data from online sources and perform the major analyses on it, including plotting, calculating moving averages and even forecasting. In this blog, we will discuss some important tools for analysing time series data. These tools are particularly helpful for traders in designing and backtesting trading strategies.

Traders deal with large amounts of historical data and need to explore and analyse such time series. The tools discussed here are used to prepare the data before doing the required analysis. We will mainly focus on how to deal with the dates and frequency of a time series. We will also discuss indexing, slicing and shifting operations on time series. For this blog, we will make extensive use of the ‘datetime’ library.

Let us begin with importing this library in our program.

#Importing the required modules

from datetime import datetime
from datetime import timedelta

Basic Tools for Date and Time

To begin with, let us save the current date and time in a variable ‘current_time’, as the code below does.

#Printing the current date and time

current_time = datetime.now()
current_time
Output: datetime.datetime(2018, 2, 14, 9, 52, 20, 625404)

We can compute the difference between two dates using datetime.

#Calculating the difference between two dates (14/02/2018 and 01/01/2018 09:15AM)

delta = datetime(2018,2,14)-datetime(2018,1,1,9,15)
delta
Output: datetime.timedelta(43, 53100)

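We can read the whole days and the remaining seconds of the output using its ‘days’ and ‘seconds’ attributes. Note that ‘seconds’ holds only the part left over after the whole days are removed (here 14 hours 45 minutes), not the total duration; delta.total_seconds() returns the total.

#Extracting the days component

delta.days
Output: 43

#Extracting the remaining seconds component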

delta.seconds
Output: 53100
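The ‘days’ and ‘seconds’ attributes are components of the difference, so neither one alone gives the full duration. A minimal sketch, reusing the two dates from the example above, of how the total could be obtained with the standard total_seconds() method (the variable names are illustrative):

```python
from datetime import datetime

# Same two dates as in the example above
delta = datetime(2018, 2, 14) - datetime(2018, 1, 1, 9, 15)

# Components: whole days, plus the seconds left over within the last day
days_part = delta.days          # 43
seconds_part = delta.seconds    # 53100 (14 hours 45 minutes)

# Total duration expressed in seconds and in hours
total_seconds = delta.total_seconds()   # 43 * 86400 + 53100
total_hours = total_seconds / 3600

print(days_part, seconds_part, total_seconds, total_hours)
```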

If we want to shift a date, we can use the timedelta class, which we have already imported.

#Shift a date using timedelta

my_date = datetime(2018,2,10)

#Shift the date by 10 days

my_date + timedelta(10)
Output: datetime.datetime(2018, 2, 20, 0, 0)

We can also multiply a timedelta by a number.

#Using multiples of a timedelta

my_date - 2*timedelta(10)
Output: datetime.datetime(2018, 1, 21, 0, 0)
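The first positional argument of timedelta is a number of days, which is why timedelta(10) above means ten days. As a small sketch, the same class also accepts keyword arguments such as weeks, hours and minutes, which make the unit explicit. The variable names below (one_week_later, market_open) are illustrative and not part of the original example.

```python
from datetime import datetime, timedelta

my_date = datetime(2018, 2, 10)

# Keyword arguments state the unit explicitly
one_week_later = my_date + timedelta(weeks=1)

# Several units can be combined in one timedelta
market_open = my_date + timedelta(hours=9, minutes=15)

print(one_week_later)  # 2018-02-17 00:00:00
print(market_open)     # 2018-02-10 09:15:00
```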

We have seen the ‘datetime’ and ‘timedelta’ data types of the datetime module. The major data types that are useful when analysing time series are summarised below.

Data Type    Description
date         Stores calendar dates (year, month and day) using the Gregorian calendar
time         Stores time as hours, minutes, seconds and microseconds
datetime     Stores both date and time (as already discussed in examples)
timedelta    Stores the difference between two datetime values (as already discussed in examples)
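The ‘date’ and ‘time’ types listed above were not used in the earlier examples. A short sketch of how they might be combined into a datetime and split apart again, using illustrative variable names:

```python
from datetime import date, time, datetime

trade_date = date(2018, 2, 14)      # calendar date only
trade_time = time(9, 15)            # time of day only

# Combine the two into a single datetime value
trade_datetime = datetime.combine(trade_date, trade_time)

# Split a datetime back into its date and time parts
print(trade_datetime)          # 2018-02-14 09:15:00
print(trade_datetime.date())   # 2018-02-14
print(trade_datetime.time())   # 09:15:00
```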

Conversion Between Strings And datetime

We can convert a datetime value to a string and save it in a string variable. The reverse can also be done: a string which represents a date can be converted to the datetime data type.

#Converting datetime to string

my_date1 = datetime(2018,2,14)
str(my_date1)
Output: '2018-02-14 00:00:00'

We can use the ‘strptime’ method to convert a string to datetime. Its second argument is a format string that describes how the date is laid out in the text.

#Converting a string to datetime

datestr = '2018-02-14'
datetime.strptime(datestr, '%Y-%m-%d')
Output: datetime.datetime(2018, 2, 14, 0, 0)
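str() always produces the default layout. When a specific layout is needed, the ‘strftime’ method takes the same format codes as ‘strptime’. The sketch below assumes a day/month/year layout purely for illustration:

```python
from datetime import datetime

my_date1 = datetime(2018, 2, 14)

# datetime -> string with an explicit format (day/month/year here)
formatted = my_date1.strftime('%d/%m/%Y')
print(formatted)  # 14/02/2018

# string -> datetime: the format must describe the string exactly
parsed = datetime.strptime('14/02/2018 09:15', '%d/%m/%Y %H:%M')
print(parsed)     # 2018-02-14 09:15:00
```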

We can also use pandas to handle dates. Let us first import pandas.

#Importing pandas

import pandas as pd

The ‘to_datetime’ function in pandas converts date strings to dates.

#Using pandas to parse dates

datestrs = ['1/14/2018', '2/14/2018']
pd.to_datetime(datestrs)
Output: DatetimeIndex(['2018-01-14', '2018-02-14'], dtype='datetime64[ns]', freq=None)

In pandas, a missing or NA date-time value is represented as NaT (Not a Time).
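As a brief illustration, the list below uses None to stand in for a missing date (an assumption made for this sketch); pandas turns it into NaT, which can then be detected with pd.isna.

```python
import pandas as pd

# None stands in for a missing date in this illustrative list
idx = pd.to_datetime(['2018-01-14', None])
print(idx)

# Flag the missing entries
missing = pd.isna(idx)
print(missing)
```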

Indexing And Slicing Of A Time Series

To understand various operations on a time series, let us create a time series using random numbers.

#Creating a time series with random numbers

import numpy as np
dates = [datetime(2011, 1, 2), datetime(2011, 1, 5), datetime(2011, 1, 7),
         datetime(2011, 1, 8), datetime(2011, 1, 10), datetime(2011, 1, 12)]
ts = pd.Series(np.random.randn(6), index=dates)
ts

Output:
2011-01-02   0.888329
2011-01-05  -0.152267
2011-01-07   0.854689
2011-01-08   0.680432
2011-01-10   0.123229
2011-01-12  -1.503613
dtype: float64

The elements of this time series can be accessed like those of any other pandas series, using the index as shown.

ts['01/02/2011'] or ts['20110102'] will give the same output 0.888329

The slicing is also similar to what we have for other pandas series.

#Slicing the time series

ts[datetime(2011,1,7):]
Output:
2011-01-07 0.854689
2011-01-08 0.680432
2011-01-10 0.123229
2011-01-12 -1.503613
dtype: float64
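Slices can also be written with date strings. The sketch below rebuilds the series with fixed values (an assumption, so that the results are reproducible) and uses .loc; on a sorted date index both end points are included, and they do not have to be present in the index.

```python
from datetime import datetime
import pandas as pd

dates = [datetime(2011, 1, 2), datetime(2011, 1, 5), datetime(2011, 1, 7),
         datetime(2011, 1, 8), datetime(2011, 1, 10), datetime(2011, 1, 12)]
# Fixed illustrative values instead of random numbers
ts = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], index=dates)

# Slice between two dates given as strings; both end points are included
window = ts.loc['2011-01-05':'2011-01-08']
print(window)

# The end points need not be present in the index
partial = ts.loc['2011-01-06':'2011-01-11']
print(partial)
```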

Duplicate Indices in Time Series

Sometimes your time series may contain duplicated indices. Consider the below time series.

#Creating a time series with duplicated indices

datesdup = [datetime(2018, 1, 1), datetime(2018, 1, 2), datetime(2018, 1, 2), datetime(2018, 1, 2), datetime(2018, 1, 3)]
dup_ts = pd.Series(np.random.randn(5), index=datesdup)
dup_ts
Output:
2018-01-01 -0.471411
2018-01-02 0.667770
2018-01-02 -0.010174
2018-01-02 -0.699517
2018-01-03 -0.611886
dtype: float64

In the above time series, we can see that ‘2018-01-02’ is repeated thrice. We can check this using the ‘is_unique’ property of the index.

dup_ts.index.is_unique
Output: False

We can aggregate the records with the same index using ‘groupby’ functionality.

grouped=dup_ts.groupby(level=0)

We can now use mean, count or the sum of those records as per our requirement.

grouped.mean()
Output:
2018-01-01 -0.471411
2018-01-02 -0.013973
2018-01-03 -0.611886
dtype: float64

grouped.count()
Output:
2018-01-01 1
2018-01-02 3
2018-01-03 1
dtype: int64

grouped.sum()
Output:
2018-01-01 -0.471411
2018-01-02 -0.041920
2018-01-03 -0.611886
dtype: float64
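The aggregation can be checked by hand when the values are fixed rather than random. The sketch below uses made-up values for that purpose. It also shows one alternative that the examples above do not cover: keeping only the first record for each timestamp with index.duplicated.

```python
from datetime import datetime
import pandas as pd

datesdup = [datetime(2018, 1, 1), datetime(2018, 1, 2), datetime(2018, 1, 2),
            datetime(2018, 1, 2), datetime(2018, 1, 3)]
# Fixed illustrative values so the aggregates can be checked by hand
dup_ts = pd.Series([1.0, 2.0, 4.0, 6.0, 5.0], index=datesdup)

# Aggregate records that share a timestamp
means = dup_ts.groupby(level=0).mean()
print(means)   # 2018-01-02 becomes (2 + 4 + 6) / 3 = 4.0

# Alternatively, keep only the first record for each timestamp
first_only = dup_ts[~dup_ts.index.duplicated(keep='first')]
print(first_only)
```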

Data Shifting

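We can shift the values of the time series along its index using the ‘shift’ method. The index itself stays unchanged, and the positions left empty at the start are filled with NaN.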

#Shifting the time series
ts.shift(2)
Output:
2011-01-02 NaN
2011-01-05 NaN
2011-01-07 0.888329
2011-01-08 -0.152267
2011-01-10 0.854689
2011-01-12 0.680432
dtype: float64
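By default, shift moves the values and leaves the dates where they are. If the dates themselves should move, shift also accepts a freq argument. A minimal sketch with a shortened series and fixed values (both assumptions made for readability):

```python
from datetime import datetime
import pandas as pd

dates = [datetime(2011, 1, 2), datetime(2011, 1, 5), datetime(2011, 1, 7)]
ts = pd.Series([1.0, 2.0, 3.0], index=dates)

# Without freq: values move down by one position, the index stays put
by_position = ts.shift(1)
print(by_position)

# With freq: the index moves forward by two calendar days, values are unchanged
by_time = ts.shift(2, freq='D')
print(by_time)
```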

Summary

In this blog, we have seen some basic functionality that is of great use in analyzing a time series. We have seen how to work with dates and convert them from one format to another. We have also covered indexing, slicing and shifting operations, as well as handling duplicate indices.

Disclaimer: All investments and trading in the stock market involve risk. Any decisions to place trades in the financial markets, including trading in stock or options or other financial instruments is a personal decision that should only be made after thorough research, including a personal risk and financial assessment and the engagement of professional assistance to the extent you believe necessary. The trading strategies or related information mentioned in this article is for informational purposes only.