Visualizing and Comparing Time Series Data - Market Analysis with Python
Introduction
Let's talk about charting multiple financial time series. A lot of people will simply overlay them using different axes. But that can be misleading. Let's look at a much better way using some cool Pandas functions such as the Percentage Change, Rolling, and CumSum.
Code
from IPython.display import Image
Image(filename='viralml-logos.png')
ViralML.com - Comparing Financial Time Series Visually with Pandas Percentage Change - Hands-On Market Analysis with Python¶
Let's play around with some cool Pandas functions to better visualize and compare different data series. We'll look at the
Percentage Change pandas.DataFrame.pct_change¶
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.pct_change.html
Rolling Mean (pandas.core.window.Rolling.mean¶
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.core.window.Rolling.mean.html
Cumulative Sum (pandas.DataFrame.cumsum)¶
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.cumsum.html
"Today's scary (Bloomberg) chart! #equities v Manufacturing #PMI."¶
We will use as a case study, a recent Tweet from Jeroen Blokland (@jsblokland) titled "Today's scary (Bloomberg) chart! #equities v Manufacturing #PMI." I encourage anyone interested in market news to follow people that actively and cunningly post on Twitter relevant news. They are your eyes in the market, its entirely free, and if they are good, can be very interesting.
https://twitter.com/jsblokland/status/1197131574413799425
The two data points he mentions in the tweet are available for free on Quandl.com - let's go get them!
Data Needed¶
MSCI World Index - The Morgan Stanley Capital International World Index is a broad global equity index that represents large and mid-cap equity performance across all 23 developed markets countries. It covers approximately 85% of the free float-adjusted market capitalization in each country.
PMI Composite Index - Manufacturing Purchasing Managers' Index
https://www.quandl.com/data/ISM/MAN_PMI-PMI-Composite-Index
HANG SENG INDEX (^HSI)
%matplotlib inline
import matplotlib
import matplotlib.pyplot as plt
import io, base64, os, json, re, math
import pandas as pd
import numpy as np
import datetime
import warnings
warnings.filterwarnings('ignore')
path_to_market_data = '/Users/manuel/Documents/financial-research/market-data/2019-11-21/'
# MSCI World Index Futures
msci_world_index_df = pd.read_csv(path_to_market_data + 'CHRIS-EUREX_FMWO1.csv')
msci_world_index_df['Date'] = pd.to_datetime(msci_world_index_df['Date'])
print(np.min(msci_world_index_df['Date'] ),np.max(msci_world_index_df['Date'] ))
msci_world_index_df = msci_world_index_df.sort_values('Date', ascending=True)
msci_world_index_df.tail()
# PMI Composite Index
ism_pmi_df = pd.read_csv(path_to_market_data + 'ISM-MAN_PMI.csv')
ism_pmi_df['Date'] = pd.to_datetime(ism_pmi_df['Date'])
print(np.min(ism_pmi_df['Date'] ),np.max(ism_pmi_df['Date'] ))
ism_pmi_df = ism_pmi_df.sort_values('Date', ascending=True)
ism_pmi_df.tail()
hsi_df = pd.read_csv(path_to_market_data + '^HSI.csv')
hsi_df['Date'] = pd.to_datetime(hsi_df['Date'])
hsi_df = hsi_df.sort_values('Date', ascending=True)
print(min(hsi_df['Date']), max(hsi_df['Date']))
print(hsi_df.shape)
hsi_df.head()
Raw look¶
cut_off_date = "2015-08-01"
msci_world_index_df_tmp = msci_world_index_df.copy()
ism_pmi_df_tmp = ism_pmi_df.copy()
msci_world_index_df_tmp = msci_world_index_df_tmp[msci_world_index_df_tmp['Date'] >= cut_off_date]
ism_pmi_df_tmp = ism_pmi_df_tmp[ism_pmi_df_tmp['Date'] >= cut_off_date]
fig, ax = plt.subplots(figsize=(16, 8))
ax.set_facecolor('xkcd:black')
plt.plot(msci_world_index_df_tmp['Date'],
msci_world_index_df_tmp['Settle'],
color='gray',
linewidth=2,
label='MSCI World Index')
plt.grid()
plt.legend(loc='upper left')
fig.suptitle('MSCI World Index VS ISM Manufacturing PMI', fontsize=25, fontweight='bold')
ax.twinx()
plt.plot(ism_pmi_df_tmp['Date'], ism_pmi_df_tmp['PMI'] ,
color='#345d8f',
linewidth=2,
label='ISM/PMI')
plt.legend(loc='lower left')
cut_off_date = "2015-01-01"
msci_world_index_df_tmp = msci_world_index_df.copy()
ism_pmi_df_tmp = ism_pmi_df.copy()
msci_world_index_df_tmp = msci_world_index_df_tmp[msci_world_index_df_tmp['Date'] >= cut_off_date]
ism_pmi_df_tmp = ism_pmi_df_tmp[ism_pmi_df_tmp['Date'] >= cut_off_date]
fig, ax = plt.subplots(figsize=(16, 8))
ax.set_facecolor('xkcd:black')
plt.plot(msci_world_index_df_tmp['Date'],
msci_world_index_df_tmp['Settle'].rolling(window=20, min_periods=1).mean(),
color='lightgrey',
linewidth=2,
label='MSCI World Index')
plt.grid()
plt.legend(loc='upper left')
fig.suptitle('MSCI World Index VS ISM Manufacturing PMI', fontsize=25, fontweight='bold')
ax.twinx()
plt.plot(ism_pmi_df_tmp['Date'], ism_pmi_df_tmp['PMI'] ,
color='#345d8f',
linewidth=2,
label='ISM/PMI')
plt.legend(loc='lower left')
# But b
Join both datasets together¶
cut_off_date = "2015-08-01"
msci_world_index_df_tmp = msci_world_index_df[['Date', 'Settle']].copy()
msci_world_index_df_tmp.columns = ['Date', 'MSCI_Settle']
ism_pmi_df_tmp = ism_pmi_df.copy()
msci_world_index_df_tmp = msci_world_index_df_tmp[msci_world_index_df_tmp['Date'] >= cut_off_date]
ism_pmi_df_tmp = ism_pmi_df_tmp[ism_pmi_df_tmp['Date'] >= cut_off_date]
# join both datasets together
together_df = pd.merge(msci_world_index_df_tmp,
ism_pmi_df_tmp,
on= ['Date'], how='left') # left join because PMI is a montly number
# last valid observation forward
together_df = together_df.fillna(method='ffill')
# drop NAs
together_df = together_df.dropna(axis=0)
together_df = together_df.sort_values('Date', ascending=True)
together_df.head()
Quick Peek Together¶
fig, ax = plt.subplots(figsize=(16, 8))
ax.set_facecolor('xkcd:black')
plt.plot(together_df['Date'],
together_df['MSCI_Settle'],
color='gray',
linewidth=2,
label='MSCI World Index')
plt.grid()
plt.legend(loc='upper left')
fig.suptitle('MSCI World Index VS ISM Manufacturing PMI', fontsize=25, fontweight='bold')
plt.plot(together_df['Date'], together_df['PMI'] ,
color='#345d8f',
linewidth=2,
label='PMI')
plt.legend(loc='lower left')
together_df['MSCI_Settle_PctChng'] = together_df['MSCI_Settle'].pct_change()
together_df['PMI_PctChng'] = together_df['PMI'].pct_change()
fig, ax = plt.subplots(figsize=(16, 8))
ax.set_facecolor('xkcd:black')
plt.plot(together_df['Date'],
together_df['MSCI_Settle_PctChng'],
color='gray',
linewidth=2,
label='MSCI World Index')
plt.grid()
plt.legend(loc='upper left')
fig.suptitle('MSCI World Index VS ISM Manufacturing PMI', fontsize=25, fontweight='bold')
plt.plot(together_df['Date'], together_df['PMI_PctChng'] ,
color='#345d8f',
linewidth=2,
label='PMI')
plt.legend(loc='lower left')
fig, ax = plt.subplots(figsize=(16, 8))
ax.set_facecolor('xkcd:black')
plt.plot(together_df['Date'],
together_df['MSCI_Settle_PctChng'].cumsum(),
color='gray',
linewidth=2,
label='MSCI World Index')
plt.grid()
plt.legend(loc='upper left')
fig.suptitle('MSCI World Index VS ISM Manufacturing PMI', fontsize=25, fontweight='bold')
plt.plot(together_df['Date'], together_df['PMI_PctChng'].cumsum(),
color='#345d8f',
linewidth=2,
label='PMI')
plt.legend(loc='lower left')
Google Correlate¶
cut_off_date = "2014-06-01"
msci_world_index_df_tmp = msci_world_index_df.copy()
ism_pmi_df_tmp = ism_pmi_df.copy()
hsi_df_tmp = hsi_df.copy()
msci_world_index_df_tmp = msci_world_index_df_tmp[msci_world_index_df_tmp['Date'] >= cut_off_date]
ism_pmi_df_tmp = ism_pmi_df_tmp[ism_pmi_df_tmp['Date'] >= cut_off_date]
ghc_df_tmp = ghc_df_tmp[ghc_df_tmp['Date'] >= cut_off_date]
hsi_df_tmp = hsi_df_tmp[hsi_df_tmp['Date'] >= cut_off_date]
fig, ax = plt.subplots(figsize=(16, 8))
ax.set_facecolor('xkcd:black')
plt.plot(msci_world_index_df_tmp['Date'],
msci_world_index_df_tmp['Settle'],
color='gray',
linewidth=2,
label='MSCI World Index')
plt.grid()
plt.legend(loc='upper left')
fig.suptitle('MSCI World Index VS ISM Manufacturing PMI', fontsize=25, fontweight='bold')
ax.twinx()
plt.plot(ism_pmi_df_tmp['Date'], ism_pmi_df_tmp['PMI'] ,
color='#345d8f',
linewidth=2,
label='PMI')
plt.legend(loc='lower left')
ax.twinx()
plt.plot(hsi_df_tmp['Date'], hsi_df_tmp['Adj Close'] ,
color='yellow',
linewidth=2,
label='HANG SENG INDEX')
plt.legend(loc='lower left')
Show Notes
(pardon typos and formatting -these are the notes I use to make the videos)
Let's talk about charting multiple financial time series. A lot of people will simply overlay them using different axes. But that can be misleading. Let's look at a much better way using some cool Pandas functions such as the Percentage Change, Rolling, and CumSum.