Get the "Applied Data Science Edge"!

The ViralML School

Fundamental Market Analysis with Python - Find Your Own Answers On What Is Going on in the Financial Markets

Web Work

Python Web Work - Prototyping Guide for Maker

Use HTML5 Templates, Serve Dynamic Content, Build Machine Learning Web Apps, Grow Audiences & Conquer the World!

Hot off the Press!

The Little Book of Fundamental Market Indicators

My New Book: "The Little Book of Fundamental Analysis: Hands-On Market Analysis with Python" is Out!

Visualizing and Comparing Time Series Data - Market Analysis with Python

Introduction

Let's talk about charting multiple financial time series. A lot of people will simply overlay them using different axes. But that can be misleading. Let's look at a much better way using some cool Pandas functions such as the Percentage Change, Rolling, and CumSum.



If you liked it, please share it:

Code

ViralML-Visualizing-and-Comparing-Time-Series-Data-Market-Analysis-With-Python
In [48]:
from IPython.display import Image
Image(filename='viralml-logos.png')
Out[48]:

ViralML.com - Comparing Financial Time Series Visually with Pandas Percentage Change - Hands-On Market Analysis with Python

Let's play around with some cool Pandas functions to better visualize and compare different data series. We'll look at the

Percentage Change pandas.DataFrame.pct_change

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.pct_change.html

Rolling Mean (pandas.core.window.Rolling.mean

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.core.window.Rolling.mean.html

Cumulative Sum (pandas.DataFrame.cumsum)

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.cumsum.html

"Today's scary (Bloomberg) chart! #equities v Manufacturing #PMI."

We will use as a case study, a recent Tweet from Jeroen Blokland (@jsblokland) titled "Today's scary (Bloomberg) chart! #equities v Manufacturing #PMI." I encourage anyone interested in market news to follow people that actively and cunningly post on Twitter relevant news. They are your eyes in the market, its entirely free, and if they are good, can be very interesting.

https://twitter.com/jsblokland/status/1197131574413799425

The two data points he mentions in the tweet are available for free on Quandl.com - let's go get them!

Data Needed

MSCI World Index - The Morgan Stanley Capital International World Index is a broad global equity index that represents large and mid-cap equity performance across all 23 developed markets countries. It covers approximately 85% of the free float-adjusted market capitalization in each country.

https://www.quandl.com/data/CHRIS/EUREX_FMWO1-MSCI-World-Index-Futures-Continuous-Contract-1-FMWO1-Front-Month

PMI Composite Index - Manufacturing Purchasing Managers' Index

https://www.quandl.com/data/ISM/MAN_PMI-PMI-Composite-Index

HANG SENG INDEX (^HSI)

https://finance.yahoo.com/quote/%5EHSI

In [52]:
%matplotlib inline
import matplotlib
import matplotlib.pyplot as plt
import io, base64, os, json, re, math
import pandas as pd
import numpy as np
import datetime
import warnings
warnings.filterwarnings('ignore')

path_to_market_data = '/Users/manuel/Documents/financial-research/market-data/2019-11-21/'
In [5]:
# MSCI World Index Futures 
msci_world_index_df = pd.read_csv(path_to_market_data + 'CHRIS-EUREX_FMWO1.csv')
msci_world_index_df['Date'] = pd.to_datetime(msci_world_index_df['Date']) 
print(np.min(msci_world_index_df['Date'] ),np.max(msci_world_index_df['Date'] ))
msci_world_index_df = msci_world_index_df.sort_values('Date', ascending=True) 
msci_world_index_df.tail()
2013-09-09 00:00:00 2019-11-20 00:00:00
Out[5]:
Date Open High Low Settle Volume Prev. Day Open Interest
4 2019-11-14 6632.0 6646.0 6616.0 6635.0 2531.0 119231.0
3 2019-11-15 6671.0 6689.0 6651.0 6681.0 1639.0 119328.0
2 2019-11-18 6711.0 6715.0 6676.0 6692.0 50318.0 165649.0
1 2019-11-19 6702.0 6722.0 6679.0 6689.0 3242.0 166669.0
0 2019-11-20 6675.0 6682.0 6633.0 6679.0 3389.0 166976.0
In [50]:
# PMI Composite Index
ism_pmi_df = pd.read_csv(path_to_market_data + 'ISM-MAN_PMI.csv')
ism_pmi_df['Date'] = pd.to_datetime(ism_pmi_df['Date']) 
print(np.min(ism_pmi_df['Date'] ),np.max(ism_pmi_df['Date'] ))
ism_pmi_df = ism_pmi_df.sort_values('Date', ascending=True) 
ism_pmi_df.tail()
 
1948-01-01 00:00:00 2019-10-01 00:00:00
Out[50]:
Date PMI
4 2019-06-01 51.7
3 2019-07-01 51.2
2 2019-08-01 49.1
1 2019-09-01 47.8
0 2019-10-01 48.3
In [53]:
hsi_df = pd.read_csv(path_to_market_data + '^HSI.csv')
hsi_df['Date'] = pd.to_datetime(hsi_df['Date'])
hsi_df = hsi_df.sort_values('Date', ascending=True) 
print(min(hsi_df['Date']), max(hsi_df['Date']))
print(hsi_df.shape)
hsi_df.head()
1986-12-31 00:00:00 2019-11-20 00:00:00
(8375, 7)
Out[53]:
Date Open High Low Close Adj Close Volume
0 1986-12-31 2568.300049 2568.300049 2568.300049 2568.300049 2568.300049 0.0
1 1987-01-01 NaN NaN NaN NaN NaN NaN
2 1987-01-02 2540.100098 2540.100098 2540.100098 2540.100098 2540.100098 0.0
3 1987-01-05 2552.399902 2552.399902 2552.399902 2552.399902 2552.399902 0.0
4 1987-01-06 2583.899902 2583.899902 2583.899902 2583.899902 2583.899902 0.0

Raw look

In [54]:
cut_off_date = "2015-08-01"
msci_world_index_df_tmp = msci_world_index_df.copy()
ism_pmi_df_tmp = ism_pmi_df.copy()

msci_world_index_df_tmp = msci_world_index_df_tmp[msci_world_index_df_tmp['Date'] >= cut_off_date]
ism_pmi_df_tmp = ism_pmi_df_tmp[ism_pmi_df_tmp['Date'] >= cut_off_date]


fig, ax = plt.subplots(figsize=(16, 8))
ax.set_facecolor('xkcd:black')
plt.plot(msci_world_index_df_tmp['Date'], 
         msci_world_index_df_tmp['Settle'],
         color='gray', 
         linewidth=2,
         label='MSCI World Index')
plt.grid()
plt.legend(loc='upper left')
fig.suptitle('MSCI World Index VS ISM Manufacturing PMI', fontsize=25, fontweight='bold')

ax.twinx()
plt.plot(ism_pmi_df_tmp['Date'],  ism_pmi_df_tmp['PMI'] , 
         color='#345d8f', 
         linewidth=2,
         label='ISM/PMI')
plt.legend(loc='lower left') 
Out[54]:
<matplotlib.legend.Legend at 0x129e10d30>
In [56]:
cut_off_date = "2015-01-01"
msci_world_index_df_tmp = msci_world_index_df.copy()
ism_pmi_df_tmp = ism_pmi_df.copy()

msci_world_index_df_tmp = msci_world_index_df_tmp[msci_world_index_df_tmp['Date'] >= cut_off_date]
ism_pmi_df_tmp = ism_pmi_df_tmp[ism_pmi_df_tmp['Date'] >= cut_off_date]

fig, ax = plt.subplots(figsize=(16, 8))
ax.set_facecolor('xkcd:black')
plt.plot(msci_world_index_df_tmp['Date'], 
         msci_world_index_df_tmp['Settle'].rolling(window=20, min_periods=1).mean(), 
         color='lightgrey', 
         linewidth=2,
         label='MSCI World Index')
plt.grid()
plt.legend(loc='upper left')
fig.suptitle('MSCI World Index VS ISM Manufacturing PMI', fontsize=25, fontweight='bold')

ax.twinx()
plt.plot(ism_pmi_df_tmp['Date'],  ism_pmi_df_tmp['PMI'] , 
         color='#345d8f', 
         linewidth=2,
         label='ISM/PMI')
plt.legend(loc='lower left') 
Out[56]:
<matplotlib.legend.Legend at 0x11dff67b8>
In [ ]:
# But b

Join both datasets together

In [73]:
cut_off_date = "2015-08-01"

msci_world_index_df_tmp = msci_world_index_df[['Date', 'Settle']].copy()
msci_world_index_df_tmp.columns = ['Date', 'MSCI_Settle']
ism_pmi_df_tmp = ism_pmi_df.copy()
  
msci_world_index_df_tmp = msci_world_index_df_tmp[msci_world_index_df_tmp['Date'] >= cut_off_date]
ism_pmi_df_tmp = ism_pmi_df_tmp[ism_pmi_df_tmp['Date'] >= cut_off_date]

# join both datasets together
together_df = pd.merge(msci_world_index_df_tmp, 
                ism_pmi_df_tmp,
                on= ['Date'], how='left')  # left join because PMI is a montly number

# last valid observation forward
together_df = together_df.fillna(method='ffill')

# drop NAs
together_df = together_df.dropna(axis=0)


together_df = together_df.sort_values('Date', ascending=True)
 
together_df.head()
Out[73]:
Date MSCI_Settle PMI
21 2015-09-01 4326.0 50.2
22 2015-09-02 4329.0 50.2
23 2015-09-03 4403.0 50.2
24 2015-09-04 4297.0 50.2
25 2015-09-07 4328.0 50.2

Quick Peek Together

In [74]:
fig, ax = plt.subplots(figsize=(16, 8))
ax.set_facecolor('xkcd:black')
plt.plot(together_df['Date'], 
         together_df['MSCI_Settle'], 
         color='gray', 
         linewidth=2,
         label='MSCI World Index')
plt.grid()
plt.legend(loc='upper left')
fig.suptitle('MSCI World Index VS ISM Manufacturing PMI', fontsize=25, fontweight='bold')

 
plt.plot(together_df['Date'],  together_df['PMI'] , 
         color='#345d8f', 
         linewidth=2,
         label='PMI')
plt.legend(loc='lower left') 
 
Out[74]:
<matplotlib.legend.Legend at 0x12b12f7f0>
In [75]:
together_df['MSCI_Settle_PctChng'] = together_df['MSCI_Settle'].pct_change()
together_df['PMI_PctChng'] = together_df['PMI'].pct_change()
In [76]:
fig, ax = plt.subplots(figsize=(16, 8))
ax.set_facecolor('xkcd:black')
plt.plot(together_df['Date'], 
         together_df['MSCI_Settle_PctChng'], 
         color='gray', 
         linewidth=2,
         label='MSCI World Index')
plt.grid()
plt.legend(loc='upper left')
fig.suptitle('MSCI World Index VS ISM Manufacturing PMI', fontsize=25, fontweight='bold')

 
plt.plot(together_df['Date'],  together_df['PMI_PctChng'] , 
         color='#345d8f', 
         linewidth=2,
         label='PMI')
plt.legend(loc='lower left') 
Out[76]:
<matplotlib.legend.Legend at 0x12b1a98d0>
In [77]:
fig, ax = plt.subplots(figsize=(16, 8))
ax.set_facecolor('xkcd:black')
plt.plot(together_df['Date'], 
         together_df['MSCI_Settle_PctChng'].cumsum(), 
         color='gray', 
         linewidth=2,
         label='MSCI World Index')
plt.grid()
plt.legend(loc='upper left')
fig.suptitle('MSCI World Index VS ISM Manufacturing PMI', fontsize=25, fontweight='bold')

 
plt.plot(together_df['Date'],  together_df['PMI_PctChng'].cumsum(), 
         color='#345d8f', 
         linewidth=2,
         label='PMI')
plt.legend(loc='lower left')
Out[77]:
<matplotlib.legend.Legend at 0x12b46a5c0>

Google Correlate

In [78]:
cut_off_date = "2014-06-01"
msci_world_index_df_tmp = msci_world_index_df.copy()
ism_pmi_df_tmp = ism_pmi_df.copy() 
hsi_df_tmp = hsi_df.copy()

msci_world_index_df_tmp = msci_world_index_df_tmp[msci_world_index_df_tmp['Date'] >= cut_off_date]
ism_pmi_df_tmp = ism_pmi_df_tmp[ism_pmi_df_tmp['Date'] >= cut_off_date]
ghc_df_tmp = ghc_df_tmp[ghc_df_tmp['Date'] >= cut_off_date]
hsi_df_tmp = hsi_df_tmp[hsi_df_tmp['Date'] >= cut_off_date]


fig, ax = plt.subplots(figsize=(16, 8))
ax.set_facecolor('xkcd:black')
plt.plot(msci_world_index_df_tmp['Date'], 
         msci_world_index_df_tmp['Settle'], 
         color='gray', 
         linewidth=2,
         label='MSCI World Index')
plt.grid()
plt.legend(loc='upper left')
fig.suptitle('MSCI World Index VS ISM Manufacturing PMI', fontsize=25, fontweight='bold')

ax.twinx()
plt.plot(ism_pmi_df_tmp['Date'],  ism_pmi_df_tmp['PMI'] , 
         color='#345d8f', 
         linewidth=2,
         label='PMI')
plt.legend(loc='lower left') 



ax.twinx()
plt.plot(hsi_df_tmp['Date'],  hsi_df_tmp['Adj Close'] , 
         color='yellow', 
         linewidth=2,
         label='HANG SENG INDEX')
plt.legend(loc='lower left')
Out[78]:
<matplotlib.legend.Legend at 0x12b948978>

Show Notes

(pardon typos and formatting -
these are the notes I use to make the videos)

Let's talk about charting multiple financial time series. A lot of people will simply overlay them using different axes. But that can be misleading. Let's look at a much better way using some cool Pandas functions such as the Percentage Change, Rolling, and CumSum.