Eigen-Vectors & Eigen-Values of Stock Returns Correlations
Back in undergrad, John Rundle taught a graduate "Econo-Physics" class where, in one lesson, we learned about portfolio management using the eigen-vectors & eigen-values of the correlation matrix of stock returns. Here we create some fake data and make that calculation using numpy and pandas, visualizing the results with matplotlib and seaborn.
Tip: Jake Vanderplas has a great tutorial on using pandas!
#collapse-hide
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from itertools import permutations
%matplotlib inline
sns.set()
Note: Make random data
np.random.seed(42)
def brownian_motion(mean, std, npts):
    return np.cumsum(np.random.normal(scale=std, size=npts)) + mean
num_stocks = 10
num_timesteps = 1000
letters = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
tickers = [''.join(x) for x in np.random.choice(list(letters), size=(num_stocks,3))]
dates = pd.date_range('2020-11-21', periods=num_timesteps, freq='D')
means = np.random.randint(50, 200, num_stocks)
stds = np.random.randint(2, 5, num_stocks)
data = np.vstack([brownian_motion(mean, std, num_timesteps) for mean, std in zip(means, stds)]).T
df = pd.DataFrame(data, columns=tickers, index=dates)
df = df[df > 0].dropna(axis=1)
df.head()
Tip: Use this to load CSVs from your own Google Drive
from google.colab import drive
drive.mount('/content/gdrive')
!ls "gdrive/My Drive" # this line will look in the folder
df = pd.read_csv('gdrive/My Drive/data.csv') # put the full path to the file in google drive here if you have one
fig, ax = plt.subplots(1, figsize=(20,8))
df.plot(ax=ax)
plt.show()
df.diff().corr()
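The cell above correlates day-to-day price differences. As a minimal sketch of how that compares with correlating fractional returns, the block below builds a small toy price table (the tickers `AAA`, `BBB`, `CCC` and the random-walk parameters are made up for illustration) and computes both matrices. Either way the result is a symmetric matrix with ones on the diagonal.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical toy prices: three random walks that start at 100 and stay well above zero
prices = pd.DataFrame(
    100 + np.cumsum(rng.normal(scale=0.5, size=(500, 3)), axis=0),
    columns=['AAA', 'BBB', 'CCC'],
)

# Correlation of day-to-day price changes, as in the cell above
diff_corr = prices.diff().corr()

# Correlation of day-to-day fractional returns, (p_t - p_{t-1}) / p_{t-1}
returns = prices / prices.shift(1) - 1
ret_corr = returns.corr()

print(diff_corr.round(2))
print(ret_corr.round(2))
```

When prices stay far from zero, as they do here, the two matrices tend to be close; which one is appropriate for real data is a modeling choice.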
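# The columns of e_vect are the eigenvectors; eigh suits the symmetric correlation matrix
e_val, e_vect = np.linalg.eigh(df.diff().corr())
order = np.argsort(e_val)[::-1]  # largest eigenvalue first
evect_df = pd.DataFrame(e_vect[:, order], index=df.columns, columns=[f'v{i + 1}' for i in range(len(order))])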
evect_df
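A quick sanity check is useful for any eigen-decomposition of a correlation matrix. The sketch below uses a hypothetical toy table (columns `W` to `Z` of independent normal noise, not the data above) and assumes `np.linalg.eigh`, which returns eigenvectors as columns. It checks that each pair satisfies C v = λ v and that the eigenvalues sum to the number of variables, so each eigenvalue divided by that sum is the fraction of total variance carried by its eigenvector.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Hypothetical toy data: 200 observations of 4 independent variables
toy = pd.DataFrame(rng.normal(size=(200, 4)), columns=list('WXYZ'))
corr = toy.corr().to_numpy()

# eigh returns eigenvalues in ascending order, with eigenvectors as columns
vals, vecs = np.linalg.eigh(corr)

# Reorder so the largest eigenvalue (and its eigenvector column) comes first
order = np.argsort(vals)[::-1]
vals, vecs = vals[order], vecs[:, order]

# Column i of vecs satisfies corr @ v_i = vals[i] * v_i
assert np.allclose(corr @ vecs, vecs * vals)

# The eigenvalues sum to the trace of corr, i.e. the number of variables
explained = vals / vals.sum()
print(explained.round(3))
```

Note that the reordering indexes columns (`vecs[:, order]`), since reordering rows would shuffle the components inside each eigenvector instead of the eigenvectors themselves.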
fig, ax = plt.subplots(1, figsize=(12,10))
ax.set_title('Eigenvectors of Correlation of Running Difference', fontsize=16)
sns.heatmap(evect_df, ax=ax, annot=True, fmt=".2f", linewidths=.5)
fig.savefig('../images/eigen_correlation_heatmap.png')
plt.show()