Abstract
A Markov process is a model of a random process with the memoryless property: given the present state, the future state is independent of the past states. Financial models commonly assume that stock prices follow a Markov process, although this simplification may not capture the complexity of real-world market dynamics. For this analysis, we nevertheless use a simulation approach to forecast the future price of Bitcoin, assuming that its daily percentage changes are random draws from a fixed distribution, specifically a t-distribution.
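In symbols, the memoryless assumption and the price update used in the simulation cells of this notebook can be written as follows. The notation ($X_t$ for a generic state, $P_t$ for the closing price on day $t$, $r_t$ for the daily percentage change) is introduced here for illustration only.

```latex
\begin{align}
\Pr(X_{t+1} = x \mid X_t, X_{t-1}, \dots, X_0) &= \Pr(X_{t+1} = x \mid X_t) \\
P_{t+1} &= P_t \left(1 + \frac{r_{t+1}}{100}\right), \qquad r_{t+1} \sim t_{\nu}(\mu, \sigma)
\end{align}
```

Each $r_{t+1}$ is drawn independently of earlier draws, so tomorrow's price depends on the past only through today's price.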
0. Import libraries
%matplotlib inline
import random
import numpy as np
import pandas as pd
import scipy.stats as stats
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from tqdm import tqdm
import warnings
warnings.filterwarnings("ignore")
1. Data preview
In this analysis, we use the public Yahoo Finance API to retrieve up-to-date market data. We compute the daily percentage change in the closing price of Bitcoin, store the values in a list, and plot a histogram to visualize their distribution. The mean of the percentage changes is positive (about 0.21% per day), indicating that the price of Bitcoin has tended to rise over the sample period, with gains outweighing losses on average.
# Price chart preview
df = pd.read_csv("https://query1.finance.yahoo.com/v7/finance/download/BTC-USD?period1=1400000000&period2=2000000000&interval=1d&events=history&includeAdjustedClose=true")
plt.figure(figsize=(15, 5))
# Parse the date strings so the yearly tick locator works on a real time axis
plt.plot(pd.to_datetime(df['Date']), df['Close'])
plt.grid(color='gray', alpha=0.5)
plt.gca().xaxis.set_major_locator(mdates.YearLocator())
plt.title('Bitcoin Price since September 17th 2014')
plt.xlabel('Date')
plt.ylabel('Price (USD)')
plt.show()
# Histogram for observations
observations = (df['Close'].pct_change() * 100).tolist()
observations = [x for x in observations if not np.isnan(x)]
print("Mean:", np.mean(observations))
print("Standard Deviation:", np.std(observations))
plt.hist(observations, bins=60, range=(-30, 30), edgecolor='black')
plt.grid(color='gray', alpha=0.5)
plt.title('Histogram of Observed Daily Returns')
plt.xlabel('Daily Return')
plt.ylabel('Frequency')
plt.show()
Mean: 0.2137750647552026
Standard Deviation: 3.6827182471820006
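As a small check on how the percentage changes are computed, the following sketch applies the same `pct_change()` step to three made-up prices. The prices are hypothetical and chosen only so the arithmetic is easy to verify by hand.

```python
import pandas as pd

# Hypothetical closing prices (not real market data)
close = pd.Series([100.0, 110.0, 99.0])

# pct_change() returns (p_t - p_{t-1}) / p_{t-1}; the first entry has no
# predecessor and is NaN, so it is dropped
returns = (close.pct_change() * 100).dropna().tolist()
print(returns)
```

The two results are approximately +10 and -10: a 10% gain followed by a 10% loss leaves the price at 99, not 100, which is why percentage changes compound multiplicatively in the simulation.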
2. Fit distribution
We consider several probability distributions commonly used in statistical analysis: the normal, Cauchy, t, F, alpha, beta, gamma, chi, and chi-squared distributions. We fit each one to the observations and compute the Kolmogorov-Smirnov statistic, which measures the distance between the observed distribution and the fitted distribution. The distribution with the smallest statistic is taken as the best fit for the observations.
The Kolmogorov-Smirnov statistic is the largest absolute difference between the empirical cumulative distribution function of the observations and the cumulative distribution function of the hypothesized distribution. It equals 0 if the two coincide exactly, it never exceeds 1, and it cannot be negative. Smaller values indicate a closer fit; the statistic says nothing by itself about whether the observations are more or less dispersed than the hypothesized distribution.
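To make the definition concrete, the sketch below computes the statistic by hand and compares it with `scipy.stats.kstest`. It uses a synthetic standard-normal sample rather than the Bitcoin data, and the variable names are illustrative.

```python
import numpy as np
import scipy.stats as stats

# Synthetic sample (an assumption for illustration, not the Bitcoin data)
rng = np.random.default_rng(0)
sample = np.sort(rng.standard_normal(500))
n = len(sample)

# Hypothesized CDF evaluated at each sorted observation
cdf = stats.norm.cdf(sample)

# The empirical CDF jumps at each observation, so check both sides of the jump
d_plus = np.max(np.arange(1, n + 1) / n - cdf)
d_minus = np.max(cdf - np.arange(0, n) / n)
ks_manual = max(d_plus, d_minus)

# Two-sided statistic as computed by scipy
ks_scipy = stats.kstest(sample, stats.norm.cdf).statistic
print(ks_manual, ks_scipy)
```

The two values agree, and both lie between 0 and 1.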
Our analysis shows that, among the candidates, the t-distribution provides the best fit for the current observations. The estimated parameters are as follows:
- Location (μ): 0.18
- Scale (σ): 1.93
- Degrees of freedom (ν): 2.14
These results indicate that the daily percentage changes of Bitcoin are well described by a t-distribution. The t-distribution resembles the normal distribution but has heavier tails, so it assigns more probability to extreme daily moves; it approaches the normal distribution as ν grows large. With ν close to 2, the fitted distribution is very heavy-tailed.
# Fit distribution for observations
def fit_dist(rv_list):
    distributions = [stats.norm, stats.cauchy, stats.t, stats.f,
                     stats.alpha, stats.beta, stats.gamma,
                     stats.chi, stats.chi2]
    best_fit = None
    best_params = None
    best_ks_stat = np.inf
    for distribution in distributions:
        # Fit the distribution by maximum likelihood
        params = distribution.fit(rv_list)
        # Perform the Kolmogorov-Smirnov test against the fitted distribution
        ks_stat, _ = stats.kstest(rv_list, distribution.cdf, args=params)
        # Keep the candidate with the smallest statistic
        if ks_stat < best_ks_stat:
            best_fit = distribution
            best_params = params
            best_ks_stat = ks_stat
    print("Best fit distribution:", best_fit.name)
    print("Best fit parameters:", best_params)
    print("Kolmogorov-Smirnov statistic:", best_ks_stat)
    return best_fit, best_params
dist, params = fit_dist(observations)
Best fit distribution: t
Best fit parameters: (2.1431054512763126, 0.17668432407380463, 1.9270381670653132)
Kolmogorov-Smirnov statistic: 0.02734883204485744
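SciPy returns fitted parameters in the order (shape, loc, scale), so for the t-distribution the first value is the degrees of freedom. The sketch below unpacks the values printed above and compares the probability of a daily move beyond +/- 30% under the fitted t-distribution with the same probability under a normal distribution that uses the sample mean and standard deviation from section 1. The variable names are illustrative.

```python
import scipy.stats as stats

# Values copied from the fit output above; scipy orders them (shape, loc, scale)
df_nu, loc, scale = 2.1431054512763126, 0.17668432407380463, 1.9270381670653132

# Sample mean and standard deviation printed in section 1
mean, std = 0.2137750647552026, 3.6827182471820006

# Probability of a daily move beyond +/- 30% under each model
t_tail = stats.t.cdf(-30, df_nu, loc, scale) + stats.t.sf(30, df_nu, loc, scale)
norm_tail = stats.norm.cdf(-30, mean, std) + stats.norm.sf(30, mean, std)
print(t_tail, norm_tail)
```

The t-distribution assigns a far larger probability to such extreme moves than the normal distribution does, which is what "heavier tails" means in practice.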
# Histogram for fitted distribution
plt.hist(dist.rvs(*params, size=10000), bins=60, range=(-30, 30), edgecolor='black')
plt.grid(color='gray', alpha=0.5)
plt.title('Histogram of Simulated Daily Returns')
plt.xlabel('Daily Return')
plt.ylabel('Frequency')
plt.show()
3. Demonstration
To simulate market movement, we draw each day's percentage change at random from the t-distribution with the fitted parameters. Because this distribution is heavy-tailed, a draw can occasionally be extreme, and a change below -100% would make the price negative. To prevent this, we re-roll any draw that falls outside a range of +/- 30% until it lies within the range, and only then apply it to the current price.
In this demonstration, we will use historical market data from the last four years as a basis to project potential trends over the next two years.
# Function for simulation
def simulate_market(dist, params, starting_price, depth):
    price = starting_price
    simulations = [starting_price]
    for i in range(depth):
        # Start out of range so the loop draws at least once
        change = -100
        # Re-roll until the daily change lies within +/- 30%
        while change < -30 or change > 30:
            change = dist.rvs(*params)
        price = price * (1 + change / 100)
        simulations.append(price)
    return simulations
# Run the simulation and plot the result
simulation = df['Close'].to_list()[-365*4:-1] + simulate_market(dist, params, df['Close'].to_list()[-1], 365*2)
plt.figure(figsize=(15, 5))
plt.plot(simulation)
plt.grid(color='gray', alpha=0.5)
plt.title('Simulated Bitcoin Price from 2020 to 2026')
plt.xlabel('Days Past')
plt.ylabel('Price (USD)')
plt.show()
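The re-rolling loop in `simulate_market` is a form of rejection sampling from a truncated distribution. A different technique that yields the same truncated distribution without a loop is inverse-CDF sampling, sketched below. The helper name `draw_truncated` is hypothetical and not part of the notebook; the parameters are copied from the fit output in section 2.

```python
import numpy as np
import scipy.stats as stats

def draw_truncated(dist, params, low, high, size, rng):
    """Draw from dist restricted to [low, high] by inverse-CDF sampling."""
    # Map the bounds to probabilities, draw uniformly between them, map back
    p_low, p_high = dist.cdf(low, *params), dist.cdf(high, *params)
    u = rng.uniform(p_low, p_high, size=size)
    return dist.ppf(u, *params)

rng = np.random.default_rng(42)
# Parameters copied from the fit output in section 2: (shape, loc, scale)
t_params = (2.1431054512763126, 0.17668432407380463, 1.9270381670653132)
changes = draw_truncated(stats.t, t_params, -30, 30, size=1000, rng=rng)
print(changes.min(), changes.max())
```

Every draw falls inside the +/- 30% range by construction, so no draw is discarded.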
4. Simulation
To further analyze the results of the financial market simulation using a Markov process, we will run the simulation 100 times and plot all the resulting prices in a single graph. This will allow us to better compare the different potential outcomes.
In this simulation, we will start from the current market price as a basis to project potential trends over the next 365 days.
# Simulate 100 times
simulations = []
plt.figure(figsize=(15, 5))
for i in tqdm(range(100)):
    simulations.append(simulate_market(dist, params, df['Close'].to_list()[-1], 365))
for series in simulations:
    plt.plot(series, color='grey', alpha=0.1)
plt.title('Multiple Series Plot of Bitcoin Price Simulation')
plt.xlabel('Days Past')
plt.ylabel('Price (USD)')
plt.ylim(0, 600000)
plt.show()
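One way to compare the outcomes numerically, in addition to the overlaid plot, is to look at percentiles of the final simulated prices. The sketch below is a vectorized variant of the re-rolling simulation written for illustration. The function name `simulate_paths`, the starting price of 100, and the seed are assumptions; the parameters are copied from the fit output in section 2.

```python
import numpy as np
import scipy.stats as stats

def simulate_paths(params, starting_price, depth, n_paths, rng, limit=30):
    """Vectorized re-rolling simulation; returns an (n_paths, depth + 1) array."""
    changes = stats.t.rvs(*params, size=(n_paths, depth), random_state=rng)
    # Re-draw any daily change outside +/- limit until all are in range
    out_of_range = np.abs(changes) > limit
    while out_of_range.any():
        changes[out_of_range] = stats.t.rvs(
            *params, size=int(out_of_range.sum()), random_state=rng)
        out_of_range = np.abs(changes) > limit
    # Compound the daily changes and prepend the starting price
    growth = np.cumprod(1 + changes / 100, axis=1)
    return starting_price * np.hstack([np.ones((n_paths, 1)), growth])

rng = np.random.default_rng(7)
# Parameters copied from the fit output in section 2: (shape, loc, scale)
t_params = (2.1431054512763126, 0.17668432407380463, 1.9270381670653132)
# Hypothetical starting price; the notebook uses the latest closing price
paths = simulate_paths(t_params, starting_price=100.0, depth=365, n_paths=100, rng=rng)

# 5th, 50th, and 95th percentiles of the price after 365 simulated days
p5, p50, p95 = np.percentile(paths[:, -1], [5, 50, 95])
print(p5, p50, p95)
```

With only 100 paths the percentiles are noisy; increasing `n_paths` gives a more stable summary at little extra cost.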