Backtest A Simple Trading Strategy Based On Historical Monthly Performance Of Stocks

K for What?
4 min read · Apr 3, 2021
What a backtest looks like

Hi everyone. In my first post, I showed how to find the stocks that have historically performed best in a given calendar month. The natural next question is how to profit from that idea.

A simple strategy: at the start of each month, buy the stock that has historically performed best in that calendar month (or the stocks, if several tie) and hold until the end of the month. An alternative strategy: buy the top x historically best-performing stocks at the start of the month, then sell them at the end of the month.

Let’s see how it works.

To evaluate the performance of these strategies, I made the following assumptions:

  • Buy at the beginning of the month, sell at the end of the month.
  • Buy and sell at the close price.
  • Only consider stocks with at least 10 years of data history for the month in question.
  • If we buy several stocks, we buy them with equal weight.
  • Only consider stocks that are still trading at the moment (that is, stocks with data in 2021).
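
As a rough illustration of these assumptions, the sketch below computes the one-month return of an equal-weight basket from daily close prices. The function name `equal_weight_month_return`, the tickers, and the prices are made up for illustration; they are not from the dataset used in this post.

```python
import pandas as pd

def equal_weight_month_return(closes: pd.DataFrame) -> float:
    """Return of an equal-weight basket held for one month.

    closes: rows are the trading days of a single month (sorted by date),
    columns are tickers, values are close prices.
    """
    buy = closes.iloc[0]          # close on the first trading day
    sell = closes.iloc[-1]        # close on the last trading day
    per_stock = sell / buy - 1    # simple return of each stock
    return float(per_stock.mean())  # equal weight = plain average

# Hypothetical prices for two hypothetical tickers.
closes = pd.DataFrame(
    {"AAA": [10.0, 10.5, 11.0], "BBB": [20.0, 19.0, 19.0]},
    index=pd.to_datetime(["2020-04-01", "2020-04-15", "2020-04-29"]),
)
result = equal_weight_month_return(closes)
print(f"Equal-weight return: {result:.4f}")
```

Here AAA gains 10% and BBB loses 5%, so the basket return is the simple average of the two.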

The data is the same as in the first post. I created the following new features, which are used later in the code:

  • Month: month extracted from the Date column of the current row
  • Year: year extracted from the Date column of the current row
  • Lagged_time_month: the Month value of the previous row, used to detect the first trading day of each month
  • Closing Price of month: the close price at which the month’s position is sold (in the code, the close on the first trading day of the following month, used as the month-end price)
  • Return_1M: return of the stock in the current month
  • Winner: 1 if the stock’s return in the current month is positive, 0 otherwise
  • No. winning years: number of years in which the stock won in this month
  • Odds of Winning: empirical odds that the stock wins in this month
  • Average Return: average return of the stock in this month
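
To make the last four features concrete, here is a minimal sketch on a toy series of April returns for a single stock. It deliberately uses only earlier years for each row, so the current year's outcome does not feed into its own statistics. That is an assumption of this sketch and differs from the running counts in this post's code, which include the current row. The function name `lagged_month_stats` and the return values are hypothetical.

```python
import numpy as np
import pandas as pd

def lagged_month_stats(returns: pd.Series) -> pd.DataFrame:
    """Per-year statistics for one stock and one calendar month.

    returns: one-month returns indexed by year, sorted ascending.
    Each row's statistics are computed from strictly earlier years.
    """
    out = pd.DataFrame({"Return_1M": returns})
    out["Winner"] = np.where(out["Return_1M"] > 0, 1, 0)
    # Wins in earlier years: running total minus the current year's own outcome.
    out["No. winning years"] = out["Winner"].cumsum() - out["Winner"]
    # Number of earlier years; NaN on the first row to avoid dividing by zero.
    prior = pd.Series(np.arange(len(out), dtype=float), index=out.index)
    out["Odds of Winning"] = out["No. winning years"] / prior.where(prior > 0)
    # Mean of earlier returns only (expanding mean shifted by one row).
    out["Average Return"] = out["Return_1M"].expanding().mean().shift(1)
    return out

# Hypothetical April returns for one stock.
april = pd.Series([0.05, -0.02, 0.03, 0.01], index=[2016, 2017, 2018, 2019])
stats = lagged_month_stats(april)
print(stats)
```

The first row has no history, so its odds and average return are left as NaN rather than guessed.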

The backtest code is below:

import numpy as np
import pandas as pd

# Load the daily price data and standardize the column names.
df = pd.read_csv('/Users/trantrungkien/Kien 2021/Stock Data/HSX22032021.csv')
df = df.rename(columns = {'<Ticker>':'Ticker', '<DTYYYYMMDD>':'Date', '<Open>':'O', '<High>':'H', '<Low>':'L', '<Close>':'C', '<Volume>':'Vol'})
df['Date'] = pd.to_datetime(df['Date'], format = '%Y%m%d')
df = df.sort_values(by = ['Ticker', 'Date'], ascending = [True, True]).reset_index(drop = True)
df['Month'] = df['Date'].dt.month
df['Year'] = df['Date'].dt.year
# Month of the previous row; a change marks the first trading day of a new month.
df['Lagged_time_month'] = df['Month'].shift(1)

def winner_in_month(df, month):
    df_agg = pd.DataFrame()
    # Only consider tickers that still have data in 2021.
    ticker_list = list(set(df['Ticker'][df['Year'] == 2021]))
    for ticker in ticker_list:
        df_ticker = df[df['Ticker'] == ticker].dropna().reset_index(drop = True)
        # Keep one row per month: the first trading day of each month.
        condition = df_ticker['Month'] != df_ticker['Lagged_time_month']
        df_month = df_ticker[condition].reset_index(drop = True)
        # Exit price: the close on the next kept row (first trading day of the following month).
        df_month['Closing Price of month'] = df_month['C'].shift(-1)
        df_month['Return_1M'] = df_month['Closing Price of month']/df_month['C']-1
        df_month = df_month[df_month['Month'] == month].dropna().reset_index(drop = True)
        # Require at least 10 yearly observations for this month.
        if len(df_month) >= 10:
            df_month['Winner'] = np.where(df_month['Return_1M'] > 0, 1, 0)
            # Note: these running statistics include the current year's own outcome.
            df_month['No. winning years'] = np.maximum(df_month['Winner'].cumsum()-1, 0)
            df_month['Odds of Winning'] = df_month['No. winning years']/df_month['No. winning years'].expanding().count()
            df_month['Average Return'] = df_month['Return_1M'].expanding().mean()
            # Keep the five most recent years for the review period.
            df_month = df_month.iloc[-5:, :].reset_index(drop = True)
            df_agg = pd.concat([df_agg, df_month])
    years_review = range(2016, 2021)
    cols = ['Ticker', 'Year', 'Return_1M', 'No. winning years', 'Odds of Winning', 'Average Return']
    df_agg1 = pd.DataFrame()  # picks by number of winning years
    df_agg2 = pd.DataFrame()  # picks by odds of winning
    for year in years_review:
        df_year = df_agg[df_agg['Year'] == year].reset_index(drop = True)
        no_years = max(df_year['No. winning years'])
        no_odds = max(df_year['Odds of Winning'])
        # Keep every stock tied at the maximum for this year.
        df_agg1 = pd.concat([df_agg1, df_year[df_year['No. winning years'] == no_years].reset_index(drop = True)[cols]])
        df_agg2 = pd.concat([df_agg2, df_year[df_year['Odds of Winning'] == no_odds].reset_index(drop = True)[cols]])
    print(df_agg1)
    print(f'Mean return: {df_agg1["Return_1M"].mean()}')
    print('------------------------')
    print(df_agg2)
    print(f'Mean return: {df_agg2["Return_1M"].mean()}')

Since it’s April now, I want to see how the strategy performed in April. Here it is:

Based on this backtest of the first strategy, picking the stocks with the highest odds of winning gives a better return than picking the stocks with the most winning years.

Let’s go to the second strategy. The code is below:

def winner_in_month_v2(df, month, x):
    df_agg = pd.DataFrame()
    # Only consider tickers that still have data in 2021.
    ticker_list = list(set(df['Ticker'][df['Year'] == 2021]))
    for ticker in ticker_list:
        df_ticker = df[df['Ticker'] == ticker].dropna().reset_index(drop = True)
        # Keep one row per month: the first trading day of each month.
        condition = df_ticker['Month'] != df_ticker['Lagged_time_month']
        df_month = df_ticker[condition].reset_index(drop = True)
        df_month['Closing Price of month'] = df_month['C'].shift(-1)
        df_month['Return_1M'] = df_month['Closing Price of month']/df_month['C']-1
        df_month = df_month[df_month['Month'] == month].dropna().reset_index(drop = True)
        if len(df_month) >= 10:
            df_month['Winner'] = np.where(df_month['Return_1M'] > 0, 1, 0)
            # Note: these running statistics include the current year's own outcome.
            df_month['No. winning years'] = np.maximum(df_month['Winner'].cumsum()-1, 0)
            df_month['Odds of Winning'] = df_month['No. winning years']/df_month['No. winning years'].expanding().count()
            df_month['Average Return'] = df_month['Return_1M'].expanding().mean()
            df_month = df_month.iloc[-5:, :].reset_index(drop = True)
            df_agg = pd.concat([df_agg, df_month])
    years_review = range(2016, 2021)
    df_agg1 = pd.DataFrame()  # top x by number of winning years
    df_agg2 = pd.DataFrame()  # top x by odds of winning
    for year in years_review:
        df_year = df_agg[df_agg['Year'] == year].reset_index(drop = True)
        # Take the x highest-ranked stocks for this year under each criterion.
        df_year1 = df_year.sort_values(by = ['No. winning years'], ascending = False).reset_index(drop = True)[:x]
        df_agg1 = pd.concat([df_agg1, df_year1[['Ticker', 'Year', 'Return_1M']]], ignore_index = True)
        df_year2 = df_year.sort_values(by = ['Odds of Winning'], ascending = False).reset_index(drop = True)[:x]
        df_agg2 = pd.concat([df_agg2, df_year2[['Ticker', 'Year', 'Return_1M']]], ignore_index = True)
    print(df_agg1)
    print(f'Mean return: {df_agg1["Return_1M"].mean()}')
    print('------------------------')
    print(df_agg2)
    print(f'Mean return: {df_agg2["Return_1M"].mean()}')
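
One detail worth noting about a top-x cut: when several stocks share the same odds, sorting on a single column does not say which of them makes it into the top x. The sketch below shows one possible way to make the selection deterministic by breaking ties on average return and then on ticker. The tie-break rule, the function name `top_x_by_odds`, and the sample rows are assumptions for illustration, not part of the backtest above.

```python
import pandas as pd

def top_x_by_odds(df_year: pd.DataFrame, x: int) -> pd.DataFrame:
    """Pick the x stocks with the highest odds, with explicit tie-breaks."""
    ranked = df_year.sort_values(
        by=["Odds of Winning", "Average Return", "Ticker"],
        ascending=[False, False, True],  # ties: higher average return, then ticker A-Z
    )
    return ranked.head(x).reset_index(drop=True)

# Hypothetical candidates for a single year.
df_year = pd.DataFrame({
    "Ticker": ["AAA", "BBB", "CCC", "DDD"],
    "Odds of Winning": [0.8, 0.8, 0.6, 0.9],
    "Average Return": [0.02, 0.05, 0.10, 0.01],
})
top2 = top_x_by_odds(df_year, 2)
print(top2)
```

With these rows, DDD is selected first on odds alone, and BBB beats AAA for the second slot because of its higher average return.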

The following is the backtest result for April:

Again we can see that the result is better when we buy/sell stocks based on the odds of winning. From my perspective, you should follow the second strategy because it not only has a higher average return but also diversifies your portfolio better.
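
To check the diversification point with numbers, one option is to look at the equal-weight portfolio return per year and how much it varies across years, instead of only the mean over all picks. The sketch below assumes a table of picks with the same `Ticker`, `Year`, and `Return_1M` columns that the backtest prints; the function name `yearly_portfolio_returns` and the values are hypothetical.

```python
import pandas as pd

def yearly_portfolio_returns(picks: pd.DataFrame) -> pd.Series:
    """Equal-weight portfolio return for each year (mean of that year's picks)."""
    return picks.groupby("Year")["Return_1M"].mean()

# Hypothetical picks: two stocks in 2019, three in 2020.
picks = pd.DataFrame({
    "Ticker": ["AAA", "BBB", "AAA", "CCC", "DDD"],
    "Year": [2019, 2019, 2020, 2020, 2020],
    "Return_1M": [0.04, -0.02, 0.06, 0.00, 0.03],
})
yearly = yearly_portfolio_returns(picks)
summary = {
    "mean_of_yearly_returns": yearly.mean(),   # average portfolio return per year
    "std_of_yearly_returns": yearly.std(),     # year-to-year variability
    "pooled_mean": picks["Return_1M"].mean(),  # mean over all picks, as printed by the backtest
}
print(yearly)
print(summary)
```

The pooled mean and the mean of yearly returns differ whenever the number of picks changes from year to year, so it helps to be explicit about which one is being compared.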

Till next time, all!

K for What?

Quant Researcher, Data Scientist, Food Hater (so I eat them a lot).