Can I dynamically change the hyper-parameters of a model?




More Details

1) Explanations on My Model
My model is a stock trading model, so I will first explain how I trade stocks. Please bear with me; the explanation is short.

  • I am using Bollinger bands to trade stocks. (All the stock data in this example is daily.)
  • In short, I calculate the simple moving average (SMA) and the standard deviation (stdev) of the stock prices over the last N days. (Yes, I assume that stock prices follow a Gaussian distribution.) The upper band is SMA + k*stdev, while the lower band is SMA - k*stdev.
  • I buy the stock when the stock price is above the upper band ('Too_High_Buy') or below the lower band ('Too_Low_Buy'). For the 'Too_High_Buy' case, I sell the stock when the stock price goes below the SMA. For the 'Too_Low_Buy' case, I sell the stock when the stock price goes above the SMA.
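
A minimal sketch of the band calculation and the entry/exit rule described above. Several details are assumptions made for illustration, not part of the description: the N-day window includes the current day's price, the population standard deviation (`pstdev`) is used, only one position is held at a time, and the function names `bollinger_bands` and `trade` are hypothetical.

```python
from statistics import mean, pstdev


def bollinger_bands(prices, n, k):
    """Return (sma, upper, lower) lists; entries are None until n prices exist."""
    sma, upper, lower = [], [], []
    for t in range(len(prices)):
        if t + 1 < n:
            sma.append(None)
            upper.append(None)
            lower.append(None)
            continue
        window = prices[t + 1 - n : t + 1]  # assumption: window includes day t
        m = mean(window)
        s = pstdev(window)  # assumption: population standard deviation
        sma.append(m)
        upper.append(m + k * s)
        lower.append(m - k * s)
    return sma, upper, lower


def trade(prices, n, k):
    """Apply the rule above; return a list of (entry_day, exit_day, kind)."""
    sma, upper, lower = bollinger_bands(prices, n, k)
    trades = []
    position = None  # (entry_day, kind) while a position is open
    for t, p in enumerate(prices):
        if sma[t] is None:
            continue
        if position is None:
            if p > upper[t]:
                position = (t, "Too_High_Buy")
            elif p < lower[t]:
                position = (t, "Too_Low_Buy")
        else:
            entry, kind = position
            # Too_High_Buy exits below the SMA, Too_Low_Buy exits above it.
            if (kind == "Too_High_Buy" and p < sma[t]) or (
                kind == "Too_Low_Buy" and p > sma[t]
            ):
                trades.append((entry, t, kind))
                position = None
    return trades


example_prices = [10, 10, 10, 10, 20, 20, 12]
print(trade(example_prices, n=4, k=1.0))
```

In this toy series the jump to 20 breaks the upper band, and the later drop to 12 falls below the SMA and closes the position.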


  • The parameters of my model are the SMA and the stdev, while the hyper-parameters are N and k.
    • N: It determines how smooth the SMA (the yellow line) will be.
    • k: It determines how far the upper and lower bands lie from the SMA.


  • Because different values of N and k give the bands different characteristics, we need to search for the values of N and k that work well for the stock price data.

2) How to decide the hyper-parameters (N & k)

  • I use 'sliding steps' to choose appropriate values for the two hyper-parameters, N and k.
  • Sliding steps use a fixed amount of training data to choose the hyper-parameters, and then check the performance of those hyper-parameters on the validation dataset, which immediately follows the training dataset.
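
The sliding-step split described above can be sketched as a small generator. This is only an illustration: the name `sliding_steps` and its parameters (`train_len`, `valid_len`, `step`) are hypothetical, and it assumes the validation block starts on the day right after the training block.

```python
def sliding_steps(n_obs, train_len, valid_len=1, step=1):
    """Yield (train, valid) index ranges over a series of n_obs observations.

    Each training block has a fixed length, and the validation block
    immediately follows it. The window then slides forward by `step`.
    """
    start = 0
    while start + train_len + valid_len <= n_obs:
        train = range(start, start + train_len)
        valid = range(start + train_len, start + train_len + valid_len)
        yield train, valid
        start += step


for train, valid in sliding_steps(n_obs=6, train_len=3):
    print(list(train), list(valid))
```

Because every validation index is strictly later than every training index, no future price leaks into the choice of hyper-parameters.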


  • I think sliding steps are a good cross-validation tool for stock data, because the statistical properties of stocks can change over time. For example, 30 years ago, McDonald's and Coca-Cola showed similar price movements because their products were sold together. Nowadays, however, Coca-Cola focuses on healthy drinks while McDonald's remains an unhealthy food brand, so the two can show different price movements.

  • Several things could serve as hyper-parameters here, but for the sake of simplicity, let's say the hyper-parameter that we need to decide is the duration of the training dataset (N). The durations of the dropped period and the forecasting period are each set to 1 day.

  • Using a grid search over different values of N and k, I find which values of N and k show the best performance during the validation period in the training dataset.
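
A minimal sketch of the grid search over N and k. The scoring function is deliberately left as a parameter: `score_fn(n, k)` is a hypothetical callable that would return the validation performance of the strategy for one (N, k) pair. The `toy_score` below is only a stand-in so the example runs; it is not a back-test.

```python
from itertools import product


def grid_search(n_values, k_values, score_fn):
    """Return (best_n, best_k, best_score) over every (N, k) combination."""
    best = None
    for n, k in product(n_values, k_values):
        score = score_fn(n, k)
        if best is None or score > best[2]:
            best = (n, k, score)
    return best


def toy_score(n, k):
    # Stand-in objective (assumption): peaks at N=20, k=1.5.
    # In practice this would be the validation-period performance.
    return -((n - 20) ** 2) - (k - 1.5) ** 2


best_n, best_k, best_score = grid_search([10, 20, 30], [0.5, 1.0, 1.5, 2.0], toy_score)
print(best_n, best_k)
```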


3) My question

  • Can I use a different k for different training sets? In other words, can I dynamically change which value of k to use based on the performance on each sub-training dataset?

  • While performing the grid search in the sliding-step window method, we use the same value of k across all the training datasets to trade a stock during the validation period.

  • However, we could instead use different values of k, based on the performance on each sub-training dataset.

  • For example, let's say N is fixed at 30 days. If k=0.6 shows the best performance from 1st Jan to 30th Jan, we use k=0.6 for 31st Jan. If k=1.5 then shows the best performance from 2nd Jan to 31st Jan, we use that value of k for 1st Feb, and so on.

  • Why should we share the hyper-parameter k across the whole model? For parameters, sharing makes sense because it reduces the number of parameters that the model has to learn. (source: Recurrent NNs: what's the point of parameter sharing? Doesn't padding do the trick anyway?)

  • For hyper-parameters, however, the number of hyper-parameters is the same whether they are shared or not: there is only one hyper-parameter, k. The amount of computation needed is the same as well.
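
The per-window re-selection of k from the 30-day example can be sketched as follows. This is an illustration under stated assumptions: `dynamic_k` is a hypothetical name, and `score_fn(window, k)` is a hypothetical callable standing in for the performance of a given k on one sub-training window. The toy scorer below exists only to make the example runnable.

```python
def dynamic_k(prices, train_len, k_values, score_fn):
    """For each day t, pick the k that scored best on the preceding window.

    Returns a list of (day_index, chosen_k). The window prices[t - train_len:t]
    ends the day before t, so the choice for day t uses no data from day t.
    """
    chosen = []
    for t in range(train_len, len(prices)):
        window = prices[t - train_len : t]
        best_k = max(k_values, key=lambda k: score_fn(window, k))
        chosen.append((t, best_k))
    return chosen


def toy_score(window, k):
    # Stand-in scorer (assumption): prefers k close to the window's price range.
    return -abs(k - (max(window) - min(window)))


print(dynamic_k([1, 2, 3, 4, 6, 9], train_len=3, k_values=[1, 2, 3, 5], score_fn=toy_score))
```

The shared-k alternative would call the scorer on every window as well, but aggregate the scores per k and keep a single winner, which is why the amount of computation is comparable in both cases.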


Posted 2021-02-11T21:57:08.737
