Existing prediction systems for realized volatility are limited and cannot effectively describe the highly complex and nonlinear characteristics of stocks. In this study, we build a hybrid model by combining a Feedforward Neural Network (FFNN) with the Light Gradient Boosting Machine (LightGBM). We extract three important categories of features from high-frequency stock trading and quotation data, feed them into the hybrid model to predict volatility, and test the model on real-market data from the following three months. We also compare our hybrid model with other models in the experiments. Compared with traditional machine learning models such as Naïve Bayes and SVM, and with a single LightGBM model, our hybrid model achieves the lowest RMSPE, 0.192. In the subsequent three-month real-market data test, the hybrid model's RMSPE remained in the range [0.199, 0.219]. This test result further demonstrates the accuracy and robustness of our model's out-of-sample performance.
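The abstract reports results as RMSPE without stating the formula. The following is a minimal sketch of the metric, assuming the standard definition of root mean squared percentage error, the square root of the mean of ((actual - predicted) / actual)^2. The function name `rmspe` and its argument names are illustrative and are not taken from the paper.

```python
import math


def rmspe(y_true, y_pred):
    """Root mean squared percentage error (standard definition, assumed here).

    y_true: observed realized volatilities (must all be non-zero)
    y_pred: model predictions, same length as y_true
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("inputs must not be empty")

    total = 0.0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 0:
            # The relative error is undefined when the target is zero.
            raise ValueError("y_true must not contain zeros")
        total += ((actual - predicted) / actual) ** 2

    return math.sqrt(total / len(y_true))
```

Because each error is scaled by the observed value, the metric is unit-free, so stocks with very different volatility levels contribute comparably. Under this definition, a value of 0.192 corresponds to a root-mean-square relative error of roughly 19%.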