Building a Fusion Model for Flu Trend Prediction Using Social Web Data
According to WHO statistics, influenza causes on average about 3 million cases of severe illness and 250 thousand deaths per year, making it a constant and serious threat to people worldwide. If the epidemic trend of flu can be detected in advance, however, the illness rate can still be reduced effectively. To monitor flu epidemics, national CDCs usually compile a weekly influenza-like illness (ILI) rate from clinical reports submitted by hospitals. This process introduces a delay of about 1-2 weeks, so information about the peak of an epidemic may be missed. Remedying this requires new methods that detect flu epidemics in a timely manner. Because the social web has become part of everyday life, mining flu-related information from it is a promising basis for prediction methods. In addition, because flu epidemic trends can change drastically, several prediction methods need to be combined to obtain more accurate predictions. To this end, this paper develops methods for flu trend prediction based on model fusion and data mined from the social web. First, we collect web data from different sources. Next, we build various prediction models that take the delay of the epidemic into account. Finally, we merge the generated models through model fusion to increase the accuracy and stability of the predictions.
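The three steps above can be illustrated with a minimal sketch. It is not the implementation used in this study: the data are synthetic, the two signals (`tweets`, `search`) and their lags are hypothetical, and fusing the models by a second linear regression over their predictions is only one assumed way to realize model fusion.

```python
import numpy as np

def fit_lagged_linear(signal, ili, lag):
    """Fit ili[t] ~ a * signal[t - lag] + b by least squares."""
    x = signal[:len(signal) - lag]
    y = ili[lag:]
    a, b = np.polyfit(x, y, 1)
    return a, b

def predict_lagged(signal, lag, coef):
    """Predict ILI from the lagged signal; the first `lag` weeks are NaN."""
    a, b = coef
    pred = np.full(len(signal), np.nan)
    pred[lag:] = a * signal[:len(signal) - lag] + b
    return pred

def fuse(predictions, ili):
    """Assumed fusion step: regress ILI on the stacked single-model predictions."""
    P = np.column_stack(predictions)
    valid = ~np.isnan(P).any(axis=1)
    A = np.column_stack([P[valid], np.ones(valid.sum())])
    w, *_ = np.linalg.lstsq(A, ili[valid], rcond=None)
    fused = np.full(len(ili), np.nan)
    fused[valid] = A @ w
    return fused, w, valid

def pearson(a, b):
    return float(np.corrcoef(a, b)[0, 1])

# Synthetic weekly data (hypothetical): both web signals lead the ILI rate.
rng = np.random.default_rng(0)
weeks = 60
true = 2.0 + np.sin(np.linspace(0, 4 * np.pi, weeks + 2))
ili = true[:weeks]
tweets = true[1:weeks + 1] + rng.normal(0, 0.2, weeks)        # leads ILI by 1 week
search = 3.0 * true[2:weeks + 2] + rng.normal(0, 0.6, weeks)  # leads ILI by 2 weeks

# Step 2: one lagged linear model per source.
pred_tweets = predict_lagged(tweets, 1, fit_lagged_linear(tweets, ili, 1))
pred_search = predict_lagged(search, 2, fit_lagged_linear(search, ili, 2))

# Step 3: fuse the single models.
fused, weights, valid = fuse([pred_tweets, pred_search], ili)

r_tweets = pearson(pred_tweets[valid], ili[valid])
r_search = pearson(pred_search[valid], ili[valid])
r_fused = pearson(fused[valid], ili[valid])
print(r_tweets, r_search, r_fused)
```

In this sketch the fused prediction is evaluated on the same weeks used for fitting, so its correlation cannot fall below that of either single model; a real evaluation would hold out later weeks for testing.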
To demonstrate the effectiveness of the proposed method, we collected over 1.6 million Twitter posts from England and search statistics for flu-related keywords from Google Trends for our experiments. Compared with six single prediction models, the proposed method achieves the highest prediction correlation, which shows its effectiveness. To assess the stability of the models, we divide the data into a "dramatic up-down" region and a "slow up-down" region. The results show that the proposed method achieves the highest and second-highest correlations in the two regions, whereas the single models perform inconsistently across them, indicating that the proposed method produces more stable predictions. Based on these results, the proposed method contributes to early warning in influenza surveillance and helps establish additional lines of defense against epidemics.
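The stability analysis can be sketched as follows. The abstract does not state how the two regions are delimited, so this example assumes a simple rule: a week counts as "dramatic up-down" when the absolute week-over-week change in the ILI rate exceeds the median change, and as "slow up-down" otherwise. The data and the `split_regions` and `region_correlation` helpers are hypothetical.

```python
import numpy as np

def split_regions(ili, threshold=None):
    """Return boolean masks (dramatic, slow) based on week-over-week change."""
    change = np.abs(np.diff(ili, prepend=ili[0]))
    if threshold is None:
        threshold = np.median(change)  # assumed cut-off, not taken from the study
    dramatic = change > threshold
    return dramatic, ~dramatic

def region_correlation(pred, ili, mask):
    """Pearson correlation between prediction and ILI within one region."""
    return float(np.corrcoef(pred[mask], ili[mask])[0, 1])

# Synthetic ILI series and a noisy prediction of it (hypothetical data).
rng = np.random.default_rng(1)
ili = 2.0 + np.sin(np.linspace(0, 4 * np.pi, 60))
pred = ili + rng.normal(0, 0.1, ili.size)

dramatic, slow = split_regions(ili)
r_dramatic = region_correlation(pred, ili, dramatic)
r_slow = region_correlation(pred, ili, slow)
print(r_dramatic, r_slow)
```

Comparing the two correlations for each model gives a rough view of whether it behaves consistently across fast-changing and slow-changing periods.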
Keywords: Flu Monitoring, Model Fusion, Linear Regression, Social Web