Understanding the Effective Management of COVID-19 in Taiwan
As of April 19, 2020, Taiwan has one of the lowest numbers of confirmed COVID-19 cases in the world, at 419 cases1, of which 189 have recovered. The major infection clusters in March 2020 were imported from two major regions: the United States and the United Kingdom. These imported clusters are unlikely to cause local transmission, since every confirmed case must stay in hospital until multiple test results are negative.
As early as February 2020, Taiwan banned visitors from mainland China from entering the country2. Given Taiwan's geographic proximity to mainland China, the epicenter of the outbreak, this was an important step by the Taiwanese government to control the spread of COVID-19. As cases increased globally, the Taiwanese government introduced additional controls from March 14, 2020, limiting entry to Taiwanese only and encouraging the use of face masks. In addition, inbound travelers were required to complete a 14-day home quarantine after entering Taiwan3.
Given the tight Taiwanese government policies on COVID-19 and strong health awareness among the Taiwanese population, we would like to understand their positive impact on the daily number of confirmed COVID-19 cases. This is challenging because:
- Only 64 of the 100 days analyzed have confirmed cases, which leaves very little data for training machine learning models
- In addition, the Taiwanese government cannot publish patient details such as age, home address, and date of recovery
The published data includes the patients’ registered residential regions, travel history, and the infection cluster or group a patient belongs to. Therefore, additional public data had to be extracted for predictive modelling and insights generation.
Approaches and Results
The additional public data and extracted features include:
- Taiwan COVID-19 government policies
  - Taiwan Centers for Disease Control (CDC) published data: two numeric and two text features extracted (daily confirmed cases, infection cluster information, related Mandarin news on the policies, and the total number of people tested)
  - Ministry of the Interior’s National Immigration Agency4 immigration data: one numeric feature extracted (number of daily inflow passengers)
- Health awareness among Taiwanese
  - Google search trends5 in Taiwan for COVID-19 related keywords in Traditional Chinese: 45 numeric features extracted, including 新冠肺炎 口罩 (COVID-19 face mask), 新冠肺炎 居家隔離 (COVID-19 home isolation), and 武漢肺炎 症狀 (Coronavirus symptoms)
Additional feature engineering steps were taken to create potentially useful features. The working hypothesis was that the discovery of a new cluster, or an increase in the frequency of published news, could correlate with the number of confirmed cases.
- New cluster indicator: A binary flag indicating if a new cluster is found on that day
- Number of new infection clusters found: A numeric feature indicating the number of new infection clusters found on that day
- Total number of news items from CDC: A numeric feature indicating the total number of news items published by Taiwan CDC
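The three engineered features above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the cluster names, the sample dates, and the function name `build_daily_features` are all hypothetical, and the news count is assumed to be a per-day count (the article does not say whether it is daily or cumulative).

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical inputs: one (date, cluster) record per confirmed case, and the
# publication date of each CDC news item. Values are made up for illustration.
cases = [
    (date(2020, 3, 18), "imported-US"),
    (date(2020, 3, 18), "imported-UK"),
    (date(2020, 3, 19), "imported-US"),
    (date(2020, 3, 21), "workplace-A"),
]
news_dates = [date(2020, 3, 18), date(2020, 3, 18), date(2020, 3, 20)]


def build_daily_features(cases, news_dates, start, end):
    """Return one row per day with the three engineered features."""
    # A cluster is "new" on the first day any of its cases is reported.
    first_seen = {}
    for day, cluster in sorted(cases):
        first_seen.setdefault(cluster, day)
    new_clusters_per_day = Counter(first_seen.values())
    news_per_day = Counter(news_dates)

    rows = []
    day = start
    while day <= end:
        n_new = new_clusters_per_day.get(day, 0)
        rows.append({
            "date": day,
            "new_cluster_flag": int(n_new > 0),        # binary indicator
            "new_cluster_count": n_new,                 # number of new clusters
            "cdc_news_count": news_per_day.get(day, 0), # assumed per-day count
        })
        day += timedelta(days=1)
    return rows


daily_features = build_daily_features(
    cases, news_dates, date(2020, 3, 18), date(2020, 3, 21)
)
for row in daily_features:
    print(row)
```

Days with no cases or news still get a row with zeros, which keeps the table at one row per day, matching the layout described in this article.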
A total of 54 features and 100 rows were used in the predictive modeling and insights generation. Each row represents a day, along with the corresponding characteristics.
The average number of cases per day is four, with 36 days out of 100 days having zero confirmed cases. Notably, from Chart 1 below, there are two spikes in the number of confirmed cases:
- March 18-31, 2020 (245 cases in total, 58% of all cases): caused by a surge in the number of residents returning from overseas6 (imported cases)
- April 18, 2020 (22 cases in total, 5% of all cases): caused by a navy ship cluster7
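As a quick sanity check, the summary figures above can be reproduced from the counts quoted in this article. The sketch assumes that the 100 analyzed days cover all 419 confirmed cases; the variable names are illustrative.

```python
# Counts quoted in the article.
total_cases = 419
days_analyzed = 100
march_spike_cases = 245   # March 18-31, 2020
april_spike_cases = 22    # April 18, 2020

# Assumption: all 419 cases fall within the 100 analyzed days.
average_per_day = total_cases / days_analyzed
march_share = march_spike_cases / total_cases
april_share = april_spike_cases / total_cases

print(f"Average cases per day: {average_per_day:.1f}")
print(f"March spike share: {march_share:.0%}")
print(f"April spike share: {april_share:.0%}")
```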
We tried two different approaches to predict the daily new confirmed cases: Automated Machine Learning and Automated Time Series. Both approaches were run using AutoPilot capabilities with DataRobot’s default settings in the first iteration. Subsequent iterations were run on different feature lists in order to find the optimal set of features. For the Automated Machine Learning approach, stratified sampling was used to set aside 20% of the data as a holdout set, and the remaining 80% was used for five-fold cross-validation. An out-of-time approach was not used in this analysis because we did not have a sufficient number of days to build a model with more than one backtest.
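The partitioning scheme described above (a stratified 20% holdout plus five-fold cross-validation on the rest) can be illustrated with a small standard-library sketch. It is not DataRobot's implementation: the function `stratified_partition` is hypothetical, and stratifying on whether a day had any confirmed case is an assumption made here for illustration.

```python
import random
from collections import defaultdict


def stratified_partition(labels, holdout_frac=0.2, n_folds=5, seed=0):
    """Map each row index to 'holdout' or to a fold number 0..n_folds-1.

    Rows are split separately within each stratum so that the holdout set
    and every fold keep roughly the same label mix as the full data.
    """
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for idx, label in enumerate(labels):
        by_stratum[label].append(idx)

    assignment = {}
    for indices in by_stratum.values():
        rng.shuffle(indices)
        n_holdout = round(len(indices) * holdout_frac)
        for idx in indices[:n_holdout]:
            assignment[idx] = "holdout"
        # Deal the remaining rows round-robin into the folds.
        for pos, idx in enumerate(indices[n_holdout:]):
            assignment[idx] = pos % n_folds
    return assignment


# 100 days: 64 with confirmed cases (label 1) and 36 without (label 0),
# as reported in the article. The ordering here is arbitrary.
labels = [1] * 64 + [0] * 36
assignment = stratified_partition(labels)

holdout = [i for i, part in assignment.items() if part == "holdout"]
fold_sizes = [
    sum(1 for part in assignment.values() if part == fold) for fold in range(5)
]
print(len(holdout), fold_sizes)
```

With only 100 rows, each validation fold holds about 16 days, which helps explain why a single holdout plus cross-validation was preferred over multiple out-of-time backtests.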
Based on our Automated Machine Learning approach, the most impactful feature is the Google search frequency for “home isolation (居家隔離)” one day earlier. All else being equal, a higher search frequency for “home isolation (居家隔離)” is associated with a higher number of confirmed cases. However, once the search frequency for “home isolation (居家隔離)” rises past 5%, the number of confirmed cases levels off. We also found that clusters from places where many people gather for long periods, such as “workplace” and “school,” are associated with a higher number of confirmed cases than “family” and “tour.”
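A "one day earlier" feature like the one described above is a simple lag of the daily series. The sketch below shows the idea with the standard library only; the helper `lag` and the sample search-interest values are hypothetical and are not taken from the study's data.

```python
def lag(values, periods=1):
    """Shift a daily series so that each day sees the value from `periods` days earlier.

    The first `periods` days have no history and are filled with None.
    """
    if periods < 0:
        raise ValueError("periods must be non-negative")
    n = len(values)
    k = min(periods, n)
    return [None] * k + list(values[: n - k])


# Made-up daily search interest for a keyword, one value per day.
search_interest = [10, 20, 35, 50]
search_interest_lag1 = lag(search_interest, periods=1)

# Each row pairs a day with the previous day's search interest, so the model
# only uses information that was available before the day being predicted.
for day, previous in enumerate(search_interest_lag1):
    print(day, previous)
```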
To improve model accuracy, additional features such as the daily sales volume of face masks and hand sanitizers could be used. Since 87.8% of the confirmed cases are imported, a more granular count of returning residents (e.g., a breakdown by country) could also be included.
Even though Taiwan COVID-19 government policies were not top features in our models, it is widely reported that the Taiwanese leadership (such as the Vice President8 and Health Minister9) has received international recognition for its effectiveness and is seen as a model for other countries to emulate.
This article was reviewed by Weizhong Toh, Jean Tsai, and Javier Lombana, and benefited from edits by Kathleen McMorrow.
1 Taiwan Centers for Disease Control. Accessed on May 19, 2020
2 Chinese Residents to be Prohibited from Entering Taiwan. Accessed on May 19, 2020
3 Inbound Travelers will be Subject to a 14-day Period of Home Quarantine after Entering Taiwan. Accessed on May 19, 2020
4 Taiwan Immigration Data. Accessed on May 19, 2020
5 Google Coronavirus Search Trends. Accessed on May 19, 2020
6 Taiwan’s total number of imported cases and local transmission. Accessed on May 19, 2020
7 Taiwan President Apologizes After 28 Navy Sailors Infected in COVID-19 Cluster. Accessed on May 19, 2020
8 Taiwan’s Weapon Against Coronavirus: An Epidemiologist as Vice President. Accessed on May 19, 2020
9 Taiwan Rewards Health Minister Chen Shih-chung’s Coronavirus Success Story. Accessed on May 19, 2020
Clifton is a Customer-Facing Data Scientist (CFDS) at DataRobot working in Singapore and leads the Asia Pacific (APAC) CFDS team. His vertical domain expertise is in banking, insurance, and government, and his horizontal domain expertise is in cybersecurity, fraud detection, and public safety. Clifton’s PhD and Bachelor’s degrees are from Clayton School of Information Technology, Monash University, Australia. In his free time, Clifton volunteers professional services to events, conferences, and journals. He was also part of teams that won analytics competitions.
Jessie is a Customer-Facing Data Scientist (CFDS) at DataRobot working in Taipei, Taiwan.
Hwee Theng Yeo is a Customer-Facing Data Scientist (CFDS) at DataRobot working in Singapore.