
Understanding the Effective Management of COVID-19 in Taiwan

June 5, 2020


As of April 19, 2020, Taiwan had one of the lowest numbers of confirmed COVID-19 cases in the world at 419 cases [1], of which 189 had recovered. The major infection clusters in March 2020 were imported from two major regions, the United States and the United Kingdom. These imported clusters are unlikely to cause local transmission, since every confirmed case must stay in hospital until multiple test results are negative.

As early as February 2020, Taiwan banned visitors from mainland China from entering the country [2]. Because Taiwan is geographically close to mainland China, the infection epicenter, this was an important step by the Taiwanese government to control the spread of COVID-19. As cases increased globally, the Taiwanese government introduced additional controls from March 14, 2020, limiting immigration entry to Taiwanese only and encouraging the use of face masks. In addition, inbound travelers were required to complete a 14-day home quarantine after entering Taiwan [3].

The Problem

Given the Taiwanese government’s tight COVID-19 policies and the strong health awareness among the Taiwanese population, we wanted to understand the impact of both on the daily number of confirmed COVID-19 cases. This is challenging because:

  1. Only 64 of the 100 days analyzed have confirmed cases. Such limited data makes this a difficult machine learning problem
  2. In addition, the Taiwanese government cannot publish patient details, such as age, home address, and recovery date

The published data includes the patients’ registered residential regions, travel history, and the infection cluster or group a patient belongs to. Therefore, additional public data had to be extracted for predictive modelling and insights generation.

Approaches and Results

The additional public data and extracted features include:

  • Taiwan COVID-19 government policies
    1. Taiwan Centers for Disease Control (CDC) published data: Two numeric and two text features extracted – daily confirmed cases, infection cluster information, related Mandarin news on the policies, and the total number of people tested
    2. Ministry of the Interior’s National Immigration Agency [4] immigration data: One numeric feature extracted – number of daily inbound passengers
  • Health awareness among Taiwanese
    1. Google search trends [5] in Taiwan for COVID-19 related keywords in Traditional Chinese: 45 numeric features extracted – for example, 新冠肺炎 口罩 (COVID-19 face mask), 新冠肺炎 居家隔離 (COVID-19 home isolation), and 武漢肺炎 症狀 (Coronavirus symptoms)

Additional feature engineering steps were taken to create potentially useful features. The discovery of a new cluster, or an increase in the frequency of published news, could be correlated with the number of confirmed cases.

  • New cluster indicator: A binary flag indicating if a new cluster is found on that day
  • Number of new infection clusters found: A numeric feature indicating the number of new infection clusters found on that day
  • Total number of news items from CDC: A numeric feature indicating the total number of news items published by Taiwan CDC on that day
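As a rough illustration of how these three engineered features could be derived, here is a minimal sketch in plain Python. The input layout (one `(date, cluster)` pair per confirmed case and one date per CDC news item), the cluster names, and the function name `build_daily_features` are all hypothetical; the article does not describe the actual extraction code.

```python
from collections import Counter
from datetime import date

# Hypothetical sample inputs (not real Taiwan CDC records).
cases = [
    (date(2020, 3, 18), "imported-US"),
    (date(2020, 3, 18), "imported-UK"),
    (date(2020, 3, 19), "imported-US"),
    (date(2020, 3, 20), "workplace-A"),
]
news_dates = [date(2020, 3, 18), date(2020, 3, 18), date(2020, 3, 20)]
days = [date(2020, 3, d) for d in (17, 18, 19, 20)]


def build_daily_features(days, cases, news_dates):
    """Build one feature row per day from case-level and news-level records."""
    # A cluster counts as "new" only on the first day it appears.
    first_seen = {}
    for day, cluster in sorted(cases):
        first_seen.setdefault(cluster, day)
    new_clusters_per_day = Counter(first_seen.values())
    news_per_day = Counter(news_dates)

    rows = []
    for day in days:
        n_new = new_clusters_per_day.get(day, 0)
        rows.append({
            "date": day,
            "new_cluster_flag": int(n_new > 0),  # binary indicator
            "num_new_clusters": n_new,           # count of new clusters
            "num_cdc_news": news_per_day.get(day, 0),
        })
    return rows


features = build_daily_features(days, cases, news_dates)
```

Days with no cases or news still get a row with zeros, which matches the one-row-per-day layout of the training data.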

A total of 54 features and 100 rows were used in the predictive modeling and insights generation. Each row represents a day, along with the corresponding characteristics.

Table 1: Snapshot of the training data

The average number of cases per day is four, with 36 of the 100 days having zero confirmed cases. Chart 1 below shows two notable spikes in the number of confirmed cases:

  1. March 18-31, 2020 (245 cases in total, 58% of all cases): caused by a surge in the number of residents returning from overseas [6] (imported cases)
  2. April 18, 2020 (22 cases in total, 5% of all cases): caused by a navy ship cluster [7]
Chart 1: Number of daily confirmed cases over time
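The summary figures quoted for the two spikes can be checked with a few lines of arithmetic. This sketch assumes the denominator is the 419 total cases cited at the start of the article, which is consistent with the stated 58% and 5% shares.

```python
# Figures taken from the article; the 419 total is assumed to be the denominator.
total_cases = 419
n_days = 100
zero_days = 36
spikes = {"March 18-31": 245, "April 18": 22}

mean_per_day = total_cases / n_days    # 4.19, i.e. about four cases per day
days_with_cases = n_days - zero_days   # 64
shares = {name: round(100 * n / total_cases) for name, n in spikes.items()}

print(shares)  # {'March 18-31': 58, 'April 18': 5}
```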

We tried two different approaches to predict the daily new confirmed cases: Automated Machine Learning and Automated Time Series. Both approaches were run using AutoPilot capabilities with DataRobot’s default settings in the first iteration. Subsequent iterations were run on different feature lists in order to find the optimal set of features. For the Automated Machine Learning approach, stratified sampling was used to set aside 20% of the rows as a holdout set, with the remaining 80% used for five-fold cross-validation. An out-of-time validation approach was not used in this analysis because we do not have enough days to build a model with more than one backtest.
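To make the partitioning scheme concrete, here is a minimal standard-library sketch of a stratified 20% holdout followed by five-fold assignment. It is not DataRobot’s implementation: stratifying on whether a day had zero confirmed cases is our assumption for illustration, and the function name `stratified_partition` is hypothetical.

```python
import random


def stratified_partition(targets, holdout_frac=0.2, n_folds=5, seed=0):
    """Label each row 'holdout' or a fold index 0..n_folds-1.

    Rows are stratified on whether the target is zero (an assumption),
    so each partition keeps a similar mix of zero and non-zero days.
    """
    rng = random.Random(seed)
    strata = {}
    for i, y in enumerate(targets):
        strata.setdefault(y == 0, []).append(i)

    assignment = [None] * len(targets)
    for indices in strata.values():
        rng.shuffle(indices)
        n_holdout = round(len(indices) * holdout_frac)
        for i in indices[:n_holdout]:
            assignment[i] = "holdout"
        # Deal the remaining rows round-robin into the folds.
        for k, i in enumerate(indices[n_holdout:]):
            assignment[i] = k % n_folds
    return assignment


# 36 zero-case days and 64 days with cases, as in the article (values are dummies).
targets = [0] * 36 + [3] * 64
assignment = stratified_partition(targets)
```

With 100 rows this leaves 20 rows in the holdout and 15 to 17 rows in each validation fold, which shows how little data each fold has to work with.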

Approaches to COVID research
Chart 2: Feature Impact chart with redundant features highlighted in yellow

Based on our Automated Machine Learning approach, the most impactful feature is the Google search frequency for “home isolation (居家隔離)” one day earlier. All else being equal, a higher search frequency for “home isolation (居家隔離)” is associated with a higher number of confirmed cases. However, once the search frequency for “home isolation (居家隔離)” rises past 5%, the number of confirmed cases remains the same. We also found that places where large numbers of people frequently gather for long periods, such as “workplace” and “school,” have a higher number of confirmed cases than “family” and “tour.”
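A “one day earlier” feature like this is a simple lag: each day’s row carries the previous day’s value. The sketch below shows one way to build it in plain Python, assuming rows are sorted by date with no missing days. The column name `home_isolation_search`, the helper `add_lag`, and the sample values are hypothetical.

```python
def add_lag(rows, column, lag=1):
    """Return copies of rows with '<column>_lag<lag>' set to the value
    from `lag` rows earlier (None where no earlier row exists)."""
    lagged = []
    for i, row in enumerate(rows):
        new_row = dict(row)  # copy so the input rows are left untouched
        new_row[f"{column}_lag{lag}"] = rows[i - lag][column] if i >= lag else None
        lagged.append(new_row)
    return lagged


# Hypothetical daily rows, sorted by date (values are made up for illustration).
rows = [
    {"date": "2020-03-17", "home_isolation_search": 2.0, "confirmed": 10},
    {"date": "2020-03-18", "home_isolation_search": 3.5, "confirmed": 23},
    {"date": "2020-03-19", "home_isolation_search": 4.0, "confirmed": 8},
]
lagged = add_lag(rows, "home_isolation_search")
```

The first row has no previous day, so its lagged value is left as `None` rather than filled with a guess.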


To improve model accuracy, additional features such as the daily sales volume of face masks and hand sanitizers could be used. Since 87.8% of the confirmed cases are imported, a more granular count of returning residents (e.g., a breakdown by country) could also be included.

Even though Taiwan’s COVID-19 government policies were not top features in our models, it is widely reported that the Taiwanese leadership (such as the Vice President [8] and Health Minister [9]) has received international recognition for its effectiveness and is seen as a model for other countries to emulate.



This article was reviewed by Weizhong Toh, Jean Tsai, and Javier Lombana, and benefited from edits by Kathleen McMorrow.


[1] Taiwan Centers for Disease Control. Accessed on May 19, 2020

[2] Chinese Residents to be Prohibited from Entering Taiwan. Accessed on May 19, 2020

[3] Inbound Travelers will be Subject to a 14-day Period of Home Quarantine after Entering Taiwan. Accessed on May 19, 2020

[4] Taiwan Immigration Data. Accessed on May 19, 2020

[5] Google Coronavirus Search Trends. Accessed on May 19, 2020

[6] Taiwan’s total number of imported cases and local transmission. Accessed on May 19, 2020

[7] Taiwan President Apologizes After 28 Navy Sailors Infected in COVID-19 Cluster. Accessed on May 19, 2020

[8] Taiwan’s Weapon Against Coronavirus: An Epidemiologist as Vice President. Accessed on May 19, 2020

[9] Taiwan Rewards Health Minister Chen Shih-chung’s Coronavirus Success Story. Accessed on May 19, 2020

About the author
Clifton Phua

Customer Facing Data Scientist, DataRobot

Clifton is a Customer-Facing Data Scientist (CFDS) at DataRobot working in Singapore, where he leads the Asia Pacific (APAC) CFDS team. His vertical domain expertise is in banking, insurance, and government; his horizontal domain expertise is in cybersecurity, fraud detection, and public safety. Clifton’s PhD and Bachelor’s degrees are from Clayton School of Information Technology, Monash University, Australia. In his free time, Clifton volunteers professional services to events, conferences, and journals. He was also part of teams that won some analytics competitions.


Jessie Lan

Customer Facing Data Scientist, DataRobot

Jessie is a Customer-Facing Data Scientist (CFDS) at DataRobot working in Taipei, Taiwan.


Hwee Theng Yeo

Customer Facing Data Scientist, DataRobot

Hwee Theng Yeo is a Customer-Facing Data Scientist (CFDS) at DataRobot working in Singapore.
