Predicting Severity of COVID-19 Patients

April 1, 2020

Updated: April 5, 2020

This follow-up study was conducted 2 weeks after the first study (see below). Some key statistics are:

  • 161% increase in cases: 1189 known COVID-19 cases in entire Singapore
  • Higher proportion of cases being hospitalized, despite limiting non-essential gatherings: 74.5% are in hospital, 0.5% (6 patients) have died, and 25% have recovered
  • More local transmissions than imported cases, as a result of banning all short-term visitors from entering Singapore: 54% local transmissions and 46% imported cases
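As a quick sanity check, the 161% increase and the 0.5% share of deaths can be reproduced from the case counts quoted in the two studies (455 on March 22 and 1189 on April 5). The snippet below is only an illustrative sketch; the variable and function names are our own and are not part of the original analysis.

```python
# Case counts quoted in the original study (March 22) and this update (April 5).
cases_march_22 = 455
cases_april_5 = 1189
deaths_april_5 = 6


def pct_increase(old, new):
    """Percentage increase from old to new."""
    return (new - old) / old * 100


increase = pct_increase(cases_march_22, cases_april_5)
death_share = deaths_april_5 / cases_april_5 * 100

print(f"Increase in cases: {increase:.0f}%")
print(f"Share of deaths: {death_share:.1f}%")
```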

In this 676-patient sample, there are significantly more serious cases now: 547 (81%) are or had been in serious condition, of which 373 are still in hospital. We left out patients who were admitted after 25th March 2020. The new clusters of cases mostly come from communities, such as worker dormitories and nursing homes.
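The 25th March cutoff is consistent with the 11-day definition of a serious case used in the original study below: a patient admitted later than 11 days before the study date cannot yet have been observed for 11 days. A minimal sketch of that date arithmetic follows, assuming the cutoff is simply the study date minus 11 days; the function names are hypothetical.

```python
from datetime import date, timedelta

# Minimum hospital stay for a serious case, as defined in the original study.
MIN_SERIOUS_STAY_DAYS = 11


def latest_admission_date(study_date, min_stay_days=MIN_SERIOUS_STAY_DAYS):
    """Latest admission date for which a full 11-day stay is observable."""
    return study_date - timedelta(days=min_stay_days)


def is_eligible(admission_date, study_date):
    """True if the patient was admitted early enough to be in the sample."""
    return admission_date <= latest_admission_date(study_date)


print(latest_admission_date(date(2020, 4, 5)))   # follow-up study
print(latest_admission_date(date(2020, 3, 22)))  # original study
```

The same rule gives March 11, 2020 for the original study dated March 22, 2020.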

Besides Symptomatic to Confirmation and Details, we also used features such as Displayed Symptoms, Age, Country of Origin, and Nationality. Using this 676-patient sample and these features, we achieved around 71% accuracy, which is much lower than the model built 2 weeks ago. We attribute this to fewer details being published online about the new cases: most of the Symptomatic to Confirmation values are missing, and Details now reveals only the hospital where the patient was admitted and the countries previously travelled to.

On 3rd April 2020 (Friday), 2 days before this follow-up study, it was announced that most workplaces and all schools would close soon to reduce local transmissions [1].

In addition to triaging, we have learnt that predicting severity for COVID-19 has more downstream uses, such as understanding the upcoming demand for staff, beds, and critical medical equipment. We can also extend this to other medical admission types, such as flu and pneumonia.

Original Study: March 22, 2020

Our objective is to predict the severity of a COVID-19 patient at time of hospital admission, to provide a second opinion to the triaging officer, so that more resources can be accurately allocated to a serious case [2]. We define a serious case as a patient who was hospitalised for at least 11 days, which is equal to or more than the average stay of a serious COVID-19 case [3], or who died in hospital.
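The definition above can be written as a simple labelling rule. This is a minimal sketch rather than the code used in the study; the function and argument names are hypothetical.

```python
# Threshold from the definition of a serious case in the text.
MIN_SERIOUS_STAY_DAYS = 11


def is_serious(days_in_hospital, died_in_hospital):
    """Label a patient as a serious case.

    A case is serious if the patient died in hospital or stayed
    (or has so far stayed) at least 11 days.
    """
    return died_in_hospital or days_in_hospital >= MIN_SERIOUS_STAY_DAYS


print(is_serious(days_in_hospital=14, died_in_hospital=False))
print(is_serious(days_in_hospital=4, died_in_hospital=False))
```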

We used Singapore as a specific example because it has more proactive testing and containment efforts than most other countries, and we intend to extend COVID-19 severity prediction to other countries. The main data source is the Singapore government’s Ministry of Health (compiled from public sources [4] and [5]). This study was conducted on March 22, 2020 (Sunday) and some key statistics are:

  • 455 known COVID-19 cases in entire Singapore, with the first case confirmed on January 23, 2020
  • >75% between ages of 21 to 60
  • >70% are Singaporeans and the rest are made up of at least 25 other nationalities
  • 68% are in hospital, 31.5% have recovered, and 0.5% (2 patients) have died
  • 54% imported cases (14% from UK, 5% from US, 5% from China) and 46% local transmission

In this 190-patient sample, 119 (63%) are or had been in serious condition, of which 44 are still in hospital, which can potentially introduce some bias. We left out patients who were admitted after March 11, 2020 because we do not know if they will remain in hospital for at least 11 days. Using this sample, we achieved around 85% accuracy (AUC) on a holdout set. The top two features that predict severity are:

  1. Details contains a textual description of each patient, usually collected at time of hospital admission. It covers the patient's profile, recent travel history, who and where (s)he visited, where (s)he is hospitalized, which known COVID-19 cases (s)he is connected to, when the symptoms began, where (s)he initially sought treatment, etc.
    [Figure: Feature Details WordCloud]
    In the WordCloud above, the size of a word indicates its frequency, and red indicates a higher chance of the case being severe. One interesting insight is that patients in the National Centre for Infectious Diseases (NCID), a 330-bed purpose-built facility for infectious diseases, tend to have lower severity compared to those in other public and private hospitals. NCID could be receiving patients of lower severity or providing better treatment, but we are not able to discern this from the current sample. Another insight is that the initial cases from China and/or Wuhan seemed to be less severe cases who were well enough to travel to Singapore.
  2. Symptomatic to Confirmation refers to the days elapsed between onset of symptoms [6] and confirmation of COVID-19.
    [Figure: Feature Symptomatic to Confirmation Histogram]
    The Histogram above shows that if a patient is tested and confirmed long after the onset of symptoms, it is most likely a mild or moderate case. We have 2 explanations for this. First, for milder cases, it usually takes longer for symptoms to develop sufficiently to warrant testing and hospitalization. Second, many of these early COVID-19 patients (circa early February) visited family clinics at the onset of symptoms, but were only referred to hospitals much later because COVID-19 was not as widespread (at that time it was mostly contained in China, Japan, Hong Kong, and Singapore).
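The two features above could be derived roughly as follows. This is an illustrative sketch only, not the study's pipeline: the function names are hypothetical, the three sample records are made up for demonstration, and the word statistics are just one simple way to approximate what the WordCloud shows (how often a word appears, and the share of serious cases among patients whose Details mention it).

```python
import re
from collections import Counter
from datetime import date


def symptomatic_to_confirmation(onset, confirmed):
    """Days elapsed between onset of symptoms and confirmation.

    Returns None when the onset date is not published (missing value).
    """
    if onset is None:
        return None
    return (confirmed - onset).days


def word_stats(records):
    """Map each word to (patients mentioning it, share of them serious).

    records is an iterable of (details_text, is_serious) pairs.
    """
    mentions = Counter()
    serious_mentions = Counter()
    for text, is_serious in records:
        # Count each word once per patient.
        for word in set(re.findall(r"[a-z]+", text.lower())):
            mentions[word] += 1
            if is_serious:
                serious_mentions[word] += 1
    return {w: (n, serious_mentions[w] / n) for w, n in mentions.items()}


# Toy records, invented purely for illustration.
toy_records = [
    ("Warded at NCID", False),
    ("Warded at NCID after travel to UK", False),
    ("Warded at SGH", True),
]
stats = word_stats(toy_records)
days = symptomatic_to_confirmation(date(2020, 2, 1), date(2020, 2, 8))
print(days, stats["ncid"], stats["sgh"])
```

Returning None for a missing onset date keeps the gap explicit, so that the modelling step can decide how to treat patients whose onset of symptoms was not published.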

On March 23, 2020 (Monday), a day after this study was conducted, there was a spike of 54 new cases (48 imported from overseas). At the end of that day, Singapore temporarily banned all short-term visitors from entering or transiting via the country [7]. On March 24, 2020 (Tuesday), Singapore limited gatherings outside work and school to 10 people or fewer, and closed all bars, entertainment venues, tuition and enrichment centres, and religious services [8]. For future work, we will be keen to:

  • understand the effects of these containment measures on COVID-19 severity prediction in a few weeks’ time
  • extend COVID-19 severity prediction globally to other countries, such as Taiwan, Indonesia, South Korea, China, and/or France where patient data is already available
  • present studies on other COVID-19 patient use cases, such as hospital length-of-stay or days-to-recovery prediction to help with bed utilization management at the wards


This study has benefited from review and advice from Sergey Yurgenson.



[1] Coronavirus: Most workplaces to close, schools will move to full home-based learning from next week, says PM Lee. Published April 3, 2020

[2] WHO: Operational considerations for case management of COVID-19 in health facility and community. Published March 19, 2020

[3] NPR: How The Novel Coronavirus And The Flu Are Alike … And Different. Published March 20, 2020

[4] Singapore COVID-19 Dashboard. The Taiwan dashboard is now available, and others such as Indonesia, South Korea, China, and France will be available. Accessed on March 22, 2020

[5] Singapore Ministry of Health – Updates on COVID-19 local situation. Accessed on March 22, 2020

[6] CDC: Symptoms of COVID-19. Published March 20, 2020

[7] Coronavirus: All short-term visitors barred from entering or transiting in Singapore from Monday, 11.59pm. Published March 22, 2020

[8] Coronavirus: All entertainment venues in Singapore to close, gatherings outside work and school limited to 10 people. Published March 24, 2020

About the author
Clifton Phua

Customer Facing Data Scientist, DataRobot

Clifton is a Customer Facing Data Scientist (CFDS) at DataRobot working in Singapore, where he leads the Asia Pacific (APAC) CFDS team. His vertical domain expertise is in banking, insurance, and government, and his horizontal domain expertise is in cybersecurity, fraud detection, and public safety. Clifton’s PhD and Bachelor’s degrees are from Clayton School of Information Technology, Monash University, Australia. In his free time, Clifton volunteers professional services to events, conferences, and journals. He was also part of teams that won several analytics competitions.


Matt Carrigan

AI Success Director at DataRobot

Matt is an AI Success Director for Australia and New Zealand at DataRobot.


Sampad Desai

Customer-Facing Data Scientist at DataRobot

Sampad is a Customer-Facing Data Scientist at DataRobot working in India and servicing clients across the APAC region.
