Forecast Patient Volume to Improve Staffing

Census or patient admission forecasting helps healthcare providers optimize their staffing and resource needs.


Business Problem

Hospitals and healthcare facilities schedule clinical staff according to the volume of patients and admissions that come into the facility. The true volume is only known in real time (as it happens), so schedules created in advance risk being inaccurate when admissions rise or fall unexpectedly.

Additionally, because information about future admissions and patient volumes isn’t available, administrators cannot order and stock the correct amount of resources and equipment; this can leave the hospital with insufficient supplies to handle an unexpected uptick in patient admissions.

To address this, administrators currently rely on simple models based on moving averages and historical data from similar time periods.
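
As a rough illustration of the kind of simple baseline described above, the sketch below forecasts the next day's volume two ways: as a trailing moving average, and as the average of the same weekday over recent weeks. The function names and the sample volumes are invented for illustration and do not come from any particular hospital system.

```python
from statistics import mean


def moving_average_forecast(history, window=7):
    """Forecast the next day as the mean of the last `window` observations."""
    if len(history) < window:
        raise ValueError("need at least `window` observations")
    return mean(history[-window:])


def same_weekday_forecast(history, weeks=4):
    """Forecast the next day as the mean of the same weekday in prior weeks.

    `history` is a list of daily volumes ending yesterday, so the value
    7 days before the forecast day sits at index -7, 14 days before at -14.
    """
    if len(history) < 7 * weeks:
        raise ValueError("need at least `weeks` full weeks of history")
    return mean(history[-7 * k] for k in range(1, weeks + 1))


# Hypothetical daily volumes for four weeks (Mon..Sun), invented for illustration.
daily_volume = [230, 225, 220, 218, 240, 190, 180] * 4

ma = moving_average_forecast(daily_volume, window=7)
sw = same_weekday_forecast(daily_volume, weeks=4)
```

The moving average smooths over the weekly pattern, while the same-weekday average preserves it; neither reacts to trends or known future events, which is part of the motivation for a learned model.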

Intelligent Solution

Knowing the expected daily admission or census rate weeks or months ahead of time helps providers staff appropriately, using a staffing ratio suited to the facility’s operations. With AI, a hospital can build time series models to predict daily patient volume 14-30 days in advance, giving enough lead time to staff and resource a given facility or department so that it runs efficiently and patients are not underserved. Providers can also use the patient volume forecast to order and stock resources and equipment.

While an AI model can produce the patient volume forecast effectively, the appropriate staffing or resource ratio is typically set by reviewing historical ratios and applying subject matter expertise.
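
To make the split between the forecast and the ratio concrete, here is a minimal sketch that turns a daily volume forecast into a headcount. The ratio of 5 patients per nurse, the minimum staffing floor, and the forecast values are all placeholder assumptions; in practice the ratio comes from the facility's own history and clinical judgment.

```python
import math


def nurses_needed(forecast_volume, patients_per_nurse, minimum_staff=1):
    """Convert a forecast patient volume into a headcount, rounding up."""
    if patients_per_nurse <= 0:
        raise ValueError("patients_per_nurse must be positive")
    return max(minimum_staff, math.ceil(forecast_volume / patients_per_nurse))


# Hypothetical forecast output: date -> predicted daily patient volume.
forecast = {"2013-01-01": 230.0, "2013-01-02": 212.4}

schedule = {
    day: nurses_needed(volume, patients_per_nurse=5)
    for day, volume in forecast.items()
}
```

Rounding up is a deliberate choice here: a fractional nurse cannot be scheduled, and rounding down would understaff.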

Value Estimation

What has ROI looked like for this use case?

Just a 1% reduction in registered nurse hours paid per patient day netted $2 million in savings per year across only eight of the 38 hospitals in Steward’s network.

How would I measure ROI for my use case? 

ROI is usually calculated by taking some percentage reduction of the number of nursing (or other clinical staff) hours for a given provider. Conservatively, you can expect to see savings that are in the range of a 3-5% reduction in staffing. ROI will vary depending on the organization, the type of staff they utilize, and their use of contracted clinicians.
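
A back-of-the-envelope version of this calculation might look like the sketch below. The annual paid hours and hourly cost are made-up inputs chosen only to show the arithmetic; substitute your organization's own figures.

```python
def annual_staffing_savings(annual_paid_hours, avg_hourly_cost, reduction_pct):
    """Estimated yearly savings from reducing paid staff hours by a percentage."""
    return annual_paid_hours * avg_hourly_cost * reduction_pct / 100.0


# Hypothetical inputs: 500,000 paid nursing hours per year at $60 per hour.
low = annual_staffing_savings(500_000, 60.0, reduction_pct=3)
high = annual_staffing_savings(500_000, 60.0, reduction_pct=5)
```

With these placeholder inputs, the 3-5% range corresponds to roughly $900,000 to $1,500,000 per year.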

Technical Implementation

About the Data 

For illustrative purposes, we have created a synthetic dataset to show how DataRobot can help providers use time series models to forecast patient volume.  

Problem Framing

The most important requirement is that each record in the dataset includes a date and a facility/department, so that the data can be used with time series models.

The target variable we are forecasting in this example is daily patient volume for each facility, which is typical when forecasting the number of admissions of patients for immediate care facilities and emergency departments. In other scenarios, the problem can also be framed to forecast the daily beds utilized.  

A good model can be built with just the first three features listed below (Day, Facility/Department, and Volume). Adding a calendar file or exogenous, known-in-advance features will help model performance, but these are not typically needed to build a viable model.

Sample Feature List
| Feature Name | Data Type | Description | Data Source | Example |
| --- | --- | --- | --- | --- |
| Day | Date | Day of census or admission | EDW | 2013-1-1 |
| Facility/Department | Text | Facility or department of patient volume | EDW | Gotham Central Hospital |
| Volume | Float | Number of patients | EDW | 230 |
| Capacity/Beds* | Float | Number of beds or rooms available (should be Known in Advance) | EDW | 275 |
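
Before modeling, it can help to confirm that the data really has one row per day per facility. The sketch below is one possible check, assuming rows are (day, facility, volume) tuples; "Metropolis Clinic" and all volumes are invented sample values.

```python
from collections import defaultdict
from datetime import date, timedelta


def find_gaps(rows):
    """Return {facility: [missing dates]} and raise on duplicate rows."""
    by_facility = defaultdict(set)
    for day, facility, _volume in rows:
        if day in by_facility[facility]:
            raise ValueError(f"duplicate row for {facility} on {day}")
        by_facility[facility].add(day)

    gaps = {}
    for facility, days in by_facility.items():
        start, end = min(days), max(days)
        span = (end - start).days + 1
        expected = {start + timedelta(days=i) for i in range(span)}
        missing = sorted(expected - days)
        if missing:
            gaps[facility] = missing
    return gaps


# Hypothetical sample rows; 2013-01-03 is missing for the first facility.
rows = [
    (date(2013, 1, 1), "Gotham Central Hospital", 230.0),
    (date(2013, 1, 2), "Gotham Central Hospital", 241.0),
    (date(2013, 1, 4), "Gotham Central Hospital", 228.0),
    (date(2013, 1, 1), "Metropolis Clinic", 85.0),
    (date(2013, 1, 2), "Metropolis Clinic", 90.0),
]

gaps = find_gaps(rows)
```
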
Data Preparation

In the real world, a good patient forecast tends to need 3-4 years’ worth of data. You may also need to make some adjustments to your data. For example, if some facilities or departments are newer and have much less history than the rest, you may need to model them separately or remove them from the model entirely. (DataRobot’s Series Accuracy tool can help you determine this.) Also, if capacity (the number of beds or rooms) was added to a facility, this needs to be captured as a “known in advance” feature.
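
One simple way to spot facilities with too little history is sketched below. The three-year cutoff, the facility names, and the dates are illustrative assumptions, and this check is separate from the Series Accuracy tool mentioned above.

```python
from datetime import date


def split_by_history(first_seen, as_of, min_years=3):
    """Partition facilities into those with enough history and newer ones."""
    min_days = 365 * min_years
    established, newer = [], []
    for facility, first_day in sorted(first_seen.items()):
        if (as_of - first_day).days >= min_days:
            established.append(facility)
        else:
            newer.append(facility)
    return established, newer


# Hypothetical first date on which each facility appears in the data.
first_seen = {
    "Gotham Central Hospital": date(2013, 1, 1),
    "Gotham North Clinic": date(2016, 6, 1),
}

established, newer = split_by_history(first_seen, as_of=date(2017, 1, 1))
```

Facilities in `newer` are candidates for a separate model or for exclusion, depending on how they perform.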

Model Training

DataRobot automates many parts of the modeling pipeline, so for the sake of this tutorial we will be more focused on the specific use case rather than the generic parts of the modeling process.

Interpreting the Results

Feature Impact illustrates how much the calendar file matters. Here, facility and date-derived features have a strong impact on patient volume. Most of the time, much of the signal also comes from the history of the target variable itself.

Feature Effects helps by explaining the shape of those relationships. 

Evaluate Accuracy

These models are often optimized on RMSE or Gamma Deviance, but MAE is often a good metric for evaluation because it is easier to interpret and share with business users. To determine whether you need to split your use case into separate projects, look at accuracy across the series and across the forecast horizons.
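
For readers less familiar with the two metrics, here is a minimal sketch of MAE and RMSE computed by hand. The actual and predicted volumes are invented for illustration.

```python
import math


def mae(actual, predicted):
    """Mean absolute error: the average size of the miss, in patients."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and the same length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error: penalizes large misses more heavily."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and the same length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical daily volumes and forecasts.
actual = [230, 240, 250, 260]
predicted = [228, 244, 250, 250]

mae_value = mae(actual, predicted)
rmse_value = rmse(actual, predicted)
```

Here the MAE of 4.0 reads directly as "off by about 4 patients per day", while the RMSE is pulled higher by the single 10-patient miss.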

Evaluating with MAE

Looking at accuracy by series to decide whether additional projects and models are needed, you can see that “Gotham Central Hospital” performs worse than the other locations. In this case, it may make sense to create a new dataset with just the “Gotham Central Hospital” data and run a single-series project to see whether accuracy improves.

Looking at accuracy by forecast distance, we can see that the model is less accurate at longer forecast distances. This is a normal curve. However, if you notice dramatic shifts in accuracy at certain forecast distances, it may make sense to take the same data and create a separate project where the Forecast Distance starts at the point where the shift occurs.
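
If you export predictions and actuals, the same two views (accuracy by series and accuracy by forecast distance) can be approximated outside the platform. The sketch below assumes each record is a dictionary with hypothetical keys `facility`, `distance`, `actual`, and `predicted`; all values are invented.

```python
from collections import defaultdict


def mae_by(records, key):
    """Group records by `key` and return the MAE within each group."""
    totals = defaultdict(lambda: [0.0, 0])  # key -> [sum of abs errors, count]
    for record in records:
        bucket = totals[record[key]]
        bucket[0] += abs(record["actual"] - record["predicted"])
        bucket[1] += 1
    return {group: total / count for group, (total, count) in totals.items()}


# Hypothetical backtest output at forecast distances of 1 and 14 days.
records = [
    {"facility": "Gotham Central Hospital", "distance": 1, "actual": 230, "predicted": 220},
    {"facility": "Gotham Central Hospital", "distance": 14, "actual": 240, "predicted": 210},
    {"facility": "Metropolis Clinic", "distance": 1, "actual": 90, "predicted": 88},
    {"facility": "Metropolis Clinic", "distance": 14, "actual": 85, "predicted": 79},
]

by_facility = mae_by(records, "facility")
by_distance = mae_by(records, "distance")
```

Because facilities differ in size, a raw MAE comparison across series favors small facilities; comparing errors relative to typical volume is often a fairer view.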

It’s also worth looking at the Accuracy Over Time plots to see whether there are periods where the model performs poorly. If there are, you may want to go back and reposition your backtests to cover those periods.


For post-processing, the biggest consideration is often whether to include a prediction interval and, if so, at what size. With models like this, it’s expected that some predictions will at times need to be manually adjusted by clinical and operational experts.
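
One common way to build such an interval outside the platform is from the empirical distribution of backtest residuals. The sketch below uses a nearest-rank rule and is only one possible approach, not necessarily how DataRobot computes its intervals; the residuals and the point forecast are invented.

```python
def empirical_interval(point_forecast, residuals, coverage=0.8):
    """Prediction interval from backtest residuals (actual minus predicted)."""
    if not 0 < coverage < 1:
        raise ValueError("coverage must be between 0 and 1")
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    ordered = sorted(residuals)
    n = len(ordered)
    # Nearest-rank index for the lower tail; the upper index mirrors it.
    lo_idx = round((1 - coverage) / 2 * (n - 1))
    hi_idx = n - 1 - lo_idx
    return point_forecast + ordered[lo_idx], point_forecast + ordered[hi_idx]


# Hypothetical residuals collected from backtests.
backtest_residuals = [-20, -12, -8, -5, -2, 0, 3, 6, 9, 14, 25]

interval = empirical_interval(230, backtest_residuals, coverage=0.8)
```

A wider coverage gives planners more protection against understaffing at the cost of a less decisive range, which is why the interval size is a business decision as much as a statistical one.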

Business Implementation

Decision Environment

Once you have found the best model, DataRobot makes it easy to deploy the model into your desired decision environment. Decision environments are the ways in which the predictions generated by the model will be consumed by the appropriate stakeholders in your organization, and how these stakeholders will make decisions using the predictions to impact the overall process.

Decision Maturity 

Automation | Augmentation | Blend

In this use case, we use DataRobot’s Automated TimeSeries (AutoTS) to forecast demand for each facility or department. While these forecasts can be generated automatically, the output will be consumed by your clinicians, who will make the final decision on how many clinicians need to be staffed.

Model Deployment

The forecasted patient volumes are often surfaced through a dashboard or reporting tool. The best option here is to deploy the model via the API, call it from your data prep or ETL workflow, and then write the predictions to a reporting warehouse or file storage where the reporting or dashboard tool can access them. The forecasts should be displayed intuitively so that clinicians can easily use them to make decisions.
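
The last step of that workflow, writing predictions somewhere a dashboard can read them, could be as simple as the sketch below. It assumes the predictions have already been retrieved from the deployed model as a list of dictionaries; the column names are hypothetical, and an in-memory buffer stands in for the real file or warehouse target.

```python
import csv
import io

# Hypothetical column layout for the reporting table.
FIELDS = ["forecast_date", "facility", "predicted_volume"]


def write_forecasts(rows, handle):
    """Write forecast rows as CSV to any file-like object; return row count."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    for row in rows:
        writer.writerow({field: row[field] for field in FIELDS})
    return len(rows)


# Hypothetical predictions already fetched from the deployment.
predictions = [
    {"forecast_date": "2013-01-01", "facility": "Gotham Central Hospital", "predicted_volume": 230.0},
    {"forecast_date": "2013-01-02", "facility": "Gotham Central Hospital", "predicted_volume": 212.4},
]

buffer = io.StringIO()  # stand-in for an open file in shared storage
written = write_forecasts(predictions, buffer)
```

In a real pipeline, `buffer` would be replaced by an open file or a database load step, and the job would run on the same cadence as the forecast refresh.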

Decision Stakeholders

Decision Executors

Clinical and operations leaders

Decision Managers

Clinical and operations leaders

Decision Authors

Data Scientists, Clinical liaisons, Operational experts and Engineers

Decision Process

Operational and clinical experts will use these forecasts to adjust staffing and resources, with ROI coming from increased efficiency and better utilization of resources. The forecasted patient volume will serve as a benchmark for how many clinicians need to be staffed. The provider can set its own ratio for how many clinicians should be on duty for a given number of patients. Clinicians can view these results at a regular cadence on a dashboard tool to monitor staffing requirements for the next several weeks or months.

Model Monitoring 

Models should be tracked by evaluating drift, but with these models there is often a human in the loop who judges whether systemic changes in the environment (e.g., COVID-19) call for significant changes to the model or for more manual intervention.
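
Alongside drift tracking, a lightweight accuracy check can prompt that human review. The sketch below is not a data drift statistic; it simply compares recent MAE against the MAE seen at training time. The 1.5x tolerance and all numbers are arbitrary assumptions.

```python
def accuracy_alert(baseline_mae, recent_actual, recent_predicted, tolerance=1.5):
    """Flag the model for review when recent MAE exceeds tolerance x baseline."""
    if not recent_actual or len(recent_actual) != len(recent_predicted):
        raise ValueError("inputs must be non-empty and the same length")
    errors = [abs(a - p) for a, p in zip(recent_actual, recent_predicted)]
    recent_mae = sum(errors) / len(errors)
    return recent_mae > tolerance * baseline_mae, recent_mae


# Hypothetical surge: actual volumes far above what the model predicted.
flagged, recent_mae = accuracy_alert(
    baseline_mae=8.0,
    recent_actual=[300, 320, 310],
    recent_predicted=[230, 235, 228],
)
```

A flag here does not say what changed, only that someone with operational context should look.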

Implementation Risks

The models will need to be updated when systemic changes occur, because such changes can lead to data drift. This may involve manual interventions, such as temporarily using manual forecasts or switching to a simple naive model. As with the changes brought about by the COVID-19 pandemic, providers will need to retrain their models so that they reflect the current environment.

Operations and clinical staff who are not familiar with advanced forecasting techniques may be hesitant about moving to a machine learning model. It’s important that everyone understands the strengths and reliability of AI.

End users must have appropriate access to the reporting or visualization software, and the model must be integrated in the appropriate place within business operations. For ad-hoc materialization, the Excel add-in may be a good option (and you can find more information in the DataRobot Platform Documentation).
