Score Incoming Job Applicants

Identify the most-qualified candidates from a broader pool of job applicants.

Overview

Business Problem

An organization’s personnel are key to its success, but the right people can be hard to find. SHRM (the Society for Human Resource Management) estimated in 2017 that between applicants, referrals, and agencies, there are 100 applicants to every hire.

Recruiters dealing with high numbers of applicants are forced to process them extremely quickly (the notorious “six seconds per resume” rule) rather than spending time going deeper with the best candidates and crafting a compelling value proposition for them. Automated screening can free up that time, give applicants results more quickly, and dramatically speed up the hiring process.

Intelligent Solution

With AI, organizations can identify candidates who have the right background and credentials to be successful in the role. The explainable insights from AI models (e.g., the relative importance of education vs job experience for new entry-level hires) can provide valuable guidance for recruiters and hiring managers. Prediction Explanations (e.g., individual callouts highlighting what makes someone a particularly strong candidate) could also be used to inform a new hire’s onboarding process to proactively address any relative weaknesses.

Finally, and most importantly, models are explainable and consistent, and they can be documented to ensure compliance with regulatory guidelines and fairness to applicants.

IMPORTANT: Many countries have laws that protect employees from discrimination in hiring and employment decisions. Beyond legal compliance, fairness is the right thing to do. It is essential that you work closely and proactively with your organization’s HR, Legal, and Compliance teams to ensure that the models you build will pass legal and ethical scrutiny before they are put into production. (See the Business Implementation section to learn more about this use case and Trusted AI.)

Value Estimation

How would I measure ROI for my use case? 

Most HR departments track average cost per hire. This metric combines both internal and external factors (recruiter time, agency fees, recruiting bonuses, employee travel, etc.) and represents the total amount of money and time spent to bring a new hire into the organization. 

SHRM’s 2016 Human Capital Benchmarking Report put the average cost per hire at $4,129. (This will vary by industry and job role, e.g., entry-level roles will be lower.) If a machine learning algorithm can reduce the total candidate pool by 30% at the beginning of the hiring process, that can save recruiters time and dramatically reduce cost per hire. A 10% reduction in total cost per hire, achieved by reducing the demands on recruiters’ time, would equate to over $400 saved per hire brought into the organization.
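The arithmetic behind this estimate can be sketched in a few lines. This is only an illustration: the function names, the placeholder cost per hire, and the annual hire count are hypothetical, and you should substitute the figures your own HR department tracks.

```python
def savings_per_hire(cost_per_hire: float, reduction_rate: float) -> float:
    """Dollar savings per hire for a fractional reduction in cost per hire."""
    if not 0.0 <= reduction_rate <= 1.0:
        raise ValueError("reduction_rate must be between 0 and 1")
    return cost_per_hire * reduction_rate


def annual_savings(cost_per_hire: float, reduction_rate: float, hires_per_year: int) -> float:
    """Total yearly savings, assuming the reduction applies to every hire."""
    return savings_per_hire(cost_per_hire, reduction_rate) * hires_per_year


# Placeholder inputs (assumptions, not benchmarks): replace with your own data.
example_cost_per_hire = 4000.0   # hypothetical average cost per hire, in dollars
example_reduction = 0.10         # the 10% reduction discussed in the text
example_hires_per_year = 250     # hypothetical hiring volume

per_hire = savings_per_hire(example_cost_per_hire, example_reduction)
per_year = annual_savings(example_cost_per_hire, example_reduction, example_hires_per_year)
print(f"Savings per hire: ${per_hire:,.2f}")
print(f"Savings per year: ${per_year:,.2f}")
```

Keeping the cost per hire as an input makes it easy to rerun the estimate per job family, since entry-level and senior roles usually have very different costs.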

Technical Implementation

Problem Framing

A typical target variable for this use case is whether an applicant will pass a recruiter screen, which makes this a binary classification problem. The recruiter screen is usually a preliminary review done by the recruiter before passing an applicant to the hiring manager for consideration.

However, defining a target can become complex and will need to be adapted to your process as it depends on the data your organization may or may not have. While many organizations ultimately want to predict a hire decision or even on-the-job performance, there may be data limitations based on how many people were actually hired into that role.

The target, i.e., the “end result” you are trying to predict, will define what features are included in the model. If the goal is to predict which new applicants will get passed by a recruiter to a hiring manager, then we cannot use hiring manager feedback in a model because, in practice, that feedback won’t be available yet. If the target is instead a hiring decision, then the model will do best when hiring manager feedback is included. In both cases, the model should be trained only on data that is available at the time the decision is made.
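One way to enforce this rule is to record, for every candidate feature, the stage of the hiring pipeline at which it first exists, and then keep only features that exist before the target outcome is known. The sketch below is a minimal illustration; the stage names, feature names, and the `features_for_target` helper are all hypothetical.

```python
from typing import List

# Hypothetical pipeline stages, in chronological order (assumed names).
STAGE_ORDER = ["application", "recruiter_screen", "hiring_manager_review", "hire_decision"]

# Hypothetical mapping: feature name -> stage at which the feature first exists.
FEATURE_AVAILABLE_AT = {
    "application_source": "application",
    "highest_degree": "application",
    "resume_text": "application",
    "recruiter_notes": "recruiter_screen",
    "hiring_manager_feedback": "hiring_manager_review",
}


def features_for_target(target_stage: str) -> List[str]:
    """Return features created strictly before the stage whose outcome is predicted."""
    cutoff = STAGE_ORDER.index(target_stage)
    return sorted(
        name
        for name, stage in FEATURE_AVAILABLE_AT.items()
        if STAGE_ORDER.index(stage) < cutoff
    )


screen_features = features_for_target("recruiter_screen")
hire_features = features_for_target("hire_decision")
print("Predicting recruiter screen:", screen_features)
print("Predicting hire decision:", hire_features)
```

With these assumed stages, hiring manager feedback is excluded when the target is the recruiter screen and included when the target is the hire decision.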

These are some of the recommended features needed to build the model; you can add or remove features based on the nature of the data available and the requirements of the model you are trying to build. 

  • Numeric and categorical features from a structured application (e.g., previous job history, employers, education credentials)
  • Resume data, if available
  • Source of application
  • If required, external tools or resume parsers can be used to do pre-processing and provide additional structure to raw applicant data.

These datasets usually come from Greenhouse or a similar ATS (Applicant Tracking System). For jobs in which candidates don’t generally provide a resume, any kind of completed job application can be used, provided it’s in a machine-readable format (e.g., avoid scanned PDFs).

Sample Feature Set

Feature Name | Data Type | Description | Data Source | Example
Pass_Screen | Binary (Target) | Whether the applicant passes the recruiter screen for a given role | ATS | True
Application Source | Categorical | Source of the application | ATS | Employee referral
Highest degree attained | Categorical | Highest educational credential | ATS | 2-year college degree
Previous employers | Text | List of previous employers | ATS | Billy Jo’s Pizza
Educational studies | Text or Categorical | Dropdown or user-entered text describing educational study | ATS | Business Management
Resume | Text | Raw resume text (if available) | ATS (may need to be converted from PDF) |
Questions asked on a job page | Numeric or Categorical | “How many years experience do you have working directly with customers?” | ATS |
Job description | Text | Description of the position being hired for | Job postings |

Data Preparation

To prepare the data, applicant data from an ATS is converted to a machine-readable format as needed (e.g., text fields are extracted from a PDF document). Each row of the training data represents an application rather than an applicant, as applicants may apply to different positions or to the same position multiple times. Any relevant external data sources can then be added as new features.

For an applicant scoring model to be accurate, it should be specific. Similar roles can be grouped together, but fundamentally different roles should be trained with different models. This is where automation and iteration are helpful. For instance, a model trained on hires within a specific geography might reveal more concrete insights (e.g., a certain university is a good feeder program for new Analysts) than a national model.

We should also be careful to exclude people from our training data who “failed” the recruiter screen but were actually qualified. Recruiters may decide not to interview applicants for a variety of reasons unrelated to their qualifications, including because the candidates themselves expressed that they weren’t interested. This data can usually be found in the Applicant Tracking System (ATS).
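The exclusion step can be sketched as a simple filter over application-level rows. This is a minimal illustration under stated assumptions: the field names, the rejection reason labels, and the `build_training_rows` helper are hypothetical, and a real ATS export will use its own codes.

```python
# Hypothetical ATS export: one dict per application (not per applicant).
applications = [
    {"application_id": 1, "applicant_id": "A", "role": "Analyst",
     "passed_screen": True, "rejection_reason": None},
    {"application_id": 2, "applicant_id": "A", "role": "Associate",
     "passed_screen": False, "rejection_reason": "not_qualified"},
    {"application_id": 3, "applicant_id": "B", "role": "Analyst",
     "passed_screen": False, "rejection_reason": "candidate_withdrew"},
    {"application_id": 4, "applicant_id": "C", "role": "Analyst",
     "passed_screen": False, "rejection_reason": "position_filled"},
]

# Assumed labels for rejections that say nothing about qualifications.
NON_QUALIFICATION_REASONS = {"candidate_withdrew", "position_filled"}


def build_training_rows(rows):
    """Drop failed screens whose recorded reason is unrelated to qualifications."""
    kept = []
    for row in rows:
        unrelated = row["rejection_reason"] in NON_QUALIFICATION_REASONS
        if not row["passed_screen"] and unrelated:
            continue  # a misleading negative label, so exclude it from training
        kept.append(row)
    return kept


training_rows = build_training_rows(applications)
print([row["application_id"] for row in training_rows])
```

Note that applicant "A" appears twice in the kept rows, once per application, which matches the one-row-per-application layout described above.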

Model Training

DataRobot Automated Machine Learning automates many parts of the modeling pipeline. Instead of hand-coding and manually testing dozens of models to find the one that best fits your needs, DataRobot automatically runs dozens of models and finds the most accurate one for you, all in a matter of minutes. In addition to training the models, DataRobot automates other steps in the modeling process such as processing and partitioning the dataset.

While we will jump straight to deploying the model, you can take a look here to see how DataRobot works from start to finish and to understand the data science methodologies embedded in its automation. 

A few key modeling decisions for this use case:

  • Partitioning: Hiring practices change over time in response to both the macroeconomic environment and organizational initiatives. An out-of-time validation (OTV) partitioning scheme evaluates model performance on the most recent data and gives a more accurate benchmark of how well the model will perform when deployed.
  • Setting a threshold: If the model is to be used as a pass/fail screen, explore the false positive and false negative rates at different thresholds. Ultimately, organizational demand will also help determine the threshold. For example, if the hiring pipeline is sparse, the needs of the organization might necessitate a lower threshold (i.e., more candidates passed) than the optimal one determined in training.
  • Accuracy metrics: If the model is being used to stack-rank applicants, consider using AUC in addition to LogLoss as a measure of performance for binary classification. 
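The three decisions above can be illustrated with plain Python. This is a sketch on toy data, not a description of how DataRobot computes these quantities internally; the helper names and the example records are hypothetical.

```python
import math
from datetime import date


def out_of_time_split(rows, cutoff):
    """Train on applications received before the cutoff, validate on the rest."""
    train = [r for r in rows if r["applied_on"] < cutoff]
    valid = [r for r in rows if r["applied_on"] >= cutoff]
    return train, valid


def error_rates(y_true, scores, threshold):
    """(false positive rate, false negative rate) when score >= threshold means pass.

    Assumes y_true contains at least one 0 and at least one 1.
    """
    fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
    fn = sum(1 for y, s in zip(y_true, scores) if s < threshold and y == 1)
    return fp / y_true.count(0), fn / y_true.count(1)


def auc(y_true, scores):
    """Chance that a random positive outranks a random negative (ties count half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def log_loss(y_true, scores, eps=1e-15):
    """Mean negative log-likelihood of the labels under the predicted probabilities."""
    clipped = [min(max(s, eps), 1 - eps) for s in scores]
    total = sum(y * math.log(p) + (1 - y) * math.log(1 - p) for y, p in zip(y_true, clipped))
    return -total / len(y_true)


# Toy records (hypothetical): application dates, labels, and model scores.
applications = [
    {"application_id": 1, "applied_on": date(2023, 1, 10)},
    {"application_id": 2, "applied_on": date(2023, 4, 2)},
    {"application_id": 3, "applied_on": date(2023, 8, 21)},
    {"application_id": 4, "applied_on": date(2023, 11, 5)},
]
train_rows, valid_rows = out_of_time_split(applications, cutoff=date(2023, 7, 1))

y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.7]
for threshold in (0.25, 0.5):
    fpr, fnr = error_rates(y_true, scores, threshold)
    print(f"threshold={threshold}: FPR={fpr:.2f}, FNR={fnr:.2f}")
print(f"AUC={auc(y_true, scores):.3f}, LogLoss={log_loss(y_true, scores):.3f}")
```

On this toy data, lowering the threshold passes more candidates, which removes false negatives at the cost of more false positives. AUC depends only on the ordering of the scores, which is why it suits stack-ranking, while LogLoss also penalizes poorly calibrated probabilities.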

Business Implementation

Decision Environment 

After you finalize a model, DataRobot makes it easy to deploy the model into your desired decision environment. Decision environments are the methods by which predictions will ultimately be used for decision making.

Decision Maturity 

Automation | Augmentation | Blend 

There are many ways to implement a hiring model in practice. Some organizations use a hiring model as a pass/fail screening tool to cut down on the number of applications that recruiters are required to read and review. This has the advantage of giving candidates an answer more quickly.

Other organizations use the score as a way to stack-rank applicants, allowing recruiters to focus on the most promising candidates first. The most sophisticated is a blended approach: set a relatively low pass/fail barrier so that the “automatic no” values are removed from the pipeline up front. From there, provide recruiters the scores and the Prediction Explanations to help them make better decisions faster. 
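The blended approach can be sketched as a two-step routine: filter out scores below a low barrier, then rank what remains. The barrier value, field names, and `blended_queue` helper below are hypothetical and only illustrate the idea.

```python
from typing import Dict, List, Tuple

AUTO_REJECT_BELOW = 0.10  # hypothetical low pass/fail barrier


def blended_queue(
    scored: List[Dict], auto_reject_below: float = AUTO_REJECT_BELOW
) -> Tuple[List[Dict], List[Dict]]:
    """Split scored applications into a ranked review queue and an auto-rejected list."""
    rejected = [a for a in scored if a["score"] < auto_reject_below]
    review = sorted(
        (a for a in scored if a["score"] >= auto_reject_below),
        key=lambda a: a["score"],
        reverse=True,  # most promising candidates first
    )
    return review, rejected


# Toy scored applications with a one-line explanation each (made-up values).
scored_applications = [
    {"application_id": 11, "score": 0.62, "explanation": "relevant prior role"},
    {"application_id": 12, "score": 0.04, "explanation": "missing required credential"},
    {"application_id": 13, "score": 0.88, "explanation": "employee referral"},
    {"application_id": 14, "score": 0.35, "explanation": "partial experience match"},
]

review_queue, auto_rejected = blended_queue(scored_applications)
for app in review_queue:
    print(app["application_id"], app["score"], app["explanation"])
```

Recruiters would then work through `review_queue` from the top, with the score and explanation shown alongside each application.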

Model Deployment

All new applicants to a role should be scored on a batch basis (e.g., one batch request per hour). Predictions and Prediction Explanations should be returned and saved in the database underlying the Applicant Tracking System. 
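A batch job of this kind might look like the sketch below. Everything here is a stand-in: `score_batch` is a placeholder for whatever prediction interface your deployment exposes, the table layout is invented, and an in-memory SQLite database takes the place of the ATS database.

```python
import sqlite3


def score_batch(applications):
    """Placeholder for the deployed model: returns (score, explanation) per application."""
    return [(0.5, "placeholder explanation") for _ in applications]


def run_batch(conn, applications):
    """Score a batch of new applications and persist predictions with explanations."""
    results = score_batch(applications)
    conn.executemany(
        "INSERT INTO predictions (application_id, score, explanation) VALUES (?, ?, ?)",
        [
            (app["application_id"], score, explanation)
            for app, (score, explanation) in zip(applications, results)
        ],
    )
    conn.commit()
    return len(results)


# In-memory database as a stand-in for the ATS database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE predictions ("
    "application_id INTEGER PRIMARY KEY, score REAL, explanation TEXT)"
)

new_applications = [{"application_id": 101}, {"application_id": 102}]
written = run_batch(conn, new_applications)
print(f"Stored {written} predictions")
```

In practice a scheduler would call `run_batch` once per interval (for example, hourly) with the applications received since the previous run.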

Decision Stakeholders

  • Decision Executors: Recruiters are the most direct consumers and will use the predictions on a daily or weekly basis.
  • Decision Managers: Hiring managers and ultimately Chief Human Resources Officers and their teams are responsible for making sure that decisions are being made correctly.
  • Decision Authors: Data scientists, HR analysts, and industrial/organizational psychologists are all well-positioned to build the models in DataRobot. IT Support or vendors can be brought in if there are particular data processing challenges (e.g., PDFs).

Decision Process

Here are some examples of decisions you can make using the predictions generated from the model.

  • Remove candidates who don’t meet minimum qualifications from the hiring pipeline.
  • Review the most promising candidates first.
  • Accelerate the review process for recruiters and hiring managers by leveraging explainable insights to highlight each candidate’s comparative strengths and weaknesses.

Model Monitoring

Models should be retrained when data drift tracking shows significant deviations between the scoring and training data. In addition, if there are significant changes to the role (e.g., a change in the requirements of a position), the model will need to be refreshed. In that case, teams may have to manually rescore historical applicants against the updated requirements before model retraining can occur. 
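To make "deviation between the scoring and training data" concrete, the sketch below computes the Population Stability Index (PSI) for one categorical feature. PSI is one common drift statistic and is used here as an assumption; the text does not say which statistic your drift tracking relies on, and the function name and sample values are hypothetical.

```python
import math
from collections import Counter


def population_stability_index(train_values, scoring_values, eps=1e-6):
    """PSI between the category distributions of two samples of one feature.

    Categories missing from one sample get a small floor (eps) to avoid log(0).
    """
    categories = set(train_values) | set(scoring_values)
    train_counts = Counter(train_values)
    scoring_counts = Counter(scoring_values)
    psi = 0.0
    for category in categories:
        p = max(train_counts[category] / len(train_values), eps)
        q = max(scoring_counts[category] / len(scoring_values), eps)
        psi += (q - p) * math.log(q / p)
    return psi


# Toy "application source" values at training time vs. at scoring time.
train_sources = ["referral"] * 30 + ["job_board"] * 60 + ["agency"] * 10
stable_sources = ["referral"] * 3 + ["job_board"] * 6 + ["agency"] * 1
shifted_sources = ["referral"] * 10 + ["job_board"] * 30 + ["agency"] * 60

print(f"stable:  {population_stability_index(train_sources, stable_sources):.4f}")
print(f"shifted: {population_stability_index(train_sources, shifted_sources):.4f}")
```

A value near zero means the two distributions match. What counts as a significant deviation is a judgment call that the team should agree on before deployment.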

Finally, think carefully about how to evaluate model accuracy. If a model imposes a pass/fail requirement but failing applicants are never evaluated by the recruiters, then we can observe False Positives (applicants predicted to pass who did not) but never False Negatives (applicants rejected who would have passed). In a blended scenario (a pass/fail barrier plus scores shown to recruiters), the model directly influences the recruiters’ decision making, which tends to make the model seem more accurate than it is.

The best way to evaluate accuracy is to have recruiters score a certain number of applicants independently and evaluate the model accuracy based on those cases.
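One way to set this up is to draw a random audit sample from all incoming applications, including those the model would fail, and have recruiters review the sample without seeing the model output. The sketch below assumes hypothetical helper names and toy decisions.

```python
import random


def draw_audit_sample(application_ids, fraction, seed=0):
    """Randomly select applications for independent (model-blind) recruiter review."""
    rng = random.Random(seed)
    ids = list(application_ids)
    k = max(1, round(len(ids) * fraction))
    return sorted(rng.sample(ids, k))


def audit_accuracy(model_pass, recruiter_pass):
    """Share of audited applications where the model and the recruiter agree."""
    agree = sum(
        1 for app_id, decision in recruiter_pass.items() if model_pass[app_id] == decision
    )
    return agree / len(recruiter_pass)


def missed_candidates(model_pass, recruiter_pass):
    """Audited applications the model failed but the recruiter passed (false negatives)."""
    return sorted(
        app_id
        for app_id, decision in recruiter_pass.items()
        if decision and not model_pass[app_id]
    )


audit_ids = draw_audit_sample(range(1, 101), fraction=0.10)

# Toy outcomes for four audited applications (made-up values).
model_pass = {1: True, 2: False, 3: True, 4: False}
recruiter_pass = {1: True, 2: True, 3: True, 4: False}
print(f"Audit size: {len(audit_ids)}")
print(f"Agreement: {audit_accuracy(model_pass, recruiter_pass):.2f}")
print(f"False negatives: {missed_candidates(model_pass, recruiter_pass)}")
```

Because the sample is drawn before any screening, it is the only place where false negatives become visible.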

Trusted AI 

In addition to traditional risk analysis, AI Trust is essential for this use case. 

Bias & Fairness: HR decision makers need to be aware of the risks that come with automating decision making within HR. Specifically, models trained on historically biased hiring practices can learn and reflect those same biases. It is incredibly important to make sure that your organization involves the right decision makers and content experts when building models to ensure that they remain fair.

However, there is also opportunity here. Upturn (a think tank focused on justice in technology) published guidelines for ethical AI in hiring. They note both the risks and opportunities of using AI in this space, suggesting that “with more deliberation, transparency, and oversight, some new hiring technologies might be poised to help improve on [our current ethical] baseline.”

The key, they argue, is explainability. Machine learning in this space must be transparent, documented, and explainable. Using the suite of explainability tools in DataRobot, non-data scientist HR teams can understand:

  • what features are important in a model
  • model performance for key segments or demographic groups
  • how the models are actually working in practice
  • individual-specific explanations of *why* a model is returning the score it does 

This is particularly important for free text fields, where it is essential to understand and actively control what words and phrases the model is allowed to learn from. Importantly, these models are not “black boxes;” rather, they are fully transparent and controlled by the organization.

In addition to explainability, bias testing should be part of model evaluation. One bias test that may be appropriate is statistical parity, which measures whether different demographic groups have an equal probability of achieving a favorable outcome. In this case, that would mean testing whether groups defined by protected attributes (e.g., race, gender, ethnicity) pass the recruiting screen at equivalent rates. In US law, the four-fifths rule is generally used to determine if a personnel process demonstrates adverse impact: a selection rate for one group that is less than four-fifths (80%) of the rate for a comparable group is generally regarded as evidence of adverse impact.
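The four-fifths check can be sketched as follows. The group labels and counts are made up, the helper names are hypothetical, and the comparison group is taken to be the group with the highest selection rate, which is the usual convention; confirm the appropriate comparison with your Legal and Compliance teams.

```python
def selection_rates(outcomes):
    """outcomes: iterable of (group, passed) pairs -> {group: pass rate}."""
    totals, passes = {}, {}
    for group, passed in outcomes:
        totals[group] = totals.get(group, 0) + 1
        passes[group] = passes.get(group, 0) + (1 if passed else 0)
    return {group: passes[group] / totals[group] for group in totals}


def adverse_impact_ratios(rates):
    """Each group's selection rate divided by the highest group's selection rate."""
    best = max(rates.values())
    if best == 0:
        return {group: 1.0 for group in rates}  # nobody passed, so no disparity to measure
    return {group: rate / best for group, rate in rates.items()}


def groups_below_four_fifths(rates, threshold=0.8):
    """Groups whose impact ratio falls below the four-fifths (80%) threshold."""
    ratios = adverse_impact_ratios(rates)
    return sorted(group for group, ratio in ratios.items() if ratio < threshold)


# Toy screening outcomes (made-up counts): 50 of 100 pass in group_a, 30 of 100 in group_b.
outcomes = (
    [("group_a", True)] * 50 + [("group_a", False)] * 50
    + [("group_b", True)] * 30 + [("group_b", False)] * 70
)

rates = selection_rates(outcomes)
print(rates)
print(adverse_impact_ratios(rates))
print(groups_below_four_fifths(rates))
```

In this toy example group_b's ratio is 0.3 / 0.5 = 0.6, which is below 0.8, so the screen would warrant a closer review.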

Note: Leaders interested in ethics and bias should also consider attending DataRobot’s course on Ethical AI, which teaches executives how to identify ethical issues within machine learning and develop an ethics policy for AI. 
