How to Build Trust in AI
Just as trust needs to be established in our personal and business relationships, it also needs to be established between an AI user and the system. Transformative technologies such as autonomous vehicles will be possible only when there are clear methods and benchmarks to establish trust in AI systems. At DataRobot, we define the benchmark of AI maturity as AI you can trust.
Dimensions of Trust
We organize the concept of trust in an AI system into three main categories. The first is trust in the performance of your AI/machine learning model. The second is trust in the operations of your AI system. The third is trust in the ethics of your workflow, both in how the AI system is designed and in how its outputs are used to inform your business process.
In each of these three categories, we identify a set of dimensions that define them more tangibly. Taken together, these dimensions constitute a system that can earn your trust.
Performance
When it comes to evaluating the trustworthiness of AI systems, we look at multiple facets of performance. They all serve to answer the question, “How well can my model make predictions based on data?” In performance, the trust dimensions are the following:
- Data quality — the performance of any machine learning model is intimately tied to the data it was trained on and validated against. So, we ask, what recommendations and assessments can you use to verify the origin and quality of the data used? How can identifying gaps or discrepancies in the training data help you build a more trustworthy model?
- Accuracy — this refers to a subset of model performance indicators that measure a model’s aggregated errors in different ways. It’s multidimensional, so to understand accuracy holistically, you need to evaluate it through multiple tools and visualizations.
- Speed — for model performance, speed refers to the time it takes to use a model to score a prediction. The speed of model scoring directly impacts how you can use it in a business process. How large is the data set? How often is the process run, monthly or daily? How quickly is a prediction required? All of these variables play a role in how you weigh speed against accuracy.
- Robustness and stability — how do you ensure that your model will behave in consistent and predictable ways when confronted with changes or messiness in your data? Testing your model to assess its reproducibility, stability, and robustness forms an essential part of its overall evaluation.
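The accuracy dimension above stresses that no single number tells the whole story. A minimal pure-Python sketch (the data and values are purely illustrative) of why evaluating several error metrics together is more informative than relying on any one of them:

```python
import math

# Hypothetical actuals and model predictions (illustrative values only).
actuals = [10.0, 12.0, 9.0, 15.0, 30.0]
preds   = [11.0, 11.5, 10.0, 14.0, 22.0]

errors = [p - a for p, a in zip(preds, actuals)]

mae  = sum(abs(e) for e in errors) / len(errors)            # mean absolute error
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))  # root mean squared error
mape = sum(abs(e) / a for e, a in zip(errors, actuals)) / len(errors)  # mean abs. % error

# RMSE penalizes the single large miss (predicting 22 for an actual of 30)
# far more heavily than MAE does, so the two metrics can rank models differently.
print(f"MAE={mae:.2f}  RMSE={rmse:.2f}  MAPE={mape:.1%}")
```

Here MAE is 2.30 while RMSE is roughly 3.67, a gap driven almost entirely by one outlier; looking at only one of these figures would hide that behavior.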
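One simple way to probe the robustness and stability dimension is to score slightly perturbed copies of an input and check that the prediction stays within a tolerance of the baseline. The sketch below uses a hypothetical stand-in scoring function and illustrative noise and tolerance values; a real test would use your trained model and domain-appropriate perturbations:

```python
import random

def model_score(features):
    """Stand-in for a trained model's scoring function (hypothetical weights)."""
    weights = [0.4, -1.2, 0.7]
    return sum(w * x for w, x in zip(weights, features))

def stability_check(features, n_trials=200, noise=0.01, tolerance=0.1, seed=0):
    """Perturb each feature by small random noise and verify the prediction
    stays within `tolerance` of the baseline score."""
    rng = random.Random(seed)
    baseline = model_score(features)
    for _ in range(n_trials):
        perturbed = [x + rng.uniform(-noise, noise) for x in features]
        if abs(model_score(perturbed) - baseline) > tolerance:
            return False  # prediction moved too much: unstable at this input
    return True

print(stability_check([1.0, 2.0, 3.0]))
```

A model that fails such a check at realistic inputs is telling you its predictions cannot be trusted under the normal messiness of production data.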
Operations
Best practices around the operation of a system (the software and people that interact with a model) are as pivotal to its trustworthiness as the design of the model itself. In operations, these are the dimensions of trust:
- Compliance — there are generally three domains in which model risk management and regulatory compliance must be established: model development, implementation, and use. Robust documentation throughout the end-to-end modeling workflow is one of the strongest enablers of compliance.
- Security — AI systems analyze and transmit large amounts of sensitive data. Independent international standards, such as ISO 27001, exist to verify the operation of an information security management system.
- Humility — an AI prediction is fundamentally probabilistic. Therefore, not all model predictions are made with the same level of confidence. Recognizing and admitting uncertainty is a major step in establishing trust.
- Governance and monitoring — governance in AI is the formal infrastructure of managing human-machine interaction. To earn trust, it is critical that a clear system of monitoring, accountability, and redundancy be in place, including the joint oversight and collaboration of your information technology specialists, data scientists, and business users.
- Business rules — knowing when and how a business should use an AI model, and surfacing information about model confidence alongside its predictions, can also contribute to trustworthiness.
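The humility and business-rules dimensions above can be combined in practice by routing low-confidence predictions to a human instead of acting on them automatically. A minimal sketch, assuming a binary classifier that outputs a probability; the threshold values are illustrative and would come from your own business rules:

```python
def triage(probability, low=0.3, high=0.7):
    """Route a binary-classifier probability to an action.

    Predictions the model is confident about are automated; those in the
    uncertain middle band are flagged for human review (illustrative thresholds).
    """
    if probability >= high:
        return "auto-approve"
    if probability <= low:
        return "auto-reject"
    return "human review"  # the system admits uncertainty here

for p in (0.95, 0.55, 0.10):
    print(p, "->", triage(p))
```

Explicitly outputting "human review" for uncertain cases is the operational form of humility: the system admits what it does not know rather than guessing.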
Ethics
AI systems and the data they use can have an impact all over the world. It’s important that they reflect the values of multiple stakeholders with different perspectives. The dimensions of trust in ethics are:
- Privacy — individual privacy is a fundamental right, but it is also complicated by the use and exchange of data. The first step is understanding what kind of data may be defined as personally identifiable information (PII). Best practices in information security must be embraced and incorporated into any system.
- Bias and fairness — it starts with understanding what it means for an AI model to be biased, and where that bias came from. The largest source of bias in an AI system is the data it was trained on: machine learning learns from data, but that data comes from us, from our decisions and our systems. The next step is understanding how to measure that bias, which ultimately creates opportunities to mitigate the issues it uncovers.
- Explainability and transparency — how can these two linked properties facilitate the creation of a shared understanding between machine and human decision-makers? Explainability is one of the most intuitively powerful ways to build trust between a user and a model. Being able to interpret how the model works and makes decisions is a major asset to your final evaluation.
- Impact — when you are evaluating the real value that machine learning adds to a use case, an impact assessment is a powerful tool for your organization to use. It can reveal the true impact that a model has on your organization and on the individuals affected by it.
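Measuring bias, as the fairness dimension above calls for, can start very simply: compare a model's positive-outcome rates across groups. The sketch below computes a demographic parity gap on hypothetical decision data; it is one of many fairness metrics, and the data and group labels are purely illustrative:

```python
from collections import defaultdict

# Hypothetical model decisions: (group label, model approved?)
decisions = [
    ("A", True), ("A", True), ("A", False), ("A", True),
    ("B", True), ("B", False), ("B", False), ("B", False),
]

totals, positives = defaultdict(int), defaultdict(int)
for group, approved in decisions:
    totals[group] += 1
    positives[group] += approved  # True counts as 1

# Per-group approval rates and the gap between the best- and worst-treated group.
rates = {g: positives[g] / totals[g] for g in totals}
parity_gap = max(rates.values()) - min(rates.values())

print(rates)
print(f"demographic parity gap: {parity_gap:.2f}")
```

Here group A is approved 75% of the time and group B only 25%, a gap of 0.50. A large gap does not by itself prove unfairness, but it tells you exactly where to start investigating the training data and decision process.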
There is no universally agreed-upon ethical standard that can anticipate and head off every issue the development and use of an AI system may entail. But, with forethought and an understanding of the dimensions of trust (accuracy, robustness and stability, security, privacy, governance, humility, bias and fairness, explainability, and more), AI systems that reflect our values and deserve our trust are possible.