-
Step 1:
Data Identification
Data is the fuel that drives high-scale innovation with AI. Ironically, many organizations struggle to use their data effectively because they face an overwhelming number of data sources and have no clear way to identify the most trusted ones. The AI Catalog inside DataRobot serves as a centralized source of truth where data engineers, data stewards, data scientists, and analysts gain self-service access to AI assets they can trust.
Introducing DataRobot AI Catalog
-
Step 2:
Data Preparation
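Machine learning algorithms each work differently and have different data requirements. For example, some algorithms require numeric features to be normalized, while others do not. DataRobot transforms raw data into the specific format that each algorithm needs for optimal performance, and then follows best practices for data partitioning.
How Do Model Blueprints Add Value to DataRobot?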
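As a rough illustration of the data preparation described in this step, the sketch below holds out a validation partition and then normalizes a numeric feature using statistics learned from the training partition only. It is a generic example, not DataRobot's implementation; the function names and the `ages` sample data are assumptions made for illustration.

```python
import random
from statistics import mean, pstdev

def train_validation_split(rows, validation_fraction=0.2, seed=0):
    """Shuffle rows reproducibly and hold out a validation partition."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_validation = max(1, int(len(shuffled) * validation_fraction))
    return shuffled[n_validation:], shuffled[:n_validation]

def fit_scaler(values):
    """Learn the mean and standard deviation from the training partition only."""
    mu, sigma = mean(values), pstdev(values)
    return mu, (sigma if sigma > 0 else 1.0)  # guard against constant columns

def standardize(values, mu, sigma):
    """Rescale values with previously learned statistics."""
    return [(v - mu) / sigma for v in values]

# Hypothetical numeric feature.
ages = [22, 25, 31, 38, 44, 47, 52, 58, 63, 70]

train, validation = train_validation_split(ages)
mu, sigma = fit_scaler(train)
train_scaled = standardize(train, mu, sigma)
validation_scaled = standardize(validation, mu, sigma)
```

Splitting before scaling keeps information from the validation rows out of the learned statistics, which is one common partitioning practice.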
-
Step 3:
Feature Engineering
Feature engineering is the process of modifying data to help machine learning algorithms work better, and it is often time-consuming and expensive. DataRobot engineers new features from existing numeric, categorical, and text features. It knows which algorithms benefit from extra feature engineering and which don’t, and it only generates features that make sense given the data characteristics.
DataRobot Automated Feature Engineering
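To make the idea of deriving features from numeric, categorical, and text columns concrete, here is a minimal sketch. The column names (`income`, `dependents`, `region`, `notes`) and the specific derived features are hypothetical and chosen only for illustration; they do not describe the features DataRobot actually generates.

```python
def one_hot(value, categories):
    """Turn a categorical value into 0/1 indicator features."""
    return {f"is_{c}": int(value == c) for c in categories}

def text_features(text):
    """Derive simple numeric summaries from free text."""
    return {"word_count": len(text.split()), "char_count": len(text)}

def engineer(row, categories):
    """Combine numeric, categorical, and text-derived features for one row."""
    features = {
        # Numeric ratio; the +1 avoids division by zero for rows with no dependents.
        "income_per_dependent": row["income"] / (row["dependents"] + 1),
    }
    features.update(one_hot(row["region"], categories))
    features.update(text_features(row["notes"]))
    return features

# Hypothetical input row.
row = {"income": 60000, "dependents": 2, "region": "west",
       "notes": "late payment twice"}
features = engineer(row, categories=["east", "west"])
```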
-
Step 4:
Algorithm Diversity
Every dataset contains unique information that reflects the individual characteristics of a business. Due to the variety of situations and conditions, one algorithm cannot successfully solve every possible business problem or dataset. Some machine learning automation platforms only give users access to a few types of algorithms, but with DataRobot you get immediate access to hundreds of diverse algorithms, and the appropriate pre-processing, to test against your data in order to find the best one for your particular AI challenge.
AIs are Individuals, Just Like People
-
Step 5:
Algorithm Selection
Having hundreds of algorithms at your fingertips is great, but in many cases users don’t have time to try each and every algorithm on their data. Some algorithms aren’t suited to the data, some are not suited to the data sizes, and some are extremely unlikely to work well on the data. DataRobot will only run the algorithms that make sense for your data.
Can An AI Recommend the Best Algorithm for Me?
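One simple way to picture running only the algorithms that make sense for the data is a rule-based shortlist. The sketch below is an assumption-heavy illustration: the candidate list, the `max_rows` limits, and the `supports_text` flags are invented values, and the text does not say how DataRobot actually makes this choice.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    name: str
    task: str            # "classification" or "regression"
    max_rows: int        # hypothetical size limit, illustrative only
    supports_text: bool  # hypothetical capability flag, illustrative only

# Invented catalog of candidate algorithms.
CANDIDATES = [
    Candidate("logistic_regression", "classification", 10_000_000, True),
    Candidate("kernel_svm", "classification", 50_000, False),
    Candidate("gradient_boosted_trees", "classification", 5_000_000, False),
    Candidate("linear_regression", "regression", 10_000_000, True),
]

def shortlist(task, n_rows, has_text):
    """Keep candidates that match the task, the data size, and the column types."""
    return [
        c.name
        for c in CANDIDATES
        if c.task == task
        and n_rows <= c.max_rows
        and (c.supports_text or not has_text)
    ]

selected = shortlist("classification", n_rows=200_000, has_text=False)
```

Here the kernel SVM is skipped because the dataset exceeds its assumed size limit, and the regression model is skipped because the task does not match.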
-
Step 6:
Training and Tuning
It’s standard for automated machine learning software to train the model on your data. DataRobot takes this a step further by using smart hyperparameter tuning, not just brute force, to tune the most important hyperparameters for each algorithm. The platform can also create ensemble models (also known as “blenders”) that combine the strengths of several algorithms and balance out the weaknesses of others. Ensemble models typically outperform individual algorithms because of their diversity. DataRobot finds the optimal algorithms to blend together and tunes the weighting of the algorithms within each blender model.
Data Science Fails: There’s No Such Thing As A Free Lunch
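The blending idea in this step can be sketched as a weighted average of two models' predictions, with the weight chosen to minimize validation error. This is a generic illustration using a plain grid search; the function names and the sample predictions are assumptions, and it is not a description of how DataRobot tunes its blenders.

```python
def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def blend(pred_a, pred_b, weight):
    """Weighted average: weight on model A, the remainder on model B."""
    return [weight * a + (1 - weight) * b for a, b in zip(pred_a, pred_b)]

def best_blend_weight(y_true, pred_a, pred_b, steps=20):
    """Try evenly spaced weights in [0, 1] and keep the one with the lowest error."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(candidates, key=lambda w: mse(y_true, blend(pred_a, pred_b, w)))

# Hypothetical validation targets and predictions from two models.
y_valid = [10.0, 20.0, 30.0, 40.0]
model_a = [12.0, 22.0, 32.0, 42.0]  # consistently over-predicts
model_b = [8.0, 18.0, 28.0, 38.0]   # consistently under-predicts

weight = best_blend_weight(y_valid, model_a, model_b)
blended = blend(model_a, model_b, weight)
```

Because the two models err in opposite directions, the blend has lower validation error than either model alone, which is the diversity benefit the paragraph describes.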
-
Step 7:
Head-to-Head Model Competitions
You won’t know in advance which algorithm will perform best, so you need to compare the accuracy and speed of different algorithms on your data, regardless of which programming language or machine learning library they came from. You can think of it as a competition among the models where the best model wins. DataRobot builds and trains dozens of models, compares the results, and ranks the models by accuracy, speed, and the most efficient combination of the two.
Competition in AI Blog
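A model competition of this kind can be pictured as a small leaderboard. The sketch below ranks hypothetical results by accuracy and by speed, and interprets "the most efficient combination of the two" as the set of models that no other model beats on both measures (a Pareto front). That interpretation, the model names, and the numbers are all assumptions for illustration.

```python
from typing import NamedTuple

class Result(NamedTuple):
    model: str
    accuracy: float    # higher is better
    predict_ms: float  # prediction latency in milliseconds, lower is better

def rank_by_accuracy(results):
    return sorted(results, key=lambda r: r.accuracy, reverse=True)

def rank_by_speed(results):
    return sorted(results, key=lambda r: r.predict_ms)

def pareto_front(results):
    """Models that no other model matches or beats on both accuracy and speed."""
    def dominated(r):
        return any(
            o.accuracy >= r.accuracy
            and o.predict_ms <= r.predict_ms
            and (o.accuracy > r.accuracy or o.predict_ms < r.predict_ms)
            for o in results
        )
    return [r for r in results if not dominated(r)]

# Hypothetical leaderboard entries.
results = [
    Result("blender", 0.93, 40.0),
    Result("gbt", 0.91, 12.0),
    Result("linear", 0.85, 2.0),
    Result("svm", 0.84, 30.0),
]

by_accuracy = [r.model for r in rank_by_accuracy(results)]
by_speed = [r.model for r in rank_by_speed(results)]
efficient = [r.model for r in pareto_front(results)]
```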
-
Step 8:
Human-Friendly Insights
Over the past few years, some automated machine learning tools and AI have made massive strides in predictive power, but at the price of complexity. It is not enough for a model to score well on accuracy and speed – you also have to trust the answers it is giving. And in regulated industries, you must justify the model to a regulator. DataRobot explains model decisions in a human-interpretable manner, showing which features have the greatest impact on the accuracy of each model and the patterns fitted for each feature. DataRobot can also provide prediction explanations to illustrate the key reasons why a specific prediction was made.
Give me one good reason to trust artificial intelligence
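One common way to measure which features have the greatest impact on a model's accuracy is permutation importance: shuffle one feature's values across rows and see how much accuracy drops. The sketch below shows that general technique with a toy rule-based model; the text does not confirm which method DataRobot uses, and the feature names and data are hypothetical.

```python
import random

def accuracy(model, rows, labels):
    """Fraction of rows where the model's prediction matches the label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)

def permutation_impact(model, rows, labels, feature, seed=0):
    """Drop in accuracy after shuffling one feature's values across rows."""
    baseline = accuracy(model, rows, labels)
    shuffled_values = [r[feature] for r in rows]
    random.Random(seed).shuffle(shuffled_values)
    permuted = [{**r, feature: v} for r, v in zip(rows, shuffled_values)]
    return baseline - accuracy(model, permuted, labels)

def model(row):
    """Toy model that looks only at income and ignores shoe size."""
    return int(row["income"] > 50)

# Hypothetical data; labels follow the toy model's rule exactly.
rows = [
    {"income": income, "shoe_size": size}
    for income, size in [(20, 9), (35, 11), (48, 8), (55, 10), (70, 9), (90, 12)]
]
labels = [model(r) for r in rows]

income_impact = permutation_impact(model, rows, labels, "income")
shoe_size_impact = permutation_impact(model, rows, labels, "shoe_size")
```

Shuffling a feature the model ignores leaves accuracy unchanged, so its impact is zero, while shuffling a feature the model relies on can only lower accuracy in this setup.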
-
Step 9:
Easy Deployment
Harvard Business Review once described a team of analysts that built an impressive predictive model, but the business lacked the infrastructure needed to directly implement the trained model in a production setting, which was a waste of time and resources. All DataRobot models are production-ready, and can be deployed in several ways on standard system hardware.
Machine Learning Model Deployment
-
Step 10:
Model Monitoring and Management
In a constantly changing world, your AI applications will start to decay over time as the data being used to make predictions drifts away from the data the model was trained on. Unfortunately, figuring out when to replace an outdated model is difficult because traditional IT tools for managing software applications don’t work effectively for machine learning models. DataRobot MLOps provides a common framework for model deployment, monitoring, and governance, no matter what data science language or software tool was used to create the model.
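To illustrate how a gap between training data and prediction data can be detected, the sketch below computes a population stability index (PSI) for one numeric feature. PSI is one widely used drift measure, shown here as a generic example rather than as the metric DataRobot MLOps uses; the bin edges, the sample data, and the 0.25 alert threshold (a common rule of thumb) are assumptions.

```python
import math

def psi(expected, actual, bin_edges):
    """Population stability index between training-time and scoring-time values."""
    def proportions(values):
        counts = [0] * (len(bin_edges) + 1)
        for v in values:
            # Bin index is the number of edges the value exceeds.
            counts[sum(v > edge for edge in bin_edges)] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

# Hypothetical feature values at training time and in production.
training_values = list(range(100))
production_values = [v + 30 for v in training_values]  # shifted distribution
edges = [25, 50, 75]

no_drift = psi(training_values, training_values, edges)
drift = psi(training_values, production_values, edges)
needs_review = drift > 0.25  # rule-of-thumb threshold, an assumption
```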