Unsupervised Machine Learning

What does Unsupervised Machine Learning Mean?

Unsupervised machine learning algorithms infer patterns from a dataset without reference to known, or labeled, outcomes. Unlike supervised methods, they cannot be applied directly to a regression or classification problem: because the values of the output variable are unknown, there is nothing to train the algorithm against in the usual way. Unsupervised learning is used instead to discover the underlying structure of the data.
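As a minimal sketch of what discovering structure without labels can look like, the following pure-Python k-means routine groups points using only their coordinates. The function name `k_means`, the sample points, and the choice to seed the centroids with the first k points are assumptions made for illustration. They do not come from any particular library or product.

```python
import math
from statistics import fmean


def k_means(points, k, iterations=20):
    """Group points into k clusters using only their coordinates (no labels)."""
    # Assumption: seed the centroids with the first k points. This keeps the
    # example deterministic; real projects usually use random or k-means++ seeding.
    centroids = [tuple(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                new_centroids.append(tuple(fmean(dim) for dim in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # empty cluster: keep old centroid
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return assignments, centroids


# Hypothetical two-dimensional measurements, invented for illustration.
points = [(1, 2), (9, 9), (2, 1), (8, 10), (1, 1), (10, 9)]
labels, centers = k_means(points, k=2)
print(labels)  # -> [0, 1, 0, 1, 0, 1]
```

No outcome column appears anywhere in the input. The grouping comes entirely from distances between the rows, which is also why there is no built-in notion of a "correct" answer to score it against.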

Why is Unsupervised Machine Learning Important?

Unsupervised machine learning purports to uncover previously unknown patterns in data, but most of the time these patterns are poor approximations of what supervised machine learning can achieve. Additionally, because the correct outcomes are unknown, there is no ground truth against which to measure accuracy, which makes supervised machine learning more applicable to real-world problems.

The best time to use unsupervised machine learning is when you do not have data on desired outcomes, such as determining a target market for an entirely new product that your business has never sold before. However, if you are trying to get a better understanding of your existing consumer base, supervised learning is the optimal technique.

Some applications of unsupervised machine learning techniques include:

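  1. Clustering automatically splits a dataset into groups based on similarity. Often, however, cluster analysis overestimates the similarity between groups and does not treat data points as individuals. For this reason, cluster analysis is a poor choice for applications such as customer segmentation and targeting.
  2. Anomaly detection automatically discovers unusual data points in a dataset. This is useful for pinpointing fraudulent transactions, detecting faulty hardware components, or identifying outliers caused by human error during data entry.
  3. Association mining identifies sets of items that frequently occur together in a dataset. Retailers often use it for basket analysis, because it lets analysts find goods that are often purchased together and develop more effective marketing and merchandising strategies accordingly.
  4. Latent variable models are commonly used in data processing, for example to reduce the number of features in a dataset (dimensionality reduction) or to decompose the dataset into multiple components.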
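To make the basket-analysis item in the list above concrete, here is a small sketch that counts how often pairs of items appear in the same transaction. It covers only the pair-counting step of association mining, not a full algorithm such as Apriori. The function name `frequent_pairs`, the `min_support` parameter, and the sample baskets are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(baskets, min_support):
    """Return item pairs whose support (share of baskets containing both
    items) is at least min_support."""
    pair_counts = Counter()
    for basket in baskets:
        # Deduplicate and sort so each pair is counted once per basket
        # and always appears in the same order.
        for pair in combinations(sorted(set(basket)), 2):
            pair_counts[pair] += 1
    total = len(baskets)
    return {
        pair: count / total
        for pair, count in pair_counts.items()
        if count / total >= min_support
    }


# Hypothetical transactions, invented for illustration.
baskets = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "eggs"],
    ["bread", "butter", "eggs"],
]
pairs = frequent_pairs(baskets, min_support=0.5)
print(pairs)  # -> {('bread', 'butter'): 0.75}
```

Bread and butter appear together in three of the four baskets, so they are the only pair that clears the 50% support threshold.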

The patterns you uncover with unsupervised machine learning methods may also come in handy when implementing supervised machine learning methods later on. For example, you might use an unsupervised technique to perform cluster analysis on the data, then use the cluster to which each row belongs as an extra feature in the supervised learning model (see semi-supervised machine learning). Another example is a fraud detection model that uses anomaly detection scores as an extra feature.
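The fraud example can be sketched as follows: an unsupervised score is computed for every row and appended as a new column, which a supervised model could then consume alongside the original features. The score used here is the modified z-score (distance from the median, scaled by the median absolute deviation), chosen only because it is simple. The column names `amount`, `is_fraud`, and `anomaly_score`, the helper names, and the data are all assumptions.

```python
from statistics import median


def modified_z_scores(values):
    """Score each value by its distance from the median, scaled by the
    median absolute deviation (MAD)."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return [0.0 for _ in values]  # no spread to measure against
    # 0.6745 rescales the MAD so the scores are comparable to ordinary
    # z-scores when the data is roughly normally distributed.
    return [0.6745 * (v - med) / mad for v in values]


def add_anomaly_feature(rows, column):
    """Return copies of the rows with an extra 'anomaly_score' feature."""
    scores = modified_z_scores([row[column] for row in rows])
    return [
        {**row, "anomaly_score": abs(score)}
        for row, score in zip(rows, scores)
    ]


# Hypothetical transactions; "is_fraud" stands in for the supervised target.
transactions = [
    {"amount": 20.0, "is_fraud": 0},
    {"amount": 25.0, "is_fraud": 0},
    {"amount": 22.0, "is_fraud": 0},
    {"amount": 24.0, "is_fraud": 0},
    {"amount": 500.0, "is_fraud": 1},
]
enriched = add_anomaly_feature(transactions, "amount")
for row in enriched:
    print(row)
```

The score is computed without looking at `is_fraud`, so the unsupervised step stays independent of the target. The supervised model then decides how much weight the new feature deserves.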

Unsupervised Machine Learning + DataRobot

The DataRobot automated machine learning platform requires a “target” column — that is, it needs to know the output variable in order to uncover patterns in your data. However, many of its model blueprints utilize unsupervised learning to automate complicated feature engineering techniques, which are difficult and time-consuming to implement without automation.