
Deep Learning - Multiclass Classification

There are 2 types of multiclass classification:

  • non-exclusive classes: a single datapoint can belong to multiple classes (e.g. a photo can have multiple tags: beach, family, vacation)
  • mutually exclusive classes: a single datapoint can belong to only one class (e.g. categorizing a grayscale photo as white or black)

To handle multiple classes, the network needs 1 output node for each class.
Each output node can output a continuous regression value or a binary classification value. The output node with the highest value will be the predicted class.
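
As a minimal sketch of picking the predicted class from the output nodes, assuming NumPy is available: the tag names reuse the example above, and the output values are made up for illustration.

```python
import numpy as np

# One output node per class; names and values are illustrative assumptions.
class_names = ["beach", "family", "vacation"]
outputs = np.array([0.2, 1.7, -0.4])

# The predicted class is the one whose output node has the highest value.
predicted_index = int(np.argmax(outputs))
predicted_class = class_names[predicted_index]

print(predicted_class)
```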

One-hot encoding

One-hot encoding is a way to represent categorical data in a binary format. Each category is mapped to a vector of all zeros except for the index of the category, which is marked with a 1.
For mutually exclusive classes the output vector will have only one 1; for non-exclusive classes the output vector can have multiple 1s.
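
The encoding described above can be sketched as follows. The helper names `one_hot` and `multi_hot` are hypothetical, chosen for this example only, and the class lists are illustrative.

```python
import numpy as np

def one_hot(label, classes):
    """Vector of zeros with a single 1 at the index of `label`."""
    vec = np.zeros(len(classes), dtype=int)
    vec[classes.index(label)] = 1
    return vec

def multi_hot(labels, classes):
    """Vector of zeros with a 1 at the index of every label in `labels`."""
    vec = np.zeros(len(classes), dtype=int)
    for label in labels:
        vec[classes.index(label)] = 1
    return vec

# Mutually exclusive classes: exactly one 1.
colors = ["Red", "Green", "Blue"]
green = one_hot("Green", colors)

# Non-exclusive classes: possibly several 1s.
tags = ["beach", "family", "vacation"]
photo = multi_hot(["beach", "vacation"], tags)

print(green, photo)
```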

In the case of non-exclusive classes a suitable activation function is the sigmoid function, which outputs a value between 0 and 1 for each output node. The output vector will be a vector of independent probabilities, where each value represents the probability of the datapoint belonging to that class.
In the case of mutually exclusive classes the sigmoid function is not suitable, since the sum of the probabilities would not be 1. The sigmoid function can only give the probability of belonging to each class on its own, without comparing the classes against each other.
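
To illustrate what a sigmoid on each output node produces, here is a small sketch with made-up input values. The 0.5 decision threshold is an assumption for this example, not something fixed by the method.

```python
import numpy as np

def sigmoid(z):
    """Element-wise sigmoid: maps each value into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

tags = ["beach", "family", "vacation"]
raw_outputs = np.array([2.0, -1.0, 0.5])  # illustrative values

# Each probability is computed independently of the others,
# so the values are not constrained to sum to 1.
probs = sigmoid(raw_outputs)

# Assumed rule: a tag is predicted when its probability is at least 0.5.
predicted_tags = [tag for tag, p in zip(tags, probs) if p >= 0.5]

print(probs, probs.sum(), predicted_tags)
```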

Softmax

The softmax function is a generalization of the sigmoid function for multiple classes. It outputs a vector of probabilities, where the sum of the probabilities is 1. The softmax function is defined as:

<img src="https://storage.rottigni.tech/fs/github/images/ML/DL-softmax-function.png" alt="softmax function equation" width="400">

K is the number of categories, z is the input vector and z_i is the i-th element of the input vector.
The softmax function will calculate the probabilities of each target class over all possible target classes.

E.g. for a multiclass classification with labels [Red, Green, Blue], the softmax function will provide 3 probabilities, one for each class, and the sum of the probabilities will be 1 (for example [0.1, 0.6, 0.3]).
In other words, the model assigns a 100% probability to the datapoint belonging to one of the 3 classes.
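
A minimal NumPy sketch of softmax for the [Red, Green, Blue] example follows. The input values are made up, and subtracting the maximum before exponentiating is a common numerical-stability trick added here; it does not change the result.

```python
import numpy as np

def softmax(z):
    """softmax(z)_i = exp(z_i) / sum_j exp(z_j), for j = 1..K."""
    shifted = z - np.max(z)  # stability trick; cancels out in the ratio
    exp_z = np.exp(shifted)
    return exp_z / exp_z.sum()

labels = ["Red", "Green", "Blue"]
z = np.array([1.0, 2.0, 0.5])  # illustrative input vector

probs = softmax(z)

# The probabilities sum to 1, and the largest one gives the predicted class.
predicted_label = labels[int(np.argmax(probs))]

print(probs, probs.sum(), predicted_label)
```

Because the exponential is monotonic, the class with the highest input value always receives the highest probability.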
