What is dimensionality reduction?
Dimensionality reduction is a machine learning technique, usually unsupervised, that helps us reduce a large number of columns (features) to a small number of columns.
For example, suppose we have a dataset that consists of 500 columns (C1, C2, C3, ..., C500) and a class variable.
1) If a dataset has 500, 1000, or more columns, fitting any machine learning algorithm to it takes more processing time. Dimensionality reduction is a way to shrink this data to fewer columns, for example 4 or any other chosen number.
2) The reduced data is a new dataset with fewer dimensions (columns).
3) We can then fit machine learning algorithms on this reduced data.
4) The process of reducing high-dimensional data to low-dimensional data varies from one technique to another.
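As a minimal sketch of steps 1 to 3, the snippet below shrinks a toy table from 500 columns to 4 by keeping the columns with the highest variance. This is only one simple feature-selection approach, chosen here because it needs nothing beyond the Python standard library. The helper name `select_top_variance_columns`, the synthetic data, and the choice of k = 4 are assumptions made for illustration.

```python
import random
import statistics


def select_top_variance_columns(rows, k):
    """Keep the k columns with the highest variance (hypothetical helper).

    rows: list of equal-length lists of numbers, one inner list per row.
    Returns (reduced_rows, kept_column_indices).
    """
    n_cols = len(rows[0])
    # Population variance of every column.
    variances = [
        statistics.pvariance([row[j] for row in rows]) for j in range(n_cols)
    ]
    # Indices of the k most variable columns, put back in original order.
    top = sorted(range(n_cols), key=lambda j: variances[j], reverse=True)[:k]
    kept = sorted(top)
    reduced = [[row[j] for j in kept] for row in rows]
    return reduced, kept


# Toy data (an assumption): 100 rows x 500 columns. Only four columns have a
# wide spread; the others are near-constant noise.
rng = random.Random(0)
informative = {3, 42, 250, 499}
data = [
    [rng.gauss(0, 10) if j in informative else rng.gauss(0, 0.1)
     for j in range(500)]
    for _ in range(100)
]

reduced, kept = select_top_variance_columns(data, k=4)
print(len(data[0]), "->", len(reduced[0]), "columns; kept:", kept)
```

The reduced rows can then be passed to a learning algorithm in place of the original 500-column data. Other techniques build new columns instead of picking existing ones, so the details differ from one technique to another.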
Disadvantages of high dimensionality
- More computation is needed as the number of dimensions grows.
- Model training becomes slow.
- The model becomes more complex.
- Prediction times become slow.
When to use Dimensionality Reduction in machine learning?
- In supervised ML
- In unsupervised ML
- In data visualization
Most popular Dimensionality Reduction techniques
- Principal component analysis (PCA)
- Linear discriminant analysis (LDA)
- t-distributed stochastic neighbor embedding (t-SNE)
- Uniform manifold approximation and projection (UMAP)
- Feature extraction
- Feature selection
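To make the first technique in the list more concrete, here is a rough sketch of principal component analysis that estimates only the first principal component, using power iteration on the covariance matrix. It is written in plain Python for readability and is not how a production library would do it. The function name `first_principal_component`, the iteration count, and the synthetic 3-column data are all assumptions for illustration.

```python
import math
import random


def first_principal_component(rows, n_iter=200):
    """Estimate the first principal component by power iteration.

    rows: list of equal-length lists of numbers, one inner list per row.
    Returns (unit_direction, scores), where scores is the 1-D projection
    of each centered row onto that direction.
    """
    n, d = len(rows), len(rows[0])
    # Center every column at zero mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    # Power iteration: repeatedly multiply by cov and renormalize.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(n_iter):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Project each centered row onto the component: 3 columns become 1.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores


# Toy data (an assumption): column 2 is roughly twice column 1, and
# column 3 is small noise, so most variation lies along one direction.
rng = random.Random(1)
data = []
for _ in range(200):
    t = rng.gauss(0, 1)
    data.append([t, 2 * t + rng.gauss(0, 0.05), rng.gauss(0, 0.05)])

component, scores = first_principal_component(data)
print("direction:", [round(x, 3) for x in component])
print("reduced from", len(data[0]), "columns to 1;", len(scores), "rows")
```

Unlike feature selection, which keeps some of the original columns, this is feature extraction: the single new column is a weighted combination of all three original columns.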