Machine Learning
Overview
Machine learning is a field that seeks to create algorithms that can learn from data. Typically this involves
finding patterns in data so that the algorithm can make accurate predictions on unseen data.
While some machine learning algorithms overlap with those of statistics and statistical inference, machine
learning does not necessarily depend on using a probabilistic lens to design or understand its
algorithms, even if an algorithm's performance is usually understood statistically.
Classification
Classification algorithms try to assign one of a finite set of labels to each data point.
- Bayes Classifier
- Linear Discriminant Analysis
- Network Algorithms :
Network algorithms encompass a broad spectrum of algorithms, including the perceptron and deep learning.
- Nearest Neighbor :
The nearest neighbor algorithm looks for the points in a given dataset that are closest, in some sense, to
the point you are trying to classify or predict, and then uses that set of points to find the best guess. It is one of several non-parametric
methods of machine learning. The nearest neighbor method can be used in a regression context as well as classification.
- Tree :
The tree algorithm partitions the data space into rectangular regions, each of which is assigned a classification.
The tree algorithm can also be used in a regression setting, by applying regression techniques within each region.
- Clustering :
Clustering algorithms are designed to group the data points into clusters of similar points. Typically, these are unsupervised algorithms.
- Minimum Distance
- Logistic Regression :
Logistic regression, a standard statistical tool generally used to model probabilities of binary outcomes, can also be used as a classifier.
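The nearest neighbor idea from the list above can be sketched in a few lines. This is a minimal illustration rather than a reference implementation: it assumes Euclidean distance as the notion of "closest", a majority vote among the k nearest points as the "best guess", and a small made-up dataset. The names `knn_classify`, `train_points`, and `train_labels` are hypothetical.

```python
import math
from collections import Counter


def knn_classify(train_points, train_labels, query, k=3):
    """Predict a label for `query` by majority vote among its k nearest training points."""
    # Assumption: Euclidean distance; any other distance function could be substituted.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Sort by distance only, so labels never need to be comparable.
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    # The most common label among the neighbors is the prediction.
    return Counter(nearest_labels).most_common(1)[0][0]


# Made-up toy data: two small groups of 2-D points.
train_points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
train_labels = ["a", "a", "a", "b", "b", "b"]

print(knn_classify(train_points, train_labels, (0.05, 0.05)))
print(knn_classify(train_points, train_labels, (1.0, 0.95)))
```

For regression, the same neighbor search could be reused with the vote replaced by an average of the neighbors' target values.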
Regression
Regression :
Regression models are used when the quantity to be predicted is continuous rather than one of a finite set of discrete cases.
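As one concrete instance, a straight line can be fitted to paired observations by ordinary least squares. The sketch below assumes a single input variable and uses the closed-form slope and intercept; the function name `fit_line` and the sample data are hypothetical and chosen only for illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel in the ratio).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data lying on the line y = 2x + 1.
slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])

# Predict a continuous value for an unseen input.
prediction = slope * 4 + intercept
print(slope, intercept, prediction)
```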
Genetic Algorithms
Genetic algorithms are algorithms inspired by the concept of natural selection and are used for optimization and search.
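A rough sketch of the idea, under several assumptions not taken from the text above: individuals are bit strings, fitness is simply the number of ones, the fitter half of the population is kept as parents, children are produced by one-point crossover with a small mutation rate, and the best individual is carried over unchanged (elitism). The names `fitness` and `evolve` and all parameter values are hypothetical.

```python
import random


def fitness(bits):
    """Toy objective (an assumption): count the ones in the bit string."""
    return sum(bits)


def evolve(n_bits=20, pop_size=30, generations=40, mutation_rate=0.02, seed=0):
    """Run a small genetic algorithm and return the fittest individual found."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fitter half of the population as parents.
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        # Elitism: the current best individual survives unchanged.
        children = [parents[0][:]]
        while len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            # One-point crossover: prefix from one parent, suffix from the other.
            cut = rng.randrange(1, n_bits)
            child = mother[:cut] + father[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - b if rng.random() < mutation_rate else b for b in child]
            children.append(child)
        population = children
    return max(population, key=fitness)


best = evolve()
print(fitness(best), best)
```

Because the best individual is always kept, the best fitness in the population cannot decrease from one generation to the next in this sketch.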
Algorithm Enhancements
- Dimensionality reduction :
- Regularization : a technique used to prevent overfitting, and
sometimes also used to create sparse models.
- Kernel Methods :
Kernel methods are a set of methods that can convert an algorithm that
recognizes linear patterns into one that can recognize non-linear patterns.
Example algorithms that can be extended using kernel methods:
- perceptron
- linear regression
- principal component analysis
- clustering
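To illustrate how a kernel extends a linear algorithm, here is a sketch of a kernelized perceptron. It assumes labels of +1 and -1, a degree-2 polynomial kernel, and a four-point XOR-like dataset that no straight line can separate. The names `poly_kernel`, `train_kernel_perceptron`, and `predict` are hypothetical.

```python
def poly_kernel(x, z, degree=2):
    """Polynomial kernel (1 + x.z)^degree; the choice of kernel is an assumption."""
    return (1.0 + sum(a * b for a, b in zip(x, z))) ** degree


def score(points, labels, alphas, kernel, x):
    """Kernel expansion: weighted sum of kernel values against the training points."""
    return sum(a * y * kernel(p, x) for a, p, y in zip(alphas, points, labels))


def train_kernel_perceptron(points, labels, kernel, max_epochs=20):
    """Return per-point mistake counts (alphas) learned by the perceptron rule."""
    alphas = [0] * len(points)
    for _ in range(max_epochs):
        mistakes = 0
        for j, (x, y) in enumerate(zip(points, labels)):
            # A point that is misclassified (or on the boundary) gets more weight.
            if y * score(points, labels, alphas, kernel, x) <= 0:
                alphas[j] += 1
                mistakes += 1
        if mistakes == 0:
            break
    return alphas


def predict(points, labels, alphas, kernel, x):
    return 1 if score(points, labels, alphas, kernel, x) > 0 else -1


# XOR-like data: the label is +1 when both coordinates share a sign.
points = [(1, 1), (-1, -1), (1, -1), (-1, 1)]
labels = [1, 1, -1, -1]

alphas = train_kernel_perceptron(points, labels, poly_kernel)
print([predict(points, labels, alphas, poly_kernel, p) for p in points])
```

The perceptron update itself is unchanged; only the inner product is replaced by the kernel, which is what lets the same rule separate data that is not linearly separable in the original coordinates.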
Ensemble Learning
Ensemble learning refers to a machine learning algorithm that is obtained by training several separate machine
learning algorithms, and then using a further algorithm to combine their outputs into a single answer.
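One simple way to combine an ensemble is a majority vote. The sketch below assumes the individual models have already been trained and stands in for them with three hypothetical threshold rules. The function name `majority_vote` and the thresholds are illustrative only.

```python
from collections import Counter


def majority_vote(models, x):
    """Return the label predicted by the largest number of models for input x."""
    votes = [model(x) for model in models]
    return Counter(votes).most_common(1)[0][0]


# Stand-ins for three separately trained classifiers (hypothetical thresholds).
models = [
    lambda x: "pos" if x > 0.4 else "neg",
    lambda x: "pos" if x > 0.5 else "neg",
    lambda x: "pos" if x > 0.7 else "neg",
]

# Two of the three models vote "pos" for 0.6, so the ensemble answers "pos".
print(majority_vote(models, 0.6))
```

Using an odd number of models with two labels avoids tied votes. Other combination rules, such as averaging predicted probabilities, would fit the same structure.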