Machine Learning

Overview


Machine learning is a field that seeks to create algorithms that can learn from data. Typically this involves finding patterns in data so that the algorithm can make accurate predictions on unseen data. While some machine learning algorithms overlap with those of statistics and statistical inference, machine learning does not necessarily depend on a probabilistic lens to design or understand its algorithms, even if an algorithm's performance is usually understood statistically.

Classification


Classification algorithms are algorithms that try to assign one of a finite set of labels to each data point.

  • Bayes Classifier
  • Linear Discriminant Analysis
  • Network Algorithms : A broad spectrum of algorithms, including the perceptron and deep learning.
  • Nearest Neighbor : The nearest neighbor algorithm looks for the points in a given dataset that are closest, in some sense, to the point you are trying to classify or predict, and then uses that set to find the best guess. It is one of several non-parametric methods of machine learning. The nearest neighbor method can be used for regression as well as classification.
  • Tree : The tree algorithm partitions the data space into rectangular regions, where each region is assigned a given classification. The tree algorithm can also be used in a regression setting, by applying regression techniques within each region.
  • Clustering : Clustering is a set of algorithms designed to identify clusters in the data. Typically, these are unsupervised algorithms.
  • Minimum Distance
  • Logistic Regression : A standard statistical tool, generally used to model probabilities of binary outcomes, that can also be used as a classifier.
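
As an illustration of the nearest neighbor idea from the list above, here is a minimal sketch in Python. It assumes Euclidean distance as the notion of "closest" and a majority vote among the k nearest points as the "best guess"; the function name knn_predict, the value k=3, and the toy data are hypothetical choices made for this example only.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training point's distance to the query with its label,
    # then sort so the closest points come first.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    # The most common label among the k neighbors is the prediction.
    return Counter(nearest_labels).most_common(1)[0][0]

# Toy data (assumed): two well-separated groups in the plane.
train_points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
train_labels = ["a", "a", "a", "b", "b", "b"]

near_first_group = knn_predict(train_points, train_labels, (1.5, 1.5))
near_second_group = knn_predict(train_points, train_labels, (6.5, 6.5))
```

Replacing the majority vote with the average of the neighbors' numeric targets would give the regression variant mentioned in the list.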

Regression


Regression models are used in continuous problems, where the predicted quantity does not fall into a set of discrete cases.
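
A small sketch of one common regression model, a straight line fitted by ordinary least squares, may make this concrete. The choice of a linear model, the function name fit_line, and the toy data are assumptions for illustration; many other regression models exist.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations of x and y, and of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Toy data (assumed): points lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

slope, intercept = fit_line(xs, ys)
# Predict a continuous value for an input not seen during fitting.
prediction = slope * 4.0 + intercept
```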

Genetic Algorithms


Genetic algorithms are optimization and search algorithms inspired by the concept of natural selection.
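
The following is a minimal sketch of a genetic algorithm, not a definitive implementation. It assumes bit-string genomes, truncation selection (the fitter half become parents), single-point crossover, bit-flip mutation, and elitism (the best genome is copied unchanged into the next generation). The function name evolve and all parameter values are hypothetical.

```python
import random

def evolve(fitness, genome_length=20, population_size=30,
           generations=40, mutation_rate=0.05, seed=0):
    """Search for a bit string that maximizes `fitness`."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(genome_length)]
                  for _ in range(population_size)]
    history = []  # best fitness seen in each generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        # Selection: only the fitter half of the population may reproduce.
        parents = population[:population_size // 2]
        # Elitism: carry the current best genome over unchanged.
        children = [population[0][:]]
        while len(children) < population_size:
            mother, father = rng.sample(parents, 2)
            # Crossover: splice the two parents at a random cut point.
            cut = rng.randrange(1, genome_length)
            child = mother[:cut] + father[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - gene if rng.random() < mutation_rate else gene
                     for gene in child]
            children.append(child)
        population = children
    best = max(population, key=fitness)
    return best, history + [fitness(best)]

# Example objective (assumed): maximize the number of 1 bits.
best, history = evolve(sum)
```

Because of elitism, the best fitness recorded in history never decreases from one generation to the next.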

Algorithm Enhancements


  • Dimensionality reduction :
  • Regularization : A technique used to prevent overfitting, and sometimes also to create sparse models.
  • Kernel Methods : A set of methods that can convert an algorithm that recognizes linear patterns into one that can recognize non-linear patterns. Example algorithms that can be extended using kernel methods:

    • perceptron
    • linear regression
    • principal component analysis
    • clustering
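
To illustrate how a kernel extends a linear algorithm, here is a sketch of a kernelized perceptron. It assumes a Gaussian (RBF) kernel with gamma=1 and uses the XOR pattern, which no linear perceptron can separate. The function names and the toy data are hypothetical and chosen for this example.

```python
import math

def rbf_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel: similarity decays with squared distance."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)

def decision_score(points, labels, alphas, kernel, query):
    """Weighted sum of kernel similarities between the query and training points."""
    return sum(alpha * label * kernel(point, query)
               for alpha, point, label in zip(alphas, points, labels))

def train_kernel_perceptron(points, labels, kernel, epochs=20):
    """Perceptron in dual form: alphas[i] counts the mistakes made on point i."""
    alphas = [0] * len(points)
    for _ in range(epochs):
        mistakes = 0
        for i, (point, label) in enumerate(zip(points, labels)):
            if label * decision_score(points, labels, alphas, kernel, point) <= 0:
                alphas[i] += 1
                mistakes += 1
        if mistakes == 0:  # every training point is classified correctly
            break
    return alphas

def predict(points, labels, alphas, kernel, query):
    return 1 if decision_score(points, labels, alphas, kernel, query) > 0 else -1

# XOR pattern (assumed toy data): not linearly separable in the plane.
points = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [-1, 1, 1, -1]

alphas = train_kernel_perceptron(points, labels, rbf_kernel)
predictions = [predict(points, labels, alphas, rbf_kernel, p) for p in points]
```

The training loop only ever touches the data through kernel evaluations, which is why swapping the kernel changes the kind of pattern the same algorithm can recognize.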

Ensemble Learning


Ensemble learning refers to a machine learning algorithm that is obtained by training several separate machine learning algorithms and then using a further algorithm to combine their outputs into a single answer.
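
A minimal sketch of this idea follows. It assumes the base learners are simple one-feature threshold classifiers (each trained on a different feature) and that the combining algorithm is a majority vote; both are illustrative choices, and the names train_stump, majority_vote, and ensemble_predict, as well as the toy data, are hypothetical.

```python
from collections import Counter

def train_stump(values, labels):
    """Fit a one-feature classifier: threshold at the midpoint of the class means."""
    mean0 = sum(v for v, y in zip(values, labels) if y == 0) / labels.count(0)
    mean1 = sum(v for v, y in zip(values, labels) if y == 1) / labels.count(1)
    threshold = (mean0 + mean1) / 2
    high_label = 1 if mean1 > mean0 else 0
    return lambda v: high_label if v > threshold else 1 - high_label

def majority_vote(votes):
    """Combine the base learners' answers by taking the most common one."""
    return Counter(votes).most_common(1)[0][0]

# Toy data (assumed): three features, each misleading on a different point.
X = [(1, 1, 6), (2, 2, 1), (6, 3, 2), (7, 4, 7), (8, 8, 8), (9, 9, 9)]
y = [0, 0, 0, 1, 1, 1]

# Train one base learner per feature.
stumps = [train_stump([row[j] for row in X], y) for j in range(3)]

def ensemble_predict(row):
    return majority_vote([stump(row[j]) for j, stump in enumerate(stumps)])

base_correct = [sum(stump(row[j]) == label for row, label in zip(X, y))
                for j, stump in enumerate(stumps)]
ensemble_correct = sum(ensemble_predict(row) == label for row, label in zip(X, y))
```

On this toy data each base learner misclassifies one point, but the errors fall on different points, so the vote classifies every point correctly.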
