What precision and recall are? After the predictive model has been finished, the most important question is: How good is it? Does it predict well? Evaluating the model is one of the most important tasks in the data science project, it indicates how good predictions are. Very often for classification problems we look at metrics …

# Category: Glossary

Data science glossary, a dictionary for machine learning, deep learning, neural networks.

## What is Resampling?

Resampling is any technique of generating a new sample from an existing dataset. There is a variety of methods for estimating the precision of sample statistics (medians, variances, percentiles) by using subsets of available data (jackknifing) or drawing randomly with replacement from a set of data points (bootstrapping). Exchanging labels on data points when performing …

## What is Regularization?

What is Regularization? In Machine Learning, very often the task is to fit a model to a set of training data and use the fitted model to make predictions or classify new (out of sample) data points. Sometimes model fits the training data very well but does not well in predicting out of sample data …

## What is Regression?

Regression is a statistical measure used that attempts to determine the strength of the relationship between one dependent variable and a series of other changing (independent) variables. The two basic types of regression are linear regression and multiple linear regression, although there are non-linear regression methods for more complicated data and analysis. Linear regression uses …

## What is Random Sampling?

Random sampling. In this technique, each member of the population has an equal chance of being selected as the subject. The entire process of sampling is done in a single step with each subject selected independently of the other members of the population. There are many methods to proceed with simple random sampling. A sample …

## What is Random Forest?

Random Forest or Random Decision Forest are an ensemble learning method for classification, regression, and other tasks, that operate by constructing a multitude of decision trees at training time and outputting the class that is the mode of the classes (classification) or mean prediction (regression) of the individual trees. Random decision forests correct for decision …

## What is Radial Basis Function(RBF) network?

Radial Basis Function(RBF) network is an artificial neural network that uses radial basis functions as activation functions. The output of the network is a linear combination of radial basis functions of the inputs and neuron parameters. Radial basis function networks have many uses, including function approximation, time series prediction, classification, and system control. They were …

## What is QQ plot?

QQ plots – Quantile-Quantile plots are a graphical technique for determining if two data sets come from populations with a common distribution. A q-q plot is a plot of the quantiles of the first data set against the quantiles of the second data set. By a quantile, we mean the fraction (or percent) of points …

## What is Q-learning?

Q-learning is a model-free reinforcement learning technique. Specifically, Q-learning can be used to find an optimal action selection policy for any given (finite) Markov decision process (MDP). It works by learning an action-value function that ultimately gives the expected utility of taking a given action in a given state and following the optimal policy thereafter. …

## What is Pruning?

Pruning is a technique in machine learning that reduces the size of decision trees by removing sections of the tree that provide a little power to classify instances. Pruning reduces the complexity of the final classifier, and hence improves predictive accuracy by the reduction of overfitting. One of the questions that arise in a decision …