Gradient Boosting in Python using scikit-learn
Gradient boosting has become a big part of Kaggle competition winners’ toolkits. It was first explored in earnest by Jerome Friedman in the paper Greedy Function Approximation: A Gradient Boosting Machine. In this post, we’ll look at gradient boosting and how to use it in Python with the scikit-learn library.
Gradient boosting is a boosting ensemble method.
Ensemble machine learning methods are methods in which several predictors are aggregated to produce a final prediction, which typically has lower bias and variance than any of the individual predictors.
Ensemble machine learning methods come in two main flavors: bagging and boosting.
Bagging is a method in which several predictors are trained independently of one another, and their outputs are then aggregated, for example by majority vote (mode), mean, or weighted mean. Random forests are an example of a bagging algorithm.
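As a quick illustration (a minimal sketch on a synthetic dataset, not the data we’ll use later in this post), a bagging-style model like a random forest can be fit in a few lines with scikit-learn:

from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# synthetic data, purely for illustration
X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# each of the 100 trees is trained independently on a bootstrap sample of the data,
# and the forest's prediction is the average of the individual trees' predictions
forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)
print(forest.score(X_test, y_test))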
Boosting is a technique in which the predictors are trained sequentially: the errors of one stage are used to inform the training of the next.
Gradient boosting builds an ensemble of decision trees, each of which is a weak model on its own. Let’s take a look at how this works.
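At a high level, each new tree is fit to the residual errors of the ensemble built so far. A stripped-down sketch of that loop (assuming squared-error loss and a synthetic dataset, purely to show the idea) might look like this:

import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)

learning_rate = 0.1
prediction = np.zeros_like(y)          # start from a constant prediction of 0
trees = []
for _ in range(100):
    residual = y - prediction          # errors made by the current ensemble
    tree = DecisionTreeRegressor(max_depth=3)
    tree.fit(X, residual)              # the next weak tree learns those errors
    prediction += learning_rate * tree.predict(X)
    trees.append(tree)

In practice, scikit-learn handles this loop (along with the loss function, learning rate, and regularization) for us, which is what the rest of this post relies on.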
We’ll start with some imports.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn import ensemble
from sklearn import linear_model