In this post we dicuss different decision trees.

Two types of decision tree exists.

  • Classification Tree.
    • The simplest to exist. See Bootstrap and Bagging for details.
  • Regression Tree
    • Make use of variance as a measure for splitting.
  • Adaboost with decision tree
    • same as random forest except there the weights of each classifier and sample are re-calculated during each iteration
  • GBDT (Gradient Boost Decision Tree) (Mart) (Multiple Additive Regression Tree)
    • think of it as residual net
    • two kinds of implementation
      • new tree trained gradient based
      • new tree trained difference based
    • loop procedure: initial setup $y_{label} = y_{actual}$
      • $y_{n} = y_{label} - y_{1}^{n-1}$, where $y_1^{n-1}$ represent an ensemble prediction of all previous trees. And $y_n$ is the new tree learned from either the differene or gradient
      • $y_1^n = y_1^{n-1} + y_n$, the new tree is then added to the ensemble
      • $y_{label} = \text{difference or negative gradient}$
    • shrinkage version
      • replace step 2 in the loop with $y_1^n = y_1^{n-1} + \text{step} \cdot y_n$, everything else is the same.
      • require more computational resource
      • prevent overfitting. No theory to prove it though.
    • other variant: stochastic sampling of features and bootstrap sampling.
    • Reference
  • XGBOOST
    • make use of 2nd Taylor series
    • 2nd Taylor series reduces the iteration process therefore hasten the training process
    • added 2 regulations to prevent the tree from overfitting
    • kltaUH.png the loss function
    • kltOIJ.png the regulation function
    • where $\gamma$ and $\lambda$ is two hyperparameters, the $\gamma$ parameter is a threshold value. And $\lambda$ is a smoothing factor.
    • Reference - Usuage
    • Reference - Induction
    • Features I don’t understand
      • how parallel process is implemented in XGboost
      • Its relationship to GBDT
      • How it handles missing features
      • If 2nd order derivative is used, why not 3rd order?