CatBoost vs XGBoost - A Gentle Introduction to CatBoost - Free Udemy Class
Introduction
XGBoost has been one of the most powerful boosted models out there... and now along comes CatBoost. Let's explore how it compares to XGBoost in Python, trying CatBoost on both a classification dataset and a regression one. Let's have some fun!
Code
from IPython.display import Image
Image(filename='CatBoost vs Xgboost.png')
Let's pit CatBoost against XGBoost in a friendly classification battle! Cat Fight Time
I got some of my best scores on Kaggle using it! At one point I was ranked 185th, and I thank XGBoost for that (https://www.kaggle.com/amunategui) - lots of others have thanked it too. We still thank it today: it's integrated all over the place - scikit-learn, cloud providers - and I use it every day for customers on GCP, as it is now compatible with Cloud ML, so you can model terabytes of data with it.
GCP Built-in XGBoost algorithm https://cloud.google.com/ml-engine/docs/algorithms/xgboost-start
Scikit-Learn API Scikit-Learn Wrapper interface for XGBoost https://xgboost.readthedocs.io/en/latest/python/python_api.html#module-xgboost.sklearn
# installing catboost and xgboost
# !pip3 install catboost --user
# !pip3 install xgboost --user
# Let's compare XGBoost to CatBoost
%matplotlib inline
import matplotlib
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import seaborn as sns
from sklearn.model_selection import train_test_split
import catboost
print('catboost version:', catboost.__version__)
import xgboost
print('xgboost version:', xgboost.__version__)
Let's get an independent Titanic dataset from Vanderbilt University
titanic_df = pd.read_csv(
'http://biostat.mc.vanderbilt.edu/wiki/pub/Main/DataSets/titanic3.csv')
titanic_df.head()
# simple feature engineering
# strip first letter from cabin number if there
titanic_df['cabin'] = titanic_df['cabin'].replace(np.NaN, 'U')
titanic_df['cabin'] = [ln[0] for ln in titanic_df['cabin'].values]
titanic_df['cabin'] = titanic_df['cabin'].replace('U', 'Unknown')
titanic_df['cabin'].head()
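The deck-letter extraction above can be sketched on a toy series (made-up cabin values, same 'U' placeholder convention):

```python
import pandas as pd
import numpy as np

cabins = pd.Series(['C85', np.nan, 'E46', np.nan])
cabins = cabins.fillna('U')            # placeholder for missing cabins
decks = cabins.str[0]                  # keep only the first letter (the deck)
decks = decks.replace('U', 'Unknown')  # make the placeholder explicit
print(decks.tolist())                  # ['C', 'Unknown', 'E', 'Unknown']
```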
# create isfemale field and use numerical values
titanic_df['isfemale'] = np.where(titanic_df['sex'] == 'female', 1, 0)
# drop features not needed for model
titanic_df = titanic_df[[f for f in list(titanic_df) if f not in ['sex', 'name', 'boat','body', 'ticket', 'home.dest']]]
# make pclass actual categorical column
titanic_df['pclass'] = np.where(titanic_df['pclass'] == 1, 'First',
np.where(titanic_df['pclass'] == 2, 'Second', 'Third'))
titanic_df['embarked'] = titanic_df['embarked'].replace(np.NaN, 'Unknown')
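The nested np.where mapping used for pclass works like a vectorized if/elif/else; here it is on a toy series:

```python
import numpy as np
import pandas as pd

pclass = pd.Series([1, 2, 3, 1])
# nested np.where: 1 -> 'First', 2 -> 'Second', everything else -> 'Third'
labels = np.where(pclass == 1, 'First',
                  np.where(pclass == 2, 'Second', 'Third'))
print(list(labels))  # ['First', 'Second', 'Third', 'First']
```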
titanic_df.head()
# how many nulls do we have?
titanic_df.isna().sum()
# impute age to mean
titanic_df['age'] = titanic_df['age'].fillna(titanic_df['age'].mean())
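Mean imputation fills every missing value with the average of the non-missing ones; a toy sketch:

```python
import numpy as np
import pandas as pd

ages = pd.Series([22.0, np.nan, 30.0, np.nan])
ages = ages.fillna(ages.mean())  # mean of the non-missing values is 26.0
print(ages.tolist())             # [22.0, 26.0, 30.0, 26.0]
```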
titanic_df['age']
# SEED - play around with this variable as it will change winners
SEED = 1234 # try 0
CatBoost's Turn!
titanic_df.head()
# map categorical features
titanic_catboost_ready_df = titanic_df.dropna()
features = [feat for feat in list(titanic_catboost_ready_df) if feat != 'survived']
print(features)
titanic_categories = np.where(titanic_catboost_ready_df[features].dtypes != float)[0]  # non-float columns are treated as categorical
titanic_categories
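The dtype trick above can be checked on a toy frame: any column whose dtype isn't float gets flagged as categorical by position.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'pclass': ['First', 'Third'],
                   'age': [29.0, 40.0],
                   'cabin': ['C', 'Unknown']})
# columns whose dtype is not float are treated as categorical
cat_idx = np.where(df.dtypes != float)[0]
print(list(cat_idx))  # [0, 2] -> positions of 'pclass' and 'cabin'
```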
from catboost import CatBoostClassifier
X_train, X_test, y_train, y_test = train_test_split(titanic_catboost_ready_df[features],
                                                    titanic_catboost_ready_df[['survived']],
                                                    test_size=0.3,
                                                    random_state=SEED)
params = {'iterations':5000,
'learning_rate':0.01,
'cat_features':titanic_categories,
'depth':3,
'eval_metric':'AUC',
'verbose':200,
'od_type':"Iter", # overfit detector
'od_wait':500, # iterations to wait after the best one before stopping
'random_seed': SEED
}
cat_model = CatBoostClassifier(**params)
cat_model.fit(X_train, y_train,
eval_set=(X_test, y_test),
use_best_model=True, # True if we don't want to save trees created after iteration with the best validation score
plot=True
);
# Confusion matrix
dval_predictions = cat_model.predict(X_test)
dval_predictions
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, [1 if p > 0.5 else 0 for p in dval_predictions])
plt.figure(figsize = (6,4))
plt.ticklabel_format(style='plain', axis='y', useOffset=False)
sns.set(font_scale=1.4)
sns.heatmap(cm, annot=True, annot_kws={"size": 16})
plt.show()
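As a reminder of how to read the heatmap: scikit-learn's confusion matrix puts actual classes on the rows and predicted classes on the columns. A toy example with made-up labels:

```python
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 0]
# rows = actual class, columns = predicted class
cm = confusion_matrix(y_true, y_pred)
print(cm.tolist())  # [[2, 0], [1, 2]] -> one survivor misclassified
```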
cat_model.get_feature_importance()
feat_import = [t for t in zip(features, cat_model.get_feature_importance())]
feat_import_df = pd.DataFrame(feat_import, columns=['Feature', 'VarImp'])
feat_import_df = feat_import_df.sort_values('VarImp', ascending=False)
feat_import_df[feat_import_df['VarImp'] > 0]
XGBoost's Turn
Dummy/one-hot only for XGBoost
titanic_df.isnull().any()
def prepare_data_for_model(raw_dataframe, target_columns, drop_first = True, make_na_col = False):
# dummy all categorical fields
dataframe_dummy = pd.get_dummies(raw_dataframe, columns=target_columns,
drop_first=drop_first,
dummy_na=make_na_col)
return (dataframe_dummy)
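pd.get_dummies expands each categorical column into indicator columns; with drop_first=True the first category is dropped to avoid redundancy. A toy run (made-up values):

```python
import pandas as pd

df = pd.DataFrame({'embarked': ['S', 'C', 'Q'], 'age': [22.0, 38.0, 26.0]})
# drop_first=True drops the alphabetically-first category ('C')
dummies = pd.get_dummies(df, columns=['embarked'], drop_first=True)
print(list(dummies))  # ['age', 'embarked_Q', 'embarked_S']
```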
# create dummy features
titanic_xgboost_ready_df = prepare_data_for_model(titanic_df, target_columns=['pclass', 'cabin', 'embarked'])
titanic_xgboost_ready_df = titanic_xgboost_ready_df.dropna()
list(titanic_xgboost_ready_df)
# split data into train and test portions and model
features = [feat for feat in list(titanic_xgboost_ready_df) if feat != 'survived']
X_train, X_test, y_train, y_test = train_test_split(titanic_xgboost_ready_df[features],
titanic_xgboost_ready_df[['survived']],
test_size=0.3,
random_state=SEED)
import xgboost as xgb
xgb_params = {
'max_depth':3,
'eta':0.01,
'silent':0,
'eval_metric':'auc',
'subsample': 0.8,
'colsample_bytree': 0.8,
'objective':'binary:logistic',
'seed' : SEED
}
dtrain = xgb.DMatrix(X_train, y_train, feature_names=X_train.columns.values)
dtest = xgb.DMatrix(X_test, y_test, feature_names=X_test.columns.values)
evals = [(dtrain,'train'),(dtest,'eval')]
xgb_model = xgb.train ( params = xgb_params,
dtrain = dtrain,
num_boost_round = 5000,
verbose_eval=200,
early_stopping_rounds = 500,
evals=evals,
maximize = True)
# get dataframe version of important feature for model
xgb_fea_imp=pd.DataFrame(list(xgb_model.get_fscore().items()),
columns=['feature','importance']).sort_values('importance', ascending=False)
xgb_fea_imp.head(10)
# Confusion matrix
dval_predictions = xgb_model.predict(dtest)
dval_predictions
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, [1 if p > 0.5 else 0 for p in dval_predictions])
plt.figure(figsize = (6,4))
plt.ticklabel_format(style='plain', axis='y', useOffset=False)
sns.set(font_scale=1.4)
sns.heatmap(cm, annot=True, annot_kws={"size": 16})
plt.show()
Final Winners
xgb_model.best_score
cat_model.best_score_
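Both models report AUC as their best validation score. For reference, this is how AUC is computed with scikit-learn (toy labels and scores, not from either model):

```python
from sklearn.metrics import roc_auc_score

y_true = [0, 0, 1, 1]
y_scores = [0.1, 0.4, 0.35, 0.8]
# AUC: probability a random positive scores higher than a random negative
auc = roc_auc_score(y_true, y_scores)
print(round(auc, 2))  # 0.75
```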
Let's dive deep into CatBoost!
Initially released on July 18, 2017, by Yandex researchers; it is open source.
pip install catboost
Quick start https://catboost.ai/docs/concepts/python-quickstart.html
CatBoostClassifier
from catboost.datasets import titanic
titanic_train, titanic_test = titanic()
print(titanic_train.head(3))
titanic_train.shape
titanic_test.shape
# pip install pandas-profiling
import pandas_profiling as pp
pp.ProfileReport(titanic_train)
# clean up NaNs
titanic_train.isnull().sum(axis=0)
# impute age to mean
titanic_train['Age'] = titanic_train['Age'].fillna(titanic_train['Age'].mean())
titanic_train['Embarked'] = titanic_train['Embarked'].replace(np.nan, 'Unknown', regex=True)
from catboost import CatBoostClassifier
# data split
outcome_name = 'Survived'
features_for_model = ['Pclass', 'Sex', 'Age', 'Embarked']
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(titanic_train[features_for_model],
                                                    titanic_train[outcome_name],
                                                    test_size=0.3,
                                                    random_state=1)
# tell catboost which are categorical columns
titanic_categories = np.where(X_train[features_for_model].dtypes != float)[0]  # non-float columns are treated as categorical
print('titanic_categories:', titanic_categories)
params = {'iterations':1000,
'learning_rate':0.01,
'cat_features':titanic_categories,
'depth':3,
'eval_metric':'AUC',
'verbose':200,
'od_type':"Iter", # overfit detector
'od_wait':500,
}
model_classifier = CatBoostClassifier(**params)
model_classifier.fit(X_train, y_train,
eval_set=(X_test, y_test),
use_best_model=True,
plot= True
);
# feature importance
feat_import = [t for t in zip(features_for_model, model_classifier.get_feature_importance())]
feat_import_df = pd.DataFrame(feat_import, columns=['Feature', 'VarImp'])
feat_import_df = feat_import_df.sort_values('VarImp', ascending=False)
feat_import_df.head(20)
CatBoostRegressor
Boston house prices dataset
The Boston Housing dataset contains the prices of houses in various areas of Boston. Alongside price, it provides features such as the per-capita crime rate by town (CRIM), the proportion of non-retail business acres per town (INDUS), and the proportion of owner-occupied units built before 1940 (AGE).
from sklearn.datasets import load_boston
boston_dataset = load_boston()  # note: load_boston was removed in scikit-learn 1.2, so this requires an older version
boston_dataset.keys()
for ln in boston_dataset.DESCR.split('\n'):
print(ln)
boston = pd.DataFrame(boston_dataset.data, columns=boston_dataset.feature_names)
boston.head(10)
# Our target variable - Median value of owner-occupied homes in $1000s
boston['MEDV'] = boston_dataset.target
boston.head()
pp.ProfileReport(boston)
# clean up NaNs
boston.isnull().sum(axis=0)
from catboost import CatBoostRegressor
# data split
outcome_name = 'MEDV'
features_for_model = [f for f in list(boston) if f not in [outcome_name, 'TAX']]
# treat float columns holding only whole numbers as categorical
boston_categories = np.where([boston[f].apply(float.is_integer).all() for f in features_for_model])[0]
print('boston_categories:', boston_categories)
# cast those columns to strings so CatBoost treats them as categorical
for feature in [features_for_model[f] for f in list(boston_categories)]:
    print(feature)
    boston[feature] = boston[feature].astype(str)
# data split
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(boston[features_for_model],
                                                    boston[outcome_name],
                                                    test_size=0.3,
                                                    random_state=1)
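The whole-number heuristic used to spot categorical columns can be verified on a toy frame (made-up column values): a float column containing only integer values gets flagged.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'CHAS': [0.0, 1.0, 0.0], 'CRIM': [0.1, 0.2, 0.3]})
# a float column whose values are all whole numbers is likely categorical
is_int_like = [df[c].apply(float.is_integer).all() for c in df.columns]
print(list(np.where(is_int_like)[0]))  # [0] -> only 'CHAS'
```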
params = {'iterations':5000,
'learning_rate':0.001,
'depth':3,
'loss_function':'RMSE',
'eval_metric':'RMSE',
'random_seed':55,
'cat_features':boston_categories,
'metric_period':200,
'od_type':"Iter",
'od_wait':20,
'verbose':True,
'use_best_model':True}
model_regressor = CatBoostRegressor(**params)
model_regressor.fit(X_train, y_train,
eval_set=(X_test, y_test),
use_best_model=True,
plot= True
);
# feature importance
feat_import = [t for t in zip(features_for_model, model_regressor.get_feature_importance())]
feat_import_df = pd.DataFrame(feat_import, columns=['Feature', 'VarImp'])
feat_import_df = feat_import_df.sort_values('VarImp', ascending=False)
feat_import_df.head(20)
Show Notes
(pardon typos and formatting - these are the notes I use to make the videos)
XGBoost has been one of the most powerful boosted models out there... and now along comes CatBoost. Let's explore how it compares to XGBoost in Python, trying CatBoost on both a classification dataset and a regression one. Let's have some fun!