Skip to content

Treelite gives different predictions than base XGBoost model #585

Open
@juliuscoburger

Description

@juliuscoburger

I noticed that my model returns different scores than the original model. I was able to boil the issue down to using a base_score during training. Can it be that this value is not being translated?

Code to replicate the issue:

import numpy as np
import xgboost as xgb
import treelite

np.random.seed(42)
N = 10
X = np.random.random((N, 10))
y = np.random.random((N,))
dtrain = xgb.DMatrix(X, label=y)
bst = xgb.train({
    'objective': 'count:poisson'
}, dtrain, 10)
bst.save_model('/tmp/bst.json')
tl_model = treelite.frontend.load_xgboost_model('/tmp/bst.json')
# Treelite gives the same predictions as xgboost
np.testing.assert_almost_equal(treelite.gtil.predict(tl_model, data=X).squeeze(), bst.predict(dtrain))


# Poisson will fail for sufficiently high predictions, see https://github.com/dmlc/xgboost/issues/10486
y = np.random.random((N,)) * 3000
dtrain = xgb.DMatrix(X, label=y)
# But the issue can be mitigated by setting sufficiently high base score
bst = xgb.train({
    'objective': 'count:poisson',
    'base_score': 3000
}, dtrain, 10)
bst.save_model('/tmp/bst.json')

tl_model = treelite.frontend.load_xgboost_model('/tmp/bst.json')
# Unfortunatelly treelite now gives different predictions
np.testing.assert_almost_equal(treelite.gtil.predict(tl_model, data=X).squeeze(), bst.predict(dtrain))

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions