[bug] Training XGBoost doesn't work with a DataFrame, only a NumPy array #1
Open
Labels
bug (Something isn't working)
Description
Hello, and thank you for this package.
I came across a problem while trying to use an XGBoost model that was trained on a DataFrame.
This is my code:
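from xgboost import XGBClassifier
from tree_influence.explainers import BoostIn
# load_csv returns a pandas DataFrame (helper not shown)
X_train, X_test, y_train, y_test = load_csv('X_train'), load_csv('X_test'), load_csv('y_train'), load_csv('y_test')
model = XGBClassifier(tree_method='hist')
# NumPy versions of the same data
X_train_val, y_train_val = X_train.values, y_train.values.squeeze()
X_test_val, y_test_val = X_test.values, y_test.values.squeeze()
# train on the DataFrame (this is the case that fails below)
model.fit(X_train, y_train)
# fit influence estimator
explainer = BoostIn().fit(model, X_train, y_train)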
This produces the following exception:
Traceback (most recent call last):
File "/home/jupyter/owlytics-data-science/influence/influence.py", line 35, in <module>
explainer = BoostIn().fit(model, X_train, y_train)
File "/opt/conda/envs/py39/lib/python3.9/site-packages/tree_influence/explainers/boostin.py", line 44, in fit
super().fit(model, X, y)
File "/opt/conda/envs/py39/lib/python3.9/site-packages/tree_influence/explainers/base.py", line 31, in fit
self.model_ = parse_model(model, X, y)
File "/opt/conda/envs/py39/lib/python3.9/site-packages/tree_influence/explainers/parsers/__init__.py", line 33, in parse_model
trees, params = parse_xgb_ensemble(model)
File "/opt/conda/envs/py39/lib/python3.9/site-packages/tree_influence/explainers/parsers/parser_xgb.py", line 17, in parse_xgb_ensemble
trees = np.array([_parse_xgb_tree(tree_str) for tree_str in string_data], dtype=np.dtype(object))
File "/opt/conda/envs/py39/lib/python3.9/site-packages/tree_influence/explainers/parsers/parser_xgb.py", line 17, in <listcomp>
trees = np.array([_parse_xgb_tree(tree_str) for tree_str in string_data], dtype=np.dtype(object))
File "/opt/conda/envs/py39/lib/python3.9/site-packages/tree_influence/explainers/parsers/parser_xgb.py", line 88, in _parse_xgb_tree
node_dict = _parse_line(line)
File "/opt/conda/envs/py39/lib/python3.9/site-packages/tree_influence/explainers/parsers/parser_xgb.py", line 190, in _parse_line
res['feature'], res['threshold'] = _parse_decision_node_line(line)
File "/opt/conda/envs/py39/lib/python3.9/site-packages/tree_influence/explainers/parsers/parser_xgb.py", line 201, in _parse_decision_node_line
feature_ndx = int(feature_str[1:])
ValueError: invalid literal for int() with base 10: 'ecent_beta_blockers_change'
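Judging only from the last frame of the traceback, the parser seems to assume feature names of the form `f<index>` and drops the first character before converting the rest to an integer. Here is a minimal sketch of that step in isolation. The function name `parse_feature_index` is hypothetical, and only the `int(feature_str[1:])` expression is taken from the traceback:

```python
def parse_feature_index(feature_str):
    # Same expression as the failing line in parser_xgb.py:
    # drop the first character and parse the remainder as an integer.
    return int(feature_str[1:])


# Default XGBoost-style name (what a model trained on a NumPy array would carry).
print(parse_feature_index("f12"))

# DataFrame column name: the leading "r" is dropped and the rest is not a number.
try:
    parse_feature_index("recent_beta_blockers_change")
except ValueError as err:
    print(err)
```

The second call raises the same `ValueError` as above, with the column name missing its first letter.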
However, training on X_train_val and y_train_val (which are NumPy arrays) works perfectly.
It would be great if you could support training with a DataFrame as well.
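To illustrate what I mean, here is a rough sketch of how a feature string could be mapped to a column index when the model carries real column names. This is not the package's code: `resolve_feature_index` and its `feature_names` argument are hypothetical, and `"age"` is a made-up column used only for the example.

```python
def resolve_feature_index(feature_str, feature_names=None):
    """Map a feature string from a tree dump to a column index.

    feature_names is an optional list of column names in training order
    (hypothetical argument; for a DataFrame this would be the column list).
    """
    # Named feature: look up its position among the training columns.
    if feature_names is not None and feature_str in feature_names:
        return list(feature_names).index(feature_str)
    # Default "f<index>" name, as produced when training on a NumPy array.
    if feature_str.startswith("f") and feature_str[1:].isdigit():
        return int(feature_str[1:])
    raise ValueError(f"unrecognized feature name: {feature_str!r}")


columns = ["age", "recent_beta_blockers_change"]  # "age" is a made-up column
print(resolve_feature_index("recent_beta_blockers_change", columns))
print(resolve_feature_index("f0"))
```

Checking the column names first means a column that happens to be called something like `f3` is still resolved by its actual position.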
Thanks again!