IMPROVING THE XGBOOST MODEL IN DETERMINING NBA GAME WIN DETERMINANTS: A BAYESIAN HYPERPARAMETER OPTIMIZATION AND SHAP APPROACH
DOI: 10.5281/zenodo.18816944
Published: 2026-01-31
Abstract
In the era of modern sports analytics, post-game team performance evaluation is often conducted subjectively. This study aims to develop an objective diagnostic model to identify the key technical–statistical factors that determine victories in the NBA, grounded in the principle of game efficiency. Using box-score statistical data from the 2004–2024 seasons, this research employs the Extreme Gradient Boosting (XGBoost) algorithm [2], optimized through the Tree-structured Parzen Estimator (TPE) method within the Optuna framework [3], to classify game outcomes.
The experimental results show strong model performance, with an accuracy of 95.2% and an F1-score of 0.96. Interpretability analysis using the SHAP (SHapley Additive exPlanations) method [4] indicates that advantages in Shooting Efficiency (EFG%) and Player Impact Estimate (PIE) are the dominant determinants of victory, followed by the minimization of turnovers. Furthermore, counterfactual simulations provide a diagnostic insight: increasing a single statistic (e.g., +5 assists) without a corresponding improvement in shooting efficiency reduces the predicted probability of winning (from 0.61 to 0.26). This finding suggests an “empty assists” phenomenon, in which ball movement does not translate into effective scoring opportunities.
This study contributes a performance-auditing framework that enables coaches and analysts to retrospectively evaluate the effectiveness of game strategies in an objective and data-driven manner.
Keywords:
diagnostic analysis, NBA, XGBoost, SHAP, Bayesian Optimization
References
[1] D. Oliver, Basketball on Paper: Rules and Tools for Performance Analysis. Washington, D.C.: Potomac Books, Inc., 2004.
[2] T. Chen and C. Guestrin, “XGBoost: A Scalable Tree Boosting System,” in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 2016, pp. 785–794.
[3] T. Akiba, S. Sano, T. Yanase, T. Ohta, and M. Koyama, “Optuna: A Next-generation Hyperparameter Optimization Framework,” in Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Anchorage, AK, USA, 2019, pp. 2623–2631.
[4] S. M. Lundberg and S.-I. Lee, “A Unified Approach to Interpreting Model Predictions,” in Advances in Neural Information Processing Systems 30 (NIPS 2017), Long Beach, CA, USA, 2017, pp. 4765–4774.
[5] J. H. Friedman, “Greedy Function Approximation: A Gradient Boosting Machine,” The Annals of Statistics, vol. 29, no. 5, pp. 1189–1232, 2001.
[6] R. P. Bunker and F. Thabtah, “A Machine Learning Framework for Sport Result Prediction,” Applied Computing and Informatics, vol. 15, no. 1, pp. 27–33, 2019. doi: 10.1016/j.aci.2017.09.005.
[7] F. Pedregosa et al., “Scikit-learn: Machine Learning in Python,” Journal of Machine Learning Research, vol. 12, pp. 2825–2830, 2011.
[8] Y. Ouyang et al., “Integration of machine learning XGBoost and SHAP models for NBA game outcome prediction and quantitative analysis methodology,” PLoS ONE, vol. 19, no. 7, p. e0307478, Jul. 2024.
[9] P. Zuccolotto and M. Manisera, Basketball Data Science: With Applications in R. CRC Press, 2020.
[10] T. Horvat and J. Job, “The use of machine learning in sport outcome prediction: A review,” Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, vol. 10, no. 5, p. e1380, 2020.
[11] J. S. Bergstra, R. Bardenet, Y. Bengio, and B. Kégl, “Algorithms for Hyper-Parameter Optimization,” in Advances in Neural Information Processing Systems (NIPS), 2011, pp. 2546–2554.
[12] L. Zhang and M. A. Gomez, “Machine Learning for Basketball Game Outcomes: NBA and WNBA Leagues,” Algorithms, vol. 13, no. 10, p. 230, Oct. 2024.
License
Copyright (c) 2026 Fija Ramadhan, Ahmad Zainul Fanani

This work is licensed under a Creative Commons Attribution 4.0 International License.




