Abstract :
This study investigates the application of advanced machine learning techniques for customer churn prediction in the rapidly evolving aquaculture technology sector. We employ and compare three distinct models—Logistic Regression, Random Forest, and XGBoost—to analyze a synthesized dataset representative of the industry. The research encompasses comprehensive data preprocessing, feature engineering, and model evaluation using standard performance metrics. Our findings demonstrate the superior performance of XGBoost, achieving 88% accuracy in predicting customer churn. Through feature importance analysis, we identify key churn predictors, with the difference between a customer’s last order amount and their mean order amount emerging as the most significant factor. Additionally, we utilize SHAP (SHapley Additive exPlanations) analysis to interpret model outcomes, revealing nuanced relationships between features and churn probability. The study highlights the critical role of consistent engagement, proactive customer support, and personalized retention strategies in reducing churn. Our research contributes to the growing body of knowledge on churn prediction in specialized technology sectors and provides actionable insights for improving customer retention strategies in the aquaculture industry. The paper concludes with recommendations for future research, including the integration of external data sources and exploration of deep learning approaches for temporal dependency analysis in customer behaviour.
Keywords :
Aquaculture technology, churn prediction, customer retention, data-driven strategies, feature importance, machine learning, precision aquaculture, predictive modeling, SHAP analysis, XGBoost.