Abstract :
User-generated content on the internet, such as reviews, posts, tags, ratings, and opinions, can serve as a business indicator if it is collected and analyzed appropriately. One example is predicting customer satisfaction by applying big data analytics to online reviews. To predict customer satisfaction from user-generated content, the author implements a machine learning approach based on sentiment analysis. Five-fold cross-validation was used to train and validate the classification model. Training was performed with combinations of text vectorization methods: term frequency-inverse document frequency (tf-idf) and bag-of-words; n-gram types: unigram, bigram, trigram, and a combination of unigram, bigram, and trigram; and machine learning algorithms: linear support vector classification (LinearSVC) and multinomial naïve Bayes (MultinomialNB). The results were then evaluated using classification performance metrics such as precision, recall, F1 measure, and AUC score.
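The experimental grid described above can be sketched as follows. This is a minimal illustration, not the author's code: it assumes scikit-learn (suggested by the names LinearSVC and MultinomialNB), default hyperparameters, and a small set of hypothetical toy reviews that stand in for the study's data. The names `run_grid`, `POSITIVE`, and `NEGATIVE` are illustrative.

```python
from itertools import product

from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.model_selection import cross_validate
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy reviews (NOT the study's data): 1 = satisfied, 0 = dissatisfied.
POSITIVE = [
    "great product and very fast delivery",
    "really happy with the build quality",
    "works perfectly and looks very nice",
    "excellent value for the price paid",
    "very good service and friendly support",
    "love this item would buy again",
    "fast shipping and great product quality",
    "very happy with this great purchase",
    "excellent quality and works very well",
    "good price and really friendly service",
]
NEGATIVE = [
    "terrible product and very slow delivery",
    "really unhappy with the build quality",
    "stopped working after only two days",
    "poor value for the price paid",
    "very bad service and rude support",
    "hate this item would not buy again",
    "slow shipping and poor product quality",
    "very unhappy with this bad purchase",
    "terrible quality and does not work",
    "bad price and really rude service",
]
texts = POSITIVE + NEGATIVE
labels = [1] * len(POSITIVE) + [0] * len(NEGATIVE)

VECTORIZERS = {"bag-of-words": CountVectorizer, "tf-idf": TfidfVectorizer}
NGRAM_RANGES = {
    "unigram": (1, 1),
    "bigram": (2, 2),
    "trigram": (3, 3),
    "uni+bi+trigram": (1, 3),
}
CLASSIFIERS = {"LinearSVC": LinearSVC, "MultinomialNB": MultinomialNB}
SCORING = ["precision", "recall", "f1", "roc_auc"]


def run_grid(texts, labels, folds=5):
    """Cross-validate every vectorizer / n-gram / classifier combination."""
    results = {}
    grid = product(VECTORIZERS.items(), NGRAM_RANGES.items(), CLASSIFIERS.items())
    for (vec_name, vec), (ngram_name, ngram), (clf_name, clf) in grid:
        # The vectorizer is refit inside each training fold, so no test-fold
        # vocabulary leaks into training.
        pipeline = make_pipeline(vec(ngram_range=ngram), clf())
        scores = cross_validate(pipeline, texts, labels, cv=folds, scoring=SCORING)
        results[(vec_name, ngram_name, clf_name)] = {
            metric: float(scores[f"test_{metric}"].mean()) for metric in SCORING
        }
    return results


results = run_grid(texts, labels)
for combo, metrics in sorted(results.items()):
    print(combo, {name: round(value, 2) for name, value in metrics.items()})
```

Each entry of `results` holds the fold-averaged precision, recall, F1, and AUC for one of the 16 combinations. Scores on the toy reviews say nothing about the scores reported in the study.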
The results show that the tf-idf vectorizer performs similarly to the bag-of-words method. A similar result was observed for the choice of machine learning algorithm: MultinomialNB and LinearSVC produce the same performance. Low-order n-grams (such as unigrams and bigrams) tended to have higher precision, recall, F1 measure, and AUC score than high-order n-grams (such as trigrams). The best results were achieved by combining unigrams, bigrams, and trigrams, which yielded an average performance score of 0.94 across all measurements. Based on these results and their analysis, the author concludes that predicting customer satisfaction by applying text and sentiment analysis methods to user-generated content is feasible. The model in this experiment performs well, with high precision, recall, F1, and AUC scores.
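To make the n-gram settings compared above concrete, the following sketch lists the features each setting extracts from one sentence. It is an illustration only: the whitespace tokenizer, the helper names `ngrams` and `ngram_features`, and the example review are assumptions, not the study's preprocessing.

```python
def ngrams(tokens, n):
    """Return every contiguous run of n tokens, joined into one string."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_features(text, low, high):
    """Collect all n-grams of order low..high from a lowercased, whitespace-split text."""
    tokens = text.lower().split()
    features = []
    for n in range(low, high + 1):
        features.extend(ngrams(tokens, n))
    return features


review = "the battery life is not good"  # hypothetical example review

unigrams = ngram_features(review, 1, 1)   # single words
bigrams = ngram_features(review, 2, 2)    # word pairs
trigrams = ngram_features(review, 3, 3)   # word triples
combined = ngram_features(review, 1, 3)   # unigrams + bigrams + trigrams

print(unigrams)
print(bigrams)
print(trigrams)
print(len(combined))
```

The unigram "good" on its own loses the negation, while the bigram "not good" keeps it. The combined setting retains both kinds of feature.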
Keywords :
Classification, Customer Satisfaction, Machine learning, Sentiment analysis, User-generated content.