Articles

Outpatient Length of Stay (OLOS) Analysis at Edelweis Hospital using Machine Learning Algorithm

Patient satisfaction may be affected by the length of stay (LOS) that a patient perceives during an outpatient clinic visit. With increasing competition in the healthcare industry and patients’ demands for higher-quality care, hospitals are focusing more on enhancing their quality from both clinical and management perspectives. The Indonesian Ministry of Health has established minimum standards (SPM) for healthcare services that all Indonesian hospitals are required to meet, including a hospital waiting time indicator, which must be no longer than 60 minutes. Outpatient length of stay (OLOS), however, is not yet specified in the SPM. OLOS is defined as the amount of time a patient spends in a hospital from the moment he or she arrives at the administration until he or she leaves. Edelweis Hospital, a private hospital in Bandung, has established a 2-hour maximum LOS standard for its outpatient services. Providing accurate information about LOS may increase patient satisfaction by reducing uncertainty. However, effective methods for predicting OLOS in pediatric clinics are not well established. This study aims to design a prediction model for OLOS based on patient characteristics and several other clinical attributes. By identifying the attributes that affect OLOS, the model will help the hospital make relevant decisions. We used machine learning algorithms, namely random forest, decision tree, k-nearest neighbor (kNN), AdaBoost, and gradient boosting, to design prediction models for OLOS. On the validation set, random forest had the highest accuracy at 99.3%, followed by decision tree and gradient boosting at 99.2% each. Furthermore, the machine learning models were used to determine the importance of attributes.
These models could eventually be used alongside real-time IT system data to provide accurate real-time estimates of OLOS at the Pediatric Clinic.
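
As an illustration of one of the algorithms named in this abstract, the following is a minimal kNN sketch evaluated by accuracy on a held-out validation set. The abstract does not list the study's attributes or target classes, so the two features (patients ahead in the queue, number of ancillary services), the "within"/"over" labels relative to the 2-hour standard, and all data rows are synthetic assumptions made only for this example.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training points closest to x (Euclidean distance)."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def accuracy(train_X, train_y, val_X, val_y, k=3):
    """Fraction of validation rows whose predicted label matches the true label."""
    hits = sum(knn_predict(train_X, train_y, x, k) == y for x, y in zip(val_X, val_y))
    return hits / len(val_y)

# Synthetic rows, NOT data from the study:
# (patients ahead in queue, number of ancillary services ordered)
train_X = [(2, 0), (3, 0), (4, 1), (12, 2), (15, 3), (11, 3), (5, 0), (14, 1)]
# Hypothetical label: visit finished within, or ran over, the 2-hour standard
train_y = ["within", "within", "within", "over", "over", "over", "within", "over"]

val_X = [(3, 1), (13, 2)]
val_y = ["within", "over"]

print(accuracy(train_X, train_y, val_X, val_y))
```

In practice the features would need scaling before distances are computed, and the tree-based models reported in the abstract would be fitted with an established library rather than by hand.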

A Literary Review of Pattern Matching Techniques in Network Intrusion Detection

With the exponential growth in devices and services being added to networks, we are also witnessing an increase in the volume and complexity of threats. This calls for greater efficiency in network intrusion detection systems, which primarily rely on pattern matching to identify malicious activity on the network. In this literary review of pattern matching techniques in network intrusion detection, we explore the limitations of both signature-based and anomaly-based intrusion detection systems and the research carried out to overcome them. The review focuses on the performance improvements in signature-based intrusion detection systems achieved through methodologies and technologies such as regular expressions, Hyperscan, RE2, Flashtext, a generalized Aho-Corasick algorithm, Bloom filters, and payload sampling. It also covers machine learning techniques, including genetic algorithms, Support Vector Machines (SVM), and the Improved Self-Adaptive Bayesian Algorithm (ISABA), which are used in anomaly-based network intrusion detection to detect anomalous behavior and identify potential threats, helping security analysts carry out their job functions. Additionally, this review explores the integration of the MITRE ATT&CK framework and Security Information and Event Management (SIEM) systems in network intrusion detection. The framework provides a structured and standardized approach for analyzing and classifying the tactics and techniques used by attackers, while SIEM systems enable the correlation of threat activity across multiple sources, allowing for a more comprehensive and accurate view of network security. Overall, this literary review provides insights into the state-of-the-art techniques and frameworks used in network intrusion detection based on pattern matching, highlighting the significant improvements in performance and detection capabilities.
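
To make the multi-pattern matching idea concrete, the following is a minimal sketch of the classic Aho-Corasick automaton, which scans a payload once while matching all signatures simultaneously. It is the textbook algorithm, not the generalized variant the review refers to, and the signature strings and sample payload are hypothetical.

```python
from collections import deque

def build_automaton(patterns):
    """Build trie transitions, failure links, and output sets for the patterns."""
    goto = [{}]    # goto[state][char] -> next state
    fail = [0]     # fail[state] -> state for the longest proper suffix in the trie
    out = [set()]  # out[state] -> patterns that end at this state
    for pat in patterns:
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append(set())
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].add(pat)
    # Breadth-first pass: depth-1 states keep fail = 0, deeper ones are derived
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] |= out[fail[nxt]]  # inherit matches ending at the suffix state
    return goto, fail, out

def find_all(text, automaton):
    """Return sorted (start_index, pattern) pairs for every match in text."""
    goto, fail, out = automaton
    state = 0
    matches = []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pat in out[state]:
            matches.append((i - len(pat) + 1, pat))
    return sorted(matches)

# Hypothetical signatures and payload, for illustration only
signatures = ["cmd.exe", "/etc/passwd", "etc"]
automaton = build_automaton(signatures)
hits = find_all("GET /../../etc/passwd HTTP/1.1", automaton)
print(hits)
```

The scan cost grows with the payload length plus the number of matches rather than with the number of signatures, which is why this family of algorithms is attractive for signature-based detection.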

Detection and Classification of Gastrointestinal Diseases by using Machine Learning: A Review

Currently, gastrointestinal (GI) diseases claim the lives of up to two million people worldwide. GI disease treatment can be challenging, time-consuming, and expensive. One of the most recent advancements in medical imaging is the use of video endoscopy to diagnose gastrointestinal illnesses such as stomach ulcers, bleeding, and polyps. Because medical video endoscopy produces so many images, doctors need a great deal of time to review them all. This makes manual diagnosis difficult and has encouraged research into computer-aided approaches that can assess all of the generated images quickly and accurately. The innovative aspect of the suggested methodology is the creation of a system for the diagnosis of digestive disorders. Machine learning techniques have the potential to significantly lower the cost of examination procedures while increasing the accuracy and speed of diagnosis. This paper describes a method for classifying GI illnesses using machine learning techniques.

E-Commerce Product Demand Modelling Using Machine Learning Algorithm Case Study of Rice Trading Products in PT XYZ

E-commerce XYZ is an Indonesian commerce company that has three types of products in its B2C business line: trading, consignment, and marketplace. From January 2021 until October 2022, the company’s trading rice category product sales generated a negative profit. Although e-commerce has focused on growth rather than profitability for the last several years, the current economic environment is forcing e-commerce companies to focus on profitability as well. For trading products, maximum profit can be pursued in two ways: selling products at a very high margin but in smaller quantities, or selling in large quantities but at a sub-optimal margin. Hence, the company needs to find a demand function model that can be used to generate maximum profit. The researcher first created a baseline model by using the median for every product group, with products already grouped by their Unit of Measurement. Next, the researcher created demand functions using four other models. Gradient boosting was found to be the best algorithm for modelling the demand function. Although this model successfully models a demand function for a product category in e-commerce, business context still needs to be added, and other features that might affect the demand function still need to be identified, before the model can be implemented in real life.
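
The baseline and the profit trade-off described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the abstract does not say which quantity the median is taken over, so the sketch assumes it is the quantity sold per Unit of Measurement group. The sales rows, the linear demand function, the unit cost, and the candidate prices are all hypothetical.

```python
from collections import defaultdict
from statistics import median

def fit_median_baseline(rows):
    """Return {unit_of_measurement: median quantity sold} from (uom, qty) rows."""
    by_group = defaultdict(list)
    for uom, qty in rows:
        by_group[uom].append(qty)
    return {uom: median(qtys) for uom, qtys in by_group.items()}

def mean_absolute_error(baseline, rows):
    """Average absolute gap between the group median and each observed quantity."""
    errors = [abs(baseline[uom] - qty) for uom, qty in rows]
    return sum(errors) / len(errors)

def best_price(demand_fn, unit_cost, candidate_prices):
    """Pick the candidate price maximizing (price - unit_cost) * predicted demand."""
    return max(candidate_prices, key=lambda p: (p - unit_cost) * demand_fn(p))

# Hypothetical sales rows: (unit of measurement, quantity sold)
sales = [("5kg", 40), ("5kg", 55), ("5kg", 70), ("10kg", 12), ("10kg", 20)]
baseline = fit_median_baseline(sales)
baseline_mae = mean_absolute_error(baseline, sales)

# Hypothetical linear demand curve standing in for a fitted demand model
def toy_demand(price):
    return max(0, 200 - 2 * price)

chosen = best_price(toy_demand, unit_cost=50, candidate_prices=range(50, 101, 5))
print(baseline, baseline_mae, chosen)
```

A fitted model such as gradient boosting would replace `toy_demand`, and its error would be compared against the median baseline's error to judge whether the added complexity pays off.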

Predicting Customer Satisfaction through Sentiment Analysis on Online Review

User-generated content, such as user reviews, posts, tags, ratings, and opinions on the internet, can be used as a business indicator if collected and appropriately analyzed. One example is predicting customer satisfaction by applying big data analytics to online reviews. To analyze user-generated content and predict customer satisfaction, the author implements a machine learning approach using the Sentiment Analysis method. Five-fold cross-validation was performed to train the classification model. The training was performed with a combination of tokenization methods: term frequency-inverse document frequency (tf-idf) and bag-of-words; n-gram types: unigram, bigram, trigram, and a combination of unigram, bigram, and trigram; and machine learning algorithms: linear support vector classification (LinearSVC) and multinomial naive Bayes (MultinomialNB). The result was then evaluated using classification performance metrics such as precision, recall, F1 measure, and AUC score.

The result shows that the tf-idf vectorizer performs similarly to the bag-of-words method. A similar result was also observed for machine learning algorithm selection: both MultinomialNB and LinearSVC produce the same performance. Low-order n-grams (such as unigrams and bigrams) tended to have higher precision, recall, F1 measure, and AUC score than high-order n-grams (such as trigrams). The best results were achieved by combining unigrams, bigrams, and trigrams, resulting in an average performance score of 0.94 for all measurements. From the result and analysis, the author finds that predicting customer satisfaction using text and sentiment analysis methods on user-generated content is possible. The model’s performance in this experiment is decent, with high precision, recall, F1, and AUC score.
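
The feature representations compared in this abstract can be sketched as follows: combined unigram, bigram, and trigram extraction, bag-of-words counts, and a plain tf-idf weighting. This is a minimal illustration, not the author's pipeline. The whitespace tokenizer, the unsmoothed idf formula log(N / df), and the two example reviews are assumptions, and library implementations often use smoothed idf variants that yield different weights.

```python
import math
from collections import Counter

def ngrams(tokens, n_min=1, n_max=3):
    """All contiguous n-grams with n_min <= n <= n_max, joined by spaces."""
    return [" ".join(tokens[i:i + n])
            for n in range(n_min, n_max + 1)
            for i in range(len(tokens) - n + 1)]

def bag_of_words(doc, n_min=1, n_max=3):
    """Count n-gram occurrences in one document (lowercased, whitespace-split)."""
    return Counter(ngrams(doc.lower().split(), n_min, n_max))

def tf_idf(docs, n_min=1, n_max=3):
    """Weight each n-gram count by log(N / document frequency), unsmoothed."""
    bags = [bag_of_words(d, n_min, n_max) for d in docs]
    df = Counter(term for bag in bags for term in bag)  # documents containing term
    n_docs = len(docs)
    return [{t: c * math.log(n_docs / df[t]) for t, c in bag.items()} for bag in bags]

# Hypothetical reviews, for illustration only
reviews = ["good food", "bad food"]
weights = tf_idf(reviews)
print(ngrams(["not", "very", "good"]))
print(weights[0])
```

Higher-order n-grams such as "not very good" capture negation that unigrams miss, but they occur rarely, which is one plausible reason a combination of all three orders can outperform trigrams alone.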