Hyperparameter Tuning of Random Forest Algorithm for Diabetes Classification
Early diagnosis of diabetes helps mitigate serious health consequences. This study optimizes the hyperparameters of a Random Forest model for diabetes classification on the Pima Indian Diabetes dataset. Random Forest is a popular classification algorithm because it is relatively resistant to overfitting, but its performance still depends heavily on the choice of hyperparameters. This research therefore applies two tuning techniques, Grid Search and Random Search, to improve model accuracy. The methodology comprises data collection, preprocessing, dataset splitting (80% for training and 20% for testing), feature scaling with Standard Scaler, training the Random Forest algorithm with hyperparameter tuning, and model evaluation based on accuracy, precision, recall, and F1-score. Tuning with either Grid Search or Random Search improved model performance, with Random Search yielding the best results: an accuracy of 0.75, a precision of 0.64, and a recall of 0.69. These results indicate that hyperparameter tuning can enhance the performance of the Random Forest model, contributing to the development of machine learning applications for diabetes diagnosis.
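The workflow described in the abstract can be sketched with scikit-learn as follows. This is a minimal illustration, not the authors' code. It assumes a synthetic stand-in dataset with the same shape as the Pima data (768 rows, 8 features), a stratified split, 3-fold cross-validation, and small hypothetical search spaces. None of these settings are stated in the abstract, so the printed scores will not match the reported ones.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# ASSUMPTION: synthetic stand-in for the Pima Indian Diabetes dataset
# (768 samples, 8 features, imbalanced binary target). Replace with the real data.
X, y = make_classification(
    n_samples=768, n_features=8, n_informative=5, n_redundant=0,
    weights=[0.65, 0.35], random_state=42,
)

# 80% training / 20% testing, as in the abstract. Stratification is an assumption.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Scaling lives inside the pipeline so it is refit on each cross-validation fold.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("rf", RandomForestClassifier(random_state=42)),
])

# ASSUMPTION: illustrative search spaces; the abstract does not list the ones used.
param_grid = {
    "rf__n_estimators": [50, 100],
    "rf__max_depth": [None, 5],
    "rf__min_samples_leaf": [1, 3],
}
param_distributions = {
    "rf__n_estimators": randint(50, 151),   # integers from 50 to 150
    "rf__max_depth": [None, 3, 5, 10],
    "rf__min_samples_leaf": randint(1, 6),  # integers from 1 to 5
}

searches = {
    "grid": GridSearchCV(pipeline, param_grid, cv=3, scoring="accuracy"),
    "random": RandomizedSearchCV(
        pipeline, param_distributions, n_iter=8, cv=3,
        scoring="accuracy", random_state=42,
    ),
}

results = {}
for name, search in searches.items():
    search.fit(X_train, y_train)      # tune on the training split only
    y_pred = search.predict(X_test)   # uses the refit best estimator
    results[name] = {
        "best_params": search.best_params_,
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred, zero_division=0),
        "recall": recall_score(y_test, y_pred, zero_division=0),
        "f1": f1_score(y_test, y_pred, zero_division=0),
    }
    print(name, results[name])
```

Grid Search evaluates every combination in `param_grid`, while Random Search draws a fixed number of candidates (`n_iter`) from the distributions. This lets Random Search cover wider ranges at the same cost. Placing `StandardScaler` inside the `Pipeline` is a design choice made here to keep test-fold statistics out of the scaling step during cross-validation.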