Please use this identifier to cite or link to this item:
http://ir.futminna.edu.ng:8080/jspui/handle/123456789/27575
Title: | An Improved Adaptive Synthetic Sampling Technique and Machine Learning Model for Enhanced Imbalance Medical Data Classification |
Authors: | Abdullahi, Hafiz; Bashir, Sulaimon Adebayo; Aminu, Enesi Femi |
Keywords: | Imbalance, Datasets, Adaptive, synthetic, Data mining, Machine learning, Oversampling, Undersampling. |
Issue Date: | Nov-2023 |
Publisher: | Federal University of Technology Akure |
Citation: | Abdullahi, H., Bashir, S.A., & Aminu, E.F. (2023). An Improved Adaptive Synthetic Sampling Technique and Machine Learning Model for Enhanced Imbalance Medical Data Classification. Proceedings of the 2023 School of Engineering and Engineering Technology (SEET) Annual Conference, FUTA, Nigeria. |
Abstract: | Medical data classification plays a pivotal role in healthcare decision-making, and handling imbalanced datasets is critical for accurate classification in this domain. This paper presents an enhancement of the Adaptive Synthetic Sampling (ADASYN) algorithm tailored to medical data classification. The proposed Improved ADASYN algorithm combines ADASYN with k-means clustering to address two key issues: generating synthetic minority samples and eliminating outliers that ADASYN may introduce. In doing so, it aims to mitigate the reduction in majority-class accuracy and ultimately improve classification performance. The pre-processed medical data first undergoes an estimation step that determines the number of synthetic samples required, which ADASYN then generates. The synthetic samples are merged with the original minority data, and k-means clustering is applied to identify and filter out misclustered points, removing outliers. If the data remain imbalanced, the algorithm iterates, recalculating the number of additional minority samples needed, until the dataset is balanced. The balanced dataset can then be used by machine learning algorithms for classification. The proposed algorithm was implemented in MATLAB R2023a to support reproducibility and practical use in medical data classification. This research is a step towards more robust and accurate medical data classification and, in turn, better healthcare decision support systems. |
URI: | http://repository.futminna.edu.ng:8080/jspui/handle/123456789/27575 |
ISBN: | 978-978-785-579-9 |
Appears in Collections: | Computer Science |
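
The loop described in the abstract (estimate the shortfall, oversample, filter with k-means, repeat) can be sketched roughly as follows. This is a minimal pure-Python illustration, not the authors' MATLAB R2023a implementation. The function names (`adasyn_like`, `kmeans_two`, `improved_adasyn`), the choice of two clusters seeded at the class means, the rule of filtering only the synthetic points, and the weighted random draw used in place of ADASYN's exact per-point sample counts are all assumptions made for the example.

```python
import math
import random


def mean(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    return tuple(sum(c) / len(points) for c in zip(*points))


def adasyn_like(minority, majority, n_new, k, rng):
    """Create n_new synthetic minority points (needs >= 2 minority points).

    Simplification (assumption): base points are drawn at random, weighted by
    the share of majority points among their k nearest neighbours, instead of
    computing ADASYN's exact per-point sample counts.
    """
    labelled = [(p, 1) for p in minority] + [(p, 0) for p in majority]
    weights = []
    for p in minority:
        near = sorted((q for q in labelled if q[0] is not p),
                      key=lambda q: math.dist(p, q[0]))[:k]
        weights.append(sum(1 for _, label in near if label == 0) / k)
    if sum(weights) == 0:  # no minority point borders the majority class
        weights = [1.0] * len(minority)
    synthetic = []
    for base in rng.choices(minority, weights=weights, k=n_new):
        mates = sorted((q for q in minority if q is not base),
                       key=lambda q: math.dist(base, q))[:k]
        mate = rng.choice(mates)
        gap = rng.random()  # interpolate between base and a minority neighbour
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return synthetic


def kmeans_two(points, centers, iters=20):
    """Plain Lloyd iterations for two clusters, starting from given centers."""
    for _ in range(iters):
        groups = [[], []]
        for p in points:
            nearest = min((0, 1), key=lambda j: math.dist(p, centers[j]))
            groups[nearest].append(p)
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers


def improved_adasyn(minority, majority, k=3, max_rounds=10, seed=0):
    """Oversample, drop misclustered synthetic points, repeat until balanced."""
    rng = random.Random(seed)
    minority = list(minority)  # work on a copy
    for _ in range(max_rounds):
        needed = len(majority) - len(minority)  # estimation step
        if needed <= 0:
            break
        synthetic = adasyn_like(minority, majority, needed, k, rng)
        # Assumption: cluster 0 is seeded at the minority mean, cluster 1 at
        # the majority mean; synthetic points nearer cluster 1 are outliers.
        centers = kmeans_two(minority + synthetic + majority,
                             [mean(minority), mean(majority)])
        minority += [s for s in synthetic
                     if math.dist(s, centers[0]) <= math.dist(s, centers[1])]
    return minority


# Toy data: 4 minority points near the origin, 9 majority points near (10, 10).
minority = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
majority = [(10.0 + i % 3, 10.0 + i // 3) for i in range(9)]
balanced = improved_adasyn(minority, majority)
print(len(minority), len(balanced), len(majority))
```

On well-separated toy data like this, every synthetic point survives the filter and one round is enough; with overlapping classes, rejected points leave a shortfall and the loop runs again, up to `max_rounds`.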
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
Afiz_FUTA_Conf.pdf | | 2.28 MB | Adobe PDF |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.