Reconstruction of the Phi-2 Method for Question-Answering Related to Diabetes Disease Using the MedAlpaca Dataset

Muhammad Ridho; Alhadi Bustamam; Risman Adnan

doi:10.37905/jjbm.v6i3.30506

Abstract

This study reconstructs the Phi-2 method for text-based question answering about diabetes using the MedAlpaca dataset, with the aim of improving accuracy in diabetes question-answering applications. We fine-tune the model with LoRA (low-rank adaptation) to improve its ability to handle complex medical queries. The MedAlpaca dataset, which contains a diverse range of medical questions and answers, provides a robust foundation for training and testing the model. Fine-tuning with MedAlpaca significantly improves on the base Phi-2 model: accuracy rises from 14.81% to 49.37% on MedMCQA and reaches 92.83% on PubMedQA and 38.78% on MedQA. The fine-tuned model also surpasses other leading models such as BioBERT (89.90%) and GatorTron (90.87%). These results highlight the effectiveness of domain-specific datasets like MedAlpaca for boosting model performance and point to promising directions for future research, including expanding datasets and refining fine-tuning techniques to further improve automated medical question-answering systems.
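
As an illustration of the LoRA idea mentioned in the abstract, the following is a minimal pure-Python sketch of a linear layer whose frozen weight W is adapted by a low-rank update (alpha / r) * B * A, following the general formulation of Hu et al. It is not the authors' training code: the class name `LoRALinear`, the helper `matvec`, and the values of `r` and `alpha` are assumptions chosen for illustration, and the abstract does not state the hyperparameters actually used.

```python
import random


def matvec(M, x):
    """Multiply a matrix (a list of rows) by a vector."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]


class LoRALinear:
    """Hypothetical layer: frozen weight W plus a low-rank update (alpha / r) * B @ A."""

    def __init__(self, W, r=2, alpha=4, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(W), len(W[0])
        self.W = W  # frozen pretrained weight, shape (d_out, d_in)
        # A starts with small random values and B starts at zero, so the
        # adapted layer initially reproduces the frozen layer exactly.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(r)]
        self.B = [[0.0] * r for _ in range(d_out)]
        self.scale = alpha / r

    def forward(self, x):
        base = matvec(self.W, x)                   # W x (frozen path)
        delta = matvec(self.B, matvec(self.A, x))  # B (A x) (trainable path)
        return [b + self.scale * d for b, d in zip(base, delta)]

    def trainable_params(self):
        # Only A (r x d_in) and B (d_out x r) would be updated during fine-tuning.
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


# Usage: a 4x4 frozen weight adapted with rank r = 1.
W = [[1.0, 0.0, 2.0, 0.0],
     [0.0, 1.0, 0.0, 3.0],
     [0.5, 0.0, 1.0, 0.0],
     [0.0, 0.5, 0.0, 1.0]]
layer = LoRALinear(W, r=1, alpha=2)
print(layer.forward([1.0, 2.0, 3.0, 4.0]))  # same as W x while B is still zero
print(layer.trainable_params(), "trainable values vs", len(W) * len(W[0]), "in W")
```

The point of the design is that only A and B are trained while W stays fixed, so the number of trainable values grows with r * (d_in + d_out) rather than d_in * d_out.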

Keywords

Fine-Tuning; Phi-2; MedAlpaca; Question-Answering; Diabetes

References

A. Vaswani et al., “Attention is all you need,” arXiv preprint arXiv:1706.03762, 2017. DOI:10.48550/arXiv.1706.03762

K. M. Fitria, “Information retrieval performance in text generation using knowledge from generative pre-trained transformer (gpt-3),” Jambura Journal of Mathematics, vol. 5, no. 2, pp. 327–338, 2023. DOI:10.34312/jjom.v5i2.20574

U. Rifanti et al., “A reinforcement learning based decision-support system for mitigate strategies during covid-19: A systematic review,” Jambura Journal of Biomathematics (JJBM), vol. 6, no. 1, pp. 60–70, 2025. DOI:10.37905/jjbm.v6i1.30513

E. Alsentzer et al., “Publicly available clinical bert embeddings,” arXiv preprint arXiv:1904.03323, 2019. DOI:10.48550/arXiv.1904.03323

J. Lee et al., “Biobert: a pre-trained biomedical language representation model for biomedical text mining,” Bioinformatics, vol. 36, no. 4, pp. 1234–1240, 2019. DOI:10.1093/bioinformatics/btz682

ADA, “2. Classification and Diagnosis of Diabetes: Standards of Medical Care in Diabetes—2020,” Diabetes Care, vol. 43, no. Supplement_1, pp. S14–S31, 2020. DOI:10.2337/dc20-S002

S. Syarofina et al., “The distance function approach on the minibatchkmeans algorithm for the dpp-4 inhibitors on the discovery of type 2 diabetes drugs,” Procedia Computer Science, vol. 179, pp. 127–134, 2021. DOI:10.1016/j.procs.2020.12.017

M. J. Davies et al., “Management of hyperglycemia in type 2 diabetes, 2018. a consensus report by the american diabetes association (ada) and the european association for the study of diabetes (easd),” Diabetes Care, vol. 41, no. 12, pp. 2669–2701, 2018. DOI:10.2337/dci18-0033

IDF, IDF Diabetes Atlas (10th ed). Brussels: International Diabetes Federation, 2021.

Microsoft, “Phi-2: The surprising power of small language models,” 2023, Accessed on 5 February 2025.

T. Han et al., “Medalpaca – an open-source collection of medical conversational ai models and training data,” arXiv preprint arXiv:2304.08247, 2023. DOI:10.48550/arXiv.2304.08247

D. Hendrycks et al., “Measuring massive multitask language understanding,” in Proceedings of the International Conference on Learning Representations (ICLR), 2021. DOI:10.48550/arXiv.2009.03300

B. Yu, Y. Li, and J. Wang, “Detecting causal language use in science findings,” in Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 4663–4673, 2019. DOI:10.18653/v1/D19-1473

L. Yunxiang et al., “Chatdoctor: A medical chat model fine-tuned on llama model using medical domain knowledge,” Cureus, 2023. DOI:10.7759/cureus.40895

E. J. Hu et al., “Lora: Low-rank adaptation of large language models,” arXiv preprint arXiv:2106.09685, 2021. DOI:10.48550/arXiv.2106.09685

D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980, 2015. DOI:10.48550/arXiv.1412.6980

J. Dean et al., “Large scale distributed deep networks,” in Proceedings of the 26th Conference on Neural Information Processing Systems (NeurIPS), vol. 25, 2012.

Y. Bengio, “Practical recommendations for gradient-based training of deep architectures,” in Neural Networks: Tricks of the Trade, vol. 7700, pp. 437–478, Heidelberg: Springer, 2012. DOI:10.1007/978-3-642-35289-8_26

A. Pal, L. K. Umapathi, and M. Sankarasubbu, “Medmcqa: A large-scale multi-subject multi-choice dataset for medical domain question answering,” in Proceedings of Machine Learning Research, vol. 174, pp. 248–260, 2022.

Q. Jin et al., “Pubmedqa: A dataset for biomedical research question answering,” arXiv preprint arXiv:1909.06146, 2019. DOI:10.48550/arXiv.1909.06146

D. Jin et al., “What disease does this patient have? a large-scale open domain question answering dataset from medical exams,” Applied Sciences, vol. 11, no. 14, p. 6421, 2021. DOI:10.3390/app11146421

K. Huang, J. Altosaar, and R. Ranganath, “Clinicalbert: Modeling clinical notes and predicting hospital readmission,” arXiv preprint arXiv:1904.05342, 2019. DOI:10.48550/arXiv.1904.05342

Y. Peng, S. Yan, and Z. Lu, “Transfer learning in biomedical natural language processing: An evaluation of bert and elmo on ten benchmarking datasets,” in Proceedings of the 18th BioNLP Workshop and Shared Task, pp. 58–65, 2019. DOI:10.18653/v1/W19-5006

X. Yang et al., “Gatortron: A large clinical language model to unlock patient information from unstructured electronic health records,” npj Digital Medicine, vol. 5, no. 1, p. 194, 2022. DOI:10.1038/s41746-022-00742-2

M. Yasunaga, J. Leskovec, and P. Liang, “Linkbert: Pretraining language models with document links,” arXiv preprint arXiv:2203.15827, 2022. DOI:10.48550/arXiv.2203.15827

Copyright (c) 2025 Muhammad Ridho, Alhadi Bustamam, Risman Adnan

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

EDITORIAL OFFICE OF JAMBURA JOURNAL OF BIOMATHEMATICS

Department of Mathematics, Faculty of Mathematics and Natural Science, Universitas Negeri Gorontalo

Jl. Prof. Dr. Ing. B. J. Habibie, Moutong, Tilongkabila, Kabupaten Bone Bolango 96554, Gorontalo, Indonesia

Jambura Journal of Biomathematics (JJBM) by Department of Mathematics Universitas Negeri Gorontalo is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
