Reconstruction of the Phi-2 Method for Question-Answering Related to Diabetes Disease Using the MedAlpaca Dataset

Muhammad Ridho, Alhadi Bustamam, Risman Adnan

Abstract


This study reconstructs the Phi-2 method for text-based question answering about diabetes using the MedAlpaca dataset, with the aim of improving accuracy in diabetes question-answering applications. We fine-tune the model with Low-Rank Adaptation (LoRA) to improve its ability to handle complex medical queries. The MedAlpaca dataset, which contains a diverse range of medical questions and answers, provides a robust foundation for training and testing the model. Fine-tuning with MedAlpaca significantly improves on the base Phi-2 model: accuracy on MedMCQA rises from 14.81% to 49.37%, and the fine-tuned model reaches 92.83% on PubMedQA and 38.78% on MedQA. It also surpasses other leading models such as BioBERT (89.90%) and GatorTron (90.87%). These results highlight the effectiveness of incorporating domain-specific datasets like MedAlpaca to boost model performance and point to promising directions for future research, including expanding datasets and refining fine-tuning techniques to further improve automated medical question-answering systems.
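As a rough illustration of the LoRA idea mentioned in the abstract, the sketch below freezes a base weight matrix W and adds a trainable low-rank update, so the adapted layer applies W + (alpha / r) * B A. It is a minimal pure-Python sketch, not the authors' training code: the matrix sizes, the rank `r`, the scaling factor `alpha`, and the helper names `matmul`, `lora_effective_weight`, and `trainable_params` are all assumptions chosen for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha, r):
    """Return W + (alpha / r) * B @ A, the weight a LoRA-adapted layer applies."""
    delta = matmul(b, a)  # (d_out x r) @ (r x d_in) -> d_out x d_in
    scale = alpha / r
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def trainable_params(d_out, d_in, r):
    """Return (full fine-tuning count, LoRA count) for one weight matrix."""
    return d_out * d_in, r * (d_out + d_in)


# Illustrative sizes only (assumed); not the configuration used in the study.
d_out, d_in, r, alpha = 6, 4, 2, 4
rng = random.Random(0)

# Frozen base weight: never updated during LoRA fine-tuning.
w = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
# Trainable factor A: small random initialization.
a = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]
# Trainable factor B: zero initialization, so the update B @ A starts at zero.
b = [[0.0] * r for _ in range(d_out)]

w_adapted = lora_effective_weight(w, a, b, alpha, r)
full_count, lora_count = trainable_params(d_out, d_in, r)
```

Because B starts at zero, the adapted weight equals the base weight before any training step, and only the two small factors A and B would receive gradient updates. The saving in trainable parameters grows with the matrix size: a square matrix of side d needs d * d parameters under full fine-tuning but only 2 * r * d under LoRA.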

Keywords


Fine-Tuning; Phi-2; MedAlpaca; Question-Answering; Diabetes






DOI: https://doi.org/10.37905/jjbm.v6i3.30506

Copyright (c) 2025 Muhammad Ridho, Alhadi Bustamam, Risman Adnan

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.


EDITORIAL OFFICE OF JAMBURA JOURNAL OF BIOMATHEMATICS

 Department of Mathematics, Faculty of Mathematics and Natural Science, Universitas Negeri Gorontalo
Jl. Prof. Dr. Ing. B. J. Habibie, Moutong, Tilongkabila, Kabupaten Bone Bolango 96554, Gorontalo, Indonesia
Jambura Journal of Biomathematics (JJBM) by Department of Mathematics Universitas Negeri Gorontalo is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.