Chat Generative Pre-Trained Transformer (ChatGPT) is an AI tool that is easily accessible to the public. While convenient, there are concerns regarding the accuracy and reliability of its responses. As the number of patients utilizing AI for their health-related concerns continues to grow, it is important to assess the information being provided by these resources.

A ChatGPT-4o account was created, with the memory function disabled to maintain data integrity. Questions regarding dietary restrictions for three major cardiovascular-related diseases (hypertension, hyperlipidemia, and diabetes mellitus) were created. Two dietary questions per condition were asked sequentially in the same chat session. After clearing the initial session data, a separate session was created for each individual disease. The same questions were asked seven days later using the same protocol to assess reliability and consistency. The answers were evaluated by five board-certified cardiologists for comprehensibility, completeness, and accuracy on a 5-point Likert scale (Figure 1). Outcomes were evaluated using ANOVA or Student's t-test.

A total of 180 responses were recorded evaluating comprehensibility, completeness, and accuracy (60 responses per category) of the 12 ChatGPT diet-related answers regarding hypertension, hyperlipidemia, and diabetes mellitus. The responses had a mean comprehensibility score of 4.18 ± 0.20, a mean completeness score of 4.22 ± 0.16, and a mean accuracy score of 4.20 ± 0.17 (Figure 2). When comparing the "Initial" vs "Repeat" question-answer pairs, there was no significant difference in mean scores for comprehensibility (4.23 vs 4.13; p-value: 0.49), completeness (4.17 vs 4.27; p-value: 0.30), or accuracy (4.27 vs 4.13; p-value: 0.33).
Additionally, when comparing the combined mean scores for comprehensibility, completeness, and accuracy for the "Initial" vs "Repeat" question-answer pairs, there was no significant difference (p-value: 0.59) (Figure 3). ChatGPT is a potentially useful resource for patient inquiries about cardiovascular disease. It provided generally comprehensible, complete, and accurate responses that remained consistent across multiple instances. Further investigation is needed to assess the utility of ChatGPT and AI more broadly. Variables to be considered include utilizing multiple descriptors of diseases (e.g., "hyperlipidemia" vs "high cholesterol"), comparing multiple languages, and/or comparing various AI platforms (e.g., ChatGPT-4o vs Open Evidence).
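The "Initial" vs "Repeat" reliability comparison above can be sketched as a two-sample Welch's t-test on the Likert ratings. The ratings below are illustrative placeholders, not the study's data, and the study's actual analysis was presumably run in standard statistical software; this is only a minimal sketch of the comparison.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's two-sample t-statistic (unequal variances assumed)."""
    na, nb = len(a), len(b)
    return (mean(a) - mean(b)) / math.sqrt(variance(a) / na + variance(b) / nb)

# Hypothetical 5-point Likert ratings for one category (NOT the study's data):
initial = [4, 5, 4, 4, 5, 4, 4, 4, 5, 4]  # "Initial" session scores
repeat_ = [4, 4, 5, 4, 4, 4, 5, 4, 4, 4]  # "Repeat" scores, 7 days later

t = welch_t(initial, repeat_)
# A |t| well below ~2 is consistent with the study's finding of no
# significant Initial-vs-Repeat difference at the 0.05 level.
```

In practice the t-statistic would be converted to a p-value against the t-distribution with Welch-Satterthwaite degrees of freedom, as standard statistical packages do automatically.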
Published in: American Journal of Preventive Cardiology
Volume 23, article 101207