Large language models are poor medical coders: Study

Frank Masiello

Apr 24, 2024


https://www.mountsinai.org/about/newsroom/2024/despite-ai-advancements-human-oversight-remains-essential#

Here’s a summary of the linked Mount Sinai newsroom article:

**Study Findings**: Large language models (LLMs) such as GPT-4 and GPT-3.5 perform poorly at medical coding, with accuracy below 50%.
**Best Performer**: GPT-4 had the highest exact match rates for medical codes but still produced a significant number of errors.
**Potential Applications**: LLMs could automate medical code assignment for healthcare reimbursement and research, but they require further refinement.
**Future Research**: The team at the Icahn School of Medicine at Mount Sinai plans to develop tailored LLM tools for accurate medical data extraction and billing code assignment.

The study emphasizes the need for cautious implementation and ongoing development of AI in healthcare.
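
For readers unfamiliar with the "exact match rate" mentioned in the summary, here is a minimal Python sketch of how such a metric could be computed. It is not the study's own evaluation code: the function name `exact_match_rate`, the whitespace and case normalization, and the sample codes are all assumptions made for illustration.

```python
def exact_match_rate(predicted, reference):
    """Return the fraction of predicted codes identical to the reference codes.

    Assumption for this sketch: codes are compared after trimming whitespace
    and uppercasing. A real evaluation may normalize differently.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        return 0.0
    matches = sum(
        p.strip().upper() == r.strip().upper()
        for p, r in zip(predicted, reference)
    )
    return matches / len(reference)


# Illustrative data only, not taken from the study.
reference = ["E11.9", "I10", "J45.909", "N39.0"]
predicted = ["E11.9", "I10", "J45.90", "N39.0 "]  # third code is a near miss

rate = exact_match_rate(predicted, reference)
print(f"Exact match rate: {rate:.0%}")  # prints "Exact match rate: 75%"
```

Note that a near miss such as `J45.90` counts as a full error under this metric, which is one reason exact match is a strict way to score code assignment.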
