
epocrates
What is Google's Med-PaLM 2 and what can it do for healthcare?
May 23, 2023

Since OpenAI launched the popular ChatGPT last year, the healthcare industry has increasingly explored integrating large language models (LLMs) to assist clinicians, enhance patient experiences, and increase productivity and efficiency by automating various management tasks.
OpenAI's ChatGPT, released last November, was the first model to generate widespread public buzz around the potential of AI to transform healthcare. By February 2023, ChatGPT had scored at or near the passing threshold of 60% accuracy on the U.S. Medical Licensing Examination (USMLE) without specialized input from clinician trainers.
Last week, Google Health UK Research lead Alan Karthikesalingam posted on Twitter that another AI tool, Med-PaLM 2, had recently scored up to 86.5% on the USMLE, performing 19% better than the first version of Med-PaLM, which was preprinted in late 2022. These latest study results, which have not yet been peer-reviewed or published in a journal, are currently available as a preprint on arXiv.
What is Med-PaLM 2 and how does it compare to ChatGPT?
Launched in March of this year, Med-PaLM 2 is the second version of Google's medical large language model, designed to provide authoritative answers to medical questions and generate medical insights. (PaLM stands for Pathways Language Model.)
The bedrock upon which Med-PaLM 2 is built is PaLM 2, Google's LLM with improved multilingual, reasoning, and coding capabilities. Trained across 100 languages, PaLM 2 also has a higher capability for common sense reasoning and advanced mathematics and logic than its predecessor.
Although the technology behind PaLM 2 is comparable to GPT-4 (the model behind OpenAI's ChatGPT), the two differ in scope and focus. ChatGPT is a general-purpose LLM, trained on a massive and broad range of datasets to serve as a general natural language tool. At a reported 1 trillion parameters, GPT-4 is reputed to have 10 times as many parameters as PaLM 2.
But experts say that in the medical domain, where accuracy and patient safety are paramount, bigger may not necessarily be better.
What can Med-PaLM 2 do for healthcare?
Google Research reports that Med-PaLM 2 can already answer some complex medical questions better than most medical students. It was the first LLM to perform at an “expert” test-taker level on the MedQA dataset of U.S. Medical Licensing Examination (USMLE)-style questions, achieving an accuracy of 85.4%, higher than GPT-4 (84%) and far exceeding the usual 60% pass mark for new physicians. Med-PaLM 2 answered multiple choice and open-ended questions, provided written explanations for its answers, and evaluated its own responses.
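To make the accuracy figures concrete, here is a minimal sketch of how a score on multiple-choice questions can be computed and compared with the 60% pass mark. The questions, answers, and function name are hypothetical toy values for illustration, not Google's evaluation code or real benchmark data.

```python
from typing import Sequence

# Approximate pass mark cited in the article (an assumption for this sketch).
PASS_MARK = 0.60


def accuracy(predicted: Sequence[str], correct: Sequence[str]) -> float:
    """Return the fraction of questions where the chosen option matches the key."""
    if len(predicted) != len(correct):
        raise ValueError("predicted and correct must be the same length")
    if not correct:
        raise ValueError("need at least one question")
    hits = sum(p == c for p, c in zip(predicted, correct))
    return hits / len(correct)


# Hypothetical toy data: five questions with options A-D.
answer_key = ["A", "C", "B", "D", "A"]
model_answers = ["A", "C", "B", "A", "A"]

score = accuracy(model_answers, answer_key)
passed = score >= PASS_MARK
print(f"accuracy={score:.1%}, meets 60% pass mark: {passed}")
```

A score like this only covers multiple-choice items; open-ended answers need human or rubric-based grading.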
In March 2023, at Google's annual "The Checkup" event, senior company leaders demonstrated how Med-PaLM 2 might answer questions like “What are the first warning signs of pneumonia?” and “Can incontinence be cured?” In some cases, Med-PaLM 2's answers were on a par with, and even more detailed than, the answers that clinicians had provided. But in other instances, Med-PaLM 2's responses were not as thorough, nuanced, or accurate.
Earlier this month, at Google I/O 2023, Google introduced Med-PaLM 2's multimodal capabilities, including its potential to interpret images like X-rays, CT scans, and mammograms and come up with clinically sound conclusions. Future versions of Med-PaLM 2 may be able to work in modalities other than language, including dermatology, retina, radiology (3D and 2D), pathology, medical health records, and genomics.
Such enhanced capabilities could bring much-needed medical expertise and access to remote areas of the world, including underdeveloped countries where there are few physicians.
How was it created and tested?
Google worked with a panel of clinicians across the U.S., U.K., and India to fine-tune the model to generate answers that better aligned with how healthcare experts would respond. These clinicians were given a long set of medical questions and scenarios. The model was then trained to answer questions in a similar way to these experts. Clinicians then evaluated the model’s answers against a set of criteria that included a low likelihood of medical harm, alignment with scientific consensus, precision, and a lack of bias.
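As a rough illustration of how clinician judgments along those lines might be summarized, the sketch below averages pass/fail ratings per criterion. The criterion names, the 0/1 scoring, and the sample ratings are assumptions made for this example; the article does not describe Google's actual rating scheme.

```python
from statistics import mean

# Hypothetical criterion names, loosely based on the four listed in the article.
AXES = ("low_harm", "consensus", "precision", "no_bias")

# Hypothetical ratings: each row is one clinician's verdict on one answer,
# with 1 meaning the answer met the criterion and 0 meaning it did not.
ratings = [
    {"question": "q1", "low_harm": 1, "consensus": 1, "precision": 1, "no_bias": 1},
    {"question": "q1", "low_harm": 1, "consensus": 1, "precision": 0, "no_bias": 1},
    {"question": "q2", "low_harm": 1, "consensus": 0, "precision": 1, "no_bias": 1},
    {"question": "q2", "low_harm": 1, "consensus": 1, "precision": 1, "no_bias": 1},
]


def axis_rates(rows):
    """Return, for each criterion, the share of ratings that met it."""
    return {axis: mean(row[axis] for row in rows) for axis in AXES}


rates = axis_rates(ratings)
for axis, rate in rates.items():
    print(f"{axis}: {rate:.0%}")
```

Reporting each criterion separately, rather than a single blended score, keeps a weakness on one axis (for example, precision) from being hidden by strong results on the others.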
To address initial limitations caused by a lack of standardized medical information and medical responses, Google introduced MultiMedQA, an open-source medical question-answering benchmark. MultiMedQA combines six datasets (MedQA, MedMCQA, PubMedQA, LiveQA, MedicationQA, and MMLU clinical topics) spanning professional medical exams, medical research, and consumer queries. In addition, a new dataset of curated, frequently searched medical inquiries called HealthSearchQA was recently added to improve MultiMedQA.
Google says that the company has also performed adversarial testing in different scenarios to test for LLM weaknesses and to ensure that its output is aligned with its ethical values, including health equity.
What's next?
On April 13, 2023, Google announced it would open access to Med-PaLM 2 to a select group of Google Cloud customers for limited testing, so that partners can explore use cases and share feedback as Google further refines and improves the model. Possible use cases cited include answering complex medical questions, finding insights in complicated, unstructured medical texts and internal data sets, and drafting short- and long-form responses.
Google acknowledges that there’s still much work to be done to make sure this technology can work in real-world settings, reiterating that, in building Med-PaLM 2, they've been focused on "safety, equity, and evaluations of unfair bias."
Sources:
HT TECH. AI study by Google researchers reveals incredible jump in Med-PaLM 2's answering accuracy. (May 18, 2023) https://tech.hindustantimes.com/tech/news/ai-study-by-google-researchers-reveals-incredible-jump-in-med-palm-2s-answering-accuracy-71684431143635.html
Sai Balasubramanian, M.D., J.D. The Race Intensifies: Google Announces Bold Progress in AI, Advancing Med-PaLM 2 for Healthcare. Forbes. (May 15, 2023) https://www.forbes.com/sites/saibala/2023/05/15/the-race-intensfies-google-announces-bold-progress-in-ai-advancing-med-palm-2-for-healthcare/?sh=18b564626d73
Zoubin Ghahramani. Introducing PaLM 2. Google DeepMind. (May 10, 2023) https://blog.google/technology/ai/google-palm-2-ai-large-language-model/
Karan Singhal, et al. Med-PaLM. Google Research. (Accessed May 21, 2023) https://sites.research.google/med-palm/
Karan Singhal and Vivek Natarajan. Med-PaLM 2, our expert-level medical LLM | Research Bytes. Google Research. (May 10, 2023) https://www.youtube.com/watch?v=k_-Z_TkHMqA
Arjun Sha. Google PaLM 2 AI Model: Everything You Need to Know. Beebom. (May 15, 2023) https://beebom.com/google-palm-2-ai-model/
Robert Barrie. Google opens limited access to Med-PaLM 2. Medical Device Network. (April 13, 2023) https://www.medicaldevice-network.com/news/google-opens-limited-access-to-med-palm-2/