(After reading Stanford’s 500-page report, I suddenly realized how quickly AI in the medical field is progressing.)
Researchers at @ICepfl and @YaleMed have teamed up to build Meditron, an LLM suite for resource-poor medical environments. Built on Llama 3, their new model outperforms most open models in its parameter class on benchmarks such as MedQA and MedMCQA.
The researchers have released Meditron: an open-source suite of large multimodal foundation models tailored to the medical field, built to support clinical decision-making and diagnosis and to address the challenges of medical innovation in low-resource environments.
The original model was built on Llama 2; after Llama 3 was released, the research team fine-tuned a new 8B model within 24 hours: Llama-3[8B]-Meditron V1.0.
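For a concrete sense of what a fine-tune like this involves, here is a minimal sketch using the Hugging Face transformers Trainer. Everything in it is illustrative: the base-model ID is Meta's gated Llama 3 release, `medical_corpus.jsonl` is a hypothetical local data file, and the team's actual training setup is not reproduced here.

```python
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "meta-llama/Meta-Llama-3-8B"  # gated on the Hugging Face Hub
tok = AutoTokenizer.from_pretrained(base)
tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# Hypothetical local file of medical text, one {"text": ...} JSON object per line.
ds = load_dataset("json", data_files="medical_corpus.jsonl", split="train")
ds = ds.map(lambda b: tok(b["text"], truncation=True, max_length=2048),
            batched=True, remove_columns=ds.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="llama3-meditron-ft",
                           per_device_train_batch_size=1,
                           gradient_accumulation_steps=16,
                           num_train_epochs=1,
                           bf16=True),
    train_dataset=ds,
    # mlm=False gives the standard causal-LM objective (labels = input_ids)
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
)
trainer.train()
```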
Large language models (LLMs) have the potential to democratize access to medical knowledge. Although many efforts have been made to leverage and improve the medical knowledge and reasoning capabilities of LLMs, the resulting models are either closed-source (e.g., PaLM, GPT-4) or limited in scale (≤ 13B parameters), which constrains their capabilities.
This work improves access to large-scale medical LLMs by releasing MEDITRON: a pair of open-source LLMs with 7B and 70B parameters adapted to the medical domain. MEDITRON builds on Llama-2 (through an adaptation of Nvidia’s Megatron-LM distributed trainer) and extends pretraining on a comprehensively curated medical corpus, including selected PubMed articles and abstracts and internationally recognized medical guidelines.
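As a quick way to try the released weights, here is a minimal inference sketch, assuming the checkpoints published on the Hugging Face Hub under epfl-llm/meditron-7b. The prompt is purely illustrative, and the model is a research artifact, not a source of medical advice.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "epfl-llm/meditron-7b"  # assumed Hub ID for the released 7B weights
tok = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" requires the accelerate package
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = ("Question: What is the first-line treatment for "
          "uncomplicated hypertension?\nAnswer:")
inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128, do_sample=False)
# Print only the newly generated continuation, not the prompt.
print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```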
Evaluations on four major medical benchmarks show significant performance gains over several state-of-the-art baselines, both before and after task-specific fine-tuning.
Overall, MEDITRON achieves an absolute performance gain of 6% over the best public baseline in its parameter class and 3% over the strongest baseline the team fine-tuned from Llama-2. Against closed-source LLMs, MEDITRON-70B outperforms GPT-3.5 and Med-PaLM, and comes within 5% of GPT-4 and within 10% of Med-PaLM-2.
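For context on how numbers like these are produced, here is a sketch of one common way multiple-choice medical benchmarks such as MedQA are scored: pick the option whose tokens the model assigns the highest average log-likelihood. The exact prompts and evaluation harness used in the paper may differ, so treat this as an assumption-laden illustration.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "epfl-llm/meditron-7b"  # assumed Hub ID for the released weights
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id).eval()

def option_score(question: str, option: str) -> float:
    """Average log-probability of the option's tokens given the question.

    Assumes the tokenization of the question is a prefix of the tokenization
    of question + option, which holds for typical Llama-style tokenizers.
    """
    n_prompt = tok(question, return_tensors="pt").input_ids.shape[1]
    full_ids = tok(question + " " + option, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    logp = torch.log_softmax(logits[0, :-1], dim=-1)  # position t predicts token t+1
    targets = full_ids[0, n_prompt:]                  # the option's tokens
    rows = torch.arange(n_prompt - 1, full_ids.shape[1] - 1)
    return logp[rows, targets].mean().item()

question = "Question: A deficiency of which vitamin causes scurvy? Answer:"
options = ["Vitamin A", "Vitamin B12", "Vitamin C", "Vitamin D"]
print(max(options, key=lambda o: option_score(question, o)))  # expect: Vitamin C
```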
The team also released the code for curating the medical pretraining corpus, along with the MEDITRON model weights, to drive open-source development of more capable medical LLMs.
If you want to learn more, you can click on the link below the video.
Thank you for watching this video. If you liked it, please like and subscribe. Thanks!
Link: https://meditron-ddx.github.io/llama3-meditron.github.io/