Hippocrates: An open source machine learning framework to advance large language models in health care

Researchers at Koç University, Hacettepe University, Yıldız Technical University and Robert College have released "Hippocrates", an open source framework built specifically for LLM applications in healthcare. Unlike previous medical models that relied on proprietary data, Hippocrates gives full access to its resources, which is meant to encourage innovation and collaboration in medical AI research. The framework stands out for combining continued pre-training with reinforcement learning from expert feedback, with the goal of making the models more useful in medical settings.

The Hippocrates framework takes a systematic approach that starts with continued pre-training on a comprehensive corpus of medical text. The resulting models, including the Hippo series of 7B-parameter models, are then fine-tuned on specialized datasets such as MedQA and PMC-Patients. This stage uses instruction tuning and reinforcement learning techniques to align model output with expert medical judgment. For evaluation, the team uses the EleutherAI evaluation framework to test the models against a range of medical benchmarks and check their effectiveness and reliability.
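To make the evaluation step more concrete, here is a minimal sketch of how accuracy on a multiple-choice medical benchmark is commonly computed: the model assigns a log-likelihood to each answer option, the highest-scoring option counts as its prediction, and accuracy is the share of questions where that prediction matches the reference answer. This is an illustration under stated assumptions, not the authors' evaluation code. The function names `pick_answer` and `accuracy` are hypothetical, and the scores are made-up toy values, not results from the paper.

```python
from typing import Sequence


def pick_answer(option_loglikelihoods: Sequence[float]) -> int:
    """Return the index of the answer option with the highest log-likelihood."""
    return max(
        range(len(option_loglikelihoods)),
        key=lambda i: option_loglikelihoods[i],
    )


def accuracy(all_scores: Sequence[Sequence[float]], gold: Sequence[int]) -> float:
    """Fraction of questions whose top-scored option matches the gold index."""
    if len(all_scores) != len(gold):
        raise ValueError("need exactly one gold answer per question")
    if not gold:
        return 0.0
    correct = sum(pick_answer(s) == g for s, g in zip(all_scores, gold))
    return correct / len(gold)


# Toy, made-up log-likelihoods for three four-option questions (illustration only).
toy_scores = [
    [-4.2, -1.3, -3.8, -5.0],
    [-2.0, -2.5, -0.7, -3.1],
    [-1.1, -0.9, -2.2, -4.0],
]
toy_gold = [1, 2, 0]  # hypothetical reference answers

toy_accuracy = accuracy(toy_scores, toy_gold)
print(f"accuracy = {toy_accuracy:.3f}")
```

In a real run, the per-option scores would come from the language model itself, and the questions and reference answers from a benchmark such as MedQA.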

Integrating large language models (LLMs) into healthcare promises to transform medical diagnosis, research and patient care. However, developing medical LLMs faces obstacles such as complex training requirements, demanding evaluation standards, and the dominance of proprietary models that limit academic exploration. Transparent and comprehensive access to LLM resources is critical for advancing the field, supporting reproducibility, and encouraging innovation in healthcare AI.
To address this, the paper introduces Hippocrates, an open source LLM framework developed specifically for the medical domain. In sharp contrast to previous efforts, it provides unrestricted access to its training datasets, code base, checkpoints and evaluation protocols. This open approach is designed to promote collaborative research and let the community build, refine and rigorously evaluate medical LLMs in a transparent ecosystem.
The paper also introduces Hippo, a family of 7B models tailored for the medical domain, fine-tuned from Mistral and LLaMA2 through continued pre-training, instruction tuning, and reinforcement learning from human and AI feedback. According to the authors, these models outperform existing open medical LLMs by a large margin and even surpass models with 70B parameters.
Through Hippocrates, the authors hope not only to unlock the full potential of LLMs for advancing medical knowledge and patient care, but also to democratize the benefits of AI research in healthcare and make them available worldwide.

Original address: https://marktechpost.com/2024/04/30/hippocrates-an-open-source-machine-learning-framework-for-advancing-large-language-models-in-healthcare/

Paper: https://arxiv.org/abs/2404.16621
