Towards Safe Large Language Models for Medicine

Han, Tessa; Kumar, Aounon; Agarwal, Chirag; Lakkaraju, Himabindu

arXiv:2403.03744 (cs)

[Submitted on 6 Mar 2024 (v1), last revised 14 May 2024 (this version, v3)]

Abstract: As large language models (LLMs) develop ever-improving capabilities and are applied in real-world settings, it is important to understand their safety. While initial steps have been taken to evaluate the safety of general-knowledge LLMs, exposing some weaknesses, the safety of medical LLMs has not been sufficiently evaluated despite the high risks they pose to personal health and safety, public health and safety, patient rights, and human rights. To address this gap, we conduct what is, to our knowledge, the first study to evaluate and improve the safety of medical LLMs. We find that 1) current medical LLMs do not meet standards of general or medical safety, as they readily comply with harmful requests, and that 2) fine-tuning medical LLMs on safety demonstrations significantly improves their safety, reducing their tendency to comply with harmful requests. In addition, we present a definition of medical safety for LLMs and develop a benchmark dataset to evaluate and train for medical safety in LLMs. Positioned at the intersection of research on machine learning safety and medical machine learning, this work sheds light on the current state of medical LLM safety and motivates future work in this area, with the aim of mitigating the risks of harm from LLMs in medicine.

Subjects:	Artificial Intelligence (cs.AI)
Cite as:	arXiv:2403.03744 [cs.AI]
	(or arXiv:2403.03744v3 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2403.03744

Submission history

From: Tessa Han
[v1] Wed, 6 Mar 2024 14:34:07 UTC (41 KB)
[v2] Wed, 1 May 2024 12:24:04 UTC (386 KB)
[v3] Tue, 14 May 2024 00:30:54 UTC (386 KB)
