The COVID-19 pandemic disrupted medical education across the world, forcing educators and students to devise innovative solutions to continue learning. Before the pandemic, medical students’ clinical skills were assessed by the National Board of Medical Examiners as part of licensing exams, which included a computer-administered test of clinical reasoning and an in-person clinical skills assessment that students typically traveled to take. Due to travel restrictions, these in-person clinical skills tests have been halted indefinitely, leaving academic institutions to decide how to assess their own students.

“There’s a big vacuum in assessing medical student clinical skills, and there’s a national discussion around how it should be done,” said Claudio Violato, PhD, professor in the University of Minnesota Medical School’s Division of General Internal Medicine and assistant dean of assessment, evaluation and research. “Here at the University, we have a real opportunity to fill that vacuum because we had a project using artificial intelligence to assess clinical skills underway even before the pandemic.”

AI in Testing

Through a partnership with the new M Simulation Center, Dr. Violato and his team are using artificial intelligence (AI) to score student performance on both written and in-person components of clinical skills assessment. 

“One of the most challenging aspects of assessing medical student clinical competence is measuring skills and clinical attitudes,” Dr. Violato said. “One of the real promises of improving this is using artificial intelligence tools.”

During the clinical skills assessment, students interact with and diagnose standardized patients – trained actors who portray the symptoms of a particular diagnosis. The exam takes a full day, with 12 patient stations for each of the 257 students. After each encounter, students produce a SOAP note, a common format for documenting a patient-physician interaction and diagnosis in four sections (subjective, objective, assessment and plan). These notes have to be graded, and with 257 students, 12 stations and four sections per note, there are 12,336 items to score – a job that would traditionally take over a month. The AI program uses natural language processing to score each student’s diagnosis in real time, providing immediate insight into their performance.
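
The article doesn’t specify the underlying model, but one common natural language processing baseline for automated note scoring is to compare each student note against a faculty-written reference note using TF-IDF cosine similarity. Below is a minimal sketch of that idea in Python; the function name and sample notes are hypothetical illustrations, not the team’s actual system.

```python
# Hypothetical sketch: score a student SOAP note against a faculty
# reference note using TF-IDF cosine similarity (scikit-learn).
# This is one common baseline, not the University's actual model.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def score_note(student_note: str, reference_note: str) -> float:
    """Return a 0-1 similarity score between student and reference notes."""
    vectorizer = TfidfVectorizer(stop_words="english")
    tfidf = vectorizer.fit_transform([reference_note, student_note])
    return float(cosine_similarity(tfidf[0], tfidf[1])[0, 0])

# Illustrative example notes (invented for this sketch).
reference = "Assessment: community-acquired pneumonia. Plan: chest x-ray, empiric antibiotics."
student = "Likely pneumonia; order a chest x-ray and start antibiotics."
print(f"similarity score: {score_note(student, reference):.2f}")
```

In practice a production system would add clinically tuned features and trained scoring weights, but the appeal is the same as described above: once configured, the scoring runs in seconds per note rather than weeks per cohort.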

“The other advantage is that it doesn’t make errors,” Dr. Violato said. “Human judgment, whether in medicine, law, finance or just about anywhere, can be inaccurate. AI improves the reliability and validity of the assessments.”

The M Simulation Center successfully piloted the program in July and August and will scale it across the Medical School – meaning all students will have the opportunity to complete the assessment during their third year of training as a graduation requirement.

The team is also applying AI to video recordings of physician-patient interactions, developing algorithms that analyze these encounters and measure qualities like nonverbal communication, empathy and eye contact. In the past, these interactions were subject to interpretation by a physician assessor, which could carry some level of bias. Automated analysis also relieves some of the burden of training assessors to grade these interactions.
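
The team’s video algorithms aren’t described in the article. As a loose illustration of the general approach, the sketch below uses OpenCV’s stock frontal-face detector to estimate the fraction of sampled frames in which the student is facing the camera – a very crude stand-in for the eye-contact and nonverbal measures mentioned above. The video filename and sampling rate are hypothetical.

```python
# Hypothetical sketch: a crude nonverbal-communication proxy from video.
# Counts the fraction of sampled frames containing a detectable frontal
# face, using OpenCV's bundled Haar cascade. Illustrative only; not the
# team's actual algorithm.
import cv2

def frontal_face_fraction(video_path: str, sample_every: int = 15) -> float:
    """Return the fraction of sampled frames with a frontal face detected."""
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    capture = cv2.VideoCapture(video_path)
    sampled = detected = frame_index = 0
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        if frame_index % sample_every == 0:
            sampled += 1
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            if len(detector.detectMultiScale(gray, 1.1, 5)) > 0:
                detected += 1
        frame_index += 1
    capture.release()
    return detected / sampled if sampled else 0.0

print(frontal_face_fraction("encounter_recording.mp4"))  # hypothetical file
```

A real system measuring empathy or eye contact would need far richer signals (gaze estimation, speech analysis, labeled training data), but even a simple automated measure is applied identically to every student, which is the consistency advantage Dr. Violato describes.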

“AI can learn from experience as it processes more information, so it has capacity as both a teaching and assessment tool,” Dr. Violato said.

The Future of AI

In partnership with Kevin Peterson, MD, professor in the Department of Family Medicine and Community Health, the team is also teaching clinical skills using AI. Dr. Peterson and his collaborators have developed a library of clinical cases covering more than 800 conditions.

“Students can learn about the cases and answer questions, which assess what they’ve learned using AI and provide immediate feedback,” Dr. Violato said.
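
The mechanics of that feedback loop aren’t spelled out in the article. A toy sketch of the idea, assuming a simple per-case keyword rubric (the rubric contents and function are illustrative only):

```python
# Hypothetical sketch: immediate feedback on a case question by checking
# a student's free-text answer against rubric keywords. Illustrative
# only; not Dr. Peterson's actual system.
def give_feedback(answer: str, rubric: dict[str, str]) -> list[str]:
    """Return a hint for each rubric item missing from the answer."""
    text = answer.lower()
    feedback = [f"Missing '{keyword}': {hint}"
                for keyword, hint in rubric.items() if keyword not in text]
    return feedback or ["All key findings covered."]

# Invented rubric for a pneumonia case.
rubric = {
    "chest x-ray": "Imaging is needed to confirm suspected pneumonia.",
    "antibiotics": "Empiric antibiotics are first-line treatment.",
}
for line in give_feedback("I would order a chest x-ray.", rubric):
    print(line)
```

The point of even this crude version is the turnaround: the student sees what they missed immediately after answering, rather than waiting for a grader.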

To get educators and students on board with using AI, Dr. Violato compares teachers to airplane pilots.

“We’re not going to eliminate teachers – you still need a pilot to fly your plane, but there’s autopilot – which can do much better, more efficiently under the supervision of the pilot,” he said. “It takes away a lot of the hard labor or potential errors that can happen from human judgment.”

With academic institutions everywhere working to devise new ways to assess clinical skills in the absence of traditional board exams, Dr. Violato intends to share the team’s findings for others to learn from.

“It takes a village to do this. We want to show our colleagues across the country that this is a demonstration project of what can be done,” Dr. Violato said. “This really does require collaboration from a number of experts, and we managed to bring a team together, so I’m pretty excited about that.”