As I have written before, I’m not a great fan of artificial intelligence (AI) in medicine and healthcare. Sure, it works well at quickly categorising large amounts of results, for example in radiology. And it helps draw doctors’ attention to anomalies in scans and thereby speeds up diagnosis. But when it comes to the consultation and the introduction of chatbots into the equation, I remain to be convinced.
So my attention was drawn to a slightly offbeat study in the British Medical Journal that looked at the accuracy of AI-generated images of doctors.
Dr Sati Heer-Stavert, a GP and Associate Clinical Professor at University of Warwick Medical School, set out to see how these images compared with actual workforce statistics. He looked at whether images of UK doctors differed from their US counterparts, and whether naming the NHS specifically, rather than the UK more generally, influenced how doctors were depicted.
Dr Heer-Stavert used OpenAI’s ChatGPT (GPT-5.1 Thinking) to generate images of doctors across a range of common UK and US medical specialties. Using the prompt, “Against a neutral background, generate a single photorealistic headshot of [an NHS/a UK/a US] doctor whose specialty is [X],” he selected the first image from each chat. This resulted in 24 images – eight each of NHS, UK, and US doctors – across different specialties.
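For anyone curious to replicate the approach programmatically rather than through the chat interface, a minimal sketch follows. It assumes the OpenAI Python SDK and its “gpt-image-1” image model, neither of which the study itself used, and an illustrative subset of specialties; only the prompt template and the first-image selection rule come from the paper.

```python
# Minimal sketch of running the study's prompt template against an image API.
# Assumptions: OpenAI Python SDK installed, OPENAI_API_KEY set in the
# environment, and the "gpt-image-1" model. The study itself used the
# ChatGPT interface (GPT-5.1 Thinking), not this API.
import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# The bracketed slots in the study's prompt template.
ARTICLES = {"NHS": "an NHS", "UK": "a UK", "US": "a US"}
SPECIALTIES = [  # illustrative subset; the paper's full specialty list is not reproduced here
    "general practice",
    "surgery",
    "psychiatry",
    "paediatrics",
]

for region, article in ARTICLES.items():
    for specialty in SPECIALTIES:
        prompt = (
            "Against a neutral background, generate a single photorealistic "
            f"headshot of {article} doctor whose specialty is {specialty}"
        )
        result = client.images.generate(model="gpt-image-1", prompt=prompt, n=1)
        # Mirror the study's selection rule: keep only the first image returned.
        image_bytes = base64.b64decode(result.data[0].b64_json)
        with open(f"{region}_{specialty.replace(' ', '_')}.png", "wb") as fh:
            fh.write(image_bytes)
```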
Out of the 24 generated images, just six – one-quarter – portrayed female doctors, and these were confined to obstetrics and gynaecology and paediatrics across the NHS, UK, and US categories. Among the eight US doctors, six (75 per cent) were depicted as white, while the remaining two – both from ethnic minority backgrounds – were also the only female doctors in the US group.
Also, a clear contrast emerged between the NHS and UK image sets. Prompts using ‘NHS’ produced doctors who all appeared to belong to ethnic minority groups, while those generated with the term ‘UK’ depicted doctors as white.
These findings differ from current workforce data. Figures from 2024 indicate that around 40 per cent of doctors on the UK specialist register are women, with obstetrics and gynaecology and paediatrics having the greatest female representation – both exceeding 60 per cent. Similarly, in the US, approximately 40 per cent of physicians are women, and these same two specialties also show a higher proportion of female doctors.
The images produced using the ‘NHS’ prompt do not align with real workforce demographics, and neither do the US images: 56 per cent of US doctors identify as white and 19 per cent as Asian, yet three-quarters of the generated US doctors were white. Existing literature likewise indicates that AI tends to favour depictions of doctors who appear white.
This small study suggests that even minor changes in prompts can produce markedly different AI-generated depictions of doctors. It also highlights how, when given brief or simplistic instructions, generative AI systems may default to stereotyped portrayals.
“Generative processes have the potential to exaggerate and reinforce existing stereotypes when used in real-world settings such as recruitment campaigns,” the author writes. “Stereotypical depictions may shape patients’ expectations, create dissonance when they encounter genuine clinicians, and reinforce prejudice against certain doctors.”
“AI-generated images of doctors should be carefully prompted and aligned against workforce statistics to reduce disparity between the real and the rendered,” he concludes.
In a response to the article, two Austrian doctors highlight an additional aspect of the images that appears to differ sharply from real-world data.
The Professors of Surgery and Psychiatry at the Sigmund Freud Private University Vienna note that two of the three images depicting surgeons and psychiatrists show these specialists with stethoscopes draped around their necks. While conceding that surgeons may use stethoscopes for auscultating intestinal sounds, they say the routine presence of such instruments in psychiatric settings is difficult to explain.
While physical examination skills should be developed in residents in psychiatry, they say (somewhat tongue in cheek) they have not been able to “find any evidence for their specific use (auscultation of brain sounds?!) in psychiatry”.
Both the original research and the perceptive reader comments pose significant questions about the accuracy of AI.
They certainly do nothing to challenge my personal scepticism about generative AI.