Can AI Identify Patients With Long COVID?
Long COVID is a condition in which people experience long-term effects after infection with SARS-CoV-2, the virus that causes coronavirus disease 2019 (COVID-19), according to the U.S. Centers for Disease Control and Prevention (CDC). A new study published in The Lancet Digital Health applies machine learning, a branch of artificial intelligence (AI), to data from electronic health records to identify patients with long COVID with high accuracy.
“Patients identified by our models as potentially having long COVID can be interpreted as patients warranting care at a specialty clinic for long COVID, which is an essential proxy for long COVID diagnosis as its definition continues to evolve,” the researchers concluded. “We also achieve the urgent goal of identifying potential long COVID in patients for clinical trials.”
Globally, there have been more than 510 million confirmed cases of COVID-19 and more than 6.2 million deaths, according to April 2022 statistics from Johns Hopkins University. Patients with long COVID have persistent or new symptoms more than four weeks after a SARS-CoV-2 infection.
According to the CDC, there is no test for long COVID, which makes the chronic condition difficult for healthcare professionals to identify. Long COVID symptoms may vary widely and affect multiple organ systems such as the brain, lungs, digestive tract, and kidneys. Examples of long COVID symptoms include tiredness, fatigue, fever, post-exertional malaise, cough, chest pain, difficulty breathing, joint or muscle pain, rash, changes in menstrual cycles, diarrhea, stomach pain, shortness of breath, heart palpitations, brain fog, headache, sleep issues, lightheadedness, pins-and-needles feelings, change in smell or taste, depression, and anxiety.
The researchers built their AI models with XGBoost (Extreme Gradient Boosting), an open-source library with a Python interface that implements gradient-boosted decision trees. XGBoost is widely used, computationally fast, and supports gradient boosting machines, stochastic gradient boosting, and regularized gradient boosting. The models used 924 features.
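To give a feel for what gradient boosting with decision trees does, here is a minimal sketch in plain Python. It is not XGBoost and not the study's model: it uses one-split trees ("stumps"), a simple first-order update, and no regularization. The three features, the labeling rule, and the function names (`fit_stump`, `fit_boosted_stumps`, `predict_score`) are invented for illustration, and the data are synthetic rather than patient records.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(X, residuals):
    """Least-squares fit of a one-split tree (feature, threshold, two leaf values)."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds are midpoints between neighboring feature values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            sse = (sum((r - left_mean) ** 2 for r in left)
                   + sum((r - right_mean) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, t, left_mean, right_mean)
    _, j, t, left_mean, right_mean = best
    return j, t, left_mean, right_mean


def fit_boosted_stumps(X, y, n_rounds=15, learning_rate=0.5):
    """Each round fits a stump to the residuals (label minus predicted probability)."""
    scores = [0.0] * len(X)  # running log-odds for every training row
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        j, t, left_mean, right_mean = fit_stump(X, residuals)
        stumps.append((j, t, left_mean, right_mean))
        scores = [s + learning_rate * (left_mean if row[j] <= t else right_mean)
                  for s, row in zip(scores, X)]
    return stumps


def predict_score(stumps, row, learning_rate=0.5):
    """Sum the scaled leaf values of all stumps to get the log-odds for one row."""
    return sum(learning_rate * (lm if row[j] <= t else rm)
               for j, t, lm, rm in stumps)


# Synthetic data: three made-up features; only the first two drive the label.
random.seed(0)
X = [[random.random(), random.random(), random.random()] for _ in range(120)]
y = [1 if row[0] + 0.5 * row[1] > 0.75 else 0 for row in X]

stumps = fit_boosted_stumps(X, y)
preds = [1 if sigmoid(predict_score(stumps, row)) >= 0.5 else 0 for row in X]
train_accuracy = sum(p == t for p, t in zip(preds, y)) / len(y)
print(f"training accuracy: {train_accuracy:.2f}")
```

Each new stump corrects part of the error left by the stumps before it, which is the core idea behind boosting. A library such as XGBoost adds deeper trees, regularization, row and column subsampling, and much faster split finding on top of this idea.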
The study used data from the National COVID Cohort Collaborative (N3C) repository, a database sponsored by the National Center for Advancing Translational Sciences (NCATS) at the National Institutes of Health (NIH), with electronic health records from more than 8 million patients who tested positive for SARS-CoV-2 across 65 U.S. sites. The researchers created a subset of patients from three N3C sites who attended a long COVID clinic.
“Our models identified, with high accuracy, patients who potentially have long COVID, achieving areas under the receiver operator characteristic curve of 0·92 (all patients), 0·90 (hospitalized), and 0·85 (non-hospitalized),” wrote the study authors.
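The area under the receiver operating characteristic curve (AUROC) quoted above can be read as the probability that a model scores a randomly chosen positive patient higher than a randomly chosen negative one, so 0.5 is chance and 1.0 is perfect ranking. The sketch below computes it that way in plain Python. The labels and scores are made-up toy values, not data from the study, and the `auroc` helper is a hypothetical name used only for illustration.

```python
def auroc(labels, scores):
    """AUROC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 1 = attended a long COVID clinic, 0 = did not (invented values).
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.35, 0.4, 0.3, 0.2, 0.1]
example_auc = auroc(labels, scores)
print(f"AUROC: {example_auc:.3f}")
```

In this toy case, 11 of the 12 positive/negative pairs are ranked correctly. The pairwise loop is easy to follow but slow on large cohorts; in practice a sort-based routine from a statistics library would be the usual choice.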
The research was funded by the US National Institutes of Health and National Center for Advancing Translational Sciences through the RECOVER Initiative and included scientists affiliated with The N3C Consortium, Johns Hopkins University, Palantir Technologies, the University of Colorado Anschutz Medical Campus, Stony Brook Cancer Center, Northeastern University, University of Texas Medical Branch at Galveston, the University of North Carolina at Chapel Hill, and the UNC-Chapel Hill School of Medicine.
Copyright © 2022 Cami Rosso All rights reserved.