UC researchers win grant to develop pandemic prediction technology
(By Isabella Lee/Illustrations Director)
By Anna Dai-Liu
Oct. 2, 2022 10:13 p.m.
A team of researchers from UCLA and UC Irvine won a grant in September to design a system to detect signs of future pandemics using artificial intelligence.
The 18-month grant, for just under $996,000, was awarded through the National Science Foundation’s Predictive Intelligence for Pandemic Prevention program, according to the NSF. Drawing on open-source intelligence – which includes data from social media, news, hospital records and other publicly available sources – the researchers will use machine learning models and artificial intelligence to predict outbreaks not only within current pandemics but also in future pandemics that may involve unknown diseases, said Wei Wang, a professor of computer science and computational medicine and the principal investigator for the grant.
Violet Peng, an assistant professor of computer science and co-principal investigator on the grant, said the project contains four components: identifying outbreaks, modeling local disease spread, predicting possible outbreaks and visualizing the data in a publicly accessible form.
Wang said that after she realized in spring 2020 that the COVID-19 pandemic would be more severe than anyone had predicted, she began studying how social media could be used to identify events affecting the severity of the pandemic. For example, people may tweet about having fevers or needing to buy thermometers, which could be interpreted as signs of a disease spreading, said Andrew Noymer, a co-principal investigator and an associate professor of population health and disease prevention at UCI.
“People overshare on social media all the time, but that can be exploited to our advantage,” Noymer said. “If we noticed patterns of hundreds of people all of a sudden tweeting about fever or pneumonia in such and such a city, that might be useful information.”
Peng, whose research focuses on natural language processing, or building systems to understand and transform human-generated language, said she will use Twitter and news data from the past two years to train computers to extract these individual signals and categorize them into meaningful patterns.
“These are things that appear to be maybe not directly related to (a) pandemic,” Peng said. “We take these early warnings from the crowd, … and then we process them, and then we identify what’s abnormal.”
However, computers are limited in how efficiently they can process large amounts of data, Wang said. She added that collaborators at UCI are working on a system called Texera, which will support the storage of results computed from data and the retrieval and processing of real-time data.
Peng also said verification of results by human experts is initially required to teach the computer to differentiate between significant events and background noise.
“(For) computers, you can set the threshold for ‘unusual’ (events), you know, to be so stringent that you might miss a pandemic, and you can set the threshold for ‘unusual’ to be so ordinary that you’ll get lots of false alarms,” Noymer said. “You need a person to sort of help guide the algorithm and help interpret the results.”
He added that both statistical analysis and an understanding of historical context are required to make those distinctions. If a city is known to be prone to outbreaks of a certain disease, then an outbreak there could be ignored – but, if there is no record of a disease appearing before, then that outbreak might be worth calling attention to, Noymer said.
Since each social media post can be associated with a specific timestamp, the program will eventually be able to find and compare posts from similar times to evaluate whether the frequency is high enough to be considered abnormal, Peng added.
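As a rough illustration of the threshold trade-off Noymer describes and the frequency comparison Peng outlines, the Python sketch below flags a day as abnormal when its post count sits far above the average for comparable past days. It is not the researchers' method: the z-score rule, the function names, the default threshold and the sample counts are all assumptions made up for this example.

```python
from statistics import mean, stdev

def z_score(baseline_counts, current_count):
    """How many standard deviations current_count sits above the baseline mean."""
    mu = mean(baseline_counts)
    sigma = stdev(baseline_counts)  # sample standard deviation; needs 2+ values
    if sigma == 0:
        # Flat baseline: any increase is treated as infinitely unusual.
        return float("inf") if current_count > mu else 0.0
    return (current_count - mu) / sigma

def is_abnormal(baseline_counts, current_count, threshold=3.0):
    """Flag the count when its z-score exceeds the chosen threshold (assumed rule)."""
    return z_score(baseline_counts, current_count) > threshold

# Hypothetical daily counts of posts mentioning "fever" in one city,
# taken from comparable days in the past (invented numbers).
baseline = [12, 15, 11, 14, 13, 12, 16]

spike_flagged = is_abnormal(baseline, 40)                    # large jump
normal_flagged = is_abnormal(baseline, 15)                   # ordinary day
spike_missed = not is_abnormal(baseline, 40, threshold=20.0) # too stringent
false_alarm = is_abnormal(baseline, 15, threshold=0.5)       # too lenient
```

With these invented numbers, a very high threshold lets the jump to 40 posts go unflagged, while a very low one flags an ordinary day of 15 posts, which is the kind of tuning the researchers say needs human guidance.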
The researchers did note challenges and limitations within their plan. People from rural areas may post online at lower frequencies than those in urban areas, meaning those regions may be underrepresented in the models created, Wang said.
Peng said the lack of tools proficient at processing languages besides English also poses a challenge, as the system may be less efficient in processing non-English media. However, she added that, if they are able to successfully develop a multilingual system, she hopes it can then be used in developing regions in Africa, where outbreaks of disease are often ignored by the world despite causing significant damage.
Even if predictions can be made, it will also be a challenge to ensure that people take the appropriate precautions to prevent these outbreaks from growing, Noymer added.
“The next thing – it could be even worse than COVID,” he said. “The slam dunk, best-case scenario would be in which you nip the pandemic in the bud, and the world doesn’t even ever really hear about it because you’re successful.”