Title: Leveraging AI to Combat Misinformation by Empowering Crowds and Evaluating Detectors


Bing He

School of Cybersecurity and Privacy

Georgia Institute of Technology


Date: Monday, May 6, 2024

Time: 9:00 am - 11:00 am ET

Location: https://gatech.zoom.us/j/7088026994?pwd=bDlKVzVzVEhLZlN3MExvV1pRWWJCdz09


Committee:

Dr. Mustaque Ahamad (advisor), School of Cybersecurity and Privacy, Georgia Institute of Technology

Dr. Srijan Kumar (advisor), School of Computational Science and Engineering, Georgia Institute of Technology

Dr. Frank Li, School of Cybersecurity and Privacy, Georgia Institute of Technology

Dr. Munmun De Choudhury, School of Interactive Computing, Georgia Institute of Technology

Dr. Nasir Memon, Tandon School of Engineering, New York University - Shanghai


Abstract:

Online misinformation has become a global risk with threatening real-world consequences. To combat it, existing research focuses on leveraging professionals, including journalists and fact-checkers, to annotate and debunk misinformation, and on developing automatic machine learning (ML) methods to detect misinformation and its spreaders. However, the efficacy of professionals is constrained by their small numbers, and the vulnerabilities of deep sequence embedding-based detection systems are rarely examined. To complement professionals, ordinary non-expert users (a.k.a. crowds) can act as eyes on the ground, proactively questioning and countering misinformation, and they show promise in combating it. However, little is known about how these crowds organically combat misinformation. Concurrently, AI has progressed dramatically, demonstrating its potential to help combat misinformation.


In this thesis, we leverage AI to investigate the aforementioned challenges and to provide insights and solutions. First, we use advanced AI techniques to characterize the spread and textual properties of crowd-generated counter-misinformation during the COVID-19 pandemic, as well as the characteristics of the users who produce it. Among our findings, 96% of counter-misinformation is made by crowds, demonstrating their critical role in combating misinformation. We further characterize user responses to this counter-misinformation to investigate which kinds of counter-misinformation have corrective or backfire effects in practice. We find that 2 out of 3 counter-responses are rude and lack evidence, which may cause a backfire effect. To address this, we first create novel datasets of misinformation and counter-misinformation reply pairs, and then propose a reinforcement learning-based AI algorithm, called MisinfoCorrect, that learns to generate high-quality counter-misinformation responses to an input misinformation post (a toy sketch of such a training loop appears below). Our work illustrates the promise of AI for empowering crowds to combat misinformation.

Second, we evaluate existing deep sequence embedding-based classification models for detecting malicious users (e.g., misinformation spreaders). These models typically encode the sequence of a user's posts into a user embedding and use it to detect bad actors on social media platforms (also sketched below). We evaluate the robustness of these detectors by proposing a novel end-to-end AI algorithm, called PETGEN (Personalized Text Generation Attack model), that simultaneously reduces the efficacy of the detection model and generates high-quality personalized posts. We then propose a robust sequential bad-actor detection model to defend against such adversarial attacks. Through this work, we pave the path toward the next generation of adversary-aware bad-actor detection systems.
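To make the first thrust concrete, below is a minimal, self-contained sketch of a REINFORCE-style loop for reward-driven counter-response generation. The tiny vocabulary, the stub reward terms (refutation, evidence, politeness), and their weights are illustrative assumptions, not the MisinfoCorrect implementation; conditioning on the input misinformation post is omitted for brevity.

```python
# Hedged sketch: REINFORCE-style training of a toy counter-response policy.
# All names and reward weights are illustrative assumptions.
import torch
import torch.nn as nn

VOCAB = ["<s>", "that", "claim", "is", "false", "see", "cdc.gov", "thanks"]

class TinyPolicy(nn.Module):
    """Toy autoregressive policy producing next-token logits."""
    def __init__(self, vocab_size, hidden=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, vocab_size)

    def forward(self, tokens):                  # tokens: (1, T) token ids
        h, _ = self.rnn(self.embed(tokens))
        return self.head(h[:, -1])              # logits for the next token

def reward(words):
    # Stub reward: a real system would use trained scorers for signals
    # such as refutation strength, evidence, and politeness.
    refutes = 1.0 if "false" in words else 0.0
    cites = 0.5 if "cdc.gov" in words else 0.0
    polite = 0.5 if "thanks" in words else 0.0
    return refutes + cites + polite

policy = TinyPolicy(len(VOCAB))
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)

for step in range(200):
    tokens = torch.zeros(1, 1, dtype=torch.long)      # start token <s>
    log_probs = []
    for _ in range(6):                                # sample a short response
        dist = torch.distributions.Categorical(logits=policy(tokens))
        action = dist.sample()                        # next token id, shape (1,)
        log_probs.append(dist.log_prob(action))
        tokens = torch.cat([tokens, action.unsqueeze(0)], dim=1)
    words = [VOCAB[i] for i in tokens[0, 1:].tolist()]
    loss = -reward(words) * torch.stack(log_probs).sum()  # policy gradient
    opt.zero_grad(); loss.backward(); opt.step()
```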
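For the second thrust, the sketch below shows the general shape of a deep sequence embedding-based detector of the kind the thesis evaluates: a user's post sequence is encoded into a single user embedding that a classifier scores. The specific architecture here (mean-pooled token embeddings per post, a GRU over the post sequence) is an illustrative assumption, not the evaluated models' exact design; an attack in the spirit of PETGEN would append generated posts to such a sequence to flip the detector's decision.

```python
# Hedged sketch of a sequence embedding-based bad-actor detector.
# Architecture details are illustrative assumptions.
import torch
import torch.nn as nn

class SequenceUserDetector(nn.Module):
    def __init__(self, vocab_size=1000, dim=32):
        super().__init__()
        self.tok_embed = nn.Embedding(vocab_size, dim)
        self.post_rnn = nn.GRU(dim, dim, batch_first=True)  # runs over posts
        self.classifier = nn.Linear(dim, 2)                 # benign vs. malicious

    def forward(self, posts):
        # posts: (num_posts, tokens_per_post) token ids for one user
        post_vecs = self.tok_embed(posts).mean(dim=1)       # one vector per post
        _, user_h = self.post_rnn(post_vecs.unsqueeze(0))   # post sequence -> state
        user_embedding = user_h[-1]                         # (1, dim) user embedding
        return self.classifier(user_embedding)              # class logits

detector = SequenceUserDetector()
fake_posts = torch.randint(0, 1000, (5, 12))  # 5 posts, 12 tokens each
logits = detector(fake_posts)
print(logits.softmax(dim=-1))                 # P(benign), P(malicious)
```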