Cybersecurity and AI
The University-wide AI Institute at the University of South Carolina (AIISC) has around 50 researchers conducting world-class foundational and translational research and education in AI. In this note, I share the AI and Cybersecurity-related research (including data/corpus development and neurosymbolic AI model training) and technology efforts in my group at AIISC. AIISC has a GPU cluster to train advanced, target/task-specific AI models from scratch. AIISC faculty teach a broad variety of AI courses ranging from introductory to very advanced, run an AI Summer Camp for high school students, publish ~100 papers at top conferences and journals annually, give numerous keynotes, invited talks, and tutorials, organize international seminars/workshops, and collaborate with regional companies of all sizes. We may also be able to expand our efforts in workforce development and training in Cybersecurity and AI (proposal pending).
Amit Sheth, amit@sc.edu; AIISC on LinkedIn; GScholar; AIISC (founding director: 2019-2023); NCR Chair & Professor, Computer Science & Engineering; Fellow: IEEE, ACM, AAAI, AAAS, AIAA
1: Generative AI, Trust & Security: Fireside Chat at TrustCon'24 (July 2024)
Key research themes in Cybersecurity and AI
The extraordinary benefits of large generative AI models such as GPT(s), LLaMa, Stable Diffusion, MidJourney, and many others also come with a substantial risk of misuse. The alarm is reflected in the March 2023 open letter by thousands of researchers and tech leaders calling for a six-month moratorium on training AI systems more sophisticated than GPT-4. The central concern is: "Should we let machines flood our information channels with propaganda and untruth?" While individual viewpoints on a moratorium may vary, the concern raised is significant and warrants attention. The latest (seventh) evaluation of the European Commission's Code of Conduct on countering illegal hate speech online already shows a decline in companies' responsiveness (detailed under "AI-generated content" below), a decline that likely reflects the growing influx of Gen AI-generated content on the web. Moreover, the rise of LLM-empowered agents taking over various automation tasks increases the risk to overall cyber-physical security.
Several threat vectors have emerged with the excessive power of generative AI tools, including: i) AI-written code, ii) agent-based multiplication of AI-driven code attacks, iii) large-scale theft of proprietary data, including company data, which can be summarized to identify employees, relationships, and assets, iv) various forms of adversarial attacks such as jailbreaks and performance degradation, and v) highly complex cybersecurity attacks involving the integration of code, APIs, data, and autonomous agents, creating new vulnerabilities and potential gaps in corporate network security.
As a preventive measure, governments worldwide have initiated discussions and implemented policies related to AI systems. The European Union has taken a definitive stance by enacting legislation, while the United States and other countries have introduced preliminary proposals regarding the regulatory framework for AI. It is therefore time to revisit cybersecurity issues in the age of Generative AI. The themes below cover the major axes of cybersecurity threats emerging in this new era of generative AI.
Example topics we investigate include the following.
Hallucination
The Cambridge Dictionary named "hallucinate" its word of the year for 2023, signaling that this is among the most challenging obstacles in Generative AI development. Google has faced two fiascos related to hallucinations in generative AI models over the past 18 months. The launch of Bard, a competitor to ChatGPT, turned into a debacle when it produced a factually inaccurate response in an advertisement, causing a $140 billion market value loss for Google. Later, Google had to disable the generative image capabilities of Gemini after it began depicting historical figures, such as the pope and Nazis, inaccurately as people of color. Additionally, there have been notable lawsuits against ChatGPT citing hallucinations, involving instances where lawyers and journalists used the tool and encountered inaccuracies. We investigate the following aspects (a minimal detection sketch follows the list):
• Hallucination categorization
• Hallucination detection
• Hallucination quantification
• Hallucination avoidance
• Hallucination mitigation
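To make the detection aspect concrete, below is a minimal sketch, not the AIISC system: a generated claim is treated as a hypothesis and checked for entailment against a trusted source passage with a public NLI checkpoint. The model name (roberta-large-mnli), the example texts, and the scoring rule are illustrative assumptions.

```python
# Minimal illustrative sketch of hallucination detection via entailment checking.
# "roberta-large-mnli" is a public NLI checkpoint used here only for illustration;
# this is not the AIISC detection pipeline.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-large-mnli")
model = AutoModelForSequenceClassification.from_pretrained("roberta-large-mnli")

def hallucination_score(source: str, claim: str) -> float:
    """Return 1 - P(entailment): higher means the claim is less supported by the source."""
    inputs = tokenizer(source, claim, return_tensors="pt", truncation=True)
    with torch.no_grad():
        probs = torch.softmax(model(**inputs).logits, dim=-1)[0]
    entail_idx = model.config.label2id["ENTAILMENT"]
    return 1.0 - probs[entail_idx].item()

source = "The Eiffel Tower was completed in 1889 for the World's Fair in Paris."
claim = "The Eiffel Tower was built in 1925."
print(f"hallucination score: {hallucination_score(source, claim):.2f}")
```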
AI-generated content
The findings of the latest (seventh) evaluation of the European Commission's Code of Conduct, which targets eradicating illegal hate speech online, reveal a decline in companies' responsiveness. The percentage of notifications reviewed by companies within 24 hours decreased compared to the two previous monitoring assessments, falling from 90.4% in 2020 to 64.4% in 2022. This decline likely reflects the increased accessibility of Gen AI models, leading to a notable influx of AI-generated content on the web. AI policymakers have highlighted automatically adding labels or invisible watermarks to AI-generated content as a potential technical solution to the challenges posed by generative AI-enabled disinformation. However, apprehensions persist regarding its susceptibility to deliberate tampering and the ability of malicious actors to circumvent it entirely. There are two potential strategies to build immunity against the misuse of AI-generated text. The first involves preventive measures through robust watermarking techniques, which need to be adopted by the major corporate players providing Gen AI models. However, not all Gen AI model providers may adopt watermarking, making it essential to also develop methods to automatically detect AI-generated text found in the wild. We investigate the following aspects (a simplified watermark-detection sketch follows the list):
• Watermarking of AI-generated content
• Detectability of AI-generated content in the wild
• Copyright issues related to AI-generated content
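As an illustration of the watermarking strategy, the sketch below follows the general "green list" idea of Kirchenbauer et al. (2023): a detector recomputes which tokens a watermarked sampler would have favored and tests whether they are over-represented. The hashing scheme, whitespace tokenization, and decision threshold are simplified assumptions, not a production scheme or our group's method.

```python
# Minimal sketch of statistical watermark detection for AI-generated text,
# loosely following the "green list" scheme of Kirchenbauer et al. (2023).
# Hashing and tokenization below are simplified stand-ins for illustration only.
import hashlib
import math

GAMMA = 0.5  # assumed fraction of the vocabulary placed on the green list per step

def is_green(prev_token: str, token: str) -> bool:
    """Pseudo-randomly assign `token` to the green list, seeded by the previous token."""
    digest = hashlib.sha256(f"{prev_token}|{token}".encode()).hexdigest()
    return (int(digest, 16) % 100) / 100.0 < GAMMA

def watermark_z_score(tokens: list[str]) -> float:
    """z-score of the observed green-token count vs. the expectation under no watermark."""
    hits = sum(is_green(p, t) for p, t in zip(tokens, tokens[1:]))
    n = len(tokens) - 1
    return (hits - GAMMA * n) / math.sqrt(n * GAMMA * (1 - GAMMA))

text = "the quick brown fox jumps over the lazy dog".split()
print(f"z = {watermark_z_score(text):.2f}  (large positive values suggest a watermark)")
```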
Fake news and AI-generated misinformation
Approximately half of US adults frequently or occasionally get their news from social media. However, the vast reach of social media also facilitates the widespread propagation of fake news: intentionally false information with significant negative social consequences. With the influx of AI-generated content, including text, images, and videos, the potential risk of misuse has dramatically increased. Around 67% of the US population believes that disinformation creates uncertainty, and about 10% of the population intentionally spreads disinformation. Given that roughly 3.2 billion images and 720,000 hours of video are uploaded to social media platforms daily, the need for robust multimodal fact-checking systems is more urgent than ever. Geoffrey Hinton, who resigned from Google, has warned of the substantial potential for harmful use of AI technology through misinformation and other means. We investigate the following aspects (a simplified 5W verification sketch follows the list):
• Automatic fact-checking
• Aspect-based fact-checking
• 5W-based fact verification
• Multimodal fact-checking
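As a simplified illustration of 5W-based fact verification, the sketch below extracts who/what/when/where answers from both a claim and an evidence passage using an off-the-shelf extractive QA model and flags aspects whose answers disagree. The model name, question templates, matching rule, and example texts are illustrative assumptions, not our deployed pipeline.

```python
# Minimal sketch of 5W-based fact verification (illustrative, not the AIISC pipeline):
# answer who/what/when/where questions over both the claim and the evidence,
# then flag aspects whose answers disagree. The QA checkpoint is a public SQuAD model.
from transformers import pipeline

qa = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")

FIVE_W = {
    "who": "Who is involved?",
    "what": "What happened?",
    "when": "When did it happen?",
    "where": "Where did it happen?",
}

def five_w_check(claim: str, evidence: str) -> dict:
    """Compare 5W answers extracted from the claim against those from the evidence."""
    report = {}
    for aspect, question in FIVE_W.items():
        claim_ans = qa(question=question, context=claim)["answer"].lower()
        evid_ans = qa(question=question, context=evidence)["answer"].lower()
        report[aspect] = {"claim": claim_ans, "evidence": evid_ans,
                          "match": claim_ans in evid_ans or evid_ans in claim_ans}
    return report

claim = "The mayor opened the new bridge in Columbia on Friday."
evidence = "City records show the new bridge in Columbia was opened by the governor on Monday."
for aspect, result in five_w_check(claim, evidence).items():
    print(aspect, result)
```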
Example research projects that develop new AI methodologies and technologies with Cybersecurity Applications
Enhancing the Security and Mitigating Bias in Vision Language Models to Combat Hateful Image Generation and Detoxify Hateful Images
Project abstract (funded by NSF): The extraordinary benefits of large Generative (Gen) AI models also come with a substantial risk of misuse and potential for harm. A primary concern expressed by thousands of AI experts is: should we let machines flood our information channels with propaganda, untruth, hate, and toxicity? According to a Pew study, 64% of Americans say social media have a mostly negative effect on the way things are going in the U.S. today. Given that roughly 3.2 billion images are uploaded daily on social networks, and a rapidly growing percentage of these are generated by Gen AI models called Vision Language Models (VLMs), the need for robust multimodal toxicity prevention is more pressing now than ever. Specifically, the project will develop techniques for automatic detoxification of hateful images and for safeguarding against toxicity and bias in VLM-generated content. The project has the potential to significantly impact the media, online safety and trust, and other industries, and to help stakeholders in government, regulatory bodies, and policy making. Broadening participation in computing and improving diversity are achieved through an annual AI summer camp for high school students from majority-URM schools and through undergraduate research internships. The project also directly impacts 100 journalism students through their involvement in evaluation.
This project pursues three technical objectives: (i) a negative prompting framework for toxic content provenance in VLMs using a novel Graph-of-Thoughts prompting method that utilizes Multimodal Knowledge Graphs (MMKG); the MMKG is organized by a 5W (who, what, when, where, and why) semantic schema and is stored and optimized using techniques such as joint embedding, contrastive learning, and negative sampling; (ii) machine unlearning as a proactive measure to mitigate biases within VLMs; and (iii) DE:HATE, detoxifying hateful images through selective blurring of offensive segments, guided by Attention Diffusion. The types of toxicity encompass gender, race/ethnicity, disability, and other subjects. The evaluation framework uses metrics such as Equality of Odds, other crucial automated metrics, and human evaluation involving journalism students' participation (a minimal Equality of Odds sketch follows). The project will share an open-source web codebase, datasets, and demos on HuggingFace that can be tested live.
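For concreteness, here is a minimal sketch of one of the evaluation metrics named above, Equality of Odds (equalized odds): it compares the true-positive and false-positive rates of a toxicity classifier across demographic groups. The toy labels, predictions, and gap summary below are illustrative only, not the project's evaluation framework.

```python
# Minimal sketch of an Equality of Odds (equalized odds) check for a toxicity
# classifier. Group labels and predictions are toy data for illustration only.
from collections import defaultdict

def rates_by_group(y_true, y_pred, groups):
    """Return per-group true-positive and false-positive rates."""
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "fp": 0, "tn": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        key = ("tp" if p else "fn") if t else ("fp" if p else "tn")
        counts[g][key] += 1
    rates = {}
    for g, c in counts.items():
        tpr = c["tp"] / max(c["tp"] + c["fn"], 1)
        fpr = c["fp"] / max(c["fp"] + c["tn"], 1)
        rates[g] = (tpr, fpr)
    return rates

def equalized_odds_gap(y_true, y_pred, groups):
    """Largest pairwise gap in TPR or FPR across groups (0 = perfectly fair)."""
    rates = rates_by_group(y_true, y_pred, groups)
    tprs = [r[0] for r in rates.values()]
    fprs = [r[1] for r in rates.values()]
    return max(max(tprs) - min(tprs), max(fprs) - min(fprs))

y_true = [1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
print(f"equalized-odds gap: {equalized_odds_gap(y_true, y_pred, groups):.2f}")
```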
KREDIT: Knowledge infoRmEd NLU for Deception IdenTification
Project summary (pending funding from NCCA): Accurate and consistent lie/deception detection challenges human analysts, and complete automation is likely impossible. We plan to help scale deception detection with assistive Artificial Intelligence (AI) technology. Our approach is to aid interrogators/interviewers in detecting possible lies and indicators of deception, and then prompt them to ask relevant follow-up questions to clarify or uncover the deception.
Building upon an understanding of deception rooted in psychology, linguistics, and the social sciences, we note that framing deception detection as a binary classification problem based on natural language processing (NLP) is misguided and unhelpful. Instead, we propose to adopt knowledge-infused learning methods, developed by our team for natural language understanding (NLU), that use a variety of knowledge (the kind humans use) to identify potential contributors to deception. Unlike the black-box nature of popular deep learning methods, we provide the interpretability and explainability necessary to further tune AI algorithms for changes in deceptive language. In Phase 1 we will develop proof-of-concept deception detection algorithms that use syntactic, semantic, and pragmatic features to flag and explain suspicious text episodes for the interviewer (a simplified cue-based sketch follows). We identify the lack of a suitable corpus as a major challenge and propose to develop a more useful and comprehensive corpus to evaluate the accuracy and effectiveness of our methods.
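As a simplified illustration of cue-based flagging with explanations, not the KREDIT algorithms, the sketch below scores a statement against a few linguistic deception cues discussed in the psychology/linguistics literature, such as reduced first-person pronouns and fewer exclusive words. The lexicons and thresholds are toy placeholders.

```python
# Minimal illustrative sketch (not the KREDIT algorithms): flag a text span using a few
# linguistic deception cues from the literature, and explain which cues fired.
# The cue lexicons and thresholds below are toy placeholders.
import re

FIRST_PERSON = {"i", "me", "my", "mine", "myself"}
EXCLUSIVE = {"but", "except", "without", "although"}
NEGATIVE_EMOTION = {"hate", "worthless", "enemy", "angry", "afraid"}

def cue_features(text: str) -> dict:
    """Compute simple lexical cue rates over a whitespace/apostrophe tokenization."""
    tokens = re.findall(r"[a-z']+", text.lower())
    n = max(len(tokens), 1)
    return {
        "first_person_rate": sum(t in FIRST_PERSON for t in tokens) / n,
        "exclusive_rate": sum(t in EXCLUSIVE for t in tokens) / n,
        "neg_emotion_rate": sum(t in NEGATIVE_EMOTION for t in tokens) / n,
    }

def flag_suspicious(text: str):
    """Return (flagged, reasons) so the interviewer sees why a span was flagged."""
    f = cue_features(text)
    reasons = []
    if f["first_person_rate"] < 0.02:
        reasons.append("low rate of first-person pronouns")
    if f["exclusive_rate"] < 0.01:
        reasons.append("few exclusive words (but/except/without)")
    if f["neg_emotion_rate"] > 0.03:
        reasons.append("elevated negative-emotion vocabulary")
    return bool(reasons), reasons

statement = "Someone must have moved the files, although nobody was in the office."
print(flag_suspicious(statement))
```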
Phase 2 options include refinement of techniques to take into account the base rate of deception, expanded deception heuristics that exploit irrelevant or generic detail, the integration of multimodal signals with content analysis, and interviewer training tools. Our team of accomplished researchers from the AI Institute (AIISC) and the Institute of Mind and Brain (IMB) at the University of South Carolina (UofSC) has a history of building useful tools and commercialization, and covers the necessary expertise across psychology, linguistics, and computer science.