{"id":5761,"date":"2024-10-14T11:34:10","date_gmt":"2024-10-14T16:34:10","guid":{"rendered":"https:\/\/www.darkreading.com\/cyberattacks-data-breaches\/confusedpilot-attack-manipulate-rag-based-ai-systems"},"modified":"2024-10-14T11:34:10","modified_gmt":"2024-10-14T16:34:10","slug":"confusedpilot-attack-can-manipulate-rag-based-ai-systems","status":"publish","type":"post","link":"https:\/\/ddi.mohflo.net\/index.php\/2024\/10\/14\/confusedpilot-attack-can-manipulate-rag-based-ai-systems\/","title":{"rendered":"ConfusedPilot Attack Can Manipulate RAG-Based AI Systems"},"content":{"rendered":"<div class=\"media_block\"><a href=\"https:\/\/i0.wp.com\/eu-images.contentstack.com\/v3\/assets\/blt6d90778a997de1cd\/bltfebe35a24a0a89c3\/670d0116624d98d9e3360fb3\/AI-pilot-Mopic-shutterstock_404716201.jpg?ssl=1\"><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/10\/confusedpilot-attack-can-manipulate-rag-based-ai-systems.png?w=640&#038;ssl=1\" class=\"media_thumbnail\"><\/a><\/div>\n<div><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/10\/confusedpilot-attack-can-manipulate-rag-based-ai-systems.png?w=640&#038;ssl=1\" class=\"ff-og-image-inserted\"><\/div>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">Attackers can add a malicious document to the data pools used by artificial intelligence (AI) systems to create responses, which can confuse the system and potentially lead to misinformation and compromised decision-making processes within organizations.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">Researchers from the Spark Research Lab at the University of Texas 
(UT) at Austin discovered the attack vector, which they&#8217;ve dubbed <\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\"><a class=\"ContentText-BodyTextChunk ContentText-BodyTextChunk_link\" target=\"_blank\" href=\"https:\/\/confusedpilot.info\/\" rel=\"noopener\">ConfusedPilot<\/a><\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\"> because it affects all retrieval augmented generation <\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\"><a class=\"ContentText-BodyTextChunk ContentText-BodyTextChunk_link\" target=\"_blank\" href=\"https:\/\/www.darkreading.com\/application-security\/zero-click-genai-worm-malware-poisoning-models\" rel=\"noopener\">(RAG)-based AI systems<\/a><\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">, including <\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\"><a class=\"ContentText-BodyTextChunk ContentText-BodyTextChunk_link\" target=\"_blank\" href=\"https:\/\/www.darkreading.com\/application-security\/how-to-weaponize-microsoft-copilot-for-cyberattackers\" rel=\"noopener\">Microsoft 365 Copilot.<\/a><\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\"> This includes other RAG-based systems that use Llama, Vicuna, and OpenAI, according to the researchers.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">&#8220;This attack allows manipulation of AI responses simply by adding malicious content to any documents the AI system might reference,&#8221; Claude Mandy, chief evangelist at Symmetry, wrote in a <\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\"><a class=\"ContentText-BodyTextChunk 
ContentText-BodyTextChunk_link\" target=\"_blank\" href=\"https:\/\/securityboulevard.com\/2024\/10\/confusedpilot-ut-austin-symmetry-systems-uncover-novel-attack-on-rag-based-ai-systems\/\" rel=\"noopener\">paper<\/a><\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\"> about the attack, which was presented at the DEF CON AI Village 2024 conference in August but was not widely reported. The research was conducted under the supervision of Symmetry CEO and UT professor Mohit Tiwari.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">&#8220;Given that 65% of Fortune 500 companies currently implement or are planning to implement RAG-based <\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\"><a class=\"ContentText-BodyTextChunk ContentText-BodyTextChunk_link\" target=\"_blank\" href=\"https:\/\/www.darkreading.com\/vulnerabilities-threats\/beyond-the-hype-unveiling-realities-of-wormgpt-in-cybersecurity\" rel=\"noopener\">AI systems<\/a><\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">, the potential impact of these attacks cannot be overstated,&#8221; Mandy wrote. 
Moreover, the attack is especially dangerous in that it requires only basic access to manipulate responses by all RAG-based AI implementations, can persist even after malicious content is removed, and bypasses current AI security measures, he said.<\/span><\/p>\n<h2 class=\"ContentText ContentText_variant_h2 ContentText_align_left\" data-testid=\"content-text\" id=\"Malicious Manipulation of RAG\">Malicious Manipulation of RAG<\/h2>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">RAG is a technique for improving response quality and eliminating a large language model (LLM) system\u2019s expensive retraining or fine-tuning phase. It adds a step to the system in which the model retrieves external data to augment its knowledge base, thus enhancing accuracy and reliability in generating responses without the need for retraining or fine-tuning, the researchers said.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">The researchers chose to focus on Microsoft 365 Copilot for the sake of their presentation and their paper, even though it is not the only RAG-based system affected. 
Rather, &#8220;the main culprit of this problem is misuse of RAG-based systems \u2026 via improper setup of access control and data security mechanisms,&#8221; according to the ConfusedPilot website hosted by the researchers.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">In normal circumstances, a RAG-based AI system will use a retrieval mechanism to extract relevant keywords to search and match with resources stored in a vector database, using that embedded context to create a new prompt containing the relevant information to reference.<\/span><\/p>\n<h2 class=\"ContentText ContentText_variant_h2 ContentText_align_left\" data-testid=\"content-text\" id=\"How the Attack Works\">How the Attack Works<\/h2>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">In a ConfusedPilot attack, a threat actor could introduce a seemingly innocuous document that contains specially crafted strings into the target\u2019s environment. &#8220;This could be achieved by any identity with access to save documents or data to an environment indexed by the AI copilot,&#8221; Mandy wrote.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">The attack flow that follows from the user&#8217;s perspective is this: When a user makes a relevant query, the RAG system retrieves the document containing these strings. 
The malicious document contains strings that could act as instructions to the AI system that introduce a <\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\"><a class=\"ContentText-BodyTextChunk ContentText-BodyTextChunk_link\" target=\"_blank\" href=\"https:\/\/www.darkreading.com\/cybersecurity-operations\/fortify-ai-training-datasets-from-malicious-poisoning\" rel=\"noopener\">variety of malicious scenarios<\/a><\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">These include: content suppression, in which the malicious instructions cause the AI to disregard other relevant, legitimate content; misinformation generation, in which the AI generates a response using only the corrupted information; and false attribution, in which the response may be falsely attributed to legitimate sources, increasing its perceived credibility.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">Moreover, even if the malicious document is later removed, the corrupted information may persist in the system\u2019s responses for a period of time because the AI system retains the instructions, the researchers noted.<\/span><\/p>\n<h2 class=\"ContentText ContentText_variant_h2 ContentText_align_left\" data-testid=\"content-text\" id=\"Victimology and Mitigations\">Victimology and Mitigations<\/h2>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">The ConfusedPilot attack basically has two victims: The first is the LLM within the RAG-based 
system, while the second is the person receiving the response from the LLM, who very likely could be an individual working at a large enterprise or service provider. Indeed, these two types of companies are especially vulnerable to the attack, as they allow multiple users or departments to contribute to the data pool used by these <\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\"><a class=\"ContentText-BodyTextChunk ContentText-BodyTextChunk_link\" target=\"_blank\" href=\"https:\/\/www.darkreading.com\/cloud-security\/securing-ai-with-confidential-computing\" rel=\"noopener\">AI systems<\/a><\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">, Mandy noted.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">&#8220;Any environment that allows the input of data from multiple sources or users \u2014 either internally or from external partners \u2014 is at higher risk, given that this attack only requires data to be indexed by the AI Copilots,&#8221; he wrote.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">Enterprise systems likely to be negatively affected by the attack include enterprise knowledge-management systems, AI-assisted decision support systems, and customer-facing AI services.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">Microsoft did not immediately respond to a request for comment by Dark Reading on the attack&#8217;s effect on Copilot. 
However, the researchers noted in their paper that the company has been responsive in coming up with &#8220;practical mitigation strategies&#8221; and <\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\"><a class=\"ContentText-BodyTextChunk ContentText-BodyTextChunk_link\" target=\"_blank\" href=\"https:\/\/www.darkreading.com\/cyber-risk\/ensuring-ai-safety-while-balancing-innovation\" rel=\"noopener\">addressing the potential for attack<\/a><\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\"> in developing its AI technology. Indeed, the latter is key to long-term defense against such an attack, which depends on &#8220;better architectural models&#8221; that &#8220;try to separate the data plan from the control plan in these models,&#8221; Mandy noted.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">Meanwhile, current strategies for mitigation include: data access controls that limit and scrutinize who can upload, modify, or delete data that RAG-based systems reference; data integrity audits that regularly verify the integrity of an organization&#8217;s data repositories to detect unauthorized changes or the introduction of malicious content early; and data segmentation that keeps sensitive data isolated from broader datasets wherever possible to prevent the spread of corrupted info across the AI system.<\/span><\/p>\n<p><a href=\"https:\/\/www.darkreading.com\/cyberattacks-data-breaches\/confusedpilot-attack-manipulate-rag-based-ai-systems\">Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Attackers can add a malicious document to the data 
pools<\/p>\n","protected":false},"author":12,"featured_media":5762,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[1],"tags":[809],"class_list":["post-5761","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized","tag-dark-reading"],"featured_image_urls":{"full":["https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/10\/confusedpilot-attack-can-manipulate-rag-based-ai-systems.png?fit=1920%2C1080&ssl=1",1920,1080,false],"thumbnail":["https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/10\/confusedpilot-attack-can-manipulate-rag-based-ai-systems.png?resize=150%2C150&ssl=1",150,150,true],"medium":["https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/10\/confusedpilot-attack-can-manipulate-rag-based-ai-systems.png?fit=300%2C169&ssl=1",300,169,true],"medium_large":["https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/10\/confusedpilot-attack-can-manipulate-rag-based-ai-systems.png?fit=640%2C360&ssl=1",640,360,true],"large":["https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/10\/confusedpilot-attack-can-manipulate-rag-based-ai-systems.png?fit=640%2C360&ssl=1",640,360,true],"1536x1536":["https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/10\/confusedpilot-attack-can-manipulate-rag-based-ai-systems.png?fit=1536%2C864&ssl=1",1536,864,true],"2048x2048":["https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/10\/confusedpilot-attack-can-manipulate-rag-based-ai-systems.png?fit=1920%2C1080&ssl=1",1920,1080,true],"chromenews-featured":["https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/10\/confusedpilot-attack-can-manipulate-rag-based-ai-systems.png?fit=1024%2C576&ssl=1",1024,576,true],"chromenews-large":["https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/10\/confusedpilot-attack-c
an-manipulate-rag-based-ai-systems.png?resize=825%2C575&ssl=1",825,575,true],"chromenews-medium":["https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/10\/confusedpilot-attack-can-manipulate-rag-based-ai-systems.png?resize=590%2C410&ssl=1",590,410,true]},"author_info":{"display_name":"Dark Reading","author_link":"https:\/\/ddi.mohflo.net\/index.php\/author\/darkreading\/"},"category_info":"<a href=\"https:\/\/ddi.mohflo.net\/index.php\/category\/uncategorized\/\" rel=\"category tag\">Uncategorized<\/a>","tag_info":"Uncategorized","comment_count":"0","jetpack_featured_media_url":"https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/10\/confusedpilot-attack-can-manipulate-rag-based-ai-systems.png?fit=1920%2C1080&ssl=1","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/ddi.mohflo.net\/index.php\/wp-json\/wp\/v2\/posts\/5761","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ddi.mohflo.net\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ddi.mohflo.net\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ddi.mohflo.net\/index.php\/wp-json\/wp\/v2\/users\/12"}],"replies":[{"embeddable":true,"href":"https:\/\/ddi.mohflo.net\/index.php\/wp-json\/wp\/v2\/comments?post=5761"}],"version-history":[{"count":0,"href":"https:\/\/ddi.mohflo.net\/index.php\/wp-json\/wp\/v2\/posts\/5761\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/ddi.mohflo.net\/index.php\/wp-json\/wp\/v2\/media\/5762"}],"wp:attachment":[{"href":"https:\/\/ddi.mohflo.net\/index.php\/wp-json\/wp\/v2\/media?parent=5761"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ddi.mohflo.net\/index.php\/wp-json\/wp\/v2\/categories?post=5761"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ddi.mohflo.net\/index.php\/wp-json\/wp\/v2\/tags?post=5761"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}