{"id":3169,"date":"2024-04-18T15:23:46","date_gmt":"2024-04-18T20:23:46","guid":{"rendered":"https:\/\/www.darkreading.com\/threat-intelligence\/gpt-4-can-exploit-most-vulns-just-by-reading-threat-advisories"},"modified":"2024-04-18T15:23:46","modified_gmt":"2024-04-18T20:23:46","slug":"gpt-4-can-exploit-most-vulns-just-by-reading-threat-advisories","status":"publish","type":"post","link":"https:\/\/ddi.mohflo.net\/index.php\/2024\/04\/18\/gpt-4-can-exploit-most-vulns-just-by-reading-threat-advisories\/","title":{"rendered":"GPT-4 Can Exploit Most Vulns Just by Reading Threat Advisories"},"content":{"rendered":"<div class=\"media_block\"><a href=\"https:\/\/i0.wp.com\/eu-images.contentstack.com\/v3\/assets\/blt6d90778a997de1cd\/blt02471971d38d75fd\/64f171a41d4a425bc428d1ad\/chatgpt_Rokas_Tenys_shutterstock.jpg?ssl=1\"><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/04\/gpt-4-can-exploit-most-vulns-just-by-reading-threat-advisories.jpg?w=640&#038;ssl=1\" class=\"media_thumbnail\"><\/a><\/div>\n<div><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/04\/gpt-4-can-exploit-most-vulns-just-by-reading-threat-advisories.jpg?w=640&#038;ssl=1\" class=\"ff-og-image-inserted\"><\/div>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">AI agents equipped with GPT-4 can exploit most public vulnerabilities affecting real-world systems today, simply by reading about them online.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\"><a class=\"ContentText-BodyTextChunk ContentText-BodyTextChunk_link\" target=\"_blank\" href=\"https:\/\/arxiv.org\/pdf\/2404.08144.pdf\" 
rel=\"noopener\">New findings<\/a><\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\"> out of the University of Illinois Urbana-Champaign (UIUC) threaten to radically enliven what&#8217;s been a somewhat slow 18 months in artificial intelligence (AI)-enabled cyber threats. Threat actors have thus far used <\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\"><a class=\"ContentText-BodyTextChunk ContentText-BodyTextChunk_link\" target=\"_blank\" href=\"https:\/\/www.darkreading.com\/cyberattacks-data-breaches\/wormgpt-cybercrime-tool-heralds-an-era-of-ai-malware-v-ai-defenses\" rel=\"noopener\">large language models (LLMs) to produce phishing emails<\/a><\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">, along with <\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\"><a class=\"ContentText-BodyTextChunk ContentText-BodyTextChunk_link\" target=\"_blank\" href=\"https:\/\/www.darkreading.com\/threat-intelligence\/ta547-uses-llm-generated-dropper-infect-german-orgs\" rel=\"noopener\">some basic malware<\/a><\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">, and to <\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\"><a class=\"ContentText-BodyTextChunk ContentText-BodyTextChunk_link\" target=\"_blank\" href=\"https:\/\/www.darkreading.com\/threat-intelligence\/microsoft-openai-nation-states-are-weaponizing-ai-in-cyberattacks\" rel=\"noopener\">aid in the more ancillary aspects of their campaigns<\/a><\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">. 
Now, though, with only GPT-4 and an open source framework to package it, they can automate the exploitation of vulnerabilities as soon as they hit the presses.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">&#8220;I&#8217;m not sure if our case studies will help inform how to stop threats,&#8221; admits Daniel Kang, one of the researchers. &#8220;I do think that cyber threats will only increase, so organizations should strongly consider applying security best practices.&#8221;<\/span><\/p>\n<h2 class=\"ContentText ContentText_variant_h2 ContentText_align_left\" data-testid=\"content-text\" id=\"GPT-4 vs. CVEs\">GPT-4 vs. CVEs<\/h2>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">To gauge whether LLMs could exploit real-world systems, the team of four UIUC researchers first needed a test subject.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">Their LLM agent consisted of four components: a prompt, a base LLM, a framework \u2014 in this case ReAct, as implemented in LangChain \u2014 and tools such as a terminal and code interpreter.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">The agent was tested on 15 known vulnerabilities in open source software (OSS). Among them: bugs affecting websites, containers, and Python packages. Eight were given &#8220;high&#8221; or &#8220;critical&#8221; CVE severity scores. 
Eleven were disclosed after GPT-4&#8217;s training cutoff, meaning this would be the first time the model was exposed to them.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">With only their security advisories to go on, the AI agent was tasked with exploiting each bug in turn. The results of this experiment painted a stark picture.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">Of the 10 models evaluated \u2014 including GPT-3.5, Meta&#8217;s Llama 2 Chat, and more \u2014 nine could not hack even a single vulnerability.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">GPT-4, however, successfully exploited 13, or 87% of the total.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">It failed only twice, both times for utterly mundane reasons. CVE-2024-25640, a 4.6 CVSS-rated issue in the Iris incident response platform, survived unscathed because of a quirk in the process of navigating Iris&#8217; app, which the model couldn&#8217;t handle. 
Meanwhile, the researchers speculated that GPT-4 failed on CVE-2023-51653, a 9.8 &#8220;critical&#8221; bug in the Hertzbeat monitoring tool, because its description is written in Chinese.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">As Kang explains, &#8220;GPT-4 outperforms a wide range of other models on many tasks. This includes standard benchmarks (MMLU, etc.). It also seems that GPT-4 is much better at planning. Unfortunately, since OpenAI hasn&#8217;t released the training details, we aren&#8217;t sure why.&#8221;<\/span><\/p>\n<h2 class=\"ContentText ContentText_variant_h2 ContentText_align_left\" data-testid=\"content-text\" id=\"GPT-4 Good\">GPT-4 Good<\/h2>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">As threatening as malicious LLMs might be, Kang says, &#8220;At the moment, this doesn&#8217;t unlock new capabilities an expert human couldn&#8217;t do. As such, I think it&#8217;s important for organizations to apply security best practices to avoid getting hacked, as these AI agents start to be used in more malicious ways.&#8221;<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">If hackers start utilizing LLM agents to automatically exploit public vulnerabilities, companies will no longer be able to sit back and wait to patch new bugs (if ever they were). 
And they might have to start using the same LLM technologies as their adversaries.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">But even GPT-4 still has a way to go before it&#8217;s a perfect security assistant, warns Henrik Plate, security researcher for Endor Labs. In recent experiments, Plate tasked ChatGPT and Google&#8217;s Vertex AI with <\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\"><a class=\"ContentText-BodyTextChunk ContentText-BodyTextChunk_link\" target=\"_blank\" href=\"https:\/\/www.endorlabs.com\/learn\/llm-assisted-malware-review-ai-and-humans-join-forces-to-combat-malware\" rel=\"noopener\">identifying samples of OSS as malicious or benign<\/a><\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">, and <\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\"><a class=\"ContentText-BodyTextChunk ContentText-BodyTextChunk_link\" target=\"_blank\" href=\"https:\/\/www.endorlabs.com\/learn\/reviewing-malware-with-llms-openai-vs-vertex-ai#gpt-4\" rel=\"noopener\">assigning them risk scores<\/a><\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">. GPT-4 outperformed all other models when it came to explaining source code and providing assessments for legible code, but all models yielded a number of false positives and false negatives.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">Obfuscation, for example, was a big sticking point. &#8220;It looked to the LLM very often as if [the code] was deliberately obfuscated to make a manual review hard. 
But often it was just reduced in size for legitimate purposes,&#8221; Plate explains.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">&#8220;Even though LLM-based assessment should not be used instead of manual reviews,&#8221; Plate wrote in one of his reports, &#8220;they can certainly be used as one additional signal and input for manual reviews. In particular, they can be useful to automatically review larger numbers of malware signals produced by noisy detectors (which otherwise risk being ignored entirely in case of limited review capabilities).&#8221;<\/span><\/p>\n<p><a href=\"https:\/\/www.darkreading.com\/threat-intelligence\/gpt-4-can-exploit-most-vulns-just-by-reading-threat-advisories\">Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>AI agents equipped with GPT-4 can exploit most public vulnerabilities<\/p>\n","protected":false},"author":12,"featured_media":3170,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[1],"tags":[809],"class_list":["post-3169","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized","tag-dark-reading"],"featured_image_urls":{"full":["https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/04\/gpt-4-can-exploit-most-vulns-just-by-reading-threat-advisories.jpg?fit=1000%2C667&ssl=1",1000,667,false],"thumbnail":["https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/04\/gpt-4-can-exploit-most-vulns-just-by-reading-threat-advisories.jpg?resize=150%2C150&ssl=1",150,150,true],"medium":["https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/04\/gpt-4-can-exploit-most-vulns-just-by-reading-threat-advisories.jpg?fit=300%2C200&ssl=1",300,200,true],"medium_large":["ht
tps:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/04\/gpt-4-can-exploit-most-vulns-just-by-reading-threat-advisories.jpg?fit=640%2C427&ssl=1",640,427,true],"large":["https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/04\/gpt-4-can-exploit-most-vulns-just-by-reading-threat-advisories.jpg?fit=640%2C427&ssl=1",640,427,true],"1536x1536":["https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/04\/gpt-4-can-exploit-most-vulns-just-by-reading-threat-advisories.jpg?fit=1000%2C667&ssl=1",1000,667,true],"2048x2048":["https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/04\/gpt-4-can-exploit-most-vulns-just-by-reading-threat-advisories.jpg?fit=1000%2C667&ssl=1",1000,667,true],"chromenews-featured":["https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/04\/gpt-4-can-exploit-most-vulns-just-by-reading-threat-advisories.jpg?fit=1000%2C667&ssl=1",1000,667,true],"chromenews-large":["https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/04\/gpt-4-can-exploit-most-vulns-just-by-reading-threat-advisories.jpg?resize=825%2C575&ssl=1",825,575,true],"chromenews-medium":["https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/04\/gpt-4-can-exploit-most-vulns-just-by-reading-threat-advisories.jpg?resize=590%2C410&ssl=1",590,410,true]},"author_info":{"display_name":"Dark Reading","author_link":"https:\/\/ddi.mohflo.net\/index.php\/author\/darkreading\/"},"category_info":"<a href=\"https:\/\/ddi.mohflo.net\/index.php\/category\/uncategorized\/\" rel=\"category 
tag\">Uncategorized<\/a>","tag_info":"Uncategorized","comment_count":"0","jetpack_featured_media_url":"https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/04\/gpt-4-can-exploit-most-vulns-just-by-reading-threat-advisories.jpg?fit=1000%2C667&ssl=1","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/ddi.mohflo.net\/index.php\/wp-json\/wp\/v2\/posts\/3169","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ddi.mohflo.net\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ddi.mohflo.net\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ddi.mohflo.net\/index.php\/wp-json\/wp\/v2\/users\/12"}],"replies":[{"embeddable":true,"href":"https:\/\/ddi.mohflo.net\/index.php\/wp-json\/wp\/v2\/comments?post=3169"}],"version-history":[{"count":0,"href":"https:\/\/ddi.mohflo.net\/index.php\/wp-json\/wp\/v2\/posts\/3169\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/ddi.mohflo.net\/index.php\/wp-json\/wp\/v2\/media\/3170"}],"wp:attachment":[{"href":"https:\/\/ddi.mohflo.net\/index.php\/wp-json\/wp\/v2\/media?parent=3169"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ddi.mohflo.net\/index.php\/wp-json\/wp\/v2\/categories?post=3169"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ddi.mohflo.net\/index.php\/wp-json\/wp\/v2\/tags?post=3169"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}