{"id":6168,"date":"2024-11-08T16:16:24","date_gmt":"2024-11-08T22:16:24","guid":{"rendered":"https:\/\/www.darkreading.com\/application-security\/ai-llms-show-promise-squashing-software-bugs"},"modified":"2024-11-08T16:16:24","modified_gmt":"2024-11-08T22:16:24","slug":"ai-llms-show-promise-in-squashing-software-bugs","status":"publish","type":"post","link":"https:\/\/ddi.mohflo.net\/index.php\/2024\/11\/08\/ai-llms-show-promise-in-squashing-software-bugs\/","title":{"rendered":"AI &amp; LLMs Show Promise in Squashing Software Bugs"},"content":{"rendered":"<div class=\"media_block\"><a href=\"https:\/\/i0.wp.com\/eu-images.contentstack.com\/v3\/assets\/blt6d90778a997de1cd\/blt528f12d2e4d4e3c3\/672cf7b054a4f4676ff70996\/vs148-software-security-debt-shutterstock.jpg?ssl=1\"><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/11\/ai-llms-show-promise-in-squashing-software-bugs.jpg?w=640&#038;ssl=1\" class=\"media_thumbnail\"><\/a><\/div>\n<div><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/11\/ai-llms-show-promise-in-squashing-software-bugs.jpg?w=640&#038;ssl=1\" class=\"ff-og-image-inserted\"><\/div>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">Security researchers and attackers are turning to AI models to find vulnerabilities, a technology whose use will likely drive the annual count of software flaws higher, but could eventually result in fewer flaws in public releases, experts say.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">On Nov. 
1, Google said its Big Sleep large language model (LLM) agent discovered a buffer-underflow vulnerability in the popular database engine, SQLite. The experiment shows both the peril and the promise of AI-powered vulnerability discovery tools: The AI agent searched through the code for variations on a specific vulnerability, but identified the software flaw in time for Google to <\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\"><a class=\"ContentText-BodyTextChunk ContentText-BodyTextChunk_link\" target=\"_blank\" href=\"https:\/\/www.darkreading.com\/application-security\/google-big-sleep-ai-agent-sqlite-software-bug\" rel=\"noopener\">notify the SQLite project and work with them to fix the issue<\/a><\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">Using AI just for software-defect discovery could result in a surge in vulnerability disclosures, but introducing LLM agents into the development pipeline could reverse the trend and lead to fewer software flaws escaping into the wild, says Tim Willis, head of Google&#8217;s Project Zero, the company&#8217;s effort to identify zero-day vulnerabilities.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">&#8220;While we are at an early stage, we believe that the techniques we develop through this research will become a useful and general part of the toolbox that software developers have at their disposal,&#8221; he says.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" 
data-testid=\"content-text\">Google is not alone in searching for better ways to find \u2014 and fix \u2014 vulnerabilities. In August, a group of researchers from Georgia Tech, Samsung Research, and other firms \u2014 collectively known as Team Atlanta \u2014 used an LLM bug-finding system to <\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\"><a class=\"ContentText-BodyTextChunk ContentText-BodyTextChunk_link\" target=\"_blank\" href=\"https:\/\/team-atlanta.github.io\/blog\/post-asc-sqlite\/\" rel=\"noopener\">automatically find and patch a bug in SQLite<\/a><\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">. And just last month, cybersecurity firm GreyNoise Intelligence revealed it had <\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\"><a class=\"ContentText-BodyTextChunk ContentText-BodyTextChunk_link\" target=\"_blank\" href=\"https:\/\/www.greynoise.io\/blog\/greynoise-intelligence-discovers-zero-day-vulnerabilities-in-live-streaming-cameras-with-the-help-of-ai\" rel=\"noopener\">used its Sift AI system to analyze honeypot logs<\/a><\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\"> leading to the discovery and patching of two zero-day vulnerabilities affecting Internet-connected cameras used in sensitive environments.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">Overall, companies are gaining more ways to automate vulnerability discovery, and \u2014 if they are serious about security \u2014 will be able to drive down the number of vulnerabilities in their products by using the tools in development, says Corey Bodzin, chief product officer at GreyNoise Intelligence.<\/span><\/p>\n<p class=\"ContentParagraph 
ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">&#8220;The exciting thing is we do have technology that allows people who [care about] security to be more effective,&#8221; he says. &#8220;Sadly &#8230; there are not many companies where that is &#8230; a primary driver, but even in companies where [security is] purely viewed as a cost&#8221; can benefit from using these tools.<\/span><\/p>\n<h2 class=\"ContentText ContentText_variant_h2 ContentText_align_left\" data-testid=\"content-text\" id=\"Only the First Steps\">Only the First Steps<\/h2>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">Currently, Google&#8217;s custom approach is still bespoke and requires work to adapt to specific vulnerability-finding tasks. The company&#8217;s Big Sleep agent does not to look for completely new vulnerabilities, but uses details from a previously discovered vulnerability to look for similar issues. 
The project has looked at smaller programs with known vulnerabilities as test cases, but the SQLite experiment is the first time they found vulnerabilities in production code, the Google Project Zero and Google DeepMind researchers <\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\"><a class=\"ContentText-BodyTextChunk ContentText-BodyTextChunk_link\" target=\"_blank\" href=\"https:\/\/googleprojectzero.blogspot.com\/2024\/10\/from-naptime-to-big-sleep.html\" rel=\"noopener\">stated in Google&#8217;s blog post describing the research<\/a><\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">While specialized fuzzers would likely have found the bug, tuning those tools to perform well is a very manual process, says Google&#8217;s Willis.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">&#8220;One promise of [L]LM agents is that they might generalize across applications without the need for specialized tuning,&#8221; he says. &#8220;Additionally, we&#8217;re hopeful that [L]LM agents will be able to uncover a different subset of vulnerabilities than those typically found through fuzzing.&#8221;<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">The use of AI-based vulnerability discovery tools will be a race between attackers and defenders. Manual code review is a viable way of finding bugs for attackers, who only need a single exploitable vulnerability or short chain of vulnerabilities. 
But defenders need a scalable way of finding and fixing applications, Willis says. While bug-finding tools can be a force multiplier for both attackers and defenders, the ability to scale up to analyze code will likely be a greater benefit for defenders, Willis says.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">&#8220;We expect that advances in automated vulnerability discovery, triage, and remediation will disproportionately benefit defenders,&#8221; he says.<\/span><\/p>\n<h2 class=\"ContentText ContentText_variant_h2 ContentText_align_left\" data-testid=\"content-text\" id=\"Focus AI on Finding and Fixing Bugs\">Focus AI on Finding and Fixing Bugs<\/h2>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">Companies that focus on using AI to generate secure code and fix bugs when found will deliver higher quality code from developers, says Chris Wysopal, co-founder and chief security evangelist at Veracode, an application security firm. He argues that automating bug finding and bug fixing are two completely different problems. Finding vulnerabilities is a very large data problem, while fixing bugs usually deals with perhaps a dozen lines of code.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">&#8220;Once you know the bug is there \u2014 if you found it through fuzzing, or through an LLM, or using human code review \u2014 and you know what kind of bug it is, fixing it is relatively easy,&#8221; he says. &#8220;So, LLMs favor defenders, because having access to source code and fixing issues is easy. 
So I&#8217;m kind of bullish that we can eliminate whole classes of vulnerabilities, but it&#8217;s not from finding more, it&#8217;s from being able to fix more.&#8221;<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">Companies that require developers to run automated security tools before code check-in will find themselves on a path to paying down their security debt&nbsp;\u2014 the collection of issues that they know about, but have not had time to fix, he says. Currently, about half (46%) of organizations have security debt in the form of persistent critical flaws in applications, <\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\"><a class=\"ContentText-BodyTextChunk ContentText-BodyTextChunk_link\" target=\"_blank\" href=\"https:\/\/www.veracode.com\/press-release\/veracode-research-highlights-financial-sectors-escalating-security-debt\" rel=\"noopener\">according to Veracode&#8217;s <\/a><\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\"><a class=\"ContentText-BodyTextChunk ContentText-BodyTextChunk_link ContentText-BodyTextChunk_italic\" target=\"_blank\" href=\"https:\/\/www.veracode.com\/press-release\/veracode-research-highlights-financial-sectors-escalating-security-debt\" rel=\"noopener\">2024 State of Software Security<\/a><\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\"><a class=\"ContentText-BodyTextChunk ContentText-BodyTextChunk_link\" target=\"_blank\" href=\"https:\/\/www.veracode.com\/press-release\/veracode-research-highlights-financial-sectors-escalating-security-debt\" rel=\"noopener\"> report<\/a><\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" 
data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">&#8220;The idea that you&#8217;re committing code that has a problem in it, and it&#8217;s not fixed, will become the exception, not the rule, like it is today,&#8221; Wysopal says. &#8220;Once you can start to automate this fixing \u2014 and we&#8217;re always getting better at automating finding [vulnerabilities] \u2014 I think that&#8217;s how things change.&#8221;<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">Yet, the technology will still have to overcome companies&#8217; focus on efficiency and productivity over security, says Bob Rudis, vice president of data science and security research at GreyNoise Intelligence. He points to the fixing of the two security vulnerabilities that GreyNoise Intelligence found and responsibly disclosed. 
The company only fixed the issues in two product models, but not others \u2014 despite the fact that the other products likely had similar issues, he says.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">Google and GreyNoise Intelligence proved that the technology will work, but whether companies integrate AI into the development pipelines to eliminate bugs is still an open question.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">Rudis has doubts.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">&#8220;I&#8217;m sure a handful of organizations are going to deploy it \u2014 it&#8217;s going to make like seven C files a little bit safer across a bunch of organizations, and maybe we&#8217;ll get like a tick more security for the ones that can actually deploy it properly,&#8221; he says. 
&#8220;But ultimately, until we actually change the incentive structure around how software vendors build and deploy things, and how consumers actually purchase and deploy and configure things, we are not going to see any benefit.&#8221;<\/span><\/p>\n<p><a href=\"https:\/\/www.darkreading.com\/application-security\/ai-llms-show-promise-squashing-software-bugs\">Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Security researchers and attackers are turning to AI models to<\/p>\n","protected":false},"author":12,"featured_media":6169,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[1],"tags":[809],"class_list":["post-6168","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized","tag-dark-reading"],"featured_image_urls":{"full":["https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/11\/ai-llms-show-promise-in-squashing-software-bugs.jpg?fit=1920%2C1080&ssl=1",1920,1080,false],"thumbnail":["https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/11\/ai-llms-show-promise-in-squashing-software-bugs.jpg?resize=150%2C150&ssl=1",150,150,true],"medium":["https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/11\/ai-llms-show-promise-in-squashing-software-bugs.jpg?fit=300%2C169&ssl=1",300,169,true],"medium_large":["https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/11\/ai-llms-show-promise-in-squashing-software-bugs.jpg?fit=640%2C360&ssl=1",640,360,true],"large":["https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/11\/ai-llms-show-promise-in-squashing-software-bugs.jpg?fit=640%2C360&ssl=1",640,360,true],"1536x1536":["https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/11\/ai-llms-show-promise-in-squashing-software-bugs.jpg?fit=1536%2C864&ssl=1",1536,864,true],"2048x2048":["https:\/\/i0.wp.com\/ddi.mohflo.net\
/wp-content\/uploads\/2024\/11\/ai-llms-show-promise-in-squashing-software-bugs.jpg?fit=1920%2C1080&ssl=1",1920,1080,true],"chromenews-featured":["https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/11\/ai-llms-show-promise-in-squashing-software-bugs.jpg?fit=1024%2C576&ssl=1",1024,576,true],"chromenews-large":["https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/11\/ai-llms-show-promise-in-squashing-software-bugs.jpg?resize=825%2C575&ssl=1",825,575,true],"chromenews-medium":["https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/11\/ai-llms-show-promise-in-squashing-software-bugs.jpg?resize=590%2C410&ssl=1",590,410,true]},"author_info":{"display_name":"Dark Reading","author_link":"https:\/\/ddi.mohflo.net\/index.php\/author\/darkreading\/"},"category_info":"<a href=\"https:\/\/ddi.mohflo.net\/index.php\/category\/uncategorized\/\" rel=\"category tag\">Uncategorized<\/a>","tag_info":"Uncategorized","comment_count":"0","jetpack_featured_media_url":"https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/11\/ai-llms-show-promise-in-squashing-software-bugs.jpg?fit=1920%2C1080&ssl=1","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/ddi.mohflo.net\/index.php\/wp-json\/wp\/v2\/posts\/6168","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ddi.mohflo.net\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ddi.mohflo.net\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ddi.mohflo.net\/index.php\/wp-json\/wp\/v2\/users\/12"}],"replies":[{"embeddable":true,"href":"https:\/\/ddi.mohflo.net\/index.php\/wp-json\/wp\/v2\/comments?post=6168"}],"version-history":[{"count":0,"href":"https:\/\/ddi.mohflo.net\/index.php\/wp-json\/wp\/v2\/posts\/6168\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/ddi.mohflo.net\/index.php\/wp-json\/wp\/v2\/media\/6169"}],"wp:attachment":[{"href":"https:\/\/ddi.mohflo.net\/index.php\/wp-json\/wp\/v2
\/media?parent=6168"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ddi.mohflo.net\/index.php\/wp-json\/wp\/v2\/categories?post=6168"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ddi.mohflo.net\/index.php\/wp-json\/wp\/v2\/tags?post=6168"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}