{"id":4051,"date":"2024-06-14T14:55:40","date_gmt":"2024-06-14T19:55:40","guid":{"rendered":"https:\/\/www.darkreading.com\/threat-intelligence\/sleepy-pickle-exploit-subtly-poisons-ml-models"},"modified":"2024-06-14T14:55:40","modified_gmt":"2024-06-14T19:55:40","slug":"sleepy-pickle-exploit-subtly-poisons-ml-models","status":"publish","type":"post","link":"https:\/\/ddi.mohflo.net\/index.php\/2024\/06\/14\/sleepy-pickle-exploit-subtly-poisons-ml-models\/","title":{"rendered":"&#8216;Sleepy Pickle&#8217; Exploit Subtly Poisons ML Models"},"content":{"rendered":"<div class=\"media_block\"><a href=\"https:\/\/i0.wp.com\/eu-images.contentstack.com\/v3\/assets\/blt6d90778a997de1cd\/blt5351075f8d8da40f\/666c9ef8dcc15c0521f08466\/pickle-Barry_Mason-Alamy.jpg?ssl=1\"><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/06\/sleepy-pickle-exploit-subtly-poisons-ml-models.jpg?w=640&#038;ssl=1\" class=\"media_thumbnail\"><\/a><\/div>\n<div><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/06\/sleepy-pickle-exploit-subtly-poisons-ml-models.jpg?w=640&#038;ssl=1\" class=\"ff-og-image-inserted\"><\/div>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">Researchers have concocted a new way of manipulating machine learning (ML) models by injecting malicious code into the process of serialization.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">The method focuses on <\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\"><a class=\"ContentText-BodyTextChunk ContentText-BodyTextChunk_link\" target=\"_blank\" href=\"https:\/\/www.darkreading.com\/cloud-security\/critical-bugs-hugging-face-ai-platform-pickle\" rel=\"noopener\">the &#8220;pickling&#8221; process<\/a><\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\"> used to store Python objects in bytecode. ML models are often packaged and distributed in Pickle format, despite its <\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\"><a class=\"ContentText-BodyTextChunk ContentText-BodyTextChunk_link\" target=\"_blank\" href=\"https:\/\/www.youtube.com\/watch?v=HsZWFMKsM08\" rel=\"noopener\">longstanding, known risks<\/a><\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">As described in <\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\"><a class=\"ContentText-BodyTextChunk ContentText-BodyTextChunk_link\" target=\"_blank\" href=\"https:\/\/blog.trailofbits.com\/2024\/06\/11\/exploiting-ml-models-with-pickle-file-attacks-part-1\/\" rel=\"noopener\">a new blog post<\/a><\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\"> from Trail of Bits, Pickle files allow some cover for attackers to inject malicious bytecode into ML programs. In theory, such code could cause any number of consequences \u2014 manipulated output, data theft, etc. \u2014 but wouldn&#8217;t be as easily detected as other methods of supply chain attack.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">&#8220;It allows us to more subtly embed malicious behavior into our applications at runtime, which allows us to potentially go much longer periods of time without it being noticed by our incident response team,&#8221; warns David Brauchler, principal security consultant with NCC Group.<\/span><\/p>\n<h2 class=\"ContentText ContentText_variant_h2 ContentText_align_left\" data-testid=\"content-text\" id=\"Sleepy Pickle Poisons the ML Jar\">Sleepy Pickle Poisons the ML Jar<\/h2>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">A so-called &#8220;Sleepy Pickle&#8221; attack is performed rather simply with <\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\"><a class=\"ContentText-BodyTextChunk ContentText-BodyTextChunk_link\" target=\"_blank\" href=\"https:\/\/github.com\/trailofbits\/fickling\" rel=\"noopener\">a tool like Flicking<\/a><\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">. Flicking is an open source program for detecting, analyzing, reverse engineering, or <\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\"><a class=\"ContentText-BodyTextChunk ContentText-BodyTextChunk_link\" target=\"_blank\" href=\"https:\/\/www.darkreading.com\/cloud-security\/critical-bugs-hugging-face-ai-platform-pickle\" rel=\"noopener\">creating malicious Pickle files<\/a><\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">. An attacker merely has to convince a target to download a poisoned .pkl \u2014 say via phishing or supply chain compromise \u2014 and then, upon deserialization, their malicious operation code executes as a Python payload.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">Poisoning a model in this way carries a number of advantages to stealth. For one thing, it doesn&#8217;t require local or remote access to a target&#8217;s system, and no trace of malware is left to the disk. Because the poisoning occurs dynamically during deserialization, it resists static analysis. (A malicious model published to an <\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\"><a class=\"ContentText-BodyTextChunk ContentText-BodyTextChunk_link\" target=\"_blank\" href=\"https:\/\/www.darkreading.com\/application-security\/hugging-face-ai-platform-100-malicious-code-execution-models\" rel=\"noopener\">AI repository like Hugging Face<\/a><\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\"> might be much more easily snuffed out.)<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">Serialized model files are hefty, so the malicious code necessary to cause damage might only represent a small fraction of the total file size. And these attacks can be customized in any number of ways that regular malware attacks are to prevent detection and analysis.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">While Sleepy Pickle can presumably be used to do any number of things to a target&#8217;s machine, the researchers noted, &#8220;controls like sandboxing, isolation, privilege limitation, firewalls, and egress traffic control can prevent the payload from severely damaging the user\u2019s system or stealing\/tampering with the user\u2019s data.&#8221;<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">More interestingly, attacks can be oriented to manipulate the model itself. For example, an attacker could insert a backdoor into the model, or manipulate its weights and, thereby, its outputs. Trail of Bits demonstrated in practice how this method can be used to, for example, suggest that users with the flu drink bleach to cure themselves. Alternatively, an infected model can be used to <\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\"><a class=\"ContentText-BodyTextChunk ContentText-BodyTextChunk_link\" target=\"_blank\" href=\"https:\/\/www.darkreading.com\/cyber-risk\/ai-has-your-business-data\" rel=\"noopener\">steal sensitive user data<\/a><\/span><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">, add phishing links or malware to model outputs, and more.<\/span><\/p>\n<h2 class=\"ContentText ContentText_variant_h2 ContentText_align_left\" data-testid=\"content-text\" id=\"How to Safely Use ML Models\">How to Safely Use ML Models<\/h2>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">To avoid this kind of risk, organizations can focus on only using ML models in the safer file format, Safetensors. Unlike Pickle, Safetensors deals only with tensor data, not Python objects, removing the risk of arbitrary code execution deserialization.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">&#8220;If your organization is dead set on running models that are out there that have been distributed as a pickled version, one thing that you could do is upload it into a resource safe sandbox \u2014 say, AWS Lambda \u2014 and do a conversion on the fly, and have that produce a Safetensors version of the file on your behalf,&#8221; Brauchler suggests.<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">But, he adds, &#8220;I think that&#8217;s more of a Band-Aid on top of a larger problem. Sure, if you go and download a Safetensors file, you might have some amount of confidence that that doesn&#8217;t contain malicious code. But do you trust that the individual or organization that produced this data generated a machine learning model that doesn&#8217;t contain things like backdoors or malicious behavior, or any other number of issues, oversights, or malice, that your organization isn&#8217;t prepared to handle?&#8221;<\/span><\/p>\n<p class=\"ContentParagraph ContentParagraph_align_left\" data-testid=\"content-paragraph\"><span class=\"ContentText ContentText_variant_bodyNormal\" data-testid=\"content-text\">&#8220;I think that we really need to be paying attention to how we&#8217;re managing trust within our systems,&#8221; he says, and the best way of doing that is to strictly separate&nbsp;the data a model is retrieving from the code it uses to function. &#8220;We need to be architecting around these models such that even if they do misbehave, the users of our application and our assets within our environments are not impacted.&#8221;<\/span><\/p>\n<p><a href=\"https:\/\/www.darkreading.com\/threat-intelligence\/sleepy-pickle-exploit-subtly-poisons-ml-models\">Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Researchers have concocted a new way of manipulating machine learning<\/p>\n","protected":false},"author":12,"featured_media":4052,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[1],"tags":[809],"class_list":["post-4051","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized","tag-dark-reading"],"featured_image_urls":{"full":["https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/06\/sleepy-pickle-exploit-subtly-poisons-ml-models-scaled.jpg?fit=2560%2C1440&ssl=1",2560,1440,false],"thumbnail":["https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/06\/sleepy-pickle-exploit-subtly-poisons-ml-models-scaled.jpg?resize=150%2C150&ssl=1",150,150,true],"medium":["https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/06\/sleepy-pickle-exploit-subtly-poisons-ml-models-scaled.jpg?fit=300%2C169&ssl=1",300,169,true],"medium_large":["https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/06\/sleepy-pickle-exploit-subtly-poisons-ml-models-scaled.jpg?fit=640%2C360&ssl=1",640,360,true],"large":["https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/06\/sleepy-pickle-exploit-subtly-poisons-ml-models-scaled.jpg?fit=640%2C360&ssl=1",640,360,true],"1536x1536":["https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/06\/sleepy-pickle-exploit-subtly-poisons-ml-models-scaled.jpg?fit=1536%2C864&ssl=1",1536,864,true],"2048x2048":["https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/06\/sleepy-pickle-exploit-subtly-poisons-ml-models-scaled.jpg?fit=2048%2C1152&ssl=1",2048,1152,true],"chromenews-featured":["https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/06\/sleepy-pickle-exploit-subtly-poisons-ml-models-scaled.jpg?fit=1024%2C576&ssl=1",1024,576,true],"chromenews-large":["https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/06\/sleepy-pickle-exploit-subtly-poisons-ml-models-scaled.jpg?resize=825%2C575&ssl=1",825,575,true],"chromenews-medium":["https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/06\/sleepy-pickle-exploit-subtly-poisons-ml-models-scaled.jpg?resize=590%2C410&ssl=1",590,410,true]},"author_info":{"display_name":"Dark Reading","author_link":"https:\/\/ddi.mohflo.net\/index.php\/author\/darkreading\/"},"category_info":"<a href=\"https:\/\/ddi.mohflo.net\/index.php\/category\/uncategorized\/\" rel=\"category tag\">Uncategorized<\/a>","tag_info":"Uncategorized","comment_count":"0","jetpack_featured_media_url":"https:\/\/i0.wp.com\/ddi.mohflo.net\/wp-content\/uploads\/2024\/06\/sleepy-pickle-exploit-subtly-poisons-ml-models-scaled.jpg?fit=2560%2C1440&ssl=1","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/ddi.mohflo.net\/index.php\/wp-json\/wp\/v2\/posts\/4051","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ddi.mohflo.net\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ddi.mohflo.net\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ddi.mohflo.net\/index.php\/wp-json\/wp\/v2\/users\/12"}],"replies":[{"embeddable":true,"href":"https:\/\/ddi.mohflo.net\/index.php\/wp-json\/wp\/v2\/comments?post=4051"}],"version-history":[{"count":0,"href":"https:\/\/ddi.mohflo.net\/index.php\/wp-json\/wp\/v2\/posts\/4051\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/ddi.mohflo.net\/index.php\/wp-json\/wp\/v2\/media\/4052"}],"wp:attachment":[{"href":"https:\/\/ddi.mohflo.net\/index.php\/wp-json\/wp\/v2\/media?parent=4051"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ddi.mohflo.net\/index.php\/wp-json\/wp\/v2\/categories?post=4051"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ddi.mohflo.net\/index.php\/wp-json\/wp\/v2\/tags?post=4051"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}