Blog
2025 Predictions: AI Supply Chain Will Become One of the Most Critical Threats to Enterprises
Nico Popp
November 4, 2024
Since OpenAI’s game-changing large language model (LLM) ChatGPT rocketed into the spotlight in November 2022, security experts have been warning about the Pandora’s box of cyber risk that generative AI has unlocked. As generative AI and machine learning deployments achieve critical mass in enterprises’ software development and IT operations (DevOps), this largely uncharted risk is reshaping the cyberthreat landscape.
Generative AI refers to algorithms that can create user-prompted content—such as text, images, music, video, or code—based on the data they have been trained on. However, most AI applications cannot be classified as “generative.” Traditionally, the majority of enterprise AI use cases have been focused on automation, prediction, and process optimization. Nevertheless, the generative AI boom has been a catalyst for AI innovation across the board.
In the midst of a rapidly accelerating Fourth Industrial Revolution, AI-wary security practitioners have raised the following concerns about generative AI:
- Malicious actors using savvy prompt engineering to dupe LLMs into divulging harmful or restricted information;
- Automation and hyper-scaling of phishing and disinformation campaigns;
- Rapid malware development and optimization;
- Model poisoning, where adversaries corrupt the training data that AI developers use to feed their systems;
- Use of deep-fake voice and video impersonations to perpetrate more convincing business email compromise (BEC) scams and to bypass know your customer (KYC) onboarding checks at financial services firms;
- Employees accidentally sharing proprietary organizational data with public LLMs.
With respect to the risk of AI-enhanced malware development, recent research from OpenAI reveals alarming details about the ways threat actors are leveraging LLMs to this end. The report described three threat actors, one Chinese adversary and two Iranian ones, who used ChatGPT to perform scripting and vulnerability research, look up default credentials for widely used Programmable Logic Controllers (PLCs), develop custom bash and Python scripts, obfuscate and debug code, and develop custom malware for Android.
Overall, OpenAI reported that “threat actors continue to evolve and experiment with our models, but we have not seen evidence of this leading to meaningful breakthroughs in their ability to create substantially new malware or build viral audiences.” Nevertheless, the first infostealer designed for macOS, 0xFFF Stealer, was partially written with the help of ChatGPT, according to the threat actor who developed it. While functional, this credential-harvesting malware was not particularly effective.
Beyond the above-mentioned concerns about generative AI shared among many security practitioners, proliferating enterprise adoption of generative AI means there’s a more imminent threat to add to that list: ExtraHop predicts a surge in attacks on both the open-source and proprietary AI supply chain in 2025. We anticipate that threat actors, emboldened by revelations of severe vulnerabilities in open-source resources like Log4j and XZ Utils, will intensify their targeting of the foundational software, version control, and database architectures that underpin the AI developer ecosystem.
A technical article titled “Using Machine Learning to Launch Ransomware,” reshared on the XSS cybercrime forum by community member ‘Babylonian’ in early 2023, envisions the catastrophic potential of this emerging threat. The article’s author asks: “Could someone inject malware, such as ransomware, into a machine learning model? Furthermore, could the malicious payload be embedded in a way that is (currently) undetectable by security solutions like antivirus and EDR?”
Babylonian reposts article about using machine learning to launch ransomware Source: XSS
The article continues: “With the rise of 'model zoos' like HuggingFace and TensorFlow Hub, which offer a wealth of pre-trained models for anyone to download and use, the idea that an attacker could inject malware into such models, or hijack the models before they are deployed into the supply chain, is a truly terrifying prospect."
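The concern quoted above stems in large part from how pre-trained models are distributed: many checkpoint formats are built on Python pickle, and deserializing a pickle stream can execute arbitrary code at load time. Below is a minimal, defensive sketch (not ExtraHop tooling) of one common mitigation: prefer the safetensors format, which stores only tensor data, and refuse to deserialize pickle-based checkpoints pulled from untrusted sources. The directory and file names are hypothetical.

```python
"""
Illustrative sketch: vet downloaded model artifacts before loading them.
Accept formats that cannot embed executable code (e.g. safetensors) and
reject pickle-based checkpoints from untrusted sources.
"""
import zipfile
from pathlib import Path

# Pickle streams (protocol 2 and later) start with the PROTO opcode 0x80.
PICKLE_MAGIC = b"\x80"

def looks_like_pickle(path: Path) -> bool:
    """Return True if the file is a raw pickle or a zip archive
    (e.g. a PyTorch .pt/.pth checkpoint) containing pickled objects."""
    head = path.read_bytes()[:2]
    if head[:1] == PICKLE_MAGIC:
        return True
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as archive:
            return any(name.endswith(".pkl") for name in archive.namelist())
    return False

def vet_model_artifact(path: Path) -> None:
    if path.suffix == ".safetensors":
        print(f"{path.name}: safetensors format, tensors only - OK to load")
    elif looks_like_pickle(path):
        # Deserializing this file would run arbitrary Python via pickle.
        raise RuntimeError(f"{path.name}: pickle-based checkpoint from an "
                           "untrusted source - do not deserialize it")
    else:
        print(f"{path.name}: unrecognized format, review manually")

if __name__ == "__main__":
    for artifact in Path("downloaded_models").glob("*"):  # hypothetical directory
        if artifact.is_file():
            vet_model_artifact(artifact)
```

This is only a sketch of the idea; in practice, teams combine format restrictions like this with model-scanning tooling and provenance checks on the repositories they pull from.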
When mapping out the attack surface for the AI supply chain, the prime target for adversaries is the DevOps ecosystem. As enterprises embrace generative AI, and push their developers to innovate new AI applications at breakneck speed, adversaries will intensify phishing campaigns targeting model zoos, collaborative code repository hubs, and continuous integration/continuous delivery (CI/CD) resources.
DevOps Teams Embrace AI
A recent GitLab survey of 5,000 software development, security, and operations (DevSecOps) professionals worldwide found that AI-generated code is now readily available to developers, enabling them to iterate on codebases much more rapidly.
GitLab found that an “overwhelming majority (78%) of respondents” said they “currently use AI in software development or plan to do so within the next two years.” In this AI-accelerated DevOps environment, 66% of GitLab survey respondents reported releasing software two times faster than they did in 2023. Additionally, 67% of GitLab’s survey participants said over a “quarter of the code they work on is from open source libraries.”
For example, recent research from software supply chain security firm Sonatype found that the Python Package Index (PyPI), “driven by AI and cloud adoption, is estimated to reach 530 billion package requests by the end of 2024, up 87% year-over-year.” Despite this increased use of open-source code libraries, only “20% of organizations are currently using a software bill of materials (SBOM) to document the ingredients that make up their software components,” according to the GitLab survey.
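For readers unfamiliar with the concept, an SBOM is simply a structured inventory of the components a piece of software is built from. The sketch below illustrates the idea at its simplest: walk a requirements file and emit a minimal CycloneDX-style JSON inventory. It is an illustration only, not a substitute for dedicated SBOM tooling; the manifest file name and field subset are assumptions.

```python
"""
Minimal illustration of an SBOM: inventory the open-source packages a
project declares and write them out in a CycloneDX-like JSON shape.
Real SBOMs should come from dedicated tooling; this only shows the idea.
"""
import json
from pathlib import Path

def parse_requirements(path: Path) -> list[dict]:
    components = []
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        # Handle the simple "name==version" pin; anything else is left unpinned.
        if "==" in line:
            name, version = line.split("==", 1)
        else:
            name, version = line, "unspecified"
        components.append({"type": "library",
                           "name": name.strip(),
                           "version": version.strip()})
    return components

if __name__ == "__main__":
    reqs = Path("requirements.txt")  # hypothetical project manifest
    sbom = {
        "bomFormat": "CycloneDX",    # field names follow the CycloneDX convention
        "specVersion": "1.5",
        "components": parse_requirements(reqs) if reqs.exists() else [],
    }
    print(json.dumps(sbom, indent=2))
```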
The lightning pace of AI adoption and increasing reliance on open-source code libraries have thus introduced unprecedented risks to enterprise DevOps security. As organizations scramble to keep pace with the innovation zeitgeist, threat actors like ‘NullBulge’ have seized on this industrial paradigm shift to stage malicious campaigns targeting the AI developer ecosystem.
July 2024 research from SentinelOne exposed a recent campaign attributed to NullBulge that targeted the “software supply chain by weaponizing code in publicly available repositories on GitHub and Hugging Face, leading victims to import malicious libraries, or through mod packs used by gaming and modeling software.”
SentinelOne emphasized that the group was deliberately “targeting extensions and modifications of commonly used AI-art-adjacent applications.” The cybersecurity firm also said that the “group uses tools like Async RAT and Xworm before delivering LockBit payloads built using the leaked Lockbit Black builder.” But what other nodes of the AI supply chain might NullBulge and other like-minded adversaries be targeting to introduce ransomware or other malicious exploits?
Mapping Out the AI Supply Chain
The AI supply chain encompasses the foundational code libraries, interactive computing applications, data infrastructure tools, vector embedding databases, version control hubs, pipeline automation utilities, CI/CD tools, documentation platforms, and other resources used by developers to create and iterate over enterprise AI applications. Some of the most popular code libraries used by AI developers include TensorFlow, PyTorch, scikit-learn, Keras, NumPy, Pandas, Matplotlib, and Transformers (by Hugging Face).
Popular AI interactive computing applications include Jupyter Notebooks, Google Colab, Kaggle Kernels, and Apache Zeppelin. Some data infrastructure resources and tools that are widely used by AI developers include Snowflake, Amazon S3, Google Cloud Storage, and Apache Hadoop. Examples of vector embedding databases that feature prominently in AI application development include Pinecone, Weaviate, and Milvus.
On the version control side, platforms like GitHub, GitLab, and Bitbucket provide AI developers with central resources to collaborate on application code, AI models, and data pipelines. According to Amazon Web Services, a data pipeline is a “series of processing steps to prepare enterprise data for analysis” and “includes various technologies to verify, summarize, and find patterns in data to inform business decisions.” Some key data pipeline automation utilities operationalized in AI development are MLflow, Kubeflow, and GitLab CI/CD.
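As a concrete, hedged example of what one of these pipeline automation steps looks like, the sketch below trains a small model and registers it with an MLflow tracking server. It assumes the mlflow and scikit-learn packages and a reachable tracking server; the URI and experiment name are placeholders. The point for this discussion is that every parameter, metric, and serialized model logged this way becomes a supply chain artifact that downstream jobs will later pull and deploy.

```python
"""
Hedged sketch of a pipeline automation step: train a small classifier
and register it with an MLflow tracking server. The tracking URI and
experiment name are placeholders.
"""
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

mlflow.set_tracking_uri("http://mlflow.internal.example:5000")  # hypothetical server
mlflow.set_experiment("demo-classifier")

X, y = load_iris(return_X_y=True)

with mlflow.start_run():
    params = {"C": 1.0, "max_iter": 200}
    model = LogisticRegression(**params).fit(X, y)

    # Everything logged here - parameters, metrics, and the serialized
    # model - becomes a supply chain artifact that downstream CI/CD
    # jobs will pull and deploy.
    mlflow.log_params(params)
    mlflow.log_metric("train_accuracy", model.score(X, y))
    mlflow.sklearn.log_model(model, "model")
```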
CI/CD-based intrusion vectors may represent the most exploitable node in the AI supply chain, and indeed in DevOps workflows generally. In 2023, a joint security alert published by the National Security Agency (NSA) and the Cybersecurity and Infrastructure Security Agency (CISA) noted that “CI/CD pipeline compromises are increasing.”
The NSA and CISA report also explained that “organizations are constantly leveraging CI/CD focused tools and services to securely streamline software development and manage applications and clouds’ programmable infrastructure. Therefore, CI/CD environments are attractive targets for malicious cyber actors (MCAs) whose goals are to compromise information by introducing malicious code into CI/CD applications, gaining access to intellectual property/trade secrets through code theft, or causing denial of service effects against applications.”
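One widely recommended mitigation for the kind of pipeline tampering the NSA and CISA describe is to pin dependencies by cryptographic hash, so a build fails if an upstream artifact is silently replaced. The sketch below shows the idea in stripped-down form: verify a downloaded package against an expected SHA-256 digest before a CI job installs it. The file name and digest are placeholders; in practice this is roughly what pip's --require-hashes mode enforces.

```python
"""
Stripped-down illustration of hash pinning in a CI/CD job: refuse to
install a downloaded package artifact unless its SHA-256 digest matches
the value committed to the repository. File name and digest are placeholders.
"""
import hashlib
import sys
from pathlib import Path

# Normally committed to the repository alongside the lock file.
EXPECTED_SHA256 = {
    "example_pkg-1.4.2-py3-none-any.whl":
        "0f2b5a6c9d4e8a1b7c3d5e6f8a9b0c1d2e3f4a5b6c7d8e9f0a1b2c3d4e5f6a7b",
}

def sha256_of(path: Path) -> str:
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify(path: Path) -> None:
    expected = EXPECTED_SHA256.get(path.name)
    if expected is None:
        sys.exit(f"{path.name}: no pinned hash on record - failing the build")
    actual = sha256_of(path)
    if actual != expected:
        # A mismatch means the artifact differs from what was reviewed,
        # e.g. a tampered or substituted package.
        sys.exit(f"{path.name}: hash mismatch ({actual}) - failing the build")
    print(f"{path.name}: hash verified")

if __name__ == "__main__":
    for wheel in Path("dist").glob("*.whl"):  # hypothetical download directory
        verify(wheel)
```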
Prolific threat actors like the lead administrator of the revamped Breach Forums, ‘IntelBroker,’ have continued to make headlines for their relentless and effective targeting of CI/CD resources such as Jenkins.
IntelBroker sells access to an engineering company and alleges CI/CD compromise Source: Breach Forums
Additionally, a March SentinelOne report that details six ways that threat actors abuse GitHub and other DevOps platforms noted that adversaries have increasingly “exploited GitHub’s continuous integration/continuous deployment (CI/CD) pipelines and automation features, such as GitHub Actions, to automate malicious activities and orchestrate attacks.”
By “leveraging these capabilities,” SentinelOne noted, threat actors “deploy malware, exfiltrate data, or execute unauthorized commands within CI/CD workflows.” Yet another attack vector in the AI supply chain emerges from popular software documentation tools like Sphinx, MkDocs, and Notion. Collectively, the breadth and complexity of the AI supply chain give threat actors a vast and porous attack surface through which to penetrate enterprise DevOps environments and weaponize organizations’ urgency to innovate against them.
Rising Attacks on Open-Source Libraries and CI/CD Pipelines
The unnerving reality is that the pace of AI application development is accelerating at the same time that threat actors are increasingly targeting software CI/CD pipelines and open-source supply chains. October research from Sonatype found that “over 512,847 malicious packages have been logged just in the past year, a 156% increase year-over-year.” Sonatype compiled its report by analyzing data from seven million open-source projects spanning the entirety of the software supply chain.
Broadly, these types of malicious-package attacks are classified as “poisoning the well,” according to SentinelOne. Specifically, groups like NullBulge and others are targeting the “software supply chain by injecting malicious code into legitimate software distribution mechanisms, exploiting trusted platforms like GitHub, Reddit and Hugging Face to maximize their reach,” noted SentinelOne.
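“Poisoning the well” frequently takes the form of look-alike package names that differ from a popular library by only a character or two. As a simplified illustration (not a production control), the sketch below compares the names in a requirements file against a short allow-list of popular AI libraries and flags near-misses; the allow-list, manifest name, and similarity threshold are all assumptions.

```python
"""
Simplified illustration of typosquat screening: flag requirement names
that are suspiciously close to, but not identical to, well-known AI
libraries. The allow-list and similarity threshold are illustrative.
"""
import difflib
from pathlib import Path

KNOWN_GOOD = {
    "tensorflow", "torch", "scikit-learn", "keras",
    "numpy", "pandas", "matplotlib", "transformers",
}
SIMILARITY_THRESHOLD = 0.85  # arbitrary cut-off for "suspiciously similar"

def requirement_names(path: Path) -> list[str]:
    names = []
    for line in path.read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            # Keep only the package name, dropping any version specifier.
            names.append(line.split("==")[0].split(">=")[0].strip().lower())
    return names

def flag_lookalikes(names: list[str]) -> None:
    for name in names:
        if name in KNOWN_GOOD:
            continue
        close = difflib.get_close_matches(name, KNOWN_GOOD, n=1,
                                          cutoff=SIMILARITY_THRESHOLD)
        if close:
            print(f"WARNING: '{name}' closely resembles '{close[0]}' - "
                  "possible typosquat, review before installing")

if __name__ == "__main__":
    reqs = Path("requirements.txt")  # hypothetical manifest
    if reqs.exists():
        flag_lookalikes(requirement_names(reqs))
```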
Another example of this attack typology is the Stargazers Ghost Network campaign exposed by Check Point Research in July. According to the report, Check Point Research “identified a network of GitHub accounts (Stargazers Ghost Network) that distribute malware or malicious links via phishing repositories. The network consists of multiple accounts that distribute malicious links and malware and perform other actions such as starring, forking, and subscribing to malicious repositories to make them appear legitimate.”
While this operation was not specifically focused on the AI DevOps ecosystem, Check Point’s July assessment estimated that over 3,000 ghost accounts were part of this malicious, Distribution-as-a-Service (DaaS) phishing campaign. As threat actors like NullBulge, IntelBroker, and whichever adversary masterminded the Stargazers Ghost Network campaign increasingly target CI/CD and version control platforms, Sonatype noted that “traditional security tools often fail to detect these novel attacks, leaving developers and automated build environments highly vulnerable.”
This security gap “has resulted in a new wave of next-generation supply chain attacks, which target developers directly, bypassing existing defenses,” according to Sonatype. Sonatype also emphasized that “Python is the fastest-growing in projects and volume, and shows more vulnerabilities per package compared to others.”
Additionally, a recent research report published by AI-focused security firm Protect AI documented 34 vulnerabilities in multiple open-source AI and machine learning models, some of which could lead to remote code execution (RCE) attacks and data theft. These flaws were flagged in tools like ChuanhuChatGPT, Lunary, and LocalAI as part of Protect AI’s “huntr” bug bounty program. The most severe flaws impacted Lunary, a production toolkit for LLMs.
Protect AI’s report was published on the heels of NVIDIA’s release of several patches to fix a path traversal flaw in its NeMo generative AI framework (CVE-2024-0129, CVSS score: 6.3) that could lead to code execution and data tampering, according to The Hacker News. Another critical security consideration in the AI supply chain is that DevOps teams are under such pressure to launch new internal applications at such an accelerated pace that code audits have become an afterthought. In this frenzied AI DevOps environment, some security practitioners are raising alarms that companies are “rushing AI software tools to market with unpatched vulnerabilities,” according to Compiler.
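Returning to the NeMo flaw for a moment: path traversal bugs typically stem from joining an attacker-controlled file name onto a base directory without checking where the resolved path lands. The sketch below shows the standard defensive pattern in generic form; it is not NVIDIA’s patch, and the directory name is illustrative.

```python
"""
Generic illustration of the defensive pattern for path traversal:
resolve the requested path and confirm it stays inside the intended
base directory before reading it. Names are illustrative.
"""
from pathlib import Path

BASE_DIR = Path("/srv/model_store").resolve()  # hypothetical artifact directory

def safe_read(user_supplied_name: str) -> bytes:
    # Resolve symlinks and ".." components before comparing.
    candidate = (BASE_DIR / user_supplied_name).resolve()
    if not candidate.is_relative_to(BASE_DIR):
        # e.g. "../../etc/passwd" resolves outside the model store.
        raise PermissionError(f"refusing to read outside {BASE_DIR}: {candidate}")
    return candidate.read_bytes()

if __name__ == "__main__":
    try:
        safe_read("../../etc/passwd")
    except PermissionError as err:
        print(err)
```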
Given the intersecting trendlines of hyper-accelerated AI adoption in DevOps, increased pressure on DevOps teams to launch internal and external applications, and prevailing cyberattack trends across the broader software supply chain, ExtraHop forecasts that the most critical threat posed by AI emerges from its open-source components and CI/CD pipelines. Our consensus is that the danger posed by AI’s supply chain is far more urgent than the one portended by generative AI-refined “super malware.”
Nevertheless, threat actors continue to tinker with generative AI to optimize tactics like PowerShell-based exploits, the number-one MITRE ATT&CK technique operationalized by adversaries last year, according to Splunk research. According to Microsoft, PowerShell is a “cross-platform task automation solution made up of a command-line shell, a scripting language, and a configuration management framework” that runs on Windows, Linux, and macOS.
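On the defensive side, abuse of PowerShell often surfaces in telemetry as command lines carrying flags associated with encoded or hands-off execution. The sketch below is a hedged illustration of that kind of triage check, not an ExtraHop detection; the indicator list and sample command lines are assumptions.

```python
"""
Hedged sketch: flag process command lines that use PowerShell flags
commonly seen in obfuscated or hands-off execution. The indicator list
and sample data are illustrative only.
"""
import re

SUSPICIOUS_FLAGS = [
    r"-enc(odedcommand)?\b",      # base64-encoded command blocks
    r"-nop(rofile)?\b",           # skip profile loading
    r"-w(indowstyle)?\s+hidden",  # run without a visible window
    r"iex\s*\(",                  # Invoke-Expression of generated strings
]
PATTERN = re.compile("|".join(SUSPICIOUS_FLAGS), re.IGNORECASE)

def review_command_lines(command_lines: list[str]) -> None:
    for cmd in command_lines:
        if "powershell" in cmd.lower() and PATTERN.search(cmd):
            print(f"REVIEW: {cmd}")

if __name__ == "__main__":
    # Hypothetical command lines pulled from endpoint or network telemetry.
    review_command_lines([
        "powershell.exe -NoProfile -WindowStyle Hidden -EncodedCommand SQBFAFgA...",
        "powershell.exe Get-ChildItem C:\\Users",
    ])
```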
‘Proxy’ asks for advice on how to use AI to make their malicious PowerShell script undetectable Source: XSS
In the XSS forum post above, threat actor ‘proxy’ seeks advice from the Russian-language cybercriminal community on how to leverage AI and LLMs to make their malicious PowerShell script undetectable by antivirus (AV) and endpoint detection and response (EDR) solutions. In a series of replies to this thread not featured in this screenshot, threat actor ‘dunkel’ says the malicious script can be morphed through any LLM, “chatgpt, claude, llama, mixtral, gemini, grok."
But further down the thread, threat actor ‘injuann’ cautions that "ai is unstable if we talk about constant remorph, you can't be sure that it will work every time." So as 2024 comes to a close, ExtraHop assesses that AI supply chain risk in DevOps should top the list of AI-related threat concerns heading into 2025.