Supply Chain Fundamentals
Model provenance, serialization formats, and trust boundaries in the ML distribution pipeline.
If you wanted to compromise the largest number of organizations running open-weight LLMs with the least effort, you would not attack the model weights. You would attack the supply chain. The path from a model’s release to its deployment on a target system passes through distribution platforms, serialization formats, loading libraries, and configuration files — each of which involves trust assumptions that practitioners rarely examine. A single malicious model on Hugging Face, downloaded thousands of times before detection, would compromise more systems than any weight-level backdoor, and it would do so through mechanisms that traditional ML security research has largely overlooked.
This is not theoretical. JFrog’s security research team has documented malicious models on Hugging Face that exploited Python’s pickle serialization format to execute arbitrary code on the loading machine (JFrog, 2024). These were not sophisticated weight-level attacks — they were straightforward code execution vulnerabilities packaged as model files. They worked because the default model loading path in the most widely used ML libraries will execute arbitrary Python code embedded in serialized objects, and most practitioners neither know nor check for this.
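The mechanism behind these attacks is pickle's `__reduce__` protocol: a pickled object can instruct the unpickler to call any importable function with attacker-chosen arguments, so deserializing a file is equivalent to running code from it. The sketch below demonstrates this with a deliberately harmless payload (`eval("6 * 7")`); the `Payload` class and its argument are illustrative stand-ins for what a real attacker would embed in a model file.

```python
import pickle

class Payload:
    """Mimics a malicious object embedded in a pickled model file."""
    def __reduce__(self):
        # pickle stores (callable, args) and the unpickler CALLS the
        # callable on load. A real attack would reference os.system or
        # exec; eval("6 * 7") keeps this demo harmless while proving
        # that code runs at load time.
        return (eval, ("6 * 7",))

malicious_bytes = pickle.dumps(Payload())  # what the attacker uploads
result = pickle.loads(malicious_bytes)     # what the victim's loader does
print(result)  # → 42: loading the "model" executed attacker-chosen code
```

Note that nothing in this flow looks like an exploit to the loading code: `pickle.loads` is the same call that `torch.load` and similar loaders make under the hood on legacy weight files.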
The supply chain is the most practical attack vector for most adversaries because it requires no access to the original training process, no deep understanding of model internals, and no novel research. It requires only the ability to upload a file to a public platform and wait for people to download it. This chapter covers the full supply chain — from how models are produced and distributed to how they are loaded and deployed — with a focus on where trust is assumed and where it can be violated.
Chapter 102 analyzes supply chain attacks as a threat category. Chapter 104’s scenario walkthroughs include supply chain exploitation. This chapter provides the practical foundation for understanding how those attacks work and why they are effective.
The Model Supply Chain: Producers, Platforms, and Consumers
This section maps the end-to-end supply chain for open-weight models: who produces them (labs like Meta, Alibaba, DeepSeek, Mistral), who hosts and distributes them (Hugging Face, Ollama, ModelScope), and who downloads and deploys them (enterprises, researchers, hobbyists). The supply chain is characterized by a small number of producers, a small number of distribution platforms, and a very large number of consumers — a topology that creates high-value targets at the platform level and amplifies the impact of any compromise that reaches the distribution tier.
Distribution Platforms: Hugging Face, Ollama, and ModelScope
This section covers the major platforms through which open-weight models reach end users. Hugging Face Hub is the dominant platform in the Western ecosystem, hosting hundreds of thousands of models with varying levels of verification. Ollama provides a simplified distribution channel for quantized models, abstracting away the loading process entirely. ModelScope serves a similar role in the Chinese ecosystem. Each platform has different trust models, different verification mechanisms (or lack thereof), and different attack surfaces. The distinction between platforms is relevant to the comparative risk assessment in Chapter 105.
Serialization Formats: Safetensors vs. Pickle
This is the single most important technical distinction in the model supply chain. Python’s pickle format can serialize arbitrary Python objects, which means a pickle file can contain code that executes when loaded. The safetensors format, developed by Hugging Face, stores only tensor data — numerical arrays — and by design cannot contain executable code. This section explains both formats, demonstrates why pickle is dangerous, and covers the current state of ecosystem migration toward safetensors. Despite years of warnings, pickle-based model files remain widely distributed and loaded, making this a proven, active, and exploitable vulnerability.
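The safety argument for safetensors follows directly from its layout: an 8-byte little-endian header length, a JSON header mapping tensor names to dtype, shape, and byte offsets, and then raw tensor bytes. Parsing it is JSON decoding plus byte slicing, with no code path that executes anything. The sketch below builds a minimal safetensors-style blob by hand and parses it with only the standard library (normally the safetensors library does both; the tensor name and values are illustrative).

```python
import json
import struct

def parse_safetensors_header(blob: bytes) -> dict:
    """Read the JSON header from a safetensors byte blob.

    Layout: u64 little-endian header length, UTF-8 JSON header,
    then the raw tensor byte buffer. No step involves executing code.
    """
    (header_len,) = struct.unpack("<Q", blob[:8])
    return json.loads(blob[8:8 + header_len])

# Build a tiny file image by hand to show the layout.
header = {"weight": {"dtype": "F32", "shape": [2], "data_offsets": [0, 8]}}
header_bytes = json.dumps(header).encode("utf-8")
blob = (
    struct.pack("<Q", len(header_bytes))  # header length
    + header_bytes                        # JSON metadata only
    + struct.pack("<2f", 1.0, 2.0)        # raw tensor bytes
)

print(parse_safetensors_header(blob))
```

Contrast this with pickle, where the file format itself is a small instruction stream for a virtual machine that can import modules and call functions: the danger is structural, not a bug that can be patched.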
Model Loading: trust_remote_code and Implicit Trust
Even when model weights are stored safely in safetensors format, the loading process can introduce code execution through other channels. The trust_remote_code flag in Hugging Face’s transformers library allows model repositories to include custom Python code that is executed during model loading — tokenizer classes, model architectures, or preprocessing functions. This section explains how trust_remote_code works, why it exists (to support novel architectures that aren’t yet in the library), and why it is dangerous: setting it to True grants the model repository full code execution on the loading machine. Many tutorials and quickstart guides instruct users to enable it without explaining the implications. Chapter 102 covers this as an exploitation vector.
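One practical mitigation is to centralize model loading behind a wrapper that refuses `trust_remote_code=True` unless the repository is on an explicit, reviewed allowlist. The sketch below is a minimal version of that pattern: the allowlist contents, the repo names, and the stub loader are all illustrative; in practice the `loader` argument would be something like `AutoModel.from_pretrained` from the transformers library.

```python
# Repos whose custom code has been reviewed. Contents are illustrative.
ALLOWED_REMOTE_CODE = {"org-you-audited/custom-arch-model"}

def safe_load(repo_id, loader, **kwargs):
    """Gatekeeper for model loading.

    Blocks trust_remote_code=True for any repo not explicitly
    allowlisted, since that flag grants the repo arbitrary code
    execution on the loading machine.
    """
    if kwargs.get("trust_remote_code") and repo_id not in ALLOWED_REMOTE_CODE:
        raise PermissionError(
            f"trust_remote_code=True grants '{repo_id}' code execution; "
            "add it to ALLOWED_REMOTE_CODE only after reviewing its code."
        )
    return loader(repo_id, **kwargs)

# Stub loader stands in for AutoModel.from_pretrained in this demo.
loaded = safe_load("org/plain-model", lambda rid, **kw: f"loaded {rid}")
print(loaded)

try:
    safe_load("unknown/repo", lambda rid, **kw: rid, trust_remote_code=True)
except PermissionError as err:
    print("blocked:", err)
```

The design choice worth noting is that the default is deny: a quickstart snippet copied into this codebase with `trust_remote_code=True` fails loudly instead of silently granting code execution.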
Provenance and Identity: What’s Verified and What Isn’t
How do you know that the “Qwen2.5-7B” model you downloaded is actually the model released by Alibaba? This section covers the current state of model provenance: model cards (self-reported metadata), cryptographic signatures (rarely used), file hashes (sometimes provided, rarely checked by consumers), and organizational verification on platforms (inconsistently applied). The gap between the provenance information available and the provenance information practitioners actually verify is significant — and it is exactly this gap that supply chain attacks exploit.
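Closing part of that gap costs one function: hash the downloaded file and compare it against a digest published by the producer. The sketch below streams the file so multi-gigabyte weight files never need to fit in memory; the demo file and its contents are illustrative, and in practice `expected` would come from the publisher's release notes or the platform's file listing, when one is provided at all.

```python
import hashlib
import os
import tempfile

def sha256_file(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()

# Demo against a throwaway file standing in for downloaded weights.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"pretend these are model weights")
    weights_path = f.name

# In practice this value is copied from the publisher's announcement.
expected = hashlib.sha256(b"pretend these are model weights").hexdigest()

assert sha256_file(weights_path) == expected, "file does not match published digest"
os.unlink(weights_path)
print("digest verified")
```

A matching hash proves integrity (the bytes were not altered in transit), not authenticity — it pushes the trust question back one step, to wherever the published digest came from. That is exactly why cryptographic signatures would add value, and why their near-absence in the ecosystem matters.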
Fine-Tune and Derivative Chains: Provenance Laundering
The open-weight ecosystem encourages taking a base model, fine-tuning it, and uploading the result as a new model. This creates derivative chains where the lineage of a model becomes progressively harder to trace. This section covers how derivative models obscure provenance: a malicious modification can be uploaded as a “fine-tune” of a popular model, inheriting the base model’s credibility. The term “provenance laundering” describes this dynamic — the derivative model appears legitimate because it is ostensibly based on a trusted parent, even though the fine-tuning process could have introduced arbitrary modifications. This pattern is analyzed in the supply chain threat section of Chapter 102.
Dependency Chains: Tokenizers, Configs, and Custom Code
Model weights are not deployed in isolation. They are accompanied by tokenizer files, configuration JSON, and potentially custom preprocessing or postprocessing code. This section covers these dependency files as their own attack surface: a modified tokenizer can alter how input is processed, a malicious configuration can change model behavior, and custom code files executed during loading can perform arbitrary actions. These files are often overlooked in security reviews that focus exclusively on the model weights themselves.
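A lightweight first pass is to walk a downloaded model directory and flag file types that can carry executable or behavior-altering content before anything is loaded. The suffix-to-risk mapping below is a heuristic of my own construction, not an official taxonomy, and the synthetic repo layout is illustrative.

```python
import tempfile
from pathlib import Path

# Heuristic map of file suffixes to why they deserve review.
RISKY_SUFFIXES = {
    ".py": "custom code executed when trust_remote_code is enabled",
    ".pkl": "raw pickle: arbitrary code execution on load",
    ".pt": "torch container, typically pickle-based",
    ".bin": "legacy torch weights, typically pickle-based",
}

def audit_model_dir(model_dir):
    """Return (filename, reason) pairs for files that warrant review."""
    findings = []
    for path in sorted(Path(model_dir).rglob("*")):
        reason = RISKY_SUFFIXES.get(path.suffix)
        if reason:
            findings.append((path.name, reason))
    return findings

# Demo on a synthetic repo layout (file names illustrative).
repo = Path(tempfile.mkdtemp())
for name in ["model.safetensors", "tokenizer.json", "modeling_custom.py"]:
    (repo / name).write_text("")

for name, reason in audit_model_dir(repo):
    print(f"{name}: {reason}")
```

This catches the custom-code file that a weights-only review would miss; it deliberately does not inspect file contents, so a tampered `tokenizer.json` or config still requires a manual diff against the upstream repository.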
Key Takeaway: The Supply Chain Is the Most Practical Attack Vector
The closing section synthesizes the chapter’s argument: for most adversaries, the model supply chain offers the highest return on investment. It requires no novel ML research, no access to training infrastructure, and no sophisticated attack techniques. It exploits the gap between the trust that practitioners place in model distribution channels and the verification those channels actually provide. The supply chain is where the real-world compromises have already occurred, and it is where the threat analysis in Part II begins its assessment of practical risk.
Summary
[Chapter summary to be written after full content is drafted.]