Part II · Chapter 101 MIXED

Executive Summary

Bottom-line-up-front assessment of Chinese-origin LLM security risks with key findings and tiered recommendations.

Status Draft -- for peer review Updated 2026-03-02

Scope Note

This analysis uses Chinese-origin open-weight LLMs as the primary case study because they represent the most policy-relevant instance of foreign-origin AI risk. However, the majority of threat vectors and mitigations discussed apply broadly to any untrusted open-weight model regardless of origin. Where findings are origin-agnostic, this is noted explicitly. Where China-specific factors (legal compulsion, demonstrated censorship alignment, strategic adversary status) create additional risk, the marginal increment is quantified separately.

Bottom Line Up Front

Chinese-origin open-weight LLMs (Qwen, DeepSeek, and similar) should not be used in sensitive environments where they have tool access or process sensitive analytical content. The risk is real, technically grounded, and compounded by regulatory and policy constraints that create a de facto prohibition. However, the risk is context-dependent and not absolute — the answer is not a blanket “hard no” across every possible use case. Critically, most of the underlying threat vectors apply to any model you did not train yourself.

Key Findings

1. The Threat Is Real but Nuanced

Backdoors can be embedded in model weights that survive all known safety training (RLHF, SFT, adversarial training). This is not theoretical — Anthropic’s “Sleeper Agents” paper (2024) proved it experimentally. Larger models are harder to fix, not easier. No existing tool can detect a sophisticated backdoor in a billion-parameter model. [PROVEN]

A state actor with control over a pre-training pipeline has the technical capability to insert undetectable backdoors. China’s National Intelligence Law (2017) legally compels cooperation with state intelligence. [Legal framework: PROVEN | Actual insertion: PLAUSIBLE but unconfirmed]

2. The Risk Is Heavily Context-Dependent

Configuration	Risk Level	Primary Threat
Tool access + RAG pipeline	HIGH	Data exfiltration, system compromise
Text-only, no tools	MODERATE	Analytical bias, influence operations
Distillation into custom model	LOW-MODERATE	Bias transfer (not technical backdoors)

3. No Single Mitigation Is Sufficient

Mitigation	Rating
Air-gapping	Partially effective — stops direct exfil, fails against output encoding
Output filtering	Partially effective — provably bypassable by sophisticated steganography
Sandboxing tool access	Partially effective — strongest single control
Weight inspection	Ineffective — no tool scales to LLM parameters
Fine-tuning/distillation	Ineffective — backdoors survive safety training
Quantization	Partially effective — incidental only

4. The Marginal Risk of Chinese Models vs. Any Uncontrolled Model

~70-80% of the risk applies to any model you did not train (training data poisoning, supply chain compromise, uncontrolled biases, prompt injection). The China-specific increment — legal compulsion, demonstrated censorship alignment, strategic adversary status — adds a 1.3x to 2.0x risk multiplier depending on use case. This multiplier is real but the framing should be “how do we manage risk from any uncontrolled model” with Chinese models receiving additional scrutiny.

5. The Regulatory Environment Is Effectively Prohibitive

No single policy explicitly bans Chinese AI models, but the cumulative stack — DoD AI Principles (traceability), NIST AI RMF (supply chain), ITAR/EAR (export controls), DFARS/CMMC, NDAA provisions, and allied interoperability requirements — creates a de facto prohibition for any program under DoD oversight. Congressional momentum is toward making this de jure.

Recommendation Summary

Use Case	Recommendation
Any configuration with tool access	Do not use Chinese-origin models
Analytical tasks on sensitive topics	Do not use Chinese-origin models
Non-sensitive text generation (coding, formatting)	Conditional — with monitoring and US alternative preferred
Knowledge distillation from Chinese teacher	Acceptable with red-team validation on bias topics
All configurations	Defense-in-depth regardless of model origin

The Honest Answer

“Can Chinese models ever be made safe for defense use?”

For tool-enabled analytical work: No, not with current technology. The combination of undetectable backdoors, tool-use exploitation vectors, and regulatory constraints makes this an unacceptable risk.

For constrained, non-sensitive, text-only use: Technically defensible with sufficient controls, but the compliance and optics risk likely outweighs any capability advantage over US-origin alternatives.

The stronger recommendation: Invest in fine-tuning US-origin open-weight models (Llama) on controlled data, implement defense-in-depth monitoring regardless of model origin, and treat all models you did not train as partially untrusted.

Full analysis in subsequent chapters. This summary represents findings from Part II of this book.