Part II · Chapter 105

Comparative Risk Assessment

How Chinese model risk compares to US open-weight, closed API, and custom-trained alternatives.

Extracted from the comprehensive scenario and risk analysis. Full document: ../104-research-scenario-walkthroughs/attack-scenarios-and-comparative-risk.md


1. Chinese Open-Weight vs. US Open-Weight Models

Risk Comparison Matrix

| Risk Dimension | Chinese (Qwen/DeepSeek) | US (Llama/Mistral) | Delta |
|---|---|---|---|
| State-directed backdoor | PLAUSIBLE (legal + strategic incentive) | SPECULATIVE (no known incentive) | Meaningful gap |
| Training data poisoning | PLAUSIBLE | PLAUSIBLE (by any adversary) | Narrow gap |
| Systematic output bias | PROVEN (on CCP topics) | PROVEN (different biases) | Moderate gap (Chinese bias is adversary-aligned) |
| Supply chain compromise | PLAUSIBLE | PLAUSIBLE | No gap |
| Jailbreak susceptibility | PROVEN (higher rates) | PROVEN (lower rates) | Moderate gap |
| Transparency of training | Low | Moderate | Moderate gap |

China-Specific Risk Factors

  1. Legal obligation: National Intelligence Law (2017), Article 7 — compels cooperation with state intelligence. No equivalent for Meta/Mistral. [PROVEN — enacted law]
  2. Demonstrated censorship alignment: Overt on CCP-sensitive topics. [PROVEN]
  3. Strategic motivation: Documented state-level AI-enabled intelligence strategy. [PROVEN]
  4. Lower security baseline: DeepSeek database exposure (Wiz, January 2025). [PROVEN]

Why the Gap Is Narrower Than Assumed

  1. Training data poisoning affects ALL models trained on web-scraped data. Carlini et al. (2023) demonstrated practical web-scale data poisoning. [PROVEN mechanism]
  2. No model has been publicly demonstrated to contain a state-planted backdoor. [Documented absence]
  3. Supply chain risks are equivalent for any Hugging Face download. [PROVEN]
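Because the supply-chain exposure in point 3 applies to any weight download regardless of which lab produced the model, the same mitigation applies everywhere: pin a known-good cryptographic digest for each artifact and verify it before loading. The sketch below is illustrative only; the file name and pinned digests are hypothetical, not taken from this document.

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so multi-GB weight shards never sit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_artifacts(pinned: dict[str, str], root: Path) -> list[str]:
    """Return the names of artifacts whose on-disk hash does not match the pin."""
    mismatches = []
    for name, expected in pinned.items():
        if sha256_of(root / name) != expected:
            mismatches.append(name)
    return mismatches
```

A deployment pipeline would refuse to load the model when `verify_artifacts` returns a non-empty list; the pinned hashes themselves must come from a trusted channel, or the check only confirms you got what the mirror wanted you to have.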

2. Open-Weight vs. Closed API Models (OpenAI, Anthropic)

Trust Assumptions with Closed APIs

| Risk | Description | Evidence |
|---|---|---|
| Data exposure to provider | All inputs visible to API provider | PROVEN (by architecture) |
| Government compulsion (US) | FISA, NSLs could compel data sharing | PROVEN (legal mechanism) |
| Model behavior changes | Provider can change model without notice | PROVEN (documented) |
| No auditability | Cannot inspect weights or training | PROVEN (by design) |

The Paradox

For defense use, closed API models have a fundamentally different risk profile, not a lower one. You trade model-compromise risk for data-exposure risk. For CUI (Controlled Unclassified Information) or above, local deployment of an inspectable model is preferable despite model-integrity risks.


3. Custom-Trained Models on Controlled Data

| Approach | Cost (est.) | Timeline | Capability |
|---|---|---|---|
| Train from scratch (>70B) | $10M-$100M+ | 6-12 months | State-of-the-art |
| Train from scratch (7-13B) | $500K-$5M | 2-6 months | Moderate |
| Fine-tune US open-weight base | $10K-$500K | 1-4 weeks | Good |
| Distill from multiple teachers | $50K-$1M | 1-3 months | Good |

Residual risks: Training data contamination, framework/toolchain compromise, unintended biases, capability limitations creating pressure to supplement with pre-trained models.

Practical approach: Start from US open-weight base (Llama), fine-tune on controlled data, implement defense-in-depth monitoring.
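The "defense-in-depth monitoring" step above can be sketched as a thin wrapper that screens model output and keeps an audit trail before anything reaches downstream systems. Everything here is an assumption for illustration: the `generate` callable stands in for whatever locally deployed model is used, and the blocklist is a placeholder for real policy checks (classifiers, DLP scanning, human review).

```python
from dataclasses import dataclass, field

@dataclass
class MonitoredModel:
    """Wrap a text-generation callable with output screening and an audit log."""
    generate: callable  # the underlying model call (stubbed for this sketch)
    blocked_terms: tuple = ("BEGIN_EXFIL", "rm -rf")  # placeholder policy, not a real list
    audit_log: list = field(default_factory=list)

    def __call__(self, prompt: str) -> str:
        output = self.generate(prompt)
        flagged = [t for t in self.blocked_terms if t in output]
        self.audit_log.append({"prompt": prompt, "flagged": flagged})
        if flagged:
            return "[output withheld: policy check failed]"
        return output
```

The design point is that the wrapper, not the model, is the trust boundary: the same screening and logging applies whether the underlying weights came from Meta, Mistral, or DeepSeek.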


4. The Fundamental Question

Is this about Chinese models specifically, or about any model you didn’t train?

The Honest Answer: Mostly the latter, with a meaningful increment for Chinese origin.

Origin-agnostic risks (70-80% of threat surface):

  • Training data poisoning
  • Unknown RLHF biases
  • Supply chain compromise
  • Unauditable weight-level behaviors
  • Prompt injection vulnerability

China-specific increment (additional 20-30%):

  • Legal compulsion under National Intelligence Law
  • Demonstrated willingness to embed CCP-aligned censorship
  • Strategic intelligence collection incentive
  • Lower baseline security practices
  • Active intelligence posture against US defense targets

Marginal Risk Multiplier

| Use Case | Multiplier | Rationale |
|---|---|---|
| Technical tasks (code, formatting), no tools | ~1.2-1.3x | Low sensitivity, limited attack surface |
| Analytical tasks on sensitive topics, with tools | ~1.5-2.0x | High sensitivity, broad attack surface |
| Compliance/regulatory contexts | Effectively infinite | De facto prohibition regardless of technical risk |
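The multipliers above compose with a baseline risk estimate for the equivalent US open-weight deployment. A toy calculation follows; the baseline scores in the test are invented for illustration, and only the multipliers come from the table (midpoints are assumed for the ranges).

```python
import math

# Multipliers from the table; math.inf models the compliance case,
# where no technical score makes the deployment acceptable.
MULTIPLIERS = {
    "technical_no_tools": 1.25,     # assumed midpoint of the ~1.2-1.3x range
    "analytical_with_tools": 1.75,  # assumed midpoint of the ~1.5-2.0x range
    "compliance_context": math.inf,
}

def marginal_risk(baseline: float, use_case: str) -> float:
    """Scale a baseline risk score (US open-weight equivalent) by the China-origin multiplier."""
    return baseline * MULTIPLIERS[use_case]
```

Representing the compliance row as infinity makes the table's point mechanically: any nonzero baseline risk yields an unbounded result, i.e. a prohibition rather than a trade-off.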

Government Actions (as of early 2025)

| Entity | Action | Basis |
|---|---|---|
| US Navy | Banned DeepSeek from government devices | Data security |
| NASA, Pentagon | Blocked DeepSeek access | Security review |
| Italy | Temporarily blocked DeepSeek app | GDPR/data privacy |
| Australia | Banned DeepSeek from government devices | Security concerns |
| Taiwan | Banned DeepSeek from government use | National security |
| South Korea | Blocked DeepSeek on government devices | Security concerns |

Key observation: Most bans target the API service (data flowing to Chinese servers), not local use of open weights.


References

  1. Carlini, N. et al. “Poisoning Web-Scale Training Datasets is Practical.” IEEE S&P, 2023.
  2. China National Intelligence Law (2017), Article 7.
  3. Wiz Research. “Exposed DeepSeek Database.” January 2025.
  4. ODNI Annual Threat Assessment, 2024-2025.