
ML team documents critical compatibility issues when fine-tuning and deploying Google's Gemma-4 model

r/MachineLearning · April 18, 2026

AI Summary

  • PEFT library fails to recognize Gemma-4's custom ClippableLinear layers, requiring manual unwrapping before LoRA attachment
  • SFTTrainer from TRL silently breaks training by hardcoding use_cache=False, which corrupts KV-sharing attention; fixed in transformers v5.5.2+
  • DeepSpeed ZeRO-3 produces incomplete LoRA adapters with zero-element tensors in half the layers, making fine-tuning ineffective
  • No mature runtime LoRA serving solutions exist yet, with vLLM experiencing significant latency issues during inference
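The first workaround above — manually unwrapping custom layers so a LoRA library can find plain linear modules to attach to — can be sketched roughly as follows. This is an illustrative, self-contained mock-up: `ClippableLinear` comes from the article, but the module layout, the `Block`/`Linear` stand-in classes, and the `unwrap_clippable` helper are assumptions, not real Gemma-4 or PEFT internals.

```python
# Hypothetical sketch: recursively replace custom linear-wrapper layers
# with their inner plain Linear modules so a LoRA library that only
# matches standard layer types can attach adapters to them.

class Linear:
    """Stand-in for a framework's standard linear layer."""
    def __init__(self, name):
        self.name = name

class ClippableLinear:
    """Stand-in for the custom wrapper the article says PEFT cannot see."""
    def __init__(self, inner):
        self.inner = inner

class Block:
    """Stand-in for a transformer block holding named sublayers."""
    def __init__(self, **layers):
        self.__dict__.update(layers)

def unwrap_clippable(module):
    """Walk the module tree, swapping each ClippableLinear for its inner Linear."""
    for attr, child in vars(module).items():
        if isinstance(child, ClippableLinear):
            setattr(module, attr, child.inner)   # expose the plain Linear
        elif isinstance(child, Block):
            unwrap_clippable(child)              # recurse into sub-blocks
    return module

# Toy model with wrapped attention projections and an unwrapped MLP layer
model = Block(
    attn=Block(q_proj=ClippableLinear(Linear("q_proj")),
               v_proj=ClippableLinear(Linear("v_proj"))),
    mlp=Block(up_proj=Linear("up_proj")),
)
unwrap_clippable(model)
print(type(model.attn.q_proj).__name__)  # Linear
```

After unwrapping, a LoRA attachment step that targets standard linear layers (e.g. by matching module names like `q_proj` and `v_proj`) would see plain `Linear` modules instead of the opaque wrappers.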
