
ML team documents critical compatibility issues when fine-tuning and deploying Google's Gemma-4 model

r/MachineLearning · April 18, 2026

AI Summary

  • PEFT library fails to recognize Gemma-4's custom ClippableLinear layers, requiring manual unwrapping before LoRA attachment
  • SFTTrainer from TRL silently breaks training by hardcoding use_cache=False, which corrupts KV-sharing attention; fixed in transformers v5.5.2+
  • DeepSpeed ZeRO-3 produces incomplete LoRA adapters with zero-element tensors in half the layers, making fine-tuning ineffective
  • No mature runtime LoRA serving solutions exist yet, with vLLM experiencing significant latency issues during inference
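The first workaround above — manually unwrapping custom layers so a LoRA library can find plain linear modules to attach to — can be sketched roughly as follows. This is an illustrative, self-contained mock-up: `ClippableLinear` comes from the article, but the module layout, the `Block`/`Linear` stand-in classes, and the `unwrap_clippable` helper are assumptions, not real Gemma-4 or PEFT internals.

```python
# Hypothetical sketch: recursively replace custom linear-wrapper layers
# with their inner plain Linear modules so a LoRA library that only
# matches standard layer types can attach adapters to them.

class Linear:
    """Stand-in for a framework's standard linear layer."""
    def __init__(self, name):
        self.name = name

class ClippableLinear:
    """Stand-in for the custom wrapper the article says PEFT cannot see."""
    def __init__(self, inner):
        self.inner = inner

class Block:
    """Stand-in for a transformer block holding named sublayers."""
    def __init__(self, **layers):
        self.__dict__.update(layers)

def unwrap_clippable(module):
    """Walk the module tree, swapping each ClippableLinear for its inner Linear."""
    for attr, child in vars(module).items():
        if isinstance(child, ClippableLinear):
            setattr(module, attr, child.inner)   # expose the plain Linear
        elif isinstance(child, Block):
            unwrap_clippable(child)              # recurse into sub-blocks
    return module

# Toy model with wrapped attention projections and an unwrapped MLP layer
model = Block(
    attn=Block(q_proj=ClippableLinear(Linear("q_proj")),
               v_proj=ClippableLinear(Linear("v_proj"))),
    mlp=Block(up_proj=Linear("up_proj")),
)
unwrap_clippable(model)
print(type(model.attn.q_proj).__name__)  # Linear
```

After unwrapping, a LoRA attachment step that targets standard linear layers (e.g. by matching module names like `q_proj` and `v_proj`) would see plain `Linear` modules instead of the opaque wrappers.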
