
Developer tests Qwen 35B model on Mac for coding tasks, finds promise but struggles with context window limitations during multi-step debugging

r/LocalLLaMA · April 19, 2026

AI Summary

  • User running Qwen3.6-35B-A3B-UD-Q4_K_M on M2 MacBook Pro with 32GB RAM using llama.cpp and opencode for coding assistance
  • Model successfully identified bugs in a full-stack application task that Claude Opus 4.7 had previously completed, but lost critical information during context compaction
  • Context window limited to 32,768 tokens to prevent memory exhaustion, forcing trade-offs between functionality and stability
  • Disabling subagents helps preserve task context through the first compaction pass by avoiding duplicate context usage across agents, but a second compaction pass still causes significant information loss
  • Results are 'tantalizing': the model grasps the essentials of the problem but struggles to proceed to the implementation phase due to aggressive context management
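A setup like the one described can be approximated with a llama.cpp server invocation. This is a minimal sketch, not the poster's actual command; the GGUF filename, GPU layer count, and port are illustrative assumptions:

```shell
# Illustrative llama.cpp launch approximating the described setup.
# -c caps the context window at 32,768 tokens, trading capability for
#    stability so the model fits within 32 GB of unified memory;
# -ngl 99 offloads all layers to the Apple GPU via Metal (assumed value).
llama-server \
  -m Qwen3.6-35B-A3B-UD-Q4_K_M.gguf \
  -c 32768 \
  -ngl 99 \
  --port 8080
```

A coding agent such as opencode would then point at the resulting local OpenAI-compatible endpoint; when a session exceeds the 32,768-token cap, the agent compacts the conversation, which is where the reported information loss occurs.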
