Developer tests Qwen 35B model on Mac for coding tasks, finds promise but struggles with context window limitations during multi-step debugging
r/LocalLLaMA · April 19, 2026
AI Summary
•User running Qwen3.6-35B-A3B-UD-Q4_K_M on M2 MacBook Pro with 32GB RAM using llama.cpp and opencode for coding assistance
•Model successfully identified bugs in a full-stack application task that Claude Opus 4.7 had previously completed, but lost critical information during context compaction
•Context window capped at 32,768 tokens to prevent memory exhaustion, forcing a trade-off between functionality and stability
•Disabling subagents reduced dual context usage and helped preserve task context through the first compaction pass, but a second compaction pass still caused significant information loss
•Results are 'tantalizing': the model grasps the essentials of the problem but struggles to proceed to the implementation phase due to aggressive context management
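The failure mode the bullets describe can be sketched in a few lines. The snippet below is a deliberately naive Python model of context compaction, not opencode's actual algorithm (which summarizes rather than truncates): it drops the oldest turns until a transcript fits a token budget, showing how repeated passes discard the original task statement first.

```python
def compact(messages, budget, count_tokens=lambda m: len(m.split())):
    """Naive compaction sketch: drop the oldest messages until the
    transcript fits the token budget. Real agent tooling typically
    summarizes instead of truncating, but the failure mode is the
    same: early task details are the first information to go."""
    kept = list(messages)
    while kept and sum(count_tokens(m) for m in kept) > budget:
        kept.pop(0)  # oldest context is discarded first
    return kept

# Hypothetical session history for illustration.
history = [
    "task: fix auth bug in full-stack app",        # original goal
    "step 1: reproduced failure in login flow",
    "step 2: traced bug to token refresh handler",
    "step 3: long debugging transcript " + "x " * 20,
]

after_first = compact(history, budget=40)   # first pass: task statement dropped
after_second = compact(after_first, budget=25)  # second pass: only latest transcript left
```

After two passes only the most recent debugging transcript survives, which is why the model can still "grasp the essentials" early on yet cannot carry the original task through to implementation.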