SAGA-ReID reconstructs person identity representations using CLIP patch tokens rather than global pooling, improving recognition under occlusion.

arXiv cs.CV · Apr 27, 2026 · 1 min read

3 Key Points

Researchers propose SAGA-ReID, which aligns intermediate patch tokens with anchor vectors in CLIP's text embedding space to suppress corrupted or absent regions without requiring image descriptions.
The method emphasizes spatially stable evidence and outperforms global pooling, with gains of up to 10.6 Rank-1 points on occluded benchmarks, where global aggregation is least reliable.
Controlled experiments isolate the aggregation mechanism under synthetic masking (where identity signal is absent) and realistic human distractors (where an overlapping person introduces confusing signal), with SAGA's advantage growing as occlusion increases.
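
To make the aggregation idea concrete, here is a minimal sketch of anchor-weighted pooling next to plain global mean pooling. It is an illustration under stated assumptions, not the paper's implementation: the function names, the max-over-anchors scoring, the softmax weighting, the `temperature` value, and the 4-dimensional toy vectors (standing in for CLIP patch tokens and text-space anchors) are all hypothetical.

```python
import math

def l2_normalize(v, eps=1e-8):
    """Scale a vector to unit length (an all-zero vector stays zero)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / (norm + eps) for x in v]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def global_mean_pool(patches):
    """Baseline: every patch contributes equally, occluded or not."""
    dim = len(patches[0])
    mean = [sum(p[d] for p in patches) / len(patches) for d in range(dim)]
    return l2_normalize(mean)

def anchor_guided_pool(patches, anchors, temperature=0.1):
    """Weight each patch by its best cosine similarity to any anchor.

    Assumption: scoring by max similarity and weighting by softmax are
    illustrative choices, not details confirmed by the summary.
    """
    unit_anchors = [l2_normalize(a) for a in anchors]
    scores = []
    for p in patches:
        unit_p = l2_normalize(p)
        scores.append(max(dot(unit_p, a) for a in unit_anchors))
    # Softmax over patches, shifted by the max score for numerical stability.
    top = max(scores)
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(patches[0])
    pooled = [sum(w * p[d] for w, p in zip(weights, patches)) for d in range(dim)]
    return l2_normalize(pooled), weights

# Toy data (hypothetical): two anchors and four patch tokens.
anchors = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
patches = [
    [1.0, 0.0, 0.0, 0.0],  # visible region aligned with anchor 0
    [0.0, 1.0, 0.0, 0.0],  # visible region aligned with anchor 1
    [0.0, 0.0, 0.0, 0.0],  # masked patch: identity signal absent
    [0.0, 0.0, 1.0, 0.0],  # distractor patch: unrelated signal
]

pooled, weights = anchor_guided_pool(patches, anchors)
baseline = global_mean_pool(patches)
```

In this toy setup the masked and distractor patches receive near-zero weight, so the distractor's dimension is almost absent from `pooled`, while `baseline` carries it at full strength. That mirrors the two controlled conditions described above (absent signal and confusing signal), but only qualitatively.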
