arXiv:2602.00415v2 Announce Type: replace-cross Abstract: Memory is not merely a storage mechanism for intelligent systems, but a structure for organizing evidence and constraining belief. This is especially important for multimodal reasoning, where retrieved evidence must be both query-relevant and visually consistent. However, current memory systems for vision-language models (VLMs) remain largely positive-associative: they retrieve what is similar or previously observed, but lack an explicit way to remember what has been verified as absent or logically excluded. To this end, we propose \tex
Source: arXiv cs.LG — read the full report at the original publisher.
