arXiv:2512.16349v2
Announce Type: replace-cross

Abstract: We propose a collaborative edge-to-server inference framework for vision-language models (VLMs) that reduces communication cost while maintaining inference accuracy. In typical deployments, visual data captured at edge devices (clients) is transmitted to the server for VLM inference. However, transmitting full-resolution images incurs high communication cost. Conversely, aggressive downsizing or excessive compression to mitigate communication overhead can discard fine-grained details, leading to accuracy degradation. To overcome this li

Source: arXiv cs.AI — read the full report at the original publisher.

This is a curated wire item. The Continuum Brief does not republish full third-party articles; this entry links to the original source.