This is the official implementation of the paper:
"Through the Magnifying Glass: Adaptive Perception Magnification for Hallucination-Free VLM Decoding"
We introduce Perception Magnifier (PM), an inference-time decoding method for vision-language models. PM constructs and refines perception maps from attention, then magnifies critical visual regions while compressing less relevant areas, guiding the model to focus on fine-grained details without losing global context. This adaptive magnification strengthens visual grounding during decoding, effectively reducing hallucinations while preserving reasoning ability.
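As a rough, hedged illustration of the idea only (not the repository's actual implementation; the grid size, threshold, and scaling factors below are assumptions made for this sketch), the following NumPy snippet builds a perception map from attention over image patches and derives per-patch factors that magnify critical regions while compressing the rest:

import numpy as np

# Hypothetical 24x24 grid of image patches; attention assumed already
# averaged over heads and generated tokens (assumption for this sketch).
attn = np.random.rand(24, 24)

# Normalize raw attention into a perception map in [0, 1].
perception_map = (attn - attn.min()) / (attn.max() - attn.min() + 1e-8)

# Split patches into "critical" and "less relevant" regions by a threshold
# (threshold and factors are illustrative, not the paper's values).
threshold = np.quantile(perception_map, 0.8)
magnify, compress = 2.0, 0.5
scale = np.where(perception_map >= threshold, magnify, compress)

# Scaling dummy patch embeddings: critical regions are emphasized while
# the remaining patches are kept, compressed, to preserve global context.
patch_embeds = np.random.rand(24 * 24, 4096)
scaled_embeds = patch_embeds * scale.reshape(-1, 1)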
conda create --name pm python=3.10
conda activate pm
pip install -r requirements.txt
Then set up the environment variables in starter/env_definer.sh and follow the instructions in starter/ReadMe.md.
The main implementation of Perception Magnifier (PM) is located in:
experiments/eval/model_generate.py
This file contains the core functions for generating model outputs with PM.
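For orientation only, here is a toy decoding loop (with a random stand-in model; the function and variable names are placeholders, not the repository's API) showing where per-step attention could be refined into a perception map and used to rescale the visual tokens before the next decoding step:

import numpy as np

def toy_step(visual_tokens, token_ids):
    # Stand-in for one VLM forward pass: returns attention over the visual
    # tokens and next-token logits (random here, a real model in practice).
    attn = np.random.rand(visual_tokens.shape[0])
    logits = np.random.rand(32000)
    return attn, logits

visual_tokens = np.random.rand(576, 64)   # dummy visual token features
token_ids = [1]                           # dummy prompt token ids

for _ in range(5):
    attn, logits = toy_step(visual_tokens, token_ids)
    # Refine this step's attention into a perception map, then magnify
    # high-attention visual tokens and compress the rest for the next step.
    pmap = (attn - attn.min()) / (attn.max() - attn.min() + 1e-8)
    scale = np.where(pmap >= np.quantile(pmap, 0.8), 2.0, 0.5)
    visual_tokens = visual_tokens * scale[:, None]
    token_ids.append(int(logits.argmax()))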
To reproduce our results, we provide ready-to-use scripts under the experiments/scripts/ folder.
For example, to run PM with LLaVA on MME:
bash experiments/scripts/mme/run_llava_pm.sh
To run PM with LLaVA on POPE:
bash experiments/scripts/pope/run_llava_pm.sh
If you find our work useful, please consider citing:
@article{mao2025through,
  title={Through the Magnifying Glass: Adaptive Perception Magnification for Hallucination-Free VLM Decoding},
  author={Mao, Shunqi and Zhang, Chaoyi and Cai, Weidong},
  journal={arXiv preprint arXiv:2503.10183},
  year={2025}
}
This repository extends existing implementations, with some functions and evaluation scripts adapted from VDD, API, PAI, Transformer-Explainability, and OPERA.