Research Statement
My research examines how large language models (LLMs) and large vision-language models (LVLMs) understand, reason about, and interact with geospatial information. Maps play a central role in how people make sense of places, environments, and spatial relationships. However, modern AI systems often struggle to interpret spatial relationships, incorporate geographic context, or answer questions that require multi-step spatial reasoning.
To bridge this gap, I design evaluation frameworks, benchmarks, and analytical methods that reveal not only where models succeed or fail, but why. I focus on advancing models’ capabilities in spatial reasoning, which includes understanding topological and directional relationships, interpreting map symbols and semantics, and integrating multimodal geographic context for more reliable decision-making.
Ongoing Projects
FARON: Synthetic Cartographic Reasoning Dataset
A synthetic dataset of step-by-step cartographic reasoning tasks, designed to benchmark and improve the spatial and map-based reasoning capabilities of LVLMs.
Enhancing Cartographic Reasoning of LVLMs through Template-Based Reasoning
Developing structured reasoning templates that guide LVLMs through complex cartographic tasks. By providing explicit reasoning pathways for map-based queries, this research aims to improve answer accuracy while reducing token cost.
MARCIE: Improving 4D Data Comprehension of LVLMs
Research on improving how LVLMs comprehend 4D data (3D objects changing over time), including architectural modifications, such as to positional encoding, that help models better represent and reason about complex spatio-temporal information.