Matan Levy

I am a Computer Science Ph.D. candidate at the School of Computer Science and Engineering at the Hebrew University of Jerusalem, under the joint supervision of Prof. Dani Lischinski and Dr. Rami Ben-Ari.

I previously worked at IBM Research AI as a research intern.

My research interests are Computer Vision and NLP, and tasks that combine them.

Semantic Scholar | Google Scholar | LinkedIn | {Last name (in blue)}@cs.huji.ac.il

Publications

Find your Needle: Small Object Image Retrieval via Multi-Object Attention Optimization

arXiv, 2025
Michael Green*, Matan Levy*, Issar Tzachor*, Dvir Samuel, Nir Darshan, Rami Ben-Ari

We tackle the problem of Small Object Image Retrieval (SoIR), where the goal is to retrieve images containing specific small objects within cluttered scenes. We establish new benchmarks and introduce Multi-object Attention Optimization (MaO), a novel framework that significantly outperforms existing methods, paving the way for future advancements in efficient, fine-grained retrieval tasks.

Task-Specific Adaptation with Restricted Model Access

arXiv, 2025

In this work, we propose "Gray-box" fine-tuning frameworks that enable task-specific adaptation of foundation models without exposing their weights or architecture. Using lightweight input and output adapters, our approach adapts models while keeping them fixed. We introduce DarkGray-box and LightGray-box variants, demonstrating performance competitive with full fine-tuning on tasks like text-image and text-video alignment.

EffoVPR: Effective Foundation Model Utilization for Visual Place Recognition

ICLR 2025
Issar Tzachor, Boaz Lerner, Matan Levy, Michael Green, Tal Berkovitz Shalev, Gavriel Habib, Dvir Samuel, Noam Korngut Zailer, Or Shimshi, Nir Darshan, Rami Ben-Ari

This work introduces a new method for visual place recognition (VPR) that uses features from foundation models to improve accuracy. It excels in handling challenging scenarios like occlusions, seasonal changes, and day-night variations, offering more efficient and accurate results than previous methods.

Where's Waldo: Diffusion Features for Personalized Segmentation and Retrieval

NeurIPS 2024

This work leverages text-to-image diffusion models for personalized image segmentation and retrieval, using features from pre-trained models. It surpasses existing methods in identifying specific objects within images without additional training.

Chatting Makes Perfect: Chat-based Image Retrieval

NeurIPS 2023

This work proposes a chat-based image retrieval system that refines search results through interactive dialogue. By asking follow-up questions, the system improves retrieval accuracy and surpasses traditional single-query methods in performance.

Data Roaming and Quality Assessment for Composed Image Retrieval

AAAI 2024

This work introduces a new dataset for Composed Image Retrieval (CoIR) and a model that significantly improves retrieval performance. The dataset offers richer queries with less redundancy, and the approach achieves state-of-the-art results on benchmarks like FashionIQ and CIRR.

Classification-Regression for Chart Comprehension

ECCV 2022

This work presents a model for chart question answering that combines visual and textual data, significantly improving performance on complex charts. It excels in handling out-of-vocabulary and regression tasks, achieving strong results on the PlotQA dataset.