Adhiraj Ghosh
Seeking: Research internships and PhD positions!
I am a second-year MSc student in Machine Learning at the University of Tübingen. I focus my research on multimodal deep learning, especially in the context of Vision-Language Representation Learning.
Currently, I am at Bethge Lab, working on the holistic understanding of Vision-Language models.
Previously, I worked on visualising figurative speech at the Computer Graphics Group, which led to an Outstanding Paper award at EMNLP 2023.
Before starting my master's, I was a Computer Vision Researcher at the Center of Artificial Intelligence, ZHAW, working on domain adaptation in Optical Music Recognition.
I have also worked with Dr. Daniel Lin Wen-Yan at SMU on feature correspondence-based object tracking. I studied Electrical and Electronics Engineering for my BSc in Manipal/Singapore.
I am very eager to collaborate on relevant projects, so please reach out if you are interested!
Email / CV / Google Scholar / GitHub / Twitter / Bluesky / LinkedIn / YouTube
Recent News
- Dec 2024 : New paper on lifelong open-ended benchmarking out on arXiv!
- Nov 2024 : Defended my MS thesis!
- Sep 2024 : No Zero-shot was accepted at NeurIPS as a poster! Check out coverage by Computerphile and AI 'N Stuff!
- Dec 2023 : ViPE received an Outstanding Paper award at EMNLP!
- Oct 2023 : ViPE accepted at EMNLP 2023 (main conference).
- Sep 2023 : Work on Real World Music Object Recognition published in TISMIR.
- Mar 2023 : Started working at the Tübingen AI Centre in Dr. Hendrik Lensch's group.
- Oct 2022 : Moved to Germany! Started my MSc at the University of Tübingen.
- Aug 2022 : RPTM accepted for oral presentation at WACV 2023. Check out the paper and SOTA comparisons!
Work Experience
Mar 2023 - Sep 2023: Research Assistant at the Computer Graphics Group, Tübingen AI Centre.
May 2021 - Aug 2022: Computer Vision Researcher, Zürich University of Applied Sciences.
Jan 2020 - Dec 2020: Visiting Researcher, Singapore Management University.
Jun 2018 - Aug 2019: Undergraduate Research Intern, Jadavpur University.
ONEBench to Test Them All: Sample-Level Benchmarking Over Open-Ended Capabilities
Adhiraj Ghosh*, Sebastian Dziadzio*, Ameya Prabhu, Vishaal Udandarao, Samuel Albanie, Matthias Bethge.
arXiv:2412.06745, 2024
Paper
To evaluate the broad capabilities of foundation models, we introduce ONEBench, a benchmark that unifies individual test sets into a large pool of sample-level measurements. We shift the focus from single test sets to sample-level evaluation, restructuring static benchmarks to accommodate an ever-expanding pool of datasets and models.
No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance
Vishaal Udandarao*, Ameya Prabhu*, Adhiraj Ghosh, Yash Sharma, Philip H.S. Torr, Adel Bibi, Samuel Albanie, Matthias Bethge.
NeurIPS 2024
Paper / Code / Let It Wag! Benchmark
The impressive empirical performance of VLMs can be attributed to test concepts appearing in their pretraining datasets, so it does not demonstrate "zero-shot" generalization. Instead, these models need exponentially more data on a concept to improve performance on it linearly.
ViPE: Visualise Pretty-much Everything
Hassan Shahmohammadi, Adhiraj Ghosh, Hendrik Lensch.
EMNLP 2023 (Outstanding Paper Award)
Paper / Code / Dataset / HuggingFace / Music Videos
ViPE is the first automated model for translating arbitrary text into a visualisable prompt. It helps text-to-image models visualise figurative or non-lexical language.
Real World Music Object Recognition
Adhiraj Ghosh*, Lukas Tuggener*, Raphael Emberger*, Pascal Sager*, et al.
TISMIR 2023
Paper / Code
We present solutions to improve recognition accuracy in Music Object Recognition on low-quality, real-world music sheet data and provide confidence-rated model outputs to enable efficient human post-processing.
Relation Preserving Triplet Mining for Stabilising the Triplet Loss in Re-identification Systems
Adhiraj Ghosh, Kuruparan Shanmugalingam, Wen-Yan Lin.
WACV 2023
Paper / Code / Video / Poster
We propose a new, feature-guided triplet mining scheme for understanding intrinsic pose to solve the intra-class variance problem in re-identification datasets.
Irony Detection in Bengali Tweets: A New Dataset, Experimentation and Results
Adhiraj Ghosh, Kamal Sarkar.
ICCIDS 2020
Paper / Dataset
This paper describes the Bengali irony detection dataset we developed and reports results obtained on it using SOTA machine learning methods.