Advancing Generalization in Deep Representation Learning
Authors
Molahasani Majdabadi, Mahdiyar
Date
2025-05-29
Type
thesis
Language
eng
Keyword
Generalization in deep learning, Sample-level generalization, Class-level generalization, Representation-level generalization, Domain-level generalization, Continual learning, Long-tailed recognition, Federated learning, Spurious correlations, Vision-Language models
Abstract
Generalization, the ability of machine learning models to reliably apply learned knowledge to unseen scenarios, remains a central challenge in deploying AI systems effectively in dynamic, real-world environments. This thesis systematically addresses generalization challenges within computer vision across four distinct yet interconnected levels: sample-level, class-level, domain-level, and representation-level generalization.
First, we address sample-level generalization, which is critical when test samples deviate from the training distribution, through a novel continual learning adaptation of Elastic Weight Consolidation tailored to pedestrian detection. Our method achieves robust performance across varying data distributions without catastrophic forgetting.
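For readers unfamiliar with Elastic Weight Consolidation, the core idea is a quadratic penalty that anchors parameters important to previous tasks, weighted by an estimate of the Fisher information. The sketch below shows only that generic penalty, not the thesis's specific adaptation for pedestrian detection; the function name and flat-array representation are illustrative assumptions.

```python
import numpy as np

def ewc_penalty(params, old_params, fisher, lam=1.0):
    """Generic EWC regularizer: (lam/2) * sum_i F_i * (theta_i - theta*_i)^2.

    params, old_params, fisher: lists of arrays of matching shapes, where
    old_params are the parameters after the previous task and fisher holds
    per-parameter Fisher information estimates (importance weights).
    """
    return 0.5 * lam * sum(
        float(np.sum(f * (p - p_old) ** 2))
        for p, p_old, f in zip(params, old_params, fisher)
    )
```

This term is added to the new task's loss, so parameters with high Fisher values are pulled back toward their old values while unimportant ones remain free to adapt.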
Next, we investigate class-level generalization, specifically tackling the long-tailed recognition problem caused by imbalanced datasets. By establishing theoretical links between continual learning and class imbalance, we propose a novel continual learning-based framework (CLTR), achieving state-of-the-art performance on benchmark datasets and demonstrating significant improvements in underrepresented classes.
Domain-level generalization is examined within federated learning contexts, where privacy constraints exacerbate domain shifts across distributed datasets. We provide the first theoretical analysis of gradient alignment methods, uncovering the mechanisms by which they mitigate domain shifts. This theoretical foundation facilitates the design of improved domain-generalization methods for privacy-sensitive federated environments.
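The gradient alignment methods analyzed above compare per-client gradients and favor update directions on which clients agree. As a minimal illustration (not the thesis's analyzed algorithm), the sketch below measures cosine alignment between two client gradients and applies a sign-agreement mask in the style of AND-mask / agreement-based methods; the function names are assumptions.

```python
import numpy as np

def cosine_alignment(g_i, g_j):
    """Cosine similarity between two flattened client gradients."""
    denom = np.linalg.norm(g_i) * np.linalg.norm(g_j) + 1e-12
    return float(np.dot(g_i, g_j) / denom)

def aligned_update(grads):
    """Average client gradients, keeping only coordinates whose sign is
    unanimous across clients (zeroing out conflicting directions)."""
    G = np.stack(grads)                                  # (num_clients, dim)
    unanimous = np.abs(np.sign(G).sum(axis=0)) == len(grads)
    return G.mean(axis=0) * unanimous
```

Intuitively, coordinates where client gradients conflict often encode domain-specific (non-generalizing) signal, so suppressing them biases the shared model toward domain-invariant features.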
Finally, representation-level generalization is addressed by targeting spurious correlations that impede fairness and robustness in Vision-Language Models (VLMs). We propose PRISM, a novel, data-free method leveraging Large Language Models to detect and counteract implicit biases. Using a contrastive embedding transformation, PRISM significantly enhances fairness and accuracy in zero-shot classification tasks, surpassing existing debiasing techniques.
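PRISM's transformation is not specified in this abstract; a common primitive underlying embedding-level debiasing, shown here purely for intuition, is projecting image/text embeddings onto the orthogonal complement of an estimated spurious direction (e.g. one derived from LLM-generated biased vs. unbiased prompt pairs). The function name and interface below are illustrative assumptions.

```python
import numpy as np

def project_out(embeddings, bias_dir):
    """Remove the component of each embedding along a spurious direction.

    embeddings: (n, d) array; bias_dir: (d,) direction estimated to encode
    the spurious attribute. Returns embeddings orthogonal to that direction.
    """
    b = bias_dir / np.linalg.norm(bias_dir)          # unit-normalize
    return embeddings - np.outer(embeddings @ b, b)  # subtract projection
```

After such a projection, the zero-shot classifier can no longer exploit the removed direction, which is one way spurious correlations get suppressed at the representation level.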
Together, the methodologies and theoretical insights developed in this thesis substantially advance our understanding and practical approaches to improving generalization at multiple levels, paving the way for more robust, reliable, and fair machine learning systems in real-world applications.
License
Queen's University's Thesis/Dissertation Non-Exclusive License for Deposit to QSpace and Library and Archives Canada
Intellectual Property Guidelines at Queen's University
Copying and Preserving Your Thesis
This publication is made available by the authority of the copyright owner solely for the purpose of private study and research and may not be copied or reproduced except as permitted by the copyright laws without written authority from the copyright owner.
Attribution-NonCommercial-NoDerivatives 4.0 International
