Advancing Generalization in Deep Representation Learning


Authors

Molahasani Majdabadi, Mahdiyar

Date

2025-05-29

Type

thesis

Language

eng

Keyword

Generalization in deep learning, Sample-level generalization, Class-level generalization, Representation-level generalization, Domain-level generalization, Continual learning, Long-tailed recognition, Federated learning, Spurious correlations, Vision-Language models

Abstract

Generalization, the ability of machine learning models to reliably apply learned knowledge to unseen scenarios, remains a central challenge in deploying AI systems effectively in dynamic, real-world environments. This thesis systematically addresses generalization challenges in computer vision across four distinct yet interconnected levels: sample-level, class-level, domain-level, and representation-level generalization. First, we address sample-level generalization, which is critical when test samples deviate from the training distribution, through a novel continual learning adaptation of Elastic Weight Consolidation tailored explicitly to pedestrian detection. Our method achieves robust performance across varying data distributions without catastrophic forgetting. Second, we investigate class-level generalization, tackling the long-tailed recognition problem caused by imbalanced datasets. By establishing theoretical links between continual learning and class imbalance, we propose a continual learning-based framework (CLTR) that achieves state-of-the-art performance on benchmark datasets and significant improvements on underrepresented classes. Third, domain-level generalization is examined in federated learning, where privacy constraints exacerbate domain shifts across distributed datasets. We provide the first theoretical analysis of gradient alignment methods, uncovering the mechanisms by which they mitigate domain shift; this foundation facilitates the design of improved domain-generalization methods for privacy-sensitive federated environments. Finally, representation-level generalization is addressed by targeting spurious correlations that impede fairness and robustness in Vision-Language Models (VLMs). We propose PRISM, a novel data-free method that leverages Large Language Models to detect and counteract implicit biases. Using a novel contrastive embedding transformation, PRISM significantly improves fairness and accuracy in zero-shot classification, surpassing existing debiasing techniques. Together, the methodologies and theoretical insights developed in this thesis substantially advance our understanding of, and practical approaches to, generalization at multiple levels, paving the way for more robust, reliable, and fair machine learning systems in real-world applications.
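The Elastic Weight Consolidation adaptation mentioned in the abstract builds on the standard EWC regularizer, which anchors parameters that were important for earlier data near their previous values, weighted by an estimate of the Fisher information. A minimal sketch of that regularizer follows; the function names and NumPy formulation are illustrative assumptions, not the thesis's actual implementation.

```python
import numpy as np

def ewc_penalty(theta, theta_star, fisher, lam=1.0):
    """EWC regularizer: penalizes drift of the current parameters `theta`
    away from the old-task optimum `theta_star`, weighted per-parameter
    by the Fisher information estimate `fisher` and strength `lam`."""
    return 0.5 * lam * np.sum(fisher * (theta - theta_star) ** 2)

def total_loss(task_loss, theta, theta_star, fisher, lam=1.0):
    """Loss on the new data plus the EWC penalty that guards against
    catastrophic forgetting of the earlier distribution."""
    return task_loss + ewc_penalty(theta, theta_star, fisher, lam)
```

Parameters with high Fisher weight are held close to their old values, while low-weight parameters remain free to adapt to the new distribution, which is the mechanism the abstract credits for avoiding catastrophic forgetting.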

License

Queen's University's Thesis/Dissertation Non-Exclusive License for Deposit to QSpace and Library and Archives Canada
This publication is made available by the authority of the copyright owner solely for the purpose of private study and research and may not be copied or reproduced except as permitted by the copyright laws without written authority from the copyright owner.
Attribution-NonCommercial-NoDerivatives 4.0 International
