Neuro-Symbolic Methods for Natural Language Inference and Question Answering
Deep Learning, Neuro-Symbolic Method, Natural Language Processing, Machine Learning, Natural Language Inference, Question Answering, Explainable AI
One of the fundamental problems in deep learning research is how to design neural network models that incorporate logic and symbolic operations. Although deep neural network models have achieved state-of-the-art performance on multiple natural language processing benchmarks, these black-box models can hardly explain their inner mechanisms. They also lack the human ability to reason systematically, and they generalize poorly to out-of-distribution samples. In this thesis, we attempt to overcome these limitations by designing neuro-symbolic models that combine neural networks with natural logic theory. At the lower level, we use neural networks to model text representations and produce intermediate predictions; at the higher level, we leverage symbolic operations to perform reasoning, which leads to the final prediction. We apply our neuro-symbolic models to the natural language inference (NLI) task, and we also explore ways to extend our method to question answering (QA). This thesis offers a set of contributions that address the problem of effectively combining deep neural networks with symbolic methods. The first contribution is a novel end-to-end differentiable natural logic model for NLI. The proposed model achieves empirically competitive results on the Stanford NLI benchmark and multiple stress-test datasets, and it provides faithful explanations for its decisions based on natural logic. The second contribution is a novel neuro-symbolic NLI model that overcomes the limitations of its predecessor by leveraging a well-designed natural logic program and reinforcement learning. We also propose an introspective revision algorithm that incorporates commonsense knowledge bases to alleviate the spurious reasoning problem and improve training efficiency. The third contribution is an extension of the neuro-symbolic method to multi-hop QA. We propose a model that accurately locates chains of useful evidence and can be trained without direct supervision, together with a neuro-symbolic QA model that performs natural-logic-style reasoning on those chains of evidence.