Show simple item record

dc.contributor.authorFalk, Tiago
dc.contributor.otherQueen's University (Kingston, Ont.). Theses (Queen's University (Kingston, Ont.))en
dc.date2008-12-22 14:54:49.28en
dc.date.accessioned2009-01-05T22:08:01Z
dc.date.available2009-01-05T22:08:01Z
dc.date.issued2009-01-05T22:08:01Z
dc.identifier.urihttp://hdl.handle.net/1974/1642
dc.descriptionThesis (Ph.D, Electrical & Computer Engineering) -- Queen's University, 2008-12-22 14:54:49.28en
dc.description.abstractModern speech communication technologies expose users to perceptual quality degradations that were not experienced earlier with conventional telephone systems. Since perceived speech quality is a major contributor to the end user's perception of quality of service, speech quality estimation has become an important research field. In this dissertation, perceptual quality estimators are proposed for several emerging speech communication applications, in particular for i) wireless communications with noise suppression capabilities, ii) wireless-VoIP communications, iii) far-field hands-free speech communications, and iv) text-to-speech systems. First, a general-purpose speech quality estimator is proposed based on statistical models of normative speech behaviour and on innovative techniques to detect multiple signal distortions. The estimators do not depend on a clean reference signal hence are termed ``blind." Quality meters are then distributed along the network chain to allow for both quality degradations and quality enhancements to be handled. In order to improve estimation performance for wireless communications, statistical models of noise-suppressed speech are also incorporated. Next, a hybrid signal-and-link-parametric quality estimation paradigm is proposed for emerging wireless-VoIP communications. The algorithm uses VoIP connection parameters to estimate a base quality representative of the packet switching network. Signal-based distortions are then detected and quantified in order to adjust the base quality accordingly. The proposed hybrid methodology is shown to overcome the limitations of existing pure signal-based and pure link parametric algorithms. Temporal dynamics information is then investigated for quality diagnosis for hands-free speech communications. A spectro-temporal signal representation, where speech and reverberation tail components are shown to be separable, is used for blind characterization of room acoustics. In particular, estimators of reverberation time, direct-to-reverberation energy ratio, and reverberant speech quality are developed. Lastly, perceptual quality estimation for text-to-speech systems is addressed. Text- and speaker-independent hidden Markov models, trained on naturally produced speech, are used to capture normative spectral-temporal information. Deviations from the models, computed by means of a log-likelihood measure, are shown to be reliable indicators of multiple quality attributes including naturalness, fluency, and intelligibility.en
dc.format.extent1412501 bytes
dc.format.mimetypeapplication/pdf
dc.languageenen
dc.language.isoenen
dc.relation.ispartofseriesCanadian thesesen
dc.rightsThis publication is made available by the authority of the copyright owner solely for the purpose of private study and research and may not be copied or reproduced except as permitted by the copyright laws without written authority from the copyright owner.en
dc.subjectQuality Estimationen
dc.subjectGaussian Mixture Modelsen
dc.subjectHidden Markov Modelen
dc.subjectModulation Spectrumen
dc.subjectWireless Communicationsen
dc.subjectWireless-VoIPen
dc.subjectReverberationen
dc.subjectText-to-Speechen
dc.titleBlind Estimation of Perceptual Quality for Modern Speech Communicationsen
dc.typethesisen
dc.description.degreePh.Den
dc.contributor.supervisorChan, Wai-Yip Geoffreyen
dc.contributor.departmentElectrical and Computer Engineeringen


Files in this item

Thumbnail

This item appears in the following Collection(s)

Show simple item record