  Man-machine speech communication
    17th National Conference, NCMMSC 2022, Hefei, China, December 15-18, 2022, proceedings
    Contributor: Jia, Jia (editor); Jianqing, Gao (editor); Yu, Kai (editor); Zhenhua, Ling (editor)
    Published: [2023]
    Publisher:  Springer, Singapore


    Technische Informationsbibliothek (TIB) / Leibniz-Informationszentrum Technik und Naturwissenschaften und Universitätsbibliothek
    RS 7445(1765)
    Volumes are not available for loan; only paper copies will be sent

     

    This book constitutes the refereed proceedings of the 17th National Conference on Man-Machine Speech Communication, NCMMSC 2022, held in Hefei, China, in December 2022. The 21 full papers and 7 short papers included in this book were carefully reviewed and selected from 108 submissions; the individual contributions are listed under Notes below.

     

    Source: Union catalogues
    Language: English
    Media type: Conference proceedings
    Format: Print
    ISBN: 9789819924004
    Corporations / Congresses: NCMMSC, 17th (2022, Hefei)
    Series: Communications in computer and information science ; 1765
    Subjects: Artificial intelligence; COMPUTERS / Artificial Intelligence; COMPUTERS / Computer Graphics / General; COMPUTERS / Data Processing / Speech & Audio Processing; COMPUTERS / User Interfaces; Digital signal processing (DSP); Electronics; Human-computer interaction; Image processing; Imaging systems & technology; Machine vision, image understanding; Natural language & machine translation; TECHNOLOGY & ENGINEERING / Electronics / General; User interface design & usability
    Scope: xi, 332 pages, illustrations, diagrams
    Notes:

    MCPN: A Multiple Cross-Perception Network for Real-Time Emotion Recognition in Conversation
    Baby Cry Recognition Based on Acoustic Segment Model
    A Multi-feature Sets Fusion Strategy with Similar Samples Removal for Snore Sound Classification
    Multi-Hypergraph Neural Networks for Emotion Recognition in Multi-Party Conversations
    Using Emoji as an Emotion Modality in Text-Based Depression Detection
    Source-Filter-Based Generative Adversarial Neural Vocoder for High Fidelity Speech Synthesis
    Semantic enhancement framework for robust speech recognition
    Achieving Timestamp Prediction While Recognizing with Non-Autoregressive End-to-End ASR Model
    Predictive AutoEncoders are Context-Aware Unsupervised Anomalous Sound Detectors
    A pipelined framework with serialized output training for overlapping speech recognition
    Adversarial Training Based on Meta-Learning in Unseen Domains for Speaker Verification
    Multi-Speaker Multi-Style Speech Synthesis with Timbre and Style Disentanglement
    Multiple Confidence Gates for Joint Training of SE and ASR
    Detecting Escalation Level from Speech with Transfer Learning and Acoustic-Linguistic Information Fusion
    Pre-training Techniques For Improving Text-to-Speech Synthesis By Automatic Speech Recognition Based Data Enhancement
    A Time-Frequency Attention Mechanism with Subsidiary Information for Effective Speech Emotion Recognition
    Interplay between prosody and syntax-semantics: Evidence from the prosodic features of Mandarin tag questions
    Improving Fine-grained Emotion Control and Transfer with Gated Emotion Representations in Speech Synthesis
    Violence Detection through Fusing Visual Information to Auditory Scene
    Mongolian Text-to-Speech Challenge under Low-Resource Scenario for NCMMSC2022
    VC-AUG: Voice Conversion based Data Augmentation for Text-Dependent Speaker Verification
    Transformer-based potential emotional relation mining network for emotion recognition in conversation
    FastFoley: Non-Autoregressive Foley Sound Generation Based On Visual Semantics
    Structured Hierarchical Dialogue Policy with Graph Neural Networks
    Deep Reinforcement Learning for On-line Dialogue State Tracking
    Dual Learning for Dialogue State Tracking
    Automatic Stress Annotation and Prediction For Expressive Mandarin TTS
    MnTTS2: An Open-Source Multi-Speaker Mongolian Text-to-Speech Synthesis Dataset
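
    For citation, the metadata above can be assembled into a BibTeX @proceedings entry along the following lines; this is a sketch, where the citation key is illustrative, editor names are assumed to be in family-first order, and the isbn field is a common extension supported by biblatex and most reference managers rather than a core BibTeX field.

    @proceedings{ncmmsc2022,
      title     = {Man-Machine Speech Communication: 17th National Conference, NCMMSC 2022, Hefei, China, December 15--18, 2022, Proceedings},
      editor    = {Jia, Jia and Gao, Jianqing and Yu, Kai and Ling, Zhenhua},
      year      = {2023},
      publisher = {Springer},
      address   = {Singapore},
      series    = {Communications in Computer and Information Science},
      volume    = {1765},
      isbn      = {9789819924004}
    }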