IEEE TRANSACTIONS ON
AUDIO, SPEECH, AND LANGUAGE PROCESSING
A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY
EDICS
| SPEECH PROCESSING | |
|---|---|
| SPE-SPRD | Speech Production Physical models of the vocal production system; bioacoustics and medical acoustics; singing and properties of the musical voice. |
| SPE-SPER | Speech Perception and Psychoacoustics Models of Speech Perception; hearing and psychoacoustics; physiological models and applications thereof; audiology applications. |
| SPE-ANLS | Speech Analysis Spectral and other time-frequency analysis techniques; segmental and suprasegmental analysis; distortion measures; extraction of non-linguistic information (e.g., gender, stress, etc.); voice/speech disorders; speaker localization (space) (e.g., in meetings); speaker diarization (time) (e.g., in meetings); speaker clustering (e.g., in broadcast news). |
| SPE-SYNT | Speech Synthesis and Generation Segmental-level and/or concatenative synthesis; signal processing/statistical model for synthesis; articulatory synthesis; parametric synthesis; prosody, emotional, and expressive synthesis; text-to-phoneme conversion; voice quality/morphing; audio/visual speech synthesis; multilingual synthesis; quality assessment/evaluation metrics in synthesis; tools and data for speech synthesis; text processing for speech synthesis (text normalization, syntactic and semantic analysis). |
| SPE-CODI | Speech Coding Narrow-band and wide-band speech coding; theory and techniques for signal coding (e.g., waveform, transform); modulation and source/channel coding; quantization and compression; robust coding for noisy channels; coding for Voice Over IP (VOIP); quality assessment/evaluation metrics (e.g., PESQ) in coding; new applications of VOIP. |
| SPE-ENHA | Speech Enhancement non-noisy speech; speech enhancement for humans with hearing impairments; non-acoustic microphones for enhancement; bandwidth expansion; noise reduction. |
| SPE-RECO | Acoustic Modeling for Automatic Speech Recognition Acoustic feature extraction; low-level feature modeling - Gaussians & beyond; statistical and neural network models, deep learning models, pronunciation modeling; state clustering and novel state definitions; prosody and other speech characteristics; dialect, accent, and idiolect at the acoustic level; discriminative acoustic training methods for ASR; articulatory and physiological modeling; non-acoustic microphones for ASR; feature transformation and normalization. |
| SPE-ROBU | Robust Speech Recognition Acoustic features specifically for robust ASR (noise, channel, etc.); model/backend based robust ASR; confidence measures and rejection; speech activity/end-point detection; barge-in. |
| SPE-ADAP | Speech Adaptation/Normalization Speaker adaptation and normalization (e.g., VTLN); speaker adapted training methods; environmental/channel adaptation; idiolect adaptation; register and/or dialect adaptation. |
| SPE-GASR | General Topics in Speech Recognition Distributed Speech Recognition - Client/Server methods; alternative Statistical/Machine Learning Methods (e.g., no HMMs); word spotting; metadata (e.g., emotion, speaker, accent) extraction from acoustics; new algorithms, computational strategies, data-structures for ASR; multi-modal (such as audio-visual) speech recognition; corpora, annotation, and other resources; algorithm approximation methods in ASR; structured classification approaches. |
| SPE-MULT | Multilingual Recognition and Identification Language-type and dialect identification; multilingual speech recognition and spoken language processing; processing of non-native accents; mixed-code speech recognition and understanding; low resource and rare language processing. |
| SPE-LEXI | Lexical Modeling and Access Pronunciation modeling at the lexical level; dialect, accent, and idiolect at the lexical level; multilingual aspects (e.g., unit selection); automatic lexicon learning. |
| SPE-LVCR | Large Vocabulary Continuous Recognition/Search Decoding algorithms and implementation; lattices; multi-pass strategies; miscellaneous topics. |
| SPE-SPKR | Speaker Recognition and Characterization Features and characteristics for speaker recognition; robustness to variable and degraded channels; verification, identification, segmentation, and clustering; speaker characterization and adaptation; speaker recognition with speech recognition; speaker confidence estimation; multimodal and multimedia human speaker recognition; corpora, annotation, evaluation, and other resources; higher-level knowledge in speaker recognition. |
| SPE-RCSR | Resource Constrained Speech Recognition Low-power speech recognition; reduced computation speech recognition; ASR techniques for highly portable/mobile devices. |
| HUMAN LANGUAGE TECHNOLOGY | |
|---|---|
| HLT-LANG | Language Modelling N-grams, their generalizations and smoothing methods; language model adaptation; grammar-based, structured language modelling; discriminative, maximum-entropy and feature-based language modelling; computational phonology and phonetics; dialect, accent, and idiolect at the language level. |
| HLT-MTSW | Machine Translation for Spoken and Written Language Example/phrase/syntax/semantics-based machine translation; hybrid machine translation; word/sentence/document alignments; synchronous grammar induction; decoding; system combination; post-editing; machine transliteration and transcription; spoken language translation; speech processing for machine translation. |
| HLT-UNDE | Spoken Language Understanding and Computational Semantics Spoken language understanding; paralinguistic (emotion, age, gender, etc.), non-linguistic (gesture, sign, etc.) information processing; semantic role labelling, multiword expressions; word sense disambiguation, representation of meaning; lexical semantics; distributional semantics; text entailment; ontology. |
| HLT-DIAL | Discourse and Dialog Learning of linguistic/discourse structure (e.g., disfluencies, sentence/topic boundaries, speech acts); co-reference and anaphora resolution; dialog management/generation/analysis; semantic analysis for discourse and dialog; intent determination; dialog act tagging. |
| HLT-SDTM | Spoken Document Retrieval and Text Mining Spoken document retrieval; linguistic pattern discovery and prediction from data; spoken term detection; named entity recognition; question answering; document summarization and generation; spoken document summarization; information extraction and retrieval; subjectivity and sentiment analysis; text and spoken document classification; spam detection; topic detection and tracking; trend detection. |
| HLT-STPA | Segmentation, Tagging, and Parsing Morphology analysis; word segmentation; part-of-speech tagging, chunking and supertagging; models and algorithms for parsing; grammar induction; dependency parsing; multilingual parsing. |
| HLT-HLLI | Human Language Learning and Interface Language acquisition, development, and learning models; computer aids for language learning; assessment of language fluency; human computer interface; assistive technology for the aged, universal access and individuals with impairments. |
| HLT-MLMD | Machine Learning Methods Supervised, unsupervised, semi-supervised learning; statistical methods; symbolic learning methods; biologically inspired and neural networks; reinforcement learning; active learning; online learning; deep learning; recursive and structured models, graphical and latent variable models; kernel methods; domain adaptation. |
| HLT-LRSE | Language Resources and System Evaluation Annotation and evaluation of corpora; linguistic resources development methodologies, standards, tools and evaluations; crowd-sourcing; evaluations, systems and applications of human language technology. |
| AUDIO AND ELECTROACOUSTICS | |
|---|---|
| AUD-ROOM | Room Acoustics and Acoustic System Modeling Room acoustics and acoustic system modeling; room response measurement, modeling, simulation and compensation; architectural and physical acoustics; physical modeling of musical instruments; room acoustics for music performance and reproduction. |
| AUD-TRAN | Transducers Transducer modeling and design; transducer calibration and compensation; novel transducers. |
| AUD-LMAP | Loudspeaker and Microphone Array Signal Processing Far-field and near-field beamforming and array processing; source localization and tracking; time-delay estimation; audio enhancement using transducer arrays; wavefield synthesis; sound field analysis and synthesis. |
| AUD-ANCO | Active Noise Control Acoustic noise cancellation and suppression; adaptive techniques for feedforward control; feedback control algorithms; multichannel systems. |
| AUD-ECHO | Echo Cancellation Single-channel and multichannel acoustic echo cancellation; echo path estimation and modeling; echo suppression and dereverberation; double-talk detection; adaptive filter theory for audio applications. |
| AUD-AUDI | Auditory Modeling and Hearing Aids Human audition and psychoacoustics; computational auditory scene analysis; perceptual and psychophysical models of audio algorithms and systems; perceptual measures of audio quality; aids for the handicapped; medical aids (cochlear implants, hearing aids); binaural hearing. |
| AUD-SSEN | Audio Source Separation and Enhancement Single-channel and multichannel source separation; blind deconvolution; noise reduction, compensation, and equalization; audio denoising and restoration. |
| AUD-SMCA | Spatial and Multichannel Audio Spatial sound analysis and reproduction; spatialization and virtualization; measurement, modeling, and use of head-related transfer functions; crosstalk cancellation and binaural synthesis; artificial reverberation algorithms. |
| AUD-ACOD | Audio Coding Low bit-rate and high-quality audio coding; scalable and lossless audio coding; spatial audio coding; joint source-channel coding; signal representations for coding; parametric and structured audio coding; psychoacoustic models for coding; objective and subjective quality assessment; error detection, correction, and concealment. |
| AUD-ANSY | Audio Analysis and Synthesis Music analysis, modification, and synthesis; models and representations for musical signals; pitch and multi-pitch estimation; audio feature analysis and extraction; melody, note, chord, key, and rhythm estimation and detection; automatic transcription. |
| AUD-CONT | Content-Based Music Processing discrimination; audio characterization, classification, and categorization; music thumbnailing and automatic summarization; music fingerprinting; music information retrieval; data mining. |
| AUD-AUMM | Audio for Multimedia Audio watermarking and data hiding; data encryption, security, and privacy; digital rights management; joint processing of audio and video; human-machine audio interfaces; auditory displays; distant learning and virtual reality. |
| AUD-NWAU | Network Audio Audio processing for network distribution; packet loss detection, correction, and concealment; network audio quality assessment; distributed processing of audio. |
| AUD-SYST | Audio Processing Systems Hardware and software systems and implementations; consumer and professional audio. |



