IEEE TRANSACTIONS ON
AUDIO, SPEECH, AND LANGUAGE PROCESSING
A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY
EDICS
| NAME | DESCRIPTION |
|---|---|
| AUDIO AND ELECTROACOUSTICS | |
|---|---|
| AUD-ROOM | Room Acoustics and Acoustic System Modeling Room acoustics and acoustic system modeling; room response measurement, modeling, simulation and compensation; architectural and physical acoustics; physical modeling of musical instruments; room acoustics for music performance and reproduction. |
| AUD-TRAN | Transducers Transducer modeling and design; transducer calibration and compensation; novel transducers. |
| AUD-LMAP | Loudspeaker and Microphone Array Signal Processing Far-field and near-field beamforming and array processing; source localization and tracking; time-delay estimation; audio enhancement using transducer arrays; wavefield synthesis; sound field analysis and synthesis. |
| AUD-ANCO | Active Noise Control Acoustic noise cancellation and suppression; adaptive techniques for feedforward control; feedback control algorithms; multichannel systems. |
| AUD-ECHO | Echo Cancellation Single-channel and multichannel acoustic echo cancellation; echo path estimation and modeling; echo suppression and dereverberation; double-talk detection; adaptive filter theory for audio applications. |
| AUD-AUDI | Auditory Modeling and Hearing Aids Human audition and psychoacoustics; computational auditory scene analysis; perceptual and psychophysical models of audio algorithms and systems; perceptual measures of audio quality; aids for the handicapped; medical aids (cochlear implants, hearing aids); binaural hearing. |
| AUD-SSEN | Audio Source Separation and Enhancement Single-channel and multichannel source separation; blind deconvolution; noise reduction, compensation, and equalization; audio denoising and restoration. |
| AUD-SMCA | Spatial and Multichannel Audio Spatial sound analysis and reproduction; spatialization and virtualization; measurement, modeling, and use of head-related transfer functions; crosstalk cancellation and binaural synthesis; artificial reverberation algorithms. |
| AUD-ACOD | Audio Coding Low bit-rate and high-quality audio coding; scalable and lossless audio coding; spatial audio coding; joint source-channel coding; signal representations for coding; parametric and structured audio coding; psychoacoustic models for coding; objective and subjective quality assessment; error detection, correction, and concealment. |
| AUD-ANSY | Audio Analysis and Synthesis Music analysis, modification, and synthesis; models and representations for musical signals; pitch and multi-pitch estimation; audio feature analysis and extraction; melody, note, chord, key, and rhythm estimation and detection; automatic transcription. |
| AUD-CONT | Content-Based Music Processing discrimination; audio characterization, classification, and categorization; music thumbnailing and automatic summarization; music fingerprinting; music information retrieval; data mining. |
| AUD-AUMM | Audio for Multimedia Audio watermarking and data hiding; data encryption, security, and privacy; digital rights management; joint processing of audio and video; human-machine audio interfaces; auditory displays; distant learning and virtual reality. |
| AUD-NWAU | Network Audio Audio processing for network distribution; packet loss detection, correction, and concealment; network audio quality assessment; distributed processing of audio. |
| AUD-SYST | Audio Processing Systems Hardware and software systems and implementations; consumer and professional audio. |
| SPEECH PROCESSING | |
|---|---|
| SPE-SPRD | Speech Production Physical models of the vocal production system; bioacoustics and medical acoustics; singing and properties of the musical voice. |
| SPE-SPER | Speech Perception and Psychoacoustics Models of speech perception; hearing and psychoacoustics; physiological models and applications thereof; audiology applications. |
| SPE-ANLS | Speech Analysis Spectral and other time-frequency analysis techniques; segmental and suprasegmental analysis; distortion measures; extraction of non-linguistic information (e.g., gender, stress, etc.); voice/speech disorders; speaker localization (space) (e.g., in meetings); speaker diarization (time) (e.g., in meetings); speaker clustering (e.g., in broadcast news). |
| SPE-SYNT | Speech Synthesis and Generation Segmental-level and/or concatenative synthesis; signal processing/statistical model for synthesis; articulatory synthesis; parametric synthesis; prosody, emotional, and expressive synthesis; text-to-phoneme conversion; voice quality/morphing; audio/visual speech synthesis; multilingual synthesis; quality assessment/evaluation metrics in synthesis; tools and data for speech synthesis; text processing for speech synthesis (text normalization, syntactic and semantic analysis). |
| SPE-CODI | Speech Coding Narrow-band and wide-band speech coding; theory and techniques for signal coding (e.g., waveform, transform); modulation and source/channel coding; quantization and compression; robust coding for noisy channels; coding for Voice Over IP (VOIP); quality assessment/evaluation metrics (e.g., PESQ) in coding; new applications of VOIP. |
| SPE-ENHA | Speech Enhancement non-noisy speech; speech enhancement for humans with hearing impairments; non-acoustic microphones for enhancement; bandwidth expansion; noise reduction. |
| SPE-RECO | Acoustic Modeling for Automatic Speech Recognition Acoustic feature extraction; low-level feature modeling - Gaussians & beyond; statistical and neural network models, deep learning models, pronunciation modeling; state clustering and novel state definitions; prosody and other speech characteristics; dialect, accent, and idiolect at the acoustic level; discriminative acoustic training methods for ASR; articulatory and physiological modeling; non-acoustic microphones for ASR; feature transformation and normalization. |
| SPE-ROBU | Robust Speech Recognition Acoustic features specifically for robust ASR (noise, channel, etc.); model/backend based robust ASR; confidence measures and rejection; speech activity/end-point detection; barge-in. |
| SPE-ADAP | Speech Adaptation/Normalization Speaker adaptation and normalization (e.g., VTLN); speaker adapted training methods; environmental/channel adaptation; idiolect adaptation; register and/or dialect adaptation. |
| SPE-GASR | General Topics in Speech Recognition Distributed speech recognition - client/server methods; alternative statistical/machine learning methods (e.g., no HMMs); word spotting; metadata (e.g., emotion, speaker, accent) extraction from acoustics; new algorithms, computational strategies, data-structures for ASR; multi-modal (such as audio-visual) speech recognition; corpora, annotation, and other resources; algorithm approximation methods in ASR; structured classification approaches. |
| SPE-MULT | Multilingual Recognition and Identification Language-type and dialect identification; multilingual speech recognition and spoken language processing; processing of non-native accents; mixed-code speech recognition and understanding; low resource and rare language processing. |
| SPE-LEXI | Lexical Modeling and Access Pronunciation modeling at the lexical level; dialect, accent, and idiolect at the lexical level; multilingual aspects (e.g., unit selection); automatic lexicon learning. |
| SPE-LVCR | Large Vocabulary Continuous Recognition/Search Decoding algorithms and implementation; lattices; multi-pass strategies; miscellaneous topics. |
| SPE-SPKR | Speaker Recognition and Characterization Features and characteristics for speaker recognition; robustness to variable and degraded channels; verification, identification, segmentation, and clustering; speaker characterization and adaptation; speaker recognition with speech recognition; speaker confidence estimation; multimodal and multimedia human speaker recognition; corpora, annotation, evaluation, and other resources; higher-level knowledge in speaker recognition. |
| SPE-RCSR | Resource Constrained Speech Recognition Low-power speech recognition; reduced computation speech recognition; ASR techniques for highly portable/mobile devices. |
| SPOKEN LANGUAGE PROCESSING | |
|---|---|
| SLP-UNDE | Spoken Language Understanding Paralinguistic (emotion, age, gender, rate, etc.) information; Nonlinguistic (meaning external to language) information, gestures, etc.; Semantic classification; Question/answering from speech; Entity extraction from speech; Spoken document summarization; Detecting linguistic/discourse structure (e.g., disfluencies, sentence/topic boundaries, speech acts); Relation to and interpretation of sign language |
| SLP-LADL | Human Spoken Language Acquisition, Development and Learning Human Spoken Language Acquisition, Development and Learning; Language acquisition, development, and learning models; Computer aids for language learning; Attributes and modeling techniques for assessment of language fluency |
| SLP-SSMD | Spoken and Multimodal Dialog Systems and Applications Spoken and multimodal dialog systems, applications, and architectures; Stochastic learning for dialog modeling; Response generation; Technologies for the aged; Evaluations and standardizations; Speech/voice-based human-computer interfaces (HCI); Speech HCI for individuals with impairments and universal access (UA); Other applications |
| SLP-SMIR | Speech Data Mining and Document Retrieval Analysis and evaluations for mining spoken data; Search/retrieval of speech documents; Mining heterogeneous speech and multimedia data; Speech data mining theory, algorithms, and methods; Core machine learning algorithms for data mining; Topic spotting and classification; Pattern discovery and prediction from data; Applications and tools for speech data mining |
| SLP-SMMT | Machine Translation of Speech Semi-automatic and data driven methods; Speech processing for MTS; Corpora, annotation, and other resources; Interlingua and transfer approaches; Integration of speech and linguistic processing; Machine transliteration for named entity; Evaluation metrics (e.g., BLEU); Systems and applications for MTS |
| SLP-LANG | Language Modeling (for Speech and SLP) N-grams, their generalizations and smoothing methods; Language-model adaptation; Grammar-based language modeling; Maxent and feature-based language modeling; Dialect, accent, and idiolect at the language level; Discriminative LM training methods; Other approaches to LMs; Structured classification approaches |
| SLP-REAN | Spoken Language Resources and Annotation General corpora, annotation, and other resources |



