* indicates that this line can be assigned as a paper's topic.
| 1: | Audio and Acoustic Signal Processing | ||
| 1.1*: | Room Acoustics and Acoustic System Modeling | ||
| 1.2*: | Transducers | ||
| 1.3*: | Loudspeaker and Microphone Array Signal Processing | ||
| 1.4*: | Active Noise Control | ||
| 1.5*: | Echo Cancellation | ||
| 1.6*: | Auditory Modeling and Hearing Aids | ||
| 1.7*: | Source Separation and Signal Enhancement | ||
| 1.8*: | Spatial and Multichannel Audio | ||
| 1.9*: | Audio Coding | ||
| 1.10*: | Audio Analysis and Synthesis | ||
| 1.11*: | Content-Based Audio Processing | ||
| 1.12*: | Audio for Multimedia | ||
| 1.13*: | Network Audio | ||
| 1.14*: | Audio Processing Systems | ||
| 1.15*: | Bioacoustics and Medical Acoustics | ||
| 1.16*: | Music Signal Processing | ||
| 2: | Bio Imaging and Signal Processing | ||
| 2.1: | Medical imaging | ||
| 2.1.1*: | Image formation | ||
| 2.1.2*: | Reconstruction and restoration | ||
| 2.1.3*: | Computed tomography (CT, PET or SPECT) | ||
| 2.1.4*: | Biomedical Imaging | ||
| 2.1.5*: | Magnetic resonance imaging | ||
| 2.1.6*: | Ultrasound imaging | ||
| 2.2: | Medical image analysis | ||
| 2.2.1*: | Segmentation | ||
| 2.2.2*: | Registration | ||
| 2.2.3*: | Feature extraction and classification | ||
| 2.3: | Bioimaging and microscopy | ||
| 2.3.1*: | Cellular and molecular imaging | ||
| 2.3.2*: | Deconvolution and inverse problems | ||
| 2.3.3*: | Segmentation and analysis | ||
| 2.3.4*: | Tracking and motion analysis | ||
| 2.4: | Biomedical signal processing | ||
| 2.4.1*: | Physiological signals (ECG, EEG, ...) | ||
| 2.4.2*: | Detection and estimation | ||
| 2.4.3*: | Feature extraction and classification | ||
| 2.4.4*: | Multi-channel processing | ||
| 2.5: | Bioinformatics | ||
| 2.5.1*: | Genomics and proteomics | ||
| 2.5.2*: | Computational biology and biological networks | ||
| 3: | Image, Video, and Multidimensional Signal Processing | ||
| 3.1: | Image/Video Coding | ||
| 3.1.1*: | Still Image Coding | ||
| 3.1.2*: | Video Coding | ||
| 3.1.3*: | Stereoscopic and 3-D Coding | ||
| 3.1.4*: | Distributed Source Coding | ||
| 3.1.5*: | Image/Video Transmission | ||
| 3.2: | Image/Video Processing | ||
| 3.2.1*: | Image Filtering | ||
| 3.2.2*: | Restoration | ||
| 3.2.3*: | Enhancement | ||
| 3.2.4*: | Image Segmentation | ||
| 3.2.5*: | Video Segmentation and Tracking | ||
| 3.2.6*: | Morphological Processing | ||
| 3.2.7*: | Stereoscopic and 3-D Processing | ||
| 3.2.8*: | Image Feature Extraction | ||
| 3.2.9*: | Image Analysis | ||
| 3.2.10*: | Video Feature Extraction | ||
| 3.2.11*: | Video Analysis | ||
| 3.2.12*: | Modeling | ||
| 3.2.13*: | Biometrics | ||
| 3.2.14*: | Interpolation and Super-resolution | ||
| 3.2.15*: | Motion Detection and Estimation | ||
| 3.3: | Image Formation | ||
| 3.3.1*: | Remote Sensing Imaging | ||
| 3.3.2*: | Geophysical and Seismic Imaging | ||
| 3.3.3*: | Optical Imaging | ||
| 3.3.4*: | Synthetic-Natural Hybrid Image Systems | ||
| 3.4: | Image Scanning, Display, and Printing | ||
| 3.4.1*: | Scanning and Sampling | ||
| 3.4.2*: | Quantization and Halftoning | ||
| 3.4.3*: | Color Reproduction | ||
| 3.4.4*: | Image Representation and Rendering | ||
| 3.4.5*: | Display and Printing Systems | ||
| 3.4.6*: | Image Quality Assessment | ||
| 3.5: | Image/Video Storage, Retrieval | ||
| 3.5.1*: | Image and Video Databases | ||
| 3.5.2*: | Image Indexing and Retrieval | ||
| 3.5.3*: | Video Indexing, Retrieval and Editing | ||
| 4: | Design and Implementation of Signal Processing Systems | ||
| 4.1*: | Algorithm and architecture co-optimization | ||
| 4.2*: | Compilers and tools for DSP implementation | ||
| 4.3*: | DSP algorithm implementation in hardware and software | ||
| 4.4*: | Low-power signal processing techniques and architectures | ||
| 4.5*: | Programmable and reconfigurable DSP architectures | ||
| 4.6*: | System-on-chip architectures for signal processing | ||
| 5: | Industry Technology Track | ||
| 5.1: | DSP Chips and Architectures | ||
| 5.1.1*: | Mixed Signal Processing | ||
| 5.1.2*: | Special-Purpose and FPGA DSPs | ||
| 5.1.3*: | Host-Based Signal Processing | ||
| 5.1.4*: | Multiprocessor Architectures | ||
| 5.2: | DSP Tools and Rapid Prototyping | ||
| 5.2.1*: | DSP Simulation Tools | ||
| 5.2.2*: | Rapid Prototyping and languages | ||
| 5.2.3*: | DSP Libraries | ||
| 5.2.4*: | Operating Systems | ||
| 5.3: | Communication Technologies | ||
| 5.3.1*: | Cellular and Satellite Telephony | ||
| 5.3.2*: | Data Communications and Networking | ||
| 5.3.3*: | Software-Defined Radios | ||
| 5.3.4*: | Vocoders | ||
| 5.3.5*: | Power Line Communication | ||
| 5.3.6*: | RFID | ||
| 5.4: | Speech Processing Applications | ||
| 5.4.1*: | Speaker Recognition | ||
| 5.4.2*: | Speech Compression | ||
| 5.4.3*: | Speech Enhancement | ||
| 5.4.4*: | Speech Recognition | ||
| 5.4.5*: | Speech Synthesis | ||
| 5.5: | Multimedia and DTV Technologies | ||
| 5.5.1*: | DSP Implementations of Music, Speech, and Audio | ||
| 5.5.2*: | Image and Video Applications | ||
| 5.5.3*: | Standards and Format Conversions | ||
| 5.5.4*: | Internet and Teleconferencing | ||
| 5.6: | Adaptive Interference Cancellation | ||
| 5.6.1*: | Smart Antennas | ||
| 5.6.2*: | Active Sound Reduction | ||
| 5.6.3*: | Acoustic and Electrical Noise and Echo Cancellation | ||
| 5.6.4*: | Hands-Free Telephony | ||
| 5.7: | Automotive Applications | ||
| 5.7.1*: | Intelligent Dashboards, Vehicles, and Highways (IVHS) | ||
| 5.7.2*: | Engine Management | ||
| 5.7.3*: | Route Planning and Tracking | ||
| 5.7.4*: | New Consumer Applications | ||
| 5.8: | Defense and Security Applications | ||
| 5.8.1*: | Optical Correlation | ||
| 5.8.2*: | Decluttering Target Identification and Tracking | ||
| 5.8.3*: | DSP-Based Cryptography, Steganography, and Watermarking | ||
| 5.8.4*: | Radar and Sonar | ||
| 5.9: | Emerging DSP Applications | ||
| 5.9.1*: | Biometrics | ||
| 5.9.2*: | Biomedical | ||
| 5.9.3*: | Power Systems and Motor Controls | ||
| 5.9.4*: | Machine Learning | ||
| 5.10*: | Other ITT Topics | ||
| 6: | Information Forensics and Security | ||
| 6.1: | Watermarking and Steganography | ||
| 6.1.1*: | Theoretical models | ||
| 6.1.2*: | Algorithms | ||
| 6.1.3*: | Benchmarking and security analysis | ||
| 6.1.4*: | Steganography and steganalysis | ||
| 6.2: | Multimedia Forensics | ||
| 6.2.1*: | Sensor and channel forensics | ||
| 6.2.2*: | Tamper detection | ||
| 6.2.3*: | Anti-forensics and countermeasures | ||
| 6.2.4*: | Plagiarism and near-duplicate detection | ||
| 6.2.5*: | Robust hashing | ||
| 6.3: | Biometrics | ||
| 6.3.1*: | Biometric methods and modalities | ||
| 6.3.2*: | Biometric security | ||
| 6.3.3*: | Performance and evaluation | ||
| 6.4: | Communications and Network Security | ||
| 6.4.1*: | Jamming and anti-jamming | ||
| 6.4.2*: | Covert or stealthy communication | ||
| 6.4.3*: | Secret key extraction from channels | ||
| 6.4.4*: | Information theoretic security | ||
| 6.4.5*: | Network attacks, protection and monitoring | ||
| 6.5: | Signal Processing and Cryptography | ||
| 6.5.1*: | Multimedia encryption | ||
| 6.5.2*: | Signal processing in the encrypted domain | ||
| 6.5.3*: | Traitor tracing codes | ||
| 6.5.4*: | Visual secret sharing | ||
| 6.5.5*: | Side channel attacks | ||
| 6.5.6*: | Privacy protection | ||
| 6.6: | Applications | ||
| 6.6.1*: | Surveillance | ||
| 6.6.2*: | Content protection, identification and monitoring | ||
| 6.6.3*: | Cloud and distributed computing systems | ||
| 6.6.4*: | Smart grid and power/energy systems | ||
| 6.6.5*: | Social media and network systems | ||
| 7: | Machine Learning for Signal Processing | ||
| 7.1*: | Other applications of machine learning (MLR-APPL) | ||
| 7.2*: | Bayesian learning; Bayesian signal processing (MLR-BAYL) | ||
| 7.3*: | Cognitive information processing (MLR-COGP) | ||
| 7.4*: | Distributed and Cooperative Learning (MLR-DIST) | ||
| 7.5*: | Applications in Data Fusion (MLR-FUSI) | ||
| 7.6*: | Graphical and kernel methods (MLR-GRKN) | ||
| 7.7*: | Independent component analysis (MLR-ICAN) | ||
| 7.8*: | Information-theoretic learning (MLR-INFO) | ||
| 7.9*: | Learning theory and algorithms (MLR-LEAR) | ||
| 7.10*: | Applications in Music and Audio Processing (MLR-MUSI) | ||
| 7.11*: | Neural network learning (MLR-NNLR) | ||
| 7.12*: | Pattern recognition and classification (MLR-PATT) | ||
| 7.13*: | Bounds on performance (MLR-PERF) | ||
| 7.14*: | Sequential learning; sequential decision methods (MLR-SLER) | ||
| 7.15*: | Source separation (MLR-SSEP) | ||
| 7.16*: | Applications in Systems Biology (MLR-SYSB) | ||
| 8: | Multimedia Signal Processing | ||
| 8.1: | Multimodal signal processing | ||
| 8.1.1*: | Joint processing/presentation of audio-visual information | ||
| 8.1.2*: | Synchronization of audio and visual data | ||
| 8.1.3*: | Fusion/fission of sensor information or multimodal data | ||
| 8.1.4*: | Integration of media, art, and multimedia technology | ||
| 8.2: | Virtual reality and 3D imaging | ||
| 8.2.1*: | 2D and 3D graphics/geometry coding and animation | ||
| 8.2.2*: | 3D audio and video processing | ||
| 8.2.3*: | Virtual reality and mixed-reality in networked environments | ||
| 8.3: | Multimedia communications and networking | ||
| 8.3.1*: | Wireless and mobile multimedia communication | ||
| 8.3.2*: | Media streaming, media content distribution, and storage | ||
| 8.3.3*: | Quality of service provisioning | ||
| 8.3.4*: | Cross-layer design for multimedia communication | ||
| 8.3.5*: | Overlay, peer-to-peer, and peer-assisted networking for multimedia | ||
| 8.3.6*: | Home networking for multimedia | ||
| 8.3.7*: | Location-aware multimedia computing | ||
| 8.3.8*: | Multimedia sensor and ad hoc networks | ||
| 8.3.9*: | Media compression and related standardization activities | ||
| 8.3.10*: | Multimedia watermarking | ||
| 8.3.11*: | Distributed source and source-channel coding | ||
| 8.4: | Multimedia security and content protection | ||
| 8.4.1*: | Data hiding | ||
| 8.4.2*: | Authentication | ||
| 8.4.3*: | Access control | ||
| 8.4.4*: | Single and multi-media security | ||
| 8.4.5*: | Multimedia forensics | ||
| 8.4.6*: | Security applications of watermarking and fingerprinting | ||
| 8.5: | Multimedia human-machine interface and interaction | ||
| 8.5.1*: | Human perception modelling | ||
| 8.5.2*: | Modeling of multimodal perception | ||
| 8.5.3*: | Human-human and human-computer dialog | ||
| 8.5.4*: | Multimodal interfaces | ||
| 8.5.5*: | Brain-computer interfaces | ||
| 8.6: | Quality Assessment | ||
| 8.6.1*: | Subjective visual quality assessment | ||
| 8.6.2*: | Objective visual quality assessment | ||
| 8.6.3*: | Subjective auditory quality assessment | ||
| 8.6.4*: | Objective auditory quality assessment | ||
| 8.6.5*: | Evaluation of user experience, cross-modal assessment | ||
| 8.6.6*: | Standardization activities | ||
| 8.7: | Multimedia databases and digital libraries | ||
| 8.7.1*: | Visual indexing, analysis and representation | ||
| 8.7.2*: | Audio indexing, analysis and representation | ||
| 8.7.3*: | Content-based and context-based information retrieval | ||
| 8.7.4*: | Knowledge and semantics in media annotation and retrieval | ||
| 8.7.5*: | Fingerprinting and duplicate detection | ||
| 8.8: | Multimedia computing systems and applications | ||
| 8.8.1*: | Multimedia system design | ||
| 8.8.2*: | Distributed multimedia systems | ||
| 8.8.3*: | Entertainment and gaming | ||
| 8.8.4*: | e-Health and telemedicine | ||
| 8.8.5*: | IP video/web conferencing | ||
| 8.8.6*: | e-learning | ||
| 8.9: | Hardware and software for multimedia systems | ||
| 8.9.1*: | Multimedia hardware design | ||
| 8.9.2*: | Real-time multimedia systems | ||
| 8.9.3*: | Implementations on graphics processing units (GPUs) | ||
| 8.9.4*: | Implementations on general-purpose processors, multimedia processors, DSPs, multi-core processors | ||
| 8.9.5*: | Implementations in portable/wearable systems | ||
| 8.9.6*: | Power-aware systems for multimedia | ||
| 8.10: | Haptic technology and interaction | ||
| 8.10.1*: | Processing and rendering of haptic signals | ||
| 8.10.2*: | Compression and transmission of haptic signals | ||
| 8.10.3*: | Audio-visual-haptic environments | ||
| 8.10.4*: | Multimedia applications using haptics | ||
| 8.11: | Bio-inspired multimedia systems and signal processing | ||
| 8.11.1*: | Bio-inspired signal processing for multimedia | ||
| 8.11.2*: | Multimodal signal fusion in humans and animals | ||
| 8.11.3*: | Joint bio-inspired and conventional multimedia signal processing | ||
| 9: | Sensor Array and Multichannel Signal Processing | ||
| 9.1: | Sensor Array Processing | ||
| 9.1.1*: | Beamforming | ||
| 9.1.2*: | Physics-based sensor array processing | ||
| 9.1.3*: | Inverse methods | ||
| 9.1.4*: | Array calibration methods | ||
| 9.1.5*: | Synthetic aperture methods | ||
| 9.1.6*: | Signal detection and parameter estimation | ||
| 9.1.7*: | Direction-of-arrival estimation | ||
| 9.1.8*: | Source localization, separation, classification, and tracking | ||
| 9.1.9*: | Blind source separation and channel identification | ||
| 9.2: | Adaptive Array Signal Processing | ||
| 9.2.1*: | Adaptive beamforming | ||
| 9.2.2*: | Space-time adaptive processing | ||
| 9.2.3*: | MIMO radar and waveform diversity | ||
| 9.3: | Multi-channel Signal Processing | ||
| 9.3.1*: | Channel modelling and equalization | ||
| 9.3.2*: | Multi-channel transceiver design | ||
| 9.3.3*: | Sparsity structures in multichannel signal processing | ||
| 9.3.4*: | Multi-channel processing with non-wave based sensors | ||
| 9.3.5*: | Tensor-based signal processing for multi-sensor systems | ||
| 9.4: | Multi-antenna and Multi-channel Signal Processing for Communications | ||
| 9.4.1*: | MIMO systems and algorithms | ||
| 9.4.2*: | Space-time coding and decoding algorithms | ||
| 9.4.3*: | MIMO space-time code design and analysis | ||
| 9.4.4*: | Multi-user MIMO networks | ||
| 9.4.5*: | Array processing for wireless communications | ||
| 9.4.6*: | Multi-antenna/multi-channel processing for cognitive radios | ||
| 9.5: | Sensor and Relay Networks | ||
| 9.5.1*: | Sensor and relay network signal processing | ||
| 9.5.2*: | Network beamforming and coding | ||
| 9.5.3*: | Distributed and cooperative processing | ||
| 9.5.4*: | Data fusion and decision fusion from multiple sensor types | ||
| 9.5.5*: | Multi-Sensor processing for smart grid and energy systems | ||
| 9.6: | Applications of Sensor Array and Multi-channel Signal Processing | ||
| 9.6.1*: | Radar array processing | ||
| 9.6.2*: | Sonar array processing | ||
| 9.6.3*: | Microphone array processing | ||
| 9.6.4*: | Multi-channel imaging | ||
| 9.6.5*: | Multi-channel biological and medical modelling and processing | ||
| 9.6.6*: | Other applications of SAM signal processing | ||
| 10: | Signal Processing Education | ||
| 10.1*: | Signal Processing Education | ||
| 11: | Signal Processing for Communications and Networking | ||
| 11.1: | Signal Transmission and Reception | ||
| 11.1.1*: | Signal detection, estimation, separation and equalization | ||
| 11.1.2*: | Channel modeling and estimation, training schemes | ||
| 11.1.3*: | Capacity and performance analysis/optimization | ||
| 11.1.4*: | Acquisition, synchronization and tracking | ||
| 11.1.5*: | Signal representation, modulation, coding and compression | ||
| 11.1.6*: | Joint source-channel coding and quantization, iterative decoding algorithms | ||
| 11.2: | Communication Systems and Applications | ||
| 11.2.1*: | Multi-carrier, OFDM, and DMT communication | ||
| 11.2.2*: | Multi-rate, CDMA and spread spectrum communication | ||
| 11.2.3*: | Ultra wideband communication | ||
| 11.2.4*: | Telephone networks, DSL and powerline communication | ||
| 11.2.5*: | Applications involving signal processing for communication | ||
| 11.2.6*: | Computation, Communication, and Control for Smart Grid | ||
| 11.2.7*: | Communication/Networking Issues in Social Networks | ||
| 11.2.8*: | Computation, Communication, and Control for Biological Networks | ||
| 11.2.9*: | Underwater Communication Systems | ||
| 11.2.10*: | Visible Light Communication Systems | ||
| 11.2.11*: | Free Space Optical Communication | ||
| 11.3: | MIMO Communications and Signal Processing | ||
| 11.3.1*: | MIMO precoder/decoder design, receiver algorithms | ||
| 11.3.2*: | MIMO channel estimation and equalization | ||
| 11.3.3*: | MIMO capacity and performance | ||
| 11.3.4*: | MIMO space-time code design, analysis and decoding algorithms | ||
| 11.3.5*: | MIMO multi-user and multi-access schemes | ||
| 11.4: | Communication and Sensing aspects of Sensor Networks, Wireless and Ad-Hoc Networks | ||
| 11.4.1*: | Distributed and collaborative signal processing | ||
| 11.4.2*: | Distributed channel and source coding, information-theoretic studies | ||
| 11.4.3*: | Ad-hoc wireless networks | ||
| 11.4.4*: | Physical layer issues, cross-layer design | ||
| 11.4.5*: | Scheduling and queuing protocols | ||
| 11.4.6*: | Power control, resource management, system level optimization | ||
| 11.4.7*: | Cognitive Radio and Dynamic Spectrum Access | ||
| 11.4.8*: | Collaborative Signal Processing for Smart Grid | ||
| 12: | Signal Processing Theory and Methods | ||
| 12.1: | Sampling and Reconstruction | ||
| 12.1.1*: | Sampling theory and methods | ||
| 12.1.2*: | Quantization | ||
| 12.1.3*: | Extrapolation and interpolation | ||
| 12.1.4*: | Signal reconstruction, restoration and enhancement | ||
| 12.1.5*: | Multidimensional sampling and reconstruction | ||
| 12.2: | Signal and System Modeling, Representation and Estimation | ||
| 12.2.1*: | System modeling | ||
| 12.2.2*: | Signal and noise modeling | ||
| 12.2.3*: | System identification and approximation | ||
| 12.2.4*: | Multidimensional systems | ||
| 12.2.5*: | Non-stationary signals and time-varying systems | ||
| 12.2.6*: | Time-frequency and time-scale analysis | ||
| 12.2.7*: | Blind and semi-blind source separation | ||
| 12.3: | Statistical Signal Processing | ||
| 12.3.1*: | Detection and estimation theory and methods | ||
| 12.3.2*: | Classification and pattern recognition | ||
| 12.3.3*: | Cyclostationary signal analysis | ||
| 12.3.4*: | Higher-order and fractional lower-order statistical methods | ||
| 12.3.5*: | Performance analysis and bounds | ||
| 12.3.6*: | Spectrum estimation theory and methods | ||
| 12.3.7*: | Robust methods | ||
| 12.3.8*: | Independent component analysis | ||
| 12.3.9*: | Monte-Carlo based signal processing methods | ||
| 12.4: | Adaptive Signal Processing | ||
| 12.4.1*: | Adaptive filter analysis and design | ||
| 12.4.2*: | Fast algorithms for adaptive filtering | ||
| 12.4.3*: | Frequency-domain and transform-based adaptive filtering | ||
| 12.4.4*: | Sequential decision theory and methods | ||
| 12.4.5*: | Performance analysis and bounds | ||
| 12.4.6*: | Distributed and collaborative signal processing | ||
| 12.5: | Nonlinear Systems and Signal Processing | ||
| 12.5.1*: | Median, rank-order and stack type filters | ||
| 12.5.2*: | Non-Gaussian distribution filters | ||
| 12.5.3*: | Nonlinear signal and system models | ||
| 12.5.4*: | Nonlinear random process models | ||
| 12.5.5*: | Nonlinear adaptive filters | ||
| 12.6: | Filter Design | ||
| 12.6.1*: | Filter design criteria and optimization methods | ||
| 12.6.2*: | Filter architectures | ||
| 12.6.3*: | Performance analysis | ||
| 12.7: | Multirate Signal Processing | ||
| 12.7.1*: | Multirate architectures | ||
| 12.7.2*: | Filterbanks and wavelets | ||
| 12.7.3*: | Multirate processing and multiresolution methods | ||
| 12.7.4*: | Hierarchical models and tree-structured signal processing | ||
| 13: | Speech Processing | ||
| 13.1: | Speech Production (SPE-SPRD) | ||
| 13.1.1*: | Physical models of the vocal production system | ||
| 13.1.2*: | Singing and properties of the musical voice | ||
| 13.2: | Speech Perception and Psychoacoustics (SPE-SPER) | ||
| 13.2.1*: | Models of Speech Perception | ||
| 13.2.2*: | Hearing and Psychoacoustics | ||
| 13.2.3*: | Physiological models and applications thereof | ||
| 13.2.4*: | Audiology applications | ||
| 13.3: | Speech Analysis (SPE-ANLS) | ||
| 13.3.1*: | Spectral and other time-frequency analysis techniques | ||
| 13.3.2*: | Distortion measures | ||
| 13.3.3*: | Pitch/fundamental frequency analysis | ||
| 13.3.4*: | Timing/duration/speaking rate analysis | ||
| 13.3.5*: | Acoustic-phonetic features (e.g., formants) | ||
| 13.3.6*: | Extraction of non-linguistic information (e.g., gender, emotion) | ||
| 13.3.7*: | Voice quality/speech disorders | ||
| 13.4: | Speech Synthesis and Generation, including TTS (SPE-SYNT) | ||
| 13.4.1*: | Segmental-Level and/or concatenative synthesis | ||
| 13.4.2*: | Signal Processing/Statistical Model for synthesis | ||
| 13.4.3*: | Articulatory Synthesis | ||
| 13.4.4*: | Parametric Synthesis | ||
| 13.4.5*: | Prosody, Emotional, and Expressive Synthesis | ||
| 13.4.6*: | Text-to-phoneme conversion | ||
| 13.4.7*: | Voice Quality | ||
| 13.4.8*: | Voice Transformation | ||
| 13.4.9*: | Audio/Visual speech synthesis | ||
| 13.4.10*: | Multilingual synthesis | ||
| 13.4.11*: | Quality assessment/evaluation metrics in synthesis | ||
| 13.4.12*: | Tools and data for speech synthesis | ||
| 13.4.13*: | Text processing for speech synthesis (text normalization, syntactic and semantic analysis) | ||
| 13.5: | Speech Coding (SPE-CODI) | ||
| 13.5.1*: | Narrow-band and wide-band Speech Coding | ||
| 13.5.2*: | Theory and techniques for signal coding (e.g., waveform, transform) | ||
| 13.5.3*: | Modulation and source/channel coding | ||
| 13.5.4*: | Quantization and compression | ||
| 13.5.5*: | Robust coding for noisy channels | ||
| 13.5.6*: | Voice over IP (VoIP) | ||
| 13.5.7*: | Quality assessment/evaluation metrics (e.g., PESQ) in coding | ||
| 13.6: | Speech Enhancement (SPE-ENHA) | ||
| 13.6.1*: | Control and reduction of channel noise (e.g., reverb, room response) | ||
| 13.6.2*: | Perceptual enhancement of non-noisy speech | ||
| 13.6.3*: | Speech enhancement for humans with hearing impairments | ||
| 13.6.4*: | Non-acoustic microphones for enhancement | ||
| 13.6.5*: | Bandwidth expansion | ||
| 13.6.6*: | Noise Reduction | ||
| 13.7: | Acoustic Modeling for Automatic Speech Recognition (SPE-RECO) | ||
| 13.7.1*: | Feature Extraction | ||
| 13.7.2*: | Low-level feature modeling - Gaussians & beyond | ||
| 13.7.3*: | Pronunciation modeling at the acoustic level | ||
| 13.7.4*: | State clustering and novel state definitions | ||
| 13.7.5*: | Prosody and other speech characteristics | ||
| 13.7.6*: | Dialect, accent, and idiolect at the acoustic level | ||
| 13.7.7*: | Discriminative Acoustic Training Methods for ASR | ||
| 13.7.8*: | Articulatory and physiological modeling | ||
| 13.7.9*: | Feature Transformation and Normalization | ||
| 13.8: | Robust Speech Recognition (SPE-ROBU) | ||
| 13.8.1*: | Features specifically for robust ASR (noise, channel, etc.) | ||
| 13.8.2*: | Model/backend based robust ASR | ||
| 13.8.3*: | Confidence measures and rejection | ||
| 13.8.4*: | Speech Activity/End-point/Barge-in detection | ||
| 13.8.5*: | Non-acoustic microphones for ASR | ||
| 13.9: | Speech Adaptation/Normalization (SPE-ADAP) | ||
| 13.9.1*: | Speaker adaptation and normalization (e.g., VTLN) | ||
| 13.9.2*: | Speaker adapted training methods | ||
| 13.9.3*: | Environmental/Channel adaptation | ||
| 13.9.4*: | Idiolect adaptation | ||
| 13.9.5*: | Register and/or dialect adaptation | ||
| 13.10: | General Topics in Speech Recognition (SPE-GASR) | ||
| 13.10.1*: | Distributed Speech Recognition - Client/Server methods | ||
| 13.10.2*: | Alternative Statistical/Machine Learning Methods (e.g., no HMMs) | ||
| 13.10.3*: | Word spotting | ||
| 13.10.4*: | Metadata (e.g., emotion, speaker, accent) extraction from acoustics | ||
| 13.10.5*: | New algorithms, computational strategies, data structures for ASR | ||
| 13.10.6*: | Multi-modal (such as audio-visual) speech recognition | ||
| 13.10.7*: | Corpora, annotation, and other resources | ||
| 13.10.8*: | Algorithm approximation methods in ASR | ||
| 13.10.9*: | Structured classification approaches | ||
| 13.11: | Multilingual Recognition and Identification (SPE-MULT) | ||
| 13.11.1*: | Language (LID) and dialect (DID) identification | ||
| 13.11.2*: | Multilingual Speech recognition | ||
| 13.11.3*: | Processing of non-native accents | ||
| 13.12: | Lexical Modeling and Access (SPE-LEXI) | ||
| 13.12.1*: | Pronunciation modeling at the lexical level | ||
| 13.12.2*: | Dialect, accent, and idiolect at the lexical level | ||
| 13.12.3*: | Multilingual aspects (e.g., unit selection) | ||
| 13.12.4*: | Automatic lexicon learning | ||
| 13.13: | Large Vocabulary Continuous Recognition/Search (SPE-LVCR) | ||
| 13.13.1*: | Decoding algorithms and implementation | ||
| 13.13.2*: | Lattices | ||
| 13.13.3*: | Multi-pass strategies | ||
| 13.13.4*: | Miscellaneous Topics | ||
| 13.14: | Speaker Recognition and Characterization (SPE-SPKR) | ||
| 13.14.1*: | Features and characteristics for speaker recognition | ||
| 13.14.2*: | Robustness to variable and degraded channels | ||
| 13.14.3*: | Verification, identification, segmentation, and clustering | ||
| 13.14.4*: | Speaker characterization and adaptation | ||
| 13.14.5*: | Speaker recognition with speech recognition | ||
| 13.14.6*: | Speaker confidence estimation | ||
| 13.14.7*: | Multimodal and multimedia human speaker recognition | ||
| 13.14.8*: | Corpora, annotation, evaluation, and other resources | ||
| 13.14.9*: | Higher-level knowledge in speaker recognition | ||
| 13.14.10*: | Speaker localization (space) (e.g., in meetings) | ||
| 13.14.11*: | Speaker diarization (time) (e.g., in meetings) | ||
| 13.14.12*: | Speaker clustering (e.g., in Broadcast news) | ||
| 13.15: | Resource constrained speech recognition (SPE-RCSR) | ||
| 13.15.1*: | Low-power speech recognition | ||
| 13.15.2*: | Reduced computation speech recognition | ||
| 13.15.3*: | ASR techniques for highly portable/mobile devices | ||
| 14: | Spoken Language Processing | ||
| 14.1: | Spoken Language Understanding (SLP-UNDE) | ||
| 14.1.1*: | Semantic classification | ||
| 14.1.2*: | Entity extraction from speech | ||
| 14.1.3*: | Spoken document summarization | ||
| 14.1.4*: | Topic spotting and classification | ||
| 14.1.5*: | Question/answering from speech | ||
| 14.1.6*: | Paralinguistic (emotion, age, gender, rate, etc.) information | ||
| 14.1.7*: | Nonlinguistic (meaning external to language) information, gestures, etc. | ||
| 14.1.8*: | Detecting linguistic/discourse structure (e.g., disfluencies, sentence/topic boundaries, speech acts) | ||
| 14.1.9*: | Relation to and interpretation of sign language | ||
| 14.2: | Human Spoken Language Acquisition, Development and Learning (SLP-LADL) | ||
| 14.2.1*: | Language acquisition, development, and learning models | ||
| 14.2.2*: | Computer aids for language learning | ||
| 14.2.3*: | Attributes and modeling techniques for assessment of language fluency | ||
| 14.3: | Spoken and Multimodal Dialog Systems and Applications (SLP-SMMD) | ||
| 14.3.1*: | Spoken and multimodal dialog systems, applications, and architectures | ||
| 14.3.2*: | Stochastic Learning for dialog modeling | ||
| 14.3.3*: | Response Generation | ||
| 14.3.4*: | Technologies for the aged | ||
| 14.3.5*: | Evaluation metrics and standards | ||
| 14.3.6*: | Speech/voice-based human-computer interfaces (HCI) | ||
| 14.3.7*: | Speech HCI for individuals with impairments (blindness, etc.) and universal access (UA) | ||
| 14.3.8*: | Other applications | ||
| 14.4: | Speech Data Mining (SLP-DM) | ||
| 14.4.1*: | Analysis, Tools, Evaluations, and Applications for mining spoken data | ||
| 14.4.2*: | Speech data mining theory, algorithms, and methods | ||
| 14.4.3*: | Mining heterogeneous speech and multimedia data | ||
| 14.5: | Speech Retrieval (SLP-IR) | ||
| 14.5.1*: | Spoken term detection | ||
| 14.5.2*: | Search/retrieval of speech documents | ||
| 14.5.3*: | Voice search | ||
| 14.6: | Machine Translation of Speech (SLP-SSMT) | ||
| 14.6.1*: | Semi-automatic and data driven methods | ||
| 14.6.2*: | Speech processing for MTS | ||
| 14.6.3*: | Corpora, annotation, and other resources | ||
| 14.6.4*: | Interlingua and transfer approaches | ||
| 14.6.5*: | Integration of speech and linguistic processing | ||
| 14.6.6*: | Machine transliteration for named entities | ||
| 14.6.7*: | Evaluation metrics (e.g., BLEU) | ||
| 14.6.8*: | Systems and applications for MTS | ||
| 14.7: | Language Modeling, for Speech and SLP (SLP-LANG) | ||
| 14.7.1*: | N-grams, their generalizations and smoothing methods | ||
| 14.7.2*: | Language Model Adaptation | ||
| 14.7.3*: | Grammar based language modeling | ||
| 14.7.4*: | Maxent and feature based language modeling | ||
| 14.7.5*: | Dialect, accent, and idiolect at the language level | ||
| 14.7.6*: | Discriminative LM Training Methods | ||
| 14.7.7*: | Other approaches to LMs | ||
| 14.7.8*: | Structured classification approaches | ||
| 14.8: | Spoken language resources and annotation (SLP-REAN) | ||
| 14.8.1*: | General corpora, annotation, and other resources | ||