From the course: Hands-On PyTorch Machine Learning
Torchaudio introduction
- [Instructor] TorchAudio is a library for audio and signal processing with PyTorch. It provides I/O, signal and data processing functions, datasets, model implementations, and application components. TorchAudio offers a set of APIs, including backend, functional, transforms, datasets, models, pipelines, sox_effects, compliance.kaldi, kaldi_io, and utils. Similar to TorchVision, TorchAudio also provides a number of popular datasets out of the box. Examples include CMUDict, which is the CMU Pronouncing Dictionary; Common Voice; GTZAN, which is used for music genre classification of audio signals; Speech Commands; and VCTK, which is speech data uttered by 110 English speakers with various accents. Details of these datasets can be found in the PyTorch dataset documentation. The audio I/O package allows you to query audio file metadata, load audio data into a tensor, and save audio to files. Audio resampling. To resample an audio waveform…