torchaudio.datasets
All datasets are subclasses of torch.utils.data.Dataset, i.e., they have __getitem__ and __len__ methods implemented. Hence, they can all be passed to a torch.utils.data.DataLoader, which can load multiple samples in parallel using torch.multiprocessing workers. For example:
yesno_data = torchaudio.datasets.YESNO('.', download=True)
data_loader = torch.utils.data.DataLoader(yesno_data,
                                          batch_size=1,
                                          shuffle=True,
                                          num_workers=args.nThreads)
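The DataLoader relies only on the two methods mentioned above. A minimal sketch of that contract in plain Python (the class and the data here are invented for illustration, not part of torchaudio):

```python
class ToyAudioDataset:
    """Minimal Dataset-style class: implements __getitem__ and __len__."""

    def __init__(self, clips):
        # `clips` stands in for (waveform, sample_rate) pairs.
        self.clips = clips

    def __getitem__(self, index):
        # Indexing returns one sample.
        return self.clips[index]

    def __len__(self):
        # Length tells the loader how many samples exist.
        return len(self.clips)


dataset = ToyAudioDataset([("clip0", 8000), ("clip1", 8000)])
print(len(dataset))   # 2
print(dataset[1])     # ('clip1', 8000)
```

Any object implementing these two methods can be wrapped in a DataLoader the same way as the built-in datasets.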
The following datasets are available:
All the datasets have similar APIs. They all have two common arguments: transform and target_transform, to transform the input and the target, respectively.
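The typical way such transforms are applied inside __getitem__ can be sketched as follows; the class name and toy data are invented for illustration and are not the torchaudio implementation:

```python
class ToyPairDataset:
    """Applies optional callables to the input and the target on access."""

    def __init__(self, pairs, transform=None, target_transform=None):
        self.pairs = pairs
        self.transform = transform
        self.target_transform = target_transform

    def __getitem__(self, index):
        waveform, label = self.pairs[index]
        # Each transform is applied lazily, per item, at access time.
        if self.transform is not None:
            waveform = self.transform(waveform)
        if self.target_transform is not None:
            label = self.target_transform(label)
        return waveform, label

    def __len__(self):
        return len(self.pairs)


data = ToyPairDataset(
    [([1, 2, 3], "yes")],
    transform=lambda w: [x * 2 for x in w],  # e.g. rescale the waveform
    target_transform=str.upper,              # e.g. normalize the label
)
print(data[0])  # ([2, 4, 6], 'YES')
```

Passing None for either argument leaves the corresponding field untouched.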
VCTK
class torchaudio.datasets.VCTK(root, url='http://homepages.inf.ed.ac.uk/jyamagis/release/VCTK-Corpus.tar.gz', folder_in_archive='VCTK-Corpus', download=False, downsample=False, transform=None, target_transform=None)

Create a Dataset for VCTK. Each item is a tuple of the form: (waveform, sample_rate, utterance, speaker_id, utterance_id).

Folder p315 is ignored because its corresponding text files do not exist. For more information about the dataset, visit: https://datashare.is.ed.ac.uk/handle/10283/3443
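Unpacking an item follows the tuple layout above. The sketch below uses a stand-in tuple rather than a downloaded copy of the corpus, so the field values are invented for illustration:

```python
# Real usage (requires downloading the ~10 GB corpus):
#   vctk = torchaudio.datasets.VCTK('.', download=True)
#   item = vctk[0]

# Stand-in item with invented values, shaped like the documented tuple:
item = ("waveform", 48000, "Please call Stella.", "p225", "001")

waveform, sample_rate, utterance, speaker_id, utterance_id = item
print(speaker_id, utterance_id)  # p225 001
```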