
torchaudio.sox_effects

Create a SoX effects chain for preprocessing audio.

SoxEffect

class torchaudio.sox_effects.SoxEffect[source]

Create an object for passing SoX effect information between Python and C++.

Returns: An object with the following attributes: ename (str), the name of the effect, and eopts (List[str]), the list of effect options.
Return type: SoxEffect

SoxEffectsChain

class torchaudio.sox_effects.SoxEffectsChain(normalization: Union[bool, float, Callable] = True, channels_first: bool = True, out_siginfo: Any = None, out_encinfo: Any = None, filetype: str = 'raw') → None[source]

SoX effects chain class.

Parameters:
  • normalization (bool, number, or callable, optional) – If True, the output is divided by 1 << 31 (assuming signed 32-bit audio), which normalizes it to [-1, 1]. If a number, the output is divided by that number. If a callable, the output is passed to the function and then divided by the value it returns. (Default: True)
  • channels_first (bool, optional) – Whether the result is laid out channels first ([C x L]) or length first ([L x C]). (Default: True)
  • out_siginfo (sox_signalinfo_t, optional) – A sox_signalinfo_t object, which can help when the audio type cannot be determined automatically. (Default: None)
  • out_encinfo (sox_encodinginfo_t, optional) – A sox_encodinginfo_t object, which can be set when the audio type cannot be determined automatically. (Default: None)
  • filetype (str, optional) – A file type or extension to use when sox cannot determine it automatically. (Default: 'raw')
Returns:

An output Tensor of size [C x L] or [L x C], where L is the number of audio frames and C is the number of channels, and an integer giving the sample rate of the audio (as listed in the metadata of the file).

Return type:

Tuple[Tensor, int]
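
The three normalization modes described above can be sketched in plain Python. This is an illustration only, not torchaudio code: apply_normalization is a hypothetical helper, it works on a list of numbers rather than a Tensor, and the behavior for False (leave the samples unchanged) is an assumption, since only the True case is described here.

```python
from typing import Callable, List, Union

Normalization = Union[bool, float, Callable[[List[float]], float]]


def apply_normalization(samples: List[float], normalization: Normalization = True) -> List[float]:
    """Hypothetical helper mirroring the documented rules; not torchaudio code."""
    # bool must be checked before numbers, because bool is a subclass of int.
    if isinstance(normalization, bool):
        if not normalization:
            return list(samples)  # assumption: False leaves the samples untouched
        divisor = float(1 << 31)  # True: assume signed 32-bit audio
    elif callable(normalization):
        divisor = float(normalization(samples))  # callable: its result is the divisor
    else:
        divisor = float(normalization)  # number: used directly as the divisor
    return [s / divisor for s in samples]


raw = [1 << 30, -(1 << 31), 0]  # made-up signed 32-bit sample values
default_scaled = apply_normalization(raw)           # divide by 1 << 31
fixed_scaled = apply_normalization(raw, 1 << 30)    # divide by a fixed number
peak_scaled = apply_normalization(raw, lambda x: max(abs(v) for v in x))  # divide by the peak
```

With a callable, the divisor depends on the signal itself, which is how peak normalization can be expressed in this sketch.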

Example
>>> import os
>>> import torchaudio
>>> from torch.utils.data import Dataset
>>>
>>> class MyDataset(Dataset):
>>>     def __init__(self, audiodir_path):
>>>         self.data = [os.path.join(audiodir_path, fn) for fn in os.listdir(audiodir_path)]
>>>         self.E = torchaudio.sox_effects.SoxEffectsChain()
>>>         self.E.append_effect_to_chain("rate", ["16000"])  # resample to 16000 Hz
>>>         self.E.append_effect_to_chain("channels", ["1"])  # mono signal
>>>
>>>     def __getitem__(self, index):
>>>         fn = self.data[index]
>>>         self.E.set_input_file(fn)  # point the chain at this file
>>>         x, sr = self.E.sox_build_flow_effects()  # run the effects
>>>         return x, sr
>>>
>>>     def __len__(self):
>>>         return len(self.data)
>>>
>>> torchaudio.initialize_sox()
>>> ds = MyDataset(path_to_audio_files)  # path_to_audio_files: directory of audio files
>>> for sig, sr in ds:
>>>     pass  # process sig and sr here
>>> torchaudio.shutdown_sox()
append_effect_to_chain(ename: str, eargs: Union[List[str], NoneType] = None) → None[source]

Append an effect to the SoX effects chain.

Parameters:
  • ename (str) – The name of the effect.
  • eargs (List[str], optional) – The list of effect options. (Default: None)
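
To make the name-plus-options structure concrete, here is a minimal sketch of how a chain of effects lines up with the effect arguments of a sox command line (sox infile outfile effect [options] ...). EffectSpec, append_effect, and as_sox_arguments are hypothetical stand-ins written for illustration; they mirror the ename and eopts attributes described above but are not part of torchaudio.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class EffectSpec:
    """Hypothetical stand-in for SoxEffect: an effect name plus string options."""
    ename: str
    eopts: List[str] = field(default_factory=list)


def append_effect(chain: List[EffectSpec], ename: str, eargs: Optional[List[str]] = None) -> None:
    # Store every option as a string, matching the documented List[str] type.
    chain.append(EffectSpec(ename, [str(a) for a in (eargs or [])]))


def as_sox_arguments(chain: List[EffectSpec]) -> List[str]:
    # Flatten the chain in order: each effect name followed by its options.
    args: List[str] = []
    for effect in chain:
        args.append(effect.ename)
        args.extend(effect.eopts)
    return args


chain: List[EffectSpec] = []
append_effect(chain, "rate", ["16000"])   # resample to 16000 Hz
append_effect(chain, "channels", ["1"])   # mix down to mono
sox_args = as_sox_arguments(chain)
```

In this sketch the effects are applied in the order they were appended, so resampling comes before the channel mix-down.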
clear_chain() → None[source]

Clear the effects chain held in Python.

set_input_file(input_file: str) → None[source]

Set the input file for the chain.

Parameters: input_file (str) – The path to the input file.
sox_build_flow_effects(out: Union[torch.Tensor, NoneType] = None) → Tuple[torch.Tensor, int][source]

Build the effects chain and run the effects from the input file to the output tensor.

Parameters: out (Tensor, optional) – Where the output will be written to. (Default: None)
Returns: An output Tensor of size [C x L] or [L x C], where L is the number of audio frames and C is the number of channels, and an integer giving the sample rate of the audio (as listed in the metadata of the file).
Return type: Tuple[Tensor, int]
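
The two output layouts differ only by a transpose. The sketch below shows the relationship with nested Python lists instead of a Tensor; to_length_first is a hypothetical helper and the sample values are made up for illustration.

```python
from typing import List


def to_length_first(channels_first: List[List[float]]) -> List[List[float]]:
    # zip(*rows) groups the i-th sample of every channel, giving one row per frame.
    return [list(frame) for frame in zip(*channels_first)]


stereo = [[0.1, 0.2, 0.3], [-0.1, -0.2, -0.3]]  # [C x L] with C=2 channels, L=3 frames
frames = to_length_first(stereo)                # [L x C]

shape_cl = (len(stereo), len(stereo[0]))
shape_lc = (len(frames), len(frames[0]))
```

Channels-first keeps each channel contiguous, while length-first keeps the samples of one frame together.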
