librosa.core.piptrack

librosa.core.piptrack(y=None, sr=22050, S=None, n_fft=2048, hop_length=None, fmin=150.0, fmax=4000.0, threshold=0.1)

Pitch tracking on thresholded, parabolically interpolated STFT [1].

[1] https://ccrma.stanford.edu/~jos/sasp/Sinusoidal_Peak_Interpolation.html
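The interpolation step named above can be sketched in isolation. This is a minimal illustration of the standard three-point parabolic fit, not the library's own code: parabolic_peak and refine_frequency are hypothetical helper names, and the sketch assumes that bin k of the spectrum sits at k * sr / n_fft Hz.

```python
def parabolic_peak(a, b, c):
    """Fit a parabola through (-1, a), (0, b), (1, c).

    Returns (offset, height): the vertex position relative to the
    centre sample, in bins, and the interpolated value at the vertex.
    """
    denom = a - 2.0 * b + c
    if denom == 0.0:
        # The three points are collinear; no curvature to interpolate.
        return 0.0, b
    offset = 0.5 * (a - c) / denom
    height = b - 0.25 * (a - c) * offset
    return offset, height


def refine_frequency(mags, k, sr=22050, n_fft=2048):
    """Refine the frequency of a local maximum at bin k of one frame.

    Assumes (for illustration) that bin k is centred at k * sr / n_fft Hz
    and that 0 < k < len(mags) - 1.
    """
    offset, height = parabolic_peak(mags[k - 1], mags[k], mags[k + 1])
    return (k + offset) * sr / n_fft, height
```

For a peak at bin k, the refined estimate lands at most half a bin away from the bin centre, which is why it is finer than the raw bin spacing of sr / n_fft Hz.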
Parameters:
y : np.ndarray [shape=(n,)] or None

audio signal

sr : number > 0 [scalar]

audio sampling rate of y

S : np.ndarray [shape=(d, t)] or None

magnitude or power spectrogram

n_fft : int > 0 [scalar] or None

FFT window size, in samples. Used only if y is provided.

hop_length : int > 0 [scalar] or None

number of samples between successive STFT frames

threshold : float in (0, 1)

A bin in spectrum X is considered a pitch candidate when its magnitude is greater than threshold * X.max().

fmin : float > 0 [scalar]

lower frequency cutoff.

fmax : float > 0 [scalar]

upper frequency cutoff.

Note:

One of S or y must be provided.

If S is not given, it is computed from y using the default parameters of librosa.core.stft.
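How threshold, fmin, and fmax narrow the set of bins can be sketched for a single frame. The sketch below is illustrative only. candidate_bins is a hypothetical helper, it assumes bin k sits at k * sr / n_fft Hz, it treats the band as the half-open interval [fmin, fmax), and it leaves out the local-maximum test and the interpolation that the function also performs.

```python
import numpy as np


def candidate_bins(frame, sr=22050, n_fft=2048, fmin=150.0, fmax=4000.0,
                   threshold=0.1):
    """Return indices of bins in one frame that pass both tests:
    magnitude above threshold * frame.max(), and frequency in [fmin, fmax).
    """
    frame = np.asarray(frame, dtype=float)
    # Assumed bin-centre frequencies for a one-sided spectrum.
    freqs = np.arange(len(frame)) * sr / n_fft
    loud_enough = frame > threshold * frame.max()
    in_band = (freqs >= fmin) & (freqs < fmax)
    return np.flatnonzero(loud_enough & in_band)
```

Raising threshold discards weaker partials, while narrowing fmin and fmax restricts the search to the register of interest.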

Returns:
pitches : np.ndarray [shape=(d, t)]
magnitudes : np.ndarray [shape=(d, t)]

Where d is the subset of FFT bins within fmin and fmax.

pitches[f, t] contains the instantaneous frequency at bin f, time t.

magnitudes[f, t] contains the corresponding magnitude.

Both pitches and magnitudes take value 0 at bins of non-maximal magnitude.
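Because both arrays are zero wherever no peak was found, a single pitch estimate per frame can be read off by taking the bin with the largest magnitude in each column. The sketch below shows one way to do that on small hand-made arrays. strongest_pitch_per_frame is a hypothetical helper, not part of librosa, and picking the loudest peak is only one possible heuristic.

```python
import numpy as np


def strongest_pitch_per_frame(pitches, magnitudes):
    """Return, for each frame t, the pitch at the bin of largest magnitude.

    Frames with no detected peak yield 0.0, because argmax falls back to
    bin 0, where pitches is also 0.
    """
    pitches = np.asarray(pitches, dtype=float)
    magnitudes = np.asarray(magnitudes, dtype=float)
    best_bin = magnitudes.argmax(axis=0)        # shape (t,)
    frames = np.arange(magnitudes.shape[1])     # column indices 0..t-1
    return pitches[best_bin, frames]


# Toy input: 3 bins, 3 frames; the last frame has no peak at all.
toy_pitches = [[0.0, 0.0, 0.0],
               [220.5, 0.0, 0.0],
               [330.0, 441.0, 0.0]]
toy_magnitudes = [[0.0, 0.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.5, 0.9, 0.0]]
track = strongest_pitch_per_frame(toy_pitches, toy_magnitudes)
```

The same call should apply unchanged to the pitches and magnitudes arrays returned by piptrack.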

Notes

This function caches at level 30.

Examples

>>> y, sr = librosa.load(librosa.util.example_audio_file())
>>> pitches, magnitudes = librosa.piptrack(y=y, sr=sr)