librosa.decompose.hpss
librosa.decompose.hpss(S, kernel_size=31, power=2.0, mask=False, margin=1.0)
Median-filtering harmonic percussive source separation (HPSS).
If margin = 1.0, the input spectrogram is decomposed as S = H + P, where H contains the harmonic components and P contains the percussive components.
If margin > 1.0, the input spectrogram is decomposed as S = H + P + R, where R contains residual components not included in H or P.
This implementation is based upon the algorithm described by [1] and [2].
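The median-filtering idea behind this decomposition can be illustrated with a small NumPy-only sketch. It is not librosa's implementation: the helper `median_filter_1d` and the toy spectrogram are assumptions made for illustration, and the boundary handling may differ from the library's. Filtering along time keeps sustained (horizontal) structure, while filtering along frequency keeps transient (vertical) structure.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def median_filter_1d(X, size, axis):
    """Median-filter X along one axis with reflect padding (size should be odd)."""
    pad = size // 2
    pad_width = [(0, 0)] * X.ndim
    pad_width[axis] = (pad, pad)
    Xp = np.pad(X, pad_width, mode="reflect")
    # sliding_window_view appends the window as a new trailing axis
    windows = sliding_window_view(Xp, size, axis=axis)
    return np.median(windows, axis=-1)


# Toy magnitude spectrogram: 16 frequency bins x 40 frames (hypothetical data)
S = np.full((16, 40), 0.01)
S[5, :] = 1.0    # sustained tone: a horizontal ridge
S[:, 20] = 1.0   # click: a vertical ridge

kernel_harmonic, kernel_percussive = 9, 5
harm = median_filter_1d(S, kernel_harmonic, axis=1)    # median across time
perc = median_filter_1d(S, kernel_percussive, axis=0)  # median across frequency

print("tone bin  -> harm:", harm[5, 10], "perc:", perc[5, 10])
print("click bin -> harm:", harm[0, 20], "perc:", perc[0, 20])
```

In this toy case the tone survives only in `harm` and the click only in `perc`; the two filtered spectrograms are then compared against each other to build the masks.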
[1] Fitzgerald, Derry. “Harmonic/percussive separation using median filtering.” 13th International Conference on Digital Audio Effects (DAFX10), Graz, Austria, 2010.
[2] Driedger, Müller, Disch. “Extending harmonic-percussive separation of audio.” 15th International Society for Music Information Retrieval Conference (ISMIR 2014), Taipei, Taiwan, 2014.
Parameters:
- S : np.ndarray [shape=(d, n)]
input spectrogram. May be real (magnitude) or complex.
- kernel_size : int or tuple (kernel_harmonic, kernel_percussive)
kernel size(s) for the median filters.
- If scalar, the same size is used for both harmonic and percussive.
- If tuple, the first value specifies the width of the harmonic filter, and the second value specifies the width of the percussive filter.
- power : float > 0 [scalar]
Exponent for the Wiener filter when constructing soft mask matrices.
- mask : bool
Return the masking matrices instead of components.
Masking matrices contain non-negative real values that can be used to measure the assignment of energy from S into harmonic or percussive components.
Components can be recovered by multiplying S * mask_H or S * mask_P.
- margin : float or tuple (margin_harmonic, margin_percussive)
margin size(s) for the masks (as described in [2])
- If scalar, the same size is used for both harmonic and percussive.
- If tuple, the first value specifies the margin of the harmonic mask, and the second value specifies the margin of the percussive mask.
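How power and margin interact can be sketched in plain NumPy. This is a simplified illustration, not the library's code: `soft_mask` and `margin_masks` are hypothetical helpers, the inputs stand in for the harmonic- and percussive-filtered spectrograms, and bins where both inputs are zero are not handled.

```python
import numpy as np


def soft_mask(X, X_ref, power=2.0):
    """Wiener-style soft mask: X**power / (X**power + X_ref**power)."""
    # Assumes non-negative inputs that are not both zero in the same bin
    Xp, Rp = X**power, X_ref**power
    return Xp / (Xp + Rp)


def margin_masks(harm, perc, margin=1.0, power=2.0):
    """Build harmonic and percussive masks; margin may be a scalar or a pair."""
    margin_h, margin_p = (margin, margin) if np.isscalar(margin) else margin
    # mask_h exceeds 0.5 only where harm > margin_h * perc (and vice versa)
    mask_h = soft_mask(harm, perc * margin_h, power)
    mask_p = soft_mask(perc, harm * margin_p, power)
    return mask_h, mask_p


# Hypothetical filter outputs for three bins:
# harmonic-dominated, tied, percussive-dominated
harm = np.array([4.0, 1.0, 1.0])
perc = np.array([1.0, 1.0, 4.0])

for margin in (1.0, 3.0, (1.0, 5.0)):
    mask_h, mask_p = margin_masks(harm, perc, margin=margin)
    residual = 1.0 - (mask_h + mask_p)  # share of energy assigned to neither
    print(margin, np.round(mask_h, 3), np.round(mask_p, 3), np.round(residual, 3))
```

With a margin of 1.0 the two masks sum to one in every bin, so nothing is left over. Larger margins shrink both masks, and the tied bin is the one that loses the most energy to the residual.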
Returns:
- harmonic : np.ndarray [shape=(d, n)]
harmonic component (or mask)
- percussive : np.ndarray [shape=(d, n)]
percussive component (or mask)
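When masks are requested instead of components, they can be applied to S by element-wise multiplication. The sketch below uses a random complex array and a hypothetical pair of complementary masks (standing in for the margin = 1.0 case) rather than real librosa output. Because the masks are real and non-negative, each component keeps the phase of S and only its magnitude is scaled.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical complex spectrogram and a pair of complementary masks
S = rng.standard_normal((4, 6)) + 1j * rng.standard_normal((4, 6))
mask_H = rng.uniform(0.0, 1.0, size=S.shape)
mask_P = 1.0 - mask_H  # complementary masks: nothing is left over

H = S * mask_H  # harmonic component
P = S * mask_P  # percussive component
R = S - (H + P) # residual; zero here because the masks sum to one

print("max |R|:", np.abs(R).max())
print("phase preserved:", np.allclose(np.angle(H), np.angle(S)))
```

When the masks sum to less than one, as happens with margin > 1.0, the same subtraction leaves a non-zero residual R.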
See also
util.softmask
Notes
This function caches at level 30.
Examples
Separate into harmonic and percussive
>>> import numpy as np
>>> import librosa
>>> import librosa.display
>>> y, sr = librosa.load(librosa.util.example_audio_file(), duration=15)
>>> D = librosa.stft(y)
>>> H, P = librosa.decompose.hpss(D)
>>> # amplitude_to_db expects magnitudes, so take np.abs of the complex STFT
>>> import matplotlib.pyplot as plt
>>> plt.figure()
>>> plt.subplot(3, 1, 1)
>>> librosa.display.specshow(librosa.amplitude_to_db(np.abs(D),
...                                                  ref=np.max),
...                          y_axis='log')
>>> plt.colorbar(format='%+2.0f dB')
>>> plt.title('Full power spectrogram')
>>> plt.subplot(3, 1, 2)
>>> librosa.display.specshow(librosa.amplitude_to_db(np.abs(H),
...                                                  ref=np.max),
...                          y_axis='log')
>>> plt.colorbar(format='%+2.0f dB')
>>> plt.title('Harmonic power spectrogram')
>>> plt.subplot(3, 1, 3)
>>> librosa.display.specshow(librosa.amplitude_to_db(np.abs(P),
...                                                  ref=np.max),
...                          y_axis='log')
>>> plt.colorbar(format='%+2.0f dB')
>>> plt.title('Percussive power spectrogram')
>>> plt.tight_layout()
Or with a narrower horizontal filter
>>> H, P = librosa.decompose.hpss(D, kernel_size=(13, 31))
Just get harmonic/percussive masks, not the spectra
>>> mask_H, mask_P = librosa.decompose.hpss(D, mask=True)
>>> mask_H
array([[ 1.000e+00,  1.469e-01, ...,  2.648e-03,  2.164e-03],
       [ 1.000e+00,  2.368e-01, ...,  9.413e-03,  7.703e-03],
       ...,
       [ 8.869e-01,  5.673e-02, ...,  4.603e-02,  1.247e-05],
       [ 7.068e-01,  2.194e-02, ...,  4.453e-02,  1.205e-05]], dtype=float32)
>>> mask_P
array([[ 2.858e-05,  8.531e-01, ...,  9.974e-01,  9.978e-01],
       [ 1.586e-05,  7.632e-01, ...,  9.906e-01,  9.923e-01],
       ...,
       [ 1.131e-01,  9.433e-01, ...,  9.540e-01,  1.000e+00],
       [ 2.932e-01,  9.781e-01, ...,  9.555e-01,  1.000e+00]], dtype=float32)
Separate into harmonic/percussive/residual components by using a margin > 1.0
>>> H, P = librosa.decompose.hpss(D, margin=3.0)
>>> R = D - (H + P)
>>> y_harm = librosa.core.istft(H)
>>> y_perc = librosa.core.istft(P)
>>> y_resi = librosa.core.istft(R)
Get a more isolated percussive component by widening its margin
>>> H, P = librosa.decompose.hpss(D, margin=(1.0,5.0))