librosa.core.dtw

librosa.core.dtw(X=None, Y=None, C=None, metric='euclidean', step_sizes_sigma=None, weights_add=None, weights_mul=None, subseq=False, backtrack=True, global_constraints=False, band_rad=0.25)

Dynamic time warping (DTW).

This function performs a DTW and path backtracking on two sequences. We follow the nomenclature and algorithmic approach as described in [1].

[1]	Meinard Mueller. Fundamentals of Music Processing: Audio, Analysis, Algorithms, Applications. Springer Verlag, 2015. ISBN: 978-3-319-21944-8.
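
To make the two stages named above concrete (accumulating costs, then backtracking), here is a minimal pure-Python sketch of textbook DTW on two 1-D sequences. It is not librosa's implementation: the function name `dtw_sketch`, the absolute-difference cost, and the fixed step set (1, 1), (0, 1), (1, 0) are assumptions chosen for illustration.

```python
import math


def dtw_sketch(x, y):
    """Illustrative DTW on two 1-D sequences (hypothetical helper, not librosa code)."""
    n_x, n_y = len(x), len(y)
    # Local cost matrix: absolute difference between every pair of samples.
    C = [[abs(a - b) for b in y] for a in x]

    # Accumulated cost: each cell adds its local cost to the cheapest predecessor.
    D = [[math.inf] * n_y for _ in range(n_x)]
    for n in range(n_x):
        for m in range(n_y):
            if n == 0 and m == 0:
                best = 0.0
            else:
                best = min(
                    D[n - 1][m - 1] if n > 0 and m > 0 else math.inf,  # diagonal
                    D[n][m - 1] if m > 0 else math.inf,                # horizontal
                    D[n - 1][m] if n > 0 else math.inf,                # vertical
                )
            D[n][m] = C[n][m] + best

    # Backtracking: walk from the last cell to (0, 0) via the cheapest predecessor.
    n, m = n_x - 1, n_y - 1
    path = [(n, m)]
    while (n, m) != (0, 0):
        candidates = []
        if n > 0 and m > 0:
            candidates.append((D[n - 1][m - 1], n - 1, m - 1))
        if m > 0:
            candidates.append((D[n][m - 1], n, m - 1))
        if n > 0:
            candidates.append((D[n - 1][m], n - 1, m))
        _, n, m = min(candidates)
        path.append((n, m))
    path.reverse()  # this sketch reports the path from start to end
    return D, path


D, path = dtw_sketch([1, 2, 3], [1, 2, 2, 3])
```

In this toy case the second sequence merely repeats one value, so the total alignment cost in the last cell of `D` is zero and the path pairs index 1 of the first sequence with indices 1 and 2 of the second.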

Parameters:
    X : np.ndarray [shape=(K, N)]
        audio feature matrix (e.g., chroma features)
    Y : np.ndarray [shape=(K, M)]
        audio feature matrix (e.g., chroma features)
    C : np.ndarray [shape=(N, M)]
        Precomputed distance matrix. If supplied, X and Y must not be supplied and `metric` will be ignored.
    metric : str
        Identifier for the cost-function as documented in scipy.spatial.cdist()
    step_sizes_sigma : np.ndarray [shape=[n, 2]]
        Specifies allowed step sizes as used by the dtw.
    weights_add : np.ndarray [shape=[n, ]]
        Additive weights to penalize certain step sizes.
    weights_mul : np.ndarray [shape=[n, ]]
        Multiplicative weights to penalize certain step sizes.
    subseq : binary
        Enable subsequence DTW, e.g., for retrieval tasks.
    backtrack : binary
        Enable backtracking in accumulated cost matrix.
    global_constraints : binary
        Applies global constraints to the cost matrix `C` (Sakoe-Chiba band).
    band_rad : float
        The Sakoe-Chiba band radius (1/2 of the width) will be `int(radius*min(C.shape))`.
Returns:
    D : np.ndarray [shape=(N,M)]
        accumulated cost matrix. D[N,M] is the total alignment cost (the last entry, i.e. D[-1, -1] with zero-based indexing). When doing subsequence DTW, D[N,:] (the last row, D[-1, :]) indicates a matching function.
    wp : np.ndarray [shape=(N,2)]
        Warping path with index pairs. Each row of the array contains an index pair (n, m). Only returned when `backtrack` is True.
Raises:
    ParameterError
        If you are doing diagonal matching and Y is shorter than X.
        If an incompatible combination of X, Y, and C is supplied.
        If your input dimensions are incompatible.
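
The roles of `step_sizes_sigma`, `weights_add`, `weights_mul` and `subseq` can be illustrated with a small pure-Python sketch of the cost accumulation. The helper name `accumulated_cost` is hypothetical, and the weighting rule used here (predecessor cost plus `w_mul * C[n][m] + w_add`) is one common formulation; whether librosa applies the weights in exactly this way is an assumption, so treat this as a reading aid rather than a reference implementation.

```python
import math


def accumulated_cost(C, steps=((1, 1), (0, 1), (1, 0)),
                     w_add=None, w_mul=None, subseq=False):
    """Accumulate costs over a local cost matrix C (list of lists).

    Hypothetical helper for illustration only; not part of librosa.
    """
    n_rows, n_cols = len(C), len(C[0])
    w_add = w_add or [0.0] * len(steps)  # no additive penalty by default
    w_mul = w_mul or [1.0] * len(steps)  # no multiplicative penalty by default

    D = [[math.inf] * n_cols for _ in range(n_rows)]
    D[0][0] = C[0][0]
    if subseq:
        # Subsequence DTW: a match may start at any column at no extra cost.
        D[0] = list(C[0])

    for n in range(n_rows):
        for m in range(n_cols):
            for (dn, dm), add, mul in zip(steps, w_add, w_mul):
                pn, pm = n - dn, m - dm
                if pn < 0 or pm < 0:
                    continue  # this step would start outside the matrix
                candidate = D[pn][pm] + mul * C[n][m] + add
                if candidate < D[n][m]:
                    D[n][m] = candidate
    return D


# Query [1, 2] against the longer sequence [5, 1, 2, 5], absolute-difference cost.
C = [[abs(a - b) for b in [5, 1, 2, 5]] for a in [1, 2]]
D_global = accumulated_cost(C)
D_subseq = accumulated_cost(C, subseq=True)
matching = D_subseq[-1]                   # last row acts as the matching function
best_end = matching.index(min(matching))  # column where the best match ends
```

With `subseq=True` the last row drops to zero at column 2, where the query ends inside the longer sequence, while the global alignment is forced to absorb both outer samples and ends with a higher total cost.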

Examples

>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> import librosa
>>> import librosa.display
>>> # Query: chroma features of a 15-second excerpt
>>> y, sr = librosa.load(librosa.util.example_audio_file(), offset=10, duration=15)
>>> X = librosa.feature.chroma_cens(y=y, sr=sr)
>>> # Database: the query embedded between stretches of random noise
>>> noise = np.random.rand(X.shape[0], 200)
>>> Y = np.concatenate((noise, noise, X, noise), axis=1)
>>> # Subsequence DTW: locate X inside Y
>>> D, wp = librosa.dtw(X, Y, subseq=True)
>>> # Top panel: accumulated cost matrix with the optimal warping path
>>> plt.subplot(2, 1, 1)
>>> librosa.display.specshow(D, x_axis='frames', y_axis='frames')
>>> plt.title('Database excerpt')
>>> plt.plot(wp[:, 1], wp[:, 0], label='Optimal path', color='y')
>>> plt.legend()
>>> # Bottom panel: last row of D, normalized by the path length
>>> plt.subplot(2, 1, 2)
>>> plt.plot(D[-1, :] / wp.shape[0])
>>> plt.xlim([0, Y.shape[1]])
>>> plt.ylim([0, 2])
>>> plt.title('Matching cost function')
>>> plt.tight_layout()
