Glossary
- time series
- Typically an audio signal, denoted by y, and represented as a one-dimensional numpy.ndarray of floating-point values. y[t] corresponds to the amplitude of the waveform at sample t.
- sampling rate
- The (positive integer) number of samples per second of a time series. This is denoted by an integer variable sr.
- frame
- A short slice of a time series used for analysis purposes. This usually corresponds to a single column of a spectrogram matrix.
- window
- A vector or function used to weight samples within a frame when computing a spectrogram.
- frame length
- The (positive integer) number of samples in an analysis window (or frame). This is denoted by an integer variable n_fft.
- hop length
- The number of samples between successive frames, e.g., the columns of a spectrogram. This is denoted as a positive integer hop_length.
- window length
- The length (width) of the window function (e.g., Hann window). Note that this can be smaller than the frame length used in a short-time Fourier transform. Typically denoted as a positive integer variable win_length.
- spectrogram
- A matrix S where the rows index frequency bins, and the columns index frames (time). Spectrograms can be either real-valued or complex-valued. By convention, real-valued spectrograms are denoted as numpy.ndarrays S, while complex-valued STFT matrices are denoted as D.
- onset (strength) envelope
- An onset envelope onset_env[t] measures the strength of note onsets at frame t. Typically stored as a one-dimensional numpy.ndarray of floating-point values onset_envelope.
- chroma
- Also known as pitch class profile (PCP). Chroma representations measure the amount of relative energy in each pitch class (e.g., the 12 notes in the chromatic scale) at a given frame/time.
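
The naming conventions above can be tied together in a short sketch. It uses plain NumPy rather than any particular audio library, and every concrete value (the 440 Hz test tone, sr = 22050, n_fft = 2048, hop_length = 512, win_length = 1024) is an assumption chosen for illustration. It also skips the frame centering and padding that a full STFT implementation may apply, and uses NumPy's symmetric Hann window for simplicity.

```python
import numpy as np

# Illustrative values only; none of them come from the glossary itself.
sr = 22050          # sampling rate: samples per second
n_fft = 2048        # frame length, in samples
hop_length = 512    # samples between successive frames
win_length = 1024   # window length, here smaller than the frame length

# Time series: a one-dimensional float array; y[t] is the amplitude at sample t.
t = np.arange(sr) / sr                      # one second of sample times
y = 0.5 * np.sin(2 * np.pi * 440.0 * t)     # hypothetical 440 Hz test tone

# Window: Hann weights of length win_length, zero-padded on both sides
# so that they span a full frame of n_fft samples.
window = np.hanning(win_length)
pad = (n_fft - win_length) // 2
window = np.pad(window, (pad, n_fft - win_length - pad))

# Frames: overlapping slices of y, stored one per column.
n_frames = 1 + (len(y) - n_fft) // hop_length
frames = np.stack(
    [y[i * hop_length : i * hop_length + n_fft] for i in range(n_frames)],
    axis=1,
)

# D: complex-valued STFT matrix (rows index frequency bins, columns index frames).
D = np.fft.rfft(frames * window[:, np.newaxis], axis=0)

# S: real-valued magnitude spectrogram with the same shape as D.
S = np.abs(D)
```

With these values, D and S each have 1 + n_fft // 2 rows (frequency bins) and one column per frame, so column j of S describes the slice of y that starts at sample j * hop_length.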