Librosa Tutorials – Basics

This is a copy of the Librosa tutorial by Brian McFee, a creative person who is also our new instructor in the Music Technology program at NYU. I keep a copy here to push myself to go through this library in detail, since my final thesis will also be related to the MIR field. I will also try to add a Chinese translation, since I could not find many tutorial resources online in that language.

Quick Start


from __future__ import print_function
import librosa

filename = librosa.util.example_audio_file()
# Get the path to the audio example file included with librosa.
# Note: the bundled example is encoded as OGG Vorbis. To handle formats other than WAV (OGG, MP3, and so on), install ffmpeg beforehand so that the right codecs are available.

y, sr = librosa.load(filename)
# Load and decode the audio as a time series y, represented as a one-dimensional NumPy floating-point array. The variable sr contains the sampling rate of y. By default, all audio is mixed to mono and resampled to 22050 Hz at load time. This behavior can be overridden by supplying additional arguments to librosa.load().

tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr)
# The output of the beat tracker is an estimate of the tempo (in beats per minute) and an array of frame numbers corresponding to detected beat events. Frames here correspond to short windows of the signal y, each separated by hop_length = 512 samples.

beat_times = librosa.frames_to_time(beat_frames, sr=sr)
# beat_times is an array of timestamps (in seconds) corresponding to the detected beat events; this step converts the frame indices from the previous step into times.
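
The frame-to-time conversion is simple arithmetic: each frame index is multiplied by the hop length and divided by the sampling rate. The sketch below reproduces that arithmetic with NumPy only, assuming the defaults mentioned in this tutorial (sr = 22050, hop_length = 512). The helper name frames_to_seconds and the example frame indices are made up for illustration; they are not part of librosa.

```python
import numpy as np

def frames_to_seconds(frames, sr=22050, hop_length=512):
    """Convert frame indices to timestamps in seconds (illustrative helper)."""
    frames = np.asarray(frames)
    # Frame k starts k * hop_length samples into the signal
    return frames * hop_length / float(sr)

# Hypothetical beat frames, chosen only for illustration
example_frames = np.array([0, 43, 86])
example_times = frames_to_seconds(example_frames)
print(example_times)
```

With these defaults one hop lasts 512 / 22050 seconds, roughly 23 ms, so frame 43 falls just under the one-second mark.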

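librosa.output.times_csv('beat_times.csv', beat_times)
# Finally, store the detected beat timestamps as a comma-separated values (CSV) file. The file gives a more direct view of the beats in time and can be used with Sonic Visualiser and mir_eval.
# Note: the librosa.output module was removed in librosa 0.8, so this call only works on older releases.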

Advanced Usage

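This section covers more advanced usage. Many of these functions come up in feature extraction and analysis. The author's example includes harmonic-percussive source separation and the extraction of several spectral features.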


import numpy as np
import librosa

filename = librosa.util.example_audio_file()
y, sr = librosa.load(filename)

hop_length = 512
# Set the hop length; at 22050 Hz, 512 samples ~= 23 ms

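y_harmonic, y_percussive = librosa.effects.hpss(y)
# Separate the signal y into two time series of equal duration: y_harmonic contains the harmonic (tonal) portion and y_percussive contains the percussive (transient) portion.
# The reasons for this divide-and-conquer step are:
# 1. The percussive part is more useful for extracting rhythm and tempo features.
# 2. Percussive instruments are usually treated as broadband noise sources, so they get in the way of spectral features that describe the harmonic content.

tempo, beat_frames = librosa.beat.beat_track(y=y_percussive, sr=sr)
# Beat track on the percussive signal; beat_frames is needed by the synchronization steps further down.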

mfcc = librosa.feature.mfcc(y=y, sr=sr, hop_length=hop_length, n_mfcc=13)
# Compute MFCC features from the raw signal; the output is a numpy.ndarray of shape (n_mfcc, T), where T is the number of frames.

mfcc_delta = librosa.feature.delta(mfcc)
# Compute (smoothed) first-order differences among the columns of the input; this is the first operation applied to the extracted features.
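
To make "first-order differences among columns" concrete, here is a minimal sketch on a toy matrix. It uses a plain, unsmoothed finite difference from NumPy, which is a simplification: librosa.feature.delta applies smoothing, so its values will generally differ from these. The name simple_delta and the toy data are illustrative only.

```python
import numpy as np

def simple_delta(features):
    """Unsmoothed first-order difference along the time axis (columns).

    Illustrative stand-in only; this is not how librosa computes deltas.
    """
    return np.gradient(features, axis=1)

# Toy feature matrix: row 0 rises by 1 per frame, row 1 is constant
toy = np.array([[0.0, 1.0, 2.0, 3.0],
                [5.0, 5.0, 5.0, 5.0]])
toy_delta = simple_delta(toy)
print(toy_delta)
```

The output keeps the shape of the input, which is why the delta matrix can later be stacked under the MFCC matrix frame by frame.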

beat_mfcc_delta = librosa.util.sync(np.vstack([mfcc, mfcc_delta]), beat_frames)
# Aggregate columns of the input between sample indices (e.g., beat frames). This synchronization is the second operation applied to the features: the stacked MFCC and delta matrices are averaged (the default aggregation) between consecutive beat events.

chromagram = librosa.feature.chroma_cqt(y=y_harmonic, sr=sr)
# Compute a chromagram using just the harmonic component.
# The chromagram is a matrix of shape (12, T); its rows correspond to the twelve pitch classes of equal temperament (C, C#, ...), and each column is normalized by its peak value.

beat_chroma = librosa.util.sync(chromagram, beat_frames, aggregate=np.median)
# Synchronize the chroma between beat events, using the median as the aggregation function.
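
The idea behind beat synchronization can be sketched in a few lines of NumPy: cut the feature matrix into segments at the beat frames and reduce each segment to a single column. The function below is a simplified illustration, not librosa's implementation. It assumes the boundaries are frame indices, and it pads them with the first and last frame so that every frame belongs to some segment. The names sync_columns, toy, and toy_beats are hypothetical.

```python
import numpy as np

def sync_columns(data, boundaries, aggregate=np.mean):
    """Aggregate the columns of `data` between consecutive boundaries."""
    n_frames = data.shape[1]
    # Keep boundaries inside the matrix, then add the start and end frames
    inner = np.clip(np.asarray(boundaries), 0, n_frames)
    edges = np.unique(np.concatenate(([0], inner, [n_frames])))
    # Reduce every [start, end) slice to one column
    segments = [aggregate(data[:, start:end], axis=1)
                for start, end in zip(edges[:-1], edges[1:])]
    return np.stack(segments, axis=1)

# Toy data: 2 features over 6 frames, with "beats" at frames 2 and 4
toy = np.arange(12, dtype=float).reshape(2, 6)
toy_beats = np.array([2, 4])
mean_sync = sync_columns(toy, toy_beats)
median_sync = sync_columns(toy, toy_beats, aggregate=np.median)
print(mean_sync.shape)
```

Two beats split six frames into three intervals, so the result has three columns. The median is less sensitive than the mean to a few outlying frames inside an interval.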

beat_features = np.vstack([beat_chroma, beat_mfcc_delta])
# Finally, stack all beat-synchronous features vertically. The result is a feature matrix of shape (12 + 13 + 13, # beat intervals).
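
The shape bookkeeping of the final stacking step can be checked without any audio. The sketch below uses placeholder arrays with the row counts from this tutorial (12 chroma rows, 13 MFCC rows plus 13 delta rows) and a made-up number of beat intervals; all variable names and values are illustrative.

```python
import numpy as np

n_intervals = 5  # hypothetical number of beat intervals

# Placeholder matrices that only mimic the shapes of the real features
fake_beat_chroma = np.zeros((12, n_intervals))
fake_beat_mfcc_delta = np.ones((13 + 13, n_intervals))

# vstack requires matching column counts and concatenates the rows
fake_features = np.vstack([fake_beat_chroma, fake_beat_mfcc_delta])
print(fake_features.shape)  # (38, 5)
```

Stacking only works because both matrices were synchronized to the same beat frames and therefore have the same number of columns.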

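More advanced examples can be found here.
This concludes the first and most basic part of the tutorial. Follow the link above to see how librosa is applied to specific feature extraction steps; if time allows, I will continue translating those tutorials.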

