基於動態時間規整結合線性伸縮之哼唱檢索系統

摘要

 

隨著智慧型手機以及行動網路的普及使用音樂串流平台或社群網站搜索、下載音樂資料成為日常生活的一部分。對於腦海中有某首歌的旋律卻想不出歌名、歌詞的情況,內涵式音樂檢索系統(Content Based Music Retrieval, CBMR)如哼唱檢索系統就可直接使用歌曲內涵式特徵如旋律、節奏等做為檢索依據,解決上述問題。

對於較大的檢索資料庫,我們必須在辨識準確率與運算時間中做權衡,先前研究以線性伸縮(Linear Scaling, LS)或推土機距離(Earth Mover’s distance, EMD)等準確度較低但運算速度快的方法先過濾掉不相似歌曲,再以高複雜度的動態時間規整(Dynamic Time Warping, DTW)對剩餘歌曲做高精度比對最後做相似度融合並輸出前十相似的歌曲清單。本研究將LS用在縮短旋律特徵以及在優化模塊中當作微調特徵長度的工具,前者可以減少運算量後者可以使準確度再上高精度的DTW則負責計算匹配距離,在歌曲中移動及伸縮匹配窗口找出對應的旋律起始點以及最佳匹配距離,並設計優化模塊及前置過濾器。實驗結果顯示,本系統的MRR值優於先前的研究,線性縮短資料與優化閥值節省了約40%運算時間。

關鍵字 : 音樂資訊檢索、哼唱檢索、動態時間規整、線性伸縮

 

Query by singing and humming system based on combined DTW and Linear Scaling

Abstract

 

With the growing popularity of mobile device and internet service, the amount of music data distributed over the internet are increasing everyday. Music information retrieval systems that have capabilities to search for music accurately and quickly are getting more and more attention.

Sometimes users only remember the melody but forget the lyrics, the Content Based Music Retrieval (CBMR) system like QBSH can solve this problem by using features extracted from the music for searching.

To deal with massive retrieval data, we need to balance accuracy and computation time, previous research combines multiple classifiers using score level fusion to reduce computation time, but poor classifier would lead to poor accuracy. Instead of score level fusion method, our proposed system combines DTW and linear scaling(LS), using LS to shorten query and reference songs in different procedure to reduce computation time and using DTW to compute the similarity between query and reference songs. we also design refinement module and pre-filter to enhance the accuracy.

The experiment results show that our method provide higher MRR compared with previous approach, and we reduce about 40% computation time by scaling down the data and finding best threshold setting.

Keywords: Music Retrieval, Query by singing and humming, Dynamic Time Warping, Linear Scaling