Javascript: How to determine the musical key of mp3 files
I've done a lot of Google searching but haven't been able to find an example of how to determine the musical note of mp3 files.
So far, I've read something about FFT (Fast Fourier Transform), from which the pitch of an audio file can be calculated, and that the musical note can be derived from the pitch notation.
But then I read that the mp3 file format is in the time domain and that, because of the lossy compressed format, it doesn't contain the sample values necessary for frequency analysis... does that mean that you have to convert the mp3 to a wav file in order to calculate the key?
I've found a couple of examples of real-time pitch detection for visualization purposes, but none for analysing an entire mp3 file and outputting the musical key.
I hope someone can point me in the right direction.
Thanks.
asked Sep 9, 2016 at 12:11 by Ace, edited Jun 20, 2020 at 9:12
- 1 "the mp3 file format is in the time domain" - well, not quite. It is a coded (data compressed) version of an uncompressed file e.g. WAV PCM, which in turn is a representation of a time domain signal. – BrechtDeMan Commented Sep 9, 2016 at 12:17
- 1 MP3 is a lossy format that alters and filters frequencies. You can't restore what isn't there anymore. But reading the information you provided you can see that indeed a conversion should/could help because the FFT works on the "raw" data. I just don't know how this relates to JavaScript? Especially on the Client I wouldn't be too sure you're even able to read that kind of data. – Seth Commented Sep 9, 2016 at 12:18
- 2 This is a very complicated problem that many researchers are still working on, and there's no simple one-size-fits-all solution. Forget about MP3 vs WAV though, that is not the issue. You need to get the signal, then do many complicated things with it to get an estimation of the key. – BrechtDeMan Commented Sep 9, 2016 at 12:19
- Okay, but isn't it possible to determine the pitch notation based on the amplitude in the time domain? – Ace Commented Sep 9, 2016 at 12:39
- Here’s a related question about real-time pitch detection (in C#) and my Python implementation of a handful of pitch estimators (harmonic product spectrum, Welch spectrogram, Blackman-Tukey spectral estimator): gist.github.com/fasiha/957035272009eb1c9eb370936a6af2eb Your broader question of musical key is one that escapes my very limited understanding of music. Can you explain, if you had a sequence of pitches (in Hertz), how you would get musical key out of that? – Ahmed Fasih Commented Sep 9, 2016 at 13:33
2 Answers
I created an application, PitchScope Player, which can do pitch detection on MP3 files in realtime, and its complete source code is posted on GitHub; however, it is written in C++. Pitch detection and musical key detection, especially in realtime, are extremely demanding and probably need the speed of C++ at this point in time. You have just begun to explore a very difficult audio engineering task, and you really need to first get some background on the physics of how we perceive ‘pitch’ and what a ‘harmonic’ is, and to explore the choices for making a frequency-domain transform from the raw signal (see Wikipedia link below).
When a single key is pressed on a piano, what we hear is not just one frequency of sound vibration, but a composite of multiple sound vibrations occurring at different, mathematically related frequencies. The elements of this composite of vibrations at differing frequencies are referred to as harmonics or partials. For instance, if we press the Middle C key on the piano, the individual frequencies of the composite's harmonics will start at 261.6 Hz as the fundamental frequency, 523 Hz would be the 2nd harmonic, 785 Hz would be the 3rd harmonic, 1046 Hz would be the 4th harmonic, etc. The later harmonics are integer multiples of the fundamental frequency, 261.6 Hz (ex: 2 x 261.6 = 523, 3 x 261.6 = 785, 4 x 261.6 = 1046). We detect pitch by looking for groups of harmonics which have that mathematical relationship in the spacing of their frequencies.
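To make the harmonic relationship concrete, here is a minimal JavaScript sketch. It is not taken from PitchScope; the helper names (`harmonics`, `nearestNote`, `harmonicScore`) are hypothetical, and it assumes equal temperament with A4 = 440 Hz and a list of spectral peak frequencies that some earlier analysis step has already produced.

```javascript
// Hypothetical helpers for illustration only (not PitchScope's code).
const NOTE_NAMES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B'];

// Frequencies of the first `count` harmonics: integer multiples of the fundamental.
function harmonics(fundamentalHz, count) {
  return Array.from({ length: count }, (_, i) => fundamentalHz * (i + 1));
}

// Nearest equal-tempered note name, assuming A4 = 440 Hz (MIDI note 69).
function nearestNote(freqHz) {
  const midi = Math.round(69 + 12 * Math.log2(freqHz / 440));
  const name = NOTE_NAMES[((midi % 12) + 12) % 12];
  const octave = Math.floor(midi / 12) - 1;
  return name + octave;
}

// Count how many of the first `maxHarmonic` harmonics of a candidate
// fundamental have a measured peak within `toleranceCents` of them.
function harmonicScore(peaksHz, f0Hz, maxHarmonic = 4, toleranceCents = 30) {
  let matched = 0;
  for (let h = 1; h <= maxHarmonic; h++) {
    const target = h * f0Hz;
    const hit = peaksHz.some(
      (p) => Math.abs(1200 * Math.log2(p / target)) <= toleranceCents
    );
    if (hit) matched++;
  }
  return matched;
}
```

With the Middle C figures above, `nearestNote(261.6)` gives `"C4"`, and a peak list such as `[261.6, 300, 523, 785, 1046]` scores higher for a candidate fundamental of 261.6 Hz than for 300 Hz. A real detector would also have to weigh peak strength and deal with octave ambiguity, which this sketch ignores.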
Rather than use an FFT, I use a modified logarithmic DFT so that its frequency channels can be aligned to where the harmonics are located within a musical signal. The logarithmic DFT also gives a distinct speed advantage in execution.
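The details of the modified transform are in the C++ source, not in this answer, so the following is only a rough sketch of the general idea of a log-spaced DFT: evaluate individual DFT bins at frequencies placed one semitone apart instead of on the FFT's linear grid. The function names, the Hann window, and the semitone spacing are all assumptions made for illustration, and this naive version makes no attempt at the speed advantage mentioned above.

```javascript
// Centre frequencies one semitone apart (a logarithmic frequency axis),
// assuming equal temperament with A4 = 440 Hz (MIDI note 69).
function semitoneFrequencies(lowMidi, highMidi) {
  const freqs = [];
  for (let m = lowMidi; m <= highMidi; m++) {
    freqs.push(440 * Math.pow(2, (m - 69) / 12));
  }
  return freqs;
}

// Magnitude of a single Hann-windowed DFT bin at an arbitrary frequency.
function dftMagnitudeAt(samples, sampleRate, freqHz) {
  const n = samples.length;
  let re = 0;
  let im = 0;
  for (let i = 0; i < n; i++) {
    const w = 0.5 - 0.5 * Math.cos((2 * Math.PI * i) / (n - 1)); // Hann window
    const phase = (2 * Math.PI * freqHz * i) / sampleRate;
    re += w * samples[i] * Math.cos(phase);
    im -= w * samples[i] * Math.sin(phase);
  }
  return Math.hypot(re, im) / n;
}

// One spectrum frame with a channel per semitone between two MIDI notes.
function logSpectrum(samples, sampleRate, lowMidi, highMidi) {
  return semitoneFrequencies(lowMidi, highMidi).map((freqHz) => ({
    freqHz,
    magnitude: dftMagnitudeAt(samples, sampleRate, freqHz),
  }));
}
```

Here `samples` is one frame of decoded mono PCM, for example a slice of a `Float32Array`. Because every channel sits on a note frequency, the harmonics of a played note tend to land on or near channels rather than between FFT bins.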
Once you have detected numerous pitches in the musical signal, then you can detect the Musical Key by scoring the 12 different Key Candidates by the populations of member notes within that musical signal. Another application of mine, PitchScope Navigator, can also detect Musical Key in realtime.
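As a hedged illustration of that scoring step (not the actual PitchScope Navigator algorithm), the sketch below folds a list of detected pitches into the 12 pitch classes and scores each of the 12 major-scale candidates by how many detected notes belong to it. The function names are hypothetical, and it assumes equal temperament with A4 = 440 Hz.

```javascript
// Hypothetical key-scoring sketch; assumes A4 = 440 Hz, equal temperament.
const NOTE_NAMES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B'];
const MAJOR_SCALE_STEPS = [0, 2, 4, 5, 7, 9, 11]; // semitones above the tonic

// Pitch class (0 = C ... 11 = B) of the nearest equal-tempered note.
function pitchClass(freqHz) {
  const midi = Math.round(69 + 12 * Math.log2(freqHz / 440));
  return ((midi % 12) + 12) % 12;
}

// Score each of the 12 key candidates by the number of detected notes
// that are members of its scale; best candidate first.
function scoreKeys(pitchesHz) {
  const histogram = new Array(12).fill(0);
  for (const f of pitchesHz) histogram[pitchClass(f)]++;

  return NOTE_NAMES.map((name, tonic) => {
    let score = 0;
    for (const step of MAJOR_SCALE_STEPS) {
      score += histogram[(tonic + step) % 12];
    }
    return { key: name + ' major', score };
  }).sort((a, b) => b.score - a.score);
}
```

Calling `scoreKeys` on the pitches detected across a whole file returns the candidates ranked by score. Note that a major key and its relative minor contain the same notes, so counting membership alone cannot tell them apart; weighting notes by duration or by their role in the scale would be one way to refine it.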
You might want to acquire a C++ compiler and recompile my source code so you can step through its execution to see how my algorithms work. It will also decode an MP3 file. You could also download an executable of that application, PitchScope Player, from numerous places on the web in order to see how it performs on a Windows machine with an MP3 file of your choice.
https://github.com/CreativeDetectors/PitchScope_Player
https://en.wikipedia.org/wiki/Transcription_(music)#Pitch_detection
Below is the image of a Logarithmic DFT (created by my C++ software) for 3 seconds of a guitar solo on a polyphonic mp3 recording. It shows how the harmonics appear for individual notes on a guitar, while playing a solo. For each note on this Logarithmic DFT we can see its multiple harmonics extending vertically, because each harmonic will have the same time-width.
This guy wrote an incredible library that has worked extremely well for me, but I have been told by others that it does not work out-of-the-box for them: they gave me another link.
I'm just going to link to their sites because they tutorialize what they have done on the sites, and they have some licenses (MIT license or whatever)--don't want to mess things up by unintentionally violating their licenses by reposting.
Anyway, it worked great for me!
This is the one I prefer with a tutorial alongside:
https://alexanderell.is/posts/tuner/
Here is the one that was recommended to me by the individual for whom the first one did not work. I could not find the tutorial, but it's out there somewhere:
https://harald.ist/tools/spectrum_analyser.html