I'm currently working on an idea for a game that involves beat detection. The engine I'm working with is Unity, and I've never had any experience with audio, coding-wise, so be gentle :)
I've looked at several articles and tested several algorithms, including some of my own, but none were really successful or accurate enough, and I feel like I've been getting something wrong this entire time.
Specifically, I've tried implementing the ideas presented here:
http://archive.gamedev.net/archive/reference/programming/features/beatdetection/index.html
but with little success. I still think I'm skipping over something and I can't quite pinpoint it.
If someone could provide an explanation of how to make an actually accurate beat detector, I would be very grateful.
EDIT:
Some people were confused as to what I'm having trouble with. Here is my latest try at detecting beats; I still don't understand why it's so inaccurate:
http://pastebin.com/BD8y9tfz
In this I used equation (R1) from the link I posted above to compute the instant energy from the 1024 samples I took, and then I used (R3) to calculate the local average sound energy from the buffer containing all the previous instant energy calculations. Then I checked whether there is a significant rise in instant energy compared to the local average sound energy: if there is, it means there is a beat; if there isn't, the program continues as usual.
(stupid reputation system doesn't let me post links and pictures ): ).
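For readers following along, the comparison described above can be sketched roughly like this. It is a minimal sketch, not the code from the pastebin: the class and method names are hypothetical, the samples are assumed to be floats in -1..1, and the history length of 43 blocks (about one second of 1024-sample blocks at 44.1 kHz) and the threshold of 1.3 are assumed starting values to tune.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: compares each block's energy with a running average.
public class EnergyBeatDetector
{
    private readonly Queue<double> history = new Queue<double>();
    private readonly int historySize;
    private readonly double threshold;

    // historySize and threshold are assumed defaults, not fixed requirements.
    public EnergyBeatDetector(int historySize = 43, double threshold = 1.3)
    {
        this.historySize = historySize;
        this.threshold = threshold;
    }

    // Instant energy: sum of squared samples in one block.
    public static double InstantEnergy(float[] block)
    {
        double sum = 0.0;
        foreach (float s in block) sum += (double)s * s;
        return sum;
    }

    // Returns true when the block's energy clearly exceeds the local average.
    // No beat is reported until the history buffer is full.
    public bool Process(float[] block)
    {
        double e = InstantEnergy(block);
        bool isBeat = false;
        if (history.Count == historySize)
        {
            double average = history.Average();
            isBeat = e > threshold * average;
            history.Dequeue(); // drop the oldest energy value
        }
        history.Enqueue(e);
        return isBeat;
    }
}
```

Feeding it one 1024-sample block at a time, `Process` should return true only for blocks that are noticeably louder than the preceding second or so of audio.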
Edit 2:
Added implementations for R4, R5 and R6, still not working though.
Added a bit of debug output, and for some reason the constant is ridiculously small, with numbers like:
Constant: -103416
and Constant: -54793.28. I've got no clue why I'm getting these numbers, any help?
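As a reference point, here is a sketch of the variance-based constant, assuming the linked article's formulas are as I recall them: the variance V is the average of (E[i] - <E>)^2 over the energy history, and C = -0.0025714 * V + 1.5142857. The class and method names are hypothetical. Since C falls linearly with V, it only stays positive while V is below roughly 590, so energies computed on a much larger scale than the formula expects would push it strongly negative. That might be what the debug output shows, but I can't confirm it without the code.

```csharp
using System;
using System.Linq;

// Hypothetical helper for the variance-based threshold.
public static class VarianceThreshold
{
    // Population variance of the energy history.
    public static double Variance(double[] energies)
    {
        double mean = energies.Average();
        return energies.Select(e => (e - mean) * (e - mean)).Average();
    }

    // Linear mapping from variance to threshold constant
    // (coefficients assumed from the linked article).
    public static double Constant(double variance)
    {
        return -0.0025714 * variance + 1.5142857;
    }
}
```

With these coefficients a variance of 200 gives a constant of about 1.0, while a variance of 40 million gives about -102,854, which is the same order of magnitude as the values logged above.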
Related
I know that a certain delay between the “play-order” and the actual start of the playback of a sound is inevitable.
However, for my current project, I must be able to start a sound playback at a certain moment in time. This moment is known, so the solution to the problem is either to reduce the delay as much as possible or to somehow predict the latency and start the sound somewhat earlier (depending on the predicted latency).
I describe the problem in detail here:
https://naudio.codeplex.com/discussions/662236
My current solution is to use NAudio to play a sound and simultaneously observe the sound-output-volume. This way I can measure the latency and use it to time the “play-order” for the following sounds.
This way I get decent results (about 30 ms deviation from the supposed play-time), but I wanted to ask if you guys have better suggestions.
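The predict-and-compensate idea can be sketched as below. This is only an illustration of the bookkeeping, with hypothetical names and no NAudio calls: the measured latencies are assumed to come from the volume-observation approach described above, and the window of 10 measurements is an arbitrary assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: averages recent latency measurements and
// tells the caller when to issue the "play-order".
public class LatencyPredictor
{
    private readonly Queue<double> samplesMs = new Queue<double>();
    private readonly int window;

    public LatencyPredictor(int window = 10) { this.window = window; }

    // Record one measured latency (order issued -> sound audible), in ms.
    public void AddMeasurement(double latencyMs)
    {
        samplesMs.Enqueue(latencyMs);
        if (samplesMs.Count > window) samplesMs.Dequeue();
    }

    // Mean of the recent measurements; 0 until something has been measured.
    public double PredictedLatencyMs =>
        samplesMs.Count == 0 ? 0.0 : samplesMs.Average();

    // Time at which to issue the play order so the sound lands on targetTimeMs.
    public double IssueTimeMs(double targetTimeMs) =>
        targetTimeMs - PredictedLatencyMs;
}
```

If the measurements contain occasional outliers, a median over the window might behave better than the mean used here.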
Best regards and many thanks
I've been doing a lot of research on this and I'm still having trouble, so I'm hoping someone with a strong knowledge of Digital (Audio) Signal Processing can point me in the right direction.
I've been surprised at how hard it is to find a library that can perform accurate beat detection. I know next to nothing about DSP and FFTs. What I would really like is a library where I can simply say:
BPMDetect detector = new BPMDetect();
float bpm = detector.GetBpm(filename);
But apparently this is too much to ask for. The closest I've gotten is by using the SoundTouch library, but I've recently discovered that the BPM detection there is very unreliable. I know BPM detection isn't an exact science, but SoundTouch claimed that one of my music files was 170 BPM, while abyssmedia's BPM Counter program accurately puts it at 120 BPM. So I know it's possible. I'm more concerned with accuracy than speed.
So my question is: is there a C# library that can do this without having to know a lot about DSP?
You should be able to do basic beat detection without resorting to FFTs. I've built beat detection algorithms that did this.
The basic idea is to take your audio signal and partition it into small buckets of time, say 10 ms or so. For each bucket, compute the power: normalize each sample s[i] in the bucket to -1.0...1.0 and then compute the sum of s[i]**2. (Strictly speaking this is the bucket's energy; dividing by the bucket length and taking the square root gives the RMS value.)
Now you've got an array of power (= loudness) for little grains of time.
Next take the rate of change (derivative) from bucket to bucket: d[i] = p[i+1] - p[i], where p[i] is the power of bucket i.
The array of derivatives tells you how fast the loudness is increasing from grain to grain. The faster the change, the more intense the beat is.
Here's where it gets a bit artful. You put a threshold on these d[i] values to decide which ones are sudden enough to constitute a beat. Then you do autocorrelation to see if they are steady and line up in a regular pattern.
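The steps above might look roughly like this in C#. It is a sketch under assumptions, not the algorithm I actually shipped: the names are hypothetical, the samples are assumed to be mono floats already normalized to -1..1, and the autocorrelation is the plain unnormalized kind, which slightly favors shorter lags.

```csharp
using System;

// Hypothetical helper: bucket power, thresholded derivative, autocorrelation.
public static class OnsetTempo
{
    // Sum of squared samples for each bucket of bucketSize samples.
    public static double[] BucketPower(float[] samples, int bucketSize)
    {
        int n = samples.Length / bucketSize;
        var power = new double[n];
        for (int b = 0; b < n; b++)
            for (int i = 0; i < bucketSize; i++)
            {
                double s = samples[b * bucketSize + i];
                power[b] += s * s;
            }
        return power;
    }

    // d[i] = power[i+1] - power[i]; rises at or below the threshold become 0.
    public static double[] Onsets(double[] power, double threshold)
    {
        var d = new double[Math.Max(power.Length - 1, 0)];
        for (int i = 0; i < d.Length; i++)
        {
            double diff = power[i + 1] - power[i];
            d[i] = diff > threshold ? diff : 0.0;
        }
        return d;
    }

    // Lag (in buckets) between minLag and maxLag with the highest autocorrelation.
    public static int BestLag(double[] onsets, int minLag, int maxLag)
    {
        int best = minLag;
        double bestScore = double.NegativeInfinity;
        for (int lag = minLag; lag <= maxLag && lag < onsets.Length; lag++)
        {
            double score = 0.0;
            for (int i = 0; i + lag < onsets.Length; i++)
                score += onsets[i] * onsets[i + lag];
            if (score > bestScore) { bestScore = score; best = lag; }
        }
        return best;
    }
}
```

The lag converts to tempo as 60 / (lag × bucket duration in seconds), so with 10 ms buckets a best lag of 50 corresponds to 60 / (50 × 0.010) = 120 BPM.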
I'm trying to create a program which gets the various "notes" in a sound file (WAV or MP3) and can get the frequency and amplitude of each. I've been searching around for this, and of course there is the problem of distinguishing individual "notes" in a music file which isn't a MIDI, but it seems that something along these lines can be done with NAudio or DirectSound. Any ideas?
Thanks!
What you are asking to do is extremely difficult.
Step one would be to convert your audio from a time domain to a frequency domain. That is, you take a number of samples, and do a Fourier transform (implemented in your software as FFT).
Next, you begin deciding what you call a note or not. This is not as simple as picking out the loudest of the frequencies! Different instruments have different timbre, which is created by various harmonics. If you had a song of nothing but sine waves, this would be much simpler. However, you'll find that you'll start seeing notes where your ear tells you they don't exist.
Now, psychoacoustics come into play. It is entirely possible for humans to "hear" notes that do not even have a fundamental. This is particularly true in a musical context. If I were to take a trombone and start playing a scale downward, at some point, the fundamental disappears or is mostly gone. However, you will still perceive that scale as going downward, when in fact the fundamental sound has all but disappeared. Things get really tricky at this point.
To answer your question, start with an FFT. Maybe this is sufficient for your needs. If not, begin reading the significant amount of technical literature on the subject.
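To make the first step concrete, here is a small sketch that moves a block of samples into the frequency domain and picks the strongest bin. Note the assumptions: it uses a naive O(n^2) DFT rather than a real FFT (fine for illustration, too slow for long blocks), the names are hypothetical, and "loudest bin" is exactly the oversimplification warned about above, so it only behaves well on something close to a pure tone.

```csharp
using System;

// Hypothetical helper: naive DFT magnitudes and the strongest frequency bin.
public static class Spectrum
{
    // Magnitude of each DFT bin from 0 (DC) up to n/2 (Nyquist).
    public static double[] Magnitudes(float[] samples)
    {
        int n = samples.Length;
        var mags = new double[n / 2 + 1];
        for (int k = 0; k < mags.Length; k++)
        {
            double re = 0.0, im = 0.0;
            for (int t = 0; t < n; t++)
            {
                double angle = 2.0 * Math.PI * k * t / n;
                re += samples[t] * Math.Cos(angle);
                im -= samples[t] * Math.Sin(angle);
            }
            mags[k] = Math.Sqrt(re * re + im * im);
        }
        return mags;
    }

    // Frequency in Hz of the strongest non-DC bin.
    public static double DominantFrequency(float[] samples, int sampleRate)
    {
        double[] mags = Magnitudes(samples);
        int best = 1; // start at bin 1 to skip the DC component
        for (int k = 2; k < mags.Length; k++)
            if (mags[k] > mags[best]) best = k;
        return (double)best * sampleRate / samples.Length;
    }
}
```

The frequency resolution is sampleRate / blockLength, so 1024 samples at 8192 Hz gives 8 Hz per bin; telling neighboring low notes apart needs longer blocks.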
I am trying to output audio samples, and do so with cswavplay from http://www.codeproject.com/KB/audio-video/cswavplay.aspx which in turn seems to use DllImports from winmm.dll.
I did get it to play using 8-bit samples, however it fails miserably when I try to feed it 16-bit samples. I dug through the code as best as I can and I understand it as this:
I get a pointer to a buffer to fill each time cswavplay finished playing the last buffer. It works for one iteration, it plays back one buffer, sometimes...
I get all sorts of funny exceptions, AccessViolationException just now for instance when I tried to use a buffer size of 44100 to hear more clearly how much gets played. But when I put breakpoints in various places inside the WaveOut class (part of cswavplay) it seems none of the objects it uses, like the buffers and an instance of AutoResetEvent, are still alive the second iteration. My best guess is that these problems are related to threading or GC. The exceptions seem pretty random, and I am far too inexperienced to comprehend fully what's going on.
I'm asking for either of the following:
1) Wild guesses as to what could be the problem
2) Educated guesses as to what could be the problem
3) Pointers to an alternative way of outputting sound in realtime using C#
I'm not asking for a thorough bug tracking of software I didn't write, so don't mind cswavplay...
At the end of the day, I might be doing something wrong here, but it's hard to know when I don't get a relevant exception (along the lines of BufferAllocationException or something)...
EDIT:
Thanks for all the suggestions about other sound APIs. They all seem to assume a .wav file. I'm sorry for not being clear: I'm not playing .wav files, I synthesize samples in realtime.
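For anyone comparing notes on the 8-bit versus 16-bit case, this is a sketch of how synthesized samples are commonly packed as 16-bit PCM. The names are hypothetical and nothing here calls cswavplay or winmm.dll, so I can't say whether it touches the actual bug. The points it illustrates are that 16-bit PCM samples are signed and take two bytes each (little-endian), whereas 8-bit PCM is unsigned with one byte per sample, so a buffer size expressed in bytes doubles for the same number of samples.

```csharp
using System;

// Hypothetical helper: synthesize a sine and pack it as 16-bit PCM bytes.
public static class Pcm16
{
    // count samples of a sine wave in the range -1..1.
    public static float[] Sine(double frequency, int sampleRate, int count)
    {
        var samples = new float[count];
        for (int i = 0; i < count; i++)
            samples[i] = (float)Math.Sin(2.0 * Math.PI * frequency * i / sampleRate);
        return samples;
    }

    // Convert -1..1 floats to signed 16-bit little-endian PCM (2 bytes per sample).
    public static byte[] ToBytes(float[] samples)
    {
        var bytes = new byte[samples.Length * 2];
        for (int i = 0; i < samples.Length; i++)
        {
            float clamped = Math.Max(-1f, Math.Min(1f, samples[i]));
            short value = (short)Math.Round(clamped * short.MaxValue);
            bytes[2 * i] = (byte)(value & 0xFF);            // low byte first
            bytes[2 * i + 1] = (byte)((value >> 8) & 0xFF); // then high byte
        }
        return bytes;
    }
}
```

For example, one second of mono audio at 44100 Hz is 44100 samples but 88200 bytes in this format.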
DirectSound and, for .NET, the XNA framework come to mind. There are many very high-quality samples out there showing how to play sound and animate graphics at the same time with .NET.
Recently I was playing a game of Gin with my grandmother. We played a whole afternoon and, as far as I can remember, I didn't win a single game.
So I told her that with the help of computers I could become a much better player. She couldn't believe that computers could be useful there, and that's why I want to demonstrate it.
I have already implemented part of the logic, but now I have the problem that my solver is not very elegant because it is mainly based on a brute-force method. That is, I calculate all the possibilities, score them according to the chances of a win, and choose the best one. Is there a more sophisticated approach?
I'm talking about standard Gin. The implementation is done in C#.
I'm not 100% familiar with gin, but one thing to keep in mind is each strategy can be broken.
When you play, you have to play multiple players. Do they all have the same strategy? Do they have different strategies? If we played, you might beat me in gin, but I might beat your grandma. How does your grandma know how to play? If you were given the same hand she has, how would she play it differently than you? Yes, you can take a look at the statistics, but your grandma doesn't play by statistics, she plays by experience. If you want to make it more sophisticated, ask yourself "How can I factor experience into the hand?"