Get music tempo or BPM? - C#

I'm currently building a game on Windows Phone 7 using XNA, and I'm trying to get the beats per minute of the song playing in the background.
I'm not quite sure that BPM is exactly what I want; what I want is something like the pace or tempo of the music: the faster the pace, the faster the sprites move. What I'm thinking right now is that BPM is how often a frequency from the music hits some defined range, e.g. 20 MHz to 30 MHz.
Feel free to correct me if I'm wrong; I'm not really familiar with audio. I have tried using VisualizationData from XNA's MediaLibrary, but after some googling, people say that VisualizationData doesn't work on WP7. I tried it anyway, and the output is a 256-length float array containing only zeroes. If I could do an FFT with it, I'd give that a try.
Thank you...

Like you were saying, you can't get the beats directly; you'll have to interpret this data. If you can preprocess the music yourself and ship the results with your title, that would be your best bet.
In XNA you really only have MediaPlayer.GetVisualizationData to work with; there isn't anything built in that predetermines this sort of thing for you. It's used like the following and gets you information about the different frequencies that are playing.
MediaPlayer.IsVisualizationEnabled = true;    // must be enabled before querying
VisualizationData visData = new VisualizationData();
MediaPlayer.GetVisualizationData(visData);    // fills 256 frequency and sample values
So how do you take this frequency data and make it worthwhile for your application? There's a great breakdown of how to do this on the App Hub forums, in the thread called "Audio Analysis", in the reply by jwatte. Essentially, you look at the low frequencies and try to figure out when the beats are coming in. Nothing perfect, but hopefully you'll get something that you approve of.
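To make that concrete, here is a minimal sketch of that low-frequency approach, assuming it runs once per frame from Game.Update(); the band width (8 bins) and the 1.5 spike threshold are made-up starting points you would need to tune per song:

// Fields on your Game class.
VisualizationData visData = new VisualizationData();
float runningAverage;

// Call once per frame; returns true on a suspected beat.
bool DetectBeat()
{
    MediaPlayer.GetVisualizationData(visData);

    // Sum the energy in the lowest bins, where kick drums and bass live.
    float lowEnergy = 0f;
    for (int i = 0; i < 8; i++)
        lowEnergy += visData.Frequencies[i];

    // A beat shows up as a sudden spike well above the recent average.
    bool beat = lowEnergy > runningAverage * 1.5f;

    // Smooth the average so it follows the song's overall loudness.
    runningAverage = 0.9f * runningAverage + 0.1f * lowEnergy;
    return beat;
}

To turn detected beats into a tempo, you could time the gaps between them and average: 60 / (average gap in seconds) gives BPM.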
Good luck!

Related

Input of a Rhythm Game

There aren't a lot of rhythm games made with Unity on Android. I decided to find out why, and to program one as an assignment (the basics of it, anyway). My most important hurdle is user input. As we know, input in Unity is polled once per frame, and a music game (I assume) would prefer the smallest possible delay between button press and action.
If we look at music, at around 15 to 20 ms of delay the human ear hears that something is "off beat".
I've heard Android Unity games run at 30 FPS (since 60 FPS sucks the battery dry). Simple math gives 1000 / 30 = 33 ms per frame. Subtracting the 15 ms we probably can't notice, we are left with 18 ms of possible disaster, and that assumes we always reach 30 FPS at any given moment.
When I get input from a user, I can play a sound on that exact frame. However, we could still be 18 ms off.
Now, there is a way to get DIRECT input from mouse and keyboard, which uses OnGUI() instead of Update() to get keyboard and mouse events on the spot. The problem is that Android probably doesn't work with this (it doesn't work for gamepads either), and the method sounds downright strange, especially when we try to play sounds from inside OnGUI().
My question:
What would you do, and why? Should we just accept the possible 18 ms error and assume we reach 30 FPS, or should we look for a reliable way to get input directly, instead of waiting for the next Update()?
Thanks for any insight you can give me; I have not found any useful articles on this just yet.
-Smiley
EDIT
I just did some basic testing with a metronome, running at 100 FPS in the editor (which should be 10 ms per frame), tapping my spacebar along with a metronome inside Unity. The results I got were just horrible.
Tapping rapidly: I get as close as 20 ms to the metronome tick, but nothing closer.
Tapping on the beat: I was at least 200 ms off my target tick. Unless I am confused about the rhythm, this is just wrong.
Currently I use Debug.Log to get my test data into the log. Can anyone confirm whether this may be the cause (does it add a long delay? I know Debug.Log isn't that optimized), or is the timing actually that bad?
Thanks in advance,
-Smiley
First, I'd like to add a few things to your comments and analysis:
There is a lot more to measuring the latency between a tactile input and what your eyes and ears perceive.
Probably the biggest factor missing from your tests is the latency between the graphics card and the PC monitor you're testing on: many common LCD monitors these days have 15 to 30 ms of processing lag. I don't know how this translates to mobile screens and hardware, but I'd suggest taking the time to run additional tests on your target hardware before drawing further conclusions.
To more directly answer your question:
I would continue to use Unity, but I would keep researching the best methods of taking the input and feeding it back to the player as fast as possible. In the comments above, @rutter has pointed you to what appears to be a pretty good thread on the issue.
Of specific note, I would look at using FixedUpdate() to decouple the game's framerate from the input processing rate.
I think it is also worth putting time into researching the psychology of latency perception. For example, if your game is a Guitar Hero style game of matching a playing song, you could measure the lag you know is there and account for it in your game logic when checking input.
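As a minimal sketch of that compensation idea in Unity C# (the latency and hit-window numbers here are hypothetical and would come from calibrating on the target device):

using UnityEngine;

public class HitJudge : MonoBehaviour
{
    public float knownLatencyMs = 18f;  // measured for the target device
    public float hitWindowMs = 50f;     // how far off still counts as a hit
    public float nextBeatTime;          // in seconds, from your song data

    void Update()
    {
        if (Input.GetKeyDown(KeyCode.Space))
        {
            // Shift the input back by the latency you know the pipeline adds.
            float judgedTime = Time.time - knownLatencyMs / 1000f;
            float errorMs = Mathf.Abs(judgedTime - nextBeatTime) * 1000f;
            Debug.Log(errorMs <= hitWindowMs ? "Hit!" : "Miss");
        }
    }
}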
I think you are over-complicating this, and that the accuracy issue is nowhere near as bad as you think.
People usually hit the buttons a little early in order to sync what they are seeing and hearing.
It also depends a lot on whether you have some kind of scrolling display they are trying to match up to: if the display is scrolling smoothly at 30 FPS (without big jumps), then they can still make their timing presses fairly accurate.
I would surmise that although people can hear when their timing is off, their actual timing in hitting the buttons at exactly the right moment is not that accurate anyway.
Here is one other simple solution, which I believe is what Rock Band and Guitar Hero often do:
You start playing the note/sound at the correct time anyway, then switch it to a broken sound if you detect that the player missed it or goofed up.
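A rough sketch of that trick in Unity terms (the AudioSource/AudioClip setup here is an assumption for illustration, not how any actual title does it):

using UnityEngine;

public class NotePlayback : MonoBehaviour
{
    public AudioSource noteSource;  // plays the correct note, started on schedule
    public AudioClip brokenClip;    // a sour/muted version of the same note

    // Call this when you detect the player missed or goofed up.
    public void OnPlayerMissed()
    {
        float pos = noteSource.time;   // remember the playback position
        noteSource.clip = brokenClip;  // swap in the broken sound
        noteSource.time = pos;         // so the break lands mid-note
        noteSource.Play();
    }
}

This way the timing judgment only affects which sound keeps playing, not when a sound starts, so small input latency never produces audibly late notes.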

Recognizing notes when playing over microphone

I want to make a program that recognizes the notes I play on my guitar over the microphone, but I'm not sure how to make my program recognize the sounds I play and then choose between a set of notes.
Can I get any help with this? I basically need a library that can recognize sound played over the microphone and then compare it to different audio files to see which one is closest to the played note.
I hope you understand; it is hard to explain.
As Dan Bryant mentioned, you basically want to do an FFT, which gives you the amount of energy at each frequency. Find the frequency with the maximum energy, then choose the note whose pitch is closest to it. This is what's going on inside the little digital tuners you buy to help tune your guitar. There are several available libraries that will do the FFT for you; you just need to specify an FFT size that gives you enough frequency resolution to distinguish between notes.
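For reference, frequency resolution is sampleRate / fftSize: at 44.1 kHz, a 4096-point FFT gives about 10.8 Hz per bin, while E2 (about 82.4 Hz) and F2 (about 87.3 Hz) on a guitar are only about 5 Hz apart, so the low strings may need a larger FFT or peak interpolation. Once you have the peak frequency, mapping it to the nearest note is simple; here is a small sketch using the standard MIDI relation (A4 = 440 Hz = note 69, 12 semitones per octave):

using System;

static string NearestNote(double frequencyHz)
{
    string[] names = { "C", "C#", "D", "D#", "E", "F",
                       "F#", "G", "G#", "A", "A#", "B" };

    // Round to the nearest equal-tempered semitone.
    int midi = (int)Math.Round(69 + 12 * Math.Log(frequencyHz / 440.0, 2));
    return names[midi % 12] + (midi / 12 - 1);  // e.g. 82.4 Hz -> "E2"
}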

Playback of 'recorded' sounds

I'm building a fairly simple sampler in C#. I've got the basic sound structure down (pitched sounds, stopping sounds mid-play, etc.), but my problem arises when I try to record and play back the sounds the user inputs.
When recording, I save each sound into a dictionary keyed by its start time (a 'time' value counted from 0), along with the length of the sound.
When playing back the recording, I currently use a timer to simulate time in the system: each time I play a sound, I set the timer interval to the time difference between the current sound and the next one.
It mostly starts out fine, but the playback usually drifts completely out of sync: sounds are cut short, start too late, and so on. I assume the problem is my use of a timer, but I have no idea of another way to do it.
I'm using Bass.Net for the sounds.
Which timers are you using? Most timers in .NET are too inaccurate for this task; you should look into P/Invoking the Win32 multimedia timers (see this for example) or building your own scheduler around Stopwatch.
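As a sketch of the Stopwatch approach: run playback on its own thread and fire each sound once the elapsed time passes its recorded start time. PlaySound here is a hypothetical stand-in for however you trigger your Bass.Net channels:

using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;

// events: recorded start time in ms -> an id you can hand to Bass.Net.
void PlaybackLoop(SortedList<double, int> events)
{
    var clock = Stopwatch.StartNew();
    int next = 0;

    while (next < events.Count)
    {
        if (clock.Elapsed.TotalMilliseconds >= events.Keys[next])
        {
            PlaySound(events.Values[next]);  // hypothetical trigger method
            next++;
        }
        else
        {
            Thread.Sleep(1);  // 1 ms polls; spin-wait if you need tighter timing
        }
    }
}

Because every sound is scheduled against one monotonic clock instead of chained timer intervals, small errors no longer accumulate into drift.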

People Counting System

I want to develop a "People Counting System" using OpenCV (or Emgu CV).
Please guide me on how to implement or lead me to some examples or open source projects.
(I have done some work already: extracting the frame difference and thresholding it to remove the background, using motion history, and the like, but still with no good results.)
Edit 1: I am counting a high flow of people (a dozen of them may come through simultaneously).
Edit 2: It must be at least 80% accurate. People walk through a door that is almost 5 meters wide. The problem is I have no control over the position or angle of the camera. The camera shoots the area from a distance of 10 m at a height of 2.5 m.
Thank you
If by a people counting system you mean a system that counts the people currently in a room, then I recommend implementing the hardware with a microcontroller, 2 lasers (normal laser toys work), and 2 photoresistors. For the microcontroller I recommend an Arduino. Then make a C# application that has a SerialPort object and reads the data the Arduino sends over USB; the Arduino could send 1 for "someone entered the room" and 0 for "someone left the room", for example. The logging and statistics can then be done easily in C#.
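A minimal sketch of the C# side under that protocol (the port name is machine-specific, and the 1/0 bytes are just the convention suggested above):

using System;
using System.IO.Ports;

var port = new SerialPort("COM3", 9600);  // adjust to your Arduino's port
int peopleInRoom = 0;

port.DataReceived += (sender, e) =>
{
    while (port.BytesToRead > 0)
    {
        int b = port.ReadByte();
        if (b == '1') peopleInRoom++;                           // someone entered
        else if (b == '0' && peopleInRoom > 0) peopleInRoom--;  // someone left
        Console.WriteLine("People in room: " + peopleInRoom);
    }
};
port.Open();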
Arduino site: here
Photoresistor for $1: here
This solution is a lot cheaper and easier to implement than using a camera of fairly good quality.
Hope I helped you.
Check out the HOG pedestrian detector that comes with recent versions of OpenCV (>= 2.2).
See modules/objdetect/src/hog.cpp and samples/cpp/peopledetect.cpp in the OpenCV sources. Unfortunately there is no official documentation about it yet.
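From C# you can reach the same detector through Emgu CV; a rough sketch (member names as in the Emgu CV 2.x wrapper, so check them against your version):

using System.Drawing;
using Emgu.CV;
using Emgu.CV.Structure;

using (var hog = new HOGDescriptor())
using (var frame = new Image<Bgr, byte>("frame.jpg"))
{
    // Load the built-in SVM trained for upright pedestrians.
    hog.SetSVMDetector(HOGDescriptor.GetDefaultPeopleDetector());

    // Each rectangle is one detected person in this frame.
    Rectangle[] people = hog.DetectMultiScale(frame);
    int count = people.Length;
}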
This would help you to count moving things including people: Motion Detection project on CodeProject
Are people the only kind of "entity" in the scene? If not, do you mind counting some other kind of thing that moves through the scene as a person? If that is acceptable, you could just count blobs that come into or out of the scene. It may sound a bit naive, but I would take some kind of motion image and group the motion pixels into clusters by distance. Your distance metric could take some constraints into account, such as that people usually stand upright, so the pixels in a cluster should group around some kind of regression line (a vertical line if the camera is aligned with the floor). It shouldn't be necessary to track them through the scene, just to notice when they enter or leave, though you'd get some issues with, for example, people entering the scene alone and leaving in pairs or groups... Good luck :)
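As a naive sketch of that blob-counting idea, assuming you already have a binary motion mask for the frame (API names follow Emgu CV 2.x):

using System.Drawing;
using Emgu.CV;
using Emgu.CV.Structure;

static int CountBlobs(Image<Gray, byte> motionMask, double minArea)
{
    int blobs = 0;

    // Walk the top-level contours of the white (moving) regions.
    for (Contour<Point> c = motionMask.FindContours(); c != null; c = c.HNext)
    {
        if (c.Area >= minArea)  // skip noise specks
            blobs++;
    }
    return blobs;
}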
I think if you have a dense crowd with a lot of occlusions, you have to use some machine learning algorithm; for example, you can use the Implicit Shape Model for features.
It really depends on the position of the camera. Assuming that you can get front facing profiles of the people in the images:
This problem is basically face detection and recognition.
There are many ways to go about finding faces, but this is the approach that I'm a little more familiar with.
For the face detection you need to do image segmentation on skin-tone color. This will extract the skin regions [arms, the chest (for those wearing V-cut tops), face, legs, etc.]. Then you need to line up the profiles of the skin regions with the profiles of your trained faces.
[You'll need to use eigenfaces to create a generic profile of what a face looks like.]
If a skin region lines up and doesn't deviate too far from the profile, it is considered a face. Once the face is confirmed, add it to the eigenfaces data store [for recognition]. To save processing, you might want to limit the search area when looking for a previously seen face [given the frame rate and the last time the person was seen].
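For the segmentation step, a rough sketch using the common YCrCb skin-tone ranges (the thresholds are rule-of-thumb values from the literature, not tuned for any particular footage; Emgu CV types are used since the question mentions it):

using Emgu.CV;
using Emgu.CV.Structure;

using (var frame = new Image<Bgr, byte>("frame.jpg"))
using (var ycrcb = frame.Convert<Ycc, byte>())
{
    // White wherever the pixel falls inside the skin-tone box.
    Image<Gray, byte> skinMask = ycrcb.InRange(
        new Ycc(0, 133, 77),      // lower Y, Cr, Cb bounds
        new Ycc(255, 173, 127));  // upper bounds

    // Blob analysis on skinMask then yields the candidate skin regions.
}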
If you are referring to "crowd flow", I think you just mean the density of faces in the crowd.
At this point you've confirmed that a moving object in the video is a person; now you just need to note that and make sure you don't count them as a new person again.
This approach really depends on your ability to detect face regions. It may not work if the people in the video are looking down, don't fit the profile of the trained data, etc. It may also be affected if a person puts on sunglasses within the video [they would probably be considered a "new face"].

Machine Sounds in C#

I want to add machine sounds to my router table simulator program. The idea is to hear the stepper motors accelerate and decelerate, with a change in pitch synced to my graphics. Same with the spindle motor: its pitch changes as its RPM goes up and down. I thought of either adding real recorded motor sounds and modifying the pitch at run time, or creating synthetic sounds that simulate the motors. Can anyone suggest the easiest way to achieve this? I have hardly any experience in sound programming beyond the most basic stuff. I am programming in C# in Visual Studio .NET.
Check these links out:
http://www.codeproject.com/KB/audio-video/cswavplayfx.aspx
http://channel9.msdn.com/coding4fun/articles/Generating-Sound-Waves-with-C-Wave-Oscillators
These should give you an idea of the two options you mentioned (playing pre-recorded audio with effects applied versus generating the sound synthetically), either way...
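As a starting point for the synthetic option, here is a self-contained sketch that generates a tone whose pitch ramps up like a motor accelerating and plays it with SoundPlayer. For pitch that follows your graphics live, you would stream short buffers the same way through a library such as NAudio or BASS instead of pre-rendering:

using System;
using System.IO;
using System.Media;

static void PlayMotorRamp()
{
    const int sampleRate = 44100;
    int count = sampleRate * 2;  // 2 seconds of audio
    var ms = new MemoryStream();
    var w = new BinaryWriter(ms);

    // Minimal 16-bit mono PCM WAV header.
    w.Write("RIFF".ToCharArray()); w.Write(36 + count * 2);
    w.Write("WAVE".ToCharArray()); w.Write("fmt ".ToCharArray());
    w.Write(16); w.Write((short)1); w.Write((short)1);
    w.Write(sampleRate); w.Write(sampleRate * 2);
    w.Write((short)2); w.Write((short)16);
    w.Write("data".ToCharArray()); w.Write(count * 2);

    // Sweep the pitch from 100 Hz to 800 Hz by accumulating phase.
    double phase = 0;
    for (int i = 0; i < count; i++)
    {
        double freq = 100 + 700 * i / (double)count;
        phase += 2 * Math.PI * freq / sampleRate;
        w.Write((short)(Math.Sin(phase) * short.MaxValue * 0.5));
    }

    ms.Position = 0;
    new SoundPlayer(ms).PlaySync();
}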
