Determining if the user is speaking on a HL2 - c#

I've been using the MRTK MicStream APIs to try to determine whether the HoloLens 2 user is speaking. I've tried for a couple of weeks, and all attempts have failed for reasons documented here and here. I am officially giving up on the MicStream API. I also tried the AudioFrame method from the MediaCapture APIs, which fails as well. (MediaCapture is also how the MicStream DLL attempts to access raw audio data, which you can see here, on lines 351-357, so this isn't surprising.)
My question is: how else can I determine only if a user is speaking? I do not need dictation, a recording, or to use speech commands. I only want to know if the two user microphones on channel 1 and 2 are active above the normal room amplitude (in real time). Does anyone know any other ways outside these methods?

For what it's worth, I finally fixed it. The MicStream DLL's MediaCapture instance was conflicting with one I had already instantiated for photo captures. In short, you can't use MicStream alongside another MediaCapture instance. I tried changing the SharingMode setting on the first MediaCapture (in my script for capturing photos), but this didn't work. I had to stop using the MicStream .dll completely and consolidate the audio capture under one MediaCapture instantiated with StreamingCaptureMode.AudioAndVideo. This fixed the problem.
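Once a single MediaCapture instance is delivering audio, the "is the user speaking" check itself can be quite small. The following is only a minimal sketch: it assumes the samples have already been copied out of each audio frame into a float array normalized to [-1, 1], and the names VoiceActivity, noiseFloorRms, and ratio are hypothetical, not part of MRTK or MediaCapture.

```csharp
using System;

public static class VoiceActivity
{
    // Root-mean-square level of one buffer of samples in the range [-1, 1].
    public static double Rms(float[] samples)
    {
        if (samples == null || samples.Length == 0) return 0.0;
        double sumOfSquares = 0.0;
        foreach (float s in samples) sumOfSquares += (double)s * s;
        return Math.Sqrt(sumOfSquares / samples.Length);
    }

    // True when the buffer is louder than the measured room level by the given factor.
    // noiseFloorRms and ratio are hypothetical tuning values, to be calibrated per room.
    public static bool IsSpeaking(float[] samples, double noiseFloorRms, double ratio = 3.0)
    {
        return Rms(samples) > noiseFloorRms * ratio;
    }
}
```

The noise floor could be estimated by averaging Rms over a second or so of frames while the user is known to be silent.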

Related

How to stream audio to a specific output device?

I have been struggling to find a way to stream audio, from a file or the web, to a specific output device rather than just the default one. I tried mciSendString, and while its open command does accept a device ID or filename, I haven't found a way to make use of it. I'm not even sure whether this is what I am looking for, but considering it says "... or the filename of the device driver", I'm guessing it is. Correct me if I am wrong and this parameter isn't meant for specifying the output device.
If it is the correct approach, how do you enumerate the installed device drivers? I looked into the IMMDevice interface because it seemed like the filenames could be stored in the registry, but none of the output device registry keys had a driver-filename value, or at least I haven't found one.
So my question is: how would you go about streaming audio to a specific output device? It doesn't have to be done through mciSendString; that's just something I looked into because it's one of the most talked-about functions when it comes to playing audio.
Note: please do not recommend third-party libraries like NAudio. I am not asking for library recommendations; otherwise I would have already used one and never written this. I have just seen a lot of answers along the lines of "Use {LibName}, it has what you want".
In case the above is oddly or incorrectly worded in places, this is basically what the end goal should be:
Installed Output Devices:
- Output1
- Output2
- Output3
Method For Playing:
//will play x.mp3 through output device 1
PlayAudio(output: "Output1", mp3File: "x.mp3");
//will play x.mp3 through output device 2
PlayAudio(output: "Output2", mp3File: "x.mp3");
//will play x.mp3 through output device 3
PlayAudio(output: "Output3", mp3File: "x.mp3");
You seem to be looking for this API: mciSendCommand()
To set the WaveAudio device (soundcard) used by the Multimedia
Control, you must use the mciSendCommand API. The Multimedia Control
does not directly provide a method to let you set the device used for
playing or recording.
Call mciSendCommand() with MCI_SET and MCI_WAVE_SET_PARMS,
setting wOutput to the desired playback device's ID.
The IDDevice argument for mciSendCommand() can be obtained via
mciGetDeviceID("waveaudio")
It's not 100% clear what wOutput wants; it's probably the same device ID that you pass to waveOutGetDevCaps() (an index from 0 to waveOutGetNumDevs() - 1).
I am just a porter.
Please refer:
https://stackoverflow.com/a/13320137/11128312
https://stackoverflow.com/a/10968887/11128312
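To turn a name like "Output1" into a device ID, one option is to enumerate the waveOut devices through winmm.dll. This is a sketch under the assumption made above, namely that wOutput takes the same zero-based index that waveOutGetDevCaps() accepts; that has not been verified here. WaveOutDevices, ListNames, and FindId are hypothetical helper names.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

public static class WaveOutDevices
{
    [StructLayout(LayoutKind.Sequential, CharSet = CharSet.Auto)]
    private struct WAVEOUTCAPS
    {
        public ushort wMid;
        public ushort wPid;
        public uint vDriverVersion;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 32)]
        public string szPname; // product name, truncated to 31 characters
        public uint dwFormats;
        public ushort wChannels;
        public ushort wReserved1;
        public uint dwSupport;
    }

    [DllImport("winmm.dll")]
    private static extern uint waveOutGetNumDevs();

    [DllImport("winmm.dll", CharSet = CharSet.Auto)]
    private static extern uint waveOutGetDevCaps(UIntPtr uDeviceID, ref WAVEOUTCAPS caps, uint cbCaps);

    // Returns the device names; the position in the list is the waveOut device ID.
    // Returns an empty list on non-Windows platforms, where winmm.dll is unavailable.
    public static List<string> ListNames()
    {
        var names = new List<string>();
        if (!RuntimeInformation.IsOSPlatform(OSPlatform.Windows)) return names;

        uint count = waveOutGetNumDevs();
        for (uint id = 0; id < count; id++)
        {
            var caps = new WAVEOUTCAPS();
            uint result = waveOutGetDevCaps((UIntPtr)id, ref caps, (uint)Marshal.SizeOf(typeof(WAVEOUTCAPS)));
            names.Add(result == 0 ? caps.szPname : string.Empty); // 0 is MMSYSERR_NOERROR
        }
        return names;
    }

    // Finds the first device whose name starts with the given text, or -1 if none does.
    public static int FindId(IList<string> names, string wanted)
    {
        for (int i = 0; i < names.Count; i++)
        {
            if (names[i].StartsWith(wanted, StringComparison.OrdinalIgnoreCase)) return i;
        }
        return -1;
    }
}
```

The prefix match is deliberate: szPname is cut off at 31 characters, so an exact comparison against a full device name can fail.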

Datalogic Barcode/Weight Scanner

Could anyone please give me an idea of where to start coding in order to get data from an OPOS device (Datalogic Magellan), both weighing and barcode scanning, in C#? For example, which library and which functions should I be using? I am clueless, as I have already spent numerous hours searching for an answer online and have not even come close.
I don't know of any API that I can use to get the weight and barcode from the USB device into my C# program.
I am currently using a Datalogic scale. I tried the built-in Windows reader, but it didn't read any data from the device.
First off, I used the Microsoft.PointOfService library, which can connect directly to most OPOS-based machines. Make sure you have your Logical Device Name right! This is very, very important. It is NOT a name you will find in regedit; it MUST be defined manually by you inside the OPOS ADK program that you installed along with the OPOS machine.
Then you can pass in the name as usual in your C# program.
For example: you set USB_Scale as your logical device name inside OPOS program
in C#
this.myDevice = explore.GetDevice("Scale", "USB_Scale");
Note: Make sure you claim the device with a timeout of 1000; it might not work if you don't.
Also : this.myScale = ((Scale)explore.CreateInstance(myDevice)); <- this might help~
The rest is just straight forward.

Eject Memory card from Card Reader C#

I have a custom-developed USB card reader. I am using the following code to interact with and iterate over the device:
http://www.codeproject.com/KB/system/usbeject.aspx
The code above provides an 'eject' method using the following line:
Native.CM_Request_Device_Eject_NoUi(device.InstanceHandle, IntPtr.Zero, null, 0, 0);
However this 'eject' method unmounts the entire drive instead of simply ejecting the media card.
This is a problem because I want to 'eject' the media card and then put in a different one. However, when the whole reader is ejected, I have to unplug and replug the device for it to show back up.
In windows explorer when I right click 'eject' it operates as I am imagining, where it safely removes the memory card but not the card reader.
How can I go about implementing this different type of eject in c#?
I came here accidentally while searching for "CM_Request_Device_Eject_NoUi", and saw that the question was similar to a problem I'd recently solved by pulling together pieces of other solutions. Forgive the late answer.
Here's what worked for me (this also addresses some issues I've seen on other SO questions regarding AutoEjectVolume from the Microsoft sample not "doing everything" that the system does when you Safely Remove Hardware using the OS):
1. Start with the steps outlined in How to Eject Removable Media in Windows.
2. Replace the call to AutoEjectVolume with code that is, in effect, the body of the RemoveDrive method from How to Prepare a USB Drive for Safe Removal. Note that this latter work relies heavily on two other CodeProject articles, including the one you referenced in your question, ported to C#.
In step 2, I say "in effect" because, in practice, you use the same hVolume in both solutions, and it makes more sense to do all the checks in the CodeProject RemoveDrive method before calling LockVolume, DismountVolume, or PrepareRemovalOfVolume in the Microsoft solution, and then call CM_Request_Device_Eject_NoUi as shown in the CodeProject solution.
A short pseudo-code summary:
- Open the volume with CreateFile (CodeProject)
- Obtain drive's device instance handle and drive's parent's device instance handle (CodeProject)
- Exit before calling (in particular) DismountVolume if any of the steps above fail (CodeProject)
- Call LockVolume, DismountVolume, and PrepareRemovalOfVolume using the hVolume returned from CreateFile (Microsoft)
- You can close the hVolume at any time after this
- Call CM_Request_Device_Eject_NoUi on the drive's parent's device instance handle (CodeProject)

Can I use MediaPlayer to play sounds from a stream? Or, can I use an URI to access a filestream?

I'm using System.Windows.Media.MediaPlayer to play some sounds, and I would like to load these sounds from a ZIP file. It would be nice to be able to load these files as a stream directly from the ZIP file instead of having to unzip to a temp directory. However, MediaPlayer.Open only accepts a URI.
So, is there a way to create a URI that will reference the contents of a stream? Something like an in-memory localhost? Is there any connector from Stream to URI?
System.Media.SoundPlayer should do what you need (you can skip MediaPlayer and URIs altogether). It has a constructor that takes a stream.
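As a rough sketch of that approach, the helper below copies one entry out of a ZIP archive into a MemoryStream, which can then be handed to the SoundPlayer constructor. It assumes System.IO.Compression is available (.NET 4.5 or later) and that the entry is a WAV file; ZipSounds and OpenEntry are hypothetical names.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class ZipSounds
{
    // Copies a single archive entry into memory so the caller gets a seekable stream
    // that stays valid after the archive itself has been closed.
    public static MemoryStream OpenEntry(Stream zipStream, string entryName)
    {
        using (var archive = new ZipArchive(zipStream, ZipArchiveMode.Read, leaveOpen: true))
        {
            ZipArchiveEntry entry = archive.GetEntry(entryName);
            if (entry == null)
            {
                throw new FileNotFoundException("No such entry in archive.", entryName);
            }

            var buffer = new MemoryStream();
            using (Stream source = entry.Open())
            {
                source.CopyTo(buffer);
            }
            buffer.Position = 0; // rewind so the consumer reads from the start
            return buffer;
        }
    }
}
```

Usage would then look something like new System.Media.SoundPlayer(ZipSounds.OpenEntry(zipFileStream, "ping.wav")).Play(), where zipFileStream and "ping.wav" are placeholders.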
Since you need to support polyphony, one approach is to PInvoke into the waveOutXXXX Windows API in order to play the sounds. Here is a good example of how to do this in C#:
https://www.codeproject.com/KB/audio-video/cswavplay.aspx
This example also has code for reading info like duration, sample rate, bits per sample etc.
If you search, you may find claims that the waveOutXXXX API can only play one sound at a time. This was true in Windows 95/98, but is not true any longer.
By the way, SoundPlayer is probably the most frustrating .NET class. The .Play() method automatically stops playback of any other sound played by your process (it might be any other sound played by any .NET app) before starting, which is why you can't do polyphony with it. It would have taken Microsoft's worst intern less than a minute to add a .DontStopJustPlay() method or a StopFirst bool parameter to the .Play() method. It might have taken him until lunch time to add a Duration property.
Although waveOutXXXX is trickier than you would want from a well-designed modern API (and using it in .NET introduces additional problems), the one unmatched advantage it has is that it's come pre-installed on every single Windows computer since Windows 95, including Windows Mobile devices. Every other option (including MediaPlayer) means somebody will always have to install something.

How can I play compressed sound files in C# in a portable way?

Is there a portable, not patent-restricted way to play compressed sound files in C# / .NET? I want to play short "jingle" sounds on various events occurring in the program.
System.Media.SoundPlayer can handle only WAV, but those are typically too big to embed in a downloadable application. MP3 is protected by patents, so even if there were a fully managed decoder/player, it wouldn't be free to redistribute. The best format available would seem to be Ogg Vorbis, but I had no luck getting any C# Vorbis libraries to work (I managed to extract raw PCM with csvorbis, but I don't know how to play it afterwards).
I neither want to distribute any binaries with my application nor depend on P/Invoke, as the project should run at least on Windows and Linux. I'm fine with bundling .Net assemblies as long as they are license-compatible with GPL.
[this question is a follow up to a mailing list discussion on mono-dev mailing list a year ago]
I finally revisited this topic, and, using help from BrokenGlass on writing WAVE header, updated csvorbis. I've added an OggDecodeStream that can be passed to System.Media.SoundPlayer to simply play any (compatible) Ogg Vorbis stream. Example usage:
using (var file = new FileStream(oggFilename, FileMode.Open, FileAccess.Read))
{
    var player = new SoundPlayer(new OggDecodeStream(file));
    player.PlaySync();
}
'Compatible' in this case means 'it worked when I tried it out'. The decoder is fully managed, works fine on Microsoft .Net - at the moment, there seems to be a regression in Mono's SoundPlayer that causes distortion.
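For anyone stuck at the "I have raw PCM but can't play it" stage, the general idea of prepending a WAVE header can be sketched as follows. This is not the OggDecodeStream implementation, just an illustration of the standard 44-byte RIFF/WAVE header for uncompressed PCM; WavWriter and WrapPcm are hypothetical names.

```csharp
using System;
using System.IO;
using System.Text;

public static class WavWriter
{
    // Wraps raw little-endian PCM samples in a canonical 44-byte RIFF/WAVE header.
    public static MemoryStream WrapPcm(byte[] pcm, int sampleRate, short channels, short bitsPerSample)
    {
        short blockAlign = (short)(channels * bitsPerSample / 8); // bytes per sample frame
        int byteRate = sampleRate * blockAlign;                   // bytes per second

        var stream = new MemoryStream(44 + pcm.Length);
        using (var writer = new BinaryWriter(stream, Encoding.ASCII, true)) // true: leave stream open
        {
            writer.Write(Encoding.ASCII.GetBytes("RIFF"));
            writer.Write(36 + pcm.Length);                 // size of everything after this field
            writer.Write(Encoding.ASCII.GetBytes("WAVE"));
            writer.Write(Encoding.ASCII.GetBytes("fmt "));
            writer.Write(16);                              // size of the fmt chunk for PCM
            writer.Write((short)1);                        // audio format 1 = uncompressed PCM
            writer.Write(channels);
            writer.Write(sampleRate);
            writer.Write(byteRate);
            writer.Write(blockAlign);
            writer.Write(bitsPerSample);
            writer.Write(Encoding.ASCII.GetBytes("data"));
            writer.Write(pcm.Length);
            writer.Write(pcm);
        }
        stream.Position = 0; // rewind so a player can read from the header
        return stream;
    }
}
```

The returned stream can be passed to System.Media.SoundPlayer in the same way as the OggDecodeStream above.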
Outdated:
System.Diagnostics.Process.Start("fullPath.mp3");
I am surprised but the method Dinah mentioned actually works. However, I was thinking about playing short "jingle" sounds on various events occurring in the program, I don't want to launch user's media player each time I need to do a 'ping!' sound.
As for the code project link - this is unfortunately only a P/Invoke wrapper.
I neither want to distribute any
binaries with my application nor
depend on P/Invoke, as the project
should run at least on Windows and
Linux. I'm fine with bundling .Net
assemblies as long as they are
license-compatible with GPL.
Unfortunately, it's going to be impossible to avoid distributing binaries or to avoid P/Invoke. The .NET class libraries use P/Invoke underneath anyway; managed code has to communicate with the unmanaged operating system API at some point in order to do anything.
Converting the OGG file to PCM should be possible in managed code, but because there is no native support for audio in .NET, you really have three options:
1. Call an external program to play the sound (as suggested earlier)
2. P/Invoke a C module to play the sound
3. P/Invoke the OS APIs to play the sound.
(4.) If you're only running this code on Windows, you could probably just use DirectShow.
P/Invoke can be used in a cross platform way
http://www.mono-project.com/Interop_with_Native_Libraries#Library_Names
Once you have your PCM data (using an OGG C library or managed code, something like this http://www.robburke.net/mle/mp3sharp/, though of course there are licensing issues with MP3), you will need a way to play it. Unfortunately, .NET does not provide any direct access to your sound card or methods to play streaming audio. You could convert the OGG files to PCM at startup and then use System.Media.SoundPlayer to play the generated WAV files. The current method Microsoft suggests uses P/Invoke to access the sound-playing API in the OS: http://msdn.microsoft.com/en-us/library/ms229685.aspx
A cross-platform API for playing PCM sound is OpenAL, and you should be able to play (PCM) sound using the C# bindings for OpenAL at www.taoframework.com. Unfortunately, you will need to ship a number of DLL and .so files with your application for it to work when distributed, but this is, as I've explained earlier, unavoidable.
Calling something which is located in 'System.Diagnostics' to play a sound looks like a pretty bad idea to me. Here is what that function is meant for:
//
// Summary:
// Starts a process resource by specifying the name of a document or application
// file and associates the resource with a new System.Diagnostics.Process component.
//
// Parameters:
// fileName:
// The name of a document or application file to run in the process.
//
// Returns:
// A new System.Diagnostics.Process component that is associated with the process
// resource, or null, if no process resource is started (for example, if an
// existing process is reused).
//
// Exceptions:
// System.ComponentModel.Win32Exception:
// There was an error in opening the associated file.
//
// System.ObjectDisposedException:
// The process object has already been disposed.
//
// System.IO.FileNotFoundException:
// The PATH environment variable has a string containing quotes.
I think you should have a look at FMOD, which is the mother of all audio APIs.
please feel free to dream about http://www.fmod.org/index.php/download#FMODExProgrammersAPI
The XNA Audio APIs work well in .NET/C# applications, and work beautifully for this application. Event-based triggering, along with concurrent playback of multiple sounds. Exactly what you want. Oh, and compression as well.
Well, it depends on the patent-related laws in a given country, but as far as I know there is no way to write an MP3 decoder without violating patents. I think the best cross-platform, open source solution for your problem is GStreamer. It has C# bindings, which are evolving rapidly. Using and building GStreamer on Windows is not an easy task, however. Here is a good starting point. The Banshee project uses this approach, but it is not really usable on Windows yet (however, there are some almost-working nightly builds). FMOD is also a good alternative. Unfortunately, it is not open source, and I find its API somewhat C-styled.
There is a pure C# vorbis decoder available that is open source:
http://anonsvn.mono-project.com/viewvc/trunk/csvorbis/
Not sure if this is still relevant. Simplest solution would be to use NAudio, which is a managed open source audio API written in C#. Another thing to try would be utilizing ffmpeg, and creating a process to ffplay.exe (the right binaries are under shared builds).
There is no way for you to do this without using something else for your play handling.
Using the System.Diagnostic will launch an external software and I doubt you want that, right? You just want X sound file to play in the background when Y happens in your program, right?
Voted up because it looks like an interesting question. :D
