Sound clipping/clicking when lowering volume with NAudio - c#

Audio is clipping (or clicking) when I try to lower the volume of a WAV file in real time.
I've tried it on a SampleChannel, a VolumeSampleProvider and a WaveChannel32 instance, the source being a 32-bit WAV file.
If I adjust the volume on a WaveOut instance instead, the clipping doesn't occur anymore, but I don't want that because it lowers the volume of all sounds in the application.
And this only happens when I lower the volume; raising it doesn't cause clipping.
Is this a known problem, or am I supposed to approach this differently?
Note: this is how the volume drops in real time over the given time span:
0.9523049, 0.9246111, 0.9199954, 0.89384, 0.8676848, 0.8415294, 0.8169126, 0.7907572,
0.7646018, 0.739985, 0.7122912, 0.6892129, 0.6630576, 0.6369023, 0.6122856, 0.5861301,
0.5599748, 0.535358, 0.5092026, 0.4830474, 0.456892, 0.4322752, 0.4061199, 0.3799645,
0.3553477, 0.3276539, 0.3030371, 0.2784202, 0.2522649, 0.2261095, 0.2014928, 0.176876,
0.149182, 0.1245652, 0.09841, 0.07225461, 0.04763785, 0.02148246, 0

Apparently it is a DesiredLatency and NumberOfBuffers issue on the WaveOut instance: the default values cause the problem, and altered values fix it.
These are the values I used to fix the issue:
WaveOutDevice = new WaveOut(WaveCallbackInfo.NewWindow())
{
    DesiredLatency = 700,  // adjust the value to prevent sound clipping/clicking
    NumberOfBuffers = 3    // the number of buffers must be chosen carefully to prevent clipping/clicking
};
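For context, here is a minimal sketch of the kind of setup this fixes. It assumes NAudio's AudioFileReader and VolumeSampleProvider and a hypothetical test.wav; the loop at the end stands in for whatever timer drives the real fade:

using System.Threading;
using NAudio.Wave;
using NAudio.Wave.SampleProviders;

var reader = new AudioFileReader("test.wav");   // hypothetical 32-bit WAV
var volume = new VolumeSampleProvider(reader);

var waveOut = new WaveOut(WaveCallbackInfo.NewWindow())
{
    DesiredLatency = 700, // larger buffers smooth out rapid volume changes
    NumberOfBuffers = 3
};
waveOut.Init(new SampleToWaveProvider(volume));
waveOut.Play();

// crude stand-in for a timer-driven fade: step the volume down over ~2 s
for (float v = 1f; v >= 0f; v -= 0.025f)
{
    volume.Volume = v;
    Thread.Sleep(50);
}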

Related

IMFTransform::ProcessInput not returning samples before over 25 have been submitted

I'm a bit at a loss attempting to decode an H264 stream using Windows Media Foundation and IMFTransform.
I found that even with an H264 stream defining a max_num_ref_frames of 2, the IMFTransform requires over 25 frames of input before it outputs anything, and until then IMFTransform::ProcessOutput returns MF_E_TRANSFORM_NEED_MORE_INPUT each time.
If I drain the decoder, they all come out; but the problem with that is that it then won't be able to decode any new samples between the last one returned and the next keyframe.
With a max_num_ref_frames of 2, the decoder would only need to buffer a maximum of 2 frames. (For reference: max_num_ref_frames specifies the maximum number of short-term and long-term reference frames, complementary reference field pairs, and non-paired reference fields that may be used by the decoding process for inter prediction of any picture in the sequence.) And regardless, the maximum that could ever be required with H264 is 16 frames.
Setting the CODECAPI_AVLowLatencyMode property on the IMFTransform (https://msdn.microsoft.com/en-us/library/windows/desktop/hh447590%28v=vs.85%29.aspx) will help, but it is only available on Windows 8 and isn't a generic solution, as it can't handle B-frames properly.
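For illustration, enabling that attribute looks roughly like this (a sketch in C#, assuming the SharpDX.MediaFoundation wrapper and that its generic MediaAttributes.Set maps the value to a UINT32; the GUID is CODECAPI_AVLowLatencyMode from codecapi.h):

using System;
using SharpDX.MediaFoundation;

static class LowLatency
{
    // CODECAPI_AVLowLatencyMode from codecapi.h (Windows 8 and later only)
    static readonly Guid CodecApiAvLowLatencyMode =
        new Guid("9c27891a-ed7a-40e1-88e8-b22727a024ee");

    // Enable low-latency decoding on a decoder MFT before streaming starts.
    public static void Enable(Transform decoder)
    {
        decoder.Attributes.Set(CodecApiAvLowLatencyMode, 1);
    }
}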
Any suggestions?
TIA

libVLCNet player positioning (without showing the movie beginning) when opening new video

I'm looking at the libvlcnet 0.4.0.0 "SimplePlayer" example from http://sourceforge.net/p/libvlcnet/wiki/Home/ and I want to ask whether it is possible to open a new file and play it from a predefined position without playing the start of the movie first. I use something like this:
LibVlcInterop.libvlc_media_player_play(descriptor);
LibVlcInterop.libvlc_media_player_pause(descriptor);
LibVlcInterop.libvlc_media_player_set_position(descriptor, (float)0.8);
int res = LibVlcInterop.libvlc_media_player_play(descriptor);
When a new file starts playing, the user can notice a small fraction of the beginning of the movie.
How can I position the player at a particular point after loading a new file, without showing a small portion of the beginning of the movie?
I don't know about that particular library, but you can do this generally by passing "media options" before you play the media. To do this, use the LibVLC API function libvlc_media_add_option.
If you can do this in that library, then you can specify a start time and/or an end time - but it has to be specified in seconds rather than as a percentage position.
The options you would pass to the API function would be:
:start-time=30
:stop-time=60.5
You can actually pass fractional seconds, as shown in the stop-time example.
When I do this, I do not notice any flash of content showing the beginning of the movie, but I suppose that still might happen on some platforms or with some types of media.
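As a rough sketch, the flow would look like this. It assumes libvlcnet exposes the corresponding raw entry points (the names follow the LibVLC C API), a hypothetical instance handle, and the descriptor from the question:

// create the media, attach start/stop options, then hand it to the player
IntPtr media = LibVlcInterop.libvlc_media_new_path(instance, "movie.mp4");
LibVlcInterop.libvlc_media_add_option(media, ":start-time=30");
LibVlcInterop.libvlc_media_add_option(media, ":stop-time=60.5");
LibVlcInterop.libvlc_media_player_set_media(descriptor, media);
LibVlcInterop.libvlc_media_release(media);
int res = LibVlcInterop.libvlc_media_player_play(descriptor);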

Raise wave file volume with NAudio

I am an audio noob.
I am looking to embed audio in an HTML page by passing the data as a string, such as:
< Audio src="data:audio/wav;base64,AA....." />
Doing that works, but I need to raise the volume. I tried working with NAudio, but it seems to do some conversion after which the audio no longer plays. This is the code I use to raise the volume:
public string ConvertToString(Stream audioStream)
{
    audioStream.Seek(0, SeekOrigin.Begin);
    byte[] bytes = new byte[audioStream.Length];
    audioStream.Read(bytes, 0, (int)audioStream.Length);
    audioStream.Seek(0, SeekOrigin.Begin);
    return Convert.ToBase64String(bytes);
}

var fReader = new WaveFileReader(strm);
var chan32 = new WaveChannel32(fReader, 50.0F, 0F);
var outputString = "data:audio/wav;base64," + ConvertToString(chan32);
But when I put outputString into an audio tag, it fails to play. What kind of transformation does NAudio do, and how can I get it to give me the audio stream in such a way that I can serialize it and the browser will be able to play it?
Or, for another suggestion: if NAudio is too heavyweight for something as simple as raising the volume, what's a better option for me?
I'm no expert in embedding WAV files in web pages (and to be honest it doesn't seem like a good idea - WAV is one of the most inefficient ways of delivering sound to a web page), but I'd expect that the entire WAV file, including headers, needs to be encoded. You are just writing out the sample data. So with NAudio you'd need to use a WaveFileWriter writing into a MemoryStream or a temporary file to create a volume-adjusted WAV file that can be written to your page.
There are two additional problems with your code. One is that you have gone to 32-bit floating point, making the WAV file format even more inefficient (doubling the size of the original file). You need to use Wave32To16Stream to go back to 16 bit before creating the WAV file. The second is that you are multiplying each sample by 50. This will almost certainly horribly distort the signal. Clipping can very easily occur when amplifying a WAV file, and it depends on how much headroom there is in the original recording. Often dynamic range compression is a better option than simply increasing the volume.
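Putting that together, a minimal sketch of the suggested pipeline might look like the following. The 1.2x gain is purely illustrative (a gain of 50 would clip horribly), and PadWithZeroes is disabled so the copy terminates at the end of the source:

using System;
using System.IO;
using NAudio.Wave;

public string AmplifyToDataUri(Stream sourceWav)
{
    using (var reader = new WaveFileReader(sourceWav))
    using (var chan32 = new WaveChannel32(reader, 1.2F, 0F) { PadWithZeroes = false })
    using (var chan16 = new Wave32To16Stream(chan32))
    using (var ms = new MemoryStream())
    {
        // WaveFileWriter emits the RIFF headers the browser needs
        using (var writer = new WaveFileWriter(ms, chan16.WaveFormat))
        {
            chan16.CopyTo(writer);
        }
        // ToArray still works after the MemoryStream has been closed
        return "data:audio/wav;base64," + Convert.ToBase64String(ms.ToArray());
    }
}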

Parsing a MIDI file for Note On only

I've been trying to figure out the mystical realm of MIDI parsing, and I'm having no luck. All I'm trying to do is get the note value (60 = C4, 72 = C5, etc), in order of when they occur.
My code is as follows. All it does is very simply open a file as a byte array and read everything out as hex:
byte[] MIDI = File.ReadAllBytes("TestMIDI.mid");
foreach (var element in MIDI)
{
    string b = Convert.ToString(element, 16);
    Debug.WriteLine(b);
}
All TestMIDI.mid contains is one note on C5. Here's a hex dump of it. Using this info, I'm trying to find the simple hex value for Note On (0x9, or just 9 in the dump), but there aren't any. I can find a few 72's, but there are 3, which doesn't make any sense to me (note on, note off, then what?).
This is my first attempt at parsing MIDI as a file and using hex dumps (are they even called that?), so I'm sorry if I'm heading in the complete wrong direction. All I need is to get the note that plays, and in what order. I don't need timing or anything fancy at all. The reason behind this, if it matters - is to then generate new code in a different language to be played out of a speaker, very similar to the beep command on *nix. Because of this, I don't want to use any frameworks that 1) I didn't program, and really didn't learn anything and 2) do far more than what I need, making the framework heavier than the actual code by me.
The accepted answer is not a solution to the problem; it will not work in the general case. I'll provide several cases where this code either will not work or will fail, ordered by probability - the most probable cases go first.
False positives. MIDI files contain a lot of data structures where you can find a byte with the value 144, and these structures are not Note On events. For real MIDI files you'll get a bunch of "notes" that are not notes at all, just random values within the file.
Channels other than 0. Most modern MIDI files contain several track chunks, each holding events for a specific MIDI channel (from 0 to 15). 144 (90 in hex) represents a Note On event for channel 0 only, so you are going to miss Note On events on all the other channels.
Running status. MIDI files actively use the concept of running status. This technique allows the status bytes of consecutive events of the same type to be omitted, so the status byte 144 may be written only once, for the first Note On event, and you will not find it again further in the file.
144 is the last byte in the file. A MIDI file can end with this value, for example if a custom chunk is the last chunk in the file, or if a track chunk doesn't end with an End of Track event (which is corruption according to the MIDI file specification, but a possible scenario in the real world). In this case you'll get an IndexOutOfRangeException on MIDI[i+1].
Thus, you should never search for a specific byte value to find a semantic data structure in a MIDI file. You should use one of the .NET libraries available on the Internet. For example, with DryWetMIDI you can use this code:
using Melanchall.DryWetMidi.Core;         // MidiFile (DryWetMIDI 5.x namespaces)
using Melanchall.DryWetMidi.Interaction;  // GetNotes, Note

IEnumerable<Note> notes = MidiFile.Read(filePath)
                                  .GetNotes();
To do this right, you'll need at least some semblance of a MIDI parser. Searching for 0x9 status nibbles is a good start, but a Note On with a velocity field of 0 is effectively a Note Off, and the byte 0x9 can also be present inside other events (meta events, MPQN tempo events, delta times, etc.), so you'll get false positives. So you need something that actually knows the MIDI file format to do this accurately.
Look for a library, write your own, or port an open-source one. Mine is in Java if you want to look.
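If you do write your own, a minimal spec-aware scan could look like the sketch below. It is only an illustration (no validation, no handling of corrupt files): it walks each MTrk chunk, decodes variable-length delta times, honours running status, skips meta and sysex events by their declared length, and collects the key of every Note On (velocity > 0) on any channel:

using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

static class MidiNoteScanner
{
    // Returns the key numbers of all Note On events (velocity > 0),
    // in file order, across all channels.
    public static List<byte> ReadNoteOnKeys(string path)
    {
        byte[] data = File.ReadAllBytes(path);
        var notes = new List<byte>();
        int pos = 14; // skip the 14-byte "MThd" header chunk

        while (pos + 8 <= data.Length)
        {
            // each chunk: 4-byte id, 4-byte big-endian length, then payload
            string id = Encoding.ASCII.GetString(data, pos, 4);
            int len = (data[pos + 4] << 24) | (data[pos + 5] << 16)
                    | (data[pos + 6] << 8) | data[pos + 7];
            int p = pos + 8, end = p + len;
            pos = end; // jump to the next chunk regardless of its type
            if (id != "MTrk") continue;

            byte status = 0;
            while (p < end)
            {
                while ((data[p] & 0x80) != 0) p++; // variable-length delta time
                p++;                               // last delta-time byte

                if ((data[p] & 0x80) != 0) status = data[p++]; // else running status

                if (status == 0xFF) { p++; p += ReadVarLen(data, ref p); }               // meta: type, length, data
                else if (status == 0xF0 || status == 0xF7) p += ReadVarLen(data, ref p); // sysex: length, data
                else
                {
                    int type = status & 0xF0;
                    if (type == 0x90 && data[p + 1] > 0) notes.Add(data[p]); // Note On, any channel
                    p += (type == 0xC0 || type == 0xD0) ? 1 : 2;             // data byte count
                }
            }
        }
        return notes;
    }

    // Reads a variable-length quantity and advances p past it.
    static int ReadVarLen(byte[] d, ref int p)
    {
        int value = 0;
        while ((d[p] & 0x80) != 0) value = (value << 7) | (d[p++] & 0x7F);
        return (value << 7) | (d[p++] & 0x7F);
    }
}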

How to configure DirectSound's MaxSampleRate above 20000

I'm writing a small program to output generated sound.
My sound card is capable of a 48000 or even a 192000 sample rate. It's a Realtek ALC883 7.1+2 Channel High Definition Audio, and the specs can be found here.
However, DirectSound's MaxSecondarySampleRate reports a maximum value of only 20000?
I know I can get better than that from my sound card, but how do I configure DirectSound to take advantage of it? When I try the following:
DirectSound ds = new DirectSound(DirectSound.GetDevices().First().DriverGuid);
MessageBox.Show(ds.Capabilities
.MaxSecondarySampleRate
.ToString(CultureInfo.InvariantCulture));
The message box displays "20000".
It could be that your sound card is not the first device in the device list (for example, a video card with a TV output would also appear there). You should look at the DeviceInformation.Description property. Otherwise it may be a problem with the driver.
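To check, a quick sketch (assuming the same SharpDX.DirectSound wrapper as in the question) that prints every device's description next to its reported capabilities:

using System;
using System.Globalization;
using SharpDX.DirectSound;

foreach (DeviceInformation device in DirectSound.GetDevices())
{
    using (var ds = new DirectSound(device.DriverGuid))
    {
        Console.WriteLine("{0}: {1}",
            device.Description,
            ds.Capabilities.MaxSecondarySampleRate.ToString(CultureInfo.InvariantCulture));
    }
}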
