I have a byte array containing an MP3 stream.
Is it correct to assume that this stream would have to be further decoded if I want to be able to convert it to a WAV?
In its current byte state, is it possible to do basic functionality such as get/set position (time-wise)?
Yeah, MP3 files are very different from WAV files. WAV files contain raw audio data in the form of samples from beginning to end to paint the waveform of the output, the same way a bitmap file contains raw data about pixels from left to right, top to bottom. You can think of a WAV file as a bitmap picture of sound waves -- but rather than pixel colors, it stores audio intensities, typically 44,100 of them per second, for two channels if it's stereo, and 2 bytes per channel.
(Knowing this you can actually calculate the file size of a WAV file -- to store 1 minute of audio, you'd need 60 seconds * 44100 samples * 2 channels * 2 bytes = 10.09MB.)
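That arithmetic can be sketched directly (a minimal illustration; the ~44-byte RIFF header is ignored):

```csharp
// Size of the raw PCM data in a WAV file (excluding the RIFF header).
long PcmDataSize(int seconds, int sampleRate, int channels, int bytesPerSample) =>
    (long)seconds * sampleRate * channels * bytesPerSample;

long oneMinuteStereo = PcmDataSize(60, 44100, 2, 2);
Console.WriteLine($"{oneMinuteStereo} bytes = {oneMinuteStereo / (1024.0 * 1024.0):F2} MB");
// 10584000 bytes = 10.09 MB
```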
MP3 files contain a mathematically transformed version of this data, discarding audio that humans can't hear. It works similarly to how JPEG compresses images.
Just as video cards ultimately need bitmaps to work with, sound cards ultimately need WAV data to work with -- so yes, you need a decoder.
At the beginning of MP3 files is a block of data called an ID3 tag, which contains a bunch of basic information about the file -- artist names, track length, album names, stuff like that. You can use something like C# ID3 to read/write ID3 tags in C#.
As for the audio itself, I'm not sure there are MP3 decoders written entirely in C#. Technically there's no reason it can't be done (it should be fine performance-wise too), but the standard is pretty loose and the math is intense, so people tend to just use things like FFmpeg to decode. Some ideas in this Google search.
If you don't need to do any special processing and you just want to play the audio, you can use the WPF/Silverlight MediaElement.
You can probably get some hints out of Josh Smith's Podder app.
NAudio is an open source .NET library that can read MP3 files.
To convert MP3 to WAV, use code similar to the following:
Stream inputStream = ...;
Stream outputStream = ...;
using (WaveStream waveStream = WaveFormatConversionStream.CreatePcmStream(new Mp3FileReader(inputStream)))
using (WaveFileWriter waveFileWriter = new WaveFileWriter(outputStream, waveStream.WaveFormat))
{
    // Copy in small chunks: Stream.Read takes an int count and is not
    // guaranteed to return the whole stream in a single call.
    byte[] buffer = new byte[4096];
    int bytesRead;
    while ((bytesRead = waveStream.Read(buffer, 0, buffer.Length)) > 0)
    {
        waveFileWriter.Write(buffer, 0, bytesRead);
    }
}
As per @Rei Miyasaka's answer, there is an MP3 decoder written in C#. Open source, too. Check out Mp3Sharp.
You can use http://sourceforge.net/projects/mpg123net/ to decode your MP3 byte[] and then use the decoded PCM however you like.
Taking my first steps in audio programming with NAudio, I'm trying to write a simple app that grabs a WAV file and reads 20ms of audio data at a time until EOF. However, I'm getting a bit confused with the buffer arrays and probably the conversions.
Is there a simple way someone can post in here?
Moreover I got confused with the following:
When using AudioFileReader readertest = new AudioFileReader(fileName) I'm getting different metadata like bitrate of 32 and length of ~700000.
However, when using the NAudio - WaveFileReader file1 = new WaveFileReader(fileName) I'm getting half values for the same audio file (bitrate = 16, length = ~350000). Also the encoding for the first is "IEEEFloat" while the latter is "PCM". Any explanations...?
Thanks v much!
AudioFileReader is a wrapper around WaveFileReader (and supports several other file types), and it auto-converts to IEEE float for you. That's why you see doubled values: each 16-bit PCM sample becomes a 32-bit float, so the reported bit depth goes from 16 to 32 and the byte length doubles. If you want to read the audio directly into a byte array in whatever format is in the WAV file, then you should just use WaveFileReader.
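A sketch of the 20ms read loop the question asks for, assuming NAudio's WaveFileReader (the file name is a placeholder; 20ms is 1/50th of a second's worth of bytes, aligned down to a whole sample frame):

```csharp
using System;
using NAudio.Wave;

using (var reader = new WaveFileReader("input.wav")) // hypothetical file name
{
    // 20 ms of audio = AverageBytesPerSecond / 50, rounded down
    // to a multiple of BlockAlign so reads stay on frame boundaries.
    int bytesPer20ms = reader.WaveFormat.AverageBytesPerSecond / 50;
    bytesPer20ms -= bytesPer20ms % reader.WaveFormat.BlockAlign;

    var buffer = new byte[bytesPer20ms];
    int bytesRead;
    while ((bytesRead = reader.Read(buffer, 0, buffer.Length)) > 0)
    {
        // process bytesRead bytes of raw PCM here
    }
}
```

For 44.1kHz 16-bit stereo this gives 176400 / 50 = 3528 bytes per 20ms block.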
I am using Managed Media Aggregation in C# - https://net7mma.codeplex.com/.
I have an RTSP client that receives RTP frames encoded as H.264 (payload type 96).
I want to be able to save the frames into a video file, and also be able to tell when the video starts/ends.
I did some reading and saw that it's a problem to decode H.264 frames one by one, but I didn't really understand why.
Here is the handler that is invoked for each RTP frame I receive:
void Client_RtpFrameChanged(object sender, Media.Rtp.RtpFrame frame)
{
// Decode
}
Can someone explain why it's a problem to decode H.264 frames one by one?
Is there an open-source library/DLL for this?
Thanks a lot!
There is an included class in the RtspServer project.
The class is RFC6184Media; it contains methods for packetization and depacketization and handles all defined NAL unit types.
After you call Depacketize there is a Buffer which contains the Raw Byte Sequence Payload; you will have to prepend a start code consisting of 0x000001 before the data contained in the raw bitstream.
There are several examples in the Discussion area for the project.
After that you can feed the stream to a decoder, and only then can the frames be displayed, usually after conversion from YUV to RGB according to the chroma subsampling used when encoding.
I can see about adding a small demo for a few static packets which correspond to a frame and show how to achieve the desired result.
In the future if you make a discussion on the project page I will probably get to it much quicker.
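The start-code step described above can be sketched like this (the 0x000001 prefix is the H.264 Annex B byte-stream start code; the nalUnit argument stands in for whatever Depacketize produced):

```csharp
using System.IO;

// Prefix a depacketized NAL unit with an Annex B start code so a
// decoder can find unit boundaries in the raw byte stream.
static byte[] WithStartCode(byte[] nalUnit)
{
    using (var ms = new MemoryStream())
    {
        ms.Write(new byte[] { 0x00, 0x00, 0x01 }, 0, 3); // Annex B start code
        ms.Write(nalUnit, 0, nalUnit.Length);
        return ms.ToArray();
    }
}
```

Concatenating the prefixed NAL units in decode order gives a raw .h264 byte stream that tools such as FFmpeg can read.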
I am an audio noob
I am looking to embed audio in an html page by passing the data as a string such as
<audio src="data:audio/wav;base64,AA....." />
Doing that works, but I need to raise the volume. I tried working with NAudio, but it seems to do some conversion and the result will no longer play. This is the code I use to raise the volume:
public string ConvertToString(Stream audioStream)
{
audioStream.Seek(0,SeekOrigin.Begin);
byte[] bytes = new byte[audioStream.Length];
audioStream.Read(bytes,0,(int)audioStream.Length);
audioStream.Seek(0,SeekOrigin.Begin);
return Convert.ToBase64String(bytes);
}
var fReader = new WaveFileReader(strm);
var chan32 = new WaveChannel32(fReader,50.0F,0F);
var outputString = "data:audio/wav;base64," + ConvertToString(chan32);
but when I put outputString into an audio tag it fails to play. What transformation does NAudio do, and how can I get it to give me the audio stream in a form I can serialize so the browser will be able to play it?
Or, as an alternative suggestion: if NAudio is too heavyweight for something as simple as raising the volume, what's a better option for me?
I'm no expert in embedding WAV files in web pages (and to be honest it doesn't seem like a good idea - WAV is one of the most inefficient ways of delivering sound to a web page), but I'd expect that the entire WAV file, including headers, needs to be encoded. You are just writing out the sample data. So with NAudio you'd need to use a WaveFileWriter writing into a MemoryStream or a temporary file to create a volume-adjusted WAV file that can be written to your page.
There are two additional problems with your code. One is that you have gone to 32-bit floating point, making the WAV file even more inefficient (doubling the size of the original file). You need to use Wave32To16Stream to go back to 16 bit before creating the WAV file. The second is that you are multiplying each sample by 50. This will almost certainly horribly distort the signal. Clipping can very easily occur when amplifying a WAV file, and how much you can amplify depends on how much headroom there is in the original recording. Often dynamic range compression is a better option than simply increasing the volume.
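Putting those fixes together, a sketch of the full pipeline (assuming the classic NAudio 1.x WaveChannel32/Wave32To16Stream classes; "input.wav" and the 1.5 gain are placeholders, and a modest gain is used to limit clipping):

```csharp
using System;
using System.IO;
using NAudio.Wave;

using (var ms = new MemoryStream())
{
    using (var reader = new WaveFileReader("input.wav")) // hypothetical input
    {
        var volume = new WaveChannel32(reader) { Volume = 1.5f }; // gentle boost
        var back16 = new Wave32To16Stream(volume);                // back to 16-bit PCM
        using (var writer = new WaveFileWriter(ms, back16.WaveFormat))
        {
            var buffer = new byte[4096];
            int read;
            while ((read = back16.Read(buffer, 0, buffer.Length)) > 0)
                writer.Write(buffer, 0, read);
        } // disposing the writer finalizes the RIFF header sizes
    }
    // ToArray still works on a closed MemoryStream; the result now
    // includes the WAV headers, not just the raw samples.
    string dataUri = "data:audio/wav;base64," + Convert.ToBase64String(ms.ToArray());
}
```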
I have recorded audio to a CaptureBuffer, but I can't figure out how to save it into wav file. I have tried this (http://www.tek-tips.com/faqs.cfm?fid=4782), but it didn't work, or I didn't use it properly. Does anybody know how to solve this? Sample code would be very appreciated.
A WAV file is a RIFF file consisting of two main "chunks". The first is a format chunk, describing the format of the audio. This will be what sample rate you recorded at (e.g. 44.1kHz), what bit depth (e.g. 16 bits) and how many channels (mono or stereo). WAV also supports compressed formats but it is unlikely you recorded the audio compressed, so your record buffer will contain PCM audio.
Then there is the data chunk. This is the part of the WAV file that contains the actual audio data in your capture buffer. This must be in the format described in the format chunk.
As part of the NAudio project I created a WaveFileWriter class to simplify creating WAV files. You pass in a WaveFormat that describes the format of your captured audio. Then you can simply write the raw captured data in.
Here's some simple example code for how you might use WaveFileWriter:
WaveFormat format = new WaveFormat(16000, 16, 1); // mono 16-bit audio captured at 16kHz
using (var writer = new WaveFileWriter("out.wav", format))
{
    writer.Write(captureBuffer, 0, captureBuffer.Length); // captureBuffer: byte[] of raw PCM
}
I am trying to merge two MP3 files together, from a specific point in time in the first MP3 to a specific point in time in the second (in C#).
When I say a specific point in time, I mean: from the first MP3 file I want to copy everything from 10 seconds in until the end, and then merge that with the first 20 seconds of the second MP3. How do I do this?
Just to merge the two files together I am doing as follows:
using (var fs = File.OpenWrite("combined.mp3"))
{
var buffer = File.ReadAllBytes("test1.mp3");
fs.Write(buffer, 0, buffer.Length);
buffer = File.ReadAllBytes("test.mp3");
fs.Write(buffer, 0, buffer.Length);
fs.Flush();
}
The above code was found somewhere here on Stack Overflow. I know that I am not removing the header from the 2nd file, but this kind of works anyway. (If you can tell me how to remove the header, it would be appreciated.)
Is there a way to find out how many bytes each second is (or how many frames) in the mp3 files?
Due to the format of my MP3 files I cannot use NAudio. NAudio gives me an error when I try to play these MP3 files ("Not a recognised MP3 header").
1) It is desirable to learn at least the basics of the MP3 file structure and the legal MP3 file tags (ID3v1/v2 and possibly the Xing VBR header).
Look for example at the following links:
http://www.mpgedit.org/mpgedit/mpeg_format/mpeghdr.htm
http://www.codeproject.com/KB/audio-video/mpegaudioinfo.aspx
http://www.developerfusion.com/code/4684/read-mp3-tag-information-id3v1-and-id3v2/
http://www.id3.org/
2) To compute the exact duration of a continuous part of an MP3 file you will need to count the number of MP3 frames in that part. The duration of an MP3 frame is always 1152 samples for MPEG1 Layer 3, and 576 samples for other versions of MPEG.
3) To position to a specific point in a MP3 file you can use one of the following:
If the file has a small size you can simply count frames from the beginning of the MP3 data.
If you deal with a CBR file you can simply calculate offset of the desired position.
If you deal with a VBR file that has a VBR header, you can get a starting position from the header and count frames to the desired position.
"If you can tell me how to remove the header this will be appreciated" - you need to cut off ID3 tags and probably VBR header.
"Is there a way to find out how many bytes each second is (or how many frames) in the mp3 files?" - for CBR this can be easily calculated from bitrate.
Note, you should perform all operations on the frame boundaries.
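The CBR arithmetic from points 2) and 3) can be sketched as follows (the 128 kbps / 44.1kHz values are just an example; the frame-size formula is the standard one for MPEG1 Layer 3, with the padding bit adding one byte to padded frames):

```csharp
using System;

// MPEG1 Layer 3: each frame holds 1152 samples.
// Frame size in bytes = 144 * bitrate / sampleRate (+1 when the padding bit is set).
int bitrate = 128000;    // 128 kbps CBR
int sampleRate = 44100;

int bytesPerSecond = bitrate / 8;             // 16000 bytes per second
int frameSize = 144 * bitrate / sampleRate;   // 417 bytes (418 when padded)
double frameMs = 1152.0 * 1000 / sampleRate;  // ~26.12 ms per frame

// Approximate byte offset of time t (seconds) in a CBR file,
// counted from the end of the ID3 tag; then snap to a frame boundary.
long OffsetForTime(double t) => (long)(t * bytesPerSecond);

Console.WriteLine($"{bytesPerSecond} B/s, {frameSize} B/frame, {frameMs:F2} ms/frame");
Console.WriteLine(OffsetForTime(10.0)); // offset of the 10-second mark: 160000
```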