I am simply trying to take the first (or any) frame from an MP3 file and then decompress it.
internal void Read3(string location) //take in file location
{
Mp3FileReader mp3 = new Mp3FileReader(location); //make a Mp3FileReader
SECTIONA: //to jump back here when needed.
byte[] _passedBuffer = Decompress(mp3.ReadNextFrame()); //pass the frame in, get the decompressed byte array back.
int jump_size = mp3.WaveFormat.Channels * 2; //how many bytes to skip (2 bytes per channel)
for (int i = 0; i < _passedBuffer.Length; i += jump_size)
{
short FinalSample = BitConverter.ToInt16(_passedBuffer, i); //converting the bytes to Int16, nothing special here.
if (jump_size == 4) //stereo: average the two channel samples into one value
{
FinalSample = (short)((BitConverter.ToInt16(_passedBuffer, i + 2) + FinalSample) / 2);
}
Console.Write(FinalSample+"|"); //and writing it down to Console.
}
Console.WriteLine("Frames are written, continue to next frame?");
if (Convert.ToChar(Console.Read()) == 'y') //asking to go on or not.
{ goto SECTIONA; }
}
private byte[] Decompress(Mp3Frame fm)
{
var buffer = new byte[16384 * 4]; //big enough buffer size
WaveFormat wf = new Mp3WaveFormat(fm.SampleRate, fm.ChannelMode == ChannelMode.Mono ? 1 : 2, fm.FrameLength, fm.BitRate); //creating a new WaveFormat
IMp3FrameDecompressor decompressor = new AcmMp3FrameDecompressor(wf); //passing in to AcmMp3FrameDecompressor.
decompressor.DecompressFrame(fm, buffer, 0); //running the DecompressFrame method and then passing back the buffer.
return buffer;
}
The Mp3FileReader is reading the frame correctly; I checked the frame's RawData. Now I am trying to decompress that frame and then convert its PCM data to Int16 in that one for loop, but every Int16 FinalSample value is returning 0.
I know that just using Mp3FileReader.Read(Buffer, Offset, Length) will get the job done, but it does so for all the frames, so:
How do I do it for just one frame?
What is wrong with my code that makes it return all zeros?
I know that RawData is OK, so something must be wrong with the Decompress method. How do I set up a decompressor for an MP3 file?
You can use the AcmMp3FrameDecompressor or DmoMp3FrameDecompressor to decompress individual MP3 frames.
You need to check the return value of Decompress to see how much data was returned. It's possible for the first frame not to return anything.
You should also create a single frame decompressor and reuse it for all frames in the MP3 file, as it retains state between calls; a sketch follows.
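As a minimal sketch of that advice (reusing the question's NAudio types and buffer size; the method name and console prompt are just illustrative):

internal void ReadFrames(string location)
{
    using (var mp3 = new Mp3FileReader(location))
    {
        Mp3Frame frame = mp3.ReadNextFrame();
        if (frame == null) return;
        //one decompressor for the whole file; it keeps state between frames
        var format = new Mp3WaveFormat(frame.SampleRate,
            frame.ChannelMode == ChannelMode.Mono ? 1 : 2,
            frame.FrameLength, frame.BitRate);
        using (IMp3FrameDecompressor decompressor = new AcmMp3FrameDecompressor(format))
        {
            var buffer = new byte[16384 * 4];
            while (frame != null)
            {
                //only the first 'decompressed' bytes of buffer are valid PCM;
                //the first frame may legitimately yield 0 bytes
                int decompressed = decompressor.DecompressFrame(frame, buffer, 0);
                for (int i = 0; i < decompressed; i += 2)
                {
                    Console.Write(BitConverter.ToInt16(buffer, i) + "|");
                }
                Console.WriteLine("Frame written, continue to next frame?");
                if (Convert.ToChar(Console.Read()) != 'y') break;
                frame = mp3.ReadNextFrame();
            }
        }
    }
}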
I'm trying to play Opus audio files from the web, which I buffer with a MemoryStream. I'm aware that NAudio can take URLs directly; however, I need to set cookies and a user agent before I access the file, so that option is out.
My latest approach was to buffer about 30 seconds of the stream, feed it to StreamMediaFoundationReader, and write to the same MemoryStream when needed; however, NAudio ends up playing the initial buffered segment and stopping after that is completed. What would be the correct approach for this?
Here's my current code if needed. (I have no idea how audio streaming works, so please go easy on me.)
bufstr.setReadStream(httpreq.GetResponse().GetResponseStream()); //bufstr is a custom class which creates a memorystream I can write to.
StreamMediaFoundationReader streamread = new StreamMediaFoundationReader(bufstr.getStream());
bufstr.readToStream(); //get the initial 30~ seconds of content
waveOut.Init(streamread);
waveOut.Play();
int seconds = 0;
while(waveOut.PlaybackState == PlaybackState.Playing) {
Thread.Sleep(1000);
seconds++;
if (seconds % 30 > 15) bufstr.readToStream();
}
bufstr's readToStream method
public void readToStream() {
int prevbufcount = totalbuffered; //I keep track of how many bytes I fetched from remote url.
while (stream.CanRead && prevbufcount + (30 * (this.bitrate / 8)) > totalbuffered && totalbuffered != contentlength) { //read around 30 seconds of content
Console.Write($"Caching {prevbufcount + (30 * (this.bitrate / 8))}/{totalbuffered}\r");
byte[] buf = new byte[1024];
bufferedcount = stream.Read(buf, 0, 1024);
totalbuffered += bufferedcount;
memorystream.Write(buf, 0, bufferedcount);
}
}
While debugging I found that the content length I get from the server does not match the actual size of the stream, so I ended up calculating the content length from other details the server sends.
The issue might also be a race condition, given that playback stopped after I started manually keeping track of where I write in the memory stream.
I am writing an application that needs the raw waveform data of an audio file so I can render it in an application (C#/.NET). I am using ffmpeg to offload this task, but it looks like ffmpeg can only output the waveform data as a PNG or as a stream to gnuplot.
I have looked at other libraries for this (NAudio/CSCore); however, they require Windows/Microsoft Media Foundation, and since this app is going to be deployed to Azure as a web app I cannot use them.
My strategy was to just read the waveform data from the PNG itself, but this seems hacky and over the top. The ideal output would be a fixed-size array of peaks, where each value in the array is the peak value (ranging from 1-100 or something, like this for example).
Hello buddy,
I wrote about the manual way to get the waveform, but then, to show you an example, I found this code which does what you want (or at the least, you can learn something from it).
1) Use FFmpeg to get array of samples
Try the example code shown here : http://blog.wudilabs.org/entry/c3d357ed/?lang=en-US
Experiment with it, and try tweaking it with suggestions from the manual, etc. In the shown code, just change string path to point to your own file path. Edit the proc.StartInfo.Arguments section so that the last part looks like:
proc.StartInfo.Arguments = "-i \"" + path + "\" -vn -ac 1 -filter:a aresample=myNum -map 0:a -c:a pcm_s16le -f data -";
That myNum in the aresample=myNum part is calculated as:
44100 * totalSeconds = X
myNum = X / waveformWidth
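For example (illustrative numbers only): for a 60-second clip rendered 500 pixels wide, X = 44100 * 60 = 2,646,000 samples, so myNum = 2,646,000 / 500 = 5292.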
Finally use the ProcessBuffer function with this logic :
static void ProcessBuffer(byte[] buffer, int length)
{
float val; //amplitude value of a sample
int index = 0; //position within sample bytes
int slicePos = 0; //horizontal (X-axis) position for pixels of next slice
while (index < length)
{
val = BitConverter.ToInt16(buffer, index);
index += sizeof(short);
// use the number in val to do something...
// eg: Draw a line on canvas for part of waveform's pixels
// eg: myBitmap.SetPixel(slicePos, val, Color.Green);
slicePos++;
}
}
If you want to do it manually without FFmpeg, you could try...
2) Decode audio to PCM
You could just load the audio file (MP3) into your app and first decode it to PCM (i.e., raw digital audio). Then read just the PCM numbers to make the waveform. Don't read numbers directly from the bytes of a compressed format like MP3.
These PCM data values (about audio amplitudes) go into a byte array. If your sound is 16-bit then you extract the PCM value by reading each sample as a short (ie: getting value of two consecutive bytes at once since 16 bits == 2 bytes length).
Basically when you have 16-bit audio PCM inside a byte array, every two bytes represents an audio sample's amplitude value. This value becomes your height (loudness) at each slice. A slice is a 1-pixel vertical line from a time in the waveform.
Now, sample rate means how many samples per second, usually 44100 (44.1 kHz). You can see that using 44 thousand pixels to represent one second of sound is not feasible, so divide the total sample count by the required waveform width. Take the result, multiply it by 2 (to cover two bytes), and that is how far you jump between sampled amplitudes as you form the waveform. Do this in a while loop, as in the sketch below.
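As a rough C# sketch of that jump-&-sample loop (assuming 16-bit mono PCM has already been decoded into pcmBytes; all names here are illustrative):

static float[] BuildPeaks(byte[] pcmBytes, int waveformWidth)
{
    int totalSamples = pcmBytes.Length / 2;                   //2 bytes per 16-bit sample
    int jump = Math.Max(1, totalSamples / waveformWidth) * 2; //byte stride per slice
    var peaks = new float[waveformWidth];
    for (int slice = 0; slice < waveformWidth; slice++)
    {
        int index = slice * jump;
        if (index + 1 >= pcmBytes.Length) break;
        short amplitude = BitConverter.ToInt16(pcmBytes, index);
        //normalize to 0..100 so each slice is a peak height percentage
        peaks[slice] = Math.Abs(amplitude) * 100f / short.MaxValue;
    }
    return peaks;
}

Each peaks[slice] is then the height of one 1-pixel vertical column of the waveform.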
You can use the function described in this tutorial to get the raw data decoded from an audio file as an array of double values.
Summarizing from the link:
The function decode_audio_file takes 4 parameters:
path: the path of the file to decode
sample_rate: the desired sample rate for the output data
data: a pointer to a pointer to double precision values, where the extracted data will be stored
size: a pointer to the length of the final extracted values array (number of samples)
It returns 0 upon success and -1 in case of failure, along with an error message written to the stderr stream.
The function code is below:
#include <stdio.h>
#include <stdlib.h>
#include <libavutil/opt.h>
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#include <libswresample/swresample.h>
int decode_audio_file(const char* path, const int sample_rate, double** data, int* size) {
// initialize all muxers, demuxers and protocols for libavformat
// (does nothing if called twice during the course of one program execution)
av_register_all();
// get format from audio file
AVFormatContext* format = avformat_alloc_context();
if (avformat_open_input(&format, path, NULL, NULL) != 0) {
fprintf(stderr, "Could not open file '%s'\n", path);
return -1;
}
if (avformat_find_stream_info(format, NULL) < 0) {
fprintf(stderr, "Could not retrieve stream info from file '%s'\n", path);
return -1;
}
// Find the index of the first audio stream
int stream_index = -1;
for (int i=0; i<format->nb_streams; i++) {
if (format->streams[i]->codec->codec_type == AVMEDIA_TYPE_AUDIO) {
stream_index = i;
break;
}
}
if (stream_index == -1) {
fprintf(stderr, "Could not retrieve audio stream from file '%s'\n", path);
return -1;
}
AVStream* stream = format->streams[stream_index];
// find & open codec
AVCodecContext* codec = stream->codec;
if (avcodec_open2(codec, avcodec_find_decoder(codec->codec_id), NULL) < 0) {
fprintf(stderr, "Failed to open decoder for stream #%u in file '%s'\n", stream_index, path);
return -1;
}
// prepare resampler
struct SwrContext* swr = swr_alloc();
av_opt_set_int(swr, "in_channel_count", codec->channels, 0);
av_opt_set_int(swr, "out_channel_count", 1, 0);
av_opt_set_int(swr, "in_channel_layout", codec->channel_layout, 0);
av_opt_set_int(swr, "out_channel_layout", AV_CH_LAYOUT_MONO, 0);
av_opt_set_int(swr, "in_sample_rate", codec->sample_rate, 0);
av_opt_set_int(swr, "out_sample_rate", sample_rate, 0);
av_opt_set_sample_fmt(swr, "in_sample_fmt", codec->sample_fmt, 0);
av_opt_set_sample_fmt(swr, "out_sample_fmt", AV_SAMPLE_FMT_DBL, 0);
swr_init(swr);
if (!swr_is_initialized(swr)) {
fprintf(stderr, "Resampler has not been properly initialized\n");
return -1;
}
// prepare to read data
AVPacket packet;
av_init_packet(&packet);
AVFrame* frame = av_frame_alloc();
if (!frame) {
fprintf(stderr, "Error allocating the frame\n");
return -1;
}
// iterate through frames
*data = NULL;
*size = 0;
while (av_read_frame(format, &packet) >= 0) {
// decode one frame
int gotFrame;
if (avcodec_decode_audio4(codec, frame, &gotFrame, &packet) < 0) {
break;
}
if (!gotFrame) {
continue;
}
// resample frames
double* buffer;
av_samples_alloc((uint8_t**) &buffer, NULL, 1, frame->nb_samples, AV_SAMPLE_FMT_DBL, 0);
int frame_count = swr_convert(swr, (uint8_t**) &buffer, frame->nb_samples, (const uint8_t**) frame->data, frame->nb_samples);
// append resampled frames to data
*data = (double*) realloc(*data, (*size + frame->nb_samples) * sizeof(double));
memcpy(*data + *size, buffer, frame_count * sizeof(double));
*size += frame_count;
}
// clean up
av_frame_free(&frame);
swr_free(&swr);
avcodec_close(codec);
avformat_free_context(format);
// success
return 0;
}
You will need the following flags to compile a program that uses it: -lavcodec-ffmpeg -lavformat-ffmpeg -lavutil -lswresample
Depending on your system and installation, it could also be: -lavcodec -lavformat -lavutil -lswresample
and its usage is below:
int main(int argc, char const *argv[]) {
// check parameters
if (argc < 2) {
fprintf(stderr, "Please provide the path to an audio file as first command-line argument.\n");
return -1;
}
// decode data
int sample_rate = 44100;
double* data;
int size;
if (decode_audio_file(argv[1], sample_rate, &data, &size) != 0) {
return -1;
}
// sum data
double sum = 0.0;
for (int i=0; i<size; ++i) {
sum += data[i];
}
// display result and exit cleanly
printf("sum is %f", sum);
free(data);
return 0;
}
I am using C# in Universal Windows App to write a Watson Speech-to-text service.
For now, instead of using the Watson service, I write to a file and then read it in Audacity to confirm it is in the right format, since the Watson service wasn't returning correct responses to me, and the following explains why.
For some reason, when I create 16-bit PCM encoding properties and read the buffer, I am only able to read the data as 32-bit PCM, and that works well; but if I read it as 16-bit PCM it is in slow motion, and all the speech is basically corrupted.
I don't really know what exactly needs to be done to convert from 32-bit to 16-bit, but here's what I have in my C# application:
//Creating PCM Encoding properties
var pcmEncoding = AudioEncodingProperties.CreatePcm(16000, 1, 16);
var result = await AudioGraph.CreateAsync(
new AudioGraphSettings(AudioRenderCategory.Speech)
{
DesiredRenderDeviceAudioProcessing = AudioProcessing.Raw,
AudioRenderCategory = AudioRenderCategory.Speech,
EncodingProperties = pcmEncoding
}
);
graph = result.Graph;
//Initialize microphone
var microphone = await DeviceInformation.CreateFromIdAsync(MediaDevice.GetDefaultAudioCaptureId(AudioDeviceRole.Default));
var micInputResult = await graph.CreateDeviceInputNodeAsync(MediaCategory.Speech, pcmEncoding, microphone);
//Create frame output node
frameOutputNode = graph.CreateFrameOutputNode(pcmEncoding);
//Callback function to fire when buffer is filled with data
graph.QuantumProcessed += (s, a) => ProcessFrameOutput(frameOutputNode.GetFrame());
frameOutputNode.Start();
//Make the microphone write into the frame node
micInputResult.DeviceInputNode.AddOutgoingConnection(frameOutputNode);
micInputResult.DeviceInputNode.Start();
graph.Start();
The initialization step is done at this stage. Now, actually reading from the buffer and writing to the file only works if I use 32-bit PCM encoding with the following function (commented out is the 16-bit PCM code that results in slow-motion speech output):
private void ProcessFrameOutput(AudioFrame frame)
{
//Making a copy of the audio frame buffer
var audioBuffer = frame.LockBuffer(AudioBufferAccessMode.Read);
var buffer = Windows.Storage.Streams.Buffer.CreateCopyFromMemoryBuffer(audioBuffer);
buffer.Length = audioBuffer.Length;
using (var dataReader = DataReader.FromBuffer(buffer))
{
dataReader.ByteOrder = ByteOrder.LittleEndian;
byte[] byteData = new byte[buffer.Length];
int pos = 0;
while (dataReader.UnconsumedBufferLength > 0)
{
/*Reading Float -> Int 32*/
/*With this code I can import raw wav file into the Audacity
using Signed 32-bit PCM Encoding, and it is working well*/
var singleTmp = dataReader.ReadSingle();
var int32Tmp = (Int32)(singleTmp * Int32.MaxValue);
byte[] chunkBytes = BitConverter.GetBytes(int32Tmp);
byteData[pos++] = chunkBytes[0];
byteData[pos++] = chunkBytes[1];
byteData[pos++] = chunkBytes[2];
byteData[pos++] = chunkBytes[3];
/*Reading Float -> Int 16 (Slow Motion)*/
/*With this code I can import raw wav file into the Audacity
using Signed 16-bit PCM Encoding, but when I play it, it's in
a slow motion*/
//var singleTmp = dataReader.ReadSingle();
//var int16Tmp = (Int16)(singleTmp * Int16.MaxValue);
//byte[] chunkBytes = BitConverter.GetBytes(int16Tmp);
//byteData[pos++] = chunkBytes[0];
//byteData[pos++] = chunkBytes[1];
}
WriteBytesToFile(byteData);
}
}
Can anyone think of a reason why this is happening? Is it because Int32 PCM is larger in size, so when I use Int16 the data gets stretched and the sound becomes longer? Or am I not sampling it properly?
Note: I tried reading bytes directly from the buffer and then using that as raw data, but it's not encoded as PCM that way.
Reading Int16/Int32 from the buffer directly also doesn't work.
In the above example I am only using a frame output node. If I create a file output node that automatically writes to a raw file, it works really well as 16-bit PCM, so something is wrong in my callback function that causes the slow motion.
Thanks
//Creating PCM Encoding properties
var pcmEncoding = AudioEncodingProperties.CreatePcm(16000, 1, 16);
var result = await AudioGraph.CreateAsync(
new AudioGraphSettings(AudioRenderCategory.Speech)
{
DesiredRenderDeviceAudioProcessing = AudioProcessing.Raw,
AudioRenderCategory = AudioRenderCategory.Speech,
EncodingProperties = pcmEncoding
}
);
graph = result.Graph;
pcmEncoding does not make much sense here, since only float encoding is supported by AudioGraph.
byte[] byteData = new byte[buffer.Length];
It should be buffer.Length / 2, since you convert from float data with 4 bytes per sample to Int16 data with 2 bytes per sample.
/*Reading Float -> Int 16 (Slow Motion)*/
/*With this code I can import raw wav file into the Audacity
using Signed 16-bit PCM Encoding, but when I play it, it's in
a slow motion*/
var singleTmp = dataReader.ReadSingle();
var int16Tmp = (Int16)(singleTmp * Int16.MaxValue);
byte[] chunkBytes = BitConverter.GetBytes(int16Tmp);
byteData[pos++] = chunkBytes[0];
byteData[pos++] = chunkBytes[1];
This is correct code and it should work. Your "slow motion" is most likely caused by the buffer size you incorrectly set before: only the first half of byteData gets filled, so the zero-padded second half doubles the length of the written file. A corrected sketch follows.
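Putting that together, a sketch of the corrected loop (identical to the question's DataReader code except for the halved output array):

byte[] byteData = new byte[buffer.Length / 2]; //4 float bytes in -> 2 Int16 bytes out
int pos = 0;
while (dataReader.UnconsumedBufferLength > 0)
{
    var singleTmp = dataReader.ReadSingle();
    var int16Tmp = (Int16)(singleTmp * Int16.MaxValue);
    byte[] chunkBytes = BitConverter.GetBytes(int16Tmp);
    byteData[pos++] = chunkBytes[0];
    byteData[pos++] = chunkBytes[1];
}
WriteBytesToFile(byteData);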
I must admit Microsoft needs someone to review their bloated APIs
Edit: Solution is at bottom of post
I am trying my luck with reading binary files. Since I don't want to rely on byte[] AllBytes = File.ReadAllBytes(myPath), because the binary file might be rather big, I want to read small portions of the same size (which fits nicely with the file format to read) in a loop, using what I would call a "buffer".
public void ReadStream(MemoryStream ContentStream)
{
byte[] buffer = new byte[sizePerHour];
for (int hours = 0; hours < NumberHours; hours++)
{
int t = ContentStream.Read(buffer, 0, sizePerHour);
SecondsToAdd = BitConverter.ToUInt32(buffer, 0);
// further processing of my byte[] buffer
}
}
My stream contains all the bytes I want, which is a good thing. But when I enter the loop, several things cease to work.
My int t is 0, although I would presume that ContentStream.Read() would copy data from the stream into my byte array, but that isn't the case.
I tried buffer = ContentStream.GetBuffer(), but that results in my buffer containing the whole stream, a behaviour I wanted to avoid by reading into a buffer.
Also, resetting the stream to position 0 before reading did not help, nor did specifying an offset for my Stream.Read(), which means I am lost.
Can anyone point me to reading small portions of a stream to a byte[]? Maybe with some code?
Thanks in advance
Edit:
Pointing me in the right direction was the answer that .Read() returns 0 if the end of the stream is reached. I modified my code to the following:
public void ReadStream(MemoryStream ContentStream)
{
byte[] buffer = new byte[sizePerHour];
ContentStream.Seek(0, SeekOrigin.Begin); //Added this line
for (int hours = 0; hours < NumberHours; hours++)
{
int t = ContentStream.Read(buffer, 0, sizePerHour);
SecondsToAdd = BitConverter.ToUInt32(buffer, 0);
// further processing of my byte[] buffer
}
}
And everything works like a charm. I had initially been resetting the stream to its origin on every iteration over hours and giving an offset. Moving the "seek to beginning" part outside my loop and leaving the offset at 0 did the trick.
Read returns zero if the end of the stream is reached. Are you sure that your memory stream has the content you expect? I've tried the following and it works as expected:
// Create the source of the memory stream.
UInt32[] source = {42, 4711};
List<byte> sourceBuffer = new List<byte>();
Array.ForEach(source, v => sourceBuffer.AddRange(BitConverter.GetBytes(v)));
// Read the stream.
using (MemoryStream contentStream = new MemoryStream(sourceBuffer.ToArray()))
{
byte[] buffer = new byte[sizeof (UInt32)];
int t;
do
{
t = contentStream.Read(buffer, 0, buffer.Length);
if (t > 0)
{
UInt32 value = BitConverter.ToUInt32(buffer, 0);
}
} while (t > 0);
}
I created an app which records the user's voice from the system mic and saves the file to the hard disk using a dialog box, and that is working fine!
But my requirement is that I want to save the stream to a server without a dialog box.
This is what I have tried so far:
private SaveFileDialog saveFileDialog = new SaveFileDialog()
{
Filter = "Audio files(*.wav)|*.wav"
};
protected void SaveFile()
{
if (saveFileDialog.ShowDialog() == false)
{
return;
}
StatusText = "Saving...";
Stream stream = saveFileDialog.OpenFile();
WavManager.SavePcmToWav(_sink.BackingStream, stream, _sink.CurrentFormat);
stream.Close();
MessageBox.Show("Your record is saved.");
GoToStartState();
}
public class WavManager
{
public static void SavePcmToWav(Stream rawData, Stream output, AudioFormat audioFormat)
{
if (audioFormat.WaveFormat != WaveFormatType.Pcm)
throw new ArgumentException("Only PCM coding is supported.");
BinaryWriter bwOutput = new BinaryWriter(output);
// Write down the WAV header.
// Refer to http://technology.niagarac.on.ca/courses/ctec1631/WavFileFormat.html
// for details on the format.
// Note that we use ToCharArray() when writing fixed strings
// to force using the char[] overload because
// Write(string) writes the string prefixed by its length.
// -- RIFF chunk
bwOutput.Write("RIFF".ToCharArray());
// Total Length Of Package To Follow
// Computed as data length plus the header length without the data
// we have written so far and this data (44 - 4 ("RIFF") - 4 (this data))
bwOutput.Write((uint)(rawData.Length + 36));
bwOutput.Write("WAVE".ToCharArray());
// -- FORMAT chunk
bwOutput.Write("fmt ".ToCharArray());
// Length Of FORMAT Chunk (Binary, always 0x10)
bwOutput.Write((uint)0x10);
// Always 0x01
bwOutput.Write((ushort)0x01);
// Channel Numbers (Always 0x01=Mono, 0x02=Stereo)
bwOutput.Write((ushort)audioFormat.Channels);
// Sample Rate (Binary, in Hz)
bwOutput.Write((uint)audioFormat.SamplesPerSecond);
// Bytes Per Second
bwOutput.Write((uint)(audioFormat.BitsPerSample * audioFormat.SamplesPerSecond * audioFormat.Channels / 8));
// Bytes Per Sample: 1=8 bit Mono, 2=8 bit Stereo or 16 bit Mono, 4=16 bit Stereo
bwOutput.Write((ushort)(audioFormat.BitsPerSample * audioFormat.Channels / 8));
// Bits Per Sample
bwOutput.Write((ushort)audioFormat.BitsPerSample);
// -- DATA chunk
bwOutput.Write("data".ToCharArray());
// Length Of Data To Follow
bwOutput.Write((uint)rawData.Length);
// Raw PCM data follows...
// Reset position in rawData and remember its origin position
// to restore at the end.
long originalRawDataStreamPosition = rawData.Position;
rawData.Seek(0, SeekOrigin.Begin);
// Append all data from rawData stream into output stream.
byte[] buffer = new byte[4096];
int read; // number of bytes read in one iteration
while ((read = rawData.Read(buffer, 0, 4096)) > 0)
{
bwOutput.Write(buffer, 0, read);
}
rawData.Seek(originalRawDataStreamPosition, SeekOrigin.Begin);
}
}
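Since SavePcmToWav only needs a writable Stream, one way to skip the dialog is to write the WAV into a MemoryStream and upload those bytes to the server. A minimal sketch, assuming the same _sink fields as the question; the actual upload mechanism is left out:

protected byte[] SaveToMemory()
{
    using (var stream = new MemoryStream())
    {
        WavManager.SavePcmToWav(_sink.BackingStream, stream, _sink.CurrentFormat);
        return stream.ToArray(); //complete WAV bytes, ready to send to the server
    }
}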