What is the structure of a NAudio capture? - c#

I am trying to send PCM audio over UDP via WLAN. The audio I am sending is the data I get from NAudio's WasapiLoopbackCapture. When the DataAvailable event fires, I send the whole buffer in one message:
var capture = new WasapiLoopbackCapture();
capture.DataAvailable += RecorderOnDataAvailable;
static void RecorderOnDataAvailable(object sender, WaveInEventArgs waveInEventArgs)
{
    // Only the first BytesRecorded bytes of Buffer are valid for this callback
    byte[] msg = waveInEventArgs.Buffer;
    udpClient.Send(msg, waveInEventArgs.BytesRecorded);
    Console.WriteLine(waveInEventArgs.BytesRecorded);
    Console.WriteLine(ByteArrayToString(waveInEventArgs.Buffer));
}
Now, a Python script on the other end has to receive and store the bytes so that it can later convert them into signed integers (I need to visualize the audio). To do that, I need to know the structure of the data as it is written to the UDP packets.
I tried looking at the hex of the bytes:
008059380080f1380080dc3800c007390040193900c03039004039390080633900404c390090873900e0583900a09b39
00206a390010b13900e07d390030c43900c0803900a0cf3900c06f3900b0d23900a05d3900a0d0390040573900c0c939
00e04f3900c0b53900e0443900608d390020443900a02f3900c04f3900408e3800e05a390000ccb700405939000007b9
00805139001082b9006052390010bcb900a05a390040eab900205f39001808ba0020593900f018ba00604d39001825ba
00e04039004829ba00202f3900c026ba00000e39007821ba0000b23800481cba0000e43700c817ba0000d8b7005814ba
00809bb8000811ba00c002b900700dba00203bb900a00bba008064b900000cba000070b900200eba00e063b9003012ba
006053b9008817ba00804bb900601dba00c03bb9006821ba004013b900b021ba0080b1b800701fba00009bb700a01cba
0080723800b016ba00801539004807ba00406e390030dab900f09c3900e09ab900f0b53900c02fb90010c3390000f9b7...
Still, I can't make sense of it, because it's basically my first time interacting with bytes at this level.
So, what is the structure of the captured audio, and does it depend on the system I am on? How do I go about converting the bytes to integers? Thanks in advance.

WASAPI captures audio as IEEE floating-point 32-bit samples (and the loopback capture will be stereo). It would make sense to convert those to 16-bit integers before sending.
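A minimal sketch of that conversion inside the question's DataAvailable handler (udpClient and the handler name are taken from the question; the float samples are assumed to be in the usual -1.0 to 1.0 range):
static void RecorderOnDataAvailable(object sender, WaveInEventArgs e)
{
    // 4 bytes per 32-bit float sample in, 2 bytes per 16-bit sample out
    byte[] pcm16 = new byte[e.BytesRecorded / 2];
    int outIndex = 0;
    for (int i = 0; i < e.BytesRecorded; i += 4)
    {
        float sample = BitConverter.ToSingle(e.Buffer, i);
        // Clamp and scale to the signed 16-bit range
        if (sample > 1.0f) sample = 1.0f;
        if (sample < -1.0f) sample = -1.0f;
        short sample16 = (short)(sample * short.MaxValue);
        // Written low byte first, i.e. little-endian
        pcm16[outIndex++] = (byte)(sample16 & 0xFF);
        pcm16[outIndex++] = (byte)((sample16 >> 8) & 0xFF);
    }
    udpClient.Send(pcm16, pcm16.Length);
}
On the Python side, each pair of received bytes is then one little-endian signed 16-bit integer.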

Related

How to send Bytes completely via WebSocketSharp

I am making an audio chat program, so I tried sending audio bytes via WebSocket.
First I tried getting the audio bytes and sending them directly, but that failed (maybe they can't get through completely).
Second, I converted the bytes to a string with BitConverter and then converted that string back to a byte array with the Encoding.UTF8.GetBytes method.
This is my code:
var pcmAudio = stream.ToByteArray();
var audio = Encoding.UTF8.GetBytes(BitConverter.ToString(pcmAudio));
If I send that 'audio', it works: I can convert it back to a byte array and play the audio.
But if I send pcmAudio directly, there is an error.
Stream ms = new MemoryStream(Encoding.UTF8.GetBytes(data));
Above is my receiving code. data is a string; there is no way to receive it as a Byte type,
so I had to convert data to bytes.
Unfortunately it doesn't work.
The error message is 'Wave header is corrupted'.
I want to send the byte array completely.
Your question 1: Why do you want to send bytes? You already know how to send the audio with BitConverter.
My answer 1: the converted data is larger than the unconverted byte array.
Thank you.

NAudio - beginners questions - running on 20ms buffer of audio file

Taking my first steps in audio programming with NAudio, I'm trying to write a simple app that takes a WAV file and reads 20 ms of audio data at a time until EOF. However, I'm getting a bit confused by the buffer arrays and probably the conversions.
Is there a simple example someone can post here?
Moreover, I got confused by the following:
When using AudioFileReader readertest = new AudioFileReader(fileName) I get metadata such as 32 bits per sample and a length of ~700000.
However, when using NAudio's WaveFileReader file1 = new WaveFileReader(fileName) I get half those values for the same audio file (16 bits per sample, length ~350000). Also, the encoding for the first is "IEEEFloat" while for the latter it is "PCM". Any explanations...?
Thanks very much!
AudioFileReader is a wrapper around WaveFileReader (and supports several other file types as well), and it automatically converts the audio to IEEE float for you. If you want to read the audio directly into a byte array in whatever format it is stored in the WAV file, just use WaveFileReader.
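For the 20 ms part, here is a minimal sketch assuming you stay with AudioFileReader (so the data is IEEE float after its conversion): WaveFormat.AverageBytesPerSecond / 50 gives the number of bytes in 20 ms, and rounding down to BlockAlign keeps whole frames.
// using NAudio.Wave;
using (var reader = new AudioFileReader(fileName))
{
    // Bytes in 20 ms at the reader's (post-conversion, IEEE float) format
    int bytesPer20ms = reader.WaveFormat.AverageBytesPerSecond / 50;
    // Round down to whole frames (one sample per channel)
    bytesPer20ms -= bytesPer20ms % reader.WaveFormat.BlockAlign;

    byte[] buffer = new byte[bytesPer20ms];
    int bytesRead;
    while ((bytesRead = reader.Read(buffer, 0, buffer.Length)) > 0)
    {
        // Process the 20 ms block here; the last block may be shorter (bytesRead < buffer.Length)
    }
}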

Socket and ports setup for high-speed audio/video streaming

I have a one-on-one connection between a server and a client. The server is streaming real-time audio/video data.
My question may sound weird, but should I use multiple ports/sockets or only one? Is it faster to use multiple ports, or does a single one offer better performance? Should I have one port just for messages, one for video and one for audio, or is it simpler to package the whole thing through a single port?
One of my current problems is that I need to first send the size of the current frame, as the size in bytes may change from one frame to the next. I'm fairly new to networking, but I haven't found any mechanism that automatically detects the correct range of bytes for a specific object being transmitted. For example, if I send a 2934-byte-long packet, do I really need to tell the receiver the size of that packet?
I first tried to package the frames as fast as they were coming in, but I found the receiving end would sometimes not get the appropriate number of bytes. Most of the time it would read faster than I send, getting only a partial frame. What's the best way to get exactly the right number of bytes as quickly as possible?
Or am I looking at this at too low a level, and is there a higher-level class/framework for handling object transmission?
I think it is better to use a single connection with a simple object mechanism and send the data in an interleaved fashion. This can work faster than a multiple-port mechanism.
e.g.:
class Data {
    DataType,   // audio/video
    Size,       // size of the data buffer
    DataBuffer  // contents depend on the type
}
'DataType' and 'Size' are always of constant size. On the client side, read 'DataType' and 'Size' first, then read the specified number of bytes of the corresponding data (audio/video).
Just making something up off the top of my head. Shove "packets" like this down the wire:
1 byte - packet type (audio or video)
2 bytes - data length
(whatever else you need)
|
| (raw data)
|
So whenever you get one of these packets on the other end, you know exactly how much data to read, and where the beginning of the next packet should start.
[430 byte audio L packet]
[430 byte audio R packet]
[1000 byte video packet]
[20 byte control packet]
[2000 byte video packet]
...
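A minimal sketch of reading one such packet on the receiving end over TCP (the 1-byte type + 2-byte length header above is assumed; ReadExactly is a made-up helper, needed because a single Stream.Read may return fewer bytes than requested):
// using System.IO;
// Read exactly 'count' bytes; a single Stream.Read call may return fewer
static byte[] ReadExactly(Stream stream, int count)
{
    byte[] buffer = new byte[count];
    int offset = 0;
    while (offset < count)
    {
        int read = stream.Read(buffer, offset, count - offset);
        if (read == 0) throw new EndOfStreamException("Connection closed mid-packet");
        offset += read;
    }
    return buffer;
}

static void ReceiveOnePacket(Stream stream)
{
    // Header: 1 byte packet type + 2 bytes big-endian payload length
    byte[] header = ReadExactly(stream, 3);
    byte packetType = header[0];
    int length = (header[1] << 8) | header[2];
    byte[] payload = ReadExactly(stream, length);
    // Dispatch on packetType (audio / video / control) here
}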
But why re-invent the wheel? There are protocols to do these things already.

Socket Programming - Sending/Receiving hex and strings

I have the following code in C#:
Console.WriteLine("Connecting to server...");
TcpClient client = new TcpClient("127.0.0.1", 25565);
client.Client.Send(BitConverter.GetBytes(0x02));
client.Client.Send(BitConverter.GetBytes(0x0005));
client.Client.Send(Encoding.UTF8.GetBytes("wedtm"));
Console.Write("{0:x2}", client.GetStream().ReadByte());
For the life of me, I can't figure out how to transpose this to ruby. Any help here?
This is what I have so far, but it's not working as expected:
require 'socket'
s = TCPSocket.open("127.0.0.1", 25565)
s.write(0x02)
s.write(0x0005)
s.write("wedtm".bytes)
response = s.recvfrom(2)
puts "Response Size #{response.size}: #{response.to_s}"
The response should be 0x02
EDIT:
I'm assuming I have to use String#unpack on this, however, I can't figure out how to get "wedtm" to output to the appropriate \x000\x000\x000\x000 format.
There are at least two things to consider here:
Network byte order is big-endian. This means that you should always think in single bytes or arrays of bytes, as bytes are not subject to being shuffled around while larger types are.
C#'s BitConverter.GetBytes(short) returns 2 bytes in little-endian format and GetBytes(int) returns 4 bytes in little-endian format. Note that the literals 0x02 and 0x0005 in your C# code are both ints, so each of those GetBytes calls produces 4 bytes.
Without knowing much Ruby or its string format, I'd guess you need something like this for the first part (writing the binary string itself; String#bytes gives you an Array, and write would not send its raw bytes):
s.write("\x02\x00\x00\x00")
s.write("\x05\x00\x00\x00")
The rest should be okay, although I'd write the string directly (s.write("wedtm")) rather than s.write("wedtm".bytes), since write will call to_s on the Array instead of sending the raw bytes.
Wireshark is an invaluable tool when debugging network code and/or reverse-engineering network protocols: record the traffic of the C# app and compare it with the traffic from yours.
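One quick way to see exactly what the C# side puts on the wire, since both 0x02 and 0x0005 are Int32 literals and therefore yield four little-endian bytes each:
// Prints 02-00-00-00 and 05-00-00-00 on a little-endian machine
Console.WriteLine(BitConverter.ToString(BitConverter.GetBytes(0x02)));
Console.WriteLine(BitConverter.ToString(BitConverter.GetBytes(0x0005)));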

Why is the calculated checksum not matching the BCC sent over the serial port?

I've got a little application written in C# that listens on a SerialPort for information to come in. The information comes in as: STX + data + ETX + BCC. We then calculate the BCC of the transmission packet and compare. The function is:
private bool ConsistencyCheck(byte[] buffer)
{
    byte expected = buffer[buffer.Length - 1];
    byte actual = 0x00;
    for (int i = 1; i < buffer.Length - 1; i++)
    {
        actual ^= buffer[i];
    }
    if ((expected & 0xFF) != (actual & 0xFF))
    {
        if (AppTools.Logger.IsDebugEnabled)
        {
            AppTools.Logger.Warn(String.Format("ConsistencyCheck failed: Expected: #{0} Got: #{1}", expected, actual));
        }
    }
    return (expected & 0xFF) == (actual & 0xFF);
}
And it seems to work, more or less. It is correctly excluding the STX and the BCC and correctly including the ETX in its calculation. It works a very large percentage of the time; however, we have at least two machines we are running this on, both Windows 2008 64-bit, on which the BCC calculation NEVER adds up. Pulling from a recent log, in one case the byte 20 was sent and I calculated 16, and in another 11 was sent and I calculated 27.
I'm absolutely stumped as to what is going on here. Is there perhaps a 64-bit or Windows 2008 "gotcha" I'm missing? Any help or even wild ideas would be appreciated.
EDIT:
Here's the code that reads the data in:
private void port_DataReceived(object sender, System.IO.Ports.SerialDataReceivedEventArgs e)
{
    // Retrieve the number of bytes in the buffer
    int bytes = serialPort.BytesToRead;
    // Create a byte array to hold the waiting data
    byte[] received = new byte[bytes];
    // Read the data and store it
    serialPort.Read(received, 0, bytes);
    DataReceived(received);
}
And the DataReceived() function takes that data and appends it to a global StringBuilder object. It then stays in the StringBuilder until it is passed to the various functions, at which point .ToString() is called on it.
EDIT2: Changed the code to reflect my altered routines that operate on bytes/byte arrays rather than strings.
EDIT3: I still haven't figured this out yet, and I've gotten more test data that has completely inconsistent results (the amount I'm off of the send checksum varies each time with no pattern). It feels like I'm just calculating the checksum wrong, but I don't know how.
The buffer is defined as a String, while I suspect the data you are transmitting is raw bytes. I would recommend working with byte arrays throughout (even if you are sending ASCII/UTF/whatever-encoded text). Then, after the checksum has been validated, convert the data to a string.
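As a sketch of what that could look like with the frame layout from the question (STX + data + ETX + BCC, accumulating raw bytes and only running the checksum once a complete frame has arrived; this assumes ETX cannot occur inside the data, otherwise you need escaping or a length field):
// using System.Collections.Generic; using System.IO.Ports;
private const byte STX = 0x02;
private const byte ETX = 0x03;
private readonly List<byte> receiveBuffer = new List<byte>();

private void port_DataReceived(object sender, SerialDataReceivedEventArgs e)
{
    int bytes = serialPort.BytesToRead;
    byte[] received = new byte[bytes];
    serialPort.Read(received, 0, bytes);
    receiveBuffer.AddRange(received);

    // A complete frame is STX + data + ETX + BCC, so we need at least one byte after the ETX
    int etxIndex = receiveBuffer.IndexOf(ETX);
    if (receiveBuffer.Count > 0 && receiveBuffer[0] == STX &&
        etxIndex > 0 && receiveBuffer.Count > etxIndex + 1)
    {
        byte[] frame = receiveBuffer.GetRange(0, etxIndex + 2).ToArray();
        receiveBuffer.RemoveRange(0, etxIndex + 2);
        if (ConsistencyCheck(frame))
        {
            // Hand the validated frame on for processing
        }
    }
}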
Computing a BCC is not standardised; it is "customer defined". We program interfaces for our customers and have found many different algorithms: plain sum, XOR, masking, leaving out the STX, the ETX, or both, or leaving out all of the known bytes. For example, one package structure is "STX, machine code, command code, data, ..., data, ETX, BCC", and the (customer-specified!) BCC is the binary sum of all bytes from the command code to the last data byte, inclusive, masked (bitwise AND) with the key 0xCD. In other words, only the bytes that can actually change from frame to frame are included: it makes no sense to include STX, ETX or the machine code, because if those bytes don't match, the frame is discarded anyway (their values are checked as they arrive, to be sure the frame starts and ends correctly and is addressed to the receiving machine). Restricting the BCC to the variable bytes also saves time, which matters when the other end is a slow 4- or 8-bit microcontroller. And note that this particular example sums the bytes rather than XOR-ing them; another customer may want something else entirely. This kind of scheme is frequently used in closed systems (connecting a serial keyboard to an ATM, for example) for protection reasons, on top of encryption and other measures. So you really have to check (read: "crack") how your two machines compute their non-standard BCCs.
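As an illustration of that particular example (a sum rather than an XOR, masked with a key), a sketch might look like this; the byte positions and the 0xCD mask come from the example above and are in no way standard:
// Example only: binary sum of everything from the command code to the last data
// byte (skipping STX, machine code, ETX and the BCC itself), masked with the
// customer-specified key 0xCD
static byte ComputeBcc(byte[] frame)
{
    int sum = 0;
    for (int i = 2; i < frame.Length - 2; i++)   // frame[0] = STX, frame[1] = machine code
    {
        sum += frame[i];
    }
    return (byte)(sum & 0xCD);
}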
Make sure you have the port set to accept null bytes somewhere in your port setup code. (This may be the default value, I'm not sure.)
port.DiscardNull = false;
Also, check the type of event arriving at the serial port, and accept only data:
private void port_DataReceived(object sender, SerialDataReceivedEventArgs e)
{
    if (e.EventType == SerialData.Chars)
    {
        // Your existing code
    }
}