I need to convert a WAV file to 8000 Hz, 16-bit, mono WAV. I already have code that works well with the NAudio library, but I want to use a MemoryStream instead of a temporary file.
using System.IO;
using NAudio.Wave;
static void Main()
{
var input = File.ReadAllBytes("C:/input.wav");
var output = ConvertWavTo8000Hz16BitMonoWav(input);
File.WriteAllBytes("C:/output.wav", output);
}
public static byte[] ConvertWavTo8000Hz16BitMonoWav(byte[] inArray)
{
using (var mem = new MemoryStream(inArray))
using (var reader = new WaveFileReader(mem))
using (var converter = WaveFormatConversionStream.CreatePcmStream(reader))
using (var upsampler = new WaveFormatConversionStream(new WaveFormat(8000, 16, 1), converter))
{
// todo: without saving to file using MemoryStream or similar
WaveFileWriter.CreateWaveFile("C:/tmp_pcm_8000_16_mono.wav", upsampler);
return File.ReadAllBytes("C:/tmp_pcm_8000_16_mono.wav");
}
}
Not sure if this is the optimal way, but it works...
public static byte[] ConvertWavTo8000Hz16BitMonoWav(byte[] inArray)
{
    using (var mem = new MemoryStream(inArray))
    using (var reader = new WaveFileReader(mem))
    using (var converter = WaveFormatConversionStream.CreatePcmStream(reader))
    using (var upsampler = new WaveFormatConversionStream(new WaveFormat(8000, 16, 1), converter))
    {
        byte[] data;
        using (var m = new MemoryStream())
        {
            upsampler.CopyTo(m);
            data = m.ToArray();
        }
        using (var m = new MemoryStream())
        {
            // WaveFileWriter writes a proper WAV header that begins with "RIFF"
            var w = new WaveFileWriter(m, upsampler.WaveFormat);
            // append the WAV data body
            w.Write(data, 0, data.Length);
            // flush so the chunk sizes in the header are filled in before we copy the stream
            // (recent NAudio versions update the header on Flush)
            w.Flush();
            return m.ToArray();
        }
    }
}
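If your NAudio version has the static helper WaveFileWriter.WriteWavFileToStream (it exists in recent releases), the separate header-plus-body steps can be collapsed into one call. A sketch under that assumption:
public static byte[] ConvertWavTo8000Hz16BitMonoWav(byte[] inArray)
{
    using (var mem = new MemoryStream(inArray))
    using (var reader = new WaveFileReader(mem))
    using (var converter = WaveFormatConversionStream.CreatePcmStream(reader))
    using (var upsampler = new WaveFormatConversionStream(new WaveFormat(8000, 16, 1), converter))
    using (var outStream = new MemoryStream())
    {
        // writes the RIFF header and copies the converted samples in one go
        WaveFileWriter.WriteWavFileToStream(outStream, upsampler);
        return outStream.ToArray();
    }
}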
One thing worth adding (sorry, I can't comment yet due to lack of points): NAudio ALWAYS writes 46-byte headers, which in certain situations can cause crashes. I want to add this in case someone encounters it while searching for a clue as to why NAudio WAV files suddenly start crashing certain programs.
I ran into this problem after figuring out how to convert and/or resample WAV with NAudio, was stuck on it for two days, and only figured it out with a hex editor.
(The two extra bytes are located at bytes 37 and 38, right before the data subchunk [d,a,t,a,size<4bytes>].)
For comparison: a WAV header saved by NAudio is 46 bytes, while one saved by Audacity is 44 bytes.
You can verify this by looking at the NAudio source in WaveFormat.cs at line 310, where instead of 16 bytes for the fmt chunk, 18+extra are reserved (+extra because some WAV files contain even bigger headers than 46 bytes), yet NAudio always seems to write 46-byte headers and never 44 (the MS standard). It may also be noted that NAudio is in fact able to read 44-byte headers (line 210 in WaveFormat.cs).
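If you want to know which flavour of header a given file ended up with, the fmt chunk size field tells you; a small sketch (the file name is just an example, offsets follow the canonical RIFF layout):
// the 4 bytes at offset 16 hold the size of the fmt chunk:
// 16 -> classic 44-byte header, 18 (or more) -> NAudio's extended header
var header = new byte[20];
using (var fs = File.OpenRead("C:/output.wav"))
    fs.Read(header, 0, header.Length);
Console.WriteLine("fmt chunk size: " + BitConverter.ToInt32(header, 16));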
I'm using ZipArchive and I'm writing an oracle that determines the size of a zip file based on the zip specification. For simplicity, no compression is being used.
private long ZipSizeOracle(int numOfFiles, int totalLengthOfFilenames, int totalSizeOfFiles)
{
return
numOfFiles * (
30 //Local file header
+
12 //Data descriptor
+
46 //Central directory file header
)
+
2 * totalLengthOfFilenames //Local file header name + Central directory file header name
+
totalSizeOfFiles //Data size
+ 22 //End of central directory record (EOCD)
;
}
Currently I have 4 tests. ZeroFiles correctly outputs 22 bytes, which is the appropriate size for an empty zip.
[TestMethod]
public void ZeroFiles()
{
using (var memStream = new MemoryStream())
{
using (var archive = new ZipArchive(memStream, ZipArchiveMode.Create, true)) { }
Assert.AreEqual(ZipSizeOracle(0, 0, 0), memStream.Length);
}
}
One4ByteFile expects 130 bytes, but the actual size was 125 bytes:
[TestMethod]
public void One4ByteFile()
{
using (var memStream = new MemoryStream())
{
using (var archive = new ZipArchive(memStream, ZipArchiveMode.Create, true))
{
var entry1 = archive.CreateEntry("test.txt", CompressionLevel.NoCompression);
using (var writer = new StreamWriter(entry1.Open()))
writer.WriteLine("test");
}
Assert.AreEqual(ZipSizeOracle(1, 8, 4), memStream.Length);
}
}
Two4ByteFiles expects 241 bytes, but the actual size was 231 bytes:
[TestMethod]
public void Two4ByteFiles()
{
using (var memStream = new MemoryStream())
{
using (var archive = new ZipArchive(memStream, ZipArchiveMode.Create, true))
{
var entry1 = archive.CreateEntry("test.txt", CompressionLevel.NoCompression);
using (var writer = new StreamWriter(entry1.Open()))
writer.WriteLine("test");
var entry2 = archive.CreateEntry("test2.txt", CompressionLevel.NoCompression);
using (var writer = new StreamWriter(entry2.Open()))
writer.WriteLine("test2");
}
Assert.AreEqual(ZipSizeOracle(2, 17, 9), memStream.Length);
}
}
OneFolder expects 118 bytes, but the actual size was 108 bytes:
[TestMethod]
public void OneFolder()
{
using (var memStream = new MemoryStream())
{
using (var archive = new ZipArchive(memStream, ZipArchiveMode.Create, true))
archive.CreateEntry(@"test\", CompressionLevel.NoCompression);
Assert.AreEqual(ZipSizeOracle(1, 4, 0), memStream.Length);
}
}
What am I missing from the specification in order for the oracle to give me the correct file size?
You are missing the following:
The data descriptor block is optional and is included only if the zip file is written in a "streamed" manner (that is, you don't know the size of a file beforehand and write "on the fly"). When you are streaming, the size of the compressed and uncompressed data, as well as the CRC, are not available when the file header is written (because the file header comes before the data), so all those bytes in the file header are set to 0 and a data descriptor block is included after the compressed data, once this information is available. In the examples you provided, the data descriptor is not included.
CompressionLevel.NoCompression in CreateEntry does not mean the data is stored literally. Instead, the data is processed with the deflate algorithm (compression method 8 in the specification you linked) without actual compression. This deflate algorithm adds its own overhead, even in "no compression" mode:
1 byte defines whether this is the last block and the block type (stored, in this case)
2 bytes define the block size
2 bytes define the one's complement of the block size (for integrity)
then comes the data itself, with the size given above
So for each block of data in the input (a stored block holds at most 65,535 bytes), 5 bytes of overhead are added. In your examples all files are far smaller than that, so just 5 bytes are added for each.
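For instance, a stored block holding the six bytes "test\r\n" begins with the five header bytes 01 06 00 F9 FF: the final-block flag plus the "stored" block type, then the length 0x0006 and its one's complement 0xFFF9, both little-endian.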
You use writer.WriteLine, so the size of the data you write is not 4 bytes in the first example but 6, because \r\n (newline characters) are appended (and in the second example the total is 13, not 9).
If you take all this into account (drop the 12-byte data descriptor, add 5 bytes of deflate overhead per small file, and pass the correct totalSizeOfFiles), your examples will produce the expected output, as sketched below.
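For example, a corrected oracle for your small, seekable-stream tests could look like this (a sketch under the assumptions above: no data descriptor, one stored deflate block per file):
private long ZipSizeOracle(int numOfFiles, int totalLengthOfFilenames, int totalSizeOfFiles)
{
    return
        numOfFiles * (
            30 // local file header
            +
            5 // deflate "stored" block overhead (one block per small file)
            +
            46 // central directory file header
        )
        +
        2 * totalLengthOfFilenames // local file header name + central directory file header name
        +
        totalSizeOfFiles // data size, including the \r\n added by WriteLine
        + 22 // end of central directory record (EOCD)
        ;
}
With these changes, ZipSizeOracle(1, 8, 6) gives 125 and ZipSizeOracle(2, 17, 13) gives 231, matching the observed sizes; the empty-folder test is a separate case that this sketch does not cover.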
Update about the data descriptor record. The specification says:
This descriptor SHOULD be used only when it was not possible to
seek in the output .ZIP file, e.g., when the output .ZIP file
was standard output or a non-seekable device
And the ZipArchive class follows this. If you pass an unseekable stream to the constructor, it will emit data descriptor records. For example:
public class UnseekableStream : MemoryStream {
public override bool CanSeek => false;
}
using (var memStream = new UnseekableStream()) {
using (var archive = new ZipArchive(memStream, ZipArchiveMode.Create, true)) {
}
}
Such unseekable streams often occur in practice; an HTTP response stream is one example. But note that 12 bytes is not the only allowed size for a data descriptor record:
4.3.9.3 Although not originally assigned a signature, the value
0x08074b50 has commonly been adopted as a signature value
for the data descriptor record. Implementers should be
aware that ZIP files may be encountered with or without this
signature marking data descriptors and SHOULD account for
either case when reading ZIP files to ensure compatibility.
4.3.9.4 When writing ZIP files, implementors SHOULD include the
signature value marking the data descriptor record. When
the signature is used, the fields currently defined for
the data descriptor record will immediately follow the
signature.
So, the data descriptor may optionally start with a 4-byte signature, implementors are encouraged to include that signature when writing, and ZipArchive follows this recommendation; the data descriptor record it emits is therefore 16 bytes (12 + 4 for the signature), not 12.
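(Under the same assumptions, if you did write to an unseekable stream, the per-file constant in the oracle sketched earlier would become 30 + 5 + 16 + 46.)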
There is a post here, Compress and decompress string in C#, about compressing strings in C#.
I've implemented the same code myself, but the returned text is almost twice as long as my input :O
I've tried it on a JSON string of length 87, like this:
{"G":"82f88ff5-4143-46ef-86cc-a19910f4a6b5","U":"df39e3c7-ffd3-4829-a9cd-27bfcbd4403a"}
The result is 168
H4sIAAAAAAAEAC2NUQ6DIBQE5yx8l0QFqfQCnqAHqKCXaHr3jsaQ3TyYfcuXwKpeamHi0Bf9YCaSGVW6psLua5QWmifykVbPyCDJ3gube4GHet+tXZZM7Xrj6d7Z3u/W8896dVVpd5rMbCaa3k1k25M88OMPcjDew64AAAA=
I've changed Unicode to ASCII but the result is still too big (128)
H4sIAAAAAAAEAA3KyxGAMAgFwF44y0w+JAEbsAILICSvCcfedc/70EUnaYEq0FiyVJa+wdoj2LNZThDvs9FB918Xqu0ag4H1Vy3GbrG4jImYSyRVp/cDp8EZE1cAAAA=
public static string Compress(this string s)
{
var bytes = Encoding.ASCII.GetBytes(s);
using (var msi = new MemoryStream(bytes))
using (var mso = new MemoryStream())
{
using (var gs = new GZipStream(mso, CompressionMode.Compress))
{
msi.CopyTo(gs);
}
return Convert.ToBase64String(mso.ToArray());
}
}
Gzip is not only a compression algorithm but a complete file format; this means it adds additional structures whose size can usually be neglected.
However, when compressing small strings they can blow up the overall gzip stream.
The standard GZIP header, for example, is 10 bytes and its footer is 8 bytes long.
If you now take your gzip-compressed result in raw format (not the bloated-up base64-encoded one) you will see that it is 95 bytes.
So the 18 bytes for header and footer already make up nearly 20% of the output!
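If that container overhead matters for your short strings, a raw DeflateStream produces the same deflate data without the 10-byte header and 8-byte footer; also keep in mind that base64 itself inflates the byte count by roughly a third. A sketch along the lines of your Compress method (same usings assumed):
public static string CompressDeflate(this string s)
{
    var bytes = Encoding.ASCII.GetBytes(s);
    using (var mso = new MemoryStream())
    {
        // DeflateStream emits only the raw deflate stream: no gzip header, footer or CRC
        using (var ds = new DeflateStream(mso, CompressionMode.Compress))
        {
            ds.Write(bytes, 0, bytes.Length);
        }
        return Convert.ToBase64String(mso.ToArray());
    }
}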
I have a method that decompresses a *.gz file:
using (FileStream originalFileStream = new FileStream(gztempfilename, FileMode.Open, FileAccess.Read))
{
using (FileStream decompressedFileStream = new FileStream(outputtempfilename, FileMode.Create, FileAccess.Write))
{
using (GZipStream decompressionStream = new GZipStream(originalFileStream, CompressionMode.Decompress))
{
decompressionStream.CopyTo(decompressedFileStream);
}
}
}
It worked perfectly, but recently I received a pack of files with a wrong size:
When I open them with 7-Zip they have Packed Size ~ 1,600,000 and Size = 7 (it should be ~20,000,000).
So when I extract them using this code I get only part of the file. But when I extract the same file using 7-Zip I get the full file.
How can I handle this situation in my code?
My guess is that the other end makes a mistake when GZipping the files. It looks like it does not set the ISIZE bytes correctly.
The ISIZE bytes are the last four bytes of a valid GZip file and come after a 32-bit CRC value which in turn comes directly after the compressed data bytes.
7-Zip seems to be robust against such mistakes whereas the GZipStream is not. It is odd however that 7-Zip is not showing you any errors. It should show you (tested with 7-Zip 16.02 x64/Win7)...
CRC error in case the size is simply wrong,
"Unexpected end of data" in case some or all of the ISIZE bytes are cut off,
"There are some data after end of the payload data" in case there is more data following the ISIZE bytes.
7-Zip always uses the last four bytes of the packed file to determine the size of the original unpacked file without checking if the file is valid and whether the bytes read for that are actually the ISIZE bytes.
You can verify this by checking those last four bytes of the GZipped file with a hex viewer. For your example they should be exactly 07 00 00 00.
If you know the exact size of the unpacked original file you could replace those bytes so that they specify the correct size. For instance, if the unpacked file's size is 20,000,078, which is 01312D4E in hex (0-padded to eight digits), those bytes should be 4E 2D 31 01.
In case you don't know the exact size you can try replacing them with the maximum value, i.e. FF FF FF FF.
After that try your unpack code again.
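A minimal sketch of that patch (the size here is a placeholder you would replace with the real value, or with 0xFFFFFFFF if unknown):
// overwrite the last four bytes (ISIZE) with the uncompressed length,
// stored little-endian, modulo 2^32
var gz = File.ReadAllBytes(gztempfilename);
uint realSize = 20000078; // assumed known
BitConverter.GetBytes(realSize).CopyTo(gz, gz.Length - 4);
File.WriteAllBytes(gztempfilename, gz);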
This is obviously only a hacky solution to your problem. Better try fixing the code that GZips the files you receive or try to find a library that is more robust than GZipStream.
I've used ICSharpCode.SharpZipLib.GZip.GZipInputStream from this library instead of System.IO.Compression.GZipStream and it helped.
Did you try checking the size like this?
byte[] bArray;
using (FileStream f = new FileStream(tempFile, FileMode.Open))
{
    bArray = new byte[f.Length];
    f.Read(bArray, 0, (int)f.Length);
}
try:
GZipStream uncompressed = new GZipStream(streamIn, CompressionMode.Decompress, true);
FileStream streamOut = new FileStream(tempDoc[0], FileMode.Create, FileAccess.Write, FileShare.None);
uncompressed.CopyTo(streamOut);
Looks like this comes from how GZipStream behaves: it only writes the gzip trailer (which holds the original file length) when the stream is closed, so grabbing the output before the GZipStream is disposed produces a truncated .gz.
You need to change the way you compress your files using GZipStream.
This is the way that will work:
inputBytes = Encoding.UTF8.GetBytes(output);
using (var outputStream = new MemoryStream())
{
using (var gZipStream = new GZipStream(outputStream, CompressionMode.Compress))
gZipStream.Write(inputBytes, 0, inputBytes.Length);
System.IO.File.WriteAllBytes("file.xml.gz", outputStream.ToArray());
}
And this way will cause the error you have (no matter whether you call Flush() or not):
inputBytes = Encoding.UTF8.GetBytes(output);
using (var outputStream = new MemoryStream())
{
using (var gZipStream = new GZipStream(outputStream, CompressionMode.Compress))
{
gZipStream.Write(inputBytes, 0, inputBytes.Length);
System.IO.File.WriteAllBytes("file.xml.gz", outputStream.ToArray());
}
}
You might need to call decompressedStream.Seek() after closing the gZip stream.
As shown here.
I'm adding compression to my project with the aim of improving the speed of 3G data communication from an Android app to an ASP.NET C# server.
The methods I've researched/written/tested work. However, there is added white space after compression, and the two outputs differ as well. This really puzzles me.
Is it something to do with different implementations of the GZIP classes in Java and ASP.NET C#? Is it something I should be concerned with, or do I just move on with .Trim() and .trim() after decompressing?
Java, compressing "Mary had a little lamb" gives:
Compressed data length: 42
Base64 Compressed String: H4sIAAAAAAAAAPNNLKpUyEhMUUhUyMksKclJVchJzE0CAHrIujIWAAAA
protected static byte[] GZIPCompress(byte[] data) {
try {
ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
GZIPOutputStream gZIPOutputStream = new GZIPOutputStream(byteArrayOutputStream);
gZIPOutputStream.write(data);
gZIPOutputStream.close();
return byteArrayOutputStream.toByteArray();
} catch(IOException e) {
Log.i("output", "GZIPCompress Error: " + e.getMessage());
return null;
}
}
ASP.NET C#, compressing "Mary had a little lamb"
Compressed data length: 137
Base64 Compressed String: H4sIAAAAAAAEAO29B2AcSZYlJi9tynt/SvVK1+B0oQiAYBMk2JBAEOzBiM3mkuwdaUcjKasqgcplVmVdZhZAzO2dvPfee++999577733ujudTif33/8/XGZkAWz2zkrayZ4hgKrIHz9+fB8/Ir7I6ut0ns3SLC2Lti3ztMwWk/8Hesi6MhYAAAA=
public static byte[] GZIPCompress(byte[] data)
{
using (MemoryStream memoryStream = new MemoryStream())
{
using (GZipStream gZipStream = new GZipStream(memoryStream, CompressionMode.Compress))
{
gZipStream.Write(data, 0, data.Length);
}
return memoryStream.ToArray();
}
}
I get 42 bytes on .NET as well. I suspect you're using an old version of .NET which had a flaw in its compression scheme.
Here's my test app using your code:
using System;
using System.IO;
using System.IO.Compression;
using System.Text;
class Program
{
static void Main(string[] args)
{
var uncompressed = Encoding.UTF8.GetBytes("Mary had a little lamb");
var compressed = GZIPCompress(uncompressed);
Console.WriteLine(compressed.Length);
Console.WriteLine(Convert.ToBase64String(compressed));
}
static byte[] GZIPCompress(byte[] data)
{
using (var memoryStream = new MemoryStream())
{
using (var gZipStream = new GZipStream(memoryStream,
CompressionMode.Compress))
{
gZipStream.Write(data, 0, data.Length);
}
return memoryStream.ToArray();
}
}
}
Results:
42
H4sIAAAAAAAEAPNNLKpUyEhMUUhUyMksKclJVchJzE0CAHrIujIWAAAA
This is exactly the same as the Java data.
I'm using .NET 4.5. I suggest you try running the above code on your machine, and compare the results.
I've just decompressed the base64 data you provided, and it is a valid "compressed" form of "Mary had a little lamb", with 22 bytes in the uncompressed data. That surprises me... and reinforces my theory that it's a framework version difference.
EDIT: Okay, this is definitely a framework version difference. If I compile with the .NET 3.5 compiler, then use an app.config which forces it to run with that version of the framework, I see 137 bytes as well. Given comments, it looks like this was only fixed in .NET 4.5.
I am trying to read PCM samples from a (converted) MP3 file using NAudio, but failing as the Read method returns zero (indicating EOF) every time.
Example: this piece of code, which attempts to read a single 16-bit sample, always prints "0":
using System;
using NAudio.Wave;
namespace NAudioMp3Test
{
class Program
{
static void Main(string[] args)
{
using (Mp3FileReader fr = new Mp3FileReader("MySong.mp3"))
{
byte[] buffer = new byte[2];
using (WaveStream pcm = WaveFormatConversionStream.CreatePcmStream(fr))
{
using (WaveStream aligned = new BlockAlignReductionStream(pcm))
{
Console.WriteLine(aligned.WaveFormat);
Console.WriteLine(aligned.Read(buffer, 0, 2));
}
}
}
}
}
}
output:
16 bit PCM: 44kHz 2 channels
0
But this version which reads from a WAV file works fine (I used iTunes to convert the MP3 to a WAV so they should contain similar samples):
static void Main(string[] args)
{
using (WaveFileReader pcm = new WaveFileReader("MySong.wav"))
{
byte[] buffer = new byte[2];
using (WaveStream aligned = new BlockAlignReductionStream(pcm))
{
Console.WriteLine(aligned.WaveFormat);
Console.WriteLine(aligned.Read(buffer, 0, 2));
}
}
}
output:
16 bit PCM: 44kHz 2 channels
2
What is going on here? Both streams have the same wave formats so I would expect to be able to use the same API to read samples. Setting the Position property doesn't help either.
You probably need to read in larger chunks. NAudio uses ACM to perform the conversion from MP3 to WAV, and if your target buffer isn't big enough, the codec may refuse to convert any data at all. In other words, you need to convert a block of samples before you can read the first sample.
WAV files are a different matter as it is nice and easy to read a single sample from them.
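A minimal sketch of what "larger chunks" means in practice (the one-second buffer is just a comfortable size, not a requirement of the API):
using (var fr = new Mp3FileReader("MySong.mp3"))
using (var pcm = WaveFormatConversionStream.CreatePcmStream(fr))
{
    // ask for a whole second of PCM instead of a single 2-byte sample,
    // so the ACM codec has enough room to convert at least one block
    var buffer = new byte[pcm.WaveFormat.AverageBytesPerSecond];
    int bytesRead = pcm.Read(buffer, 0, buffer.Length);
    Console.WriteLine("{0} bytes read", bytesRead);
}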