C# Trying to replace a byte while using MemoryStream class

I get a text file from a mainframe, and sometimes there are stray 0x0D bytes injected into the middle of the text lines.
The previous programmer created a method using the FileStream class. This method works fine but takes around 30 minutes to go through the entire file.
My thought was to pass only the text lines that are needed (about 25 lines) to a method, to decrease the processing time.
I've been working with the MemoryStream class, but I'm having an issue where it does not find the 0x0D control code.
Here is the current FileStream method:
private void ReplaceFileStream(string strInputFile)
{
FileStream fileStream = new FileStream(strInputFile, FileMode.Open, FileAccess.ReadWrite);
byte filebyte;
while (fileStream.Position < fileStream.Length)
{
filebyte = (byte)fileStream.ReadByte();
if (filebyte == 0x0D)
{
filebyte = 0x20;
fileStream.Position = fileStream.Position - 1;
fileStream.WriteByte(filebyte);
}
}
fileStream.Close();
}
and here is the MemoryStream method:
private void ReplaceMemoryStream(string strInputLine)
{
byte[] byteArray = Encoding.ASCII.GetBytes(strInputLine);
MemoryStream fileStream = new MemoryStream(byteArray);
byte filebyte;
while (fileStream.Position < fileStream.Length)
{
filebyte = (byte)fileStream.ReadByte();
if (filebyte == 0x0D)
{
filebyte = 0x20;
fileStream.Position = fileStream.Position - 1;
fileStream.WriteByte(filebyte);
}
}
fileStream.Close();
}
As I have not used the MemoryStream class before, I am not that familiar with it. Any tips or ideas?

I don't know the size of your files, but if they are small enough that you can load the whole thing in memory at once, then you could do something like this:
private void ReplaceFileStream(string strInputFile)
{
byte[] fileBytes = File.ReadAllBytes(strInputFile);
bool modified = false;
for(int i=0; i < fileBytes.Length; ++i)
{
if (fileBytes[i] == 0x0D)
{
fileBytes[i] = 0x20;
modified = true;
}
}
if (modified)
{
File.WriteAllBytes(strInputFile, fileBytes);
}
}
If you can't read the whole file in at once, then you should switch to a buffered reading setup. Here is an example that reads from the file, writes to a temp file, and at the end copies the temp file over the original file. This should yield better performance than reading a file one byte at a time:
private void ReplaceFileStream(string strInputFile)
{
string tempFile = Path.GetTempFileName();
try
{
using(FileStream input = new FileStream(strInputFile,
FileMode.Open, FileAccess.Read))
using(FileStream output = new FileStream(tempFile,
FileMode.Create, FileAccess.Write))
{
byte[] buffer = new byte[4096];
int bytesRead = input.Read(buffer, 0, 4096);
while(bytesRead > 0)
{
for(int i=0; i < bytesRead; ++i)
{
if (buffer[i] == 0x0D)
{
buffer[i] = 0x20;
}
}
output.Write(buffer, 0, bytesRead);
bytesRead = input.Read(buffer, 0, 4096);
}
output.Flush();
}
File.Copy(tempFile, strInputFile, true); // overwrite the original file
}
finally
{
if (File.Exists(tempFile))
{
File.Delete(tempFile);
}
}
}

If your replacement code does not find the 0x0D in the stream while the previous FileStream method does, I think it could be because of the encoding you are using to get the bytes; you could try some other encoding types.
Otherwise your code seems fine. I would put a using around the MemoryStream to be sure it gets closed and disposed, something like this:
using(var fileStream = new MemoryStream(byteArray))
{
byte filebyte;
// your while loop...
}
Looking at your code, I am not 100% sure the changes you make to the memory stream will be persisted. Actually, I think that if you do not save the result after the changes, they will be lost. I could be wrong about this, so you should test it; if the changes are not saved, you should use a StreamWriter to save them.
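One way to test the persistence concern above: a MemoryStream created over a byte array writes into that array, but the question's method returns void and .NET strings are immutable, so the caller never sees the modified bytes. Below is a minimal sketch that returns the cleaned line instead. The class and method names are hypothetical, and it assumes the lines are plain ASCII (Encoding.ASCII turns any non-ASCII character into '?').

```csharp
using System;
using System.Text;

static class LineCleaner
{
    // Hypothetical helper: returns a copy of the line in which every
    // carriage return (0x0D) is replaced by a space (0x20).
    public static string ReplaceCarriageReturns(string line)
    {
        // Assumption: the input is ASCII text, as in the question.
        byte[] bytes = Encoding.ASCII.GetBytes(line);
        for (int i = 0; i < bytes.Length; i++)
        {
            if (bytes[i] == 0x0D)
            {
                bytes[i] = 0x20;
            }
        }
        // Convert the modified bytes back so the caller gets the result.
        return Encoding.ASCII.GetString(bytes);
    }
}
```

The caller would then use the return value, for example `line = LineCleaner.ReplaceCarriageReturns(line);`. If the line was read with a method such as StreamReader.ReadLine, a 0x0D may already have been consumed as a line terminator before this code sees it, which could also explain why it is never found.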

Related

FileStream is not working with relative path

I'm trying to use a FileStream with a relative path but it is not working.
var pic = ReadFile("~/Images/money.png");
It is working when I use something like:
var p = GetFilePath();
var pic = ReadFile(p);
the rest of the code(from SO):
public static byte[] ReadFile(string filePath)
{
byte[] buffer;
FileStream fileStream = new FileStream(filePath, FileMode.Open, FileAccess.Read);
try
{
int length = (int)fileStream.Length; // get file length
buffer = new byte[length]; // create buffer
int count; // actual number of bytes read
int sum = 0; // total number of bytes read
// read until Read method returns 0 (end of the stream has been reached)
while ((count = fileStream.Read(buffer, sum, length - sum)) > 0)
sum += count; // sum is a buffer offset for next reading
}
finally
{
fileStream.Close();
}
return buffer;
}
public string GetFilePath()
{
return HttpContext.Current.Server.MapPath("~/Images/money.png");
}
I don't get why it is not working, because the FileStream constructor allows a relative path.
I'm assuming the folder in your program has the subfolder images, which contains your image file.
\folder\program.exe
\folder\Images\money.png
Try without the "~".
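Some background on why the "~" fails: it is ASP.NET virtual-path syntax that Server.MapPath understands, whereas FileStream treats the string as an ordinary file-system path and resolves relative paths against the process's current working directory. The sketch below shows one way to build an absolute path from the application's base directory instead. The class and method names are hypothetical, and it assumes the Images folder sits next to the executable as in the layout above; in an ASP.NET application Server.MapPath remains the usual choice.

```csharp
using System;
using System.IO;

static class ImagePaths
{
    // Hypothetical helper: combines the given path segments and anchors them
    // at the application's base directory rather than the current directory.
    public static string Resolve(params string[] relativeParts)
    {
        string relative = Path.Combine(relativeParts);
        return Path.GetFullPath(Path.Combine(AppContext.BaseDirectory, relative));
    }
}
```

Usage would look like `var pic = ReadFile(ImagePaths.Resolve("Images", "money.png"));`.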
I had the same issue and solved it with the code below.
Try one of these; I hope it solves your issue too.
#region GetImageStream
public static Stream GetImageStream(string Image64string)
{
Stream imageStream = new MemoryStream();
if (!string.IsNullOrEmpty(Image64string))
{
byte[] imageBytes = Convert.FromBase64String(Image64string.Substring(Image64string.IndexOf(',') + 1));
using (Image targetimage = BWS.AWS.S3.ResizeImage(System.Drawing.Image.FromStream(new MemoryStream(imageBytes, false)), new Size(1600, 1600), true))
{
targetimage.Save(imageStream, ImageFormat.Jpeg);
}
}
return imageStream;
}
#endregion
2nd one
#region GetImageStream
public static Stream GetImageStream(Stream stream)
{
Stream imageStream = new MemoryStream();
if (stream != null)
{
using (Image targetimage = BWS.AWS.S3.ResizeImage(System.Drawing.Image.FromStream(stream), new Size(1600, 1600), true))
{
targetimage.Save(imageStream, ImageFormat.Jpeg);
}
}
return imageStream;
}
#endregion

SharpZipLib not compressing memory stream

I have a memory stream that I want to compress:
public static MemoryStream ZipChunk(MemoryStream unZippedChunk) {
MemoryStream zippedChunk = new MemoryStream();
ZipOutputStream zipOutputStream = new ZipOutputStream(zippedChunk);
zipOutputStream.SetLevel(3);
ZipEntry entry = new ZipEntry("name");
zipOutputStream.PutNextEntry(entry);
Utils.StreamCopy(unZippedChunk, zippedChunk, new byte[4096]);
zipOutputStream.CloseEntry();
zipOutputStream.IsStreamOwner = false;
zipOutputStream.Close();
zippedChunk.Close();
return zippedChunk;
}
public static void StreamCopy(Stream source, Stream destination, byte[] buffer, bool bFlush = true) {
bool flag = true;
while (flag) {
int num = source.Read(buffer, 0, buffer.Length);
if (num > 0) {
destination.Write(buffer, 0, num);
}
else {
if (bFlush) {
destination.Flush();
}
flag = false;
}
}
}
It's supposed to be quite simple. You provide it with a stream you want to compress. The method compresses the stream and returns it. Great.
However, I don't get a compressed stream back. What I get are streams that have about 20 bytes added at the beginning and end, which seem to have something to do with the zip library. But the data in the middle is completely uncompressed (runs of 256 bytes that have the same value, etc.). I tried raising the level to 9, but nothing changed.
Why aren't my streams compressing?
You are copying the original stream straight into the output stream yourself:
Utils.StreamCopy(unZippedChunk, zippedChunk, new byte[4096]);
You should copy to zipOutputStream instead:
StreamCopy(unZippedChunk, zipOutputStream, new byte[4096]);
Side note: instead of a custom stream-copy method, use the built-in one:
unZippedChunk.CopyTo(zipOutputStream);
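The same principle (write into the compressing stream, never directly into the stream underneath it) can be sketched without an external dependency. Note that this swaps in the built-in System.IO.Compression.ZipArchive for SharpZipLib, so it is an illustration of the idea rather than a drop-in replacement; the class name is hypothetical. It also rewinds the result instead of closing it, so the caller can read the returned stream.

```csharp
using System.IO;
using System.IO.Compression;

static class ZipSketch
{
    // Compresses the input into a single-entry zip archive held in memory.
    public static MemoryStream ZipChunk(Stream unzippedChunk)
    {
        var zippedChunk = new MemoryStream();
        // leaveOpen: true keeps zippedChunk usable after the archive is disposed.
        using (var archive = new ZipArchive(zippedChunk, ZipArchiveMode.Create, leaveOpen: true))
        {
            ZipArchiveEntry entry = archive.CreateEntry("name", CompressionLevel.Optimal);
            using (Stream entryStream = entry.Open())
            {
                // Copy into the entry stream, which performs the compression.
                unzippedChunk.CopyTo(entryStream);
            }
        }
        // Rewind so the caller reads from the start of the archive.
        zippedChunk.Position = 0;
        return zippedChunk;
    }
}
```

Disposing the archive before reading the result matters, because the zip central directory is only written at that point.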

How to read file by chunks

I'm a little confused about how I should read a large file (> 8 GB) in chunks when each chunk has its own size.
If I know the chunk size, it looks like the code below:
using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read, ProgramOptions.BufferSizeForChunkProcessing))
{
using (BufferedStream bs = new BufferedStream(fs, ProgramOptions.BufferSizeForChunkProcessing))
{
byte[] buffer = new byte[ProgramOptions.BufferSizeForChunkProcessing];
int byteRead;
while ((byteRead = bs.Read(buffer, 0, ProgramOptions.BufferSizeForChunkProcessing)) > 0)
{
byte[] originalBytes;
using (MemoryStream mStream = new MemoryStream())
{
mStream.Write(buffer, 0, byteRead);
originalBytes = mStream.ToArray();
}
}
}
}
But imagine I've read a large file in chunks, transformed each chunk (so each chunk's size changed), and written all the processed chunks to a new file. Now I need to do the reverse operation, but I don't know the exact chunk sizes. I have an idea: after each chunk has been processed, write the new chunk size before the chunk bytes, like this:
Number of block bytes
Block bytes
Number of block bytes
Block bytes
So in that case, the first thing I need to do is read the chunk's header to learn the exact chunk size. I read and write only byte arrays to the file. But I have a question: what should a chunk's header look like? Should the header contain some kind of boundary marker?
If the file is rigidly structured so that each block of data is preceded by a 32-bit length value, then it is easy to read. The "header" for each block is just the 32-bit length value.
If you want to read such a file, the easiest way is probably to encapsulate the reading into a method that returns IEnumerable<byte[]> like so:
public static IEnumerable<byte[]> ReadChunks(string path)
{
var lengthBytes = new byte[sizeof(int)];
using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read))
{
while (true) // Keep reading blocks until the end of the file.
{
int n = fs.Read(lengthBytes, 0, sizeof (int)); // Read block size.
if (n == 0) // End of file.
yield break;
if (n != sizeof(int))
throw new InvalidOperationException("Invalid header");
int blockLength = BitConverter.ToInt32(lengthBytes, 0);
var buffer = new byte[blockLength];
n = fs.Read(buffer, 0, blockLength);
if (n != blockLength)
throw new InvalidOperationException("Missing data");
yield return buffer;
}
}
}
Then you can use it simply:
foreach (var block in ReadChunks("MyFileName"))
{
// Process block.
}
Note that you don't need to provide your own buffering.
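For completeness, here is a minimal sketch of the writing side of such a length-prefixed format, paired with a stream-based reader so the round trip can be checked in memory. The ChunkFormat class and its members are hypothetical names. It assumes the same machine byte order on both sides, since BitConverter uses the platform's endianness, and it loops on Read because a single call may return fewer bytes than requested.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class ChunkFormat
{
    // Writes one block: a 32-bit length header followed by the block bytes.
    public static void WriteChunk(Stream output, byte[] chunk)
    {
        byte[] header = BitConverter.GetBytes(chunk.Length);
        output.Write(header, 0, header.Length);
        output.Write(chunk, 0, chunk.Length);
    }

    // Reads until the buffer is full or the stream ends; returns bytes read.
    static int ReadFully(Stream input, byte[] buffer)
    {
        int total = 0;
        while (total < buffer.Length)
        {
            int n = input.Read(buffer, total, buffer.Length - total);
            if (n == 0) break; // End of stream.
            total += n;
        }
        return total;
    }

    // Yields each block in turn until the stream is exhausted.
    public static IEnumerable<byte[]> ReadChunks(Stream input)
    {
        var header = new byte[sizeof(int)];
        while (true)
        {
            int n = ReadFully(input, header);
            if (n == 0) yield break; // Clean end of stream.
            if (n != header.Length) throw new InvalidDataException("Truncated header");
            int length = BitConverter.ToInt32(header, 0);
            if (length < 0) throw new InvalidDataException("Negative block length");
            var block = new byte[length];
            if (ReadFully(input, block) != length) throw new InvalidDataException("Truncated block");
            yield return block;
        }
    }
}
```

With an explicit length in front of every block, no boundary marker is needed: the reader always knows exactly how many bytes belong to the current block.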
try this
public static IEnumerable<byte[]> ReadChunks(string fileName)
{
const int MAX_BUFFER = 1048576;// 1MB
byte[] filechunk = new byte[MAX_BUFFER];
int numBytes;
using (var fs = new FileStream(fileName, FileMode.Open, FileAccess.Read, FileShare.Read))
{
long remainBytes = fs.Length;
int bufferBytes = MAX_BUFFER;
while (true)
{
if (remainBytes <= MAX_BUFFER)
{
filechunk = new byte[remainBytes];
bufferBytes = (int)remainBytes;
}
if ((numBytes = fs.Read(filechunk, 0, bufferBytes)) > 0)
{
remainBytes -= bufferBytes;
yield return filechunk;
}
else
{
break;
}
}
}
}

uncompressed file is bigger than original file in GZIP

I'm using the following function to compress (thanks to http://www.dotnetperls.com/):
public static void CompressStringToFile(string fileName, string value)
{
// A.
// Write string to temporary file.
string temp = Path.GetTempFileName();
File.WriteAllText(temp, value);
// B.
// Read file into byte array buffer.
byte[] b;
using (FileStream f = new FileStream(temp, FileMode.Open))
{
b = new byte[f.Length];
f.Read(b, 0, (int)f.Length);
}
// C.
// Use GZipStream to write compressed bytes to target file.
using (FileStream f2 = new FileStream(fileName, FileMode.Create))
using (GZipStream gz = new GZipStream(f2, CompressionMode.Compress, false))
{
gz.Write(b, 0, b.Length);
}
}
and for decompress:
static byte[] Decompress(byte[] gzip)
{
// Create a GZIP stream with decompression mode.
// ... Then create a buffer and write into while reading from the GZIP stream.
using (GZipStream stream = new GZipStream(new MemoryStream(gzip), CompressionMode.Decompress))
{
const int size = 4096;
byte[] buffer = new byte[size];
using (MemoryStream memory = new MemoryStream())
{
int count = 0;
do
{
count = stream.Read(buffer, 0, size);
if (count > 0)
{
memory.Write(buffer, 0, count);
}
}
while (count > 0);
return memory.ToArray();
}
}
}
So my goal is to compress log files, then decompress them in memory and compare the decompressed data to the original file, to check that the compression succeeded and that I can open the compressed file successfully.
The problem is that the decompressed data is usually bigger than the original file, so my comparison fails although the compression probably succeeded.
Any idea why?
By the way, here is how I compare the decompressed data to the original file:
static bool FileEquals(byte[] file1, byte[] file2)
{
if (file1.Length == file2.Length)
{
for (int i = 0; i < file1.Length; i++)
{
if (file1[i] != file2[i])
{
return false;
}
}
return true;
}
return false;
}
Try this method to compress a file:
public static byte[] Compress(byte[] raw)
{
using (MemoryStream memory = new MemoryStream())
{
using (GZipStream gzip = new GZipStream(memory,
CompressionMode.Compress, true))
{
gzip.Write(raw, 0, raw.Length);
}
return memory.ToArray();
}
}
And this to decompress :
static byte[] Decompress(byte[] gzip)
{
// Create a GZIP stream with decompression mode.
// ... Then create a buffer and write into while reading from the GZIP stream.
using (GZipStream stream = new GZipStream(new MemoryStream(gzip), CompressionMode.Decompress))
{
const int size = 4096;
byte[] buffer = new byte[size];
using (MemoryStream memory = new MemoryStream())
{
int count = 0;
do
{
count = stream.Read(buffer, 0, size);
if (count > 0)
{
memory.Write(buffer, 0, count);
}
}
while (count > 0);
return memory.ToArray();
}
}
}
Tell me if it worked.
Good luck.
I think you'd be better off with the simplest API call: try Stream.CopyTo(). I can't find the error in your code. If I were working on it, I'd make sure everything is getting flushed properly; I can't recall whether GZipStream flushes its output to the FileStream when the using block closes. But then you are also saying that the final file is larger, not smaller.
Anyhow, the best policy in my experience: don't rewrite gotcha-prone code when you don't need to. At least you tested it ;)
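As a sketch of the Stream.CopyTo() suggestion, the round trip can be done entirely on byte arrays, which makes the comparison independent of any text encoding or temp file. The class and method names here are hypothetical. The compressing GZipStream is disposed before the output is read, since the final compressed data and the gzip footer are written at that point.

```csharp
using System.IO;
using System.IO.Compression;
using System.Linq;

static class GzipRoundTrip
{
    public static byte[] Compress(byte[] raw)
    {
        using (var output = new MemoryStream())
        {
            // leaveOpen: true so the MemoryStream survives the GZipStream's disposal.
            using (var gzip = new GZipStream(output, CompressionMode.Compress, leaveOpen: true))
            {
                gzip.Write(raw, 0, raw.Length);
            }
            return output.ToArray();
        }
    }

    public static byte[] Decompress(byte[] gzipped)
    {
        using (var input = new MemoryStream(gzipped))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            gzip.CopyTo(output); // Replaces the manual read/write loop.
            return output.ToArray();
        }
    }

    // True when decompressing the compressed bytes reproduces the input exactly.
    public static bool RoundTrips(byte[] original)
    {
        return Decompress(Compress(original)).SequenceEqual(original);
    }
}
```

For the check to be meaningful, the bytes passed in should come from something like File.ReadAllBytes on the original log file. If the compressed file was produced from a string written with File.WriteAllText, the bytes on disk can differ from the original file's bytes depending on the encoding, which is one possible source of a size mismatch.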

C# decode (decompress) Deflate data of PDF File

I would like to decompress some DeflateCoded data (extracted from a PDF) in C#.
Unfortunately, every time I get the exception "Found invalid data while decoding.".
But the data is valid.
private void Decompress()
{
FileStream fs = new FileStream(@"S:\Temp\myFile.bin", FileMode.Open);
//First two bytes are irrelevant
fs.ReadByte();
fs.ReadByte();
DeflateStream d_Stream = new DeflateStream(fs, CompressionMode.Decompress);
StreamToFile(d_Stream, @"S:\Temp\myFile1.txt", FileMode.OpenOrCreate);
d_Stream.Close();
fs.Close();
}
private static void StreamToFile(Stream inputStream, string outputFile, FileMode fileMode)
{
if (inputStream == null)
throw new ArgumentNullException("inputStream");
if (String.IsNullOrEmpty(outputFile))
throw new ArgumentException("Argument null or empty.", "outputFile");
using (FileStream outputStream = new FileStream(outputFile, fileMode, FileAccess.Write))
{
int cnt = 0;
const int LEN = 4096;
byte[] buffer = new byte[LEN];
while ((cnt = inputStream.Read(buffer, 0, LEN)) != 0)
outputStream.Write(buffer, 0, cnt);
}
}
Does anyone have any ideas?
Thanks.
I added this for test data:-
private static void Compress()
{
FileStream fs = new FileStream(@"C:\Temp\myFile.bin", FileMode.Create);
DeflateStream d_Stream = new DeflateStream(fs, CompressionMode.Compress);
for (byte n = 0; n < 255; n++)
d_Stream.WriteByte(n);
d_Stream.Close();
fs.Close();
}
Modified Decompress like this:-
private static void Decompress()
{
FileStream fs = new FileStream(@"C:\Temp\myFile.bin", FileMode.Open);
//First two bytes are irrelevant
// fs.ReadByte();
// fs.ReadByte();
DeflateStream d_Stream = new DeflateStream(fs, CompressionMode.Decompress);
StreamToFile(d_Stream, @"C:\Temp\myFile1.txt", FileMode.OpenOrCreate);
d_Stream.Close();
fs.Close();
}
Ran it like this:-
static void Main(string[] args)
{
Compress();
Decompress();
}
And got no errors.
I conclude that either the first two bytes are relevant (Obviously they are with my particular test data.) or
that your data has a problem.
Can we have some of your test data to play with?
(Obviously don't if it's sensitive)
private static string decompress(byte[] input)
{
byte[] cutinput = new byte[input.Length - 2];
Array.Copy(input, 2, cutinput, 0, cutinput.Length);
var stream = new MemoryStream();
using (var compressStream = new MemoryStream(cutinput))
using (var decompressor = new DeflateStream(compressStream, CompressionMode.Decompress))
decompressor.CopyTo(stream);
return Encoding.Default.GetString(stream.ToArray());
}
Thank you user159335 and user1011394 for bringing me on the right track! Just pass all bytes of the stream to input of above function. Make sure the bytecount is the same as the length specified.
All you need to do is use GZip instead of Deflate. Below is the code I use for the content of the stream… endstream section in a PDF document:
using System.IO.Compression;
public void DecompressStreamData(byte[] data)
{
int start = 0;
while (start < data.Length && ((data[start] == 0x0a) || (data[start] == 0x0d))) start++; // skip leading CR, LF
byte[] tempdata = new byte[data.Length - start];
Array.Copy(data, start, tempdata, 0, data.Length - start);
MemoryStream msInput = new MemoryStream(tempdata);
MemoryStream msOutput = new MemoryStream();
try
{
GZipStream decomp = new GZipStream(msInput, CompressionMode.Decompress);
decomp.CopyTo(msOutput);
}
catch (Exception e)
{
MessageBox.Show(e.Message);
}
}
None of the solutions worked for me on Deflate attachments in a PDF/A-3 document. Some research showed that the .NET DeflateStream does not support compressed streams with a header and trailer as per RFC 1950.
Error message for reference: The archive entry was compressed using an unsupported compression method.
The solution is to use an alternative library, SharpZipLib.
Here is a simple method that successfully decoded a Deflate attachment from a PDF/A-3 file for me:
public static string SZLDecompress(byte[] data) {
var outputStream = new MemoryStream();
using var compressedStream = new MemoryStream(data);
using var inputStream = new InflaterInputStream(compressedStream);
inputStream.CopyTo(outputStream);
outputStream.Position = 0;
return Encoding.Default.GetString(outputStream.ToArray());
}
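As background to the answers above: RFC 1950 (zlib) wraps the raw RFC 1951 deflate data in a 2-byte header and a 4-byte Adler-32 trailer, which is why skipping the first two bytes lets DeflateStream read the data. On .NET 6 and later, System.IO.Compression also includes ZLibStream, which handles that wrapper itself. Below is a minimal sketch assuming a .NET 6+ target; the class and method names are hypothetical, and whether it copes with every PDF producer's output is not something this sketch establishes.

```csharp
using System.IO;
using System.IO.Compression;

static class ZlibSketch
{
    // Decompresses zlib-wrapped (RFC 1950) data, header and trailer included.
    public static byte[] Inflate(byte[] zlibData)
    {
        using var input = new MemoryStream(zlibData);
        using var zlib = new ZLibStream(input, CompressionMode.Decompress);
        using var output = new MemoryStream();
        zlib.CopyTo(output);
        return output.ToArray();
    }
}
```

Returning raw bytes leaves the choice of text encoding to the caller, which avoids guessing at the encoding of the PDF stream's content.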
