Split up a memorystream into bytarray

Split up a memorystream into bytarray - c#

Im trying to split up a memorystream into chunks by reading parts into a byte array but i think i have got something fundamentally wrong. I can read the first chunk but when i try to read rest of memorystream i get index out of bound even if there are more bytes to read. It seems that the issue is the size of the receving byte buffer that need to be as large as the memorystrem. I need to convert it into chunks as the code is in a webservice.
Anyone knows whats wrong with this code
fb.buffer is MemoryStream
long bytesLeft = fb.Buffer.Length;
fb.Buffer.Position = 0;
int offset =0;
int BUFF_SIZE = 8196;
while (bytesLeft > 0)
{
byte[] fs = new byte[BUFF_SIZE];
fb.Buffer.Read(fs, offset, BUFF_SIZE);
offset += BUFF_SIZE;
bytesLeft -= BUFF_SIZE;
}

offset here is the offset into the array. It should be zero here. You should also be looking at the return value from Read. It is not guaranteed to fill the buffer, even if more data is available.
However, if this is a MemoryStream - a better option might be ArraySegment<byte>, which requires no duplication of data.

Please look at this code for Stream.Read from MSDN from a glance - you shouldn't be incrementing offset - it should always be zero. Unless, of course, you happen to know for a fact the exact length of the stream in advance (therefore you would create the array the exact size).
You should also always grab the amount of bytes read from Read (the return value).
If you're looking to split it into 'chunks` do you mean you want n 8k chunks? Then you might want to do something like this:
List<byte[]> chunks = new List<byte[]>();
byte chunk = new byte[BUFF_SIZE];
int bytesRead = fb.Buffer.Read(chunk, 0, BUFF_SIZE);
while(bytesRead > 0)
{
if(bytesRead != BUFF_SIZE)
{
byte[] tail = new byte[bytesRead];
Array.Copy(chunk, tail, bytesRead);
chunk = tail;
}
chunks.Add(chunk);
bytesRead = fb.Buffer.Read(chunk, 0, BUFF_SIZE);
}
Note in particular that the last chunk is more than likely not going to be exactly BUFF_SIZE in length.

Related

C# - Stream.Read offset is working incorrectly

I have a test code like below. I am reading from stream, offsetting by 2 positions, and then taking next 2 bytes. I would hope that result would be an array with 2 elements. This does not work though - offset is completely ignored, and full sized array is always returned, with only offset blocks having values. But this means my result table is still very large, it just has a lot of unwanted zeroes
How can I rework below code, so that file.Read() returns only an array of 2 bytes instead of 10 when length = 2 and offset = 2? In real world scenario I am dealing with large files (>2gigs) so filtering out the result array is not an option.
Edit: As the issue is unclear - below code requires me to always define output array that is the same size as the stream. Instead I would like to have an output that is of size of length (in below example I would like to have var buffer = new byte[2], but that will throw an exception because file.Read ignores offset and length and always returns 10 elements (with only 2 of them being read, rest is dummy zeroes).
private byte[] GetFilePart(int length, int offset)
{
//build some dummy content
var content = new byte[10];
for (int i = 0; i<10; i++)
{
content[i] = 1;
}
//read the data from content
var buffer = new byte[10];
using (Stream file = new MemoryStream(content))
{
file.Read(buffer, offset, length);
}
return buffer;
}

Looks like it's working properly to me; maybe your confusion would clear a bit if you inited your content array with something like:
for (int i = 1; i<=10; i++)
{
content[i-1] = i;
}
then each byte would have a different number and the image would look like:
offset relates to where into buffer the Stream will write the bytes to (it reads from the start of content). It does not relate to what bytes are read out of content.
Imagine Read as being called WriteBytesInto(byte[] whatBuffer, int whereToStartWriting, int howManyBytesToWrite) - you provide the buffer it will write into and tell it where to start and how many to do
If you did this, having inited content to be incrementing numbers:
file.Read(buffer, 2, 3); //read 3 bytes from stream and write to buffer # index 2
file.Read(buffer, 0, 2); //read 2 bytes from stream and write to buffer # index 0
Your buffer would end up looking like:
4,5,1,2,3,0,0,0,0,0
The 1,2,3 having been written first, then the 4,5 written next
If you want to skip two bytes from the stream (i.e. read the 3rd and 4th byte from content, Seek() the stream or set its Position (or as canton7 advises in the commments, if the stream is not seekable, read and discard some bytes)
How can I rework below code, so that file.Read() returns only an array of 2 bytes instead of 10 when length = 2 and offset = 2?
Well, file.Read doesn't return an array at all; it modifies an array you give it. If you want a 2 byte array, give it a 2 byte array:
byte buf = new byte[2];
file.Read(buf, 0, buf.Length);
If you want to open a file, skip the first 7 bytes and then read bytes 8th and 9th into your length-of-2 byte array then:
byte buf = new byte[2];
file.Position = 7; //absolute skip to 8th byte
file.Read(buf, 0, buf.Length);
For more on seeking in streams see Stream.Seek(0, SeekOrigin.Begin) or Position = 0

How to read file bytes from byte offset?

If I am given a .cmp file and a byte offset 0x598, how can I read a file from this offset?
I can ofcourse read file bytes like this
byte[] fileBytes = File.ReadAllBytes("upgradefile.cmp");
But how can I read it from byte offset 0x598
To explain a bit more, actually from this offset the actual data starts that I have to read and before this byte offset it is just header data, so basically I have to read file from that offset till end.

Try code like this:
using (BinaryReader reader = new BinaryReader(File.Open("upgradefile.cmp", FileMode.Open)))
{
long offset = 0x598;
if (reader.BaseStream.Length > offset)
{
reader.BaseStream.Seek(offset, SeekOrigin.Begin);
byte[]fileBytes = reader.ReadBytes((int) (reader.BaseStream.Length - offset));
}
}

If you are not familiar with Streams, Linq, or whatever, I have simplest solution for you:
Read entire file into memory (I hope you deal with small files):
byte[] fileBytes = File.ReadAllBytes("upgradefile.cmp");
Calculate how many bytes are present in array after given offset:
long startOffset = 0x598; // this is just hexadecimal representation for human, it can be decimal or whatever
long howManyBytesToRead = fileBytes.Length - startOffset;
Then just copy data to new array:
byte[] newArray = new byte[howManyBytesToRead];
long pos = 0;
for (int i = startOffset; i < fileBytes.Length; i++)
{
newArray[pos] = fileBytes[i];
pos = pos + 1;
}
If you understand how it works you can look at Array.Copy method in Microsoft documentation.

By not using ReadAllBytes.
Get a stream, move to potition, read rest of files.
You basically complain that a convenience method made to allow a one line read of a whole file is not what you want - ignoring that it is just that, a convenience method. The normal way to deal with files is opening them and using a Stream.

fail to read it all into memory when shortcount reading

My textbook says:
Read receives a block of data from the stream into an array. It returns the number of
bytes received, which is always either less than or equal to the count argument. If
it’s less than count, it means either that the end of the stream has been reached or
the stream is giving you the data in smaller chunks (as is often the case with network streams). In either case, the balance of bytes in the array will remain unwritten, their previous values preserved.
With Read, you can be certain you’ve reached the end of the
stream only when the method returns 0. So, if you have a
1,000-byte stream, the following code may fail to read it all
into memory:
// Assuming s is a stream:
byte[] data = new byte [1000];
s.Read (data, 0, data.Length);
The Read method could read anywhere from 1 to 1,000 bytes,
leaving the balance of the stream unread.
I'm confused and I write a simple concolse application to verify:
//read.txt only contains 3 chars: abc
using (FileStream s = new FileStream("read.txt", FileMode.Open))
{
byte[] data = new byte[5];
int num = s.Read(data, 0, data.Length);
foreach (byte b in data)
{
Console.WriteLine(b);
}
}
and the output is:
97
98
99
0
0
so the data array has been written, why the textbook says "the balance of bytes in the array will remain unwritten and it may fail to read it all into memory",
EDIT: please discard my previous console application, the real question I have is:
a file has 1000 byte and I'm requesting to read all 1000 bytes:
byte[] data = new byte [1000];
s.Read (data, 0, data.Length);
why it may fail to read it all into memory? and the textbook also provides a correct method to read all 1000 bytes:
byte[] data = new byte [1000];
int bytesRead = 0;
int chunkSize = 1;
while (bytesRead < data.Length && chunkSize > 0)
bytesRead +=
chunkSize = s.Read (data, bytesRead, data.Length - bytesRead);

With regard to:
byte[] data = new byte[5];
int num = s.Read(data, 0, data.Length);
foreach (byte b in data)
That foreach will process every single byte in data (and there's always five of those regardless of how many were written to by the Read() call).
The number of bytes that were actually read by the Read() call gets stored into the num variable, so that's what you should be using to process your data, something like (untested but probably simple enough to not warrant it):
for (int i = 0; i < num; i++) {
Console.WriteLine(data[i]);
}
As per the text you quote, everything after the bytes that were read can be any arbitrary value because the Read() call did not change them. In your case, they were obviously zero before the Read() and so remained so.
If you were to run that foreach loop before the Read() as well as after, you should see only the first three bytes change (assuming the content of the file is different to what was in memory, of course).

If it’s less than count, it means either that the end of the stream has been reached or the stream is giving you the data in smaller chunks (as is often the case with network streams). In either case, the balance of bytes in the array will remain unwritten, their previous values preserved.
When the text says In either case, it refers to If it’s (the return value, i.e the bytes read from the stream) less than count. In this case, the bytes read don't take up all the buffer area you have designated to the read action. The rest of the buffer area is, of course, untouched (remain unwritten, their previous values preserved).
In your case, it is that the end of the stream has been reached because you only have 3 bytes in your file stream. The trailing zeros in your array are the untouched buffer elements.
So, if you have a 1,000-byte stream, the following code may fail to read it all into memory.
The text is talking about a 1,000-byte stream, while your stream just contains 3 bytes. For a short file stream that has only 3 bytes, it is likely that all data will be loaded into memory at once, which is the case you have observed. (just "likely", not "guaranteed"). For a longer stream (or other types of stream, e.g. NetworkStream), you are not guarunteed to read bytes enough to fill your designated buffer area at once (there are possibilities that you may even get only 1 bytes at a time!).

C# Beginner File Reading

Okay I searched for an answer to this but couldn't find it.
here's the code:
FileStream fs = new FileStream(fileName, FileMode.Open, FileAccess.Read);
byte[] fileText = new byte[fs.Length];
int bytesRead = fs.Read(fileText, 0, fileText.Length);
Console.WriteLine(Encoding.ASCII.GetString(fileText, 0, bytesRead));
Let me get this straight,
We declare a filestream
We Declare a byte array.. and set its CAPACITY to fs.Length
???? Why does fs.Read() return an INTEGER ???
??? How does this line display the text from the .txt file to the console? we passed in the byte[] in the getstring() method, but isnt that byte[] empty? we only set its capacity to fs.length? where did the reading happen and how?
TIA

If you are trying to read a text file and display all it's lines in console
foreach(string line in File.ReadAllLines("YourFilePath"))
{
Console.WriteLine(line);
}
In your method
FileStream fs = new FileStream("YourFilePath", FileMode.Open, FileAccess.Read);
Opens the file for reading into stream fs.
byte[] fileText = new byte[fs.Length];
Gets the number of bytes in the file content, and creates a byte array of that size
int bytesRead = fs.Read(fileText, 0, fileText.Length);
Reads the byte content, from 0 to end of content (we have length from last statement), i.e. the complete contents into the array you created. So, now your byte array fileText has all the byte contents from the file.
It returns the number of bytes read in this operation, if you need that for some reason. This can be <= the number of bytes you wanted to read (less if less bytes were available in the file content). In your case, it will be same as fileText.Length since you already calculated that.
System.Console.WriteLine(Encoding.ASCII.GetString(fileText, 0, bytesRead));
Converts the byte array into ASCII encoded text and writes to console.

Read method returns the number of bytes that were read into the buffer paramters. You passed in an array that will be filled with the data when the Read method is actually called. You have passed in the number of bytes that you want to read as well.
Read the msdn documentation for more details here

FileStream.Read() returns the number of bytes actually read. It could be that you ask for 4096 bytes, but get 0, or 1, or 1000. This is what the docs say:
Return Value
Type: System.Int32
The total number of bytes read into the buffer. This might be less than the number of bytes requested if that number of bytes are not currently available, or zero if the end of the stream is reached.
If you are reading text, you can use one of the helpful File methods: File.ReadAllText, File.ReadAllLines, or File.OpenText which gives you a StreamReader object where you can read line-by-line.
If you need to read the bytes (this is a much lower-level usage, and really you should be able to use a StreamReader), then you don't want to create a buffer the length of the stream, since this could crash your program with an OutOfMemoryException. Instead, make the buffer something like 4096 bytes, then call FileStream.Read in a loop, until it returns 0. Note, however, that you are not reading text lines here, and a line break may come in the middle of the buffer. Here's an example:
using (var fileStream = File.OpenRead("c:\\file.txt"))
{
var buffer = new Byte[4096];
var offset = 0;
var read = 0;
while ((read = fileStream.Read(buffer, offset, buffer.Length)) > 0)
{
var s = Encoding.ASCII.GetString(buffer, 0, read);
Console.Write(s);
offset += read;
}
}

3) has been answered here already.
As for 4): the Read method actually also fills the buffer with bytes and returns the number of bytes it filled into the buffer.
Passing the buffer and the number of bytes read to Encoding. GetString() interprets the bytes from the file as character codes for the given encoding, in your case ASCII, and creates an string from the byte array based on the encoding.

Unable to read beyond the end of the stream

I did some quick method to write a file from a stream but it's not done yet. I receive this exception and I can't find why:
Unable to read beyond the end of the stream
Is there anyone who could help me debug it?
public static bool WriteFileFromStream(Stream stream, string toFile)
{
FileStream fileToSave = new FileStream(toFile, FileMode.Create);
BinaryWriter binaryWriter = new BinaryWriter(fileToSave);
using (BinaryReader binaryReader = new BinaryReader(stream))
{
int pos = 0;
int length = (int)stream.Length;
while (pos < length)
{
int readInteger = binaryReader.ReadInt32();
binaryWriter.Write(readInteger);
pos += sizeof(int);
}
}
return true;
}
Thanks a lot!

Not really an answer to your question but this method could be so much simpler like this:
public static void WriteFileFromStream(Stream stream, string toFile)
{
// dont forget the using for releasing the file handle after the copy
using (FileStream fileToSave = new FileStream(toFile, FileMode.Create))
{
stream.CopyTo(fileToSave);
}
}
Note that i also removed the return value since its pretty much useless since in your code, there is only 1 return statement
Apart from that, you perform a Length check on the stream but many streams dont support checking Length.
As for your problem, you first check if the stream is at its end. If not, you read 4 bytes. Here is the problem. Lets say you have a input stream of 6 bytes. First you check if the stream is at its end. The answer is no since there are 6 bytes left. You read 4 bytes and check again. Ofcourse the answer is still no since there are 2 bytes left. Now you read another 4 bytes but that ofcourse fails since there are only 2 bytes. (readInt32 reads the next 4 bytes).

I presume that the input stream have ints only (Int32). You need to test the PeekChar() method,
while (binaryReader.PeekChar() != -1)
{
int readInteger = binaryReader.ReadInt32();
binaryWriter.Write(readInteger);
}

You are doing while (pos < length) and length is the actual length of the stream in bytes. So you are effectively counting the bytes in the stream and then trying to read that many number of ints (which is incorrect). You could take length to be stream.Length / 4 since an Int32 is 4 bytes.

try
int length = (int)binaryReader.BaseStream.Length;

After reading the stream by the binary reader the position of the stream is at the end, you have to set the position to zero "stream.position=0;"

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

Split up a memorystream into bytarray - c#

Related

C# - Stream.Read offset is working incorrectly

How to read file bytes from byte offset?

fail to read it all into memory when shortcount reading

C# Beginner File Reading

Unable to read beyond the end of the stream

Categories

Resources