C# - Stream.Read offset is working incorrectly

C# - Stream.Read offset is working incorrectly - c#

I have a test code like below. I am reading from stream, offsetting by 2 positions, and then taking next 2 bytes. I would hope that result would be an array with 2 elements. This does not work though - offset is completely ignored, and full sized array is always returned, with only offset blocks having values. But this means my result table is still very large, it just has a lot of unwanted zeroes
How can I rework below code, so that file.Read() returns only an array of 2 bytes instead of 10 when length = 2 and offset = 2? In real world scenario I am dealing with large files (>2gigs) so filtering out the result array is not an option.
Edit: As the issue is unclear - below code requires me to always define output array that is the same size as the stream. Instead I would like to have an output that is of size of length (in below example I would like to have var buffer = new byte[2], but that will throw an exception because file.Read ignores offset and length and always returns 10 elements (with only 2 of them being read, rest is dummy zeroes).
private byte[] GetFilePart(int length, int offset)
{
//build some dummy content
var content = new byte[10];
for (int i = 0; i<10; i++)
{
content[i] = 1;
}
//read the data from content
var buffer = new byte[10];
using (Stream file = new MemoryStream(content))
{
file.Read(buffer, offset, length);
}
return buffer;
}

Looks like it's working properly to me; maybe your confusion would clear a bit if you inited your content array with something like:
for (int i = 1; i<=10; i++)
{
content[i-1] = i;
}
then each byte would have a different number and the image would look like:
offset relates to where into buffer the Stream will write the bytes to (it reads from the start of content). It does not relate to what bytes are read out of content.
Imagine Read as being called WriteBytesInto(byte[] whatBuffer, int whereToStartWriting, int howManyBytesToWrite) - you provide the buffer it will write into and tell it where to start and how many to do
If you did this, having inited content to be incrementing numbers:
file.Read(buffer, 2, 3); //read 3 bytes from stream and write to buffer # index 2
file.Read(buffer, 0, 2); //read 2 bytes from stream and write to buffer # index 0
Your buffer would end up looking like:
4,5,1,2,3,0,0,0,0,0
The 1,2,3 having been written first, then the 4,5 written next
If you want to skip two bytes from the stream (i.e. read the 3rd and 4th byte from content, Seek() the stream or set its Position (or as canton7 advises in the commments, if the stream is not seekable, read and discard some bytes)
How can I rework below code, so that file.Read() returns only an array of 2 bytes instead of 10 when length = 2 and offset = 2?
Well, file.Read doesn't return an array at all; it modifies an array you give it. If you want a 2 byte array, give it a 2 byte array:
byte buf = new byte[2];
file.Read(buf, 0, buf.Length);
If you want to open a file, skip the first 7 bytes and then read bytes 8th and 9th into your length-of-2 byte array then:
byte buf = new byte[2];
file.Position = 7; //absolute skip to 8th byte
file.Read(buf, 0, buf.Length);
For more on seeking in streams see Stream.Seek(0, SeekOrigin.Begin) or Position = 0

Related

fail to read it all into memory when shortcount reading

My textbook says:
Read receives a block of data from the stream into an array. It returns the number of
bytes received, which is always either less than or equal to the count argument. If
it’s less than count, it means either that the end of the stream has been reached or
the stream is giving you the data in smaller chunks (as is often the case with network streams). In either case, the balance of bytes in the array will remain unwritten, their previous values preserved.
With Read, you can be certain you’ve reached the end of the
stream only when the method returns 0. So, if you have a
1,000-byte stream, the following code may fail to read it all
into memory:
// Assuming s is a stream:
byte[] data = new byte [1000];
s.Read (data, 0, data.Length);
The Read method could read anywhere from 1 to 1,000 bytes,
leaving the balance of the stream unread.
I'm confused and I write a simple concolse application to verify:
//read.txt only contains 3 chars: abc
using (FileStream s = new FileStream("read.txt", FileMode.Open))
{
byte[] data = new byte[5];
int num = s.Read(data, 0, data.Length);
foreach (byte b in data)
{
Console.WriteLine(b);
}
}
and the output is:
97
98
99
0
0
so the data array has been written, why the textbook says "the balance of bytes in the array will remain unwritten and it may fail to read it all into memory",
EDIT: please discard my previous console application, the real question I have is:
a file has 1000 byte and I'm requesting to read all 1000 bytes:
byte[] data = new byte [1000];
s.Read (data, 0, data.Length);
why it may fail to read it all into memory? and the textbook also provides a correct method to read all 1000 bytes:
byte[] data = new byte [1000];
int bytesRead = 0;
int chunkSize = 1;
while (bytesRead < data.Length && chunkSize > 0)
bytesRead +=
chunkSize = s.Read (data, bytesRead, data.Length - bytesRead);

With regard to:
byte[] data = new byte[5];
int num = s.Read(data, 0, data.Length);
foreach (byte b in data)
That foreach will process every single byte in data (and there's always five of those regardless of how many were written to by the Read() call).
The number of bytes that were actually read by the Read() call gets stored into the num variable, so that's what you should be using to process your data, something like (untested but probably simple enough to not warrant it):
for (int i = 0; i < num; i++) {
Console.WriteLine(data[i]);
}
As per the text you quote, everything after the bytes that were read can be any arbitrary value because the Read() call did not change them. In your case, they were obviously zero before the Read() and so remained so.
If you were to run that foreach loop before the Read() as well as after, you should see only the first three bytes change (assuming the content of the file is different to what was in memory, of course).

If it’s less than count, it means either that the end of the stream has been reached or the stream is giving you the data in smaller chunks (as is often the case with network streams). In either case, the balance of bytes in the array will remain unwritten, their previous values preserved.
When the text says In either case, it refers to If it’s (the return value, i.e the bytes read from the stream) less than count. In this case, the bytes read don't take up all the buffer area you have designated to the read action. The rest of the buffer area is, of course, untouched (remain unwritten, their previous values preserved).
In your case, it is that the end of the stream has been reached because you only have 3 bytes in your file stream. The trailing zeros in your array are the untouched buffer elements.
So, if you have a 1,000-byte stream, the following code may fail to read it all into memory.
The text is talking about a 1,000-byte stream, while your stream just contains 3 bytes. For a short file stream that has only 3 bytes, it is likely that all data will be loaded into memory at once, which is the case you have observed. (just "likely", not "guaranteed"). For a longer stream (or other types of stream, e.g. NetworkStream), you are not guarunteed to read bytes enough to fill your designated buffer area at once (there are possibilities that you may even get only 1 bytes at a time!).

Read Multiple Byte Arrays from File

How can I read multiple byte arrays from a file? These byte arrays are images and have the potential to be large.
This is how I'm adding them to the file:
using (var stream = new FileStream(tempFile, FileMode.Append))
{
//convertedImage = one byte array
stream.Write(convertedImage, 0, convertedImage.Length);
}
So, now they're in tempFile and I don't know how to retrieve them as individual arrays. Ideally, I'd like to get them as an IEnumerable<byte[]>. Is there a way to split these, maybe?

To retrieve multiple sets of byte arrays, you will need to know the length when reading. The easiest way to do this (if you can change the writing code) is to add a length value:
using (var stream = new FileStream(tempFile, FileMode.Append))
{
//convertedImage = one byte array
// All ints are 4-bytes
stream.Write(BitConverter.GetBytes(convertedImage.Length), 0, 4);
// now, we can write the buffer
stream.Write(convertedImage, 0, convertedImage.Length);
}
Reading the data is then
using (var stream = new FileStream(tempFile, FileMode.Open))
{
// loop until we can't read any more
while (true)
{
byte[] convertedImage;
// All ints are 4-bytes
int size;
byte[] sizeBytes = new byte[4];
// Read size
int numRead = stream.Read(sizeBytes, 0, 4);
if (numRead <= 0) {
break;
}
// Convert to int
size = BitConverter.ToInt32(sizeBytes, 0);
// Allocate the buffer
convertedImage = new byte[size];
stream.Read(convertedImage, 0, size);
// Do what you will with the array
listOfArrays.Add(convertedImage);
} // end while
}
If all saved images are the same size, then you can eliminate the first read and write call from each, and hard-code size to the size of the arrays.

Unless you can work out the number of bytes taken by each individual array from the content of these bytes themselves, you need to store the number of images and their individual lengths into the file.
There are many ways to do it: you could write lengths of the individual arrays preceding each byte array, or you could write a "header" describing the rest of the content before writing the "payload" data to the file.
Header may look as follows:
Byte offset Description
----------- -------------------
0000...0003 - Number of files, N
0004...000C - Length of file 1
000D...0014 - Length of file 2
...
XXXX...XXXX - Length of file N
XXXX...XXXX - Content of file 1
XXXX...XXXX - Content of file 2
...
XXXX...XXXX - Content of file N
You can use BitConverter methods to produce byte arrays to be written to the header, or you could use BinaryWriter.

When you read back how do you get the number of bytes per image/byte array to read?
You will need to store the length too (i.e. first 4 bytes = encoded 32 bit int byte count, followed by the data bytes.)
To read back, read the first four bytes, un-encode it back to an int, and then read that number of bytes, repeat until eof.

C# Beginner File Reading

Okay I searched for an answer to this but couldn't find it.
here's the code:
FileStream fs = new FileStream(fileName, FileMode.Open, FileAccess.Read);
byte[] fileText = new byte[fs.Length];
int bytesRead = fs.Read(fileText, 0, fileText.Length);
Console.WriteLine(Encoding.ASCII.GetString(fileText, 0, bytesRead));
Let me get this straight,
We declare a filestream
We Declare a byte array.. and set its CAPACITY to fs.Length
???? Why does fs.Read() return an INTEGER ???
??? How does this line display the text from the .txt file to the console? we passed in the byte[] in the getstring() method, but isnt that byte[] empty? we only set its capacity to fs.length? where did the reading happen and how?
TIA

If you are trying to read a text file and display all it's lines in console
foreach(string line in File.ReadAllLines("YourFilePath"))
{
Console.WriteLine(line);
}
In your method
FileStream fs = new FileStream("YourFilePath", FileMode.Open, FileAccess.Read);
Opens the file for reading into stream fs.
byte[] fileText = new byte[fs.Length];
Gets the number of bytes in the file content, and creates a byte array of that size
int bytesRead = fs.Read(fileText, 0, fileText.Length);
Reads the byte content, from 0 to end of content (we have length from last statement), i.e. the complete contents into the array you created. So, now your byte array fileText has all the byte contents from the file.
It returns the number of bytes read in this operation, if you need that for some reason. This can be <= the number of bytes you wanted to read (less if less bytes were available in the file content). In your case, it will be same as fileText.Length since you already calculated that.
System.Console.WriteLine(Encoding.ASCII.GetString(fileText, 0, bytesRead));
Converts the byte array into ASCII encoded text and writes to console.

Read method returns the number of bytes that were read into the buffer paramters. You passed in an array that will be filled with the data when the Read method is actually called. You have passed in the number of bytes that you want to read as well.
Read the msdn documentation for more details here

FileStream.Read() returns the number of bytes actually read. It could be that you ask for 4096 bytes, but get 0, or 1, or 1000. This is what the docs say:
Return Value
Type: System.Int32
The total number of bytes read into the buffer. This might be less than the number of bytes requested if that number of bytes are not currently available, or zero if the end of the stream is reached.
If you are reading text, you can use one of the helpful File methods: File.ReadAllText, File.ReadAllLines, or File.OpenText which gives you a StreamReader object where you can read line-by-line.
If you need to read the bytes (this is a much lower-level usage, and really you should be able to use a StreamReader), then you don't want to create a buffer the length of the stream, since this could crash your program with an OutOfMemoryException. Instead, make the buffer something like 4096 bytes, then call FileStream.Read in a loop, until it returns 0. Note, however, that you are not reading text lines here, and a line break may come in the middle of the buffer. Here's an example:
using (var fileStream = File.OpenRead("c:\\file.txt"))
{
var buffer = new Byte[4096];
var offset = 0;
var read = 0;
while ((read = fileStream.Read(buffer, offset, buffer.Length)) > 0)
{
var s = Encoding.ASCII.GetString(buffer, 0, read);
Console.Write(s);
offset += read;
}
}

3) has been answered here already.
As for 4): the Read method actually also fills the buffer with bytes and returns the number of bytes it filled into the buffer.
Passing the buffer and the number of bytes read to Encoding. GetString() interprets the bytes from the file as character codes for the given encoding, in your case ASCII, and creates an string from the byte array based on the encoding.

Read Specific Bytes Out of Byte Array C#

I am a newbie with a question regarding reading specific bytes out of an array in C#. I received an array response that is 128 bytes long and I am looking to read and store the first 4 bytes of the array. How do I do so?
I've read a number of posts about shifting bytes but was a bit confused and I am looking to get going in the right direction.

Use one of the Array.Copy overloads:
byte[] bytes = new byte[4];
Array.Copy(originalArray, bytes, bytes.Length);

Use Array.Copy method:
// your original array
var sourceArray = new byte[128];
// create a new one with the size you need
var firstBytes = new byte[4];
// copy the required number of bytes.
Array.Copy(sourceArray, 0, firstBytes, 0, firstBytes.Length);
If you do not need the rest of the data in the original array, you can also resize it, truncating everything after the first 4 bytes:
Array.Resize(ref sourceArray, 4);

Read the first 4 bytes in 4 different variables:
byte first = array[0],
second = array[1],
third = array[2],
forth = array[4];
Read them as one 32-bit long integer:
int result =
first << 24 ||
second << 16 ||
third << 8 ||
forth;
The x << n operator shifts the bits of number x, n times to the left. You are effectively moving the bytes of first 24 positions to the left, thus first will be stored in bits with indices 24 to 31 like that:
first 00000000 00000000 00000000
Next, move second 16 positions to the left, thus storing it in bits 16 to 23 and so on:
first second 00000000 00000000
Wikipedia provides a good summary of bitwise operations if you need more details.

Example from a network stream.
byte[] bytesRead = 0;
using (NetworkStream stream = client.GetStream())
{
byte[] length = new byte[4];
bytesRead = stream.Read(length, 0, 4);
}
bytesRead contains the first 4 bytes stored in the networkstream.

You could use the Take method. Although, this is only available if you are using .NET 3.5 or later. You will need to have using System.Linq; at the top of the .cs file.
var bytes = originalArray.Take(4).ToArray();

Split up a memorystream into bytarray

Im trying to split up a memorystream into chunks by reading parts into a byte array but i think i have got something fundamentally wrong. I can read the first chunk but when i try to read rest of memorystream i get index out of bound even if there are more bytes to read. It seems that the issue is the size of the receving byte buffer that need to be as large as the memorystrem. I need to convert it into chunks as the code is in a webservice.
Anyone knows whats wrong with this code
fb.buffer is MemoryStream
long bytesLeft = fb.Buffer.Length;
fb.Buffer.Position = 0;
int offset =0;
int BUFF_SIZE = 8196;
while (bytesLeft > 0)
{
byte[] fs = new byte[BUFF_SIZE];
fb.Buffer.Read(fs, offset, BUFF_SIZE);
offset += BUFF_SIZE;
bytesLeft -= BUFF_SIZE;
}

offset here is the offset into the array. It should be zero here. You should also be looking at the return value from Read. It is not guaranteed to fill the buffer, even if more data is available.
However, if this is a MemoryStream - a better option might be ArraySegment<byte>, which requires no duplication of data.

Please look at this code for Stream.Read from MSDN from a glance - you shouldn't be incrementing offset - it should always be zero. Unless, of course, you happen to know for a fact the exact length of the stream in advance (therefore you would create the array the exact size).
You should also always grab the amount of bytes read from Read (the return value).
If you're looking to split it into 'chunks` do you mean you want n 8k chunks? Then you might want to do something like this:
List<byte[]> chunks = new List<byte[]>();
byte chunk = new byte[BUFF_SIZE];
int bytesRead = fb.Buffer.Read(chunk, 0, BUFF_SIZE);
while(bytesRead > 0)
{
if(bytesRead != BUFF_SIZE)
{
byte[] tail = new byte[bytesRead];
Array.Copy(chunk, tail, bytesRead);
chunk = tail;
}
chunks.Add(chunk);
bytesRead = fb.Buffer.Read(chunk, 0, BUFF_SIZE);
}
Note in particular that the last chunk is more than likely not going to be exactly BUFF_SIZE in length.

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

C# - Stream.Read offset is working incorrectly - c#

Related

fail to read it all into memory when shortcount reading

Read Multiple Byte Arrays from File

C# Beginner File Reading

Read Specific Bytes Out of Byte Array C#

Split up a memorystream into bytarray

Categories

Resources