I have a large file in a mixed text/binary format.
File format (each 0 represents a byte):
00000FileName0000000Hello
World
world1
...
0000000000000000000000
Currently I'm using FileStream, and I want to read the Hello.
I know where Hello starts, and it ends with 0x0D 0x0A.
I also need to go back if the word is not equal to Hello.
How can I read until a carriage return?
Is there any PEEK-like function in FileStream so I can move the read pointer back?
Is FileStream even a good choice in this case?
You can use the method FileStream.Seek to change the read/write position.
You can use BinaryReader for reading binary content; however, it uses an internal buffer, so you cannot rely on the underlying Stream.Position anymore, because it may read more bytes in the background than you want. But you can re-implement the methods you need:
private byte[] ReadBytes(Stream s, int count)
{
byte[] buffer = new byte[count];
if (count == 0)
{
return buffer;
}
// reading one byte
if (count == 1)
{
int value = s.ReadByte();
if (value == -1)
throw new IOException("Out of stream");
buffer[0] = (byte)value;
return buffer;
}
// reading multiple bytes
int offset = 0;
do
{
int readBytes = s.Read(buffer, offset, count - offset);
if (readBytes == 0)
throw new IOException("Out of stream");
offset += readBytes;
}
while (offset < count);
return buffer;
}
public int ReadInt32(Stream s)
{
byte[] buffer = ReadBytes(s, 4);
return BitConverter.ToInt32(buffer, 0);
}
// similarly, write ReadInt16/64, etc, whatever you need
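For instance, a ReadInt16 built on the same helper would follow the same pattern (a sketch mirroring ReadInt32 above):
public short ReadInt16(Stream s)
{
    byte[] buffer = ReadBytes(s, 2);
    return BitConverter.ToInt16(buffer, 0);
}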
Assuming you are at the start position, you can write a ReadString, too:
private string ReadString(Stream s, char delimiter)
{
// note: this assumes a single-byte encoding; when the delimiter is '\n',
// a trailing '\r' may remain in the result
var result = new List<char>();
int c;
while ((c = s.ReadByte()) != -1 && (char)c != delimiter)
{
result.Add((char)c);
}
return new string(result.ToArray());
}
Usage:
FileStream fs = GetMyFile(); // todo
if (!fs.CanSeek)
throw new NotSupportedException("sorry");
long posCurrent = fs.Position; // save current position
int posHello = ReadInt32(fs); // read position of "hello"
fs.Seek(posHello, SeekOrigin.Begin); // seeking to hello
string hello = ReadString(fs, '\n'); // reading hello
fs.Seek(posCurrent, SeekOrigin.Begin); // seeking back
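If all you need is a peek, a small helper can emulate it on a seekable stream by saving and restoring Position (a minimal sketch, assuming the stream is seekable):
private static int PeekByte(Stream s)
{
    long position = s.Position; // remember the current read position
    int value = s.ReadByte();   // -1 at end of stream
    s.Position = position;      // move the read pointer back
    return value;
}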
What would be the most optimal/fastest way to split a Stream into chunks delimited by a byte pattern (e.g. new byte[] { 0, 0 })?
My current naive and slow implementation reads the stream byte by byte and decrements a counter each time it encounters the delimiter. When the counter reaches zero, it yields a memory chunk.
const int NUMBER_CONSECUTIVE_DELIMITER = 2;
const int DELIMITER = 0;
public IEnumerable<ReadOnlyMemory<byte>> Chunk(Stream stream)
{
var chunk = new MemoryStream();
try
{
int b; //the byte being read
int c = NUMBER_CONSECUTIVE_DELIMITER;
while ((b = stream.ReadByte()) != -1) //Read the stream byte by byte, -1 = end of the stream
{
chunk.WriteByte((byte)b); //Write this byte to the next chunk
if (b == DELIMITER)
c--; //if we hit the delimiter (i.e. '0'), decrement the counter
else
c = NUMBER_CONSECUTIVE_DELIMITER; //else, reset the counter
if (c <= 0 || stream.Position == stream.Length) //we hit two consecutive '0's, or the end of the stream
{
var r = chunk.ToArray().AsMemory(); //copy it into a Memory<T>
chunk.Dispose();
chunk = new();
c = NUMBER_CONSECUTIVE_DELIMITER; //reset the counter for the next chunk
yield return r;
}
}
}
finally
{
chunk.Dispose();
}
}
Such chunking is difficult to implement efficiently because a stream has to be read in fixed-size buffers, and a buffer can be too big or too small for the content being interpreted. The ReadOnlySequence<T> struct was added to solve this problem. More information about this topic can be found here.
By using System.IO.Pipelines (the NuGet package must be installed), this problem can be solved as follows:
public static async Task FillPipeAsync(Stream stream, PipeWriter writer, CancellationToken cancellationToken = default)
{
// The minimum buffer size that is used for the current buffer segment.
const int bufferSize = 65536;
while (true)
{
// Request 65536 bytes from the PipeWriter.
Memory<byte> memory = writer.GetMemory(bufferSize);
// Read the content from the stream.
int bytesRead = await stream.ReadAsync(memory, cancellationToken).ConfigureAwait(false);
if (bytesRead == 0) break;
// Tell the writer how many bytes are read.
writer.Advance(bytesRead);
// Flush the data to the PipeWriter.
FlushResult result = await writer.FlushAsync(cancellationToken).ConfigureAwait(false);
if (result.IsCompleted) break;
}
// This enables our reading process to be notified that no more new data is coming.
await writer.CompleteAsync().ConfigureAwait(false);
}
This will read your stream asynchronously and write a buffer segment to the pipe. Next you have to implement a read logic to slice/merge the concatenated buffer segments into chunks:
public static async IAsyncEnumerable<ReadOnlySequence<byte>> ReadPipeAsync(PipeReader reader, ReadOnlyMemory<byte> delimiter,
[EnumeratorCancellation] CancellationToken cancellationToken = default)
{
while (true)
{
// Read from the PipeReader.
ReadResult result = await reader.ReadAsync(cancellationToken).ConfigureAwait(false);
ReadOnlySequence<byte> buffer = result.Buffer;
while (TryReadChunk(ref buffer, delimiter.Span, out ReadOnlySequence<byte> chunk))
yield return chunk;
// Tell the PipeReader how many bytes are read.
// This is essential because the Pipe will release buffer segments that are no longer in use.
reader.AdvanceTo(buffer.Start, buffer.End);
// Take care of the complete notification and return the last buffer.
if (result.IsCompleted)
{
yield return buffer;
break;
}
}
await reader.CompleteAsync().ConfigureAwait(false);
}
private static bool TryReadChunk(ref ReadOnlySequence<byte> buffer, ReadOnlySpan<byte> delimiter,
out ReadOnlySequence<byte> chunk)
{
// Search the buffer for the first byte of the delimiter.
SequencePosition? position = buffer.PositionOf(delimiter[0]);
// If no occurrence was found, or the following bytes in the buffer do not match the delimiter, return false.
if (position is null || !buffer.Slice(position.Value, delimiter.Length).FirstSpan.StartsWith(delimiter))
{
chunk = default;
return false;
}
// Return the calculated chunk and update the buffer to cut the start.
chunk = buffer.Slice(0, position.Value);
buffer = buffer.Slice(buffer.GetPosition(delimiter.Length, position.Value));
return true;
}
For this to work in that form, you have to use IAsyncEnumerable so that the chunks can be streamed into an await foreach loop. Merging and slicing are largely handled by the pipe, so a reliable algorithm can be built here with relatively little code, and it performs well.
Usage:
// Create a Pipe that manages the buffer.
Pipe pipe = new Pipe();
ConfiguredTaskAwaitable writing = FillPipeAsync(stream, pipe.Writer).ConfigureAwait(false);
// The delimiter that should be used. This can be any data with length > 0.
ReadOnlyMemory<byte> delimiter = new ReadOnlyMemory<byte>(new byte[] { 0, 0 });
// 'await foreach' and 'await writing' are executed asynchronously (in parallel).
await foreach (ReadOnlySequence<byte> chunk in ReadPipeAsync(pipe.Reader, delimiter))
{
// Use "chunk" to retrieve your chunked content.
}
await writing;
Note that reading and chunking is done asynchronously and independently.
I eventually ended up with the below code, strongly inspired by Philipp's answer above and https://keestalkstech.com/2010/11/seek-position-of-a-string-in-a-file-or-filestream/.
public override IEnumerable<byte[]> Chunk(Stream stream)
{
var buffer = new byte[bufferSize];
var size = bufferSize;
var offset = 0;
var position = stream.Position;
var nextChunk = Array.Empty<byte>();
while (true)
{
var bytesRead = stream.Read(buffer, offset, size);
// when no bytes are read -- the string could not be found
if (bytesRead <= 0)
break;
// when fewer than size bytes are read, we need to slice the buffer to prevent reading of "previous" bytes
ReadOnlySpan<byte> ro = buffer;
if (bytesRead < size)
ro = ro.Slice(0, offset + bytesRead);
// check if we can find our search bytes in the buffer
var i = ro.IndexOf(Delimiter);
if (i > -1 && // we found something
i <= bytesRead && //we found it in the area that was actually read (at the end of the buffer, the last values are not overwritten); i == bytesRead if the delimiter is at the end of the buffer
nextChunk.Length + (i + Delimiter.Length - offset) >= MinChunkSize) //the size of the chunk that will be made is large enough
{
var chunk = buffer[offset..(i + Delimiter.Length)];
yield return Concat(nextChunk, chunk);
nextChunk = Array.Empty<byte>();
offset = 0;
size = bufferSize;
position += i + Delimiter.Length;
stream.Position = position;
continue;
}
else if (stream.Position == stream.Length)
{
// we're at the end of the stream
var chunk = buffer[offset..(bytesRead + offset)]; //return the bytes read
yield return Concat(nextChunk, chunk);
break;
}
// the stream is not finished. Copy the last Delimiter.Length bytes to the beginning of the buffer and set the offset to fill the buffer after them
nextChunk = Concat(nextChunk, buffer[offset..buffer.Length]);
offset = Delimiter.Length;
size = bufferSize - offset;
Array.Copy(buffer, buffer.Length - offset, buffer, 0, offset);
position += bufferSize - offset;
}
}
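The code relies on a Concat helper that isn't shown above; a minimal sketch, assuming it simply joins two byte arrays:
private static byte[] Concat(byte[] first, byte[] second)
{
    // joins two byte arrays into a new one
    var result = new byte[first.Length + second.Length];
    Buffer.BlockCopy(first, 0, result, 0, first.Length);
    Buffer.BlockCopy(second, 0, result, first.Length, second.Length);
    return result;
}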
I'm using this port of the Mozilla character set detector to determine a file's encoding and then using that to construct a StreamReader. So far, so good.
However, the file format I am reading is an odd one and from time to time it is necessary to skip a number of bytes. That is, a file that is otherwise text, in one or other encoding, will have some raw bytes embedded in it.
I would like to read the stream as text, up to the point that I hit some text that indicates a byte stream follows, then I would like to read the byte stream, then resume reading as text. What is the best way of doing this (balance of simplicity and performance)?
I can't rely on seeking against the FileStream underlying the StreamReader (and then discarding the buffered data in the latter) because I don't know how many bytes were used in reading the characters up to that point. I might abandon StreamReader and switch to a bespoke class that uses parallel arrays of bytes and chars, populates the latter from the former using a decoder, and tracks the position in the byte array every time a character is read by using the encoding to calculate the number of bytes used for the character. Yuk.
To further clarify, the file has this format:
[encoded chars][embedded bytes indicator + len][len bytes][encoded chars]...
where there may be zero, one, or many blocks of embedded bytes, and the blocks of encoded chars may be any length.
So, for example:
ABC:123:DEF:456:$0099[0x00,0x01,0x02,... x 99]GHI:789:JKL:...
There are no line delimiters. I may have any number of fields (ABC, 123, ...) delimited by some character (in this case a colon). These fields may be in various codepages, including UTF-8 (not guaranteed to be single byte). When I hit a $ I know that the next 4 bytes contain a length (call it n), the next n bytes are to be read raw, and byte n + 1 will be another text field (GHI).
Proof of concept. This class works with UTF-16 string data, and ':' delimiters per OP. It expects binary length as a 4-byte, little-endian binary integer. It should be easy to adjust to more specific details of your (odd) file format. For example, any Decoder class should drop in to ReadString() and "just work".
To use it, construct it with a Stream class. For each individual data element, call ReportNextData(), which will tell you what kind of data is next, and then call the appropriate Read*() method. For binary data, call ReadBinaryLength() and then ReadBinaryData().
Note that ReadBinaryData() follows the stream contract; it is not guaranteed to return as many bytes as you asked for, so you may need to call it several times. However, if you ask for too many bytes, it will throw EndOfStreamException.
I tested it with this data (hex format):
410042004300240A0000000102030405060708090024050000000504030201580059005A003A310032003300
Which is:
ABC$[10][1234567890]$[5][54321]XYZ:123
Scan the data like so:
OddFileReader.NextData nextData;
while ((nextData = reader.ReportNextData()) != OddFileReader.NextData.Eof)
{
// Call appropriate Read*() here.
}
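For the binary blocks, use a read-fully loop, since ReadBinaryData may return fewer bytes than requested; a sketch:
int length = reader.ReadBinaryLength();
byte[] data = new byte[length];
int total = 0;
while (total < length)
{
    // ReadBinaryData follows the stream contract, so keep asking
    // for the remainder until the whole block has been read
    total += reader.ReadBinaryData(data, total, length - total);
}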
public class OddFileReader : IDisposable
{
public enum NextData
{
Unknown,
Eof,
String,
BinaryLength,
BinaryData
}
private Stream source;
private byte[] byteBuffer;
private int bufferOffset;
private int bufferEnd;
private NextData nextData;
private int binaryOffset;
private int binaryEnd;
private char[] characterBuffer;
public OddFileReader(Stream source)
{
this.source = source;
}
public NextData ReportNextData()
{
if (nextData != NextData.Unknown)
{
return nextData;
}
if (!PopulateBufferIfNeeded(1))
{
return (nextData = NextData.Eof);
}
if (byteBuffer[bufferOffset] == '$')
{
return (nextData = NextData.BinaryLength);
}
else
{
return (nextData = NextData.String);
}
}
public string ReadString()
{
ReportNextData();
if (nextData == NextData.Eof)
{
throw new EndOfStreamException();
}
else if (nextData != NextData.String)
{
throw new InvalidOperationException("Attempt to read non-string data as string");
}
if (characterBuffer == null)
{
characterBuffer = new char[1];
}
StringBuilder stringBuilder = new StringBuilder();
Decoder decoder = Encoding.Unicode.GetDecoder();
while (nextData == NextData.String)
{
byte b = byteBuffer[bufferOffset];
if (b == '$')
{
nextData = NextData.BinaryLength;
break;
}
else if (b == ':')
{
nextData = NextData.Unknown;
bufferOffset++;
break;
}
else
{
if (decoder.GetChars(byteBuffer, bufferOffset++, 1, characterBuffer, 0) == 1)
{
stringBuilder.Append(characterBuffer[0]);
}
if (bufferOffset == bufferEnd && !PopulateBufferIfNeeded(1))
{
nextData = NextData.Eof;
break;
}
}
}
return stringBuilder.ToString();
}
public int ReadBinaryLength()
{
ReportNextData();
if (nextData == NextData.Eof)
{
throw new EndOfStreamException();
}
else if (nextData != NextData.BinaryLength)
{
throw new InvalidOperationException("Attempt to read non-binary-length data as binary length");
}
bufferOffset++;
if (!PopulateBufferIfNeeded(sizeof(Int32)))
{
nextData = NextData.Eof;
throw new EndOfStreamException();
}
binaryEnd = BitConverter.ToInt32(byteBuffer, bufferOffset);
binaryOffset = 0;
bufferOffset += sizeof(Int32);
nextData = NextData.BinaryData;
return binaryEnd;
}
public int ReadBinaryData(byte[] buffer, int offset, int count)
{
ReportNextData();
if (nextData == NextData.Eof)
{
throw new EndOfStreamException();
}
else if (nextData != NextData.BinaryData)
{
throw new InvalidOperationException("Attempt to read non-binary data as binary data");
}
if (count > binaryEnd - binaryOffset)
{
throw new EndOfStreamException();
}
int bytesRead;
if (bufferOffset < bufferEnd)
{
bytesRead = Math.Min(count, bufferEnd - bufferOffset);
Array.Copy(byteBuffer, bufferOffset, buffer, offset, bytesRead);
bufferOffset += bytesRead;
}
else if (count < byteBuffer.Length)
{
if (!PopulateBufferIfNeeded(1))
{
throw new EndOfStreamException();
}
bytesRead = Math.Min(count, bufferEnd - bufferOffset);
Array.Copy(byteBuffer, bufferOffset, buffer, offset, bytesRead);
bufferOffset += bytesRead;
}
else
{
bytesRead = source.Read(buffer, offset, count);
}
binaryOffset += bytesRead;
if (binaryOffset == binaryEnd)
{
nextData = NextData.Unknown;
}
return bytesRead;
}
private bool PopulateBufferIfNeeded(int minimumBytes)
{
if (byteBuffer == null)
{
byteBuffer = new byte[8192];
}
if (bufferEnd - bufferOffset < minimumBytes)
{
int shiftCount = bufferEnd - bufferOffset;
if (shiftCount > 0)
{
Array.Copy(byteBuffer, bufferOffset, byteBuffer, 0, shiftCount);
}
bufferOffset = 0;
bufferEnd = shiftCount;
while (bufferEnd - bufferOffset < minimumBytes)
{
int bytesRead = source.Read(byteBuffer, bufferEnd, byteBuffer.Length - bufferEnd);
if (bytesRead == 0)
{
return false;
}
bufferEnd += bytesRead;
}
}
return true;
}
public void Dispose()
{
Stream source = this.source;
this.source = null;
if (source != null)
{
source.Dispose();
}
}
}
I read a binary file to hex, block by block.
The result is different when I use FileStream.Read versus File.ReadAllBytes.
FileStream.Read
int limit = 0;
if (openFileDlg.FileName.Length > 0)
{
fileName = openFileDlg.FileName;
FileStream fs = new FileStream(fileName, FileMode.Open, FileAccess.Read);
fsLen = (int)fs.Length;
int count = 0;
limit = 100;
byte[] read_buff = new byte[limit];
StringBuilder sb = new StringBuilder();
while ( (count = fs.Read(read_buff, 0, limit)) > 0)
{
foreach (byte b in read_buff)
{
sb.Append(Convert.ToString(b, 16).PadLeft(2, '0'));
}
}
rtxb_bin.AppendText(sb.ToString() + "\n");
}
File.ReadAllBytes
if (openFileDlg.FileName.Length > 0)
{
fileName = openFileDlg.FileName;
byte[] fileBytes = File.ReadAllBytes(fileName);
StringBuilder sb2 = new StringBuilder();
foreach (byte b2 in fileBytes)
{
sb2.Append(Convert.ToString(b2, 16).PadLeft(2, '0'));
}
rtxb_allbin.AppendText(sb2.ToString());
}
In case 1, the result is ...
........04c0020f00452a00421346108129844f2138448500208020250405250043188510812e0
and in case 2 it is
.......04c0020f00452a00421346108129844f2138448500208020250405250043188510812e044f212cc48120c24125404f2069c2c0008bff35f8f401efbd17047
FileStream.Read doesn't read anything after '12e0';
'44f212cc48120c24125404f2069c2c0008bff35f8f401efbd17047' is missing.
How can I read all bytes using FileStream.Read?
Why doesn't FileStream.Read read the last block?
Most likely it only appears to you that it does not read the last block. Suppose you have a file of length 102. The first iteration of your loop reads the first 100 bytes; all is fine. But what happens on the second (last) one? You read two bytes into read_buff, which is of length 100. Now that buffer contains 2 bytes of the last block and 98 bytes of the previous (first) block, because Read doesn't clear the buffer. Then you proceed with:
foreach (byte b in read_buff)
{
sb.Append(Convert.ToString(b, 16).PadLeft(2, '0'));
}
As a result, sb has 100 bytes of the first block, then 2 bytes of the last block, and then again 98 bytes of the first block. If you don't look too closely, it might appear that it just skipped the last block, while in reality it duplicated part of the previous one.
To fix it, use count (which indicates how many bytes were actually read into the buffer) to work only with the valid part of read_buff:
for (int i = 0; i < count; i++) {
sb.Append(Convert.ToString(read_buff[i], 16).PadLeft(2, '0'));
}
You need to update offset and count.
Syntax
public override int Read(
byte[] array,
int offset,
int count
)
Example
public static byte[] ReadFile(string filePath)
{
byte[] buffer;
FileStream fileStream = new FileStream(filePath, FileMode.Open, FileAccess.Read);
try
{
int length = (int)fileStream.Length; // get file length
buffer = new byte[length]; // create buffer
int count; // actual number of bytes read
int sum = 0; // total number of bytes read
// read until Read method returns 0 (end of the stream has been reached)
while ((count = fileStream.Read(buffer, sum, length - sum)) > 0)
sum += count; // sum is a buffer offset for next reading
}
finally
{
fileStream.Close();
}
return buffer;
}
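Usage (the path is an assumption):
byte[] bytes = ReadFile(@"C:\temp\input.bin");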
public static void ReadAndProcessLargeFile(string theFilename, long whereToStartReading = 0)
{
const int megabyte = 1024 * 1024; // chunk size; this constant was not defined in the original snippet
FileStream fileStream = new FileStream(theFilename, FileMode.Open, FileAccess.Read);
using (fileStream)
{
byte[] buffer = new byte[megabyte];
fileStream.Seek(whereToStartReading, SeekOrigin.Begin);
int bytesRead;
while ((bytesRead = fileStream.Read(buffer, 0, megabyte)) > 0)
{
ProcessChunk(buffer, bytesRead);
buffer = new byte[megabyte]; // fresh buffer so stale bytes from the previous read are never reused
}
}
}
private static void ProcessChunk(byte[] buffer, int bytesRead)
{
// Do the processing here
string utfString = Encoding.UTF8.GetString(buffer, 0, bytesRead);
Console.Write(utfString);
}
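Usage might look like this (the path and offset are assumptions):
ReadAndProcessLargeFile(@"C:\logs\large.log");
ReadAndProcessLargeFile(@"C:\logs\large.log", 5L * 1024 * 1024); // resume 5 MB into the file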
I'm working with a socket connection - to make things easier I get the socket's NetworkStream and wrap it up in a StreamReader which makes it easier to work with the largely textual content my socket receives from the server.
However there are times when the server sends binary information, like so:
TEXT
MORETEXT
500 BYTES OF BINARY DATA FOLLOWS THIS LINE
{500 bytes of binary data}
I'm reading the text content with the StreamReader fine, but because the StreamReader has its own buffer it means the StreamReader grabs the binary data before I can switch to the BinaryReader to read the 500 bytes of binary data.
Is there a way around this? I'd like the ability to read the textual data whilst still being able to read binary data.
I should have done my research better; it turns out that the BinaryReader class already contains string and character processing methods (though it lacks a few, like ReadLine, which can easily be added by subclassing it).
It's strange, then, that BinaryReader doesn't subclass TextReader, as it is more than capable of filling that role.
Here's an extension of BinaryReader that you can use to perform ReadLine and the usual BinaryReader stuff.
public class LineReader : BinaryReader
{
private Encoding _encoding;
private Decoder _decoder;
const int bufferSize = 1024;
private char[] _LineBuffer = new char[bufferSize];
public LineReader(Stream stream, int bufferSize, Encoding encoding)
: base(stream, encoding)
{
this._encoding = encoding;
this._decoder = encoding.GetDecoder();
}
public string ReadLine()
{
int pos = 0;
char[] buf = new char[2];
StringBuilder stringBuffer = null;
bool lineEndFound = false;
while(base.Read(buf, 0, 2) > 0)
{
if (buf[1] == '\r')
{
// grab buf[0]
this._LineBuffer[pos++] = buf[0];
// get the '\n'
char ch = base.ReadChar();
Debug.Assert(ch == '\n');
lineEndFound = true;
}
else if (buf[0] == '\r')
{
lineEndFound = true;
}
else
{
this._LineBuffer[pos] = buf[0];
this._LineBuffer[pos+1] = buf[1];
pos += 2;
if (pos >= bufferSize)
{
// create the StringBuilder on first use so earlier flushed content is kept
if (stringBuffer == null)
stringBuffer = new StringBuilder(bufferSize + 80);
stringBuffer.Append(this._LineBuffer, 0, bufferSize);
pos = 0;
}
}
if (lineEndFound)
{
if (stringBuffer == null)
{
if (pos > 0)
return new string(this._LineBuffer, 0, pos);
else
return string.Empty;
}
else
{
if (pos > 0)
stringBuffer.Append(this._LineBuffer, 0, pos);
return stringBuffer.ToString();
}
}
}
if (stringBuffer != null)
{
if (pos > 0)
stringBuffer.Append(this._LineBuffer, 0, pos);
return stringBuffer.ToString();
}
else
{
if (pos > 0)
return new string(this._LineBuffer, 0, pos);
else
return null;
}
}
}
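Usage on the socket scenario above might look like this (a sketch; the TcpClient, encoding, and protocol details are assumptions):
var reader = new LineReader(tcpClient.GetStream(), 1024, Encoding.ASCII);
string line;
while ((line = reader.ReadLine()) != null)
{
    if (line.EndsWith("BYTES OF BINARY DATA FOLLOWS THIS LINE"))
    {
        // switch to binary: ReadBytes is inherited from BinaryReader
        byte[] binary = reader.ReadBytes(500);
        // ...then continue with reader.ReadLine() for the following text
    }
}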
int bufferlength = 12488;
int pointer = 1;
int offset = 0;
int length = 0;
FileStream fstwrite = new FileStream("D:\\Movie.wmv", FileMode.Create);
while (pointer != 0)
{
byte[] buff = new byte[bufferlength];
FileStream fst = new FileStream("E:\\Movie.wmv", FileMode.Open);
pointer = fst.Read(buff, 0, bufferlength);
fst.Close();
fstwrite.Write(buff, offset , pointer);
offset += pointer;
}
I used the above code for splitting a file and placing it on another drive. I'm not able to set the correct offset and length for this routine; can anyone help me fix this?
Splitting in the sense that I split it into "x" KB pieces and pass them somewhere, recreating the same file in some other location.
I found it at last; thanks to everyone who gave their valuable responses.
Currently you're always reading from the start of the file... and even if you weren't you'd just be copying the whole file.
Here's some code which will actually split a single file into multiple files:
public static void SplitFile(string inputFile,
string outputPrefix,
int chunkSize)
{
byte[] buffer = new byte[chunkSize];
using (Stream input = File.OpenRead(inputFile))
{
int index = 0;
while (input.Position < input.Length)
{
using (Stream output = File.Create(outputPrefix + index))
{
int chunkBytesRead = 0;
while (chunkBytesRead < chunkSize)
{
int bytesRead = input.Read(buffer,
chunkBytesRead,
chunkSize - chunkBytesRead);
// End of input
if (bytesRead == 0)
{
break;
}
chunkBytesRead += bytesRead;
}
output.Write(buffer, 0, chunkBytesRead);
}
index++;
}
}
}
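Usage for the scenario in the question might look like this (the paths and chunk size are taken from the question; the output prefix is an assumption):
SplitFile(@"E:\Movie.wmv", @"D:\Movie.wmv.part", 12488);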
You're reading bufferlength bytes, so shouldn't you set the offset like this?
offset += bufferlength;
Don't open your source file inside the loop, or you'll always read the first chunk.
Open it before the loop, then make sure your offset is applied to the read.
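Putting both fixes together, the copy loop might look like this (a sketch; paths and buffer size are from the question):
using (FileStream fst = new FileStream(@"E:\Movie.wmv", FileMode.Open, FileAccess.Read))
using (FileStream fstwrite = new FileStream(@"D:\Movie.wmv", FileMode.Create))
{
    byte[] buff = new byte[12488];
    int read;
    // the source stays open across iterations, so each Read continues
    // where the previous one stopped; write only the bytes actually read
    while ((read = fst.Read(buff, 0, buff.Length)) > 0)
    {
        fstwrite.Write(buff, 0, read);
    }
}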