Read Chunk of data from file based on offset and length - c#

int n = 0;
string encodeString = string.Empty;
using (FileStream fsSource = new FileStream("test.pdf", FileMode.Open, FileAccess.Read))
{
byte[] bytes = new byte[count];
n = fsSource.Read(bytes, offset, count);
encodeString = System.Convert.ToBase64String(bytes);
}
The above code is working fine if I provide offset-0 and length-1024, but the second time if I provide Offset-1024 and length-1024 it is returning an error.
My requirement is I want to get byte array data from offset to length.
1st chunk = 0-1024
2nd chunk = 1024-2048
..
Last chunk = SomeValue -Filesize.
Example in Node.js using readChunk.sync(file_path, Number(offset), Number(size)); - this code is able to get the byte array of data from offset to length.

public static string ReadFileStreamInChunks()
{
const int readChunkBufferLength = 1024;
string filePath = "test.pdf";
string encodeString = string.Empty;
var readChunk = new char[readChunkBufferLength];
int readChunkLength;
using (StringWriter sw = new StringWriter())
using (FileStream fs = new FileStream(filePath, FileMode.Open, FileAccess.Read))
using (StreamReader sr = new StreamReader(fs))
{
do
{
readChunkLength = sr.ReadBlock(readChunk, 0, readChunkBufferLength);
sw.Write(readChunk, 0, readChunkLength);
} while (readChunkLength > 0);
return sw.ToString();
}
}

actually i think your problem is understanding the concepts of these parameters in your code , Count is your Chunk Size and offset is where to start Reading so if you want to read (1): a part of File to end Just Add To Your Offset (Offset + count of Bytes you Want To Seek) but (2): if You Want to Read A Part Of File From Middle You Shouldn't Modify Count That Is Your Chunk Size You Should modify Where You Write Your Byte Array Usually It's a Do-While Loop like :
long position = 0;
do
{
// read bytes from input stream
int bytesRead = request.FileByteStream.Read(buffer, 0, chunkSize);
if (bytesRead == 0)
{
break;
}
// write bytes to output stream
writeStream.Write(buffer, 0, bytesRead);
position += bytesRead;
if(position == "the value you want")
break;
} while (true);

Related

Read from "start" to "stop" as hex values from raw file using FileStream

I want to read from a "start" to a "stop" from a raw image file that I've created with FKT Imager.
I have a code that works, but I dont know if it's the best way of doing it?
// Read file, byte at the time (example 00, 5A)
int start = 512;
int stop = 3345332;
FileStream fs = new FileStream("file.001", FileMode.Open, FileAccess.Read);
int hexIn;
String hex;
String data = "";
fs.Position = start;
for (int i = 0; i < stop; i++) { // i = offset in bytes
hexIn = fs.ReadByte();
hex = string.Format("{0:X2}", hexIn);
data = data + hex;
} //for
fs.Close();
Console.Writeline("data=" + data);
You want to read a range of bytes from within a file. Why not reading all bytes in one go into an array and then do the transformation?
private string ReadFile(string filename, int offset, int length)
{
byte[] data = new byte[length];
using (FileStream fs = new FileStream(filename, FileMode.Open))
{
fs.Position = offset;
fs.Read(data, 0, length);
}
return string.Join("", data.Select(x => x.ToString("X2")));
}

C# Split a file into two byte arrays

Because the maximum value of a byte array is 2GB, lets say i have a larger file and i need to convert it to a byte array. Since i can't hold the whole file, how should i convert it into two?
I tried:
long length = new System.IO.FileInfo(#"c:\a.mp4").Length;
int chunkSize = Convert.ToInt32(length / 2);
byte[] part2;
FileStream fileStream = new FileStream(filepath, FileMode.Open, FileAccess.Read);
try
{
part2 = new byte[chunkSize]; // create buffer
fileStream.Read(part2, 0, chunkSize);
}
finally
{
fileStream.Close();
}
byte[] part3;
fileStream = new FileStream(filepath, FileMode.Open, FileAccess.Read);
try
{
part3 = new byte[chunkSize]; // create buffer
fileStream.Read(part3, 5, (int)(length - (long)chunkSize));
}
finally
{
fileStream.Close();
}
but it's not working.
Any ideas?
You can use a StreamReader to read in file too large to read into a byte array
const int max = 1024*1024;
public void ReadALargeFile(string file, int start = 0)
{
FileStream fileStream = new FileStream(file, FileMode.Open,FileAccess.Read);
using (fileStream)
{
byte[] buffer = new byte[max];
fileStream.Seek(start, SeekOrigin.Begin);
int bytesRead = fileStream.Read(buffer, start, max);
while(bytesRead > 0)
{
DoSomething(buffer, bytesRead);
bytesRead = fileStream.Read(buffer, start, max);
}
}
}
If you are working with extremely large files, you should use MemoryMappedFile, which maps a physical file to a memory space:
using (var mmf = MemoryMappedFile.CreateFromFile(#"c:\path\to\big.file"))
{
using (var accessor = mmf.CreateViewAccessor())
{
byte myValue = accessor.ReadByte(someOffset);
accessor.Write((byte)someValue);
}
}
See also: MemoryMappedViewAccessor
You can also read/write chunks of the file with the different methods in MemoryMappedViewAccessor.
This was my solution:
byte[] part1;
byte[] part2;
bool odd = false;
int chunkSize = Convert.ToInt32(length/2);
if (length % 2 == 0)
{
part1 = new byte[chunkSize];
part2 = new byte[chunkSize];
}
else
{
part1 = new byte[chunkSize];
part2 = new byte[chunkSize + 1];
odd = true;
}
FileStream fileStream = new FileStream(filepath, FileMode.Open, FileAccess.Read);
using (fileStream)
{
fileStream.Seek(0, SeekOrigin.Begin);
int bytesRead = fileStream.Read(part1, 0, chunkSize);
if (odd)
{
bytesRead = fileStream.Read(part2, 0, chunkSize + 1);
}
else
{
bytesRead = fileStream.Read(part2, 0, chunkSize);
}
}

How to read file by chunks

I'm a little bit confused aboot how i should read large file(> 8GB) by chunks in case each chunk has own size.
If I know chunk size it looks like code bellow:
using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read, ProgramOptions.BufferSizeForChunkProcessing))
{
using (BufferedStream bs = new BufferedStream(fs, ProgramOptions.BufferSizeForChunkProcessing))
{
byte[] buffer = new byte[ProgramOptions.BufferSizeForChunkProcessing];
int byteRead;
while ((byteRead = bs.Read(buffer, 0, ProgramOptions.BufferSizeForChunkProcessing)) > 0)
{
byte[] originalBytes;
using (MemoryStream mStream = new MemoryStream())
{
mStream.Write(buffer, 0, byteRead);
originalBytes = mStream.ToArray();
}
}
}
}
But imagine, I've read large file by chunks made some coding with each chunk(chunk's size after that operation has been changed) and written to another new file all processed chunks. And now I need to do the opposite operation. But I don't know exactly chunk size. I have an idea. After each chunk has been processed i have to write new chunk size before chunk bytes. Like this:
Number of block bytes
Block bytes
Number of block bytes
Block bytes
So in that case first what i need to do is read chunk's header and learn what is chunk size exactly. I read and write to file only byte arrays. But I have a question - how should look chunk's header ? May be header have to contain some boundary ?
If the file is rigidly structured so that each block of data is preceded by a 32-bit length value, then it is easy to read. The "header" for each block is just the 32-bit length value.
If you want to read such a file, the easiest way is probably to encapsulate the reading into a method that returns IEnumerable<byte[]> like so:
public static IEnumerable<byte[]> ReadChunks(string path)
{
var lengthBytes = new byte[sizeof(int)];
using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read))
{
int n = fs.Read(lengthBytes, 0, sizeof (int)); // Read block size.
if (n == 0) // End of file.
yield break;
if (n != sizeof(int))
throw new InvalidOperationException("Invalid header");
int blockLength = BitConverter.ToInt32(lengthBytes, 0);
var buffer = new byte[blockLength];
n = fs.Read(buffer, 0, blockLength);
if (n != blockLength)
throw new InvalidOperationException("Missing data");
yield return buffer;
}
}
Then you can use it simply:
foreach (var block in ReadChunks("MyFileName"))
{
// Process block.
}
Note that you don't need to provide your own buffering.
try this
public static IEnumerable<byte[]> ReadChunks(string fileName)
{
const int MAX_BUFFER = 1048576;// 1MB
byte[] filechunk = new byte[MAX_BUFFER];
int numBytes;
using (var fs = new FileStream(fileName, FileMode.Open, FileAccess.Read, FileShare.Read))
{
long remainBytes = fs.Length;
int bufferBytes = MAX_BUFFER;
while (true)
{
if (remainBytes <= MAX_BUFFER)
{
filechunk = new byte[remainBytes];
bufferBytes = (int)remainBytes;
}
if ((numBytes = fs.Read(filechunk, 0, bufferBytes)) > 0)
{
remainBytes -= bufferBytes;
yield return filechunk;
}
else
{
break;
}
}
}
}

System.ArgumentException using FileStream

Instead of reading all at once, I first create a FileStream to open the file, read into a buffer, then call NetworkStream.write() to write its content.
Here's the code.
using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read))
{
try
{
int len = (int)fs.Length;
byte[] data = new byte[len];
byte[] buffer = new byte[bufferSize];
int count, sum = 0;
while ((count = fs.Read(buffer, sum, len - sum)) > 0)
{
netstream.Write(buffer,sum,len-sum);
sum += count;
}
...
It's throwing the error:
An unhandled exception of type 'System.ArgumentException' occurred in
mscorlib.dll
Additional information:
Offset and length were out of bounds for the array or count is greater
than the number of elements from index to the end of the source
collection.
I don't see any array out of bound issue here .
Suggestions please
Offset and Length should be based on Buffer length not the whole file, here is an example of reading chucked data from a FileStream and write it to another stream :
using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read))
{
byte[] buffer = new byte[bufferSize];
while (true)
{
var count = fs.Read(buffer, 0, buffer.Length);
if (count == 0) break;
netstream.Write(buffer, 0, count);
}
}

Reading a stream that may have non-ASCII characters

I have an application that reads string data in from a stream. The string data is typically in English but on occasion it encounters something like 'Jalapeño' and the 'ñ' comes out as '?'. In my implementation I'd prefer to read the stream contents into a byte array but I could get by reading the contents into a string. Any idea what I can do to make this work right?
Current code is as follows:
byte[] data = new byte[len]; // len is known a priori
byte[] temp = new byte[2];
StreamReader sr = new StreamReader(input_stream);
int position = 0;
while (!sr.EndOfStream)
{
int c = sr.Read();
temp = System.BitConverter.GetBytes(c);
data[position] = temp[0];
position++;
}
input_stream.Close();
sr.Close();
You can pass the encoding to the StreamReader as in:
StreamReader sr = new StreamReader(input_stream, Encoding.UTF8);
However, I understand that Encoding.UTF8 is used by default according to the documentation.
Update
The following reads 'Jalapeño' fine:
byte[] bytes;
using (var stream = new FileStream("input.txt", FileMode.Open, FileAccess.Read, FileShare.Read))
{
var index = 0;
var count = (int) stream.Length;
bytes = new byte[count];
while (count > 0)
{
int n = stream.Read(bytes, index, count);
if (n == 0)
throw new EndOfStreamException();
index += n;
count -= n;
}
}
// test
string s = Encoding.UTF8.GetString(bytes);
Console.WriteLine(s);
As does this:
byte[] bytes;
using (var stream = new FileStream("input.txt", FileMode.Open, FileAccess.Read, FileShare.Read))
{
var reader = new StreamReader(stream);
string text = reader.ReadToEnd();
bytes = Encoding.UTF8.GetBytes(text);
}
// test
string s = Encoding.UTF8.GetString(bytes);
Console.WriteLine(s);
From what I understand the 'ñ' character is represented as 0xc391 in the text when the text is stored with UTF encoding. When you only read a byte, you'll loose data.
I'd suggest reading the whole stream as a byte array (the first example) and then do the encoding. Or use StreamReader to do the work for you.
Since you're trying to fill the contents into a byte-array, don't bother with the reader - it isn't helping you. Use just the stream:
byte[] data = new byte[len];
int read, offset = 0;
while(len > 0 &&
(read = input_stream.Read(data, offset, len)) > 0)
{
len -= read;
offset += read;
}
if(len != 0) throw new EndOfStreamException();

Categories