How do I convert a Stream into a byte[] in C#? [duplicate] - c#

This question already has answers here:
Creating a byte array from a stream
(18 answers)
Closed 6 years ago.
Is there a simple way or method to convert a Stream into a byte[] in C#?

The shortest solution I know:
using(var memoryStream = new MemoryStream())
{
sourceStream.CopyTo(memoryStream);
return memoryStream.ToArray();
}

Call the following function like this:
byte[] m_Bytes = StreamHelper.ReadToEnd (mystream);
Function:
public static byte[] ReadToEnd(System.IO.Stream stream)
{
long originalPosition = 0;
if(stream.CanSeek)
{
originalPosition = stream.Position;
stream.Position = 0;
}
try
{
byte[] readBuffer = new byte[4096];
int totalBytesRead = 0;
int bytesRead;
while ((bytesRead = stream.Read(readBuffer, totalBytesRead, readBuffer.Length - totalBytesRead)) > 0)
{
totalBytesRead += bytesRead;
if (totalBytesRead == readBuffer.Length)
{
int nextByte = stream.ReadByte();
if (nextByte != -1)
{
byte[] temp = new byte[readBuffer.Length * 2];
Buffer.BlockCopy(readBuffer, 0, temp, 0, readBuffer.Length);
Buffer.SetByte(temp, totalBytesRead, (byte)nextByte);
readBuffer = temp;
totalBytesRead++;
}
}
}
byte[] buffer = readBuffer;
if (readBuffer.Length != totalBytesRead)
{
buffer = new byte[totalBytesRead];
Buffer.BlockCopy(readBuffer, 0, buffer, 0, totalBytesRead);
}
return buffer;
}
finally
{
if(stream.CanSeek)
{
stream.Position = originalPosition;
}
}
}

I use this extension class:
public static class StreamExtensions
{
public static byte[] ReadAllBytes(this Stream instream)
{
if (instream is MemoryStream)
return ((MemoryStream) instream).ToArray();
using (var memoryStream = new MemoryStream())
{
instream.CopyTo(memoryStream);
return memoryStream.ToArray();
}
}
}
Just copy the class to your solution and you can use it on every stream:
byte[] bytes = myStream.ReadAllBytes();
Works great for all my streams and saves a lot of code!
Of course you can modify this method to use some of the other approaches here to improve performance if needed, but I like to keep it simple.

In .NET Framework 4 and later, the Stream class has a built-in CopyTo method that you can use.
For earlier versions of the framework, the handy helper function to have is:
public static void CopyStream(Stream input, Stream output)
{
byte[] b = new byte[32768];
int r;
while ((r = input.Read(b, 0, b.Length)) > 0)
output.Write(b, 0, r);
}
Then use one of the above methods to copy to a MemoryStream and call GetBuffer on it:
var file = new FileStream("c:\\foo.txt", FileMode.Open);
var mem = new MemoryStream();
// If using .NET 4 or later:
file.CopyTo(mem);
// Otherwise:
CopyStream(file, mem);
// getting the internal buffer (no additional copying)
byte[] buffer = mem.GetBuffer();
long length = mem.Length; // the actual length of the data
// (the array may be longer)
// if you need the array to be exactly as long as the data
byte[] truncated = mem.ToArray(); // makes another copy
Edit: originally I suggested using Jason's answer for a Stream that supports the Length property. But it had a flaw because it assumed that the Stream would return all its contents in a single Read, which is not necessarily true (not for a Socket, for example.) I don't know if there is an example of a Stream implementation in the BCL that does support Length but might return the data in shorter chunks than you request, but as anyone can inherit Stream this could easily be the case.
It's probably simpler for most cases to use the above general solution, but supposing you did want to read directly into an array that is bigEnough:
byte[] b = new byte[bigEnough];
int r, offset = 0; // offset must start at 0 before the first Read
while ((r = input.Read(b, offset, b.Length - offset)) > 0)
offset += r;
That is, repeatedly call Read and move the position you will be storing the data at.

Byte[] Content = new BinaryReader(file.InputStream).ReadBytes(file.ContentLength);

byte[] buf; // byte array
Stream stream=Page.Request.InputStream; //initialise new stream
buf = new byte[stream.Length]; //declare arraysize
stream.Read(buf, 0, buf.Length); // read from stream to byte array

Ok, maybe I'm missing something here, but this is the way I do it:
public static Byte[] ToByteArray(this Stream stream) {
Int32 length = stream.Length > Int32.MaxValue ? Int32.MaxValue : Convert.ToInt32(stream.Length);
Byte[] buffer = new Byte[length];
stream.Read(buffer, 0, length);
return buffer;
}

If you post a file from a mobile device or another client:
byte[] fileData = null;
using (var binaryReader = new BinaryReader(Request.Files[0].InputStream))
{
fileData = binaryReader.ReadBytes(Request.Files[0].ContentLength);
}

Stream s;
int len = (int)s.Length;
byte[] b = new byte[len];
int pos = 0, r; // r holds the byte count returned by each Read call
while((r = s.Read(b, pos, len - pos)) > 0) {
pos += r;
}
A slightly more complicated solution is necessary if s.Length exceeds Int32.MaxValue. But if you need to read a stream that large into memory, you might want to think about a different approach to your problem.
Edit: If your stream does not support the Length property, modify using Earwicker's workaround.
public static class StreamExtensions {
// Credit to Earwicker
public static void CopyStream(this Stream input, Stream output) {
byte[] b = new byte[32768];
int r;
while ((r = input.Read(b, 0, b.Length)) > 0) {
output.Write(b, 0, r);
}
}
}
[...]
Stream s;
MemoryStream ms = new MemoryStream();
s.CopyStream(ms);
byte[] b = ms.ToArray(); // GetBuffer() would return the internal buffer, which may be longer than the data

A "bigEnough" array is a bit of a stretch. Sure, the buffer needs to be "big enough", but the proper design of an application should include transactions and delimiters. In this configuration each transaction would have a preset length, so your array would anticipate a certain number of bytes and insert them into a correctly sized buffer. Delimiters would ensure transaction integrity and would be supplied within each transaction. To make your application even better, you could use 2 channels (2 sockets). One would carry fixed-length control messages that include the size and sequence number of the data transaction to be transferred over the data channel. The receiver would acknowledge buffer creation, and only then would the data be sent.
If you have no control over the stream sender, then you need a multidimensional array as a buffer. The component arrays would be small enough to be manageable and big enough to be practical, based on your estimate of the expected data. The processing logic would seek known start delimiters and then the ending delimiter in subsequent element arrays. Once the ending delimiter is found, a new buffer would be created to store the relevant data between the delimiters, and the initial buffer would have to be restructured to allow data disposal.
As for code to convert a stream into a byte array, it is shown below.
Stream s = yourStream;
int streamEnd = Convert.ToInt32(s.Length);
byte[] buffer = new byte[streamEnd];
s.Read(buffer, 0, streamEnd);

Quick and dirty technique:
static byte[] StreamToByteArray(Stream inputStream)
{
if (!inputStream.CanRead)
{
throw new ArgumentException();
}
// This is optional
if (inputStream.CanSeek)
{
inputStream.Seek(0, SeekOrigin.Begin);
}
byte[] output = new byte[inputStream.Length];
int bytesRead = inputStream.Read(output, 0, output.Length);
Debug.Assert(bytesRead == output.Length, "Bytes read from stream matches stream length");
return output;
}
Test:
static void Main(string[] args)
{
byte[] data;
string path = @"C:\Windows\System32\notepad.exe";
using (FileStream fs = File.Open(path, FileMode.Open, FileAccess.Read))
{
data = StreamToByteArray(fs);
}
Debug.Assert(data.Length > 0);
Debug.Assert(new FileInfo(path).Length == data.Length);
}
I would ask why you want to read a stream into a byte[]. If you want to copy the contents of a stream, may I suggest using a MemoryStream and writing your input stream into it.

You could also try just reading in parts at a time and expanding the byte array being returned:
public byte[] StreamToByteArray(string fileName)
{
byte[] total_stream = new byte[0];
using (Stream input = File.Open(fileName, FileMode.Open, FileAccess.Read))
{
byte[] stream_array = new byte[0];
// Setup whatever read size you want (small here for testing)
byte[] buffer = new byte[32];// * 1024];
int read = 0;
while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
{
stream_array = new byte[total_stream.Length + read];
total_stream.CopyTo(stream_array, 0);
Array.Copy(buffer, 0, stream_array, total_stream.Length, read);
total_stream = stream_array;
}
}
return total_stream;
}

Related

Copy stream content on two destinations

I know it's possible to copy one stream to another with sourceStream.CopyTo(targetStream); but I want to copy content of sourceStream to two destination streams in two different Tasks. When I call this method two times, in second time stream is empty.
Is that possible at all? A simple way is to load stream content to memory then copy it on targets, but it may cause OutOfMemoryException.
If it matters I'm using .Net 4.5
If you're copying it to two destinations at the same time, then something like:
byte[] buffer = new byte[SOME_SIZE];
int bytesRead;
while((bytesRead = source.Read(buffer, 0, buffer.Length)) > 0)
{
dest1.Write(buffer, 0, bytesRead);
dest2.Write(buffer, 0, bytesRead);
}
This iterates through the input stream once, writing each chunk to two outputs. This is pretty much what CopyTo does internally - the only difference is the second output.
Copy the stream as many times as needed using the method below:
private static Stream CopyStream(Stream inputStream)
{
const int readSize = 256;
byte[] buffer = new byte[readSize];
MemoryStream ms = new MemoryStream();
int count = inputStream.Read(buffer, 0, readSize);
while (count > 0)
{
ms.Write(buffer, 0, count);
count = inputStream.Read(buffer, 0, readSize);
}
ms.Position = 0;
return ms;
}
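Use it as follows. Note that the source must be seekable and rewound between calls, otherwise the second copy will be empty:
Stream destStream1 = CopyStream(sourceStream);
sourceStream.Position = 0; // rewind; throws NotSupportedException on non-seekable streams
Stream destStream2 = CopyStream(sourceStream);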

Copy all but the last 16 bytes of a stream? Early detection of end-of-stream?

This is C# related. We have a case where we need to copy the entire source stream into a destination stream except for the last 16 bytes.
EDIT: The streams can range up to 40GB, so we can't do some static byte[] allocation (e.g. .ToArray())
Looking at the MSDN documentation, it seems that we can reliably determine the end of stream only when the return value is 0. Return values between 0 and the requested size can imply bytes are "not currently available" (what does that really mean?)
Currently it copies every single byte as follows. inStream and outStream are generic - can be memory, disk or network streams (actually some more too).
public static void StreamCopy(Stream inStream, Stream outStream)
{
var buffer = new byte[8*1024];
var last16Bytes = new byte[16];
int bytesRead;
while ((bytesRead = inStream.Read(buffer, 0, buffer.Length)) > 0)
{
outStream.Write(buffer, 0, bytesRead);
}
// Issues:
// 1. We already wrote the last 16 bytes into
// outStream (possibly over the n/w)
// 2. last16Bytes = ? (inStream may not necessarily support rewinding)
}
What is a reliable way to ensure all but the last 16 are copied? I can think of using Position and Length on the inStream but there is a gotcha on MSDN that says
If a class derived from Stream does not support seeking, calls to Length, SetLength, Position, and Seek throw a NotSupportedException.
1. Read between 1 and n bytes from the input stream. [1]
2. Append the bytes to a circular buffer. [2]
3. Write the first max(0, b - 16) bytes from the circular buffer to the output stream, where b is the number of bytes in the circular buffer.
4. Remove the bytes that you just have written from the circular buffer.
5. Go to step 1.
[1] This is what the Read method does: if you call int n = Read(buffer, 0, 500); it will read between 1 and 500 bytes into buffer and return the number of bytes read. If Read returns 0, you have reached the end of the stream.
[2] For maximum performance, you can read the bytes directly from the input stream into the circular buffer. This is a bit tricky, because you have to deal with the wraparound within the array underlying the buffer.
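The steps above can be sketched roughly as follows. This is a minimal sketch and not the answerer's code: it uses a plain linear buffer that is compacted after every write rather than a true circular buffer, and the names TailTrimmingCopy, CopyAllButTail, tailLength and chunkSize are made up for illustration.

```csharp
using System;
using System.IO;

public static class TailTrimmingCopy
{
    // Copies inStream to outStream except for the final tailLength bytes,
    // which are returned instead. Does not need Length, Position or Seek.
    public static byte[] CopyAllButTail(Stream inStream, Stream outStream,
                                        int tailLength, int chunkSize = 8 * 1024)
    {
        var buf = new byte[tailLength + chunkSize];
        int held = 0; // bytes at the start of buf that have not been written yet

        int n;
        // Read may return fewer bytes than requested; only 0 means end of stream.
        while ((n = inStream.Read(buf, held, chunkSize)) > 0)
        {
            held += n;
            if (held > tailLength)
            {
                int writable = held - tailLength;
                outStream.Write(buf, 0, writable);
                // Slide the retained tail to the front; Array.Copy handles overlap.
                Array.Copy(buf, writable, buf, 0, tailLength);
                held = tailLength;
            }
        }

        // Whatever is still held is the tail (shorter if the stream was short).
        var tail = new byte[held];
        Array.Copy(buf, 0, tail, 0, held);
        return tail;
    }
}
```

Because at most tailLength + chunkSize bytes are held at any time, memory use stays constant regardless of the stream size, at the cost of one small copy per chunk.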
The following solution is fast and tested. Hope it's useful. It uses the double buffering idea you already had in mind. EDIT: simplified loop removing the conditional that separated the first iteration from the rest.
public static void StreamCopy(Stream inStream, Stream outStream) {
// Define the size of the chunk to copy during each iteration (1 KiB)
const int blockSize = 1024;
const int bytesToOmit = 16;
const int buffSize = blockSize + bytesToOmit;
// Generate working buffers
byte[] buffer1 = new byte[buffSize];
byte[] buffer2 = new byte[buffSize];
// Initialize first iteration
byte[] curBuffer = buffer1;
byte[] prevBuffer = null;
int bytesRead;
// Attempt to fully fill the buffer
bytesRead = inStream.Read(curBuffer, 0, buffSize);
if( bytesRead == buffSize ) {
// We successfully retrieved a whole buffer, we will output
// only [blockSize] bytes, to avoid writing to the last
// bytes in the buffer in case the remaining 16 bytes happen to
// be the last ones
outStream.Write(curBuffer, 0, blockSize);
} else {
// We couldn't retrieve the whole buffer
int bytesToWrite = bytesRead - bytesToOmit;
if( bytesToWrite > 0 ) {
outStream.Write(curBuffer, 0, bytesToWrite);
}
// There's no more data to process
return;
}
curBuffer = buffer2;
prevBuffer = buffer1;
while( true ) {
// Attempt again to fully fill the buffer
bytesRead = inStream.Read(curBuffer, 0, buffSize);
if( bytesRead == buffSize ) {
// We retrieved the whole buffer, output first the last 16
// bytes of the previous buffer, and output just [blockSize]
// bytes from the current buffer
outStream.Write(prevBuffer, blockSize, bytesToOmit);
outStream.Write(curBuffer, 0, blockSize);
} else {
// We could not retrieve a complete buffer
if( bytesRead <= bytesToOmit ) {
// The bytes to output come solely from the previous buffer
outStream.Write(prevBuffer, blockSize, bytesRead);
} else {
// The bytes to output come from the previous buffer and
// the current buffer
outStream.Write(prevBuffer, blockSize, bytesToOmit);
outStream.Write(curBuffer, 0, bytesRead - bytesToOmit);
}
break;
}
// swap buffers for next iteration
byte[] swap = prevBuffer;
prevBuffer = curBuffer;
curBuffer = swap;
}
}
static void Assert(Stream inStream, Stream outStream) {
// Routine that tests the copy worked as expected
const int bytesToOmit = 16; // must match the constant used in StreamCopy
inStream.Seek(0, SeekOrigin.Begin);
outStream.Seek(0, SeekOrigin.Begin);
Debug.Assert(outStream.Length == Math.Max(inStream.Length - bytesToOmit, 0));
for( int i = 0; i < outStream.Length; i++ ) {
int byte1 = inStream.ReadByte();
int byte2 = outStream.ReadByte();
Debug.Assert(byte1 == byte2);
}
}
A much easier solution to code, yet slower since it would work at a byte level, would be to use an intermediate queue between the input stream and the output stream. The process would first read and enqueue 16 bytes from the input stream. Then it would iterate over the remaining input bytes, reading a single byte from the input stream, enqueuing it and then dequeuing a byte. The dequeued byte would be written to the output stream, until all bytes from the input stream are processed. The unwanted 16 bytes should linger in the intermediate queue.
Hope this helps!
=)
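The queue-based alternative described above could look something like this. It is only a sketch under the assumptions stated in that paragraph (byte-at-a-time processing, so it will be slow on large streams); the class and method names are hypothetical.

```csharp
using System.Collections.Generic;
using System.IO;

public static class QueueTrimmingCopy
{
    // Copies every byte of inStream to outStream except the last bytesToOmit
    // bytes, which are returned. Works one byte at a time.
    public static byte[] CopyAllButLast(Stream inStream, Stream outStream, int bytesToOmit)
    {
        var queue = new Queue<byte>(bytesToOmit + 1);
        int b;
        while ((b = inStream.ReadByte()) != -1)
        {
            queue.Enqueue((byte)b);
            // Once more than bytesToOmit bytes are pending, the oldest one
            // can no longer be part of the tail, so it is safe to write.
            if (queue.Count > bytesToOmit)
                outStream.WriteByte(queue.Dequeue());
        }
        // The bytes left in the queue are the omitted tail.
        return queue.ToArray();
    }
}
```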
Using a circular buffer sounds great, but there is no circular buffer class in .NET, which means additional code anyway. I ended up with the following algorithm, a sort of map and copy, which I think is simple. The variable names are longer than usual for the sake of being self-descriptive here.
This flows thru the buffers as
[outStream] <== [tailBuf] <== [mainBuf] <== [inStream]
public byte[] CopyStreamExtractLastBytes(Stream inStream, Stream outStream,
int extractByteCount)
{
//var mainBuf = new byte[1024*4]; // 4K buffer ok for network too
var mainBuf = new byte[4651]; // nearby prime for testing
int mainBufValidCount;
var tailBuf = new byte[extractByteCount];
int tailBufValidCount = 0;
while ((mainBufValidCount = inStream.Read(mainBuf, 0, mainBuf.Length)) > 0)
{
// Map: how much of what (passthru/tail) lives where (MainBuf/tailBuf)
// more than tail is passthru
int totalPassthruCount = Math.Max(0, tailBufValidCount +
mainBufValidCount - extractByteCount);
int tailBufPassthruCount = Math.Min(tailBufValidCount, totalPassthruCount);
int tailBufTailCount = tailBufValidCount - tailBufPassthruCount;
int mainBufPassthruCount = totalPassthruCount - tailBufPassthruCount;
int mainBufResidualCount = mainBufValidCount - mainBufPassthruCount;
// Copy: Passthru must be flushed per FIFO order (tailBuf then mainBuf)
outStream.Write(tailBuf, 0, tailBufPassthruCount);
outStream.Write(mainBuf, 0, mainBufPassthruCount);
// Copy: Now reassemble/compact tail into tailBuf
var tempResidualBuf = new byte[extractByteCount];
Array.Copy(tailBuf, tailBufPassthruCount, tempResidualBuf, 0,
tailBufTailCount);
Array.Copy(mainBuf, mainBufPassthruCount, tempResidualBuf,
tailBufTailCount, mainBufResidualCount);
tailBufValidCount = tailBufTailCount + mainBufResidualCount;
tailBuf = tempResidualBuf;
}
return tailBuf;
}

C# split byte array from file

Hello I'm doing an encryption algorithm which reads bytes from file (any type) and outputs them into a file. The problem is my encryption program takes only blocks of 16 bytes so if the file is bigger it has to be split into blocks of 16, or if there's a way to read 16 bytes from the file each time it's fine.
The algorithm is working fine with hard coded input of 16 bytes. The ciphered result has to be saved in a list or array because it has to be deciphered the same way later. I can't post all my program but here's what I do in main so far and cannot get results
static void Main(String[] args)
{
byte[] bytes = File.ReadAllBytes("path to file");
var stream = new StreamReader(new MemoryStream(bytes));
byte[] cipherText = new byte[16];
byte[] decipheredText = new byte[16];
Console.WriteLine("\nThe message is: ");
Console.WriteLine(stream.ReadToEnd());
AES a = new AES(keyInput);
var list1 = new List<byte[]>();
for (int i = 0; i < bytes.Length; i+=16)
{
a.Cipher(bytes, cipherText);
list1.Add(cipherText);
}
Console.WriteLine("\nThe resulting ciphertext is: ");
foreach (byte[] b in list1)
{
ToBytes(b);
}
}
I know that my loop always adds the first 16 bytes from the byte array, but I tried many ways and nothing works. It won't let me index the bytes array or copy an item to a temp variable like temp = bytes[i]. The ToBytes method is irrelevant; it just prints the elements as bytes.
I would recommend changing the interface of your Cipher() method: instead of passing the entire array, it would be better to pass the source and destination arrays plus offsets, and encrypt block by block.
Pseudo-code is below.
void Cipher(byte[] source, int srcOffset, byte[] dest, int destOffset)
{
// Cipher these bytes from (source + offset) to (source + offset + 16),
// write the cipher to (dest + offset) to (dest + offset + 16)
// Also I'd recommend checking that the source and dest Length are greater than or equal to (offset + 16)!
}
Usage:
For small files (one memory allocation for destination buffer, block by block encryption):
// You can allocate the entire destination buffer before encryption!
byte[] sourceBuffer = File.ReadAllBytes("path to file");
byte[] destBuffer = new byte[sourceBuffer.Length];
// Encrypt each block.
for (int offset = 0; offset < sourceBuffer.Length; offset += 16)
{
Cipher(sourceBuffer, offset, destBuffer, offset);
}
So, the main advantage of this approach is that it eliminates additional memory allocations: the destination array is allocated once. There are also no memory copy operations.
For files of any size (streams, block by block encryption):
byte[] inputBlock = new byte[16];
byte[] outputBlock = new byte[16];
using (var inputStream = File.OpenRead("input path"))
using (var outputStream = File.Create("output path"))
{
int bytesRead;
while ((bytesRead = inputStream.Read(inputBlock, 0, inputBlock.Length)) > 0)
{
if (bytesRead < 16)
{
// The last block is short: either throw, or fill the remaining bytes
// of the input block with some bytes. That operation is called "padding".
// See http://en.wikipedia.org/wiki/Block_cipher_modes_of_operation#Padding
throw new InvalidOperationException("Read block size is not equal to 16 bytes");
}
Cipher(inputBlock, 0, outputBlock, 0);
outputStream.Write(outputBlock, 0, outputBlock.Length);
}
}
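The padding step mentioned in the comments above is left open in the answer. One common scheme is PKCS#7-style padding, sketched below purely as an illustration: the answer does not say which scheme to use, and BlockPadding, PadLastBlock and UnpaddedLength are hypothetical helper names.

```csharp
using System;

public static class BlockPadding
{
    public const int BlockSize = 16;

    // Pads a final partial block (0 to 15 valid bytes) up to 16 bytes.
    // Every padding byte holds the number of padding bytes that were added.
    public static byte[] PadLastBlock(byte[] block, int validBytes)
    {
        if (validBytes < 0 || validBytes >= BlockSize)
            throw new ArgumentOutOfRangeException(nameof(validBytes));

        var padded = new byte[BlockSize];
        Array.Copy(block, padded, validBytes);
        byte padValue = (byte)(BlockSize - validBytes);
        for (int i = validBytes; i < BlockSize; i++)
            padded[i] = padValue;
        return padded;
    }

    // Returns how many bytes of a decrypted final block are real data.
    public static int UnpaddedLength(byte[] lastBlock)
    {
        int padValue = lastBlock[BlockSize - 1];
        if (padValue < 1 || padValue > BlockSize)
            throw new FormatException("Invalid padding.");
        return BlockSize - padValue;
    }
}
```

With this scheme a file whose length is an exact multiple of 16 gets one extra block consisting entirely of padding (validBytes = 0), so that the decrypting side can always tell where the data ends.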
No need to read the whole mess into memory if you can only process it a bit at a time...
var filename = @"c:\temp\foo.bin";
using(var fileStream = new FileStream(filename, FileMode.Open))
{
var buffer = new byte[16];
var bytesRead = 0;
while((bytesRead = fileStream.Read(buffer, 0, buffer.Length)) > 0)
{
// do whatever you need to with the next 16-byte block
Console.WriteLine("Read {0} bytes: {1}",
bytesRead,
string.Join(",", buffer));
}
}
You can use Array.Copy
byte[] temp = new byte[16];
Array.Copy(bytes, i, temp, 0, 16);

An elegant way to consume (all bytes of a) BinaryReader?

Is there an elegant way to emulate the StreamReader.ReadToEnd method with BinaryReader? Perhaps to put all the bytes into a byte array?
I do this:
read1.ReadBytes((int)read1.BaseStream.Length);
...but there must be a better way.
Original Answer (Read Update Below!)
Simply do:
byte[] allData = read1.ReadBytes(int.MaxValue);
The documentation says that it will read all bytes until the end of the stream is reached.
Update
Although this seems elegant, and the documentation seems to indicate that this would work, the actual implementation (checked in .NET 2, 3.5, and 4) allocates a full-size byte array for the data, which will probably cause an OutOfMemoryException on a 32-bit system.
Therefore, I would say that actually there isn't an elegant way.
Instead, I would recommend the following variation of @iano's answer. This variant doesn't rely on .NET 4:
Create an extension method for BinaryReader (or Stream, the code is the same for either).
public static byte[] ReadAllBytes(this BinaryReader reader)
{
const int bufferSize = 4096;
using (var ms = new MemoryStream())
{
byte[] buffer = new byte[bufferSize];
int count;
while ((count = reader.Read(buffer, 0, buffer.Length)) != 0)
ms.Write(buffer, 0, count);
return ms.ToArray();
}
}
There is not an easy way to do this with BinaryReader. If you don't know the count you need to read ahead of time, a better bet is to use MemoryStream:
public byte[] ReadAllBytes(Stream stream)
{
using (var ms = new MemoryStream())
{
stream.CopyTo(ms);
return ms.ToArray();
}
}
To avoid the additional copy when calling ToArray(), you could instead return the Position and buffer, via GetBuffer().
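One way to return the buffer together with its valid length is an ArraySegment<byte>, roughly as sketched here. This is an assumption about how one might package the two values, not something the answer prescribes; StreamSegmentReader and ReadAllAsSegment are hypothetical names.

```csharp
using System;
using System.IO;

public static class StreamSegmentReader
{
    // Reads the whole stream and returns a view over the MemoryStream's
    // internal buffer instead of a trimmed copy.
    public static ArraySegment<byte> ReadAllAsSegment(Stream stream)
    {
        // Not wrapped in using: the returned segment refers to this
        // stream's buffer, and MemoryStream holds no unmanaged resources.
        var ms = new MemoryStream();
        stream.CopyTo(ms);

        // GetBuffer() may return an array longer than the data, so the
        // segment records how many bytes are actually valid.
        return new ArraySegment<byte>(ms.GetBuffer(), 0, (int)ms.Length);
    }
}
```

Callers then have to respect segment.Offset and segment.Count rather than segment.Array.Length, since the underlying array can contain unused trailing bytes.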
To copy the content of one stream to another, I solved it by reading "some" bytes at a time until the end of the file is reached:
private const int READ_BUFFER_SIZE = 1024;
using (BinaryReader reader = new BinaryReader(responseStream))
{
using (BinaryWriter writer = new BinaryWriter(File.Open(localPath, FileMode.Create)))
{
int byteRead = 0;
do
{
byte[] buffer = reader.ReadBytes(READ_BUFFER_SIZE);
byteRead = buffer.Length;
writer.Write(buffer);
byteTransfered += byteRead;
} while (byteRead == READ_BUFFER_SIZE);
}
}
Had the same problem.
First, get the file's size using FileInfo.Length.
Next, create a byte array and set its value to BinaryReader.ReadBytes(FileInfo.Length).
e.g.
var size = new FileInfo(yourImagePath).Length;
byte[] allBytes = yourReader.ReadBytes(System.Convert.ToInt32(size));
Another approach to this problem is to use C# extension methods:
public static class StreamHelpers
{
public static byte[] ReadAllBytes(this BinaryReader reader)
{
// Pre-.NET 4.0: copy manually in fixed-size chunks.
const int bufferSize = 4096;
using (var ms = new MemoryStream())
{
byte[] buffer = new byte[bufferSize];
int count;
while ((count = reader.Read(buffer, 0, buffer.Length)) != 0)
ms.Write(buffer, 0, count);
return ms.ToArray();
}
// .NET 4.0 or newer: the buffer and loop above can be replaced by
// reader.BaseStream.CopyTo(ms);
}
}
Using this approach will allow for both reusable as well as readable code.
I use this, which utilizes the underlying BaseStream property to give you the length info you need. It keeps things nice and simple.
Below are three extension methods on BinaryReader:
The first reads from wherever the stream's current position is to the end
The second reads the entire stream in one go
The third utilizes the Range type to specify the subset of data you are interested in.
public static class BinaryReaderExtensions {
public static byte[] ReadBytesToEnd(this BinaryReader binaryReader) {
var length = binaryReader.BaseStream.Length - binaryReader.BaseStream.Position;
return binaryReader.ReadBytes((int)length);
}
public static byte[] ReadAllBytes(this BinaryReader binaryReader) {
binaryReader.BaseStream.Position = 0;
return binaryReader.ReadBytes((int)binaryReader.BaseStream.Length);
}
public static byte[] ReadBytes(this BinaryReader binaryReader, Range range) {
var (offset, length) = range.GetOffsetAndLength((int)binaryReader.BaseStream.Length);
binaryReader.BaseStream.Position = offset;
return binaryReader.ReadBytes(length);
}
}
Using them is then trivial and clear...
// 1 - Reads everything in as a byte array
var rawBytes = myBinaryReader.ReadAllBytes();
// 2 - Reads a string, then reads the remaining data as a byte array
var someString = myBinaryReader.ReadString();
var rawBytes = myBinaryReader.ReadBytesToEnd();
// 3 - Uses a range to read the last 44 bytes
var rawBytes = myBinaryReader.ReadBytes(^44..);

Possible ways to persist a string array to a stream without using serialization?

What are possible ways to save string arrays to a stream without using serialization?
I'm particularly interested in strings since their lengths may vary. I also should be able to restore the array from stream.
And, more importantly, I would like to be able to read only slices of an array without reading full array into memory, because potentially my arrays can be huge.
P.S. I know that there exist databases, that I shouldn't reinvent the wheel, etc, but I have my reasons to opt for hand made solution.
Thank you.
Well, saving data to a stream is serialization; the real trick is: what kind. For example, I assume you're talking about things like XmlSerializer or BinaryFormatter that require you to deserialize the whole thing, but that isn't always necessary.
By writing each string with a length-prefix, you should be able to seek past items you don't want pretty easily. The other option is to write (separately) an index of offsets, but that is sometimes overkill.
As a basic example, s here is "jkl", without it reading the entire stream or deserializing the unwanted strings; note that it could be optimized by (for example) using a variable-length encoding for the int (length), which would also fix the current assumption that endianness is the same between reader and writer:
static void Main()
{
byte[] raw;
using (MemoryStream ms = new MemoryStream())
{
// serialize all
List<string> data = new List<string> {
"abc", "def", "ghi", "jkl", "mno", "pqr" };
foreach (string s in data)
{
byte[] buffer = Encoding.UTF8.GetBytes(s);
byte[] lenBuffer = BitConverter.GetBytes(buffer.Length);
ms.Write(lenBuffer, 0, lenBuffer.Length);
ms.Write(buffer, 0, buffer.Length);
}
raw = ms.ToArray();
}
using (MemoryStream ms = new MemoryStream(raw))
{
int offset = 3, len;
byte[] buffer = new byte[128];
while (offset-- > 0)
{
Read(ms, ref buffer, 4);
len = BitConverter.ToInt32(buffer, 0);
ms.Seek(len, SeekOrigin.Current); // assume seekable, but
// easy to read past if not
}
Read(ms, ref buffer, 4);
len = BitConverter.ToInt32(buffer, 0);
Read(ms, ref buffer, len);
string s = Encoding.UTF8.GetString(buffer, 0, len);
}
}
static void Read(Stream stream, ref byte[] buffer, int count)
{
if (buffer.Length < count) buffer = new byte[count];
int offset = 0;
while (count > 0)
{
int bytes = stream.Read(buffer, offset, count);
if (bytes <= 0) throw new EndOfStreamException();
offset += bytes;
count -= bytes;
}
}
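The other option mentioned above, a separate index of offsets, might be sketched like this. It is an illustration only: it relies on BinaryWriter.Write(string) and BinaryReader.ReadString for the length-prefixed encoding, requires a seekable stream, and the names IndexedStringStore, WriteAll and ReadAt are invented for the example.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class IndexedStringStore
{
    // Writes each string length-prefixed and returns the byte offset
    // at which every entry starts.
    public static List<long> WriteAll(Stream stream, IEnumerable<string> items)
    {
        var offsets = new List<long>();
        // The writer is deliberately not disposed so the stream stays open.
        var writer = new BinaryWriter(stream, Encoding.UTF8);
        foreach (string item in items)
        {
            writer.Flush();
            offsets.Add(stream.Position);
            writer.Write(item); // length prefix followed by UTF-8 bytes
        }
        writer.Flush();
        return offsets;
    }

    // Jumps straight to one entry without reading the others.
    public static string ReadAt(Stream stream, List<long> offsets, int index)
    {
        stream.Position = offsets[index];
        var reader = new BinaryReader(stream, Encoding.UTF8);
        return reader.ReadString();
    }
}
```

In a real file the offsets list would itself have to be persisted, for example as a block of fixed-size Int64 values, so that a slice of the array can be located without scanning the strings.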
