I'm working with a 3rd party component that returns an IStream object (System.Runtime.InteropServices.ComTypes.IStream). I need to take the data in that IStream and write it to a file. I've managed to get that done, but I'm not really happy with the code.
With "strm" being my IStream, here's my test code...
// access the structure containing statistical info about the stream
System.Runtime.InteropServices.ComTypes.STATSTG stat;
strm.Stat(out stat, 0);
System.IntPtr myPtr = (IntPtr)0;
// get the "cbSize" member from the stat structure
// this is the size (in bytes) of our stream.
int strmSize = (int)stat.cbSize; // *** DANGEROUS *** (long to int cast)
byte[] strmInfo = new byte[strmSize];
strm.Read(strmInfo, strmSize, myPtr);
string outFile = @"c:\test.db3";
File.WriteAllBytes(outFile, strmInfo);
At the very least, I don't like the long to int cast as commented above, but I wonder if there's not a better way to get the original stream length than the above? I'm somewhat new to C#, so thanks for any pointers.
You don't need that cast, because you can read the data from the IStream source in chunks.
// ...
// Allocate unmanaged memory for IStream.Read to report how many bytes it actually read (pcbRead).
IntPtr bytesReadPtr = Marshal.AllocHGlobal(sizeof(int));
try
{
    using (FileStream fs = new FileStream(@"c:\test.db3", FileMode.OpenOrCreate))
    {
        byte[] buffer = new byte[8192];
        int bytesRead;
        do
        {
            strm.Read(buffer, buffer.Length, bytesReadPtr);
            bytesRead = Marshal.ReadInt32(bytesReadPtr);
            fs.Write(buffer, 0, bytesRead);
        } while (bytesRead == buffer.Length);
    }
}
finally
{
    Marshal.FreeHGlobal(bytesReadPtr);
}
This approach is more memory-efficient, as it only uses a small buffer to transfer data between the two streams.
System.Runtime.InteropServices.ComTypes.IStream is the managed definition of the COM IStream interface, which derives from ISequentialStream.
From MSDN: http://msdn.microsoft.com/en-us/library/aa380011(VS.85).aspx
The actual number of bytes read can be less than the number of bytes requested if an error occurs or if the end of the stream is reached during the read operation. The number of bytes returned should always be compared to the number of bytes requested. If the number of bytes returned is less than the number of bytes requested, it usually means the Read method attempted to read past the end of the stream.
According to this documentation, you can keep looping and reading as long as pcbRead equals cb; once a read returns fewer bytes than requested, you have reached the end of the stream.
I am trying to simply download an object from my bucket using C# just like we can find in S3 examples, and I can't figure out why the stream won't be entirely copied to my byte array. Only the first 8192 bytes are copied instead of the whole stream.
I have tried with an Amazon.S3.AmazonS3Client and with an Amazon.S3.Transfer.TransferUtility, but in both cases only the first bytes are actually copied into the buffer.
var stream = await _transferUtility.OpenStreamAsync(BucketName, key);
using (stream)
{
byte[] content = new byte[stream.Length];
stream.Read(content, 0, content.Length);
// Here content should contain all the data from the stream, but only the first 8192 bytes are actually populated.
}
When debugging, I see the stream type is Amazon.Runtime.Internal.Util.Md5Stream, and inside the stream, before calling Read() the property CurrentPosition = 0. After the call, CurrentPosition becomes 8192, which seems to indeed indicate only the first 8K of data was read. The total Length of the stream is 104042.
If I make more calls to stream.Read(), I see more data gets read and CurrentPosition increases in value. But CurrentPosition is not a public property, and I cannot access it in my code to make a while() loop (and having to code such loops to read all the data seems a bit clunky).
Why are only the first 8K read in my code? How should I proceed to read the entire stream?
I tried calling stream.Flush(), but it did not fix the problem.
EDIT 1
I have modified my code so it does the following:
var stream = await _transferUtility.OpenStreamAsync(BucketName, key);
using (stream)
{
byte[] content = new byte[stream.Length];
var bytesRead = 0;
while (bytesRead < stream.Length)
bytesRead += stream.Read(content, bytesRead, content.Length - bytesRead);
}
And it works. But still looks clunky. Is it normal I have to do this?
EDIT 2
Final solution is to create a MemoryStream of the correct size and then call CopyTo(). So no clunky loop anymore and no risk of infinite loop if Read() starts returning 0 before the whole stream has been read:
var stream = await _transferUtility.OpenStreamAsync(BucketName, key);
using (stream)
{
using (var memoryStream = new MemoryStream((int)stream.Length))
{
stream.CopyTo(memoryStream);
var myBuffer = memoryStream.GetBuffer();
}
}
stream.Read() returns the number of bytes read. You can then keep track of the total number of bytes read until you have reached the end of the file (content.Length).
You could also just loop until the returned value is 0, which means the end of the stream has been reached and there are no more bytes left.
You will need to keep track of the current offset for your content buffer so that you are not overwriting data for each call.
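For instance, here is a minimal sketch of such a loop, wrapped in a hypothetical ReadFully helper (the name is made up for this example and is not part of the AWS SDK):
using System.IO;

// Sketch of a helper that keeps calling Read until `count` bytes have arrived
// or the stream ends early. "ReadFully" is a made-up name, not an SDK method.
static byte[] ReadFully(Stream stream, int count)
{
    byte[] buffer = new byte[count];
    int totalRead = 0;
    while (totalRead < count)
    {
        int read = stream.Read(buffer, totalRead, count - totalRead);
        if (read == 0)
            break; // the stream ended before `count` bytes were read
        totalRead += read;
    }
    return buffer;
}
Called as ReadFully(stream, (int)stream.Length), it gives the same result as the loop in EDIT 1 without scattering the bookkeeping through your method.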
I hope someone here will be able to help me out with this.
What I'm trying to do is decompress a zlib compressed file in C# using ZlibNet. (I've also tried DotNetZip and SharpZipLib)
The problem I'm having is that it will only decompress the first 256 KB, or rather the first 262144 bytes.
Here's my Decompress method, taken from here:
public static byte[] Decompress(byte[] gzip)
{
using (var stream = new Ionic.Zlib.ZlibStream(new MemoryStream(gzip), Ionic.Zlib.CompressionMode.Decompress))
{
var outStream = new MemoryStream();
const int size = 999999; //Playing around with various sizes didn't help
byte[] buffer = new byte[size];
int read;
while ((read = stream.Read(buffer, 0, size)) > 0)
{
outStream.Write(buffer, 0, read);
read = 0;
}
return outStream.ToArray();
}
}
Basically, the int (read) gets set to 262144 the first time the while loop executes, the data is written, and then on the next pass of the while loop, read gets set to 0, making the loop exit and the function return outStream as an array. (Even though there are still bytes left to be read!)
Thanks in advance to anyone who could help with this!
Upon further inspection of the originally packed data, it turns out that the script responsible for (de)compressing the data in the original application would split the zlib stream of a file into chunks of 262144 bytes each.
This is why the various libraries I tested always stopped at 262144 bytes: it was the end of the zlib stream, but not the end of the file it was supposed to extract. (Each zlib stream was also separated by a 32-bit unsigned int that indicated the number of bytes the next zlib stream would contain.)
My only guess is that they did this so that if they had a very large file, they wouldn't need to load all of it into memory for decompression. (But that's just a guess.)
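If that layout is right, reassembling the file might look roughly like the sketch below. It assumes each chunk is a 32-bit little-endian length followed by that many bytes of zlib data (whether the very first chunk also carries a prefix is an assumption), and it reuses the DotNetZip (Ionic.Zlib) API from the code above.
using System.IO;
using Ionic.Zlib;

// Sketch: repeatedly read a 32-bit length prefix and then that many bytes
// of zlib data, decompress each block, and append it to the output.
public static byte[] DecompressChunked(byte[] packed)
{
    using (var input = new BinaryReader(new MemoryStream(packed)))
    using (var output = new MemoryStream())
    {
        while (input.BaseStream.Position < input.BaseStream.Length)
        {
            uint compressedLength = input.ReadUInt32();
            byte[] compressed = input.ReadBytes((int)compressedLength);
            byte[] plain = ZlibStream.UncompressBuffer(compressed); // DotNetZip helper
            output.Write(plain, 0, plain.Length);
        }
        return output.ToArray();
    }
}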
I have very poor knowledge of C#, but I need to write code that reads a binary blob into a byte[].
I wrote this code:
byte[] userBlob;
myCommand.CommandText = "SELECT id, userblob FROM USERS";
myCommand.Connection = myFBConnection;
myCommand.Transaction = myTransaction;
FbDataReader reader = myCommand.ExecuteReader();
try
{
while(reader.Read())
{
Console.WriteLine(reader.GetString(0));
userBlob = // what should I do here??
}
}
catch (Exception e)
{
Console.WriteLine(e.Message);
Console.WriteLine("Can't read data from DB");
}
But what should I place there? As I understand it, I need to use streams, but I can't figure out how to do it.
A little late to the game; I hope this is on point.
I'm assuming you're using the Firebird .NET provider, which is a C# implementation that does not ride on top of the native fbclient.dll. Unfortunately, it does not provide a streaming interface to BLOBs, which would allow reading potentially huge data in chunks without blowing out memory.
Instead, you use the FbDataReader.GetBytes() method to read the data, and it all has to fit in memory. GetBytes takes a user-provided buffer and stuffs the BLOB data in the position referenced, and it returns the number of bytes it actually copied (which could be less than the full size).
Passing a null buffer to GetBytes returns you the full size of the BLOB (but no data!) so you can reallocate as needed.
Here we assume you have an INT for field #0 (not interesting) and the BLOB for #1, and this naive implementation should take care of it:
// temp buffer for all BLOBs, reallocated as needed
byte [] blobbuffer = new byte[512];
while (reader.Read())
{
int id = reader.GetInt32(0); // read first field
// get bytes required for this BLOB
long n = reader.GetBytes(
i: 1, // field number
dataIndex: 0,
buffer: null, // no buffer = size check only
bufferIndex: 0,
length: 0);
// extend buffer if needed
if (n > blobbuffer.Length)
blobbuffer = new byte[n];
// read again into nominally "big enough" buffer
n = reader.GetBytes(1, 0, blobbuffer, 0, blobbuffer.Length);
// Now: <n> bytes of <blobbuffer> has your data. Go at it.
}
It's possible to optimize this somewhat, but the Firebird .NET provider really needs a streaming BLOB interface like the native fbclient.dll offers.
byte[] toBytes = Encoding.ASCII.GetBytes(myString); // myString is whatever string you want to convert
So, in your case;
userBlob = Encoding.ASCII.GetBytes(reader.GetString(1)); // column 1 is the userblob field
However, I am not sure what you are trying to achieve with your code as you are pulling back all users and then creating the blob over and over.
In the TCP SSL example from MSDN, a byte array is used to read data from the stream. Is there a reason the array is limited to 2048 bytes? What if TCP sends more than 2048 bytes? Also, how does passing buffer.Length keep reading the stream as data keeps arriving? It doesn't make complete sense to me. Why read the length of the buffer? Wouldn't you want to read the length of the incremental bytes coming into the stream?
static string ReadMessage(SslStream sslStream)
{
// Read the message sent by the client.
// The client signals the end of the message using the
// "<EOF>" marker.
byte [] buffer = new byte[2048];
StringBuilder messageData = new StringBuilder();
int bytes = -1;
do
{
// Read the client's test message.
bytes = sslStream.Read(buffer, 0, buffer.Length);
// Use Decoder class to convert from bytes to UTF8
// in case a character spans two buffers.
Decoder decoder = Encoding.UTF8.GetDecoder();
char[] chars = new char[decoder.GetCharCount(buffer,0,bytes)];
decoder.GetChars(buffer, 0, bytes, chars,0);
messageData.Append (chars);
// Check for EOF or an empty message.
if (messageData.ToString().IndexOf("<EOF>") != -1)
{
break;
}
} while (bytes !=0);
return messageData.ToString();
}
When reading from any kind of Stream (not just SslStream), just because you asked for X bytes does not mean you will get X bytes; the call is allowed to return anywhere between 1 and X bytes. That is what the value returned from Read is for: it tells you how many bytes were actually read.
If you give Read a giant buffer, it is still only going to fill in the first few thousand bytes or so, so giving it a bigger buffer is a waste of space. If you need more data than the buffer holds, you make multiple requests for data.
An important thing to remember when dealing with network streams: one write on the sender does not mean one read on the receiver. Writes can be combined together or split apart at arbitrary points, which is why it is important to have some form of message framing so you can tell when you have "all the data". In the example you posted, the <EOF> marker is used to signal "end of message".
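For illustration, here is a minimal sketch of length-prefix framing. This is not what the MSDN sample does (it uses the <EOF> marker instead), and the method names are made up for the example.
using System;
using System.IO;

// Sketch: each message is a 4-byte length prefix (host byte order, typically
// little-endian on .NET platforms) followed by the payload itself.
static void WriteFramedMessage(Stream stream, byte[] payload)
{
    byte[] lengthPrefix = BitConverter.GetBytes(payload.Length);
    stream.Write(lengthPrefix, 0, lengthPrefix.Length);
    stream.Write(payload, 0, payload.Length);
}

static byte[] ReadFramedMessage(Stream stream)
{
    int length = BitConverter.ToInt32(ReadExactly(stream, 4), 0);
    return ReadExactly(stream, length);
}

// Keeps calling Read until exactly `count` bytes have arrived.
static byte[] ReadExactly(Stream stream, int count)
{
    byte[] buffer = new byte[count];
    int offset = 0;
    while (offset < count)
    {
        int read = stream.Read(buffer, offset, count - offset);
        if (read == 0)
            throw new EndOfStreamException("Stream ended mid-message.");
        offset += read;
    }
    return buffer;
}
With a scheme like this, the receiver knows exactly how many bytes belong to each message regardless of how the network splits or combines the writes.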
I have C# code that reads a text file and prints it out, which looks like this:
StreamReader sr = new StreamReader(File.OpenRead(ofd.FileName));
byte[] buffer = new byte[100]; //is there a way to simply specify the length of this to be the number of bytes in the file?
sr.BaseStream.Read(buffer, 0, buffer.Length);
foreach (byte b in buffer)
{
label1.Text += b.ToString("x") + " ";
}
Is there any way I can know how many bytes my file has?
I want to know the length of the byte[] buffer in advance, so that in the Read call I can simply pass in buffer.Length as the third argument.
System.IO.FileInfo fi = new System.IO.FileInfo("myfile.exe");
long size = fi.Length;
In order to find the file size, the system reads the file's metadata from disk. So the example above does hit the disk, but it does not read the file's content.
It's not clear why you're using StreamReader at all if you're going to read binary data. Just use FileStream instead. You can use the Length property to find the length of the file.
Note, however, that this still doesn't mean you should just call Read and assume that a single call will read all the data. You should loop until you've read everything:
byte[] data;
using (var stream = File.OpenRead(...))
{
data = new byte[(int) stream.Length];
int offset = 0;
while (offset < data.Length)
{
int chunk = stream.Read(data, offset, data.Length - offset);
if (chunk == 0)
{
// Or handle this some other way
throw new IOException("File has shrunk while reading");
}
offset += chunk;
}
}
Note that this is assuming you do want to read the data. If you don't want to even open the stream, use FileInfo.Length as other answers have shown. Note that both FileStream.Length and FileInfo.Length have a type of long, whereas arrays are limited to 32-bit lengths. What do you want to happen with a file which is bigger than 2 gigs?
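If you want to guard against that case explicitly, a minimal sketch (with path standing in for your file name) is to check the length before allocating the array:
using (var stream = File.OpenRead(path)) // `path` is a placeholder for your file
{
    if (stream.Length > int.MaxValue)
    {
        // A single byte array can't hold this much; process the file in chunks instead.
        throw new IOException("File is too large to read into one byte array.");
    }
    byte[] data = new byte[(int) stream.Length];
    // ... then fill it with the read loop shown above ...
}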
You can use the FileInfo.Length property.
Take a look at the example given in the link.
I would imagine something in here should help.
I doubt you can know the size of a file ahead of time without at least asking the file system for it...
How do I use File.ReadAllBytes in chunks?
If it is a large file, then reading it in chunks might help.
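A minimal sketch of that kind of chunked read, processing each block as it arrives instead of holding the whole file in memory (path is a placeholder for your file name):
using (var stream = File.OpenRead(path))
{
    byte[] buffer = new byte[8192];
    int read;
    while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
    {
        // Process buffer[0..read) here, e.g. append each byte's hex
        // representation to the label as in the question's code.
    }
}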