I need to store an arbitrary number of files (of any file type) as a property on a class. This class will get serialized to a JSON file. Later the user can load the JSON file back into the app and recreate the files they originally loaded. Right now I'm storing the files as arrays of bytes. The issue is that some of the files are large, so the byte arrays are too large and cause serialization/deserialization to take a very long time.
Is there a way I can store the files as a string/array of strings instead of bytes? Or some different way of storing the files? What are some options to deal with this problem?
edit:
I believe a string would be faster because right now the byte array is rendered in the JSON as ASCII decimal numbers, so it looks like this:
150,123,43,62...
Encode your byte array as a base 64 string using Convert.ToBase64String(). That should reduce the size of your JSON significantly: http://rextester.com/ILJNV57711
For example, here's a random byte array, serialized as JSON:
[95,103,154,174,23,5,178,179,158,186,181,89,40,229,233,168,217,42,98,65,248]
Here's the same array, converted to a base 64 string, serialized as JSON:
"X2earhcFsrOeurVZKOXpqNkqYkH4"
It's plain to see that a byte array is smaller in JSON when expressed as a base 64 string. It goes from 76 characters to 30.
Certainly don't store the byte array as decimal numbers like that; Base64 encode it at the very least. Base64 encoding enlarges the data to about 133% of the raw file size, but that's a massive improvement over the enlargement of up to 400% you currently get (as many as three digits plus a comma per byte).
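As a minimal sketch of the Base64 round trip described above: the class and method names here (Base64Demo, Encode, Decode) are made up for illustration, and only Convert.ToBase64String() and Convert.FromBase64String() come from the framework.

```csharp
using System;

public static class Base64Demo
{
    // Encode raw file bytes as a Base64 string that can be stored in a JSON string property.
    public static string Encode(byte[] fileBytes) => Convert.ToBase64String(fileBytes);

    // Decode the Base64 string back into the original bytes when the JSON is loaded.
    public static byte[] Decode(string base64) => Convert.FromBase64String(base64);
}
```

With the 21-byte array from the answer above, Base64Demo.Encode() returns "X2earhcFsrOeurVZKOXpqNkqYkH4", and Base64Demo.Decode() on that string gives back the same 21 bytes.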
Related
While learning SFML, I tried to set a window icon with RenderWindowInstance.SetIcon(). The method takes three parameters: the first two are the size and the third is a byte[]. I tried File.ReadAllBytes() and similar tools in C#, but that doesn't work. After searching, I found the ImageInstance.Pixels property, which returns a byte[] that does work as the parameter, but I don't understand why the two return different byte arrays.
In SFML.NET, Image.Pixels returns an array of bytes that are nicely organized RGBA pixel values that represent the image in memory.
.NET's own File.ReadAllBytes() function returns the bytes that come from the file itself in the system's storage device.
Every file has a format that defines the layout and meaning of the bytes that make up that file. Image files are an extension of that concept, as there are many different file formats for images. The pixel data for an image has to be encoded (and/or compressed) according to the format it is being saved as. This means that the bytes in the file no longer match the raw RGBA pixel data as it was in the computer's memory.
Files often contain lots of extra bytes for things like a file header, metadata, compression information, or possibly even an index for blocks of data that are smaller files or images within a file.
When you use File.ReadAllBytes(), you are given all of the bytes that represent this data in an array and you have to know exactly what the meaning of the byte at each index is.
SFML understands how to decode many different image formats, and will read the bytes of the file and process that into an array of pixel data. This is what the constructor for Image that takes a file is doing in the background. Once you have an SFML.Graphics.Image instance, you can use its Pixels property to access that decoded RGBA pixel data.
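One quick way to see the difference is to compare lengths. The sketch below assumes decoded pixel data uses 4 bytes per pixel (R, G, B, A), as the answer describes; the class and method names are hypothetical and nothing here calls SFML itself.

```csharp
using System;

public static class IconBytes
{
    // Raw RGBA pixel data uses 4 bytes (red, green, blue, alpha) per pixel.
    public static long ExpectedRgbaLength(uint width, uint height) =>
        (long)width * height * 4;

    // True only when the array has exactly the length raw RGBA data would have.
    // A matching length does not prove the bytes are pixels; a mismatch proves they are not.
    public static bool LooksLikeRawRgba(byte[] data, uint width, uint height) =>
        data.LongLength == ExpectedRgbaLength(width, height);
}
```

A 16x16 icon needs exactly 1024 bytes of pixel data. The array returned by File.ReadAllBytes() for an encoded image file will almost never have that length, because it still contains the header, metadata, and compressed data.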
I have a binary file with a custom formatting that I need to parse and extract information from.
When I open the file with a hex editor, it shows 675 bytes in the file, and the file size in its Windows Properties window is also 675 bytes (Size on disk: 4 KB).
When I use the C# method File.ReadAllBytes(), I get an array of size 1172 bytes.
I cannot make any sense of the bytes present in the array with respect to the bytes in the Hex editor.
Why does C# read in 1172 bytes when the file contains 675 bytes? How can I parse this?
I have recently started learning C# Networking and I was wondering how would you tell if the received Byte array is a file or a string?
A byte array is just a byte array. It's just got data in.
How you interpret that data is up to you. What's the difference between a text file and a string, for example?
Fundamentally, if your application needs to know how to interpret the data, you've got to put that into the protocol.
A byte array is just a byte array. However, you could make the original byte array include a byte that describes what type it is (assuming you are the originator of it). Then you find this descriptor byte and use it to make decisions.
Strings are encoded byte arrays; files can contain strings and/or binary data.
ASCII strings use byte values between 0-127 to represent characters and control codes. For UTF8 people have written validation routines (https://stackoverflow.com/a/892443/884862).
You'd have to check the array for all of the string encoding characteristics before you could assume it's a binary file.
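A rough sketch of those checks, assuming you only care about ASCII and UTF-8: the TextSniffer class is hypothetical, and the UTF-8 check relies on UTF8Encoding's throwOnInvalidBytes option rather than a hand-written validator like the one linked above.

```csharp
using System;
using System.Text;

public static class TextSniffer
{
    // Strict decoder: throwOnInvalidBytes = true makes GetString throw on malformed
    // sequences instead of silently substituting the replacement character.
    private static readonly UTF8Encoding StrictUtf8 =
        new UTF8Encoding(encoderShouldEmitUTF8Identifier: false, throwOnInvalidBytes: true);

    // True when every byte is in the 0-127 range that ASCII uses.
    public static bool IsAscii(byte[] data)
    {
        foreach (byte b in data)
        {
            if (b > 127) return false;
        }
        return true;
    }

    // True when the bytes form well-formed UTF-8.
    public static bool IsValidUtf8(byte[] data)
    {
        try
        {
            StrictUtf8.GetString(data);
            return true;
        }
        catch (DecoderFallbackException)
        {
            return false;
        }
    }
}
```

Passing these checks only means the bytes could be text; short binary payloads can happen to be valid ASCII or UTF-8 as well.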
Edit: Here's an SO question about classifying a file type, "Using .NET, how can you find the mime type of a file based on the file signature not the extension", which uses a signature (the first X bytes) of the file to determine its MIME type.
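To illustrate the signature idea, here is a small sketch that recognizes three well-known signatures (PNG, PDF, ZIP). The FileSignature class is hypothetical and the list is far from complete; a real detector would need many more entries.

```csharp
using System;

public static class FileSignature
{
    private static readonly byte[] Png = { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A };
    private static readonly byte[] Pdf = { 0x25, 0x50, 0x44, 0x46 }; // "%PDF"
    private static readonly byte[] Zip = { 0x50, 0x4B, 0x03, 0x04 }; // "PK" 03 04

    // Returns a MIME type guess based on the first bytes of the data,
    // or "application/octet-stream" when no known signature matches.
    public static string Guess(byte[] data)
    {
        if (StartsWith(data, Png)) return "image/png";
        if (StartsWith(data, Pdf)) return "application/pdf";
        if (StartsWith(data, Zip)) return "application/zip";
        return "application/octet-stream";
    }

    private static bool StartsWith(byte[] data, byte[] prefix)
    {
        if (data.Length < prefix.Length) return false;
        for (int i = 0; i < prefix.Length; i++)
        {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }
}
```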
No, you can't. Data is data; you must layer some form of protocol on top of your network communication. It will need to say something like: "If the first byte I see is a 1, the next four bytes represent an int; if I see a 2, read the next byte, and that is the length of the text string that follows..."
A much easier solution than inventing your own protocol is to use a prebuilt one that gives you a higher-level abstraction, like WCF, so you don't need to deal with byte arrays.
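A minimal sketch of such a hand-rolled protocol follows. The frame layout (one kind byte, a 4-byte length, then the payload) and the names PayloadKind and Framing are assumptions chosen for illustration, not a standard.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical descriptor byte: tells the receiver how to interpret the payload.
public enum PayloadKind : byte { Text = 1, File = 2 }

public static class Framing
{
    // Frame layout: [1 byte kind][4 byte little-endian length][payload bytes].
    public static byte[] Encode(PayloadKind kind, byte[] payload)
    {
        using (var ms = new MemoryStream())
        {
            using (var writer = new BinaryWriter(ms, Encoding.UTF8, leaveOpen: true))
            {
                writer.Write((byte)kind);
                writer.Write(payload.Length); // BinaryWriter writes Int32 as little-endian
                writer.Write(payload);
            }
            return ms.ToArray();
        }
    }

    public static (PayloadKind Kind, byte[] Payload) Decode(byte[] frame)
    {
        using (var reader = new BinaryReader(new MemoryStream(frame)))
        {
            var kind = (PayloadKind)reader.ReadByte();
            int length = reader.ReadInt32();
            byte[] payload = reader.ReadBytes(length);
            if (payload.Length != length)
                throw new InvalidDataException("Frame is shorter than its declared length.");
            return (kind, payload);
        }
    }
}
```

The receiver calls Framing.Decode() and then branches on Kind: decode the payload with the agreed text encoding for Text, or write it to disk for File.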
It's not quite a "file"; an array just contains data. You should loop through that array and write the data out.
Try this:
    foreach (string data in array)
    {
        Console.WriteLine(data);
    }
Now, if it doesn't contain strings but other data, you can simply use:
    foreach (var data in array)
    {
        Console.WriteLine(data.ToString());
    }
My program sends data from one application to another in a byte array. I want to pull sections of the data out to store in different variables. For instance, the first 7 bytes in the array hold the symbol data, and the next section is a number whose length I don't know, because it varies with each message sent. Before I send the data, I separate the sections with commas. My issue is setting up a loop that will stop at the commas so I can copy each section into another variable. If this makes sense, any ideas would help. Thanks.
You need to know what encoding you have, since a comma is not always the same byte value in different encoding schemes. If you want efficiency, you can parse the bytes directly, but decoding to a string first is easier. Alternatively, you could create a class on both ends that has the properties you need and is [Serializable].
If for whatever reason you don't want to do that then you can easily parse the byte array like this:
UTF8Encoding encoding = new UTF8Encoding();
string s = encoding.GetString(byteArray);
string[] values = s.Split(new char[] {','});
//then do something with the values
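If you would rather stay at the byte level, a sketch along the following lines splits the array at each comma byte without decoding it first. It assumes an ASCII-compatible encoding such as UTF-8, where a comma is always the single byte 0x2C; ByteSplitter is a hypothetical helper name.

```csharp
using System;
using System.Collections.Generic;

public static class ByteSplitter
{
    // Splits data at every occurrence of the delimiter byte, like String.Split
    // does for text: empty sections are kept.
    public static List<byte[]> Split(byte[] data, byte delimiter = (byte)',')
    {
        var parts = new List<byte[]>();
        int start = 0;
        while (true)
        {
            // Array.IndexOf returns -1 when no further delimiter exists.
            int end = Array.IndexOf(data, delimiter, start);
            if (end < 0) end = data.Length;

            var part = new byte[end - start];
            Array.Copy(data, start, part, 0, part.Length);
            parts.Add(part);

            if (end == data.Length) break;
            start = end + 1; // skip past the delimiter
        }
        return parts;
    }
}
```

Each returned section can then be decoded or parsed on its own, for example with Encoding.ASCII.GetString() followed by Int32.Parse() for the numeric section.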
The data is just complicated to handle as a byte array, as it's really encoded text. Just decode it (using the encoding that you used to turn it into a byte array) and split it:
string[] parts = Encoding.UTF8.GetString(data).Split(',');
Now you can get each part and parse it:
int symbol = Int32.Parse(parts[0]);
int count = Int32.Parse(parts[1]);
I recommend defining an object model that represents the data that you need to send, and then using some serialization framework to convert this to/from a byte array.
See for example http://msdn.microsoft.com/en-us/library/ms973893.aspx
Another topic which may be interesting for you is data contracts in .NET.
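As one possible shape for that approach, the sketch below uses System.Text.Json (available in .NET Core 3.0 and later) instead of the older serialization framework from the linked article. The SymbolMessage class and its two properties are hypothetical, loosely based on the "symbol plus number" message in the question.

```csharp
using System;
using System.Text.Json;

// Hypothetical object model for one message.
public class SymbolMessage
{
    public string Symbol { get; set; } = "";
    public int Count { get; set; }
}

public static class MessageCodec
{
    // Object -> UTF-8 encoded JSON bytes, ready to send.
    public static byte[] ToBytes(SymbolMessage message) =>
        JsonSerializer.SerializeToUtf8Bytes(message);

    // Received bytes -> object; throws if the payload is the JSON literal null.
    public static SymbolMessage FromBytes(byte[] data) =>
        JsonSerializer.Deserialize<SymbolMessage>(data)
        ?? throw new InvalidOperationException("Payload did not contain a message.");
}
```

Both applications share the SymbolMessage class, so neither side has to look for commas or track field lengths by hand.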
I have a byte[] array and want to write it to stdout: Console.Out.Write(arr2str(arr)). How to convert byte[] to string, so that app.exe > arr.txt does the expected thing? I just want to save the array to a file using a pipe, but encodings mess things up.
I'd later want to read that byte array from stdin: app.exe < arr.txt and get the same thing.
How can I do these two things: write and read byte arrays to/from stdin/stdout?
EDIT:
I'm reading with string s = Console.In.ReadToEnd(), and then System.Text.Encoding.Default.GetBytes(s). I'm converting from array to string with System.Text.Encoding.Default.GetString(bytes), but this doesn't work when used with <,>. By "doesn't work" I mean that writing and reading over a pipe does not return the same thing.
To work with binary files you want Console.OpenStandardInput() to retrieve a Stream that you can read from. This has been covered in other threads here at SO, this one for example: Read binary data from Console.In
If you are writing with Console.WriteLine, you need to encode the data into a printable text format. If you want to output raw binary to a file, you can't use Console.WriteLine.
If you still need to output to the console you either need to open the raw stream with Console.OpenStandardOutput() or call Convert.ToBase64String to turn the byte array to a string. There is also Convert.FromBase64String to come back from base64 to a byte array.
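A small sketch of the raw-stream route: the helpers below work on any Stream, so they can be handed Console.OpenStandardInput() and Console.OpenStandardOutput(). The BinaryPipe class name is hypothetical.

```csharp
using System;
using System.IO;

public static class BinaryPipe
{
    // Reads every byte from the stream until it ends, with no text decoding.
    public static byte[] ReadAll(Stream input)
    {
        using (var buffer = new MemoryStream())
        {
            input.CopyTo(buffer);
            return buffer.ToArray();
        }
    }

    // Writes the bytes unchanged, bypassing Console.Out and its text encoding.
    public static void WriteAll(Stream output, byte[] data)
    {
        output.Write(data, 0, data.Length);
        output.Flush();
    }
}
```

For app.exe > arr.txt, call BinaryPipe.WriteAll(Console.OpenStandardOutput(), arr); for app.exe < arr.txt, call BinaryPipe.ReadAll(Console.OpenStandardInput()). Because no encoding is applied in either direction, the bytes read back should match the bytes written.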