I have a flat file in which the data is not delimited in any way.
The file contains one large string, and each row is represented by 180 characters.
Each column value is defined by a fixed number of characters.
I have to create an object for each row, parse the 180 characters, and fill
the properties of the created object with the parsed values.
How can I solve this problem without calling Substring over and over?
Maybe there is a nice solution with LINQ?
Thanks a lot.
Solution 1 - Super fast but unsafe:
Create your class with [StructLayout(LayoutKind.Sequential)] and the other marshalling attributes that specify the field lengths. Your strings will be char arrays but can be exposed as strings after loading.
Read 180 bytes into a byte array of the same size and pin it inside a fixed block.
Convert the pointer to IntPtr and use Marshal.PtrToStructure() to load an object of your class.
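A minimal sketch of Solution 1, under some assumptions that are mine rather than the answer's: the record is shortened to a hypothetical 15-byte layout (a 10-byte name and a 5-byte code), the fields are declared as fixed-size byte arrays instead of char arrays because the file holds ASCII bytes, and the buffer is pinned with GCHandle so that no unsafe block is needed. RawRecord and RawRecordLoader are illustrative names.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

// Hypothetical 15-byte record: 10 bytes of name followed by 5 bytes of code.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct RawRecord
{
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 10)]
    public byte[] NameBytes;

    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 5)]
    public byte[] CodeBytes;

    // Expose the raw bytes as strings after loading (trailing padding removed).
    public string Name => Encoding.ASCII.GetString(NameBytes).TrimEnd();
    public string Code => Encoding.ASCII.GetString(CodeBytes).TrimEnd();
}

public static class RawRecordLoader
{
    public static RawRecord Load(byte[] buffer)
    {
        if (buffer == null || buffer.Length < Marshal.SizeOf<RawRecord>())
            throw new ArgumentException("Buffer is shorter than one record.");

        // Pin the managed array so the marshaller can read from a stable address.
        GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
        try
        {
            return Marshal.PtrToStructure<RawRecord>(handle.AddrOfPinnedObject());
        }
        finally
        {
            handle.Free();
        }
    }
}
```

For the real file you would declare one field per column so that the sizes add up to 180 bytes.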
Solution 2 - Loading logic in the class:
Create a constructor in your class that accepts a byte[] and populates the properties inside the object using Convert.ToXxx or Encoding.ASCII.GetString(), assuming the data is ASCII.
Read 180 bytes, create an object, and pass the bytes to the constructor.
If you have to serialise back to byte[], implement a ToByteArray() method and use Encoding.ASCII.GetBytes() to write the values back to bytes.
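Solution 2 could look roughly like this. The layout is an assumption made for illustration (a 10-character name followed by a 6-digit, zero-padded, non-negative amount, 16 bytes in total), and CustomerRecord is a hypothetical class name.

```csharp
using System;
using System.Globalization;
using System.Text;

public sealed class CustomerRecord
{
    // Hypothetical layout: bytes 0-9 hold the name, bytes 10-15 hold the amount.
    private const int NameOffset = 0;
    private const int NameLength = 10;
    private const int AmountOffset = 10;
    private const int AmountLength = 6;
    public const int RecordLength = 16;

    public string Name { get; }
    public int Amount { get; }

    public CustomerRecord(byte[] raw)
    {
        if (raw == null || raw.Length != RecordLength)
            throw new ArgumentException("Expected exactly " + RecordLength + " bytes.");

        // Decode each field straight from its offset; no intermediate row string.
        Name = Encoding.ASCII.GetString(raw, NameOffset, NameLength).TrimEnd();
        Amount = int.Parse(
            Encoding.ASCII.GetString(raw, AmountOffset, AmountLength),
            CultureInfo.InvariantCulture);
    }

    public byte[] ToByteArray()
    {
        // Pad or truncate the name to its column width, zero-pad the amount.
        // Assumes Amount is non-negative and fits in AmountLength digits.
        string name = Name.PadRight(NameLength).Substring(0, NameLength);
        string amount = Amount.ToString(CultureInfo.InvariantCulture)
                              .PadLeft(AmountLength, '0');
        return Encoding.ASCII.GetBytes(name + amount);
    }
}
```

The offsets live in one place as named constants, so a change to the file layout only touches the top of the class.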
Enhancement to solution 2:
Create custom attributes and decorate your classes with those so that you can have a factory that reads metadata and inflates your objects using byte array for you. This is most useful if you have more than a couple of such classes.
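One way the attribute idea might be sketched is shown here. FixedFieldAttribute, Person and FixedWidthFactory are hypothetical names, the data is assumed to be ASCII, and the factory relies on Convert.ChangeType, so it only handles simple property types such as string and int.

```csharp
using System;
using System.Globalization;
using System.Reflection;
using System.Text;

// Hypothetical attribute describing where a property lives inside the record.
[AttributeUsage(AttributeTargets.Property)]
public sealed class FixedFieldAttribute : Attribute
{
    public int Offset { get; }
    public int Length { get; }

    public FixedFieldAttribute(int offset, int length)
    {
        Offset = offset;
        Length = length;
    }
}

// Example record class decorated with the metadata.
public sealed class Person
{
    [FixedField(0, 10)] public string Name { get; set; } = "";
    [FixedField(10, 3)] public int Age { get; set; }
}

public static class FixedWidthFactory
{
    public static T Inflate<T>(byte[] raw) where T : class, new()
    {
        var result = new T();
        foreach (PropertyInfo prop in typeof(T).GetProperties())
        {
            var field = prop.GetCustomAttribute<FixedFieldAttribute>();
            if (field == null) continue; // property is not part of the record

            string text = Encoding.ASCII
                .GetString(raw, field.Offset, field.Length)
                .Trim();

            // Convert the text to the property's type (string, int, ...).
            object value = Convert.ChangeType(
                text, prop.PropertyType, CultureInfo.InvariantCulture);
            prop.SetValue(result, value);
        }
        return result;
    }
}
```

Reflection on every row is slow, so for a large file you would probably cache the property and attribute lookup per type.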
Alternative to solution 2:
You may pass a stream instead of a byte array, which is faster. Here you would use BinaryReader and BinaryWriter to read and write values. Strings are a bit tricky, however, since BinaryWriter.Write(string) also writes the length as a prefix.
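Because BinaryReader.ReadString() expects that length prefix, a fixed-width file is probably easier to handle by reading the raw bytes of each field and decoding them yourself. A small sketch, assuming ASCII data; FixedWidthStreamReader and ReadField are made-up names:

```csharp
using System.IO;
using System.Text;

public static class FixedWidthStreamReader
{
    // Reads exactly 'length' bytes from the reader and decodes them as ASCII.
    public static string ReadField(BinaryReader reader, int length)
    {
        byte[] bytes = reader.ReadBytes(length);
        if (bytes.Length < length)
            throw new EndOfStreamException("Stream ended in the middle of a field.");

        return Encoding.ASCII.GetString(bytes).TrimEnd();
    }
}
```

A constructor that takes a BinaryReader can then call ReadField once per column, in file order.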
Use a StringReader to parse your text, then you won't have to use substring. Linq won't help you here.
I agree with OJ, but even with StringReader you will still need the position of each individual value to parse it out of the string. There is nothing wrong with Substring; just make sure you use named constants for the start index and length of each field. Example:
private const int VAR_START_INDEX = 0;
private const int VAR_LENGTH = 4; // Substring takes a length, not an end index
string data = "thisisthedata";
string value = data.Substring(VAR_START_INDEX, VAR_LENGTH);
// value is now "this"
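LINQ does not remove the need for Substring, but it can at least cut the one large string into rows. A minimal sketch, with RowSplitter as a hypothetical helper name; for the file in the question rowLength would be 180:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RowSplitter
{
    // Yields consecutive rows of 'rowLength' characters.
    // A trailing partial row, if any, is ignored.
    public static IEnumerable<string> SplitRows(string data, int rowLength)
    {
        if (data == null) throw new ArgumentNullException(nameof(data));
        if (rowLength <= 0) throw new ArgumentOutOfRangeException(nameof(rowLength));

        return Enumerable.Range(0, data.Length / rowLength)
                         .Select(i => data.Substring(i * rowLength, rowLength));
    }
}
```

Each row can then be handed to whatever per-row parser you choose, for example with SplitRows(text, 180).Select(row => Parse(row)), where Parse is your own method.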
This library can help you http://f2enum.codeplex.com/
I have recently started learning C# Networking and I was wondering how would you tell if the received Byte array is a file or a string?
A byte array is just a byte array. It's just got data in.
How you interpret that data is up to you. What's the difference between a text file and a string, for example?
Fundamentally, if your application needs to know how to interpret the data, you've got to put that into the protocol.
A byte array is just a byte array. However, you could make the original byte array include a byte that describes what type it is (assuming you are the originator of it). Then you find this descriptor byte and use it to make decisions.
Strings are encoded byte arrays; files can contain strings and/or binary data.
ASCII strings use byte values in the range 0-127 to represent characters and control codes. For UTF-8, people have written validation routines (https://stackoverflow.com/a/892443/884862).
You'd have to check the array for all of the string encoding characteristics before you could assume it's a binary file.
Edit: Here's an SO question about classifying a file type, "Using .NET, how can you find the mime type of a file based on the file signature not the extension", which uses a signature (the first X bytes) of the file to determine its MIME type.
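As a rough illustration of such a check, .NET's own UTF8Encoding can be told to throw on invalid byte sequences. This is only a heuristic sketch with a made-up helper name (TextSniffer): bytes that decode cleanly are not proven to be text, they are merely not ruled out.

```csharp
using System.Text;

public static class TextSniffer
{
    // Returns true when the bytes form a valid UTF-8 sequence.
    public static bool LooksLikeUtf8(byte[] data)
    {
        // Second argument: throw instead of silently substituting U+FFFD.
        var strict = new UTF8Encoding(
            encoderShouldEmitUTF8Identifier: false,
            throwOnInvalidBytes: true);
        try
        {
            strict.GetString(data);
            return true;
        }
        catch (DecoderFallbackException)
        {
            return false;
        }
    }
}
```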
No, you can't. Data is data; you must layer some form of protocol on top of your network communication. It will need to say something like: "If the first byte I see is a 1, the next four bytes represent an int; if I see a 2, read the next byte, and that is the length of the text string that follows..."
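A toy version of that kind of protocol might look like the following. The tag values, the TaggedMessage class and the choice of UTF-8 are all assumptions for illustration, and a real protocol would also have to handle partial reads from the socket.

```csharp
using System;
using System.IO;
using System.Text;

public static class TaggedMessage
{
    public const byte IntTag = 1;   // followed by 4 bytes of int
    public const byte TextTag = 2;  // followed by 1 length byte, then the text

    public static byte[] EncodeInt(int value)
    {
        byte[] payload = BitConverter.GetBytes(value); // machine byte order
        byte[] message = new byte[1 + payload.Length];
        message[0] = IntTag;
        payload.CopyTo(message, 1);
        return message;
    }

    public static byte[] EncodeText(string text)
    {
        byte[] payload = Encoding.UTF8.GetBytes(text);
        if (payload.Length > byte.MaxValue)
            throw new ArgumentException("Text too long for a one-byte length.");

        byte[] message = new byte[2 + payload.Length];
        message[0] = TextTag;
        message[1] = (byte)payload.Length;
        payload.CopyTo(message, 2);
        return message;
    }

    // The first byte tells the receiver how to interpret the rest.
    public static object Decode(byte[] message)
    {
        switch (message[0])
        {
            case IntTag:
                return BitConverter.ToInt32(message, 1);
            case TextTag:
                return Encoding.UTF8.GetString(message, 2, message[1]);
            default:
                throw new InvalidDataException("Unknown type tag " + message[0]);
        }
    }
}
```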
A much easier solution than inventing your own protocol is to use a prebuilt one that gives you a higher-level abstraction, like WCF, so you don't need to deal with byte arrays.
It's not quite a "file"; an array just contains data. You should loop through that array and write out the data.
Try this:
foreach (string data in array)
{
    Console.WriteLine(data);
}
If it doesn't contain strings but other data, you can simply use:
foreach (var data in array)
{
    Console.WriteLine(data.ToString());
}
My program sends data from one application to another in a byte array. I want to pull sections of the data out to store in different variables. For instance, the first 7 bytes in the byte array hold the symbol data, and the next section is a number whose length I don't know because it varies with each message. Before I send the data, I separate the sections with commas. My issue is setting up a loop that stops at the commas so I can add the data to another variable. If this makes sense, any ideas will help. Thanks.
You need to know what encoding you have, since a comma is not always the same byte value in different encoding schemes. If you want efficiency, you can parse the byte array directly as bytes, but decoding it to a string first is easier. Also, you could create a class on both ends that has the properties you need and is [Serializable].
If for whatever reason you don't want to do that then you can easily parse the byte array like this:
UTF8Encoding encoding = new UTF8Encoding();
string s = encoding.GetString(byteArray);
string[] values = s.Split(new char[] {','});
//then do something with the values
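If you do want to stay on the byte level, the loop that stops at the commas could be sketched like this. It assumes an encoding in which a comma is the single byte 0x2C (true for ASCII and UTF-8); ByteSplitter is a hypothetical helper name.

```csharp
using System;
using System.Collections.Generic;

public static class ByteSplitter
{
    // Splits the array at every comma byte and returns the sections in order.
    public static List<byte[]> SplitOnComma(byte[] data)
    {
        const byte Comma = (byte)','; // 0x2C in ASCII and UTF-8
        var parts = new List<byte[]>();
        int start = 0;

        while (true)
        {
            // Find the next comma at or after 'start'; -1 means none left.
            int comma = Array.IndexOf(data, Comma, start);
            int end = comma < 0 ? data.Length : comma;

            byte[] part = new byte[end - start];
            Array.Copy(data, start, part, 0, part.Length);
            parts.Add(part);

            if (comma < 0) break;
            start = comma + 1; // continue after the comma
        }
        return parts;
    }
}
```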
The data is just complicated to handle as a byte array, as it's really encoded text. Just decode it (using the encoding that you used to turn it into a byte array) and split it:
string[] parts = Encoding.UTF8.GetString(data).Split(',');
Now you can get each part and parse them:
int symbol = Int32.Parse(parts[0]);
int count = Int32.Parse(parts[1]);
I recommend defining an object model that represents the data that you need to send, and then using some serialization framework to convert this to/from a byte array.
See for example http://msdn.microsoft.com/en-us/library/ms973893.aspx
Another topic that may be interesting for you is data contracts in .NET.
What is the safest way to guarantee that the following operation will be performed correctly:
When I read in 4 bytes as a uint32, I will write it out to a text file.
Later I will open this text file, read the number I wrote out previously, and then convert it back into the 4 bytes for use in other processing.
There is the BitConverter class to help you convert between primitive types and bytes.
Since you are storing this as a string, there isn't a whole lot to this. Obviously there is no issue converting the number into a string using .ToString(). So the only question I assume is how to go back in a reliable fashion. The solution is to use uint.Parse. i.e.:
var s = "1234363242"; // must fit in a uint (at most 4294967295)
uint i = uint.Parse(s);
(PS: BitConverter is not helpful for conversion from strings)
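Putting the two halves together, the whole round trip might be sketched as below. UInt32TextRoundTrip is a made-up name, and the sketch assumes the text is written and read on machines with the same byte order, since BitConverter uses the machine's native order.

```csharp
using System;
using System.Globalization;

public static class UInt32TextRoundTrip
{
    // 4 bytes -> uint -> decimal text (culture-independent).
    public static string ToText(byte[] fourBytes)
    {
        return BitConverter.ToUInt32(fourBytes, 0)
                           .ToString(CultureInfo.InvariantCulture);
    }

    // decimal text -> uint -> 4 bytes.
    public static byte[] FromText(string text)
    {
        return BitConverter.GetBytes(
            uint.Parse(text, CultureInfo.InvariantCulture));
    }
}
```

uint.Parse throws an OverflowException if the text holds a number outside the uint range, which is a useful safety check if the text file may have been edited by hand.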
I'm trying to create a class to manage the opening of a certain file. I would like one of the properties to be a byte array of the file, but I don't know how big the file is going to be. I tried declaring the byte array as:
public byte[] file;
...but it won't allow me to set it the ways I've tried. br is my BinaryReader:
file = br.ReadBytes(br.BaseStream.Length);
br.Read(file,0,br.BaseStream.Length);
Neither way works. I assume it's because I have not initialized my byte array, but I don't want to give it a size if I don't know the size. Any ideas?
edit: Alright, I think it's because the Binary Reader's BaseStream length is a long, but its readers take int32 counts. If I cast the 64s into 32s, is it possible I will lose bytes in larger files?
I had no problems reading a file stream:
byte[] file;
using (var br = new BinaryReader(new FileStream("c:\\Intel\\index.html", FileMode.Open)))
{
    file = br.ReadBytes((int)br.BaseStream.Length);
}
Your code doesn't compile because the Length property of BaseStream is of type long but you are trying to use it as an int. An implicit conversion that might lose data is not allowed, so you have to cast it to int explicitly.
Update
Just bear in mind that the code above aims to highlight your original problem and should not be used as it is. Ideally, you would use a buffer to read the stream in chunks. Have a look at this question and the solution suggested by Jon Skeet
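A generic sketch of reading in chunks follows. It is not necessarily the linked solution; StreamUtil and the 16 KB buffer size are arbitrary choices for illustration.

```csharp
using System.IO;

public static class StreamUtil
{
    // Reads the stream to its end without needing to know its length up front.
    public static byte[] ReadFully(Stream input)
    {
        byte[] buffer = new byte[16 * 1024];
        using (var result = new MemoryStream())
        {
            int read;
            // Read returns 0 once the end of the stream has been reached.
            while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
            {
                result.Write(buffer, 0, read);
            }
            return result.ToArray();
        }
    }
}
```

On .NET 4 and later, input.CopyTo(result) does the same loop for you.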
You can't create an array of unknown size, so size it from the stream length:
byte[] file = new byte[br.BaseStream.Length];
PS: For larger files, you should read the bytes repeatedly in chunks.
BinaryReader.ReadBytes returns a byte[]. There is no need to initialize a byte array because that method already does so internally and returns the complete array to you.
If you're looking to read all the bytes from a file, there's a handy method in the File class:
http://msdn.microsoft.com/en-us/library/system.io.file.readallbytes.aspx
I have a BigInteger serialized to a file by a Java program using the writeObject method from ObjectOutputStream.
Can I deserialize it in C#? I tried using the java.math and java.io classes of vjslib, but I get an exception:
InvalidClassException
the class does not match the class of the persisted object for cl = java.lang.Number : __SUID = -8742448824652078965, getSUID(cl) = 3166984097235214156
Any ideas?
Do you have control over the serialization step from Java?
If so, I would suggest serializing a byte array, either as binary, or base64, and reading the byte array from the serialized structure.
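Then you can pass the byte array to the System.Numerics.BigInteger constructor. Note that Java's BigInteger.toByteArray emits big-endian bytes whereas the .NET constructor expects little-endian, so reverse the array first.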
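A minimal sketch of the C# side, assuming the Java program wrote the raw output of BigInteger.toByteArray() (big-endian two's complement) and you have already read those bytes from the file. The .NET constructor reads little-endian two's complement, so the sketch reverses a copy of the array; JavaBigIntegerBytes is a hypothetical helper name.

```csharp
using System;
using System.Numerics;

public static class JavaBigIntegerBytes
{
    // Converts bytes produced by Java's BigInteger.toByteArray() into a
    // System.Numerics.BigInteger.
    public static BigInteger FromJavaBytes(byte[] bigEndian)
    {
        // Work on a copy so the caller's array is left untouched.
        byte[] littleEndian = (byte[])bigEndian.Clone();
        Array.Reverse(littleEndian);
        return new BigInteger(littleEndian);
    }
}
```

Both formats are signed, so negative values survive the conversion as long as the bytes are not truncated.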
If you don't mind ugly hacks: I'd say the easiest (albeit not most efficient) way would be to just write it out as an ASCII String on the Java side, and parse that string on the C# side, instead of using binary de/serialization.
I suggest you don't use serialization for this, since the two versions of BigInteger are not compatible - they have different version ids.
You should write the object out in some other way, probably using the byte array from BigInteger.toByteArray
Reading this question about serialization might also be insightful for you.