How to store protocol / specification structures - c#

Implementing an ISO Creator ( as an intellectual exercise mostly ) I need to store cd data structure values.
For example, from page 47 of the specification https://www.cs.cmu.edu/~varun/cs315p/iso9660.pdf , in order to store a directory in ISO9660 format I need to store information about various byte fields
There are other tables, such as Path Tables that store fields and the total number of fields would be in hundreds.
So I could either
1 - Store these in .cs files , 'hard coding' essentially
public byte length=1;
public byte ExtendedAttributeLength = 0;
//and so on
2- I could store these as Constants in .cs files
3 -I could store these in XML files
4- I could even store these in a database table
Considering that it is unlikely for these values to change, but not impossible, I'm not sure which way i should store the values.
Thank you

I think you can define all those specs as structs with explicit layout. This way you can specify the offset for each field:
[StructLayout(LayoutKind.Explicit, Size=16, CharSet=CharSet.Ansi)]
public struct DirectoryRecord
{
[FieldOffset(0)]
byte RecordLength;
[FieldOffset(1)]
byte ExtendedAttrRecordLength;
...
}
You can then serialize these to byte arrays and presumably save them as parts of the ISO image header or whatever their place is in there.
However, if you look at one of the existing C# ISO9660 implementations such as .NET DiscUtils on CodePlex you will see that they do things a bit differently.
For DirectoryRecord they have defined a class with ReadFrom and WriteTo methods and it takes care of reading from appropriate offsets in an input stream. This is one option. On top of that they have some other component that reads the file and delegates reading to sub-components such as this one.
So, you could do it like them. Or you could do it with structs as I mentioned before and have them behave like POCOs only, not extra reading and writing logic. You'll have to do that somewhere else.

Related

Writing disparate objects to a single file

This is an offshoot of my question found at Pulling objects from binary file and putting in List<T> and the question posed by David Torrey at Serializing and Deserializing Multiple Objects.
I don't know if this is even possible or not. Say that I have an application that uses five different classes to contain the necessary information to do whatever it is that the application does. The classes do not refer to each other and have various numbers of internal variables and the like. I would want to save the collection of objects spawned from these classes into a single save file and then be able to read them back out. Since the objects are created from different classes they can't go into a list to be sent to disc. I originally thought that using something like sizeof(this) to record the size of the object to record in a table that is saved at the beginning of a file and then have a common GetObjectType() that returns an actionable value as the kind of object it is would have worked, but apparently sizeof doesn't work the way I thought it would and so now I'm back at square 1.
Several options: wrap all of your objects in a larger object and serialize that. The problem with that is that you can't deserialize just one object, you have to load all of them. If you have 10 objects that each is several megs, that isn't a good idea. You want random access to any of the objects, but you don't know the offsets in the file. The "wrapper" objcect can be something as simple as a List<object>, but I'd use my own object for better type safety.
Second option: use a fixed size file header/footer. Serialize each object to a MemoryStream, and then dump the memory streams from the individual objects into the file, remembering the number of bytes in each. Finally add a fixed size block at the start or end of the file to record the offsets in the file where the individual objects begin. In the example below the header of the file has first the number of objects in the file (as an integer), then one integer for each object giving the size of each object.
// Pseudocode to serialize two objects into the same file
// First serialize to memory
byte[] bytes1 = object1.Serialize();
byte[] bytes2 = object2.Serialize();
// Write header:
file.Write(2); // Number of objects
file.Write(bytes1.Length); // Size of first object
file.Write(bytes2.Length); // Size of second object
// Write data:
file.Write(bytes1);
file.Write(bytes2);
The offset of objectN, is the size of the header PLUS the sum of all sizes up to N. So in this example, to read the file, you read the header like so (pseudo, again)
numObjs = file.readInt();
for(i=0..numObjs)
size[i] = file.readInt();
After which you can compute and seek to the correct location for object N.
Third option: Use a format that does option #2 for you, for example zip. In .NET you can use System.IO.Packaging to create a structured zip (OPC format), or you can use a third party zip library if you want to roll your own zip format.

Storing 16 bytes of String array in 4 bytes memory, (compression) in RFID Tags

I hope that this question will not produce some vagueness. Actually I am working on RFID project and I am using Passive Tags. These Tags store only 4 bytes of Data, 32bits. I am trying to store more information in String in Tag's Data Bank. I searched the internet for String compression Algorithms but I didn't find any of them suitable. Someone please guide me through this issue. How can I save more data in this 4 bytes Data Bank, should I use some other strategy for storing, if yes, then what? Moreover, I am using C# on Handheld Window CE device.
I'll appreciate if someone could help me...
It depends on your tag, for example alien tag http://www.alientechnology.com/docs/products/Alien-Technology-Higgs-3-ALN-9662-Short.pdf , has EPC memory , I think you use your EPC memory but You can also use User Memory in your tag. You don't have to compress anything, just use your User Memory. Furthermore, technically I rather not to save many data on my tag, I use my own coding on 32 bit and relates(map) it to the more Data on my Software, and save my data on my Hard Disk. It is more safe too.
There is obviously no compression that can reduce arbitrary 16 byte values to 4 byte values. That's mathematically impossible, check the Pidgeonhole principle for details.
Store the actual data in some kind of database. Have the 4 bytes encode an integer that acts as a key for the row your want to refer to. For example by using an auto-increment primary key, or an index into an array. Works with up to 4 billion rows.
If you have less than 2^32 strings, simply enumerate them and then save the strings index (in your "dictionary") inside your 4 byte "Data Bank".
A compression scheme can't guarantee such high compression ratios.
The only way I can think of with 32-bits is to store an int in the 32-bits, and construct a local/remote URL out of it, which points to the actual data.
You could also make the stored value point to entries in a local look-up table on the device.
Unless you know a lot about the format of your string, it is impossible to do this. This is evident from the pigeonhole principle: you have a theoretical 2^128 different 16-byte strings, but only 2^32 different values to choose from.
In other words, no compression algorithm will guarantee that an arbitrary string in your possible input set will map to a 4-byte value in the output set.
It may be possible to devise an algorithm which will work in your particular case, but unless your data set is sufficiently restricted (at most 1 in 79,228,162,514,264,337,593,543,950,336 possible strings may be valid) and has a meaningful structure, then your only option is to store some mapping externally.

Handling data references/pointers in a file?

I'm working with a binary file format. I'll try to make this as simple as possible because it's quite hard to explain.
The data structures that get written to the file may contain 'pointers' (ie. a pointer to a string that is in another location in the file, or a pointer to another structure within the file. A better word for 'pointer' would be 'offset', ie. a structure contains the OFFSET of the string within the file).
A quick example:
struct ExampleStruct {
public string Text;
public int Number;
};
The 'Text' string member will be written at the beginning of the file, and NOT be included in the serialized struct.
So, essentially, the struct will look like this in the file:
struct ExampleStruct {
public uint TextLocationOffset;
public int Number;
};
...'TextLocationOffset' is an offset to where the string 'Text' is located within the file.
So, after I have that, I then need a "relocation table" - essentially a list of double pointers that point to data pointers within the file. (does that make sense?)
So, since I wrote that ExampleStruct to my file, and it contains a 'pointer' (TextLocationOffset), my "relocation table" would consist of:
public uint TextLocationOffset_LocationOffset;
...'TextLocationOffset_LocationOffset' contains the OFFSET of 'TextLocationOffset' within the file.
Does that all make sense? I tried to simplify it as much as possible.
My problem is, how would I keep track of all the pointers/double pointers/relocations in C#? Data is constantly being added to the byte[] array that I have, so offsets will be changing constantly.
This is easy in C++, because I can get a double pointer of whatever is being 'relocated', and then I can change the original 'pointer' (in my example, 'TextLocationOffset') to the correct offset, and I can then find the location of the 'TextLocationOffset' value and add that to my relocation table.
Sorry if that makes no sense. I tried asking this a few weeks ago but got no replies, I might be making it sound confusing.
I just need a way to keep track of all of these in my code... Any tips?
P.S. If you need more thorough examples I'll be happy to provide. :)
Use a database table - all this work has been done.
You may want to look at other ways to serialize and deserialize your data. Keeping track of relocations and offsets in a managed application is doing the unnecessary unless you have extremely exceptional scenarios. SO users could better guide you if you let us know more about what you are trying to achieve in terms of functionality.

Converting Delphi variant record to C# struct

As I am trying to code a C# application from an existing application but developed in Delphi,
Very tough but managed some how till, but now I have come across a problem...
Delphi code contains following code:
type
TFruit = record
name : string[20];
case isRound : Boolean of // Choose how to map the next section
True :
(diameter : Single); // Maps to same storage as length
False :
(length : Single; // Maps to same storage as diameter
width : Single);
end;
i.e. a variant record (with case statement inside) and accordingly record is constructed and its size too.
On the other hand I am trying to do the same in C# struct, and haven't succeeded yet, I hope somemone can help me here.
So just let me know if there's any way I can implement this in C#.
Thanks in advance....
You could use an explicit struct layout to replicate this Delphi variant record. However, I would not bother since it seems pretty unlikely that you really want assignment to diameter to assign also to length, and vice versa. That Delphi record declaration looks like it dates from mid-1990s style of Delphi coding. Modern Delphi code would seldom be written that way.
I would just do it like this:
struct Fruit
{
string name;
bool isRound;
float diameter; // only valid when isRound is true
float length; // only valid when isRound is false
float width; // only valid when isRound is false
}
A more elegant option would be a class with properties for each struct field. And you would arrange that the property getters and setters for the 3 floats raised exceptions if they were accessed for an invalid value of isRound.
Perhaps this will do the trick?
This is NOT a copy-and-paste solution, note that the offsets and data sizes may need to be changed depending on how the Delphi structure is declared and/or aligned.
[StructLayout(LayoutKind.Explicit)]
unsafe struct Fruit
{
[FieldOffset(0)] public fixed char name[20];
[FieldOffset(20)] public bool IsRound;
[FieldOffset(21)] public float Diameter;
[FieldOffset(21)] public float Length;
[FieldOffset(25)] public float Width;
}
It depends on what you are trying to do.
If you're simply trying to make a corresponding structure then look at David Heffernan's answer. These days there is little justification for mapping two fields on top of each other unless they truly represent the same thing. (Say, individual items or the same items in an array.)
If you're actually trying to share files you need to look along the lines of ananthonline's answer but there's a problem with it that's big enough I couldn't put it in a comment:
Not only is there the Unicode issue but a Delphi shortstring has no corresponding structure in C# and thus it's impossible to simply map a field on top of it.
That string[20] actually comprises 21 bytes, a one-byte length code and then 20 characters worth of data. You have to honor the length code as there is no guarantee of what lies beyond the specified length--you're likely to find garbage there. (Hint: If the record is going to be written to disk always zap the field before putting new data in it. It makes it much easier to examine the file on disk when debugging.)
Thus you need to declare two fields and write code to process it on both ends.
Since you have to do that anyway I would go further and write code to handle the rest of it so as to eliminate the need for unsafe code at all.

MongoDB GridFS bucket?

I work with MongoDB C# Samus driver.
One of constructors of class MongoDB.GridFS.GridFile has parameter "bucket". When i create GridFile in Java like example i cannot set this "bucket". But i can set this "bucket" in Java when create GridFS object Java documentation. I'm confused!
My question:
What is "bucket"? For what? Tell please some use cases;)
Bucket is base name for files and chunks collections. By default bucket is 'fs' so you will have two collections:
fs.files will store file properties like id, name, size, chunk size, md5 checksum etc.
fs.chunks will store the actual binary data split into chunks, one per document.
Using GridFS class constructor argument you can set arbitrary bucket name.
Different buckets can be useful if you need to have separate collections for different types of files, so you can apply different indexes, sharding etc.

Categories