XML to LINQ Question/s - c#

Just before I begin heres a small overview of what I'm trying to achieve and then we'll get down to the gory details. At present I'm developing an application which will monitor a users registry for changes to specific keys which relate to user preferences. Were currently using mandatory profiles (not my choice), anyway the whole idea is to record the changes to a location where they can be writen back to a users registry next time they log on.
At the moment I have the system monitoring registry changes and firing events returning the key, value name and value that have changed. I was entering these into a list to create a single string containing all the data, then writing that list to a text file every so often. Now this has all been fine but I need to change the way data's held as breaking the strings down into key, value name and value again for the write back to registry requires too much overhead and theres also problems breaking the strings up in a uniquely identifiable fashion.
So it was suggested to me to look at XML, which I haven't used before and I've begun investigating it and it all looks simple enough, I've also used LINQ before to connect to embedded databases. What I'm currently struggling to get my head around is how LINQ is able to retrieve and manipulate the data in memory from XML, as I don't want to be constantly accessing the XML file due to a need to keep the application as quick as possible. At present all changes in the registry are cached into a List(String) then written to a text file every minute or so.
At the moment what I have is the system returning the key, value name and value in different strings, converging these into a single List(String) value, where as what I'm going to need is table or equivalent representing a key, which contains multiple value names with each value name containing a single value and finally a type (this wil be a number representing what kind of registry value this is, REG SZ, REG BINARY etc). Both in the XML file and the program it self.
Also what I don't quite get is unlike a database the tables and there schemas won't exist until the program first runs as it will create a new XML file rather than it already existing. This is due to the information being writen back to the users personal drive, so it has to be created when it first runs on the users machine.
I've tried a few links and tutorials etc but nothing has clicked just yet, so if you have an example or could maybe explain it to me a little better it would be appreciated.
Just one final bit I want to add is that my current idea for storing the data in program is to create a List of values, embedded in a List of value names and a list of value names embedded in a list of keys. Does that sound ok?
Now I know this is long, and kind of all over the place, so if someone could help it would be appreciated or if you require further information of clarification please let me know and I'll try my best.
Thanks

From what I understand, you are just thinking in the wrong direction. Your application does not want to manipulate XML in memory. You just want to work with some data structure in memory and would like to have an easy way to store it to disc and to read it back? If I understand that right:
Don't care about LINQ for XML. Just have a look at the build in XML serialization infrastructure. Then build an internal data structure which fits your applications needs and use an XmlSerializer to write it to disc and to read it back. No need to touch any XML by hand!

From what you describe it does seem like a good idea to use XML here.
As for accessing the XML dta in memory, I found the MSDN documentation quite helpful:
http://msdn.microsoft.com/en-us/library/bb387098.aspx
The basic idea is, LINQ-to-XML is just LINQ-to-Objects, working with objects that represent the XML elements.
I'm afraid I don't quite get your second question.

Related

C# Winform database

I am building a Winform application that need a database.
The database needs to save an array of items of a custom class:
Name
Date
Duration
Artist
Genre
If I should build the database using a file that every time, when I increase the array, I will save. Is there wait time to save an array of 300 or so items?
And the second database is to use SQL.
What is the difference between them? And what should I use?
As someone mentioned in a comment, SQLite should work very well for this type of scenario.
If you think your data set will remain fairly small, you might consider XML, or a file, or something else if you think that would be quicker/easier.
In any case, I would strongly recommend that you hide your storage-logic behind an interface, and call only that from the winforms part of your application. This way you will be able to replace your storage-solution later if you should need to.
Update in response to comment: The reason for using SQLite instead of another DB System is that SQLite can be integrated directly into your application. Other DBMS`s will typically be external systems, that you just connect to from within your app.
A quick google search will provide you lots of info, such as this short article about using SQLite within a C# application.
I think you have to think about the futured size of your data.
If you know that i future the data will grow up exponentially, i think you have to use a database System like SQL.
Otherwise if it is only for a few records, you can use a XML File instead.
If you are using a MS SQL Database, you can open a Connection while saving your data, and write it with a sqladapter into the database.
If you are using a XML file instead, you can use the XMLSerializer class for serialization of your own Business object.
File vs database? - it is easy. What is database - it is a file. Only it has an engine that knows how to manipulate that file.
If you use file, you suddenly need to think, "what if?". What if file gets corrupted during write. Or what if computer shuts down in the middle of write? DBMS takes care of this issues by issuing all sorts of mechanisms such as uncommitted data files, etc. Now you will need to provide this mechanism yourself.
This is why you should write to file only non-critical data. For example, some user settings. Because if you lost that file, user can re-size controls again but no data will be at loss. Or log file is another good use of file. Because if you lose a log, you can live without. But if you lose months of worth of data...
In your case, I don't know, how user history is important. 300 items is not a large array. You can use XML by creating an object (class) and mark its properties with XML attributes and then use XML serializer to serialize your history into XML
http://msdn.microsoft.com/en-us/library/system.xml.serialization.xmlserializer.aspx
But if it is going to grow and you not planning to age some of it and delete, look into RDBMS.

Is File Reference for Deserialization acceptable in a database?

I have a situation where I need to store some data that just won't ...really fit into a database table. It is a little too abstract, and I've not enough knowledge to piecemeal it in such a way that it could be broken into tables and columns. The object in question is a System.Linq.Expressions.Expression<T>.
I have discovered a means of serializing such to xml using MetaLinq. and it works pretty well, albeit the xml it generates is excessively obese, I somewhat expected this much from something as complicated as an Expression. A modest expression turns out to around 19 kb.
So my thought was to use gzip compression on the file. This works well, it saves it to about 2 kb.
So then, my actual question is this : is it bad practice or 'dangerous' practice to basically use a table column to reference a filename for deserialization for an object? Like I would have a table for expressions, and it would have a filename, when that expression was called it would perform the gzip decompression, deserialize it, and return the object.
This seems like the ideal solution but it requires a lot of File I/O and a lot of various compression/zipping/serialization. I'm wondering if I could get the opinion of more experienced database admins out there. I am using Fluent nHibernate as my ORM mapper.
MetaLinq on codeplex
Not an experienced DBA, but I would store the serialized data in a BLOB field in the database. Database backups do no good if the files your data is depending on go away or vice versa. I think it would simplify things to just keep it all together. And the blob works fine since the data you are storing does not need to be queried.
Depends on the size of the data.
Sql has an XML data type for table columns now. So you could deserialize the object and then insert the whole object in the column again depending on size.
But if you must use the file system I would store a path and the file name in the column.
In your programs app.config keep the root of the drive like \\MyDrive or d:\
That way if information moves, just change the app config as long as the folder/file structure stays the same.
Edit:
Along with NerdFury suggestion you could you a binary serializer if you do not need to "see" the data in the database. XML serialization at least makes it readable

Efficient way to analyze large amounts of data?

I need to analyze tens of thousands of lines of data. The data is imported from a text file. Each line of data has eight variables. Currently, I use a class to define the data structure. As I read through the text file, I store each line object in a generic list, List.
I am wondering if I should switch to using a relational database (SQL) as I will need to analyze the data in each line of text, trying to relate it to definition terms which I also currently store in generic lists (List).
The goal is to translate a large amount of data using definitions. I want the defined data to be filterable, searchable, etc. Using a database makes more sense the more I think about it, but I would like to confirm with more experienced developers before I make the changes, yet again (I was using structs and arraylists at first).
The only drawback I can think of, is that the data does not need to be retained after it has been translated and viewed by the user. There is no need for permanent storage of data, therefore using a database might be a little overkill.
It is not absolutely necessary to go a database. It depends on the actual size of the data and the process you need to do. If you are loading the data into a List with a custom class, why not use Linq to do your querying and filtering? Something like:
var query = from foo in List<Foo>
where foo.Prop = criteriaVar
select foo;
The real question is whether the data is so large that it cannot be loaded up into memory confortably. If that is the case, then yes, a database would be much simpler.
This is not a large amount of data. I don't see any reason to involve a database in your analysis.
There IS a query language built into C# -- LINQ. The original poster currently uses a list of objects, so there is really nothing left to do. It seems to me that a database in this situation would add far more heat than light.
It sounds like what you want is a database. Sqlite supports in-memory databases (use ":memory:" as the filename). I suspect others may have an in-memory mode as well.
I was facing the same problem that you faced now while I was working on my previous company.The thing is I was looking a concrete and good solution for a lot of bar code generated files.The bar code generates a text file with thousands of records with in a single file.Manipulating and presenting the data was so difficult for me at first.Based on the records what I programmed was, I create a class that read the file and loads the data to the data table and able to save it in database. The database what I used was SQL server 2005.Then I able to manage the saved data easily and present it which way I like it.The main point is read the data from the file and save to it to the data base.If you do so you will have a lot of options to manipulate and present as the way you like it.
If you do not mind using access, here is what you can do
Attach a blank Access db as a resource
When needed, write the db out to file.
Run a CREATE TABLE statement that handles the columns of your data
Import the data into the new table
Use sql to run your calculations
OnClose, delete that access db.
You can use a program like Resourcer to load the db into a resx file
ResourceManager res = new ResourceManager( "MyProject.blank_db", this.GetType().Assembly );
byte[] b = (byte[])res.GetObject( "access.blank" );
Then use the following code to pull the resource out of the project. Take the byte array and save it to the temp location with the temp filename
"MyProject.blank_db" is the location and name of the resource file
"access.blank" is the tab given to the resource to save
If the only thing you need to do is search and replace, you may consider using sed and awk and you can do searches using grep. Of course on a Unix platform.
From your description, I think linux command line tools can handle your data very well. Using a database may unnecessarily complicate your work. If you are using windows, these tools are also available by different ways. I would recommend cygwin. The following tools may cover your task: sort, grep, cut, awk, sed, join, paste.
These unix/linux command line tools may look scary to a windows person but there are reasons for people who love them. The following are my reasons for loving them:
They allow your skill to accumulate - your knowledge to a partially tool can be helpful in different future tasks.
They allow your efforts to accumulate - the command line (or scripts) you used to finish the task can be repeated as many times as needed with different data, without human interaction.
They usually outperform the same tool you can write. If you don't believe, try to beat sort with your version for terabyte files.

How can I save large amounts of data in C#?

I'm writing a program in C# that will save lots of data points and then later make a graph. What is the best way to save these points?
Can I just use a really long array or should I use a text file or excel file or something like that?
Additional information: It probably wont be more than a couple thousand. And it would be good if I could access it from a windows mobile app. Basically a user will be able to save times that something happens at, and then the app will use the data to find a cross correlation.
If it's millions or even thousands of records, I would probably look at using a database. You can get SQL Server 2008 Express for free, or use MySQL, or something like that.
If you go that route, LINQ to SQL makes database access a piece of cake in .NET. Entity Framework is also available, but LINQ to SQL probably has a quicker time-to-implement.
If you use a text file or excel file, etc. You'll still need to load them back into memory to plot the graph.
So if you're collecting data over a long period of time, or you want to plot the graph some time in the future, write them to a plain text file. When you're ready to plot the graph, load the file up and plot the graph.
If the data collection is within a short period of time, don't bother writing to a file - it'll just add steps to the process for nothing.
A really easy way of doing this would be to serialize your object list into a BinaryWriter or XMLWriter, which automatically format your data into a readable and writable format so that, when your program needs to load the data, all you have to do is deserialize it (1 line of code).
Alternatively, if you have very many records, I suggest trying to use a database. It's quite easy to interface C# with SQL Server (there's a free version called Express Edition) or MySQL, and storing and retrieving huge amounts of data is not a pain. This would be the most efficient way to accomplish your task.
Depending on how much data you have and whether you want to accomplish something like this with 1 line of code (serialization) or interface with a seperate product (the database approach), you can choose either one of the above. Of course, if you wanted to, you could just manually write the contents of your data to a text file or CSV file, as you suggested, but, from personal experience, I recommend the methods I explained above.
It probably wont be more than a couple thousand. And it would be good if I could access it from a windows mobile app. Basically a user will be able to save times that something happens at, and then the app will use the data to find a cross correlation.
Is there any need for interoperability with other processes? If so, time to swat-up on file formats.
However, from the sound of it, you're asking on a matter of "style", with no real requirement to open the file anywhere but your own app. I'd suggest using a BinaryWriter for the task.
If debugging is an issue, a human-readable format might be preferable, but would be considerably larger than the binary equivalent.
Probably the quickest way to do it would be using binary serialization.

Storing settings: XML vs. SQLite?

I am currently writing an IRC client and I've been trying to figure out a good way to store the server settings. Basically a big list of networks and their servers as most IRC clients have.
I had decided on using SQLite but then I wanted to make the list freely available online in XML format (and perhaps definitive), for other IRC apps to use. So now I may just store the settings locally in the same format.
I have very little experience with either ADO.NET or XML so I'm not sure how they would compare in a situation like this.
Is one easier to work with programmatically? Is one faster? Does it matter?
It's a vaguer question than you realize. "Settings" can encompass an awful lot of things.
There's a good .NET infrastructure for handling application settings in configuration files. These, generally, are exposed to your program as properties of a global Settings object; the classes in the System.Configuration namespace take care of reading and persisting them, and there are tools built into Visual Studio to auto-generate the code for dealing with them. One of the data types that this infrastructure supports is StringCollection, so you could use that to store a list of servers.
But for a large list of servers, this wouldn't be my first choice, for a couple of reasons. I'd expect that the elements in your list are actually tuples (e.g. host name, port, description), not simple strings, in which case you'll end up having to format and parse the data to get it into a StringCollection, and that is generally a sign that you should be doing something else. Also, application settings are read-only (under Vista, at least), and while you can give a setting user scope to make it persistable, that leads you down a path that you probably want to understand before committing to.
So, another thing I'd consider: Is your list of servers simply a list, or do you have an internal object model representing it? In the latter case, I might consider using XML serialization to store and retrieve the objects. (The only thing I'd keep in the application configuration file would be the path to the serialized object file.) I'd do this because serializing and deserializing simple objects into XML is really easy; you don't have to be concerned with designing and testing a proper serialization format because the tools do it for you.
The primary reason I look at using a database is if my program performs a bunch of operations whose results need to be atomic and durable, or if for some reason I don't want all of my data in memory at once. If every time X happens, I want a permanent record of it, that's leading me in the direction of using a database. You don't want to use XML serialization for something like that, generally, because you can't realistically serialize just one object if you're saving all of your objects to a single physical file. (Though it's certainly not crazy to simply serialize your whole object model to save one change. In fact, that's exactly what my company's product does, and it points to another circumstance in which I wouldn't use a database: if the data's schema is changing frequently.)
I would personally use XML for settings - .NET is already built to do this and as such has many built-in facilities for storing your settings in XML configuration files.
If you want to use a custom schema (be it XML or DB) for storing settings then I would say that either XML or SQLite will work just as well since you ought to be using a decent API around the data store.
Every tool has its own right
There is plenty of hype arround XML, I know. But you should see, that XML is basically an exchange format -- not a storage format (unless you use a native XML-Database that gives you more options -- but also might add some headaches).
When your configuration is rather small (say less than 10.000 records), you might use XML and be fine. You will load the whole thing into your memory and access the entries there. Done.
But when your configuration is so big, that you dont want to load it completely, than you rethink your decission and stay with SQLite which gives you the option to dynamically load those parts of the configuration you need.
You could also provide a little tool to create a XML file from the DB-content -- creation of XML from a DB is a rather simple task.
Looks like you have two separate applications here: a web server and a desktop client (because that is traditionally where these things run), each with its own storage needs.
On the server side: go with a relational data store, not Xml. Basically at some point you need to keep user data separate from other user data on the server. XML is not a good store for that.
On the client: it doesn't really matter. Xml will probably be easier for you to manipulate. And don't think that because you are using one technology in one setting, you have to use it in the other.

Categories