I have been using the following way to save dictionaries to the database:
Convert the dictionary to XML.
Pass this XML to a stored procedure (SP).
In the SP use:
SELECT
    Key1,
    Value1
INTO #TempTable
FROM OPENXML(@handle, '//ValueSet/Values', 1)
WITH
(
    Key1 VARCHAR(MAX),
    Value1 VARCHAR(100)
)
Done.
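For completeness, the client-side conversion can be sketched like this (a minimal sketch; the element and attribute names are chosen here to match the '//ValueSet/Values' path and the attribute-centric mapping, flag 1, used in the OPENXML call above):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class DictionaryXml
{
    // Build XML shaped to match the '//ValueSet/Values' path the stored
    // procedure expects; keys and values become attributes, which is what
    // the attribute-centric flag (the third OPENXML argument, 1) maps.
    public static string ToXml(IDictionary<string, string> dict)
    {
        var root = new XElement("ValueSet",
            dict.Select(kv => new XElement("Values",
                new XAttribute("Key1", kv.Key),
                new XAttribute("Value1", kv.Value))));
        return root.ToString();
    }
}
```

The resulting string is then passed to the SP as an XML or NVARCHAR(MAX) parameter and handed to sp_xml_preparedocument to obtain the document handle.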
Is there a way to save dictionaries to a database without converting it to XML?
It depends whether...
You want the data to be stored: the fastest way to do that (in both implementation effort and performance) is binary serialization (Protocol Buffers, for example). However, the data is not readable with a SELECT, and every application that needs to read it must use the same serialization format (assuming a library for it exists in that technology/language). From my point of view, this defeats the purpose of storing the data in a SQL database.
You want the data to be readable by humans: XML is an option. It is not as fast, it is a little harder to read, and it is still not query-able, but it is quick to implement. You can also dump the result to a file and it remains readable. Moreover, you can share the data with other applications, since XML is a widespread format.
You want the data to be query-able: depending on how you go about it, this can be harder to implement. You would need two tables (one for keys and one for values). Then you could either write your own custom mapping code to map columns to properties, or use an object-relational mapping framework such as Entity Framework or NHibernate.
While Entity Framework or NHibernate may look like an oversized Swiss Army knife for a small problem, it is always worthwhile to build some expertise with them, as the underlying concepts are reusable and they can really speed up development once you have a working setup.
Serialize the Dictionary and store the binary data.
Then deserialize the data back into a Dictionary when you read it.
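A minimal round-trip sketch (note: BinaryFormatter, which older tutorials often show for this, is now obsolete and unsafe for untrusted data, so this uses a simple explicit BinaryWriter/BinaryReader layout instead):

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class DictionaryBlob
{
    // Serialize a string dictionary to a compact byte[] suitable for a
    // varbinary(max) column: a count, then alternating key/value strings.
    public static byte[] Serialize(IDictionary<string, string> dict)
    {
        using var ms = new MemoryStream();
        using (var w = new BinaryWriter(ms))
        {
            w.Write(dict.Count);
            foreach (var kv in dict) { w.Write(kv.Key); w.Write(kv.Value); }
        }
        return ms.ToArray();
    }

    // Reverse the layout above to rebuild the dictionary.
    public static Dictionary<string, string> Deserialize(byte[] data)
    {
        using var r = new BinaryReader(new MemoryStream(data));
        int count = r.ReadInt32();
        var dict = new Dictionary<string, string>(count);
        for (int i = 0; i < count; i++) dict[r.ReadString()] = r.ReadString();
        return dict;
    }
}
```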
Loop through the dictionary using a foreach statement.
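If you go the key/value-table route, the loop can be sketched like this (the table and column names DictionaryStore, ItemKey and ItemValue are hypothetical; adjust to your schema):

```csharp
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

static class DictionaryInsert
{
    // Insert each pair as one row, using a parameterized command that is
    // prepared once and reused for every iteration of the foreach loop.
    public static void Save(SqlConnection conn, IDictionary<string, string> dict)
    {
        using var cmd = new SqlCommand(
            "INSERT INTO DictionaryStore (ItemKey, ItemValue) VALUES (@k, @v)", conn);
        cmd.Parameters.Add("@k", SqlDbType.NVarChar, 450);
        cmd.Parameters.Add("@v", SqlDbType.NVarChar, -1); // -1 = nvarchar(max)
        foreach (var kv in dict)
        {
            cmd.Parameters["@k"].Value = kv.Key;
            cmd.Parameters["@v"].Value = kv.Value;
            cmd.ExecuteNonQuery();
        }
    }
}
```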
I currently follow a pattern where I store objects which are serialized and deserialized to a particular column.
This pattern was fine before; however, due to the frequency of transactions, the cost of serializing the object to a JSON string and later retrieving the string and deserializing it back to an object has become too expensive.
Is it possible to store an object directly to a column to avoid this cost? I am using Entity Framework and I would like to work the data stored in this column as type Object.
Please advise.
JSON serialization is not fast. It's faster and less verbose than XML, but much slower than binary serialization. I would look at third-party binary serializers, namely ZeroFormatter or Wire/Hyperion. For my own projects I use Wire as a "fast enough" and simple-to-implement option.
As far as table structure goes, I would recommend storing the serialized data in a separate, 1..0-1 associated table. So if I had an Order table and wanted to serialize some extra order-related structure (coming from a 3rd-party delivery system, for example), I'd create another table called OrderDeliveryInfo with a PK of OrderID joining to the Order table, to house the varbinary column for the serialized data. The reason for this is to avoid the cost of retrieving and transmitting the binary blob every time I query Order records, unless I explicitly request the delivery info.
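The entity shape for that 1..0-1 split might look like this (a sketch; the names mirror the hypothetical Order/OrderDeliveryInfo example above, and the exact mapping configuration depends on your EF version):

```csharp
// The blob lives in its own table so ordinary Order queries never pull
// the serialized payload across the wire unless DeliveryInfo is requested.
public class Order
{
    public int OrderId { get; set; }
    public string CustomerName { get; set; }
    public OrderDeliveryInfo DeliveryInfo { get; set; } // optional navigation
}

public class OrderDeliveryInfo
{
    public int OrderId { get; set; }           // PK, also FK back to Order
    public byte[] SerializedData { get; set; } // maps to varbinary(max)
}
```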
At some point in my code, I'm creating a dictionary of type Dictionary<string, string> and I'm wondering what's the best way to store this in a database in terms of converting it to a string and then back to a dictionary.
Thanks.
There are a number of options here.
You can go the normalization route and use a separate table with a key/value pair of columns.
Some databases provide you with a data type that is similar to what you need. PostgreSQL has an hstore type where you can save any key-value pairs, and MS SQL has an XML data type that can be used as well with some simple massaging of your data before insertion.
Without this type of database-specific assistance, you can just use a TEXT or BLOB column and serialize your dictionary using a DB-friendly format such as JSON, XML or language-specific serialization formats.
The tradeoffs are the following:
A separate table with key/value columns makes querying expensive and is a pain in general, but it gives you the most query flexibility and is portable across databases.
If you use a database-powered dictionary type, you get support in queries (e.g. "select rows where an attribute stored in the dictionary matches a certain condition"). Without that, you are left with selecting everything and filtering in your program. However, you lose database portability unless you code a middle layer that abstracts this away, and you lose some ease of data manipulation in your code (because things no longer simply "work" as if there were a regular column in your database holding this data).
NoSQL databases that are "document oriented" are meant exactly for this type of storage. Depending on what you are doing, you might want to look at some options. MongoDB is a popular choice.
The proper choice depends on the querying patterns for the data and other non-functional issues such as database support, etc. If you expand on the functionality you need to implement, I can expand on my answer.
If you really want to store the full dictionary as a single string, then you could serialize your dictionary to JSON (or XML) and store the result in the database.
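A minimal sketch of that single-string approach using System.Text.Json (the resulting string goes into a TEXT / nvarchar(max) column):

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

static class JsonColumn
{
    // Turn the dictionary into a JSON string for storage...
    public static string Save(Dictionary<string, string> dict) =>
        JsonSerializer.Serialize(dict);

    // ...and rebuild it from the stored string on the way out.
    public static Dictionary<string, string> Load(string json) =>
        JsonSerializer.Deserialize<Dictionary<string, string>>(json);
}
```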
You have a few options here. You could serialize the object into XML or JSON, as @M4N mentioned. You could also create a table with at least two columns: one for the key and one for the value.
It really depends on what your domain models look like and how you need to manage the data. If the dictionary keys or values change (e.g. a rename or correction) and the change needs to be reflected across many objects that depend on the data, then creating a sort of lookup table that maps directly to the dictionary might be best. Otherwise, serializing the data would be one of the best-performing options.
I have a field in the database that is XML, because it represents a class that is used in C#/VB.Net. The problem is that after the initial manipulation, most (but not all) of the work is done in SQL Server. This means the XML field is converted on the fly.
As we get more fields and more records, this operation is getting slow. I think the slowdown comes from converting all of those fields to other data types.
So to speed it up I was thinking of a couple of ways:
Have a set of tables that represent the different pieces of the XML data. I would make these tables read-only, using Insert/Update triggers that reject any changes. When my 'main' table updates the XML, it would disable the triggers, update the tables with the new values, and then re-enable the triggers.
The only real reason we use the XML is that it's really easy to convert it to the class in C#/VB.Net. But I'm getting to the point where I may end up writing a routine that takes all the bits and pieces and converts them to a class, plus a function to go the other way (class -> tables).
Can anybody give any ideas on a better way to do this? I'm not tied to the idea of using the XML structure. My concern is if we have separate tables to speed up SQL processing and somebody changes the value of a field in that table we have to make sure the XML is updated. Or don't allow the person to update it.
TIA - Jeff.
What is the purpose of the objects you are saving? If it is anything other than persistence of state, you are not doing yourself any favors and you are not properly separating concerns. If it is persistence of state, then at a minimum make columns out of the properties and fields (you can include private ones, as long as you leave an internal method to set the values when you reconstitute the object).
Disregarding the wisdom of what you're doing, you might look into creating an XML index. This should help you get started: http://msdn.microsoft.com/en-us/library/ms345121%28v=sql.90%29.aspx
The basic idea is that the right index can 'pre-shred' your XML and automatically build the sort of tables you are thinking of creating 'manually'. A downside is that this can really explode your storage requirements if you are storing lots of XML.
I have a situation where I need to store some data that just won't ...really fit into a database table. It is a little too abstract, and I don't know enough to break it down into tables and columns. The object in question is a System.Linq.Expressions.Expression<T>.
I have discovered a means of serializing it to XML using MetaLinq, and it works pretty well, although the XML it generates is excessively obese; I somewhat expected as much from something as complicated as an Expression. A modest expression comes out to around 19 KB.
So my thought was to use gzip compression on the file. This works well; it shrinks it to about 2 KB.
So then, my actual question is this: is it bad or 'dangerous' practice to use a table column to reference a filename for deserialization of an object? I would have a table for expressions with a filename column; when an expression is needed, the code would perform the gzip decompression, deserialize it, and return the object.
This seems like the ideal solution, but it requires a lot of file I/O and a lot of compression/zipping/serialization. I'm wondering if I could get the opinion of more experienced database admins out there. I am using Fluent NHibernate as my ORM mapper.
MetaLinq on codeplex
Not an experienced DBA, but I would store the serialized data in a BLOB column in the database. Database backups do you no good if the files your data depends on go away, or vice versa. I think it would simplify things to keep it all together. And a blob works fine here, since the data you are storing does not need to be queried.
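You can keep the gzip step and still store everything in the database; a sketch of the compression round-trip over the serialized bytes (e.g. the MetaLinq XML encoded as UTF-8) before writing them to a varbinary(max) column:

```csharp
using System;
using System.IO;
using System.IO.Compression;

static class GzipBlob
{
    // Deflate the serialized payload before it goes into the BLOB column.
    public static byte[] Compress(byte[] data)
    {
        using var ms = new MemoryStream();
        using (var gz = new GZipStream(ms, CompressionMode.Compress))
            gz.Write(data, 0, data.Length);
        return ms.ToArray();
    }

    // Inflate it again after reading the column back.
    public static byte[] Decompress(byte[] data)
    {
        using var gz = new GZipStream(new MemoryStream(data), CompressionMode.Decompress);
        using var outMs = new MemoryStream();
        gz.CopyTo(outMs);
        return outMs.ToArray();
    }
}
```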
Depends on the size of the data.
SQL Server has an XML data type for table columns now, so you could serialize the object and insert the whole thing into the column, again depending on size.
But if you must use the file system, I would store the path and file name in the column.
In your program's app.config, keep the root of the drive, like \\MyDrive or d:\
That way, if the information moves, you just change the app.config, as long as the folder/file structure stays the same.
Edit:
Along with NerdFury's suggestion, you could use a binary serializer if you do not need to "see" the data in the database. XML serialization at least keeps it readable.
What would be the best database/technique to use if I'd like to create a database that can "add", "remove" and "edit" tables and columns?
I'd like it to be scaleable and fast.
Should I use one table with five columns for this (Id, Table, Column, Type, Value)? Are there any good articles about this? Or are there other solutions?
Maybe three tables: One that holds the tables, one that holds the columns and one for the values?
Maybe someone has already created a db for this purpose?
My requirement is that I'm using .NET (I guess the database doesn't have to be on Windows, but I would prefer that).
Since (in comments on the question) you are aware of the pitfalls of the "inner platform effect", it is also true that this is a very common requirement - in particular, storing custom user-defined columns. Indeed, most teams have needed this at some point. Having tried various approaches, the one I have found most successful is to keep the extra data in-line with the record - in particular, this makes it simple to obtain the data without extra steps like a second complex query on an external table, and it means that all the values share things like the timestamp/rowversion for concurrency.
In particular, I've found a CustomValues column (for example text or binary; typically json / xml, but could be more exotic) a very effective way to work, acting as a property-bag for the additional data. And you don't have to parse it (or indeed, SELECT it) until you know you need the extra data.
All you then need is a way to tie named keys to expected types, but you need that metadata anyway.
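A sketch of the property-bag idea (the entity and column names here are illustrative, with JSON as the portable format; the bag is only parsed when the extra data is actually needed):

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// A regular record with an in-line CustomValues column acting as a
// property bag for user-defined fields, stored next to the real columns.
public class CustomerRecord
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string CustomValues { get; set; } // JSON bag; nvarchar(max) column

    // Parse on demand; untouched queries never pay the parsing cost.
    public Dictionary<string, string> GetCustomValues() =>
        string.IsNullOrEmpty(CustomValues)
            ? new Dictionary<string, string>()
            : JsonSerializer.Deserialize<Dictionary<string, string>>(CustomValues);
}
```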
I will, however, stress the importance of making the data portable; don't (for example) store any specific platform-bespoke serialization (for example, BinaryFormatter for .NET) - things like xml / json are fine.
Finally, your RDBMS may also work with this column; for example, SQL Server has the xml data type that allows you to run specific queries and other operations on xml data. You must make your own decision whether that is a help or a hindrance ;p
If you also need to add tables, I wonder whether you are truly using the RDBMS as an RDBMS; at that point I would consider switching from an RDBMS to a document database such as CouchDB or RavenDB.