Storing a Dictionary<string, string> in the database

At some point in my code, I'm creating a dictionary of type Dictionary<string, string> and I'm wondering what's the best way to store this in a database in terms of converting it to a string and then back to a dictionary.
Thanks.

There are a number of options here.
You can go the normalization route and use a separate table with a key/value pair of columns.
Some databases provide you with a data type that is similar to what you need. PostgreSQL has an hstore type where you can save any key-value pairs, and MS SQL has an XML data type that can be used as well with some simple massaging of your data before insertion.
Without this type of database-specific assistance, you can just use a TEXT or BLOB column and serialize your dictionary using a DB-friendly format such as JSON, XML or language-specific serialization formats.
The tradeoffs are the following:
A separate table with key/value columns makes for expensive querying and is a PITA in general, but you get the most query flexibility and it is portable across databases.
If you use a database-powered dictionary type, you get support in queries (e.g. "select rows where an attribute stored in the dictionary matches a certain condition"). Without that, you are left with selecting everything and filtering in your program. In exchange, you lose database portability unless you code a middle layer that abstracts this away, and you lose ease of data manipulation in your code (because things "work" as if there was a column in your database with this data).
NoSQL databases that are "document oriented" are meant exactly for this type of storage. Depending on what you are doing, you might want to look at some options. MongoDB is a popular choice.
The proper choice depends on the querying patterns for the data and other non-functional issues such as database support, etc. If you expand on the functionality you need to implement, I can expand on my answer.

If you really want to store the full dictionary as a single string, then you could serialize your dictionary to JSON (or XML) and store the result to the database.
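
As a minimal sketch of that round trip, assuming System.Text.Json is available (.NET Core 3.0 or later) and that the string ends up in an ordinary text column. The class and method names are made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class DictionaryJson
{
    // Turn the dictionary into a JSON object string suitable for a text column.
    public static string ToJson(Dictionary<string, string> values) =>
        JsonSerializer.Serialize(values);

    // Rebuild the dictionary from the stored string.
    // A stored JSON "null" is treated as an empty dictionary in this sketch.
    public static Dictionary<string, string> FromJson(string json) =>
        JsonSerializer.Deserialize<Dictionary<string, string>>(json)
        ?? new Dictionary<string, string>();
}
```

Calling DictionaryJson.ToJson(dict) before the INSERT and DictionaryJson.FromJson(column) after the SELECT is all the application code needs; the database only ever sees a string.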

You have a few options here. You could serialize the object into XML, or JSON as @M4N mentioned. You could also create a table with at least two columns: one for the key and one for the value.
It really depends on what your domain models look like and how you need to manage the data. If the dictionary keys or values change (e.g. they are renamed or corrected), and the change needs to be reflected across many objects that depend on the data, then creating a sort of lookup table that maps directly to the dictionary might be best. Otherwise, serializing the data would be one of the best-performing options.
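
To make the table option concrete, here is a rough sketch of mapping a dictionary to rows and back. The row shape (OwnerId, Key, Value) is an assumption for illustration, not something the question specifies, and the actual INSERT/SELECT is left out:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row shape for a key/value table: one row per dictionary entry.
public sealed record KeyValueRow(int OwnerId, string Key, string Value);

public static class KeyValueRows
{
    // One row per entry, all tagged with the id of the object that owns the dictionary.
    public static List<KeyValueRow> ToRows(int ownerId, IReadOnlyDictionary<string, string> values) =>
        values.Select(pair => new KeyValueRow(ownerId, pair.Key, pair.Value)).ToList();

    // Rebuild the dictionary for one owner from rows that were read back.
    public static Dictionary<string, string> FromRows(IEnumerable<KeyValueRow> rows, int ownerId) =>
        rows.Where(row => row.OwnerId == ownerId)
            .ToDictionary(row => row.Key, row => row.Value);
}
```

Because every entry is its own row, renaming or correcting a key becomes a plain UPDATE instead of a deserialize, edit, serialize cycle.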

Related

Entity Framework Store Object in Column

I currently follow a pattern where I store objects which are serialized and deserialized to a particular column.
This pattern was fine before, however, now due to the frequency of transactions the cost of serializing the object to a JSON string and then later retrieving the string and deserializing back to an object is too expensive.
Is it possible to store an object directly in a column to avoid this cost? I am using Entity Framework and I would like to work with the data stored in this column as type Object.
Please advise.

JSON serialization is not fast. It's faster and less verbose than XML, but a lot slower than binary serialization. I would look at third party binary serializers, namely ZeroFormatter, or Wire/Hyperion. For my own stuff I use Wire as a "fast enough" and simple to implement option.
As far as table structure goes, I would recommend storing serialized data in a separate table with a 1-to-0..1 association. So if I had an Order table and wanted to serialize some extra order-related structure (coming from a 3rd-party delivery system, for example), I'd create another table called OrderDeliveryInfo with a PK of OrderID to join to the Order table, and house the binary (byte[]) column for the serialized data there. The reason for this is to avoid the cost of retrieving and transmitting the binary blob every time I query Order records unless I explicitly request the delivery info.
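
The entity shapes for that layout might look roughly like the sketch below. Only the names Order, OrderDeliveryInfo and OrderID come from the answer; the other properties are invented for illustration, and the Entity Framework mapping that makes OrderId both primary and foreign key would still have to be configured separately:

```csharp
using System;

// Main table: no blob here, so ordinary Order queries stay light.
public class Order
{
    public int OrderId { get; set; }
    public DateTime PlacedAt { get; set; }   // hypothetical example column

    // Optional 1-to-0..1 navigation; stays null unless the related row exists and is loaded.
    public OrderDeliveryInfo DeliveryInfo { get; set; }
}

// Side table: holds only the serialized payload.
public class OrderDeliveryInfo
{
    // Primary key, and at the same time the foreign key to Order.
    public int OrderId { get; set; }

    // Output of whichever serializer is chosen; typically maps to a varbinary column.
    public byte[] Data { get; set; } = Array.Empty<byte>();
}
```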

Is there a way to make a folder-like hierarchy in redis cache?

So I'm using a Redis cache in my C# Web API, and being able to implement a folder-like hierarchy would make my life much easier (something like this:
a-> key1
b-> c ->key2
key3
d ->...
)
My other option is a tree-like approach with keys, where a would give me two other keys, one for key1 and another for b, and so on (but that would be a mess).

If you use Redis Commander to view your cache, you can use keys separated with colons, e.g. set1:subset:subset:key. It's not really a hierarchy, but it displays like folders in the Commander view.
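
A small helper can keep such colon-separated keys consistent. This is only a sketch with made-up names; it builds and filters key strings and does not talk to Redis itself:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CacheKeys
{
    private const string Separator = ":";

    // Join path segments into one flat key, e.g. ("a", "b", "c", "key2") becomes "a:b:c:key2".
    public static string Build(params string[] segments)
    {
        if (segments.Length == 0 || segments.Any(s => string.IsNullOrEmpty(s) || s.Contains(Separator)))
            throw new ArgumentException("Segments must be non-empty and must not contain ':'.", nameof(segments));
        return string.Join(Separator, segments);
    }

    // From a list of known keys, pick those that sit under the given "folder" prefix.
    public static IEnumerable<string> Under(IEnumerable<string> keys, params string[] folder)
    {
        string prefix = Build(folder) + Separator;
        return keys.Where(k => k.StartsWith(prefix, StringComparison.Ordinal));
    }
}
```

The keys remain flat strings as far as Redis is concerned; the "folders" exist only by naming convention.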

Redis supports multiple data types. For your case you could use hashes, although a hash cannot directly contain another hash.
Because Redis doesn't support nested data structures, you would have to store a reference to the inner hash (its key) as a field value in the outer hash, which makes it awkward to retrieve the data again. Alternatively, you can build the hierarchical object structure as JSON (or reuse it, if you already have one) and store that serialized object in Redis.
See Storing nested objects in Redis

What you are trying to do here is somewhat outside of what redis wants you to do. You can fake it with nesting keys inside keys (perhaps via a hash), but it will be really hard to work with.
However! This might be a good fit for a redis "module" (requires redis 4.*). In particular, I wonder whether ReJSON might be a good fit. This is designed for JSON usage, but frankly: JSON is a hierarchical nested key/value data type - exactly what you want. Just overlook the JSON part :)
In particular, ReJSON allows you to query, access, and manipulate arbitrary nodes via a syntax that will be familiar if you've ever used XPath.
How to access ReJSON will depend on what client library you are using. If you're using SE.Redis, you will probably want to use the db.Execute(command, args) method (since I don't have ReJSON bindings natively exposed). If you're using a "cluster" topology, make sure that the key is passed as a RedisKey (rather than as a string), so that it knows how to route it.

How to save a dictionary into a database with C#?

I have been using following way to save dictionaries into the database:
Convert the dictionary to XML.
Pass this XML to a SP(Stored Procedure).
In the SP use:
SELECT
    Key1,
    Value1
INTO #TempTable
FROM OPENXML(@handle, '//ValueSet/Values', 1)
WITH
(
    Key1 VARCHAR(MAX),
    Value1 VARCHAR(100)
)
Done.
Is there a way to save dictionaries to a database without converting it to XML?

It depends whether...
You want the data to be stored: The fastest way (in both implementation effort and performance) is binary serialization (Protocol Buffers, for example). However, the data is not readable with a SELECT, and every application that needs to read the data must use the same serialization (if it exists for its technology/language). From my point of view, that defeats the purpose of storing it in a SQL database.
You want the data to be readable by humans: XML is an option, although it is not that fast, is a little difficult to read, and is still not query-able. However, it is quite fast to implement. You can also dump the result to a file and it's still readable. Moreover, you can share the data with other applications, as XML is a widespread format.
You want the data to be query-able: Depending on the way you go, this may not be so easy to implement. You would need two tables (one for keys and one for values). Then you could either write your own custom mapping code to map columns to properties, or use a framework for mapping objects to tables, such as Entity Framework or NHibernate.
While Entity Framework or NHibernate may look like a huge Swiss Army knife for a small problem, it's always worthwhile to build some expertise in them, as the underlying concepts are reusable and they can really speed up development once you have a working setup.

Serialize the Dictionary and store the binary data.
Then deserialize your data back into a Dictionary.

Loop through the dictionary using a foreach statement.
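
One way such a loop could avoid the XML step is to copy the pairs into a DataTable, which SQL Server client libraries can then pass to a stored procedure as a table-valued parameter. That is an assumption about the setup, not something stated above, and the column names simply mirror the Key1/Value1 names from the question:

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class DictionaryTable
{
    // Copy each key/value pair into a two-column DataTable, one row per pair.
    public static DataTable ToDataTable(IDictionary<string, string> values)
    {
        var table = new DataTable("KeyValues");
        table.Columns.Add("Key1", typeof(string));
        table.Columns.Add("Value1", typeof(string));

        foreach (KeyValuePair<string, string> pair in values)
        {
            table.Rows.Add(pair.Key, pair.Value);
        }

        return table;
    }
}
```

If table-valued parameters are not an option, the same foreach can instead issue one parameterized INSERT per pair.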

DB design when data is unknown about an entity?

I'm wondering if the following DB schema would have repercussions later. Let's say I'm writing a place entity. I'm not certain what properties of place will be stored in the DB. I'm thinking of making two tables: one to hold the required (or common) info, and one to hold additional info.
Table 1 - Place
PK PlaceId
Name
Lat
Lng
etc... (all the common fields)
Table 2 - PlaceData
PK DataId
PK FieldName
PK FK PlaceId
FieldData
Usage Scenario
I want certain visitors to have the capability of entering custom fields about a place. For example, a restaurant is a place that may have the following fields: HasParking, HasDriveThru, RequiresReservation, etc... but a car dealer is also a place, and those fields wouldn't make sense for a car dealer.
I want to support any type of place, from a single table (well, 2nd table has custom fields), because I don't know the number of types of places that will eventually be added to my site.
Overall goal
On my ASP.NET MVC (C#/Razor) site, where I display a place, it will show the attributes as an unordered list populated by: SELECT * FROM PlaceData WHERE PlaceId = @0.
This way, I wouldn't need to show empty field names on the view (or do a string.IsNullOrWhitespace() check for each and every field), which I would be forced to do if every attribute was a column on the table.
I'm assuming this scenario is quite common, but are there better ways to do it? Particularly from a performance perspective? What are the major drawbacks of this schema?

Your idea is referred to as an Entity-Attribute-Value (EAV) table and is generally bad news in an RDBMS. RDBMSes are geared toward highly structured data.
The overall options are:
Model the db further in an RDBMS, which is most likely if someone is holding back specs from you.
Stick with the RDBMS, using XML columns for the data whose structure is variable. This makes the most sense if a relatively small portion of your data storage schema is semi- or un-structured. Speaking from a MS SQL Server perspective, this data can be indexed and you can perform checks that your data complies with an XML schema definition.
Move to a non-relational DB such as MongoDB, Cassandra, CouchDB, etc. This is what a lot of social sites and I suspect blog sites run with. Also, it is within reason to use a combination of RDBMS and non-relational stores if that's what your needs call for.
EAV gets to be a mess because you're creating a database within a database and lose all of the benefits an RDBMS can provide (foreign keys, data type enforcement, etc.), and the SQL code needed to reconstruct your objects goes from lasagna to fettuccine to spaghetti in the blink of an eye.
Given the information that's been added to the question, it would seem a good fit to create a PlaceDetails column of type XML in the Place table. You could also split that column into another table with a 1:1 relationship if performance requirements dictate it.
The upside to doing it that way is that you can retrieve the data using very simple SQL code, even using the xml data type's methods for searching the data. But that approach also allows you to do the more complex presentation-oriented data parsing in C#, which is better suited to that purpose than T-SQL is.
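
On the C# side, producing and parsing the contents of such a PlaceDetails column could be sketched as follows. The element and attribute names (Field, Name) are invented for illustration, and LINQ to XML is just one of several ways to do this:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class PlaceDetailsXml
{
    // Build <PlaceDetails><Field Name="...">value</Field>...</PlaceDetails> from the custom fields.
    // XElement escapes special characters in names and values automatically.
    public static string ToXml(IDictionary<string, string> fields) =>
        new XElement("PlaceDetails",
                fields.Select(f => new XElement("Field",
                    new XAttribute("Name", f.Key),
                    f.Value)))
            .ToString(SaveOptions.DisableFormatting);

    // Read the fields back; only the custom fields actually stored for this place come out.
    public static Dictionary<string, string> FromXml(string xml) =>
        XElement.Parse(xml)
            .Elements("Field")
            .ToDictionary(e => (string)e.Attribute("Name"), e => e.Value);
}
```

Since only stored fields are returned, the view can render the dictionary directly as a list without checking each possible field for emptiness.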

If you want your application to be able to create its own custom fields, this is a fine model. The Mantis Bugtracker uses this as well to allow Admins to add custom fields to their tickets.
If in any case, it's going to be the programmer that is going to create the field, I must agree with pst that this is more a premature optimization.

You can add new columns to the database at any time (always keeping third normal form in mind), so you should go with what you want and only create a second table if it is needed or if such columns would break any of the normal forms.

Customizeable database

What would be the best database/technique to use if I'd like to create a database that can "add", "remove" and "edit" tables and columns?
I'd like it to be scalable and fast.
Should I use one table with five columns for this (Id, Table, Column, Type, Value)? Are there any good articles about this? Or are there any other solutions?
Maybe three tables: one that holds the tables, one that holds the columns and one for the values?
Maybe someone has already created a DB for this purpose?
My requirement is that I'm using .NET (I guess the database doesn't have to be on Windows, but I would prefer that).

You note (in comments on the question) that you are aware of the pitfalls of the "inner platform effect"; it is also true that this is a very common requirement, in particular for storing custom user-defined columns. Indeed, most teams have needed this. Having tried various approaches, the one which I have found most successful is to keep the extra data in-line with the record. In particular, this makes it simple to obtain the data without requiring extra steps like a second complex query on an external table, and it means that all the values share things like timestamp/rowversion for concurrency.
In particular, I've found a CustomValues column (for example text or binary; typically json / xml, but could be more exotic) a very effective way to work, acting as a property-bag for the additional data. And you don't have to parse it (or indeed, SELECT it) until you know you need the extra data.
All you then need is a way to tie named keys to expected types, but you need that metadata anyway.
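
A rough sketch of that key-to-type metadata, assuming the CustomValues column holds a flat JSON object whose values are all stored as strings. The key names and the choice to skip unknown keys are purely illustrative:

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.Json;

public static class CustomValues
{
    // Hypothetical metadata: which keys may appear in the bag and what type each one holds.
    private static readonly Dictionary<string, Type> ExpectedTypes = new Dictionary<string, Type>
    {
        ["IsActive"] = typeof(bool),
        ["Quantity"] = typeof(int),
        ["Notes"] = typeof(string),
    };

    // Parse the stored JSON bag and convert each known key to its expected type.
    public static Dictionary<string, object> Read(string json)
    {
        var raw = JsonSerializer.Deserialize<Dictionary<string, string>>(json)
                  ?? new Dictionary<string, string>();
        var typed = new Dictionary<string, object>();

        foreach (var pair in raw)
        {
            if (!ExpectedTypes.TryGetValue(pair.Key, out Type type))
                continue; // Keys without metadata are skipped in this sketch.

            typed[pair.Key] = Convert.ChangeType(pair.Value, type, CultureInfo.InvariantCulture);
        }

        return typed;
    }
}
```

The bag is only parsed when Read is called, so rows can be selected and passed around without paying for the conversion until the extra data is actually needed.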
I will, however, stress the importance of making the data portable; don't (for example) store any specific platform-bespoke serialization (for example, BinaryFormatter for .NET) - things like xml / json are fine.
Finally, your RDBMS may also work with this column; for example, SQL Server has the xml data type that allows you to run specific queries and other operations on xml data. You must make your own decision whether that is a help or a hindrance ;p
If you also need to add tables, I wonder if you are truly using the RDBMS as an RDBMS; at that point I would consider switching from an RDBMS to a document database such as CouchDB or RavenDB.
