I currently follow a pattern where I store objects serialized to a particular column.
This pattern worked fine before, but now, due to the frequency of transactions, the cost of serializing the object to a JSON string and later retrieving the string and deserializing it back into an object is too expensive.
Is it possible to store an object directly in a column to avoid this cost? I am using Entity Framework, and I would like to work with the data stored in this column as type Object.
Please advise.
JSON serialization is not fast. It's faster and less verbose than XML, but a lot slower than binary serialization. I would look at third-party binary serializers such as ZeroFormatter or Wire/Hyperion. For my own projects I use Wire as a "fast enough" and simple-to-implement option.
As far as table structure goes, I would recommend storing the serialized data in a separate table with a 1..0-1 association. For example, if I had an Order table and wanted to serialize some extra order-related structure (coming from a 3rd-party delivery system, say), I'd create another table called OrderDeliveryInfo with a PK of OrderID joining to the Order table, housing a varbinary(max) column (byte[] in EF) for the serialized data. The reason for this is to avoid the cost of retrieving and transmitting the binary blob every time I query Order records, unless I explicitly request the delivery info.
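A minimal sketch of that table split, assuming EF6 and hypothetical Order/OrderDeliveryInfo names (the entity and property names here are illustrative, not from any real schema):

```csharp
using System;

// POCOs for the 1..0-1 split described above. The serialized payload
// lives in its own table so ordinary Order queries never pull the blob.
public class Order
{
    public int OrderID { get; set; }
    public string CustomerName { get; set; }

    // Optional navigation to the blob table; loaded only on demand.
    public virtual OrderDeliveryInfo DeliveryInfo { get; set; }
}

public class OrderDeliveryInfo
{
    // Shared primary key: OrderID is both the PK and the FK to Order.
    public int OrderID { get; set; }
    public byte[] SerializedData { get; set; }  // maps to varbinary(max)

    public virtual Order Order { get; set; }
}

// In EF6 the shared-PK 1..0-1 relationship would be configured roughly as:
//   modelBuilder.Entity<Order>()
//       .HasOptional(o => o.DeliveryInfo)
//       .WithRequired(d => d.Order);
```

With this shape, `context.Orders` queries stay cheap, and the blob is only fetched when you navigate to `DeliveryInfo` or explicitly `Include` it.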
Yes, I know: serialized data produced by Boost is intended only for the library's internal use and is not meant to be read by third parties. However, I find myself in a position where I have to mimic binary serialized data originating from .NET (a std::vector of tiny PODs) which would later be deserialized by Boost (native C++). C++/CLI interop with native Boost is not possible, since the assembly has to be pure CLR.
Is it feasible to just write the right sequence of bytes? Is there a binary format spec? I didn't find one.
EDIT001: Some background: I have a table in a database with hundreds of millions of rows. Each row consists of two IDs - entity ID and parent entity ID - plus an additional column for entity data (all entity data is JSON, but that doesn't matter; I can't change it). In native C++ I have to select entities by parent ID to get all the entities it has, which can (sometimes) yield 5M rows; as one can guess, it takes ages to query, receive, iterate, parse, and load into a vector of C++ structs. So I tested what happens if I have my own table with parent ID as the PK and a column containing all entities belonging to that ID, binary serialized. The result (aside from data transfer over the network, etc.): I can parse it (actually, Boost can) in ~400ms, which is not blazing fast but good enough for me. Now, how do I get my table populated with the binary data? Obviously the DBA team can't help here - they know nothing about the Boost binary format - so I resorted to a CLR user-defined function, which MUST be implemented as "pure" CLR. This UDF is supposed to be called from a stored procedure that populates the table with individual entities and in the end runs over them and creates the binary bulk. But how can I mimic the Boost binary format if I can't call Boost (C++/CLI) from my assembly?
I have been using following way to save dictionaries into the database:
Convert the dictionary to XML.
Pass this XML to a stored procedure (SP).
In the SP use:
SELECT
    Key1,
    Value1
INTO #TempTable
FROM OPENXML(@handle, '//ValueSet/Values', 1)
WITH
(
    Key1 VARCHAR(MAX),
    Value1 VARCHAR(100)
)
(where @handle is the document handle obtained from sp_xml_preparedocument)
Done.
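Step 1 (dictionary to XML) can be sketched with System.Xml.Linq; the attribute-centric shape below matches the OPENXML flag value of 1 and the '//ValueSet/Values' path used above (the element and attribute names are assumptions inferred from that query):

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class DictionaryXml
{
    // Builds <ValueSet><Values key1="..." value1="..." />...</ValueSet>,
    // the attribute-centric shape that OPENXML(..., 1) expects.
    public static string ToValueSetXml(IDictionary<string, string> dict)
    {
        var root = new XElement("ValueSet",
            dict.Select(kv => new XElement("Values",
                new XAttribute("key1", kv.Key),
                new XAttribute("value1", kv.Value))));
        return root.ToString(SaveOptions.DisableFormatting);
    }
}
```

The resulting string is what gets passed as the SP's XML parameter.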
Is there a way to save dictionaries to a database without converting it to XML?
It depends whether...
You want the data to be stored: the fastest way to do that (in both implementation effort and performance) is binary serialization (Protocol Buffers, for example). However, the data is not readable with a plain SELECT, and every application that needs to read it must use the same serialization (assuming an implementation even exists for its technology/language). From my point of view, this defeats part of the purpose of storing the data in a SQL database.
You want the data to be readable by humans: XML is an option: not as fast, a little harder to read, and still not query-able, but quite fast to implement. You can also dump the result to a file and it remains readable. Moreover, you can share the data with other applications, as XML is a widespread format.
You want the data to be query-able: depending on how you go about it, this can be the hardest to implement. You would need two tables (one for keys and one for values). Then you can either write your own custom mapping code to map columns to properties, or use an object-relational mapping framework such as Entity Framework or NHibernate.
While Entity Framework or NHibernate may seem like an oversized Swiss Army knife for a small problem, it's always worthwhile to build some expertise in them, as the underlying concepts are reusable and they can really speed up development once you have a working setup.
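To illustrate the binary route from the first case above, here is a hand-rolled length-prefixed framing using BinaryWriter, purely to keep the sketch self-contained; in production you would more likely reach for a library such as Protocol Buffers:

```csharp
using System.Collections.Generic;
using System.IO;

public static class BinaryDictionary
{
    // Length-prefixed binary encoding of a string dictionary:
    // an entry count, followed by alternating key/value strings.
    public static byte[] Serialize(IDictionary<string, string> dict)
    {
        using (var ms = new MemoryStream())
        using (var w = new BinaryWriter(ms))
        {
            w.Write(dict.Count);
            foreach (var kv in dict)
            {
                w.Write(kv.Key);
                w.Write(kv.Value);
            }
            w.Flush();
            return ms.ToArray();  // store this in a varbinary(max) column
        }
    }

    public static Dictionary<string, string> Deserialize(byte[] data)
    {
        using (var ms = new MemoryStream(data))
        using (var r = new BinaryReader(ms))
        {
            int count = r.ReadInt32();
            var dict = new Dictionary<string, string>(count);
            for (int i = 0; i < count; i++)
                dict[r.ReadString()] = r.ReadString();
            return dict;
        }
    }
}
```

As the answer notes, the trade-off is that only code sharing this exact format can read the column back.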
Serialize the dictionary and store the binary data.
Then deserialize the data back into a dictionary when you read it.
Loop through the dictionary using a foreach statement.
At some point in my code, I'm creating a dictionary of type Dictionary<string, string> and I'm wondering what's the best way to store this in a database in terms of converting it to a string and then back to a dictionary.
Thanks.
There are a number of options here.
You can go the normalization route and use a separate table with a key/value pair of columns.
Some databases provide you with a data type that is similar to what you need. PostgreSQL has an hstore type where you can save any key-value pairs, and MS SQL has an XML data type that can be used as well with some simple massaging of your data before insertion.
Without this type of database-specific assistance, you can just use a TEXT or BLOB column and serialize your dictionary using a DB-friendly format such as JSON, XML or language-specific serialization formats.
The tradeoffs are the following:
A separate table with key/value columns makes for expensive querying and is a PITA in general, but it gives you the most query flexibility and is portable across databases.
If you use a database-powered dictionary type, you get support in queries (i.e. "select rows where an attribute stored in the dictionary matches a certain condition"); without it, you are left with selecting everything and filtering in your program. On the other hand, you lose database portability unless you code a middle layer that abstracts this away, and you lose ease of data manipulation in your application code (because things no longer "work" as if there were a plain column in your database holding this data).
NoSQL databases that are "document oriented" are meant exactly for this type of storage. Depending on what you are doing, you might want to look at some options. MongoDB is a popular choice.
The proper choice depends on the querying patterns for the data and other non-functional issues such as database support, etc. If you expand on the functionality you need to implement, I can expand on my answer.
If you really want to store the full dictionary as a single string, then you could serialize your dictionary to JSON (or XML) and store the result to the database.
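A minimal sketch of that approach with System.Text.Json (Json.NET's JsonConvert offers the same round-trip; the class and method names here are made up for illustration):

```csharp
using System.Collections.Generic;
using System.Text.Json;

public static class JsonRoundtrip
{
    // The returned string is what you would store in a TEXT/NVARCHAR column.
    public static string Save(Dictionary<string, string> dict) =>
        JsonSerializer.Serialize(dict);

    // Rebuilds the dictionary from the stored column value.
    public static Dictionary<string, string> Load(string json) =>
        JsonSerializer.Deserialize<Dictionary<string, string>>(json);
}
```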
You have a few options here. You could serialize the object into XML or JSON, as @M4N mentioned. You could also create a table with at least two columns: one for the key and one for the value.
It really depends on what your domain models look like and how you need to manage the data. If the dictionary keys or values can change (i.e. renames, corrections, etc.) and those changes need to be reflected across many objects that depend on the data, then a lookup table that maps directly to the dictionary might be best. Otherwise, serializing the data is one of the best-performing options.
I have an open SqlConnection (extended by Dapper), a table name in the connected database, and a JSON string that I trust to be of the same schema. What's the simplest way to insert the JSON object's field values without deserializing to a static type?
I realize the standard option is to create a class representing the record to deserialize into. However, there are several reasons this is less than ideal. I'm syncing a number of tables in exactly the same way. The schema already exists in two places (the source and the target), so it seems poor form to repeat the schema in the middleware as well. Also, since I'm just going straight into the database, it seems excessive to require a recompile any time someone adds an additional column or table.
Is there a more dynamic solution?
Probably deserializing into a dynamic is your best bet. If you want to pull the values out of the string directly, you're going to have to (at least partially) parse it anyway, and at that point you might as well just deserialize it.
See this answer for an example using JSON.net: Deserialize json object into dynamic object using Json.net
Deserialize into a dictionary, then construct Dapper DynamicParameters.
How to create arguments for a Dapper query dynamically
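That dictionary route can be sketched as follows, using System.Text.Json to avoid any static type. The actual Dapper call is shown in a comment since it needs an open connection; the table and column names come straight from the JSON, so nothing here is tied to a compiled schema:

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

public static class JsonInsertBuilder
{
    // Turns {"Id":1,"Name":"x"} into
    //   INSERT INTO [SomeTable] ([Id], [Name]) VALUES (@Id, @Name)
    // plus a name -> value dictionary for the parameters.
    public static (string Sql, Dictionary<string, object> Parameters)
        Build(string tableName, string json)
    {
        var doc = JsonSerializer.Deserialize<Dictionary<string, JsonElement>>(json);

        var parameters = doc.ToDictionary(
            kv => kv.Key,
            kv => (object)kv.Value.ToString());

        var cols = string.Join(", ", doc.Keys.Select(k => "[" + k + "]"));
        var vals = string.Join(", ", doc.Keys.Select(k => "@" + k));
        var sql = $"INSERT INTO [{tableName}] ({cols}) VALUES ({vals})";

        // With Dapper:
        //   connection.Execute(sql, new DynamicParameters(parameters));
        return (sql, parameters);
    }
}
```

Note this trusts the JSON property names; since they become SQL identifiers, you would want to validate them against the table's actual columns in anything exposed to untrusted input.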
I am trying to store a C# class filled with properties and fields. How can I store that class in SQL Server 2008?
Can someone help?
There are a number of different ways, and it depends on how you want to take that object back out again. Without further information, we can't help much.
You can use Entity Framework to map the object to a database table. Each property then would correspond to a table column.
You can serialize the object and store in a single column in a database table - serialize as xml, json, or binary blob.
The proper way is to represent the object as a schema: one or more related tables representing the object and its object graph: properties, subtypes, and referenced types. This lets you query the stored object as SQL data and use it for other purposes (e.g., reporting).
Alternatively, you can serialize the object instance. You can do it declaratively via the [Serializable] attribute, or roll your own by implementing the ISerializable interface, and serialize it as binary, JSON, or some other representation of your choice.
You can serialize to/from XML by using XML serialization attributes or by implementing IXmlSerializable.
Or you can ignore the built-in support for this sort of stuff and serialize it your own way.
Once you've serialized it to a Stream, you can store that in a column of the appropriate type (varbinary(max) or varchar(max)), depending on how it was serialized.
A third option would be to create a CLR user-defined type and install the assembly in SQL Server, but I'm not sure I'd suggest that as a "best practice".
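The XML serialization route mentioned above can be sketched with XmlSerializer; the Customer class and its properties here are made up for illustration:

```csharp
using System.IO;
using System.Xml.Serialization;

// A made-up example class. XmlSerializer requires a public
// parameterless constructor and public read/write members.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class XmlStorage
{
    // Produces the string you would store in an xml or nvarchar(max) column.
    public static string ToXml(Customer c)
    {
        var serializer = new XmlSerializer(typeof(Customer));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, c);
            return writer.ToString();
        }
    }

    // Rebuilds the object from the stored column value.
    public static Customer FromXml(string xml)
    {
        var serializer = new XmlSerializer(typeof(Customer));
        using (var reader = new StringReader(xml))
            return (Customer)serializer.Deserialize(reader);
    }
}
```

Using SQL Server's xml column type for this has the advantage that the stored document remains queryable with XQuery, unlike a binary blob.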