Convert Deserialization method to Async - c#

I am trying to convert this method, which deserializes an object from a string, to use async/await.
public static T DeserializeObject<T>(string xml)
{
using (StringReader reader = new StringReader(xml))
{
using (XmlReader xmlReader = XmlReader.Create(reader))
{
DataContractSerializer serializer = new DataContractSerializer(typeof(T));
T theObject = (T)serializer.ReadObject(xmlReader);
return theObject;
}
}
}

Most serialization APIs do not have async implementations, which means the only thing you can really do is wrap a sync method. For example:
public static Task<T> DeserializeObjectAsync<T>(string xml)
{
using (StringReader reader = new StringReader(xml))
{
using (XmlReader xmlReader = XmlReader.Create(reader))
{
DataContractSerializer serializer =
new DataContractSerializer(typeof(T));
T theObject = (T)serializer.ReadObject(xmlReader);
return Task.FromResult(theObject);
}
}
}
This isn't actually async - it just meets the required API. If you have the option, using ValueTask<T> is preferable in scenarios where the result may often be synchronous.
Either way, you should then be able to do something like:
var obj = await DeserializeObjectAsync<YourType>(someXml);
Debug.WriteLine(obj.Name); // etc
without needing to know whether the actual implementation was synchronous or asynchronous.

A little sample, pretty primitive way:
public delegate T Async<T>(string xml);
public void Start<T>()
{
string xml = "<Person/>";
Async<T> asyncDeserialization = DeserializeObject<T>;
asyncDeserialization.BeginInvoke(xml, Callback<T>, asyncDeserialization);
}
private void Callback<T>(IAsyncResult ar)
{
Async<T> dlg = (Async<T>)ar.AsyncState;
T item = dlg.EndInvoke(ar);
}
public T DeserializeObject<T>(string xml)
{
using (StringReader reader = new StringReader(xml))
{
using (XmlReader xmlReader = XmlReader.Create(reader))
{
DataContractSerializer serializer = new DataContractSerializer(typeof(T));
T theObject = (T)serializer.ReadObject(xmlReader);
return theObject;
}
}
}
You define a delegate and use it to Begin/End invoke with callbacks.
In newer versions of C# you can use the async keyword to get your code to run asynchronously.

As you are working with a string as the data source, doing things async would only introduce more overhead and give you nothing in return.
But if you were reading from a stream, you could copy from the source stream to a MemoryStream (buffering all data) and then deserialize from the MemoryStream. That would increase memory usage but reduce the amount of time you block the thread.
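A minimal sketch of that buffering idea, assuming the XML arrives on a Stream rather than as a string. The class name AsyncDeserializer and the Stream-based overload are hypothetical and not part of the original question; only the copy into the buffer is asynchronous, while ReadObject still runs synchronously on the in-memory data.

```csharp
using System.IO;
using System.Runtime.Serialization;
using System.Threading.Tasks;

public static class AsyncDeserializer
{
    // Hypothetical helper: buffers the source asynchronously, then deserializes from memory.
    public static async Task<T> DeserializeObjectAsync<T>(Stream source)
    {
        using (var buffer = new MemoryStream())
        {
            // The only truly asynchronous step: read the source without blocking the thread.
            await source.CopyToAsync(buffer).ConfigureAwait(false);
            buffer.Position = 0;

            // Deserialization is still synchronous, but it no longer waits on I/O.
            var serializer = new DataContractSerializer(typeof(T));
            return (T)serializer.ReadObject(buffer);
        }
    }
}
```

The trade-off is the one described above: the whole payload sits in memory before deserialization starts.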

You can return
Task.FromResult(theObject)

Related

Invalid OpenXml after converting from XElement

I'm using this code to convert from an XElement to OpenXmlElement
internal static OpenXmlElement ToOpenXml(this XElement xel)
{
using (var sw = new StreamWriter(new MemoryStream()))
{
sw.Write(xel.ToString());
sw.Flush();
sw.BaseStream.Seek(0, SeekOrigin.Begin);
var re = OpenXmlReader.Create(sw.BaseStream);
re.Read();
var oxe = re.LoadCurrentElement();
re.Close();
return oxe;
}
}
Before the conversion I have an XElement
<w:ind w:firstLine="0" w:left="0" w:right="0"/>
After the conversion it looks like this
<w:ind w:firstLine="0" w:end="0" w:start="0"/>
This element then fails OpenXml validation using the following
var v = new OpenXmlValidator();
var errs = v.Validate(doc);
With the errors being reported:
Description="The 'http://schemas.openxmlformats.org/wordprocessingml/2006/main:start' attribute is not declared."
Description="The 'http://schemas.openxmlformats.org/wordprocessingml/2006/main:end' attribute is not declared."
Do I need to do other things to add these attributes to the schema or do I need to find a new way to convert from XElement to OpenXml?
I'm using the nuget package DocumentFormat.OpenXml ver 2.9.1 (the latest).
EDIT: Looking at the OpenXml standard, it seems that both left/start and right/end should be recognised which would point to the OpenXmlValidator not being quite correct. Presumably I can just ignore those validation errors then?
Many thx
The short answer is that you can indeed ignore those specific validation errors. The OpenXmlValidator is not up-to-date in this case.
I would additionally offer a more elegant implementation of your ToOpenXml method (note the using declarations, which were added in C# 8.0).
internal static OpenXmlElement ToOpenXmlElement(this XElement element)
{
// Write XElement to MemoryStream.
using var stream = new MemoryStream();
element.Save(stream);
stream.Seek(0, SeekOrigin.Begin);
// Read OpenXmlElement from MemoryStream.
using OpenXmlReader reader = OpenXmlReader.Create(stream);
reader.Read();
return reader.LoadCurrentElement();
}
If you don't use C# 8.0 or using declarations, here's the corresponding code with using statements.
internal static OpenXmlElement ToOpenXmlElement(this XElement element)
{
using (var stream = new MemoryStream())
{
// Write XElement to MemoryStream.
element.Save(stream);
stream.Seek(0, SeekOrigin.Begin);
// Read OpenXmlElement from MemoryStream.
using (OpenXmlReader reader = OpenXmlReader.Create(stream))
{
reader.Read();
return reader.LoadCurrentElement();
}
}
}
Here's the corresponding unit test, which also demonstrates that you'd have to pass a w:document to have the w:ind element's attributes changed by the Indentation instance created in the process.
public class OpenXmlReaderTests
{
private const string NamespaceUriW = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
private static readonly string XmlnsW = $"xmlns:w=\"{NamespaceUriW}\"";
private static readonly string IndText =
$@"<w:ind {XmlnsW} w:firstLine=""10"" w:left=""20"" w:right=""30""/>";
private static readonly string DocumentText =
$@"<w:document {XmlnsW}><w:body><w:p><w:pPr>{IndText}</w:pPr></w:p></w:body></w:document>";
[Fact]
public void ConvertingDocumentChangesIndProperties()
{
XElement element = XElement.Parse(DocumentText);
var document = (Document) element.ToOpenXmlElement();
Indentation ind = document.Descendants<Indentation>().First();
Assert.Null(ind.Left);
Assert.Null(ind.Right);
Assert.Equal("10", ind.FirstLine);
Assert.Equal("20", ind.Start);
Assert.Equal("30", ind.End);
}
[Fact]
public void ConvertingIndDoesNotChangeIndProperties()
{
XElement element = XElement.Parse(IndText);
var ind = (OpenXmlUnknownElement) element.ToOpenXmlElement();
Assert.Equal("10", ind.GetAttribute("firstLine", NamespaceUriW).Value);
Assert.Equal("20", ind.GetAttribute("left", NamespaceUriW).Value);
Assert.Equal("30", ind.GetAttribute("right", NamespaceUriW).Value);
}
}

C# - Call/write a deserialization method in another class

I'm trying to de-serialize a class in order to save user specific values (like a dirpath) in an XML-file.
Here's a snippet of my class:
[Serializable]
public class CustomSettings
{
public string initPath;
public string InitPath => initPath;
}
Both the serialization and deserialization work perfectly fine if I do this inside my MainWindow class:
CustomSettings cs = new CustomSettings();
XmlSerializer mySerializer = new XmlSerializer(typeof(CustomSettings));
StreamWriter myWriter = new StreamWriter(@"U:\Alex\prefs.xml");
mySerializer.Serialize(myWriter, cs);
myWriter.Close();
And for the deserialization:
CustomSettings cs = new CustomSettings();
XmlSerializer _mySerializer = new XmlSerializer(typeof(CustomSettings));
FileStream myFstream = new FileStream(@"U:\Alex\prefs.xml", FileMode.Open);
cs = (CustomSettings)_mySerializer.Deserialize(myFstream);
myFstream.Close();
Now since I need to do the de-serialization a few times, I figured I'd write two methods to do the above work, but for readability I want them to be inside another class.
In MainWindow I call this method inside another class, which again, works as intended:
public void WriteSettingsToFile(CustomSettings cs)
{
XmlSerializer mySerializer = new XmlSerializer(typeof(CustomSettings));
StreamWriter myWriter = new StreamWriter(@"U:\Alex\prefs.xml");
mySerializer.Serialize(myWriter, cs);
myWriter.Close();
}
The de-serialization function I wrote does not work, though. It loads the file successfully, returning the correct path value but as soon as it returns to the calling class, the string is always null. I rewrote the method several times, trying void, non-void returning a string, static etc. but I'm stuck.
Any help and ideas are appreciated!
Edit:
Here's one of the tries with the deserialization method:
public void GetInitPath(CustomSettings cs)
{
XmlSerializer _mySerializer = new XmlSerializer(typeof(CustomSettings));
FileStream myFstream = new FileStream(@"U:\Alex\prefs.xml", FileMode.Open);
cs = (CustomSettings)_mySerializer.Deserialize(myFstream);
myFstream.Close();
}
Edit 2:
With the help of Matthew Watson I was able to solve my problem.
public void GetInitPath(ref CustomSettings cs)
{
XmlSerializer serializer = new XmlSerializer(typeof(CustomSettings));
using (var myFstream = new FileStream(@"U:\Alex\prefs.xml", FileMode.Open))
{
cs = (CustomSettings)serializer.Deserialize(myFstream);
}
}
This is your method that doesn't work:
public void GetInitPath(CustomSettings cs)
{
XmlSerializer _mySerializer = new XmlSerializer(typeof(CustomSettings));
FileStream myFstream = new FileStream(@"U:\Alex\prefs.xml", FileMode.Open);
cs = (CustomSettings)_mySerializer.Deserialize(myFstream);
myFstream.Close();
}
The problem with this method is that it DOESN'T RETURN THE RESULT.
It overwrites the method's local copy of the cs reference that is passed to GetInitPath() with a reference to the object returned from .Deserialize(), but that does NOT change the original object passed to GetInitPath(), nor the caller's variable.
The solution is either to declare the parameter as a ref parameter, or (much better) change the method to return the result like so:
public CustomSettings GetInitPath()
{
XmlSerializer _mySerializer = new XmlSerializer(typeof(CustomSettings));
FileStream myFstream = new FileStream(@"U:\Alex\prefs.xml", FileMode.Open);
var cs = (CustomSettings)_mySerializer.Deserialize(myFstream);
myFstream.Close();
return cs;
}
Another observation: It's better to use using when possible to wrap resources that are disposable. If you do that, your method would look like this:
public CustomSettings GetInitPath()
{
var serializer = new XmlSerializer(typeof(CustomSettings));
using (var myFstream = new FileStream(@"U:\Alex\prefs.xml", FileMode.Open))
{
return (CustomSettings) serializer.Deserialize(myFstream);
}
}
This will close myFstream even if an exception occurs in Deserialize().
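To illustrate why the original method appeared to lose its result, here is a small sketch with no file I/O. The RefDemo class, its method names, and the path literal are made up for illustration; CustomSettings is reduced to the single field from the question.

```csharp
public class CustomSettings
{
    public string initPath;
}

public static class RefDemo
{
    // By-value parameter: the assignment only changes this method's local copy of the reference.
    public static void LoadByValue(CustomSettings cs)
    {
        cs = new CustomSettings { initPath = @"C:\loaded" };
    }

    // ref parameter: the assignment replaces the caller's variable.
    public static void LoadByRef(ref CustomSettings cs)
    {
        cs = new CustomSettings { initPath = @"C:\loaded" };
    }

    // Returning the new object avoids the question entirely.
    public static CustomSettings Load()
    {
        return new CustomSettings { initPath = @"C:\loaded" };
    }
}
```

Given `var settings = new CustomSettings();`, calling `RefDemo.LoadByValue(settings)` leaves `settings.initPath` as null, whereas `RefDemo.LoadByRef(ref settings)` or `settings = RefDemo.Load()` makes it `C:\loaded`.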

How to control the encoding while serializing and deserializing?

I'm using serialization to a string as follows.
public static string Stringify(this Process self)
{
XmlSerializer serializer = new XmlSerializer(typeof(Process));
using (StringWriter writer = new StringWriter())
{
serializer.Serialize(writer, self);
return writer.ToString();
}
}
Then, I deserialize using this code. Please note that it's not an actual stringification from above that's used. In our business logic, it makes more sense to serialize a path, hence reading in from said path and creating an object based on the read data.
public static Process Processify(this string self)
{
XmlSerializer serializer = new XmlSerializer(typeof(Process));
using (XmlReader reader = XmlReader.Create(self))
{
return serializer.Deserialize(reader) as Process;
}
}
This works as supposed to except for a small issue with encoding. The string XML that's produced, contains the addition encoding="utf-16" as an attribute on the base tag (the one that's about XML version, not the actual data).
When I read in, I get an exception because of mismatching encodings. As far as I could see, there's no way to specify the encoding for serialization or deserialization in any of the objects I'm using.
How can I do that?
For now, I'm using a very crude workaround: simply cutting off the excess junk like so. It's Q&D and I want to remove it.
public static string Stringify(this Process self)
{
XmlSerializer serializer = new XmlSerializer(typeof(Process));
using (StringWriter writer = new StringWriter())
{
serializer.Serialize(writer, self);
return writer.ToString().Replace(" encoding=\"utf-16\"", "");
}
}
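One commonly used way to avoid the string replacement is to give the serializer a StringWriter that reports UTF-8 as its encoding, since the encoding attribute in the XML declaration is taken from the writer. The sketch below assumes that approach; Utf8StringWriter is a hypothetical helper, and the Process class here is only a stand-in for the real type in the question. Whether this removes the read error also depends on the resulting string actually being written to disk as UTF-8.

```csharp
using System.IO;
using System.Text;
using System.Xml.Serialization;

// Hypothetical helper: a StringWriter that reports UTF-8 instead of the default UTF-16.
public sealed class Utf8StringWriter : StringWriter
{
    public override Encoding Encoding
    {
        get { return Encoding.UTF8; }
    }
}

// Stand-in for the Process type from the question (assumption for illustration).
public class Process
{
    public string Name { get; set; }
}

public static class ProcessExtensions
{
    public static string Stringify(this Process self)
    {
        var serializer = new XmlSerializer(typeof(Process));
        using (var writer = new Utf8StringWriter())
        {
            // The XML declaration now names the writer's encoding, utf-8.
            serializer.Serialize(writer, self);
            return writer.ToString();
        }
    }
}
```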

XMLSerializer to XElement

I have been working with XML in database LINQ and find it very difficult to work with the serializer.
The database LINQ requires a field that stores an XElement.
I have a complex object with many custom structure classes, so I would like to use the XmlSerializer to serialize the object.
However, the serializer can only serialize to a file ("C:\xxx\xxx.xml") or a memory stream.
How can I convert or serialize it to an XElement so that I can store it in the database using LINQ?
And how do I do the reverse, i.e. deserialize an XElement?
Try to use this
using (var stream = new MemoryStream())
{
serializer.Serialize(stream, value);
stream.Position = 0;
using (XmlReader reader = XmlReader.Create(stream))
{
XElement element = XElement.Load(reader);
}
}
deserialize :
XmlSerializer xs = new XmlSerializer(typeof(XElement));
using (MemoryStream ms = new MemoryStream())
{
xs.Serialize(ms, xml);
ms.Position = 0;
xs = new XmlSerializer(typeof(YourType));
object obj = xs.Deserialize(ms);
}
To make what John Saunders was describing more explicit, deserialization is very straightforward:
public static object DeserializeFromXElement(XElement element, Type t)
{
using (XmlReader reader = element.CreateReader())
{
XmlSerializer serializer = new XmlSerializer(t);
return serializer.Deserialize(reader);
}
}
Serialization is a little messier because calling CreateWriter() from an XElement or XDocument creates child elements. (In addition, the XmlWriter created from an XElement has ConformanceLevel.Fragment, which causes XmlSerialize to fail unless you use the workaround here.) As a result, I use an XDocument, since this requires a single element, and gets us around the XmlWriter issue:
public static XElement SerializeToXElement(object o)
{
var doc = new XDocument();
using (XmlWriter writer = doc.CreateWriter())
{
XmlSerializer serializer = new XmlSerializer(o.GetType());
serializer.Serialize(writer, o);
}
return doc.Root;
}
First of all, see the Serialize method documentation: the serializer can handle a lot more than just memory streams or files.
Second, try using XElement.CreateWriter and then passing the resulting XmlWriter to the serializer.
SQL has an XML data type; maybe that can help you. Take a look at MSDN.

streaming XML serialization in .net

I'm trying to serialize a very large IEnumerable<MyObject> using an XmlSerializer without keeping all the objects in memory.
The IEnumerable<MyObject> is actually lazy.
I'm looking for a streaming solution that will:
Take an object from the IEnumerable<MyObject>
Serialize it to the underlying stream using the standard serialization (I don't want to handcraft the XML here!)
Discard the in memory data and move to the next
I'm trying with this code:
using (var writer = new StreamWriter(filePath))
{
var xmlSerializer = new XmlSerializer(typeof(MyObject));
foreach (var myObject in myObjectsIEnumerable)
{
xmlSerializer.Serialize(writer, myObject);
}
}
but I'm getting multiple XML headers and I cannot specify a root tag <MyObjects> so my XML is invalid.
Any idea?
Thanks
The XmlWriter class is a fast streaming API for XML generation. It is rather low-level; MSDN has an article on instantiating a validating XmlWriter using XmlWriter.Create().
Edit: link fixed. Here is sample code from the article:
async Task TestWriter(Stream stream)
{
XmlWriterSettings settings = new XmlWriterSettings();
settings.Async = true;
using (XmlWriter writer = XmlWriter.Create(stream, settings)) {
await writer.WriteStartElementAsync("pf", "root", "http://ns");
await writer.WriteStartElementAsync(null, "sub", null);
await writer.WriteAttributeStringAsync(null, "att", null, "val");
await writer.WriteStringAsync("text");
await writer.WriteEndElementAsync();
await writer.WriteCommentAsync("cValue");
await writer.WriteCDataAsync("cdata value");
await writer.WriteEndElementAsync();
await writer.FlushAsync();
}
}
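Applied to the question, one possible sketch is to write the root element by hand with an XmlWriter and let XmlSerializer write each item into that same writer, so only one item is materialized at a time. The StreamingXml class, the WriteAll method, and the MyObject shape below are assumptions for illustration, not code from the MSDN article.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Stand-in for the asker's type (assumption for illustration).
public class MyObject
{
    public int Id { get; set; }
}

public static class StreamingXml
{
    // Hypothetical helper: one XML declaration, one <MyObjects> root, items written one by one.
    public static void WriteAll(Stream output, IEnumerable<MyObject> items)
    {
        var serializer = new XmlSerializer(typeof(MyObject));
        var settings = new XmlWriterSettings { Indent = true };
        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            writer.WriteStartDocument();
            writer.WriteStartElement("MyObjects");
            foreach (MyObject item in items)
            {
                // The writer is already inside the root element, so no second declaration is emitted.
                serializer.Serialize(writer, item);
            }
            writer.WriteEndElement();
            writer.WriteEndDocument();
        }
    }
}
```

Each item element will carry the serializer's default xsi/xsd namespace declarations; passing an XmlSerializerNamespaces instance to Serialize is one way to control that.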
Here's what I use:
using System;
using System.Collections.Generic;
using System.Xml;
using System.Xml.Serialization;
using System.Text;
using System.IO;
namespace Utils
{
public class XMLSerializer
{
public static Byte[] StringToUTF8ByteArray(String xmlString)
{
return new UTF8Encoding().GetBytes(xmlString);
}
public static String SerializeToXML<T>(T objectToSerialize)
{
StringBuilder sb = new StringBuilder();
XmlWriterSettings settings =
new XmlWriterSettings {Encoding = Encoding.UTF8, Indent = true};
using (XmlWriter xmlWriter = XmlWriter.Create(sb, settings))
{
if (xmlWriter != null)
{
new XmlSerializer(typeof(T)).Serialize(xmlWriter, objectToSerialize);
}
}
return sb.ToString();
}
public static void DeserializeFromXML<T>(string xmlString, out T deserializedObject) where T : class
{
XmlSerializer xs = new XmlSerializer(typeof (T));
using (MemoryStream memoryStream = new MemoryStream(StringToUTF8ByteArray(xmlString)))
{
deserializedObject = xs.Deserialize(memoryStream) as T;
}
}
}
}
Then just call:
string xml = Utils.XMLSerializer.SerializeToXML(myObjectsIEnumerable);
I haven't tried it with, for example, an IEnumerable that fetches objects one at a time remotely, or any other weird use cases, but it works perfectly for List<T> and other collections that are in memory.
EDIT: Based on your comments in response to this, you could use XmlDocument.LoadXml to load the resulting XML string into an XmlDocument, save the first one to a file, and use that as your master XML file. For each item in the IEnumerable, use LoadXml again to create a new in-memory XmlDocument, grab the nodes you want, append them to the master document, and save it again, getting rid of the new one.
After you're finished, there may be a way to wrap all of the nodes in your root tag. You could also use XSL and XslCompiledTransform to write another XML file with the objects properly wrapped in the root tag.
You can do this by implementing the IXmlSerializable interface on the large class. The implementation of the WriteXml method can write the start tag, then simply loop over the IEnumerable<MyObject> and serialize each MyObject to the same XmlWriter, one at a time.
In this implementation, there won't be any in-memory data to get rid of (past what the garbage collector will collect).
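A rough sketch of that IXmlSerializable idea, assuming a wrapper class named MyObjects (hypothetical) around the lazy sequence. Only writing is covered; ReadXml is deliberately left unsupported, and MyObject is a stand-in for the real type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

// Stand-in for the asker's type (assumption for illustration).
public class MyObject
{
    public int Id { get; set; }
}

// Hypothetical wrapper: XmlSerializer writes the <MyObjects> element and calls WriteXml for the content.
public class MyObjects : IXmlSerializable
{
    private static readonly XmlSerializer ItemSerializer = new XmlSerializer(typeof(MyObject));
    private readonly IEnumerable<MyObject> items;

    // XmlSerializer requires a public parameterless constructor.
    public MyObjects() : this(Enumerable.Empty<MyObject>()) { }

    public MyObjects(IEnumerable<MyObject> items)
    {
        this.items = items;
    }

    public XmlSchema GetSchema()
    {
        return null;
    }

    public void ReadXml(XmlReader reader)
    {
        throw new NotSupportedException("This sketch only covers writing.");
    }

    public void WriteXml(XmlWriter writer)
    {
        // Enumerate lazily: each item is serialized to the shared writer and can then be collected.
        foreach (MyObject item in items)
        {
            ItemSerializer.Serialize(writer, item);
        }
    }
}
```

With this in place, `new XmlSerializer(typeof(MyObjects)).Serialize(writer, new MyObjects(myObjectsIEnumerable))` would produce a single document with one root element.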
