XMLSerializer - issue with UTF-8 vs UTF-16 Code - c#

I am trying to serialize a simple object (5 string properties) into XML to save to a DB Image field. Then I need to deserialize it back into an object later in the program.
However, I am getting errors, apparently because the XML is saved declaring itself as UTF-16, while the data I load back from the DB is treated as a UTF-8 string.
The error I get is
InnerException {"There is no Unicode byte order mark. Cannot switch to Unicode."} System.Exception {System.Xml.XmlException}
-- Message "There is an error in XML document (0, 0)." string
Is this happening because of the two different ways I save and load the string to/from the DB? On the save I am using a StringBuilder - but on the load from DB I am using just a String.
Thoughts?
Serialize and Save to DB
// Now Save the OBject XML to the Query Tables
var serializer = new XmlSerializer(ExportConfig.GetType());
StringBuilder StringResult = new StringBuilder();
using (var writer = XmlWriter.Create(StringResult))
{
serializer.Serialize(writer, ExportConfig);
}
//MessageBox.Show("XML : " + StringResult);
// Now Save to the Query
try
{
string UpdateSQL = "Update ZQryRpt "
+ " Set ExportConfig = " + TAGlobal.QuotedStr(StringResult.ToString())
+ " where QryId = " + TAGlobal.QuotedStr(((DataRowView)bindingSource_zQryRpt.Current).Row["QryID"].ToString())
;
ExecNonSelectSQL(UpdateSQL, uniConnection_Config);
}
catch (Exception Error)
{
MessageBox.Show("Error Setting ExportConfig: " + Error.Message);
}
Load from DB And Deserialize
byte[] binaryData = (byte[])((DataRowView)bindingSource_zQryRpt.Current).Row["ExportConfig"];
string XMLStored = System.Text.Encoding.UTF8.GetString(binaryData, 0, binaryData.Length);
if (XMLStored.Length > 0)
{
IIDExportObject ExportConfig = new IIDExportObject();
var serializer = new XmlSerializer(ExportConfig.GetType());
//StringBuilder StringResult = new StringBuilder(XMLStored);
// Load the XML from the Query into the StringBuilder
// Now we need to build a Stream from the String to use in the XMLReader
byte[] byteArray = Encoding.UTF8.GetBytes(XMLStored);
MemoryStream stream = new MemoryStream(byteArray);
using (var reader = XmlReader.Create(stream))
{
ExportConfig = (IIDExportObject)serializer.Deserialize(reader);
}
}

John - thank you very much for the comment! It allowed me to complete the code and find a solution.
As you noted - using a stream reader was the solution - but I could not read the first line because there was only one 'line' in my string. However, I could use the line
using (StreamReader sr = new StreamReader(stream, false))
This lets me read the stream with "Byte Order Mark Detection" set to false.
string XMLStored = MainFormRef.GetExportConfigForCurrentQuery();
if (XMLStored.Length > 0)
{
IIDExportObject ExportConfig = new IIDExportObject();
try
{
var serializer = new XmlSerializer(ExportConfig.GetType());
// Now we need to build a Stream from the String to use in the XMLReader
byte[] byteArray = Encoding.UTF8.GetBytes(XMLStored);
MemoryStream stream = new MemoryStream(byteArray);
// Now we need to use a StreamReader to get around UTF8 vs UTF16 issues
// A little cumbersome - but it works
using (StreamReader sr = new StreamReader(stream, false))
{
using (var reader = XmlReader.Create(sr))
{
ExportConfig = (IIDExportObject)serializer.Deserialize(reader);
}
}
}
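catch (Exception Error)
{
// Report the failure instead of silently swallowing it
MessageBox.Show("Error loading ExportConfig: " + Error.Message);
}
}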
I am not sure this is the best solution - but it works. I will be curious to see if anyone else has a better way of dealing with this.
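One alternative worth considering, sketched here on the assumption that the XML is already held in a .NET string (as XMLStored is above), is to hand the string to the serializer through a StringReader. A TextReader supplies characters rather than bytes, so the parser has no byte stream to decode and the encoding="utf-16" declaration no longer conflicts with anything. XmlStringHelper and FromXml are illustrative names, not part of the original code.

```csharp
using System.IO;
using System.Xml.Serialization;

public static class XmlStringHelper
{
    // Deserializes XML that is already held in a string.
    // No byte array, MemoryStream or StreamReader is needed.
    public static T FromXml<T>(string xml)
    {
        var serializer = new XmlSerializer(typeof(T));
        using (var reader = new StringReader(xml))
        {
            return (T)serializer.Deserialize(reader);
        }
    }
}
```

With this helper the load side would reduce to something like ExportConfig = XmlStringHelper.FromXml<IIDExportObject>(XMLStored);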

Thanks to G Bradley, I took his answer and generalized it a bit to make it a bit easier to call.
public static string SerializeToXmlString<T>(T objectToSerialize)
{
XmlSerializer serializer = new XmlSerializer(typeof(T));
XmlWriterSettings settings = new XmlWriterSettings();
settings.Indent = false;
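// Note: this setting is ignored when writing to a StringBuilder; the XML declaration will still say utf-16
settings.Encoding = Encoding.UTF8;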
StringBuilder builder = new StringBuilder();
using (XmlWriter writer = XmlWriter.Create(builder, settings))
{
serializer.Serialize(writer, objectToSerialize);
}
return builder.ToString();
}
public static T DeserializeFromXmlString<T>(string xmlString)
{
if (string.IsNullOrWhiteSpace(xmlString))
return default;
var serializer = new XmlSerializer(typeof(T));
byte[] byteArray = Encoding.UTF8.GetBytes(xmlString);
MemoryStream stream = new MemoryStream(byteArray);
using (StreamReader sr = new StreamReader(stream, false))
{
using (var reader = XmlReader.Create(sr))
{
return (T)serializer.Deserialize(reader);
}
}
}
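If the stored string should declare utf-8 rather than utf-16, one common workaround is a StringWriter subclass that reports UTF-8 as its encoding. When XmlWriter writes to a TextWriter it takes the declared encoding from that writer rather than from XmlWriterSettings.Encoding. The following is a sketch only; Utf8StringWriter and XmlStringSerializer are illustrative names that do not appear in the answers above.

```csharp
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Serialization;

// StringWriter always reports UTF-16. This subclass reports UTF-8 instead,
// which changes only the encoding name written into the XML declaration.
public sealed class Utf8StringWriter : StringWriter
{
    public override Encoding Encoding => Encoding.UTF8;
}

public static class XmlStringSerializer
{
    public static string ToXml<T>(T value)
    {
        var serializer = new XmlSerializer(typeof(T));
        using (var text = new Utf8StringWriter())
        {
            using (var writer = XmlWriter.Create(text, new XmlWriterSettings { Indent = false }))
            {
                serializer.Serialize(writer, value);
            } // disposing the XmlWriter flushes it into the StringWriter
            return text.ToString();
        }
    }
}
```

The in-memory string is still UTF-16 like every .NET string; only the declaration changes, which matters once the text is later written out as UTF-8 bytes.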

Related

Writing to Filestream and copying to MemoryStream

I want to overwrite or create an xml file on disk, and return the xml from the function. I figured I could do this by copying from FileStream to MemoryStream. But I end up appending a new xml document to the same file, instead of creating a new file each time.
What am I doing wrong? If I remove the copying, everything works fine.
public static string CreateAndSave(IEnumerable<OrderPage> orderPages, string filePath)
{
if (orderPages == null || !orderPages.Any())
{
return string.Empty;
}
var xmlBuilder = new StringBuilder();
var writerSettings = new XmlWriterSettings
{
Indent = true,
Encoding = Encoding.GetEncoding("ISO-8859-1"),
CheckCharacters = false,
ConformanceLevel = ConformanceLevel.Document
};
using (var fs = new FileStream(filePath, FileMode.OpenOrCreate, FileAccess.ReadWrite))
{
try
{
XmlWriter xmlWriter = XmlWriter.Create(fs, writerSettings);
xmlWriter.WriteStartElement("PRINT_JOB");
WriteXmlAttribute(xmlWriter, "TYPE", "Order Confirmations");
foreach (var page in orderPages)
{
xmlWriter.WriteStartElement("PAGE");
WriteXmlAttribute(xmlWriter, "FORM_TYPE", page.OrderType);
var outBound = page.Orders.SingleOrDefault(x => x.FlightInfo.Direction == FlightDirection.Outbound);
var homeBound = page.Orders.SingleOrDefault(x => x.FlightInfo.Direction == FlightDirection.Homebound);
WriteXmlOrder(xmlWriter, outBound, page.ContailDetails, page.UserId, page.PrintType, FlightDirection.Outbound);
WriteXmlOrder(xmlWriter, homeBound, page.ContailDetails, page.UserId, page.PrintType, FlightDirection.Homebound);
xmlWriter.WriteEndElement();
}
xmlWriter.WriteFullEndElement();
MemoryStream destination = new MemoryStream();
fs.CopyTo(destination);
Log.Progress("Xml string length: {0}", destination.Length);
xmlBuilder.Append(Encoding.UTF8.GetString(destination.ToArray()));
destination.Flush();
destination.Close();
xmlWriter.Flush();
xmlWriter.Close();
}
catch (Exception ex)
{
Log.Warning(ex, "Unhandled exception occured during create of xml. {0}", ex.Message);
throw;
}
fs.Flush();
fs.Close();
}
return xmlBuilder.ToString();
}
Cheers
Jens
FileMode.OpenOrCreate is causing the file contents to be overwritten without shortening, leaving any 'trailing' data from previous runs. If FileMode.Create is used the file will be truncated first. However, to read back the contents you just wrote you will need to use Seek to reset the file pointer.
Also, flush the XmlWriter before copying from the underlying stream.
See also the question Simultaneous Read Write a file in C Sharp (3817477).
The following test program seems to do what you want (less your own logging and Order details).
using System;
using System.IO;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Threading.Tasks;
namespace ReadWriteTest
{
class Program
{
static void Main(string[] args)
{
string filePath = Path.Combine(
Environment.GetFolderPath(Environment.SpecialFolder.Personal),
"Test.xml");
string result = CreateAndSave(new string[] { "Hello", "World", "!" }, filePath);
Console.WriteLine("============== FIRST PASS ==============");
Console.WriteLine(result);
result = CreateAndSave(new string[] { "Hello", "World", "AGAIN", "!" }, filePath);
Console.WriteLine("============== SECOND PASS ==============");
Console.WriteLine(result);
Console.ReadLine();
}
public static string CreateAndSave(IEnumerable<string> orderPages, string filePath)
{
if (orderPages == null || !orderPages.Any())
{
return string.Empty;
}
var xmlBuilder = new StringBuilder();
var writerSettings = new XmlWriterSettings
{
Indent = true,
Encoding = Encoding.GetEncoding("ISO-8859-1"),
CheckCharacters = false,
ConformanceLevel = ConformanceLevel.Document
};
using (var fs = new FileStream(filePath, FileMode.Create, FileAccess.ReadWrite))
{
try
{
XmlWriter xmlWriter = XmlWriter.Create(fs, writerSettings);
xmlWriter.WriteStartElement("PRINT_JOB");
foreach (var page in orderPages)
{
xmlWriter.WriteElementString("PAGE", page);
}
xmlWriter.WriteFullEndElement();
xmlWriter.Flush(); // Flush from xmlWriter to fs
xmlWriter.Close();
fs.Seek(0, SeekOrigin.Begin); // Go back to read from the beginning
MemoryStream destination = new MemoryStream();
fs.CopyTo(destination);
// Decode with the same encoding the writer used (ISO-8859-1), not UTF-8
xmlBuilder.Append(writerSettings.Encoding.GetString(destination.ToArray()));
destination.Flush();
destination.Close();
}
catch (Exception ex)
{
throw;
}
fs.Flush();
fs.Close();
}
return xmlBuilder.ToString();
}
}
}
For the optimizers out there, the StringBuilder was unnecessary because the string is formed whole and the MemoryStream can be avoided by just wrapping fs in a StreamReader. This would make the code as follows.
public static string CreateAndSave(IEnumerable<string> orderPages, string filePath)
{
if (orderPages == null || !orderPages.Any())
{
return string.Empty;
}
string result;
var writerSettings = new XmlWriterSettings
{
Indent = true,
Encoding = Encoding.GetEncoding("ISO-8859-1"),
CheckCharacters = false,
ConformanceLevel = ConformanceLevel.Document
};
using (var fs = new FileStream(filePath, FileMode.Create, FileAccess.ReadWrite))
{
try
{
XmlWriter xmlWriter = XmlWriter.Create(fs, writerSettings);
xmlWriter.WriteStartElement("PRINT_JOB");
foreach (var page in orderPages)
{
xmlWriter.WriteElementString("PAGE", page);
}
xmlWriter.WriteFullEndElement();
xmlWriter.Close(); // Flush from xmlWriter to fs
fs.Seek(0, SeekOrigin.Begin); // Go back to read from the beginning
var reader = new StreamReader(fs, writerSettings.Encoding);
result = reader.ReadToEnd();
// reader.Close(); // This would just flush/close fs early (which would be OK)
}
catch (Exception ex)
{
throw;
}
}
return result;
}
I know I'm late, but there seems to be a simpler solution. You want your function to generate xml, write it to a file and return the generated xml. Apparently allocating a string cannot be avoided (because you want it to be returned), same for writing to a file. But reading from a file (as in your and SensorSmith's solutions) can easily be avoided by simply "swapping" the operations - generate xml string and write it to a file. Like this:
var output = new StringBuilder();
var writerSettings = new XmlWriterSettings { /* your settings ... */ };
using (var xmlWriter = XmlWriter.Create(output, writerSettings))
{
// Your xml generation code using the writer
// ...
// You don't need to flush the writer, it will be done automatically
}
// Here the output variable contains the xml, let's take it...
var xml = output.ToString();
// write it to a file...
File.WriteAllText(filePath, xml);
// and we are done :-)
return xml;
IMPORTANT UPDATE: It turns out that the XmlWriter.Create(StringBuilder, XmlWriterSettings) overload ignores the Encoding from the settings and always uses "utf-16", so don't use this method if you need another encoding.
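When another encoding is required, one way to keep the "generate once, then save and return" idea is to write into a MemoryStream instead of a StringBuilder, since the Stream overload does honor XmlWriterSettings.Encoding. This is a sketch under that assumption; XmlFileWriter, WriteAndReturn and the build callback are hypothetical names introduced for illustration.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class XmlFileWriter
{
    // Builds the XML once in memory using the requested encoding, saves those
    // exact bytes to disk, and returns the same content decoded as a string.
    public static string WriteAndReturn(string filePath, Encoding encoding, Action<XmlWriter> build)
    {
        var settings = new XmlWriterSettings { Indent = true, Encoding = encoding };
        using (var buffer = new MemoryStream())
        {
            using (var writer = XmlWriter.Create(buffer, settings))
            {
                build(writer); // the caller emits the elements
            } // disposing flushes the writer; the MemoryStream stays open

            File.WriteAllBytes(filePath, buffer.ToArray());

            buffer.Position = 0;
            // StreamReader also strips a byte order mark if the encoding wrote one.
            using (var reader = new StreamReader(buffer, encoding))
            {
                return reader.ReadToEnd();
            }
        }
    }
}
```

File.WriteAllBytes replaces any existing file, so no stale trailing data from a previous, longer document is left behind.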

Using DataContractJsonSerializer to create a Non XML Json file

I want to use the DataContractJsonSerializer to serialize to file in JsonFormat.
The problem is that the WriteObject method only accepts 3 kinds of target: XmlWriter, XmlDictionaryWriter and Stream.
To get what I want I used the following code:
var js = new DataContractJsonSerializer(typeof(T), _knownTypes);
using (var ms = new MemoryStream())
{
js.WriteObject(ms, item);
ms.Position = 0;
using (var sr = new StreamReader(ms))
{
using (var writer = new StreamWriter(path, false))
{
string jsonData = sr.ReadToEnd();
writer.Write(jsonData);
}
}
}
Is this the only way or have I missed something?
Assuming you're just trying to write the text to a file, it's not clear why you're writing it to a MemoryStream first. You can just use:
var js = new DataContractJsonSerializer(typeof(T), _knownTypes);
using (var stream = File.Create(path))
{
js.WriteObject(stream, item);
}
That's rather simpler, and should do what you want...
I am actually quite terrified to claim to know something that Jon Skeet doesn't, but I have used code similar to the following which produces the Json text file and maintains proper indentation:
var js = new DataContractJsonSerializer(typeof(T), _knownTypes);
using (var stream = File.Create(path))
{
using (var writer = JsonReaderWriterFactory.CreateJsonWriter(stream, Encoding.UTF8, true, true, "\t"))
{
js.WriteObject(writer, item);
writer.Flush();
}
}
(as suggested here.)

C# - From NetworkStream to XElement managing different encoding

I have an application that fetches XML over the network (using TcpClient). The XML documents come in various encodings (one site uses UTF-8, another Windows-1252). I would like to always convert them to UTF-8 to be sure I'm clean.
How can I go from the NetworkStream to an XElement with all data decoded correctly?
I have this:
NetworkStream _clientStream = /* ... */;
MemoryStream _responseBytes = new MemoryStream();
// serverEncoding -> Xml Encoding I get from server
// _UTF8Encoder -> Local encoder (always UTF8)
try
{
_clientStream.CopyTo(_responseBytes);
if (serverEncoding != _UTF8Encoder)
{
MemoryStream encodedStream = new MemoryStream();
string line = null;
using (StreamReader reader = new StreamReader(_responseBytes))
{
using (StreamWriter writer = new StreamWriter(encodedStream))
{
while ((line = reader.ReadLine()) != null)
{
writer.WriteLine(
Encoding.Convert(serverEncoding, _UTF8Encoder, serverEncoding.GetBytes(line))
);
}
}
}
_responseBytes = encodedStream;
}
_responseBytes.Position = 0;
using (XmlReader reader = XmlReader.Create(_responseBytes))
{
xmlResult = XElement.Load(reader, LoadOptions.PreserveWhitespace);
}
}
catch (Exception ex)
{ }
Do you have a better solution (and one that ignores all '\0')?
Edit
This works:
byte[] b = _clientStream.ReadToEnd();
var text = _UTF8Encoder.GetString(b, 0, b.Length);
xmlResult = XElement.Parse(text, LoadOptions.PreserveWhitespace);
But this does not:
using (var reader = new StreamReader(_clientStream, false))
xmlResult = XElement.Load(reader, LoadOptions.PreserveWhitespace);
I don't understand why ...
You can simply create a StreamReader around the NetworkStream, passing the encoding of the stream, then pass it to XElement.Load:
XElement elem;
using (var reader = new StreamReader(_clientStream, serverEncoding))
elem = XElement.Load(reader);
There is no point in manually transcoding it to a different encoding.
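As for why the StreamReader(_clientStream, false) attempt in the question fails: the second argument of that overload is detectEncodingFromByteOrderMarks, not an encoding, so the reader falls back to UTF-8 and mis-decodes Windows-1252 bytes. The sketch below wraps the approach from this answer in a helper that takes any Stream, so it can be tried without a network connection. EncodedXmlLoader is an illustrative name.

```csharp
using System.IO;
using System.Text;
using System.Xml.Linq;

public static class EncodedXmlLoader
{
    // Decodes the stream with the encoding the server is known to use and
    // parses the resulting characters; no intermediate transcoding to UTF-8.
    public static XElement Load(Stream source, Encoding sourceEncoding)
    {
        using (var reader = new StreamReader(source, sourceEncoding))
        {
            return XElement.Load(reader);
        }
    }
}
```

Once loaded, the XElement holds ordinary .NET strings, so the original byte encoding only matters again when the element is saved, and the target encoding can be chosen at that point.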

Xml Serialization / Deserialization Problem

After writing and reading an xml string to and from a stream, it ceases to be deserializable. The new string is clipped.
string XmlContent = getContentFromMyDataBase();
XmlSerializer xs = new XmlSerializer(typeof(MyObj));
MemoryStream ms = new MemoryStream();
StreamWriter sw = new StreamWriter(ms);
char[] ca = XmlContent.ToCharArray(); // still working up to this point.
ms.Position = 0;
sw.Write(ca);
StreamReader sr = new StreamReader(ms);
ms.Position = 0;
string XmlContentAgain = sr.ReadToEnd();
Console.WriteLine(XmlContentAgain); // (outputstring is too short.)
MyObj theObj = (MyObj)xs.Deserialize(ms); // Can't deserialize.
Any suggestions as to how to fix this or what is causing the problem? My only guess is that there is some form of encoding issue, but I wouldn't know how to go about finding/fixing it.
Additionally, myObj has a generic dictionary member, which typically isn't serializable, so I have stolen code from Paul Welter in order to serialize it.
Try flushing and disposing or even better simplify your code using a StringReader:
string xmlContent = getContentFromMyDataBase();
var xs = new XmlSerializer(typeof(MyObj));
using (var reader = new StringReader(xmlContent))
{
var theObj = (MyObj)xs.Deserialize(reader);
}
Note: The getContentFromMyDataBase method also suggests that you are storing XML in your database that you are deserializing back to an object. Don't.
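You need to Flush the StreamWriter before reading, or you cannot be sure it is done writing to the underlying stream. This is because it does some internal buffering. Closing the writer also flushes it, but closing a StreamWriter closes the underlying MemoryStream as well, so flush it if you still need to read from the stream.
Try this:
StreamWriter sw = new StreamWriter(ms);
char[] ca = XmlContent.ToCharArray(); // still working up to this point.
sw.Write(ca);
sw.Flush(); // push the buffered characters into ms without closing it
ms.Position = 0;
StreamReader sr = new StreamReader(ms);
string XmlContentAgain = sr.ReadToEnd();
ms.Position = 0; // rewind again before calling xs.Deserialize(ms)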

Persist a DataContract as XML in a database

I'm working on a kind of "store and forward" application for WCF services. I want to save the message in a database as a raw XML blob, as XElement. I'm having a bit of trouble converting the datacontract into the XElement type I need for the database call. Any ideas?
This returns it as a string, which you can put into an xml column in the db. Here is a good generic method you can use to serialize data contracts.
public static string Serialize<T>(T obj)
{
StringBuilder sb = new StringBuilder();
DataContractSerializer ser = new DataContractSerializer(typeof(T));
ser.WriteObject(XmlWriter.Create(sb), obj);
return sb.ToString();
}
By the way, are you using LINQ to SQL? The reason I ask is because of the XElement part of your question. If that's the case, you can modify this in the .dbml designer to use a string as the CLR type, and not the default XElement.
The most voted-on answer (posted by Jason W.) did not work for me. I don't know why that answer got the most votes. But after searching around I found this:
http://billrob.com/archive/2010/02/09/datacontractserializer-converting-objects-to-xml-string.aspx
which worked for my project. I just had a few classes; I put the DataContract and DataMember attributes on the classes and properties and then wanted to get an XML string which I could write to the database.
Code from the link above in case it goes 404:
Serializes:
var serializer = new DataContractSerializer(tempData.GetType());
using (var backing = new System.IO.StringWriter())
using (var writer = new System.Xml.XmlTextWriter(backing))
{
serializer.WriteObject(writer, tempData);
data.XmlData = backing.ToString();
}
Deserializes:
var serializer = new DataContractSerializer(typeof(T));
using (var backing = new System.IO.StringReader(data.XmlData))
using (var reader = new System.Xml.XmlTextReader(backing))
{
return serializer.ReadObject(reader) as T;
}
If your database is SQL Server 2005 or above, you can use the XML data type:
private readonly DataContractToSerialize _testContract =
new DataContractToSerialize
{
ID = 1,
Name = "One",
Children =
{
new ChildClassToSerialize {ChildMember = "ChildOne"},
new ChildClassToSerialize {ChildMember = "ChildTwo"}
}
};
public void SerializeDataContract()
{
using (var outputStream = new MemoryStream())
{
using (var writer = XmlWriter.Create(outputStream))
{
var serializer =
new DataContractSerializer(_testContract.GetType());
if (writer != null)
{
serializer.WriteObject(writer, _testContract);
}
}
outputStream.Position = 0;
using (
var conn =
new SqlConnection(Settings.Default.ConnectionString))
{
conn.Open();
const string INSERT_COMMAND =
@"INSERT INTO XmlStore (Data) VALUES (@Data)";
using (var cmd = new SqlCommand(INSERT_COMMAND, conn))
{
using (var reader = XmlReader.Create(outputStream))
{
var xml = new SqlXml(reader);
cmd.Parameters.Clear();
cmd.Parameters.AddWithValue("@Data", xml);
cmd.ExecuteNonQuery();
}
}
}
}
}
I'm not sure about the most efficient way to get it to an XElement, but to get it to a byte array (which you can then decode to a string) just run:
DataContractSerializer serializer = new DataContractSerializer(typeof(Foo));
using (MemoryStream memStream = new MemoryStream())
{
serializer.WriteObject(memStream, fooInstance);
byte[] blob = memStream.ToArray();
}
I tried to use Jason W.'s Serialize function that uses a StringBuilder, but it returns an empty string for a LINQ to SQL designer-generated table class
with the [DataContract()] attribute.
However, if I serialize to a byte array as suggested by AgileJon
and then decode the bytes as UTF-8 (the encoding DataContractSerializer uses when writing to a stream), it creates a readable XML string.
static string DataContractSerializeUsingByteArray<T>(T obj)
{
string sRet = "";
DataContractSerializer serializer = new DataContractSerializer(typeof(T));
using (MemoryStream memStream = new MemoryStream())
{
serializer.WriteObject(memStream, obj);
byte[] blob = memStream.ToArray();
// WriteObject(Stream, ...) emits UTF-8, so decode with UTF-8 rather than UTF-7
sRet = System.Text.Encoding.UTF8.GetString(blob);
}
return sRet;
}
Not sure why the StringBuilder solution is not working.
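A likely cause of the empty string is that the XmlWriter created by XmlWriter.Create(sb) in the earlier Serialize<T> method is never flushed or disposed. XmlWriter buffers its output, so a small document can still be sitting in that buffer when sb.ToString() is called. The sketch below is the same method with the writer wrapped in a using block; DataContractXml is an illustrative class name.

```csharp
using System.Runtime.Serialization;
using System.Text;
using System.Xml;

public static class DataContractXml
{
    public static string Serialize<T>(T obj)
    {
        var sb = new StringBuilder();
        var serializer = new DataContractSerializer(typeof(T));
        // Disposing the writer flushes its internal buffer into the StringBuilder.
        using (var writer = XmlWriter.Create(sb))
        {
            serializer.WriteObject(writer, obj);
        }
        return sb.ToString();
    }
}
```

The resulting string carries an encoding="utf-16" declaration because it was written to a StringBuilder, which is worth keeping in mind if it is later stored or transmitted as bytes.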
