Serializing class structure to XML seems to add a NewLine character - c#

The code below serializes an object to an XML string, then writes it to an XML file (yes, there is quite a bit going on with respect to UTF-8 and removal of the namespace):
var bidsXml = string.Empty;
var emptyNamespaces = new XmlSerializerNamespaces(new[] { XmlQualifiedName.Empty });
var settings = new XmlWriterSettings();
settings.Indent = true;
settings.OmitXmlDeclaration = true;
activity = $"Serialize Class INFO to XML to string";
using (MemoryStream stream = new MemoryStream())
using (StreamWriter writer = new StreamWriter(stream, Encoding.UTF8))
{
    XmlSerializer xml = new XmlSerializer(info.GetType());
    xml.Serialize(writer, info, emptyNamespaces);
    bidsXml = Encoding.UTF8.GetString(stream.ToArray());
}
var lastChar = bidsXml.Substring(bidsXml.Length - 1); // the final character of the string
var fileName = $"CostOffer_Testing_{DateTime.Now:yyyy.MM.dd_HH.mm.ss}.xml";
var path = $"c:\\temp\\pjm\\{fileName}";
File.WriteAllText(path, bidsXml);
Problem is, serialization to XML seems to introduce a CR/LF (NewLine):
It's easier to see in the XML file:
A workaround is to strip out the "last" character:
bidsXml = bidsXml.Substring(0,bidsXml.Length - 1);
But it would be better to understand the root cause and resolve it without a workaround. Any idea why a NewLine character is being appended to the XML string?
** EDIT **
I was able to attempt a load into the consumer application (prior to this attempt I used an API to import the XML), and I received a more telling message:
The file you are loading is a binary file, the contents can not be displayed here.
So I suspect an unprintable character is somehow getting embedded into the file/XML. When I open the file in Notepad++, I see the following (UTF-8 Byte Order Mark), so at least I have something to go on:

So it seems the consumer of my XML does not want a BOM (Byte Order Mark) in the stream.
After reading the article "UTF-8 BOM adventures in C#", I updated my code to use new UTF8Encoding(false) rather than Encoding.UTF8:
var utf8NoBOM = new UTF8Encoding(false);
var bidsXml = string.Empty;
var emptyNamespaces = new XmlSerializerNamespaces(new[] { XmlQualifiedName.Empty });
var settings = new XmlWriterSettings();
settings.Indent = true;
settings.OmitXmlDeclaration = true;
activity = $"Serialize Class INFO to XML to string";
using (MemoryStream stream = new MemoryStream())
using (StreamWriter writer = new StreamWriter(stream, utf8NoBOM))
{
    XmlSerializer xml = new XmlSerializer(info.GetType());
    xml.Serialize(writer, info, emptyNamespaces);
    bidsXml = utf8NoBOM.GetString(stream.ToArray());
}
var fileName = $"CostOffer_Testing_{DateTime.Now:yyyy.MM.dd_HH.mm.ss}.xml";
var path = $"c:\\temp\\pjm\\{fileName}";
File.WriteAllText(path, bidsXml, utf8NoBOM);
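To make the cause visible, here is a minimal sketch (the class and method names BomDemo, WriteText and StartsWithUtf8Bom are made up for illustration, not part of the code above). It writes the same text through a StreamWriter with each encoding and checks whether the output starts with the UTF-8 BOM bytes EF BB BF. Encoding.UTF8 emits the BOM, and because Encoding.UTF8.GetString does not strip it, the character U+FEFF ends up at the start of the decoded string:

```csharp
using System;
using System.IO;
using System.Text;

public static class BomDemo
{
    // Writes text through a StreamWriter into memory and returns the raw bytes.
    public static byte[] WriteText(string text, Encoding encoding)
    {
        using (var stream = new MemoryStream())
        {
            using (var writer = new StreamWriter(stream, encoding))
            {
                writer.Write(text);
            } // disposing the writer flushes it into the stream

            // ToArray still works after the stream has been closed.
            return stream.ToArray();
        }
    }

    // True when the bytes begin with the UTF-8 byte order mark (EF BB BF).
    public static bool StartsWithUtf8Bom(byte[] bytes)
    {
        return bytes.Length >= 3
            && bytes[0] == 0xEF
            && bytes[1] == 0xBB
            && bytes[2] == 0xBF;
    }
}
```

Calling BomDemo.WriteText("<a/>", Encoding.UTF8) should produce bytes that start with the BOM, while BomDemo.WriteText("<a/>", new UTF8Encoding(false)) should produce only the bytes of the text itself.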

Related

How not to erase data from a file during serialization?

Good evening,
I am currently building an application in C# and had to resort to serialization using XmlSerializer.
I needed to save a list to an XML file and later read the file back to recover the list. I managed to do all this, and here is my code:
To save the list:
Stream stream = File.OpenWrite(chemin);
XmlSerializer xmlSer = new XmlSerializer(typeof(List<Utilisateur>));
xmlSer.Serialize(stream,listeUtilisateurs);
stream.Close();
To retrieve the list:
Stream stream = File.OpenRead(chemin);
XmlSerializer xmlSer = new XmlSerializer(typeof(List<Utilisateur>));
List<Utilisateur> listrecuperee = (List<Utilisateur>)xmlSer.Deserialize(stream);
listeUtilisateurs = listrecuperee;
stream.Close();
However, every time I save the list, the data that was previously in the file is lost. I want to keep it: new data should be written after the existing content of the file. Do you have a solution, please? Regards.
Yes, we can do what you want.
Code to serialize to the same file:
var xmlSer = new XmlSerializer(typeof(List<Utilisateur>));
var xmlWriterSettings = new XmlWriterSettings
{
    OmitXmlDeclaration = true,
    Indent = true // needed only for pretty-printing
};
bool append = File.Exists("test.txt");
using (var streamWriter = new StreamWriter("test.txt", append))
using (var xmlWriter = XmlWriter.Create(streamWriter, xmlWriterSettings))
{
    xmlSer.Serialize(xmlWriter, listeUtilisateurs);
    streamWriter.WriteLine(); // needed only for pretty-printing
}
Code for reading a set of data:
int orderNumber = 2; // for example, read the second data collection in order
var xmlReaderSettings = new XmlReaderSettings
{
    ConformanceLevel = ConformanceLevel.Fragment
};
using (var reader = XmlReader.Create("test.txt", xmlReaderSettings))
{
    int count = 0;
    while (reader.ReadToFollowing("ArrayOfUtilisateur"))
    {
        count++;
        if (count == orderNumber)
            break;
    }
    var list = (List<Utilisateur>)xmlSer.Deserialize(reader);
}
Of course, you need to add the necessary checks when reading.
This way of using a single file is pretty ugly, though it works. Therefore, consider other options for saving your data.
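One such option, sketched below under assumptions, is to keep a single well-formed document: load the existing list, add the new items, and rewrite the whole file. The Utilisateur class with a Nom property and the UtilisateurStore helper are hypothetical stand-ins, since the asker's real class is not shown. This is simple but rereads the file on every save, so it suits small files only.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical stand-in for the asker's class; the real one is not shown.
public class Utilisateur
{
    public string Nom { get; set; }
}

public static class UtilisateurStore
{
    private static readonly XmlSerializer Serializer =
        new XmlSerializer(typeof(List<Utilisateur>));

    // Returns the saved list, or an empty list when the file does not exist yet.
    public static List<Utilisateur> Load(string path)
    {
        if (!File.Exists(path))
            return new List<Utilisateur>();

        using (var stream = File.OpenRead(path))
        {
            return (List<Utilisateur>)Serializer.Deserialize(stream);
        }
    }

    // Reads what is already stored, adds the new items, and rewrites the file.
    public static void Append(string path, IEnumerable<Utilisateur> nouveaux)
    {
        var all = Load(path);
        all.AddRange(nouveaux);

        // File.Create truncates, so no stale bytes from the old content remain.
        using (var stream = File.Create(path))
        {
            Serializer.Serialize(stream, all);
        }
    }
}
```

With this approach the file always contains one ArrayOfUtilisateur root, so it can be read back with a plain Deserialize call and no fragment handling.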

Save XML file without formatting

I have an XML file that needs to be saved without formatting, without indentation and line breaks. I'm doing it this way:
using (var writer = System.IO.File.CreateText("E:\\nfse.xml"))
{
    var doc = new XmlDocument { PreserveWhitespace = false };
    doc.Load("E:\\notafinal.xml");
    writer.WriteLine(doc.InnerXml);
    writer.Flush();
}
But that way I need to create the file, and then I need to change it 3 times, so in the end there are a total of 4 files, the initial one and the result of the 3 changes.
When I save the file, I do it this way:
MemoryStream stream = stringToStream(soapEnvelope);
webRequest.ContentLength = stream.Length;
Stream requestStream = webRequest.GetRequestStream();
stream.WriteTo(requestStream);
document.LoadXml(soapEnvelope);
document.PreserveWhitespace = false;
document.Save(@"E:\notafinal.xml");
How can I do this without having to create a new document?
If what you want is to eliminate extra space by not formatting the XML file, you could use XmlWriterSettings and XmlWriter, like this:
public void SaveXmlDocToFile(XmlDocument xmlDoc,
                             string outputFileName,
                             bool formatXmlFile = false)
{
    var settings = new XmlWriterSettings();
    if (formatXmlFile)
    {
        settings.Indent = true;
    }
    else
    {
        settings.Indent = false;
        settings.NewLineChars = String.Empty;
    }
    using (var writer = XmlWriter.Create(outputFileName, settings))
        xmlDoc.Save(writer);
}
Passing formatXmlFile = false in the parameters will save the XML file without formatting it.
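If the goal is also to avoid the intermediate file, the same settings can be applied entirely in memory. The sketch below is an assumption about what the asker needs, and XmlMinifier.Minify is a made-up helper name: it loads the XML string with PreserveWhitespace = false, which drops whitespace-only nodes between elements, and writes it back without indentation. Note that it also omits the XML declaration, which may or may not be acceptable to the receiving service.

```csharp
using System.Text;
using System.Xml;

public static class XmlMinifier
{
    // Returns the XML on a single line, without indentation or declaration.
    public static string Minify(string xml)
    {
        // PreserveWhitespace = false discards insignificant whitespace on load.
        var doc = new XmlDocument { PreserveWhitespace = false };
        doc.LoadXml(xml);

        var settings = new XmlWriterSettings
        {
            Indent = false,
            OmitXmlDeclaration = true
        };

        var builder = new StringBuilder();
        using (var writer = XmlWriter.Create(builder, settings))
        {
            doc.Save(writer);
        }
        return builder.ToString();
    }
}
```

The returned string could then be used directly as the SOAP envelope, or written once to the final file.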

XMLSerializer - issue with UTF-8 vs UTF-16 Code

I am trying to serialize a simple object (5 string properties) into XML to save to a DB Image field. Then I need to DeSerialize it back into a string later in the program.
However, I am getting some errors. They seem to be caused by the XML being saved with a UTF-16 encoding declaration, while the data I load back from the DB is decoded as a UTF-8 string.
The error I get is
InnerException {"There is no Unicode byte order mark. Cannot switch to Unicode."} System.Exception {System.Xml.XmlException}
-- Message "There is an error in XML document (0, 0)." string
Is this happening because of the two different ways I save and load the string to/from the DB? On the save I am using a StringBuilder - but on the load from DB I am using just a String.
Thoughts?
Serialize and Save to DB
// Now Save the OBject XML to the Query Tables
var serializer = new XmlSerializer(ExportConfig.GetType());
StringBuilder StringResult = new StringBuilder();
using (var writer = XmlWriter.Create(StringResult))
{
    serializer.Serialize(writer, ExportConfig);
}
//MessageBox.Show("XML : " + StringResult);
// Now Save to the Query
try
{
    string UpdateSQL = "Update ZQryRpt "
        + " Set ExportConfig = " + TAGlobal.QuotedStr(StringResult.ToString())
        + " where QryId = " + TAGlobal.QuotedStr(((DataRowView)bindingSource_zQryRpt.Current).Row["QryID"].ToString())
        ;
    ExecNonSelectSQL(UpdateSQL, uniConnection_Config);
}
catch (Exception Error)
{
    MessageBox.Show("Error Setting ExportConfig: " + Error.Message);
}
Load from DB And Deserialize
byte[] binaryData = (byte[])((DataRowView)bindingSource_zQryRpt.Current).Row["ExportConfig"];
string XMLStored = System.Text.Encoding.UTF8.GetString(binaryData, 0, binaryData.Length);
if (XMLStored.Length > 0)
{
    IIDExportObject ExportConfig = new IIDExportObject();
    var serializer = new XmlSerializer(ExportConfig.GetType());
    //StringBuilder StringResult = new StringBuilder(XMLStored);
    // Load the XML from the Query into the StringBuilder
    // Now we need to build a Stream from the String to use in the XMLReader
    byte[] byteArray = Encoding.UTF8.GetBytes(XMLStored);
    MemoryStream stream = new MemoryStream(byteArray);
    using (var reader = XmlReader.Create(stream))
    {
        ExportConfig = (IIDExportObject)serializer.Deserialize(reader);
    }
}
John - thank you very much for the comment! It allowed me to complete the code and find a solution.
As you noted - using a stream reader was the solution - but I could not read the first line because there was only one 'line' in my string. However, I could use the line
using (StreamReader sr = new StreamReader(stream, false))
This reads the stream with byte order mark detection (the second constructor argument) set to false, so the StreamReader decodes the bytes as UTF-8 without looking for a BOM.
string XMLStored = MainFormRef.GetExportConfigForCurrentQuery();
if (XMLStored.Length > 0)
{
    IIDExportObject ExportConfig = new IIDExportObject();
    try
    {
        var serializer = new XmlSerializer(ExportConfig.GetType());
        // Now we need to build a Stream from the String to use in the XMLReader
        byte[] byteArray = Encoding.UTF8.GetBytes(XMLStored);
        MemoryStream stream = new MemoryStream(byteArray);
        // Now we need to use a StreamReader to get around UTF8 vs UTF16 issues
        // A little cumbersome - but it works
        using (StreamReader sr = new StreamReader(stream, false))
        {
            using (var reader = XmlReader.Create(sr))
            {
                ExportConfig = (IIDExportObject)serializer.Deserialize(reader);
            }
        }
    }
    catch
    {
        // Deserialization errors are silently ignored here
    }
}
I am not sure this is the best solution - but it works. I will be curious to see if anyone else has a better way of dealing with this.
Thanks to G Bradley, I took his answer and generalized it a bit to make it a bit easier to call.
public static string SerializeToXmlString<T>(T objectToSerialize)
{
    XmlSerializer serializer = new XmlSerializer(typeof(T));
    XmlWriterSettings settings = new XmlWriterSettings();
    settings.Indent = false;
    settings.Encoding = Encoding.UTF8;
    StringBuilder builder = new StringBuilder();
    using (XmlWriter writer = XmlWriter.Create(builder, settings))
    {
        serializer.Serialize(writer, objectToSerialize);
    }
    return builder.ToString();
}
public static T DeserializeFromXmlString<T>(string xmlString)
{
    if (string.IsNullOrWhiteSpace(xmlString))
        return default;
    var serializer = new XmlSerializer(typeof(T));
    byte[] byteArray = Encoding.UTF8.GetBytes(xmlString);
    MemoryStream stream = new MemoryStream(byteArray);
    using (StreamReader sr = new StreamReader(stream, false))
    {
        using (var reader = XmlReader.Create(sr))
        {
            return (T)serializer.Deserialize(reader);
        }
    }
}
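One caveat with the helper above: when an XmlWriter is created over a StringBuilder, the Encoding setting is not applied, so the XML declaration still says utf-16. A commonly used alternative, sketched here with made-up names (Utf8StringWriter, XmlStringHelper), is a StringWriter subclass that reports UTF-8 as its encoding, so the declaration matches how the text is later stored. Reading back through a StringReader avoids the byte-level encoding check altogether. Treat this as one possible approach, not the definitive fix for the DB round trip.

```csharp
using System.IO;
using System.Text;
using System.Xml.Serialization;

// StringWriter that reports UTF-8, so the XML declaration says encoding="utf-8".
public sealed class Utf8StringWriter : StringWriter
{
    public override Encoding Encoding => Encoding.UTF8;
}

public static class XmlStringHelper
{
    // Serializes a value to an XML string whose declaration names UTF-8.
    public static string Serialize<T>(T value)
    {
        var serializer = new XmlSerializer(typeof(T));
        using (var writer = new Utf8StringWriter())
        {
            serializer.Serialize(writer, value);
            return writer.ToString();
        }
    }

    // Deserializes from a string; the text is already decoded, so no
    // byte order mark or encoding switch is involved.
    public static T Deserialize<T>(string xml)
    {
        var serializer = new XmlSerializer(typeof(T));
        using (var reader = new StringReader(xml))
        {
            return (T)serializer.Deserialize(reader);
        }
    }
}
```

For example, XmlStringHelper.Serialize(42) should yield a document declaring utf-8, and passing that string to XmlStringHelper.Deserialize<int> should return the original value.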

YamlDocument to text representation ends with 3 dots

When I do :
var root = new YamlMappingNode();
var doc = new YamlDocument(root);
root.Add("one", "two");
var stream = new YamlStream(doc);
var buffer = new StringBuilder();
using (var writer = new StringWriter(buffer))
{
    stream.Save(writer, false);
    var t = buffer.ToString();
}
I get :
one: two
...
Why are there 3 dots at the end of the file?
YamlStream is meant for streaming multiple YAML documents down a single stream, so it emits the YAML markers for document start (---) and document end (...). If you're only serializing a single document, you probably don't want this.
Instead, use Serializer to write a node to a StreamWriter (backed by a (File)Stream):
var serializer = new Serializer(); //YamlDotNet.Serialization.Serializer
using (var fs = File.OpenWrite("some/path.yaml"))
using (var sw = new StreamWriter(fs))
{
    serializer.Serialize(sw, doc.RootNode);
}

How do I save a JSON file with four spaces indentation using JSON.NET?

I need to read a JSON configuration file, modify a value and then save the modified JSON back to the file again. The JSON is as simple as it gets:
{
    "test": "init",
    "revision": 0
}
To load the data and modify the value I do this:
var config = JObject.Parse(File.ReadAllText("config.json"));
config["revision"] = 1;
So far so good; now, to write the JSON back to the file. First I tried this:
File.WriteAllText("config.json", config.ToString(Formatting.Indented));
Which writes the file correctly, but the indentation is only two spaces.
{
  "test": "init",
  "revision": 1
}
From the documentation, it looks like there's no way to pass any other options in using this method, so I tried modifying this example which would allow me to directly set the Indentation and IndentChar properties of the JsonTextWriter to specify the amount of indentation:
using (FileStream fs = File.Open("config.json", FileMode.OpenOrCreate))
{
    using (StreamWriter sw = new StreamWriter(fs))
    {
        using (JsonTextWriter jw = new JsonTextWriter(sw))
        {
            jw.Formatting = Formatting.Indented;
            jw.IndentChar = ' ';
            jw.Indentation = 4;
            jw.WriteRaw(config.ToString());
        }
    }
}
But that doesn't seem to have any effect: the file is still written with two space indentation. What am I doing wrong?
The problem is that you are using config.ToString(), so the object is already serialised into a string and formatted when you write it using the JsonTextWriter.
Use a serialiser to serialise the object to the writer instead:
JsonSerializer serializer = new JsonSerializer();
serializer.Serialize(jw, config);
I ran into the same issue and found out that WriteRaw is not affected by the indentation settings. However, you can solve the issue using WriteTo on the JObject:
using (FileStream fs = File.Open("config.json", FileMode.OpenOrCreate))
{
    using (StreamWriter sw = new StreamWriter(fs))
    {
        using (JsonTextWriter jw = new JsonTextWriter(sw))
        {
            jw.Formatting = Formatting.Indented;
            jw.IndentChar = ' ';
            jw.Indentation = 4;
            config.WriteTo(jw);
        }
    }
}
Maybe try to feed a tab character to the IndentChar?
...
jw.IndentChar = '\t';
...
According to the documentation, it should use the tab character to indent the JSON instead of the space character.
http://james.newtonking.com/json/help/index.html?topic=html/T_Newtonsoft_Json_Formatting.htm
Here is the complete code, based on @Guffa's answer.
I changed FileMode.OpenOrCreate to FileMode.Create. Otherwise the file will not be shortened when the new contents are smaller.
var config = JObject.Parse(File.ReadAllText("config.json"));
config["revision"] = 1;
using (FileStream fs = File.Open("config.json", FileMode.Create))
{
    using (StreamWriter sw = new StreamWriter(fs))
    {
        using (JsonTextWriter jw = new JsonTextWriter(sw))
        {
            jw.Formatting = Formatting.Indented;
            jw.IndentChar = ' ';
            jw.Indentation = 4;
            JsonSerializer serializer = new JsonSerializer();
            serializer.Serialize(jw, config);
        }
    }
}
If you don't have a class to serialize/deserialize your JSON object, you can use JObject from Newtonsoft.Json.Linq:
var buildText = JObject.Parse(buildInfo.ToString());
File.WriteAllText(infoPath, buildText.ToString());
As a result you will have a fully formatted .json file.
Here is an extension method that makes JSON easier for humans to read.
It removes the quotes around property names, converts enums to strings, and indents the JSON with 4 spaces.
public static string ToPrettyJson(this object obj)
{
    var json = JsonConvert.SerializeObject(obj, Formatting.Indented, new StringEnumConverter());
    json = Regex.Replace(json, @"^([\s]+)""([^""]+)"": ", "$1$2: ", RegexOptions.Multiline); // no quotes in props
    json = Regex.Replace(json, @"^[ ]+", m => new String(' ', m.Value.Length * 2), RegexOptions.Multiline); // more indent spaces
    return json;
}
