XmllSerializer.Serialize and returning raw xml - c#

I am using an XmlSerializer to serialize an object to xml. After the object gets serialized, I end up with something like...
<?xml version="1.0" encoding="utf-8"?>
<xml xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
</xml>
I need to have it return simply...
<xml></xml>
Is there a way to serialize xml without the extra information? I realize strictly proper xml requires these additional elements, but I need the simpler form as I am appending these xml strings together to form a larger xml blob.
UPDATE
I was able to get the following code to work...
public static string Serialize(object o)
{
XmlWriterSettings xws = new XmlWriterSettings();
xws.OmitXmlDeclaration = true;
xws.Indent = true;
XmlSerializerNamespaces ns = new XmlSerializerNamespaces();
ns.Add(String.Empty, String.Empty);
StringBuilder sb = new StringBuilder();
XmlWriter xmlw = XmlWriter.Create(sb, xws);
XmlSerializer serializer = new XmlSerializer(o.GetType());
serializer.Serialize(xmlw, o, ns);
xmlw.Flush();
return sb.ToString();
}

In C#, you can do something like this:
XmlSerializer serializer = new XmlSerializer(typeof(object));
StringWriter stringWriter = new StringWriter();
using (XmlWriter writer = XmlWriter.Create(stringWriter, new XmlWriterSettings() { OmitXmlDeclaration = true }))
{
serializer.Serialize(writer, this, new XmlSerializerNamespaces() { "",""});
}
string xmlText = stringWriter.ToString();
Explanation:
OmitXmlDeclaration = true makes it remove the declaration.
new XmlSerializerNamespaces() { "",""} removes the namespaces.

Related

How to serialize an UTF 8 xml string

I had this code to create an xml. It used to work as expected, but in the first xml row, the encoding was set as utf-16.
The code was:
public static string SerializeObject<T>(this T toSerialize)
{
XmlSerializer xmlSerializer = new XmlSerializer(toSerialize.GetType());
using (StringWriter textWriter = new StringWriter())
{
xmlSerializer.Serialize(textWriter, toSerialize);
return textWriter.ToString();
}
}
Since I need to encode it in UTF-8, I edit it as follows:
public static string SerializeObject<T>(this T toSerialize)
{
XmlSerializer xmlSerializer = new XmlSerializer(toSerialize.GetType());
XmlWriterSettings settings = new XmlWriterSettings()
{
Encoding = new UTF8Encoding(),
Indent = false,
OmitXmlDeclaration = false
};
using (StringWriter textWriter = new StringWriter())
{
using (XmlWriter xmlWriter = XmlWriter.Create(textWriter, settings))
{
xmlSerializer.Serialize(xmlWriter, toSerialize);
}
return textWriter.ToString();
}
}
The code seems good, but the generated xml has still the row
<?xml version="1.0" encoding="utf-16"?>
and not
<?xml version="1.0" encoding="utf-8"?>
Why? How can I create an utf-8 xml?

XML Deserialization encoding issue

I already searched a lot and unable to find a solution and unable to determine the correct approach
I am serializing an object to xml string and deserializing it back to an object using c#. XML string after serialization adds a leading ?. When I dezerialize it back to the object I am getting an error There is an error in XML document (1, 1)
?<?xml version="1.0" encoding="utf-16"?>
Serialization code:
string xmlString = null;
MemoryStream memoryStream = new MemoryStream();
XmlSerializer xs = new XmlSerializer(typeof(T));
XmlSerializerNamespaces ns = new XmlSerializerNamespaces();
ns.Add("abc", "http://example.com/abc/");
XmlTextWriter xmlTextWriter = new XmlTextWriter(memoryStream,Encoding.Unicode);
xs.Serialize(xmlTextWriter, obj, ns);
memoryStream = (MemoryStream)xmlTextWriter.BaseStream;
xmlString = ConvertByteArrayToString(memoryStream.ToArray());
ConvertByteArrayToString:
UnicodeEncoding encoding = new UnicodeEncoding();
string constructedString = encoding.GetString(characters);
Deserialization Code:
XmlSerializer ser = new XmlSerializer(typeof(T));
StringReader stringReader = new StringReader(xml);
XmlTextReader xmlReader = new XmlTextReader(stringReader);
object obj = ser.Deserialize(xmlReader);
xmlReader.Close();
stringReader.Close();
return (T)obj;
I would like to know what I am doing wrong with encoding and I need a solution that works for most cases. Thanks
Use following function for serialization and Deserialization
public static string Serialize<T>(T dataToSerialize)
{
try
{
var stringwriter = new System.IO.StringWriter();
var serializer = new XmlSerializer(typeof(T));
serializer.Serialize(stringwriter, dataToSerialize);
return stringwriter.ToString();
}
catch
{
throw;
}
}
public static T Deserialize<T>(string xmlText)
{
try
{
var stringReader = new System.IO.StringReader(xmlText);
var serializer = new XmlSerializer(typeof(T));
return (T)serializer.Deserialize(stringReader);
}
catch
{
throw;
}
}
Your serialized XML contains a Unicode byte-order mark in the beginning, and this is where the deserializer fails.
To remove the BOM you need to create a different version of encoding suppressing BOM instead of using default Encoding.Unicode:
new XmlTextWriter(memoryStream, new UnicodeEncoding(false, false))
Here the second false prevents BOM being prepended to the string.

remove encoding from xmlserializer

I am using the following code to create an xml document -
XmlSerializerNamespaces ns = new XmlSerializerNamespaces();
ns.Add("", "");
new XmlSerializer(typeof(docket)).Serialize(Console.Out, i, ns);
this works great in creating the xml file with no namespace attributes. i would like to also have no encoding attribute in the root element, but I cannot find a way to do it. Does anyone have any idea if this can be done?
Thanks
Old answer removed and update with new solution:
Assuming that it's ok to remove the xml declaration completly, because it makes not much sense without the encoding attribute:
XmlSerializerNamespaces ns = new XmlSerializerNamespaces(); ns.Add("", "");
using (XmlWriter writer = XmlWriter.Create(Console.Out, new XmlWriterSettings { OmitXmlDeclaration = true}))
{
new XmlSerializer(typeof (SomeType)).Serialize(writer, new SomeType(), ns);
}
To remove encoding from XML header pass TextWriter with null encoding to XmlSerializer:
MemoryStream ms = new MemoryStream();
XmlTextWriter w = new XmlTextWriter(ms, null);
s.Serialize(w, vs);
Explanation
XmlTextWriter uses encoding from TextWriter passed in constructor:
// XmlTextWriter constructor
public XmlTextWriter(TextWriter w) : this()
{
this.textWriter = w;
this.encoding = w.Encoding;
..
It uses this encoding when generating XML:
// Snippet from XmlTextWriter.StartDocument
if (this.encoding != null)
{
builder.Append(" encoding=");
...
string withEncoding;
using (System.IO.MemoryStream memory = new System.IO.MemoryStream()) {
using (System.IO.StreamWriter writer = new System.IO.StreamWriter(memory)) {
serializer.Serialize(writer, obj, null);
using (System.IO.StreamReader reader = new System.IO.StreamReader(memory)) {
memory.Position = 0;
withEncoding= reader.ReadToEnd();
}
}
}
string withOutEncoding= withEncoding.Replace("<?xml version=\"1.0\" encoding=\"utf-8\"?>", "");
Credit to this blog for helping me with my code
http://blog.dotnetclr.com/archive/2008/01/29/removing-declaration-and-namespaces-from-xml-serialization.aspx
here's my solution, same idea, but in VB.NET and a little clearer in my opinion.
Dim sw As StreamWriter = New, StreamWriter(req.GetRequestStream,System.Text.Encoding.ASCII)
Dim xSerializer As XmlSerializer = New XmlSerializer(GetType(T))
Dim nmsp As XmlSerializerNamespaces = New XmlSerializerNamespaces()
nmsp.Add("", "")
Dim xWriterSettings As XmlWriterSettings = New XmlWriterSettings()
xWriterSettings.OmitXmlDeclaration = True
Dim xmlWriter As XmlWriter = xmlWriter.Create(sw, xWriterSettings)
xSerializer.Serialize(xmlWriter, someObjectT, nmsp)

Json.NET - convert JSON to XML and remove XML version, encoding?

http://james.newtonking.com/projects/json/help/
How come when I use "DeserializeXmlNode" and my JSON gets converted to an XML document
then convert my XML document into a string like this
string strXML = "";
StringWriter writer = new StringWriter();
xmlDoc.Save(writer);
strXML = writer.ToString();
It includes
<?xml version="1.0" encoding="utf-16"?>
I did not add this, how do I remove it?
an XML without that line is not a valid XML file!
that line is called the XML Declaration
as an example, check out the OData XML from Netflix on Catalog Titles, can you see that first line?
http://odata.netflix.com/Catalog/Titles
Use XmlWriter with StringBuilder instead of StringWriter
var strXML = "";
var writer = new StringBuilder();
var settings = new System.Xml.XmlWriterSettings() { OmitXmlDeclaration = true};
var xmlWriter = System.Xml.XmlWriter.Create(strXML, settings);
xmlDoc.Save(xmlWriter);
strXML = writer.ToString();

Force XDocument to write to String with UTF-8 encoding

I want to be able to write XML to a String with the declaration and with UTF-8 encoding. This seems mighty tricky to accomplish.
I have read around a bit and tried some of the popular answers for this but the they all have issues. My current code correctly outputs as UTF-8 but does not maintain the original formatting of the XDocument (i.e. indents / whitespace)!
Can anyone offer some advice please?
XDocument xml = new XDocument(new XDeclaration("1.0", "utf-8", "yes"), xelementXML);
MemoryStream ms = new MemoryStream();
using (XmlWriter xw = new XmlTextWriter(ms, Encoding.UTF8))
{
xml.Save(xw);
xw.Flush();
StreamReader sr = new StreamReader(ms);
ms.Seek(0, SeekOrigin.Begin);
String xmlString = sr.ReadToEnd();
}
The XML requires the formatting to be identical to the way .ToString() would format it i.e.
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<root>
<node>blah</node>
</root>
What I'm currently seeing is
<?xml version="1.0" encoding="utf-8" standalone="yes"?><root><node>blah</node></root>
Update
I have managed to get this to work by adding XmlTextWriter settings... It seems VERY clunky though!
MemoryStream ms = new MemoryStream();
XmlWriterSettings settings = new XmlWriterSettings();
settings.Encoding = Encoding.UTF8;
settings.ConformanceLevel = ConformanceLevel.Document;
settings.Indent = true;
using (XmlWriter xw = XmlTextWriter.Create(ms, settings))
{
xml.Save(xw);
xw.Flush();
StreamReader sr = new StreamReader(ms);
ms.Seek(0, SeekOrigin.Begin);
String blah = sr.ReadToEnd();
}
Try this:
using System;
using System.IO;
using System.Text;
using System.Xml.Linq;
class Test
{
static void Main()
{
XDocument doc = XDocument.Load("test.xml",
LoadOptions.PreserveWhitespace);
doc.Declaration = new XDeclaration("1.0", "utf-8", null);
StringWriter writer = new Utf8StringWriter();
doc.Save(writer, SaveOptions.None);
Console.WriteLine(writer);
}
private class Utf8StringWriter : StringWriter
{
public override Encoding Encoding { get { return Encoding.UTF8; } }
}
}
Of course, you haven't shown us how you're building the document, which makes it hard to test... I've just tried with a hand-constructed XDocument and that contains the relevant whitespace too.
Try XmlWriterSettings:
XmlWriterSettings xws = new XmlWriterSettings();
xws.OmitXmlDeclaration = false;
xws.Indent = true;
And pass it on like
using (XmlWriter xw = XmlWriter.Create(sb, xws))
See also https://stackoverflow.com/a/3288376/1430535
return xdoc.Declaration.ToString() + Environment.NewLine + xdoc.ToString();

Categories