Why xmlreader can't read something that xmlwriter wrote? - c#

I have the simplest code in the world,
using (XmlWriter writer = XmlWriter.Create(stringWriter))
{
writer.WriteStartDocument();
writer.WriteStartElement("Board");
writer.WriteAttributeString("Rows", mRows.ToString());
writer.WriteAttributeString("Columns", mColumns.ToString());
writer.WriteEndElement();
writer.WriteEndDocument();
}
TextWriter writer1 = new StreamWriter(path);
writer1.Write(stringWriter.toString());
writer1.Close();
Then i write it to a txt file that looks like this:
<?xml version="1.0" encoding="utf-16"?>
<Board Rows="30" Columns="50">
</Board>
Then i do the following :
FileStream str = new FileStream(s.FileName, FileMode.Open);
using(XmlReader reader = XmlReader.Create(stream))
{
reader.Read();
}
And it throws an exception :
"There is no Unicode byte order mark. Cannot switch to Unicode."
I googled the exception and found several workarounds, but i don't understand why i need a work around, i just want to read the xml i wrote.
Can some one please explain what exactly the problem is ?
Should i write something differently in the xml ?
What is the simplest solution to this ?

You're probably not writing to a unicode file which File.WriteAllText or a vanilla FileStream does not do.
Instead use File.OpenWrite or FileStream combined with the StreamWriter(Stream steam, Encoding encoding) constructor to specify unicode.
Sample:
var path = #"C:\Dev\sample.xml";
string xml;
var mRows = 30;
var mColumns = 50;
var options = new XmlWriterSettings { Indent = true };
using (var stringWriter = new StringWriter())
{
using (var writer = XmlWriter.Create(stringWriter, options))
{
writer.WriteStartDocument();
writer.WriteStartElement("Board");
writer.WriteAttributeString("Rows", mRows.ToString());
writer.WriteAttributeString("Columns", mColumns.ToString());
writer.WriteEndElement();
writer.WriteEndDocument();
}
xml = stringWriter.ToString();
}
if(File.Exists(path))
File.Delete(path);
using(var stream = File.OpenWrite(path))
using(var writer = new StreamWriter(stream, Encoding.Unicode))
{
writer.Write(xml);
}
Console.Write(xml);
using(var stream = File.OpenRead(path))
using(var reader = XmlReader.Create(stream))
{
reader.Read();
}
File.Delete(path);

Related

Writing to Filestream and copying to MemoryStream

I want to overwrite or create an xml file on disk, and return the xml from the function. I figured I could do this by copying from FileStream to MemoryStream. But I end up appending a new xml document to the same file, instead of creating a new file each time.
What am I doing wrong? If I remove the copying, everything works fine.
public static string CreateAndSave(IEnumerable<OrderPage> orderPages, string filePath)
{
if (orderPages == null || !orderPages.Any())
{
return string.Empty;
}
var xmlBuilder = new StringBuilder();
var writerSettings = new XmlWriterSettings
{
Indent = true,
Encoding = Encoding.GetEncoding("ISO-8859-1"),
CheckCharacters = false,
ConformanceLevel = ConformanceLevel.Document
};
using (var fs = new FileStream(filePath, FileMode.OpenOrCreate, FileAccess.ReadWrite))
{
try
{
XmlWriter xmlWriter = XmlWriter.Create(fs, writerSettings);
xmlWriter.WriteStartElement("PRINT_JOB");
WriteXmlAttribute(xmlWriter, "TYPE", "Order Confirmations");
foreach (var page in orderPages)
{
xmlWriter.WriteStartElement("PAGE");
WriteXmlAttribute(xmlWriter, "FORM_TYPE", page.OrderType);
var outBound = page.Orders.SingleOrDefault(x => x.FlightInfo.Direction == FlightDirection.Outbound);
var homeBound = page.Orders.SingleOrDefault(x => x.FlightInfo.Direction == FlightDirection.Homebound);
WriteXmlOrder(xmlWriter, outBound, page.ContailDetails, page.UserId, page.PrintType, FlightDirection.Outbound);
WriteXmlOrder(xmlWriter, homeBound, page.ContailDetails, page.UserId, page.PrintType, FlightDirection.Homebound);
xmlWriter.WriteEndElement();
}
xmlWriter.WriteFullEndElement();
MemoryStream destination = new MemoryStream();
fs.CopyTo(destination);
Log.Progress("Xml string length: {0}", destination.Length);
xmlBuilder.Append(Encoding.UTF8.GetString(destination.ToArray()));
destination.Flush();
destination.Close();
xmlWriter.Flush();
xmlWriter.Close();
}
catch (Exception ex)
{
Log.Warning(ex, "Unhandled exception occured during create of xml. {0}", ex.Message);
throw;
}
fs.Flush();
fs.Close();
}
return xmlBuilder.ToString();
}
Cheers
Jens
FileMode.OpenOrCreate is causing the file contents to be overwritten without shortening, leaving any 'trailing' data from previous runs. If FileMode.Create is used the file will be truncated first. However, to read back the contents you just wrote you will need to use Seek to reset the file pointer.
Also, flush the XmlWriter before copying from the underlying stream.
See also the question Simultaneous Read Write a file in C Sharp (3817477).
The following test program seems to do what you want (less your own logging and Order details).
using System;
using System.IO;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Threading.Tasks;
namespace ReadWriteTest
{
class Program
{
static void Main(string[] args)
{
string filePath = Path.Combine(
Environment.GetFolderPath(Environment.SpecialFolder.Personal),
"Test.xml");
string result = CreateAndSave(new string[] { "Hello", "World", "!" }, filePath);
Console.WriteLine("============== FIRST PASS ==============");
Console.WriteLine(result);
result = CreateAndSave(new string[] { "Hello", "World", "AGAIN", "!" }, filePath);
Console.WriteLine("============== SECOND PASS ==============");
Console.WriteLine(result);
Console.ReadLine();
}
public static string CreateAndSave(IEnumerable<string> orderPages, string filePath)
{
if (orderPages == null || !orderPages.Any())
{
return string.Empty;
}
var xmlBuilder = new StringBuilder();
var writerSettings = new XmlWriterSettings
{
Indent = true,
Encoding = Encoding.GetEncoding("ISO-8859-1"),
CheckCharacters = false,
ConformanceLevel = ConformanceLevel.Document
};
using (var fs = new FileStream(filePath, FileMode.Create, FileAccess.ReadWrite))
{
try
{
XmlWriter xmlWriter = XmlWriter.Create(fs, writerSettings);
xmlWriter.WriteStartElement("PRINT_JOB");
foreach (var page in orderPages)
{
xmlWriter.WriteElementString("PAGE", page);
}
xmlWriter.WriteFullEndElement();
xmlWriter.Flush(); // Flush from xmlWriter to fs
xmlWriter.Close();
fs.Seek(0, SeekOrigin.Begin); // Go back to read from the begining
MemoryStream destination = new MemoryStream();
fs.CopyTo(destination);
xmlBuilder.Append(Encoding.UTF8.GetString(destination.ToArray()));
destination.Flush();
destination.Close();
}
catch (Exception ex)
{
throw;
}
fs.Flush();
fs.Close();
}
return xmlBuilder.ToString();
}
}
}
For the optimizers out there, the StringBuilder was unnecessary because the string is formed whole and the MemoryStream can be avoided by just wrapping fs in a StreamReader. This would make the code as follows.
public static string CreateAndSave(IEnumerable<string> orderPages, string filePath)
{
if (orderPages == null || !orderPages.Any())
{
return string.Empty;
}
string result;
var writerSettings = new XmlWriterSettings
{
Indent = true,
Encoding = Encoding.GetEncoding("ISO-8859-1"),
CheckCharacters = false,
ConformanceLevel = ConformanceLevel.Document
};
using (var fs = new FileStream(filePath, FileMode.Create, FileAccess.ReadWrite))
{
try
{
XmlWriter xmlWriter = XmlWriter.Create(fs, writerSettings);
xmlWriter.WriteStartElement("PRINT_JOB");
foreach (var page in orderPages)
{
xmlWriter.WriteElementString("PAGE", page);
}
xmlWriter.WriteFullEndElement();
xmlWriter.Close(); // Flush from xmlWriter to fs
fs.Seek(0, SeekOrigin.Begin); // Go back to read from the begining
var reader = new StreamReader(fs, writerSettings.Encoding);
result = reader.ReadToEnd();
// reader.Close(); // This would just flush/close fs early(which would be OK)
}
catch (Exception ex)
{
throw;
}
}
return result;
}
I know I'm late, but there seems to be a simpler solution. You want your function to generate xml, write it to a file and return the generated xml. Apparently allocating a string cannot be avoided (because you want it to be returned), same for writing to a file. But reading from a file (as in your and SensorSmith's solutions) can easily be avoided by simply "swapping" the operations - generate xml string and write it to a file. Like this:
var output = new StringBuilder();
var writerSettings = new XmlWriterSettings { /* your settings ... */ };
using (var xmlWriter = XmlWriter.Create(output, writerSettings))
{
// Your xml generation code using the writer
// ...
// You don't need to flush the writer, it will be done automatically
}
// Here the output variable contains the xml, let's take it...
var xml = output.ToString();
// write it to a file...
File.WriteAllText(filePath, xml);
// and we are done :-)
return xml;
IMPORTANT UPDATE: It turns out that the XmlWriter.Create(StringBuider, XmlWriterSettings) overload ignores the Encoding from the settings and always uses "utf-16", so don't use this method if you need other encoding.

XSLT transformation for Word

I'm writing a web service in .NET C# that takes in an object, converts it to xml, applies an XSLT template, runs the transformation, and returns an MS work file.
Here is the code for the function:
public static HttpResponseMessage Transform(object data)
{
var response = new HttpResponseMessage(HttpStatusCode.OK);
StringWriter stringWriter = new StringWriter();
XmlWriter xmlWriter = XmlWriter.Create(stringWriter);
var applicationDirectory = AppDomain.CurrentDomain.BaseDirectory;
var xsltPath = applicationDirectory + #"\Reporting\Files\Template.xslt";
var templatePath = applicationDirectory + #"\Reporting\Files\Template.docx";
var xmlObject = new System.Xml.Serialization.XmlSerializer(data.GetType());
MemoryStream stream;
using (stream = new MemoryStream())
{
var sw = new StreamWriter(stream);
xmlObject.Serialize(stream, data);
stream.Position = 0;
XslCompiledTransform transform = new XslCompiledTransform();
transform.Load(xsltPath);
using (XmlReader xmlReader = XmlReader.Create(stream))
{
transform.Transform(xmlReader, xmlWriter);
XmlDocument newWordContent = new XmlDocument();
newWordContent.LoadXml(stringWriter.ToString());
var outputPath = applicationDirectory + #"\Reporting\Temp\temp.docx";
System.IO.File.Copy(templatePath, outputPath, true);
using (WordprocessingDocument output = WordprocessingDocument.Open(outputPath, true))
{
Body updatedBodyContent = new Body(newWordContent.DocumentElement.InnerXml);
output.MainDocumentPart.Document.Body = updatedBodyContent;
output.MainDocumentPart.Document.Save();
}
response.Content = new StreamContent(new FileStream(outputPath, FileMode.Open, FileAccess.Read));
response.Content.Headers.ContentDisposition = new System.Net.Http.Headers.ContentDispositionHeaderValue("attachment");
response.Content.Headers.ContentDisposition.FileName = outputPath;
}
}
return response;
}
When I make a request, it gives me a word file without the data.
I put a breakpoint at using (XmlReader xmlReader = XmlReader.Create(stream)).
After running that line, xmlReader has a value of {None}.
I'm also trying to avoid creating an XML file for efficiency(Hence MemoryStream).
Any idea why this isn't working? And is there a better way of accomplishing this?
Thanks,
Gerson
What happens if you change this:
Body updatedBodyContent = new Body(newWordContent.DocumentElement.InnerXml);
to this:
Body updatedBodyContent = new Body(newWordContent.InnerXml);
or this:
Body updatedBodyContent = new Body(newWordContent.DocumentElement.OuterXml);
The way you have it there would cause the outer element of the transformed XML to be omitted, and I doubt that's what you want (though can't say for sure because you've shown us neither the input XML or the XSLT.

Error reading XML file

Person person = GetPerson();
XmlSerializerNamespaces ns = new XmlSerializerNamespaces();
ns.Add(string.Empty, string.Empty);
XmlSerializer serializer = new XmlSerializer(typeof(Person));
string personText = string.Empty;
using (MemoryStream memoryStream = new MemoryStream())
{
using (XmlWriter xmlWriter = XmlWriter.Create(memoryStream, new XmlWriterSettings() { Encoding = Encoding.UTF8 }))
{
serializer.Serialize(xmlWriter, person, ns);
xmlWriter.Flush();
personText = Encoding.UTF8.GetString(memoryStream.ToArray());
}
}
string path = #"D:\person.xml";
// Write method 1:
File.WriteAllText(path, personText);
// Write method 2:
using (StreamWriter streamWriter = new StreamWriter(path, false , Encoding.UTF8))
{
streamWriter.Write(personText);
}
// Read the xml
using (FileStream fileStream = new FileStream(path, FileMode.Open))
{
return XDocument.Load(XmlReader.Create(fileStream));
}
When I read the xml after writing using method 2, I get this Data at the root level is invalid. Line 1, position 1. But it works fine using method 1.
What is causing this? Any pointers appreciated.
The problem is that both the StreamWriter and the XmlWriter are adding a byte-order-mark.
Options:
String the BOM from personText to start with
Pass new UTF8Encoding(false) instead of Encoding.UTF8 for the StreamWriter
Pass new UTF8Encoding(false) instead of Encoding.UTF8 for the XmlWriter
Avoid converting to text and back again in the first place: you've got the binary data in the MemoryStream, why not just dump that to disk?

What is the proper way to encrypt an XmlTextWriter and serialize it to a file?

I have an XmlTextWriter that gets written to file using an XmlSerializer that looks like the following:
using (XmlTextWriter writer = new XmlTextWriter(path, null))
{
writer.Formatting = Formatting.Indented;
writer.Indentation = 3;
MyFileObj.ourSerializer.Serialize(writer, xmlFile, ourXmlNamespaces);
}
where "ourSerializer" is just a reference to an System.Xml.Serialization.XmlSerializer object. However, I have an instance where this XML must be encrypted to disk so that the end user cannot read its contents, and I am unsure of the proper way to go about it using the existing code since there are many places where this code is called and does not need to be encrypted. Can anyone shed some insight into this for me?
An alternative way would be to use a CryptoStream, like this:
using (var fs = new FileStream(path, System.IO.FileMode.Create))
{
using (var cs = new CryptoStream(fs, _Provider.CreateEncryptor(), CryptoStreamMode.Write))
{
using (var writer = XmlWriter.Create(cs))
{
writer.Formatting = Formatting.Indented;
writer.Indentation = 3;
MyFileObj.ourSerializer.Serialize(writer, xmlFile, ourXmlNamespaces);
}
}
}
Where _Provider is an AesCryptoServiceProvider properly initialized.
Here is how I ended up solving the issue:
MemoryStream ms = new MemoryStream();
XmlSerializer ourSerializer.Serialize(ms, xmlFile, ourXmlNamespaces);
ms.Position = 0;
//Encrypt the memorystream
using (TextReader reader = new StreamReader(ms, Encoding.ASCII))
using (StreamWriter writer = new StreamWriter(path))
{
string towrite = Encrypt(reader.ReadToEnd());
writer.Write(towrite);
}
Basically serialized the XML to a MemoryStream, read the text back out into a TextReader, encrypted the TextReader contents and then saved the resulting encrypted string to a file.

Force XDocument to write to String with UTF-8 encoding

I want to be able to write XML to a String with the declaration and with UTF-8 encoding. This seems mighty tricky to accomplish.
I have read around a bit and tried some of the popular answers for this but the they all have issues. My current code correctly outputs as UTF-8 but does not maintain the original formatting of the XDocument (i.e. indents / whitespace)!
Can anyone offer some advice please?
XDocument xml = new XDocument(new XDeclaration("1.0", "utf-8", "yes"), xelementXML);
MemoryStream ms = new MemoryStream();
using (XmlWriter xw = new XmlTextWriter(ms, Encoding.UTF8))
{
xml.Save(xw);
xw.Flush();
StreamReader sr = new StreamReader(ms);
ms.Seek(0, SeekOrigin.Begin);
String xmlString = sr.ReadToEnd();
}
The XML requires the formatting to be identical to the way .ToString() would format it i.e.
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<root>
<node>blah</node>
</root>
What I'm currently seeing is
<?xml version="1.0" encoding="utf-8" standalone="yes"?><root><node>blah</node></root>
Update
I have managed to get this to work by adding XmlTextWriter settings... It seems VERY clunky though!
MemoryStream ms = new MemoryStream();
XmlWriterSettings settings = new XmlWriterSettings();
settings.Encoding = Encoding.UTF8;
settings.ConformanceLevel = ConformanceLevel.Document;
settings.Indent = true;
using (XmlWriter xw = XmlTextWriter.Create(ms, settings))
{
xml.Save(xw);
xw.Flush();
StreamReader sr = new StreamReader(ms);
ms.Seek(0, SeekOrigin.Begin);
String blah = sr.ReadToEnd();
}
Try this:
using System;
using System.IO;
using System.Text;
using System.Xml.Linq;
class Test
{
static void Main()
{
XDocument doc = XDocument.Load("test.xml",
LoadOptions.PreserveWhitespace);
doc.Declaration = new XDeclaration("1.0", "utf-8", null);
StringWriter writer = new Utf8StringWriter();
doc.Save(writer, SaveOptions.None);
Console.WriteLine(writer);
}
private class Utf8StringWriter : StringWriter
{
public override Encoding Encoding { get { return Encoding.UTF8; } }
}
}
Of course, you haven't shown us how you're building the document, which makes it hard to test... I've just tried with a hand-constructed XDocument and that contains the relevant whitespace too.
Try XmlWriterSettings:
XmlWriterSettings xws = new XmlWriterSettings();
xws.OmitXmlDeclaration = false;
xws.Indent = true;
And pass it on like
using (XmlWriter xw = XmlWriter.Create(sb, xws))
See also https://stackoverflow.com/a/3288376/1430535
return xdoc.Declaration.ToString() + Environment.NewLine + xdoc.ToString();

Categories