How to reduce size of xml file programmatically in C#

I have an XML file that is 110 KB.
I've uploaded it here
In Notepad++ I'm using the XML Tools plugin and its pretty print command (Ctrl+Alt+Shift+B) to arrange the code.
I also have another Notepad++ plugin, "TextFX": I select all the text (Ctrl+A) and use Unwrap Text.
After these actions I save my XML file and it is 100 KB (uploaded it here).
How can I do this programmatically in C#?
Thanks in advance!

Are you asking how to remove all insignificant whitespace from the XML document? Load it into an XmlDocument and read its OuterXml property. You will get the XML document on a single line:
var d = new Data();
var s = new XmlSerializer(d.GetType());
var sb = new StringBuilder();
var strStream = new StringWriter(sb);
s.Serialize(strStream, d);
Trace.WriteLine(sb.ToString());// formatted document
var xd = new XmlDocument();
xd.LoadXml(sb.ToString());
Trace.WriteLine(xd.OuterXml); // document without any surplus space character or linebreaks
Data is my custom class, shown below. It does not contain any XML serialization control attributes. You can use any class instead of it.
public class Data
{
public string BIC;
public string Addressee;
public string AccountHolder;
public string Name;
public string CityHeading;
public string NationalCode;
public bool MainBIC;
public string TypeOfChange;
public DateTime validFrom;
public DateTime validTill;
public int ParticipationType;
public string Title { get; set; }
}
First trace produces well-formatted XML
<?xml version="1.0" encoding="utf-16"?>
<Data xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<MainBIC>false</MainBIC>
<validFrom>0001-01-01T00:00:00</validFrom>
<validTill>0001-01-01T00:00:00</validTill>
<ParticipationType>0</ParticipationType>
</Data>
and second trace output is single line:
<?xml version="1.0" encoding="utf-16"?><Data xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><MainBIC>false</MainBIC><validFrom>0001-01-01T00:00:00</validFrom><validTill>0001-01-01T00:00:00</validTill><ParticipationType>0</ParticipationType></Data>
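For an XML file that already exists on disk, the same idea can be sketched with LINQ to XML instead of XmlDocument. This is a minimal sketch, not the answerer's code: the class name XmlMinifier and both path parameters are assumptions made for illustration.

```csharp
using System.IO;
using System.Xml.Linq;

public static class XmlMinifier
{
    // Rewrites the XML file at sourcePath without indentation or line breaks
    // between elements and saves the result to targetPath.
    public static void Minify(string sourcePath, string targetPath)
    {
        // LoadOptions.None discards whitespace-only text between elements.
        XDocument doc = XDocument.Load(sourcePath, LoadOptions.None);

        // DisableFormatting stops the writer from re-indenting the output.
        doc.Save(targetPath, SaveOptions.DisableFormatting);
    }
}
```

A call such as XmlMinifier.Minify("input.xml", "output.xml") would then produce the compact file. Text inside elements is left untouched, so only the whitespace between tags is removed.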

Related

XmlSerializer removes data when serializing

I have a very annoying issue serializing a large XML object. I have defined a class FetchYearlyInvoiceStatusRequest:
[Serializable]
public class FetchYearlyInvoiceStatusRequest
{
[XmlArray("INVOICES")]
[XmlArrayItem("INVOICE", typeof(InvoiceRequest))]
public List<InvoiceRequest> Invoices;
}
[Serializable]
public class InvoiceRequest
{
[XmlElement("CONTRACTACCOUNT")]
public string ContractAccount;
[XmlElement("INVOICENUMBER")]
public string InvoiceNumber;
}
I have an instance of this class with around 700 items in the list. When I use this code to serialize this to XML:
XmlSerializer serializerRequest = new XmlSerializer(typeof(FetchYearlyInvoiceStatusRequest));
string invoices;
using (var sw = new StringWriter())
{
serializerRequest.Serialize(sw, odsrequest);
invoices = sw.ToString();
}
The invoices string now contains a long list of invoices, but the serializer (or the memorystream?) seems to cut out part of the middle. In the middle of the output string, the literal text is:
<INVOICE>
<CONTRACTACCOUNT>3006698362</CONTRACTACCOUNT>
<INVOICENUMBER>40523461958</INVOICENUMBER>
</INVOICE>
<INVOICE>
<CONTRACTACCOUNT>3006362096</CONTRACTACCOUNT>
<INVOICENUMBER>40028149026</INVOICENUMBE... <CONTRACTACCOUNT>3006362096</CONTRACTACCOUNT>
<INVOICENUMBER>55002448279</INVOICENUMBER>
</INVOICE>
<INVOICE>
<CONTRACTACCOUNT>3006362096</CONTRACTACCOUNT>
<INVOICENUMBER>42514938204</INVOICENUMBER>
</INVOICE>
This is simply corrupt. When I write to a file using almost the same code (but using a StreamWriter instead of a StringWriter):
XmlSerializer serializerRequest = new XmlSerializer(typeof(FetchYearlyInvoiceStatusRequest));
string invoices;
using (var sw = new StreamWriter("c:\\temp\\output.xml"))
{
serializerRequest.Serialize(sw, odsrequest);
}
The output in the file is perfect (snippet at the same location as above where it cuts off):
<INVOICE>
<CONTRACTACCOUNT>3006362096</CONTRACTACCOUNT>
<INVOICENUMBER>40028149026</INVOICENUMBER>
</INVOICE>
<INVOICE>
<CONTRACTACCOUNT>3007728722</CONTRACTACCOUNT>
<INVOICENUMBER>40027928855</INVOICENUMBER>
</INVOICE>
Can someone please tell me what is happening? I've googled for "stringwriter limitations", "xmlserializer limitations" and more. Nothing :(
Any help appreciated!

How to store and retrieve objects from Xml file

I have this class:
public class MyMenu
{
public string Name { get; set; }
public string Type { get; set; }
}
I want to use this class in a dynamic menu, and I do not want to store its data in a database.
I want to store its data in an XML file.
So far I have this for saving the data:
string path = Server.MapPath("~/Content/Files");
XmlSerializer serial = new XmlSerializer(model.GetType());
System.IO.StreamWriter writer = new System.IO.StreamWriter(path + "\\ribbonmenu");
serial.Serialize(writer, model);
writer.Close();
And this to get the data:
string path = Server.MapPath("~/Content/Files");
XmlSerializer serial = new XmlSerializer(typeof(RibbonMenu));
System.IO.StreamReader reader = new System.IO.StreamReader(path + "\\ribbonmenu");
RibbonMenu menu =(RibbonMenu) serial.Deserialize(reader);
reader.Close();
What I have works for saving and loading a single object.
I need to save multiple objects and get back a collection of objects, something like:
IEnumerable<MyMenu> model=(IEnumerable<MyMenu>) serial.Deserialize(reader);
Can someone give me a solution? Thanks.
Edit: The content of the generated Xml with my code is:
<?xml version="1.0" encoding="utf-8"?>
<MyMenu xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Id>0</Id>
<Menu>Home</Menu>
<Type>Button</Type>
</MyMenu>
When serializing you should initialize a collection like this:
var model = new List<MyMenu>()
{
new MyMenu() { Name = "Menu1", Type = "Ribbon" },
new MyMenu() { Name = "Menu2", Type = "Ribbon" },
};
This way, when you serialize you'd get something like this:
<?xml version="1.0" encoding="utf-8"?>
<ArrayOfMyMenu xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<MyMenu>
<Name>Menu1</Name>
<Type>Ribbon</Type>
</MyMenu>
<MyMenu>
<Name>Menu2</Name>
<Type>Ribbon</Type>
</MyMenu>
</ArrayOfMyMenu>
And you can get the objects back by using List<MyMenu> as the serializer's type:
XmlSerializer serial = new XmlSerializer(typeof(List<MyMenu>));
System.IO.StreamReader reader = new System.IO.StreamReader("ribbonmenu.xml");
var menu = (List<MyMenu>)serial.Deserialize(reader);
reader.Close();
Hope this helps.
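Putting both halves together, a save/load pair for the whole collection might look like the sketch below. The MenuStore class and its method names are assumptions for illustration; MyMenu is the class from the question.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

public class MyMenu
{
    public string Name { get; set; }
    public string Type { get; set; }
}

// Hypothetical helper that stores a list of MyMenu objects in one XML file.
public static class MenuStore
{
    private static readonly XmlSerializer Serializer =
        new XmlSerializer(typeof(List<MyMenu>));

    public static void Save(string path, List<MyMenu> menus)
    {
        // StreamWriter(path) creates the file or overwrites an existing one.
        using (var writer = new StreamWriter(path))
        {
            Serializer.Serialize(writer, menus);
        }
    }

    public static List<MyMenu> Load(string path)
    {
        // The using block closes the reader even if deserialization throws.
        using (var reader = new StreamReader(path))
        {
            return (List<MyMenu>)Serializer.Deserialize(reader);
        }
    }
}
```

The same serializer type must be used for both directions: a file written from a List<MyMenu> has an ArrayOfMyMenu root element and cannot be read back with a serializer created for a single MyMenu.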

Putting c# inside xml

I am using WinForms. I have an XML document that looks like this:
<?xml version="1.0" encoding="utf-8" ?>
<MarcusXMLFile xmlns:Responses="http://www.rewardstrike.com/XMLFile1.xml">
<response>
<greatmood>
<yes>
<replytocommand>
<answer>Yes.</answer>
<answer>Yes, sir.</answer>
<answer>Settings.Default.User</answer>
</replytocommand>
</yes>
</greatmood>
</response>
</MarcusXMLFile>
To read this xml document, I use:
private void Responses()
{
string query = String.Format("http://www.rewardstrike.com/XMLFile1.xml");
XmlDocument Responses = new XmlDocument();
Responses.Load(query);
XmlNode channel = Responses.SelectSingleNode("MarcusXMLFile");
if (QEvent == "yesreplytocommand")
{
XmlNodeList yesreplytocommand = Responses.SelectNodes("MarcusXMLFile/response/greatmood/yes/replytocommand/answer");
foreach (XmlNode ans in yesreplytocommand
.Cast<XmlNode>()
.OrderBy(elem => Guid.NewGuid()))
{
response = ans.InnerText;
}
}
}
and then to display:
QEvent = "yesreplytocommand";
Responses();
Console.WriteLine(response);
My problem is that when it reads Settings.Default.User and displays it, I want it to be treated as C# code so that it actually gets the value from the application. Right now it literally displays "Settings.Default.User". How do I do this?
First, you'll need a way to recognize which of your entries are literals and which are expressions. You could do it by adding an attribute to the XML node:
<?xml version="1.0" encoding="utf-8" ?>
<MarcusXMLFile xmlns:Responses="http://www.rewardstrike.com/XMLFile1.xml">
<response>
<greatmood>
<yes>
<replytocommand>
<answer>Yes.</answer>
<answer>Yes, sir.</answer>
<answer expression="true">DefaultSettings.User</answer>
</replytocommand>
</yes>
</greatmood>
</response>
</MarcusXMLFile>
Based on that you can modify your parsing code to either directly use the value from XML or evaluate it instead:
foreach (XmlNode ans in yesreplytocommand
.Cast<XmlNode>()
.OrderBy(elem => Guid.NewGuid()))
{
var attribute = ans.Attributes["expression"];
if (attribute != null && attribute.Value == "true")
{
Console.WriteLine(Evaluate(ans.InnerText));
}
else
{
Console.WriteLine(ans.InnerText);
}
}
That leaves the problem of evaluating the expression. There is no easy built-in way to do that in C#, but you could use Dynamic Expresso. This is what the Evaluate method could look like:
public string Evaluate(string expression)
{
var interpreter = new Interpreter();
interpreter.SetVariable("DefaultSettings", Settings.Default);
return interpreter.Eval<string>(expression);
}
As you can see, you'll still have to define the expression variables yourself. For the above to work, you will have to use DefaultSettings.User in your XML instead of Settings.Default.User. I already made that change in my sample XML at the beginning of the answer.
You should take a look at XML Serialization.
The basic idea is that it can convert a class or a struct like this (XmlSerializer only serializes public fields and properties, so the members must be public):
public class Foo
{
    public int bar = 0;
    public Vector2 obj = new Vector2(10, 50);
}
into this:
<?xml version="1.0" ?>
<Foo xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<bar>0</bar>
<obj>
<X>10</X>
<Y>50</Y>
</obj>
</Foo>
And the other way around.
The methods used to save and load the object look like this:
public static void Save(string filepath, Foo foobject)
{
    XmlSerializer serializer = new XmlSerializer(typeof(Foo));
    // File.Create truncates an existing file; File.OpenWrite would leave stale trailing bytes.
    using (Stream stream = File.Create(filepath))
    {
        serializer.Serialize(stream, foobject);
    }
}
public static Foo Load(string filepath)
{
    XmlSerializer serializer = new XmlSerializer(typeof(Foo));
    using (Stream stream = File.OpenRead(filepath))
    {
        // Return the deserialized object to the caller.
        return (Foo)serializer.Deserialize(stream);
    }
}
It converts XML to C# objects, and the other way around.
It cannot serialize methods, but it can handle most public fields, properties and classes.

XmlDocument in WCF Service method not successfully saving to file using the class' save method

Hello and thanks in advance,
I am attempting to take the input from text boxes in a Silverlight application and, on a button click event, convert it to an XML string, pass that string and a file name to a WCF service call, and have that call save the XML to the specified file (given as a string parameter). The code that captures the text into an XML string seems to work (based on what I see in the variables when debugging) and looks like this:
private void ServerInfoNext_Click(object sender, RoutedEventArgs e)
{
//new RegisterServerGroupObject instance
RegisterServerGroupObject groupInfo= new RegisterServerGroupObject(groupNameTB.Text,1,parentServerNameTB.Text,LeaderNameCB.SelectedItem.ToString());
var serializer = new XmlSerializer(typeof(RegisterServerGroupObject));
XmlSerializerNamespaces ns = new XmlSerializerNamespaces();
ns.Add("","");
XmlWriterSettings settings = new XmlWriterSettings();
settings.Encoding = Encoding.UTF8;
settings.Indent = true;
settings.CloseOutput = true;
StringBuilder sb = new StringBuilder();
using (XmlWriter writer = XmlWriter.Create(sb,settings))
{
serializer.Serialize(writer, groupInfo);
writer.Close();
}
//sb now contains the xml string with the information from the serialized class
string contentsString = sb.ToString();
//create instance of XmlWrite service
XMLWriteServiceClient xmlClient = new XMLWriteServiceClient();
xmlClient.WriteXmlToServerCompleted += new EventHandler<System.ComponentModel.AsyncCompletedEventArgs>(xmlClient_WriteXmlToServerCompleted);
xmlClient.WriteXmlToServerAsync("ServerGroups.xml", contentsString);
}
At this point, when contentsString is passed to the service method, I can see that it contains valid XML, and the same is true inside the service method itself, which looks like this:
public class XMLWriteService : IXMLWriteService
{
public void WriteXmlToServer(string filename,string xmlString)
{
XmlDocument xDoc = new XmlDocument();
xDoc.LoadXml(xmlString.ToString());
try
{
xDoc.Save(filename);
}
catch (FileNotFoundException e)
{
Console.WriteLine(e.InnerException.ToString());
}
}
}
The try/catch block does not report that the file ("ServerGroups.xml") is missing, and I currently have that XML file in the ClientBin folder of the server-side (.Web) part of the project. However, after the method finishes, no new XML has been written to the file. Can someone please tell me what I am doing wrong? I don't know why the XmlDocument instance is not saving its contents to the file. Thanks in advance!
You aren't passing a path, so the file is saved to the current directory of the WCF service process, whatever that happens to be. Either find out what that is, or search your whole server drive for that file name to see where it is being saved. Better yet, call Path.Combine to prepend a directory to the file name before you save. For instance:
xDoc.Save(Path.Combine("C:\\ClientBin", filename));
To answer your question in the comment below, if you want to append the incoming XML data to the data that is already stored in the XML file on the server, that's a bit more involved. It all depends what the format of the XML is. Since you are using serialization, which by default will only allow one object per XML document (because it puts the object name as the root document element, of which there can only be one), then you would have to have a different XML format. For instance, on the server side, you would need to have some kind of root element on the document under which you could keep appending the incoming RegisterServerGroupObject objects. For instance, if your XML file on the server looked like this:
<?xml version="1.0" encoding="utf-8" ?>
<ListOfRegisterServerGroupObject>
</ListOfRegisterServerGroupObject>
Then, you could append the data by inserting new elements within that root element, like this:
<?xml version="1.0" encoding="utf-8" ?>
<ListOfRegisterServerGroupObject>
<RegisterServerGroupObject>
...
</RegisterServerGroupObject>
<RegisterServerGroupObject>
...
</RegisterServerGroupObject>
...
</ListOfRegisterServerGroupObject>
To do this, you would need to first load the XML document, then get the root element, then append the incoming XML as a child element. For instance:
public void WriteXmlToServer(string filename, string xmlString)
{
    string filePath = Path.Combine("C:\\ClientBin", filename);
    XmlDocument storage = new XmlDocument();
    storage.Load(filePath);
    XmlDocument incoming = new XmlDocument();
    incoming.LoadXml(xmlString);
    // A node from another document must be imported before it can be appended.
    XmlNode imported = storage.ImportNode(incoming.DocumentElement, true);
    storage.DocumentElement.AppendChild(imported);
    storage.Save(filePath);
}
You may need to 'map' the physical path to the output file within the service:
string path = HostingEnvironment.MapPath("~/MyPath/MyFile.xml");

How to parse XML Content without additional newlines and tabulators in C#?

I'm using the following xml file to read the contents of a file that I need to write to:
<properties>
<url>http://www.leagueoflegends.com/service-status</url>
<content>host=beta.lol.riotgames.com
xmpp_server_url=chat.na.lol.riotgames.com
lobbyLandingURL=http://www.leagueoflegends.com/pvpnet_landing
ladderURL=http://www.leagueoflegends.com/ladders
storyPageURL=http://www.leagueoflegends.com/story
lq_uri=https://lq.na.lol.riotgames.com/login-queue/rest/queue</content>
</properties>
I have intentionally left the content element with only newlines and nothing else (instead of proper indentation).
However, when I read the content element into a string, several newlines and tabs appear at the beginning of the lines. The result I get when writing to a text file is the following:
<imaginary newline here>
host=beta.lol.riotgames.com
xmpp_server_url=chat.na.lol.riotgames.com
lobbyLandingURL=http://www.leagueoflegends.com/pvpnet_landing
ladderURL=http://www.leagueoflegends.com/ladders
storyPageURL=http://www.leagueoflegends.com/story
lq_uri=https://lq.na.lol.riotgames.com/login-queue/rest/queue
<imaginary newline here><additional tab here>
The issue here is that I need that text to simply start from the first line, have no tabs, no spaces, just the newlines.
The C# code behind for reading the content tag is:
XDocument root = XDocument.Parse(File.ReadAllText(file), LoadOptions.PreserveWhitespace);
PropertyFile PF = new PropertyFile();
PF.Content = (from Content in root.Descendants("content")
select (string)Content.Value).Single();
file is the path to the file, as I am able to read everything else.
Any ideas would be much appreciated.
Thanks in advance
EDIT:
PropertyFile class code ahead (content is just a string holding data for one file of a filetype I'm reading):
class PropertyFile
{
public Uri URI { get; set; }
public string Name { get; set; }
public string Path { get; set; }
public string Content { get; set; }
}
Desired output of the file is:
host=beta.lol.riotgames.com
xmpp_server_url=chat.na.lol.riotgames.com
lobbyLandingURL=http://www.leagueoflegends.com/pvpnet_landing
ladderURL=http://www.leagueoflegends.com/ladders
storyPageURL=http://www.leagueoflegends.com/story
lq_uri=https://lq.na.lol.riotgames.com/login-queue/rest/queue
By default, serializing an XDocument/XElement formats (i.e. indents) the XML fragment.
Try using:
root.Save("Root.xml", SaveOptions.DisableFormatting);
However, DOM operations themselves do not introduce insignificant spaces or tabs: what you write in an element is what you get.
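To make that point concrete, the sketch below parses the same small document with both LoadOptions values. The WhitespaceDemo class and its sample XML are assumptions made for illustration, not the asker's file.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class WhitespaceDemo
{
    // Hypothetical sample: tabs and newlines between elements,
    // plus a <content> element whose text spans two lines.
    public const string Sample =
        "<properties>\n\t<url>u</url>\n\t<content>a=1\nb=2</content>\n</properties>";

    // Counts the child nodes directly under the root element.
    public static int RootNodeCount(LoadOptions options)
    {
        return XDocument.Parse(Sample, options).Root.Nodes().Count();
    }

    // Returns the text of <content> exactly as the parser exposes it.
    public static string ContentValue(LoadOptions options)
    {
        return XDocument.Parse(Sample, options).Descendants("content").Single().Value;
    }
}
```

With LoadOptions.PreserveWhitespace the root has extra text nodes holding the indentation between elements, while LoadOptions.None drops them. The value of content is the same in both cases, so any tabs or leading newlines that end up in it must already be present inside the element in the file.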
Try setting the LoadOptions to None:
XDocument root = XDocument.Parse(File.ReadAllText(file), LoadOptions.None);
According to MSDN, that will ignore all insignificant whitespace.
I tried it (with just a string for your XML - I didn't bother loading it from a file) and got the following:
host=beta.lol.riotgames.com
xmpp_server_url=chat.na.lol.riotgames.com
lobbyLandingURL=http://www.leagueoflegends.com/pvpnet_landing
ladderURL=http://www.leagueoflegends.com/ladders
storyPageURL=http://www.leagueoflegends.com/story
lq_uri=https://lq.na.lol.riotgames.com/login-queue/rest/queue
Is that what you're looking for?
Edited To Add Code I Used
string xml = @"<properties>
<url>http://www.leagueoflegends.com/service-status</url>
<content>host=beta.lol.riotgames.com
xmpp_server_url=chat.na.lol.riotgames.com
lobbyLandingURL=http://www.leagueoflegends.com/pvpnet_landing
ladderURL=http://www.leagueoflegends.com/ladders
storyPageURL=http://www.leagueoflegends.com/story
lq_uri=https://lq.na.lol.riotgames.com/login-queue/rest/queue</content>
</properties>";
XDocument root = XDocument.Parse(xml, LoadOptions.None);
var content = (from Content in root.Descendants("content")
select (string)Content.Value).Single();
What do you get if you try ...?
using System.IO;
XDocument root = XDocument.Parse(File.ReadAllText(file),
LoadOptions.PreserveWhitespace);
string content = (from Content in root.Descendants("content")
select (string)Content.Value).Single();
File.WriteAllText("SomeTempFile.txt", content);
I suspect that this text file will be formatted as you expect. This would indicate a problem with PropertyFile.
