I am converting an object into an XML string and then into an escaped string.
public class Program
{
public static void Main(string[] args)
{
BankDetails details = new BankDetails();
var xmlstring = ToXmlString(details);
var escaped = SecurityElement.Escape(xmlstring);
}
private static string ToXmlString<T>(T input)
{
XmlSerializer xsSubmit = new XmlSerializer(typeof(T));
XmlSerializerNamespaces ns = new XmlSerializerNamespaces();
var xml = "";
ns.Add("", "");
using (var sww = new StringWriter())
{
using (XmlWriter writer = XmlWriter.Create(sww, new XmlWriterSettings()
{
OmitXmlDeclaration = true
}))
{
xsSubmit.Serialize(writer, input, ns);
xml = sww.ToString();
}
}
return xml;
}
}
public class BankDetails
{
public string MemberName = "B & A Auto";
}
How can I avoid getting &amp; in the xmlstring variable?
<BankDetails><MemberName>B &amp; A Auto</MemberName></BankDetails>
I am looking for output something like this:
xmlstring = //<BankDetails><MemberName>B & A Auto</MemberName></BankDetails>
//and then
escaped = //&lt;BankDetails&gt;&lt;MemberName&gt;B &amp; A Auto&lt;/MemberName&gt;&lt;/BankDetails&gt;
You can use the equivalent Unicode character reference, i.e. decimal or hex, &#38; or &#x26;, instead.
"B & A Auto" => "B &#38; A Auto";
You can parse your string, convert the ampersands to their Unicode equivalents, and then escape those.
No, you can not. The & is a special character in XML and used for escaping other characters.
Escaped characters in XML
' = &apos;
< = &lt;
> = &gt;
& = &amp;
" = &quot;
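To illustrate the point about escaping: the entity exists only in the serialized text, and any XML parser turns it back into a plain ampersand when the value is read. The following is a minimal sketch, assuming the XML shape from the question; the class and method names (EntityRoundTrip, ReadMemberName) are made up for illustration.

```csharp
using System;
using System.Xml.Linq;

public static class EntityRoundTrip
{
    // Parses the XML and returns the text content of <MemberName>.
    // The parser resolves &amp; back to a plain '&'.
    public static string ReadMemberName(string xml)
    {
        XElement root = XElement.Parse(xml);
        return root.Element("MemberName").Value;
    }

    public static void Main()
    {
        string xml = "<BankDetails><MemberName>B &amp; A Auto</MemberName></BankDetails>";
        Console.WriteLine(ReadMemberName(xml)); // B & A Auto
    }
}
```

So the &amp; in the serialized string is not a corruption of the data; it is the only way a literal ampersand can appear in well-formed XML text.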
Related
I am building an XML string from 3 parts:
the before bit which is hard-coded
the middle bit which is created from object serialization
the after bit which is also hardcoded.
No matter what I do, I cannot remove the escape characters from it though, and this is causing failed requests when I use that XML string in the request object that it is intended for.
Code (and my attempts at this) below:
string preceedingXml = "<soapenv:Envelope xmlns:soapenv=\"http://schemas.xmlsoap.org/soap/envelope/\" xmlns:enl=\"http://company.com\"><soapenv:Header/><soapenv:Body><enl:Combine>";
string afterXml = "</enl:Combine></soapenv:Body></soapenv:Envelope>";
try
{
using (var stringwriter = new System.IO.StringWriter())
{
var serializer = new XmlSerializer(this.GetType());
serializer.Serialize(stringwriter, this);
var requestXML = XDocument.Parse(stringwriter.ToString())
.Descendants("in")
.First();
string unescaped = Regex.Unescape(requestXML.ToString().Replace("\r\n", ""));
string returnValue = preceedingXml +unescaped + afterXml;
string rd = returnValue.Replace("\\'", "'");
string rd3 = returnValue.Replace(@"\", string.Empty);
string rd4 = RemoveSpecialChars(returnValue);
string rd5 = returnValue.Trim('"');
return returnValue;
}
}
The result is always the same no matter what:
<soapenv:Envelope
xmlns:soapenv=\"http://schemas.xmlsoap.org/soap/envelope/\"
xmlns:enl=\"http://company.com/product\">
<soapenv:Header/>
<soapenv:Body>
<enl:Combine>
<in
xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\"
xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\">
<returnAllRecords>0</returnAllRecords>
<projectName>Project</projectName>
<lhs>
<recordType>ParsedData</recordType>
<recordId>1000000</recordId>
<data>
<fields name=\"Full-Name\">
<values>John Smith</values>
etc.
I am passing XML data from a text box to a server, and the XML breaks on symbols like & < |. So I want to replace these symbols with their equivalent codes.
If I use the string.Replace function, it also replaces characters that an earlier replacement has just inserted:
.Replace("&", "&amp;")
.Replace("<", "&lt;")
.Replace("|", "&#124;")
.Replace("!", "&#33;")
.Replace("#", "&#35;")
This happens because each call goes through the complete string again.
So "|" first becomes "&#124;", and the later "#" replacement turns that into "&&#35;124;".
I also tried Dictionary method:
var replacements = new Dictionary<string, string>
{
{"&", "&amp;"},
{"<", "&lt;"},
{"|", "&#124;"},
{"!", "&#33;"},
{"#", "&#35;"}
};
var output = replacements.Aggregate(input, (current, replacement) => current.Replace(replacement.Key, replacement.Value));
return output;
But the same issue occurs here as well. I also tried the StringBuilder method, but it has the same repeated-replacement issue. Any advice?
You shouldn't be trying to escape characters manually. There are libraries and methods already built to do this, such as SecurityElement.Escape(). It escapes the characters that are invalid in XML into a known safe format that can be unescaped later.
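As a rough sketch of that suggestion: SecurityElement.Escape replaces <, >, ", ' and & in one call, so the text it produces is not escaped a second time. The wrapper class EscapeOnce is hypothetical and only there to make the snippet self-contained.

```csharp
using System;
using System.Security;

public static class EscapeOnce
{
    // Escapes the five characters that are special in XML (< > " ' &).
    // Characters such as '#', '|' and '!' are not special in XML text
    // and are left unchanged.
    public static string Escape(string input)
    {
        return SecurityElement.Escape(input);
    }

    public static void Main()
    {
        Console.WriteLine(Escape("&<#")); // &amp;&lt;#
    }
}
```

Note that '#', '|' and '!' do not need escaping in XML at all, which removes the source of the repeated-replacement problem in the question.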
I strongly advise using proper XML handling to build XML:
var id = 3;
var message = "&'<crazyMessage&&";
var xmlDoc = new XmlDocument();
using(var writer = xmlDoc.CreateNavigator().AppendChild())
{
writer.WriteStartElement("ROOT");
writer.WriteElementString("ID", id.ToString());
writer.WriteStartElement("INPUT");
writer.WriteElementString("ENGMSG", message);
writer.WriteEndElement(); // INPUT
writer.WriteEndElement(); // ROOT
}
var xmlString = xmlDoc.InnerXml;
Console.WriteLine(xmlString);
If you are using .NET 3.5 or higher, you can use LINQ to XML to build the XML, which is a bit cleaner:
var id = 3;
var message = "&'<crazyMessage&&";
var xml = new XElement("ROOT",
new XElement("ID", id),
new XElement("INPUT",
new XElement("ENGMSG", message)
)
);
var xmlString = xml.ToString();
Console.WriteLine(xmlString);
// Requires: using System.Collections.Generic; using System.Text;
public static string Transform(string input, Dictionary<string, string> replacements)
{
    // Each input character is visited exactly once, so inserted
    // replacement text is never scanned or replaced again.
    var finalString = new StringBuilder(input.Length);
    foreach (char c in input)
    {
        string replacement;
        if (replacements.TryGetValue(c.ToString(), out replacement))
        {
            finalString.Append(replacement);
        }
        else
        {
            finalString.Append(c);
        }
    }
    return finalString.ToString();
}
I have been banging my head on this one for a bit. It seems like it must be a simple solution but I have searched the internet and tried quite a few things.
I have a complex object which includes a string list that needs to be serialized into xml and then deserialized.
The serialization code has long since been part of the application and works in countless other scenarios but the issue here appears to be that one of the elements in the string list is a mere new line character (i.e. "\n").
It is my understanding, based on my research, it is serializing as expected (see below) but after deserialization the element contains an empty string (i.e. "") instead of "\n".
Here is the code...
public void DoStuff(ItemTypeObj item)
{
string myItem = XmlSerialize<ItemTypeObj>(item);
ItemTypeObj myNewItemTypeObj = XmlDeserialize<ItemTypeObj>(myItem);
}
public static string XmlSerialize<T>(T objectToSerialize)
{
string ret = string.Empty;
XmlSerializer s = new XmlSerializer(typeof(T));
using (MemoryStream ms = new MemoryStream())
{
s.Serialize(ms, objectToSerialize);
ms.Position = 0;
using (StreamReader sr = new StreamReader(ms))
{
ret = sr.ReadToEnd();
}
}
return ret;
}
public static T XmlDeserialize<T>(string serializedObject)
{
T retVal = default(T);
byte[] ba = Encoding.UTF8.GetBytes(serializedObject);
using (MemoryStream ms = new MemoryStream(ba))
{
XmlSerializer s = new XmlSerializer(typeof(T));
retVal = (T)s.Deserialize(ms);
}
return retVal;
}
To give you an idea of the data sent in, ItemTypeObj is the object which includes a string List. The string list can be variable length but sample data could look like this...
[0] = "Zero element text \n"
[1] = "[element1]"
[2] = "\n"
[3] = "[element3]"
[4] = "\n"
[5] = "[element5]"
When serialized it will look like this (which seems correct to me):
<Text>
<string>Zero element text
</string>
<string>[element1]</string>
<string>
</string>
<string>[element3]</string>
<string>
</string>
<string>[element5]</string>
</Text>
From what I've read the newlines are represented as expected in the xml above. The issue is after it is deserialized the string list is this:
[0] = "Zero element text \n"
[1] = "[element1]"
[2] = ""
[3] = "[element3]"
[4] = ""
[5] = "[element5]"
Only the newline characters in the elements that also have text (e.g. [0]) will still exist. The other two are replaced with empty string. If I add text to those elements the new line will be retained.
Looking at the byte array in the deserialization, the array element at the location in the serialized string where the "\n" was turns into a 10 (aka LF, new line). Then that does not successfully get turned into "\n" in the Deserialize. Perhaps that is too much to ask.
Any insight would be most appreciated. Thanks.
You'll need to use the XmlReader and XmlWriter classes or the DataContractSerializer.
See: How to keep XmlSerializer from killing NewLines in Strings?
public static string XmlSerialize<T>(T objectToSerialize)
{
XmlSerializer s = new XmlSerializer(typeof(T));
var settings = new XmlWriterSettings
{
NewLineHandling = NewLineHandling.Entitize
};
using(var stream = new StringWriter())
using(var writer = XmlWriter.Create(stream, settings))
{
s.Serialize(writer, objectToSerialize);
return stream.ToString();
}
}
public static T XmlDeserialize<T>(string serializedObject)
{
XmlSerializer s = new XmlSerializer(typeof(T));
using(var stream = new StringReader(serializedObject))
using(var reader = XmlReader.Create(stream))
{
return (T)s.Deserialize(reader);
}
}
Usage:
public class Foo
{
public string Bar { get; set; }
}
var foo = new Foo { Bar = "\n" };
var result = XmlSerialize(foo);
Console.WriteLine(result);
var newFoo = XmlDeserialize<Foo>(result);
Console.WriteLine(newFoo.Bar);
Debug.Assert(newFoo.Bar == "\n");
I have been searching the Internet for how to keep carriage returns from XML data, but I could not find the answer, so I'm here :)
The objective is to write the content of XML data to a file. If the value of a node contains "\r\n", the program needs to write it to the file in order to create a new line, but it doesn't, even with xml:space="preserve".
Here is my test class:
XElement xRootNode = new XElement("DOCS");
XElement xData = null;
//XNamespace ns = XNamespace.Xml;
//XAttribute spacePreserve = new XAttribute(ns+"space", "preserve");
//xRootNode.Add(spacePreserve);
xData = new XElement("DOC");
xData.Add(new XAttribute("FILENAME", "c:\\titi\\prout.txt"));
xData.Add(new XAttribute("MODE", "APPEND"));
xData.Add("Hi my name is Baptiste\r\nI'm a lazy boy");
xRootNode.Add(xData);
bool result = Tools.writeToFile(xRootNode.ToString());
And here is my process class:
try
{
XElement xRootNode = XElement.Parse(xmlInputFiles);
String filename = xRootNode.Element(xNodeDoc).Attribute(xAttributeFilename).Value.ToString();
Boolean mode = false;
try
{
mode = xRootNode.Element(xNodeDoc).Attribute(xWriteMode).Value.ToString().ToUpper().Equals(xWriteModeAppend);
}
catch (Exception e)
{
mode = false;
}
String value = xRootNode.Element(xNodeDoc).Value;
StreamWriter destFile = new StreamWriter(filename, mode, System.Text.Encoding.Unicode);
destFile.Write(value);
destFile.Close();
return true;
}
catch (Exception e)
{
return false;
}
Does anybody have an idea?
If you want to preserve CR LF in element or attribute content when saving an XDocument or XElement, you can do that with certain XmlWriterSettings, namely by setting NewLineHandling to Entitize:
string fileName = "XmlOuputTest1.xml";
string attValue = "Line1.\r\nLine2.";
string elementValue = "Line1.\r\nLine2.\r\nLine3.";
XmlWriterSettings xws = new XmlWriterSettings();
xws.NewLineHandling = NewLineHandling.Entitize;
XDocument doc = new XDocument(new XElement("root",
new XAttribute("test", attValue),
elementValue));
using (XmlWriter xw = XmlWriter.Create(fileName, xws))
{
doc.Save(xw);
}
doc = XDocument.Load(fileName);
Console.WriteLine("att value: {0}; element value: {1}.",
attValue == doc.Root.Attribute("test").Value,
elementValue == doc.Root.Value);
In that example the values are preserved in the round trip of saving and loading, as the output of the sample is "att value: True; element value: True."
Here's a useful link I found for parsing an XML string with carriage returns and line feeds in it.
howto-correctly-parse-using-xelementparse-for-strings-that-contain-newline-character-in
It may help those who are parsing an XML string.
For those who can't be bothered to click, it says to use an XmlTextReader instead:
XmlTextReader xtr = new XmlTextReader(new StringReader(xml));
XElement items = XElement.Load(xtr);
foreach (string desc in items.Elements("Item").Select(i => (string)i.Attribute("Description")))
{
Console.WriteLine("|{0}|", desc);
}
My template object, which is deserialized from a handmade XML file, contains mixed types, and the text can contain line breaks. When I look at the text I can see the line breaks are \r\n, but in my deserialized template object they are \n. How can I keep the line breaks as \r\n?
XmlReaderSettings settings = new XmlReaderSettings();
settings.CloseInput = true;
//settings.ValidationEventHandler += ValidationEventHandler;
settings.ValidationType = ValidationType.Schema;
settings.Schemas.Add(schema);
StringReader r = new StringReader(syntaxEdit.Text);
Schema.template rawTemplate = null;
using (XmlReader validatingReader = XmlReader.Create(r, settings))
{
try
{
XmlSerializer serializer = new XmlSerializer(typeof(Schema.template));
rawTemplate = serializer.Deserialize(validatingReader) as Schema.template;
}
catch (Exception ex)
{
rawTemplate = null;
string floro = ex.Message + (null != ex.InnerException ? ":\n" + ex.InnerException.Message : "");
MessageBox.Show(floro);
}
}
It seems that this is required behavior by the XML specification and is a "feature" in Microsoft's implementation of the XmlReader (see this answer).
Probably the easiest thing for you to do would be to replace \n with \r\n in your result.
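A minimal sketch of that replacement, assuming the deserialized text is available as a plain string; the helper name ToCrLf is invented for this example. Existing \r\n pairs are collapsed first so that they do not end up as \r\r\n.

```csharp
using System;

public static class LineEndings
{
    // Converts every line break to \r\n. Any \r\n already present is
    // first collapsed to \n so it is not expanded twice.
    public static string ToCrLf(string text)
    {
        return text.Replace("\r\n", "\n").Replace("\n", "\r\n");
    }

    public static void Main()
    {
        string deserialized = "Hello\nworld";
        Console.WriteLine(ToCrLf(deserialized) == "Hello\r\nworld"); // True
    }
}
```

This restores \r\n everywhere, so it is only appropriate when the original text is known to have used \r\n consistently.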
That's the behavior mandated by the XML specification: every \r\n, \r or \n MUST be interpreted as a single \n character. If you want to maintain the \r in your output, you have to change it to a character reference (&#xD;) as shown below.
public class StackOverflow_7374609
{
[XmlRoot(ElementName = "MyType", Namespace = "")]
public class MyType
{
[XmlText]
public string Value;
}
static void PrintChars(string str)
{
string toEscape = "\r\n\t\b";
string escapeChar = "rntb";
foreach (char c in str)
{
if (' ' <= c && c <= '~')
{
Console.WriteLine(c);
}
else
{
int escapeIndex = toEscape.IndexOf(c);
if (escapeIndex >= 0)
{
Console.WriteLine("\\{0}", escapeChar[escapeIndex]);
}
else
{
Console.WriteLine("\\u{0:X4}", (int)c);
}
}
}
Console.WriteLine();
}
public static void Test()
{
string serialized = "<MyType>Hello\r\nworld</MyType>";
MemoryStream ms = new MemoryStream(Encoding.UTF8.GetBytes(serialized));
XmlSerializer xs = new XmlSerializer(typeof(MyType));
MyType obj = (MyType)xs.Deserialize(ms);
Console.WriteLine("Without the replacement");
PrintChars(obj.Value);
serialized = serialized.Replace("\r", "&#xD;");
ms = new MemoryStream(Encoding.UTF8.GetBytes(serialized));
obj = (MyType)xs.Deserialize(ms);
Console.WriteLine("With the replacement");
PrintChars(obj.Value);
}
}