replacing substring inside attributes of XmlDocument - c#

I'm using C# with .net 3.5 and have a few cases where I want to replace some substrings in the XML attributes of an XmlDocument with something else.
One case is to replace the single quote character with &apos;, and the other is to clean up some files that contain valid XML but whose attribute values are no longer appropriate (say, replace any attribute value which starts with "myMachine" with "newMachine").
Is there a simple way to do this, or do I need to go through each attribute of every node (recursively)?

One way to approach it is to select a list of the correct elements using Linq to XML, and then iterate over that list. Here's an example one-liner:
// Requires: using System.Linq; using System.Xml.Linq; using System.Xml.XPath;
XDocument doc = XDocument.Load(path);
doc.XPathSelectElements("//element[@attribute-name = 'myMachine']").ToList().ForEach(x => x.SetAttributeValue("attribute-name", "newMachine"));
You could also do a more traditional iteration.
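For the "starts with" case from the question, a more traditional iteration might look like the sketch below. It walks every attribute of every element, so no hand-written recursion is needed. The class and method names (AttributeRewriter, ReplacePrefix) are made up for illustration and are not part of any library.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class AttributeRewriter
{
    // Rewrites every attribute value that starts with oldPrefix so that it
    // starts with newPrefix instead. Returns the number of attributes changed.
    public static int ReplacePrefix(XDocument doc, string oldPrefix, string newPrefix)
    {
        // Materialise the matches first so the document is not modified
        // while the query is still being enumerated.
        var matches = doc.Descendants()
                         .Attributes()
                         .Where(a => a.Value.StartsWith(oldPrefix, StringComparison.Ordinal))
                         .ToList();

        foreach (XAttribute attribute in matches)
        {
            attribute.Value = newPrefix + attribute.Value.Substring(oldPrefix.Length);
        }

        return matches.Count;
    }
}
```

Calling AttributeRewriter.ReplacePrefix(doc, "myMachine", "newMachine") and then doc.Save(path) would write the cleaned file back.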

I suggest taking a look at LINQ to XML. There's a collection of code snippets that can help you get started here - LINQ To XML Tutorials with Examples
LINQ to XML should allow you to do what you're looking to do, and you'll probably find it easy once you've played with it a bit.

Related

Loading XML Document - Name cannot begin with the zero character

I am trying to load something which claims to be an XML document into any type of .net XML object: XElement, XmlDocument, or XmlTextReader. All of them throw an exception :
Name cannot begin with the '0' character, hexadecimal value 0x30
The error relates to this bit of 'XML':
<chart_value
color="ff4400"
alpha="100"
size="12"
position="cursor"
decimal_char="."
0=""
/>
I believe the problem is the author should not have named an attribute as 0.
If I could change this I would, but I do not have control of this feed. I suppose those who use it are using more permissive tools. Is there any way I can load this as XML without throwing an error?
There is no XML declaration either, nor a namespace or contract definition. I was thinking I might have to turn it into a string and do a replace, but this is not very elegant. I was wondering if there were any other options.
As many have said, this is not XML.
Having said that, it's almost XML and WANTS to be XML, so I don't think you should use a regex to screw around inside of it (here's why).
Wherever you're getting the stream, dump it into a string, change 0= to something like zero= and try parsing it.
Don't forget to reverse the operation if you have to return-to-sender.
If you're reading from a file, you can do something like this:
var txt = File.ReadAllText(@"\path\to\wannabe.xml");
var clean = txt.Replace("0=", "zero=");
var doc = new XmlDocument();
doc.LoadXml(clean);
This is not guaranteed to remove all potential XML problems -- but it should remove the one you have.
Just prefix the numeric attribute name with an underscore ('_').
Example: replace "0=" with "_0=".
I hope that will fix the problem, thanks.
It might claim to be an XML document, but the claim is clearly false, so you should reject the document.
The only good way to deal with bad XML is to find out what bit of software is producing it, and either fix it or throw it away. All the benefits of XML go out of the window if people start tolerating stuff that's nearly XML but not quite.
The 0="" obviously uses an invalid attribute name 0. You'd probably have to do a find/replace to try and fix the XML if you cannot fix it at the source that created it. You might be able to use RegEx to try to do more efficient manipulation of the XML string.
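If a plain Replace("0=", ...) feels too broad (it would also touch values such as size="10="), a slightly more targeted find/replace is sketched below. It only prefixes names that start with a digit, are preceded by whitespace, and are followed by = and a quote. The class name LenientXmlLoader and the exact pattern are assumptions made for this example. It is still a text-level patch, so it can misfire on similar-looking text inside attribute values or element content.

```csharp
using System.Text.RegularExpressions;
using System.Xml;

public static class LenientXmlLoader
{
    // An attribute-like name that starts with a digit: preceded by whitespace
    // and followed by '=' and an opening quote.
    private static readonly Regex DigitLedName =
        new Regex(@"(?<=\s)(\d[\w.\-]*)(?=\s*=\s*[""'])");

    public static XmlDocument Load(string text)
    {
        // Prefix each offending name with an underscore, e.g. 0="" becomes _0="".
        string clean = DigitLedName.Replace(text, "_$1");

        var doc = new XmlDocument();
        doc.LoadXml(clean);
        return doc;
    }
}
```

As with the zero= suggestion, remember to reverse the rename if the document has to go back to its sender.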

Querying XML Without Worrying about Namespaces

I have XML with and without a prefix on elements, but no namespaces defined for any of them. When I try to load this, it gives me an error on XDocument.Load (at least, I think that's where it happens) that certain prefixes are not defined. Is there a way to tell the framework to ignore any namespace prefixes? I'm using LINQ to XML, but could use something else if available.
I can't necessarily pre-define them because I'm going to be working with a variety of documents that may or may not have a prefix defined and no definitive xmlns declaration.
Aren't prefixes supposed to represent an abbreviation for a namespace? I believe you need to clean up those prefixes that have no namespace associated with them before processing the document, since it isn't valid XML. A quick regex to replace all prefixes of the form </prefix: with </ and <prefix: with < should do it.
To do this, replace the following regex matches, in this order:
</.*?: with </
and then <.*?: with <
An approach to what you want to do may be using XmlDocument:
XmlDocument d = new XmlDocument();
using (var textReader = new XmlTextReader(@"test.xml"))
{
textReader.Namespaces = false;
d.Load(textReader);
}
With this approach you will lose the power of querying the data using the syntax of LINQ to XML.
You can actually use LINQ to XML and ignore the namespaces by adding the following line for each prefix in the file:
nameSpaceManager.AddNamespace("prefixName", "urn:ignore");
where nameSpaceManager is of type XmlNamespaceManager.
But from your question I sense that this is not a reasonable solution, since the prefixes are not known in advance.
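For cases where the set of possible prefixes is known, the namespace-manager idea can be sketched roughly as follows: the manager is handed to the reader through an XmlParserContext, and the resulting reader is passed to XDocument.Load. The class name UndeclaredPrefixLoader and the placeholder URI urn:ignore are illustrative assumptions.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class UndeclaredPrefixLoader
{
    // Loads XML whose elements use prefixes that are never declared with xmlns.
    // Every prefix listed in 'prefixes' is mapped to the dummy namespace urn:ignore.
    public static XDocument Load(string xml, IEnumerable<string> prefixes)
    {
        var nameTable = new NameTable();
        var namespaceManager = new XmlNamespaceManager(nameTable);
        foreach (string prefix in prefixes)
        {
            namespaceManager.AddNamespace(prefix, "urn:ignore");
        }

        // The parser context tells the reader which prefixes are already in scope.
        var context = new XmlParserContext(nameTable, namespaceManager, null, XmlSpace.None);
        using (XmlReader reader = XmlReader.Create(new StringReader(xml), null, context))
        {
            return XDocument.Load(reader);
        }
    }
}
```

After loading, elements can be matched on e.Name.LocalName so that queries do not need to care whether an element carried a prefix.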

LINQ to XML: Is XNode query possible

I want to use LINQ to XML in Silverlight 3 since there is no XPath support.
I have kind of got the hang of it. But the project I'm working on will not guarantee that all the XML tags I will be querying for will appear in the result XML file.
Because of this I will not be able to query the overall file as an XDocument, because the absence of a tag in one document will jumble up the enumeration.
Is there any way to typecast an XNode to an XDocument? I am asking because I am not able to query the XNode.
Even with LINQ-to-XML you should be querying by name, so I'm not sure why the absence of any particular tag should "jumble up the enumeration" - simply; you might have some nulls, i.e.
var customer = node.Element("Foo");
// now test for null ;p
You can't cast an arbitrary XNode to an XDocument, but if you are sure it is an element, casting to XElement should provide what you need.
Note also that when value nodes may be missing, you might find it easiest to use the conversion operators:
var auditDate = (DateTime?)e.Element("AuditDate");
If <AuditDate> doesn't exist, this will return an empty Nullable<DateTime>. The same approach works for most common value types; for strings, just convert to string.
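To make the conversion-operator behaviour concrete, here is a small sketch. The element names AuditDate and Name and the class OptionalElementReader are placeholders for this example only.

```csharp
using System;
using System.Xml.Linq;

public static class OptionalElementReader
{
    // Returns null when <AuditDate> is absent instead of throwing.
    public static DateTime? ReadAuditDate(XElement record)
    {
        return (DateTime?)record.Element("AuditDate");
    }

    // The string conversion also yields null for a missing element.
    public static string ReadName(XElement record)
    {
        return (string)record.Element("Name");
    }
}
```

Both explicit conversions accept a null XElement, which is what Element(...) returns for a missing tag, so no separate null check is needed before the cast.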

Best way to replace XML Text

I have a web service which returns the following XML:
<Validacion>
<Es_Valido>NK7+22XrSgJout+ZeCq5IA==</Es_Valido>
</Validacion>
<Estatus>
<Estatus>dqrQ7VtQmNFXmXmWlZTL7A==</Estatus>
</Estatus>
<Generales>
<Nombre>V4wb2/tq9tEHW80tFkS3knO8i4yTpJzh7Jqi9MxpVVE=</Nombre>
<Apellido>jXyRpjDQvsnzZz+wsq6b42amyTg2np0wckLmQjQx1rCJc8d3dDg6toSdSX200eGi</Apellido>
<Ident_Clie>IYbofEiD+wOCJ+ujYTUxgsWJTnGfVU+jcQyhzgQralM=</Ident_Clie>
<Fec_Creacion>hMI2YyE5h2JVp8CupWfjLy24W7LstxgmlGoDYjPev0r8TUf2Tav9MBmA2Xd9Pe8c</Fec_Creacion>
<Nom_Asoc>CF/KXngDNY+nT99n1ITBJJDb08/wdou3e9znoVaCU3dlTQi/6EmceDHUbvAAvxsKH9MUeLtbCIzqpJq74e QfpA==</Nom_Asoc>
<Fec_Defuncion />
</Generales>
The text inside the tags is encrypted, and I need to decrypt it. I've come up with a regular expression solution, but I don't think it's optimal. Is there a better way to do this? Thanks!
I wouldn't use a regular expression. Load the XML with something like LINQ to XML, find every element which just has a text child, and replace the contents of that child with the decrypted form.
Do you know which elements will be encrypted? That would make it even easier. Basically you'll want something along the lines of:
// It's possible that modifying elements while executing Descendants()
// would be okay, but I'm not sure
List<XElement> elements = doc.Descendants().ToList();
foreach (XElement element in elements)
{
if (ShouldDecrypt(element)) // Whatever this would do
{
element.Value = Decrypt(element.Value);
}
}
(I'm assuming you know how to do the actual decryption part, of course.)
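One possible shape for the ShouldDecrypt check, assuming that every non-empty leaf element holds encrypted text, is sketched below. The decryption routine is passed in as a delegate because the actual cipher is not stated in the question; ElementDecryptor and DecryptLeafElements are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class ElementDecryptor
{
    // Replaces the text of every non-empty leaf element under 'root'
    // with the result of the supplied decrypt function.
    public static void DecryptLeafElements(XContainer root, Func<string, string> decrypt)
    {
        // Snapshot the elements before changing any values.
        List<XElement> leaves = root.Descendants()
            .Where(e => !e.HasElements && !string.IsNullOrEmpty(e.Value))
            .ToList();

        foreach (XElement leaf in leaves)
        {
            leaf.Value = decrypt(leaf.Value);
        }
    }
}
```

Empty elements such as <Fec_Defuncion /> are skipped, so the decrypt function never receives an empty string.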
Never use regular expressions to parse XML. XmlReader and XmlDocument, both found in System.Xml, provide a much better way to parse XML.
Do you know the type of encryption used? Look here to get the basics on the cryptography capabilities in .NET

Put prefix on all elements of xml document

I'm using C# and I need to create an XML document. I have done that, but I need to put a tc prefix on each element.
The only way I know is to use xmlDoc.CreateElement("tc", "node1", "file.xsd"), but that is a lot of work because I have many tags and my program is already written.
Is this the only way?
This might work for you:
XmlReader - I need to edit an element and produce a new one
If you're lucky enough to be using .NET 3.5 (C# 3.0), take a look at LINQ to XML.
Here's a document on How to: Create a Document with Namespaces (C#) (LINQ to XML) from MSDN for the LINQ to XML API.
And if you've never seen LINQ to XML before, take a look at this 5 minute overview
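As a rough illustration of the LINQ to XML approach, the sketch below declares the tc prefix once on the root and reuses an XNamespace for every element name. The namespace name "file.xsd" is taken from the question; the element names and the class PrefixedDocumentBuilder are placeholders.

```csharp
using System.Xml.Linq;

public static class PrefixedDocumentBuilder
{
    public static XElement Build()
    {
        // Every element created with 'tc + name' belongs to this namespace.
        XNamespace tc = "file.xsd";

        return new XElement(tc + "root",
            // Declaring xmlns:tc on the root makes the serializer emit the tc prefix.
            new XAttribute(XNamespace.Xmlns + "tc", tc.NamespaceName),
            new XElement(tc + "node1", "value"));
    }
}
```

Because the prefix comes from the single xmlns:tc declaration, the individual element constructors only need the namespace, not the prefix.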
