String.Split not working to get values - c#

I have XML that is passed to me as a string variable. I want to get the values of specific tags from that XML. The following is the XML I have and what I'm trying to achieve:
<code>
string xmlData = @"
<HEADER>
<TYPE>AAA</TYPE>
<SUBTYPE>ANNUAL</SUBTYPE>
<TYPEID>12345</TYPEID>
<SUBTYPEID>56789</SUBTYPEID>
<ACTIVITY>C</ACTION>
</HEADER>";
var typeId = xmlData.Split("<TYPEID>")[0]; //Requirement
var activity = xmlData.Split("<ACTIVITY>")[0]; //Requirement
</code>
I know string.Split() doesn't work here as it requires a single character only. The alternative is to use regex, which seems a bit intimidating to me. I have tried to work with it but am not getting the desired result. Can someone help with the regex code?

You should use an XML parser to get the values, but since you are trying to split on a string rather than a char, you can use the overload that takes a string array:
string typeId = xmlData.Split(new string[] { "<TYPEID>" }, StringSplitOptions.None)[1];
string typeIdVal = typeId.Split(new string[] { "</TYPEID>" }, StringSplitOptions.None)[0];
With XML parsing it looks much cleaner (if the XML is already in a string, as in your case, use xmlDoc.LoadXml(xmlData) instead of Load):
XmlDocument xmlDoc= new XmlDocument();
xmlDoc.Load("yourXMLFile.xml");
XmlNodeList XTypeID = xmlDoc.GetElementsByTagName("TYPEID");
string TypeID = XTypeID[0].InnerText;
You can also use Substring, like this:
string typeidsubstr = xmlData.Substring(xmlData.IndexOf("<TYPEID>") + 8, xmlData.IndexOf("</TYPEID>") - (xmlData.IndexOf("<TYPEID>") + 8));
I used +8 because the length of <TYPEID> is 8; you can also use "<TYPEID>".Length instead of hard-coding the value.
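As a rough sketch of the Substring approach without the hard-coded 8: the helper below takes the tag name and uses the opening tag's Length as the offset. GetTagValue and TagValueDemo are hypothetical names made up for this example, and it assumes each tag appears once and has no attributes.

```csharp
using System;

static class TagValueDemo
{
    // Hypothetical helper: returns the text between <tag> and </tag>,
    // or null when either tag cannot be found.
    public static string GetTagValue(string xml, string tag)
    {
        string open = "<" + tag + ">";
        string close = "</" + tag + ">";

        int start = xml.IndexOf(open, StringComparison.Ordinal);
        if (start < 0) return null;
        start += open.Length; // replaces the hard-coded + 8

        // Search for the closing tag only after the opening tag.
        int end = xml.IndexOf(close, start, StringComparison.Ordinal);
        if (end < 0) return null;

        return xml.Substring(start, end - start);
    }

    static void Main()
    {
        string xmlData = "<HEADER><TYPEID>12345</TYPEID><ACTIVITY>C</ACTION></HEADER>";
        Console.WriteLine(GetTagValue(xmlData, "TYPEID"));                    // 12345
        // The closing tag is </ACTION>, so </ACTIVITY> is never found.
        Console.WriteLine(GetTagValue(xmlData, "ACTIVITY") ?? "(not found)"); // (not found)
    }
}
```

Returning null instead of throwing is just one possible choice here; it makes the mismatched closing tag easy to spot.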

You can use LINQ to XML (XDocument) to parse this.
NB: There is a typo in the ACTIVITY element: the closing tag should be /ACTIVITY, not /ACTION! (I've corrected it below.)
string xmlData = @"<HEADER>
<TYPE>AAA</TYPE>
<SUBTYPE>ANNUAL</SUBTYPE>
<TYPEID>12345</TYPEID>
<SUBTYPEID>56789</SUBTYPEID>
<ACTIVITY>C</ACTIVITY>
</HEADER>";
var doc = XDocument.Parse(xmlData);
var typeId = doc.Root.Elements("TYPEID").First().Value;
var activity = doc.Root.Elements("ACTIVITY").First().Value;

Related

Remove single quote characters from specific field inside JSON data

I have the JSON data string below, which needs some cleaning before I can deserialize it into an object in C#. Here's my JSON string:
{'data':[
{'ID':'01','Name':'Name 1','Description':'abc','Skills':[{'Type':'abc','Technical':'abc','Description':'abc'}],'Status':false,'Inactive':0},
{'ID':'02','Name':'Name 2','Description':'abc','Skills':[{'Type':'abc','Technical':'abc','Description':'abc'}],'Status':false,'Inactive':0},
{'ID':'03','Name':'Name 3','Description':'abc','Skills':[{'Type':'abc','Technical':'abc','Description':'abc'}],'Status':false,'Inactive':1}]}
What I'm trying to do is REMOVE single quote (') character from the following field in the above data:
'Skills':[{'Type':'abc','Technical':'abc','Description':'abc'}]
So what I need to achieve is to have "Skills" field to look like this:
'Skills':[{Type:abc,Technical:abc,Description:abc}]
I designed this regex pattern:
(?<='Skills':\[\{)(.*?)(?=\}\],)
It matches the string below, but I don't know how to exclude single quotes.
'Type':'abc','Technical':'abc','Description':'abc'
Can someone please help?
It would be better to fix the source so that it emits properly formatted JSON; this is not standard JSON.
If you cannot modify the source output, you can use this:
string content = Console.ReadLine();
var matchResult = new Regex("(?<='Skills':).*?}]").Matches(content);
foreach(Match match in matchResult)
{
string matchValueWithoutSingleQuote = match.Value.Replace("'", string.Empty);
content = content.Replace(match.Value, matchValueWithoutSingleQuote);
}
Console.WriteLine(content);
Console.ReadLine();
The output is:
{'data':[
{'ID':'01','Name':'Name 1','Description':'abc','Skills':[{Type:abc,Technical:abc,Description:abc}],'Status':false,'Inactive':0},
{'ID':'02','Name':'Name 2','Description':'abc','Skills':[{Type:abc,Technical:abc,Description:abc}],'Status':false,'Inactive':0},
{'ID':'03','Name':'Name 3','Description':'abc','Skills':[{Type:abc,Technical:abc,Description:abc}],'Status':false,'Inactive':1}]}
LINQ version:
string content = Console.ReadLine();
var matchResult = new Regex("(?<='Skills':).*?}]").Matches(content);
// Aggregate carries the updated string from one match to the next, so every distinct match is replaced.
var jsonWithNormalizedSkillField = matchResult.Cast<Match>().Aggregate(content, (json, m) => json.Replace(m.Value, m.Value.Replace("'", string.Empty)));
Console.WriteLine(jsonWithNormalizedSkillField);
Console.ReadLine();
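For what it's worth, the same idea can probably be written as a single Regex.Replace call with a match evaluator, which rewrites each match in place instead of calling string.Replace on the whole content. This is only a sketch that reuses the pattern from above; SkillsQuoteDemo and StripSkillsQuotes are names made up for the example.

```csharp
using System;
using System.Text.RegularExpressions;

static class SkillsQuoteDemo
{
    // Removes single quotes inside each 'Skills':[...] value and leaves the rest untouched.
    public static string StripSkillsQuotes(string json)
    {
        // Same pattern as above: everything after 'Skills': up to the first }]
        return Regex.Replace(
            json,
            @"(?<='Skills':).*?}]",
            m => m.Value.Replace("'", string.Empty));
    }

    static void Main()
    {
        string content = "{'ID':'01','Skills':[{'Type':'abc','Technical':'abc'}],'Status':false}";
        Console.WriteLine(StripSkillsQuotes(content));
        // {'ID':'01','Skills':[{Type:abc,Technical:abc}],'Status':false}
    }
}
```

Because the evaluator runs once per match, records with different Skills values are each handled independently.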

How to keep xml from converting \r\n into &#xD;&#xA;

I have here a small code:
string attributeValue = "Hello" + Environment.NewLine + " Hello 2";
XElement element = new XElement("test");
XElement subElement = new XElement("subTest");
XAttribute attribute = new XAttribute("key", "Hello");
XAttribute attribute2 = new XAttribute("key2", attributeValue);
subElement.Add(attribute);
subElement.Add(attribute2);
element.Add(subElement);
Console.Write(element.ToString());
Console.ReadLine();
My issue is that the \r\n (the new line) is converted to &#xD;&#xA; in the attribute, and I don't want that. I want to keep it as \r\n, because when I use this XML with a Microsoft Word document template, the new lines are not applied: although the text is multiline, in the Word document I only get spaces, but no new lines :/
Does anyone have any idea?
I have already enabled multiline in the properties of the field in the template.
Actually, the behaviour you get with &#xD;&#xA; is the same as with Environment.NewLine. You can run a simple test to confirm this (add two TextBoxes, textBox1 and textBox2, to your Form with the Multiline property set to True):
textBox1.Text = element.ToString(); // shows &#xD;&#xA;
string text = element.ToString().Replace("&#xD;&#xA;", Environment.NewLine);
textBox2.Text = text; // shows \r\n
On the other hand, if you want to avoid the &#xD;&#xA; part anyway (for example, because you want to pass the string to an external program that does not run on .NET), you can just rely on the aforementioned Replace after dealing with XElement and new lines.
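If it helps, here is a small console sketch (no Form needed) that should show the character references are only an encoding of the same line break: serializing the attribute and parsing the result again gives back the original string. It uses "\r\n" explicitly rather than Environment.NewLine so that it behaves the same on every platform. NewLineAttributeDemo, Serialize and ReadBack are names invented for the example.

```csharp
using System;
using System.Xml.Linq;

static class NewLineAttributeDemo
{
    // Builds <test><subTest key2="..."/></test> and returns its markup.
    public static string Serialize(string attributeValue)
    {
        var element = new XElement("test",
            new XElement("subTest", new XAttribute("key2", attributeValue)));
        return element.ToString();
    }

    // Parses the markup again and returns the decoded attribute value.
    public static string ReadBack(string xml)
    {
        return XElement.Parse(xml).Element("subTest").Attribute("key2").Value;
    }

    static void Main()
    {
        string original = "Hello\r\n Hello 2";
        string xml = Serialize(original);

        Console.WriteLine(xml);                       // markup with the encoded line break
        Console.WriteLine(ReadBack(xml) == original); // compares the decoded value with the original
    }
}
```

So any consumer that actually parses the XML gets the line break back; the Replace workaround is only needed for consumers that read the raw text.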

Regex - remove text while replacing text with c#

I am attempting to learn regex by using it to edit some scripts I have.
My scripts contain text like this:
<person name="John">Will be out of town</person><person name="Julie">Will be in town.</person>
I need to replace the name values in the script - the addition to the name is always the same, but I might have names that I don't want to update.
Quick example of what I have:
string[] names = new string[2];
names[0] = "John-Example";
names[1] = "Paul-Example";
string ToFix = "<person name=\"John\">Will be out of town</person><person name=\"Julie\">Will be in town.</person>";
for (int i=0; i<names.Length; i++)
{
string Name = names[i];
ToFix = Regex.Replace(ToFix, "(<.*name=\")(" + Name.Replace("-Example", "") + ".*)(\".*>)", "$1" + Name + "$3", RegexOptions.IgnoreCase);
}
This works for the most part, but I have two problems with it. Sometimes it removes too much: if I have multiple persons in the string, it removes everything between the first person and the last person, like so:
Hello <person name="John">This is John</person><person name="Paul">This is Paul</person>
becomes
Hello <person name="John-Example">This is Paul</person>
Also, I would like to remove any extra text after the name value and before the closing angle bracket, so that:
<person name="John" hello>
Should be corrected to:
<person name="John-Example">
I have read several articles on regex and feel that I am just missing something small here. How and why would I go about fixing this?
EDIT: I don't think these scripts that I am working with classify as XML - the entire script may or may not have <> tags. Back to my original goal with this question, can someone explain the behavior of the regex? And how would I remove extra text after the name value before the closing tag?
Your regex is too greedy. Try .*? rather than just .*
Also, please don't use regex to parse XML.
Here's an example of how to do what I think you want, using XDocument:
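// XDocument.Parse rejects multiple top-level elements, so wrap the fragment in a temporary root.
var xdoc = XDocument.Parse("<root>" + ToFix + "</root>");
foreach (var person in xdoc.Root.Elements("person"))
{
var name = person.Attribute("name");
// Only elements that carry extra attributes after name are rewritten.
if (person.LastAttribute != name)
{
person.RemoveAttributes();
person.SetAttributeValue(name.Name, name.Value + "-Example");
}
}
// Concatenate the child nodes to drop the temporary root again.
var output = string.Concat(xdoc.Root.Nodes());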
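Since you say the scripts may not be valid XML, here is a regex-only sketch as well. Instead of the lazy .*? it uses a negated character class, [^>]*, which can never run past the closing > of the tag it started in, so it cannot swallow several tags at once and it also drops any extra text after the name value. It assumes the name attribute comes first and is double-quoted; PersonRenameDemo and Rename are names made up for the example.

```csharp
using System;
using System.Text.RegularExpressions;

static class PersonRenameDemo
{
    // For each entry such as "John-Example", rewrites <person name="John" ...> to
    // <person name="John-Example">, discarding anything between the name and the closing >.
    public static string Rename(string text, string[] names, string suffix)
    {
        foreach (string fullName in names)
        {
            // "John-Example" -> "John"
            string baseName = fullName.EndsWith(suffix, StringComparison.Ordinal)
                ? fullName.Substring(0, fullName.Length - suffix.Length)
                : fullName;

            // [^>]* stops at the first '>', unlike a greedy .* which runs to the last one.
            string pattern = "(<person\\s+name=\")" + Regex.Escape(baseName) + "\"[^>]*>";

            text = Regex.Replace(
                text,
                pattern,
                m => m.Groups[1].Value + fullName + "\">",
                RegexOptions.IgnoreCase);
        }
        return text;
    }

    static void Main()
    {
        string toFix = "Hello <person name=\"John\">This is John</person><person name=\"Paul\" hello>This is Paul</person>";
        Console.WriteLine(Rename(toFix, new[] { "John-Example", "Paul-Example" }, "-Example"));
        // Hello <person name="John-Example">This is John</person><person name="Paul-Example">This is Paul</person>
    }
}
```

Names that are not in the array (Julie in your example) are left alone, and running it twice does not append the suffix again because "John-Example" no longer matches John followed by a quote.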

Write XML in a string c#

I was wondering how I can write the XML into a string once I have retrieved it.
Because when I do this:
{
XDocument xdoc = XDocument.Load("MYURL");
string textresult = xdoc.Root.ToString();
Label_RequestResult.Text = textresult;
}
my Label_RequestResult.Text will be equal to the value of the XML node.
I would like to return the whole XML structure instead.
Is this possible?
Thanks for helping.
In my case string textresult = xdoc.ToString(); did the job.
I got the whole structure, even with spaces and line breaks.
I think it should be something like this; have you tried it already?
string textresult = xdoc.Root.Value;
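To make the difference between the two suggestions concrete, here is a minimal sketch on a tiny in-memory document (the XML and the names XmlToStringDemo, Markup and TextOnly are invented for the example): ToString() returns the markup, while Value returns only the concatenated text content.

```csharp
using System;
using System.Xml.Linq;

static class XmlToStringDemo
{
    // Full markup; DisableFormatting keeps it on one line instead of indenting it.
    public static string Markup(XDocument doc) => doc.ToString(SaveOptions.DisableFormatting);

    // Only the text inside the elements, with all tags stripped.
    public static string TextOnly(XDocument doc) => doc.Root.Value;

    static void Main()
    {
        var xdoc = XDocument.Parse("<a><b>1</b><c>2</c></a>");
        Console.WriteLine(Markup(xdoc));   // <a><b>1</b><c>2</c></a>
        Console.WriteLine(TextOnly(xdoc)); // 12
    }
}
```

So if the label should show the whole structure, ToString() (on the document or on Root) is the one to use.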

Error getting the exact value from an XML node when the string contains '\' and the XML is passed as a string instead of a file

I have XML in a String as below
String s = @"<user>abc.int\abhi</user>";
but when I write the following code
XmlDocument doc = new XmlDocument();
doc.InnerXml = s;
XmlElement root = doc.DocumentElement;
String User = root.InnerText; // root is the <user> element itself
User has the value abc.int\\abhi instead of abc.int\abhi; the '\' character appears twice in the string.
Thank you in advance.
Are you checking that value in the VS Watch window? If so, it is normal for it to display \\, because the Watch window shows the string as it would be written in code, not the actual string.
In code, if you want to put \ into a string, you have to write string s = "\\"; and this creates an actual string containing a single \.
Try outputting your string to the console or a message box, and you should see that it is correct.
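A quick console sketch of that check, assuming the XML from the question is held in a verbatim string (BackslashDemo and ReadUser are names made up for the example). The length confirms there is only one backslash in the actual value.

```csharp
using System;
using System.Xml;

static class BackslashDemo
{
    // Loads the XML from a string and returns the text of the root element.
    public static string ReadUser(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.DocumentElement.InnerText;
    }

    static void Main()
    {
        string s = @"<user>abc.int\abhi</user>";
        string user = ReadUser(s);

        Console.WriteLine(user);                    // abc.int\abhi
        Console.WriteLine(user.Length);             // 12, so only one backslash
        Console.WriteLine(user == "abc.int\\abhi"); // True: "\\" in code is a single \ at runtime
    }
}
```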
