Linq Read XML with <br> tag - c#

I have an XML file with the following structure:
<template><body>public DiffSectionType Type<template:br/>{<template:br/><template:tab/>get<template:br/><template:tab/>{<template:br/><template:tab/><template:tab/>return _Type;<template:br/><template:tab/>}<template:br/>}</body></template>
I would like the output to be more readable, like this:
public DiffSectionType Type
{
    get
    {
        return _Type;
    }
}
<template:br/> => new line
<template:tab/> => tab
I can read the body string, but I am not able to put it in the correct format.
This is what I have tried:
var document = XDocument.Load("template.xml");
var body = from element in document.Elements("template").Elements("body")
           select element;
foreach (var v in body)
{
    Console.WriteLine(v.Value);
}

You could use Regex to solve this, with something like the following:
string str = @"<template><body>public DiffSectionType Type<template:br/>{<template:br/><template:tab/>get<template:br/><template:tab/>{<template:br/><template:tab/><template:tab/>return _Type;<template:br/><template:tab/>}<template:br/>}</body></template>";
// Replace the marker elements with the whitespace they stand for.
str = Regex.Replace(str, "<template:br/>", Environment.NewLine);
str = Regex.Replace(str, "<template:tab/>", "\t");
// Strip the wrapper elements. A literal "/" is used here because "\x2Fb" in a
// regular C# string would be read as the single character U+02FB.
str = Regex.Replace(str, "(</template>)|(<template>)", "");
str = Regex.Replace(str, "(</body>)|(<body>)", "");
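Since the question asks about LINQ, here is a minimal LINQ to XML sketch of the same idea: walk the child nodes of body and turn each br element into a new line and each tab element into a tab. It assumes the root element declares the template prefix (the xmlns:template="urn:example" attribute below is made up for illustration), because an XML parser rejects an undeclared prefix. TemplateRenderer and Render are hypothetical names.

```csharp
using System;
using System.Text;
using System.Xml.Linq;

public static class TemplateRenderer
{
    // Concatenates the text nodes of <body>, mapping <template:br/> to a
    // new line and <template:tab/> to a tab character.
    public static string Render(XElement body)
    {
        var sb = new StringBuilder();
        foreach (XNode node in body.Nodes())
        {
            if (node is XText text)
            {
                sb.Append(text.Value);
            }
            else if (node is XElement el)
            {
                // Only the local name is checked, so the namespace URI can be anything.
                if (el.Name.LocalName == "br") sb.Append(Environment.NewLine);
                else if (el.Name.LocalName == "tab") sb.Append('\t');
            }
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // Assumption: the prefix is declared on the root element.
        const string xml =
            "<template xmlns:template=\"urn:example\"><body>public DiffSectionType Type<template:br/>{" +
            "<template:br/><template:tab/>get<template:br/><template:tab/>{<template:br/>" +
            "<template:tab/><template:tab/>return _Type;<template:br/><template:tab/>}" +
            "<template:br/>}</body></template>";

        XElement body = XDocument.Parse(xml).Root.Element("body");
        Console.WriteLine(Render(body));
    }
}
```

If the file really has no namespace declaration, the string-based approach is the simpler route, because XDocument.Load will throw before any query runs.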

Related

C# convert a string to unicode encoding

I need to convert a string, e.g. "hi", to &#104;&#105;. Is there a simple way of doing this? Here is a website that does what I need: http://unicode-table.com/en/tools/encoder/
Try this:
var s = "hi";
var ss = String.Join("", s.Select(c => "&#" + (int)c + ";"));
Try this:
string myString = "Hi there!";
string encodedString = myString.Aggregate("", (current, c) => current + string.Format("&#{0};", Convert.ToInt32(c)));
Based on the answer to this question:
static string EncodeNonAsciiCharacters(string value)
{
    StringBuilder sb = new StringBuilder();
    foreach (char c in value)
    {
        // "d4" zero-pads the code to four digits; the trailing ";" closes the character reference.
        string encodedValue = "&#" + ((int)c).ToString("d4") + ";"; // <------- changed
        sb.Append(encodedValue);
    }
    return sb.ToString();
}

Remove words from string c#

I am working on an ASP.NET 4.0 web application. Its main goal is to go to the URL in the MyURL variable, read the page from top to bottom, find all lines that start with "description", and keep only those while removing all HTML tags. What I want to do next is remove the "description" text from the results afterwards, so that only my device names are left. How would I do this?
protected void parseButton_Click(object sender, EventArgs e)
{
    MyURL = deviceCombo.Text;
    WebRequest objRequest = HttpWebRequest.Create(MyURL);
    objRequest.Credentials = CredentialCache.DefaultCredentials;
    using (StreamReader objReader = new StreamReader(objRequest.GetResponse().GetResponseStream()))
    {
        originalText.Text = objReader.ReadToEnd();
    }
    //Read all lines of file
    String[] crString = { "<BR> " };
    String[] aLines = originalText.Text.Split(crString, StringSplitOptions.RemoveEmptyEntries);
    String noHtml = String.Empty;
    for (int x = 0; x < aLines.Length; x++)
    {
        if (aLines[x].Contains(filterCombo.SelectedValue))
        {
            noHtml += (RemoveHTML(aLines[x]) + "\r\n");
        }
    }
    //Print results to textbox
    resultsBox.Text = String.Join(Environment.NewLine, noHtml);
}
public static string RemoveHTML(string text)
{
    text = text.Replace("&nbsp;", " ").Replace("<br>", "\n");
    var oRegEx = new System.Text.RegularExpressions.Regex("<[^>]+>");
    return oRegEx.Replace(text, string.Empty);
}
OK, so I figured out how to remove the words through one of my existing functions:
public static string RemoveHTML(string text)
{
    text = text.Replace("&nbsp;", " ").Replace("<br>", "\n").Replace("description", "").Replace("INFRA:CORE:", "")
        .Replace("RESERVED", "")
        .Replace(":", "")
        .Replace(";", "")
        .Replace("-0/3/0", "");
    var oRegEx = new System.Text.RegularExpressions.Regex("<[^>]+>");
    return oRegEx.Replace(text, string.Empty);
}
public static void Main(String[] args)
{
    string str = "He is driving a red car.";
    Console.WriteLine(str.Replace("red", "").Replace("  ", " "));
}
Output:
He is driving a car.
Note: In the second Replace, the first argument is a double space.
Link : https://i.stack.imgur.com/rbluf.png
Try this. It will remove all occurrences of the word that you want to remove.
Try something like this, using LINQ:
List<string> lines = new List<string>{
    "Hello world",
    "Description: foo",
    "Garbage:baz",
    "description purple"};
//now add all your lines from your html doc (inside your existing loop).
if (aLines[x].Contains(filterCombo.SelectedValue))
{
    lines.Add(RemoveHTML(aLines[x]) + "\r\n");
}
var myDescriptions = lines.Where(x => x.ToLower().StartsWith("description"))
                          .Select(x => x.ToLower().Replace("description", string.Empty)
                                        .Trim());
// for the sample lines this yields ": foo" and "purple", plus anything else you added.
You may have to adjust for colons, etc.
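One possible way to handle the colons, and to keep the original casing of the device names, is sketched below. It is only an illustration under the assumption that the prefix is always the literal word "description", optionally followed by a colon and spaces; DescriptionFilter and ExtractNames are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DescriptionFilter
{
    // Keeps only lines that start with "description" (any casing) and
    // returns what follows the prefix, without the colon or padding.
    public static List<string> ExtractNames(IEnumerable<string> lines)
    {
        const string prefix = "description";
        return lines
            .Select(line => line.Trim())
            .Where(line => line.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
            .Select(line => line.Substring(prefix.Length).TrimStart(':', ' ').Trim())
            .ToList();
    }

    public static void Main()
    {
        var lines = new[]
        {
            "Hello world",
            "Description: foo",
            "Garbage:baz",
            "description purple"
        };

        foreach (string name in ExtractNames(lines))
        {
            Console.WriteLine(name); // prints "foo", then "purple"
        }
    }
}
```

Because the comparison is case-insensitive, there is no need to lower-case the whole line, so a name such as "Router-A" would keep its capital letters.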
void Main()
{
    string test = "<html>wowzers description: none <div>description:a1fj391</div></html>";
    IEnumerable<string> results = getDescriptions(test);
    foreach (string result in results)
    {
        Console.WriteLine(result);
    }
    //result: none
    //        a1fj391
}
static Regex MyRegex = new Regex(
    "description:\\s*(?<value>[\\d\\w]+)",
    RegexOptions.Compiled);
IEnumerable<string> getDescriptions(string html)
{
    foreach (Match match in MyRegex.Matches(html))
    {
        yield return match.Groups["value"].Value;
    }
}
Adapted From Code Project
string value = "ABC - UPDATED";
int index = value.IndexOf(" - UPDATED");
if (index != -1)
{
    value = value.Remove(index);
}
After this, value is "ABC", without the " - UPDATED" suffix.

Saving an XML that has invalid characters

There are code snippets that strip invalid characters from a string before it is saved as XML, but I have one more problem. Say my user wants a column name like "[MyColumnOne]". I do not want to strip the "[" and "]" characters, because the user defined them and wants to see them. The snippets that strip invalid characters also remove "[" and "]", but in this case I still need them to be saved. What can I do?
Never mind, I changed my regex to use the XML 1.1 pattern instead of the XML 1.0 one, and now it works:
string pattern = String.Empty;
//pattern = @"#x((10?|[2-F])FFF[EF]|FDD[0-9A-F]|7F|8[0-46-9A-F]9[0-9A-F])"; //XML 1.0
pattern = @"#x((10?|[2-F])FFF[EF]|FDD[0-9A-F]|[19][0-9A-F]|7F|8[0-46-9A-F]|0?[1-8BCEF])"; // XML 1.1
Regex regex = new Regex(pattern, RegexOptions.IgnoreCase);
if (regex.IsMatch(sString))
{
    sString = regex.Replace(sString, String.Empty);
    File.WriteAllText(sString, sString, Encoding.UTF8);
}
return sString;
This worked for me, and it was fast.
private object NormalizeString(object p) {
    object result = p;
    if (p is string || p is long) {
        string s = string.Format("{0}", p);
        string resultString = s.Trim();
        if (string.IsNullOrWhiteSpace(resultString)) return "";
        Regex rxInvalidChars = new Regex("[\r\n\t]+", RegexOptions.IgnoreCase);
        if (rxInvalidChars.IsMatch(resultString)) {
            resultString = rxInvalidChars.Replace(resultString, " ");
        }
        //string pattern = String.Empty;
        //pattern = @"";
        ////pattern = @"#x((10?|[2-F])FFF[EF]|FDD[0-9A-F]|7F|8[0-46-9A-F]9[0-9A-F])"; //XML 1.0
        ////pattern = @"#x((10?|[2-F])FFF[EF]|FDD[0-9A-F]|[19][0-9A-F]|7F|8[0-46-9A-F]|0?[1-8BCEF])"; // XML 1.1
        //Regex rxInvalidXMLChars = new Regex(pattern, RegexOptions.IgnoreCase);
        //if (rxInvalidXMLChars.IsMatch(resultString)) {
        //    resultString = rxInvalidXMLChars.Replace(resultString, "");
        //}
        result = string.Join("", resultString.Where(c => c >= ' '));
    }
    return result;
}
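Another option, sketched here only as an illustration, is to filter on what the framework itself treats as a legal XML character, using XmlConvert.IsXmlChar (available from .NET 4.0). Brackets are ordinary characters as far as that check is concerned, so "[" and "]" survive. XmlSanitizer and StripInvalidXmlChars are hypothetical names. One caveat: the check is applied per char, so surrogate halves of characters outside the Basic Multilingual Plane are dropped as well; handle those separately if your data contains them.

```csharp
using System;
using System.Linq;
using System.Xml;

public static class XmlSanitizer
{
    // Removes every char that is not a legal XML 1.0 character.
    // Characters such as '[' and ']' are legal and are kept.
    public static string StripInvalidXmlChars(string input)
    {
        if (string.IsNullOrEmpty(input)) return input;
        return new string(input.Where(c => XmlConvert.IsXmlChar(c)).ToArray());
    }

    public static void Main()
    {
        // \u0001 and \u000B are control characters that XML 1.0 does not allow.
        string raw = "[MyColumnOne]\u0001\u000B";
        Console.WriteLine(StripInvalidXmlChars(raw)); // prints "[MyColumnOne]"
    }
}
```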

How do I use the Aggregate function to take a list of strings and output a single string separated by a space?

Here is the source code for this test:
var tags = new List<string> { "Portland", "Code", "StackExchange" };
const string separator = " ";
tagString = tags.Aggregate(t => , separator);
Console.WriteLine(tagString);
// Expecting to see "Portland Code StackExchange"
Console.ReadKey();
Update
Here is the solution I am now using:
var tagString = string.Join(separator, tags.ToArray());
Turns out string.Join does what I need.
For that you can just use string.Join.
string result = tags.Aggregate((acc, s) => acc + separator + s);
or simply
string result = string.Join(separator, tags);
The String.Join method, maybe?
This is what I use:
public static string Join(this IEnumerable<string> strings, string seperator)
{
    return string.Join(seperator, strings.ToArray());
}
And then it looks like this:
tagString = tags.Join(" ");
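For comparison, here is a small self-contained sketch that runs both approaches side by side. The JoinDemo class and its method names are made up for illustration. One difference worth knowing: Aggregate without a seed throws InvalidOperationException on an empty list, while string.Join simply returns an empty string.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class JoinDemo
{
    // Folds the list into one string, inserting the separator between items.
    public static string WithAggregate(IEnumerable<string> items, string separator)
    {
        return items.Aggregate((acc, s) => acc + separator + s);
    }

    // Same result using the built-in helper.
    public static string WithJoin(IEnumerable<string> items, string separator)
    {
        return string.Join(separator, items);
    }

    public static void Main()
    {
        var tags = new List<string> { "Portland", "Code", "StackExchange" };

        Console.WriteLine(WithAggregate(tags, " ")); // prints "Portland Code StackExchange"
        Console.WriteLine(WithJoin(tags, " "));      // prints "Portland Code StackExchange"

        // Empty input: string.Join returns "", Aggregate without a seed would throw.
        Console.WriteLine(WithJoin(new List<string>(), " ").Length); // prints 0
    }
}
```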

String.Replace does not seem to replace brackets with empty string

The following bit of C# code does not seem to do anything:
String str = "{3}";
str.Replace("{", String.Empty);
str.Replace("}", String.Empty);
Console.WriteLine(str);
This ends up spitting out: {3}. I have no idea why this is. I do this sort of thing in Java all the time. Is there some nuance of .NET string handling that eludes me?
The String class is immutable; str.Replace will not alter str, it will return a new string with the result. Try this one instead:
String str = "{3}";
str = str.Replace("{", String.Empty);
str = str.Replace("}", String.Empty);
Console.WriteLine(str);
String is immutable; you can't change an instance of a string. Your two Replace() calls do nothing to the original string; they return a modified string. You want this instead:
String str = "{3}";
str = str.Replace("{", String.Empty);
str = str.Replace("}", String.Empty);
Console.WriteLine(str);
It works this way in Java as well.
Replace actually does not modify the string instance on which you call it. It just returns a modified copy instead.
Try this one:
String str = "{3}";
str = str.Replace("{", String.Empty);
str = str.Replace("}", String.Empty);
Console.WriteLine(str);
str.Replace returns a new string, so you need to use it as follows:
String str = "{3}";
str = str.Replace("{", String.Empty);
str = str.Replace("}", String.Empty);
The Replace function returns the modified string, so you have to assign it back to your str variable.
String str = "{3}";
str = str.Replace("{", String.Empty);
str = str.Replace("}", String.Empty);
Console.WriteLine(str);
You'll have to do:
String str = "{3}";
str = str.Replace("{", String.Empty);
str = str.Replace("}", String.Empty);
Console.WriteLine(str);
Look at the String.Replace reference:
Return Value
Type: System.String
A String equivalent to this instance but with all instances of oldValue replaced with newValue.
I believe that str.Replace returns a value which you must assign to your variable. So you will need to do something like:
String str = "{3}";
str = str.Replace("{", String.Empty);
str = str.Replace("}", String.Empty);
Console.WriteLine(str);
The Replace method returns a string with the replacement. What I think you're looking for is this:
str = str.Replace("{", string.Empty);
str = str.Replace("}", string.Empty);
Console.WriteLine(str);
Besides all of the suggestions so far - you could also accomplish this without changing the value of the original string by using the replace functions inline in the output...
String str = "{3}";
Console.WriteLine(str.Replace("{", String.Empty).Replace("}", String.Empty));
