Android indexoutofboundsexception - c#

helo everyone i working Regex. i have indexoutofboundsexception
this is a my source
public String ExtractYoutubeURLFromHTML(String html) throws UnsupportedEncodingException
{
Pattern pattern = Pattern.compile(this.extractionExpression);
Matcher matcher = pattern.matcher(html);
List<String> matches = new ArrayList<String>();
while(matcher.find()){
matches.add(matcher.group());
}
String vid0 = matches.get(0).toString();
vid0 = vid0.replace ("\\\\u0026", "&");
vid0 = vid0.replace ("\\\\\\", "");
vid0=URLDecoder.decode(vid0,"UTF-8");
return vid0;
}
i debuged and i have indexoutofboundsexception invalid index exception
error is in this part of code
List<String> matches = new ArrayList<String>();
while(matcher.find()){
matches.add(matcher.group());
}
String vid0 = matches.get(0).toString();
also i have woring code in C# and i want to rewrite C# code in java Code
this is a working code in C#
public string ExtractYoutubeURLFromHTML(string html)
{
Regex rx = new Regex (this.extractionExpression);
var video = rx.Matches (html);
var vid0 = video [0].ToString ();
vid0 = vid0.Replace ("\\\\u0026", "&");
vid0 = vid0.Replace ("\\\\\\", "");
vid0 = System.Net.WebUtility.UrlDecode (vid0);
return vid0;
}

If matches is empty then matches.get(0).toString(); will give an ArrayIndexOutOfBoundsException.
Better check like
if(matches.size()>0)
String vid0 = matches.get(0).toString();

Replace this one :
String vid0 = matches.get(0).toString();
by
if(matches.size() > 0){
String vid0 = matches.get(0).toString();
} else {
return "" // or you can also return null;
}

Related

How can I pick out a number from a string in C#

I have this string:
http://www.edrdg.org/jmdictdb/cgi-bin/edform.py?svc=jmdict&sid=&q=1007040&a=2
How can I pick out the number between "q=" and "&amp" as an integer?
So in this case I want to get the number: 1007040
What you're actually doing is parsing a URI - so you can use the .Net library to do this properly as follows:
var str = "http://www.edrdg.org/jmdictdb/cgi-bin/edform.py?svc=jmdict&sid=&q=1007040&a=2";
var uri = new Uri(str);
var query = uri.Query;
var dict = System.Web.HttpUtility.ParseQueryString(query);
Console.WriteLine(dict["amp;q"]); // Outputs 1007040
If you want the numeric string as an integer then you'd need to parse it:
int number = int.Parse(dict["amp;q"]);
Consider using regular expressions
String str = "http://www.edrdg.org/jmdictdb/cgi-bin/edform.py?svc=jmdict&sid=&q=1007040&a=2";
Match match = Regex.Match(str, #"q=\d+&amp");
if (match.Success)
{
string resultStr = match.Value.Replace("q=", String.Empty).Replace("&amp", String.Empty);
int.TryParse(resultStr, out int result); // result = 1007040
}
Seems like you want a query parameter for a uri that's html encoded. You could do:
Uri uri = new Uri(HttpUtility.HtmlDecode("http://www.edrdg.org/jmdictdb/cgi-bin/edform.py?svc=jmdict&sid=&q=1007040&a=2"));
string q = HttpUtility.ParseQueryString(uri.Query).Get("q");
int qint = int.Parse(q);
A regex approach using groups:
public int GetInt(string str)
{
var match = Regex.Match(str,#"q=(\d*)&amp");
return int.Parse(match.Groups[1].Value);
}
Absolutely no error checking in that!

C# remove empty url parameters regex

I am trying to remove empty url type parameters from a string using C#. My code sample is here.
public static string test ()
{
string parameters = "one=aa&two=&three=aaa&four=";
string pattern = "&[a-zA-Z][a-zA-Z]*=&";
string replacement = "";
Regex rgx = new Regex(pattern);
string result = rgx.Replace(parameters, replacement);
return parameters;
}
public static void Main(string[] args)
{
Console.WriteLine(test());
}
I tried the code in rextester
output: one=aa&two=&three=aaa&four=
expected output: one=aa&three=aaa
You absolutely do not need to roll your own Regex for this, try using HttpUtility.ParseQueryString():
public static string RemoveEmptyUrlParameters(string input)
{
var results = HttpUtility.ParseQueryString(input);
Dictionary<string, string> nonEmpty = new Dictionary<string, string>();
foreach(var k in results.AllKeys)
{
if(!string.IsNullOrWhiteSpace(results[k]))
{
nonEmpty.Add(k, results[k]);
}
}
return string.Join("&", nonEmpty.Select(kvp => $"{kvp.Key}={kvp.Value}"));
}
Fiddle here
Regex:
(?:^|&)[a-zA-Z]+=(?=&|$)
This matches start of string or an ampersand ((?:^|&)) followed by at least one (english) letter ([a-zA-Z]+), an equal sign (=) and then nothing, made sure by the positive look-ahead ((?=&|$)) which matches end of string or a new parameter (started by &).
Code:
public static string test ()
{
string parameters = "one=aa&two=&three=aaa&four=";
string pattern = "(?:^|&)[a-zA-Z]+=(?=&|$)";
string replacement = "";
Regex rgx = new Regex(pattern);
string result = rgx.Replace(parameters, replacement);
return result;
}
public static void Main(string[] args)
{
Console.WriteLine(test());
}
Note that this also returns the correct variable (as pointed out by Joel Anderson)
See it live here at ideone.
The results of the Regex replace is not returned by the function. The function returns the variable "parameters", which is never updated or changed.
string parameters = "one=aa&two=&three=aaa&four=";
...
string result = rgx.Replace(parameters, replacement);
return parameters;
....
Perhaps you meant
return results;

Can't get Regex group value

The string I'm trying to parse from looks something like this:
{"F4q6i9xe":{"Hhgi79M1":"cTZ3W2JG","j0Uszek2":"0"},"a3vSYuq2":{"Kn51uR4Y":"c6QUklQzv+kRLnmvS0zsheqDsRCXbWFVpYhq2rsElKyFQnla01E6P3qQ26b+xWSrscFJCi2qsh6WjJKSW5FN9EwxRAWc3rfyToaEhRngI2WPu3W/b1/hkkS2tEEk7LEpS2ItxLYKjEQGneO5E9rcGzbtfSOikdIjhxpD5m9HsKayo2ZSc2EYd/cN9yfYrNfLOCx+xeGqGcPmmImAzOZM0Q1IXIDQZ5r70vUS1aUOMOFnN1fVxmQ8ISofEKNEpHFfwWW9QUl9eVDgpQ2HES13s20kvVH2FOlE5uahIJBnyTLWYzViAWYyX13VK2PgrQcZ0oLuV84bSbjHGZf+JAjvImuyYhkKDhtTWAGkQ8LZ6r07duo71Gbz3glOUZZyz+FFiH5KrwafBpzGhRlI8El6lLZASN9Z5iZNTKs+sOti1fzsuOBzUXEjFURJGa93GMqPC+OyTw6YxKvgapz7go1XL7EAf50UXpMHd9xAJzsuIb6/lB/6v+XjD70wc74D2OosxtR6DIbQj3gbaBUBABQsJHQZLNnHWqikh+HLCASVbkqw6YPm5NecGaOQxonDkh3ZVQSF15WCEsgNdoWG94OuLF8DSw1t1Kt8sMFkvBJgB7LJS3pAw/Tcg/rvR5qwZ/n31rtOzaJgCdVc6XeIK6Rttj4KvNtgtTwkJIjb97FgIS0CXrR0UCp3BsuLwQg3CRTnjQqsjGgJeinLXX2lq6vNOkxRzgXlWnap96kheYj7sIE/IxbJJWEIInMbH9HU/w0bS0MtIgofpIQAZ/m6XKhu9LpPZBgsBf9KX5chUPeuRfs3MfHFzPYPlCCd2dkE8BHKOTmfS3okqhkzIZvFN3Ni+7enktFdgfpQ0toQbjhpn/TszP34pBaIy77me+GvePNO5bAFECLWSpGvsmW16rLCP0H+xIS/lNf9hK98jQqGU7eqpQdai7WFie2yN75Up7MrgBMp5w9Z5C7qpG2iYGiqynpqTCEnfQ3IZumbL+YvGTiuI2c320MGjKzOdO45MIU5fxUNZQSIfCSyIq5G/XGIeXCG3KETyKDZygXUTgEWbJWNTADE0AhXvP7HMtsuvstyuHvlTZGcfKS4oLnDPFiV1ndIV7+W74Ytv9bAdDIVl36xTzA2PV6waqXBfSPUCTx3jVrAeXcHGFjZtxbk3pFmuqPqgVcxeX/aDbK0NHkR6phQcFEREBjfdCOLAAbCkWiRF0JAA=="}}
And this is my regular expression:
Hhgi79M1":"(?<encodeKeyID>.*?)",.*Kn51uR4Y":"(?<encodedBody>.*?)"}
And this is the code I'm using in my C# application:
string responsePattern = "Hhgi79M1\":\"(?<encodeKeyID>.*?)\",.*Kn51uR4Y\":\"(?<encodedBody>.*?)\"}";
if (Regex.IsMatch(body, responsePattern))
{
var match = Regex.Match(body, responsePattern);
string encodeKeyID = match.Groups["encodeKeyID"].Value;
string encodedBody = match.Groups["encodedBody"].Value;
Now it works, but it doesn't get the value of "encodedBody". I tested my expression with the data on https://regex101.com/ and it seems to work fine on there. However, when getting the value in my program it's just an empty string.
I sense the issue is that your body string is not escaped properly, as your pattern works fine in the following code:
string body = "{\"F4q6i9xe\":{\"Hhgi79M1\":\"cTZ3W2JG\",\"j0Uszek2\":\"0\"},\"a3vSYuq2\":{\"Kn51uR4Y\":\"c6QUklQzv+kRLnmvS0zsheqDsRCXbWFVpYhq2rsElKyFQnla01E6P3qQ26b+xWSrscFJCi2qsh6WjJKSW5FN9EwxRAWc3rfyToaEhRngI2WPu3W/b1/hkkS2tEEk7LEpS2ItxLYKjEQGneO5E9rcGzbtfSOikdIjhxpD5m9HsKayo2ZSc2EYd/cN9yfYrNfLOCx+xeGqGcPmmImAzOZM0Q1IXIDQZ5r70vUS1aUOMOFnN1fVxmQ8ISofEKNEpHFfwWW9QUl9eVDgpQ2HES13s20kvVH2FOlE5uahIJBnyTLWYzViAWYyX13VK2PgrQcZ0oLuV84bSbjHGZf+JAjvImuyYhkKDhtTWAGkQ8LZ6r07duo71Gbz3glOUZZyz+FFiH5KrwafBpzGhRlI8El6lLZASN9Z5iZNTKs+sOti1fzsuOBzUXEjFURJGa93GMqPC+OyTw6YxKvgapz7go1XL7EAf50UXpMHd9xAJzsuIb6/lB/6v+XjD70wc74D2OosxtR6DIbQj3gbaBUBABQsJHQZLNnHWqikh+HLCASVbkqw6YPm5NecGaOQxonDkh3ZVQSF15WCEsgNdoWG94OuLF8DSw1t1Kt8sMFkvBJgB7LJS3pAw/Tcg/rvR5qwZ/n31rtOzaJgCdVc6XeIK6Rttj4KvNtgtTwkJIjb97FgIS0CXrR0UCp3BsuLwQg3CRTnjQqsjGgJeinLXX2lq6vNOkxRzgXlWnap96kheYj7sIE/IxbJJWEIInMbH9HU/w0bS0MtIgofpIQAZ/m6XKhu9LpPZBgsBf9KX5chUPeuRfs3MfHFzPYPlCCd2dkE8BHKOTmfS3okqhkzIZvFN3Ni+7enktFdgfpQ0toQbjhpn/TszP34pBaIy77me+GvePNO5bAFECLWSpGvsmW16rLCP0H+xIS/lNf9hK98jQqGU7eqpQdai7WFie2yN75Up7MrgBMp5w9Z5C7qpG2iYGiqynpqTCEnfQ3IZumbL+YvGTiuI2c320MGjKzOdO45MIU5fxUNZQSIfCSyIq5G/XGIeXCG3KETyKDZygXUTgEWbJWNTADE0AhXvP7HMtsuvstyuHvlTZGcfKS4oLnDPFiV1ndIV7+W74Ytv9bAdDIVl36xTzA2PV6waqXBfSPUCTx3jVrAeXcHGFjZtxbk3pFmuqPqgVcxeX/aDbK0NHkR6phQcFEREBjfdCOLAAbCkWiRF0JAA==\"}}\n" +
"";
string responsePattern = "Hhgi79M1\":\"(?<encodeKeyID>.*?)\",.*Kn51uR4Y\":\"(?<encodedBody>.*?)\"}";
if (Regex.IsMatch(body, responsePattern))
{
var match = Regex.Match(body, responsePattern);
string encodeKeyID = match.Groups["encodeKeyID"].Value;
string encodedBody = match.Groups["encodedBody"].Value;
string msg = String.Format("encodeKeyID: {0}\nencodedBody: {1}", encodeKeyID, encodedBody);
//show in message box
MessageBox.Show(msg, "Pattern Match Result");
}
Output:
Try it this way.
string sTarget = #"
{""F4q6i9xe"":{""Hhgi79M1"":""cTZ3W2JG"",""j0Uszek2"":""0""},""a3vSYuq2"":{""Kn51uR4Y"":""c6QUklQzv+kRLnmvS0zsheqDsRCXbWFVpYhq2rsElKyFQnla01E6P3qQ26b+xWSrscFJCi2qsh6WjJKSW5FN9EwxRAWc3rfyToaEhRngI2WPu3W\/b1\/hkkS2tEEk7LEpS2ItxLYKjEQGneO5E9rcGzbtfSOikdIjhxpD5m9HsKayo2ZSc2EYd\/cN9yfYrNfLOCx+xeGqGcPmmImAzOZM0Q1IXIDQZ5r70vUS1aUOMOFnN1fVxmQ8ISofEKNEpHFfwWW9QUl9eVDgpQ2HES13s20kvVH2FOlE5uahIJBnyTLWYzViAWYyX13VK2PgrQcZ0oLuV84bSbjHGZf+JAjvImuyYhkKDhtTWAGkQ8LZ6r07duo71Gbz3glOUZZyz+FFiH5KrwafBpzGhRlI8El6lLZASN9Z5iZNTKs+sOti1fzsuOBzUXEjFURJGa93GMqPC+OyTw6YxKvgapz7go1XL7EAf50UXpMHd9xAJzsuIb6\/lB\/6v+XjD70wc74D2OosxtR6DIbQj3gbaBUBABQsJHQZLNnHWqikh+HLCASVbkqw6YPm5NecGaOQxonDkh3ZVQSF15WCEsgNdoWG94OuLF8DSw1t1Kt8sMFkvBJgB7LJS3pAw\/Tcg\/rvR5qwZ\/n31rtOzaJgCdVc6XeIK6Rttj4KvNtgtTwkJIjb97FgIS0CXrR0UCp3BsuLwQg3CRTnjQqsjGgJeinLXX2lq6vNOkxRzgXlWnap96kheYj7sIE\/IxbJJWEIInMbH9HU\/w0bS0MtIgofpIQAZ\/m6XKhu9LpPZBgsBf9KX5chUPeuRfs3MfHFzPYPlCCd2dkE8BHKOTmfS3okqhkzIZvFN3Ni+7enktFdgfpQ0toQbjhpn\/TszP34pBaIy77me+GvePNO5bAFECLWSpGvsmW16rLCP0H+xIS\/lNf9hK98jQqGU7eqpQdai7WFie2yN75Up7MrgBMp5w9Z5C7qpG2iYGiqynpqTCEnfQ3IZumbL+YvGTiuI2c320MGjKzOdO45MIU5fxUNZQSIfCSyIq5G\/XGIeXCG3KETyKDZygXUTgEWbJWNTADE0AhXvP7HMtsuvstyuHvlTZGcfKS4oLnDPFiV1ndIV7+W74Ytv9bAdDIVl36xTzA2PV6waqXBfSPUCTx3jVrAeXcHGFjZtxbk3pFmuqPqgVcxeX\/aDbK0NHkR6phQcFEREBjfdCOLAAbCkWiRF0JAA==""}}
";
Regex responseRx = new Regex(#"Hhgi79M1"":""(?<encodeKeyID>.*?)"",.*Kn51uR4Y"":""(?<encodedBody>.*?)""}");
Match responseMatch = responseRx.Match(sTarget);
if (responseMatch.Success)
{
Console.WriteLine("ID = {0}", responseMatch.Groups["encodeKeyID"].Value);
Console.WriteLine("Body = {0}", responseMatch.Groups["encodedBody"].Value);
}
Output
ID = cTZ3W2JG
Body = c6QUklQzv+kRLnmvS0zsheqDsRCXbWFVpYhq2rsElKyFQnla01E6P3qQ26b+xWSrscFJCi2qs
h6WjJKSW5FN9EwxRAWc3rfyToaEhRngI2WPu3W\/b1\/hkkS2tEEk7LEpS2ItxLYKjEQGneO5E9rcGzb
tfSOikdIjhxpD5m9HsKayo2ZSc2EYd\/cN9yfYrNfLOCx+xeGqGcPmmImAzOZM0Q1IXIDQZ5r70vUS1a
UOMOFnN1fVxmQ8ISofEKNEpHFfwWW9QUl9eVDgpQ2HES13s20kvVH2FOlE5uahIJBnyTLWYzViAWYyX1
3VK2PgrQcZ0oLuV84bSbjHGZf+JAjvImuyYhkKDhtTWAGkQ8LZ6r07duo71Gbz3glOUZZyz+FFiH5Krw
afBpzGhRlI8El6lLZASN9Z5iZNTKs+sOti1fzsuOBzUXEjFURJGa93GMqPC+OyTw6YxKvgapz7go1XL7
EAf50UXpMHd9xAJzsuIb6\/lB\/6v+XjD70wc74D2OosxtR6DIbQj3gbaBUBABQsJHQZLNnHWqikh+HL
CASVbkqw6YPm5NecGaOQxonDkh3ZVQSF15WCEsgNdoWG94OuLF8DSw1t1Kt8sMFkvBJgB7LJS3pAw\/T
cg\/rvR5qwZ\/n31rtOzaJgCdVc6XeIK6Rttj4KvNtgtTwkJIjb97FgIS0CXrR0UCp3BsuLwQg3CRTnj
QqsjGgJeinLXX2lq6vNOkxRzgXlWnap96kheYj7sIE\/IxbJJWEIInMbH9HU\/w0bS0MtIgofpIQAZ\/
m6XKhu9LpPZBgsBf9KX5chUPeuRfs3MfHFzPYPlCCd2dkE8BHKOTmfS3okqhkzIZvFN3Ni+7enktFdgf
pQ0toQbjhpn\/TszP34pBaIy77me+GvePNO5bAFECLWSpGvsmW16rLCP0H+xIS\/lNf9hK98jQqGU7eq
pQdai7WFie2yN75Up7MrgBMp5w9Z5C7qpG2iYGiqynpqTCEnfQ3IZumbL+YvGTiuI2c320MGjKzOdO45
MIU5fxUNZQSIfCSyIq5G\/XGIeXCG3KETyKDZygXUTgEWbJWNTADE0AhXvP7HMtsuvstyuHvlTZGcfKS
4oLnDPFiV1ndIV7+W74Ytv9bAdDIVl36xTzA2PV6waqXBfSPUCTx3jVrAeXcHGFjZtxbk3pFmuqPqgVc
xeX\/aDbK0NHkR6phQcFEREBjfdCOLAAbCkWiRF0JAA==

Working with Regex to get 2 strings out of a source code [duplicate]

I am using webrequest to download a source from a page and then I need to use Regex to grab the string and store it in a string:
U_nQgAjU_tdUnfcA7lT5opoTLyLdslWDTpiNzcdkLoHlobS_HbujMw..
also need:
bpvsid=nvnN2JFJqJc.&dcz=1
Both out of:
<td style="cursor:pointer;" class="" onclick="NewWindow('U_nQgAjU_tdUnfcA7lT5opoTLyLdslWDTpiNzcdkLoHlobS_HbujMw..', 'bpvsid=nvnN2JFJqJc.&dcz=1', 'bpvstage_edit', '1200', '800')" onmouseout="HideHover();"><img src="gfx/info.gif" alt="" tipwidth="450" ajaxtip="openajax.php?target=modules/bpv/bpvstage_hover_info.php&rid=&oid=&bpvsid=&bpvname=" /></td>
It keep giving me errors like not enough )'s?
Thanks in advance.
Current code, probably wrong in every way. Really new to this:
Regex rx = new Regex("(?<=class=\"\" onclick=\"NewWindow(').*(?=')");
longId = (rx.Match(textBox2.Text).Value);
textBox1.Text = longId;
var match = Regex.Match(s, #"onclick=""NewWindow\('([^']*)',\s*'([^']*)',.*");
if (match.Success)
{
string longId = match.Groups[1].Value;
string other = match.Groups[2].Value;
}
That will give you two groups with values:
U_nQgAjU_tdUnfcA7lT5opoTLyLdslWDTpiNzcdkLoHlobS_HbujMw..
bpvsid=nvnN2JFJqJc.&dcz=1
The regex NewWindow\('([^']*)', '([^']*) will match what you require. The two strings required will be in Groups[1] and Groups[2].
var match = Regex.Match(textBox2.Text, "NewWindow\('([^']*)', '([^']*)");
var id1 = match.Groups[1].Value;
var id2 = match.Groups[2].Value;
Note that you could also use simply string functions instead of a regex:
var s = "<td style=\"cursor:pointer;\" class=\"\" onclick=\"NewWindow('U_nQgAjU_tdUnfcA7lT5opoTLyLdslWDTpiNzcdkLoHlobS_HbujMw..', 'bpvsid=nvnN2JFJqJc.&dcz=1', 'bpvstage_edit', '1200', '800')\" onmouseout=\"HideHover();\"><img src=\"gfx/info.gif\" alt=\"\" tipwidth=\"450\" ajaxtip=\"openajax.php?target=modules/bpv/bpvstage_hover_info.php&rid=&oid=&bpvsid=&bpvname=\" /></td>";
var tmp = s.Substring(s.IndexOf("NewWindow('")).Split('\'');
var value1 = tmp[1]; // U_nQgAjU_tdUnfcA7lT5opoTLyLdslWDTpiNzcdkLoHlobS_HbujMw..
var value2 = tmp[3]; // bpvsid=nvnN2JFJqJc.&dcz=1
I would use HtmlAgilityPack to parse HTML, then this non-regex approach works:
string html = // get your html ...
var doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html); // doc.Load can also consume a response-stream directly
var result = Enumerable.Empty<string>();
var firstTD = doc.DocumentNode.SelectNodes("//td").FirstOrDefault();
if (firstTD != null)
{
if (firstTD.Attributes.Contains("onclick"))
{
string onclick = firstTD.Attributes["onclick"].Value;
int newWindowIndex = onclick.IndexOf("newWindow(", StringComparison.OrdinalIgnoreCase);
if (newWindowIndex >= 0)
{
string functionBody = onclick.Substring(newWindowIndex + "newWindow(".Length);
string[] tokens = functionBody.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries);
result = tokens.Take(2).Select(s => s.Trim(' ', '\''));
}
}
}

Remove HTML from string

I am trying to clear the HTML coding from my RSS feed. I can not work out how to set the below to take out the HTML encoding.
var rssFeed = XElement.Parse(e.Result);
var currentFeed = this.DataContext as app.ViewModels.FeedViewModel;
var items = from item in rssFeed.Descendants("item")
select new ATP_Tennis_App.ViewModels.FeedItemViewModel()
{
Title = item.Element("title").Value,
DatePublished = DateTime.Parse(item.Element("pubDate").Value),
Url = item.Element("link").Value,
Description = item.Element("description").Value
};
foreach (var item in items)
currentFeed.Items.Add(item);
Just use the following code:
var withHtml = "<p>hello <b>there</b></p>";
var withoutHtml = Regex.Replace(withHtml, "<.+?>", string.Empty);
This will clean the html leaving only the text, so "hello there"
So, you can just copy and use this function:
string RemoveHtmlTags(string html) {
return Regex.Replace(html, "<.+?>", string.Empty);
}
Your code will look something like this:
var rssFeed = XElement.Parse(e.Result);
var currentFeed = this.DataContext as app.ViewModels.FeedViewModel;
var items = from item in rssFeed.Descendants("item")
select new ATP_Tennis_App.ViewModels.FeedItemViewModel()
{
Title = RemoveHtmlTags(item.Element("title").Value),
DatePublished = DateTime.Parse(item.Element("pubDate").Value),
Url = item.Element("link").Value,
Description = RemoveHtml(item.Element("description").Value)
};
You can use this code sample, it works fine on my side
public static string RemoveHTMLTags(string value)
{
string step1 = Regex.Replace(value, "<[^>]*>", " ");
string step2 = HttpUtility.HtmlDecode(step1);
return step2;
}
I hope, this code helps you.
Use the following class utility:
HttpUtility.HtmlDecode(string);
Please don't refer this answer no more.

Categories