Edit Hyperlink in a PDF Using iTextSharp [duplicate] - c#

Suppose I have the following string:
string str = "<tag>text</tag>";
And I would like to change 'tag' to 'newTag' so the result would be:
"<newTag>text</newTag>"
What is the best way to do it?
I tried to search for <[/]*tag> but then I don't know how to keep the optional [/] in my result...

Why use regex when you can do:
string newstr = str.Replace("tag", "newtag");
or
string newstr = str.Replace("<tag>","<newtag>").Replace("</tag>","</newtag>");
Edited in response to @RaYell's comment.

To make the slash optional, simply add a "?" after the "/", like this:
</?tag>

string str = "<tag>text</tag>";
string newValue = new XElement("newTag", XElement.Parse(str).Value).ToString();

Your most basic regex could read something like:
// find '<', find an optional '/', take all chars until the next '>' and call it
// tagname, then take '>'.
<(/?)(?<tagname>[^>]*)>
If you need to match every tag.
Or use positive lookahead like:
<(/?)(?=(tag|othertag))(?<tagname>[^>]*)>
if you only want tag and othertag tags.
Then iterate through all the matches:
string str = "<tag>hoi</tag><tag>second</tag><sometag>otherone</sometag>";
Regex matchTag = new Regex("<(/?)(?<tagname>[^>]*)>");
foreach (Match m in matchTag.Matches(str))
{
string tagname = m.Groups["tagname"].Value;
str = str.Replace(m.Value, m.Value.Replace(tagname, "new" + tagname));
}

var input = "<tag>text</tag>";
var result = Regex.Replace(input, "(</?).*?(>)", "$1newtag$2");
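The answers above either replace every occurrence of "tag" (including inside the text) or rewrite every tag regardless of its name. As a minimal sketch, assuming only bare tags without attributes need renaming, the optional slash can be captured and re-emitted. The class and method names here (TagRenamer, Rename) are hypothetical and used only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagRenamer
{
    // Renames <oldName> and </oldName> to <newName> and </newName>.
    // Assumption: tags carry no attributes; anything else is left untouched.
    public static string Rename(string input, string oldName, string newName)
    {
        // Group 1 captures the optional slash so it can be written back.
        string pattern = "<(/?)" + Regex.Escape(oldName) + ">";
        // ${1} avoids ambiguity if newName starts with a digit;
        // "$$" keeps a literal '$' in newName from being read as a substitution.
        string replacement = "<${1}" + newName.Replace("$", "$$") + ">";
        return Regex.Replace(input, pattern, replacement);
    }

    public static void Main()
    {
        // Prints: <newTag>text with tag</newTag>
        Console.WriteLine(Rename("<tag>text with tag</tag>", "tag", "newTag"));
    }
}
```

Note that the word "tag" inside the element text is left alone, which a plain str.Replace("tag", "newtag") would not do.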

Related

Regex Ignore first and last terminator

I have a string in my text that uses | as a delimiter.
Example:
|2P|1|U|F8|
I want the result to be 2P|1|U|F8. How can I do that?
The regex is very easy, but why not just use Trim():
var str = "|2P|1|U|F8|";
str = str.Trim(new[] {'|'});
or just without new[] {...}:
str = str.Trim('|');
Output:
2P|1|U|F8
In case there are leading/trailing whitespaces, you can use chained Trims:
var str = "\r\n |2P|1|U|F8| \r\n";
str = str.Trim().Trim('|');
Output will be the same.
You can use String.Substring:
string str = "|2P|1|U|F8|";
string newStr = str.Substring(1, str.Length - 2);
Just remove the leading and the trailing delimiter. Use the regex below and then replace each match with an empty string.
@"^\||\|$"
Regex rgx = new Regex(@"^\||\|$");
string result = rgx.Replace(input, "");
Use the multiline modifier (?m) when you're dealing with multiple lines.
Regex rgx = new Regex(@"(?m)^\||\|$");
Since | is a special character in regex, you need to escape it in order to match a literal | symbol.
string input = "|2P|1|U|F8|";
foreach (string item in input.Split("|".ToCharArray(), StringSplitOptions.RemoveEmptyEntries))
{
Console.WriteLine(item);
}
Result is:
2P
1
U
F8
^\||\|$
You can try this. Replace each match with an empty string. Use a verbatim string for the pattern. See the demo:
https://regex101.com/r/oF9hR9/14
For completeness' sake, you can also use Mid (from the Microsoft.VisualBasic namespace):
string s = "|2P|1|U|F8|";
Strings.Mid(s, 2, s.Length - 2)
This takes the part from the second character to the second-to-last one and produces the correct output.
I'm assuming that at some point you will want to parse the string to extract its '|' separated components, so here goes another alternative that goes in that direction:
string.Join("|", theString.Split(new[] {'|'}, StringSplitOptions.RemoveEmptyEntries))
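The approaches above agree on the sample input but differ on edge cases. The following sketch compares Trim with the regex; the helper name StripOnce is hypothetical. Substring(1, Length - 2) is left out because it throws on strings shorter than two characters and removes the first and last character whether or not they are delimiters.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DelimiterDemo
{
    // Removes at most one leading and one trailing '|'.
    public static string StripOnce(string s)
    {
        return Regex.Replace(s, @"^\||\|$", "");
    }

    public static void Main()
    {
        Console.WriteLine("|2P|1|U|F8|".Trim('|'));   // 2P|1|U|F8
        Console.WriteLine(StripOnce("|2P|1|U|F8|"));  // 2P|1|U|F8

        // Trim removes every leading and trailing '|';
        // the regex removes only one per side.
        Console.WriteLine("||2P||".Trim('|'));        // 2P
        Console.WriteLine(StripOnce("||2P||"));       // |2P|
    }
}
```

Which behavior is right depends on whether doubled delimiters at the ends represent empty fields that should be preserved.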

C# extract string using regular expression

I have an HTML string that I'm parsing, which looks like the one below. I need to get the value of #Footer.
strHTML = "<html><html>\r\n\r\n<head>\r\n<meta http-equiv=Content-Type
content=\"text/html; charset=windows-1252\">\r\n
<meta name=Generator content=\"Microsoft Word 14></head></head><body>
<p>#Footer=CONFIDENTIAL<p></body></html>"
I have tried the below code, how do i get the value?
Regex m = new Regex("#Footer", RegexOptions.Compiled);
foreach (Match VariableMatch in m.Matches(strHTML.ToString()))
{
Console.WriteLine(VariableMatch);
}
You need to capture the value after the =. This will work, as long as the value cannot contain any < characters:
Regex m = new Regex("#Footer=([^<]+)", RegexOptions.Compiled);
foreach (Match VariableMatch in m.Matches(strHTML.ToString()))
{
Console.WriteLine(VariableMatch.Groups[1].Value);
}
You can do this with regex, but it's not necessary. One simple way to do this would be:
var match = strHTML.Split(new string[] { "#Footer=" }, StringSplitOptions.None).Last();
match = match.Substring(0, match.IndexOf("<"));
This assumes that your html string only has one #Footer.
Your regex will match the string "#Footer". The value of the match will be "#Footer".
Your regex should look like this instead:
Regex regex = new Regex(@"#Footer=[\w]+");
Match match = regex.Match(strHTML);
string value = match.Value.Split('=')[1];
Use a matching group.
Regex.Match(strHTML, @"#Footer=(?<VAL>([^<\n\r]+))").Groups["VAL"].Value;
If that's all your string, we can use string methods to solve it without touching regex stuff:
var result = strHTML.Split(new string[] { "#Footer=", "<p>" }, StringSplitOptions.RemoveEmptyEntries)[1];
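None of the snippets above checks whether the marker was found at all. A minimal sketch of the capture-group approach with that check added might look like this; FooterReader and GetFooter are hypothetical names, and the sample HTML is shortened from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FooterReader
{
    // Returns the text after "#Footer=" up to the next '<' or line break,
    // or null when the marker is absent.
    public static string GetFooter(string html)
    {
        Match match = Regex.Match(html, @"#Footer=([^<\r\n]+)");
        return match.Success ? match.Groups[1].Value : null;
    }

    public static void Main()
    {
        string html = "<body><p>#Footer=CONFIDENTIAL<p></body>";
        Console.WriteLine(GetFooter(html));                   // CONFIDENTIAL
        Console.WriteLine(GetFooter("<p>none</p>") == null);  // True
    }
}
```

As with the answers above, this assumes the value never contains a '<' character.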

How to remove the exact occurrence of characters from a string?

For Example, I have a string like :
string str = "santhosh,phani,ravi,phani123,praveen,sathish,prakash";
I want to delete the characters ,phani from str.
Now, I am using str = str.Replace(",phani", string.Empty);
then my output is : str="santhosh,ravi123,praveen,sathish,prakash";
But I want a output like : str="santhosh,ravi,phani123,praveen,sathish,prakash";
string str = "santhosh,phani,ravi,phani123,praveen,sathish,prakash";
var words = str.Split(',');
str = String.Join(",", words.Where(word => word != "phani"));
The better choice is to use Split and Join.
Easy in Linq :
String str = "santhosh,phani,ravi,phani123,praveen,sathish,prakash";
String token = "phani";
String result = String.Join(",", str.Split(',').Where(s => s != token));
(edit : I take time for testing and i'm not first ^^)
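var names = str.Split(',').ToList();
names.RemoveAll(name => name == "phani"); // List.Remove/RemoveAll do not return the list, so they cannot be chained
str = String.Join(",", names);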
Removes any given name from the list.
How about
str = str.Replace(",phani,", ",");
This, however, does not work if "phani" is the last item in the string. To get around this, you could do this:
string source = "...";
source += ","; // Explicitly add a comma to the end
source = source.Replace(",phani,", ",").TrimEnd(',');
This adds a comma, replaces "phani" and removes the trailing comma.
A third solution would be this:
var items = str.Split(',').ToList();
items.Remove("phani"); // removes the first exact match; returns bool, so it cannot be chained
str = String.Join(",", items);
Try including the surrounding commas in the search string instead:
string str = "santhosh,ravi,phani,phani123,praveen,sathish,prakash";
str = str.Replace(",phani,", ",");
Console.WriteLine(str);
Output will be:
santhosh,ravi,phani123,praveen,sathish,prakash
Here is a DEMO.
As Davin mentioned in comment, this won't work if phani is last item in the string. Silvermind's answer looks like the right answer.
string str = "santhosh,phani,ravi,phani123,praveen,sathish,prakash";
string pattern = @"\b,phani,\b";
string replace = ",";
Console.WriteLine(Regex.Replace(str, pattern, replace));
Output:
santhosh,ravi,phani123,praveen,sathish,prakash
You may use the regular expression, but you have to take care of cases when your string starts or ends with the substring:
var pattern = @",?\bphani\b,?";
var regex = new Regex(pattern);
var result = regex.Replace(input, ",").Trim(',');
Shorter notation could look like this:
var result = Regex.Replace(input, @",?\bphani\b,?", ",").Trim(',');
Explanation of the regular expression: ,?\bphani\b,? matches the word phani, but only if it is preceded and followed by word-delimiter characters (because of the word boundary metacharacter \b), and it can be (but doesn't have to be) preceded and followed by a comma thanks to ,? which means zero or one comma.
At the end we need to remove possible commas from the beginning and end of the string, that's why there's Trim(',') on the result.
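The Split/Where/Join idea from the earlier answers can be wrapped in a small helper that handles the token at the start, in the middle, and at the end of the list without any special cases. This is only a sketch; CsvTokens and RemoveToken are hypothetical names.

```csharp
using System;
using System.Linq;

public static class CsvTokens
{
    // Removes every item that equals token exactly; items that merely
    // start with token (such as "phani123") are kept.
    public static string RemoveToken(string list, string token)
    {
        return string.Join(",", list.Split(',').Where(item => item != token));
    }

    public static void Main()
    {
        string str = "santhosh,phani,ravi,phani123,praveen,sathish,prakash";
        // Prints: santhosh,ravi,phani123,praveen,sathish,prakash
        Console.WriteLine(RemoveToken(str, "phani"));
        // Prints: a,phani123
        Console.WriteLine(RemoveToken("phani,a,phani123,phani", "phani"));
    }
}
```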

Replace any string between quotes

Problem:
Cannot find a consistent way to replace a random string between quotes with a specific string I want. Any help would be greatly appreciated.
Example:
String str1 = "test=\"-1\"";
should become
String str2 = "test=\"31\"";
but also work for
String str3 = "test=\"foobar\"";
basically I want to turn this
String str4 = "test=\"antyhingCanGoHere\"";
into this
String str4 = "test=\"31\"";
Have tried:
Case insensitive Regex without using RegexOptions enumeration
How do you do case-insensitive string replacement using regular expressions?
Replace any character in between AnyText: and <usernameredacted#example.com> with an empty string using Regex?
Replace string in between occurrences
Replace a String between two Strings
Current code:
Regex RemoveName = new Regex("(?VARIABLE=\").*(?=\")", RegexOptions.IgnoreCase);
String convertSeccons = RemoveName.Replace(ruleFixed, "31");
Returns error:
System.ArgumentException was caught
Message=parsing "(?VARIABLE=").*(?=")" - Unrecognized grouping construct.
Source=System
StackTrace:
at System.Text.RegularExpressions.RegexParser.ScanGroupOpen()
at System.Text.RegularExpressions.RegexParser.ScanRegex()
at System.Text.RegularExpressions.RegexParser.Parse(String re, RegexOptions op)
at System.Text.RegularExpressions.Regex..ctor(String pattern, RegexOptions options, Boolean useCache)
at System.Text.RegularExpressions.Regex..ctor(String pattern, RegexOptions options)
at application.application.insertGroupID(String rule) in C:\Users\winserv8\Documents\Visual Studio 2010\Projects\application\application\MainFormLauncher.cs:line 298
at application.application.xmlqueryDB(String xmlSaveLocation, TextWriter tw, String ruleName) in C:\Users\winserv8\Documents\Visual Studio 2010\Projects\application\application\MainFormLauncher.cs:line 250
InnerException:
Found the answer:
string s = Regex.Replace(ruleFixed, "VARIABLE=\"(.*)\"", "VARIABLE=\"31\"");
ruleFixed = s;
I found this code sample at "Replace any character in between AnyText: and with an empty string using Regex?", which is one of the links I posted above. I had skipped over this syntax because I thought it wouldn't handle what I needed.
var str1 = "test=\"foobar\"";
var str2 = str1.Substring(0, str1.IndexOf("\"") + 1) + "31\"";
If needed add check for IndexOf != -1
I don't know if I understood you correctly, but if you want to replace all characters inside the quotes, why aren't you using a simple regular expression?
String str = "test=\"-\"1\"";
Regex regExpr = new Regex("\".*\"", RegexOptions.IgnoreCase);
String result = regExpr.Replace(str , "\"31\"");
Console.WriteLine(result);
prints:
test="31"
Note: You can take advantage of plain old XAttribute
String ruleFixed = "test=\"-\"1\"";
var splited = ruleFixed.Split('=');
var attribute = new XAttribute(splited[0], splited[1]);
attribute.Value = "31";
Console.WriteLine(attribute);//prints test="31"
var parts = given.Split('=');
return string.Format("{0}=\"{1}\"", parts[0], replacement);
In the case that your string has other things in it besides just the key/value pair of key="value", then you need to make the value-match part not match quote marks, or it will match all the way from the first value to the last quote mark in the string.
If that is true, then try this:
Regex.Replace(ruleFixed, "(?<=VARIABLE\\s*=\\s*\")[^\"]*(?=\")", "31");
This uses a positive look-behind to match the VARIABLE=" part (with optional white space around the equals sign, so VARIABLE = " would work as well) and a positive look-ahead to match the ending ", without including either in the final match, so you can replace just the value you want.
If not, then your solution will work, but is not optimal because you have to repeat the value and the quote marks in the replace text.
Assuming that the string within the quotes does not contain quotes itself, you can use this general pattern in order to find a position between a prefix and a suffix:
(?<=prefix)find(?=suffix)
In your case
(?<=\w+=").*?(?=")
Here we are using the prefix \w+=" where \w+ denotes word characters (the variable) and =" are the equal sign and the quote.
We want to find anything .*? until we encounter the next quote.
The suffix is simply the quote ".
string result = Regex.Replace(input, "(?<=\\w+=\").*?(?=\")", replacement);
Try this:
[^"\r\n]*(?:""[\r\n]*)*
var pattern = "\"(.*)?\"";
var regex = new Regex(pattern, RegexOptions.IgnoreCase);
var replacement = regex.Replace("test=\"hereissomething\"", "\"31\"");
String str1 = "test=\"-1\"";
string[] parts = str1.Split(new[] {'"'}, 3);
string str2 = parts.Length == 3 ? string.Join("\"", parts.First(), "31", parts.Last()) : str1;
String str1 = "test=\"-1\"";
string res = Regex.Replace(str1, "(^[^\"]+\").+(\"+)", "${1}31$2");
I'm pretty bad at regex, but you could make a simple extension method using string functions to do this.
using System.Linq;
public static class StringExtensions
{
    public static string ReplaceBetweenQuotes(this string str, string replacement)
    {
        // Only act when the string contains exactly one pair of quotes.
        if (str.Count(c => c.Equals('"')) == 2)
        {
            int start = str.IndexOf('"') + 1;
            int end = str.LastIndexOf('"');
            // Rebuild from the pieces; str.Replace would also change matching text outside the quotes.
            str = str.Substring(0, start) + replacement + str.Substring(end);
        }
        return str;
    }
}
Usage:
String str3 = "test=\"foobar\"";
str3 = str3.ReplaceBetweenQuotes("31");
returns: "test=\"31\""
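When the input holds several key="value" pairs, the look-behind approach from the answers above can be combined with a key name supplied at run time. The sketch below assumes values never contain a quote character; QuotedValues and SetValue are hypothetical names. A MatchEvaluator is used so that a '$' in the new value is not interpreted as a substitution.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuotedValues
{
    // Replaces the quoted value that follows key= with newValue.
    // Assumption: the value itself contains no '"' characters.
    public static string SetValue(string input, string key, string newValue)
    {
        // Look-behind anchors on key=" (optional spaces around '='),
        // look-ahead requires the closing quote; neither is part of the match.
        string pattern = "(?<=\\b" + Regex.Escape(key) + "\\s*=\\s*\")[^\"]*(?=\")";
        return Regex.Replace(input, pattern, m => newValue);
    }

    public static void Main()
    {
        // Prints: test="31"
        Console.WriteLine(SetValue("test=\"foobar\"", "test", "31"));
        // Prints: a="1" test="31" b="2"
        Console.WriteLine(SetValue("a=\"1\" test=\"-1\" b=\"2\"", "test", "31"));
    }
}
```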
