{%if Lang="english", Site="testsite"}
content
{%endif}
I need to get the groups Lang, Site and the content
This is what I am using to get the content part
".*}(.*){%.*"
You can name a capture group with the (?<groupname>...) syntax.
This is a very crude regex to get the groups you want:
\{\%if\s.*(Lang=\"(?<lang>[^\"]*))\".*(Site=\"(?<site>[^\"]*))\"\}(?<content>[^\{]*)\{\%endif\}
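When you use the regex from C#, you read the groups from the Match, not from the Regex itself (input is the string being searched):
var _regex = new Regex(...);
var _match = _regex.Match(input);
var _language = _match.Groups["lang"].Value;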
I'm not sure how flexible you need your regex to be, but try something like:
{%if Lang="(.*)",\s.*Site="(.*?)"}(\r\n)*?(.*?)(\r\n)*?{%endif}
You can then get the matches from there.
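As a rough end-to-end sketch of the named-group approach described above, the following reads Lang, Site and the content from the sample in the question. The pattern is a simplified variant written for this example, not either answerer's exact regex, and it assumes Lang always comes before Site.

```csharp
using System;
using System.Text.RegularExpressions;

// Sample template text taken from the question.
string input = "{%if Lang=\"english\", Site=\"testsite\"}\ncontent\n{%endif}";

// Named groups for the two attributes and the body. Singleline lets '.' in the
// content group cross line breaks; the lazy quantifier stops at the first {%endif}.
var regex = new Regex(
    @"\{%if\s+Lang=""(?<lang>[^""]*)"",\s*Site=""(?<site>[^""]*)""\}(?<content>.*?)\{%endif\}",
    RegexOptions.Singleline);

Match match = regex.Match(input);

// Groups are read from the Match; a group that did not match has an empty Value.
string lang = match.Groups["lang"].Value;
string site = match.Groups["site"].Value;
string content = match.Groups["content"].Value.Trim();

if (match.Success)
{
    Console.WriteLine($"{lang} / {site} / {content}");
}
```

Trim is only there to drop the line breaks around the body; leave it out if the surrounding whitespace matters.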
Related
I have an interesting problem and would like to find the best solution; I have tried my best with regex. I want to find all the col_x values in this string using C#, with a regular expression or any other method.
[col_5] is a central heating boiler manufacturer produce boilers under [col_6]
brand name . Your selected [col_7] model name is a [col_6] [col_15] boiler.
[col_6] [col_15] boiler [col_7] model [col_10] came in production untill
[col_11]. [col_6] model product index number is [col_1] given by SEDBUK
'Seasonal Efficiency of a Domestic Boiler in the UK'. [col_6] model have
qualifier [col_8] and GCN [col_9] 'Boiler Gas Council No'. [col_7] model
source of heat for a boiler combustion is a [col_12].
The output expected is an array
var data =["col_5","col_10","etc..."]
Edit
my attempt :
string text = "[col_1]cc[col_2]asdfsd[col_3]";
var matches = Regex.Matches(text, @"[[^@]*]");
var uniques = matches.Cast<Match>().Select(match => match.Value).ToList().Distinct();
foreach(string m in uniques)
{
Console.WriteLine(m);
}
but no success.
Try something like this:
string[] result = Regex.Matches(input, @"\[(col_\d+)\]").
Cast<Match>().
Select(x => x.Groups[1].Value).
ToArray();
I think that's what you need:
string pattern = @"\[(col_\d+)\]";
MatchCollection matches = Regex.Matches(input, pattern);
string[] results = matches.Cast<Match>().Select(x => x.Groups[1].Value).ToArray();
Replace input with your input string.
I hope it helps
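The two regex answers above return every occurrence, so a name such as col_6 shows up several times. If each name should appear only once (the asker's own attempt calls Distinct), a minimal sketch could look like this; the input is a shortened stand-in for the text in the question.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Shortened stand-in for the boiler text in the question.
string input = "[col_5] is a manufacturer of [col_6] boilers. "
             + "Your [col_7] model is a [col_6] [col_15] boiler.";

// Group 1 captures the name without the surrounding brackets;
// Distinct drops the repeated names.
string[] columns = Regex.Matches(input, @"\[(col_\d+)\]")
    .Cast<Match>()
    .Select(m => m.Groups[1].Value)
    .Distinct()
    .ToArray();

Console.WriteLine(string.Join(", ", columns));
```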
This is a little hacky but you could do this.
var myMessage =@"[col_5] is a central heating boiler..."; //etc.
var values = Enumerable.Range(1, 100)
.Select(x => "[col_" + x + "]")
.Where(x => myMessage.Contains(x))
.ToList();
This assumes there is a known maximum col_x (I assumed 100 here). It tries every candidate by brute force and returns only the ones it finds in the text.
If you know there are only so many columns to hunt for, I would personally try this instead of regex, as I have burned too many hours on regex.
I'm writing a function that will parse a file similar to an XML file from a legacy system.
....
<prod pid="5" cat='gov'>bla bla</prod>
.....
<prod cat='chi'>etc etc</prod>
....
.....
I currently have this code:
buf = Regex.Replace(entry, "<prod(?:.*?)>(.*?)</prod>", "<span class='prod'>$1</span>");
Which was working fine until it was decided that we also wanted to show the categories.
The problem is, categories are optional and I need to run the category abbreviation through a SQL query to retrieve the category's full name.
eg:
SELECT * FROM cats WHERE abbr='gov'
The final output should be:
<span class='prod'>bla bla</span><span class='cat'>Government</span>
Any idea on how I could do this?
Note1: The function is done already (except this part) and working fine.
Note2: Cannot use XML libraries, regex has to be used
Regex.Replace has an overload that takes a MatchEvaluator, which is basically a Func<Match, string>. So, you can dynamically generate a replacement string.
buf = Regex.Replace(entry, @"<prod(?<attr>.*?)>(?<text>.*?)</prod>", match => {
var attrText = match.Groups["attr"].Value;
var text = match.Groups["text"].Value;
// Now, parse your attributes
var attributes = Regex.Matches(attrText, @"(?<name>\w+)\s*=\s*(['""])(?<value>.*?)\1")
.Cast<Match>()
.ToDictionary(
m => m.Groups["name"].Value,
m => m.Groups["value"].Value);
string category;
if (attributes.TryGetValue("cat", out category))
{
// Your SQL here etc...
var label = GetLabelForCategory(category);
return String.Format("<span class='prod'>{0}</span><span class='cat'>{1}</span>", WebUtility.HtmlEncode(text), WebUtility.HtmlEncode(label));
}
// Generate the result string
return String.Format("<span class='prod'>{0}</span>", WebUtility.HtmlEncode(text));
});
This should get you started.
What I need:
I have a string like this:
Bike’s: http://website.net/bikeurl Toys: http://website.net/rc-cars
Calendar: http://website.net/schedule
I want to match the word I specify and the first URL after it. So if I specify the word as "Bike", I should get:
Bike’s: http://website.net/bikeurl
Or if possible only the url of the Bike word:
http://website.net/bikeurl
Or if I specify Toys I should get:
Toys: http://website.net/rc-cars
or if possible
http://website.net/rc-cars
What I am using:
I am using this regex:
(Bike)(.*)((https?|ftp):/?/?)(?:(.*?)(?::(.*?)|)@)?([^:/\s]+)(:([^/]*))?(((?:/\w+)*)/)([-\w.]+[^#?\s]*)?(\?([^#]*))?(#(.*))?
Result:
It is matching:
Bike’s: http://website.net/bikeurl Toys: http://website.net/rc-cars
I only want:
Bike’s: http://website.net/bikeurl
I am not a regex expert, I tried using {n} {n,} but it either didn't match anything or matches the same
I am using .NET C# so I am testing here http://regexhero.net/tester/
Here is another approach:
Bike(.*?):\s\S*
and here is an example of how to get only the corresponding URL candidate:
var inputString = "Bike’s: http://website.net/bikeurl Toys: http://website.net/rc-cars Calendar: http://website.net/schedule";
var word = "Bike";
var url = new Regex( word + @"(.*?):\s(?<URL>\S*)" )
.Match( inputString )
.Result( "${URL}" );
If you really need to make sure it's a URL, look at these:
Validate urls with Regex
Regex to check a valid Url
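To tie the pieces of this answer together, here is a hedged sketch that looks up the first URL after a given word and then sanity-checks the candidate. TryFindUrlAfter is a hypothetical helper name, the pattern is a slight variant of the one above, and Uri.TryCreate is used here as a lightweight check instead of a validation regex.

```csharp
using System;
using System.Text.RegularExpressions;

string input = "Bike’s: http://website.net/bikeurl Toys: http://website.net/rc-cars "
             + "Calendar: http://website.net/schedule";

// Hypothetical helper: finds the first URL candidate that follows "word...:".
static bool TryFindUrlAfter(string text, string word, out string url)
{
    url = "";
    // Regex.Escape guards against words that contain regex metacharacters.
    Match match = Regex.Match(text, Regex.Escape(word) + @"\S*:\s+(?<url>\S+)");
    if (!match.Success)
    {
        return false;
    }

    string candidate = match.Groups["url"].Value;
    // Uri.TryCreate only confirms the candidate parses as an absolute URI.
    if (!Uri.TryCreate(candidate, UriKind.Absolute, out _))
    {
        return false;
    }

    url = candidate;
    return true;
}

if (TryFindUrlAfter(input, "Bike", out string bikeUrl))
{
    Console.WriteLine(bikeUrl);
}
```

Because the word is passed as a parameter, the same helper works for "Toys" or "Calendar" without editing the pattern.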
Here's another solution. I would separate the Bike's, Toys and Calendar in a dictionary and put the url as a value then when needed call it.
Dictionary<string, string> myDic = new Dictionary<string, string>()
{
{ "Bike’s:", "http://website.net/bikeurl" },
{ "Toys:", "http://website.net/rc-cars" },
{ "Calendar:", "http://website.net/schedule" }
};
foreach (KeyValuePair<string, string> item in myDic)
{
if (item.Key.StartsWith("Bike"))
{
//do something
}
}
Hope one of my ideas helps you.
If I understood your problem correctly, you need a generic regex that selects a URL based on a word. Here is one that selects the URL with bike in it:
(.(?<!\s))*\/\/((?!\s).)*bike((?!\s).)*
If you replace bike with any other word, it selects the respective URL.
EDIT 1:
Based on your edit, here is one that would select based on the word preceding the URL:
(TOKEN((?!\s).)*\s+)((?!\s).)*
It would select the word + the URL eg.
(Bike((?!\s).)*\s+)((?!\s).)* would select Bike’s: http://website.net/bikeurl
(Toy((?!\s).)*\s+)((?!\s).)* would select Toys: http://website.net/rc-cars
(Calendar((?!\s).)*\s+)((?!\s).)* would select Calendar: http://website.net/schedule
If you want to make sure the string contains a URL, you can use this instead:
(TOKEN((?!\s).)*\s+)((?!\s).)*\/\/((?!\s).)*
It will make sure that the 2nd part of the string ie. the one that is supposed to contain a URL has a // in between.
I'm new to .NET and having a hard time trying to understand the Regex object.
What I'm trying to do is below. It's pseudo-code; I don't know the actual code that makes this work:
string pattern = ...; // has multiple groups using the Regex syntax <groupName>
if (new Regex(pattern).Apply(inputString).HasMatches)
{
var matches = new Regex(pattern).Apply(inputString).Matches;
return new DecomposedUrl()
{
Scheme = matches["scheme"].Value,
Address = matches["address"].Value,
Port = Int.Parse(matches["address"].Value),
Path = matches["path"].Value,
};
}
What do I need to change to make this code work?
There is no Apply method on Regex. Seems like you may be using some custom extension methods that aren't shown. You also haven't shown the pattern you're using. Other than that, groups can be retrieved from a Match, not a MatchCollection.
Regex simpleEmail = new Regex(@"^(?<user>[^@]*)@(?<domain>.*)$");
Match match = simpleEmail.Match("someone@tempuri.org");
String user = match.Groups["user"].Value;
String domain = match.Groups["domain"].Value;
A Regex instance on my machine doesn't have the Apply method. I'd usually do something more like this:
var match=Regex.Match(input,pattern);
if(match.Success)
{
return new DecomposedUrl()
{
Scheme = match.Groups["scheme"].Value,
Address = match.Groups["address"].Value,
Port = int.Parse(match.Groups["port"].Value),
Path = match.Groups["path"].Value
};
}
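Since the question never shows its pattern, here is a self-contained sketch with an assumed one, just to show the named groups flowing from a Match into a result. The pattern, the sample URL and the use of a tuple in place of the DecomposedUrl class are all assumptions made for this example.

```csharp
using System;
using System.Text.RegularExpressions;

// Assumed pattern: the original question does not show one.
string pattern =
    @"^(?<scheme>[a-z][a-z0-9+.-]*)://(?<address>[^:/\s]+)(?::(?<port>\d+))?(?<path>/\S*)?$";
string inputString = "http://example.com:8080/some/path";

Match match = Regex.Match(inputString, pattern);

// A tuple stands in for the question's DecomposedUrl class.
(string Scheme, string Address, int Port, string Path) url = ("", "", -1, "");

if (match.Success)
{
    // The port group is optional, so check its Success before parsing;
    // -1 marks "no port given".
    Group portGroup = match.Groups["port"];
    int port = portGroup.Success ? int.Parse(portGroup.Value) : -1;

    url = (match.Groups["scheme"].Value,
           match.Groups["address"].Value,
           port,
           match.Groups["path"].Value);

    Console.WriteLine(url);
}
```

Checking Group.Success before int.Parse avoids a FormatException when the input has no port.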
I am using the Lucene.NET API directly in my ASP.NET/C# web application. When I search using a wildcard, like "fuc*", the highlighter doesn't highlight anything, but when I search for the whole word, like "fuchsia", it highlights fine. Does Lucene have the ability to highlight using the same logic it used to match with?
Various maybe-relevant code-snippets below:
var formatter = new Lucene.Net.Highlight.SimpleHTMLFormatter(
"<span class='srhilite'>",
"</span>");
var fragmenter = new Lucene.Net.Highlight.SimpleFragmenter(100);
var scorer = new Lucene.Net.Highlight.QueryScorer(query);
var highlighter = new Lucene.Net.Highlight.Highlighter(formatter, scorer);
highlighter.SetTextFragmenter(fragmenter);
and then on each hit...
string description = Server.HtmlEncode(doc.Get("Description"));
var stream = analyzer.TokenStream("Description",
new System.IO.StringReader(description));
string highlighted_text = highlighter.GetBestFragments(
stream, description, 1, "...");
And I'm using the QueryParser and the StandardAnalyzer.
You'll need to set the parser's rewrite method to SCORING_BOOLEAN_QUERY_REWRITE.
This change seems to have become necessary since Lucene v2.9 came along.
Hope this helps,