I am making an application where I need to verify the syntax of each line which contains a command involving a keyword as the first word.
Also, if the syntax is correct I need to check the type of the variables used in the keywords.
Like if there's a print command:
print "string" < variable < "some another string" //some comments
print\s".*"((\s?<\s?".*")*\s?<\s?(?'string1'\w+))?(\s*//.*)?
So i made the following Regex:
\s*[<>]\s*((?'variant'\w+)(\[\d+\])*)
This is to access all words in variant group to extract the variables used and verify their type.
Like this my tool has many keywords and currently I am crudely writing regex for each keyword. And if there's a change tomorrow I would be replacing the respective change everytime everywhere in every keyword.
I am storing a Regex for each keyword in an XML file. However I was interested in making it extensible, where say the specification changes tomorrow so I need to change it only once and it would reflect in all the places something like I transform the print regex to:
print %string% (%<% %string%|%variable%)* %comments%
Now like this, I write a specification for each keyword and write the definition of string, variable, comments in another file which stores their regex. Then I write a parser which parses this string and create a regex string for me.
Is this possible?
Is there any better way of doing this or is there any way I can do this in XML?
Last time I asked a question like this, someone pointed me to http://www.antlr.org/. Enjoy. :-)
I got an idea and made my own replacer. I used %myname% kind of tags to define my regular expression, and i wrote the definition of %myname% tags seperately using regex. Then i scanned the string recursively and converted the occurance of %myname% tags to the specification they had. It did my work.Thanks any ways
Related
I have lots of code like below:
PlusEnvironment.EnumToBool(Row["block_friends"].ToString())
I need to convert them to something like this.
Row["block_friends"].ToString() == "1"
The value that gets passed to EnumToBool is always unique, meaning there is no guarantee that itll be passed by a row, it could be passed by a variable, or even a method that returns a string.
I've tried doing this with regex, but its sort of sketchy and doesn't work 100%.
PlusEnvironment\.EnumToBool\((.*)\)
I need to do this in Visual Studio's find and replace. I'm using VS 17.
If you had a few places where PlusEnvironment.EnumToBool() was called, I would have done the same thing that #IanMercer suggested: just replace PlusEnvironment.EnumToBool( with empty string and the fix all the syntax errors.
#IanMercer has also given you a link to super cool, advanced regex usage that will help you.
But if you are skeptical about using such a complex regex on hundreds of files, here is what I would have done:
Define my own PlusEnvironment class with EnumToBool functionality in my own namespace. And then just replace the using Plus; line with using <my own namespace>; in those hundreds of files. That way my changes will be limited to only the using... line, 1 line per file, and it will be simple find and replace, no regex needed.
(Note: I'm assuming that you don't want to use PlusEnvironment, or the complete library and hence you want to do this type of replacement.)
in Find and Replace Window:
Find:
PlusEnvironment\.EnumToBool\((.*))
Replace:
$1 == "1"
Make sure "Use Regular Expressions" is selected
Is it possible somehow to do a RegEx-replace with a calculation in the result? (in VS2010)
Such as:
Grid\.Row\=\"{[0-9]+}\"
to
Grid.Row="eval(int(\1) + 1)"
You can use a MatchEvaluator do achieve this, like
String s = Regex.Replace("1239", #"\d", m => (Int32.Parse(m.ToString()) + 1).ToString());
Output: 23410
Edit:
I just noticed... if you mean "using the VS2010 find-replace feature" and not "using C#", then the answer is "no", i am afraid.
You could always use capturing to retrieve any values you need for your calculation and then perform a RegEx Replace with a new RegEx that's constructed from you're equation and any values you captured.
If the equation doesn't use anything from the input text, one RegEx would be sufficient. You'd simply construct it by concatenating the static portions together with the computed value(s).
Unfortunately, C# and .NET do not provide an eval method or equivalent. However, it is possible to either use a library for expression parsing (a quick google gave me this .NET Math Expression Parser) or write your own (which is actually pretty easy, check out the Shunting-yard Algorithm and Postfix Notation). Simply capture the group then output the group value to the library/method you have written.
Edit: I see now you want this for the VS2010 program. This is unachievable unless you write your own VS extension. You could always write a program to search and replace your code and feed the code into it, then replace it the original code with its output.
So, Im trying to make a program to rename some files. For the most part, I want them to look like this,
[Testing]StupidName - 2[720p].mkv
But, I would like to be able to change the format, if so desired. If I use MatchEvaluators, you would have to recompile every time. Thats why I don't want to use the MatchEvaluator.
The problem I have is that I don't know how, or if its possible, to tell Replace that if a group was found, include this string. The only syntax for this I have ever seen was something like (?<group>:data), but I can't get this to work. Well if anyone has an idea, im all for it.
EDIT:
Current Capture Regexes =
^(\[(?<FanSub>[^\]\)\}]+)\])?[. _]*(?<SeriesTitle>[\w. ]*?)[. _]*\-[. _]*(?<EpisodeNumber>\d+)[. _]*(\-[. _]*(?<EpisodeName>[\w. ]*?)[. _]*)?([\[\(\{](?<MiscInfo>[^\]\)\}]*)[\]\)\}][. _]*)*[\w. ]*(?<Extension>\.[a-zA-Z]+)$
^(?<SeriesTitle>[\w. ]*?)[. _]*[Ss](?<SeasonNumber>\d+)[Ee](?<EpisodeNumber>\d+).*?(?<Extension>\.[a-zA-Z]+)$
^(?<SeriesTitle>[\w. ]*?)[. _]*(?<SeasonNumber>\d)(?<EpisodeNumber>\d{2}).*?(?<Extension>\.[a-zA-Z]+)$
Current Replace Regex = [${FanSub}]${SeriesTitle} - ${EpisodeNumber} [${MiscInfo}]${Extension}
Using Regex.Replace, the file TestFile 101.mkv, I get []TestFile - 1[].mkv. What I want to do is make it so that [] is only included if the group FanSub or MiscInfo was found.
I can solve this with a MatchEvaluator because I actually get to compile a function. But this would not be a easy solution for users of the program. The only other idea I have to solve this is to actually make my own Regex.Replace function that accepts special syntax.
It sounds like you want to be able to specify an arbitrary format dynamically rather than hard-code it into your code.
Perhaps one solution is to break your filename parts into specific groups then pass in a replacement pattern that takes advantage of those group names. This would give you the ability to pass in different replacement patterns which return the desired filename structure using the Regex.Replace method.
Since you didn't explain the categories of your filename I came up with some random groups to demonstrate. Here's a quick example:
string input = "Testing StupidName Number2 720p.mkv";
string pattern = #"^(?<Category>\w+)\s+(?<Name>.+?)\s+Number(?<Number>\d+)\s+(?<Resolution>\d+p)(?<Extension>\.mkv)$";
string[] replacePatterns =
{
"[${Category}]${Name} - ${Number}[${Resolution}]${Extension}",
"${Category} - ${Name} - ${Number} - ${Resolution}${Extension}",
"(${Number}) - [${Resolution}] ${Name} [${Category}]${Extension}"
};
foreach (string replacePattern in replacePatterns)
{
Console.WriteLine(Regex.Replace(input, pattern, replacePattern));
}
As shown in the sample, named groups in the pattern, specified as (?<Name>pattern), are referred to in the replacement pattern by ${Name}.
With this approach you would need to know the group names beforehand and pass these in to rearrange the pattern as needed.
In the application I am currently working on, I have an option to create automatic backups of a certain file on the hard disk. What I would like to do is offer the user the possibility to configure the name of the file and its extension.
For example, the backup filename could be something like : "backup_month_year_username.bak". I had the idea to save the format in the form of a regular expression. For the example above, the regexp would look like :
"^backup_(?<Month>\d{2})_(?<Year>\d{2})_(?<Username>\w).(?<extension>bak)$"
I thought about using regex because I will also have to browse through the directory of backuped files to delete those older than a certain date. The main trouble I have now is how to create a filename using the regex. In a way I should replace the tags with the information. I could do that using regex.replace and another regex, but I feel it's a big weird doing that and it might be a better way.
Thanks
[Edit] Maybe I wasn't really clear in the first go, but the idea is of course that the user (in this case an admin that will know regex syntax) will have the possibility to modify the form of the filename, that's all the idea behind it[/Edit]
... and if the regex changes, it is next to impossible to reconstruct a string from a given regex.
Edit:
Create some predefined "place-holders": %u could be the user's name, %y could be the year, etc.:
backup_%m_%y_%u.bak
and then simple replace the %? with their actual values.
It sounds like you're trying to use the regular expression to create the file name from a pattern which the user should be able to specify.
Regular expressions can - AFAIK - not be used to create output, but only to validate input, so you'd have the user specify two things:
a file name production pattern like Bart suggested
a validation pattern in form of a regular expression that helps you split the file names into their parts
EDIT
By the way, your sample regex contains an error: The "." is use for "any character", also \w only matches one word character, so I guess you meant to write
"^backup_(?<Month>\d{2})_(?<Year>\d{2})_(?<Username>\w+)\.(?<extension>bak)$"
If the filename is always in this form, there is no reason for a regex, as it's easier to process with string.Split ...
With Bart's solution it is easy enough to split (using string.Split) the generated file name using underscore as the delimiter, to get back the information.
Ok, I think I have found a way to use only the regex. As I am using groups to get the information, I will use another regular expression to match the regular expression and replace the groups with the value:
Regex rgx = new Regex("\(\?\<Month\>.+?\)");
rgx.Replace("^backup_(?<Month>\d{2})_(?<Year>\d{2})_(?<Username>\w+)\.(?<extension>bak)$"
, DateTime.Now.Month.ToString());
Ok, it's really a hack, but at least it works and I have only one pattern defined by the user. It might not work if the regex is too complex, but I think I can deal with that problem.
What do you think?
I am wondering if it is possible to extract the index position in a given string where a Regex failed when trying to match it?
For example, if my regex was "abc" and I tried to match that with "abd" the match would fail at index 2.
Edit for clarification. The reason I need this is to allow me to simplify the parsing component of my application. The application is an Assmebly language teaching tool which allows students to write, compile, and execute assembly like programs.
Currently I have a tokenizer class which converts input strings into Tokens using regex's. This works very well. For example:
The tokenizer would produce the following tokens given the following input = "INP :x:":
Token.OPCODE, Token.WHITESPACE, Token.LABEL, Token.EOL
These tokens are then analysed to ensure they conform to a syntax for a given statement. Currently this is done using IF statements and is proving cumbersome. The upside of this approach is that I can provide detailed error messages. I.E
if(token[2] != Token.LABEL) { throw new SyntaxError("Expected label");}
I want to use a regular expression to define a syntax instead of the annoying IF statements. But in doing so I lose the ability to return detailed error reports. I therefore would at least like to inform the user of WHERE the error occurred.
I agree with Colin Younger, I don't think it is possible with the existing Regex class. However, I think it is doable if you are willing to sweat a little:
Get the Regex class source code
(e.g.
http://www.codeplex.com/NetMassDownloader
to download the .Net source).
Change the code to have a readonly
property with the failure index.
Make sure your code uses that Regex
rather than Microsoft's.
I guess such an index would only have meaning in some simple case, like in your example.
If you'll take a regex like "ab*c*z" (where by * I mean any character) and a string "abbbcbbcdd", what should be the index, you are talking about?
It will depend on the algorithm used for mathcing...
Could fail on "abbbc..." or on "abbbcbbc..."
I don't believe it's possible, but I am intrigued why you would want it.
In order to do that you would need either callbacks embedded in the regex (which AFAIK C# doesn't support) or preferably hooks into the regex engine. Even then, it's not clear what result you would want if backtracking was involved.
It is not possible to be able to tell where a regex fails. as a result you need to take a different approach. You need to compare strings. Use a regex to remove all the things that could vary and compare it with the string that you know it does not change.
I run into the same problem came up to your answer and had to work out my own solution. Here it is:
https://stackoverflow.com/a/11730035/637142
hope it helps