How to extract string from java properties - c#

Here is the Java properties content:
xxx_error_tx1 = This is xxxx. Johe say:
xxx_error_MapCode = xxx_error_tx1, test this function,Failed,\
Default, Current,\
App_Error_tx1
I need to extract each string ID and its content. The first entry is extracted correctly, but for the second entry only the first line, xxx_error_tx1, test this function,Failed,\, is extracted. The continuation lines are not captured.
The regex is (?<ID>.+?)=(?<Translation>.+?)$. I know this regex has a problem, and I've tried to correct the pattern, but as a newbie I still can't get the result I need.
Any help would be appreciated.

It seems you want something like this:
(?<ID>.+?)=(?<Translation>(?:(?!\S+\s*=)[\s\S])+)
(?:(?!\S+\s*=)[\s\S])+ matches one or more characters (whitespace or non-whitespace, so line breaks are included), stopping before any position where \S+\s*= would match, that is, before the next key.

Try this. It correctly includes the whole value when the value is split across multiple lines, but stops before the line that follows.
(?<ID>.+?)=(?<Translation>(?:.*\\\s)*.*)
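As a minimal sketch of how the continuation-aware pattern above might be used from C#: the class and method names (PropertiesReader, Read) are hypothetical, and the code assumes the text uses "\n" line endings (with "\r\n" the pattern would stop after the first continuation line). It trims the captured ID and value and joins continuation lines by dropping each trailing backslash together with the line break and any leading whitespace on the next line.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class PropertiesReader
{
    // Same pattern as in the answer: a value keeps going while a line ends in a backslash.
    private static readonly Regex Entry =
        new Regex(@"(?<ID>.+?)=(?<Translation>(?:.*\\\s)*.*)");

    // Hypothetical helper: returns key/value pairs with continuation lines joined.
    // Assumes "\n" line endings.
    public static Dictionary<string, string> Read(string text)
    {
        var entries = new Dictionary<string, string>();
        foreach (Match m in Entry.Matches(text))
        {
            string id = m.Groups["ID"].Value.Trim();
            // Drop each backslash + line break (and indentation of the next line).
            string value = Regex.Replace(m.Groups["Translation"].Value, @"\\\n\s*", "").Trim();
            entries[id] = value;
        }
        return entries;
    }
}
```

With the sample content from the question, this should yield two entries, the second one holding the three physical lines joined into a single value.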

Related

How to remove a pattern from a string using Regex

I want to find paths from a string and remove them, e.g.:
string1 = "'c:\a\b\c'!MyUDF(param1, param2,..) + 'c:\a\b\c'!MyUDF(param3, param4,..)..."
I'd like a regex to find the pattern '[some path]'!MyUDF, and remove '[path]'.
Thanks.
Edit:
Example input:
string1 = "'c:\a\b\c'!MyUDF(param1, param2,..) + 'c:\a\b\c'!MyUDF(param3, param4,..)";
Expected output: "MyUDF(param1, param2,...) + MyUDF(param3, param4,...)"
where MyUDF is a function name, so it consists of only letters
input=Regex.Replace(input,"'[^']+'(?=!MyUDF)","");
If the path is followed by ! and some other word, you can use:
input=Regex.Replace(input,@"'[^']+'(?=!\w+)","");
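Note that a lookahead such as (?=!\w+) leaves the ! in place, whereas the expected output in the question has no !. A small sketch of a variant that also consumes the ! follows; the class and method names (UdfPaths, StripPaths) are made up for illustration, and it assumes paths never contain a single quote.

```csharp
using System;
using System.Text.RegularExpressions;

static class UdfPaths
{
    // Removes a quoted path and the '!' that follows it, but only when a
    // function name and an opening parenthesis come next.
    public static string StripPaths(string input)
    {
        return Regex.Replace(input, @"'[^']+'!(?=\w+\()", "");
    }
}
```

Called on the example input from the question, this leaves only the MyUDF(...) calls joined by " + ".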
Alright, if the ! is always in the string as you suggest, this Regex !(.*)?\( will get you what you want. Here is a Regex 101 to prove it.
To use it, you might do something like this:
var result = Regex.Replace(myString, @"!(.*)?\(");
The feature you want, if you are dealing with file paths, is in System.IO.Path.
There are many methods there, and handling paths is its specific purpose.

"Evaluate" a c# string

I am reading a C# source file.
When I encounter a string, I want to get its value.
For instance, in the following example:
public class MyClass
{
    public MyClass()
    {
        string fileName = "C:\\Temp\\A Weird\"FileName";
    }
}
I would like to retrieve
C:\Temp\A Weird"FileName
Is there an existing procedure to do that?
Coding a solution that handles all the possible cases should be quite tricky (@, escape sequences, ...).
I am convinced such a procedure exists...
I would also like to have the inverse function (to inject a string into a C# source file).
Thanks in advance.
Philippe
P.S:
I gave an example with a filename, but I look for a solution working for all kinds of strings.
I'm pretty sure you can use CodeDOM to read a C# code file and parse its elements. It generates a code tree, and then you can look for nodes representing strings.
http://www.codeproject.com/Articles/2502/C-CodeDOM-parser
Other CodeDom parsers:
http://www.codeproject.com/Articles/14383/An-Expression-Parser-for-CodeDom
NRefactory: https://github.com/icsharpcode/NRefactory and http://www.codeproject.com/Articles/408663/Using-NRefactory-for-analyzing-Csharp-code
There is a way of extracting these strings using a regular expression:
("(\\"|[^"])*")
This particular one works on your simple example and gives the filename (complete with leading and trailing quote characters); whether it would work on more complex ones I can't easily tell unfortunately.
For clarity, (\\"|[^"]) matches any character apart from ", except where it has a leading \ character.
Just use the regex ".*" to match all string values, then strip the surrounding quotation marks and unescape the result.
This allows \" and "" sequences inside your string,
so both "C:\\Temp\\A Weird\"FileName" and "Hello ""World""" will match.
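The regex answers above stop at finding the literal; the value still has to be unescaped. Here is a rough sketch of both steps for regular (non-verbatim) literals only. It swaps in a slightly different pattern, "(?:\\.|[^"\\])*", which treats any backslash pair as a unit, and a hand-written unescaper that covers only a handful of escapes (\\, \", \', \n, \t, \r, \0). Verbatim @"..." strings, \u and \x escapes, and quotes inside comments or character literals are not handled. All names (StringLiteralReader, ReadLiterals, Unescape) are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

static class StringLiteralReader
{
    // A quote, then escaped pairs or ordinary characters, then the closing quote.
    private static readonly Regex Literal = new Regex(@"""(?:\\.|[^""\\])*""");

    // Returns the unescaped value of every regular string literal found in source.
    public static List<string> ReadLiterals(string source)
    {
        var values = new List<string>();
        foreach (Match m in Literal.Matches(source))
        {
            // Strip the surrounding quotes before unescaping.
            values.Add(Unescape(m.Value.Substring(1, m.Value.Length - 2)));
        }
        return values;
    }

    // Partial unescaper: only the escapes listed in the switch are translated.
    private static string Unescape(string body)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < body.Length; i++)
        {
            if (body[i] != '\\' || i + 1 == body.Length)
            {
                sb.Append(body[i]);
                continue;
            }
            char next = body[++i];
            switch (next)
            {
                case 'n': sb.Append('\n'); break;
                case 't': sb.Append('\t'); break;
                case 'r': sb.Append('\r'); break;
                case '0': sb.Append('\0'); break;
                default: sb.Append(next); break; // covers \\, \" and \'
            }
        }
        return sb.ToString();
    }
}
```

For anything beyond simple cases, a real C# parser is likely the safer route, since a regex cannot tell a string literal from a quote inside a comment.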

C# Regular expression problem

I have the following string:
http://www.powerwXXe.com/text1 123-456 text2 text3/
Can someone give me advice on how to get the value of text1, text2 and text3 and put them into a string. I have heard of regular expressions but have no idea how to use them.
Instead of going the RegEx route, if you know that the string will always be of a similar format, you can use string.Split, first on /, then on space, and retrieve the results from the resulting string arrays.
string[] slashes = myString.Split('/');
string[] textVals = slashes[3].Split(' ');
// at this point:
// textVals[0] = "text1"
// textVals[1] = "123-456"
// textVals[2] = "text2"
// textVals[3] = "text3"
Here is a link on getting started with regular expressions in C#: Regular Expression Tutorial
I don't think it is appropriate to write out a tutorial here since the information is online, so please check out the link and let me know if you have a specific question.
Instead of using regex, you can use string.Format("http://myurl.com/{0}{1}{2}", value1, textbox2.Text, textbox3.Text) and format the URL in whatever fashion you like. If you are looking to go the regex route, you can always check regexlib.
The use of regular expressions relies on patterns you see in your strings - you need to be able to generalize the pattern of strings you're looking for before you can use a regular expression.
For a problem of this scope, if you can pin down the pattern, you're probably better off using other string parsing methods, such as String.IndexOf and String.Split.
Regular expressions are a powerful tool, and certainly worth learning, but they might not be necessary here.
Based on the example you gave, it looks as though text1, text2 and text3 are separated by spaces? If so, and if you always know the positions they'll be in, you may want to skip regular expressions and just use .Split(' ') to split the string into an array of strings and then grab the pertinent items from there. Something like this:
string foo = "http://www.powerwXXe.com/text1 123-456 text2 text3/";
string[] fooParts = foo.Split(' ');
string text1 = fooParts[0].Replace("http://www.powerwXXe.com/", "");
string text2 = fooParts[2];
string text3 = fooParts[3].Replace("/", "");
You'd want to perform bounds checking on the string[] before trying to grab anything from it, but this would work. Regex is awesome for string parsing, but when it's simple stuff you need to do, sometimes it's overkill when simple methods from the string class will do.
It all depends on how much you know about the string you are parsing. Where does the string come from, and how much do you know about its formatting?
Based on your example string you could get away with something as simple as
string pattern = @"http://www.powerwXXe.com/(?<myGroup1>\S+)\s\S+\s(?<myGroup2>\S+)\s(?<myGroup3>\S+)/";
var reg = new System.Text.RegularExpressions.Regex(pattern);
string input = "http://www.powerwXXe.com/text1 123-456 text2 text3/";
System.Text.RegularExpressions.Match myMatch = reg.Match(input);
The captured strings would then be contained in myMatch.Groups["myGroup1"], ["myGroup2"] and ["myGroup3"] respectively.
This, however, assumes that your string always begins with http://www.powerwXXe.com/, that there will always be three groups to capture, and that the groups are separated by a space (which is an illegal character in URLs and would in almost all cases be converted to %20, which would have to be accounted for in the pattern).
So, how much do you know about your string? And, as some have already stated, do you really need regular expressions?
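To make those assumptions explicit, here is one possible sketch that anchors the pattern and reports failure instead of returning empty groups. The names (UrlParts, TryParse, text1 to text3) are invented for this example, and the pattern still assumes the fixed host and literal spaces discussed above.

```csharp
using System;
using System.Text.RegularExpressions;

static class UrlParts
{
    // Anchored variant of the pattern above, with the dots in the host escaped.
    private static readonly Regex Pattern = new Regex(
        @"^http://www\.powerwXXe\.com/(?<text1>\S+)\s\S+\s(?<text2>\S+)\s(?<text3>[^\s/]+)/$");

    // Returns false when the input does not have the expected shape.
    public static bool TryParse(string input, out string text1, out string text2, out string text3)
    {
        text1 = text2 = text3 = string.Empty;
        Match m = Pattern.Match(input);
        if (!m.Success)
        {
            return false;
        }
        text1 = m.Groups["text1"].Value;
        text2 = m.Groups["text2"].Value;
        text3 = m.Groups["text3"].Value;
        return true;
    }
}
```

Checking Match.Success first plays the same role as the bounds checking recommended for the Split approach.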

Extracting a string starting with x and ending with y

First of all, I did a search on this and was able to find how to use something like String.Split() to extract the string based on a condition. I wasn't able to find however, how to extract it based on an ending condition as well. For example, I have a file with links to images: http://i594.photobucket.com/albums/tt27/34/444.jpghttp://i594.photobucket.com/albums/as/asfd/ghjk6.jpg
You will notice that all the images start with http:// and end with .jpg. However, .jpg is followed by http:// without a space, making this a little more difficult.
So basically I'm trying to find a way (Regex?) to extract a string from a string that starts with http:// and ends with .jpg
Regex is the easiest way to do this. If you're not familiar with regular expressions, you might check out Regex Buddy. It's a relatively cheap little tool that I found extremely useful when I was learning. For your particular case, a possible expression is:
(http://.+?\.jpg)
It probably requires some more refinement, as there are boundary cases that could trip this up, but it would work if the file is a simple list.
You can also do free quick testing of expressions here.
Per your latest comment, if you have links to other non-images as well, then you need to make sure it doesn't start at the http:// for one link and read all the way to the .jpg for the next image. Since URLs are not allowed to have whitespace, you can do it like this:
(http://[^\s]+\.jpg)
This basically says, "match a string starting with http:// and ending with .jpg where there is at least one character between the two and none of those characters are whitespace".
Regex RegexObj = new Regex("http://.+?\\.jpg");
Match MatchResults = RegexObj.Match(subject);
while (MatchResults.Success) {
    // Do something with it
    MatchResults = MatchResults.NextMatch();
}
In your specific case, you could always split it by ".jpg". You will probably end up with one empty element at the end of the array, and you will have to append the .jpg to the end of each entry if you need that. Apart from that I think it would work.
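One detail worth checking with the patterns above: on the concatenated sample from the question, which contains no whitespace at all, the greedy [^\s]+ form runs to the last .jpg and returns the whole text as a single match, while the lazy forms split it into two links. The small helper below (ImageLinks and Extract are hypothetical names) is a sketch for comparing patterns on the same input.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class ImageLinks
{
    // Collects every match of the given pattern in the text, in order.
    public static List<string> Extract(string text, string pattern)
    {
        var links = new List<string>();
        foreach (Match m in Regex.Matches(text, pattern))
        {
            links.Add(m.Value);
        }
        return links;
    }
}
```

For the sample string, Extract with @"http://.+?\.jpg" or @"http://[^\s]+?\.jpg" gives two links, whereas @"http://[^\s]+\.jpg" gives one. The lazy [^\s]+? variant combines both ideas: it stops at the first .jpg and never crosses whitespace.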
Tested the following code and it worked fine:
public void SplitTest()
{
    string test = "http://i594.photobucket.com/albums/tt27/34/444.jpghttp://i594.photobucket.com/albums/as/asfd/ghjk6.jpg";
    string[] items = test.Split(new string[] { ".jpg" }, StringSplitOptions.RemoveEmptyEntries);
}
It even gets rid of the empty entry...
The following LINQ will separate by http: and make sure to only get values that end with jpg.
var images = from i in imageList.Split(new[] {"http:"},
StringSplitOptions.RemoveEmptyEntries)
where i.EndsWith(".jpg")
select "http:" + i;

C# Need to locate web addresses using REGEX is that possible?

Basically I need to parse a string prior to loading it into a WebBrowser
myString = "this is an example string http://www.google.com , and I need to make the link clickable";
webBrow.DocumentText = myString;
Basically, I want to replace each web address so that it becomes a hyperlink, and to do this for any address that appears in the string. The web address would need to be replaced so that it reads like
<a href='web address'>web address</a>
This would allow me to have the links clickable..
Any Ideas?
new Regex(@"https?://([-\w\.]+)+(:\d+)?(/([\w/_\.]*(\?\S+)?)?)?").Match(myString)
It's possible depending on how strict or permissive you want your parsing to be.
As a first cut, you can try @"\bhttp://\S+", which will match any string starting with "http://" at a word boundary (for example, right after whitespace or punctuation) and running up to the next whitespace.
To search using a regex and replace all occurrences with your custom text, you could use the Regex.Replace method.
You may want to read up on Regular Expression Language Elements to learn more.
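Putting the first-cut pattern together with Regex.Replace, a minimal sketch could look like this. Linkifier and Linkify are hypothetical names; the pattern is extended with s? to accept https as well, and the sketch does no HTML encoding, so it assumes the text is trusted. In the replacement string, $0 stands for the whole match.

```csharp
using System;
using System.Text.RegularExpressions;

static class Linkifier
{
    // Wraps every http:// or https:// address in an anchor tag.
    // Everything up to the next whitespace counts as part of the address.
    public static string Linkify(string text)
    {
        return Regex.Replace(text, @"\bhttps?://\S+", "<a href='$0'>$0</a>");
    }
}
```

The result can then be assigned to webBrow.DocumentText as in the question. Because \S+ is permissive, punctuation directly attached to an address (such as a trailing comma) would become part of the link.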