C# Regular Expressions Replace - c#

I'm trying to rewrite some text that contains URLs such as:
"/route/id"
"/base/route/id"
"/route2/id"
"/base/route2/id"
of which there will be many in the text. The quotes are part of the text being matched.
All these URLs need to be of the format "/base/...", so I need to rewrite "/ to "/base/ unless the "/base is already there, which is the bit I'm struggling with. I can replace the "/, but I can't skip it when it's already followed by base.

No need for Regex:
var listOfThingsAllWithBase = listOfThings.Select(a => a.StartsWith("\"/base/") ? a : "\"/base" + a.Substring(1));

My regex is a little rusty, but if I remember correctly, this should find all entries not starting with /base
^(?!/base)
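For the original problem, where the quoted URLs are embedded in a larger text, a negative lookahead placed right after the opening "/ is one way to skip URLs that already carry the prefix. This is a minimal sketch, assuming the prefix is always the literal segment base/ and that every "/ in the text starts a URL; the AddBase helper name is made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: prefixes /base to quoted URLs that lack it.
// (?!base/) is a negative lookahead: the match fails when the opening "/
// is immediately followed by base/, so already-prefixed URLs are left alone.
string AddBase(string text) =>
    Regex.Replace(text, "\"/(?!base/)", "\"/base/");

string input = "<a href=\"/route/id\"> <a href=\"/base/route2/id\">";
Console.WriteLine(AddBase(input));
// prints: <a href="/base/route/id"> <a href="/base/route2/id">
```

The lookahead checks for base/ rather than base so that a route such as "/baseball/id" would still be rewritten.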

Related

RegEx to extract partial string

So simple, but I'm struggling. I only do RegExp every 2 years or so, so I'm rusty.
I have these two url strings
http://localhost:58876/Products/Product1
https://localhost:58876/Products/Product1
The result I want is
localhost:58876
Basically remove the http(s):// and everything after the first single / so I end up with the domain with or without the port number
P.S: I'm working with C#
This worked for me (tested in Notepad++):
(\w+:\d+)
You can use the following regex to split the URL:
((http[s]?|ftp):/)?/?([^:/\s]+)(:([^/]*))?((/\w+)*/)([\w-.]+[^#?\s]+)(\?([^#]*))?(#(.*))?
The RegEx positions 3 and 5 are those you are looking for.
(^[^h]|\/\/)([\w\d\:\#\.]+:?[\d]?+)
then in c#:
string address = ...
char[] MyChar = {'/'};
string NewString = address.TrimStart(MyChar);
EDIT: also worked with localhost:58876/Products/Product1
Just match anything but a slash: /^https?:\/\/([^\/]+)\/.*$/
var url = 'http://localhost:58876/Products/Product1';
var match = url.match(/^https?:\/\/([^\/]+)\/.*$/);
if(match&&match.length>0)document.write(match[1]);
Even shorter: /\/\/([^\/]+)/. Note that there are (a lot) better ways to parse URLs. Depending on your platform, there’s PHP’s parse_url, NodeJS’s url module or libraries like uri.js that handle the many faces of valid URIs.
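Since the question is about C#, the same two ideas can be sketched in .NET: let System.Uri do the parsing, or port the "anything but a slash" regex. The helper names HostWithUri and HostWithRegex are made up for this example, and the Uri variant assumes the input is a well-formed absolute URL (the constructor throws otherwise).

```csharp
using System;
using System.Text.RegularExpressions;

// Option 1: let System.Uri parse the URL. Authority is the host name plus
// the port when the port is not the default one for the scheme.
string HostWithUri(string url) => new Uri(url).Authority;

// Option 2: the "anything but a slash" regex in .NET syntax.
// Group 1 captures everything between "//" and the next "/".
string HostWithRegex(string url)
{
    Match m = Regex.Match(url, @"^https?://([^/]+)");
    return m.Success ? m.Groups[1].Value : string.Empty;
}

Console.WriteLine(HostWithUri("http://localhost:58876/Products/Product1"));
Console.WriteLine(HostWithRegex("https://localhost:58876/Products/Product1"));
// both lines print: localhost:58876
```

The Uri route is probably the safer default, since it also copes with user info, IPv6 hosts and similar cases that a short regex ignores.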

Regex in C# - remove quotes and escaped quotes from a value after another value

I am using HighCharts and am generating script from C# and there's an unfortunate thing where they use inline functions for formatters and events. Unfortunately, I can't output JSON like that from any serializer I know of. In other words, they want something like this:
"labels":{"formatter": function() { return Highcharts.numberFormat(this.value, 0); }}
And with my serializers available to me, I can only get here:
"labels":{"formatter":"function() { return Highcharts.numberFormat(this.value, 0); }"}
These are used for click events as well as formatters, and I absolutely need them.
So I'm thinking regex, but it's been years and years and also I was never a regex wizard.
What kind of Regex replace can I use on the final serialized string to replace any quoted value that starts with function() with the unquoted version of itself? Also, the function itself may have " in it, in which case the quoted string might have \" in it, which would need to also be replaced back down to ".
I'm assuming I can use a variant of the first answer here:
Finding quoted strings with escaped quotes in C# using a regular expression
but I can't seem to make it happen. Please help me for the love of god.
I've put more sweat into this, and I've come up with
serialized = Regex.Replace(serialized, @"""function\(\)[^""\\]*(?:\\.[^""\\]*)*""", "function()$1");
However, my end result is always:
formatter:function()$1
This tells me I'm matching the proper stuff, but my capture isn't working right. Now I feel like I'm probably being an idiot with some C# specific regex situation.
Update: Yes, I was being an idiot. I didn't have a capture around what I really wanted.
serialized = Regex.Replace(serialized, @"""function\(\)([^""\\]*(?:\\.[^""\\]*)*)""", "function()$1");
that gets my match, but in a case like this:
"formatter":"function() { alert(\"hi!\"); return Highcharts.numberFormat(this.value, 0); }"
it returns:
"formatter":function() { alert(\"hi!\"); return Highcharts.numberFormat(this.value, 0); }
and I need to get those nasty backslashes out of there. Now I think I'm truly stuck.
Regexp for match
"function\(\) (?<code>.*)"
Replace expression
function() ${code}
Try this : http://regexr.com?30jpf
What it does :
Finds double quotes JUST before a function declaration and immediately after it.
Regex :
(")(?=function()).+(?<=\})(")
Replace groups 1 & 3 with nothing :
3 capturing groups:
group 1: (")
group 2: ()
group 3: (")
string serialized = JsonSerializer.Serialize(chartDefinition);
serialized = Regex.Replace(serialized, @"""function\(\)([^""\\]*(?:\\.[^""\\]*)*)""", "function()$1").Replace("\\\"", "\"");
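One caveat with chaining .Replace onto the whole string is that it also unescapes \" inside ordinary string values, which would break the JSON. A possible alternative is to pass a MatchEvaluator to Regex.Replace so the unescaping only touches the matched function bodies. This is a sketch under that assumption; the UnquoteFunctions name and the sample JSON are invented for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: unquotes JSON string values that start with function()
// and turns \" back into " only inside those values.
string UnquoteFunctions(string json) =>
    Regex.Replace(
        json,
        @"""(function\(\)[^""\\]*(?:\\.[^""\\]*)*)""",
        // The lambda is the MatchEvaluator: it receives each match and
        // returns the replacement text for it.
        m => m.Groups[1].Value.Replace("\\\"", "\""));

string serialized =
    "{\"title\":\"say \\\"hi\\\"\",\"formatter\":\"function() { alert(\\\"hi!\\\"); }\"}";
Console.WriteLine(UnquoteFunctions(serialized));
// prints: {"title":"say \"hi\"","formatter":function() { alert("hi!"); }}
```

Other JSON escapes inside the function body (for example \\ or \n) are left as they are here and would need the same treatment if they can occur.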

Converting C# regex to javascript

I am converting my C# program to JavaScript for a Google Chrome extension.
Here is the C# regex:
Did you mean: </span><a href=/search.[a-zA-Z0-9=&;_-]{1,}q=[a-zA-Z0-9+-]{1,}
How can I match the same thing in JavaScript? The same regex doesn't work.
Edit:
The input String is:
>Did you mean: </span><a href=/search?hl=en&safe=off&&sa=X&ei=hD9PTYKpKcKtgQei4pUP&ved=0CBIQBSgA&q=Linkin+Park-In+The+End&spell=1class=spell>Linkin Park-In
I need to match:
Did you mean: </span><a href=/search?hl=en&safe=off&&sa=X&ei=hD9PTYKpKcKtgQei4pUP&ved=0CBIQBSgA&q=Linkin+Park-In+The+End
Note: Quotes have been filtered out
Can you try this one in JavaScript and see whether it does what it should?
var rx = /Did you mean: <\/span><a href=["']?\/search.[a-zA-Z0-9=&;_-]+q=[a-zA-Z0-9+-]+/;
I may have escaped a few characters too many. I've also changed each {1,} to +, which is equivalent, and added a check for a quotation mark (single or double) that may be present after href.
If I execute this in Firebug on this stackoverflow page:
var rx = /Did you mean: <\/span><a href=["']?\/search.[a-zA-Z0-9=&;_-]+q=([a-zA-Z0-9+-]+)/;
rx.exec($(document.body).text());
It does find the whole text and since I've captured q variable as well it also displays Linkin+Park...

Extracting a string starting with x and ending with y

First of all, I did a search on this and was able to find how to use something like String.Split() to extract the string based on a condition. I wasn't able to find however, how to extract it based on an ending condition as well. For example, I have a file with links to images: http://i594.photobucket.com/albums/tt27/34/444.jpghttp://i594.photobucket.com/albums/as/asfd/ghjk6.jpg
You will notice that all the images start with http:// and end with .jpg. However, .jpg is followed by http:// without a space, making this a little more difficult.
So basically I'm trying to find a way (Regex?) to extract a string from a string that starts with http:// and ends with .jpg
Regex is the easiest way to do this. If you're not familiar with regular expressions, you might check out Regex Buddy. It's a relatively cheap little tool that I found extremely useful when I was learning. For your particular case, a possible expression is:
(http://.+?\.jpg)
It probably requires some more refinement, as there are boundary cases that could trip this up, but it would work if the file is a simple list.
You can also do free quick testing of expressions here.
Per your latest comment, if you have links to other non-images as well, then you need to make sure it doesn't start at the http:// for one link and read all the way to the .jpg for the next image. Since URLs are not allowed to have whitespace, you can do it like this:
(http://[^\s]+\.jpg)
This basically says, "match a string starting with http:// and ending with .jpg where there is at least one character between the two and none of those characters are whitespace".
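The two patterns behave differently on the run-together sample from the question, which is worth checking before picking one. The sketch below runs both over that sample; FindAll is a made-up helper, not part of any answer above.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: returns every match of the pattern as a string.
string[] FindAll(string input, string pattern) =>
    Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value).ToArray();

string links = "http://i594.photobucket.com/albums/tt27/34/444.jpg" +
               "http://i594.photobucket.com/albums/as/asfd/ghjk6.jpg";

// Lazy quantifier: each match stops at the first ".jpg" it reaches.
string[] lazy = FindAll(links, @"http://.+?\.jpg");

// Greedy [^\s]+ only stops at whitespace, so with no separator between
// the links it runs on to the last ".jpg" and returns one long match.
string[] greedy = FindAll(links, @"http://[^\s]+\.jpg");

Console.WriteLine(lazy.Length);    // prints: 2
Console.WriteLine(greedy.Length);  // prints: 1
```

So the whitespace-based pattern suits files where links are separated by spaces or newlines, while the lazy pattern suits the concatenated case.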
Regex RegexObj = new Regex("http://.+?\\.jpg");
Match MatchResults = RegexObj.Match(subject);
while (MatchResults.Success) {
    // MatchResults.Value holds the current URL; do something with it
    MatchResults = MatchResults.NextMatch();
}
In your specific case, you could always split it by ".jpg". You will probably end up with one empty element at the end of the array, and have to append the .jpg at the end of each file if you need that. Apart from that I think it would work.
Tested the following code and it worked fine:
public void SplitTest()
{
    string test = "http://i594.photobucket.com/albums/tt27/34/444.jpghttp://i594.photobucket.com/albums/as/asfd/ghjk6.jpg";
    // Note: Split removes the ".jpg" separator from each item.
    string[] items = test.Split(new string[] { ".jpg" }, StringSplitOptions.RemoveEmptyEntries);
}
It even gets rid of the empty entry...
The following LINQ will separate by http: and make sure to only get values that end with jpg.
var images = from i in imageList.Split(new[] {"http:"},
                                       StringSplitOptions.RemoveEmptyEntries)
             where i.EndsWith(".jpg")
             select "http:" + i;

Extract substring from string with Regex

Imagine that users are inserting strings in several computers.
On one computer, the pattern in the configuration will extract some characters of that string, let's say position 4 to 5.
On another computer, the extract pattern will return other characters, for instance, last 3 positions of the string.
These configurations (the Regex patterns) are different for each computer, and should be available for change by the administrator, without having to change the source code.
Some examples:
Original_String          Return_Value
User1 - abcd78defg123    78
User2 - abcd78defg123    78g1
User3 - mm127788abcd     12
User4 - 123456pp12asd    ppsd
Can it be done with Regex?
Thanks.
Why do you want to use regex for this? What is wrong with:
string foo = s.Substring(4,2);
string bar = s.Substring(s.Length-3,3);
(you can wrap those up to do a bit of bounds-checking on the length easily enough)
If you really want, you could wrap it up in a Func<string,string> to put somewhere - not sure I'd bother, though:
Func<string, string> get4and5 = s => s.Substring(4, 2);
Func<string,string> getLast3 = s => s.Substring(s.Length - 3, 3);
string value = "abcd78defg123";
string foo = getLast3(value);
string bar = get4and5(value);
If you really want to use regex:
^...(..)
And:
.*(...)$
To have a regex capture values for further use you typically use (). Depending on the regex flavor the parentheses may need escaping (POSIX basic syntax uses \( and \)); .NET uses plain (), while [] denotes a character class, not a capture.
Example
User4 - 123456pp12asd ppsd
is most interesting in that you have 2 separate capture areas here. Is there some default rule on how to join them together, or would you then want to be able to specify how to make the result?
Perhaps something like
r/......(..)...(..)/\1\2/ for ppsd
r/......(..)...(..)/\2-\1/ for sd-pp
do you want to run a regex to get the captures and handle them yourself, or do you want to run more advanced manipulation commands?
I'm not sure what you are hoping to get by using RegEx. RegEx is used for pattern matching. If you want to extract based on position, just use substring.
It seems to me that Regex really isn't the solution here. To return a section of a string beginning at position pos (starting at 0) and of length length, you simply call the Substring function as such:
string section = str.Substring(pos, length);
Grouping. You could match on /^.{3}(.{2})/ and then look at group $1 for example.
The question is why? Normal string handling i.e. actual substring methods are going to be faster and clearer in intent.
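If the patterns really do have to live in configuration, one way to get all four sample results is to let each configured pattern mark the wanted pieces with capture groups and join the groups in order. This is only a sketch: the Extract helper, the join-in-order rule and the concrete patterns are assumptions chosen to reproduce the table in the question.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: applies an administrator-supplied pattern and joins
// all capture groups in order. Returns "" when the pattern does not match.
string Extract(string input, string pattern)
{
    Match m = Regex.Match(input, pattern);
    if (!m.Success) return string.Empty;
    // Groups[0] is the whole match, so skip it and keep only the captures.
    return string.Concat(m.Groups.Cast<Group>().Skip(1).Select(g => g.Value));
}

// In practice these patterns would be read from each computer's configuration.
Console.WriteLine(Extract("abcd78defg123", @"^.{4}(..)"));          // User1: 78
Console.WriteLine(Extract("abcd78defg123", @"^.{4}(..).{3}(..)"));  // User2: 78g1
Console.WriteLine(Extract("mm127788abcd", @"^..(..)"));             // User3: 12
Console.WriteLine(Extract("123456pp12asd", @"^.{6}(..).{3}(..)"));  // User4: ppsd
```

For purely positional rules Substring remains simpler and faster; the regex route mainly buys the ability to change the rule without touching the code.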
