I am converting my C# program to JavaScript for a Google Chrome extension.
Here is the C# regex:
Did you mean: </span><a href=/search.[a-zA-Z0-9=&;_-]{1,}q=[a-zA-Z0-9+-]{1,}
How can I match the same thing in JavaScript? The same regex doesn't work.
Edit:
The input String is:
>Did you mean: </span><a href=/search?hl=en&safe=off&&sa=X&ei=hD9PTYKpKcKtgQei4pUP&ved=0CBIQBSgA&q=Linkin+Park-In+The+End&spell=1class=spell>Linkin Park-In
I need to match:
Did you mean: </span><a href=/search?hl=en&safe=off&&sa=X&ei=hD9PTYKpKcKtgQei4pUP&ved=0CBIQBSgA&q=Linkin+Park-In+The+End
Note: Quotes have been filtered out
Can you try this one in JavaScript and check whether it does what it should?
var rx = /Did you mean: <\/span><a href=["']?\/search.[a-zA-Z0-9=&;_-]+q=[a-zA-Z0-9+-]+/;
I may have escaped a few characters too many. I've also changed the {1,} quantifiers to +, which is equivalent, and added a check for an optional quotation mark (single or double) that may be present after href.
If I execute this in Firebug on this Stack Overflow page:
var rx = /Did you mean: <\/span><a href=["']?\/search.[a-zA-Z0-9=&;_-]+q=([a-zA-Z0-9+-]+)/;
rx.exec($(document.body).text());
It does find the whole text, and since I've captured the q variable as well, it also displays Linkin+Park...
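To check the pattern outside the browser, here is a minimal sketch that runs it against the input string quoted in the question. The variable names are only illustrative, and the input is assumed to arrive exactly as shown, with the quotes already stripped.

```javascript
// Input copied from the question (quotes already filtered out).
const input = ">Did you mean: </span><a href=/search?hl=en&safe=off&&sa=X&ei=hD9PTYKpKcKtgQei4pUP&ved=0CBIQBSgA&q=Linkin+Park-In+The+End&spell=1class=spell>Linkin Park-In";

// Same pattern as in the answer; the parentheses capture the value of q.
const rx = /Did you mean: <\/span><a href=["']?\/search.[a-zA-Z0-9=&;_-]+q=([a-zA-Z0-9+-]+)/;

const match = rx.exec(input);
if (match) {
  console.log(match[0]); // the whole matched text
  console.log(match[1]); // the captured q value
}
```

exec returns null when nothing matches, so the if guard is worth keeping whenever the page content is not under your control.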
For the string below, I want to select only the inner script tag containing the URL http://cdn.walkme.com/users and replace the selected tag with an empty string. Can somebody help me with the regex pattern?
<script><script type="text/javascript">(function() {var walkme = document.createElement('script'); walkme.type = 'text/javascript'; walkme.async = true; walkme.src='http://cdn.walkme.com/users/cb643dab0d6f4c7cbc9d436e7c06f719/walkme_cb643dab0d6f4c7cbc9d436e7c06f719.js'; var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(walkme, s); window._walkmeConfig = {smartLoad:true}; })();</script></script>
I have tried this < script(.+)http://cdn.walkme.com/users/.+?\/script>
I agree that it's not really possible to have a comprehensive, generic regex that parses any (X)HTML the standard supports. That is true just by the nature of these things.
But you're perfectly fine to do lots of smaller cool tasks using Regex. Just like in your case, in order to strip particular script out of the page markup, you could just use the following regex to find an entry and then replace it with an empty string:
\<script\>\<script type="text/javascript"\>\(function\(\) \{var walkme =.*\</script\>
It does a very simple thing: it takes everything in between
<script><script type="text/javascript">(function() {var walkme =
(you can include more text to be more specific) and
</script>
Just ensure special symbols (like /, ( or )) are escaped properly.
Edited
To select only the inner tag, make the quantifier lazy and use what is called a positive lookahead, so the match stops at the first closing tag after the opening one:
<script type="text/javascript">\(function\(\) {var walkme =.*?(?=</script>)
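As a rough JavaScript sketch of the same idea, the snippet below removes only the inner tag that references cdn.walkme.com/users. Unlike the lookahead version, it consumes the closing tag so that the tag is removed as well. The sample markup is a shortened stand-in for the string in the question, and EXAMPLE_ID is a made-up placeholder.

```javascript
// Shortened stand-in for the markup in the question (EXAMPLE_ID is hypothetical).
const html = `<script><script type="text/javascript">(function() {var walkme = document.createElement('script'); walkme.src='http://cdn.walkme.com/users/EXAMPLE_ID/walkme_EXAMPLE_ID.js'; })();</script></script>`;

// Start at the inner opening tag, refuse to cross a closing tag before the
// walkme URL is found, then lazily run to the first closing tag after it.
const innerWalkmeScript = /<script type="text\/javascript">(?:(?!<\/script>)[\s\S])*?cdn\.walkme\.com\/users[\s\S]*?<\/script>/g;

const cleaned = html.replace(innerWalkmeScript, "");
console.log(cleaned); // prints "<script></script>"
```

[\s\S] is used instead of . so the pattern also works when the script body spans several lines.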
I want to set redirection from
www.somesite.com/products/dynamicstring/randomtext1/randomtext2
to www.somesite.com/products/dynamicstring
Is it possible to do that with a regex?
That is, if my incoming URL is
www.somesite.com/products/myproducts/test1/test2, it should redirect to www.somesite.com/products/myproducts/
To explain a bit more:
@TomLord I am using HttpContext.Current.Response.RedirectPermanent(matchingDefinition.To). I have all the redirect "From" and "To" values in a class object, in the form of regular expressions, for example From "/product/*" and To "/products". I read these objects and try to redirect them, but I am not able to redirect something like /products/dynamicstring/randomtext1/ to /products/dynamicstring, where dynamicstring is a random string. I can't find any regular expression that can be used to do this. For example, /products/samples/randomtext1 should redirect to /products/samples/
Redirection cannot be done with regex alone. It is worth reading up on what a regular expression really is. The short answer: it's a string-like expression that describes a search pattern. So it can't redirect, replace a substring with another substring, or do anything other than match and capture parts of the matched string.
That being said, regex can help us do what you want. I am going to assume you can use JavaScript, because I can't put a solution in every language. I am also going to assume you will go over the code rather than copy, paste and press Enter. If that is all you need, hire a programmer. If you use another language, the principle should be the same:
obtain URL
define regex
use capture group to extract the part of your URL that you need
construct a new URL
redirect to it
While matching the URLs in general is a fair bit more complex, like:
^(?:https?://)?(?:[\w]+\.)(?:\.?[\w]{2,})+$
As long as you are sure you will only be getting URLs and in the format you wanna, we will do it far simpler.
Basically, let's say you have:
some text with 2 dots that ends in com
then a /products/dynamicstring/
then text
then /
then text
As a regex that is:
/\w*\.\w*\.com\/products\/dynamicstring\/\w*\/\w*/g
Crude matching is done, but we still need to add a capture group that we will use to extract the part of the string we need:
/(\w*\.\w*\.com\/products\/)dynamicstring\/\w*\/\w*/g
OK, now let's leverage this regex to do the rest of the work:
Define regex:
var regex = /(\w*\.\w*\.com\/products\/)dynamicstring\/\w*\/\w*/g;
Get current URL. If you already have URL use it.
var currUrl = window.location.href;
Extract capture group from string:
var match = regex.exec(currUrl);
Use that to get a new URL from old one:
var redirectUrl = match[1] + "dynamicstring/";
Finally, we redirect with:
window.location.replace(redirectUrl);
I wrote all this straight from my head so I recommend you go over each step, look how it works, read some documentation about functions used. You might find an error as well as learn a lot.
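Since the segment after /products/ is meant to be dynamic, one possible generalization is to capture that segment instead of hard-coding it. The sketch below is only an illustration under that assumption: buildRedirectUrl is a hypothetical helper name, and the pattern assumes the segment after /products/ never contains a slash.

```javascript
// Returns the URL trimmed to ".../products/<segment>/", or null when the
// URL has nothing after that segment (so no redirect is needed).
function buildRedirectUrl(url) {
  // Group 1: everything up to and including the first segment after /products/.
  // After it we require at least one more path segment.
  const pattern = /^(.*?\/products\/[^\/]+\/)[^\/]+(?:\/.*)?$/;
  const match = pattern.exec(url);
  return match ? match[1] : null;
}

const target = buildRedirectUrl("www.somesite.com/products/myproducts/test1/test2");
console.log(target); // prints "www.somesite.com/products/myproducts/"

// In a browser, the redirect step would then be:
// if (target) window.location.replace(target);
```

Returning null for URLs that are already in the short form avoids a redirect loop.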
I am using a regex to extract URLs from a string, and it mostly works:
var regex=new Regex("<a [^>]*href=(?:'(?<href>.*?)')|(?:\"(?<href>.*?)\")",RegexOptions.IgnoreCase);
The following strings work fine:
"This is Test page <a href='test.aspx'>test page</a>"
"This is Test page <a href='test1.aspx'>test</a> another one <a href='test2.aspx'>test</a>"
"This is Tests\"s page <a href='test1.aspx'>test</a> another one <a href='test2.aspx'>test</a>"
"This is Test page"
"This is Test page\"s without problem"
But sometimes it does not return a good result. The following code returns a bad result (the string contains 2 double quotes):
var inputString="This string create \"problem\" for me";
var regex=new Regex("<a [^>]*href=(?:'(?<href>.*?)')|(?:\"(?<href>.*?)\")",RegexOptions.IgnoreCase);
var urls=regex.Matches(inputString).OfType<Match>().Select(m =>m.Groups["href"].Value);
foreach (var url in urls)
{
    Console.WriteLine(url);
}
Demo with problem
Could anyone help me to solve this problem?
Maybe you can change your regex like this: <a .*?href=(?:['"](?<href>[^'"]*?)['"])
In C#: "<a .*?href=(?:['\"](?<href>[^'\"]*?)['\"])"
Solution:
You should use an HTML Parser to get rid of current and further headaches. A tested and working example can be found for example here.
Regex explanation:
As for your regex, it currently fails because the alternation is not enclosed in a group. Thus, it can return strings that have no <a... href inside them. Moreover, there are other issues that you can have with your current regex.
A "fixed" regex (meaning it will be capable of handling escaped entities and both double and single quotes) would look like:
(?i)<a\b[^<]*href=(?:(?:'(?<href>[^'\\]*(?:\\.[^'\\]*)*)')|(?:\"(?<href>[^\"\\]*(?:\\.[^\"\\]*)*))\")
But it is unlikely you can fully rely on regex when parsing HTML. Use the solution, not a workaround.
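To illustrate the grouping point in isolation, here is a small JavaScript sketch rather than the C# from the question. It replaces the two alternatives with a backreference to the opening quote, so the whole pattern stays anchored to the <a ... href= prefix. extractHrefs is a hypothetical helper name, and the sketch does not handle escaped quotes inside the value.

```javascript
// Collects href values from anchor tags; the quote style is captured in
// group 1 and must be matched again by \1 at the end of the value.
function extractHrefs(text) {
  const linkPattern = /<a\b[^>]*?\bhref=(['"])(.*?)\1/gi;
  return Array.from(text.matchAll(linkPattern), (m) => m[2]);
}

// The string from the question that produced a false match: no anchor tag, no result.
console.log(extractHrefs('This string create "problem" for me'));

// Both quote styles are picked up.
console.log(extractHrefs("one <a href='test1.aspx'>test</a> two <a href=\"test2.aspx\">test</a>"));
```

Because the quoted part can no longer match on its own, text that merely contains two double quotes yields an empty array.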
I have some data from a lookup like this: =winz\ach'dull.
How can I replace single quotes (') with ("")?
This is my code:
<input type="button" id="btnSelect" onclick="Select('<%#Eval("LoginName").ToString().Replace("'", "\'")%>');" value="Select"/>
I'm trying to create code like this:
Select('<%#Eval("LoginName").ToString().Replace("'", "\'")%>');
but it does not work.
Please correct and help me. Thanks.
In pure JavaScript we could do:
var a = "winz\\ach'dull.";
alert(a.replace(/'/g, '"'));
And that would replace your single quote.
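One detail to watch for: when the first argument to replace is a plain string, only the first occurrence is replaced, while a regex with the g flag replaces all of them. A short sketch with a made-up value containing two single quotes:

```javascript
// Hypothetical login name with a backslash and two single quotes.
// The backslash must be doubled in a JavaScript string literal.
const loginName = "winz\\ach'dull's";

// String pattern: only the first single quote is replaced.
const firstOnly = loginName.replace("'", '"');

// Regex with the g flag: every single quote is replaced.
const allQuotes = loginName.replace(/'/g, '"');

console.log(firstOnly);
console.log(allQuotes);
```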
Note: Your code is C#, not JavaScript.
You can escape quotes with the "\" character and it works perfectly with HTML. So the answer to exactly what you wrote would be: (this is just to humour you in the future)
"Select('<%#Eval(\"LoginName\").ToString().Replace(\"'\", \"\'\")%>');"
But you have syntax errors in what you are writing and that Eval stuff is not javascript so I don't know why ToString and Replace are attached to it. I've changed it a little based on guessing what you're trying to do:
<input onclick="Select('<%#Eval("LoginName")%>').ToString().Replace(\"'\", \"'\");">
Note that if you're using C# or something similar on the server side, it doesn't need to be escaped, because by the time the HTML is parsed into the DOM (typically by a browser), the source no longer contains your server-side code, only its output!
I am using HighCharts and am generating script from C# and there's an unfortunate thing where they use inline functions for formatters and events. Unfortunately, I can't output JSON like that from any serializer I know of. In other words, they want something like this:
"labels":{"formatter": function() { return Highcharts.numberFormat(this.value, 0); }}
And with my serializers available to me, I can only get here:
"labels":{"formatter":"function() { return Highcharts.numberFormat(this.value, 0); }"}
These are used for click events as well as formatters, and I absolutely need them.
So I'm thinking regex, but it's been years and years and also I was never a regex wizard.
What kind of Regex replace can I use on the final serialized string to replace any quoted value that starts with function() with the unquoted version of itself? Also, the function itself may have " in it, in which case the quoted string might have \" in it, which would need to also be replaced back down to ".
I'm assuming I can use a variant of the first answer here:
Finding quoted strings with escaped quotes in C# using a regular expression
but I can't seem to make it happen. Please help me for the love of god.
I've put more sweat into this, and I've come up with
serialized = Regex.Replace(serialized, @"""function\(\)[^""\\]*(?:\\.[^""\\]*)*""", "function()$1");
However, my end result is always:
formatter:function()$1
This tells me I'm matching the proper stuff, but my capture isn't working right. Now I feel like I'm probably being an idiot with some C# specific regex situation.
Update: Yes, I was being an idiot. I didn't have a capture around what I really wanted.
serialized = Regex.Replace(serialized, @"""function\(\)([^""\\]*(?:\\.[^""\\]*)*)""", "function()$1");
that gets my match, but in a case like this:
"formatter":"function() { alert(\"hi!\"); return Highcharts.numberFormat(this.value, 0); }"
it returns:
"formatter":function() { alert(\"hi!\"); return Highcharts.numberFormat(this.value, 0); }
and I need to get those nasty backslashes out of there. Now I think I'm truly stuck.
Regexp for match
"function\(\) (?<code>.*)"
Replace expression
function() ${code}
Try this : http://regexr.com?30jpf
What it does :
Finds double quotes JUST before a function declaration and immediately after it.
Regex :
(")(?=function()).+(?<=\})(")
Replace groups 1 & 3 with nothing :
3 capturing groups:
group 1: (")
group 2: ()
group 3: (")
string serialized = JsonSerializer.Serialize(chartDefinition);
serialized = Regex.Replace(serialized, @"""function\(\)([^""\\]*(?:\\.[^""\\]*)*)""", "function()$1").Replace("\\\"", "\"");
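For comparison, here is the same idea sketched in JavaScript rather than C#, with one difference: the \" sequences are unescaped only inside the matched function bodies, so escaped quotes in ordinary string values are left alone. unquoteFunctions is a hypothetical helper name, and the sample input is made up.

```javascript
// Turns JSON string values that start with "function()" into bare function
// expressions, unescaping \" only inside those values.
function unquoteFunctions(serialized) {
  // Same shape as the C# pattern: a quoted string with escaped characters
  // allowed, captured without its surrounding quotes.
  const quotedFunction = /"(function\(\)[^"\\]*(?:\\.[^"\\]*)*)"/g;
  return serialized.replace(quotedFunction, (_, body) => body.replace(/\\"/g, '"'));
}

// String.raw keeps the backslashes exactly as a serializer would emit them.
const serialized = String.raw`{"labels":{"formatter":"function() { alert(\"hi!\"); return 1; }"},"title":"a \"quoted\" title"}`;

console.log(unquoteFunctions(serialized));
```

Other escape sequences the serializer may emit inside a function body, such as \\ or \n, would need the same treatment; this sketch handles only \".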