I'm looking for a C# function that takes a URL as a parameter and returns all the inbound URLs related to that URL.
You can "download" webpages using the WebClient class:
// requires: using System.Net;
string url = "http://www.google.com";
WebClient client = new WebClient();
string source = client.DownloadString(url); // the page's HTML as a single string
Then you need to search the source for all URLs. I'd love to write a RegEx for you if you'd put some effort into finding the answer, which apparently you didn't.
Writing one of those regular expressions is rather hard because there are so many different things you have to match:
Relative URLs
Absolute URLs
IPs
You have to consider the base tag
URLs count only if they're in specific tags (a, img, link, script, and on and on)
Good luck with that.
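To give a rough idea of what such a search could look like, here is a minimal sketch. It only collects quoted href/src attribute values and resolves them against the page URL, so it covers relative and absolute URLs but ignores the base tag, unquoted attributes and anything built by script. The class and method names (LinkExtractor, Extract) are made up for this example, and note that it finds the links on the downloaded page, not inbound links to it.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LinkExtractor
{
    // Matches href="..." or src='...' attribute values that do not start with '#'.
    // Deliberately simple: unquoted attributes and the <base> tag are not handled.
    private static readonly Regex AttributePattern = new Regex(
        @"(?:href|src)\s*=\s*[""']([^""'#][^""']*)[""']",
        RegexOptions.IgnoreCase);

    // Returns every link found in the HTML, resolved against the page's own URL.
    public static List<Uri> Extract(string html, Uri pageUri)
    {
        var links = new List<Uri>();
        foreach (Match match in AttributePattern.Matches(html))
        {
            // This TryCreate overload resolves relative URLs against pageUri
            // and leaves absolute URLs as they are.
            if (Uri.TryCreate(pageUri, match.Groups[1].Value, out Uri resolved))
            {
                links.Add(resolved);
            }
        }
        return links;
    }
}
```

You would call it as LinkExtractor.Extract(source, new Uri(url)) with the source string downloaded above. For anything beyond a quick script, an HTML parser is likely a safer choice than a regular expression.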
From your description, you want to find the "inbound" URLs for a URL, that is, other pages that link to it. If that is the case, you would need to connect to an API to retrieve that information. I don't think Google has one, but I do know they exist.
I have a link to a resource:
http://example.com/category/id/../test1.html
When I request this resource, the URL I see on the server no longer contains the id segment or the /../ dots.
I tried to catch these dots in global.asax Application_BeginRequest, in custom modules, and in the IIS logs. The result is always the same: the URL arrives without the id and the /../
http://example.com/category/test1.html
At which level can I extract the id?
As far as I know, ../ in a URL means the parent path, so the browser resolves it and requests the resulting URL instead of the original one. We can't change this browser behavior with URL rewriting or anything else on the server, since the URL that reaches the server has already been modified.
In my opinion, the only way to get the ID is to change the URL format or encode the URL.
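The same normalization can be reproduced with System.Uri, which also removes dot segments from http URLs before a request is sent. The sketch below shows that, plus one possible way to "change the URL format": carrying the id in the query string, which dot-segment removal does not touch. DotSegmentDemo and WithIdInQuery are hypothetical names for this illustration, and the query-string layout is an assumption, not something from the original question.

```csharp
using System;

public static class DotSegmentDemo
{
    // Returns the URL the way System.Uri would send it,
    // with "." and ".." path segments already resolved.
    public static string Normalize(string url)
    {
        return new Uri(url).AbsoluteUri;
    }

    // Hypothetical alternative URL format: the id travels in the query string,
    // so path normalization cannot drop it.
    public static string WithIdInQuery(string baseUrl, string id)
    {
        return baseUrl + "?id=" + Uri.EscapeDataString(id);
    }
}
```

With these helpers, DotSegmentDemo.Normalize("http://example.com/category/id/../test1.html") returns "http://example.com/category/test1.html", which matches what the server sees.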
I have a form where users enter their website. The problem is that some users put in their email address, which I do not want. I want a way to check that the URL is well structured, e.g. no #, and it must have a root domain. The www subdomain is optional. I am unable to find this anywhere.
I have tried this code:
if (!Uri.TryCreate("http://" + websiteurl, UriKind.Absolute, out uri) || null == uri)
which returns false on error, but my problem is that it still validates input without a root domain. For example, I can put in
http://websitename
and it validates fine, which I do not want. It does return false when I put in
http://websitename#.
Is there a way I can overcome this problem? Also, I prepend http:// to the value I pass in, because otherwise the URL never validates.
You can use:
Uri.IsWellFormedUriString(inputUrl, UriKind.RelativeOrAbsolute)
Depending on your performance needs, maybe issuing a quick HttpWebRequest for the website url they give and verifying that you get back a success response might be a good option.
You could try with a regular expression.
Uri.IsWellFormedUriString won't solve the problem here, which includes the ability to distinguish a valid URL from an email address. Both are well-formed URIs.
Use a regular expression. Here's one from the MS forums using C#:
Url validation with Regular Expression
But you should really validate this before it gets sent to the server. If you use the Peter Blum validators, he's already done the work for you.
Peter Blum's Validators
Or if you want to put in your own JavaScript file, check out this StackOverflow thread.
Url Validation using jQuery
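As an alternative to a regular expression, the checks from the question can also be sketched on top of Uri.TryCreate. This is only a rough illustration under a few assumptions: "has a root domain" is read as "the host contains a dot with text on both sides", bare IP addresses are rejected, and the names WebsiteUrlValidator and IsValidWebsite are invented for the example.

```csharp
using System;

public static class WebsiteUrlValidator
{
    public static bool IsValidWebsite(string input)
    {
        if (string.IsNullOrWhiteSpace(input)) return false;

        string candidate = input.Trim();
        if (candidate.IndexOf('#') >= 0) return false;          // no '#' allowed

        // Prepend a scheme when the user left it out, as in the question.
        if (!candidate.StartsWith("http://", StringComparison.OrdinalIgnoreCase) &&
            !candidate.StartsWith("https://", StringComparison.OrdinalIgnoreCase))
        {
            candidate = "http://" + candidate;
        }

        if (!Uri.TryCreate(candidate, UriKind.Absolute, out Uri uri)) return false;
        if (uri.Scheme != Uri.UriSchemeHttp && uri.Scheme != Uri.UriSchemeHttps) return false;

        // "user@example.com" parses as user info "user" plus host "example.com",
        // so a non-empty UserInfo is how an email address shows up here.
        if (uri.UserInfo.Length > 0) return false;

        // Assumption: only DNS names count, so IP addresses are rejected.
        if (uri.HostNameType != UriHostNameType.Dns) return false;

        // Require a dot with something on both sides, e.g. "example.com".
        string host = uri.Host;
        int lastDot = host.LastIndexOf('.');
        return lastDot > 0 && lastDot < host.Length - 1;
    }
}
```

Under these rules "www.example.com" and "example.com" pass, while "websitename" and "user@example.com" are rejected. It does not check that the top-level domain actually exists.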
In my project I don't want to show query string values to users. For that I used URL rewriting in ASP.NET, so my URL changes as shown below.
http://localhost/test/default.aspx?id=1
to
http://localhost/test/general.aspx
The first URL is rewritten to the second URL, but the server still executes the default.aspx page with that query string value. This is working fine.
But my question is: is there any way the user can find the original URL in the browser?
The answer is no.
The browser can't tell what actual script ended up servicing the request - it only knows what it sent to the server (unless the server issued a redirect, but then the browser would make a new request to the redirect target).
Since URL rewriting takes an incoming request and routes it to a different resource, I believe the answer is yes. Somewhere in your web traffic you are requesting http://localhost/test/default.aspx?id=1 and it is being rewritten as the new request http://localhost/test/general.aspx.
While this may hide the original request from displaying in the browser, at some point it did send that original URL as an HTTP GET.
As suggested, use Firebug or Fiddler to sniff the traffic.
I figured out the answer to my question. The rewritten URLs are easy to find: if we view the source of the page in the browser, we can see the original URL with its query string values.
I'm working on a real estate website. It would be ideal to have my client's featured properties have their own unique URL like:
www.realestatewebsite.com/featured/123-fake-st/
I'm constructing a CMS for my client so that they can add/delete featured properties in an admin backend, meaning that I need to write a program to automatically add the new URL for them based on the address they input in the database through the CMS.
I'm new to URL Rewrite. What would be the best way to go about this? I've considered using RewriterConfig in the web.config, but I'm worried I would run into problems writing a program that adds new rules to the web.config file. I thought about using a regex in the RewriterRule to match anything after /featured/ in the URL, but if I'm just using the address in the LookFor, how would it know which property ID to use in the SendTo?
It would be ideal if I could just have a file put the address after "/featured/" into a string, look in the database for the address and retrieve the Property ID and then redirect the users that way.
As I said, I'm new to URL Rewriting and it would be great if someone could point me in the right direction.
Thanks!
-Aaron
There are different ways of doing this. Common to all solutions are the following:
Set up an algorithm to create the URIs and store them in the database (changing spaces to - is a simple way to achieve this).
Route the URI by making the address string into a parameter
Routing can be done a variety of ways.
If you have control of the server, or they have control of the server, you have the ability to set up IIS rewriting on the IIS instance on their server (good starter URI).
If this is hosted on an ISP, you may not be able to use IIS rewriting and will have to use ASP.NET routing instead. Here is a good article to start with to understand this. If you are using MVC, the routing is "built in".
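Whichever routing option you pick, the two common steps could look roughly like the sketch below: one helper turns an address into the URI fragment to store in the database, and one pulls that fragment back out of an incoming path so it can be used as the lookup parameter. The names FeaturedSlugs, ToSlug and SlugFromPath are invented for this example, the exact slug rules are an assumption, and the database lookup itself is left out.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FeaturedSlugs
{
    // Turns an address such as "123 Fake St." into "123-fake-st":
    // lowercase, every run of other characters becomes one hyphen.
    public static string ToSlug(string address)
    {
        string lower = address.Trim().ToLowerInvariant();
        string hyphenated = Regex.Replace(lower, @"[^a-z0-9]+", "-");
        return hyphenated.Trim('-');
    }

    // Extracts the slug from a path like "/featured/123-fake-st/".
    // Returns null when the path is not a featured-property URL.
    public static string SlugFromPath(string path)
    {
        Match match = Regex.Match(path, @"^/featured/([a-z0-9-]+)/?$", RegexOptions.IgnoreCase);
        return match.Success ? match.Groups[1].Value.ToLowerInvariant() : null;
    }
}
```

The CMS would store ToSlug(address) next to the property ID when a property is added, and the request handler would look up the property by SlugFromPath(Request.Path). Two addresses can produce the same slug, so a uniqueness check at save time is probably worth adding.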
I would suggest using URL Rewrite Module for IIS7, look here:
http://learn.iis.net/page.aspx/460/using-the-url-rewrite-module/
I want to check whether a URL is from the youtube.com website or the mobile version of the site.
Is there a robust way to do this?
Checking that the URL contains "youtube.com" does not seem good to me.
What's the proper way to do it?
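Use the Uri class to parse the URL and compare against the Host property. Host includes any subdomain (www., m.), so check for those as well as the bare domain.
Uri uri = new Uri(myURL);
string host = uri.Host;
return host.Equals("youtube.com", StringComparison.OrdinalIgnoreCase)
    || host.EndsWith(".youtube.com", StringComparison.OrdinalIgnoreCase);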
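A slightly fuller sketch of the host comparison, wrapped in a method that also copes with input that is not a URL at all. It treats youtube.com and any of its subdomains (www., m.) as YouTube, which is an assumption about what "the mobile version" means here. The class and method names are made up for the example.

```csharp
using System;

public static class YouTubeUrl
{
    public static bool IsYouTube(string url)
    {
        // TryCreate avoids the exception that new Uri(...) throws on bad input.
        if (!Uri.TryCreate(url, UriKind.Absolute, out Uri uri)) return false;
        if (uri.Scheme != Uri.UriSchemeHttp && uri.Scheme != Uri.UriSchemeHttps) return false;

        // Compare the parsed host, not the raw string: the bare domain
        // or any subdomain ending in ".youtube.com" is accepted.
        string host = uri.Host;
        return host.Equals("youtube.com", StringComparison.OrdinalIgnoreCase)
            || host.EndsWith(".youtube.com", StringComparison.OrdinalIgnoreCase);
    }
}
```

This is why a plain Contains("youtube.com") check is not good enough: "http://youtube.com.evil.example/" and "http://notyoutube.com/" both contain the text but are rejected here, while "https://m.youtube.com/watch?v=abc" is accepted.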
I don't know a foolproof way to make sure it's youtube.com coming in, but checking REFERER is really not all that solid: the page linking to you can fake its referer header any time it wants to:
http://www.stardrifter.org/refcontrol/
It'll be interesting to see how the security gurus answer this question.
-- pete