Using *.html extension in dynamic URLs for SEO - c#

My situation: I have a project planned to be built on ASP.NET MVC 2, and one of the major requirements is SEO. The customer wants static-looking URLs that end with a .html extension, in the belief that this makes them more SEO friendly, e.g. "mysite.com/about.html" or "mysite.com/items/getitem/5.html".
Is there any benefit, from an SEO perspective, to using a .html extension in dynamic URLs? Do Google and other search engines rank such URLs better?

I would use sitemaps instead. This lets you keep dynamic content (and use MVC) while still being crawled completely.
See: http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=156184
and http://www.codinghorror.com/blog/2008/10/the-importance-of-sitemaps.html
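As a rough illustration of the sitemap suggestion, the sketch below builds a minimal sitemap document from a list of URLs. The class name SitemapBuilder and the URLs in the usage line are made up for the example; only the urlset/url/loc structure and the namespace come from the sitemaps.org protocol.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of ASP.NET MVC.
public static class SitemapBuilder
{
    // XML namespace defined by the sitemaps.org protocol.
    private static readonly XNamespace Ns =
        "http://www.sitemaps.org/schemas/sitemap/0.9";

    // Builds a <urlset> with one <url><loc> entry per absolute URL.
    public static XDocument Build(IEnumerable<string> absoluteUrls)
    {
        var urlset = new XElement(Ns + "urlset",
            absoluteUrls.Select(u =>
                new XElement(Ns + "url", new XElement(Ns + "loc", u))));
        return new XDocument(urlset);
    }
}
```

Something like SitemapBuilder.Build(new[] { "http://mysite.com/about", "http://mysite.com/items/getitem/5" }).Save("sitemap.xml") would then write the file, with the URL list coming from wherever the site keeps its content.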

No, the base URL doesn't matter. If you're serving dynamic content in .aspx or .html, it's all the same. If you do serve ASP.NET content with .html because of requirements (as dumb as they may be), then I suggest finding an alternative extension (e.g. .htm) for all static content. You don't want your static HTML files getting processed unnecessarily.
As Femaref said, you can use sitemaps to help.
Also, make sure your URL doesn't change (including variables) if the content is the same. This shouldn't be a problem with MVC.
Edit: In your example:
mysite.com/items/getitem/5.html
I'm guessing what you originally wanted is:
mysite.com/items/getitem/5
Having no extension doesn't make a difference either. Since that's not a problem, I would also argue that an extension makes the URL less "clean" and suggests that there is a file called 5.html in that path, which is obviously not true.
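If the .html requirement stays anyway, MVC routing can treat the extension as literal text in the URL pattern. The following is only a sketch: the controller and action names are guessed from the question's example URL, and it assumes IIS is configured to pass .html requests to ASP.NET, which is not the default on every setup.

```csharp
using System.Web.Mvc;
using System.Web.Routing;

public static class RouteConfig
{
    // Intended to be called from Application_Start in Global.asax.
    public static void RegisterRoutes(RouteCollection routes)
    {
        // Matches /items/getitem/5.html and binds id = "5".
        // "Items" and "GetItem" are hypothetical names.
        routes.MapRoute(
            "ItemHtml",
            "items/getitem/{id}.html",
            new { controller = "Items", action = "GetItem" });

        // Extensionless equivalent, e.g. /items/getitem/5.
        routes.MapRoute(
            "Default",
            "{controller}/{action}/{id}",
            new { controller = "Home", action = "Index", id = UrlParameter.Optional });
    }
}
```

Either route reaches the same action, which fits the point above that the extension is cosmetic.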

Search engines don't care at all about what your webpage extensions look like.

If anything, all you're doing is indicating a file type for the page being served.
Can the page be reached?
Is there content?
Are there links pointing to that page?
That's what a search engine is worried about. Anyone can create a custom solution using custom file extensions for a website and have it work just fine.

Related

Use razor/asp.net mvc3 to generate static html pages?

For one project, I have to generate static .html pages, which are going to be published on a remote server.
I have to automate the creation of those files from C# code, which takes data from a SQL Server database.
The data will not change often (every 4-5 months), and the website will get a lot of traffic.
Since I find the Razor syntax of ASP.NET MVC3 very effective, I was wondering whether it's possible to use ASP.NET MVC3/Razor to generate those .html pages.
So:
Is this a good idea?
If yes, what is a good way to do it?
If you can think of a better approach, what is it?
Thank you for the help
Edit
Regarding the answers, I need to clarify: I don't want or need web caching, for several reasons: load (millions of pages loaded every month), integration (our pages are integrated into an optimized Apache setup alongside another part of a website), and the number of pages (caching only helps if the same pages are requested many times, but I will have ~2500 pages, so by Murphy's law, unless I set a very high cache timeout, I will have to regenerate them often). So I'm really looking for something that generates HTML pages.
Edit 2
I just got a new constraint :/ Those templates must be localized, meaning that I need something equivalent to the following Razor code: @MyLocalizationFile.My.MyValue
Edit 3
Currently, I'm thinking of building a dynamic website and issuing HTTP requests against it to store the generated HTML. But is there a way to avoid HTTP, i.e. to simulate an HTTP call, specifying the output stream and the URL called (GET requests only)?
Our previous load numbers were badly underestimated: they actually have a little more than one million visitors each day, ~14 million page loads per day.
Yes, it is. Even when you can cache the results, static HTML pages will always be faster and use fewer server resources.
A good way is to render your Razor views to text and then save the text as an HTML file.
Another way is to use T4 templates, but I recommend Razor.
You can use the Razor Engine (available on NuGet and from its project website). This way you can create templates from a console application without using ASP.NET MVC.
I use it as follows:
// Needs: using System; using System.IO; using System.Text;
// plus RazorEngine.Configuration.Fluent and RazorEngine.Templating
// (namespace names assume RazorEngine 3.x).
public string ParseFile<T>(string fileName, T model)
{
    var sb = new StringBuilder();
    using (var file = File.OpenText(fileName))
    {
        string line;
        while ((line = file.ReadLine()) != null)
        {
            // RazorEngine does not recognize the @model line, so skip it
            if (!line.StartsWith("@model ", StringComparison.OrdinalIgnoreCase))
                sb.AppendLine(line);
        }
    }
    // Make sure we get raw (unescaped) HTML back
    var config = new FluentTemplateServiceConfiguration(
        c => c.WithEncoding(RazorEngine.Encoding.Raw));
    using (var service = new TemplateService(config))
    {
        return service.Parse<T>(sb.ToString(), model);
    }
}
Rather than generating static HTML pages, I think it would be better to dynamically generate the pages each time, but using Caching to increase performance.
See this article on caching with ASP.NET MVC3 for more information:
http://www.asp.net/mvc/tutorials/older-versions/controllers-and-routing/improving-performance-with-output-caching-cs
I ended up creating a normal ASP.NET MVC website and then generating each page by requesting it with a WebClient.
This way I get a preview of the website and can enjoy the full power of Razor and the MVC helpers.
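The WebClient approach described above could look roughly like this. It is a sketch, not the poster's actual code: the class name, the path-to-file naming rule and the parameters are all assumptions made for the example.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

// Hypothetical exporter; names and the naming rule are illustrative only.
public static class StaticSiteExporter
{
    // Maps a site-relative path such as "/items/getitem/5" to
    // "items/getitem/5.html"; the site root "/" becomes "index.html".
    public static string ToFileName(string relativePath)
    {
        string trimmed = relativePath.Trim('/');
        return trimmed.Length == 0 ? "index.html" : trimmed + ".html";
    }

    // Requests each page from the running MVC site and writes the HTML to disk.
    public static void Export(Uri baseUri, string[] relativePaths, string outputDir)
    {
        using (var client = new WebClient())
        {
            client.Encoding = Encoding.UTF8;
            foreach (string path in relativePaths)
            {
                string html = client.DownloadString(new Uri(baseUri, path));
                string target = Path.Combine(outputDir, ToFileName(path));
                // Create nested folders such as items/getitem before writing.
                Directory.CreateDirectory(Path.GetDirectoryName(target));
                File.WriteAllText(target, html, Encoding.UTF8);
            }
        }
    }
}
```

The list of paths would typically come from the same database query that drives the site, so every generated page corresponds to a real record.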
Is there any performance reason you've run into that would merit the effort of pre-rendering the website? How many pages are we talking about? What kind of parameters do your controllers take? If vanilla caching does not satisfy your requirements, for me the best approach would be a disk-based caching provider...
http://www.juliencorioland.net/Archives/en-aspnet-mvc-custom-output-cache-provider
Look at T4 templates or a similar templating solution
I'm working on a similar solution. My website runs normally (ASP.NET + DB + CMS) in a staging environment, and I use wget to crawl it and generate static HTML pages. Those static HTML pages, including assets, are then uploaded to an Amazon S3 bucket. That way the website becomes fully static, with no dependencies.
I'm planning to have a daily task that crawls specific pages on the website to make it speedier, e.g. only crawl /news every day.
I know you've already found a solution, but maybe this answer might be helpful to others.

ASP.Net MVC : How to combine area AND localization

For a large application, I need to use localization. I'm planning to use this method for the localization: http://geekswithblogs.net/shaunxu/archive/2010/05/06/localization-in-asp.net-mvc-ndash-3-days-investigation-1-day.aspx
But for this project I also need to use Areas, which define their own route files.
Is there a way to avoid redefining the language variable in the URI (the {lang} segment) in each area?
I feel like I have to redeclare how URL-based localization works every time, and that seems bad to me.
What can I do to avoid this?
In terms of SEO using an url parameter for localization is the best. The drawback is that you will have to define it for all your urls. Another possibility is to use cookies or session to store the current language in which case you don't need to pass it to all urls. Here's a guide which illustrates this. Easier to develop but not good for SEO.

Is there a simple way in C#/ASP.NET to validate that user input is a URL to guard against XSS attacks?

We've got an interstitial page that warns people when they're leaving our site. The trouble is it takes querystring parameters and blindly generates a page, thus it's vulnerable to XSS attacks. I've been tasked with fixing it and I want to do it right.
You should call Server.HtmlEncode to properly escape your generated HTML.
Yes, try this:
if (Uri.IsWellFormedUriString(url, UriKind.Absolute) && url.StartsWith("http"))
    Response.Write(string.Format("{0}",
        HttpUtility.HtmlEncode(url)));
So, things not to do:
Use regex.
Use HtmlEncode without thought.
Things to do:
Treat all input as untrusted.
Encode input before it is output. However, make sure you're using the right type of encoding. If you put user input in an attribute, use HtmlAttributeEncode; if it's just HTML, use HtmlEncode; if you put it into JavaScript, use JavaScriptEncode. If your JavaScript puts it into a div, then it's HtmlEncode followed by JavaScriptEncode.
Consider using AntiXSS, which provides more encoding mechanisms and uses a safe-list approach, which is inherently safer.
Whitelist the exit URLs so people cannot use this page as an open redirector. Do not have a parameter that contains the URL; rather, have a GUID that looks up the URL from a database, session table or whatever.
(Disclosure : I own AntiXSS)
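The GUID lookup and encoding advice above can be sketched as follows. This is an illustration under assumptions: ExitLinkRegistry is a made-up class, the in-memory dictionary stands in for the database or session table, and WebUtility.HtmlEncode and Guid.TryParse require .NET 4.0 or later (AntiXSS or HttpUtility could be substituted).

```csharp
using System;
using System.Collections.Generic;
using System.Net;

// Hypothetical registry; a real site would back this with a database table.
public class ExitLinkRegistry
{
    private readonly Dictionary<Guid, Uri> _links = new Dictionary<Guid, Uri>();

    // Registers an approved destination; only absolute http/https URLs pass.
    public Guid Register(string url)
    {
        Uri uri;
        if (!Uri.TryCreate(url, UriKind.Absolute, out uri) ||
            (uri.Scheme != Uri.UriSchemeHttp && uri.Scheme != Uri.UriSchemeHttps))
        {
            throw new ArgumentException("Only absolute http(s) URLs are allowed.", "url");
        }
        Guid id = Guid.NewGuid();
        _links[id] = uri;
        return id;
    }

    // Returns an HTML-encoded anchor for a known id, or null for anything else.
    public string RenderLink(string idFromQueryString)
    {
        Guid id;
        Uri uri;
        if (!Guid.TryParse(idFromQueryString, out id) || !_links.TryGetValue(id, out uri))
            return null;
        string encoded = WebUtility.HtmlEncode(uri.AbsoluteUri);
        return "<a href=\"" + encoded + "\">" + encoded + "</a>";
    }
}
```

Because the query string only ever carries an id, attacker-supplied text never reaches the page, and unknown ids simply produce no link.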
The best way is to get rid of the page entirely and just accept that it's a website and make it act like a website. Websites link to other resources; it's why the web has over 200 million sites instead of about a dozen.
Failing that, your best bet is to start with HtmlEncode as a quick fix, and then replace it with a lookup of ids that map to the different sites.
But really, those "ZOMG you are leaving!" pages are horrible. They're even worse than the sites that open new tabs for every so-called "external" link.

ASP.NET URL remapping & redirection - Best Practice needed

This is the scenario: I have a list of about 5000 URLs which have already been published to various customers. Now the location of all of these URLs has changed on my server side. The server is still the same, though. This is an ASP.NET website with .NET 3.5/C#.
My requirement is: even though the customers use the older source URL, they should be redirected to the new URL without any perceived change, intermediate redirection message, etc.
I am trying to make sense of the whole scenario:
Where would I put the actual mapping of old URL to new URL: in a database, in some config file, or is there a better option?
How would I actually implement a redirect:
Should I write a method with Server.Transfer or Response.Redirect?
And is there a best practice for it, like placing the actual re-routing in HTTPModules, or in Application_BeginRequest?
I am looking to achieve this with a best-practice-compliant methodology and very low performance degradation, if any.
If your application already uses a database then I'd use that. Make the old URL the primary key and lookups should be very fast. I'd personally wrap the whole thing in .NET classes that abstracts it and allow you to create a Dictionary<string,string> of all the URLs which can be loaded into memory from the DB and cached. This will be even faster.
Definitely DON'T use Server.Transfer. Instead you should do a 301 Moved Permanently redirect. This will let search engines know to use the new URL. If you were using .NET 4.0 you could use the HttpResponse.RedirectPermanent method. In earlier versions you have to set the status code and Location header yourself, but this is trivial.
Keep the data in a database, but load into ASP.NET cache to reduce access time.
You definitely want to use HTTPModules. It's the accepted practice, and having recently tried to do it inside Global.asax, I can tell you that unless you want to do only the simplest kind of stuff (i.e. "~/mypage.aspx/3" <-> "~/mypage.aspx?param1=3") it's much more complicated and buggy than it seems.
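Putting the suggestions from these answers together (a cached dictionary, an HTTP module, and a hand-built 301 on .NET 3.5) might look like the sketch below. RedirectMap and PermanentRedirectModule are made-up names, and how the dictionary gets filled from the database is left out. It needs a reference to System.Web, so it only compiles inside an ASP.NET project.

```csharp
using System;
using System.Collections.Generic;
using System.Web;

// In-memory old-path -> new-URL map, loaded once (for example from the database).
public static class RedirectMap
{
    private static Dictionary<string, string> _map =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

    public static void Load(IDictionary<string, string> oldToNew)
    {
        _map = new Dictionary<string, string>(oldToNew, StringComparer.OrdinalIgnoreCase);
    }

    public static bool TryResolve(string oldPath, out string newUrl)
    {
        return _map.TryGetValue(oldPath, out newUrl);
    }
}

// Hypothetical module; it must be registered in web.config to take effect.
public class PermanentRedirectModule : IHttpModule
{
    public void Init(HttpApplication app)
    {
        app.BeginRequest += OnBeginRequest;
    }

    private static void OnBeginRequest(object sender, EventArgs e)
    {
        var app = (HttpApplication)sender;
        string newUrl;
        // Request.Path excludes the query string, so keys are plain paths.
        if (!RedirectMap.TryResolve(app.Context.Request.Path, out newUrl)) return;

        // .NET 3.5 has no RedirectPermanent, so the 301 is set by hand.
        app.Context.Response.StatusCode = 301;
        app.Context.Response.StatusDescription = "Moved Permanently";
        app.Context.Response.AddHeader("Location", newUrl);
        app.CompleteRequest();
    }

    public void Dispose() { }
}
```

With about 5000 entries the dictionary lookup costs next to nothing per request, and requests that are not in the map fall through to normal processing untouched.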
In fact, I regret even trying to roll my own URL rewriting solution. It's just not worth it if you want something you can depend on. Scott Guthrie has a very good blog post on the subject, and he recommends UrlRewriter.net or UrlRewriting.net as a couple of free, open-source URL rewriting solutions.
Good luck.

making user friendly urls in a cms

I am interested in the architecture of a CMS where I can pass a full URL instead of a query string.
I would like to make a site that could handle a request to any page, say
'http://www.my-domain.com/directory/page.aspx'
and have the resulting response deliver a generic page/file.
I would like the request to be passed through an XML document where I could store page names and the corresponding file used to render the content.
My questions specifically:
Is this possible?
Is it easy to do?
Are there any links people have on hand that they could share with me on the how-tos?
Any pros or cons you may have come across if you have used this method?
Yes, it's possible, and reasonably easy. Most CMSes do it this way, but use a database instead of an XML file.
You should probably look into URL rewriting. The concept is to separate the URL structure from the actual filesystem representation.
For .NET: UrlRewriting.Net is a gem.
However, since there are hundreds of fantastic CMSes already out there like you describe, I'd suggest using one of them and saving yourself work. Provide more detailed requirements and I can suggest one.
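To make the idea of separating URLs from files concrete, here is a minimal sketch of the XML lookup the question describes. The PageMap class, the XML shape and the template path are all invented for the example; a rewriting library or a database-backed CMS would replace it in practice.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical lookup table from requested URL to the file that renders it.
public class PageMap
{
    private readonly Dictionary<string, string> _pages;

    // Expects XML shaped like:
    // <pages><page url="/directory/page.aspx" file="~/Templates/Generic.aspx" /></pages>
    public PageMap(XDocument xml)
    {
        _pages = xml.Root.Elements("page").ToDictionary(
            p => (string)p.Attribute("url"),
            p => (string)p.Attribute("file"),
            StringComparer.OrdinalIgnoreCase);
    }

    // Returns the file that should render the requested URL, or null if unmapped.
    public string Resolve(string requestedPath)
    {
        string file;
        return _pages.TryGetValue(requestedPath, out file) ? file : null;
    }
}
```

In an ASP.NET site, the resolved file would typically be passed to HttpContext.RewritePath early in the request, so the visitor keeps seeing the friendly URL while the generic page does the rendering.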
