Fastest way to code up hitting a URL - c#

I need to login to a site, then hit a certain URL about a thousand times (with different params, of course).
The URL is something like this:
http://www.foo.com/bar.asp?id=x ' where x is the ID
Of course, if I simply hit the URL without being logged in, it will fail.
I am not very familiar with this type of work, but I would imagine that whatever method I choose, it would have to support cookies.
I was thinking that I could create a WinForms app with a browser control and somehow drive it, but that seems like massive overkill.
Is there a better way?

If you are determined to do it in your own code, then I don't think anything is stopping you from doing that.
The HttpWebRequest and HttpWebResponse classes have pretty much everything you need to do that.
Moreover, if you are concerned about cookies, you could always store the received cookies in a database or file and send them with every subsequent request.
If you want to know the structure of an HTTP request, such as a GET request, then look here.
You can also make your request look like a request from a browser by specifying the proper request headers (however, that doesn't work every time).
And all of this can be done even in a console app.
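As a rough sketch of the approach described above: the code below logs in once and then requests the URL for each ID, letting a CookieContainer carry the session cookie between requests. It uses HttpClient (which needs .NET 4.5 or later) rather than HttpWebRequest; the login URL, the form field names, the credentials and the ID range are all hypothetical and would have to be replaced with whatever the real site expects.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

static class UrlHitter
{
    // Builds the per-ID URL from the pattern given in the question.
    public static string BuildUrl(int id)
    {
        return "http://www.foo.com/bar.asp?id=" + id;
    }

    public static async Task RunAsync()
    {
        // The CookieContainer keeps any cookies set by the login response
        // and sends them again on every later request through this handler.
        var handler = new HttpClientHandler { CookieContainer = new CookieContainer() };
        using (var client = new HttpClient(handler))
        {
            // Hypothetical login endpoint and form field names.
            var form = new FormUrlEncodedContent(new Dictionary<string, string>
            {
                { "username", "me" },
                { "password", "secret" }
            });
            HttpResponseMessage login =
                await client.PostAsync("http://www.foo.com/login.asp", form);
            login.EnsureSuccessStatusCode();

            // Hypothetical ID range; requests are made one at a time.
            for (int id = 1; id <= 1000; id++)
            {
                HttpResponseMessage response = await client.GetAsync(BuildUrl(id));
                Console.WriteLine(id + ": " + (int)response.StatusCode);
            }
        }
    }
}
```

From a console app's Main, this could be started with UrlHitter.RunAsync().GetAwaiter().GetResult().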

You may want to look into WCAT if you are mainly interested in how your server performs under load.

Using Python or PHP, you can use the libcURL library; I believe both languages have bindings for it. If not, just use the urllib2 module (for Python).

Related

Log All Requests for User Per Webpage

I want to build an "audit trail" for all requests incoming to the server, however it needs to be specific per user, per web page.
For instance I imagine something like this:
On initial view render I would store (cookie/ page variable/ something else) a unique Id saying the user browsed to /myapp.com/dashboard/1234. - maybe in the layout.cshtml.
Then the app fires off X number of GET/ POST requests to the server each having that same unique Id initially tied to the view rendered.
This allows me then to tie back all requests for a page and add up the server execution time.
I tried using path-specific cookies, but I realized this won't work since a user can have many tabs open with the same URL. Also, the user works in many areas of the app at once. They can have anywhere from 1 to 10+ tabs open. Each of these should have its own unique Id and "audit trail" of all calls taking place on that page.
This is an existing app so modifying each of the GET/ POST to pass in the unique Id is out of scope. Just hoping I am missing something that might take care of this.
Thank you!

If I'm understanding you correctly, you have a single page load, and then additional requests made either for images and other resources or AJAX requests that you want tied to and tracked along with that initial page load.
The chief problem you're going to have here is that, based on the way HTTP works, each request is handled as its own thing and not considered as part of a greater whole. The web browser makes it all look seamless, but all the web server is doing is just responding to a bunch of (as far as it knows) unrelated requests for various different things. To track them all as one unit, you would either need to attach some unique id to the request itself (for a GET, that would be either as part of the URI path or query string) or lean on Session to introduce state between the requests. However, session state really only works in this scenario when all requests can be tied to a single initial request. Once the user starts working with multiple different pages at once, there's no reasonable way to discern which request belongs to what, and you're back in the same boat.
In other words, your only real option is to send something along with the request, which would mean doing something like:
<link rel="stylesheet" type="text/css" href="/path/to/file.css?origin=@Request.RawUrl" />
Then, you could have an action filter that looks for origin in the query string of any request, and ties it to the logging for that particular page.
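A minimal sketch of such a filter, assuming ASP.NET MVC (System.Web.Mvc) and that each follow-up request carries the origin value in its query string as shown above. The class name, the "timer" Items key, the log format and the use of Trace as the log sink are all made up for illustration.

```csharp
using System.Diagnostics;
using System.Web.Mvc;

public class OriginLoggingFilter : ActionFilterAttribute
{
    // Hypothetical log line format: which page the request belongs to,
    // which URL was requested, and how long the action took.
    public static string FormatEntry(string origin, string url, long milliseconds)
    {
        return "origin=" + origin + " url=" + url + " ms=" + milliseconds;
    }

    public override void OnActionExecuting(ActionExecutingContext filterContext)
    {
        // Start timing and stash the stopwatch for the duration of the request.
        filterContext.HttpContext.Items["timer"] = Stopwatch.StartNew();
    }

    public override void OnActionExecuted(ActionExecutedContext filterContext)
    {
        var timer = filterContext.HttpContext.Items["timer"] as Stopwatch;
        string origin = filterContext.HttpContext.Request.QueryString["origin"];

        // Requests without an origin value cannot be tied to a page, so skip them.
        if (timer == null || string.IsNullOrEmpty(origin))
        {
            return;
        }

        timer.Stop();
        Trace.WriteLine(FormatEntry(
            origin,
            filterContext.HttpContext.Request.RawUrl,
            timer.ElapsedMilliseconds));
    }
}
```

Registered as a global filter, this would run for every controller action; summing the logged times per origin value gives the server execution time for that page.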
For what it's worth, it should be noted that by default, IIS will handle all requests for static resources directly, without involving ASP.NET. If you do want to track requests for static resources, you would have to pass them all through ASP.NET, which will be kind of a pain. If you only want to track AJAX requests, that's much simpler and shouldn't require anything special for the most part.
All that said, if the only purpose of this is to track page load time, there are far better and easier ways to do that. You can install Glimpse. You can use your browser's developer console. You can use something like Google Analytics. All of these are far preferable to the path you're going down here, for page load statistics.

Write an ActionFilter to do this. There are many examples of this:
http://rion.io/2013/04/15/creating-advanced-audit-trails-using-actionfilters-in-asp-net-mvc/
http://blog.ploeh.dk/2014/06/13/passive-attributes/
I personally like Mark Seemann's example more since it clearly defines a nice separation of concerns for the attribute and the filter.

MVC Get Vs Post

While going through MVC concepts, I have read that it is not good practice to have code inside a 'GET' action that changes the state of server objects (DB updates, etc.).
'Caching of return data' has been given as a reason for this.
Could someone please explain this?
Thanks in advance!

This is by HTTP standard. The GET verb is one that should be idempotent and safe.
9.1.1 Safe Methods
Implementors should be aware that the software represents the user in
their interactions over the Internet, and should be careful to allow
the user to be aware of any actions they might take which may have an
unexpected significance to themselves or others.
In particular, the convention has been established that the GET and
HEAD methods SHOULD NOT have the significance of taking an action
other than retrieval. These methods ought to be considered "safe".
This allows user agents to represent other methods, such as POST, PUT
and DELETE, in a special way, so that the user is made aware of the
fact that a possibly unsafe action is being requested.
Naturally, it is not possible to ensure that the server does not
generate side-effects as a result of performing a GET request; in
fact, some dynamic resources consider that a feature. The important
distinction here is that the user did not request the side-effects, so
therefore cannot be held accountable for them.
http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html

Browsers can cache GET requests, generally on static data, like images or scripts. But you can also allow browsers to cache GET requests to controller actions as well, using [OutputCache] or other similar ways, so if caching is turned on for a GET controller action, it's possible that clicking on a link leading to /Home/Index doesn't actually run the Index method on the server, but rather allows the browser to serve up the page from its own cache.
With this line of thinking, you can safely turn on caching on GET actions in which the data you're serving up doesn't change (or doesn't change often), with the knowledge that your server action won't fire every time.
POSTs won't be cached by the browser, so any POST is guaranteed to make it to the server.

Ignore caching for a moment. Another way of thinking about this is that search engines will store HTTP GET links during their indexing/crawling process, therefore they will show up in search results.
Suppose your /Home/Index is implemented as a GET but, let's say, deletes a row in your database. Every time this link shows up in a search engine and somebody clicks it, a row gets deleted, and soon you have a lot of deleted rows.
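To make the distinction concrete, here is a small sketch assuming ASP.NET MVC (System.Web.Mvc). The controller name and the in-memory list standing in for a database table are hypothetical. The read-only action answers GET, while the state-changing action is marked [HttpPost], so following a plain link (or a crawler issuing a GET) cannot trigger the delete.

```csharp
using System.Collections.Generic;
using System.Web.Mvc;

public class ItemsController : Controller
{
    // Stand-in for a database table; demo only, not thread-safe.
    public static readonly List<int> Items = new List<int> { 1, 2, 3 };

    // Safe: only reads state, so it may be cached, crawled or repeated freely.
    [HttpGet]
    public ActionResult Index()
    {
        return View(Items);
    }

    // Changes state, so it only responds to POST requests.
    // A GET to /Items/Delete/5 will not match this action.
    [HttpPost]
    public ActionResult Delete(int id)
    {
        Items.Remove(id);
        return RedirectToAction("Index");
    }
}
```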

The HTTP spec states that GET and HEAD are expected to be safe (and therefore also idempotent), i.e. they should not change server state.
One practical aspect of this is that search robots will issue a GET against any link to your site they know of. If such a GET changes user data it was not meant to change, you are in trouble.
Being safe has the added benefit that clients are able to cache the result of a GET (use HTTP headers to control this).

ASP.NET URL remapping & redirection - Best Practice needed

This is the scenario: I have a list of about 5000 URLs which have already been published to various customers. Now, the location of all of these URLs has changed on my server side. The server is still the same though. This is an ASP.NET website with .NET 3.5/C#.
My requirement is : Though the customers use the older source URL they should be redirected to the new URL without any perceived change or intermediate redirection message etc.
I am trying to make sense of the whole scenario:
Where would I put the actual mapping of old URL to new URL: in a database, in some config file, or is there a better option?
How would I actually implement the redirect:
Should I write a method with Server.Transfer or Response.Redirect?
And is there a best practice for it, like placing the actual re-routing in HttpModules, or in Application_BeginRequest?
I am looking to achieve this with a best-practice-compliant methodology and very low performance degradation, if any.

If your application already uses a database then I'd use that. Make the old URL the primary key and lookups should be very fast. I'd personally wrap the whole thing in .NET classes that abstract it and allow you to create a Dictionary<string,string> of all the URLs, which can be loaded into memory from the DB and cached. This will be even faster.
Definitely DON'T use Server.Transfer. Instead you should do a 301 Moved Permanently redirect. This will let search engines know to use the new URL. If you were using .NET 4.0 you could use the HttpResponse.RedirectPermanent method. However, in earlier versions you have to set the status code and the Location header yourself, which is trivial.
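A minimal sketch of the dictionary lookup plus a hand-written 301 for .NET 3.5. Hosting it in an IHttpModule is one option, not the only one; the class name and the single hard-coded map entry are hypothetical, and in practice the dictionary would be filled from the database once and cached.

```csharp
using System;
using System.Collections.Generic;
using System.Web;

public class LegacyUrlRedirectModule : IHttpModule
{
    // Old path -> new path. Hypothetical entry; load the real
    // ~5000 mappings from the database at startup instead.
    private static readonly Dictionary<string, string> Map =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "/old/page1.aspx", "/new/page1.aspx" }
        };

    // Returns the new path for an old one, or null if it is not mapped.
    public static string Resolve(string oldPath)
    {
        string newPath;
        return Map.TryGetValue(oldPath, out newPath) ? newPath : null;
    }

    public void Init(HttpApplication application)
    {
        application.BeginRequest += OnBeginRequest;
    }

    private static void OnBeginRequest(object sender, EventArgs e)
    {
        HttpApplication application = (HttpApplication)sender;

        // Request.Path excludes the query string, so this sketch drops it.
        string target = Resolve(application.Request.Path);
        if (target == null)
        {
            return; // Not a legacy URL; let the request continue normally.
        }

        // Manual equivalent of HttpResponse.RedirectPermanent from .NET 4.0.
        HttpResponse response = application.Response;
        response.StatusCode = 301;
        response.StatusDescription = "Moved Permanently";
        response.AddHeader("Location", target);
        application.CompleteRequest();
    }

    public void Dispose()
    {
    }
}
```

The module still has to be registered in web.config before it runs. Because the lookup is a single in-memory dictionary probe per request, the overhead for URLs that are not in the map should be negligible.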

Keep the data in a database, but load into ASP.NET cache to reduce access time.

You definitely want to use HttpModules. It's the accepted practice, and having recently tried to do it inside Global.asax, I can tell you that unless you want to do only the simplest kind of stuff (i.e. "~/mypage.aspx/3" <-> "~/mypage.aspx?param1=3") it's much more complicated and buggy than it seems.
In fact, I regret even trying to roll my own URL rewriting solution. It's just not worth it if you want something you can depend on. Scott Guthrie has a very good blog post on the subject, and he recommends UrlRewriter.net or UrlRewriting.net as a couple of free, open-source URL rewriting solutions.
Good luck.

Compressing parameters in the URL

The URLs on my site can become very long, and it's my understanding that URLs are transmitted with the HTTP requests. So the idea came to compress the string in the URL.
From my searching on the internet, I found suggestions to use short URLs and then link each one to the long URL. I'd prefer not to use this solution because I would have to do an extra database check to convert between the long and short URL.
That leaves, in my head, 3 options:
Hashing. I don't think this is an option. If you want a safe hashing algorithm, it's going to be long.
Compressing the URL string, basically having the server decompress the string when it gets the URL parameters.
Changing the URL so it's not descriptive. This is bad because it would make development harder for me (this is a 1 man project).
Considering the vast number of possible OS/browser combinations out there, I figured I'd ask if anyone else has tried this or has some clever suggestions.
If it matters, the URL parameters can reach 100+ chars.
Example:
mysite.com/Reports/Ability.aspx?PlayerID=7737&GuildID=132&AbilityID=1140&EventID=1609&EncounterID=-1&ServerID=17&IsPlayer=True
EDIT:
Let me clarify: at the moment this is NOT breaking the site. It's more about me learning to find a good solution (I'm well aware this is micro-optimization, my site is very fast at the moment) and making my site even faster (to challenge myself and become a better coder).
There is also a cosmetic issue: I personally think that a URL longer than the address bar looks bad.

You have some conflicting requirements as you want to shorten/compress the url without making it less descriptive. By the very nature of shortening the URL, you will, to a certain extent, make it less descriptive.
As I understand it, your goal is to optimise by sending less over the request. You mention 100+ characters, instead of 1000+ which I assume means they don't get that big? In which case, I'd see this as an unnecessary micro-optimisation.
To add to previous suggestions of using POST, a simple thing would be to just shorten the keys instead of using full names if you don't want to do full url shortening e.g.:
mysite.com/Reports/Ability.aspx?pid=7737&GID=132&AID=1140&EID=1609&EnID=-1&SID=17&IsP=True
These are obviously less descriptive.
But like I said, are you having a real problem with having long URLs?

I'm not sure I understand what your problem with long URLs is. Generally I'd try to avoid them, but if they are necessary then you won't depend on the user remembering them anyway, so why go through all the compressing trouble? Even with a URL of 1000 chars (~1KB) the page request won't be slow.
I would, however, consider using POST instead of GET if possible, to prettify the URL, but that of course depends on your implementation / environment.

It is recommended a few times here to use POST instead of GET. I would strongly recommend AGAINST picking your HTTP action by what the URL looks like. There is more to this choice than how it is displayed in the browser.
A quick overview:
http://www.w3.org/2001/tag/doc/whenToUseGet.html#checklist

A few options to add to the other answers:
Using a subclassed LinkButton for your navigation. This holds the extra data (PlayerId for example) inside its viewstate as properties. This won't be much help though if you're giving URLs to people via emails.
Use the MVC routing engine to produce slightly improved URLs - no keys for the querystring. e.g. mysite.com/Reports/Ability/7737/132/1140/1609/-1/17/True
Create your own URL shortener like tinyurl.com. Store the url in the database along with each of the querystring values to lookup.
Simply setup some friendly URLs for the most popular reports, for example mysite.com/Reports/JanuaryReport. You can also do this using the MVC routing engine.
The MVC routing engine is stand alone and can work without your site being an MVC site.

With my scheme, one could encode the params section of a URL as a base64 string which is ~50% shorter than a direct base64 representation. So for your case you get:
~50% shorter params section
a base 64 string which hides a lot of the detail
see http://blog.alivate.com.au/packed-url/

Most browsers can handle up to 2048 characters in a URL; if you don't want to use a long parameter list, you can always pass parameters through POST requests.

There are theoretical problems with extended URLs. The exact limit varies across browsers (roughly 2k in some versions of IE) and servers (4-8k in Apache, varying with version and configuration), and isn't officially specified in any RFC that I am aware of.
I would agree with synhershko, and replace the URL with form POST parameters instead if you are concerned that your URLs are growing too long.

I've encountered similar situations in the past, although my reasons for optimisation were for SEO. To me it depends on what you're doing with the page URL variables, are they being appended on all/most pages? If they are then to me there is almost always a much better way, although if you're far down the development path it's probably too late now.
I like being able to 'read' a URL, especially when I drop into an unknown site 2 or more layers deep in the navigation and the site is designed poorly. It's often the easiest and fastest way for an advanced user to find where they are on the site.
If you're interested in it from an SEO point of view, it's normally best to have a hierarchy which only contains: / - _
Search engines will try to read URLs, see this video by Matt Cutts (can't remember how far into the video he mentions it, but it's a good watch anyway...)

Any form of compression of the URL (hashing, compressing, non-descriptive) is going to:
make the urls harder to read, remember and type in correctly
have a performance impact as you will have to decrypt/decompress/convert the url before you can work with it.
Also, hashing is usually considered to be non-reversible - given a hashed value you shouldn't be able to work out what generated it, but you could use it to look up a value in a database, which gets you back to your first issue of short-long lookups.
You could easily just remove the redundant "ID" at the end of each parameter, and possibly strip out vowels or similar to "shorten" the url without losing too much from the semantics of the request.
But to be honest, the length of your URL is one of the last things to worry about in terms of performance. Look at the size of any cookies you're sending back and forth between the browser and the server, and the page size you're sending back.
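For anyone who wants to measure the compress/decompress option rather than guess, here is a small sketch that deflates a query string and encodes it with the URL-safe base64 alphabet, plus the reverse. The class and method names are made up, and it assumes .NET 4.0 or later (for Stream.CopyTo). Comparing the lengths for the example query string from the question shows whether compression buys anything at this size.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

static class QueryPacker
{
    // Deflate the query string, then encode it as URL-safe base64
    // ('-' and '_' instead of '+' and '/', padding removed).
    public static string Pack(string query)
    {
        byte[] raw = Encoding.UTF8.GetBytes(query);
        using (var buffer = new MemoryStream())
        {
            // leaveOpen: true so the buffer can still be read after
            // disposing the DeflateStream flushes the compressed data.
            using (var deflate = new DeflateStream(buffer, CompressionMode.Compress, true))
            {
                deflate.Write(raw, 0, raw.Length);
            }
            return Convert.ToBase64String(buffer.ToArray())
                .Replace('+', '-').Replace('/', '_').TrimEnd('=');
        }
    }

    // Reverse of Pack: restore the standard alphabet and padding,
    // decode, and inflate back to the original query string.
    public static string Unpack(string packed)
    {
        string base64 = packed.Replace('-', '+').Replace('_', '/');
        base64 = base64.PadRight(base64.Length + (4 - base64.Length % 4) % 4, '=');
        byte[] compressed = Convert.FromBase64String(base64);
        using (var input = new MemoryStream(compressed))
        using (var inflate = new DeflateStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            inflate.CopyTo(output);
            return Encoding.UTF8.GetString(output.ToArray());
        }
    }
}
```

Calling QueryPacker.Pack on "PlayerID=7737&GuildID=132&AbilityID=1140&EventID=1609&EncounterID=-1&ServerID=17&IsPlayer=True" and printing both lengths gives a direct comparison; every request then also pays for an Unpack call on the server.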

Keeping same session with different user-agents

Is there a way to use the same session with different user-agents? I have a Flash app that generates a new session id when posting data to myHandler.ashx (the same happens with an aspx page). Am I missing a trick here?

Take a look at swfupload and their implementation in ASP.Net - they use a Global.asax hack in order to keep the same session.

I have no experience with C# or anything like that, but when doing remoting using amfphp you will sometimes need to supply the session_id variable in your call manually, as the server will for some reason consider you two different users even though it's all going through the same browser.
Often, the simplest way to do this is to supply your swf with the id in a flashvar when loading it. This will require you to print the session_id in the html source, which isn't ideal, but it's not that big of a deal since it can be sniffed very easily anyway.

It appears as though it is common to pass the session id through flash vars. I have not done this myself, but a quick Google search with these keys seems to find some promising hits: keep session data flash
