Simulate the page lifecycle to grab the HTML from the UI layer - C#

I'm working with a rather large .NET web application.
Users want to be able to export reports to PDF. Since the reports are based on an aggregation of many layers of data, the best way to get an accurate snapshot is to actually take a snapshot of the UI. I can take the HTML of the UI and convert that to a PDF file.
Since the UI may take up to 30 seconds to load but the results never change, I want to cache a PDF in a background thread as soon as an item gets saved.
My main concern with this method is that if I go through the UI, I have to worry about timeouts. While background threads and the like can last as long as they want, ASPX pages only live so long before they are terminated.
I have two ideas how to take care of this. The first idea is to create an aspx page that loads the UI, overrides render, and stores the rendered data to the database. A background thread would make a WebRequest to that page internally and then grab the results from the database. This obviously has to take security into consideration and also needs to worry about timeouts if the UI takes too long to generate.
The other idea is to create a page object and populate it manually in code, call the relevant methods by hand, and then grab the data from that. The problems with that method, aside from my having no idea how to do it, are that I'm afraid I may forget to call a method, or that something may not work correctly because it's not actually associated with a real session or web server.
What is the best way to simulate the UI of a page in a background thread?

I know of 3 possible solutions:
IHttpHandler
This question has the full answer. The general gist is that you capture the Response.Filter output by implementing your own readable stream and a custom IHttpHandler.
This doesn't let you capture a page's output remotely, however; it only allows you to capture the HTML that would be sent to the client beforehand, and the page has to be called. So if you use a separate page for PDF generation, something will have to call that page.
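A minimal sketch of that capture idea, assuming a filter that tees the response into memory; the class and member names here are hypothetical:
using System;
using System.IO;
using System.Text;

// Wraps Response.Filter so every byte written to the response is also kept
// in memory, where it can be stored for PDF generation.
public class HtmlCaptureStream : Stream
{
    private readonly Stream _inner;
    private readonly MemoryStream _copy = new MemoryStream();

    public HtmlCaptureStream(Stream inner) { _inner = inner; }

    // Assumes the response is UTF-8 encoded.
    public string CapturedHtml
    {
        get { return Encoding.UTF8.GetString(_copy.ToArray()); }
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        _copy.Write(buffer, offset, count);  // keep a copy for the PDF step
        _inner.Write(buffer, offset, count); // still send it to the client
    }

    public override void Flush() { _inner.Flush(); }

    // The remaining Stream members are not needed for a write-only filter.
    public override bool CanRead { get { return false; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return true; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override int Read(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
}
Before rendering, the page or handler would assign Response.Filter = new HtmlCaptureStream(Response.Filter) and read CapturedHtml once Render has completed.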
WebClient
The only alternative I can see for doing that with ASP.NET is to use a blocking WebClient to request the page that generates the HTML. Take that output and then turn it into a PDF. Before you do all this, you can obviously check your cache to see if it's already there.
// DownloadString blocks until the entire response has been received
WebClient client = new WebClient();
string result = client.DownloadString("http://localhost/yoursite");
WatiN (or other browser automation packages)
One other possible solution is WatiN, which gives you a lot of flexibility in capturing a browser's HTML. The drawback is that it needs to interact with the desktop. Here's their example:
using (IE ie = new IE("http://www.google.com"))
{
    ie.TextField(Find.ByName("q")).TypeText("WatiN");
    ie.Button(Find.ByName("btnG")).Click();
    Assert.IsTrue(ie.ContainsText("WatiN"));
}

If the "the best way to get an accurate snapshot is to actually take a snapshot of the UI" is actually true, then you need to refactor your code.
Build a data provider that provides your aggregated data to both the UI and the PDF generator. Layer your system.
Then, when it's time to build the PDFs, you have only a single location to call, and no hacky UI interception/multiple-thread issues to deal with.
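A minimal sketch of that layering, with all names hypothetical; one provider feeds both consumers:
// Both the page and the PDF job depend on this provider, not on the UI.
public interface IReportDataProvider
{
    ReportData GetReportData(int reportId);
}

public class ReportData
{
    public string Title { get; set; }
    // ... the aggregated fields the UI currently assembles ...
}

public class PdfReportGenerator
{
    private readonly IReportDataProvider _provider;

    public PdfReportGenerator(IReportDataProvider provider) { _provider = provider; }

    public byte[] Generate(int reportId)
    {
        ReportData data = _provider.GetReportData(reportId);
        return RenderPdf(data); // hand the same data to your PDF library
    }

    private byte[] RenderPdf(ReportData data)
    {
        // call your PDF library of choice here
        return new byte[0];
    }
}
The ASPX page binds its controls from the same ReportData, so the PDF can never drift out of sync with the UI.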

Related

ASP.Net MVC Long Running Process

I have a requirement to produce a report screen for different financial periods. As this is quite a large data set with a lot of rules, the process could take a long time to run (well over an hour for some of the reports to return).
What is the best way of handling this scenario inside MVC?
I am concerned about:
screen locking
performance
usability
the request timing out
Those are indeed valid concerns.
As some of the commenters have already pointed out: if the reports do not depend on input from the user, then you might want to generate the reports beforehand, say, on a nightly basis.
On the other hand, if the reports do depend on input from the user, you can circumvent your concerns in a number of ways, but you should at least split the operation into multiple steps:
Have a request from the browser kick off the process of generating the report. You could start a new thread and tell it to generate the report, or you could put a "Create report" message on a queue and have a service consume messages and generate reports. Whatever you do, make sure this first request finishes quickly. It should return some kind of identifier for the task just started (see the sketch after this list). At this point, you can inform the user that the system is processing the request.
Use Ajax to repeatedly poll the server for completion of the report using the given identifier. Preferably, the process generating the report should report its progress, and this information should be provided to the user via the Ajax polling. If you want to get fancy, you could use SignalR to notify the browser of progress.
Once the report is ready, return a link to the user where he/she can access the report.
Depending on how you implement this, the user may be able to close the browser, have a sip of coffee and come back to a completed report.
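A minimal MVC sketch of the first two steps, assuming an in-process task for simplicity (a queue plus a service is more robust, since an app-pool recycle kills in-process work); all names are hypothetical:
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;
using System.Web.Mvc;

public class ReportController : Controller
{
    // In-process progress store; swap for a database or cache in production.
    private static readonly ConcurrentDictionary<Guid, int> Progress =
        new ConcurrentDictionary<Guid, int>();

    [HttpPost]
    public ActionResult Start()
    {
        Guid taskId = Guid.NewGuid();
        Progress[taskId] = 0;
        Task.Run(() => GenerateReport(taskId)); // kick off and return at once
        return Json(new { taskId });
    }

    [HttpGet]
    public ActionResult Status(Guid taskId)
    {
        int percent;
        Progress.TryGetValue(taskId, out percent);
        return Json(new { percent }, JsonRequestBehavior.AllowGet);
    }

    private static void GenerateReport(Guid taskId)
    {
        for (int step = 1; step <= 10; step++)
        {
            // ... do one slice of the real report work here ...
            Progress[taskId] = step * 10; // progress for the Ajax poller
        }
    }
}
Start returns the identifier immediately; the client then polls Status with that identifier until percent reaches 100 and the report link can be shown.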
If your app is running on Windows Server with IIS, your ASP.NET code can create a record in a database table indicating that a report should be created.
You can then use a Windows Service or console app running on the same server that constantly checks whether there are any new rows in the table. This service would create the report and, during creation, update a table field to indicate progress.
Your ASP.NET page might display a progress bar, getting the progress indication from the database via Ajax requests or by simply refreshing the page every few seconds. A sketch of the service side follows below.
If you are running in the Windows Azure cloud, you might use a Worker Role instead of a Windows Service.
For screen locking on your page, you may use the jQuery BlockUI library.
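A minimal sketch of the service's polling loop; the table and column names are assumptions:
using System;
using System.Data.SqlClient;
using System.Threading;

class ReportWorker
{
    private const string ConnectionString = "..."; // your connection string here

    public void Run()
    {
        while (true)
        {
            using (var conn = new SqlConnection(ConnectionString))
            {
                conn.Open();
                var pick = new SqlCommand(
                    "SELECT TOP 1 Id FROM ReportRequests WHERE Status = 'Pending'", conn);
                object id = pick.ExecuteScalar();
                if (id != null)
                {
                    // ... generate the report, updating the Progress column as you go ...
                    var done = new SqlCommand(
                        "UPDATE ReportRequests SET Status = 'Done', Progress = 100 WHERE Id = @id",
                        conn);
                    done.Parameters.AddWithValue("@id", id);
                    done.ExecuteNonQuery();
                }
            }
            Thread.Sleep(TimeSpan.FromSeconds(5)); // polling interval
        }
    }
}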

ASP.Net Offload Page Processing

I have a current issue with one of our applications which is called by a 3rd Party.
The 3rd party sends data (contained in a query string) to a URL on our website, and they must receive an OK response from the page within 5 seconds to verify that the call was received.
The page itself does a lot of processing of the data sent by the 3rd party in the Page_Load function. It's possible that this takes more than 5 seconds, so the page does not render until the processing is completed, which in turn causes the 3rd party to keep sending the data to us multiple times, as their system assumes we have not received it.
What I would like to know is what is the best way to offload the processing of the data so that I can render the page almost as soon as the 3rd Party calls the URL?
Note that there are no controls on the page, its purely a blank page with code behind it.
Am I right in assuming the 3rd party is just calling the page to send up the data, i.e. they don't care about the result?
There are a couple of approaches that come to mind. A simple approach would be to dispatch the work onto a thread as it comes in and return "OK" immediately, leaving the thread to carry on working (see the sketch below).
The second approach would be to write the incoming query-string data to a file or database table and let an external process periodically pick it up and process it in batches.
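A minimal sketch of the first approach, with hypothetical names. Note that a fire-and-forget thread dies if the app pool recycles, which is why the second approach is more robust:
using System;
using System.Threading;
using System.Web.UI;

public partial class Receive : Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        // Capture the query string before the request completes.
        string payload = Request.QueryString.ToString();

        // Hand the heavy work to the thread pool and return at once.
        ThreadPool.QueueUserWorkItem(_ => ProcessData(payload));

        // The 3rd party gets its acknowledgement well inside 5 seconds.
        Response.Write("OK");
    }

    private static void ProcessData(string payload)
    {
        // ... the expensive processing that used to run in Page_Load ...
    }
}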
Use JavaScript to retrieve the data after the page has loaded.

Update progress bar from codebehind

I have a really long submit()-type function on one of my web pages that runs entirely on the server, and I'd like to display a progress bar to the client to show the, well, progress.
I'd be ok with updating it at intervals of like 20% so long as I can show them something.
Is this even possible? Maybe some kind of control with runat="server"? I'm kind of lost for ideas here.
It's possible, but it's quite a bit harder to do in a web based environment than in, for example, a desktop based environment.
What you'll have to do is submit a request to the server, have the server start the async task and then send a response back to the client. The client will then need to periodically poll the server (likely/ideally using AJAX) for updates. The server will want to, within the long running task's body, set a Session value (or use some other method of storing state) that can be accessed by the client's polling method.
It's nasty, and messy, and inefficient, so you wouldn't want to do this if there are going to be lots of users executing this.
Here is an example implementation by Microsoft. Note that this example uses UpdatePanel objects, ASP timers, etc., which make the code quite a bit simpler to write (and it's still not all that pretty), but these components are fairly "heavy". Using explicit AJAX calls, creating web methods rather than doing full postbacks, etc. will improve the performance quite a bit. As I said though, even in the best of cases, it's a performance nightmare. Don't do this if you have a lot of users or if this operation is performed very often. If it's just for occasional use by a small percentage of admin users, then that may not be a concern, and it does add a lot from the user's perspective.
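A minimal WebForms sketch of that poll-for-progress pattern. It uses a static dictionary keyed by session ID instead of Session itself, to sidestep session locking from the background thread; all names are hypothetical, and the page needs a ScriptManager with EnablePageMethods="true" for PageMethods to work:
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Web;
using System.Web.Services;
using System.Web.UI;

public partial class SubmitPage : Page
{
    // Progress per session; survives across the polling requests.
    private static readonly ConcurrentDictionary<string, int> Progress =
        new ConcurrentDictionary<string, int>();

    protected void SubmitButton_Click(object sender, EventArgs e)
    {
        string key = Session.SessionID;
        Progress[key] = 0;

        ThreadPool.QueueUserWorkItem(_ =>
        {
            for (int i = 1; i <= 5; i++)
            {
                // ... one slice of the long-running submit() work ...
                Progress[key] = i * 20; // 20% increments, as suggested above
            }
        });
    }

    // Polled from JavaScript, e.g. PageMethods.GetProgress(updateBar)
    [WebMethod(EnableSession = true)]
    public static int GetProgress()
    {
        int percent;
        Progress.TryGetValue(HttpContext.Current.Session.SessionID, out percent);
        return percent;
    }
}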
I would take a look at .NET 4.5's async and await.
Using Asynchronous Methods in ASP.NET MVC 4 (an MVC example, I know, sorry)
Then check out this example using a progress bar

Preventing Caching of Specific Images in Silverlight

It recently became apparent that my project is caching images. This is an issue because the user can upload a new image which does not get reflected until the browser is closed and reloaded (at least when debugging in IE). I would like to not have to keep re-downloading images over and over again for things that have not changed, as that would greatly increase the data we are sending out.
I have tried a couple of solutions, here and here.
The common factor seems to be that the variable being displayed starts clean, but neither of those has worked for me.
I essentially am displaying images in two different ways.
1) I take a string and pass it into the source of an <Image />
2) I turn a string into a URI and turn that into a bitmap behind the scenes which then gets passed into the source of an <Image />
When the image gets updated server side the location of the user's image stays the same, only the data changes.
The developer doing the server-side work attempted a solution as well. He said he implemented some cache-preventing headers; the result was that the first time the image is requested after it has been updated, it retrieves a new image and displays it. However, any other places the image is displayed do not get updated.
I guess my ideal solution would be that once the user uploads the new image I implement something that notifies anyone that uses that particular URI to grab a new version.
Does anyone know how to selectively stop caching?
I would try appending a timestamp to the Uri of the image you are requesting; this should help stop the browser (or any proxies) from caching it,
e.g. http://www.example.com/myimage.jpg?ts=2011081107333454
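A minimal sketch of that idea in C#; the helper and the version parameter are hypothetical. Ideally the value comes from the server (for example, the image's last-modified ticks) so that unchanged images can still be cached:
using System;
using System.Windows.Controls;
using System.Windows.Media.Imaging;

public static class ImageHelper
{
    // 'version' changes only when the image content changes on the server.
    public static void SetImage(Image target, string url, string version)
    {
        // A new query string gives the browser cache a new key.
        target.Source = new BitmapImage(new Uri(url + "?ts=" + version));
    }
}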
First, let's clear up the somewhat ambiguous term "caching".
We do all sorts of caching all the time. Whenever we take the result of an expensive operation and store that result for future use, to avoid repeating the expensive operation, we are in effect "caching". All frameworks, including Silverlight, do that sort of thing a lot.
However, whenever the term "caching" is used in the context of a web-based application and refers to a resource fetched using HTTP, the specific HTTP cache specification is what comes to mind. This is not unreasonable: HTTP caches obviously play a major role, and getting the response header settings right on the server is important for correct operation.
An often-missed aspect of HTTP resource caching, though, is that the responsibility to honor cache headers lies only with the HTTP stack itself; the application using HTTP is not required to know anything about caching.
If the application then chooses to maintain its own "cache" of URIs to resources requested from the HTTP stack, it is not required to implement HTTP-compliant caching algorithms. If such a "cache" is asked to provide a specific application object matching a specified Uri, it is entirely free to do so without reference to HTTP.
If HTTP caching were the only cache to worry about, then, assuming your "server coder" has actually got the cache headers set correctly, all should be well. However, there may still be an application-layer cache involved as well.
Ultimately, Rob's suggestion makes sense in this case: "version" the Uri with a query string value. However, it's not about preventing caching; caching at both the application and HTTP levels is a good thing. It's about ensuring that the resource referenced by the full Uri is always the desired content.

How to gracefully check whether Gravatar, or third-party website, is working or not?

I just posted the question how-to-determine-why-the-browser-keeps-trying-to-load-a-page and discovered that my problem is with Gravatar.
I also noticed that StackOverflow is suffering from the same outage.
Does anyone know of a graceful way to determine if Gravatar, or any third party website for that matter, is up or not, before trying to retrieve avatar icons from them?
This would eliminate the long page load and the never-ending busy cursor ... I shouldn't say never-ending ... it just takes a long time to go away, and it is very confusing to users as they sit there and wait ... for nothing.
You can have a separate process periodically check the status of the site. Set a rule about what "down" means for you; for instance, you could say: "ping time > 1500 ms = down". Have this process leave a note in a database table or config file. You can then check this value on each page render at almost no cost.
Depending on how critical this external site is, you can do the check more or less often.
This process could be a program outside the web stack, or a page only accessible through localhost that gets executed via Scheduled Tasks or an ASP.NET facility like those mentioned in the comments.
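A minimal sketch of such a checker, using the "1500 ms = down" rule above; the names are hypothetical:
using System;
using System.Diagnostics;
using System.Net;

class SiteMonitor
{
    private static readonly TimeSpan Threshold = TimeSpan.FromMilliseconds(1500);

    public static bool IsUp(string url)
    {
        var timer = Stopwatch.StartNew();
        try
        {
            var request = (HttpWebRequest)WebRequest.Create(url);
            request.Method = "HEAD"; // we only care about reachability
            request.Timeout = (int)Threshold.TotalMilliseconds;
            using (request.GetResponse()) { }
            return timer.Elapsed < Threshold;
        }
        catch (WebException)
        {
            return false; // timeouts and errors count as down
        }
    }
}
A scheduled task can run this periodically and write the result to the database table or config file mentioned above; the pages then read that flag instead of probing Gravatar themselves.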
For Gravatar, you can cache all of these images yourself instead of fetching them from Gravatar's server every time. Of course, if a user changes their icon, it might not refresh as quickly as it would with direct access to the main server, but at least you do not have to hit the Gravatar server on every request.
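A minimal sketch of such a server-side cache, with hypothetical names; it falls back to the stale copy when Gravatar is unreachable, which also addresses the outage in the question:
using System;
using System.Collections.Concurrent;
using System.Net;

static class AvatarCache
{
    private class Entry
    {
        public byte[] Bytes;
        public DateTime FetchedUtc;
    }

    private static readonly ConcurrentDictionary<string, Entry> Cache =
        new ConcurrentDictionary<string, Entry>();
    private static readonly TimeSpan MaxAge = TimeSpan.FromHours(1);

    public static byte[] Get(string gravatarUrl)
    {
        Entry entry;
        bool cached = Cache.TryGetValue(gravatarUrl, out entry);
        if (cached && DateTime.UtcNow - entry.FetchedUtc < MaxAge)
            return entry.Bytes; // fresh enough, no round trip

        try
        {
            using (var client = new WebClient())
            {
                byte[] bytes = client.DownloadData(gravatarUrl);
                Cache[gravatarUrl] = new Entry { Bytes = bytes, FetchedUtc = DateTime.UtcNow };
                return bytes;
            }
        }
        catch (WebException)
        {
            return cached ? entry.Bytes : null; // serve the stale copy if the site is down
        }
    }
}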
