Is WebClient the best way to download http data? - c#

Problem: I need to download hundreds of images from different hosts. Each host has anywhere from 20 to several hundred images.
Solution: use a new WebClient every time an image needs to be downloaded, via WebClient's DownloadData method.
Or would it be better to keep a pool of open socket connections and make the HTTP requests using lower-level calls?
Is it expensive to open and close a TCP connection (I'm assuming that is what WebClient does), so that using a pool would be more efficient?

I believe the underlying infrastructure which WebClient uses will already pool HTTP connections, so there's no need to do this. You may want to check using something like Wireshark of course, with some sample URLs.
Fundamentally, I'd take the same approach to this as with other programming tasks - write the code in the simplest way that works, and then check whether it performs well enough for your needs. If it does, you're done. If it doesn't, use appropriate tools (network analyzers etc) to work out why it's not performing well enough, and use more complicated code only if it fixes the problem.
My experience is that WebClient is fine if it does what you need, but it doesn't give you quite as much fine-grained control as WebRequest. If you don't need that control, go with WebClient.
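As a rough illustration of the "simplest thing that works" approach, here is a minimal sketch. The class and method names are made up for this example, and grouping the URLs by host is only an assumption about how you might organize the work so that any connection reuse inside the framework has a chance to help. Measure before changing anything.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;

// Hypothetical helper, not part of any library.
public static class ImageDownloader
{
    // Groups URLs by host so that each host's images are requested back to back.
    public static Dictionary<string, List<Uri>> GroupByHost(IEnumerable<string> urls)
    {
        return urls.Select(u => new Uri(u))
                   .GroupBy(u => u.Host)
                   .ToDictionary(g => g.Key, g => g.ToList());
    }

    // The approach from the question: a new WebClient for every download.
    public static byte[] Download(Uri url)
    {
#pragma warning disable SYSLIB0014 // WebClient is flagged as obsolete on newer .NET versions
        using (var client = new WebClient())
        {
            return client.DownloadData(url);
        }
#pragma warning restore SYSLIB0014
    }
}
```

If this turns out to be too slow, a network analyzer such as Wireshark should show whether connections are actually being reopened for each request.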

I use HttpWebRequest and HttpWebResponse to scrape anything I want, unless a service already exists for the requirement. Even then, such services sometimes have limitations (business limitations), and I often prefer to pull the HTML out of a plain HTTP request. Sometimes it just makes me feel more like a developer, you know...

Related

Why use HttpClient over HttpWebRequest for synchronous requests

In my scenario, I have to send data from one web application to a Web API which is effectively a data store. The requests are necessarily synchronous, and I most definitely want an exception thrown if something goes awry, as it means a critical part of the application is unavailable.
This is derived from, though not a duplicate of, an existing question: Why use HttpClient for Synchronous Connection.
Yet over and over, including in the article above, I see a consistent recommendation to use HttpClient, even in a synchronous scenario. The best reason I've seen is the accepted answer in the SO post above, but it essentially boils down to:
Use this because "shiny".
I don't find that an acceptable answer for my scenario. I'd prefer to use the correct object for the task at hand, and this seems to be the older HttpWebRequest. Even Ben Watson's excellent resource "Writing High-Performance .NET Code" states the following:
"Another example is the System.Net.HttpWebRequest class, which will throw an exception if it receives a non-200 response from a server. This bizarre behavior is thankfully corrected in the System.Net.Http.HttpClient class in .NET 4.5"
But in my scenario, I actually do want that behavior. While there are a lot of good use cases for HttpClient, can anyone offer a good reason not to use HttpWebRequest in my scenario? Am I using the correct object or not? And more importantly, why?
HttpClient is designed to give more control over the HTTP protocol, whereas doing the same with HttpWebRequest or WebClient was not that straightforward. Apart from asynchronous support, HttpClient has many benefits.
Benefits of HttpClient
The biggest benefit of HttpClient is its plugin architecture, which lets you change the underlying behavior of the HTTP stack easily.
HttpClient is extensible: the underlying HttpMessageHandler allows you to completely bypass Microsoft's default implementation and plug in your own. For example, on iOS and Android, instead of using .NET's HTTP stack, we could use the native HTTP stack.
It is easy to replace caching and cookie handling by customizing HttpMessageHandler.
CancellationToken support is excellent when we want to cancel a long-running HTTP request.
Not shiny, but important: multithreading. HttpClient is optimized to manage multiple requests with a single instance. CPU time is used very efficiently without taking too many locks (synchronous operations depend on locks, which add considerable CPU overhead). Today we live in a world of microservices. On a server with many clients to serve, and on mobile operating systems, CPU time is costly.
Drawbacks
The only drawback is async/await: you can't easily call async libraries from synchronous code without a task runner, and blocking naively risks deadlocks. There are, however, many libraries that help run async code synchronously.
There is no great benefit to HttpClient in a desktop application with lots of CPU time to spare.
HttpClient's behavior is considered "cleaner" because a non-success response from the server doesn't necessarily mean something has gone awry. While it's not true of your situation, imagine a process that wants to check that a resource does not exist and expects that it typically does not. With HttpWebRequest, the normal execution flow throws an exception, which is kind of gross and can complicate things, whereas HttpClient does not.
For your specific scenario, the distinction is perhaps irrelevant. Other situations in your program might prefer the HttpClient behavior though, and it's nice to standardize on a single HTTP client instead of having to juggle two.
HttpClient is not a replacement for WebClient/HttpWebRequest. HttpWebRequest gives you more flexibility, but at the same time it makes your code a bit more verbose, whereas HttpClient provides a simple interface. You can use HttpWebRequest over HttpClient if you really want the additional features.
As far as exceptions for non-success response codes are concerned, HttpClient provides a way to get that behavior. You have to invoke
response.EnsureSuccessStatusCode();
For more details please visit Usage of EnsureSuccessStatusCode and handling of HttpRequestException it throws
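To make the two points above concrete (the pluggable HttpMessageHandler and EnsureSuccessStatusCode), here is a small sketch. FixedStatusHandler and SyncStore are invented names for this example, and the handler is only a stand-in for a real server so the code runs without a network. The blocking call via GetAwaiter().GetResult() is one common way to wait synchronously; it is an assumption about what fits your application, since blocking on a task can deadlock where a synchronization context is involved.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a real server: always answers with a fixed status code.
public sealed class FixedStatusHandler : HttpMessageHandler
{
    private readonly HttpStatusCode _status;

    public FixedStatusHandler(HttpStatusCode status)
    {
        _status = status;
    }

    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        var response = new HttpResponseMessage(_status) { RequestMessage = request };
        return Task.FromResult(response);
    }
}

// Hypothetical wrapper for the "send data to the store" call from the question.
public static class SyncStore
{
    // Blocks until the response arrives; throws HttpRequestException
    // if the status code is outside the 2xx range.
    public static HttpStatusCode PostOrThrow(HttpClient client, string url, string body)
    {
        using (var content = new StringContent(body))
        using (HttpResponseMessage response =
                   client.PostAsync(url, content).GetAwaiter().GetResult())
        {
            response.EnsureSuccessStatusCode();
            return response.StatusCode;
        }
    }
}
```

In real code you would construct the HttpClient without the stub handler (or with your own handler for caching, cookies and so on) and reuse that single instance for all requests.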

WCF REST Push Stream Service

I need some help figuring out what I am looking for. Basically, I need a service in which the server dumps a bunch of XML into a stream (over a period of time), and every time a dump occurs, N clients read it.
Example: every time one of 1000 stocks goes up by 5 cents, the service dumps some XML into a stream. The connected applications grab the information from the stream.
I don't think the connection will ever close, as there needs to be something reading the stream for new data.
This needs to adhere to WCF REST standards. Is there something out there that does what I'm looking for? In the end, it's just a non-stop stream of data.
Update: Looks like the service needs to use a multipart/mixed content type.
An application I'm working on has a similar architecture, and I'm planning to use SignalR to push updates to clients, using long-polling techniques. I haven't implemented it yet, so I can't swear it will work for you, but their documentation seems promising (update: I have implemented this now, and it works very well):
Pushing data from the server to the client (not just browser clients)
has always been a tough problem. SignalR makes it dead easy and
handles all the heavy lifting for you.
Scott Hanselman has a good blog post on the subject, and there is a useful article (involving WCF, REST, and SignalR) here: http://www.codeproject.com/Articles/324841/EventBroker
Instead of using WCF, have you looked into ASP.NET MVC WebAPI?
For more information about using PushStreamContent in WebAPI, Henrik has a nice blog with example (under the heading 'Push Content').
Have you considered archived Atom feeds? They are 100% RESTful (hypermedia controls and all) and most importantly, they are very scalable.
Specifically, the archive documents never change, so you can set a cache expiry of 1 year or more. The subscription document is where all the newest events go and is constantly changing, but with the appropriate HTTP caching headers you can return 304 Not Modified if nothing has changed between client requests. Also, if your service has a natural time resolution, you can set max-age to take advantage of that. For instance, if your data has a 20-minute resolution, you could include the following header in the subscription document response:
Cache-Control: max-age=1200
That way you can let your caches do most of the heavy lifting, and the clients can poll the subscription document as often as they like without bringing your service to its knees.
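A rough sketch of what such polling could look like from the client side. It assumes the server uses an ETag validator to decide between 200 and 304 (the answer does not say which validator to use), and FeedStubHandler is a made-up stand-in for the real feed so the example runs without a network. FeedPoller is likewise a hypothetical name.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the subscription document endpoint.
public sealed class FeedStubHandler : HttpMessageHandler
{
    private const string CurrentETag = "\"v1\"";

    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        // The client already holds the current version if it sent a matching If-None-Match.
        bool unchanged = false;
        foreach (EntityTagHeaderValue tag in request.Headers.IfNoneMatch)
        {
            if (tag.Tag == CurrentETag) unchanged = true;
        }

        var response = new HttpResponseMessage(
            unchanged ? HttpStatusCode.NotModified : HttpStatusCode.OK);
        response.Headers.ETag = new EntityTagHeaderValue(CurrentETag);
        // 20-minute resolution, as in the Cache-Control example above.
        response.Headers.CacheControl =
            new CacheControlHeaderValue { MaxAge = TimeSpan.FromSeconds(1200) };
        if (!unchanged) response.Content = new StringContent("<feed/>");
        return Task.FromResult(response);
    }
}

public static class FeedPoller
{
    // Sends a conditional GET and remembers the ETag the server returns.
    public static HttpStatusCode Poll(HttpClient client, string url, ref string etag)
    {
        using (var request = new HttpRequestMessage(HttpMethod.Get, url))
        {
            if (etag != null)
            {
                request.Headers.IfNoneMatch.Add(new EntityTagHeaderValue(etag));
            }
            using (HttpResponseMessage response =
                       client.SendAsync(request).GetAwaiter().GetResult())
            {
                if (response.Headers.ETag != null) etag = response.Headers.ETag.Tag;
                return response.StatusCode;
            }
        }
    }
}
```

The first poll returns the full document; later polls with the remembered ETag get the cheap 304 answer until the document changes.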

handling multiple clients c#

I am working on a project where I need to connect to multiple clients, each of which streams live screen captures to the server. The server displays them.
What would be the best approach for that?
Thank You
You can use WCF in streaming mode for the video, but I doubt it is a good solution.
I think that going for pure sockets is better, to get the performance required. Showing a live video stream is also not really a bounded request/response operation (which is what WCF is built for), but rather something ongoing.
My suggestion is to:
Use a pure TCP socket for the video stream for a start.
If that gives problems, you can switch to UDP. For live video it is better to skip over any lost packets, but with UDP you have to track packet ordering etc. yourself.
If you need control operations, use a separate WCF service for that.
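If you go the TCP route, one thing you will need is a way to tell where one captured frame ends and the next begins, because TCP delivers a byte stream rather than messages. A minimal sketch of one possible framing scheme follows: a 4-byte big-endian length prefix before each frame. The scheme and the FrameProtocol name are assumptions for illustration, not something the answer prescribes. It works on any Stream, so on the real connection you would pass the NetworkStream.

```csharp
using System;
using System.IO;

// Hypothetical framing helper: each frame is a 4-byte big-endian length plus payload.
public static class FrameProtocol
{
    public static void WriteFrame(Stream stream, byte[] payload)
    {
        int n = payload.Length;
        byte[] header =
        {
            (byte)((n >> 24) & 0xFF),
            (byte)((n >> 16) & 0xFF),
            (byte)((n >> 8) & 0xFF),
            (byte)(n & 0xFF)
        };
        stream.Write(header, 0, 4);
        stream.Write(payload, 0, n);
    }

    // Returns the next frame, or null if the stream ended before a new frame started.
    public static byte[] ReadFrame(Stream stream)
    {
        byte[] header = new byte[4];
        int got = ReadFully(stream, header, 4);
        if (got == 0) return null;
        if (got < 4) throw new EndOfStreamException("Truncated frame header.");

        int n = (header[0] << 24) | (header[1] << 16) | (header[2] << 8) | header[3];
        if (n < 0) throw new InvalidDataException("Negative frame length.");

        byte[] payload = new byte[n];
        if (ReadFully(stream, payload, n) < n)
            throw new EndOfStreamException("Truncated frame payload.");
        return payload;
    }

    // A single Read may return fewer bytes than requested,
    // so keep reading until count bytes arrive or the stream ends.
    private static int ReadFully(Stream stream, byte[] buffer, int count)
    {
        int total = 0;
        while (total < count)
        {
            int read = stream.Read(buffer, total, count - total);
            if (read == 0) break;
            total += read;
        }
        return total;
    }
}
```

On the server you would typically run one read loop per connected client and hand each completed frame to the display code.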

What is the best method to call an arbitrary JSON server from .NET (Specifically Windows Phone 7)

I have a server that I have no control over. It's JSON-based, and I've put together a simple proof of concept that calls the server using HttpWebRequest etc. It works fine (if a little wordy, since MS have removed all synchronous I/O calls).
Is there a better way of doing this? I've been looking at WCF as an option, but any stable and reasonably performant library should do the job. This is a new area for me, so I'm a little unsure what the best practice is (or where to find it out).
Thanks in advance
Dave
Not sure whether it's the best method, but HttpWebRequest plus DataContractJsonSerializer are probably the best approach using classes from the Windows Phone library only -- plus HttpWebRequest's asynchronous methods ensure that your code won't block the UI thread when performing network requests.
Once you've written the http request code once, you can easily abstract it away for reuse. You've already done the hard part :-)
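For the deserialization half, a small sketch of what the reusable piece might look like. The Quote type and its JSON field names are invented for this example (your server's payload will differ), and JsonParsing is a hypothetical helper name. In the phone app, the Stream would come from the response inside your HttpWebRequest callback.

```csharp
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Json;
using System.Text;

// Hypothetical payload type; the member names must match the server's JSON.
[DataContract]
public class Quote
{
    [DataMember(Name = "symbol")]
    public string Symbol { get; set; }

    [DataMember(Name = "price")]
    public double Price { get; set; }
}

public static class JsonParsing
{
    // Deserializes a JSON document from a stream, e.g. an HTTP response stream.
    public static T Read<T>(Stream json)
    {
        var serializer = new DataContractJsonSerializer(typeof(T));
        return (T)serializer.ReadObject(json);
    }

    // Convenience overload for JSON already held in a string.
    public static T Read<T>(string json)
    {
        using (var buffer = new MemoryStream(Encoding.UTF8.GetBytes(json)))
        {
            return Read<T>(buffer);
        }
    }
}
```

Keeping the parsing separate from the request code makes it easy to test against sample JSON without touching the network.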

Pause/Resume Upload in C#

I'm looking for a way to pause or resume an upload process via C#'s WebClient.
pseudocode:
WebClient Client = new WebClient();
Client.UploadFileAsync(new Uri("http://mysite.com/receiver.php"), "POST", @"C:\MyFile.jpg");
Maybe something like..
Client.Pause();
any idea?
WebClient doesn't have this kind of functionality - even the slightly-lower-level HttpWebRequest doesn't, as far as I'm aware. You'll need to use an HTTP library which gives you more control over exactly when things happen (which will no doubt involve more code as well, of course). The point of WebClient is to provide a very simple API to use in very simple situations.
As stated by Jon Skeet, this is not available in either the WebClient or the HttpWebRequest class.
However, if you have control of the server that receives the upload, perhaps you could upload small chunks of the file using WebClient and have the server assemble the chunks once all have been received. Then it would be somewhat easier for you to build pause/resume functionality.
If you do not have control of the server, you will need to use an API that gives you more control, and consequently gives you more stuff to worry about. And even then, the server might give you a time-out if you pause for too long.
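A sketch of how the chunked approach could be structured on the client. ChunkedUploader is a made-up class, and the sendChunk delegate is a placeholder for whatever actually transmits a chunk (for instance a WebClient.UploadData call to an endpoint you control that accepts an offset). The server side is assumed, not shown.

```csharp
using System;
using System.IO;
using System.Threading;

// Hypothetical uploader: sends a stream in chunks and can be paused between chunks.
public sealed class ChunkedUploader
{
    private readonly ManualResetEventSlim _running = new ManualResetEventSlim(true);
    private readonly Action<long, byte[]> _sendChunk; // placeholder transport: (offset, data)
    private readonly int _chunkSize;

    public ChunkedUploader(Action<long, byte[]> sendChunk, int chunkSize)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException("chunkSize");
        _sendChunk = sendChunk;
        _chunkSize = chunkSize;
    }

    public void Pause() { _running.Reset(); }

    public void Resume() { _running.Set(); }

    // Sends everything from startOffset to the end of the stream.
    // Returns the offset reached, which is where a later resume would start.
    public long Upload(Stream source, long startOffset)
    {
        source.Seek(startOffset, SeekOrigin.Begin);
        long offset = startOffset;
        byte[] buffer = new byte[_chunkSize];

        while (true)
        {
            _running.Wait(); // blocks here while paused

            int read = source.Read(buffer, 0, buffer.Length);
            if (read == 0) break;

            byte[] chunk = new byte[read];
            Array.Copy(buffer, chunk, read);
            _sendChunk(offset, chunk);
            offset += read;
        }
        return offset;
    }
}
```

Upload would normally run on a background thread so that Pause and Resume can be called from the UI. Because each chunk is a separate request, a long pause does not hold a connection open, although the server may still discard a partial upload after some time.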
OK, without giving you code examples, I will tell you what you can do.
Write a WCF service for your upload; that service needs to use streaming.
Things to remember:
The client and server need to identify the file somehow. I suggest using a Guid so the server knows which file to append the extra data to.
The client needs to keep track of its position in the array so it knows where to begin streaming after it resumes. (You can even have the server tell the client how much data it has, but make sure the client knows too.)
The server needs to keep track of how much data it has already received and how much is still missing. Files should have a lifetime on the server; you don't want half-uploaded and forgotten files stored on the server forever.
Please remember that streaming does not allow authentication, since the whole call is just one HTTP request. You can use SSL, but remember that it will add overhead.
You will need to create the service contract at message level; a standard method won't do.
I am currently writing a blog post on this very subject. It will be posted this week with code samples showing how to get it working.
You can check it on my blog.
I know this does not contain code samples (the blog post will have some), but all in all this is one way of doing stop and resume of file uploads to a server.
To do something like this you must write your own worker thread that does the actual HTTP POST stepwise.
Before sending each piece, you have to check whether the operation is paused, and stop sending file content until it is resumed.
However, depending on the server, the connection can be closed if it isn't active for a certain period of time, and this can be just a couple of seconds.
