We are downloading a full web page using the System.Net.WebClient class, but we only need less than half of the page. Is there a way to download just a portion of the page, say a third or half, using the .NET library, so that we can save network bandwidth and space? If so, please share your ideas. Thanks.
You need to send a Range header with your GET or POST request. That can be done using the AddRange method of HttpWebRequest (the server advertises its support via the Accept-Ranges response header):
HttpWebRequest myHttpWebRequest =
(HttpWebRequest)WebRequest.Create("http://www.foo.com");
myHttpWebRequest.AddRange(0,100);
That requests bytes 0 through 100 (the first 101 bytes). The server, however, needs to support range requests.
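A slightly fuller sketch of the same idea, keeping the placeholder URL from above; checking the status code tells you whether the server actually honored the range:

using System;
using System.IO;
using System.Net;

class RangeDownload
{
    static void Main()
    {
        var request = (HttpWebRequest)WebRequest.Create("http://www.foo.com");
        request.AddRange(0, 100); // ask for bytes 0-100 inclusive

        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            // PartialContent (206) means the range was honored;
            // OK (200) means the server ignored it and sent the whole page.
            Console.WriteLine(response.StatusCode);
            Console.WriteLine(reader.ReadToEnd());
        }
    }
}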
The short answer is no, unless the web app supports some way to tailor its response to what you want it to return. This could take the form of a query string parameter or a header field value.
The simplest approach would be a query string parameter: when the app detects it, it writes out only the necessary HTML to the response object. If you are unable to make changes to the web app, then you won't be able to control how much of a page is returned to you.
You might want to read up on how HTTP works, since the question and its answer rely upon this. Specifically, the header definitions should be helpful.
Our application has some hidden fields, like any other ASP.NET application. We use hidden fields to store the HTML of an image, which is considerably large. We use the value of the hidden field (a large HTML string) in our C# code for further processing.
We create four instances of our application and have doubts about the load balancing of the Azure cloud service. We assign values to these hidden fields midway through our application's flow via JavaScript. As this processing is done on the client side, there are no issues there. But since Azure has multiple instances, if we want to access these hidden fields on the server side (i.e., in our C#), would accessing them directly cause any problem due to load balancing if the instance changes?
Note: our page does not post back while accessing these hidden fields on the server side.
We are not clear on when the instance changes. If our page does not post back, will the request go to the same instance? Is this guaranteed?
Also, if the page does post back, does the response come from the same instance that served the original request?
We need suggestions on the correct way to access these hidden fields on the server side. These hidden fields are very important to us; using the cache/session features of Azure would become very costly for us, since the data is very large. It would be very helpful if the suggestion were a cost-free implementation, as we are already running on a tight budget.
--- 10/25/2013 ---
We have a large string of data, mostly made up of HTML obtained from a Bing Maps control; we grabbed the HTML using a jQuery selector on our Bing Maps div element. We want to send this HTML string to the code-behind. We have the string in our JavaScript, but when we make an AJAX call to the code-behind, it fails.
We even tried sending it in the body via a POST method, but that fails as well.
var htmlString = formData; // the very long HTML string
var xhr = new XMLHttpRequest();
var body = "string=" + encodeURIComponent(htmlString);
xhr.open("POST", "index.aspx/getString", true);
xhr.setRequestHeader("Content-Type", "application/x-www-form-urlencoded");
// Content-Length and Connection are forbidden request headers for
// XMLHttpRequest; the browser sets them itself and throws if a script tries.
xhr.send(body);
As Rick suggested, we had already tried blobs, but the problem is with sending the string from JavaScript to the code-behind. We are really stuck at a dead end here.
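For what it's worth, a page-method endpoint like the index.aspx/getString URL in the snippet above would normally look like the sketch below on the server. Note that ASP.NET page methods expect a JSON body with Content-Type: application/json rather than form-urlencoded data, which may be why the call fails; the parameter name str here is an assumption for illustration, and the JSON property name must match it.

using System.Web.Services;

public partial class Index : System.Web.UI.Page
{
    // Page methods must be public static and marked [WebMethod].
    // The client should send JSON such as {"str":"..."} with
    // Content-Type: application/json for ASP.NET to bind it.
    [WebMethod]
    public static string getString(string str)
    {
        // Process the large HTML string here.
        return "received " + (str ?? "").Length + " characters";
    }
}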
You indicated that caching would be costly for you. While it is true that using the Cache Service would incur some costs, have you considered co-located in-role caching as an alternative? Perhaps you have enough spare resources on your existing instances to support your needs. There is some capacity-planning guidance in the link I've provided above if you choose to take this route.
--- 10/24/2013 ---
If I'm understanding your latest description correctly, you are generating some HTML on the client that you want to upload to your web app on Azure. In the web app, you're using that HTML to generate a PDF that I assume the client will later receive. If this is your scenario, then you could simply upload the HTML and store it as a blob. That way any instance of the web app can reference it from blob storage rather than stuffing it into hidden fields.
In your web app, you can use HttpPostedFile to receive the file from the client and save it to a blob. Note: you may need to adjust the maximum size allowed for the post, since it defaults to 4 MB and you indicated your data could be up to 5 MB.
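A minimal sketch of that flow, assuming the classic Microsoft.WindowsAzure.Storage client library; the container name and connection string are placeholders:

using System.Web;
using Microsoft.WindowsAzure.Storage;      // NuGet package: WindowsAzure.Storage
using Microsoft.WindowsAzure.Storage.Blob;

public static class HtmlUploadHelper
{
    public static void SaveToBlob(HttpPostedFile posted, string connectionString)
    {
        var account = CloudStorageAccount.Parse(connectionString);
        CloudBlobClient client = account.CreateCloudBlobClient();

        // Container and blob names are examples only.
        CloudBlobContainer container = client.GetContainerReference("uploaded-html");
        container.CreateIfNotExists();

        CloudBlockBlob blob = container.GetBlockBlobReference(posted.FileName);
        blob.UploadFromStream(posted.InputStream); // any instance can read this later
    }
}

To raise the 4 MB default, set maxRequestLength (in KB) in web.config, e.g. <httpRuntime maxRequestLength="8192" /> for 8 MB.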
I'm trying to help you but your question is just not that clear. Even the title is misleading given the context of the discussion. If this doesn't help, then you may want to edit your question.
I have a bunch of parameters that I need to pass on to a second page via request headers. At first I tried via JavaScript, but I found out that that's impossible (please correct me if I'm wrong here).
So now I'm trying to do it in the code-behind (via C#). I want to write a bunch of custom request headers and call Response.Redirect or something similar to redirect user to the new page.
Is this possible? If so what methods do I have to use?
Edit: unfortunately, using query-string parameters is not an option here, as it's out of my control.
Use Server.Transfer("somepage.aspx?parameter1=value");
There is no client redirect then.
You can try setting the headers and doing a Server.Transfer; I believe that will work too. It's up to you, but using the query string is a bit more readable to me, and it doesn't show up in the client's browser.
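One caveat worth knowing: Response.Redirect makes the browser issue a brand-new request, so headers you set on the server are not carried over; Server.Transfer stays inside the same request, so you can also pass values through HttpContext.Items. A minimal sketch, where the page names and the item key are examples:

// Page1.aspx.cs
protected void Page_Load(object sender, EventArgs e)
{
    // Items survives a Server.Transfer because the transfer
    // happens inside the same server-side request.
    Context.Items["parameter1"] = "value";
    Server.Transfer("Page2.aspx?parameter1=value");
}

// Page2.aspx.cs
protected void Page_Load(object sender, EventArgs e)
{
    var fromItems = (string)Context.Items["parameter1"]; // set by page 1
    var fromQuery = Request.QueryString["parameter1"];   // also available
}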
You need to look at state management in .NET; there are various ways to achieve state in a stateless environment.
I would put the values in the Session object on page one and read them on page two (see the sketch below):
create a Session entry in the code-behind of page 1,
then read from the Session on page 2.
If you read the MSDN state-management documentation on request parameters, it will show you the options available.
As for JavaScript: don't worry about doing tricky stuff with it; mostly, tricky is wrong.
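A minimal sketch of that Session approach, assuming WebForms pages; the key name and page names are examples:

// Page1.aspx.cs -- stash the values before leaving the page.
protected void Page_Load(object sender, EventArgs e)
{
    Session["Parameters"] = "param1=value1&param2=value2"; // example payload
    Response.Redirect("Page2.aspx");
}

// Page2.aspx.cs -- read them back on the second page.
protected void Page_Load(object sender, EventArgs e)
{
    var parameters = Session["Parameters"] as string;
}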
I'm working on web scraping using C# HttpWebRequest/HttpWebResponse. For the most part the process has gone smoothly, but after POSTing my way through several pages I have gotten stuck with what seems to be an inconsistency between testing in the web browser and the HttpWebRequest/HttpWebResponse calls.
The problem occurs when I land on a page containing an input element that has a name similar to this: “RidiculouslyLongInputName.RidiculouslyLongInputName.RidiculouslyLongInputName.#RidiculouslyLong”
POSTing a value for this input element causes a 500 error when using HttpWebRequest, but it works fine when POSTing through the browser. If I remove this input value from the post data, the HttpWebRequest no longer gets the 500 error, but then I'm stuck with a data validation issue from the website.
Any idea on why HttpWebRequest is failing?
It's times like these when packet sniffers come in extremely useful for seeing exactly what data is flowing through and what the difference is. Wireshark (http://www.wireshark.org/) is a great tool for things like this.
Filter down to only the domains you're interested in, then send off the packet with HttpWebRequest. Save the packet data somewhere. Repeat but do the request through the browser. Check the difference.
If it is indeed an issue with POST variables, it should be evident in the HTTP payload.
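If the culprit turns out to be the field name itself, one thing worth checking is whether every name and value in your post body is percent-encoded; characters such as '#' in a field name can confuse some servers when sent raw. A minimal sketch of building an encoded body for HttpWebRequest, with a placeholder URL and the field name shortened from the question:

using System;
using System.IO;
using System.Net;
using System.Text;

class PostSketch
{
    static void Main()
    {
        // Placeholder URL; substitute the page you are scraping.
        var request = (HttpWebRequest)WebRequest.Create("http://example.com/page");
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded";

        // Percent-encode both names and values so characters like '#'
        // and '.' cannot be misread by the server.
        string body =
            Uri.EscapeDataString("RidiculouslyLongInputName.#RidiculouslyLong")
            + "=" + Uri.EscapeDataString("some value");

        byte[] bytes = Encoding.UTF8.GetBytes(body);
        request.ContentLength = bytes.Length;
        using (Stream stream = request.GetRequestStream())
            stream.Write(bytes, 0, bytes.Length);

        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
            Console.WriteLine(reader.ReadToEnd());
    }
}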
Not sure why you are running into the problem, but I would recommend grabbing a copy of Fiddler and taking a look at what the browser is sending in the POST request. It is possible there is something less than obvious going on.
You can also use the Firebug extension for Firefox. With the extension installed and enabled, go through the entire scenario in Firefox. Firebug will show you the exact requests/responses sent by the browser, and you can then duplicate them as closely as possible with HttpWebRequest.
First, thanks for the MEF response. That case was a personal mistake, so I deleted the question.
I think the best tool for your case is Fiddler, but I guess there is other JavaScript attached to that button, or something similar, that you are failing to mimic. WebRequest cannot do that for you, while WebBrowser can, since it works on the DOM.
To use WebRequest correctly, you really need to reverse-engineer every request with something like Fiddler. It's very hard to find out exactly what's going on by looking at the page's source (and its referenced JavaScript/CSS).
I want to create software to input data into web forms automatically (like a robot) and submit the entered data.
How can I create this software in C# (as a Windows application)?
What technologies must be used?
What open-source projects exist that I could use?
Sample code, etc. would be appreciated.
Please help me.
I hope you're doing something within the acceptable terms of use with the content you automatically post, i.e., you are not asking how to create yet another spam bot...
To grab the HTTP form, you can use WebRequest. This returns the content of the page (including the form) as a response stream. You can then parse the response using HtmlAgilityPack for the forms you are interested in. Once you know the forms and fields in the page, you can set values for the fields and post them back, again using a WebRequest but changing the method to POST and encoding the form fields as application/x-www-form-urlencoded content; see How to: Send Data Using the WebRequest Class. A sketch of the round trip follows below.
This method uses almost the most basic building blocks; going lower-level than this would mean using sockets and formatting the HTTP request yourself. At that low level you'll have a great deal of freedom and flexibility in how you parse the form and send back the request, at the cost of actually having to understand how web forms and HTTP work.
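A minimal sketch of that GET/parse/POST round trip; the URLs and field names are placeholders, and HtmlAgilityPack is assumed to be installed via NuGet:

using System;
using System.IO;
using System.Net;
using System.Text;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

class FormBot
{
    static void Main()
    {
        // 1. GET the page that contains the form (placeholder URL).
        var getRequest = WebRequest.Create("http://example.com/form.aspx");
        string html;
        using (var response = getRequest.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
            html = reader.ReadToEnd();

        // 2. Parse out the form with HtmlAgilityPack.
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        var form = doc.DocumentNode.SelectSingleNode("//form");
        string action = form.GetAttributeValue("action", "form.aspx");

        // 3. POST the filled-in fields back, form-urlencoded.
        //    Field names here are examples; collect the real ones from the form.
        string body = "name=" + Uri.EscapeDataString("John")
                    + "&email=" + Uri.EscapeDataString("john@example.com");
        var postRequest = WebRequest.Create(new Uri(new Uri("http://example.com/"), action));
        postRequest.Method = "POST";
        postRequest.ContentType = "application/x-www-form-urlencoded";
        byte[] bytes = Encoding.UTF8.GetBytes(body);
        postRequest.ContentLength = bytes.Length;
        using (var stream = postRequest.GetRequestStream())
            stream.Write(bytes, 0, bytes.Length);

        using (var postResponse = postRequest.GetResponse())
        using (var reader = new StreamReader(postResponse.GetResponseStream()))
            Console.WriteLine(reader.ReadToEnd());
    }
}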
The link below is an image URL whose extension has been stripped. I assume this is being done with content-negotiation tools. I know that it's a GIF, having viewed the metadata with Firebug. What I would like to know: working in C# on .NET, what is a simple way to get the file type of this URL?
http://ep.yimg.com/ca/I/yhst-20493720720238_2066_63220718
With most image URLs it's easy. One can use string functions to find the file type in the URL.
Ex. /imageEx.png
You're going to have to make an HTTP HEAD request and then check the Content-Type on the response. System.Net.HttpWebRequest supports HEAD requests (set the Method property to "HEAD"), so that would be the place to start.
Alternatively, you could perform a full GET request, but that could have performance implications if all you need to know is Content-Type.
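A minimal sketch of that HEAD approach against the URL from the question:

using System;
using System.Net;

class ContentTypeCheck
{
    static void Main()
    {
        // A HEAD request returns the headers only, not the image bytes.
        var request = (HttpWebRequest)WebRequest.Create(
            "http://ep.yimg.com/ca/I/yhst-20493720720238_2066_63220718");
        request.Method = "HEAD";

        using (var response = (HttpWebResponse)request.GetResponse())
        {
            Console.WriteLine(response.ContentType); // e.g. "image/gif"
        }
    }
}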
You would have to read in the image and look for "magic numbers", which can tell you what the file type really is. Here is an incomplete example of what I am talking about:
http://www.garykessler.net/library/file_sigs.html
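To make that concrete, a minimal sketch that sniffs the first bytes of the question's URL against a couple of well-known signatures (the signature list at the link above is far more complete):

using System;
using System.IO;
using System.Net;

class MagicNumberSniff
{
    static void Main()
    {
        // Download only the leading bytes and compare against known
        // signatures; "GIF8" opens both GIF87a and GIF89a files.
        var request = (HttpWebRequest)WebRequest.Create(
            "http://ep.yimg.com/ca/I/yhst-20493720720238_2066_63220718");
        request.AddRange(0, 7); // servers that honor Range send just 8 bytes

        using (var response = request.GetResponse())
        using (var stream = response.GetResponseStream())
        {
            var header = new byte[8];
            int read = stream.Read(header, 0, header.Length);

            if (read >= 4 && header[0] == 'G' && header[1] == 'I'
                          && header[2] == 'F' && header[3] == '8')
                Console.WriteLine("GIF");
            else if (read >= 4 && header[0] == 0x89 && header[1] == 'P'
                               && header[2] == 'N' && header[3] == 'G')
                Console.WriteLine("PNG");
            else
                Console.WriteLine("Unknown");
        }
    }
}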
EDIT: OK, you don't have to do it this way in this context. I am not a web guy, so this is how I would have approached it :-)
See Content-Type. You might also want to read up on content type spoofing.