I'm executing requests through some free proxy servers, and I would like to know which headers each proxy server sets. Right now I'm visiting a page that prints out the result in the HTML body.
using (WebClient client = new WebClient())
{
    WebProxy wp = new WebProxy("proxy url");
    client.Proxy = wp;
    string str = client
        .DownloadString("http://www.pagethatprintsrequestheaders.com");
}
The WebClient doesn't show the modified headers, but the page prints the correct ones. Is there any way to find out which headers are being set by the proxy without visiting a page that prints them, as in my example? Do I have to create my own HTTP listener?
When the proxy server sets its own headers, it is essentially performing its own web request. It can even hide or override some of the headers that you set using your WebProxy.
Consequently, only the target page (pagethatprintsrequestheaders.com) can reliably see the headers set by the proxy. There is no guarantee that the proxy server will send the headers it sent to the target back to you.
To put it another way, it really depends on the proxy server implementation. If the proxy server you are using is based on Apache's ProxyPass, you'd probably see the headers being set! If it's a custom implementation, then you may not see them.
You can first try inspecting the client.ResponseHeaders property of the WebClient after your response comes back. If this does not contain headers matching what (pagethatprintsrequestheaders.com) reports, then it's indeed a custom or modified implementation.
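As a rough sketch of that check, the snippet below reads client.ResponseHeaders after the download and picks out a few headers that proxies commonly add to responses. The proxy address and the candidate list ("Via", "X-Cache") are assumptions for illustration. Note that this only shows response headers, so it cannot prove which request headers the proxy sent to the target.

```csharp
using System;
using System.Collections.Generic;
using System.Net;

static class ProxyHeaderCheck
{
    // Assumed, non-exhaustive list of response headers a proxy might add.
    static readonly string[] Candidates = { "Via", "X-Cache" };

    // Returns the candidate headers that are present in the given collection.
    public static Dictionary<string, string> FindProxyHeaders(WebHeaderCollection headers)
    {
        var found = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (string name in Candidates)
        {
            string value = headers[name]; // null when the header is absent
            if (value != null)
            {
                found[name] = value;
            }
        }
        return found;
    }

    static void Main()
    {
        using (WebClient client = new WebClient())
        {
            // Placeholder proxy address; replace with the proxy under test.
            client.Proxy = new WebProxy("http://proxy.example:8080");
            client.DownloadString("http://www.pagethatprintsrequestheaders.com");

            // ResponseHeaders is populated only after a response has arrived.
            foreach (KeyValuePair<string, string> pair in FindProxyHeaders(client.ResponseHeaders))
            {
                Console.WriteLine(pair.Key + ": " + pair.Value);
            }
        }
    }
}
```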
You could then create your own proxy servers, but this is more involved. You would probably spin up an EC2 instance, install Squid/TinyProxy/YourCustomProxy on it and use that in your WebProxy call.
You may also want to modify your question and explain why you want to read the headers. There may be solutions to your overall goal that don't require reading headers at all but could be done in some other way.
It looks like you're sending a request from your WebClient through the proxy, and it's received by the host at www.pagethatprintsrequestheaders.com.
If the proxy is adding headers to the request, your WebClient will never see them on its request.
         webclient             proxy's request
         request               with headers added
client -----------> proxy ----------------------> destination host
The WebClient can only see the state of the request between it and the proxy. The proxy will create a new request to send to the destination host, and it's that request to which the headers are added. It is also that request that is received by the destination host (which is why, when it echoes back the headers, it can see those added by the proxy).
When the response comes back, the headers are set by the host. It's possible that the proxy will add some headers to the response, but even if it did, they are not likely to be the same headers it adds to a request.
         response                      response
         (forwarded by proxy)          (headers set by host)
client <------------------- proxy <------------------------- destination host
Using a host that echoes the headers back as part of the response payload is one option.
Another would be to use something between the proxy and the destination host to inspect the request there (e.g. a packet sniffer or another proxy like Fiddler that lets you see the request headers).
If the proxy is outside of your network, getting between the proxy and the destination host will be difficult (unless the host is under your control).
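If the destination host is under your control, a small header-echoing endpoint is one way to see what the proxy actually sent. The sketch below uses HttpListener; the class name, the helper and the localhost:8080 prefix are assumptions. For a remote proxy to reach it, the listener would have to be bound to a publicly reachable address, which may need extra permissions.

```csharp
using System;
using System.Collections.Specialized;
using System.Net;
using System.Text;

static class HeaderEchoServer
{
    // Renders each header as "Name: value" on its own line.
    public static string FormatHeaders(NameValueCollection headers)
    {
        var sb = new StringBuilder();
        foreach (string name in headers.AllKeys)
        {
            sb.Append(name).Append(": ").Append(headers[name]).Append('\n');
        }
        return sb.ToString();
    }

    static void Main()
    {
        using (var listener = new HttpListener())
        {
            // Assumed prefix; only reachable from the local machine as written.
            listener.Prefixes.Add("http://localhost:8080/");
            listener.Start();

            while (true)
            {
                // Blocks until a request arrives, then echoes its headers back.
                HttpListenerContext context = listener.GetContext();
                byte[] body = Encoding.UTF8.GetBytes(FormatHeaders(context.Request.Headers));

                context.Response.ContentType = "text/plain; charset=utf-8";
                context.Response.ContentLength64 = body.Length;
                context.Response.OutputStream.Write(body, 0, body.Length);
                context.Response.Close();
            }
        }
    }
}
```

Requesting this endpoint once directly and once through the proxy, then comparing the two bodies, shows which headers the proxy added or changed.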
Related
I have a solution with two ASP.NET Core MVC projects. One project (Client) is making a request to the other (Server) using HttpClient. When the action in Server receives the request, I want to get the URL of the thing that sent it. Every article I have read purports Request.Headers["Referer"] as the solution, but in my case Headers does not contain a "referer" key (or "referrer").
When receiving the request in Server, how should I find the URL of the Client that sent it?
That is how you get the referring URL for a request. But the referer isn't the thing that sent the request. The referer gets set in the headers by the browser when a person clicks on a link from one website to go to another website. When that request is made by the browser to the new website, the request will typically have the Referer header, which will contain the URL of the prior website.
The receiving server can't get the URL of the "client" making the request. Remember, a typical web browser client isn't at any URL. Typically, all the receiving server can get is the IP address of the client.
Since you have control of the client software, if you wanted you could have the client put whatever info you want in the header of the request before it's sent to the server and the server could then get that info out of the header.
If you're using HttpClient, then it is up to the site making the request to add that header. It isn't added automatically in this case. So: change the code - or request that the code is changed - so as to add the header and value that you expect. If you are proxying through a request, you might get the value from the current request's Referer header, and add that.
Even in the general case of a browser making the request as part of a normal page cycle, you can't rely on it: the Referer header is often deliberately not sent, depending on the browser version, configuration, whether you're going between different domains, whether it is HTTPS or not, and rel markers on an <a href=...> such as "noreferrer".
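A minimal sketch of adding the information on the client side with HttpClient is shown below. The header name "X-Caller-Url", the class name and the localhost URLs are hypothetical. The Server project could then read the value with Request.Headers["X-Caller-Url"] or Request.Headers["Referer"].

```csharp
using System;
using System.Net.Http;

static class CallerInfo
{
    // Hypothetical custom header name; any name both sides agree on would do.
    public const string CallerHeader = "X-Caller-Url";

    // Builds a GET request that carries the caller's URL in two headers.
    public static HttpRequestMessage BuildRequest(Uri serverUri, Uri callerUri)
    {
        var request = new HttpRequestMessage(HttpMethod.Get, serverUri);
        request.Headers.Referrer = callerUri;                      // sent as "Referer"
        request.Headers.Add(CallerHeader, callerUri.AbsoluteUri);  // custom header
        return request;
    }

    static void Main()
    {
        // Assumed addresses for the Client and Server projects.
        var server = new Uri("http://localhost:5001/api/values");
        var caller = new Uri("http://localhost:5000/");

        using (HttpRequestMessage request = BuildRequest(server, caller))
        {
            // Prints the headers that would be sent; pass the message to
            // HttpClient.SendAsync to actually send the request.
            Console.WriteLine(request.Headers);
        }
    }
}
```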
Issue:
Consider the following working code.
System.Net.WebProxy proxy = new System.Net.WebProxy(proxyServer[i]);
System.Net.HttpWebRequest objRequest = (System.Net.HttpWebRequest)System.Net.WebRequest.Create(https_url);
objRequest.Method = "GET";
objRequest.Proxy = proxy;
First, notice that proxyServer is an array so each request may use a different proxy.
If I comment out the last line thereby removing the use of any proxies, I can monitor requests in Fiddler just fine, but once I reinstate the line and start using them, Fiddler stops logging outbound requests from my app.
Question:
Do I need to configure something in Fiddler to see the requests or is there a change in .Net I can make?
Notes:
.Net 4.0
requests are sometimes https, but I don't think this is directly relevant to the issue
all requests are outbound (not localhost/127.0.0.1)
Fiddler is a proxy itself. By assigning a different proxy to your request, you're essentially taking Fiddler out of the equation.
If you're looking to capture traffic while using your own proxy, you can't do it with another proxy, because the request can only be assigned one. You want a network analyzer, such as Wireshark. This captures the traffic instead of having the traffic routed through it (as a proxy does), allowing you to monitor traffic while your requests are routed through your custom proxy.
I'm trying to use the YouTubeFisher library with ASP.NET. I make an HttpWebRequest to grab the HTML content, process the content to extract the video links, and display the links on the web page. I managed to make it work on localhost: I can retrieve video links and download the video on localhost. But when I push it to the Server, it works only if I send the request from the same Server. If that page is accessed by a client browser, the client can see the links properly, but every time the client clicks on a link it gets HTTP Error 403, even though the link is correct.
My analysis is that when the Server makes the HttpWebRequest to grab the HTML content, it sends an HTTP header as well. The HTML content (links to the video file) that is sent back from the YouTube server will, I think, respond only to a request that matches the HTTP header that was sent from the Server. So, when the client clicks on the link, it sends a request to the YouTube server with a different HTTP header.
So, I'm thinking of getting the HTTP header from the client, then modifying the Server HTTP header to include the client's HTTP header info before making the HttpWebRequest. I'm not quite sure if this will work. As far as I know, the HTTP header cannot be modified.
Below is the code that makes the HttpWebRequest from the YouTubeFisher library:
public static YouTubeService Create(string youTubeVideoUrl)
{
    YouTubeService service = new YouTubeService();
    service.videoUrl = youTubeVideoUrl;
    service.GetVideoPageHtmlSource();
    service.GetVideoTitle();
    service.GetDownloadUrl();
    return service;
}

private void GetVideoPageHtmlSource()
{
    HttpWebRequest req = HttpWebRequest.Create(videoUrl) as HttpWebRequest;
    HttpWebResponse resp = req.GetResponse() as HttpWebResponse;
    videoPageHtmlSource = new StreamReader(resp.GetResponseStream(), Encoding.UTF8).ReadToEnd();
    resp.Close();
}
When a client browses the page, the links are there but give HTTP 403:
When the page is browsed from the Server itself, everything works as expected:
How do I make HttpWebRequest on the behalf of the client then? Is my analysis of this problem correct?
Thank you for your input.
Use an HTTP monitor such as Charles, Fiddler or even Firebug to find out what additional headers are being sent from the browser in the success case. I suspect you'll need to duplicate one or more of Accept, User-Agent or Referer.
In the past I've just assumed that YouTube has those links encoded so that they only work for the original request IP. If that were the case, it would be far more difficult. I have no clue whether this is the case or not, so try forwarding all the header elements you can before going down this route...
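Once the monitor shows which headers differ, they can be copied onto the HttpWebRequest. The sketch below is a hypothetical helper, not part of YouTubeFisher, and the header values are placeholders. On .NET Framework these three headers are restricted, so they are set through their dedicated properties rather than through Headers.Add.

```csharp
using System;
using System.Net;

static class BrowserLikeRequest
{
    // Creates a request that carries the given browser-style headers.
    public static HttpWebRequest Create(string url, string userAgent, string accept, string referer)
    {
        var req = (HttpWebRequest)WebRequest.Create(url);
        req.UserAgent = userAgent; // "User-Agent" header
        req.Accept = accept;       // "Accept" header
        req.Referer = referer;     // "Referer" header
        return req;
    }

    static void Main()
    {
        // Placeholder values; in practice copy them from the captured browser request.
        HttpWebRequest req = Create(
            "http://www.example.com/",
            "Mozilla/5.0 (placeholder)",
            "text/html",
            "http://www.example.com/start");

        // Prints the headers that would be sent with the request.
        Console.WriteLine(req.Headers);
    }
}
```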
The only possibility that comes to mind is that you'd have to use a javascript request to download the page to the client's browser, then upload that to your server for processing, or do the processing in javascript.
Or you could have the client download the video stream via your server, so your server would pass through the data. This would obviously use a ton of your bandwidth.
I am hitting up a server with the following code and am encountering a ServerProtocolViolation error:
// Prepare the webpage
HttpWebRequest request = (HttpWebRequest) WebRequest.Create(url + queryString);
// execute the request
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
Does anyone know how to work around this kind of error?
This error means that the webserver that you're sending the request to isn't conforming to the HTTP standard.
Other than fixing the server or rewriting HttpWebRequest to be more generous, there isn't much you can do.
What URL are you requesting, and what's the text of the exception?
EDIT: If you request the URL in Fiddler, you'll see that the server didn't return any headers. You should contact the owner of the server and complain.
As a workaround, if you run Fiddler while sending the request, Fiddler will fix the response and allow HttpWebResponse to parse it.
Just a quick note on another reason that I experienced:
If you configure your request to use a proxy server (i.e. through the HttpWebRequest.Proxy property) and you use the wrong proxy port, you may also see that error.
In my case, I configured http://127.0.0.1/ as the proxy but had the actual proxy server running on http://127.0.0.1:808/ instead (i.e. port "808" instead of "80").
If this is the case for you, try using no proxy or of course, configure the correct proxy port.
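To illustrate the port mix-up, the sketch below compares the port a WebProxy ends up with when the port is omitted and when it is given explicitly. The helper name is an assumption; the addresses are the ones from the example above.

```csharp
using System;
using System.Net;

static class ProxyPortCheck
{
    // Returns the port the proxy will actually be contacted on.
    public static int EffectivePort(WebProxy proxy)
    {
        return proxy.Address.Port;
    }

    static void Main()
    {
        // No port in the address: the default HTTP port is used.
        Console.WriteLine(EffectivePort(new WebProxy("http://127.0.0.1/"))); // 80

        // Port given explicitly, matching where the proxy really listens.
        Console.WriteLine(EffectivePort(new WebProxy("127.0.0.1", 808)));    // 808
    }
}
```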
I'm narrowing in on an underlying problem related to two prior questions.
Basically, I've got a URL that when I fetch it manually (paste it into browser) works just fine, but when I run through some code (using the HttpWebRequest) has a different result.
The URL (example):
http://208.106.250.207:8192/announce?info_hash=-%CA8%C1%C9rDb%ADL%ED%B4%2A%15i%80Z%B8%F%C&peer_id=01234567890123456789&port=6881&uploaded=0&downloaded=0&left=0&compact=0&no_peer_id=0&event=started
The code:
String uri = BuildURI(); //Returns the above URL
HttpWebRequest req = (HttpWebRequest)HttpWebRequest.Create(uri);
req.Proxy = new WebProxy();
WebResponse resp = req.GetResponse();
Stream stream = resp.GetResponseStream();
... Parse the result (which is an error message from the server claiming the url is incorrect) ...
So, how can I GET from a server given a URL? I'm obviously doing something wrong here, but can't tell what.
Either a fix for my code, or an alternative approach that actually works would be fine. I'm not wed at all to the HttpWebRequest method.
I recommend you use Fiddler to trace both the "paste in web browser" call and the HttpWebRequest call.
Once traced you will be able to see any differences between them, whether they are differences in the request url, in the form headers, etc, etc.
It may actually be worth pasting the raw requests from both (obtained from Fiddler) here, if you can't see anything obvious.
Well, the only way they might differ is in the HTTP headers that get transmitted, in particular the User-Agent.
Also, why are you using a WebProxy? That is not really necessary and it most likely is not used by your browser.
The rest of your code is fine. Just make sure you set up the HTTP headers correctly.
I would suggest that you get yourself a copy of Wireshark and examine the communication that happens between your browser and the server that you are trying to access. Doing so is rather trivial with Wireshark, and it will show you the exact HTTP message that is being sent from the browser.
Then take a look at the communication that goes on between your C# application and the server (again using Wireshark), and compare the two to find out what exactly is different.
If the communication is a pure HTTP GET method (i.e. there is no HTTP message body involved), and the URL is correct then the only two things I could think of are:
make sure that you are sending the right protocol version (i.e. HTTP/1.0 or HTTP/1.1 or whatever it is that you should be sending)
make sure that you are sending all required HTTP headers correctly, and obviously that you are not sending any HTTP headers that you shouldn't be sending.
There could be something wrong with the URL. Instead of using a string, it's usually better to use an instance of System.Uri:
String url = BuildURI(); //Returns the above URL
Uri uri = new Uri(url);
// Pass the Uri instance, not the original string, to Create
HttpWebRequest req = (HttpWebRequest)HttpWebRequest.Create(uri);
req.Proxy = new WebProxy();
// using blocks make sure the response and its stream are disposed
using (WebResponse resp = req.GetResponse()) {
    using (Stream stream = resp.GetResponseStream()) {
        // whatever
    }
}
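One URL problem that can be checked mechanically is a malformed percent-escape: a '%' that is not followed by two hex digits. The example URL in the question appears to contain such sequences near the end of info_hash ("%F%C"), though whether that is what the server objects to is only a guess. The helper below is a hypothetical sketch that lists the positions of such escapes.

```csharp
using System;
using System.Collections.Generic;

static class EscapeCheck
{
    static bool IsHex(char c)
    {
        return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
    }

    // Returns the index of every '%' that is not followed by two hex digits.
    public static List<int> FindMalformedEscapes(string url)
    {
        var positions = new List<int>();
        for (int i = 0; i < url.Length; i++)
        {
            if (url[i] != '%')
            {
                continue;
            }
            bool wellFormed = i + 2 < url.Length && IsHex(url[i + 1]) && IsHex(url[i + 2]);
            if (wellFormed)
            {
                i += 2; // skip the two hex digits of a valid escape
            }
            else
            {
                positions.Add(i);
            }
        }
        return positions;
    }

    static void Main()
    {
        // The info_hash value copied from the example URL in the question.
        string infoHash = "-%CA8%C1%C9rDb%ADL%ED%B4%2A%15i%80Z%B8%F%C";
        foreach (int index in FindMalformedEscapes(infoHash))
        {
            Console.WriteLine("Malformed escape at index " + index);
        }
    }
}
```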
I think you need to see exactly what's flowing to your server in the HTTP request. It does sound likely that the headers are interestingly different.
You can introduce some kind of debugging proxy between your request and the server (for example, RAD has such a capability in the box).