I'm downloading a site's content using a web crawler I wrote with the Microsoft WebBrowser control.
Part of the site's content is sent only after some kind of verification from the client side; my guess is that it's cookies / session cookies.
When I try to download the page from my crawler, I see (with Fiddler's help) that the inner Ajax request sends 'false' for one of the parameters and the data is not received.
When I perform the same action from any browser, Fiddler shows that the parameter is sent as '1'.
After a day of testing, I'd be grateful for any lead. Is there a way to manipulate this parameter? Plant cookies? Any other idea?
Following khunj's answer, I'm adding the headers from IE and from my WebBrowser.
In both sets of headers I removed fields that have the same value.
From IE:
GET /feed/prematch/1-1-234562-8527419630-1-2.dat HTTP/1.1
x-requested-with: XMLHttpRequest
Referer: http://www.mySite.com/ref=12345
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 8.0)
Connection: Keep-Alive
Cookie: __utma=1.1088924975.1299439925.1299976891.1300010848.14;
__utmz=1.1299439925.1.1.utmcsr=(direct)|utmccn=
(direct)|__utmb=2.1.10.1300010848; __utmc=136771054; user_cookie=63814658;
user_hash=58b923a5a234ecb78b7cc8806a0371c5; user_time=1297166428; infobox_8=1;
user_login_id=12345; mySite=5e1c0u8g6qh41o2798ua2bfbi3
HTTP/1.1 200 OK
Date: Sun, 13 Mar 2011 10:07:38 GMT
Server: Apache
Last-Modified: Sun, 13 Mar 2011 10:07:25 GMT
ETag: "26a6d9-19df-49e5a5c9ed140"
Accept-Ranges: bytes
Content-Length: 6623
Cache-Control: max-age=0, no-cache, no-store, must-revalidate
Pragma: no-cache
Expires: Wed, 11 Jan 1984 05:00:00 GMT
Connection: close
Content-Type: text/plain
Content-Encoding: gzip
From WebBrowser:
GET /feed/prematch/1-1-234562-8527419630-false-2.dat HTTP/1.1
x-requested-with: XMLHttpRequest
Referer: http://www.mySite.com/ref=12345
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 7.0)
Connection: Keep-Alive
Cookie: __utma=1.1782626598.1299416994.1299974912.1300011023.129;
__utmb=2.1.10.1300011023; __utmz=1.1299416994.1.1.utmcsr=
(direct)|utmccn=(direct)|__utmc=136771054; user_cookie=65192487;
user_hash=6425034050442671103fdd614e4a2932; user_time=1299416986;
user_full_time_zone=37;user_login_id=12345; mySite=q9qlqqm9bunm9siho32tdqdjo0
HTTP/1.1 404 Not Found
Date: Sun, 13 Mar 2011 10:10:33 GMT
Server: Apache
Content-Length: 313
Connection: close
Content-Type: text/html; charset=iso-8859-1
Thanks in advance,
Oz.
Well, the server is obviously treating the request from your crawler differently. Since you already have Fiddler involved, what is different in your request headers when you make the request from IE versus from your crawler? I say IE because the WebBrowser control uses the same engine as IE to do its work.
The way I solved my problem was to use Fiddler as a proxy and define a custom rule that rewrites the request: whenever the PathAndQuery property contains the site address, it replaces 'false' with '1'.
Not the most elegant solution, but it fits my problem.
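For illustration only, here is the rewrite logic as a small C# function. This is not FiddlerScript (which lives in CustomRules.js); it is a sketch of the string replacement the rule performs. The class name FeedPathRewriter and the "/feed/prematch/" filter are assumptions taken from the captures above, not from the actual rule.

```csharp
using System;

public static class FeedPathRewriter
{
    // Hypothetical helper: mimics what the proxy rule does to the request path.
    public static string Rewrite(string pathAndQuery)
    {
        // Only touch the feed requests; the "/feed/prematch/" prefix is an
        // assumption based on the URLs in the captures.
        if (pathAndQuery.IndexOf("/feed/prematch/", StringComparison.OrdinalIgnoreCase) < 0)
            return pathAndQuery;

        // The WebBrowser request carries "-false-" where IE sends "-1-".
        return pathAndQuery.Replace("-false-", "-1-");
    }
}
```

Calling FeedPathRewriter.Rewrite on the WebBrowser path from the capture yields the path that IE requested.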
I learned the most from these 2 pages:
FiddlerScript CookBook
A site that explains the CustomRules.js file and the specific field I needed to edit
Thanks for the help,
Oz.
I have a C# .NET application that is writing a zipped file back to the client for download. However, the browser does not receive the file or it rejects the file. The browser does not show any notifications. I have tried it both on Firefox and Chrome.
I have captured the request and response from the client and server using Fiddler:
Request:
POST http://localhost:62526/Reports/_Report_RewardLetters HTTP/1.1
Host: localhost:62526
Connection: keep-alive
Content-Length: 228
Accept: */*
Origin: http://localhost:62526
X-Requested-With: XMLHttpRequest
User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.109 Safari/537.36
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
DNT: 1
Referer: http://localhost:62526/Reports
Accept-Encoding: gzip, deflate
Accept-Language: en-US,en;q=0.8
Response:
HTTP/1.1 200 OK
Cache-Control: private
Transfer-Encoding: chunked
Content-Type: application/octet-stream
Server: Microsoft-IIS/10.0
content-dispostion: filename=Letter.zip
X-AspNet-Version: 4.0.30319
X-SourceFiles: =?UTF-8?B?QzpcVXNlcnNcc2FpbmlfaFxTb3VyY2VcUmVwb3NcUmVzZWFyY2hPZmZpY2VEYXNoYm9hcmRcUmVzZWFyY2hPZmZpY2VEYXNoYm9hcmRcUmVwb3J0c1xfUmVwb3J0X1Jld2FyZExldHRlcnM=?=
X-Powered-By: ASP.NET
Date: Thu, 11 Feb 2016 23:13:22 GMT
....Truncated the file contents.....
My code:
Response.Clear();
Response.ClearHeaders();
Response.ClearContent();
Response.AddHeader("content-dispostion", "filename=MyFile.zip");
Response.ContentType = "application/octet-stream";
Response.Flush();
Response.WriteFile(myfile);
Response.Flush();
Response.End();
I have tried numerous combinations of Response.Flush(), Response.Clear(), HttpContext.ApplicationInstance.CompleteRequest(), Response.BinaryWrite(), Response.TransmitFile(), etc. but none seem to work. Additionally, I have in my code the necessary checks to determine the existence of the file.
From the Fiddler captures, I think something is wrong with the encoding or with the server's response for the file, so the browser rejects the file without any notification.
Thanks for your help!
Just a thought: do you need the Response.Flush statements? Setting your headers, writing your file and then calling Response.End should be enough.
Also, set the ContentType to "application/zip, application/octet-stream"
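A side note on the headers, offered as a hedged sketch: the capture shows the header spelled content-dispostion with no disposition type, whereas the standard header name is Content-Disposition, and a download is normally marked as attachment. One way to build the value is System.Net.Mime.ContentDisposition. The class name DownloadHeaders and the method AttachmentValue are made up for this example.

```csharp
using System.Net.Mime;

public static class DownloadHeaders
{
    // Hypothetical helper: builds the value of a Content-Disposition header
    // that asks the browser to save the response under the given file name.
    public static string AttachmentValue(string fileName)
    {
        var disposition = new ContentDisposition
        {
            DispositionType = DispositionTypeNames.Attachment,
            FileName = fileName
        };
        return disposition.ToString();
    }
}
```

It would be used as Response.AddHeader("Content-Disposition", DownloadHeaders.AttachmentValue("Letter.zip")). A correct header alone will not make a browser save a response that was requested through XMLHttpRequest, as in the capture above.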
I had made a mistake that was causing all the fuss. The view was using an Ajax form to make the request instead of a normal Html form.
I fixed the problem by changing my controller and view according to the information present here:
http://geekswithblogs.net/rgupta/archive/2014/06/23/downloading-file-using-ajax-and-jquery-after-submitting-form-data.aspx
Download Excel file via AJAX MVC
I have a WPF app (it could be any WinForms app, I guess) that tries to log in to a standard MVC 5 website using an HttpClient.
Normally I can log in successfully with a call to PostAsync(), where I provide the UserName and Password parameters in an HttpContent.
However, when I add the [ValidateAntiForgeryToken] to my controller's Login (POST) action, the PostAsync() call fails with Internal Server Error.
I have tried collecting the "__RequestVerificationToken" from a simple GET request and sending it with my POST request by adding it to the POST params, the Header of the request or the HttpHandler's CookieContainer (or any combination of the three) but still I get error 500 from the server.
I know it can be done with HttpWebRequest (apparently), but I don't know what I'm missing when using an HttpClient. I also don't know what exactly went wrong on the server side, or how to check that, since the code never reaches my controller method.
Did someone else try this by any chance?
EDIT 1:
I'm adding the raw data sent by the browser for both GET and POST:
GET http://localhost:57457/Account/Login HTTP/1.1
Accept: text/html, application/xhtml+xml, */*
Referer: http://localhost:57457/Account/Login
Accept-Language: en-US
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko
Accept-Encoding: gzip, deflate
Connection: Keep-Alive
DNT: 1
Host: localhost:57457
Cookie: NavigationTreeViewState=%5b%7b%27N0_1%27%3a%27T%27%2c%27N0%27%3a%27T%27%7d%2c%27N0_1_2%27%2c%7b%7d%5d; style=default; __RequestVerificationToken=Bak42Ga5sHJitYlmut6OgvmqXNmP7kKQRNaMSsLMAUh86iHGGmz5pnNfz_soKu46Wax9sG23arPOTnSh1bvaWyWqQ9NH4GJxFmendW8VFTg1
RESPONSE:
HTTP/1.1 200 OK
Cache-Control: private
Content-Type: text/html; charset=utf-8
Content-Encoding: gzip
Vary: Accept-Encoding
Server: Microsoft-IIS/8.0
X-AspNetMvc-Version: 5.2
X-Frame-Options: SAMEORIGIN
X-AspNet-Version: 4.0.30319
X-SourceFiles: =?UTF-8?B?RDpcRUJTLkNvZGVcUHJvamVjdHNcQ1ZSUE9TX1dlYlNpdGVcQ1ZSUE9TX1dlYlNpdGVcQWNjb3VudFxMb2dpbg==?=
X-Powered-By: ASP.NET
Date: Thu, 04 Dec 2014 10:00:00 GMT
Content-Length: 1734
[View page content]
POST http://localhost:57457/Account/Login HTTP/1.1
Accept: text/html, application/xhtml+xml, */*
Referer: http://localhost:57457/Account/Login
Accept-Language: en-US
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko
Content-Type: application/x-www-form-urlencoded
Accept-Encoding: gzip, deflate
Connection: Keep-Alive
Content-Length: 180
DNT: 1
Host: localhost:57457
Pragma: no-cache
Cookie: NavigationTreeViewState=%5b%7b%27N0_1%27%3a%27T%27%2c%27N0%27%3a%27T%27%7d%2c%27N0_1_2%27%2c%7b%7d%5d; style=default; __RequestVerificationToken=Bak42Ga5sHJitYlmut6OgvmqXNmP7kKQRNaMSsLMAUh86iHGGmz5pnNfz_soKu46Wax9sG23arPOTnSh1bvaWyWqQ9NH4GJxFmendW8VFTg1
__RequestVerificationToken=Bak42Ga5sHJitYlmut6OgvmqXNmP7kKQRNaMSsLMAUh86iHGGmz5pnNfz_soKu46Wax9sG23arPOTnSh1bvaWyWqQ9NH4GJxFmendW8VFTg1&UserName=test&Password=test
RESPONSE:
HTTP/1.1 400 Bad request (user/password for testing purposes only)
Cache-Control: private
Content-Type: text/html; charset=utf-8
Server: Microsoft-IIS/8.0
X-AspNetMvc-Version: 5.2
X-Frame-Options: SAMEORIGIN
X-AspNet-Version: 4.0.30319
X-SourceFiles: =?UTF-8?B?RDpcRUJTLkNvZGVcUHJvamVjdHNcQ1ZSUE9TX1dlYlNpdGVcQ1ZSUE9TX1dlYlNpdGVcQWNjb3VudFxMb2dpbg==?=
X-Powered-By: ASP.NET
Date: Thu, 04 Dec 2014 10:00:00 GMT
Content-Length: 4434
[View page content]
EDIT 2:
This is what my app sends for GET and POST:
GET http://localhost:57457/Account/Login HTTP/1.1
Host: localhost:57457
Connection: Keep-Alive
POST http://localhost:57457/Account/Login HTTP/1.1
Content-Type: application/x-www-form-urlencoded
Host: localhost:57457
Cookie: __RequestVerificationToken=df9nBSP_J1IiLrv84RwrkmvbYBrnH4iqv97wRvz6HMPLWBhgI4XzGeAFcschovHwD8mTtHU6xrmVxz1Ku96_BaoB79le_vLTcrgGemU4gjc1
Content-Length: 163
Expect: 100-continue
__RequestVerificationToken=df9nBSP_J1IiLrv84RwrkmvbYBrnH4iqv97wRvz6HMPLWBhgI4XzGeAFcschovHwD8mTtHU6xrmVxz1Ku96_BaoB79le_vLTcrgGemU4gjc1&UserName=test&Password=test
And finally this is the error:
[HttpAntiForgeryException (0x80004005): Validation of the provided anti-forgery token failed. The cookie "__RequestVerificationToken" and the form field "__RequestVerificationToken" were swapped.]
Thanks!
You'd probably need to include the ASP.NET session ID cookie with your requests.
EDIT:
OK, you're right, it is not the session ID, but you need to send two tokens back to your POST action.
I think what you're doing wrong is using the same value for both tokens. They should be different, although both are named __RequestVerificationToken.
The token taken from the cookie should be sent back as a cookie, and the token taken from the form field should be sent back as a form field.
It's because you're missing the anti-forgery token from HtmlHelper.AntiForgeryToken() in your POST from your application.
You'll need to load a page from your WPF application with HtmlHelper.AntiForgeryToken() on the view. Then take the value of the hidden input element with the name __RequestVerificationToken and attach it to your login POST request to the server.
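The steps described in these answers could be sketched roughly as follows. This is an assumption-laden outline, not a tested login client: the class AntiForgeryLogin and its methods are invented for the example, and the regular expression assumes the hidden input is rendered with the name attribute before the value attribute. The cookie token is never copied by hand; the CookieContainer stores it from the GET response and sends it back on the POST, while the form token is read out of the HTML.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

public static class AntiForgeryLogin
{
    // Matches the hidden input emitted by Html.AntiForgeryToken().
    // Assumption: the name attribute comes before the value attribute.
    private static readonly Regex TokenField = new Regex(
        "<input[^>]*name=\"__RequestVerificationToken\"[^>]*value=\"([^\"]+)\"",
        RegexOptions.IgnoreCase);

    // Returns the form token found in the page, or null if there is none.
    public static string ExtractFormToken(string html)
    {
        Match match = TokenField.Match(html);
        return match.Success ? WebUtility.HtmlDecode(match.Groups[1].Value) : null;
    }

    // The handler's CookieContainer keeps the cookie token between requests.
    public static HttpClient CreateClient()
    {
        var handler = new HttpClientHandler { CookieContainer = new CookieContainer() };
        return new HttpClient(handler);
    }

    public static async Task<HttpResponseMessage> LoginAsync(
        HttpClient client, Uri loginUrl, string userName, string password)
    {
        // 1. GET the login page: sets the cookie token and renders the form token.
        string html = await client.GetStringAsync(loginUrl);
        string formToken = ExtractFormToken(html);
        if (formToken == null)
            throw new InvalidOperationException("No anti-forgery field found on the page.");

        // 2. POST the form token as a field; the cookie token travels as a cookie.
        var form = new FormUrlEncodedContent(new Dictionary<string, string>
        {
            { "__RequestVerificationToken", formToken },
            { "UserName", userName },
            { "Password", password }
        });
        return await client.PostAsync(loginUrl, form);
    }
}
```

The same HttpClient instance has to be used for both requests, otherwise the cookie from the GET is lost.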
The request has no Accept-Encoding: gzip, deflate header, but the response contains Content-Encoding: gzip. Can this cause decompression to fail? If so, how can I avoid it?
Request URL: http://something.com/something.js
Request Method: GET
Status Code: 200 OK
Request Headers
Accept: */*
Referer: somthing.comsomthing.aspx
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36
X-DevTools-Emulate-Network-Conditions-Client-Id: 2D3ED9B5-95BD-4984-9EEE-405C2889F11E
Response Headers
Accept-Ranges: bytes
Content-Encoding: gzip
Content-Length: 884
Content-Type: application/x-javascript
Date: Tue, 28 Oct 2014 11:09:13 GMT
ETag: "0ac99ce3e9fcf1:0"
Last-Modified: Mon, 14 Jul 2014 08:37:12 GMT
Server: Microsoft-IIS/8.0
Vary: Accept-Encoding
X-Powered-By: ASP.NET
From RFC 7231:
A request without an Accept-Encoding header field implies that the
user agent has no preferences regarding content-codings. Although
this allows the server to use any content-coding in a response, it
does not imply that the user agent will be able to correctly process
all encodings.
In short: if you specify no Accept-Encoding, it's legal (though ill-advised) for the server to send you compressed content. There doesn't appear to be a solid, reliable way to tell a web server that it should definitely not compress. You can try Accept-Encoding: *;q=0 or Accept-Encoding: identity, but support for this is not universal across web servers, and proxies can mess things up as well.
In the end you are probably better off with simply handling compressed content if it comes back as such -- there is no good reason for a client to not support compression and libraries for this are freely available.
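For a .NET client, handling compressed content could look like the sketch below. The class name GzipSupport and both method names are made up for the example. CreateClient relies on HttpClientHandler.AutomaticDecompression, which makes the handler send Accept-Encoding and decode the response transparently; DecodeGzipText is a manual fallback for a body that was already read as raw bytes, and it assumes the decompressed text is UTF-8.

```csharp
using System.IO;
using System.IO.Compression;
using System.Net;
using System.Net.Http;
using System.Text;

public static class GzipSupport
{
    // Client that advertises gzip/deflate and decompresses responses itself.
    public static HttpClient CreateClient()
    {
        var handler = new HttpClientHandler
        {
            AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate
        };
        return new HttpClient(handler);
    }

    // Manual fallback: decodes a gzip-compressed body into text.
    // Assumption: the uncompressed payload is UTF-8.
    public static string DecodeGzipText(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var reader = new StreamReader(gzip, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

With automatic decompression enabled, the calling code reads the response as usual and does not need to inspect Content-Encoding.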
I have a webservice that needs a Basic Authentication header. However, when I call it using
var header = "Authorization: Basic " +
CreateBasicHttpAuthenticationHeader(login, password);
webRequest.Headers.Add(header);
var webResponse = (HttpWebResponse)webRequest.GetResponse();
It returns a 303 - See Other:
POST https://myservice/rates HTTP/1.1
Authorization: Basic QXZ...NjY=
Content-Type: application/x-content
X-API-Version: 1.1
If-Unmodified-Since: Mon, 23 Sep 2013 08:32:27 GMT
User-Agent: UserAgent
Response:
HTTP/1.1 303 See Other
Date: Mon, 23 Sep 2013 08:30:57 GMT
X-Opaque-ID: q8nxxxxc
Location: https://myservice/rates
Content-Length: 0
.Net then automatically sends a GET request to the new location:
GET https://myservice/rates HTTP/1.1
Content-Type: application/x-content
X-API-Version: 1.1
If-Unmodified-Since: Mon, 23 Sep 2013 08:32:27 GMT
User-Agent: UserAgent
But it does not send the Authorization header this time. Do you know a way to tell it to send all headers on all calls? Or should I tell it not to follow the redirect?
You have to set the AllowAutoRedirect property of your HttpWebRequest to false.
Then you have to build your own redirect handling that re-adds the Basic auth header on each hop.
This is not wonderful, but otherwise the .NET Framework drops the header when redirecting.
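A minimal sketch of such a redirect loop, written against HttpClient rather than HttpWebRequest (an assumption on my part; the idea is the same). The class AuthRedirects and its members are hypothetical names. It follows only 301/302/303, switches to GET for the follow-up request as in the capture above, and re-attaches the credentials only while the target stays on the original host, so they are not leaked to another site.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

public static class AuthRedirects
{
    // The client must not follow redirects itself, or this loop never sees them.
    public static HttpClient CreateClient()
    {
        return new HttpClient(new HttpClientHandler { AllowAutoRedirect = false });
    }

    // Returns the absolute redirect target, or null when the response is not
    // a 301/302/303 carrying a Location header.
    public static Uri ResolveRedirect(Uri current, HttpResponseMessage response)
    {
        HttpStatusCode code = response.StatusCode;
        bool redirect = code == HttpStatusCode.MovedPermanently
                     || code == HttpStatusCode.Found
                     || code == HttpStatusCode.SeeOther;
        if (!redirect || response.Headers.Location == null) return null;
        return new Uri(current, response.Headers.Location);
    }

    public static async Task<HttpResponseMessage> SendAsync(
        HttpClient client, HttpMethod method, Uri uri, HttpContent content,
        AuthenticationHeaderValue auth, int maxRedirects = 5)
    {
        string originalHost = uri.Host;
        for (int hop = 0; hop <= maxRedirects; hop++)
        {
            var request = new HttpRequestMessage(method, uri) { Content = content };
            // Re-attach credentials only while we stay on the original host.
            if (string.Equals(uri.Host, originalHost, StringComparison.OrdinalIgnoreCase))
                request.Headers.Authorization = auth;

            HttpResponseMessage response = await client.SendAsync(request);
            Uri next = ResolveRedirect(uri, response);
            if (next == null) return response;

            response.Dispose();
            uri = next;
            method = HttpMethod.Get;   // the follow-up request is a GET without a body
            content = null;
        }
        throw new InvalidOperationException("Too many redirects.");
    }
}
```

The auth value would be built as new AuthenticationHeaderValue("Basic", encodedCredentials), where encodedCredentials is the Base64 string the question already produces.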
I am trying to automate the daily retrieving of a web file using .NET.
The file is a PDF located at an address similar to:
http://www.example.com/?s=doc20101022
and these are the HTTP response headers captured while debugging with IE:
HTTP/1.1 200 OK
Server: Apache/2.2.3 (CentOS)
Vary: User-Agent,Accept-Encoding
Expires: 0
Cache-Control: must-revalidate, post-check=0, pre-check=0
Pragma: public
Last-Modified: Mon, 22 Nov 2010 22:45:12 GMT
Cache-Control: private
Content-Disposition: attachment; filename="doc20101022.pdf"
Content-Transfer-Encoding: binary
Content-Type: application/force-download
Date: Tue, 23 Nov 2010 10:41:43 GMT
X-Varnish: 2155914052
Via: 1.1 varnish
Content-Length: 6596997
Proxy-Connection: Keep-Alive
Connection: Keep-Alive
Age: 2
Can you please suggest a way to get it and save it locally using WebClient, WebBrowser or other VB.NET (Framework 4.0) components?
Use the DownloadFile or DownloadFileAsync method of WebClient:
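// Requires: using System; using System.ComponentModel; using System.Net;
WebClient wc = new WebClient();
wc.DownloadFileCompleted += delegate(object source, AsyncCompletedEventArgs args) {
    // This event also fires on failure or cancellation, so check before using the file.
    if (args.Error == null && !args.Cancelled) {
        // Do something when the file has been downloaded successfully.
    }
};
wc.DownloadFileAsync(new Uri("http://www.example.com/?s=doc20101022"), @"C:\Yourfile.pdf");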
Edit: You tagged the question with C# and only mentioned .NET in the subject, so I've provided a C# solution. If you need it in VB.NET, it should be easy to port.