I am using HttpWebRequest to simulate a POST request that logs in to a website. When I use Firebug to track what the browser does during the login, I see that it makes several GET requests after the login POST.
How can I make my POST request automatically perform those GET requests?
Someone told me to use the JS functions, but I am totally clueless about this.
private static async Task<byte[]> LoginAsync(string username, string password)
{
var postData = new NameValueCollection();
var uri = new Uri(string.Format("http://{0}/", ServerName));
postData.Add("name", username);
postData.Add("password", password);
postData.Add("login", ParseLoginId(await GetPage("login.php")));
return await HttpHandler.UploadValuesTaskAsync(uri, postData);
}
My HTTP handler:
private CookieContainer _mContainer = new CookieContainer();
protected override WebRequest GetWebRequest(Uri address)
{
var request = base.GetWebRequest(address);
if (request is HttpWebRequest)
{
(request as HttpWebRequest).CookieContainer = _mContainer;
}
return request;
}
public void ClearCookies()
{
_mContainer = new CookieContainer();
}
I am using the code above to send the POST request, but it does not fully simulate what the browser does: it does not automatically send the required GET requests after the login.
The GET requests that follow the login are likely redirects triggered by a Location header.
Why not use the System.Net.WebClient class?
C# Console/Server access to web site
The link shows you how to extend the WebClient to persist cookie information.
Related
I am trying to log in to a website and then browse around to collect some information and do some things on it.
Everything works when the request does not need any cookies, but some pages need a cookie that is created on the first request, so I need to collect all of the cookies in the WebClient object.
I am using this code, but it is not enough: I am still missing cookies on the next request.
public class CookieAwareWebClient : WebClient
{
public CookieContainer CookieContainer { get; set; }
public CookieCollection ResponseCookies { get; set; }
//public CookieContainer ResponseCookieContainer { get; set; }
public CookieAwareWebClient()
: base()
{
CookieContainer = new CookieContainer();
ResponseCookies = new CookieCollection();
}
protected override WebRequest GetWebRequest(Uri address)
{
WebRequest request = base.GetWebRequest(address);
HttpWebRequest webRequest = request as HttpWebRequest;
if (webRequest != null)
{
webRequest.CookieContainer = CookieContainer;
}
return request;
}
protected override WebResponse GetWebResponse(WebRequest request)
{
var response = (HttpWebResponse)base.GetWebResponse(request);
this.ResponseCookies = response.Cookies;
return response;
}
}
Here is my code for making the request.
var loginLink ="https...."; // an Uri with username and password values as queryString
CookieAwareWebClient client = new CookieAwareWebClient();
var loginResult = client.DownloadString(loginLink);
I can see the result and the login succeeds, but I am losing all of my cookies, and the next request sends me back to the login page.
I need to collect all of the cookies in my WebClient's CookieContainer, which means reading the "Set-Cookie" values from the response headers and adding them to my container.
Consider this: I have 3 keys in my cookies before the request.
a = "123",
b = "asd",
c = "123"
Now I send a request to the website and it returns 2 cookies (one new key, and one existing key with a new value). I can see them in the "Set-Cookie" response headers:
a = "123456",
d = "blabla"
So I need to change the value of key "a" and add key "d" to my cookies, because I do not want to be sent back to the login page on my next request.
Maybe I need a library, or a better WebClient class that can collect all cookies, to help me browse all the pages.
I hope someone can help me.
Best Regards!
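The update described above (key "a" gets a new value, key "d" is added, "b" and "c" stay) is how CookieContainer behaves when a cookie with the same name, domain and path is added again. The following is a minimal offline sketch of that behavior, not a fix for the site in question; the class name CookieMergeSketch and the host example.com are made up for illustration. When the same container is assigned to each HttpWebRequest, the framework normally applies Set-Cookie headers to it for you, so manual parsing is usually unnecessary.

```csharp
using System;
using System.Net;

// Hypothetical helper, used only to illustrate CookieContainer semantics.
public static class CookieMergeSketch
{
    // Builds a jar holding the three cookies from the question: a, b, c.
    public static CookieContainer BuildInitialJar(Uri site)
    {
        var jar = new CookieContainer();
        jar.Add(site, new Cookie("a", "123"));
        jar.Add(site, new Cookie("b", "asd"));
        jar.Add(site, new Cookie("c", "123"));
        return jar;
    }

    // Simulates the two cookies returned in the response's Set-Cookie headers.
    public static void ApplyResponseCookies(CookieContainer jar, Uri site)
    {
        // Same name, domain and path as an existing cookie: the value is replaced.
        jar.Add(site, new Cookie("a", "123456"));
        // New name: the cookie is added alongside the existing ones.
        jar.Add(site, new Cookie("d", "blabla"));
    }
}
```

Calling BuildInitialJar and then ApplyResponseCookies with the same Uri leaves four cookies in the jar; jar.GetCookies(site)["a"].Value then holds the updated value. If cookies still go missing between requests, it is worth checking that every request reuses the same client instance and that the cookie domain and path match the later URLs.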
Recently, I came across a Python script to download files directly from Kaggle: https://ramhiser.com/2012/11/23/how-to-download-kaggle-data-with-python-and-requests-dot-py/
I am trying to do something similar using WebClient in C#. I came across the following answer on Stack Overflow: C# download file from the web with login
I tried using it, but I seem to be downloading only the login page instead of the actual file. Here's my main code:
CookieContainer cookieJar = new CookieContainer();
CookieAwareWebClient http = new CookieAwareWebClient(cookieJar);
string postData = "name=<username>&password=<password>&submit=submit";
string response = http.UploadString("https://www.kaggle.com/account/login", postData);
Console.Write(response);
http.DownloadFile("https://www.kaggle.com/c/titanic/download/train.csv", "train.CSV");
I've used the WebClient extension from the link above and modified it slightly:
public class CookieAwareWebClient : WebClient
{
public CookieContainer CookieContainer { get; set; }
public Uri Uri { get; set; }
public CookieAwareWebClient()
: this(new CookieContainer())
{
}
public CookieAwareWebClient(CookieContainer cookies)
{
this.CookieContainer = cookies;
}
protected override WebRequest GetWebRequest(Uri address)
{
this.Uri = address;
WebRequest request = base.GetWebRequest(address);
var httpRequest = request as HttpWebRequest;
if (httpRequest != null)
{
// Only HTTP(S) requests carry cookies and support automatic decompression.
httpRequest.CookieContainer = this.CookieContainer;
httpRequest.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;
}
return request;
}
protected override WebResponse GetWebResponse(WebRequest request)
{
WebResponse r = base.GetWebResponse(request);
var response = r as HttpWebResponse;
if (response != null)
{
CookieCollection cookies = response.Cookies;
CookieContainer.Add(cookies);
}
return r; // return the original response; the "as" cast above is null for non-HTTP responses
}
}
Was wondering if anyone can point out where I went wrong?
Thanks.
We have created a forum post to help you accomplish what you wanted to do, Accessing Kaggle API through C#. Feel free to post here or on the forum if you have additional questions.
Try going to https://www.kaggle.com/c/titanic/download/train.csv in your browser without being logged in: the browser will open that page instead of downloading the file. You need to use a direct link to the file instead of a link to a web page.
Your code works perfectly; you just need to use a direct link to that file or make sure you are logged in before downloading the file.
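One thing that may be worth checking when the login POST appears to have no effect: as far as I know, WebClient.UploadString does not set a form Content-Type on its own (UploadValues does), so the server may not parse the body as form fields. The sketch below is an assumption about a possible cause, not a confirmed diagnosis for Kaggle; FormPostSketch, BuildFormBody and PostForm are hypothetical names. It percent-encodes each field and sets the header explicitly before posting.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;

// Hypothetical helper for posting HTML form fields with WebClient.
public static class FormPostSketch
{
    // Joins the fields as key=value pairs separated by '&', percent-encoding
    // keys and values (a space becomes %20, which servers generally accept).
    public static string BuildFormBody(IEnumerable<KeyValuePair<string, string>> fields)
    {
        return string.Join("&", fields.Select(f =>
            Uri.EscapeDataString(f.Key) + "=" + Uri.EscapeDataString(f.Value)));
    }

    // Posts the fields as a URL-encoded form and returns the response body.
    public static string PostForm(WebClient client, string url,
        IEnumerable<KeyValuePair<string, string>> fields)
    {
        // Set on every call, since the header is not guaranteed to persist across requests.
        client.Headers[HttpRequestHeader.ContentType] = "application/x-www-form-urlencoded";
        return client.UploadString(url, BuildFormBody(fields));
    }
}
```

Passing the CookieAwareWebClient instance as the client keeps the session cookies for the later DownloadFile call. Inspecting the returned HTML for the login form is a simple way to tell whether the login actually succeeded before attempting the download.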
I know it's not exactly what you were asking, but Kaggle now has an official API that you can use to download data. Should be a bit easier to use. :)
If I enter the URL in the browser, my server responds properly (with XML).
However, if the same URL is passed to the WebClient.DownloadString() method, something in the URL changes: my server still responds, but with an access denied message (also XML).
"Error message"
<?xml version="1.0" encoding="ISO-8859-1"?><said:service xmlns:said="http:xxx"><said:codigo_erro>8</said:codigo_erro><said:mensagem_erro>Unable</said:mensagem_erro></said:service>
The URL used on request is like this one:
http://...<parameter1>S<%2Fparameter1>%0D%0A++<parameter2>S<%2Fparameter2>%0D%0A++<parameter3>S<%2Fparameter3>%0D%0A<%2Fqueryservice>%0D%0A%09%09
I have already tried changing the encoding to UTF-8, ISO, etc. None of them worked.
You have to be sure that you're sending all the necessary data, cookies and request headers that the server is expecting.
I advise you to install Fiddler Web Debugger and monitor successful requests from the web browser, then try to recreate those requests in your application.
Maybe the server is redirecting you to an error page because WebClient is not handling cookies. You can create your own version of WebClient and add cookie support: create a class that inherits from WebClient and override the GetWebRequest method to attach a CookieContainer. The following is a simple implementation of a WebClient that handles cookies:
public class MyWebClient : WebClient
{
public CookieContainer CookieContainer { get; private set; }
public MyWebClient()
{
this.CookieContainer = new CookieContainer();
}
protected override WebRequest GetWebRequest(Uri address)
{
WebRequest request = base.GetWebRequest(address);
if (request is HttpWebRequest)
{
(request as HttpWebRequest).CookieContainer = this.CookieContainer;
(request as HttpWebRequest).AllowAutoRedirect = true;
}
return request;
}
}
When logging the login process using Firebug, I see that it looks like this:
POST //The normal post request
GET //Automatically made after the login
GET //Automatically made after the login
GET //Automatically made after the login
When I make a POST request using my code below, it does not make the automatic GET requests that the browser makes.
My WebClient handler:
using System;
using System.Net;
namespace Test
{
class HttpHandler : WebClient
{
private CookieContainer _mContainer = new CookieContainer();
protected override WebRequest GetWebRequest(Uri address)
{
var request = base.GetWebRequest(address);
if (request is HttpWebRequest)
{
(request as HttpWebRequest).CookieContainer = _mContainer;
}
return request;
}
protected override WebResponse GetWebResponse(WebRequest request)
{
var response = base.GetWebResponse(request);
if (response is HttpWebResponse)
_mContainer.Add((response as HttpWebResponse).Cookies);
return response;
}
public void ClearCookies()
{
_mContainer = new CookieContainer();
}
}
}
Calling code:
private static async Task<byte[]> LoginAsync(string username, string password)
{
var postData = new NameValueCollection();
var uri = new Uri(string.Format("http://{0}/", ServerName));
postData.Add("name", username);
postData.Add("password", password);
return await HttpHandler.UploadValuesTaskAsync(uri, postData);
}
When I trace my application's connections, it only sends the POST request and not the GET requests that the browser makes automatically.
Try adding
((HttpWebRequest)request).AllowAutoRedirect = true;
right under the
var request = base.GetWebRequest(address);
It solved some similar problems for me, even though AllowAutoRedirect is supposed to be true by default. (The cast is needed because AllowAutoRedirect is declared on HttpWebRequest, not on the WebRequest base class.)
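To check what the redirect settings actually are before changing them, the request object can be inspected without sending anything. This is a small sketch; RedirectDefaultsSketch and its methods are hypothetical names, and on recent .NET versions WebRequest.Create compiles with an obsolescence warning.

```csharp
using System;
using System.Net;

// Hypothetical helper for inspecting redirect settings on a request.
public static class RedirectDefaultsSketch
{
    // WebRequest.Create only builds the request object; no network traffic
    // happens until GetResponse (or a similar method) is called.
    public static HttpWebRequest Create(string url)
    {
        return (HttpWebRequest)WebRequest.Create(url);
    }

    // Formats the two redirect-related settings for logging.
    public static string Describe(HttpWebRequest request)
    {
        return "AllowAutoRedirect=" + request.AllowAutoRedirect +
               ", MaximumAutomaticRedirections=" + request.MaximumAutomaticRedirections;
    }
}
```

Logging Describe(...) from inside GetWebRequest shows whether something has switched redirects off. Even with redirects enabled, HttpWebRequest only follows Location headers; it does not parse HTML or run JavaScript, so GET requests that the browser issues for images, scripts or script-driven navigation will not appear.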
That shouldn't be surprising, given that HttpWebRequest is not a browser. If you need to perform these redirects yourself, check HttpWebResponse.StatusCode and make another request if it is a redirect code in the 300s. Note the following from RFC 2616, section 10.3 (Redirection 3xx):
This class of status code indicates that further action needs to be taken by the user agent in order to fulfill the request. The action required MAY be carried out by the user agent without interaction with the user if and only if the method used in the second request is GET or HEAD. A client SHOULD detect infinite redirection loops, since such loops generate network traffic for each redirection.
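The manual approach described above could look roughly like the following sketch: automatic redirects are switched off, each response's status code is checked, and the Location header is followed with a new GET up to a fixed hop limit to guard against redirect loops. RedirectFollower, its method names and the default limit of 10 hops are all assumptions made for illustration.

```csharp
using System;
using System.Net;

// Hypothetical helper that follows redirects by hand.
public static class RedirectFollower
{
    // True for the status codes that point to another URL via Location.
    public static bool IsRedirect(HttpStatusCode status)
    {
        int code = (int)status;
        return code == 301 || code == 302 || code == 303 || code == 307 || code == 308;
    }

    // Resolves a Location value, which may be relative, against the current URL.
    public static Uri ResolveLocation(Uri current, string location)
    {
        return new Uri(current, location);
    }

    // Issues GET requests starting at 'start', sharing 'cookies' across hops,
    // until a non-redirect response arrives or 'maxHops' is exceeded.
    // The caller is responsible for disposing the returned response.
    public static HttpWebResponse GetFollowingRedirects(Uri start, CookieContainer cookies, int maxHops = 10)
    {
        Uri current = start;
        for (int hop = 0; hop <= maxHops; hop++)
        {
            var request = (HttpWebRequest)WebRequest.Create(current);
            request.Method = "GET";
            request.CookieContainer = cookies;
            request.AllowAutoRedirect = false; // handle 3xx ourselves

            var response = (HttpWebResponse)request.GetResponse();
            if (!IsRedirect(response.StatusCode))
            {
                return response;
            }

            string location = response.Headers["Location"];
            response.Close();
            if (string.IsNullOrEmpty(location))
            {
                throw new WebException("Redirect response without a Location header.");
            }
            current = ResolveLocation(current, location);
        }
        throw new WebException("Too many redirects.");
    }
}
```

Doing this by hand is mainly useful for logging each hop or for inspecting cookies between hops; otherwise leaving AllowAutoRedirect enabled achieves the same result with less code.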
I want to log out from a page using WebClient.
This is my code for login and site downloading.
public bool LogIn(string loginName, string password)
{
try
{
NameValueCollection postData = new NameValueCollection();
postData.Add("login", loginName);
postData.Add("password", password);
// Authenticate
_webClient.UploadValues("http://rapideo.pl/login.php", postData);
//string temp = _webClient.DownloadString("http://rapideo.pl/lista");
}
catch
{
return false;
}
_loggedIn = true;
_loginName = loginName;
return true;
}
class WebClientEx : WebClient
{
public CookieContainer CookieContainer { get; private set; }
public WebClientEx()
{
CookieContainer = new CookieContainer();
}
protected override WebRequest GetWebRequest(Uri address)
{
var request = base.GetWebRequest(address);
if (request is HttpWebRequest)
{
(request as HttpWebRequest).CookieContainer = CookieContainer;
}
return request;
}
}
To log out, I only need to open this page in a browser:
http://rapideo.pl/wyloguj
I know how to download the source of a page after logging in.
But how can I send an HTTP request to log out? I don't want to get the response or the source of that page. I just want to send the request.
As a sanity check, have you already tried calling _webClient.DownloadString("http://rapideo.pl/wyloguj") and just discarding the returned data?
If that is not working, one thing to try would be to look at the request/response messages in a tool like Fiddler to see what exactly is going over the wire when you log out via the browser versus programmatically.
Also, as a general aside, it looks like the user's name and password are being sent in the clear as part of the login. Not sure if there is an HTTPS login endpoint available for that site but that would be something to look into.