WebClient.DownloadString changing the requested URL - c#

If I put the URL in the browser, my server responds properly (with XML).
However, if the same URL is passed through the WebClient.DownloadString() method, something in the URL seems to change: my server still responds, but with an access-denied message (also XML).
"Error message"
<?xml version="1.0" encoding="ISO-8859-1"?><said:service xmlns:said="http:xxx"><said:codigo_erro>8</said:codigo_erro><said:mensagem_erro>Unable</said:mensagem_erro></said:service>
The URL used on request is like this one:
http://...<parameter1>S<%2Fparameter1>%0D%0A++<parameter2>S<%2Fparameter2>%0D%0A++<parameter3>S<%2Fparameter3>%0D%0A<%2Fqueryservice>%0D%0A%09%09
I have already tried changing the encoding to UTF-8, ISO, etc. None of them worked.

You have to be sure that you're sending all the necessary data, cookies and request headers that the server is expecting.
I advise you to install Fiddler Web Debugger and monitor successful requests from the web browser, then try to recreate those requests in your application.
Maybe the server is redirecting you to an error page because WebClient is not handling cookies. You can create your own version of WebClient and add cookie support: create a class that inherits from WebClient and override the GetWebRequest method, where you attach a CookieContainer. The following is a simple implementation of a WebClient that handles cookies:
public class MyWebClient : WebClient
{
    public CookieContainer CookieContainer { get; private set; }

    public MyWebClient()
    {
        this.CookieContainer = new CookieContainer();
    }

    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);
        if (request is HttpWebRequest)
        {
            (request as HttpWebRequest).CookieContainer = this.CookieContainer;
            (request as HttpWebRequest).AllowAutoRedirect = true;
        }
        return request;
    }
}
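To illustrate why a single shared CookieContainer matters, here is a minimal offline sketch. It assumes a hypothetical host (example.com) and a hypothetical cookie named "session"; no request is sent. It only shows that a cookie stored in the container is offered again for later URLs on the same host, which is what the client above relies on.

```csharp
using System;
using System.Net;

static class CookieDemo
{
    // Returns the Cookie header the container would attach to a later request.
    // The host, cookie name, and cookie value are placeholders for illustration.
    public static string HeaderForSecondRequest()
    {
        var jar = new CookieContainer();

        // Pretend the server set this cookie in response to the first request.
        jar.Add(new Uri("http://example.com/"), new Cookie("session", "abc"));

        // A later request to the same host gets the cookie back.
        return jar.GetCookieHeader(new Uri("http://example.com/data.xml"));
    }
}
```

If each request used a fresh container instead, the header would be empty and the server would see every request as a new, unauthenticated session.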

Related

c# Script to login and download from Kaggle

Recently, I came across a Python script to download files directly from Kaggle: https://ramhiser.com/2012/11/23/how-to-download-kaggle-data-with-python-and-requests-dot-py/
I am trying to do something similar using WebClient in C#. I came across the following answer on Stack Overflow: C# download file from the web with login
I tried using it, but I seem to be downloading only the login page instead of the actual file. Here's my main code:
CookieContainer cookieJar = new CookieContainer();
CookieAwareWebClient http = new CookieAwareWebClient(cookieJar);
string postData = "name=<username>&password=<password>&submit=submit";
string response = http.UploadString("https://www.kaggle.com/account/login", postData);
Console.Write(response);
http.DownloadFile("https://www.kaggle.com/c/titanic/download/train.csv", "train.CSV");
I've used the WebClient extension from the link above and modified it slightly:
public class CookieAwareWebClient : WebClient
{
public CookieContainer CookieContainer { get; set; }
public Uri Uri { get; set; }
public CookieAwareWebClient()
: this(new CookieContainer())
{
}
public CookieAwareWebClient(CookieContainer cookies)
{
this.CookieContainer = cookies;
}
protected override WebRequest GetWebRequest(Uri address)
{
this.Uri = address;
WebRequest request = base.GetWebRequest(address);
if (request is HttpWebRequest)
{
(request as HttpWebRequest).CookieContainer = this.CookieContainer;
}
HttpWebRequest httpRequest = (HttpWebRequest)request;
httpRequest.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;
return httpRequest;
}
protected override WebResponse GetWebResponse(WebRequest request)
{
WebResponse r = base.GetWebResponse(request);
var response = r as HttpWebResponse;
if (response != null)
{
CookieCollection cookies = response.Cookies;
CookieContainer.Add(cookies);
}
return response;
}
}
Was wondering if anyone can point out where I went wrong?
Thanks.
We have created a forum post to help you accomplish what you wanted to do, Accessing Kaggle API through C#. Feel free to post here or on the forum if you have additional questions.
Try going to https://www.kaggle.com/c/titanic/download/train.csv in your browser without being logged in: your browser will open that page instead of downloading the file. You need to use a direct link to the file instead of a web page.
Your code works perfectly; you just need to use a direct link to that file or make sure you are logged in before downloading it.
I know it's not exactly what you were asking, but Kaggle now has an official API that you can use to download data. Should be a bit easier to use. :)

Cross request does not return files correctly

I want to send a query to my backend server via a proxy script, but it does not return files correctly.
public class HttpWebRequestRunner : IWebRequestRunner
{
public HttpWebResponse Run(string backendUri)
{
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(backendUri);
HttpWebResponse response = (HttpWebResponse) request.GetResponse();
return response;
}
}
My backend server is closed to the internet, so I send the parameters to my ASP.NET MVC application, which sends the request to the backend server.
The backend server returns a file for this request: http://10.0.2.1/Employee/CV/1445
In my MVC controller I use this:
public class PersonController : Controller
{
    public ActionResult GetCv(int id)
    {
        // Forward the request to the backend through the runner defined above.
        var runner = new HttpWebRequestRunner();
        HttpWebResponse webResponse = runner.Run("http://10.0.2.1/Employee/CV/1445");
        context.HttpContext.Response.ContentType = webResponse.ContentType;
        webResponse.GetResponseStream().CopyTo(context.HttpContext.Response.OutputStream);
        // write result...
    }
}
Now, if I send the request to the backend from the browser using this URL, http://10.0.2.1/Employee/CV/1445, it returns the file 1445.pdf.
But if I send the request via the proxy app, like this: http://localhost:22414/Person/GetCv/1445,
it returns a file named "file" without the .pdf extension.
File names are in the header info: webResponse.Headers["Content-Disposition"]. So you have to do it like this:
context.HttpContext.Response.Headers.Set(
"Content-Disposition",
webResponse.Headers.Get("Content-Disposition"));
You need to relay the Content-Disposition HTTP header as well.
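As a rough sketch of the relaying idea in the answers above, the helper below copies a fixed list of headers from one collection to another. The class name HeaderRelay and the choice of which headers to relay are assumptions made for illustration; both WebHeaderCollection and the classic ASP.NET Response.Headers derive from NameValueCollection, so the same method should accept them.

```csharp
using System;
using System.Collections.Specialized;

static class HeaderRelay
{
    // Headers worth passing through a proxy; this list is an assumption,
    // extend it to whatever your backend actually sends.
    static readonly string[] Relayed = { "Content-Disposition" };

    // Copies each relayed header from source to target when it is present.
    public static void Copy(NameValueCollection source, NameValueCollection target)
    {
        foreach (string name in Relayed)
        {
            string value = source.Get(name);
            if (value != null)
            {
                target.Set(name, value);
            }
        }
    }
}
```

In the controller this would be called with the backend response headers as the source and the outgoing response headers as the target, before the body is copied to the output stream.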

WebClient Does not automatically redirect

When logging the login process using Firebug, I see that it goes like this:
POST //The normal post request
GET //Automatically made after the login
GET //Automatically made after the login
GET //Automatically made after the login
When I make the POST request using my code below, it does not make the automatic GET requests that the browser makes.
My WebClient handler
using System;
using System.Net;
namespace Test
{
class HttpHandler : WebClient
{
private CookieContainer _mContainer = new CookieContainer();
protected override WebRequest GetWebRequest(Uri address)
{
var request = base.GetWebRequest(address);
if (request is HttpWebRequest)
{
(request as HttpWebRequest).CookieContainer = _mContainer;
}
return request;
}
protected override WebResponse GetWebResponse(WebRequest request)
{
var response = base.GetWebResponse(request);
if (response is HttpWebResponse)
_mContainer.Add((response as HttpWebResponse).Cookies);
return response;
}
public void ClearCookies()
{
_mContainer = new CookieContainer();
}
}
}
Using Code
private static async Task<byte[]> LoginAsync(string username, string password)
{
var postData = new NameValueCollection();
var uri = new Uri(string.Format("http://{0}/", ServerName));
postData.Add("name", username);
postData.Add("password", password);
return await HttpHandler.UploadValuesTaskAsync(uri, postData);
}
When I trace my application's connection, it only makes the POST request and none of the subsequent GET requests (which the browser makes automatically).
Try adding
request.AllowAutoRedirect = true;
right under the
var request = base.GetWebRequest(address);
It solved some similar problems for me, even though AllowAutoRedirect is supposed to be true by default.
That shouldn't be surprising, given that HttpWebRequest is not a browser. If you need to perform these redirects, check HttpWebResponse.StatusCode and make another request if it is a redirect code in the 300s. Note from RFC 2616, section 10.3 (Redirection 3xx):
This class of status code indicates that further action needs to be taken by the user agent in order to fulfill the request. The action required MAY be carried out by the user agent without interaction with the user if and only if the method used in the second request is GET or HEAD. A client SHOULD detect infinite redirection loops, since such loops generate network traffic for each redirection.
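A minimal sketch of following redirects by hand, along the lines described above. The class and method names are hypothetical, and the hop limit of 10 is an arbitrary choice to guard against the redirection loops the quoted text warns about. It disables AllowAutoRedirect, inspects the status code and Location header, and issues a fresh GET for each hop while sharing one CookieContainer.

```csharp
using System;
using System.Net;

static class RedirectFollower
{
    // True for 3xx codes that point elsewhere; 304 Not Modified is excluded.
    public static bool IsRedirect(HttpStatusCode status)
    {
        int code = (int)status;
        return code >= 300 && code < 400 && code != 304;
    }

    // Location may be relative, so resolve it against the URL just requested.
    public static Uri ResolveLocation(Uri requestUri, string location)
    {
        return new Uri(requestUri, location);
    }

    // Issues GET requests until a non-redirect response arrives or maxHops is exceeded.
    public static HttpWebResponse GetFollowingRedirects(Uri uri, CookieContainer cookies, int maxHops = 10)
    {
        for (int hop = 0; hop <= maxHops; hop++)
        {
            var request = (HttpWebRequest)WebRequest.Create(uri);
            request.CookieContainer = cookies;   // keep the session across hops
            request.AllowAutoRedirect = false;   // handle redirects ourselves

            var response = (HttpWebResponse)request.GetResponse();
            string location = response.Headers["Location"];
            if (!IsRedirect(response.StatusCode) || string.IsNullOrEmpty(location))
            {
                return response;
            }

            response.Close();
            uri = ResolveLocation(uri, location);
        }
        throw new InvalidOperationException("Too many redirects.");
    }
}
```

Note that this only covers redirects signalled through HTTP status codes. Requests that a browser triggers from scripts, images, or stylesheets on the returned page are not redirects and would have to be issued explicitly.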

HttpWebRequest Automatic sending required HttpRequests

I am using HttpWebRequest to simulate a POST request to log in to a website, but when I use Firebug to track what the browser does during the login, I find that it makes some GET requests after the login.
So what I am looking for is how to make my POST request automatically issue those GET requests.
Someone told me to use the JS functions, but I am totally clueless about this.
private static async Task<byte[]> LoginAsync(string username, string password)
{
var postData = new NameValueCollection();
var uri = new Uri(string.Format("http://{0}/", ServerName));
postData.Add("name", username);
postData.Add("password", password);
postData.Add("login", ParseLoginId(await GetPage("login.php")));
return await HttpHandler.UploadValuesTaskAsync(uri, postData);
}
My HTTP handler
private CookieContainer _mContainer = new CookieContainer();
protected override WebRequest GetWebRequest(Uri address)
{
var request = base.GetWebRequest(address);
if (request is HttpWebRequest)
{
(request as HttpWebRequest).CookieContainer = _mContainer;
}
return request;
}
public void ClearCookies()
{
_mContainer = new CookieContainer();
}
I am using the code above to send the POST request, but the problem is that it does not fully simulate what the browser does, so it does not automatically send the required GET requests after the login.
The GET requests mentioned above are likely Location header redirects.
Why not use the System.Net.WebClient class?
C# Console/Server access to web site
The link shows you how to extend the WebClient to persist cookie information.

How to Send a Web Request to Log Out?

I want to log out from a page using WebClient.
This is my code for logging in and downloading the site.
public bool LogIn(string loginName, string password)
{
try
{
NameValueCollection postData = new NameValueCollection();
postData.Add("login", loginName);
postData.Add("password", password);
// Authenticate
_webClient.UploadValues("http://rapideo.pl/login.php", postData);
//string temp = _webClient.DownloadString("http://rapideo.pl/lista");
}
catch
{
return false;
}
_loggedIn = true;
_loginName = loginName;
return true;
}
class WebClientEx : WebClient
{
public CookieContainer CookieContainer { get; private set; }
public WebClientEx()
{
CookieContainer = new CookieContainer();
}
protected override WebRequest GetWebRequest(Uri address)
{
var request = base.GetWebRequest(address);
if (request is HttpWebRequest)
{
(request as HttpWebRequest).CookieContainer = CookieContainer;
}
return request;
}
}
To log out, I only need to open this page in a browser:
http://rapideo.pl/wyloguj
I know how to download the source of a page after logging in.
But how can I send an HTTP request to log out? I don't want to get the response or the source of that page; I just want to send the request.
As a sanity check, have you already tried doing a WebClient.DownloadString("http://rapideo.pl/wyloguj") and then just discarding the returned data?
If that is not working, one thing to try would be to look at the request/response messages in a tool like Fiddler to see what exactly is going over the wire when you log out via the browser versus programmatically.
Also, as a general aside, it looks like the user's name and password are being sent in the clear as part of the login. Not sure if there is an HTTPS login endpoint available for that site but that would be something to look into.
