HttpClient request returns status 410 Gone - C#

I'm practicing my skills with HttpClient on .NET 4.5, but I ran into some trouble trying to get the HTML content of a web page that I can view with no problem in my browser: I get a 410 Gone when I request it with HttpClient.
Here is the first page, which I can actually get with HttpClient:
https://selfservice.mypurdue.purdue.edu/prod/bwckschd.p_disp_detail_sched?term_in=201610&crn_in=20172
But for links inside the page at that URL, like the one behind "View Catalog Entry", I get a 410 Gone when I try to access them with HttpClient.
I used Tuple to set up the series of requests:
var requests = new List<Tuple<Httpmethod, string, FormUrlEncodedContent, string>>()
{
    new Tuple<Httpmethod, string, FormUrlEncodedContent, string>(Httpmethod.GET, "https://selfservice.mypurdue.purdue.edu/prod/bwckschd.p_disp_detail_sched?term_in=201610&crn_in=20172", null, ""),
    new Tuple<Httpmethod, string, FormUrlEncodedContent, string>(Httpmethod.GET, "https://selfservice.mypurdue.purdue.edu/prod/bwckctlg.p_display_courses?term_in=201610&one_subj=EPCS&sel_crse_strt=10100&sel_crse_end=10100&sel_subj=&sel_levl=&sel_schd=&sel_coll=&sel_divs=&sel_dept=&sel_attr=", null, ""),
    new Tuple<Httpmethod, string, FormUrlEncodedContent, string>(Httpmethod.GET, "https://selfservice.mypurdue.purdue.edu/prod/bwckctlg.p_disp_listcrse?term_in=201610&subj_in=EPCS&crse_in=10100&schd_in=%", null, "")
};
Then I use a foreach loop to pass each element of requests to a function (a sketch of that loop is shown after the method body below):
// Handler shares the cookie container across requests and decompresses gzip responses
HttpClientHandler handler = new HttpClientHandler()
{
    CookieContainer = cookies,
    AllowAutoRedirect = false,
    AutomaticDecompression = DecompressionMethods.GZip
};
HttpClient client = new HttpClient(handler as HttpMessageHandler)
{
    BaseAddress = new Uri(url),
    Timeout = TimeSpan.FromMilliseconds(20000)
};
// Browser-like default headers
client.DefaultRequestHeaders.TryAddWithoutValidation("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8");
client.DefaultRequestHeaders.TryAddWithoutValidation("Accept-Encoding", "gzip, deflate, sdch");
client.DefaultRequestHeaders.TryAddWithoutValidation("Accept-Language", "en-US,en;q=0.8,zh-CN;q=0.6,zh;q=0.4");
client.DefaultRequestHeaders.TryAddWithoutValidation("User-Agent", "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.118 Safari/537.36");
client.DefaultRequestHeaders.Connection.Add("keep-alive");
if (referrer.Length > 0)
    client.DefaultRequestHeaders.Referrer = new Uri(referrer);
System.Diagnostics.Debug.WriteLine("Navigating to '" + url + "'...");
HttpResponseMessage result = null;
//List<Cookie> cook = GetAllCookies(cookies);
try
{
    switch (method)
    {
        case Httpmethod.POST:
            result = await client.PostAsync(url, post_content);
            break;
        case Httpmethod.GET:
            result = await client.GetAsync(url);
            break;
    }
}
catch (HttpRequestException ex)
{
    throw new ApplicationException(ex.Message);
}
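For reference, the dispatch loop mentioned above looks roughly like this (a sketch only; FetchPage is a hypothetical name for the method whose body is shown above, taking the HTTP method, URL, POST content and referrer from each tuple):
// Hypothetical sketch of the dispatch loop; FetchPage stands in for the method above.
foreach (var request in requests)
{
    await FetchPage(request.Item1, request.Item2, request.Item3, request.Item4);
}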
I have no clue what is wrong with my method, since I have successfully handled requests that need a login with this same code.
I intended to build some sort of web crawler using this.
Please help, thanks in advance.
I forgot to mention: the links inside the above URL can be accessed by clicking on them in a web browser, but simply copying them into the address bar gives a 410 Gone as well.
EDIT: I saw some JavaScript in the page source that creates an XMLHttpRequest to get the content, so it might be that it only refreshes part of the page instead of loading a new one, even though the URL changes. How do I get HttpClient to do what the click does?
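For what it's worth, here is what I plan to try next (just a guess at mimicking what the browser/JavaScript sends; I don't know yet whether the server actually checks these headers):
// Guess: replicate the catalog request the browser makes, with a Referer and
// (if the JavaScript really uses XMLHttpRequest) an X-Requested-With header.
var catalogRequest = new HttpRequestMessage(HttpMethod.Get,
    "https://selfservice.mypurdue.purdue.edu/prod/bwckctlg.p_display_courses?term_in=201610&one_subj=EPCS&sel_crse_strt=10100&sel_crse_end=10100&sel_subj=&sel_levl=&sel_schd=&sel_coll=&sel_divs=&sel_dept=&sel_attr=");
catalogRequest.Headers.Referrer = new Uri("https://selfservice.mypurdue.purdue.edu/prod/bwckschd.p_disp_detail_sched?term_in=201610&crn_in=20172");
catalogRequest.Headers.TryAddWithoutValidation("X-Requested-With", "XMLHttpRequest");
HttpResponseMessage catalogResponse = await client.SendAsync(catalogRequest);
string catalogHtml = await catalogResponse.Content.ReadAsStringAsync();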

Related

C# websocket never retrieves data

I'm trying to imitate a browser from my console application, reading from websockets:
var tParam = "MoC4gE6";
var sessionId = GetSid(tParam); // simple GET, works the same way as in a browser
HttpGetSomeBytes(sessionId, tParam); // simple GET, return the same bytes as it is in a browser
var uri2 = new Uri($"ws://{_host}/driver.socket/?EIO=3&transport=websocket&sid={sessionId}");
var ws2 = new ClientWebSocket();
Random rand = new Random();
Byte[] randomBytes = new Byte[16];
rand.NextBytes(randomBytes);
ws2.Options.SetRequestHeader("Host", _host);
ws2.Options.SetRequestHeader("Accept-Encoding", "gzip, deflate");
ws2.Options.SetRequestHeader("User-Agent",
"Mozilla/5.0(Windows NT 10.0; Win64; x64) AppleWebKit/537.36(KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36");
await ws2.ConnectAsync(uri2, CancellationToken.None);
ArraySegment<Byte> buffer = new ArraySegment<byte>(new Byte[8192]);
var receiveTask = await ws2.ReceiveAsync(buffer, CancellationToken.None);
The problem is that the call to ReceiveAsync never returns.
The server is not mine so I can't check what's going on there.
Update:
I'm able to see 2 responses:
If I send '2' I receive '3', as in a browser.
If I send '2probe' I receive '3probe', as in a browser.
But when I send '5', expecting to get back some useful data (as I do in a browser), ReceiveAsync never returns.
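For reference, here is a minimal receive loop I'm considering (a sketch, assuming the EIO=3 handshake works as described above); the main difference from the code above is that it keeps reading and accumulates partial frames until EndOfMessage instead of relying on a single ReceiveAsync call:
// Sketch: keep receiving, assembling partial frames into complete messages.
// Assumes 'using System.Text;' for StringBuilder/Encoding.
var messageBuffer = new StringBuilder();
var segment = new ArraySegment<byte>(new byte[8192]);
while (ws2.State == WebSocketState.Open)
{
    var frame = await ws2.ReceiveAsync(segment, CancellationToken.None);
    messageBuffer.Append(Encoding.UTF8.GetString(segment.Array, 0, frame.Count));
    if (!frame.EndOfMessage)
        continue; // wait for the rest of this message
    string message = messageBuffer.ToString();
    messageBuffer.Clear();
    Console.WriteLine("<< " + message);
    if (message == "3probe")
    {
        // Engine.IO upgrade confirmation (assumption based on EIO=3 in the URL).
        await ws2.SendAsync(new ArraySegment<byte>(Encoding.UTF8.GetBytes("5")),
            WebSocketMessageType.Text, true, CancellationToken.None);
    }
}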

C# Empty WebClient Downloadstring

I'm trying to download the HTML string of a website. The website has the following URL:
https://www.gastrobern.ch/de/service/aus-weiterbildung/wirtekurs/234/?oid=1937&lang=de
First I tried to do a simple WebClient Request:
var wc = new WebClient();
string websitenstring = "";
websitenstring = wc.DownloadString("http://www.gastrosg.ch/default.asp?id=3020000&siteid=1&langid=de");
But the website string was empty. Then I read in some posts that I have to send some additional header information:
var wc = new WebClient();
string websitenstring = "";
wc.Headers[HttpRequestHeader.Accept] = "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8";
wc.Headers[HttpRequestHeader.AcceptEncoding] = "gzip, deflate, br";
wc.Headers[HttpRequestHeader.AcceptLanguage] = "de-DE,de;q=0.9,en-US;q=0.8,en;q=0.7";
wc.Headers[HttpRequestHeader.CacheControl] = "max-age=0";
wc.Headers[HttpRequestHeader.Host] = "www.gastrobern.ch";
wc.Headers[HttpRequestHeader.Upgrade] = "www.gastrobern.ch";
wc.Headers[HttpRequestHeader.UserAgent] = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36";
websitenstring = wc.DownloadString("https://www.gastrobern.ch/de/service/aus-weiterbildung/wirtekurs/234/?oid=1937&lang=de");
I tried this, but got no answer. Then I also tried to set some cookies:
wc.Headers.Add(HttpRequestHeader.Cookie,
"CFID=10609582;" +
"CFTOKEN=32721418;" +
"_ga=GA1.2.37" +
"_ga=GA1.2.379124242.1539000256;" +
"_gid=GA1.2.358798732.1539000256;" +
"_dc_gtm_UA-1237799-1=1;");
But this also didn't work. I also found out that the browser is somehow making multiple requests, while my C# application is just making one and showing the first response's headers.
But I don't know how I can make a follow-up request. I'm thankful for every answer.
Try HttpClient instead. Here is an example of how to use it:
public async static Task<string> GetString(string url)
{
    HttpClient client = new HttpClient();
    // ConfigureAwait(false) is a workaround to avoid a deadlock when the caller blocks on .Result
    HttpResponseMessage message = await client.GetAsync(url).ConfigureAwait(false);
    return await message.Content.ReadAsStringAsync().ConfigureAwait(false);
}
To call this method:
string dataFromServer = GetString("https://www.gastrobern.ch/de/service/aus-weiterbildung/wirtekurs/234/?oid=1937&lang=de").Result;
I checked, and here dataFromServer contains the HTML content of that page.
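If your calling code is itself async, you can also await the helper instead of blocking on .Result:
string dataFromServer = await GetString("https://www.gastrobern.ch/de/service/aus-weiterbildung/wirtekurs/234/?oid=1937&lang=de");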

C# : REST url returns 404 while the same URL works in web browser

I have this REST API (GET) which works fine in a browser and returns JSON. I even tried the same API in Excel VBA using XMLHTTP and it works fine.
But when trying to use the same API in C#, I am getting errors.
First I was getting:
"The underlying connection was closed: An unexpected error occurred on
a send."
on the line
HttpResponseMessage response = client.GetAsync(urlParameters).Result;
Then I set the security protocol to Tls11 and that error vanished.
Now I am getting a 404 in the response. The URL is correct; I am able to open the same URL in a web browser/VBA, but not in C#.
Any suggestions or help?
Sorry, I can't share the actual URL as it's work related.
private const string URL = "https://xxx-production-api.abc.com/api/listings/1790956";
private string urlParameters = "?apiToken=64842d73-9761-456b-86fa-a75a409273ce";

public string Download()
{
    HttpClient client = new HttpClient();
    client.BaseAddress = new Uri(URL);
    client.DefaultRequestHeaders.Accept.Add(
        new MediaTypeWithQualityHeaderValue("application/json"));
    client.DefaultRequestHeaders.Add("Accept-Language", "en-GB,en-US;q=0.8,en;q=0.6,ru;q=0.4");
    // List data response.
    ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls11;
    HttpResponseMessage response = client.GetAsync(urlParameters).Result; // Blocking call!
    if (response.IsSuccessStatusCode)
        Console.WriteLine("Worked");
    else
        Console.WriteLine("{0} ({1})", (int)response.StatusCode, response.ReasonPhrase);
}
Answering my own question: adding a User-Agent header solved the issue for me:
client.DefaultRequestHeaders.Add("User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.181 Safari/537.36");
I copied and pasted the header info from the Chrome dev tools, and now the URL works in C# too.
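Applied to the Download() method above, that is just one extra line before the GetAsync call (same placeholder URL and token as in the question):
HttpClient client = new HttpClient();
client.BaseAddress = new Uri(URL);
client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));
// The API in the question returned 404 until a browser-like User-Agent was added.
client.DefaultRequestHeaders.Add("User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.181 Safari/537.36");
ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls11;
HttpResponseMessage response = client.GetAsync(urlParameters).Result; // blocking call, as in the question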
Make sure your work firewall/proxy is not blocking the call.
I have run into similar issues, and this is the final code I use for my GET calls; see if it works for you. The main difference is that I don't use .Result, I use GetAwaiter().GetResult().
using (var client = new HttpClient())
{
    var method = string.Format("{0}{1}", ApiUrl, urlPart);
    client.BaseAddress = new Uri(ApiUrl);
    client.DefaultRequestHeaders.Accept.Clear();
    client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/xml"));
    HttpResponseMessage response = client.GetAsync(method).GetAwaiter().GetResult();
    if (response.IsSuccessStatusCode)
    {
        var data = response.Content.ReadAsStringAsync().GetAwaiter().GetResult();
        return data;
    }
    return default(T); // this snippet comes from a generic method, hence default(T)
}

Posting to a form results in an AggregateException

So I'm attempting to post a form using an HttpClient (and CloudFlareUtilities), and I initialise it as shown below:
var handler = new ClearanceHandler {
    MaxRetries = 4,
};
var client = new HttpClient(handler);
client.DefaultRequestHeaders.Connection.Clear();
client.DefaultRequestHeaders.ConnectionClose = false;
client.DefaultRequestHeaders.Add("User-Agent",
"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 " +
"(KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36");
client.Timeout = TimeSpan.FromSeconds(30);
Creating the client succeeds, with no apparent issues. I then create a POST request like the one below:
var content = new FormUrlEncodedContent(new[] {
    new KeyValuePair<string, string>("login", "username"),
    new KeyValuePair<string, string>("password", "pw"),
    new KeyValuePair<string, string>("cookie_check", "1"),
    new KeyValuePair<string, string>("redirect", "myRedirect"),
    new KeyValuePair<string, string>("register", "0"),
    new KeyValuePair<string, string>("remember", "1")
});
and post the data:
var post = client.PostAsync("https://www.mc-market.org/login/login", content).Result.Content.ReadAsStreamAsync().Result;
The aforementioned line throws the error(s):
Inner Exception 1:
HttpRequestException: Error while copying content to a stream.
Inner Exception 2:
IOException: The read operation failed, see inner exception.
Inner Exception 3:
WinHttpException: The operation has been canceled
What I can interpret from this is that something went wrong, and something cancelled the task, yet I never cancel it myself.
Reducing the erroring line to
var post = client.PostAsync("https://www.mc-market.org/login/login", content);
gives the same error.
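One debugging step that may help (a sketch, not a guaranteed fix): awaiting the call instead of blocking on .Result re-throws the original HttpRequestException rather than wrapping everything in an AggregateException, and a simple loop over InnerException then shows the full chain:
try
{
    // Awaiting avoids the AggregateException wrapper produced by .Result.
    HttpResponseMessage response = await client.PostAsync("https://www.mc-market.org/login/login", content);
    string body = await response.Content.ReadAsStringAsync();
}
catch (Exception ex)
{
    // Walk the chain to see the root cause (e.g. the WinHttpException above).
    for (var e = ex; e != null; e = e.InnerException)
        Console.WriteLine(e.GetType().Name + ": " + e.Message);
}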
Is it possible I've created the Content incorrectly? Some indication of where I may have gone wrong is appreciated.
I decided not to go with the HttpClient, for simplicity.
Instead I opted to go with PhantomJS & Selenium, which natively supports JavaScript in-browser anyway (and which I decided to enable for the Cloudflare bypass).
What I then did was use its SendKeys() method to type into the login fields on https://www.mc-market.org/login/ (instead of https://www.mc-market.org/login/login) and press the button with Click(). An example is shown below:
driver.FindElementById("ctrl_pageLogin_login").SendKeys("NAME");
driver.FindElementById("ctrl_pageLogin_password").SendKeys("PASS");
driver.FindElementById("ctrl_pageLogin_remember").Click();
driver.FindElementByCssSelector("input.button.primary").Click();
The result was much simpler and it worked like a charm.

Can not set Headers of Content-Type windows phone app

I'm using an HttpClient to create an HTTP request; the client comes from the Windows.Web.Http assembly.
Everything is fine when posting the request without the Content-Type header, but the server does not return what I need because it requires that header. After finding the correct headers that need to be sent, I'm facing another problem: I'm not able to set the Content-Type header.
Here is my code (the try block is where the error occurs):
using (var wp = new Windows.Web.Http.HttpClient())
{
    HttpRequestMessage mSent = new HttpRequestMessage(HttpMethod.Post, new Uri(url));
    //mSent.Headers.Add("Host", "academicos.ubi.pt");
    //mSent.Headers.Add("Connection", "keep-alive");
    //mSent.Headers.Add("Content-Length", "18532");
    //mSent.Headers.Add("Origin", "https://academicos.ubi.pt");
    //mSent.Headers.Add("X-Requested-With", "XMLHttpRequest");
    //mSent.Headers.Add("Cache-Control", "no-cache");
    //mSent.Headers.Add("X-MicrosoftAjax", "Delta=True");
    mSent.Headers.Add("User-Agent", " Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.120 Safari/537.36");
    try
    {
        mSent.Content.Headers.ContentType = new Windows.Web.Http.Headers.HttpMediaTypeHeaderValue("application/x-www-form-urlencoded");
    }
    catch (Exception ex) { ex.ToString(); }
    //mSent.Headers.Add("Accept", "*/*");
    //mSent.Headers.Add("Referer", "https://academicos.ubi.pt/online/horarios.aspx");
    //mSent.Headers.Add("Accept-Encoding", "gzip,deflate");
    //mSent.Headers.Add("Accept-Language", "pt-PT,pt;q=0.8,en-US;q=0.6,en;q=0.4");
    mSent.Headers.Add("Cookie", "the cookie string is big, so I will not post it here");
    mSent.Content = new HttpStringContent("the content is well defined, but I will not post it here, it's huge", Windows.Storage.Streams.UnicodeEncoding.Utf8);
    HttpResponseMessage mReceived = await wp.SendRequestAsync(mSent, HttpCompletionOption.ResponseContentRead);
    if (mReceived.IsSuccessStatusCode)
    {
        htmlPage = await mReceived.Content.ReadAsStringAsync();
    }
}
The error that I'm receiving is "Object reference not set to an instance of an object."
When I tried setting this header the way I set the User-Agent, it gave me another exception saying that to set the content type I need to set it under the content headers.
Any ideas? I tried searching for answers to this problem, but so far I've come up empty-handed.
Pedro, you need to set the Content property to something, e.g.:
HttpRequestMessage mSent = new HttpRequestMessage(
    HttpMethod.Post,
    new Uri(url));
mSent.Content = new HttpStringContent(
    "Name=Jonathan+Doe&Age=23",
    UnicodeEncoding.Utf8,
    "application/x-www-form-urlencoded");
There are other kinds of IHttpContent, such as HttpBufferContent, HttpFormUrlEncodedContent, HttpMultipartContent, HttpMultipartFormDataContent and HttpStreamContent.
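In other words, the "Object reference not set" error in the question comes from touching mSent.Content.Headers while Content is still null. If you do want to adjust the content type after constructing the content, assign the content first (a small sketch):
// Assign the content first; until then mSent.Content is null, which is what
// produced the "Object reference not set" error in the question.
mSent.Content = new HttpStringContent("Name=Jonathan+Doe&Age=23", UnicodeEncoding.Utf8);
// Now Content.Headers exists, so the media type can be adjusted if needed.
mSent.Content.Headers.ContentType =
    new Windows.Web.Http.Headers.HttpMediaTypeHeaderValue("application/x-www-form-urlencoded");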
You do not set the headers on the .Content member. You should instead use the multipart content object and set the content similar to this:
HttpMultipartFormDataContent form = new HttpMultipartFormDataContent();
form.Add(new HttpStringContent(RequestBodyField.Text), "data");
HttpResponseMessage response = await httpClient.PostAsync(resourceAddress, form).AsTask(cts.Token);
Ref: http://code.msdn.microsoft.com/windowsapps/HttpClient-sample-55700664/sourcecode?fileId=98924&pathId=1116044733
