HttpWebRequest fails to load RSS feed - C#

I am attempting to load a page from a URL I received in an RSS feed, and I get the following WebException:
Cannot handle redirect from HTTP/HTTPS protocols to other dissimilar ones.
with an inner exception:
Invalid URI: The hostname could not be parsed.
Here's the code I'm using:
System.Net.HttpWebRequest req = (System.Net.HttpWebRequest)System.Net.HttpWebRequest.Create(url);
string source = String.Empty;
Uri responseURI;
try
{
    req.UserAgent = @"Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:31.0) Gecko/20100101 Firefox/31.0";
    req.Headers.Add("Accept-Language", "en-us,en;q=0.5");
    req.AllowAutoRedirect = true;
    using (System.Net.WebResponse webResponse = req.GetResponse())
    {
        using (HttpWebResponse httpWebResponse = webResponse as HttpWebResponse)
        {
            responseURI = httpWebResponse.ResponseUri;
            StreamReader reader;
            if (httpWebResponse.ContentEncoding.ToLower().Contains("gzip"))
            {
                reader = new StreamReader(new GZipStream(httpWebResponse.GetResponseStream(), CompressionMode.Decompress));
            }
            else if (httpWebResponse.ContentEncoding.ToLower().Contains("deflate"))
            {
                reader = new StreamReader(new DeflateStream(httpWebResponse.GetResponseStream(), CompressionMode.Decompress));
            }
            else
            {
                reader = new StreamReader(httpWebResponse.GetResponseStream());
            }
            source = reader.ReadToEnd();
            reader.Close();
        }
    }
}
catch (WebException we)
{
    Console.WriteLine(url + "\n--\n" + we.Message);
    return null;
}
I'm not sure if I'm doing something wrong or if there's something extra I need to be doing. Any help would be greatly appreciated! Let me know if there's more information you need.
UPDATE
So after following Jim Mischel's suggestions I've narrowed it down to a UriFormatException that claims Invalid URI: The hostname could not be parsed.
Here's the URL that's in the last "Location" Header: http:////www-nc.nytimes.com/
I can see why that fails, but I'm not sure why it causes trouble here when the original URL loads just fine in my browser. Is there something I'm missing/not doing that I should be doing in order to handle this strange URL?
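One way around this (not something from the original post) is to disable automatic redirects and repair the Location value before following it. Below is a minimal sketch, assuming the only defect is the run of extra slashes after the scheme; the FeedFetcher class name and the 10-hop limit are illustrative choices:
using System;
using System.IO;
using System.Net;
using System.Text.RegularExpressions;

static class FeedFetcher
{
    // Follows redirects by hand so a malformed Location header can be
    // repaired before .NET tries to parse it.
    public static string FetchWithManualRedirects(string url)
    {
        for (int hop = 0; hop < 10; hop++)              // guard against redirect loops
        {
            HttpWebRequest req = (HttpWebRequest)WebRequest.Create(url);
            req.AllowAutoRedirect = false;              // inspect each redirect ourselves
            req.UserAgent = "Mozilla/5.0 (Windows NT 5.1; rv:31.0) Gecko/20100101 Firefox/31.0";

            using (HttpWebResponse resp = (HttpWebResponse)req.GetResponse())
            {
                int code = (int)resp.StatusCode;
                if (code >= 300 && code < 400)
                {
                    string location = resp.Headers["Location"];
                    if (string.IsNullOrEmpty(location)) return null;

                    // Collapse "http:////host/" into "http://host/".
                    location = Regex.Replace(location, @"^(https?:)/+", "$1//");

                    // Resolve relative Location values against the current URL.
                    url = new Uri(new Uri(url), location).AbsoluteUri;
                    continue;
                }

                using (StreamReader reader = new StreamReader(resp.GetResponseStream()))
                    return reader.ReadToEnd();
            }
        }
        return null;                                    // too many redirects
    }
}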

Related

RESTful API working in Postman, C# not working [Magento integration]

I am integrating with Magento 2 using its RESTful APIs. When I use Postman it works like a charm, while the C# code returns a "401 Unauthorized" exception.
It was working in the C# code earlier, but it suddenly stopped.
I have tried everything: WebRequest, HttpClient, and RestSharp all return the same exception.
I also used Fiddler 4 to capture and compare the requests, extracted C# code with a Fiddler-to-C# plugin, and tried the RestSharp code that Postman generates; the same exception was returned.
The remote server returned an error: (401) Unauthorized.
//Calls request functions sequentially.
private string MakeRequests()
{
    HttpWebResponse response;
    if (Request_hatolna_co(out response))
    {
        //Success, possibly uses response.
        string responseText = ReadResponse(response);
        response.Close();
        return responseText;
    }
    else
    {
        //Failure, cannot use response.
        return "";
    }
}

private static string ReadResponse(HttpWebResponse response)
{
    using (Stream responseStream = response.GetResponseStream())
    {
        Stream streamToRead = responseStream;
        if (response.ContentEncoding.ToLower().Contains("gzip"))
        {
            streamToRead = new GZipStream(streamToRead, CompressionMode.Decompress);
        }
        else if (response.ContentEncoding.ToLower().Contains("deflate"))
        {
            streamToRead = new DeflateStream(streamToRead, CompressionMode.Decompress);
        }
        using (StreamReader streamReader = new StreamReader(streamToRead, Encoding.UTF8))
        {
            return streamReader.ReadToEnd();
        }
    }
}

private bool Request_hatolna_co(out HttpWebResponse response)
{
    response = null;
    try
    {
        //Create a request to URL.
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://MAGENTO.co/index.php/rest//V1/orders/items?searchCriteria[filter_groups][0][filters][0][field]=item_id&searchCriteria[filter_groups][0][filters][0][value]=1");
        //Set request headers.
        request.KeepAlive = true;
        request.Headers.Set(HttpRequestHeader.Authorization, "Bearer xxxxxxxxxxxxxxxxxxxxxx");
        request.Headers.Add("Postman-Token", @"1181fa03-4dda-ae84-fd31-9d6fbd035614");
        request.Headers.Set(HttpRequestHeader.CacheControl, "no-cache");
        request.UserAgent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36";
        request.ContentType = "application/json";
        request.Accept = "*/*";
        request.Headers.Set(HttpRequestHeader.AcceptEncoding, "gzip, deflate");
        request.Headers.Set(HttpRequestHeader.AcceptLanguage, "en-US,en;q=0.9,ar;q=0.8,la;q=0.7");
        request.Headers.Set(HttpRequestHeader.Cookie, @"store=default; private_content_version=f16533d4f181d42a1b3f386fa6d2cdf1");
        //Get response to request.
        response = (HttpWebResponse)request.GetResponse();
    }
    catch (WebException e)
    {
        //ProtocolError indicates a valid HTTP response, but with a non-200 status code (e.g. 304 Not Modified, 404 Not Found)
        if (e.Status == WebExceptionStatus.ProtocolError) response = (HttpWebResponse)e.Response;
        else return false;
    }
    catch (Exception)
    {
        if (response != null) response.Close();
        return false;
    }
    return true;
}
Why is the Postman-Token header being set in the C# code? Remove it and then try.
The problem was in the URL: the Magento (server) admin had changed it to HTTPS instead of HTTP.
That explains the difference between Postman, Insomnia, or any other API client and the C# code: the API clients transparently handle the HTTP-to-HTTPS redirect, while the C# code does not.
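A plausible mechanism for the 401, as an assumption rather than something stated above: HttpWebRequest does follow the redirect to HTTPS, but it drops the Authorization header when it does, so the HTTPS endpoint sees an unauthenticated request. A minimal sketch of following the redirect by hand and re-applying the token (the URL is shortened from the question, and the token is the question's placeholder):
HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://MAGENTO.co/index.php/rest/V1/orders/items");
request.AllowAutoRedirect = false;  // inspect the redirect instead of letting .NET strip the header
request.Headers.Set(HttpRequestHeader.Authorization, "Bearer xxxxxxxxxxxxxxxxxxxxxx");

using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
{
    int code = (int)response.StatusCode;
    if (code >= 300 && code < 400)
    {
        // Re-issue the request against the HTTPS Location, token included.
        HttpWebRequest redirected = (HttpWebRequest)WebRequest.Create(response.Headers["Location"]);
        redirected.Headers.Set(HttpRequestHeader.Authorization, "Bearer xxxxxxxxxxxxxxxxxxxxxx");
        // ...read redirected.GetResponse() with ReadResponse() as before...
    }
}
Of course, the simplest fix is the one described above: point the configured URL at https:// directly.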

Spring WebSecurityConfigurerAdapter permitAll() does not allow REST POST requests from C# client?

I have this setup in my WebSecurityConfigurerAdapter to allow my client application to send POST request to the "/commands/" path on server:
@Override
protected void configure(HttpSecurity http) throws Exception {
    http.authorizeRequests()
            .antMatchers("/").permitAll()
            .antMatchers("/commands/**").permitAll()
            .antMatchers("/files/**").authenticated()
            .and()
            .formLogin();
}
GET requests are fine; however, a CSRF token seems to be required for POST requests with this setup. I get the following result if I don't log in:
{
    "timestamp": 1497904660159,
    "status": 403,
    "error": "Forbidden",
    "message": "Could not verify the provided CSRF token because your session was not found.",
    "path": "/commands/add"
}
If I log in and attach the cookies from the login request in my C# client code, I get the following error:
{
    "timestamp": 1497897646380,
    "status": 403,
    "error": "Forbidden",
    "message": "Could not verify the provided CSRF token because your session was not found.",
    "path": "/commands/add"
}
My C# code client for post looks like this:
public String SendJsonCommandByPost(String url, string data)
{
    try
    {
        WebRequest req = HttpWebRequest.Create(url);
        req.Proxy = null;
        req.Method = "POST";
        req.Timeout = TIMEOUT;
        ((HttpWebRequest)req).CookieContainer = myCookieContainer;
        PrintCookies(myCookieContainer);
        req.Headers.Add("X-CSRF-TOKEN", _csrftoken);
        req.ContentType = "application/json";
        ((HttpWebRequest)req).UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.2 (KHTML, like Gecko) Chrome/15.0.874.121 Safari/535.2";
        byte[] postdata = Encoding.UTF8.GetBytes(data);
        req.ContentLength = postdata.Length;
        Stream stream = req.GetRequestStream();
        stream.Write(postdata, 0, postdata.Length);
        stream.Flush();
        stream.Close();
        string source;
        Console.WriteLine(req.Headers);
        using (HttpWebResponse response = (HttpWebResponse)req.GetResponse())
        {
            // Read from the response we already have; calling GetResponse() again would issue a new request.
            using (StreamReader reader = new StreamReader(response.GetResponseStream()))
            {
                source = reader.ReadToEnd();
            }
            return source;
        }
    }
    catch (Exception exp)
    {
        Console.WriteLine(exp);
        if (exp is WebException)
        {
            var webexp = (WebException)exp;
            Console.WriteLine(webexp.Response.Headers);
            TextReader reader = new StreamReader(webexp.Response.GetResponseStream());
            Console.WriteLine(reader.ReadToEnd());
        }
        return null;
    }
}
May I know what could cause this kind of issue? Thank you!
Add this line:
http.csrf().disable();
By default CSRF protection is enabled, so your POST requests are getting blocked. Try this; it works for me.
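Alternatively, if CSRF must stay enabled, the error text ("your session was not found") suggests the session cookie from the login never reaches the server on the POST, so the token cannot be matched to a session. A minimal sketch of keeping session and token together; this is an assumption, not from the question, and loginUrl, commandUrl, and csrfTokenForThisSession are hypothetical placeholders:
// One container shared by every request, so the JSESSIONID issued at login
// is automatically sent back with the later POST.
CookieContainer cookies = new CookieContainer();

// 1) Log in (or at least GET a page) with this container so the server
//    creates a session and the container captures its cookie.
HttpWebRequest login = (HttpWebRequest)WebRequest.Create(loginUrl);
login.CookieContainer = cookies;
login.GetResponse().Close();

// 2) POST with the same container; the X-CSRF-TOKEN must be the token
//    issued for this session, not one saved from an earlier run.
HttpWebRequest post = (HttpWebRequest)WebRequest.Create(commandUrl);
post.Method = "POST";
post.ContentType = "application/json";
post.CookieContainer = cookies;
post.Headers.Add("X-CSRF-TOKEN", csrfTokenForThisSession);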

Maintaining session with web server

I have a WCF service that is consuming a website. Before anyone points out the obvious flaw and inherent instability in this approach, please forgive me; I have been forced to work on this.
The flow of the transaction is as follows:
Channel -> WCF Service ->(HTTP POST/GET) -> Website
Steps:
PostStrOnUrlRandom is called in a loop, passing it the string to be posted (strPost), the URL (url), and a CookieJar (cookieJarR) containing cookies from the previous POST.
HTTP GET on the URL (not in the code below).
I get the HTML page in response, parse it using XPath, confirm it is the correct page, and save the cookies.
POST the strPost passed to this method to the URL.
Get the response HTML page and parse it using XPath.
Fetch the next url and strPost from the DB.
POST the next strPost to the next url.
I have to do this for five URLs. Then I have to send a response back to the Channel. The Channel then sends me another request, and I have to do another POST. I do this by using the same cookies I used for the previous requests.
This is where the problem arises: I get back a 500 Internal Server Error. If I retry this POST, using the same strPost, URL, and cookies, it works after 3-4 retries. I cannot understand why this is.
The code is as follows:
public HttpWebPostResponse PostStrOnUrlRandom(string url, string strPost, string Refer, CookieCollection cookieJarR)
{
    log.Debug("Method Entry [PostStrOnUrl]");
    StreamWriter myWriter = null;
    string resultHtml = null;
    HttpWebResponse objResponse = null;
    HttpWebPostResponse hpr = null;
    HttpWebRequest objRequest = (HttpWebRequest)WebRequest.Create(url);
    objRequest.ProtocolVersion = HttpVersion.Version10;
    objRequest.Timeout = 90000;
    objRequest.CookieContainer = cookieJar;
    objRequest.Method = "POST";
    objRequest.ContentType = "application/x-www-form-urlencoded";
    objRequest.AllowAutoRedirect = true;
    objRequest.MaximumAutomaticRedirections = 100;
    objRequest.UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:46.0) Gecko/20100101 Firefox/46.0";
    objRequest.Referer = Refer;
    objRequest.Host = ConfigurationManager.AppSettings["Host"];
    objRequest.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8";
    objRequest.Headers.Add("Origin", ConfigurationManager.AppSettings["Origin"]);
    objRequest.Headers.Add("Cache-Control", "max-age=0");
    objRequest.Headers.Add("Accept-Encoding", "gzip, deflate, br");
    objRequest.Headers.Add("Accept-Language", "en-US,en;q=0.8");
    objRequest.KeepAlive = true;
    log.Debug("Making HttpWebRequest to " + url);
    Uri target = new Uri(url);
    log.Debug("Target Url : " + url);
    foreach (Cookie cookie in cookieJarR)
    {
        objRequest.CookieContainer.Add(cookie);
        cookieJar.Add(cookie);
        log.Debug(cookie.Name + " " + cookie.Value + " " + cookie.Domain);
    }
    try
    {
        myWriter = new StreamWriter(objRequest.GetRequestStream());
        myWriter.Write(strPost);
        log.Debug("Posting on Url StrPost = " + strPost);
    }
    catch (Exception e)
    {
        log.Debug(e.Message);
    }
    finally
    {
        // Guard against GetRequestStream() having thrown before myWriter was assigned.
        if (myWriter != null) myWriter.Close();
    }
    log.Debug("Making Request.");
    try
    {
        objResponse = (HttpWebResponse)objRequest.GetResponse();
        using (StreamReader sr = new StreamReader(objResponse.GetResponseStream()))
        {
            resultHtml = sr.ReadToEnd();
            sr.Close();
        }
        hpr = new HttpWebPostResponse(objResponse.ResponseUri.ToString(), resultHtml, objResponse.StatusCode.ToString());
        return hpr;
    }
    catch (Exception e)
    {
        log.Error(e.Message);
        log.Error("Exception", e.InnerException);
        return null;
    }
}
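Since the same POST succeeds after a few retries, a thin retry wrapper around PostStrOnUrlRandom can absorb the transient 500s while the root cause is investigated. A minimal sketch; the attempt count and back-off delay are arbitrary choices, not values from the question:
// Hypothetical helper: retries the POST a few times before giving up.
public HttpWebPostResponse PostWithRetries(string url, string strPost, string refer,
                                           CookieCollection cookieJarR, int maxAttempts = 5)
{
    for (int attempt = 1; attempt <= maxAttempts; attempt++)
    {
        HttpWebPostResponse result = PostStrOnUrlRandom(url, strPost, refer, cookieJarR);
        if (result != null)
            return result;                              // PostStrOnUrlRandom returns null on failure

        log.Debug("POST attempt " + attempt + " failed, retrying...");
        System.Threading.Thread.Sleep(1000 * attempt);  // simple back-off between attempts
    }
    return null;                                        // all attempts failed
}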

Using HtmlWeb causes HttpWebRequest to time out

So I've got a situation where I'm using HtmlAgilityPack to load web pages in order to scrape the document contents. I have a number of URLs that I need to load, and a few of them require gzip encoding, so I catch the exception thrown by HtmlWeb.Load(), check that it's a gzip encoding issue, and then perform the page load with HttpWebRequest. However, while the first pass with HttpWebRequest succeeds, any subsequent attempt with HttpWebRequest times out.
Here's a cleaned up version of the code:
HtmlDocument doc = new HtmlDocument();
HtmlWeb web = new HtmlWeb();
try
{
    doc = web.Load(uri);
    Console.WriteLine("htmlweb and htmldocument success");
}
catch (ArgumentException ae)
{
    Console.WriteLine("htmlweb and htmldocument not successful");
    if (ae.Message.Contains("\'gzip\'"))
    {
        HttpWebRequest req = (HttpWebRequest)HttpWebRequest.Create(uri);
        try
        {
            req.Headers[HttpRequestHeader.AcceptEncoding] = "gzip, deflate";
            req.AutomaticDecompression = DecompressionMethods.Deflate | DecompressionMethods.GZip;
            req.Method = "GET";
            //req.UserAgent = "Mozilla/5.0 (Windows; U; MSIE 9.0; WIndows NT 9.0; en-US))";
            string source;
            req.KeepAlive = false;
            //req.Timeout = 100000;
            // On the second iteration we never get beyond this line
            using (WebResponse webResponse = req.GetResponse())
            {
                using (HttpWebResponse httpWebResponse = webResponse as HttpWebResponse)
                {
                    using (StreamReader reader = new StreamReader(httpWebResponse.GetResponseStream()))
                    {
                        source = reader.ReadToEnd();
                    }
                }
            }
            req.Abort();
            Console.WriteLine("httpwebresponse successful");
        }
        catch (WebException we)
        {
            Console.WriteLine("httpwebresponse not successful");
        }
    }
}
Is there some cleanup I need to do, or is there something I'm forgetting?
Any help will be greatly appreciated.
I think I will have to load via WebRequest first instead of HtmlWeb, then inspect the response header for gzip and decompress as needed each time.
System.Net.HttpWebRequest req = (System.Net.HttpWebRequest)System.Net.HttpWebRequest.Create(uri);
//req.Headers[HttpRequestHeader.AcceptEncoding] = "gzip, deflate";
//req.AutomaticDecompression = System.Net.DecompressionMethods.Deflate | System.Net.DecompressionMethods.GZip;
//req.Method = "GET";
string source = String.Empty;
try
{
    using (System.Net.WebResponse webResponse = req.GetResponse())
    {
        using (HttpWebResponse httpWebResponse = webResponse as HttpWebResponse)
        {
            StreamReader reader;
            if (httpWebResponse.ContentEncoding.ToLower().Contains("gzip"))
            {
                reader = new StreamReader(new GZipStream(httpWebResponse.GetResponseStream(), CompressionMode.Decompress));
            }
            else if (httpWebResponse.ContentEncoding.ToLower().Contains("deflate"))
            {
                reader = new StreamReader(new DeflateStream(httpWebResponse.GetResponseStream(), CompressionMode.Decompress));
            }
            else
            {
                reader = new StreamReader(httpWebResponse.GetResponseStream());
            }
            source = reader.ReadToEnd();
        }
    }
    req.Abort();
}
catch (Exception ex)
{
    //received a 404 Error - apparently one of my links is now dead...
}
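As an aside, HttpWebRequest can handle the decompression transparently: setting AutomaticDecompression makes the framework send the Accept-Encoding header and inflate gzip/deflate bodies itself, removing the manual ContentEncoding branching. A minimal sketch, reusing the uri variable from the snippet above:
System.Net.HttpWebRequest req = (System.Net.HttpWebRequest)System.Net.WebRequest.Create(uri);
// The framework negotiates gzip/deflate and decompresses the body itself.
req.AutomaticDecompression = System.Net.DecompressionMethods.GZip | System.Net.DecompressionMethods.Deflate;

using (System.Net.WebResponse webResponse = req.GetResponse())
using (StreamReader reader = new StreamReader(webResponse.GetResponseStream()))
{
    string source = reader.ReadToEnd();
}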

Unable to use external proxy in C# HttpWebRequest

I am hitting this URL:
http://www.google.co.uk/search?q=online stores uk&hl=en&cr=countryUK%7CcountryGB&as_qdr=all&tbs=ctr:countryUK
Basically I get the ppcUrls; it works perfectly without any proxy.
But when I try to use one of the proxies available on the internet:
http://proxy-list.org/en/index.php?pp=3128&pt=any&pc=any&ps=any&submit=Filter+Proxy
The above link won't open in any way :|. I checked the IPs with Internet Explorer and it opened, but here in HttpWebRequest I sometimes get 503 Server Unavailable or Too Many Redirections.
The link won't open with any IP.
Any suggestions? Below is my HTML-fetching function:
public string getHtml(string url, string proxytmp)
{
    string responseData = "";
    try
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.Accept = "*/*";
        request.AllowAutoRedirect = true;
        request.UserAgent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)";
        request.Timeout = 60000;
        request.Method = "GET";
        if (proxies.Count > 0)
        {
            try
            {
                int customIP = 0;
                int port = 0;
                string ip = string.Empty;
                string[] splitter = proxytmp.Split(':');
                // Need both the host and port parts before indexing splitter[1].
                if (splitter.Length > 1)
                {
                    ip = splitter[0];
                    port = Convert.ToInt32(splitter[1]);
                }
                WebProxy proxy = new WebProxy(ip, port);
                request.Proxy = proxy;
            }
            catch (Exception exp)
            {
            }
        }
        HttpWebResponse response = (HttpWebResponse)request.GetResponse();
        if (response.StatusCode == HttpStatusCode.OK)
        {
            Stream responseStream = response.GetResponseStream();
            StreamReader myStreamReader = new StreamReader(responseStream);
            responseData = myStreamReader.ReadToEnd();
        }
        response.Close();
    }
    catch (System.Exception e)
    {
        responseData = e.ToString();
    }
    return responseData;
}
UPDATE
The URL opens when I use the same proxy with Internet Explorer, so there must be a way, but I cannot figure it out.
Thank you
My guess is that the proxy blocks connections of a certain nature, and that is why you are running into various issues. These checks might be complex, or it might be as simple as setting the User-Agent to a valid browser's. I am not sure what other things a proxy can check; I would suggest you take a look at the request your browser creates (Referer, port, etc.) when it succeeds and make the corresponding changes in your C# code.
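For instance, a request dressed up with typical browser headers, plus proxy credentials in case the proxy requires them. This is a sketch only; the header values and credentials are illustrative, and ip, port, and url are the variables from the question's function:
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
// Headers a real browser would send alongside the User-Agent.
request.UserAgent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:60.0) Gecko/20100101 Firefox/60.0";
request.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
request.Headers.Add("Accept-Language", "en-US,en;q=0.5");
request.Referer = "http://www.google.co.uk/";

WebProxy proxy = new WebProxy(ip, port);
// Only needed if the proxy demands authentication (hypothetical credentials).
proxy.Credentials = new NetworkCredential("proxyUser", "proxyPassword");
request.Proxy = proxy;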
Good luck, let me know how it works out for you.
