When I try to get an HTML page I get this error:
The underlying connection was closed: The connection was closed unexpectedly
I think the site I'm fetching uses some IP-based protection.
WebClient single_page_client = new WebClient();
single_page_client.Headers.Add("user-agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705;)");
string cat_page_single = single_page_client.DownloadString(the_url);
How can I work around this?
What about using a proxy with WebClient?
EDIT
If I use this code instead, it works. Why?
HttpWebRequest webrequest = (HttpWebRequest)WebRequest.Create(current_url);
webrequest.KeepAlive = true;
webrequest.Method = "GET";
webrequest.ContentType = "text/html";
webrequest.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
//webrequest.Connection = "keep-alive";
webrequest.Host = "cat.sabresonicweb.com";
webrequest.Headers.Add("Accept-Language", "en-US,en;q=0.5");
webrequest.UserAgent = "Mozilla/5.0 (Windows NT 6.1; rv:18.0) Gecko/20100101 Firefox/18.0";
HttpWebResponse webresponse = (HttpWebResponse)webrequest.GetResponse();
Console.Write(webresponse.StatusCode);
Stream receiveStream = webresponse.GetResponseStream();
Encoding enc = System.Text.Encoding.GetEncoding(1252);//1252
StreamReader loResponseStream = new StreamReader(receiveStream, enc);
string current_page = loResponseStream.ReadToEnd();
loResponseStream.Close();
webresponse.Close();
The first request does not receive a header indicating the length of the result, so the server closes the connection when it finishes sending.
The second request uses the Content-Length header: the client reads the designated number of bytes and then closes the connection gracefully (under client-side control instead of a server-driven disconnection).
-or-
The URL you sent caused an error on the server. Is there an error in the server log or Event Viewer?
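If the plain WebClient is the culprit, one common workaround (a sketch, not a fix confirmed for this particular server) is to subclass WebClient and adjust the underlying HttpWebRequest, for example disabling keep-alive or forcing HTTP/1.0, so the server cannot drop a half-open persistent connection mid-download:

```csharp
using System;
using System.Net;

// Sketch: a WebClient that lets us tweak the HttpWebRequest it creates.
// The KeepAlive/ProtocolVersion values here are assumptions to try,
// not settings verified against this specific site.
class TweakedWebClient : WebClient
{
    protected override WebRequest GetWebRequest(Uri address)
    {
        var request = (HttpWebRequest)base.GetWebRequest(address);
        request.KeepAlive = false;                       // no persistent connection
        request.ProtocolVersion = HttpVersion.Version10; // HTTP/1.0 implies Connection: close
        return request;
    }
}

// Usage (the_url as in the question):
// using (var client = new TweakedWebClient())
// {
//     client.Headers.Add("user-agent", "Mozilla/5.0 ...");
//     string cat_page_single = client.DownloadString(the_url);
// }
```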
Related
I try to get the content of this URL: https://www.eganba.com/index.php?p=Products&ctg_id=2000&sort_type=rel-desc&view=0&page=1
but as a result of the following code the response contains the content of the home page, https://www.eganba.com, instead.
In addition, when I request the first URL via the Postman application, the response is correct.
Do you have any idea why?
WebRequest request = WebRequest.Create("https://www.eganba.com/index.php?p=Products&ctg_id=2000&sort_type=rel-desc&view=0&page=1");
request.Method = "GET";
request.Headers["X-Requested-With"] = "XMLHttpRequest";
WebResponse response = request.GetResponse();
Use the WebClient class from System.Net. I think this code gives you what you need; it returns the page's HTML:
using (WebClient client = new WebClient())
{
client.Headers.Add("user-agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)");
client.Headers.Add("accept", "text/html");
var htmlCode = client.DownloadString("https://www.eganba.com/?p=Products&ctg_id=2000&sort_type=rel-desc&view=0&page=1");
var result = htmlCode.Contains("Stokta var");
}
Hope it helps.
The request is not stopped after the period set in Timeout has passed.
I'd like the request to stop automatically once a period of time has passed, since the server gives no response from time to time.
Please find the source below.
=========================================================================
String urlSetting = "http://www.test.com";
HttpWebRequest request = WebRequest.Create(urlSetting) as HttpWebRequest;
request.UserAgent = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)";
request.Headers.Add("Accept-Language", "en");
request.MediaType = "text/html";
request.Method = System.Net.WebRequestMethods.Http.Get;
request.ContentType = "text/xml; charset=utf-8";
request.Timeout = 3000;
request.Proxy = new WebProxy() { UseDefaultCredentials = true };
HttpWebResponse webResponse = (HttpWebResponse)request.GetResponse();
=========================================================================
I am using GetResponse() to call the server, but sometimes the server does not respond.
I added "request.Timeout = 3000;" to stop the request after three seconds have passed.
But that part doesn't work.
The Internet environment here goes through a proxy, as shown in the proxy settings screenshot.
I wonder if there is another way to enforce a timeout in a proxy environment.
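For what it's worth, HttpWebRequest.Timeout only covers GetResponse() and GetRequestStream(); reading the response body is governed separately by ReadWriteTimeout, and a proxy that accepts the connection but never answers can still stall. A sketch that sets both and adds a hard overall deadline (the URL and the 3-second value are just the question's assumptions):

```csharp
using System;
using System.IO;
using System.Net;
using System.Threading.Tasks;

class TimeoutSketch
{
    static string DownloadWithDeadline(string url) // e.g. "http://www.test.com"
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Proxy = new WebProxy() { UseDefaultCredentials = true };
        request.Timeout = 3000;          // covers GetResponse()/GetRequestStream()
        request.ReadWriteTimeout = 3000; // covers reads on the response stream

        // Hard overall deadline: abort the request if nothing completes in time.
        var work = Task.Run(() =>
        {
            using (var response = (HttpWebResponse)request.GetResponse())
            using (var reader = new StreamReader(response.GetResponseStream()))
                return reader.ReadToEnd();
        });
        if (!work.Wait(TimeSpan.FromSeconds(5)))
        {
            request.Abort(); // forces the blocked GetResponse() to throw
            throw new TimeoutException("No response within the deadline.");
        }
        return work.Result;
    }
}
```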
I need to get the HTML source of the web page.
This web page is a part of the web site that requires NTLM authentication.
This authentication is silent because Internet Explorer can use Windows log-in credentials.
Is it possible to reuse this silent authentication (i.e. reuse Windows log-in credentials), without making the user enter his/her credentials manually?
The options I have tried are below.
string url = @"http://myWebSite";
//works fine
System.Diagnostics.Process.Start("IExplore.exe", url);
InternetExplorer ie = null;
ie = new SHDocVw.InternetExplorer();
ie.Navigate(url);
//Works up to here, but I do not know how to read the HTML source with SHDocVw
NHtmlUnit.WebClient webClient = new NHtmlUnit.WebClient(BrowserVersion.INTERNET_EXPLORER_8);
HtmlPage htmlPage = webClient.GetHtmlPage(url);
string ghjg = htmlPage.WebResponse.ContentAsString; // Error 401
System.Net.WebClient client = new System.Net.WebClient();
client.Credentials = CredentialCache.DefaultNetworkCredentials;
client.Proxy.Credentials = CredentialCache.DefaultCredentials;
// DefaultNetworkCredentials and DefaultCredentials are empty
client.Headers.Add("user-agent", "Mozilla/5.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; GTB7.4; InfoPath.2; SV1; .NET CLR 3.3.69573; WOW64; en-US)");
string reply = client.DownloadString(url); // Error 401
HttpWebRequest request = HttpWebRequest.Create(url) as HttpWebRequest;
IWebProxy proxy = request.Proxy;
// Print the Proxy Url to the console.
if (proxy != null)
{
// Use the default credentials of the logged on user.
proxy.Credentials = CredentialCache.DefaultNetworkCredentials;
// DefaultNetworkCredentials are empty
}
request.UserAgent = "Mozilla/5.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; GTB7.4; InfoPath.2; SV1; .NET CLR 3.3.69573; WOW64; en-US)";
request.Accept = "*/*";
HttpWebResponse response = request.GetResponse() as HttpWebResponse;
Stream stream = response.GetResponseStream(); // Error 401
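One more option worth trying (a sketch, assuming the server advertises NTLM or Negotiate; not verified against this particular site): attach the logged-on user's Windows identity explicitly through a CredentialCache, which also lets a proxy in between authenticate with the same silent credentials.

```csharp
using System;
using System.IO;
using System.Net;

class NtlmSketch
{
    static string GetPage(string url) // e.g. the @"http://myWebSite" from the question
    {
        var request = (HttpWebRequest)WebRequest.Create(url);

        // Explicitly offer the current Windows credentials for NTLM/Negotiate.
        var cache = new CredentialCache();
        cache.Add(new Uri(url), "NTLM", CredentialCache.DefaultNetworkCredentials);
        cache.Add(new Uri(url), "Negotiate", CredentialCache.DefaultNetworkCredentials);
        request.Credentials = cache;
        request.PreAuthenticate = true;

        // Let the proxy (if any) authenticate silently as well.
        if (request.Proxy != null)
            request.Proxy.Credentials = CredentialCache.DefaultCredentials;

        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
            return reader.ReadToEnd();
    }
}
```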
What URL encoding does a web browser use when submitting data to a server?
In my application I use HttpUtility.UrlEncode(string data)
but I do not get the same result as with a web browser.
My application submits some text data to a forum.
When I submit the data with a web browser (Google Chrome), I can see the exact text arrive at the server, but when I submit it using my application, some characters show up differently.
So is it necessary for any data submitted to a server to be URL-encoded?
---Edit---
data = HttpUtility.UrlEncode(textBox1.Text);
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
request.CookieContainer = cookieContainer;
request.UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/536.5 (KHTML, like Gecko) Chrome/19.0.1084.56 Safari/536.5";
request.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
request.Headers.Add("Accept-Charset", "ISO-8859-2,utf-8;q=0.7,*;q=0.7");
request.Headers.Add("Accept-Encoding", "gzip, deflate");
request.Headers.Add("Accept-Language", "pl,en-us;q=0.7,en;q=0.3");
request.Method = "POST";
ASCIIEncoding ascii = new ASCIIEncoding();
byte[] byteArray = ascii.GetBytes(data);
request.ContentType = "application/x-www-form-urlencoded";
request.ContentLength = byteArray.Length;
Stream dataStream = request.GetRequestStream();
dataStream.Write(byteArray, 0, byteArray.Length);
dataStream.Close();
And the data in textBox1 is like this:
¦=¦=¦¦=¦=¦
RUNTIME………..: 2h:13m:15s
RELEASE SIZE……: 16,2GB
VIDEO CODEC…….: x264, 2pass,
You generally only have to URL-encode if you want to include a URL-reserved character (?, &, etc.) in a URL parameter.
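A likely reason the posted characters differ (an assumption based on the snippet above): ASCIIEncoding cannot represent characters such as ¦ or …, and silently turns each of them into '?', while a browser like Chrome percent-encodes form data as UTF-8 bytes before sending it. A sketch of doing the same:

```csharp
using System;
using System.Text;
using System.Web; // HttpUtility; add a reference to System.Web.dll

class EncodingSketch
{
    static void Main()
    {
        string text = "¦=¦=¦¦=¦=¦";

        // Encode the way the browser does: percent-encoded UTF-8 bytes,
        // so '¦' (U+00A6) becomes %c2%a6 instead of being mangled to '?'.
        string data = HttpUtility.UrlEncode(text, Encoding.UTF8);

        // When writing the POST body, use UTF-8 bytes rather than ASCII.
        byte[] body = Encoding.UTF8.GetBytes(data);
        Console.WriteLine(data);
    }
}
```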
I have an HttpWebRequest with a StreamReader that works fine without a WebProxy. When I use a WebProxy, the StreamReader reads strange characters instead of the actual HTML. Here is the code.
HttpWebRequest req = (HttpWebRequest)WebRequest.Create("https://URL");
req.UserAgent = "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.10 (KHTML, like Gecko) Chrome/8.0.552.224 Safari/534.10";
req.Accept = "application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
req.Headers.Add("Accept-Charset", "ISO-8859-1,utf-8;q=0.7,*;q=0.3");
req.Headers.Add("Accept-Encoding", "gzip,deflate,sdch");
req.Headers.Add("Accept-Language", "en-US,en;q=0.8");
req.Method = "GET";
req.CookieContainer = new CookieContainer();
WebProxy proxy = new WebProxy("proxyIP:proxyPort");
proxy.Credentials = new NetworkCredential("proxyUser", "proxyPass");
req.Proxy = proxy;
HttpWebResponse res = (HttpWebResponse)req.GetResponse();
StreamReader reader = new StreamReader(res.GetResponseStream());
string html = reader.ReadToEnd();
Without the WebProxy, the variable html holds the expected HTML string from the URL. But with a WebProxy, html holds a value like this:
"�\b\0\0\0\0\0\0��]r����s�Y����\0\tP\"]ki���ػ��-��X�\0\f���/�!�HU���>Cr���P$%�nR�� z�g��3�t�~q3�ٵȋ(M���14&?\r�d:�ex�j��p������.��Y��o�|��ӎu�OO.�����\v]?}�~������E:�b��Lן�Ԙ6+�l���岳�Y��y'ͧ��~#5ϩ�it�2��5��%�p��E�L����t&x0:-�2��i�C���$M��_6��zU�t.J�>C-��GY��k�O�R$�P�T��8+�*]HY\"���$Ō�-�r�ʙ�H3\f8Jd���Q(:�G�E���r���Rܔ�ڨ�����W�<]$����i>8\b�p� �\= 4\f�> �&��$��\v��C��C�vC��x�p�|\"b9�ʤ�\r%i��w#��\t�r�M�� �����!�G�jP�8.D�k�Xʹt�J��/\v!�r��y\f7<����\",\a�/IK���ۚ�r�����ҿ5�;���}h��+Q��IO]�8��c����n�AGڟu2>�
Since you are passing
req.Headers.Add("Accept-Encoding", "gzip,deflate,sdch");
I would say the proxy (or the server behind it) compresses the stream before sending it back to you.
Inspect the headers of the response to check the Content-Encoding.
Then just use GZip to decompress it.
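A sketch of the simplest route (assuming the body really is gzip or deflate, which the Content-Encoding response header will confirm): let HttpWebRequest decompress transparently instead of decoding by hand. Note that setting AutomaticDecompression also sends the matching Accept-Encoding header for you.

```csharp
using System;
using System.IO;
using System.Net;

class DecompressSketch
{
    static string Get(string url) // e.g. the "https://URL" from the question
    {
        var req = (HttpWebRequest)WebRequest.Create(url);

        // Transparently gunzip/inflate the response body; this also adds
        // "Accept-Encoding: gzip, deflate" to the request automatically.
        req.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;

        using (var res = (HttpWebResponse)req.GetResponse())
        using (var reader = new StreamReader(res.GetResponseStream()))
            return reader.ReadToEnd();
    }
}
```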