Navigating the webBrowser with proxy - c#

I'm trying to make an application in Visual Studio in C#. I want to navigate the WebBrowser control on my form through a proxy, but I can't get it to work.
With the code below I can load a page through the proxy. But as soon as I click a link inside the page, the WebBrowser navigates directly, without going through the proxy.
Uri proxyURI = new Uri("http://" + myProxy);
WebProxy proxy = new WebProxy(proxyURI);
HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://stackoverflow.com/");
request.Proxy = proxy;
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
Stream receiveStream = response.GetResponseStream();
webBrowser1.DocumentStream = receiveStream;
To make things easier, isn't it possible to set the proxy on the WebBrowser itself?
Something like:
webBrowser1.Document.GetElementById("login").byProxy("MyProxy").InvokeMember("click");
My apologies, but I'm a noob in C#.

Don't mix WebRequest and WebBrowser in this case; they have completely different network sessions.
Use WebBrowser alone, and set the proxy with UrlMkSetSessionOption and INTERNET_OPTION_PROXY. Do this before you create the WebBrowser (otherwise, I'm not sure whether an existing WebBrowser instance picks up new proxy settings immediately).
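A minimal sketch of that approach, assuming a Windows machine (the helper class name and proxy address are mine, not from the question). UrlMkSetSessionOption lives in urlmon.dll and affects the process-wide WinInet session that the WebBrowser control uses:

```csharp
using System;
using System.Runtime.InteropServices;

static class WinInetProxy
{
    const int INTERNET_OPTION_PROXY = 38;
    const int INTERNET_OPEN_TYPE_PROXY = 3;

    [StructLayout(LayoutKind.Sequential)]
    struct INTERNET_PROXY_INFO
    {
        public int dwAccessType;
        public IntPtr proxy;
        public IntPtr proxyBypass;
    }

    [DllImport("urlmon.dll", CharSet = CharSet.Ansi)]
    static extern int UrlMkSetSessionOption(int dwOption, IntPtr pBuffer, int dwBufferLength, int dwReserved);

    // Sets the proxy for this process's WinInet session, e.g. "127.0.0.1:8080".
    public static void SetProxy(string proxy)
    {
        var info = new INTERNET_PROXY_INFO
        {
            dwAccessType = INTERNET_OPEN_TYPE_PROXY,
            proxy = Marshal.StringToHGlobalAnsi(proxy),
            proxyBypass = Marshal.StringToHGlobalAnsi("local")
        };
        IntPtr buffer = Marshal.AllocCoTaskMem(Marshal.SizeOf(info));
        try
        {
            Marshal.StructureToPtr(info, buffer, false);
            UrlMkSetSessionOption(INTERNET_OPTION_PROXY, buffer, Marshal.SizeOf(info), 0);
        }
        finally
        {
            Marshal.FreeCoTaskMem(buffer);
            Marshal.FreeHGlobal(info.proxy);
            Marshal.FreeHGlobal(info.proxyBypass);
        }
    }
}
```

Call `WinInetProxy.SetProxy("myproxy:8080");` before constructing the WebBrowser; every subsequent navigation (including clicks inside the page) should then go through the proxy.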

Related

How to submit webpage in library?

I have a URL that I'd like to submit. This will cause a redirect on the target page. I'd then like to get the new URL.
This will happen inside a library, so no WinForms or WPF. Is there some type of web browser object available at this level (similar to the WinForms WebBrowser control)?
I don't need to do any page scraping, just access to the new URL. But some type of eventing will need to be available so I'll know when it's OK to grab the new URL.
You can use HttpWebRequest:
HttpWebRequest httpWebRequest = (HttpWebRequest) WebRequest.Create(<submit URL>);
httpWebRequest.AllowAutoRedirect = true;
httpWebRequest.GetResponse();
// address after all redirections:
Uri address = httpWebRequest.Address;

C# Downloading HTML from a website after logging in

I've recently been looking into how to get data from a website using C#. I tried using the WebBrowser object to navigate and log in, and that worked fine, but I keep hitting the same problem: when I navigate to the page I want, I get disconnected.
I've tried several things, like making sure that only one HtmlDocument exists, but I still get logged out.
TL;DR: how do you stay logged in, from page to page, while navigating a website with WebBrowser? Or are there better alternatives?
EDIT: So far I have the following code;
currentWebBrowser = new WebBrowser();
currentWebBrowser.DocumentText = @"<head></head><body></body>";
currentWebBrowser.Url = new Uri("about:blank");
currentWebBrowser.Navigate("http://google.com");
HttpWebRequest Req = (HttpWebRequest) WebRequest.Create("http://google.com");
Req.Proxy = null;
Req.UseDefaultCredentials = true;
HttpWebResponse Res = (HttpWebResponse)Req.GetResponse();
currentWebBrowser.Document.Cookie = Res.Cookies.ToString();
At which moment should I get the cookies? And is my code correct?
You have to preserve the cookies returned from your login request and reuse those cookies on all subsequent requests - the authentication cookie tells the server that you are in fact logged in already. E.g. see here on how to do that.
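A sketch of that pattern, assuming HttpWebRequest is the transport (the URLs and form fields below are placeholders, not from the question). The key point is that one CookieContainer instance is assigned to every request, so the Set-Cookie from the login response is sent back automatically:

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

// One container shared by all requests: the server's auth cookie lands
// here after the login POST and is replayed on later requests.
CookieContainer cookies = new CookieContainer();

HttpWebRequest login = (HttpWebRequest)WebRequest.Create("http://example.com/login");
login.Method = "POST";
login.ContentType = "application/x-www-form-urlencoded";
login.CookieContainer = cookies;
byte[] body = Encoding.UTF8.GetBytes("user=me&pass=secret");
login.ContentLength = body.Length;
using (Stream s = login.GetRequestStream())
    s.Write(body, 0, body.Length);
using (login.GetResponse()) { }  // cookies captured here

// Same container -> the server still sees us as logged in.
HttpWebRequest page = (HttpWebRequest)WebRequest.Create("http://example.com/members");
page.CookieContainer = cookies;
using (var resp = (HttpWebResponse)page.GetResponse())
using (var reader = new StreamReader(resp.GetResponseStream()))
{
    string html = reader.ReadToEnd();
}
```

With the WebBrowser control itself the session cookies are kept by WinInet for you; the container approach above applies when you switch to raw requests.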

Any workaround to get text in an iFrame on another domain in a WebBrowser?

You will probably first think this is not possible because of cross-site scripting (XSS) restrictions. But I'm trying to access this content from an application that hosts a WebBrowser, not from JavaScript code in a site.
I understand it is not possible, and should not be possible by non-hacky means, to access this content from JavaScript, because that would be a big security issue. But it makes no sense to have this restriction in an application that hosts a WebBrowser. If I wanted to steal my application users' Facebook information, I could just call Navigate("facebook.com") and do whatever I want in it. This is an application that hosts a WebBrowser, not a webpage.
Also, if you open in Google Chrome any webpage that contains an iFrame whose source is on another domain, right-click its content and click Inspect Element, it will show you the content. Even simpler: if you navigate to any webpage that contains an iFrame on another domain, you will see its content. If you can see it in the WebBrowser, then you should be able to access it programmatically, because it has to be somewhere in memory.
Is there any way, not through the DOM objects (they seem to be based on the same engine as JavaScript and therefore bound by the same XSS restrictions), but through some lower-level objects such as MSHTML or SHDocVw, to access this text?
Can this be useful for you?
foreach (HtmlElement elm in webBrowser1.Document.GetElementsByTagName("iframe"))
{
    string src = elm.GetAttribute("src");
    if (!string.IsNullOrEmpty(src))
    {
        // Resolve relative frame URLs against the current document.
        Uri frameUri = new Uri(webBrowser1.Document.Url, src);
        string content = new System.Net.WebClient().DownloadString(frameUri); // or using HttpWebRequest
        MessageBox.Show(content);
    }
}
Do you just need a way to request content from code?
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(webRequest.URL);
request.UserAgent = webRequest.UserAgent;
request.ContentType = webRequest.ContentType;
request.Method = webRequest.Method;
if (webRequest.BytesToWrite != null && webRequest.BytesToWrite.Length > 0) {
Stream oStream = request.GetRequestStream();
oStream.Write(webRequest.BytesToWrite, 0, webRequest.BytesToWrite.Length);
oStream.Close();
}
// Send the request and get a response
HttpWebResponse resp = (HttpWebResponse)request.GetResponse();
// Read the response
StreamReader sr = new StreamReader(resp.GetResponseStream());
// return the response body
string returnedValue = sr.ReadToEnd();
sr.Close();
resp.Close();
return returnedValue;

C# Proxy with username and password

I set up a proxy instance and used it with a webrequest object.
WebProxy proxy = new WebProxy("ip:port", true);
proxy.Credentials = new NetworkCredential("username", "password");
WebRequest request = WebRequest.Create("webpage url");
request.Proxy = proxy;
WebResponse response = request.GetResponse();
StreamReader reader = new StreamReader(response.GetResponseStream());
reader.ReadToEnd(); // web page source
This works as it should, but I want to display the page in a WebBrowser control without losing information or design. If I set my control's DocumentText to the source that was just downloaded, it has very bad formatting.
Edit: Is there a way for me to apply the proxy object to the WebBrowser control itself?
Edit: The WebBrowser control just uses IE's settings, so you don't have to set the proxy yourself. See http://social.msdn.microsoft.com/Forums/en-US/winforms/thread/f4dc3550-f213-41ff-a17d-95c917bed027/ for how to set the IE proxy in code.
Well, the problem here is that the HTML you've received via the WebRequest contains relative paths to CSS files that are not present in the current context. You can fix this by modifying the HTML to add the following tag in the <head> section:
<base href="http://domainname.com/" />
After that, the WebBrowser control resolves the relative CSS paths against the domain in this tag.
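A minimal sketch of injecting that tag before assigning the source to DocumentText (the helper function name is mine; it naively assumes the document has a `<head>` tag):

```csharp
using System;

// Insert a <base> tag right after <head> so relative CSS/image paths
// resolve against the original domain. Returns the HTML unchanged if
// no <head> tag is found.
static string InjectBase(string html, string baseUrl)
{
    int i = html.IndexOf("<head>", StringComparison.OrdinalIgnoreCase);
    if (i < 0) return html;
    int insertAt = i + "<head>".Length;
    return html.Insert(insertAt, "<base href=\"" + baseUrl + "\" />");
}

// Prints: <html><head><base href="http://domainname.com/" /></head></html>
Console.WriteLine(InjectBase("<html><head></head></html>", "http://domainname.com/"));
```

In the WinForms app this would be used as `webBrowser1.DocumentText = InjectBase(source, "http://domainname.com/");`.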

you must use a browser that supports and has JavaScript enabled - c#

I'm trying to post using HttpWebRequest, and this is the response I keep getting back:
you must use a browser that supports and has JavaScript enabled
This is my post code:
HttpWebRequest myRequest = (HttpWebRequest)WebRequest.Create(submitURL);
myRequest.Method = WebRequestMethods.Http.Post;
myRequest.Headers.Add("Accept-Language", "en-US");
myRequest.Accept = "*/*, text/xml";
myRequest.ContentType = "application/x-www-form-urlencoded";
myRequest.CookieContainer = cookieContainer;
myRequest.Headers.Add("UA-CPU", "x86");
myRequest.Headers.Add("Accept-Encoding", "gzip, deflate");
//cPostData section removed as submitting to SO
myRequest.ContentLength = cPostData.Length;
myRequest.ServicePoint.Expect100Continue = false;
StreamWriter streamWriter = new StreamWriter(myRequest.GetRequestStream());
streamWriter.Write(cPostData);
streamWriter.Close();
HttpWebResponse httpWebResponse = (HttpWebResponse)myRequest.GetResponse();
StreamReader streamReader = new StreamReader(httpWebResponse.GetResponseStream());
string stringResult = streamReader.ReadToEnd();
streamReader.Close();
How do I avoid getting this error?
It is difficult to say what the exact problem is, because the server that is receiving your request doesn't think it is valid.
Perhaps the first thing to try would be to set the UserAgent property on your HttpWebRequest to some valid browser's user agent string as the server may be using this value to determine whether or not to serve the page.
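For example (the User-Agent value below is just a sample browser string, not something the original question specified; no request is actually sent here):

```csharp
using System;
using System.Net;

// Present a browser-like User-Agent so the server doesn't dismiss the
// request as coming from a non-browser client.
HttpWebRequest myRequest = (HttpWebRequest)WebRequest.Create("http://example.com/");
myRequest.UserAgent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36";
Console.WriteLine(myRequest.UserAgent);
```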
This doesn't have anything to do with your code - the web server code has something that detects or relies on Javascript. Most likely a piece of Javascript on the page fills out (or modifies prior to posting) some hidden form field(s).
The solution to this is entirely dependent on what the web server is expecting to happen with that form data.
This is a layman's answer, not a 100% technically accurate description of the HttpWebRequest object, because of the amount of time a full one would take to post. The first part of this answer is to clarify the final sentence.
The HttpWebRequest object basically acts as a browser interacting with web pages. It's a very simple browser with no UI, designed basically to post to and read from web pages. As such, it does not support a variety of features normally found in a browser these days, such as JavaScript.
The page you are attempting to post to requires JavaScript, which the HttpWebRequest object does not support. If you have no control over the page that the WebRequest object is posting to, then you'll have to find another way to post to it. If you own or control the page, you will need to modify it to strip out items that require JavaScript (such as Ajax features, etc.).
Added
I purposely didn't add anything about specifying a user agent to try to trick the web server into thinking the HttpWebRequest object supports JavaScript, because it is likely that the page really needs JavaScript enabled in order to display properly. However, a lot of my assumptions prove wrong, so I would agree with @Andrew Hare and say it's worth a try.