C#: How to make HttpWebRequest mimic the Web Browser control - c#

I've used several HttpWebRequests in the past, but they've all been used to log in to a site.
I was wondering how does one make the WebRequest mimic a WebBrowser as in once you're logged in, navigate to a new page, maybe perform an action there, then go to a different page?
I've researched a little about this before and I think it might involve using the prior request's cookies or something.
My question is how do I (I'm assuming) get the cookies from the previous session, then navigate to a page, or complete an action as if we were still on the last request if that makes sense.

HttpWebRequest has a CookieContainer property and HttpWebResponse has a Cookies property.
You record the cookies from the response and add them to the next request's container. If you assign the same CookieContainer instance to every request, this happens automatically.
You may also need to set the HTTP Referer header on the request object (the Referer property).
EDIT:
This will still not get you mimicking a web browser. Things like JavaScript will not run, and you won't have a DOM to work against.
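A minimal sketch of the cookie-sharing approach described above, assuming .NET's System.Net types. The class name BrowserLikeSession, its method names, and the example.com URLs in the usage note are hypothetical and only serve as illustration; on newer .NET versions WebRequest.Create compiles with an obsolescence warning.

```csharp
using System;
using System.IO;
using System.Net;

// Hypothetical helper, not part of any library.
public static class BrowserLikeSession
{
    // Builds a request that uses the shared cookie jar. Nothing is sent yet.
    public static HttpWebRequest CreateRequest(string url, CookieContainer cookies, string referer = null)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        // Cookies already in the jar are sent with the request, and cookies
        // set by the response are stored back into the same jar.
        request.CookieContainer = cookies;
        if (referer != null)
        {
            // Some sites check which page the request came from.
            request.Referer = referer;
        }
        return request;
    }

    // Sends the request and returns the response body as text.
    public static string ReadBody(HttpWebRequest request)
    {
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

To use it, create one CookieContainer, pass it to CreateRequest for the login request (for example "https://example.com/login"), call ReadBody, then pass the same container to CreateRequest for the next page, optionally giving the previous URL as the referer.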

Related

I need a session ID from page A to correctly open page B. How to get page B HTML?

I want to get the HTML code of page B. Unfortunately, the site requires opening page A first to get a session_id; only after that can I open the page I want. How can I get the HTML of page B? I tried it with WebClient, but the session_id is probably not saved.
var client = new WebClient();
client.DownloadString("http://moria.umcs.lublin.pl/link/");
client.DownloadString("http://moria.umcs.lublin.pl/link/grid/1/810");
It depends on how the server tracks that you have already visited page A when you visit page B.
Most likely it uses some kind of session ID, which is probably saved in cookies. Examining HTTP request and response headers in any browser's developer tools can get you an idea of what this website does to track the user.
If you need to store the session ID in cookies, a cookie-aware WebClient sample is given here
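One common way to make WebClient keep cookies is to subclass it and attach a CookieContainer to every request it creates. The sketch below is an illustration of that idea, not the sample the answer linked to; the class name CookieAwareWebClient and its Cookies property are assumptions.

```csharp
using System;
using System.Net;

// Hypothetical helper: a WebClient that reuses one cookie jar for all requests.
public class CookieAwareWebClient : WebClient
{
    // Shared jar; it collects cookies from responses and replays them on later requests.
    public CookieContainer Cookies { get; } = new CookieContainer();

    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);
        // Only HTTP(S) requests carry cookies; other schemes are left untouched.
        if (request is HttpWebRequest httpRequest)
        {
            httpRequest.CookieContainer = Cookies;
        }
        return request;
    }
}
```

With this in place, the two DownloadString calls from the question can stay as they are; only `new WebClient()` changes to `new CookieAwareWebClient()`, so that a session cookie set by page A is sent along when page B is requested.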
I would use HttpWebRequest instead of WebClient. I did not see any method in WebClient that lets you get or set cookies. Take a look at this MSDN link. Your code for the initial request would be something like the code in the link. For the next request to another page, set its CookieContainer to one holding the cookies from the first response before you call GetResponse.
https://msdn.microsoft.com/en-us/library/system.net.httpwebrequest.cookiecontainer(v=vs.110).aspx

c# HttpClient does not store some cookies

I'm using HttpClient (System.Net.Http.HttpClient) to send some requests, and I'm also using a CookieContainer to handle cookies. For some web pages everything works fine, but on other pages no cookies are stored, although my browser saves the cookies when I visit the page.
Can someone explain what the problem is?
Maybe these pages redirect to another URL? Cookies are stored per URL, so it's possible to "lose" a cookie along the way.
To verify the behavior you may set
request.AllowAutoRedirect = false;
and inspect the response object to see what is going on. If this really is the issue in your case, you can copy the cookies from one URL to the other via the CookieContainer.
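With HttpClient, the redirect and cookie settings live on HttpClientHandler rather than on a request object. The following is a rough sketch of both steps under that assumption: turning off automatic redirects so each response can be inspected, and copying cookies from one URL to another. The class and method names are hypothetical, and storing the copies under path "/" is a simplifying assumption.

```csharp
using System;
using System.Net;
using System.Net.Http;

// Hypothetical helper, not part of any library.
public static class RedirectCookies
{
    // Creates a client that stores cookies in the given jar and does not follow
    // redirects, so each 3xx response and its Location header can be inspected.
    public static HttpClient CreateClient(CookieContainer cookies)
    {
        var handler = new HttpClientHandler
        {
            CookieContainer = cookies,
            UseCookies = true,
            AllowAutoRedirect = false
        };
        return new HttpClient(handler);
    }

    // Copies the cookies that would be sent to 'from' so they are also sent to 'to'.
    public static void CopyCookies(CookieContainer cookies, Uri from, Uri to)
    {
        foreach (Cookie cookie in cookies.GetCookies(from))
        {
            // Assumption: the copy should apply to the whole target host, hence path "/".
            cookies.Add(to, new Cookie(cookie.Name, cookie.Value, "/"));
        }
    }
}
```

Copying cookies across hosts bypasses the scoping a browser would apply, so it is best treated as a diagnostic step to confirm that a redirect is where the cookie goes missing.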

How to change the URL in the browser URL Bar

I am using the Server.Transfer() method in my ASP.NET application to redirect the response. The problem is that the browser's URL bar keeps showing the previous page's URL (the one the original request was made for). I want to change the URL in the browser. Is that even possible?
I looked into it and I know that the Request has a Url property, but it's read-only. Does anyone know a way to change the URL in the request?
Use Response.Redirect(); instead of Server.Transfer(); and the redirect happens in the browser.
If you can't do that, you could use pushState (at least where it's available) to change the URL, but that seems like a bit of overkill...
The best way is clearly to change
Server.Transfer();
to
Response.Redirect();
EDIT
As you want maximum performance, you should use Response.Redirect with two parameters and set the second one (endResponse) to true.
So instead of
Server.Transfer(url);
you should have
Response.Redirect(url, true);
That ends execution of the current request and forces an immediate redirect.
Description
You can't change the Url of the Current request because it is already running.
I think you want to do a redirect.
The Redirect method causes the browser to redirect the client to a different URL.
Sample
Response.Redirect("<theNewUrl>");
Update
If you want to change the URL in the browser's address bar without making a request, read this:
Can I change the URL string in the address bar using javascript
More Information
MSDN - Response.Redirect Method
Server.Transfer() just changes which content you send back.
Response.Redirect() is what you need to tell the browser to go to a new page
You cannot change the URL of a request - it would make no sense, the URL is what your client (the browser) has asked for.
No, you cannot change the URL in the browser like that. That would be a pretty massive security hole if you were able to do that. http://EvilDomain.com would be able to seamlessly masquerade as http://YourOnlineBank.com and no one would be any the wiser.

Automating/Scraping html from C# - redirects and logins/logouts

I'm trying to scrape a web app that uses a few redirects and logouts/logins in between requests. I believe I'll have to set AllowAutoRedirect to false so I can capture the redirect requests and manually redirect while watching for new cookies. My only experience with cookies is to just set the container and forget about it... do I have to parse the response headers to decide actions to take with cookies? Can someone lay out a general approach?
Update
It turns out that Chris was right. The redirects and cookies were working just fine... the application I was hitting did not like that I was not setting all the right headers (content type, user agent). After adding those in, I'm getting the response I expect.
You got the CookieContainer part right which most people miss so kudos to you!
AllowAutoRedirect should pick up cookies as redirects happen. Is there a reason that you need to manually process things?
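As a sketch of what resolved the problem in the update, the following sets a shared CookieContainer, leaves automatic redirects on, and adds the content type and user agent headers to a form POST. The class and method names and the header values are illustrative assumptions; the real values depend on what the target application expects.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

// Hypothetical helper, not part of any library.
public static class ScraperRequests
{
    // Prepares a form POST with browser-like headers. Nothing is sent yet.
    public static HttpWebRequest CreateFormPost(string url, CookieContainer cookies)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "POST";
        request.CookieContainer = cookies;   // cookies are picked up across redirects
        request.AllowAutoRedirect = true;    // the default, stated here for clarity
        // Assumed header values; adjust them to what the site expects.
        request.UserAgent = "Mozilla/5.0 (compatible; ExampleScraper/1.0)";
        request.ContentType = "application/x-www-form-urlencoded";
        return request;
    }

    // Writes the form body, sends the request, and returns the response text.
    public static string Send(HttpWebRequest request, string formBody)
    {
        byte[] payload = Encoding.UTF8.GetBytes(formBody);
        request.ContentLength = payload.Length;
        using (Stream body = request.GetRequestStream())
        {
            body.Write(payload, 0, payload.Length);
        }
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Comparing the headers a browser sends (visible in its developer tools) with the ones set here is usually the quickest way to find which header the application is checking.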

Possible to get return url back from javascript/jquery?

I have a problem that when a user times out on my site they are still logged in. So they can still do an ajax request. If they do an ajax request on my site my asp.net mvc authorization tag will stop this.
The authorization normally then redirects the user back to the signin page if they fail authorization.
Since this is an ajax request, what seems to happen is that the entire sign-in page is sent back rendered as HTML. So the user never gets redirected, because the ajax call just receives the whole page as HTML.
However firebug says this in the console:
http://localhost:3668/Account/signIn?ReturnUrl="return" ( this is not in the actual url bar in the web browser so I can't go up there and get it. I only can seem to see it through firebug.)
So I am not sure but maybe if I could somehow grab this url from inside my errorCallback area that would be great.
From my testing, no error code is sent back (200 OK is sent). Instead I just get a parsing error (which is why errorCallback is called), but I can't assume that every parsing error means the user timed out.
I need something better. The only other option is to look at the response, search for keywords, and see whether it is the sign-in page, which I don't think is a great way to do it.
You probably want to do one of two things:
Write your server code such that ajax requests return an ajax error when a session is expired. That way the javascript will expect a return code that indicates a session timeout, and you can tell the user the session expired.
If an elegant solution isn't forthcoming because of how your framework handles this stuff, just put a chunk of HTML comment in your login page like Uth7mee3 or something; then check for the existence of that string in your ajax code.
Alternatively, you can set a timer on the web page that figures out when the session is about to time out and warns the user with a little message that lets them renew their session. Once it times out, blank out the page and give them a link to log in again.
How about having a script in the Loginpage
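// Compare the path only: location.href is the full absolute URL and would never equal "/Account/Login"
if (window.location.pathname !== "/Account/Login")
{
window.location.href = "/Account/Login";
}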
This would work if you try to render partials in an ajax request.
(Not if you expect json)
What is the status code of the response in this situation? I think you should be able to check for a 302 here. If not, the Location header would be the next best way to check for the sign-in page.
This isn't an answer to your specific question, but the way I deal with this is to have some client-side code that knows the session length and prompts the user to renew the session just before it expires if they haven't moved off the page. If the user doesn't respond to the prompt in time, it invokes the logout action of the site, taking the user to the login page.
You can find more information on the exact implementation, including some code, on my blog: http://farm-fresh-code.blogspot.com.
