Get page source code after redirection - C#

I have tried a few ways to get the page source code of the following website: http://www.poppe-bedrijfswagens.nl. I think this website has an automatic redirection set up.
I tried following ways:
WebClient client = new WebClient();
string sourceCode = client.DownloadString(address);
And
HttpWebRequest myWebRequest = (HttpWebRequest)HttpWebRequest.Create(address);
myWebRequest.AllowAutoRedirect = true;
myWebRequest.Method = "GET";
// make request for web page
HttpWebResponse myWebResponse = (HttpWebResponse)myWebRequest.GetResponse();
StreamReader myWebSource = new StreamReader(myWebResponse.GetResponseStream());
string myPageSource = myWebSource.ReadToEnd();
myWebResponse.Close();
I always get the source code of the first page, but I need the source code of the page that the website redirects to.
The redirection for http://www.poppe-bedrijfswagens.nl is:
Type of redirect: “meta refresh” redirect after 0 second
Redirected to: http://www.poppe-bedrijfswagens.nl/daf-html/dealer_homepage.html
Thanks in advance.

The AllowAutoRedirect property only applies when the redirection is done with an HTTP status code such as 302. A meta refresh isn't technically an HTTP redirection: the first page loads normally, and it is the browser that then navigates to the URL given in the meta tag.
You can, however, download the first page, search its HTML for the <meta http-equiv="refresh" content="0;url=HTTP://WWW.NEXT-URL.COM"> element, extract the URL from it, and then download the page you're actually interested in.
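A minimal sketch of that approach, sticking with WebClient (the regular expression is an illustrative assumption about the tag's format; an HTML parser such as HtmlAgilityPack would be more robust):
using System;
using System.Net;
using System.Text.RegularExpressions;

static string DownloadFinalPage(string address)
{
    using (WebClient client = new WebClient())
    {
        string firstPage = client.DownloadString(address);

        // Look for <meta http-equiv="refresh" content="...;url=...">
        Match match = Regex.Match(
            firstPage,
            @"<meta\s+http-equiv\s*=\s*[""']refresh[""'][^>]*url=(?<url>[^""'>]+)",
            RegexOptions.IgnoreCase);

        if (!match.Success)
            return firstPage; // no meta refresh, so the first page is the final one

        // Resolve relative redirect targets against the original address
        Uri target = new Uri(new Uri(address), match.Groups["url"].Value.Trim());
        return client.DownloadString(target);
    }
}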

Related

Call classic ASP function from ASPX

I am working on an old web application where pages were written in classic ASP and are wrapped in aspx pages using iframes. I am rewriting one of those pages in ASP.NET (using C#), removing the dependency on iframes altogether. The page_to_rewrite.asp page calls many other functions defined in other ASP pages in the same application.
I am having difficulty calling those ASP functions from aspx.cs. I tried to use the WebClient class like this:
using (WebClient wc = new WebClient())
{
Stream _stream = wc.OpenRead("http://localhost/Employee/finance_util.asp?function=GetSalary&EmpId=12345");
StreamReader sr = new StreamReader(_stream);
string s = sr.ReadToEnd();
_stream.Close();
sr.Close();
}
Every request coming into this application is checked for a valid session cookie by an IIS HTTP module, and if it is not present the user is redirected to the login page. So when I call this ASP page URL from the aspx page, I get the login page of my application as the response, because no session cookie is sent.
Can anyone please suggest how I can call the ASP functions successfully?
As @Schadensbegrenzer pointed out in the comments, I just had to pass the session cookie in the request header, like this:
using (WebClient wc = new WebClient())
{
wc.Headers[HttpRequestHeader.Cookie] = "SessionID=" + Request.Cookies["SessionID"].Value;
Stream _stream = wc.OpenRead("http://localhost/Employee/finance_util.asp?function=GetSalary&EmpId=12345");
StreamReader sr = new StreamReader(_stream);
string s = sr.ReadToEnd();
_stream.Close();
sr.Close();
}
In other similar questions on Stack Overflow, some people have suggested also including a User-Agent in the request headers if you get blank output from the ASP page, since some web servers require it. See if that helps in your case; mine worked even without it.
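For example, inside the same using block as above (the User-Agent value below is only a placeholder):
wc.Headers[HttpRequestHeader.Cookie] = "SessionID=" + Request.Cookies["SessionID"].Value;
// Some servers return an empty response when no User-Agent is present; any reasonable value works.
wc.Headers[HttpRequestHeader.UserAgent] = "Mozilla/5.0 (compatible; AspBridge/1.0)";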
Also, you will have to handle the request in your ASP page, something like this:
Dim param_1
Dim param_2
Dim output

param_1 = Request.QueryString("function")
param_2 = Request.QueryString("EmpId")

If param_1 = "GetSalary" Then
    output = GetSalary(param_2)
    Response.Write output
End If
Hope it helps!

C# Downloading HTML from a website after logging in

I've recently been looking into how to get data from a website using C#. I tried using the WebBrowser control to navigate and log in, and that worked fine, but I keep running into the same problem: when I navigate to the page I want, I get disconnected.
I've tried several things like making sure that only one HtmlDocument exists but I still get logged out.
TLDR: how do you stay logged in, from page to page, while navigating a website with WebBrowser? Or are there better alternatives?
EDIT: So far I have the following code:
currentWebBrowser = new WebBrowser();
currentWebBrowser.DocumentText = @"<head></head><body></body>";
currentWebBrowser.Url = new Uri("about:blank");
currentWebBrowser.Navigate("http://google.com");
HttpWebRequest Req = (HttpWebRequest) WebRequest.Create("http://google.com");
Req.Proxy = null;
Req.UseDefaultCredentials = true;
HttpWebResponse Res = (HttpWebResponse)Req.GetResponse();
currentWebBrowser.Document.Cookie = Res.Cookies.ToString();
At which moment should I get the cookies? And is my code correct?
You have to preserve the cookies returned by your login request and reuse those cookies on all subsequent requests; the authentication cookie is what tells the server that you are in fact already logged in.
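A minimal sketch of that idea with HttpWebRequest and a shared CookieContainer, assuming a simple form-based login (the URLs and the post data are placeholders, not taken from your question):
using System.IO;
using System.Net;
using System.Text;

static string LoginAndFetch(string loginUrl, string protectedUrl, string postData)
{
    CookieContainer cookies = new CookieContainer();

    // 1. POST the login form; any Set-Cookie headers from the server land in 'cookies'.
    HttpWebRequest login = (HttpWebRequest)WebRequest.Create(loginUrl);
    login.Method = "POST";
    login.ContentType = "application/x-www-form-urlencoded";
    login.CookieContainer = cookies;
    byte[] body = Encoding.UTF8.GetBytes(postData);
    using (Stream requestStream = login.GetRequestStream())
        requestStream.Write(body, 0, body.Length);
    using (login.GetResponse()) { }

    // 2. Reuse the same CookieContainer so the session cookie is sent with this request.
    HttpWebRequest page = (HttpWebRequest)WebRequest.Create(protectedUrl);
    page.CookieContainer = cookies;
    using (HttpWebResponse response = (HttpWebResponse)page.GetResponse())
    using (StreamReader reader = new StreamReader(response.GetResponseStream()))
        return reader.ReadToEnd();
}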

Issues retrieving facebook social plugin comments for page, C# HttpWebRequest class

I'm hoping I've done something knuckle-headed here and there is an easy answer. I'm simply trying to retrieve the list of comments for a page on my site. I use the social plugin and then retrieve the comment id via the edge event. Server side, I send the page id back and do a simple request using an HttpWebRequest. This worked well back in October, but now I get an 'internal error' response from Facebook. If I put the same URL string into a browser, I get the comments back as JSON.
StringBuilder url = new StringBuilder();
url.Append("https://graph.facebook.com/comments/?ids=" + comment.page);
string requestString = url.ToString();
HttpWebRequest request = WebRequest.Create(requestString) as HttpWebRequest;
HttpWebResponse response = request.GetResponse() as HttpWebResponse;
Ideas? Thanks much in advance.
Since you're using the Facebook C# SDK (per your tag), try:
var url = "{your url}";
var api = new Facebook.FacebookClient(appId,appSec);
dynamic commentsObj = api.Get("/comments/?ids=" + url);
dynamic arrayOfComments = commentsObj[url].data;
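If the call succeeds, arrayOfComments is the JSON array of comment objects returned by the Graph API; for example, to print the text of each comment (assuming the standard message field on comment objects):
foreach (dynamic comment in arrayOfComments)
{
    // each element is a dynamic JSON object; 'message' holds the comment text
    Console.WriteLine(comment.message);
}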

Get different Source Code

I have written a little downloader in C# to download videos from different sites.
For the site "youtubeunblock.com", the source code I get from a WebRequest in my program is different from what I see in the browser: when I view the page source in a browser, the file link under the embed source is different from the one my downloader receives.
The code for the request inside the downloader:
CookieContainer cookieJar = new CookieContainer();
HttpWebRequest myWebRequest = (HttpWebRequest)HttpWebRequest.Create(url);
myWebRequest.CookieContainer = cookieJar;
myWebRequest.Method = "GET";
HttpWebResponse myWebResponse = (HttpWebResponse)myWebRequest.GetResponse();
StreamReader myWebSource = new StreamReader(myWebResponse.GetResponseStream());
string myPageSource = myWebSource.ReadToEnd();
myWebResponse.Close();
return myPageSource;
Let me try to explain.
When I browse to this site, search for a video, and look at the page source in a browser, I find a tag file=http://12345.flv?12345.
When I put that link into a href=http://12345.flv?12345, I can download the file.
When I fetch the page source through the WebRequest instead, I get the link file=http://12345.flv?abcde, and that link doesn't work.
Can anyone explain this to me?
Your question is very unclear, but I think this site doesn't allow unregistered users to download from it, so your code won't work.

HTTPS C# Post?

I am trying to log in to an HTTPS website and then navigate to a link to download a report (an XML report) using C#.
I have managed to log in OK via cookies/headers etc., but whenever I navigate to the link once logged in, I am taken to the "logged out" page.
Anyone know what would cause this?
Make sure the CookieContainer you use for your login is the same one you use when downloading the actual report.
var cookies = new CookieContainer();
var wr1 = (HttpWebRequest) HttpWebRequest.Create(url1);
wr1.CookieContainer = cookies;
// do login here with wr1
var wr2 = (HttpWebRequest) HttpWebRequest.Create(url2);
wr2.CookieContainer = cookies;
// get the report with wr2
It could be any number of reasons. Did you pass the cookie to the download request? Did you pass a Referer URL?
The best way to check is to record a working HTTP request with Wireshark, Fiddler, or any number of Firefox extensions, and then try to recreate that request in C#.
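A short sketch of what the recreated request might look like, reusing the login CookieContainer and adding a Referer header (the URLs are placeholders):
CookieContainer cookies = new CookieContainer();

var loginRequest = (HttpWebRequest)WebRequest.Create("https://example.com/login");
loginRequest.CookieContainer = cookies;
// ... perform the login POST with loginRequest here ...

var reportRequest = (HttpWebRequest)WebRequest.Create("https://example.com/report.xml");
reportRequest.CookieContainer = cookies;                 // same container as the login request
reportRequest.Referer = "https://example.com/reports";   // some servers check the referrer
using (var response = (HttpWebResponse)reportRequest.GetResponse())
using (var reader = new StreamReader(response.GetResponseStream()))
{
    string reportXml = reader.ReadToEnd();
}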
