Get Html from a Url in ASP.NET MVC - c#

I need to save a page from a URL (the page is in my own application) in HTML format. This HTML will then be sent by email to a user. Does anyone know how?

Well, you'll have to do it at the server to be able to e-mail - so at worst, simply:
using(WebClient client = new WebClient()) {
string html = client.DownloadString(address);
}
It might also be possible to do it directly within MVC - perhaps RenderPartial?
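One way to do it within MVC, sketched here under the assumption of classic ASP.NET MVC (System.Web.Mvc), is to render a partial view into a StringWriter instead of the response. The helper name ViewRenderer and its parameters are made up for illustration and are not from the original answer.

```csharp
using System.IO;
using System.Web.Mvc;

public static class ViewRenderer
{
    // Renders the named partial view to a string, using the calling
    // controller's context. "viewName" and "model" are supplied by the caller.
    public static string RenderToString(Controller controller, string viewName, object model)
    {
        controller.ViewData.Model = model;

        using (var writer = new StringWriter())
        {
            ViewEngineResult result =
                ViewEngines.Engines.FindPartialView(controller.ControllerContext, viewName);
            if (result.View == null)
            {
                throw new FileNotFoundException("View not found: " + viewName);
            }

            var viewContext = new ViewContext(
                controller.ControllerContext,
                result.View,
                controller.ViewData,
                controller.TempData,
                writer);

            // Write the view's output into the StringWriter rather than the response.
            result.View.Render(viewContext, writer);
            result.ViewEngine.ReleaseView(controller.ControllerContext, result.View);

            return writer.ToString();
        }
    }
}
```

From inside an action, something like ViewRenderer.RenderToString(this, "EmailBody", model) would return the HTML to place in the mail body; "EmailBody" is a hypothetical view name.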

You could create a Result Filter or override the OnResultExecuted method of the controller to get access to the rendered page.

Related

I can't get the content of a web page without html codes in C#

I want to get the text of a web page in windows form application. I am using:
WebClient client = new WebClient();
string downloadString = client.DownloadString(link);
However, it gave me html codes of the web page.
Here is the question:
Can I get the specific part of a website? For example a part that has a class name "ask-page new-topbar". I want to get every part that has class name "ask-page new-topbar".
No, you can't request only parts of a page; a request to a URL returns the whole document.
What you can do is use the Html Agility Pack and let it dig through the Html code to give you the contents of the requested node.
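A minimal sketch of that approach, assuming the HtmlAgilityPack NuGet package is referenced. The XPath below matches elements whose class attribute is exactly "ask-page new-topbar"; elements carrying additional classes would need a looser expression. The class name ClassTextExtractor is only illustrative.

```csharp
using System;
using System.Net;
using HtmlAgilityPack;

public static class ClassTextExtractor
{
    public static void Main(string[] args)
    {
        if (args.Length == 0)
        {
            Console.WriteLine("Usage: ClassTextExtractor <url>");
            return;
        }

        // Download the raw HTML, as in the question.
        string html;
        using (var client = new WebClient())
        {
            html = client.DownloadString(args[0]);
        }

        // Parse the HTML and select every element with the exact class attribute.
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        HtmlNodeCollection nodes =
            doc.DocumentNode.SelectNodes("//*[@class='ask-page new-topbar']");

        // SelectNodes returns null when nothing matches.
        if (nodes == null)
        {
            Console.WriteLine("No matching elements.");
            return;
        }

        foreach (HtmlNode node in nodes)
        {
            // InnerText drops the tags and keeps only the text content.
            Console.WriteLine(node.InnerText.Trim());
        }
    }
}
```

In a Windows Forms project the type name HtmlDocument also exists in System.Windows.Forms, so it may need to be qualified as HtmlAgilityPack.HtmlDocument.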

C# loading html of a webpage currently on

I am trying to make a small app that can log in automatically on a website, get certain texts on the website and return to user.
To show what I have, I did below to make it log in,
System.Windows.Forms.HtmlDocument doc = logger.Document as System.Windows.Forms.HtmlDocument;
try
{
doc.GetElementById("loginUsername").SetAttribute("value", "myusername");
doc.GetElementById("loginPassword").SetAttribute("value", "mypassword");
doc.GetElementById("loginSubmit").InvokeMember("click");
And below to load html of the page
WebClient myClient = new WebClient();
Stream response = myClient.OpenRead(webbrowser.Url);
StreamReader reader = new StreamReader(response);
string src = reader.ReadToEnd(); // finally reading html and saving in variable
Now, it successfully loads HTML, but it is the HTML of the page as seen when not logged in. Is there a way to refer to the HTML currently shown in the browser control? Or is there another way to achieve my goal? Thank you for reading!
Use the WebClient class in a way that keeps the session cookies between requests, so the download is made as the logged-in user.
Check this Q&A: Using WebClient or WebRequest to login to a website and access data
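A common way to get cookie support, sketched here as an illustration rather than as the linked answer's exact code, is to derive from WebClient and attach a shared CookieContainer to every request. The form field names and the two URLs are assumptions: the question only shows element ids, and the real field names depend on the site's login form.

```csharp
using System;
using System.Collections.Specialized;
using System.Net;

// WebClient does not keep cookies between requests by default;
// this subclass attaches one CookieContainer to every request it makes.
public class CookieAwareWebClient : WebClient
{
    private readonly CookieContainer cookies = new CookieContainer();

    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);
        var httpRequest = request as HttpWebRequest;
        if (httpRequest != null)
        {
            httpRequest.CookieContainer = cookies;
        }
        return request;
    }
}

public static class LoginExample
{
    // loginUrl and pageUrl are hypothetical parameters supplied by the caller.
    public static string FetchAfterLogin(string loginUrl, string pageUrl)
    {
        using (var client = new CookieAwareWebClient())
        {
            // Assumed field names; check the name attributes of the real form.
            var form = new NameValueCollection
            {
                { "loginUsername", "myusername" },
                { "loginPassword", "mypassword" }
            };

            // POST the credentials; the session cookie lands in the container.
            client.UploadValues(loginUrl, form);

            // The same client now sends that cookie with the next request.
            return client.DownloadString(pageUrl);
        }
    }
}
```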
Why don't you make REST API calls and send the data like username and password from your code itself?
Is there a Web API for the URL? If yes, you can simply call the service and pass the required parameters. The API should return JSON or XML, which you can parse to extract the information.

redirecting to page from inside iframe

I have just now started coding in .NET framework, so my apologies if this issue happens to be of trivial nature.
What I got now
A main.aspx page with simple layout using three iframes
the middle iframe content needs to be dynamic (first a login.aspx page and after logging entryform.aspx)
Issue #1 :
After logging in login.aspx inside the iframe, redirecting to main.aspx
The solution I found:
ClientScript.RegisterStartupScript(this.GetType(),"scriptid",
"window.parent.location.href='main.aspx'", true);
(http://forums.asp.net/t/1273497.aspx)
Issue #2
After redirecting/logging how do I change the middle iframe content from login.aspx to entryform.aspx?
The silly solution I thought of:
Add '#form' to the url and listen to hashchange event in main.aspx. But then, anyone can get to the form using the url itself.
So, basically, how do I find a secure way to tell the main.aspx page that it needs to change its middle iframe content after the redirect/login?
Or by any chance there is a request.setAttribute and getAttribute in .NET like in java that I have missed and made things difficult for me?
Passing variables or values across pages and domains won't be an issue; you can use the POST method and cross-page posting for that.
After finding that the use of iframe isn't exactly the best idea in my case, I took Tieson T's advice and looked into HttpClient to fetch content from other web pages. In my case it will be both from same domain and other domains.
Since I have .NET 4.0, which does not include HttpClient, I used HttpWebRequest instead.
Code:
// requires: using System.IO; using System.Net; using System.Text;
HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://localhost:1706/WebSite3/test.html");
using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
using (StreamReader reader = new StreamReader(response.GetResponseStream(), Encoding.UTF8))
{
    // assumes myDiv is a server-side control, e.g. <div id="myDiv" runat="server">
    myDiv.InnerHtml = reader.ReadToEnd();
}
References
HttpClient does not exist in .net 4.0: what can I do?
http://msdn.microsoft.com/en-us/library/456dfw4f.aspx
http://forums.asp.net/t/1382935.aspx/1
http://msdn.microsoft.com/en-us/library/system.net.httpwebrequest.aspx
http://msdn.microsoft.com/en-us/library/system.net.httpwebresponse.aspx
http://msdn.microsoft.com/en-us/library/system.net.http.httpclient.aspx
Wouldn't it be easier to just pass a parameter to Main.aspx? E.g.
ClientScript.RegisterStartupScript(this.GetType(),"scriptid", "window.parent.location.href='main.aspx?LoadEntry=true'", true);
And client-side JavaScript code inside of 'main.aspx' would read that parameter from 'location.search' and if 'LoadEntry=true' is set - would set SRC of middle frame to "entryform.aspx".
Of course, inside "entryform.aspx.cs" you would need to check that a correct login really took place (e.g. that some Session variable is set), so nobody can simply set the URL manually to "main.aspx?LoadEntry=true" to bypass the login.
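That server-side check could look roughly like this. The session key "IsLoggedIn" and the class name EntryForm are assumptions for illustration; login.aspx would have to set the same key after a successful login.

```csharp
using System;
using System.Web.UI;

public partial class EntryForm : Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        // "IsLoggedIn" is a hypothetical session key written by login.aspx.
        object flag = Session["IsLoggedIn"];
        bool loggedIn = flag is bool && (bool)flag;

        if (!loggedIn)
        {
            // Not authenticated: send the frame back to the login page
            // regardless of what the query string on main.aspx says.
            Response.Redirect("login.aspx");
        }
    }
}
```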

How to capture HTML of redirect page before it redirects?

I am trying to read the HTML of a page that contains a non-delayed redirect. The following snippet (C#) will give me the destination/redirected page, not the initial one I need to see:
using System.Net;
using System.Text;
public class SomeClass {
public static void Main() {
byte[] data = new WebClient().DownloadData("http://SomeUrl.com");
System.Console.WriteLine(Encoding.ASCII.GetString(data));
}
}
Is there a way to get the HTML of a redirecting page? (I prefer .NET but a snippet in Java or Python would be fine too. Thx!)
Unless the redirect is done on the client side, you can't. If the redirect is done server side, no HTML is generated for the client; the response only carries a header that points to the new location.
It would take more work, but rather than using WebClient, use HttpWebRequest and set the AllowAutoRedirect property to false. The redirect response (a 3xx status) is then returned to you instead of being followed, and you can read any response text from it (some pages do send response text along with the redirect). After that, you can issue another HttpWebRequest for the redirect URL (specified in the Location response header).
You might be able to do something similar with WebClient if you create a derived class, for example MyWebClient, where you override the GetWebRequest method and set the AllowAutoRedirect property on the request. I don't know how the DownloadData method will behave if you do something like that.
As somebody said previously, this will only work for redirects that are actually sent to the client (typically HTTP 301 or 302 responses). If the server transfers the request internally, you'd never know it.
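A sketch of the AllowAutoRedirect approach, using the placeholder URL from the question. It reads the response whether GetResponse returns it directly or it arrives attached to a WebException, so it does not depend on which of the two happens. The class name RedirectInspector is illustrative.

```csharp
using System;
using System.IO;
using System.Net;

public static class RedirectInspector
{
    public static void Main()
    {
        var request = (HttpWebRequest)WebRequest.Create("http://SomeUrl.com");
        request.AllowAutoRedirect = false; // do not follow 3xx responses

        HttpWebResponse response;
        try
        {
            response = (HttpWebResponse)request.GetResponse();
        }
        catch (WebException ex)
        {
            // Error statuses surface as exceptions; the response is still attached.
            response = ex.Response as HttpWebResponse;
            if (response == null)
            {
                throw;
            }
        }

        using (response)
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            Console.WriteLine("Status:   " + (int)response.StatusCode);
            Console.WriteLine("Location: " + response.Headers["Location"]);
            // Body of the redirecting page itself, if the server sent one.
            Console.WriteLine(reader.ReadToEnd());
        }
    }
}
```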
Simplest answer would be to add the current page onto the QueryString component of the redirect when redirecting, for instance:
Response.Redirect(newPage + "?FromPage=" + Uri.EscapeDataString(Request.Url.ToString()));
Then the new page could see where you came from by simply looking at Request.QueryString["FromPage"].
If you want to get the source of an html page you can use this tool:
http://www.selfseo.com/html_source_view.php

Server side include external HTML?

In my asp.net-mvc application I need to include a page that shows a legacy page.
The body of this page is created by calling an existing Perl script.
This Perl script is externally hosted.
Is there a way to do something like this:
<!-- #Include virtual="http://www.example.com/theScript.plx"-->
Not as a direct include, because ASP.NET server-side-includes require the page to be compiled at the server.
You could use jQuery to download the HTML from that URL when the page loads, though I appreciate that's not perfect.
Alternatively (and I have no idea whether this will work) you could perform a WebRequest to the Perl page from your ASP.NET MVC controller, and put the resulting HTML in the view as text. That way you could make use of things like output caching to limit the hits to the Perl page if it doesn't change often.
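A rough sketch of that idea, assuming classic ASP.NET MVC. The controller name, the 60-second cache duration, and returning the HTML directly as a content result (rather than through a view) are all choices made for this illustration.

```csharp
using System.Net;
using System.Web.Mvc;

public class LegacyController : Controller
{
    // Cache the rendered output for 60 seconds so the remote Perl script
    // is not hit on every request. The duration is an arbitrary example.
    [OutputCache(Duration = 60, VaryByParam = "none")]
    public ActionResult Index()
    {
        string html;
        using (var client = new WebClient())
        {
            // URL taken from the question's example.
            html = client.DownloadString("http://www.example.com/theScript.plx");
        }

        // Send the fetched markup to the browser as HTML.
        return Content(html, "text/html");
    }
}
```

To embed the markup inside one of your own views instead, the string could be passed to the view and written out with Html.Raw.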
If you wanted to do it all in one go, you could do an HTTP Request from the server and write the contents to the page?
Something like this:
Response.Write(GetHtmlPage("http://www.example.com/theScript.plx"));
Calling this method:
// requires: using System.IO; using System.Net;
public string GetHtmlPage(string strURL)
{
    WebRequest objRequest = WebRequest.Create(strURL);
    // the using statements dispose the response and the reader
    // once complete, so no explicit Close() call is needed
    using (WebResponse objResponse = objRequest.GetResponse())
    using (StreamReader sr = new StreamReader(objResponse.GetResponseStream()))
    {
        // the html retrieved from the page
        return sr.ReadToEnd();
    }
}
(Most code ripped blatantly from here and therefore not checked)
You could implement this in a low-key fashion by simply using a frame and setting the frame source to the URL that needs to be included. This is quite simple and can be done without any server or client side scripting, so that'd be my preferred approach, if possible.
If you want the HTML to appear to come from your server, however, you'll need to include it manually, typically by using WebRequest as Neil says. You may wish to cache the remote page for performance, though since it's a Perl script I'll assume the page is dynamic, so this might not be a great idea.
