C#: loading the HTML of the web page the browser is currently on

I am trying to make a small app that can log in to a website automatically, read certain text from it, and return that text to the user.
To show what I have so far, this is how I make it log in:
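System.Windows.Forms.HtmlDocument doc = logger.Document as System.Windows.Forms.HtmlDocument;
try
{
    // fill in the login form and submit it
    doc.GetElementById("loginUsername").SetAttribute("value", "myusername");
    doc.GetElementById("loginPassword").SetAttribute("value", "mypassword");
    doc.GetElementById("loginSubmit").InvokeMember("click");
}
catch (NullReferenceException)
{
    // assumed handler: the document is not loaded yet or an element was not found
}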
And this is how I load the HTML of the page:
WebClient myClient = new WebClient();
Stream response = myClient.OpenRead(webbrowser.Url);
StreamReader reader = new StreamReader(response);
string src = reader.ReadToEnd(); // finally reading html and saving in variable
This loads HTML successfully, but it is the HTML of the page as seen by a user who is not logged in. Is there a way to get the HTML the browser control is currently showing, or another way to achieve my goal? Thank you for reading!

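The WebClient request is a separate HTTP session from the WebBrowser control, so it does not carry the login cookies. Use the WebClient class in a way that keeps the session and cookies.
See this Q&A: Using WebClient or WebRequest to login to a website and access data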
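As a minimal sketch of what "WebClient with cookies" can look like: WebClient does not keep cookies on its own, but a small subclass can attach a shared CookieContainer to every request. The class name CookieAwareWebClient, the helper FetchAfterLogin, the two URLs, and the form field names are all assumptions for illustration; the real field names have to be read from the site's login form.

```csharp
using System;
using System.Collections.Specialized;
using System.Net;

// WebClient that stores cookies from responses and sends them on later requests.
public class CookieAwareWebClient : WebClient
{
    public CookieContainer Cookies { get; } = new CookieContainer();

    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);
        // Only HTTP(S) requests carry cookies; attach the shared container to them.
        if (request is HttpWebRequest httpRequest)
        {
            httpRequest.CookieContainer = Cookies;
        }
        return request;
    }
}

public static class LoginExample
{
    // loginUrl, pageUrl and the form field names are hypothetical placeholders.
    public static string FetchAfterLogin(string loginUrl, string pageUrl, string user, string password)
    {
        using (var client = new CookieAwareWebClient())
        {
            var form = new NameValueCollection
            {
                { "loginUsername", user },
                { "loginPassword", password }
            };
            client.UploadValues(loginUrl, "POST", form); // the session cookie is stored here
            return client.DownloadString(pageUrl);        // and sent again here
        }
    }
}
```

Because the login and the download share one client, the second request is made inside the logged-in session. This only helps if the site logs in with a plain form post; a login that depends on JavaScript would still need the browser control.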

Why not make REST API calls and send the username and password from your code directly?
Does the site offer a web API? If so, you can call the service and pass the required parameters. The API will return JSON or XML, which you can parse to extract the information.

Related

Get Solvemedia image using RestSharp

I'm developing an application that needs to query a website to extract certain data, such as the user's name, points/remaining balance, etc.
I have a problem with the login: the client needs to solve a Solve Media captcha to enter the website. I would like to extract the captcha image and show it to the client, but I am having trouble extracting it. I'm trying to do it through HTTP requests with RestSharp; the reason I don't use a WebBrowser control or Selenium is that they consume far more resources.
I tried this:
// the https scheme is assumed here; RestClient needs an absolute URL
RestClient restClient = new RestClient(@"https://api-secure.solvemedia.com/papi/media?c=2#gAB09NHSertXLv3TnpobmKDxvkjsaT4m#X4wLMdkN.u0ENU8bgrS3KH9APTC4lJjokJaIfZePPIgNLL84QkOaQlXcxzHvOVTTU98Of7mo8BoC0QQuiH1RMqMrGof6BbL-tReeY8AHhPA7-nwvQKLqUEXQwTL4HhLXfZVre9jccpqQxFGIRYZH1ZQoAKCV5k1TGCLXXP9vMVsJFntDNz6Ozik02MANT1siBJRYTNIpGcj6p6Gbq5j0HvQChz7jtgdzwlj7nee0BdZphpg27ikQlVB5IUelMvSjzNNvPZawB9YbC9v6zyJngNQaJIJku2SPJkhFXIK0uoA;w=300;h=150;fg=000000;bg=f8f8f8");
var fileBytes = restClient.DownloadData(new RestRequest("#", Method.GET));
File.WriteAllBytes(Path.Combine(directory, "poster-got.jpg"), fileBytes);
The problem is that I only get an image that says "Media Error". Is there any way to get the image that is sent when you request the login page? Can it be done with RestSharp? If not, which library could I use?
For downloading images I use https://github.com/jgiacomini/Tiny.RestClient
But when I try to view your image in my browser, I also get a media error, so I think you are using the wrong URL.
In your case:
using Tiny.RestClient;
var client = new TinyRestClient(new HttpClient(), "http://api-secure.solvemedia.com");
FileInfo fileInfo = await client.
    GetRequest("papi/media").
    // "c" is the query key taken from the URL in the question
    AddQueryParameter("c", "2#gAB09NHSertXLv3TnpobmKDxvkjsaT4m#X4wLMdkN.u0ENU8bgrS3KH9APTC4lJjokJaIfZePPIgNLL84QkOaQlXcxzHvOVTTU98Of7mo8BoC0QQuiH1RMqMrGof6BbL-tReeY8AHhPA7-nwvQKLqUEXQwTL4HhLXfZVre9jccpqQxFGIRYZH1ZQoAKCV5k1TGCLXXP9vMVsJFntDNz6Ozik02MANT1siBJRYTNIpGcj6p6Gbq5j0HvQChz7jtgdzwlj7nee0BdZphpg27ikQlVB5IUelMvSjzNNvPZawB9YbC9v6zyJngNQaJIJku2SPJkhFXIK0uoA;w=300;h=150;fg=000000;bg=f8f8f8").
    DownloadFileAsync(@"c:\poster-got.jpg");

I can't get the content of a web page without html codes in C#

I want to get the text of a web page in a Windows Forms application. I am using:
WebClient client = new WebClient();
string downloadString = client.DownloadString(link);
However, this gives me the HTML source of the web page rather than its text.
Here is the question:
Can I get the specific part of a website? For example a part that has a class name "ask-page new-topbar". I want to get every part that has class name "ask-page new-topbar".
No, you can't request only parts of a page: a request to a URL always returns the whole document.
What you can do is use the Html Agility Pack and let it dig through the HTML to give you the contents of the node you want.
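A minimal sketch of that approach, assuming the HtmlAgilityPack NuGet package is installed. The class and method names (HtmlTextExtractor, GetTextByClass) are made up for illustration, and the XPath matches elements whose class attribute is exactly the given string, as in the "ask-page new-topbar" example from the question.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack; // NuGet package "HtmlAgilityPack"

public static class HtmlTextExtractor
{
    // Returns the text of every element whose class attribute equals classValue.
    // classValue must not contain a single quote, because it is pasted into the XPath.
    public static List<string> GetTextByClass(string html, string classValue)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var results = new List<string>();
        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//*[@class='" + classValue + "']");
        if (nodes == null)
        {
            return results;
        }

        foreach (HtmlNode node in nodes)
        {
            // InnerText drops the tags; DeEntitize turns entities such as &amp; back into characters.
            results.Add(HtmlEntity.DeEntitize(node.InnerText).Trim());
        }
        return results;
    }
}
```

It could be called as GetTextByClass(downloadString, "ask-page new-topbar") with the string downloaded in the question. In a Windows Forms project the name HtmlDocument also exists in System.Windows.Forms, so it may need to be written as HtmlAgilityPack.HtmlDocument there.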

HttpWebResponse returns only one element of the page

Hello, I am making a simple HttpWebRequest and reading the response with a StreamReader. I just want to get the HTML of the page, but I only get one label (a single element of the page). In the browser everything is fine and I see the whole page, but when I set cookies to deny/disabled in the browser, I get the same single label and everything else disappears. Since disabling cookies in the browser gives me the same page as my code does, my guess is that my HttpWebRequest behaves as if cookies were denied/disabled.
You can go to https://www.bbvanetcash.com/local_kyop/KYOPSolicitarCredenciales.html and disable cookies with F12; you will see the difference, and the page with only one label.
Here is my code. Any ideas what I need to change?
HttpWebRequest myHttpWebRequest = (HttpWebRequest)WebRequest.Create("https://www.bbvanetcash.com/local_kyop/KYOPSolicitarCredenciales.html");
HttpWebResponse myHttpWebResponse = (HttpWebResponse)myHttpWebRequest.GetResponse();
Stream streamResponseLogin = myHttpWebResponse.GetResponseStream();
StreamReader streamReadLogin = new StreamReader(streamResponseLogin);
LoginInfo = streamReadLogin.ReadToEnd();
Your code is receiving complete page content, but it cannot receive the dynamic contents. This is happening because the page you are trying to access relies on Cookies for maintaining session as well as JavaScript (it is using jQuery) for loading dynamic contents and providing rich user experience.
To successfully receive the whole page, your code must:
- support retrieving, storing and sending cookie objects across various HttpRequest and HttpResponse;
- be able to execute JavaScript code to load the dynamic contents/markup of the page.
To check whether your code is receiving the expected response, visit the Web Sniffer site and enter your URL there.
If you try www.google.com on Web Sniffer, for example, the response is a redirect instruction. That means that even to access Google's home page, your code must understand HTTP status codes (302 in that case).
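The cookie requirement can be sketched as follows. HttpWebRequest.CookieContainer is null by default, so cookies are ignored unless a container is assigned, which matches the "cookies disabled" behaviour described in the question. The names PageFetcher, CreateRequest, and Fetch are assumptions for illustration. This sketch does not run JavaScript, so content that the page builds with scripts would still be missing.

```csharp
using System.IO;
using System.Net;

public static class PageFetcher
{
    // Creates a request that stores received cookies in, and sends cookies from, the given container.
    public static HttpWebRequest CreateRequest(string url, CookieContainer cookies)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.CookieContainer = cookies; // null by default, which means cookies are ignored
        request.AllowAutoRedirect = true;  // follow 3xx redirects (this is already the default)
        return request;
    }

    // Downloads the page body as a string, reusing the same cookie container across calls.
    public static string Fetch(string url, CookieContainer cookies)
    {
        HttpWebRequest request = CreateRequest(url, cookies);
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Passing the same CookieContainer instance to every Fetch call keeps the session alive between requests, for example a first request that receives the session cookie and a second one that sends it back.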

Server side include external HTML?

In my asp.net-mvc application I need to include a page that shows a legacy page.
The body of this page is created by calling an existing Perl script.
This Perl script is externally hosted.
Is there a way to do something like this:
<!-- #Include virtual="http://www.example.com/theScript.plx"-->
Not as a direct include, because ASP.NET server-side-includes require the page to be compiled at the server.
You could use jQuery to download the HTML from that URL when the page loads, though I appreciate that's not perfect.
Alternatively (and I have no idea whether this will work) you could perform a WebRequest to the Perl page from your ASP.NET MVC controller, and put the resulting HTML in the view as text. That way you could make use of things like output caching to limit the hits to the Perl page if it doesn't change often.
If you wanted to do it all in one go, you could do an HTTP Request from the server and write the contents to the page?
Something like this:
Response.Write(GetHtmlPage("http://www.example.com/theScript.plx"));
Calling this method:
// requires: using System.IO; using System.Net;
public string GetHtmlPage(string strURL)
{
    WebRequest objRequest = WebRequest.Create(strURL);
    // the using statements dispose the response and the reader
    // once complete, so no explicit Close() call is needed
    using (WebResponse objResponse = objRequest.GetResponse())
    using (StreamReader sr = new StreamReader(objResponse.GetResponseStream()))
    {
        // the html retrieved from the page
        return sr.ReadToEnd();
    }
}
(Most code ripped blatantly from here and therefore not checked)
You could implement this in a low-key fashion by simply using a frame and setting the frame source to the URL that needs to be included. This is quite simple and can be done without any server-side or client-side scripting, so that'd be my preferred approach, if possible.
If you want the HTML to appear to come from your server, however, you'll need to include it manually, typically by using WebRequest as Neil says. You may wish to cache the remote page for performance; however, since it's a Perl script, I assume the page is dynamic, so caching might not be a great idea.

Get Html from a Url in ASP.NET MVC

I need to save a page from a URL (the page is in my own application) in HTML format. This HTML will then be sent by email to a user. Does anyone know how?
Well, you'll have to do it at the server to be able to e-mail - so at worst, simply:
using(WebClient client = new WebClient()) {
string html = client.DownloadString(address);
}
It might also be possible to do it directly within MVC - perhaps RenderPartial?
You could create a Result Filter or override the OnResultExecuted method of the controller to get access to the rendered page.
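To connect the WebClient download above with the email step from the question, here is a minimal sketch using System.Net.Mail. The names HtmlMailer, BuildHtmlMessage, and SendPage, the subject text, and the smtpHost parameter are assumptions for illustration; the SMTP server and credentials depend on your environment.

```csharp
using System.Net;
using System.Net.Mail;

public static class HtmlMailer
{
    // Builds a message whose body is the given HTML.
    public static MailMessage BuildHtmlMessage(string from, string to, string subject, string html)
    {
        return new MailMessage(from, to)
        {
            Subject = subject,
            Body = html,
            IsBodyHtml = true // without this the recipient sees the raw markup
        };
    }

    // Downloads the page and mails it. smtpHost is a hypothetical server name.
    public static void SendPage(string pageUrl, string from, string to, string smtpHost)
    {
        string html;
        using (var client = new WebClient())
        {
            html = client.DownloadString(pageUrl);
        }

        using (MailMessage message = BuildHtmlMessage(from, to, "Your page", html))
        using (var smtp = new SmtpClient(smtpHost))
        {
            smtp.Send(message);
        }
    }
}
```

Stylesheets, scripts, and images referenced with relative URLs will generally not load in a mail client, so the page may need absolute URLs or inline styles to look right.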
