How to retrieve a webpage with C#? - c#

How do I retrieve a webpage and display the HTML in the console with C#?

Use the System.Net.WebClient class.
System.Console.WriteLine(new System.Net.WebClient().DownloadString(url));

I have knocked up an example:
WebRequest r = WebRequest.Create("http://www.msn.com");
WebResponse resp = r.GetResponse();
using (StreamReader sr = new StreamReader(resp.GetResponseStream()))
{
Console.WriteLine(sr.ReadToEnd());
}
Console.ReadKey();
Here is another option, this time using WebClient asynchronously:
static void Main(string[] args)
{
System.Net.WebClient c = new WebClient();
c.DownloadDataCompleted +=
new DownloadDataCompletedEventHandler(c_DownloadDataCompleted);
c.DownloadDataAsync(new Uri("http://www.msn.com"));
Console.ReadKey();
}
static void c_DownloadDataCompleted(object sender,
DownloadDataCompletedEventArgs e)
{
Console.WriteLine(Encoding.ASCII.GetString(e.Result));
}
The second option is handy because it does not block the UI thread, which gives a better user experience.
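One thing to watch in the async handler above is the Encoding.ASCII call: any byte outside the ASCII range is decoded as '?'. A minimal sketch of the difference, assuming the page is served as UTF-8 (that is an assumption; check the response's charset). PageDecoding and its methods are hypothetical names for illustration only:

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the original answer.
public static class PageDecoding
{
    // Decodes downloaded bytes as UTF-8 (an assumption about the page's charset).
    public static string DecodeUtf8(byte[] body)
    {
        return Encoding.UTF8.GetString(body);
    }

    // Decodes the same bytes as ASCII, as the handler above does.
    // Bytes above 0x7F cannot be represented and come back as '?'.
    public static string DecodeAscii(byte[] body)
    {
        return Encoding.ASCII.GetString(body);
    }
}
```

In the handler you would then write Console.WriteLine(PageDecoding.DecodeUtf8(e.Result)) instead of using Encoding.ASCII.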

// Save HTML code to a local file.
WebClient client = new WebClient ();
client.DownloadFile("http://yoursite.com/page.html", @"C:\htmlFile.html");
// Without saving it.
string htmlCode = client.DownloadString("http://yoursite.com/page.html");

Related

A URL downloads an Excel file in the browser, but it does not work in C#

I have a URL for downloading an Excel file. When I paste that URL into a browser, the file downloads. But when I use C# WebClient.DownloadFile(source, destination), it does not download.
Well, showing more of your code would make your problem much quicker to solve.
However, you can try this, using the WebClient class:
using System.Net;
//...
WebClient Client = new WebClient();
var address = "http://i.stackoverflow.com/Content/Img/stackoverflow-logo-250.png";
var destination = @"C:\folder\stackoverflowlogo.png";
Client.DownloadFile(address, destination);
Edited: I noticed you're using credentials. Try this,
Uri ur = new Uri("http://remotehost.do/images/img.jpg");
using (WebClient client = new WebClient()) {
//client.Credentials = new NetworkCredential("username", "password");
String credentials = Convert.ToBase64String(Encoding.ASCII.GetBytes("Username" + ":" + "MyNewPassword"));
client.Headers[HttpRequestHeader.Authorization] = $"Basic {credentials}";
// DownloadFileAsync raises DownloadProgressChanged and DownloadFileCompleted.
client.DownloadProgressChanged += WebClientDownloadProgress;
client.DownloadFileCompleted += WebClientDownloadDone;
client.DownloadFileAsync(ur, @"C:\path\newImage.jpg");
}
Implement the handlers:
void WebClientDownloadProgress(object sender, DownloadProgressChangedEventArgs e)
{
Console.WriteLine("Download status: {0}%.", e.ProgressPercentage);
// Update the UI or do something else here.
}
// AsyncCompletedEventArgs lives in System.ComponentModel.
void WebClientDownloadDone(object sender, AsyncCompletedEventArgs e)
{
Console.WriteLine("Download finished!");
}
In summary, this might not be exactly what you want to do, but it gives you an idea of how to send the authorization header. Happy coding!
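If you set that header in more than one place, it may help to pull the Base64 step into a small helper. This is only a sketch: BasicAuth and BuildHeader are made-up names, and encoding the credentials as UTF-8 is my assumption (the snippet above uses ASCII, which gives the same bytes for plain ASCII user names and passwords).

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration; not part of WebClient.
public static class BasicAuth
{
    // Builds the value of an HTTP "Authorization" header for the Basic scheme:
    // the word "Basic", a space, then Base64 of "userName:password".
    public static string BuildHeader(string userName, string password)
    {
        // UTF-8 is an assumption here; some servers may expect another encoding.
        byte[] raw = Encoding.UTF8.GetBytes(userName + ":" + password);
        return "Basic " + Convert.ToBase64String(raw);
    }
}
```

Usage would be client.Headers[HttpRequestHeader.Authorization] = BasicAuth.BuildHeader("Username", "MyNewPassword");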

C# downloading a csv file from https site

I am trying to download a file into my app. From what I can see if I put this link into a browser I get a good file downloaded so I know the file is available for me to get. I have other files to download from different sites and they all work well but this one will not download for me in a usable manner. I guess I do not understand something, do you know the key fact I am missing?
After many attempts I have now coded the following
private string url =
"https://coronavirus.data.gov.uk/api/v1/data?filters=areaType=nation&structure={\"Name\":\"areaName\",\"date\":\"date\",\"FirstDose\":\"cumPeopleVaccinatedFirstDoseByPublishDate\",\"SecondDose\":\"cumPeopleVaccinatedSecondDoseByPublishDate\"}&format=csv";
private void btn_wc1_Click(object sender, EventArgs e)
{
WebClient wc = new WebClient();
wc.Encoding = Encoding.UTF8;
wc.DownloadFile(url, "wc1_uk_data.csv");
}
private void btn_wc2_Click(object sender, EventArgs e)
{
using (var webClient = new WebClient())
{
webClient.Encoding = Encoding.UTF8;
string s = webClient.DownloadString(url);
File.WriteAllText(@"wc2_uk_data.csv", s);
}
}
private async void btn_https_Click(object sender, EventArgs e)
{
HttpClient _client = new HttpClient();
byte[] buffer = null;
try
{
HttpResponseMessage task = await _client.GetAsync(url);
Stream task2 = await task.Content.ReadAsStreamAsync();
using (MemoryStream ms = new MemoryStream())
{
await task2.CopyToAsync(ms);
buffer = ms.ToArray();
}
File.WriteAllBytes("https_uk_data.csv", buffer);
}
catch
{
}
}
private void btn_wc3_Click(object sender, EventArgs e)
{
using (var webClient = new WebClient())
{
webClient.Encoding = Encoding.UTF8;
byte[] myDataBuffer = webClient.DownloadData(url);
MemoryStream ms = new MemoryStream(myDataBuffer);
FileStream f = new FileStream(Path.GetFullPath(Path.Combine(Application.StartupPath, "wc3_uk_data.csv")),
FileMode.OpenOrCreate);
ms.WriteTo(f);
f.Close();
ms.Close();
}
}
Using the following UI
All of the functions above download a file, but none of the downloaded files is usable. It seems that what I receive is not the file itself but perhaps some information about the file. As my app does not know what to do with this, I never reply with whatever the other end wants. I guess that if I replied, the next set of data I got would be the file.
If I put the URL into a browser then I get a file that is good. This link is good at the moment.
https://coronavirus.data.gov.uk/api/v1/data?filters=areaType=nation&structure={"Name":"areaName","date":"date","FirstDose":"cumPeopleVaccinatedFirstDoseByPublishDate","SecondDose":"cumPeopleVaccinatedSecondDoseByPublishDate"}&format=csv
Anyone got any idea on what I need to do in my app to get the file like the browser does?
You need to set WebClient.Encoding before calling DownloadString:
string url = "https://coronavirus.data.gov.uk/api/v1/data?filters=areaType=nation&structure={\"Name\":\"areaName\",\"date\":\"date\",\"FirstDose\":\"newPeopleReceivingFirstDose\",\"SecondDose\":\"newPeopleReceivingSecondDose\"}&format=csv";
using (var webClient = new WebClient())
{
webClient.Encoding = Encoding.UTF8;
string s = webClient.DownloadString(url);
}
Here is a related question:
WebClient.DownloadString results in mangled characters due to encoding issues, but the browser is OK

Watin multithreading issue

I have a problem in my application, written in C# using WatiN.
The application creates a few threads, and each thread opens a browser and loads the same page.
The page consists of an HTML select element and a submit button.
The browsers should select a specific option and click the submit button at the same time, but instead they do it "one by one".
Here are the main lines of code:
[STAThread]
static void Main(string[] args)
{
for (int i = 0; i < numOfThreads;i++ )
{
var t = new Thread(() => RealStart(urls[i]));
t.SetApartmentState(ApartmentState.STA);
t.IsBackground = true;
t.Start();
}
}
private static void RealStart(string url)
{
using (var firstBrowser = new IE())
{
firstBrowser.GoTo(url);
firstBrowser.BringToFront();
OptionCollection options = firstBrowser.SelectList("Select").Options;
options[1].Select();
firstBrowser.Button(Find.ByName("Button")).Click();
firstBrowser.Close();
}
}
What causes the selections to happen "one by one" instead of simultaneously?
Solution:
After a lot of research I gave up using WatiN for this purpose.
Instead, I create an HttpWebRequest and post it to the specific URL.
It works like a charm:
HttpWebRequest httpWReq = (HttpWebRequest)WebRequest.Create("http://domain.com/page.aspx");
ASCIIEncoding encoding = new ASCIIEncoding();
string postData = "username=user";
postData += "&password=pass";
byte[] data = encoding.GetBytes(postData);
httpWReq.Method = "POST";
httpWReq.ContentType = "application/x-www-form-urlencoded";
httpWReq.ContentLength = data.Length;
using (Stream stream = httpWReq.GetRequestStream())
{
stream.Write(data,0,data.Length);
}
HttpWebResponse response = (HttpWebResponse)httpWReq.GetResponse();
string responseString = new StreamReader(response.GetResponseStream()).ReadToEnd();
I send those requests simultaneously, by creating a Thread for each request.
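The postData string above is concatenated by hand, which works for plain values like "user" and "pass" but breaks if a value contains characters such as '&' or '='. A minimal sketch of building the body with escaping; FormBody and Build are hypothetical names, and Uri.EscapeDataString encodes a space as %20 rather than '+', which I am assuming the server accepts:

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration only.
public static class FormBody
{
    // Joins name/value pairs into an application/x-www-form-urlencoded style body,
    // percent-escaping each name and value so reserved characters survive.
    public static string Build(IEnumerable<KeyValuePair<string, string>> fields)
    {
        var sb = new StringBuilder();
        foreach (var field in fields)
        {
            if (sb.Length > 0)
            {
                sb.Append('&');
            }
            sb.Append(Uri.EscapeDataString(field.Key));
            sb.Append('=');
            sb.Append(Uri.EscapeDataString(field.Value));
        }
        return sb.ToString();
    }
}
```

The result can be passed to encoding.GetBytes(...) exactly as postData is in the code above.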

How can I download a website's content to a string?

I tried this; I want the source content of the website to be downloaded to a string:
public partial class Form1 : Form
{
WebClient client;
string url;
string[] Search(string SearchParameter);
public Form1()
{
InitializeComponent();
url = "http://chatroll.com/rotternet";
client = new WebClient();
webBrowser1.Navigate("http://chatroll.com/rotternet");
}
private void Form1_Load(object sender, EventArgs e)
{
}
static void DownloadDataCompleted(object sender,
DownloadDataCompletedEventArgs e)
{
}
public string SearchForText(string SearchParameter)
{
client.DownloadDataCompleted += DownloadDataCompleted;
client.DownloadDataAsync(new Uri(url));
return SearchParameter;
}
I want to use WebClient and DownloadDataAsync and end up with the website's source content in a string.
No need for async, really:
var result = new System.Net.WebClient().DownloadString(url);
If you don't want to block your UI, you can put the above in a BackgroundWorker. The reason I suggest this rather than the Async methods is because it is dramatically simpler to use, and because I suspect you are just going to stick this string into the UI somewhere anyway (where BackgroundWorker will make your life easier).
If you are using .Net 4.5,
public async void Downloader()
{
using (WebClient wc = new WebClient())
{
string page = await wc.DownloadStringTaskAsync("http://chatroll.com/rotternet");
}
}
For 3.5 or 4.0
public void Downloader()
{
using (WebClient wc = new WebClient())
{
wc.DownloadStringCompleted += (s, e) =>
{
string page = e.Result;
};
wc.DownloadStringAsync(new Uri("http://chatroll.com/rotternet"));
}
}
Using WebRequest:
WebRequest request = WebRequest.Create(url);
request.Method = "GET";
WebResponse response = request.GetResponse();
Stream stream = response.GetResponseStream();
StreamReader reader = new StreamReader(stream);
string content = reader.ReadToEnd();
reader.Close();
response.Close();
You can easily call this code from another thread, or use a BackgroundWorker; either will keep your UI responsive while the data is retrieved.
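If you are on .NET 4.5 or later, another way to get the blocking WebRequest code off the UI thread is Task.Run. This is just a sketch under that assumption; BackgroundFetch and RunAsync are made-up names, and the delegate you pass in is whatever blocking download code you already have:

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical wrapper for illustration only.
public static class BackgroundFetch
{
    // Runs a blocking operation on a thread-pool thread and returns a Task
    // that completes with its result, so the caller can await it.
    public static Task<string> RunAsync(Func<string> blockingDownload)
    {
        return Task.Run(blockingDownload);
    }
}
```

From an async event handler you could then write string content = await BackgroundFetch.RunAsync(() => new System.Net.WebClient().DownloadString(url)); and the UI thread stays free while the download runs.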

Current Page HTML output using c#

I am working in an asp.net website. I need to get the current page HTML output in the Page Load event. I tried the following code. But I am not getting any output, it executes continuously.
protected void Page_Load(object sender, EventArgs e)
{
Http(Request.Url.ToString());
}
public void Http(string url)
{
if (url.Length > 0)
{
Uri myUri = new Uri(url);
// Create a 'HttpWebRequest' object for the specified url.
HttpWebRequest myHttpWebRequest = (HttpWebRequest)WebRequest.Create(myUri);
// Set the user agent as if we were a web browser
myHttpWebRequest.UserAgent = @"Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.4) Gecko/20060508 Firefox/1.5.0.4";
HttpWebResponse myHttpWebResponse = (HttpWebResponse)myHttpWebRequest.GetResponse();
var stream = myHttpWebResponse.GetResponseStream();
var reader = new StreamReader(stream);
var html = reader.ReadToEnd();
// Release resources of response object.
myHttpWebResponse.Close();
Response.Write(html);
}
}
What is wrong here?
Is there any other way to get the current page's HTML output using C#?
I tried the following code also:
protected void Page_Load(object sender, EventArgs e)
{
Page pp = this.Page;
StringWriter tw = new StringWriter();
HtmlTextWriter hw = new HtmlTextWriter(tw);
pp.RenderControl(hw);
string theOut = tw.ToString().Trim();
string FilePath = @"D:\Home.txt";
Stream s = new FileStream(FilePath, FileMode.Create);
StreamWriter sw = new StreamWriter(s);
sw.WriteLine(theOut);
sw.Close();
}
With this code I am able to get the HTML into the ".txt" file, but executing it causes the error "A page can have only one server-side Form tag." Can anybody help me solve this?
Well, you will have to bend the space-time continuum, because in the Page_Load event there is no HTML output yet, and naturally the request in your Http method (isn't that a really bad name?) will call Page_Load again.
Joking aside, you can't get the HTML output in the Page_Load event because it has not been produced yet.
Update:
You can modify the output produced by the page with an HttpFilter; see this SO answer:
https://stackoverflow.com/a/10215626/351383
Page_Render event is responsible for generating HTML for the page and Unload event gets called after this. In this event you should be able to get HTML output of the page.
You can try this...
protected override void Render(HtmlTextWriter writer)
{
// Render the page into an in-memory buffer first.
StringBuilder renderedOutput = new StringBuilder();
StringWriter strWriter = new StringWriter(renderedOutput);
HtmlTextWriter tWriter = new HtmlTextWriter(strWriter);
base.Render(tWriter);
string html = renderedOutput.ToString();
// Save a copy of the HTML to a file.
string filename = Server.MapPath(".") + "\\data.txt";
using (FileStream outputStream = new FileStream(filename, FileMode.Create))
using (StreamWriter sWriter = new StreamWriter(outputStream))
{
sWriter.Write(html);
}
// Render for output.
writer.Write(html);
}
