Download HTML Page in C# - c#

I am writing an app in C#.
Is there a way to download an HTML page by giving my program only its URL?
For example, my program would get the URL www.google.com and download the HTML page.

Use WebClient.DownloadString().

Use the WebClient class.
This is adapted from a sample on the MSDN documentation page:
using System;
using System.IO;
using System.Net;

public static string Download(string uri)
{
    // The using blocks dispose the reader, the stream and the client, even if reading throws.
    using (WebClient client = new WebClient())
    using (Stream data = client.OpenRead(uri))
    using (StreamReader reader = new StreamReader(data))
    {
        return reader.ReadToEnd();
    }
}
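WebClient is marked obsolete in .NET 6 and later, where HttpClient is the suggested replacement. The block below is a minimal sketch of the same download using HttpClient, not a definitive implementation. The names PageDownloader, EnsureScheme and DownloadAsync are hypothetical, and prepending "http://" to a bare host such as www.google.com is an assumption about what the asker wants.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class PageDownloader
{
    // One shared HttpClient instance is reused for all requests.
    private static readonly HttpClient Http = new HttpClient();

    // Hypothetical helper: prepend a scheme when the caller passes a bare host.
    public static string EnsureScheme(string url)
    {
        bool hasScheme = url.StartsWith("http://", StringComparison.OrdinalIgnoreCase)
                      || url.StartsWith("https://", StringComparison.OrdinalIgnoreCase);
        return hasScheme ? url : "http://" + url;
    }

    // Downloads the response body as a string.
    public static Task<string> DownloadAsync(string url)
    {
        return Http.GetStringAsync(EnsureScheme(url));
    }
}
```

A caller would write something like string html = await PageDownloader.DownloadAsync("www.google.com"); inside an async method.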

Related

Unable to properly download cyrillic-encoded HTML page in C#

I am trying to download an HTML web page to my computer. This works fine; however, it is a Bulgarian article and it does not display properly afterwards.
I have tried many encodings (code page identifiers such as WINDOWS-1251, UTF-8, etc.) from MSDN https://msdn.microsoft.com/en-us/library/windows/desktop/dd317756(v=vs.85).aspx but for some reason I cannot get it to open as intended.
For example:
Стара планина - величествената кръстница на Балканския полуостров
Displays as:
??N�?�N�?� ???�?�???????� - ???�?�??N�?�N?N�???�???�N�?� ??N�NSN?N�????N�?� ???� ?�?�?�???�??N?????N? ?????�N???N?N�N�????
Below I am posting my simple code. Your help will be much appreciated! :)
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Net;
namespace pageDownloader
{
class Program
{
public static void DownloadPage()
{
WebClient client = new WebClient();
string webpage = client.DownloadString("http://www.nasamnatam.com/statia/Stara_planina_velichestvenata_krystnica_na_Balkanskiia_poluostrov-2525.html");
System.IO.File.WriteAllText(@"C:\test\downloadedpage.html", webpage, Encoding.GetEncoding("windows-1251"));
}
static void Main()
{
DownloadPage();
}
}
}
// On .NET Core / .NET 5+, windows-1251 is only available after registering the code pages provider.
Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
Console.OutputEncoding = Encoding.UTF8;
Encoding encoding1251 = Encoding.GetEncoding("windows-1251");
string htmlCode;
using (WebClient client = new WebClient())
{
    // Download the raw bytes so that nothing is decoded with the wrong default encoding.
    byte[] reply = client.DownloadData("http://www.nasamnatam.com/statia/Stara_planina_velichestvenata_krystnica_na_Balkanskiia_poluostrov-2525.html");
    // Convert the windows-1251 bytes to UTF-8, then decode them into a string.
    byte[] convertedBytes = Encoding.Convert(encoding1251, Encoding.UTF8, reply);
    htmlCode = Encoding.UTF8.GetString(convertedBytes);
}
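The idea behind the snippet above is to decode the raw bytes with the code page the server actually used. Here is a minimal offline sketch of that step, assuming the page really is encoded as windows-1251 and that the code runs on .NET Core or .NET 5+ (where CodePagesEncodingProvider is available). The class and method names are hypothetical.

```csharp
using System;
using System.Text;

public static class CyrillicDecoder
{
    // Decodes raw response bytes as windows-1251 (assumed page encoding).
    public static string Decode(byte[] rawBytes)
    {
        // Registering the provider more than once is harmless.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding("windows-1251").GetString(rawBytes);
    }

    public static void Main()
    {
        Console.OutputEncoding = Encoding.UTF8;
        // The windows-1251 bytes for the first word of the article title.
        byte[] sample = { 0xD1, 0xF2, 0xE0, 0xF0, 0xE0 };
        Console.WriteLine(Decode(sample));
    }
}
```

Once the text is a correctly decoded string, it can be written to disk in any encoding, for example with File.WriteAllText(path, text, Encoding.UTF8), provided the page's own charset declaration matches what is written.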

Pulling CSV file from server and displaying on a site [duplicate]

I'm trying to create a web service that fetches a URL, e.g. www.domain.co.uk/prices.csv, and then reads the CSV file. Is this possible, and how? Ideally without downloading the CSV file.
You could use:
public static string GetCSV(string url)
{
    HttpWebRequest req = (HttpWebRequest)WebRequest.Create(url);
    // The using blocks close the response and the reader even if reading fails.
    using (HttpWebResponse resp = (HttpWebResponse)req.GetResponse())
    using (StreamReader sr = new StreamReader(resp.GetResponseStream()))
    {
        return sr.ReadToEnd();
    }
}
And then to split it:
public static List<string> SplitCSV()
{
    List<string> splitted = new List<string>();
    string fileList = GetCSV("http://www.google.com");
    // Note: splitting on ',' alone ignores line breaks and quoted fields.
    string[] tempStr = fileList.Split(',');
    foreach (string item in tempStr)
    {
        if (!string.IsNullOrWhiteSpace(item))
        {
            splitted.Add(item);
        }
    }
    return splitted;
}
That said, there are plenty of CSV parsers out there and I would advise against rolling your own. FileHelpers is a good one.
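To illustrate why a single Split(',') is not enough, here is a minimal sketch that first splits the text into rows and then splits each row into fields. It assumes the file has no quoted fields and no embedded commas or line breaks, which is exactly where a real CSV library earns its keep. SimpleCsv and Parse are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class SimpleCsv
{
    // Splits CSV text into rows, then each row into fields.
    // Assumption: no quoted fields, no embedded commas or line breaks.
    public static List<string[]> Parse(string csvText)
    {
        var rows = new List<string[]>();
        string[] lines = csvText.Split(
            new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string line in lines)
        {
            rows.Add(line.Split(','));
        }
        return rows;
    }
}
```

The string returned by a download method such as GetCSV above could be passed straight to SimpleCsv.Parse.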
// Download the file to a specified path. Using the WebClient class we can download
// files directly from a provided url, like in this case.
System.Net.WebClient client = new WebClient();
client.DownloadFile(url, csvPath);
Where the url is your site with the csv file and the csvPath is where you want the actual file to go.
In your web service you could use the WebClient class to download the file, something like this (I have not added any exception handling, nor any using or Close/Dispose calls; I just wanted to give you an idea that you can refine and improve):
using System.Net;
WebClient webClient = new WebClient();
// DownloadString returns the content in memory; DownloadFile would also need a local file path as a second argument.
string csv = webClient.DownloadString("http://www.domain.co.uk/prices.csv");
Then you can do anything you like with it once the file content is available in the execution flow of your service.
If you have to return it to the client as the return value of the web service call, you can return a DataSet or any other data structure you prefer.
Sebastien Lorion's CSV Reader has a constructor that takes a TextReader, so the response stream can be wrapped in a StreamReader.
If you decided to use this, your example would become:
void GetCSVFromRemoteUrl(string url)
{
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
    using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
    using (CsvReader csvReader = new CsvReader(new StreamReader(response.GetResponseStream()), true))
    {
        int fieldCount = csvReader.FieldCount;
        string[] headers = csvReader.GetFieldHeaders();
        while (csvReader.ReadNextRecord())
        {
            // Do work with CSV file data here
        }
    }
}
The ever popular FileHelpers also allows you to read directly from a stream.
The documentation for WebRequest has an example that uses streams. Using a stream allows you to parse the document without storing it all in memory.

DownloadString and then writing to a page issue

I'm reading the content from a page using DownloadString from the WebClient class and then writing the contents of that to a static HTML file using the StreamWriter class. On the page that I'm reading in, there's an inline javascript method that just sets an anchor element's OnClick attribute to set the window.location = history.go(-1); I'm finding when I view the static HTML page, there's an odd looking letter showing up that isn't present on the dynamic web page.
WebClient & StreamWriter code
using (var client = new WebClient())
{
var html = client.DownloadString(url);
//This constructor prepares a StreamWriter (UTF-8) to write to the specified file or will create it if it doesn't already exist
using (var stream = new StreamWriter(file, false, Encoding.UTF8))
{
stream.Write(html);
stream.Close();
}
}
The dynamic page's HTML snippet in question
<span>Sorry, but something went wrong on our end. Click here to go back to the previous page.</span>
The static page's HTML snippet
<span>Sorry, but something went wrong on our end. Â Click here to go back to the previous page.</span>
I was thinking that adding the Encoding.UTF8 parameter would solve this issue but it didn't seem to help. Is there some sort of extra encoding or decoding that I need to do? Or did I completely miss something else that's needed for this type of operation?
I set the WebClient's Encoding to UTF-8 so that it decodes the resource as UTF-8 when converting it into a string, and that seems to have taken care of the issue.
using (var client = new WebClient())
{
client.Encoding = System.Text.Encoding.UTF8;
var html = client.DownloadString(url);
//This constructor prepares a StreamWriter (UTF-8) to write to the specified file or will create it if it doesn't already exist
using (var stream = new StreamWriter(file, false, Encoding.UTF8))
{
stream.Write(html);
stream.Close();
}
}
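One plausible explanation for the stray "Â" (an assumption, since the original page's bytes are not shown) is a non-breaking space: in UTF-8 it is the two bytes 0xC2 0xA0, and decoding those with a single-byte encoding such as ISO-8859-1 yields "Â" followed by a non-breaking space. A minimal sketch that reproduces the effect without any network access, with hypothetical names:

```csharp
using System;
using System.Text;

public static class MojibakeDemo
{
    // Decodes the same bytes with the named encoding.
    public static string DecodeAs(byte[] bytes, string encodingName)
    {
        return Encoding.GetEncoding(encodingName).GetString(bytes);
    }

    public static void Main()
    {
        Console.OutputEncoding = Encoding.UTF8;
        // UTF-8 bytes of a sentence containing a non-breaking space (U+00A0).
        byte[] utf8Bytes = Encoding.UTF8.GetBytes("our end.\u00A0Click here");
        Console.WriteLine(DecodeAs(utf8Bytes, "utf-8"));      // decoded correctly
        Console.WriteLine(DecodeAs(utf8Bytes, "iso-8859-1")); // shows the extra character
    }
}
```

Setting client.Encoding before calling DownloadString makes WebClient use the first, correct decoding.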

how to download a file from url and save in location using unity3d in C sharp?

I am working on a Unity3D project and need a set of files to be downloaded from the server. I am using C# for scripting. After an hour of googling I haven't found a solution because of poor documentation. Can anyone give me example code for downloading a file from a URL and saving it to a specific location in Unity3D?
Unity3D uses the implementation of C# known as Mono. Mono supports almost everything available in the standard .NET library. Thus, whenever you are wondering "How do I do that in Unity?", you can always take a look at the .NET documentation available at msdn.com, which is by no means poor. Regarding your question, use the WebClient class:
using System;
using System.Net;
using System.IO;
public class Test
{
public static void Main (string[] args)
{
WebClient client = new WebClient();
Stream data = client.OpenRead(@"http://google.com");
StreamReader reader = new StreamReader(data);
string s = reader.ReadToEnd();
Console.WriteLine(s);
data.Close();
reader.Close();
}
}
Edit
When downloading an image file, use the DownloadFile method provided by WebClient:
WebClient client = new WebClient();
client.DownloadFile("http://upload.wikimedia.org/wikipedia/commons/5/51/Google.png", @"C:\Images\GoogleLogo.png");
Take a look at the WWW functions in Unity 3d.
using UnityEngine;
using System.Collections;
public class ExampleClass : MonoBehaviour {
public string url = "http://images.earthcam.com/ec_metros/ourcams/fridays.jpg";
IEnumerator Start() {
WWW www = new WWW(url);
yield return www;
renderer.material.mainTexture = www.texture;
}
}
Use asset bundles if you are looking to compress your data packets.
var www = new WWW ("http://myserver/myBundle.unity3d");
yield www;
// Get the designated main asset and instantiate it.
Instantiate(www.assetBundle.mainAsset);
Note that some members are read-only, like www.url. Some of the examples have also been moved to the Manual section instead of the scripting section.
Hope this helps.
-Mark
You can use System.Net.WebClient to download a file asynchronously, something like:
System.Net.WebClient client = new WebClient();
client.DownloadFileAsync(new Uri("your uri"), "save path.");
I found a good example that can be used with Unity3D here.
Unity's WWW class & method group has been deprecated. Using UnityWebRequest is the current recommended method for handling web requests, especially GET requests:
using UnityEngine;
using System.Collections;
using UnityEngine.Networking;
public class MyBehaviour : MonoBehaviour {
void Start() {
StartCoroutine(GetText());
}
IEnumerator GetText() {
UnityWebRequest www = UnityWebRequest.Get("http://www.my-server.com");
yield return www.SendWebRequest();
if(www.isNetworkError || www.isHttpError) {
Debug.Log(www.error);
}
else {
// Show results as text
Debug.Log(www.downloadHandler.text);
// Or retrieve results as binary data
byte[] results = www.downloadHandler.data;
}
}
}

Output website X HTML to console in C#

I'm in the middle of teaching myself to code, so do pardon the ignorance.
So my question is: what do I need to read or learn in order to output the HTML of a particular website (e.g. google.com) to the console?
Thanks.
I would suggest you start here: http://msdn.microsoft.com/en-us/library/system.net.httpwebrequest%28v=VS.90%29.aspx
Essentially, you create the HttpWebRequest and then call the GetResponse() method. You can then read the response stream and output it to your console.
Use HttpWebRequest to create a request and output the response to the console.
using System;
using System.IO;
using System.Net;
using System.Text;
namespace Examples.System.Net
{
public class WebRequestGetExample
{
public static void Main ()
{
// Create a request for the URL.
WebRequest request = WebRequest.Create ("http://www.contoso.com/default.html");
// If required by the server, set the credentials.
request.Credentials = CredentialCache.DefaultCredentials;
// Get the response.
HttpWebResponse response = (HttpWebResponse)request.GetResponse ();
// Display the status.
Console.WriteLine (response.StatusDescription);
// Get the stream containing content returned by the server.
Stream dataStream = response.GetResponseStream ();
// Open the stream using a StreamReader for easy access.
StreamReader reader = new StreamReader (dataStream);
// Read the content.
string responseFromServer = reader.ReadToEnd ();
// Display the content.
Console.WriteLine (responseFromServer);
// Cleanup the streams and the response.
reader.Close ();
dataStream.Close ();
response.Close ();
}
}
}
Most browsers allow you to right-click and select "View Source"; that's the easiest way to see the HTML.
Have a look at the WebClient class, particularly the example at the bottom of the MSDN page.
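This will do the trick:
WebClient client = new WebClient();
// The URL needs an explicit scheme; a bare "www.google.com" is not treated as a web address.
Stream data = client.OpenRead("http://www.google.com");
StreamReader reader = new StreamReader(data);
// ReadToEnd returns the whole page; ReadLine would return only its first line.
string str = reader.ReadToEnd();
Console.WriteLine(str);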
