how to get html page source by C# - c#

I want to save complete web page asp in local drive by .htm from url or url but I did not success.
Code
public StreamReader Fn_DownloadWebPageComplete(string link_Pagesource)
{
//--------- Download Complete ------------------
// using (WebClient client = new WebClient()) // WebClient class inherits IDisposable
// {
//client
//HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create(link_Pagesource);
//webRequest.AllowAutoRedirect = true;
//var client1 = (System.Net.HttpWebRequest)System.Net.WebRequest.Create(link_Pagesource);
//client1.CookieContainer = new System.Net.CookieContainer();
// client.DownloadFile(link_Pagesource, #"D:\S1.htm");
// }
//--------- Download Page Source ------------------
HttpWebRequest URL_pageSource = (HttpWebRequest)WebRequest.Create("https://www.digikala.com");
URL_pageSource.Timeout = 360000;
//URL_pageSource.Timeout = 1000000;
URL_pageSource.ReadWriteTimeout = 360000;
// URL_pageSource.ReadWriteTimeout = 1000000;
URL_pageSource.AllowAutoRedirect = true;
URL_pageSource.MaximumAutomaticRedirections = 300;
using (WebResponse MyResponse_PageSource = URL_pageSource.GetResponse())
{
str_PageSource = new StreamReader(MyResponse_PageSource.GetResponseStream(), System.Text.Encoding.UTF8);
pagesource1 = str_PageSource.ReadToEnd();
success = true;
}
Error :
Too many automatic redirections were attempted.
Attemp by this codes but not successful.
many url is successful with this codes but this url not successful.

here is the way
string url = "https://www.digikala.com/";
using (HttpClient client = new HttpClient())
{
using (HttpResponseMessage response = client.GetAsync(url).Result)
{
using (HttpContent content = response.Content)
{
string result = content.ReadAsStringAsync().Result;
}
}
}
and result variable will contains the page as HTML then you can save it to a file like this
System.IO.File.WriteAllText("path/filename.html", result);
NOTE you have to use the namespace
using System.Net.Http;
Update if you are using legacy VS then you can see this answer for using WebClient and WebRequest for the same purpose, but Actually updating your VS is a better solution.

using (WebClient client = new WebClient ())
{
string htmlCode = client.DownloadString("https://www.digikala.com");
}

using (WebClient client = new WebClient ())
{
client.DownloadFile("https://www.digikala.com", #"C:\localfile.html");
}

Related

Quick Download HTML Source in C#

I am trying to download a HTML source code from a single website (https://www.faa.gov/air_traffic/flight_info/aeronav/aero_data/NASR_Subscription/) in C#.
The issue is that it takes 10 seconds to download a 30kb HTML page source. Internet connection is not an issue, as I am able to download 10Mb files in this program instantly.
The following has been executed both in a separate thread and in the main thread. It still takes 10-12 seconds to download.
1)
using (var httpClient = new HttpClient())
{
using (var request = new HttpRequestMessage(new HttpMethod("GET"), url))
{
var response = await httpClient.SendAsync(request);
}
}
2)
using (var client = new System.Net.WebClient())
{
client.Proxy = null;
response = client.DownloadString(url);
}
3)
using (var client = new System.Net.WebClient())
{
webClient.Proxy = GlobalProxySelection.GetEmptyWebProxy();
response = client.DownloadString(url);
}
4)
WebRequest.DefaultWebProxy = null;
using (var client = new System.Net.WebClient())
{
response = client.DownloadString(url);
}
5)
var client = new WebClient()
response = client.DownloadString(url);
6)
var client = new WebClient()
client.DownloadFile(url, filepath);
7)
System.Net.WebClient myWebClient = new System.Net.WebClient();
WebProxy myProxy = new WebProxy();
myProxy.IsBypassed(new Uri(url));
myWebClient.Proxy = myProxy;
response = myWebClient.DownloadString(url);
8)
using var client = new HttpClient();
var content = await client.GetStringAsync(url);
9)
HttpWebRequest myRequest = (HttpWebRequest)WebRequest.Create(Url);
myRequest.Method = "GET";
WebResponse myResponse = myRequest.GetResponse();
StreamReader sr = new StreamReader(myResponse.GetResponseStream(), System.Text.Encoding.UTF8);
string result = sr.ReadToEnd();
sr.Close();
myResponse.Close();
I want a faster way to do this in C#.
Any information or help you can provide is much appreciated.
I know that this is dated, but I think I found the cause: I've encountered this at other sites. If you look at the response cookies, you will find one named ak_bmsc. That cookie shows that the site is running the Akamai Bot Manager. It offers bot protection, thus blocks requests that 'look' suspicious.
In order to get a quick response from the host, you need the right request settings. In this case:
Headers:
Host: (their host data) www.faa.gov
Accept: (something like:) */*
Cookies:
AkamaiEdge = true
example:
class Program
{
private static readonly HttpClient _client = new HttpClient();
private static readonly string _url = "https://www.faa.gov/air_traffic/flight_info/aeronav/aero_data/NASR_Subscription/";
static async Task Main(string[] args)
{
var sw = Stopwatch.StartNew();
using (var request = new HttpRequestMessage(HttpMethod.Get,_url))
{
request.Headers.Add("Host", "www.faa.gov");
request.Headers.Add("Accept", "*/*");
request.Headers.Add("Cookie", "AkamaiEdge=true");
Console.WriteLine(await _client.SendAsync(request));
}
Console.WriteLine("Elapsed: {0} ms", sw.ElapsedMilliseconds);
}
}
Takes 896 ms for me.
by the way, you shouldn't put HttpClient in a using block. I know it's disposable, but it's not designed to be disposed.
This Question has stumped everyone I have asked. I have found a solution that I am going to stick with.
This solution does what I need it to do in 0.5 seconds on average. This will only work for windows from what I can tell. If the user does not have "CURL" I revert and go to the old way that takes 10 seconds to get what I need.
The solution creates a batch file in a temporary directory, calls that batch file to "CURL" the website, then output the result of CURL to a .txt file in the temp directory.
private static void CreateBatchFile()
{
string filePath = $"{tempPath}\\tempBat.bat";
string writeMe = "cd \"%temp%\\ProgramTempDir\"\n" +
"curl \"https://www.faa.gov/air_traffic/flight_info/aeronav/aero_data/NASR_Subscription/\">FAA_NASR.txt";
File.WriteAllText(filePath, writeMe);
}
private static void ExecuteCommand()
{
int ExitCode;
ProcessStartInfo ProcessInfo;
Process Process;
ProcessInfo = new ProcessStartInfo("cmd.exe", "/c " + $"{tempPath}\\tempBat.bat");
ProcessInfo.CreateNoWindow = true;
ProcessInfo.UseShellExecute = false;
Process = Process.Start(ProcessInfo);
Process.WaitForExit();
ExitCode = Process.ExitCode;
Process.Close();
}
private static void GetResponse()
{
string response;
string url = "https://www.faa.gov/air_traffic/flight_info/aeronav/aero_data/NASR_Subscription/";
CreateBatchFile();
ExecuteCommand();
if (File.Exists($"{tempPath}\\FAA_NASR.txt") && File.ReadAllText($"{tempPath}\\FAA_NASR.txt").Length > 10)
{
response = File.ReadAllText($"{tempPath}\\FAA_NASR.txt");
}
else
{
// If we get here the user does not have Curl, OR Curl returned a file that is not longer than 10 Characters.
using (var client = new System.Net.WebClient())
{
client.Proxy = null;
response = client.DownloadString(url);
}
}
}

How to download file from API using Post Method

I have an API using POST Method.From this API I can download the file via Postmen tool.But I would like to know how to download file from C# Code.I have tried below code but POST Method is not allowed to download the file.
Code:-
using (var client = new WebClient())
{
client.Headers.Add("X-Cleartax-Auth-Token", ConfigurationManager.AppSettings["auth-token"]);
client.Headers[HttpRequestHeader.ContentType] = "application/json";
string url = ConfigurationManager.AppSettings["host"] + ConfigurationManager.AppSettings["taxable_entities"] + "/ewaybill/download?print_type=detailed";
TransId Id = new TransId()
{
id = TblHeader.Rows[0]["id"].ToString()
};
List<string> ids = new List<string>();
ids.Add(TblHeader.Rows[0]["id"].ToString());
string DATA = JsonConvert.SerializeObject(ids, Newtonsoft.Json.Formatting.Indented);
string res = client.UploadString(url, "POST",DATA);
client.DownloadFile(url, ConfigurationManager.AppSettings["InvoicePath"].ToString() + CboGatePassNo.EditValue.ToString().Replace("/", "-") + ".pdf");
}
Postmen Tool:-
URL : https://ewbbackend-preprodpub-http.internal.cleartax.co/gst/v0.1/taxable_entities/1c74ddd2-6383-4f4b-a7a5-007ddd08f9ea/ewaybill/download?print_type=detailed
Header :-
Content-Type : application/json
X-Cleartax-Auth-Token :b1f57327-96db-4829-97cf-2f3a59a3a548
Body :-
[
"GLD24449"
]
using (WebClient client = new WebClient())
{
client.Headers.Add("X-Cleartax-Auth-Token", ConfigurationManager.AppSettings["auth-token"]);
client.Headers[HttpRequestHeader.ContentType] = "application/json";
string url = ConfigurationManager.AppSettings["host"] + ConfigurationManager.AppSettings["taxable_entities"] + "/ewaybill/download?print_type=detailed";
client.Encoding = Encoding.UTF8;
//var data = "[\"GLD24449\"]";
var data = UTF8Encoding.UTF8.GetBytes(TblHeader.Rows[0]["id"].ToString());
byte[] r = client.UploadData(url, data);
using (var stream = System.IO.File.Create("FilePath"))
{
stream.Write(r,0,r.length);
}
}
Try this. Remember to change the filepath. Since the data you posted is not valid
json. So, I decide to post data this way.
I think it's straight forward, but instead of using WebClient, you can use HttpClient, it's better.
here is the answer HTTP client for downloading -> Download file with WebClient or HttpClient?
comparison between the HTTP client and web client-> Deciding between HttpClient and WebClient
Example Using WebClient
public static void Main(string[] args)
{
string path = #"download.pdf";
// Delete the file if it exists.
if (File.Exists(path))
{
File.Delete(path);
}
var uri = new Uri("https://ewbbackend-preprodpub-http.internal.cleartax.co/gst/v0.1/taxable_entities/1c74ddd2-6383-4f4b-a7a5-007ddd08f9ea/ewaybill/download?print_type=detailed");
WebClient client = new WebClient();
client.Headers[HttpRequestHeader.ContentType] = "application/json";
client.Headers.Add("X-Cleartax-Auth-Token", "b1f57327-96db-4829-97cf-2f3a59a3a548");
client.Encoding = Encoding.UTF8;
var data = UTF8Encoding.UTF8.GetBytes("[\"GLD24449\"]");
byte[] r = client.UploadData(uri, data);
using (var stream = System.IO.File.Create(path))
{
stream.Write(r, 0, r.Length);
}
}
Here is the sample code, don't forget to change the path.
public class Program
{
public static async Task Main(string[] args)
{
string path = #"download.pdf";
// Delete the file if it exists.
if (File.Exists(path))
{
File.Delete(path);
}
var uri = new Uri("https://ewbbackend-preprodpub-http.internal.cleartax.co/gst/v0.1/taxable_entities/1c74ddd2-6383-4f4b-a7a5-007ddd08f9ea/ewaybill/download?print_type=detailed");
HttpClient client = new HttpClient();
var request = new HttpRequestMessage(HttpMethod.Post, uri)
{
Content = new StringContent("[\"GLD24449\"]", Encoding.UTF8, "application/json")
};
request.Headers.Add("X-Cleartax-Auth-Token", "b1f57327-96db-4829-97cf-2f3a59a3a548");
var response = await client.SendAsync(request);
if (response.IsSuccessStatusCode)
{
using (FileStream fs = File.Create(path))
{
await response.Content.CopyToAsync(fs);
}
}
else
{
}
}

How to download using cefsharp in winforms [duplicate]

I am making a call to a page on my site using webclient. I'm trying to get the result of the webpage put into a pdf so I am trying to get a string representation of the rendered page. The problem is that the request is not authenticated so all I get is a login screen. I have sent the UseDefaultCredentials property to true but I still get the same result. Below is a portion of my code:
WebClient webClient = new WebClient();
webClient.Encoding = Encoding.UTF8;
webClient.UseDefaultCredentials = true;
return Encoding.UTF8.GetString(webClient.UploadValues(link, "POST",form));
You need to give the WebClient object the credentials. Something like this...
WebClient client = new WebClient();
client.Credentials = new NetworkCredential("username", "password");
What kind of authentication are you using? If it's Forms authentication, then at best, you'll have to find the .ASPXAUTH cookie and pass it in the WebClient request.
At worst, it won't work.
Public Function getWeb(ByRef sURL As String) As String
Dim myWebClient As New System.Net.WebClient()
Try
Dim myCredentialCache As New System.Net.CredentialCache()
Dim myURI As New Uri(sURL)
myCredentialCache.Add(myURI, "ntlm", System.Net.CredentialCache.DefaultNetworkCredentials)
myWebClient.Encoding = System.Text.Encoding.UTF8
myWebClient.Credentials = myCredentialCache
Return myWebClient.DownloadString(myURI)
Catch ex As Exception
Return "Exception " & ex.ToString()
End Try
End Function
This helped me to call API that was using cookie authentication. I have passed authorization in header like this:
request.Headers.Set("Authorization", Utility.Helper.ReadCookie("AuthCookie"));
complete code:
// utility method to read the cookie value:
public static string ReadCookie(string cookieName)
{
var cookies = HttpContext.Current.Request.Cookies;
var cookie = cookies.Get(cookieName);
if (cookie != null)
return cookie.Value;
return null;
}
// using statements where you are creating your webclient
using System.Web.Script.Serialization;
using System.Net;
using System.IO;
// WebClient:
var requestUrl = "<API_url>";
var postRequest = new ClassRoom { name = "kushal seth" };
using (var webClient = new WebClient()) {
JavaScriptSerializer serializer = new JavaScriptSerializer();
byte[] requestData = Encoding.ASCII.GetBytes(serializer.Serialize(postRequest));
HttpWebRequest request = WebRequest.Create(requestUrl) as HttpWebRequest;
request.Method = "POST";
request.ContentType = "application/json";
request.ContentLength = requestData.Length;
request.ContentType = "application/json";
request.Expect = "application/json";
request.Headers.Set("Authorization", Utility.Helper.ReadCookie("AuthCookie"));
request.GetRequestStream().Write(requestData, 0, requestData.Length);
using (var response = (HttpWebResponse)request.GetResponse()) {
var reader = new StreamReader(response.GetResponseStream());
var objText = reader.ReadToEnd(); // objText will have the value
}
}

NULL JSON posted onto PHP server using HTTP requests from C#

I'm trying to post a JSON string on a PHP page using HTTP response methods as follows:
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Windows.Forms;
using System.Net;
using System.IO;
using System.Web.Script.Serialization;
using System.Web;
namespace http_requests
{
public partial class Form1 : Form
{
public Form1()
{
InitializeComponent();
}
private void button1_Click(object sender, EventArgs e)
{
//var httpWebRequest = (HttpWebRequest)WebRequest.Create("http://localhost/abc/products.php");
//httpWebRequest.ContentType = "application/json";
//httpWebRequest.Method = "POST";
//using (var streamWriter = new StreamWriter(httpWebRequest.GetRequestStream()))
//{
// string json = new JavaScriptSerializer().Serialize(new
// {
// user = "Foo",
// password = "Baz"
// });
// streamWriter.Write(json);
// streamWriter.Flush();
// streamWriter.Close();
// var httpResponse = (HttpWebResponse)httpWebRequest.GetResponse();
// using (var streamReader = new StreamReader(httpResponse.GetResponseStream()))
// {
// var result = streamReader.ReadToEnd();
// }
//}
HttpWebRequest request = (HttpWebRequest)HttpWebRequest.Create("http://localhost/ABC/products.php");
request.Method = WebRequestMethods.Http.Post;
string DataToPost = new JavaScriptSerializer().Serialize(new
{
user = "Foo",
password = "Baz"
});
byte[] bytes = Encoding.UTF8.GetBytes(DataToPost);
string byteString = Encoding.UTF8.GetString(bytes);
Stream os = null;
//string postData = "firstName=" + HttpUtility.UrlEncode(p.firstName) +
request.ContentLength = bytes.Length;
request.ContentType = "application/x-www-form-urlencoded";
os = request.GetRequestStream();
os.Write(bytes, 0, bytes.Length);
//StreamWriter writer = new StreamWriter(request.GetRequestStream());
//writer.Write(DataToPost);
//writer.Close();
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
//StreamReader reader = new StreamReader(response.GetResponseStream());
using (var streamReader = new StreamReader(response.GetResponseStream()))
{
var result = streamReader.ReadToEnd();
richTextBox1.AppendText("R : " + result);
Console.WriteLine(streamReader.ReadToEnd().Trim());
}
//richTextBox1.Text = response.ToString();
}
}
}
I tried it in many different ways (converting to bytes too) but still posts a NULL array.
PHP Code:
<?php
$json = $_POST;
if (isset($json)) {
echo "This var is set so I will print.";
//var_dump($json);
var_dump(json_decode(file_get_contents('php://input')));
}
?>
When I try to get tha response from server and print onto a text box, it prints right:
R : This var is set so I will print.object(stdClass)#1 (2) {
["user"]=>
string(3) "Foo"
["password"]=>
string(3) "Baz"
}
but i'm unable to check it on my PHP page, it says:
This var is set so I will print.NULL
Not sure if its posting a JSON onto PHP or not, but it sure does posts a NULL.
I want to see the JSON on PHP page, Any help would be appreciated.
Thank you,
Revathy
There is nothing wrong with your c# client side code, the problem is that visiting a site in your browser is a seperate request from the c# post, so you wont see anything.
As per my comment, if you want to see the data in a browser after a post i c#, you will need to save and retrieve it.
Here is a simple example using a text file to save post data and display it:
//if post request
if($_SERVER['REQUEST_METHOD']=='POST'){
//get data from POST
$data = file_get_contents('php://input');
//save to file
file_put_contents('data.txt', $data);
die('Saved');
}
//else if its a get request (eg view in browser)
var_dump(json_decode(file_get_contents('data.txt')));

Uploading images to Imgur.com using C#

I just recieve my unique developer API key from Imgur and I'm aching to start cracking on this baby.
First a simple test to kick things off. How can I upload an image using C#? I found this using Python:
#!/usr/bin/python
import pycurl
c = pycurl.Curl()
values = [
("key", "YOUR_API_KEY"),
("image", (c.FORM_FILE, "file.png"))]
# OR: ("image", "http://example.com/example.jpg"))]
# OR: ("image", "BASE64_ENCODED_STRING"))]
c.setopt(c.URL, "http://imgur.com/api/upload.xml")
c.setopt(c.HTTPPOST, values)
c.perform()
c.close()
looks like the site uses HTTP Post to upload images. Take a look at the HTTPWebRequest class and using it to POST to a URL: Posting data with HTTPRequest.
The Imgur API now provide a complete c# example :
using System;
using System.IO;
using System.Net;
using System.Text;
namespace ImgurExample
{
class Program
{
static void Main(string[] args)
{
PostToImgur(#"C:\Users\ashwin\Desktop\image.jpg", IMGUR_ANONYMOUS_API_KEY);
}
public static void PostToImgur(string imagFilePath, string apiKey)
{
byte[] imageData;
FileStream fileStream = File.OpenRead(imagFilePath);
imageData = new byte[fileStream.Length];
fileStream.Read(imageData, 0, imageData.Length);
fileStream.Close();
string uploadRequestString = "image=" + Uri.EscapeDataString(System.Convert.ToBase64String(imageData)) + "&key=" + apiKey;
HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create("http://api.imgur.com/2/upload");
webRequest.Method = "POST";
webRequest.ContentType = "application/x-www-form-urlencoded";
webRequest.ServicePoint.Expect100Continue = false;
StreamWriter streamWriter = new StreamWriter(webRequest.GetRequestStream());
streamWriter.Write(uploadRequestString);
streamWriter.Close();
WebResponse response = webRequest.GetResponse();
Stream responseStream = response.GetResponseStream();
StreamReader responseReader = new StreamReader(responseStream);
string responseString = responseReader.ReadToEnd();
}
}
}
Why don't you use the NuGet for this: called Imgur.API and for upload
you would have a method like this:
/*
The refresh token and all the values represented by constans are given when you allow the application in your imgur panel on the response url
*/
public OAuth2Token CreateToken()
{
var token = new OAuth2Token(TOKEN_ACCESS, REFRESH_TOKEN, TOKEN_TYPE, ID_ACCOUNT, IMGUR_USER_ACCOUNT, int.Parse(EXPIRES_IN));
return token;
}
//Use it only if your token is expired
public Task<IOAuth2Token> RefreshToken()
{
var client = new ImgurClient(CLIENT_ID, CLIENT_SECRET);
var endpoint= new OAuth2Endpoint(client);
var token = endpoint.GetTokenByRefreshTokenAsync(REFRESH_TOKEN);
return token;
}
public async Task UploadImage()
{
try
{
var client = new ImgurClient(CLIENT_ID, CLIENT_SECRET, CreateToken());
var endpoint = new ImageEndpoint(client);
IImage image;
//Here you have to link your image location
using (var fs = new FileStream(#"IMAGE_LOCATION", FileMode.Open))
{
image = await endpoint.UploadImageStreamAsync(fs);
}
Debug.Write("Image uploaded. Image Url: " + image.Link);
}
catch (ImgurException imgurEx)
{
Debug.Write("Error uploading the image to Imgur");
Debug.Write(imgurEx.Message);
}
}
Also you can find all the reference here: Imgur.API NuGet

Categories