I'm trying to create a web service which goes to a URL, e.g. www.domain.co.uk/prices.csv, and then reads the CSV file. Is this possible, and how? Ideally without downloading the CSV file?
You could use:
public string GetCSV(string url)
{
    HttpWebRequest req = (HttpWebRequest)WebRequest.Create(url);

    // Dispose the response and reader so the connection is released.
    using (HttpWebResponse resp = (HttpWebResponse)req.GetResponse())
    using (StreamReader sr = new StreamReader(resp.GetResponseStream()))
    {
        return sr.ReadToEnd();
    }
}
And then to split it:
public List<string> SplitCSV()
{
    List<string> splitted = new List<string>();
    string fileList = GetCSV("http://www.google.com");

    // Split on commas and keep only the non-empty fields.
    foreach (string item in fileList.Split(','))
    {
        if (!string.IsNullOrWhiteSpace(item))
        {
            splitted.Add(item);
        }
    }

    return splitted;
}
There are plenty of CSV parsers out there, though, and I would advise against rolling your own. FileHelpers is a good one.
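For illustration, a minimal FileHelpers sketch; the PriceRecord class and its fields are assumptions about what prices.csv might contain, not taken from the question:
using FileHelpers;

// Hypothetical record layout; adjust the fields to match the actual CSV columns.
[DelimitedRecord(",")]
public class PriceRecord
{
    public string ProductName;
    public decimal Price;
}

// Parse the CSV text returned by GetCSV above.
var engine = new FileHelperEngine<PriceRecord>();
PriceRecord[] records = engine.ReadString(GetCSV("http://www.domain.co.uk/prices.csv"));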
// Download the file to a specified path. Using the WebClient class we can
// download files directly from a provided url, like in this case.
using (var client = new System.Net.WebClient())
{
    client.DownloadFile(url, csvPath);
}
where url points at the CSV file on your site and csvPath is where you want the actual file to go.
In your web service you could use the WebClient class to download the file, something like this (I have not added any exception handling or any using/Close/Dispose calls, just wanted to give you the idea, which you can use and refine/improve):
using System.Net;
WebClient webClient = new WebClient();
// DownloadFile needs a local destination path as its second argument.
webClient.DownloadFile("http://www.domain.co.uk/prices.csv", @"prices.csv");
Then you can do anything you like with it once the file content is available in the execution flow of your service.
If you have to return it to the client as the return value of the web service call, you can either return a DataSet or any other data structure you prefer.
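For instance, here is a rough sketch of turning the downloaded CSV text into a DataTable that could be returned inside a DataSet; the naive Split assumes no quoted commas, which a real CSV parser should handle:
using System;
using System.Data;

DataTable ToDataTable(string csv)
{
    var table = new DataTable("Prices");
    string[] lines = csv.Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);

    // The first line is assumed to be the header row.
    foreach (string header in lines[0].Split(','))
        table.Columns.Add(header.Trim());

    // The remaining lines become data rows.
    for (int i = 1; i < lines.Length; i++)
        table.Rows.Add(lines[i].Split(','));

    return table;
}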
Sebastien Lorion's CSV Reader has a constructor that takes a TextReader, so you can wrap the response stream in a StreamReader.
If you decided to use this, your example would become:
void GetCSVFromRemoteUrl(string url)
{
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);

    using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
    using (CsvReader csvReader = new CsvReader(new StreamReader(response.GetResponseStream()), true))
    {
        int fieldCount = csvReader.FieldCount;
        string[] headers = csvReader.GetFieldHeaders();
        while (csvReader.ReadNextRecord())
        {
            // Do work with the CSV data here.
        }
    }
}
The ever-popular FileHelpers also allows you to read directly from a stream.
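For example, something along these lines, assuming a PriceRecord mapping class like the hypothetical one above and a response obtained from HttpWebRequest:
// FileHelpers reads from a TextReader, so wrap the response stream in a StreamReader.
var engine = new FileHelperEngine<PriceRecord>();
using (var reader = new StreamReader(response.GetResponseStream()))
{
    PriceRecord[] records = engine.ReadStream(reader);
}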
The documentation for WebRequest has an example that uses streams. Using a stream allows you to parse the document without storing it all in memory.
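A minimal sketch of that streaming approach, where url is the CSV address and each line is handled as it arrives:
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);

using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
using (StreamReader reader = new StreamReader(response.GetResponseStream()))
{
    string line;
    while ((line = reader.ReadLine()) != null)
    {
        // Process a single CSV row here; only one line is held in memory at a time.
    }
}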
I am trying to upload a file to an FTP folder, but I am getting the following error:
The remote server returned an error: (550) File unavailable (e.g.,
file not found, no access)
I am using the following sample to test this:
// Get the object used to communicate with the server.
string path = HttpUtility.UrlEncode("ftp://host:port//01-03-2017/John, Doe S. M.D/file.wav");
FtpWebRequest request = (FtpWebRequest)WebRequest.Create(path);
request.Method = WebRequestMethods.Ftp.UploadFile;
// This example assumes the FTP site uses anonymous logon.
request.Credentials = new NetworkCredential("user", "password");
// Copy the contents of the file to the request stream.
StreamReader sourceStream = new StreamReader(@"localpath\example.wav");
byte[] fileContents = Encoding.UTF8.GetBytes(sourceStream.ReadToEnd());
sourceStream.Close();
request.ContentLength = fileContents.Length;
Stream requestStream = request.GetRequestStream();
requestStream.Write(fileContents, 0, fileContents.Length);
requestStream.Close();
FtpWebResponse response = (FtpWebResponse)request.GetResponse();
Console.WriteLine("Upload File Complete, status {0}", response.StatusDescription);
response.Close();
I am able to upload files to the parent folder 01-03-2017 but not to the target folder ROLLINS, SETH S. M.D, which clearly has special characters in it.
I am able to upload files using FileZilla.
I have tried HttpUtility.UrlEncode but that didn't help.
Thanks for your time and help.
You need to encode the spaces (and maybe commas) in the URL path, like:
string path =
"ftp://host:port/01-03-2017/" +
HttpUtility.UrlEncode("John, Doe S. M.D") + "/file.wav";
Effectively, you get:
ftp://host:port/01-03-2017/John%2c+Doe+S.+M.D/file.wav
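If you need this for arbitrary folder names, one option is to encode each path segment separately. EncodePath below is a hypothetical helper, not part of the framework:
using System.Linq;
using System.Web;

// Hypothetical helper: encode every path segment, leaving the '/' separators intact.
static string EncodePath(string basePath, params string[] segments)
{
    return basePath.TrimEnd('/') + "/" +
           string.Join("/", segments.Select(s => HttpUtility.UrlEncode(s)));
}

// Usage with the question's placeholders:
string path = EncodePath("ftp://host:port", "01-03-2017", "John, Doe S. M.D", "file.wav");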
Use something like this, encoding only the problematic folder name rather than the whole URL:
string path =
    "ftp://96.31.95.118:2121/01-03-2017/" +
    HttpUtility.UrlEncode("ROLLINS, SETH S. M.D") + "/30542_3117.wav";
or you can form a Uri using the following code and pass it to WebRequest:
var path = new Uri("ftp://96.31.95.118:2121//01-03-2017//ROLLINS, SETH S. M.D//30542_3117.wav");
The code works in a C# console application but did not work in a Web API action; I could not manage to find the reason. So I have used a free library instead: the FluentFTP library, available through NuGet. Posting the sample code from one of the examples available here:
using System;
using System.IO;
using System.Net;
using FluentFTP;
namespace Examples {
    public class OpenWriteExample {
        public static void OpenWrite() {
            using (FtpClient conn = new FtpClient()) {
                conn.Host = "localhost";
                conn.Credentials = new NetworkCredential("ftptest", "ftptest");
                using (Stream ostream = conn.OpenWrite("01-03-2017/John, Doe S. M.D/file.wav")) {
                    try {
                        // ostream.Position is incremented according to the writes you perform.
                        // Write the file bytes to ostream here.
                    }
                    finally {
                        ostream.Close();
                    }
                }
            }
        }
    }
}
Again, if the file is a binary file, StreamReader should not be used, as explained here.
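As a binary-safe sketch of the upload above, where request is the FtpWebRequest from the question and the path is the question's placeholder:
// Read the source file as raw bytes; no text decoding is involved,
// so binary content such as a .wav arrives intact.
byte[] fileContents = File.ReadAllBytes(@"localpath\example.wav");
request.ContentLength = fileContents.Length;

using (Stream requestStream = request.GetRequestStream())
{
    requestStream.Write(fileContents, 0, fileContents.Length);
}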
I am trying to create my own download manager.
When a link is added to the download manager, I use a WebClient to get its information from the server, like so:
WebClient webClient = new WebClient();
webClient.OpenRead(link);
string filename = webClient.ResponseHeaders["Content-Disposition"];
After that I download the file using DownloadFile
FileInfo fileInfo = new FileInfo(path);
if (!fileInfo.Exists)
{
webClient.DownloadFile(link, path);
}
When I do it like this, I get a WebException timeout. However, when I remove the webClient.ResponseHeaders part, I never get the timeout exception.
I really need to read the Content-Disposition header because some of the links don't have the name of the file in them.
I even tried using a different WebClient for downloading and one for getting the info, but I got the same result.
I was able to fix the problem by finding another way to get the file's info.
string name = "";
HttpWebRequest myHttpWebRequest = (HttpWebRequest)WebRequest.Create(link);

using (HttpWebResponse myHttpWebResponse = (HttpWebResponse)myHttpWebRequest.GetResponse())
{
    // Look for the Content-Disposition header among the response headers.
    for (int i = 0; i < myHttpWebResponse.Headers.Count; i++)
    {
        if (myHttpWebResponse.Headers.Keys[i] == "Content-Disposition")
        {
            name = myHttpWebResponse.Headers[i];
        }
    }
}
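As a side note, the loop is not strictly necessary, since WebHeaderCollection can be indexed by header name directly:
// Returns null when the header is absent.
string name = myHttpWebResponse.Headers["Content-Disposition"];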
I have a webpage which has nothing on it except some string(s). No images, no background color or anything, just some plain text which is not really that long.
I am just wondering, what is the best (by that, I mean fastest and most efficient) way to retrieve the string from the webpage so that I can use it for something else (e.g. display it in a text box)? I know of WebClient, but I'm not sure it'll do what I want; besides, I'm reluctant to try it because the last time I used it, a simple operation took approximately 30 seconds.
Any ideas would be appreciated.
The WebClient class should be more than capable of handling the functionality you describe, for example:
System.Net.WebClient wc = new System.Net.WebClient();
byte[] raw = wc.DownloadData("http://www.yoursite.com/resource/file.htm");
string webData = System.Text.Encoding.UTF8.GetString(raw);
or (further to the suggestion from Fredrick in the comments):
System.Net.WebClient wc = new System.Net.WebClient();
string webData = wc.DownloadString("http://www.yoursite.com/resource/file.htm");
When you say it took 30 seconds, can you expand on that a little more? There are many reasons as to why that could have happened. Slow servers, internet connections, dodgy implementation etc etc.
You could go a level lower and implement something like this:
HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create("http://www.yoursite.com/resource/file.htm");

string responseData = string.Empty;
HttpWebResponse httpResponse = (HttpWebResponse)webRequest.GetResponse();
using (StreamReader responseReader = new StreamReader(httpResponse.GetResponseStream()))
{
    responseData = responseReader.ReadToEnd();
}
However, at the end of the day the WebClient class wraps up this functionality for you. So I would suggest that you use WebClient and investigate the causes of the 30 second delay.
If you're downloading text then I'd recommend using WebClient and getting a StreamReader on the text:
WebClient web = new WebClient();
System.IO.Stream stream = web.OpenRead("http://www.yoursite.com/resource.txt");
using (System.IO.StreamReader reader = new System.IO.StreamReader(stream))
{
    string text = reader.ReadToEnd();
}
If this is taking a long time then it is probably a network issue or a problem on the web server. Try opening the resource in a browser and see how long that takes.
If the webpage is very large, you may want to look at streaming it in chunks rather than reading all the way to the end as in that example.
Look at http://msdn.microsoft.com/en-us/library/system.io.stream.read.aspx to see how to read from a stream.
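A small sketch of reading in chunks, assuming the same stream as in the snippet above; note that decoding UTF-8 chunk by chunk can split multi-byte characters, which a System.Text.Decoder would handle in production code:
byte[] buffer = new byte[4096];
int bytesRead;

// Read at most 4 KB per iteration instead of buffering the entire response.
while ((bytesRead = stream.Read(buffer, 0, buffer.Length)) > 0)
{
    string chunk = System.Text.Encoding.UTF8.GetString(buffer, 0, bytesRead);
    // Process the chunk here.
}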
Regarding the suggestion
So I would suggest that you use WebClient and investigate the causes of the 30 second delay.
From the answers to the question System.Net.WebClient unreasonably slow, try setting Proxy = null; skipping automatic proxy detection is a common fix for a long delay on the first request.
WebClient wc = new WebClient();
wc.Proxy = null;
Credit to Alex Burtsev
If you use the WebClient to read the contents of the page, it will include HTML tags.
string webURL = "https://yoursite.com";
WebClient wc = new WebClient();
wc.Headers.Add("user-agent", "Only a Header!");
byte[] rawByteArray = wc.DownloadData(webURL);
string webContent = Encoding.UTF8.GetString(rawByteArray);
After getting the content, the HTML tags should be removed. Regex can be used for this:
var result= Regex.Replace(webContent, "<.*?>", String.Empty);
But this method is not very accurate; the better way is to install HtmlAgilityPack and use the following code:
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(webContent);
string result = doc.DocumentNode.InnerText;
You say it takes 30 seconds, but that likely has nothing to do with WebClient itself (the main factors are the internet connection or a proxy). WebClient has worked very well for me. Example:
WebClient client = new WebClient();
using (Stream data = client.OpenRead(Text))
{
    using (StreamReader reader = new StreamReader(data))
    {
        string content = reader.ReadToEnd();
        string pattern = @"((https?|ftp|gopher|telnet|file|notes|ms-help):((//)|(\\\\))+[\w\d:#@%/;$()~_?\+-=\\\.&]*)";
        MatchCollection matches = Regex.Matches(content, pattern);
        List<string> urls = new List<string>();
        foreach (Match match in matches)
        {
            urls.Add(match.Value);
        }
    }
}
Alternatively, you can load the page with XmlDocument, though this only works when the markup is well-formed XML:
XmlDocument document = new XmlDocument();
document.Load("http://www.yourwebsite.com");
string allText = document.InnerText;
I'm in the middle of teaching myself to code, so do pardon the ignorance.
My question is: what do I need to read/learn in order to be able to output the HTML of a particular website (e.g. google.com) to the console?
Thanks.
I would suggest you start here: http://msdn.microsoft.com/en-us/library/system.net.httpwebrequest%28v=VS.90%29.aspx
Essentially, you create the HttpWebRequest and then call the GetResponse() method. You can then read the response stream and output it to your console.
Use HttpWebRequest to create a request and output the response to the console.
using System;
using System.IO;
using System.Net;
using System.Text;
namespace Examples.System.Net
{
    public class WebRequestGetExample
    {
        public static void Main ()
        {
            // Create a request for the URL.
            WebRequest request = WebRequest.Create ("http://www.contoso.com/default.html");
            // If required by the server, set the credentials.
            request.Credentials = CredentialCache.DefaultCredentials;
            // Get the response.
            HttpWebResponse response = (HttpWebResponse)request.GetResponse ();
            // Display the status.
            Console.WriteLine (response.StatusDescription);
            // Get the stream containing content returned by the server.
            Stream dataStream = response.GetResponseStream ();
            // Open the stream using a StreamReader for easy access.
            StreamReader reader = new StreamReader (dataStream);
            // Read the content.
            string responseFromServer = reader.ReadToEnd ();
            // Display the content.
            Console.WriteLine (responseFromServer);
            // Cleanup the streams and the response.
            reader.Close ();
            dataStream.Close ();
            response.Close ();
        }
    }
}
Most browsers allow you to right-click and select "View Source", that's the easiest way to see the HTML.
Have a look at the WebClient class, particularly the example at the bottom of the MSDN page.
This will do the trick:
WebClient client = new WebClient();
using (Stream data = client.OpenRead("http://www.google.com"))
using (StreamReader reader = new StreamReader(data))
{
    // ReadToEnd returns the whole page rather than just the first line.
    string str = reader.ReadToEnd();
    Console.WriteLine(str);
}
In my application I use the WebClient class to download files from a web server by simply calling the DownloadFile method. Now I need to check whether a certain file exists prior to downloading it (or in case I just want to make sure that it exists). I've got two questions with that:
What is the best way to check whether a file exists on a server without transferring too much data across the wire? (It's quite a large number of files I need to check.)
Is there a way to get the size of a given remote file without downloading it?
Thanks in advance!
WebClient is fairly limited; if you switch to using WebRequest, then you gain the ability to send an HTTP HEAD request. When you issue the request, you should either get an error (if the file is missing), or a WebResponse with a valid ContentLength property.
Edit: Example code:
WebRequest request = WebRequest.Create(new Uri("http://www.example.com/"));
request.Method = "HEAD";
using (WebResponse response = request.GetResponse())
{
    Console.WriteLine("{0} {1}", response.ContentLength, response.ContentType);
}
When you request a file using the WebClient class, a 404 error (File Not Found) will raise an exception. The best way is to handle that exception and use a flag which can be set to see whether the file exists or not.
The example code goes as follows:
System.Net.HttpWebRequest request = null;
System.Net.HttpWebResponse response = null;
int flag = 0;

request = (System.Net.HttpWebRequest)System.Net.HttpWebRequest.Create("http://www.example.com/somepath");
request.Timeout = 30000;

try
{
    response = (System.Net.HttpWebResponse)request.GetResponse();
    flag = 1;
}
catch
{
    flag = -1;
}

if (flag == 1)
{
    Console.WriteLine("File Found!!!");
}
else
{
    Console.WriteLine("File Not Found!!!");
}
You can put your own code in the respective if blocks. Hope it helps!
Hope it helps!
What is the best way to check whether a file exists on a server
without transferring too much data across the wire?
You can test with WebClient.OpenRead to open the file stream without reading all the file bytes:
using (var client = new WebClient())
{
    Stream stream = client.OpenRead(url);
    // ^ throws System.Net.WebException: 'Could not find file...' if the file is not present
    stream.Close();
}
This will indicate if the file exists at the remote location or not.
To fully read the file stream, you would do:
using (var client = new WebClient())
{
    Stream stream = client.OpenRead(url);
    StreamReader sr = new StreamReader(stream);
    Console.WriteLine(sr.ReadToEnd());
    stream.Close();
}
In case anyone is stuck with an SSL certificate issue:
// WARNING: this accepts every certificate and disables TLS validation; for testing only.
ServicePointManager.ServerCertificateValidationCallback = new RemoteCertificateValidationCallback
(
    delegate { return true; }
);

WebRequest request = WebRequest.Create(new Uri("http://.com/flower.zip"));
request.Method = "HEAD";

using (WebResponse response = request.GetResponse())
{
    Console.WriteLine("{0} {1}", response.ContentLength, response.ContentType);
}