private static void DownloadFile()
{
FtpWebRequest reqFTP;
WebResponse webResponse;
GetTheResponseFromFTP(out reqFTP, out webResponse, true);
FtpWebResponse response = (FtpWebResponse)webResponse;
Stream responseStream = response.GetResponseStream();
StreamReader reader = new StreamReader(responseStream);
using (StreamWriter streamWriter =
new StreamWriter("d:\\TestUnity.pdf", true))
{
streamWriter.WriteLine(reader.ReadToEnd());
}
reader.Close();
response.Close();
}
I have the above function, which downloads a file from an FTP location.
I read the content as text and try to write it to a file on my local machine.
The generated PDF file is the same size as the downloaded one, but when I open it, it is blank. I have two questions:
How can I save the downloaded file to a path that can be changed?
What is the reason for the problem described above?
From the documentation.
StreamWriter implements a TextWriter for writing characters to a stream
This means the PDF's bytes were decoded and re-encoded as characters, so you haven't created a PDF file but a text file with the *.pdf extension.
There are multiple utilities available for creating a PDF;
WkHtmlToPDF and iTextSharp are just two.
Here is very simple code which works for me
void GeneratePDF(WebResponse response)
{
using (var streamFile = File.Create("E:/JSS.pdf"))
response.GetResponseStream().CopyTo(streamFile);
}
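On the first question (saving to a path that can be changed), one option is to pass the destination as a parameter instead of hard-coding it. The following is a minimal sketch, not code from the thread: the names FtpDownloadSketch, SaveTo and DownloadFile are hypothetical, and the URL, user name and password are whatever the caller supplies. It copies raw bytes, so the PDF never passes through a text encoding.

```csharp
using System;
using System.IO;
using System.Net;

static class FtpDownloadSketch
{
    // Copies raw bytes from any readable stream to destinationPath.
    public static void SaveTo(Stream source, string destinationPath)
    {
        // File.Create overwrites the file if it already exists.
        using (FileStream target = File.Create(destinationPath))
        {
            source.CopyTo(target);
        }
    }

    // Hypothetical wrapper: ftpUrl, user and password are supplied by the caller.
    public static void DownloadFile(string ftpUrl, string user, string password, string destinationPath)
    {
        var request = (FtpWebRequest)WebRequest.Create(ftpUrl);
        request.Method = WebRequestMethods.Ftp.DownloadFile;
        request.Credentials = new NetworkCredential(user, password);
        request.UseBinary = true; // binary transfer, no line-ending translation

        using (var response = (FtpWebResponse)request.GetResponse())
        using (Stream responseStream = response.GetResponseStream())
        {
            SaveTo(responseStream, destinationPath);
        }
    }
}
```

The destination could then come from configuration or user input, for example the result of Path.Combine on a chosen folder and file name.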
I need to convert any file coming from a web response into .pdf format. I'm currently getting it from the URL in Word .docx format and saving it into a memory stream so I can later insert it into its designated library.
The problem I'm facing now is that I'm saving my .docx files directly as .pdf just by putting the extension at the end, which obviously means the file won't open later. So I'm trying to convert my memory stream into PDF directly.
Here is the code I tried for converting the stream to .pdf, but it looks like the file isn't being converted correctly.
private Stream DownloadFromUrl(string url)
{
var webRequest = WebRequest.Create(url);
webRequest.Credentials = CredentialCache.DefaultNetworkCredentials;
webRequest.PreAuthenticate = true;
webRequest.UseDefaultCredentials = true;
//EventLogUtility.LogInformationMessage(DocumentURL);
string message = string.Empty;
using (Stream outputStream = new MemoryStream())
{
using (var response = webRequest.GetResponse())
{
using (var content = response.GetResponseStream())
{
var memory = new MemoryStream();
content.CopyTo(memory);
Document doc = new Document(memory);
doc.Save(memory, SaveFormat.Pdf);
return memory;
}
}
}
}
If the content in the stream is actually in the Microsoft Word file format (and not just plain text), then you need to convert it to the PDF file format; changing the extension is not enough. I know Word has a 'Print to PDF' function, so you could try looking into that.
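If the Document and SaveFormat types in the question come from Aspose.Words (an assumption; the question does not name the library), two details in the posted code look suspicious: the MemoryStream is still positioned at its end after CopyTo, and the PDF is saved into the same stream that holds the .docx bytes. A hedged sketch of one way around both, with the hypothetical helper name ConvertToPdf:

```csharp
using System.IO;
using Aspose.Words; // assumption: the question's Document/SaveFormat come from this library

static class DocxToPdfSketch
{
    // Hypothetical helper: returns a new stream that holds the PDF bytes.
    public static MemoryStream ConvertToPdf(Stream docxContent)
    {
        using (var input = new MemoryStream())
        {
            docxContent.CopyTo(input);
            input.Position = 0;              // rewind before the library reads it

            var doc = new Document(input);
            var output = new MemoryStream(); // separate stream for the PDF
            doc.Save(output, SaveFormat.Pdf);
            output.Position = 0;             // rewind so the caller reads from the start
            return output;
        }
    }
}
```

The caller owns the returned stream and is responsible for disposing it.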
I am downloading a PDF file with an HttpWebRequest object and writing the content from the response stream directly to a FileStream, wrapping everything in "using" blocks and also calling .Close right after the data is copied.
As the next step, I need to extract some text from that PDF file with a third-party library (iText7), but it can't access the file.
At first I thought it was an iText7-related issue, but then I realized that doesn't seem to be the case: I can't even delete the file from File Explorer, because I get a "file in use" error caused by my own app.
Here's the sample code:
HttpWebRequest webReq = (HttpWebRequest)HttpWebRequest.Create(url);
webReq.AllowAutoRedirect = true;
webReq.CookieContainer = Cookies;
webReq.UserAgent = UserAgent;
webReq.Referer = Referrer;
webReq.Method = WebRequestMethods.Http.Get;
using (HttpWebResponse response = (HttpWebResponse)webReq.GetResponse())
{
using (Stream httpResponseStream = response.GetResponseStream())
{
using (FileStream output = File.Create(file1))
{
httpResponseStream.CopyTo(output);
output.Close();
}
httpResponseStream.Close();
response.Close();
Cookies = webReq.CookieContainer;
}
}
GC.Collect();
ExtractPDFDoc(file1);//error throws in this function and the exception.message is "Cannot open document."
Console.WriteLine("now waiting to let you check the file is in use? try delete it manually...");
Console.ReadKey(); //added this line to confirm that the file is actually in use. I can't even delete the file manually from Windows File Explorer at this point. Interestingly, Acrobat Reader can OPEN the file when I double-click it, which makes me think that Adobe and iText7 use different methods to open the PDF file.
Can you please help me find out what is wrong here?
For those who want to see the ExtractPDFDoc() method:
public static object ExtractPDFDoc(string filename)
{
iText.Kernel.Pdf.PdfReader pdfReader = null;
iText.Kernel.Pdf.PdfDocument pdfDocument = null;
try
{
pdfReader = new iText.Kernel.Pdf.PdfReader(filename);
pdfDocument = new iText.Kernel.Pdf.PdfDocument(pdfReader);
}
catch (Exception ex)
{
pdfReader = null;
pdfDocument = null;
return new Exception(string.Format("ExtractPDFDoc() failed on file '{0}' with message '{1}'", filename, ex.Message));
//this is where I get the error, ex.Message is 'Cannot open document.'
//however, I can open it in Adobe Reader but I can't delete it before closing my app.
}
}
If I remember correctly, the iText objects are all IDisposable, so you should be sure to dispose of them as well. Also, I don't know why you're returning an exception instead of just throwing it.
public static object ExtractPDFDoc(string filename)
{
iText.Kernel.Pdf.PdfReader pdfReader = null;
iText.Kernel.Pdf.PdfDocument pdfDocument = null;
try
{
pdfReader = new iText.Kernel.Pdf.PdfReader(filename);
pdfDocument = new iText.Kernel.Pdf.PdfDocument(pdfReader);
// ... extract the text here ...
return null; // placeholder: the question does not show what is returned on success
}
catch (Exception ex)
{
throw new Exception(string.Format("ExtractPDFDoc() failed on file '{0}' with message '{1}'", filename, ex.Message), ex);
}
finally
{
// Dispose the document before the reader it wraps, so the file handle is released even on failure.
pdfDocument?.Dispose();
pdfReader?.Dispose();
}
}
Unrelated to that, you can also stack your using statements instead of nesting them.
using (HttpWebResponse response = (HttpWebResponse)webReq.GetResponse())
using (Stream httpResponseStream = response.GetResponseStream())
using (FileStream output = File.Create(file1))
{
// do stuff
}
I'm deeply sorry. Thanks to @howcheng, I realized that it was iText7 that left the file open after it failed to open the document, because one of its dependency files was missing from the output folder.
It's clear that I should call .Close() on the iText7 objects when an exception occurs, to avoid misleading symptoms like this.
Thanks for all your help.
We currently have a *.BAT file that contains some FTP commands to download a file from our AS400 and save it to a text file. The BAT file works fine, and the text file shows the records from the downloaded file one per line.
Now we want to get rid of this *.BAT file and use C# to download the file and save it to a text file. The problem is that the file we get contains all the records in ONE single line! They are no longer listed one per line.
Here is the code we are using:
FtpWebRequest request = default(FtpWebRequest);
FtpWebResponse response = default(FtpWebResponse);
StreamWriter writer = default(StreamWriter);
request = WebRequest.Create("*******URL******") as FtpWebRequest;
request.Credentials = new NetworkCredential("user", "pass");
request.Method = WebRequestMethods.Ftp.DownloadFile;
request.UseBinary = true;
response = request.GetResponse() as FtpWebResponse;
writer = new StreamWriter(Server.MapPath("/filename.txt"));
using (StreamReader reader = new StreamReader(response.GetResponseStream(), Encoding.GetEncoding(37))) //37 for IBM encoding
{
writer.WriteLine(reader.ReadToEnd());
}
writer.Close();
response.Close();
Any idea why we are getting this, and why the simple DOS FTP command works better than our code?
Thanks a lot! :)
ASCII mode will add record delimiters when downloading a physical file. It is the default transfer mode of most ftp clients.
request.UseBinary = false;
Specifying false causes the FtpWebRequest to send a "Type A" command to the server.
Data Transfer Methods
Transferring QSYS.LIB files
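For context, here is a minimal sketch of where that setting sits in a complete download. The class and method names, URL, credentials and destination path are hypothetical placeholders. It assumes the server performs the text conversion and adds the record delimiters in ASCII mode, so the bytes are written as received rather than decoded with code page 37; that assumption is worth checking against the actual server.

```csharp
using System.IO;
using System.Net;

static class AsciiFtpSketch
{
    // Hypothetical helper: all four arguments are supplied by the caller.
    public static void DownloadAsText(string ftpUrl, string user, string password, string destinationPath)
    {
        var request = (FtpWebRequest)WebRequest.Create(ftpUrl);
        request.Method = WebRequestMethods.Ftp.DownloadFile;
        request.Credentials = new NetworkCredential(user, password);
        request.UseBinary = false; // ASCII mode: sends "TYPE A" to the server

        using (var response = (FtpWebResponse)request.GetResponse())
        using (Stream responseStream = response.GetResponseStream())
        using (FileStream target = File.Create(destinationPath))
        {
            // Assumption: the server has already converted the records to text.
            responseStream.CopyTo(target);
        }
    }
}
```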
The problem might be simple: you read the whole document at once. You need to read each line separately:
using(StreamReader sr = new StreamReader(fs))
{
while(!sr.EndOfStream)
{
Console.WriteLine(sr.ReadLine());
}
}
I need to change the logic in an old system and I'm trying to get file downloading to work. Any ideas? I need to download the files over FTP in C#. This is the code I found, but I need to get the content into a file instead of a stream.
// Get the object used to communicate with the server.
FtpWebRequest request = (FtpWebRequest)WebRequest.Create("ftp://192.168.1.52/Odt/"+fileName+".dat");
request.Method = WebRequestMethods.Ftp.DownloadFile;
// This example assumes the FTP site uses anonymous logon.
request.Credentials = new NetworkCredential ("anonymous","janeDoe@contoso.com");
FtpWebResponse response = (FtpWebResponse)request.GetResponse();
Stream responseStream = response.GetResponseStream();
StreamReader reader = new StreamReader(responseStream);
Console.WriteLine(reader.ReadToEnd());
Console.WriteLine("Download Complete, status {0}", response.StatusDescription);
reader.Close();
response.Close();
The suggestion from commenter Ron Beyer isn't bad, but because it involves decoding and re-encoding the text, there is a risk of data loss.
You can download the file verbatim by simply copying the request response stream to a file directly. That would look something like this:
// Some file name, initialized however you like
string fileName = ...;
using (Stream responseStream = response.GetResponseStream())
using (Stream fileStream = File.Create(fileName)) // File.Create truncates an existing file; File.OpenWrite would leave old trailing bytes
{
responseStream.CopyTo(fileStream);
}
Console.WriteLine("Download Complete, status {0}", response.StatusDescription);
response.Close();
I want to send a URL as a query string, e.g.
localhost/abc.aspx?url=http://www.site.com/report.pdf
and detect whether that URL returns a PDF file. If it returns a PDF, the file should be saved automatically; otherwise an error should be reported.
Some pages use a handler to fetch the files, so I want to detect and download the file in that case as well.
localhost/abc.aspx?url=http://www.site.com/page.aspx?fileId=223344
The above may return a PDF file.
What is the best way to handle this?
Thanks
You can download a PDF like this
HttpWebRequest req = (HttpWebRequest)WebRequest.Create(uri);
using (HttpWebResponse response = (HttpWebResponse)req.GetResponse())
{
//check the filetype returned; ContentType may carry parameters such as "; charset=..."
string fileType = null;
string contentType = response.ContentType;
if (contentType != null)
{
string[] splitString = contentType.Split(';');
fileType = splitString[0].Trim();
}
//see if its PDF
if (string.Equals(fileType, "application/pdf", StringComparison.OrdinalIgnoreCase))
{
//save it; the response stream is not seekable, so copy it rather than sizing a buffer from stream.Length
using (Stream stream = response.GetResponseStream())
using (FileStream fileStream = File.Create(fileFullPath))
{
stream.CopyTo(fileStream);
}
}
}
The fact that it may come from an .aspx handler doesn't actually matter; it's the MIME type returned in the server response that is used.
If you are getting a generic MIME type, like application/octet-stream, then you must use a more heuristic approach.
Assuming you cannot simply use the file extension (e.g. for .aspx), you can copy the file to a MemoryStream first (see How to get a MemoryStream from a Stream in .NET?). Once you have a memory stream of the file, you can take a 'cheeky' peek at it (I say cheeky because it's not the correct way to parse a PDF file).
I'm not an expert on the PDF format, but I believe reading the first 5 chars with an ASCII reader will yield "%PDF-", so you can identify that with
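bool isPDF;
//start reading from the beginning of the copied data
memoryStream.Position = 0;
//the final 'true' is leaveOpen, so disposing the reader does not close memoryStream
using( StreamReader srAsciiFromStream = new StreamReader(memoryStream,
System.Text.Encoding.ASCII, false, 1024, true)){
string firstLine = srAsciiFromStream.ReadLine();
isPDF = firstLine != null && firstLine.StartsWith("%PDF-", StringComparison.Ordinal);
}
//set the memory stream back to the start so you can save the file
memoryStream.Position = 0;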
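A variation on the same heuristic compares the first five bytes directly instead of going through a text reader, which avoids reading an arbitrarily long "line" from binary data. This is only a sketch under the same assumption that a PDF starts with "%PDF-"; the names PdfSniffSketch and LooksLikePdf are hypothetical.

```csharp
using System;
using System.IO;

static class PdfSniffSketch
{
    // ASCII bytes for "%PDF-"
    private static readonly byte[] PdfMagic = { 0x25, 0x50, 0x44, 0x46, 0x2D };

    // Checks the first bytes of a seekable stream and rewinds it to the start.
    public static bool LooksLikePdf(Stream seekable)
    {
        seekable.Position = 0;
        var header = new byte[PdfMagic.Length];
        int total = 0;
        // Read may return fewer bytes than requested, so loop until done or end of stream.
        while (total < header.Length)
        {
            int read = seekable.Read(header, total, header.Length - total);
            if (read == 0) break;
            total += read;
        }
        seekable.Position = 0; // leave the stream ready to be saved

        if (total < PdfMagic.Length) return false;
        for (int i = 0; i < PdfMagic.Length; i++)
        {
            if (header[i] != PdfMagic[i]) return false;
        }
        return true;
    }
}
```

It can be called on the MemoryStream right after the response has been copied into it, before deciding whether to write the file to disk.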