Webbrowser only download pdf and skip everything else - c#

Someone wrote a C# program that cycles through a set of links until it finds something to download. That can be a Word document or a PDF, but we only want the PDF files and want to skip everything else. The server is ASP-based, so the URL does not show that the file is a PDF. The source of the page does show it, however:
type="application/pdf"
The type attribute is set on an embed element.
How can we stop the browser from downloading Word documents and other files, and download only the PDFs?

Make a web request and set
webRequest.Method = "HEAD";
That will download just the headers, which you can then inspect to see if the MIME type is one that you want to download.
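A minimal sketch of that HEAD check, assuming the link itself returns the file. If the link returns an HTML page that only embeds the PDF, the request would have to target the URL in the embed's src attribute instead. The class and method names are made up for illustration.

```csharp
using System;
using System.Net;

static class PdfProbe
{
    // Pure helper: true when a Content-Type header value names a PDF,
    // ignoring case and any parameters such as "; charset=...".
    public static bool IsPdfContentType(string contentType)
    {
        if (string.IsNullOrEmpty(contentType)) return false;
        string mediaType = contentType.Split(';')[0].Trim();
        return string.Equals(mediaType, "application/pdf", StringComparison.OrdinalIgnoreCase);
    }

    // Sends a HEAD request, so only the response headers are transferred,
    // then inspects the Content-Type header.
    public static bool LooksLikePdf(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "HEAD";
        using (var response = (HttpWebResponse)request.GetResponse())
        {
            return IsPdfContentType(response.ContentType);
        }
    }
}
```

The download loop could call PdfProbe.LooksLikePdf(url) for each link and skip the ones that return false. Some servers reject HEAD requests, in which case a GET that is abandoned after reading the headers is the usual fallback.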

Related

Downloading files from 2 stage authentication protected server

I have to download some files from a server that uses two-stage authentication, for a PowerPoint add-in I am trying to develop.
First, I log into the workspace through a browser.
In that browser I can request a .txt file and its contents are displayed in the browser. Great!
In my PowerPoint add-in I then have the following code to download a PowerPoint file and open it:
Globals.ThisAddIn.Application.Presentations.Open(@"https://workspace2.blahblah.com/group/corenarratives/Shared%20Documents/corenarratives/BlankPresentationTemplate.pptx");
This downloads the .pptx file and opens it perfectly. Great!
I then try to download a .txt file with this code:
WebClient wc = new WebClient();
wc.DownloadFile("https://workspace2.blahblah.com/group/corenarratives/Shared%20Documents/corenarratives/rts.txt", @"C:\trev\trev.txt");
But the downloaded file contains an HTML error page.
When I save the .txt file as an .html file and open it in a browser, it redirects me to the workspace login page.
I don't understand why the PowerPoint file opens and the .txt file doesn't.
And how, if possible, can I download the .txt file?
Can anyone help, please?
Thanks

Is it OK to post an answer I would describe as "in progress"?
I know a guy on Twitter who really knows his stuff.
Not a close friend, but someone I have followed for a long time. He wrote Fiddler.
So I stuck my neck out and asked him.
This is what he said:
"Watch your traffic from each scenario with Fiddler. Is PPT sending a Cookie, Auth header, or User-Agent your code needs to send?"
"WebClient isn't based on WinINET/URLMon. PowerPoint downloads (often) are, and that means it gets cookies, UA string, etc."
"PowerPoint has cookies and automatic authentication behaviors inherited from URLMon/WinINET."
Which, if I understand correctly, explains why PowerPoint can download the file while WebClient cannot.
I think.
Update:
I ended up implementing this:
Is it possible to transfer authentication from Webbrowser to WebRequest
HTH
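As a rough illustration of the idea in the quotes above (this is not the code from the linked question), a WebClient can be given the same Cookie and User-Agent headers that the browser sends. The sketch assumes both values were captured from the logged-in browser session, for example by watching its traffic in Fiddler; the class and method names are hypothetical.

```csharp
using System.Net;

static class AuthenticatedDownload
{
    // cookieHeader and userAgent are assumed to come from the browser session.
    public static WebClient CreateClient(string cookieHeader, string userAgent)
    {
        var client = new WebClient();
        client.Headers.Add(HttpRequestHeader.Cookie, cookieHeader);
        client.Headers.Add(HttpRequestHeader.UserAgent, userAgent);
        return client;
    }

    // Downloads a single file with the borrowed session headers.
    public static void Download(string url, string path, string cookieHeader, string userAgent)
    {
        using (var client = CreateClient(cookieHeader, userAgent))
        {
            client.DownloadFile(url, path);
        }
    }
}
```

A fresh client is created per download so that the headers are always set before the request goes out. If the server relies on Windows authentication rather than cookies, this will not be enough and credentials would need to be supplied some other way.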

PDF does not reload on directory browsing

Hi guys, this is my follow-up question, and I think this is the real issue here.
Whenever I access a file through directory browsing (e.g. http://localhost/temp/1.pdf), the browser always renders the old PDF, even after I replace the file with a different one that has the same name. For example, I have 1.pdf with some content; I delete it from the directory, put a new file with different content in its place, and name it 1.pdf. When I then open it in the browser, I still get the previous content, not the new one. This only happens in IE and Opera. Please help, this is a production issue.
Edit:
I found something very strange. Say the PDF is at http://localhost/website/tempfolder/1.pdf. The first PDF I generated shows at that URL. If my code then changes the case of a letter in the URL, for example 't' to 'T', the new PDF shows. But when I switch back to the original casing, expecting the new PDF, the first PDF shows again.

Your browser is probably caching the PDF files.
For IE, you can press Ctrl+F5 to force a reload from the web server.
Or put a random query string in your URL,
e.g. /temp/1.pdf?v=1, /temp/1.pdf?v=2
Assuming your server is IIS, the permanent solution is to configure the HTTP response headers:
Go to IIS Manager
Navigate to your folder
Click "HTTP Response Headers" in Features View
Right-click and select "Set Common Headers"
Check "Expire Web Content" and select "Immediately".
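For the query-string approach mentioned above, one possible sketch (the helper names are invented here) derives the version value from the file's last write time instead of a random number, so the URL changes only when the file on disk is replaced.

```csharp
using System;
using System.IO;

static class CacheBusting
{
    // Pure helper: appends a version value as the "v" query parameter.
    public static string AppendVersion(string url, long version)
    {
        string separator = url.Contains("?") ? "&" : "?";
        return url + separator + "v=" + version;
    }

    // Uses the file's last write time as the version, so a replaced file
    // gets a URL the browser has not cached yet.
    public static string VersionedUrl(string url, string physicalPath)
    {
        return AppendVersion(url, File.GetLastWriteTimeUtc(physicalPath).Ticks);
    }
}
```

This only helps for links that your own pages generate. A user who types /temp/1.pdf directly still depends on the response headers configured in IIS.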

asp.net C# download file in ashx from external link

I need some help downloading a file with an .ashx handler.
I'm trying to download a large file (about 2-4 GB) from an external link (the file is not stored on the web server).
Here is my code:
context.Response.Clear();
context.Response.ContentType = "video/mp4";
context.Response.AppendHeader("Content-Disposition", "attachment; filename=" + FileName);
context.Response.Write("http://otherserver/file.m4v");
context.Response.Flush();
context.Response.Close();
and the downloaded file is 1 KB.
What am I doing wrong?
Is there another way to download the file?
I'm trying to force the browser to download the file (and change the filename) rather than preview it in the browser.
P.S. Sorry for my English ;)

This is an incorrect approach. The file content will be:
http://otherserver/file.m4v
Which you are setting here:
context.Response.Write("http://otherserver/file.m4v");
What you need to use is the HttpWebRequest Class.

All you're doing is sending the browser a file containing the text "http://otherserver/file.m4v", with a header suggesting that the browser offers to download, rather than display, the file.
There's no magic going on in the browser that causes it to say "Oh, I should download whatever's at that URL" when it sees a file with a URL in it.
Moreover, having Googled around a bit and looked at several PHP discussions on this subject, I don't think there's a way to do what you want without literally streaming the file from the remote URL onto your server, and then sending it on from your server to the client.
You could try adding the header and then sending a redirect to the client, but I'd expect the client to discard the header when it makes the request to the remote URL - and so display the result in the browser.
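The relay approach described above, where the handler fetches the remote file and passes it on to the client, might look roughly like this. It is a sketch, not a tested production handler: the class name, the buffer size and the download.m4v filename are assumptions, and the remote URL is the one from the question.

```csharp
using System.IO;
using System.Net;
using System.Web;

public class RelayHandler : IHttpHandler
{
    public bool IsReusable { get { return true; } }

    // Copies source to destination in fixed-size chunks so the whole file
    // is never held in memory. Returns the number of bytes copied.
    public static long CopyInChunks(Stream source, Stream destination, int bufferSize)
    {
        var buffer = new byte[bufferSize];
        long total = 0;
        int read;
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            destination.Write(buffer, 0, read);
            total += read;
        }
        return total;
    }

    public void ProcessRequest(HttpContext context)
    {
        var request = (HttpWebRequest)WebRequest.Create("http://otherserver/file.m4v");
        using (var remote = (HttpWebResponse)request.GetResponse())
        using (Stream source = remote.GetResponseStream())
        {
            context.Response.Clear();
            context.Response.BufferOutput = false; // send chunks as they arrive
            context.Response.ContentType = "video/mp4";
            context.Response.AppendHeader("Content-Disposition",
                "attachment; filename=\"download.m4v\"");
            CopyInChunks(source, context.Response.OutputStream, 64 * 1024);
        }
    }
}
```

Turning off output buffering matters for files of several gigabytes, because otherwise ASP.NET would try to buffer the whole response before sending it. The trade-off is that every byte passes through your server, so its bandwidth is used twice.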

How to identify whether the client machine supports PDF File format

Hi,
My requirement is to show a dynamically created PDF file directly in my web page. It works fine on systems that have PDF reader software installed. But on systems without it, an error like the following is shown:
The XML page cannot be displayed
Cannot view XML input using style sheet. Please correct the error and then click the Refresh button, or try again later.
An invalid character was found in text content. Error processing resource 'http://localhost:4252/OmanePost/Customer/EBox/PD...
I need to handle this situation differently: in that case the file should be saved to a physical location on the system instead. For that, I need to identify whether the client machine has PDF software, so that I can handle each case properly.
I'm using ASP.NET 2.0.

It looks to me that you are serving your PDF with an XML mime/content-type. Make sure you set your content-type to application/pdf and you'll probably get a more suitable browser response.

In this case the browser should ask the user to open the file in an external application.
Please verify that you are sending the correct Content-Type: application/pdf header. Certain versions of Microsoft's browser ignore the content-type header, so you need to specify a filename ending in .pdf in the content disposition header: Content-Disposition: inline; filename=filename.pdf;
Note: I have not verified that it works with "inline" instead of "attachment", but I think it is worth a try.
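In ASP.NET, the two headers suggested above could be set along these lines. This is a sketch under assumptions: pdfBytes stands for the dynamically generated PDF, the helper names are invented, and the filename is assumed to contain no spaces or special characters that would need quoting.

```csharp
using System;
using System.Web;

static class PdfResponse
{
    // Builds the Content-Disposition value, making sure the filename ends
    // in .pdf for browsers that ignore the Content-Type header.
    public static string InlineDisposition(string fileName)
    {
        if (!fileName.EndsWith(".pdf", StringComparison.OrdinalIgnoreCase))
        {
            fileName += ".pdf";
        }
        return "inline; filename=" + fileName;
    }

    // Sends the PDF bytes with an explicit PDF content type.
    public static void WritePdf(HttpResponse response, byte[] pdfBytes, string fileName)
    {
        response.Clear();
        response.ContentType = "application/pdf";
        response.AppendHeader("Content-Disposition", InlineDisposition(fileName));
        response.BinaryWrite(pdfBytes);
        response.End();
    }
}
```

Replacing "inline" with "attachment" in the helper would ask the browser to save the file instead of displaying it, which is closer to the fallback the question describes.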

My requirement is to show a dynamically created pdf file directly to my web page.
Try the online Zoho Viewer, which takes a link to a PDF file and displays it in the browser without requiring a PDF reader on the client machine. There is no way as such to check whether the client machine has a PDF reader.

You cannot detect whether the client system has PDF software using JavaScript, ASP.NET, or C#.

If no PDF reader is installed and the PDF is valid, the browser should not throw an exception. Instead, it asks the user for software on the client machine that can read the file.

Download office document without the web server trying to render it

I'm trying to download an InfoPath template that's hosted on SharePoint. If I hit the URL in Internet Explorer, it asks me where to save it and I get the correct file on my disk. If I try to do this programmatically with WebClient or HttpWebRequest, I get HTML back instead.
How can I make my request so that the web server returns the actual .xsn file and doesn't try to render it as HTML? If Internet Explorer can do this, then it's logical to think that I can too.
I've tried setting the Accept property of the request to application/x-microsoft-InfoPathFormTemplate but that hasn't helped. It was a shot in the dark.

I'd suggest using Fiddler or Wireshark to see exactly how IE sends the request, then duplicating that.

Have you tried spoofing Internet Explorer's User-Agent?
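A sketch of what spoofing the User-Agent could look like with HttpWebRequest. The userAgent value is assumed to be copied from a real Internet Explorer request (for example from a Fiddler capture), and the class and method names are hypothetical.

```csharp
using System.IO;
using System.Net;

static class BrowserLikeRequest
{
    // Creates a request that presents the given User-Agent string.
    public static HttpWebRequest Create(string url, string userAgent)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.UserAgent = userAgent;
        return request;
    }

    // Saves the response body to a file on disk.
    public static void DownloadTo(string url, string userAgent, string path)
    {
        using (var response = (HttpWebResponse)Create(url, userAgent).GetResponse())
        using (Stream body = response.GetResponseStream())
        using (FileStream file = File.Create(path))
        {
            body.CopyTo(file);
        }
    }
}
```

Whether this is enough depends on what the server actually keys on; comparing the captured browser request with your own request header by header is the reliable way to find out.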

There is an HTTP response header that makes an HTTP user agent download a file instead of trying to display it:
Content-Disposition: attachment; filename=paper.doc
I understand that you may not have access to the server, but this is one straight-forward way to do this if you can access the server scripts.
See the HTTP/1.1 specification and/or say, Google, for more details on the header.

This is vb.net, but you should get the point. I've done this with an .aspx page that you pass the filename into, then return the content type of the file and add a header to make it an attachment, which prompts the browser to treat it as such.
Response.AddHeader("Content-Disposition", "attachment;filename=filename.xsn")
Response.ContentType = "application/x-microsoft-InfoPathFormTemplate"
Response.WriteFile(FilePath) ' Where FilePath is the path to your file
Response.Flush()
Response.End()
