Is it possible to add custom metadata to file? - c#

I know that each file has metadata like title, subject, keywords and comments:
But what if I need custom metadata like tags for example?
Is it possible to do it with C#?

I know that each file has metadata like title, subject, keywords and comments
That is not true. Most file types do not have a 'standard' form of metadata.
In particular, PDF files don't have properties that Windows Explorer recognizes.
Metadata (file attributes) is not a function of the filesystem.
Office files use a structured format that allows for such attributes.
Jpeg ues EXIF, a different format

If using NTFS you can store whatever you like in an Alternate data stream

As per Jesper's comment, you can use the DSOFile library to read and write to Custom properties that are stored in ADS.
Works well for me though note the fact that the properties are lost when file is transferred to a different file system, including email.
see http://www.keysolutions.com/blogs/kenyee.nsf/d6plinks/KKYE-79KRU6 for a 64bit implementation, link to MS original and comments.

This will depend on whether the file type you are working with supports this. For example this will not be possible with a text file.

Has anyone thought of using the File ID for this? You can get it via the command
fsutil file queryfileid "C:/my/path/example.txt"
This could be used to store information about this file in a separate storage-file associated with the id.

Related

Determine if an .iso is actually a video/movie in C#

I'm in a situation where I'd like to, using C#, look at .iso files that are in a directory and determine if they are indeed video discs (DVD/BD or similar).
I don't need to actually distinguish the type, just a blanket "yes this is a video disc". Is there a way to do this?
the ISO file is actually a CD Image in file format. The easiest way to determine what is on it is to mount it with a Virtual CD program. Or you can look at the file contents.
Here is the Specifications for ISO files
http://users.telenet.be/it3.consultants.bvba/handouts/ISO9960.html
After you are able to determine what information is on the disk then you can determine if there is video information on it by finding out what the contents of those files are.
That is a much more daunting task then just determining the file structure.
This specification file will only define ISO files. Other cd formats will need to be read using their own Specifications...
You can determine if the file is of type ISO using the header data
Here is a Stack Question explaining in a little more detail.
Using .NET, how can you find the mime type of a file based on the file signature not the extension
EDIT
Looking into the Mime type thing a little more reveals that Microsoft will have to have a registered mime type for that header data. It may not know that it is an ISO and may tell you application/octet-stream If this is the case then you can instead use your own judgement with the same first 256 bytes. Determine some things that tell you that it is an ISO file that you can handle. Usually you can tell what type and version a file is with the first 20 bytes or so.
I did some searching around for a library that you could use to read/write ISO files. You just need the read part obviously and this project is something you could probably use http://discutils.codeplex.com/
As another mentioned, an ISO file contains a file system. The easiest way to read it is to mount it as a virtual drive, using any one of a number of utilities. Once you've mounted it as a drive, then you can determine that it likely contains a movie by inspecting the file system (i.e. using Directory.GetFiles and similar methods in C#).
If you want to read the file's contents directly (without mounting it), I'm not sure what to tell you. I've heard that 7-zip has an API that will let you read the files. You might also check out DiscUtils, which claims to be able to read ISO files.
Once you can read the contents of the file system, see the "Filesystem" section of http://en.wikipedia.org/wiki/DVD-Video. That will tell you what files and directories you should expect to see in the ISO of a DVD movie.
Note that the files' existence is an indication that the image probably contains a DVD movie. There's no way to tell for sure without examining the files' contents individually. Tracking down the specifications for the individual file types might be a more difficult task.
try using IMAPIv2 to interrogate the iso.
This link doesnt do that.. but it should get you started in the right direction.
How to create optical ISO using IMAPIv2

What's the best structure to conserve file related information?

I am building an interface whose primary function would be to act as a file renaming tool (the underlying task here is to manually classify each file within a folder according to rules that describe their content). So far, I have implemented a customized file explorer and a preview window for the files.
I now have to find a way to inform a user if a file has already been renamed (this will show up in the file explorer's listView). The program should be able to read as well as modify that state as the files are renamed. I simply do not know what method is optimal to save this kind of information, as I am not fully used to C#'s potential yet. My initial solution involved text files, but again, I do not know if there should be only one text file for all files and folders or simply a text file per folder indicating the state of its contained items.
A colleague suggested that I use an Excel spreadsheet and then simply import the row or columns corresponding to my query. I tried to find more direct data structures, but again I would feel a lot more comfortable with some outside opinion.
So, what do you think would be the best way to store this kind of data?
PS: There are many thousands of files, all of them TIFF images, located on a remote server to which I have complete access.
I'm not sure what you're asking for, but if you simply want to keep some file's information such as name, date, size etc. you could use the FileInfo class. It is marked as serializable, so that you could easily write an array of them in an xml file by invoking the serialize method of an XmlSerializer.
I am not sure I understand you question. But what I gather you want to basically store the meta-data regarding each file. If this is the case I could make two suggestions.
Store the meta-data in a simple XML file. One XML file per folder if you have multiple folders, the XML file could be a hidden file. Then your custom application can load the file if it exists when you navigate to the folder and present the data to the user.
If you are using NTFS and you know this will always be the case, you can store the meta-data for the file in a file stream. This is not a .NET stream, but a extra stream of data that can be store and moved around with each file without impacting the actual files content. The nice thin about this is that no matter where you move the file, the meta-data will move with the file, as long as it is still on NTFS
Here is more info on the file streams
http://msdn.microsoft.com/en-us/library/aa364404(VS.85).aspx
You could create an object oriented structure and then serialize the root object to a binary file or to an XML file. You could represent just about any structure this way, so you wouldn't have to struggle with the
I do not know if there should be only one text file for all files and folders or simply a text file per folder indicating the state of its contained items.
design issues. You would just have one file containing all of the metadata that you need to store. If you want speedier opening/saving and smaller size, go with binary, and if you want something that other people could open and view and potentially write their own software against, you can use XML.
There's lots of variations on how to do this, but to get you started here is one article from a quick Google:
http://www.codeproject.com/KB/cs/objserial.aspx

Getting the type of file from the ASP.NET FileUpload control?

I want to get the type of file uploaded using the ASP.NET FileUpload control. When I upload a file, I want to be able to get the type of file uploaded, so I can assign a an icon to the file (such as a word, excel, pdf icon).
Here is the problem, I can't go off the file extension because a file could be called test.xxxxxxxx and be a valid pdf file, or a file might not have an extension.
The other option is to read the content-type, but with some of these appear not to be standard or in a simple to read format such as excel files, so is there another option to determine the file type?
I would review the process that names a PDF file "test.xxxxxxxx" without a ".pdf" extension. Most workflows that files have some kind of naming convention (manual or automated)
If you cannot read the extension, the file format will need to be detected by interogating markers that make it a known file format: eg: if the stream starts with or contains "%PDF" its a PDF.
If you don't know the extension, then you would have to know the file format of popular file types and then read the file and see if it matches one of your known formats. That doesn't sound optimal, though.
An easy way to check the true type of a file server side, using System.IO.BinaryReader, is described here:
http://forums.asp.net/post/2680667.aspx
and VB version here:
http://forums.asp.net/post/2681036.aspx
You'll need to know the binary 'codes' for the file type(s) you're checking for, but you can get those by implementing this solution and debugging the code.
Also note that when the BinaryReader is closed with the r.Close() statement, this will make FileUploader.HasFile = False

how to create a custom file extension in C#?

I need help in how to create a custom file extension in my C# app. I created a basic notes management app. Right now I'm saving my notes as .rtf (note1.rtf). I want to be able to create a file extension that only my app understands (like, note.not, maybe)
As a deployment point, you should note that ClickOnce supports file extensions (as long as it isn't in "online only" mode). This makes it a breeze to configure the system to recognise new file extensions.
You can find this in project properties -> Publish -> Options -> File Associations in VS2008. If you don't have VS2008 you can also do it manually, but it isn't fun.
File extensions are an arbitrary choice for your formats, and it's only really dependent on your application registering a certain file extension as a file of a certain type in Windows, upon installation.
Coming up with your own file format usually means you save that format using a format that only your application can parse. It can either be in plain text or binary, and it can even use XML or whatever format, the point is your app should be able to parse it easily.
There are two possible interpretations of your question:
What should be the file format of my documents?
You are saving currently your notes in the RTF format. No matter what file name extension you choose to save them as, any application that understands the RTF format will be able to open your notes, as long as the user knows that it's in RTF and points that app to that file.
If you want to save your documents in a custom file format, so that other applications cannot read them. you need to come up with code that takes the RTF stream produced by the Rich Edit control (I assume that's what you use as editor in your app) and serializes it in a binary stream using your own format.
I personally would not consider this worth the effort...
What is the file name extension of my documents
You are currently saving your documents in RTF format with .rtf file name extension. Other applications are associated with that file extension, so double-clicking on such file in Windows Explorer opens that application instead of your.
If you want to be able to double click your file in Windows Explorer and open your app, you need to change the file name extension you are using AND create the proper association for that extension.
The file extension associations are defined by entries in the registry. You can create these per-machine (in HKLM\Software\Classes) or per-user (in HKCU\Software\Classes), though per-machine is the most common case. For more details about the actual registry entries and links to MSDN documentation and samples, check my answer to this SO question on Vista document icon associations.
I think it's a matter of create the right registry values,
or check this codeproject's article
You can save file with whatever extension you want, just put it in file name when saving file.
I sense that your problem is "How I can save file in something other than RTF?". You'll have to invent your own format, but you actually do not want that. You still can save RTF into file named mynote.not.
I would advise you to keep using format which is readable from other programs. Your users will be thankful once they want to do something with their notes which is not supported by your program.

Is there an easy way to determine the type of a file without knowing the file's extension?

I have a table with a binary column which stores files of a number of different possible filetypes (PDF, BMP, JPEG, WAV, MP3, DOC, MPEG, AVI etc.), but no columns that store either the name or the type of the original file. Is there any easy way for me to process these rows and determine the type of each file stored in the binary column? Preferably it would be a utility that only reads the file headers, so that I don't have to fully extract each file to determine its type.
Clarification: I know that the approach here involves reading just the beginning of each file. I'm looking for a good resource (aka links) that can do this for me without too much fuss. Thanks.
Also, just C#/.NET on Windows, please. I'm not using Linux and can't use Cygwin (doesn't work on Windows CE, among other reasons).
you can use these tools to find the file format.
File Analyser
http://www.softpedia.com/get/Programming/Other-Programming-Files/File-Analyzer.shtml
What Format
http://www.jozy.nl/whatfmt.html
PE file format analyser
http://peid.has.it/
This website may be helpful for you.
http://mark0.net/onlinetrid.aspx
Note:
i have included the download links to make sure that you are getting the right tool name and information.
please verify the source before you download them.
i have used a tool in the past i think it is File Analyser, which will tell you the closest match.
happy tooling.
This is not a complete answer, but a place to start would be a "magic numbers" library. This examines the first few bytes of a file to determine a "magic number", which is compared against a known list of them. This is (at least part) of how the file command on Linux systems works.
Someone else asked a similar question and posted the code used to do exactly this. You should be able to take what is posted here, and slightly modify it so that it pulls from your database.
https://stackoverflow.com/questions/58510
In addition to that, it looks like someone has written a library based off of magic numbers to do this, however, it looks like the site requires registration, and some form of alternate access in order to download this lirbary. The documentation is avaliable for free without registration, that may be helpful.
http://software.topcoder.com/catalog/c_component.jsp?comp=13249160&ver=2
The easiest way I know is to use file command that it is also available in Windows with Cygwin .
A lot of filetypes have well defined headers that begin the file. You could check the first few bytes to check to see how the file begins.
Easiest way to do this would be through access to a *nix (or cygwin) system that has the 'file' command:
$ file visitors.*
visitors.html: HTML document text
visitors.png: PNG image data, 5360 x 2819, 8-bit colormap, non-interlaced
You could write a C# application that piped the first X bytes of each binary column to the file command (using - as the file name)
You need to use some p/invoke interop code to call the SHGetFileInfo method from the Win32 API. This article may also help.

Categories