I'm in charge of building an ASP.NET MVC Document Management System. It has to handle basic document management tasks such as adding, editing and searching entries, and it must also support versioning.
I'm targeting PDF, Office and many image formats as the file attached to each document entry in the database. My question is: what design guidelines do pros follow when building the storage mechanism? Do they store the document files in the file system or in the database? How is file uploading handled?
I used to upload the files to a temporary location while the user was editing the data, and move them to permanent storage when the user confirmed the entry creation. Is this good? Any suggestions for improvement?
Files should generally be stored on a filesystem, rather than a database.
You will, however, have to consider some other things:
Are you planning on ever supporting load balancing, replication, etc. for your system?
If so, you'll need to support saving / loading files from a network location of some sort.
This can be trickier than you may imagine.
Are you planning to secure access to the files?
If so, you'll need to ensure they can't be read by someone who happens to know the URL, e.g. by returning the file as an attachment to a request.
This also prevents user-provided files from being executed on your server, e.g. someone uploading an .aspx or .exe file and then accessing it.
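As a rough illustration of the last two points, here is a minimal, language-neutral sketch in Python (the same idea carries over to an ASP.NET controller). The allowlist, the function names and the header choices are my own assumptions, not something prescribed above: the file is stored under a generated name, and it is sent back with headers that make the browser download it rather than render or run it.

```python
import uuid
from pathlib import PurePosixPath

# Hypothetical allowlist; adjust it to the formats the system accepts.
ALLOWED_EXTENSIONS = {".pdf", ".docx", ".xlsx", ".png", ".jpg"}


def stored_name_for(original_name: str) -> str:
    """Return a server-side file name that ignores the user-supplied name."""
    ext = PurePosixPath(original_name.replace("\\", "/")).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"extension {ext!r} is not allowed")
    # A random name means the storage location cannot be guessed from the upload.
    return uuid.uuid4().hex + ext


def attachment_headers(original_name: str) -> dict:
    """Headers that ask the browser to download the file instead of rendering it."""
    # Keep only the base name and drop quotes and control characters.
    base = PurePosixPath(original_name.replace("\\", "/")).name
    safe = "".join(c for c in base if c.isprintable() and c != '"')
    return {
        "Content-Type": "application/octet-stream",
        "Content-Disposition": f'attachment; filename="{safe}"',
        "X-Content-Type-Options": "nosniff",
    }
```

The original name would live in the database next to the generated one, so the user still sees a familiar file name on download. Non-ASCII file names need extra encoding in the header, which this sketch leaves out.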
I am using the Azure Storage File Shares client library for .NET to save files in the cloud, read them and so on. I have a file in the storage that is supposed to be updated every time I perform a specific action in my code.
The way I'm doing it now is by downloading the file from the storage using
ShareFileDownloadInfo download = file.Download();
I then edit the file locally and upload it back to the storage.
The problem is that the file can be updated frequently, which means lots of downloads and uploads of a file that keeps growing in size.
Is there a better way of editing a file on Azure storage? Maybe some way to edit the file directly in the storage without the need to download it before editing?
Downloading and uploading the file is the correct way to make edits, given the way you are currently handling the data. If you find yourself doing this often, there are some strategies you could use to reduce traffic:
1. If you are the only one editing the file, you could cache a copy of it locally and apply the updates to that copy before uploading it, instead of downloading it each time.
2. Cache pending updates and only update the file at regular intervals instead of with each change.
3. Break the single file up into multiple time-boxed files, say one per hour. This wouldn't help with frequency, but it can help with size.
FYI, when pushing logs to storage, many Azure services use a combination of #2 and #3 to minimize traffic.
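To make the combination of batching and time-boxing concrete, here is a small sketch in Python. It deliberately does not call the Azure SDK: `upload` is a hypothetical callback standing in for whatever actually appends bytes to a remote file, and the class name, file prefix and interval are assumptions of mine.

```python
import time
from datetime import datetime, timezone
from typing import Callable, List


def hourly_name(prefix: str, when: datetime) -> str:
    """Time-boxed file name, one file per hour."""
    return f"{prefix}-{when:%Y%m%d-%H}.log"


class BufferedWriter:
    """Collects updates in memory and hands them to `upload` in batches."""

    def __init__(self, upload: Callable[[str, bytes], None],
                 prefix: str = "activity",
                 interval_seconds: float = 60.0,
                 clock: Callable[[], float] = time.time):
        self._upload = upload        # hypothetical: appends bytes to a named remote file
        self._prefix = prefix
        self._interval = interval_seconds
        self._clock = clock          # injectable so the timing can be tested
        self._pending: List[str] = []
        self._last_flush = clock()

    def add(self, line: str) -> None:
        """Queue one update; upload only when the interval has elapsed."""
        self._pending.append(line)
        if self._clock() - self._last_flush >= self._interval:
            self.flush()

    def flush(self) -> None:
        """Send everything queued so far to the file for the current hour."""
        now = self._clock()
        if self._pending:
            stamp = datetime.fromtimestamp(now, tz=timezone.utc)
            payload = "".join(line + "\n" for line in self._pending).encode("utf-8")
            self._upload(hourly_name(self._prefix, stamp), payload)
            self._pending.clear()
        self._last_flush = now
```

A batch is named after the hour in which it is flushed, so a few lines may land in the next hour's file. Calling `flush()` on shutdown avoids losing whatever is still queued.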
I know how to store images in a db (convert them to byte[] and then save them) and also how to retrieve them (select the byte[] from the db and use image methods to create an image from it). I'm cool so far, but how can I save/retrieve a PDF to and from the database? What about .doc, .mp3, .exe and, say, .ppt files?
Is there a general way to save and retrieve files to and from SQL Server? The worst part is retrieving: let's imagine we found a way to save any file to SQL Server; how can we rebuild the file from the db if we don't know what the file extension was before saving?
Well, generally speaking, it's considered bad practice to save actual files to the database.
Part of the reason is the problems you mentioned in your post, and an even bigger part is that saving files directly to the database has a large overhead (such as translating an image to a byte array and back).
The easy (and, in most cases, recommended) way to handle files and databases is to save the files directly to the file system, and keep the path in the database along with other file-related data such as the id of the user who uploaded the file.
This way you don't need to worry about breaking up and rebuilding the files; you just send them to the server and back to the user as is.
Keep in mind that it's not recommended to keep the full path of the file, only a relative path.
What I normally do is save all files from the users either on the server itself or on the user's computer (in a desktop application). In either case, there is a dedicated folder with only read/write permissions (NEVER let a user save a file into a directory with execute permissions!), and I keep the path of this directory either in a 'General Params' table in the database or in the configuration file of the website / application / web service.
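A minimal sketch of that layout, written in Python with SQLite only so that it is self-contained. The table and column names, the per-user subfolder and the function names are illustrative assumptions; `storage_root` stands for the directory path read from configuration.

```python
import sqlite3
import uuid
from pathlib import Path


def create_schema(db: sqlite3.Connection) -> None:
    """Hypothetical table holding file metadata and the relative path only."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        "id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL, "
        "original_name TEXT NOT NULL, relative_path TEXT NOT NULL)"
    )


def save_upload(db: sqlite3.Connection, storage_root: Path,
                user_id: int, original_name: str, data: bytes) -> str:
    """Write the bytes under storage_root and record only the relative path."""
    ext = Path(original_name).suffix.lower()
    relative = f"{user_id}/{uuid.uuid4().hex}{ext}"  # no root, forward slashes
    target = storage_root / relative
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    db.execute(
        "INSERT INTO files (user_id, original_name, relative_path) VALUES (?, ?, ?)",
        (user_id, original_name, relative),
    )
    db.commit()
    return relative


def load_upload(storage_root: Path, relative: str) -> bytes:
    """Combine the configured root with the stored relative path."""
    return (storage_root / relative).read_bytes()
```

Because only the relative part is stored, moving the whole folder to another disk or server means changing one configuration value rather than rewriting every row.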
Well, the file attributes (name, extension, author, etc.) are usually kept in a relational way, in a table inside SQL Server. The file itself should be kept in the SQL Server database; exactly how depends on the version of SQL Server and the size of the file. Use the FILESTREAM or FILETABLES feature for larger files, or VARBINARY(MAX) for smaller ones.
It doesn't matter whether it's an image, a doc or a PDF: if you can read it into a FileStream, you can save it to the database.
The advantages of storing files in a database are simplified management, backups, security and integrity. With the FILESTREAM and FILETABLES features, accessing a file is almost the same as if it were on a file system, using the SqlFileStream object from .NET.
See more here:
http://technet.microsoft.com/en-us/library/gg471497.aspx
And here:
http://technet.microsoft.com/en-us/library/ff929144.aspx
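To show why the file type does not matter once the attributes are stored relationally, here is a small sketch in Python. It uses SQLite's BLOB type purely as a stand-in for VARBINARY(MAX), so it says nothing about FILESTREAM itself; the table and function names are assumptions.

```python
import sqlite3
from pathlib import Path
from typing import Tuple


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the hypothetical documents table."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS documents ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, extension TEXT NOT NULL, "
        "author TEXT, content BLOB NOT NULL)"
    )
    return db


def put_document(db: sqlite3.Connection, file_name: str,
                 author: str, content: bytes) -> int:
    """Store the raw bytes plus the attributes needed to rebuild the file."""
    p = Path(file_name)
    cur = db.execute(
        "INSERT INTO documents (name, extension, author, content) VALUES (?, ?, ?, ?)",
        (p.stem, p.suffix.lower(), author, content),
    )
    db.commit()
    return cur.lastrowid


def get_document(db: sqlite3.Connection, doc_id: int) -> Tuple[str, bytes]:
    """Return the rebuilt file name and the bytes exactly as they were stored."""
    row = db.execute(
        "SELECT name, extension, content FROM documents WHERE id = ?", (doc_id,)
    ).fetchone()
    if row is None:
        raise KeyError(doc_id)
    name, extension, content = row
    return name + extension, bytes(content)
```

Storing the extension (or a MIME type) in its own column is what answers the "how do we rebuild the file" worry: the bytes come back untouched, and the attributes say what to call them.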
I have a media-centric website that requires us to upload large images and videos to the media library.
I have the default values for the following settings in web.config:
Media.MaxSizeInDatabase (20MB)
httpRuntime maxRequestLength
I do not want to increase MaxSizeInDatabase limit on the production server for security reasons.
Also, Media.UploadAsFiles is set to false.
So, my question is: is there a way to configure Sitecore so that if the file being uploaded is smaller than 20MB it gets stored in the database, and files larger than 20MB get stored on the file system?
As Martijn says, there is nothing built in to automatically detect this, but if you know that the file is going to be large (or the upload fails due to the large size) then you can manually "force it" to save to file on a per upload basis.
You need to use the Advanced Upload option and select the "Upload as Files" option.
EDIT: If you are able to use YouTube, then consider the following modules, which are nicely and tightly integrated with Sitecore. There are a couple of other ways of achieving the same thing for different providers.
YouTube Integration
YouTube Uploader
No, not that I know of. At least not automatically. Uploaded files are either stored in the DB or on the filesystem, based on your setting.
You might want to create an override upload method which could automatically handle this for you or use the manual checkbox in the Advanced Media Upload method as Jammykam says.
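The decision such an override would have to make is simple enough to sketch. This is plain Python and not Sitecore code: the two `save_to_*` callbacks are hypothetical stand-ins for the real storage calls, and treating a file of exactly 20MB as "database" is my own assumption, since the question leaves that case open.

```python
from typing import Callable

# Mirrors the 20MB Media.MaxSizeInDatabase value mentioned in the question.
MAX_SIZE_IN_DATABASE = 20 * 1024 * 1024


def choose_storage(size_in_bytes: int, limit: int = MAX_SIZE_IN_DATABASE) -> str:
    """Pick a target by size; files at or below the limit go to the database."""
    return "database" if size_in_bytes <= limit else "filesystem"


def store_upload(name: str, data: bytes,
                 save_to_database: Callable[[str, bytes], None],
                 save_to_filesystem: Callable[[str, bytes], None],
                 limit: int = MAX_SIZE_IN_DATABASE) -> str:
    """Route one upload to the matching (hypothetical) storage callback."""
    target = choose_storage(len(data), limit)
    if target == "database":
        save_to_database(name, data)
    else:
        save_to_filesystem(name, data)
    return target
```

Note that the request size limit (httpRuntime maxRequestLength) still applies before any such routing runs, so large uploads have to be allowed through at that level first.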
Background information:
This application is .NET 4/C# Windows Forms using SQLite as its backend.
There is only one user using the database and in no way does it interact through a network.
My software needs to save images associated with a Project record. Should I save the image as binary information in the database itself, or should I save the path to the picture on the file system and use that to retrieve it?
My concern with saving the path is that someone might change the filename of a picture, and that would essentially break my application's use of it.
Can anyone give some suggestions?
"It depends". If there are a lot of images, then all that BLOB weight may make backups increasingly painful (and indeed, may preclude some database implementations that only support limited sizes). But it works, and works well. The file system is fine as long as you only store the path relative to some unknown root, i.e. you store "foo/blah/blip.png", which is combined with configuration data to get the full path - then you can relocate the path easily. File systems have simpler backup options in some cases, but you need to marry the file-system and database backups.
In general, it is better to store them on the filesystem, with a path stored in the DB.
However, Microsoft published a white paper some time ago with research showing that files up to 150K can benefit from being put inside the DB (of course, this only pertains to SQL Server).
The question has been asked here many many times before:
Exact Duplicate: User Images: Database or filesystem storage?
Exact Duplicate: Storing images in database: Yea or nay?
Exact Duplicate: Should I store my images in the database or folders?
Exact Duplicate: Would you store binary data in database or folders?
Exact Duplicate: Store pictures as files or in the database for a web app?
Exact Duplicate: Storing a small number of images: blob or fs?
Exact Duplicate: store image in filesystem or database?
First of all, have you checked the SQLite limits? Even if they are of no concern for your application, I would still choose the FS for storage, simply due to the overhead of getting large BLOBs from the DB vs. reading a file from the FS. You can mark the files as read-only and hidden to lessen the chance of them being renamed. You can also store a hash of the file (like MD5) in the DB so you have a secondary lookup option in case someone does rename the file (of course, they could move it as well, in which case this would not help much).
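The hash-based fallback could look roughly like this. It is a Python sketch with made-up function names, and it uses SHA-256 where the answer mentions MD5; either works for spotting a renamed file, as long as the same algorithm is used when storing and when searching.

```python
import hashlib
from pathlib import Path
from typing import Optional


def file_digest(path: Path) -> str:
    """SHA-256 of the file contents, read in chunks to keep memory use flat."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def locate(root: Path, relative_path: str, expected_digest: str) -> Optional[Path]:
    """Try the stored path first, then fall back to a search by content hash."""
    candidate = root / relative_path
    if candidate.is_file():
        return candidate
    # The file was renamed or moved inside root: look for matching content.
    for other in sorted(root.rglob("*")):
        if other.is_file() and file_digest(other) == expected_digest:
            return other
    return None
```

The fallback scan hashes every file under the root, so it is only reasonable for a modest number of images, and it cannot help if the file was moved outside the root or edited.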
I am writing a website to consolidate a bunch of XML files with data into one MySQL database. I need to have a way to allow users to select a directory on their computer that contains a bunch of xml files. The site then reads each of those files and takes care of consolidating the information.
Is there a simple way (like the default open file dialog for WinForms and WPF) to bring up a file dialog on a user's computer, let the user pick a directory, and then be able to read the XML files in the selected directory? Would I have to upload them to the site temporarily first? Or could I just access them on the user's computer?
Thanks!!
A web server can't access files on the user's computer directly. You would need to write an ActiveX control if you really can't find another way.
The standards-conformant way is simply to upload one or more files with the browser file upload:
http://msdn.microsoft.com/en-us/library/system.web.ui.webcontrols.fileupload.aspx
I would suggest that the user should zip the files and just upload the zip file.
There are some hacks, but I don't think they fit:
http://the-stickman.com/web-development/javascript/upload-multiple-files-with-a-single-file-element/
http://dotnetslackers.com/articles/aspnet/Upload_multiple_files_using_the_HtmlInputFile_control.aspx
I think you have to have a web dialog to upload the files to a temp location like you already mentioned and do the consolidation there before committing to your database. Or, maybe you can do the consolidation in JavaScript in the user's browser instance.
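Putting the zip suggestion and the temp-location idea together, a server-side sketch might look like the following. It is Python rather than ASP.NET, the `record` element name is purely hypothetical (the real XML layout is not given in the question), and writing the rows to MySQL is left out.

```python
import io
import tempfile
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import Dict, List


def consolidate_zip(zip_bytes: bytes) -> List[Dict[str, str]]:
    """Extract the XML files from an uploaded zip into a temp folder and merge them."""
    records: List[Dict[str, str]] = []
    with tempfile.TemporaryDirectory() as tmp:
        tmp_root = Path(tmp)
        extracted: List[Path] = []
        with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
            for info in archive.infolist():
                if info.is_dir() or not info.filename.lower().endswith(".xml"):
                    continue
                # Keep only the base name so an entry cannot escape the temp folder.
                target = tmp_root / Path(info.filename).name
                target.write_bytes(archive.read(info))
                extracted.append(target)
        for xml_file in sorted(set(extracted)):
            # For untrusted uploads, a hardened XML parser would be the safer choice.
            root = ET.parse(xml_file).getroot()
            for item in root.iter("record"):  # hypothetical element name
                row = {child.tag: (child.text or "").strip() for child in item}
                row["source_file"] = xml_file.name
                records.append(row)
    # The temp folder is gone here; only the consolidated rows remain.
    return records
```

Flattening to base names means two files with the same name in different folders of the zip would overwrite each other; if that can happen, keep a sanitized version of the folder structure instead. The returned rows can then be inserted into the database in a single transaction, so nothing is committed if one of the files fails to parse.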