Store pdf files in a database - c#

I would like to design a C# application to store employee data for around 500 employees. I also want to store a scanned PDF profile of each employee. I am planning to use PostgreSQL. Is it practical to store the scanned PDF profiles in the database? Do I need to use a BLOB data type?

Assuming the PDFs are not going to be very large (probably less than 5 MB each), it is OK. You should use the BYTEA type for this.
Read more about how to use Npgsql, the .NET PostgreSQL driver (scroll to "Working with binary data and bytea datatype").

Yes, you need to save them as BLOBs (large objects) or in bytea or text columns, and you need to consider PostgreSQL's limits. Large objects are limited to 2 GB per entry and 4 billion per database; bytea and text are limited to 1 GB per entry and 4 billion entries per table. If I were you, though, I would keep the PDFs in the local file system, save only a reference to each file's location in the database, and stream the file when it is needed.
For the PostgreSQL limits, check the following link: http://wiki.postgresql.org/wiki/BinaryFilesInDB

Related

Suggestions on using NoSQL DB as FileStorage, and Pros & Cons

We are evaluating alternatives to our static file storage (which is hosted across multiple geographic locations).
We are on Microsoft.NET platform (C#, ASP.NET, WEB API, SQL SERVER)
We would like to store digital assets, mostly BINARY (AI, PSD, JPG, PNG, PDF, XLS, DOC...) files on any NoSQL DB.
Image files range from thumbnails (small) to original artwork (large: from 300 MB to more than 1 GB).
The thumbnail would appear on the web page, but the original would be available as an attachment with an option to edit (the user could download the original, edit it in the respective program, and upload a new version).
Each thumbnail and original needs to store multiple versions.
We would not be hosting these digital assets on a third-party platform (like Amazon S3 or Azure) or a CDN.
These digital assets could be hosted in different geographic regions based on the user's system configuration (a user in the USA could store on USA-, Europe-, or Asia-based servers/databases).
Each storage needs to be replicated.
We are looking into MongoDB for this. Could anyone suggest pros and cons based on the above assumptions, or any other alternatives?
Some of MongoDB research reveals...
Disk space consumption is 3 times larger than size of raw data
Space consumption could be cut down with the --oplogSize parameter
If we try to read chunks and stream them to the browser, it could be 6 times slower than reading from the static file store.
Replication is not bidirectional and it works as Master and Slave.
I have prototyped reading digital assets from the static file system and storing them in MongoDB GridFS with the default chunk size. What is the better approach for storing thumbnails and originals in MongoDB? A thumbnail would always be less than 16 MB, but an original may or may not exceed 16 MB, so should I store all image assets in GridFS by default?
I could envision creating different databases based on content type, for example: one for PDF, Excel, and Word, another for images.
How can we replicate among different servers?
How can we store assets across MongoDB instances in different regions?
I would really appreciate any input.
Thank you.
"Some of MongoDB research reveals...
Disk space consumption is 3 times larger than size of raw data
Space consumption could be cut down with the --oplogSize parameter
If we try to read chunks and stream them to the browser, it could be 6 times slower than reading from the static file store.
Replication is not bidirectional and it works as Master and Slave."
Did you try to store data yourself, or did you just find this information somewhere? There is always more overhead when you use a database (no matter which) than with plain file storage. Why? Well, you have indexes and meta information.
MongoDB is a shared-nothing, strongly consistent database. You write your data to one node and it then gets replicated. But you can use write concerns (http://docs.mongodb.org/manual/core/write-operations/#write-concern) to wait until your data has been written to a given number of nodes, or a majority of nodes, in a replica set. With replication you can do rolling upgrades without downtime, and it is also very easy to scale using sharding. You can use shard tags to 'pin' documents to specific shards; see here: http://www.kchodorow.com/blog/2012/07/25/controlling-collection-distribution/

Storing references to images in SQL Server using ASP.NET and C#

I am trying to store references of images in a SQL Server DB and store the actual images in a file server/folder. I am hoping someone could give me a link or code example on how to do this. I don't want to store BLOB in the database.
I am using ASP.NET/C# to handle this.
All files in a folder have unique names, so I think you shouldn't worry about storing the path of the image, as suggested in the comments. If you are worried about consistency, i.e. someone deleting a "referenced" image or inserting a path to a nonexistent image file, you could check for that either from your application or even from the database itself.
However, I would not hesitate to use a BLOB. You can use SQL Server 2012 and insert the image files into a FileTable, which sounds quite convenient.
As far as I know from my experience, images are stored in SQL Server in a column of the image data type, in byte format, and that is effectively the reference to the actual image. I hope these links help you get a clearer idea:
http://www.sqlhub.com/2009/03/image-store-in-sql-server-2005-database.html
http://www.codeproject.com/Articles/10861/Storing-and-Retrieving-Images-from-SQL-Server-usin
Use an HttpHandler to grab the image from the database, and use the Image data type:
Retrieve image of image control as byte array which is set using generichandler(.ashx)?
Whether to store images in your database or in a FILESTREAM depends entirely on the size of your images. Microsoft Research has a good paper called To BLOB or Not To BLOB.
After a lot of tests and much analysis, its conclusions are:
If your pictures are below 256 KB, storing them in a database VARBINARY column is good.
If your pictures are over 1 MB, storing them in the file system is good. (With the FILESTREAM attribute, they are still under transactional control and part of your database.)
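The size rule above can be written down as a small helper. This is only a sketch: the 256 KB and 1 MB thresholds come from the answer, while the `StorageChoice` enum and `BlobPlacement` class are hypothetical names invented for illustration, and the middle band is treated as "either works" because the answer gives no recommendation for it.

```csharp
using System;

// Hypothetical names; only the two thresholds come from the answer above.
public enum StorageChoice { DatabaseColumn, EitherWorks, FileSystem }

public static class BlobPlacement
{
    private const long SmallLimit = 256 * 1024;   // below this: VARBINARY column
    private const long LargeLimit = 1024 * 1024;  // above this: file system / FILESTREAM

    public static StorageChoice Choose(long sizeInBytes)
    {
        if (sizeInBytes < 0)
            throw new ArgumentOutOfRangeException(nameof(sizeInBytes));

        if (sizeInBytes < SmallLimit) return StorageChoice.DatabaseColumn;
        if (sizeInBytes > LargeLimit) return StorageChoice.FileSystem;

        // Between the two thresholds the answer makes no recommendation.
        return StorageChoice.EitherWorks;
    }
}
```

For example, `BlobPlacement.Choose(new FileInfo(path).Length)` could decide where an uploaded picture goes before it is saved.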

Save document in SQL Server database

I have a C# / SQL Server project. The database is reachable from three different places (there is no LAN between them), and the data in it is important, so I have been taking a backup of my database every hour for the last 30 days.
The documents I want to save are faxes and Excel, Word, and PDF files; they are unstructured, so it is impossible to extract the data inside them.
The problem is how to store the documents in SQL Server. I don't want to enlarge the database too much, because that increases the backup size.
So what is an efficient solution?
It seems like your main issue is the size of your backup. If you are doing a full backup every hour then you could save space by doing a differential backup instead.
There is no need to backup everything if it hasn't all changed, so you would only need to backup the new data that hadn't been in the last backup.
This would save you a lot of space and time and is generally better practice.
I would suggest you consider implementing a backup rotation scheme. You can find more information on this here:
http://en.wikipedia.org/wiki/Backup_rotation_scheme
I would also suggest you save the file in the filestream data type field in order to reduce the performance impact of having large pages in the mdf file.
If you want to store something, it is going to take up space. You have multiple choices:
Store only the file path in SQL Server, store the files separately on the server, and have a separate backup process for them
Compress files before putting them into SQL Server. It will save you some space, especially with plain-text formats, though it won't help with already compressed formats (.png, Office .docx, .xlsx and so on)
Use FILESTREAM and differential backups (Example)
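The compression option in the list above might look like the following sketch, which uses only `System.IO.Compression` from the base class library. The `BlobCompression` class name is made up for illustration, and GZip is just one possible choice of codec; the resulting byte array is what would go into a varbinary column.

```csharp
using System.IO;
using System.IO.Compression;

// Hypothetical helper: compress a document before storing it, decompress after reading it back.
public static class BlobCompression
{
    public static byte[] Compress(byte[] raw)
    {
        using (var output = new MemoryStream())
        {
            using (var gzip = new GZipStream(output, CompressionLevel.Optimal))
            {
                gzip.Write(raw, 0, raw.Length);
            } // disposing the GZipStream flushes the remaining compressed bytes

            // ToArray still works after the MemoryStream has been closed.
            return output.ToArray();
        }
    }

    public static byte[] Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            gzip.CopyTo(output);
            return output.ToArray();
        }
    }
}
```

As the answer notes, this pays off mainly for plain-text formats; for already compressed files the stored size may not shrink at all.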
Similar question: Store Files in SQL Server or keep them on the File Server?
If you worry about backup size, save the documents in the file system and store only the paths in the DB.
If you worry about backup consistency, store the documents inside the DB.

Should I save images to the database itself as binary information or as a path on the FS?

Background information:
This application is a .NET 4/C# Windows Forms application using SQLite as its backend. There is only one user using the database, and it in no way interacts through a network.
My software needs to save images associated with a Project record. Should I save the image as binary information in the database itself, or should I save the path to the picture on the file system and use that to retrieve it?
My concern with saving a path is that someone might change the filename of a picture, which would essentially break my application.
Can anyone give some suggestions?
"It depends". If there are a lot of images, then all that BLOB weight may make backups increasingly painful (and indeed, may preclude some database implementations that only support limited sizes). But it works, and works well. The file system is fine as long as you only store the path relative to some unknown root, i.e. you store "foo/blah/blip.png", which is combined with configuration data to get the full path - then you can relocate the path easily. File systems have simpler backup options in some cases, but you need to marry the file-system and database backups.
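The relative-path idea can be sketched as follows. The root directory is assumed to come from application configuration (how it is loaded is not shown), and `ImagePaths.Resolve` is a hypothetical helper name; only `System.IO.Path` calls from the base class library are used.

```csharp
using System;
using System.IO;

// Hypothetical helper: the database stores "foo/blah/blip.png",
// the configuration supplies the root, and the two are combined at run time.
public static class ImagePaths
{
    public static string Resolve(string configuredRoot, string storedRelativePath)
    {
        if (Path.IsPathRooted(storedRelativePath))
            throw new ArgumentException("Expected a relative path.", nameof(storedRelativePath));

        // Stored paths use '/', so convert to the platform's separator first.
        string relative = storedRelativePath.Replace('/', Path.DirectorySeparatorChar);
        return Path.GetFullPath(Path.Combine(configuredRoot, relative));
    }
}
```

Moving the image folder then only requires changing the configured root; the rows in the database stay untouched.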
In general, it is better to store them on the filesystem, with a path stored in the DB.
However, Microsoft published a white paper some time ago with research showing that files up to 150K can benefit from being put inside the DB (of course, this only pertains to SQL Server).
The question has been asked here many many times before:
Exact Duplicate: User Images: Database or filesystem storage?
Exact Duplicate: Storing images in database: Yea or nay?
Exact Duplicate: Should I store my images in the database or folders?
Exact Duplicate: Would you store binary data in database or folders?
Exact Duplicate: Store pictures as files or or the database for a web app?
Exact Duplicate: Storing a small number of images: blob or fs?
Exact Duplicate: store image in filesystem or database?
First of all, have you checked the SQLite limits? Even if they are of no concern for your application, I would still choose the FS for storage, simply because of the overhead of getting large BLOBs from the DB versus reading a file from the FS. You can mark the files as read-only and hidden to lessen the chance of them being renamed. You can also store a hash of each file (like MD5) in the DB, so you have a secondary lookup option in case someone does rename a file (of course, they could move it as well, in which case this would not help much).
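The hash-based fallback lookup could be sketched like this. The answer mentions MD5; this sketch swaps in SHA-256, which works the same way for this purpose. `FileFingerprint` and its methods are hypothetical names, and the search only covers a single folder, so a file moved elsewhere would still be missed, as the answer points out.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper: store ComputeSha256Hex(path) next to the path in the DB,
// then use FindByHash when the stored path no longer exists.
public static class FileFingerprint
{
    public static string ComputeSha256Hex(string path)
    {
        using (var sha = SHA256.Create())
        using (var stream = File.OpenRead(path))
        {
            byte[] hash = sha.ComputeHash(stream);
            return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
        }
    }

    // Returns the first file in the folder whose content matches the hash, or null.
    public static string FindByHash(string folder, string expectedHex)
    {
        foreach (string file in Directory.EnumerateFiles(folder))
        {
            if (string.Equals(ComputeSha256Hex(file), expectedHex, StringComparison.OrdinalIgnoreCase))
                return file;
        }
        return null;
    }
}
```

Hashing every file in a folder is slow for large collections, so this is better suited as a repair step than as the normal lookup path.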

Storing and retrieving dynamically created pdf in sql

I have been playing around with creation of pdf documents for a project that I'm working on. I would like to store the generated pdf document in a SQL database and then later be able to retrieve this pdf as well.
What are some suggestions for doing this? Can the document be stored in the database without physically creating the document on the server?
This is again going to bring up the debate for and against storing things on the file system or within SQL Server itself.
It really depends on your needs, the file sizes you are expecting, etc. Here are some references, each with more references.
Storing a file in a database as opposed to the file system?
store image in database or in a system file?
"What are some suggestions for doing this? Can the document be stored in the database without physically creating the document on the server?"
Sure just create the pdf as a byte stream (byte[]) and store it in the database. Depending on what you use to create it, you don't have to write it to the file system.
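A minimal sketch of that idea, assuming the PDF library in use can write to any `Stream`: the `render` delegate below is a stand-in for that library call, and `InMemoryDocument` is a hypothetical name. Nothing is written to disk; the returned array is what would be passed as the parameter value for a varbinary column.

```csharp
using System;
using System.IO;

// Hypothetical helper: capture whatever the PDF generator writes into memory.
public static class InMemoryDocument
{
    // 'render' stands in for the PDF library call that writes the document to a stream.
    public static byte[] Build(Action<Stream> render)
    {
        using (var buffer = new MemoryStream())
        {
            render(buffer);

            // ToArray also works if the generator closed the stream when it finished.
            return buffer.ToArray();
        }
    }
}
```

A call might look like `byte[] pdf = InMemoryDocument.Build(stream => generator.WriteTo(stream));`, where `generator.WriteTo` is a placeholder for the real library's method.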
Actually, on the argument about where to store it: if you have SQL Server 2008, you want to use its FILESTREAM feature. It will store the file on the file system, but you can access it through the database like you would any other data. You get the best of both worlds.
Keep in mind that SQL Server 2008 now has the FILESTREAM data type. You can write the data to the file system, yet still store it in a column.
Save the PDF as a byte[]; then you can use iTextSharp to create the PDF when it is ready for viewing.
You can use a table like this to store files in SQL Server.
CREATE TABLE [Documents]
(
    [FileName] nvarchar(1000),     -- original file name
    [FileContent] varbinary(max)   -- raw bytes of the file
);
You have two ways to do that:
Store the file on a file server and store the file name in the database.
Encode the file and store it in the database.
I recommend the second.
Why do I choose that answer? For security.
It is one of many reasons. A little example:
If you do the first (store the file on the file server), you are creating a folder on your database server, so your server will be vulnerable to attacks or viruses.
If you do the second, the file will be encoded and stored in the database, and you don't need to worry about attacks or machine infections.
I think this is one simple reason never to use the first way.
