I want to upload a music file to a database, but I don't know how. Do I need to upload the file to the server first and then load it into the database, or can I do everything in one step? And how can I do it?
There is no shortage of tutorials on how to do this. Here's a decent one. Basically, what you're asking is how to store a file as a binary stream in a database field.
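In C# against SQL Server, that usually looks something like this (a minimal sketch; the Songs table, its columns, and the connection string are invented for illustration):

```csharp
using System;
using System.Data.SqlClient;
using System.IO;

class BlobUploadDemo
{
    static void Main()
    {
        // Hypothetical table: CREATE TABLE Songs (Id INT IDENTITY, Name NVARCHAR(260), Data VARBINARY(MAX))
        byte[] fileBytes = File.ReadAllBytes(@"C:\music\song.mp3");

        using (var connection = new SqlConnection("Server=.;Database=MusicDb;Integrated Security=true"))
        using (var command = new SqlCommand(
            "INSERT INTO Songs (Name, Data) VALUES (@name, @data)", connection))
        {
            command.Parameters.AddWithValue("@name", "song.mp3");
            command.Parameters.AddWithValue("@data", fileBytes); // the binary stream lands in the VARBINARY column
            connection.Open();
            command.ExecuteNonQuery();
        }
    }
}
```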
However, @Saif al Harthi makes a good point in his comment. It's generally considered bad practice to store a binary file in a relational database. Are you sure this is what you want to do? Your server already has a fairly efficient means of storing/retrieving files: the file system. Unless there's a compelling reason to store the file in the database, it's usually better practice to store it on the file system and just write a database record that references the file (path, maybe type, other application-specific data about it, etc.). The file can be renamed to, say, the primary key of the database row, to make it easy to cross-reference the two.
It's a little more work, but it's a little better for the server and makes use of the right tools for the right jobs. That is, of course, unless you have a compelling reason for keeping a binary file in a relational database. If there's a reason, please share it.
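A rough sketch of that file-system approach (again, the Songs table, its columns, and the paths are invented):

```csharp
using System;
using System.Data.SqlClient;
using System.IO;

class FileSystemStorageDemo
{
    static void Main()
    {
        var sourcePath = @"C:\uploads\song.mp3";

        using (var connection = new SqlConnection("Server=.;Database=MusicDb;Integrated Security=true"))
        using (var command = new SqlCommand(
            // SCOPE_IDENTITY() returns the primary key of the row we just inserted
            "INSERT INTO Songs (OriginalName, ContentType) VALUES (@name, @type); SELECT CAST(SCOPE_IDENTITY() AS INT)",
            connection))
        {
            command.Parameters.AddWithValue("@name", Path.GetFileName(sourcePath));
            command.Parameters.AddWithValue("@type", "audio/mpeg");
            connection.Open();
            int id = (int)command.ExecuteScalar();

            // Rename the file to the primary key so the DB row and the file reference each other
            File.Copy(sourcePath, Path.Combine(@"C:\music-store", id + ".mp3"));
        }
    }
}
```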
I am building a WinForms application that needs a database.
The database needs to save an array of items of a custom class:
Name
Date
Duration
Artist
Genre
The first option is to build the database as a file that I save every time the array grows. Would there be a noticeable wait when saving an array of 300 or so items?
The second option is to use SQL.
What is the difference between them, and which should I use?
As someone mentioned in a comment, SQLite should work very well for this type of scenario.
If you think your data set will remain fairly small, you might consider XML, or a file, or something else if you think that would be quicker/easier.
In any case, I would strongly recommend that you hide your storage-logic behind an interface, and call only that from the winforms part of your application. This way you will be able to replace your storage-solution later if you should need to.
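As a sketch of that kind of abstraction (the Track and ITrackStore names are mine, not from the question):

```csharp
using System;
using System.Collections.Generic;
using System.Windows.Forms;

// The item class from the question, roughly.
public class Track
{
    public string Name { get; set; }
    public DateTime Date { get; set; }
    public int Duration { get; set; } // seconds
    public string Artist { get; set; }
    public string Genre { get; set; }
}

// The UI talks only to this interface...
public interface ITrackStore
{
    IList<Track> LoadAll();
    void SaveAll(IList<Track> tracks);
}

// ...so the concrete store (an XML file today, SQLite tomorrow)
// can be swapped without touching any WinForms code.
public class MainForm : Form
{
    private readonly ITrackStore _store;
    private readonly List<Track> _tracks;

    public MainForm(ITrackStore store)
    {
        _store = store;
        _tracks = new List<Track>(_store.LoadAll());
    }

    private void OnSaveClicked(object sender, EventArgs e)
    {
        _store.SaveAll(_tracks);
    }
}
```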
Update in response to comment: The reason for using SQLite instead of another DB system is that SQLite can be integrated directly into your application. Other DBMSs will typically be external systems that you just connect to from within your app.
A quick google search will provide you lots of info, such as this short article about using SQLite within a C# application.
I think you have to think about the future size of your data.
If you know that the data will grow exponentially in the future, I think you should use a database system like SQL Server.
Otherwise, if it is only for a few records, you can use an XML file instead.
If you are using an MS SQL database, you can open a connection while saving your data and write it into the database with a SqlDataAdapter, as sketched below.
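A minimal sketch of that write path (the Tracks table, its columns, and the connection string are assumptions; Id is assumed to be an IDENTITY primary key so the command builder can generate the INSERT):

```csharp
using System;
using System.Data;
using System.Data.SqlClient;

class SqlSaveDemo
{
    static void Main()
    {
        using (var connection = new SqlConnection("Server=.;Database=TracksDb;Integrated Security=true"))
        {
            var adapter = new SqlDataAdapter(
                "SELECT Id, Name, Date, Duration, Artist, Genre FROM Tracks", connection);
            var builder = new SqlCommandBuilder(adapter); // generates the INSERT/UPDATE commands

            var table = new DataTable();
            adapter.Fill(table);

            DataRow row = table.NewRow();
            row["Name"] = "Some Track";
            row["Date"] = DateTime.UtcNow;
            row["Duration"] = 215;   // seconds
            row["Artist"] = "Some Artist";
            row["Genre"] = "Jazz";
            table.Rows.Add(row);

            adapter.Update(table);   // writes the new row back to the database
        }
    }
}
```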
If you are using an XML file instead, you can use the XmlSerializer class to serialize your own business object.
File vs. database? That one is easy. What is a database? It is a file, only with an engine that knows how to manipulate that file.
If you use a plain file, you suddenly need to think "what if?". What if the file gets corrupted during a write? What if the computer shuts down in the middle of a write? A DBMS takes care of these issues with all sorts of mechanisms, such as uncommitted data files; with a plain file you would need to provide those mechanisms yourself.
This is why you should write only non-critical data to a file. User settings are one example: if you lose that file, the user can resize their controls again, but no data is lost. A log file is another good use of a file, because if you lose a log you can live without it. But if you lose months' worth of data...
In your case, I don't know how important the user history is, but 300 items is not a large array. You can use XML by creating a class, marking its properties with XML attributes, and then using the XmlSerializer to serialize your history into XML:
http://msdn.microsoft.com/en-us/library/system.xml.serialization.xmlserializer.aspx
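A minimal round-trip might look like this (the HistoryItem class and file name are invented):

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

public class HistoryItem
{
    [XmlAttribute]
    public string Name { get; set; }

    [XmlAttribute]
    public DateTime Date { get; set; }

    [XmlElement]
    public string Artist { get; set; }
}

class HistoryDemo
{
    static void Main()
    {
        var history = new List<HistoryItem>
        {
            new HistoryItem { Name = "Track 1", Date = DateTime.UtcNow, Artist = "Someone" }
        };

        var serializer = new XmlSerializer(typeof(List<HistoryItem>));

        // Save the whole history in one go...
        using (var stream = File.Create("history.xml"))
            serializer.Serialize(stream, history);

        // ...and load it back later.
        using (var stream = File.OpenRead("history.xml"))
            history = (List<HistoryItem>)serializer.Deserialize(stream);
    }
}
```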
But if the history is going to grow, and you are not planning to age out and delete some of it, look into an RDBMS.
The company I work for is running a C# project that crawls data from around 100 websites, saves it to the DB, and runs some procedures and calculations on that data.
Each of those 100 websites has around 10,000 events, and each event is saved to the DB.
After that, the saved data is aggregated and an XML file is generated for each event, so each of those 10,000 saved events is now represented as an XML file in the DB.
The design looks like this:
1) crawl 100 websites to collect the data and save it to the DB
2) collect the data that was saved to the DB and generate an XML file for each event
3) save the XML files to the DB
The main issue for this post is the retrieval of the saved XML files.
Each XML file is about 1 MB, and considering that there are around 10,000 events, I am not sure SQL Server 2008 R2 is the right option.
I tried Redis, and saving works very well (and fast!), but querying to get those XMLs back works very slowly (even locally, so network traffic won't be the issue).
I was wondering what your thoughts are. Please take into consideration that it is a real-time system, so caching is not an option here.
Any ideas will be welcome.
Thanks.
Instead of using a DB you could try a cloud-based system (Azure blobs or Amazon S3); it seems to be a perfect fit. See this post: azure blob storage effectiveness. It is the same situation, except you have XML files instead of images. You can use a DB for storing the metadata, i.e. the source and event type of the XML and its path in the cloud, but not the data itself.
You may also zip the files. I don't know the exact method off-hand, but it can surely be handled on the client side. Static data is often sent to clients in zipped format by default.
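To make the split concrete, here is a rough sketch using the Azure.Storage.Blobs client (the container name, connection string placeholder, naming scheme, and the gzip step are all assumptions):

```csharp
using System.IO;
using System.IO.Compression;
using Azure.Storage.Blobs;

class BlobEventStore
{
    public static string SaveEventXml(string xml, string eventId)
    {
        using (var buffer = new MemoryStream())
        {
            // gzip the XML before upload to cut storage and transfer costs
            using (var gzip = new GZipStream(buffer, CompressionMode.Compress, leaveOpen: true))
            using (var writer = new StreamWriter(gzip))
                writer.Write(xml);

            buffer.Position = 0;
            var container = new BlobContainerClient("<storage-connection-string>", "events");
            var blobName = eventId + ".xml.gz";
            container.GetBlobClient(blobName).Upload(buffer, overwrite: true);

            // Store only this path (plus source/event type) in the DB, not the XML itself.
            return blobName;
        }
    }
}
```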
Your question is missing some details, such as how long your data needs to remain in the database, and such…
I'd avoid storing XML in the database if you already have the raw data. Why not have an application that queries the database and generates the XML reports on demand? That will save you a lot of space.
10 GB of data per day is something SQL Server 2008 R2 can handle with the right hardware and good structural optimization. You'll need to investigate whether the standard edition will be enough or whether you'll have to use enterprise or data center licenses.
In any case the answer is yes: SQL Server is capable of handling this amount of data, but I'd check other solutions as well to see if it's possible to reduce the costs in any way.
Your basic architecture doesn't seem to be at fault; it's the way you've approached Redis. If you design your key => value scheme right, there is no way retrieval from Redis could be slow.
For example, say I have to store a million objects in Redis, keyed by an id that is nothing but a GUID. The save will be really quick. But when it comes to retrieval, do I know the key? If I know the key, retrieval will be fast; but if I don't know it, or if I am trying to retrieve my data not by its key but by some value inside my objects, then of course it will be slow.
The point is that, for retrieval, you should only ever work against the key and nothing else, so design your key as a pre-calculated value in itself. Then, when I need some data from Redis/memcached, I can construct the key and get the data in a single hit.
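In code, that means the read path is a single lookup against a pre-computable key. A sketch with the StackExchange.Redis client (the key scheme is invented):

```csharp
using StackExchange.Redis;

class RedisEventStore
{
    private static readonly ConnectionMultiplexer Redis =
        ConnectionMultiplexer.Connect("localhost");

    // The key is pre-computable from things the caller already knows,
    // e.g. "event:{websiteId}:{eventId}" -- never searched, only looked up.
    private static string Key(int websiteId, int eventId)
        => $"event:{websiteId}:{eventId}";

    public static void Save(int websiteId, int eventId, string xml)
        => Redis.GetDatabase().StringSet(Key(websiteId, eventId), xml);

    public static string Load(int websiteId, int eventId)
        => Redis.GetDatabase().StringGet(Key(websiteId, eventId));
}
```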
If you could put more details, we'll be able to help you better.
I want to develop an open source library for fast, efficient file storage (everything under one large data file, plus an index file), like NFileStorage. Why do I want to do this?
A. Something like that was needed in my line of work.
B. Our DBA said it's not efficient to store files in the DB.
C. It's good practice for me.
I am looking for a good article on file indexes.
Can you recommend one?
And what is your general opinion of the idea?
It may not be efficient to store files inside a database; however, databases like SQL Server have the concept of FILESTREAM, where the data is actually stored on the local file system instead of inside the database file itself.
In my opinion this is a bad idea for a project.
You are going to run into exactly the same problem that databases have with storing all of the uploaded files inside the same single file... which is why some of them have moved away from this for binary / large objects and instead support alternative methods.
Some of the problems you will have to deal with include:
Allocating additional disk space for your backing file to store newly uploaded documents.
Permanently removing "files" from your storage and resizing / compressing the backing file.
Multi-user access / locks.
Failure recovery. Such as when you encounter a bad block on the drive and it hoses your backing file.
Transactional support.
Items 1 and 2 increase the amount of time it takes to write a "file" to your data store. Items 3, 4 and 5 are already supported by network file systems, so there you're just reinventing the wheel.
In short you're going to have to either write your own file system or write your own DBMS. Neither of which I would consider "good practice" for 99% of real world applications. It might be worthwhile if your goal is to work for Seagate.. But even then they'd probably look at you funny.
If you are truly interested in the most efficient method of file storage, it is quite simply to purchase a SAN array and push your files to it while keeping a pointer to the file/location in your database. It is easy to back up, fast to store files to, much cheaper than spending developer time trying to figure out how to write your own file system, and certainly 100% supported and understandable by future devs.
This kind of product already exists. You should read about MongoDB (http://www.mongodb.org/display/DOCS/Home).
I'm developing a web app in ASP.NET which is mainly about storing, sharing and processing MS Word and PDF files, but I'm not sure how to manage these documents. I was thinking of either keeping the documents in folders and only keeping their metadata in the DB, or keeping the whole documents in the DB. I'm using SQL Server 2008. What's your suggestion?
SQL Server 2008 is reasonably good at storing and serving up large documents (unlike some of the earlier versions), so it is definitely an option. That said, having large blobs being served up from the DB is generally not a great idea. I think you need to think about the advantages and disadvantages of both approaches. Some general things to think about:
How large are the files going to be, and how many of them will there be? It's a lot easier to scale a file system past many TB than it is to do the same for a DB.
How do you want to manage backups? Obviously with a file system approach you'd need to back the files up separately from the DB.
I believe it's probably quicker to implement a solution that stores to the DB, but that storing to the file system is generally the superior solution. In the latter case, however, you will have to worry about some issues, such as having unique file names, and in general not wanting to store too many documents in a single folder (most solutions create new folders after every few thousand documents). Use the quicker approach if the files are not going to be numerous and large, otherwise invest some time in storing on the file system.
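For the unique-name and folder-size issues, a common pattern looks roughly like this (the root path and the three-hex-digit fan-out are arbitrary choices):

```csharp
using System;
using System.IO;

class DocumentStore
{
    // Give every document a collision-free name, and fan documents
    // out across subfolders so no single folder grows too large.
    public static string Store(string rootPath, byte[] content, string extension)
    {
        Guid id = Guid.NewGuid();

        // First three hex digits of the id spread files across up to 4096 subfolders.
        string shard = id.ToString("N").Substring(0, 3);

        string folder = Path.Combine(rootPath, shard);
        Directory.CreateDirectory(folder); // no-op if it already exists

        string path = Path.Combine(folder, id.ToString("N") + extension);
        File.WriteAllBytes(path, content);
        return path; // store this in the DB metadata row
    }
}
```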
In the database unless you don't care about data integrity.
If you store the documents outside of the database, you will have missing documents and broken links sooner rather than later. Your backup/restore scenario also becomes a lot more complex: you have no way to ensure that all data is from the same point in time.
FILESTREAM in SQL Server 2008 makes it efficient nowadays (and other RDBMS have features like this too)
If you're storing these files in one folder, then maintain the file names in the DB, since no directory can contain two files with the same name and extension. If you wish to store the files in the DB, you may have to use a BLOB or byte array.
I see overhead in opening a connection to the DB, though I don't know how fast a DB connection is compared to File.Open (performance-wise).
If the files are relatively small, I would store them as BLOB fields in the database. This way you can use standard procedures for backup/restore, as well as transactions. If the files are large, there are some advantages to keeping them on the hard drive and storing the filenames in the database, as was suggested earlier.
How many documents are you planning to store?
The main advantage of the database approach is the normal ACID properties--the meta-data will always be consistent with the document, which will not be the case if you use the file system to store the documents. Using the file system it would be relatively easy for your meta-data and documents to get out of sync: documents on the file system for which there is no meta-data, meta-data where the document has gone missing or is corrupted. If you need any sort of reliability in your document storage, then the database is a much better approach than using the file system.
If you are only going to operate on those files from your application, I would consider storing them in the DB as BLOB data. If you keep the files in folders with only their names in the DB, you should consider that one day you may need to, for example:
1) rename a file
2) change its location
3) change its extension
or whatever.
With the DB approach, instead, you can save the BLOB data in a separate table, and the name and extension of the file in another table along with the ID of the row in the BLOB table. In the scenarios discussed above, you would then only need to execute a simple SQL UPDATE query, as sketched below.
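For example, renaming a stored file then touches only the metadata table (a sketch; the table and column names are invented):

```csharp
using System.Data.SqlClient;

class FileMetadata
{
    // Hypothetical schema: Files(Id, Name, Extension, BlobId) where BlobId
    // is a foreign key into FileBlobs(Id, Data).
    public static void Rename(int fileId, string newName)
    {
        using (var connection = new SqlConnection("Server=.;Database=DocsDb;Integrated Security=true"))
        using (var command = new SqlCommand(
            "UPDATE Files SET Name = @name WHERE Id = @id", connection))
        {
            command.Parameters.AddWithValue("@name", newName);
            command.Parameters.AddWithValue("@id", fileId);
            connection.Open();
            command.ExecuteNonQuery(); // the BLOB row is untouched
        }
    }
}
```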
Sorry for the bad title.
I'm saving web pages. I currently use one XML file as an index. One element contains the creation date (UTC) and the full URL (with query string and all) of each file. The headers are kept in a separate file with a similar name but a special extension appended.
However, at around 40k files (incl. headers), the XML is now 3.5 MB. Until recently I would read it, add the new entry, and save the XML file on every download; now I keep it in memory and save it every once in a while.
When I request a page, the URL is looked up using XPath on the XML file, and if there is an entry, the file path is returned.
The directory structure is
.\www.host.com/randomFilename.randext
So I am looking for a better way.
I'm thinking of:
One XML file per domain (incl. subdomains). But I feel this might be a hassle.
Using SVN. I just tested it, but I have no experience with large repositories: executing svn add "path to file" for every download, and committing when I'm done.
Creating a custom file system, where I can then include everything I want, e.g. POST data.
Generating a filename from the URL and somehow flattening the query string; but long query strings might be rejected by the OS. And if I keep it with the headers, I still need to keep track of multiple files mapped to each different query string. Hassle. And I don't want it to execute too slowly either.
Multiple program instances will perform read/write operations, on different computers.
If I follow the directory/file method, I could in theory add a layer in between so it uses DotNetZip on the fly. But then again, there is the query string problem.
I'm just looking for direction or experience here.
What I also want is the ability to keep a history of these files, so the local file is not overwritten and I can pick which version (by date) I want. That's why I tried SVN.
I would recommend either a relational database or a version control system.
You might want to use SQL Server 2008's new FILESTREAM feature to store the files themselves in the database.
I would use 2 data stores, one for the raw files and another for indexes.
To store the flat files, I think Berkeley DB is a good choice. The key can be generated by MD5 or another hash function, and you can also compress the content of each file to save some disk space.
For the indexes, you can use a relational database or a more sophisticated text search engine like Lucene.
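Deriving such a key from the URL can be as simple as this (a sketch; hashing also sidesteps the long-query-string problem mentioned in the question):

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

class UrlKey
{
    // Turns any URL (query string included) into a fixed-length,
    // file-system-safe hex key, e.g. "9e107d9d372bb6826bd81d3542a419d6".
    public static string FromUrl(string url)
    {
        using (var md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(Encoding.UTF8.GetBytes(url));
            return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
        }
    }
}
```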