What's the NoSQL equivalent of SQL Server - c#

My project requires a solution to store billions of rows of data with minimal relational data.
The raw data is currently in a text file and looks something like this
Id(), Type(int), Data(Binary data between 1-10MB)
The Id column in the raw text file can be ignored when importing and replaced with a new int, bigint or uniqueidentifier, whichever performs better.
Any suggestions on what I should use and how I should design the database?
Also, the front end will be written in C# with EF4 (or something else; I'm open to all suggestions).

I think you might be interested in a serverless database, such as SQLite or SQL Server Compact.
You do not have to install a server, but you can still query your data using SQL, LINQ, etc.

Windows Azure Storage Services is the closest you're going to get if you're looking for a NoSQL product by Microsoft.
It's a cloud service, and Microsoft doesn't have a separate product that you host yourself.
Windows Azure Storage Services is, however, built on top of MS SQL Server; it is just not exposed through the normal TDS protocol, so you can never access the database in a way that wasn't designed with NoSQL in mind. That doesn't stop you from treating your typical SQL Server database as if it were NoSQL, and if you did, you should be able to scale really well. The idea of NoSQL is just that you don't do things that don't scale horizontally.

http://en.wikipedia.org/wiki/NoSQL
NoSQL is not the equivalent of any RDBMS, so "What's the NoSQL equivalent of MS SQL Server" makes no sense. It should either be "NoSQL vs MS SQL" or not mention NoSQL at all.

There's a provider that gives you the feel of a NoSQL document store on top of SQL Server: http://www.sisodb.com

What is the problem you are attempting to solve with the system?
What type of data are you analysing?
Assuming this is an analysis system rather than a transaction-processing system, there are tools for analysing large data sets that might have the functionality you need without requiring you to write much code, for example the Visualization Toolkit (VTK) from Kitware, www.vtk.org, or MIDAS, http://www.kitware.com/products/midas.html
MIDAS integrates multimedia server technology with Kitware’s open-source data analysis and visualization clients. The server follows open standards for data storage, access and harvesting. MIDAS has been optimized for storing massive collections of scientific data and related metadata and reports. MIDAS is available under a non-restrictive (BSD) open-source license.
Alternatively IBM have OpenDX http://www.research.ibm.com/dx/

I suggest you consider using SQL Server with the FILESTREAM feature for the binary data.
http://technet.microsoft.com/en-us/library/bb933993.aspx
Your question has nothing much to do with NoSQL. Don't go thinking that filestream is the SQL Server "equivalent" of NoSQL!

I think the closest you will get in MS SQL Server is the xml column type, since XML is by definition semi-structured data. You can create a field of type xml and store your document in it, with the binary data encoded as hex or base64 (if storage space is not an issue for you).
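To put a number on the "if storage space is not an issue" caveat, here is a small sketch (in Python, purely for illustration) that measures how much larger a blob becomes once it is text-encoded. The 3 MB sample size and the helper name `encoded_sizes` are assumptions; this only covers the encoding overhead, not whatever the database adds on top.

```python
import base64
import binascii
import os


def encoded_sizes(raw: bytes) -> dict:
    """Return the byte length of the raw data and of its hex and base64 text forms."""
    return {
        "raw": len(raw),
        "hex": len(binascii.hexlify(raw)),     # 2 characters per byte
        "base64": len(base64.b64encode(raw)),  # 4 characters per 3 bytes, padded
    }


# Hypothetical 3 MB blob, inside the 1-10 MB range from the question.
sizes = encoded_sizes(os.urandom(3_000_000))
for name, size in sizes.items():
    print(f"{name:>6}: {size:>9} bytes ({size / sizes['raw']:.2f}x)")
```

Hex doubles the size and base64 adds roughly a third, which matters at billions of rows.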

Related

How can I store a lot of data locally for a program

I am currently building (in C#) a fairly basic point-of-sale program for a local community in Uganda to use in tracking business at their sunflower seed press. I was thinking that I would need some sort of database (like a SQL database), but I've never set up a database before, so I'm wondering what the best way to do this is. Maybe a database isn't the best way. The program will not have internet access, so everything will have to be done locally on the machine.
I think your first step should be working out what data you need to store. Build an Entity Relationship Model and decide what your domain model is going to be. There are many different database engines out there, with different features, installation requirements, etc. A database engine can be installed locally, or on a remote machine that you connect to. If you're writing a C# app, you'll probably want to use the System.Data namespace. You can use plain ADO.NET, or use something like LINQ to Entities to help create proxy classes for your data tables.
You can access a SQL database using the same API for queries and record extraction regardless of the DB engine used. In some cases, you may need a separate library that provides an implementation (or a better one), as in the case of an Oracle database and the Oracle Data Access Components. Right out of the gate, .NET works very well with Microsoft SQL Server, but other options would work.
The choice of database engine is not as important as defining a good set of data tables to represent your data.
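As a rough illustration of "a good set of data tables", here is a minimal sketch of a point-of-sale schema. Everything in it is an assumption: the table and column names (`product`, `sale`, `sale_item`) are hypothetical, and it uses SQLite through Python's standard library only because that runs without any setup. The same table design would carry over to SQL Server Express and ADO.NET.

```python
import sqlite3

# Hypothetical schema: one row per product, per sale, and per line item.
SCHEMA = """
CREATE TABLE product (
    product_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    unit_price INTEGER NOT NULL            -- price in the smallest currency unit
);
CREATE TABLE sale (
    sale_id INTEGER PRIMARY KEY,
    sold_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE sale_item (
    sale_id    INTEGER NOT NULL REFERENCES sale(sale_id),
    product_id INTEGER NOT NULL REFERENCES product(product_id),
    quantity   INTEGER NOT NULL CHECK (quantity > 0),
    PRIMARY KEY (sale_id, product_id)
);
"""


def create_database(path=":memory:"):
    """Open (or create) the database and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce them by default
    conn.executescript(SCHEMA)
    return conn


def sale_total(conn, sale_id):
    """Sum quantity * unit_price over all line items of one sale."""
    row = conn.execute(
        """SELECT COALESCE(SUM(i.quantity * p.unit_price), 0)
           FROM sale_item AS i JOIN product AS p USING (product_id)
           WHERE i.sale_id = ?""",
        (sale_id,),
    ).fetchone()
    return row[0]


conn = create_database()
conn.execute("INSERT INTO product (product_id, name, unit_price) VALUES (1, 'Sunflower oil 1L', 9000)")
conn.execute("INSERT INTO sale (sale_id) VALUES (1)")
conn.execute("INSERT INTO sale_item (sale_id, product_id, quantity) VALUES (1, 1, 3)")
print(sale_total(conn, 1))
```

Keeping line items in their own table means a sale can hold any number of products without changing the schema.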
Yes. If the program handles lots of data, you should consider using a database. You do not need internet access; as long as you have a local network, setting one up is easy:
Set up a database server (SQL Server, for example).
Create your database and install it on the database server.
Build your application and connect to the database through a connection string.
You are on the right track to use a database to store data. It is pretty easy to accomplish. Your computer does not need to be connected to the internet.
SQL Server Express Edition is free with a limit of 10 gigs of data. This will probably be much, much more space than you will need.
From C#, use ADO.NET. It is very simple if you know some SQL. Code samples here.

creating a backend data storage for quick retrieval

I am writing software that stores all the information about a user's interactions in a global session object/class. I would like to store the collected values in persistent storage. However, I cannot use heavy databases such as SQL Server or MySQL on the target PC, as I need to keep the installer as small as possible.
I also need to retrieve values from the storage with simple LINQ queries, etc.
My question is what is the next best thing to databases which can be manipulated by C# code?
Probably either SQLite or SQL Server Compact Edition. These are both fairly full-featured database systems that run entirely in-process and are frequently used for this sort of thing (for example, Firefox uses SQLite to store bookmarks).
The next rung down the ladder of complexity would probably be either XML (using LINQ to XML) or just serialisable objects (using LINQ to Objects). You would of course incur performance penalties compared with a "proper" compact database like SQLite if you started storing a lot of data. However, you would probably need to store more than you think before it became noticeable, and for small data sets the simplicity could even make this faster than SQLite (for example, you could restrict your application to storing the last 100 or so actions).
SQL Server CE and SQLite are popular for the scenario you are describing. XML is as well.
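The "store only the last 100 or so actions in XML" idea can be sketched as follows. This is an illustration in Python's standard library rather than LINQ to XML, and the file name, element names, and helper functions are all hypothetical. It rewrites the whole file on every append, which is acceptable only because the log is capped at a small size.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

MAX_ACTIONS = 100  # cap suggested in the answer above


def append_action(path, description):
    """Append one action to the XML log, keeping only the newest MAX_ACTIONS."""
    if os.path.exists(path):
        root = ET.parse(path).getroot()
    else:
        root = ET.Element("actions")
    ET.SubElement(root, "action").text = description
    for old in list(root)[:-MAX_ACTIONS]:  # everything except the newest entries
        root.remove(old)
    ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)


def load_actions(path):
    """Return the stored action descriptions, oldest first."""
    return [node.text for node in ET.parse(path).getroot().findall("action")]


log_path = os.path.join(tempfile.mkdtemp(), "session.xml")
for n in range(150):
    append_action(log_path, f"clicked button {n}")

recent = load_actions(log_path)
print(len(recent), recent[0], recent[-1])
```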
You could connect to Access MDB files. You don't need an SQL server for this, and it uses the same syntax.
Just need to use OleDb.
Example: DataEasy: Connect to MS Access (.mdb) Files Easily using C#

Do I need to run a server to achieve database-like functionality?

I'm currently trying to develop a program which stores lots of data, similar to an address book, I suppose, with the intent that new data will be periodically added to the program over time.
I know that I could set up a SQL server, and have the program interface with that, but if I want to share my program with other people, I can't guarantee that they'd have access to the server, or that they can set up a server of their own to hold the data.
I also know that I could simply hard-code all of the data into instantiated objects, but that is inelegant, and promises to be incredibly irritating to alter or maintain.
Is there some way I could design the program so that it maintains a database-like structure, yet has no reliance on external programs (such as a SQL server)?
"SQLite is a software library that implements a self-contained, serverless, zero-configuration, transactional SQL database engine."
http://www.sqlite.org/
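To show what "serverless" means in practice, here is a minimal sketch of an address book kept in a single SQLite file. It is written in Python only because SQLite ships with its standard library; the table layout, file name, and helper functions are assumptions for illustration. From C# the same SQL would go through an ADO.NET provider for SQLite.

```python
import os
import sqlite3
import tempfile


def open_address_book(path):
    """Open the database file, creating it and its table on first use."""
    conn = sqlite3.connect(path)  # no server process; the library runs in-process
    conn.execute(
        """CREATE TABLE IF NOT EXISTS contact (
               id    INTEGER PRIMARY KEY,
               name  TEXT NOT NULL,
               phone TEXT
           )"""
    )
    return conn


def add_contact(conn, name, phone):
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO contact (name, phone) VALUES (?, ?)", (name, phone))


def find_contacts(conn, prefix):
    """Return (name, phone) pairs whose name starts with the given prefix."""
    cursor = conn.execute(
        "SELECT name, phone FROM contact WHERE name LIKE ? ORDER BY name",
        (prefix + "%",),
    )
    return cursor.fetchall()


db_path = os.path.join(tempfile.mkdtemp(), "addresses.db")
conn = open_address_book(db_path)
add_contact(conn, "Ada Lovelace", "555-0100")
add_contact(conn, "Alan Turing", "555-0101")
conn.close()

# Reopening the same file shows that the data persisted without any server.
conn = open_address_book(db_path)
matches = find_contacts(conn, "A")
print(matches)
```

The whole database is the one file at `db_path`, so sharing the program means shipping that file (or letting the program create it on first run).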
Sure. You can use XML. XML can also be used as a data source for ASP.NET components, just like a database.
You could use a flat file database like SQLite which is linked in on compile and could be distributed with your code.
You have several options:
Use flat-file database like SQLite (ADO.NET provider)
Save your data into a file in some format like XML or CSV, or use binary serialization (or something more elaborate if you don't want to have all the data in memory)
Have a public SQL server accessible from the internet
Of course, there is always SQL Server Express from MS for this purpose.
No, you do not need a server. The most flexible solution would be to use NHibernate with FluentNHibernate.
There are drivers for file-based databases like SQLite and MS Access, and also for servers like MS SQL, Oracle, ....

C# - XML vs MySQL

The program I'm writing needs frequent database communication, and at the moment I'm using just XML files. Is there really a benefit to using MySQL, or SQL in general, over XML? Just note that I'm using C#, so MySQL is not very fun to deal with (from what little experience I have).
In terms of maintaining data stored in XML files vs. a relational database (MySQL, in your case), the database is far more robust than simple XML files. But this is simply an exercise in determining the needs of your application.
MySQL, like many other RDBMSs, will provide much more than just a place to park your data. The biggest advantage of using a modern database such as MySQL is ACID support. This means you get all-or-nothing transactions, ensuring consistency throughout your data.
You also get referential integrity to ensure that related records stay intact and don't leave you with abandoned references to other data records. We could go on and on to discuss the value of locking or the power of stored procedures.
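A small sketch of what all-or-nothing transactions and referential integrity look like in practice. It uses SQLite via Python's standard library so that it runs without a server; this is a stand-in for MySQL, and the `customer`/`invoice` tables and helper names are hypothetical. The second call fails its foreign-key check, and the customer row inserted in the same transaction is rolled back with it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript(
    """
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE invoice (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id),
        amount      INTEGER NOT NULL
    );
    """
)


def add_customer_with_invoice(conn, customer_id, name, invoice_customer_id, amount):
    """Insert both rows in one transaction; return True only if both were committed."""
    try:
        with conn:  # commit if the block succeeds, roll back if it raises
            conn.execute("INSERT INTO customer (id, name) VALUES (?, ?)", (customer_id, name))
            conn.execute(
                "INSERT INTO invoice (customer_id, amount) VALUES (?, ?)",
                (invoice_customer_id, amount),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def count(conn, table):
    # Table names come from this script only, never from user input.
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


ok = add_customer_with_invoice(conn, 1, "Ada", 1, 500)
bad = add_customer_with_invoice(conn, 2, "Bob", 99, 700)  # invoice references a missing customer
print(ok, bad, count(conn, "customer"), count(conn, "invoice"))
```

Getting the same guarantee with XML files would mean writing your own rollback and cross-reference checks.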
But really, you should consider the needs of your application. If you have to do significant gymnastics to keep your data in order, or you care about shared access and file locks while reading and writing data, you should give up on XML files as your basis. There is no need to find ways around these issues when a basic MySQL database will solve them.
If there's truly relational data...you'll almost always benefit from using a RDBMS. Retrieving data will be faster with the backing of a query engine rather than tying together XML nodes. You'll also get referential integrity when inserting data into the structure.
There is an ADO.NET provider for MySQL, so you shouldn't have any more difficulty dealing with a MySQL database than MS SQL Server.
You could even download DbLinq and give their LINQ to MySQL functionality a shot. Could make things even easier (or you could use Entity Framework with the MySQL ADO.NET provider).
The size of XML documents can be a large factor. With XML you either produce large and complicated text files with a huge amount of additional markup, or your data is split up across several files. Managing these files can be a headache. Using a SQL database will allow you to waste less disk space.
Querying a SQL database is generally faster than searching through XML files.
Any SQL database will give you access to a whole set of permissions and role capabilities that may be difficult to enforce using XML.
If you have relational data, a database would work. As an alternative to MySQL, if you aren't looking for a centralized solution, you can use SQLite. SQLite runs in-process (meaning the program running it is its own "database server") and requires no installation other than distributing the DLL file containing it.
Robert Simpson has written System.Data.SQLite, a SQLite data provider for the .NET Framework. It's free and open source (like SQLite) and works and feels as native as System.Data.SqlClient does. It supports standard ADO.NET conventions, LINQ, and the Entity Framework.
I've used System.Data.SQLite for projects at work, in applications that need to run fast and cache data locally for comparison between multiple runs (data processing and job scheduling). Firefox is a good example of an application using SQLite: Firefox 3 uses SQLite for its cookies, the downloads history, form autocomplete, and most importantly your web browsing history.
Again, SQLite is meant for direct application use and lacks features like user authentication and schema permissions. It has issues if multiple programs try to write to the same database (those can be worked around, but not to the level of what a real RDBMS can do). Its biggest advantage is that it doesn't need to be installed and set up the way MySQL does. In the C# case, all you have to do is reference System.Data.SQLite and copy the .dll file along with your program, and it'll work.

.Net Data Handling Suggestions

I am just beginning to write an application. Part of what it needs to do is to run queries on a database of nutritional information. What I have is the USDA's SR21 Datasets in the form of flat delimited ASCII files.
What I need is advice. I am looking for the best way to import this data into the app and have it easily and quickly queryable at run time. I'll be using it for all the standard things. Populating controls dynamically, Datagrids, calculations, etc. I will also need to do user specific persistent data storage as well. This will not be a commercial app, so hopefully that opens up the possibilities. I am fine with .Net Framework 3.5 so Linq is a possibility when accessing the data (just don't know if it would be the best solution or not). So, what are some suggestions for persistent storage in this scenario? What sort of gotchas should I be watching for? Links to examples are always appreciated of course.
It looks pretty small, so I'd work out an appropriate object model, load the whole lot into memory, and then use LINQ to Objects.
I'm not quite sure what you're asking about in terms of "persistent storage" - aren't you just reading the data? Don't you already have that in the text files? I'm not sure why you'd want to introduce anything else.
I would import the flat files into SQL Server and access via standard ADO.NET functionality. Not only is DB access always better (more robust and powerful) than file I/O as far as data querying and manipulation goes, but you can also take advantage of SQL Server's caching capabilities, especially since this nutritional data won't be changing too often.
If you need to download updated flat files periodically, then look into developing a service that polls for these files and imports into SQL Server automatically.
EDIT: I refer to SQL Server, but feel free to use any DBMS.
My temptation would be to import the data into SQL Server (Express if you aren't looking to deploy the app) as it's a familiar source for me. Alternatively you can probably create an ODBC data source using the text file handler to get you a database-like connection.
I agree that you would benefit from a database, especially for rapid querying, and even more so if you are saving user changes to the data. In order to load the flat file data into a SQL Server (including Express), you can use SSIS.
Use LINQ, or load the text data into a list:
1. Create a list.
2. Read the text file line by line (or all lines at once).
3. Process each line: extract the required data and add it to the list.
4. Process the list for any further use.
The persistent storage will be the files; the List is volatile and lives only in memory.
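The four steps above can be sketched as follows, in Python for illustration (in C# the final filter would be a LINQ to Objects query). This assumes the flat files are caret-delimited with text fields wrapped in tildes; check the dataset's documentation for the real layout. The three-column record, the `Food` class, and the sample rows are made up for the example and are not the actual SR21 schema or figures.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Food:
    food_id: str
    description: str
    kcal_per_100g: float


def load_foods(lines):
    """Steps 1-3: read delimited lines and collect one Food object per row."""
    reader = csv.reader(lines, delimiter="^", quotechar="~")
    return [Food(row[0], row[1], float(row[2])) for row in reader if row]


# Illustrative stand-in for a data file; a real run would pass an open file instead.
SAMPLE = io.StringIO(
    "~01001~^~Butter, salted~^717\n"
    "~09003~^~Apples, raw, with skin~^52\n"
    "~11124~^~Carrots, raw~^41\n"
)

foods = load_foods(SAMPLE)

# Step 4: query the in-memory list.
low_cal = sorted(
    (food for food in foods if food.kcal_per_100g < 100),
    key=lambda food: food.kcal_per_100g,
)
print([food.description for food in low_cal])
```

User-specific data would still need its own persistent store, since the list is rebuilt from the files on every start.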
