NoSQL database selection for a forum - C#

Hi, I am developing a forum using ASP.NET with C#.
I read an article about NoSQL and was impressed by its advantages over an RDBMS (SQL),
so I am wondering whether I should use NoSQL for the forum database. I am not an expert
in databases. Can you suggest whether I should use NoSQL? Currently I am using SQL (an RDBMS).

It depends on what you want to do with your forum.
If you want to store and retrieve user-written messages, then SQL will do fine.
If you want to analyze user relationships (a graph problem), you will want to examine Neo4j.
If you want to store a lot of large documents, but not on the file system, you will want to use NoSQL.
If you expect to change the schema very frequently, NoSQL is the way to go.
Otherwise, stick with SQL.
Since a forum is loosely related to what Twitter does, I would look at what Twitter uses.
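To make the "SQL will do fine" case concrete, here is a minimal sketch of storing and retrieving forum messages in a relational schema. It is written in Python with the standard-library sqlite3 module only so that it runs without any setup; the SQL is the part that carries over to SQL Server and C#. The table names, column names, and helper functions are all assumptions made for illustration.

```python
import sqlite3

# In-memory database for the sketch; a real forum would use a server database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE threads (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE posts (
        id        INTEGER PRIMARY KEY,
        thread_id INTEGER NOT NULL REFERENCES threads(id),
        author    TEXT NOT NULL,
        body      TEXT NOT NULL
    );
""")

def add_thread(title):
    """Insert a thread and return its generated id."""
    return conn.execute("INSERT INTO threads (title) VALUES (?)", (title,)).lastrowid

def add_post(thread_id, author, body):
    """Append a message to a thread using a parameterised query."""
    conn.execute(
        "INSERT INTO posts (thread_id, author, body) VALUES (?, ?, ?)",
        (thread_id, author, body),
    )

def posts_in_thread(thread_id):
    """Return (author, body) pairs in posting order."""
    rows = conn.execute(
        "SELECT author, body FROM posts WHERE thread_id = ? ORDER BY id",
        (thread_id,),
    )
    return rows.fetchall()

tid = add_thread("NoSQL or SQL for a forum?")
add_post(tid, "alice", "Should I switch to NoSQL?")
add_post(tid, "bob", "SQL will do fine for messages.")
```

Calling `posts_in_thread(tid)` returns the two messages in the order they were posted.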

There are a few questions to answer before you make a decision about your database type. Will scalability be an issue? Are you designing your software to be used by hundreds of users concurrently? Also, the previous poster is right about NoSQL offering schema flexibility.
Two main NoSQL products for .NET are RavenDB and FatDB. I'm using the latter with great performance results.

Related

Data layer design creation

I am planning to create a public-facing website, something along the lines of each user having a single profile page which they can maintain and update regularly. On this page the user can upload some pictures and update their personal information.
I have a 3-tier structure in mind.
I need input on creating my data layer. I have read many posts but I am not convinced which particular approach to settle on. I have read about Entity Framework, Microsoft Enterprise Library, plain ADO.NET, etc. Many blogs say that it's best to use plain ADO.NET for better performance.
Could you point out which would be the best approach for my case, where I am looking for fast processing and performance? In terms of technology I am looking at ASP.NET, C#, data calls with WCF, and no MVC.
Also, in the case of plain ADO.NET, is there any ready-to-use library available that I could use to get started?
Thanks
I would not go with plain ADO.NET. If you look at the whole picture, I would consider that a micro-optimization: with caching and smart data structuring you will achieve much more than by using plain ADO.NET.
Entity Framework adds some cost, there's no doubt about it, it is shown here (although it may be outdated):
http://www.servicestack.net/benchmarks/
You could use one of the micro-ORM frameworks mentioned in the benchmark, but micro usually comes at its own cost. For example, most of the micro frameworks I've seen have problems with joins (they allow them in pure SQL, but have no tools for writing them in typed C#).
For example, Stack Overflow has user profiles and uses the micro-ORM Dapper, and their performance is great, because if I remember correctly ~95% of requests are served from a Redis cache, not the database.
If your public profiles will be full-text searchable and you'll have millions of them, maybe a relational database is not the right choice.

Using a NoSQL database as a replacement for SQL Server

I'm developing a website that (if successful) is going to have a rapidly growing database (maybe terabytes or more). Up to now I have always used SQL Server and didn't know anything about NoSQL.
I just found out about NoSQL while researching database size, and now I'm not sure whether it will fulfil my needs. Will I have the same power that I had with SQL Server?
My question may seem silly as I'm a newbie in NoSQL, but I just wanted to know whether it supports SQL queries. If it doesn't, how can we do something like:
select *, (select name from cities where id = cityid) from users
How do you join tables? Is there something like stored procedures, views, or similar features?
That's a big question. NoSQL is a broad term used to describe a range of non-relational data stores. They range from document stores such as MongoDB and RavenDB to key/value stores such as Redis and its variants. They all operate very differently from SQL relational models (and the resulting T-SQL).
Document databases like Mongo or Raven typically have a C# driver that (in most cases) allows you to use LINQ queries across the datastore (Mongo example here on this thread and a RavenDB example on their documentation page). Each driver is specific to its engine, and they differ from one another.
None of these engines is specifically designed to address the 'space' issue you are describing; rather, they aim to offer a fast, low-friction way of interacting with a datastore. All of these data stores will still grow in size in the same way SQL does when you throw massive amounts of data at them. SQL Server will handle massive databases, as will most of the document stores and other NoSQL variants. To be honest, I'd trust SQL Server more than the newer NoSQL stores simply because it has been field tested for longer. However, as already stated, these document stores (and other stores like Apache Cassandra) can all handle large volumes of data.
My only suggestion is to look at how you want to query the data. Document stores typically don't have relational integrity concepts like foreign keys, so normalisation rules do not apply. In addition, you need to assess your reporting needs, as SQL typically has an advantage in this area with more tooling.
You can also choose a hybrid approach, using SQL for your relational data and document stores for other object blobs and the like.
I would suggest looking into how you want to access your data first and then assessing which store best suits your needs. One thing to note is that SQL Server has some great features, but often only in the Enterprise editions, which cost a lot. Document databases tend to cost a LOT less for licensing, some being free, and many companies offer hosting, which removes the need for you to worry about it. Finally, if you go with SQL, I would suggest looking into sharding approaches from the very beginning given the amount of data you will be processing, as this will make it much more manageable and also allow better query performance.
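As a rough illustration of the sharding idea, the simplest form is to route each record to one of several databases by hashing its key. This is a sketch in Python under stated assumptions: the shard names are made up, and hash-modulo routing is only one of several schemes (it is easy to reason about, but adding a shard later means moving most of the data).

```python
import hashlib

# Hypothetical shard names; in practice these would be connection strings.
SHARDS = ["users_shard_0", "users_shard_1", "users_shard_2", "users_shard_3"]

def shard_for(user_id):
    """Pick a shard deterministically from a stable hash of the key."""
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(SHARDS)
    return SHARDS[index]
```

The same `user_id` always maps to the same shard, so reads and writes for one user go to one database, while different users spread across all of them.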
I've used MongoDB quite a bit. I'd suggest signing up for a sandbox account on Mongolabs and playing around with it. There is an excellent C# driver for it too. NoSQL is not really relational, although you can relate documents via IDs. In your example you'd store an array of cities (if I am reading your example correctly) against the User document and query that, or vice versa. There's less concern about data repetition because storage isn't as costly as it used to be. I write my scripts (the equivalent of stored procs) in JavaScript and run them directly against Mongo; it's incredibly flexible and powerful. Of course, if you have tons of related objects, perhaps a relational database is your best bet.
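To show the difference for the query in the question, here is a small side-by-side sketch in Python. The relational half runs the join against an in-memory SQLite database. The document half uses plain dictionaries as a stand-in for documents in a store such as MongoDB, with the city embedded in each user. The `username` column and the sample rows are assumptions made for illustration; only `users`, `cities`, `cityid`, and `name` come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cities (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE users  (id INTEGER PRIMARY KEY, username TEXT, cityid INTEGER);
    INSERT INTO cities VALUES (1, 'Oslo'), (2, 'Lima');
    INSERT INTO users  VALUES (10, 'ann', 1), (11, 'raj', 2);
""")

# Relational: the city name is looked up at query time with a join.
relational = conn.execute("""
    SELECT u.username, c.name
    FROM users AS u
    JOIN cities AS c ON c.id = u.cityid
    ORDER BY u.id
""").fetchall()

# Document style: the city is stored inside each user document, so reading
# a user needs no join. The cost is that the city name is repeated.
user_documents = [
    {"_id": 10, "username": "ann", "city": {"id": 1, "name": "Oslo"}},
    {"_id": 11, "username": "raj", "city": {"id": 2, "name": "Lima"}},
]
embedded = [(doc["username"], doc["city"]["name"]) for doc in user_documents]
```

Both `relational` and `embedded` end up with the same pairs. The trade-off is where the work happens: the join pays at read time, while embedding pays at write time whenever a city is renamed.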

How should I go about warehousing data from different sources?

I am starting on an analytics project that will be getting data from several different sources and comparing them to one another. Sources can be anything from an API such as the Google Analytics API to a locally hosted database.
Should I build a single database to import this data into on a regular basis?
Can anyone suggest some best practices, patterns or articles? I really don't know where to start with this so any information would be great! Thanks!
I will be using SQL Server 2008 R2, C# 4.0.
That's a big question, Mike - plenty of people have entire careers doing nothing but Data Warehousing.
I would give a qualified "yes" to your first question - one of the main attractions of a DWH is that you can consolidate multiple data sources into a single source of information. (The qualification is that there may be circumstances where you don't want to do this - for example, for security or performance reasons.)
As ever, Wikipedia is a reasonable first stop for information on this subject. Since your question is already tagged with data-warehouse, StackOverflow is another possible source.
The canonical books on the subject are probably:
Building the Data Warehouse - WH Inmon
The Data Warehouse Toolkit - Ralph Kimball, Margy Ross
The Data Warehouse Lifecycle Toolkit - Ralph Kimball, Margy Ross, Warren Thornthwaite, Joy Mundy, Bob Becker
Note that the Inmon and Kimball approaches are radically different - Inmon concentrates on a top-down, normalised relational approach to constructing an enterprise DWH, while Kimball's approach is more bottom-up, dimensional, functional datamart-based.
The DWH Toolkit concentrates on the technical aspects of building a DWH, while The DWH Lifecycle Toolkit is based as much on the organisational challenges as on the technical details.
Good luck!
I would start with SSIS, which is a data integration technology that comes with SQL Server. It may handle a lot of the data sources you need. If you are using APIs such as Google's to get data, you may need to put that in a staging table first.
Start with a single staging database which you will use as your primary source to load data into Analysis Services and see how that works out. Use SSIS to populate that staging database.
You need to take the following steps:
1. First, pick an ETL platform such as SSIS, Informatica, or another ETL tool.
2. Then, pick an appropriate database such as Oracle or SQL Server.
3. After that, build the logical data warehouse model (star or snowflake).
4. Finally, develop the whole data warehouse.
I would advise making two databases, i.e.
1. An ODS for storing the data from the different sources and for cleansing, and
2. A warehouse database for storing all the relevant data.
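The staging-then-warehouse flow described above can be sketched in a few lines. This is a toy version in Python with an in-memory SQLite database, not SSIS; the table names, the sample rows, and the cleansing rules (trim and lower-case the page, skip non-numeric counts) are all assumptions chosen to keep the example small.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Staging/ODS: raw rows exactly as they arrive from each source.
    CREATE TABLE staging_visits (source TEXT, page TEXT, visit_date TEXT, visits TEXT);

    -- Warehouse: one dimension and one fact table (a tiny star schema).
    CREATE TABLE dim_page (page_id INTEGER PRIMARY KEY, page TEXT UNIQUE);
    CREATE TABLE fact_visits (
        page_id    INTEGER REFERENCES dim_page(page_id),
        visit_date TEXT,
        source     TEXT,
        visits     INTEGER
    );
""")

raw_rows = [
    ("analytics_api", " /Home ", "2012-01-01", "120"),
    ("local_db",      "/home",   "2012-01-01", "30"),
    ("local_db",      "/about",  "2012-01-01", "n/a"),  # bad value, dropped below
]
conn.executemany("INSERT INTO staging_visits VALUES (?, ?, ?, ?)", raw_rows)

def load_warehouse():
    """Cleanse staged rows and load them into the dimension and fact tables."""
    staged = conn.execute(
        "SELECT source, page, visit_date, visits FROM staging_visits"
    ).fetchall()
    for source, page, visit_date, visits in staged:
        if not visits.isdigit():
            continue                      # skip rows that fail cleansing
        page = page.strip().lower()       # normalise the dimension key
        conn.execute("INSERT OR IGNORE INTO dim_page (page) VALUES (?)", (page,))
        page_id = conn.execute(
            "SELECT page_id FROM dim_page WHERE page = ?", (page,)
        ).fetchone()[0]
        conn.execute(
            "INSERT INTO fact_visits VALUES (?, ?, ?, ?)",
            (page_id, visit_date, source, int(visits)),
        )

load_warehouse()
totals = conn.execute("""
    SELECT d.page, SUM(f.visits)
    FROM fact_visits AS f
    JOIN dim_page AS d ON d.page_id = f.page_id
    GROUP BY d.page
""").fetchall()
```

Once both sources land in the same fact table, comparing them is an ordinary query: `totals` combines the API and local-database counts for the same page.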

Database design and hosting solution

I'm trying to prepare to build a database driven .net application and I have hit a roadblock early on due to my lack of knowledge on this topic. Searching around didn't yield anything so here I am asking for help.
I'm receiving weekly data in XML format that will be added to a database, and reports will then be generated from that data. I have a limited license on the XML files, so only I can download them, and I need to get the results to my end users as well. As far as I can see, I have 2 options:
Feed the data from the XML files into a web-hosted database and then have each user connect to the database.
Upload the XML data to a server, have each user download it and keep a local copy of their own database. I'm thinking this will invalidate my license to the original data.
Things / questions of note:
The database holds weekly sports historical data for about the last 10 years.
I need to limit access to the database to only subscribed users.
I'll need to decide how the database will be built.
I need to decide what kind of hosting I'll need.
As you can see, quite an ambitious project for someone new to this. I haven't asked any specific questions so far:
What kind of hosting solutions shall I look for?
Should I use SQL? (Complete newbie on this subject)
Should I use ClickOnce and then host the application?
Do you have any book or tutorial recommendations that would cover a project like this?
Do I need a script to feed the XML into the database if I go that route? Will that script reside on the server and run automatically even if I'm not there to start it?
I hope the general topic isn't too vague. I tried to actually ask specific questions on it and I'm aware I don't have any code to show as it's just in the early stages of thinking.
The question is a bit vague since you are early on in the decision-making process. However, I do believe that I can offer some help in directing your thinking as you proceed. I think in the situation you are describing, one key thing you should consider is to host your data via JSON/WCF/REST. If you look into these technologies, you will see that there are different ways you can offer your data based upon your developing requirements. For example, how are you going to do authentication? Are you going to allow third-party clients?
What you really don't want to do is allow direct database access, even for authenticated users. Instead, put something in front of it. If you are working in the .NET space, look into all of the different things WCF offers and pick one based upon what fits best. Once you pick that, then you will know what you need for hosting and deployment. Even if you are going to provide the clients as well as the server, this is still a good way to protect your data and provide a way to expand your offering in the future.
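To illustrate "put something in front of it", here is a minimal sketch of a service function that checks the caller's subscription and returns only report data, so clients never hold a database connection. It is plain Python with sqlite3 for the sake of a runnable example, not WCF. The `results` table, the token set, and `weekly_report` are hypothetical, and a real service would verify credentials properly rather than compare against a hard-coded set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE results (week INTEGER, team TEXT, score INTEGER);
    INSERT INTO results VALUES (1, 'Lions', 21), (1, 'Bears', 14), (2, 'Lions', 7);
""")

# Hypothetical subscriber tokens; stands in for real authentication.
SUBSCRIBER_TOKENS = {"token-abc"}

class NotSubscribed(Exception):
    """Raised when the caller has no valid subscription."""

def weekly_report(token, week):
    """Service entry point: authorise first, then run a fixed query."""
    if token not in SUBSCRIBER_TOKENS:
        raise NotSubscribed("subscription required")
    rows = conn.execute(
        "SELECT team, score FROM results WHERE week = ? ORDER BY score DESC",
        (week,),
    ).fetchall()
    # Return plain data (easy to serialise as JSON), never a connection.
    return [{"team": team, "score": score} for team, score in rows]
```

Because clients can only call `weekly_report`, the schema, the raw licensed data, and the credentials for the database all stay on the server.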

How to store my data (C#.net)

I'm having a bit of a problem deciding how to store some data. To see it from a simple perspective, it will be a simple table of data but there will be many tables. There will be about 7 columns in each table, but again there will be a lot of tables (and they will be created at runtime, whenever the customer wants a clean grid)
The data has to be stored locally in a file (and there will not be multiple instances of the software running).
I'm using C# 4.0 and I have been looking at XML files (one file per table, or multiple tables in one file), SQLite, SQL Server CE, Access, etc. I would be happy to hear any comments or suggestions on what to do and what not to do. Stability and reliability (e.g. no trashed databases because of unstable third-party software) is probably my biggest concern.
If you are looking to store the data locally in a file, I would recommend the SQLite option, since it seems your data already has the shape of a database table. SQLite is built to handle multiple tables and columns, so it means less mental overhead for you, the developer.
http://web.archive.org/web/20100208133236/http://www.mikeduncan.com/sqlite-on-dotnet-in-3-mins/ is a decent tutorial to give a quick overview on how to set it up and get going.
As for what NOT to do: don't try to invent your own scheme for saving the data to a file. It's a well-understood problem that has been solved many times over, so why reinvent the wheel?
XML won't be a good choice if you are planning to run many queries, since loading text files may be painful when they grow (talking about files over 1 MB). If you plan to keep the data small, XML would be fine and keeps things simple. I still wouldn't use it, but if you already have a background in XML, the benefits may outweigh the learning curve of another option.
If you have no expertise in any of them and the data is light, my suggestion is SQLite. I believe it is the best lightweight DB for .NET, and the provider is very good. You can find it easily on Google.
I would say that Access is not recommended, but this is a personal opinion. Many people use it, and I assume there is a reason for that, so you should check it out and try it.
Again, my final recommendation is SQLite, unless you know another option very well, in which case you'll have to think about how much your data is going to grow. If you plan to have a DB of around 100 MB, any of them except XML would do; if you think it will grow bigger than that, give SQLite serious consideration.
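For the "many tables created at runtime" part, here is a minimal sketch of how that can look with SQLite. It uses Python's built-in sqlite3 module so it runs as-is; from C# the same SQL would go through a SQLite ADO.NET provider. The seven column names and the helper functions are assumptions. Table names cannot be passed as query parameters, so the sketch validates them before building the statement.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")   # pass a file path instead to persist to disk
COLUMNS = ["c1", "c2", "c3", "c4", "c5", "c6", "c7"]   # hypothetical 7 columns

def _check_name(name):
    """Allow only simple identifiers, since names are spliced into SQL."""
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", name):
        raise ValueError(f"invalid grid name: {name!r}")
    return name

def create_grid(name):
    """Create a fresh table for a new grid if it does not exist yet."""
    column_sql = ", ".join(f"{col} TEXT" for col in COLUMNS)
    with conn:   # commits on success, rolls back on error
        conn.execute(f'CREATE TABLE IF NOT EXISTS "{_check_name(name)}" ({column_sql})')

def add_row(name, values):
    """Insert one row; the values themselves are passed as parameters."""
    if len(values) != len(COLUMNS):
        raise ValueError("expected one value per column")
    placeholders = ", ".join("?" for _ in COLUMNS)
    with conn:
        conn.execute(
            f'INSERT INTO "{_check_name(name)}" VALUES ({placeholders})',
            tuple(values),
        )

def read_rows(name):
    """Return all rows of a grid in insertion order."""
    return conn.execute(
        f'SELECT * FROM "{_check_name(name)}" ORDER BY rowid'
    ).fetchall()
```

An alternative worth considering is a single table with an extra `grid_id` column instead of one table per grid, which avoids building table names dynamically at all.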
