LINQ - Select all in parent-child hierarchy - c#

I was wondering if there is a neat way to do this, that DOESN'T use any kind of while loop or similar, preferably that would run against Linq to Entities as a single SQL round-trip, and also against Linq To Objects.
I have an entity - Forum - that has a parent-child relationship going on. That is, a Forum may (or in the case of the top level, may not) have a ParentForum, and may have many ChildForums. A Forum then contains many Posts.
What I'm after here is a way to get all the Posts from a tree of Forums - i.e. the Forum in question, and all its children, grandchildren etc. I don't know in advance how many sub-levels the Forum in question may have.
(Note - I know this example isn't necessarily a valuable use case, but the Forum object model is one that is familiar to most people, and so serves as a generic and accessible premise rather than my actual domain model.)

One possible way would be to store your actual data tables as a left/right tree (example here: http://www.sitepoint.com/hierarchical-data-database-2/ ; note that the example is in MySQL/PHP, but it's trivial to implement elsewhere). Using this, you can find all forums that fall within a parent's left/right values and, given that, retrieve all posts whose forum ID is IN those forum IDs.
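As a rough illustration of the left/right idea, here is a minimal sketch. Python and an in-memory SQLite database are used only so that it runs on its own; the table names, the lft/rgt column names and the sample rows are all made up for the example, and the SELECT itself should carry over to other databases with little change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE forums (id INTEGER PRIMARY KEY, name TEXT, lft INTEGER, rgt INTEGER);
CREATE TABLE posts  (id INTEGER PRIMARY KEY, forum_id INTEGER REFERENCES forums(id), title TEXT);

-- Hypothetical tree: General(1,8) contains C#(2,5) and SQL(6,7); C# contains LINQ(3,4).
-- Off-topic(9,10) is a separate root.
INSERT INTO forums VALUES
    (1, 'General', 1, 8), (2, 'C#', 2, 5), (3, 'LINQ', 3, 4),
    (4, 'SQL', 6, 7), (5, 'Off-topic', 9, 10);
INSERT INTO posts VALUES
    (1, 1, 'Welcome'), (2, 2, 'Generics'), (3, 3, 'Deferred execution'),
    (4, 4, 'Joins'), (5, 5, 'Cats');
""")

def posts_in_subtree(conn, forum_id):
    """Return post titles for a forum and all of its descendants in one query."""
    rows = conn.execute(
        """
        SELECT p.title
        FROM forums AS root
        JOIN forums AS f ON f.lft BETWEEN root.lft AND root.rgt
        JOIN posts  AS p ON p.forum_id = f.id
        WHERE root.id = ?
        ORDER BY p.id
        """,
        (forum_id,),
    ).fetchall()
    return [title for (title,) in rows]

print(posts_in_subtree(conn, 2))  # posts in 'C#' and its child 'LINQ'
```

The trade-off is that reads are a single range join with no recursion, while inserting or moving a forum means renumbering the left/right values of other rows.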

I'm sure you might get a few proper answers regarding the Linq queries. I'm posting this as an advisory when it comes to the SQL side of things.
I had a similar issue with a virtual filesystem in SQL. I needed to be able to query files in folders recursively - with folders, of course, having a recursive parent-child relationship. I also needed it to be fast, and I certainly didn't want to be dropping back to client-side processing.
For performance I ended up writing stored procedures and inline functions - unfortunately much too complicated to post here (and I might get the sack for sharing company code!). The key, however, was to learn how to work with Recursive CTEs http://msdn.microsoft.com/en-us/library/ms186243.aspx. It took me a few days to nail it but the performance is incredible (they are very easy to get wrong though - so pay attention to the query plans).
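This is not the code described above, just a minimal sketch of what a recursive CTE looks like for the Forum/Post case in the question. It assumes a made-up schema (forums with a parent_id column, posts with a forum_id column) and runs against in-memory SQLite from Python so that it is self-contained. SQLite spells it WITH RECURSIVE; SQL Server uses plain WITH for the same construct.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE forums (id INTEGER PRIMARY KEY, parent_id INTEGER REFERENCES forums(id), name TEXT);
CREATE TABLE posts  (id INTEGER PRIMARY KEY, forum_id INTEGER REFERENCES forums(id), title TEXT);
INSERT INTO forums VALUES
    (1, NULL, 'General'), (2, 1, 'C#'), (3, 2, 'LINQ'), (4, 1, 'SQL'), (5, NULL, 'Off-topic');
INSERT INTO posts VALUES
    (1, 1, 'Welcome'), (2, 2, 'Generics'), (3, 3, 'Deferred execution'),
    (4, 4, 'Joins'), (5, 5, 'Cats');
""")

SUBTREE_POSTS = """
WITH RECURSIVE subtree(id) AS (
    SELECT id FROM forums WHERE id = :root           -- anchor member: the starting forum
    UNION ALL
    SELECT f.id                                      -- recursive member: children of rows found so far
    FROM forums AS f
    JOIN subtree AS s ON f.parent_id = s.id
)
SELECT p.title
FROM posts AS p
JOIN subtree AS s ON p.forum_id = s.id
ORDER BY p.id
"""

def posts_under(conn, root_id):
    """Return titles of all posts in the forum root_id or any forum below it."""
    return [title for (title,) in conn.execute(SUBTREE_POSTS, {"root": root_id})]

print(posts_under(conn, 1))
```

The whole tree walk happens inside the database in one round-trip, which is what the question asks for on the SQL side; whether a given LINQ provider can generate such a query is a separate matter.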

Related

translate large queries to linq to Entity framework

I have a few very large queries which I need to convert to LINQ because we are using Entity Framework and I can't use stored procedures (they break compatibility with other databases).
Using a tool like Linqer didn't help, and even if I get it to work with some modifications to the generated LINQ, there is a huge performance issue.
So, what is the best option in a situation like this where EF fails?
Please don't ask me to divide it into small queries, because that's not possible.
Moving this to an "answer" because what I want to say is too long for a comment.
It sounds like you're running into an inherent limitation to ORMs. You won't get perfect performance trying to do everything in code. It sounds like you're trying to use an ORM like a T-SQL interface rather than a mapping between objects and a relational instance of data.
You say you want to maintain compatibility between databases but that's already a nonstarter if you consider schema differences from database to database. If you're already implementing a schema validation step so you ensure your code doesn't break, then there should be no reason why you can't use something like views.
You can say you don't want to support these things all day long but the simple point is that these things exist because they address certain problems. If you wholesale abandon them, then you can't really expect to get rid of the problem. Some things the database simply does better.
So, I think you're expecting something out of the technology that it wasn't meant to solve. You'll need to either reevaluate your strategy or use another tool to accomplish it. I think you may even need a couple different tools.
What you've been doing may have worked when your scale was smaller. I could see such a thing working for quite a while actually. However, it does have a scale limit, and I think you're coming up against it.
I think you need to make a determination on what databases you want to support. Saying "we support all databases" is untenable. Then, compare features and use the ones in common. If it's a MS SQL vs. MySQL thing, then there's no reason why you can't use views or stored procedures.
Check out LinqKit - this is a very useful tool for building up complex large EF queries.
http://www.albahari.com/nutshell/linqkit.aspx

Beginning learning SQL with C#/ASP.NET

Sorry if this has been asked elsewhere, but I couldn't find a clear answer anywhere.
I have decided to begin learning to use relational databases a bit more, namely SQL. This is a real beginner's question, but it's probably essential to get started on.
I'm basically a little confused about the best practice for using SQL (or other databases). At college I have accessed databases (using JSON strings) for things such as mobile apps, but I have never actually designed and built a database myself, as my tutor made the database for us to access.
Let's say I have a C# application that holds genealogy information (i.e. families and their members) and I wanted to store each individual in a database. Would I simply use the structure I already have, but save to fields in a database instead of an XML or text document? Or does it work the other way, i.e. do I create a database with the required fields, then just retrieve this from the database in a C# application and manipulate the data as I wish, so the application would be entirely different (so the C# application basically doesn't hold/store any data and just works on what's fed from the database)?
What's troubling me is that usually where I would store my C# objects in a dictionary or list for example, would I instead just retrieve straight from the database? Or retrieve from the database, store the data in a normal structure and work from there (surely this would defeat the point of fast searching from a database)?
I may be over-thinking it slightly. Hope that makes sense. Thanks in advance
Would I, simply use the structure I already...
or
do I create a database with required fields...
I think that is the crux of your question.
Starting from the database
For me, when building an application that uses a backend database, an Entity-Relationship diagram is pretty crucial. I found quite a nice little tutorial for you here: http://www.sum-it.nl/cursus/dbdesign/english/index.php3 but you can easily find one that suits your learning style. The key point is that you are trying to model the problem domain (the real world out there that needs your application) in a way that your application can somehow capture. Once you have an E-R diagram of related tables, it is easier to figure out the details. Using SQL Management Studio for SQL Server 2008 (Express edition) you can create a few basic tables and build the E-R diagram right there and have it generate relationships for you. You can then, at your leisure, examine the SQL used to achieve that and refine accordingly.
Personally, I always start by examining the problem domain, then I build the E-R diagram, then I build the database. I start building the C# application when I'm reasonably confident the database reflects the problem domain.
Starting from your C# application
However, what really matters is that you model the real world in a meaningful and effective way. In your case you already have a starting point in structures you've created in C# and you can use them to give you a starting point to build the E-R diagram. If you find it easier to get a C# application going and then build a database that reflects it, that should be fine. Perhaps you already have an approach that helps you capture the problem domain effectively. It's an iterative process whatever you do: building the C# code might reveal problems with the underlying database design and vice versa.
Diagramming - E-R or UML?
I'm personally convinced that this whole business is so complicated that you really need some diagrams.
to visualise your database, use an E-R diagram
to visualise your C# application use a UML class diagram
As you head towards a working application, you'll see how these 2 diagrams begin to match or at least reflect each other pretty closely. In both cases (entities or classes), understanding the relationship between objects will be really important when you query the database, because it is crucial to understand relationships between tables (especially using 1-to-many relationships to resolve a complex many-to-many relationship) and various techniques for joining tables in queries (INNER or OUTER joins etc). No matter how clever your C# application is, you will at some point need to understand at least some of the complexities of the SQL language - and it is easier if you can refer to an E-R diagram.
Where to store?
What's troubling me is that usually where I would store my C# objects in a dictionary or list for example, would I instead just retrieve straight from the database?
In the database, without a doubt. A C# class called Family would have a property FamilyName, say, with a setter method built in. If you discover a spelling mistake and want to change the name, the setter method would open a connection to the database, run an UPDATE query with the specified family name, (and probably the family id) as a parameter, and update the underlying field accordingly. Retrieving data would involve running a SELECT query etc.
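To make the setter idea concrete, here is a minimal sketch of a class whose property reads and writes the database directly. It is written in Python against in-memory SQLite purely so that it runs as-is; the family table, its columns and the sample row are assumptions for the example, and a C# class would follow the same shape with a property and a parameterised command.

```python
import sqlite3

class Family:
    """Thin wrapper around one row of a hypothetical family table."""

    def __init__(self, conn, family_id):
        self._conn = conn
        self._id = family_id

    @property
    def family_name(self):
        # Retrieving data is a SELECT keyed on the family id.
        row = self._conn.execute(
            "SELECT family_name FROM family WHERE id = ?", (self._id,)
        ).fetchone()
        return row[0]

    @family_name.setter
    def family_name(self, value):
        # Changing the name is an UPDATE with the new name and the id as parameters.
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute(
                "UPDATE family SET family_name = ? WHERE id = ?", (value, self._id)
            )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE family (id INTEGER PRIMARY KEY, family_name TEXT NOT NULL)")
conn.execute("INSERT INTO family VALUES (1, 'Smyth')")

smiths = Family(conn, 1)
smiths.family_name = "Smith"   # fixes the spelling mistake in the database itself
print(smiths.family_name)
```

Passing the values as parameters rather than building the SQL string by hand keeps the query safe from injection and lets the database reuse the statement.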
Conclusion
Do some tutorials on how to examine a problem domain, create an entity-relationship diagram and build a set of related tables based on the diagram. I'm convinced that way you'll find it much easier to keep track of the C# classes that you build to communicate with the backend database.
Here's an example of a simple E-R diagram for families and their members:
To begin with you might think members and family could be in one table, but then you discover that creates a lot of duplication so you separate that out into family and member table with a one-to-many relationship, but then you realise that, through marriage for instance, people can belong to more than one family and you need to create a many-to-many relationship. I think the E-R diagram is the best place to work out that kind of complexity.
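The end point of that reasoning, a many-to-many relationship resolved through a link table, could look roughly like the sketch below. The table and column names are invented for illustration, and SQLite via Python is used only to keep the example runnable on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE family (id INTEGER PRIMARY KEY, family_name TEXT NOT NULL);
CREATE TABLE member (id INTEGER PRIMARY KEY, first_name TEXT NOT NULL);

-- Link table: one row per (family, member) pair, so a member can belong
-- to several families and a family can have several members.
CREATE TABLE family_member (
    family_id INTEGER NOT NULL REFERENCES family(id),
    member_id INTEGER NOT NULL REFERENCES member(id),
    PRIMARY KEY (family_id, member_id)
);

INSERT INTO family VALUES (1, 'Smith'), (2, 'Jones');
INSERT INTO member VALUES (1, 'Alice'), (2, 'Bob');
-- Alice was born a Smith and married into the Jones family; Bob is a Jones.
INSERT INTO family_member VALUES (1, 1), (2, 1), (2, 2);
""")

def families_of(conn, first_name):
    """Return the names of every family the given member belongs to."""
    rows = conn.execute(
        """
        SELECT f.family_name
        FROM member AS m
        JOIN family_member AS fm ON fm.member_id = m.id
        JOIN family AS f ON f.id = fm.family_id
        WHERE m.first_name = ?
        ORDER BY f.family_name
        """,
        (first_name,),
    )
    return [name for (name,) in rows]

print(families_of(conn, "Alice"))
```

The composite primary key on the link table stops the same person being recorded in the same family twice.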
Not knowing what your structures look like or how your DB will be designed this is hard to answer. But you should be able to use existing data structures, and just pipe the data from the database instead of the XML file.
Look into LINQ to SQL; C# has a strong library for interacting with SQL. It may be a bit confusing at first, but it is very powerful once you learn it.
If I am right you are asking also if you should retrieve all the records from the database and store them as objects in a collection or retrieve selected records from the database and use the dataset results without placing them in a purpose defined structure.
I tend to select the records I want from the database and then load the results into my purpose defined classes / structures. This allows you to add your manipulation methods to the class holding a record result etc. without needing to take in dataset results to each method. However you will find yourself doing singular updates all the time when a batch update might be more efficient... if that makes sense.
Take a look at Entity Framework's Code First. If your data structures are classes in your application, there are techniques to create your database schema from them. As far as the data goes, store it in your database and populate your lists and dictionaries with it, or populate a list of your genealogy individual class with it.
If you want to write your own data classes, there's a free tutorial here written by myself. What I would definitely not do is use the data sources in ASP.NET, as these wizards are the Barty Crouches of the ASP.NET world - they appear good, but turn out to be evil, as inevitably you'll want to be able to tweak them and you won't understand how to do this.

Parsing existing "complex" SQL statements and converting into calls to custom API calls

I have a situation where I have several hundreds of complex excel spreadsheets, each with multiple pivot tables running queries against a sql database. I need to be able to convert these sql queries into function calls against a proprietary data store. This is complicated at many levels, but the part I am asking about now, and seems likely to have been addressed before in computer science, is how to "parse" the sql statements into a well defined structure that I can work with programmatically.
An example of my starting point:
SELECT vwFlowDataBest.MeasurementDate, vwFlowDataBest.LocationType, vwFlowDataBest.ScheduledVolume, tblPoints.Zone, tblPoints.Name AS SOME_ALIAS_FOR_NAME, vwFlowDataBest.PointID, tblCustomerType.Name, vwFlowDataBest.OperationallyAvailable, tblPoints.County, tblPoints.State, tblConnectingParty.Name
FROM Pipe2Pipe.dbo.tblConnectingParty tblConnectingParty, Pipe2Pipe.dbo.tblCustomerType tblCustomerType, Pipe2Pipe.dbo.tblPipelines tblPipelines, Pipe2Pipe.dbo.tblPoints tblPoints, Pipe2Pipe.dbo.vwFlowDataBest vwFlowDataBest
WHERE tblCustomerType.ID = tblPoints.CustomerTypeID AND tblPipelines.ID = vwFlowDataBest.PipelineID AND tblPoints.ID = vwFlowDataBest.PointID AND tblPoints.ConnectingPartyID = tblConnectingParty.ID AND ((tblPipelines.ID=16) AND (vwFlowDataBest.ScheduledVolume<>0) AND (tblPoints.Zone In ('mid 1','mid 2','mid 3','mid 4','mid 5','mid 6','mid 7')) AND (tblCustomerType.ID=16) AND (vwFlowDataBest.MeasurementDate>={ts '2010-05-15 00:00:00'}) AND (tblPipelines.ID<155))
So for this statement, I need to programmatically handle the SELECT portion, the FROM portion, and the WHERE portion, and the subordinates within each. Complications of this are things such as aliases, differentiating between a join between tables and a plain old value filter in the where clause, the grouping (brackets) within the where clause, and other issues. Dealing with the complexities of Excel pivot tables is entirely outside the scope of this question, I can figure that out.
For now, I don't mind not supporting certain sql functions, such as "group by", "having", etc...for my problem, those are small enough that if necessary I can handle those manually. But if there's a known way to handle that as well, I'd be most happy.
My feeling is that I can probably get 70% of the way there (for my problem) just by splitting the sql statement into 3 parts, and then further breaking each of those down into their logical subordinate parts and then deal with them accordingly. But as I write this I can already see holes in my plan...this feels like a tarpit of complexity and edge cases.
I can't imagine I'm the first person to want to do such a thing, so my question is, are there old, proven approaches to this sort of problem, existing libraries, innovative approaches I could take, or any suggestions in general to apply to this task?
You seem to need a SQL parser (or at least part of one). It may be overkill for your purposes (more complete than you need), but there's a PL/SQL parser for ANTLR that might be useful.
Edit: I didn't really read that grammar as carefully as I should have before I posted the link. Doing a bit of looking, it doesn't really parse select statements at all -- it just recognizes where one is, and skips across it.
The ANTLR grammars page lists several more SQL grammars though (for the variants supported/used by MySQL, Oracle, etc.) Since you have C# and such in the tags, it's probably fair to guess you want to parse the MS SQL Server variant. There's a grammar strictly for its select statement that may be a reasonable fit for your needs.
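For comparison with a real grammar, here is a minimal sketch of the "split it into 3 parts" idea from the question. It is not a SQL parser: it only handles a single SELECT ... FROM ... [WHERE ...] statement, and it avoids the obvious traps by blanking out anything inside quotes, parentheses or {ts ...} braces before looking for keywords, commas and top-level ANDs. All function names are made up for the example.

```python
import re

def mask_nested(sql):
    """Return a same-length copy of sql with quoted and bracketed text replaced by '#'."""
    out, depth, in_quote = [], 0, False
    for ch in sql:
        if in_quote:
            in_quote = ch != "'"      # a closing quote ends the literal
            out.append("#")
        elif ch == "'":
            in_quote = True
            out.append("#")
        elif ch in "({":
            depth += 1
            out.append("#")
        elif ch in ")}":
            depth -= 1
            out.append("#")
        else:
            out.append(ch if depth == 0 else "#")
    return "".join(out)

def split_at(sql, pattern):
    """Split sql wherever pattern matches outside quotes and brackets."""
    masked = mask_nested(sql)
    parts, start = [], 0
    for m in re.finditer(pattern, masked, re.IGNORECASE):
        parts.append(sql[start:m.start()].strip())
        start = m.end()
    parts.append(sql[start:].strip())
    return parts

def parse_select(sql):
    """Break a simple SELECT statement into columns, tables and top-level predicates."""
    pieces = split_at(sql, r"\b(?:SELECT|FROM|WHERE)\b")
    columns, tables = pieces[1], pieces[2]
    where = pieces[3] if len(pieces) > 3 else ""
    return {
        "columns": split_at(columns, ","),
        "tables": split_at(tables, ","),
        "predicates": split_at(where, r"\bAND\b") if where else [],
    }

sql = (
    "SELECT tblPoints.Zone, tblPoints.Name AS SOME_ALIAS_FOR_NAME "
    "FROM Pipe2Pipe.dbo.tblPoints tblPoints, Pipe2Pipe.dbo.vwFlowDataBest vwFlowDataBest "
    "WHERE tblPoints.ID = vwFlowDataBest.PointID "
    "AND ((vwFlowDataBest.ScheduledVolume<>0) AND (tblPoints.Zone In ('mid 1','mid 2')))"
)
for key, values in parse_select(sql).items():
    print(key, values)
```

On a cut-down version of the query above, the join condition and the bracketed filter group come out as two separate top-level predicates, and the bracketed group can be fed back through split_at after stripping its outer parentheses. Subqueries, GROUP BY, HAVING, ANSI JOIN syntax and comments are all outside what this handles, which is where a proper grammar earns its keep.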

MongoDB relationships for objects

Please excuse my English, I'm still trying to master it.
I've started to learn MongoDB (coming from a C# background) and I like the idea behind MongoDB. I have some issues with examples on the internet.
Take the popular blog post / comments example. A Post has zero or more Comments associated with it. I create a Post object and add a few Comment objects to the IList in Post. That's fine.
Do I add that to just a "Posts" collection in MongoDB, or should I have two collections - blog.posts and blog.posts.comments?
I have a fairly complicated object model; the easiest way to think of it is as a banking system - ours is mining. I have tried to highlight tables with square brackets.
[Users] have one or many [Accounts] that have one or many [Transactions], each of which has one and only one [Type]. [Transactions] can have one or more [Tag] assigned to the transaction. [Users] create their own [Tags] unique to that user account and we sometimes need to offer reporting by those tags (e.g. for May, tag drilling-expense was $123456.78).
For indexing, I would have thought separating them would be good, but I'm worried that this thinking is a bad habit from my old RDBMS days.
In a way, it's like the blog example. I'm not sure if I should have one [Account] collection and persist all information there, or have an intermediate step that splits it up into separate collections.
The other related question is: when you persist back and forth, do you usually get back everything associated with that record, even if not required, or do you limit it?
It depends.
It depends on how many of each of these type of objects you expect to have. Can you fit them all into a single MongoDB document for a given User? Probably not.
It depends on the relationships - is user-Account a one-to-many or a many-to-many relationship? If it's one to many and the number of Accounts is small you might choose to put them in an IList on a User document.
You can still model relationships in MongoDB with separate collections BUT there are no joins in the database so you have to do that in code. Loading a User and then loading their Accounts might be just fine from a performance perspective.
You can index INTO arrays on documents. Don't think of an Index as just being an index on a simple field on a document (like SQL). You can use, say, a Tag collection on a document and index into the tags. (See http://www.mongodb.org/display/DOCS/Indexes#Indexes-Arrays)
When you retrieve or write data you can do a partial read and a partial write of any document. (see http://www.mongodb.org/display/DOCS/Retrieving+a+Subset+of+Fields)
And, finally, when you can't see how to get what you want using collections and indexes, you might be able to achieve it using map reduce. For example, to find all the tags currently in use sorted by their frequency of use you would map each document emitting the tags used in it, and then you would reduce that set to get the result you want. You might then store the result of that map reduce permanently and only update it when you need to.
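The shape of that tag-frequency map reduce can be sketched in plain Python. This does not use any MongoDB API; the documents are ordinary dictionaries with made-up fields, and the point is only to show what the map step emits and what the reduce step does with it.

```python
from collections import Counter
from itertools import chain

# Hypothetical transaction documents, each carrying an array of tags.
transactions = [
    {"amount": 120.0, "tags": ["drilling-expense", "may"]},
    {"amount": 80.5, "tags": ["drilling-expense"]},
    {"amount": 15.0, "tags": ["catering", "may"]},
    {"amount": 42.0},  # a document with no tags emits nothing
]

def map_tags(doc):
    """Map step: emit a (tag, 1) pair for every tag on one document."""
    return [(tag, 1) for tag in doc.get("tags", [])]

def reduce_counts(pairs):
    """Reduce step: sum the emitted counts per tag."""
    totals = Counter()
    for tag, count in pairs:
        totals[tag] += count
    return totals

def tag_frequencies(docs):
    """Run map then reduce, and sort by descending frequency, then by tag name."""
    emitted = chain.from_iterable(map_tags(doc) for doc in docs)
    counts = reduce_counts(emitted)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

print(tag_frequencies(transactions))
```

Storing the reduced result, as suggested above, amounts to keeping the output of tag_frequencies around and recomputing it only when the underlying documents change.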
One further concern: You mention calculating totals by tag. If you want accounting-quality transactional consistency, MongoDB might not be the right choice for you. "Eventual-consistency" is the name of the game for NoSQL data stores and they generally aren't a good fit for financial transactions. For example, it doesn't matter if one user sees a blog post with 3 comments while another sees 4 because they hit different replica copies that aren't in sync yet, but for a financial report, that kind of consistency does matter - your report might not add up!

SQL Server for C# Programmers [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 6 years ago.
I'm a pretty good C# programmer who needs to learn SQL Server. What's the best way for me to learn SQL Server/Database development?
Note: I'm a total newb when it comes to DB's and SQL.
SQL is about set theory, or more correctly, relational algebra. Read a brief primer on that. And learn to think in sets, not in procedures.
On the practical side, there are four fundamental operations,
selects, which show some projection of a table(s) data
deletes, which remove some subset of a table's rows,
inserts, which add rows to a table,
updates, which (possibly) change data in a table
(By subset, I mean any subset, including the empty set, and not necessarily a proper subset.)
Anywhere I can write a column name in DML (except as the target of an update), I can write an expression that uses column names, functions, or constants.
select 1, 2, 3 from table will return the resultset "1 2 3", once for each row in the table. If the column named create_date is of type date, and the function month returns a month number given a date, select month( create_date) from table will show me the month number for each create_date.
A where clause is a predicate that restricts rows selected, or deleted, or updated to those rows for which the predicate is true. A where clause can be composed of an arbitrary number of predicates connected by the logical operators and, or, and not. Just like the column list in a select, I can use column names, functions, and constants in my where clause. What result set do you think is returned from select * from table where 1 = 1;?
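If you want to try these statements without a server to hand, a minimal sketch along the following lines works. It uses an in-memory SQLite database from Python, with an invented orders table; SQLite has no month() function, so strftime stands in for it here, whereas SQL Server does provide MONTH().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, create_date TEXT);
INSERT INTO orders VALUES (1, '2010-05-15'), (2, '2010-06-01'), (3, '2010-06-20');
""")

# Constants in the select list are returned once per row of the table.
constants = conn.execute("SELECT 1, 2, 3 FROM orders").fetchall()

# An expression over a column: the month number of each create_date.
months = conn.execute(
    "SELECT CAST(strftime('%m', create_date) AS INTEGER) FROM orders ORDER BY id"
).fetchall()

# A predicate that is true for every row restricts nothing.
always_true = conn.execute("SELECT id FROM orders WHERE 1 = 1 ORDER BY id").fetchall()

# Predicates combined with AND and NOT, using a function in the where clause.
june_not_first = conn.execute(
    """
    SELECT id FROM orders
    WHERE strftime('%m', create_date) = '06' AND NOT id = 2
    """
).fetchall()

print(constants, months, always_true, june_not_first)
```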
In a query, tables are related by joins, in which some datum or key in one is related by an operator to a datum or key in another table. The relational operator is often equality, but can in fact be any binary operator or even a function.
Tables are related, as I mentioned above, by keys; a row in a table may relate to zero, one, or many rows in another table; this is referred to as the cardinality of the relation. Relations may be one-to-one, one-to-many, many-to-many. There are standard ways of representing each relation. Before you look up the standard ways to do this, think about how you'd represent each one, what the minimum requirements of each kind is. You'll see that a many-to-many relation can in fact also model one-to-many and one-to-one; ask yourself why, given that, all relations are not many-to-many.
E. F. Codd, among others, pioneered the idea of normal form in relational databases. There are commonly held to be five or six normal forms, but the most important summary of normal form is simple: every entity that your database models should be represented by one row and one row only, every attribute should depend on the row's key, and every row should model an entity or a relationship. Read a primer on normal form, and understand why you can get data inconsistencies if your database isn't normalized.
In all this, try to understand why I like to say "if you lie to the database, it will lie to you". By this I don't mean bad data, I mean bad design. E.g., if you model a one-to-many relation as many-to-many, what "lies" can be recorded? What "lies" can happen if your tables aren't normalized?
A view, in practical terms, is a select query given a name and stored in the database. If I often join table student to table major through the many-to-many relation student_major, maybe I can write a view that selects the columns of interest from that join, and use the view instead of always rewriting the join.
Practical tips: first, write a view. Whatever you're doing, it'll be simpler and clearer if you write a view for every calculation or sub-calculation you do. Write a view that encapsulates each join, write a view that encapsulates each transformation. Almost anything you want to do can be done in a view.
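A minimal sketch of the student/major case might look like this. The table names follow the paragraph above; the column names, the view name student_majors_v and the sample rows are assumptions, and in-memory SQLite via Python is used only so the example runs on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE major   (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE student_major (
    student_id INTEGER NOT NULL REFERENCES student(id),
    major_id   INTEGER NOT NULL REFERENCES major(id),
    PRIMARY KEY (student_id, major_id)
);
INSERT INTO student VALUES (1, 'Ann'), (2, 'Raj'), (3, 'Mei');
INSERT INTO major VALUES (1, 'Physics'), (2, 'History');
INSERT INTO student_major VALUES (1, 1), (2, 1), (2, 2), (3, 2);

-- The join is written once, given a name, and stored in the database.
CREATE VIEW student_majors_v AS
SELECT s.name AS student_name, m.title AS major_title
FROM student AS s
JOIN student_major AS sm ON sm.student_id = s.id
JOIN major AS m ON m.id = sm.major_id;
""")

# Callers query the view like a table and never repeat the join.
physicists = conn.execute(
    "SELECT student_name FROM student_majors_v WHERE major_title = ? ORDER BY student_name",
    ("Physics",),
).fetchall()
print(physicists)
```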
Decomposing a query into views serves the same ends as functional decomposition serves in procedural code: it allows you to concentrate on doing one thing well, makes it more easily tested, and allows you to compose more complex functionality out of simpler operations. Here's an example where I use views to transform a table into forms that more easily allow me to apply successive transformations, in order to get to a goal.
Don't conflate data. Each table ought to unambiguously model one thing (one kind of entity) and only one thing; each column should express one and only one attribute of that thing. Different kinds of entities belong in different tables.
Metadata is your friend. Your database platform will provide some metadata; what it doesn't provide you should add. Since metadata is data, all the rules for modeling data apply. You can get, for example, the names of all objects in your database from the system table sysobjects; syscolumns contains all the columns. To find all the columns in one table, you'd join sysobjects and syscolumns on id, and add a where clause restricting the resultset to a particular table name: where sysobjects.name = 'mytable'.
Experiment. Sit down at a database and ask yourself, "How can I represent people with hair colors and professions and residences? What tables and relations are implied in modeling that?" Then model that, as tables.
Then ask yourself, "How can I show all blonde doctors who reside in Atlanta", and write the query that does that. Piece it together by writing views that show you all blondes, all doctors, and all people who reside in Atlanta.
You'll find that in asking "how can I find that", you'll expose deficiencies in your model, and you'll find that you want or even need to change the way your model works. Make the changes, see how they make your queries easier or harder to write.
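One possible first attempt at that exercise is sketched below, deliberately naive: a single person table with invented columns, three small views, and a query that joins them. It runs on in-memory SQLite from Python so you can poke at it straight away.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- First-draft model: everything about a person lives in one table.
CREATE TABLE person (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    hair_color TEXT,
    profession TEXT,
    city TEXT
);
INSERT INTO person VALUES
    (1, 'Dana',  'blonde', 'doctor', 'Atlanta'),
    (2, 'Eli',   'blonde', 'doctor', 'Boston'),
    (3, 'Farah', 'black',  'doctor', 'Atlanta'),
    (4, 'Gus',   'blonde', 'lawyer', 'Atlanta'),
    (5, 'Hana',  'blonde', 'doctor', 'Atlanta');

-- One view per sub-question.
CREATE VIEW blondes           AS SELECT id, name FROM person WHERE hair_color = 'blonde';
CREATE VIEW doctors           AS SELECT id FROM person WHERE profession = 'doctor';
CREATE VIEW atlanta_residents AS SELECT id FROM person WHERE city = 'Atlanta';
""")

# The final query is just the three views pieced together.
blonde_atlanta_doctors = conn.execute(
    """
    SELECT b.name
    FROM blondes AS b
    JOIN doctors AS d ON d.id = b.id
    JOIN atlanta_residents AS a ON a.id = b.id
    ORDER BY b.name
    """
).fetchall()
print(blonde_atlanta_doctors)
```

Asking the next question, such as "what about someone with two residences or two professions?", is exactly what exposes the weakness of the single-table draft and pushes those columns out into their own tables.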
I love Joe Celko books from novice to advanced. I also think virtual labs are great.
An easy way to learn SQL syntax?
Use Microsoft Access. Use the Northwind sample database, open Access up in Query view and run some queries.
Creating a Simple Query
Start with SELECT * FROM and work your way up to more complicated examples.
One of the best resources is http://www.sqlservercentral.com/ - tons of articles.
Another good resource is http://www.trainingspot.com/VideoLibrary/Default.aspx
And here is a list of books my DBA suggested I read for learning SQL
Best Damn Exchange, SQL and IIS Book Period or on google books
Beginning SQL Server 2008 Developers or on Google books
Here are the three books I strongly recommend you read in order.
Beginning SQL Server 2005 Programming
Professional SQL Server 2005 Programming
The Guru's Guide to Transact-SQL
W3Schools has a nice tutorial with a try-by-example setup. But other than just installing an Express edition and having a bunch of trial runs with the demo databases, I'd say no book will teach you better.
I would say your very best bet is to sign up for a DB class at a local college. You can usually find an evening class. You will start with simple Database concepts like what is a database, and what are tables.
The instructor will usually give you a project as homework about halfway though the class where you will design and implement a simple database for something like a video store. You will have interaction with other students who are at your same level and will be interested in discussing the technical details from a new DB guy standpoint. And you will have an experienced instructor you can ask questions of and get timely interaction from, who won't be snarky like us internet posters :)
Get it from horse's mouth --> http://www.asp.net/learn/videos/default.aspx?tabid=63#sql
These days most of the universities have their courses online. Try to research some good professors and learn the fundamentals. Their assignments are also useful.
Off the top of my head, I can think of MIT OpenCourseWare (OCW).
This depends on what you will need to do. If you just need to access databases, you should have a look the various access strategies - DataReader, DataSet, LINQ to SQL, Entity Framework, NHibernate - and pick a solution.
If you need to develop databases, get a good book on that topic. Get familiar with the theoretical stuff - relational algebra, keys, referential integrity and normalisation. Then have a look at SQL, and finally you may have a closer look at ACID transactions, locking, concurrency control, indexes, and all the technical details that make a database server work.
I would suggest reading the Wikipedia articles - maybe the 100 most important ones - to get the big picture and then approaching the details where required. But this will probably be no replacement for a good book if you want to become a good database developer.
I tend to like books because I can read them anywhere, I can go at my own pace and I can get eBook copies (when using apress). I also happen to learn more efficiently in this manner as I already know most of the concepts, like database types.. int, bool, guid, etc... you will know those as well. So, essentially, I would recommend the apress series of books - very comprehensive IMO. And you can generally find them used for very cheap on Amazon... Here is one tailored to you:
http://www.amazon.com/Beginning-SQL-Server-2008-Developers/dp/1590599586/ref=sr_1_1?ie=UTF8&s=books&qid=1239758026&sr=1-1
When you sign up to Microsoft Books Newsletters (From Microsoft Press) they actually give you (free) an ebook called Introducing SQL Server 2008.
http://csna01.libredigital.com/?urss1q2we6
