RegEx for parsing SQL query in C# - c#

I need to write RegEx in C# to parse SQL join query which is given as a string. Can somebody please help me because I'm new at this. Thanks a lot
this is my problem:
string query = #"SELECT table1.column1, table1.column2, table2.coulmn1, table2.column2
FROM table1 INNER JOIN table2 ON table1.column5 = table2.column5";
what I actually need is to put all important data into separate variables, like this:
string class1 = table1
string class2 = table2
string joinForeignKey1 = table1.column5
string joinForeignKey2 = table2.column5
List<string> attributes1 = table1.column1, table1.column2
List<string> attributes2 = table2.column1, table2.column2
//COMMENT
I realized that I have made a mistake in sql query so there will be an ON clause.
I can force a user to provide me with the correct syntax, so that will be no problem.
The thing I haven't mentioned is that there can be more than one JOIN ON clause (multiple joins).
Thanks a lot and I will appreciative any given help.

Pulling this over from comments, since I think it's the right answer here:
SQL is #3 on the list of Stuff You Should Not Try To Parse With A Regex, just behind HTML and MUMPS. Use a dialect-specific, dedicated SQL parser, not a regex.

I personally do not recommend doing this unless you have a VERY, VERY valid reason to do so as well as full control over the way that the SQL would be written.
First and foremost the syntax that you noted for the SQL statement is the old style join syntax and not using the more common ON syntax.
Something like
SELECT A.ColumnA, B.ColumnB
FROM MyTable A
INNER JOIN YourTable B
ON (A.MyIdentity = B.MyForeignKey)
So unless you can force users to input queries in the old syntax you are already going down the road to a way that will not work.
If I was forced to do this type of thing, and I did have control over it, I personally wouldn't bother with RegEx, due to the fact that the process is so structured. i would just use basic string manipulation.

Related

How to write Nested LINQ for specific scenario

Apologized to post a problem for which i could not construct anything solid. it is very shameful for me to post like this kind of question even after having so high reputation for this web site.
Most of the time i write sql in store procedure in sql server and hardly use LINQ. so facing problem to construct a nested LINQ query for a specific scenario. just wonder if anyone could help me or give me hints to construct such query with linq.
here i am providing a sample sql query which i want to construct the same one with LINQ.
SELECT EmployeeName,
(Select count(*) from table1 where condition) as data1,
(Select count(*) from table1 where condition) as data2,
(Select count(*) from table1 where condition) as data3
though i have seen couple of nested linq sample from these urls but still could not figure out how to construct mine.
http://odetocode.com/blogs/scott/archive/2007/09/24/nested-selects-in-linq-to-sql.aspx
https://social.msdn.microsoft.com/Forums/en-US/00340c95-221a-4b16-9c47-d1acbf2415dc/linq-nested-select-issue?forum=linqtosql
http://www.mercurynewmedia.com/blog/blog-detail/mercury-new-media-blog/2014/05/19/linq-join-queries-vs-nested-sub-queries
http://www.codethinked.com/the-linq-selectmany-operator
Nested Select linq query
http://www.java2s.com/Code/CSharp/LINQ/Nestedquerylist.htm
so just wonder if anyone could help.
This is a good example of something to do in Linq. You will find that you can do this easily and even more powerful queries if you understand two things:
Don't try to overly literally translate from SQL into Linq.
Lambdas.
The first is easy to understand but hard to carry out. Some of the Linq keywords are intentionally like SQL keywords. Linq tries to mirror some aspects of SQL without working in exactly the same way. One key difference is that with Linq, each function yields (no pun intended) a Linq object which can be passed to another Linq function. Thus filtering, transforming, aggregating, etc. can be done repeatedly in the order the programmer chooses. In SQL a single query is built up of several elements such as WHERE, FROM, some of which are optional, and the SQL engine decides in which order to evaluate them.
Get LinqPad and use it for playing with queries, and for doing database queries instead of writing SQL code in SSMS. When you have seen and written enough Linq constructs, you will no longer be writing transliterated SQL, and you will be able to switch back and forth between the two.
Secondly, make sure you understand lambdas. What does
x => x + 23
mean? What is its type? What is the type of its argument and return value? What about
(x, y) => x + y * 23
and
() => "Fish"
?
Select in Linq takes as argument a function. What Select does is the simplest possible thing to understand - it takes an iterator of input objects and a function, and returns an iterator of output objects, which are the input objects having had the function applied to them. This is actually easier to understand with a lambda than with a named function.
Remember that everything in C# has a type. A lambda has a type, even if the type is not explicit. Error messages usually tell you exactly what is wrong, except that sometimes they can be misinterpreted and seem to be pointing to exactly the wrong piece of code, and this can be very time-consuming to figure out. Break down the code you are writing into as small pieces as possible, and debug them in Linqpad.
You can get a long way with this. Try looking at a suggestion for a similar query to yours below and see if you can build on it.
/* A very simple dataset */
string s = "THE QUICK BROWN FOX JUMPED OVER THE LAZY DOG";
var x = s.Split();
/* The letters list plays the role of your EmployeeName */
var letters = new List<String>{"A", "E", "I", "O", "U"};
var result = letters
.Select(letter => new {letter = letter,
startswith = x.Count(w => w.StartsWith(letter)),
contains = x.Count(w => w.Contains(letter)),
endswith = x.Count(w => w.EndsWith(letter))});
If you get this far, you might want to learn more about how the Linq functions operate on iterators. Check out Jon Skeet's Edulinq series of blog posts: http://codeblog.jonskeet.uk/2011/02/23/reimplementing-linq-to-objects-part-45-conclusion-and-list-of-posts/ He reimplements Linq to show how it might be designed and how it should work.
Linqpad and Edulinq and two of the best programming resources I know of, in any language, and I recommend them in almost every Linq question I answer. Check them out.

Regex matching SQL with subqueries, indicate by bracket [duplicate]

I need to write RegEx in C# to parse SQL join query which is given as a string. Can somebody please help me because I'm new at this. Thanks a lot
this is my problem:
string query = #"SELECT table1.column1, table1.column2, table2.coulmn1, table2.column2
FROM table1 INNER JOIN table2 ON table1.column5 = table2.column5";
what I actually need is to put all important data into separate variables, like this:
string class1 = table1
string class2 = table2
string joinForeignKey1 = table1.column5
string joinForeignKey2 = table2.column5
List<string> attributes1 = table1.column1, table1.column2
List<string> attributes2 = table2.column1, table2.column2
//COMMENT
I realized that I have made a mistake in sql query so there will be an ON clause.
I can force a user to provide me with the correct syntax, so that will be no problem.
The thing I haven't mentioned is that there can be more than one JOIN ON clause (multiple joins).
Thanks a lot and I will appreciative any given help.
Pulling this over from comments, since I think it's the right answer here:
SQL is #3 on the list of Stuff You Should Not Try To Parse With A Regex, just behind HTML and MUMPS. Use a dialect-specific, dedicated SQL parser, not a regex.
I personally do not recommend doing this unless you have a VERY, VERY valid reason to do so as well as full control over the way that the SQL would be written.
First and foremost the syntax that you noted for the SQL statement is the old style join syntax and not using the more common ON syntax.
Something like
SELECT A.ColumnA, B.ColumnB
FROM MyTable A
INNER JOIN YourTable B
ON (A.MyIdentity = B.MyForeignKey)
So unless you can force users to input queries in the old syntax you are already going down the road to a way that will not work.
If I was forced to do this type of thing, and I did have control over it, I personally wouldn't bother with RegEx, due to the fact that the process is so structured. i would just use basic string manipulation.

Joins and subqueries in LINQ

I am trying to do a join with a sub query and can't seem to get it. Here is what is looks like working in sql. How do I get to to work in linq?
SELECT po.*, p.PermissionID
FROM PermissibleObjects po
INNER JOIN PermissibleObjects_Permissions po_p ON (po.PermissibleObjectID = po_p.PermissibleObjectID)
INNER JOIN Permissions p ON (po_p.PermissionID = p.PermissionID)
LEFT OUTER JOIN
(
SELECT u_po.PermissionID, u_po.PermissibleObjectID
FROM Users_PermissibleObjects u_po
WHERE u_po.UserID = '2F160457-7355-4B59-861F-9871A45FD166'
) used ON (p.PermissionID = used.PermissionID AND po.PermissibleObjectID = used.PermissibleObjectID)
WHERE used.PermissionID is null
Without seeing your database and data model, it's pretty impossible to offer any real help. But, probably the best way to go is:
download linqpad - http://www.linqpad.net/
create a connection to your database
start with the innermost piece - the subquery with the "where" clause
get each small query working, then join them up. Linqpad will show you the generated SQL, as well as the results, so build your small queries up until they are right
So, basically, split your problem up into smaller pieces. Linqpad is fantastic as it lets you test these things out, and check your results as you go
hope this helps, good luck
Toby
The LINQ translation for your query is suprisingly simple:
from pop in PermissibleObjectPermissions
where !pop.UserPermissibleObjects.Any (
upo => upo.UserID == new Guid ("2F160457-7355-4B59-861F-9871A45FD166"))
select new { pop.PermissibleObject, pop.PermissionID }
In words: "From all object permissions, retrieve those with at least one user-permission whose UserID is 2F160457-7355-4B59-861F-9871A45FD16".
You'll notice that this query uses association properties for navigating relationships - this avoids the need for "joining" and simplfies the query. As a result, the LINQ query is much closer to its description in English than the original SQL query.
The trick, when writing LINQ queries, is to get out of the habit of "transliterating" SQL into LINQ.

How to get linq to produce exactly the sql I want?

It is second nature for me to whip up some elaborate SQL set processing code to solve various domain model questions. However, the trend is not to touch SQL anymore. Is there some pattern reference or conversion tool out there that helps convert the various SQL patterns to Linq syntax?
I would look-up ways to code things like the following code: (this has a sub query):
SELECT * FROM orders X WHERE
(SELECT COUNT(*) FROM orders Y
WHERE Y.totalOrder > X.totalOrder) < 6
(Grab the top five highest total orders with side effects)
Alternatively, how do you know Linq executes as a single statement without using a debugger? I know you need to follow the enumeration, but I would assume just lookup the patterns somewhere.
This is from the MSDN site which is their example of doing a SQL difference. I am probably wrong, but I wouldn't think this uses set processing on the server (I think it pulls both sets locally then takes the difference, which would be very inefficient). I am probably wrong, and this could be one of the patterns on that reference.
SQL difference example:
var differenceQuery =
(from cust in db.Customers
select cust.Country)
.Except
(from emp in db.Employees
select emp.Country);
Thanks
-- Update:
-- Microsoft's 101 Linq Samples in C# is a closer means of constructing linq in a pattern to produce the SQL you want. I will post more as I find them. I am really looking for a methodology (patterns or a conversion tool) to convert SQL to Linq.
-- Update (sql from Microsoft's difference pattern in Linq):
SELECT DISTINCT [t0].[field] AS [Field_Name]
FROM [left_table] AS [t0]
WHERE NOT (EXISTS(
SELECT NULL AS [EMPTY]
FROM [right_table] AS [t1]
WHERE [t0].[field] = [t1].[field]
))
That's what we wanted, not what I expected. So, that's one pattern to memorize.
If you have hand-written SQL, you can use ExecuteQuery, specifying the type of "row" class as a function template argument:
var myList = DataContext.ExecuteQuery<MyRow>(
"select * from myview");
The "row" class exposes the columns as public properties. For example:
public class MyRow {
public int Id { get; set; }
public string Name { get; set; }
....
}
You can decorate the columns with more information:
public class MyRow {
....
[Column(Storage="NameColumn", DbType="VarChar(50)")]
public string Name { get; set; }
....
}
In my experience linq to sql doesn't generate very good SQL code, and the code it does generate breaks down for large databases. What linq to sql does very well is expose stored procedures to your client. For example:
var result = DataContext.MyProcedure(a,b,c);
This allows you to store SQL in the database, while having the benefits of an easy to use, automatically generated .NET wrapper.
To see the exact SQL that's being used, you can use the SQL Server Profiler tool:
http://msdn.microsoft.com/en-us/library/ms187929.aspx
The Linq-to-Sql Debug Visualizer:
http://weblogs.asp.net/scottgu/archive/2007/07/31/linq-to-sql-debug-visualizer.aspx
Or you can write custom code to log the queries:
http://goneale.wordpress.com/2008/12/31/log-linq-2-sql-query-execution-to-consoledebug-window/
This is why Linq Pad was created in the first place. :) It allows you to easily see what the output is. What the results of the query would be etc. Best is it's free. Maybe not the answer to your question but I am sure it could help you.
If you know exactly the sql you want, then you should use ExecuteQuery.
I can imagine a few ways to translate the query you've shown, but if you're concerned that "Except" might not be translated.
Test it. If it works the way you want then great, otherwise:
Rewrite it with items you know will translate, for example:
db.Customers.Where(c => !db.Employees.Any(e => c.Country == e.Country) );
If you are concerned about the TSQL generated, then I would suggest formalising the queries into stored procedures or UDFs, and accessing them via the data-context. The UDF approach has slightly better metadata and composability (compared to stored procedure) - for example you can add addition Where/Skip/Take etc to a UDF query and have it run at the database (but last time I checked, only LINQ-to-SQL (not Entity Framework) supported UDF usage).
You can also use ExecuteQuery, but there are advantages of letting the database own the fixed queries.
Re finding what TSQL executed... with LINQ-to-SQL you can assign any TextWriter (for example, Console.Out) to DataContext.Log.
I believe the best way is to use stored procedures. In this case you has full control on the SQL.

Converting user-entered search query to where clause for use in SQL Server full-text search

What's the best way to convert search terms entered by a user, into a query that can be used in a where clause for full-text searching to query a table and get back relevant results? For example, the following query entered by the user:
+"e-mail" +attachment -"word document" -"e-learning"
Should translate into something like:
SELECT * FROM MyTable WHERE (CONTAINS(*, '"e-mail"')) AND (CONTAINS(*, '"attachment"')) AND (NOT CONTAINS(*, '"word document"')) AND (NOT CONTAINS(*, '"e-learning"'))
I'm using a query parser class at the moment, which parses the query entered by users into tokens using a regular expression, and then constructs the where clause from the tokens.
However, given that this is probably a common requirement by a lot of systems using full-text search, I'm curious as to how other developers have approached this problem, and whether there's a better way of doing things.
How to implement the accepted answer using .Net / C# / Entity Framework...
Install Irony using nuget.
Add the sample class from:
http://irony.codeplex.com/SourceControl/latest#Irony.Samples/FullTextSearchQueryConverter/SearchGrammar.cs
Write code like this to convert the user-entered string to a query.
var grammar = new Irony.Samples.FullTextSearch.SearchGrammar();
var parser = new Irony.Parsing.Parser(grammar);
var parseTree = parser.Parse(userEnteredSearchString);
string query = Irony.Samples.FullTextSearch.SearchGrammar.ConvertQuery(parseTree.Root);
Perhaps write a stored procedure like this:
create procedure [dbo].[SearchLivingFish]
#Query nvarchar(2000)
as
select *
from Fish
inner join containstable(Fish, *, #Query, 100) as ft
on ft.[Key] = FishId
where IsLiving = 1
order by rank desc
Run the query.
var fishes = db.SearchLivingFish(query);
This may not be exactly what you are looking for but it may offer you some further ideas.
http://www.sqlservercentral.com/articles/Full-Text+Search+(2008)/64248/
In addition to #franzo's answer above you probably also want to change the default stop word behaviour in SQL. Otherwise queries containing single digit numbers (or other stop words) will not return any results.
Either disable stop words, create your own stop word list and/or set noise words to be transformed as explained in SQL 2008: Turn off Stop Words for Full Text Search Query
To view the system list of (English) sql stop words, run:
select * from sys.fulltext_system_stopwords where language_id = 1033
I realize it's a bit of a side-step from your original question, but have you considered moving away from SQL fulltext indexes and using something like Lucene/Solr instead?
The easiest way to do this is to use dynamic SQL (I know, insert security issues here) and break the phrase into a correctly formatted string.
You can use a function to break the phrase into a table variable that you can use to create the new string.
A combination of GoldParser and Calitha should sort you out here.
This article: http://www.15seconds.com/issue/070719.htm has a googleToSql class as well, which does some of the translation for you.

Categories