I have a DataSet that contains two tables. One is considered to be nested in the other. All I want is for it not to be nested, so that there is a single table. .Merge() and LINQ just aren't doing the trick.
Here is a sample of what the main table would look like:
student-id   ID
--------------------
123456789    1
654987321    2
But each of these corresponds to multiple rows in the next table:
ID   Col1    Col2    etc.
----------------------
1    fact1   fact2
1    fact3   fact4
2    fact5   fact6
I want to combine them so they would look like this:
student-id   Col1    Col2
-------------------------------
123456789    fact1   fact2
123456789    fact3   fact4
654987321    fact5   fact6
Every time I try the merge it fails with an error saying that I can't duplicate the primary key, which is "ID". Since the merge is based on the primary key (I believe), I can't remove it.
I can't use LINQ because I want to make this generic, so that the second table could have any number of columns, and I can't get the select to work for that.
UPDATE: MY SOLUTION
I ended up cloning the second table to a new DataTable, then adding a column called 'student-id' and deleting the ID column. Then I looped through the rows of the main table, found the related rows in the second table, combined all the data in an array and created a row in the final table.
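The steps described in this update can be sketched roughly as follows. This is only one possible reading of them, not the poster's actual code: the names TableFlattener and Flatten are hypothetical, and the sketch assumes that the key column has the same data type in both tables, that key values are unique and non-null in the main table, and that the second table has no column with the same name as the one copied from the main table.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Hypothetical helper; not part of any library.
public static class TableFlattener
{
    // Joins parent and child on keyColumn and returns one flat table:
    // parentColumn first, then every child column except the key.
    public static DataTable Flatten(DataTable parent, DataTable child,
                                    string keyColumn, string parentColumn)
    {
        var result = new DataTable("Flattened");
        result.Columns.Add(parentColumn, parent.Columns[parentColumn].DataType);

        // Copy the child schema generically, whatever columns it has.
        var childColumns = child.Columns.Cast<DataColumn>()
            .Where(c => c.ColumnName != keyColumn)
            .ToList();
        foreach (var column in childColumns)
            result.Columns.Add(column.ColumnName, column.DataType);

        // Index parent rows by key (assumed unique) for fast lookups.
        var parentByKey = parent.AsEnumerable()
            .ToDictionary(row => row[keyColumn]);

        foreach (DataRow childRow in child.Rows)
        {
            // Inner-join semantics: child rows without a parent are skipped.
            if (!parentByKey.TryGetValue(childRow[keyColumn], out var parentRow))
                continue;

            var values = new List<object> { parentRow[parentColumn] };
            values.AddRange(childColumns.Select(c => childRow[c]));
            result.Rows.Add(values.ToArray());
        }
        return result;
    }
}
```

With the sample tables from the question, a call such as TableFlattener.Flatten(mainTable, secondTable, "ID", "student-id") would be expected to produce the three combined rows shown above, regardless of how many columns the second table has.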
The LINQ isn't as bad as you suggest. You can just use an anonymous type that holds two DataRows:
var result = from t1 in table1.AsEnumerable()
join t2 in table2.AsEnumerable() on (int)t1["ID"] equals (int)t2["ID"]
select new
{
Student = t1,
Facts = t2
};
foreach(var s in result)
Console.WriteLine("{0} {1} {2}", s.Student["student-id"], s.Facts["Col1"], s.Facts["Col2"]);
That way, the query doesn't select specific columns, so you can access any column of either row after the fact.
That being said, the other poster's suggestion of using a pivot table is probably a better direction to go.
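Let's try it in SQL.
Call the first table Table1 and the second table Table2.
SQL:
Select
X.[student-id], Y.Col1, Y.Col2
From Table1 As X Inner Join Table2 As Y On Y.ID = X.ID
If you can do this in SQL, it's easy. Note that the column name student-id has to be bracketed, otherwise the hyphen is parsed as a minus sign.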
Sounds like what you need is a Pivot table.
This will essentially allow you to display the data how you want.
Here are a couple of tutorials/projects
http://www.codeproject.com/Articles/25167/Simple-Advanced-Pivots-with-C-and-ASP-NET
http://www.codeproject.com/Articles/46486/Pivoting-DataTable-Simplified
Update
You may find it better to do the 'pivot' part in MS SQL as a stored procedure and then populate your DataTable with the results of calling that stored procedure. This example is a good starting point:
http://blogs.msdn.com/b/spike/archive/2009/03/03/pivot-tables-in-sql-server-a-simple-sample.aspx
Related
First of all, I am sorry if this question is too obvious; I am quite new to SQL.
I have a list of IDs (its length varies, depending on how many products the user chooses), and I want to check whether all of them are in a table. If one of them is not, the result of the query should be null. If all of them are there, the result should be all the rows with those IDs.
How can I do this?
Best regards,
Flavio
Do a LEFT JOIN from the list to the table on the ID field. You'll get a null on the table side wherever there is no matching record.
You can even add a WHERE clause like 'WHERE Table.ID IS NULL' to see only the list entries that aren't in the table.
Edit: Original Poster did not say they were using C# when I wrote this answer
UNTESTED:
Not sure if this is the most efficient but it seems like it should work.
First it counts the rows in the table whose ID is in your list. Then it cross joins that single result record to a query over the table, checks that the count matches the number of items in your list, and limits the results to your list.
SELECT *
FROM Table
CROSS JOIN (
SELECT count(*) cnt
FROM table
WHERE ID in (yourlist)) b
WHERE b.cnt = yourCount
and ID IN (YourList)
Running two IN clauses seems like it would be terribly slow overall, but my first step when writing SQL is usually to get something that works and then improve performance if needed.
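Since the original poster later mentioned C#, the same count comparison can also be sketched on the client side, for a table that has already been loaded into a DataTable. This is an illustration only: the names IdCheck and RowsIfAllPresent are hypothetical, and the sketch assumes an int ID column with no NULLs.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Hypothetical helper; not part of any library.
public static class IdCheck
{
    // Returns the rows whose ID is in wantedIds, or null when at least
    // one of the wanted IDs does not occur in the table.
    public static List<DataRow> RowsIfAllPresent(
        DataTable table, IEnumerable<int> wantedIds, string idColumn = "ID")
    {
        var wanted = new HashSet<int>(wantedIds);

        // Same filter as "ID IN (YourList)".
        var matches = table.AsEnumerable()
            .Where(row => wanted.Contains(row.Field<int>(idColumn)))
            .ToList();

        // Distinct IDs actually found; always a subset of wanted, so equal
        // counts mean every wanted ID is present.
        var found = new HashSet<int>(matches.Select(row => row.Field<int>(idColumn)));
        return found.Count == wanted.Count ? matches : null;
    }
}
```

Counting distinct IDs rather than rows keeps the check valid when the table holds several rows per ID.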
Get the list of IDs into a table (you can pass them to a stored procedure as a table-valued parameter), then write the following in the stored procedure,
assuming the list of IDs from C# is in the table variable @idList:
Select * from myTable
Where id in (Select id from @idList)
and not exists
(Select * from @idList
where id Not in
(Select id from myTable))
I have 2 big tables (about 100-150k rows in each).
The structure of these tables is the same. The ids of the entities are also the same in each table.
I need a very fast way to compare these tables and answer the following questions:
Which rows have fields that differ from the corresponding row in the other table?
Which ids exist in the first table but not in the second table?
Which ids exist in the second table but not in the first table?
Thank you!
Edit: I need to do this comparison using C#, or maybe stored procedures (and then select the results from C#).
If you have two tables Table1 and Table2 and they have the same structure and primary key named ID you can use this SQL:
--Find rows that exist in both Table1 and Table2
SELECT *
FROM Table1
WHERE EXISTS (SELECT 0 FROM Table2 WHERE Table1.ID = Table2.ID)
--Find rows that exist in Table1 but not Table2
SELECT *
FROM Table1
WHERE NOT EXISTS (SELECT 0 FROM Table2 WHERE Table1.ID = Table2.ID)
If you are trying to compare and find rows that differ in one column or another, that is a little trickier. You can write SQL to check each and every column yourself, but it may be simpler to add a temporary CHECKSUM column to both tables and compare those. If the checksums are different then one or more columns are different.
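Because the question also allows for doing the comparison in C#, here is a minimal sketch of the column-by-column variant for two tables that have already been loaded into DataTables. It does not use checksums. The names TableDiff and TableComparer are hypothetical, and the sketch assumes both tables have the same columns in the same order, a unique non-null id column, and no binary columns (byte arrays would be compared by reference).

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Hypothetical result type: ids grouped by the three questions asked.
public sealed class TableDiff
{
    public List<object> OnlyInFirst { get; } = new List<object>();
    public List<object> OnlyInSecond { get; } = new List<object>();
    public List<object> Changed { get; } = new List<object>();
}

// Hypothetical helper; not part of any library.
public static class TableComparer
{
    public static TableDiff Compare(DataTable first, DataTable second, string idColumn)
    {
        var diff = new TableDiff();

        // Index the second table by id so each lookup is a hash probe.
        var secondById = second.AsEnumerable().ToDictionary(row => row[idColumn]);

        foreach (DataRow row in first.Rows)
        {
            var id = row[idColumn];
            if (!secondById.TryGetValue(id, out var other))
            {
                diff.OnlyInFirst.Add(id);
                continue;
            }
            // Compare every column value of the two rows in order.
            if (!row.ItemArray.SequenceEqual(other.ItemArray))
                diff.Changed.Add(id);
            secondById.Remove(id); // matched, so not "only in second"
        }

        // Whatever was never matched exists only in the second table.
        diff.OnlyInSecond.AddRange(secondById.Keys);
        return diff;
    }
}
```

Each table is read once, so the work grows linearly with the number of rows. Whether this beats doing the comparison in the database depends on how expensive it is to load both tables into memory first.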
SQL Data Compare is a great tool for doing this. Also Microsoft Visual Studio SQL Server Data Tools has a Data Compare function.
I found the following method to perform very well when comparing large data sets.
http://weblogs.sqlteam.com/jeffs/archive/2004/11/10/2737.aspx
Basically UNION ALL of the two data sources then aggregate them and return only rows which don't have an identical matching row in the other table.
With unionCTE As (
Select 'TableA' As TableName, col1, col2
From TableA
Union All
Select 'TableB', col1, col2
From TableB)
Select Max(TableName) As TableName, col1, col2
From unionCTE
Group By col1, col2
Having Count(*) = 1
Order By col1, col2, TableName;
This will show the results in a single resultset, and if there are any rows that have the same key but different values the rows will be one above the other so that you can easily compare which values have changed between the tables.
This can easily be put into a stored procedure, if you want.
I have a query like this:
Select table1.*, table2.column1 from table1 join table2 on table1.column1=table2.column1
It works, but it puts the column at the end of the DataGridView. I need table2.column1 to appear after a particular column instead, I have to use table1.*, and I can't list table1's columns individually. Is that possible?
And why exactly can't you use a list of all the fields?
No, it's not possible to place a column in the middle of the columns specified with *, neither with pure SQL nor with dynamic SQL.
Just specify them, don't be lazy; it's better practice:
SELECT table1.col1,
table1.col2,
table2.col1,
table1.col3
..........
Because I am using union queries, the table names change, and one table contains more columns than the other.
If table1 differs, that above all should be a strong argument for specifying all the needed fields separately. If a new field were added to table1, your query would break, because the number of fields would no longer match the number used in the next part of the union.
I am trying to code a simple database management tool in C#. I am in the process of coding a function to insert a new row into the database, but I have run into a problem. I need to be able to detect which ID numbers are not already taken. I have done some research but haven't found any clear answers.
Example table:
ID Name
---------------
1 John
2 Linda
4 Mark
5 Jessica
How would I add a function that automatically detects that ID 3 is empty, and places a new entry there?
Edit: My real question is: when I want to insert a new row via C#, how do I handle a column which is auto-increment? An example would be fantastic :)
I don't like giving answers like this...but I am going to anyway on this occasion.
Don't
What if you store more data in another table which has a foreign key to the ID in this table? If you reuse numbers you are asking for trouble with referential integrity down the line.
I assume your field is an int? If so, an auto increment should give more than enough for most purposes. It makes your insert simpler, and maintains integrity.
Edit: You might have a very good reason to do it, but I wanted to make the point in case somebody comes along and sees this later on who thinks it is a good idea.
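To illustrate what auto-increment behaviour looks like from C#, here is a small in-memory sketch using the AutoIncrement properties of DataColumn. It is only an analogy for a database identity column, not the database insert itself, and the names AutoIncrementDemo, CreatePeopleTable and AddPerson are hypothetical.

```csharp
using System;
using System.Data;

// Hypothetical demo class; not part of any library.
public static class AutoIncrementDemo
{
    public static DataTable CreatePeopleTable()
    {
        var table = new DataTable("People");
        DataColumn id = table.Columns.Add("ID", typeof(int));
        id.AutoIncrement = true;   // the table generates the ID values
        id.AutoIncrementSeed = 1;  // first generated value
        id.AutoIncrementStep = 1;  // increment between values
        table.Columns.Add("Name", typeof(string));
        table.PrimaryKey = new[] { id };
        return table;
    }

    // Adds a row without supplying an ID and returns the generated one.
    public static int AddPerson(DataTable table, string name)
    {
        // Passing null for the auto-increment column requests the next value.
        DataRow row = table.Rows.Add(null, name);
        return (int)row["ID"];
    }
}
```

The caller never chooses an ID. Removing a row does not make its number available again; the next row simply continues the sequence, which is the behaviour this answer argues for.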
SQL:
SELECT ID From TABLE
OR
SELECT t.ID
FROM ( SELECT number + 1 AS ID
FROM master.dbo.spt_values
WHERE Type = 'p'
AND number <= ( SELECT MAX(ID) - 1
FROM #Table
)
) t
LEFT JOIN #Table ON t.ID = [#Table].ID
WHERE [#Table].ID IS NULL
C#
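using System.Data;
using System.Linq;

DataTable dt = new DataTable();
// Populate dt with the result of the SQL query
var tableInts = dt.AsEnumerable().Select(row => row.Field<int>("ID")).ToList();
// Highest ID in use, or 0 when the table is empty
var maxInt = tableInts.DefaultIfEmpty(0).Max();
// Lowest unused ID; this is max + 1 when there is no gap
var minInt = Enumerable.Range(1, maxInt + 1).Except(tableInts).Min();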
SELECT #temp.Id
FROM #temp
LEFT JOIN table1 ON #temp.Id = table1.Id
WHERE table1.Id IS NULL
Try this?
But my suggestion is to just auto-increment the field.
To do that, you set the IDENTITY property on the column, and set it as the primary key too (not null).
For a plain insert you then leave the identity column out of the INSERT statement and the value is generated for you. If you need custom logic around inserts, you might need triggers, which are like stored procedures but can run instead of, or after, an insert, update or delete.
Google triggers.
from How do I find a "gap" in running counter with SQL?
select
MIN(ID)
from (
select
0 ID
union all
select
[YourIdColumn]+1
from
[YourTable]
where
--Filter the rest of your key--
) foo
left join
[YourTable]
on [YourIdColumn]=ID
and --Filter the rest of your key--
where
[YourIdColumn] is null
I am unsure which of two approaches to choose.
Scenario
There are two tables, Table 1 and Table 2. Table 1 contains users' data, for example first name, last name, etc.
Table 2 contains the cars each user has, with their description, i.e. color, registration no., etc.
Now if I want to get all the information for all users, which approach completes in the least time?
Approach 1
Query for all rows in Table 1 and store them all in a list, for example.
Then loop through the list and, for each user saved in the first step, query Table 2 for that user's data.
Approach 2
Query for all rows and, while saving each row, get all of its values from Table 2 and save them too.
In terms of system processing I think the two might be the same, because the same number of records is processed in both approaches.
If there is a better idea, please let me know.
Your two approaches will have about the same performance (slow because of N+1 queries). It would be faster to do a single query like this:
select *
from T1
left join T2 on ...
order by T1.PrimaryKey
Your client app can then interpret the results, having fetched all the data in a single query. An alternative would be:
select *, 1 as Tag
from T1
union all
select *, 2 as Tag
from T2
order by T1.PrimaryKey, Tag
This is just pseudocode, but you could make it work.
The union-all query will have surprisingly good performance because SQL Server will do a "merge union", which works like a merge join. This pattern also works for multi-level parent-child relationships, although not as well.
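As a rough illustration of how a client app might interpret the result of the single left-join query, the sketch below groups the joined rows by user once they are in a DataTable. The names JoinGrouper and CarsByUser and the column names passed in are hypothetical, and the sketch assumes an int user key that is never NULL.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical helper; not part of any library.
public static class JoinGrouper
{
    // Groups the rows of a single "T1 left join T2" result by user key.
    // Users without cars get an empty list.
    public static Dictionary<int, List<DataRow>> CarsByUser(
        DataTable joined, string userKey, string carKey)
    {
        var result = new Dictionary<int, List<DataRow>>();
        foreach (DataRow row in joined.Rows)
        {
            var userId = row.Field<int>(userKey);
            if (!result.TryGetValue(userId, out var cars))
                result[userId] = cars = new List<DataRow>();

            // A left join yields NULL car columns for users without cars.
            if (!row.IsNull(carKey))
                cars.Add(row);
        }
        return result;
    }
}
```

This replaces the per-user queries of both approaches in the question with one pass over one result set.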