Find duplicates between two SQL tables - c#

I have two tables (Table1 and Table2) with the same primary keys, let's say Key1 and Key2. What I need to do is separate the records in Table1 into two groups: the duplicates (records found in Table2) and the non-duplicates. I know I could use the queries below, but that seems bloated and repetitive. Is there a trimmer way to do this, possibly with a single call?
SELECT Key1, Key2 FROM Table1 WHERE Key1 IN (SELECT Key1 FROM Table2) AND Key2 IN (SELECT Key2 FROM Table2);
SELECT Key1, Key2 FROM Table1 WHERE Key1 NOT IN (SELECT Key1 FROM Table2) AND Key2 NOT IN (SELECT Key2 FROM Table2);
This call is being made from a C# ASP.NET codebehind page.

This query does a left outer join to ensure all records from Table1 are returned. It joins on Table2, and if there are no matches, then any columns in Table2 will be NULL for that row. This behavior is used in a CASE expression to set a flag telling whether the Table1 row exists in Table2 or not:
select t1.*,
case when t2.Key1 is null then 'false' else 'true' end as IsDuplicate
from Table1 t1
left outer join Table2 t2 on t1.Key1 = t2.Key1
and t1.Key2 = t2.Key2
You can then filter in your application based on the IsDuplicate column.
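To see the flag in action, here is a minimal sketch that runs the same join from Python's sqlite3 module against an in-memory database. SQLite stands in for the real server, and the sample rows are made up for illustration; only the table and column names come from the question.

```python
import sqlite3

# In-memory SQLite is a stand-in for the real database (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Key1 INTEGER, Key2 INTEGER, PRIMARY KEY (Key1, Key2));
    CREATE TABLE Table2 (Key1 INTEGER, Key2 INTEGER, PRIMARY KEY (Key1, Key2));
    INSERT INTO Table1 VALUES (1, 1), (1, 2), (2, 1);
    INSERT INTO Table2 VALUES (1, 1), (2, 2);
""")

# One call: every Table1 row comes back with a flag saying whether
# the same (Key1, Key2) pair exists in Table2.
rows = conn.execute("""
    SELECT t1.Key1, t1.Key2,
           CASE WHEN t2.Key1 IS NULL THEN 'false' ELSE 'true' END AS IsDuplicate
    FROM Table1 t1
    LEFT OUTER JOIN Table2 t2
        ON t1.Key1 = t2.Key1 AND t1.Key2 = t2.Key2
    ORDER BY t1.Key1, t1.Key2
""").fetchall()

# Split into the two groups in application code, based on the flag.
duplicates = [(k1, k2) for k1, k2, flag in rows if flag == 'true']
non_duplicates = [(k1, k2) for k1, k2, flag in rows if flag == 'false']

print(duplicates)
print(non_duplicates)
```

With this sample data, the rows (1, 2) and (2, 1) end up in the non-duplicate group because the join compares the key pair as a whole. The two IN subqueries from the question test each key column independently, so they would treat both of those rows as duplicates.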

Check the new upsert statement (MERGE) in SQL Server 2008.

Related

Doing a Select query syntax from a Dataset table with Sum() and Group by clause

Is there a way to query a dataset table?
I was trying to apply this to the dataset table but I do not know the syntax
SELECT [In], F2, [Out], F5, SUM(Hours)
FROM Table1
WHERE (F2 = (SELECT TOP 1 FROM Table1 ORDER BY ASC)
AND F5 = (SELECT TOP 1 FROM Table1 ORDER BY DESC))
GROUP BY IN, OUT;

Nested Select MySQL statements to LINQ

I'm trying to convert the following MySQL statement into LINQ query format
SELECT * FROM table1 WHERE table1.id IN (SELECT c_id FROM table2 WHERE a_id IN (1, 49) GROUP BY c_id HAVING COUNT(*) = 2) ORDER BY name
Got as far as this, but I'm drawing a blank on how to handle the IN and 2nd SELECT statement
myItems = from c in table1
let id = c.id
where ????
orderby c.name
select c;
Would appreciate some guidance with this please
Try this:
var ids=new[]{1,49};
var innerquery=table2.Where(e=>ids.Contains(e.a_id))
.GroupBy(e=>e.c_id)
.Where(g=>g.Count()==2)
.Select(g=>g.Key);
var myItems = from c in table1
where innerquery.Contains(c.id)
orderby c.name
select c;
First define your inner query. After the group by you get a collection of IGrouping<TKey, TElement> objects, each representing a collection of objects that share a common key. Filter the groups, keeping only those where the count is 2, and select the keys of those groups. The second part is really easy to understand. I split the process into two queries to make it more readable, but you can merge both queries into one.
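If it helps to check the logic outside of LINQ, the sketch below runs the original SQL against an in-memory SQLite database and then repeats the same filter / group / count / select-keys steps by hand in Python. The sample rows are invented for illustration, and SQLite is only a stand-in for MySQL here.

```python
import sqlite3
from collections import Counter

# Hypothetical sample data: table1 rows are (id, name), table2 rows are (c_id, a_id).
table1 = [(10, "zeta"), (11, "alpha"), (12, "gamma"), (13, "beta")]
table2 = [(10, 1), (10, 49), (11, 1), (12, 49), (12, 7), (13, 1), (13, 49)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE table2 (c_id INTEGER, a_id INTEGER)")
conn.executemany("INSERT INTO table1 VALUES (?, ?)", table1)
conn.executemany("INSERT INTO table2 VALUES (?, ?)", table2)

# The statement from the question, unchanged apart from layout.
sql_result = conn.execute("""
    SELECT * FROM table1
    WHERE table1.id IN (SELECT c_id FROM table2
                        WHERE a_id IN (1, 49)
                        GROUP BY c_id
                        HAVING COUNT(*) = 2)
    ORDER BY name
""").fetchall()

# The same pipeline step by step, mirroring Where / GroupBy / Where / Select.
ids = {1, 49}
counts = Counter(c_id for c_id, a_id in table2 if a_id in ids)  # filter, then count per key
inner_keys = {c_id for c_id, n in counts.items() if n == 2}     # keep groups with exactly 2 rows
step_result = sorted((row for row in table1 if row[0] in inner_keys),
                     key=lambda row: row[1])                    # outer filter, order by name

print(sql_result)
print(step_result)
```

Both results contain only the ids 13 and 10, the two c_id values that have a row for a_id 1 and a row for a_id 49.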

Joining multiple tables with one to many relationship

I have 3 tables:
maintable(id, serialno, col3, col4, col5, ..., col10)
table1(t1_id, serialno, t1_type, t1_color)
table2(t2_id, serialno, t2_base, t2_price)
maintable's Primary Key is id and serialno is UNIQUE.
Table1's Primary Key is t1_id and table2's is t2_id.
Table1 and Table2 serialno are Foreign Keys that reference MainTable's serialno.
maintable has a one to many relationship with both table1 and table2.
What I want to do is join these 3 tables in a DataTable.
I first thought that it would be simple and I tried: "SELECT * FROM maintable INNER JOIN table1 ON maintable.serialno = table1.serialno INNER JOIN table2 ON maintable.serialno= table2.serialno WHERE maintable.id = 200";
The problem with the result is that if table1 has 3 rows and table2 has 4 rows then my DataTable ends up with 12 rows (3x4). What I want to do in this instance is just get 4 rows.
table1 and table2 columns don't have anything to do with each other and they only have to match maintable's serialno.
In case that I'm not being understood, I want to select the rows of table1 and table2 that match maintable's serialno and add them to the right of maintable without them getting duplicated.
Edit: Sorry, I had written accountno instead of serialno in some cases.
SELECT * FROM
maintable m
INNER JOIN (
SELECT t1.serialno, t1.t1_type, t1.t1_color, null as t2_base, null as t2_price
FROM table1 t1
UNION ALL
SELECT t2.serialno, null as t1_type, null as t1_color, t2.t2_base, t2.t2_price
FROM table2 t2
) t ON m.serialno = t.serialno
ORDER BY m.serialno
This will do what you're asking for: it returns the number of rows in t1 plus the number of rows in t2, rather than the number of rows in t1 times the number of rows in t2. It may not perform well if you have a large amount of data.
Now that you know how it's done, don't do it.
The real question is why is this a requirement? What are you really trying to accomplish here? This is not a meaningful way to combine the data from the two child tables, given their relationships. T1 and t2 are different tables and not keyed to each other for a reason: they aren't meant to combine their data like this.
The only new data I can imagine extracting from this kind of query is the total count of rows in both t1 and t2 for a given serial number. But there are much better ways to get this information than selecting the rows like this. If you need both t1 and t2 data and duplicates are throwing you off, odds are good that you should be making two separate SELECT statements instead of trying to combine everything.
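To make the row counts concrete, here is a small sketch using Python's sqlite3 with an in-memory database. The schema follows the question (trimmed to the columns that matter), the sample rows are invented, and SQLite is only a stand-in for the real server. UNION ALL is used in the combined query so that identical child rows are not collapsed; that is a choice made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE maintable (id INTEGER PRIMARY KEY, serialno TEXT UNIQUE);
    CREATE TABLE table1 (t1_id INTEGER PRIMARY KEY, serialno TEXT, t1_type TEXT, t1_color TEXT);
    CREATE TABLE table2 (t2_id INTEGER PRIMARY KEY, serialno TEXT, t2_base TEXT, t2_price REAL);
    INSERT INTO maintable VALUES (200, 'S1');
    INSERT INTO table1 (serialno, t1_type, t1_color) VALUES
        ('S1', 'a', 'red'), ('S1', 'b', 'green'), ('S1', 'c', 'blue');
    INSERT INTO table2 (serialno, t2_base, t2_price) VALUES
        ('S1', 'w', 1.0), ('S1', 'x', 2.0), ('S1', 'y', 3.0), ('S1', 'z', 4.0);
""")

# Joining both child tables at once multiplies the rows (3 x 4).
double_join = conn.execute("""
    SELECT m.id, t1.t1_type, t2.t2_base
    FROM maintable m
    INNER JOIN table1 t1 ON m.serialno = t1.serialno
    INNER JOIN table2 t2 ON m.serialno = t2.serialno
    WHERE m.id = 200
""").fetchall()

# Stacking the child tables gives 3 + 4 rows, padded with NULLs.
combined = conn.execute("""
    SELECT m.id, t.*
    FROM maintable m
    INNER JOIN (
        SELECT serialno, t1_type, t1_color, NULL AS t2_base, NULL AS t2_price FROM table1
        UNION ALL
        SELECT serialno, NULL, NULL, t2_base, t2_price FROM table2
    ) t ON m.serialno = t.serialno
    WHERE m.id = 200
""").fetchall()

# Two separate SELECT statements, one per child table.
t1_rows = conn.execute("""
    SELECT t1.* FROM table1 t1
    INNER JOIN maintable m ON m.serialno = t1.serialno
    WHERE m.id = ?
""", (200,)).fetchall()
t2_rows = conn.execute("""
    SELECT t2.* FROM table2 t2
    INNER JOIN maintable m ON m.serialno = t2.serialno
    WHERE m.id = ?
""", (200,)).fetchall()

print(len(double_join), len(combined), len(t1_rows), len(t2_rows))
```

The two separate result sets keep each child table's columns intact, with no NULL padding and no multiplied rows, which is usually easier to consume in application code.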
SELECT
maintable.accountno
FROM
maintable
INNER JOIN table1 ON
maintable.accountno = table1.accountno
INNER JOIN table2 ON
maintable.accountno = table2.accountno
WHERE
maintable.id = 200
GROUP BY
maintable.accountno

Using COUNT For Comparison in SQL Server CE 4.0

I'm attempting to combine the logic for some of my SQL queries, and I can't seem to figure out this problem. Obviously SQL Server CE has many limitations compared to SQL Server or MySQL, but surely there's a way to solve this.
I want to do a count on one table in my database, based on some parameters, and then I want to compare this value to a value stored in a column in another table.
Let's say the database is modeled like this:
Table1:
ID int
Key string
NumberInUse int
Table2:
ID int
OtherID int
Here are the necessary parts of the query.
SELECT *
FROM Table1
LEFT JOIN Table2 ON Table1.ID = Table2.ID
WHERE Table1.Key = #key
AND (SELECT COUNT(*) FROM Table2 WHERE ID = Table1.ID AND OtherID = #otherID) < Table1.NumberInUse;
Unfortunately this query gives me this error:
There was an error parsing the query. [ Token line number = 4,Token line offset = 6,Token in error = SELECT ]
So is there a way I can rephrase the WHERE clause of my query to utilize this comparison?
Try this:
SELECT *
FROM Table1 t1
INNER JOIN (SELECT ID
,COUNT(*) numCount
FROM Table2 t2
WHERE t2.OtherId = #otherID
GROUP BY ID) t3
ON t1.ID = t3.ID
WHERE t1.Key = #Key
AND t3.numCount < t1.NumberInUse
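One thing worth checking with the derived-table approach: the INNER JOIN drops Table1 rows that have no matching rows in Table2 at all, whereas the original correlated COUNT(*) would have counted those as 0. The sketch below shows the difference using Python's sqlite3 with an in-memory database. It says nothing about what SQL Server CE itself accepts; the sample rows are invented, and the parameters are written as SQLite named parameters (:key, :otherID).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (ID INTEGER, "Key" TEXT, NumberInUse INTEGER);
    CREATE TABLE Table2 (ID INTEGER, OtherID INTEGER);
    INSERT INTO Table1 VALUES (1, 'k', 2), (2, 'k', 1), (3, 'k', 5);
    INSERT INTO Table2 VALUES (1, 7), (2, 7), (2, 7);
""")
params = {"key": "k", "otherID": 7}

# Derived table joined with INNER JOIN: only IDs that have at least one counted row.
inner_ids = [row[0] for row in conn.execute("""
    SELECT t1.ID
    FROM Table1 t1
    INNER JOIN (SELECT ID, COUNT(*) AS numCount
                FROM Table2
                WHERE OtherID = :otherID
                GROUP BY ID) t3
        ON t1.ID = t3.ID
    WHERE t1."Key" = :key
      AND t3.numCount < t1.NumberInUse
    ORDER BY t1.ID
""", params)]

# LEFT JOIN plus COALESCE treats "no rows in Table2" as a count of 0.
left_ids = [row[0] for row in conn.execute("""
    SELECT t1.ID
    FROM Table1 t1
    LEFT JOIN (SELECT ID, COUNT(*) AS numCount
               FROM Table2
               WHERE OtherID = :otherID
               GROUP BY ID) t3
        ON t1.ID = t3.ID
    WHERE t1."Key" = :key
      AND COALESCE(t3.numCount, 0) < t1.NumberInUse
    ORDER BY t1.ID
""", params)]

print(inner_ids)
print(left_ids)
```

ID 3 has no rows in Table2, so it only appears in the LEFT JOIN variant. Which behavior is wanted depends on whether "nothing in use yet" should count as being under the limit.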
Sure it's not SQL. You're missing the right operand of the second LEFT JOIN:
SELECT *
FROM Table1 LEFT JOIN Table2
ON Table1.ID = Table2.ID
LEFT JOIN ????? WHUT ?????
WHERE Table1.Key = #key
AND (SELECT COUNT(*) FROM Table2 WHERE ID = Table1.ID AND OtherID = #otherID) < Table1.NumberInUse;

SQL Server NULL value with inner join

I am using C# and SQL Server.
Take a look at the following SQL:
SELECT table1.id, table1.description, table2.name, table2.surname
FROM table1
INNER JOIN table2 ON table1.EmpID = table2.EmpID
It is straightforward and works fine. It retrieves the data from table1 just fine and inner joins table1.empid to table2.name and table2.surname correctly.
Now, sometimes table1.empid is null and when it is, this SQL just ignores the "row" with the null value, which is pretty normal based on the criteria.
What I need here is to also get the "rows" with the null values, and when table1.empid is null I need to set a custom value for table2.name and table2.surname.
I have been playing with isnull() but all I did was make it even worse.
Any suggestions?
Thanks
You need to do a LEFT JOIN:
SELECT table1.id, table1.description, table2.name, table2.surname FROM table1
LEFT JOIN table2 ON table1.EmpID = table2.EmpID;
Try using a UNION:
SELECT table1.id, table1.description, table2.name, table2.surname
FROM table1
INNER JOIN table2 ON table1.EmpID = table2.EmpID
UNION
SELECT table1.id, table1.description, 'Table 2 Null', 'Table 2 Null'
FROM table1
WHERE table1.empId is null
If table 1 is null and you still need the records, then you cannot start with that. Start with table2 and join table1.
SELECT table1.id, table1.description, ISNULL(table1.empid, 'some new value') AS name, table2.surname
FROM table2
LEFT OUTER JOIN table1 ON table2.EmpID = table1.EmpID
SELECT table1.id
,table1.description
,COALESCE(table2.name, 'DEFAULT') AS name
,COALESCE(table2.surname, 'DEFAULT') AS surname
FROM table1
LEFT JOIN table2
ON table1.EmpID = table2.EmpID
Note that this will also include rows where EmpID is not null but is nevertheless "invalid", that is, rows that have an EmpID in table1 that isn't found in table2. If that's something you want to avoid, another option is this:
SELECT table1.id
,table1.description
,table2.name
,table2.surname
FROM table1
INNER JOIN table2
ON table1.EmpID = table2.EmpID
UNION ALL
SELECT table1.id
,table1.description
,'DEFAULT' AS name
,'DEFAULT' AS surname
FROM table1
WHERE table1.EmpID IS NULL
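The difference between the two queries above only shows up when table1 holds an EmpID that has no match in table2. A minimal sketch with Python's sqlite3 and an in-memory database (SQLite as a stand-in for SQL Server, with invented sample rows) makes that visible:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, description TEXT, EmpID INTEGER);
    CREATE TABLE table2 (EmpID INTEGER, name TEXT, surname TEXT);
    -- id 1 has a valid EmpID, id 2 has NULL, id 3 has an EmpID missing from table2.
    INSERT INTO table1 VALUES (1, 'first', 100), (2, 'second', NULL), (3, 'third', 999);
    INSERT INTO table2 VALUES (100, 'Ann', 'Lee');
""")

# LEFT JOIN + COALESCE: defaults are applied to NULL and to unmatched EmpIDs alike.
left_join_rows = conn.execute("""
    SELECT table1.id,
           COALESCE(table2.name, 'DEFAULT') AS name,
           COALESCE(table2.surname, 'DEFAULT') AS surname
    FROM table1
    LEFT JOIN table2 ON table1.EmpID = table2.EmpID
    ORDER BY table1.id
""").fetchall()

# INNER JOIN + UNION ALL: defaults are applied only where EmpID IS NULL.
union_rows = conn.execute("""
    SELECT table1.id, table2.name, table2.surname
    FROM table1
    INNER JOIN table2 ON table1.EmpID = table2.EmpID
    UNION ALL
    SELECT table1.id, 'DEFAULT' AS name, 'DEFAULT' AS surname
    FROM table1
    WHERE table1.EmpID IS NULL
    ORDER BY 1
""").fetchall()

print(left_join_rows)
print(union_rows)
```

Row 3, whose EmpID of 999 is not in table2, comes back with default values from the LEFT JOIN version and is left out entirely by the UNION ALL version.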
Select table1.id, table1.description
, Case When table1.EmpID Is Null Then 'Some Value' Else table2.name End As Table2Name
, Case When table1.EmpID Is Null Then 'Some Value' Else table2.surname End As Table2Surname
From table1
Left Join table2
On table2.EmpID = table1.EmpID
Where table1.EmpID Is Null
Or table2.EmpID Is Not Null
