How to merge SQL Server data efficiently? - c#

I have three tables in SQL Server where I need to combine all matching rows from all tables into a fourth MergedTable that will contain all the columns from the three individual tables based on the U_ID column.
Is there a way of doing this via T-SQL in a stored procedure, or should I just create a loop function in C#?
Bottom line is this is going to be executed from a command from a website, so it needs to be something I can encapsulate into an MVC project or component.
Here is an example of the tables.
Table 1:
U_ID ClientNumber OrderDate Amount
---------------------------------------------
BB000Kw 1920384 5/14/2013 1093.39
AA000bM 3839484 12/8/2012 584.42
AA000gH 8294848 2/28/2014 4849.38
AA000md 3849484 4/31/2013 590.84
AA000mF 3998398 3/29/2013 448.82
AA000mG 9944848 11/28/2014 98.85
AA000mn 0292938 10/31/2012 300.48
Table 2:
U_ID Name Date
------------------------------------------
AA000bM "Krivis, Jeffrey" 7/1/2002
AA000bv "Saydah, Michael" 7/30/2002
AA000cA "Byrne, Richard" 4/21/2003
AA000dd "McNeil, Joseph" 6/10/2003
AA000dH "Greenberg, Arnold" 1/16/2003
AA000gH "Rich, Elwood" 7/5/2003
AA000id "O'Neill, Robert J." 11/20/2002
AA000jf "Patsey, Richard" 4/22/2003
AA000jr "Jones, Arthur" 7/1/2002
AA000jU "Toff, Ronald" 7/15/2002
AA000k4 "Anderson, Carl" 8/14/2002
BB000Kw "Wilson, Sam" 3/9/2003
Table 3:
U_ID Name
-----------------------------
AA000bM Acme Company
AA000jr Stockwell Industries
BB000ke Gensen Motors
BB999di Falstaff Cards
BB000dl B and R Printing
BB000Kw Go Golf Carts
AA000gH Rich's Sandwiches
Resulting merged table
U_ID ClientNumber OrderDate Amount CustomerName JoinDate CompanyName
-------------------------------------------------------------------------------------------------------
BB000Kw 1920384 5/14/2013 1093.39 "Wilson, Sam" 3/9/2003 Go Golf Carts
AA000bM 3839484 12/8/2012 584.42 "Krivis, Jeffrey" 7/1/2002 Acme Company
AA000gH 8294848 2/28/2014 4849.38 "Rich, Elwood" 7/5/2003 Rich's Sandwiches
Table 1 is the master table that the others are matched to. You can see from the result that there will be only a subset of all the tables based on those that are matched from Table 1.
I'll be using MVC with the Entity Framework 6 and Linq-to-Entities, but if a T-SQL script is more efficient, then I should probably use that instead.
Which is the better way to go to get this result?

If you want to create a new table, you can use a SELECT ... INTO ... FROM query. In your case it would look like this:
SELECT t1.U_ID, t1.ClientNumber, t1.OrderDate, t1.Amount,
t2.Name as CustomerName, t2.Date as JoinDate,
t3.Name as CompanyName
INTO dbo.ResultingMergedTable
FROM Table1 t1
INNER JOIN Table2 t2 ON t1.U_ID = t2.U_ID
INNER JOIN Table3 t3 ON t1.U_ID = t3.U_ID
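Keep in mind that on very large tables this can take a long time to execute. Also note that SELECT ... INTO creates the target table, so it fails if dbo.ResultingMergedTable already exists; drop it first if you run this repeatedly.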
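To try the join logic without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. It is an assumption-laden stand-in, not the T-SQL above: SQLite has no SELECT ... INTO, so the sketch uses CREATE TABLE ... AS SELECT instead, and the column types are guesses. The rows are a subset of the sample data from the question.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (U_ID TEXT, ClientNumber TEXT, OrderDate TEXT, Amount REAL);
CREATE TABLE Table2 (U_ID TEXT, Name TEXT, Date TEXT);
CREATE TABLE Table3 (U_ID TEXT, Name TEXT);
""")
conn.executemany("INSERT INTO Table1 VALUES (?, ?, ?, ?)", [
    ("BB000Kw", "1920384", "5/14/2013", 1093.39),
    ("AA000bM", "3839484", "12/8/2012", 584.42),
    ("AA000gH", "8294848", "2/28/2014", 4849.38),
    ("AA000mF", "3998398", "3/29/2013", 448.82),   # no match in Table2/Table3
])
conn.executemany("INSERT INTO Table2 VALUES (?, ?, ?)", [
    ("AA000bM", "Krivis, Jeffrey", "7/1/2002"),
    ("AA000gH", "Rich, Elwood", "7/5/2003"),
    ("AA000jr", "Jones, Arthur", "7/1/2002"),      # not in Table1
    ("BB000Kw", "Wilson, Sam", "3/9/2003"),
])
conn.executemany("INSERT INTO Table3 VALUES (?, ?)", [
    ("AA000bM", "Acme Company"),
    ("AA000jr", "Stockwell Industries"),           # not in Table1
    ("BB000Kw", "Go Golf Carts"),
    ("AA000gH", "Rich's Sandwiches"),
])

# SQLite equivalent of SELECT ... INTO: create the merged table from the join.
conn.execute("""
CREATE TABLE ResultingMergedTable AS
SELECT t1.U_ID, t1.ClientNumber, t1.OrderDate, t1.Amount,
       t2.Name AS CustomerName, t2.Date AS JoinDate,
       t3.Name AS CompanyName
FROM Table1 t1
INNER JOIN Table2 t2 ON t1.U_ID = t2.U_ID
INNER JOIN Table3 t3 ON t1.U_ID = t3.U_ID
""")

merged = conn.execute(
    "SELECT U_ID, CustomerName, CompanyName FROM ResultingMergedTable ORDER BY U_ID"
).fetchall()
for row in merged:
    print(row)
```

Only the U_ID values present in all three tables end up in the merged table, which matches the expected result in the question.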

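You can create a fourth table as you describe, but since you are using SQL Server you can also create a view that does the same thing. A view is a virtual table: it stores the query rather than a copy of the data, so it always reflects the current contents of the underlying tables. We use views when we partition data as well as to build detailed records like the one described above.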
http://msdn.microsoft.com/en-us/library/ms187956.aspx
http://www.sqlinfo.net/sqlserver/sql_server_VIEWS_the_basics.php
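CREATE VIEW DetailView AS
SELECT
    -- table1
    t1.U_ID,
    t1.ClientNumber,
    t1.OrderDate,
    t1.Amount,
    -- table2
    t2.Name AS [CustomerName],
    t2.Date AS [JoinDate],
    -- table3
    t3.Name AS [CompanyName]
FROM
    table1 t1
    -- INNER JOIN keeps only the rows that have a match in all three tables.
    -- (A LEFT JOIN followed by a WHERE on the same keys gives the same result,
    -- so the joins are written as INNER JOIN directly.)
    INNER JOIN table2 t2
        ON t1.U_ID = t2.U_ID
    INNER JOIN table3 t3
        ON t1.U_ID = t3.U_ID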
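As a rough illustration of why a view can replace the fourth table, the sketch below (Python's sqlite3 standing in for SQL Server, which is an assumption) creates a view over three small tables and shows that it picks up newly inserted rows without being rebuilt. The cut-down table definitions and the rows added for AA000mF are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (U_ID TEXT, Amount REAL);
CREATE TABLE table2 (U_ID TEXT, Name TEXT);
CREATE TABLE table3 (U_ID TEXT, Name TEXT);

INSERT INTO table1 VALUES ('AA000bM', 584.42), ('AA000mF', 448.82);
INSERT INTO table2 VALUES ('AA000bM', 'Krivis, Jeffrey');
INSERT INTO table3 VALUES ('AA000bM', 'Acme Company');

CREATE VIEW DetailView AS
SELECT t1.U_ID, t1.Amount, t2.Name AS CustomerName, t3.Name AS CompanyName
FROM table1 t1
INNER JOIN table2 t2 ON t1.U_ID = t2.U_ID
INNER JOIN table3 t3 ON t1.U_ID = t3.U_ID;
""")

# Only AA000bM has a match in all three tables so far.
before = conn.execute("SELECT COUNT(*) FROM DetailView").fetchone()[0]

# Hypothetical rows: give AA000mF a customer and a company.
conn.execute("INSERT INTO table2 VALUES ('AA000mF', 'Example, Person')")
conn.execute("INSERT INTO table3 VALUES ('AA000mF', 'Example Company')")

# The view is evaluated at query time, so the new match shows up immediately.
after = conn.execute("SELECT COUNT(*) FROM DetailView").fetchone()[0]
print(before, after)
```

A materialized fourth table would need to be refreshed after the inserts; the view does not, at the cost of running the join on every read.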

Related

Delete from TableA with Where equals in TableB

I have a staging table through which I want to delete all matching records in my Customers table. In "language terms":
delete
tableA.*
from
table A,table B
where
TableA.col1=TableB.col1
&& TableA.colb=TableB.col2 /// and so forth
Some info about the tables:
There are no relationships between the tables. The only true way to match the records is to match all of the columns (I want to clear any duplicates)
There are no foreign keys in place between the two tables. The staging table is imported from CSV, and the data will be transformed for use within our system.
Most of the imports will be identical (around 80% of the staging rows are to be deleted, out of around 60k records).
I have this working in Linq2SQL, but it takes a long time because of the number of queries it issues. Since around 80% of the records match, I feel a single query should suffice.
Is this at all possible in SQL?
You can use JOIN with DELETE:
DELETE a
FROM tableA a
INNER JOIN tableB b
ON a.Col1 = b.Col1
AND a.ColB = b.ColB
... and so on
or by using EXISTS:
DELETE a
FROM tableA a
WHERE EXISTS
(
SELECT 1 FROM tableB b
WHERE a.Col1 = b.Col1
AND a.ColB = b.ColB
....
)
Another option is MERGE (SQL Server 2008 and later):
merge table1 t1
using (
select t2.ID
from table2 t2
) as d
on t1.ID = d.ID
when matched then delete;
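The EXISTS form is the most portable of the three. Here is a minimal sketch of it using Python's sqlite3 module as a stand-in for SQL Server (an assumption; SQLite does not accept the DELETE ... FROM ... JOIN syntax, so only the EXISTS variant is shown). The two-column tables and their rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tableA (Col1 TEXT, ColB TEXT);
CREATE TABLE tableB (Col1 TEXT, ColB TEXT);
INSERT INTO tableA VALUES ('a', '1'), ('b', '2'), ('c', '3');
INSERT INTO tableB VALUES ('a', '1'), ('c', '3'), ('c', '9');
""")

# One set-based statement: delete every tableA row that has a full match in tableB.
cur = conn.execute("""
DELETE FROM tableA
WHERE EXISTS (
    SELECT 1 FROM tableB b
    WHERE tableA.Col1 = b.Col1
      AND tableA.ColB = b.ColB
)
""")
deleted = cur.rowcount
remaining = conn.execute("SELECT Col1, ColB FROM tableA").fetchall()
print(deleted, remaining)
```

Because rows are matched with `=`, a NULL in any compared column never matches, so rows containing NULLs would need extra handling.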

Joining multiple tables with one to many relationship

I have 3 tables:
maintable(id, serialno, col3, col4, col5, ..., col10)
table1(t1_id, serialno, t1_type, t1_color)
table2(t2_id, serialno, t2_base, t2_price)
maintable's Primary Key is id and serialno is UNIQUE.
Table1's Primary Key is t1_id and table2's is t2_id.
Table1 and Table2 serialno are Foreign Keys that reference MainTable's serialno.
maintable has a one to many relationship with both table1 and table2.
What I want to do is join these 3 tables in a DataTable.
I first thought that it would be simple and I tried: "SELECT * FROM maintable INNER JOIN table1 ON maintable.serialno = table1.serialno INNER JOIN table2 ON maintable.serialno= table2.serialno WHERE maintable.id = 200";
The problem with the result is that if table1 has 3 rows and table2 has 4 rows then my DataTable becomes 12 rows(3x4). What I want to do in this instance is just get 4 rows.
table1 and table2 columns don't have anything to do with each other and they only have to match maintable's serialno.
In case that I'm not being understood, I want to select the rows of table1 and table2 that match maintable's serialno and add them to the right of maintable without them getting duplicated.
Edit: Sorry, I had written accountno instead of serialno in some cases.
SELECT * FROM
maintable m
INNER JOIN (
SELECT t1.serialno, t1.t1_type, t1.t1_color, null as t2_base, null as t2_price
FROM table1 t1
UNION ALL -- keep every row; a plain UNION would collapse identical rows
SELECT t2.serialno, null as t1_type, null as t1_color, t2.t2_base, t2.t2_price
FROM table2 t2
) t ON m.serialno = t.serialno
ORDER BY m.serialno
This will do what you're asking for: it returns (rows in t1) + (rows in t2), rather than (rows in t1) x (rows in t2). It may not perform well if you have a large amount of data.
Now that you know how it's done, don't do it.
The real question is why is this a requirement? What are you really trying to accomplish here? This is not a meaningful way to combine the data from the two child tables, given their relationships. T1 and t2 are different tables and not keyed to each other for a reason: they aren't meant to combine their data like this.
The only new data I can imagine extracting from this kind of query is the total count of rows in both t1 and t2 for a given serial number. But there are much better ways to get this information than selecting the rows like this. If you need both t1 and t2 data and duplicates are throwing you off, odds are good that you should be making two separate SELECT statements instead of trying to combine everything.
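To make the row multiplication and the two-query alternative concrete, here is a small sketch using Python's sqlite3 module (standing in for the asker's database, which is an assumption). The serial number 'S-1' and all child rows are invented: table1 gets 3 rows and table2 gets 4, as in the question.

```python
import sqlite3

# In-memory SQLite database standing in for the real one (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE maintable (id INTEGER PRIMARY KEY, serialno TEXT UNIQUE);
CREATE TABLE table1 (t1_id INTEGER PRIMARY KEY, serialno TEXT, t1_type TEXT, t1_color TEXT);
CREATE TABLE table2 (t2_id INTEGER PRIMARY KEY, serialno TEXT, t2_base TEXT, t2_price REAL);

INSERT INTO maintable VALUES (200, 'S-1');
INSERT INTO table1 (serialno, t1_type, t1_color)
VALUES ('S-1', 'a', 'red'), ('S-1', 'b', 'blue'), ('S-1', 'c', 'green');
INSERT INTO table2 (serialno, t2_base, t2_price)
VALUES ('S-1', 'w', 1.0), ('S-1', 'x', 2.0), ('S-1', 'y', 3.0), ('S-1', 'z', 4.0);
""")

# Joining both child tables at once pairs every table1 row with every table2 row.
joined = conn.execute("""
SELECT COUNT(*)
FROM maintable m
INNER JOIN table1 t1 ON m.serialno = t1.serialno
INNER JOIN table2 t2 ON m.serialno = t2.serialno
WHERE m.id = ?
""", (200,)).fetchone()[0]

# Two separate queries return each child set once, with no cross product.
t1_rows = conn.execute("""
SELECT t1.t1_type, t1.t1_color
FROM maintable m
INNER JOIN table1 t1 ON m.serialno = t1.serialno
WHERE m.id = ?
""", (200,)).fetchall()

t2_rows = conn.execute("""
SELECT t2.t2_base, t2.t2_price
FROM maintable m
INNER JOIN table2 t2 ON m.serialno = t2.serialno
WHERE m.id = ?
""", (200,)).fetchall()

print(joined, len(t1_rows), len(t2_rows))
```

The application can then hold the two result sets side by side instead of forcing unrelated rows into one table.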
SELECT
maintable.serialno
FROM
maintable
INNER JOIN table1 ON
maintable.serialno = table1.serialno
INNER JOIN table2 ON
maintable.serialno = table2.serialno
WHERE
maintable.id = 200
GROUP BY
maintable.serialno

SQL query to check the presence of a particular product in multiple tables

I have 7 tables, and each table should contain an entry for a particular product. I want to check whether all 7 tables contain an entry for a particular ID (e.g. 4562), i.e. whether the data exists or not. I am using SQL Server 2008. Please help me write a query to check this.
Try the following command (example for 3 tables T1, T2, T3). It returns 1 if ID = 4562 exists in ALL tables and 0 if at least one table is missing this ID.
SELECT
CASE WHEN
(
EXISTS(SELECT ID FROM T1 WHERE ID=4562)
AND EXISTS(SELECT ID FROM T2 WHERE ID=4562)
AND EXISTS(SELECT ID FROM T3 WHERE ID=4562)
)
THEN 1
ELSE 0
END AS [ID_Exists_in_all_tables]
If you do a basic join rather than left join, the product will only appear if it's in all of the tables.
select * from tab1
join tab2 on tab2.id = tab1.id
join tab3 on tab3.id = tab1.id
join tab4 on tab4.id = tab1.id
join tab5 on tab5.id = tab1.id
Where tab1.id = 1234
etc etc
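For a quick way to experiment with the EXISTS approach, here is a minimal sketch in Python with the built-in sqlite3 module (a stand-in for SQL Server 2008, which is an assumption). The helper name `exists_in_all` and the three tiny tables are hypothetical; the real case would list all 7 table names.

```python
import sqlite3

def exists_in_all(conn, tables, product_id):
    """Return True if product_id appears in every table in `tables`.

    Table names cannot be bound as parameters, so they are interpolated;
    they must come from a fixed list in code, never from user input.
    """
    conditions = " AND ".join(
        f"EXISTS (SELECT 1 FROM {table} WHERE ID = ?)" for table in tables
    )
    sql = f"SELECT CASE WHEN {conditions} THEN 1 ELSE 0 END"
    # One bound parameter per EXISTS clause.
    return conn.execute(sql, [product_id] * len(tables)).fetchone()[0] == 1

# In-memory SQLite database with made-up data (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE T1 (ID INTEGER);
CREATE TABLE T2 (ID INTEGER);
CREATE TABLE T3 (ID INTEGER);
INSERT INTO T1 VALUES (4562), (1234);
INSERT INTO T2 VALUES (4562), (1234);
INSERT INTO T3 VALUES (4562);
""")

in_all = exists_in_all(conn, ["T1", "T2", "T3"], 4562)
missing_somewhere = exists_in_all(conn, ["T1", "T2", "T3"], 1234)
print(in_all, missing_somewhere)
```

Unlike the join version, EXISTS returns a single flag and is not affected by duplicate IDs within a table.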

MS Access Database SQL Query

I have 3 tables called Invoice, Customer and Company. I want to merge these 3 tables into a single result using a query. The Invoice table contains the Customer Id and Company Id. How do I join the 3 tables?
Joining the Invoice and Customer tables works fine with this query, but I have no idea how to add the 3rd table to it.
SELECT RPT_Invoice_Less.InvoiceNumber, RPT_Invoice_Less.Terms,
RPT_Invoice_Less.Invoicedate, RPT_Invoice_Less.OurQuote,
RPT_Invoice_Less.SalesPerson, RPT_Customer.CustomerName,
RPT_Customer.CustomerId, RPT_Customer.ContactPerson,
RPT_Customer.BillingAddress, RPT_Customer.DeliveryAddress,
RPT_Invoice_Less.OrderNumber, RPT_Invoice_Less.ShippingBy,
RPT_Invoice_Less.ShipReferenceNo, RPT_Invoice_Less.Notes,
RPT_Invoice_Less.Price, RPT_Invoice_Less.Discount,
RPT_Invoice_Less.Shipping, RPT_Invoice_Less.Tax,
RPT_Invoice_Less.GrandTotal, RPT_Invoice_Less.Company
FROM RPT_Invoice_Less
INNER JOIN RPT_Customer
ON RPT_Invoice_Less.CustomerId = RPT_Customer.CustomerId;
This code works fine for 2 tables.
SELECT RPT_Invoice_Less.InvoiceNumber, RPT_Invoice_Less.Terms, RPT_Invoice_Less.Invoicedate, RPT_Invoice_Less.OurQuote, RPT_Invoice_Less.SalesPerson, RPT_Customer.CustomerName, RPT_Customer.CustomerId, RPT_Customer.ContactPerson, RPT_Customer.BillingAddress, RPT_Customer.DeliveryAddress, RPT_Invoice_Less.OrderNumber, RPT_Invoice_Less.ShippingBy, RPT_Invoice_Less.ShipReferenceNo, RPT_Invoice_Less.Notes, RPT_Invoice_Less.Price, RPT_Invoice_Less.Discount, RPT_Invoice_Less.Shipping, RPT_Invoice_Less.Tax, RPT_Invoice_Less.GrandTotal, RPT_OrionSystem.Company, RPT_OrionSystem.CompanyId
FROM RPT_Invoice_Less
INNER JOIN RPT_Customer
ON RPT_Invoice_Less.CustomerId = RPT_Customer.CustomerId
INNER JOIN RPT_OrionSystem
ON RPT_Invoice_Less.CompanyId = RPT_OrionSystem.CompanyId;
This code shows a syntax error.
Please help me add the 3rd (Company) table to this.
Assuming you have a CompanyID field (or something similar) in RPT_Customer or in RPT_Invoice_Less, it is just a matter of adding another INNER JOIN. Note that Access requires nested joins to be wrapped in parentheses when a query joins more than two tables:
....
FROM ((RPT_Invoice_Less
INNER JOIN RPT_Customer
ON RPT_Invoice_Less.CustomerId = RPT_Customer.CustomerId)
INNER JOIN RPT_OrionSystem
ON RPT_Invoice_Less.CompanyID = RPT_OrionSystem.CompanyID)

How do I use multiple IDs from a table with an INNER JOIN using SQL?

I have a list of SiteUsers in one table and another table has columns with different types of owners (ID) for the record. For example, the SiteUserID in the SiteUsers table will be used for the SalesRepID, the StaffingManagerID, and RecruiterID in the Fill table. Of course, the SiteUserID is different for each of the values in the Fill table.
I'd like to return the name of the SiteUser for each ID column in the Fill Table.
How do I properly construct a JOIN statement to do this?
I'm guessing this is done through INNER JOIN, but I'm not sure.
My current select statement already has an INNER JOIN as I'm pulling the name of the FillType from another table. I'm using this in an asp.net application.
I'm not sure if this is even possible. Any help is appreciated.
Since each of the IDs in the Fills table allows null, you probably want to LEFT JOIN to the SiteUsers table like so:
SELECT f.FillID, s1.SiteUserLastName 'SalesRep', s2.SiteUserLastName 'StaffingManager', s3.SiteUserLastName 'Recruiter'
FROM Fills f
LEFT JOIN SiteUsers s1 on f.SalesRepID = s1.SiteUserID
LEFT JOIN SiteUsers s2 on f.StaffingManagerID = s2.SiteUserID
LEFT JOIN SiteUsers s3 on f.RecruiterID = s3.SiteUserID
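The key idea is that the same table can be joined several times as long as each join gets its own alias. A minimal sketch of that, using Python's sqlite3 module in place of SQL Server (an assumption); the user names and Fills rows are invented, and the second fill deliberately has a NULL StaffingManagerID to show why LEFT JOIN is used.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SiteUsers (SiteUserID INTEGER PRIMARY KEY, SiteUserLastName TEXT);
CREATE TABLE Fills (
    FillID INTEGER PRIMARY KEY,
    SalesRepID INTEGER,
    StaffingManagerID INTEGER,
    RecruiterID INTEGER
);
INSERT INTO SiteUsers VALUES (1, 'Adams'), (2, 'Baker'), (3, 'Clark');
INSERT INTO Fills VALUES (10, 1, 2, 3), (11, 2, NULL, 1);
""")

# SiteUsers is joined three times; each alias resolves one ID column.
rows = conn.execute("""
SELECT f.FillID,
       s1.SiteUserLastName AS SalesRep,
       s2.SiteUserLastName AS StaffingManager,
       s3.SiteUserLastName AS Recruiter
FROM Fills f
LEFT JOIN SiteUsers s1 ON f.SalesRepID = s1.SiteUserID
LEFT JOIN SiteUsers s2 ON f.StaffingManagerID = s2.SiteUserID
LEFT JOIN SiteUsers s3 ON f.RecruiterID = s3.SiteUserID
ORDER BY f.FillID
""").fetchall()
for row in rows:
    print(row)
```

With INNER JOIN instead of LEFT JOIN, fill 11 would disappear from the result because its StaffingManagerID is NULL.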
You can always UNPIVOT the results like so:
SELECT
DISTINCT
unpvt.FillID
,unpvt.RepID
,unpvt.RepType
,s.SiteUserFirstName
,s.SiteUserLastName
FROM
(SELECT
FillID
,SalesRepID
,StaffingManagerID
,RecruiterID
FROM Fills
) f
UNPIVOT
(RepID FOR RepType IN
(SalesRepID, StaffingManagerID,RecruiterID)
) AS unpvt
JOIN SiteUsers AS s on unpvt.RepID = s.SiteUserID
You can adjust the exact output as needed, for example by substituting a different value for RepType with a CASE expression.
My question is: why the piss-poor design? Instead of having three IDs in the Fills table, you should have a junction table between SiteUsers and Fills to allow many-to-many relationships. IF it were designed with a junction table, you'd never have had to ask this question.
You will have to join the Fills table with the SiteUsers table multiple times, once for each xxxID column in Fills for which you want the SiteUser name, and combine the results using a UNION as below:
select a.SiteUserId, a.SiteUserFirstName, a.SiteUserLastName
from dbo.SiteUsers a
inner join dbo.Fills b on b.SalesRepId = a.SiteUserId
UNION
select a.SiteUserId, a.SiteUserFirstName, a.SiteUserLastName
from dbo.SiteUsers a
inner join dbo.Fills b on b.StaffingManagerId = a.SiteUserId
UNION
select a.SiteUserId, a.SiteUserFirstName, a.SiteUserLastName
from dbo.SiteUsers a
inner join dbo.Fills b on b.RecruiterId = a.SiteUserId
