Reading Excel file into database - C#

I'm reading (ASP.NET MVC, C# site) an Excel file with multiple worksheets into a SQL database. The first sheet goes into Table A, where an auto-increment column generates a unique ID (the PK).
The second worksheet goes into Table B, which has a composite key made up of Table B's own auto-increment column and the PK value from Table A.
My question is: how do I get Table A's PK into Table B while reading the Excel file?
I'm not sure if this question is better suited for database design or C#.

Depending on how you're doing your insert, you could run a query like
SELECT SCOPE_IDENTITY()
which will return the last PK inserted in the current scope.
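For example, a minimal sketch (dbo.TableA, dbo.TableB, and the column names are placeholders, not your schema; connection is an open SqlConnection from System.Data.SqlClient): append SELECT SCOPE_IDENTITY() to the Table A insert, read it back with ExecuteScalar, and reuse it as the FK when inserting the matching Table B rows.
// insert the Table A row and read back its identity value in the same batch/scope
int tableAId;
using (var insertA = new SqlCommand(
    "INSERT INTO dbo.TableA (Name) VALUES (@name); SELECT CAST(SCOPE_IDENTITY() AS int);",
    connection))
{
    insertA.Parameters.AddWithValue("@name", nameFromSheet1);       // value read from worksheet 1
    tableAId = (int)insertA.ExecuteScalar();
}

// use that id as part of Table B's composite key when inserting the rows from worksheet 2
using (var insertB = new SqlCommand(
    "INSERT INTO dbo.TableB (TableAId, SomeValue) VALUES (@tableAId, @someValue);",
    connection))
{
    insertB.Parameters.AddWithValue("@tableAId", tableAId);
    insertB.Parameters.AddWithValue("@someValue", valueFromSheet2); // value read from worksheet 2
    insertB.ExecuteNonQuery();
}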

Related

Reading Excel File with multiple worksheets into database

I'm in a pickle here. I'm reading an Excel file that has three worksheets using ExcelDataReader's AsDataSet function. After reading the file I have a DataSet that contains three tables, one for each worksheet.
Now I need to export this data row by row into two different SQL tables. These tables have auto-increment primary keys, and the PK of Table A is the FK in Table B. I'm using a stored procedure with SCOPE_IDENTITY() to achieve that.
Each column in the Excel file corresponds to a parameter of the stored procedure, so as I iterate the sheets row by row I can assign those parameters and then execute the stored procedure.
Now the question is: how do I iterate through this DataSet and assign each row[col] to the matching stored procedure parameter?
Thanks for the help.
Update - more info:
1. Sheet 1 goes to Table 1 in SQL
2. Sheet 2 goes to Table 2 in SQL
3. Sheet 3 also goes to Table 2 in SQL
4. Table 1 has a one-to-many relationship to Table 2
Here is some half-pseudocode boilerplate to start from:
var datatable = dataset.Tables[0];
using (var proc = connection.CreateCommand())
{
    // the procedure name is a placeholder - use your own
    proc.CommandType = CommandType.StoredProcedure;
    proc.CommandText = "dbo.InsertRow";
    proc.Parameters.Add("@firstcolumnname", SqlDbType.Int);
    foreach (DataRow dr in datatable.Rows)
    {
        /* check for DBNull.Value if the source column is nullable! */
        proc.Parameters["@firstcolumnname"].Value = dr["FirstColumnName"];
        proc.ExecuteNonQuery();
    }
}
But this only makes sense if you're doing some data processing inside the stored procedure. Otherwise this calls for a bulk insert (sketched below), especially if you have lots (thousands) of rows in the Excel sheets.
The one-to-many relation (foreign key) can be tricky. If you can control the order of inserts, simply start with the "one" table and then load the "many" table afterwards. Let the stored procedure match the key values if they are not part of the Excel source data.
It's a whole different story if you are in a concurrent multi-user environment and the tables are accessed and written to while you load. Then you'd have to add transactions to preserve referential integrity throughout the process.
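As a rough sketch of the bulk route (assuming the sheets already carry the key columns, and with dbo.Table1 / dbo.Table2 standing in for your real destination table names), SqlBulkCopy in FK order could look like this:
// load the "one" table first, then the "many" table, so the FK values already exist
// (add ColumnMappings if the sheet column order differs from the table layout)
using (var bulk = new SqlBulkCopy(connection))
{
    bulk.DestinationTableName = "dbo.Table1";
    bulk.WriteToServer(dataset.Tables[0]);    // sheet 1 -> Table 1
}
using (var bulk = new SqlBulkCopy(connection))
{
    bulk.DestinationTableName = "dbo.Table2";
    bulk.WriteToServer(dataset.Tables[1]);    // sheet 2 -> Table 2
    bulk.WriteToServer(dataset.Tables[2]);    // sheet 3 -> Table 2 as well
}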

How to remove duplicates while exporting from different excel files to database

I want to export data from different Excel files to a database.
While exporting, if the database table already contains the same row of data that is present in the Excel file, then that row should not be loaded into the database.
Can anybody provide me with the code?
I already know how to export from Excel to the database; with that approach, if I export the same data twice I end up seeing two rows with the same data.
Thanks in advance
A simple solution would be to import the data into a staging table and add only the non-duplicates to the main table using
insert into target_table (col_list)
select t1.col_list from staging as t1 where not exists
    (select * from target_table as t2 where t1.keycol = t2.keycol)
It's better to export the Excel data to a temp table and then, using DISTINCT or a similar filter, select only the distinct rows and insert them into your own tables.
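A hedged C# sketch of that staging pattern (staging, target_table, keycol, and col_list are placeholders carried over from the query above; needs System.Data.SqlClient):
using (var connection = new SqlConnection(connectionString))
{
    connection.Open();

    // 1) bulk copy the DataTable read from the Excel file into the staging table
    using (var bulk = new SqlBulkCopy(connection))
    {
        bulk.DestinationTableName = "dbo.staging";
        bulk.WriteToServer(excelDataTable);
    }

    // 2) copy only the rows whose key is not already in the target table
    const string dedupInsert = @"
        insert into target_table (col_list)
        select t1.col_list from staging as t1
        where not exists (select * from target_table as t2 where t1.keycol = t2.keycol);";
    using (var insert = new SqlCommand(dedupInsert, connection))
        insert.ExecuteNonQuery();
}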

Import modified data in Excel to Database Table

I'm working on an import function. I have an Excel file which contains some data that will later be edited by the user. I managed to import the Excel file with SmartXLS in C# and update all the data in the SQL Server database. However, what I did was fetch all the data in the Excel file and update every row in the SQL table, which hurts performance, and I also updated rows that were never edited.
I would like to ask: is there any way I can get only the modified cells/rows in Excel and update the corresponding data in the SQL table?
var workbook = new WorkBook();
workbook.read(filePath);                    // SmartXLS: load the uploaded Excel file
var dataTable = workbook.ExportDataTable(); // export the sheet contents as a DataTable
Just a scenario; maybe it helps you understand what gordatron and I were talking about.
The situation: there is a table "Products" which is the central storage place for product information, and a table "UpdatedProducts" whose structure looks exactly like "Products" but whose data may differ. Think of the following scenario: you export the product table to Excel in the morning; the whole day you delete, add, and update products in your Excel table; at the end of the day you want to re-import your Excel data into the "Products" table. What you need:
1. delete all records from "UpdatedProducts"
2. insert the data from Excel into "UpdatedProducts" (bulk insert if possible)
3. update the "Products" table
Then a MERGE statement could look like this:
MERGE Products AS TARGET
USING UpdatedProducts AS SOURCE
ON TARGET.ProductID = SOURCE.ProductID
WHEN MATCHED AND (TARGET.ProductName <> SOURCE.ProductName OR TARGET.Rate <> SOURCE.Rate)
    THEN UPDATE SET TARGET.ProductName = SOURCE.ProductName,
                    TARGET.Rate = SOURCE.Rate
WHEN NOT MATCHED BY TARGET
    THEN INSERT (ProductID, ProductName, Rate)
         VALUES (SOURCE.ProductID, SOURCE.ProductName, SOURCE.Rate)
WHEN NOT MATCHED BY SOURCE
    THEN DELETE;
What this statement does:
WHEN MATCHED:
the data exists in both tables; we update it in "Products" if ProductName or Rate differs.
WHEN NOT MATCHED BY TARGET:
the data exists in the staging table but not in your original table; we add it to "Products".
WHEN NOT MATCHED BY SOURCE:
the data exists in your original table but not in the staging table; it will be deleted from "Products".
Thanks a lot to http://www.mssqltips.com/sqlservertip/1704/using-merge-in-sql-server-to-insert-update-and-delete-at-the-same-time/ for this perfect example!
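If it helps, here is a hedged C# outline of the surrounding workflow; the staging table name comes from the scenario above, while wrapping the MERGE in a stored procedure called dbo.MergeUpdatedProducts is only an assumption:
using (var connection = new SqlConnection(connectionString))
{
    connection.Open();

    // 1) delete all records from the staging table
    using (var clear = new SqlCommand("DELETE FROM dbo.UpdatedProducts", connection))
        clear.ExecuteNonQuery();

    // 2) bulk insert the Excel data (the DataTable from workbook.ExportDataTable())
    using (var bulk = new SqlBulkCopy(connection))
    {
        bulk.DestinationTableName = "dbo.UpdatedProducts";
        bulk.WriteToServer(dataTable);
    }

    // 3) run the MERGE shown above, here assumed to be wrapped in a stored procedure
    using (var merge = new SqlCommand("dbo.MergeUpdatedProducts", connection))
    {
        merge.CommandType = CommandType.StoredProcedure;
        merge.ExecuteNonQuery();
    }
}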

Upload an excel file and add/update record in Database

I am working on functionality where I upload an Excel file and add/update those records (sheet 1) in SQL Server. I was able to add the data to SQL Server with this link.
But what it does is truncate the table and add the values again. I don't want to do that, because 30% of the data is generic and cannot be deleted. There is a field called OSID in the Excel sheet and the same in the database; that is the unique key in my table. What I want to do is update only those values in the database whose key matches the key from the Excel sheet.
I would suggest using the code from that link to import the excel data to a separate staging table and update your main table with a join to your staging table.
From that link, the table name they used was tdatamigrationtable. Your update query would look something like
update m set m.col1=s.col1, m.col2=s.col2, m.col3=s.col3
from dbo.mytable m
inner join dbo.tdatamigrationtable s on m.osid = s.osid;
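Assuming the staging import from that link is already in place, running the update from C# is then a single parameter-free command (col1..col3 and dbo.mytable are placeholders):
const string updateSql = @"
    update m set m.col1 = s.col1, m.col2 = s.col2, m.col3 = s.col3
    from dbo.mytable m
    inner join dbo.tdatamigrationtable s on m.osid = s.osid;";

using (var connection = new SqlConnection(connectionString))
using (var command = new SqlCommand(updateSql, connection))
{
    connection.Open();
    command.ExecuteNonQuery();
}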

TSQL Large Insert of Relational Data, W/ Foreign Key Upsert

Relatively simple problem.
Table A has ID int PK, unique Name varchar(500), and cola, colb, etc
Table B has a foreign key to Table A.
So, in the application, we are generating records for both table A and table B into DataTables in memory.
We would be generating thousands of these records on a very large number of "clients".
Eventually we make the call to store these records. However, records from table A may already exist in the database, so we need to get the primary keys for the records that already exist, and insert the missing ones. Then insert all records for table B with the correct foreign key.
Proposed solution:
I was considering sending an XML document to SQL Server to open as a rowset into TableVarA, updating TableVarA with the primary keys for the records that already exist, then inserting the missing records and outputting them to TableVarNew. I would then select the Name and primary key from TableVarA union all TableVarNew.
Then, in code, populate the correct FKs into Table B in memory, and insert all of those records using SqlBulkCopy.
Does this sound like a good solution? And if so, what is the best way to populate the FKs in memory for Table B to match the primary keys from the returned DataSet?
Sounds like a plan - but I think the handling of Table A can be simpler (a single in-memory table/table variable should be sufficient):
- have a TableVarA that contains all rows for Table A
- update the ID for all existing rows with their ID (should be doable in a single SQL statement)
- insert all non-existing rows (those that still have an empty ID) into Table A and make a note of their ID
This could all happen in a single table variable - I don't see why you need to copy stuff around.
Once you've handled Table A, as you say, update Table B's foreign keys (see the sketch below) and bulk insert those rows in one go.
What I'm not quite clear on is how Table B references Table A - you just said it has an FK, but you didn't specify which column it is on (assuming ID). And how do your rows from Table B reference Table A for new rows that aren't inserted yet and thus don't have an ID in Table A yet?
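For the in-memory FK matching, one possible sketch (the column names Name, ID, and TableAId are assumptions) is to read the returned Name/ID rowset into a dictionary and stamp the IDs onto the Table B rows before the bulk copy:
// resultTable: the Name/ID rowset returned from the server (TableVarA union all TableVarNew)
// tableB: the in-memory DataTable for Table B, holding the referenced Name plus a TableAId FK column
var idByName = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
foreach (DataRow row in resultTable.Rows)
    idByName[(string)row["Name"]] = (int)row["ID"];

foreach (DataRow row in tableB.Rows)
    row["TableAId"] = idByName[(string)row["Name"]];

using (var bulk = new SqlBulkCopy(connectionString))
{
    bulk.DestinationTableName = "dbo.TableB";
    bulk.WriteToServer(tableB);
}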
This is more of a comment than a complete answer, but I was running out of room, so please don't vote it down for not being up to answer criteria.
My concern would be that by evaluating a set for missing keys and then inserting in bulk, you take the risk that a key got added elsewhere in the meantime. You stated this could come from a large number of clients, so it is going to happen. Yes, you could wrap it in a big transaction, but big transactions are hogs and would lock out other clients.
My thought is to deal with the rows that already have keys in bulk, separately, assuming there is no risk the PK would be deleted. A TVP is efficient, but you need explicit knowledge of which rows got processed. I think you need to first search on Name to get the list of PKs that already exist, then process those via a TVP.
For data integrity, process the rest one at a time via a stored procedure that creates the PK as necessary.
Thousands of records is not scary (millions is). A large number of "clients" - that is the scary part.
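As a sketch only of that TVP lookup (the table type dbo.NameList and the table/column names are assumptions), you could send the in-memory names as a TVP and read back the PKs that already exist:
// assumes a user-defined table type exists on the server:
//   CREATE TYPE dbo.NameList AS TABLE (Name varchar(500) NOT NULL);
var names = new DataTable();
names.Columns.Add("Name", typeof(string));
foreach (DataRow row in tableA.Rows)            // tableA: the in-memory DataTable for Table A
    names.Rows.Add(row["Name"]);

var existingIds = new Dictionary<string, int>();
using (var connection = new SqlConnection(connectionString))
using (var command = new SqlCommand(
    "SELECT a.Name, a.ID FROM dbo.TableA a INNER JOIN @names n ON n.Name = a.Name", connection))
{
    var p = command.Parameters.AddWithValue("@names", names);
    p.SqlDbType = SqlDbType.Structured;
    p.TypeName = "dbo.NameList";

    connection.Open();
    using (var reader = command.ExecuteReader())
        while (reader.Read())
            existingIds[reader.GetString(0)] = reader.GetInt32(1);
}
// rows whose Name is not in existingIds still need a PK and go through the
// one-at-a-time stored procedure path described above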
