Inserting a huge amount of rows into MySQL using C# from a CSV file - c#

I need to insert data from CSV files (dates, strings, ints). Each file has about 28 columns and 6000 rows, and I need to insert the data from multiple files when clicking a button. I tried inserting them row by row using the SQL INSERT statement, but it takes around 2 minutes to finish and sometimes crashes. I need some help to make the process faster and smoother. I usually check whether a row has already been inserted, so no rows with the same data end up in the database. Any help would be appreciated.

If there are a lot of duplicates in the files, you could load all rows into a list and use the Distinct method in System.Linq before inserting into the database. The type contained in the list would need value-based equality, e.g. by implementing IEquatable<T> (overriding Equals and GetHashCode), since that is what Distinct uses for comparison.
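A rough sketch of that approach, combined with a bulk load instead of row-by-row INSERTs. It assumes the MySql.Data connector (MySqlBulkLoader), a C# 9 record for the row type, and made-up file/table/column layouts, so treat it as a starting point rather than a drop-in solution:

using System;
using System.IO;
using System.Linq;
using MySql.Data.MySqlClient;

// A record gives value-based equality for free, so Distinct() drops duplicate rows.
public record CsvRow(DateTime Date, string Name, int Amount);

public static class CsvImporter
{
    public static void Import(string csvPath, string connectionString)
    {
        // Parse and de-duplicate in memory (6000 rows per file is small).
        var rows = File.ReadLines(csvPath)
            .Skip(1)                                              // skip header line
            .Select(line => line.Split(','))
            .Select(f => new CsvRow(DateTime.Parse(f[0]), f[1], int.Parse(f[2])))
            .Distinct()
            .ToList();

        // Write the cleaned rows to a temp file and let MySQL bulk-load it,
        // which is far faster than one INSERT per row.
        var tempFile = Path.GetTempFileName();
        File.WriteAllLines(tempFile, rows.Select(r =>
            $"{r.Date:yyyy-MM-dd HH:mm:ss},{r.Name},{r.Amount}"));

        using var conn = new MySqlConnection(connectionString);
        conn.Open();
        var loader = new MySqlBulkLoader(conn)
        {
            TableName = "my_table",        // hypothetical table name
            FileName = tempFile,
            FieldTerminator = ",",
            LineTerminator = "\n",
            Local = true                   // server/connection must allow LOAD DATA LOCAL INFILE
        };
        loader.Load();
        File.Delete(tempFile);
    }
}

To avoid re-inserting rows that are already in the database, a UNIQUE index over the relevant columns on the MySQL side is usually cheaper than checking each row from C#.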

Related

getting "DateTime" of the last change in rows in a table in SQL Server?

Is there any way to find "DateTime" of the last change in rows in a table in SQL Server?
The changes (insert / update) are submitted by another Windows application.
All I have in this table is insert_Date; there is no update_Date (and I can't add any columns or use triggers).
I've tried some queries, but all I got was the number of "user updates" on the table, not the IDs of the modified rows!
I want to get the rows which were modified or inserted after a specific DateTime.
If the information isn't stored in the table (or in another one, by using a trigger for example), then it's impossible to track which rows were inserted after a given datetime. You might find the time the last operation was executed at the table/index level (by querying sys.dm_db_index_usage_stats), but not at the record level.
You can't find data that doesn't exist!
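For completeness, a minimal sketch of the table-level check mentioned above, querying sys.dm_db_index_usage_stats from C#. The database, table, and connection string are placeholders, and the DMV is reset when SQL Server restarts; it only tells you when the table was last written to, not which rows changed:

using System;
using System.Data.SqlClient;

class LastWriteCheck
{
    static void Main()
    {
        const string connectionString = "Server=.;Database=MyDb;Integrated Security=true";
        const string sql = @"
            SELECT MAX(last_user_update) AS LastUpdate
            FROM sys.dm_db_index_usage_stats
            WHERE database_id = DB_ID('MyDb')
              AND object_id = OBJECT_ID('dbo.MyTable');";

        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(sql, conn))
        {
            conn.Open();
            object result = cmd.ExecuteScalar();
            // DBNull means no writes have been recorded since the last restart.
            Console.WriteLine(result == DBNull.Value ? "no update recorded" : result.ToString());
        }
    }
}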

Importing a large text file into a SQL database

I read txt files and save the rows from these files to a local database. The problem is that the program reads 700 000 rows and it takes a long time to read the whole file. I use LINQ to SQL: first I read the row, then I split it into the Table object, and then I submit it to the DB.
For example, a row has the format
2014-03-01 00:08:02.380 00000000000001100111
This row is split into a DateTime and 20 columns (each column represents one channel, CH1 - CH20).
Is there a better (faster) way?
You can use FileHelpers (http://filehelpers.sourceforge.net/) to feed directly into SqlBulkCopy (http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlbulkcopy.aspx).
That is by far the easiest and fastest approach.
You can still use LINQ to SQL for reads and non-batch writes, but for bulk inserts it is simply too slow.
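A minimal sketch of the SqlBulkCopy half for the row format shown in the question; the destination table name and column layout are assumptions, and FileHelpers could replace the hand-written parsing with an attributed record class:

using System;
using System.Data;
using System.Data.SqlClient;
using System.IO;

class ChannelImporter
{
    static void Import(string path, string connectionString)
    {
        // In-memory table matching the (assumed) destination schema:
        // a timestamp plus 20 channel columns (CH1..CH20).
        var table = new DataTable();
        table.Columns.Add("Timestamp", typeof(DateTime));
        for (int i = 1; i <= 20; i++)
            table.Columns.Add("CH" + i, typeof(int));

        foreach (var line in File.ReadLines(path))
        {
            // "2014-03-01 00:08:02.380 00000000000001100111"
            //  -> 23-character timestamp, a space, then 20 channel digits.
            var row = table.NewRow();
            row[0] = DateTime.Parse(line.Substring(0, 23));
            var channels = line.Substring(24);
            for (int i = 0; i < 20; i++)
                row[i + 1] = channels[i] - '0';
            table.Rows.Add(row);
        }

        // One bulk insert instead of 700 000 individual submits.
        using (var bulk = new SqlBulkCopy(connectionString))
        {
            bulk.DestinationTableName = "dbo.ChannelReadings"; // assumed table name
            bulk.BatchSize = 10000;
            bulk.WriteToServer(table);
        }
    }
}

For a 700 000-row file it may be worth flushing the DataTable to SqlBulkCopy in chunks (say every 50 000 rows) to keep memory use bounded.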
That would be slow with LINQ to SQL when submitting that many items.
A bulk insert or bulk update would be preferable for this task, which you can't do with LINQ to SQL. See also this post: bulk insert with linq-to-sql
I suggest using something other than LINQ to SQL for this task.

Upload to DB and then validate, or validate each line and then upload to DB

I have a requirement where I have to upload a file to the DB.
The file will have approx 100K records daily, and once per month 8 to 10 million records.
There are also some field-level validations to be performed.
The validations are: are all fields present, do number fields contain valid numbers, do date fields contain valid dates, is a number in the specified range, does the string format match, etc.
There are 3 ways.
1: Upload to temp and then validate
- Create a temp table (all string columns) with an extra error column
- upload all entries to temp table
- run validation, populate error column if needed
- move valid entries to correct table
Cons: entries have to be written twice to the DB, even the correct ones.
2: Upload to db directly
- upload all entries directly to table
- check which entries are not uploaded
Cons: each line would need to be read again after the upload, so effectively a double read
3: Validate and then Upload
- read each line, run all validations on all columns
- if valid then write to db
Cons: reading the file line by line will be slower than a bulk upload to the DB.
I am writing the app in C# & ASP.NET; the DB is Oracle.
Which one of the 3 ways is best?
I'd go with option 2.
100k rows are peanuts for a bulk load plus query-based validation.
As @aF says, option 2, with the following addition:
Add a table that you can dump 'invalid' rows into. Then, run a statement like this:
INSERT INTO InvalidData
SELECT *
FROM InputData
WHERE restrictedColumn NOT IN ('A', 'B')
   OR ISNUMERIC(numberColumn) = 0 -- I'm assuming some version of SQL Server...
then dump 'validated' rows into your actual table, excluding 'invalid' rows:
INSERT INTO Destination
SELECT a.*
FROM InputData AS a
LEFT JOIN InvalidData AS b
       ON b.id = a.id
WHERE b.id IS NULL -- anti-join: keep only rows with no match in InvalidData
The INSERT will fail if any (other) 'invalid' data is encountered, but it should be discoverable. The 'invalid' table can then be worked on to be cleaned up and re-inserted.
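Since the question mentions C# and Oracle, here is one possible sketch of the initial load into the input/staging table from the application, using ODP.NET array binding (a technique not mentioned above) so that all rows go to the server in a single round trip per batch; the table and column names are made up:

using System;
using Oracle.ManagedDataAccess.Client; // or Oracle.DataAccess.Client (unmanaged ODP.NET)

class StagingLoader
{
    // Loads parallel arrays of raw (string) values into an all-VARCHAR2 staging table,
    // instead of issuing one INSERT per row.
    static void Load(string connectionString, string[] ids, string[] numbers, string[] dates)
    {
        using (var conn = new OracleConnection(connectionString))
        using (var cmd = conn.CreateCommand())
        {
            conn.Open();
            cmd.CommandText =
                "INSERT INTO input_data (id, number_col, date_col) VALUES (:id, :num, :dt)";
            cmd.ArrayBindCount = ids.Length; // bind every row of the batch at once

            cmd.Parameters.Add(new OracleParameter("id",  OracleDbType.Varchar2) { Value = ids });
            cmd.Parameters.Add(new OracleParameter("num", OracleDbType.Varchar2) { Value = numbers });
            cmd.Parameters.Add(new OracleParameter("dt",  OracleDbType.Varchar2) { Value = dates });

            cmd.ExecuteNonQuery(); // inserts ids.Length rows
        }
    }
}

The validation and the move into the destination table then stay as the set-based statements shown above.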

C# SQLCe Inserting new rows with DataSet without loading the whole table?

I have a SQL CE database with a table. In the application there is a method which should only insert new rows into this table.
I'm wondering what the best practice is for doing so. When working with a DataSet, one has to load the whole table into it.
To me this seems like a big overkill, since I only want to insert new rows and therefore there is no need to fill the DataSet with the entire table.
On the other hand, it also seems very inefficient to manually insert every single row with an explicit INSERT statement.
So in order to do a "batch"-INSERT one would go with a DataSet. Is there a possibility to work with a DataSet without filling the entire table, e.g. get only the schema of the table and then insert the rows to the DataSet?
Many thanks,
Juergen
Working with a DataSet against a SQL Server Compact database is very inefficient. You should use SqlCeResultSet, or my wrapper library, which allows you to do batch INSERTs very fast based on a DataTable, a DataReader, or even a List of objects: http://sqlcebulkcopy.codeplex.com
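A small sketch of the SqlCeResultSet approach mentioned above; it appends rows without pulling the existing table into a DataSet. The table name, column ordinals, and types are assumptions:

using System;
using System.Data;
using System.Data.SqlServerCe;

class SqlCeInserter
{
    static void InsertRow(string connectionString, int id, string name)
    {
        using (var conn = new SqlCeConnection(connectionString))
        {
            conn.Open();

            // TableDirect + Updatable opens the table for appending
            // without reading its existing rows.
            using (var cmd = new SqlCeCommand("MyTable", conn) { CommandType = CommandType.TableDirect })
            using (var rs = cmd.ExecuteResultSet(ResultSetOptions.Updatable))
            {
                var record = rs.CreateRecord();
                record.SetInt32(0, id);        // column 0: assumed int column
                record.SetString(1, name);     // column 1: assumed nvarchar column
                rs.Insert(record);
            }
        }
    }
}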

Insert a large amount of data from a txt/csv file into a SQL Server table with a few invalid rows

I need to insert data from a data file into a SQL Server table. The data file may contain thousands of rows, but there is a possibility that some lines in the file are wrong (e.g. the BIRTH_DATE column value is in the wrong format, or a string cannot be converted to an int). I could use the bulk insert feature, but since some rows in the file are not valid, no rows will be inserted because of the few invalid ones. I would like to ignore the wrong lines and ideally get the line number of each invalid row. I would like to do this with the highest performance. Any ideas?
Thanks in advance
-Petro
I usually handle this kind of situation in one of two ways:
Bulk copy the data into a staging table. The staging table might be all VARCHAR columns, or it might be one long VARCHAR column per row, depending on the format and quality of the data. Once it's in the staging table I can do checks and extract any rows with issues, to be handled later with human intervention.
SSIS includes nice facilities to redirect rows with errors. Check the appropriate documentation on MSDN, and if you have any specific issues you can create a new question here; I'm sure people can help you through it.
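A sketch of the staging-table check from the first option above. It assumes an all-VARCHAR staging table that was bulk-copied in along with a line-number column, and SQL Server 2012+ for TRY_CONVERT; all names are placeholders:

using System;
using System.Data.SqlClient;

class StagingValidator
{
    // Reports the original line numbers of staging rows whose raw text
    // will not convert cleanly to the target types.
    static void ReportInvalidRows(string connectionString)
    {
        const string sql = @"
            SELECT line_number
            FROM   staging_people
            WHERE  TRY_CONVERT(date, birth_date_raw) IS NULL
               OR  TRY_CONVERT(int,  age_raw)        IS NULL;";

        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(sql, conn))
        {
            conn.Open();
            using (var reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                    Console.WriteLine("Invalid line: " + reader.GetInt32(0));
            }
        }
    }
}

Rows that pass the check can then be moved into the real table with a single INSERT ... SELECT that performs the conversions.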
