Migrating Data to SQL Server 2008 - C#

I am trying to migrate data from an Informix database to SQL Server 2008, and I have quite a lot of data to move. I've been trying multiple methods to get the data over, and so far SqlBulkCopy in multiple chunks seems to be the fastest I can find. Does anyone know of a faster means of getting the data over? I'm trying to cut down on the transfer time so that on my cut-over date I don't run out of time to do the full cut-over. Thanks.
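For context, a minimal sketch of the chunked SqlBulkCopy approach the question describes, pulling from Informix over ODBC. The DSN, query, connection strings, and table names are placeholders, and TableLock plus a larger BatchSize are the usual knobs to experiment with, not a guaranteed speedup:

```csharp
// Streams rows from Informix (via ODBC) into SQL Server with SqlBulkCopy.
// All names and connection strings here are hypothetical placeholders.
using System.Data.Odbc;
using System.Data.SqlClient;

class InformixToSqlServer
{
    static void Main()
    {
        const string informixCs = "DSN=MyInformixDsn;UID=user;PWD=pass;";           // hypothetical DSN
        const string sqlServerCs = "Data Source=TargetServer;Initial Catalog=TargetDb;Integrated Security=SSPI;";

        using (var source = new OdbcConnection(informixCs))
        using (var cmd = new OdbcCommand("SELECT * FROM source_table", source))
        {
            source.Open();
            using (var reader = cmd.ExecuteReader())
            using (var bulk = new SqlBulkCopy(sqlServerCs, SqlBulkCopyOptions.TableLock))
            {
                bulk.DestinationTableName = "dbo.TargetTable";
                bulk.BatchSize = 50000;     // rows per committed batch; tune for your log/IO
                bulk.BulkCopyTimeout = 0;   // disable the timeout for long-running loads
                bulk.WriteToServer(reader); // streams rows; the full result set is never buffered
            }
        }
    }
}
```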

As you mentioned, I think that the bcp command is the fastest solution.
You can export your data to CSV files and then import them into your DB with the bcp command.

There isn't much more you can do to get this work completed faster. One thing you might want to look at, though, is the recovery model for the SQL Server database. If it's currently set to Full, you're going to end up slowing down quite a bit as the transaction log fills up.
http://msdn.microsoft.com/en-us/library/ms189275.aspx
Hope that helps.

If you can use an OLE DB or ODBC connection to your Informix database, then SSIS may be the best option.

How to add columns to a datareader

I have almost the exact same issue as the scenario (linked) below, but unfortunately I'm unable to recreate the solutions successfully.
I have a C# application using SQL bulk import with a data reader and WriteToServer, where the reader is a SqlDataReader or an OracleDataReader, and I need to add columns to the result set.
I cannot do it in the source SQL statement.
I cannot load a DataTable first and modify it (it's hundreds of GBs of data, almost a terabyte).
How to add columns to DataReader
Can anyone provide a working example and help "push" me over this problem?
I temporarily found a solution using SQL Server Integration Services (SSIS), but what I found while watching it run is that it downloads all the data to a DTS buffer, then does the column modifications, and then pumps the data into SQL Server. Try doing that with a couple hundred GB of data and it does not perform well, even if you can get your infrastructure to build you a 24-core VM with 128 GB of memory.
I finally have a small working example; the CodeProject article (jdweng) was helpful.
I will post a follow-up. I've tested with SQL Server (SqlDataReader); I still need to do a test with the Oracle data reader.
One of the cases I was trying was converting an Oracle unique id (stored as a string) to SQL Server as a uniqueidentifier. I want to convert it on the fly; there is no way to adjust the source Oracle statement (ADyson) to return a datatype compatible with SQL Server. Altering a 1 TB table afterwards from varchar(40) to uniqueidentifier is painful, but if I could just convert it as part of the bulk insert, it would be quick.
And I think now I will be able to.
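For anyone landing here, a minimal sketch of the wrapper-reader idea (not the poster's actual code): an IDataReader that delegates to the source reader and converts one string column to a Guid on the fly, which SqlBulkCopy can then stream into a uniqueidentifier column. The column name passed to the constructor is whatever your source query returns.

```csharp
// Wraps any IDataReader (SqlDataReader, OracleDataReader, ...) and converts the
// values of one string column to Guid as rows are read. Everything not involved
// in the conversion simply delegates to the wrapped reader.
using System;
using System.Data;

public sealed class GuidConvertingReader : IDataReader
{
    private readonly IDataReader _inner;
    private readonly int _guidOrdinal;

    public GuidConvertingReader(IDataReader inner, string guidColumnName)
    {
        _inner = inner;
        _guidOrdinal = inner.GetOrdinal(guidColumnName);
    }

    // The converted column: report it as a Guid and parse the string when asked.
    public Type GetFieldType(int i) => i == _guidOrdinal ? typeof(Guid) : _inner.GetFieldType(i);
    public object GetValue(int i) =>
        i == _guidOrdinal && !_inner.IsDBNull(i)
            ? (object)Guid.Parse(_inner.GetString(i).Trim())
            : _inner.GetValue(i);
    public Guid GetGuid(int i) =>
        i == _guidOrdinal ? Guid.Parse(_inner.GetString(i).Trim()) : _inner.GetGuid(i);

    // Everything else delegates to the wrapped reader.
    public int FieldCount => _inner.FieldCount;
    public bool Read() => _inner.Read();
    public bool IsDBNull(int i) => _inner.IsDBNull(i);
    public string GetName(int i) => _inner.GetName(i);
    public int GetOrdinal(string name) => _inner.GetOrdinal(name);
    public string GetDataTypeName(int i) => _inner.GetDataTypeName(i);
    public DataTable GetSchemaTable() => _inner.GetSchemaTable();
    public int Depth => _inner.Depth;
    public bool IsClosed => _inner.IsClosed;
    public int RecordsAffected => _inner.RecordsAffected;
    public bool NextResult() => _inner.NextResult();
    public void Close() => _inner.Close();
    public void Dispose() => _inner.Dispose();
    public bool GetBoolean(int i) => _inner.GetBoolean(i);
    public byte GetByte(int i) => _inner.GetByte(i);
    public long GetBytes(int i, long fieldOffset, byte[] buffer, int bufferOffset, int length)
        => _inner.GetBytes(i, fieldOffset, buffer, bufferOffset, length);
    public char GetChar(int i) => _inner.GetChar(i);
    public long GetChars(int i, long fieldOffset, char[] buffer, int bufferOffset, int length)
        => _inner.GetChars(i, fieldOffset, buffer, bufferOffset, length);
    public IDataReader GetData(int i) => _inner.GetData(i);
    public DateTime GetDateTime(int i) => _inner.GetDateTime(i);
    public decimal GetDecimal(int i) => _inner.GetDecimal(i);
    public double GetDouble(int i) => _inner.GetDouble(i);
    public float GetFloat(int i) => _inner.GetFloat(i);
    public short GetInt16(int i) => _inner.GetInt16(i);
    public int GetInt32(int i) => _inner.GetInt32(i);
    public long GetInt64(int i) => _inner.GetInt64(i);
    public string GetString(int i) => _inner.GetString(i);
    public int GetValues(object[] values)
    {
        // Route through GetValue so the converted column comes back as a Guid here too.
        int n = Math.Min(values.Length, FieldCount);
        for (int i = 0; i < n; i++) values[i] = GetValue(i);
        return n;
    }
    public object this[int i] => GetValue(i);
    public object this[string name] => GetValue(GetOrdinal(name));
}
```

Usage would look roughly like bulk.WriteToServer(new GuidConvertingReader(sourceReader, "LEGACY_ID")), where "LEGACY_ID" stands in for whatever the source column is actually called.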

Optimized process of moving records in the range of 1 million to 10 million

What would you do if you had to massage data and move it from a database on one server to a database on another server?
The massaging was limited to using CONVERT or CAST. The process was invoked by a data loader written in C#.NET, and the SQL scripts were executed on SQL Server 2008.
Would you suggest this process be done using SqlBulkCopy or LINQ to SQL, or should it only be done using an INSERT ... SELECT in T-SQL?
The data could be in the range of 1 million to 10 million rows.
I would appreciate your views on how best to optimize this operation.
LINQ-to-SQL should be avoided here; it isn't optimised for this (it is aimed at individual objects/records, not bulk operations). A cross-db (and possibly linked-server) INSERT/SELECT is possible, but I would be looking at bulk options. I suspect SSIS (ex DTS) might be of use here - it is pretty much designed for this. If you need a managed option, a data reader from the source (ExecuteReader()) connected to SqlBulkCopy on the target will perform the same function as SSIS (using the same bulk protocol).
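A rough sketch of that managed option, with placeholder server, database, and table names: the CAST/CONVERT massaging lives in the source query, and SqlBulkCopy streams the reader into the target server.

```csharp
// Cross-server load: massage data in the source SELECT, stream it through a
// data reader, and let SqlBulkCopy push it to the target. Names are placeholders.
using System.Data.SqlClient;

class CrossServerLoad
{
    static void Main()
    {
        const string sourceCs = "Data Source=SourceServer;Initial Catalog=SourceDb;Integrated Security=SSPI;";
        const string targetCs = "Data Source=TargetServer;Initial Catalog=TargetDb;Integrated Security=SSPI;";

        // The "massaging" happens here, in the source query itself.
        const string query =
            "SELECT Id, CAST(Amount AS decimal(18,2)) AS Amount, CONVERT(date, CreatedOn) AS CreatedOn " +
            "FROM dbo.SourceTable";

        using (var source = new SqlConnection(sourceCs))
        using (var cmd = new SqlCommand(query, source))
        {
            source.Open();
            using (var reader = cmd.ExecuteReader())
            using (var bulk = new SqlBulkCopy(targetCs, SqlBulkCopyOptions.TableLock))
            {
                bulk.DestinationTableName = "dbo.TargetTable";
                bulk.BatchSize = 10000;     // commit in chunks of 10k rows
                bulk.BulkCopyTimeout = 0;   // no command timeout for a long load
                bulk.WriteToServer(reader); // uses the same bulk protocol as SSIS/bcp
            }
        }
    }
}
```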
I have had issues like this before; I solved them with SqlBulkCopy. bcp is great in terms of performance.
Looking at the amount of data you're going to operate on, I would personally choose Replication Services http://msdn.microsoft.com/en-us/library/ms151198.aspx and avoid a programming solution, if possible.

Programmatically saving a SQL Server database to xml files and restoring it again

I want to save a whole MS SQL 2008 database into XML files... using ASP.NET.
Now I am a bit lost here... what would be the best method to achieve this? DataSets?
And I need to restore the database again later... using these XML files. I am thinking about using DataSets for reading the tables and writing them to XML, and using the SqlBulkCopy class to restore the database again. But I am not sure whether this would be the right approach.
Any clues and tips for me?
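For orientation, a minimal sketch of the DataSet round trip the question describes. Table names and connection strings are placeholders, and note that each table is held entirely in memory here, which is part of why the answers below push back on this approach; there is also no handling of views, procedures, constraints, or identity columns.

```csharp
// Dump one table to XML (with inline schema), and later bulk-load it back.
using System.Data;
using System.Data.SqlClient;

class XmlRoundTrip
{
    static void ExportTable(string connectionString, string table, string xmlPath)
    {
        using (var conn = new SqlConnection(connectionString))
        using (var adapter = new SqlDataAdapter("SELECT * FROM [" + table + "]", conn))
        {
            var ds = new DataSet();
            adapter.Fill(ds, table);
            // WriteSchema keeps the column types so ReadXml can rebuild the DataTable later.
            ds.WriteXml(xmlPath, XmlWriteMode.WriteSchema);
        }
    }

    static void ImportTable(string connectionString, string table, string xmlPath)
    {
        var ds = new DataSet();
        ds.ReadXml(xmlPath, XmlReadMode.ReadSchema);

        using (var bulk = new SqlBulkCopy(connectionString))
        {
            bulk.DestinationTableName = table;
            bulk.WriteToServer(ds.Tables[table]);
        }
    }
}
```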
If you need to restore it on the same server type (I mean SQL Server 2008 or higher) and don't care about being able to see the actual data inside the XML, do the following:
Programmatically back up the DB using the "BACKUP DATABASE" T-SQL command
Compress the backup
Convert the backup to Base64
Place the backup as the content of the XML file (like: <database name="..." compressionmethod="..." compressionlevel="...">the Base64 content here</database>)
On the server where you need to restore it, download the XML, extract the Base64 content, and use the attributes to know which compression was used. Decompress and restore using the T-SQL "RESTORE" command.
Would that approach work?
For sure, if you need to see the content of the database, you would need to develop the XML schema, go through each table, etc. But that way you won't have SPs/views and other items backed up.
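For what it's worth, a rough sketch of the backup-inside-XML idea. The BACKUP statement is standard T-SQL; the compression choice, attribute names, and file paths are just illustrative, not a fixed format.

```csharp
// Back up a database, gzip the .bak file, and embed it as Base64 inside an XML element.
// Database name, paths, and the XML layout are hypothetical placeholders.
using System;
using System.Data.SqlClient;
using System.IO;
using System.IO.Compression;
using System.Xml.Linq;

class BackupToXml
{
    static void Export(string connectionString, string dbName, string bakPath, string xmlPath)
    {
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(
            "BACKUP DATABASE [" + dbName + "] TO DISK = @path WITH INIT", conn))
        {
            cmd.Parameters.AddWithValue("@path", bakPath);
            cmd.CommandTimeout = 0;     // backups can run long
            conn.Open();
            cmd.ExecuteNonQuery();
        }

        byte[] raw = File.ReadAllBytes(bakPath);
        byte[] compressed;
        using (var ms = new MemoryStream())
        {
            using (var gz = new GZipStream(ms, CompressionMode.Compress))
                gz.Write(raw, 0, raw.Length);
            compressed = ms.ToArray();
        }

        new XElement("database",
            new XAttribute("name", dbName),
            new XAttribute("compressionmethod", "gzip"),
            Convert.ToBase64String(compressed)).Save(xmlPath);
    }
}
```

Restoring is the mirror image: read the element, Convert.FromBase64String, decompress back to a .bak file, and run RESTORE DATABASE from it.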
Because you are talking about a CMS, I'm going to assume you are deploying into hosted environments where you might not have command line access.
Now, before I give you the link I want to state that this is a BAD idea. XML is way too verbose to transfer large amounts of data. Further, although it is relatively easy to pull data out, putting it back in will be difficult and a very time-consuming development project in itself.
Next alert: as Denis suggested, you are going to miss all of your stored procedures, functions, etc. Your best bet is to use the normal sql server backup / restore process. (Incidentally, I upvoted his answer).
Finally, the last time I dealt with XML and SQL Server we noticed interesting issues that cropped up when data exceeded a 64KB boundary. Basically, at 63.5KB, the queries ran very quickly (200ms). At 64KB, the query times jumped to over a minute and sometimes quite a bit longer. We didn't bother testing anything over 100KB as that was taking 5 minutes on a fast/dedicated server with zero load.
http://msdn.microsoft.com/en-us/library/ms188273.aspx
See this for putting it back in:
How to insert FOR AUTO XML result into table?
For kicks, here is a link talking about pulling the data out as json objects: http://weblogs.asp.net/thiagosantos/archive/2008/11/17/get-json-from-sql-server.aspx
you should also read (not for the faint of heart): http://www.simple-talk.com/sql/t-sql-programming/consuming-json-strings-in-sql-server/
Of course, the commenters all recommend building something using a CLR approach, but that's probably not available to you in a shared database hosting environment.
At the end of the day, if you are truly insistent on this madness, you might be better served by simply iterating through your table list and exporting all the data to standard CSV files. Then, iterate the CSV files to load the data back in, à la C# - is there a way to stream a csv file into database?
Bear in mind that ALL of the above methods suffer from:
long processing times due to the data overhead, which leads to
a high potential for failure due to the various timeouts (page processing, command, connection, etc.); and,
if your data model changes between the time it was exported and reimported, then you're back to writing custom translation code and you're ultimately screwed anyway.
So, only do this if you really, really have to and are at least somewhat of a masochist at heart. If the purpose is simply to transfer some data from one installation to another, you might consider using tools like SQL Compare and SQL Data Compare from Red Gate to handle the transfer.
I don't care how much (or little) you make, the $1500 investment in their developer bundle is much cheaper than the months of time you are going to spend doing this, fixing it, redoing it, fixing it again, etc. (for the record I do NOT work for them. Their products are just top notch.)
Red Gate's SQL Packager lets you package a database into an exe or to a VS project, so you might want to take a look at that. You can specify which tables you want to consider for data.
Is there any specific reason you want to do this using xml?

What's the best way to load a large amount of SQL data in .NET and generate a CSV file?

I have a SQL proc that returns 20,000+ records, and I want to get this data into a CSV for a SQL 2005 bulk load operation.
I think using a DataSet is overkill since I only need forward-only read access to the data.
Right now I have a data reader, but I don't think iterating the data reader is a good idea because it will hold a lock on the Oracle DB I'm getting the 20,000 records from for some time while it does its thing.
Logically I am thinking of creating a disconnected snapshot of the data, maybe in a DataTable, and using that to generate my CSV file.
I don't often develop such ETL apps, so I wanted to know what the gold standard is for this type of operation.
Thoughts?
Also, allow me to mention that this needs to be a console app since corporate rules won't allow linked servers or anything cool - so that means SSIS is out.
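One way to keep it forward-only, sketched below with placeholder names (shown with SqlClient; if the proc actually lives on Oracle, swap in the matching Oracle provider classes): iterate the reader once and stream each row straight to the CSV, so only one row is in memory at a time.

```csharp
// Streams a stored procedure's result set straight to a CSV file via a data reader.
// Connection string, proc name, and output path are hypothetical; quoting is
// deliberately simple (wrap in quotes, double any embedded quotes).
using System;
using System.Data;
using System.Data.SqlClient;
using System.IO;

class ProcToCsv
{
    static void Main()
    {
        const string cs = "Data Source=MyServer;Initial Catalog=MyDb;Integrated Security=SSPI;";

        using (var conn = new SqlConnection(cs))
        using (var cmd = new SqlCommand("dbo.MyExportProc", conn) { CommandType = CommandType.StoredProcedure })
        {
            conn.Open();
            using (var reader = cmd.ExecuteReader())
            using (var writer = new StreamWriter(@"C:\temp\export.csv"))
            {
                // Header row from the reader's column names.
                var header = new string[reader.FieldCount];
                for (int i = 0; i < reader.FieldCount; i++) header[i] = Quote(reader.GetName(i));
                writer.WriteLine(string.Join(",", header));

                var fields = new string[reader.FieldCount];
                while (reader.Read())
                {
                    for (int i = 0; i < reader.FieldCount; i++)
                        fields[i] = reader.IsDBNull(i) ? "" : Quote(Convert.ToString(reader.GetValue(i)));
                    writer.WriteLine(string.Join(",", fields));
                }
            }
        }
    }

    static string Quote(string value) => "\"" + value.Replace("\"", "\"\"") + "\"";
}
```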
Since you are worried about iterating the data reader yourself, I would recommend using the SqlBulkCopy class.
It lets you load data into a SQL Server database from any source that can be read with an IDataReader instance.
That might solve your potential locking issue.

MARS for MySQL in C#

Today I've implemented a nasty hack in my code where every request to the database opens its own connection, due to the fact that I couldn't find any way to enable MARS (Multiple Active Result Sets) when communicating with a MySQL database.
In my C# program I do a lot of parallel work, which isn't a problem with databases such as MSSQL 2005 and 2008 (append ;MultipleActiveResultSets=true to your connection string) or SQLite (supports it "out of the box"), where you are able to retrieve two result sets from the database at the same time.
Things that I do know: it's expensive to open a connection to the database, and therefore I would like to keep these to a minimum.
Any suggestions?
Maybe the best way to handle this type of scenario is to implement the parallel data processing inside your database, using a stored procedure or a cursor, so you don't need to deal with a very database-specific feature.
Any suggestions?
I don't know if there really is no way to enable MARS with MySQL, but if that's correct, then my best suggestion is to rely on connection pooling.
Look at the MySQL documentation for connection string parameters (no MARS) -
http://dev.mysql.com/doc/refman/5.5/en/connector-net-connection-options.html
Things that I do know: it's expensive to open a connection to the database, and therefore I would like to keep these to a minimum.
Utilize connection pooling all the way! (just make sure you use the exact same connection string each time).
MySQL doesn't support MARS; instead you have to save the data and close the reader, or use another connection for the new reader.
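If it helps, a minimal sketch of the pooled, one-connection-per-query pattern suggested above, using MySQL Connector/NET (MySql.Data); the connection string and queries are placeholders.

```csharp
// Each query borrows a connection from the pool and returns it on dispose, so two
// readers can be open in parallel without MARS and without paying for a fresh
// TCP connection each time (the connection string must be identical every time).
using MySql.Data.MySqlClient;
using System.Threading.Tasks;

class PooledQueries
{
    const string Cs = "Server=localhost;Database=mydb;Uid=user;Pwd=pass;Pooling=true;";

    static void RunQuery(string sql)
    {
        using (var conn = new MySqlConnection(Cs))   // borrowed from the pool
        using (var cmd = new MySqlCommand(sql, conn))
        {
            conn.Open();
            using (var reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                {
                    // ... consume the row ...
                }
            }
        }                                            // returned to the pool here
    }

    static void Main()
    {
        // Two result sets "active" at the same time, each on its own pooled connection.
        Parallel.Invoke(
            () => RunQuery("SELECT * FROM orders"),
            () => RunQuery("SELECT * FROM customers"));
    }
}
```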
