I have almost the exact same issue as the scenario (linked) below, but unfortunately i'm unable to recreate the solutions succesfully.
I have a c# application using SQL Bulk Import with a datareader and writetoserver, where it's the SQLDatReader or an OracleDataReader, and i need to add columns to the result set.
I can not do it on the source sql statement.
I can not load a data table first and modify it (as it's 100's of gb's of data, almost a terabyte).
How to add columns to DataReader
can anyone provide a working example an help "push" me over this problem?
I temporarily found a solution of using SQL Server Integration Services (SSIS), but what i found while watching it run is it downloads all the data to a dts_buffer, than does the column modifications and then pumps the data into sql server, try doing that with a couple 100gb of data and it is not a good performing thing, if you can even get your infrastructure to build you a 24 core VM with 128gb of memory).
I finally have a small working example, the codeproject (jdweng) was helpful.
I will pose followup, i've tested with sql server (sqldatareader), need to do a test with oracle data reader.
One of the cases i was trying was converting a oracle unique id (stored as a string) to sql server as a uniqueidentifier. I want convert that on the fly, there is no way to adjust the source oracle statement (ADyson) to return a compatible datatype to sql server. Altering a 1tb table afterwards from varchar(40) to uniqueidentifier is painful, but if i could just change as part of the bulk insert, it'd be quick.
and i think now i will be able to.
Related
I have a conceptual question.
I have two databases which have the same structure. One database has already contained a lot of data. These data should be transferred to the other database via Select and Insert.
How can I do this data migration with the highest performance?
My first approach was to sort all the tables in a list where the tables which contain foreign keys will be stored behind the referenced tables. But with this solution it will be impossible to start parallel processing.
The second idea was to create a custom type which contains the tablename and the tablenames of the referenced tables and a bool flag which stores whether the data in the table have been copied. This type is stored for each table in a list. Then I start a new thread that checks before copying whether the referenced tables have already been created for each table. If not, I execute Thread.Sleep() after which I will check it again.
Is there a well performing approach to this problem?
Any suggestions will be helpful.
EDIT:
The old database is a SQL Base database.
The new database is a ms sql server database.
You may use either:
1) SQL Server Replication
Or
2) SQL Server Merge statement
You may use SQL Server Linked Servers to connect different database platforms (e.g. sql server, mysql, db2, ...)
Best advice : Stick with SQLBase.
Second best advice , use the SQLBase UNLOAD command via SQLTalk. This will write all DDL statements required to recreate the database else where - including Triggers , Stored Procs, Indexes etc. plus all the Data to load , if you use the right options, to an external file . This file can optionally then be edited programmatically if need be to be in a sql server format ( not much difference ) . There are many options to the UNLOAD command which can't be written here in detail , but here's a link to the syntax.
Note that SQLBase v12 has been released in recent months and performance has increased tenfold. With the right tuning and indexes etc . it will outstrip Sql Server in terms of efficiency and performance. On a 100Gb database our response times have gone from 50 seconds to 3 seconds with no additional work .
I have two databases. One of them belongs to a CRM software and is the source.
The other one will be the destination used by a tool I'm developing.
The destination will contain a table ADDRESSES with a subset of the columns of a table of the same name in the source database.
What is the best (most efficient) way to copy the data between those databases (btw: they're on different SQL Server instances if that's important).
I could write a loop which does INSERT into the destination for each row obtained from the source but I don't think that this is efficient.
My thoughts and information:
The data won't be altered on its way from source to destination
It will be altered on its way back
I don't have the complete structure of the source but I know which fields I need and that they're warranted to be in the source (hence, access to the rows obtained from source isn't possible using the index of columns)
I can't use LINQ.
Anything leading me in the right direction here is appreciated.
Edit:
I really need a C# way to copy the data. I also need to know how to merge the copied rows back to the source. Is it really necessary (or even best practise) to do this row after row?
Why write code to do this?
The single fastest and easiest way is just to use SQL Server's bcp.exe utility (bcp: Bulk Copy Program).
Export the data from the source server.
Zip it or tar it if it needs it.
FTP it over to where it needs to go, if you need to move it to another box.
Import it into the destination server.
You can accomplish the same thing via SQL Server Management Studio in a number of different ways. Once you've defined the task, it can be saved and it can be scheduled.
You can use SQL Server's Powershell objects to do this as well.
If you're set on doing it in C#:
write your select query to get the data you want from the source server.
execute that and populate a temp file with the output.
execute SQL Server's bulk insert statement against the destination server to insert the data.
Note: For any of these techniques, you'll need to deal with identity columns if the target table has them. You'll also need to deal with key collisions. It is sometimes easier to bulk load the data into a perma-temp table first, and then apply the prerequisite transforms and manipulations to get it to where it needs to go.
According to your comment on Jwrit's answer, you want two way syncs.
If so, you might want to look into Microsoft Sync Framework.
We use it to sync 200+ tables on Premise SQL to SQL Azure and SQL Azure to SQL Azure.
You can use purely C#. However, it might offer a lot more than you want, or it might be over kill for a small project.
I'm just saying so that you can have different option for your project.
If these databases exist on two servers you can setup a link between the servers by executing sp_addlinkedserver there are instructions for setting this up here. This may come in handy if you plan on regularly "Sharing" data.
http://msdn.microsoft.com/en-us/library/ff772782.aspx
Once the servers are linked a simple select statement can copy the rows from one table to another
INSERT INTO db1.tblA( Field1,Field2,Field2 )
SELECT Field1,Field2,Field2 FROM db2.tblB
If the Databases are on the same instance you only need to execute similar SQL to the above
If this is one time - the best bet is normally SSIS (SQL server integration services), unless there are complex data transformations - you can quickly and easily do column mappings and have it done (reliably) in 15 mins flat......
I am looking for an idea or some direction. I have to transfer data from one database to another both are structurally and schema wise same.
Its a complete database with maybe 70 tables and having relationship between tables at different levels. Even though i ' m going to mess up the identity when i move across database but as of now i am ok with it.
Idea which i thought was to load required data from all table into an XML and then create connection to second database and push data from this XML its kind of repeated and not best way at all. So looking for direction.
Can i use entity framework for this somehow??
I cannot use SSIS for this it has to be C# Sorry.
You can create a linked server as stated in the comments to your question. You seemed to indicate that you know how to do this, but in case not, you can do this from SQL Server Management Studio by drilling down to "Server Objects > Linked Servers" beneath your source database on the source database server, then right-click, "New Linked Server", etc.
Then you would use a statement like this, for example, from your C# code:
insert into DestServer.DBName.dbo.TableName
select * from SourceServer.DBName.dbo.TableName
Assuming you are connected to 'SourceServer' and that 'SourceServer' maintains a linked server object pointing to 'DestServer'. Note: you don't actually need to use the fully-qualified name for the table on 'SourceServer', but I've put it there for clarity. I.E. you could also do this:
insert into DestServer.DBName.dbo.TableName
select * from TableName
Don't forget to setup the permissions properly in your linked server object so that your query can write to the table in the destination server. You can do this any number of ways, and often (because I work in a small environment where it's maintained by just me and a couple other folks) I just use the "sa" login:
Yes, can use linked servers in .NET.
You just use the 4 part name.
What you can do in TSQL in SSMS you can do in a .NET SQLcommand.
My experience is that you get better performance connecting to the server you are writing to.
I'm trying to learn to use SQLite, but I'm very frustrated and confused. I've gotten as far as finding System.Data.SQLite, which is apparently the thing to use for SQLite in C#.
The website has no documentation whatsoever. The "original website", which is apparently obsolete from 2010 onwards, has no documentation either. I could find a few blog tutorials, but from what I can tell their method of operation is basically:
Initialize a database connection.
Feed SQL statements into the connection.
Take out stuff that comes out of the connection.
Close connection.
I don't want to write SQL statements in my C# code, they're ugly and I get no assistance from the IDE because I have to put the SQL code in strings.
Can't I just:
Create a DataSet.
Tell the DataSet that it should correspond to the SQLite database MyDB.sqlite.
Manipulate the DataSet using its member functions.
Not worry about SQLite because the DataSet automatically keeps itself in sync with the SQLite database on disc.
I know that I can fill a DataSet with the contents of a database, but if I want access to the entire database I will have to fill the DataSet with all of its contents. If my database is 1 GB, I have just used up 1 GB of RAM (not to mention the time needed to write all of it at once).
Can't I simply take a SQLite database connection and pretend it's just an ordinary DataSet (that perhaps needs to be asked occasionally if it's done syncing yet)?
The answer to the question is no.
No you cannot simply take a SQLite connection pretend it's just a DataSet.
If you don't want to code SQL statements then consider Entity Framework.
Using SQLite Embedded Database with Entity Framework and Linq-to-SQL
You shouldn't treat a DataSet as a database. It's just a result of a query.
You query the database to get a subset of data (you never want ALL the data from your DB) and this subset is used to populate your DataSet.
You are required to synchronize your changes manually because DataSet doesn't know which updates should be a part of which transaction. This is your system knowledge.
The DataSet is an in memory cache and will only synchronize to the underlying data store when the developer allows it. You could put a timer wrapper around in and do it on a schedule but you still need to keep the Dataset and data store synchronized manually.
Storing 1GB+ of data is really not recommended as the memory usage would be very high and the performance very low. You also don't want to be sending that amount of data over a network or god forbid an internet connection.
Why would you want to keep 1GB of data in memory?
I am trying to replace a DTS access exporter package with a exe we can call from our stored procedures (using xp_cmdshell).
We are in the middle of a transition between SQL 2000 and SQL 2005, and for the moment if we can not use DTS OR SSIS that would be the best options.
I believe I have the following options:
Using a SQL data reader to read SQL records, and using ADO.net to insert the read records into Access.
I have implemented this and it is WAY too slow. This is not a option
Setting up Linked tables in access, then getting access to pull the data out of sql.
If anyone has any experience in doing this I would be grateful for some code examples or pointing out some resources?
If there are any other options for transferring large amounts of data from SQL into a Access database that would be awesome, but performance is a big issue as we can be dealing with up to 1mil records per table.
Have you tried this?
Why not creating a linked table in Access, and pulling data from Sql Server instead of pushing from Sql to Access ?
I've done plenty of cases where I start with an Access database, attach to SQL Server, create a Create Table or Insert Querydef, and write some code to execute the querydef, possibly with arguments. But there are a lot of assumptions I would need to make about your problem and your familiarity with Access to go into more detail. How far can you get with that description?
I have ended up using Access interop, thanks to le dorfier for pointing me in the direction of the import function which seems to be the simplest way..
I now have something along these lines:
Access.ApplicationClass app = new Access.ApplicationClass();
Access.DoCmd doCmd = null;
app.NewCurrentDatabase(_args.Single("o"));
doCmd = app.DoCmd;
//Create a view on the server temporarily with the query I want to export
doCmd.TransferDatabase(Access.AcDataTransferType.acImport,
"ODBC Database",
string.Format("ODBC;DRIVER=SQL Server;Trusted_Connection=Yes;SERVER={0};Database={1}", _args.Single("s"), _args.Single("d")),
Microsoft.Office.Interop.Access.AcObjectType.acTable,
viewName,
exportDetails[0], false, false);
//Drop view on server
//Releasing com objects and exiting properly.
Have you looked at bcp? It's a command line utility that's supposed work well for importing and exporting large amounts of data. I've never tried to make it play nice with Access, but it's a great lightweight alternative to DTS and/or SSIS.
Like others have said, the easiest way I know to get data into an Access mdb is to set things up in Access to begin with. Roughly speaking:
Create linked tables to the SQL data you want to export. (in Access: File --> get ecternal data --> link tables) This just gives you a connection to sql server.
Create a local table that represents teh schema of the data you want to export. (on the tables tab, click the "new" button and follow your nose).
Create an Update query that selects data from the linked tables (SQL Server) and appends rows to the local table (access mdb).
On the macros tab, create a new macro that executes the query you just created above (I can't recall the exact "action" to use, but it's something like OpenQuery or RunQuery); name the macro "autoexec", which will cause it to automatically run when the mdb is opened.
Use a script (or whatever) to copy and open the mdb when appropriate; the autoexec macro will kick things off and the query will copy data from SQL server to the mdb.