SqlBulkCopy vs SSIS - c#

I was trying to write a program to copy a large table, around 2 billion records, to another table. It looks to me like SqlBulkCopy waits until all data has been read from the SqlDataReader before inserting. If I run the same query and table in SSIS, the insert starts right away and I can see rows appearing in the target table.
Am I coding it correctly? How can I make it behave like SSIS?
using (SqlConnection conn = new SqlConnection(connectionString))
{
    conn.Open();
    using (SqlCommand cmd = new SqlCommand(bukCopyData.SourceQuery, conn))
    {
        cmd.CommandTimeout = 0;
        using (SqlDataReader reader = cmd.ExecuteReader())
        {
            performBulkCopy(connectionStringDest, DestinationTable, reader);
        }
    }
}
private void performBulkCopy(string connectionString, string destinationTable, SqlDataReader reader)
{
    using (SqlBulkCopy sbc = new SqlBulkCopy(connectionString,
        SqlBulkCopyOptions.KeepIdentity | SqlBulkCopyOptions.CheckConstraints | SqlBulkCopyOptions.KeepNulls))
    {
        sbc.DestinationTableName = destinationTable;
        sbc.BatchSize = 102400;
        sbc.BulkCopyTimeout = 0;
        try
        {
            sbc.WriteToServer(reader);
        }
        catch (Exception)
        {
            throw;
        }
        finally
        {
            reader.Close();
        }
    }
}

Actually, the EnableStreaming property seems to work. I had to make sure the column order in the query matches the target table. I'm not sure why it takes so long to throw the error when the columns don't match.
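For reference, here is a minimal sketch of the bulk-copy call with EnableStreaming turned on and explicit column mappings, so the column order in the source query no longer has to match the destination table; the mapped column names are placeholders, not from the original post:

using (SqlBulkCopy sbc = new SqlBulkCopy(connectionStringDest,
    SqlBulkCopyOptions.KeepIdentity | SqlBulkCopyOptions.KeepNulls))
{
    sbc.DestinationTableName = destinationTable;
    sbc.EnableStreaming = true;   // stream rows from the reader instead of buffering them all first
    sbc.BatchSize = 102400;
    sbc.BulkCopyTimeout = 0;

    // Hypothetical column names: map source columns to destination columns by name
    sbc.ColumnMappings.Add("Id", "Id");
    sbc.ColumnMappings.Add("CreatedDate", "CreatedDate");

    sbc.WriteToServer(reader);
}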


Why can't I get data from my NPGSQL server?

I need to write a C# program that can manage the data on my server. I have a PostgreSQL server set up (accessed through Npgsql) with a data table; I can write data into it, but I just can't read the data back from the program. What am I doing wrong?
public NpgsqlDataReader reader;
public NpgsqlCommand InsertCommand = new NpgsqlCommand();
public String sConnectionString;
public Npgsql.NpgsqlConnection Conn;

public void DataBaseOpen()
{
    sConnectionString = "Server=192.168.1.100;Port=5432;Username=postgres;Password=admin;Database=analoginput;Pooling=false;MinPoolSize=1;MaxPoolSize=999;Timeout=15;";
    Conn = new Npgsql.NpgsqlConnection(sConnectionString);
    InsertCommand = Conn.CreateCommand();
    Conn.Open();
}

public void DataBaseClose()
{
    Npgsql.NpgsqlConnection.ClearAllPools();
    Conn.Close();
}
InsertCommand.CommandText = "Select * From public.sensorlog WHERE \"date\" > '2019.07.08.' And \"date\" < '2019.07.10.' order by Date asc;";
System.Windows.MessageBox.Show(InsertCommand.CommandText);
Npgsql.NpgsqlDataReader reader = InsertCommand.ExecuteReader();
System.Data.DataTable CSV = new System.Data.DataTable();
while (reader.Read())
{
    CSV.Load(reader);
}
I want to load the data into the CSV datatable, but I just can't get it to work. The datatable is just empty.
What if you refactor your code to something like the example below?
The using statements guarantee that your connection and command are closed/disposed when they go out of scope, and the try/catch block catches any exception and reports it to the UI via a message box. That will help you see the actual error if there is one.
public Npgsql.NpgsqlConnection DatabaseOpen()
{
    var sConnectionString = "Server=192.168.1.100;Port=5432;Username=postgres;Password=xxx;Database=analoginput;Pooling=false;MinPoolSize=1;MaxPoolSize=999;Timeout=15;";
    var Conn = new Npgsql.NpgsqlConnection(sConnectionString);
    Conn.Open();
    return Conn;
}

public void Main()
{
    try
    {
        using (var conn = DatabaseOpen())
        {
            using (var InsertCommand = conn.CreateCommand())
            {
                InsertCommand.CommandText = "Select * From public.sensorlog WHERE \"date\" > '2019.07.08.' And \"date\" < '2019.07.10.' order by Date asc;";
                System.Windows.MessageBox.Show(InsertCommand.CommandText);
                Npgsql.NpgsqlDataReader reader = InsertCommand.ExecuteReader();
                System.Data.DataTable CSV = new System.Data.DataTable();
                while (reader.Read())
                {
                    CSV.Load(reader);
                }
            }
        }
    }
    catch (Exception ex)
    {
        System.Windows.MessageBox.Show(ex.Message);
    }
    finally
    {
        Npgsql.NpgsqlConnection.ClearAllPools();
    }
}
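One more thing worth checking, as a suggestion rather than a confirmed fix: DataTable.Load(reader) consumes the whole result set by itself, so it should not be wrapped in a while (reader.Read()) loop (the initial Read() call quietly drops the first row and makes the loop unnecessary). A minimal sketch of the read, reusing the same command from above:

using (var reader = InsertCommand.ExecuteReader())
{
    var CSV = new System.Data.DataTable();
    CSV.Load(reader);   // Load() iterates the reader itself; no explicit Read() loop is needed
    System.Windows.MessageBox.Show("Rows loaded: " + CSV.Rows.Count);
}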

Cannot Use DbContext.Query inside a transaction

I am using EF6 to query a backend database. Users can customize a temporary table and query the data from it. I am using
DataTable result = context.Query(queryStatement);
to get the result, and it has been working fine.
Now the query needs to run among a series of other SQL commands, and a transaction is needed. So I have
public static DataTable GetData()
{
    using (MyDbContext context = new MyDbContext())
    using (DbContextTransaction tran = context.Database.BeginTransaction())
    {
        try
        {
            int rowAffected = context.Database.ExecuteSqlCommand(
                "UPDATE [MyDb].dbo.[TableLocks] SET RefCount = RefCount + 1 WHERE TableName = 'TESTTABLE1'");
            if (rowAffected != 1)
                throw new Exception("Cannot find 'TestTable1'");

            //The following line will raise an exception
            DataTable result = context.Query("SELECT TOP 100 * FROM [MyDb].dbo.[TestTable1]");
            //This line will work if I change it to
            //context.Database.ExecuteSqlCommand("SELECT TOP 100 * FROM [MyDb].dbo.[TestTable1]");
            //but I don't know how to get the result out of it.

            context.Database.ExecuteSqlCommand(
                "UPDATE [MyDb].dbo.[TableLocks] SET RefCount = RefCount - 1 WHERE TableName = 'TestTable1'");
            tran.Commit();
            return result;
        }
        catch (Exception ex)
        {
            tran.Rollback();
            throw (ex);
        }
    }
}
But this throws an exception while executing context.Query
ExecuteReader requires the command to have a transaction when the connection
assigned to the command is in a pending local transaction. The Transaction
property of the command has not been initialized.
And when I read this article: https://learn.microsoft.com/en-us/ef/ef6/saving/transactions
It says:
Entity Framework does not wrap queries in a transaction.
Is that the reason for this issue?
How can I use context.Query() inside a transaction?
What else can I use?
I tried all the other methods, and none of them works, because the return data type cannot be predicted beforehand.
I just realized that the Query method is defined in MyDbContext!
public DataTable Query(string sqlQuery)
{
    DbProviderFactory dbFactory = DbProviderFactories.GetFactory(Database.Connection);
    using (var cmd = dbFactory.CreateCommand())
    {
        cmd.Connection = Database.Connection;
        cmd.CommandType = CommandType.Text;
        cmd.CommandText = sqlQuery;
        using (DbDataAdapter adapter = dbFactory.CreateDataAdapter())
        {
            adapter.SelectCommand = cmd;
            DataTable dt = new DataTable();
            adapter.Fill(dt);
            return dt;
        }
    }
}
Maybe you are missing this section of that article:
you are free to execute database operations either directly on the
SqlConnection itself, or on the DbContext. All such operations are
executed within one transaction. You take responsibility for
committing or rolling back the transaction and for calling Dispose()
on it, as well as for closing and disposing the database connection
And then this code sample:
using (var conn = new SqlConnection("..."))
{
    conn.Open();
    using (var sqlTxn = conn.BeginTransaction(System.Data.IsolationLevel.Snapshot))
    {
        try
        {
            var sqlCommand = new SqlCommand();
            sqlCommand.Connection = conn;
            sqlCommand.Transaction = sqlTxn;
            sqlCommand.CommandText =
                @"UPDATE Blogs SET Rating = 5" +
                " WHERE Name LIKE '%Entity Framework%'";
            sqlCommand.ExecuteNonQuery();

            using (var context = new BloggingContext(conn, contextOwnsConnection: false))
            {
                context.Database.UseTransaction(sqlTxn);
                var query = context.Posts.Where(p => p.Blog.Rating >= 5);
                foreach (var post in query)
                {
                    post.Title += "[Cool Blog]";
                }
                context.SaveChanges();
            }
            sqlTxn.Commit();
        }
        catch (Exception)
        {
            sqlTxn.Rollback();
        }
    }
}
Especially this line:
context.Database.UseTransaction(sqlTxn);
Sorry guys, as mentioned above, I thought the Query method came from EF, but I examined the code and found it is actually written by another developer and defined in the MyDbContext class. Since this class is generated by EF, I never thought somebody would have added a method to it.
Here it is:
public DataTable Query(string sqlQuery)
{
    DbProviderFactory dbFactory = DbProviderFactories.GetFactory(Database.Connection);
    using (var cmd = dbFactory.CreateCommand())
    {
        cmd.Connection = Database.Connection;
        cmd.CommandType = CommandType.Text;
        cmd.CommandText = sqlQuery;

        //And I added this line, then problem solved.
        if (Database.CurrentTransaction != null)
            cmd.Transaction = Database.CurrentTransaction.UnderlyingTransaction;

        using (DbDataAdapter adapter = dbFactory.CreateDataAdapter())
        {
            adapter.SelectCommand = cmd;
            DataTable dt = new DataTable();
            adapter.Fill(dt);
            return dt;
        }
    }
}

There is already an open DataReader associated with this Command which must be closed first. C#

When I start debugging, that error shows up, and it is associated with this line:
textBox1.Text = cmd.ExecuteReader().ToString();
private void Form1_Load(object sender, EventArgs e)
{
    SqlConnection conn = new SqlConnection(@"server= M_SHAWAF\ORCHESTRATE; integrated security=true; database=MyData");
    try
    {
        conn.Open();
        SqlCommand cmd = new SqlCommand();
        cmd = new SqlCommand(@"select MAX(Nodelevel) from Org", conn);
        int s = Int32.Parse(cmd.ExecuteScalar().ToString());
        for (int i = 0; i <= s; i++)
        {
            cmd = new SqlCommand(@"select Name from Org where NodeLevel=" + i.ToString(), conn);
            textBox1.Text = cmd.ExecuteReader().ToString();
        }
    }
    catch (SqlException ex)
    {
        MessageBox.Show(ex.Message);
    }
    finally
    {
        conn.Close();
    }
}
How can I fix that??
You don't need to repeatedly execute readers in order to obtain the next row of data. If all you need to do is iterate through every value of Name from table Org, you can execute a single SQL query to return all rows into one reader and then traverse that reader, e.g.:
try
{
    using (var conn = new SqlConnection(@"..."))
    {
        conn.Open();
        using (var cmd = new SqlCommand(@"select Name from Org", conn))
        {
            using (var reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                {
                    textBox1.Text = reader["Name"].ToString();
                }
            }
        }
    }
}
catch (SqlException ex)
{
    MessageBox.Show(ex.Message);
}
Edit, Re Hierarchical table structures
If you do need to retain separate iterators while navigating through multiple levels of a hierarchy, you will need multiple readers. As per @Philips' answer, in order to have more than one active result set per SqlConnection, you'll need to enable MARS (or open multiple connections).
try
{
    using (var conn = new SqlConnection(@"...;MultipleActiveResultSets=True"))
    using (var cmdOuter = new SqlCommand(@"select distinct NodeLevel from Org", conn))
    {
        conn.Open();
        using (var outerReader = cmdOuter.ExecuteReader())
        {
            while (outerReader.Read())
            {
                var nodeLevel = outerReader.GetInt32(0);
                Console.WriteLine("Node Level {0}", nodeLevel);
                using (var cmdInner = new SqlCommand(@"select Name from Org WHERE NodeLevel = @NodeLevel", conn))
                {
                    cmdInner.Parameters.AddWithValue("@NodeLevel", nodeLevel);
                    using (var innerReader = cmdInner.ExecuteReader())
                    {
                        while (innerReader.Read())
                        {
                            Console.WriteLine("Name: {0}", innerReader.GetString(0));
                        }
                    }
                }
            }
        }
    }
}
catch (SqlException ex)
{
    MessageBox.Show(ex.Message);
}
Modify your connection string to allow multiple active result sets:
connectionString="Data source=localhost; initial catalog=Interstone; integrated security=True; multipleactiveresultsets=True;"
scroll to the right for the right information ;-)
But there are many alternatives that avoid needing multiple queries at the same time. Every query that is still pending consumes resources on the server, and that should be minimized. So first consider algorithms that don't require multiple cursors, and only set up MARS if there is no alternative.
A reader is the wrong tool here, and you do have an open reader: the readers returned by ExecuteReader() in the loop are never closed, which is what triggers the error. If you only want a single value per query, use ExecuteScalar instead:
textBox1.Text = cmd.ExecuteScalar().ToString();
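A minimal sketch of how the loop might look with a parameterized scalar query; the variable names follow the question, and this is a suggestion rather than the original code:

for (int i = 0; i <= s; i++)
{
    using (var cmd = new SqlCommand("select Name from Org where NodeLevel = @level", conn))
    {
        cmd.Parameters.AddWithValue("@level", i);
        // ExecuteScalar returns the first column of the first row, or null when there is no row
        object name = cmd.ExecuteScalar();
        textBox1.Text = name?.ToString() ?? string.Empty;
    }
}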

How to copy MySqlDataReader into an array and then loop through the array?

I am new to C#, so yes, this should be a fairly easy question, but I can't seem to find the answer to it.
I have a method that queries a database.
What I am trying to do here is handle the loop through the data outside the method.
public MySqlDataReader getDataSet(string query)
{
    MySqlDataReader dataset = null;
    MySqlConnection conn = new MySqlConnection(conn_string);
    if (startConnection(conn) == true)
    {
        MySqlCommand cmd = new MySqlCommand(query, conn);
        dataset = cmd.ExecuteReader();
        closeConnection(conn);
    }
    return dataset;
}
What I could do is write a while loop just before the closeConnection(conn); line and handle the data there. But I don't want to do it inside this method; I want to do it somewhere else in my code.
In one of my forms I want to read the database on load, so here is what I tried:
public newDepartment()
{
    InitializeComponent();
    inputDepartmentName.Text = "Hi";
    dbConnetion db = new dbConnetion();
    MySqlDataReader ds = db.getDataSet("SELECT name FROM test;");
    while (ds.Read())
    {
        //Do Something
    }
}
The problem I am having is that I get the error Invalid attempt to Read when reader is closed.
I believe I get this error because I close the connection and then try to read from the reader. So what I need to do is read the data from the query, put it in an array, and then loop through the array and deal with the data in a different form.
How can I work around this issue? If my idea is sound, how can I copy the data into an array, and how do I loop through the array?
Here is the full class:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using MySql.Data.MySqlClient;
using System.Windows.Forms;

namespace POS
{
    public class dbConnetion
    {
        //private OdbcConnection conn;
        private readonly string mServer;
        private readonly string mDatabase;
        private readonly string mUid;
        private readonly string mPassword;
        private readonly string mPort;
        private readonly string conn_string;

        public dbConnetion()
        {
            mServer = "localhost";
            mDatabase = "pos";
            mUid = "root";
            mPassword = "";
            mPort = "3306";
            conn_string = String.Format("server={0};user={1};database={2};port={3};password={4};", mServer, mUid, mDatabase, mPort, mPassword);
        }

        //Start connection to database
        private bool startConnection(MySqlConnection mConnection)
        {
            try
            {
                mConnection.Open();
                return true;
            }
            catch (MySqlException ex)
            {
                MessageBox.Show(ex.Message, "Error", MessageBoxButtons.OK);
                return false;
            }
        }

        //Close connection
        private bool closeConnection(MySqlConnection mConnection)
        {
            try
            {
                mConnection.Close();
                return true;
            }
            catch (MySqlException ex)
            {
                MessageBox.Show(ex.Message);
                return false;
            }
        }

        public MySqlDataReader getDataSet(string query)
        {
            MySqlDataReader dataset = null;
            MySqlConnection conn = new MySqlConnection(conn_string);
            if (startConnection(conn) == true)
            {
                MySqlCommand cmd = new MySqlCommand(query, conn);
                dataset = cmd.ExecuteReader();
                closeConnection(conn);
            }
            return dataset;
        }

        public void processQuery(string strSQL, List<MySqlParameter> pars)
        {
            MySqlConnection conn = new MySqlConnection(conn_string);
            if (startConnection(conn) == true)
            {
                MySqlCommand cmd = new MySqlCommand(strSQL, conn);
                foreach (MySqlParameter param in pars)
                {
                    cmd.Parameters.Add(param);
                }
                cmd.ExecuteNonQuery();
                closeConnection(conn);
            }
        }
    }
}
Putting the records into an array would destroy the best feature of using a data reader: that you only need to allocate memory for one record at a time. Try doing something like this:
public IEnumerable<T> getData<T>(string query, Func<IDataRecord, T> transform)
{
    using (var conn = new MySqlConnection(conn_string))
    using (var cmd = new MySqlCommand(query, conn))
    {
        conn.Open();
        using (var rdr = cmd.ExecuteReader())
        {
            while (rdr.Read())
            {
                yield return transform(rdr);
            }
        }
    }
}
While I'm here, there's a very serious security flaw in this code and the original. A method like this that accepts only a query string, with no separate mechanism for parameters, forces you to write code that will be horribly vulnerable to SQL injection attacks. The processQuery() method already accounts for this, so let's extend getData() to avoid that security issue as well:
public IEnumerable<T> getData<T>(string query, List<MySqlParameter> pars, Func<IDataRecord, T> transform)
{
    using (var conn = new MySqlConnection(conn_string))
    using (var cmd = new MySqlCommand(query, conn))
    {
        if (pars != null)
        {
            foreach (MySqlParameter p in pars) cmd.Parameters.Add(p);
        }
        conn.Open();
        using (var rdr = cmd.ExecuteReader())
        {
            while (rdr.Read())
            {
                yield return transform(rdr);
            }
        }
    }
}
Much better. Now we don't have to write code that's just asking to get hacked anymore. Here's how your newDepartment() method will look now:
public newDepartment()
{
    InitializeComponent();
    inputDepartmentName.Text = "Hi";
    dbConnetion db = new dbConnetion();
    foreach (string name in db.getData("SELECT name FROM test;", null, r => r["name"].ToString()))
    {
        //Do Something
    }
}
One thing about this code is that it uses a delegate so you can provide a method that creates a strongly-typed object. It does this because of the way data readers work: if you don't create a new object on each iteration, you're working with the same object, which can have undesirable results. In this case, I don't know what kind of object you're working with, so I just used a string, based on what your SELECT query was doing.
Based on a separate discussion, here's an example of calling this for a more complicated result set:
foreach (var item in db.getData(" long query here ", null, r =>
    new columnClass()
    {
        firstname = r["firstname"].ToString(),
        lastname = r["lastname"].ToString(),
        //...
    }
))
{
    //Do something
}
Since you are new to .NET, I thought I would point out that there are two layers of database access in ADO.NET. There is the data reader approach you are using, which is online, forward-only reading of query results. This is the lowest-level access and gives you the best performance, but it is more work. For most connection types you can only execute one command, or have one active data reader, per connection (and you can't close the connection before you have read the results, as you are doing).
The other form is the offline data adapter; it requires slightly different code but is generally easier to use.
public DataTable getDataSet(string query)
{
    MySqlConnection conn = new MySqlConnection(conn_string);
    if (startConnection(conn) == true)
    {
        MySqlDataAdapter adapter = new MySqlDataAdapter(query, conn);
        DataTable table = new DataTable();
        adapter.Fill(table);
        closeConnection(conn);
        return table;
    }
    return null;
}
This will give you a DataTable with columns and rows corresponding to the result of your query. (Also look into command builders if you want to post changes back to the database later, but for that you will need to keep the connection open.)
One nice thing about using the data adapter is that it figures out the correct data types, so you don't have to worry about invalid cast exceptions while reading the data, as you would with a data reader.
As somebody pointed out, though, you will need to read all the data into memory, which could be a problem if you are dealing with a lot of data. The DataTable class also gets slow once you are dealing with a lot of records. Finally, the DataTable and DataSet classes generally hook well into the UI components in .NET, so their contents can easily be displayed to users.
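For completeness, here is a sketch of how the form constructor might consume the DataTable version of getDataSet; the table and column names follow the question's SELECT name FROM test query:

dbConnetion db = new dbConnetion();
DataTable table = db.getDataSet("SELECT name FROM test;");
if (table != null)
{
    foreach (DataRow row in table.Rows)
    {
        // Each row is already detached from the connection, so it can be used anywhere
        string name = row["name"].ToString();
        //Do Something with name
    }
}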

My aim is to import data from SQL Server and export it to an Oracle database. There might be thousands of records, so I am trying to use bulk import

I have a stored procedure on the SQL Server side returning the required data. I need to export it to an Oracle DB. I am using bulk copy; here is the code.
string connectionString = System.Configuration.ConfigurationManager.AppSettings.Get("ConnectionString");
string ConnectionStringOracle = System.Configuration.ConfigurationManager.AppSettings.Get("ConnectionStringOracle");

using (SqlConnection connection = new SqlConnection(connectionString))
{
    using (SqlCommand command = new SqlCommand("[SPNAME]", connection))
    {
        connection.Open();
        SqlDataReader rdr = command.ExecuteReader();
        using (OracleConnection destinationConnection = new OracleConnection(ConnectionStringOracle))
        {
            destinationConnection.Open();
            try
            {
                using (Oracle.DataAccess.Client.OracleBulkCopy bulkCopy = new Oracle.DataAccess.Client.OracleBulkCopy(ConnectionStringOracle))
                {
                    bulkCopy.DestinationTableName = "DESTTABLLENAME";
                    //bulkCopy.ColumnMappings.Add(1,1);
                    bulkCopy.WriteToServer(rdr);
                    bulkCopy.Close();
                    bulkCopy.Dispose();
                    destinationConnection.Dispose();
                    connection.Close();
                }
            }
            catch (OracleException ex)
            {
            }
        }
        // ...
    }
    // ...
}
I am getting the following error:
Oracle exception: external component has thrown an exception
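This question has no resolution in the thread; purely as a hedged suggestion, the empty catch (OracleException ex) block hides the underlying error, and "external component has thrown an exception" is often just a generic wrapper around a column type or mapping mismatch. A minimal sketch that surfaces the real error and makes the stored-procedure call explicit, using the placeholder names from the post:

using (SqlCommand command = new SqlCommand("[SPNAME]", connection))
{
    // Makes the intent explicit; a bare procedure name would otherwise run as a text batch
    command.CommandType = CommandType.StoredProcedure;
    connection.Open();
    using (SqlDataReader rdr = command.ExecuteReader())
    using (var bulkCopy = new Oracle.DataAccess.Client.OracleBulkCopy(ConnectionStringOracle))
    {
        bulkCopy.DestinationTableName = "DESTTABLLENAME";
        try
        {
            bulkCopy.WriteToServer(rdr);
        }
        catch (Exception ex)
        {
            // Surface the full exception chain instead of swallowing it
            System.Diagnostics.Debug.WriteLine(ex.ToString());
            throw;
        }
    }
}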
