SQL Query inside a C# For Loop

I'm writing a simple C# program to compare a bunch of log files against our SQL Server, to see which of them have been processed and which haven't.
The issue is the following:
The SQL table has more than 2,000,000 rows, so a plain SELECT would take too long to load.
My idea is the following: I have a custom object that loads all the data from the logs, and then a for loop searches for a match in SQL. What's happening is that it doesn't find any matches with the query inside the for loop.
I've already tried SELECT * FROM, but there are too many records to load that way.
This is my C# Loop:
for (int i = 0; i < registros.Count; i++) {
    command = new($"Select DateTime, P15, Reference from ProductionDay where DateTime = Convert(datetime, '{registros[i].day.Year}-{registros[i].day.Day}-{registros[i].day.Month}')", cnn);
    Console.WriteLine($"{i:D5} - Editing {registros[i].part_number}");
    using (SqlDataReader reader = command.ExecuteReader()) {
        while (reader.Read()) {
            Console.WriteLine("SQL Matched!");
            try {
                if (reader["Reference"].ToString().Trim() == registros[i].part_number) {
                    registros[i].sql_part_number = registros[i].part_number;
                    registros[i].quantity += Convert.ToInt32(reader["P15"].ToString());
                }
            } catch {
            }
        }
    }
}
This code only writes "SQL Matched!" in the last iteration of the loop.
Is this the best way to run a query in a loop? I'm just now learning about C# and SQL connections.

If you want to load the full table, can you try querying it in chunks? That will reduce the stress, and you can also conserve memory for further processing. Try to do the filtering on the server side too, so that not everything is transferred to the client at once (I got this from PostgreSQL, but I think SQL Server supports it as well).
There is an example for this:
SQL Server : large DB Query In Chunks
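As a rough sketch of the chunked approach, assuming the ProductionDay table from the question can be ordered by its DateTime column (the page size is an arbitrary choice to tune):
const int pageSize = 100000; // arbitrary chunk size
int offset = 0;
while (true) {
    using SqlCommand command = new(
        "SELECT DateTime, P15, Reference FROM ProductionDay " +
        "ORDER BY DateTime " +
        "OFFSET @offset ROWS FETCH NEXT @pageSize ROWS ONLY", cnn);
    command.Parameters.AddWithValue("@offset", offset);
    command.Parameters.AddWithValue("@pageSize", pageSize);

    int rowsInChunk = 0;
    using (SqlDataReader reader = command.ExecuteReader()) {
        while (reader.Read()) {
            rowsInChunk++;
            // compare each row against the in-memory log entries here
        }
    }
    if (rowsInChunk < pageSize) break; // last chunk reached
    offset += pageSize;
}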


SQL DataReader network usage limit

I have an idea (I don't know if it's bad or good).
I have a utility which connects to a SQL Server on a schedule and fetches some data into an application. The data is simple (2 varchar text attributes), but there are about 3 million rows, so my application uses the network very intensively.
Can I programmatically decrease (limit, throttle, etc...) the network bandwidth used by SqlDataReader? Let it work more slowly, but stress neither the server nor the client. Is this idea good? If not, what should I do instead?
Here is code, so far:
using (SqlConnection con = new SqlConnection("My connection string here"))
{
    con.Open();
    using (SqlCommand command = new SqlCommand(query, con))
    {
        using (SqlDataReader reader = command.ExecuteReader())
        {
            while (reader.Read())
            {
                yield return new MyDBObject()
                {
                    Date = (DateTime)reader["close_date"],
                    JsonResult = (string)reader["json_result"]
                };
            }
        }
    }
}
Making the server buffer data or hold a query open longer could actually increase the load on the server significantly. Ultimately, the only way to do what you're after is to apply "paging" to your query and access the data in successive pages, perhaps with pauses between pages. The pages can still be pretty big - 100k rows, for example. You can achieve this relatively easily with OFFSET/FETCH in SQL Server.
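A sketch of that idea applied to the iterator above, assuming the result set can be ordered by the close_date column (the page size and pause length are placeholders to tune):
const int pageSize = 100000;
for (int offset = 0; ; offset += pageSize)
{
    int rowsRead = 0;
    using (SqlConnection con = new SqlConnection("My connection string here"))
    {
        con.Open();
        // page the original query; offset and pageSize are ints, so no injection risk
        string pagedQuery = query + " ORDER BY close_date OFFSET " + offset +
                            " ROWS FETCH NEXT " + pageSize + " ROWS ONLY";
        using (SqlCommand command = new SqlCommand(pagedQuery, con))
        using (SqlDataReader reader = command.ExecuteReader())
        {
            while (reader.Read())
            {
                rowsRead++;
                yield return new MyDBObject()
                {
                    Date = (DateTime)reader["close_date"],
                    JsonResult = (string)reader["json_result"]
                };
            }
        }
    }
    if (rowsRead < pageSize)
        yield break; // last page reached
    System.Threading.Thread.Sleep(1000); // pause between pages to ease network/server load
}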

Get the execution time of an ADO.NET SQL Command

I have been searching around to find out whether there is any easy way to get the execution time of an ADO.NET command object.
I know I can manually start and stop a Stopwatch, but I wanted to know if there is an easier way to do it in ADO.NET.
There is a way, but it uses the SqlConnection, not the command object. Example:
using (var c = new SqlConnection(connectionString)) {
    // important
    c.StatisticsEnabled = true;
    c.Open();
    using (var cmd = new SqlCommand("select * from Error", c)) {
        cmd.ExecuteReader().Dispose();
    }
    var stats = c.RetrieveStatistics();
    var firstCommandExecutionTimeInMs = (long)stats["ExecutionTime"];
    // reset for next command
    c.ResetStatistics();
    using (var cmd = new SqlCommand("select * from Code", c)) {
        cmd.ExecuteReader().Dispose();
    }
    stats = c.RetrieveStatistics();
    var secondCommandExecutionTimeInMs = (long)stats["ExecutionTime"];
}
Here you can find what other values are contained in the dictionary returned by RetrieveStatistics.
Note that those values represent client-side statistics (basically, the internals of ADO.NET measure them), but since you asked for an analog of Stopwatch, I think that's fine.
The approach from @Evk's answer is very interesting and smart: it works client side, and one of the main keys of such statistics is in fact NetworkServerTime, which
Returns the cumulative amount of time (in milliseconds) that the
provider spent waiting for replies from the server once the
application has started using the provider and has enabled statistics.
so it includes the network time from the DB server to the ADO.NET client.
An alternative, more DB-server-oriented approach would be running SET STATISTICS TIME ON and then retrieving the InfoMessage.
A draft of the code for the delegate (here I'm simply writing to the debug console, but you may want to replace that with a StringBuilder Append):
internal static void TrackInfo(object sender, SqlInfoMessageEventArgs e)
{
    Debug.WriteLine(e.Message);
    foreach (var element in e.Errors) {
        Debug.WriteLine(element.ToString());
    }
}
and usage
conn.InfoMessage += TrackInfo;
using (var cmd = new SqlCommand(@"SET STATISTICS TIME ON", conn)) {
    cmd.ExecuteNonQuery();
}
using (var cmd = new SqlCommand(yourQuery, conn)) {
    using (var RD = cmd.ExecuteReader()) {
        while (RD.Read()) {
            // read the columns
        }
    }
}
I suggest you move to SQL Server 2016 and use the Query Store feature. It tracks execution time and performance changes over time for each query you submit, and it requires no changes in your application. It tracks all queries, including those executed inside stored procedures, from any application, not only your own, and it is available in all editions, including Express and the Azure SQL DB service.
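Once Query Store is enabled, a rough sketch of pulling the tracked durations back out of it from C# (the DMV names are from SQL Server 2016+; avg_duration is reported in microseconds, and an open SqlConnection conn is assumed):
// a sketch, assuming Query Store is already enabled on the target database
string qsQuery =
    "SELECT TOP (10) qt.query_sql_text, rs.avg_duration " +
    "FROM sys.query_store_query_text qt " +
    "JOIN sys.query_store_query q ON q.query_text_id = qt.query_text_id " +
    "JOIN sys.query_store_plan p ON p.query_id = q.query_id " +
    "JOIN sys.query_store_runtime_stats rs ON rs.plan_id = p.plan_id " +
    "ORDER BY rs.avg_duration DESC";
using (var cmd = new SqlCommand(qsQuery, conn))
using (var reader = cmd.ExecuteReader())
{
    while (reader.Read())
        // avg_duration is in microseconds; divide by 1000 for milliseconds
        Console.WriteLine("{0:F1} ms : {1}",
            (double)reader["avg_duration"] / 1000.0, reader["query_sql_text"]);
}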
If you track on the client side, you must measure the time yourself, using a wall clock. I would add and expose performance counters and then use the performance counters infrastructure to capture and store the measurements.
As a side note, simply tracking the execution time of a batch sent to SQL Server yields very coarse performance info and is seldom actionable. Read How to analyse SQL Server performance.

What is the best way for me to connect to an offline database through C#

I have an application that needs to store loads of data in a table format. I want something easy to configure that is also built in with C#.NET; I don't want to have to include additional DLL files.
Also, some links to tutorials explaining the connection process and querying would be great. I'm assuming this is just like PHP, but which database type do I need?
It needs to be able to hold a lot of data, and the ability to perform backups would be nice.
I'm not sure what you mean by "built in with C#.NET", but SQL Server Express comes with Visual Studio.
If you're looking for "a self-contained, embeddable, zero-configuration SQL database engine", you could try System.Data.SQLite.
If you want an offline database, you could use SQL Server CE: it's an in-process database that does not require being attached to a server instance, which is really what you want here. Here is an example in C# of how you would connect and populate a data table to manipulate some data.
// this connection string can also be an absolute file path
string connectionString = @"Data Source=|DataDirectory|\mydatabase.sdf";
using (SqlCeConnection connection = new SqlCeConnection(connectionString)) {
    try {
        connection.Open();
    }
    catch (SqlCeException) {
        // connection failed
    }
    using (SqlCeDataAdapter adapter = new SqlCeDataAdapter("SELECT * FROM <table>", connection)) {
        using (DataTable table = new DataTable("<table>")) {
            adapter.Fill(table); // populate the table with your select statement
            // do stuff with the datatable
            // example:
            foreach (DataRow row in table.Rows) {
                row["mycolumn"] = "somedata";
            }
            table.AcceptChanges();
        }
    }
}
You can even use commands instead of data tables:
using (SqlCeCommand command = new SqlCeCommand("DELETE FROM <table> WHERE id = '0'", connection)) {
    command.ExecuteNonQuery(); // executes the command
}
Have a look at the ease of SQL Server Compact.
It's not built in, but it's easily added: no install, and free.

Delete from database in C#

I'm writing an app in C# WPF with Visual Studio 2010 Express.
I have to say I'm a real beginner with C# and VS, but I've looked at a lot of examples on Google; I really tried to solve this problem on my own.
I have a local database (mydatabase.sdf), and at the load of my window I fill a table of that database with some data. One of the fields of that table needs a unique value, so when I put in the same data on every load, I of course get an error.
I want to delete all the data from the database before I refill it. This seems like it should be easy, but I can't get it working...
I tried
dataset.Tables["mytable"].Clear()
but that doesn't work; it seems to delete the data only from the datagrid (DataTable), not really from the datastore.
I also tried:
for (int i = 0; i < dataset.Tables["mytable"].Rows.Count; i++)
{
    dataset.Tables["mytable"].Rows[i].Delete();
}
this.TableAdapter.Update(this.dataset);
But dataset.Tables["mytable"].Rows.Count returns zero at startup, and when I put in my data I still get the "unique-value error".
The only way to get the data deleted is to delete it manually from the datagrid and then push an Update button; that really deletes it from the datastore.
Making that field in the database non-unique is not an option, for development reasons.
How can I really delete data from the datastore/database (mydatabase.sdf) at the load of my program?
EDIT
Here is the code showing how I fill the database with data:
public void FillInternet()
{
    klantenTableAdapter1.ClearBeforeFill = false;
    string MyConString = "SERVER=myserver;" +
                         "DATABASE=mydb;" +
                         "UID=myuid;" +
                         "PASSWORD=mypass;";
    MySqlConnection connection = new MySqlConnection(MyConString);
    MySqlCommand command = connection.CreateCommand();
    MySqlDataReader Reader;
    command.CommandText = "SELECT klantnr, voorletters, roepnaam, achternaam, tussenvoegsel, meisjesnaam, straat, huisnr, subhuisnr, postcode, plaats, telthuis, telmobiel, telwerk, fax, email, geboortedatum FROM klanten ORDER BY klantnr";
    connection.Open();
    Reader = command.ExecuteReader();
    try
    {
        while (Reader.Read())
        {
            DataRow newLogRow = dataset1.Tables["klanten"].NewRow();
            var thisrow = "";
            for (int i = 0; i < Reader.FieldCount; i++)
            {
                thisrow = Reader.GetValue(i).ToString();
                newLogRow[Reader.GetName(i)] = thisrow;
            }
            dataset1.Tables["klanten"].Rows.Add(newLogRow);
            this.klantenTableAdapter1.Update(this.dataset1);
        }
        connection.Close();
    }
    catch (Exception ex)
    {
        MessageBox.Show("Error: " + ex.Message, "Fout", MessageBoxButton.OK, MessageBoxImage.Error);
    }
    dataset1.AcceptChanges();
    // Fill from internet
    //da.Fill(dataset1.klanten);
    // Fill from local database
    klantenTableAdapter1.Fill(dataset1.klanten);
    this.klantenTableAdapter1.Update(this.dataset1);
    this.DataContext = dataset1.klanten.DefaultView;
}
ADO.NET uses a "disconnected" recordset model. It keeps a copy of the data in client-side structures (DataSet and DataTable). Updates/inserts/deletions made to the client-side structures need to be pushed back out to the database. You need to read up on ADO.NET to get a basic understanding of this process and to get a sense of the ADO.NET event-model, which will be necessary if you want to do anything that involves typical real-world complications. There are many books written on ADO.NET because it is a feature-rich middle-tier data layer with significant complexities.
For your purposes, you could read up on the ADO.NET Command object and the SQL "delete" command. You will also need to explore how ADO.NET handles autoincrementing primary keys, which is one of the trickiest aspects of the disconnected model.
If the database itself defines an autoincrementing key, you cannot supply that value when inserting new rows unless you turn the auto-increment off temporarily in the back-end. That is not an ADO.NET issue, BTW. That is 100% back-end.
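As a rough sketch of that suggestion, assuming the local store is the SQL CE file from the question ("mytable" is the placeholder table name used above), a DELETE pushed through a command clears the datastore itself before the refill:
string connectionString = @"Data Source=|DataDirectory|\mydatabase.sdf";
using (SqlCeConnection connection = new SqlCeConnection(connectionString))
using (SqlCeCommand command = new SqlCeCommand("DELETE FROM mytable", connection))
{
    connection.Open();
    // removes every row from the .sdf file itself, not just from the DataTable copy
    command.ExecuteNonQuery();
}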
From your other posts, I'm going on the assumption that your database here is also MySQL. You mention a unique column, which typically means an auto-increment column. If you "delete" the entries after you've built them (say, items 1-10) and then try to re-add the same 1-10 items on the next cycle, it can choke and give you this message. Try adding numbers to your table starting with the last one used, and see if that helps.

Very slow insert process using LINQ to SQL

I'm inserting a large number of records using LINQ to SQL from C# into a SQL Server 2008 Express DB. The insertion looks very slow. Following is the code snippet:
public void InsertData(int id)
{
    MyDataContext dc = new MyDataContext();
    List<Item> result = GetItems(id);
    foreach (var item in result)
    {
        DbItem dbItem = new DbItem() { ItemNo = item.No, ItemName = item.Name };
        dc.Items.InsertOnSubmit(dbItem);
    }
    dc.SubmitChanges();
}
Am I doing anything wrong? Or is using LINQ to insert a large number of records a bad choice?
Update: Thanks for all the answers.
@p.campbell: Sorry about the record count; it was a typo. It's actually around 100,000, and record counts range up to 200k as well.
As per all the suggestions, I moved this operation into parts (also a requirement change and design decision): I'm retrieving data in small chunks and inserting it into the database as it comes. I've put this InsertData() method into a thread operation and am now using SmartThreadPool to create a pool of 25 threads doing the same operation. In this scenario I'm inserting only 100 records at a time. Whether I try this with LINQ or a SQL query, it makes no difference in the time taken.
As per my requirement, this operation is scheduled to run every hour and fetches records for around 4k-6k users, so I'm now pooling each user's data operation (retrieving and inserting into the DB) as one task assigned to one thread. This entire process takes around 45 minutes for around 250k records.
Is there any better way to do this kind of task? Or can anyone suggest how I can improve this process?
For inserting a massive amount of data into SQL in one go:
Neither LINQ nor SqlCommand is designed for bulk copying data into SQL.
You can use the SqlBulkCopy class, which provides managed access to the bcp utility for bulk loading data into SQL from pretty much any data source.
The SqlBulkCopy class can be used to write data only to SQL Server tables. However, the data source is not limited to SQL Server; any data source can be used, as long as the data can be loaded to a DataTable instance or read with an IDataReader instance.
Performance comparison
SqlBulkCopy is by far the fastest, even when loading data from a simple CSV file.
Linq will just generate a load of INSERT statements in SQL and send them to your SQL Server. This is no different from using ad-hoc queries with SqlCommand; the performance of SqlCommand vs. LINQ is virtually identical.
The Proof
(SQL Express 2008, .Net 4.0)
SqlBulkCopy
Using SqlBulkCopy to load 100000 rows from a CSV file (including loading the data)
using (SqlConnection conn = new SqlConnection("Integrated Security=SSPI;Persist Security Info=False;Initial Catalog=EffectCatalogue;Data Source=.\\SQLEXPRESS;"))
{
    conn.Open();
    Stopwatch watch = Stopwatch.StartNew();
    string csvConnString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\\data\\;Extended Properties='text;'";
    OleDbDataAdapter oleda = new OleDbDataAdapter("SELECT * FROM [test.csv]", csvConnString);
    DataTable dt = new DataTable();
    oleda.Fill(dt);
    using (SqlBulkCopy copy = new SqlBulkCopy(conn))
    {
        copy.ColumnMappings.Add(0, 1);
        copy.ColumnMappings.Add(1, 2);
        copy.DestinationTableName = "dbo.Users";
        copy.WriteToServer(dt);
    }
    Console.WriteLine("SqlBulkCopy: {0}", watch.Elapsed);
}
SqlCommand
using (SqlConnection conn = new SqlConnection("Integrated Security=SSPI;Persist Security Info=False;Initial Catalog=TestDb;Data Source=.\\SQLEXPRESS;"))
{
    conn.Open();
    Stopwatch watch = Stopwatch.StartNew();
    SqlCommand comm = new SqlCommand("INSERT INTO Users (UserName, [Password]) VALUES ('Simon', 'Password')", conn);
    for (int i = 0; i < 100000; i++)
    {
        comm.ExecuteNonQuery();
    }
    Console.WriteLine("SqlCommand: {0}", watch.Elapsed);
}
LinqToSql
using (SqlConnection conn = new SqlConnection("Integrated Security=SSPI;Persist Security Info=False;Initial Catalog=TestDb;Data Source=.\\SQLEXPRESS;"))
{
    conn.Open();
    Stopwatch watch = Stopwatch.StartNew();
    EffectCatalogueDataContext db = new EffectCatalogueDataContext(conn);
    for (int i = 0; i < 100000; i++)
    {
        User u = new User();
        u.UserName = "Simon";
        u.Password = "Password";
        db.Users.InsertOnSubmit(u);
    }
    db.SubmitChanges();
    Console.WriteLine("Linq: {0}", watch.Elapsed);
}
Results
SqlBulkCopy: 00:00:02.90704339
SqlCommand: 00:00:50.4230604
Linq: 00:00:48.7702995
If you are inserting a large amount of data, you can try BULK INSERT.
As far as I know, there is no equivalent of bulk insert in LINQ to SQL.
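A minimal sketch of issuing BULK INSERT from C# (the connection string, file path, table name, and CSV format options are placeholders; the file must be readable by the SQL Server service account, because the statement runs server side):
using (SqlConnection conn = new SqlConnection("Integrated Security=SSPI;Initial Catalog=TestDb;Data Source=.\\SQLEXPRESS;"))
using (SqlCommand cmd = new SqlCommand(
    @"BULK INSERT dbo.Users
      FROM 'C:\data\test.csv'
      WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n')", conn))
{
    conn.Open();
    // the server reads and loads the whole file in a single bulk operation
    cmd.ExecuteNonQuery();
}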
You've got SubmitChanges() being called once, which is good. This means that only one connection and transaction are being used.
Consider refactoring your code to use InsertAllOnSubmit() instead:
List<DbItem> newItems = GetItems(id).Select(x => new DbItem { ItemNo = x.No,
                                                              ItemName = x.Name })
                                    .ToList();
dc.Items.InsertAllOnSubmit(newItems);
dc.SubmitChanges();
The INSERT statements are still sent one-by-one as before, but perhaps this is more readable?
Some other things to ask/consider:
* What's the state of the indexes on the target table? Too many will slow down the writes.
* Is the database in the Simple or Full recovery model?
* Capture the SQL statements going across the wire and replay them in an ad-hoc query against your SQL Server database. I realize you're using SQL Express and likely don't have SQL Profiler; use context.Log = Console.Out; to output your LINQ to SQL statements to the console instead (see the sketch after this list). SQL Profiler is preferable for convenience, though.
* Do the captured SQL statements perform the same as your client code? If so, then the perf problem is on the database side.
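A quick sketch of that logging hook, assuming a LINQ to SQL DataContext like the one in the question:
MyDataContext dc = new MyDataContext();
dc.Log = Console.Out; // every SQL statement LINQ to SQL generates is echoed to the console
// ... perform the inserts, then dc.SubmitChanges();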
Here's a nice walk-through of how to add a Bulk-Insert class to your application, which hugely improves the performance of inserting records using LINQ.
(All source code is provided, ready to be added to your own application.)
http://www.mikesknowledgebase.com/pages/LINQ/InsertAndDeletes.htm
You would just need to make three changes to your code, and link in the class provided.
Good luck!
