What is the method for database cleanup in SQLite? - C#

In my experience using SQLite for my small applications, I have always used sqliteadmin and its database cleanup function to remove unnecessary data from my database.
Now I want to create a method in my application that does the same thing as the sqliteadmin cleanup.
How can I do this?
Thanks in regards.

Here is the exact code for how to execute VACUUM:
using (SQLiteCommand command = m_connection.CreateCommand())
{
    command.CommandText = "vacuum;";   // rebuilds the database file and reclaims free pages
    command.ExecuteNonQuery();
}

It seems you're looking for the VACUUM statement.

Interesting postscript: the VACUUM statement in SQLite copies the entire database to a temporary file in order to rebuild it. If you plan on doing this on demand, via the user or some process, it can take a considerable amount of disk space and time to complete once your database gets above 100 MB, especially if you are looking at several GB. In that case, you are better off enabling the auto_vacuum pragma when you create the database and just deleting records instead of running VACUUM. So far, this is the only advantage I can find that SQL Server Compact has over SQLite: an on-demand shrink of a SQL Server Compact database is extremely fast compared to SQLite's VACUUM.
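If you go the auto_vacuum route, here is a minimal sketch in C# (assuming System.Data.SQLite; the file name is a placeholder). The pragma has to be run against a brand-new database, before any tables are created:
using (SQLiteConnection connection = new SQLiteConnection("Data Source=mydb.db3; Version=3;"))
{
    connection.Open();
    using (SQLiteCommand command = connection.CreateCommand())
    {
        // auto_vacuum only takes effect on an empty database (or after a full
        // VACUUM), so set it before creating any tables.
        command.CommandText = "PRAGMA auto_vacuum = FULL;";
        command.ExecuteNonQuery();
    }
}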

Related

Optimized process of moving records in the range of 1 million to 10 million

What would you do if you had to massage data and move it from a database on one server to a database on another server?
Massaging the data was limited to using CONVERT or CAST. This process was called by a data loader in C#.NET. The SQL scripts were executed in SQL Server 2008.
Would you suggest this process be done using SqlBulkCopy or LINQ to SQL, or should it only be done using an INSERT ... SELECT in T-SQL?
The data could be in the range of 1 million to 10 million rows.
I would appreciate your views on an optimized process for performing the above operation.
LINQ-to-SQL should be avoided here; it isn't optimised for this (it is aimed at individual objects/records, not bulk). A cross-db (and possibly linked-server) insert/select is possible, but I would be looking at bulk options. I suspect SSIS (formerly DTS) might be of use here - it is pretty much designed for this. If you need a managed option, a data reader from the source (ExecuteReader()) connected to a SqlBulkCopy to the target will perform the same function as SSIS (using the same bulk protocol).
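A minimal sketch of that managed option (the connection strings and table names are placeholders, not from the question):
using (SqlConnection source = new SqlConnection(sourceConnectionString))
using (SqlConnection target = new SqlConnection(targetConnectionString))
{
    source.Open();
    target.Open();
    using (SqlCommand select = new SqlCommand("SELECT * FROM dbo.SourceTable", source))
    using (SqlDataReader reader = select.ExecuteReader())
    using (SqlBulkCopy bulk = new SqlBulkCopy(target))
    {
        bulk.DestinationTableName = "dbo.TargetTable";
        bulk.BatchSize = 10000;     // commit in chunks rather than one huge batch
        bulk.BulkCopyTimeout = 0;   // no timeout for a long-running load
        bulk.WriteToServer(reader); // streams rows using the bulk protocol
    }
}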
I have had issues like this before; I solved them with SqlBulkCopy. bcp is great for performance.
Looking at the amount of data you're going to operate on, I would personally choose Replication Services (http://msdn.microsoft.com/en-us/library/ms151198.aspx) and avoid a programming solution, if possible.

Migrating Data to SQL Server 2008

I am trying to migrate data from an Informix database to SQL Server 2008. I've got quite a lot of data to move. I've been trying multiple methods to get the data over, and so far SqlBulkCopy in multiple chunks seems to be the fastest that I can find. Does anyone know of a faster means of getting the data over? I'm trying to cut down on the transfer time so that on my cut-over date I don't run out of time to do the full cut-over. Thanks.
As you mentioned, I think the bcp command is the fastest solution.
You can make a CSV file from your data and then import it into your database with the bcp command.
There isn't much more you can do to get this work completed faster. One thing you might want to look at, though, is the recovery model for the SQL database. If it's currently set to Full, you're going to end up slowing down quite a bit as the transaction log fills up.
http://msdn.microsoft.com/en-us/library/ms189275.aspx
Hope that helps.
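If you do switch the recovery model for the duration of the load, a hedged sketch follows (the database name and connection string are placeholders; you would switch back and take a backup once the load completes):
using (SqlConnection connection = new SqlConnection(targetConnectionString))
{
    connection.Open();
    // Switch to BULK_LOGGED so the bulk load does not flood the transaction log.
    using (SqlCommand cmd = new SqlCommand("ALTER DATABASE [TargetDb] SET RECOVERY BULK_LOGGED;", connection))
        cmd.ExecuteNonQuery();

    // ... run the bulk copy / chunked load here ...

    // Switch back to FULL and take a backup once the load is done.
    using (SqlCommand cmd = new SqlCommand("ALTER DATABASE [TargetDb] SET RECOVERY FULL;", connection))
        cmd.ExecuteNonQuery();
}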
If you can use an OLE DB or ODBC connection to your Informix database, then SSIS may be the best option.

Insert takes too long, code optimization needed

I have some code I use to transfer table1's values to another table, table2; they sit in different databases.
It's slow when I have 100,000 records; it takes forever to finish, 10+ minutes.
(Windows Mobile smartphone)
What can I do?
cmd.CommandText = "insert into " + TableName + " select * from sync2." + TableName+"";
cmd.ExecuteNonQuery();
EDIT
The problem is not resolved. I'm still after answers.
1] You can set the following parameters in your connectionString
string connectionString = @"Data Source=E:\myDB.db3; New=true; Version=3; PRAGMA cache_size=20000; PRAGMA page_size=32768; PRAGMA synchronous=off";
This has its own limitations; check this link for details.
The above will increase the cache size (cache_size and page_size), but you could lose some data in the event of a forced shutdown of your system (synchronous=off).
2] You can wrap your insert statements inside a transaction, like this:
dbTransaction = dbConnection.BeginTransaction();   // one transaction for the whole batch
dbCommand.Transaction = dbTransaction;
// your individual insert statements here
dbTransaction.Commit();                            // a single commit instead of one per insert
3] There is one more trick, which is well explained on this page. Check it out.
Hope that helps
Cheers!
As far as I can tell, you are using two SQL statements per row - one to insert it, and then another when you update the entire table. Do you need to update the entire table, or just the rows you are inserting? If not, you could just insert the row with dirty set to 0 in the first place.
You can also try changing your ad-hoc insert statement into a prepared/compiled/parametrized statement. In some database engines that provides a small speed boost.
There are a few options for improvement listed in the SQLite FAQ, here.
Finally, have you figured out what your bottleneck is? I don't know much about the performance of mobile phone applications - is your profiling showing that you are CPU bound or "disk" bound, whatever passes for a disk on that particular phone?
One suggestion, that may help (although you'd need to profile this), would be to call your insert or replace command a single time for multiple records, with multiple values. This would batch the calls to the DB, which potentially would speed things up.
Also, it sounds like you're copying from one DB to another - if the records are not going to exist in the new DB, you can use INSERT instead of INSERT OR REPLACE, which is (at least theoretically) faster.
Neither of these is going to be dramatic, however - this is likely going to still be a slow operation.
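As a rough illustration of batching several rows into one statement (the table and values are made up; the UNION ALL form is used because it works even on older SQLite builds that lack multi-row VALUES):
cmd.CommandText =
    "INSERT INTO TargetTable (Id, Name) " +
    "SELECT 1, 'first' " +
    "UNION ALL SELECT 2, 'second' " +
    "UNION ALL SELECT 3, 'third';";
cmd.ExecuteNonQuery();  // one round trip instead of three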
Instead of executing the statements individually, you could build up the string and then execute it all at once as a batch; alternatively, look into batch updates.
Alternatively, you could do this all as one statement, which would probably be more efficient; something like this is what I mean:
http://forums.mysql.com/read.php?61,15029,15029
I hope this helps.
Well, you're using the query processor for each insert. It's going to be much better to create a command, add parameters to it, prepare it, and then just set the parameter values in your loop.
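A rough sketch of that approach with System.Data.SQLite, combined with the transaction suggestion above (the table, columns, and the rows collection are placeholders, not from the question):
using (SQLiteTransaction transaction = connection.BeginTransaction())
using (SQLiteCommand cmd = connection.CreateCommand())
{
    cmd.Transaction = transaction;
    cmd.CommandText = "INSERT INTO TargetTable (Id, Name) VALUES (@id, @name)";

    SQLiteParameter idParam = new SQLiteParameter("@id");
    SQLiteParameter nameParam = new SQLiteParameter("@name");
    cmd.Parameters.Add(idParam);
    cmd.Parameters.Add(nameParam);
    cmd.Prepare();                    // compile the statement once

    foreach (var row in rows)         // 'rows' stands in for your source data
    {
        idParam.Value = row.Id;
        nameParam.Value = row.Name;
        cmd.ExecuteNonQuery();        // re-uses the prepared statement
    }

    transaction.Commit();             // a single commit for the whole batch
}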
I don't know if SQLite supports TableDirect commands. If it does that would be way, way, way faster (it's a few orders of magnitude faster in SQL Compact for example). It would be certainly worth a try.
I think you could use a DataTable (I think - or was it a DataSet? Sorry, still a beginner in .NET), then .Fill() the reader's results into that DataTable, and then there is a BulkCopy or BulkInsert operation that can push it all out into a table in another database (connection).
Does the SQL statement also take 10+ minutes if you run it directly in SQL Management Studio?
If not, did you try making it a stored procedure and executing that?
Did you try setting up connection pooling and/or making the cursor server-sided?
Run the SQL Profiler (you can launch it from Management Studio) if step 1 is slow, and add a transaction queue there.

C# + Sql Server - Execute a stored procedure large number of times. Best way?

I have one stored procedure which inserts data into 3 tables (does UPSERTs) and has some rudimentary logic (IF-THEN-ELSE).
I need to execute this sproc millions of times (from a C# app) using different parameters, and I need it to be FAST.
What is the best way to do so?
Does anybody know an open-source (or not) off-the-shelf document indexer besides Lucene or SQL Server FTS?
I am trying to build a document word index. For each word in the document I insert into the DB the word, the docID, and the word position.
This happens 100,000 times for 100 documents, for example.
The sproc: there are 3 tables to insert into; for each one I do an UPSERT.
The C# app:
using (SqlConnection con = new SqlConnection(_connectionString))
{
    con.Open();
    SqlTransaction trans = con.BeginTransaction();
    SqlCommand command = new SqlCommand("add_word", con, trans);
    command.CommandType = System.Data.CommandType.StoredProcedure;
    string[] TextArray;
    for (int i = 0; i < Document.NumberOfFields; i++)
    {
        ...
        Addword(..., command); // <---- this updates the parameters with new values and calls ExecuteNonQuery.
    }
}
I forgot to mention: this code produces deadlocks in SQL Server. I have no idea why this happens.
Drop all the indexes on the table(s) you are loading, then add them back in once the load is complete. This will prevent a lot of thrashing / reindexing for each change.
Make sure the database has allocated enough physical file space prior to the load, so it doesn't have to spend time constantly grabbing it from the file system as you load. Usually databases are set to grow by something like 10% when full, at which point SQL Server blocks queries until more space is allocated. When loading the amount of data you are talking about, SQL Server will have to do a lot of blocking.
Look into bulk load / bulk copy if possible.
Do all of your IF THEN ELSE logic in code. Just send the actual values you want stored to the s'proc when it's ready. You might even run two threads. One to evaluate the data and queue it up, the other to write the queue to the DB server.
Look into Off The Shelf programs that do exactly what you are talking about with indexing the documents. Most likely they've solved these problems.
Get rid of the Transaction requirements if possible. Try to keep the s'proc calls as simple as possible.
See if you can limit the words you are storing. For example, if you don't care about the words "it", "as", "I", etc then filter them out BEFORE calling the s'proc.
If you want to quickly bulk INSERT data from C#, check out the SqlBulkCopy class (.NET 2.0 onwards).
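A minimal sketch of SqlBulkCopy pushing the word index into a staging table (the table name, its columns, and the wordEntries collection are assumptions, not from the question):
DataTable table = new DataTable();
table.Columns.Add("Word", typeof(string));
table.Columns.Add("DocId", typeof(int));
table.Columns.Add("Position", typeof(int));

foreach (var entry in wordEntries)            // your in-memory word index
    table.Rows.Add(entry.Word, entry.DocId, entry.Position);

using (SqlBulkCopy bulk = new SqlBulkCopy(_connectionString))
{
    bulk.DestinationTableName = "dbo.WordIndexStaging";
    bulk.BatchSize = 10000;
    bulk.WriteToServer(table);                // one bulk load instead of millions of sproc calls
}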
This might seem like a rudimentary approach, but it should work and it should be fast. You can just generate a huge text file with a list of SQL statements and then run it from a command line. If I'm not mistaken, it should be possible to batch commands using the GO statement. Alternatively, you can do it directly from your application, concatenating several SQL commands as strings and executing them in batches. It seems that what you are trying to do is a one-time task and that the data does not come directly from user input, so you should be able to handle escaping yourself.
I'm sure there are more sophisticated ways to do that (SqlBulkCopy looks like a good start), so please consider this just a suggestion. I would spend some time investigating whether there are more elegant ways first.
Also, I would make sure that the logic in the stored procedure is as simple as possible and that the table does not have any indexes. They should be added later.
This is probably too generic as a requirement - in order for the procedure to be fast itself we need to see it and have some knowledge of your db-schema.
On the other hand, if you want to know the best way to execute the same (optimized or not) procedure as fast as possible, usually the best way to go is to do some sort of caching on the client and call the procedure as few times as possible, batching your operations.
If this is in a loop, what people usually do - instead of calling the procedure each iteration - is build/populate some caching data structure that calls down to the stored procedure when the loop exits (or after any given number of iterations, if you need this to happen more often), batching the operations that you cached (i.e. you can pass an XML string down to your sp, which will then parse it, put the contents in temp tables and go from there - you can save a whole lot of overhead this way).
Another common solution for this is to use SQL Server bulk operations.
To go back to the stored procedure: keep in mind that optimizing your T-SQL and db schema (with indexes, etc.) can have a huge impact on your performance.
Try using XML to do that.
You will just need to execute it one time.
Example:
DECLARE @XMLDoc XML
SET @XMLDoc = '<words><word>test</word><word>test2</word></words>'
CREATE PROCEDURE add_words
(
    @XMLDoc XML
)
AS
DECLARE @handle INT
EXEC sp_xml_preparedocument @handle OUTPUT, @XMLDoc
INSERT INTO TestTable
SELECT * FROM OPENXML (@handle, '/words/word', 2) WITH
(
    word varchar(100) '.'
)
EXEC sp_xml_removedocument @handle
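On the C# side, you would build one XML document for a whole batch of words and call the procedure a single time. A hedged sketch (the words collection is a placeholder; add_words is the procedure above):
StringBuilder xml = new StringBuilder("<words>");
foreach (string word in words)
    xml.Append("<word>").Append(SecurityElement.Escape(word)).Append("</word>");
xml.Append("</words>");

using (SqlConnection con = new SqlConnection(_connectionString))
using (SqlCommand cmd = new SqlCommand("add_words", con))
{
    cmd.CommandType = CommandType.StoredProcedure;
    cmd.Parameters.Add("@XMLDoc", SqlDbType.Xml).Value = xml.ToString();
    con.Open();
    cmd.ExecuteNonQuery();   // one call per batch instead of one per word
}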
If you're trying to optimize for speed, consider simply upgrading your SQL Server hardware. Putting some RAM and a blazing fast RAID in your server may be the most cost effective long-term solution to speed up your query speed. Hardware is relatively cheap compared to developer time.
Heed the words of Jeff Atwood:
Coding Horror: Hardware is Cheap, Programmers are Expensive
The communication with the database will likely be a bottleneck in this case, especially if the db is on another machine. I suggest sending the entire document to the database and writing a sproc that splits it into words, or using SQL Server-hosted managed code (SQLCLR).
Assuming this is an app where there would not be contention between multiple users, try this approach instead:
Insert your parameters into a table set up for that purpose
Change your SP to loop through that table and perform its work on each row
Call the SP once
Have the SP truncate the table of inputs when it is complete
This will eliminate the overhead of calling the SP millions of times, and the inserts of the parameters into the table can be concatenated ("INSERT INTO foo(v) VALUES('bar'); INSERT INTO foo(v) VALUES('bar2'); INSERT INTO foo(v) VALUES('bar3');").
Disadvantage: the SP is going to take a long time to execute, and there won't be any feedback of progress, which isn't terribly user-friendly.
To move a lot of data over to the server, use either SqlBulkCopy or a table-valued parameter if you are on 2008. If you need speed, do not execute a stored procedure once per row; develop a set-based one that processes all (or a large batch of) rows.
--Edited since the question was edited.
The biggest issue is to make sure the stored proc is correctly tuned. Your C# code is about as fast as you are going to get it.
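For the table-valued parameter route mentioned above (SQL Server 2008+), a hedged sketch; the table type dbo.WordEntryType, the set-based procedure add_words_batch, and the column names are all hypothetical:
// Server side (created once, not from C#):
//   CREATE TYPE dbo.WordEntryType AS TABLE (Word nvarchar(100), DocId int, Position int);
//   CREATE PROCEDURE add_words_batch @Entries dbo.WordEntryType READONLY AS ... set-based upsert ...

DataTable entries = new DataTable();
entries.Columns.Add("Word", typeof(string));
entries.Columns.Add("DocId", typeof(int));
entries.Columns.Add("Position", typeof(int));
// ... fill 'entries' with the batch of rows ...

using (SqlConnection con = new SqlConnection(_connectionString))
using (SqlCommand cmd = new SqlCommand("add_words_batch", con))
{
    cmd.CommandType = CommandType.StoredProcedure;
    SqlParameter p = cmd.Parameters.AddWithValue("@Entries", entries);
    p.SqlDbType = SqlDbType.Structured;
    p.TypeName = "dbo.WordEntryType";
    con.Open();
    cmd.ExecuteNonQuery();   // the procedure processes the whole batch in one set-based pass
}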

Performing Insert OR Update (upsert) on SQL Server Compact Edition

I have a C# project that is using SQL Server Compact Edition and Entity Framework for data access. I need to insert or update a large number of rows, 5,000 or more, in the db: if the key exists, update the record; if not, insert it. I cannot find a way to do this with Compact Edition and EF without horrible performance, i.e. it takes 2 minutes plus on a Core i7 computer. I have tried searching for the record to see if it exists and then inserting if not or updating if it does; the search is the killer in that approach. I have tried compiling the search query, and that only gave a small improvement. Another thing I've tried is inserting the record in a try/catch and updating if it fails, but that forces me to call SaveChanges on every record to get the exception, as opposed to at the end, which is a performance killer. Obviously I can't use stored procedures, since it is Compact Edition. Also, I've looked at just executing T-SQL directly against the db somehow, but the lack of procedural statements in Compact seems to rule that out.
I've searched the world over and am out of ideas. I really want to use Compact, if I can, over Express for the deployment benefits and the ability to prevent the user from digging around in the db. Any suggestions would be appreciated.
Thanks
When we're using SQL CE (and SQL 2005 Express, for that matter), we always call an update first and then call an insert if the update gives a row count of 0. This is very simple to implement and does not require expensive try..catch blocks for control flow.
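A minimal sketch of that pattern with SqlCeCommand (the Items table, its columns, and the already-open connection are placeholders):
using (SqlCeCommand update = new SqlCeCommand(
    "UPDATE Items SET Value = @value WHERE Id = @id", connection))
{
    update.Parameters.AddWithValue("@id", id);
    update.Parameters.AddWithValue("@value", value);

    if (update.ExecuteNonQuery() == 0)   // nothing matched, so insert instead
    {
        using (SqlCeCommand insert = new SqlCeCommand(
            "INSERT INTO Items (Id, Value) VALUES (@id, @value)", connection))
        {
            insert.Parameters.AddWithValue("@id", id);
            insert.Parameters.AddWithValue("@value", value);
            insert.ExecuteNonQuery();
        }
    }
}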
Maybe you could obtain the result you seek by using simple queries.
Let's say the table you want to insert into or update is like this:
TABLE original
id integer,
value char(100)
First you could create a temporary table with the new values (you can use SELECT INTO or other ways to create it):
TABLE temp
id integer,
value char(100)
Now you need to do two things: update the existing rows in original, and then insert the new values:
UPDATE original
SET original.value = temp.value
FROM original, temp
WHERE original.id = temp.id
INSERT INTO original
SELECT * from temp
WHERE temp.id not IN (select o.id from original o)
Given your problem statement, I'm going to guess that this software assumes a relatively beefy environment. Have you considered taking the task of determining what already exists away from SQL CE and doing it on your own? Essentially, grab a sorted list of all the IDs (keys?) from the relevant table and check every object's key against that list before queueing it for insertion.
This makes a few assumptions that would be bad news with a typical DB, but that you can probably get away with in sqlce. E.g., it assumes that rows won't be inserted or significantly modified by a different user while you're performing this insert.
If the list of keys is too long to reasonably hold in memory for such a check, I'm afraid I'd say that sqlce just might not be the right tool for the job. :(
I'm not sure if this is feasible or not, as I haven't used the Entity Framework, but have you tried running the update first and checking the rowcount -- inserting if no rows were updated? This may be faster than catching exceptions. It's generally a bad practise to use exceptions for control flow, and often slows things down dramatically.
If you can write the SQL directly, then the fastest way to do it would be to get all the data into a temporary table and then update what exists and insert the rest (as in Andrea Bertani's example above). You should get slightly better results by using a left join to the original table in the select of your insert, excluding any rows where the original table's key is not null:
INSERT INTO original
SELECT temp.* FROM temp
LEFT JOIN original ON original.id = temp.id
WHERE original.id IS NULL
I would recommend using SqlCeResultSet directly. You lose the nice type-safeness of EF, but performance is incredibly fast. We switched from ADO.NET 2.0-style TypeDataSets to SqlCeResultSet and SqlCeDataReader and saw 20 to 50 times increases in speed.
See SqlCeResultSet. For a .NETCF project I removed almost all sql code in favor of this class.
Just search for "SqlCeResultSet" here and msdn.
A quick overview:
Open the resultSet.
If you need to seek (for the existence check), you will have to provide an index for the result set.
Seek on the result set and read to check whether you found the row. This is extremely fast even on tables with tens of thousands of rows (because the seek uses the index).
Insert or update the record (see SqlCeResultSet.NewRecord).
We have successfully developed a project with an SQL CE database whose main product table has over 65,000 rows (read/write, with 4 indexes).
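A rough sketch of the seek-then-upsert steps above (the Products table, its primary-key index name, the column ordinals, and the open connection are all placeholders for your own schema):
using (SqlCeCommand cmd = new SqlCeCommand("Products", connection))
{
    cmd.CommandType = CommandType.TableDirect;
    cmd.IndexName = "PK_Products";   // Seek requires an index on the result set

    using (SqlCeResultSet rs = cmd.ExecuteResultSet(
        ResultSetOptions.Scrollable | ResultSetOptions.Updatable))
    {
        if (rs.Seek(DbSeekOptions.FirstEqual, productId) && rs.Read())
        {
            // The row exists: update it in place.
            rs.SetDecimal(1, newPrice);
            rs.Update();
        }
        else
        {
            // The row does not exist: insert a new record.
            SqlCeUpdatableRecord record = rs.CreateRecord();
            record.SetInt32(0, productId);
            record.SetDecimal(1, newPrice);
            rs.Insert(record);
        }
    }
}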
SQL Server compact edition is pretty early in development at this point. Also, depending on your device, memory-disk access can be pretty slow, and SQLCE plus .NET type-safety overhead is pretty intensive. It works best with a pretty static data store.
I suggest you either use a lighter-weight API or consider SQLite.