Bad performance of mass insertion into Redis DB with Sider .NET client - C#

I need to insert about one million key-value pairs into a Redis DB. The Redis server instance runs on the same computer as my C# application, and I use the Sider client to connect to it. All settings are default. The following code takes 4 seconds to execute:
redis_client.Pipeline(c =>
{
    for (int i = 0; i < 1000; ++i)
    {
        Console.Write("\r" + i);
        string key = "aaaaaaaaaaa" + i;
        string value = "bbbbbbbbbb";
        c.Set(key, value);
    }
});
I tried both the plain and the pipelined method of insertion; the standard Redis benchmark shows similar results. Neither CPU nor disk is a bottleneck, and both have plenty of headroom for mass insertions into other databases. The official Redis benchmark page claims roughly 100,000 SET operations per second are possible, yet I get fewer than 1,000. What's the problem?
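For reference, a minimal timing sketch that uses only the Sider calls already shown above but keeps console output out of the measured loop (the redis_client instance is the same one as in the snippet; whether console I/O explains the slowdown is an open question):
var sw = System.Diagnostics.Stopwatch.StartNew();

redis_client.Pipeline(c =>
{
    for (int i = 0; i < 1000; ++i)
    {
        // Same keys and values as above, but no Console.Write inside the timed loop.
        c.Set("aaaaaaaaaaa" + i, "bbbbbbbbbb");
    }
});

sw.Stop();
Console.WriteLine("1000 pipelined SETs took {0} ms", sw.ElapsedMilliseconds);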

Related

Get-Counter provides a counter value, but PerformanceCounter.NextValue() is always 0

Good afternoon.
In Windows, I use the following performance counter:
<Counter>\APP_POOL_WAS(DefaultAppPool)\Current Application Pool Uptime</Counter>
I then created a data collector set and added the specified performance counter to it (along with a number of other counters).
In PowerShell, I checked that the performance counter collects data using the following command (run as administrator):
Get-Counter -Counter "\\MachineName\APP_POOL_WAS(DefaultAppPool)\Current Application Pool Uptime" -SampleInterval 2 -MaxSamples 3
The command returned correct results.
I then use the following code (.NET Framework 4.8) to retrieve the performance counter data:
var performanceCounter = new PerformanceCounter(categoryName: "APP_POOL_WAS",
                                                instanceName: "DefaultAppPool",
                                                counterName: "Current Application Pool Uptime",
                                                machineName: "khubetsov-pc");

Console.WriteLine($"{nameof(performanceCounter.CounterName)}:{performanceCounter.CounterName}");
Console.WriteLine($"{nameof(performanceCounter.CategoryName)}:{performanceCounter.CategoryName}");
Console.WriteLine($"{nameof(performanceCounter.InstanceName)}:{performanceCounter.InstanceName}");
Console.WriteLine($"{nameof(performanceCounter.MachineName)}:{performanceCounter.MachineName}");
Console.WriteLine($"{nameof(performanceCounter.CounterType)}:{performanceCounter.CounterType}");

for (var i = 1; i <= 10; i++)
{
    Console.WriteLine($"{nameof(performanceCounter.NextValue)}{i}:{performanceCounter.NextValue()}");
    Thread.Sleep(1500);
}
The program output shows that, for some reason, I always get 0 as the performance counter value.
Other counters (for example, \Processor(_Total)\% Idle Time or \APP_POOL_WAS(DefaultAppPool)\Current Application Pool State) return correct data.
Questions:
How do I fix this, i.e. how can I use the PerformanceCounter class to get the correct value for this performance counter?
Or do I need to change some setting of the counter itself?
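For reference, one workaround sometimes suggested for elapsed-time style counters (an assumption here; it is not verified that Current Application Pool Uptime behaves this way) is to read the raw sample and compute the uptime manually instead of relying on NextValue():
// Hedged sketch: derive the uptime from the raw counter sample.
// Assumption: the counter behaves like a PERF_ELAPSED_TIME counter, whose value is
// (current timestamp - raw start value) / frequency.
CounterSample sample = performanceCounter.NextSample();
double uptimeSeconds =
    (double)(sample.CounterTimeStamp - sample.RawValue) / sample.CounterFrequency;
Console.WriteLine($"Uptime: {uptimeSeconds:F0} s");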

SQL Server performance for high concurrent access

I have a Windows service developed in C# that uses SQL Server.
I create a SqlConnection and then call ExecuteReader, but I run into a problem under high concurrent access.
On my local machine, with the code below, the average is 1.2 s per execution. The MyObject table has 40 columns and 400,000 rows.
for (int i = 0; i < 500; i++)
{
    Task.Factory.StartNew(() =>
    {
        Stopwatch sw = new Stopwatch();
        sw.Start();
        var serialNumber = "55292572";
        // Only opens the connection, executes the SqlCommand and closes the connection here
        var orderIdArr = ORM.GetObjects<MyObject>(t => t.PrimaryId == serialNumber).ToList();
        //sw.Stop();
        Console.WriteLine(sw.ElapsedMilliseconds);
    });
}
I deployed this service to two Windows servers. As you can guess, I have about 500 customers and they access this service very often.
In the production environment, one execution takes about 10 s.
I don't understand why the performance is so poor there when it only takes 1.2 s on my local machine.
Could anybody explain this or suggest improvements?
Many thanks!
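As a point of comparison, here is a minimal sketch (not the original ORM) of the same measurement using plain ADO.NET with a parameterized query and bounded concurrency; the table and column names, the connection string, and the cap of 50 parallel queries are assumptions for illustration only:
using System;
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

class ConcurrencyProbe
{
    // Cap the number of queries running at once instead of firing all 500 together.
    static readonly SemaphoreSlim Gate = new SemaphoreSlim(50);

    static async Task QueryOnceAsync(string connectionString, string serialNumber)
    {
        await Gate.WaitAsync();
        try
        {
            var sw = Stopwatch.StartNew();
            using (var con = new SqlConnection(connectionString))
            using (var cmd = new SqlCommand("SELECT * FROM MyObject WHERE PrimaryId = @id", con))
            {
                cmd.Parameters.AddWithValue("@id", serialNumber);
                await con.OpenAsync();
                using (var reader = await cmd.ExecuteReaderAsync())
                {
                    while (await reader.ReadAsync()) { /* materialize rows */ }
                }
            }
            Console.WriteLine(sw.ElapsedMilliseconds);
        }
        finally
        {
            Gate.Release();
        }
    }

    static async Task Main()
    {
        var tasks = new List<Task>();
        for (int i = 0; i < 500; i++)
            tasks.Add(QueryOnceAsync("Server=.;Database=MyDb;Integrated Security=true", "55292572"));
        await Task.WhenAll(tasks);
    }
}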

SQLite poor performance issue in multi-user local-network environment

We use SQLite as a shared DB in our application. (I know this is not the best solution, but a server/client architecture was not possible.)
There are only a few users, a very small DB, and just a few writes.
The application is written in C# and we use System.Data.SQLite.dll, but the problem also occurs, for example, with SQLiteDatabaseBrowser.
As long as only one user connects to the DB and queries some results, it is very fast, just a few milliseconds. One user can establish multiple connections and execute SELECT statements in parallel; this also has no impact on the performance.
But as soon as another user connects to the DB from a different machine, the performance becomes very poor for every connected user, and it stays poor until all connections/apps are closed.
After that, the first user to connect gets the good performance back, until the next user connects.
I tried many things:
PRAGMA synchronous = OFF
updated to the latest SQLite version (and created a new DB file with that version)
made the DB file read-only
made the network share read-only for everyone
connection strings with different options (nearly all of them)
different SQLite programs (our application and SQLiteDatabaseBrowser)
different file systems hosting the DB file (NTFS and FAT32)
After that, I wrote a little app that opens a connection, queries some results, and displays the elapsed time, all in an endless loop.
Here is the code of this simple app:
static void Main(string[] args)
{
    SQLiteConnectionStringBuilder conBuilder = new SQLiteConnectionStringBuilder();
    conBuilder.DataSource = args[0];
    conBuilder.Pooling = false;
    conBuilder.ReadOnly = true;
    string connectionString = conBuilder.ConnectionString;

    while (true)
    {
        RunQueryInNewConnection(connectionString);
        System.Threading.Thread.Sleep(500);
    }
}

static void RunQuery(SQLiteConnection con)
{
    using (SQLiteCommand cmd = con.CreateCommand())
    {
        cmd.CommandText = "select * from TabKatalog where ReferenzName like '%0%'";
        Console.WriteLine("Execute Query: " + cmd.CommandText);

        Stopwatch watch = new Stopwatch();
        watch.Start();

        int lines = 0;
        SQLiteDataReader reader = cmd.ExecuteReader();
        while (reader.Read())
            lines++;

        watch.Stop();
        Console.WriteLine("Query result: " + lines + " in " + watch.ElapsedMilliseconds + " ms");
    }
}

static void RunQueryInNewConnection(string pConnectionString)
{
    using (SQLiteConnection con = new SQLiteConnection(pConnectionString, true))
    {
        con.Open();
        RunQuery(con);
    }

    System.Data.SQLite.SQLiteConnection.ClearAllPools();
    GC.Collect();
    GC.WaitForPendingFinalizers();
}
While testing with this little app, I realised that it is enough for another system to hold a file handle on the SQLite DB file to degrade the performance, so it seems this has nothing to do with the connection to the DB itself. The performance stays low until ALL file handles are released; I tracked this with procexp.exe. In addition, only the remote systems suffer from the performance issue. On the machine hosting the DB file itself, the queries run fast every time.
Has anybody encountered the same issue or has some hints?
Windows does not cache files that are concurrently accessed on another computer.
If you need high concurrency, consider using a client/server database.

Azure Table Storage QueryAll(), improve throughput

I have some data (approximately 5 million items in 1500 tables, 10 GB) in Azure tables. The entities can be large and contain some serialized binary data in the protobuf format.
I have to process all of them and transform them into another structure. This processing is not thread-safe. I also process some data from a MongoDB replica set using the same code (the MongoDB is hosted in another datacenter).
For debugging purposes I log the throughput and realized that it is very low. With MongoDB I get a throughput of 5000 items/sec, with Azure table storage only 30 items per second.
To improve the performance I tried to use TPL Dataflow, but it doesn't help:
public async Task QueryAllAsync(Action<StoredConnectionSetModel> handler)
{
    List<CloudTable> tables = await QueryAllTablesAsync(companies, minDate);

    ActionBlock<StoredConnectionSetModel> handlerBlock = new ActionBlock<StoredConnectionSetModel>(
        handler,
        new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 1 });

    ActionBlock<CloudTable> downloaderBlock = new ActionBlock<CloudTable>(
        x => QueryTableAsync(x, s => handlerBlock.Post(s)),
        new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 20 });

    foreach (CloudTable table in tables)
    {
        downloaderBlock.Post(table);
    }

    // Without completing the blocks and awaiting them, this method returns
    // before all tables have been processed.
    downloaderBlock.Complete();
    await downloaderBlock.Completion;
    handlerBlock.Complete();
    await handlerBlock.Completion;
}

private static async Task QueryTableAsync(CloudTable table, Action<StoredConnectionSetModel> handler)
{
    TableQuery<AzureTableEntity<StoredConnectionSetModel>> query =
        new TableQuery<AzureTableEntity<StoredConnectionSetModel>>();
    TableContinuationToken token = null;

    do
    {
        TableQuerySegment<AzureTableEntity<StoredConnectionSetModel>> segment =
            await table.ExecuteQuerySegmentedAsync<AzureTableEntity<StoredConnectionSetModel>>(query, token);

        foreach (var entity in segment.Results)
        {
            handler(entity.Entity);
        }

        token = segment.ContinuationToken;
    }
    while (token != null);
}
I run the batch process both on my local machine (with a 100 Mbit connection) and in Azure (as a worker role), and strangely the throughput on my machine is higher (100 items/sec) than in Azure. Locally I max out my internet connection, but the worker role should not have this 100 Mbit limitation, I hope.
How can I increase the throughput? I have no idea what is going wrong here.
EDIT: I realized that I was wrong about the 30 items per second. It is often higher (100/sec), depending on the size of the items, I guess. According to the documentation (http://azure.microsoft.com/en-us/documentation/articles/storage-performance-checklist/#subheading10) there is a limit:
The scalability target for tables is up to 20,000 entities (1 KB each) per second per account. That is only about 19 MB/sec, which is not so impressive if you keep in mind that there are also normal requests from the production system. I will probably test using multiple accounts.
EDIT #2: I made two separate tests, starting with a list of 500 keys [1...500] (pseudo code):
Test #1, old approach (TABLE 1):
foreach (key1 in keys)
    foreach (key2 in keys)
        insert new Entity { partitionKey = key1, rowKey = key2 }
Test #2, new approach (TABLE 2):
numPartitions = 100
foreach (key1 in keys)
    foreach (key2 in keys)
        insert new Entity { partitionKey = (key1 + key2).GetHashCode() % numPartitions, rowKey = key1 + key2 }
Each entity gets an additional property with 10 KB of random text data.
Then I ran the query tests. In the first case I just query all entities from Table 1 in one thread (sequentially).
In the next test I create one task per partition key and query all entities from Table 2 in parallel. I know the test is not that good, because in my production environment I have a lot more partitions than just 500 per table, but that doesn't matter; at least the second attempt should perform well.
It makes no difference. My maximum throughput is 600 entities/sec, varying between 200 and 400 most of the time. The documentation says that I can query 20,000 entities/sec (at 1 KB each), so I should get at least 1,500 or so on average, I think. I tested it on a machine with a 500 Mbit internet connection and only reached about 30 Mbit, so bandwidth should not be the problem.
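For reference, a minimal sketch of the "one task per partition key" query from Test #2, assuming the same AzureTableEntity<StoredConnectionSetModel> wrapper as above, a CloudTable reference named table2 pointing at TABLE 2 (an assumption), and hashed partition keys "0" through "99":
// Sketch only: query each hashed partition of TABLE 2 with its own task and count
// the entities. Requires System.Linq, System.Threading.Tasks and the storage table SDK.
var partitionTasks = Enumerable.Range(0, 100).Select(async partition =>
{
    var query = new TableQuery<AzureTableEntity<StoredConnectionSetModel>>().Where(
        TableQuery.GenerateFilterCondition("PartitionKey", QueryComparisons.Equal, partition.ToString()));

    TableContinuationToken token = null;
    int count = 0;
    do
    {
        var segment = await table2.ExecuteQuerySegmentedAsync(query, token);
        count += segment.Results.Count;
        token = segment.ContinuationToken;
    }
    while (token != null);
    return count;
}).ToList();

int totalEntities = (await Task.WhenAll(partitionTasks)).Sum();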
You should also check out the Table Storage Design Guide. Hope this helps.

Azure queue performance

For Windows Azure queues, the scalability target per storage account is supposed to be around 500 messages/second (http://msdn.microsoft.com/en-us/library/windowsazure/hh697709.aspx). I have the following simple program that just writes a few messages to a queue. The program takes 10 seconds to complete (4 messages/second). I am running the program from inside a virtual machine (in West Europe) and my storage account is also located in West Europe. I have not set up geo-replication for my storage. My connection string is set up to use the HTTP protocol.
// http://blogs.msdn.com/b/windowsazurestorage/archive/2010/06/25/nagle-s-algorithm-is-not-friendly-towards-small-requests.aspx
ServicePointManager.UseNagleAlgorithm = false;

CloudStorageAccount storageAccount = CloudStorageAccount.Parse(ConfigurationManager.AppSettings["DataConnectionString"]);
var cloudQueueClient = storageAccount.CreateCloudQueueClient();
var queue = cloudQueueClient.GetQueueReference(Guid.NewGuid().ToString());
queue.CreateIfNotExist();

var w = new Stopwatch();
w.Start();
for (int i = 0; i < 50; i++)
{
    Console.WriteLine("nr {0}", i);
    queue.AddMessage(new CloudQueueMessage("hello " + i));
}
w.Stop();

Console.WriteLine("elapsed: {0}", w.ElapsedMilliseconds);
queue.Delete();
Any idea how I can get better performance?
EDIT:
Based on Sandrino Di Mattia's answer, I re-analyzed the code I originally posted and found that it was not complete enough to reproduce the problem. In fact, I had created the queue just before the call to ServicePointManager.UseNagleAlgorithm = false; the code that reproduces my problem looks more like this:
CloudStorageAccount storageAccount = CloudStorageAccount.Parse(ConfigurationManager.AppSettings["DataConnectionString"]);
var cloudQueueClient = storageAccount.CreateCloudQueueClient();
var queue = cloudQueueClient.GetQueueReference(Guid.NewGuid().ToString());

//ServicePointManager.UseNagleAlgorithm = false; // If you change the Nagle algorithm here, the performance will be okay.
queue.CreateIfNotExist();
ServicePointManager.UseNagleAlgorithm = false; // TOO LATE, the queue was already created without 'nagle' disabled

var w = new Stopwatch();
w.Start();
for (int i = 0; i < 50; i++)
{
    Console.WriteLine("nr {0}", i);
    queue.AddMessage(new CloudQueueMessage("hello " + i));
}
w.Stop();

Console.WriteLine("elapsed: {0}", w.ElapsedMilliseconds);
queue.Delete();
The suggested solution from Sandrino to configure the ServicePointManager via the app.config file has the advantage that the ServicePointManager is initialized when the application starts up, so you don't have to worry about timing dependencies.
I answered a similar question a few days ago: How to achieve more than 10 inserts per second with Azure storage tables.
For adding 1000 items to table storage it took over 3 minutes, and with the changes I described in my answer it dropped to 4 seconds (250 requests/sec). In the end, table storage and storage queues aren't all that different: the backend is the same, the data is simply stored in a different way, and both are exposed through a REST API. So if you improve the way you handle your requests, you'll get better performance.
The most important changes (see the code sketch after this list):
expect100Continue: false
useNagleAlgorithm: false (you're already doing this)
parallel requests combined with connectionManagement/maxconnection
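A minimal sketch of applying those settings in code at application startup, before the storage client issues its first request (the connection limit of 100 is just an illustrative value):
// Must run before the first request creates the service point for the storage endpoint.
ServicePointManager.Expect100Continue = false;     // expect100Continue: false
ServicePointManager.UseNagleAlgorithm = false;     // useNagleAlgorithm: false
ServicePointManager.DefaultConnectionLimit = 100;  // connectionManagement/maxconnection equivalent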
Also, ServicePointManager.DefaultConnectionLimit should be increased before the service point is created. Sandrino's answer actually says the same thing, but using config.
Turn off proxy detection, even in the cloud (the auto-detect option in the proxy config element); it slows down initialisation.
Choose well-distributed partition keys.
Co-locate your storage account close to your compute and your customers.
Design so that you can add more storage accounts as needed.
Microsoft set the SLA at 2,000 tps on queues and tables as of July 2012.
I didn't read Sandrino's linked answer, sorry; I just landed on this question while watching the Build 2012 session on exactly this topic.
