I have a C# ETL process that runs once a week and takes about 6 hours to complete on an on-premises Windows server.
Here is the C# structure of the process.
Source database: Firebird database files, downloaded to disk every time; the download takes about 20 minutes.
Destination: SQL Server (on-premises)
The load process steps are:
Do something
Do something
Execute parallel
We have 5 independent Firebird database files that feed 5 different tables, so the max degree of parallelism is set to 5.
Parallel.ForEach(destTables, new ParallelOptions { MaxDegreeOfParallelism = 5 }, (eachtable) =>
{
    var tableName = eachtable.ToString(CultureInfo.InvariantCulture);
    lock (tableName)
    {
        Thread.Sleep(10000);
        readTable.BulkLoad(tableName, srcConForMainFile, destConForSQL);
        Thread.Sleep(10000);
    }
});
Do something
Complete the process
Now I have moved this process to an Azure worker role.
Source database: Firebird database files downloaded to the worker role's disk (local storage = 100 GB, set in the .csdef file), taking 20-30 minutes to download, which is fine.
Destination db: a dedicated Azure SQL database, S3 Standard tier (100 DTUs), created in the same region as the worker role.
I have set up a Large worker role (4 cores, 7 GB RAM, high network bandwidth, 999 GB disk), but the process took 20 hours to complete.
I also noticed that CPU utilization peaked at about 25% and RAM usage at about 2.5-3 GB. That's it.
Is Parallel.ForEach really executing in parallel in the worker role VM?
How can I verify that parallel execution is happening in the worker role VM?
Should we still move the database to a higher pricing tier?
Are there any other settings I should change on the worker role VM so that the process runs much faster - 6 hours vs. 20 hours?
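Regarding the verification question, here is a minimal, hedged sketch that assumes the same variables as the snippet above (destTables, readTable, srcConForMainFile, destConForSQL) and logs a timestamp plus the managed thread id around each bulk load; overlapping START/END windows and distinct thread ids show whether the iterations really run concurrently:

// Minimal verification sketch: the variable names come from the question;
// the trace output target is an assumption.
var log = new System.Collections.Concurrent.ConcurrentQueue<string>();
var sw = System.Diagnostics.Stopwatch.StartNew();

Parallel.ForEach(destTables, new ParallelOptions { MaxDegreeOfParallelism = 5 }, (eachtable) =>
{
    var tableName = eachtable.ToString(CultureInfo.InvariantCulture);
    log.Enqueue(string.Format("{0} START {1} on thread {2}",
        sw.Elapsed, tableName, Environment.CurrentManagedThreadId));

    readTable.BulkLoad(tableName, srcConForMainFile, destConForSQL);

    log.Enqueue(string.Format("{0} END   {1} on thread {2}",
        sw.Elapsed, tableName, Environment.CurrentManagedThreadId));
});

// Overlapping START/END windows and different thread IDs confirm the iterations
// run in parallel. If they do overlap but the total time is still ~20 hours,
// the bottleneck is more likely the S3 (100 DTU) database than the worker role CPU.
foreach (var line in log)
    System.Diagnostics.Trace.TraceInformation(line);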
Related
I have a MySQL server with the following config changes:
max_allowed_packet = 16M
max_connections = 2000
innodb_buffer_pool_size = 90G
I have a multi-threaded .NET Core 2 application executing 1000+ SELECT queries per thread every couple of seconds.
My server environment is Ubuntu 16.04 (cloud) with the mysql package.
This starts out fine and everything works fast, but after a short time most connections change to "Sleep" mode and those that don't perform only 5-10 SELECTs every couple of seconds.
As far as resources go, I am using about 70% of all my CPUs and about 30% of my RAM.
My server is on an intranet network so I cannot copy the entire configuration, but I'll try to give all the relevant info possible:
128GB RAM,
20 vCPUs (i7 CPUs),
Dedicated server
All DBs use the InnoDB engine with the Barracuda file format.
Any help would be appreciated!
EDIT: As mentioned, this is on an intranet network and I'm not allowed to copy any code. I'll do my best to provide something similar:
List<string> itemLst = new List<string>();
//fills the list here
ParallelOptions po = new ParallelOptions();
po.MaxDegreeOfParallelism = 45;
Parallel.ForEach(itemLst, po, item =>
{
    //Open connection to MySql server
    //do some SELECT queries
    //do 1 insert and 1 delete query
    //close and dispose connection
});
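For comparison, here is a minimal sketch of what one iteration could look like with MySql.Data and ADO.NET connection pooling; the connection string, table and query are hypothetical placeholders rather than the real code:

using MySql.Data.MySqlClient;

// Hypothetical connection string; "Maximum Pool Size" caps how many physical
// connections the 45 parallel workers can hold open at once.
const string connStr =
    "Server=dbhost;Database=mydb;Uid=appuser;Pwd=secret;Maximum Pool Size=100;";

void ProcessItem(string item)
{
    using (var con = new MySqlConnection(connStr))
    {
        con.Open();   // hands back a pooled connection when one is available

        using (var cmd = new MySqlCommand(
            "SELECT col1 FROM some_table WHERE key_col = @item", con))
        {
            cmd.Parameters.AddWithValue("@item", item);
            using (var reader = cmd.ExecuteReader())
            {
                while (reader.Read()) { /* consume rows */ }
            }
        }
        // The INSERT and DELETE would follow the same pattern with ExecuteNonQuery().
    }   // Dispose() returns the connection to the pool rather than closing it.
}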
Our application connects to an .mdb file on a network share. Everything was fine until we swapped the computer from 32-bit Windows 7 to 64-bit Windows 10. Since then, connecting to the database through our C# code gets slower over time. It starts at under 1 second after the application starts; after the application has been running for about 8 hours, opening the database file takes more than 10 seconds, and it keeps rising. The Access database is connected roughly every 5 seconds.
After a restart of the application everything is fine again, but restarting frequently isn't a long-term option for our customer. The queries themselves complete in a few milliseconds, and closing the connection is fine too.
I've seen that processor usage is only about 10%. I don't know the memory usage at the moment.
Does anybody have an idea why connecting to the database slows down over time?
public void OpenDb(string _sOpenString)
{
    this.sFilePathToAccessDb = _sOpenString;
    this.sConnectionString =
        @"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + this.sFilePathToAccessDb;
    this.oOleDbConnection =
        new System.Data.OleDb.OleDbConnection(this.sConnectionString);
    this.oOleDbConnection.Open();
}
public void CloseDb()
{
    if (this.oOleDbConnection != null)
    {
        this.oOleDbConnection.Close();
        this.oOleDbConnection.Dispose();
    }
    this.oOleDbConnection = null;
}
public void foo()
{
    OpenDb(@"\\fooserver\databases\bar.mdb");
    //do some stuff
    CloseDb();
}
After several tests it turned out that we had installed Microsoft Access Database Engine 2010 (version 14) on the machine. After uninstalling that version and installing version 12.0, everything is fine: opening a database on a network drive takes a stable 30 ms.
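As a side note (not part of the original answer), a small hedged sketch that lists the OLE DB providers registered on the machine can help confirm which Access Database Engine version an application will actually pick up; note that a 32-bit process only sees 32-bit providers and a 64-bit process only sees 64-bit ones:

using System;
using System.Data;
using System.Data.OleDb;

class ListOleDbProviders
{
    static void Main()
    {
        // One row per registered OLE DB provider/data source on this machine.
        // Run it with the same bitness (x86/x64) as the application that opens the .mdb.
        DataTable providers = new OleDbEnumerator().GetElements();
        foreach (DataRow row in providers.Rows)
        {
            // e.g. "Microsoft.ACE.OLEDB.12.0 - ... Access Database Engine OLE DB Provider"
            Console.WriteLine("{0} - {1}", row["SOURCES_NAME"], row["SOURCES_DESCRIPTION"]);
        }
    }
}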
I have a WCF web service based on JSON and the POST method, which has a function called website. This function has simple code and only calls another web service, using the code below:
using (var cli = new MyWebClient())
{
    Task<string> t = cli.UploadStringTaskAsync(myURI, "POST", request);
    if (t == await Task.WhenAny(t, Task.Delay(400)))
    {
        response = t.Result;
    }
    else
    {
        response = "";
    }
    cli.Dispose();
}
and the MyWebClient class is implemented as:
class MyWebClient : WebClient
{
    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);
        if (request is HttpWebRequest)
        {
            (request as HttpWebRequest).KeepAlive = true;
            (request as HttpWebRequest).ContentType = "application/json";
        }
        return request;
    }
}
The problem is that I can see in IIS that a large number of requests remain open for more than 18 seconds, and even longer for 1 or 2 of my worker processes (as you can see in the attached image for one of them). This slows the service down badly. Note that this service receives about 2K requests per second, the application pool uses a web garden with 12 worker processes, and the queue limit is 10K. This situation occurs when there are (for example) 4 worker processes responding in a predictable time (about 450 ms) and IIS shows that the maximum elapsed time for their requests is about 380 ms.
(Screenshot: a large number of requests remaining open in IIS.)
Note that I have used cli.UploadStringTaskAsync, so no timeout applies to cli itself. That is why I have to use code like t == await Task.WhenAny(t, Task.Delay(400)) to simulate a timeout.
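For illustration only (not from the original post), here is a hedged variation of that timeout pattern which also calls WebClient.CancelAsync() so the underlying HTTP request is aborted instead of being left running after the 400 ms budget expires:

// Sketch: simulate a 400 ms timeout and abort the outstanding request on expiry.
// myURI, request and response are the same variables used in the snippet above.
using (var cli = new MyWebClient())
{
    Task<string> t = cli.UploadStringTaskAsync(myURI, "POST", request);

    if (await Task.WhenAny(t, Task.Delay(400)) == t)
    {
        response = await t;                   // completed within the budget
    }
    else
    {
        cli.CancelAsync();                    // abort the outstanding upload
        try { await t; }                      // observe the cancellation or fault
        catch (OperationCanceledException) { }
        catch (WebException) { }              // expected when the request is cancelled
        response = "";
    }
}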
What do you think the problem is? Does using await cause many context switches, so that requests end up queued waiting for the CPU?
Edit:
I found some recommendations that looked helpful, but none of them solved the problem. I applied them in my application's web.config, but it did not resolve the issue.
Update:
As additional information, the network card is 1 Gbps and we use at most 100 Mbps of bandwidth. The server has an Intel Xeon E5-1650 v3 at 3.5 GHz (6 cores, 12 logical processors), 128 GB of RAM and a 480 GB SSD.
I found a solution that solves the problem. The key point was "processModel Element (ASP.NET Settings Schema)". As I mentioned in my question:
This situation occurs when there are (for example) 4 worker processes responding in a predictable time (about 450 ms) and IIS shows that the maximum elapsed time for their requests is about 380 ms.
So I think balancing the load among the worker processes was the problem. By configuring the processModel element manually I solved the issue. After a lot of research I found useful documentation about the processModel element and its properties.
That documentation describes all the properties and the effect of each one. Two important properties are "requestLimit" and "requestQueueLimit":
requestQueueLimit: Specifies the number of requests that are allowed in the queue before ASP.NET begins returning the message "503 – Server Too Busy" to new requests. The default is 5000.
requestLimit: Specifies the number of requests that are allowed before ASP.NET automatically launches a new worker process to take the place of the current one. The default is Infinite.
The solution is to limit requestLimit to a sensible number, 300 in my case, and to set requestQueueLimit to the number of worker processes multiplied by requestLimit. I increased the number of worker processes to 20, so with this configuration up to 6000 requests can be queued in total and each worker process handles at most 300 requests. When a worker process reaches 300 requests, ASP.NET automatically launches a new worker process to take its place.
So the load is balanced better among the worker processes. I have checked all the queues and there is no request with more than 400 ms elapsed!
I think this approach can be used as a semi-load-balancing mechanism for IIS worker processes by tuning these properties (requestLimit, requestQueueLimit and the number of worker processes).
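For illustration only, here is a minimal sketch of what such a configuration might look like. The numbers are the ones from this answer; since processModel is configured at the machine level (machine.config) and some of its attributes are only honored in particular IIS modes, treat this as an assumption to verify against the documentation rather than a drop-in setting:

<!-- machine.config, inside <system.web>; values taken from this answer -->
<system.web>
  <processModel
      autoConfig="false"
      requestLimit="300"
      requestQueueLimit="6000" />
  <!-- the 20 worker processes (web garden) are configured on the IIS
       application pool, not in processModel -->
</system.web>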
I have a Worker Role that executes code (fetching data and storing it in Azure SQL) every X hours. The timing is implemented with a Thread.Sleep in the while(true) loop in the Run method.
From the Web Role I want the ability to manually start the code in the Worker Role (manually fetch and store the data, in my case). I found out that the whole Worker Role can be restarted using the Azure Management API, but that seems like overkill, especially looking at all the work needed around certificates.
Is there a better way to restart the Worker Role from the Web Role, or to have the code in the Worker Role run on demand from the Web Role?
Anything like posting a message to an Azure queue, uploading a blob to Azure Blob storage, changing a record in Azure Table storage, or even making some change in SQL Azure will work - the web role makes the change and the worker role waits for it. Azure queues would probably be the cleanest way, although I'm not sure.
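As a rough sketch (assuming the classic Microsoft.WindowsAzure.Storage SDK, a hypothetical queue name and a hypothetical FetchAndStoreData method standing in for the existing job), the handshake could look like this:

using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Queue;

// Shared setup (connection string and queue name are assumptions for illustration).
var account = CloudStorageAccount.Parse(connectionString);
var queue = account.CreateCloudQueueClient().GetQueueReference("manual-run-requests");
queue.CreateIfNotExists();

// Web role: signal the worker role to run the job now.
queue.AddMessage(new CloudQueueMessage("run"));

// Worker role Run() loop: check for a signal between the scheduled runs.
while (true)
{
    CloudQueueMessage msg = queue.GetMessage();
    if (msg != null)
    {
        FetchAndStoreData();        // the existing job from the question
        queue.DeleteMessage(msg);   // acknowledge so it is not processed twice
    }

    // Poll with a delay, as recommended below, to avoid hammering the storage account.
    System.Threading.Thread.Sleep(15 * 1000);
}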
One very important thing you should watch for is that if you decide to use polling - like querying a blob until it appears - you should insert a delay between the queries, otherwise this code:
while( true ) {
    if( storage.BlobExists( blobName ) ) {
        break;
    }
}
will hammer the storage service and you'll run up outrageous transaction fees. In the case of SQL Azure you will not see any fees, but you'll waste the service's capacity for no good reason and slow down the other operations you send to SQL Azure.
This is how it should be done:
while( true ) {
    if( storage.BlobExists( blobName ) ) {
        break;
    }
    // value should not be less than several hundred (milliseconds)
    System.Threading.Thread.Sleep( 15 * 1000 );
}
Well, I suggest you use Azure Fluent Management (which uses the Service Management API internally). Take a look at the "Deploying to Windows Azure" page.
What you will want to do is the following:
Cloud Service: mywebapp.cloudapp.net
Production slot
Role: MyMvcApplication
Cloud Service: mybackgroundworker.cloudapp.net
Production slot
No DEPLOYMENT
So you would typically have a Cloud Service running with a Web Role, and that's it. What you do next is create the Worker Role, add your code, package it into a .cspkg file and upload it to blob storage.
Finally you would have some code in your Web Role that can deploy (or remove) the Worker Role to that other Cloud Service by downloading the package locally and then running code similar to this:
var subscriptionManager = new SubscriptionManager(TestConstants.SubscriptionId);
var deploymentManager = subscriptionManager.GetDeploymentManager();

deploymentManager
    .AddCertificateFromStore(Constants.Thumbprint)
    .ForNewDeployment(TestConstants.HostedServiceName)
    .SetCspkgEndpoint(@"C:\mypackage")
    .WithNewHostedService("myelastatestservice")
    .WithStorageAccount("account")
    .AddDescription("my new service")
    .AddLocation(LocationConstants.NorthEurope)
    .GoHostedServiceDeployment();
My situation is this:
A page was created that runs a long process. The process reads a .csv file; for each row of the file an invoice is created, and at the end of the process a success message is shown.
For this it was decided to use an UpdatePanel, so that the process runs asynchronously and an UpdateProgress can be displayed while waiting for it to finish. AsyncPostBackTimeout = 7200 (2 hours) was set on the ScriptManager, and the timeout was also increased in the web.config of the app on both the QA and production servers.
Tests were made on localhost and on the QA server and it works very well; the problem arises when testing the functionality on the production server.
What happens is this: the file is loaded and the process starts. The UpdateProgress runs during this period, but execution ends after only 1 or 2 minutes without displaying the final message, as if the process were truncated. Reviewing the invoices created, only the first 10 or so records of the file were processed (from a file with 50, 100 or more rows).
So I would appreciate help with this, because I don't know what could be wrong.
ASP.NET is not suited for long-running processes.
The default execution timeout for an ASP.NET request is 110 seconds (90 seconds for .NET 1.x). You can increase this, but it is not recommended.
If you must do it, here is the setting:
<system.web>
    ...
    <httpRuntime executionTimeout="180"/>
    ...
</system.web>
Refer to httpRuntime.
Pass this work on to a Windows service, a WCF service or a stand-alone exe.
Use your page to get the status of the process from that application.
Here is an example that shows how to use workflows for long-running processes.
You move the bulk of the processing out of ASP.NET and free its threads to handle page requests.
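To make that hand-off concrete, here is a rough sketch under stated assumptions (a hypothetical InvoiceJobs table, connection string and helper methods, not code from the answer): the page only records the job and returns, a Windows service creates the invoices in the background, and the page polls the job status.

// Page code-behind: record the job and return immediately
// (hypothetical "InvoiceJobs" table; requires System.Data.SqlClient).
void QueueInvoiceJob(string csvPath)
{
    using (var con = new System.Data.SqlClient.SqlConnection(connectionString))
    using (var cmd = new System.Data.SqlClient.SqlCommand(
        "INSERT INTO InvoiceJobs (CsvPath, Status) VALUES (@p, 'Pending')", con))
    {
        cmd.Parameters.AddWithValue("@p", csvPath);
        con.Open();
        cmd.ExecuteNonQuery();
    }
}

// Windows service / stand-alone exe: process pending jobs in the background.
// LoadPendingJobs, CreateInvoice and MarkJobDone are hypothetical helpers standing
// in for the existing per-row invoice logic and the job bookkeeping.
void ProcessPendingJobs()
{
    foreach (var job in LoadPendingJobs())
    {
        foreach (var row in System.IO.File.ReadLines(job.CsvPath))
        {
            CreateInvoice(row);
        }
        MarkJobDone(job);   // the page polls this status instead of waiting
    }
}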