Azure Pipelines concurrent (not parallel) test execution for xUnit tests - C#

I am using an Azure pipeline to execute API tests. The tests run serially, and I have hit the point where the job takes over 1 hour, which means the agent fails because 1 hour is the maximum job execution time. I started reading about how to run tests in parallel; in xUnit, tests should run in parallel by default when they are not in the same collection. In Azure, however, according to https://learn.microsoft.com/en-us/azure/devops/pipelines/test/parallel-testing-vstest?view=azure-devops, running tests in parallel requires more than one agent and apparently more than one job. There are two cons in my case:
we have a limited number of agents and other people need them too, so I do not want to occupy more agents than necessary;
I need to configure two Cosmos DB firewalls for each job (1 job = 1 agent, and each agent has a new IP I need to add to the firewall); this task takes 8 minutes per database and I have 2 databases (16 minutes), so each additional job would mean another 16 minutes spent configuring firewalls.
So my question is: is it possible to run the tests concurrently rather than in parallel, utilizing the same core? I have many Thread.Sleep calls, but in my opinion I use them correctly: when I wait up to 60 seconds for some result, I sleep for 2 seconds at a time and poll, for example, the database every 2 seconds to check whether the results are there. Thread.Sleep means the thread is sleeping, so the core should be available for other threads to run.

Apparently I posted the question too soon; it turns out all I need to do is set [assembly: CollectionBehavior(MaxParallelThreads = n)] to the number I want (source: https://xunit.net/docs/running-tests-in-parallel). The default is the number of logical processors, which in the case of Azure agents is 1.
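For reference, a minimal sketch of what that looks like in the test project; the value 8 is just an illustrative thread count, not a recommendation:

using Xunit;

// Placed in any .cs file in the test assembly (e.g. AssemblyInfo.cs).
// Raises the cap on parallel threads above the default, which is the
// number of logical processors on the machine running the tests.
[assembly: CollectionBehavior(MaxParallelThreads = 8)]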

Related

Wait time for queue-triggered Azure Functions is very high compared to equivalent Webjobs

I've migrated queue-triggered Azure Webjobs to Azure Functions. Based on my measurements the wait time to pluck messages off the queue is 5X to 60X+ (yes really) longer with the Functions.
In Webjob land, I observed that with BatchSize, NewBatchThreshold, and MaxPollingInterval at their defaults, queue wait times were generally sub-second.
With my Functions, I am seeing queue wait times often in excess of 45-60 seconds. There is a correlation between the number of items in the queue and wait times: if the queue holds only low single digits of items, wait times are excessive, i.e. 60 seconds plus. This is despite trying many different combinations of BatchSize and NewBatchThreshold.
Some specific details:
The webjobs were .NET Core 3.1
The Functions are v3 and also .NET Core 3.1
I have tried Functions on the Consumption Plan and App Service plans and I am seeing no difference in wait times
To get some scientific measurements, I instrumented my Functions to log the time the message was queued and the time it was retrieved from the queue, in order to get the elapsed time. To further eliminate variables, I created several completely empty functions - that is, the body of the queue-triggered method contains nothing but the code to log the time. I saw massive wait times there as well.
If I take the queue triggered methods and copy and paste them into an Azure webjob, the queue wait times become 1 second or less.
Any guidance?
Not sure about Webjobs, but in Azure Functions the time between adding a message to the queue and the moment it's picked up varies - take a look at the details of the polling algorithm in the documentation:
The queue trigger implements a random exponential back-off algorithm to reduce the effect of idle-queue polling on storage transaction costs. The algorithm uses the following logic:
When a message is found, the runtime waits two seconds and then checks for another message.
When no message is found, it waits about four seconds before trying again.
After subsequent failed attempts to get a queue message, the wait time continues to increase until it reaches the maximum wait time, which defaults to one minute.
The maximum wait time is configurable via the maxPollingInterval property in the host.json file. For local development the maximum polling interval defaults to two seconds.
Based on that, it seems you need to decrease the value of maxPollingInterval - it's 60 seconds by default, so in the worst case you can expect the delay to be around that value. If you decrease it to X, the worst-case time between adding a message and dequeuing it will be around X (probably a bit more due to various overheads).
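As an illustration rather than a drop-in config (check the queue binding documentation for your extensions version), a v2+/v3 Functions app could cap the polling interval in host.json roughly like this, where two seconds is just an example value:

{
  "version": "2.0",
  "extensions": {
    "queues": {
      "maxPollingInterval": "00:00:02"
    }
  }
}

For older v1 hosts, the setting sits under a top-level queues section and is given in milliseconds instead.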

How to execute and run millions of unit tests quickly?

How do you execute millions of unit tests quickly, meaning within 20 to 30 minutes?
Here is the scenario:
You are releasing certain hardware and you have, let's say, 2000 unit tests.
You are releasing new hardware and you have an additional 1000 tests for that.
Each new piece of hardware comes with its own tests, but you also have to run every previous test, so the number grows and so does the execution time.
During development, this is solved by categorizing, using the TestCategory attribute and running only what you need to.
The CI, however, must run every single test. As the number increases, execution gets slower and sometimes times out. The .testrunconfig is already set for parallelTestCount execution, but over time this does not solve the issue permanently.
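For reference, the newer vstest equivalent of that setting is MaxCpuCount in a .runsettings file, roughly as in the sketch below, where 0 means "use all available cores" for assembly-level parallelism:

<?xml version="1.0" encoding="utf-8"?>
<RunSettings>
  <RunConfiguration>
    <!-- 0 = run test assemblies in parallel on up to all logical processors -->
    <MaxCpuCount>0</MaxCpuCount>
  </RunConfiguration>
</RunSettings>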
How would you solve this?
It seems like with each update to Visual Studio 2017 the execution time changes. We currently have over 6000 tests, of which 15 to 20% are unit tests and the rest are integration tests.
The bottleneck seemed to be the CI server itself, running on a single machine. 70% to 80% of the tests are asynchronous, and analysis showed that there are no blocking I/O operations. Apart from that I/O, we do not use databases or caching.
Now we are in the process of migrating to Jenkins and using its Parallel Test Executor Plugin to spread the tests across multiple nodes instead of a single machine. Initial testing shows that executing the 6000+ tests takes 10 to 15 minutes, versus the old CI, which took 2 hours or sometimes simply stopped.

Parallel programming for Windows Service

I have a Windows Service that has code similar to the following:
List<Buyer> buyers = GetBuyers();
var results = new List<Result>();
Parallel.ForEach(buyers, buyer =>
{
    // do some prep work, log some data, etc.
    // call out to an external service that can take up to 15 seconds to return
    results.Add(Bid(buyer));
});
// Parallel.ForEach must have completed by the time this code executes
foreach (var result in results)
{
    // do some work
}
This is all fine and good and it works, but I think we're suffering from a scalability issue. We average 20-30 inbound connections per minute, and each of those connections fires this code. The "buyers" collection for each of those inbound connections can have from 1-15 buyers in it. Occasionally our inbound connection count spikes to 100+ connections per minute and our server grinds to a halt.
CPU usage is only around 50% on each server (two load-balanced 8-core servers), but the thread count continues to rise (spiking up to 350 threads on the process) and our response time for each inbound connection goes from 3-4 seconds to 1.5-2 minutes.
I suspect the above code is responsible for our scalability problems. Given this usage scenario (parallelism for I/O operations) in a Windows Service (no UI), is Parallel.ForEach the best approach? I don't have a lot of experience with async programming and am looking forward to using this opportunity to learn more about it, so I figured I'd start here to get some community advice to supplement what I've been able to find on Google.
Parallel.ForEach has a terrible design flaw: it is prone to consuming all available thread-pool resources over time. The number of threads it will spawn is literally unlimited - you can get up to 2 new ones per second, driven by heuristics that nobody understands. The CoreCLR has a hill-climbing algorithm built into it that just doesn't work.
call out to an external service
Probably, you should find out what the right degree of parallelism is for calling that service. You need to find that out by testing different amounts.
Then, you need to restrict Parallel.ForEach so it only spawns as many threads as you want at a maximum. You can do that using a fixed-concurrency TaskScheduler.
Or, you can change this to use async I/O with SemaphoreSlim.WaitAsync, so that no threads are blocked. That solves the pool exhaustion, and the overloading of the external service as well.
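A rough sketch of that last approach, assuming the external call can be made asynchronous (BidAsync here is a hypothetical async version of Bid, and the limit of 10 concurrent calls is only an example value to tune):

using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public class Bidder
{
    // Caps the number of in-flight calls to the external service without blocking threads.
    private static readonly SemaphoreSlim Throttle = new SemaphoreSlim(10);

    public async Task<List<Result>> BidAllAsync(List<Buyer> buyers)
    {
        var tasks = buyers.Select(async buyer =>
        {
            await Throttle.WaitAsync();
            try
            {
                // do some prep work, log some data, etc.
                return await BidAsync(buyer);   // hypothetical async version of Bid
            }
            finally
            {
                Throttle.Release();
            }
        });

        var results = (await Task.WhenAll(tasks)).ToList();

        foreach (var result in results)
        {
            // do some work
        }
        return results;
    }
}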

Prevent overload on remote system when using Parallel.ForEach()

We've built this app that needs to have some calculations done on a remote machine (actually a MatLab server). We're using web services to connect to the MatLab server and perform the calculations.
In order to speed things up, we've used Parallel.ForEach() so that multiple service calls go out at the same time. If we're very conservative and set ParallelOptions.MaxDegreeOfParallelism (DOP) to 4 or so, everything works fine.
However, if we let the framework decide on the DOP, it spawns so many threads that it brings the remote machine to its knees and timeouts start occurring (> 10 minutes).
How can we solve this issue? What I would LOVE to be able to do is use the response time to throttle the calls: if the response time is under 30 seconds, keep adding threads; as soon as it goes over 30 seconds, use fewer. Any suggestions?
N.B. Related to the response in this question: https://stackoverflow.com/a/20192692/896697
The simplest way would be to tune for the best number of concurrent requests and hardcode it, as you have done so far; however, there are some nicer options if you are willing to put in some effort.
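For completeness, the hardcoded version is just a ParallelOptions limit, roughly as below; CallMatlabService and workItems are stand-ins for your own web-service call and input collection, and 4 is only the example value from the question:

var options = new ParallelOptions { MaxDegreeOfParallelism = 4 };
Parallel.ForEach(workItems, options, item =>
{
    // each iteration makes one blocking web-service call to the MatLab server
    CallMatlabService(item);
});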
You could move from Parallel.ForEach to using a thread pool. That way, as results come back from the remote server, you can manually or programmatically tune the number of available threads, reducing or increasing them as things slow down or speed up, or even killing them if needed.
You could also do a variant of the above using Tasks, which are the newer way of doing parallel/async work in .NET.
Another option would be to use a timer and/or job model to schedule work every x milliseconds, which could then be throttled or relaxed as results return from the server. The easiest way to get started with that would be Quartz.Net.

How to run background service on web application - DotNetNuke

I made a DNN scheduler task and set it to run every 1 minute. It works when I am doing something on the site, but I need some actions to run when I am not on the site - for example, inserting a record into the database with the current time. Is this possible?
In Host Settings, use Scheduler Mode = Timer Method
This will make the scheduler run in a separate thread that is not triggered by page requests.
If the scheduler runs in the timer method, it won't have access to the current HttpContext.
You will also have to make sure that DNN is kept alive and that IIS doesn't shut down the application due to inactivity. Setting the application pool idle timeout appropriately and pinging /Keepalive.aspx should take care of this. Nevertheless, using the DNN scheduler for critical tasks is not a good idea.
See Also:
Creating DotNetNuke Scheduled Jobs
DotNetNuke Scheduler Explained
If you just want database-related things, such as inserting a record, you can use database jobs. You didn't mention which DBMS you use, but almost every database has much the same functionality under a different name.
Doing the equivalent of a Cron job is still a pain in the butt on Windows.
The DNN Scheduler will work if you aren't too concerned about exactly when it runs. What you may need to do is have more logic on your end: if it only runs every 10 minutes, or every hour, you may have to look at your database tables, determine the last time it ran, and then do whatever is needed to 'catch up' - in your case, adding 60 minutes' worth of records at once instead of one every minute.
I'm struggling to think of a reason to write to a table every minute or on some fixed interval. If you need the rows for a join table or something convenient to report off of, you could generate them in larger chunks.
The other option is to write a small .NET Windows service, which isn't that hard, and have it run every minute. That would be more reliable.
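A minimal sketch of that kind of service, assuming a classic ServiceBase host with a System.Timers.Timer; InsertRecord stands in for whatever the actual database write is:

using System.ServiceProcess;
using System.Timers;

public class MinuteService : ServiceBase
{
    // Fires every 60 seconds, independent of any web traffic hitting the site.
    private readonly Timer _timer = new Timer(60000);

    protected override void OnStart(string[] args)
    {
        _timer.Elapsed += (sender, e) => InsertRecord();
        _timer.Start();
    }

    protected override void OnStop()
    {
        _timer.Stop();
    }

    private void InsertRecord()
    {
        // hypothetical: insert a row with the current time into the database
    }
}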
