Track dead WebDriver instances during parallel task

Track dead WebDriver instances during parallel task - c#

I am seeing some dead-instance weirdness running parallelized nested-loop web stress tests using Selenium WebDriver, simple example being, say, hit 300 unique pages with 100 impressions each.
I'm "successfully" getting 4 - 8 WebDriver instances going using a ThreadLocal<FirefoxWebDriver> to isolate them per task thread, and MaxDegreeOfParallelism on a ParallelOptions instance to limit the threads. I'm partitioning and parallelizing the outer loop only (the collection of pages), and checking .IsValueCreated on the ThreadLocal<> container inside the beginning of each partition's "long running task" method. To facilitate cleanup later, I add each new instance to a ConcurrentDictionary keyed by thread id.
No matter what parallelizing or partitioning strategy I use, the WebDriver instances will occasionally do one of the following:
Launch but never show a URL or run an impression
Launch, run any number of impressions fine, then just sit idle at some point
When either of these happen, the parallel loop eventually seems to notice that a thread isn't doing anything, and it spawns a new partition. If n is the number of threads allowed, this results in having n productive threads only about 50-60% of the time.
Cleanup still works fine at the end; there may be 2n open browsers or more, but the productive and unproductive ones alike get cleaned up.
Is there a way to monitor for these useless WebDriver instances and a) scavenge them right away, plus b) get the parallel loop to replace the task segment immediately, instead of lagging behind for several minutes as it often does now?

I was having a similar problem. It turns out that WebDriver doesn't have the best method for finding open ports. As described here it gets a system wide lock on ports, finds an open port, and then starts the instance. This can starve the other instances that you're trying to start of ports.
I got around this by specifying a random port number directly in the delegate for the ThreadLocal<IWebDriver> like this:
var ports = new List<int>();
var rand = new Random((int)DateTime.Now.Ticks & 0x0000FFFF);
var driver = new ThreadLocal<IWebDriver>(() =>
{
var profile = new FirefoxProfile();
var port = rand.Next(50) + 7050;
while(ports.Contains(port) && ports.Count != 50) port = rand.Next(50) + 7050;
profile.Port = port;
ports.Add(port);
return new FirefoxDriver(profile);
});
This works pretty consistently for me, although there's the issue if you end up using all 50 in the list that is unresolved.

Since there is no OnReady event nor an IsReady property, I worked around it by sleeping the thread for several seconds after creating each instance. Doing that seems to give me 100% durable, functioning WebDriver instances.
Thanks to your suggestion, I've implemented IsReady functionality in my open-source project Webinator. Use that if you want, or use the code outlined below.
I tried instantiating 25 instances, and all of them were functional, so I'm pretty confident in the algorithm at this point (I leverage HtmlAgilityPack to see if elements exist, but I'll skip it for the sake of simplicity here):
public void WaitForReady(IWebDriver driver)
{
var js = #"{ var temp=document.createElement('div'); temp.id='browserReady';" +
#"b=document.getElementsByTagName('body')[0]; b.appendChild(temp); }";
((IJavaScriptExecutor)driver).ExecuteScript(js);
WaitForSuccess(() =>
{
IWebElement element = null;
try
{
element = driver.FindElement(By.Id("browserReady"));
}
catch
{
// element not found
}
return element != null;
},
timeoutInMilliseconds: 10000);
js = #"{var temp=document.getElementById('browserReady');" +
#" temp.parentNode.removeChild(temp);}";
((IJavaScriptExecutor)driver).ExecuteScript(js);
}
private bool WaitForSuccess(Func<bool> action, int timeoutInMilliseconds)
{
if (action == null) return false;
bool success;
const int PollRate = 250;
var maxTries = timeoutInMilliseconds / PollRate;
int tries = 0;
do
{
success = action();
tries++;
if (!success && tries <= maxTries)
{
Thread.Sleep(PollRate);
}
}
while (!success && tries < maxTries);
return success;
}
The assumption is if the browser is responding to javascript functions and is finding elements, then it's probably a reliable instance and ready to be used.

Related

What is the correct way to use USBHIDDRIVER for multiple writes?

I am writing an application that needs to write messages to a USB HID device and read responses. For this purpose, I'm using USBHIDDRIVER.dll (https://www.leitner-fischer.com/2007/08/03/hid-usb-driver-library/ )
Now it works fine when writing many of the message types - i.e. short ones.
However, there is one type of message where I have to write a .hex file containing about 70,000 lines. The protocol requires that each line needs to be written individually and sent in a packet containing other information (start, end byte, checksum)
However I'm encountering problems with this.
I've tried something like this:
private byte[] _responseBytes;
private ManualResetEvent _readComplete;
public byte[][] WriteMessage(byte[][] message)
{
byte[][] devResponse = new List<byte[]>();
_readComplete = new ManualResetEvent(false);
for (int i = 0; i < message.Length; i++)
{
var usbHid = new USBInterface("myvid", "mypid");
usbHid.Connect();
usbHid.enableUsbBufferEvent(UsbHidReadEvent);
if (!usbHid.write(message)) {
throw new Exception ("Write Failed");
}
usbHid.startRead();
if (!_readComplete.WaitOne(10000)) {
usbHid.stopRead();
throw new Exception ("Timeout waiting for read");
}
usbHid.stopRead();
_readComplete.Reset();
devResponse.Add(_responseBytes.ToArray());
usbHid = null;
}
return devResponse;
}
private void ReadEvent()
{
if (_readComplete!= null)
{
_readComplete.Set();
}
_microHidReadBytes = (byte[])((ListWithEvent)sender)[0];
}
This appears to work. In WireShark I can see the messages going back and forth. However as you can see it's creating an instance of the USBInterface class every iteration. This seems very clunky and I can see in the TaskManager, it starts to eat up a lot of memory - current run has it above 1GB and eventually it falls over with an OutOfMemory exception. It is also very slow. Current run is not complete after about 15 mins, although I've seen another application do the same job in less than one minute.
However, if I move the creation and connection of the USBInterface out of the loop as in...
var usbHid = new USBInterface("myvid", "mypid");
usbHid.Connect();
usbHid.enableUsbBufferEvent(UsbHidReadEvent);
for (int i = 0; i < message.Length; i++)
{
if (!usbHid.write(message)) {
throw new Exception ("Write Failed");
}
usbHid.startRead();
if (!_readComplete.WaitOne(10000)) {
usbHid.stopRead();
throw new Exception ("Timeout waiting for read");
}
usbHid.stopRead();
_readComplete.Reset();
devResponse.Add(_responseBytes.ToArray());
}
usbHid = null;
... now what happens is it only allows me to do one write! I write the data, read the response and when it comes around the loop to write the second message, the application just hangs in the write() function and never returns. (Doesn't even time out)
What is the correct way to do this kind of thing?
(BTW I know it's adding a lot of data to that devResponse object but this is not the source of the issue - if I remove it, it still consumes an awful lot of memory)
UPDATE
I've found that if I don't enable reading, I can do multiple writes without having to create a new USBInterface1 object with each iteration. This is an improvement but I'd still like to be able to read each response. (I can see they are still sent down in Wireshark)

Running Sql Server Agent Jobs from C#

While searching on above topic in the internet, I found two approaches,Both are working fine, But I need to know the difference between the two, Which one is suitable for what occasion etc... Our Jobs take some time and I need a way to wait till the Job finishes before the next C# line executes.
Approach One
var dbConn = new SqlConnection(myConString);
var execJob = new SqlCommand
{
CommandType = CommandType.StoredProcedure,
CommandText = "msdb.dbo.sp_start_job"
};
execJob.Parameters.AddWithValue("#job_name", p0);
execJob.Connection = dbConn;
using (dbConn)
{
dbConn.Open();
using (execJob)
{
execJob.ExecuteNonQuery();
Thread.Sleep(5000);
}
}
Approach Two
using System.Threading;
using Microsoft.SqlServer.Management.Smo;
using Microsoft.SqlServer.Management.Smo.Agent;
var server = new Server(#"localhost\myinstance");
var isStopped = false;
try
{
server.ConnectionContext.LoginSecure = true;
server.ConnectionContext.Connect();
var job = server.JobServer.Jobs[jobName];
job.Start();
Thread.Sleep(1000);
job.Refresh();
while (job.CurrentRunStatus == JobExecutionStatus.Executing)
{
Thread.Sleep(1000);
job.Refresh();
}
isStopped = true;
}
finally
{
if (server.ConnectionContext.IsOpen)
{
server.ConnectionContext.Disconnect();
}
}

sp_start_job - sample 1
Your first example calls your job via the sp_start_job system stored procedure.
Note that it kicks off the job asynchronously, and the thread sleeps for an arbitrary period of time (5 seconds) before continuing regardless of the job's success or failure.
SQL Server Management Objects (SMO) - sample 2
Your second example uses (and therefore has a dependency on) the SQL Server Management Objects to achieve the same goal.
In the second case, the job also commences running asynchronously, but the subsequent loop watches the Job Status until it is not longer Executing. Note that the "isStopped" flag appears to serve no purpose, and the loop could be refactored somewhat as:
job.Start();
do
{
Thread.Sleep(1000);
job.Refresh();
} while (job.CurrentRunStatus == JobExecutionStatus.Executing);
You'd probably want to add a break-out of that loop after a certain period of time.
Other Considerations
It seems the same permissions are required by each of your examples; essentially the solution using SMO is a wrapper around sp_start_job, but provides you with (arguably) more robust code which has a clearer purpose.
Use whichever suits you best, or do some profiling and pick the most efficient if performance is a concern.

c# while loop usage

I have a fairly general c# while loop question.
This code should continue to execute only after the RDP session has truly disconnected.
When the Connected property is changed to 0 it means that the RDP session connection has truly terminated. When the property is 1 it is still connected and the connection has not yet terminated.
Does anyone see anything inherently bad about this code? Is there a better way to go about it?
private void Reconnect()
{
rdp1.Disconnect(); // force the RDP session to disconnect
while (rdp1.Connected == 1) // true as long as RDP is still connected
{
// do nothing
}
rdp1.Connect(); // execute this code after while loop is broken
}
/**************************************************************/
Here's the final code I used per James' answer.
The counter suffices as the timeout for my purpose.
int i = 0;
rdp1.Disconnect();
while (rdp1.Connected == 1)
{
if (i == 1000 * 10) break;
else Thread.Sleep(100);
i++;
}
rdp1.Connect();

You should do something in the body of loop, or it will consume all your CPU (at least for one core). Usually in this type of loop, you'd sleep for a while using System.Threading.Thread.Sleep(100) or something. Sleep takes the number of milliseconds to wait before checking the while condition again. Ideally, the RDP object would have a mutex or event or something you could just block on until it was disconnected, but it wouldn't surprise me if they left that out.
EDIT: As Ben pointed out, it's always a good idea to have a way out of the loop as well. Something like this (your stated answer will depend on the CPU speed, which could break in the future when CPUs are much faster):
DateTime stop = DateTime.UtcNow.AddSeconds(30);
while (rdp1.Connected)
{
if (DateTime.UtcNow > stop) throw new ApplicationException ("RDP disconnect timeout!");
System.Threading.Thread.Sleep (100);
}
Of course you will probably want to specify the timeout with a constant, a readonly TimeSpan, or a dynamically configurable TimeSpan rather than a magic number, and you should probably have a specific exception class for this case.

Set a timeout for the purpose
private void Reconnect()
{
timeOut = false;
new System.Threading.Thread(new System.Threading.ThreadStart(setTimeout)).Start();
rdp1.Disconnect();
while (rdp1.Connected == 1 && !timeOut);
rdp1.Connect();
}
bool timeOut = false;
void setTimeout()
{
System.Threading.Thread.Sleep(7000);
timeOut = true;
}

Can this code have bottleneck or be resource-intensive?

It's code that will execute 4 threads in 15-min intervals. The last time that I ran it, the first 15-minutes were copied fast (20 files in 6 minutes), but the 2nd 15-minutes are much slower. It's something sporadic and I want to make certain that, if there's any bottleneck, it's in a bandwidth limitation with the remote server.
EDIT: I'm monitoring the last run and the 15:00 and :45 copied in under 8 minutes each. The :15 hasn't finished and neither has :30, and both began at least 10 minutes before :45.
Here's my code:
static void Main(string[] args)
{
Timer t0 = new Timer((s) =>
{
Class myClass0 = new Class();
myClass0.DownloadFilesByPeriod(taskRunDateTime, 0, cts0.Token);
Copy0Done.Set();
}, null, TimeSpan.FromMinutes(20), TimeSpan.FromMilliseconds(-1));
Timer t1 = new Timer((s) =>
{
Class myClass1 = new Class();
myClass1.DownloadFilesByPeriod(taskRunDateTime, 1, cts1.Token);
Copy1Done.Set();
}, null, TimeSpan.FromMinutes(35), TimeSpan.FromMilliseconds(-1));
Timer t2 = new Timer((s) =>
{
Class myClass2 = new Class();
myClass2.DownloadFilesByPeriod(taskRunDateTime, 2, cts2.Token);
Copy2Done.Set();
}, null, TimeSpan.FromMinutes(50), TimeSpan.FromMilliseconds(-1));
Timer t3 = new Timer((s) =>
{
Class myClass3 = new Class();
myClass3.DownloadFilesByPeriod(taskRunDateTime, 3, cts3.Token);
Copy3Done.Set();
}, null, TimeSpan.FromMinutes(65), TimeSpan.FromMilliseconds(-1));
}
public struct FilesStruct
{
public string RemoteFilePath;
public string LocalFilePath;
}
Private void DownloadFilesByPeriod(DateTime TaskRunDateTime, int Period, Object obj)
{
FilesStruct[] Array = GetAllFiles(TaskRunDateTime, Period);
//Array has 20 files for the specific period.
using (Session session = new Session())
{
// Connect
session.Open(sessionOptions);
TransferOperationResult transferResult;
foreach (FilesStruct u in Array)
{
if (session.FileExists(u.RemoteFilePath)) //File exists remotely
{
if (!File.Exists(u.LocalFilePath)) //File does not exist locally
{
transferResult = session.GetFiles(u.RemoteFilePath, u.LocalFilePath);
transferResult.Check();
foreach (TransferEventArgs transfer in transferResult.Transfers)
{
//Log that File has been transferred
}
}
else
{
using (StreamWriter w = File.AppendText(Logger._LogName))
{
//Log that File exists locally
}
}
}
else
{
using (StreamWriter w = File.AppendText(Logger._LogName))
{
//Log that File exists remotely
}
}
if (token.IsCancellationRequested)
{
break;
}
}
}
}

Something is not quite right here. First thing is, you're setting 4 timers to run parallel. If you think about it, there is no need. You don't need 4 threads running parallel all the time. You just need to initiate tasks at specific intervals. So how many timers do you need? ONE.
The second problem is why TimeSpan.FromMilliseconds(-1)? What is the purpose of that? I can't figure out why you put that in there, but I wouldn't.
The third problem, not related to multi-programming, but I should point out anyway, is that you create a new instance of Class each time, which is unnecessary. It would be necessary if, in your class, you need to set constructors and your logic access different methods or fields of the class in some order. In your case, all you want to do is to call the method. So you don't need a new instance of the class every time. You just need to make the method you're calling static.
Here is what I would do:
Store the files you need to download in an array / List<>. Can't you spot out that you're doing the same thing every time? Why write 4 different versions of code for that? This is unnecessary. Store items in an array, then just change the index in the call!
Setup the timer at perhaps 5 seconds interval. When it reaches the 20 min/ 35 min/ etc. mark, spawn a new thread to do the task. That way a new task can start even if the previous one is not finished.
Wait for all threads to complete (terminate). When they do, check if they throw exceptions, and handle them / log them if necessary.
After everything is done, terminate the program.
For step 2, you have the option to use the new async keyword if you're using .NET 4.5. But it won't make a noticeable difference if you use threads manually.
And why is it so slow...why don't you check your system status using task manager? Is the CPU high and running or is the network throughput occupied by something else or what? You can easily tell the answer yourself from there.

The problem was the sftp client.
The purpose of the console application was to loop through a list<> and download the files. I tried with winscp and, even though, it did the job, it was very slow. I also tested sharpSSH and it was even slower than winscp.
I finally ended up using ssh.net which, at least in my particular case, was much faster than both winscp and sharpssh. I think the problem with winscp is that there was no evident way of disconnecting after I was done. With ssh.net I could connect/disconnect after every file download was made, something I couldn't do with winscp.

How to programatically check if NServiceBus has finished processing all messages

As part of an effort to automate starting/stopping some of our NServiceBus services, I'd like to know when a service has finished processing all the messages in it's input queue.
The problem is that, while the NServiceBus service is running, my C# code is reporting one less message than is actually there. So it thinks that the queue is empty when there is still one message left. If the service is stopped, it reports the "correct" number of messages. This is confusing because, when I inspect the queues myself using the Private Queues view in the Computer Management application, it displays the "correct" number.
I'm using a variant of the following C# code to find the message count:
var queue = new MessageQueue(path);
return queue.GetAllMessages().Length;
I know this will perform horribly when there are many messages. The queues I'm inspecting should only ever have a handful of messages at a time.
I have looked at
other
related
questions,
but haven't found the help I need.
Any insight or suggestions would be appreciated!
Update: I should have mentioned that this service is behind a Distributor, which is shut down before trying to shut down this service. So I have confidence that new messages will not be added to the service's input queue.

The thing is that it's not actually "one less message", but rather dependent on the number of messages currently being processed by the endpoint which, in a multi-threaded process, can be as high as the number of threads.
There's also the issue of client processes that continue to send messages to that same queue.
Probably the only "sure" way of handling this is by counting the messages multiple times with a delay in between and if the number stay zero over a certain number of attempts that you can assume the queue is empty.

WMI was the answer! Here's a first pass at the code. It could doubtless be improved.
public int GetMessageCount(string queuePath)
{
const string query = "select * from Win32_PerfRawData_MSMQ_MSMQQueue";
var query = new WqlObjectQuery(query);
var searcher = new ManagementObjectSearcher(query);
var queues = searcher.Get();
foreach (ManagementObject queue in queues)
{
var name = queue["Name"].ToString();
if (AreTheSameQueue(queuePath, name))
{
// Depending on the machine (32/64-bit), this value is a different type.
// Casting directly to UInt64 or UInt32 only works on the relative CPU architecture.
// To work around this run-time unknown, convert to string and then parse to int.
var countAsString = queue["MessagesInQueue"].ToString();
var messageCount = int.Parse(countAsString);
return messageCount;
}
}
return 0;
}
private static bool AreTheSameQueue(string path1, string path2)
{
// Tests whether two queue paths are equivalent, accounting for differences
// in case and length (if one path was truncated, for example by WMI).
string sanitizedPath1 = Sanitize(path1);
string sanitizedPath2 = Sanitize(path2);
if (sanitizedPath1.Length > sanitizedPath2.Length)
{
return sanitizedPath1.StartsWith(sanitizedPath2);
}
if (sanitizedPath1.Length < sanitizedPath2.Length)
{
return sanitizedPath2.StartsWith(sanitizedPath1);
}
return sanitizedPath1 == sanitizedPath2;
}
private static string Sanitize(string queueName)
{
var machineName = Environment.MachineName.ToLowerInvariant();
return queueName.ToLowerInvariant().Replace(machineName, ".");
}

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

Track dead WebDriver instances during parallel task - c#

Related

What is the correct way to use USBHIDDRIVER for multiple writes?

Running Sql Server Agent Jobs from C#

c# while loop usage

Can this code have bottleneck or be resource-intensive?

How to programatically check if NServiceBus has finished processing all messages

Categories

Resources