UploadFromStreamAsync CancellationToken not working - C#

Expected behaviour:
Looking at my internet usage in Task Manager after running, I should see a spike in upload for around 5 seconds and then a drop back to normal levels.
Result:
Upload speed spikes for much longer (closer to a minute or more, indicative of the full file being uploaded).
Tried:
Cancelling after various delays (e.g. 1 second, 10 seconds, etc.)
Immediately cancelling with the token after starting the upload
Using UploadFromByteArrayAsync() instead of UploadFromStreamAsync()
Using BeginUploadFromStream() with EndUploadFromStream()
Although I can quite easily cancel a download using the CancellationToken, no matter what I do, I can't cancel this upload. Also, weirdly, searching online, I can't find any instance of anyone else having problems cancelling an upload.
_connectionString = "xxx";
if (_connectionString != "")
{
    _storageAccount = CloudStorageAccount.Parse(_connectionString);
    _blobClient = _storageAccount.CreateCloudBlobClient();
}

string ulContainerName = "speedtest";
string ulBlobName = "uploadTestFile" + DateTime.UtcNow.ToLongTimeString();
CloudBlobContainer container = _blobClient.GetContainerReference(ulContainerName);
CloudBlockBlob ulBlockBlob = container.GetBlockBlobReference(ulBlobName);

await CreateDummyDataAsync(_fileUploadSizeMB); // make sure the dummy file exists before reading it
byte[] byteArray = System.IO.File.ReadAllBytes(_filePath + "dummy_upload");

Task uploadTask = ulBlockBlob.UploadFromStreamAsync(new MemoryStream(byteArray), _ulCancellationTokenSource.Token);
_ulCancellationTokenSource.CancelAfter(5000);
await uploadTask; // cancellation should surface here as an OperationCanceledException

To anyone who ends up in this situation and can't get the CancellationToken to work... the workaround I eventually used was:
BlobRequestOptions timeoutRequestOptions = new BlobRequestOptions()
{
    // Allot 10 seconds for this API call, including retries
    MaximumExecutionTime = TimeSpan.FromSeconds(10)
};
Then include the timeoutRequestOptions in the method arguments:
ulBlockBlob.UploadFromStreamAsync(new MemoryStream(byteArray), new AccessCondition(),
    timeoutRequestOptions,
    new OperationContext(),
    progressHandler,                   // an IProgress<StorageProgress> implementation
    _ulCancellationTokenSource.Token);
This will force the API call to time out after the specified period.
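When the time limit is hit, the client surfaces it as an exception. A minimal sketch of handling it, assuming the classic WindowsAzure.Storage client; the exact exception surface can vary by SDK version, so treat this as an assumption to verify:

try
{
    await ulBlockBlob.UploadFromStreamAsync(new MemoryStream(byteArray),
        null,                  // AccessCondition
        timeoutRequestOptions, // MaximumExecutionTime = 10 seconds
        null);                 // OperationContext
}
catch (Microsoft.WindowsAzure.Storage.StorageException ex)
{
    // Assumption: the forced timeout is reported as a StorageException.
    Console.WriteLine("Upload stopped by MaximumExecutionTime: " + ex.Message);
}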

Related

Set WaitTimeOut for Azure Service Bus Session Processor

In the legacy version of Azure Service Bus (ASB) I could use MessageWaitTimeout in SessionHandlerOptions to control the timeout between two messages. For example, if I set the timeout to 5 seconds, then after completing the first message, the queue waits 5 seconds before picking up the next one.
In the new Azure.Messaging.ServiceBus library, the queue waits around 1 minute to pick up the next message. I only need to process messages one by one; there is no need to process messages concurrently.
I followed this example and can't find any way to set the timeout like in the old version.
Does anyone know how to do it?
var options = new ServiceBusSessionProcessorOptions
{
    AutoCompleteMessages = false,
    MaxConcurrentSessions = 1,
    MaxConcurrentCallsPerSession = 1,
    MaxAutoLockRenewalDuration = TimeSpan.FromMinutes(2),
};
EDIT:
I found the solution: it's RetryOptions on ServiceBusClient.
var client = new ServiceBusClient("connectionString", new ServiceBusClientOptions
{
    RetryOptions = new ServiceBusRetryOptions
    {
        TryTimeout = TimeSpan.FromSeconds(5)
    }
});
With the latest stable release, 7.2.0, this can be configured with the SessionIdleTimeout property.
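For example, a minimal sketch; the connection string and queue name are placeholders, and the other options mirror the question's:

await using var client = new ServiceBusClient("connectionString");
var processor = client.CreateSessionProcessor("queueName", new ServiceBusSessionProcessorOptions
{
    AutoCompleteMessages = false,
    MaxConcurrentSessions = 1,
    MaxConcurrentCallsPerSession = 1,
    // Stop waiting on an idle session after 5 seconds instead of the default.
    SessionIdleTimeout = TimeSpan.FromSeconds(5)
});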

Issue with Azure Batch Job, ending with unknown error code

I'm having problems with Azure Batch jobs. I'm trying to create a pool, create a CloudTask, and then execute my application package once the pool is online.
Do you see something not working correctly?
Here is the code used now. Main code:
await CreatePoolAsync(batchClient, currentPoolId, applicationFiles);
await CreateJobAsync(batchClient, currentJobId, currentPoolId);
await AddTasksAsync(batchClient, currentJobId, inputFiles, optimizationId, outputContainerSasUrl);
Creating the Pool:
CloudPool pool = batchClient.PoolOperations.CreatePool(
    poolId: poolId,
    targetDedicated: 1,          // 1 compute node
    virtualMachineSize: "small", // single-core, 1.75 GB memory, 225 GB disk
    cloudServiceConfiguration: new CloudServiceConfiguration(osFamily: "4")); // Windows Server 2012 R2

pool.ApplicationPackageReferences = new List<ApplicationPackageReference>
{
    new ApplicationPackageReference
    {
        ApplicationId = "my_app"
    }
};
Creating the Job:
CloudJob job = batchClient.JobOperations.CreateJob();
job.Id = jobId;
job.PoolInformation = new PoolInformation {PoolId = poolId};
await job.CommitAsync();
And adding the task:
string taskId = "myAppEngineTask";
string taskCommandLine = $"cmd /c %AZ_BATCH_APP_PACKAGE_MY_APP%\\MyApp.Console.exe -a NSGA2 -r 1000 -m db -i {optimizationId}";
CloudTask task = new CloudTask(taskId, taskCommandLine);
task.ApplicationPackageReferences = new List<ApplicationPackageReference>
{
    new ApplicationPackageReference
    {
        ApplicationId = "my_app"
    }
};
await batchClient.JobOperations.AddTaskAsync(jobId, task);
When done with adding tasks, everything seems to be up and running, but I get error code: -2146232576 and nothing is printed to any logs.
To diagnose task failures, first check whether the CloudTask's ExecutionInformation.FailureInformation (SDK 7.0.0+; ExecutionInformation.SchedulingError in prior SDK versions) is set, and examine those fields for any information.
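For instance, a minimal sketch assuming Batch SDK 7.0.0+ and the jobId and task id from the question:

CloudTask finished = await batchClient.JobOperations.GetTaskAsync(jobId, "myAppEngineTask");
TaskFailureInformation failure = finished.ExecutionInformation?.FailureInformation;
if (failure != null)
{
    Console.WriteLine($"Category: {failure.Category}");
    Console.WriteLine($"Code: {failure.Code}");
    Console.WriteLine($"Message: {failure.Message}");
}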
For your particular task, it looks like it could be related to adding a task-level application package reference when you have already done that at the pool level. Try omitting task.ApplicationPackageReferences.
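That is, the task-creation snippet above reduces to something like this, with the pool-level reference alone staging my_app on each node:

string taskCommandLine = $"cmd /c %AZ_BATCH_APP_PACKAGE_MY_APP%\\MyApp.Console.exe -a NSGA2 -r 1000 -m db -i {optimizationId}";
CloudTask task = new CloudTask("myAppEngineTask", taskCommandLine);
// No task.ApplicationPackageReferences here: the pool-level reference already deploys the package.
await batchClient.JobOperations.AddTaskAsync(jobId, task);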
Consult the Application Package Documentation for more information regarding the difference between Pool-level and Task-level application packages and which one would suit your scenario the best.

Can this code have a bottleneck or be resource-intensive?

It's code that will execute 4 threads at 15-minute intervals. The last time I ran it, the first 15-minute batch copied fast (20 files in 6 minutes), but the second 15-minute batch was much slower. It's sporadic, and I want to confirm that if there's any bottleneck, it's a bandwidth limitation on the remote server's side.
EDIT: I'm monitoring the last run; the 15:00 and :45 batches copied in under 8 minutes each. The :15 batch hasn't finished and neither has :30, and both began at least 10 minutes before :45.
Here's my code:
static void Main(string[] args)
{
    Timer t0 = new Timer((s) =>
    {
        Class myClass0 = new Class();
        myClass0.DownloadFilesByPeriod(taskRunDateTime, 0, cts0.Token);
        Copy0Done.Set();
    }, null, TimeSpan.FromMinutes(20), TimeSpan.FromMilliseconds(-1));

    Timer t1 = new Timer((s) =>
    {
        Class myClass1 = new Class();
        myClass1.DownloadFilesByPeriod(taskRunDateTime, 1, cts1.Token);
        Copy1Done.Set();
    }, null, TimeSpan.FromMinutes(35), TimeSpan.FromMilliseconds(-1));

    Timer t2 = new Timer((s) =>
    {
        Class myClass2 = new Class();
        myClass2.DownloadFilesByPeriod(taskRunDateTime, 2, cts2.Token);
        Copy2Done.Set();
    }, null, TimeSpan.FromMinutes(50), TimeSpan.FromMilliseconds(-1));

    Timer t3 = new Timer((s) =>
    {
        Class myClass3 = new Class();
        myClass3.DownloadFilesByPeriod(taskRunDateTime, 3, cts3.Token);
        Copy3Done.Set();
    }, null, TimeSpan.FromMinutes(65), TimeSpan.FromMilliseconds(-1));
}
public struct FilesStruct
{
    public string RemoteFilePath;
    public string LocalFilePath;
}
private void DownloadFilesByPeriod(DateTime TaskRunDateTime, int Period, CancellationToken token)
{
    FilesStruct[] Array = GetAllFiles(TaskRunDateTime, Period);
    // Array has 20 files for the specific period.
    using (Session session = new Session())
    {
        // Connect
        session.Open(sessionOptions);
        TransferOperationResult transferResult;
        foreach (FilesStruct u in Array)
        {
            if (session.FileExists(u.RemoteFilePath)) // File exists remotely
            {
                if (!File.Exists(u.LocalFilePath)) // File does not exist locally
                {
                    transferResult = session.GetFiles(u.RemoteFilePath, u.LocalFilePath);
                    transferResult.Check();
                    foreach (TransferEventArgs transfer in transferResult.Transfers)
                    {
                        // Log that file has been transferred
                    }
                }
                else
                {
                    using (StreamWriter w = File.AppendText(Logger._LogName))
                    {
                        // Log that file already exists locally
                    }
                }
            }
            else
            {
                using (StreamWriter w = File.AppendText(Logger._LogName))
                {
                    // Log that file does not exist remotely
                }
            }
            if (token.IsCancellationRequested)
            {
                break;
            }
        }
    }
}
Something is not quite right here. First, you're setting up 4 timers to run in parallel. If you think about it, there's no need: you don't want 4 threads running in parallel all the time, you just want to initiate tasks at specific intervals. So how many timers do you need? One.
The second problem is the TimeSpan.FromMilliseconds(-1) period. A period of -1 ms disables periodic signaling, so each timer fires exactly once; that's another hint that a single scheduler would do the same job more simply.
The third problem, not related to multithreading but worth pointing out anyway, is that you create a new instance of Class each time, which is unnecessary. It would be necessary if the class needed constructor state, or if your logic accessed different methods or fields of the class in some order. In your case, all you do is call one method, so you don't need a new instance every time; just make the method you're calling static.
Here is what I would do:
1. Store the files you need to download in an array / List<>. Notice that you're doing the same thing every time; there's no need to write 4 different versions of that code. Store the items in one collection, then just change the index in the call!
2. Set up a single timer with, say, a 5-second interval. When it reaches the 20-min / 35-min / etc. mark, spawn a new thread to do the task. That way a new task can start even if the previous one hasn't finished; see the sketch below.
3. Wait for all threads to complete (terminate). When they do, check whether they threw exceptions, and handle/log them if necessary.
4. After everything is done, terminate the program.
For step 2, you also have the option of using the async keyword if you're on .NET 4.5, but it won't make a noticeable difference compared with using threads manually.
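A minimal sketch of that plan: the 20/35/50/65-minute offsets come from the question, DownloadFilesByPeriod is a stub standing in for the real download logic, and Task.Delay plays the role of the one-shot timers.

using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class Scheduler
{
    // One schedule, one code path: each period is just an index into this array.
    static readonly TimeSpan[] Offsets =
    {
        TimeSpan.FromMinutes(20), TimeSpan.FromMinutes(35),
        TimeSpan.FromMinutes(50), TimeSpan.FromMinutes(65)
    };

    static void Main()
    {
        DateTime taskRunDateTime = DateTime.Now;
        using (var cts = new CancellationTokenSource())
        {
            // Start one task per period at its scheduled offset; tasks may overlap.
            Task[] downloads = Offsets
                .Select((offset, period) => Task.Run(async () =>
                {
                    await Task.Delay(offset, cts.Token);
                    DownloadFilesByPeriod(taskRunDateTime, period, cts.Token);
                }, cts.Token))
                .ToArray();

            try
            {
                Task.WaitAll(downloads); // wait for every period to finish
            }
            catch (AggregateException ex)
            {
                foreach (Exception inner in ex.InnerExceptions)
                    Console.WriteLine(inner); // log each failure
            }
        }
    }

    static void DownloadFilesByPeriod(DateTime runTime, int period, CancellationToken token)
    {
        // Stub: the question's download loop goes here.
    }
}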
And as for why it's so slow: why don't you check your system status in Task Manager? Is the CPU maxed out, or is the network throughput occupied by something else? You can easily tell the answer yourself from there.
The problem was the SFTP client.
The purpose of the console application was to loop through a List<> and download the files. I tried WinSCP and, even though it did the job, it was very slow. I also tested SharpSSH, and it was even slower than WinSCP.
I finally ended up using SSH.NET which, at least in my particular case, was much faster than both WinSCP and SharpSSH. I think the problem with WinSCP was that there was no evident way of disconnecting after I was done. With SSH.NET I could connect/disconnect after every file download, something I couldn't do with WinSCP.
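A minimal sketch of that connect/disconnect-per-file pattern with SSH.NET (Renci.SshNet); the host, port, and credentials are placeholders, and FilesStruct is borrowed from the question:

using Renci.SshNet;

void DownloadAll(FilesStruct[] files)
{
    foreach (FilesStruct u in files)
    {
        using (var sftp = new SftpClient("host", 22, "user", "password"))
        {
            sftp.Connect();
            using (var local = System.IO.File.Create(u.LocalFilePath))
            {
                sftp.DownloadFile(u.RemoteFilePath, local);
            }
            sftp.Disconnect(); // explicitly drop the connection after each file
        }
    }
}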

Track dead WebDriver instances during parallel task

I am seeing some dead-instance weirdness running parallelized nested-loop web stress tests using Selenium WebDriver, simple example being, say, hit 300 unique pages with 100 impressions each.
I'm "successfully" getting 4 - 8 WebDriver instances going using a ThreadLocal<FirefoxWebDriver> to isolate them per task thread, and MaxDegreeOfParallelism on a ParallelOptions instance to limit the threads. I'm partitioning and parallelizing the outer loop only (the collection of pages), and checking .IsValueCreated on the ThreadLocal<> container inside the beginning of each partition's "long running task" method. To facilitate cleanup later, I add each new instance to a ConcurrentDictionary keyed by thread id.
No matter what parallelizing or partitioning strategy I use, the WebDriver instances will occasionally do one of the following:
Launch but never show a URL or run an impression
Launch, run any number of impressions fine, then just sit idle at some point
When either of these happen, the parallel loop eventually seems to notice that a thread isn't doing anything, and it spawns a new partition. If n is the number of threads allowed, this results in having n productive threads only about 50-60% of the time.
Cleanup still works fine at the end; there may be 2n open browsers or more, but the productive and unproductive ones alike get cleaned up.
Is there a way to monitor for these useless WebDriver instances and a) scavenge them right away, plus b) get the parallel loop to replace the task segment immediately, instead of lagging behind for several minutes as it often does now?
I was having a similar problem. It turns out that WebDriver doesn't have the best method for finding open ports. As described here, it takes a system-wide lock on ports, finds an open port, and then starts the instance. This can starve the other instances you're trying to start of ports.
I got around this by specifying a random port number directly in the delegate for the ThreadLocal<IWebDriver> like this:
var ports = new List<int>();
var rand = new Random((int)DateTime.Now.Ticks & 0x0000FFFF);
var driver = new ThreadLocal<IWebDriver>(() =>
{
    var profile = new FirefoxProfile();
    var port = rand.Next(50) + 7050;
    while (ports.Contains(port) && ports.Count != 50)
        port = rand.Next(50) + 7050;
    profile.Port = port;
    ports.Add(port);
    return new FirefoxDriver(profile);
});
This works pretty consistently for me, although the case where all 50 ports in the list end up used remains unresolved.
Since there is no OnReady event nor an IsReady property, I worked around it by sleeping the thread for several seconds after creating each instance. Doing that seems to give me 100% durable, functioning WebDriver instances.
Thanks to your suggestion, I've implemented IsReady functionality in my open-source project Webinator. Use that if you want, or use the code outlined below.
I tried instantiating 25 instances, and all of them were functional, so I'm pretty confident in the algorithm at this point (I leverage HtmlAgilityPack to check whether elements exist, but I'll skip that for the sake of simplicity here):
public void WaitForReady(IWebDriver driver)
{
    var js = @"{ var temp = document.createElement('div'); temp.id = 'browserReady'; " +
             @"b = document.getElementsByTagName('body')[0]; b.appendChild(temp); }";
    ((IJavaScriptExecutor)driver).ExecuteScript(js);
    WaitForSuccess(() =>
    {
        IWebElement element = null;
        try
        {
            element = driver.FindElement(By.Id("browserReady"));
        }
        catch
        {
            // element not found
        }
        return element != null;
    },
    timeoutInMilliseconds: 10000);
    js = @"{ var temp = document.getElementById('browserReady'); " +
         @"temp.parentNode.removeChild(temp); }";
    ((IJavaScriptExecutor)driver).ExecuteScript(js);
}
private bool WaitForSuccess(Func<bool> action, int timeoutInMilliseconds)
{
    if (action == null) return false;
    bool success;
    const int PollRate = 250;
    var maxTries = timeoutInMilliseconds / PollRate;
    int tries = 0;
    do
    {
        success = action();
        tries++;
        if (!success && tries <= maxTries)
        {
            Thread.Sleep(PollRate);
        }
    }
    while (!success && tries < maxTries);
    return success;
}
The assumption is that if the browser is responding to JavaScript calls and can find elements, then it's probably a reliable instance that's ready to be used.
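Usage is just a call after constructing each driver; for example (the URL is a placeholder):

IWebDriver driver = new FirefoxDriver();
driver.Navigate().GoToUrl("http://example.com");
WaitForReady(driver); // blocks until the browser executes JS and can locate elements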

UploadValuesAsync response time

I am writing a test harness to test an HTTP POST. The test case sends 8 HTTP requests using UploadValuesAsync in the WebClient class, then sleeps for 10 seconds after every 8 requests. I record the start and end time of each request. When I compute the average response time, I get around 800 ms. But when I run the same test case synchronously, using the WebClient UploadValues method, I get an average response time of 250 ms. Can you tell me why there is a difference between these two methods? I was expecting the lower response time with async, but I did not get it.
Here is the code that sends 8 requests async:
var count = 0;
foreach (var nameValueCollection in requestCollections)
{
    count++;
    NameValueCollection collection = nameValueCollection;
    PostToURL(collection, uri);
    if (count % 8 == 0)
    {
        Thread.Sleep(TimeSpan.FromSeconds(10));
        count = 0;
    }
}
UPDATED
Here is the code that sends 8 requests sync:
public void PostToURLSync(NameValueCollection collection, Uri uri)
{
    var response = new ServiceResponse
    {
        Response = "Not Started",
        Request = string.Join(";", collection.Cast<string>()
            .Select(col => String.Concat(col, "=", collection[col])).ToArray()),
        ApplicationId = collection["ApplicationId"]
    };
    try
    {
        using (var transportType2 = new DerivedWebClient())
        {
            transportType2.Expect100Continue = false;
            transportType2.Timeout = TimeSpan.FromMilliseconds(2000);
            response.StartTime = DateTime.Now;
            var responseBytes = transportType2.UploadValues(uri, "POST", collection);
            response.EndTime = DateTime.Now;
            response.Response = Encoding.Default.GetString(responseBytes);
        }
    }
    catch (Exception exception)
    {
        Console.WriteLine(exception.ToString());
    }
    response.ResponseInMs = (int)response.EndTime.Subtract(response.StartTime).TotalMilliseconds;
    responses.Add(response);
    Console.WriteLine(response.ResponseInMs);
}
Here is the code that posts to the HTTP URI:
public void PostToURL(NameValueCollection collection, Uri uri)
{
    var response = new ServiceResponse
    {
        Response = "Not Started",
        Request = string.Join(";", collection.Cast<string>()
            .Select(col => String.Concat(col, "=", collection[col])).ToArray()),
        ApplicationId = collection["ApplicationId"]
    };
    try
    {
        using (var transportType2 = new DerivedWebClient())
        {
            transportType2.Expect100Continue = false;
            transportType2.Timeout = TimeSpan.FromMilliseconds(2000);
            response.StartTime = DateTime.Now;
            transportType2.UploadValuesCompleted += new UploadValuesCompletedEventHandler(transportType2_UploadValuesCompleted);
            transportType2.UploadValuesAsync(uri, "POST", collection, response);
        }
    }
    catch (Exception exception)
    {
        Console.WriteLine(exception.ToString());
    }
}
Here is the upload-completed event handler:
private void transportType2_UploadValuesCompleted(object sender, UploadValuesCompletedEventArgs e)
{
    var now = DateTime.Now;
    var response = (ServiceResponse)e.UserState;
    response.EndTime = now;
    response.ResponseInMs = (int)response.EndTime.Subtract(response.StartTime).TotalMilliseconds;
    Console.WriteLine(response.ResponseInMs);
    if (e.Error != null)
    {
        response.Response = e.Error.ToString();
    }
    else if (e.Result != null && e.Result.Length > 0)
    {
        string downloadedData = Encoding.Default.GetString(e.Result);
        response.Response = downloadedData;
    }
    // Recording response in global variable
    responses.Add(response);
}
One problem you're probably running into is that .NET, by default, throttles outgoing HTTP connections to the limit (2 concurrent connections per remote host) mandated by the relevant RFC. Assuming 2 concurrent connections and 250 ms per request, the response time for your first 2 requests will be 250 ms, the second 2 will be 500 ms, the third 750 ms, and the last 1000 ms. This yields a 625 ms average response time, which is not far from the 800 ms you're seeing.
To remove the throttling, increase ServicePointManager.DefaultConnectionLimit to the maximum number of concurrent connections you want to support, and you should see your average response time go down a lot.
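For example, one line set before the first request goes out; 8 matches the batch size in the question and is an assumed value:

// Raise the per-host connection cap before issuing any requests.
System.Net.ServicePointManager.DefaultConnectionLimit = 8;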
A secondary problem may be that the server itself is slower handling multiple concurrent connections than handing one request at a time. Even once you unblock the throttling problem above, I'd expect each of the async requests to, on average, execute somewhat slower than if the server was only executing one request at a time. How much slower depends on how well the server is optimized for concurrent requests.
A final problem may be caused by test methodology. For example, if your test client simulates a browser session by storing cookies and re-sending them with each request, that may run into problems with servers that serialize requests from a single user. This is often a simplification for server apps so they won't have to deal with locking cross-request state like session state. If you're running into this problem, make sure that each WebClient sends different cookies to simulate different users.
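One way to do that, sketched under the assumption that DerivedWebClient is your own WebClient subclass (its real members aren't shown in the question), is to give each instance its own CookieContainer:

using System.Net;

public class DerivedWebClient : WebClient
{
    // Each instance gets its own cookie jar, so each client behaves as a distinct user.
    private readonly CookieContainer _cookies = new CookieContainer();

    protected override WebRequest GetWebRequest(Uri address)
    {
        var request = base.GetWebRequest(address);
        var http = request as HttpWebRequest;
        if (http != null)
        {
            http.CookieContainer = _cookies;
        }
        return request;
    }
}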
I'm not saying that you're running into all three of these problems; you might be running into only 1 or 2, but these are the most likely culprits for what you're seeing.
As Justin suggested, I tried ServicePointManager.DefaultConnectionLimit, but that did not fix the issue. I could not reproduce the other problems Justin described; I am not sure how to reproduce them in the first place.
What I did was run the same piece of code on a peer machine, where it ran with exactly the response times I expected. The difference between the two machines is the operating system: mine runs Windows Server 2003 and the other machine runs Windows Server 2008.
Since it worked on the other machine, I suspect it might be one of the problems Justin specified, or server settings on 2003, or something else. I did not spend much time digging into this afterwards; as this is a test harness, the issue had low priority, and we left it there.
As I have no clue what exactly fixed it, I am not accepting any answer other than this one, because at the very least I know that switching to Server 2008 fixed the issue.
