I have a function in my program that creates new widgets to represent data. However, whenever a widget is created I get a lot of "AutoRelease with no NSAutoReleasePool in place" error messages. Since an NSAutoReleasePool should be created automatically on the main thread, I have an inkling that these error messages appear because an async callback might be running my widget code on another thread...
This is the function called to create widgets to represent the latest information. This function is called pretty often:
private void CreateAndDisplayTvShowWidget (TvShow show)
{
var Widget = new TvShowWidgetController (show);
Widget.OnRemoveWidget += ConfirmRemoveTvShow;
Widget.View.SetFrameOrigin (new PointF (0, -150));
Widget.View.SetFrameSize (new SizeF (ContentView.Frame.Width, 150));
ContentView.AddSubview (Widget.View);
show.ShowWidget = Widget;
}
This function is usually called when this async function returns:
private static void WebRequestCallback (IAsyncResult result)
{
HttpWebRequest request = (HttpWebRequest)result.AsyncState;
HttpWebResponse response = (HttpWebResponse)request.EndGetResponse (result);
StreamReader responseStream = new StreamReader (response.GetResponseStream ());
string responseString = responseStream.ReadToEnd ();
responseStream.Close ();
ProcessResponse (responseString, request);
}
ProcessResponse (responseString, request) looks like this:
private static void ProcessResponse (string responseString, HttpWebRequest request)
{
string requestUrl = request.Address.ToString ();
if (requestUrl.Contains (ShowSearchTag)) {
List<TvShow> searchResults = TvDbParser.ParseTvShowSearchResults (responseString);
TvShowSearchTimeoutClock.Enabled = false;
OnTvShowSearchComplete (searchResults);
} else if (requestUrl.Contains (MirrorListTag)) {
MirrorList = TvDbParser.ParseMirrorList (responseString);
SendRequestsOnHold ();
} else if (requestUrl.Contains (TvShowBaseTag)) {
TvShowBase showBase = TvDbParser.ParseTvShowBase (responseString);
OnTvShowBaseRecieved (showBase);
} else if (requestUrl.Contains (ImagePathReqTag)) {
string showID = GetShowIDFromImagePathRequest (requestUrl);
TvShowImagePath imagePath = TvDbParser.ParseTvShowImagePath (showID, responseString);
OnTvShowImagePathRecieved (imagePath);
}
}
CreateAndDisplayTvShowWidget (TvShow show) is called when the event OnTvShowBaseRecieved (TvShow) is raised, which is when I get tons of error messages regarding NSAutoReleasePool...
The last two functions are part of what is supposed to be a cross-platform assembly, so I can't have any MonoMac-specific code in there...
I never call any auto-release or release code for my widgets, so I assume that the MonoMac bindings do this automatically as part of their garbage collection?
You can create autorelease pools at any point within the call stack; you can even have multiple nested autorelease pools within the same call stack. So you should be able to create your autorelease pools in the async entry functions.
You only need an NSAutoreleasePool if you use the auto-release features of objects. A solution is to create an NSAutoreleasePool around the code that manipulates auto-released objects (in the async callback).
Edit:
Have you tried to encapsulate the creation code with a NSAutoreleasePool ? As this is the only place where you call MonoMac code, this should solve the issue.
private void CreateAndDisplayTvShowWidget (TvShow show)
{
using(NSAutoreleasePool pool = new NSAutoreleasePool())
{
var Widget = new TvShowWidgetController (show);
Widget.OnRemoveWidget += ConfirmRemoveTvShow;
Widget.View.SetFrameOrigin (new PointF (0, -150));
Widget.View.SetFrameSize (new SizeF (ContentView.Frame.Width, 150));
ContentView.AddSubview (Widget.View);
show.ShowWidget = Widget;
}
}
Note that even if you don't use auto-released objects directly, there are some cases where the Cocoa API uses them under the hood.
I had a similar problem and it was the response.GetResponseStream that was the problem. I surrounded this code with...
using (NSAutoreleasePool pool = new NSAutoreleasePool()) {
}
... and that solved my problem.
I have some code that loads up an AppDomain (call it domain) and calls an object function within the domain. The purpose is to get a list of items from a USB device, using the device API to retrieve the information. The API requires a callback to return the information.
var domain = AppDomain.CreateDomain(
$"BiometricsDomain{System.IO.Path.GetRandomFileName()}");
var proxy = domain.CreateInstanceAndUnwrap(typeof(Proxy).Assembly.FullName, typeof(Proxy).FullName
?? throw new InvalidOperationException()) as Proxy;
var ids = proxy.GetIdentifications();
The proxy code loaded into the domain is as follows
public class Proxy : MarshalByRefObject
{
public List<String> GetIdentifications()
{
var control = new R100DeviceControl();
control.OnUserDB += Control_OnUserDB;
control.Open();
int nResult = control.DownloadUserDB(out int count);
// need to be able to return the list here but obviously that is not
// going to work.
}
private void Control_OnUserDB(List<String> result)
{
// Get the list of string from here
}
}
Is there a way to be able to wait on the device and return the information as needed when the callback is called? Since the GetIdentifications() has already returned I don't know how to get the
You can consider wrapping the Event-Based Asynchronous Pattern (EAP) operations as one task by using a TaskCompletionSource<TResult> so that the event can be awaited.
public class Proxy : MarshalByRefObject {
public List<String> GetIdentifications() {
var task = GetIdentificationsAsync();
return task.Result;
}
private Task<List<String>> GetIdentificationsAsync() {
var tcs = new TaskCompletionSource<List<string>>();
try {
var control = new R100DeviceControl();
Action<List<string>> handler = null;
handler = result => {
// Once event raised then set the
// Result property on the underlying Task.
control.OnUserDB -= handler;//optional to unsubscribe from event
tcs.TrySetResult(result);
};
control.OnUserDB += handler;
control.Open();
int count = 0;
//call async event
int nResult = control.DownloadUserDB(out count);
} catch (Exception ex) {
//Bubble the error up to be handled by calling client
tcs.TrySetException(ex);
}
// Return the underlying Task. The client code
// waits on the Result property, and handles exceptions
// in the try-catch block there.
return tcs.Task;
}
}
You can also improve on it by adding the ability to cancel using a CancellationToken for longer than expected callbacks.
With that, the synchronous proxy call blocks until the callback has delivered the result:
List<string> ids = proxy.GetIdentifications();
Reference: How to: Wrap EAP Patterns in a Task
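The cancellation idea mentioned above could be sketched roughly like this. It is only an illustration under assumptions: FakeDeviceControl is a made-up stand-in for R100DeviceControl (whose real event signature I don't know), and EapWrapper is a hypothetical helper name. The token registration moves the task to the Canceled state and unsubscribes the handler if the device takes too long.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the device API: raises OnUserDB later, from another thread.
public class FakeDeviceControl
{
    public event Action<List<string>> OnUserDB;
    private readonly int delayMs;

    public FakeDeviceControl(int delayMs) { this.delayMs = delayMs; }

    public void DownloadUserDB()
    {
        // Simulate the device answering after a delay on a worker thread.
        Task.Run(() =>
        {
            Thread.Sleep(delayMs);
            OnUserDB?.Invoke(new List<string> { "id-1", "id-2" });
        });
    }
}

public static class EapWrapper
{
    // Wraps the event in a Task; cancelling the token cancels the Task.
    public static Task<List<string>> GetIdentificationsAsync(
        FakeDeviceControl control, CancellationToken token)
    {
        var tcs = new TaskCompletionSource<List<string>>();
        Action<List<string>> handler = null;
        var registration = default(CancellationTokenRegistration);

        handler = result =>
        {
            control.OnUserDB -= handler;   // one-shot subscription
            registration.Dispose();        // the token is no longer needed
            tcs.TrySetResult(result);
        };
        control.OnUserDB += handler;

        // If the token fires first, unsubscribe and cancel the task instead.
        registration = token.Register(() =>
        {
            control.OnUserDB -= handler;
            tcs.TrySetCanceled();
        });

        control.DownloadUserDB();
        return tcs.Task;
    }
}
```

A caller could then pass something like new CancellationTokenSource(TimeSpan.FromSeconds(5)).Token so that a silent device does not block the proxy forever.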
NOTE: Though there may be more elegant solutions to the problem of asynchronous processing, the fact that this occurs in a child AppDomain warrants child AppDomain best practices. (see links below)
i.e.
do not allow code meant for a child AppDomain to be executed in the parent domain
do not allow complex types to bubble to the parent AppDomain
do not allow exceptions to cross AppDomain boundaries in the form of custom exception types
OP:
I am using it for fault tolerance
First I would probably add an Open or similar method to give time for the data to materialise.
var proxy = domain.CreateInstanceAndUnwrap(typeof(Proxy).Assembly.FullName, typeof(Proxy).FullName
?? throw new InvalidOperationException()) as Proxy;
proxy.Open(); // <------ new method here
.
. some time later
.
var ids = proxy.GetIdentifications();
Then in your proxy make these changes to allow for data processing to occur in the background so that by the time you call GetIdentifications data may be ready.
public class Proxy : MarshalByRefObject
{
ConcurrentBag<string> _results = new ConcurrentBag<string>();
public void Open()
{
var control = new R100DeviceControl();
control.OnUserDB += Control_OnUserDB;
control.Open();
// you may need to store nResult and count in a field?
int nResult = control.DownloadUserDB(out int count);
}
public List<String> GetIdentifications()
{
var copy = new List<string>();
while (_results.TryTake(out var x))
{
copy.Add(x);
}
return copy;
}
private void Control_OnUserDB(List<String> result)
{
// Copy each string from the callback's list into the bag
foreach (var item in result) _results.Add (item);
}
}
Now you could probably improve upon GetIdentifications to accept a timeout, in case GetIdentifications is called before data is ready, or is called multiple times before subsequent data arrives.
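A minimal sketch of that timeout idea, assuming you are free to swap the ConcurrentBag for a BlockingCollection (ResultBuffer, AddRange and Take are hypothetical names, not part of the device API). The first TryTake waits up to the timeout; after that the buffer is drained without waiting.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

public class ResultBuffer
{
    // Backed by a ConcurrentQueue by default, so items come out in arrival order.
    private readonly BlockingCollection<string> _results = new BlockingCollection<string>();

    // Intended to be called from the device callback.
    public void AddRange(IEnumerable<string> items)
    {
        foreach (var item in items) _results.Add(item);
    }

    // Waits up to timeoutMs for the first item, then drains whatever else is already buffered.
    public List<string> Take(int timeoutMs)
    {
        var copy = new List<string>();
        string item;
        if (_results.TryTake(out item, timeoutMs))
        {
            copy.Add(item);
            while (_results.TryTake(out item)) copy.Add(item);
        }
        return copy; // empty if nothing arrived within the timeout
    }
}
```

GetIdentifications could then simply return the result of Take with whatever timeout the caller passes in.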
More
How to: Run Partially Trusted Code in a Sandbox
Not sure why you just don't maintain a little state and then wait for the results in the call:
public class Proxy : MarshalByRefObject
{
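volatile bool runningCommand;
int lastResult;
List<String> lastIdentifications;
R100DeviceControl deviceControl;
R100DeviceControl DeviceControl { get{ if(deviceControl == null){ deviceControl = new R100DeviceControl(); deviceControl.OnUserDB += Control_OnUserDB; } return deviceControl; } }
public List<String> GetIdentifications()
{
if(runningCommand) return null;
DeviceControl.Open();
runningCommand = true;
lastResult = DeviceControl.DownloadUserDB(out int count);
// Wait for the callback to clear the flag (assumes OnUserDB is raised on another thread)
while(runningCommand) System.Threading.Thread.Sleep(10);
return lastIdentifications;
}
private void Control_OnUserDB(List<String> result)
{
// Keep the list of strings, then signal that the command has finished
lastIdentifications = result;
runningCommand = false;
}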
}
Once you have a pattern like this, you can easily switch between async and sync. Before, it was a little harder to understand because the async logic was integrated; this way you can implement the sync method first and then add an async wrapper if you desire.
Rookie here so please be nice!! I have had so much fun learning to program and gotten some great help along the way when google failed me. But alas, I'm stuck again.
I have a C# program that looks like this (It's MWS if anyone is familiar)
I've tried so many different ways to get this to effectively loop through a list of values in a text file. The problem I'm having is that the Main function is where I have to set the loop, but the BuildClass is where I need to cycle through the values in the text file (sentinel). I've included some stuff that probably isn't necessary just in case it is messing my code up and I don't realize it.
Here's what I've tried:
setting the loop inside the BuildClass - didn't expect it to work but it threw an exception before getting to the sentinel.
Reference the sentinel within the main function by changing the "using" or "var" in the main function sentinel to public - turned EVERYTHING red in visual studio
moving the string sentinel outside the main function so that the function and the BuildClass would recognize it - main function did not recognize it anymore.
I've tried so many other things unsuccessfully. I've gotten it to loop with the same sentinel value passed from BuildClass to the function over and over again but that's about it.
What I think I need:
A destructive version of streamReader that will remove the value from the text file when reading it. I'll put this inside the BuildClass, so that the next loop of the main function, the next value will be read and passed into the main function until the file is empty, terminating the loop.
an understanding of why changing sentinel to public destroys the code so badly. I have a decent understanding of why the other attempts wouldn't work.
namespace MainSpace
{
public class MainClass
{
int i;
public static void Main(string[] args)
{
ClientClass client = new ClientInterface(appName, appVersion, password, config);
MainClass sample = new MainClass(client);
string sentinel;
using (var streamReader = new StreamReader(@"sample.txt", true))
while((sentinel = streamReader.ReadLine()) != null)
{
try
{
//stuff
response = sample.InvokeBuild();
Console.WriteLine("Response Stuff");
string responseXml = response.ToXML();
Console.WriteLine(responseXml);
StreamWriter FileWrite = new StreamWriter("FileTest.xml", true);
FileWrite.WriteLine(responseXml);
FileWrite.Close();
}
catch (ExceptionsClass)
{
// Exception stuff
throw ex;
}
}
}
private readonly ClientInterface client;
public MainClass(ClientInterface client)
{
this.client = client;
}
public BuildClass InvokeBuild()
{
{
using (var streamReader = new StreamReader("sample.txt", true))
{
string sentinel = streamReader.ReadLine();
Thread.Sleep(6000);
i++;
Console.WriteLine("attempt " + i);
// Create a request.
RequestClass request = new RequestClass();
//Password Stuff
request.IdType = idType;
IdListType idList = new IdListType();
idList.Id.Add(sentinel);
request.IdList = idList;
return this.client.RequestClass(request);
}
}
}
}
I really hope there's someone experienced enough with both TPL and the System.Net classes and methods.
What started as a simple thought of using TPL on a currently sequential set of actions has brought my project to a halt.
I am still fresh with .NET, and I jumped straight into deep water by using TPL...
I was trying to extract an ASPX page's source/content (HTML) using WebClient.
There are multiple requests per day (around 20-30 pages to go through) to extract specific values out of the source code, and this is only one of a few daily tasks the server has on its list,
which led me to try implementing it with TPL to gain some speed.
I tried using Task.Factory.StartNew() to iterate over a few WebClient (WC) instances, but
on the first execution of WC the application just does not get any result from the WebClient.
This is my last try at it:
static void Main(string[] args)
{
EnumForEach<Act>(Execute);
Task.WaitAll();
}
public static void EnumForEach<Mode>(Action<Mode> Exec)
{
foreach (Mode mode in Enum.GetValues(typeof(Mode)))
{
Mode Curr = mode;
Task.Factory.StartNew(() => Exec(Curr) );
}
}
string ResultsDirectory = Environment.CurrentDirectory,
URL = "",
TempSourceDocExcracted ="",
ResultFile="";
enum Act
{
dolar, ValidateTimeOut
}
void Execute(Act Exc)
{
switch (Exc)
{
case Act.dolar:
URL = "http://www.AnyDomainHere.Com";
ResultFile =ResultsDirectory + "\\TempHtm.htm";
TempSourceDocExcracted = IeNgn.AgilityPacDocExtraction(URL).GetElementbyId("Dv_Main").InnerHtml;
File.WriteAllText(ResultFile, TempSourceDocExcracted);
break;
case Act.ValidateTimeOut:
URL = "http://www.AnotherDomainHere.Com";
ResultFile += "\\TempHtm.htm";
TempSourceDocExcracted = IeNgn.AgilityPacDocExtraction(URL).GetElementbyId("Dv_Main").InnerHtml;
File.WriteAllText(ResultFile, TempSourceDocExcracted);
break;
}
//usage of HtmlAgilityPack to extract Values of elements by their attributes/properties
public HtmlAgilityPack.HtmlDocument AgilityPacDocExtraction(string URL)
{
using (WC = new WebClient())
{
WC.Proxy = null;
WC.Encoding = Encoding.GetEncoding("UTF-8");
tmpExtractedPageValue = WC.DownloadString(URL);
retAglPacHtmDoc.LoadHtml(tmpExtractedPageValue);
return retAglPacHtmDoc;
}
}
What am I doing wrong? Is it possible to use a WebClient using TPL at all or should I use another tool (not being able to use IIS 7 / .net4.5)?
I see at least several issues:
Naming: FlNm is not a name. Visual Studio is a modern IDE with smart code completion, so there's no need to save keystrokes (you may start with the C# Coding Conventions; there are alternatives too, the main thing is to keep it consistent).
If you're using multithreading, you need to care about resource sharing. For example, FlNm is a static string and it is assigned inside each thread, so its value is not deterministic (also, even if it were running sequentially, the code would be faulty: you would be appending the file name to the path in each iteration, so it would end up like c:\TempHtm.htm\TempHtm.htm\TempHtm.htm).
You're writing to the same file from different threads (well, at least that was your intent, I think); usually that's a recipe for disaster in multithreading. The question is whether you need to write anything to disk at all, or whether the page can be downloaded as a string and parsed without touching the disk - there's a good example of what it means to touch the disk.
Overall, I think you should parallelize only the downloading, so do not involve HtmlAgilityPack in multithreading, as I think you don't know whether it is thread safe. Also, downloading has a good performance/thread count ratio; HTML parsing not so much, maybe only if the thread count equals the core count, but not more. What's more, I would separate downloading and parsing, as that is easier to test, understand and maintain.
Update: I don't understand your full intent, but this may help you get started (it's not production code; you should add retry/error handling, etc.).
Also, at the end is an extended WebClient class that allows you to get more threads spinning, because by default WebClient allows only two concurrent connections to the same host.
class Program
{
static void Main(string[] args)
{
var urlList = new List<string>
{
"http://google.com",
"http://yahoo.com",
"http://bing.com",
"http://ask.com"
};
var htmlDictionary = new ConcurrentDictionary<string, string>();
Parallel.ForEach(urlList, new ParallelOptions { MaxDegreeOfParallelism = 20 }, url => Download(url, htmlDictionary));
foreach (var pair in htmlDictionary)
{
Process(pair);
}
}
private static void Process(KeyValuePair<string, string> pair)
{
// do the html processing
}
private static void Download(string url, ConcurrentDictionary<string, string> htmlDictionary)
{
using (var webClient = new SmartWebClient())
{
htmlDictionary.TryAdd(url, webClient.DownloadString(url));
}
}
}
public class SmartWebClient : WebClient
{
private readonly int maxConcurentConnectionCount;
public SmartWebClient(int maxConcurentConnectionCount = 20)
{
this.maxConcurentConnectionCount = maxConcurentConnectionCount;
}
protected override WebRequest GetWebRequest(Uri address)
{
var httpWebRequest = (HttpWebRequest)base.GetWebRequest(address);
if (httpWebRequest == null)
{
return null;
}
if (maxConcurentConnectionCount != 0)
{
httpWebRequest.ServicePoint.ConnectionLimit = maxConcurentConnectionCount;
}
return httpWebRequest;
}
}
I am working on an app that searches for email addresses in Google search results' URLs. The problem is it needs to return the value it found in each page + the URL in which it found the email, to a datagridview with 2 columns: Email and URL.
I am using Parallel.ForEach for this one but of course it returns random URLs and not the ones it really found the email on.
public static string htmlcon; //htmlsource
public static List<string> emailList = new List<string>();
public static string Get(string url, bool proxy)
{
htmlcon = "";
try
{
HttpWebRequest req = (HttpWebRequest)WebRequest.Create(url);
if (proxy)
req.Proxy = new WebProxy(proxyIP + ":" + proxyPort);
req.Method = "GET";
req.UserAgent = Settings1.Default.UserAgent;
if (Settings1.Default.EnableCookies == true)
{
CookieContainer cont = new CookieContainer();
req.CookieContainer = cont;
}
WebResponse resp = req.GetResponse();
StreamReader SR = new StreamReader(resp.GetResponseStream());
htmlcon = SR.ReadToEnd();
Thread.Sleep(400);
resp.Close();
SR.Close();
}
catch (Exception)
{
Thread.Sleep(500);
}
return htmlcon;
}
private void copyMails(string url)
{
string emailPat = @"(\b[a-zA-Z0-9._%-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}\b)";
MatchCollection mailcol = Regex.Matches(htmlcon, emailPat, RegexOptions.Singleline);
foreach (Match mailMatch in mailcol)
{
email = mailMatch.Groups[1].Value;
if (!emailList.Contains(email))
{
emailList.Add(email);
Action dgeins = () => mailDataGrid.Rows.Insert(0, email, url);
mailDataGrid.BeginInvoke(dgeins);
}
}
}
private void SEbgWorker_DoWork(object sender, DoWorkEventArgs e)
{
//A LOT OF IRRELEVANT STUFF BEING RUN
Parallel.ForEach(allSElist.OfType<string>(), (s) =>
{
//Get URL
Get(s, Settings1.Default.Proxyset);
//match mails 1st page
copyMails(s);
});
}
so this is it: I execute a Get request(where "s" is the URL from the list) and then execute copyMails(s) from the URL's html source. It uses regex to copy the emails.
If I do it without parallel it returns the correct URL for each email in the datagridview. How can I do this in parallel and still get the correct match in the datagridview?
Thanks
You would be better off using PLINQ's Where to filter (pseudo code):
var results = from i in input.AsParallel()
let u = get the URL from i
let d = get the data from u
let v = try get the value from d
where v is found
select new {
Url = u,
Value = v
};
Underneath, the AsParallel means that the parallel (PLINQ) implementation of the LINQ operators (Select, Where, ...) is used.
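To make the pseudo code a bit more concrete, here is a rough runnable sketch. The fetch delegate is a hypothetical stand-in for the HTTP GET (so the sketch does not touch the network), PlinqSketch and FindEmails are made-up names, and the pattern is the one from the question. Each result carries its own URL, so nothing is shared between threads.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class PlinqSketch
{
    // Same pattern as in the question; Regex instance methods are thread safe.
    private static readonly Regex EmailMatcher =
        new Regex(@"\b[a-zA-Z0-9._%-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}\b");

    // Returns (email, url) pairs; the order of the pairs is not guaranteed.
    public static List<KeyValuePair<string, string>> FindEmails(
        IEnumerable<string> urls, Func<string, string> fetch)
    {
        return urls.AsParallel()
            // Keep the URL together with the HTML that was fetched for it.
            .Select(url => new { Url = url, Html = fetch(url) })
            // One output pair per email found on that page.
            .SelectMany(page => EmailMatcher.Matches(page.Html).Cast<Match>()
                .Select(m => new KeyValuePair<string, string>(m.Value, page.Url)))
            .ToList();
    }
}
```

In the real application fetch would be the Get method, and the returned pairs would be pushed to the grid on the UI thread.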
UPDATE: Now with more information
First there are a number of issues in your code:
The variable htmlcon is static but used directly by multiple threads. This could well be your underlying problem. Consider just two input values: the first Get completes, setting htmlcon; before that thread's call to copyMails starts, the second thread's Get completes its HTTP GET and overwrites htmlcon. With `email
The list emailList is also accessed by multiple threads without locking. Most collection types in .NET (and on any other programming platform) are not thread safe; you need to limit access to a single thread at a time.
You are mixing up various activities in each of your methods. Consider applying the single responsibility principle.
Thread.Sleep to handle an exception?! If you can't handle an exception (i.e. resolve the condition) then do nothing. In this case, if the action throws then the Parallel.ForEach will throw: that'll do until you define how to handle the HTTP GET failing.
Three suggestions:
In my experience clean code (to an obsessive degree) makes things easier: the details of the format
don't matter (one true brace style is better, but consistency is the key). Just going through
and cleaning up the formatting showed up issues #1 and #2.
Good naming. Don't abbreviate anything used over more than a few lines of code unless that is a
significant term for the domain. Eg. s for the action parameter in the parallel loop is really a url
so call it that. This kind of thing immediately makes the code easier to follow.
Think about that regex for emails: there are many valid emails that will not match (e.g. use of + to provide multiple logical addresses: example+one@gmail.com will be delivered to example@gmail.com and can then be used for local rules). Also an apostrophe ("'") is a valid character (and I have known people frustrated by web sites that refused their addresses by getting this wrong).
Second: A relatively direct clean up:
public static string Get(string url, bool proxy) {
HttpWebRequest req = (HttpWebRequest)WebRequest.Create(url);
if (proxy) {
req.Proxy = new WebProxy(proxyIP + ":" + proxyPort);
}
req.Method = "GET";
req.UserAgent = Settings1.Default.UserAgent;
if (Settings1.Default.EnableCookies == true) {
CookieContainer cont = new CookieContainer();
req.CookieContainer = cont;
}
using (WebResponse resp = req.GetResponse())
using (StreamReader SR = new StreamReader(resp.GetResponseStream())) {
return SR.ReadToEnd();
}
}
private static Regex emailMatcher = new Regex(@"(\b[a-zA-Z0-9._%-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}\b)", RegexOptions.Singleline);
private static string[] ExtractEmails(string htmlContent) {
return emailMatcher.Matches(htmlContent).OfType<Match>()
.Select(m => m.Groups[1].Value)
.Distinct()
.ToArray();
}
private void SEbgWorker_DoWork(object sender, DoWorkEventArgs e) {
Parallel.ForEach(allSElist.OfType<string>(), url => {
var htmlContent = Get(url, Settings1.Default.Proxyset);
var emails = ExtractEmails(htmlContent);
foreach (var email in emails) {
Action dgeins = () => mailDataGrid.Rows.Insert(0, email, url);
mailDataGrid.BeginInvoke(dgeins);
}
});
}
Here I have:
Made use of using statements to automate the cleanup of resources.
Eliminated all mutable shared state.
Regex is explicitly documented to have thread safe instance methods. So I only need a single instance.
Removed noise: no need to pass the URL to ExtractEmails because the extraction doesn't use the URL.
Get now only performs the HTTP GET, ExtractEmails just the extraction.
Third: The above will block threads on the slowest operation: the HTTP GET.
The real concurrency benefit would be to replace HttpWebRequest.GetResponse and reading the response stream with their asynchronous equivalents.
Using Task would be the answer in .NET 4, but you need to directly work with Stream and encoding yourself because StreamReader doesn't provide any BeginABC/EndABC method pairs. But .NET 4.5 is almost here, so apply some async/await:
Nothing to do in ExtractEmails.
Get is now asynchronous, blocking in neither the HTTP GET nor reading the result.
SEbgWorker_DoWork uses Tasks directly to avoid mixing too many different ways to work with TPL. Since Get returns a Task<string>, it can simply be continued with ContinueWith. Note that, unless you specify otherwise, a continuation runs however the previous task finished, including when it failed; pass TaskContinuationOptions.OnlyOnRanToCompletion to continue only after the previous task has completed successfully:
This should work in .NET 4.5, but without a set of valid URLs for which this will work I cannot test.
public static async Task<string> Get(string url, bool proxy) {
HttpWebRequest req = (HttpWebRequest)WebRequest.Create(url);
if (proxy) {
req.Proxy = new WebProxy(proxyIP + ":" + proxyPort);
}
req.Method = "GET";
req.UserAgent = Settings1.Default.UserAgent;
if (Settings1.Default.EnableCookies == true) {
CookieContainer cont = new CookieContainer();
req.CookieContainer = cont;
}
using (WebResponse resp = await req.GetResponseAsync())
using (StreamReader SR = new StreamReader(resp.GetResponseStream())) {
return await SR.ReadToEndAsync();
}
}
private static Regex emailMatcher = new Regex(@"(\b[a-zA-Z0-9._%-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}\b)", RegexOptions.Singleline);
private static string[] ExtractEmails(string htmlContent) {
return emailMatcher.Matches(htmlContent).OfType<Match>()
.Select(m => m.Groups[1].Value)
.Distinct()
.ToArray();
}
private void SEbgWorker_DoWork(object sender, DoWorkEventArgs e) {
var tasks = allSElist.OfType<string>()
.Select(url => {
return Get(url, Settings1.Default.Proxyset)
.ContinueWith(htmlContentTask => {
// OnlyOnRanToCompletion (see below), so Result is always available here
var htmlContent = htmlContentTask.Result;
var emails = ExtractEmails(htmlContent);
foreach (var email in emails) {
// No InvokeAsync on WinForms, so do this the old way.
Action dgeins = () => mailDataGrid.Rows.Insert(0, email, url);
mailDataGrid.BeginInvoke(dgeins);
}
}, TaskContinuationOptions.OnlyOnRanToCompletion);
})
.ToArray();
// Throws AggregateException if any Get failed (its continuation is then cancelled).
Task.WaitAll(tasks);
}
public static string htmlcon; //htmlsource
public static List<string> emailList = new List<string>();
The problem is that the members htmlcon and emailList are resources shared among threads and among iterations. The iterations of Parallel.ForEach are executed in parallel. That's why you see strange behaviour.
How to solve the problem:
Modify your code and try to implement it without static variables or shared state.
One option is to change from Parallel.ForEach to TPL Task chaining; with this change the result of one parallel operation becomes the input of the next. It is one option among many for modifying the code to avoid shared state.
Use locking or concurrent collections. Your htmlcon variable could be made volatile (although that alone does not stop one thread overwriting it before another thread has read it), but with the list you should use locks or concurrent collections.
The better way is to modify your code to avoid shared state; there are many options for doing that depending on your implementation, not only task chaining.
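If you do need one shared "already seen" set across all pages, a concurrent collection is one way to keep it safe. This is only a sketch under assumptions: SharedStateSketch and FirstUrlPerEmail are made-up names, and the input dictionary (URL to the emails already extracted from that page) stands in for the download and regex steps.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class SharedStateSketch
{
    // pages maps a URL to the emails extracted from it (hypothetical input).
    // Returns each distinct email together with one URL it was found on.
    public static ConcurrentDictionary<string, string> FirstUrlPerEmail(
        IDictionary<string, string[]> pages)
    {
        var seen = new ConcurrentDictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        Parallel.ForEach(pages, page =>
        {
            foreach (var email in page.Value)
            {
                // TryAdd is atomic: for a given email exactly one thread succeeds,
                // so the email and its URL always stay paired correctly.
                seen.TryAdd(email, page.Key);
            }
        });
        return seen;
    }
}
```

When the same email appears on several pages, which URL wins depends on thread timing, but the pair that is stored always belongs together.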
Suppose I have the following class:
public class FooBar
{
List<Items> _items = new List<Items>();
public List<Items> FetchItems(int parentItemId)
{
FetchSingleItem(parentItemId);
return _items;
}
private void FetchSingleItem(int itemId)
{
Uri url = new Uri(String.Format("http://SomeURL/{0}.xml", itemId));
HttpWebRequest webRequest = (HttpWebRequest)HttpWebRequest.Create(url);
webRequest.BeginGetResponse(ReceiveResponseCallback, webRequest);
}
void ReceiveResponseCallback(IAsyncResult result)
{
// End the call and extract the XML from the response and add item to list
_items.Add(itemFromXMLResponse);
// If this item is linked to another item then fetch that item
if (anotherItemIdExists == true)
{
FetchSingleItem(anotherItemId);
}
}
}
There could be any number of linked items that I will only know about at runtime.
What I want to do is make the initial call to FetchSingleItem and then wait until all calls have completed then return List<Items> to the calling code.
Could someone point me in the right direction? I'm more than happy to refactor the whole thing if need be (which I suspect will be the case!)
Getting the hang of asynchronous coding is not easy, especially when there is some sequential dependency between one operation and the next. This is the exact sort of problem that I wrote the AsyncOperationService to handle; it's a cunningly short bit of code.
First a little light reading for you: Simple Asynchronous Operation Runner – Part 2. By all means read part 1 but it's a bit heavier than I had intended. All you really need is the AsyncOperationService code from it.
Now in your case you would convert your fetch code to something like the following.
private IEnumerable<AsyncOperation> FetchItems(int startId)
{
XDocument itemDoc = null;
int currentId = startId;
while (currentId != 0)
{
yield return DownloadString(new Uri(String.Format("http://SomeURL/{0}.xml", currentId), UriKind.Absolute),
itemXml => itemDoc = XDocument.Parse(itemXml) );
// Do stuff with itemDoc like creating your item and placing it in the list.
// Assign the next linked ID to currentId or if no other items assign 0
}
}
Note the blog also has an implementation of DownloadString which in turn uses WebClient which simplifies things. However the principles still apply if for some reason you must stick with HttpWebRequest. (Let me know if you are having trouble creating an AsyncOperation for this)
You would then use this code like this:-
int startId = GetSomeIDToStartWith();
Foo myFoo = new Foo();
myFoo.FetchItems(startId).Run((err) =>
{
// Clear IsBusy
if (err == null)
{
// All items are now fetched continue doing stuff here.
}
else
{
// "Oops something bad happened" code here
}
});
// Set IsBusy
Note that the call to Run is asynchronous; code execution will appear to jump past it before all the items are fetched. If the UI is useless to the user, or even dangerous, while the fetch is pending, then you need to block it in a friendly way. The best way (IMO) to do this is with the BusyIndicator control from the toolkit, setting its IsBusy property after the call to Run and clearing it in the Run callback.
All you need is a thread sync thingy. I chose ManualResetEvent.
However, I don't see the point of using asynchronous IO since you always wait for the request to finish before starting a new one. But the example might not show the whole story?
public class FooBar
{
private ManualResetEvent _completedEvent = new ManualResetEvent(false);
List<Items> _items = new List<Items>();
public List<Items> FetchItems(int parentItemId)
{
FetchSingleItem(parentItemId);
_completedEvent.WaitOne();
return _items;
}
private void FetchSingleItem(int itemId)
{
Uri url = new Uri(String.Format("http://SomeURL/{0}.xml", itemId));
HttpWebRequest webRequest = (HttpWebRequest)HttpWebRequest.Create(url);
webRequest.BeginGetResponse(ReceiveResponseCallback, webRequest);
}
void ReceiveResponseCallback(IAsyncResult result)
{
// End the call and extract the XML from the response and add item to list
_items.Add(itemFromXMLResponse);
// If this item is linked to another item then fetch that item
if (anotherItemIdExists == true)
{
FetchSingleItem(anotherItemId);
}
else
_completedEvent.Set();
}
}
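For anyone who wants to try the ManualResetEvent pattern without a web server, here is a self-contained sketch of the same idea. It makes some assumptions: ChainFetcher is a made-up class, the links dictionary stands in for the "linked item" information in the XML, and ThreadPool.QueueUserWorkItem simulates BeginGetResponse calling back on another thread.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public class ChainFetcher
{
    private readonly ManualResetEvent _completed = new ManualResetEvent(false);
    private readonly List<int> _items = new List<int>();
    private readonly IDictionary<int, int> _links;

    // links maps an item id to the id it links to; ids without an entry end the chain.
    public ChainFetcher(IDictionary<int, int> links) { _links = links; }

    public List<int> FetchItems(int startId)
    {
        FetchSingleItem(startId);
        _completed.WaitOne();   // block the caller until the last callback signals
        return _items;
    }

    private void FetchSingleItem(int id)
    {
        // Simulated async request: the "response" arrives later on a pool thread.
        ThreadPool.QueueUserWorkItem(_ => ReceiveResponse(id));
    }

    private void ReceiveResponse(int id)
    {
        _items.Add(id);                 // only one callback is in flight at a time
        int next;
        if (_links.TryGetValue(id, out next))
            FetchSingleItem(next);      // follow the link to the next item
        else
            _completed.Set();           // end of the chain: release the waiting caller
    }
}
```

Note that a cycle in the links would never signal the event, so a real implementation would probably want WaitOne with a timeout.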