I'm using Redis to stream data. I have multiple producer instances producing the same data, aiming for eventual consistency.
Right now the producers generate trades with random trade ids between 1 and 2. I want a deduplication service, or something similar, that filters out duplicates based on the trade id. How do I do that?
Consumer
using System.Text.Json;
using Shared;
using StackExchange.Redis;

var tokenSource = new CancellationTokenSource();
var token = tokenSource.Token;

var muxer = ConnectionMultiplexer.Connect("localhost:6379");
var db = muxer.GetDatabase();

const string streamName = "positions";
const string groupName = "avg";

if (!await db.KeyExistsAsync(streamName) ||
    (await db.StreamGroupInfoAsync(streamName)).All(x => x.Name != groupName))
{
    await db.StreamCreateConsumerGroupAsync(streamName, groupName, "0-0");
}

var consumerGroupReadTask = Task.Run(async () =>
{
    var id = string.Empty;
    while (!token.IsCancellationRequested)
    {
        if (!string.IsNullOrEmpty(id))
        {
            await db.StreamAcknowledgeAsync(streamName, groupName, id);
            id = string.Empty;
        }

        var result = await db.StreamReadGroupAsync(streamName, groupName, "avg-1", ">", 1);
        if (result.Any())
        {
            id = result.First().Id;
            var dict = ParseResult(result.First());
            var trade = JsonSerializer.Deserialize<Trade>(dict["trade"]);
            Console.WriteLine($"Group read result: trade: {dict["trade"]}, time: {dict["time"]}");
        }

        await Task.Delay(1000);
    }
});

Console.ReadLine();

static Dictionary<string, string> ParseResult(StreamEntry entry)
{
    return entry.Values.ToDictionary(x => x.Name.ToString(), x => x.Value.ToString());
}
Producer
using System.Text.Json;
using Shared;
using StackExchange.Redis;

var tokenSource = new CancellationTokenSource();
var token = tokenSource.Token;

var muxer = ConnectionMultiplexer.Connect("localhost:6379");
var db = muxer.GetDatabase();

const string streamName = "positions";

var producerTask = Task.Run(async () =>
{
    var random = new Random();
    while (!token.IsCancellationRequested)
    {
        var trade = new Trade(random.Next(1, 3), "btcusdt", 25000, 2);
        var entry = new List<NameValueEntry>
        {
            new("trade", JsonSerializer.Serialize(trade)),
            new("time", DateTimeOffset.Now.ToUnixTimeSeconds())
        };

        await db.StreamAddAsync(streamName, entry.ToArray());
        await Task.Delay(2000);
    }
});

Console.ReadLine();
There are a couple of tactics you can use here, depending on the level of distribution required and the degree to which you can handle missing messages coming in from your stream. Here are a couple of workable solutions using Redis:
Use a Bloom filter when you can tolerate about a 1% miss in events
You can use a Bloom filter in Redis, which is a very compact, very fast way to determine whether a particular record has been seen before. If you run:
var hasBeenAdded = ((int)await db.ExecuteAsync("BF.ADD", "bf:trades", dict["trade"])) == 1;
If hasBeenAdded is true, you can definitively say that the record is not a duplicate; if it is false, there is a small probability (determined by how you configured the Bloom filter with BF.RESERVE) that it is a false positive and the record is actually new.
If you want to use a Bloom filter, you'll need to either side-load RedisBloom into your instance of Redis, or just use Redis Stack.
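For example, in your consumer you could gate the averaging logic on BF.ADD right after deserializing the trade. A rough sketch (it assumes RedisBloom/Redis Stack is loaded and that the Trade type exposes the id as trade.Id; the actual property name in your Shared project may differ):

// Sketch: deduplicate inside the consumer loop (assumes RedisBloom is available).
// BF.ADD returns 1 when the item was not present before, 0 when it (probably) was.
var isNew = (int)await db.ExecuteAsync("BF.ADD", "bf:trades", trade.Id.ToString()) == 1;
if (isNew)
{
    // First time this trade id is seen: include it in the average.
    Console.WriteLine($"Processing trade {trade.Id}");
}
else
{
    // Probably a duplicate (small false-positive chance): just acknowledge and skip.
    Console.WriteLine($"Skipping duplicate trade {trade.Id}");
}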
Use a Sorted Set when misses aren't acceptable
If your app cannot tolerate a miss, you are probably wiser to use a Set or a Sorted Set; in general I'd advise the Sorted Set, because it is much easier to clean up.
Basically, with a sorted set you would check whether a record has already been counted in your average by using ZSCORE zset:trades trade-id; if a score comes back, you know the record has been used already, otherwise you can add it to the sorted set. Importantly, because the sorted set grows linearly, you will want to clean it up periodically: if you use the timestamp from the message id as the score, you can pick a workable interval at which to go back and run ZREMRANGEBYSCORE to clear out older records.
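With StackExchange.Redis, the sorted-set variant could look roughly like this inside the consumer (a sketch: it uses ZADD NX via SortedSetAddAsync with When.NotExists, which returns true only when the member was newly added, and again assumes a trade.Id property):

// Sketch: exact deduplication with a sorted set, scored by timestamp for later cleanup.
var score = DateTimeOffset.UtcNow.ToUnixTimeSeconds();

// ZADD NX: returns true only if this trade id was not already in the set.
var isNew = await db.SortedSetAddAsync("zset:trades", trade.Id.ToString(), score, When.NotExists);
if (isNew)
{
    // First occurrence of this trade id: safe to include in the average.
}

// Periodically (e.g. on a timer) trim entries older than an hour.
var cutoff = DateTimeOffset.UtcNow.AddHours(-1).ToUnixTimeSeconds();
await db.SortedSetRemoveRangeByScoreAsync("zset:trades", double.NegativeInfinity, cutoff);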
I'm connecting to the MS Graph API and fetching transitive group data via the following logic:
var queryOptions = new List<QueryOption>()
{
    new QueryOption("$count", "true")
};

var lstTemp = graphClient.Groups[$"{groupID}"].TransitiveMembers
    .Request(queryOptions)
    .Header("ConsistencyLevel", "eventual")
    .Select("id,mail,onPremisesSecurityIdentifier").Top(999)
    .GetAsync().GetAwaiter().GetResult();

var lstGroups = lstTemp.CurrentPage.Where(x => x.ODataType.Contains("group")).ToList();

while (lstTemp.NextPageRequest != null)
{
    lstTemp = lstTemp.NextPageRequest.GetAsync().GetAwaiter().GetResult();
    lstGroups.AddRange(lstTemp.CurrentPage.Where(x => x.ODataType.Contains("group")).ToList());
}
Although the above logic works fine, for larger data sets where the result count could be around 10K records or more, I've noticed the time required to fetch all of the results is around 10-12 seconds.
I'm looking for a solution in which the API calls are parallelized (multi-threaded/multi-tasked) so that the overall time to get the complete results is further reduced.
In C# we have Parallel.For etc.; can I use it in this scenario to replace my regular while loop above?
Any suggestions?
Not really with the Parallel.For API, but you can execute a bunch of asynchronous tasks concurrently by throwing them into a List<Task<T>> and awaiting the whole list with Task.WhenAll. Your code may look something like this:
var queryOptions = new List<QueryOption>()
{
    new QueryOption("$count", "true")
};

// Creating the first request
var firstRequest = graphClient.Groups[$"{groupID}"].TransitiveMembers
    .Request(queryOptions)
    .Header("ConsistencyLevel", "eventual")
    .Select("id,mail,onPremisesSecurityIdentifier").Top(999)
    .GetAsync();

// Creating a list of all requests (starting with the first one)
var requests = new List<Task<IGroupTransitiveMembersCollectionWithReferencesPage>>() { firstRequest };

// Awaiting the first response
var firstResponse = await firstRequest;

// Getting the total count from the response
var count = (int) firstResponse.AdditionalData["@odata.count"];

// Setting offset to the amount of data you already pulled
var offset = 999;
while (offset < count)
{
    // Creating the next request
    var nextRequest = graphClient.Groups[$"{groupID}"].TransitiveMembers
        .Request() // Notice no $count=true (may potentially hurt performance and we don't need it anymore anyway)
        .Header("ConsistencyLevel", "eventual")
        .Select("id,mail,onPremisesSecurityIdentifier")
        .Skip(offset).Top(999) // Skipping the data you already pulled
        .GetAsync();

    // Adding it to the list
    requests.Add(nextRequest);

    // Increasing the offset
    offset += 999;
}

// Waiting for all the requests to finish
var allResponses = await Task.WhenAll(requests);

// This flattens the pages into one list while filtering as you did
var allGroups = allResponses
    .Select(x => x.CurrentPage)
    .SelectMany(page => page.Where(x => x.ODataType.Contains("group")))
    .ToList();
I couldn't check whether this code works, as I don't have a Graph tenant to test against, so you might need to modify it a bit, but I hope you can see the general idea.
I also allowed myself to refactor the code to use proper async/await, since that is good and standard practice; it should still work with .GetAwaiter().GetResult() if you can't use await in your context for some reason (please reconsider that, though).
I am using TPL Dataflow to download data from a ticketing system.
The system takes the ticket number as the input, calls an API and receives a nested JSON response with various information. Once received, a set of blocks handles each level of the nested structure and writes it to a relational database. e.g. Conversation, Conversation Attachments, Users, User Photos, User Tags, etc
Json
{
    "conversations": [
        {
            "id": 12345,
            "author_id": 23456,
            "body": "First Conversation"
        },
        {
            "id": 98765,
            "authorid": 34567,
            "body": "Second Conversation",
            "attachments": [
                {
                    "attachmentid": 12345,
                    "attachment_name": "Test.jpg"
                }
            ]
        }
    ],
    "users": [
        {
            "userid": 12345,
            "user_name": "John Smith"
        },
        {
            "userid": 34567,
            "user_name": "Joe Bloggs",
            "user_photo": {
                "photoid": 44556,
                "photo_name": "headshot.jpg"
            },
            "tags": [
                "development",
                "vip"
            ]
        }
    ]
}
Code
Some blocks need to broadcast so that deeper nesting can still have access to the data. e.g. UserModelJson is broadcast so that 1 block can handle writing the user, 1 block can handle writing the User Tags and 1 block can handle writing the User Photos.
var loadTicketsBlock = new TransformBlock<int, ConversationsModelJson>(async ticketNumber => await p.GetConversationObjectFromTicket(ticketNumber));
var broadcastConversationsObjectBlock = new BroadcastBlock<ConversationsModelJson>(conversations => conversations);
//Conversation
var getConversationsFromConversationObjectBlock = new TransformManyBlock<ConversationsModelJson, ConversationModelJson>(conversation => ModelConverter.ConvertConversationsObjectJsonToConversationJson(conversation));
var convertConversationsBlock = new TransformBlock<ConversationModelJson, ConversationModel>(conversation => ModelConverter.ConvertConversationJsonToConversation(conversation));
var batchConversionBlock = new BatchBlock<ConversationModel>(batchBlockSize);
var convertConversationsToDTBlock = new TransformBlock<IEnumerable<ConversationModel>, DataTable>(conversations => ModelConverter.ConvertConversationModelToConversationDT(conversations));
var writeConversationsBlock = new ActionBlock<DataTable>(async conversations => await p.ProcessConversationsAsync(conversations));
var getUsersFromConversationsBlock = new TransformManyBlock<ConversationsModelJson, UserModelJson>(conversations => ModelConverter.ConvertConversationsJsonToUsersJson(conversations));
var broadcastUserBlock = new BroadcastBlock<UserModelJson>(userModelJson => userModelJson);
//User
var convertUsersBlock = new TransformBlock<UserModelJson, UserModel>(user => ModelConverter.ConvertUserJsonToUser(user));
var batchUsersBlock = new BatchBlock<UserModel>(batchBlockSize);
var convertUsersToDTBlock = new TransformBlock<IEnumerable<UserModel>, DataTable>(users => ModelConverter.ConvertUserModelToUserDT(users));
var writeUsersBlock = new ActionBlock<DataTable>(async users => await p.ProcessUsersAsync(users));
//UserTag
var getUserTagsFromUserBlock = new TransformBlock<UserModelJson, UserTagModel>(user => ModelConverter.ConvertUserJsonToUserTag(user));
var batchTagsBlock = new BatchBlock<UserTagModel>(batchBlockSize);
var convertTagsToDTBlock = new TransformBlock<IEnumerable<UserTagModel>, DataTable>(tags => ModelConverter.ConvertUserTagModelToUserTagDT(tags));
var writeTagsBlock = new ActionBlock<DataTable>(async tags => await p.ProcessUserTagsAsync(tags));
DataflowLinkOptions linkOptions = new DataflowLinkOptions()
{
    PropagateCompletion = true
};
loadTicketsBlock.LinkTo(broadcastConversationsObjectBlock, linkOptions);
//Conversation
broadcastConversationsObjectBlock.LinkTo(getConversationsFromConversationObjectBlock, linkOptions);
getConversationsFromConversationObjectBlock.LinkTo(convertConversationsBlock, linkOptions);
convertConversationsBlock.LinkTo(batchConversionBlock, linkOptions);
batchConversionBlock.LinkTo(convertConversationsToDTBlock, linkOptions);
convertConversationsToDTBlock.LinkTo(writeConversationsBlock, linkOptions);
var tickets = await provider.GetAllTicketsAsync();
foreach (var ticket in tickets)
{
    cts.Token.ThrowIfCancellationRequested();
    await loadTicketsBlock.SendAsync(ticket.TicketID);
}
loadTicketsBlock.Complete();
The LinkTo blocks are repeated for each type of data to be written.
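For example, the user and user-tag branches are wired up the same way, roughly:
//User (same pattern, repeated for the user branch)
broadcastConversationsObjectBlock.LinkTo(getUsersFromConversationsBlock, linkOptions);
getUsersFromConversationsBlock.LinkTo(broadcastUserBlock, linkOptions);
broadcastUserBlock.LinkTo(convertUsersBlock, linkOptions);
convertUsersBlock.LinkTo(batchUsersBlock, linkOptions);
batchUsersBlock.LinkTo(convertUsersToDTBlock, linkOptions);
convertUsersToDTBlock.LinkTo(writeUsersBlock, linkOptions);
//UserTag
broadcastUserBlock.LinkTo(getUserTagsFromUserBlock, linkOptions);
getUserTagsFromUserBlock.LinkTo(batchTagsBlock, linkOptions);
batchTagsBlock.LinkTo(convertTagsToDTBlock, linkOptions);
convertTagsToDTBlock.LinkTo(writeTagsBlock, linkOptions);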
I know when the whole pipeline is complete by using
await Task.WhenAll(<Last block of each branch>.Completion);
but if I pass ticket number 1 into loadTicketsBlock, how do I know when that specific ticket has been through all blocks in the pipeline and is therefore complete?
The reason that I want to know this is so that I can report to the UI that ticket 1 of 100 is complete.
You could consider using the TaskCompletionSource as the base class for all your sub-entities. For example:
class Attachment : TaskCompletionSource
{
}
class Conversation : TaskCompletionSource
{
}
Then every time you insert a sub-entity in the database, you mark it as completed:
attachment.SetResult();
...or if the insert fails, mark it as faulted:
attachment.SetException(ex);
Finally you can combine all the asynchronous completions in one, with the method Task.WhenAll:
Task ticketCompletion = Task.WhenAll(Enumerable.Empty<Task>()
    .Append(ticket.Task)
    .Concat(attachments.Select(e => e.Task))
    .Concat(conversations.Select(e => e.Task)));
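From there, reporting progress to the UI is just a matter of observing these combined tasks as they complete. A rough sketch (ticketCompletions and totalTickets are placeholders for however you collect the per-ticket tasks and the total count):

// Sketch: report progress in completion order.
// ticketCompletions (a List<Task>) and totalTickets are assumed to exist in your orchestration code.
var pending = ticketCompletions.ToList();
var completed = 0;
while (pending.Count > 0)
{
    var finished = await Task.WhenAny(pending);
    pending.Remove(finished);
    completed++;
    Console.WriteLine($"Ticket {completed} of {totalTickets} is complete");
}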
If I am tracking progress in Dataflow, I will usually set up the last block as a "notify the UI of progress" block. To be able to track the progress of your inputs, you need to keep the context of the original input in all the objects you pass around; in this case you need to be able to tell that you are working on ticket 1 all the way through the pipeline. If one of your transforms drops the context that it is working on ticket 1, you will need to rethink the object types you are passing through the pipeline so that you can keep that context.
A simple example of what I'm talking about is laid out below, with a broadcast block going to three transform blocks, and then all three transform blocks going back to an action block that reports the progress of the pipelines.
When combining back into the single action block, make sure not to propagate completion at that point: as soon as one block propagates completion to the action block, the action block stops accepting input. Instead, wait for the last block of each pipeline to complete, and only then manually complete the final "notify the UI" action block.
using System;
using System.Threading.Tasks.Dataflow;
using System.Threading.Tasks;
using System.Collections.Generic;

public class Program
{
    public static void Main()
    {
        var broadcastBlock = new BroadcastBlock<string>(x => x);
        var transformBlockA = new TransformBlock<string, string>(x =>
        {
            return x + "A";
        });
        var transformBlockB = new TransformBlock<string, string>(x =>
        {
            return x + "B";
        });
        var transformBlockC = new TransformBlock<string, string>(x =>
        {
            return x + "C";
        });

        var ticketTracking = new Dictionary<int, List<string>>();
        var notifyUiBlock = new ActionBlock<string>(x =>
        {
            var ticketNumber = int.Parse(x.Substring(5, 1));
            var taskLetter = x.Substring(7, 1);
            var success = ticketTracking.TryGetValue(ticketNumber, out var tasksComplete);
            if (!success)
            {
                tasksComplete = new List<string>();
                ticketTracking[ticketNumber] = tasksComplete;
            }
            tasksComplete.Add(taskLetter);
            if (tasksComplete.Count == 3)
            {
                Console.WriteLine($"Ticket {ticketNumber} is complete");
            }
        });

        DataflowLinkOptions linkOptions = new DataflowLinkOptions() { PropagateCompletion = true };

        broadcastBlock.LinkTo(transformBlockA, linkOptions);
        broadcastBlock.LinkTo(transformBlockB, linkOptions);
        broadcastBlock.LinkTo(transformBlockC, linkOptions);

        transformBlockA.LinkTo(notifyUiBlock);
        transformBlockB.LinkTo(notifyUiBlock);
        transformBlockC.LinkTo(notifyUiBlock);

        for (var i = 0; i < 5; i++)
        {
            broadcastBlock.Post($"Task {i} ");
        }

        broadcastBlock.Complete();
        Task.WhenAll(transformBlockA.Completion, transformBlockB.Completion, transformBlockC.Completion).Wait();

        notifyUiBlock.Complete();
        notifyUiBlock.Completion.Wait();

        Console.WriteLine("Done");
    }
}
This will give an output similar to this
Ticket 0 is complete
Ticket 1 is complete
Ticket 2 is complete
Ticket 3 is complete
Ticket 4 is complete
Done
I have a list of items (32007 of them)
I am adding them in bulk
However, it seems as though some are not being inserted
The even stranger thing is that if I run my process several times from scratch (i.e. recreating my collection), the number of items created varies; I have seen 32005 and 32003.
I have scaled up my collection to have a lot of RUs (autoscale, up to 25000).
I split the data into tranches of 100
My logic is below
public async Task<List<Account>> ProcessAccountsAsync(List<Account> accounts)
{
    var cosmosConnection = await ConnectToDatabaseAsync().ConfigureAwait(false);
    var failedAccounts = new List<Account>();
    var accountsToInsert = new Dictionary<PartitionKey, Stream>(accounts.Count);

    Parallel.ForEach(accounts, (account) =>
    {
        var stream = new MemoryStream();
        var json = JsonConvert.SerializeObject(account, JsonHelper.DefaultSettings());
        stream.Write(Encoding.Default.GetBytes(json));
        stream.Position = 0;
        accountsToInsert.Add(new PartitionKey(account.Id), stream);
    });

    var tasks = new List<Task>(accounts.Count);
    foreach (var account in accountsToInsert)
    {
        tasks.Add(cosmosConnection.Container.CreateItemStreamAsync(account.Value, account.Key)
            .ContinueWith((Task<ResponseMessage> task) =>
            {
                using (var response = task.Result)
                {
                    if (!response.IsSuccessStatusCode)
                    {
                        var actualAccount = accounts.FirstOrDefault(x => account.Key.ToString().Contains(x.Id));
                        Debug.WriteLine($"Processing Account : {actualAccount?.ArcContactId} Received {response.StatusCode} ({response.ErrorMessage}).");
                        failedAccounts.Add(actualAccount);
                    }
                }
            }));
    }

    await Task.WhenAll(tasks);
    return failedAccounts;
}
In my calling logic I retry all accounts that have failed
var tidiedJson = JsonConvert.SerializeObject(list, Formatting.Indented);
accounts = JsonConvert.DeserializeObject<List<DomainModels.Versions.v2.Account>>(tidiedJson);

var failedAccounts = await cosmosAccountRepository.ProcessAccountsAsync(accounts);
while (failedAccounts.Count > 0)
{
    failedAccounts = await cosmosAccountRepository.ProcessAccountsAsync(failedAccounts);
}
I have no idea why accounts are not being inserted and why the behaviour is so random!
I have tried this with a variety of throughput settings and tranche sizes; it makes no difference.
Can anyone see anything obvious?
Failing that, the accounts have an ID field; is there a way of finding out which accounts are not in the database without going through all 32007 of them one by one, which is obviously not a good plan?
Paul
I am trying to create a producer/consumer TPL Dataflow process. As part of it, I will be creating multiple producer tasks for different id ranges, which will generate the data records for processing. So I am planning to throttle the number of threads that connect to the database to get the data and are active at the same time.
However, the method which extracts the data and sends it to the BufferBlock has a return type, so that I can get the count of records extracted. Because of that, I am unable to figure out where to call the Release method of the SemaphoreSlim while still being able to get the return value. Below is the sample code that I am using. Can someone please suggest a workaround for this?
private Task<KeyRange>[] DataExtractProducer(ITargetBlock<DataRow[]> targetBuffer, ExtractQueryConfiguration QueryConf)
{
    CancellationToken cancelToken = cancelTokenSrc.Token;
    var tasks = new List<Task<KeyRange>>();
    int taskCount = 0;
    int maxThreads = QueryConf.MaxThreadsLimit > 0 ? QueryConf.MaxThreadsLimit : DataExtConstants.DefaultMaxThreads;

    using (SemaphoreSlim concurrency = new SemaphoreSlim(maxThreads))
    {
        foreach (KeyRange range in keyRangeList)
        {
            concurrency.WaitAsync();
            var task = Task.Run(() =>
            {
                Console.WriteLine(MsgConstants.StartKeyRangeExtract, QueryConf.KeyColumn, range.StartValue, range.EndValue);
                //concurrency.Release();
                return GetDataTask(targetBuffer, range, QueryConf.ExtractQuery);
            }, cancelToken);
            tasks.Add(task);
            taskCount++;
        }
    }

    Console.WriteLine(MsgConstants.TaskCountMessage, taskCount);
    return tasks.ToArray<Task<KeyRange>>();
}
Edit: I tried the variant below as well, but it does not seem to work. I set a limit of 20, but I see more than 50 DB connections going out. Eventually I hit high memory consumption because of the unthrottled connections.
using (SemaphoreSlim concurrency = new SemaphoreSlim(maxThreads))
{
    foreach (KeyRange range in keyRangeList)
    {
        concurrency.WaitAsync();
        var task = Task.Run(async () =>
        {
            Console.WriteLine(MsgConstants.StartKeyRangeExtract, QueryConf.KeyColumn, range.StartValue, range.EndValue);
            var temptask = await GetDataTask(targetBuffer, range, QueryConf.ExtractQuery);
            concurrency.Release();
            return temptask;
        }, cancelToken);
        tasks.Add(task);
        taskCount++;
    }
}
I will describe my problem with a simple example first, and then a problem closer to my real one.
Imagine we have n items [i1, i2, i3, i4, ..., in] in box1, and we have a box2 that can process m items at a time (m is usually much less than n). The time required for each item is different. I always want m items being processed until all items have been processed.
A problem closer to my real one: for example, you have a list1 of n strings (URL addresses) of files and you want a system that downloads m files concurrently (for example via the httpclient.GetAsync() method). Whenever one of the m downloads finishes, another remaining item from list1 must be substituted as soon as possible, and this must continue until all items of list1 have been processed.
(The numbers n and m are specified by user input at runtime.)
How can this be done?
Here is a generic method you can use.
When you call this, TIn will be string (the URL address) and asyncProcessor will be your async method that takes the URL address as input and returns a Task.
The SemaphoreSlim used by this method will allow only n concurrent async I/O requests at a time; as soon as one completes, the next request executes. Something like a sliding-window pattern.
public static Task ForEachAsync<TIn>(
    IEnumerable<TIn> inputEnumerable,
    Func<TIn, Task> asyncProcessor,
    int? maxDegreeOfParallelism = null)
{
    int maxAsyncThreadCount = maxDegreeOfParallelism ?? DefaultMaxDegreeOfParallelism;
    SemaphoreSlim throttler = new SemaphoreSlim(maxAsyncThreadCount, maxAsyncThreadCount);

    IEnumerable<Task> tasks = inputEnumerable.Select(async input =>
    {
        await throttler.WaitAsync().ConfigureAwait(false);
        try
        {
            await asyncProcessor(input).ConfigureAwait(false);
        }
        finally
        {
            throttler.Release();
        }
    });

    return Task.WhenAll(tasks);
}
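A minimal call site could look something like this (a sketch; the URLs and the HttpClient instance are placeholders, and ForEachAsync is assumed to be accessible from the calling class):

// Sketch: download up to 4 URLs concurrently using the helper above.
var client = new HttpClient();
var urls = new[] { "https://example.com/a", "https://example.com/b" };

await ForEachAsync(urls, async url =>
{
    var content = await client.GetStringAsync(url);
    Console.WriteLine($"{url}: {content.Length} characters");
}, maxDegreeOfParallelism: 4);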
You should look into TPL Dataflow: add the System.Threading.Tasks.Dataflow NuGet package to your project, then what you want is as simple as
private static HttpClient _client = new HttpClient();

public async Task<List<MyClass>> ProcessDownloads(IEnumerable<string> uris,
    int concurrentDownloads)
{
    var result = new List<MyClass>();

    var downloadData = new TransformBlock<string, string>(async uri =>
    {
        return await _client.GetStringAsync(uri); //GetStringAsync is a thread safe method.
    }, new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = concurrentDownloads });

    var processData = new TransformBlock<string, MyClass>(
        json => JsonConvert.DeserializeObject<MyClass>(json),
        new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = DataflowBlockOptions.Unbounded });

    var collectData = new ActionBlock<MyClass>(
        data => result.Add(data)); //When you don't specify options, dataflow processes items one at a time.

    //Set up the chain of blocks, have it call `.Complete()` on the next block when the current block finishes processing its last item.
    downloadData.LinkTo(processData, new DataflowLinkOptions { PropagateCompletion = true });
    processData.LinkTo(collectData, new DataflowLinkOptions { PropagateCompletion = true });

    //Load the data in to the first transform block to start off the process.
    foreach (var uri in uris)
    {
        await downloadData.SendAsync(uri).ConfigureAwait(false);
    }
    downloadData.Complete(); //Signal you are done adding data.

    //Wait for the last object to be added to the list.
    await collectData.Completion.ConfigureAwait(false);

    return result;
}
In the above code, only concurrentDownloads HTTP requests will be active at any given time, an unbounded number of threads will be processing the received strings and turning them into objects, and a single thread will be taking those objects and adding them to a list.
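For example, a call site might look roughly like this (the URL list and MyClass are placeholders for your own inputs and model):

// Sketch: download and deserialize with at most 4 concurrent requests.
var urls = new[] { "https://example.com/1.json", "https://example.com/2.json" };
List<MyClass> items = await ProcessDownloads(urls, concurrentDownloads: 4);
Console.WriteLine($"Downloaded {items.Count} items");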
UPDATE: here is a simplified example that only does what you asked for in the question
private static HttpClient _client = new HttpClient();

public void ProcessDownloads(IEnumerable<string> uris, int concurrentDownloads)
{
    var downloadData = new ActionBlock<string>(async uri =>
    {
        var response = await _client.GetAsync(uri); //GetAsync is a thread safe method.
        //do something with response here.
    }, new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = concurrentDownloads });

    foreach (var uri in uris)
    {
        downloadData.Post(uri);
    }

    downloadData.Complete();
    downloadData.Completion.Wait();
}
A simple solution for throttling is a SemaphoreSlim.
EDIT
After a slight alteration the code now creates the tasks when they are needed
var client = new HttpClient();
SemaphoreSlim semaphore = new SemaphoreSlim(m, m); //set the max here

var tasks = new List<Task>();
foreach (var url in urls)
{
    // moving the wait here throttles the foreach loop
    await semaphore.WaitAsync();

    tasks.Add(((Func<Task>)(async () =>
    {
        //await semaphore.WaitAsync();
        var response = await client.GetAsync(url); // possibly ConfigureAwait(false) here
        // do something with response
        semaphore.Release();
    }))());
}

await Task.WhenAll(tasks);
This is another way to do it
var client = new HttpClient();
var tasks = new HashSet<Task>();

foreach (var url in urls)
{
    if (tasks.Count == m)
    {
        tasks.Remove(await Task.WhenAny(tasks));
    }

    tasks.Add(((Func<Task>)(async () =>
    {
        var response = await client.GetAsync(url); // possibly ConfigureAwait(false) here
        // do something with response
    }))());
}

await Task.WhenAll(tasks);
Process items in parallel, limiting the number of simultaneous jobs:
string[] strings = GetStrings(); // Items to process.
const int m = 2; // Max simultaneous jobs.

Parallel.ForEach(strings, new ParallelOptions { MaxDegreeOfParallelism = m }, s =>
{
    DoWork(s);
});