I have this function that can get up to 10 items as an input list
public async Task<KeyValuePair<string, bool>[]> PayCallSendSMS(List<SmsRequest> ListSms)
{
List<Task<KeyValuePair<string, bool>>> tasks = new List<Task<KeyValuePair<string, bool>>>();
foreach (SmsRequest sms in ListSms)
{
tasks.Add(Task.Run(() => SendSMS(sms)));
}
var result = await Task.WhenAll(tasks);
return result;
}
In this function, I await some JSON being downloaded and, once that is done, I deserialize it.
public async Task<KeyValuePair<string, bool>> SendSMS(SmsRequest sms)
{
//some code
using (WebResponse response = webRequest.GetResponse())
{
using (Stream responseStream = response.GetResponseStream())
{
StreamReader rdr = new StreamReader(responseStream, Encoding.UTF8);
string Json = await rdr.ReadToEndAsync();
deserializedJsonDictionary = (Dictionary<string, object>)jsonSerializer.DeserializeObject(Json);
}
}
//some code
return GetResult(sms.recipient);
}
public KeyValuePair<string, bool> GetResult(string recipient)
{
if (deserializedJsonDictionary[STATUS].ToString().ToLower().Equals("true"))
{
return new KeyValuePair<string, bool>(recipient, true);
}
else // deserializedJsonDictionary[STATUS] == "False"
{
return new KeyValuePair<string, bool>(recipient, false);
}
}
My problem is in the return GetResult(); part, in which deserializedJsonDictionary is null (and of course it is, because the JSON hasn't finished downloading).
I don't know how to solve it.
I tried to use ContinueWith, but it doesn't work for me.
I'm willing to accept any change to my original code and/or the design of the solution.
Unrelated tip: Don't abuse KeyValuePair<>, use C# 7 value-tuples instead (not least because they're much easier to read).
Using a foreach loop to build a List<Task> is fine, though it can be more succinct to use .Select() instead. I use this approach in my answer.
But don't use Task.Run with the ancient WebRequest (HttpWebRequest) type. Instead use HttpClient which has full support for async IO.
Also, you should conform to the .NET naming-convention:
All methods that are async should have Async as a method-name suffix (e.g. PayCallSendSMS should be named PayCallSendSmsAsync).
Acronyms and initialisms longer than 2 characters should be in PascalCase, not CAPS, so use Sms instead of SMS.
Use camelCase, not PascalCase, for parameters and locals, and List is a redundant prefix. A better name for ListSms would be smsRequests, as its type is List<SmsRequest>.
Generally speaking, parameters should be declared using the least-specific type required, especially collection parameters: consider typing them as IEnumerable<T> or IReadOnlyCollection<T> instead of T[], List<T>, and so on.
You need to first check that the response from the remote server actually is a JSON response (instead of an HTML error message or XML response) and has the expected status code; otherwise you'll be trying to deserialize something that is not JSON.
Consider supporting CancellationToken too (this is not included in my answer as it adds too much visual noise).
Always use Dictionary.TryGetValue instead of blindly assuming the dictionary indexer will match.
public async Task< IReadOnlyList<(String recipient, Boolean ok)> > PayCallSendSmsAsync( IEnumerable<SmsRequest> smsRequests )
{
using( HttpClient httpClient = this.httpClientFactory.Create() )
{
var tasks = smsRequests
.Select(r => SendSmsAsync(httpClient, r))
.ToList(); // <-- The call to ToList is important as it materializes the list and triggers all of the Tasks.
(String recipient, Boolean ok)[] results = await Task.WhenAll(tasks);
return results;
}
}
private static async Task<(String recipient, Boolean ok)> SendSmsAsync(HttpClient httpClient, SmsRequest smsRequest)
{
using (HttpRequestMessage request = new HttpRequestMessage( ... ) )
using (HttpResponseMessage response = await httpClient.SendAsync(request).ConfigureAwait(false))
{
String responseType = response.Content.Headers.ContentType?.MediaType ?? "";
if (responseType != "application/json" || response.StatusCode != HttpStatusCode.OK)
{
throw new InvalidOperationException("Expected HTTP 200 JSON response but encountered an HTTP " + response.StatusCode + " " + responseType + " response instead." );
}
String jsonText = await response.Content.ReadAsStringAsync().ConfigureAwait(false);
Dictionary<String,Object> dict = JsonConvert.DeserializeObject< Dictionary<String,Object> >(jsonText);
if(
dict != null &&
dict.TryGetValue(STATUS, out Object statusValue) &&
statusValue is String statusStr &&
"true".Equals( statusStr, StringComparison.OrdinalIgnoreCase )
)
{
return ( smsRequest.Recipient, ok: true );
}
else
{
return ( smsRequest.Recipient, ok: false );
}
}
}
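The answer leaves CancellationToken support out to keep the code short. As a rough sketch of what threading a token through could look like (not part of the original answer): StubGatewayHandler, CancellationSketch and the sms-gateway.invalid URL are hypothetical names made up here so the example runs without a real SMS gateway.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the real SMS gateway, so the sketch needs no network.
public sealed class StubGatewayHandler : HttpMessageHandler
{
    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        // Simulated latency; Task.Delay observes the token and throws if it is cancelled.
        await Task.Delay(TimeSpan.FromMilliseconds(200), cancellationToken).ConfigureAwait(false);
        return new HttpResponseMessage(HttpStatusCode.OK)
        {
            Content = new StringContent("{\"status\":\"true\"}", Encoding.UTF8, "application/json")
        };
    }
}

public static class CancellationSketch
{
    // Same shape as SendSmsAsync above, with the token passed on to HttpClient.SendAsync.
    public static async Task<string> SendAsync(HttpClient httpClient, Uri uri, CancellationToken cancellationToken)
    {
        using (var request = new HttpRequestMessage(HttpMethod.Get, uri))
        using (var response = await httpClient.SendAsync(request, cancellationToken).ConfigureAwait(false))
        {
            return await response.Content.ReadAsStringAsync().ConfigureAwait(false);
        }
    }

    // Returns true if the call was cancelled because the timeout elapsed first.
    public static async Task<bool> WasCancelledAsync(TimeSpan timeout)
    {
        using (var httpClient = new HttpClient(new StubGatewayHandler()))
        using (var cts = new CancellationTokenSource(timeout))
        {
            try
            {
                await SendAsync(httpClient, new Uri("http://sms-gateway.invalid/send"), cts.Token).ConfigureAwait(false);
                return false;
            }
            catch (OperationCanceledException)
            {
                // TaskCanceledException derives from OperationCanceledException.
                return true;
            }
        }
    }
}
```

In the real method the token would presumably come from the caller (for example a request-abort token) rather than from a timeout created on the spot.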
I've been working for a few days on a performance problem.
Before I delve deeper, I want to quickly explain how the specific service works.
I have a main service that gets a request and sends requests to other micro-services, but the single entry point for the user is the main service. I think it is simpler to understand with this image:
After the main service gets a request from the API, it does some logic, queries the DB and then gets a list. Every item on the list has an Id; to get enrichment for every item, the main service creates a request to one of the micro-services.
For example, John sends a request to the main service, the main service gets a list of 90 items from the DB, then creates 90 calls to the micro-service and returns to John a single response that includes the 90 items.
Now the question is only about the right way to create the async calls to the micro-service.
This is how I developed this part:
GetDetailsAsync(Id, result.Items, request.SystemComponentId);
private static void GetDetailsAsync(string Id, List<MainItem> items, int systemId)
{
var getDetailsTasks = new List<Task>();
foreach (MainItem single in items)
{
getDetailsTasks.Add(SetSingleDetailsAsync(Id, single, systemId));
}
Task.WhenAll(getDetailsTasks);
}
private static async Task SetSingleDetailsAsync(string Id, MainItem single, int systemId)
{
single.ActivityExtendedDetails = await ProcessItemDetailsRequest.GetItemDetailsAsync(Id, single.TypeId,
single.ItemId, systemId);
}
public static Task<JObject> GetItemDetailsAsync(string id, short type,
string itemId, int systemId)
{
var typeList = ActivityTypeDetails.GetActivityTypes();
var url = GetActivityUrl(id, type, itemId, typeList);
if (url == null)
{
throw new Failure($"No url defined for type {type}");
}
try
{
JObject res;
using (var stream = client.GetStreamAsync(url).Result)
using (var sr = new StreamReader(stream))
using (var reader = new JsonTextReader(sr))
{
var serializer = new JsonSerializer();
res = serializer.Deserialize<JObject>(reader);
}
return Task.FromResult(res);
}
catch(Exception ex)
{
Logger.Warn(
$"The uri {url} threw exception {ex.Message}.");
//[Todo]throw exception
return null;
}
}
This code runs, but the result is not good enough: the CPU usage rises very quickly and becomes very high. I think I have a problem in the GetItemDetailsAsync function because I use client.GetStreamAsync(url).Result;
using .Result blocks until the task is completed.
So I made a minor change to GetItemDetailsAsync to try to make it really async:
public static async Task<JObject> GetItemDetailsAsync(string id, short type,
string itemId, int systemId)
{
var typeList = ActivityTypeDetails.GetActivityTypes();
var url = GetActivityUrl(id, type, itemId, typeList);
if (url == null)
{
throw new Failure($"No url defined for type {type}");
}
try
{
JObject res;
using (var stream = await client.GetStreamAsync(url))
using (var sr = new StreamReader(stream))
using (var reader = new JsonTextReader(sr))
{
var serializer = new JsonSerializer();
res = serializer.Deserialize<JObject>(reader);
}
return res;
}
catch(Exception ex)
{
Logger.Warn(
$"The uri {url} threw exception {ex.Message}.");
//[Todo]throw exception
return null;
}
}
But now I get null where I am supposed to get the data that comes from the async function.
I tried debugging and noticed something weird. Everything happens as I would expect: the methods are called, the request to the micro-service is executed and gets a response. But the response from the endpoint (which is on the main service) returns before the async method returns from the micro-service, which means I get null instead of my expected data.
I think that maybe I am not using async/await correctly, and I would be happy if anyone could explain how this behavior happens.
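One way this behaviour can arise, sketched under the assumption that the cause is the discarded Task in GetDetailsAsync above (it calls Task.WhenAll(getDetailsTasks) in a void method and never awaits the result). Item, FillAsync and the other names below are hypothetical stand-ins, with Task.Delay playing the role of the micro-service call.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public class Item
{
    public string Details;
}

public static class WhenAllSketch
{
    // Hypothetical stand-in for the call to the micro-service.
    static async Task FillAsync(Item item)
    {
        await Task.Delay(100);
        item.Details = "enriched";
    }

    // Mirrors the question: the tasks are started, but the Task returned by
    // WhenAll is thrown away, so the caller carries on immediately.
    public static void FillAllFireAndForget(List<Item> items)
    {
        Task.WhenAll(items.Select(i => FillAsync(i)));
    }

    // Returns the combined Task so that the caller can await it.
    public static Task FillAllAsync(List<Item> items)
    {
        return Task.WhenAll(items.Select(i => FillAsync(i)));
    }

    public static async Task<(bool stillEmpty, bool filled)> DemoAsync()
    {
        var first = new List<Item> { new Item(), new Item() };
        FillAllFireAndForget(first);
        // Checked straight away: the 100 ms delay has not elapsed yet.
        bool stillEmpty = first.All(i => i.Details == null);

        var second = new List<Item> { new Item(), new Item() };
        await FillAllAsync(second);
        bool filled = second.All(i => i.Details != null);

        return (stillEmpty, filled);
    }
}
```

With the blocking .Result version every task had already finished by the time it was added to the list, which would explain why the missing await only showed up after the method became genuinely asynchronous.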
My question may be trivial but I have spent almost 6hrs just trying things out.
public async Task<object> save()
{
var uri = "https://newsapi.org/v1/articles?source=talksport&apiKey=longKey";
var httpClient = new HttpClient ();
HttpResponseMessage res = await httpClient.GetAsync(uri);
var data = await res.Content.ReadAsStreamAsync();
// this is what I want to achieve like in python you can do something like this
foreach(var item in data){
Console.writeline(item.summary);
}
// end of arbitrary code
return data;
}
My problem is that I am unable to do this conversion to get the response and then access the JSON data.
In Python you can do something like this:
r = requests.get(apiUrl)
data = r.json()
for item in data:
print(item.summary)
This is all I have struggled to achieve with C#. Any help to complete the code, or an explanation, would be appreciated. Thanks
Try to use something like this:
Install Newtonsoft.Json package and add using Newtonsoft.Json;
using (var request = new HttpRequestMessage()) {
request.RequestUri = new Uri("https://newsapi.org/v1/articles?source=talksport&apiKey=longKey");
request.Method = HttpMethod.Get;
using (var response = await httpClient.SendAsync(request)) {
string content = await response.Content.ReadAsStringAsync();
var result = JsonConvert.DeserializeObject<IList<dynamic>>(content);
foreach(var item in result){
Console.WriteLine(item.summary);
}
}
}
From comment
Then i get this
"{\"vouchers\":[\"UN9NKK\",\"FYMFVS\",\"WV5AX7\",\"M2TJJ8\",\"FBB9AL\",\"MBW8Z4\"]}"
You can create a new class
public class MyResponse {
public IEnumerable<string> Vouchers {get;set; }
}
then
var response = JsonConvert.DeserializeObject<MyResponse>(content);
foreach(var item in response.Vouchers){
Console.WriteLine(item);
}
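For comparison, here is a sketch that reads the same vouchers payload without defining a class. It swaps in System.Text.Json (built into .NET Core 3.0 and later) for Json.NET purely so the example has no package dependency; VoucherParsing and ReadVouchers are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class VoucherParsing
{
    // Returns the strings found under "vouchers", or an empty list if the
    // property is missing or is not an array.
    public static List<string> ReadVouchers(string json)
    {
        var vouchers = new List<string>();
        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            JsonElement root = doc.RootElement;
            if (root.ValueKind == JsonValueKind.Object
                && root.TryGetProperty("vouchers", out JsonElement array)
                && array.ValueKind == JsonValueKind.Array)
            {
                foreach (JsonElement item in array.EnumerateArray())
                {
                    // Skip anything that is not a JSON string.
                    if (item.ValueKind == JsonValueKind.String)
                    {
                        vouchers.Add(item.GetString());
                    }
                }
            }
        }
        return vouchers;
    }
}
```

Note that the property lookup is case-sensitive here, so the name has to be "vouchers" exactly as it appears in the response.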
If you don't mind a small library dependency, Flurl (disclaimer: I'm the author) gets you Python's simplicity in C#:
var data = await apiUrl.GetJsonAsync();
In this case, data is a C# dynamic type, which means you can access all the JSON object's properties by name without defining a corresponding C# class, much like an untyped language. If you do want to declare a class and get compile-time type checking, that works with Flurl too:
var data = await apiUrl.GetJsonAsync<MyClass>();
Now data is an instance of MyClass instead of a dynamic.
Get Flurl.Http on Nuget, and reference it with using Flurl.Http;.
I am trying to speed up some Google Directory API calls in the .NET client library with BatchRequests.
Let's say I have the following BatchRequest (which consists of only one request for simplicity):
static async Task BatchRequesting()
{
var batchReq = new BatchRequest(_dirservices[0]);
var r = _dirservices[0].Users.Get("user@domain.com");
batchReq.Queue<UsersResource.GetRequest>(r,
(contentReq, error, j, message) =>
{
... what to do here?
});
await batchReq.ExecuteAsync();
}
How do I get the resulting deserialized response object in the callback (which would be a User object in my case)?
Do I have to handle the message.Content object (HttpContent) myself, with all the JSON deserializing?
I found the solution. I used the wrong generic parameter. My code example has to be like this:
static async Task BatchRequesting()
{
var batchReq = new BatchRequest(_directoryService);
var request = _directoryService.Users.Get("user@domain.com");
batchReq.Queue<User>(request,
(returnedUser, error, j, message) =>
{
if (error != null)
{
Console.WriteLine(error.Message);
}
else
{
... work with returnedUser
}
});
await batchReq.ExecuteAsync();
}
Let's suppose I have the following variable:
System.Net.HttpStatusCode status = System.Net.HttpStatusCode.OK;
How can I check if this is a success status code or a failure one?
For instance, I can do the following:
int code = (int)status;
if(code >= 200 && code < 300) {
//Success
}
I can also have some kind of white list:
HttpStatusCode[] successStatus = new HttpStatusCode[] {
HttpStatusCode.OK,
HttpStatusCode.Created,
HttpStatusCode.Accepted,
HttpStatusCode.NonAuthoritativeInformation,
HttpStatusCode.NoContent,
HttpStatusCode.ResetContent,
HttpStatusCode.PartialContent
};
if(successStatus.Contains(status)) //LINQ
{
//Success
}
None of these alternatives convinces me, and I was hoping for a .NET class or method that can do this work for me, such as:
bool isSuccess = HttpUtilities.IsSuccess(status);
If you're using the HttpClient class, then you'll get a HttpResponseMessage back.
This class has a useful property called IsSuccessStatusCode that will do the check for you.
using (var client = new HttpClient())
{
var response = await client.PostAsync(uri, content);
if (response.IsSuccessStatusCode)
{
//...
}
}
In case you're curious, this property is implemented as:
public bool IsSuccessStatusCode
{
get { return ((int)statusCode >= 200) && ((int)statusCode <= 299); }
}
So you can just reuse this algorithm if you're not using HttpClient directly.
You can also use EnsureSuccessStatusCode to throw an exception in case the response was not successful.
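A small sketch of the EnsureSuccessStatusCode behaviour, using a locally constructed HttpResponseMessage so no request is sent. EnsureSuccessSketch and Throws are hypothetical names.

```csharp
using System;
using System.Net;
using System.Net.Http;

public static class EnsureSuccessSketch
{
    // Returns true if EnsureSuccessStatusCode throws for the given status code.
    public static bool Throws(HttpStatusCode status)
    {
        using (var response = new HttpResponseMessage(status))
        {
            try
            {
                response.EnsureSuccessStatusCode();
                return false;
            }
            catch (HttpRequestException)
            {
                // Thrown whenever IsSuccessStatusCode is false, i.e. outside 200-299.
                return true;
            }
        }
    }
}
```

For example, Throws(HttpStatusCode.NoContent) returns false, while Throws(HttpStatusCode.NotFound) and Throws(HttpStatusCode.Found) both return true, since 404 and 302 fall outside the 200-299 range.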
The accepted answer bothers me a bit, as its second part contains magic numbers (although they are standard). And its first part does not generalize to plain integer status codes, although it is close to my answer.
You could achieve exactly the same result by instantiating HttpResponseMessage with your status code and checking for success. It does throw an argument exception if the value is smaller than zero or greater than 999.
if (new HttpResponseMessage((HttpStatusCode)statusCode).IsSuccessStatusCode)
{
// ...
}
This is not exactly concise, but you could make it an extension.
I am partial to the discoverability of extension methods.
public static class HttpStatusCodeExtensions
{
public static bool IsSuccessStatusCode(this HttpStatusCode statusCode)
{
var asInt = (int)statusCode;
return asInt >= 200 && asInt <= 299;
}
}
As long as your namespace is in scope, usage would be statusCode.IsSuccessStatusCode().
The HttpResponseMessage class has an IsSuccessStatusCode property. Looking at the source code, it is implemented like this, so, as usr has already suggested, 200-299 is probably the best you can do.
public bool IsSuccessStatusCode
{
get { return ((int)statusCode >= 200) && ((int)statusCode <= 299); }
}
Adding to @TomDoesCode's answer: if you are using HttpWebResponse, you can add this extension method:
public static bool IsSuccessStatusCode(this HttpWebResponse httpWebResponse)
{
return ((int)httpWebResponse.StatusCode >= 200) && ((int)httpWebResponse.StatusCode <= 299);
}
It depends on what HTTP resource you are calling. Usually, the 2xx range is defined as the range of success status codes. That's clearly a convention that not every HTTP server will adhere to.
For example, submitting a form on a website will often return a 302 redirect.
If you want to devise a general method then the code >= 200 && code < 300 idea is probably your best shot.
If you are calling your own server then you probably should make sure that you standardize on 200.
This is an extension of the previous answer, that avoids the creation and subsequent garbage collection of a new object for each invocation.
public static class StatusCodeExtensions
{
private static readonly ConcurrentDictionary<HttpStatusCode, bool> IsSuccessStatusCode = new ConcurrentDictionary<HttpStatusCode, bool>();
public static bool IsSuccess(this HttpStatusCode statusCode) => IsSuccessStatusCode.GetOrAdd(statusCode, c => new HttpResponseMessage(c).IsSuccessStatusCode);
}
I am working on an app that searches for email addresses in Google search results' URLs. The problem is it needs to return the value it found in each page + the URL in which it found the email, to a datagridview with 2 columns: Email and URL.
I am using Parallel.ForEach for this one but of course it returns random URLs and not the ones it really found the email on.
public static string htmlcon; //htmlsource
public static List<string> emailList = new List<string>();
public static string Get(string url, bool proxy)
{
htmlcon = "";
try
{
HttpWebRequest req = (HttpWebRequest)WebRequest.Create(url);
if (proxy)
req.Proxy = new WebProxy(proxyIP + ":" + proxyPort);
req.Method = "GET";
req.UserAgent = Settings1.Default.UserAgent;
if (Settings1.Default.EnableCookies == true)
{
CookieContainer cont = new CookieContainer();
req.CookieContainer = cont;
}
WebResponse resp = req.GetResponse();
StreamReader SR = new StreamReader(resp.GetResponseStream());
htmlcon = SR.ReadToEnd();
Thread.Sleep(400);
resp.Close();
SR.Close();
}
catch (Exception)
{
Thread.Sleep(500);
}
return htmlcon;
}
private void copyMails(string url)
{
string emailPat = @"(\b[a-zA-Z0-9._%-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}\b)";
MatchCollection mailcol = Regex.Matches(htmlcon, emailPat, RegexOptions.Singleline);
foreach (Match mailMatch in mailcol)
{
email = mailMatch.Groups[1].Value;
if (!emailList.Contains(email))
{
emailList.Add(email);
Action dgeins = () => mailDataGrid.Rows.Insert(0, email, url);
mailDataGrid.BeginInvoke(dgeins);
}
}
}
private void SEbgWorker_DoWork(object sender, DoWorkEventArgs e)
{
//A LOT OF IRRELEVANT STUFF BEING RUN
Parallel.ForEach(allSElist.OfType<string>(), (s) =>
{
//Get URL
Get(s, Settings1.Default.Proxyset);
//match mails 1st page
copyMails(s);
});
}
So this is it: I execute a GET request (where "s" is the URL from the list) and then run copyMails(s) on the URL's HTML source. It uses a regex to copy the emails.
If I do it without Parallel, it returns the correct URL for each email in the datagridview. How can I do this in parallel and still get the correct match in the datagridview?
Thanks
You would be better off using PLINQ's Where to filter (pseudo code):
var results = from i in input.AsParallel()
let u = get the URL from i
let d = get the data from u
let v = try get the value from d
where v is found
select new {
Url = u,
Value = v
};
Underneath, the AsParallel means that the parallel (PLINQ) implementation of the LINQ operators (Select, Where, ...) is used.
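A runnable version of the pseudo code above might look something like this. It is only a sketch: Pages is a hypothetical in-memory stand-in for the HTTP GET, and TryFindEmail is a deliberately naive placeholder for the real extraction. The point it illustrates is that each result carries its own URL, so nothing is shared between iterations.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PlinqSketch
{
    // Hypothetical in-memory "pages" standing in for the HTTP GET.
    static readonly Dictionary<string, string> Pages = new Dictionary<string, string>
    {
        { "http://a.example", "contact alice@a.example today" },
        { "http://b.example", "no address here" },
        { "http://c.example", "mail carol@c.example" }
    };

    static string Fetch(string url)
    {
        return Pages.TryGetValue(url, out string html) ? html : "";
    }

    // Naive placeholder: first space-separated word containing '@', or null if none.
    static string TryFindEmail(string html)
    {
        return html.Split(' ').FirstOrDefault(word => word.Contains("@"));
    }

    public static List<(string Url, string Value)> FindEmails(IEnumerable<string> urls)
    {
        var results = from url in urls.AsParallel()
                      let html = Fetch(url)
                      let email = TryFindEmail(html)
                      where email != null
                      select (Url: url, Value: email);

        // PLINQ does not promise any particular ordering unless AsOrdered is used.
        return results.ToList();
    }
}
```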
UPDATE: Now with more information
First there are a number of issues in your code:
The variable htmlcon is static but used directly by multiple threads. This could well be your underlying problem. Consider just two input values: the first Get completes and sets htmlcon, but before that thread's call to copyMails starts, the second thread's Get completes its HTML GET and writes to htmlcon. With `email
The list emailList is also accessed without locking by multiple threads. Most collection types in .NET (and on most other programming platforms) are not thread safe; you need to limit access to a single thread at a time.
You are mixing up various activities in each of your methods. Consider applying the single responsibility principle.
Thread.Sleep to handle an exception?! If you can't handle an exception (i.e. resolve the condition) then do nothing. In this case, if the action throws then the Parallel.ForEach will throw: that'll do until you define how to handle the HTML GET failing.
Three suggestions:
In my experience clean code (to an obsessive degree) makes things easier: the details of the format
don't matter (one true brace style is better, but consistency is the key). Just going through
and cleaning up the formatting showed up issues #1 and #2.
Good naming. Don't abbreviate anything used over more than a few lines of code unless that is a
significant term for the domain. Eg. s for the action parameter in the parallel loop is really a url
so call it that. This kind of thing immediately makes the code easier to follow.
Think about that regex for emails: there are many valid emails that will not match (e.g. use of + to provide multiple logical addresses: example+one@gmail.com will be delivered to example@gmail.com and can then be used for local rules). Also an apostrophe ("'") is a valid character (and I have known people frustrated by web sites that refused their addresses by getting this wrong).
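To make the regex point concrete, here is a small sketch comparing the pattern from the question with a slightly wider one. The wider pattern is a hypothetical illustration that only adds + and the apostrophe to the local part; it is nowhere near a complete address grammar.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EmailPatternSketch
{
    // The pattern used in the question.
    static readonly Regex Original = new Regex(
        @"(\b[a-zA-Z0-9._%-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}\b)", RegexOptions.Singleline);

    // Hypothetical wider pattern: also allows '+' and an apostrophe before the '@'.
    static readonly Regex Wider = new Regex(
        @"(\b[a-zA-Z0-9._%+'-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}\b)", RegexOptions.Singleline);

    public static string FirstMatchOriginal(string text)
    {
        return Original.Match(text).Value;
    }

    public static string FirstMatchWider(string text)
    {
        return Wider.Match(text).Value;
    }
}
```

The original pattern does not reject example+one@gmail.com outright. It silently captures only the part after the plus sign (one@gmail.com), and for o'brien@example.com it captures brien@example.com, which is arguably worse than no match at all. The wider pattern returns both addresses in full.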
Second: A relatively direct clean up:
public static string Get(string url, bool proxy) {
HttpWebRequest req = (HttpWebRequest)WebRequest.Create(url);
if (proxy) {
req.Proxy = new WebProxy(proxyIP + ":" + proxyPort);
}
req.Method = "GET";
req.UserAgent = Settings1.Default.UserAgent;
if (Settings1.Default.EnableCookies == true) {
CookieContainer cont = new CookieContainer();
req.CookieContainer = cont;
}
using (WebResponse resp = req.GetResponse())
using (StreamReader SR = new StreamReader(resp.GetResponseStream())) {
return SR.ReadToEnd();
}
}
private static Regex emailMatcher = new Regex(@"(\b[a-zA-Z0-9._%-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}\b)", RegexOptions.Singleline);
private static string[] ExtractEmails(string htmlContent) {
return emailMatcher.Matches(htmlContent).OfType<Match>()
.Select(m => m.Groups[1].Value)
.Distinct()
.ToArray();
}
private void SEbgWorker_DoWork(object sender, DoWorkEventArgs e) {
Parallel.ForEach(allSElist.OfType<string>(), url => {
var htmlContent = Get(url, Settings1.Default.Proxyset);
var emails = ExtractEmails(htmlContent);
foreach (var email in emails) {
Action dgeins = () => mailDataGrid.Rows.Insert(0, email, url);
mailDataGrid.BeginInvoke(dgeins);
}
});
}
Here I have:
Made use of using statements to automate the cleanup of resources.
Eliminated all mutable shared state.
Regex is explicitly documented to have thread safe instance methods. So I only need a single instance.
Removed noise: no need to pass the URL to ExtractEmails because the extraction doesn't use the URL.
Get now only performs the HTML GET; ExtractEmails does just the extraction.
Third: The above will block threads on the slowest operation: the HTML GET.
The real concurrency benefit would be to replace HttpWebRequest.GetResponse and reading the response stream with their asynchronous equivalents.
Using Task would be the answer in .NET 4, but you need to directly work with Stream and encoding yourself because StreamReader doesn't provide any BeginABC/EndABC method pairs. But .NET 4.5 is almost here, so apply some async/await:
Nothing to do in ExtractEmails.
Get is now asynchronous, blocking in neither the HTTP GET nor reading the result.
SEbgWorker_DoWork uses Tasks directly to avoid mixing too many different ways to work with TPL. Since Get returns a Task<string>, it can simply be continued. Note that, unless you specify otherwise, ContinueWith runs the continuation whatever the outcome of the previous task; reading .Result inside the continuation rethrows the failure (wrapped in an AggregateException) if Get faulted:
This should work in .NET 4.5, but without a set of valid URLs for which this will work I cannot test.
public static async Task<string> Get(string url, bool proxy) {
HttpWebRequest req = (HttpWebRequest)WebRequest.Create(url);
if (proxy) {
req.Proxy = new WebProxy(proxyIP + ":" + proxyPort);
}
req.Method = "GET";
req.UserAgent = Settings1.Default.UserAgent;
if (Settings1.Default.EnableCookies == true) {
CookieContainer cont = new CookieContainer();
req.CookieContainer = cont;
}
using (WebResponse resp = await req.GetResponseAsync())
using (StreamReader SR = new StreamReader(resp.GetResponseStream())) {
return await SR.ReadToEndAsync();
}
}
private static Regex emailMatcher = new Regex(@"(\b[a-zA-Z0-9._%-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}\b)", RegexOptions.Singleline);
private static string[] ExtractEmails(string htmlContent) {
return emailMatcher.Matches(htmlContent).OfType<Match>()
.Select(m => m.Groups[1].Value)
.Distinct()
.ToArray();
}
private void SEbgWorker_DoWork(object sender, DoWorkEventArgs e) {
var tasks = allSElist.OfType<string>()
.Select(url => {
return Get(url, Settings1.Default.Proxyset)
.ContinueWith(htmlContentTask => {
// Default options: this runs even if Get failed; .Result then rethrows the failure
var htmlContent = htmlContentTask.Result;
var emails = ExtractEmails(htmlContent);
foreach (var email in emails) {
// No InvokeAsync on WinForms, so do this the old way.
Action dgeins = () => mailDataGrid.Rows.Insert(0, email, url);
mailDataGrid.BeginInvoke(dgeins);
}
});
})
.ToArray(); // materialize the query so that every request is started
Task.WaitAll(tasks);
}
public static string htmlcon; //htmlsource
public static List<string> emailList = new List<string>();
The problem is that the members htmlcon and emailList are resources shared among threads and among iterations. Each iteration in Parallel.ForEach is executed in parallel. That's why you see strange behaviour.
How to solve the problem:
Modify your code and try to implement it without static variables or shared state.
Another option is to change from Parallel.ForEach to TPL Task chaining. With this change, the result of one parallel operation becomes the input for the next; it is one option among many for modifying the code to avoid shared state.
Use locking or concurrent collections. Your htmlcon variable could be made volatile (although that alone would not stop one iteration overwriting another's HTML), but with the list you should use locks or concurrent collections.
A better way is to modify your code to avoid shared state; there are many options for doing that depending on your implementation, not only task chaining.
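As a rough sketch of the "concurrent collections" option: a ConcurrentDictionary can take over the de-duplication that emailList.Contains was doing, while each iteration keeps its own URL in a local variable. SharedStateSketch and the emailsByUrl input are hypothetical; in the real code the emails would come from the download and extraction steps.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class SharedStateSketch
{
    // emailsByUrl: hypothetical input mapping each URL to the emails already extracted from it.
    public static List<(string Email, string Url)> Collect(IDictionary<string, string[]> emailsByUrl)
    {
        // Key: email, value: a URL it was found on.
        var seen = new ConcurrentDictionary<string, string>();

        Parallel.ForEach(emailsByUrl, pair =>
        {
            foreach (string email in pair.Value)
            {
                // TryAdd is atomic: exactly one thread succeeds for a given email,
                // and pair.Key is local to this iteration, so the URL cannot be mixed up.
                seen.TryAdd(email, pair.Key);
            }
        });

        return seen.Select(kv => (Email: kv.Key, Url: kv.Value)).ToList();
    }
}
```

If an email appears on several pages, which of those URLs ends up stored depends on scheduling, but it is always a URL that really contained that email.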