I have a process that retrieves HTML from a remote site and parses it. I pass several URLs into the method, so I would like to ajaxify the process and show an on-screen notification each time a URL finishes parsing. For example, this is what I am trying to do:
List<string> urls = ...// load up with an arbitrary number of URLs
foreach (var url in urls)
{
string html = GetContent(url);
//DO SOMETHING
//COMPLETED.. SEND NOTIFICATION TO SCREEN (HOW DO I DO THIS)
}
public static string GetContent(string url)
{
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
request.Method = "GET";
using (var stream = request.GetResponse().GetResponseStream())
{
using (var reader = new StreamReader(stream, Encoding.UTF8))
{
return reader.ReadToEnd();
}
}
}
On each iteration of the loop, I want to show that the URL has completed and that processing is moving on to the next one. How can I accomplish this?
The first thing you need to worry about is the fact (I'm assuming) that you're running a potentially long-running operation in ASP.NET code. This will become a problem when you run into request timeouts. (The default ASP.NET execution timeout is 90 seconds in .NET 1.x and 110 seconds in later versions.) Assume a 90-second limit and ten URLs, each of which takes 15 seconds to complete reader.ReadToEnd(): your code will time out and get killed after the sixth URL.
You might be thinking "I can just crank up the timeout," but that's not really a good answer; you're still under time pressure.
The way I solve problems like this is to move long-running operations into a stand-alone Windows Service, then use WCF to communicate between ASP.NET code and the Service. The Service can run a thread pool that executes requests to process a group of URLs. (Here's an implementation that allows you to queue work items.)
Now, from your web page, you can poll for status updates via AJAX requests. The handler in your ASP.NET code can use WCF to pull the status information from the Service process.
A way to do this might be to assign each submitted work unit a unique ID and return that ID to the client. The client can then poll for status by sending an AJAX request for the status of work unit n. In the Service, keep a List of work units with their statuses (locking it as appropriate to avoid concurrency problems).
public class WorkUnit {
public int ID { get; set; }
public List<string> URLs { get; set; }
public int Processed { get; set; }
}
private readonly List<WorkUnit> workUnits = new List<WorkUnit>();
private void ExecuteWorkUnit(int id) {
var unit = GetWorkUnit(id);
foreach (var url in unit.URLs) {
string html = GetContent(url);
// do whatever else...
lock (workUnits) unit.Processed++;
}
}
public WorkUnit GetWorkUnit(int id) {
lock (workUnits) {
// Simplest possible lookup; returns null if no work unit has this ID.
return workUnits.Find(unit => unit.ID == id);
}
}
You'll need to fill in methods to add a work unit, return the status of a given work unit, and deal with the thread pool.
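As a rough sketch of those missing pieces, assuming an in-process registry and the built-in ThreadPool rather than a custom work queue, it might look something like the following. WorkUnitRegistry, Submit, GetProcessedCount and the fetch delegate are hypothetical names used only for illustration; fetch stands in for GetContent so the sketch runs without network access.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public class WorkUnit
{
    public int ID { get; set; }
    public List<string> URLs { get; set; }
    public int Processed { get; set; }
}

// Hypothetical registry: hands out IDs, runs units on the thread pool,
// and answers status queries from the polling side.
public class WorkUnitRegistry
{
    private readonly Dictionary<int, WorkUnit> workUnits = new Dictionary<int, WorkUnit>();
    private readonly Func<string, string> fetch; // stand-in for GetContent
    private int nextId;

    public WorkUnitRegistry(Func<string, string> fetch)
    {
        this.fetch = fetch;
    }

    // Registers a new work unit, queues it on the thread pool, and returns its ID.
    public int Submit(List<string> urls)
    {
        WorkUnit unit;
        lock (workUnits)
        {
            unit = new WorkUnit { ID = ++nextId, URLs = urls };
            workUnits.Add(unit.ID, unit);
        }
        ThreadPool.QueueUserWorkItem(_ => Execute(unit));
        return unit.ID;
    }

    // Returns the number of URLs attempted so far, or -1 for an unknown ID.
    public int GetProcessedCount(int id)
    {
        lock (workUnits)
        {
            WorkUnit unit;
            return workUnits.TryGetValue(id, out unit) ? unit.Processed : -1;
        }
    }

    private void Execute(WorkUnit unit)
    {
        foreach (var url in unit.URLs)
        {
            // An unhandled exception on a pool thread would take down the process,
            // so failures are swallowed here; a real service would log them.
            try { fetch(url); }
            catch (Exception) { }
            lock (workUnits) unit.Processed++;
        }
    }
}
```

The handler behind the AJAX poll would then call GetProcessedCount(id) through WCF and compare the result with the number of URLs submitted.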
I've used a similar architecture with great success.
I am currently trying to get a lot of data about video games out of Wikipedia using their public API. I've gotten some of the way. I can currently get all the pageid I need with their associated article title. But then I need to get their Unique Identifiers (Qxxxx where x are numbers) and that takes quite a while...possibly because I have to make single queries for every title (there are 22031) or because I don't understand Wikipedia Queries.
So I thought "Why not just make multiple queries at once?" so I started working on that, but I've run into the issue in the title. After the program has run for a while (usually 3-4 minutes) about a minute passes then the application crashes with the error in the title. I think it's because my approach is just bad:
ConcurrentBag<Entry> entrybag = new ConcurrentBag<Entry>(entries);
Console.WriteLine("Getting Wikibase Item Ids...");
Parallel.ForEach<Entry>(entrybag, (entry) =>
{
entry.WikibaseItemId = GetWikibaseItemId(entry).Result;
});
Here is the method that is called:
async static Task<String> GetWikibaseItemId(Entry entry)
{
using (var client = new HttpClient(new HttpClientHandler { AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate }))
{
client.BaseAddress = new Uri("https://en.wikipedia.org/w/api.php");
// String.Replace returns a new string, so the result has to be kept.
String title = entry.Title.Replace("+", "Plus").Replace("&", "and");
String queryString = "?action=query&prop=pageprops&ppprop=wikibase_item&format=json&redirects=1&titles=" + title;
HttpResponseMessage response = await client.GetAsync(queryString);
response.EnsureSuccessStatusCode();
String result = response.Content.ReadAsStringAsync().Result;
dynamic deserialized = JsonConvert.DeserializeObject(result);
String data = deserialized.ToString();
try
{
if (data.Contains("wikibase_item"))
{
return deserialized["query"]["pages"]["" + entry.PageId + ""]["pageprops"]["wikibase_item"].ToString();
}
else
{
return "NONE";
}
}
catch (RuntimeBinderException)
{
return "NULL";
}
catch (Exception)
{
return "ERROR";
}
}
}
And just for good measure, here is the Entry Class:
public class Entry
{
public EntryCategory Category { get; set; }
public int PageId { get; set; }
public String Title { get; set; }
public String WikibaseItemId { get; set; }
}
Could anyone perhaps help out? Do I just need to change how I query or something else?
Initiating roughly 22000 http requests in parallel from one process is just too much. If your machine had unlimited resources and internet connection bandwidth, this would come close to a denial-of-service attack.
What you see is either TCP/IP port exhaustion or queue contention. To resolve it, process your array in smaller chunks, for example fetch 10 items, process those in parallel, then fetch the next ten, and so on.
Wikimedia sites specifically recommend processing requests serially:
There is no hard and fast limit on read requests, but we ask that you be considerate and try not to take a site down. Most sysadmins reserve the right to unceremoniously block you if you do endanger the stability of their site.
If you make your requests in series rather than in parallel (i.e. wait for the one request to finish before sending a new request, such that you're never making more than one request at the same time), then you should definitely be fine.
Be sure to check their API terms of service to learn whether and how many parallel requests would be in compliance.
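The chunked approach described above could be sketched roughly like this. ChunkedProcessor and ProcessInChunksAsync are hypothetical names, and the work delegate stands in for a call such as GetWikibaseItemId; with a chunk size of 1 the requests run strictly in series, which matches the Wikimedia recommendation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ChunkedProcessor
{
    // Processes items in consecutive chunks of chunkSize. All items in a chunk
    // run concurrently; the next chunk starts only after the current one finishes.
    // Results are returned in the same order as the input items.
    public static async Task<List<TResult>> ProcessInChunksAsync<TItem, TResult>(
        IReadOnlyList<TItem> items, int chunkSize, Func<TItem, Task<TResult>> work)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException("chunkSize");

        var results = new List<TResult>(items.Count);
        for (int start = 0; start < items.Count; start += chunkSize)
        {
            IEnumerable<TItem> chunk = items.Skip(start).Take(chunkSize);
            // Start every request in this chunk, then wait for all of them.
            TResult[] chunkResults = await Task.WhenAll(chunk.Select(work));
            results.AddRange(chunkResults);
        }
        return results;
    }
}
```

Called from an async method, usage might look like `var ids = await ChunkedProcessor.ProcessInChunksAsync(entries, 10, GetWikibaseItemId);`, assuming entries is a List<Entry>. This also removes the need for Parallel.ForEach and the blocking .Result call.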
I have one exe, which is nothing but a Windows Form that runs continuously in the background and watches my serial port. It has a data-received event that fires whenever my serial port receives data.
As soon as I receive data in this event, I pass it to another event handler, which saves the data to a database through a Web API method.
But data will arrive at my serial port frequently, so I want to save it to the database independently, so that the database insert operation doesn't block my incoming serial port data.
This is my code:
void _serialPort_DataReceived(object sender, SerialDataReceivedEventArgs e)//Fires as my serial port receives data
{
int dataLength = _serialPort.BytesToRead;
byte[] data = new byte[dataLength];
int nbrDataRead = _serialPort.Read(data, 0, dataLength);
if (nbrDataRead == 0)
return;
// Send data to whom ever interested
if (NewSerialDataRecieved != null)
{
NewSerialDataRecieved(this, new SerialDataEventArgs(data)); //pass serial port data to new below event handler.
}
}
void _spManager_NewSerialDataRecieved(object sender, SerialDataEventArgs e) //I want this event handler to run independently so that the database save operation doesn't block incoming serial port data
{
if (this.InvokeRequired)
{
// Using this.Invoke causes deadlock when closing serial port, and BeginInvoke is good practice anyway.
this.BeginInvoke(new EventHandler<SerialDataEventArgs>(_spManager_NewSerialDataRecieved), new object[] { sender, e });
return;
}
//data is converted to text
string str = Encoding.ASCII.GetString(e.Data);
if (!string.IsNullOrEmpty(str))
{
//This is where I will save the data through my Web API method.
RunAsync(str).Wait();
}
}
static async Task RunAsync(string data)
{
using (var client = new HttpClient())
{
client.BaseAddress = new Uri("http://localhost:33396/");
client.DefaultRequestHeaders.Accept.Clear();
client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));
var content = new StringContent(data);
var response = await client.PostAsJsonAsync<StringContent>("api/Service/Post", content);//nothing happens after this line.
}
}
Web api controller:
public class MyController : ApiController
{
[HttpPost]
public HttpResponseMessage Post(HttpRequestMessage request)
{
var someText = request.Content.ReadAsStringAsync().Result;
return new HttpResponseMessage() { Content = new StringContent(someText) };
}
}
But here is the problem:
var response = await client.PostAsJsonAsync<StringContent>("api/Service/Post", content);
Nothing happens after this line; that is, the operation blocks on this line.
So can anybody guide me with this?
We determined in the SO C# chat room that by "independently" you really mean "asynchronously".
Your code above saves this data to a WebAPI endpoint, so any solution to the problem needs to be in two parts ...
PART 1: The Client Part
On the client, all we need to do is make the call asynchronously, which frees up the current thread to carry on receiving data on the incoming serial port. We can do that like so ...
// build the api client, you may want to keep this in a higher scope to avoid recreating on each message
var api = new HttpClient();
api.BaseAddress = new Uri(someConfigVariable);
// asynchronously make the call and handle the result
api.PostAsJsonAsync("api/My", str)
.ContinueWith(t => HandleResponseAsync(t.Result))
.Unwrap();
...
PART 2: The Server Part
Since you have Web API, I'm going to assume you are using EF too. The common and "clean" way to do this, with all the extras stripped out (like model validation and error handling), might look something like this ...
// in your EF code you will have something like this ...
public async Task<User> SaveUser(User userModel)
{
// Add() only starts tracking the entity; SaveChangesAsync() performs the insert,
// so it must be awaited before the saved entity is returned.
context.Users.Add(userModel);
await context.SaveChangesAsync();
return userModel;
}
// and in your WebAPI controller something like this ...
[HttpPost]
public async Task<IHttpActionResult> Post(User newUser)
{
return Ok(await SaveUser(newUser));
}
...
Disclaimer:
The concepts involved here go much deeper and, as I hinted above, much has been left out here, like validation, error checking, etc. But this is the core of getting your serial port data into a database using the technologies I believe you are using.
Key things to read up on for anyone wanting to achieve this kind of thing might include: Tasks, Event Handling, WebAPI, EF, Async operations, streaming.
From what you describe it seems like you might want to have a setup like this:
1) Your Windows Forms app listens on the serial port.
2) When new data arrives on the port, the Windows Forms app saves it to some kind of queue (MSMQ, for example).
3) A separate Windows Service checks the queue and, as it finds new messages, sends requests to the Web API.
The best solution for this problem is to use ConcurrentQueue.
Just search on Google and you will find plenty of samples.
ConcurrentQueue is thread-safe and supports writing and reading from multiple threads.
So the component listening to the serial port can write data to the queue, and you can have two or more tasks running in parallel that listen to this queue and update the database as soon as data arrives.
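A minimal sketch of that producer/consumer setup follows. It wraps the ConcurrentQueue in a BlockingCollection so that consumers can block while the queue is empty instead of polling it; that wrapper is my substitution, not something the answer prescribes. SerialDataPump and the save delegate are hypothetical names, and save stands in for the Web API call.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public sealed class SerialDataPump : IDisposable
{
    private readonly BlockingCollection<string> queue =
        new BlockingCollection<string>(new ConcurrentQueue<string>());
    private readonly List<Task> consumers = new List<Task>();

    // Starts consumerCount long-running tasks that drain the queue and call save.
    public SerialDataPump(int consumerCount, Action<string> save)
    {
        for (int i = 0; i < consumerCount; i++)
        {
            consumers.Add(Task.Factory.StartNew(() =>
            {
                // Blocks while the queue is empty; ends after CompleteAdding()
                // has been called and the queue has been drained.
                foreach (string item in queue.GetConsumingEnumerable())
                {
                    save(item);
                }
            }, TaskCreationOptions.LongRunning));
        }
    }

    // Called from the serial port handler; returns immediately.
    public void Enqueue(string data)
    {
        queue.Add(data);
    }

    // Stops accepting data and waits for the consumers to finish the backlog.
    public void Complete()
    {
        queue.CompleteAdding();
        Task.WaitAll(consumers.ToArray());
    }

    public void Dispose()
    {
        queue.Dispose();
    }
}
```

In the serial port handler, the blocking RunAsync(str).Wait() call would be replaced by something like pump.Enqueue(str), so the handler never waits on the database.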
Not sure if it's the problem, but you shouldn't block on async code. You are doing RunAsync(str).Wait(); and I believe that's the problem. Have a look at this blog post by Stephen Cleary:
http://blog.stephencleary.com/2012/07/dont-block-on-async-code.html
I have an ASP.NET WebForms application that mimics a help desk system. The application works fine, but recently, they asked me to make it so that it can text message everyone in the system whenever a new help desk ticket is opened.
I am using Twilio to do this and it is working just fine. The only problem is that there are about 15 people in the system who should get this text message, and when the ticket is submitted, the application takes about 15-20 seconds to come back from the submit. In the future, there could be more than 15 people, even double that.
What I am wondering is if there is a way to send these messages in the background, so that the page will come back from the submit right away. Here is my relevant code:
This is the main method I wrote for sending the text message. It's in a Utility class:
public static string SendSms(string phoneNumber, string message)
{
var request = (HttpWebRequest)WebRequest.Create("https://api.twilio.com/2010-04-01/Accounts/" + Constants.TwilioId + "/Messages.json");
string postData = "From=" + Constants.TwilioFromNumber + "&To=+1" + phoneNumber + "&Body=" + HttpUtility.HtmlEncode(message);
byte[] data = Encoding.ASCII.GetBytes(postData);
string authorization = string.Format("{0}:{1}", Constants.TwilioId, Constants.TwilioAuthToken);
string encodedAuthorization = Convert.ToBase64String(Encoding.ASCII.GetBytes(authorization));
string credentials = string.Format("{0} {1}", "Basic", encodedAuthorization);
request.Method = "POST";
request.ContentType = "application/x-www-form-urlencoded";
request.ContentLength = data.Length;
request.Headers[HttpRequestHeader.Authorization] = credentials;
using (var stream = request.GetRequestStream())
{
stream.Write(data, 0, data.Length);
}
string responseString;
using (var response = (HttpWebResponse) request.GetResponse())
{
using (var reader = new StreamReader(response.GetResponseStream()))
{
responseString = reader.ReadToEnd();
}
}
return responseString;
}
And here is how I'm calling it:
public void BtnSubmit_Click(object sender, EventArgs e)
{
//
// There is more code here, but it's irrelevant
//
var employees = new Employees();
employees.GetAll();
foreach (Employee employee in employees)
{
string number = employee.CellPhoneAreaCode + employee.CellPhonePrefix +
employee.CellPhoneSuffix;
if (!string.IsNullOrEmpty(number) && number.Length == 10)
{
Utility.SendSms(number, "A new Help Desk Ticket is in the System!");
}
}
}
The only other idea I can come up with is to create a WCF service, but that seemed like overkill. Any suggestions are welcome!
Any asynchronous approach should do the trick. For example, using a Task or (if you're on .NET 4.5+) an async method. (Remember to handle the asynchronous errors by supplying a callback with something like .ContinueWith() to examine the task for errors and respond accordingly.)
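A minimal sketch of the Task-plus-ContinueWith idea, under the assumption that the send operation and the error handler are passed in as delegates. BackgroundSender and SendInBackground are hypothetical names, not part of any library.

```csharp
using System;
using System.Threading.Tasks;

public static class BackgroundSender
{
    // Runs send on a thread-pool thread. If it throws, the innermost exception
    // is passed to onError; the returned task itself completes without faulting.
    public static Task SendInBackground(Action send, Action<Exception> onError)
    {
        return Task.Run(send).ContinueWith(t =>
        {
            if (t.IsFaulted)
            {
                onError(t.Exception.GetBaseException());
            }
        });
    }
}
```

In the click handler this might be used as `BackgroundSender.SendInBackground(() => Utility.SendSms(number, message), ex => Log(ex));`, where Log is a placeholder for whatever logging the application has. The page can then return without waiting for the returned task.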
Meaningfully responding to errors in this case might be complex, though. It sounds like the sort of operation where you want to keep re-trying in the event of a failure (with logging in case of constant failures), and definitely want to continue with the loop even if one message fails. So something a little more manual might be in order.
For that I would recommend persisting the messages themselves to a simple database table from the application and continuing with the UI as you want. Then have a separate application, such as a Windows Service, which periodically polls that database table and sends the messages in a simple loop over the records.
A good approach for something like this would be to keep a simple status flag on the message records. Queued, sent, error (with an error message), etc. The Windows Service can update the records as it sends the messages in the loop. As any given message errors, just update that record and continue with the loop. Re-try error-ed messages as appropriate.
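The status-flag loop could be sketched as follows, using an in-memory list in place of the database table so that the example stays self-contained. QueuedMessage, MessageDispatcher, the Attempts counter and the maxAttempts limit are all assumptions added for illustration; send stands in for Utility.SendSms.

```csharp
using System;
using System.Collections.Generic;

public enum MessageStatus { Queued, Sent, Error }

// Stand-in for one row of the message table.
public class QueuedMessage
{
    public string PhoneNumber { get; set; }
    public string Body { get; set; }
    public MessageStatus Status { get; set; }
    public string ErrorMessage { get; set; }
    public int Attempts { get; set; }
}

public static class MessageDispatcher
{
    // One polling pass: tries every message that has not been sent yet and has
    // been attempted fewer than maxAttempts times. A failure is recorded on the
    // record and never stops the loop.
    public static void ProcessPending(IEnumerable<QueuedMessage> messages,
        Action<QueuedMessage> send, int maxAttempts)
    {
        foreach (var message in messages)
        {
            if (message.Status == MessageStatus.Sent || message.Attempts >= maxAttempts)
            {
                continue;
            }

            message.Attempts++;
            try
            {
                send(message);
                message.Status = MessageStatus.Sent;
                message.ErrorMessage = null;
            }
            catch (Exception ex)
            {
                message.Status = MessageStatus.Error;
                message.ErrorMessage = ex.Message;
            }
        }
    }
}
```

The Windows Service would call ProcessPending on a timer, loading the pending rows before each pass and saving the updated status fields afterwards.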
I created an MVC 4 app that gets data from some external sensors and then shows data depending on the values received. The external sensors expose their values through an HTTP page (e.g. http:///CheckValue). The MVC app must continuously check for those values, let's say every 5 seconds.
The basic idea is that this process must run in the background in an infinite loop, with each sensor on a different thread.
The problem is that I don't know where the best place to do this is. As of now, I just create a new Task for each sensor in the Application_Start method in the Global.asax file.
protected void Application_Start()
{
foreach (var sensor in sensors)
{
Task.Factory.StartNew(() => sensor.readValue(5000));
}
}
This is the code for readValue()
public async Task readValue(int timespan) // must be async because the body uses await
{
try
{
using (HttpClient client = new HttpClient())
{
while(true){
try
{
string result = await client.GetStringAsync(url);
//result validation logic
}
catch(Exception)
{
//Exception Handling
}
Thread.Sleep(timespan);
}
}
}catch(Exception e)
{
Debug.Write(e.Message);
}
}
I'm new to ASP.NET, so I really don't know if this should be in the Application_Start method, or if maybe it shouldn't be in the MVC app at all and should run in a separate Windows Service instead. (If so, how do I send the values back to the MVC app?)
As part of an effort to automate starting/stopping some of our NServiceBus services, I'd like to know when a service has finished processing all the messages in its input queue.
The problem is that, while the NServiceBus service is running, my C# code is reporting one less message than is actually there. So it thinks that the queue is empty when there is still one message left. If the service is stopped, it reports the "correct" number of messages. This is confusing because, when I inspect the queues myself using the Private Queues view in the Computer Management application, it displays the "correct" number.
I'm using a variant of the following C# code to find the message count:
var queue = new MessageQueue(path);
return queue.GetAllMessages().Length;
I know this will perform horribly when there are many messages. The queues I'm inspecting should only ever have a handful of messages at a time.
I have looked at other related questions, but haven't found the help I need.
Any insight or suggestions would be appreciated!
Update: I should have mentioned that this service is behind a Distributor, which is shut down before trying to shut down this service. So I have confidence that new messages will not be added to the service's input queue.
The thing is that it's not actually "one less message", but rather dependent on the number of messages currently being processed by the endpoint which, in a multi-threaded process, can be as high as the number of threads.
There's also the issue of client processes that continue to send messages to that same queue.
Probably the only "sure" way of handling this is to count the messages multiple times with a delay in between; if the number stays at zero over a certain number of attempts, then you can assume the queue is empty.
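That repeated-count check might be sketched like this. QueueMonitor and WaitUntilEmpty are hypothetical names, and getCount stands in for whatever counting call is used, so the sketch itself has no MSMQ dependency.

```csharp
using System;
using System.Threading;

public static class QueueMonitor
{
    // Returns true once getCount has reported zero on requiredZeroReads
    // consecutive reads; returns false if maxReads is reached first.
    // Any non-zero read resets the streak.
    public static bool WaitUntilEmpty(Func<int> getCount, int requiredZeroReads,
        int maxReads, TimeSpan delay)
    {
        int consecutiveZeros = 0;
        for (int read = 0; read < maxReads; read++)
        {
            consecutiveZeros = getCount() == 0 ? consecutiveZeros + 1 : 0;
            if (consecutiveZeros >= requiredZeroReads)
            {
                return true;
            }
            Thread.Sleep(delay);
        }
        return false;
    }
}
```

With the counting code from the question, a call could look like `QueueMonitor.WaitUntilEmpty(() => new MessageQueue(path).GetAllMessages().Length, 3, 20, TimeSpan.FromSeconds(1))`. The three reads and one-second delay are arbitrary values that would need tuning against how long a message typically takes to process.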
WMI was the answer! Here's a first pass at the code. It could doubtless be improved.
public int GetMessageCount(string queuePath)
{
const string queryText = "select * from Win32_PerfRawData_MSMQ_MSMQQueue";
var query = new WqlObjectQuery(queryText);
var searcher = new ManagementObjectSearcher(query);
var queues = searcher.Get();
foreach (ManagementObject queue in queues)
{
var name = queue["Name"].ToString();
if (AreTheSameQueue(queuePath, name))
{
// Depending on the machine (32/64-bit), this value is a different type.
// Casting directly to UInt64 or UInt32 only works on the relative CPU architecture.
// To work around this run-time unknown, convert to string and then parse to int.
var countAsString = queue["MessagesInQueue"].ToString();
var messageCount = int.Parse(countAsString);
return messageCount;
}
}
return 0;
}
private static bool AreTheSameQueue(string path1, string path2)
{
// Tests whether two queue paths are equivalent, accounting for differences
// in case and length (if one path was truncated, for example by WMI).
string sanitizedPath1 = Sanitize(path1);
string sanitizedPath2 = Sanitize(path2);
if (sanitizedPath1.Length > sanitizedPath2.Length)
{
return sanitizedPath1.StartsWith(sanitizedPath2);
}
if (sanitizedPath1.Length < sanitizedPath2.Length)
{
return sanitizedPath2.StartsWith(sanitizedPath1);
}
return sanitizedPath1 == sanitizedPath2;
}
private static string Sanitize(string queueName)
{
var machineName = Environment.MachineName.ToLowerInvariant();
return queueName.ToLowerInvariant().Replace(machineName, ".");
}