How to get the most recent National Weather Service radar images? - c#

I am trying to get the most recent NWS radar images using C#. There are directories on the NWS website that contain a list of the most recent images. However, the files are named by upload date, not in numerical order. They are generally uploaded every few minutes, but the exact interval can vary by as much as 5 minutes. To get the URLs of the images, I could write an XML parser to extract the URLs from the index page, but this seems overcomplicated for such a simple task. In addition, this index page is not an API, and if they ever change its format, that would break the parser. Is there some other way to get the URLs of the most recent images?

HTML is not always valid XML, but you can use a real HTML parser like HtmlAgilityPack for this.
// Download the directory index (sorted by modification date, descending).
WebClient wc = new WebClient();
var page = wc.DownloadString("http://radar.weather.gov/ridge/RadarImg/NCR/OKX/?C=M;O=D");

HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(page);

// Pick the newest image link from the listing.
var imageLink = doc.DocumentNode.SelectNodes("//td/a[@href]")
    .Select(a => a.Attributes["href"].Value)
    .OrderByDescending(a => a)
    .First();
--EDIT--
Forget about this answer and see United States Weather Radar Data Feed or API? instead.

Related

C# HTML scraping between tags

Okay, so I'm trying to build a Skype tool with a "dictionary" command that retrieves the meaning of a word from Urban Dictionary. At the moment I'm able to load the whole HTML document into a string like this:
private void urbanDictionary(string term)
{
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://www.urbandictionary.com/define.php?term=" + term);
    HttpWebResponse response = (HttpWebResponse)request.GetResponse();
    StreamReader stream = new StreamReader(response.GetResponseStream());
    string final_response = stream.ReadToEnd();
    MessageBox.Show(final_response);
}
The problem is that I only want the meaning, which is marked up like so:
<div class='meaning'> "meaning" </div>
I have tried all kinds of stuff but I can't manage to retrieve the text between the div tags.
How could I do this?
Use the HtmlAgilityPack library; it is exactly what you need.
http://www.codeproject.com/Articles/659019/Scraping-HTML-DOM-elements-using-HtmlAgilityPack-H
I can suggest: in the final_response string, first find the index of <div class='meaning'>, then create a substring from that index plus the length of <div class='meaning'> to the end of the string. In that substring, find the index position of </div> and take everything before it; that is the text between the div tags.
Example:
If you find <div class='meaning'> at index 100, create a substring from 100 + 21 (the length of the opening tag) to the end.
That substring will start with "meaning" followed by </div>.
Again find the index position of </div>; let's assume it is 10. Then take the substring from 0 with length 10, and this will give the output: meaning.
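The steps above can be sketched like this (the sample HTML is invented; in the real code the input would be final_response):

```csharp
using System;

class Program
{
    static void Main()
    {
        // Illustrative HTML; in the real code this would be final_response.
        string html = "<html><body><div class='meaning'> \"meaning\" </div></body></html>";

        string openTag = "<div class='meaning'>";
        string closeTag = "</div>";

        // Content starts just past the opening tag.
        int start = html.IndexOf(openTag) + openTag.Length;
        // Content ends at the next closing tag after that point.
        int end = html.IndexOf(closeTag, start);

        string meaning = html.Substring(start, end - start).Trim();
        Console.WriteLine(meaning); // "meaning"
    }
}
```

This breaks as soon as the div contains nested tags, which is why the HtmlAgilityPack answers are the safer route.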
Maybe not the answer you're looking for, but I used https://www.mashape.com to get an API for Urban Dictionary. Unfortunately it's unofficial, so I don't know how long it will keep working. But as the comments already mentioned, the HTML could also change at any time, most likely more often than an API would. Also, the API consumes less bandwidth, which should always be preferred.
Usage would be
var client = new WebClient();
client.Headers.Add("X-Mashape-Key", "APIKEY");
client.Headers.Add("Accept", "text/plain");
Console.WriteLine(client.DownloadString("https://mashape-community-urban-dictionary.p.mashape.com/define?term="+ term));
There are two options.
1) You can use Regex to remove the HTML tags. This is short and sweet and you can use it if the HTML source you are dealing with is not complex.
string meaningStr = Regex.Replace(final_response, @"<[^>]+>", "").Trim();
You can find the above solution tested live at: regexstorm.net/tester
2) You can use HtmlAgilityPack. This method is recommended but needs some effort to set up; with NuGet, it's not that difficult.
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(final_response);
final_response = doc.DocumentNode.InnerText;

Reading values from webpage programmatically

I don't know what it's called, but I think this is possible.
I am looking to write something that will go to a webpage, select a value from a drop-down box on that page, and read values from the page after the selection. I am not sure whether it's called a crawler or something else; I am new to this, but I heard a long time ago from a friend that this can be done.
Can anyone please give me a head start?
Thanks
You need an HTTP client library (perhaps libcurl in C, or some C# wrapper for it, or some native C# HTTP client library like this).
You also need to parse the retrieved HTML content. So you probably need an HTML parsing library (maybe HTML agility pack).
If the targeted webpage is nearly fixed and has e.g. some comments to ease finding the relevant part, you might use simpler or ad-hoc parsing techniques.
Some sites might send a nearly empty static HTML page, with the actual content being constructed dynamically by JavaScript (Ajax). In that case, you are out of luck.
Maybe you want some web service ....
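To sketch the parsing step described above: if the markup happens to be well-formed, even LINQ to XML can pull the drop-down options out. Real pages are rarely valid XML, so HtmlAgilityPack is safer there; the select snippet below is invented for illustration.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

class Program
{
    static void Main()
    {
        // Illustrative, well-formed snippet; real-world HTML usually is not
        // valid XML, so you would normally load the page with HtmlAgilityPack.
        string html = @"<select name='city'>
                          <option value='nyc'>New York</option>
                          <option value='bos'>Boston</option>
                        </select>";

        var select = XElement.Parse(html);

        // Collect the value attribute of every option in the drop-down.
        var values = select.Elements("option")
                           .Select(o => (string)o.Attribute("value"))
                           .ToArray();

        Console.WriteLine(string.Join(", ", values)); // nyc, bos
    }
}
```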
One simple way (but not the most efficient way) is to simply read the webpage as String using the WebClient, for example:
WebClient Web = new WebClient();
String Data = Web.DownloadString("Address");
Now, if the HTML happens to be well-formed (most real-world HTML is not valid XML, in which case an HTML parser is safer), you can parse the string into an XDocument and look up the tag that represents the drop-down box. Parsing the string to an XDocument is done this way:
XDocument xdoc = XDocument.Parse(Data);
Update:
If you want to read the result of the selected value, and that result is displayed within the page, do this:
1) Get all the items as I explained.
2) If the page does not make use of form models, you can pass your selected value as a query argument, for example:
www.somepage.com/Name=YourItem?
3) Read the page again and find the value.
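A minimal sketch of steps 2 and 3, assuming a hypothetical page that accepts the selection as a query-string parameter named Name:

```csharp
using System;
using System.Net;

class Program
{
    static void Main()
    {
        // Hypothetical site and parameter name, following the pattern above.
        string selected = "YourItem";
        string url = "http://www.somepage.com/?Name=" + Uri.EscapeDataString(selected);
        Console.WriteLine(url); // http://www.somepage.com/?Name=YourItem

        // Step 3: read the page again with the selection applied and
        // search the returned HTML for the value you want.
        using (var wc = new WebClient())
        {
            string result = wc.DownloadString(url);
        }
    }
}
```

Uri.EscapeDataString keeps the request valid when the selected value contains spaces or special characters.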

Parsing one item from an XML file and displaying it in C#

I am creating an app for Windows Phone that searches for lyrics, and I need to display the data from the XML.
Now, I know how to do this for many items in a ListBox,
but the XML data I receive is only ever going to have one entry.
http://lyrics.wikia.com/api.php?artist=pharrell%20williams&song=happy&fmt=xml
Above is an example of the XML I am trying to parse and display.
So, any ideas/tips on how I would go about parsing this one entry and displaying it in the TextBox?
The only things I am after are the lyrics data and the URL, as that's all I will be displaying on the page.
You can use LINQ to XML to query specific information from the XML. Following is an example that gets the lyrics and url using LINQ to XML:
var doc = XDocument.Parse(xml);
var lyrics = doc.Root.Element("lyrics").Value;
var url = doc.Root.Element("url").Value;
With that, the lyrics and url information are extracted and ready to be displayed in a TextBox or any other control of your choice.
Note: xml is xml string downloaded from link in the question.

How do I scrape a website for information?

I want my program to automatically download only certain information off a website. After finding out that this is nearly impossible I figured it would be best if the program would just download the entire web page and then find the information that I needed inside of a string.
How can I find certain words/numbers after specific words? The word before the number I want to have is always the same. The number varies and that is the number I need in my program.
Sounds like screen scraping. I recommend using CsQuery https://github.com/jamietre/CsQuery (or HtmlAgilityPack if you want). Get the source, parse it into an object, loop over all text nodes, and do your string comparison there. The actual way of doing this varies a lot depending on how the source HTML is structured.
Maybe something like this untested example written from memory (CsQuery):
var dom = CQ.Create(stringWithHtml);
dom["*"].Each((i, e) =>
{
    // handle only text nodes
    if (e.NodeType == NodeType.TEXT_NODE)
    {
        // do your check here
    }
});
I've used HTML Agility Pack for multiple applications and it works well. Lots of options too.
It's a lovely HTML parser that is commonly recommended for this. It will take malformed HTML and massage it into XHTML and then a traversable DOM, like the XML classes, so it is very useful for the code you find in the wild.
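For the "the word before the number is always the same" part, once you have the page source as a string, a small regular expression is enough. The marker word and sample text below are made up; substitute your own fixed word:

```csharp
using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        // Illustrative page text; "Temperature" stands in for whatever
        // fixed word precedes the number you need.
        string text = "Current conditions: Temperature 72 and sunny.";

        // Match the marker word, then capture the digits that follow it.
        Match m = Regex.Match(text, @"Temperature\s+(\d+)");
        if (m.Success)
        {
            int value = int.Parse(m.Groups[1].Value);
            Console.WriteLine(value); // 72
        }
    }
}
```

Running the regex over InnerText (after parsing with HtmlAgilityPack or CsQuery) rather than over the raw source avoids matching inside tags or scripts.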

Finding "Keywords" with potentially damaged HTML Files and Counting Hits

I'm trying to create a master index file for a bunch of HTML files sitting in a directory. There could be anywhere from 5 to 5000. These files aren't clean or nice, so some of the libs I looked at don't seem like they would play nice. Many of these files come from the temp directory or are carved out of the file slack (ergo incomplete files in many cases). Plus, sometimes people just write sloppy HTML.
I've basically decided to enumerate through the directory and use something like
string[] FileEntries = Directory.GetFiles(WhichDirectory);
foreach (string FileName in FileEntries)
{
    using (StreamReader sr = new StreamReader(FileName))
    {
        HTMLContents = sr.ReadToEnd();
    }
}
I'm hoping that the StreamReader can dump the contents into a string the same way it would for a text file.
Anyways, given that this might not be the cleanest HTML in the world, there a few things I'd like to parse out of the array.
Any Instance of a date in ANY format (e.g. 1/1/11, January 1st, 2011, 1-1-11, Jan-1-2011, etc) and dump these into a string to be read back later. Hopefully there is a lib or something for finding "instances" of dates.
Read a text file line by line with various "keywords" to look for in the mess of HTML. Things like "Bob Evans" or "Sausage Factory Ltd" etc. I then want to count the number of times each "keyword" shows up. The problem is I don't want the user to have to know regular expressions.
So, the desired output would be something like this:
BobEvans9304902.html
Title: Bob Evans Secret Sausage Recipe
Dates Found: "October 2nd, 2009" , "7/22/09"
"Bob Evans Sausage" : 30 hits
"Paprika" : 2 hits
"Don't overwork it" : 5 hits
All the solutions I have seen so far seem like they only work for single characters or words (LINQ) or split a "neat" sentence into words. I'm hoping I won't have to create a new copy of the string and strip out all the HTML tags, since it's not always going to be neat and I don't want to add another step to mass file processing. If that's the only way to do it, though, so be it.
You probably want to investigate an HTML parser that handles poorly formed markup, like the HTML Agility Pack. Then you can focus on the content and use XPath queries to search for/count keywords. I expect you'll still need regex to handle the dates, though.
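A rough sketch of both parts on a plain string. The sample text, keyword, and date pattern are illustrative; the pattern only covers numeric dates (1/1/11, 01-01-2011, ...), and spelled-out forms like "October 2nd, 2009" would need extra patterns or a date-parsing library:

```csharp
using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        // Illustrative text; in the real program this would be the file
        // contents (ideally the InnerText after HTML parsing).
        string text = "Posted 7/22/09. Bob Evans Sausage is great. Bob Evans Sausage sells well.";

        // Numeric dates with / or - separators.
        var dates = Regex.Matches(text, @"\b\d{1,2}[/-]\d{1,2}[/-]\d{2,4}\b");
        foreach (Match d in dates)
            Console.WriteLine("Date found: " + d.Value);

        // Count keyword hits without the user knowing regexes:
        // Regex.Escape turns the plain keyword into a literal pattern.
        string keyword = "Bob Evans Sausage";
        int hits = Regex.Matches(text, Regex.Escape(keyword), RegexOptions.IgnoreCase).Count;
        Console.WriteLine(keyword + " : " + hits + " hits");
    }
}
```

Looping this over every file from Directory.GetFiles and writing the per-file results out would give the master index described in the question.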
