I have the following list -
List<string> finalMessageContent
where
finalMessageContent[0] = "<div class="mHr" id="mFID">
<div id="postedDate">11/12/2015 11:12:16</div>
</div>" // etc etc
I am trying to sort the list by a particular value located in each entry: the contents of the postedDate tag.
First I created a new object and serialized it so that the HTML elements could be parsed -
string[][] newfinalMessageContent = finalMessageContent.Select(x => new string[] { x }).ToArray();
string json = JsonConvert.SerializeObject(newfinalMessageContent);
JArray markerData = JArray.Parse(json);
And then used Linq to try and sort using OrderByDescending -
var items = markerData.OrderByDescending(x => x["postedDate"].ToString()).ToList();
However this is failing when trying to parse the entry with -
Accessed JArray values with invalid key value: "postedDate". Array position index expected.
Perhaps LINQ is not the way to go here, although it seemed like the most efficient option. Where am I going wrong?
First, I would not use string methods, regex or a JSON parser to parse HTML. I would use HtmlAgilityPack. Then you could provide a method like this:
private static DateTime? ExtractPostedDate(string inputHtml, string controlID = "postedDate")
{
var doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(inputHtml);
HtmlNode div = doc.GetElementbyId(controlID);
DateTime? result = null;
DateTime value;
if (div != null && DateTime.TryParse(div.InnerText.Trim(), DateTimeFormatInfo.InvariantInfo, DateTimeStyles.None, out value))
result = value;
return result;
}
and the following LINQ query:
finalMessageContent = finalMessageContent
.Select(s => new { String = s, Date = ExtractPostedDate(s) })
.Where(x => x.Date.HasValue)
.OrderByDescending(x => x.Date.Value)
.Select(x => x.String)
.ToList();
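One thing to watch with the date parsing: the invariant culture uses a month-first pattern, so "11/12/2015" is read as November 12. If the dates are actually day-first (an assumption, since the sample value is ambiguous), an explicit format is safer. A minimal sketch using only the base class library; the format string and the sample values are hypothetical:

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Parses one postedDate text; returns null when it does not match the assumed format.
DateTime? ParsePostedDate(string text)
{
    // Assumption: day/month/year with a 24-hour clock.
    const string format = "dd/MM/yyyy HH:mm:ss";
    return DateTime.TryParseExact(text.Trim(), format, CultureInfo.InvariantCulture,
                                  DateTimeStyles.None, out var value)
        ? value
        : (DateTime?)null;
}

// Hypothetical inner texts of the postedDate divs.
var rawDates = new List<string> { "11/12/2015 11:12:16", "25/12/2015 08:00:00", "not a date" };

// Drop unparseable entries, then sort newest first.
var newestFirst = rawDates
    .Select(ParsePostedDate)
    .Where(d => d.HasValue)
    .Select(d => d.Value)
    .OrderByDescending(d => d)
    .ToList();

Console.WriteLine(newestFirst[0].ToString("yyyy-MM-dd"));
```

With a month-first parse the second sample value would be rejected, because there is no month 25.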
Don't know if I get your question right.
But did you know that you can parse HTML with XPath?
foreach (var row in doc.DocumentNode.SelectNodes("//div[@id='postedDate']"))
{
Console.WriteLine(row.InnerText);
}
This is just an example off the top of my head; you might have to double-check the XPath query against your document. You can also convert the result to an array, or parse the dates and apply other transformations to them.
If the HTML is not too complex, you could also extract the dates with a regex, but that would be a topic for another question.
HTH
A JSON serializer works with JSON-formatted strings, not with HTML.
To parse HTML I suggest using HtmlAgility https://htmlagilitypack.codeplex.com/
Like this:
HtmlAgilityPack.HtmlDocument htmlParsed = new HtmlAgilityPack.HtmlDocument();
htmlParsed.LoadHtml(finalMessageContent[0]);
List<HtmlNode> OrderedDivs = htmlParsed.DocumentNode.Descendants("div")
    .Where(a => a.Attributes.Any(af => af.Value == "postedDate"))
    .OrderByDescending(d => DateTime.Parse(d.InnerText)) // unsafe parsing: throws on an invalid date
    .ToList();
My JSON, ["[\"~:bbl:P5085\",\"~:cosco:NoTag\"]"], comes in through options.Type1.Values().
I am trying to keep only the values tagged with bbl, so from the above I want to keep P5085 and drop everything else. There can be multiple bbl values, and I need to keep all of them. I tried the code below, but it's not working. The split gives me
P5085","~:cosco
I don't understand what I am doing wrong in the code below. Can someone suggest a fix?
private void InitializePayload(JsonTranslatorOptions options)
{
_payload.Add("ubsub:attributes", _attributes);
_payload.Add("ubsub:relations", _relations);
JArray newType = new JArray();
foreach (JValue elem in options.Type1.Values())
{
if (elem.ToString().Contains("rdl"))
{
string val = elem.ToString().Split(":")[1];
newType.Add(val);
}
}
_payload.Add("ubsub:type", newType);
}
Try this:
var input = "['[\"~:bbl:P5085\",\"~:cosco:NoTag\"]']";
var BBLs_List = JArray.Parse(input)
.SelectMany(m => JArray.Parse(m.ToString()))
.Select(s => s.ToString().Split(":"))
.Where(w => w[1] == "bbl")
.Select(s => s[2])
.ToList();
As I explain in the comments this isn't JSON, except at the top level which is an array with a single string value. That specific string could be parsed as a JSON array itself, but its values can't be handled as JSON in any way. They're just strings.
While you could try parsing and splitting that string, it would be a lot safer to find the actual specification of that format and write a parser for it. Or find a library for that API.
You could use the following code for parsing, but it's slow, not very readable and based on assumptions that can easily break - what happens if a value contains a colon?
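foreach (var longString in JArray.Parse(input))
{
    // Each top-level value is itself a string that contains a JSON array.
    foreach (var smallString in JArray.Parse((string)longString))
    {
        // A JToken has no Split method; convert it to a string first.
        var values = ((string)smallString).Split(':');
        if (values[1] == "bbl")
        {
            return values[2];
        }
    }
}
return null;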
You could convert that to LINQ, but that would be just as hard to read :
var value = JArray.Parse(input)
    .SelectMany(longString => JArray.Parse((string)longString))
    .Select(smallString => ((string)smallString).Split(':'))
    .Where(values => values[1] == "bbl")
    .Select(values => values[2])
    .FirstOrDefault();
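On the colon problem: if the format really is "~:tag:value" (an assumption, since the actual specification is unknown), limiting the split to three parts keeps any colons inside the value intact, and a length check skips malformed entries instead of throwing. A rough sketch on plain strings, with hypothetical sample entries and the JSON parsing step left out:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entries, as they would look after parsing the inner JSON array.
var entries = new List<string> { "~:bbl:P5085", "~:cosco:NoTag", "~:bbl:A:B", "malformed" };

var bblValues = entries
    .Select(e => e.Split(new[] { ':' }, 3))                  // at most 3 parts: "~", tag, value
    .Where(parts => parts.Length == 3 && parts[1] == "bbl")  // skip malformed and non-bbl entries
    .Select(parts => parts[2])
    .ToList();

Console.WriteLine(string.Join(", ", bblValues));
```

This keeps every bbl value rather than only the first one, which is what the question asks for.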
I am having trouble working out how to use LINQ to XML to extract the total price and the individual prices from the XML below (e.g. I want to get the fare price and also the sum of all prices). Any help would be much appreciated, especially using the method syntax of LINQ to XML.
I use the following code to load the data into an XDocument and then work with the xmlResponse object to parse the response.
var xmlResponse = from element in xdoc.Descendants()
select element;
and get data like
xmlResponse.SingleOrDefault(x => x.Name.LocalName == "Registration")
Below is a subset of the XML response:
<StateList>
<State>
<SourceJobID>J999999999999</SourceJobID>
<TargetJobState>Complete</TargetJobState>
<TargetJobID>11111111</TargetJobID>
<TargetSystem>TESTSYSTEM</TargetSystem>
<VehicleDetails>
<Registration>TESTREGISRATION</Registration>
<Plate>11111111111</Plate>
<CO2Rating>160</CO2Rating>
<Badge>1111111</Badge>
<Description>TEST DESCRIPTION</Description>
</VehicleDetails>
<CompleteDetails>
<CompletedOn>2015-09-15T13:39:11+01:00</CompletedOn>
<JobDistance>0</JobDistance>
<WaitingTime />
<CO2Usage>0</CO2Usage>
<ChargeList>
<Charge>
<Name>Airport Pickup</Name>
<Currency>GBP</Currency>
<Price>0.00</Price>
</Charge>
<Charge>
<Name>Fare</Name>
<Currency>GBP</Currency>
<Price>0.00</Price>
</Charge>
<Charge>
<Name>Extra Stops</Name>
<Currency>GBP</Currency>
<Price>0.00</Price>
</Charge>
</ChargeList>
</CompleteDetails>
</State>
</StateList>
Assuming you only have a single state like in your example, you could do something like the following:
decimal fare = decimal.Parse(xml.Descendants("Charge").Single(x => x.Element("Name").Value == "Fare").Element("Price").Value);
decimal total = xml.Descendants("Charge").Sum(x => decimal.Parse(x.Element("Price").Value));
If the list contains several State elements, though, both queries run across all of them, so you would have to scope them to each State.
EDIT: If, as you say in the comments, you would like to sum only certain charges:
// Valid names of charges to sum.
string[] names = { "Airport Pickup", "Fare" };
// Iterate over every state.
foreach (var state in xml.Descendants("State"))
{
// Get all charge elements in the current state whose names are contained in 'names' - then convert their 'Price' element to decimal and sum them.
decimal stateTotal = state.Descendants("Charge").Where(x => names.Contains(x.Element("Name").Value)).Sum(x => decimal.Parse(x.Element("Price").Value));
}
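For reference, here is a self-contained variant of the same idea that can be run without the original response. It uses the explicit XElement-to-decimal conversions instead of decimal.Parse, so the result does not depend on the current culture. The prices below are made up for illustration:

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Trimmed-down sample with hypothetical non-zero prices.
var xml = XDocument.Parse(@"<StateList>
  <State>
    <CompleteDetails>
      <ChargeList>
        <Charge><Name>Airport Pickup</Name><Currency>GBP</Currency><Price>10.00</Price></Charge>
        <Charge><Name>Fare</Name><Currency>GBP</Currency><Price>25.50</Price></Charge>
        <Charge><Name>Extra Stops</Name><Currency>GBP</Currency><Price>4.25</Price></Charge>
      </ChargeList>
    </CompleteDetails>
  </State>
</StateList>");

var charges = xml.Descendants("Charge").ToList();

// Null when there is no Fare charge or it has no Price element.
decimal? fare = charges
    .Where(c => (string)c.Element("Name") == "Fare")
    .Select(c => (decimal?)c.Element("Price"))
    .FirstOrDefault();

// A missing Price element counts as zero.
decimal total = charges.Sum(c => (decimal?)c.Element("Price") ?? 0m);

Console.WriteLine("Fare: {0}, Total: {1}", fare, total);
```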
if(doc.Descendants("Charge").Any())
{
var FarePrice = doc.Descendants("Charge")
.Where(x => x.Descendants("Name").First().Value.Equals("Fare")).First().Element("Price").Value;
var Sum = doc.Descendants("Charge")
.Select(x => Convert.ToDouble(x.Descendants("Price").First().Value))
.Sum();
Console.WriteLine("Fare price:{0} Sum:{1}",FarePrice,Sum);
}
It returns 35 as sum for 10 and 25 inputs.
Fiddle here : https://dotnetfiddle.net/cuHXBn
I have a list of strings in a List container class that look like the following:
MainMenuItem|MenuItem|subItemX
..
..
..
..
MainMenuItem|MenuItem|subItem99
What I am trying to do is transform the string, using LINQ, so that the first item for each of the tokenised string is removed.
This is the code I already have:
protected static List<string> _menuItems = GetMenuItemsFromXMLFile();
_menuItems.Where(x => x.Contains(menuItemToSearch)).ToList();
First line of code is returning an entire XML file with all the menu items that exist within an application in a tokenised form;
The second line is saying 'get me all menu items that belong to menuItemToSearch'.
menuItemToSearch is contained in the delimited string that is returned. How do I remove it using linq?
EXAMPLE
Before transform: MainMenuItem|MenuItem|subItem99
After transform : MenuItem|subItem99
Hope the example illustrates my intentions
Thanks
You can take the substring that starts just after the first pipe symbol '|' to remove the first item from a string, like this:
var str = "MainMenuItem|MenuItem|subItemX";
var dropFirst = str.Substring(str.IndexOf('|')+1);
Apply this to all strings from the list in a LINQ Select to produce the desired result:
var res = _menuItems
.Where(x => x.Contains(menuItemToSearch))
.Select(str => str.Substring(str.IndexOf('|')+1))
.ToList();
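Two edge cases are worth knowing about. Contains matches the search text anywhere in the string, not only in the first token, and for a string without a pipe IndexOf returns -1, so Substring(0) hands back the string unchanged. If the search term is always the first token (an assumption on my part), splitting into two parts handles both. A small sketch with made-up sample data:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data.
var menuItems = new List<string>
{
    "MainMenuItem|MenuItem|subItemX",
    "MainMenuItem|MenuItem|subItem99",
    "OtherMenu|Unrelated|entry",
    "NoPipeHere"
};
string menuItemToSearch = "MainMenuItem";

var res = menuItems
    .Select(s => s.Split(new[] { '|' }, 2))                  // [first token, everything after it]
    .Where(p => p.Length == 2 && p[0] == menuItemToSearch)   // exact match on the first token only
    .Select(p => p[1])
    .ToList();

Console.WriteLine(string.Join(Environment.NewLine, res));
```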
Maybe something like this can help you.
var regex = new Regex("[^\\|]+\\|(.+)");
var list = new List<string>(new string[] { "MainMenuItem|MenuItem|subItem99", "MainMenuItem|MenuItem|subItem99" });
// Groups[1] is a Group object; take .Value to get the captured string.
var result = list.Where(p => regex.IsMatch(p)).Select(p => regex.Match(p).Groups[1].Value).ToList();
This should work correctly.
I have an XML file with multiple checkItem elements. I need to save each checkItem element into a database. I'm having a difficult time getting exactly what I need using the query below.
<checkItem>
<checkItemType>check</checkItemType>
<checkAmount>195000</checkAmount>
<nonMICRCheckData>
<legalAmount>195000</legalAmount>
<issueDate>2010-04-30</issueDate>
<other>PAY VAL 20 CHARACTER</other>
</nonMICRCheckData>
<postingInfo>
<date>2013-05-01</date>
<RT>10108929</RT>
<accountNumber>111111111</accountNumber>
<seqNum>11111111</seqNum>
<trancode>111111</trancode>
<amount>195000</amount>
<serialNumber>1111111</serialNumber>
</postingInfo>
<totalImageViewsDelivered>2</totalImageViewsDelivered>
<imageView>
<imageIndicator>Actual Item Image Present</imageIndicator>
<imageViewInfo>
<Format>
<Baseline>TIF</Baseline>
</Format>
<Compression>
<Baseline>CCITT</Baseline>
</Compression>
<ViewSide>Front</ViewSide>
<imageViewLocator>
<imageRefKey>201305010090085000316000085703_Front.TIF</imageRefKey>
<imageFileLocator>IFTDISB20130625132900M041.zip</imageFileLocator>
</imageViewLocator>
</imageViewInfo>
<imageViewInfo>
<Format>
<Baseline>TIF</Baseline>
</Format>
<Compression>
<Baseline>CCITT</Baseline>
</Compression>
<ViewSide>Rear</ViewSide>
<imageViewLocator>
<imageRefKey>201305010090085000316000085703_Rear.TIF</imageRefKey>
<imageFileLocator>IFTDISB20130625132900M041.zip</imageFileLocator>
</imageViewLocator>
</imageViewInfo>
</imageView>
</checkItem>
Here is the query I've been working with. I've tried several different ways with no luck. Without the use of .Concat, I cannot get the other elements; however, using .Concat does not allow me to get all values in a manageable format. I need to separate the Front and Rear imageViews based on the ViewSide value, and only need the imageRefKey and imageFileLocator values from the imageView element. Can anyone point me in the right direction?
var query = doc.Descendants("checkItem")
//.Concat(doc.Descendants("postingInfo"))
//.Concat(doc.Descendants("imageViewLocator"))//.Where(x => (string)x.Element("ViewSide") == "Front"))
//.Concat(doc.Descendants("imageViewInfo").Where(x => (string)x.Element("ViewSide") == "Rear"))
.Select(x => new {
CheckAmount = (string) x.Element("checkAmount"),
ImageRefKey = (string) x.Element("imageRefKey"),
PostingDate = (string) x.Element("dare"),
//FrontViewSide = (string) x.Element("ViewSide"),
//RearViewSide = (string) x.Element("BViewSide")
});
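You can easily get the nested elements of any XElement by calling the Elements() method on that instance and then calling Select() on the resulting collection, to create a nested collection of an anonymous type inside your main anonymous type. Note that element names are case-sensitive, and that Element() only looks at direct children, so nested values have to be reached through their parent.
var query = doc.Descendants("checkItem")
    .Select(x => new
    {
        CheckAmount = (string) x.Element("checkAmount"),
        // date sits inside postingInfo, not directly under checkItem
        PostingDate = (string) x.Element("postingInfo")?.Element("date"),
        ImageViews = x.Elements("imageView").Elements("imageViewInfo")
            .Select(iv => new
            {
                ViewSide = (string) iv.Element("ViewSide"),
                Format = (string) iv.Element("Format")?.Element("Baseline"),
                ImageRefKey = (string) iv.Element("imageViewLocator")?.Element("imageRefKey"),
                ImageFileLocator = (string) iv.Element("imageViewLocator")?.Element("imageFileLocator")
                // more parsing here
            })
            .ToList()
    });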
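If you would rather have one flat row per checkItem for the database insert, with the Front and Rear values in separate columns, something along these lines might work. This is only a sketch: the checkItems wrapper element, the shortened file names and the Locator helper are my own placeholders, not part of the original file.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Shortened sample; the <checkItems> root is a hypothetical wrapper.
var doc = XDocument.Parse(@"<checkItems>
  <checkItem>
    <checkAmount>195000</checkAmount>
    <postingInfo><date>2013-05-01</date></postingInfo>
    <imageView>
      <imageViewInfo>
        <ViewSide>Front</ViewSide>
        <imageViewLocator>
          <imageRefKey>A_Front.TIF</imageRefKey>
          <imageFileLocator>A.zip</imageFileLocator>
        </imageViewLocator>
      </imageViewInfo>
      <imageViewInfo>
        <ViewSide>Rear</ViewSide>
        <imageViewLocator>
          <imageRefKey>A_Rear.TIF</imageRefKey>
          <imageFileLocator>A.zip</imageFileLocator>
        </imageViewLocator>
      </imageViewInfo>
    </imageView>
  </checkItem>
</checkItems>");

// Hypothetical helper: the imageViewLocator for one side of a checkItem, or null.
XElement Locator(XElement checkItem, string side) =>
    checkItem.Descendants("imageViewInfo")
             .Where(iv => (string)iv.Element("ViewSide") == side)
             .Select(iv => iv.Element("imageViewLocator"))
             .FirstOrDefault();

var rows = doc.Descendants("checkItem")
    .Select(x => new
    {
        CheckAmount = (string)x.Element("checkAmount"),
        PostingDate = (string)x.Element("postingInfo")?.Element("date"),
        FrontRefKey = (string)Locator(x, "Front")?.Element("imageRefKey"),
        RearRefKey = (string)Locator(x, "Rear")?.Element("imageRefKey"),
        FileLocator = (string)Locator(x, "Front")?.Element("imageFileLocator")
    })
    .ToList();

foreach (var row in rows)
    Console.WriteLine("{0} {1} {2} {3}", row.CheckAmount, row.PostingDate, row.FrontRefKey, row.RearRefKey);
```

A missing side simply yields null for its columns instead of throwing.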
My XML looks like this:
<CURRENCIES>
<LAST_UPDATE>2014-01-17</LAST_UPDATE>
<CURRENCY>
<NAME>Dollar</NAME>
<UNIT>1</UNIT>
<CURRENCYCODE>USD</CURRENCYCODE>
<COUNTRY>USA</COUNTRY>
<RATE>3.489</RATE>
<CHANGE>-0.086</CHANGE>
</CURRENCY>
</CURRENCIES>
I want to find a specific currency by its "NAME" and "COUNTRY" elements, and take the value of "RATE".
I wrote:
public void ConvertCurrency(int value, string currency)
{
WebClient webClient = new WebClient();
XDocument xml = new XDocument();
webClient.DownloadFile("http://www.boi.org.il/currency.xml", @"currency.xml");
XDocument currency_xml = XDocument.Load("currency.xml");
var findCurrency = from currency1 in currency_xml.Descendants("CURRENCIES")
where (Convert.ToString(currency1.Element("CURRENCY").Element("NAME").Value) == currency) && (Convert.ToString(currency1.Element("CURRENCY").Element("COUNTRY").Value) == "USA")
select currency1.Element("RATE").Value;
int rate = Convert.ToInt32(findCurrency);
int result = value * rate;
Console.WriteLine("Result:{0}",result);
}
How can I do it right?
Your query is over CURRENCIES elements, and you're only looking at the first CURRENCY child. Then you're looking for a RATE child within CURRENCIES rather than CURRENCY. Additionally, the query gives you a sequence of strings, and a sequence can't be converted to a single int. I think you want:
// Load directly from the web - it's simpler...
XDocument doc = XDocument.Load("http://www.boi.org.il/currency.xml");
var element = doc.Root
    .Elements("CURRENCY")
    .Where(x => (string) x.Element("COUNTRY") == "USA" &&
                (string) x.Element("NAME") == currency)
    .FirstOrDefault();
if (element != null)
{
// You don't want an int here - you shouldn't lose information!
decimal rate = (decimal) element.Element("RATE");
decimal result = value * rate;
Console.WriteLine("Result: {0}", result);
}
else
{
Console.WriteLine("Couldn't find currency rate for USA");
}
Notes:
I haven't used a query expression here as it wouldn't help to simplify anything
This will fail if there's a CURRENCY element for USA without a RATE element; do you need to fix that?
I prefer to use the user-defined conversions in XElement instead of using Convert.ToXyz; they're specifically geared up for XML values (so won't use the culture when converting decimal values, for example)
Jon Skeet's answer is complete; anyway, here is your query fixed using LINQ query syntax:
var findCurrency = (from c in currency_xml.Descendants("CURRENCY")
where (string)c.Element("NAME") == currency
&& (string)c.Element("COUNTRY") == "USA"
select (string)c.Element("RATE")).FirstOrDefault();
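If you want the rate as a number rather than a string, the same query can select a nullable decimal, which also covers both the "no matching currency" and the "no RATE element" cases. A minimal sketch that parses the sample XML from the question inline instead of downloading it; FindRate is a hypothetical helper name:

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

var currencyXml = XDocument.Parse(@"<CURRENCIES>
  <LAST_UPDATE>2014-01-17</LAST_UPDATE>
  <CURRENCY>
    <NAME>Dollar</NAME>
    <UNIT>1</UNIT>
    <CURRENCYCODE>USD</CURRENCYCODE>
    <COUNTRY>USA</COUNTRY>
    <RATE>3.489</RATE>
    <CHANGE>-0.086</CHANGE>
  </CURRENCY>
</CURRENCIES>");

// Hypothetical helper: returns null when no currency matches or RATE is missing.
decimal? FindRate(XDocument doc, string name, string country) =>
    (from c in doc.Descendants("CURRENCY")
     where (string)c.Element("NAME") == name
        && (string)c.Element("COUNTRY") == country
     select (decimal?)c.Element("RATE")).FirstOrDefault();

decimal? rate = FindRate(currencyXml, "Dollar", "USA");
decimal? missing = FindRate(currencyXml, "Euro", "EMU");

Console.WriteLine(rate.HasValue ? "Rate found: " + rate.Value : "Rate not found");
Console.WriteLine(missing.HasValue ? "Rate found: " + missing.Value : "Rate not found");
```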