how to get values from script element - c#

I want to get some values from an HTML script element with Html Agility Pack.
I need to output these values: "KMN Gang Azet T-Shirt Fast Life" and "https://www.30grad.shop/item/images/11296/3000x3000/azet-kmngang-shirt-fastlife.jpg"
I already have the script element but can't get the values out of it.
(function(){var pinto_primary_url='/xjs/_/js/k\x3dxjs.il.en_US.mTcJQn619Nc.O/m\x3dcdos,r,jsa,csi,dbm,cr,d,ivg,dgm,ish,qtf,ivw/am\x3dAACYGj0C/rt\x3dj/d\x3d1/t\x3dzcms/rs\x3dACT90oERgo7cd2pKPZNXIWg1CpAwZ14CoQ';var _expids='201794,1354277,1354722,1354916,1355527,1355736,1355922,1356032,1356078,1356343,1356470,1356555,4029815,4031109,4038214,4038394,4041776,4043492,4045096,4045293,4045841,4047140,4047454,4048347,4048980,4050750,4051887,4056126,4056682,4058016,4061666,4061980,4062724,4064468,4064796,4069829,4076999,4078430,4078588,4080760,4081039,4081165,4082230,4083113,4097153,4097922,4097929,4098733,4098740,4098752,4102238,4103474,4103845,4103861,4104202,4104258,4106085,4106647,4109293,4109316,4109490,4110086,4110931,4112243,4113217,4115289,4115624,4115697,4116349,4116724,4116731,4116926,4116935,4117328,4117980,4118227,4118798,4119032,4119034,4119036,4120415,4120660,4120911,4121035,4121518,4122382,4123645,4124091,4124850,4125837,4126200,4127095,4127445,4127744,4128586,4129001,4129002,4129555,4129559,4129633,4130362,4130560,4131073,4131247,4131370,4131834,4132528,4132785,4132956,4133063,4133064,4133090,4133114,4133416,4133755,4134271,4134919,4134946,4135085,4135089,4135249,4135404,4135576,4135744,4135934,4136073,4136235,4136562,4136627,4137099,4137110,4137415,4137461,4137462,4137597,4137646,4138341,4138344,4138431,4138853,4139394,4139435,4139701,4139928,4140117,4140241,4140464,4140786,4140798,4141393,4141520,4141581,4141683,4141725,4141729,4142231,4142326,4142328,4142420,4142492,4142494,4142503,4142504,4142558,4142560,4142574,4142607,4142610,4142729,4143112,4143132,4143224,4143246,4143318,4143578,10200083,10202524,10202535,10202543,10202562,41317155';var 
pinto_module_config='{\x22/1S6iw\x22:{},\x2210Kacw\x22:{},\x22ADSBcg\x22:{},\x22Fa+7Pw\x22:{},\x22NpA8BQ\x22:{},\x22WZXAwQ\x22:{},\x22YFCs/g\x22:{},\x22aWiv7g\x22:{},\x22cdos\x22:{\x22bih\x22:877,\x22biw\x22:1760,\x22cdobsel\x22:false,\x22dpr\x22:\x221\x22},\x22cr\x22:{\x22eup\x22:false,\x22qir\x22:false,\x22rctj\x22:true,\x22ref\x22:false,\x22uff\x22:false},\x22csi\x22:{\x22acsi\x22:true,\x22dlm\x22:true,\x22jsmf\x22:true},\x22d\x22:{},\x22gf\x22:{\x22pid\x22:196},\x22hmvvig\x22:{},\x22jsa\x22:{\x22csi\x22:true,\x22csir\x22:100},\x22r\x22:{},\x22sx\x22:{},\x22v0BWCA\x22:{}}';var ctx=["root",[["t-a2hICACK35I","is_NJKM1C840","r-is_NJKM1C840",[["global_config",null,null,null,null,[null,"[\"AOvVaw3piykki21tcS6pJhxrjznh\\u0026ust\\u003d1511885221974957\",null,0,null,0,null,null,null,0,null,null,null,1,1,1,1,0,1,null,null,null,null,null,null,null,null,0,0,0,null,null,null,0,null,null,0,0,null,null,1,null,400,null,null,null,1,null,0,null,null,null,null,0,0,\"NONE\",null,4,\"Related image\",7,0,\"%1$d\\u0026nbsp;\\u0026#215;\\u0026nbsp;%2$d\",0,null,null,null,null,null,0,null,0,\"#222\",0,1,null,1,null,0,null,null,null,null,null,0,null,0,0,0]\n"]
]
,["group_config",null,null,null,null,[null,"[null,null,null,null,1,null,null,1]\n"]
]
,["image_group",null,null,null,null,[null,"[[[0,\"mdba4buxuK9BKM:\",[\"https://encrypted-tbn0.gstatic.com/images?q\\u003dtbn:ANd9GcRR2V_IL-Zh_LdrVzgBgvq6zcL68YSL01zNGsQJxXTo2cpXaqVz6A\",256,197]\n,[\"https://www.30grad.shop/item/images/11296/3000x3000/azet-kmngang-shirt-fastlife.jpg\",3000,2315]\n,null,0,{\"2001\":[]\n,\"2003\":[null,\"RXXHApKYqqq7RM\",\"https://www.30grad.shop/hersteller/kmn-gang/\",\"KMN Gang | 30° Shop\",\"KMN Gang Azet T-Shirt Fast Life\",null,null,null,null,null,null,null,\"30° Merchandise Shop\"]\n}]\n]\n]\n"]
]
]
]
,["t-cuCqGEujB5w","ik5Gk2IHW4Sw","r-ik5Gk2IHW4Sw",[["enable_close_for_background",null,null,null,null,[null,null,null,null,1]
]
,["initial_open",null,null,null,null,[null,null,null,null,null,0]
]
,["remain_in_lightbox_container",null,null,null,null,[null,null,null,null,0]
]
,["ux",null,null,null,null,[null,"[{\"220802553\":1}]\n"]
]
,["gsa",null,null,null,null,[null,"[{\"46740956\":0,\"244399487\":0}]\n"]
]
]
]
,["t-RHI35lUscno","iIJTpCvJ03LA","r-iIJTpCvJ03LA"]
,["t-3mFqq0A9uuY","iy1_jPBsPPco","r-iy1_jPBsPPco",[["hide_label_on_focus",null,null,null,null,[null,null,null,null,0]
]
]
]
,["t-mqWFpp0vPaI","i7Nj7DN5Ak3M","r-i7Nj7DN5Ak3M"]
,["t-mqWFpp0vPaI","i_DjAejc9SK4","r-i_DjAejc9SK4"]
,["t-mqWFpp0vPaI","iUYK6IKmz8Cg","r-iUYK6IKmz8Cg"]
]
]
;window._ = this;window._DumpException = function(e){throw e;};google.xjsu = pinto_primary_url;google.kEXPI = _expids;google.pmc = JSON.parse(pinto_module_config);google.jsc.x(ctx);})();
C# code
HtmlAgilityPack.HtmlDocument htmlDocument = new HtmlAgilityPack.HtmlDocument();
htmlDocument.LoadHtml(pagesourceCode);
var nodes = htmlDocument.DocumentNode.SelectNodes("//script");

As far as I know, you can't query JavaScript code with Html Agility Pack; it only parses the HTML itself.
To be more precise: you can read the body of a script enclosed in <script> tags as plain text, but Html Agility Pack will not extract individual values from that text for you.
You would probably need to strip the function wrapper and the variable declarations so that the remaining array literal can be parsed as JSON. Alternatively you can use a regex, but that will be hard to get right with the provided structure.
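The stripping step described above could look roughly like the following. This is a minimal sketch, not a definitive implementation: it assumes Json.NET (the Newtonsoft.Json NuGet package) is referenced, that the array literal starts right after "var ctx=" and ends before ";window._" as in the script shown in the question, and that the literal happens to be valid JSON. The names CtxScript and ExtractCtxLiteral are hypothetical, and the script string is a shortened stand-in for the InnerText of the script node.

```csharp
using System;
using Newtonsoft.Json.Linq;

static class CtxScript
{
    // Cuts the array literal assigned to "var ctx=" out of the script text.
    // Both markers are assumptions taken from the script shown in the question.
    public static string ExtractCtxLiteral(string script)
    {
        const string startMarker = "var ctx=";
        const string endMarker = ";window._";

        int start = script.IndexOf(startMarker, StringComparison.Ordinal);
        if (start < 0) return null;
        start += startMarker.Length;

        int end = script.IndexOf(endMarker, start, StringComparison.Ordinal);
        if (end < 0) return null;

        return script.Substring(start, end - start).Trim();
    }

    static void Main()
    {
        // Shortened stand-in for the text of the <script> node.
        string script =
            "(function(){var ctx=[\"root\",[[\"t-1\",\"i-1\"]]]\n;window._ = this;})();";

        string literal = ExtractCtxLiteral(script);
        if (literal == null)
        {
            Console.WriteLine("ctx not found");
            return;
        }

        // Once isolated, the literal can be treated as JSON.
        JArray ctx = JArray.Parse(literal);
        Console.WriteLine((string)ctx[0]);
    }
}
```

If the page layout changes, the markers stop matching and the method returns null, so the caller should always check the result before parsing.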

Related

How can I parse a string variable like this?

I need to parse this string in C# but don't know how. Can you suggest a good way to get the values out of these nested arrays?
["root",[["t-a2hICACK35I","isYktsZwEVMQ","r-isYktsZwEVMQ",[["global_config",null,null,null,null,[null,"[\"AOvVaw0H3zstE2R8Hh96uT8kZylb\\u0026ust\\u003d1511890102832262\",null,0,null,0,null,null,null,0,null,null,null,1,1,1,1,0,1,null,null,null,null,null,null,null,null,0,0,0,null,null,null,0,null,null,0,0,null,null,1,null,400,null,null,null,1,null,0,null,null,null,null,0,0,\"NONE\",null,4,\"Related image\",7,0,\"%1$d\\u0026nbsp;\\u0026#215;\\u0026nbsp;%2$d\",0,null,null,null,null,null,0,null,0,\"#222\",0,1,null,1,null,0,null,null,null,null,null,0,null,0,0,0]\n"]
]
,["group_config",null,null,null,null,[null,"[null,null,null,null,1,null,null,1]\n"]
]
,["image_group",null,null,null,null,[null,"[[[0,\"mdba4buxuK9BKM:\",[\"https://encrypted-tbn0.gstatic.com/images?q\\u003dtbn:ANd9GcRR2V_IL-Zh_LdrVzgBgvq6zcL68YSL01zNGsQJxXTo2cpXaqVz6A\",256,197]\n,[\"https://www.30grad.shop/item/images/11296/3000x3000/azet-kmngang-shirt-fastlife.jpg\",3000,2315]\n,null,0,{\"2001\":[]\n,\"2003\":[null,\"RXXHApKYqqq7RM\",\"https://www.30grad.shop/hersteller/kmn-gang/\",\"KMN Gang | 30° Shop\",\"KMN Gang Azet T-Shirt Fast Life\",null,null,null,null,null,null,null,\"30° Merchandise Shop\"]\n}]\n]\n]\n"]
]
]
]
,["t-cuCqGEujB5w","iP3_T8N4D_s8","r-iP3_T8N4D_s8",[["enable_close_for_background",null,null,null,null,[null,null,null,null,1]
]
,["initial_open",null,null,null,null,[null,null,null,null,null,0]
]
,["remain_in_lightbox_container",null,null,null,null,[null,null,null,null,0]
]
,["ux",null,null,null,null,[null,"[{\"220802553\":1}]\n"]
]
,["gsa",null,null,null,null,[null,"[{\"46740956\":0,\"244399487\":0}]\n"]
]
]
]
,["t-RHI35lUscno","igbzzOoE9k74","r-igbzzOoE9k74"]
,["t-3mFqq0A9uuY","iymxTruthWUk","r-iymxTruthWUk",[["hide_label_on_focus",null,null,null,null,[null,null,null,null,0]
]
]
]
,["t-mqWFpp0vPaI","iJcTl2Z4mNb0","r-iJcTl2Z4mNb0"]
,["t-mqWFpp0vPaI","iLi8ChEUFkT8","r-iLi8ChEUFkT8"]
,["t-mqWFpp0vPaI","iM4MogigWfMk","r-iM4MogigWfMk"]
]
]
The given string is a JSON array (not a JSON object).
You can use Newtonsoft.Json (add the NuGet package to your solution) to parse it into a JArray and then read whatever you need from it.
using System;
using Newtonsoft.Json.Linq;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
var s = "[\"root\",[[\"t-a2hICACK35I\",\"isYktsZwEVMQ\",\"r-isYktsZwEVMQ\",[[\"global_config\",null,null,null,null,[null,\"[\\\"AOvVaw0H3zstE2R8Hh96uT8kZylb\\\\u0026ust\\\\u003d1511890102832262\\\",null,0,null,0,null,null,null,0,null,null,null,1,1,1,1,0,1,null,null,null,null,null,null,null,null,0,0,0,null,null,null,0,null,null,0,0,null,null,1,null,400,null,null,null,1,null,0,null,null,null,null,0,0,\\\"NONE\\\",null,4,\\\"Related image\\\",7,0,\\\"%1$d\\\\u0026nbsp;\\\\u0026#215;\\\\u0026nbsp;%2$d\\\",0,null,null,null,null,null,0,null,0,\\\"#222\\\",0,1,null,1,null,0,null,null,null,null,null,0,null,0,0,0]\\n\"]\r\n]\r\n,[\"group_config\",null,null,null,null,[null,\"[null,null,null,null,1,null,null,1]\\n\"]\r\n]\r\n,[\"image_group\",null,null,null,null,[null,\"[[[0,\\\"mdba4buxuK9BKM:\\\",[\\\"https://encrypted-tbn0.gstatic.com/images?q\\\\u003dtbn:ANd9GcRR2V_IL-Zh_LdrVzgBgvq6zcL68YSL01zNGsQJxXTo2cpXaqVz6A\\\",256,197]\\n,[\\\"https://www.30grad.shop/item/images/11296/3000x3000/azet-kmngang-shirt-fastlife.jpg\\\",3000,2315]\\n,null,0,{\\\"2001\\\":[]\\n,\\\"2003\\\":[null,\\\"RXXHApKYqqq7RM\\\",\\\"https://www.30grad.shop/hersteller/kmn-gang/\\\",\\\"KMN Gang | 30° Shop\\\",\\\"KMN Gang Azet T-Shirt Fast Life\\\",null,null,null,null,null,null,null,\\\"30° Merchandise 
Shop\\\"]\\n}]\\n]\\n]\\n\"]\r\n]\r\n]\r\n]\r\n,[\"t-cuCqGEujB5w\",\"iP3_T8N4D_s8\",\"r-iP3_T8N4D_s8\",[[\"enable_close_for_background\",null,null,null,null,[null,null,null,null,1]\r\n]\r\n,[\"initial_open\",null,null,null,null,[null,null,null,null,null,0]\r\n]\r\n,[\"remain_in_lightbox_container\",null,null,null,null,[null,null,null,null,0]\r\n]\r\n,[\"ux\",null,null,null,null,[null,\"[{\\\"220802553\\\":1}]\\n\"]\r\n]\r\n,[\"gsa\",null,null,null,null,[null,\"[{\\\"46740956\\\":0,\\\"244399487\\\":0}]\\n\"]\r\n]\r\n]\r\n]\r\n,[\"t-RHI35lUscno\",\"igbzzOoE9k74\",\"r-igbzzOoE9k74\"]\r\n,[\"t-3mFqq0A9uuY\",\"iymxTruthWUk\",\"r-iymxTruthWUk\",[[\"hide_label_on_focus\",null,null,null,null,[null,null,null,null,0]\r\n]\r\n]\r\n]\r\n,[\"t-mqWFpp0vPaI\",\"iJcTl2Z4mNb0\",\"r-iJcTl2Z4mNb0\"]\r\n,[\"t-mqWFpp0vPaI\",\"iLi8ChEUFkT8\",\"r-iLi8ChEUFkT8\"]\r\n,[\"t-mqWFpp0vPaI\",\"iM4MogigWfMk\",\"r-iM4MogigWfMk\"]\r\n]\r\n]";
var jArray = JArray.Parse(s);
Console.WriteLine(jArray.ToString());
}
}
}
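Once the string is a JArray, the individual values can be reached by index. The sketch below is only an illustration under stated assumptions: the positions used (ctx[1][0][3] for the list of named entries, entry[5][1] for the embedded JSON string, then [3][0] for the image URL and [6]["2003"][4] for the title) were read off the sample string in the question and may not hold for other pages. ImageGroupReader and FindImageGroup are hypothetical names, and the input is a shortened stand-in with example.com values.

```csharp
using System;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

static class ImageGroupReader
{
    // Finds the "image_group" entry and parses the JSON text stored inside it.
    // The index positions are assumptions based on the sample in the question.
    public static JToken FindImageGroup(JArray ctx)
    {
        foreach (JToken entry in ctx[1][0][3])
        {
            if ((string)entry[0] == "image_group")
            {
                // entry[5][1] is a string that itself contains JSON, so parse it again.
                return JToken.Parse((string)entry[5][1]);
            }
        }
        return null;
    }

    // Builds a shortened stand-in for the string in the question.
    public static string SampleJson()
    {
        string inner =
            "[[[0,\"id\",[\"https://example.com/thumb.jpg\",256,197]," +
            "[\"https://example.com/full.jpg\",3000,2315],null,0," +
            "{\"2003\":[null,\"x\",\"https://example.com/page\",\"Site\",\"Some title\"]}]]]";

        // JsonConvert.ToString quotes and escapes the inner text as a JSON string.
        return "[\"root\",[[\"t\",\"i\",\"r\",[[\"image_group\",null,null,null,null,[null,"
            + JsonConvert.ToString(inner) + "]]]]]]";
    }

    static void Main()
    {
        JArray ctx = JArray.Parse(SampleJson());
        JToken image = FindImageGroup(ctx)[0][0];

        Console.WriteLine((string)image[6]["2003"][4]); // title
        Console.WriteLine((string)image[3][0]);         // full-size image URL
    }
}
```

The second parse is the important part: the image data is not nested directly in the outer array but stored as a JSON-encoded string, so JArray.Parse on the outer text alone leaves it as one opaque string value.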

Retain HTML tags on JSON to XML conversion

I have a JSON object which I convert to XML using the following code:
private string ConvertFileToXml(string file)
{
string fileContent = File.ReadAllText(file);
XmlDocument doc = JsonConvert.DeserializeXmlNode(fileContent, "root");
// Retain html tags.
doc.InnerXml = HttpUtility.HtmlDecode(doc.InnerXml);
return XDocument.Parse(doc.InnerXml).ToString();
}
where the file contains the following JSON object:
{
"id": "2639",
"type": "www.stack.com",
"bodyXML": "\n<body><p>Democrats also want to “reinvigorate and modernise” US <ft-content type=\"http://www.stack.com/ontology/content/Article\" url=\"http://api.stack.com/content/d2c32614-61c6-11e7-91a7-502f7ee26895\">antitrust</ft-content> laws for a broad attack on corporations.</p>\n<p>Mr Schumer said the Democrats’ new look should appeal to groups that backed Mrs Clinton, such as the young and minority groups, and members of the white working-class who deserted Democrats for Mr Trump. </p>\n</body>",
"title": "Democrats seek to reclaim populist mantle from Donald Trump",
"standfirst": "New economic plan is pitched as an assault on growing corporate power",
"byline": "David J Lynch in Washington",
"firstPublishedDate": "2017-07-24T17:51:25Z",
"publishedDate": "2017-07-24T17:50:25Z",
"requestUrl": "http://api.stack.com/content/e8bec6dc-708d-11e7-aca6-c6bd07df1a3c",
"brands": [
"http://api.ft.com/things/dbb0bdae-1f0c-11e4-b0cb-b2227cce2b54"
],
"standout": {
"editorsChoice": false,
"exclusive": false,
"scoop": false
},
"canBeSyndicated": "yes",
"webUrl": "http://www.stack.com/cms/s/e8bec6dc-708d-11e7-aca6-c6bd07df1a3c.html"
}
and the output of the method generates this:
<root>
<id>2639</id>
<type>www.stack.com</type>
<bodyXML>
<body>
<p>Democrats also want to “reinvigorate and modernise” US <ft-content type="http://www.stack.com/ontology/content/Article" url="http://api.stack.com/content/d2c32614-61c6-11e7-91a7-502f7ee26895">antitrust</ft-content> laws for a broad attack on corporations.</p>
<p>Mr Schumer said the Democrats’ new look should appeal to groups that backed Mrs Clinton, such as the young and minority groups, and members of the white working-class who deserted Democrats for Mr Trump. </p>
</body></bodyXML>
<title>Democrats seek to reclaim populist mantle from Donald Trump</title>
<standfirst>New economic plan is pitched as an assault on growing corporate power</standfirst>
<byline>David J Lynch in Washington</byline>
<firstPublishedDate>2017-07-24T17:51:25Z</firstPublishedDate>
<publishedDate>2017-07-24T17:50:25Z</publishedDate>
<requestUrl>http://api.stack.com/content/e8bec6dc-708d-11e7-aca6-c6bd07df1a3c</requestUrl>
<brands>http://api.ft.com/things/dbb0bdae-1f0c-11e4-b0cb-b2227cce2b54</brands>
<standout>
<editorsChoice>false</editorsChoice>
<exclusive>false</exclusive>
<scoop>false</scoop>
</standout>
<canBeSyndicated>yes</canBeSyndicated>
<webUrl>http://www.stack.com/cms/s/e8bec6dc-708d-11e7-aca6-c6bd07df1a3c.html</webUrl>
</root>
Within the original "bodyXML" of the JSON, there is HTML text with HTML tags but they get crushed into HTML entities after the conversion. What I want to do is retain these HTML tags after conversion.
How do I do this?
Help would be much appreciated!
I don't think it's possible to keep the HTML tags unencoded in the inner text of an XML node.
But it is possible to HTML-decode the inner text of that XML node after you parse the XmlDocument.
This will get you the text with all the HTML tags intact.
For example:
private static string ConvertFileToXml()
{
string fileContent = File.ReadAllText("text.json");
XmlDocument doc = JsonConvert.DeserializeXmlNode(fileContent, "root");
return System.Web.HttpUtility.HtmlDecode(doc.SelectSingleNode("root").SelectSingleNode("bodyXML").InnerText);
}
Required namespace: System.Web

parsing nested json array without labels in c#

I have a nested json array looking like this:
[
[
[
[1234.5 ,9876,5],
[1234.5 ,9876,5]
],
[
[1234.5 ,9876,5],
[1234.5 ,9876,5]
]
],
[
[
[1234.5 ,9876,5],
[1234.5 ,9876,5]
],
[
[1234.5 ,9876,5],
[1234.5 ,9876,5]
]
]
]
I have already seen many posts with answers for objects with named keys. However, I just have large, nested arrays. How should I define objects in C# that can store this? The length of the arrays can vary.
A nested list type can hold this, for example List<List<List<List<int>>>> (use double instead of int when the values can be fractional, as 1234.5 is here).
Thanks for the start, @user2033402.
Using the Newtonsoft.Json package, I got it working with the following line:
var results = Newtonsoft.Json.JsonConvert.DeserializeObject<List<List<List<List<double>>>>>(nestedJsonarray);
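For completeness, here is that line in a small runnable program. It is a sketch that assumes the Newtonsoft.Json NuGet package is referenced; NestedArrayDemo and Parse are hypothetical names, and the input is a shortened version of the array from the question.

```csharp
using System;
using System.Collections.Generic;
using Newtonsoft.Json;

static class NestedArrayDemo
{
    // Four levels of List<> mirror the four levels of brackets in the JSON.
    public static List<List<List<List<double>>>> Parse(string json)
    {
        return JsonConvert.DeserializeObject<List<List<List<List<double>>>>>(json);
    }

    static void Main()
    {
        string nestedJsonArray =
            "[[[[1234.5,9876,5],[1234.5,9876,5]]],[[[1234.5,9876,5]]]]";

        var results = Parse(nestedJsonArray);

        // The inner lists may have different lengths; List<> handles that naturally.
        Console.WriteLine(results.Count);          // outer groups
        Console.WriteLine(results[0][0].Count);    // rows in the first inner group
        Console.WriteLine(results[0][0][0][1]);    // second number of the first row
    }
}
```

Because every level is a List, arrays of varying length need no special handling; only the nesting depth has to be known in advance.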

How to handle spaces in JSON keys when serializing to XML?

I'm using Json.NET in a .NET 4.0 application in order to convert a JSON RESTful response into XML. I am running into issues converting JSON into XML if a JSON child key has a space.
So far, I am able to convert most JSON responses.
Here are example responses along with the code which I am using to generate the XML.
{
num_reviews: "2",
page_id: "17816",
merchant_id: 7165
}
And here is the response which is causing an error:
[
{
headline: "ant bully",
created_date: "2010/06/12",
merchant_group_id: 10126,
profile_id: 0,
provider_id: 10000,
locale: "en_US",
helpful_score: 1314,
locale_id: 1,
variant: "",
bottomline: "Yes",
name: "Jessie",
page_id: "17816",
review_tags: [
{
Pros: [
"Easy to Learn",
"Engaging Story Line",
"Graphics",
"Good Audio",
"Multiplayer",
"Gameplay"
]
},
{
Describe Yourself: [
"Casual Gamer"
]
},
{
Best Uses: [
"Multiple Players"
]
},
{
Primary use: [
"Personal"
]
}
],
rating: 4,
merchant_id: 7165,
reviewer_type: "Verified Reviewer",
comments: "fun to play"
},
{
headline: "Ok game, but great price!",
created_date: "2010/02/28",
merchant_group_id: 10126,
profile_id: 0,
provider_id: 10000,
locale: "en_US",
helpful_score: 1918,
locale_id: 1,
variant: "",
bottomline: "Yes",
name: "Alleycatsandconmen",
page_id: "17816",
review_tags: [
{
Pros: [
"Easy to Learn",
"Engaging Story Line"
]
},
{
Describe Yourself: [
"Frequent Player"
]
},
{
Primary use: [
"Personal"
]
},
{
Best Uses: [
"Kids"
]
}
],
rating: 3,
merchant_id: 7165,
reviewer_type: "Verified Reviewer",
comments: "This is a cute game for the kids and at a great price. Just don't expect a whole lot."
}
]
So far, I have been considering creating a mapping of the JSON data to a C# object and generating XML for that class. However, is there a way to keep this dynamic? Or is there a way to treat spaces as %20 encodings?
This question is the same as "How to validate JSON string before converting to XML in C#".
If you have any further queries, please let me know.
You can call XmlConvert.EncodeName, which escapes any characters that are invalid in an XML name as _xHHHH_ sequences.
For example, a space would become _x0020_.
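A minimal sketch of that call, using only System.Xml from the base class library. ElementNames and ToElementName are hypothetical names; how the encoded names are wired into the JSON-to-XML conversion is left open here.

```csharp
using System;
using System.Xml;

static class ElementNames
{
    // Turns an arbitrary JSON key into a string that is valid as an XML element name.
    public static string ToElementName(string jsonKey)
    {
        return XmlConvert.EncodeName(jsonKey);
    }

    static void Main()
    {
        string encoded = ToElementName("Describe Yourself");
        Console.WriteLine(encoded);                         // Describe_x0020_Yourself

        // DecodeName reverses the escaping, so the original key is not lost.
        Console.WriteLine(XmlConvert.DecodeName(encoded));  // Describe Yourself
    }
}
```

Since the encoding is reversible, the original key can be recovered later with XmlConvert.DecodeName, which is not the case when spaces are simply replaced with underscores.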
You cannot have an XML element name with a space in it. You would need to replace the space with an underscore or another valid character. If that is not feasible for you, try storing the original key as an attribute value on that node.
I hope this makes sense.

Remove Unneeded Spaces from JSON Output

I am serializing a Json.NET object, and it contains many arrays. Here is the output I currently get:
"children": [
{
"children": [
{
},
{
}
}
However, just for the ease of reading and comparing, I would like to remove the line breaks between each brace and bracket and between the comma and next brace, so it looks like this:
"children": [ {
"children": [ {
}, {
}
}
I am already serializing my JSON with the Formatting.Indented argument, so I would like to know if there is another setting I can change so that Json.NET serializes without the extra line breaks while retaining the indented formatting.
There is no feature in Json.NET to give you that kind of indentation. You'll either have to do it yourself outside of Json.NET or modify the source code.
Can you split on '{' and then join the array again by spaces?
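Doing it outside of Json.NET could be as simple as post-processing the indented string. The following is a rough sketch rather than a complete formatter: JsonCompactor and JoinBraces are hypothetical names, the sample text stands in for the output of Formatting.Indented, and the regular expressions work on raw text, so they would also rewrite a string value that happens to contain a bracket followed by whitespace and a brace.

```csharp
using System;
using System.Text.RegularExpressions;

static class JsonCompactor
{
    // Pulls an opening brace up onto the line of the preceding "[" or "},".
    public static string JoinBraces(string indentedJson)
    {
        // "[" + line break and indentation + "{"   ->  "[ {"
        string result = Regex.Replace(indentedJson, @"\[\s+\{", "[ {");

        // "}," + line break and indentation + "{"  ->  "}, {"
        return Regex.Replace(result, @"\},\s+\{", "}, {");
    }

    static void Main()
    {
        // Stand-in for text produced with Formatting.Indented.
        string indented = string.Join("\n", new[]
        {
            "{",
            "  \"children\": [",
            "    {",
            "      \"name\": \"a\"",
            "    },",
            "    {",
            "      \"name\": \"b\"",
            "    }",
            "  ]",
            "}"
        });

        Console.WriteLine(JoinBraces(indented));
    }
}
```

The remaining lines keep their original indentation, so the result is still readable and diff-friendly, just with fewer lines.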
