Capture Invalid XML - c#

I have the following code that I'm using to parse XML strings that contain collections of objects. If an object can't be parsed, I want to capture its XML as a string so that I can store it and analyze it later. I would prefer not to make a radical change, but I can't figure out how to grab the part of the XML that is invalid. xmlReader.ReadOuterXml() throws an exception when it can't parse. Thanks in advance.
// we want to read each canonical item in the report from oracle separately
while (xmlReader.ReadToFollowing(ROOT_ELEMENT))
{
string itemXml = String.Empty;
try
{
// this gives us the whole segment of xml including the root element tag
itemXml = xmlReader.ReadOuterXml();
xmlReader.
T processedItem = default;
processedItem = reportMapper.Mapper(itemXml);
successfulItems.Add(new ProcessingResult<T>()
{
ProcessedItem = processedItem
});
}

Related

parsing through a dynamic json

I'm building an import system that will be used for importing products from various vendors. The app has to conform to whatever the vendor gives us, so there are no normalized columns. I have a MongoDB collection that holds a mapping of what goes to what.
For example, vendor A has sku bound to code, but vendor B may call it itemCode.
So when I'm parsing my JSON data, how can I dynamically tell my app which field is the sku?
I'd like to be able to do what I'm already doing for vendors that use XML, which looks like this:
doc.LoadXml(content);
XmlNodeList itemPath = doc.SelectNodes(Config.XmlItemPath);
foreach (XmlNode item in itemPath)
{
Console.WriteLine(item[MapToValue("CurrencyCode")]?.InnerText);
}
I haven't seen a way to do this with Json.NET, so I'm somewhat lost on how to parse through this data easily.
Have a look at Json.Net's LINQ-to-JSON API. You could write very similar code with it:
JToken root = JToken.Parse(jsonContent);
IEnumerable<JToken> itemTokens = root.SelectTokens(Config.JsonItemPath);
foreach (JToken item in itemTokens)
{
Console.WriteLine(item[MapToValue("CurrencyCode")]?.ToString());
}
API Reference
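To make the mapping idea concrete, here is a minimal sketch of the same loop with a per-vendor lookup. The VendorMap dictionary, the sample JSON, and the "$.items[*]" path are assumptions made for illustration (the original keeps the mapping in MongoDB and the path in Config.JsonItemPath); MapToValue here is a stand-in for the asker's own method.

```csharp
using System;
using System.Collections.Generic;
using Newtonsoft.Json.Linq;

static class VendorImportSketch
{
    // Hypothetical mapping for one vendor: canonical name -> vendor field name.
    static readonly Dictionary<string, string> VendorMap = new Dictionary<string, string>
    {
        { "sku", "itemCode" },
        { "CurrencyCode", "currency" }
    };

    // Falls back to the canonical name when the vendor has no override.
    static string MapToValue(string canonicalName) =>
        VendorMap.TryGetValue(canonicalName, out var vendorName) ? vendorName : canonicalName;

    static void Main()
    {
        // Sample payload, assumed for illustration only.
        string json = @"{ ""items"": [
            { ""itemCode"": ""A-1"", ""currency"": ""USD"" },
            { ""itemCode"": ""B-2"" } ] }";

        JToken root = JToken.Parse(json);
        foreach (JToken item in root.SelectTokens("$.items[*]"))
        {
            // The explicit string conversion returns null when the field is missing.
            string sku = (string)item[MapToValue("sku")];
            string currency = (string)item[MapToValue("CurrencyCode")] ?? "(none)";
            Console.WriteLine($"{sku}: {currency}");
        }
        // Prints:
        // A-1: USD
        // B-2: (none)
    }
}
```

Only the lookup table changes from vendor to vendor; the parsing loop stays the same.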
I decided to just convert the JSON to XML, seeing as XML has a bit more flexibility in .NET:
var doc = JsonConvert.DeserializeXmlNode(content, "root");
XmlNodeList itemPath = doc.SelectNodes(Config.XmlItemPath);
if (itemPath == null) throw new Exception("Invalid XML Path.");
{
foreach (XmlNode row in itemPath)
{
Console.WriteLine(row[GetJsonProperty("Brand")]?.InnerText);
}
}
This does exactly what I want for now. I'm open to hearing how to do it without converting to XML, but this lets me move on with my work.

Check if user is on list

I have an array in a JSON file. The file looks like this:
["Maverick", "rick", "Rick", "prick", "rick_07"]
I have a username, and I want to check whether it is in the array.
public string UserToCheck = "rick";
So I'm reading the JSON file from a URL...
using (var webClient = new System.Net.WebClient())
{
var json = webClient.DownloadString("http://example.ex/users.json");
// Here I want to check if user is on list
}
But how can I check whether "UserToCheck" exactly matches one of the users in the array?
You could parse your Json with the great Newtonsoft Json Library:
var users = JsonConvert.DeserializeObject<List<string>>(json);
users.Contains(UserToCheck);
Contains is case sensitive. If you need a case-insensitive match, you could use LINQ: users.Any(u => String.Equals(u, UserToCheck, StringComparison.OrdinalIgnoreCase))
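The difference between the exact and the case-insensitive check can be seen in a small sketch. It assumes Json.NET is referenced and uses an inline string in place of the downloaded file; the class name UserListCheck is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Newtonsoft.Json;

static class UserListCheck
{
    static void Main()
    {
        // Inline stand-in for the content of users.json.
        string json = "[\"Maverick\", \"rick\", \"Rick\", \"prick\", \"rick_07\"]";
        List<string> users = JsonConvert.DeserializeObject<List<string>>(json);

        // Exact, case-sensitive match: "prick" and "rick_07" do not count.
        Console.WriteLine(users.Contains("rick"));   // True
        Console.WriteLine(users.Contains("RICK"));   // False

        // Case-insensitive match via the LINQ overload that takes a comparer.
        Console.WriteLine(users.Contains("RICK", StringComparer.OrdinalIgnoreCase)); // True
    }
}
```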
Try parsing the JSON with Json.NET. Because the file contains a JSON array rather than an object, pass the string to JArray.Parse (JObject.Parse would throw on an array).
This portion may be missing:
JArray jArr = JArray.Parse(json);
Console.WriteLine(jArr);
Helpful links: http://www.newtonsoft.com/json/help/html/ParseJsonObject.htm
http://masnun.com/2011/07/08/quick-json-parsing-with-c-sharp.html
To check for the name, turn the array into a C# list and iterate through it, comparing each entry with the username.

Newtonsoft Object serialized to String. JObject instance expected

Hi, I am trying to parse this JSON line, and I have other files with lines like it, which is why I want to automate this: I want to remove the invalid parts so that each file becomes valid JSON for reading. The problem is that a single line contains multiple JSON objects.
Example:
{"item":"value"}{"anotheritem":"value"}
Is there any way to remove
{"anotheritem":"value"}
so that it turns into valid JSON that can be read and parsed?
I tried using StreamReader, because I have multiple files that contain this invalid JSON.
I got it to detect the invalid JSON, but for some reason I can't get it to read the JSON so that I can use .remove to remove the invalid part:
using (StreamReader r = new StreamReader(itemDir))
{
string json = r.ReadToEnd();
if (json.Contains("anotheritem"))
{
JObject NoGood = JObject.FromObject(json);
MessageBox.Show(NoGood.ToString());
}
}
The Error:
Object serialized to String. JObject instance expected.
Thank you all for your time and help.
If the objects sit side by side with no space or other character between them, you can convert your string to a JSON array.
string value = "{\"item\":\"value\"}{\"anotheritem\":\"value\"}";
string arrayValue = "[" + value.Replace("}{", "},{") + "]";
var array = JArray.Parse(arrayValue);
var goodArray = array.OfType<JObject>().Where(o => o.Property("anotheritem") == null);
Edit: see my second answer for a more robust and more modern solution that also supports the built-in .NET Core JSON serializer.
Json.Net
An even better solution: Json.NET has a built-in feature for this exact scenario. See Read Multiple Fragments With JsonReader.
JsonTextReader has a property, SupportMultipleContent, that allows it to read consecutive items when set to true:
string value = "{\"item\":\"value\"}{\"anotheritem\":\"value\"}";
var reader = new JsonTextReader(new System.IO.StringReader(value));
reader.SupportMultipleContent = true;
var list = new List<JObject>();
while (reader.Read())
{
var item = JObject.Load(reader);
list.Add(item);
}
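Since the question asks for a file that contains only the first object, the same reader can also be used to keep the first fragment and discard the rest. This is a minimal sketch, assuming the first fragment is always the one to keep; the helper name KeepFirstFragment is hypothetical.

```csharp
using System;
using System.IO;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

static class FragmentCleaner
{
    // Returns only the first JSON value found in the content, serialized compactly.
    static string KeepFirstFragment(string content)
    {
        using (var reader = new JsonTextReader(new StringReader(content)))
        {
            reader.SupportMultipleContent = true;
            if (!reader.Read())
            {
                return content; // nothing to parse, leave the content untouched
            }
            // Load consumes exactly one value and stops before the next fragment.
            return JToken.Load(reader).ToString(Formatting.None);
        }
    }

    static void Main()
    {
        string value = "{\"item\":\"value\"}{\"anotheritem\":\"value\"}";
        Console.WriteLine(KeepFirstFragment(value)); // {"item":"value"}
    }
}
```

The result could then be written back with File.WriteAllText to make the file valid JSON.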
System.Text.Json
If you want to use System.Text.Json, it's also achievable. There is no SupportMultipleContent property, but Utf8JsonReader will do the job for you.
string value = "{\"item\":\"value\"}{\"anotheritem\":\"value\"}";
var bytes = Encoding.UTF8.GetBytes(value).AsSpan();
var list = new List<JsonDocument>();
while (bytes.Length != 0)
{
var reader = new Utf8JsonReader(bytes);
var item = JsonDocument.ParseValue(ref reader);
list.Add(item);
bytes = bytes.Slice((int) reader.BytesConsumed);
}

converting graph api for xml

I'm having trouble converting a JSON string from the Facebook Graph API to XML. I am using the Facebook C# SDK and Json.NET.
At conversion time it returns this error: Name cannot begin with the '0' character, hexadecimal value 0x30.
This is the code:
dynamic result = await _fb.GetTaskAsync("me/feed");
string FBxml = JsonConvert.DeserializeXNode(result.ToString()).ToString();
It looks like there is a problem with portion of the json string as mentioned below (taken from your link http://jsfiddle.net/btripoloni/PaLC2/)
"story_tags": {
"0": [{
"id": "100000866891334",
"name": "Bruno Tripoloni",
"offset": 0,
"length": 15,
"type": "user"}]
},
A class or property name cannot begin with a digit such as '0'. Try generating the classes with http://json2csharp.com/ and you will get the idea.
To solve this problem, you can create a dynamic object and go through each property, or create a JsonConverter and write code in ReadJson to convert the "0" to a meaningful name. Maybe this can help you: http://blog.maskalik.com/asp-net/json-net-implement-custom-serialization
If this is not your problem, then update the question with more information, such as the class structure of FBxml, the call stack of the exception (which line of code is throwing it), the Json.NET version, etc.
As keyr says, the problem is with those JSON properties that have numeric names. In XML names can contain numeric characters but cannot begin with one: XML (see the Well-formedness and error-handling section).
My idea was to recursively parse the JSON with JSON.Net, replacing properties that had numeric names:
var jsonObject = JObject.Parse(json);
foreach (var obj in jsonObject)
{
Process(obj.Value);
}
XDocument document = JsonConvert.DeserializeXNode(jsonObject.ToString());
....
private static void Process(JToken jToken)
{
if (jToken.Type == JTokenType.Property)
{
JProperty property = jToken as JProperty;
int value;
if (int.TryParse(property.Name, out value))
{
JToken token = new JProperty("_" + property.Name, property.Value);
jToken.Replace(token);
}
}
if (jToken.HasValues)
{
// foreach won't work here, as the call to jToken.Replace(token) above
// results in the "collection was modified" error.
for(int i = 0; i < jToken.Values().Count(); i++)
{
JToken obj = jToken.Values().ElementAt(i);
Process(obj);
}
}
}
This seemed to work well, prefixing numeric names with _. At this line:
XDocument document = JsonConvert.DeserializeXNode(jsonObject.ToString());
it crashed with an error saying that invalid/not well formed XML had been created. I don't have the actual error with me, but you can run the above code to replicate it.
I think from here you may need to revisit converting the JSON to XML in the first place. Is this a specific requirement?
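If the conversion to XML does have to stay, one variation worth trying is sketched below. It collects every property whose name starts with a digit, renames them from the deepest level upward, and passes an explicit root element name to DeserializeXNode. Whether the missing root element was the cause of the crash described above is an assumption, not something the thread confirms; the class name NumericNameFixer and the "root" element name are hypothetical.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

static class NumericNameFixer
{
    public static XDocument ToXml(string json)
    {
        JObject root = JObject.Parse(json);

        // Materialize the list before changing anything, so the tree is not
        // modified while it is being enumerated. Reversing the document order
        // renames nested properties before the properties that contain them.
        var numericProps = root.Descendants()
            .OfType<JProperty>()
            .Where(p => p.Name.Length > 0 && char.IsDigit(p.Name[0]))
            .Reverse()
            .ToList();

        foreach (JProperty property in numericProps)
        {
            // XML names may not start with a digit, so prefix them with "_".
            property.Replace(new JProperty("_" + property.Name, property.Value));
        }

        // The second argument wraps everything in a single root element,
        // which XML requires when the JSON has several top-level properties.
        return JsonConvert.DeserializeXNode(root.ToString(), "root");
    }

    static void Main()
    {
        string json = @"{ ""story_tags"": { ""0"": [ { ""id"": ""1"", ""offset"": 0 } ] }, ""id"": ""42"" }";
        Console.WriteLine(ToXml(json));
    }
}
```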

Which approach to templating in C# should I take?

What I have
I have templates that are stored in a database, and JSON data that gets converted into a dictionary in C#.
Example: 
Template: "Hi {FirstName}"
Data: "{FirstName: 'Jack'}"
This works easily with one level of data by using a regular expression to pull out anything within {} in the template.
What I want
I would like to be able to go deeper in the JSON than the first layer.
Example:
Template: "Hi {Name: {First}}"
Data: "{Name: {First: 'Jack', Last: 'Smith'}}"
What approach should I be taking? (and some guidance on where to start with your pick)
A regular expression
Not use JSON in the template (in favor of xslt or something similar)
Something else
I'd also like to be able to loop through data in the template, but I have no idea at all where to start with that one!
Thanks heaps
You are in luck! SmartFormat does exactly as you describe. It is a lightweight, open-source string formatting utility.
It supports named placeholders:
var template = " {Name:{Last}, {First}} ";
var data = new { Name = new { First="Dwight", Last="Schrute" } };
var result = Smart.Format(template, data);
// Outputs: " Schrute, Dwight " SURPRISE!
It also supports list formatting:
var template = " {People:{}|, |, and} ";
var data = new { People = new[]{ "Dwight", "Michael", "Jim", "Pam" } };
var result = Smart.Format(template, data);
// Outputs: " Dwight, Michael, Jim, and Pam "
You can check out the unit tests for Named Placeholders and List Formatter to see plenty more examples!
It even has several forms of error-handling (ignore errors, output errors, throw errors).
Note: the named placeholder feature uses reflection and/or dictionary lookups, so you can deserialize the JSON into C# objects or nested Dictionaries, and it will work great!
Here is how I would do it:
Change your template to this format Hi {Name.First}
Now create a JavaScriptSerializer to convert the JSON into a Dictionary<string, object>:
JavaScriptSerializer jss = new JavaScriptSerializer();
dynamic d = jss.Deserialize(data, typeof(object));
Now the variable d has the values of your JSON in a dictionary.
Having that you can run your template against a regex to replace {X.Y.Z.N} with the keys of the dictionary, recursively.
Full Example:
public void Test()
{
// Your template is simpler
string template = "Hi {Name.First}";
// some JSON
string data = @"{""Name"":{""First"":""Jack"",""Last"":""Smith""}}";
JavaScriptSerializer jss = new JavaScriptSerializer();
// now `d` contains all the values you need, in a dictionary
dynamic d = jss.Deserialize(data, typeof(object));
// running your template against a regex to
// extract the tokens that need to be replaced
var result = Regex.Replace(template, @"{?{([^}]+)}?}", (m) =>
{
// Skip escape values (ex: {{escaped value}} )
if (m.Value.StartsWith("{{"))
return m.Value;
// split the token by `.` to run against the dictionary
var pieces = m.Groups[1].Value.Split('.');
dynamic value = d;
// go after all the pieces, recursively going inside
// ex: "Name.First"
// Step 1 (value = value["Name"])
// value = new Dictionary<string, object>
// {
// { "First": "Jack" }, { "Last": "Smith" }
// };
// Step 2 (value = value["First"])
// value = "Jack"
foreach (var piece in pieces)
{
value = value[piece]; // go inside each time
}
return value;
});
}
I didn't handle exceptions (e.g. the value couldn't be found). You can handle this case and return the matched text if the value wasn't found: m.Value for the raw match, or m.Groups[1].Value for the string between {}.
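The same dotted-path idea can also be sketched with Json.NET instead of JavaScriptSerializer, which is a substitution on my part and not what the answer above uses. JObject.SelectToken resolves a path such as Name.First directly, and a placeholder that resolves to nothing is left as it is. The names TemplateSketch and Render are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;
using Newtonsoft.Json.Linq;

static class TemplateSketch
{
    // Replaces each {Some.Path} placeholder with the value found at that path.
    static string Render(string template, string json)
    {
        JObject data = JObject.Parse(json);
        return Regex.Replace(template, @"\{([^{}]+)\}", match =>
        {
            // SelectToken returns null when the path does not exist;
            // it can throw on a malformed path, which is not handled here.
            JToken token = data.SelectToken(match.Groups[1].Value.Trim());
            return token != null ? token.ToString() : match.Value;
        });
    }

    static void Main()
    {
        string json = @"{""Name"":{""First"":""Jack"",""Last"":""Smith""}}";
        Console.WriteLine(Render("Hi {Name.First}", json));   // Hi Jack
        Console.WriteLine(Render("Hi {Name.Middle}", json));  // Hi {Name.Middle}
    }
}
```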
Have you thought of using Javascript as your scripting language? I had great success with Jint, although the startup cost is high. Another option is Jurassic, which I haven't used myself.
If you happen to have a web application, using Razor may be an idea; see here.
Using Regex or any sort of string parsing can certainly work for trivial things, but can get painful when you want logic or even just basic hierarchies. If you deserialize your JSON into nested Dictionaries, you can build a parser relatively easily:
// Untested and error prone, just to illustrate the concept
var parts = "parentObj.childObj.property".Split('.');
Dictionary<string, object> current = yourDeserializedObject;
// walk down to the dictionary that holds the last key
foreach (var key in parts.Take(parts.Length - 1))
{
current = (Dictionary<string, object>)current[key];
}
var value = current[parts.Last()];
Just whatever you do, don't do XSLT. Really, if XSLT is the answer then the question must have been really desperate :)
Why not use NVelocity or something?
