Split a text with a regular expression - c#

I am trying to split a text and put it into a dictionary. The problem is that my text doesn't have a clear structure.
Text:
{
"about": "where I'm meant to be...",
"bio": "Visit my official blog at:\n\nhttp://ABC.com/ \n\nAdd me on Twitter:\n\nhttp://www.ABC.com/ABC",
"category": "Public figure",
"is_published": true,
"location": {
"street": "",
"city": "Los Angeles",
"state": "CA",
"country": "United States",
"zip": ""
},
"talking_about_count": 254637,
"username": "ABC",
"website": "http://kimkardashian.celebuzz.com/\nhttp://www.twitter.com/kimkardashian\n",
"were_here_count": 0,
"id": "114696805612",
"name": "ABC",
"link": "http://www.ABC.com/ABC",
"likes": 0,
"cover": {
"cover_id": "000000000",
"source": "http://ABC.jpg",
"offset_y": 0,
"offset_x": 200
}
}
As you can see, I have "," as a delimiter. The problem is that there are some composite objects, like:
"location": {
"street": "",
"city": "Los Angeles",
"state": "CA",
"country": "United States",
"zip": ""
},
That's why I can't use string.Split(',').
I have heard about regular expressions, but I don't know how to use them.
Is there any way to get this information separated into a dictionary or any other structure?

Your data is in a standard format (JSON), and parsers for it already exist. You can easily install Json.NET through NuGet in Visual Studio.
Regular expressions are a powerful tool that makes pattern matching a lot simpler. For me, that's as far as they go. They can be used to build parsers and all sorts of other things, but it gets complicated.
So you could write your own JSON parser using regular expressions, but it would take a lot of time. It would be like building a lockpick when there is a key available.
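To illustrate the "use a parser" advice, here is a minimal sketch that flattens the text into a dictionary. Note that it swaps in System.Text.Json (built into .NET Core 3.0 and later) instead of Json.NET, only so that the example has no external dependency. The JsonFlattener class and the dotted key convention ("location.city") are assumptions made for illustration, not something from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper: turns a JSON object into a flat dictionary.
public static class JsonFlattener
{
    public static Dictionary<string, string> Flatten(string json)
    {
        var result = new Dictionary<string, string>();
        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            Walk(doc.RootElement, "", result);
        }
        return result;
    }

    private static void Walk(JsonElement element, string prefix, Dictionary<string, string> result)
    {
        if (element.ValueKind == JsonValueKind.Object)
        {
            // Recurse into nested objects, joining key names with a dot.
            foreach (JsonProperty property in element.EnumerateObject())
            {
                string key = prefix.Length == 0 ? property.Name : prefix + "." + property.Name;
                Walk(property.Value, key, result);
            }
        }
        else
        {
            // Leaf value: strings are stored unquoted, numbers as their raw text.
            result[prefix] = element.ToString();
        }
    }
}
```

With the text from the question, Flatten(text)["location.city"] would give "Los Angeles". Arrays are not expanded here; they are stored as their raw JSON text.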

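JavaScriptSerializer may satisfy your needs:
using System;
using System.Collections.Generic;
using System.Web.Script.Serialization;
var jss = new JavaScriptSerializer();
// Use object values: the text contains numbers, booleans and nested objects such as "location".
var dict = jss.Deserialize<Dictionary<string, object>>(jsonText);
Console.WriteLine(dict["talking_about_count"]);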
See: http://msdn.microsoft.com/en-us/library/system.web.script.serialization.javascriptserializer.aspx

Related

How do I extract a particular value based on another value?

I want to be able to extract an ID based on whether that object has a particular property. I need this to be done via regex. Here is an example of the JSON I am working with:
{
"workspaceid": ws01,
"data": {
"workspacetitle": "My Workspace"
},
"collections": {
"projects": [{
"id": 01,
"data": {
"title": "My Project 01",
"enddateperiod": "2020-02-20T23:59:59",
"profilecomplete": true,
"synced": false
},
"lists": {
"projectcode": [{
"id": pcodered,
"data": {
"code": "myproject123",
"name": "OffshoreProject"
}
}]
}
}, {
"id": 02,
"data": {
"title": "My Project 02",
"enddateperiod": "2020-02-20T23:59:59",
"profilecomplete": false,
"synced": false
},
"lists": {
"projectcode": [{
"id": pcodered,
"data": {
"code": "myproject123",
"name": "OffshoreProject"
}
}]
}
}]
}}
So what I want to extract is the ID of the project whose profile is not complete ("profilecomplete":false). So in this case, I want to select Project 2's id (which is 02).
How can I do this via regex? I've also managed to remove all of the whitespace and newlines, so the JSON is essentially one long line. Would it be easier to extract the ID with a regex from that form? Either way, I could use some help on how to get this ID.
NOTE: The format of the JSON cannot change.

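This one works:
/"id": ([^,]*?)(?=,[^{]*{[^}]*"profilecomplete": false)/
Explanation:
First, match the literal characters "id": followed by a space.
Then capture, in a group, the characters that are not ",".
Finally, a lookahead: expect ",", then characters that are not "{", then "{", and then, before reaching the closing "}", the text "profilecomplete": false.
The pattern expects a space after each colon, so if you have stripped all whitespace from the JSON, remove those spaces from the pattern as well.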
But I agree that a JSON parser would have been my preferred option!
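For reference, a minimal C# sketch of the pattern above. It includes two assumed tweaks that are not in the original answer: \s* after each colon, so it works with or without whitespace, and an escaped \{ for readability. The class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper wrapping the lookahead pattern.
public static class IncompleteProfileFinder
{
    // Group 1 captures the id; the lookahead requires that the next "{...}" block
    // contains "profilecomplete": false before its first closing brace.
    private static readonly Regex IdPattern = new Regex(
        @"""id"":\s*([^,]*?)(?=,[^{]*\{[^}]*""profilecomplete"":\s*false)");

    // Returns the id of the first project with an incomplete profile, or null if none matches.
    public static string FindFirstIncompleteId(string json)
    {
        Match match = IdPattern.Match(json);
        return match.Success ? match.Groups[1].Value : null;
    }
}
```

This relies on "data" following "id" directly and on no "}" appearing before "profilecomplete" inside that block, so it is only as stable as the JSON layout.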

Masking the value based on the name using regular expression in JSON string

The JSON string looks like this:
{"data": [{
"id": "BankDetails.FirstName",
"value": "abcd",
"type": "Text"
},
{
"id": "BankDetails.AccountNumber",
"value": "12345678",
"type": "Text"
},
{
"id": "BankDetails.SortCode",
"value": "123",
"type": "Text"
}]
}
The "value": "12345678" under "id": "BankDetails.AccountNumber" should be replaced with "value": "********". How can we write a regex pattern for this?
The exact output should be:
{"data": [{
"id": "BankDetails.FirstName",
"value": "abcd",
"type": "Text"
},
{
"id": "BankDetails.AccountNumber",
"value": "********",
"type": "Text"
},
{
"id": "BankDetails.SortCode",
"value": "123",
"type": "Text"
}]
}
Note: BankDetails.AccountNumber will not always be at the same position in the array.

You can use a variable-width positive lookbehind (supported in C#) to target each digit and replace it with *, using this regex:
(?<="id": "BankDetails\.AccountNumber",\s*"value": "\d*)\d
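A minimal sketch of how the lookbehind pattern above could be applied with Regex.Replace. The AccountNumberMasker class is a hypothetical name, and the sketch assumes that "value" directly follows "id" in each object and that the account number consists only of digits.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper that masks the digits of the account number.
public static class AccountNumberMasker
{
    // Matches a single digit that is preceded by the AccountNumber id,
    // the "value" key, and any digits already matched.
    private static readonly Regex DigitAfterAccountId = new Regex(
        @"(?<=""id"":\s*""BankDetails\.AccountNumber"",\s*""value"":\s*""\d*)\d");

    public static string Mask(string json)
    {
        // Each matched digit is replaced individually, so the length is preserved.
        return DigitAfterAccountId.Replace(json, "*");
    }
}
```

Because every digit is matched on its own, the mask has the same length as the original value. Values under other ids, such as BankDetails.SortCode, are left untouched.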

How to evaluate user inputs like Greater than today in LUIS and BOT Framework

I am building a BOT using Microsoft Bot framework in C# and LUIS.
Using the prebuilt entity datetime.V2, I am able to capture terms like "last week", "next month" etc. properly.
However, I am stuck when it comes to:
"Get me all products which has expiry life greater than 2 years",
"greater than today",
"> today" etc.,
Do I use LUIS composite entities? If so, would "Greater than" and "today" become the children of a composite entity named, say, "DateComparer"?
Is there any GitHub sample that I can refer to in order to understand how composite entities would be parsed?
Thanks for your help and time in advance.

I created a ComparerList taking Miskov's suggestion.
"composites": [
{
"name": "DateComparer",
"children": [
"ComparerList",
"datetimeV2"
]
}
],
"closedLists": [
{
"name": "ComparerList",
"subLists": [
{
"canonicalForm": "gt",
"list": [
"greater than",
"larger than",
"more than",
"over",
"exceeding",
"higher than",
">"
]
},
{
"canonicalForm": "lt",
"list": [
"<",
"less than"
]
},
{
"canonicalForm": "eq",
"list": [
"=",
"equal to"
]
},
{
"canonicalForm": "le",
"list": [
"<=",
"less than or equal to"
]
},
{
"canonicalForm": "ge",
"list": [
">=",
"greater than or equal to"
]
}
]
}
],
"bing_entities": [
"datetimeV2"
],
I was able to train LUIS on utterances like "Give me all items with expiry greater than yesterday", and I get the following JSON back when testing:
"entities": [
{
"entity": "greater than",
"type": "ComparerList",
"startIndex": 33,
"endIndex": 44,
"resolution": {
"values": [
"gt"
]
}
},
{
"entity": "greater than yesterday",
"type": "DateComparer",
"startIndex": 33,
"endIndex": 54,
"score": 0.6950233
},
{
"entity": "yesterday",
"type": "builtin.datetimeV2.date",
"startIndex": 46,
"endIndex": 54,
"resolution": {
"values": [
{
"timex": "2017-09-14",
"type": "date",
"value": "2017-09-14"
}
]
}
}
From this, I retrieve the "gt" resolution and use it in my code.
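One way the retrieved resolution could be used in code is sketched below. This is not the poster's actual code: the ComparerEvaluator class and its Matches method are hypothetical, and only the canonical forms ("gt", "lt", "eq", "le", "ge") come from the ComparerList above.

```csharp
using System;

// Hypothetical helper: applies a ComparerList canonical form to two dates.
public static class ComparerEvaluator
{
    // candidate is the value being tested (for example an expiry date);
    // reference is the date resolved by datetimeV2 (for example "yesterday").
    public static bool Matches(string canonicalForm, DateTime candidate, DateTime reference)
    {
        switch (canonicalForm)
        {
            case "gt": return candidate > reference;
            case "lt": return candidate < reference;
            case "eq": return candidate == reference;
            case "le": return candidate <= reference;
            case "ge": return candidate >= reference;
            default:
                throw new ArgumentException("Unknown comparer: " + canonicalForm, nameof(canonicalForm));
        }
    }
}
```

For the utterance above, the call would be something like ComparerEvaluator.Matches("gt", item.Expiry, new DateTime(2017, 9, 14)), where item.Expiry is again an assumed property.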

Receiving bad JSON format from a service

I get the following response. How can I make it valid JSON?
{{
"id": "123",
"name": "Kaizen",
"living": {
"city": "Sydney",
"state": "NSW"
},
"Country": {
"name": "Australia",
"region": "APAC"
}
}}

It looks like valid JSON except for the extra pair of opening and closing braces.
You can simply cut them off:
string jsonString = yourServerClient.GetData();
// Trim whitespace first so the extra braces are the first and last characters.
jsonString = jsonString.Trim();
// Drop one character from each end: the extra "{" and "}".
jsonString = jsonString.Substring(1, jsonString.Length - 2);
var jsonObj = JsonConvert.DeserializeObject(jsonString);
However, I would recommend refusing to work with incorrect or invalid data sources; that is the road to hell.
You can never predict what they will do next, and you definitely do not want to spend time rewriting (and worsening) your code every time they change their service, just so that it supports their incorrect format.

How to handle spaces in JSON keys when serializing to XML?

I'm using Json.NET in a .NET 4.0 application in order to convert a JSON RESTful response into XML. I am running into issues converting JSON into XML if a JSON child key has a space.
So far, I am able to convert most JSON responses.
Here are example responses along with the code which I am using to generate the XML.
{
num_reviews: "2",
page_id: "17816",
merchant_id: 7165
}
And here is the response which is causing an error:
[
{
headline: "ant bully",
created_date: "2010/06/12",
merchant_group_id: 10126,
profile_id: 0,
provider_id: 10000,
locale: "en_US",
helpful_score: 1314,
locale_id: 1,
variant: "",
bottomline: "Yes",
name: "Jessie",
page_id: "17816",
review_tags: [
{
Pros: [
"Easy to Learn",
"Engaging Story Line",
"Graphics",
"Good Audio",
"Multiplayer",
"Gameplay"
]
},
{
Describe Yourself: [
"Casual Gamer"
]
},
{
Best Uses: [
"Multiple Players"
]
},
{
Primary use: [
"Personal"
]
}
],
rating: 4,
merchant_id: 7165,
reviewer_type: "Verified Reviewer",
comments: "fun to play"
},
{
headline: "Ok game, but great price!",
created_date: "2010/02/28",
merchant_group_id: 10126,
profile_id: 0,
provider_id: 10000,
locale: "en_US",
helpful_score: 1918,
locale_id: 1,
variant: "",
bottomline: "Yes",
name: "Alleycatsandconmen",
page_id: "17816",
review_tags: [
{
Pros: [
"Easy to Learn",
"Engaging Story Line"
]
},
{
Describe Yourself: [
"Frequent Player"
]
},
{
Primary use: [
"Personal"
]
},
{
Best Uses: [
"Kids"
]
}
],
rating: 3,
merchant_id: 7165,
reviewer_type: "Verified Reviewer",
comments: "This is a cute game for the kids and at a great price. Just don't expect a whole lot."
}
]
So far, I have been considering creating a mapping of the JSON data to a C# object and generating XML for that class. However, is there a way to keep this dynamic? Or is there a way to treat spaces as %20 encodings?

This question is the same as "How to validate JSON string before converting to XML in C#".
If you have any further queries, please let me know.

You can call XmlConvert.EncodeName, which escapes any invalid characters as _xHHHH_ sequences, where HHHH is the character's hexadecimal code.
For example, a space becomes _x0020_.
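A short sketch of the call, using keys from the question. The XmlNameDemo wrapper and its method names are only illustrative; XmlConvert.EncodeName and XmlConvert.DecodeName are the real framework methods in System.Xml.

```csharp
using System;
using System.Xml;

// Hypothetical wrapper around the XmlConvert name-escaping methods.
public static class XmlNameDemo
{
    // Escapes characters that are not allowed in XML names, e.g. the space in "Describe Yourself".
    public static string ToElementName(string jsonKey)
    {
        return XmlConvert.EncodeName(jsonKey);
    }

    // Reverses the escaping to recover the original JSON key.
    public static string ToJsonKey(string elementName)
    {
        return XmlConvert.DecodeName(elementName);
    }
}
```

Here ToElementName("Describe Yourself") gives "Describe_x0020_Yourself", and ToJsonKey turns it back into the original key. Keys that are already valid names, such as "Pros", pass through unchanged.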

You cannot have an XML element name with a space in it. You would need to replace the space with an underscore or some other valid character. If that is not feasible for you, try putting that value in an attribute of that node.
I hope this makes sense.
