Remove single quote characters from specific field inside JSON data - c#

I have the JSON string below, which needs some cleaning before I can deserialize it into an object in C#. Here's my JSON string:
{'data':[
{'ID':'01','Name':'Name 1','Description':'abc','Skills':[{'Type':'abc','Technical':'abc','Description':'abc'}],'Status':false,'Inactive':0},
{'ID':'02','Name':'Name 2','Description':'abc','Skills':[{'Type':'abc','Technical':'abc','Description':'abc'}],'Status':false,'Inactive':0},
{'ID':'03','Name':'Name 3','Description':'abc','Skills':[{'Type':'abc','Technical':'abc','Description':'abc'}],'Status':false,'Inactive':1}]}
What I'm trying to do is remove the single quote (') characters from the following field in the above data:
'Skills':[{'Type':'abc','Technical':'abc','Description':'abc'}]
So what I need is for the "Skills" field to look like this:
'Skills':[{Type:abc,Technical:abc,Description:abc}]
I designed this regex pattern:
(?<='Skills':\[\{)(.*?)(?=\}\],)
It matches the string below, but I don't know how to exclude single quotes.
'Type':'abc','Technical':'abc','Description':'abc'
Can someone please help?

It would be better to fix the source so that it emits standard JSON; single-quoted strings are not valid JSON.
If you can't change the source output, you can use this:
string content = Console.ReadLine();
var matchResult = new Regex("(?<='Skills':).*?}]").Matches(content);
foreach(Match match in matchResult)
{
string matchValueWithoutSingleQuote = match.Value.Replace("'", string.Empty);
content = content.Replace(match.Value, matchValueWithoutSingleQuote);
}
Console.WriteLine(content);
Console.ReadLine();
The output is:
{'data':[
{'ID':'01','Name':'Name 1','Description':'abc','Skills':[{Type:abc,Technical:abc,Description:abc}],'Status':false,'Inactive':0},
{'ID':'02','Name':'Name 2','Description':'abc','Skills':[{Type:abc,Technical:abc,Description:abc}],'Status':false,'Inactive':0},
{'ID':'03','Name':'Name 3','Description':'abc','Skills':[{Type:abc,Technical:abc,Description:abc}],'Status':false,'Inactive':1}]}
LINQ version:
string content = Console.ReadLine();
var matchResult = new Regex("(?<='Skills':).*?}]").Matches(content);
// Aggregate applies the replacement for every match; Select(...).FirstOrDefault() would only use the first match's value.
var jsonWithNormalizedSkillField = matchResult.Cast<Match>().Aggregate(content, (json, m) => json.Replace(m.Value, m.Value.Replace("'", string.Empty)));
Console.WriteLine(jsonWithNormalizedSkillField);
Console.ReadLine();
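
Both versions above find the matches first and then call string.Replace on the whole input. A single-pass alternative, sketched here under the assumption that the Skills array always ends at the first "}]", is Regex.Replace with a match evaluator. The class and method names are illustrative and not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SkillsQuoteStripper
{
    // Same idea as the pattern above: everything after 'Skills': up to the first "}]".
    private static readonly Regex SkillsValue = new Regex(@"(?<='Skills':).*?\}\]");

    public static string StripSkillsQuotes(string json)
    {
        // The evaluator runs once per match and only touches the matched text.
        return SkillsValue.Replace(json, m => m.Value.Replace("'", string.Empty));
    }

    public static void Main()
    {
        string input = "{'ID':'01','Skills':[{'Type':'abc','Technical':'abc'}],'Status':false}";
        Console.WriteLine(StripSkillsQuotes(input));
    }
}
```

Because the evaluator rewrites each match in place, quotes outside the Skills field are never touched, even if the same text happens to appear elsewhere in the input.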

Related

Identify the string that does not exist in another string using regex and C#

I am trying to capture the part of a string that is not contained in another string.
string searchedString = " This is my search string";
string subsetofSearchedString = "This is my";
My output should be "Search string". I would like to use only regex so that I can handle complex strings.
Below is the code I have tried so far, without success.
Match match = new Regex(subsetofSearchedString ).Match(searchedString );
if (!string.IsNullOrWhiteSpace(match.Value))
{
UnmatchedString= UnmatchedString.Replace(match.Value, string.Empty);
}
Update: the above code does not work for the texts below.
text1 = 'Property Damage (2015 ACURA)' Exposure Added Automatically for IP:Claimant DriverLoss Reserve Line :Property DamageReserve Amount $ : STATIP Role(s): Owner, DriverExposure Owner :Jaimee Watson_csr Author:
text2 = 'Property Damage (2015 ACURA)' Exposure Added Automatically for IP:Claimant DriverLoss Reserve Line :Property DamageReserve Amount $ : STATIP Role(s): Owner, Driver
Match match = new Regex(text2).Match(text1);
You can use Regex.Split:
var ans = Regex.Split(searchedString, subsetofSearchedString);
If you want the answer as a single string minus the subset, you can join it:
var ansjoined = String.Join("", ans);
Replacing with String.Empty will also work:
var ans = Regex.Replace(searchedString, subsetofSearchedString, String.Empty);
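
Both calls above treat the subset as a regex pattern, so characters such as ( ) and $ change its meaning. A minimal sketch, assuming the subset should be matched as literal text, escapes it with Regex.Escape before use. The class and method names are hypothetical, and the sample strings are shortened versions of the texts in the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LiteralRemoval
{
    public static string RemoveLiteral(string source, string literal)
    {
        // Regex.Escape converts metacharacters like ( ) $ into literal matches.
        return Regex.Replace(source, Regex.Escape(literal), string.Empty);
    }

    public static void Main()
    {
        string text1 = "'Property Damage (2015 ACURA)' Reserve Amount $ : STAT Author:";
        string text2 = "'Property Damage (2015 ACURA)' Reserve Amount $ : STAT";
        Console.WriteLine(RemoveLiteral(text1, text2).Trim());
    }
}
```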
Answer:
Regex wasn't working for me because of the metacharacters in my strings, and Regex.Escape did not help with the comparison.
String.Contains worked like a charm here:
if (text1.Contains(text2))
{
status = TestResult.Pass;
text1= text1.Replace(text2, string.Empty);
}

Problems when a json string has extra " in it using JObject.Parse

The following string causes an error with this code:
data = await resposta.Content.ReadAsStringAsync();
dynamic j = JObject.Parse(data);
data contains the following string:
{"code": 100, "message": "The entity with the name "Esther Rea" its not in DB."}
How do I remove the " characters around Esther Rea?
As suggested, the correct solution would be to have whoever's returning this value escape the quotes. However, if that's really not an option, you can try to brute-force your way into escaping the double quotes yourself, assuming the return schema is always the same, using something like this:
var pattern = "(\"message\":\\s+\")(?<messageContent>(.*))(\"})";
var regex = Regex.Match(data, pattern);
var message = regex.Groups["messageContent"].Value;
if (!string.IsNullOrEmpty(message))
{
message = message.Replace("\"", "\\\"");
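// ${1}/${3} keep the group references unambiguous, and "$$" keeps any literal $ in the message from being read as a substitution.
var newData = Regex.Replace(data, pattern, "${1}" + message.Replace("$", "$$") + "${3}");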
var jObject = JObject.Parse(newData);
}
This extracts the actual message string and escapes all double quotes in it (message.Replace("\"", "\\\"");), so that parsing succeeds.
If you really want to remove the quotes instead of escaping them, you can do message = message.Replace("\"", "");

String.Split not working to get values

I have XML that is passed to me as a string variable. I want to get the values of specific tags from that XML. Following is the XML I have and what I'm trying to achieve:
<code>
string xmlData = @"
<HEADER>
<TYPE>AAA</TYPE>
<SUBTYPE>ANNUAL</SUBTYPE>
<TYPEID>12345</TYPEID>
<SUBTYPEID>56789</SUBTYPEID>
<ACTIVITY>C</ACTION>
</HEADER>";
var typeId = data.Split("<TYPEID>")[0]; //Requirement
var activity = data.Split("<ACTIVITY>")[0]; //Requirement
</code>
I know string.Split() doesn't work here, as it accepts only a single character. The alternative is regex, which seems a bit intimidating to me. I have tried to work with it but am not getting the desired result. Can someone help with the regex code?
You should use XML parsing to get the values, but since you are trying to split a string on a string rather than on a char, you can use:
string typeId = xmlData.Split(new string[] { "<TYPEID>" }, StringSplitOptions.None)[1];
string typeIdVal = typeId.Split(new string[] { "</TYPEID>" }, StringSplitOptions.None)[0];
XML parsing looks much neater:
XmlDocument xmlDoc= new XmlDocument();
xmlDoc.Load("yourXMLFile.xml");
XmlNodeList XTypeID = xmlDoc.GetElementsByTagName("TYPEID");
string TypeID = XTypeID[0].InnerText;
You can also use Substring:
string typeidsubstr = xmlData.Substring(xmlData.IndexOf("<TYPEID>") + 8, xmlData.IndexOf("</TYPEID>") - (xmlData.IndexOf("<TYPEID>") + 8));
I used +8 because the length of <TYPEID> is 8; you can also use "<TYPEID>".Length instead of hard-coding it.
You can use LINQ to XML objects to parse these.
NB: There is a typo in the ACTIVITY element, the closing tag should be /ACTIVITY, not /ACTION! (I've corrected below)
string xmlData = @"<HEADER>
<TYPE>AAA</TYPE>
<SUBTYPE>ANNUAL</SUBTYPE>
<TYPEID>12345</TYPEID>
<SUBTYPEID>56789</SUBTYPEID>
<ACTIVITY>C</ACTIVITY>
</HEADER>";
var doc = XDocument.Parse(xmlData);
var typeId = doc.Root.Elements("TYPEID").First().Value;
var activity = doc.Root.Elements("ACTIVITY").First().Value;
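
Elements(...).First() throws if the tag is missing. A small sketch of a more forgiving lookup, assuming the wanted tags are direct children of the root element, could look like this. HeaderReader and GetValue are hypothetical names, not part of the answers above.

```csharp
using System;
using System.Xml.Linq;

public static class HeaderReader
{
    // Returns the text of the first child element with the given name, or null if it is absent.
    public static string GetValue(string xml, string tagName)
    {
        XDocument doc = XDocument.Parse(xml);
        XElement element = doc.Root.Element(tagName);
        return element == null ? null : element.Value;
    }

    public static void Main()
    {
        string xmlData = "<HEADER><TYPEID>12345</TYPEID><ACTIVITY>C</ACTIVITY></HEADER>";
        Console.WriteLine(GetValue(xmlData, "TYPEID"));
        Console.WriteLine(GetValue(xmlData, "ACTIVITY"));
    }
}
```

Note that XDocument.Parse still throws an XmlException on malformed input such as the mismatched ACTIVITY/ACTION tag in the question, so the XML has to be well-formed first.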

retain the newline in a regex Match, c#

So, I've created the following regex, which captures everything I need from my string:
const string tag = ":59";
var result = Regex.Split(message, String.Format(":{0}[^:]?:*[^:]*", tag),RegexOptions.Multiline);
The string follows this pattern:
:59A:/sometext\n
somemore text\n
:71A:somemore text
I'm trying to capture everything between :59A: and :71A:. This isn't set in stone though, as :71A: could be something else, which is why I was using [^:].
EDIT
So, just to be clear on my requirements: I have a file (string) which is passed into a C# method, which should return only the values for the tag specified in the parameter. For instance, if the file (string) contains the following tags:
:20:
:21:
:59A:
:71A:
and I pass in 59, then I only need to return everything between the start of tag :59A: and the start of the next tag, which in this instance is :71A:, but could be something else.
You can use the following code to match what you need:
string input = ":59A:/sometext\nsomemore text\n:71A:somemore text";
string pattern = "(?<=:[^:]+:)[^:]+\n";
var m = Regex.Match(input, pattern, RegexOptions.Singleline).Value;
If you want to use your tag constant, you can use this code:
const string tag = ":59";
string input = ":59A:/sometext\nsomemore text\n:71A:somemore text";
string pattern = String.Format("(?<={0}[^:]*:)[^:]+\n", tag);
var m = Regex.Match(input, pattern, RegexOptions.Singleline).Value;
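
The patterns above stop at the first colon, so they would cut off content that itself contains a colon. A hedged sketch of the method described in the question follows. It assumes every tag sits at the start of a line and looks like a colon, two digits, an optional uppercase letter, and a colon; that format is an assumption, not something stated in the question. TagBlockReader and GetTagContent are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagBlockReader
{
    // tag is the numeric part only, e.g. "59" for ":59A:".
    public static string GetTagContent(string message, string tag)
    {
        // Lazily capture everything up to the next tag at the start of a line, or the end of the input.
        string pattern = "^:" + Regex.Escape(tag) + "[A-Z]?:(.*?)(?=^:[0-9]{2}[A-Z]?:|\\z)";
        Match match = Regex.Match(message, pattern, RegexOptions.Multiline | RegexOptions.Singleline);
        return match.Success ? match.Groups[1].Value : string.Empty;
    }

    public static void Main()
    {
        string message = ":20:REF\n:59A:/sometext\nsomemore text\n:71A:somemore text";
        Console.Write(GetTagContent(message, "59"));
    }
}
```

Singleline lets the dot cross line breaks, so the newlines inside the block are retained in the returned value.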

how to normalise json from javascript in c#

Hello, I have the following JavaScript. I am sending the array r to my C# page in JSON format:
var r=['maths','computer','physics']
$.post("Global.aspx", { opt: "postpost", post: w.val(),tags:JSON.stringify(r)
}, function (d) {
});
But in C# I am getting this type of string:
["Maths""Computer""Physics"]
I want only the words maths, computer, physics, without the [ and " signs. Please help me out.
I have the following C# code:
string[] _tags = Request.Form["tags"].ToString().Split(',');
string asd="";
foreach (string ad in _tags) {
asd += ad;
}
You're looking for JSON deserialization:
List<string> list = new JavaScriptSerializer().Deserialize<List<string>>(Request.Form["tags"]);
As pointed out, you've split your string on the , character leaving you with an array of:
[0] = "[\"Maths\""
[1] = "\"Computer\""
[2] = "\"Physics\"]"
Because JSON is a data format, those square brackets actually have functional meaning. They're not just useless extra characters. As such, you need to parse the data into a format you can actually work with.
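
JavaScriptSerializer lives in System.Web.Extensions, which is only available on .NET Framework. As a sketch for newer runtimes, assuming .NET Core 3.0 or later where System.Text.Json is available, the same deserialization could look like this. TagParser and ParseTags are hypothetical names, and the literal stands in for Request.Form["tags"].

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class TagParser
{
    // Parses a JSON array of strings, e.g. the value posted as "tags".
    public static List<string> ParseTags(string json)
    {
        return JsonSerializer.Deserialize<List<string>>(json);
    }

    public static void Main()
    {
        List<string> tags = ParseTags("[\"maths\",\"computer\",\"physics\"]");
        Console.WriteLine(string.Join(",", tags));
    }
}
```

Once the value is a real list, string.Join produces the plain comma-separated words without any manual stripping of brackets or quotes.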
