How to compare and convert emoji characters in C#

I am trying to figure out how to check if a string contains a specific emoji. For example, look at the following two emoji:
Bicyclist: http://unicode.org/emoji/charts/full-emoji-list.html#1f6b4
US Flag: http://unicode.org/emoji/charts/full-emoji-list.html#1f1fa_1f1f8
Bicyclist is U+1F6B4, and the US flag is U+1F1FA U+1F1F8.
However, the emoji to check for are provided to me in an array like this, with just the numerical value in strings:
var checkFor = new string[] {"1F6B4","1F1FA-1F1F8"};
How can I convert those array values into actual unicode characters and check to see if a string contains them?
I can get something working for the Bicyclist, but for the US flag I'm stumped.
For the Bicyclist, I'm doing the following:
const string comparisonStr = "..."; //some string containing text and emoji
var hexVal = Convert.ToInt32(checkFor[0], 16);
var strVal = Char.ConvertFromUtf32(hexVal);
//now I can successfully do the following check
var exists = comparisonStr.Contains(strVal);
But this will not work with the US Flag because of the multiple code points.

You already got past the hard part. All you were missing is parsing the value in the array and combining the two Unicode code points before performing the check.
Here is a sample program that should work:
// Requires: using System; using System.Linq;
static void Main(string[] args)
{
    const string comparisonStr = "bicyclist: \U0001F6B4, and US flag: \U0001F1FA\U0001F1F8"; // some string containing text and emoji
    var checkFor = new string[] { "1F6B4", "1F1FA-1F1F8" };
    foreach (var searchStringInHex in checkFor)
    {
        // Convert each hex code point to its UTF-16 form and concatenate the results.
        string searchString = string.Join(string.Empty, searchStringInHex.Split('-')
            .Select(hex => char.ConvertFromUtf32(Convert.ToInt32(hex, 16))));
        if (comparisonStr.Contains(searchString))
        {
            Console.WriteLine($"Found {searchStringInHex}!");
        }
    }
}
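The conversion step in the sample above can also be pulled out into a small helper. This is only a minimal sketch: the names EmojiSearch, FromHexSequence and ContainsSequence are made up for illustration, and it assumes every entry is a list of hexadecimal code points separated by '-', as in the question's array.

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class EmojiSearch
{
    // Turns "1F1FA-1F1F8" into the string made of U+1F1FA followed by U+1F1F8.
    public static string FromHexSequence(string hexSequence)
    {
        return string.Concat(
            hexSequence.Split('-')
                .Select(hex => char.ConvertFromUtf32(
                    int.Parse(hex, NumberStyles.HexNumber, CultureInfo.InvariantCulture))));
    }

    // Ordinal search: compares UTF-16 code units exactly, with no culture rules.
    public static bool ContainsSequence(string text, string hexSequence)
    {
        return text.IndexOf(FromHexSequence(hexSequence), StringComparison.Ordinal) >= 0;
    }
}
```

With this helper, EmojiSearch.ContainsSequence(comparisonStr, "1F1FA-1F1F8") plays the role of the Contains call in the loop. Note that a plain substring search also reports a single regional indicator such as "1F1FA" as present whenever the full flag sequence is in the text.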

Related

better ways to combine two array items into one string

Hello, please could you suggest better ways of writing this C# code?
Basically, when NumberList has missing values between the '-' characters, I am trying to rebuild the string with default values.
The final result should be "123-10-45-9-09".
As you can see, the value of "second-10" fills in the second item in the string.
10, 9 and 09 are filled in from the values in valuestring.
This is the bad string, which is missing some values.
string NumberList = "123--45--";
I have stored this string value in my app.config file.
string valuestring = "first-12,second-10,third-99,fourth-9,fifth-09";
protected string MissingNumberString(string Number)
{
string NumberList = "123--45--";
string valuestring = "first-12,second-10,third-99,fourth-9,fifth-09";
var companyAccountList = valuestring.Split(new char[] { ',' }, StringSplitOptions.RemoveEmptyEntries);
var result = NumberList.Split('-');
int counter = 0;
var builder = new System.Text.StringBuilder();
foreach (string s in companyAccountList)
{
string t = s.Substring(s.IndexOf('-') + 1);
if (string.IsNullOrEmpty(result[counter]))
builder.Append(t).Append("-");
else
{
if (companyAccountList.Length == counter)
builder.Append(result[counter]);
else
builder.Append(result[counter]).Append("-");
}
counter++;
}
return builder.ToString();
}
One way to achieve this (assuming valuestring is in order and does not miss any defaults) would be
string MissingNumber(string Number)
{
string valuestring = "first-12,second-10,third-99,fourth-9,fifth-09";
var regex = Regex.Matches(valuestring,@"(?<=-)(\d*)(?<=,)?");
var defaults = regex.Cast<Match>().Select(x=>x.Value).ToList();
var newArray = Number.Split('-').Select((x,index)=>string.IsNullOrEmpty(x)?defaults[index]:x);
return string.Join("-",newArray);
}
The code uses a regular expression to break up valuestring and read the default values.
Regex.Matches(valuestring,@"(?<=-)(\d*)(?<=,)?");
The regular expression uses a lookbehind, (?<=-), to match the run of digits that follows each "-" character. The trailing (?<=,)? is an optional zero-width assertion, so it does not change what is matched.
Once the defaults are parsed into a List (assuming that the positions are in order and do not miss any values), we loop through the input string (which has been split based on delimiter), check if it is Empty, and if so, use the value from the Defaults (based on our assumption, it should have same index).
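The same fill-in logic can be written without a regular expression. The following is a rough sketch under the same assumption (defaults are listed in positional order); the names NumberDefaults and FillMissing are hypothetical, and the defaults are read by taking whatever follows the first '-' in each comma-separated entry.

```csharp
using System;
using System.Linq;

public static class NumberDefaults
{
    // Replaces each empty position in numberList with the default at the same index.
    public static string FillMissing(string numberList, string valueString)
    {
        // "first-12,second-10" -> ["12", "10"]: keep what follows the first '-'.
        string[] defaults = valueString
            .Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(entry => entry.Substring(entry.IndexOf('-') + 1))
            .ToArray();

        string[] parts = numberList.Split('-');
        if (parts.Length > defaults.Length)
            throw new ArgumentException("More positions than default values.");

        return string.Join("-",
            parts.Select((part, index) => string.IsNullOrEmpty(part) ? defaults[index] : part));
    }
}
```

Passing both strings as parameters, rather than hard-coding them inside the method, keeps the helper usable when the defaults come from app.config.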
Update
Based on the comments, it looks like you have other data in the original string, and hence the relevant substring has to be captured first.
We could update the MissingNumber method as
static string MissingNumber(string Number)
{
string valuestring = "first-12,second-10,third-99,fourth-9,fifth-09";
var regexDefaultValues = Regex.Matches(valuestring,@"(?<=-)(\d*)(?<=,)?");
var defaults = regexDefaultValues.Cast<Match>().Select(x=>x.Value).ToList();
var regexNumberToParse = new Regex(@"(\d)*-(\d)*-(\d)*-(\d)*-(\d)*");
var capturedNumberFormat = regexNumberToParse.Match(Number).Value;
var newArray = capturedNumberFormat.Split('-').Select((x,index)=>string.IsNullOrEmpty(x)?defaults[index]:x);
var ValueWithDefaults = string.Join("-",newArray);
return regexNumberToParse.Replace(Number,ValueWithDefaults);
}

How to read between a specified character in a string?

I was trying to create a list from a user input with something like this:
Create newlist: word1, word2, word3, etc...,
but how do I get those words one by one only by using commas as references going through them (in order) and placing them into an Array etc? Example:
string Input = Console.ReadLine();
if (Input.Contains("Create new list:"))
{
foreach (char character in Input)
{
if (character == ',')//when it reach a comma
{
//code goes here, where I got stuck...
}
}
}
Edit: I didn't know about the existence of "Split", my mistake... but it would be great if you could explain how to use it for the problem above.
You can use this:
String words = "word1, word2, word3";
List:
List<string> wordsList= words.Split(',').ToList<string>();
Array:
string[] namesArray = words.Split(',');
@Patrick Artner beat me to it, but you can just split the input with the comma as the argument, or whatever you want the delimiter to be.
Here is an example; you can learn more from the documentation.
using System;
public class Example {
public static void Main() {
String value = "This is a short string.";
Char delimiter = 's';
String[] substrings = value.Split(delimiter);
foreach (var substring in substrings)
Console.WriteLine(substring);
}
}
The example displays the following output:
Thi
i
a
hort
tring.
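Applied to the question's input, Split can be combined with trimming so that the space after each comma does not end up in the words. This is a minimal sketch: the class name ListCommand and the method ParseWords are invented for illustration, and it assumes the command prefix is exactly "Create new list:" as in the question's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ListCommand
{
    private const string Prefix = "Create new list:";

    // Returns the comma-separated words after the prefix, or an empty list
    // when the input does not start with the prefix.
    public static List<string> ParseWords(string input)
    {
        if (input == null || !input.StartsWith(Prefix, StringComparison.OrdinalIgnoreCase))
            return new List<string>();

        return input.Substring(Prefix.Length)
            .Split(',')
            .Select(word => word.Trim())      // drop the space after each comma
            .Where(word => word.Length > 0)   // ignore a trailing comma
            .ToList();
    }
}
```

Calling ListCommand.ParseWords(Console.ReadLine()) would then give the words in order, ready to be stored in a list or converted with ToArray().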

C# String Length returning the wrong amount

I have a string of 13 characters. 8C4B99823CB9C.
I am assigning it to a string.
string serialNumber = "‭8C4B99823CB9C‬";
This string then enters a method
GenerateCode(proxy, serialNumber, product);
Inside this method I have this line...
codeSerial = serial_no.Substring(serial_no.Length - codeSerialLength, codeSerialLength);
In the watch this is showing the length as 15.
Here is the full code
[TestMethod]
public void CanGenerateCodeNumberWithPrefixWithHEX()
{
string serialNumber = "‭8C4B99823CB9C‬";
Entity product = new Entity();
product.Attributes["piv_codeseriallength"] = 8;
product.Attributes["piv_codeprefix"] = "G";
string result = GenerateCode(proxy, serialNumber, product);
string expected = "G9823CB9C";
Assert.AreEqual(expected, result, "The Code was not generated correctly");
}
public static string GenerateCode(IOrganizationService _service, string serial_no, Entity product)
{
string codeSerial = null;
//Serial Length
if (product.Attributes.ContainsKey("piv_codeseriallength"))
{
codeSerial = serial_no;
int codeSerialLength = product.GetAttributeValue<int>("piv_codeseriallength");
codeSerial = serial_no.Substring(serial_no.Length - codeSerialLength, codeSerialLength);
string prefix = product.Attributes.ContainsKey("piv_codeprefix") ? product.GetAttributeValue<string>("piv_codeprefix") : "";
codeSerial = prefix + codeSerial;
}
return codeSerial;
}
This unit test fails because it thinks the string is 15 characters long and so takes the wrong section of the string.
You have hidden Unicode characters in your string. One good way to find out is to copy and paste the full string into a text editor, then try to move the caret left and right along the string. You'll see that you need to press left or right twice around the quotes, meaning that there are more characters than meet the eye. Of course, another way would be simply to open the string in a hexadecimal editor.
Assuming you only need simple characters, you can sanitize your input with a regex, to strip the extra characters:
var sanitizedInput = Regex.Replace(input, @"[^\w:/ ]", string.Empty);
In the debugger you can watch serialNumber.ToArray(). You will notice that there is char 8237 (U+202D, LEFT-TO-RIGHT OVERRIDE) at the beginning of the string and char 8236 (U+202C, POP DIRECTIONAL FORMATTING) at the end.
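As an alternative to the regex, the invisible characters can be removed by Unicode category. The sketch below is one possible approach, not the only one; HiddenCharacters and StripFormatCharacters are hypothetical names. It drops every char in the "Format" category, which covers chars 8237 and 8236.

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class HiddenCharacters
{
    // Removes characters in the Unicode "Format" category (Cf), which includes
    // U+202D (char 8237) and U+202C (char 8236).
    public static string StripFormatCharacters(string input)
    {
        return new string(input
            .Where(c => char.GetUnicodeCategory(c) != UnicodeCategory.Format)
            .ToArray());
    }
}
```

This suits identifiers such as serial numbers. It is less suitable for free text, because the Format category also contains characters like the zero-width joiner that some text relies on.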

Parsing string with 3 hyphens C#

I have the string COO70-123456789-12345-1. I need to parse it based on the "-", not on the length of the substrings, and use the parsed values. I have tried using regular expressions but am having issues. Please suggest.
Also, after I have split the values I need to use each value: string A = COO70, int B = 123456789, int C = 12345, short D = 1. How do I get them into different variables A, B, C, D?
string[] results = UniqueId.Split('-');
string A = results[0];
string B = results[1];
string C = results[2];
int k_id = Convert.ToInt32(C);
string D = results[3];
short seq = Convert.ToInt16(D);
string s = "COO70-123456789-12345-1";
string[] split = s.Split('-'); //=> {"COO70", "123456789", "12345", "1"}
Use IndexOf
To find everything before the first hyphen use:
string original= "COO70-123456789-12345-1";
string toFirstHyphen=original.Substring(0,original.IndexOf("-"));
Or if you want every section use split like the above example.
You can verify whether the input is formatted as you want and then split to get the parts.
using System;
using System.Text.RegularExpressions;
class Program
{
static void Main()
{
string pattern = @"(?x)^(\w+-\w+-\w+-\w+)$";
Regex reg = new Regex(pattern);
string test = "word-6798-3401-001";
if((reg.Match(test).Success))
foreach (var x in test.Split(new char[] {'-'}))
Console.WriteLine(x);
}
}
So it sounds like you first want to split it, but then store it into your values.
I would do something like this:
var myString = "COO70-123456789-12345-1";
var stringSet = myString.Split('-'); // This returns an array of values.
Now we need to verify that we receive only 4 sub strings:
if (stringSet.Length != 4) // arrays expose Length, not Count
    throw new FormatException("Expected four hyphen-separated values.");
From here we need to know what order our strings should be in and assign them:
var A = stringSet[0];
var B = stringSet[1];
var C = stringSet[2];
var D = stringSet[3];
While this should answer your question as posed, I would personally recommend working with stringSet differently.
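To get the typed values the question asks for (string, int, int, short), the split parts can be converted with TryParse. This is a sketch only: UniqueIdParser and its TryParse method are made-up names, and it assumes the three numeric parts contain digits only.

```csharp
using System;
using System.Globalization;

public static class UniqueIdParser
{
    // Splits "COO70-123456789-12345-1" into its four typed parts.
    // Returns false instead of throwing when the input does not fit;
    // the out values are only meaningful when the result is true.
    public static bool TryParse(string uniqueId, out string a, out int b, out int c, out short d)
    {
        a = null; b = 0; c = 0; d = 0;
        if (uniqueId == null) return false;

        string[] parts = uniqueId.Split('-');
        if (parts.Length != 4) return false;

        a = parts[0];
        // NumberStyles.None accepts digits only (no sign, spaces or separators).
        return int.TryParse(parts[1], NumberStyles.None, CultureInfo.InvariantCulture, out b)
            && int.TryParse(parts[2], NumberStyles.None, CultureInfo.InvariantCulture, out c)
            && short.TryParse(parts[3], NumberStyles.None, CultureInfo.InvariantCulture, out d);
    }
}
```

Returning a bool keeps malformed input from raising exceptions in the caller. Convert.ToInt32 and Convert.ToInt16 would work too, but they throw on bad input.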

Extracting data from plain text string

I am trying to process a report from a system which gives me the following code
000=[GEN] OK {Q=1 M=1 B=002 I=3e5e65656-e5dd-45678-b785-a05656569e}
I need to extract the values between the curly brackets {} and save them in to variables. I assume I will need to do this using regex or similar? I've really no idea where to start!! I'm using c# asp.net 4.
I need the following variables
param1 = 000
param2 = GEN
param3 = OK
param4 = 1 //Q
param5 = 1 //M
param6 = 002 //B
param7 = 3e5e65656-e5dd-45678-b785-a05656569e //I
I will name the params based on what they actually mean. Can anyone please help me here? I have tried to split based on spaces, but I get the other garbage with it!
Thanks for any pointers/help!
If the format is pretty constant, you can use .NET string processing methods to pull out the values, something along the lines of
string line =
"000=[GEN] OK {Q=1 M=1 B=002 I=3e5e65656-e5dd-45678-b785-a05656569e}";
int start = line.IndexOf('{');
int end = line.IndexOf('}');
string variablePart = line.Substring(start + 1, end - start - 1); // length excludes both braces
string[] variables = variablePart.Split(' ');
foreach (string variable in variables)
{
string[] parts = variable.Split('=');
// parts[0] holds the variable name, parts[1] holds the value
}
Wrote this off the top of my head, so there may be an off-by-one error somewhere. Also, it would be advisable to add error checking e.g. to make sure the input string has both a { and a }.
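A version of the same idea with the suggested error checking might look like the following. It is a sketch rather than a definitive implementation: BraceSection and Parse are hypothetical names, and it assumes the values between the braces contain no spaces.

```csharp
using System;
using System.Collections.Generic;

public static class BraceSection
{
    // Reads the "name=value" pairs between the first '{' and the next '}'.
    public static Dictionary<string, string> Parse(string line)
    {
        int start = line.IndexOf('{');
        int end = start < 0 ? -1 : line.IndexOf('}', start + 1);
        if (start < 0 || end < 0)
            throw new FormatException("Expected a '{...}' section.");

        var values = new Dictionary<string, string>();
        string inner = line.Substring(start + 1, end - start - 1);
        foreach (string pair in inner.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
        {
            // Split on the first '=' only, so a value may itself contain '='.
            string[] parts = pair.Split(new[] { '=' }, 2);
            if (parts.Length != 2)
                throw new FormatException("Expected name=value but found: " + pair);
            values[parts[0]] = parts[1];
        }
        return values;
    }
}
```

The result can then be read by name, for example values["Q"] or values["I"], instead of relying on the position of each variable.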
I would suggest a regular expression for this type of work.
var objRegex = new System.Text.RegularExpressions.Regex(@"^(\d+)=\[([A-Z]+)\] ([A-Z]+) \{Q=(\d+) M=(\d+) B=(\d+) I=([a-z0-9\-]+)\}$");
var objMatch = objRegex.Match("000=[GEN] OK {Q=1 M=1 B=002 I=3e5e65656-e5dd-45678-b785-a05656569e}");
if (objMatch.Success)
{
Console.WriteLine(objMatch.Groups[1].ToString());
Console.WriteLine(objMatch.Groups[2].ToString());
Console.WriteLine(objMatch.Groups[3].ToString());
Console.WriteLine(objMatch.Groups[4].ToString());
Console.WriteLine(objMatch.Groups[5].ToString());
Console.WriteLine(objMatch.Groups[6].ToString());
Console.WriteLine(objMatch.Groups[7].ToString());
}
I've just tested this out and it works well for me.
Use a regular expression.
Quick and dirty attempt:
(?<ID1>[0-9]*)=\[(?<GEN>[a-zA-Z]*)\] OK {Q=(?<Q>[0-9]*) M=(?<M>[0-9]*) B=(?<B>[0-9]*) I=(?<I>[a-zA-Z0-9\-]*)}
This will generate named groups called ID1, GEN, Q, M, B and I.
Check out the MSDN docs for details on using Regular Expressions in C#.
You can use Regex Hero for quick C# regex testing.
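For completeness, here is how the named groups could be read from C#. This is a sketch based on the quick-and-dirty pattern above, with a few changes that are my own assumptions: anchors are added, the braces are escaped, "+" replaces "*", and the status word is captured in an extra group called STATUS instead of being hard-coded as "OK". The class name ReportCode and the method Get are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ReportCode
{
    // Named groups: ID1, GEN, STATUS, Q, M, B and I.
    private static readonly Regex Pattern = new Regex(
        @"^(?<ID1>[0-9]+)=\[(?<GEN>[a-zA-Z]+)\] (?<STATUS>[A-Z]+) " +
        @"\{Q=(?<Q>[0-9]+) M=(?<M>[0-9]+) B=(?<B>[0-9]+) I=(?<I>[a-zA-Z0-9\-]+)\}$");

    // Returns the value of one named group, or null when the line does not match.
    public static string Get(string line, string groupName)
    {
        Match match = Pattern.Match(line);
        return match.Success ? match.Groups[groupName].Value : null;
    }
}
```

Named groups make the calling code easier to follow than numeric indexes, since ReportCode.Get(line, "B") says which value is being read.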
You can use String.Split
string[] parts = s.Split(new string[] {"=[", "] ", " {Q=", " M=", " B=", " I=", "}"},
StringSplitOptions.None);
This solution breaks up your report code into segments and stores the desired values into an array.
The regular expression matches one report code segment at a time and stores the appropriate values in the "Parsed Report Code Array".
As your example implied, the first two code segments are treated differently than the ones after that. I made the assumption that it is always the first two segments that are processed differently.
private static string[] ParseReportCode(string reportCode) {
const int FIRST_VALUE_ONLY_SEGMENT = 3;
const int GRP_SEGMENT_NAME = 1;
const int GRP_SEGMENT_VALUE = 2;
Regex reportCodeSegmentPattern = new Regex(@"\s*([^\}\{=\s]+)(?:=\[?([^\s\]\}]+)\]?)?");
Match matchReportCodeSegment = reportCodeSegmentPattern.Match(reportCode);
List<string> parsedCodeSegmentElements = new List<string>();
int segmentCount = 0;
while (matchReportCodeSegment.Success) {
if (++segmentCount < FIRST_VALUE_ONLY_SEGMENT) {
string segmentName = matchReportCodeSegment.Groups[GRP_SEGMENT_NAME].Value;
parsedCodeSegmentElements.Add(segmentName);
}
string segmentValue = matchReportCodeSegment.Groups[GRP_SEGMENT_VALUE].Value;
if (segmentValue.Length > 0) parsedCodeSegmentElements.Add(segmentValue);
matchReportCodeSegment = matchReportCodeSegment.NextMatch();
}
return parsedCodeSegmentElements.ToArray();
}
