order by culture is not working as expected - c#

Why does "Ū" come before "U"?
CultureInfo ci = CultureInfo.GetCultureInfo("lt-LT");
bool ignoreCase = true; // true = compare case-insensitively
StringComparer comp = StringComparer.Create(ci, ignoreCase);
string[] unordered = { "Za", "Žb", "Ūa", "Ub" };
var ordered = unordered.OrderBy(s => s, comp);
Results of ordered :
Ūa
Ub
Za
Žb
Should be: Ub Ūa Za Žb
Here are the Lithuanian letters in order: https://www.assorti.lt/userfiles/uploader/no/norvegiska-lietuviska-delione-abecele-maxi-3-7-m-vaikams-larsen.jpg

I just made what could be a (limited) solution to your problem.
This is not optimized, but it can give an idea of how to solve it.
I created a LithuanianString class whose only job is to wrap your string.
This class implements IComparable<LithuanianString>, so a list of LithuanianString can be sorted.
Here is what could be the class:
public class LithuanianString : IComparable<LithuanianString>
{
const string UpperAlphabet = "AĄBCČDEĘĖFGHIĮYJKLMNOPRSŠTUŲŪVZŽ";
const string LowerAlphabet = "aąbcčdeęėfghiįyjklmnoprsštuųūvzž";
public string String;
public LithuanianString(string inputString)
{
this.String = inputString;
}
public int CompareTo(LithuanianString other)
{
var maxIndex = this.String.Length <= other.String.Length ? this.String.Length : other.String.Length;
for (var i = 0; i < maxIndex; i++)
{
//We make the method non case sensitive
var indexOfThis = LowerAlphabet.Contains(this.String[i])
? LowerAlphabet.IndexOf(this.String[i])
: UpperAlphabet.IndexOf(this.String[i]);
var indexOfOther = LowerAlphabet.Contains(other.String[i])
? LowerAlphabet.IndexOf(other.String[i])
: UpperAlphabet.IndexOf(other.String[i]);
if (indexOfOther != indexOfThis)
return indexOfThis - indexOfOther;
}
return this.String.Length - other.String.Length;
}
}
And here is a sample I made to try it :
static void Main(string[] args)
{
string[] unordered = { "Za", "Žb", "Ūa", "Ub" };
//Create a list of lithuanian string from your array
var lithuanianStringList = (from unorderedString in unordered
select new LithuanianString(unorderedString)).ToList();
//Sort it
lithuanianStringList.Sort();
//Display it
Console.WriteLine(Environment.NewLine + "My Comparison");
lithuanianStringList.ForEach(c => Console.WriteLine(c.String));
}
The output is the expected one :
Ub Ūa Za Žb
This class lets you define a custom alphabet just by replacing the characters in the two constants defined at the beginning.
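If wrapping every string is inconvenient, the same alphabet-position idea can be sketched as an IComparer<string>, which plugs straight into OrderBy. AlphabetComparer and AlphabetComparerDemo are hypothetical names introduced here for illustration; they are not part of the answer above. Like the class above, the sketch assumes characters outside the alphabet sort before every known letter.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: compares strings by the position of each character
// in a caller-supplied alphabet, ignoring case.
public sealed class AlphabetComparer : IComparer<string>
{
    private readonly Dictionary<char, int> _rank = new Dictionary<char, int>();

    public AlphabetComparer(string upperAlphabet, string lowerAlphabet)
    {
        if (upperAlphabet.Length != lowerAlphabet.Length)
            throw new ArgumentException("Both alphabets must have the same length.");
        for (int i = 0; i < upperAlphabet.Length; i++)
        {
            // Upper and lower case share one rank, so the comparison ignores case.
            _rank[upperAlphabet[i]] = i;
            _rank[lowerAlphabet[i]] = i;
        }
    }

    // Unknown characters get rank -1, the value IndexOf would return.
    private int Rank(char c) => _rank.TryGetValue(c, out int r) ? r : -1;

    public int Compare(string x, string y)
    {
        if (ReferenceEquals(x, y)) return 0;
        if (x == null) return -1;
        if (y == null) return 1;
        int common = Math.Min(x.Length, y.Length);
        for (int i = 0; i < common; i++)
        {
            int diff = Rank(x[i]) - Rank(y[i]);
            if (diff != 0) return diff;
        }
        // Equal up to the shorter length: the shorter string sorts first.
        return x.Length - y.Length;
    }
}

public static class AlphabetComparerDemo
{
    public static string[] SortLithuanian(IEnumerable<string> words)
    {
        var comparer = new AlphabetComparer(
            "AĄBCČDEĘĖFGHIĮYJKLMNOPRSŠTUŲŪVZŽ",
            "aąbcčdeęėfghiįyjklmnoprsštuųūvzž");
        return words.OrderBy(w => w, comparer).ToArray();
    }
}
```

Calling AlphabetComparerDemo.SortLithuanian(unordered) on the array from the question should yield Ub, Ūa, Za, Žb without creating wrapper objects.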

Related

If a string contains a double character change it to another string C#

I created a Person Model:
Model Person = new Model();
Person.LastName = values[0];
[LastName is a string]
I would like to replace values[0] (for example "Anna") with another string such as "Frank" if it contains a double character. In other words: if "Anna" contains a double character, change the value to another string.
How can I do this?
Write a helper function to test for consecutive equal characters:
private static bool HasDoubleCharacter(string s)
{
char? previous = null;
foreach (char ch in s) {
if (ch == previous) {
return true;
}
previous = ch;
}
return false;
}
Then you can write
Model Person = new Model();
string name = values[0];
if (HasDoubleCharacter(name)) {
name = "Frank";
}
Person.LastName = name;
You could also create a new array containing only names with no double character and use that one instead:
Model Person = new Model();
string[] names = values
.Where(v => !HasDoubleCharacter(v))
.ToArray();
if (names.Length > 0) {
Person.LastName = names[0];
}
Same idea as in Michał Turczyn's post (a check for a double character), but implemented with a regular expression:
using System.Text.RegularExpressions;
...
public static bool HasDoubleChar(string value) =>
    value != null && Regex.IsMatch(value, @"(.)\1");
pattern (.)\1 explained:
(.) - group which contains an arbitrary character
\1 - the group #1 repeated again
Note, that we have some flexibility here: (\p{L})\1 will check for double letters only, not, say, spaces; (?i)(.)\1 will compare ignoring case (Aaron will be matched) etc.
then as usual
Person.LastName = HasDoubleChar(values[0])
? "Frank"
: values[0];
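The pattern variants mentioned above can be compared side by side. This is only a sketch: the class name DoubleCharPatterns and its three method names are made up for illustration, and only Regex.IsMatch comes from the framework.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class: one method per pattern variant.
public static class DoubleCharPatterns
{
    // Any character repeated, including spaces and digits (case-sensitive).
    public static bool AnyDouble(string value) =>
        value != null && Regex.IsMatch(value, @"(.)\1");

    // Only letters count; a double space does not match.
    public static bool DoubleLetter(string value) =>
        value != null && Regex.IsMatch(value, @"(\p{L})\1");

    // Case is ignored, so "Aa" counts as a double character.
    public static bool DoubleIgnoringCase(string value) =>
        value != null && Regex.IsMatch(value, @"(?i)(.)\1");
}
```

With these, "Anna" matches all three, "a  b" (two spaces) matches only AnyDouble, and "Aaron" matches only DoubleIgnoringCase.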
This could be achieved with a simple extension method for string:
public static class StringExtensions
{
    public static bool HasDoubleChar(this string @this)
    {
        ArgumentNullException.ThrowIfNull(@this);
        for (int i = 1; i < @this.Length; i++)
        {
            if (@this[i] == @this[i - 1])
            {
                return true;
            }
        }
        return false;
    }
}
and then you could use it:
Person.LastName = values[0].HasDoubleChar()
? "Frank"
: values[0];
public static string Do(string source, string replacement)
{
    // Do you care about case? E.g., Aaron. Lower-casing once makes "Aa" count as a double.
    string lower = source.ToLower();
    for (int i = 1; i < lower.Length; i++)
    {
        // A double character is two equal characters next to each other.
        if (lower[i] == lower[i - 1])
        {
            return replacement;
        }
    }
    return source;
}
public static void Main()
{
string thing = "Aaron";
thing = Do(thing, "Frank");
}

rearrange the characters of a string so that any two adjacent characters are not the same

How to rearrange the characters of a string so that any two adjacent characters are not the same? using c#
c#
Without using Hashmaps and Dictionary
I managed to find each element of the string, and the occurrence of each element.
This is what I've done so far
Using LINQ, you can gather the characters of the string, group identical characters together, then pivot the groups and join them back into a string.
First, some extension methods to make Join easier:
public static class IEnumerableExt {
public static string Join(this IEnumerable<char> chars) => String.Concat(chars); // faster >= .Net Core 2.1
public static string Join(this IEnumerable<string> strings) => String.Concat(strings);
}
Then, an extension method to pivot an IEnumerable<IEnumerable<T>>:
public static class IEnumerableIEnumerableExt {
public static IEnumerable<IEnumerable<T>> Pivot<T>(this IEnumerable<IEnumerable<T>> src) {
var enums = src.Select(ie => ie.GetEnumerator()).ToList();
var hasMores = Enumerable.Range(0, enums.Count).Select(n => true).ToList();
for (; ; ) {
var oneGroup = new List<T>();
var hasMore = false;
for (int j1 = 0; j1 < enums.Count; ++j1) {
if (hasMores[j1]) {
hasMores[j1] = enums[j1].MoveNext();
hasMore = hasMore || hasMores[j1];
if (hasMores[j1])
oneGroup.Add(enums[j1].Current);
}
}
if (!hasMore)
break;
yield return oneGroup;
}
for (int j1 = 0; j1 < enums.Count; ++j1)
enums[j1].Dispose();
}
}
Finally, use these to solve your problem:
var s = "How to rearrange the characters of a string";
var tryAns = s.OrderBy(c => c)
.GroupBy(ch => ch)
.Pivot()
.Select(gch => gch.Join())
.Join();
var dupRE = new Regex(@"(.)\1", RegexOptions.Compiled);
var hasDups = dupRE.IsMatch(tryAns);
// tryAns will be " Hacefghinorstw aceghnorst aeort aert ar r "
// hasDups will be false
If the resulting string has two adjacent identical characters, then hasDups will be true.
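The pivot approach still has to be checked with hasDups afterwards. As an alternative, here is a sketch of a different, well-known greedy technique (not the method of the answer above). It counts characters in a plain int array, so it needs no Dictionary, and it places the most frequent character on the even positions first. AdjacentRearranger and its members are hypothetical names. The sketch treats each UTF-16 code unit as one character, so surrogate pairs are not handled.

```csharp
using System;

// Hypothetical helper illustrating the "fill even positions, then odd positions" technique.
public static class AdjacentRearranger
{
    // Returns a rearrangement with no two equal neighbours, or null if none exists.
    public static string Rearrange(string input)
    {
        if (string.IsNullOrEmpty(input)) return input;

        // One counter per possible UTF-16 code unit: a plain array, no Dictionary.
        var counts = new int[char.MaxValue + 1];
        foreach (char c in input) counts[c]++;

        // Find the most frequent character.
        int most = 0;
        for (int c = 1; c < counts.Length; c++)
            if (counts[c] > counts[most]) most = c;

        int n = input.Length;
        // If one character fills more than half (rounded up), no valid order exists.
        if (counts[most] > (n + 1) / 2) return null;

        var result = new char[n];
        int pos = 0;

        // Pass 0 places the most frequent character, pass 1 places all others.
        // Positions run 0, 2, 4, ... and then wrap around to 1, 3, 5, ...
        for (int pass = 0; pass < 2; pass++)
        {
            for (int c = 0; c < counts.Length; c++)
            {
                bool isMost = c == most;
                if ((pass == 0) != isMost) continue;
                while (counts[c] > 0)
                {
                    result[pos] = (char)c;
                    counts[c]--;
                    pos += 2;
                    if (pos >= n) pos = 1;
                }
            }
        }
        return new string(result);
    }

    // Same check as the regex above, written as a loop.
    public static bool HasAdjacentDuplicates(string s)
    {
        for (int i = 1; i < s.Length; i++)
            if (s[i] == s[i - 1]) return true;
        return false;
    }
}
```

For example, Rearrange("aab") gives "aba", while Rearrange("aaab") returns null because three of the four characters are the same.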

Extract some values in formatted string

I would like to retrieve values in string formatted like this :
public var any:int = 0;
public var anyId:Number = 2;
public var theEnd:Vector.<uint>;
public var test:Boolean = false;
public var others1:Vector.<int>;
public var firstValue:CustomType;
public var field2:Boolean = false;
public var secondValue:String = "";
public var isWorks:Boolean = false;
I want to store field name, type and value in a custom class Property :
public class Property
{
public string Name { get; set; }
public string Type { get; set; }
public string Value { get; set; }
}
And with a Regex expression get these values.
How can I do ?
Thanks
EDIT : I tried this but I don't know how to go further with vectors..etc
/public var ([a-zA-Z0-9]*):([a-zA-Z0-9]*)( = \"?([a-zA-Z0-9]*)\"?)?;/g
Ok, posting my regex-based answer.
Your regex, /public var ([a-zA-Z0-9]*):([a-zA-Z0-9]*)( = \"?([a-zA-Z0-9]*)\"?)?;/g, contains regex delimiters. C# does not support them, so they are treated as literal characters. Remove the delimiters and the g modifier: to obtain multiple matches in C#, use Regex.Matches, or Regex.Match in a while loop with Match.Success and Match.NextMatch().
The regex I am using is (?<=\s*var\s*)(?<name>[^=:\n]+):(?<type>[^;=\n]+)(?:=(?<value>[^;\n]+))?. The \n is included in each negated character class because a negated class would otherwise match a newline character.
var str = "public var any:int = 0;\r\npublic var anyId:Number = 2;\r\npublic var theEnd:Vector.<uint>;\r\npublic var test:Boolean = false;\r\npublic var others1:Vector.<int>;\r\npublic var firstValue:CustomType;\r\npublic var field2:Boolean = false;\r\npublic var secondValue:String = \"\";\r\npublic var isWorks:Boolean = false;";
var rx = new Regex(@"(?<=\s*var\s*)(?<name>[^=:\n]+):(?<type>[^;=\n]+)(?:=(?<value>[^;\n]+))?");
var coll = rx.Matches(str);
var props = new List<Property>();
foreach (Match m in coll)
props.Add(new Property(m.Groups["name"].Value,m.Groups["type"].Value, m.Groups["value"].Value));
foreach (var item in props)
Console.WriteLine("Name = " + item.Name + ", Type = " + item.Type + ", Value = " + item.Value);
Or with LINQ:
var props = rx.Matches(str)
.OfType<Match>()
.Select(m =>
new Property(m.Groups["name"].Value,
m.Groups["type"].Value,
m.Groups["value"].Value))
.ToList();
And the class example:
public class Property
{
public string Name { get; set; }
public string Type { get; set; }
public string Value { get; set; }
public Property()
{}
public Property(string n, string t, string v)
{
this.Name = n;
this.Type = t;
this.Value = v;
}
}
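The alternative to Regex.Matches mentioned earlier, Regex.Match combined with Match.Success and Match.NextMatch(), could look roughly like this. The pattern here is a deliberately simplified one chosen for the sketch (it captures only the name and the type), and NextMatchDemo and FindDeclarations are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper: walks the matches one at a time instead of using Regex.Matches.
public static class NextMatchDemo
{
    // Collects every "name:type" pair found in the input.
    public static List<string> FindDeclarations(string input)
    {
        // Simplified pattern (an assumption for this sketch, not the answer's regex).
        var rx = new Regex(@"var\s+(?<name>\w+)\s*:\s*(?<type>[\w.<>]+)");
        var found = new List<string>();
        // Match returns the first match; NextMatch continues from where it ended.
        for (Match m = rx.Match(input); m.Success; m = m.NextMatch())
        {
            found.Add(m.Groups["name"].Value + ":" + m.Groups["type"].Value);
        }
        return found;
    }
}
```

Both styles visit the same matches. The loop form is mainly useful when you want to stop early without evaluating the rest of the input.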
NOTE ON PERFORMANCE:
The regex is not the quickest, but it certainly beats the one in the other answer. Here is a test performed at regexhero.net:
It seems, that you don't want regular expressions; in a simple case
as you've provided:
String text =
@"public var any:int = 0;
public var anyId:Number = 2;
public var theEnd:Vector.<uint>;
public var test:Boolean = false;
public var others1:Vector.<int>;
public var firstValue:CustomType;
public var field2:Boolean = false;";
List<Property> result = text
    .Split(new Char[] {'\r','\n'}, StringSplitOptions.RemoveEmptyEntries)
    .Select(line => {
        int varIndex = line.IndexOf("var") + "var".Length;
        int colonIndex = line.IndexOf(":") + ":".Length;
        int equalsIndex = line.IndexOf("=");
        // '=' can be absent; the type then runs to the end of the line
        int typeEnd = equalsIndex < 0 ? line.Length : equalsIndex;
        int valueStart = equalsIndex < 0 ? line.Length : equalsIndex + "=".Length;
        return new Property() {
            Name = line.Substring(varIndex, colonIndex - varIndex - 1).Trim(),
            Type = line.Substring(colonIndex, typeEnd - colonIndex).Trim(' ', ';'),
            Value = line.Substring(valueStart).Trim(' ', ';')
        };
    })
    .ToList();
If the text can contain comments and other stuff, e.g.
"public (*var is commented out*) var sample: int = 123;;;; // another comment"
you have to implement a parser.
You can use the following pattern:
\s*(?<vis>\w+?)\s+var\s+(?<name>\w+?)\s*:\s*(?<type>\S+?)(\s*=\s*(?<value>\S+?))?\s*;
to match each element in a line. Appending ? after a quantifier results in a non-greedy match which makes the pattern a lot simpler - no need to negate all unwanted classes.
Values are optional, so the value group is wrapped in another, optional group (\s*=\s*(?<value>\S+?))?
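Note that RegexOptions.Multiline only changes ^ and $ so that they match at the start and end of each line. This pattern contains neither anchor, so the option has no effect here. What keeps a name, type or value from running across lines is that \w and \S cannot match a newline (\s can).
The C# 6 interpolated string in the following example isn't required, but together with the verbatim (multiline) string literal it makes for much cleaner code.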
var input = @"public var any:int = 0;
public var anyId:Number = 2;
public var theEnd:Vector.<uint>;
public var test:Boolean = false;
public var others1:Vector.<int>;
public var firstValue:CustomType;
public var field2:Boolean = false;
public var secondValue:String = """";
public var isWorks:Boolean = false;";
var pattern = @"\s*(?<vis>\w+?)\s+var\s+(?<name>\w+?)\s*:\s*(?<type>\S+?)(\s*=\s*(?<value>\S+?))?\s*;";
var regex = new Regex(pattern, RegexOptions.Multiline);
var results=regex.Matches(input);
foreach (Match m in results)
{
var g = m.Groups;
Console.WriteLine($"{g["name"],-15} {g["type"],-10} {g["value"],-10}");
}
var properties = (from m in results.OfType<Match>()
let g = m.Groups
select new Property
{
Name = g["name"].Value,
Type = g["type"].Value,
Value = g["value"].Value
})
.ToList();
I would consider using a parser generator like ANTLR though, if I had to parse more complex input or if there are multiple patterns to match. Learning how to write the grammar takes some time, but once you learn it, it's easy to create parsers that can match input that would require very complicated regular expressions. Whitespace management also becomes a lot easier
In this case, the grammar could be something like:
property : visibility var name COLON type (EQUALS value)? SEMICOLON;
visibility : ALPHA+;
var : ALPHA ALPHA ALPHA;
name : ALPHANUM+;
type : (ALPHANUM|DOT|LEFT|RIGHT)+;
value : ALPHANUM
      | literal;
literal : DOUBLE_QUOTE ALPHANUM* DOUBLE_QUOTE;
ALPHANUM : ALPHA
         | DIGIT;
ALPHA : [A-Za-z];
DIGIT : [0-9];
...
WS : [ \t\r\n]+ -> skip;
With a parser, adding eg comments would be as simple as adding comment before SEMICOLON in the property rule and a new comment rule that would match the pattern of a comment

Convert ASCII code to original string in c#

I'm trying to convert a string to ASCII codes, so I use this function:
public List<string> PrimaryCode(string OrginalStr)
{
List<string> lstResult = new List<string>();
int lenOrginal = OrginalStr.Length;
string subOrginalStr;
byte[] AsciisubOrginalStr;
int AC;
for (int i = 0; i < lenOrginal; i++)
{
subOrginalStr = OrginalStr.Substring(i, 1);
AsciisubOrginalStr = Encoding.ASCII.GetBytes(subOrginalStr);
if (AsciisubOrginalStr[0] > 100)
{
AC = Convert.ToInt32(AsciisubOrginalStr[0]);
lstResult.Add((AC ).ToString());
}
else
{
AC = Convert.ToInt32(AsciisubOrginalStr[0]);
lstResult.Add((AC).ToString());
}
}
return lstResult;
}
In another part of my project I need to convert the ASCII codes back to the original text. As you can see, I use this function:
public List<string> PrimaryCodeRev(List<string> CodedStr)
{
string res = "";
foreach (string s in CodedStr)
{
res = res+s;
}
List<string> lstResult = new List<string>();
int lenOrginal = res.Length;
string subOrginalStr;
byte[] AsciisubOrginalStr;
int AC;
for (int i = 0; i < lenOrginal; i++)
{
subOrginalStr = res.Substring(i, 1);
AsciisubOrginalStr = Encoding.ASCII.GetBytes(subOrginalStr);
if (AsciisubOrginalStr[0] < 100)
{
AC = Convert.ToInt32(AsciisubOrginalStr[0]);
lstResult.Add((AC).ToString());
}
else
{
AC = Convert.ToInt32(AsciisubOrginalStr[0]);
lstResult.Add((AC).ToString());
}
}
return lstResult;
}
The string input hello
convert to ascii result :
Convert ascii to main text :
But it doesn't work and it doesn't return the main text. Why ?
You seem to be making it too complicated...
If your input string only ever contains ASCII characters (which must be a requirement), then you can encode it as follows:
public static IEnumerable<string> ToDecimalAscii(string input)
{
return input.Select(c => ((int)c).ToString());
}
You can convert it back to a string like so:
public static string FromDecimalAscii(IEnumerable<string> input)
{
return new string(input.Select(s => (char)int.Parse(s)).ToArray());
}
Putting it together into a compilable console program:
using System;
using System.Collections.Generic;
using System.Linq;
namespace Demo
{
class Program
{
static void Main(string[] args)
{
string original = "hello";
var encoded = ToDecimalAscii(original);
Console.WriteLine("Encoded:");
Console.WriteLine(string.Join("\n", encoded));
Console.WriteLine("\nDecoded: " + FromDecimalAscii(encoded));
}
public static IEnumerable<string> ToDecimalAscii(string input)
{
return input.Select(c => ((int)c).ToString());
}
public static string FromDecimalAscii(IEnumerable<string> input)
{
return new string(input.Select(s => (char)int.Parse(s)).ToArray());
}
}
}
Let me reiterate: This will ONLY work if your input string is guaranteed to contain only characters that are in the ASCII set.
This does not really answer the question of why you want to do this. If you are trying to encode something, you might be better using some sort of encrypting method which outputs an array of bytes, and converting that array to base 64.
This does look like an X-Y question to me.
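The Base64 step suggested above can be sketched as follows. This shows only the conversion between text, bytes and Base64; it performs no encryption, and Base64 on its own hides nothing. Base64Text is a hypothetical class name, and using UTF-8 for the byte conversion is an assumption of this sketch.

```csharp
using System;
using System.Text;

// Hypothetical helper: text -> UTF-8 bytes -> Base64 and back.
public static class Base64Text
{
    public static string Encode(string text) =>
        Convert.ToBase64String(Encoding.UTF8.GetBytes(text));

    public static string Decode(string base64) =>
        Encoding.UTF8.GetString(Convert.FromBase64String(base64));
}
```

Unlike the decimal ASCII approach, this round-trips any string, including non-ASCII text such as "Žb".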
To get the ASCII codes, simply do the following:
byte[] asciiBytes = Encoding.ASCII.GetBytes(value);
To get a character back from an ASCII code:
char c1 = (char) asciiCode;
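One caveat worth illustrating: Encoding.ASCII.GetBytes does not fail on non-ASCII input. With the default settings it silently replaces each such character with '?' (code 63). A minimal sketch of a guarded round trip, with AsciiRoundTrip and its members as hypothetical names:

```csharp
using System;
using System.Linq;
using System.Text;

// Hypothetical helper: refuses input that ASCII cannot represent.
public static class AsciiRoundTrip
{
    // ASCII covers the code points 0 to 127 only.
    public static bool IsAscii(string text) => text.All(c => c <= 127);

    public static byte[] ToCodes(string text)
    {
        if (!IsAscii(text))
            throw new ArgumentException("Input contains non-ASCII characters.", nameof(text));
        return Encoding.ASCII.GetBytes(text);
    }

    public static string FromCodes(byte[] codes) => Encoding.ASCII.GetString(codes);
}
```

FromCodes(ToCodes("hello")) returns "hello", whereas ToCodes("Ūa") throws instead of quietly producing "?a".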

compare the characters in two strings

In C#, how do I compare the characters in two strings.
For example, let's say I have these two strings
"bc3231dsc" and "bc3462dsc"
How do I programmatically figure out that the strings both start with "bc3" and end with "dsc"?
So the given would be two variables:
var1 = "bc3231dsc";
var2 = "bc3462dsc";
After comparing each characters from var1 to var2, I would want the output to be:
leftMatch = "bc3";
center1 = "231";
center2 = "462";
rightMatch = "dsc";
Conditions:
1. The strings will always have a length of 9 characters.
2. The strings are not case sensitive.
The string class has two methods (StartsWith and EndsWith) that you can use.
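A minimal sketch of that idea, assuming the prefix and suffix to test for are already known. AffixCheck and SharesAffixes are hypothetical names. StringComparison.OrdinalIgnoreCase is used because the question states that the strings are not case sensitive.

```csharp
using System;

// Hypothetical helper: checks a known prefix and suffix on both strings.
public static class AffixCheck
{
    public static bool SharesAffixes(string a, string b, string prefix, string suffix)
    {
        const StringComparison cmp = StringComparison.OrdinalIgnoreCase;
        return a.StartsWith(prefix, cmp) && b.StartsWith(prefix, cmp)
            && a.EndsWith(suffix, cmp) && b.EndsWith(suffix, cmp);
    }
}
```

This only confirms affixes you already know. It does not discover the common parts of two arbitrary strings.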
After reading your question and the answers already given, I think some constraints are missing that may be obvious to you but not to the community. But maybe we can do a little guesswork:
You have a bunch of string pairs that should be compared.
The two strings in each pair have the same length, or you are only interested in comparing the characters read simultaneously from left to right.
You want some kind of enumeration that tells you where each block starts and how long it is.
Because a string is just an enumeration of chars, you can use LINQ to find the matching characters like this:
private IEnumerable<bool> CommonChars(string first, string second)
{
if (first == null)
throw new ArgumentNullException("first");
if (second == null)
throw new ArgumentNullException("second");
var charsToCompare = first.Zip(second, (LeftChar, RightChar) => new { LeftChar, RightChar });
var matchingChars = charsToCompare.Select(pair => pair.LeftChar == pair.RightChar);
return matchingChars;
}
With this we can proceed and find out how long each block of consecutive true or false flags is with this method:
private IEnumerable<Tuple<int, int>> Pack(IEnumerable<bool> source)
{
if (source == null)
throw new ArgumentNullException("source");
using (var iterator = source.GetEnumerator())
{
if (!iterator.MoveNext())
{
yield break;
}
bool current = iterator.Current;
int index = 0;
int length = 1;
while (iterator.MoveNext())
{
if(current != iterator.Current)
{
yield return Tuple.Create(index, length);
index += length;
length = 0;
}
current = iterator.Current;
length++;
}
yield return Tuple.Create(index, length);
}
}
I don't know whether an existing LINQ function already provides this functionality. From what I have read, it should be possible with SelectMany() (in theory you can accomplish any LINQ task with that method), but as an ad hoc implementation the above was easier (for me).
These functions could then be used in a way something like this:
var firstString = "bc3231dsc";
var secondString = "bc3462dsc";
var commonChars = CommonChars(firstString, secondString);
var packs = Pack(commonChars);
foreach (var item in packs)
{
Console.WriteLine("Left side: " + firstString.Substring(item.Item1, item.Item2));
Console.WriteLine("Right side: " + secondString.Substring(item.Item1, item.Item2));
Console.WriteLine();
}
This would then give you the following output:
Left side: bc3
Right side: bc3
Left side: 231
Right side: 462
Left side: dsc
Right side: dsc
The biggest drawback is the use of Tuple, because it leads to the ugly property names Item1 and Item2, which are far from instantly readable. If you want, you can introduce your own simple class that holds the two integers and has meaningful property names. Also, the information about whether each block is shared by both strings or differs between them is currently lost, but it should be fairly simple to add that to the tuple or to your own class.
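One possible shape for such a class, as a sketch only. Block, BlockFinder and FindBlocks are hypothetical names. The sketch merges the two steps above into one loop, records whether each block matches, and compares case-insensitively as the question requires.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical replacement for Tuple<int, int> with readable property names.
public sealed class Block
{
    public int Start { get; }
    public int Length { get; }
    public bool IsMatch { get; }   // true if both strings agree in this block

    public Block(int start, int length, bool isMatch)
    {
        Start = start;
        Length = length;
        IsMatch = isMatch;
    }
}

public static class BlockFinder
{
    public static List<Block> FindBlocks(string first, string second)
    {
        if (first == null) throw new ArgumentNullException(nameof(first));
        if (second == null) throw new ArgumentNullException(nameof(second));

        var blocks = new List<Block>();
        // Only the positions present in both strings are compared.
        int count = Math.Min(first.Length, second.Length);
        int start = 0;
        for (int i = 1; i <= count; i++)
        {
            bool previous = Same(first[i - 1], second[i - 1]);
            // Close the current block at the end or when the flag flips.
            if (i == count || Same(first[i], second[i]) != previous)
            {
                blocks.Add(new Block(start, i - start, previous));
                start = i;
            }
        }
        return blocks;
    }

    private static bool Same(char a, char b) =>
        char.ToUpperInvariant(a) == char.ToUpperInvariant(b);
}
```

For "bc3231dsc" and "bc3462dsc" this yields three blocks of length 3 starting at 0, 3 and 6, with IsMatch true, false and true.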
static void Main(string[] args)
{
    string test1 = "bc3231dsc";
    string test2 = "bc3462dsc";
    string firstmatch = GetMatch(test1, test2, false);
    string lastmatch = GetMatch(test1, test2, true);
    // Whatever lies between the common prefix and the common suffix
    // (assumes the two strings differ somewhere)
    int centerLength = test1.Length - (firstmatch.Length + lastmatch.Length);
    string center1 = test1.Substring(firstmatch.Length, centerLength);
    string center2 = test2.Substring(firstmatch.Length, centerLength);
}
// Returns the common prefix of both strings, or the common suffix if isReverse is true
public static string GetMatch(string first, string second, bool isReverse)
{
    if (isReverse)
    {
        first = ReverseString(first);
        second = ReverseString(second);
    }
    StringBuilder builder = new StringBuilder();
    for (int i = 0; i < first.Length && i < second.Length; i++)
    {
        if (first[i].Equals(second[i]))
        {
            builder.Append(first[i]);
        }
        else
        {
            break;
        }
    }
    if (isReverse)
    {
        return ReverseString(builder.ToString());
    }
    return builder.ToString();
}
public static string ReverseString(string s)
{
char[] arr = s.ToCharArray();
Array.Reverse(arr);
return new string(arr);
}
Roughly what you need:
string resultStart = "";
int pos = 0;
// Walk both strings from the left while the characters agree.
while (pos < string1.Length && pos < string2.Length && string1[pos] == string2[pos])
{
    resultStart += string1[pos];
    pos++;
}
resultStart now holds the common start string. You can do the same going backwards to get the common end.
Another solution you can use is Regular Expressions.
Regex re = new Regex("^bc3.*?dsc$");
String first = "bc3231dsc";
if(re.IsMatch(first)) {
//Act accordingly...
}
This gives you more flexibility when matching. The pattern above matches any string that starts with bc3 and ends with dsc, with anything in between except a linefeed. By changing .*? to \d+, you could specify that you only want digits between the two fields. From there, the possibilities are endless.
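As a sketch of where this can go, a named group can also capture the part between the two fixed fields. MiddleExtractor and GetMiddle are hypothetical names, and the pattern assumes the middle consists of digits only. RegexOptions.IgnoreCase reflects the condition that the strings are not case sensitive.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: extracts the digits between a fixed prefix and suffix.
public static class MiddleExtractor
{
    private static readonly Regex Pattern =
        new Regex(@"^bc3(?<middle>\d+)dsc$", RegexOptions.IgnoreCase);

    // Returns the captured middle part, or null if the value does not match.
    public static string GetMiddle(string value)
    {
        Match m = Pattern.Match(value);
        return m.Success ? m.Groups["middle"].Value : null;
    }
}
```

GetMiddle("bc3231dsc") returns "231" and GetMiddle("bc3462dsc") returns "462". As before, the prefix and suffix must be known in advance.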
using System;
using System.Text.RegularExpressions;
using System.Collections.Generic;
class Sample {
static public void Main(){
string s1 = "bc3231dsc";
string s2 = "bc3462dsc";
List<string> common_str = commonStrings(s1,s2);
foreach ( var s in common_str)
Console.WriteLine(s);
}
static public List<string> commonStrings(string s1, string s2){
int len = s1.Length;
char [] match_chars = new char[len];
for(var i = 0; i < len ; ++i)
match_chars[i] = (Char.ToLower(s1[i])==Char.ToLower(s2[i]))? '#' : '_';
string pat = new String(match_chars);
Regex regex = new Regex("(#+)", RegexOptions.Compiled);
List<string> result = new List<string>();
foreach (Match match in regex.Matches(pat))
result.Add(s1.Substring(match.Index, match.Length));
return result;
}
}
For the updated conditions:
using System;
class Sample {
static public void Main(){
string s1 = "bc3231dsc";
string s2 = "bc3462dsc";
int len = 9;//s1.Length;//cond.1)
int l_pos = 0;
int r_pos = len;
for(int i=0;i<len && Char.ToLower(s1[i])==Char.ToLower(s2[i]);++i){
++l_pos;
}
for(int i=len-1;i>=l_pos && Char.ToLower(s1[i])==Char.ToLower(s2[i]);--i){
--r_pos;
}
string leftMatch = s1.Substring(0,l_pos);
string center1 = s1.Substring(l_pos, r_pos - l_pos);
string center2 = s2.Substring(l_pos, r_pos - l_pos);
string rightMatch = s1.Substring(r_pos);
Console.Write(
"leftMatch = \"{0}\"\n" +
"center1 = \"{1}\"\n" +
"center2 = \"{2}\"\n" +
"rightMatch = \"{3}\"\n",leftMatch, center1, center2, rightMatch);
}
}
