C# Sort Lithuanian Letters - c#

I need to sort the letters read from a file in alphabetical order. How can I do this? I also need a ToString method. Right now the console prints:
ABCDEFGIJKLM...ĄČĖĮ...
I need to get this:
AĄBCČDEĘĖFGHIĮYJKLMNOPRSŠTUŲŪVZŽ
Thanks
static char[] Letters(string e)
{
e = e.ToUpper();
char[] mas = new char[32];
int n = 0;
foreach (char r in e)
if (Array.IndexOf(mas, r) < 0)
if (Char.IsLetter(r))
mas[n++] = r;
Array.Resize(ref mas, n);
Array.Sort(mas);
return mas;
}

You can solve this by sorting the characters using a comparer that understands how to compare characters alphabetically (the default is ordinal comparison).
This implementation is very inefficient, because it converts chars to strings every time it does a compare, but it works:
public class CharComparer : IComparer<char>
{
readonly CultureInfo culture;
public CharComparer(CultureInfo culture)
{
this.culture = culture;
}
public int Compare(char x, char y)
{
return string.Compare(new string(x, 1), 0, new string(y, 1), 0, 1, false, culture);
}
}
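(Note: If you leave the culture out, string.Compare falls back to the current thread culture, so the result only follows the Lithuanian order when that culture is already Lithuanian. Passing the culture explicitly avoids depending on the machine's settings.)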
Then you can use that with sort functions that accept an IComparer<char>, such as Array.Sort():
static void Main()
{
var test = "AĄBCČDEĘĖFGHIĮYJKLMNOPRSŠTUŲŪVZŽ".ToCharArray();
Console.OutputEncoding = System.Text.Encoding.Unicode;
Array.Sort(test);
Console.WriteLine(new string(test)); // Wrong result using default char comparer.
Array.Sort(test, new CharComparer(CultureInfo.GetCultureInfo("lt"))); // Right result using string comparer.
Console.WriteLine(new string(test));
}
An alternative approach is to use an array of single-character strings rather than an array of chars, and sort that instead. This works because the sort functions will use the string comparer, which understands alphabetical order:
var test = "AĄBCČDEĘĖFGHIĮYJKLMNOPRSŠTUŲŪVZŽ".Select(x => new string(x, 1)).ToArray();
Console.OutputEncoding = System.Text.Encoding.Unicode;
Array.Sort(test); // Correct result because it uses the string comparer, which understands alphabetical order.
Console.WriteLine(string.Concat(test));
Or using Linq:
var test = "AĄBCČDEĘĖFGHIĮYJKLMNOPRSŠTUŲŪVZŽ".Select(x => new string(x, 1)).ToArray();
Console.OutputEncoding = System.Text.Encoding.Unicode;
// Correct result because it uses the string comparer, which understands alphabetical order.
test = test.OrderBy(x => x).ToArray();
Console.WriteLine(string.Concat(test));
Using an array of strings instead of an array of chars is probably more performant when sorting like this.
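To tie this back to the Letters method in the question, here is a minimal sketch that collects the distinct letters of a text and orders them with a culture-aware string comparer. The names LetterSorter and SortLetters are made up for illustration, and the exact order you get depends on the collation data available on your machine.

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class LetterSorter
{
    // Hypothetical helper: returns the distinct letters of text, upper-cased
    // and ordered by the collation rules of the given culture.
    public static string SortLetters(string text, CultureInfo culture)
    {
        // Culture-aware, case-sensitive comparer for single-letter strings.
        StringComparer comparer = StringComparer.Create(culture, false);

        return string.Concat(
            text.ToUpper(culture)
                .Where(c => char.IsLetter(c))
                .Distinct()
                .Select(c => c.ToString())
                .OrderBy(s => s, comparer));
    }
}
```

For the Lithuanian alphabet you would call it with something like LetterSorter.SortLetters(text, CultureInfo.GetCultureInfo("lt-LT")), where text is whatever you read from your file.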

You could use the following method to remove diacritics:
static string RemoveDiacritics(string text)
{
var normalizedString = text.Normalize(NormalizationForm.FormD);
var stringBuilder = new StringBuilder();
foreach (var c in normalizedString)
{
var unicodeCategory = CharUnicodeInfo.GetUnicodeCategory(c);
if (unicodeCategory != UnicodeCategory.NonSpacingMark)
{
stringBuilder.Append(c);
}
}
return stringBuilder.ToString().Normalize(NormalizationForm.FormC);
}
Then you can use those chars for the ordering:
string e = "ABCDEFGIJKLM...ĄČĖĮ...";
var normalizedCharList = e.Zip(RemoveDiacritics(e), (chr, n) => new { chr, normValue = (int)n }).ToList();
var orderedChars = normalizedCharList.OrderBy(x => x.normValue).Select(x => x.chr);
string ordered = new String(orderedChars.ToArray());

Related

rearrange the characters of a string so that any two adjacent characters are not the same

How can I rearrange the characters of a string so that no two adjacent characters are the same, using C#?
Without using Hashmaps and Dictionary
I managed to find each element of the string, and the occurrence of each element.
This is what I've done so far
Using LINQ, you can gather the characters of the string, group them by duplicate characters, then pivot the groups and join them back into a string.
First, some extension methods to make Join easier:
public static class IEnumerableExt {
public static string Join(this IEnumerable<char> chars) => String.Concat(chars); // faster >= .Net Core 2.1
public static string Join(this IEnumerable<string> strings) => String.Concat(strings);
}
Then, an extension method to pivot an IEnumerable<IEnumerable<T>>:
public static class IEnumerableIEnumerableExt {
public static IEnumerable<IEnumerable<T>> Pivot<T>(this IEnumerable<IEnumerable<T>> src) {
var enums = src.Select(ie => ie.GetEnumerator()).ToList();
var hasMores = Enumerable.Range(0, enums.Count).Select(n => true).ToList();
for (; ; ) {
var oneGroup = new List<T>();
var hasMore = false;
for (int j1 = 0; j1 < enums.Count; ++j1) {
if (hasMores[j1]) {
hasMores[j1] = enums[j1].MoveNext();
hasMore = hasMore || hasMores[j1];
if (hasMores[j1])
oneGroup.Add(enums[j1].Current);
}
}
if (!hasMore)
break;
yield return oneGroup;
}
for (int j1 = 0; j1 < enums.Count; ++j1)
enums[j1].Dispose();
}
}
Finally, use these to solve your problem:
var s = "How to rearrange the characters of a string";
var tryAns = s.OrderBy(c => c)
.GroupBy(ch => ch)
.Pivot()
.Select(gch => gch.Join())
.Join();
var dupRE = new Regex(@"(.)\1", RegexOptions.Compiled);
var hasDups = dupRE.IsMatch(tryAns);
// tryAns will be " Hacefghinorstw aceghnorst aeort aert ar r "
// hasDups will be false
If the resulting string has two adjacent identical characters, then hasDups will be true.

How to output unique symbols except case c# without LINQ

Output unique symbols ignoring case
IDictionary<char, int> charDict = new Dictionary<char, int>();
foreach (var ch in text)
{
if (!charDict.TryGetValue(ch, out int n)) {
charDict.Add(new KeyValuePair<char, int>(ch, 1));
} else
{
charDict[ch]++;
}
}
Appellodppsafs => Apelodsf
Is it also possible to do this without LINQ?
Use a HashSet<char> to remember existing characters (that's what Distinct() does internally)
Assuming your input and expected result are type string
string input = "Appellodppsafs";
HashSet<char> crs = new HashSet<char>();
string result = string.Concat(input.Where(x => crs.Add(char.ToLower(x)))); //Apelodsf
You can try this (if you do not have long strings or performance issues):
string str = "Appellodppsafs";
string result = string.Concat(str.Select(s => $"{s}")
.Distinct(StringComparer.InvariantCultureIgnoreCase));
Console.WriteLine(result);
Output:
Apelodsf
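Since the question asks for a version without LINQ, here is a minimal sketch of the same HashSet<char> idea written as a plain loop. The class and method names (UniqueChars, IgnoreCase) are invented for this example, and it assumes that invariant lower-casing is an acceptable way to ignore case.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class UniqueChars
{
    // Keeps the first occurrence of every character, comparing case-insensitively.
    public static string IgnoreCase(string input)
    {
        var seen = new HashSet<char>();
        var result = new StringBuilder();
        foreach (char ch in input)
        {
            // Add returns false when the lower-cased character was seen before.
            if (seen.Add(char.ToLowerInvariant(ch)))
                result.Append(ch);
        }
        return result.ToString();
    }
}
```

UniqueChars.IgnoreCase("Appellodppsafs") returns "Apelodsf", the same result as the LINQ versions above.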

Removing matching characters between two strings

I want to remove the characters which are matching between the two given strings. Eg.
string str1 = "Abbbccd";
string str2 = "Ebbd";
From these two strings I want the output to be:
"Abcc", i.e. only as many matching characters should be removed from str1 as are present in str2.
I tried the following code:
public string Sub(string str1, string str2)
{
char[] arr1 = str1.ToCharArray();
char[] arr2 = str2.ToCharArray();
char[] arrDifference = arr1.Except(arr2).ToArray();
string final = new string(arrDifference);
return final;
}
With this code I get the output as "Ac". It removes all the matching characters between two arrays and stores 'c' only once.
First create this helper method:
IEnumerable<Tuple<char, int>> IndexDistinct(IEnumerable<char> source)
{
var D = new Dictionary<char, int>();
foreach (var c in source)
{
D[c] = D.ContainsKey(c) ? (D[c] + 1) : 0;
yield return Tuple.Create(c, D[c]);
}
}
It converts a string "aabcccd" to [(a,0),(a,1),(b,0),(c,0),(c,1),(c,2),(d,0)]. The idea is to make every character distinct by adding a counting index on equal characters.
Then modify your proposed function like this:
string Sub(string str1, string str2)
{
return new string(
IndexDistinct(str1)
.Except(IndexDistinct(str2))
.Select(x => x.Item1)
.ToArray());
}
Now that you are doing Except on Tuple<char, int> instead of just char, you should get the behavior you specified.
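If you would rather not go through tuples, the same result can be reached by counting. The following is only a sketch under that assumption; StringDifference and RemoveMatching are hypothetical names, not part of the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class StringDifference
{
    // Removes from str1 as many copies of each character as str2 contains.
    public static string RemoveMatching(string str1, string str2)
    {
        // How many copies of each character are still to be removed.
        var toRemove = new Dictionary<char, int>();
        foreach (char c in str2)
        {
            toRemove.TryGetValue(c, out int count);
            toRemove[c] = count + 1;
        }

        var result = new StringBuilder();
        foreach (char c in str1)
        {
            if (toRemove.TryGetValue(c, out int left) && left > 0)
                toRemove[c] = left - 1;   // consume one pending removal
            else
                result.Append(c);
        }
        return result.ToString();
    }
}
```

StringDifference.RemoveMatching("Abbbccd", "Ebbd") gives "Abcc". Like the tuple version, it drops the earliest occurrences and keeps the remaining characters in their original order.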
You can do it with lists as well:
List<char> one = new List<char>("Abbbccd".ToCharArray());
List<char> two = new List<char>("Ebbd".ToCharArray());
foreach (char c in two) {
// Remove deletes only the first occurrence and does nothing if c is absent.
one.Remove(c);
}
string result = new string(one.ToArray());
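Use C#'s string methods to modify the string. Replace would delete every occurrence of a character, so remove only the first occurrence for each character of str2:
public string testmethod(string str1, string str2)
{
string result = str1;
foreach (char character in str2)
{
// Remove only the first occurrence, if there is one.
int index = result.IndexOf(character);
if (index >= 0)
result = result.Remove(index, 1);
}
return result;
}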

compare the characters in two strings

In C#, how do I compare the characters in two strings.
For example, let's say I have these two strings
"bc3231dsc" and "bc3462dsc"
How do I programmatically figure out that the strings
both start with "bc3" and end with "dsc"?
So the given would be two variables:
var1 = "bc3231dsc";
var2 = "bc3462dsc";
After comparing each characters from var1 to var2, I would want the output to be:
leftMatch = "bc3";
center1 = "231";
center2 = "462";
rightMatch = "dsc";
Conditions:
1. The strings will always have a length of 9 characters.
2. The strings are not case sensitive.
The string class has two methods, StartsWith and EndsWith, that you can use.
After reading your question and the answers already given, I think some constraints are missing that may be obvious to you but not to the community. So here is a little guesswork:
You have a bunch of string pairs that should be compared.
The two strings in each pair have the same length, or you are only interested in comparing the characters read simultaneously from left to right.
You want some kind of enumeration that tells you where each block starts and how long it is.
Because a string is just a sequence of chars, you can use LINQ to find the matching characters like this:
private IEnumerable<bool> CommonChars(string first, string second)
{
if (first == null)
throw new ArgumentNullException("first");
if (second == null)
throw new ArgumentNullException("second");
var charsToCompare = first.Zip(second, (LeftChar, RightChar) => new { LeftChar, RightChar });
var matchingChars = charsToCompare.Select(pair => pair.LeftChar == pair.RightChar);
return matchingChars;
}
With this in place, the following method finds how long each block of consecutive true or false flags is:
private IEnumerable<Tuple<int, int>> Pack(IEnumerable<bool> source)
{
if (source == null)
throw new ArgumentNullException("source");
using (var iterator = source.GetEnumerator())
{
if (!iterator.MoveNext())
{
yield break;
}
bool current = iterator.Current;
int index = 0;
int length = 1;
while (iterator.MoveNext())
{
if(current != iterator.Current)
{
yield return Tuple.Create(index, length);
index += length;
length = 0;
}
current = iterator.Current;
length++;
}
yield return Tuple.Create(index, length);
}
}
I don't currently know of an existing LINQ function that provides the same functionality. As far as I have read, it should be possible with SelectMany() (because in theory you can accomplish any LINQ task with that method), but as an ad hoc implementation the above was easier (for me).
These functions could then be used like this:
var firstString = "bc3231dsc";
var secondString = "bc3462dsc";
var commonChars = CommonChars(firstString, secondString);
var packs = Pack(commonChars);
foreach (var item in packs)
{
Console.WriteLine("Left side: " + firstString.Substring(item.Item1, item.Item2));
Console.WriteLine("Right side: " + secondString.Substring(item.Item1, item.Item2));
Console.WriteLine();
}
This would then give the following output:
Left side: bc3
Right side: bc3
Left side: 231
Right side: 462
Left side: dsc
Right side: dsc
The biggest drawback is arguably the use of Tuple, because it leads to the ugly property names Item1 and Item2, which are far from instantly readable. If that bothers you, you could introduce your own simple class that holds the two integers under more descriptive property names. Also, the information about whether each block is shared by both strings or differs between them is currently lost, but it should be fairly simple to add that to the tuple or to your own class.
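As a rough sketch of that suggestion, the following uses a small class with named properties instead of Tuple and also records whether a block matches. Block, BlockFinder and FindBlocks are names invented here, and the loop is a compact rewrite of CommonChars plus Pack rather than the code from the answer. It compares only up to the length of the shorter string and is case-sensitive.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical replacement for Tuple<int, int>: a block with readable names.
public sealed class Block
{
    public int Start { get; }
    public int Length { get; }
    public bool IsMatch { get; }

    public Block(int start, int length, bool isMatch)
    {
        Start = start;
        Length = length;
        IsMatch = isMatch;
    }
}

public static class BlockFinder
{
    // Splits the compared range into runs of matching and non-matching positions.
    public static List<Block> FindBlocks(string first, string second)
    {
        if (first == null) throw new ArgumentNullException(nameof(first));
        if (second == null) throw new ArgumentNullException(nameof(second));

        var blocks = new List<Block>();
        int length = Math.Min(first.Length, second.Length);
        int start = 0;
        for (int i = 1; i <= length; i++)
        {
            bool previous = first[i - 1] == second[i - 1];
            // Close the block at the end of the input or when the flag flips.
            if (i == length || (first[i] == second[i]) != previous)
            {
                blocks.Add(new Block(start, i - start, previous));
                start = i;
            }
        }
        return blocks;
    }
}
```

For "bc3231dsc" and "bc3462dsc" this yields three blocks of length 3 starting at 0, 3 and 6, where the middle one has IsMatch set to false.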
static void Main(string[] args)
{
string test1 = "bc3231dsc";
string test2 = "bc3462dsc";
string firstmatch = GetMatch(test1, test2, false);
string lastmatch = GetMatch(test1, test2, true);
string center1 = test1.Substring(firstmatch.Length, test1.Length - (firstmatch.Length + lastmatch.Length));
string center2 = test2.Substring(firstmatch.Length, test2.Length - (firstmatch.Length + lastmatch.Length));
}
public static string GetMatch(string first, string second, bool isReverse)
{
if (isReverse)
{
// Reverse both strings so the common suffix becomes a common prefix.
first = ReverseString(first);
second = ReverseString(second);
}
StringBuilder builder = new StringBuilder();
for (int i = 0; i < first.Length; i++)
{
// Stop at the first difference or when the second string runs out.
if (i < second.Length && first[i] == second[i])
{
builder.Append(first[i]);
}
else
{
break;
}
}
if (isReverse)
{
return ReverseString(builder.ToString());
}
return builder.ToString();
}
public static string ReverseString(string s)
{
char[] arr = s.ToCharArray();
Array.Reverse(arr);
return new string(arr);
}
A sketch of what you need (string1 and string2 are the two inputs):
// Collect the common start of string1 and string2.
int stringpos = 0;
string resultstart = "";
while (stringpos < string1.Length && stringpos < string2.Length)
{
if (string1[stringpos] == string2[stringpos])
resultstart += string1[stringpos];
else
break;
stringpos++;
}
resultstart now holds the common start string. You can do the same going backwards to find the common end.
Another solution you can use is Regular Expressions.
Regex re = new Regex("^bc3.*?dsc$");
String first = "bc3231dsc";
if(re.IsMatch(first)) {
//Act accordingly...
}
This gives you more flexibility when matching. The pattern above matches any string that starts with bc3 and ends with dsc, with anything in between except a linefeed. By changing .*? to \d+, you could specify that you only want digits between the two fields. From there, the possibilities are endless.
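As an illustration of the digits-only variant, here is a small sketch that also captures the part between the two fixed fields. PatternDemo and ExtractCenter are hypothetical names, and the pattern assumes the prefix bc3 and the suffix dsc are known in advance, as in the example above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PatternDemo
{
    // Returns the digits between "bc3" and "dsc", or null when the input does not match.
    public static string ExtractCenter(string input)
    {
        // IgnoreCase because the question says the strings are not case sensitive.
        Match m = Regex.Match(input, @"^bc3(\d+)dsc$", RegexOptions.IgnoreCase);
        return m.Success ? m.Groups[1].Value : null;
    }
}
```

PatternDemo.ExtractCenter("bc3231dsc") returns "231", while an input with letters in the middle returns null.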
using System;
using System.Text.RegularExpressions;
using System.Collections.Generic;
class Sample {
static public void Main(){
string s1 = "bc3231dsc";
string s2 = "bc3462dsc";
List<string> common_str = commonStrings(s1,s2);
foreach ( var s in common_str)
Console.WriteLine(s);
}
static public List<string> commonStrings(string s1, string s2){
int len = s1.Length;
char [] match_chars = new char[len];
for(var i = 0; i < len ; ++i)
match_chars[i] = (Char.ToLower(s1[i])==Char.ToLower(s2[i]))? '#' : '_';
string pat = new String(match_chars);
Regex regex = new Regex("(#+)", RegexOptions.Compiled);
List<string> result = new List<string>();
foreach (Match match in regex.Matches(pat))
result.Add(s1.Substring(match.Index, match.Length));
return result;
}
}
For the updated conditions:
using System;
class Sample {
static public void Main(){
string s1 = "bc3231dsc";
string s2 = "bc3462dsc";
int len = 9;//s1.Length;//cond.1)
int l_pos = 0;
int r_pos = len;
for(int i=0;i<len && Char.ToLower(s1[i])==Char.ToLower(s2[i]);++i){
++l_pos;
}
for(int i=len-1;i>=l_pos && Char.ToLower(s1[i])==Char.ToLower(s2[i]);--i){ // stop at l_pos so identical strings do not overlap
--r_pos;
}
string leftMatch = s1.Substring(0,l_pos);
string center1 = s1.Substring(l_pos, r_pos - l_pos);
string center2 = s2.Substring(l_pos, r_pos - l_pos);
string rightMatch = s1.Substring(r_pos);
Console.Write(
"leftMatch = \"{0}\"\n" +
"center1 = \"{1}\"\n" +
"center2 = \"{2}\"\n" +
"rightMatch = \"{3}\"\n",leftMatch, center1, center2, rightMatch);
}
}

How to sort a string array by numeric style?

I have an array of filenames and I want to sort it in numeric (natural) order. Please suggest a solution.
Example1:
Original array: [name99.txt, name98.txt, name100.txt]
Sorted array: [name98.txt, name99.txt, name100.txt]
(Using string sorting, result of sorting is [name100.txt, name98.txt, name99.txt])
Example2:
Original array: [a99.txt, b98.txt, b100.txt]
Sorted array: [a99.txt, b98.txt, b100.txt]
(Using string sorting, result of sorting is [a99.txt, b100.txt, b98.txt])
string[] ar = new string[] { "name99.txt", "name98.txt", "name100.txt" };
Array.Sort(ar, (a, b) => int.Parse(Regex.Replace(a, "[^0-9]", "")) - int.Parse(Regex.Replace(b, "[^0-9]", "")));
foreach (var a in ar)
Console.WriteLine(a);
The above assumes that your files are always called name###.txt. For real numeric sorting, use the following more complicated version:
public static void NumericalSort(string[] ar)
{
Regex rgx = new Regex("([^0-9]*)([0-9]+)");
Array.Sort(ar, (a, b) =>
{
var ma = rgx.Matches(a);
var mb = rgx.Matches(b);
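// Compare only the chunks both names have, so mb[i] cannot go out of range.
int count = Math.Min(ma.Count, mb.Count);
for (int i = 0; i < count; ++i)
{
int ret = ma[i].Groups[1].Value.CompareTo(mb[i].Groups[1].Value);
if (ret != 0)
return ret;
ret = int.Parse(ma[i].Groups[2].Value) - int.Parse(mb[i].Groups[2].Value);
if (ret != 0)
return ret;
}
// All shared chunks are equal: the name with fewer chunks sorts first.
return ma.Count.CompareTo(mb.Count);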
});
}
static void Main(string[] args)
{
string[] ar = new string[] { "a99.txt", "b98.txt", "b100.txt" };
NumericalSort(ar);
foreach (var a in ar)
Console.WriteLine(a);
}
There may well be a managed way to do this, but I would probably just P/invoke to StrCmpLogicalW.
[DllImport("shlwapi.dll", CharSet=CharSet.Unicode, ExactSpelling=true)]
static extern int StrCmpLogicalW(String x, String y);
If you use this function, rather than rolling your own comparison function, you'll get the same behaviour as Explorer and other system components that use logical comparison.
Note, however, that this will not work in environments where WinAPI is inaccessible (such as Windows Phone, Mono or Silverlight), might work differently on different systems and should be decorated with a comment so the future maintainer of your code knows why P/Invoke is used for sorting.
One solution can be found here: Alphanumeric Sorting
My approach is good when the length of the numeric chunks is no longer than 9 digits:
private string[] NumericalSort(IEnumerable<string> list)
{
var ar = list.ToArray();
Array.Sort(ar, (a, b) =>
{
var aa = Regex.Replace(a, @"\d+", m => m.Value.PadLeft(9, '0'));
var bb = Regex.Replace(b, @"\d+", m => m.Value.PadLeft(9, '0'));
return string.Compare(aa, bb);
});
return ar;
}
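If the 9-digit limit is a concern, one managed alternative is to compare digit runs by their length instead of padding them. The following is a sketch only: NaturalComparer is a made-up name, it uses ordinal comparison for the non-digit parts, treats only ASCII digits as numbers, and assumes neither argument is null.

```csharp
using System;
using System.Collections.Generic;

public sealed class NaturalComparer : IComparer<string>
{
    private static bool IsDigit(char c) => c >= '0' && c <= '9';

    public int Compare(string a, string b)
    {
        int i = 0, j = 0;
        while (i < a.Length && j < b.Length)
        {
            if (IsDigit(a[i]) && IsDigit(b[j]))
            {
                // Skip leading zeros, then find the end of each digit run.
                while (i < a.Length && a[i] == '0') i++;
                while (j < b.Length && b[j] == '0') j++;
                int ei = i, ej = j;
                while (ei < a.Length && IsDigit(a[ei])) ei++;
                while (ej < b.Length && IsDigit(b[ej])) ej++;

                // Without leading zeros, the longer run is the larger number.
                int byLength = (ei - i).CompareTo(ej - j);
                if (byLength != 0) return byLength;

                // Same length: the ordinal order of the digits is the numeric order.
                int byDigits = string.CompareOrdinal(a, i, b, j, ei - i);
                if (byDigits != 0) return byDigits;
                i = ei;
                j = ej;
            }
            else
            {
                if (a[i] != b[j]) return a[i].CompareTo(b[j]);
                i++;
                j++;
            }
        }
        // Everything compared so far is equal: the string with characters left sorts last.
        return (a.Length - i).CompareTo(b.Length - j);
    }
}
```

You would pass it to the sort call, for example Array.Sort(ar, new NaturalComparer()), which orders the first example as name98.txt, name99.txt, name100.txt.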
