Lexicographically Sort String Array - c#

I've been working on a challenge and, after researching it for several hours, I still can't work out how to "properly" sort a string array or list of strings in lexicographical order in C#.
The challenge I'm working on:
Take into account only the lower case letters of two strings.
Compare them to find out which characters occur in both strings.
Compare those characters and find out which of the original strings contains the most occurrences of each character.
Append the result as a string in a format that depicts which string has the higher count "1:" or "2:" or if equal "=:" followed by all the characters in that string joined by "/".
Sort the result in decreasing order of their length, and when they are of equal lengths sort the substrings in ascending lexicographic order.
An example of the expected result, with my output below it:
"2:eeeee/2:yy/=:hh/=:rr"
"2:eeeee/2:yy/=:rr/=:hh"
Another example of a correct result, again with my output below it:
1:ooo/1:uuu/2:sss/=:nnn/1:ii/2:aa/2:dd/2:ee/=:gg
=:nnn/1:ooo/1:uuu/2:sss/=:gg/1:ii/2:aa/2:dd/2:ee
The line of code that is causing this is:
strings = strings.OrderByDescending(x => x.Length).ThenBy(c => c).ToArray();
I've tried different ways of approaching this problem, such as splitting the strings into separate arrays by length, sorting each one lexicographically, then appending them back into the result. But across the many test cases, one would pass and another would fail.
My issue is finding out why C# sees "=" as LESS THAN digits, when really it's greater on the ASCII chart. I ran a test and that is what String.Compare gave me. In Python, it gave me something different.
Here is my complete code, feel free to point out any bugs. I've only been programming for 9 months. I am aware it isn't the best solution.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace Rextester
{
public class Program
{
public static void Main(string[] args)
{
string s1 = "looping is fun but dangerous";
string s2 = "less dangerous than coding";
// Expected
Console.WriteLine("1:ooo/1:uuu/2:sss/=:nnn/1:ii/2:aa/2:dd/2:ee/=:gg\n");
// Result
Console.WriteLine(StringsMix(s1, s2));
}
public static string StringsMix(string s1, string s2)
{
StringBuilder sb = new StringBuilder();
// Convert string to char arrays order by ascending
char[] s1Chars = s1.Where(x => char.IsLower(x)).OrderBy(x => x).ToArray();
char[] s2Chars = s2.Where(x => char.IsLower(x)).OrderBy(x => x).ToArray();
// Intersect arrays to find characters that appear in both
char[] inter = s1Chars.Intersect(s2Chars).ToArray();
for (int i = 0; i < inter.Length; i++){
// For each character, put all occurrences in their respective array
// Get count
char[] s1Ch = s1.Where(x => x.Equals(inter[i])).ToArray();
char[] s2Ch = s2.Where(x => x.Equals(inter[i])).ToArray();
int s1Count = s1Ch.Length;
int s2Count = s2Ch.Length;
if (s1Count > s2Count)
{
string chars = new String(s1Ch);
sb.Append("1:" + chars + "/");
}
else if (s2Count > s1Count)
{
string chars = new String(s2Ch);
sb.Append("2:" + chars + "/");
}
else if (s1Count == s2Count)
{
string chars = new String(s1Ch);
sb.Append("=:" + chars + "/");
}
}
string final = String.Empty;
string[] strings = sb.ToString().Split('/');
strings = strings.OrderByDescending(x => x.Length).ThenBy(c => c).ToArray(); // "Lexicographical ordering"
final = String.Join("/", strings);
strings = final.Split('/').Where(x => x.Length > 3).Select(x => x).ToArray(); // Remove trailing single characters
final = String.Join("/", strings);
return final;
}
}
}

This happens because '=' sorts before '1' and '2'; you want it to sort after the digits.
You can force this order by adding a special condition in the middle:
var specialOrder = "12=";
var ordered = data
.OrderByDescending(s => s.Length)
.ThenBy(s => specialOrder.IndexOf(s[0])) // <<== Add this
.ThenBy(s => s);
This will ensure that the initial character sorts according to the order of characters in specialOrder string, i.e. '1', then '2', then '='.
Demo.
Note: The code assumes that the sequence has no empty strings, because s[0] throws on an empty string. Your code removes the short entries (including the empty string left by the trailing "/") only after sorting, so apply that filter before this ordering.
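As for why '=' lands before the digits in the first place: OrderBy and ThenBy compare strings with the default comparer, which is culture-sensitive and typically places symbols ahead of digits instead of going by character code. A minimal sketch, assuming that plain character-code (ordinal) order is what the challenge expects, passes StringComparer.Ordinal to ThenBy. The class and method names below are made up for illustration.

```csharp
using System;
using System.Linq;

public static class OrdinalSortDemo
{
    // Longest first; ties broken by ordinal (character-code) order,
    // where '1' (0x31) < '2' (0x32) < '=' (0x3D).
    public static string[] SortParts(string[] parts)
    {
        return parts
            .OrderByDescending(s => s.Length)
            .ThenBy(s => s, StringComparer.Ordinal)
            .ToArray();
    }

    public static void Main()
    {
        string[] parts =
        {
            "=:nnn", "1:ooo", "1:uuu", "2:sss", "=:gg", "1:ii", "2:aa", "2:dd", "2:ee"
        };
        // Prints 1:ooo/1:uuu/2:sss/=:nnn/1:ii/2:aa/2:dd/2:ee/=:gg
        Console.WriteLine(string.Join("/", SortParts(parts)));
    }
}
```

Unlike the specialOrder lookup, this does not index into the string, so empty entries do not throw; they simply sort last because of the length ordering.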

Related

Trying to filter only digits in string array using LINQ

I'm trying to filter only digits in string array. This works if I have this array:
12324 asddd 123 123, but if a single string contains both chars and digits, e.g. asd1234, it is not picked up.
Can you help me with how to do it?
int[] result = input
.Where(x => x.All(char.IsDigit))// tried with .Any(), .TakeWhile() and .SkipWhile()
.Select(int.Parse)
.Where(x => x % 2 == 0)
.ToArray();
Something like this should work. The function digitString will select only digits from the input string, and recombine into a new string. The rest is simple, just predicates selecting non-empty strings and even numbers.
var values = new[]
{
"helloworld",
"hello2",
"4",
"hello123world123"
};
bool isEven(int i) => i % 2 == 0;
bool notEmpty(string s) => s.Length > 0;
string digitString(string s) => new string(s.Where(char.IsDigit).ToArray());
var valuesFiltered = values
.Select(digitString)
.Where(notEmpty)
.Select(int.Parse)
.Where(isEven)
.ToArray();
You need to do it in 2 steps: First filter out all the invalid strings, then filter out all the non-digits in the valid strings.
A helper Method would be very readable here, but it is also possible with pure LINQ:
var input = new[]{ "123d", "12e", "pp", "33z3"};
input
.Where(x => x.Any(char.IsDigit))
.Select(str => string.Concat(str.Where(char.IsDigit)));
Possible null values should be dropped to avoid a NullReferenceException.
string.Join() is suitable for concatenating the filtered digits.
Additionally, empty strings should be dropped because they cannot be converted to an integer.
string[] input = new string[] { "1234", "asd124", "2345", "2346", null, "", "asdfas", "2" };
int[] result = input
.Where(s => s != null)
.Select(s => string.Join("", s.Where(char.IsDigit)))
.Where(s => s != string.Empty)
.Select(int.Parse)
.Where(x => x % 2 == 0)
.ToArray();
Using the LINQ Aggregate method together with TryParse() also gives the expected result:
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
var input = new string[] { "aaa123aaa124", "aa", "778", "a777", null };
var result = input.Aggregate(
new List<int>(),
(x, y) =>
{
if (y is null)
return x;
var digitOnlyString = Regex.Replace(y, "[^0-9]", string.Empty);
if (int.TryParse(digitOnlyString, out var temp) && temp % 2 == 0)
x.Add(temp);
return x;
})
.ToArray();
Max,
You can do this in a single expression like so:
using System.Linq;
using System.Text.RegularExpressions;
var input = new[] { "aaa123aaa124", "aa", "778", "a777", null };
var rx = new Regex(@"[0-9]+");
var numbersOnly = input.Where(s => !string.IsNullOrEmpty(s) && rx.IsMatch(s))
.Select(s => string.Join("", rx.Matches(s).Cast<Match>().Select(m => m.Value)));
foreach (var number in numbersOnly) Console.WriteLine(number);
Which returns:
123124
778
777
if I have chars and digits in one string e.g. asd1234, it does not take it
Apparently you want to parse this line also. You want to translate "asd1234" to "1234" and then parse it.
But what if your input sequence of strings contains a string with two numbers: "123asd456". Do you want to interpret this as "123", or maybe as "123456", or maybe you consider this as two numbers "123" and "456".
Let's assume you don't have this problem: every string contains at most one number, or if a string contains more than one number, you only want the first number.
In fact, you only want to keep those strings that are "zero or more non-digits, followed by one or more digits, followed by zero or more characters".
Enter Regular Expressions!
const string regexTxt = @"\D*(\d+).*";
Regex regex = new Regex(regexTxt);
\D: any non-digit
*: zero or more
\d: any digit
+: one or more
.: any character
(...) capture the parts between the parentheses
So this regular expression matches any string that starts with zero or more non-digits, followed by at least one digit, followed by zero or more characters. Capture the "at least one digit" part.
If you try to Match() an input string with this regular expression, you get a Match object. Its Success property tells you whether the input string matches the regular expression.
The Match object has a property Groups which contains all captured groups. Groups[0] is the complete match; Groups[1] is a Group whose Value property holds the first captured string.
A simple program that shows how to use the regular expression:
const string regexTxt = @"\D*(\d+).*";
Regex regex = new Regex(regexTxt);
var lines = new string[]
{
String.Empty,
"A",
"A lot of text, no numbers",
"1",
"123456",
"Some text and then a number 123",
"Several characters, then a number: 123 followed by another number 456!",
"___123---456...",
};
foreach (var line in lines)
{
Match match = regex.Match(line);
if (match.Success)
{
string capturedDigits = match.Groups[1].Value;
int capturedNumber = Int32.Parse(capturedDigits);
Console.WriteLine("{0} => {1}", line, capturedNumber);
}
}
Or in a LINQ statement:
const string regexTxt = @"\D*(\d+).*";
Regex regex = new Regex(regexTxt);
IEnumerable<string> sourceLines = ...
var numbers = sourceLines
.Select(line => regex.Match(line)) // Match the Regex
.Where(match => match.Success) // Keep only the Matches that match
.Select(match => Int32.Parse(match.Groups[1].Value)); // Finally parse the captured text to int
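The three possible readings of a string such as "123asd456" mentioned earlier in this answer can be compared side by side. This is only a sketch: the class and method names are hypothetical, and it assumes plain ASCII digits, so it uses [0-9] instead of \d.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class DigitInterpretations
{
    // Reading 1: only the first run of digits ("" when there is none).
    public static string FirstRun(string s)
    {
        return Regex.Match(s, "[0-9]+").Value;
    }

    // Reading 2: all digits glued together into one string.
    public static string AllDigits(string s)
    {
        return new string(s.Where(c => c >= '0' && c <= '9').ToArray());
    }

    // Reading 3: every run of digits parsed as its own number.
    public static int[] EachRun(string s)
    {
        return Regex.Matches(s, "[0-9]+")
                    .Cast<Match>()
                    .Select(m => int.Parse(m.Value))
                    .ToArray();
    }

    public static void Main()
    {
        const string input = "123asd456";
        Console.WriteLine(FirstRun(input));                  // 123
        Console.WriteLine(AllDigits(input));                 // 123456
        Console.WriteLine(string.Join(",", EachRun(input))); // 123,456
    }
}
```

Which reading is right depends on the data; the regular expression in this answer corresponds to the first one.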

How to split a string of numbers on the white space character and convert to integers

I'm working on some homework and I need to get an input from the user which is a single line of numbers separated by spaces. I want to split this string and get the individual numbers out so that I can insert them into a Binary Search Tree.
I tried the split function and was able to rid of the white space but I'm not sure how to "collect" the individual numbers.
string data;
string[] newdata = { };
Console.WriteLine("Please enter a list of integers with spaces between each number.\n");
data = Console.ReadLine();
newdata = data.Split(null);
Console.WriteLine(String.Join(Environment.NewLine, newdata));
I want to somehow collect the elements from newdata string array and convert them into integers but I'm having a tough time figuring out how to do that.
Well, you could use Linq .Select method combined with .Split method:
List<int> newData = data.Split(' ').Select(int.Parse).ToList();
If you want the user to be able to enter extra spaces, the empty strings that the split produces need to be removed. For that we can use another overload of the string.Split method that accepts StringSplitOptions:
List<int> newData = data.Split(new[] {' '}, StringSplitOptions.RemoveEmptyEntries).Select(int.Parse).ToList();
Finally, if you want to let the user enter invalid data at times and still get a collection of valid ints, you could use int.TryParse and filter out the values that failed to parse:
List<int> newData = data.Split(new[] {' '}, StringSplitOptions.RemoveEmptyEntries)
.Select(s => int.TryParse(s.Trim(), out var n) ? (int?)n : null)
.Where(x => x != null)
.Select(i => i.Value)
.ToList();
Some smart LINQ answers are already provided, here is my extended step by step solution which also allows to ignore invalid numbers:
//Read input string
Console.Write("Input numbers separated by space: ");
string inputString = Console.ReadLine();
//Split by spaces
string[] splittedInput = inputString.Split(' ');
//Create a list to store all valid numbers
List<int> validNumbers = new List<int>();
//Iterate all splitted parts
foreach (string input in splittedInput)
{
//Try to parse the splitted part
if (int.TryParse(input, out int number) == true)
{
//Add the valid number
validNumbers.Add(number);
}
}
//Print all valid numbers
Console.WriteLine(string.Join(", ", validNumbers));
OK as code:
var words = data.Split();
int i;
List<int> integers = new List<int>();
foreach(var s in words)
{
if (int.TryParse(s, out i)) {integers.Add(i);}
}
// now you have a list of integers
// if using decimal, use decimal instead of integer
You can do as follows.
var numbers = Console.ReadLine();
var listOfNumbers = numbers.Split(new[]{" "},StringSplitOptions.RemoveEmptyEntries)
.Select(x=> Int32.Parse(x));
The above lines split the user input on the space character, removing any empty entries in between, and then convert the string numbers to integers.
The StringSplitOptions.RemoveEmptyEntries option ensures that empty entries are removed. An empty entry arises when two delimiters occur next to each other. For example, in "2  3 4 5" there are two spaces between 2 and 3, which means that splitting the string with a space as the delimiter leaves an empty element in the array. StringSplitOptions.RemoveEmptyEntries eliminates it.
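The difference can be checked by splitting the same input with and without the option. A small sketch, with a hypothetical helper name (CountParts) and an input that has two spaces between 2 and 3:

```csharp
using System;

public static class SplitOptionsDemo
{
    // Number of elements produced by splitting on a single space.
    public static int CountParts(string input, StringSplitOptions options)
    {
        return input.Split(new[] { ' ' }, options).Length;
    }

    public static void Main()
    {
        string input = "2  3 4 5"; // note the double space after 2

        // "2", "", "3", "4", "5": the empty entry is kept
        Console.WriteLine(CountParts(input, StringSplitOptions.None)); // 5

        // "2", "3", "4", "5": the empty entry is dropped
        Console.WriteLine(CountParts(input, StringSplitOptions.RemoveEmptyEntries)); // 4
    }
}
```

Without the option, the empty element would reach Int32.Parse and cause a FormatException.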
Depending on whether you are expecting Integers or Decimals, you can use Int32.Parse or Double.Parse (or float/decimal etc)
Furthermore, you can include checks to ensure you have a valid number, otherwise throw an exception. You can alter the query as follows.
var listOfNumbers = numbers.Split(new[]{" "},StringSplitOptions.RemoveEmptyEntries)
.Select(x=>
{
Console.WriteLine(x);
if(Int32.TryParse(x,out var number))
return number;
else
throw new Exception("Element is not a number");
});
This ensures that all the elements in the list are valid numbers; otherwise an exception is thrown.
Keep the spaces and do the split on the space character: data.Split(' ');

split strings into many strings by newline?

I have incoming data that needs to be split into multiple values, e.g.
2345\n564532\n345634\n234 234543\n1324 2435\n
The length is inconsistent when I receive it, the spacing is inconsistent when it is present, and I want to analyze the last 3 digits before each \n. How do I break off the string and turn it into a new string? As I said, this round it may have 3 \n characters, next time it may have 10. How do I create 3 new strings, analyze them, then destroy them before the next 10 come in?
string[] result = x.Split('\r');
result = x.Split(splitAtReturn, StringSplitOptions.None);
string stringToAnalyze = null;
foreach (string s in result)
{
if (s != "\r")
{
stringToAnalyze += s;
}
else
{
// how do I analyze the characters here?
}
}
You could use the string.Split method. In particular, I suggest the overload that takes a string array of possible separators, because splitting on the newline character poses a particular problem. In your example all the newlines are simply '\n', but on some operating systems the newline is '\r\n'. If you can't rule out having both in the same file, then use:
string test = "2345\n564532\n345634\n234 234543\n1324 2435\n";
string[] result = test.Split(new string[] {"\n", "\r\n"}, StringSplitOptions.RemoveEmptyEntries);
If instead you are certain that the file contains only the newline separator used by your OS, then you could use:
string test = "2345\n564532\n345634\n234 234543\n1324 2435\n";
string[] result = test.Split(new string[] {Environment.NewLine}, StringSplitOptions.RemoveEmptyEntries);
StringSplitOptions.RemoveEmptyEntries removes the empty strings that a pair of consecutive newlines or a trailing newline would otherwise produce.
Now you can work on the array examining the last 3 digits of every string
foreach(string s in result)
{
// Take at most 3 chars: requesting more than the string
// contains would make Substring throw
int maxLen = Math.Min(s.Length, 3);
string lastThree = s.Substring(s.Length - maxLen, maxLen);
// ... work on last 3 digits
}
Instead, if you want to work only using the index of the newline character without splitting the original string, you could use string.IndexOf in this way
string test = "2345\n564532\n345634\n234 234543\n1324 2435\n";
int pos = -1;
while((pos = test.IndexOf('\n', pos + 1)) != -1)
{
if(pos >= 3) // Substring(pos - 3, 3) throws when fewer than 3 chars precede the newline
{
string last3part = test.Substring(pos - 3, 3);
Console.WriteLine(last3part);
}
}
string lines = "2345\n564532\n345634\n234 234543\n1324 2435\n";
var last3Digits = lines.Split("\r\n".ToCharArray(), StringSplitOptions.RemoveEmptyEntries)
.Select(line => line.Substring(line.Length - 3))
.ToList();
foreach(var my3digitnum in last3Digits)
{
}
last3Digits : [345, 532, 634, 543, 435]
This has been answered before, check this thread:
Easiest way to split a string on newlines in .NET?
An alternative way is using StringReader:
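using (System.IO.StringReader reader = new System.IO.StringReader(input)) {
string line;
// ReadLine returns null once the input is exhausted
while ((line = reader.ReadLine()) != null) {
// analyze line here
}
}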
Your answer is theStringYouGot.Split('\n'); which gives you an array of strings to process.

Get multiple numbers from a string

I have strings like
AS_!SD 2453iur ks#d9304-52kasd
I need to get the first two numbers in the string.
In this case they would be:
2453 and 9304
I don't have any delimiter in the string to try a split, and the length of the numbers and string is variable, I'm working in C# framework 4.0 in a WPF.
thanks for the help, and sorry for my bad english
This solution will take two first numbers, each can have any number of digits
string s = "AS_!SD 2453iur ks#d9304-52kasd";
MatchCollection matches = Regex.Matches(s, @"\d+");
string[] result = matches.Cast<Match>()
.Take(2)
.Select(match => match.Value)
.ToArray();
Console.WriteLine( string.Join(Environment.NewLine, result) );
will print
2453
9304
you can parse them to int[] by result.Select(int.Parse).ToArray();
You can loop over the chars of your string and try to parse each one: if you get an exception it is a letter, otherwise it is a digit. You need a list to collect the two numbers and a counter to stop after two.
Pseudocode follows:
for char in string:
if counter == 2:
stop loop
if parse gets exception
continue
else
loop again in samestring stating this point
if parse gets exception
stop loop
else add char to list
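The pseudocode above can be turned into runnable C# roughly as follows. This is a sketch under a few assumptions: it tests each char against the range '0' to '9' instead of relying on parse exceptions, it collects the numbers as strings, and the names FirstNumbers and Take are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class FirstNumbers
{
    // Returns up to `max` runs of consecutive digits, in order of appearance.
    public static List<string> Take(string s, int max)
    {
        var found = new List<string>();
        var current = new StringBuilder();

        foreach (char c in s)
        {
            if (c >= '0' && c <= '9')
            {
                current.Append(c); // still inside a number
            }
            else if (current.Length > 0)
            {
                // A non-digit ends the current number.
                found.Add(current.ToString());
                current.Clear();
                if (found.Count == max) return found;
            }
        }

        // The string may end while a number is still being collected.
        if (current.Length > 0 && found.Count < max) found.Add(current.ToString());
        return found;
    }

    public static void Main()
    {
        List<string> result = Take("AS_!SD 2453iur ks#d9304-52kasd", 2);
        Console.WriteLine(string.Join(Environment.NewLine, result)); // 2453 then 9304
    }
}
```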
Alternatively you can use the ASCII encoding:
string value = "AS_!SD 2453iur ks#d9304-52kasd";
byte zero = 48; // 0
byte nine = 57; // 9
byte[] asciiBytes = Encoding.ASCII.GetBytes(value);
byte[] asciiNumbers = asciiBytes.Where(b => b >= zero && b <= nine)
.ToArray();
char[] numbers = Encoding.ASCII.GetChars(asciiNumbers);
// OR
string numbersString = Encoding.ASCII.GetString(asciiNumbers);
//First two digits from the char array
//(Convert.ToInt32(char) would return the character code, e.g. 50 for '2')
int aNum = numbers[0] - '0';
int bNum = numbers[1] - '0';
//First two digits from the string
string aString = numbersString.Substring(0,2);

How to find the number of occurrences of a letter in only the first sentence of a string?

I want to count the occurrences of the letter "a" in the first sentence only. The code below counts "a" in all sentences, but I want only the first sentence.
static void Main(string[] args)
{
string text; int k = 0;
text = "bla bla bla. something second. maybe last sentence.";
foreach (char a in text)
{
char b = 'a';
if (b == a)
{
k += 1;
}
}
Console.WriteLine("number of a in first sentence is " + k);
Console.ReadKey();
}
This splits the string into an array on the separators '.', '!' and '?', then counts the 'a' chars in the first element of the array (the first sentence).
var count = text.Split(new[] { '.', '!', '?' })[0].Count(c => c == 'a');
This example assumes a sentence is separated by a ., ? or !. If you have a decimal number in your string (e.g. 123.456), that will count as a sentence break. Breaking up a string into accurate sentences is a fairly complex exercise.
This is perhaps more verbose than what you were looking for, but hopefully it'll breed understanding as you read through it.
public static void Main()
{
//Make an array of the possible sentence enders. Doing this pattern lets us easily update
// the code later if it becomes necessary, or allows us easily to move this to an input
// parameter
string[] SentenceEnders = new string[] {"$", @"\.", @"\?", @"\!" /* Add Any Others */};
string WhatToFind = "a"; //What are we looking for? Regular Expressions Will Work Too!!!
string SentenceToCheck = "This, but not to exclude any others, is a sample."; //First example
string MultipleSentencesToCheck = @"
Is this a sentence
that breaks up
among multiple lines?
Yes!
It also has
more than one
sentence.
"; //Second Example
//This will split the input on all the enders put together(by way of joining them in [] inside a regular
// expression.
string[] SplitSentences = Regex.Split(SentenceToCheck, "[" + String.Join("", SentenceEnders) + "]", RegexOptions.IgnoreCase);
//SplitSentences is an array, with sentences on each index. The first index is the first sentence
string FirstSentence = SplitSentences[0];
//Now, split that single sentence on our matching pattern for what we should be counting
string[] SubSplitSentence = Regex.Split(FirstSentence, WhatToFind, RegexOptions.IgnoreCase);
//Now that it's split, it's split a number of times that matches how many matches we found, plus one
// (the "left over" is the +1)
int HowMany = SubSplitSentence.Length - 1;
System.Console.WriteLine(string.Format("We found, in the first sentence, {0} '{1}'.", HowMany, WhatToFind));
//Do all this again for the second example. Note that ideally, this would be in a separate function
// and you wouldn't be writing code twice, but I wanted you to see it without all the comments so you can
// compare and contrast
SplitSentences = Regex.Split(MultipleSentencesToCheck, "[" + String.Join("", SentenceEnders) + "]", RegexOptions.IgnoreCase | RegexOptions.Singleline);
SubSplitSentence = Regex.Split(SplitSentences[0], WhatToFind, RegexOptions.IgnoreCase | RegexOptions.Singleline);
HowMany = SubSplitSentence.Length - 1;
System.Console.WriteLine(string.Format("We found, in the second sentence, {0} '{1}'.", HowMany, WhatToFind));
}
Here is the output:
We found, in the first sentence, 3 'a'.
We found, in the second sentence, 4 'a'.
You didn't define "sentence", but if we assume it's always terminated by a period (.), just add this inside the loop:
if (a == '.') {
break;
}
Expand from this to support other sentence delimiters.
Simply "break" the foreach(...) loop when you encounter a "." (period)
Well, assuming you define a sentence as being ended with a '.':
Use String.IndexOf() to find the position of the first '.'. After that, search in a Substring instead of the entire string.
Find the position of the '.' in the text (you can use Split).
Count the 'a' chars in the text from position 0 to the position of the '.'.
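The IndexOf approach described in the two answers above can be sketched like this, assuming a sentence always ends at the first '.'. The class and method names are hypothetical.

```csharp
using System;
using System.Linq;

public static class FirstSentenceCounter
{
    // Counts `letter` in the text up to (not including) the first '.'.
    // If there is no '.', the whole text is treated as one sentence.
    public static int CountInFirstSentence(string text, char letter)
    {
        int end = text.IndexOf('.');
        string firstSentence = end >= 0 ? text.Substring(0, end) : text;
        return firstSentence.Count(c => c == letter);
    }

    public static void Main()
    {
        string text = "bla bla bla. something second. maybe last sentence.";
        Console.WriteLine("number of a in first sentence is "
            + CountInFirstSentence(text, 'a')); // 3
    }
}
```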
string SentenceToCheck = "Hi, I can wonder this situation where I can do best";
//Here I am giving several ways to find this
//Using Regular Expression
int HowMany = Regex.Split(SentenceToCheck, "a", RegexOptions.IgnoreCase).Length - 1;
int i = Regex.Matches(SentenceToCheck, "a").Count;
// Simple way
int Count = SentenceToCheck.Length - SentenceToCheck.Replace("a", "").Length;
//Linq
var _lamdaCount = SentenceToCheck.ToCharArray().Where(t => t.ToString() != string.Empty)
.Count(t => t.ToString().ToUpper().Equals("A"));
var _linqAIEnumareable = from _char in SentenceToCheck.ToCharArray()
where !String.IsNullOrEmpty(_char.ToString())
&& _char.ToString().ToUpper().Equals("A")
select _char;
int a = _linqAIEnumareable.Count();
var _linqCount = from g in SentenceToCheck.ToCharArray()
where g.ToString().Equals("a")
select g;
int b = _linqCount.Count();
