Count the number of contiguous equal characters - c#

I need to validate Serial numbers and one of the rules is that there are up to 5 contiguous equal characters allowed.
Example valid:
012W212222123 // 4x the digit 2 contiguous
Example invalid:
012W764444443 // 6x the digit 4
So I tried to get the maximum number of contiguous characters without success
int maxCount = "012W764444443".GroupBy(x => x).Max().Count();

I suggest using a regex that returns true when the input contains no run of 5 or more identical consecutive digits:
Regex.IsMatch(input, @"^(?!.*([0-9])\1{4})")
If any characters are meant:
Regex.IsMatch(input, @"^(?!.*(.)\1{4})")
See the regex demo
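The regex finds a match in a string that contains fewer than 5 identical consecutive digits (version with [0-9]) or fewer than 5 identical consecutive characters of any kind other than a newline (version with .). Since the question allows runs of up to 5, change {4} to {5} to reject only runs of 6 or more.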
Details:
^ - start of string
(?!.*(.)\1{4}) - a negative lookahead that fails the match if its pattern matches:
.* - any 0+ chars other than a newline
(.) - Group 1 capturing any char but a newline
\1{4} - exactly 4 consecutive occurrences of the same value stored inside Group 1 (where \1 is a backreference and the {4} is a range/bound/limiting quantifier).
C#:
var strs = new List<string> { "012W212222123", "012W764444443"};
foreach (var s in strs)
Console.WriteLine("{0}: {1}", s, Regex.IsMatch(s, @"^(?!.*(.)\1{4})"));
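If the allowed run length may change, the same lookahead can be built from a parameter. This is only a sketch: the class and method names (SerialRules, HasNoRunLongerThan) are made up for illustration, and it assumes the input contains no newlines, since . does not match them.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SerialRules
{
    // Hypothetical helper: true when no character repeats more than maxRun times in a row.
    public static bool HasNoRunLongerThan(string input, int maxRun)
    {
        if (input == null) throw new ArgumentNullException(nameof(input));
        if (maxRun < 1) throw new ArgumentOutOfRangeException(nameof(maxRun));
        // (.) captures one character and \1{maxRun} demands maxRun more copies of it,
        // so the negative lookahead rejects any run of maxRun + 1 or more.
        string pattern = @"^(?!.*(.)\1{" + maxRun + "})";
        return Regex.IsMatch(input, pattern);
    }
}
```

With maxRun = 5, "012W212222123" should pass and "012W764444443" (six 4s) should be rejected.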

Yet another option is to use this function:
public static int MaxNumberOfConsecutiveCharacters(string s)
{
if (s == null) throw new ArgumentNullException(nameof(s));
if (s.Length == 0) return 0;
int maxCount = 1;
int count = 1;
for (int i = 1; i < s.Length; i++)
{
if (s[i] == s[i-1])
{
count++;
if (count > maxCount) maxCount = count;
}
else
{
count = 1;
}
}
return maxCount;
}
Obviously, this is a lot more code than a regular expression. Depending on your knowledge of regular expressions, this may or may not be more readable to you. Also, this is probably more efficient than using a regular expression, which may or may not be important to you.

It's a little inefficient, but this works:
var max =
"012W212222123"
.Aggregate(
new { Char = ' ', Count = 0, Max = 0 },
(a, c) =>
a.Char == c
? new { Char = c, Count = a.Count + 1, Max = a.Max > a.Count + 1 ? a.Max : a.Count + 1 }
: new { Char = c, Count = 1, Max = a.Max > 1 ? a.Max : 1 })
.Max;
I tried with both inputs and got the right number of maximum repeats each time.

Related

regex to match comma delimited balanced square brackets having recursion

I'd like a regex to match comma-delimited, balanced square brackets where the contents of the square brackets might be comma-delimited balanced square brackets themselves.
Here are some examples:
example 1
input = "[abc],[def]"
groups
group 1 = "abc"
group 2 = "def"
example 2
input = "[[ghi],[jkl]],[mno[pqr]],[[stu]]"
groups
group 1 = "[ghi],[jkl]"
group 2 = "mno[pqr]"
group 3 = "[stu]"
So note that in the second example, "ghi" and "jkl" are not their own groups. I don't need to recurse all the way down, I just need a regex to find the "level 0" groups.
Here's some code that can get you started on how to parse those values out.
public static IEnumerable<string> SplitSquareBraketByComma(string input)
{
int start = 0;
int brakets = 0;
for(int i = 0; i < input.Length; i++)
{
if(input[i] == '[')
{
brakets++;
continue;
}
if(input[i] == ']')
{
brakets--;
continue;
}
if(brakets == 0 && input[i] == ',')
{
yield return input.Substring(start, i - start);
start = i + 1;
}
}
if(start < input.Length)
{
yield return input.Substring(start);
}
}
It basically keeps count of the brackets and when it sees a comma when the number of open brackets is zero it splits out a new string from the previous split.
Note: this does not have any code to check that the input is valid (that all brackets are balanced), and thus it leaves the outermost brackets in place just in case.
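If you also want the validation and the outer brackets stripped, something along these lines might work. It is a sketch, not a tested parser: the names BracketGroups and Level0Groups are invented here, and it assumes every level-0 piece is meant to be a single [ ... ] group, throwing FormatException otherwise.

```csharp
using System;
using System.Collections.Generic;

public static class BracketGroups
{
    // Hypothetical helper: splits on level-0 commas, checks bracket balance,
    // and strips one pair of outer brackets from each piece.
    public static List<string> Level0Groups(string input)
    {
        var groups = new List<string>();
        int depth = 0;   // number of currently open brackets
        int start = 0;   // start index of the current level-0 piece
        for (int i = 0; i <= input.Length; i++)
        {
            bool atEnd = i == input.Length;
            if (!atEnd && input[i] == '[') { depth++; continue; }
            if (!atEnd && input[i] == ']')
            {
                if (--depth < 0) throw new FormatException("Unmatched ']' at index " + i);
                continue;
            }
            if (atEnd || (depth == 0 && input[i] == ','))
            {
                if (atEnd && depth != 0) throw new FormatException("Unclosed '['");
                string piece = input.Substring(start, i - start);
                if (piece.Length < 2 || piece[0] != '[' || piece[piece.Length - 1] != ']')
                    throw new FormatException("Expected a bracketed group: " + piece);
                groups.Add(piece.Substring(1, piece.Length - 2));
                start = i + 1;
            }
        }
        return groups;
    }
}
```

For the second example from the question this should yield "[ghi],[jkl]", "mno[pqr]" and "[stu]".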

Shortening a string of numbers

I have the following sequence of numbers:
As you can see, there are a lot of numbers. I want to shorten that string: if the string contains more than 20 numbers, it should display the first 18 numbers, then "..." and then the last two of the sequence.
I could probably do that by adding those numbers in a List<int> or HashSet<int> (HashSet might be faster in this case), but I think it will be slow.
StringBuilder temp = new StringBuilder();
for (...)
{
temp.Append($"{number} ");
}
var sequence = temp.ToString();
Example of what I want:
7 9 12 16 18 21 25 27 30 34 36 39 43 45 48 52 54 57 ... 952 954
Note that I want only fast ways.
This version is about 8 times faster than the other answers and allocates only about 6% as much memory. I think you'll be hard-pressed to find a faster version:
static string Truncated(string input)
{
var indexOfEighteenthSpace = IndexOfCharSeekFromStart(input, ' ', 18);
if (indexOfEighteenthSpace <= 0) return input;
var indexOfSecondLastSpace = IndexOfCharSeekFromEnd(input, ' ', 2);
if (indexOfSecondLastSpace <= 0) return input;
if (indexOfSecondLastSpace <= indexOfEighteenthSpace) return input;
var leadingSegment = input.AsSpan().Slice(0, indexOfEighteenthSpace);
var trailingSegment = input.AsSpan().Slice(indexOfSecondLastSpace + 1);
return string.Concat(leadingSegment, " ... ", trailingSegment);
static int IndexOfCharSeekFromStart(string input, char value, int count)
{
var startIndex = 0;
for (var i = 0; i < count; i++)
{
startIndex = input.IndexOf(value, startIndex + 1);
if (startIndex <= 0) return startIndex;
}
return startIndex;
}
static int IndexOfCharSeekFromEnd(string input, char value, int count)
{
var endIndex = input.Length - 1;
for (var i = 0; i < count; i++)
{
endIndex = input.LastIndexOf(value, endIndex - 1);
if (endIndex <= 0) return endIndex;
}
return endIndex;
}
}
Small individual steps
How do I make a list from this sequence (string)?
var myList = myOriginalSequence.Split(' ').ToList();
How do you take the first 18 numbers from a list?
var first18Numbers = myList.Take(18);
How do you take the last 2 numbers from a list?
var last2Numbers = myList.Skip(myList.Count() - 2);
How do you ensure that this is only done when there are more than 20 numbers in the list?
if(myList.Count() > 20)
How do you make a new sequence string from a list?
var myNewSequence = String.Join(" ", myList);
Putting it all together
var myList = myOriginalSequence.Split(' ').ToList();
string myNewSequence;
if(myList.Count() > 20)
{
var first18Numbers = myList.Take(18);
var first18NumbersString = String.Join(" ", first18Numbers);
var last2Numbers = myList.Skip(myList.Count() - 2);
var last2NumbersString = String.Join(" ", last2Numbers);
myNewSequence = $"{first18NumbersString} ... {last2NumbersString}";
}
else
{
myNewSequence = myOriginalSequence;
}
Console.WriteLine(myNewSequence);
Try this:
public string Shorten(string str, int startCount, int endCount)
{
//first remove any leading or trailing whitespace
str = str.Trim();
//find the first startCount numbers by using IndexOf space
//i.e. this counts the number of spaces from the start until startCount is achieved
int spaceCount = 1;
int startInd = str.IndexOf(' ');
while (spaceCount < startCount && startInd > -1)
{
startInd = str.IndexOf(' ',startInd +1);
spaceCount++;
}
//find the last endCount numbers by using LastIndexOf space
//i.e. this counts the number of spaces from the end until endCount is achieved
int lastSpaceCount = 1;
int lastInd = str.LastIndexOf(' ');
while (lastSpaceCount < endCount && lastInd > -1)
{
lastInd = str.LastIndexOf(' ', lastInd - 1);
lastSpaceCount++;
}
//if startInd or lastInd is -1, or if lastInd <= startInd, just return the str
//as it's not long enough and so doesn't need shortening
if (startInd == -1 || lastInd == -1 || lastInd <= startInd) return str;
//otherwise return the required shortened string
return $"{str.Substring(0, startInd)} ... {str.Substring(lastInd + 1)}";
}
the output of this:
Console.WriteLine(Shorten("123 123 123 123 123 123 123 123 123 123 123",4,3));
is:
123 123 123 123 ... 123 123 123
I think this may help :
public IEnumerable<string> ShortenList(string input)
{
List<int> list = input.Split(" ").Select(x=>int.Parse(x)).ToList();
if (list.Count > 20)
{
List<string> trimmedStringList = list.Take(18).Select(x=>x.ToString()).ToList();
trimmedStringList.Add("...");
trimmedStringList.Add(list[list.Count-2].ToString());
trimmedStringList.Add(list[list.Count - 1].ToString());
return trimmedStringList;
}
return list.Select(x => x.ToString());
}
No idea what the speed on this would be like but as a wild suggestion, you said the numbers come in string format and it looks like they're separated by spaces. You could get the index of the 19th space (to display 18 numbers) using any of the methods found here, and substring from index 0 to that index and concatenate 3 dots. Something like this:
numberListString.Substring(0, IndexOfNth(numberListString, ' ', 19)) + "..."
(Not accurate code, adding or subtracting indexes or adjusting values (19) may be required).
EDIT: Just saw that after the dots you wanted to have the last 2 numbers, you can use the same technique! Just concatenate that result again.
NOTE: I used this whacky technique because the OP said they wanted fast ways, I'm just offering a potential option to benchmark :)
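For benchmarking, the helper could be sketched like this. IndexOfNth is not a framework method; the body below is an assumption about what such a helper would do, and Head is an extra illustrative wrapper. Note that in this sketch it is the 18th space (not the 19th) that ends the 18th number.

```csharp
using System;

public static class NthIndex
{
    // Hypothetical helper: index of the n-th occurrence of c (1-based),
    // or -1 if the string contains fewer than n occurrences.
    public static int IndexOfNth(string s, char c, int n)
    {
        if (s == null) throw new ArgumentNullException(nameof(s));
        if (n < 1) throw new ArgumentOutOfRangeException(nameof(n));
        int index = -1;
        for (int found = 0; found < n; found++)
        {
            index = s.IndexOf(c, index + 1); // continue searching after the previous hit
            if (index < 0) return -1;
        }
        return index;
    }

    // Keeps the first 'keep' space-separated numbers and appends " ...".
    public static string Head(string numbers, int keep)
    {
        int cut = IndexOfNth(numbers, ' ', keep);
        return cut < 0 ? numbers : numbers.Substring(0, cut) + " ...";
    }
}
```

NthIndex.Head(numberListString, 18) would then give the leading part; the trailing two numbers can be appended with a LastIndexOf-based counterpart.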
There is an alternative way that prevents iteration through the entire string of numbers and is reasonably fast.
Strings in .NET are basically an array of chars, and individual characters can be accessed with array-style indexing ([0..n-1]). This can be used to our advantage by simply testing for the correct number of spaces from the start and end respectively.
There are no niceties in the code, but they could be optimised later (for instance, by ensuring that there's actually something in the string, that the string is trimmed etc.).
The functions below could also be optimised to a single function if you're feeling energetic.
string finalNumbers = GetStartNumbers(myListOfNumbers, 18);
if(finalNumbers.EndsWith(" ... "))
finalNumbers += GetEndNumbers(myListOfNumbers, 2);
public string GetStartNumbers(string listOfNumbers, int collectThisManyNumbers)
{
int spaceCounter = 0; // The current count of spaces
int charPointer = 0; // The current character in the string
// Loop through the list of numbers until we either run out of characters
// or get to the appropriate 'space' position...
while(spaceCounter < collectThisManyNumbers && charPointer < listOfNumbers.Length)
{
// The following line will add 1 to spaceCounter if the character at the
// charPointer position is a space. The charPointer is then incremented...
spaceCounter += ( listOfNumbers[charPointer++]==' ' ? 1 : 0 );
}
// Now return our value based on the last value of charPointer. Note that
// if our string doesn't have the right number of elements, then the whole
// string is returned and it will not be suffixed with ' ... '
if(spaceCounter < collectThisManyNumbers)
return listOfNumbers;
else
return listOfNumbers.Substring(0, charPointer - 1) + " ... ";
}
public string GetEndNumbers(string listOfNumbers, int collectThisManyNumbers)
{
int spaceCounter = 0; // The current count of spaces
int charPointer = listOfNumbers.Length - 1; // The current character in the string (start at the last one)
// Loop through the list of numbers until we either run out of characters
// or get to the appropriate 'space' position...
while(spaceCounter < collectThisManyNumbers && charPointer >= 0)
{
// The following line will add 1 to spaceCounter if the character at the
// charPointer position is a space. The charPointer is then decremented...
spaceCounter += ( listOfNumbers[charPointer--]==' ' ? 1 : 0 );
}
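// Now return our value based on the last value of charPointer. After the loop it
// sits one position before the space that was counted last, so skip two characters.
// If we ran out of characters first, return the whole string.
if(spaceCounter < collectThisManyNumbers) return listOfNumbers;
return listOfNumbers.Substring(charPointer + 2);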
}
Some people find the use of ++ and -- objectionable but it's up to you. If you want to do the maths and logic, knock yourself out!
Please note that this code is quite long because it's commented to the far end of a fart.

Count consecutive numbers in a string C#

I have a string {101110111010001111}. I'm searching for the total number of all sequences of equal bits with an exact length of 3 bits. In the above string the answer will be 3 (please note that the last one, "1111", doesn't count as it has more than 3 equal bits).
Any suggestions how to do this?
If you don't want a simple solution, try this:
string s = "1101110111010001111";
var regex = new Regex(@"(.)\1+");
var matches = regex.Matches(s);
int count = matches.Cast<Match>().Where(x => x.Length == 3).Count();
Explanation:
The regex finds sets of 2 or more identical characters (not limited to 0's and 1's)
Then only sets of exactly 3 characters are counted
Is it that you need? Sometimes the simplest solutions are the best:
public static int Count(string txt)
{
// TODO validation etc
var result = 0;
char lstChr = txt[0];
int lastChrCnt = 1;
for (var i = 1; i < txt.Length; i++)
{
if (txt[i] == lstChr)
{
lastChrCnt += 1;
}
else
{
if (lastChrCnt == 3)
{
result += 1;
}
lstChr = txt[i];
lastChrCnt = 1;
}
}
return lastChrCnt == 3 ? result + 1 : result;
}
You can use a regular expression:
Regex.Matches(s, @"((?<=0)|^)111((?=0)|$)|((?<=1)|^)000((?=1)|$)");
Here's the same expression with comments:
Regex.Matches(s, @"
(
(?<=0) # is preceded by a 0
| # or
^ # is at start
)
111 # 3 1's
(
(?=0) # is followed by a 0
| # or
$ # is at end
)
| # - or -
(
(?<=1) # is preceded by a 1
| # or
^ # is at start
)
000 # 3 0's
(
(?=1) # followed by a 1
| # or
$ # is at end
)", RegexOptions.IgnorePatternWhitespace).Dump();
You can split the string by "111" to get an array. You can then simply count the elements that do not begin with "1" using LINQ.
See the sample:
using System;
using System.Linq;
namespace experiment
{
class Program
{
static void Main(string[] args)
{
string data = "{101110111010001111}";
string[] sequences = data.Split(new string[] {"111"}, StringSplitOptions.None);
int validCounts = sequences.Count(i => !i.StartsWith("1", StringComparison.Ordinal)); // StartsWith is safe for empty elements, unlike Substring(0, 1)
Console.WriteLine("Count of sequences: {0}", validCounts);
// Show the split array
foreach (string item in sequences) {
Console.WriteLine(item);
}
Console.ReadKey();
}
}
}
Yet another approach: Since you are only counting bits, you can split the string by "1" and "0" and then count all elements with length 3:
string inputstring = "101110111010001111";
string[] str1 = inputstring.Split('0');
string[] str2 = inputstring.Split('1');
int result = str1.Where(s => s.Length == 3).Count() + str2.Where(s => s.Length == 3).Count();
return result;
An algorithmic solution. Checks for any n consecutive characters. But this is not completely tested for all negative scenarios.
public static int GetConsecutiveCharacterCount(this string input, int n)
{
// Does not contain expected number of characters
if (input.Length < n || n < 1)
return 0;
return Enumerable.Range(0, input.Length - (n - 1)) // Last n-1 characters will be covered in the last but one iteration.
.Where(x => Enumerable.Range(x, n).All(y => input[x] == input[y]) && // Check whether n consecutive characters match
((x - 1) > -1 ? input[x] != input[x - 1] : true) && // Compare the previous character where applicable
((x + n) < input.Length ? input[x] != input[x + n] : true) // Compare the next character where applicable
)
.Count();
}

Using Substring to get digit before a string

I am reading a file in C#. I want to extract a value from each line. The lines look like this:
15 EMP_L_NAME HAPPENS 5 TIMES.
40 SUP HAPPENS 12 TIMES.
I want to find the number of times which is in the string before the string "TIMES". I have written the following code:
int arrayLength = 0;
int timesindex = line.IndexOf("TIMES");
if (timesindex > 0)
{
//Position of the digit "5" in the first line
int indexCount = timesindex - 2;
if (int.TryParse(line.Substring(indexCount, 1), out occursCount))
{
arrayLength = occursCount;
}
}
Using the above code, I can find the number before "TIMES" when it is a single digit. But if it has two digits, it won't work (e.g. the second line). I need logic that finds the number separated by a space from "TIMES". How can I do that?
You can do:
Split your string on spaces and remove empty entries.
Find Index of "TIMES."
Access element Index - 1
Like:
string str = "15 EMP_L_NAME HAPPENS 5 TIMES. ";
string[] array = str.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
int index = Array.IndexOf(array, "TIMES.");
int number;
if (!int.TryParse(array[index - 1], out number))
{
//invalid number
}
Console.WriteLine(number);
If the input is reliable you can do a quicky with String.Split()...
int arrayLength = 0;
int timesindex = line.IndexOf("TIMES");
if (timesindex > 0)
{
string[] items = line.Split(new char[] {' '}, StringSplitOptions.RemoveEmptyEntries);
if (int.TryParse(items[items.Length - 2], out occursCount))
{
arrayLength = occursCount;
}
}
This method relies on the desired number being the second from last "word" in each line
If your strings are always the same format, with exactly five words or "sections" or whatever you want to call them, you could use:
int times = 0;
Int32.TryParse(line.Split(' ')[3], out times);
This would have to be more robust if there's a chance the number may not exist in the string, or the string is in a completely different format.
Look at LastIndexOf combined with your timesindex. You can look for the space before the space before (timesindex-1), and then you have the two positions around the number.
int timesindex = line.IndexOf("TIMES");
int firstSpace = line.LastIndexOf(' ', timesindex - 2); // start before the space that directly precedes "TIMES"
string number = line.Substring(firstSpace, timesindex - firstSpace).Trim();
The indexes may still need adjusting for other line formats, but that's the idea anyway.
Try this
int timesindex = line.IndexOf("TIMES");
int happensindex = line.IndexOf("HAPPENS") + 7; //Add 7 because "HAPPENS" is 7 chars long
if (timesindex > 0)
{
//Take the text between "HAPPENS" and "TIMES"; the second argument of Substring is a length
if (int.TryParse(line.Substring(happensindex, timesindex - happensindex).Trim(), out occursCount))
{
arrayLength = occursCount;
}
}
A Regex would be cleaner:
var regex = new Regex(@"(\d+)\sTIMES"); // match a number followed by whitespace then "TIMES"
string num = regex.Match(" 15 EMP_L_NAME HAPPENS 5 TIMES").Groups[1].ToString();
int val = int.Parse(num);
Using LINQ:
string[] lines = {"15 EMP_L_NAME HAPPENS 5,1 TIMES.", "40 SUP HAPPENS 12 TIMES. "};
var allValues = lines.Select(line =>
{
double temp;
var words = line.Split(new char[] {' '}, StringSplitOptions.RemoveEmptyEntries);
var value = words[Array.IndexOf(words,"TIMES.") - 1];
if (double.TryParse(value, out temp)) return temp;
else return 0;
}).ToList();
foreach (var value in allValues)
{
Console.WriteLine(value);
}
// Output:
// 5,1
// 12
You can use System.Text.RegularExpressions.Regex, i.e. a regular expression, in order to find a pattern in a string:
string input = "40 SUP HAPPENS 12 TIMES.";
Match match = Regex.Match(input, @"(?<=HAPPENS\s)\d+(?=\sTIMES)");
if (match.Success) {
Console.WriteLine(match.Value); // ==> "12"
}
Explanation of the regular expression: It uses the general pattern (?<=prefix)find(?=suffix) in order to find a position between a prefix and suffix.
(?<=HAPPENS\s) Prefix consisting of "HAPPENS" plus a whitespace (\s)
\d+ A digit (\d) repeated one or more times (+)
(?=\sTIMES) Suffix consisting of a whitespace (\s) plus "TIMES"
If you only want to test for "TIMES" but not for "HAPPENS", you can just drop the first part:
Match match = Regex.Match(input, @"\d+(?=\sTIMES)");
Since you are using the same search pattern many times, it is advisable to create a Regex once instead of calling a static method:
Regex regex = new Regex(@"\d+(?=\sTIMES)");
// Use many times with different inputs
Match match = regex.Match(input);
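Applied to a whole file, the reused Regex might look like the sketch below. The names OccursParser and ReadCounts are made up for illustration, and lines without a number before "TIMES" are simply skipped, which may or may not be what you want.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class OccursParser
{
    // Created once and reused for every line.
    private static readonly Regex TimesPattern = new Regex(@"\d+(?=\sTIMES)");

    // Hypothetical helper: collects the number found before "TIMES" on each line.
    public static List<int> ReadCounts(IEnumerable<string> lines)
    {
        var counts = new List<int>();
        foreach (string line in lines)
        {
            Match match = TimesPattern.Match(line);
            // TryParse guards against values that do not fit into an int.
            if (match.Success && int.TryParse(match.Value, out int count))
                counts.Add(count);
        }
        return counts;
    }
}
```

The lines could come from File.ReadLines(path), which streams the file instead of loading it all at once.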

Masking all characters of a string except for the last n characters

I want to know how I can replace the characters of a string with a mask, except for the last few characters.
Example:
string = "4111111111111111";
And I want to make it that
new_string = "XXXXXXXXXXXX1111"
In this example I replace the character to "X" except the last 4 characters.
How can I possibly achieve this?
Would that suit you?
var input = "4111111111111111";
var length = input.Length;
var result = new String('X', length - 4) + input.Substring(length - 4);
Console.WriteLine(result);
// Output: XXXXXXXXXXXX1111
How about something like...
new_string = new String('X', YourString.Length - 4)
+ YourString.Substring(YourString.Length - 4);
Create a new string whose length is the length of the current string minus 4 and fill it with "X"s. Then add on the last 4 characters of the original string.
Here's a way to think through it. Call the number of trailing characters to keep n:
How many characters will be replaced by X? The length of the string minus n.
How can we replace characters with other characters? You can't directly modify a string, but you can build a new one.
How do we get the last n characters from the original string? There are a couple of ways to do this, but the simplest is probably Substring, which lets us grab part of a string by specifying the starting index and optionally the length.
So it would look something like this (where n is the number of characters to leave from the original, and str is the original string - string can't be the name of your variable because it's a reserved keyword):
// 2. Start with a blank string
var new_string = "";
// 1. Replace first Length - n characters with X
for (var i = 0; i < str.Length - n; i++)
new_string += "X";
// 3. Add in the last n characters from original string.
new_string += str.Substring(str.Length - n);
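The same three steps can be packed into a small helper that also copes with strings shorter than n. This is just a sketch with invented names (Mask, KeepLast); it uses PadLeft to produce the run of X characters instead of a loop.

```csharp
using System;

public static class Mask
{
    // Hypothetical helper: keeps the last n characters and replaces the rest with 'mask'.
    public static string KeepLast(string str, int n, char mask = 'X')
    {
        if (str == null) throw new ArgumentNullException(nameof(str));
        if (n < 0) throw new ArgumentOutOfRangeException(nameof(n));
        if (n >= str.Length) return str; // nothing to mask
        // Take the tail, then pad it on the left back to the original length.
        return str.Substring(str.Length - n).PadLeft(str.Length, mask);
    }
}
```

Mask.KeepLast("4111111111111111", 4) should return twelve X characters followed by 1111.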
This might be a little overkill for what you're asking, but here is a quick extension method that does this.
It defaults to using x as the masking char, but this can be changed with the optional maskingChar parameter.
public static class Masking
{
public static string MaskAllButLast(this string input, int charsToDisplay, char maskingChar = 'x')
{
int charsToMask = input.Length - charsToDisplay;
return charsToMask > 0 ? $"{new string(maskingChar, charsToMask)}{input.Substring(charsToMask)}" : input;
}
}
Here are unit tests to prove it works:
using Xunit;
namespace Tests
{
public class MaskingTest
{
[Theory]
[InlineData("ThisIsATest", 4, 'x', "xxxxxxxTest")]
[InlineData("Test", 4, null, "Test")]
[InlineData("ThisIsATest", 4, '*', "*******Test")]
[InlineData("Test", 16, 'x', "Test")]
[InlineData("Test", 0, 'y', "yyyy")]
public void Testing_Masking(string input, int charToDisplay, char maskingChar, string expected)
{
//Act
string actual = input.MaskAllButLast(charToDisplay, maskingChar);
//Assert
Assert.Equal(expected, actual);
}
}
}
StringBuilder sb = new StringBuilder();
// Append one 'X' for every character except the last four
for (int x = 0; x < input.Length - 4; x++)
{
sb.Append('X');
}
// Then append the last four characters of the original string unchanged
sb.Append(input.Substring(input.Length - 4));
input = sb.ToString();
I guess you could use Select with index
string input = "4111111111111111";
string new_string = new string(input.Select((c, i) => i < input.Length - 4 ? 'X' : c).ToArray());
Some of the other concise answers here did not account for strings less than n characters. Here's my take:
var length = input.Length;
input = length > 4 ? new String('*', length - 4) + input.Substring(length - 4) : input;
lui,
Please Try this one...
string dispString = DisplayString("4111111111111111", 4);
Create a function that takes the original string and the number of trailing digits to keep.
public string DisplayString(string strOriginal,int lastDigit)
{
string strResult = new String('X', strOriginal.Length - lastDigit) + strOriginal.Substring(strOriginal.Length - lastDigit);
return strResult;
}
Maybe this helps.
Try this:
string maskedString = "...." + testString.Substring(testString.Length - 4);
Late to the party but I also wanted to mask all but the last 'x' characters, but only mask numbers or letters so that any - ( ), other formatting, etc would still be shown. Here's my quick extension method that does this - hopefully it helps someone. I started with the example from Luke Hammer, then changed the guts to fit my needs.
public static string MaskOnlyChars(this string input, int charsToDisplay, char maskingChar = 'x')
{
StringBuilder sbOutput = new StringBuilder();
int intMaskCount = input.Length - charsToDisplay;
if (intMaskCount > 0) //only mask if string is longer than requested unmasked chars
{
for (var intloop = 0; intloop < input.Length; intloop++)
{
char charCurr = Char.Parse(input.Substring(intloop, 1));
byte[] charByte = Encoding.ASCII.GetBytes(charCurr.ToString());
int intCurrAscii = charByte[0];
if (intloop <= (intMaskCount - 1))
{
switch (intCurrAscii)
{
case int n when (n >= 48 && n <= 57):
//0-9
sbOutput.Append(maskingChar);
break;
case int n when (n >= 65 && n <= 90):
//A-Z
sbOutput.Append(maskingChar);
break;
case int n when (n >= 97 && n <= 122):
//a-z
sbOutput.Append(maskingChar);
break;
default:
//Leave other characters unmasked
sbOutput.Append(charCurr);
break;
}
}
else
{
//Characters at end to remain unmasked
sbOutput.Append(charCurr);
}
}
}
else
{
//if not enough characters to mask, show unaltered input
return input;
}
return sbOutput.ToString();
}
