How can I reduce the complexity of these two methods? - c#

I have some code like
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
class Solution
{
// returns true or false based on whether s1 and s2 are
// an unordered anagrammatic pair
// e.g. "aac","cac" --> false
// "aac","aca" --> true
// Complexity: O(n)
static bool IsAnagrammaticPair(string s1, string s2)
{
if(s1.Length != s2.Length)
return false;
int[] counter1 = new int[26],
counter2 = new int[26];
for(int i = 0; i < s1.Length; ++i)
{
counter1[(int)s1[i] - (int)'a'] += 1;
counter2[(int)s2[i] - (int)'a'] += 1;
}
for(int i = 0; i < 26; ++i)
if(counter1[i] != counter2[i])
return false;
return true;
}
// gets all substrings of s (not including the empty string,
// including s itself)
// Complexity: O(n^2)
static IEnumerable<string> GetSubstrings(string s)
{
return from i in Enumerable.Range(0, s.Length)
from j in Enumerable.Range(0, s.Length - i + 1)
where j >= 1
select s.Substring(i, j);
}
// gets the number of anagrammatical pairs of substrings in s
// Complexity: O(n^2)
static int NumAnagrammaticalPairs(string s)
{
var substrings = GetSubstrings(s).ToList();
var indices = Enumerable.Range(0, substrings.Count);
return (from i in indices
from j in indices
where i < j && IsAnagrammaticPair(substrings[i], substrings[j])
select 1).Count();
}
static void Main(String[] args)
{
int T = Int32.Parse(Console.ReadLine());
for(int t = 0; t < T; ++t)
{
string line = Console.ReadLine();
Console.WriteLine(NumAnagrammaticalPairs(line));
}
}
}
which is not meeting the performance benchmarks of the problem. The two helper methods I have
GetSubstrings
and
NumAnagrammaticalPairs
I know are O(n^2) as I mentioned in the comments, however I don't see how I can reduce the number of operations involved in retrieving the answer. Any ideas?

You can use LINQ to check for an anagrammatic pair. I'm not sure about the performance, though.
public class Program
{
public static void Main()
{
string x = "aac";
string y = "caa";
Console.WriteLine(IsAnagramaticPair(x, y));
}
public static bool IsAnagramaticPair(string x ,string y)
{
List<char> lstX = x.ToCharArray().OrderBy(m=>m).ToList<char>();
List<char> lstY =y.ToCharArray().OrderBy(m=>m).ToList<char>();
return lstX.SequenceEqual(lstY);
}
}

There are two possibilities that come to mind. First, sort both strings and compare the results:
static bool IsAnagrammaticPair(string s1, string s2)
{
var srt1 = s1.OrderBy(c => c);
var srt2 = s2.OrderBy(c => c);
return srt1.SequenceEqual(srt2);
}
If m and n are the lengths of the strings, then that is O(n log n + m log m).
Another possibility is to make a histogram of characters from one string, and then compare the characters and associated counts from the other string. The strings have to be the same length for that to work, though:
static bool IsAnagrammaticPair(string s1, string s2)
{
if (s1.Length != s2.Length) return false;
var l1 = s1.ToLookup(c => c);
return s2.GroupBy(c => c).All(g => l1.Contains(g.Key) && l1[g.Key].Count() == g.Count());
}
That's O(n) for the first part, with O(n) extra space. And O(m) for the second part.
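The histogram idea can also be sketched with a single dictionary: count the characters of the first string up, then count the characters of the second string down. This is only an illustrative sketch; the class name AnagramCheck is hypothetical, and unlike the int[26] counters in the question it does not assume the input is limited to 'a'..'z'.

```csharp
using System;
using System.Collections.Generic;

public static class AnagramCheck
{
    // Hypothetical helper: true when s1 and s2 contain the same characters
    // with the same multiplicities. Runs in O(n) time for strings of length n.
    public static bool IsAnagrammaticPair(string s1, string s2)
    {
        if (s1.Length != s2.Length) return false;

        var counts = new Dictionary<char, int>();
        foreach (char c in s1)
        {
            counts.TryGetValue(c, out int n); // n is 0 when c is not present yet
            counts[c] = n + 1;
        }
        foreach (char c in s2)
        {
            // A character that is missing or already used up means no anagram.
            if (!counts.TryGetValue(c, out int n) || n == 0) return false;
            counts[c] = n - 1;
        }
        // Equal lengths and no count dropping below zero imply all counts are zero.
        return true;
    }
}
```

Because the lengths are checked first, no final pass over the dictionary is needed.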

You can partition the substrings into subsets so that each subset contains only strings of the same length.
Within each subset, sort the characters of every substring (O(n log n) per string), so that anagrams become identical strings that can be compared directly.
Then sort each subset as well, so that identical strings end up next to each other and no i × j pairwise comparisons are needed.
static void Main( string[] args )
{
int T = Int32.Parse( Console.ReadLine() );
for ( int t = 0 ; t < T ; ++t )
{
string line = Console.ReadLine();
Console.WriteLine( NumAnagrammaticalPairs( line ) );
}
}
static int NumAnagrammaticalPairs( string s )
{
int sum = 0;
foreach ( var substrings in GetSubstringsByLength( s ) )
{
// Count for each string how many others are identical
int toSum = 0;
for ( int i = 1 ; i < substrings.Count ; i++ )
if ( substrings[i - 1] == substrings[i] )
{
toSum++;
sum += toSum;
}
else
{
toSum = 0;
}
}
return sum;
}
static IEnumerable<List<string>> GetSubstringsByLength( string s )
{
for ( int length = 1 ; length < s.Length ; length++ )
yield return GetSubstringsOfLength( s, length );
}
static List<string> GetSubstringsOfLength( string s, int length )
{
var result = new List<string>();
for ( int i = 0 ; i <= s.Length - length ; i++ )
{
var substring = s.Substring( i, length );
result.Add( new string( substring.OrderBy( c => c ).ToArray<char>() ) );
}
result.Sort();
return result;
}
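The same grouping idea can be sketched with a dictionary instead of sorted lists: use the sorted characters of each substring as a key, count how many substrings share each key, and add k(k-1)/2 pairs for a group of size k. This is a sketch under assumptions, not a tuned solution; the names AnagramPairs and CountAnagrammaticPairs are made up for illustration. It still visits all O(n^2) substrings and sorts each one, so it is roughly O(n^3 log n), but it avoids comparing every pair of substrings.

```csharp
using System;
using System.Collections.Generic;

public static class AnagramPairs
{
    // Hypothetical helper: counts unordered pairs of substrings of s
    // that are anagrams of each other.
    public static long CountAnagrammaticPairs(string s)
    {
        var groups = new Dictionary<string, long>();
        for (int start = 0; start < s.Length; start++)
        {
            for (int length = 1; start + length <= s.Length; length++)
            {
                // Sorted characters act as a signature shared by all anagrams.
                char[] chars = s.Substring(start, length).ToCharArray();
                Array.Sort(chars);
                string key = new string(chars);

                groups.TryGetValue(key, out long seen); // 0 when the key is new
                groups[key] = seen + 1;
            }
        }

        long pairs = 0;
        foreach (long k in groups.Values)
            pairs += k * (k - 1) / 2; // number of pairs within a group of size k
        return pairs;
    }
}
```

For example, "abba" has the groups {a, a}, {b, b}, {ab, ba} and {abb, bba}, which gives 4 pairs in total.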

Related

How to create a sequence of strings between "start" and "end" strings [duplicate]

I have a question about iterating through the alphabet.
I would like to have a loop that begins with "a" and ends with "z". After that, the loop begins at "aa" and counts to "az", then runs from "ba" up to "bz", and so on...
Does anybody know a solution?
Thanks
EDIT: I forgot to mention that if I give "a" to the function, it must return "b". If you give it "bnc", it must return "bnd".
First effort, with just a-z then aa-zz
public static IEnumerable<string> GetExcelColumns()
{
for (char c = 'a'; c <= 'z'; c++)
{
yield return c.ToString();
}
char[] chars = new char[2];
for (char high = 'a'; high <= 'z'; high++)
{
chars[0] = high;
for (char low = 'a'; low <= 'z'; low++)
{
chars[1] = low;
yield return new string(chars);
}
}
}
Note that this will stop at 'zz'. Of course, there's some ugly duplication here in terms of the loops. Fortunately, that's easy to fix - and it can be even more flexible, too:
Second attempt: more flexible alphabet
private const string Alphabet = "abcdefghijklmnopqrstuvwxyz";
public static IEnumerable<string> GetExcelColumns()
{
return GetExcelColumns(Alphabet);
}
public static IEnumerable<string> GetExcelColumns(string alphabet)
{
foreach(char c in alphabet)
{
yield return c.ToString();
}
char[] chars = new char[2];
foreach(char high in alphabet)
{
chars[0] = high;
foreach(char low in alphabet)
{
chars[1] = low;
yield return new string(chars);
}
}
}
Now if you want to generate just a, b, c, d, aa, ab, ac, ad, ba, ... you'd call GetExcelColumns("abcd").
Third attempt (revised further) - infinite sequence
public static IEnumerable<string> GetExcelColumns(string alphabet)
{
int length = 0;
char[] chars = null;
int[] indexes = null;
while (true)
{
int position = length-1;
// Try to increment the least significant
// value.
while (position >= 0)
{
indexes[position]++;
if (indexes[position] == alphabet.Length)
{
for (int i=position; i < length; i++)
{
indexes[i] = 0;
chars[i] = alphabet[0];
}
position--;
}
else
{
chars[position] = alphabet[indexes[position]];
break;
}
}
// If we got all the way to the start of the array,
// we need an extra value
if (position == -1)
{
length++;
chars = new char[length];
indexes = new int[length];
for (int i=0; i < length; i++)
{
chars[i] = alphabet[0];
}
}
yield return new string(chars);
}
}
It's possible that it would be cleaner code using recursion, but it wouldn't be as efficient.
Note that if you want to stop at a certain point, you can just use LINQ:
var query = GetExcelColumns().TakeWhile(x => x != "zzz");
"Restarting" the iterator
To restart the iterator from a given point, you could indeed use SkipWhile as suggested by thesoftwarejedi. That's fairly inefficient, of course. If you're able to keep any state between calls, you can just keep the iterator (for either solution):
using (IEnumerator<string> iterator = GetExcelColumns().GetEnumerator())
{
iterator.MoveNext();
string firstAttempt = iterator.Current;
if (someCondition)
{
iterator.MoveNext();
string secondAttempt = iterator.Current;
// etc
}
}
Alternatively, you may well be able to structure your code to use a foreach anyway, just breaking out on the first value you can actually use.
Edit: Made it do exactly as the OP's latest edit wants
This is the simplest solution, and tested:
static void Main(string[] args)
{
Console.WriteLine(GetNextBase26("a"));
Console.WriteLine(GetNextBase26("bnc"));
}
private static string GetNextBase26(string a)
{
return Base26Sequence().SkipWhile(x => x != a).Skip(1).First();
}
private static IEnumerable<string> Base26Sequence()
{
long i = 0L;
while (true)
yield return Base26Encode(i++);
}
private static char[] base26Chars = "abcdefghijklmnopqrstuvwxyz".ToCharArray();
private static string Base26Encode(Int64 value)
{
string returnValue = null;
do
{
returnValue = base26Chars[value % 26] + returnValue;
value /= 26;
} while (value-- != 0);
return returnValue;
}
The following populates a list with the required strings:
List<string> result = new List<string>();
for (char ch = 'a'; ch <= 'z'; ch++){
result.Add (ch.ToString());
}
for (char i = 'a'; i <= 'z'; i++)
{
for (char j = 'a'; j <= 'z'; j++)
{
result.Add (i.ToString() + j.ToString());
}
}
I know there are plenty of answers here, and one's been accepted, but IMO they all make it harder than it needs to be. I think the following is simpler and cleaner:
static string NextColumn(string column){
char[] c = column.ToCharArray();
for(int i = c.Length - 1; i >= 0; i--){
if(char.ToUpper(c[i]++) < 'Z')
break;
c[i] -= (char)26;
if(i == 0)
return "A" + new string(c);
}
return new string(c);
}
Note that this doesn't do any input validation. If you don't trust your callers, you should add an IsNullOrEmpty check at the beginning, and a c[i] >= 'A' && c[i] <= 'Z' || c[i] >= 'a' && c[i] <= 'z' check at the top of the loop. Or just leave it be and let it be GIGO.
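As a rough sketch of what the validated variant could look like: the version below checks the whole input up front (rather than at the top of the loop, so that characters to the left of the incremented position are validated too) and otherwise follows the same carry logic. NextColumnChecked and ColumnNames are hypothetical names, and reusing the case of the first letter for the new leading character is an assumption of this sketch.

```csharp
using System;

public static class ColumnNames
{
    // Hypothetical validated variant of NextColumn.
    public static string NextColumnChecked(string column)
    {
        if (string.IsNullOrEmpty(column))
            throw new ArgumentException("Column must be a non-empty string.", nameof(column));

        foreach (char ch in column)
        {
            bool isLetter = (ch >= 'A' && ch <= 'Z') || (ch >= 'a' && ch <= 'z');
            if (!isLetter)
                throw new ArgumentException("Column may only contain A-Z or a-z.", nameof(column));
        }

        char[] c = column.ToCharArray();
        for (int i = c.Length - 1; i >= 0; i--)
        {
            // Increment this position; stop unless it was 'Z' or 'z'.
            if (char.ToUpper(c[i]++) < 'Z')
                break;
            c[i] = (char)(c[i] - 26); // wrap back to 'A' or 'a'
            if (i == 0)
                return c[0] + new string(c); // every position wrapped: grow by one letter
        }
        return new string(c);
    }
}
```

With this sketch, "a" becomes "b", "bnc" becomes "bnd", and "zz" becomes "aaa".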
You may also find use for these companion functions:
static string GetColumnName(int index){
StringBuilder txt = new StringBuilder();
txt.Append((char)('A' + index % 26));
//txt.Append((char)('A' + --index % 26));
while((index /= 26) > 0)
txt.Insert(0, (char)('A' + --index % 26));
return txt.ToString();
}
static int GetColumnIndex(string name){
int rtn = 0;
foreach(char c in name)
rtn = rtn * 26 + (char.ToUpper(c) - '@');
return rtn - 1;
//return rtn;
}
These two functions are zero-based. That is, "A" = 0, "Z" = 25, "AA" = 26, etc. To make them one-based (like Excel's COM interface), remove the line above the commented line in each function, and uncomment those lines.
As with the NextColumn function, these functions don't validate their inputs. Both will give you garbage if that's what they get.
Here’s what I came up with.
/// <summary>
/// Return an incremented alphabetical string
/// </summary>
/// <param name="letter">The string to be incremented</param>
/// <returns>the incremented string</returns>
public static string NextLetter(string letter)
{
const string alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
if (!string.IsNullOrEmpty(letter))
{
char lastLetterInString = letter[letter.Length - 1];
// if the last letter in the string is the last letter of the alphabet
if (alphabet.IndexOf(lastLetterInString) == alphabet.Length - 1)
{
// replace the last letter in the string with the first letter of the alphabet and get the next letter for the rest of the string
return NextLetter(letter.Substring(0, letter.Length - 1)) + alphabet[0];
}
else
{
// replace the last letter in the string with the following letter of the alphabet
return letter.Remove(letter.Length-1).Insert(letter.Length-1, (alphabet[alphabet.IndexOf(letter[letter.Length-1])+1]).ToString() );
}
}
//return the first letter of the alphabet
return alphabet[0].ToString();
}
Just curious, why not just:
private string alphRecursive(int c) {
var alphabet = "abcdefghijklmnopqrstuvwxyz".ToCharArray();
if (c >= alphabet.Length) {
// Subtract 1 so that "z" (25) is followed by "aa" (26) rather than "ba"
return alphRecursive(c/alphabet.Length - 1) + alphabet[c%alphabet.Length];
} else {
return "" + alphabet[c%alphabet.Length];
}
}
This is like displaying an int in base 26 instead of base 10, except that there is no zero digit: after "z" comes "aa", not "ba". Try the following algorithm to find the nth entry of the sequence (counting from n = 0):
s = '';
repeat {
s = alphabet[n mod 26] + s;
n = (n div 26) - 1;
} until (n < 0);
Of course, if you want the first n entries, this is not the most efficient solution. In this case, try something like daniel's solution.
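A C# sketch of the nth-entry computation, assuming a zero-based index (0 is "a", 25 is "z", 26 is "aa"). The names AlphabetSequence and NthEntry are chosen here for illustration only. Note the "- 1" after each division: the sequence has no zero digit, so a plain base-26 conversion would produce "ba" after "z".

```csharp
using System;

public static class AlphabetSequence
{
    private const string Alphabet = "abcdefghijklmnopqrstuvwxyz";

    // Hypothetical helper: returns the n-th entry of a, b, ..., z, aa, ab, ...
    public static string NthEntry(long n)
    {
        if (n < 0) throw new ArgumentOutOfRangeException(nameof(n));

        string s = "";
        do
        {
            // Least significant letter first, prepended to the result.
            s = Alphabet[(int)(n % 26)] + s;
            // Move to the next position; subtracting 1 accounts for the missing zero digit.
            n = n / 26 - 1;
        } while (n >= 0);
        return s;
    }
}
```

This computes a single entry directly; for listing the first n entries in order, an incrementing iterator does less work per item.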
I gave this a go and came up with this:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace Alphabetty
{
class Program
{
const string alphabet = "abcdefghijklmnopqrstuvwxyz";
static int cursor = 0;
static int prefixCursor;
static string prefix = string.Empty;
static bool done = false;
static void Main(string[] args)
{
string s = string.Empty;
while (s != "Done")
{
s = GetNextString();
Console.WriteLine(s);
}
Console.ReadKey();
}
static string GetNextString()
{
if (done) return "Done";
char? nextLetter = GetNextLetter(ref cursor);
if (nextLetter == null)
{
char? nextPrefixLetter = GetNextLetter(ref prefixCursor);
if(nextPrefixLetter == null)
{
done = true;
return "Done";
}
prefix = nextPrefixLetter.Value.ToString();
nextLetter = GetNextLetter(ref cursor);
}
return prefix + nextLetter;
}
static char? GetNextLetter(ref int letterCursor)
{
if (letterCursor == alphabet.Length)
{
letterCursor = 0;
return null;
}
char c = alphabet[letterCursor];
letterCursor++;
return c;
}
}
}
Here is something I had cooked up that may be similar. I was experimenting with iteration counts in order to design a numbering scheme that was as small as possible, yet gave me enough uniqueness.
I knew that each time I added an alpha character, it would increase the possibilities 26x, but I wasn't sure how many letters or numbers, or which pattern, I wanted to use.
That led me to the code below. Basically, you pass it an alphanumeric string, and every position that has a letter will eventually increment to "z" or "Z", and every position that has a number will eventually increment to "9".
So you can call it one of two ways:
//This would give you the next iteration... (H3reIsaStup4dExamplf)
string myNextValue = IncrementAlphaNumericValue("H3reIsaStup4dExample");
//Or loop it, resulting eventually in "Z9zzZzzZzzz9zZzzzzzz"
string myNextValue = "H3reIsaStup4dExample";
while (myNextValue != null)
{
myNextValue = IncrementAlphaNumericValue(myNextValue);
//And of course do something with this like write it out
}
(For me, I was doing something like "1AA000")
public string IncrementAlphaNumericValue(string Value)
{
//We only allow Characters a-z, A-Z, 0-9
if (System.Text.RegularExpressions.Regex.IsMatch(Value, "^[a-zA-Z0-9]+$") == false)
{
throw new Exception("Invalid Character: Must be a-Z or 0-9");
}
//We work with each Character so it's best to convert the string to a char array for incrementing
char[] myCharacterArray = Value.ToCharArray();
//So what we do here is step backwards through the Characters and increment the first one we can.
for (Int32 myCharIndex = myCharacterArray.Length - 1; myCharIndex >= 0; myCharIndex--)
{
//Converts the Character to its ASCII value
Int32 myCharValue = Convert.ToInt32(myCharacterArray[myCharIndex]);
//We only increment this Character position if it is not already at its max value (Z = 90, z = 122, 9 = 57)
if (myCharValue != 57 && myCharValue != 90 && myCharValue != 122)
{
myCharacterArray[myCharIndex]++;
//Now that we have Incremented the Character, we "reset" all the values to the right of it
for (Int32 myResetIndex = myCharIndex + 1; myResetIndex < myCharacterArray.Length; myResetIndex++)
{
myCharValue = Convert.ToInt32(myCharacterArray[myResetIndex]);
if (myCharValue >= 65 && myCharValue <= 90)
{
myCharacterArray[myResetIndex] = 'A';
}
else if (myCharValue >= 97 && myCharValue <= 122)
{
myCharacterArray[myResetIndex] = 'a';
}
else if (myCharValue >= 48 && myCharValue <= 57)
{
myCharacterArray[myResetIndex] = '0';
}
}
//Now we just return an new Value
return new string(myCharacterArray);
}
}
//If we got through the Character loop and were not able to increment anything, we return null.
return null;
}
Here's my attempt using recursion:
public static void PrintAlphabet(string alphabet, string prefix)
{
for (int i = 0; i < alphabet.Length; i++) {
Console.WriteLine(prefix + alphabet[i].ToString());
}
if (prefix.Length < alphabet.Length - 1) {
for (int i = 0; i < alphabet.Length; i++) {
PrintAlphabet(alphabet, prefix + alphabet[i]);
}
}
}
Then simply call PrintAlphabet("abcd", "");

Index out of bounds of array but only sometimes [duplicate]

Suppose I had a string:
string str = "1111222233334444";
How can I break this string into chunks of some size?
e.g., breaking this into sizes of 4 would return strings:
"1111"
"2222"
"3333"
"4444"
static IEnumerable<string> Split(string str, int chunkSize)
{
return Enumerable.Range(0, str.Length / chunkSize)
.Select(i => str.Substring(i * chunkSize, chunkSize));
}
Please note that additional code might be required to gracefully handle edge cases (null or empty input string, chunkSize == 0, input string length not divisible by chunkSize, etc.). The original question doesn't specify any requirements for these edge cases and in real life the requirements might vary so they are out of scope of this answer.
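For readers who do need those edge cases, here is a minimal sketch of one possible policy: throw on null input or a non-positive chunk size, return nothing for an empty string, and emit a shorter final chunk when the length is not divisible by chunkSize. These choices, and the names StringChunker and SplitInChunks, are assumptions made for illustration; real requirements may differ.

```csharp
using System;
using System.Collections.Generic;

public static class StringChunker
{
    // Hypothetical helper: validates eagerly, then yields chunks lazily.
    public static IEnumerable<string> SplitInChunks(string str, int chunkSize)
    {
        if (str == null) throw new ArgumentNullException(nameof(str));
        if (chunkSize < 1) throw new ArgumentOutOfRangeException(nameof(chunkSize));
        return Iterate(str, chunkSize);
    }

    private static IEnumerable<string> Iterate(string str, int chunkSize)
    {
        for (int i = 0; i < str.Length; i += chunkSize)
        {
            // The last chunk may be shorter than chunkSize.
            yield return str.Substring(i, Math.Min(chunkSize, str.Length - i));
        }
    }
}
```

The validation lives in a non-iterator wrapper so that bad arguments fail at the call site rather than when the result is first enumerated.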
In a combination of dove+Konstatin's answers...
static IEnumerable<string> WholeChunks(string str, int chunkSize) {
for (int i = 0; i < str.Length; i += chunkSize)
yield return str.Substring(i, chunkSize);
}
This will work for all strings that can be split into a whole number of chunks, and will throw an exception otherwise.
If you want to support strings of any length you could use the following code:
static IEnumerable<string> ChunksUpto(string str, int maxChunkSize) {
for (int i = 0; i < str.Length; i += maxChunkSize)
yield return str.Substring(i, Math.Min(maxChunkSize, str.Length-i));
}
However, the OP explicitly stated he does not need this; it's somewhat longer, harder to read, and slightly slower. In the spirit of KISS and YAGNI, I'd go with the first option: it's probably the most efficient implementation possible, and it's very short, readable, and, importantly, throws an exception for nonconforming input.
Why not loops? Here's something that would do it quite well:
string str = "111122223333444455";
int chunkSize = 4;
int stringLength = str.Length;
for (int i = 0; i < stringLength ; i += chunkSize)
{
if (i + chunkSize > stringLength) chunkSize = stringLength - i;
Console.WriteLine(str.Substring(i, chunkSize));
}
Console.ReadLine();
I don't know how you'd want to deal with the case where the string length is not a multiple of 4, and I'm not saying your idea is not possible; I'm just wondering about the motivation for it when a simple for loop does the job very well. Obviously the above could be cleaned up and even turned into an extension method.
Or, as mentioned in the comments, if you know the length is a multiple of 4:
str = "1111222233334444";
chunkSize = 4; // reset, because the loop above may have shortened it
for (int i = 0; i < str.Length; i += chunkSize)
{Console.WriteLine(str.Substring(i, chunkSize));}
This is based on @dove's solution but implemented as an extension method.
Benefits:
Extension method
Covers corner cases
Splits string with any chars: numbers, letters, other symbols
Code
public static class EnumerableEx
{
public static IEnumerable<string> SplitBy(this string str, int chunkLength)
{
if (String.IsNullOrEmpty(str)) throw new ArgumentException();
if (chunkLength < 1) throw new ArgumentException();
for (int i = 0; i < str.Length; i += chunkLength)
{
if (chunkLength + i > str.Length)
chunkLength = str.Length - i;
yield return str.Substring(i, chunkLength);
}
}
}
Usage
var result = "bobjoecat".SplitBy(3); // bob, joe, cat
Unit tests removed for brevity (see previous revision)
Using regular expressions and Linq:
List<string> groups = (from Match m in Regex.Matches(str, @"\d{4}")
select m.Value).ToList();
I find this to be more readable, but it's just a personal opinion. It can also be a one-liner : ).
How's this for a one-liner?
List<string> result = new List<string>(Regex.Split(target, @"(?<=\G.{4})", RegexOptions.Singleline));
With this regex it doesn't matter if the last chunk is less than four characters, because it only ever looks at the characters behind it.
I'm sure this isn't the most efficient solution, but I just had to toss it out there.
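A small sketch of how the same lookbehind pattern might be wrapped for an arbitrary chunk size. One detail to be aware of: when the input length is an exact multiple of the chunk size, Regex.Split also returns an empty string after the last split point, so this sketch filters empty entries out. The names RegexChunker and SplitEvery are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexChunker
{
    // Hypothetical wrapper around the \G lookbehind split.
    public static List<string> SplitEvery(string target, int size)
    {
        if (size < 1) throw new ArgumentOutOfRangeException(nameof(size));

        // Split at every position that is preceded by exactly `size` characters
        // counted from where the previous match ended (\G).
        string pattern = @"(?<=\G.{" + size + "})";

        return Regex.Split(target, pattern, RegexOptions.Singleline)
                    .Where(chunk => chunk.Length > 0) // drop the trailing empty entry
                    .ToList();
    }
}
```

RegexOptions.Singleline makes "." match newlines as well, so line breaks in the input count as ordinary characters.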
Starting with .NET 6, we can also use the Chunk method:
var result = str
.Chunk(4)
.Select(x => new string(x))
.ToList();
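To illustrate how Chunk treats a string whose length is not a multiple of the chunk size, here is a minimal sketch (it requires .NET 6 or later; ChunkDemo and ToChunks are names made up for this example). The last array returned by Chunk simply holds the remaining characters.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChunkDemo
{
    // Hypothetical helper: Enumerable.Chunk yields char[] pieces of at most `size` items.
    public static List<string> ToChunks(string str, int size)
    {
        return str.Chunk(size)
                  .Select(chars => new string(chars)) // char[] back to string
                  .ToList();
    }
}
```

For "1111222233" with a size of 4, this yields "1111", "2222" and "33".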
I recently had to write something that accomplishes this at work, so I thought I would post my solution to this problem. As an added bonus, the functionality of this solution provides a way to split the string in the opposite direction and it does correctly handle unicode characters as previously mentioned by Marvin Pinto above. So, here it is:
using System;
using Extensions;
namespace TestCSharp
{
class Program
{
static void Main(string[] args)
{
string asciiStr = "This is a string.";
string unicodeStr = "これは文字列です。";
string[] array1 = asciiStr.Split(4);
string[] array2 = asciiStr.Split(-4);
string[] array3 = asciiStr.Split(7);
string[] array4 = asciiStr.Split(-7);
string[] array5 = unicodeStr.Split(5);
string[] array6 = unicodeStr.Split(-5);
}
}
}
namespace Extensions
{
public static class StringExtensions
{
/// <summary>Returns a string array that contains the substrings in this string that are separated at a given fixed length.</summary>
/// <param name="s">This string object.</param>
/// <param name="length">Size of each substring.
/// <para>CASE: length > 0 , RESULT: String is split from left to right.</para>
/// <para>CASE: length == 0 , RESULT: String is returned as the only entry in the array.</para>
/// <para>CASE: length < 0 , RESULT: String is split from right to left.</para>
/// </param>
/// <returns>String array that has been split into substrings of equal length.</returns>
/// <example>
/// <code>
/// string s = "1234567890";
/// string[] a = s.Split(4); // a == { "1234", "5678", "90" }
/// </code>
/// </example>
public static string[] Split(this string s, int length)
{
System.Globalization.StringInfo str = new System.Globalization.StringInfo(s);
int lengthAbs = Math.Abs(length);
if (str == null || str.LengthInTextElements == 0 || lengthAbs == 0 || str.LengthInTextElements <= lengthAbs)
return new string[] { str.ToString() };
string[] array = new string[(str.LengthInTextElements % lengthAbs == 0 ? str.LengthInTextElements / lengthAbs: (str.LengthInTextElements / lengthAbs) + 1)];
if (length > 0)
for (int iStr = 0, iArray = 0; iStr < str.LengthInTextElements && iArray < array.Length; iStr += lengthAbs, iArray++)
array[iArray] = str.SubstringByTextElements(iStr, (str.LengthInTextElements - iStr < lengthAbs ? str.LengthInTextElements - iStr : lengthAbs));
else // if (length < 0)
for (int iStr = str.LengthInTextElements - 1, iArray = array.Length - 1; iStr >= 0 && iArray >= 0; iStr -= lengthAbs, iArray--)
array[iArray] = str.SubstringByTextElements((iStr - lengthAbs < 0 ? 0 : iStr - lengthAbs + 1), (iStr - lengthAbs < 0 ? iStr + 1 : lengthAbs));
return array;
}
}
}
Also, here is an image link to the results of running this code: http://i.imgur.com/16Iih.png
It's not pretty and it's not fast, but it works, it's a one-liner and it's LINQy:
List<string> a = text.Select((c, i) => new { Char = c, Index = i }).GroupBy(o => o.Index / 4).Select(g => new String(g.Select(o => o.Char).ToArray())).ToList();
This should be much faster and more efficient than using LINQ or other approaches used here.
public static IEnumerable<string> Splice(this string s, int spliceLength)
{
if (s == null)
throw new ArgumentNullException("s");
if (spliceLength < 1)
throw new ArgumentOutOfRangeException("spliceLength");
if (s.Length == 0)
yield break;
var start = 0;
for (var end = spliceLength; end < s.Length; end += spliceLength)
{
yield return s.Substring(start, spliceLength);
start = end;
}
yield return s.Substring(start);
}
You can use morelinq by Jon Skeet. Use Batch like:
string str = "1111222233334444";
int chunkSize = 4;
var chunks = str.Batch(chunkSize).Select(r => new String(r.ToArray()));
This will return 4 chunks for the string "1111222233334444". If the string length is less than or equal to the chunk size Batch will return the string as the only element of IEnumerable<string>
For output:
foreach (var chunk in chunks)
{
Console.WriteLine(chunk);
}
and it will give:
1111
2222
3333
4444
Personally I prefer my solution :-)
It handles:
String lengths that are a multiple of the chunk size.
String lengths that are NOT a multiple of the chunk size.
String lengths that are smaller than the chunk size.
NULL and empty strings (throws an exception).
Chunk sizes smaller than 1 (throws an exception).
It is implemented as an extension method, and it calculates the number of chunks it is going to generate beforehand. It handles the last chunk separately, because that chunk needs to be shorter when the text length is not a multiple of the chunk size. Clean, short, easy to understand... and works!
public static string[] Split(this string value, int chunkSize)
{
if (string.IsNullOrEmpty(value)) throw new ArgumentException("The string cannot be null or empty.");
if (chunkSize < 1) throw new ArgumentException("The chunk size should be equal or greater than one.");
int remainder;
int divResult = Math.DivRem(value.Length, chunkSize, out remainder);
int numberOfChunks = remainder > 0 ? divResult + 1 : divResult;
var result = new string[numberOfChunks];
int i = 0;
while (i < numberOfChunks - 1)
{
result[i] = value.Substring(i * chunkSize, chunkSize);
i++;
}
int lastChunkSize = remainder > 0 ? remainder : chunkSize;
result[i] = value.Substring(i * chunkSize, lastChunkSize);
return result;
}
Simple and short:
// this means match a space or not a space (anything) up to 4 characters
var lines = Regex.Matches(str, @"[\s\S]{0,4}").Cast<Match>().Select(x => x.Value);
I know the question is years old, but here is an Rx implementation. It handles the length % chunkSize != 0 problem out of the box:
public static IEnumerable<string> Chunkify(this string input, int size)
{
if(size < 1)
throw new ArgumentException("size must be greater than 0");
return input.ToCharArray()
.ToObservable()
.Buffer(size)
.Select(x => new string(x.ToArray()))
.ToEnumerable();
}
public static IEnumerable<IEnumerable<T>> SplitEvery<T>(this IEnumerable<T> values, int n)
{
var ls = values.Take(n);
var rs = values.Skip(n);
return ls.Any() ?
Cons(ls, SplitEvery(rs, n)) :
Enumerable.Empty<IEnumerable<T>>();
}
public static IEnumerable<T> Cons<T>(T x, IEnumerable<T> xs)
{
yield return x;
foreach (var xi in xs)
yield return xi;
}
Best, easiest, and most generic answer :)
string originalString = "1111222233334444";
List<string> test = new List<string>();
int chunkSize = 4; // change 4 with the size of strings you want.
for (int i = 0; i < originalString.Length; i = i + chunkSize)
{
if (originalString.Length - i >= chunkSize)
test.Add(originalString.Substring(i, chunkSize));
else
test.Add(originalString.Substring(i,((originalString.Length - i))));
}
static IEnumerable<string> Split(string str, int chunkSize)
{
IEnumerable<string> retVal = Enumerable.Range(0, str.Length / chunkSize)
.Select(i => str.Substring(i * chunkSize, chunkSize));
if (str.Length % chunkSize > 0)
retVal = retVal.Append(str.Substring(str.Length / chunkSize * chunkSize, str.Length % chunkSize));
return retVal;
}
It correctly handles input string length not divisible by chunkSize.
Please note that additional code might be required to gracefully handle edge cases (null or empty input string, chunkSize == 0).
static IEnumerable<string> Split(string str, double chunkSize)
{
return Enumerable.Range(0, (int) Math.Ceiling(str.Length/chunkSize))
.Select(i => new string(str
.Skip(i * (int)chunkSize)
.Take((int)chunkSize)
.ToArray()));
}
and another approach:
using System;
using System.Collections.Generic;
using System.Linq;
public class Program
{
public static void Main()
{
var x = "Hello World";
foreach(var i in x.ChunkString(2)) Console.WriteLine(i);
}
}
public static class Ext{
public static IEnumerable<string> ChunkString(this string val, int chunkSize){
return val.Select((x,i) => new {Index = i, Value = x})
.GroupBy(x => x.Index/chunkSize, x => x.Value)
.Select(x => string.Join("",x));
}
}
Six years later o_O
Just because
public static IEnumerable<string> Split(this string str, int chunkSize, bool remainingInFront)
{
var count = (int) Math.Ceiling(str.Length/(double) chunkSize);
Func<int, int> start = index => remainingInFront ? str.Length - (count - index)*chunkSize : index*chunkSize;
Func<int, int> end = index => Math.Min(str.Length - Math.Max(start(index), 0), Math.Min(start(index) + chunkSize - Math.Max(start(index), 0), chunkSize));
return Enumerable.Range(0, count).Select(i => str.Substring(Math.Max(start(i), 0),end(i)));
}
or
private static Func<bool, int, int, int, int, int> start = (remainingInFront, length, count, index, size) =>
remainingInFront ? length - (count - index) * size : index * size;
private static Func<bool, int, int, int, int, int, int> end = (remainingInFront, length, count, index, size, start) =>
Math.Min(length - Math.Max(start, 0), Math.Min(start + size - Math.Max(start, 0), size));
public static IEnumerable<string> Split(this string str, int chunkSize, bool remainingInFront)
{
var count = (int)Math.Ceiling(str.Length / (double)chunkSize);
return Enumerable.Range(0, count).Select(i => str.Substring(
Math.Max(start(remainingInFront, str.Length, count, i, chunkSize), 0),
end(remainingInFront, str.Length, count, i, chunkSize, start(remainingInFront, str.Length, count, i, chunkSize))
));
}
AFAIK all edge cases are handled.
Console.WriteLine(string.Join(" ", "abc".Split(2, false))); // ab c
Console.WriteLine(string.Join(" ", "abc".Split(2, true))); // a bc
Console.WriteLine(string.Join(" ", "a".Split(2, true))); // a
Console.WriteLine(string.Join(" ", "a".Split(2, false))); // a
List<string> SplitString(int chunk, string input)
{
List<string> list = new List<string>();
int cycles = input.Length / chunk;
if (input.Length % chunk != 0)
cycles++;
for (int i = 0; i < cycles; i++)
{
try
{
list.Add(input.Substring(i * chunk, chunk));
}
catch
{
list.Add(input.Substring(i * chunk));
}
}
return list;
}
I took this to another level. Chunking is an easy one-liner, but in my case I needed whole words as well. I figured I would post it, just in case someone else needs something similar.
static IEnumerable<string> Split(string orgString, int chunkSize, bool wholeWords = true)
{
if (wholeWords)
{
List<string> result = new List<string>();
StringBuilder sb = new StringBuilder();
if (orgString.Length > chunkSize)
{
string[] newSplit = orgString.Split(' ');
foreach (string str in newSplit)
{
if (sb.Length != 0)
sb.Append(" ");
if (sb.Length + str.Length > chunkSize)
{
result.Add(sb.ToString());
sb.Clear();
}
sb.Append(str);
}
result.Add(sb.ToString());
}
else
result.Add(orgString);
return result;
}
else
return new List<string>(Regex.Split(orgString, @"(?<=\G.{" + chunkSize + "})", RegexOptions.Singleline));
}
Results based on below comment:
string msg = "336699AABBCCDDEEFF";
foreach (string newMsg in Split(msg, 2, false))
{
Console.WriteLine($">>{newMsg}<<");
}
Console.ReadKey();
Results:
>>33<<
>>66<<
>>99<<
>>AA<<
>>BB<<
>>CC<<
>>DD<<
>>EE<<
>>FF<<
>><<
Another way to pull it:
List<string> splitData = (List<string>)Split(msg, 2, false);
for (int i = 0; i < splitData.Count - 1; i++)
{
Console.WriteLine($">>{splitData[i]}<<");
}
Console.ReadKey();
New Results:
>>33<<
>>66<<
>>99<<
>>AA<<
>>BB<<
>>CC<<
>>DD<<
>>EE<<
>>FF<<
An important tip if the string that is being chunked needs to support all Unicode characters.
If the string is to support international characters like 𠀋, then split up the string using the System.Globalization.StringInfo class. Using StringInfo, you can split up the string based on number of text elements.
string internationalString = "𠀋";
The above string has a Length of 2, because the String.Length property returns the number of Char objects in this instance, not the number of Unicode characters.
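A minimal sketch of chunking by text elements with StringInfo, so that a character such as 𠀋 is never cut in half. The names TextElementChunker and ChunkByTextElements are hypothetical; LengthInTextElements and SubstringByTextElements are the StringInfo members doing the work.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class TextElementChunker
{
    // Hypothetical helper: chunkSize counts text elements, not Char objects.
    // The argument checks run when enumeration starts, because this is an iterator method.
    public static IEnumerable<string> ChunkByTextElements(string text, int chunkSize)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        if (chunkSize < 1) throw new ArgumentOutOfRangeException(nameof(chunkSize));

        var info = new StringInfo(text);
        for (int i = 0; i < info.LengthInTextElements; i += chunkSize)
        {
            // The last chunk may contain fewer text elements.
            int count = Math.Min(chunkSize, info.LengthInTextElements - i);
            yield return info.SubstringByTextElements(i, count);
        }
    }
}
```

A chunk returned by this sketch can therefore have a Length greater than chunkSize, since each text element may consist of more than one Char.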
Changed slightly to also return the trailing part whose size is not equal to chunkSize:
public static IEnumerable<string> Split(this string str, int chunkSize)
{
var splits = new List<string>();
if (str.Length < chunkSize) { chunkSize = str.Length; }
splits.AddRange(Enumerable.Range(0, str.Length / chunkSize).Select(i => str.Substring(i * chunkSize, chunkSize)));
splits.Add(str.Length % chunkSize > 0 ? str.Substring((str.Length / chunkSize) * chunkSize, str.Length - ((str.Length / chunkSize) * chunkSize)) : string.Empty);
return (IEnumerable<string>)splits;
}
I think this is a straightforward answer:
public static IEnumerable<string> Split(this string str, int chunkSize)
{
if(string.IsNullOrEmpty(str) || chunkSize<1)
throw new ArgumentException("String can not be null or empty and chunk size should be greater than zero.");
var chunkCount = str.Length / chunkSize + (str.Length % chunkSize != 0 ? 1 : 0);
for (var i = 0; i < chunkCount; i++)
{
var startIndex = i * chunkSize;
if (startIndex + chunkSize >= str.Length)
yield return str.Substring(startIndex);
else
yield return str.Substring(startIndex, chunkSize);
}
}
And it covers edge cases.
static List<string> GetChunks(string value, int chunkLength)
{
var res = new List<string>();
int count = (value.Length / chunkLength) + (value.Length % chunkLength > 0 ? 1 : 0);
Enumerable.Range(0, count).ToList().ForEach(f => res.Add(value.Skip(f * chunkLength).Take(chunkLength).Select(z => z.ToString()).Aggregate((a,b) => a+b)));
return res;
}
Here's my 2 cents:
IEnumerable<string> Split(string str, int chunkSize)
{
while (!string.IsNullOrWhiteSpace(str))
{
var chunk = str.Take(chunkSize).ToArray();
str = str.Substring(chunk.Length);
yield return new string(chunk);
}
}//Split
I've built slightly on João's solution.
What I've done differently is that my method lets you specify whether to return the remaining characters as a final, shorter chunk or to truncate them when they do not fill the required chunk length. I think it's pretty flexible, and the code is fairly straightforward:
using System;
using System.Linq;
using System.Text.RegularExpressions;
namespace SplitFunction
{
class Program
{
static void Main(string[] args)
{
string text = "hello, how are you doing today?";
string[] chunks = SplitIntoChunks(text, 3,false);
if (chunks != null)
{
chunks.ToList().ForEach(e => Console.WriteLine(e));
}
Console.ReadKey();
}
private static string[] SplitIntoChunks(string text, int chunkSize, bool truncateRemaining)
{
string chunk = chunkSize.ToString();
string pattern = truncateRemaining ? ".{" + chunk + "}" : ".{1," + chunk + "}";
string[] chunks = null;
if (chunkSize > 0 && !String.IsNullOrEmpty(text))
chunks = (from Match m in Regex.Matches(text,pattern)select m.Value).ToArray();
return chunks;
}
}
}
public static List<string> SplitByMaxLength(this string str)
{
List<string> splitString = new List<string>();
for (int index = 0; index < str.Length; index += MaxLength)
{
splitString.Add(str.Substring(index, Math.Min(MaxLength, str.Length - index)));
}
return splitString;
}
I can't remember who gave me this, but it works great. I speed tested a number of ways to break Enumerable types into groups. The usage would just be like this...
List<string> Divided = Source3.Chunk(24).Select(Piece => string.Concat<char>(Piece)).ToList();
The extension code would look like this...
#region Chunk Logic
private class ChunkedEnumerable<T> : IEnumerable<T>
{
class ChildEnumerator : IEnumerator<T>
{
ChunkedEnumerable<T> parent;
int position;
bool done = false;
T current;
public ChildEnumerator(ChunkedEnumerable<T> parent)
{
this.parent = parent;
position = -1;
parent.wrapper.AddRef();
}
public T Current
{
get
{
if (position == -1 || done)
{
throw new InvalidOperationException();
}
return current;
}
}
public void Dispose()
{
if (!done)
{
done = true;
parent.wrapper.RemoveRef();
}
}
object System.Collections.IEnumerator.Current
{
get { return Current; }
}
public bool MoveNext()
{
position++;
if (position + 1 > parent.chunkSize)
{
done = true;
}
if (!done)
{
done = !parent.wrapper.Get(position + parent.start, out current);
}
return !done;
}
public void Reset()
{
// per http://msdn.microsoft.com/en-us/library/system.collections.ienumerator.reset.aspx
throw new NotSupportedException();
}
}
EnumeratorWrapper<T> wrapper;
int chunkSize;
int start;
public ChunkedEnumerable(EnumeratorWrapper<T> wrapper, int chunkSize, int start)
{
this.wrapper = wrapper;
this.chunkSize = chunkSize;
this.start = start;
}
public IEnumerator<T> GetEnumerator()
{
return new ChildEnumerator(this);
}
System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
private class EnumeratorWrapper<T>
{
public EnumeratorWrapper(IEnumerable<T> source)
{
SourceEumerable = source;
}
IEnumerable<T> SourceEumerable { get; set; }
Enumeration currentEnumeration;
class Enumeration
{
public IEnumerator<T> Source { get; set; }
public int Position { get; set; }
public bool AtEnd { get; set; }
}
public bool Get(int pos, out T item)
{
if (currentEnumeration != null && currentEnumeration.Position > pos)
{
currentEnumeration.Source.Dispose();
currentEnumeration = null;
}
if (currentEnumeration == null)
{
currentEnumeration = new Enumeration { Position = -1, Source = SourceEumerable.GetEnumerator(), AtEnd = false };
}
item = default(T);
if (currentEnumeration.AtEnd)
{
return false;
}
while (currentEnumeration.Position < pos)
{
currentEnumeration.AtEnd = !currentEnumeration.Source.MoveNext();
currentEnumeration.Position++;
if (currentEnumeration.AtEnd)
{
return false;
}
}
item = currentEnumeration.Source.Current;
return true;
}
int refs = 0;
// needed for dispose semantics
public void AddRef()
{
refs++;
}
public void RemoveRef()
{
refs--;
if (refs == 0 && currentEnumeration != null)
{
var copy = currentEnumeration;
currentEnumeration = null;
copy.Source.Dispose();
}
}
}
/// <summary>Speed Checked. Works Great!</summary>
public static IEnumerable<IEnumerable<T>> Chunk<T>(this IEnumerable<T> source, int chunksize)
{
if (chunksize < 1) throw new InvalidOperationException();
var wrapper = new EnumeratorWrapper<T>(source);
int currentPos = 0;
T ignore;
try
{
wrapper.AddRef();
while (wrapper.Get(currentPos, out ignore))
{
yield return new ChunkedEnumerable<T>(wrapper, chunksize, currentPos);
currentPos += chunksize;
}
}
finally
{
wrapper.RemoveRef();
}
}
#endregion
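For what it's worth, if you are on .NET 6 or later, System.Linq has a built-in Enumerable.Chunk, so a hand-rolled extension may not be needed. A minimal sketch, assuming .NET 6+ (the wrapper class and method names are mine):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BuiltInChunkExample
{
    // Assumes .NET 6 or later, where Enumerable.Chunk is part of System.Linq.
    public static List<string> SplitIntoChunks(string source, int chunkSize)
    {
        return source.Chunk(chunkSize)                    // IEnumerable<char[]>
                     .Select(piece => new string(piece))  // each char[] back to a string
                     .ToList();
    }
}
```

The last chunk is simply shorter when the length is not a multiple of chunkSize. Note that having a custom Chunk extension with the same signature in scope can make the call ambiguous, so you would probably keep only one of the two.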
class StringHelper
{
static void Main(string[] args)
{
string str = "Hi my name is vikas bansal and my email id is bansal.vks#gmail.com";
int offSet = 10;
List<string> chunks = chunkMyStr(str, offSet);
Console.Read();
}
static List<string> chunkMyStr(string str, int offSet)
{
List<string> resultChunks = new List<string>();
for (int i = 0; i < str.Length; i += offSet)
{
string temp = str.Substring(i, (str.Length - i) > offSet ? offSet : (str.Length - i));
Console.WriteLine(temp);
resultChunks.Add(temp);
}
return resultChunks;
}
}

Get permutation of specific characters in strings

Given a string like "N00MNM" I need all permutations of zero '0' char inside the string maintaining all other chars in fixed order.
The result must be:
"N0M0NM" "N0MN0M" "N0MNM0" "NM00NM" "NM0N0M" "NM0NM0" "NMN0M0" "NMNM00"
"0N0MNM" "0NM0NM" "0NMN0M" "0NMNM0"
A standard permutation function takes too much time to do that work (we are talking about 1500 ms), and the strings to test are longer than the sample one.
Is there an algorithm for this?
You can do this by choosing every set of distinct positions at which the character (0 in this case) can be inserted, and then also inserting the whole run of that character (00 in this case) at each position of the string. The positions are taken from the string with all occurrences of 0 removed. The code below does it:
public static IEnumerable<string> Combs(string str, char c)
{
int count = str.Count(_c => _c == c);
string _str = new string(str.Where(_c => _c != c).ToArray());
// Compute all combinations with different positions
foreach (var positions in GetPositionsSets(0, _str.Length, count))
{
StringBuilder _b = new StringBuilder();
int index = 0;
foreach (var _char in _str)
{
if (positions.Contains(index))
{ _b.Append($"{c}{_char}"); }
else
{ _b.Append(_char); }
index++;
}
if (positions.Contains(index))
_b.Append(c);
yield return _b.ToString();
}
//Compute the remaining combinations, i.e. those in which all the
//supplied characters sit together at a single position.
string p = new string(c, count);
for (int i = 0; i < _str.Length; i++)
{
yield return _str.Insert(i, p);
}
yield return _str + p;
}
//Gets all possible position sets that can be obtained from minPos
//to maxPos with positionsCount positions, that is, C(n,k)
//where n = maxPos - minPos + 1 and k = positionsCount
private static IEnumerable<HashSet<int>> GetPositionsSets(int minPos, int maxPos, int positionsCount)
{
if (positionsCount == 0)
{
yield return new HashSet<int>();
// Nothing left to place; stop here instead of recursing with a negative count.
yield break;
}
for (int i = minPos; i <= maxPos; i++)
{
foreach (var positions in GetPositionsSets(i + 1, maxPos, positionsCount - 1))
{
positions.Add(i);
yield return positions;
}
}
}
The output of the code above for "N00MNM" is:
0N0MNM
0NM0NM
0NMN0M
0NMNM0
N0M0NM
N0MN0M
N0MNM0
NM0N0M
NM0NM0
NMN0M0
00NMNM
N00MNM
NM00NM
NMN00M
NMNM00
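The same idea can also be sketched in one pass: choose which k of the n output positions hold the character, then fill the remaining positions with the other characters in their original order. This is only a sketch with names I made up (FixedOrderPlacements, Placements, Build); it yields the same 15 strings for "N00MNM" but in a different order from the listing above, starting with 00NMNM.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FixedOrderPlacements
{
    // Hypothetical helper: yields every string obtained by moving the
    // occurrences of c around while the other characters keep their order.
    public static IEnumerable<string> Placements(string str, char c)
    {
        char[] others = str.Where(ch => ch != c).ToArray();
        int n = str.Length;
        int k = n - others.Length;   // number of occurrences of c

        // positions[i] is the output index that receives the i-th copy of c.
        int[] positions = Enumerable.Range(0, k).ToArray();
        while (true)
        {
            yield return Build(n, positions, others, c);

            // Advance to the next k-subset of {0..n-1} in lexicographic order.
            int i = k - 1;
            while (i >= 0 && positions[i] == n - k + i) i--;
            if (i < 0) yield break;
            positions[i]++;
            for (int j = i + 1; j < k; j++) positions[j] = positions[j - 1] + 1;
        }
    }

    private static string Build(int n, int[] positions, char[] others, char c)
    {
        var result = new char[n];
        int p = 0, o = 0;
        for (int i = 0; i < n; i++)
        {
            if (p < positions.Length && positions[p] == i) { result[i] = c; p++; }
            else { result[i] = others[o++]; }
        }
        return new string(result);
    }
}
```

Because each result is built directly from a set of positions, no duplicates are produced and there is no separate step for the case where the characters sit next to each other.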

how to get all combination of an arraylist?

I have an ArrayList of strings, "abcde".
I want a method that returns another ArrayList with all the possible combinations of the given ArrayList (e.g. ab, ac, ad...) in C#.
Does anyone know a simple method?
NB: all possible combinations of length 2; it would be better if the length were variable (could be changed).
Regarding your comment requiring combinations of length two:
string s = "abcde";
var combinations = from c in s
from d in s.Remove(s.IndexOf(c), 1)
select new string(new[] { c, d });
foreach (var combination in combinations) {
Console.WriteLine(combination);
}
Responding to your edit for any length:
static IEnumerable<string> GetCombinations(string s, int length) {
Guard.Against<ArgumentNullException>(s == null);
if (length > s.Length || length == 0) {
return new[] { String.Empty };
}
if (length == 1) {
return s.Select(c => new string(new[] { c }));
}
return from c in s
from combination in GetCombinations(
s.Remove(s.IndexOf(c), 1),
length - 1
)
select c + combination;
}
Usage:
string s = "abcde";
var combinations = GetCombinations(s, 3);
Console.WriteLine(String.Join(", ", combinations));
Output:
abc, abd, abe, acb, acd, ace, adb, adc, ade, aeb, aec, aed, bac, bad, bae, bca,
bcd, bce, bda, bdc, bde, bea, bec, bed, cab, cad, cae, cba, cbd, cbe, cda, cdb,
cde, cea, ceb, ced, dab, dac, dae, dba, dbc, dbe, dca, dcb, dce, dea, deb, dec,
eab, eac, ead, eba, ebc, ebd, eca, ecb, ecd, eda, edb, edc
Here is my generic function which can return all the combinations of type T:
static IEnumerable<IEnumerable<T>> GetCombinations<T>(IEnumerable<T> list, int length)
{
if (length == 1) return list.Select(t => new T[] { t });
return GetCombinations(list, length - 1)
.SelectMany(t => list, (t1, t2) => t1.Concat(new T[] { t2 }));
}
Usage:
Console.WriteLine(
string.Join(", ",
GetCombinations("abcde".ToCharArray(), 2).Select(list => string.Join("", list))
)
);
Output:
aa, ab, ac, ad, ae, ba, bb, bc, bd, be, ca, cb, cc, cd, ce, da, db, dc, dd, de, ea, eb, ec, ed, ee
UPDATED
Please see my answer here for other scenarios, e.g. permutations and k-combinations etc.
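Since the question's example (ab, ac, ad...) looks like combinations where order does not matter and nothing repeats, here is a rough sketch of that k-combination variant. The names KCombinations and Choose are made up for illustration, and this is not the code from the linked answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class KCombinations
{
    // Hypothetical helper: picks `length` items by increasing index, so "ab"
    // is produced but "ba" and "aa" are not.
    public static IEnumerable<IEnumerable<T>> Choose<T>(IReadOnlyList<T> items, int length, int start = 0)
    {
        if (length < 0) throw new ArgumentOutOfRangeException(nameof(length));
        if (length == 0)
        {
            yield return Enumerable.Empty<T>();
            yield break;
        }

        // Leave enough items after index i to fill the remaining slots.
        for (int i = start; i <= items.Count - length; i++)
        {
            foreach (var tail in Choose(items, length - 1, i + 1))
            {
                yield return new[] { items[i] }.Concat(tail);
            }
        }
    }
}
```

Usage would be something like string.Join(", ", KCombinations.Choose("abcde".ToCharArray(), 2).Select(c => string.Concat(c))), which should give ab, ac, ad, ae, bc, bd, be, cd, ce, de.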
Combination of numbers in array by using arrays only and recursion:
static int n = 4;
int[] baseArr = { 1, 2, 3, 4 };
int[] LockNums;
static void Main(string[] args)
{
int len = baseArr.Length;
LockNums = new int[n];
for (int i = 0; i < n; i++)
{
int num = baseArr[i];
DoCombinations(num, baseArr, len);
//for more than 4 numbers the print screen is too long if we need to check the result next line will help
//Console.ReadLine();
}
}
private void DoCombinations(int lockNum, int[] arr, int arrLen )
{
int n1 = arr.Length;
// next line shows the difference in length between the previous and its previous array
int point = arrLen - n1;
LockNums[n - arr.Length] = lockNum;
int[] tempArr = new int[arr.Length - 1];
FillTempArr(lockNum, arr, tempArr);
//next condition will print the last number from the current combination
if (arr.Length == 1)
{
Console.Write(" {0}", lockNum);
Console.WriteLine();
}
for (int i = 0; i < tempArr.Length; i++)
{
if ((point == 1) && (i != 0))
{
//without this code the program will fail to print the leading number of the next combination
//and 'point' is the exact moment when this code has to be executed
PrintFirstNums(baseArr.Length - n1);
}
Console.Write(" {0}", lockNum);
int num1 = tempArr[i];
DoCombinations(num1, tempArr, n1);
}
}
private void PrintFirstNums(int missNums)
{
for (int i = 0; i < missNums; i++)
{
Console.Write(" {0}", LockNums[i]);
}
}
private void FillTempArr(int lockN, int[] arr, int[] tempArr)
{
int idx = 0;
foreach (int number in arr)
{
if (number != lockN)
{
tempArr[idx++] = number;
}
}
}
private void PrintResult(int[] arr)
{
foreach (int num in arr)
{
Console.Write(" {0}", num);
}
}

calculate number of repetition of character in string in c#

How can I calculate the number of times a character is repeated in a string in C#?
For example, if I have "sasysays", the number of repetitions of the character 's' is 4.
Here is a version using LINQ (written using extension methods):
int s = str.Where(c => c == 's').Count();
This uses the fact that string is IEnumerable<char>, so we can filter all characters that are equal to the one you're looking for and then count the number of selected elements. In fact, you can write just this (because the Count method allows you to specify a predicate that should hold for all counted elements):
int s = str.Count(c => c == 's');
Another option is:
int numberOfS = str.Count('s'.Equals);
This is a little backwards - 's' is a char, and every char has an Equals method, which can be used as the argument for Count.
Of course, this is less flexible than c => c == 's' - you cannot trivially change it to a complex condition.
s.Where(c => c == 's').Count()
given s is a string and you are looking for 's'
for(int i=0; i < str.Length; i++) {
if(str[i] == myChar) {
charCount++;
}
}
A more general solution, to count the number of occurrences of all characters:
var charFrequencies = new Dictionary<char, int>();
foreach(char c in s)
{
int n;
charFrequencies.TryGetValue(c, out n);
n++;
charFrequencies[c] = n;
}
Console.WriteLine("There are {0} instances of 's' in the string", charFrequencies['s']);
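The same frequency table can probably be written more compactly with LINQ's GroupBy. A small sketch (the class and method names are made up):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CharFrequency
{
    // Hypothetical helper: maps each distinct character to how often it occurs.
    public static Dictionary<char, int> Count(string s)
    {
        return s.GroupBy(c => c)
                .ToDictionary(g => g.Key, g => g.Count());
    }
}
```

With "sasysays" this gives 4 for 's' and 2 each for 'a' and 'y'. As with the loop version, indexing the dictionary with a character that never occurs throws KeyNotFoundException, so TryGetValue is the safer way to read a count.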
string s = "sasysays ";
List<char> list = s.ToList<char>();
int numberOfChar = list.Count<char>(c => c=='s');
Try this code :
namespace Count_char
{
class Program
{
static void Main(string[] args)
{
string s1 = Convert.ToString(Console.ReadLine());
for (int i = 97; i < 123; i++)
{
string s2 = Convert.ToString(Convert.ToChar(i));
CountStringOccurrences(s1, s2);
}
Console.ReadLine();
}
public static void CountStringOccurrences(string text, string pattern)
{
int count = 0;
int i = 0;
while ((i = text.IndexOf(pattern, i)) != -1)
{
i += pattern.Length;
count++;
}
if (count != 0)
{
Console.WriteLine("{0}-->{1}", pattern, count);
}
}
}
}
