I have a question about iterating through the alphabet.
I would like a loop that begins with "a" and ends with "z". After that, it should continue from "aa" to "az", then from "ba" to "bz", and so on.
Does anybody know a solution?
Thanks
EDIT: I forgot to mention that if I give the function "a", it must return "b". If you give it "bnc", it must return "bnd".
First effort, with just a-z then aa-zz
public static IEnumerable<string> GetExcelColumns()
{
for (char c = 'a'; c <= 'z'; c++)
{
yield return c.ToString();
}
char[] chars = new char[2];
for (char high = 'a'; high <= 'z'; high++)
{
chars[0] = high;
for (char low = 'a'; low <= 'z'; low++)
{
chars[1] = low;
yield return new string(chars);
}
}
}
Note that this will stop at 'zz'. Of course, there's some ugly duplication here in terms of the loops. Fortunately, that's easy to fix - and it can be even more flexible, too:
Second attempt: more flexible alphabet
private const string Alphabet = "abcdefghijklmnopqrstuvwxyz";
public static IEnumerable<string> GetExcelColumns()
{
return GetExcelColumns(Alphabet);
}
public static IEnumerable<string> GetExcelColumns(string alphabet)
{
foreach(char c in alphabet)
{
yield return c.ToString();
}
char[] chars = new char[2];
foreach(char high in alphabet)
{
chars[0] = high;
foreach(char low in alphabet)
{
chars[1] = low;
yield return new string(chars);
}
}
}
Now if you want to generate just a, b, c, d, aa, ab, ac, ad, ba, ... you'd call GetExcelColumns("abcd").
Third attempt (revised further) - infinite sequence
public static IEnumerable<string> GetExcelColumns(string alphabet)
{
int length = 0;
char[] chars = null;
int[] indexes = null;
while (true)
{
int position = length-1;
// Try to increment the least significant
// value.
while (position >= 0)
{
indexes[position]++;
if (indexes[position] == alphabet.Length)
{
for (int i=position; i < length; i++)
{
indexes[i] = 0;
chars[i] = alphabet[0];
}
position--;
}
else
{
chars[position] = alphabet[indexes[position]];
break;
}
}
// If we got all the way to the start of the array,
// we need an extra value
if (position == -1)
{
length++;
chars = new char[length];
indexes = new int[length];
for (int i=0; i < length; i++)
{
chars[i] = alphabet[0];
}
}
yield return new string(chars);
}
}
It's possible that it would be cleaner code using recursion, but it wouldn't be as efficient.
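For comparison, here is a minimal sketch of what a recursive version might look like. The names `RecursiveColumns`, `OfLength` and `GetColumns` are made up for this illustration, and the sketch assumes a non-empty alphabet. Each level of recursion wraps another iterator, which is why it would not be expected to be as efficient as the loop above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class RecursiveColumns
{
    // All strings of exactly `length` characters, in alphabet order.
    static IEnumerable<string> OfLength(string alphabet, int length)
    {
        if (length == 0)
        {
            yield return "";
            yield break;
        }
        // Extend every shorter string by each letter in turn.
        foreach (string prefix in OfLength(alphabet, length - 1))
            foreach (char c in alphabet)
                yield return prefix + c;
    }

    // Infinite sequence: all 1-letter strings, then all 2-letter strings, and so on.
    // Assumes alphabet is non-empty; an empty alphabet would loop forever.
    public static IEnumerable<string> GetColumns(string alphabet)
    {
        for (int length = 1; ; length++)
            foreach (string s in OfLength(alphabet, length))
                yield return s;
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", GetColumns("ab").Take(6)));
        // a, b, aa, ab, ba, bb
    }
}
```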
Note that if you want to stop at a certain point, you can just use LINQ:
var query = GetExcelColumns().TakeWhile(x => x != "zzz");
"Restarting" the iterator
To restart the iterator from a given point, you could indeed use SkipWhile as suggested by thesoftwarejedi. That's fairly inefficient, of course. If you're able to keep any state between calls, you can just keep the iterator (for either solution):
using (IEnumerator<string> iterator = GetExcelColumns().GetEnumerator())
{
iterator.MoveNext();
string firstAttempt = iterator.Current;
if (someCondition)
{
iterator.MoveNext();
string secondAttempt = iterator.Current;
// etc
}
}
Alternatively, you may well be able to structure your code to use a foreach anyway, just breaking out on the first value you can actually use.
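A minimal sketch of that foreach approach, assuming a compact generator equivalent to the second attempt. `FirstUsable` and the `isUsable` predicate are hypothetical names standing in for whatever check decides that a value can actually be used.

```csharp
using System;
using System.Collections.Generic;

static class ForeachDemo
{
    // Compact generator for a..z, then aa..zz (same sequence as the second attempt).
    public static IEnumerable<string> GetExcelColumns(string alphabet = "abcdefghijklmnopqrstuvwxyz")
    {
        foreach (char c in alphabet)
            yield return c.ToString();
        foreach (char high in alphabet)
            foreach (char low in alphabet)
                yield return new string(new[] { high, low });
    }

    // Returns the first generated name accepted by the predicate, or null if none is.
    public static string FirstUsable(Func<string, bool> isUsable)
    {
        foreach (string name in GetExcelColumns())
        {
            if (isUsable(name))
                return name; // leaving the loop early also disposes the iterator
        }
        return null;
    }

    static void Main()
    {
        var taken = new HashSet<string> { "a", "b", "c" };
        Console.WriteLine(FirstUsable(name => !taken.Contains(name))); // d
    }
}
```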
Edit: Made it do exactly as the OP's latest edit wants
This is the simplest solution, and tested:
static void Main(string[] args)
{
Console.WriteLine(GetNextBase26("a"));
Console.WriteLine(GetNextBase26("bnc"));
}
private static string GetNextBase26(string a)
{
return Base26Sequence().SkipWhile(x => x != a).Skip(1).First();
}
private static IEnumerable<string> Base26Sequence()
{
long i = 0L;
while (true)
yield return Base26Encode(i++);
}
private static char[] base26Chars = "abcdefghijklmnopqrstuvwxyz".ToCharArray();
private static string Base26Encode(Int64 value)
{
string returnValue = null;
do
{
returnValue = base26Chars[value % 26] + returnValue;
value /= 26;
} while (value-- != 0);
return returnValue;
}
The following populates a list with the required strings:
List<string> result = new List<string>();
for (char ch = 'a'; ch <= 'z'; ch++){
result.Add (ch.ToString());
}
for (char i = 'a'; i <= 'z'; i++)
{
for (char j = 'a'; j <= 'z'; j++)
{
result.Add (i.ToString() + j.ToString());
}
}
I know there are plenty of answers here, and one's been accepted, but IMO they all make it harder than it needs to be. I think the following is simpler and cleaner:
static string NextColumn(string column){
char[] c = column.ToCharArray();
for(int i = c.Length - 1; i >= 0; i--){
if(char.ToUpper(c[i]++) < 'Z')
break;
c[i] -= (char)26;
if(i == 0)
return "A" + new string(c);
}
return new string(c);
}
Note that this doesn't do any input validation. If you don't trust your callers, you should add an IsNullOrEmpty check at the beginning, and a c[i] >= 'A' && c[i] <= 'Z' || c[i] >= 'a' && c[i] <= 'z' check at the top of the loop. Or just leave it be and let it be GIGO.
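If you do want the validation, one way it might look is sketched below. This is a variation rather than the exact check described above: it validates every character up front, so characters the carry loop never visits are checked too. The class name `ColumnDemo` and the exception messages are just for illustration.

```csharp
using System;

static class ColumnDemo
{
    public static string NextColumn(string column)
    {
        if (string.IsNullOrEmpty(column))
            throw new ArgumentException("Column name must not be null or empty.", nameof(column));

        // Reject anything outside A-Z / a-z before touching the string.
        foreach (char ch in column)
        {
            bool isLetter = (ch >= 'A' && ch <= 'Z') || (ch >= 'a' && ch <= 'z');
            if (!isLetter)
                throw new ArgumentException("Column name may only contain A-Z or a-z.", nameof(column));
        }

        char[] c = column.ToCharArray();
        for (int i = c.Length - 1; i >= 0; i--)
        {
            if (char.ToUpper(c[i]++) < 'Z')
                break;            // no carry needed, done
            c[i] -= (char)26;     // wrap Z -> A (or z -> a) and carry to the left
            if (i == 0)
                return "A" + new string(c); // carried past the first letter
        }
        return new string(c);
    }

    static void Main()
    {
        Console.WriteLine(NextColumn("bnc")); // bnd
        Console.WriteLine(NextColumn("ZZ"));  // AAA
    }
}
```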
You may also find use for these companion functions:
static string GetColumnName(int index){
StringBuilder txt = new StringBuilder();
txt.Append((char)('A' + index % 26));
//txt.Append((char)('A' + --index % 26));
while((index /= 26) > 0)
txt.Insert(0, (char)('A' + --index % 26));
return txt.ToString();
}
static int GetColumnIndex(string name){
int rtn = 0;
foreach(char c in name)
rtn = rtn * 26 + (char.ToUpper(c) - '@');
return rtn - 1;
//return rtn;
}
These two functions are zero-based. That is, "A" = 0, "Z" = 25, "AA" = 26, etc. To make them one-based (like Excel's COM interface), remove the line above the commented line in each function, and uncomment those lines.
As with the NextColumn function, these functions don't validate their inputs. Both will give you garbage if that's what they get.
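A quick round-trip check of the zero-based versions can be sketched like this. It assumes the offset in GetColumnIndex is '@', the character immediately before 'A', so that "A" maps to 1 before the final subtraction. The class name `ColumnIndexDemo` is only for this example.

```csharp
using System;
using System.Text;

static class ColumnIndexDemo
{
    // Zero-based index -> column name ("A" = 0, "Z" = 25, "AA" = 26).
    public static string GetColumnName(int index)
    {
        var txt = new StringBuilder();
        txt.Append((char)('A' + index % 26));
        while ((index /= 26) > 0)
            txt.Insert(0, (char)('A' + --index % 26));
        return txt.ToString();
    }

    // Column name -> zero-based index; '@' is the character just before 'A'.
    public static int GetColumnIndex(string name)
    {
        int rtn = 0;
        foreach (char c in name)
            rtn = rtn * 26 + (char.ToUpper(c) - '@');
        return rtn - 1;
    }

    static void Main()
    {
        // Every index should survive a trip through both functions.
        for (int i = 0; i < 20000; i++)
        {
            if (GetColumnIndex(GetColumnName(i)) != i)
                throw new InvalidOperationException("Round trip failed at " + i);
        }
        Console.WriteLine(GetColumnName(26));    // AA
        Console.WriteLine(GetColumnIndex("ZZ")); // 701
    }
}
```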
Here’s what I came up with.
/// <summary>
/// Return an incremented alphabetical string
/// </summary>
/// <param name="letter">The string to be incremented</param>
/// <returns>the incremented string</returns>
public static string NextLetter(string letter)
{
const string alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
if (!string.IsNullOrEmpty(letter))
{
char lastLetterInString = letter[letter.Length - 1];
// if the last letter in the string is the last letter of the alphabet
if (alphabet.IndexOf(lastLetterInString) == alphabet.Length - 1)
{
//replace the last letter in the string with the first letter of the alphabet and get the next letter for the rest of the string
return NextLetter(letter.Substring(0, letter.Length - 1)) + alphabet[0];
}
else
{
// replace the last letter in the string with the next letter of the alphabet
return letter.Remove(letter.Length-1).Insert(letter.Length-1, (alphabet[alphabet.IndexOf(letter[letter.Length-1])+1]).ToString() );
}
}
//return the first letter of the alphabet
return alphabet[0].ToString();
}
Just curious, why not just:
private string alphRecursive(int c) {
var alphabet = "abcdefghijklmnopqrstuvwxyz".ToCharArray();
if (c >= alphabet.Length) {
// Subtract 1 because there is no zero digit; without it, 26 would give "ba" rather than "aa".
return alphRecursive(c/alphabet.Length - 1) + alphabet[c%alphabet.Length];
} else {
return "" + alphabet[c%alphabet.Length];
}
}
This is like displaying an int, only using base 26 instead of base 10, except that there is no zero digit, so each step subtracts one after dividing. Try the following algorithm to find the nth (zero-based) entry of the sequence:
s = '';
do {
s = alphabet[n mod 26] + s;
n = (n div 26) - 1;
} while (n >= 0);
Of course, if you want the first n entries, this is not the most efficient solution. In this case, try something like daniel's solution.
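The base-26 idea can be sketched in C# roughly as follows. The names `Base26Demo` and `NthEntry` are made up for this example, and n is taken to be zero-based ("a" is entry 0). The one wrinkle compared with ordinary base conversion is the `- 1` after each division, which accounts for the missing zero digit.

```csharp
using System;
using System.Text;

static class Base26Demo
{
    const string Alphabet = "abcdefghijklmnopqrstuvwxyz";

    // Returns the nth entry (zero-based) of a, b, ..., z, aa, ab, ...
    public static string NthEntry(long n)
    {
        if (n < 0)
            throw new ArgumentOutOfRangeException(nameof(n));

        var sb = new StringBuilder();
        do
        {
            // Least significant letter first, inserted at the front.
            sb.Insert(0, Alphabet[(int)(n % 26)]);
            // No zero digit exists, so shift down by one after dividing.
            n = n / 26 - 1;
        } while (n >= 0);
        return sb.ToString();
    }

    static void Main()
    {
        Console.WriteLine(NthEntry(0));   // a
        Console.WriteLine(NthEntry(25));  // z
        Console.WriteLine(NthEntry(26));  // aa
        Console.WriteLine(NthEntry(702)); // aaa
    }
}
```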
I gave this a go and came up with this:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace Alphabetty
{
class Program
{
const string alphabet = "abcdefghijklmnopqrstuvwxyz";
static int cursor = 0;
static int prefixCursor;
static string prefix = string.Empty;
static bool done = false;
static void Main(string[] args)
{
string s = string.Empty;
while (s != "Done")
{
s = GetNextString();
Console.WriteLine(s);
}
Console.ReadKey();
}
static string GetNextString()
{
if (done) return "Done";
char? nextLetter = GetNextLetter(ref cursor);
if (nextLetter == null)
{
char? nextPrefixLetter = GetNextLetter(ref prefixCursor);
if(nextPrefixLetter == null)
{
done = true;
return "Done";
}
prefix = nextPrefixLetter.Value.ToString();
nextLetter = GetNextLetter(ref cursor);
}
return prefix + nextLetter;
}
static char? GetNextLetter(ref int letterCursor)
{
if (letterCursor == alphabet.Length)
{
letterCursor = 0;
return null;
}
char c = alphabet[letterCursor];
letterCursor++;
return c;
}
}
}
Here is something I had cooked up that may be similar. I was experimenting with iteration counts in order to design a numbering schema that was as small as possible, yet gave me enough uniqueness.
I knew that each time I added an alpha character, it would increase the possibilities 26x, but I wasn't sure how many letters or numbers, or what pattern, I wanted to use.
That led me to the code below. Basically you pass it an alphanumeric string, and every position that has a letter will eventually increment to "z\Z" and every position that has a number will eventually increment to "9".
So you can call it one of two ways:
//This would give you the next iteration... (H3reIsaStup4dExamplf)
string myNextValue = IncrementAlphaNumericValue("H3reIsaStup4dExample");
//Or loop it, eventually ending at "Z9zzZzzZzzz9zZzzzzzz"
string myNextValue = "H3reIsaStup4dExample";
while (myNextValue != null)
{
myNextValue = IncrementAlphaNumericValue(myNextValue);
//And of course do something with this like write it out
}
(For me, I was doing something like "1AA000")
public string IncrementAlphaNumericValue(string Value)
{
//We only allow Characters a-z, A-Z, 0-9
if (System.Text.RegularExpressions.Regex.IsMatch(Value, "^[a-zA-Z0-9]+$") == false)
{
throw new Exception("Invalid Character: Must be a-Z or 0-9");
}
//We work with each Character so it's best to convert the string to a char array for incrementing
char[] myCharacterArray = Value.ToCharArray();
//So what we do here is step backwards through the Characters and increment the first one we can.
for (Int32 myCharIndex = myCharacterArray.Length - 1; myCharIndex >= 0; myCharIndex--)
{
//Converts the Character to its ASCII value
Int32 myCharValue = Convert.ToInt32(myCharacterArray[myCharIndex]);
//We only Increment this Character Position if it is not already at its Max value (Z = 90, z = 122, 9 = 57)
if (myCharValue != 57 && myCharValue != 90 && myCharValue != 122)
{
myCharacterArray[myCharIndex]++;
//Now that we have Incremented the Character, we "reset" all the values to the right of it
for (Int32 myResetIndex = myCharIndex + 1; myResetIndex < myCharacterArray.Length; myResetIndex++)
{
myCharValue = Convert.ToInt32(myCharacterArray[myResetIndex]);
if (myCharValue >= 65 && myCharValue <= 90)
{
myCharacterArray[myResetIndex] = 'A';
}
else if (myCharValue >= 97 && myCharValue <= 122)
{
myCharacterArray[myResetIndex] = 'a';
}
else if (myCharValue >= 48 && myCharValue <= 57)
{
myCharacterArray[myResetIndex] = '0';
}
}
//Now we just return the new Value
return new string(myCharacterArray);
}
}
//If we got through the Character Loop and were not able to increment anything, we return a NULL.
return null;
}
Here's my attempt using recursion:
public static void PrintAlphabet(string alphabet, string prefix)
{
for (int i = 0; i < alphabet.Length; i++) {
Console.WriteLine(prefix + alphabet[i].ToString());
}
if (prefix.Length < alphabet.Length - 1) {
for (int i = 0; i < alphabet.Length; i++) {
PrintAlphabet(alphabet, prefix + alphabet[i]);
}
}
}
Then simply call PrintAlphabet("abcd", "");
I wonder what would be the best way to format numbers so that the NumberGroupSeparator would work not only on the integer part to the left of the comma, but also on the fractional part, on the right of the comma.
Math.PI.ToString("###,###,##0.0##,###,###,###") // As documented ..
// ..this doesn't work
3.14159265358979 // result
3.141,592,653,589,79 // desired result
As documented on MSDN the NumberGroupSeparator works only to the left of the comma. I wonder why??
A little clunky, and it won't work for scientific numbers but here is a try:
class Program
{
static void Main(string[] args)
{
var π=Math.PI*10000;
Debug.WriteLine(Display(π));
// 31,415.926,535,897,931,899
}
static string Display(double x)
{
int s=Math.Sign(x);
x=Math.Abs(x);
StringBuilder text=new StringBuilder();
var y=Math.Truncate(x);
text.Append((s*y).ToString("#,#"));
x-=y;
if (x>0)
{
// 15 decimal places is max reasonable precision
y=Math.Truncate(x*Math.Pow(10, 15));
text.Append(".");
text.Append(y.ToString("#,#").TrimEnd('0'));
}
return text.ToString();
}
}
It might be best to work with the string generated by your .ToString():
class Program
{
static string InsertSeparators(string s)
{
string decSeparator = System.Threading.Thread.CurrentThread.CurrentCulture.NumberFormat.NumberDecimalSeparator;
int separatorPos = s.IndexOf(decSeparator);
if (separatorPos >= 0)
{
string decPart = s.Substring(separatorPos + decSeparator.Length);
// split the string into parts of 3 or less characters
List<String> parts = new List<String>();
for (int i = 0; i < decPart.Length; i += 3)
{
string part = "";
for (int j = 0; (j < 3) && (i + j < decPart.Length); j++)
{
part += decPart[i + j];
}
parts.Add(part);
}
string groupSeparator = System.Threading.Thread.CurrentThread.CurrentCulture.NumberFormat.NumberGroupSeparator;
s = s.Substring(0, separatorPos) + decSeparator + String.Join(groupSeparator, parts);
}
return s;
}
static void Main(string[] args)
{
for (int n = 0; n < 15; n++)
{
string s = Math.PI.ToString("0." + new string('#', n));
Console.WriteLine(InsertSeparators(s));
}
Console.ReadLine();
}
}
Outputs:
3
3.1
3.14
3.142
3.141,6
3.141,59
3.141,593
3.141,592,7
3.141,592,65
3.141,592,654
3.141,592,653,6
3.141,592,653,59
3.141,592,653,59
3.141,592,653,589,8
3.141,592,653,589,79
OK, not my strong side, but I guess this may be my best bet:
string input = Math.PI.ToString();
string decSeparator = System.Threading.Thread.CurrentThread
.CurrentCulture.NumberFormat.NumberGroupSeparator;
Regex RX = new Regex(@"([0-9]{3})");
string result = RX.Replace(input , @"$1" + decSeparator);
Thanks for listening..
Suppose I had a string:
string str = "1111222233334444";
How can I break this string into chunks of some size?
e.g., breaking this into sizes of 4 would return strings:
"1111"
"2222"
"3333"
"4444"
static IEnumerable<string> Split(string str, int chunkSize)
{
return Enumerable.Range(0, str.Length / chunkSize)
.Select(i => str.Substring(i * chunkSize, chunkSize));
}
Please note that additional code might be required to gracefully handle edge cases (null or empty input string, chunkSize == 0, input string length not divisible by chunkSize, etc.). The original question doesn't specify any requirements for these edge cases and in real life the requirements might vary so they are out of scope of this answer.
In a combination of dove+Konstatin's answers...
static IEnumerable<string> WholeChunks(string str, int chunkSize) {
for (int i = 0; i < str.Length; i += chunkSize)
yield return str.Substring(i, chunkSize);
}
This will work for all strings that can be split into a whole number of chunks, and will throw an exception otherwise.
If you want to support strings of any length you could use the following code:
static IEnumerable<string> ChunksUpto(string str, int maxChunkSize) {
for (int i = 0; i < str.Length; i += maxChunkSize)
yield return str.Substring(i, Math.Min(maxChunkSize, str.Length-i));
}
However, the OP explicitly stated he does not need this; it's somewhat longer, harder to read, and slightly slower. In the spirit of KISS and YAGNI, I'd go with the first option: it's probably the most efficient implementation possible, and it's very short, readable, and, importantly, throws an exception for nonconforming input.
Why not loops? Here's something that would do it quite well:
string str = "111122223333444455";
int chunkSize = 4;
int stringLength = str.Length;
for (int i = 0; i < stringLength ; i += chunkSize)
{
if (i + chunkSize > stringLength) chunkSize = stringLength - i;
Console.WriteLine(str.Substring(i, chunkSize));
}
Console.ReadLine();
I don't know how you'd want to deal with the case where the string length is not a multiple of 4. I'm not saying your idea is not possible, just wondering about the motivation for it if a simple for loop does it very well. Obviously the above could be cleaned up and even turned into an extension method.
Or as mentioned in comments, you know it's /4 then
str = "1111222233334444";
for (int i = 0; i < stringLength; i += chunkSize)
{Console.WriteLine(str.Substring(i, chunkSize));}
This is based on @dove's solution but implemented as an extension method.
Benefits:
Extension method
Covers corner cases
Splits string with any chars: numbers, letters, other symbols
Code
public static class EnumerableEx
{
public static IEnumerable<string> SplitBy(this string str, int chunkLength)
{
if (String.IsNullOrEmpty(str)) throw new ArgumentException();
if (chunkLength < 1) throw new ArgumentException();
for (int i = 0; i < str.Length; i += chunkLength)
{
if (chunkLength + i > str.Length)
chunkLength = str.Length - i;
yield return str.Substring(i, chunkLength);
}
}
}
Usage
var result = "bobjoecat".SplitBy(3); // bob, joe, cat
Unit tests removed for brevity (see previous revision)
Using regular expressions and Linq:
List<string> groups = (from Match m in Regex.Matches(str, @"\d{4}")
select m.Value).ToList();
I find this to be more readable, but it's just a personal opinion. It can also be a one-liner : ).
How's this for a one-liner?
List<string> result = new List<string>(Regex.Split(target, @"(?<=\G.{4})", RegexOptions.Singleline));
With this regex it doesn't matter if the last chunk is less than four characters, because it only ever looks at the characters behind it.
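One thing worth knowing about this pattern: when the input length is an exact multiple of the chunk size, the final zero-width match sits at the very end of the string, so Regex.Split returns a trailing empty string. A small sketch of one way to deal with that is below. The `Chunks` helper and the class name are hypothetical, and the filter assumes the input itself is non-empty text you want split.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static class RegexChunkDemo
{
    // Splits after every `size` characters and drops the empty tail, if any.
    public static List<string> Chunks(string target, int size)
    {
        string pattern = @"(?<=\G.{" + size + "})";
        return Regex.Split(target, pattern, RegexOptions.Singleline)
                    .Where(part => part.Length > 0)
                    .ToList();
    }

    static void Main()
    {
        Console.WriteLine(string.Join(" ", Chunks("1111222233334444", 4)));
        Console.WriteLine(string.Join(" ", Chunks("12345", 4)));
    }
}
```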
I'm sure this isn't the most efficient solution, but I just had to toss it out there.
Starting with .NET 6, we can also use the Chunk method:
var result = str
.Chunk(4)
.Select(x => new string(x))
.ToList();
I recently had to write something that accomplishes this at work, so I thought I would post my solution to this problem. As an added bonus, the functionality of this solution provides a way to split the string in the opposite direction and it does correctly handle unicode characters as previously mentioned by Marvin Pinto above. So, here it is:
using System;
using Extensions;
namespace TestCSharp
{
class Program
{
static void Main(string[] args)
{
string asciiStr = "This is a string.";
string unicodeStr = "これは文字列です。";
string[] array1 = asciiStr.Split(4);
string[] array2 = asciiStr.Split(-4);
string[] array3 = asciiStr.Split(7);
string[] array4 = asciiStr.Split(-7);
string[] array5 = unicodeStr.Split(5);
string[] array6 = unicodeStr.Split(-5);
}
}
}
namespace Extensions
{
public static class StringExtensions
{
/// <summary>Returns a string array that contains the substrings in this string that are separated by a given fixed length.</summary>
/// <param name="s">This string object.</param>
/// <param name="length">Size of each substring.
/// <para>CASE: length > 0 , RESULT: String is split from left to right.</para>
/// <para>CASE: length == 0 , RESULT: String is returned as the only entry in the array.</para>
/// <para>CASE: length &lt; 0 , RESULT: String is split from right to left.</para>
/// </param>
/// <returns>String array that has been split into substrings of equal length.</returns>
/// <example>
/// <code>
/// string s = "1234567890";
/// string[] a = s.Split(4); // a == { "1234", "5678", "90" }
/// </code>
/// </example>
public static string[] Split(this string s, int length)
{
System.Globalization.StringInfo str = new System.Globalization.StringInfo(s);
int lengthAbs = Math.Abs(length);
if (str == null || str.LengthInTextElements == 0 || lengthAbs == 0 || str.LengthInTextElements <= lengthAbs)
return new string[] { str.ToString() };
string[] array = new string[(str.LengthInTextElements % lengthAbs == 0 ? str.LengthInTextElements / lengthAbs: (str.LengthInTextElements / lengthAbs) + 1)];
if (length > 0)
for (int iStr = 0, iArray = 0; iStr < str.LengthInTextElements && iArray < array.Length; iStr += lengthAbs, iArray++)
array[iArray] = str.SubstringByTextElements(iStr, (str.LengthInTextElements - iStr < lengthAbs ? str.LengthInTextElements - iStr : lengthAbs));
else // if (length < 0)
for (int iStr = str.LengthInTextElements - 1, iArray = array.Length - 1; iStr >= 0 && iArray >= 0; iStr -= lengthAbs, iArray--)
array[iArray] = str.SubstringByTextElements((iStr - lengthAbs < 0 ? 0 : iStr - lengthAbs + 1), (iStr - lengthAbs < 0 ? iStr + 1 : lengthAbs));
return array;
}
}
}
Also, here is an image link to the results of running this code: http://i.imgur.com/16Iih.png
It's not pretty and it's not fast, but it works, it's a one-liner and it's LINQy:
List<string> a = text.Select((c, i) => new { Char = c, Index = i }).GroupBy(o => o.Index / 4).Select(g => new String(g.Select(o => o.Char).ToArray())).ToList();
This should be much faster and more efficient than using LINQ or other approaches used here.
public static IEnumerable<string> Splice(this string s, int spliceLength)
{
if (s == null)
throw new ArgumentNullException("s");
if (spliceLength < 1)
throw new ArgumentOutOfRangeException("spliceLength");
if (s.Length == 0)
yield break;
var start = 0;
for (var end = spliceLength; end < s.Length; end += spliceLength)
{
yield return s.Substring(start, spliceLength);
start = end;
}
yield return s.Substring(start);
}
You can use morelinq by Jon Skeet. Use Batch like:
string str = "1111222233334444";
int chunkSize = 4;
var chunks = str.Batch(chunkSize).Select(r => new String(r.ToArray()));
This will return 4 chunks for the string "1111222233334444". If the string length is less than or equal to the chunk size Batch will return the string as the only element of IEnumerable<string>
For output:
foreach (var chunk in chunks)
{
Console.WriteLine(chunk);
}
and it will give:
1111
2222
3333
4444
Personally I prefer my solution :-)
It handles:
String lengths that are a multiple of the chunk size.
String lengths that are NOT a multiple of the chunk size.
String lengths that are smaller than the chunk size.
NULL and empty strings (throws an exception).
Chunk sizes smaller than 1 (throws an exception).
It is implemented as an extension method, and it calculates the number of chunks it is going to generate beforehand. It checks the last chunk because, when the text length is not a multiple of the chunk size, that chunk needs to be shorter. Clean, short, easy to understand... and works!
public static string[] Split(this string value, int chunkSize)
{
if (string.IsNullOrEmpty(value)) throw new ArgumentException("The string cannot be null or empty.");
if (chunkSize < 1) throw new ArgumentException("The chunk size should be equal or greater than one.");
int remainder;
int divResult = Math.DivRem(value.Length, chunkSize, out remainder);
int numberOfChunks = remainder > 0 ? divResult + 1 : divResult;
var result = new string[numberOfChunks];
int i = 0;
while (i < numberOfChunks - 1)
{
result[i] = value.Substring(i * chunkSize, chunkSize);
i++;
}
int lastChunkSize = remainder > 0 ? remainder : chunkSize;
result[i] = value.Substring(i * chunkSize, lastChunkSize);
return result;
}
Simple and short:
// this means match a space or not a space (anything) up to 4 characters
var lines = Regex.Matches(str, @"[\s\S]{0,4}").Cast<Match>().Select(x => x.Value);
I know the question is years old, but here is an Rx implementation. It handles the length % chunkSize != 0 problem out of the box:
public static IEnumerable<string> Chunkify(this string input, int size)
{
if(size < 1)
throw new ArgumentException("size must be greater than 0");
return input.ToCharArray()
.ToObservable()
.Buffer(size)
.Select(x => new string(x.ToArray()))
.ToEnumerable();
}
public static IEnumerable<IEnumerable<T>> SplitEvery<T>(this IEnumerable<T> values, int n)
{
var ls = values.Take(n);
var rs = values.Skip(n);
return ls.Any() ?
Cons(ls, SplitEvery(rs, n)) :
Enumerable.Empty<IEnumerable<T>>();
}
public static IEnumerable<T> Cons<T>(T x, IEnumerable<T> xs)
{
yield return x;
foreach (var xi in xs)
yield return xi;
}
Best, easiest and generic answer :).
string originalString = "1111222233334444";
List<string> test = new List<string>();
int chunkSize = 4; // change 4 with the size of strings you want.
for (int i = 0; i < originalString.Length; i = i + chunkSize)
{
if (originalString.Length - i >= chunkSize)
test.Add(originalString.Substring(i, chunkSize));
else
test.Add(originalString.Substring(i,((originalString.Length - i))));
}
static IEnumerable<string> Split(string str, int chunkSize)
{
IEnumerable<string> retVal = Enumerable.Range(0, str.Length / chunkSize)
.Select(i => str.Substring(i * chunkSize, chunkSize));
if (str.Length % chunkSize > 0)
retVal = retVal.Append(str.Substring(str.Length / chunkSize * chunkSize, str.Length % chunkSize));
return retVal;
}
It correctly handles input string length not divisible by chunkSize.
Please note that additional code might be required to gracefully handle edge cases (null or empty input string, chunkSize == 0).
static IEnumerable<string> Split(string str, double chunkSize)
{
return Enumerable.Range(0, (int) Math.Ceiling(str.Length/chunkSize))
.Select(i => new string(str
.Skip(i * (int)chunkSize)
.Take((int)chunkSize)
.ToArray()));
}
and another approach:
using System;
using System.Collections.Generic;
using System.Linq;
public class Program
{
public static void Main()
{
var x = "Hello World";
foreach(var i in x.ChunkString(2)) Console.WriteLine(i);
}
}
public static class Ext{
public static IEnumerable<string> ChunkString(this string val, int chunkSize){
return val.Select((x,i) => new {Index = i, Value = x})
.GroupBy(x => x.Index/chunkSize, x => x.Value)
.Select(x => string.Join("",x));
}
}
Six years later o_O
Just because
public static IEnumerable<string> Split(this string str, int chunkSize, bool remainingInFront)
{
var count = (int) Math.Ceiling(str.Length/(double) chunkSize);
Func<int, int> start = index => remainingInFront ? str.Length - (count - index)*chunkSize : index*chunkSize;
Func<int, int> end = index => Math.Min(str.Length - Math.Max(start(index), 0), Math.Min(start(index) + chunkSize - Math.Max(start(index), 0), chunkSize));
return Enumerable.Range(0, count).Select(i => str.Substring(Math.Max(start(i), 0),end(i)));
}
or
private static Func<bool, int, int, int, int, int> start = (remainingInFront, length, count, index, size) =>
remainingInFront ? length - (count - index) * size : index * size;
private static Func<bool, int, int, int, int, int, int> end = (remainingInFront, length, count, index, size, start) =>
Math.Min(length - Math.Max(start, 0), Math.Min(start + size - Math.Max(start, 0), size));
public static IEnumerable<string> Split(this string str, int chunkSize, bool remainingInFront)
{
var count = (int)Math.Ceiling(str.Length / (double)chunkSize);
return Enumerable.Range(0, count).Select(i => str.Substring(
Math.Max(start(remainingInFront, str.Length, count, i, chunkSize), 0),
end(remainingInFront, str.Length, count, i, chunkSize, start(remainingInFront, str.Length, count, i, chunkSize))
));
}
AFAIK all edge cases are handled.
Console.WriteLine(string.Join(" ", "abc".Split(2, false))); // ab c
Console.WriteLine(string.Join(" ", "abc".Split(2, true))); // a bc
Console.WriteLine(string.Join(" ", "a".Split(2, true))); // a
Console.WriteLine(string.Join(" ", "a".Split(2, false))); // a
List<string> SplitString(int chunk, string input)
{
List<string> list = new List<string>();
int cycles = input.Length / chunk;
if (input.Length % chunk != 0)
cycles++;
for (int i = 0; i < cycles; i++)
{
try
{
list.Add(input.Substring(i * chunk, chunk));
}
catch
{
list.Add(input.Substring(i * chunk));
}
}
return list;
}
I took this to another level. Chunking is an easy one-liner, but in my case I needed whole words as well. Figured I would post it, just in case someone else needs something similar.
static IEnumerable<string> Split(string orgString, int chunkSize, bool wholeWords = true)
{
if (wholeWords)
{
List<string> result = new List<string>();
StringBuilder sb = new StringBuilder();
if (orgString.Length > chunkSize)
{
string[] newSplit = orgString.Split(' ');
foreach (string str in newSplit)
{
if (sb.Length != 0)
sb.Append(" ");
if (sb.Length + str.Length > chunkSize)
{
result.Add(sb.ToString());
sb.Clear();
}
sb.Append(str);
}
result.Add(sb.ToString());
}
else
result.Add(orgString);
return result;
}
else
return new List<string>(Regex.Split(orgString, @"(?<=\G.{" + chunkSize + "})", RegexOptions.Singleline));
}
Results based on below comment:
string msg = "336699AABBCCDDEEFF";
foreach (string newMsg in Split(msg, 2, false))
{
Console.WriteLine($">>{newMsg}<<");
}
Console.ReadKey();
Results:
>>33<<
>>66<<
>>99<<
>>AA<<
>>BB<<
>>CC<<
>>DD<<
>>EE<<
>>FF<<
>><<
Another way to pull it:
List<string> splitData = (List<string>)Split(msg, 2, false);
for (int i = 0; i < splitData.Count - 1; i++)
{
Console.WriteLine($">>{splitData[i]}<<");
}
Console.ReadKey();
New Results:
>>33<<
>>66<<
>>99<<
>>AA<<
>>BB<<
>>CC<<
>>DD<<
>>EE<<
>>FF<<
An important tip if the string that is being chunked needs to support all Unicode characters.
If the string is to support international characters like 𠀋, then split up the string using the System.Globalization.StringInfo class. Using StringInfo, you can split up the string based on number of text elements.
string internationalString = "𠀋";
The above string has a Length of 2, because the String.Length property returns the number of Char objects in this instance, not the number of Unicode characters.
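A minimal sketch of chunking by text elements with StringInfo might look like this. The method name `ChunkByTextElements` and the class name are made up for the example. Note that because it is an iterator, the argument checks only run once enumeration starts.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

static class TextElementDemo
{
    // Splits by text elements rather than by Char, so a surrogate pair is never cut in half.
    public static IEnumerable<string> ChunkByTextElements(string str, int chunkSize)
    {
        if (str == null) throw new ArgumentNullException(nameof(str));
        if (chunkSize < 1) throw new ArgumentOutOfRangeException(nameof(chunkSize));

        var info = new StringInfo(str);
        for (int i = 0; i < info.LengthInTextElements; i += chunkSize)
        {
            // The last chunk may be shorter than chunkSize.
            int count = Math.Min(chunkSize, info.LengthInTextElements - i);
            yield return info.SubstringByTextElements(i, count);
        }
    }

    static void Main()
    {
        string s = "a𠀋b";
        Console.WriteLine(s.Length);                               // 4 Char objects
        Console.WriteLine(new StringInfo(s).LengthInTextElements); // 3 text elements
        foreach (string chunk in ChunkByTextElements(s, 2))
            Console.WriteLine(chunk);                              // "a𠀋", then "b"
    }
}
```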
Changed slightly to return parts whose size not equal to chunkSize
public static IEnumerable<string> Split(this string str, int chunkSize)
{
var splits = new List<string>();
if (str.Length < chunkSize) { chunkSize = str.Length; }
splits.AddRange(Enumerable.Range(0, str.Length / chunkSize).Select(i => str.Substring(i * chunkSize, chunkSize)));
splits.Add(str.Length % chunkSize > 0 ? str.Substring((str.Length / chunkSize) * chunkSize, str.Length - ((str.Length / chunkSize) * chunkSize)) : string.Empty);
return (IEnumerable<string>)splits;
}
I think this is a straightforward answer:
public static IEnumerable<string> Split(this string str, int chunkSize)
{
if(string.IsNullOrEmpty(str) || chunkSize<1)
throw new ArgumentException("String can not be null or empty and chunk size should be greater than zero.");
var chunkCount = str.Length / chunkSize + (str.Length % chunkSize != 0 ? 1 : 0);
for (var i = 0; i < chunkCount; i++)
{
var startIndex = i * chunkSize;
if (startIndex + chunkSize >= str.Length)
yield return str.Substring(startIndex);
else
yield return str.Substring(startIndex, chunkSize);
}
}
And it covers edge cases.
static List<string> GetChunks(string value, int chunkLength)
{
var res = new List<string>();
int count = (value.Length / chunkLength) + (value.Length % chunkLength > 0 ? 1 : 0);
Enumerable.Range(0, count).ToList().ForEach(f => res.Add(value.Skip(f * chunkLength).Take(chunkLength).Select(z => z.ToString()).Aggregate((a,b) => a+b)));
return res;
}
Here's my 2 cents:
IEnumerable<string> Split(string str, int chunkSize)
{
while (!string.IsNullOrWhiteSpace(str))
{
var chunk = str.Take(chunkSize).ToArray();
str = str.Substring(chunk.Length);
yield return new string(chunk);
}
}//Split
I've built slightly on João's solution.
What I've done differently is that in my method you can specify whether to return the remaining characters or to truncate them when the trailing characters don't fill a whole chunk. I think it's pretty flexible, and the code is fairly straightforward:
using System;
using System.Linq;
using System.Text.RegularExpressions;
namespace SplitFunction
{
class Program
{
static void Main(string[] args)
{
string text = "hello, how are you doing today?";
string[] chunks = SplitIntoChunks(text, 3,false);
if (chunks != null)
{
chunks.ToList().ForEach(e => Console.WriteLine(e));
}
Console.ReadKey();
}
private static string[] SplitIntoChunks(string text, int chunkSize, bool truncateRemaining)
{
string chunk = chunkSize.ToString();
string pattern = truncateRemaining ? ".{" + chunk + "}" : ".{1," + chunk + "}";
string[] chunks = null;
if (chunkSize > 0 && !String.IsNullOrEmpty(text))
chunks = (from Match m in Regex.Matches(text,pattern)select m.Value).ToArray();
return chunks;
}
}
}
public static List<string> SplitByMaxLength(this string str)
{
List<string> splitString = new List<string>();
for (int index = 0; index < str.Length; index += MaxLength)
{
splitString.Add(str.Substring(index, Math.Min(MaxLength, str.Length - index)));
}
return splitString;
}
I can't remember who gave me this, but it works great. I speed tested a number of ways to break Enumerable types into groups. The usage would just be like this...
List<string> Divided = Source3.Chunk(24).Select(Piece => string.Concat<char>(Piece)).ToList();
The extension code would look like this...
#region Chunk Logic
private class ChunkedEnumerable<T> : IEnumerable<T>
{
class ChildEnumerator : IEnumerator<T>
{
ChunkedEnumerable<T> parent;
int position;
bool done = false;
T current;
public ChildEnumerator(ChunkedEnumerable<T> parent)
{
this.parent = parent;
position = -1;
parent.wrapper.AddRef();
}
public T Current
{
get
{
if (position == -1 || done)
{
throw new InvalidOperationException();
}
return current;
}
}
public void Dispose()
{
if (!done)
{
done = true;
parent.wrapper.RemoveRef();
}
}
object System.Collections.IEnumerator.Current
{
get { return Current; }
}
public bool MoveNext()
{
position++;
if (position + 1 > parent.chunkSize)
{
done = true;
}
if (!done)
{
done = !parent.wrapper.Get(position + parent.start, out current);
}
return !done;
}
public void Reset()
{
// per http://msdn.microsoft.com/en-us/library/system.collections.ienumerator.reset.aspx
throw new NotSupportedException();
}
}
EnumeratorWrapper<T> wrapper;
int chunkSize;
int start;
public ChunkedEnumerable(EnumeratorWrapper<T> wrapper, int chunkSize, int start)
{
this.wrapper = wrapper;
this.chunkSize = chunkSize;
this.start = start;
}
public IEnumerator<T> GetEnumerator()
{
return new ChildEnumerator(this);
}
System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
private class EnumeratorWrapper<T>
{
public EnumeratorWrapper(IEnumerable<T> source)
{
SourceEumerable = source;
}
IEnumerable<T> SourceEumerable { get; set; }
Enumeration currentEnumeration;
class Enumeration
{
public IEnumerator<T> Source { get; set; }
public int Position { get; set; }
public bool AtEnd { get; set; }
}
public bool Get(int pos, out T item)
{
if (currentEnumeration != null && currentEnumeration.Position > pos)
{
currentEnumeration.Source.Dispose();
currentEnumeration = null;
}
if (currentEnumeration == null)
{
currentEnumeration = new Enumeration { Position = -1, Source = SourceEumerable.GetEnumerator(), AtEnd = false };
}
item = default(T);
if (currentEnumeration.AtEnd)
{
return false;
}
while (currentEnumeration.Position < pos)
{
currentEnumeration.AtEnd = !currentEnumeration.Source.MoveNext();
currentEnumeration.Position++;
if (currentEnumeration.AtEnd)
{
return false;
}
}
item = currentEnumeration.Source.Current;
return true;
}
int refs = 0;
// needed for dispose semantics
public void AddRef()
{
refs++;
}
public void RemoveRef()
{
refs--;
if (refs == 0 && currentEnumeration != null)
{
var copy = currentEnumeration;
currentEnumeration = null;
copy.Source.Dispose();
}
}
}
/// <summary>Speed Checked. Works Great!</summary>
public static IEnumerable<IEnumerable<T>> Chunk<T>(this IEnumerable<T> source, int chunksize)
{
if (chunksize < 1) throw new InvalidOperationException();
var wrapper = new EnumeratorWrapper<T>(source);
int currentPos = 0;
T ignore;
try
{
wrapper.AddRef();
while (wrapper.Get(currentPos, out ignore))
{
yield return new ChunkedEnumerable<T>(wrapper, chunksize, currentPos);
currentPos += chunksize;
}
}
finally
{
wrapper.RemoveRef();
}
}
#endregion
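For what it's worth, on .NET 6 and later the framework ships its own Enumerable.Chunk, so the string case can be handled without the custom classes above. A minimal sketch, assuming .NET 6 or newer; BuiltInChunkDemo and SplitWithChunk are names made up for illustration. If the custom Chunk extension above is also in scope, the call may become ambiguous, so only one of the two should be visible.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BuiltInChunkDemo
{
    // Splits text into pieces of at most 'size' characters using the
    // framework's Enumerable.Chunk (assumes .NET 6 or later).
    public static List<string> SplitWithChunk(string text, int size)
    {
        return text.Chunk(size)                        // IEnumerable<char[]>
                   .Select(piece => new string(piece)) // each char[] becomes a string
                   .ToList();
    }
}
```

Calling BuiltInChunkDemo.SplitWithChunk("abcdefgh", 3) gives the pieces "abc", "def" and "gh".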
class StringHelper
{
static void Main(string[] args)
{
string str = "Hi my name is vikas bansal and my email id is bansal.vks@gmail.com";
int offSet = 10;
List<string> chunks = chunkMyStr(str, offSet);
Console.Read();
}
static List<string> chunkMyStr(string str, int offSet)
{
List<string> resultChunks = new List<string>();
for (int i = 0; i < str.Length; i += offSet)
{
string temp = str.Substring(i, (str.Length - i) > offSet ? offSet : (str.Length - i));
Console.WriteLine(temp);
resultChunks.Add(temp);
}
return resultChunks;
}
}
I am looking to generate a Sequence Number in this format
00000A
00000B
00000C
and so on till
00000Z
and then
00001A
00001B
00001C
...
00001Z
...
00010A
till
99999Z
I know this scheme can produce at most 2.6 million values, but I think that is enough.
So if I have a string, say 26522C, I want the next number to be 26522D,
and if I have 34287Z, I want 34288A.
I can write an algorithm for this, but it would involve a lot of character-by-character parsing of the input string.
I was wondering whether there is an easier way of doing it.
String GetNextNumberInSequence(String inputString)
{
if (inputString.Length == 6)
{
var charArray = inputString.ToCharArray();
char[] inputChars = { charArray[0], charArray[1], charArray[2],charArray[3],charArray[4],charArray[5] };
if(Char.IsDigit(charArray[5]))
{
//Parse first 5 characters
}
}
}
private static String GetNextNumberInSequence(String inputString)
{
var integerpart = int.Parse(inputString.Substring(0, 5));
var characterPart = inputString[5];
if (characterPart == 'Z')
return string.Format("{0}{1}", (++integerpart).ToString("D5"), "A");
var nextChar = (char)(characterPart + 1);
return string.Format("{0}{1}", (integerpart).ToString("D5"), nextChar.ToString());
}
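Another way to look at the same idea is to treat the whole code as a single integer: index = number * 26 + letter offset. Incrementing is then plain arithmetic, and the carry from Z to A falls out of the division. This is only a sketch under the question's format (five digits, then A to Z); SequenceNumbers, ToIndex, FromIndex and Next are hypothetical names, not part of the answer above.

```csharp
using System;

public static class SequenceNumbers
{
    private const int Letters = 26;
    private const int MaxIndex = 100000 * Letters - 1; // index of "99999Z"

    // Maps "00000A" to 0, "00000B" to 1, and so on up to "99999Z".
    public static int ToIndex(string code)
    {
        if (code == null || code.Length != 6)
            throw new ArgumentException("Expected five digits followed by a letter A-Z.");
        int number = 0;
        for (int i = 0; i < 5; i++)
        {
            char c = code[i];
            if (c < '0' || c > '9')
                throw new ArgumentException("Expected five digits followed by a letter A-Z.");
            number = number * 10 + (c - '0');
        }
        char letter = code[5];
        if (letter < 'A' || letter > 'Z')
            throw new ArgumentException("Expected five digits followed by a letter A-Z.");
        return number * Letters + (letter - 'A');
    }

    // Inverse of ToIndex.
    public static string FromIndex(int index)
    {
        if (index < 0 || index > MaxIndex)
            throw new ArgumentOutOfRangeException("index");
        return (index / Letters).ToString("D5") + (char)('A' + index % Letters);
    }

    // Returns the code that follows 'code'; throws once "99999Z" is passed.
    public static string Next(string code)
    {
        return FromIndex(ToIndex(code) + 1);
    }
}
```

With this, Next("26522C") gives "26522D" and Next("34287Z") gives "34288A". Unlike the string.Format version, it refuses to go past 99999Z instead of producing a seven-character result.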
You can achieve this by converting a number to Base36.
Take a look at this sample:
private const string CharList = "0123456789abcdefghijklmnopqrstuvwxyz";
public static String Base36Encode(long input, char paddingChar, int totalWidth)
{
char[] clistarr = CharList.ToCharArray();
var result = new Stack<char>();
while (input != 0)
{
result.Push(clistarr[input % 36]);
input /= 36;
}
return new string(result.ToArray()).PadLeft(totalWidth, paddingChar).ToUpper();
}
and then use it this way:
for(int i = 0; i < 1000; i++)
{
Debug.WriteLine(Base36Encode(i, '0', 6));
}
which will produce this:
000000, 000001, 000002, 000003, 000004, 000005, 000006, 000007, 000008, 000009, 00000A, 00000B, 00000C, 00000D, 00000E, 00000F, 00000G, 00000H, 00000I, 00000J, 00000K, 00000L, 00000M, 00000N, 00000O, 00000P, 00000Q, 00000R, 00000S, 00000T, 00000U, 00000V, 00000W, 00000X, 00000Y, 00000Z, 000010, 000011, 000012, 000013, 000014, 000015, 000016, 000017, 000018, 000019, 00001A, 00001B, 00001C, 00001D, 00001E, 00001F, 00001G, 00001H, 00001I, 00001J, 00001K, 00001L, 00001M, 00001N, 00001O, 00001P, 00001Q, 00001R, 00001S, 00001T, 00001U, 00001V, 00001W, 00001X, 00001Y, 00001Z, 000020, 000021, 000022, 000023, 000024, 000025, 000026, 000027, 000028, 000029, 00002A, 00002B, 00002C, 00002D, 00002E, 00002F, 00002G, 00002H, 00002I, 00002J, 00002K, 00002L, 00002M, 00002N, 00002O, 00002P, 00002Q, 00002R, 00002S, 00002T...
A nice property of this approach is that you can convert the string back to a number:
public static Int64 Base36Decode(string input)
{
var reversed = input.ToLower().Reverse();
long result = 0;
int pos = 0;
foreach (char c in reversed)
{
result += CharList.IndexOf(c) * (long)Math.Pow(36, pos);
pos++;
}
return result;
}
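As a quick check that encoding and decoding really are inverses, here is a compact, self-contained round trip. It is a re-implementation for illustration rather than the code above: Base36Sketch, Encode and Decode are made-up names, and Decode accumulates digits left to right (Horner's rule) instead of calling Math.Pow, which avoids going through double for large values.

```csharp
using System;
using System.Text;

public static class Base36Sketch
{
    private const string Digits = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    // Encodes a non-negative value, left-padded with '0' to totalWidth.
    public static string Encode(long value, int totalWidth)
    {
        if (value < 0) throw new ArgumentOutOfRangeException("value");
        var sb = new StringBuilder();
        do
        {
            sb.Insert(0, Digits[(int)(value % 36)]);
            value /= 36;
        } while (value != 0);
        return sb.ToString().PadLeft(totalWidth, '0');
    }

    // Decodes by repeatedly multiplying the running total by 36 and
    // adding the next digit; 'checked' turns overflow into an exception.
    public static long Decode(string text)
    {
        long result = 0;
        foreach (char c in text.ToUpperInvariant())
        {
            int digit = Digits.IndexOf(c);
            if (digit < 0) throw new FormatException("Not a base-36 digit: " + c);
            result = checked(result * 36 + digit);
        }
        return result;
    }
}
```

For example, Encode(36, 6) gives "000010", matching the output listed above, and Decode("00001A") gives 46.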
I need to parse a decimal integer that appears at the start of a string.
There may be trailing garbage following the decimal number. This needs to be ignored (even if it contains other numbers.)
e.g.
"1" => 1
" 42 " => 42
" 3 -.X.-" => 3
" 2 3 4 5" => 2
Is there a built-in method in the .NET framework to do this?
int.TryParse() is not suitable. It allows trailing spaces but not other trailing characters.
It would be quite easy to implement this but I would prefer to use the standard method if it exists.
You can use LINQ to do this; no regular expressions needed:
public static int GetLeadingInt(string input)
{
// Take digits only: also accepting '.' would make Int32.Parse throw on input such as "3.5".
return Int32.Parse(new string(input.Trim().TakeWhile(c => char.IsDigit(c)).ToArray()));
}
This works for all your provided examples:
string[] tests = new string[] {
"1",
" 42 ",
" 3 -.X.-",
" 2 3 4 5"
};
foreach (string test in tests)
{
Console.WriteLine("Result: " + GetLeadingInt(test));
}
foreach (var m in Regex.Matches(" 3 - .x. 4", @"\d+"))
{
Console.WriteLine(m);
}
Updated per comments
Not sure why you don't like regular expressions, so I'll just post what I think is the shortest solution.
To get first int:
Match match = Regex.Match(" 3 - .x. - 4", @"\d+");
if (match.Success)
Console.WriteLine(int.Parse(match.Value));
There's no standard .NET method for doing this - although I wouldn't be surprised to find that VB had something in the Microsoft.VisualBasic assembly (which is shipped with .NET, so it's not an issue to use it even from C#).
Will the result always be non-negative (which would make things easier)?
To be honest, regular expressions are the easiest option here, but...
public static string RemoveCruftFromNumber(string text)
{
int end = 0;
// First move past leading spaces
while (end < text.Length && text[end] == ' ')
{
end++;
}
// Now move past digits
while (end < text.Length && char.IsDigit(text[end]))
{
end++;
}
return text.Substring(0, end);
}
Then you just need to call int.TryParse on the result of RemoveCruftFromNumber (don't forget that the integer may be too big to store in an int).
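To make the last step concrete, here is one way the scan and the TryParse call might be combined. It is a sketch, not the answer's own code: LeadingNumber and TryParseLeading are hypothetical names, and it parses into a long with NumberStyles.None so that only plain digits are accepted and values too big for an int still fit (anything beyond long simply returns false).

```csharp
using System;
using System.Globalization;

public static class LeadingNumber
{
    // Skips leading spaces, collects the run of digits 0-9 that follows,
    // and tries to parse that run. Returns false if there are no digits
    // or the value does not fit in a long.
    public static bool TryParseLeading(string text, out long value)
    {
        value = 0;
        if (text == null) return false;

        int start = 0;
        while (start < text.Length && text[start] == ' ')
            start++;

        int end = start;
        while (end < text.Length && text[end] >= '0' && text[end] <= '9')
            end++;

        if (end == start) return false; // no digits found

        return long.TryParse(text.Substring(start, end - start),
                             NumberStyles.None,
                             CultureInfo.InvariantCulture,
                             out value);
    }
}
```

For the question's examples, " 3 -.X.-" yields 3 and " 2 3 4 5" yields 2, while a string with no leading digits returns false.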
I like @Donut's approach.
I'd like to add, though, that char.IsDigit and char.IsNumber also accept some Unicode characters that are digits in other languages and scripts (see here).
If you only want to check for the digits 0 to 9, you could use "0123456789".Contains(c).
Three example implementations:
To remove trailing non-digit characters:
var digits = new string(input.Trim().TakeWhile(c =>
("0123456789").Contains(c)
).ToArray());
To remove leading non-digit characters:
var digits = new string(input.Trim().SkipWhile(c =>
!("0123456789").Contains(c)
).ToArray());
To remove all non-digit characters:
var digits = new string(input.Trim().Where(c =>
("0123456789").Contains(c)
).ToArray());
And of course: int.Parse(digits) or int.TryParse(digits, out output)
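To illustrate the Unicode point above: the Arabic-Indic digit three (U+0663) is a decimal digit as far as char.IsDigit is concerned, but it is not one of 0 to 9. A small sketch, with DigitChecks, IsAsciiDigit and LeadingAsciiDigits as made-up names, using a range comparison as an alternative to the "0123456789".Contains(c) test:

```csharp
using System;
using System.Linq;

public static class DigitChecks
{
    // True only for the ASCII digits '0' through '9'.
    public static bool IsAsciiDigit(char c)
    {
        return c >= '0' && c <= '9';
    }

    // Trims the input and returns its leading run of ASCII digits.
    public static string LeadingAsciiDigits(string input)
    {
        return new string(input.Trim().TakeWhile(IsAsciiDigit).ToArray());
    }
}
```

Here char.IsDigit('\u0663') is true while DigitChecks.IsAsciiDigit('\u0663') is false, so LeadingAsciiDigits(" 42\u0663x") stops after "42". As far as I know, int.Parse only understands 0 to 9, so filtering with char.IsDigit alone could let through a string that then fails to parse.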
This doesn't really answer your question (about a built-in C# method), but you could try chopping off characters at the end of the input string one by one until int.TryParse() accepts it as a valid number:
for (int p = input.Length; p > 0; p--)
{
int num;
if (int.TryParse(input.Substring(0, p), out num))
return num;
}
throw new Exception("Malformed integer: " + input);
Of course, this will be slow if input is very long.
ADDENDUM (March 2016)
This could be made faster by chopping off all non-digit/non-space characters on the right before attempting each parse:
for (int p = input.Length; p > 0; p--)
{
char ch;
do
{
ch = input[--p];
} while ((ch < '0' || ch > '9') && ch != ' ' && p > 0);
p++;
int num;
if (int.TryParse(input.Substring(0, p), out num))
return num;
}
throw new Exception("Malformed integer: " + input);
string s = " 3 -.X.-".Trim();
string collectedNumber = string.Empty;
int i;
for (int x = 0; x < s.Length; x++)
{
// int.TryParse takes a string, so convert the char first.
if (int.TryParse(s[x].ToString(), out i))
collectedNumber += s[x];
else
break; // not a number - that's it - get out.
}
if (int.TryParse(collectedNumber, out i))
Console.WriteLine(i);
else
Console.WriteLine("no number found");
This is how I would have done it in Java:
int parseLeadingInt(String input)
{
NumberFormat fmt = NumberFormat.getIntegerInstance();
fmt.setGroupingUsed(false);
return fmt.parse(input, new ParsePosition(0)).intValue();
}
I was hoping something similar would be possible in .NET.
This is the regex-based solution I am currently using:
int? parseLeadingInt(string input)
{
int result = 0;
Match match = Regex.Match(input, "^[ \t]*\\d+");
if (match.Success && int.TryParse(match.Value, out result))
{
return result;
}
return null;
}
Might as well add mine too.
string temp = " 3 .x£";
string numbersOnly = String.Empty;
int tempInt;
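for (int i = 0; i < temp.Length; i++)
{
if (Int32.TryParse(Convert.ToString(temp[i]), out tempInt))
{
numbersOnly += temp[i];
}
else if (numbersOnly.Length > 0)
{
// Stop at the first non-digit after the number, so later digits are ignored.
break;
}
}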
Int32.TryParse(numbersOnly, out tempInt);
MessageBox.Show(tempInt.ToString());
The message box is only there for testing; delete it once you have verified that the method works.
I'm not sure why you would avoid Regex in this situation.
Here's a little hackery that you can adjust to your needs.
" 3 -.X.-".ToCharArray().FindInteger().ToList().ForEach(Console.WriteLine);
public static class CharArrayExtensions
{
public static IEnumerable<char> FindInteger(this IEnumerable<char> array)
{
foreach (var c in array)
{
if(char.IsNumber(c))
yield return c;
}
}
}
EDIT:
That's true about the incorrect result (and the maintenance dev :) ).
Here's a revision:
public static int FindFirstInteger(this IEnumerable<char> array)
{
bool foundInteger = false;
var ints = new List<char>();
foreach (var c in array)
{
if(char.IsNumber(c))
{
foundInteger = true;
ints.Add(c);
}
else
{
if(foundInteger)
{
break;
}
}
}
string s = string.Empty;
ints.ForEach(i => s += i.ToString());
return int.Parse(s);
}
private string GetInt(string s)
{
int i = 0;
s = s.Trim();
while (i<s.Length && char.IsDigit(s[i])) i++;
return s.Substring(0, i);
}
Similar to Donut's above but with a TryParse:
private static bool TryGetLeadingInt(string input, out int output)
{
// Digits only: also accepting '.' would make int.TryParse fail on input such as "3.5".
var trimmedString = new string(input.Trim().TakeWhile(c => char.IsDigit(c)).ToArray());
var canParse = int.TryParse( trimmedString, out output);
return canParse;
}