How to get all permutations of groups in a string? - c#

This is not homework, although it may seem like it. I've been browsing through the UK Computing Olympiad's website and found this problem (Question 1): here. I was baffled by it, and I'd like to see how you would approach it. I can't think of any neat way to get everything into groups (checking whether it's a palindrome after that is simple enough, i.e. originalString == new String(groupedString.Reverse().SelectMany(c => c).ToArray()), assuming each group is a char array).
Any ideas? Thanks!
Text for those at work:
A palindrome is a word that shows the same sequence of letters when
reversed. If a word can have its letters grouped together in two or
more blocks (each containing one or more adjacent letters) then it is
a block palindrome if reversing the order of those blocks results in
the same sequence of blocks.
For example, using brackets to indicate blocks, the following are
block palindromes:
• BONBON can be grouped together as (BON)(BON);
• ONION can be grouped together as (ON)(I)(ON);
• BBACBB can be grouped together as (B)(BACB)(B) or (BB)(AC)(BB) or
(B)(B)(AC)(B)(B)
Note that (BB)(AC)(B)(B) is not valid as the reverse (B)(B)(AC)(BB)
shows the blocks in a different order.
And the question is essentially how to generate all of those groups, to then check whether they are palindromes!

And the question is essentially how to generate all of those groups, to then check whether they are palindromes!
I note that this is not necessarily the best strategy. Generating all the groups first and then checking to see if they are palindromes is considerably less efficient than generating only those groups which are palindromes.
But in the spirit of answering the question asked, let's solve the problem recursively. I will just generate all the groups; checking whether a set of groups is a palindrome is left as an exercise. I am also going to ignore the requirement that a set of groups contains at least two elements; that is easily checked.
The way to solve this problem elegantly is to reason recursively. As with all recursive solutions, we begin with a trivial base case:
How many groupings are there of the empty string? There is only the empty grouping; that is, the grouping with no elements in it.
Now we assume that we have a solution to a smaller problem, and ask "if we had a solution to a smaller problem, how could we use that solution to solve a larger problem?"
OK, suppose we have a larger problem. We have a string with 6 characters in it and we wish to produce all the groupings. Moreover, the groupings are symmetrical; the first group is the same size as the last group. By assumption we know how to solve the problem for any smaller string.
We solve the problem as follows. Suppose the string is ABCDEF. We peel off A and F from both ends, we solve the problem for BCDE, which remember we know how to do by assumption, and now we prepend A and append F to each of those solutions.
The solutions for BCDE are (B)(C)(D)(E), (B)(CD)(E), (BC)(DE), (BCDE). Again, we assume as our inductive hypothesis that we have the solution to the smaller problem. We then combine those with A and F to produce the solutions for ABCDEF: (A)(B)(C)(D)(E)(F), (A)(B)(CD)(E)(F), (A)(BC)(DE)(F) and (A)(BCDE)(F).
We've made good progress. Are we done? No. Next we peel off AB and EF, and recursively solve the problem for CD. I won't labour how that is done. Are we done? No. We peel off ABC and DEF and recursively solve the problem for the empty string in the middle. Are we done? No. (ABCDEF) is also a solution. Now we're done.
I hope that sketch motivates the solution, which is now straightforward. We begin with a helper function:
public static IEnumerable<T> AffixSequence<T>(T first, IEnumerable<T> body, T last)
{
yield return first;
foreach (T item in body)
yield return item;
yield return last;
}
That should be easy to understand. Now we do the real work:
public static IEnumerable<IEnumerable<string>> GenerateBlocks(string s)
{
// The base case is trivial: the only grouping of the empty string
// is the empty set of blocks.
if (s.Length == 0)
{
yield return new string[0];
yield break;
}
// Generate all the sequences for the middle;
// combine them with all possible prefixes and suffixes.
for (int i = 1; s.Length >= 2 * i; ++i)
{
string prefix = s.Substring(0, i);
string suffix = s.Substring(s.Length - i, i);
string middle = s.Substring(i, s.Length - 2 * i);
foreach (var body in GenerateBlocks(middle))
yield return AffixSequence(prefix, body, suffix);
}
// Finally, the set of blocks that contains only this string
// is a solution.
yield return new[] { s };
}
Let's test it.
foreach (var blocks in GenerateBlocks("ABCDEF"))
Console.WriteLine($"({string.Join(")(", blocks)})");
The output is
(A)(B)(C)(D)(E)(F)
(A)(B)(CD)(E)(F)
(A)(BC)(DE)(F)
(A)(BCDE)(F)
(AB)(C)(D)(EF)
(AB)(CD)(EF)
(ABC)(DEF)
(ABCDEF)
So there you go.
You could now check to see whether each grouping is a palindrome, but why? The algorithm presented above can be easily modified to eliminate all non-palindromes by simply not recursing if the prefix and suffix are unequal:
if (prefix != suffix) continue;
The algorithm now enumerates only block palindromes. Let's test it:
foreach (var blocks in GenerateBlocks("BBACBB"))
Console.WriteLine($"({string.Join(")(", blocks)})");
The output is below; again, note that I am not filtering out the "entire string" block but doing so is straightforward.
(B)(B)(AC)(B)(B)
(B)(BACB)(B)
(BB)(AC)(BB)
(BBACBB)
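Putting the pieces above together, here is a minimal sketch of the palindrome-only generator with the prefix/suffix check in place and the "at least two blocks" requirement applied at the end. The class name BlockPalindromes and the method names Generate and GenerateProper are made up for this illustration, and it builds List<string> results instead of using the AffixSequence helper; treat it as one possible arrangement rather than the definitive one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BlockPalindromes
{
    // Enumerates every grouping whose blocks read the same in reverse order,
    // including the trivial grouping that contains the whole string.
    public static IEnumerable<List<string>> Generate(string s)
    {
        if (s.Length == 0)
        {
            // The only grouping of the empty string is the empty grouping.
            yield return new List<string>();
            yield break;
        }
        for (int i = 1; 2 * i <= s.Length; i++)
        {
            string prefix = s.Substring(0, i);
            string suffix = s.Substring(s.Length - i, i);
            // Only recurse when the outer blocks match.
            if (prefix != suffix) continue;
            string middle = s.Substring(i, s.Length - 2 * i);
            foreach (var body in Generate(middle))
            {
                var blocks = new List<string> { prefix };
                blocks.AddRange(body);
                blocks.Add(suffix);
                yield return blocks;
            }
        }
        // The whole string as a single block.
        yield return new List<string> { s };
    }

    // Applies the "two or more blocks" rule from the problem statement.
    public static IEnumerable<List<string>> GenerateProper(string s)
    {
        return Generate(s).Where(blocks => blocks.Count >= 2);
    }
}
```

Calling BlockPalindromes.GenerateProper("BBACBB") should then yield only the three groupings listed in the problem statement, with the single-block (BBACBB) case filtered out.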
If this subject interests you, consider reading my series of articles on using this same technique to generate every possible tree topology and every possible string in a language. It starts here:
http://blogs.msdn.com/b/ericlippert/archive/2010/04/19/every-binary-tree-there-is.aspx

This should work:
public List<string> BlockPalin(string s) {
var list = new List<string>();
for (int i = 1; i <= s.Length / 2; i++) {
int backInx = s.Length - i;
if (s.Substring(0, i) == s.Substring(backInx, i)) {
var result = string.Format("({0})", s.Substring(0, i));
result += "|" + result;
var rest = s.Substring(i, backInx - i);
if (rest == string.Empty) {
list.Add(result.Replace("|", rest));
return list;
}
else if (rest.Length == 1) {
list.Add(result.Replace("|", string.Format("({0})", rest)));
return list;
}
else {
list.Add(result.Replace("|", string.Format("({0})", rest)));
var recursiveList = BlockPalin(rest);
if (recursiveList.Count > 0) {
foreach (var recursiveResult in recursiveList) {
list.Add(result.Replace("|", recursiveResult));
}
}
else {
//EDIT: Thx to @juharr this list.Add is not needed...
// list.Add(result.Replace("|",string.Format("({0})",rest)));
return list;
}
}
}
}
return list;
}
And call it like this (EDIT: Again thx to @juharr, the distinct is not needed):
var x = BlockPalin("BONBON");//.Distinct().ToList();
var y = BlockPalin("ONION");//.Distinct().ToList();
var z = BlockPalin("BBACBB");//.Distinct().ToList();
The result:
x contains 1 element: (BON)(BON)
y contains 1 element: (ON)(I)(ON)
z contains 3 elements: (B)(BACB)(B),(B)(B)(AC)(B)(B) and (BB)(AC)(BB)

Although not as elegant as the one provided by @Eric Lippert, one might find the following iterative, string-allocation-free solution interesting:
struct Range
{
public int Start, End;
public int Length { get { return End - Start; } }
public Range(int start, int length) { Start = start; End = start + length; }
}
static IEnumerable<Range[]> GetPalindromeBlocks(string input)
{
int maxLength = input.Length / 2;
var ranges = new Range[maxLength];
int count = 0;
for (var range = new Range(0, 1); ; range.End++)
{
if (range.End <= maxLength)
{
if (!IsPalindromeBlock(input, range)) continue;
ranges[count++] = range;
range.Start = range.End;
}
else
{
if (count == 0) break;
yield return GenerateResult(input, ranges, count);
range = ranges[--count];
}
}
}
static bool IsPalindromeBlock(string input, Range range)
{
return string.Compare(input, range.Start, input, input.Length - range.End, range.Length) == 0;
}
static Range[] GenerateResult(string input, Range[] ranges, int count)
{
var last = ranges[count - 1];
int midLength = input.Length - 2 * last.End;
var result = new Range[2 * count + (midLength > 0 ? 1 : 0)];
for (int i = 0; i < count; i++)
{
var range = result[i] = ranges[i];
result[result.Length - 1 - i] = new Range(input.Length - range.End, range.Length);
}
if (midLength > 0)
result[count] = new Range(last.End, midLength);
return result;
}
Test:
foreach (var input in new [] { "BONBON", "ONION", "BBACBB" })
{
Console.WriteLine(input);
var blocks = GetPalindromeBlocks(input);
foreach (var blockList in blocks)
Console.WriteLine(string.Concat(blockList.Select(range => "(" + input.Substring(range.Start, range.Length) + ")")));
}
Removing the line if (!IsPalindromeBlock(input, range)) continue; will produce the answer to the OP question.

It's not clear whether you want all possible groupings, or just one possible grouping. This is one way, off the top of my head, that you might get a single grouping:
public static IEnumerable<string> GetBlocks(string testString)
{
if (testString.Length == 0)
{
yield break;
}
int mid = testString.Length / 2;
int i = 0;
while (i < mid)
{
if (testString.Take(i + 1).SequenceEqual(testString.Skip(testString.Length - (i + 1))))
{
yield return new String(testString.Take(i+1).ToArray());
break;
}
i++;
}
if (i == mid)
{
yield return testString;
}
else
{
foreach (var block in GetBlocks(new String(testString.Skip(i + 1).Take(testString.Length - (i + 1) * 2).ToArray())))
{
yield return block;
}
}
}
If you give it bonbon, it'll return bon. If you give it onion it'll give you back on, i. If you give it bbacbb, it'll give you b,b,ac.

Here's my solution (didn't have VS so I did it using java):
int matches = 0;
public void findMatch(String pal) {
String st1 = "", st2 = "";
int l = pal.length() - 1;
for (int i = 0; i < (pal.length())/2 ; i ++ ) {
st1 = st1 + pal.charAt(i);
st2 = pal.charAt(l) + st2;
if (st1.equals(st2)) {
matches++;
// DO THE SAME THING FOR THE MATCH
findMatch(st1);
}
l--;
}
}
The logic is pretty simple. I build two strings, one from the front and one from the back, and compare them at each step to find a match. The key is that you need to check the same thing for each match too.
findMatch("bonbon"); // 1
findMatch("bbacbb"); // 3

What about something like this for BONBON...
string bonBon = "BONBON";
First check character count for even or odd.
bool isEven = bonBon.Length % 2 == 0;
Now, if it is even, split the string in half.
if (isEven)
{
int halfInd = bonBon.Length / 2;
string firstHalf = bonBon.Substring(0, halfInd );
string secondHalf = bonBon.Substring(halfInd);
}
Now, if it is odd, split the string into 3 string.
else
{
int halfInd = (bonBon.Length - 1) / 2;
string firstHalf = bonBon.Substring(0, halfInd);
string middle = bonBon.Substring(halfInd, 1); // the single middle character
string secondHalf = bonBon.Substring(halfInd + 1);
}
May not be exactly correct, but it's a start....
Still have to add checking if it is actually a palindrome...
Good luck!!


C#: Need to split a string into a string[] and keeping the delimiter (also a string) at the beginning of the string

I think I am too dumb to solve this problem...
I have some formulas which need to be "translated" from one syntax to another.
Let's say I have a formula that goes like that (it's a simple one, others have many "Ceilings" in it):
string formulaString = "If([Param1] = 0, 1, Ceiling([Param2] / 0.55) * [Param3])";
I need to replace "Ceiling()" with "Ceiling(; 1)" (basically, insert "; 1" before the ")").
My attempt is to split the formulaString at "Ceiling(" so I am able to iterate through the string array and insert my string at the correct index (counting every "(" and ")" to get the right index).
What I have so far:
//splits correctly, but loses "CEILING("
string[] parts = formulaString.Split(new[] { "CEILING(" }, StringSplitOptions.None);
//splits almost correctly, but "CEILING(" ends up in its own group
string[] parts = Regex.Split(formulaString, @"(CEILING\()");
//splits at almost every letter
string[] parts = Regex.Split(formulaString, @"(?=[(CEILING\()])");
When everything is done, I concat the string so I have my complete formula again.
What do I have to set as Regex pattern to achieve this sample? (Or any other method that will help me)
part1 = "If([Param1] = 0, 1, ";
part2 = "Ceiling([Param2] / 0.55) * [Param3])";
//part3 = next "CEILING(" in a longer formula and so on...
As I mention in a comment, you almost got it: (?=Ceiling). This is incomplete for your use case unfortunately.
I need to replace "Ceiling()" with "Ceiling(; 1)" (basically, insert "; 1" before the ")").
Depending on your regex engine (it must allow a quantifier inside a lookbehind, as for example JS or .NET do), this works:
string[] parts = Regex.Split(formulaString, @"(?<=Ceiling\([^)]*(?=\)))");
string modifiedFormula = String.Join("; 1", parts);
The regex
(?<=Ceiling\([^)]*(?=\)))
(?<= ) Positive lookbehind
Ceiling\( Search for literal "Ceiling("
[^)] Match any char which is not ")" ..
* .. 0 or more times
(?=\)) Positive lookahead for ")", effectively making us stop before the ")"
This regex is a zero-width assertion, so nothing is consumed and it cuts your string just before the first ")" that follows each "Ceiling(".
This solution would break whenever you have nested "Ceiling()". Then your only solution would be writing your own parser for the same reasons why you can't parse markup with regex.
Regex.Replace(formulaString, @"(?<=Ceiling\()(.*?)(?=\))","$1; 1");
Note: This will not work for nested "Ceilings", but it does for a simple Ceiling(). It will also not work for Ceiling(AnotherFunc(x)). For that you need something like:
Regex.Replace(formulaString, @"(?<=Ceiling\()((.*\((?>[^()]+|(?1))*\))*|[^\)]*)(\))","$1; 1$3");
but I could not get that to work with .NET, only in JavaScript.
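For .NET specifically, one option that might cover the nested cases is a balancing-group pattern instead of the (?1) recursion, which .NET does not offer. The sketch below is only an illustration: the class name CeilingRewriter, the method name AddSecondArgument and the group names body and open are invented here, and it assumes the formula contains no parentheses inside string literals.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CeilingRewriter
{
    // Matches "Ceiling(" ... ")" where the parentheses in between are balanced.
    // (?<open>\() pushes onto a stack, (?<-open>\)) pops from it, and
    // (?(open)(?!)) rejects a match that leaves anything on the stack.
    private static readonly Regex CeilingCall = new Regex(
        @"\bCeiling\((?<body>(?>[^()]+|(?<open>\()|(?<-open>\)))*(?(open)(?!)))\)");

    // Inserts "; 1" before the closing parenthesis of every Ceiling(...) call.
    // The body is processed recursively so that nested Ceiling calls are rewritten too.
    public static string AddSecondArgument(string formula)
    {
        return CeilingCall.Replace(formula,
            m => "Ceiling(" + AddSecondArgument(m.Groups["body"].Value) + "; 1)");
    }
}
```

With the formula from the question, CeilingRewriter.AddSecondArgument(formulaString) should give "If([Param1] = 0, 1, Ceiling([Param2] / 0.55; 1) * [Param3])". The match is case-sensitive, so "CEILING(" would need RegexOptions.IgnoreCase or a different literal.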
This is my solution:
private string ConvertCeiling(string formula)
{
int ceilingsCount = formula.CountOccurences("Ceiling(");
int startIndex = 0;
int bracketCounter;
for (int i = 0; i < ceilingsCount; i++)
{
startIndex = formula.IndexOf("Ceiling(", startIndex);
bracketCounter = 0;
for (int j = 0; j < formula.Length; j++)
{
if (j < startIndex) continue;
var c = formula[j];
if (c == '(')
{
bracketCounter++;
}
if (c == ')')
{
bracketCounter--;
if (bracketCounter == 0)
{
// found end
formula = formula.Insert(j, "; 1");
startIndex++;
break;
}
}
}
}
return formula;
}
And the CountOccurences extension method (as an extension method it has to live in a static class):
public static int CountOccurences(this string value, string parameter)
{
int counter = 0;
int startIndex = 0;
int indexOfCeiling;
do
{
indexOfCeiling = value.IndexOf(parameter, startIndex);
if (indexOfCeiling < 0)
{
break;
}
else
{
startIndex = indexOfCeiling + 1;
counter++;
}
} while (true);
return counter;
}

Is it possible to `continue` a foreach loop while in a for loop?

Okay, so what I want to do should sound pretty simple. I have a method that checks every character in a string if it's a letter from a to m. I now have to continue a foreach loop, while in a for loop. Is there a possible way to do what I want to do?
public static string Function(String s)
{
int error = 0;
foreach (char c in s)
{
for (int i = 97; i <= 109; i++)
{
if (c == (char)i)
{
// Here immediately continue the upper foreach loop, not the for loop
continue;
}
}
error++;
}
int length = s.Length;
return error + "/" + length;
}
If there's a character that's not in the range of a to m, 1 should be added to error. In the end, the function should return the number of errors and the total number of characters in the string, e.g. "3/17".
Edit
What I wanted to achieve is not possible. There are workarounds, demonstrated in BsdDaemon's answer, by using a temporary variable.
The other answers fix my issue directly, by simply improving my code.
I think ('a' == (char)i) should be (c == (char)i).
And why not replace the for loop with just if ((int)c >= 97 && (int)c <= 109)?
Your solution might work, but it performs very poorly.
How about using LINQ:
int errors = s
.Count(c => !Enumerable.Range(97, 13).Contains(c));
Then there is no need to break out of the loop.
Or to avoid the nested loop altogether, which will improve performance:
int errors = s.Count(c => c < 97 || c > 109);
char is implicitly convertible to int so there's no need to cast.
You can do this by breaking out of the inner loop, which exits it as if its iterations had ended. After that, you can use a boolean together with continue to control whether the rest of the outer loop body runs:
public static string Function(String s)
{
int error = 0;
foreach (char c in s)
{
bool skip = false;
for (int i = 97; i <= 109; i++)
{
if (c == (char)i)
{
skip = true;
break;
}
}
if (skip) continue;
error++;
}
string length = Convert.ToString(s.Length);
return error + "/" + length;
}
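Another way to get the same effect without a flag variable is to move the inner loop into its own method, so that a plain return leaves the inner loop and the caller can continue the foreach. This is only a sketch; the class name RangeCheck and the helper name IsBetweenAAndM are invented for the example.

```csharp
using System;

public static class RangeCheck
{
    // Hypothetical helper: returning from it ends the inner loop immediately.
    private static bool IsBetweenAAndM(char c)
    {
        for (int i = 97; i <= 109; i++) // 97 = 'a', 109 = 'm'
        {
            if (c == (char)i)
                return true;
        }
        return false;
    }

    public static string Function(string s)
    {
        int error = 0;
        foreach (char c in s)
        {
            if (IsBetweenAAndM(c))
                continue; // continues the foreach, which is what the question asked for
            error++;
        }
        return error + "/" + s.Length;
    }
}
```

For example, RangeCheck.Function("abcxyz") counts x, y and z as errors and returns "3/6".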
Regular expressions are perfectly suited to this type of "problem" and are considerably more flexible; for one thing, you are not limited to consecutive characters. The following method demonstrates how to use Regex to extract the desired information from the targeted string.
private static string TestSearchTextRegex(string textToSearch)
{
var pattern = "[^a-m]";
var ms = Regex.Matches(textToSearch, pattern);
return $"{ms.Count}/{textToSearch.Length}";
}
NOTE
The pattern "[^a-m]" basically says: match any character that is NOT (^) in the set a-m. The set is easy to extend: with "[^a-mz]", for example, "z" would be accepted alongside "a-m" and no longer counted as an error.
Another advantage of the Regex solution is that you can use the actual characters you are looking for, as opposed to the numbers that represent them.
continue skips the remaining lines of the current iteration; to leave a loop entirely, use break. Since break only exits the inner for loop, a flag is still needed to decide whether error++ runs:
public static string Function(String s)
{
int error = 0;
foreach (char c in s)
{
bool found = false;
for (int i = 97; i <= 109; i++)
{
if (c == (char)i)
{
found = true;
break; // leaves the for loop; continue would only move on to the next i
}
}
if (!found) error++;
}
string length = Convert.ToString(s.Length);
return error + "/" + length;
}

How to run-length encode 'EEDDDNE' to '2E3DNE'?

Explanation: The task itself is that we have 13 strings (stored in the sor[] array) like the one in the title or 'EEENKDDDDKKKNNKDK'
and we have to shorten it in such a way that if there are two or more of the same letter next to each other, we write them in the form 'NumberoflettersLetter'
So by this rule, 'EEENKDDDDKKKNNKDK' would become '3ENK4D3K2NKDK'
using System;
public class Program
{
public static void Main(string[] args)
{
string[] sor = new string[] { "EEENKDDDDKKKNNKDK", "'EEDDDNE'" };
char holder;
int counter = 0;
string temporary;
int indexholder;
for (int i = 0; i < sor.Length; i++)
{
for (int q = 0; q < sor[i].Length; q++)
{
holder = sor[i][q];
indexholder = q;
counter = 0;
while (sor[i][q] == holder)
{
q++;
counter++;
}
if (counter > 1)
{
temporary = Convert.ToString(counter) + holder;
sor[i].Replace(sor[i].Substring(indexholder, q), temporary); // EX here
}
}
}
Console.ReadLine();
}
}
Sorry I didn't make the error clear, it says that :
"The value of index and length has to represent a place inside the string (System.ArgumentOutOfRangeException) - name of parameter: length"
...but I have no clue what's wrong with it, maybe it's a tiny little mistake, maybe the whole thing is messed up, so this is why I'd like someone to help me with this D:
(Ps 'indexholder' is there because i need it for another exercise)
EDIT:
'sor' is the string array that holds these strings (there are 13 of them) like the one mentioned in the title or in the example
You can use regex for this:
Regex.Replace("EEENKDDDDKKKNNKDK", @"(.)\1+", m => $"{m.Length}{m.Groups[1].Value}")
Explanation:
(.) matches any character and puts it in group #1
\1+ matches group #1 as many times as it can
Shortening the same string in place is more difficult than constructing a new one while iterating over the old one char by char. If you plan to append to a string repeatedly, it is better to use the StringBuilder class than to add to a string directly (for performance reasons).
You can streamline your approach by using the Enumerable.Aggregate function, which does the iteration over one string for you automatically:
using System;
using System.Linq;
using System.Text;
public class Program
{
public static string RunLengthEncode(string s)
{
if (string.IsNullOrEmpty(s)) // avoid null ref ex and do simple case
return "";
// we need a "state" between the different chars of s that we store here:
char curr_c = s[0]; // our current char, we start with the 1st one
int count = 0; // our char counter, we start with 0 as it will be
// incremented as soon as it is processed by Aggregate
// ( and then incremented to 1)
var agg = s.Aggregate(new StringBuilder(), (acc, c) => // StringBuilder
// performs better for multiple string-"additions" than string itself
{
if (c == curr_c)
count++; // same char, increment
else
{
// other char
if (count > 1) // store count if > 1
acc.AppendFormat("{0}", count);
acc.Append(curr_c); // store char
curr_c = c; // set current char to new one
count = 1; // startcount now is 1
}
return acc;
});
// add last things
if (count > 1) // store count if > 1
agg.AppendFormat("{0}", count);
agg.Append(curr_c); // store char
return agg.ToString(); // return the "simple" string
}
Test with
public static void Main(string[] args)
{
Console.WriteLine(RunLengthEncode("'EEENKDDDDKKKNNKDK' "));
Console.ReadLine();
}
}
Output for "'EEENKDDDDKKKNNKDK' ":
'3ENK4D3K2NKDK'
Your approach without using the same string is more like this:
var data = "'EEENKDDDDKKKNNKDK' ";
char curr_c = '\x0'; // avoid unassigned warning
int count = 0; // counter for the curr_c occurrences in a row
string result = string.Empty; // resulting string
foreach (var c in data) // process every character of data in order
{
if (c != curr_c) // new character found
{
if (count > 1) // more than 1, add count as string and the char
result += Convert.ToString(count) + curr_c;
else if (count > 0) // avoid initial `\x0` being put into string
result += curr_c;
curr_c = c; // remember new character
count = 1; // so far we found this one
}
else
count++; // not new, increment counter
}
// add the last counted char as well
if (count > 1)
result += Convert.ToString(count) + curr_c;
else
result += curr_c;
// output
Console.WriteLine(data + " ==> " + result);
Output:
'EEENKDDDDKKKNNKDK' ==> '3ENK4D3K2NKDK'
Instead of using the indexing operator [] on your string and struggling with indexes all over, I use foreach (var c in "sometext") ..., which proceeds char-wise through the string: much less hassle.
If you need to run-length encode an array/list (your sor) of strings, simply apply the code to each one (preferably by using foreach (var s in yourStringList) ...).
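As a rough sketch of that last step, the encoder can be wrapped in a method and applied to every entry of the array. The class name Rle and the methods Encode and EncodeAll are names chosen for this example only; the inner loop carries the i < s.Length bounds check that the while loop in the question is missing.

```csharp
using System;
using System.Linq;
using System.Text;

public static class Rle
{
    // Run-length encodes one string: runs of 2 or more become "<count><char>".
    public static string Encode(string s)
    {
        var sb = new StringBuilder();
        int i = 0;
        while (i < s.Length)
        {
            int start = i;
            // Advance to the end of the current run without leaving the string.
            while (i < s.Length && s[i] == s[start])
                i++;
            int run = i - start;
            if (run > 1)
                sb.Append(run);
            sb.Append(s[start]);
        }
        return sb.ToString();
    }

    // Applies Encode to each string and returns the results in a new array.
    public static string[] EncodeAll(string[] sor)
    {
        return sor.Select(s => Encode(s)).ToArray();
    }
}
```

Because strings are immutable in .NET, the results are collected in a new array instead of being written back into sor in place.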

C# iterate a continuously growing multi dimensional array

Imagine I wanted to iterate from A to Z. We would use either Foreach or For loop. After attaining Z I would then like to iterate from AA to ZZ, so it starts at AA, then goes to AB, AC...AZ, BA, BC..BZ..ZA,ZB, ZZ. At which point we would move to three chars, then 4 etc up to an undefined point.
Because we don't have a defined length for the array we cannot use nested for loops... so
Question: How can this be done?
Note, No code has been given because we all know how to foreach over an array and nest foreach loops.
Here's some code that will do what you want. Full explanation follows but in summary it takes advantage of the fact that once you have done all the letters of a given length you do A followed by that entire sequence again then B followed by the entire sequence again, etc.
private IEnumerable<string> EnumerateLetters()
{
int count = 1;
while (true)
{
foreach(var letters in EnumerateLetters(count))
{
yield return letters;
}
count++;
}
}
private IEnumerable<string> EnumerateLetters(int count)
{
if (count==0)
{
yield return String.Empty;
}
else
{
char letter = 'A';
while(letter<='Z')
{
foreach(var letters in EnumerateLetters(count-1))
{
yield return letter+letters;
}
letter++;
}
}
}
There are two methods. The first is the one that you call and will generate an infinite sequence of letters. The second does the recursion magic.
The first is pretty simple. it has a count of how many letters we are on, calls the second method with that count and then enumerates through them returning them. Once it has done all for one size it increases the count and loops.
The second method is the one that does the magic. It takes a count of the number of letters in the generated string. If the count is zero it yields a single empty string and stops.
If the count is one or more it loops through the letters A to Z and, for each letter, prepends that letter to every string in the sequence that is one character shorter: first all of them with A, then all of them with B, and so on.
The first method then moves on to the next length, so the sequence keeps generating indefinitely. Because it uses recursion it would be theoretically possible to overflow the stack if your letter string becomes too long, but at one level of recursion per letter in the string you would need very long strings before you need to worry about that (and I suspect that if you've gone that far in a loop you'll run into other problems first).
The other key point (if you are not aware) is that yield return uses deferred execution so it will generate each new element in the sequence as it is needed so it will only generate as many items as you ask for. If you iterate through five times it will only generate A-E and won't have wasted any time thinking about what comes next.
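To make the deferred execution visible, here is a small sketch that counts how many strings the generator has actually produced. The class name LazyLetters, the Produced counter and the method names are invented for this illustration, and the generator is a compact restatement of the two methods above rather than the exact original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyLetters
{
    // Counts how many strings have been handed out so far (for demonstration only).
    public static int Produced;

    // Infinite sequence: A..Z, AA..ZZ, AAA.. and so on.
    public static IEnumerable<string> All()
    {
        for (int length = 1; ; length++)
        {
            foreach (var s in OfLength(length))
            {
                Produced++;
                yield return s;
            }
        }
    }

    // All strings of exactly the given length, in alphabetical order.
    private static IEnumerable<string> OfLength(int length)
    {
        if (length == 0)
        {
            yield return "";
            yield break;
        }
        for (char c = 'A'; c <= 'Z'; c++)
            foreach (var rest in OfLength(length - 1))
                yield return c + rest;
    }
}
```

After LazyLetters.All().Take(5).ToList() the list holds A to E and Produced is 5, even though the sequence itself never ends; nothing beyond the requested items is computed.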
Yet another generator (adding 1 to a number with radix == 26: A stands for 0, B for 1, ... Z for 25):
// please note that Generator() can potentially yield infinitely many items
private static IEnumerable<String> Generator() {
char[] data = new char[] { 'A' }; // number to start with - "A"
while (true) {
yield return new string(data);
// trying to add one
for (int i = data.Length - 1; i >= 0; --i)
if (data[i] == 'Z')
data[i] = 'A';
else {
data[i] = (char) (data[i] + 1);
break;
}
// have we exhausted N-length numbers?
if (data.All(item => item == 'A'))
data = Enumerable
.Repeat('A', data.Length + 1) // ... continue with N + 1-length numbers
.ToArray();
}
}
Test
// take first 1000 items:
foreach (var item in Generator().Take(1000))
Console.WriteLine(item);
Outcome
A
B
C
..
X
Y
Z
AA
AB
..
AZ
BA
BB
BC
..
ZY
ZZ
AAA
AAB
AAC
..
ALK
ALL
You could do something like this; it gives me unending output in a similar pattern (sorry, not exactly your pattern, but it shows how to do it):
public static IEnumerable<string> Produce()
{
string seed = "A";
int i = 0;
while (true)
{
yield return String.Join("", Enumerable.Repeat(seed, i));
if (seed == "Z")
{
seed = "A";
i++;
}
else
{
seed = ((char)(seed[0]+1)).ToString();
}
}
}
And then:
foreach (var s in Produce())
{
//Do something
}
EDIT: I get the desired output with this method:
public static IEnumerable<string> Produce()
{
int i = 1;
while (true)
{
foreach(var c in produceAmount(i))
{
yield return c;
}
i++;
}
}
private static IEnumerable<string> produceAmount(int i)
{
var firstRow = Enumerable.Range('A', 'Z' - 'A'+1).Select(x => ((char)x).ToString());
if (i >= 1)
{
var second = produceAmount(i - 1);
foreach (var c in firstRow)
{
foreach (var s in second)
{
yield return c + s;
}
}
}
else
{
yield return "";
}
}
The way to go is to use a simple recursive approach. C# is a good language for presenting the idea, thanks to its generators:
private static IEnumerable<string> EnumerateLetters(int length) {
for (int i = 1; i <= length; i++) {
foreach (var letters in EnumerateLettersExact(i)) {
yield return letters;
}
}
}
private static IEnumerable<string> EnumerateLettersExact(int length) {
if (length == 0) {
yield return "";
}
else {
for (char c = 'A'; c <= 'Z'; ++c) {
foreach (var letters in EnumerateLettersExact(length - 1)) {
yield return c + letters;
}
}
}
}
private static void Main(string[] args) {
foreach (var letters in EnumerateLetters(2)) {
Console.Write($"{letters} ");
}
}
EnumerateLetters generates successive sequences of letters. The parameter sets the maximum length of the sequences you want.
EnumerateLettersExact takes care of generating the sequences recursively. A sequence is either empty or the concatenation of some letter with a sequence that is one letter shorter.
You start with an array of the letters A to Z: [A, ..., Z].
Then you nest one foreach loop per character position, for example (for two characters):
PSEUDOCODE
foreach (first in array) {
foreach (second in array) {
return first + second
}
}
Try the following. This is a method that generates the appropriate string for a given number. You can write a for loop that calls it for as many iterations as you want.
string SingleEntry(int number)
{
char[] array = " ABCDEFGHIJKLMNOPQRSTUVWXYZ".ToArray();
Stack<string> entry = new Stack<string>();
List<string> list = new List<string>();
int bas = 26;
int remainder = number, index = 0;
do
{
if ((remainder % bas) == 0)
{
index = bas;
remainder--;
}
else
index = remainder % bas;
entry.Push(array[index].ToString());
remainder = remainder / bas;
}
while (remainder != 0);
string s = "";
while (entry.Count > 0)
{
s += entry.Pop();
}
return s;
}
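The same number-to-letters idea can be written more compactly by subtracting 1 before each division, which avoids the special case for remainders of zero. This is a sketch under the assumption that numbering starts at 1 (1 = A, 26 = Z, 27 = AA); the class name LetterIndex and the method name ToLetters are made up for the example.

```csharp
using System;
using System.Text;

public static class LetterIndex
{
    // Converts 1 -> "A", 26 -> "Z", 27 -> "AA", 703 -> "AAA" (assumes number >= 1).
    public static string ToLetters(int number)
    {
        if (number < 1)
            throw new ArgumentOutOfRangeException(nameof(number));
        var sb = new StringBuilder();
        while (number > 0)
        {
            number--;                                 // shift to a 0-based digit
            sb.Insert(0, (char)('A' + number % 26));  // least significant letter first
            number /= 26;
        }
        return sb.ToString();
    }
}
```

A loop such as for (int n = 1; n <= 1000; n++) Console.WriteLine(LetterIndex.ToLetters(n)); would then print the sequence in order.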

Parsing strings recursively

I am trying to extract information out of a string - a fortran formatting string to be specific. The string is formatted like:
F8.3, I5, 3(5X, 2(A20,F10.3)), 'XXX'
with formatting fields delimited by "," and formatting groups inside brackets, with the number in front of the brackets indicating how many consecutive times the formatting pattern is repeated. So, the string above expands to:
F8.3, I5, 5X, A20,F10.3, A20,F10.3, 5X, A20,F10.3, A20,F10.3, 5X, A20,F10.3, A20,F10.3, 'XXX'
I am trying to make something in C# that will expand a string that conforms to that pattern. I have started going about it with lots of switch and if statements, but am wondering if I am not going about it the wrong way?
I was basically wondering if some Regex wizard thinks that regular expressions can do this in one fell swoop? I know nothing about regular expressions, but if this could solve my problem I am considering putting in some time to learn how to use them... on the other hand, if regular expressions can't sort this out then I'd rather spend my time looking at another method.
This has to be doable with Regex :)
I've expanded my previous example and it tests nicely with your example.
// regex to match the inner most patterns of n(X) and capture the values of n and X.
private static readonly Regex matcher = new Regex(@"(\d+)\(([^(]*?)\)", RegexOptions.None);
// create new string by repeating X n times, separated with ','
private static string Join(Match m)
{
var n = Convert.ToInt32(m.Groups[1].Value); // get value of n
var x = m.Groups[2].Value; // get value of X
return String.Join(",", Enumerable.Repeat(x, n));
}
// expand the string by recursively replacing the innermost values of n(X).
private static string Expand(string text)
{
var s = matcher.Replace(text, Join);
return (matcher.IsMatch(s)) ? Expand(s) : s;
}
// parse a string for occurrences of the n(X) pattern and expand them.
// return the string as a tokenized array.
public static string[] Parse(string text)
{
// Check that the number of parentheses is even.
if (text.Sum(c => (c == '(' || c == ')') ? 1 : 0) % 2 == 1)
throw new ArgumentException("The string contains an odd number of parentheses.");
return Expand(text).Split(new[] { ',', ' ' }, StringSplitOptions.RemoveEmptyEntries);
}
I would suggest using a recursive method like the example below (not tested):
ResultData Parse(String value, ref Int32 index)
{
ResultData result = new ResultData();
Int32 startIndex = index; // Used to get substrings
while (index < value.Length)
{
Char current = value[index];
if (current == '(')
{
index++;
result.Add(Parse(value, ref index));
startIndex = index;
continue;
}
if (current == ')')
{
// Push last result
index++;
return result;
}
// Process all other chars here
}
// We can't find the closing bracket
throw new Exception("String is not valid");
}
You may need to modify some parts of the code, but I used this method when writing a simple compiler. It is not complete, just an example.
Personally, I would suggest using a recursive function instead. Every time you hit an opening parenthesis, call the function again to parse that part. I'm not sure if you can use a regex to match a recursive data structure.
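A minimal sketch of that recursive idea might look like the following. The names FormatExpander, Expand, ParseList and Flush are invented for this example, and it assumes that quoted fields such as 'XXX' contain no commas or parentheses; a real Fortran format parser would need to handle those cases.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class FormatExpander
{
    // Expands repeat groups such as 3(5X, 2(A20,F10.3)) into a flat list of fields.
    public static List<string> Expand(string format)
    {
        int pos = 0;
        var items = ParseList(format, ref pos);
        if (pos != format.Length)
            throw new FormatException("Unmatched ')' at position " + pos);
        return items;
    }

    // Parses comma-separated fields until the end of input or an unmatched ')'.
    private static List<string> ParseList(string s, ref int pos)
    {
        var result = new List<string>();
        var token = new StringBuilder();
        while (pos < s.Length && s[pos] != ')')
        {
            char c = s[pos++];
            if (c == '(')
            {
                // Whatever was collected before '(' is the repeat count (default 1).
                string countText = token.ToString().Trim();
                int repeat = countText.Length == 0 ? 1 : int.Parse(countText);
                token.Clear();
                var group = ParseList(s, ref pos); // recurse into the bracketed group
                if (pos >= s.Length)
                    throw new FormatException("Missing ')'");
                pos++; // skip the closing ')'
                for (int i = 0; i < repeat; i++)
                    result.AddRange(group);
            }
            else if (c == ',')
            {
                Flush(token, result);
            }
            else
            {
                token.Append(c);
            }
        }
        Flush(token, result);
        return result;
    }

    // Adds the pending field, if any, to the result and resets the buffer.
    private static void Flush(StringBuilder token, List<string> result)
    {
        string text = token.ToString().Trim();
        if (text.Length > 0)
            result.Add(text);
        token.Clear();
    }
}
```

Calling string.Join(", ", FormatExpander.Expand("F8.3, I5, 3(5X, 2(A20,F10.3)), 'XXX'")) should reproduce the expanded string from the question, apart from uniform spacing after the commas.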
(Edit: Removed incorrect regex)
Ended up rewriting this today. It turns out that this can be done in one single method:
private static string ExpandBrackets(string Format)
{
int maxLevel = CountNesting(Format);
for (int currentLevel = maxLevel; currentLevel > 0; currentLevel--)
{
int level = 0;
int start = 0;
int end = 0;
for (int i = 0; i < Format.Length; i++)
{
char thisChar = Format[i];
switch (Format[i])
{
case '(':
level++;
if (level == currentLevel)
{
string group = string.Empty;
int repeat = 0;
/// Isolate the number of repeats, if any
/// If there are 0 repeats then set it to 1 so the group will be replaced by itself with the brackets removed
for (int j = i - 1; j >= 0; j--)
{
char c = Format[j];
if (c == ',')
{
start = j + 1;
break;
}
if (char.IsDigit(c))
repeat = int.Parse(c + (repeat != 0 ? repeat.ToString() : string.Empty));
else
throw new Exception("Non-numeric character " + c + " found in front of the brackets");
}
if (repeat == 0)
repeat = 1;
/// Isolate the format group
/// Parse until the first closing bracket. Level is decremented as this effectively takes us down one level
for (int j = i + 1; j < Format.Length; j++)
{
char c = Format[j];
if (c == ')')
{
level--;
end = j;
break;
}
group += c;
}
/// Substitute the expanded group for the original group in the format string
/// If the group is empty then just remove it from the string
if (string.IsNullOrEmpty(group))
{
Format = Format.Remove(start - 1, end - start + 2);
i = start;
}
else
{
string repeatedGroup = RepeatString(group, repeat);
Format = Format.Remove(start, end - start + 1).Insert(start, repeatedGroup);
i = start + repeatedGroup.Length - 1;
}
}
break;
case ')':
level--;
break;
}
}
}
return Format;
}
CountNesting() returns the highest level of bracket nesting in the format statement, but could be passed in as a parameter to the method. RepeatString() just repeats a string the specified number of times and substitutes it for the bracketed group in the format string.
