Hierarchical and numerical ordering of strings made from delimited integers (C#) - c#

I have a list of folder names that represent chapters, subchapters, sections, paragraphs and lines in a specification. A small sample of these folders looks like the following.
1_1_1
1_1_12
1_1_2
1_2_1
1_2_1_3_1
1_2_2
I need to write a function that sorts these numerically while taking the hierarchical nesting into account. For instance, the correct output of sorting the above would be:
1_1_1
1_1_2
1_1_12
1_2_1
1_2_1_3_1
1_2_2
Since this is very much how version numbers are sorted, I tried the following code, which works until it meets an input with more than 4 sections (e.g. 1_2_1_3_1):
private List<Resource> OrderResources(List<Resource> list)
{
return list.OrderBy(v => System.Version.Parse(v.Id.Replace('_', '.'))).ToList();
}
The error I get is
System.ArgumentException : Version string portion was too short or too long. (Parameter 'input')

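Sorting is possible if you left-pad every number with zeros to a fixed width of n characters, so that the resulting strings compare correctly as text (the code below uses n = 3, so it assumes no number has more than 3 digits).
You can also use a long and shift each number n digits to the left, but then the number of levels is limited by the size of the long.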
static void Main(string[] args)
{
var chapter = new List<string>();
chapter.Add("1_1_1");
chapter.Add("1_1_12");
chapter.Add("1_1_2");
chapter.Add("1_2_1");
chapter.Add("1_2_1_3_1");
chapter.Add("1_2_2");
var result = chapter.OrderBy(x=>SortCalc(x)).ToArray();
foreach (var s in result)
{
Console.WriteLine($"{s}->{SortCalc(s)}");
}
}
private static string SortCalc(string x, int count = 3)
{
var array = x.Split('_').ToList();
for (var index = 0; index < array.Count; index++)
{
var length = count - array[index].Length;
if (length <=0)
continue;
array[index] = new string('0', length)+ array[index];
}
var num = string.Join("", array);
return num;
}
Output will be
1_1_1->001001001
1_1_2->001001002
1_1_12->001001012
1_2_1->001002001
1_2_1_3_1->001002001003001
1_2_2->001002002
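A fixed padding width stops working once a number has more digits than the width. As an alternative sketch (not part of the answer above), the segments can be parsed and compared as integers, level by level. The SegmentComparer name is hypothetical, and the code assumes every segment is a valid non-negative integer that fits in an int.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer: compares "1_2_1_3_1"-style ids segment by segment.
public sealed class SegmentComparer : IComparer<string>
{
    public int Compare(string x, string y)
    {
        // Assumption: both ids are non-null and every segment parses as an int.
        int[] a = x.Split('_').Select(s => int.Parse(s)).ToArray();
        int[] b = y.Split('_').Select(s => int.Parse(s)).ToArray();

        int shared = Math.Min(a.Length, b.Length);
        for (int i = 0; i < shared; i++)
        {
            int c = a[i].CompareTo(b[i]);
            if (c != 0) return c;   // first differing level decides
        }
        // All shared levels are equal: the shallower id (the parent) comes first.
        return a.Length.CompareTo(b.Length);
    }
}

public static class SegmentDemo
{
    public static void Main()
    {
        var ids = new List<string> { "1_1_1", "1_1_12", "1_1_2", "1_2_1", "1_2_1_3_1", "1_2_2" };
        var sorted = ids.OrderBy(id => id, new SegmentComparer()).ToList();
        Console.WriteLine(string.Join(", ", sorted));
        // 1_1_1, 1_1_2, 1_1_12, 1_2_1, 1_2_1_3_1, 1_2_2
    }
}
```

Parsing on every comparison is wasteful for large lists; parsing each id once up front would be the obvious refinement.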

I am posting this to help the OP, who has a problem with the solution posted at Natural Sort Order in C#.
This answer is just an example of how to use the class described in that answer, so it is pretty close to a verbatim copy of it. In any case, please look at the comment on the question.
Here we go.
First, the class NaturalStringComparer is taken in full from the answer linked above.
Next, I add a facsimile of the class Resource used by the OP and some code to initialize it:
public class Resource
{
public string ID { get; set; }
// ... other properties ...
}
void Main()
{
List<Resource> unordered = new List<Resource>
{
new Resource{ ID = "1_1_12"},
new Resource{ ID = "1_2_1_3_1"},
new Resource{ ID = "1_1_2"},
new Resource{ ID = "1_2_2"},
new Resource{ ID = "1_1_1"},
new Resource{ ID = "1_2_1"},
};
Now the call to the NaturalStringComparer is the following:
var c = new NaturalStringComparer();
var r = unordered.OrderBy(u => u.ID, c);
And the results are
foreach(var x in r)
Console.WriteLine(x.ID);
---------
1_1_1
1_1_2
1_1_12
1_2_1
1_2_1_3_1
1_2_2
I also provide a benchmark to check the performance.
Stopwatch sw = new Stopwatch();
sw.Start();
for (int x = 0; x < 1000000; x++)
{
var c = new NaturalStringComparer();
var r = unordered.OrderBy(u => u.ID, c);
}
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds);
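I get 17ms, which looks pretty good. Note, however, that OrderBy is lazy: since the result is never enumerated (for example with ToList()), the loop mostly measures creating the comparer and the query rather than the sorting itself.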
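To see how LINQ's deferred execution affects a timing loop like this one, here is a minimal sketch. CountingComparer is a hypothetical stand-in for NaturalStringComparer (whose source is not reproduced here); it only counts how often it is called.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: an ordinal comparer that counts its invocations.
public sealed class CountingComparer : IComparer<string>
{
    public int Calls { get; private set; }

    public int Compare(string x, string y)
    {
        Calls++;
        return string.CompareOrdinal(x, y);
    }
}

public static class DeferredDemo
{
    public static void Main()
    {
        var ids = new List<string> { "1_1_12", "1_2_1_3_1", "1_1_2" };
        var comparer = new CountingComparer();

        // Building the query does not sort anything yet.
        var query = ids.OrderBy(id => id, comparer);
        Console.WriteLine(comparer.Calls);      // 0

        // Enumerating the query (here via ToList) is what triggers the sort.
        var sorted = query.ToList();
        Console.WriteLine(comparer.Calls > 0);  // True
    }
}
```

A benchmark that wants to time the sort would therefore need to enumerate the result inside the measured loop.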

I decided to implement it using Linq as well:
using System;
using System.Collections.Generic;
using System.Linq;
class Program
{
static void Main(string[] args)
{
var chapter = new List<string>();
chapter.Add("1_5");
chapter.Add("1_4_1");
chapter.Add("1_1_1");
chapter.Add("1_1_12");
chapter.Add("1_1_2");
chapter.Add("1_2_1");
chapter.Add("1_2_1_3_1");
chapter.Add("1_2_2");
chapter.Add("1_2_22");
chapter.Add("1_2_21");
var result = chapter
.OrderBy(aX=> aX.Replace("_", "0"))
.ToArray();
foreach (var s in result)
{
Console.WriteLine($"{s} -> {s.Replace("_", "0")}");
}
}
}
Also, for some extra performance, the original implementation could benefit from Span<char> chars = stackalloc char[x.Length];. You can then skip the Split step entirely:
using System;
using System.Collections.Generic;
using System.Linq;
class Program
{
static void Main(string[] args)
{
var chapter = new List<string>();
chapter.Add("1_5");
chapter.Add("1_4_1");
chapter.Add("1_1_1");
chapter.Add("1_1_12");
chapter.Add("1_1_2");
chapter.Add("1_2_1");
chapter.Add("1_2_1_3_1");
chapter.Add("1_2_2");
chapter.Add("1_2_22");
chapter.Add("1_2_21");
var spanResult = chapter
.OrderBy(Sort)
.ToArray();
foreach (var s in spanResult)
{
Console.WriteLine($"{s} -> {Sort(s)}");
}
}
static string Sort(string aValue)
{
const char zero = '0';
const char underscore = '_';
Span<char> chars = stackalloc char[aValue.Length];
for (int i = 0; i < aValue.Length; i++)
{
if (aValue[i] == underscore)
{
if (i != aValue.Length - 1)
{
chars[i] = zero;
}
continue;
}
chars[i] = aValue[i];
}
return chars.ToString();
}
}
The output:
Linq
1 -> 1
1_ -> 10
1_1_1 -> 10101
1_1_12 -> 101012
1_1_2 -> 10102
1_2_1 -> 10201
1_2_1_3_1 -> 102010301
1_2_2 -> 10202
1_2_21 -> 102021
1_2_22 -> 102022
1_4_1 -> 10401
1_5 -> 105
Span
1 -> 1
1_ -> 1
1_1_1 -> 10101
1_1_12 -> 101012
1_1_2 -> 10102
1_2_1 -> 10201
1_2_1_3_1 -> 102010301
1_2_2 -> 10202
1_2_21 -> 102021
1_2_22 -> 102022
1_4_1 -> 10401
1_5 -> 105
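The assumption in my implementation is that _ only appears where there is a hierarchy level; otherwise the implementation would need to check for that, of course.
I have added the updated implementation in case that assumption is not correct.
Be aware that this is a plain text comparison: in the output above 1_1_12 comes before 1_1_2, so multi-digit numbers are not ordered numerically.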

Related

Match existing characters together into words and check if the words appear in the given word list?

I have a list of words like this:
string[] listWords = "la,lam,lan,son,som,some,mos,mao,sehi,noesrh,nroeh,doise".Split(',');
The words in the list above are combinations of characters, and they all have meanings. We can temporarily call it a dictionary.
Next, I have multiple character arrays like this:
string[] charArr1 = "a,j,s".Split(',');
string[] charArr2 = "c,l,o".Split(',');
string[] charArr3 = "d,m,n".Split(',');
string[] charArr4 = "n,e,w".Split(',');
string[] charArr5 = "f,o,x".Split(',');
string[] charArr6 = "h,q,z".Split(',');
string[] charArr7 = "i,r".Split(',');
I want to concatenate characters into words. From each charArray that takes part I take exactly one character and concatenate them, then I check whether the concatenated word is in the listWords[] list. If it is, I save the word in the saveWords[] array.
Conditions:
Characters from the same charArray[] are never concatenated with each other, and each charArray[] can contribute only one character per word.
All combinations must be tried, without missing any case.
Eg:
a+c -> ac (Match correctly) -> search in listwords[] -> does not appear
a+j (Improper matching)
a+s (Improper matching)
a+c+d -> acd (Match properly) -> search in listwords[] -> does not appear
s+e+i+h -> seih (Match correctly) -> search in listwords[] -> does not appear
s+e+h+i -> sehi (Match correctly) -> search in listwords[] -> if this word appears-> save to saveWords[] array
What I mean is that the concatenation of characters will not miss any cases. Eg:
charArr1[]+charArr2[] -> will match the following cases: a+c, a+l, a+o, j+c, j+l, j+o, s+c, s+l, s+o
charArr2[]+charArr1[] -> will match the following cases: c+a, c+j, c+s, l+a,l+j, l+s,o+a,o+j, o+s
charArr1+charArr2+charArr3
charArr1[]+charArr3[]+charArr2[]
charArr2[]+charArr1[]+charArr3[]
charArr2[]+charArr3[]+charArr1[]
and so on...
Please help me as I am confused in figuring out the algorithm. Thanks a lot.
Given
public static IEnumerable<string[]> Permutate(string[] array, int i, int n)
{
if (i == n)
yield return array;
else
for (var j = i; j <= n; j++)
{
Swap(ref array[i], ref array[j]);
foreach (var s in Permutate(array, i + 1, n))
yield return s;
Swap(ref array[i], ref array[j]);
}
}
public static void Swap(ref string a, ref string b) => (a, b) = (b, a);
public static bool IncMask(string[][] source, int[] mask)
{
for (var i = 0; i < mask.Length; i++)
{
mask[i]++;
if (mask[i] > source[i].Length)
mask[i] = 0;
else
return false;
}
return true;
}
public static IEnumerable<string> Iterate(params string[][] source)
{
var masks = new int[source.Length];
while (true)
{
if (IncMask(source, masks))
break;
var array = masks
.Select((i, j) => (i, j))
.Where(x => x.i != 0)
.Select(x => source[x.j][x.i - 1])
.ToArray();
foreach (var result in Permutate(array, 0, array.Length - 1))
yield return string.Concat(result);
}
}
Usage
var listWords = "la,lam,lan,son,som,some,mos,mao,sehi,noesrh,nroeh,doise".Split(',');
var charArr1 = "a,j,s".Split(',');
var charArr2 = "c,l,o".Split(',');
var charArr3 = "d,m,n".Split(',');
var charArr4 = "n,e,w".Split(',');
var charArr5 = "f,o,x".Split(',');
var charArr6 = "h,q,z".Split(',');
var charArr7 = "i,r".Split(',');
var results = Iterate(charArr1, charArr2, charArr3, charArr4, charArr5, charArr6, charArr7)
.Where(x => listWords.Contains(x));
Console.WriteLine(string.Join(", ", results));
Results
la, lam, mao, som, mos, lan, son, lan, son, some, mao, som, mos, son, son, some, doise, doise, sehi, nroeh, noesrh, nroeh, noesrh
Note, this is a fairly computationally heavy problem, I didn't put much effort into making this efficient, nor cared about duplicates. It could likely be solved many other (more performant) ways.
Also, I have only minimally tested this, so I am not responsible for anyone you maim or otherwise injure with this code. It could be completely wrong ¯\_(ツ)_/¯
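Since the note above says other approaches are possible, here is one hedged sketch that goes the other way round: instead of generating every concatenation, it takes each dictionary word and tries, by backtracking, to assign every letter to a different character array. WordMatcher, CanBuild and Assign are hypothetical names, and the sketch assumes each array entry is a single character.

```csharp
using System;
using System.Linq;

public static class WordMatcher
{
    // True when every letter of the word can be taken from a different group.
    public static bool CanBuild(string word, string[][] groups)
    {
        return Assign(word, 0, groups, new bool[groups.Length]);
    }

    // Backtracking: try each unused group that contains the current letter.
    private static bool Assign(string word, int pos, string[][] groups, bool[] used)
    {
        if (pos == word.Length) return true;
        string letter = word[pos].ToString();
        for (int g = 0; g < groups.Length; g++)
        {
            if (used[g] || !groups[g].Contains(letter)) continue;
            used[g] = true;
            bool ok = Assign(word, pos + 1, groups, used);
            used[g] = false;   // undo the choice before trying the next group
            if (ok) return true;
        }
        return false;
    }

    public static void Main()
    {
        string[] listWords = "la,lam,lan,son,som,some,mos,mao,sehi,noesrh,nroeh,doise".Split(',');
        string[][] groups = new[] { "a,j,s", "c,l,o", "d,m,n", "n,e,w", "f,o,x", "h,q,z", "i,r" }
            .Select(s => s.Split(','))
            .ToArray();

        string[] saveWords = listWords.Where(w => CanBuild(w, groups)).ToArray();
        Console.WriteLine(string.Join(", ", saveWords));
    }
}
```

With the sample data every one of the twelve dictionary words can be built, which agrees with the distinct words in the results above, and each word is reported only once.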

finding max value in a c# array

I am making a program in C# that takes a list of names and scores from a text document, extracts each score, and then finds the highest of the scores. I can separate the name from the score when there is just one line, but as soon as I try to make it an array I have no idea what I am doing.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
class Program
{
static void Main(string[] args)
{
System.IO.File.Exists(@"U:\StudentExamMarks.txt");
string[] text = System.IO.File.ReadAllLines(@"U:\StudentExamMarks.txt");
int a = 0;
string[] results = new string[a];
for(int i=0;i<text.Length ; i++ )
{
int x = text[i].LastIndexOf("\t");
int y = text[i].Length;
int z = (y - (x + 1));
results[a] = text[i].Substring((x+1), (z));
a++;
Console.WriteLine("{0}", results);
}
}
}
This is what I have so far.
The list is as follows:
John Cross 100
Christina Chandler 105
Greg Hamilton 107
Pearl Becker 111
Angel Ford 115
Wendell Sparks 118
As I said, when I attempted it without an array I could get it to display the 100 from the first result. I also do not know how, once I find the largest result, to link it back to the student's name.
I suggest using a class to hold all the properties; that improves readability a lot:
public class StudentExam
{
public string StudentName { get; set; }
public int Mark { get; set; }
}
and the following to read all lines and fill a List<StudentExam>:
var lines = File.ReadLines(@"U:\StudentExamMarks.txt")
.Where(l => !String.IsNullOrWhiteSpace(l));
List<StudentExam> studentsMarks = new List<StudentExam>();
foreach (string line in lines)
{
string[] tokens = line.Split('\t');
string markToken = tokens.Last().Trim();
int mark;
if (tokens.Length > 1 && int.TryParse(markToken, out mark))
{
StudentExam exam = new StudentExam{
Mark = mark,
StudentName = String.Join(" ", tokens.Take(tokens.Length - 1)).Trim()
};
studentsMarks.Add(exam);
}
}
Now it's easy to get the max-mark:
int maxMark = studentsMarks.Max(sm => sm.Mark); // 118
To find the highest score, you can use Linq with Regex like this
var lines = new[] {
"John Cross 100",
"Christina Chandler 105",
"Greg Hamilton 107",
"Pearl Becker 111"
};
var maxScore = lines.Max(l => int.Parse(Regex.Match(l, @"\b\d+\b").Value));
Here, I'm assuming you have read the file correctly into lines and that all of them have a valid int value for the score.
If the end of each entry is always a space followed by the student's score, you can use a simple substring:
int max = text.Max(x => Convert.ToInt32(x.Substring(x.LastIndexOf(' '))));
For each entry, create a substring that starts at the last index of ' ' and then convert that to an integer. Then return the max of those values.
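The question also asks how to link the highest score back to the student's name. One possible sketch is to parse each line into a name and a mark, and then pick the entry with the largest mark instead of only the largest number. TopScore and ParseLine are hypothetical names, and the sketch assumes every line ends with a space or tab followed by an integer.

```csharp
using System;
using System.Linq;

public static class TopScore
{
    // Splits "John Cross 100" at the last space or tab into (name, mark).
    // Assumption: the line always contains such a separator before the mark.
    public static (string Name, int Mark) ParseLine(string line)
    {
        string trimmed = line.Trim();
        int cut = trimmed.LastIndexOfAny(new[] { ' ', '\t' });
        string name = trimmed.Substring(0, cut).Trim();
        int mark = int.Parse(trimmed.Substring(cut + 1));
        return (name, mark);
    }

    public static void Main()
    {
        string[] lines =
        {
            "John Cross 100", "Christina Chandler 105", "Greg Hamilton 107",
            "Pearl Becker 111", "Angel Ford 115", "Wendell Sparks 118"
        };

        // Keep name and mark together, then take the entry with the highest mark.
        var best = lines
            .Select(l => ParseLine(l))
            .OrderByDescending(r => r.Mark)
            .First();

        Console.WriteLine($"{best.Name}: {best.Mark}");   // Wendell Sparks: 118
    }
}
```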

How to search for multiple strings and keep counters for them

What I'm trying to do is the following - I have hundreds of log files, that I need to search through and do some counting. The basic idea is this, take a .txt file, read every line, if search item 1 is found, increment the counter for search item 1, if search item 2 is found, increment the counter for search item 2 and so on.. For example, if the file contained something like...
a b c
d e f
g h i
j k h
And If I specified the searchables to be e & h, the output should say
e : 1
h : 2
The number of search terms is expandable: the user can give 1 search term or 10, so I'm not sure how to implement n counters based on the number of searchables.
Below is what I have so far. It's just a basic approach to see what works and what doesn't. Right now it only keeps the count for one of the search terms. At the moment I am writing the results to the console just to test; ultimately they will be written to a .txt or .xlsx. Any help will be appreciated!
string line;
int Scounter = 0;
int Mcounter = 0;
List<string> searchables = new List<string>();
private void search_Log(string p)
{
searchables.Add("S");
searchables.Add("M");
StreamReader reader = new StreamReader(p);
while ((line = reader.ReadLine()) != null)
{
for (int i = 0; i < searchables.Count(); i++)
{
if (line.Contains(searchables[i]))
{
Scounter++;
}
}
}
reader.Close();
Console.WriteLine("# of S: " + Scounter);
Console.WriteLine("# of M: " + Mcounter);
}
A common approach to this is to use a Dictionary<string, int> to track the values and counts:
// Initialise the dictionary:
Dictionary<string, int> counters = new Dictionary<string, int>();
Then later:
if (line.Contains(searchables[i]))
{
if (counters.ContainsKey(searchables[i]))
{
counters[searchables[i]] ++;
}
else
{
counters.Add(searchables[i], 1);
}
}
Then, when you are finished processing:
// Add in any searches which had no results:
foreach (var searchTerm in searchables)
{
if (counters.ContainsKey(searchTerm) == false)
{
counters.Add(searchTerm, 0);
}
}
foreach (var item in counters)
{
Console.WriteLine("Value {0} occurred {1} times", item.Key, item.Value);
}
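Put together, the dictionary approach might look like the following sketch. LineCounter and CountLinesContaining are hypothetical names; like the snippets above, it counts the lines that contain a term, not the individual occurrences within a line.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LineCounter
{
    // Returns, for each search term, the number of lines that contain it.
    public static Dictionary<string, int> CountLinesContaining(
        IEnumerable<string> lines, IEnumerable<string> searchables)
    {
        List<string> terms = searchables.Distinct().ToList();

        // Start every counter at 0 so terms without hits still appear in the result.
        var counters = new Dictionary<string, int>();
        foreach (string term in terms)
            counters[term] = 0;

        foreach (string line in lines)
            foreach (string term in terms)
                if (line.Contains(term))
                    counters[term]++;

        return counters;
    }

    public static void Main()
    {
        string[] lines = { "a b c", "d e f", "g h i", "j k h" };
        var counters = CountLinesContaining(lines, new[] { "e", "h" });
        foreach (var item in counters)
            Console.WriteLine("{0} : {1}", item.Key, item.Value);
    }
}
```

For the sample file from the question, the counters end up as 1 for e and 2 for h. In the real program the lines would come from something like File.ReadLines(path).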
You could use a class for the searchables, like:
public class Searchable
{
public string searchTerm;
public int count;
}
then
while ((line = reader.ReadLine()) != null)
{
foreach (var searchable in searchables)
{
if (line.Contains(searchable.searchTerm))
{
searchable.count++;
}
}
}
This would be one of many ways to track multiple search terms and their counts.
You can make use of linq here:
string lines = reader.ReadToEnd();
var result = lines.Split(new string[]{" ","\r\n"},StringSplitOptions.RemoveEmptyEntries)
.GroupBy(x=>x)
.Select(g=> new
{
Alphabet = g.Key ,
Count = g.Count()
}
);
Input:
a b c
d e f
Output :
a: 1
b: 1
c: 1
d: 1
e: 1
f: 1
This version counts 1..n search terms that each occur 1..n times per file line. It accounts for the possibility of a term appearing more than once on one line.
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace ConsoleApplication5
{
class Program
{
static void Main(string[] args)
{
Func<string, string[], Dictionary<string, int>> searchForCounts = null;
searchForCounts = (filePathAndName, searchTerms) =>
{
Dictionary<string, int> results = new Dictionary<string, int>();
if (string.IsNullOrEmpty(filePathAndName) || !File.Exists(filePathAndName))
return results;
using (TextReader tr = File.OpenText(filePathAndName))
{
string line = null;
while ((line = tr.ReadLine()) != null)
{
for (int i = 0; i < searchTerms.Length; ++i)
{
var searchTerm = searchTerms[i].ToLower();
var index = 0;
while (index > -1)
{
index = line.IndexOf(searchTerm, index, StringComparison.OrdinalIgnoreCase);
if (index > -1)
{
if (results.ContainsKey(searchTerm))
results[searchTerm] += 1;
else
results[searchTerm] = 1;
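// Move past the whole match; with Length - 1 a one-character term would never advance and the loop would not end.
index += searchTerm.Length;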
}
}
}
}
}
return results;
};
var counts = searchForCounts("D:\\Projects\\ConsoleApplication5\\ConsoleApplication5\\TestLog.txt", new string[] { "one", "two" });
Console.WriteLine("----Counts----");
foreach (var keyPair in counts)
{
Console.WriteLine("Term: " + keyPair.Key.PadRight(10, ' ') + " Count: " + keyPair.Value.ToString());
}
Console.ReadKey(true);
}
}
}
Input:
OnE, TwO
Output:
----Counts----
Term: one Count: 7
Term: two Count: 15

compare the characters in two strings

In C#, how do I compare the characters in two strings?
For example, let's say I have these two strings:
"bc3231dsc" and "bc3462dsc"
How do I programmatically figure out that the strings
both start with "bc3" and end with "dsc"?
So the given would be two variables:
var1 = "bc3231dsc";
var2 = "bc3462dsc";
After comparing each characters from var1 to var2, I would want the output to be:
leftMatch = "bc3";
center1 = "231";
center2 = "462";
rightMatch = "dsc";
Conditions:
1. The strings will always be 9 characters long.
2. The strings are not case sensitive.
The string class has two methods (StartsWith and EndsWith) that you can use.
After reading your question and the answers already given, I think some constraints are missing that may be obvious to you but not to the community. But maybe we can do a little guesswork:
You have a bunch of string pairs that should be compared.
The two strings in each pair have the same length, or you are only interested in comparing the characters read simultaneously from left to right.
You want some kind of enumeration that tells you where each block starts and how long it is.
Because a string is just an enumeration of chars, you can use LINQ here to get an idea of the matching characters, like this:
private IEnumerable<bool> CommonChars(string first, string second)
{
if (first == null)
throw new ArgumentNullException("first");
if (second == null)
throw new ArgumentNullException("second");
var charsToCompare = first.Zip(second, (LeftChar, RightChar) => new { LeftChar, RightChar });
var matchingChars = charsToCompare.Select(pair => pair.LeftChar == pair.RightChar);
return matchingChars;
}
With this we can proceed and find out how long each block of consecutive true and false flags is, using this method:
private IEnumerable<Tuple<int, int>> Pack(IEnumerable<bool> source)
{
if (source == null)
throw new ArgumentNullException("source");
using (var iterator = source.GetEnumerator())
{
if (!iterator.MoveNext())
{
yield break;
}
bool current = iterator.Current;
int index = 0;
int length = 1;
while (iterator.MoveNext())
{
if(current != iterator.Current)
{
yield return Tuple.Create(index, length);
index += length;
length = 0;
}
current = iterator.Current;
length++;
}
yield return Tuple.Create(index, length);
}
}
Currently I don't know if there is an existing LINQ function that provides the same functionality. From what I have read it should be possible with SelectMany() (because in theory you can accomplish any LINQ task with that method), but as an ad hoc implementation the above was easier (for me).
These functions could then be used something like this:
var firstString = "bc3231dsc";
var secondString = "bc3462dsc";
var commonChars = CommonChars(firstString, secondString);
var packs = Pack(commonChars);
foreach (var item in packs)
{
Console.WriteLine("Left side: " + firstString.Substring(item.Item1, item.Item2));
Console.WriteLine("Right side: " + secondString.Substring(item.Item1, item.Item2));
Console.WriteLine();
}
Which would then give you this output:
Left side: bc3
Right side: bc3
Left side: 231
Right side: 462
Left side: dsc
Right side: dsc
The biggest drawback is the use of Tuple, because it leads to the ugly property names Item1 and Item2, which are far from instantly readable. If you really want, you could introduce your own simple class that holds the two integers and has some rock-solid property names. Also, the information about whether each block is shared by both strings or differs between them is currently lost. But once again, it should be fairly simple to get this information into the tuple or your own class.
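As a rough sketch of such a class, assuming the hypothetical names Block, Start, Length and IsMatch (none of them come from the code above), the blocks can carry a flag that says whether they are shared by both strings. The comparison here ignores case, as the question's second condition asks, and only looks at the positions both strings have in common.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical replacement for Tuple<int, int> with readable property names.
public sealed class Block
{
    public int Start { get; }
    public int Length { get; }
    public bool IsMatch { get; }   // true when both strings agree on this block

    public Block(int start, int length, bool isMatch)
    {
        Start = start;
        Length = length;
        IsMatch = isMatch;
    }
}

public static class BlockFinder
{
    public static IEnumerable<Block> Blocks(string first, string second)
    {
        int n = Math.Min(first.Length, second.Length);
        int start = 0;
        while (start < n)
        {
            bool match = Same(first[start], second[start]);
            int end = start + 1;
            // Extend the block while the match/mismatch state stays the same.
            while (end < n && Same(first[end], second[end]) == match)
                end++;
            yield return new Block(start, end - start, match);
            start = end;
        }
    }

    private static bool Same(char a, char b)
    {
        return char.ToUpperInvariant(a) == char.ToUpperInvariant(b);
    }

    public static void Main()
    {
        foreach (Block b in Blocks("bc3231dsc", "bc3462dsc"))
            Console.WriteLine("{0} {1} {2}", b.Start, b.Length, b.IsMatch);
        // 0 3 True
        // 3 3 False
        // 6 3 True
    }
}
```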
static void Main(string[] args)
{
string test1 = "bc3231dsc";
string test2 = "bc3462dsc";
string firstMatch = GetMatch(test1, test2, false);
string lastMatch = GetMatch(test1, test2, true);
// The centers are whatever lies between the common prefix and the common suffix.
int centerLength = test1.Length - (firstMatch.Length + lastMatch.Length);
string center1 = test1.Substring(firstMatch.Length, centerLength);
string center2 = test2.Substring(firstMatch.Length, centerLength);
}
public static string GetMatch(string fist, string second, bool isReverse)
{
if (isReverse)
{
fist = ReverseString(fist);
second = ReverseString(second);
}
StringBuilder builder = new StringBuilder();
char[] ar1 = fist.ToArray();
for (int i = 0; i < ar1.Length; i++)
{
if (fist.Length > i + 1 && ar1[i].Equals(second[i]))
{
builder.Append(ar1[i]);
}
else
{
break;
}
}
if (isReverse)
{
return ReverseString(builder.ToString());
}
return builder.ToString();
}
public static string ReverseString(string s)
{
char[] arr = s.ToCharArray();
Array.Reverse(arr);
return new string(arr);
}
Pseudocode of what you need:
int stringpos = 0
string resultstart = ""
while stringpos < length of the shorter string
{
if string1.charAt(stringpos) == string2.charAt(stringpos)
{
resultstart = resultstart + string1.charAt(stringpos)
stringpos = stringpos + 1
}
else
exit while
}
resultstart now holds your start string. You can do the same going backwards to get the end string.
Another solution you can use is Regular Expressions.
Regex re = new Regex("^bc3.*?dsc$");
String first = "bc3231dsc";
if(re.IsMatch(first)) {
//Act accordingly...
}
This gives you more flexibility when matching. The pattern above matches any string that starts with bc3 and ends with dsc, with anything in between except a linefeed. By changing .*? to \d+, you could specify that you only want digits between the two fields. From there, the possibilities are endless.
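If the middle part is needed and not just a yes/no answer, a capture group can return it. The following is only a sketch: CenterExtractor and Center are hypothetical names, the prefix and suffix are hard-coded as in the pattern above, and RegexOptions.IgnoreCase is added because the question says the strings are not case sensitive.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CenterExtractor
{
    // Returns the digits between "bc3" and "dsc", or null when the input does not match.
    public static string Center(string input)
    {
        Match m = Regex.Match(input, @"^bc3(\d+)dsc$", RegexOptions.IgnoreCase);
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main()
    {
        Console.WriteLine(Center("bc3231dsc"));   // 231
        Console.WriteLine(Center("BC3462DSC"));   // 462
    }
}
```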
using System;
using System.Text.RegularExpressions;
using System.Collections.Generic;
class Sample {
static public void Main(){
string s1 = "bc3231dsc";
string s2 = "bc3462dsc";
List<string> common_str = commonStrings(s1,s2);
foreach ( var s in common_str)
Console.WriteLine(s);
}
static public List<string> commonStrings(string s1, string s2){
int len = s1.Length;
char [] match_chars = new char[len];
for(var i = 0; i < len ; ++i)
match_chars[i] = (Char.ToLower(s1[i])==Char.ToLower(s2[i]))? '#' : '_';
string pat = new String(match_chars);
Regex regex = new Regex("(#+)", RegexOptions.Compiled);
List<string> result = new List<string>();
foreach (Match match in regex.Matches(pat))
result.Add(s1.Substring(match.Index, match.Length));
return result;
}
}
For the updated conditions:
using System;
class Sample {
static public void Main(){
string s1 = "bc3231dsc";
string s2 = "bc3462dsc";
int len = 9;//s1.Length;//cond.1)
int l_pos = 0;
int r_pos = len;
for(int i=0;i<len && Char.ToLower(s1[i])==Char.ToLower(s2[i]);++i){
++l_pos;
}
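// Stop at l_pos so the right match never overlaps the left one (matters when the strings are identical).
for(int i=len-1;i>=l_pos && Char.ToLower(s1[i])==Char.ToLower(s2[i]);--i){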
--r_pos;
}
string leftMatch = s1.Substring(0,l_pos);
string center1 = s1.Substring(l_pos, r_pos - l_pos);
string center2 = s2.Substring(l_pos, r_pos - l_pos);
string rightMatch = s1.Substring(r_pos);
Console.Write(
"leftMatch = \"{0}\"\n" +
"center1 = \"{1}\"\n" +
"center2 = \"{2}\"\n" +
"rightMatch = \"{3}\"\n",leftMatch, center1, center2, rightMatch);
}
}

Need algorithm to make simple program (sentence permutations)

I really can't understand how to write a simple algorithm in C# to solve my problem. So, we have a sentence:
{Hello|Hi|Hi-Hi} my {mate|m8|friend|friends}.
So, my program should produce a lot of sentences that look like:
Hello my mate.
Hello my m8.
Hello my friend.
Hello my friends.
Hi my mate.
...
Hi-Hi my friends.
I know there are a lot of programs that can do this, but I'd like to make it myself. Of course, it should work with this too:
{Hello|Hi|Hi-Hi} my {mate|m8|friend|friends}, {i|we} want to {tell|say} you {hello|hi|hi-hi}.
Update: I wasn't too happy about using regexes to parse such simple input, yet I disliked the manual index-manipulation jungle found in other answers.
So I replaced the tokenizing with an enumerator-based scanner with two alternating token states. This is better matched to the complexity of the input, and has a 'Linqy' feel to it (although it really isn't Linq). I have kept the original Regex-based parser at the end of my post for interested readers.
This just had to be solved using Eric Lippert's/IanG's CartesianProduct Linq extension method, with which the core of the program becomes:
public static void Main(string[] args)
{
const string data = @"{Hello|Hi|Hi-Hi} my {mate|m8|friend|friends}, {i|we} want to {tell|say} you {hello|hi|hi-hi}.";
var pockets = Tokenize(data.GetEnumerator());
foreach (var result in CartesianProduct(pockets))
Console.WriteLine(string.Join("", result.ToArray()));
}
Once the input is parsed into 'pockets' (originally with just two regexes, chunks and legs, now with the tokenizer), it becomes a matter of writing the CartesianProduct to the console :) Here is the full working code (.NET 3.5+):
using System;
using System.Text;
using System.Text.RegularExpressions;
using System.Linq;
using System.Collections.Generic;
namespace X
{
static class Y
{
private static bool ReadTill(this IEnumerator<char> input, string stopChars, Action<StringBuilder> action)
{
var sb = new StringBuilder();
try
{
while (input.MoveNext())
if (stopChars.Contains(input.Current))
return true;
else
sb.Append(input.Current);
} finally
{
action(sb);
}
return false;
}
private static IEnumerable<IEnumerable<string>> Tokenize(IEnumerator<char> input)
{
var result = new List<IEnumerable<string>>();
while(input.ReadTill("{", sb => result.Add(new [] { sb.ToString() })) &&
input.ReadTill("}", sb => result.Add(sb.ToString().Split('|'))))
{
// Console.WriteLine("Expected cumulative results: " + result.Select(a => a.Count()).Aggregate(1, (i,j) => i*j));
}
return result;
}
public static void Main(string[] args)
{
const string data = @"{Hello|Hi|Hi-Hi} my {mate|m8|friend|friends}, {i|we} want to {tell|say} you {hello|hi|hi-hi}.";
var pockets = Tokenize(data.GetEnumerator());
foreach (var result in CartesianProduct(pockets))
Console.WriteLine(string.Join("", result.ToArray()));
}
static IEnumerable<IEnumerable<T>> CartesianProduct<T>(this IEnumerable<IEnumerable<T>> sequences)
{
IEnumerable<IEnumerable<T>> emptyProduct = new[] { Enumerable.Empty<T>() };
return sequences.Aggregate(
emptyProduct,
(accumulator, sequence) =>
from accseq in accumulator
from item in sequence
select accseq.Concat(new[] {item}));
}
}
}
Old Regex-based parsing:
static readonly Regex chunks = new Regex(@"^(?<chunk>{.*?}|.*?(?={|$))+$", RegexOptions.Compiled);
static readonly Regex legs = new Regex(@"^{((?<alternative>.*?)[\|}])+(?<=})$", RegexOptions.Compiled);
private static IEnumerable<String> All(this Regex regex, string text, string group)
{
return !regex.IsMatch(text)
? new [] { text }
: regex.Match(text).Groups[group].Captures.Cast<Capture>().Select(c => c.Value);
}
public static void Main(string[] args)
{
const string data = @"{Hello|Hi|Hi-Hi} my {mate|m8|friend|friends}, {i|we} want to {tell|say} you {hello|hi|hi-hi}.";
var pockets = chunks.All(data, "chunk").Select(v => legs.All(v, "alternative"));
The rest is unchanged
Not sure what you need Linq (@user568262) or "simple" recursion (@Azad Salahli) for. Here's my take on it:
using System;
using System.Text;
class Program
{
static Random rng = new Random();
static string GetChoiceTemplatingResult(string t)
{
StringBuilder res = new StringBuilder();
for (int i = 0; i < t.Length; ++i)
if (t[i] == '{')
{
int j;
for (j = i + 1; j < t.Length; ++j)
if (t[j] == '}')
{
if (j - i < 1) continue;
var choices = t.Substring(i + 1, j - i - 1).Split('|');
res.Append(choices[rng.Next(choices.Length)]);
i = j;
break;
}
if (j == t.Length)
throw new InvalidOperationException("No matching } found.");
}
else
res.Append(t[i]);
return res.ToString();
}
static void Main(string[] args)
{
Console.WriteLine(GetChoiceTemplatingResult(
"{Hello|Hi|Hi-Hi} my {mate|m8|friend|friends}, {i|we} want to {tell|say} you {hello|hi|hi-hi}."));
}
}
As others have noted, you can solve your problem by splitting up the string into a sequence of sets, and then taking the Cartesian product of all of those sets. I wrote a bit about generating arbitrary Cartesian products here:
http://blogs.msdn.com/b/ericlippert/archive/2010/06/28/computing-a-cartesian-product-with-linq.aspx
An alternative approach, more powerful than that, is to declare a grammar for your language and then write a program that generates every string in that language. I wrote a long series of articles on how to do so. It starts here:
http://blogs.msdn.com/b/ericlippert/archive/2010/04/26/every-program-there-is-part-one.aspx
You can use a Tuple to hold index values of each collection.
For example, you would have something like:
List<string> Greetings = new List<string>()
{
"Hello",
"Hi",
"Hallo"
};
List<string> Targets = new List<string>()
{
"Mate",
"m8",
"friend",
"friends"
};
So now that you have your greetings and targets, let's create random numbers and fetch items.
static void Main(string[] args)
{
List<string> Greetings = new List<string>()
{
"Hello",
"Hi",
"Hallo"
};
List<string> Targets = new List<string>()
{
"Mate",
"m8",
"friend",
"friends"
};
var combinations = new List<Tuple<int, int>>();
Random random = new Random();
// Say you want 5 unique combinations.
while (combinations.Count < 5)
{
Tuple<int, int> tmpCombination = new Tuple<int, int>(random.Next(Greetings.Count), random.Next(Targets.Count));
if (!combinations.Contains(tmpCombination))
{
combinations.Add(tmpCombination);
}
}
foreach (var item in combinations)
{
Console.WriteLine("{0} my {1}", Greetings[item.Item1], Targets[item.Item2]);
}
Console.ReadKey();
}
This doesn't look trivial. You need to
1. do some parsing, to extract all the lists of words that you want to combine,
2. obtain all the actual combinations of these words (which is made harder by the fact that the number of lists you want to combine is not fixed),
3. rebuild the original sentence, putting each combination in the place of the groups it came from.
Part 1 (the parsing part) is probably the easiest: it could be done with a Regex like this:
// get all the text within {} pairs
var pattern = @"\{(.*?)\}";
var query = "{Hello|Hi|Hi-Hi} my {mate|m8|friend|friends}.";
var matches = Regex.Matches(query, pattern);
// create a List of Lists
var lists = new List<List<string>>();
for(int i=0; i< matches.Count; i++)
{
var nl = matches[i].Groups[1].ToString().Split('|').ToList();
lists.Add(nl);
// build a "template" string like "{0} my {1}"
query = query.Replace(matches[i].Groups[1].ToString(), i.ToString());
}
For part 2 (taking a List of Lists and obtaining all resulting combinations) you can refer to this answer.
For part 3 (rebuilding your original sentence) you can now take the "template" string you have in query and use String.Format to substitute all the {0}, {1} .... with the combined values from part 2:
// just one example,
// you will need to loop through all the combinations obtained from part 2
var OneResultingCombination = new List<string>() {"hi", "mate"};
var oneResult = string.Format(query, OneResultingCombination.ToArray());
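The three parts can be combined into one small sketch. Spinner and Expand are hypothetical names, the combination step is a plain iterative expansion and not the code from the answer referred to in part 2, and the sketch assumes the only braces in the input are the ones around the {a|b|c} groups (any other brace would confuse string.Format).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class Spinner
{
    public static List<string> Expand(string input)
    {
        // Parts 1 and 3a: collect each group's alternatives and replace the
        // group with a numbered placeholder, giving e.g. "{0} my {1}."
        var groups = new List<string[]>();
        string template = Regex.Replace(input, @"\{(.*?)\}", m =>
        {
            groups.Add(m.Groups[1].Value.Split('|'));
            return "{" + (groups.Count - 1) + "}";
        });

        // Part 2: build all combinations by extending every partial
        // combination with every alternative of the next group.
        IEnumerable<string[]> combos = new[] { new string[0] };
        foreach (string[] group in groups)
        {
            combos = combos
                .SelectMany(prefix => group.Select(choice => prefix.Concat(new[] { choice }).ToArray()))
                .ToList();
        }

        // Part 3b: fill the template once per combination.
        return combos.Select(c => string.Format(template, (object[])c)).ToList();
    }

    public static void Main()
    {
        foreach (string s in Expand("{Hello|Hi|Hi-Hi} my {mate|m8|friend|friends}."))
            Console.WriteLine(s);
    }
}
```

For the first sentence from the question this yields 3 x 4 = 12 lines, starting with "Hello my mate." and ending with "Hi-Hi my friends.".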
