How to sort a string array by numeric style? - c#

I have an array of file names and want to sort it in numeric (natural) order. Could you suggest a solution?
Example1:
Original array: [name99.txt, name98.txt, name100.txt]
Sorted array: [name98.txt, name99.txt, name100.txt]
(Using string sorting, result of sorting is [name100.txt, name98.txt, name99.txt])
Example2:
Original array: [a99.txt, b98.txt, b100.txt]
Sorted array: [a99.txt, b98.txt, b100.txt]
(Using string sorting, result of sorting is [a99.txt, b100.txt, b98.txt])

string[] ar = new string[] { "name99.txt", "name98.txt", "name100.txt" };
Array.Sort(ar, (a, b) => int.Parse(Regex.Replace(a, "[^0-9]", "")) - int.Parse(Regex.Replace(b, "[^0-9]", "")));
foreach (var a in ar)
Console.WriteLine(a);
The above assumes that your files are always called name###.txt. For real numeric sorting, use the following more complicated version:
public static void NumericalSort(string[] ar)
{
Regex rgx = new Regex("([^0-9]*)([0-9]+)");
Array.Sort(ar, (a, b) =>
{
var ma = rgx.Matches(a);
var mb = rgx.Matches(b);
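// Compare only the chunks both names have, so a different chunk count cannot throw.
for (int i = 0; i < ma.Count && i < mb.Count; ++i)
{
int ret = ma[i].Groups[1].Value.CompareTo(mb[i].Groups[1].Value);
if (ret != 0)
return ret;
ret = int.Parse(ma[i].Groups[2].Value).CompareTo(int.Parse(mb[i].Groups[2].Value));
if (ret != 0)
return ret;
}
// All shared chunks are equal: the name with fewer chunks sorts first.
return ma.Count.CompareTo(mb.Count);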
});
}
static void Main(string[] args)
{
string[] ar = new string[] { "a99.txt", "b98.txt", "b100.txt" };
NumericalSort(ar);
foreach (var a in ar)
Console.WriteLine(a);
}

There may well be a managed way to do this, but I would probably just P/invoke to StrCmpLogicalW.
[DllImport("shlwapi.dll", CharSet=CharSet.Unicode, ExactSpelling=true)]
static extern int StrCmpLogicalW(String x, String y);
If you use this function, rather than rolling your own comparison function, you'll get the same behaviour as Explorer and other system components that use logical comparison.
Note, however, that this will not work in environments where WinAPI is inaccessible (such as Windows Phone, Mono or Silverlight), might work differently on different systems and should be decorated with a comment so the future maintainer of your code knows why P/Invoke is used for sorting.

One solution can be found here: Alphanumeric Sorting

My approach is good when the length of the numeric chunks is no longer than 9 digits:
private string[] NumericalSort(IEnumerable<string> list)
{
var ar = list.ToArray();
Array.Sort(ar, (a, b) =>
{
var aa = Regex.Replace(a, @"\d+", m => m.Value.PadLeft(9, '0'));
var bb = Regex.Replace(b, @"\d+", m => m.Value.PadLeft(9, '0'));
return string.Compare(aa, bb);
});
return ar;
}
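If the 9-digit limit is a concern, the digit runs can be compared directly instead of being padded. The following is only a sketch of that idea, not part of the answers above: NaturalComparer is a hypothetical name, it treats only ASCII digits as numbers, and it compares the non-digit parts ordinally (case-sensitive).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: compares strings chunk by chunk and treats each run of
// ASCII digits as a number of arbitrary length.
public sealed class NaturalComparer : IComparer<string>
{
    public int Compare(string x, string y)
    {
        int i = 0, j = 0;
        while (i < x.Length && j < y.Length)
        {
            if (IsDigit(x[i]) && IsDigit(y[j]))
            {
                // Skip leading zeros so "007" and "7" compare as the same number.
                while (i < x.Length && x[i] == '0') i++;
                while (j < y.Length && y[j] == '0') j++;
                int si = i, sj = j;
                while (i < x.Length && IsDigit(x[i])) i++;
                while (j < y.Length && IsDigit(y[j])) j++;
                // After zero stripping, the longer digit run is the larger number.
                int cmp = (i - si).CompareTo(j - sj);
                if (cmp == 0)
                    cmp = string.CompareOrdinal(x, si, y, sj, i - si);
                if (cmp != 0) return cmp;
            }
            else
            {
                int cmp = x[i].CompareTo(y[j]);
                if (cmp != 0) return cmp;
                i++;
                j++;
            }
        }
        // One string is a prefix of the other: the shorter remainder sorts first.
        return (x.Length - i).CompareTo(y.Length - j);
    }

    private static bool IsDigit(char c) => c >= '0' && c <= '9';
}
```

It can be passed to Array.Sort(ar, new NaturalComparer()) or to OrderBy. Because no number is ever parsed, there is no overflow for very long digit runs.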

Related

Match existing characters together into words and check if the words appear in the given word list?

I have a list of words like this:
string[] listWords = "la,lam,lan,son,som,some,mos,mao,sehi,noesrh,nroeh,doise".Split(',');
The list above contains meaningful combinations of characters; we can call it a dictionary for now.
Next, I have multiple character arrays like this:
string[] charArr1 = "a,j,s".Split(',');
string[] charArr2 = "c,l,o".Split(',');
string[] charArr3 = "d,m,n".Split(',');
string[] charArr4 = "n,e,w".Split(',');
string[] charArr5 = "f,o,x".Split(',');
string[] charArr6 = "h,q,z".Split(',');
string[] charArr7 = "i,r".Split(',');
I want to concatenate characters together. For each charArray I will take 1 character out and concatenate them together to become words, then I will check if these concatenated words are in the listwords[] list or not. If it is present, I will save the word in the saveWords[] array.
Condition:
Characters of the same charArray[] are not concatenated together and each charArray[] can only select one single character each time.
Match all cases and not miss any cases.
Eg:
a+c -> ac (Match correctly) -> search in listwords[] -> does not appear
a+j (Improper matching)
a+s (Improper matching)
a+c+d -> acd (Match properly) -> search in listwords[] -> does not appear
s+e+i+h -> seih (Match correctly) -> search in listwords[] -> does not appear
s+e+h+i -> sehi (Match correctly) -> search in listwords[] -> if this word appears-> save to saveWords[] array
What I mean is that the concatenation of characters will not miss any cases. Eg:
charArr1[]+charArr2[] -> will match the following cases: a+c, a+l, a+o, j+c, j+l, j+o, s+c, s+l, s+o
charArr2[]+charArr1[] -> will match the following cases: c+a, c+j, c+s, l+a,l+j, l+s,o+a,o+j, o+s
charArr1+charArr2+charArr3
charArr1[]+charArr3[]+charArr2[]
charArr2[]+charArr1[]+charArr3[]
charArr2[]+charArr3[]+charArr1[]
and so on...
Please help me as I am confused in figuring out the algorithm. Thanks a lot.
Given
public static IEnumerable<string[]> Permutate(string[] array, int i, int n)
{
if (i == n)
yield return array;
else
for (var j = i; j <= n; j++)
{
Swap(ref array[i], ref array[j]);
foreach (var s in Permutate(array, i + 1, n))
yield return s;
Swap(ref array[i], ref array[j]);
}
}
public static void Swap(ref string a, ref string b) => (a, b) = (b, a);
public static bool IncMask(string[][] source, int[] mask)
{
for (var i = 0; i < mask.Length; i++)
{
mask[i]++;
if (mask[i] > source[i].Length)
mask[i] = 0;
else
return false;
}
return true;
}
public static IEnumerable<string> Iterate(params string[][] source)
{
var masks = new int[source.Length];
while (true)
{
if (IncMask(source, masks))
break;
var array = masks
.Select((i, j) => (i, j))
.Where(x => x.i != 0)
.Select(x => source[x.j][x.i - 1])
.ToArray();
foreach (var result in Permutate(array, 0, array.Length - 1))
yield return string.Concat(result);
}
}
Usage
var listWords = "la,lam,lan,son,som,some,mos,mao,sehi,noesrh,nroeh,doise".Split(',');
var charArr1 = "a,j,s".Split(',');
var charArr2 = "c,l,o".Split(',');
var charArr3 = "d,m,n".Split(',');
var charArr4 = "n,e,w".Split(',');
var charArr5 = "f,o,x".Split(',');
var charArr6 = "h,q,z".Split(',');
var charArr7 = "i,r".Split(',');
var results = Iterate(charArr1, charArr2, charArr3, charArr4, charArr5, charArr6, charArr7)
.Where(x => listWords.Contains(x));
Console.WriteLine(string.Join(", ", results));
Results
la, lam, mao, som, mos, lan, son, lan, son, some, mao, som, mos, son, son, some, doise, doise, sehi, nroeh, noesrh, nroeh, noesrh
Note, this is a fairly computationally heavy problem, I didn't put much effort into making this efficient, nor cared about duplicates. It could likely be solved many other (more performant) ways.
Also, I have only minimally tested this, so I am not responsible for anyone you maim or otherwise injure with this code. It could be completely wrong ¯\_(ツ)_/¯
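One of the "other ways" hinted at above is to turn the problem around: instead of generating every concatenation, check each dictionary word and ask whether its letters can be taken from distinct character arrays. The sketch below is an assumption about how that could look, not the answer's method; WordMatcher, CanBuild and Assign are hypothetical names.

```csharp
using System;
using System.Linq;

public static class WordMatcher
{
    // Returns true if every letter of the word can be taken from a different group.
    public static bool CanBuild(string word, string[][] groups)
    {
        return Assign(word, 0, groups, new bool[groups.Length]);
    }

    // Backtracking: try each unused group that contains the current letter.
    private static bool Assign(string word, int pos, string[][] groups, bool[] used)
    {
        if (pos == word.Length) return true;
        string letter = word[pos].ToString();
        for (int g = 0; g < groups.Length; g++)
        {
            if (used[g] || !groups[g].Contains(letter)) continue;
            used[g] = true;
            if (Assign(word, pos + 1, groups, used)) return true;
            used[g] = false; // undo the choice and try the next group
        }
        return false;
    }
}
```

Usage would be along the lines of listWords.Where(w => WordMatcher.CanBuild(w, groups)), where groups is an array holding charArr1 to charArr7. Each word is reported once, so duplicates do not arise.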

C# Sort Lithuanian Letters

I need to sort the letters from a file alphabetically. How can I do this? I need a ToString method. Currently the console prints:
ABCDEFGIJKLM...ĄČĖĮ...
I need to get this:
AĄBCČDEĘĖFGHIĮYJKLMNOPRSŠTUŲŪVZŽ
Thanks
static char[] Letters(string e) //
{
e = e.ToUpper();
char[] mas = new char[32];
int n = 0;
foreach (char r in e)
if (Array.IndexOf(mas, r) < 0)
if (Char.IsLetter(r))
mas[n++] = r;
Array.Resize(ref mas, n);
Array.Sort(mas);
return mas;
}
You can solve this by sorting the characters using a comparer that understands how to compare characters alphabetically (the default is ordinal comparison).
This implementation is very inefficient, because it converts chars to strings every time it does a compare, but it works:
public class CharComparer : IComparer<char>
{
readonly CultureInfo culture;
public CharComparer(CultureInfo culture)
{
this.culture = culture;
}
public int Compare(char x, char y)
{
return string.Compare(new string(x, 1), 0, new string(y, 1), 0, 1, false, culture);
}
}
(Note: The culture is not actually necessary here; it works without it. I just included it for completeness.)
Then you can use that with sort functions that accept an IComparer, such as Array.Sort():
static void Main()
{
var test = "AĄBCČDEĘĖFGHIĮYJKLMNOPRSŠTUŲŪVZŽ".ToCharArray();
Console.OutputEncoding = System.Text.Encoding.Unicode;
Array.Sort(test);
Console.WriteLine(new string(test)); // Wrong result using default char comparer.
Array.Sort(test, new CharComparer(CultureInfo.GetCultureInfo("lt"))); // Right result using string comparer.
Console.WriteLine(new string(test));
}
An alternative approach is to use an array of single-character strings rather than an array of chars, and sort that instead. This works because the sort functions will use the string comparer, which understands alphabetical order:
var test = "AĄBCČDEĘĖFGHIĮYJKLMNOPRSŠTUŲŪVZŽ".Select(x => new string(x, 1)).ToArray();
Console.OutputEncoding = System.Text.Encoding.Unicode;
Array.Sort(test); // Correct result because it uses the string comparer, which understands alphabetical order.
Console.WriteLine(string.Concat(test));
Or using Linq:
var test = "AĄBCČDEĘĖFGHIĮYJKLMNOPRSŠTUŲŪVZŽ".Select(x => new string(x, 1)).ToArray();
Console.OutputEncoding = System.Text.Encoding.Unicode;
// Correct result because it uses the string comparer, which understands alphabetical order.
test = test.OrderBy(x => x).ToArray();
Console.WriteLine(string.Concat(test));
Using an array of strings instead of an array of chars is probably more performant when sorting like this.
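If culture data might not be available on the target machine, or the order you want differs from what the culture provides, another option is to sort by position in an explicit alphabet. This is only a sketch: it assumes the alphabet string from the question is exactly the desired order, and LithuanianOrder and SortLetters are hypothetical names.

```csharp
using System;
using System.Linq;

public static class LithuanianOrder
{
    // Assumed target order, copied from the question.
    private const string Alphabet = "AĄBCČDEĘĖFGHIĮYJKLMNOPRSŠTUŲŪVZŽ";

    // Returns the distinct letters of the text, sorted by their position in Alphabet.
    // Characters that are not in Alphabet are dropped.
    public static char[] SortLetters(string text)
    {
        return text.ToUpperInvariant()
            .Where(c => Alphabet.IndexOf(c) >= 0)
            .Distinct()
            .OrderBy(c => Alphabet.IndexOf(c))
            .ToArray();
    }
}
```

The trade-off is that the order is hard-coded, so it does not adapt to other languages the way a culture-aware comparer does.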
You could use following method to remove diacritics:
static string RemoveDiacritics(string text)
{
var normalizedString = text.Normalize(NormalizationForm.FormD);
var stringBuilder = new StringBuilder();
foreach (var c in normalizedString)
{
var unicodeCategory = CharUnicodeInfo.GetUnicodeCategory(c);
if (unicodeCategory != UnicodeCategory.NonSpacingMark)
{
stringBuilder.Append(c);
}
}
return stringBuilder.ToString().Normalize(NormalizationForm.FormC);
}
Then you can use those chars for the ordering:
string e = "ABCDEFGIJKLM...ĄČĖĮ...";
var normalizedCharList = e.Zip(RemoveDiacritics(e), (chr, n) => new { chr, normValue = (int)n }).ToList();
var orderedChars = normalizedCharList.OrderBy(x => x.normValue).Select(x => x.chr);
string ordered = new String(orderedChars.ToArray());

Replace every other of a certain char in a string

I have searched a lot to find a solution to this, but could not find anything. I do however suspect that it is because I don't know what to search for.
First, I have a string that I convert to an array. The string will be formatted like so:
"99.28099822998047,68.375 118.30699729919434,57.625 126.49999713897705,37.875 113.94499683380127,11.048999786376953 96.00499725341797,8.5"
I create the array with the following code:
public static Array StringToArray(string String)
{
var list = new List<string>();
string[] Coords = String.Split(' ', ',');
foreach (string Coord in Coords)
{
list.Add(Coord);
}
var array = list.ToArray();
return array;
}
Now my problem is; I am trying to find a way to convert it back into a string, with the same formatting. So, I could create a string simply using:
public static String ArrayToString(Array array)
{
string String = string.Join(",", array);
return String;
}
and then hopefully replace every 2nd "," with a space (" "). Is this possible? Or is there a whole other way you would do this?
Thank you in advance! I hope my question makes sense.
There is no built-in way of doing what you need. However, it's pretty trivial to achieve what it is you need e.g.
public static string[] StringToArray(string str)
{
return str.Replace(" ", ",").Split(',');
}
public static string ArrayToString(string[] array)
{
StringBuilder sb = new StringBuilder();
for (int i = 0; i <= array.Length-1; i++)
{
sb.AppendFormat(i % 2 != 0 ? "{0} " : "{0},", array[i]);
}
return sb.ToString();
}
If those are pairs of coordinates, you can start by parsing them like pairs, not like separate numbers:
public static IEnumerable<string[]> ParseCoordinates(string input)
{
return input.Split(' ').Select(vector => vector.Split(','));
}
It is easier then to reconstruct the original string:
public static string PrintCoordinates(IEnumerable<string[]> coords)
{
return String.Join(" ", coords.Select(vector => String.Join(",", vector)));
}
But if you absolutely need to have your data in a flat structure like array, it is then possible to convert it to a more structured format:
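public static IEnumerable<string[]> Pairwise(string[] coords)
{
// Pair each even-indexed element with the odd-indexed element that follows it.
return coords.Where((c, i) => i % 2 == 0)
.Zip(coords.Where((c, i) => i % 2 == 1), (coord1, coord2) => new[] { coord1, coord2 });
}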
You then can use this method in conjunction with PrintCoordinates to reconstruct your initial string.
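For the flat-array case, a plain loop over index pairs is another way to rebuild the string without a trailing separator. The sketch below assumes the array holds complete x,y pairs (an odd trailing element is ignored); CoordinateText, ToFlatArray and ToText are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class CoordinateText
{
    // "1,2 3,4" -> { "1", "2", "3", "4" }
    public static string[] ToFlatArray(string input)
    {
        return input.Split(' ', ',');
    }

    // { "1", "2", "3", "4" } -> "1,2 3,4"
    public static string ToText(string[] flat)
    {
        var pairs = new List<string>();
        // Step through the array two elements at a time.
        for (int i = 0; i + 1 < flat.Length; i += 2)
            pairs.Add(flat[i] + "," + flat[i + 1]);
        return string.Join(" ", pairs);
    }
}
```

Because string.Join only places separators between items, nothing has to be trimmed afterwards.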
Here is a route to do it. I don't think other solutions were removing last comma or space. I also include a test.
public static String ArrayToString(Array array)
{
var useComma = true;
var stringBuilder = new StringBuilder();
foreach (var value in array)
{
if (useComma)
{
stringBuilder.AppendFormat("{0}{1}", value, ",");
}
else
{
stringBuilder.AppendFormat("{0}{1}", value, " ");
}
useComma = !useComma;
}
// Remove last space or comma
stringBuilder.Length = stringBuilder.Length - 1;
return stringBuilder.ToString();
}
[TestMethod]
public void ArrayToStringTest()
{
var expectedStringValue =
"99.28099822998047,68.375 118.30699729919434,57.625 126.49999713897705,37.875 113.94499683380127,11.048999786376953 96.00499725341797,8.5";
var array = new[]
{
"99.28099822998047",
"68.375",
"118.30699729919434",
"57.625",
"126.49999713897705",
"37.875",
"113.94499683380127",
"11.048999786376953",
"96.00499725341797",
"8.5",
};
var actualStringValue = ArrayToString(array);
Assert.AreEqual(expectedStringValue, actualStringValue);
}
Another way of doing it:
string inputString = "1.11,11.3 2.22,12.4 2.55,12.8";
List<string[]> splitted = inputString.Split(' ').Select(a => a.Split(',')).ToList();
string joined = string.Join(" ", splitted.Select(a => string.Join(",",a)).ToArray());
"splitted" list will look like this:
1.11 11.3
2.22 12.4
2.55 12.8
"joined" string is the same as "inputString"
Here's another approach to this problem.
public static string ArrayToString(string[] array)
{
Debug.Assert(array.Length % 2 == 0, "Array length is not divisible by two.");
// Group all coordinates as pairs of two.
int index = 0;
var coordinates = from item in array
group item by index++ / 2
into pair
select pair;
// Format each coordinate pair with a comma.
var formattedCoordinates = coordinates.Select(i => string.Join(",", i));
// Now concatenate all the pairs with a space.
return string.Join(" ", formattedCoordinates);
}
And a simple demonstration:
public static void A_Simple_Test()
{
string expected = "1,2 3,4";
string[] array = new string[] { "1", "2", "3", "4" };
Debug.Assert(expected == ArrayToString(array));
}

compare the characters in two strings

In C#, how do I compare the characters in two strings.
For example, let's say I have these two strings
"bc3231dsc" and "bc3462dsc"
How do I programmatically figure out that the strings
both start with "bc3" and end with "dsc"?
So the given would be two variables:
var1 = "bc3231dsc";
var2 = "bc3462dsc";
After comparing each characters from var1 to var2, I would want the output to be:
leftMatch = "bc3";
center1 = "231";
center2 = "462";
rightMatch = "dsc";
Conditions:
1. The strings will always be 9 characters long.
2. The strings are not case sensitive.
The string class has two methods (StartsWith and EndsWith) that you can use.
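A minimal sketch of that suggestion, assuming the prefix and suffix to test for are already known; AffixCheck and ShareAffixes are hypothetical names, and the comparison ignores case to follow condition 2 of the question.

```csharp
using System;

public static class AffixCheck
{
    // True when both strings start with the prefix and end with the suffix, ignoring case.
    public static bool ShareAffixes(string a, string b, string prefix, string suffix)
    {
        const StringComparison cmp = StringComparison.OrdinalIgnoreCase;
        return a.StartsWith(prefix, cmp) && b.StartsWith(prefix, cmp)
            && a.EndsWith(suffix, cmp) && b.EndsWith(suffix, cmp);
    }
}
```

This only confirms a guess such as "bc3" and "dsc". It does not discover the common parts, which is what the other answers address.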
After reading your question and the answers already given, I think some constraints are missing that may be obvious to you but not to the community. But maybe we can do a little guesswork:
You'll have a bunch of string pairs that should be compared.
The two strings in each pair are of the same length, or you are only interested in comparing the characters read simultaneously from left to right.
You want some kind of enumeration that tells you where each block starts and how long it is.
Since a string is just an enumeration of chars, you can use LINQ here to get an idea of the matching characters, like this:
private IEnumerable<bool> CommonChars(string first, string second)
{
if (first == null)
throw new ArgumentNullException("first");
if (second == null)
throw new ArgumentNullException("second");
var charsToCompare = first.Zip(second, (LeftChar, RightChar) => new { LeftChar, RightChar });
var matchingChars = charsToCompare.Select(pair => pair.LeftChar == pair.RightChar);
return matchingChars;
}
With this we can proceed and find out how long each block of consecutive true and false flags is, using this method:
private IEnumerable<Tuple<int, int>> Pack(IEnumerable<bool> source)
{
if (source == null)
throw new ArgumentNullException("source");
using (var iterator = source.GetEnumerator())
{
if (!iterator.MoveNext())
{
yield break;
}
bool current = iterator.Current;
int index = 0;
int length = 1;
while (iterator.MoveNext())
{
if(current != iterator.Current)
{
yield return Tuple.Create(index, length);
index += length;
length = 0;
}
current = iterator.Current;
length++;
}
yield return Tuple.Create(index, length);
}
}
Currently I don't know whether an existing LINQ function provides the same functionality. From what I have read it should be possible with SelectMany() (since in theory you can accomplish any LINQ task with that method), but as an ad hoc implementation the above was easier (for me).
These functions could then be used in a way something like this:
var firstString = "bc3231dsc";
var secondString = "bc3462dsc";
var commonChars = CommonChars(firstString, secondString);
var packs = Pack(commonChars);
foreach (var item in packs)
{
Console.WriteLine("Left side: " + firstString.Substring(item.Item1, item.Item2));
Console.WriteLine("Right side: " + secondString.Substring(item.Item1, item.Item2));
Console.WriteLine();
}
This would then give you the following output:
Left side: bc3
Right side: bc3
Left side: 231
Right side: 462
Left side: dsc
Right side: dsc
The biggest drawback is the use of Tuple, because it leads to the ugly property names Item1 and Item2, which are far from instantly readable. If you really want to, you can introduce your own simple class that holds two integers and has some rock-solid property names. Also, the information about whether each block is shared by both strings or differs is currently lost, but once again it should be fairly simple to add this information to the tuple or to your own class.
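A possible shape for such a class is sketched here. It is an illustration, not the answer's code: Block, BlockFinder and FindBlocks are hypothetical names, the comparison is case-insensitive as the question requires, and only the overlapping length of the two strings is examined.

```csharp
using System;
using System.Collections.Generic;

// A run of consecutive positions that either all match or all differ.
public sealed class Block
{
    public int Start { get; }
    public int Length { get; }
    public bool IsMatch { get; }

    public Block(int start, int length, bool isMatch)
    {
        Start = start;
        Length = length;
        IsMatch = isMatch;
    }
}

public static class BlockFinder
{
    public static List<Block> FindBlocks(string first, string second)
    {
        var blocks = new List<Block>();
        int n = Math.Min(first.Length, second.Length);
        int start = 0;
        for (int i = 1; i <= n; i++)
        {
            // Close the current block at the end of the input or when the match flag flips.
            if (i == n || Same(first, second, i) != Same(first, second, start))
            {
                blocks.Add(new Block(start, i - start, Same(first, second, start)));
                start = i;
            }
        }
        return blocks;
    }

    private static bool Same(string a, string b, int i)
    {
        return char.ToUpperInvariant(a[i]) == char.ToUpperInvariant(b[i]);
    }
}
```

With this, callers can read block.Start, block.Length and block.IsMatch instead of Item1 and Item2, and can tell shared blocks from differing ones.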
static void Main(string[] args)
{
string test1 = "bc3231dsc";
string test2 = "bc3462dsc";
string firstmatch = GetMatch(test1, test2, false);
string lastmatch = GetMatch(test1, test2, true);
string center1 = test1.Substring(firstmatch.Length, test1.Length - (firstmatch.Length + lastmatch.Length));
string center2 = test2.Substring(firstmatch.Length, test2.Length - (firstmatch.Length + lastmatch.Length));
}
public static string GetMatch(string fist, string second, bool isReverse)
{
if (isReverse)
{
fist = ReverseString(fist);
second = ReverseString(second);
}
StringBuilder builder = new StringBuilder();
char[] ar1 = fist.ToArray();
for (int i = 0; i < ar1.Length; i++)
{
if (i < second.Length && ar1[i].Equals(second[i]))
{
builder.Append(ar1[i]);
}
else
{
break;
}
}
if (isReverse)
{
return ReverseString(builder.ToString());
}
return builder.ToString();
}
public static string ReverseString(string s)
{
char[] arr = s.ToCharArray();
Array.Reverse(arr);
return new string(arr);
}
Here is roughly what you need, written out in C# (string1 and string2 are the two inputs):
int pos = 0;
string resultStart = "";
// Walk both strings from the left until the characters differ or either string ends.
while (pos < string1.Length && pos < string2.Length)
{
if (string1[pos] != string2[pos])
break;
resultStart += string1[pos];
pos++;
}
resultStart now holds the common start string. You can do the same going backwards to find the common end.
Another solution you can use is Regular Expressions.
Regex re = new Regex("^bc3.*?dsc$");
String first = "bc3231dsc";
if(re.IsMatch(first)) {
//Act accordingly...
}
This gives you more flexibility when matching. The pattern above matches any string that starts with bc3 and ends with dsc, with anything except a linefeed in between. By changing .*? to \d+, you could specify that you only want digits between the two fields. From there, the possibilities are endless.
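As a sketch of where this can go, a capture group can also pull out the differing middle part. The prefix bc3 and suffix dsc are hard-coded here as in the pattern above; CenterExtractor and ExtractCenter are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CenterExtractor
{
    // Returns the digits between "bc3" and "dsc", or null if the input does not match.
    public static string ExtractCenter(string input)
    {
        Match m = Regex.Match(input, @"^bc3(\d+)dsc$", RegexOptions.IgnoreCase);
        return m.Success ? m.Groups[1].Value : null;
    }
}
```

RegexOptions.IgnoreCase covers the question's requirement that the comparison is not case sensitive.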
using System;
using System.Text.RegularExpressions;
using System.Collections.Generic;
class Sample {
static public void Main(){
string s1 = "bc3231dsc";
string s2 = "bc3462dsc";
List<string> common_str = commonStrings(s1,s2);
foreach ( var s in common_str)
Console.WriteLine(s);
}
static public List<string> commonStrings(string s1, string s2){
int len = s1.Length;
char [] match_chars = new char[len];
for(var i = 0; i < len ; ++i)
match_chars[i] = (Char.ToLower(s1[i])==Char.ToLower(s2[i]))? '#' : '_';
string pat = new String(match_chars);
Regex regex = new Regex("(#+)", RegexOptions.Compiled);
List<string> result = new List<string>();
foreach (Match match in regex.Matches(pat))
result.Add(s1.Substring(match.Index, match.Length));
return result;
}
}
For the updated conditions:
using System;
class Sample {
static public void Main(){
string s1 = "bc3231dsc";
string s2 = "bc3462dsc";
int len = 9;//s1.Length;//cond.1)
int l_pos = 0;
int r_pos = len;
for(int i=0;i<len && Char.ToLower(s1[i])==Char.ToLower(s2[i]);++i){
++l_pos;
}
for(int i=len-1;i>=l_pos && Char.ToLower(s1[i])==Char.ToLower(s2[i]);--i){
--r_pos;
}
string leftMatch = s1.Substring(0,l_pos);
string center1 = s1.Substring(l_pos, r_pos - l_pos);
string center2 = s2.Substring(l_pos, r_pos - l_pos);
string rightMatch = s1.Substring(r_pos);
Console.Write(
"leftMatch = \"{0}\"\n" +
"center1 = \"{1}\"\n" +
"center2 = \"{2}\"\n" +
"rightMatch = \"{3}\"\n",leftMatch, center1, center2, rightMatch);
}
}

Need algorithm to make simple program (sentence permutations)

I really can't work out a simple algorithm in C# to solve my problem. We have a sentence like this:
{Hello|Hi|Hi-Hi} my {mate|m8|friend|friends}.
My program should generate a lot of sentences that look like:
Hello my mate.
Hello my m8.
Hello my friend.
Hello my friends.
Hi my mate.
...
Hi-Hi my friends.
I know there are a lot of programs that can do this, but I'd like to make it myself. Of course, it should work with this too:
{Hello|Hi|Hi-Hi} my {mate|m8|friend|friends}, {i|we} want to {tell|say} you {hello|hi|hi-hi}.
Update: I just wasn't too happy about using regexen to parse such simple input, yet I disliked the manual index manipulation jungle found in other answers.
So I replaced the tokenizing with an Enumerator-based scanner with two alternating token states. This is better justified by the complexity of the input, and has a 'Linqy' feel to it (although it really isn't Linq). I have kept the original Regex-based parser at the end of my post for interested readers.
This just had to be solved using Eric Lippert's/IanG's CartesianProduct Linq extension method, in which the core of the program becomes:
public static void Main(string[] args)
{
const string data = @"{Hello|Hi|Hi-Hi} my {mate|m8|friend|friends}, {i|we} want to {tell|say} you {hello|hi|hi-hi}.";
var pockets = Tokenize(data.GetEnumerator());
foreach (var result in CartesianProduct(pockets))
Console.WriteLine(string.Join("", result.ToArray()));
}
Using just two regexen (chunks and legs) to do the parsing into 'pockets', it becomes a matter of writing the CartesianProduct to the console :) Here is the full working code (.NET 3.5+):
using System;
using System.Text;
using System.Text.RegularExpressions;
using System.Linq;
using System.Collections.Generic;
namespace X
{
static class Y
{
private static bool ReadTill(this IEnumerator<char> input, string stopChars, Action<StringBuilder> action)
{
var sb = new StringBuilder();
try
{
while (input.MoveNext())
if (stopChars.Contains(input.Current))
return true;
else
sb.Append(input.Current);
} finally
{
action(sb);
}
return false;
}
private static IEnumerable<IEnumerable<string>> Tokenize(IEnumerator<char> input)
{
var result = new List<IEnumerable<string>>();
while(input.ReadTill("{", sb => result.Add(new [] { sb.ToString() })) &&
input.ReadTill("}", sb => result.Add(sb.ToString().Split('|'))))
{
// Console.WriteLine("Expected cumulative results: " + result.Select(a => a.Count()).Aggregate(1, (i,j) => i*j));
}
return result;
}
public static void Main(string[] args)
{
const string data = @"{Hello|Hi|Hi-Hi} my {mate|m8|friend|friends}, {i|we} want to {tell|say} you {hello|hi|hi-hi}.";
var pockets = Tokenize(data.GetEnumerator());
foreach (var result in CartesianProduct(pockets))
Console.WriteLine(string.Join("", result.ToArray()));
}
static IEnumerable<IEnumerable<T>> CartesianProduct<T>(this IEnumerable<IEnumerable<T>> sequences)
{
IEnumerable<IEnumerable<T>> emptyProduct = new[] { Enumerable.Empty<T>() };
return sequences.Aggregate(
emptyProduct,
(accumulator, sequence) =>
from accseq in accumulator
from item in sequence
select accseq.Concat(new[] {item}));
}
}
}
Old Regex based parsing:
static readonly Regex chunks = new Regex(@"^(?<chunk>{.*?}|.*?(?={|$))+$", RegexOptions.Compiled);
static readonly Regex legs = new Regex(@"^{((?<alternative>.*?)[\|}])+(?<=})$", RegexOptions.Compiled);
private static IEnumerable<String> All(this Regex regex, string text, string group)
{
return !regex.IsMatch(text)
? new [] { text }
: regex.Match(text).Groups[group].Captures.Cast<Capture>().Select(c => c.Value);
}
public static void Main(string[] args)
{
const string data = @"{Hello|Hi|Hi-Hi} my {mate|m8|friend|friends}, {i|we} want to {tell|say} you {hello|hi|hi-hi}.";
var pockets = chunks.All(data, "chunk").Select(v => legs.All(v, "alternative"));
The rest is unchanged
Not sure what you need Linq (@user568262) or "simple" recursion (@Azad Salahli) for. Here's my take on it:
using System;
using System.Text;
class Program
{
static Random rng = new Random();
static string GetChoiceTemplatingResult(string t)
{
StringBuilder res = new StringBuilder();
for (int i = 0; i < t.Length; ++i)
if (t[i] == '{')
{
int j;
for (j = i + 1; j < t.Length; ++j)
if (t[j] == '}')
{
if (j - i < 1) continue;
var choices = t.Substring(i + 1, j - i - 1).Split('|');
res.Append(choices[rng.Next(choices.Length)]);
i = j;
break;
}
if (j == t.Length)
throw new InvalidOperationException("No matching } found.");
}
else
res.Append(t[i]);
return res.ToString();
}
static void Main(string[] args)
{
Console.WriteLine(GetChoiceTemplatingResult(
"{Hello|Hi|Hi-Hi} my {mate|m8|friend|friends}, {i|we} want to {tell|say} you {hello|hi|hi-hi}."));
}
}
As others have noted, you can solve your problem by splitting up the string into a sequence of sets, and then taking the Cartesian product of all of those sets. I wrote a bit about generating arbitrary Cartesian products here:
http://blogs.msdn.com/b/ericlippert/archive/2010/06/28/computing-a-cartesian-product-with-linq.aspx
An alternative approach, more powerful than that, is to declare a grammar for your language and then write a program that generates every string in that language. I wrote a long series of articles on how to do so. It starts here:
http://blogs.msdn.com/b/ericlippert/archive/2010/04/26/every-program-there-is-part-one.aspx
You can use a Tuple to hold index values of each collection.
For example, you would have something like:
List<string> Greetings = new List<string>()
{
"Hello",
"Hi",
"Hallo"
};
List<string> Targets = new List<string>()
{
"Mate",
"m8",
"friend",
"friends"
};
So now you have your greetings, let's create random numbers and fetch items.
static void Main(string[] args)
{
List<string> Greetings = new List<string>()
{
"Hello",
"Hi",
"Hallo"
};
List<string> Targets = new List<string>()
{
"Mate",
"m8",
"friend",
"friends"
};
var combinations = new List<Tuple<int, int>>();
Random random = new Random();
// Say you want 5 unique combinations.
while (combinations.Count < 5)
{
Tuple<int, int> tmpCombination = new Tuple<int, int>(random.Next(Greetings.Count), random.Next(Targets.Count));
if (!combinations.Contains(tmpCombination))
{
combinations.Add(tmpCombination);
}
}
foreach (var item in combinations)
{
Console.WriteLine("{0} my {1}", Greetings[item.Item1], Targets[item.Item2]);
}
Console.ReadKey();
}
This doesn't look trivial. You need to
1. do some parsing, to extract all the lists of words that you want to combine,
2. obtain all the actual combinations of these words (which is made harder by the fact that the number of lists you want to combine is not fixed)
3. rebuild the original sentence putting all the combinations in the place of the group they came from
part 1 (the parsing part) is probably the easiest: it could be done with a Regex like this
// get all the text within {} pairs
var pattern = @"\{(.*?)\}";
var query = "{Hello|Hi|Hi-Hi} my {mate|m8|friend|friends}.";
var matches = Regex.Matches(query, pattern);
// create a List of Lists
var lists = new List<List<string>>();
for(int i=0; i< matches.Count; i++)
{
var nl = matches[i].Groups[1].ToString().Split('|').ToList();
lists.Add(nl);
// build a "template" string like "{0} my {1}"
query = query.Replace(matches[i].Groups[1].ToString(), i.ToString());
}
for part 2 (taking a List of Lists and obtain all resulting combinations) you can refer to this answer
for part 3 (rebuilding your original sentence) you can now take the "template" string you have in query and use String.Format to substitute all the {0}, {1} .... with the combined values from part 2
// just one example,
// you will need to loop through all the combinations obtained from part 2
var OneResultingCombination = new List<string>() {"hi", "mate"};
var oneResult = string.Format(query, OneResultingCombination.ToArray());
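The three parts can be put together roughly as follows. This is a sketch under assumptions rather than the linked answer's method: the combinations in part 2 are enumerated with a simple odometer-style index counter, the input is assumed to contain no braces other than the {a|b|c} groups, and SentenceExpander and Expand are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SentenceExpander
{
    public static IEnumerable<string> Expand(string input)
    {
        var lists = new List<string[]>();
        // Parts 1 and 3: replace each {a|b|c} group with a numbered placeholder
        // and remember its alternatives.
        string template = Regex.Replace(input, @"\{(.*?)\}", m =>
        {
            lists.Add(m.Groups[1].Value.Split('|'));
            return "{" + (lists.Count - 1) + "}";
        });

        // Part 2: one index per group, advanced like an odometer (rightmost fastest).
        var indices = new int[lists.Count];
        while (true)
        {
            var args = new object[lists.Count];
            for (int i = 0; i < lists.Count; i++)
                args[i] = lists[i][indices[i]];
            yield return string.Format(template, args);

            int pos = lists.Count - 1;
            while (pos >= 0 && ++indices[pos] == lists[pos].Length)
            {
                indices[pos] = 0; // this wheel rolled over, carry to the left
                pos--;
            }
            if (pos < 0) yield break; // every wheel rolled over: all combinations done
        }
    }
}
```

Iterating with foreach (var s in SentenceExpander.Expand(query)) Console.WriteLine(s); prints one sentence per combination, in the same order as the list in the question.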
