Get all possible word combinations - c#

I have a list of n words (let's say 26). Now I want to get a list of all possible combinations, but with a maximum of k words per row (let's say 5).
So when my word list is: aaa, bbb, ..., zzz
I want to get:
aaa
bbb
...
aaabbb
aaaccc
...
aaabbbcccdddeee
aaabbbcccdddfff
...
I want to make it variable, so that it works with any n or k value.
No word may appear twice within a combination, and every combination needs to be included (even if there are very many).
How could I achieve that?
EDIT:
Thank you for your answers. It is not an assignment. It is just that I forgot the combination of my password parts and I want to be sure that I have tested all combinations. I do not actually have 26 password parts, but this made it easier to explain what I want.
If there are other people with the same problem, this link could be helpful:
Generate word combination array in c#

I wrote a simple function to do this:
private string allState(int index, string[] inStr)
{
    string a = inStr[index];        // word that every entry of this pass starts with
    int l = index + 1;              // first word appended after it
    int k = l;                      // exclusive end of the appended run
    var result = string.Empty;
    var t = inStr.Length;
    int i = index;
    while (i < t)
    {
        string s = a;
        for (int j = l; j < k; j++)
        {
            s += inStr[j];
        }
        result += s + ",";
        k++;
        i++;
    }
    // Repeat with the next word as the starting point
    index++;
    if (index < inStr.Length)
        result += allState(index, inStr);
    return result.TrimEnd(new char[] { ',' });
}
allState(0, new string[] { "a", "b", "c" }) // returns "a,ab,abc,b,bc,c"

You could take a look at this
However, if you need to get large numbers of combinations (in the tens of millions) you should use lazy evaluation for the generation of the combinations.
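A minimal sketch of what lazy generation could look like in C#, assuming that "combination" means words kept in their original list order with no word repeated. The names WordCombinations, UpTo and OfSize are made up for this example. Because it uses yield return, each row is only built when the caller asks for it, so nothing close to the full list is ever held in memory.

```csharp
using System;
using System.Collections.Generic;

public static class WordCombinations
{
    // Lazily yields every combination of 1..maxWords distinct words,
    // keeping the words in their original list order.
    public static IEnumerable<string> UpTo(IReadOnlyList<string> words, int maxWords)
    {
        for (int size = 1; size <= Math.Min(maxWords, words.Count); size++)
            foreach (var row in OfSize(words, size, 0))
                yield return row;
    }

    // Combinations of exactly `size` words taken from index `start` onwards.
    private static IEnumerable<string> OfSize(IReadOnlyList<string> words, int size, int start)
    {
        if (size == 0)
        {
            yield return string.Empty;
            yield break;
        }
        // Leave enough words to the right to fill the remaining slots.
        for (int i = start; i <= words.Count - size; i++)
            foreach (var rest in OfSize(words, size - 1, i + 1))
                yield return words[i] + rest;
    }
}
```

A caller would loop with foreach (var candidate in WordCombinations.UpTo(words, 5)) and can stop as soon as a candidate works. If the order of the password parts matters as well, permutations would be needed instead, which this sketch does not cover.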

Related

Logic to select a specific set from Cartesian set

I'm making a password brute forcing tool as a learning exercise, and I want it to be resumable.
So, what I want is to be able to say, this is the set of possible characters, if I computed the Cartesian set of every possible combination of this set up to length n, what is the set at point x?
However, I want to do this without computing the entire set. I've seen similar logic in one place online but I was unable to generalise this to fit.
Any help would be fantastic, thanks! I'm fluent in C# if that helps.
Edit: Here's the question I mentioned earlier: How to select specific item from cartesian product without calculating every other item
Edit: here's an example of what I mean:
Char set = [abcd]
Length n = 4
Permutations:
[aaaa]
[aaab]
[aaac]
[aaad]
[aaba]
....
[dddd]
So if I'm searching for the set at 4, I'd get [aaad]. But if I'm searching for element 7000, then it takes a long time to get to that point.
This implements the answer to the question you link:
static string Get(string chars, int n, int i)
{
    string ret = "";
    int sizes = 1;
    for (int j = 0; j < n; j++) {
        ret = chars[(i / sizes) % chars.Length] + ret;
        sizes *= chars.Length;
    }
    return ret;
}
Example:
string chars = "abcd";
int n = 3;
for (int i = 0; i < Math.Pow(chars.Length, n); i++)
    Console.WriteLine(i + "\t" + Get(chars, n, i));
0 aaa
1 aab
2 aac
3 aad
...
61 ddb
62 ddc
63 ddd
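The question also asks for every length up to n, not just one fixed length. One possible extension, sketched here under the assumption that shorter strings are listed before longer ones and that x is 0-based: first find the length block that x falls into, then decode the remainder as a fixed-length number. CartesianIndex, At and FixedLength are names invented for this sketch, and BigInteger is used only so that large character sets do not overflow an int.

```csharp
using System;
using System.Numerics;

public static class CartesianIndex
{
    // Returns the x-th candidate (0-based) when all strings of length 1,
    // then length 2, ... up to maxLength are listed in order.
    public static string At(string chars, int maxLength, BigInteger x)
    {
        if (x < 0) throw new ArgumentOutOfRangeException(nameof(x));
        BigInteger blockSize = 1;
        for (int length = 1; length <= maxLength; length++)
        {
            blockSize *= chars.Length;      // number of strings of this length
            if (x < blockSize)
                return FixedLength(chars, length, x);
            x -= blockSize;                 // skip the whole block of this length
        }
        throw new ArgumentOutOfRangeException(nameof(x));
    }

    // Writes x as a base-chars.Length number, padded to `length` digits.
    private static string FixedLength(string chars, int length, BigInteger x)
    {
        var buffer = new char[length];
        for (int pos = length - 1; pos >= 0; pos--)
        {
            buffer[pos] = chars[(int)(x % chars.Length)];
            x /= chars.Length;
        }
        return new string(buffer);
    }
}
```

To make the tool resumable it should then be enough to store the last x that was tried and continue from x + 1.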

What's wrong with this logic for finding the longest common child of two strings

I came up with this logic to find the longest common child of two strings of equal length, but it only runs successfully on simple inputs and fails on others. Please guide me on what I am doing wrong here.
String a, b;
int sum = 0;
int[] ar, br;
ar = new int[26];
br = new int[26];
a = Console.ReadLine();
b = Console.ReadLine();
for (int i = 0; i < a.Length; i++)
{
    ar[(a[i] - 65)]++;
    br[(b[i] - 65)]++;
}
for (int i = 0; i < ar.Length; i++)
{
    if (ar[i] <= br[i]) { sum += ar[i]; }
    else sum += br[i];
}
Console.Write(sum);
Console.ReadLine();
Output:
AA
BB
0 (correct)
HARRY
SALLY
2 (correct)
It runs for both inputs above, but when I submit it for evaluation it fails on their test cases. I can't access the test cases on which my logic fails. I want to know where my logic fails.
Your second loop is wrong. It only counts how often each letter occurs in both strings: for every letter it adds the smaller of the two counts, so the order of the characters is ignored entirely.
Refer to this link for a correct implementation:
http://en.wikibooks.org/wiki/Algorithm_Implementation/Strings/Longest_common_substring#Retrieve_the_Longest_Substring
Also convert your input to uppercase characters using String.ToUpper before you use the input string.
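If "common child" here means the longest common subsequence (characters in the same order but not necessarily adjacent), which is an assumption about the puzzle rather than something the question states, then a standard dynamic-programming sketch would look roughly like this. CommonChild and Length are names chosen for the example.

```csharp
using System;

public static class CommonChild
{
    // Length of the longest common subsequence of a and b.
    public static int Length(string a, string b)
    {
        // previous[j] / current[j]: LCS length for the processed prefix of a
        // and the first j characters of b.
        var previous = new int[b.Length + 1];
        var current = new int[b.Length + 1];
        foreach (char ca in a)
        {
            for (int j = 1; j <= b.Length; j++)
            {
                current[j] = ca == b[j - 1]
                    ? previous[j - 1] + 1
                    : Math.Max(previous[j], current[j - 1]);
            }
            // The finished row becomes the "previous" row of the next character.
            (previous, current) = (current, previous);
        }
        return previous[b.Length];
    }
}
```

An input such as ABCD and DCBA shows the difference from the counting approach: every letter occurs in both strings, so counting gives 4, while only one character can be kept in order.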

How to make this code faster?

I have a function which is far too slow for my task (it needs to be 10-100 times faster).
Here is the code:
public long Support(List<string[]> sequences, string[] words)
{
    var count = 0;
    foreach (var sequence in sequences)
    {
        for (int i = 0; i < sequence.Length - words.Length + 1; i++)
        {
            bool foundSeq = true;
            for (int j = 0; j < words.Length; j++)
            {
                foundSeq = foundSeq && sequence[i + j] == words[j];
            }
            if (foundSeq)
            {
                count++;
                break;
            }
        }
    }
    return count;
}
public void Support(List<string[]> sequences, List<SequenceInfo> sequenceInfoCollection)
{
    System.Threading.Tasks.Parallel.ForEach(sequenceInfoCollection.Where(x => x.Support == null), sequenceInfo =>
    {
        sequenceInfo.Support = Support(sequences, sequenceInfo.Sequence);
    });
}
Here List<string[]> sequences is a list of word arrays. It usually contains 250k+ rows, each about 4-7 words long. string[] words is the array of words we are trying to count (every word occurs in sequences at least once).
The problem is foundSeq = foundSeq && sequence[i + j] == words[j];. This line takes most of the execution time (Enumerable.MoveNext is in second place). I want to hash all the words in my array. Numbers compare faster than strings, right? I think that could gain me 30%-80% in performance, but I need 10x! What can I do? In case you want to know, it's part of the Apriori algorithm.
The Support function checks whether the word sequence is contained in each sequence of the sequences list and counts how many sequences contain it.
Knuth–Morris–Pratt algorithm
In computer science, the Knuth–Morris–Pratt string searching algorithm (or KMP algorithm) searches for occurrences of a "word" W within a main "text string" S by employing the observation that when a mismatch occurs, the word itself embodies sufficient information to determine where the next match could begin, thus bypassing re-examination of previously matched characters.
The algorithm was conceived in 1974 by Donald Knuth and Vaughan Pratt, and independently by James H. Morris. The three published it jointly in 1977.
From Wikipedia: https://en.wikipedia.org/wiki/Knuth%E2%80%93Morris%E2%80%93Pratt_algorithm
This is one of the improvements that you should make, with one small difference: a "word" in your code corresponds to a "character" in the terminology of the algorithm, and your words array corresponds to the "word" W in KMP.
The idea is that when you search for "abc def ghi jkl" and have already matched "abc def ghi", but the next word does not match, you can jump three positions.
Search: abc def ghi jkl
Text: abc def ghi klm abc def ghi jkl
i=0: abc def ghi jkl?
skip 2: XXX XXX <--- you save two iterations here, i += 2
i=2: abc?
i=3: abc? ...
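A rough sketch of how KMP might be applied to word arrays instead of characters. It is not taken from the question's code; WordKmp, Contains and BuildFallback are names invented here. It answers only the yes/no question that Support asks for each sequence.

```csharp
using System;

public static class WordKmp
{
    // Returns true if `words` occurs as a contiguous run inside `sequence`.
    public static bool Contains(string[] sequence, string[] words)
    {
        if (words.Length == 0) return true;
        int[] fallback = BuildFallback(words);
        int matched = 0;                          // pattern words matched so far
        foreach (string word in sequence)
        {
            // On a mismatch, fall back to the longest prefix that can still match.
            while (matched > 0 && word != words[matched])
                matched = fallback[matched - 1];
            if (word == words[matched]) matched++;
            if (matched == words.Length) return true;
        }
        return false;
    }

    // fallback[i] = length of the longest proper prefix of words[0..i]
    // that is also a suffix of it.
    private static int[] BuildFallback(string[] words)
    {
        var fallback = new int[words.Length];
        int k = 0;
        for (int i = 1; i < words.Length; i++)
        {
            while (k > 0 && words[i] != words[k]) k = fallback[k - 1];
            if (words[i] == words[k]) k++;
            fallback[i] = k;
        }
        return fallback;
    }
}
```

Since the fallback table depends only on words, it could be built once per pattern and reused for all 250k rows. With rows of only 4-7 words the saving over a plain early-exit loop may be small, so it is worth measuring before committing to it.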
The first optimisation I would make is an early-fail. Your inner loop continues through the whole sequence even if you know it's failed, and you're doing an unnecessary bit of boolean logic. Your code is this:
for (int j = 0; j < words.Length; j++)
{
    foundSeq = foundSeq && sequence[i + j] == words[j];
}
Instead, just do this (or equivalent):
for (int j = 0; j < words.Length; j++)
{
    if (sequence[i + j] != words[j])
    {
        foundSeq = false;
        break;
    }
}
This will save you the majority of your comparisons (you will drop out at the first word if it doesn't match, instead of continuing to compare when you know the result is false). It could even make the tenfold difference you're looking for if you expect the occurrence of the individual words in each sequence to be low (if, say, you are finding sentences in a page of English text).
Theoretically, you could concatenate each sequence and use substring matching. I don't have a compiler at hand now, so I can't test whether it will actually improve performance, but this is the general idea:
// Pad both sides with the separator so that only whole words can match.
List<string> sentences = sequences.Select(seq => " " + String.Join(" ", seq) + " ").ToList();
string toMatch = " " + String.Join(" ", words) + " ";
return sentences.Count(sentence => sentence.Contains(toMatch));
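The hashing idea from the question could be sketched as a one-off encoding step: give every distinct word a small integer id, encode all rows once, and then compare int arrays instead of strings. WordEncoder and Encode are hypothetical names, and whether this gets anywhere near the hoped-for speed-up is something only a profiler can tell.

```csharp
using System.Collections.Generic;

public sealed class WordEncoder
{
    private readonly Dictionary<string, int> ids = new Dictionary<string, int>();

    // Maps each distinct word to a small integer, assigning new ids on first sight.
    public int[] Encode(string[] words)
    {
        var encoded = new int[words.Length];
        for (int i = 0; i < words.Length; i++)
        {
            if (!ids.TryGetValue(words[i], out int id))
            {
                id = ids.Count;
                ids.Add(words[i], id);
            }
            encoded[i] = id;
        }
        return encoded;
    }
}
```

The same encoder instance has to be used for both the sequences and the patterns so that equal words receive equal ids. The instance is not thread-safe, so the encoding would need to happen before the Parallel.ForEach call.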

Permutation of a list of strings algorithm

I need help understanding how to write a permutation algorithm (if this even is a permutation: the items have to stay in order and use the same values).
List<string> str = new List<string>{"a", "b", "c", "d"};
How can I get a list of each permutation available in this list? For example:
a, b, c, d
ab, c, d
ab, cd
abc, d
abcd
a, bc, d
a, bcd
a, b, cd
For some reason I can't find a pattern to start with. I'd also like to be able to disregard a permutation when a joined string has a count of X characters. So if X was 4, number 5 in that list wouldn't exist and there would be 7 permutations.
private List<string> permute(List<string> values, int maxPermutation)
{
    // a little help on starting it would be great :)
}
I looked and read this, but he does not keep the order.
This is rather straightforward: you have three spots where you could either put a comma or put nothing. There are eight combinations, corresponding to the 2^3 three-bit binary numbers.
For each number from 0 to 7, inclusive, produce its binary representation. Put a comma in each position where the binary representation has a 1; do not put a comma where there is a 0.
for (int m = 0 ; m != 8 ; m++) {
    string s = "a";
    if ((m & 1) != 0) s += ",";
    s += "b";
    if ((m & 2) != 0) s += ",";
    s += "c";
    if ((m & 4) != 0) s += ",";
    s += "d";
    Console.WriteLine(s);
}
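The same bit trick can be generalised to a list of any length, sketched here with made-up names (OrderedSplits, All, maxChunkLength). The sketch assumes fewer than 63 items so that the mask fits in a long, and it interprets the X rule from the question as "drop any split containing a joined string longer than maxChunkLength characters", so X = 4 corresponds to maxChunkLength = 3.

```csharp
using System;
using System.Collections.Generic;

public static class OrderedSplits
{
    // Yields every way to cut `items` into consecutive joined chunks, skipping
    // splits that contain a chunk longer than maxChunkLength characters.
    public static IEnumerable<string> All(IReadOnlyList<string> items, int maxChunkLength)
    {
        if (items.Count == 0) yield break;
        int gaps = items.Count - 1;                  // one possible cut between neighbours
        for (long mask = 0; mask < (1L << gaps); mask++)
        {
            var chunks = new List<string>();
            string chunk = items[0];
            for (int gap = 0; gap < gaps; gap++)
            {
                if ((mask & (1L << gap)) != 0)       // bit set: cut at this gap
                {
                    chunks.Add(chunk);
                    chunk = string.Empty;
                }
                chunk += items[gap + 1];
            }
            chunks.Add(chunk);
            if (chunks.TrueForAll(c => c.Length <= maxChunkLength))
                yield return string.Join(", ", chunks);
        }
    }
}
```

For the four-item list from the question, a maxChunkLength of 4 gives all eight splits, and a maxChunkLength of 3 drops only abcd, leaving seven.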
You could take a recursive approach: Take the first letter, build all possible combinations starting with the second one (this is the recursion...) and prepend the first letter to each of them. Then take the first two letters together, recursively build all combinations starting with the third one. And so on ...
As for your additional requirement: if you want to exclude all "combinations" containing a string with X letters, just skip that length when constructing the first string.
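The recursive idea might be sketched like this. RecursiveSplits and From are names invented for the example, and the X-letter rule is left out to keep it short.

```csharp
using System;
using System.Collections.Generic;

public static class RecursiveSplits
{
    // Every split of the items from index `start` onwards into consecutive joined groups.
    public static IEnumerable<List<string>> From(IReadOnlyList<string> items, int start = 0)
    {
        if (start == items.Count)
        {
            yield return new List<string>();     // nothing left: one empty split
            yield break;
        }
        string head = string.Empty;
        for (int end = start; end < items.Count; end++)
        {
            head += items[end];                  // head = items[start..end] joined
            foreach (var tail in From(items, end + 1))
            {
                var split = new List<string> { head };
                split.AddRange(tail);
                yield return split;
            }
        }
    }
}
```

Adding the exclusion rule would likely amount to a length check on head before the inner foreach, leaving the loop once head has grown too long.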
The binary approach above is correct. This is actually a partitioning problem (but not "The Partitioning Problem"), not a permutation problem.
http://en.wikipedia.org/wiki/Partition_of_a_set
Watch out: the number of partitions of a set grows faster than exponentially, and even the order-preserving splits needed here number 2^(n-1) for n items, so it will be really slow for long lists.
Try the following code. I haven't tested it, but I think it's what you are looking for.
List<string> str = new List<string> { "a", "h", "q", "z", "b", "d" };
List<List<string>> combinations = combine(str.OrderBy(s => s).ToList());

List<List<string>> combine(List<string> items)
{
    List<List<string>> all = new List<List<string>>();
    // For each index in the items list
    for (int i = 0; i < items.Count; i++)
    {
        // Create a new list of strings
        List<string> newEntry = new List<string>();
        // Take the first i items
        newEntry.AddRange(items.Take(i));
        // Concatenate the remaining items
        newEntry.Add(String.Concat(items.Skip(i)));
        // Add these items to the main list
        all.Add(newEntry);
        // If this is not the last string in the list
        if (i + 1 < items.Count)
        {
            // Process sub-strings (note: this adds splits of the tail only)
            all.AddRange(combine(items.Skip(i + 1).ToList()));
        }
    }
    return all;
}
If you need to generate combinations (or permutations or variations), then Combinatorics is a fantastic library.

How can I change numbers into letters in C#

I have code like this:
for (int i = 1; i < max; i++)
{
    <div>@i</div>
    <div>@test[i]</div>
}
I'm using MVC3 Razor syntax so it might look a bit strange.
My max is always less than ten, and I would like a value like "A", "B", etc. to appear in the first <div> instead of the number "1", "2", etc., which is the value of i. Is there an easy way to convert i to a letter, where i = 1 represents "A" and i = 2 represents "B"? I need to do this in C# that I can place in my MVC3 view file.
Marife
Personally I'd probably use the indexer into a string:
// Wherever you want to declare this
// As many as you'll need - the "X" is to put A=1
const string Letters = "XABCDEFGHIJKLMNOP";
...
<div>
for (int i = 1; i < max; i++)
{
    <div>@i</div>
    <div>@Letters[i]</div>
}
I find that simpler and more flexible than bit shifting etc, although that will certainly work too.
(char)(i + 64) will work (65 = 'A')
for (int i = 1; i < max; i++)
{
    char c = (char)(i + 64); // c will be in [A..J]
    ...
}
You could shift i by 64 (with a 1-based index) and cast your int to a char.
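A small helper along those lines, with a range check added. LetterIndex and ToLetter are just names picked for this sketch, and it assumes plain A-Z is all that is needed.

```csharp
using System;

public static class LetterIndex
{
    // Converts 1 -> 'A', 2 -> 'B', ... 26 -> 'Z'.
    public static char ToLetter(int number)
    {
        if (number < 1 || number > 26)
            throw new ArgumentOutOfRangeException(nameof(number));
        return (char)('A' + number - 1);
    }
}
```

In the view this could then be written as @LetterIndex.ToLetter(i), provided the class is visible to the view.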
If you don't need to use i anywhere else you can do this:
for (Char ch = 'A'; ch < 'K'; ch++)
{
    MessageBox.Show(ch.ToString());
}
Ah, just realised the last letter isn't constant, so you would need to convert a number somewhere.
