Logic to select a specific set from Cartesian set - c#

I'm making a password brute-forcing tool as a learning exercise, and I want it to be resumable.
Given a set of possible characters, suppose I computed the Cartesian product of every possible combination of that set up to length n. What I want is to find the element at position x.
However, I want to do this without computing the entire set. I've seen similar logic in one place online, but I was unable to generalise it to fit my case.
Any help would be fantastic, thanks! I'm fluent in C# if that helps.
Edit: Here's the question I mentioned earlier: How to select specific item from cartesian product without calculating every other item
Edit: here's an example of what I mean:
Char set = [abcd]
Length n = 4
Permutations:
[aaaa]
[aaab]
[aaac]
[aaad]
[aaba]
....
[dddd]
So if I'm searching for the set at 4, I'd get [aaad]. But if I'm searching for element 7000, then it takes a long time to get to that point.

This implements the answer to the question you link:
static string Get(string chars, int n, int i)
{
    // i is the zero-based index; it is read as an n-digit number in base chars.Length
    string ret = "";
    int sizes = 1;
    for (int j = 0; j < n; j++) {
        ret = chars[(i / sizes) % chars.Length] + ret;
        sizes *= chars.Length;
    }
    return ret;
}
Example:
string chars = "abcd";
int n = 3;
for (int i = 0; i < Math.Pow(chars.Length, n); i++)
    Console.WriteLine(i + "\t" + Get(chars, n, i));
0 aaa
1 aab
2 aac
3 aad
...
61 ddb
62 ddc
63 ddd
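The question asks for every combination "up to length n", while the method above handles one fixed length. A minimal sketch of one way to cover both, assuming the ordering is "all strings of length 1, then all of length 2, and so on" and that the index is zero-based. The class and method names (CartesianIndex, GetFixed, GetUpTo) are made up for illustration, and long is used so that larger indexes fit; very large character sets or lengths would still overflow and need BigInteger.

```csharp
using System;

public static class CartesianIndex
{
    // Fixed length: reads index as a base-chars.Length number, least significant digit last.
    public static string GetFixed(string chars, int length, long index)
    {
        if (index < 0) throw new ArgumentOutOfRangeException(nameof(index));
        var buffer = new char[length];
        for (int pos = length - 1; pos >= 0; pos--)
        {
            buffer[pos] = chars[(int)(index % chars.Length)];
            index /= chars.Length;
        }
        return new string(buffer);
    }

    // Lengths 1..maxLength: skips whole blocks of shorter strings, then indexes inside the block.
    public static string GetUpTo(string chars, int maxLength, long index)
    {
        if (index < 0) throw new ArgumentOutOfRangeException(nameof(index));
        long blockSize = chars.Length; // number of strings of the current length
        for (int length = 1; length <= maxLength; length++)
        {
            if (index < blockSize) return GetFixed(chars, length, index);
            index -= blockSize;
            blockSize *= chars.Length;
        }
        throw new ArgumentOutOfRangeException(nameof(index));
    }
}
```

To resume a run, it should be enough to store the last index tried and continue from index + 1.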

Related

Sort anything lexicographically

I have this code which I think I found somewhere on the internet some years ago and it doesn't quite work.
The purpose is to take any string and from that create a string that is lexicographically sorted by a large number - because then inverse (descending) ordering can be achieved by subtracting the number from another even larger number.
private static BigInteger maxSort = new BigInteger(Encoding.Unicode.GetBytes("5335522543087813528200259404529154678271640415603227881439560533607051111046319775598721171814499900"));
public static string GetSortString(string str, bool descending)
{
    var sortNumber = new BigInteger(Encoding.Unicode.GetBytes(str));
    if (descending)
    {
        sortNumber = maxSort - sortNumber;
    }
    return "$SORT!" + sortNumber.ToString().PadLeft(100, '0') + ":" + str;
}
The reason I need this is because I want to use it to insert as RowKey in Azure Table Storage which is the only way to sort in Table Storage. I need to sort any text, any number and any date, both ascending and descending.
Can anyone see the issue with the code or have any code that serves the same purpose?
The question is tagged with C# but of course this is not a question of syntax so if you have the answer in any other code that would be fine too.
Example
I want to convert any string to a number which is lexicographically sorted correctly - because if it's a number, then I can invert it and sort descending.
So for example, if I can convert:
ABBA to 1234
Beatles to 3131
ZZ Top to 9584
Then those numbers would sort them correctly ... and, if I subtract them from a large number, I would be able to invert the sort order:
10000 - 1234 = 8766
10000 - 3131 = 6869
10000 - 9584 = 0416
Of course, to support longer text input, I need to subtract them from a very large number, which is why I use the very large BigInteger.
Current output from this code
ABBA: $SORT!0000000000000000000000018296156958359617:ABBA
Beatles: $SORT!0000000009111360792640460912278748069954:Beatles
ZZ TOP: $SORT!0000000000000096715522885596192519618650:ZZ TOP
As you can see, the longest text gets the highest number. I have also tried adding padding directly to the input str, but that didn't help either.
Answer
The accepted answer worked. For descending sort order, the "BigInteger" trick from above could be used.
There is some limitation as to how long the sortable string can be.
Here is the final code:
private static BigInteger maxSort = new BigInteger(Encoding.Unicode.GetBytes("5335522543087813528200259404529154678271640415603227881439560533607051111046319775598721171814499900"));
public static string GetSortString(string str, bool descending)
{
    BigInteger result = 0;
    int maxLength = 42;
    foreach (var c in str.ToCharArray())
    {
        result = result * 256 + c;
    }
    for (int i = str.Length; i < maxLength; i++)
    {
        result = result * 256;
    }
    if (descending)
    {
        result = maxSort - result;
    }
    return "$SORT!" + result;
}
If you were looking for a way to give a value to any string so that you could sort the strings according to that number and get the same result as above, you can't. The reason is that strings don't have any length limit: you can always add a char to a string and thereby get a larger number, even though it should have a lower lexicographical value.
If they have a length limit, you can do something like this:
pseudo code
bignum res = 0;
maxLength = 42;
for (char c : string)
    res = res * 256 + c
for (int i = string.length; i < maxLength; i++)
    res = res * 256
If you want to optimize a bit, the last loop could be a lookup table. If you're only using a-z, the multiplier of 256 could be reduced to 26 or 32.
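A minimal sketch of the padded base-256 idea as a sortable key, under some assumptions that are not in the original: the class name SortKey is hypothetical, input is limited to 42 characters with character codes of at most 255 (larger codes would not fit in one base-256 digit), the number is left-padded to a fixed width so that string order matches numeric order, and descending order subtracts from 256^42 - 1 rather than from the maxSort constant used above.

```csharp
using System;
using System.Numerics;

public static class SortKey
{
    // Assumed limit for this sketch, taken from the maxLength of 42 used above.
    public const int MaxLength = 42;

    // Largest value a key can take: 42 base-256 digits that are all 255.
    static readonly BigInteger Max = BigInteger.Pow(256, MaxLength) - 1;

    // Every key is padded to this many decimal digits.
    static readonly int Width = Max.ToString().Length;

    public static string Encode(string text, bool descending)
    {
        if (text.Length > MaxLength)
            throw new ArgumentException("Text is longer than " + MaxLength + " characters");

        BigInteger value = BigInteger.Zero;
        for (int i = 0; i < MaxLength; i++)
        {
            // Positions past the end of the text count as 0, which pads on the right.
            int code = i < text.Length ? text[i] : 0;
            if (code > 255)
                throw new ArgumentException("Character code above 255 at position " + i);
            value = value * 256 + code;
        }

        if (descending)
            value = Max - value;

        return value.ToString().PadLeft(Width, '0');
    }
}
```

Because every key has the same width, an ordinal string comparison of two keys should agree with the numeric comparison, in both directions. A prefix such as "$SORT!" can be prepended in the same way as in the code above.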

Sudoku Backtracking - Order of walking through fields by amount of possible values

I've created a backtracking algorithm for solving Sudoku puzzles which basically walks through all the empty fields from left to right, top to bottom. Now I need to make an extended version in which the order the algorithm visits the fields is defined by the number of possibilities (calculated once, at initialization) for each field. That is, empty fields which initially have the fewest possible values should be visited first, and only the initially possible values should be checked (both to reduce the number of iterations needed). I'm not sure how to implement this without increasing the number of iterations needed to compute these values for each field and then find the next field with the fewest possibilities.
For my backtracking algorithm I have a nextPosition method which determines the next empty field in the sudoku to visit. Right now it looks like this:
protected virtual int[] nextPosition(int[] position)
{
    int[] nextPosition = new int[2];
    if (position[1] == (n * n) - 1)
    {
        nextPosition[0] = position[0]+1;
        nextPosition[1] = 0;
    }
    else
    {
        nextPosition[0] = position[0];
        nextPosition[1] = position[1]+1;
    }
    return nextPosition;
}
So it basically walks through the sudoku left to right, top to bottom. Now I need to alter this for my new version so that it walks through the fields ordered by the fewest possible values (and only tries the possible values for each field in my backtracking algorithm). I figured I'd try to keep a list of invalid values for each field:
public void getInvalidValues(int x, int y)
{
    // Values already used in the same row
    for (int i = 0; i < n * n; i++)
        if (grid[y, i] != 0)
            this.invalidValues[y, x].Add(grid[y, i]);
    // Values already used in the same column
    for (int i = 0; i < n * n; i++)
        if (grid[i, x] != 0)
            this.invalidValues[y, x].Add(grid[i, x]);
    // Values already used in the same n x n block
    int nX = x / n;
    int nY = y / n;
    for (int dx = 0; dx < n; dx++)
        for (int dy = 0; dy < n; dy++)
            if (grid[nY * n + dy, nX * n + dx] != 0)
                this.invalidValues[y, x].Add(grid[nY * n + dy, nX * n + dx]);
}
I call this method for every empty field in the sudoku (represented in this.grid as a 2D array [n*n, n*n]). However, this causes even more iterations, since in order to determine the number of distinct invalid values for each field it has to walk through each list again.
So my question is whether someone knows a way to efficiently walk through the fields of the sudoku ordered by the amount of possible values for each field (at the same time keeping track of these possible values for each field since they are needed for the backtracking algorithm). If anyone could help me out on this it'd be much appreciated.
Thanks in advance!
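One possible way to sketch the ordering described in this question, offered as an illustration rather than a tested solution for the poster's class: compute the used values of every row, column and block once as bitmasks, derive each empty field's possible values from those, and sort the empty fields by how many possibilities they have. The names SudokuOrdering, OrderedEmptyCells and CountBits are hypothetical, the grid is assumed to be indexed [row, column] with 0 meaning empty, and bit v of a mask stands for the value v.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SudokuOrdering
{
    // Returns the empty cells with a bitmask of their initially possible values,
    // ordered so that cells with the fewest possibilities come first.
    public static List<(int Row, int Col, int Mask)> OrderedEmptyCells(int[,] grid, int n)
    {
        int size = n * n;
        var rowUsed = new int[size];
        var colUsed = new int[size];
        var boxUsed = new int[size];

        // One pass over the grid records which values each row, column and block already holds.
        for (int r = 0; r < size; r++)
            for (int c = 0; c < size; c++)
            {
                int v = grid[r, c];
                if (v == 0) continue;
                int bit = 1 << v;
                rowUsed[r] |= bit;
                colUsed[c] |= bit;
                boxUsed[(r / n) * n + c / n] |= bit;
            }

        int all = ((1 << size) - 1) << 1; // bits 1..size set
        var cells = new List<(int Row, int Col, int Mask)>();
        for (int r = 0; r < size; r++)
            for (int c = 0; c < size; c++)
                if (grid[r, c] == 0)
                {
                    int used = rowUsed[r] | colUsed[c] | boxUsed[(r / n) * n + c / n];
                    cells.Add((r, c, all & ~used));
                }

        return cells.OrderBy(cell => CountBits(cell.Mask)).ToList();
    }

    // Counts the set bits, i.e. the number of possible values in a mask.
    static int CountBits(int mask)
    {
        int count = 0;
        while (mask != 0) { mask &= mask - 1; count++; }
        return count;
    }
}
```

The backtracking step could then walk this list by index instead of calling nextPosition, trying only the values whose bit is set in the cell's mask. Since the masks are computed once, each candidate still needs the usual validity check against the values placed so far.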

Algorithm for generating all combinations with 2 potential values in 5 variables

Apologies if this has been answered before, but I can't come up with a good name to search for what I'm looking for. I have between 1 and 5 string variables (we'll call them A, B, C, D, E), each of which can have one of two values, represented by 'P' and 'S'. These stand for pluralized and singular word forms.
The data will always be in the same order, ABCDE, so that is not a concern but it may not contain all five (could be only A, AB, ABC or ABCD). I'm looking for an algorithm that will handle that possibility while generating all potential plural/singular combinations. So in the case of a 5 variable string the results would be:
SSSSS,
SPSSS,
SPPSS,
SPSPS,
...
PPPPP
I have the logic to pluralize and to store the data it's just a question of what is the logic that will generate all those combinations. If it matters, I am working in C#. Any help would be greatly appreciated!
So there are only two possible values, 0 and 1. Wait a minute... Zeroes and ones... Why does that sound familiar...? Ah, binary to the rescue!
Let's count a little in binary, starting with 0.
0000 = 0
0001 = 1
0010 = 2
0011 = 3
0100 = 4
0101 = 5
0110 = 6
0111 = 7
1000 = 8
...etc
If you look at the rightmost bit of the first two rows, we have all the possible combinations for 1 bit, 0 and 1.
If you then look at the two rightmost bits of the first four rows, you get all 2 bit combinations: 00, 01, 10 and 11.
The first eight rows have all three bit combinations, etc.
If you want all possible combinations of x bits, count all numbers from 0 to (2^x)-1 and look at the last x bits of the numbers written in binary.
(Likewise, if you instead have three possible values (0, 1 and 2), you can count between 0 and (3^x)-1 and look at the last x digits when written in ternary, and so on for all possible amounts of values.)
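As a small illustration of the counting idea, here is a sketch that maps each bit to 'S' (0) or 'P' (1). The class and method names are made up for the example, and it assumes fewer than 31 variables so that 2^x fits in an int.

```csharp
using System;
using System.Collections.Generic;

public static class FormCombinations
{
    // Yields every 'S'/'P' string of the given length by counting from 0 to 2^count - 1.
    public static IEnumerable<string> All(int count)
    {
        int total = 1 << count; // 2^count, assuming count < 31
        for (int i = 0; i < total; i++)
        {
            var forms = new char[count];
            for (int j = 0; j < count; j++)
            {
                // The leftmost character corresponds to the most significant of the count bits.
                bool plural = (i & (1 << (count - 1 - j))) != 0;
                forms[j] = plural ? 'P' : 'S';
            }
            yield return new string(forms);
        }
    }
}
```

Calling FormCombinations.All(2) should produce SS, SP, PS, PP in that order, and passing 1 to 5 covers the cases where only some of the variables are present.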
"Recursive permutations C#" will do the trick for a google search. But I thought I'd attempt a solution for you using simple counting and bit masking. Here is some code that will do "binary" counting and, using bitshifting, determine if the position in the words should be pluralized (you mention you have those details already):
string input = "red bag";
string[] tokens = input.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
string[] test = new string[tokens.Length];
// 2^tokens.Length combinations: one bit per word
int size = 1 << tokens.Length;
for (int i = 0; i < size; i++)
{
    for (int j = 0; j < tokens.Length; j++)
    {
        int mask = (1 << j);
        if ((mask & i) != 0)
        {
            test[j] = Pluralize(tokens[j]);
        }
        else
        {
            test[j] = Singularize(tokens[j]);
        }
    }
    Console.WriteLine(string.Join(" ", test));
}
Output:
red bag
reds bag
red bags
reds bags
I would advise a recursive algorithm. For example, an algorithm like this could be the answer to your problem (I don't really know exactly which return values you want):
public void getAllWords(List<string> result, string prefix, int wordLength)
{
    if (wordLength == 0)
        result.Add(prefix);
    else
    {
        getAllWords(result, prefix + "0", wordLength - 1);
        getAllWords(result, prefix + "1", wordLength - 1);
    }
}
to be called with
List<string> result = new List<string>();
getAllWords(result, "", 5);
I hope this works, I'm on mobile at the moment.
You can change that as you like to account for a different alphabet (for example values 0, 1, 2, ...).
You can enumerate all integers from 0 to 2^5-1 (i.e. from 0 to 31) and represent each integer as a bool[]. Maybe this will be helpful:
static bool[][] GetCombinations(int wordCount) {
    int length = (int) Math.Pow(2, wordCount);
    bool[][] res = new bool[length][];
    for (int i = 0; i < length; i++)
    {
        res[i] = new bool[wordCount];
        for (int j = 0; j < wordCount; j++) {
            res[i][j] = ((i & (int)Math.Pow(2, j)) != 0);
        }
    }
    return res;
}

Most evenly distribute letters of the alphabet across sequence

I'm wondering if there is a sweet way I can do this in LINQ or something but I'm trying to most evenly distribute the letters of the alphabet across X parts where X is a whole number > 0 && <= 26. For example here might be some possible outputs.
X = 1 : 1 partition of 26
X = 2 : 2 partitions of 13
X = 3 : 2 partitions of 9 and one partition of 8
etc....
Ultimately I don't want any partition to end up without at least one letter, and I'm aiming for the most even distribution, so that the difference between partition sizes is as small as possible.
This is the code I tried originally:
char[] alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ".ToCharArray();
int pieces = (int)Math.Round((double)alphabet.Count() / numberOfParts);
for (int i = 0; i < numberOfParts.Count; ++i) {
char[] subset = i == numberOfParts.Count - 1 ? alphabet.Skip(i * pieces).ToArray()
: alphabet.Skip(i * pieces).Take(pieces).ToArray();
... // more code following
This seemed to be working fine at first but I realized in testing that there is a problem when X is 10. Based on this logic I'm getting 8 groups of 3 and one group of 2, leaving the 10th group 0 which is no good as I'm going for the most even distribution.
The most ideal distribution for 10 in this case would be 6 groupings of 3 and 4 groupings of 2. Any thoughts on how this might be implemented?
In general, the easiest way to implement the logic is using the modulo operator, %. Get familiar with this operator; it's very useful for the situations where it helps. There are a number of ways to write the actual code to do the distribution of letters, using arrays or not as you wish etc., but this short expression should give you an idea:
"ABCDEFGHIJKLMNOPQRSTUVWXYZ".IndexOf(letter) % partitionCount
This expression gives the zero-based index of the partition in which to deposit an upper-case letter. The string is just shown for convenience, but could be an array or some other way of representing the alphabet. You could loop over the alphabet, using logic similar to the above to choose where to deposit each letter. Up to you would be where to put the logic: inside a loop, in a method, etc.
There's nothing magical about modular arithmetic; it just "wraps around" after the end of the set of usable numbers is reached. A simple context in which we've all encountered this is in division; the % operator is essentially just giving a division remainder. Now that you understand what the % operator is doing, you could easily write your own code to do the same thing, in any language.
Putting this all together, you could write a utility, class or extension method like this one--note the % to calculate the remainder, and that simple integer division discards it:
/// <summary>
/// Returns partition sizes which are as close as possible to equal while using the indicated total size available, with any extra distributed to the front
/// </summary>
/// <param name="totalSize">The total number of elements to partition</param>
/// <param name="partitionCount">The number of partitions to size</param>
/// <param name="remainderAtFront">If true, any remainder will be distributed linearly starting at the beginning; if false, backwards from the end</param>
/// <returns>An int[] containing the partition sizes</returns>
public static int[] GetEqualizedPartitionSizes(int totalSize, int partitionCount, bool remainderAtFront = true)
{
    if (totalSize < 1)
        throw new ArgumentException("Cannot partition a non-positive number (" + totalSize + ")");
    else if (partitionCount < 1)
        throw new ArgumentException("Invalid partition count (" + partitionCount + ")");
    else if (totalSize < partitionCount)
        throw new ArgumentException("Cannot partition " + totalSize + " elements into " + partitionCount + " partitions");
    int[] partitionSizes = new int[partitionCount];
    int basePartitionSize = totalSize / partitionCount;
    int remainder = totalSize % partitionCount;
    int remainderPartitionSize = basePartitionSize + 1;
    int x;
    if (remainderAtFront)
    {
        for (x = 0; x < remainder; x++)
            partitionSizes[x] = remainderPartitionSize;
        for (x = remainder; x < partitionCount; x++)
            partitionSizes[x] = basePartitionSize;
    }
    else
    {
        for (x = 0; x < partitionCount - remainder; x++)
            partitionSizes[x] = basePartitionSize;
        for (x = partitionCount - remainder; x < partitionCount; x++)
            partitionSizes[x] = remainderPartitionSize;
    }
    return partitionSizes;
}
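To show how sizes computed this way turn into actual groups of letters, here is a self-contained sketch that uses the same division and remainder logic and slices the alphabet into contiguous parts. The names AlphabetPartitioner and Split are invented for the example, and it always puts the larger parts at the front.

```csharp
using System;
using System.Collections.Generic;

public static class AlphabetPartitioner
{
    // Splits the alphabet into partitionCount contiguous parts whose sizes differ by at most one.
    public static List<string> Split(string alphabet, int partitionCount)
    {
        if (partitionCount < 1 || partitionCount > alphabet.Length)
            throw new ArgumentOutOfRangeException(nameof(partitionCount));

        int baseSize = alphabet.Length / partitionCount;  // integer division discards the remainder
        int remainder = alphabet.Length % partitionCount; // this many parts get one extra letter

        var parts = new List<string>(partitionCount);
        int start = 0;
        for (int i = 0; i < partitionCount; i++)
        {
            int size = baseSize + (i < remainder ? 1 : 0);
            parts.Add(alphabet.Substring(start, size));
            start += size;
        }
        return parts;
    }
}
```

For the 26 letters and 10 partitions this gives 6 groups of 3 followed by 4 groups of 2, which is the distribution the question describes as ideal.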
I feel like the simplest way to achieve this is to perform a round robin distribution on each letter. Cycle through each letter of the alphabet and add to it, then repeat. Have a running count that determines what letter you will be putting your item in, then when it hits >26, reset it back to 0!
What I did in one app where I had to distribute things into groups was something like this:
var numberOfPartitions = GetNumberOfPartitions();
var numberOfElements = GetNumberOfElements();
while (alphabet.Any())
{
    // Remaining elements divided by remaining partitions, rounded up
    var elementsInCurrentPartition = (int)Math.Ceiling((double)numberOfElements / numberOfPartitions);
    for (int i = 0; i < elementsInCurrentPartition; i++)
    {
        //fill your partition one element at a time and remove the element from the alphabet
        numberOfElements--;
    }
    numberOfPartitions--;
}
This won't give you the exact result you expected (i.e. ideal result for 10 partitions) but it's pretty close.
p.s. this isn't tested :)
A pseudocode algorithm I have now tested:
Double count = alphabet.Count()
Double exact = count / numberOfParts
Double last = 0.0
Do Until last >= count
Double next = last + exact
ranges.Add alphabet, from:=Round(last), to:=Round(next)
last = next
Loop
ranges.Add can ignore empty ranges :-)
Here is a LinqPad VB.NET implementation of this algorithm.
So a Linq version of this would be something like
Double count = alphabet.Count();
Double exact = count / numberOfParts;
var partitions = Enumerable.Range(0, numberOfParts + 1).Select(n => Math.Round((Double)n * exact));
Here is a LinqPad VB.NET implementation using this Linq query.
(sorry for formatting, mobile)
First, you need something like a batch method:
public static IEnumerable<IEnumerable<T>> Batch<T>(this IEnumerable<T> source, int groupSize)
{
    var tempSource = source.Select(n => n);
    while (tempSource.Any())
    {
        yield return tempSource.Take(groupSize);
        tempSource = tempSource.Skip(groupSize);
    }
}
Then, just call it like this:
var result = alphabet.Batch((int)Math.Ceiling(26.0 / x));

Get all possible word combinations

I have a list of n words (let's say 26). Now I want to get a list of all possible combinations, but with a maximum of k words per row (let's say 5)
So when my word list is: aaa, bbb, ..., zzz
I want to get:
aaa
bbb
...
aaabbb
aaaccc
...
aaabbbcccdddeeefff
aaabbbcccdddeeeggg
...
I want to make it variable, so that it will work with any n or k value.
No word should appear twice, and every combination needs to be included (even if there are very many).
How could I achieve that?
EDIT:
Thank you for your answers. It is not an assignment. It is just that I forgot the combination of my password parts and I want to be sure that I have tested all combinations. I don't actually have 26 password parts, but this made it easier to explain what I want.
If there are other people with the same problem, this link could be helpful:
Generate word combination array in c#
I wrote a simple function to do this:
private string allState(int index,string[] inStr)
{
    string a = inStr[index].ToString();
    int l = index+1;
    int k = l;
    var result = string.Empty;
    var t = inStr.Length;
    int i = index;
    while (i < t)
    {
        string s = a;
        for (int j = l; j < k; j++)
        {
            s += inStr[j].ToString();
        }
        result += s+",";
        k++;
        i++;
    }
    index++;
    if(index<inStr.Length)
        result += allState(index, inStr);
    return result.TrimEnd(new char[] { ',' });
}
allState(0, new string[] { "a", "b", "c"})
You could take a look at this
However, if you need to get large numbers of combinations (in the tens of millions) you should use lazy evaluation for the generation of the combinations.
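A rough sketch of what lazy generation could look like with C# iterators, assuming that "combination" means words kept in their original list order with no word used twice, and that results are concatenated without a separator as in the question's example. The names WordCombinations, UpTo and OfSize are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class WordCombinations
{
    // Lazily yields every combination of 1..maxWords distinct words, shortest combinations first.
    public static IEnumerable<string> UpTo(IReadOnlyList<string> words, int maxWords)
    {
        int largest = Math.Min(maxWords, words.Count);
        for (int size = 1; size <= largest; size++)
            foreach (var combo in OfSize(words, size, 0, ""))
                yield return combo;
    }

    // Yields combinations of exactly 'size' words taken from position 'start' onwards.
    static IEnumerable<string> OfSize(IReadOnlyList<string> words, int size, int start, string prefix)
    {
        if (size == 0)
        {
            yield return prefix;
            yield break;
        }
        // Stop early enough to leave room for the remaining size - 1 words.
        for (int i = start; i <= words.Count - size; i++)
            foreach (var combo in OfSize(words, size - 1, i + 1, prefix + words[i]))
                yield return combo;
    }
}
```

Because nothing is stored in a list, each candidate can be tested as it is produced, for example in a foreach loop over WordCombinations.UpTo(words, 5), and the loop can stop as soon as a match is found.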
