I have a task for uni, the requirements of which are as follows:
There is a collection of coins. For each non-negative integer k, there are two coins with the value 2^k, i.e. the collection of coins is {1, 1, 2, 2, 4, 4, 8, 8, ... }
For a given number, I need to write a method that returns the number of unique ways to make change for that amount, given the collection of coins.
For example, if the number passed to the algorithm is 6, the relevant collection of coins would be {1, 1, 2, 2, 4, 4}. The subsets that add up to 6 are {1, 1, 2, 2}, {1, 1, 4}, {1, 1, 4}, {2, 4}, {2, 4}, {2, 4} and {2, 4}. The unique subsets are {1, 1, 2, 2}, {1, 1, 4} and {2, 4}, and therefore the total number of unique ways is 3.
The numbers (and potential combinations) can be very large: the largest number in the tester class is 999,999,999,999,999,999 (10^18 - 1), for which the expected result is 29,665,503.
It's apparent to me that the approach should involve dynamic programming. I've used DP once before (for another task where we had to maximise our returns in a 'coin game'), and I've watched lots of videos (such as MIT OCW) on dynamic programming to try and understand how we could solve this particular problem, but I'm quite stuck, with my current confusion as follows:
I'm struggling to understand how we can frame this problem in terms of minimising or maximising something, and therefore how to structure the recurrence relationship. As opposed to trying to determine the minimum number of coins, we're interested in all combinations that work.
There's also the issue that we (I think?) need to keep track of the solutions themselves, otherwise we won't be able to filter out duplicates.
Although it may become apparent as I work out the recurrence relation and how it should be memoized, I feel like space will be an issue: wouldn't we need something like a Z*|C| (where |C| is the size of the array of coins) sized array to store our memoized results? For a Z of 10^18, that array would be huge.
At the risk of making this post too long, I've tried to sketch out a few approaches, but always come down to the problem that the recurrence seems like an OR relationship. Something like:
Let z = desired amount
Let A be array of coins, and i be the index in that array
Recurrence relation: DP(i, z) = OR ( DP(i + 1, z), DP(i + 1, z - A[i]) )
// Unsure how to deal with this OR in actual code. We're not saying,
// "Return one of these, whichever is smaller/bigger". We're saying,
// "We want to know if either case works."
Or another approach where you don't actually have an array of coins, but just start at the largest power of 2 less than Z and work down:
Let z = desired amount
Let largestCoin = largestCoinLessThanZ(z) // e.g. for z = 6, largestCoin = 4
findChange(desiredSum, runningTotal, coin):
    if runningTotal + coin = desiredSum:
        [add path to pile of valid paths]
    return ( findChange(desiredSum, runningTotal + coin, coin / 2) // using coin of this denomination once
          or findChange(desiredSum, runningTotal + coin + coin, coin / 2) // using coin of this denomination twice
          or findChange(desiredSum, runningTotal, coin / 2) // not using coin of this denomination at all
    )
Main:
    findChange(z, 0, largestCoin)
Sorry for the janky pseudocode - just trying to convey how I've approached it in my head.
In summary, I'm hoping for help understanding the recurrence relationship to solve this problem, and how to deal with potential space constraints. I'm working with C#, but I don't expect code - any math or pseudocode would be greatly appreciated.
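For what it's worth, here is a minimal sketch of one possible recurrence, written in Python rather than C# for brevity. It assumes a unique way is fully determined by how many coins (0, 1 or 2) of each value 2^k are used. Under that assumption an odd amount must use exactly one coin of value 1, and an even amount must use either none or both of them; halving what is left gives the same problem on a smaller amount. The name count_unique_ways is made up for illustration. Because each halving step only ever touches two neighbouring amounts, the memo should hold on the order of log2(z) entries rather than a Z*|C| table.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_unique_ways(amount):
    """Count unique ways to pay `amount` with two coins of each value 2^k.

    Hypothetical helper: a sketch of a counting recurrence, not a tested
    solution to the assignment.
    """
    if amount == 0:
        return 1  # exactly one way: use no coins at all
    if amount % 2 == 1:
        # Odd amount: exactly one coin of value 1 is used.
        # Remove it and halve, which rescales the remaining coins.
        return count_unique_ways((amount - 1) // 2)
    # Even amount: use zero coins of value 1, or both of them.
    return count_unique_ways(amount // 2) + count_unique_ways(amount // 2 - 1)


print(count_unique_ways(6))  # the example from the question
```

For the example amount of 6 this gives 3, matching the three unique subsets listed in the question. Note that the OR in the attempted recurrence becomes a sum here, because the goal is to count the cases rather than pick one of them.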
I think you can find a solution without going over the array multiple times, and without O(n^2) complexity.
If you find the largest 2^k that does not exceed your desired change, you can treat the remaining amount as a new change and find the largest 2^k for it in turn. Iterating this tells you which 2^k numbers make up the change.
Let me give an example: your number is 290, and the largest 2^k not exceeding it is 256 (2^8).
The remainder is 34, for which the largest 2^k is 32 (2^5).
The remainder is 2, so 2^1.
Once you have these, [2^8, 2^5, 2^1], you can find the possible combinations that make up each of them.
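As a rough sketch of the decomposition step only (in Python, with the made-up name power_of_two_parts), the loop below repeatedly peels off the largest power of two that still fits. This is the same as reading off the set bits of the amount, and it does not yet count any combinations.

```python
def power_of_two_parts(amount):
    """Split `amount` into distinct powers of two, largest first.

    Hypothetical helper illustrating the peeling step described above.
    """
    parts = []
    remaining = amount
    while remaining > 0:
        # Largest power of two that does not exceed `remaining`.
        largest = 1 << (remaining.bit_length() - 1)
        parts.append(largest)
        remaining -= largest
    return parts


print(power_of_two_parts(290))  # [256, 32, 2]
```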
Numbers: 1  1  2  2  4  4  8  8  16  16  32  32  64  64  128  128
Indexes: 1  2  3  4  5  6  7  8  9   10  11  12  13  14  15   16
So if you want to find the combinations for 256 (2^8), there are multiple possibilities, written here in terms of the indexes:
15 + 16
15 + (13 + 14)
15 + (13 + (11 + 12))
15 + (14 + (11 + 12))
16 + (13 + 14)
16 + (13 + (11 + 12))
16 + (14 + (11 + 12))
If you notice, the first 12 elements don't add up to 128 (they sum to 126); similarly, the first 14 elements add up to 254, which falls short of 256. So the combinations are limited.
You just have to ensure that the same element isn't used in both 2^8 and 2^5.
Hope this helps.
Numbers: 1 1 2 2 4 4 8 8 16 16 32 32 64 64 128 128
Indexes: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
Number  Count  Combinations
1       2      [1] [2]
2       3      [4] [3] [2,1]
4       5      [6] [5] [4,3] [4,2,1] [3,2,1]
8       9      [8] [7] [6,5] [6,4,3] [5,4,3] [6,4,2,1] [6,3,2,1] [5,4,2,1] [5,3,2,1]
16      17     [10] [9] [8,7] [8,6,5] [8,6,4,3] [8,5,4,3]
               [8,6,4,2,1] [8,6,3,2,1] [8,5,4,2,1] [8,5,3,2,1] [7,6,5] [7,6,4,3] [7,5,4,3]
               [7,6,4,2,1] [7,6,3,2,1] [7,5,4,2,1] [7,5,3,2,1]
So, counting combinations by coin index as in the table, the formula is 2^k + 1.
Try 16, for example: 16 is 2^4, so 2^4 + 1 = 17.
Or just add 1 to the number you want to find the combinations for. If the number is 256, its number of combinations will be 257.
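A small brute-force check of the table can be sketched as below (Python, with the hypothetical name count_ways). It enumerates every subset of coin positions, so it is only usable for small targets. It also reports how many of those subsets are distinct once the two coins of each value are treated as interchangeable, which is the count the question asks for.

```python
from itertools import combinations


def count_ways(target):
    """Return (ways counted by coin index, ways counted as unique multisets).

    Hypothetical brute-force helper; exponential in the number of coins.
    """
    coins = []
    value = 1
    while value <= target:
        coins += [value, value]  # two coins of each power of two
        value *= 2

    by_index = 0
    unique = set()
    for size in range(1, len(coins) + 1):
        for picked in combinations(range(len(coins)), size):
            chosen = [coins[i] for i in picked]
            if sum(chosen) == target:
                by_index += 1
                unique.add(tuple(sorted(chosen)))
    return by_index, len(unique)


print(count_ways(16))  # (17, 5)
```

For 16 the index-based count agrees with the 17 rows of the table, while only 5 of those are distinct as sets of coin values.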
Consider the table below. It has 400,000 rows with 40 columns with values which can range from 0 to 4,000.
MeasureValue1  MeasureValue2  MeasureValue3  ...  MeasureValue40
1              5              7              ...  2740
2              5              7              ...  2749
2              6              7              ...  2703
4              6              8              ...  2721
Conditions are then given per column that other columns in the same row must satisfy before the value in that column is counted; these conditions are only known at runtime. Essentially, a group by is performed on every column, restricted to rows where the other columns satisfy the given conditions. For example, the conditions could be as follows.
Count MeasureValue1 if MeasureValue2 equals 5
Count MeasureValue2 if MeasureValue1 equals 2
Count MeasureValue3 if MeasureValue1 equals 2 and MeasureValue2 equals 5
...
Count MeasureValue40 if MeasureValue1 equals 2 and MeasureValue2 equals 5
The final result would be a table with counts per value. Given the example conditions above, that table would look as follows.
Value  1  2  3  4  5  6  7  8  9  10  ...  2749
Count  1  1  0  0  1  1  1  0  0  0   ...  1
To tackle this problem I have written something akin to the code below. It is definitely faster than performing a LINQ GroupBy across every column, and also faster even compared to multithreaded LINQ. It takes in an array of 400,000 times 40 values. You can imagine the 40 values as being a single row in a table as described above, of which there are 400,000 rows with those 40 values. Those 40 values can range from 0 to 4,000.
Since both the number of rows and the number of values can change dynamically, a single array was chosen to store everything. Jagged arrays are not used because they are clunky to work with, and I have read that they hurt performance.
The code then counts the values present in the array if any combination of the 39 values meets a specific condition. In the example below the first value is counted if the second value is a 5, the second value is counted if the first value is a 2, and the third value is counted if the first value is a 2 and the second value is a 5. These conditions are only known at runtime. The combinations of conditions each with their own range of hundreds of possible values quickly reaches an array with a length too big to store, which means I cannot bake the amounts in some array and be done with it.
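To make the counting rule concrete, here is a small sketch in Python. It is for illustration only and is not tuned for speed. The function name count_conditional and the rules layout (a target column plus a list of (column, required value) pairs) are assumptions made for this example; they are not part of my actual code.

```python
def count_conditional(values, row_width, rules, max_value):
    """Count target-column values for rows whose other columns match.

    values:    flat list, row r occupies values[r*row_width:(r+1)*row_width]
    rules:     list of (target_col, [(col, required_value), ...]) pairs,
               a hypothetical runtime representation of the conditions
    max_value: largest value that can occur in any column
    """
    totals = [0] * (max_value + 1)
    for start in range(0, len(values), row_width):
        for target_col, conditions in rules:
            # A rule fires only if every one of its conditions holds for this row.
            if all(values[start + col] == wanted for col, wanted in conditions):
                totals[values[start + target_col]] += 1
    return totals


# First three columns of the four example rows, stored flat.
rows = [1, 5, 7,  2, 5, 7,  2, 6, 7,  4, 6, 8]
rules = [
    (0, [(1, 5)]),          # count column 1 if column 2 equals 5
    (1, [(0, 2)]),          # count column 2 if column 1 equals 2
    (2, [(0, 2), (1, 5)]),  # count column 3 if column 1 equals 2 and column 2 equals 5
]
print(count_conditional(rows, 3, rules, 8))
```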
This piece of code will be called billions of times per day, maybe more, so it is imperative that it is as quick as possible. The implementation I have currently, which resembles the example below (there is an additional optimization which only evaluates the conditions once), has it down to an average of 40 milliseconds. I need to reduce this to at most 4 milliseconds by any means possible (except parallelism; obviously throwing more cores at it will make it faster). I briefly looked at SIMD but couldn't figure out how to apply it to this problem. How/where can I find the fastest algorithm to tackle this problem?
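using System;
using System.Diagnostics;

void Main()
{
    // 400,000 rows of 40 values each, stored flat: row r starts at index r * 40.
    var values = new ushort[400_000 * 40];
    var stopwatch = new Stopwatch();
    stopwatch.Start();
    var totals = new ushort[4_000];
    for (var i = 0; i < values.Length; i += 40)
    {
        // Count the first value if the second value is 5.
        if (values[i + 1] == 5)
        {
            totals[values[i]]++;
        }
        // Count the second value if the first value is 2.
        if (values[i] == 2)
        {
            totals[values[i + 1]]++;
        }
        // Count the third value if the first value is 2 and the second is 5.
        if (values[i] == 2 && values[i + 1] == 5)
        {
            totals[values[i + 2]]++;
        }
        // More ifs which count...
    }
    Console.WriteLine(stopwatch.ElapsedMilliseconds);
}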
I was going through the Training Data RASA Format as detailed here.
{
  "text": "show me chinese restaurants",
  "intent": "restaurant_search",
  "entities": [
    {
      "start": 8,
      "end": 15,
      "value": "chinese",
      "entity": "cuisine"
    }
  ]
}
The substring "chinese" is marked as an entity from index 8 to index 15 of the utterance.
I have written a small C# program to verify the correctness of the index of the characters in the utterance.
using System;

public class Program
{
    public static void Main(string[] args)
    {
        string s = "show me chinese restaurants";
        int i = 0;
        // Print every character next to its zero-based index.
        foreach (var item in s.ToCharArray())
            Console.WriteLine("{0} - {1}", item, i++);
    }
}
But when I run the program I get the following output:
s - 0
h - 1
o - 2
w - 3
- 4
m - 5
e - 6
- 7
c - 8
h - 9
i - 10
n - 11
e - 12
s - 13
e - 14
- 15
r - 16
e - 17
s - 18
t - 19
a - 20
u - 21
r - 22
a - 23
n - 24
t - 25
s - 26
Notice the odd annotation: the substring "chinese" starts at index 8, but the annotated end of 15 is the index of the whitespace after it.
I would expect the substring "chinese" to start at index 8 and end at index 14.
When I train on the same text with the entity annotated from 8 to 14, I get a Misaligned Entity Annotation warning from RASA, as detailed here.
Can someone explain this strange behavior?
Thanks
Reading the link provided I may have come up with a possible explanation:
which together make a python style range to apply to the string, e.g. in the example below, with text="show me chinese restaurants", then text[8:15] == 'chinese'
This led me down a path where I was thinking:
Hmm, that is weird, I wonder if Python does indexing weirdly.
I spun up a quick app to prove this:
text = "show me chinese restaurants"
print(text[8:15])
Now this may not make sense, because the character at index 15 of the string is in fact a space. Which led me to this article:
https://www.pythoncentral.io/how-to-slice-listsarrays-and-tuples-in-python/
It seems that the operator used in the example, text[8:15], slices the string. The article uses this example:
a = [1, 2, 3, 4, 5, 6, 7, 8]
a[1:4], which outputs [2, 3, 4]
and explains it as follows:
Let me explain it. The 1 means to start at second element in the list (note that the slicing index starts at 0). The 4 means to end at the fifth element in the list, but not include it. The colon in the middle is how Python's lists recognize that we want to use slicing to get objects in the list.
So it seems that the second parameter of the slicing is exclusive.
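A quick way to convince yourself is to compute the span in Python and check what sits at the end index. This is only a sketch; the helper name entity_span is made up for this example and is not part of RASA.

```python
def entity_span(text, value):
    """Return (start, end) so that text[start:end] == value.

    Hypothetical helper: `end` is exclusive, as in a Python slice.
    Raises ValueError if `value` does not occur in `text`.
    """
    start = text.index(value)
    return start, start + len(value)


text = "show me chinese restaurants"
start, end = entity_span(text, "chinese")
print(start, end)            # 8 15
print(repr(text[start:end])) # 'chinese'
print(repr(text[end - 1]))   # 'e', the last character that is included
print(repr(text[end]))       # ' ', the first character that is excluded
```

So an end of 15 does not mean the space is part of the entity; it marks the position just past the final "e" at index 14.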
Hope this helps
p.s. Had to learn and setup some python stuff :D
This isn't a complicated problem, but I can't for whatever reason think of a simple way to do this with the modulus operator. Basically I have a collection of N items and I want to display them in a grid.
I can display a maximum of 3 entries across and any number of rows vertically; they are not fixed width. So if I have 2 items they get displayed like this: [1][2]. If I have 4 items they get displayed stacked like this:
[1][2]
[3][4]
If I have 5 items it should look like this:
[ 1 ][ 2]
[3][4][5]
Seven items is slightly more complicated:
[ 1 ][ 2]
[ 3 ][ 4]
[5][6][7]
This is one of those things where if I slept on it, it would be brain dead obvious in the morning, but all I can think about doing involves complicated loops and state variables. There has to be an easier way.
I'm doing this in C# but I doubt the language matters.
By maximizing the number of rows that have three items, you can minimize the total number of rows. Thus six items would be grouped as two rows of 3 rather than three rows of 2:
[1][2][3]
[4][5][6]
and ten items would be grouped as two rows of 2 and two rows of 3 rather than five rows of 2:
[ 1 ][ 2 ]
[ 3 ][ 4 ]
[5][6][7 ]
[8][9][10]
If you want rows with two items first, then you keep peeling off two items until the remaining items are divisible by 3. As you go through the loop, you need to keep track of the number of remaining items using an index or whatnot.
In your loop to populate each row, you can check these conditions:
//logic within loop iteration
if (remaining % 3 == 0) //take remaining in threes; break the loop
else if (remaining >= 4) //take two items, leaving two or more remaining
else //take remaining items, which will be one or two; break the loop
If we walk through the example of 10 items, the process would go as follows:
10 items remaining. 10 % 3 != 0. Since 10 > 4, take two items.
8 items remaining. 8 % 3 != 0. Since 8 > 4, take two items.
6 items remaining. 6 % 3 = 0. Take those 6 items in groups of three.
To go to your example of 7 items:
7 items remaining. 7 % 3 != 0. Since 7 > 4, take two items.
5 items remaining. 5 % 3 != 0. Since 5 > 4, take two items.
3 items remaining. 3 % 3 = 0. Take those 3 items as a group.
And here's the result for 4 items:
4 items remaining. 4 % 3 != 0. Since remaining = 4, take two items.
2 items remaining. 2 % 3 != 0. 2 < 4. Fall to else condition, take remaining items.
I think that'll work. At least, at 12:30 a.m. it seems like it should work.
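The loop above could be sketched like this in Python (the question is in C#, but the logic carries over directly). The function name row_sizes_twos_first is made up for illustration, and it returns only the number of items per row, not the items themselves.

```python
def row_sizes_twos_first(item_count):
    """Return the number of items in each row, two-item rows first.

    Hypothetical helper following the three conditions described above.
    """
    rows = []
    remaining = item_count
    while remaining > 0:
        if remaining % 3 == 0:
            # Everything left fits into rows of three.
            rows.extend([3] * (remaining // 3))
            break
        elif remaining >= 4:
            # Peel off a row of two, leaving at least two items.
            rows.append(2)
            remaining -= 2
        else:
            # One or two items left: they form the last row.
            rows.append(remaining)
            break
    return rows


for n in (4, 5, 7, 10):
    print(n, row_sizes_twos_first(n))
# 4 [2, 2]
# 5 [2, 3]
# 7 [2, 2, 3]
# 10 [2, 2, 3, 3]
```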
if ((list.Count % 2) == 0)
{
    //Display all as [][]
    //               [][]
}
else
{
    //Display all as [][]
    //               [][]
    //Display last 3 as [][][]
}
How about pseudo-code:
if n mod 3 = 1
    first 2 rows have 2 items each (assuming n >= 4)
    all remaining rows have 3 items
else if n mod 3 = 2
    first row has 2 items
    all remaining rows have 3 items
else
    all rows have 3 items
So, given that: a) the objective is to minimize the number of rows, b) a row cannot have more than 3 items, c) a row should have 3 items if possible, and d) you cannot have a row with a single item unless it is the only item, I would say the algorithm goes as follows:
If there is only one item, it will be alone in its own row; done.
Calculate the 'tentative' number of rows by dividing the number of items by 3.
If the remainder (N % 3) is 0, then all rows will have 3 items.
If the remainder is 1, then there will be an additional row, and the last 2 rows will only have 2 items each.
If the remainder is 2, then there will be an additional row, and it will only have 2 items.
This algorithm will produce a slightly different format from the one you were envisioning (the 3-item rows will be at the top and the 2-item rows at the bottom), but it satisfies the constraints. If you need the 2-item rows to be at the top, you can modify it.
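As a sketch of the steps above, in Python and with the made-up name row_sizes_threes_first, the row sizes can be computed directly from the quotient and remainder without a loop:

```python
def row_sizes_threes_first(item_count):
    """Return the number of items in each row, three-item rows first.

    Hypothetical helper; a single item is the only case with a row of one.
    """
    if item_count <= 0:
        return []
    if item_count == 1:
        return [1]
    full_rows, remainder = divmod(item_count, 3)
    if remainder == 0:
        return [3] * full_rows
    if remainder == 1:
        # Borrow one item from the last full row: two rows of two at the end.
        return [3] * (full_rows - 1) + [2, 2]
    # remainder == 2: one extra row of two at the end.
    return [3] * full_rows + [2]


print(row_sizes_threes_first(7))   # [3, 2, 2]
print(row_sizes_threes_first(10))  # [3, 3, 2, 2]
```

Reversing the returned list would put the 2-item rows at the top instead.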