I've got a weird problem to solve - this is to be used in designing a quiz, but it's easiest to explain using teams.
There are 16 teams and 24 matches, with 4 teams playing in every match. Each team has to appear exactly 6 times, meeting 12 of the other 15 teams once and the remaining 3 twice. Any ideas on how to do this? If there's software that can do this, that'd be great as well.
UPDATE:
I'm not sure if the above is even possible. Here is the minimum we're trying to accomplish:
Number of games is not set.
Each Game has 4 teams.
Each team gets an equal number of games.
Is this possible?
Check this ...
http://en.wikipedia.org/wiki/Round-robin_tournament
I think someone could generalize the algorithm so that it applies to more than 2 teams ...
I know this doesn't answer the question, but it provides some tips ...
This also may help a little ...
http://en.wikipedia.org/wiki/Tournament_(graph_theory)
Note that each team plays 3 others per match, so it takes at least 5 matches to play all 15 other teams. We hope, then, that there is a solution for 20 matches where each team plays 5 matches and plays each team exactly once.
With 16 teams it's possible to construct a solution by hand in the following way...
Divide the 20 matches into 5 rounds
Number the teams 1 to 16
For each match in turn, for each of the 4 places in that match, allocate the first team which
is still available to play in that round
has not yet played any of the teams already allocated to that match
You can narrow the search for available teams somewhat by noting that each match must contain exactly one team from each match of the previous round, so for place n you need only consider the teams which played match n in the previous round.
If we want 24 matches then any random choice of matches will suffice in the sixth round to fit the original requirements. However, to also ensure that no exact matches are repeated we can switch pairs of teams between the matches in some previous round. That is, if {1,2,3,4} and {5,6,7,8} were matches in some round then in round 6 we'll have {1,2,7,8} and {3,4,5,6}. Since 1 and 2 played each other exactly once in rounds 1-5, in the match {1,2,3,4}, we certainly haven't played match {1,2,7,8} yet.
The choice of data structures to implement that efficiently is left as an exercise for the reader.
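As a cross-check on the counts, here is a sketch that builds such a schedule and verifies it. Note that it does not implement the first-available-team procedure described above; it swaps in a standard construction instead (the lines of the affine plane over the 4-element field GF(4), grouped by slope into 5 rounds), and then derives a sixth round by the pair-switching trick. All function names are made up for the example.

```python
from collections import Counter
from itertools import combinations

# Multiplication table of GF(4); addition in GF(4) is bitwise XOR.
GF4_MUL = [
    [0, 0, 0, 0],
    [0, 1, 2, 3],
    [0, 2, 3, 1],
    [0, 3, 1, 2],
]


def team(x, y):
    """Number the 16 points (x, y) of the plane as teams 1..16."""
    return 4 * x + y + 1


def five_rounds():
    """Each round is one class of 4 parallel lines, i.e. 4 disjoint matches."""
    rounds = []
    for slope in range(4):  # lines y = slope * x + b
        rounds.append([[team(x, GF4_MUL[slope][x] ^ b) for x in range(4)]
                       for b in range(4)])
    rounds.append([[team(c, y) for y in range(4)] for c in range(4)])  # x = c
    return rounds


def sixth_round(previous_round):
    """Switch pairs between matches of an earlier round, as described above."""
    a, b, c, d = previous_round
    return [a[:2] + b[2:], b[:2] + a[2:], c[:2] + d[2:], d[:2] + c[2:]]


def meeting_counts(rounds):
    """Count how often each pair of teams shares a match."""
    counts = Counter()
    for matches in rounds:
        for match in matches:
            counts.update(combinations(sorted(match), 2))
    return counts


if __name__ == "__main__":
    schedule = five_rounds()
    schedule.append(sixth_round(schedule[0]))
    for number, matches in enumerate(schedule, start=1):
        print("Round", number, matches)
```

After the first five rounds every pair of teams has met exactly once; after the sixth round each team has played 6 matches and has met exactly 3 teams twice, which is the original requirement.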
Pull out your combinatorics book. As I recall, questions like this fall within its scope.
"Combinatorial Designs and Tournaments" was a textbook I had for a course about Combinatorial Designs that had this type of problem. One of my majors back in university was Combinatorics & Optimization, so I do remember a little about this kind of thing.
A little more clarity in identifying the problem would be helpful. What type of sport are you trying to schedule? It sounds like you're running a 16-person tennis league where each week 4 individual players show up on each of four courts to play a doubles match (players A&B vs C&D). The same is happening on the other three courts with players E through P. Is this what you're looking for? If so, the answer is easy. If not, I still don't understand what you're looking for.
I am looking to solve the problem stated below, but I am having trouble working out what to actually look for, as I cannot describe the problem in simple terms. I am hoping someone may be able to shed some light on the correct algorithm or path I should take to solve it.
The problem (simplified):
So let's say I have multiple person objects.
Person1
Person2
Person3
Now let's say I have 6 slots
Slot1
Slot2
Slot3
Slot4
Slot5
Slot6
Each person has rules associated with them such as
Person1 can not use a slot with an odd number and must be in 3
different slots.
Person2 Can only go into slots from 2 up and must be in 2 slots
Person3 can only go into 1 prime number slot.
so we end up with
Slot1 - Person3
Slot2 - Person1
Slot3 - Person2
Slot4 - Person1
Slot5 - Person2
Slot6 - Person1
I know this will require use of AI/machine learning, and I have done some research into the area, but I cannot find what algorithm I should be using for a problem like this, or even how to search for it. The only way I have found of doing this is through a regression tree, but that seems like the wrong path to take.
Note: I will be using C# to solve this problem and hopefully some framework like Encog.
Actually I think you can solve this problem with Maximum Matching with a simple modification. In the standard Maximum Matching each node is only matched with one other node but here a Person can have multiple matches. By creating multiple instances of Person you can reduce this problem to Maximum Matching. For example:
Person1 cannot use a slot with an odd number and must be in 3
different slots.
Create 3 nodes for Person1 and connect them to even number slots.
Person 2 Can only go into slots from 2 up and must be in 2 slots
Create 2 nodes for Person2 and connect them to slots numbered 2 or higher.
Person 3 can only go into 1 prime number slot.
Create 1 node for Person3 and connect it to slot1, slot2, slot3, and slot5 (this treats 1 as prime, as the example in the question does).
Perform Maximum Matching on the resulting graph and you will find the answer.
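The reduction above can be sketched in a few lines. This is only an illustration, not a definitive implementation: it uses a simple augmenting-path matching (Kuhn's algorithm), the names `max_bipartite_matching` and `build_candidates` are made up for the example, and it assumes that "from 2 up" means slots 2 to 6 and that slot 1 counts as prime, as in the question's sample output.

```python
def max_bipartite_matching(candidates):
    """candidates maps each left-hand node to the slots it is allowed to use."""
    slot_owner = {}  # slot -> node currently matched to it

    def try_place(node, visited):
        # Try to give `node` a slot, re-homing earlier nodes if needed.
        for slot in candidates[node]:
            if slot in visited:
                continue
            visited.add(slot)
            if slot not in slot_owner or try_place(slot_owner[slot], visited):
                slot_owner[slot] = node
                return True
        return False

    matched = sum(try_place(node, set()) for node in candidates)
    return slot_owner, matched


def build_candidates():
    # (number of copies, allowed slots) per person; the slot lists are
    # assumptions read off the rules in the question.
    rules = {
        "Person1": (3, [2, 4, 6]),
        "Person2": (2, [2, 3, 4, 5, 6]),
        "Person3": (1, [1, 2, 3, 5]),
    }
    return {
        (person, copy): slots
        for person, (copies, slots) in rules.items()
        for copy in range(copies)
    }


if __name__ == "__main__":
    owner, matched = max_bipartite_matching(build_candidates())
    for slot in sorted(owner):
        print(f"Slot{slot} - {owner[slot][0]}")
```

Because every slot is matched to at most one node, the copies of a person automatically land in different slots. If `matched` is smaller than the number of nodes, the rules cannot all be satisfied.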
Actually, this is a standard discrete optimization problem. You may want to look at the Coursera Discrete Optimization course.
In its first week, it has a workshop named Simple Puzzles. It gives a problem similar to yours and shows how to solve it on their platform, MiniZinc.
Once you understand how this type of problem is solved, you may want to look for a C# solution in the List of optimization software.
The pairing between Person and Slot is a scheduling problem that requires knowledge to solve. The knowledge is described in rules like "Person1 can not use a slot with an odd number and must be in 3 different slots." The algorithm to solve the problem uses the rules and the given resources (three persons and six slots) to generate all possible solutions. Computer science has the task of formalizing the knowledge into machine-readable code. There are many languages for this, e.g. PDDL, Prolog or object-oriented languages. In classical "AI Planning and Scheduling", which is discussed at the ICAPS conference, PDDL would be the preferred choice for modelling the knowledge. The algorithm that solves a given domain is in most cases backtracking, Monte Carlo tree search or simply brute force. That is called "problem solving as search".
One of my clients wants to use a unique code for his items (long story..) and he asked me for a solution. The code will consist of 4 parts: the first is the zip code the item is sent from, the second is the supplier registration number, the third is the year the item is sent, and the last is a unique three-character alphanumeric string.
As you can see, the first three parts are static fields which will never change for the same sender in the same year. So we can say that the last part is the identifier for that year. This part is a 3-character alphanumeric string, which means it starts at 000 and ends at ZZZ.
The problem is that my client, for good reasons, wants this part not to be sequential. For example, this is not what he wants:
06450-05-2012-000
06450-05-2012-001
06450-05-2012-002
...
06450-05-2012-ZZY
06450-05-2012-ZZZ
The last part should be produced randomly, like:
06450-05-2012-A17
06450-05-2012-0BF
06450-05-2012-002
...
06450-05-2012-T7W
06450-05-2012-22C
But it should also be non-repetitive. So once a possible id is generated, it should be discarded from the selection pool.
I am looking for an effective way to do this.
If I only record selected possibilities and check a newly created one against them there is always a worst case possibility that it keeps producing already selected ones, especially near the end.
If I create all possibilities at once and record them in a table or a file, each item creation may take a while because it has to look up a non-selected record. By the way, 26 letters + 10 digits means 46.656 possible combinations, and there is a chance that a 4th character may be added, which means 1.679.616 possible combinations.
Is there a more effective way you can suggest? I will use C# for coding and MS SQL for the database.
If it doesn't have to be random, you could maybe simply choose a fixed but "unpredictable" addend which is relatively prime to 26 + 10 == 36 == 2²·3². This means, just choose a fixed addend divisible by neither 2 nor 3.
Then keep adding this fixed number to your previous serial number every time you need a new serial number. This is to be done modulo 46656 (or 1679616) of course.
Mathematics guarantees you won't get the same number twice until every possible number has been used.
As the addend, you could use const int addend = 26075 since it's 5 modulo 6.
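A minimal sketch of that idea, assuming the serial is stored as an integer and rendered in base 36 with the digits 0-9 followed by A-Z. The names `encode` and `next_serial` are made up for the example.

```python
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
MODULUS = 36 ** 3   # 46656 three-character codes
ADDEND = 26075      # odd and not a multiple of 3, so coprime to 36**3


def encode(number, width=3):
    """Render an integer as a fixed-width base-36 string."""
    chars = []
    for _ in range(width):
        number, digit = divmod(number, 36)
        chars.append(ALPHABET[digit])
    return "".join(reversed(chars))


def next_serial(previous):
    """Step to the next serial; visits every value once before repeating."""
    return (previous + ADDEND) % MODULUS


if __name__ == "__main__":
    serial = 0
    for _ in range(5):
        serial = next_serial(serial)
        print("06450-05-2012-" + encode(serial))
```

Only the last issued integer has to be stored per zip-supplier-year tuple. For a 4-character code the same addend still works with a modulus of 36 ** 4, since it shares no factor with 36.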
If you expect to create far less than 36^3 entries for each zip-supplier-year tuple, you should probably just pick a random value for the last field and then check to see if it exists, repeating if it does.
Even if you create half of the maximum number of possible entries, new entries still have an expected value of only one failure. Assuming your database is indexed on the overall identifier, this isn't too great a price to pay.
That said, if you expect to use all but a few possible identifiers, then you should probably create all the possible records in advance. It may sound like a high cost, but each space in memory storing an unused record will eventually store a real record.
I'd expect the first situation is more likely, but if not, or if there's some other combination of the two, please add a comment with some more information and I'll revise my answer.
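The pick-and-retry approach can be sketched as follows. This is an illustration only: the in-memory `used` set stands in for an indexed lookup on the identifier column, and the function name `new_code` is made up for the example.

```python
import random
import string

ALPHABET = string.digits + string.ascii_uppercase  # 36 symbols


def new_code(used, rng, length=3):
    """Draw random codes until one is found that has not been issued yet."""
    while True:
        code = "".join(rng.choice(ALPHABET) for _ in range(length))
        if code not in used:   # in a database: an indexed existence check
            used.add(code)
            return code


if __name__ == "__main__":
    issued = set()
    rng = random.Random()
    print([new_code(issued, rng) for _ in range(5)])
```

When a fraction p of the codes is already taken, each draw collides with probability p, so the expected number of failed draws before a success is p / (1 - p). At p = 1/2 that is exactly one, which matches the estimate above.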
I think options depend on the amount of the codes that are going to be used:
If you expect to use most of them within a year, then it is better to pre-generate. If done right, lookup should be really fast. And you are going to have 1.679.616 items per year in your DB anyway, so you will have to do such things right.
On the other hand, is it good that you are expecting to use most of them? It may leave you without codes if there are suddenly more items than expected.
If you expect to use only a small amount, then random+existence check might be a way to go, however it is unclear what amount it should be for that to be best (I am pretty sure it is possible to calculate that though).
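One possible way to make the pre-generated lookup cheap, sketched under the assumption that the pool is shuffled once and then consumed in order. The names `build_pool` and `CodeDispenser` are hypothetical; in a database, the shuffled position would be an indexed column and the counter a single stored integer.

```python
import random
import string
from itertools import product

ALPHABET = string.digits + string.ascii_uppercase


def build_pool(length=3, seed=None):
    """Generate every code of the given length once, in shuffled order."""
    pool = ["".join(chars) for chars in product(ALPHABET, repeat=length)]
    random.Random(seed).shuffle(pool)
    return pool


class CodeDispenser:
    """Hands out pre-shuffled codes one by one; no existence check is needed."""

    def __init__(self, pool):
        self.pool = pool
        self.position = 0

    def next_code(self):
        if self.position >= len(self.pool):
            raise RuntimeError("all codes have been used")
        code = self.pool[self.position]
        self.position += 1
        return code


if __name__ == "__main__":
    dispenser = CodeDispenser(build_pool())
    print([dispenser.next_code() for _ in range(5)])
```

The cost is paid once, when the pool is built. After that, issuing a code is a lookup by position rather than a search for an unused record.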
I record a daily 2-minute radio broadcast from the Internet. There's always the same starting and ending jingle. Since the exact broadcast time may vary by about 6 minutes either way, I have to record around 15 minutes of radio.
I wish to identify the exact time where those jingles are in the 15 minutes record, so I can extract the portion of audio I want.
I already started a C# application where I decode an MP3 to PCM data and convert the PCM data to a spectrogram based on http://www.codeproject.com/KB/audio-video/SoundCatcher.aspx
I tried to use a cross-correlation algorithm on the PCM data, but the algorithm is very slow (around 6 minutes with a step of 10 ms), and on some occasions it fails to find the jingle start time.
Any ideas for algorithms to compare two spectrograms for a match? Or a better way to find that jingle start time?
Thanks,
Update, sorry for the delay
First, thanks for all the answers; most of them were relevant and/or interesting ideas.
I tried to implement the Shazam algorithm proposed by fonzo, but failed to detect the peaks in the spectrogram. Here are three spectrograms of the starting jingle from three different recordings. I tried AForge.NET with the blob filter (but it failed to identify peaks), blurring the image and checking for differences in height, the Laplace convolution, slope analysis, and detecting the series of vertical bars (but there were too many false positives)...
In the meantime, I tried the Hough algorithm proposed by Dave Aaron Smith, calculating the RMS of each column. Yes, each column: it's O(N*M), but M << N (note that a column is around 8k samples). So overall it's not that bad; the algorithm still takes about 3 minutes, but it has never failed.
I could go with that solution, but if possible I would prefer Shazam because it's O(N) and probably much faster (and cooler too). So if any of you has an idea for an algorithm that always detects the same points in those spectrograms (they don't have to be peaks), please add a comment.
New Update
Finally, I went with the algorithm explained above. I tried to implement the Shazam algorithm, but failed to find proper peaks in the spectrogram; the identified points were not consistent from one sound file to another. In theory, the Shazam algorithm is the solution for that kind of problem. The Hough algorithm proposed by Dave Aaron Smith was more stable and effective. I split around 400 files, and only 20 of them failed to split properly. Disk space went from 8GB to 1GB.
Thanks for your help.
There's a description of the algorithm used by the Shazam service (which identifies a piece of music given a short, possibly noisy sample) here: http://www.ee.columbia.edu/~dpwe/papers/Wang03-shazam.pdf
From what I understood, the first thing done is to isolate peaks in the spectrogram (with some tweaks to ensure uniform coverage), which gives a "constellation" of (time; frequency) pairs from the initial spectrogram. Once done, the sample constellation is compared to the constellation of the full track by sliding a window of the sample length from the beginning to the end and counting the number of correlated points.
The paper then describes the technical solution they found to be able to do the comparison fast even with a huge collection of tracks.
I wonder if you could use a Hough transform. You would start by cataloging each step of the opening sequence. Let's say you use 10 ms steps and the opening sequence is 50 ms long. You compute some metric on each step and get
1 10 1 17 5
Now go through your audio and analyze each 10 ms step for the same metric. Call this array have_audio
8 10 8 7 5 1 10 1 17 6 2 10...
Now create a new empty array that's the same length as have_audio. Call it start_votes. It will contain "votes" for the start of the opening sequence. If you see a 1, you may be in the 1st or 3rd step of the opening sequence, so you cast 1 vote for the opening sequence starting at the current step and 1 vote for it starting 2 steps earlier. If you see a 10, you cast 1 vote for a start 1 step earlier; a 17 votes for a start 3 steps earlier, and so on.
So for that example have_audio, your votes will look like
2 0 0 1 0 4 0 1 0 0 1 0 ...
You have a lot of votes at position 6, so there's a good chance the opening sequence starts there.
You could improve performance by not bothering to analyze the entire opening sequence. If the opening sequence is 10 seconds long, you could just search for the first 5 seconds.
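The voting scheme can be sketched like this. It is only an illustration: the function name `vote_for_start` and the `tolerance` parameter are assumptions added for the example (real metrics such as RMS values will rarely match exactly, so some tolerance is likely needed).

```python
def vote_for_start(template, audio, tolerance=0):
    """Return one vote count per audio step for 'the template starts here'."""
    votes = [0] * len(audio)
    for i, value in enumerate(audio):
        for k, expected in enumerate(template):
            start = i - k  # where the template would begin if step i is its k-th step
            if start >= 0 and abs(value - expected) <= tolerance:
                votes[start] += 1
    return votes


if __name__ == "__main__":
    opening = [1, 10, 1, 17, 5]
    have_audio = [8, 10, 8, 7, 5, 1, 10, 1, 17, 6, 2, 10]
    start_votes = vote_for_start(opening, have_audio)
    best = max(range(len(start_votes)), key=start_votes.__getitem__)
    print(start_votes, "best start at position", best + 1)
```

With the example data, the largest vote count lands on position 6 (counting from 1) even though the last step of the opening sequence was measured as 6 instead of 5, which is the point of voting rather than requiring an exact match.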
Here is a good python package that does just this:
https://code.google.com/p/py-astm/
If you are looking for a specific algorithm, good search terms to use are "acoustic fingerprinting" or "perceptual hashing".
Here's another python package that could also be used:
http://rudd-o.com/new-projects/python-audioprocessing/documentation/manuals/algorithms/butterscotch-signatures
If you already know the jingle sequence, you could analyse the correlation with the sequence instead of the cross correlation between the full 15 minutes tracks.
To quickly calculate the correlation against the (short) sequence, I would suggest using a Wiener filter.
Edit: a Wiener filter is a way to locate a signal in a sequence with noise. In this application, we are considering anything that is "not jingle" as noise (question for the reader: can we still assume that the noise is white and not correlated?).
(I found the reference I was looking for! The formulas I remembered were a little off and I'll remove them now.)
The relevant page is Wiener deconvolution. The idea is that we can define a system whose impulse response h(t) has the same waveform as the jingle, and we have to locate the point in a noisy sequence where the system has received an impulse (i.e. emitted a jingle).
Since the jingle is known, we can calculate its power spectrum H(f), and since we can assume that a single jingle appears in a recorded sequence, we can say that the unknown input x(t) has the shape of a pulse, whose power density S(f) is constant at each frequency.
Given the knowledge above, you can use the formula to obtain a "jingle-pass" filter (as in, only signals shaped like the jingle can pass) whose output is highest when the jingle is played.
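For reference, the usual frequency-domain form of the Wiener deconvolution filter is reproduced below. Here H(f) is taken to be the Fourier transform of the jingle waveform h(t) and S(f) the power spectral density of the input, as above; N(f), the power spectral density of everything that is "not jingle", is a symbol introduced for this note and would have to be estimated from the recording.

```latex
G(f) = \frac{H^{*}(f)\, S(f)}{\lvert H(f) \rvert^{2}\, S(f) + N(f)}
```

Multiplying the spectrum of the recording by G(f) and transforming back gives an estimate of the impulse train x(t), so the jingle start should show up as the largest peak. With S(f) constant, the filter reduces to something close to a matched filter wherever N(f) dominates.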
I am trying to figure out the following simple problem in order to get acquainted with Solver Foundation.
I have 8 hours, 1 room and 3 teachers. Each teacher must hold 2 lectures 1 hour long each and each teacher must not hold 2 consecutive lectures. I am having trouble finding out how to model something that contains time in it. How can that be modeled into a mathematical equation?
I am not looking for a code block that does it, but rather an explanation or maybe some resources that I can read.
Thanks in advance.
Since you say you have 8 hours, and each lecture must be exactly 1 hour long, can’t you just model the 8 hours as “slots” that you put the teachers “into”? It seems equivalent to assigning people to cinema seats or similar (except of course that each teacher can have two time slots).
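One common way to write the slot idea down as equations is with 0/1 decision variables. This is a sketch of a possible model, not the only one; the variable name x and the index names t (teacher) and h (hour) are chosen for the example.

```latex
x_{t,h} \in \{0, 1\}, \qquad t = 1,\dots,3, \quad h = 1,\dots,8
\quad (x_{t,h} = 1 \text{ if teacher } t \text{ lectures in hour } h)

\sum_{h=1}^{8} x_{t,h} = 2 \qquad \text{for each teacher } t
\quad \text{(two lectures each)}

\sum_{t=1}^{3} x_{t,h} \le 1 \qquad \text{for each hour } h
\quad \text{(one room)}

x_{t,h} + x_{t,h+1} \le 1 \qquad \text{for each teacher } t,\ h = 1,\dots,7
\quad \text{(no consecutive lectures)}
```

Time appears only as the index h, so "consecutive" becomes a constraint on neighbouring indices. Any assignment of 0s and 1s that satisfies all three families of constraints is a valid timetable.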
I'm developing a poker game in C#. At the moment I'm trying to get the player's hand score using RegEx. I search the string (composed of the cards' suits and numbers) and look for suits or numbers that match the RegEx. If I get 2 matches then the player has a pair; with 3 matches he has 3 of a kind.
I have 3 classes at the moment, a Card class (with number and suit), a Deck class (that contains 52 Cards) and a Hand class that gets five cards from the shuffled deck.
Deck class has a shuffleDeck();
Hand class has the functions to calculate the score (it is in these functions that I am using RegEx).
I generate the string on which I use RegEx by adding the 5 suits and numbers that the hand has.
Is this a good idea or should I do it another way, if so, how?
Thank you for your help
PS. I am one of those inexperienced programmers who want to use a newly learned tool for everything
I do not think that a regex is the appropriate way to deal with this. You probably should be using a more sophisticated representation of a hand than a string.
You have not provided much detail, but from what I have read, I assume you're not pushing the OOP very far...
I would have a Card class that has a Rank and Suit class instances. I would then have a deck class that handles shuffling / dealing...
I would then have a Hand class that would contain your poker hand of n Card objects...
In this way you can build up rules to evaluate each hand object, thus being more flexible and more extensible in the future...say if you want to make another card game / add support for another variant of poker...
Using Regular expressions to do all of this seems to be a pretty poor choice.
I would agree with the others, that Regex seems like a bad choice. It might work with pairs, 3 of a kind, 4 of a kind. However, it might get kind of tricky (or impossible) once you start looking at hands like flushes, straights, and 2 pair.
I think the best solution would be to evaluate the cards from best hand to worst hand, as shown here, and as soon as you find a match, that is your hand. This ensures that you don't mistake 4 of a kind for 2 pair, or a straight flush for just a straight or just a flush. I would go with mmattax and create an object for the card and an object for the hand; then you can evaluate the cards in each hand to see if they meet the required criteria for each hand.
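A minimal sketch of the best-to-worst evaluation, written in Python for brevity rather than C#. The representation is an assumption made for the example: each card is a (rank, suit) tuple with ranks 2 to 14 (ace high) and a one-letter suit, and `classify` is a made-up name.

```python
from collections import Counter


def classify(hand):
    """Name the best category for a list of five (rank, suit) tuples."""
    ranks = sorted(rank for rank, _ in hand)
    flush = len({suit for _, suit in hand}) == 1
    # Sizes of the groups of equal ranks, largest first, e.g. [3, 2] or [2, 2, 1].
    groups = sorted(Counter(ranks).values(), reverse=True)
    distinct = len(set(ranks)) == 5
    # Five distinct consecutive ranks, or the ace-low straight A-2-3-4-5.
    straight = distinct and (ranks[4] - ranks[0] == 4 or ranks == [2, 3, 4, 5, 14])

    # Checked from best to worst, so the first hit is the answer.
    if straight and flush:
        return "straight flush"
    if groups[0] == 4:
        return "four of a kind"
    if groups == [3, 2]:
        return "full house"
    if flush:
        return "flush"
    if straight:
        return "straight"
    if groups[0] == 3:
        return "three of a kind"
    if groups == [2, 2, 1]:
        return "two pair"
    if groups[0] == 2:
        return "pair"
    return "high card"


if __name__ == "__main__":
    print(classify([(9, "H"), (9, "S"), (9, "D"), (9, "C"), (4, "H")]))
```

Because the checks run in rank order, four of a kind is never reported as two pair and a straight flush is never reported as a plain straight or flush.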
I think prime numbers are a good solution for that.
consider :
// D H S C
colors = [7,5,3,2]
// A Q K J T 9 8 7 6 5 4 3 2
ranks = [61,59,53,43,41,37,31,29,23,19,17,13,11,61]
A unique card is identified by a color prime number * a rank prime number
(for example, Ace of Diamonds: prime = 7 * 61).
So an entire hand or combination is identified by prime * prime * prime * prime * prime.
If there is a flush of Diamonds, the 5-card hand's prime ID must be divisible (mod = 0) by the flush-of-diamonds ID (7 ^ 5, because the diamonds color prime is 7).
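The prime encoding can be sketched as follows, in Python for brevity. The suit and rank primes are copied from the tables above (using the first 13 rank primes in the order listed); the two-character card strings such as "AD" and the function names are assumptions made for the example.

```python
from math import prod

SUIT_PRIMES = {"D": 7, "H": 5, "S": 3, "C": 2}
RANK_PRIMES = dict(zip("AQKJT98765432",
                       [61, 59, 53, 43, 41, 37, 31, 29, 23, 19, 17, 13, 11]))


def card_id(card):
    """Product of the rank prime and the suit prime, e.g. 'AD' -> 61 * 7."""
    rank, suit = card
    return RANK_PRIMES[rank] * SUIT_PRIMES[suit]


def hand_id(cards):
    """Product of the card IDs; independent of the order of the cards."""
    return prod(card_id(card) for card in cards)


def is_flush(cards, suit):
    """A five-card flush in `suit` means the hand ID is divisible by prime**5."""
    return hand_id(cards) % SUIT_PRIMES[suit] ** 5 == 0


if __name__ == "__main__":
    hand = ["AD", "KD", "9D", "5D", "2D"]
    print(hand_id(hand), is_flush(hand, "D"))
```

Since the suit primes and rank primes do not overlap, the suit factors of a hand ID can only come from the suits, so the divisibility test cannot be triggered by rank primes.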
Using a string to represent the hand seems like a poor decision. My recommendation would be to use an Enum to represent the Suit and another to represent the numeric value of the card.