Removing cells from a Sudoku solution to make it a puzzle


I am writing a Sudoku application and am currently working on the game generation algorithm. I managed to figure out how to quickly generate a solution (not solve). I am stumped on how to remove some of the numbers to actually make it into a puzzle, though. My first inclination was to randomly remove a certain number of cells based on the difficulty, but that is not the correct algorithm, because it often renders a puzzle that is unsolvable or has multiple solutions. It also might generate puzzles that don't reflect the requested difficulty.
Here is the code that I have so far. I removed most of the irrelevant code, but if you would like to see something that isn't implemented but used below, please let me know. I can also provide my attempt at the Puzzlefy method if you would like, but I opted out of immediately posting it since it's blatantly wrong (even though it "works").
using System;
using System.Collections.Generic;
using System.Linq;
namespace Sudoku
{
public class Game
{
public enum Difficulty
{
VeryEasy,
Easy,
Medium,
Difficult,
Evil
}
private readonly int?[,] _currentItems = new int?[9,9];
private readonly int?[,] _solution = new int?[9,9];
private readonly int?[,] _startingItems = new int?[9,9];
private readonly Difficulty _difficulty;
public Game(Difficulty difficulty)
{
_difficulty = difficulty;
GenerateSolution();
Puzzlefy();
}
private void GenerateSolution()
{
var random = new Random();
var availableNumbers = new Stack<List<int?>>(81);
var x = 0;
var y = 0;
availableNumbers.Push(AllowableNumbers(_solution, 0, 0).ToList());
while (x < 9 && y < 9)
{
var currentAvailableNumbers = AllowableNumbers(_solution, x, y).ToList();
availableNumbers.Push(currentAvailableNumbers);
// back trace if the board is in an invalid state
while (currentAvailableNumbers.Count == 0)
{
_solution[x, y] = null;
availableNumbers.Pop();
currentAvailableNumbers = availableNumbers.Peek();
x -= y >= 1 ? 0 : 1;
y = y >= 1 ? y - 1 : 8;
}
var index = random.Next(currentAvailableNumbers.Count);
_solution[x, y] = currentAvailableNumbers[index];
currentAvailableNumbers.RemoveAt(index);
x += y < 8 ? 0 : 1;
y = y < 8 ? y + 1 : 0;
}
}
private void Puzzlefy()
{
CopyCells(_solution, _startingItems);
// remove some stuff from _startingItems
CopyCells(_startingItems, _currentItems);
}
}
}
I am not looking for code, rather an algorithm. How would I go about removing the numbers from the solution to make it into a puzzle?

Here is a paper on sudoku generation
I think that you will need a Sudoku solver that also counts the number of available solutions, then remove numbers in such a way that there is always exactly one solution.
You could apply the same method to adding numbers to an empty grid: check the number of possible solutions after each addition, keep adding while the number of solutions is greater than 1, and backtrack when the number of solutions is 0.

There is no "easy" way to remove clues from a completed Sudoku grid as the removal process is not linear.
After each removal of a cell or clue you need to check if the Sudoku only has a unique solution.
To check this you need to run a solver that can count all possible solutions (you can stop it after 2 possibilities are found to save time).
The two most popular families of algorithms for solving a Sudoku, counting its solutions, and removing cells are backtracking and dancing links.
This article explains really well how the dancing links algorithm can be used in Sudokus:
http://garethrees.org/2007/06/10/zendoku-generation/#section-2
Here is another description of a dancing link algorithm in Sudokus written in JavaScript:
http://www.sudokubum.com/documentation.html
And here is the full paper about dancing links algorithms in general:
http://lanl.arxiv.org/pdf/cs/0011047
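The removal loop that the answers above describe can be sketched roughly as follows. This is a minimal sketch, not the asker's implementation: it assumes the grid is an `int[,]` with 0 for an empty cell (the question uses `int?[,]` with null), it uses plain backtracking rather than dancing links, and the names `PuzzleMaker`, `CountSolutions`, and `Puzzlefy(solution, cellsToRemove, rng)` are illustrative only. It does not rate difficulty; the number of removed cells is just a rough proxy.

```csharp
using System;
using System.Linq;

public static class PuzzleMaker
{
    // True if value v can go at (r, c) without a row, column, or box conflict.
    static bool CanPlace(int[,] g, int r, int c, int v)
    {
        for (int i = 0; i < 9; i++)
        {
            if (g[r, i] == v || g[i, c] == v) return false;
            if (g[r / 3 * 3 + i / 3, c / 3 * 3 + i % 3] == v) return false;
        }
        return true;
    }

    // Counts solutions by backtracking, stopping as soon as 'limit' have been found.
    public static int CountSolutions(int[,] g, int limit = 2)
    {
        for (int r = 0; r < 9; r++)
            for (int c = 0; c < 9; c++)
            {
                if (g[r, c] != 0) continue;
                int count = 0;
                for (int v = 1; v <= 9 && count < limit; v++)
                {
                    if (!CanPlace(g, r, c, v)) continue;
                    g[r, c] = v;
                    count += CountSolutions(g, limit - count);
                    g[r, c] = 0; // undo, so the caller's grid is left unchanged
                }
                return count;
            }
        return 1; // no empty cell left: the filled grid is one solution
    }

    // Blanks cells in random order, keeping a removal only if the puzzle stays unique.
    public static int[,] Puzzlefy(int[,] solution, int cellsToRemove, Random rng)
    {
        var puzzle = (int[,])solution.Clone();
        var order = Enumerable.Range(0, 81).OrderBy(_ => rng.Next()).ToArray();
        int removed = 0;
        foreach (int cell in order)
        {
            if (removed == cellsToRemove) break;
            int r = cell / 9, c = cell % 9, saved = puzzle[r, c];
            puzzle[r, c] = 0;
            if (CountSolutions(puzzle) == 1) removed++;
            else puzzle[r, c] = saved; // removal broke uniqueness: put the clue back
        }
        return puzzle;
    }
}
```

If fewer than `cellsToRemove` cells can be removed without losing uniqueness, the sketch simply returns the puzzle with as many blanks as it managed; a real generator would probably retry with a different order or grade the result with a human-style solver.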

Related

Generating a variable number of random numbers to a list, then comparing those numbers to a target?

How would I go about generating a serializable variable of random numbers to a list, then comparing those generated numbers to a target number?
What I want to do is make a program that takes in a number, such as 42, and generates that number of random numbers to a list while still keeping the original variable, in this case 42, to be referenced later. Super ham-handed pseudo-code(?) example:
public class Generate {
[SerializeField]
int generate = 42;
List<int> results = new List<int>;
public void Example() {
int useGenerate = generate;
//Incoming pseudo-code (rather, code that I don't know how to do, exactly)
while (useGenerate => 1) {
results.add (random.range(0,100)); //Does this make a number between 0 and 99?
int useGenerate = useGenerate - 1;
}
}
}
I think this will do something to that effect, once I figure out how to actually code it properly (Still learning).
From there, I'd like to compare the list of results to a target number, to see how many of them pass a certain threshold, in this case greater than or equal to 50. I assume this would require a "foreach" thingamabobber, but I'm not sure how to go about doing that, really. With each "success", I'd like to increment a variable to be returned at a later point. I guess something like this:
int success = 50;
int target = 0;
foreach int in List<results> {
if (int => success) {
int target = target + 1;
}
}
If I have the right idea, please just teach me how to properly code it. If you have any suggestions on how to improve it (like the whole ++ and -- thing I see here and there but don't know how to use), please teach me that, too. I looked around the web for using foreach with lists and it seemed really complicated and people were seemingly pulling some new bit of information from the Aether to include in the operation. Thanks for reading, and thanks in advance for any advice!
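The idea in the question is sound; a runnable version might look like the sketch below. It assumes plain .NET (`System.Random`) rather than Unity's `Random.Range`, and the member names `Count`, `Fill`, and `CountSuccesses` are made up for illustration. In both APIs the upper bound of the integer overload is exclusive, so `(0, 100)` yields 0 through 99.

```csharp
using System;
using System.Collections.Generic;

public class Generate
{
    private readonly Random _random = new Random();

    public int Count = 42;                        // how many numbers to generate; never modified
    public List<int> Results = new List<int>();

    public void Fill()
    {
        Results.Clear();
        // A for loop counts with its own variable, so Count keeps its original value.
        for (int i = 0; i < Count; i++)
        {
            Results.Add(_random.Next(0, 100));    // 0..99, upper bound exclusive
        }
    }

    public int CountSuccesses(int threshold)
    {
        int successes = 0;
        foreach (int value in Results)            // visits each element of the list once
        {
            if (value >= threshold)               // note: >= rather than =>
            {
                successes++;                      // same as successes = successes + 1
            }
        }
        return successes;
    }
}
```

Calling `Fill()` and then `CountSuccesses(50)` returns how many of the 42 generated values are at least 50.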

Working with micro changes in floats/doubles

The last couple of days have been full with making calculations and formulas and I'm beginning to lose my mind (a little bit). So now I'm turning to you guys for some insight/help.
Here's the problem: I'm working with Bluetooth beacons that are placed all over an entire floor of a building to make an indoor GPS showcase. You can use your phone to connect with these beacons, which results in receiving your longitude and latitude from them. These numbers are large float/double variables, looking like this:
lat: 52.501288451787076
lng: 6.079107635606511
The actual changes happen at the 4th and 5th position after the point. I'm converting these numbers to the Cartesian coordinate system using;
x = R * cos(lat) * cos(lon)
z = R *sin(lat)
Now the coordinates from this conversion are fairly solid. They are numbers that I can work with. I use them in a 3D engine (Unity3d) to make a real-time map where you can see where someone is walking.
Now for the actual problem! These beacons are not entirely accurate. These numbers 'jump' up and down even when you lay your phone down. Ranging from, let's assume the same latitude as mentioned above, 52.501280 to 52.501296. If we convert this and use it as coordinates in a 3d engine, the 'avatar' for a user jumps from one position to another (more small jumps than large jumps).
What is a good way to cope with these jumping numbers? I've tried to check for big jumps and ignore those, but the jumps are still too big. A broader check will result in almost no movement, even when a phone is moving. Or is there a better way to convert the lat and long variables for use in a 3d engine?
If there is someone who has had the same problem as me, some mathematical wonder who can give a good conversion/formula to start with or someone who knows what I'm possibly doing wrong then please, help a fellow programmer out.

Moving Average
You could use this (taken from here: https://stackoverflow.com/a/1305/5089204):
Attention: please read the comments on this class, as this implementation has some flaws. It's just for a quick test and demonstration.
public class LimitedQueue<T> : Queue<T> {
private int limit = -1;
public int Limit {
get { return limit; }
set { limit = value; }
}
public LimitedQueue(int limit)
: base(limit) {
this.Limit = limit;
}
public new void Enqueue(T item) {
if (this.Count >= this.Limit) {
this.Dequeue();
}
base.Enqueue(item);
}
}
Just test it like this:
var queue = new LimitedQueue<float>(4);
queue.Enqueue(52.501280f);
var avg1 = queue.Average(); //52.50128
queue.Enqueue(52.501350f);
var avg2 = queue.Average(); //52.5013161
queue.Enqueue(52.501140f);
var avg3 = queue.Average(); //52.50126
queue.Enqueue(52.501022f);
var avg4 = queue.Average(); //52.5011978
queue.Enqueue(52.501635f);
var avg5 = queue.Average(); //52.50129
queue.Enqueue(52.501500f);
var avg6 = queue.Average(); //52.5013237
queue.Enqueue(52.501505f);
var avg7 = queue.Average(); //52.5014153
queue.Enqueue(52.501230f);
var avg8 = queue.Average(); //52.50147
The limited queue will not grow... You just define the count of elements you want to use (in this case I specified 4). The 5th element pushes the first out and so on...
The average will always be a smooth sliding :-)
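Because the class above hides `Enqueue` with `new`, a caller holding a plain `Queue<T>` reference would bypass the limit. A variant that wraps the queue instead of inheriting from it avoids that. The sketch below is one possible shape, with a hypothetical `MovingAverage` class; it uses `double` because a `float` near 52.5 can only resolve steps of roughly 0.000004, which is close to the size of the jitter described in the question.

```csharp
using System.Collections.Generic;
using System.Linq;

public class MovingAverage
{
    private readonly Queue<double> _window = new Queue<double>();
    private readonly int _size;

    public MovingAverage(int size)
    {
        _size = size;
    }

    // Adds a sample, drops the oldest one once the window is full,
    // and returns the average of the samples currently in the window.
    public double Add(double sample)
    {
        _window.Enqueue(sample);
        if (_window.Count > _size)
        {
            _window.Dequeue();
        }
        return _window.Average();
    }
}
```

One instance per coordinate (one for latitude, one for longitude) would smooth each axis independently; a larger window gives a steadier avatar at the cost of more lag.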

How to best implement K-nearest neighbours in C# for large number of dimensions?

I'm implementing the K-nearest neighbours classification algorithm in C# for a training and testing set of about 20,000 samples each, and 25 dimensions.
There are only two classes, represented by '0' and '1' in my implementation. For now, I have the following simple implementation :
// testSamples and trainSamples consists of about 20k vectors each with 25 dimensions
// trainClasses contains 0 or 1 signifying the corresponding class for each sample in trainSamples
static int[] TestKnnCase(IList<double[]> trainSamples, IList<double[]> testSamples, IList<int> trainClasses, int K)
{
Console.WriteLine("Performing KNN with K = "+K);
var testResults = new int[testSamples.Count()];
var testNumber = testSamples.Count();
var trainNumber = trainSamples.Count();
// Declaring these here so that I don't have to 'new' them over and over again in the main loop,
// just to save some overhead
var distances = new double[trainNumber][];
for (var i = 0; i < trainNumber; i++)
{
distances[i] = new double[2]; // Will store both distance and index in here
}
// Performing KNN ...
for (var tst = 0; tst < testNumber; tst++)
{
// For every test sample, calculate distance from every training sample
Parallel.For(0, trainNumber, trn =>
{
var dist = GetDistance(testSamples[tst], trainSamples[trn]);
// Storing distance as well as index
distances[trn][0] = dist;
distances[trn][1] = trn;
});
// Sort distances and take top K (?What happens in case of multiple points at the same distance?)
var votingDistances = distances.AsParallel().OrderBy(t => t[0]).Take(K);
// Do a 'majority vote' to classify test sample
var yea = 0.0;
var nay = 0.0;
foreach (var voter in votingDistances)
{
if (trainClasses[(int)voter[1]] == 1)
yea++;
else
nay++;
}
if (yea > nay)
testResults[tst] = 1;
else
testResults[tst] = 0;
}
return testResults;
}
// Calculates and returns square of Euclidean distance between two vectors
static double GetDistance(IList<double> sample1, IList<double> sample2)
{
var distance = 0.0;
// assume sample1 and sample2 are valid i.e. same length
for (var i = 0; i < sample1.Count; i++)
{
var temp = sample1[i] - sample2[i];
distance += temp * temp;
}
return distance;
}
This takes quite a bit of time to execute. On my system it takes about 80 seconds to complete. How can I optimize this, while ensuring that it would also scale to larger number of data samples? As you can see, I've tried using PLINQ and parallel for loops, which did help (without these, it was taking about 120 seconds). What else can I do?
I've read about KD-trees being efficient for KNN in general, but every source I read stated that they're not efficient for higher dimensions.
I also found this stackoverflow discussion about this, but it seems like this is 3 years old, and I was hoping that someone would know about better solutions to this problem by now.
I've looked at machine learning libraries in C#, but for various reasons I don't want to call R or C code from my C# program, and some other libraries I saw were no more efficient than the code I've written. Now I'm just trying to figure out how I could write the most optimized code for this myself.
Edited to add - I cannot reduce the number of dimensions using PCA or something. For this particular model, 25 dimensions are required.

Whenever you are attempting to improve the performance of code, the first step is to analyze the current performance to see exactly where it is spending its time. A good profiler is crucial for this. In my previous job I was able to use the dotTrace profiler to good effect; Visual Studio also has a built-in profiler. A good profiler will tell you exactly where your code is spending time, method by method or even line by line.
That being said, a few things come to mind in reading your implementation:
You are parallelizing some inner loops. Could you parallelize the outer loop instead? There is a small but nonzero cost associated to a delegate call (see here or here) which may be hitting you in the "Parallel.For" callback.
Similarly there is a small performance penalty for indexing through an array using its IList interface. You might consider declaring the array arguments to "GetDistance()" explicitly.
How large is K compared to the size of the training array? You are completely sorting the "distances" array and taking the top K, but if K is much smaller than the array size it might make sense to use a partial sort / selection algorithm, for instance by using a SortedSet and removing the largest element whenever the set size exceeds K, so that only the K smallest distances are kept.
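The three suggestions above could be combined along these lines. This is only a sketch under assumptions: the data is passed as plain arrays, the class and method names (`Knn`, `Classify`, `SquaredDistance`) are invented for the example, and it needs C# 7 or later for the tuple syntax. Whether it is actually faster should be confirmed with a profiler.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class Knn
{
    public static int[] Classify(double[][] train, int[] trainClasses, double[][] test, int k)
    {
        var results = new int[test.Length];
        // Parallelize over test samples: each iteration owns its own state, so no locking is needed.
        Parallel.For(0, test.Length, t =>
        {
            // Holds at most k (distance, index) pairs; the index breaks ties between equal distances.
            var nearest = new SortedSet<(double Dist, int Index)>();
            for (int i = 0; i < train.Length; i++)
            {
                double d = SquaredDistance(test[t], train[i]);
                if (nearest.Count < k)
                {
                    nearest.Add((d, i));
                }
                else if (d < nearest.Max.Dist)
                {
                    nearest.Remove(nearest.Max); // evict the farthest of the current k
                    nearest.Add((d, i));
                }
            }
            // Majority vote; a tie goes to class 0, as in the question's code.
            int ones = 0;
            foreach (var n in nearest) ones += trainClasses[n.Index];
            results[t] = ones * 2 > nearest.Count ? 1 : 0;
        });
        return results;
    }

    // Square of the Euclidean distance; the ordering is the same as for the true distance.
    static double SquaredDistance(double[] a, double[] b)
    {
        double sum = 0.0;
        for (int i = 0; i < a.Length; i++)
        {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }
}
```

Each test sample now costs O(n log K) instead of a full O(n log n) sort, and the per-sample work is large enough that the delegate call of `Parallel.For` is paid once per test sample rather than once per distance.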

Increment through a list on a button list

I've stored a list of colors in my program. I want to set an object in my scene to one of the colors in the list. So far, I have done the following:
if(Input.GetKeyDown(KeyCode.O))
{
for(int i = 0; i < claddingColor.Count; i++)
{
claddingMaterial.color = claddingColor[i];
}
}
This isn't working for a reason I know (and you can probably spot), but I lack the verbal fortitude to write it down.
As opposed to having multiple lines of the following:
claddingMaterial.color = claddingColor[0];
each tied to a different button, I'd like a way to emulate the above but tie it to a single button press. Thus, if I hit the O key 5 times, it will step through each color stored in the list. If I hit it a sixth time, it will go back to the first color in the list.
Could someone please help me implement this? Or point me to something that I may learn how to do it for myself?

Define a LastColor field as a class member:
int LastColor;
In your function use modulo
if(Input.GetKeyDown(KeyCode.O))
{
claddingMaterial.color = claddingColor[(LastColor++) % claddingColor.Count];
}
Note: Depending on the type of claddingColor use Count for a List or Length for Array.

You won't need a for loop
int lastStep = 0; // declare as a class field so the value survives between key presses
if(Input.GetKeyDown(KeyCode.O))
{
claddingMaterial.color = claddingColor[lastStep++];
if (lastStep == claddingColor.Count)
lastStep = 0;
}

What's the appropriate collection for calculating a running mean?

I'm sifting through some of my old bugs and while reviewing some nasty code I realized that my averaging or smoothing algorithm was pretty bad. I did a little research which led me to the "running mean" - makes sense, pretty straightforward. I was thinking through a possible implementation and realized that I don't know which collection would provide the type of "sliding" functionality that I need. In other words, I need to push/add an item to the end of the collection and then also pop/remove the first item from the collection. I think if I knew what this was called I could find the correct collection but I don't know what to search for.
Ideally a collection where you set the max size and anything added to it that exceeds that size would pop off the first item.
To illustrate, here is what I came up with while messing around:
using System;
using System.Collections.Generic;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
LinkedList<int> samples = new LinkedList<int>();
// Simulate packing the front of the samples, this would most likely be a pre-averaged
// value from the raw samples
for (int i = 0; i < 10; i++)
{
samples.AddLast(0);
}
for (int i = 0; i < 100; i++)
{
// My attempt at a "sliding collection" - not really sure what to call it but as
// an item is added the first item is removed
samples.RemoveFirst();
samples.AddLast(i);
foreach (int v in samples)
{
Console.Write("{0:000} ", v);
}
Console.WriteLine(String.Empty);
}
Console.ReadLine();
}
}
}
As you can see I am manually handling the removal of the first item. I'm just asking if there is a standard collection that is optimized for this type of use?

It appears that you're looking for a Circular Buffer. Here's a .NET implementation on CodePlex. You may also want to look at this question: How would you code an efficient Circular Buffer in Java or C#?
From the sample you've provided, it isn't clear how exactly this relates to an online-mean algorithm. If the only operation allowed on the buffer is to append, it should be trivial to cache and update the "total" inside the buffer (add the new value, subtract the removed one), which makes maintaining the mean an O(1) operation for every append. In this case, the buffer is effectively holding the Simple Moving Average (SMA) of a series.
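A rough sketch of that idea, a fixed-size circular buffer that caches its total so each append is O(1), might look like this. The class name `SlidingMean` and its members are hypothetical, not taken from any library.

```csharp
public class SlidingMean
{
    private readonly double[] _buffer;
    private int _next;      // slot the next value will be written to
    private int _count;     // number of valid samples, at most _buffer.Length
    private double _total;  // cached sum of the valid samples

    public SlidingMean(int size)
    {
        _buffer = new double[size];
    }

    // Appends a value, overwriting the oldest one once the buffer is full,
    // and returns the mean of the samples currently held.
    public double Push(double value)
    {
        if (_count == _buffer.Length)
        {
            _total -= _buffer[_next]; // subtract the value about to be overwritten
        }
        else
        {
            _count++;
        }
        _buffer[_next] = value;
        _total += value;
        _next = (_next + 1) % _buffer.Length;
        return _total / _count;
    }
}
```

Over very long runs the cached total can accumulate floating-point rounding error, so recomputing it from the buffer every so often may be worthwhile.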

Have you had a look at the Queue class?

Does a List satisfy your needs?
List<String> myList = new List<String>();
myList.Add("Something to the end");
myList.RemoveAt(0);

@Ani - I'm creating a new answer instead of a comment because I have some code to paste. I took a swing at a dead simple object to assist with my running mean and came up with the following:
class RollingMean
{
int _pos;
int _count;
double[] _buffer;
public RollingMean(int size)
{
_buffer = new double[size];
_pos = 0;
_count = 0;
}
public RollingMean(int size, double initialValue)
: this(size)
{
// Believe it or not there doesn't seem to be a better(performance) way...
for (int i = 0; i < size; i++)
{
_buffer[i] = initialValue;
}
_count = size;
}
public double Push(double value)
{
_buffer[_pos] = value;
_pos = (++_pos > _buffer.Length - 1) ? 0 : _pos;
_count = Math.Min(++_count, _buffer.Length);
return Mean;
}
public double Mean
{
get
{
return _buffer.Sum() / _count;
}
}
}
I'm reading 16 channels of data from a data acquisition system, so I will just instantiate one of these for each channel, and I think it will be cleaner than managing a multi-dimensional array or a separate set of buffer/position values for each channel.
Here is sample usage for anyone interested:
static void Main(string[] args)
{
RollingMean mean = new RollingMean(10, 7);
mean.Push(3);
mean.Push(4);
mean.Push(5);
mean.Push(6);
mean.Push(7.125);
Console.WriteLine( mean.Mean );
Console.ReadLine();
}
I was going to make the RollingMean object generic rather than lock it into double, but I couldn't find a generic constraint to limit the type to numeric types. I moved on, gotta get back to work. Thanks for your help.
