I'm learning C# at the moment, along with all the new concepts that come with it.
I was working on a school program that consists of creating an aquarium containing fish, where each fish has a position.
The class is initialized:
int z = 0,x,y,max;
Peixe[] peixes;
Posicao[] posicoes;
Random rng = new Random();
public Aquario(string nome, int x, int y, int max)
{
this.nome = nome;
this.x = x;
this.y = y;
peixes = new Peixe[max];
posicoes = new Posicao[max];
}
So, in my aquarium there must be a method to shuffle the fish's positions and I did that as follows:
public bool AbanarAquario()
{
for (int i = 0; i < z; i++)
{
int x = rng.Next(0, this.x);
int y = rng.Next(0, this.y);
if (!peixes[i].emagrecer())
{
return false;
}
else
{
posicoes[i].ChangeX(x);
posicoes[i].ChangeY(y);
}
}
if (z > 0)
{
for (int i = 0; i < z; i++)
{
for (int j = 0; j < z; j++)
{
if (posicoes[i].comparePos(posicoes[j]))
{
if (peixes[i].mostraPeso() > peixes[j].mostraPeso())
{
if (!peixes[j].eliminarPeixe())
{
return false;
}
else
{
posicoes[j] = null;
z--;
}
}
else
{
if (!peixes[i].eliminarPeixe())
{
return false;
}
else
{
posicoes[i] = null;
z--;
}
}
}
}
}
}
return true;
}
That uses a method from my position class, which is as follows:
public bool comparePos(Posicao outro)
{
if (this.x == outro.x && this.y == outro.y)
{
return true;
}
return false;
}
Visual Studio warns me that there will probably be a null reference exception on the line if (posicoes[i].comparePos(posicoes[j])) and I can't figure out why!
On top of that, the program actually does crash when I try to shuffle the positions. I tried to debug it, but I'm not very experienced with debugging, so no luck there.
EDIT
Initializing position:
public bool adicionarPeixe(string nome, string cor, float peso)
{
if (z >= peixes.Length)
{
return false;
}
else
{
int x = rng.Next(0, this.x);
int y = rng.Next(0, this.y);
peixes[z] = new Peixe(z, nome, cor, peso);
peixes[z].Aquario(0);
posicoes[z] = new Posicao(x,y);
z++;
}
return true;
}
I think it's a fair warning.
The z variable looks like it could be anything. Consider:
Dividing this up into different methods.
Using a foreach() loop instead of a for() loop. (Failing that, use int z = posicoes.Count().)
I think the compiler is worried about z=999 when the collection only goes up to 99 or something like that.
Unless I'm completely missing it, I don't see where you're defining posicoes. But it sounds like you aren't initializing it.
If you're doing something like,
Posicao posicoes;
then using it, then the compiler is assuming it's going to be null and warning you.
You need to do something like,
Posicao posicoes = new Posicao();
So the object exists for you to manipulate. But again, I'm not sure where you're defining it, so this could be off base. Can you update your question with where and how you're defining posicoes?
Also, if you are defining it, you can put a simple check in right before you use it. Or, if you just want to back out of the method when it is null, you can do this at the top of the method:
if(posicoes == null)
{
return false;
}
EDIT
Per the information about the constructor, your loop is going outside the size of the array. I don't see where z is being updated, but are you by chance not accounting for 0-based indexing and going over the length of the array by 1?
Put a breakpoint where you're setting z and see if it's bigger than the number of items in your array.
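Going by the posted AbanarAquario code, two other things appear able to produce the null reference: the inner loop compares every position with itself when i == j, and each elimination writes null into posicoes while z shrinks, so a later iteration can call comparePos on an emptied slot. Below is a minimal sketch of one way to avoid both. It assumes the fish and its position are merged into a single hypothetical Fish class kept in a List; neither of those is in the original code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the question's Peixe + Posicao pair.
public class Fish
{
    public string Name;
    public float Weight;
    public int X, Y;

    public Fish(string name, float weight, int x, int y)
    {
        Name = name; Weight = weight; X = x; Y = y;
    }
}

public static class Aquarium
{
    // Removes the lighter fish of every pair that shares a cell.
    // On a tie the first fish of the pair is removed, mirroring the
    // else-branch of the original code.
    public static void ResolveCollisions(List<Fish> fish)
    {
        for (int i = 0; i < fish.Count; i++)
        {
            // Start at i + 1 so a fish is never compared with itself.
            for (int j = i + 1; j < fish.Count; j++)
            {
                if (fish[i].X != fish[j].X || fish[i].Y != fish[j].Y)
                    continue;

                if (fish[i].Weight > fish[j].Weight)
                {
                    fish.RemoveAt(j); // the list closes the gap, no null slot
                    j--;              // the next fish slid into index j
                }
                else
                {
                    fish.RemoveAt(i);
                    i--;              // the next fish slid into index i
                    break;            // restart the inner loop for it
                }
            }
        }
    }
}
```

Because List removes the element instead of leaving a null behind, the count and the contents cannot drift apart the way z and posicoes can.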
I am new to c# (or coding in general) and I guess this question is really stupid and confusing (I know I'm doing it a hard way) but please help me.
I'm trying to make a Minesweeper game as a Forms app. I made a 10 x 10 grid of buttons, and if you click one, the number of mines around it is revealed. If a mine is there, "F" (the first letter of "False") will appear.
There's a constructor that takes the button, the x and y position, the number of mines around it, and a boolean that indicates whether there's a mine or not; each square also keeps a list of its surrounding blocks.
What I tried to do was clear the 8 surrounding blocks (from the list) when the player clicks a block with no mines around it; if a surrounding block also has no mines around it, the blocks surrounding that block are cleared as well. The method uses foreach to reveal each neighbour and check the number of mines around it. If there's no mines, same method will be applied to that block (calling the method recursively). The problem is that I keep getting System.StackOverflowException.
I somewhat understand why it's happening, but I just can't come up with another way.
//scroll to the bottom for the method with the problem
private void Form1_Load(object sender, EventArgs e)
{
Random random = new Random();
Button[,] buttons = new Button[10, 10]
{
{ r0c0, r0c1, r0c2, r0c3, r0c4, r0c5, r0c6, r0c7, r0c8, r0c9 },
{ r1c0, r1c1, r1c2, r1c3, r1c4, r1c5, r1c6, r1c7, r1c8, r1c9 },
{ r2c0, r2c1, r2c2, r2c3, r2c4, r2c5, r2c6, r2c7, r2c8, r2c9 },
{ r3c0, r3c1, r3c2, r3c3, r3c4, r3c5, r3c6, r3c7, r3c8, r3c9 },
{ r4c0, r4c1, r4c2, r4c3, r4c4, r4c5, r4c6, r4c7, r4c8, r4c9 },
{ r5c0, r5c1, r5c2, r5c3, r5c4, r5c5, r5c6, r5c7, r5c8, r5c9 },
{ r6c0, r6c1, r6c2, r6c3, r6c4, r6c5, r6c6, r6c7, r6c8, r6c9 },
{ r7c0, r7c1, r7c2, r7c3, r7c4, r7c5, r7c6, r7c7, r7c8, r7c9 },
{ r8c0, r8c1, r8c2, r8c3, r8c4, r8c5, r8c6, r8c7, r8c8, r8c9 },
{ r9c0, r9c1, r9c2, r9c3, r9c4, r9c5, r9c6, r9c7, r9c8, r9c9 }
};
Square[,] squares = new Square[10, 10];
for (int i = 0, ii = 0, iii = 0; i < 100; i++, ii++)
{
if (ii == 10)
{
ii = 0;
iii++;
}
squares[ii, iii] = new Square(i, buttons[ii, iii], ii, iii, 0, true);
}
List<int> randoms = new List<int>();
for (int i = 0; i < 10; i++)
{
int ii = random.Next(100);
if (!randoms.Contains(ii))
{
squares[ii % 10, ii / 10].setSafe(false);
}
else
{
i--;
}
randoms.Add(ii);
}
for (int i = 0; i < 10; i++)
{
for (int ii = 0; ii < 10; ii++)
{
for (int iii = -1; iii < 2; iii++)
{
for (int iiii = -1; iiii < 2; iiii++)
{
try
{
if (squares[i + iii, ii + iiii].getSafe() == false)
squares[i, ii].addNumber();
}
catch (System.IndexOutOfRangeException)
{
}
}
//if (squares[i, ii].getSafe() == false) squares[i, ii].getButton().Text = squares[i, ii].getSafe().ToString();
//else squares[i, ii].getButton().Text = squares[i, ii].getNumber().ToString();
}
}
}
for (int i = 0; i < 10; i++)
{
for (int ii = 0; ii < 10; ii++)
{
for (int iii = -1; iii < 2; iii++)
{
for (int iiii = -1; iiii < 2; iiii++)
{
try
{
squares[i, ii].addList(squares[i + iii, ii + iiii]);
}
catch (System.IndexOutOfRangeException)
{
}
}
}
}
}
}
Here's the Square class:
public class Square
{
int id;
Button button;
int x;
int y;
int number;
bool safe;
List<Square> list = new List<Square>();
public Square(int id, Button button, int x, int y, int number, bool safe)
{
this.id = id;
this.button = button;
this.x = x;
this.y = y;
this.number = number;
this.safe = safe;
button.Text = "";
button.Click += button_Click;
}
public int getId()
{
return id;
}
public void setId(int i)
{
id = i;
}
public Button getButton()
{
return button;
}
public void setButton(Button b)
{
button = b;
}
public int getX()
{
return x;
}
public void setX(int i)
{
x = i;
}
public int getY()
{
return y;
}
public void setY(int i)
{
y = i;
}
public int getNumber()
{
return number;
}
public void setNumber(int i)
{
number = i;
}
public void addNumber()
{
number++;
}
public bool getSafe()
{
return safe;
}
public void setSafe(bool b)
{
safe = b;
}
private void button_Click(object sender, EventArgs e)
{
if (getSafe() == false) button.Text = getSafe().ToString();
else button.Text = getNumber().ToString();
if (getNumber() == 0) zeroReveal();
}
//---------------------------------------------------
// this is the method that reveals surrounding blocks
//---------------------------------------------------
private void zeroReveal()
{
foreach (Square s in list)
{
//revealing the blocks
s.getButton().Text = s.getNumber().ToString();
//call the same method if there's no mine
//this is the line that keeps giving me exception
if (s.getNumber() == 0) s.zeroReveal();
}
}
//-----------------------------------------------------
public List<Square> getList()
{
return list;
}
public void setList(List<Square> sl)
{
list = sl;
}
public void addList(Square s)
{
list.Add(s);
}
}
I am new to c# (or coding in general) and I guess this question is really stupid and confusing (I know I'm doing it a hard way)
This topic confuses many a new developer; don't stress out about it!
If there's no mines, same method will be applied to that block (calling the method recursively).
Recursive methods can be confusing but if you design them using the standard pattern, you will avoid SO exceptions. You have not designed yours using the standard pattern.
The standard pattern for successful recursive methods is:
Am I in a case that requires no recursion?
If yes, do the necessary computations to produce the desired effect and return. The problem is now solved.
If no, then we're going to recurse.
Break the current problem down into some number of smaller problems.
Solve each smaller problem by recursing.
Combine the solutions of the smaller problem to solve the current problem.
The problem is now solved, so return.
The most important thing about designing a recursive method is that each recursion must be solving a smaller problem, and the sequence of smaller problems must bottom out at a case that does not require recursion. If those two conditions are not met, then you will get a stack overflow.
Internalize that pattern, and every time you write a recursive method, actually write it out:
int Frob(int blah)
{
if (I am in the base case)
{
solve the base case
return the result
}
else
{
find smaller problems
solve them
combine their solutions
return the result
}
}
Fill in that template with your real code, and you will be much more likely to avoid stack overflows. I've been writing recursive methods for decades, and I still follow this pattern.
Now, in your example, what is the case that does not require recursion? There must be one, so write down what it is. Next, how will you guarantee that the recursion solves a smaller problem? That is often the hard step! Give it some thought.
The stack overflow is occurring because zeroReveal is recursively calling itself forever. To fix this we need to find ways where we do not need it to make further calls to itself.
The name of the method gives us a clue. If the square has already been revealed, then surely the method does not need to do anything, since it has already been revealed.
It looks like the button's Text property is an empty string if it has not yet been revealed. So change the foreach so that it doesn't process squares that have already been revealed:
foreach (Square s in list)
{
if (s.getButton().Text == "")
{
// existing code in the foreach loop goes here
}
}
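To tie the two answers together, here is a minimal sketch of the reveal written in the base-case-first pattern. It is not the asker's Square class: it assumes a hypothetical int[,] of neighbouring-mine counts and a separate bool[,] of revealed flags, so that "already revealed" does not depend on the button text.

```csharp
using System;

public static class FloodReveal
{
    // counts[r, c] holds the number of mines around the cell.
    // revealed[r, c] is set to true for every cell this call uncovers.
    public static void Reveal(int[,] counts, bool[,] revealed, int row, int col)
    {
        // Base case 1: the coordinates are off the board.
        if (row < 0 || col < 0 ||
            row >= counts.GetLength(0) || col >= counts.GetLength(1))
            return;

        // Base case 2: already revealed, so there is nothing left to do.
        // This is what makes every recursive call a smaller problem.
        if (revealed[row, col])
            return;

        revealed[row, col] = true;

        // Base case 3: a non-zero cell is shown but does not spread.
        if (counts[row, col] != 0)
            return;

        // Recursive case: spread to the eight neighbours.
        for (int dr = -1; dr <= 1; dr++)
            for (int dc = -1; dc <= 1; dc++)
                if (dr != 0 || dc != 0)
                    Reveal(counts, revealed, row + dr, col + dc);
    }
}
```

Each call either returns immediately or marks one more cell as revealed, so on a 10 x 10 board the recursion can mark at most 100 cells before it runs out of work.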
I am trying to implement Iterative Deepening Search. I do not know what I am doing wrong, but I don't seem to be getting it right. I always end up with an infinite loop.
Can anyone point out my mistake?
I implemented the Depth-Limited Search and used it in my IDS code. DLS seems to be working fine on its own, but I do not understand IDS and why I'm ending up in an infinite loop.
public class IterativeDeepeningSearch<T> where T : IComparable
{
string closed;
public int maximumDepth;
public int depth = 0;
bool Found = false;
Stack<Vertex<T>> open;
public IterativeDeepeningSearch()
{
open = new Stack<Vertex<T>>();
}
public bool IDS(Vertex<T> startNode, Vertex<T> goalNode)
{
// loops through until a goal node is found
for (int _depth = 0; _depth < Int32.MaxValue; _depth++)
{
bool found = DLS(startNode, goalNode, _depth);
if (found)
{
return true;
}
}
// this will never be reached as it
// loops forever until goal is found
return false;
}
public bool DLS(Vertex<T> startNode, Vertex<T> goalNode, int _maximumDepth)
{
maximumDepth = _maximumDepth;
open.Push(startNode);
while (open.Count > 0 && depth < maximumDepth)
{
Vertex<T> node = open.Pop();
closed = closed + " " + node.Data.ToString();
if (node.Data.ToString() == goalNode.Data.ToString())
{
Debug.Write("Success");
Found = true;
break;
}
List<Vertex<T>> neighbours = node.Neighbors;
depth++;
if (neighbours != null)
{
foreach (Vertex<T> neighbour in neighbours)
{
if (!closed.Contains(neighbour.ToString()))
open.Push(neighbour);
}
Debug.Write("Failure");
}
}
Console.WriteLine(closed);
return Found;
}
}
}
PS: My Vertex Class just has two properties, Data and Children
The for loop iterates on _depth but in the DLS function you are passing depth, which is always 0
for (int _depth = 0; _depth < Int32.MaxValue; _depth++)
{
bool found = DLS(startNode, goalNode, depth);
if (found)
{
return true;
}
}
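Something else worth checking in the posted class is that depth, open, closed and Found are fields, so whatever one DLS call leaves in them is still there when the next call starts. A minimal sketch of iterative deepening that keeps all per-call state in parameters is shown below. The Node class and the maxDepth cap are assumptions for illustration (the question's Vertex<T> is not shown), and the depth-limited step is written recursively rather than with an explicit stack.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal vertex type standing in for Vertex<T>.
public class Node
{
    public string Data;
    public List<Node> Neighbors = new List<Node>();
    public Node(string data) { Data = data; }
}

public static class IterativeDeepening
{
    // Depth-limited DFS. The remaining depth and the current path are
    // parameters, so nothing carries over from one call to the next.
    public static bool DepthLimited(Node node, string goal, int limit, HashSet<Node> onPath)
    {
        if (node.Data == goal) return true;   // base case: goal found
        if (limit == 0) return false;         // base case: depth exhausted

        onPath.Add(node);
        foreach (Node next in node.Neighbors)
        {
            if (onPath.Contains(next)) continue; // do not walk around a cycle
            if (DepthLimited(next, goal, limit - 1, onPath)) return true;
        }
        onPath.Remove(node);
        return false;
    }

    // Returns the depth at which the goal was found, or -1 if it was not
    // found within maxDepth (an assumed cap so the loop always ends).
    public static int Search(Node start, string goal, int maxDepth)
    {
        for (int limit = 0; limit <= maxDepth; limit++)
        {
            // A fresh path set per iteration: each pass starts from scratch.
            if (DepthLimited(start, goal, limit, new HashSet<Node>()))
                return limit;
        }
        return -1;
    }
}
```

The cap matters when the goal is unreachable: looping up to Int32.MaxValue, as in the question, would keep searching for a very long time.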
I am trying to create a 2D cave generation system. When I run the program I get a System.StackOverflowException after I try to create a new object from within its own class.
My cave generator works like this:
I create a map that contains IDs (integers) of the different types of cells (like wall, water or empty space).
First of all, my "Map" class creates a map filled with walls, and after that it creates a "Miner" object in the centre of the map. The Miner digs the map and makes caves. The problem is that I want to create more miners, so the Miner that is digging the map creates another Miner. However, when I do this, I get a System.StackOverflowException.
How do I go about tracking down the cause of the stack overflow in my program?
Here is my miner code:
Miner.cs
public class Miner
{
Random rand = new Random();
public string state { get; set; }
public int x { get; set; }
public int y { get; set; }
public Map map { get; set; }
public int minersCount;
public Miner(Map map, string state, int x, int y)
{
this.map = map;
this.state = state;
this.x = x;
this.y = y;
minersCount++;
if (state == "Active")
{
StartDigging();
}
}
bool IsOutOfBounds(int x, int y)
{
if (x == 0 || y == 0)
{
return true;
}
else if (x > map.mapWidth - 2 || y > map.mapHeight - 2)
{
return true;
}
return false;
}
bool IsLastMiner()
{
if (minersCount == 1)
{
return true;
}
else
{
return false;
}
}
public void StartDigging()
{
if (state == "Active")
{
int dir = 0;
bool needStop = false;
int ID = -1;
while (!needStop && !IsOutOfBounds(x, y))
{
while (dir == 0)
{
dir = ChooseDirection();
}
if (!AroundIsNothing())
{
while (ID == -1)
{
ID = GetIDFromDirection(dir);
}
}
else
{
if (!IsLastMiner())
{
needStop = true;
}
}
if (ID == 1)
{
DigToDirection(dir);
dir = 0;
}
if (ID == 0 && IsLastMiner())
{
MoveToDirection(dir);
dir = 0;
}
TryToCreateNewMiner();
}
if (needStop)
{
state = "Deactive";
}
}
}
public void TryToCreateNewMiner()
{
if (RandomPercent(8))
{
Miner newMiner = new Miner(map, "Active", x, y);
}
else
{
return;
}
}
bool AroundIsNothing()
{
if (map.map[x + 1, y] == 0 && map.map[x, y + 1] == 0 &&
map.map[x - 1, y] == 0 && map.map[x, y - 1] == 0)
{
return true;
}
else
{
return false;
}
}
void MoveToDirection(int dir)
{
if (dir == 1)
{
x = x + 1;
}
else if (dir == 2)
{
y = y + 1;
}
else if (dir == 3)
{
x = x - 1;
}
else if (dir == 4)
{
y = y - 1;
}
}
void DigToDirection(int dir)
{
if (dir == 1)
{
map.map[x + 1, y] = 0;
x = x + 1;
}
else if (dir == 2)
{
map.map[x, y + 1] = 0;
y = y + 1;
}
else if (dir == 3)
{
map.map[x - 1, y] = 0;
x = x - 1;
}
else if (dir == 4)
{
map.map[x, y - 1] = 0;
y = y - 1;
}
}
int GetIDFromDirection(int dir)
{
if (dir == 1)
{
return map.map[x + 1, y];
}
else if (dir == 2)
{
return map.map[x, y + 1];
}
else if (dir == 3)
{
return map.map[x - 1, y];
}
else if (dir == 4)
{
return map.map[x, y - 1];
}
else
{
return -1;
}
}
int ChooseDirection()
{
return rand.Next(1, 5);
}
bool RandomPercent(int percent)
{
if (percent >= rand.Next(1, 101))
{
return true;
}
return false;
}
}
Whilst you can get StackOverflowExceptions by creating too many really large objects on the stack, it usually happens because your code has got into a state where it is calling the same chain of functions over and over again. So, to track down the cause in your code, the best starting point is to determine where your code calls itself.
Your code consists of several functions that are called by the Miner class itself, most of which are trivial.
The trivial functions don't call anything else in the class. Whilst these functions may contribute to the state that triggers the problem, they aren't part of the terminal function loop:
bool IsOutOfBounds(int x, int y)
bool IsLastMiner()
bool AroundIsNothing()
void MoveToDirection(int dir)
void DigToDirection(int dir)
int GetIDFromDirection(int dir)
int ChooseDirection()
bool RandomPercent(int percent)
This leaves your remaining three functions
public Miner(Map map, string state, int x, int y) // Called by TryToCreateNewMiner
public void StartDigging() // Called by constructor
// Contains main digging loop
public void TryToCreateNewMiner() // Called by StartDigging
These three functions form a calling loop, so if the branching logic in the functions is incorrect it could cause a non-terminating loop and hence a stack overflow.
So, looking at the branching logic in the functions
Miner
The constructor only has one branch, based on if the state is "Active". It is always active, since that's the way the object is always being created, so the constructor will always call StartDigging. This feels like the state isn't being handled correctly, although it's possible that you're going to use it for something else in the future...
As an aside, it's generally considered bad practice for an object's constructor to do a lot of processing beyond what is required to create the object. All of your processing happens in the constructor, which feels wrong.
TryToCreateNewMiner
This has one branch, 8% of the time, it will create a new miner and call the constructor. So for every 10 times TryToCreateNewMiner is called, we stand a good chance that it will have succeeded at least once. The new miner is initially started in the same position as the parent object (x and y aren't changed).
StartDigging
There's a fair bit of branching in this method. The main bits we are interested in are the conditions around calls to TryToCreateNewMiner. Let's look at the branches:
if(state=="Active")
This is currently a redundant check (it's always active).
while (!needStop && !IsOutOfBounds(x, y)) {
The first part of this termination clause is never triggered. needStop is only ever set to true if(!IsLastMiner). Since minersCount is always 1, it's always the last miner, so needStop is never triggered. The way you are using minersCount suggests that you think it is shared between instances of Miner, which it isn't. If that is your intention you may want to read up on static variables.
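A minimal sketch of the difference, using two hypothetical classes that are not part of the Miner code: an instance field starts again at zero for every object, whereas a static field is shared by all of them.

```csharp
using System;

public class InstanceCounted
{
    public int Count;            // one copy per object
    public InstanceCounted() { Count++; }
}

public class SharedCounted
{
    public static int Count;     // one copy for the whole class
    public SharedCounted() { Count++; }
}
```

Creating two InstanceCounted objects leaves each with Count equal to 1, which is the situation minersCount is in; creating two SharedCounted objects leaves SharedCounted.Count at 2.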
The second part of the termination clause is the only way out of the loop and that is triggered if either x or y reaches the edge of the map.
while(dir==0)
This is a pointless check: dir can only be a number between 1 and 4, since that's what ChooseDirection returns (the upper bound of rand.Next(1, 5) is exclusive).
if(!AroundIsNothing())
This is checking if the positions that the Miner can move into are all set to 0. If they are not, then GetIDFromDirection is called. This is key. If the Miner is currently surrounded by 0, ID will not be set; it will remain at its previous value. In a situation where a Miner has just been created, this will be -1 (we know this could happen because all Miners are created at the location of the Miner creating them).
The last two checks, if(ID==1) and if(ID==0 && IsLastMiner()), guard the code that moves the Miner (either by calling dig or move). So, if ID is not 0 or 1 at this point, the Miner will not move. This could cause a problem because it is immediately before the call to TryToCreateNewMiner, so if the program ever gets into this situation it will be stuck in a loop where the Miner isn't moving and is constantly trying to create new Miners in the same position. 8% of the time this will work, creating a new miner in the same position, which will perform the same checks and get into the same loop, again not moving and trying to create a new miner, and so it goes until the stack runs out of space and the program crashes.
You need to take a look at your termination clauses and the way you're handling ID; you probably don't want the Miner to just stop doing anything if it gets completely surrounded by 0.
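One way to take the recursion out of the picture altogether is to keep the constructor free of work and let a driver loop step the miners from a queue. The sketch below only illustrates that shape: Digger, Cave.Dig and the maxSteps budget are hypothetical names, the movement rule is simplified to "dig one cell in a random direction", and none of the original ID handling is reproduced.

```csharp
using System;
using System.Collections.Generic;

public class Digger
{
    public int X, Y;

    // The constructor only stores state; it does not start digging.
    public Digger(int x, int y) { X = x; Y = y; }

    // Digs one cell in a random direction. Returns false when the move
    // would reach the border, which retires this digger.
    public bool Step(int[,] map, Random rand)
    {
        int nx = X, ny = Y;
        switch (rand.Next(4))
        {
            case 0: nx++; break;
            case 1: ny++; break;
            case 2: nx--; break;
            default: ny--; break;
        }
        if (nx <= 0 || ny <= 0 ||
            nx >= map.GetLength(0) - 1 || ny >= map.GetLength(1) - 1)
            return false;

        map[nx, ny] = 0; // 0 = empty space, as in the question
        X = nx;
        Y = ny;
        return true;
    }
}

public static class Cave
{
    // Runs all diggers from a queue and returns the number of steps taken.
    public static int Dig(int[,] map, int startX, int startY, int maxSteps, Random rand)
    {
        var active = new Queue<Digger>();
        active.Enqueue(new Digger(startX, startY));
        int steps = 0;

        while (active.Count > 0 && steps < maxSteps)
        {
            Digger d = active.Dequeue();
            steps++;
            if (!d.Step(map, rand)) continue;   // reached the border: retired
            if (rand.Next(100) < 8)             // 8% chance to spawn
                active.Enqueue(new Digger(d.X, d.Y));
            active.Enqueue(d);                  // keep this digger going
        }
        return steps;
    }
}
```

Spawning a digger here only adds an object to the queue, so the call stack stays at the same depth no matter how many diggers exist, and the step budget guarantees the loop ends.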
Does anyone know of a good way to accomplish this task?
Currently I'm doing it more or less this way, but I'm somewhat unhappy with this code, even though I can't say what I could immediately improve.
So if anyone has a smarter way of doing this job, I would be happy to know.
private bool Check(List<MyItem> list)
{
bool result = true;
//MyItem implements IComparable<MyItem>
list.Sort();
for (int pos = 0; pos < list.Count - 1; pos++)
{
bool previousCheckOk = true;
if (pos != 0)
{
if (!CheckCollisionWithPrevious(pos))
{
MarkAsFailed(pos);
result = false;
previousCheckOk = false;
}
else
{
MarkAsGood(pos);
}
}
if (previousCheckOk && pos != list.Count - 1)
{
if (!CheckCollisionWithFollowing(pos))
{
MarkAsFailed(pos);
result = false;
}
else
{
MarkAsGood(pos);
}
}
}
return result;
}
private bool CheckCollisionWithPrevious(int pos)
{
bool checkOk = false;
var previousItem = _Item[pos - 1];
// Doing some checks ...
return checkOk;
}
private bool CheckCollisionWithFollowing(int pos)
{
bool checkOk = false;
var followingItem = _Item[pos + 1];
// Doing some checks ...
return checkOk;
}
Update
After reading the answer from Aaronaught, and after a weekend to recharge, I came up with the following solution, which looks far better now (and is nearly the same as the one I got from Aaronaught):
public bool Check(DataGridView dataGridView)
{
bool result = true;
_Items.Sort();
for (int pos = 1; pos < _Items.Count; pos++)
{
var previousItem = _Items[pos - 1];
var currentItem = _Items[pos];
if (previousItem.CollidesWith(currentItem))
{
dataGridView.Rows[pos].ErrorText = "Offset collides with item named " + previousItem.Label;
result = false;
sb.AppendLine("Line " + pos);
}
}
dataGridView.Refresh();
return result;
}
It's certainly possible to reduce the repetition:
private bool Check(List<MyItem> list)
{
list.Sort();
for (int pos = 1; pos < list.Count; pos++)
{
if (!CheckCollisionWithPrevious(list, pos))
{
MarkAsFailed();
return false;
}
MarkAsGood();
}
return true;
}
private bool CheckCollisionWithPrevious(List<MyItem> list, int pos)
{
bool checkOk = false;
var previousItem = list[pos - 1];
// Doing some checks ...
return checkOk;
}
Assuming that CheckCollisionWithPrevious and CheckCollisionWithFollowing perform essentially the same comparisons, then this will perform the same function with a lot less code.
I've also added the list as a parameter to the second function; it doesn't make sense to be taking it as a parameter in the first function, but then referencing a hard-coded member in the function it calls. If you're going to take a parameter, then pass that parameter down the chain.
As far as performance is concerned, though, you're re-sorting the list every time this happens; if it happens often enough, you might be better off using a sorted collection to begin with.
Edit: And just for good measure, if the whole point of this code is just to check for some kind of duplicate key, then you would be way better off using a data structure that prevents this in the first place, such as a Dictionary<TKey, TValue>.
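As a rough illustration of that last point, here is a minimal sketch of spotting repeated keys with a Dictionary in a single pass, without sorting. It assumes the key is a plain int offset; the real MyItem key is not shown in the question, so FindDuplicateOffsets is a hypothetical helper.

```csharp
using System;
using System.Collections.Generic;

public static class DuplicateCheck
{
    // Returns every offset that occurs more than once, listed in the
    // order in which its first repeat was encountered.
    public static List<int> FindDuplicateOffsets(IEnumerable<int> offsets)
    {
        var seen = new Dictionary<int, int>(); // offset -> occurrences so far
        var duplicates = new List<int>();

        foreach (int offset in offsets)
        {
            if (seen.TryGetValue(offset, out int count))
            {
                if (count == 1) duplicates.Add(offset); // report each key once
                seen[offset] = count + 1;
            }
            else
            {
                seen[offset] = 1;
            }
        }
        return duplicates;
    }
}
```

This only covers exact key clashes; if "collision" means overlapping ranges rather than equal keys, the sorted neighbour comparison above is still the better fit.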
Is there any simple algorithm to determine the likeliness of 2 names representing the same person?
I'm not asking for something at the level that a Customs department might be using. Just a simple algorithm that would tell me if 'James T. Clark' is most likely the same name as 'J. Thomas Clark' or 'James Clerk'.
If there is an algorithm in C# that would be great, but I can translate from any language.
Sounds like you're looking for a phonetic algorithm, such as Soundex, NYSIIS, or Double Metaphone. The first is actually what several government departments use, and is trivial to implement (with many implementations readily available). The second is a slightly more complicated and more precise version of the first. The last works with some non-English names and alphabets.
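To show how small Soundex is, here is a minimal sketch of the common American Soundex rules, assuming plain ASCII names. The Soundex class and Encode method are names made up for this example, not a library API.

```csharp
using System;
using System.Text;

public static class Soundex
{
    // Digit for each letter A..Z; '0' marks letters that are not coded
    // (vowels plus H, W and Y).
    private const string Codes = "01230120022455012623010202";

    public static string Encode(string name)
    {
        var sb = new StringBuilder();
        char lastCode = '0';

        foreach (char raw in name)
        {
            char c = char.ToUpperInvariant(raw);
            if (c < 'A' || c > 'Z') continue;      // skip spaces, dots, etc.

            char code = Codes[c - 'A'];
            if (sb.Length == 0)
            {
                sb.Append(c);                      // keep the first letter as is
                lastCode = code;
                continue;
            }

            if (c == 'H' || c == 'W') continue;    // H and W do not separate codes

            if (code != '0' && code != lastCode)
                sb.Append(code);                   // new consonant group
            lastCode = code;                       // a vowel resets the group

            if (sb.Length == 4) break;             // one letter plus three digits
        }

        return sb.Length == 0 ? "" : sb.ToString().PadRight(4, '0');
    }
}
```

With this sketch, "Clark" and "Clerk" both encode to C462, so the surnames from the question would match. Soundex says nothing about initials such as 'J.' versus 'James', so the given names would still need separate handling.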
Levenshtein distance is a definition of distance between two arbitrary strings. It gives you a distance of 0 between identical strings and non-zero between different strings, which might also be useful if you decide to make a custom algorithm.
Levenshtein is close, although maybe not exactly what you want.
I've faced a similar problem and tried to use Levenshtein distance first, but it did not work well for me. I came up with an algorithm that gives you a "similarity" value between two strings (a higher value means more similar strings, "1" for identical strings). This value is not very meaningful by itself (if not "1", it is always 0.5 or less), but it works quite well when you throw in Hungarian Matrix to find matching pairs from two lists of strings.
Use like this:
PartialStringComparer cmp = new PartialStringComparer();
tbResult.Text = cmp.Compare(textBox1.Text, textBox2.Text).ToString();
The code behind:
public class SubstringRange {
string masterString;
public string MasterString {
get { return masterString; }
set { masterString = value; }
}
int start;
public int Start {
get { return start; }
set { start = value; }
}
int end;
public int End {
get { return end; }
set { end = value; }
}
public int Length {
get { return End - Start; }
set { End = Start + value;}
}
public bool IsValid {
get { return MasterString.Length >= End && End >= Start && Start >= 0; }
}
public string Contents {
get {
if(IsValid) {
return MasterString.Substring(Start, Length);
} else {
return "";
}
}
}
public bool OverlapsRange(SubstringRange range) {
return !(End < range.Start || Start > range.End);
}
public bool ContainsRange(SubstringRange range) {
return range.Start >= Start && range.End <= End;
}
public bool ExpandTo(string newContents) {
if(MasterString.Substring(Start).StartsWith(newContents, StringComparison.InvariantCultureIgnoreCase) && newContents.Length > Length) {
Length = newContents.Length;
return true;
} else {
return false;
}
}
}
public class SubstringRangeList: List<SubstringRange> {
string masterString;
public string MasterString {
get { return masterString; }
set { masterString = value; }
}
public SubstringRangeList(string masterString) {
this.MasterString = masterString;
}
public SubstringRange FindString(string s){
foreach(SubstringRange r in this){
if(r.Contents.Equals(s, StringComparison.InvariantCultureIgnoreCase))
return r;
}
return null;
}
public SubstringRange FindSubstring(string s){
foreach(SubstringRange r in this){
if(r.Contents.StartsWith(s, StringComparison.InvariantCultureIgnoreCase))
return r;
}
return null;
}
public bool ContainsRange(SubstringRange range) {
foreach(SubstringRange r in this) {
if(r.ContainsRange(range))
return true;
}
return false;
}
public bool AddSubstring(string substring) {
bool result = false;
foreach(SubstringRange r in this) {
if(r.ExpandTo(substring)) {
result = true;
}
}
if(FindSubstring(substring) == null) {
bool patternfound = true;
int start = 0;
while(patternfound){
patternfound = false;
start = MasterString.IndexOf(substring, start, StringComparison.InvariantCultureIgnoreCase);
patternfound = start != -1;
if(patternfound) {
SubstringRange r = new SubstringRange();
r.MasterString = this.MasterString;
r.Start = start++;
r.Length = substring.Length;
if(!ContainsRange(r)) {
this.Add(r);
result = true;
}
}
}
}
return result;
}
private static bool SubstringRangeMoreThanOneChar(SubstringRange range) {
return range.Length > 1;
}
public float Weight {
get {
if(MasterString.Length == 0 || Count == 0)
return 0;
float numerator = 0;
int denominator = 0;
foreach(SubstringRange r in this.FindAll(SubstringRangeMoreThanOneChar)) {
numerator += r.Length;
denominator++;
}
if(denominator == 0)
return 0;
return numerator / denominator / MasterString.Length;
}
}
public void RemoveOverlappingRanges() {
SubstringRangeList l = new SubstringRangeList(this.MasterString);
l.AddRange(this);//create a copy of this list
foreach(SubstringRange r in l) {
if(this.Contains(r) && this.ContainsRange(r)) {
Remove(r);//try to remove the range
if(!ContainsRange(r)) {//see if the list still contains "superset" of this range
Add(r);//if not, add it back
}
}
}
}
public void AddStringToCompare(string s) {
for(int start = 0; start < s.Length; start++) {
for(int len = 1; start + len <= s.Length; len++) {
string part = s.Substring(start, len);
if(!AddSubstring(part))
break;
}
}
RemoveOverlappingRanges();
}
}
public class PartialStringComparer {
public float Compare(string s1, string s2) {
SubstringRangeList srl1 = new SubstringRangeList(s1);
srl1.AddStringToCompare(s2);
SubstringRangeList srl2 = new SubstringRangeList(s2);
srl2.AddStringToCompare(s1);
return (srl1.Weight + srl2.Weight) / 2;
}
}
The Levenshtein distance implementation is much simpler (adapted from http://www.merriampark.com/ld.htm):
public class Distance {
/// <summary>
/// Compute Levenshtein distance
/// </summary>
/// <param name="s">String 1</param>
/// <param name="t">String 2</param>
/// <returns>Distance between the two strings.
/// The larger the number, the bigger the difference.
/// </returns>
public static int LD(string s, string t) {
int n = s.Length; //length of s
int m = t.Length; //length of t
int[,] d = new int[n + 1, m + 1]; // matrix
int cost; // cost
// Step 1
if(n == 0) return m;
if(m == 0) return n;
// Step 2
for(int i = 0; i <= n; i++) d[i, 0] = i;
for(int j = 0; j <= m; j++) d[0, j] = j;
// Step 3
for(int i = 1; i <= n; i++) {
//Step 4
for(int j = 1; j <= m; j++) {
// Step 5
cost = (t.Substring(j - 1, 1) == s.Substring(i - 1, 1) ? 0 : 1);
// Step 6
d[i, j] = System.Math.Min(System.Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1), d[i - 1, j - 1] + cost);
}
}
// Step 7
return d[n, m];
}
}
I doubt there is, considering even the Customs Department doesn't seem to have a satisfactory answer...
If there is a solution to this problem I seriously doubt it's a part of core C#. Off the top of my head, it would require a database of first, middle and last name frequencies, as well as account for initials, as in your example. This is fairly complex logic that relies on a database of information.
I second Levenshtein distance. What language do you want? I was able to find an implementation in C# on CodeProject pretty easily.
In an application I worked on, the last name field was considered reliable.
So we presented all the records with the same last name to the user.
The user could sort by the other fields to look for similar names.
This solution was good enough to greatly reduce the issue of users creating duplicate records.
Basically, it looks like the issue will require human judgement.