Find object pattern in array in C#

I'm looking for an efficient way to find an object pattern in an array.
Here is the problem I have to tackle. I'm writing a tangible-interface application that collects data from the webcam and converts it into a black-and-white image, from which I create an array. The array that is created looks similar to this:
1111111111111111111111111111
1111110001111111111000111111
1111100000111111110000011111
1111100000111111110000011111
1111110001111111111000111111
1111111111111111111111111111
Where the zeros represent the color black in the image. I have about 32 circles (4 rows with 8 circles in each) and I need an efficient way to find their coordinates. I don't need the whole shape, just a set of coordinates for each circle.
Thank you for the help.
Regards,
Teodor Stoyanov

Three options that I can see immediately (a Tuple is used to represent the coordinates in your matrix):

1. You could use a BitArray with one bit per point in the matrix; the bit is set if the coordinate holds a 0. Storage cost is O(row length x column length). Retrieval is O(1) if you know the coordinates you want to check; otherwise it's O(n) to find all 0s.
2. You could use a List<Tuple<int, int>> to store only the coordinates of each 0 in the matrix. Storage cost is O(m), m being the number of 0s. Retrieval is also O(m).
3. As an alternative to option 2, you could use a Dictionary<Tuple<int, int>, bool>, which allows O(1) retrieval time if you know the coordinates you want to check.
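Option 3 might be sketched like this; I've used a HashSet<(int, int)> rather than a Dictionary with bool values, since a set of present coordinates carries the same information, and the string-per-row input format is an assumption:

```csharp
using System;
using System.Collections.Generic;

class ZeroIndex
{
    // Build a set of coordinates holding a '0', for O(1) membership checks.
    // 'grid' is assumed to be one string of '0'/'1' characters per row.
    public static HashSet<(int Row, int Col)> IndexZeros(string[] grid)
    {
        var zeros = new HashSet<(int, int)>();
        for (int r = 0; r < grid.Length; r++)
            for (int c = 0; c < grid[r].Length; c++)
                if (grid[r][c] == '0')
                    zeros.Add((r, c));
        return zeros;
    }
}
```

After indexing, `zeros.Contains((row, col))` answers "is this point black?" in constant time.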

Pick an arbitrary 0 and do a flood fill from it. Average the coordinates of all the 0s you find to get the center of the circle. Erase the 0s you flooded and repeat.
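A minimal sketch of that flood-fill approach, assuming the image arrives as rows of '0'/'1' characters (the type and all names are my own):

```csharp
using System;
using System.Collections.Generic;

class CircleFinder
{
    // Finds the centroid of each connected blob of 0s via iterative flood fill.
    // 'grid' is mutated: flooded 0s are overwritten with '1' (the "erase" step).
    public static List<(double X, double Y)> FindCenters(char[][] grid)
    {
        var centers = new List<(double, double)>();
        for (int y = 0; y < grid.Length; y++)
        for (int x = 0; x < grid[y].Length; x++)
        {
            if (grid[y][x] != '0') continue;
            long sumX = 0, sumY = 0, count = 0;
            var stack = new Stack<(int, int)>();
            grid[y][x] = '1';
            stack.Push((x, y));
            while (stack.Count > 0)
            {
                var (cx, cy) = stack.Pop();
                sumX += cx; sumY += cy; count++;
                foreach (var (nx, ny) in new[] { (cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1) })
                {
                    if (ny < 0 || ny >= grid.Length || nx < 0 || nx >= grid[ny].Length) continue;
                    if (grid[ny][nx] != '0') continue;
                    grid[ny][nx] = '1';     // erase so it is not revisited
                    stack.Push((nx, ny));
                }
            }
            // Average of all flooded coordinates = center of the circle.
            centers.Add(((double)sumX / count, (double)sumY / count));
        }
        return centers;
    }
}
```

The iterative stack avoids recursion-depth issues on larger blobs; for 32 small circles either form would do.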

There really isn't an easy way to do this, but the best thing you can do is tinker around with artificial neural networks. They let you feed in data and get an output across many different inputs. If you build the network right, it will self-adjust its weights over many iterations.
Sorry, but I doubt you're going to get the exact solution spelled out for you in code. Although I haven't used any of these libraries or resources, a quick glance over them makes them look pretty decent:
http://franck.fleurey.free.fr/NeuralNetwork/
http://sourceforge.net/projects/neurondotnet/

Related

C# Performance optimization: Avoid drawing points at the same location, increase size instead

I am visualizing a point cloud in Unity. My C# script reads the RGB data of a .png file and draws a particle at the corresponding position (x=r, y=g, z=b).
Pictures of course have multiple pixels of the same color; currently these are still drawn, but I want to avoid that and increase the size of the corresponding particle instead.
I already tried checking with Array.IndexOf() for existing particles and increasing their size when found.
The problem with this solution is that it is very slow. The possible number of different particles is 256*256*256, and when I tried it with only 50*50 particles it took over a minute to compute. An example with 4*4 worked well, but this is far from what I need.
I have already thought about keeping a list of existing particles. Maybe the list search is faster, but then I would also have to transform the list into an array.
Another idea is to just store a counter value in an int[256,256,256] and then iterate through it to create the particles. But this would also be a huge overhead.
Any ideas for a better approach are very welcome.
Edit: Creating all the particles, including the unnecessary ones, is very fast, taking just 1-2 seconds to compute a million particles. For this visual and rendering improvement I hope I don't need to increase the computation time by a factor of 10 or more.
I'm not too sure of the size of your dataset, but having tested this quick bit of code with 1 million items, it seems pretty quick (<0.5 s to generate the data and the weighting).
Assuming your RGB list looks like the following:
var rgbList = new List<RGB>();
then a quick bit of LINQ to group by unique RGB combination, with the number of occurrences of each, would be:
var grouping = rgbList
    .GroupBy(
        val => new { val.R, val.G, val.B },
        (key, group) => new { Rgb = new RGB(key.R, key.G, key.B), Count = group.Count() })
    .OrderBy(g => g.Count);
Then you can iterate through 'grouping', getting each RGB value and its Count, allowing you to locate the x/y/z coordinates you need and then scale the point size based on the count.
Hope that helps

Find how many pieces a matrix was sliced into, with recursion

I recently got an assignment where I have a 4-by-5 matrix of 0s and 1s (the zeros represent the cut parts of the matrix). I need a way to calculate how many pieces the matrix will be sliced into. As mentioned, zeros represent the cuts, so if a straight line of zeros runs from one border to another, the matrix was sliced along that line. For example (in this picture I've marked where the matrix would be split; a line of zeros slices it):
So guys, I know that you won't entirely solve this code for me, and I don't need that, but what I need is to understand this:
Firstly, how should I tell the program which direction it should go when it is traversing the matrix? I have an idea involving enumerations.
Secondly, what kind of conditional statement should I use so that the program recognizes a line of zeros (if there is one)?
Any help would be appreciated :)
You did not specify any requirements for the algorithm (such as time and space complexity), so I guess one answer, that correlates to some specific solutions, would be:
Go in all 4 directions
Don't condition on 0s to create the line, but try to look at the 1s and find to which piece they belong.
A general algorithm for this can be implemented as follows:
Create a helper matrix of the same size, a function to give you a new symbol (for example by increasing some number any time a symbol is asked for), and a data structure to store collisions
Go in all 4 directions starting from anywhere
Whenever you find a 0 in the original matrix, issue some new symbol to each of the new directions you are going from there
Whenever you find 1, try to store the values in the helper matrix. If there is already a value there, then:
Store in the collisions data structure that you found a collision between 2 symbols
Don't continue in any direction from this one.
This will traverse each cell at most four times, so the time complexity is O(n), and when you are done you will have a data structure with all the collisions.
Now all you need to do is combine all entries in the other data structure to collect how many unique pieces you really have.
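The net effect of the symbol/collision scheme is a connected-components count over the 1s. As a plain substitute for the symbol-based variant described above (not that exact algorithm), a direct flood-fill count could look like:

```csharp
using System;

class PieceCounter
{
    // Counts connected groups of 1s (4-directional) in a 0/1 matrix.
    // Each group is one piece left after the lines of 0s cut the matrix.
    public static int CountPieces(int[,] m)
    {
        int rows = m.GetLength(0), cols = m.GetLength(1);
        var seen = new bool[rows, cols];
        int pieces = 0;
        for (int r = 0; r < rows; r++)
            for (int c = 0; c < cols; c++)
                if (m[r, c] == 1 && !seen[r, c])
                {
                    pieces++;          // found an unvisited piece
                    Flood(m, seen, r, c);
                }
        return pieces;
    }

    // Recursively mark every 1 reachable from (r, c).
    static void Flood(int[,] m, bool[,] seen, int r, int c)
    {
        if (r < 0 || r >= m.GetLength(0) || c < 0 || c >= m.GetLength(1)) return;
        if (seen[r, c] || m[r, c] != 1) return;
        seen[r, c] = true;
        Flood(m, seen, r + 1, c);
        Flood(m, seen, r - 1, c);
        Flood(m, seen, r, c + 1);
        Flood(m, seen, r, c - 1);
    }
}
```

Recursion is fine for a 4x5 assignment matrix; for large grids an explicit stack would be safer.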

Data structure for expandable 3D array?

I'm looking for a data structure similar to T[,,] (a 3D array) except I don't know the dimensions beforehand (nor have a reasonable upper bound) that will expand outwards as time goes on. I'd like to use negative indexes as well.
The only thing that comes to mind is a dictionary, with some kind of Point3 struct as the key. Are there any other alternatives? I'd like lookups to be as fast as possible. The data will always be clustered around 0,0,0. It can expand in any direction, but there will never be any "gaps" between points.
I think I'm going to go ahead and just use a Dictionary<Point3, T> for now, and see how that performs. If it's a performance issue I'll try building a wrapper around T[,,] such that I can use negative indexes.
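A minimal Point3 usable as that dictionary key might look like this (the struct and its members are illustrative; HashCode.Combine assumes .NET Core 2.1 or later):

```csharp
using System;
using System.Collections.Generic;

// A value-type key with proper equality and hashing for Dictionary lookups.
readonly struct Point3 : IEquatable<Point3>
{
    public readonly int X, Y, Z;
    public Point3(int x, int y, int z) { X = x; Y = y; Z = z; }
    public bool Equals(Point3 o) => X == o.X && Y == o.Y && Z == o.Z;
    public override bool Equals(object o) => o is Point3 p && Equals(p);
    public override int GetHashCode() => HashCode.Combine(X, Y, Z);
}
```

Negative indexes are no problem, since the key is only hashed, never used as an array offset: `world[new Point3(-4, 0, 7)] = value;` works as-is.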
Obviously you'll need to store this in a data structure resembling a sparse array, because you have no idea how large your data-set is going to be. So a Dictionary seems reasonable.
I'm going a little crazy here, but I think your indices should be in Spherical Coordinates. It makes sense to me as your data grows outwards. It will also make finding elements at a specific range from (0, 0, 0) extremely easy.
If you may need range queries, KD-trees come to mind. They are tree-like structures; at each level they separate the universe into two along one axis. They offer O(log N) lookup time (for a constant number of dimensions), which may or may not be fast enough, but they also provide O(log N + S) time for range queries, where S is the number of items found, which is usually very good. They can handle dynamic data (insertions and deletions along with lookups), but the tree may become unbalanced as a result. You can also do nearest-neighbor search from a given point (i.e. get the 10 nearest objects to point (7,8,9)). Wikipedia, as always, is a good starting point: http://en.wikipedia.org/wiki/Kd-tree
If there are huge numbers of things in the world, or if the world is very dynamic (things move or are created/destroyed all the time), kd-trees may not be good enough. If most of the time you will only ask "give me the thing at (7,8,9)", you can either use a hash as you mentioned in your question or something like a List<List<List<T>>>. I'd just implement whichever is easier behind an interface and worry about the performance later.
I am kind of assuming you need the dynamic aspect because the array could be huge. In that case, what you could try is to allocate your array as a set of 3D 'tiles'. At the top level you have a 3D data structure that stores pointers to your tiles. You expand and allocate this as you go along.
Each individual tile could contain, say, 32x32x32 voxels. Or whatever amount suits your goals.
Looking up your tile is done by dividing your coordinate index by 32 (by bit-shifting, of course), and the index within the tile is calculated by masking away the upper bits.
A lookup like this is fairly cheap, possible on par with a .net Dictionary, but it will use less memory, which is good for performance too.
The resulting array will be chunky, though: the array boundaries are multiples of your tile size.
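The shift-and-mask lookup described above, for 32-voxel tiles, might look like this (the method name is my own). As a bonus, an arithmetic right shift floors toward negative infinity, so negative coordinates map to negative tile indices with a valid in-tile offset:

```csharp
using System;

static class TileMath
{
    // Split a global voxel coordinate into (tile index, index within the tile)
    // for tiles of 32 voxels per axis: shift by 5 divides by 32, masking with
    // 31 keeps the low 5 bits as the local offset.
    public static (int Tile, int Local) Split(int coord)
    {
        return (coord >> 5, coord & 31);
    }
}
```

For example, coordinate 70 lands in tile 2 at local offset 6, and coordinate -1 lands in tile -1 at local offset 31.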
Array access is a very fast linear lookup - if speed of lookup is your priority, then it's probably the way to go, depending on how often you will need to modify your arrays.
If your goal is to maintain the chunks around a player, you might want to arrange a "current world" array structure around the player, such that it is a 3-dimensional array with the centre chunk at 9,9,9 with a size of [20,20,20]. Each time the player leaves the centre chunk for another, you re-index the array to drop the old chunks and move on.
Ultimately, you're asking for options on how to optimize your game engine, but it's nearly impossible to say which is going to be correct for you. Even though games tend to be more optimized for performance than other applications, don't get lured into micro-optimizations; optimize for readability first and then optimize for performance when you find it necessary.
If you suspect this particular data-structure is going to be a bottleneck in your engine, put some performance tracing in so it's easy to be sure one way or the other once you've got the engine running.

Quickly find and render terrain above a given elevation

Given an elevation map consisting of lat/lon/elevation pairs, what is the fastest way to find all points above a given elevation level (or better yet, just the 2D concave hull)?
I'm working on a GIS app where I need to render an overlay on top of a map to visually indicate regions that are of higher elevation; it's determining this polygon/region that has me stumped (for now). I have a simple array of lat/lon/elevation pairs (more specifically, the GTOPO30 DEM files), but I'm free to transform that into any data structure that you would suggest.
We've been pointed toward Triangulated Irregular Networks (TINs), but I'm not sure how to efficiently query that data once we've generated the TIN. I wouldn't be surprised if our problem could be solved similarly to how one would generate a contour map, but I don't have any experience with it. Any suggestions would be awesome.
It sounds like you're attempting to create a polygonal representation of the boundary of the high land.
If you're working with raster data (sampled on a rectangular grid), try this.
Think of your grid as an assembly of right triangles.
Let's say you have a 3x3 grid of points
a b c
d e f
g h k
Your triangles are:
abd part of the rectangle abed
bde the other part of the rectangle abed
bef part of the rectangle bcfe
cef the other part of the rectangle bcfe
dge ... and so on
Your algorithm has these steps.
Build a list of triangles that are above the elevation threshold.
Take the union of these triangles to make a polygonal area.
Determine the boundary of the polygon.
If necessary, smooth the polygon boundary to make your layer look ok when displayed.
If you're trying to generate good-looking contour lines, step 4 is very hard to do right.
Step 1 is the key to this problem.
For each triangle, if all three vertices are above the threshold, include the whole triangle in your list. If all are below, forget about the triangle. If some vertices are above and others below, split your triangle into three by adding new vertices that lie precisely on the elevation line (by interpolating elevation). Include the one or two of the resulting triangles that lie above the threshold in your highland list.
For the rest of the steps you'll need a decent 2d geometry processing library.
If your points are not on a regular grid, start by using the Delaunay algorithm (which you can look up) to organize your points into triangles. Then follow the same algorithm I mentioned above. Warning: this is going to look kind of sketchy if you don't have many points.
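The vertex-splitting in step 1 reduces to linear interpolation along each edge that crosses the threshold; a small sketch (the tuple-based signature is my own):

```csharp
using System;

static class ContourSplit
{
    // Point on the segment (a, b) whose interpolated elevation equals
    // 'threshold'. Assumes a.Elev and b.Elev straddle the threshold.
    public static (double X, double Y) CrossingPoint(
        (double X, double Y, double Elev) a,
        (double X, double Y, double Elev) b,
        double threshold)
    {
        // Fraction of the way from a to b at which the threshold is reached.
        double t = (threshold - a.Elev) / (b.Elev - a.Elev);
        return (a.X + t * (b.X - a.X), a.Y + t * (b.Y - a.Y));
    }
}
```

Running this on both crossing edges of a mixed triangle gives the two new vertices needed for the split.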
Assuming you have the lat/lon/elevation data stored in an array (or three separate arrays) you should be able to use array querying techniques to select all of the points where the elevation is above a certain threshold. For example, in python with numpy you can do:
indices = where(array > value)
And the indices variable will contain the indices of all elements of array greater than the threshold value. Similar commands are available in various other languages (for example IDL has the WHERE() command, and similar things can be done in Matlab).
Once you've got this list of indices you could create a new binary array where each place where the threshold was satisfied is set to 1:
binary_array[indices] = 1
(Assuming you've created a blank array of the same size as your original lat/lon/elevation data and called it binary_array.)
If you're working with raster data (which I would recommend for this type of work), you may find that you can simply overlay this array on a map and get a nice set of regions appearing. However, if you need to convert the areas above the elevation threshold to vector polygons then you could use one of many inbuilt GIS methods to convert raster->vector.
I would use a nested C-squares arrangement, with each square having a pre-calculated maximum ground height. This would allow me to scan at a high level, discarding any squares where the max height is not above the search height, and drilling further into those squares where parts of the ground were above the search height.
If you're working to various set levels of search height, you could precalculate the convex hull for the various predefined levels for the smallest squares that you decide to use (or all the squares, for that matter.)
I'm not sure whether your lat/lon/alt points are on a regular grid or not, but if not, perhaps they could be interpolated to represent even 100 ft altitude increments and uniform lat/lon divisions (bearing in mind that that does not give uniform distance divisions). But if that would work, why not precompute a three-dimensional array, where the indices represent altitude, latitude, and longitude respectively? Then when the aircraft needs data about points at or above an altitude, for a specific piece of terrain, the code only needs to read out a small part of the data in this array, which is indexed to make contiguous "voxels" contiguous in the indexing scheme.
Of course, the increments in longitude would not have to be uniform: if uniform distances are required, the same scheme would work, but the indexes for longitude would point to a nonuniformly spaced set of longitudes.
I don't think there would be any faster way of searching this data.
It's not clear from your question if the set of points is static and you need to find what points are above a given elevation many times, or if you only need to do the query once.
The easiest solution is to just store the points in an array, sorted by elevation. Finding all points in a certain elevation range is just binary search, and you only need to sort once.
If you only need to do the query once, just do a linear search through the array in the order you got it. Building a fancier data structure from the array is going to be O(n) anyway, so you won't get better results by complicating things.
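The sorted-array query from above could be sketched with Array.BinarySearch (the helper name is mine); everything from the returned index to the end of the array is at or above the threshold:

```csharp
using System;

static class ElevationQuery
{
    // Index of the first elevation >= threshold in an ascending-sorted array.
    public static int FirstAtOrAbove(double[] sortedElevations, double threshold)
    {
        int i = Array.BinarySearch(sortedElevations, threshold);
        if (i < 0)
            i = ~i;   // value absent: ~i is the insertion point
        else
            while (i > 0 && sortedElevations[i - 1] == threshold)
                i--;  // value present: step back to the leftmost duplicate
        return i;
    }
}
```

Sorting once is O(n log n); each subsequent threshold query is then O(log n).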
If you have some other requirements, like say you need to efficiently list all points inside some rectangle the user is viewing, or that points can be added or deleted at runtime, then a different data structure might be better. Presumably some sort of tree or grid.
If all you care about is rendering, you can perform this very efficiently using graphics hardware, and there is no need to use a fancy data structure at all, you can just send triangles to the GPU and have it kill fragments above or below a certain elevation.

Representing a Gameworld that is Irregularly shaped

I am working on a project where the game world is irregularly shaped (think of the shape of a lake). This shape has a grid with coordinates placed over it. The game world exists only on the inside of the shape. (Once again, think lake.)
How can I efficiently represent the game world? I know that many worlds are basically square, and work well in a 2 or 3 dimension array. I feel like if I use an array that is square, then I am basically wasting space, and increasing the amount of time that I need to iterate through the array. However, I am not sure how a jagged array would work here either.
Example shape of gameworld
X
XX
XX X XX
XXX XXX
XXXXXXX
XXXXXXXX
XXXXX XX
XX X
X
Edit:
The game world will most likely need each valid location stepped through, so I would like a method that makes it easy to do so.
There's computational overhead and complexity associated with sparse representations, so unless the bounding area is much larger than your actual world, it's probably most efficient to simply accept the 'wasted' space. You're essentially trading off additional memory usage for faster access to world contents. More importantly, the 'wasted-space' implementation is easier to understand and maintain, which is always preferable until the point where a more complex implementation is required. If you don't have good evidence that it's required, then it's much better to keep it simple.
You could use a quadtree to minimize the amount of wasted space in your representation. Quad trees are good for partitioning 2-dimensional space with varying granularity - in your case, the finest granularity is a game square. If you had a whole 20x20 area without any game squares, the quad tree representation would allow you to use only one node to represent that whole area, instead of 400 as in the array representation.
Use whatever structure you've come up with---you can always change it later. If you're comfortable with using an array, use it. Stop worrying about the data structure you're going to use and start coding.
As you code, build abstractions away from this underlying array, like wrapping it in a semantic model; then, if you realize (through profiling) that it's waste of space or slow for the operations you need, you can swap it out without causing problems. Don't try to optimize until you know what you need.
Use a data structure like a list or map, and only insert the valid game world coordinates. That way the only thing you are saving are valid locations, and you don't waste memory saving the non-game world locations since you can deduce those from lack of presence in your data structure.
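That idea might be sketched like so, with a HashSet of valid coordinates so that both membership tests and iteration touch only real game squares (all names are assumptions):

```csharp
using System;
using System.Collections.Generic;

class LakeWorld
{
    // Only coordinates inside the lake shape are ever stored.
    readonly HashSet<(int X, int Y)> valid = new HashSet<(int, int)>();

    public void AddSquare(int x, int y) => valid.Add((x, y));

    // O(1): absence from the set means "outside the game world".
    public bool InWorld(int x, int y) => valid.Contains((x, y));

    // Steps through every valid location, as the question's edit asks for.
    public IEnumerable<(int X, int Y)> Squares() => valid;
}
```

Iteration order is unspecified for a HashSet; if row-by-row traversal matters, a SortedSet or a sorted list of coordinates would be the variant to reach for.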
The easiest thing is to just use the array, and just mark the non-gamespace positions with some special marker. A jagged array might work too, but I don't use those much.
You could represent the world as an (undirected) graph of land (or water) patches. Each patch then has a regular form, and the world is the combination of these patches. Every patch is a node in the graph and has graph edges to all its neighbours.
That is probably also the most natural representation of any general world (but it might not be the most efficient one). From an efficiency point of view, it will probably beat an array or list for a highly irregular map but not for one that fits well into a rectangle (or other regular shape) with few deviations.
An example of a highly irregular map:
x
x x
x x x
x x
x xxx
x
x
x
x
There’s virtually no way this can be efficiently fitted (both in space ratio and access time) into a regular shape. The following, on the other hand, fits very well into a regular shape by applying basic geometric transformations (it’s a parallelogram with small bits missing):
xxxxxx x
xxxxxxxxx
xxxxxxxxx
xx xxxx
One other option that could allow you to still access game world locations in O(1) time and not waste too much space would be a hashtable, where the keys would be the coordinates.
Another way would be to store an edge list - a line vector along each straight edge. It's easy to check for inclusion this way, and a quadtree or even a simple location hash on each vertex can speed up lookups. We did this with a height component per edge to model the walls of a baseball stadium and it worked beautifully.
There is a big issue that nobody here addressed: the huge difference between storing it on disk and storing it in memory.
Assuming you are talking about a game world, as you said, this means it's going to be very large. You're not going to store the whole thing in memory at once; instead you will store the immediate vicinity in memory and update it as the player walks around.
This vicinity area should be as simple, easy and quick to access as possible. It should definitely be an array (or a set of arrays which are swapped out as the player moves). It will be referenced often and by many subsystems of your game engine: graphics and physics will handle loading the models, drawing them, keeping the player on top of the terrain, collisions, etc.; sound will need to know what ground type the player is currently standing on, to play the appropriate footstep sound; and so on. Rather than broadcast and duplicate this data among all the subsystems, if you just keep it in global arrays they can access it at will and at 100% speed and efficiency. This can really simplify things (but be aware of the consequences of global variables!).
However, on disk you definitely want to compress it. Some of the given answers provide good suggestions: you can serialize a data structure such as a hash table, or a list of only the filled-in locations. You could certainly store an octree as well. In any case, you don't want to store blank locations on disk; according to your statistic, that would mean 66% of the space is wasted. Sure, there is a time to forget about optimization and make it Just Work, but you don't want to distribute a 66%-empty file to end users. Also keep in mind that disks are not perfect random-access machines (except for SSDs); mechanical hard drives should still be around for at least several more years, and they work best sequentially. See if you can organize your data structure so that the read operations are sequential as you stream in more vicinity terrain while the player moves, and you'll probably find it makes a noticeable difference. Don't take my word for it though, I haven't actually tested this sort of thing; it just makes sense, right?
