I was going through this and this post about binary search tree implementation.
I saw that a binary search tree is represented as (for example):
1 5 7 10 40 50
I was trying to learn about the serialization or de-serialization of the same here. The blog post is making me crazy with those -1s which they're calling markers for NULL pointers. And they're representing the tree as:
20 8 4 -1 -1 12 10 -1 -1 14 -1 -1 -1
Confusion
What are those -1s?
My final goal is to store and read a binary search tree to some kind of file using C# but this confusion is keeping me off.
These -1s mark the places where a node has no child.
For your example
        20
       /
      8
     / \
    4   12
       /  \
      10   14
You can imagine adding additional -1 (you can use any value that can not occur in the tree itself) to places where nodes have no children:
            20
           /  \
          8    -1
        /   \
       4     12
      / \   /  \
    -1  -1 10   14
          / \   / \
        -1  -1 -1  -1
And now if you go through your tree in "root, then left subtree, then right subtree" order, you will get the following string:
20 8 4 -1 -1 12 10 -1 -1 14 -1 -1 -1
Which is exactly what you have. So this is a way to represent the tree in an array form. At the same time, it is easy to reconstruct the tree from that form. Knowing that these -1s are special in a sense that they have no more children, you can reconstruct the tree from such an array without any ambiguity.
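To make the round trip concrete, here is a minimal Python sketch of the same idea. The names Node, serialize, deserialize and NO_NODE are made up for this illustration, and it assumes -1 never occurs as a real value in the tree.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional

NO_NODE = -1  # assumed sentinel; must never occur as a real value in the tree


@dataclass
class Node:
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def serialize(node: Optional[Node], out: List[int]) -> None:
    """Append the pre-order form of node to out, writing NO_NODE for each missing child."""
    if node is None:
        out.append(NO_NODE)
        return
    out.append(node.value)
    serialize(node.left, out)
    serialize(node.right, out)


def deserialize(values: Iterator[int]) -> Optional[Node]:
    """Rebuild a tree by consuming values in the same root, left, right order."""
    value = next(values)
    if value == NO_NODE:
        return None
    left = deserialize(values)   # the left subtree is stored first
    right = deserialize(values)  # then the right subtree
    return Node(value, left, right)


data = [20, 8, 4, -1, -1, 12, 10, -1, -1, 14, -1, -1, -1]
tree = deserialize(iter(data))
round_trip: List[int] = []
serialize(tree, round_trip)
print(round_trip == data)
```

Because every missing child is written explicitly, the reader always knows when a subtree ends, which is why no extra length or shape information is needed.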
Not all nodes have two children. Obviously, because the tree would be infinite otherwise. At runtime/in memory, a missing node is represented by a nullptr, on disk by -1.
The meaning of these -1's was explained in other answers, let me present the code of reading and writing a tree:
class Tree
{
public:
int value;
Tree* left;
Tree* right;
Tree(int i_value, Tree* i_left, Tree* i_right)
: value(i_value), left(i_left), right(i_right)
{}
Tree(const Tree&) = delete;
static constexpr int NO_TREE = -1; // constexpr (implicitly inline since C++17): Write may bind it to a reference
template<typename Iterator>
static Tree* Create(Iterator& i_iterator)
{
if (*i_iterator == NO_TREE)
{
i_iterator++;
return nullptr;
}
int value = *(i_iterator++);
Tree* left = Create(i_iterator);
Tree* right = Create(i_iterator);
return new Tree(value, left, right);
}
template<typename Iterator>
static void Write(Tree* i_tree, Iterator& i_iterator)
{
if (i_tree == nullptr)
{
*(i_iterator++) = NO_TREE;
return;
}
*(i_iterator++) = i_tree->value;
Write(i_tree->left, i_iterator);
Write(i_tree->right, i_iterator);
}
};
Usage:
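std::vector<int> v = { 20, 8, 4, -1, -1, 12, 10, -1, -1, 14, -1, -1, -1 };
auto in = v.begin();              // Create takes the iterator by non-const reference,
Tree* t = Tree::Create(in);       // so it needs a named variable rather than a temporary
std::vector<int> w;
auto out = std::back_inserter(w); // the same applies to Write
Tree::Write(t, out);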
I am trying to understand in more detail the use of the Enumerable.Where method. Even though I already understand many details including the use of lambda expression, delegates, predicates and so on, some things make no sense for me and I would appreciate any help.
First I am referring to the explanation from the link below:
https://learn.microsoft.com/en-us/dotnet/api/system.linq.enumerable.where?view=net-5.0
In the webpage above they have the following code example :
int[] numbers = { 0, 30, 20, 15, 90, 85, 40, 75 };
IEnumerable<int> query =
numbers.Where((number, index) => number <= index * 10);
foreach (int number in query)
{
Console.WriteLine(number);
}
/*
This code produces the following output:
0
20
15
40
*/
My questions are :
Where are the parameters "number" and "index" defined? I understand that the "number" inside the Where is different from the "number" inside the foreach statement.
Why can I change the name of the parameter "number" inside the Where but can't change the name of "index"?
Why does this code produce the output 0, 20, 15, 40? I know the indexes are from 0 to 7.
What is the usage of the left arrow in "number <= index * 10" and what is the official name of this left arrow? (I know the right arrow is to separate input and output in a lambda expression)
Thank you for your attention and support.
Where are the parameters "number" and "index" defined?
They are declared when you write (number, index) => .... (number, index) => is short for (int number, int index) =>. The types can be omitted because they can be inferred from the signature of Where.
The overload of Where that you are calling is:
public static IEnumerable<TSource> Where<TSource> (
this IEnumerable<TSource> source,
Func<TSource,int,bool> predicate
);
numbers is passed to source, and (number, index) => number <= index * 10 is passed to predicate. The types can be inferred because the source argument tells us that TSource is int (you passed in an int[]), so the type of the predicate parameter must be Func<int,int,bool>.
Func<int,int,bool> represents a function that takes two ints and returns a bool. You are supposed to give Where such a function. This is why you are allowed to declare the two parameters (number, index) - these are the parameters for the function that you are passing to Where. As for what the function does...
What is the usage of the left arrow?
It is the "less than or equal to" operator. The function you are passing to Where returns true if number is less than or equal to 10 times the index of the number. You should see why only 0 (at index 0), 20 (at index 2), 15 (at index 3), and 40 (at index 6) are left in the filtered sequence. This should answer your third question too.
Why can I change the name of the parameter "number" inside the Where but can't change the name of "index"?
You can rename just index:
(number, b) => number <= b * 10
or even rename both of them:
(a, b) => a <= b * 10
They are just parameter names after all. Maybe you weren't doing it correctly.
Where are the parameters "number" and "index" defined? I understand that the "number" inside the Where is different from the "number" inside the foreach statement.
That is from the enumerable extension method Where:
public static System.Collections.Generic.IEnumerable<TSource> Where<TSource> (this System.Collections.Generic.IEnumerable<TSource> source, Func<TSource,int,bool> predicate);
That takes a Func<TSource,int,bool> (a function that takes a source element from the collection and an int index, and returns a bool).
Why can I change the name of the parameter "number" inside the Where but can't change the name of "index"?
That represents the index in the enumerable.
Why does this code produce the output 0, 20, 15, 40? I know the indexes are from 0 to 7.
{ 0, 30, 20, 15, 90, 85, 40, 75 }
Where only yields an element (the number from the list) when the predicate (number <= index * 10) is true:
index 0 number 0: 0 <= 0 * 10 : true
index 1 number 30: 30 <= 1 * 10 : false
index 2 number 20: 20 <= 2 * 10 : true
index 3 number 15: 15 <= 3 * 10 : true
index 4 number 90: 90 <= 4 * 10 : false
index 5 number 85: 85 <= 5 * 10 : false
index 6 number 40: 40 <= 6 * 10 : true
index 7 number 75: 75 <= 7 * 10 : false
What is the usage of the left arrow in "number <= index * 10" and what is the official name of this left arrow? (I know the right arrow is to separate input and output in a lambda expression)
number is less than or equal to index times ten -- it's a less than or equal to comparison returning a bool.
Where are the parameters "number" and "index" defined? I understand that the "number" inside the Where is different from the "number" inside the foreach statement.
Imagine the code looks more like this:
public bool IsElementValid(int number, int index)
{
return number <= index * 10;
}
IEnumerable<int> query = numbers.Where(IsElementValid);
Your code (number, index) => number <= index * 10; is effectively declaring an anonymous method which accepts two parameters: number and index and returns a bool. These are called "lambda expressions" and you can read more about them in the documentation.
You can pass a method here because Where accepts a Func<TSource, int, bool> delegate. A delegate effectively allows you to store one or more methods in a variable. You can read about delegates here.
So now we know that a Func can effectively hold a method, we can digest how Where could work by writing our own:
public List<int> MyOwnWhere(int[] source, Func<int, int, bool> filter)
{
var result = new List<int>();
for (int i = 0; i < source.Length; ++i)
{
if (filter(source[i], i) == true)
{
result.Add(source[i]);
}
}
return result;
}
Of course this isn't exactly how Where works, but you can get a sense of what happens under the hood re the Func.
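One difference worth seeing in code is that the real Where is lazy (it uses deferred execution), while MyOwnWhere builds the whole list up front. Below is a rough Python analogue of a lazy, index-aware filter. It is only an illustration of the idea, not how LINQ is implemented, and the name where_indexed is made up for this sketch.

```python
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def where_indexed(source: Iterable[T], predicate: Callable[[T, int], bool]) -> Iterator[T]:
    """Yield each item for which predicate(item, index) is true, one at a time."""
    for index, item in enumerate(source):
        if predicate(item, index):
            yield item  # nothing is computed until the caller iterates


numbers = [0, 30, 20, 15, 90, 85, 40, 75]
query = where_indexed(numbers, lambda number, index: number <= index * 10)
result = list(query)  # iterating the query is what actually runs the predicate
print(result)

# The parameter names belong to the lambda, so both can be renamed freely.
renamed = list(where_indexed(numbers, lambda bob, simon: bob <= simon * 10))
```

The filter function decides what goes in each position (element first, index second); the names you give those positions in your lambda are entirely your choice.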
I created a sample here with some diagnostics messages to help you understand the flow.
Why can I change the name of the parameter "number" inside the Where but can't change the name of "index"?
You can change it without breaking things. Here I've changed them to "bob" and "simon" and it still works.
Why does this code produce the output 0, 20, 15, 40? I know the indexes are from 0 to 7.
Your checks are performed like this:
Index | Check
0 | 0 <= 0 (because 0 * 10 == 0) result = true
1 | 30 <= 10 (because 1 * 10 == 10) result = false
2 | 20 <= 20 (because 2 * 10 == 20) result = true
3 | 15 <= 30 (because 3 * 10 == 30) result = true
4 | 90 <= 40 (because 4 * 10 == 40) result = false
5 | 85 <= 50 (because 5 * 10 == 50) result = false
6 | 40 <= 60 (because 6 * 10 == 60) result = true
7 | 75 <= 70 (because 7 * 10 == 70) result = false
What is the usage of the left arrow in "number <= index * 10" and what is the official name of this left arrow? (I know the right arrow is to separate input and output in a lambda expression)
The left arrow is the mathematical symbol for "less than". Combined with the equals, it is "less than or equal to". See comparison operators for more.
I'm new here and this might be a repetition of a previous post, but I couldn't find something specific to this. I have a 2D grid containing four random values (0, 1, 2, 3). I want to build an algorithm which finds all the connected neighbours of one selected cell (which can be passed by just giving the index of that cell) and highlights them.
For ex:
I have a 2d array:
[0 0 1 0 3 2 0]
[1 3 1 2 1 0 0]
[3 2 3 1 1 1 2]
[0 0 1 2 2 1 0]
[3 2 1 2 1 1 0]
If, for example, the user selects number 1 (the element highlighted in bold above), I want to find out the cluster that it belongs to and highlight that. Search only top, bottom, left and right, no diagonals.
Any help would be appreciated.
You are looking for the connected component of a graph, which can be found by depth-first search. Basically you would recursively add all neighbours of the starting node to the output; the details would depend on the specific underlying implementation.
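As a sketch of that depth-first search, here is a small Python version that uses an explicit stack instead of recursion. The function name connected_cells and the choice of starting cell are assumptions for illustration, since the question does not say which cell was highlighted.

```python
def connected_cells(grid, row, col):
    """Return the set of (row, col) cells 4-connected to the start cell with the same value."""
    target = grid[row][col]
    seen = {(row, col)}
    stack = [(row, col)]
    while stack:
        r, c = stack.pop()
        # Only top, bottom, left and right: no diagonals.
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < len(grid) and 0 <= nc < len(grid[0])
                    and (nr, nc) not in seen and grid[nr][nc] == target):
                seen.add((nr, nc))
                stack.append((nr, nc))
    return seen


grid = [
    [0, 0, 1, 0, 3, 2, 0],
    [1, 3, 1, 2, 1, 0, 0],
    [3, 2, 3, 1, 1, 1, 2],
    [0, 0, 1, 2, 2, 1, 0],
    [3, 2, 1, 2, 1, 1, 0],
]
cluster = connected_cells(grid, 2, 4)  # hypothetical selection: row 2, column 4
print(sorted(cluster))
```

The returned set is exactly the list of cells to highlight; swapping the stack for a queue turns this into a breadth-first search with the same result.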
You can implement the BFS using a queue and a Boolean array as follows:
#include <cstring>   // memset
#include <queue>
#include <utility>   // pair
using namespace std;

// vis must point to n rows of m bools; on return, vis[r][c] is 1 for every cell in the component.
void BFS(int srcR, int srcC, int n, int m, int** grid, bool** vis){
    int dr[] = {1, -1, 0, 0}; //The change in row
    int dc[] = {0, 0, 1, -1}; //The change in column
    int target = grid[srcR][srcC];
    for(int r = 0 ; r < n ; ++r) //Clear row by row; sizeof on a pointer only gives the pointer size.
        memset(vis[r], 0, m * sizeof(bool));
    queue<pair<int, int> > q;
    q.push({srcR, srcC});
    vis[srcR][srcC]=1;
    while(!q.empty()){
        int ur = q.front().first, uc = q.front().second;
        q.pop();
        for(int k= 0 ; k < 4 ; ++k){ //The 4 directions we are going to traverse.
            int vr = ur + dr[k], vc = uc + dc[k];
            if(vr>=0 && vr<n && vc>=0 && vc<m && grid[vr][vc] == target && !vis[vr][vc]){
                vis[vr][vc]=1;
                q.push({vr, vc});
            }
        }
    }
}
When this function finishes you'll have the vis array with ones denoting the connected component.
I'm sorry if there's any error, as I wrote this from the phone.
I have an array of 8 compass points numbered from SW, clockwise through to S:
2 3 4
1   5
0 7 6
I want to calculate if the shortest route from one point to another would be clockwise (+1) or anticlockwise (-1). E.g. to go from 7 to 5 would be -1, to go from 7 to 0 would be +1.
Simple problem I guess but I'm having a real brain freeze today.
The closest I've got is if abs(start - end) < 4, -1, 1 but that doesn't work if the start is 3.
There is a similar problem here, the accepted answer for which is to use modulo, but doesn't explain how. I've thrown various calculations around without success.
Instead of using abs, add 8 (the number of entries) and then take modulo 8, like this:
enum Direction {
None, Clockwise, Counterclockwise
}
public static Direction GetDirection(int a, int b) {
if (a == b) {
return Direction.None;
}
return (a-b+8)%8 > 4 ? Direction.Clockwise : Direction.Counterclockwise;
}
Adding 8 makes the difference non-negative; modulo-8 brings it into 0...7 range.
Note that when the number of steps is 4, it does not matter which way you go, so the program prefers counterclockwise. You can change it by using >= in place of >.
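For comparison, the same calculation as a small Python sketch that returns the +1 / -1 values from the question. The name turn_direction and the 0 result for identical points are choices made for this illustration. Python's % already returns a non-negative result for a positive modulus, so the "+ 8" is not needed there.

```python
POINTS = 8  # number of compass points in the layout from the question


def turn_direction(start, end):
    """Return +1 (clockwise), -1 (anticlockwise) or 0 (start and end are the same point)."""
    if start == end:
        return 0
    # Numbers increase clockwise, so (start - end) mod 8 is the anticlockwise distance.
    anticlockwise_steps = (start - end) % POINTS
    # If going anticlockwise takes more than half the circle, clockwise is shorter.
    return 1 if anticlockwise_steps > POINTS // 2 else -1


print(turn_direction(7, 5), turn_direction(7, 0), turn_direction(3, 6))
```

As in the C# version, a distance of exactly 4 is a tie and this sketch resolves it as anticlockwise.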
Try this
int start=3;
int end=6;
var temp = start-end;
temp= temp < 0 ? temp + 8 : temp; // wrap around the 8 compass points
var result = temp < 4 ? -1 : 1;
I have an "infinite" 2D grid and I want to detect closed/complete "structures" - areas of any shape which are enclosed on all sides. However, I need to identify each individual closed circuit - including the larger shape, if any.
In researching this, I've discovered the cycle detection algorithm, but I don't see a clean/efficient way to separate the larger circuit from the smaller ones.
For example given the following two "complete" structures:
0 1 1 1 0
0 1 0 1 0
0 1 1 1 0
0 0 0 0 0
0 1 1 1 1 1
0 1 0 1 0 1
0 1 1 1 1 1
The first is a single cell enclosed by 8 "walls". The cycle detection makes it trivial to detect this.
The second example consists of two copies of example one but they share a wall. There are three separate circuits I care about - the left room, the right room, and the overall structure.
Multiple passes of a cycle algorithm might work, but I'd have to be sure I'm not retracing an already-found shape.
I've also looked at the flood fill algorithm, but it seems like it makes the assumption you already know a point inside the bounded area. With an infinite 2D grid I'd need a size limit to force it to give up if it's not in a valid structure.
Are there solutions I'm missing or have I missed something with my thinking?
I will only do this "check" when a boundary value is added. Using the example above, if I change any 0 -> 1, a new cycle has potentially been created and I'll run the logic. I do not care about identifying separate structures and will always have an origin coordinate.
I've been studying the solutions posted here but they're all based on already knowing which nodes are connected to other nodes. I've already toyed with logic that identifies each individual "line" and I can keep going from there, but it feels redundant.
I would do this like this:
0 0 0 0 0 0 0
0 1 1 1 1 1 0
0 1 0 1 0 1 0
0 1 1 1 1 1 0
0 0 0 0 0 0 0
fill the background with 2
To determine whether you are in the background, just cast a ray and count consecutive zeros. Once you find a location where the ray length is bigger than the circuit size limit, you have your start point.
[0]0-0-0-0-0-0
0 1 1 1 1 1 0
0 1 0 1 0 1 0
0 1 1 1 1 1 0
0 0 0 0 0 0 0
2 2 2 2 2 2 2
2 1 1 1 1 1 2
2 1 0 1 0 1 2
2 1 1 1 1 1 2
2 2 2 2 2 2 2
Do not use unbounded recursive flood fills for this! For an "infinite" area you will overflow the stack. You can limit the recursion level and, when the limit is reached, add the point to a queue for later processing instead of recursing. This usually speeds things up a lot and limits the stack usage...
find first 0
2 2 2 2 2 2 2
2 1 1 1 1 1 2
2 1[0]1 0 1 2
2 1 1 1 1 1 2
2 2 2 2 2 2 2
flood fill it with 3
2 2 2 2 2 2 2
2 1 1 1 1 1 2
2 1 3 1 0 1 2
2 1 1 1 1 1 2
2 2 2 2 2 2 2
select all 1 near 3
This is your circuit. If you remember the bounding box while flood filling with 3, then you only need to scan that area enlarged by one cell on each side. The selected cells are your circuit.
2 2 2 2 2 2 2
2 * * * 1 1 2
2 * 3 * 0 1 2
2 * * * 1 1 2
2 2 2 2 2 2 2
flood fill 3 with 2
This avoids processing already handled circuits again.
2 2 2 2 2 2 2
2 1 1 1 1 1 2
2 1 2 1 0 1 2
2 1 1 1 1 1 2
2 2 2 2 2 2 2
loop back to "find first 0" while any 0 is found
change all 2 back to 0
0 0 0 0 0 0 0
0 1 1 1 1 1 0
0 1 0 1 0 1 0
0 1 1 1 1 1 0
0 0 0 0 0 0 0
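The steps above can be sketched in Python roughly as follows. This is a simplified illustration under stated assumptions, not the answer's exact procedure: it works on a bounded window of the grid that has a ring of background cells around the structure, it seeds the background fill from the window border instead of casting rays, and it treats all 8 neighbours of an interior cell as candidate walls. The names flood and find_rooms are made up for the sketch.

```python
from collections import deque


def flood(grid, start, old, new):
    """Queue-based (non-recursive) 4-connected fill; returns the cells that were changed."""
    rows, cols = len(grid), len(grid[0])
    grid[start[0]][start[1]] = new
    queue = deque([start])
    cells = []
    while queue:
        r, c = queue.popleft()
        cells.append((r, c))
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == old:
                grid[nr][nc] = new
                queue.append((nr, nc))
    return cells


def find_rooms(source):
    """Return a list of (interior_cells, wall_cells) pairs, one per enclosed area."""
    grid = [row[:] for row in source]  # work on a copy, so no need to restore the 2s
    rows, cols = len(grid), len(grid[0])
    # Fill the background with 2, starting from every 0 on the window border.
    for r in range(rows):
        for c in range(cols):
            on_border = r in (0, rows - 1) or c in (0, cols - 1)
            if on_border and grid[r][c] == 0:
                flood(grid, (r, c), 0, 2)
    rooms = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 0:
                continue
            interior = flood(grid, (r, c), 0, 3)  # flood fill the enclosed area with 3
            walls = set()                         # select all 1s next to a 3
            for ir, ic in interior:
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = ir + dr, ic + dc
                        if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 1:
                            walls.add((nr, nc))
            rooms.append((interior, walls))
            flood(grid, (r, c), 3, 2)             # mark this area as processed
    return rooms


example = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 1, 0, 1, 0, 1, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
rooms = find_rooms(example)
print(len(rooms))
```

Each room comes back with its own wall set, and a shared wall simply appears in both sets. The outline of the overall structure is not computed here; it could be derived from the union of the wall sets.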
It is a contours finding problem.
One of the possible algorithms is described by Satoshi Suzuki and Keiichi Abe in their 1985 paper Topological Structural Analysis of Digitized Binary Images by Border Following. It is not trivial, but you can use OpenCV: its cv2.findContours() function implements this algorithm.
If you choose to use OpenCV, the solution is easy. You extract the contours along with their hierarchy. Contours that have at least one child (hole), together with their child contours, are the objects that you are looking for. Example using a managed OpenCV wrapper called OpenCvSharp:
byte[,] a = new byte[7, 6]
{
{ 0, 1, 1, 1, 0, 0 },
{ 0, 1, 0, 1, 0, 0 },
{ 0, 1, 1, 1, 0, 0 },
{ 0, 0, 0, 0, 0, 0 },
{ 0, 1, 1, 1, 1, 1 },
{ 0, 1, 0, 1, 0, 1 },
{ 0, 1, 1, 1, 1, 1 }
};
// Clone the matrix if you want to keep original array unmodified.
using (var mat = new MatOfByte(a.GetLength(0), a.GetLength(1), a))
{
// Turn 1 pixel values into 255.
Cv2.Threshold(mat, mat, thresh: 0, maxval: 255, type: ThresholdTypes.Binary);
// Note that in OpenCV Point.X is a matrix column index and Point.Y is a row index.
Point[][] contours;
HierarchyIndex[] hierarchy;
Cv2.FindContours(mat, out contours, out hierarchy, RetrievalModes.CComp, ContourApproximationModes.ApproxNone);
for (var i = 0; i < contours.Length; ++i)
{
var hasHole = hierarchy[i].Child > -1;
if (hasHole)
{
var externalContour = contours[i];
// Process external contour.
var holeIndex = hierarchy[i].Child;
do
{
var hole = contours[holeIndex];
// Process hole.
holeIndex = hierarchy[holeIndex].Next;
}
while (holeIndex > -1);
}
}
}
You can try a list of points and verify the ones that are linked.
class PointList : List<Point>
{
/// <summary>
/// Adds the point to the list and checks for perimeters
/// </summary>
/// <param name="point"></param>
/// <returns>Returns true if it created at least one structure</returns>
public bool AddAndVerify(Point point)
{
this.Add(point);
bool result = LookForPerimeter(point, point, point);
Console.WriteLine(result);
return result;
}
private bool LookForPerimeter(Point point, Point last, Point original)
{
foreach (Point linked in this.Where(p =>
(p.X == point.X -1 && p.Y == point.Y)
|| (p.X == point.X + 1 && p.Y == point.Y)
|| (p.X == point.X && p.Y == point.Y - 1)
|| (p.X == point.X && p.Y == point.Y + 1)
))
{
if (!linked.Equals(last))
{
if (linked == original) return true;
bool subResult = LookForPerimeter(linked, point, original);
if (subResult) return true;
}
}
return false;
}
}
This code is intended as a starting point; it probably has bugs and does not account for perimeters without a 0 inside.
Example of use:
class Program
{
static void Main(string[] args)
{
PointList list = new PointList();
list.AddAndVerify(new Point() { X = 0, Y = 0 }); //returns false
list.AddAndVerify(new Point() { X = 0, Y = 1 }); //returns false
list.AddAndVerify(new Point() { X = 0, Y = 2 }); //returns false
list.AddAndVerify(new Point() { X = 1, Y = 2 }); //returns false
list.AddAndVerify(new Point() { X = 2, Y = 2 }); //returns false
list.AddAndVerify(new Point() { X = 2, Y = 1 }); //returns false
list.AddAndVerify(new Point() { X = 2, Y = 0 }); //returns false
list.AddAndVerify(new Point() { X = 1, Y = 0 }); //returns True
}
}
Coming from a graph-theoretic view of the problem, you can interpret every 0 of your map as a node, neighboring 0s are connected with an edge. It sounds to me like what you want to do is compute the connected components of this graph (and maybe their connectivity by 1 values, to find 'neighboring rooms' of the same structure)
If you only want to compute this information once, a straightforward approach using a union-find data structure should suffice, where you apply union once per edge.
If you want to edit your map dynamically, the best approach based on the graph model would probably be some dynamic data structure that supports split or de-union operations, see for example here or here
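For the one-off case, a union-find pass over the 0 cells might look like the Python sketch below. It assumes a bounded grid held as a list of lists; the name zero_regions is made up for the illustration, and it only covers the static computation, not the dynamic split operations mentioned above.

```python
def zero_regions(grid):
    """Group the 0 cells of grid into 4-connected components using union-find."""
    rows, cols = len(grid), len(grid[0])
    # Every 0 cell starts as its own component.
    parent = {(r, c): (r, c)
              for r in range(rows) for c in range(cols) if grid[r][c] == 0}

    def find(cell):
        while parent[cell] != cell:
            parent[cell] = parent[parent[cell]]  # path halving keeps the trees shallow
            cell = parent[cell]
        return cell

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    # One union per edge: looking only down and right visits each edge exactly once.
    for r, c in list(parent):
        for neighbor in ((r + 1, c), (r, c + 1)):
            if neighbor in parent:
                union((r, c), neighbor)

    groups = {}
    for cell in parent:
        groups.setdefault(find(cell), set()).add(cell)
    return list(groups.values())


example = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 1, 0, 1, 0, 1, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
regions = zero_regions(example)
print(sorted(len(region) for region in regions))
```

On this example the components are the outside background and the two rooms; any component that does not touch the border of the window is an enclosed area.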
I had a similar problem trying to find all circles inside a 2D street map graph (given as a SVG file). As you state I too could not find an algorithm for that.
I found the following solution though.
Assumptions
Grid Layout:
Each '1' in the grid is in one of the following states (or a homomorphism of one of them):
1.   0     2.   0     3.   0     4.   0     5.   0     6.   1
   0 1 0      1 1 0      1 1 0      1 1 1      1 1 1      1 1 1
     0          0          1          0          1          1
But only examples 3 to 6 make sense for a connected wall, since in a connected wall each '1' has at least two '1's in its neighborhood.
Example 3 indicates a corner. This corner holds at most one structure.
Example 4 indicates a straight line. It can be the wall of zero, one or two structures.
Example 5 indicates a t-wall. It can be the wall of zero, one, two or three structures.
Example 6 indicates a cross-wall. It can be the corner of zero, one, two, three or four structures.
The Algorithm
The idea
Assuming the above, the algorithm works by finding a '1' and doing a depth first search, to mark all connected '1's. The traversed '1's are only marked, if the depth first search arrives at the starting position or at an already marked position.
The implementation
I will post an implementation the next few days for that one.
Re-posting my solution with an explanation and some code.
It took a few days before any answers were posted, so I tried to find a solution myself, and I believe I've found one that works very well for my needs.
Since I always have a starting point, I walk the edges from that point and fork the list of visited points each time the path "branches" off - allowing me to find multiple cycles.
Given a 2D grid with either a 1 or 0 in a cell:
0 1 1 1 1 1
0 1 0 1 0 1
0 1 1 1 1 1
Starting from a cell I already know is a 1, I begin my search:
1. For the current valid point:
   - add it to a "visited" list
   - look for any valid neighbors (except for the last point I visited, to avoid infinite loops)
2. For each valid neighbor:
   - clone the list of points which is our "trail" to this new point
   - call step 1 with the neighbor point
Cloning allows each "branch" to become a unique cycle without mixing points.
I haven't run any performance profiling, but it works very well given the examples I've thrown at it.
The search can report the same cycle twice. For example, if I start in the NW corner, cells to the east and south both have valid paths to follow. They're both treated as new paths and followed, but they're just mirror images of the same cycle. For now, I just prune cycles like these - they have exactly the same points, as long as you ignore their order.
There's also a bit of filtering involved - like the duplicate pruning just described, and trimming points if the end point matches a visited point that wasn't where we started. I think that's pretty much unavoidable and isn't a big deal, but if there was a clean way to avoid that I would. I can't know what "begins" a new cycle until I've found it though, so you know, linear time flow strikes again.
public class CycleDetection {
// Cache found cycles
List<Cycle> cycles = new List<Cycle>();
// Provide public readonly access to our cycle list
public ReadOnlyCollection<Cycle> Cycles {
get { return new ReadOnlyCollection<Cycle>(cycles); }
}
// Steps/slopes that determine how we iterate grid points
public Point[] Steps = new Point[] {
new Point(1, 0),
new Point(0, 1),
new Point(-1, 0),
new Point(0, -1)
};
// Cache our starting position
Point origin;
// Cache the validation function
Func<Point, bool> validator;
public CycleDetection(Point origin, Func<Point, bool> validator) {
this.origin = origin;
this.validator = validator;
this.Scan();
}
// Activate a new scan.
public void Scan() {
cycles.Clear();
if (validator(origin)) {
Scan(new List<Point>(), origin);
}
}
// Add a cycle to our final list.
// This ensures the cycle doesn't already exist (compares points, ignoring order).
void AddCycle(Cycle cycle) {
// Cycles have reached some existing point in the trail, but not necessarily
// the exact starting point. To filter out "strands" we find the index of
// the actual starting point and skip points that came before it
var index = cycle.Points.IndexOf(cycle.Points[cycle.Points.Count - 1]);
// Make a new object with only the points forming the exact cycle
// If the end point is the actual starting point, this has no effect.
cycle = new Cycle(cycle.Points.Skip(index).ToList());
// Add unless duplicate
if (!cycles.Contains(cycle)) {
cycles.Add(cycle);
}
}
// Scan a new point and follow any valid new trails.
void Scan(List<Point> trail, Point start) {
// Cycle completed?
if (trail.Contains(start)) {
// Add this position as the end point
trail.Add(start);
// Add the finished cycle
AddCycle(new Cycle(trail));
return;
}
trail.Add(start);
// Look for neighbors
foreach (var step in Steps) {
var neighbor = start + step;
// Make sure the neighbor isn't the last point we were on... that'd be an infinite loop
if (trail.Count >= 2 && neighbor.Equals(trail[trail.Count - 2])) {
continue;
}
// If neighbor is new and matches
if (validator(neighbor)) {
// Continue the trail with the neighbor
Scan(new List<Point>(trail), neighbor);
}
}
}
}
I've posted the full source here: https://github.com/OutpostOmni/OmniGraph (includes some unrelated graph utils as well)
How do I find the longest increasing sub-sequence of integers from a list of integers in C#?
You just need to break it down into a smaller problem, that of finding the length of an increasing sequence given a starting point.
In pseudo-code, that's something like:
def getSeqLen (int array[], int pos):
    for i = pos + 1 to array.last_element:
        if array[i] <= array[i-1]:
            return i - pos
    return array.last_element + 1 - pos
Then step through the array, looking at these individual sequences. You know that the sequences have to be separated at specific points since otherwise the sequences would be longer. In other words, there is no overlap of these increasing sequences:
def getLongestSeqLen (int array[]):
    pos = 0
    longlen = 0
    while pos <= array.last_element:
        len = getSeqLen (array, pos)
        if len > longlen:
            longlen = len
        pos = pos + len
    return longlen
By way of graphical explanation, consider the following sequence:
element#:  0  1  2  3  4  5  6  7  8  9 10 11 12
value:     9 10 12  7  8  9  6  5  6  7  8  7  8
           ^        ^        ^  ^           ^
In this case, the ^ characters mark the unambiguous boundaries of a subsequence.
By starting at element 0, getSeqLen returns 3. Since this is greater than the current longest length of 0, we save it and add 3 to the current position (to get 3).
Then at element 3, getSeqLen returns 3. Since this is not greater than the current longest length of 3, we ignore it but we still add 3 to the current position (to get 6).
Then at element 6, getSeqLen returns 1. Since this is not greater than the current longest length of 3, we ignore it but we still add 1 to the current position (to get 7).
Then at element 7, getSeqLen returns 4. Since this is greater than the current longest length of 3, we save it and add 4 to the current position (to get 11).
Then at element 11, getSeqLen returns 2. Since this is not greater than the current longest length of 4, we ignore it but we still add 2 to the current position (to get 13).
Then, since element 13 is beyond the end, we simply return the longest length found (4).
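The pseudo-code and walkthrough above translate almost line for line into runnable Python. The snake_case names are just Python conventions chosen for this sketch.

```python
def seq_len(values, pos):
    """Length of the increasing run that starts at index pos."""
    for i in range(pos + 1, len(values)):
        if values[i] <= values[i - 1]:
            return i - pos
    return len(values) - pos


def longest_seq_len(values):
    """Length of the longest contiguous increasing run in values."""
    pos = 0
    longest = 0
    while pos < len(values):
        length = seq_len(values, pos)
        longest = max(longest, length)
        pos += length  # runs never overlap, so jump straight to the next one
    return longest


sample = [9, 10, 12, 7, 8, 9, 6, 5, 6, 7, 8, 7, 8]
print(longest_seq_len(sample))
```

Every element is visited once in total, so the whole scan is linear in the length of the list.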
You want what is known as patience sorting. It can compute the length, and find the sequence.
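Note that this technique targets subsequences whose elements do not have to be adjacent, which is a different problem from finding the longest contiguous run. A minimal Python sketch of the usual binary-search formulation, computing only the length, could look like this; the name lis_length is made up for the illustration.

```python
from bisect import bisect_left


def lis_length(values):
    """Length of the longest strictly increasing (not necessarily contiguous) subsequence."""
    # tails[k] holds the smallest possible last value of an increasing
    # subsequence of length k + 1 seen so far; tails is always sorted.
    tails = []
    for value in values:
        k = bisect_left(tails, value)
        if k == len(tails):
            tails.append(value)  # value extends the longest subsequence found so far
        else:
            tails[k] = value     # value gives a smaller tail for length k + 1
    return len(tails)


print(lis_length([10, 9, 2, 5, 3, 7, 101, 18]))
```

Each element costs one binary search, which gives the O(n log n) running time. Recovering the subsequence itself requires additionally remembering a predecessor for each element.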
Here is my solution:
public static int[] FindLongestSequence(int[] seq)
{
    int c_min = 0, c_len = 1; // start and length of the current run
    int min = 0, len = 1;     // start and length of the best run (a single element is a run of 1)
    for (int i = 0; i < seq.Length - 1; i++)
    {
        if(seq[i] < seq[i+1])
        {
            c_len++;
            if (c_len > len)
            {
                len = c_len;
                min = c_min;
            }
        } else
        {
            c_min = i+1;
            c_len = 1;
        }
    }
    return seq.Skip(min).Take(len).ToArray();
}
Create three variables: two integer lists and an integer. Set the integer initially to int.MinValue. As you iterate the list, if the current value is greater than your integer variable, append it to list 1. When this is not the case, first copy list 1 to list 2 if it is longer than list 2, then clear list 1 and start it again with the current value. In both cases, set the integer to the current value. When you finish the sequence, return the longer list (and its length).
As a performance tip too, if your current longest substring is longer than the remainder of the string, you can call it quits there!
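Here is a hedged Python sketch of the two-list approach together with the early exit. It uses the last element of the current list in place of the separate integer variable, and the name longest_increasing_run is made up for the illustration.

```python
def longest_increasing_run(values):
    """Return the longest contiguous strictly increasing run as a list."""
    best = []     # "list 2": the longest run seen so far
    current = []  # "list 1": the run being built
    for index, value in enumerate(values):
        if current and value <= current[-1]:
            # The run ended: keep it if it beats the best one, then start over.
            if len(current) > len(best):
                best = current
            current = []
            # Early exit: too few items remain to beat the best run.
            if len(values) - index <= len(best):
                return best
        current.append(value)
    return current if len(current) > len(best) else best


sample = [9, 10, 12, 7, 8, 9, 6, 5, 6, 7, 8, 7, 8]
print(longest_increasing_run(sample))
```

The final comparison matters: without it, a longest run that ends at the last element would be lost.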
I have solved this in O(n log n) time here:
http://www.olhovsky.com/2009/11/extract-longest-increasing-sequence-from-any-sequence/
class SeqItem():
    """An item in the final sequence, used to form a linked list."""
    val = 0      # This item's value.
    prev = None  # The value before this one.

    def __init__(self, val, prev):
        self.val = val
        self.prev = prev


def extract_sorted(seq):
    """Extract longest non-decreasing subsequence from sequence seq."""
    subseqs = [SeqItem(seq[0], None)]  # Track decreasing subsequences in seq.
    result_list = [subseqs[0]]
    for i in range(1, len(seq)):
        result = search_insert(subseqs, seq[i], 0, len(subseqs))
    # Build Python list from custom linked list:
    final_list = []
    result = subseqs[-1]  # Longest nondecreasing subsequence is found by
                          # traversing the linked list backwards starting from
                          # the final smallest value in the last nonincreasing
                          # subsequence found.
    while result is not None and result.val is not None:
        final_list.append(result.val)
        result = result.prev  # Walk backwards through longest sequence.
    final_list.reverse()
    return final_list


def search_insert(seq, search_val, start, end):
    """Seq tracks the smallest value of each nonincreasing subsequence constructed.
    Find smallest item in seq that is greater than search_val.
    If such a value does not exist, append search_val to seq, creating the
    beginning of a new nonincreasing subsequence.
    If such a value does exist, replace the value in seq at that position, and
    search_val will be considered the new candidate for the longest subseq if
    a value in the following nonincreasing subsequence is added.
    Seq is guaranteed to be in increasing sorted order.
    Returns the index of the element in seq that should be added to results.
    """
    median = (start + end) // 2  # Integer division: a float cannot be used as an index.
    if end - start < 2:  # End of the search.
        if seq[start].val > search_val:
            if start > 0:
                new_item = SeqItem(search_val, seq[start - 1])
            else:
                new_item = SeqItem(search_val, None)
            seq[start] = new_item
            return new_item
        else:  # seq[start].val <= search_val
            if start + 1 < len(seq):
                new_item = SeqItem(search_val, seq[start])
                seq[start + 1] = new_item
                return new_item
            else:
                new_item = SeqItem(search_val, seq[start])
                seq.append(new_item)
                return new_item
    if search_val < seq[median].val:  # Search left side
        return search_insert(seq, search_val, start, median)
    else:  # search_val >= seq[median].val: Search right side
        return search_insert(seq, search_val, median, end)

Use the code like so:
import random

if __name__ == '__main__':
    seq = []
    for i in range(100000):
        seq.append(int(random.random() * 1000))
    print(extract_sorted(seq))
One way to do it is with help from the Aggregate method:
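var result = listOfInts.Aggregate(
    Tuple.Create(int.MinValue, new List<int>(), new List<int>()),
    (curr, next) =>
    {
        // Item1: previous value, Item2: current run, Item3: best run so far
        if (curr.Item1 > next)
        {
            var bestList = curr.Item2.Count > curr.Item3.Count ? curr.Item2 : curr.Item3;
            return Tuple.Create(next, new List<int> {next}, bestList);
        }
        curr.Item2.Add(next);
        return Tuple.Create(next, curr.Item2, curr.Item3);
    });
// The last run is never compared inside the lambda, so compare it here.
var bestSubSequence = result.Item2.Count > result.Item3.Count ? result.Item2 : result.Item3;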
It did not turn out as well as I had hoped when I started writing it, and I think the other, more direct ways to do it are better and easier to follow, but it might give a different perspective on how these kinds of tasks can be solved.