I have a one-dimensional array of floating point values (C# doubles, FYI) and I need to find the "peak" of the values, as if they were graphed.
I can't just take the highest value, because the peak is actually a plateau with small fluctuations, sitting in the middle of a bunch of noise. I'm looking for a solution that would give me the center of this plateau.
An example array might look like this:
1,2,1,1,2,1,3,2,4,4,4,5,6,8,8,8,8,7,8,7,9,7,5,4,4,3,3,2,2,1,1,1,1,1,2,1,1,1,1
where the peak is somewhere in the middle run of 8s.
Any ideas?
You can apply a low-pass filter to your input array, to smooth out the small fluctuations,
then find the peak in the filtered data. The simplest example is probably a "boxcar"
filter, where the output value is the sum of the input values within a certain distance
from the current array position. In pseudocode, it would look something like this:
for i = 0, samplecount-1
    if (i < boxcar_radius) or (i >= (samplecount - boxcar_radius)) then
        filtered_data[i] = 0    // boxcar runs off edge of input array, don't use
    else
        filtered_data[i] = 0
        for j = i-boxcar_radius, i+boxcar_radius
            filtered_data[i] = filtered_data[i] + input_data[j]
        endfor
    endif
endfor
If you have some idea how wide the "plateau" will be, you can choose the boxcar radius (approximately half the expected plateau width) to detect features at the appropriate scale.
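In C#, a minimal sketch of the same idea might look like this (returning the argmax of the smoothed data as the plateau centre is my addition, not part of the pseudocode above):

static int FindPeakIndex(double[] input, int boxcarRadius)
{
    int n = input.Length;
    var filtered = new double[n];
    // Boxcar-filter the interior; the edges stay 0, as in the pseudocode.
    for (int i = boxcarRadius; i < n - boxcarRadius; i++)
    {
        double sum = 0;
        for (int j = i - boxcarRadius; j <= i + boxcarRadius; j++)
            sum += input[j];
        filtered[i] = sum;
    }
    // The centre of the plateau is where the smoothed signal is largest.
    int peak = 0;
    for (int i = 1; i < n; i++)
        if (filtered[i] > filtered[peak])
            peak = i;
    return peak;
}

Dividing the sum by the window size would give a true moving average, but since that only rescales every value it makes no difference to where the maximum lands.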
You first need to define what you mean by 'small'. Say a 'small' fluctuation around the maximum is defined as any value within ±ϵ of the maximum. Then it is straightforward to identify the plateau:
pass through the data once to find the maximum, then do a second pass to collect all values within ±ϵ of it.
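A minimal C# sketch of those two passes, assuming the values within ±ϵ form a single contiguous plateau:

static int FindPlateauCentre(double[] values, double epsilon)
{
    // First pass: find the maximum.
    double max = values[0];
    for (int i = 1; i < values.Length; i++)
        if (values[i] > max) max = values[i];

    // Second pass: first and last index within epsilon of the maximum.
    int first = -1, last = -1;
    for (int i = 0; i < values.Length; i++)
    {
        if (values[i] >= max - epsilon)
        {
            if (first < 0) first = i;
            last = i;
        }
    }
    return (first + last) / 2;  // midpoint of the plateau
}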
Peak detection is one of the stages in Phase Correlation and other motion estimation algorithms used in places like video compression. One approach is this: consider a candidate for a peak and a window of a certain number of neighbours. Now fit a quadratic function using standard regression. The peak, with subpixel accuracy, is at the maximum of the fitted quadratic.
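The common three-point special case is short enough to write out: fit a parabola through the peak sample and its two neighbours and take its vertex. A hedged C# sketch (assumes peakIndex is not at either end of the array):

static double RefinePeak(double[] y, int peakIndex)
{
    double ym = y[peakIndex - 1], y0 = y[peakIndex], yp = y[peakIndex + 1];
    double denom = ym - 2 * y0 + yp;
    if (denom == 0) return peakIndex;         // perfectly flat: nothing to refine
    double offset = 0.5 * (ym - yp) / denom;  // vertex of the fitted parabola
    // offset lies in [-0.5, 0.5] when y[peakIndex] is a genuine local maximum
    return peakIndex + offset;
}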
Obviously the exact solution depends on the details. If your distribution is always as nice as in your example, you could do something like this:

def GetPeak(l):
    large = max(l) * 0.8
    above_large = [i for i in range(len(l)) if l[i] > large]
    left_peak = min(above_large)
    right_peak = max(above_large)
    return (left_peak, right_peak)
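The centre the question asks for is then just the midpoint of the two returned indices, e.g. (left_peak + right_peak) // 2.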
I have a graph input where the X axis is time (going forward). The Y axis is generally stable but has large drops and rises at different points (marked with red arrows in the original graph).
Visually it's obvious, but how do I efficiently detect this in code? I'm not sure which algorithms I should be using, but I would like to keep it as simple as possible.
A simple way is to calculate the difference between each pair of neighbouring samples, e.g. diff = abs(y[i] - y[i-1]), and then calculate the standard deviation of all the differences. This ranks the differences for you and also helps eliminate the random noise you get if you just pick the largest diff values.
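A rough C# rendering of that idea (the FindJumps name and the 2-sigma cut-off are my assumptions, not from the answer):

// needs: using System; using System.Collections.Generic; using System.Linq;
static List<int> FindJumps(double[] y, double sigmas = 2.0)
{
    // Differences between neighbouring samples.
    var diffs = new double[y.Length - 1];
    for (int i = 1; i < y.Length; i++)
        diffs[i - 1] = Math.Abs(y[i] - y[i - 1]);

    double mean = diffs.Average();
    double stdDev = Math.Sqrt(diffs.Select(d => (d - mean) * (d - mean)).Average());

    // Flag jumps that are outliers relative to the spread of all jumps.
    var jumps = new List<int>();
    for (int i = 0; i < diffs.Length; i++)
        if (diffs[i] > mean + sigmas * stdDev)
            jumps.Add(i + 1);  // index of the sample just after the jump
    return jumps;
}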
If your ups/downs span several x periods (e.g. temperature plotted every minute), then calculate the diff over N samples, taking the max and min of those N samples. If you want a detection period of 5 samples, take samples 0,1,2,3,4 and extract the min/max and use those for the diff; then repeat for samples 1,2,3,4,5 and so on. You may need to play with this, as too many samples starts to affect the stddev.
An alternative method is to calculate the slope of up/down parts of the chart by subsampling and selecting slopes and lengths that are interesting. While this can be more accurate for automated detection it is much harder to describe the algorithm in depth.
I've worked on similar issues and built a chart categoriser, but would really love references to research in this area.
When you get this going, you may also want to look at 'control charts' from operations research, they identify several patterns that might also be worth detecting, depending on what your charts are of.
I am working on a method which should decide whether or not a curve has a nearly constant slope.
There are of course x,y points involved. What I have done so far is divide the y of each data point by its x to get the slope at that data point. I store these slopes in a List<double>.
I think so far I am on the right track (tell me, please, if I am not!). Now it's time to decide whether I'm dealing with a constant-slope curve or not, so I ended up with the method below:
private bool IsConstantSlope(List<double> slopes)
{
    var max = slopes.Max();
    var min = slopes.Min();
    var diff = max - min;
    return diff <= 0.01;
}
So what I do here is take the maximum and minimum of the slopes and compare their difference to a custom threshold, which I believe is not good at all.
This method works well for perfectly constant-sloped lines, but I want to give it some flexibility; I don't think comparing the difference of the max and min values to a constant number is good practice.
I would appreciate more ideas!
There are of course x,y points involved. What I did so far is dividing the y of each data point by its x to get the slope of that data point. I store these slopes in a List<double>.
Strictly speaking, a point does not have a slope; what you are measuring here is the slope of the line that connects your point (x,y) to the point (0,0). So if you are doing this for an ordered set of points, the notion of having a single line is not quite correct; you don't even have the set of slopes of the lines that connect adjacent points. Also, in your function,
return (max > 0.01) || (min < -0.01);
is better if your threshold is 0.01.
If what you really want is a line that fits or approximates the set of points then you first need to perform some kind of straight line regression to your data and test the gradient of this approximating line to see if it is within your threshold limits.
This might be a useful read http://en.wikipedia.org/wiki/Simple_linear_regression
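For reference, the least-squares gradient can be computed directly without a library; a minimal sketch (the names are mine):

// needs: using System.Linq;
static double RegressionSlope(double[] x, double[] y)
{
    // Standard closed-form slope of the best-fit line through (x, y) pairs.
    int n = x.Length;
    double sumX = x.Sum(), sumY = y.Sum();
    double sumXY = x.Zip(y, (a, b) => a * b).Sum();
    double sumXX = x.Sum(a => a * a);
    return (n * sumXY - sumX * sumY) / (n * sumXX - sumX * sumX);
}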
Alternatively, you can order your points by their x value, then work out the slope between each consecutive pair (effectively generating a polyline), store those in your list, and then use your slope-comparison function, as sketched below.
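A sketch of that in C#, producing a list you could feed straight into the IsConstantSlope check from the question (the tuple representation is an assumption):

// needs: using System.Collections.Generic; using System.Linq;
static List<double> ConsecutiveSlopes(List<(double X, double Y)> points)
{
    // Order by x, then take the slope of each consecutive segment.
    // Assumes no two points share the same x value.
    var ordered = points.OrderBy(p => p.X).ToList();
    var slopes = new List<double>();
    for (int i = 1; i < ordered.Count; i++)
        slopes.Add((ordered[i].Y - ordered[i - 1].Y) /
                   (ordered[i].X - ordered[i - 1].X));
    return slopes;
}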
I would design a recursive algorithm working on the whole set of slopes; considering only the min/max slopes tells you nothing about the rest of the curve.
First of all, I would establish the requirement that two slopes A and B must fulfil in order to count as a "constant slope". Then I would take the first (A) and last (B) values in your list: do they satisfy the requirement? If no, there is no constant slope. If yes, subdivide the range (A,B) into two subranges (A,M) and (M,B), where M is the value equidistant, in the list, from A and B, and apply the same algorithm to each subrange. The number of subdivisions depends on the accuracy you want to achieve.
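A minimal recursive sketch of this idea in C#, assuming the "requirement" is that two slopes differ by no more than some eps, with a maximum depth standing in for the desired accuracy (both knobs are my assumptions):

// needs: using System; using System.Collections.Generic;
static bool IsNearlyConstant(IReadOnlyList<double> slopes, int lo, int hi,
                             double eps, int maxDepth)
{
    if (Math.Abs(slopes[hi] - slopes[lo]) > eps) return false;  // A,B fail the requirement
    if (maxDepth == 0 || hi - lo < 2) return true;
    int mid = (lo + hi) / 2;  // M, equidistant in the list from A and B
    return IsNearlyConstant(slopes, lo, mid, eps, maxDepth - 1)
        && IsNearlyConstant(slopes, mid, hi, eps, maxDepth - 1);
}
// call as: IsNearlyConstant(slopes, 0, slopes.Count - 1, 0.01, 6)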
I would like to find a fast algorithm for finding the x closest points to a given point in a plane.
We are not actually dealing with too many points (between 1,000 and 100,000), but I need the x closest points for each of them (where x will usually be between 5 and 20).
I need to write it in C#.
A bit more context about the use case: these points are coordinates on a map. (I know this means we are not exactly talking about a plane, but I hope to avoid dealing with projection issues.) In the end, points that have many other points close to them should be displayed in red, and points with few points close to them should be displayed in green. Between these two extremes the points are on a color gradient.
What you need is a data structure appropriate for organizing points in a plane. The K-D-Tree is often used in such situations. See k-d tree on Wikipedia.
Here, I found a general description of Geometric Algorithms
UPDATE
I ported a Java implementation of a KD-tree to C#. Please see User:Ojd/KD-Tree on RoboWiki. You can download the code there or you can download CySoft.Collections.zip directly from my homepage (only download, no docu).
For a given point (though not for all of them at once), and since the number of points is not extreme, you could calculate the distance from each point:
var points = new List<Point>();
Point source = ...
...

var closestPoints = points.Where(point => point != source)
                          .OrderBy(point => NotReallyDistanceButShouldDo(source, point))
                          .Take(20);

private double NotReallyDistanceButShouldDo(Point source, Point target)
{
    return Math.Pow(target.X - source.X, 2) + Math.Pow(target.Y - source.Y, 2);
}
(I've used x = 20)
The calculations are based on doubles, so the FPU should be able to do a decent job here.
Note that you might get better performance if Point is a class rather than a struct.
You need to create a distance function, then calculate the distance for every point, sort the results, and take the first x.
If the results must be 100% accurate then you can use the standard distance function:
d = SQRT((x2 - x1)^2 + (y2 - y1)^2)
To make this more efficient, let's say the cut-off distance is r. Take only the points with x coordinates between x-r and x+r, and similarly y between y-r and y+r, so you have removed all the excess points. Now compute the squared distance (x-x1)^2 + (y-y1)^2 for the remaining points and keep the k closest in a max-heap of k elements keyed on distance, replacing the root whenever a new point is closer than the current farthest. You then have the k closest points in the heap, as sketched below.
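A rough C# rendering of that idea, using .NET 6's PriorityQueue as the heap (the parameter names and the box half-width r are my assumptions):

// needs: using System; using System.Collections.Generic;
static List<(double X, double Y)> KClosest(
    List<(double X, double Y)> points, double px, double py, int k, double r)
{
    // PriorityQueue dequeues the *smallest* priority, so store the negated
    // squared distance to get max-heap-on-distance behaviour.
    var heap = new PriorityQueue<(double X, double Y), double>();
    foreach (var p in points)
    {
        // Bounding-box pre-filter: skip points outside the r-by-r box.
        if (Math.Abs(p.X - px) > r || Math.Abs(p.Y - py) > r) continue;
        double d2 = (p.X - px) * (p.X - px) + (p.Y - py) * (p.Y - py);
        heap.Enqueue(p, -d2);
        if (heap.Count > k) heap.Dequeue();  // drop the current farthest
    }
    // Dequeues farthest-first; if r was too small you may get fewer than k.
    var result = new List<(double X, double Y)>();
    while (heap.Count > 0) result.Add(heap.Dequeue());
    return result;
}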
I'm working on an algorithm to find peaks in a List<double>. I'd thought up what I believed was a good (or good enough) algorithm for doing this: look at a point and its neighbors and, if it's a peak, add it to the results list. However, given some recent results, I don't think this method works as well as I'd initially hoped. (I've included the code I'm currently using, and hope to replace, below.) I've done a little work with LabVIEW before and I know that the way their module finds peaks/valleys works for what I need to do. I did some research into how LabVIEW does this and found this:
"This Peak Detector VI is based on an algorithm that fits a quadratic polynomial to sequential groups of data points. The number of data points used in the fit is specified by width.
For each peak or valley, the quadratic fit is tested against the threshold. Peaks with heights lower than the threshold or valleys with troughs higher than the threshold are ignored. Peaks and valleys are detected only after the VI processes approximately width/2 data points beyond the location of the peak or valley. This delay has implications only for real-time processing."
Okay, so now I've been trying to do something similar in C#; however, in all my searching, it seems that fitting a quadratic polynomial to data is not exactly trivial. I'd have thought this problem would have been explored many, many times, but I've been unsuccessful in getting an algorithm that does this well, or in finding a library to do it with.
Any help with this problem is greatly appreciated. Thanks.
Original/Current Code:
public static List<double> FindPeaks(List<double> values, double rangeOfPeaks)
{
    List<double> peaks = new List<double>();
    int checksOnEachSide = (int)Math.Floor(rangeOfPeaks / 2);
    for (int i = checksOnEachSide; i < values.Count - checksOnEachSide; i++)
    {
        double current = values[i];
        IEnumerable<double> window = values;
        if (i > checksOnEachSide)
            window = window.Skip(i - checksOnEachSide);
        window = window.Take((int)rangeOfPeaks);
        if (current == window.Max())
            peaks.Add(current);
    }
    return peaks;
}
I have used Math.NET for matrix operations like this in C#. It has all the tools you might need for least-squares problems, such as QR decomposition or SVD. For a general overview of how to apply them, I think Wikipedia does quite a good job.
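For the quadratic fit itself, Math.NET's Fit.Polynomial will do the least-squares work; here is a sketch of refining one peak with it (the window handling and names are my assumptions, not the LabVIEW algorithm itself):

// needs: using System; using System.Linq; using MathNet.Numerics;
static double RefinePeakQuadratic(double[] values, int peakIndex, int halfWidth)
{
    // Take a window of samples around the candidate peak.
    int start = Math.Max(0, peakIndex - halfWidth);
    int end = Math.Min(values.Length - 1, peakIndex + halfWidth);
    double[] xs = Enumerable.Range(start, end - start + 1)
                            .Select(i => (double)i).ToArray();
    double[] ys = xs.Select(x => values[(int)x]).ToArray();

    // Least-squares fit of y = c[0] + c[1]*x + c[2]*x^2.
    double[] c = Fit.Polynomial(xs, ys, 2);
    if (c[2] >= 0) return peakIndex;  // not concave: keep the raw index
    return -c[1] / (2 * c[2]);        // vertex of the fitted parabola
}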
I want to compare the contrast of several images, and for that purpose I need to measure contrast. In fact I need local contrast, not global. I already have a solution that just compares the neighbouring pixels of each image pixel. The results are okay, but now I need to compare it with another algorithm which I do not understand. It is mentioned in this paper by Peli: http://www.eri.harvard.edu/faculty/peli/papers/ContrastJOSA.pdf and is called "band-limited contrast".
I understand it like this: transform the image to frequency space and apply a low-pass filter there. Fine. Then apply another low-pass filter with a frequency range one step higher??? I really don't know what I have to do next... When I apply a low-pass filter with a range from 0 to 100 and another from 0 to 101, then divide them and subtract 1, the result is not what I expected.
Does anybody know this kind of filter?
Thanks in advance
Matthias