How to get levels for Fry Graph readability formula?

I'm working on a C# application that applies readability formulas to a text, such as Gunning-Fog, Precise SMOG, and Flesch-Kincaid.
Now I need to implement the Fry-based Grade formula in my program. I understand the formula's logic: you take three 100-word samples, calculate the average number of sentences per 100 words and the average number of syllables per 100 words, and then plot those two values on a graph.
Here is a more detailed explanation on how this formula works.
I already have the averages, but I have no idea how to tell my program to "go check the graph, plot the values, and give me a level." I don't have to show the graph to the user; I only have to show the level.
I was thinking that maybe I can have all the values in memory, divided into levels, for example:
Level 1: values whose sentence average are between 10.0 and 25+, and whose syllables average are between 108 and 132.
Level 2: values whose sentence average are between 7.7 and 10.0, and .... so on
But the problem is that, so far, the only place I have found the values that define a level is the graph itself, and they aren't very accurate there. If I apply the approach described above and read the values off the graph, my level estimates would be too imprecise, so the Fry-based Grade would not be accurate.
So, maybe one of you knows where I can find exact values for the different levels of the Fry-based Grade, or can help me think of a way to work around this.
Thanks

Well, I'm not sure this is the most efficient solution, or the best one, but at least it does the job.
I gave up on the idea of having a math formula to get the levels. Maybe there is such a formula, but I couldn't find it.
So I took Fry's graph with all the levels, painted each level a different color, and then loaded the image in my program using:
Bitmap image = new Bitmap(@"C:\FryGraph.png");
Color color = image.GetPixel(x, y);
As you can see, after loading the image I use the GetPixel method to get the color at the specified coordinates. I had to do some conversion to get the pixel coordinates for a given value on the graph, since the scale of the graph does not match the pixel grid of the image.
In the end, I compare the color returned by GetPixel against the level colors to see which Fry readability level the text falls in.
I hope this helps someone who faces the same problem.
Cheers.

You simply need to determine the formula for the graph. That is, a formula that accepts the number of sentences and number of syllables, and returns the level.
If you can't find the formula, you can determine it yourself. Estimate the linear equation for each of the lines on the graph. Also estimate the 'out-of-bounds' areas in the 'long words' and 'long sentences' areas.
Now for each point, just determine the region in which it resides: which lines it is above and which lines it is below. This is fairly simple algebra. Unfortunately, this is the best link I can find to describe how to do that.
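The region test described above can be sketched as follows. This is a minimal illustration, assuming each boundary has already been estimated as y = m·x + b and that the boundaries are listed from the topmost down. `BoundaryLine` and `RegionFinder` are hypothetical names, and any slopes or intercepts passed in would have to be estimated from the graph yourself.

```csharp
using System;
using System.Collections.Generic;

// One estimated boundary line in the form y = Slope * x + Intercept.
public class BoundaryLine
{
    public int Level { get; }
    public double Slope { get; }
    public double Intercept { get; }

    public BoundaryLine(int level, double slope, double intercept)
    {
        Level = level;
        Slope = slope;
        Intercept = intercept;
    }

    // The point is above this line when the line's y at the same x is smaller.
    public bool IsBelowPoint(double x, double y) => Slope * x + Intercept < y;
}

public static class RegionFinder
{
    // Lines must be ordered from the topmost boundary down.
    // Returns the level of the first line that lies below the point,
    // or null when the point is below every line (an out-of-bounds area).
    public static int? FindLevel(IReadOnlyList<BoundaryLine> lines, double x, double y)
    {
        foreach (BoundaryLine line in lines)
        {
            if (line.IsBelowPoint(x, y))
                return line.Level;
        }
        return null;
    }
}
```

Returning null for a point below every line keeps the out-of-bounds case explicit instead of silently assigning it a level.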

I have made a first pass at solving this that I thought I would share in case someone else is looking in the future. I built on the answer above and created a generic list of linear equations that one can use to determine an approximate grade level. I first had to correct the values to make the axes more linear. This does not take the invalid areas into account, but I may revisit that.
The equation class:
public class GradeLineEquation
{
    // using form y = mx + b
    // or y = Slope * x + yIntercept
    public int GradeLevel { get; set; }
    public float Slope { get; set; }
    public float yIntercept { get; set; }
    public float GetYGivenX(float x)
    {
        return (Slope * x) + yIntercept;
    }
    public GradeLineEquation(int gradelevel, float slope, float yintercept)
    {
        this.GradeLevel = gradelevel;
        this.Slope = slope;
        this.yIntercept = yintercept;
    }
}
Here is the FryCalculator:
// requires: using System; using System.Collections.Generic;
public class FryCalculator
{
    // This class normalizes the plot on the Fry readability graph the same way a person would: by choosing points on the graph based on values, even though
    // the y-axis is non-linear and neither axis starts at 0. It just picks a relative point on each axis to plot the intercept of the zero and infinite slope lines.
    private List<GradeLineEquation> linedefs = new List<GradeLineEquation>();
    public FryCalculator()
    {
        LoadLevelEquations();
    }
    private void LoadLevelEquations()
    {
        // load the estimated linear equations for each line with the
        // grade level, slope, and y-intercept
        linedefs.Add(new GradeLineEquation(1, 0.5f, 22.5f));
        linedefs.Add(new GradeLineEquation(2, 0.5f, 20.5f));
        linedefs.Add(new GradeLineEquation(3, 0.6f, 17.4f));
        linedefs.Add(new GradeLineEquation(4, 0.6f, 15.4f));
        linedefs.Add(new GradeLineEquation(5, 0.625f, 13.125f));
        linedefs.Add(new GradeLineEquation(6, 0.833f, 7.333f));
        linedefs.Add(new GradeLineEquation(7, 1.05f, -1.15f));
        linedefs.Add(new GradeLineEquation(8, 1.25f, -8.75f));
        linedefs.Add(new GradeLineEquation(9, 1.75f, -24.25f));
        linedefs.Add(new GradeLineEquation(10, 2f, -35f));
        linedefs.Add(new GradeLineEquation(11, 2f, -40f));
        linedefs.Add(new GradeLineEquation(12, 2.5f, -58.5f));
        linedefs.Add(new GradeLineEquation(13, 3.5f, -93f));
        linedefs.Add(new GradeLineEquation(14, 5.5f, -163f));
    }
    public int GetGradeLevel(float avgSylls, float avgSentences)
    {
        // first normalize the values given to Cartesian positions on the graph
        float x = NormalizeX(avgSylls);
        float y = NormalizeY(avgSentences);
        // given x, find the first grade level equation that produces a lower y at that x
        GradeLineEquation match = linedefs.Find(a => a.GetYGivenX(x) < y);
        // Find returns null when the point lies below every line; report that as -1
        return match != null ? match.GradeLevel : -1;
    }
    private float NormalizeY(float avgSentenceCount)
    {
        float result = 0;
        int lower = -1;
        int upper = -1;
        // load the list of y-axis line intervals
        List<double> intervals = new List<double> { 2.0, 2.5, 3.0, 3.3, 3.5, 3.6, 3.7, 3.8, 4.0, 4.2, 4.3, 4.5, 4.8, 5.0, 5.2, 5.6, 5.9, 6.3, 6.7, 7.1, 7.7, 8.3, 9.1, 10.0, 11.1, 12.5, 14.3, 16.7, 20.0, 25.0 };
        // find the last line lower than or equal to the number we have
        lower = intervals.FindLastIndex(a => ((double)avgSentenceCount) >= a);
        // if we are not over the top or on the line, grab the next higher line value
        if (lower > -1 && lower < intervals.Count - 1 && ((float)intervals[lower] != avgSentenceCount))
            upper = lower + 1;
        // set the integer portion of the response
        result = (float)lower;
        // if we have an upper limit, calculate the fraction above the lower line (to two decimal places) and add it to the result
        if (upper != -1)
            result += (float)Math.Round((avgSentenceCount - intervals[lower]) / (intervals[upper] - intervals[lower]), 2);
        return result;
    }
    private float NormalizeX(float avgSyllableCount)
    {
        // the x-axis is MUCH simpler. Subtract 108 and divide by 2 to get the x position relative to a 0 origin.
        float result = (avgSyllableCount - 108) / 2;
        return result;
    }
}

Related

How to Calculate Accuracy %

I have the following scenario. I have a game in Unity where a player is given a varying number of targets (say 125 as an example). The accuracy is multi-class: Perfect (bullseye), Great, Good, and Miss (where the target is missed entirely and no points are awarded). I'm trying to find the right way to calculate an accuracy percentage in this scenario. If the player hits every target (125) as Perfect, the accuracy would be 100%. If they hit 124 Perfect and 1 Great, the accuracy percentage would still drop (99.8%) even though every target was hit. What would be the correct way to calculate this? Balanced Accuracy? Weighted Accuracy? Precision?
I'd like to understand the underlying calculation, not just how to implement this in code.
I appreciate any help I can get with this.

This can be calculated by assigning each accuracy a score between 0 and 100 (percentage) and then calculating the average score or arithmetic mean for all the shots.
You could use an enum to define the scores for the different accuracies.
public enum Accuracy
{
Perfect = 100,
Great = 80,
Good = 50,
Miss = 0
}
Then to calculate the average you just need to sum all the accuracy scores together and divide the result by the total number of shots.
int sum = 0;
foreach(Accuracy shot in shotsTaken)
{
sum += (int)shot;
}
double average = (double)sum / shotsTaken.Count;
Calculating the average can be simplified using System.Linq.
public class Tally
{
private readonly List<Accuracy> shotsTaken = new List<Accuracy>();
public void RecordShot(Accuracy shot) => shotsTaken.Add(shot);
public string CalculateAverageAccuracy() => shotsTaken.Average(shot => (int)shot).ToString("0.#") + "%";
}
You can test the results using this code:
[MenuItem("Test/Tally")]
public static void Test()
{
var tally = new Tally();
for(int i = 0; i < 124; i++)
{
tally.RecordShot(Accuracy.Perfect);
}
tally.RecordShot(Accuracy.Great);
Debug.Log(tally.CalculateAverageAccuracy());
}
Result: 99.8%

Logarithmic growth with min and max values

I am trying to "fake 3D" in a game in WPF. Think of a road, with objects appearing somewhere in the distance. As they get closer, they look bigger, and eventually they grow in size very fast.
I'm thinking that when the object appears, it's close to 0 in width and height. As it moves towards the player, it becomes closer to hundred percent of its true size.
I think I will need to solve this using logarithmic calculations, and there are several threads on that. What I would really want to do however, is to send in three values to a LogaritmicGrowth method:
the starting Y point
the point at which the object should appear at 100%
the y point where the object is at this very moment.
Thus, what I would like to get in return is the scaling factor for the object in question. So if it's halfway between the starting point and the ending point, then perhaps 0.3 (or so) should be returned.
I can write the method inputs and outputs myself, but need help with the calculation. Thanks!

I am not entirely sure about the use of log here. This is a simple geometry problem.
Think about a point P which is D distance in front of you, which has a height Y (from your line of observation). Your screen is d distance in front of you. The intersection point of the light from P on the screen is p, which makes a height y on screen.
Then, by considering the similar triangles, one can show that:
y = (Y/D) d
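The similar-triangles relation above can be turned into a scale factor directly. The sketch below assumes both distances are measured in the same unit; `Perspective`, `ScaleFactor` and `ProjectedHeight` are hypothetical names chosen for illustration.

```csharp
using System;

public static class Perspective
{
    // Similar triangles give y = (Y / D) * d, so the on-screen scale factor is d / D.
    // screenDistance is d, objectDistance is D; both must use the same unit.
    public static double ScaleFactor(double screenDistance, double objectDistance)
    {
        if (objectDistance <= 0)
            throw new ArgumentOutOfRangeException(nameof(objectDistance), "The object must be in front of the viewer.");
        return screenDistance / objectDistance;
    }

    // Projected height y of an object whose true height is Y (trueHeight).
    public static double ProjectedHeight(double trueHeight, double screenDistance, double objectDistance)
    {
        return trueHeight * ScaleFactor(screenDistance, objectDistance);
    }
}
```

For example, `Perspective.ProjectedHeight(100, 1, 4)` gives 25: an object four times as far away as the screen is drawn at a quarter of its true height.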

Just in case someone else is looking at this question in the future, here's the correct reply (I figured it out myself):
/// <summary>
/// Method that enlarges the kind of object sent in
/// </summary>
public void ExponentialGrowth2(string name, float startY, float endY)
{
float totalDistance = endY - startY;
float currentY = 0;
for (int i = 0; i < Bodies.Bodylist.Count; i++)
{
if (Bodies.Bodylist[i].Name.StartsWith(name)) //looks for all bodies of this type
{
currentY = Bodies.Bodylist[i].PosY;
float distance = currentY - startY + (float)Bodies.Bodylist[i].circle.Height;
float fraction = distance / totalDistance; //such as 0.8
Bodies.Bodylist[i].circle.Width = Bodies.Bodylist[i].OriginalWidth * Math.Pow(fraction, 3);
Bodies.Bodylist[i].circle.Height = Bodies.Bodylist[i].OriginalHeight * Math.Pow(fraction, 3);
}
}
}
The method could be developed further, for example by allowing randomized exponents (say from 1.5 to 4.5). Note that the higher the exponent, the greater the effect.
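The core of that calculation can be isolated into a function with the three inputs the question asks for. This is only a sketch: `Growth.ScaleFactor` is a hypothetical name, the clamping to the 0..1 range is an added assumption, and the circle-height offset used in the method above is left out.

```csharp
using System;

public static class Growth
{
    // Returns a scale factor between 0 (at startY) and 1 (at endY).
    // The fraction of the distance travelled is raised to 'exponent',
    // so growth is slow at first and fast near the end.
    public static double ScaleFactor(double startY, double endY, double currentY, double exponent = 3.0)
    {
        if (endY == startY)
            throw new ArgumentException("startY and endY must differ.");

        double fraction = (currentY - startY) / (endY - startY);
        // Clamp so positions before the start or past the end stay within 0..1.
        fraction = Math.Max(0.0, Math.Min(1.0, fraction));
        return Math.Pow(fraction, exponent);
    }
}
```

With the default exponent of 3, an object halfway along (`Growth.ScaleFactor(0, 100, 50)`) is drawn at 0.125 of its full size.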

Is there an algorithm to compute miles between coordinates?

I want to be able to display a Bing map in a Windows 8/Store app with an array of pushpins/waypoints at a zoom setting that will show every location, but no more than that - IOW, I want as much detail as possible while still showing all of the locations/coordinates.
I have this pseudocode:
public static int GetMapZoomSettingForCoordinates(List<String> coordinatesList)
{
string furthestNorth = GetFurthestNorth(coordinatesList);
string furthestSouth = GetFurthestSouth(coordinatesList);
string furthestEast = GetFurthestEast(coordinatesList);
string furthestWest = GetFurthestWest(coordinatesList);
int milesBetweenNorthAndSouthExtremes = GetMilesBetween(furthestNorth, furthestSouth);
int milesBetweenEastAndWestExtremes = GetMilesBetween(furthestEast, furthestWest);
int greaterCardinalDistance = Math.Max(milesBetweenNorthAndSouthExtremes, milesBetweenEastAndWestExtremes);
return GetZoomSettingForDistance(greaterCardinalDistance);
}
...but the "sticking point" (the hard part) is the "milesBetween" functions. Is there an existing algorithm for computing the miles between two coordinates?
I do realize this is a U.S.-centric bunch of code for now (miles vs. kilometers); that is, for now, as designed.
UPDATE
This is my new pseudocode (actual compiling code, but untested):
public static int GetMapZoomSettingForCoordinates(List<string> coordinatePairsList)
{
List<double> LatsList = new List<double>();
List<double> LongsList = new List<double>();
List<string> tempList = new List<string>();
foreach (string s in coordinatePairsList)
{
tempList.AddRange(s.Split(';'));
double dLat;
double.TryParse(tempList[0], out dLat);
double dLong;
double.TryParse(tempList[1], out dLong);
LatsList.Add(dLat);
LongsList.Add(dLong);
tempList.Clear();
}
double furthestNorth = GetFurthestNorth(LatsList);
double furthestSouth = GetFurthestSouth(LatsList);
double furthestEast = GetFurthestEast(LongsList);
double furthestWest = GetFurthestWest(LongsList);
int milesToDisplay =
HaversineInMiles(furthestWest, furthestNorth, furthestEast, furthestSouth);
return GetZoomSettingForDistance(milesToDisplay);
}
private static double GetFurthestNorth(List<double> latitudesList)
{
    // start below any possible value so all-negative (southern) latitudes work
    double northernmostVal = double.MinValue;
    foreach (double d in latitudesList)
    {
        if (d > northernmostVal)
        {
            northernmostVal = d;
        }
    }
    return northernmostVal;
}
...I still don't know what GetZoomSettingForDistance() should be/do, though...
UPDATE 2
This is "more better":
public static int GetMapZoomSettingForCoordinates(List<Tuple<double, double>> coordinatePairsList)
{
var LatsList = new List<double>();
var LongsList = new List<double>();
foreach (Tuple<double,double> tupDub in coordinatePairsList)
{
LatsList.Add(tupDub.Item1);
LongsList.Add(tupDub.Item2);
}
double furthestNorth = GetFurthestNorth(LongsList);
double furthestSouth = GetFurthestSouth(LongsList);
double furthestEast = GetFurthestEast(LatsList);
double furthestWest = GetFurthestWest(LatsList);
int milesToDisplay =
HaversineInMiles(furthestWest, furthestNorth, furthestEast, furthestSouth);
return GetZoomSettingForDistance(milesToDisplay);
}
UPDATE 3
I realized that my logic was backwards, or wrong, at any rate, regarding meridians of longitude and parallels of latitude. While it's true that meridians of longitude are the vertical lines ("drawn" North-to-South or vice versa) and that parallels of latitude are the horizontal lines ("drawn" East-to-West), points along those line represent the North-South location based on parallels of latitude, and represent East-West locations based on meridians of longitude. This seemed backwards in my mind until I visualized the lines spinning across (longitude) and up and over (latitude) the earth, rather than simply circling the earth like the rings of Saturn do; what also helped get my perception right was reminding myself that it is the values of the meridians of longitude that determine in which time zone one finds themselves. SO, the code above should change to pass latitudes to determine furthest North and furthest South, and conversely pass longitudes to determine furthest East and furthest West.
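The correction described in UPDATE 3 could look like the sketch below: latitudes decide North and South, longitudes decide East and West. `Bounds.GetExtremes` is a hypothetical helper, each tuple is assumed to be (latitude, longitude) in degrees, and the points are assumed not to straddle the 180° meridian.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Bounds
{
    // Each input tuple is (latitude, longitude) in degrees.
    // Returns (north, south, east, west):
    //   north/south come from the largest/smallest latitude,
    //   east/west come from the largest/smallest longitude.
    public static Tuple<double, double, double, double> GetExtremes(
        IEnumerable<Tuple<double, double>> coordinatePairs)
    {
        List<Tuple<double, double>> list = coordinatePairs.ToList();
        if (list.Count == 0)
            throw new ArgumentException("At least one coordinate is required.", nameof(coordinatePairs));

        double north = list.Max(p => p.Item1);
        double south = list.Min(p => p.Item1);
        double east = list.Max(p => p.Item2);
        double west = list.Min(p => p.Item2);
        return Tuple.Create(north, south, east, west);
    }
}
```

Using Min and Max over the whole list also avoids the pitfall of seeding the search with 0.0, which gives wrong answers when every value is negative.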

You can use the Haversine formula to compute the distance along the surface of a sphere.
Here's a C++ function to compute the distance using the Earth as the size of the sphere. It would easily be convertible to C#.
Note that the formula can be simplified if you want to just find the distance either latitudinally or longitudinally (which it sounds like you are trying to do).
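For reference, here is a C# sketch of the Haversine formula. It is not a translation of the C++ function mentioned above; `GeoDistance.HaversineMiles`, its parameter order, and the mean Earth radius of 3958.8 miles are assumptions made for this illustration.

```csharp
using System;

public static class GeoDistance
{
    // Mean Earth radius in miles (assumed value; the Earth is not a perfect sphere).
    private const double EarthRadiusMiles = 3958.8;

    private static double ToRadians(double degrees)
    {
        return degrees * Math.PI / 180.0;
    }

    // Great-circle distance in miles between two points given in degrees.
    public static double HaversineMiles(double lat1, double lon1, double lat2, double lon2)
    {
        double dLat = ToRadians(lat2 - lat1);
        double dLon = ToRadians(lon2 - lon1);

        double a = Math.Pow(Math.Sin(dLat / 2), 2)
                 + Math.Cos(ToRadians(lat1)) * Math.Cos(ToRadians(lat2))
                 * Math.Pow(Math.Sin(dLon / 2), 2);

        // Guard against rounding pushing 'a' slightly above 1.
        a = Math.Min(1.0, a);

        double c = 2 * Math.Atan2(Math.Sqrt(a), Math.Sqrt(1 - a));
        return EarthRadiusMiles * c;
    }
}
```

As a sanity check, one degree of longitude along the equator (`HaversineMiles(0, 0, 0, 1)`) comes out at roughly 69.1 miles.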

To get the straight line distance you use the Pythagorean Theorem to find the hypotenuse.
d = ((delta x)^2 + (delta y)^2)^.5
Basically square both the changes in the x direction and the y direction, add them, then take the square root.
In your pseudocode it looks like you could have many points and you want to find a maximum distance that encompasses all of them, which makes sense if you are trying to figure out a scale for the zoom of the map. The same formula should work: just use milesBetweenEastAndWestExtremes for delta x and milesBetweenNorthAndSouthExtremes for delta y. You may opt to add a fixed amount to this to make sure you don't have points right on the very edge of the map.

Looking for ideas how to refactor my algorithm

I am trying to write my own Game of Life, with my own set of rules. The first 'concept' I would like to apply is socialization (which basically means whether the cell wants to be alone or in a group with other cells). The data structure is a 2-dimensional array (for now).
In order to move a cell toward or away from a group of other cells, I need to determine where to move it. The idea is that I evaluate all the cells in the area (neighbours) and get a vector that tells me where to move the cell. The size of the vector is 0 or 1 (don't move or move) and the angle is one of an array of directions (up, down, right, left).
This is an image representing the forces on a cell, as I imagined it (but reach could be more than 5):
Let's for example take this picture:
Forces from lower left neighbour: down (0), up (2), right (2), left (0)
Forces from right neighbour : down (0), up (0), right (0), left (2)
sum : down (0), up (2), right (0), left (0)
So the cell should go up.
I could write an algorithm with a lot of if statements and check all cells in the neighbourhood. Of course this algorithm would be easiest if the 'reach' parameter is set to 1 (first column on picture 1). But what if I change the reach parameter to 10, for example? I would need to write an algorithm for each 'reach' value in advance... How can I avoid this (notice that the force grows exponentially (1, 2, 4, 8, 16, 32, ...))? Can I use a specific design pattern for this problem?
Also: the most important thing is not speed, but to be able to extend initial logic.
Things to take into consideration:
reach should be passed as a parameter
I would like to be able to change the function that calculates the force (exponential, Fibonacci)
a cell can go to a new place only if this new place is not populated
watch for corners (you can't evaluate right and top neighbours in top-right corner for example)

It should not be difficult to write your algorithm to search all of the cells within the reach distance of a particular cell C. Each cell that has an inhabitant would have a particular force of repulsion on cell C. This force of repulsion is based on the distance from the cell to cell C. In the example that you have given, that force of repulsion is based upon the L-1 distance and is 2^(reach-distance). Each repulsion force is then added together to create a cumulative force that dictates the direction in which to move the inhabitant in cell C.
You do not need to write an algorithm for each different reach. The magnitude of the force can be determined via a simple formula. If you change that formula to something else such as a Fibonacci number, you should still be able to calculate the magnitude as needed based upon the distance and the reach.
Here is some rough code written in pseudo-Java showing the basic ideas: http://codepad.org/K6zxnOAx
enum Direction {Left, Right, Up, Down, None};
Direction push(boolean board[][], int testX, int testY, int reach)
{
int xWeight = 0;
int yWeight = 0;
for (int xDist=-reach; xDist<=+reach; ++xDist)
{
for (int yDist=-reach; yDist<=+reach; ++yDist)
{
int normDist = abs(xDist) + abs(yDist);
if (0<normDist && normDist<=reach)
{
int x = testX + xDist;
int y = testY + yDist;
if (0<=x && x<board.length && 0<=y && y<board[0].length)
{
if (board[x][y])
{
int force = getForceMagnitude(reach, normDist);
xWeight += sign(xDist) * force;
yWeight += sign(yDist) * force;
}
}
}
}
}
if (xWeight==0 && yWeight==0) return Direction.None;
if (abs(xWeight) > abs(yWeight))
{
return xWeight<0 ? Direction.Left : Direction.Right;
}
else
{
return yWeight<0 ? Direction.Up : Direction.Down;
}
}
int getForceMagnitude(int reach, int distance)
{
return 1<<(reach-distance);
}
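Since the question asks for a way to swap the force function, the same idea can be sketched in C# with the magnitude passed in as a delegate. Everything here is illustrative: `Forces`, `NetForce` and the `Fibonacci` variant are hypothetical names, neighbours at Manhattan distance 1 through `reach` inclusive are counted, and the returned weights point toward the occupied neighbours (negate them for repulsion).

```csharp
using System;

public static class Forces
{
    // Power-of-two falloff: 2^(reach - distance).
    public static int PowerOfTwo(int reach, int distance)
    {
        return 1 << (reach - distance);
    }

    // Alternative falloff: the (reach - distance + 1)-th Fibonacci number (1, 1, 2, 3, 5, ...).
    public static int Fibonacci(int reach, int distance)
    {
        int n = reach - distance + 1;
        int a = 0, b = 1;
        for (int i = 0; i < n; i++)
        {
            int next = a + b;
            a = b;
            b = next;
        }
        return a;
    }

    // Sums the forces of all occupied cells within 'reach' of (cx, cy).
    // Returns (xWeight, yWeight); a positive xWeight means more weight at larger x.
    public static Tuple<int, int> NetForce(bool[,] board, int cx, int cy, int reach,
                                           Func<int, int, int> magnitude)
    {
        int xWeight = 0, yWeight = 0;
        for (int dx = -reach; dx <= reach; dx++)
        {
            for (int dy = -reach; dy <= reach; dy++)
            {
                int dist = Math.Abs(dx) + Math.Abs(dy);
                if (dist == 0 || dist > reach) continue;          // skip self and far cells
                int x = cx + dx, y = cy + dy;
                if (x < 0 || x >= board.GetLength(0)) continue;   // off the board
                if (y < 0 || y >= board.GetLength(1)) continue;
                if (!board[x, y]) continue;                       // empty cell
                int force = magnitude(reach, dist);
                xWeight += Math.Sign(dx) * force;
                yWeight += Math.Sign(dy) * force;
            }
        }
        return Tuple.Create(xWeight, yWeight);
    }
}
```

Changing the force law then means passing a different function, for example `Forces.NetForce(board, x, y, reach, Forces.Fibonacci)`, without touching the neighbour loop.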

Write a function to loop over the neighbors:
Use min/max to clamp the bounds of the matrix.
Use a for loop to loop over all neighbors.
Modify the for loop bounds to represent reach.
def CalculateForceOnCell(x, y):
    force_on_x_y = [0, 0, 0, 0]
    # clamp both loops to the board; the upper bound is exclusive
    for i in range(max(0, x - reach), min(WIDTH, x + reach + 1)):
        limited_reach = reach - abs(x - i)
        for j in range(max(0, y - limited_reach), min(HEIGHT, y + limited_reach + 1)):
            force_coefficient = limited_reach + 1
            AddNeighborForce(force_on_x_y, (x, y), (i, j), force_coefficient)
    return force_on_x_y

Is there any algorithm for calculating area of a shape given co-ordinates that define the shape?

So I have some function that receives N random 2D points.
Is there any algorithm to calculate area of the shape defined by the input points?

You want to calculate the area of a polygon?
(Taken from link, converted to C#)
class Point { public double x, y; }
double PolygonArea(Point[] polygon)
{
    // shoelace formula; vertices must be listed in order around the polygon
    double area = 0;
    for (int i = 0; i < polygon.Length; i++) {
        int j = (i + 1) % polygon.Length;
        area += polygon[i].x * polygon[j].y;
        area -= polygon[i].y * polygon[j].x;
    }
    area /= 2;
    return (area < 0 ? -area : area);
}

Defining the "area" of your collection of points may be hard, e.g. if you want to get the smallest region with straight line boundaries which enclose your set then I'm not sure how to proceed. Probably what you want to do is calculate the area of the convex hull of your set of points; this is a standard problem, a description of the problem with links to implementations of solutions is given by Steven Skiena at the Stony Brook Algorithms repository. From there one way to calculate the area (it seems to me to be the obvious way) would be to triangulate the region and calculate the area of each individual triangle.

You can use Timothy Chan's algorithm to find the convex hull in O(n log h), where n is the number of points and h is the number of convex hull vertices. If you want an easy algorithm, go for Graham scan.
Also, if you know that your data is ordered as a simple chain, where the segments don't cross each other, you can use Melkman's algorithm to compute the convex hull in O(n).
One more interesting property of the convex hull is that it has the minimum perimeter among shapes that enclose all the points.
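As an illustration of the "hull first, then area" approach, here is a sketch that uses Andrew's monotone chain (a sort-based relative of Graham scan, not Chan's or Melkman's algorithm) followed by the shoelace formula. `Pt` and `HullArea` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public struct Pt
{
    public double X, Y;
    public Pt(double x, double y) { X = x; Y = y; }
}

public static class HullArea
{
    // Cross product of (a - o) and (b - o); positive for a counter-clockwise turn.
    private static double Cross(Pt o, Pt a, Pt b)
    {
        return (a.X - o.X) * (b.Y - o.Y) - (a.Y - o.Y) * (b.X - o.X);
    }

    // Andrew's monotone chain, O(n log n). Returns hull vertices counter-clockwise.
    public static List<Pt> ConvexHull(IEnumerable<Pt> points)
    {
        List<Pt> sorted = points.OrderBy(p => p.X).ThenBy(p => p.Y).ToList();
        if (sorted.Count < 3) return sorted;

        var hull = new List<Pt>();
        // Build the lower hull, left to right.
        foreach (Pt p in sorted)
        {
            while (hull.Count >= 2 && Cross(hull[hull.Count - 2], hull[hull.Count - 1], p) <= 0)
                hull.RemoveAt(hull.Count - 1);
            hull.Add(p);
        }
        // Build the upper hull, right to left, without disturbing the lower hull.
        int lowerCount = hull.Count + 1;
        for (int i = sorted.Count - 2; i >= 0; i--)
        {
            Pt p = sorted[i];
            while (hull.Count >= lowerCount && Cross(hull[hull.Count - 2], hull[hull.Count - 1], p) <= 0)
                hull.RemoveAt(hull.Count - 1);
            hull.Add(p);
        }
        hull.RemoveAt(hull.Count - 1); // the last point repeats the first
        return hull;
    }

    // Shoelace formula for a polygon whose vertices are listed in order.
    public static double Area(IReadOnlyList<Pt> polygon)
    {
        double sum = 0;
        for (int i = 0; i < polygon.Count; i++)
        {
            Pt a = polygon[i];
            Pt b = polygon[(i + 1) % polygon.Count];
            sum += a.X * b.Y - a.Y * b.X;
        }
        return Math.Abs(sum) / 2;
    }
}
```

For the four corners of a unit square plus its centre point, `HullArea.Area(HullArea.ConvexHull(points))` gives 1, since the interior point is discarded by the hull step.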

Your problem does not directly imply that there's a ready-made polygon (which is assumed by this answer). I would recommend a triangulation such as a Delaunay Triangulation and then trivially compute the area of each triangle. OpenCV (I've used it with a large number of 2D points and it's very effective) and CGAL provide excellent implementations for determining the triangulation.

I found another function written in Java, so I translated it to C#:
public static double area(List<Double> lats,List<Double> lons)
{
double sum=0;
double prevcolat=0;
double prevaz=0;
double colat0=0;
double az0=0;
for (int i=0;i<lats.Count;i++)
{
double colat=2*Math.Atan2(Math.Sqrt(Math.Pow(Math.Sin(lats[i]*Math.PI/180/2), 2)+ Math.Cos(lats[i]*Math.PI/180)*Math.Pow(Math.Sin(lons[i]*Math.PI/180/2), 2)),
Math.Sqrt(1- Math.Pow(Math.Sin(lats[i]*Math.PI/180/2), 2)- Math.Cos(lats[i]*Math.PI/180)*Math.Pow(Math.Sin(lons[i]*Math.PI/180/2), 2)));
double az=0;
if (lats[i]>=90)
{
az=0;
}
else if (lats[i]<=-90)
{
az=Math.PI;
}
else
{
az=Math.Atan2(Math.Cos(lats[i]*Math.PI/180) * Math.Sin(lons[i]*Math.PI/180),Math.Sin(lats[i]*Math.PI/180))% (2*Math.PI);
}
if(i==0)
{
colat0=colat;
az0=az;
}
if(i>0 && i<lats.Count)
{
sum=sum+(1-Math.Cos(prevcolat + (colat-prevcolat)/2))*Math.PI*((Math.Abs(az-prevaz)/Math.PI)-2*Math.Ceiling(((Math.Abs(az-prevaz)/Math.PI)-1)/2))* Math.Sign(az-prevaz);
}
prevcolat=colat;
prevaz=az;
}
sum=sum+(1-Math.Cos(prevcolat + (colat0-prevcolat)/2))*(az0-prevaz);
return 5.10072E14* Math.Min(Math.Abs(sum)/4/Math.PI,1-Math.Abs(sum)/4/Math.PI);
}
