Unity formatting multiple numbers - c#

So I'm a complete newb to unity and c# and I'm trying to make my first mobile incremental game. I know how to format a variable from (e.g.) 1000 >>> 1k however I have several variables that can go up to decillion+ so I imagine having to check every variable's value seperately up to decillion+ will be quite inefficient. Being a newb I'm not sure how to go about it, maybe a for loop or something?
EDIT: I'm checking if x is greater than a certain value. For example if it's greater than 1,000, display 1k. If it's greater than 1,000,000, display 1m...etc etc
This is my current code for checking if x is greater than 1000 however I don't think copy pasting this against other values would be very efficient;
if (totalCash > 1000)
{
totalCashk = totalCash / 1000;
totalCashTxt.text = "$" + totalCashk.ToString("F1") + "k";
}

So, I agree that copying code is not efficient. That's why people invented functions!
How about simply wrapping your formatting into function, eg. named prettyCurrency?
So you can simply write:
totalCashTxt.text = prettyCurrency(totalCashk);
Also, instead of writing ton of ifs you can handle this case with logarithm with base of 10 to determine number of digits. Example in pure C# below:
using System.IO;
using System;
class Program
{
// Very simple example, gonna throw exception for numbers bigger than 10^12
static readonly string[] suffixes = {"", "k", "M", "G"};
static string prettyCurrency(long cash, string prefix="$")
{
int k;
if(cash == 0)
k = 0; // log10 of 0 is not valid
else
k = (int)(Math.Log10(cash) / 3); // get number of digits and divide by 3
var dividor = Math.Pow(10,k*3); // actual number we print
var text = prefix + (cash/dividor).ToString("F1") + suffixes[k];
return text;
}
static void Main()
{
Console.WriteLine(prettyCurrency(0));
Console.WriteLine(prettyCurrency(333));
Console.WriteLine(prettyCurrency(3145));
Console.WriteLine(prettyCurrency(314512455));
Console.WriteLine(prettyCurrency(31451242545));
}
}
OUTPUT:
$0.0
$333.0
$3.1k
$314.5M
$31.5G
Also, you might think about introducing a new type, which implements this function as its ToString() overload.
EDIT:
I forgot about 0 in input, now it is fixed. And indeed, as #Draco18s said in his comment nor int nor long will handle really big numbers, so you can either use external library like BigInteger or switch to double which will lose his precision when numbers becomes bigger and bigger. (e.g. 1000000000000000.0 + 1 might be equal to 1000000000000000.0). If you choose the latter you should change my function to handle numbers in range (0.0,1.0), for which log10 is negative.

Related

Positive addition operations producing a negative number

I am writing a script to take a series of number from a csv file and total them.
I have extracted the values from the csv into a List<string> and am looping over that to add them together.
The numbers are millisecond representations of each minute in the day, so generally starting at 0 and incrementing by 6000.
For some reason though, the final numbers appear to be negative.
I am checking at the end of the addition operations and the final count is less than 1.
I tried printing the numbers to the console and they are correct, I guess something wrong somewhere else?
Screenshot of sample out
Thanks in advance.
var totalSeconds = 0;
var minutesCounted = 0;
var unzippedFolder = Compression.UnzipToFolder(zipPath);
var listOfSeconds = ReadCsvIndex(unzippedFolder[0], ",", 0, true);
foreach (var second in listOfSeconds)
{
// Console.WriteLine(Int32.Parse(second)); // Prints correct numbers
totalSeconds += Int32.Parse(second);
minutesCounted++;
Console.WriteLine(minutesCounted + totalSeconds);
}
Console.WriteLine(security + totalSeconds);
Console.WriteLine(minutesCounted);
File.Delete(unzippedFolder[0]);
if (totalSeconds > 1)
{
Console.WriteLine(true);
}
else
{
Console.WriteLine(false); // This is returning false
}
Console.ReadLine();
Use a long value instead of the default int for that totalSeconds.
The value apparently goes above Int32.MaxValue, so the value wraps around to negative that.
A long has a much higher max value, so you will not get this overflow effect (at least, not as soon)

How to compare large string integer values

Currently I am working on a program that processes extremely large integernumbers .
To prevent hitting the intiger.maxvalue a script that processes strings as numbers, and splits them up into a List<int>as following
0 is the highest currently known value
list entry 0: 123 (hundred twenty three million)
list entry 1: 321 (three hundred twenty one thousand)
list entry 2: 777 (seven hundred seventy seven)
Now my question is: How would one check if the incoming string value is sub tractable from these values?
The start for subtraction I currently made is as following, but I am getting stuck on the subtracting part.
public bool Subtract(string value)
{
string cleanedNumeric = NumericAndSpaces(value);
List<string> input = new List<string>(cleanedNumeric.Split(' '));
// In case 1) the amount is bigger 2) biggest value exceeded by a 10 fold
// 3) biggest value exceeds the value
if (input.Count > values.Count ||
input[input.Count - 1].Length > values[0].ToString().Length ||
FastParseInt(input[input.Count -1]) > values[0])
return false;
// Flip the array for ease of comparison
input.Reverse();
return true;
}
EDIT
Current target for the highest achievable number in this program is a Googolplex And are limited to .net3.5 MONO
You should do some testing on this because I haven't run extensive tests but it has worked on the cases I've put it through. Also, it might be worth ensuring that each character in the string is truly a valid integer as this procedure would bomb given a non-integer character. Finally, it expects positive numbers for both subtrahend and minuend.
static void Main(string[] args)
{
// In subtraction, a subtrahend is subtracted from a minuend to find a difference.
string minuend = "900000";
string subtrahend = "900001";
var isSubtractable = IsSubtractable(subtrahend, minuend);
}
public static bool IsSubtractable(string subtrahend, string minuend)
{
minuend = minuend.Trim();
subtrahend = subtrahend.Trim();
// maybe loop through characters and ensure all are valid integers
// check if the original number is longer - clearly subtractable
if (minuend.Length > subtrahend.Length) return true;
// check if original number is shorter - not subtractable
if (minuend.Length < subtrahend.Length) return false;
// at this point we know the strings are the same length, so we'll
// loop through the characters, one by one, from the start, to determine
// if the minued has a higher value character in a column of the number.
int numberIndex = 0;
while (numberIndex < minuend.Length )
{
Int16 minuendCharValue = Convert.ToInt16(minuend[numberIndex]);
Int16 subtrahendCharValue = Convert.ToInt16(subtrahend[numberIndex]);
if (minuendCharValue > subtrahendCharValue) return true;
if (minuendCharValue < subtrahendCharValue) return false;
numberIndex++;
}
// number are the same
return true;
}
[BigInteger](https://msdn.microsoft.com/en-us/library/system.numerics.biginteger.aspx) is of aribtary size.
Run this code if you don't believe me
var foo = new BigInteger(2);
while (true)
{
foo = foo * foo;
}
Things get crazy. My debugger (VS2013) becomes unable to represent the number before it's done. ran it for a short time and got a number with 1.2 million digits in base 10 from ToString. It is big enough. There is a 2GB limit on object, which can be overriden in .NET 4.5 with the setting gcAllowVeryLargeObjects
Now what to do if you are using .NET 3.5? You basically need to reimplement BigInteger (obviously only taking what you need, there is a lot in there).
public class MyBigInteger
{
uint[] _bits; // you need somewhere to store the value to an arbitrary length.
....
You also need to perform maths on these arrays. here is the Equals method from BigInteger:
public bool Equals(BigInteger other)
{
AssertValid();
other.AssertValid();
if (_sign != other._sign)
return false;
if (_bits == other._bits)
// _sign == other._sign && _bits == null && other._bits == null
return true;
if (_bits == null || other._bits == null)
return false;
int cu = Length(_bits);
if (cu != Length(other._bits))
return false;
int cuDiff = GetDiffLength(_bits, other._bits, cu);
return cuDiff == 0;
}
It basically does cheap length and sign comparisons of the byte arrays, then, if that doesn't produce a difference hands off to GetDiffLength.
internal static int GetDiffLength(uint[] rgu1, uint[] rgu2, int cu)
{
for (int iv = cu; --iv >= 0; )
{
if (rgu1[iv] != rgu2[iv])
return iv + 1;
}
return 0;
}
Which does the expensive check of looping through the arrays looking for a difference.
All you math will have to follow this pattern and can largely be ripped of from the .Net source code.
Googleplex and 2GB:
Here the 2GB limit becomes a problem, because you will be needing an object size of 3.867×10^90 gigabyte. This the the point where you give up, or get clever and store objects as powers at the cost of not being able to represent a lot of them. *2
if you moderate your expectations, it doesn't actually change the maths of BigInteger to split _bits into multiple jagged arrays *1. You change the cheap checks a bit. Rather than checking the size of the array, you check the number of subarrays and then the size of the last one. Then the loop needs to be a bit more (but not much) more complex in that it does elementwise array comparison for each sub array. There are other changes as well, but it's by no means impossible and gets you out of the 2GB limit.
*1 Note use jagged arrays[][], not multidimensional arrays [,] which are still subject to the same limit.
*2 Ie give up on precision and store the mantissa and exponent. If you look how floating point numbers are implemented they can't represent all numbers between their max and min (as the number of real numbers in a range is 'bigger' than infinite). They make a complex trade off between precision and range. If you are wanting to do this, looking at float implementations will be a lot more useful than taking about integer representations like Biginteger.

Constantly Incrementing String

So, what I'm trying to do this something like this: (example)
a,b,c,d.. etc. aa,ab,ac.. etc. ba,bb,bc, etc.
So, this can essentially be explained as generally increasing and just printing all possible variations, starting at a. So far, I've been able to do it with one letter, starting out like this:
for (int i = 97; i <= 122; i++)
{
item = (char)i
}
But, I'm unable to eventually add the second letter, third letter, and so forth. Is anyone able to provide input? Thanks.
Since there hasn't been a solution so far that would literally "increment a string", here is one that does:
static string Increment(string s) {
if (s.All(c => c == 'z')) {
return new string('a', s.Length + 1);
}
var res = s.ToCharArray();
var pos = res.Length - 1;
do {
if (res[pos] != 'z') {
res[pos]++;
break;
}
res[pos--] = 'a';
} while (true);
return new string(res);
}
The idea is simple: pretend that letters are your digits, and do an increment the way they teach in an elementary school. Start from the rightmost "digit", and increment it. If you hit a nine (which is 'z' in our system), move on to the prior digit; otherwise, you are done incrementing.
The obvious special case is when the "number" is composed entirely of nines. This is when your "counter" needs to roll to the next size up, and add a "digit". This special condition is checked at the beginning of the method: if the string is composed of N letters 'z', a string of N+1 letter 'a's is returned.
Here is a link to a quick demonstration of this code on ideone.
Each iteration of Your for loop is completely
overwriting what is in "item" - the for loop is just assigning one character "i" at a time
If item is a String, Use something like this:
item = "";
for (int i = 97; i <= 122; i++)
{
item += (char)i;
}
something to the affect of
public string IncrementString(string value)
{
if (string.IsNullOrEmpty(value)) return "a";
var chars = value.ToArray();
var last = chars.Last();
if(char.ToByte() == 122)
return value + "a";
return value.SubString(0, value.Length) + (char)(char.ToByte()+1);
}
you'll probably need to convert the char to a byte. That can be encapsulated in an extension method like static int ToByte(this char);
StringBuilder is a better choice when building large amounts of strings. so you may want to consider using that instead of string concatenation.
Another way to look at this is that you want to count in base 26. The computer is very good at counting and since it always has to convert from base 2 (binary), which is the way it stores values, to base 10 (decimal--the number system you and I generally think in), converting to different number bases is also very easy.
There's a general base converter here https://stackoverflow.com/a/3265796/351385 which converts an array of bytes to an arbitrary base. Once you have a good understanding of number bases and can understand that code, it's a simple matter to create a base 26 counter that counts in binary, but converts to base 26 for display.

Looking for a way to optimize this algorithm for parsing a very large string

The following class parses through a very large string (an entire novel of text) and breaks it into consecutive 4-character strings that are stored as a Tuple. Then each tuple can be assigned a probability based on a calculation. I am using this as part of a monte carlo/ genetic algorithm to train the program to recognize a language based on syntax only (just the character transitions).
I am wondering if there is a faster way of doing this. It takes about 400ms to look up the probability of any given 4-character tuple. The relevant method _Probablity() is at the end of the class.
This is a computationally intensive problem related to another post of mine: Algorithm for computing the plausibility of a function / Monte Carlo Method
Ultimately I'd like to store these values in a 4d-matrix. But given that there are 26 letters in the alphabet that would be a HUGE task. (26x26x26x26). If I take only the first 15000 characters of the novel then performance improves a ton, but my data isn't as useful.
Here is the method that parses the text 'source':
private List<Tuple<char, char, char, char>> _Parse(string src)
{
var _map = new List<Tuple<char, char, char, char>>();
for (int i = 0; i < src.Length - 3; i++)
{
int j = i + 1;
int k = i + 2;
int l = i + 3;
_map.Add
(new Tuple<char, char, char, char>(src[i], src[j], src[k], src[l]));
}
return _map;
}
And here is the _Probability method:
private double _Probability(char x0, char x1, char x2, char x3)
{
var subset_x0 = map.Where(x => x.Item1 == x0);
var subset_x0_x1_following = subset_x0.Where(x => x.Item2 == x1);
var subset_x0_x2_following = subset_x0_x1_following.Where(x => x.Item3 == x2);
var subset_x0_x3_following = subset_x0_x2_following.Where(x => x.Item4 == x3);
int count_of_x0 = subset_x0.Count();
int count_of_x1_following = subset_x0_x1_following.Count();
int count_of_x2_following = subset_x0_x2_following.Count();
int count_of_x3_following = subset_x0_x3_following.Count();
decimal p1;
decimal p2;
decimal p3;
if (count_of_x0 <= 0 || count_of_x1_following <= 0 || count_of_x2_following <= 0 || count_of_x3_following <= 0)
{
p1 = e;
p2 = e;
p3 = e;
}
else
{
p1 = (decimal)count_of_x1_following / (decimal)count_of_x0;
p2 = (decimal)count_of_x2_following / (decimal)count_of_x1_following;
p3 = (decimal)count_of_x3_following / (decimal)count_of_x2_following;
p1 = (p1 * 100) + e;
p2 = (p2 * 100) + e;
p3 = (p3 * 100) + e;
}
//more calculations omitted
return _final;
}
}
EDIT - I'm providing more details to clear things up,
1) Strictly speaking I've only worked with English so far, but its true that different alphabets will have to be considered. Currently I only want the program to recognize English, similar to whats described in this paper: http://www-stat.stanford.edu/~cgates/PERSI/papers/MCMCRev.pdf
2) I am calculating the probabilities of n-tuples of characters where n <= 4. For instance if I am calculating the total probability of the string "that", I would break it down into these independent tuples and calculate the probability of each individually first:
[t][h]
[t][h][a]
[t][h][a][t]
[t][h] is given the most weight, then [t][h][a], then [t][h][a][t]. Since I am not just looking at the 4-character tuple as a single unit, I wouldn't be able to just divide the instances of [t][h][a][t] in the text by the total no. of 4-tuples in the next.
The value assigned to each 4-tuple can't overfit to the text, because by chance many real English words may never appear in the text and they shouldn't get disproportionally low scores. Emphasing first-order character transitions (2-tuples) ameliorates this issue. Moving to the 3-tuple then the 4-tuple just refines the calculation.
I came up with a Dictionary that simply tallies the count of how often the tuple occurs in the text (similar to what Vilx suggested), rather than repeating identical tuples which is a waste of memory. That got me from about ~400ms per lookup to about ~40ms per, which is a pretty great improvement. I still have to look into some of the other suggestions, however.
In yoiu probability method you are iterating the map 8 times. Each of your wheres iterates the entire list and so does the count. Adding a .ToList() ad the end would (potentially) speed things. That said I think your main problem is that the structure you've chossen to store the data in is not suited for the purpose of the probability method. You could create a one pass version where the structure you store you're data in calculates the tentative distribution on insert. That way when you're done with the insert (which shouldn't be slowed down too much) you're done or you could do as the code below have a cheap calculation of the probability when you need it.
As an aside you might want to take puntuation and whitespace into account. The first letter/word of a sentence and the first letter of a word gives clear indication on what language a given text is written in by taking punctuaion charaters and whitespace as part of you distribution you include those characteristics of the sample data. We did that some years back. Doing that we shown that using just three characters was almost as exact (we had no failures with three on our test data and almost as exact is an assumption given that there most be some weird text where the lack of information would yield an incorrect result). as using more (we test up till 7) but the speed of three letters made that the best case.
EDIT
Here's an example of how I think I would do it in C#
class TextParser{
private Node Parse(string src){
var top = new Node(null);
for (int i = 0; i < src.Length - 3; i++){
var first = src[i];
var second = src[i+1];
var third = src[i+2];
var fourth = src[i+3];
var firstLevelNode = top.AddChild(first);
var secondLevelNode = firstLevelNode.AddChild(second);
var thirdLevelNode = secondLevelNode.AddChild(third);
thirdLevelNode.AddChild(fourth);
}
return top;
}
}
public class Node{
private readonly Node _parent;
private readonly Dictionary<char,Node> _children
= new Dictionary<char, Node>();
private int _count;
public Node(Node parent){
_parent = parent;
}
public Node AddChild(char value){
if (!_children.ContainsKey(value))
{
_children.Add(value, new Node(this));
}
var levelNode = _children[value];
levelNode._count++;
return levelNode;
}
public decimal Probability(string substring){
var node = this;
foreach (var c in substring){
if(!node.Contains(c))
return 0m;
node = node[c];
}
return ((decimal) node._count)/node._parent._children.Count;
}
public Node this[char value]{
get { return _children[value]; }
}
private bool Contains(char c){
return _children.ContainsKey(c);
}
}
the usage would then be:
var top = Parse(src);
top.Probability("test");
I would suggest changing the data structure to make that faster...
I think a Dictionary<char,Dictionary<char,Dictionary<char,Dictionary<char,double>>>> would be much more efficient since you would be accessing each "level" (Item1...Item4) when calculating... and you would cache the result in the innermost Dictionary so next time you don't have to calculate at all..
Ok, I don't have time to work out details, but this really calls for
neural classifier nets (Just take any off the shelf, even the Controllable Regex Mutilator would do the job with way more scalability) -- heuristics over brute force
you could use tries (Patricia Tries a.k.a. Radix Trees to make a space optimized version of your datastructure that can be sparse (the Dictionary of Dictionaries of Dictionaries of Dictionaries... looks like an approximation of this to me)
There's not much you can do with the parse function as it stands. However, the tuples appear to be four consecutive characters from a large body of text. Why not just replace the tuple with an int and then use the int to index the large body of text when you need the character values. Your tuple based method is effectively consuming four times the memory the original text would use, and since memory is usually the bottleneck to performance, it's best to use as little as possible.
You then try to find the number of matches in the body of text against a set of characters. I wonder how a straightforward linear search over the original body of text would compare with the linq statements you're using? The .Where will be doing memory allocation (which is a slow operation) and the linq statement will have parsing overhead (but the compiler might do something clever here). Having a good understanding of the search space will make it easier to find an optimal algorithm.
But then, as has been mentioned in the comments, using a 264 matrix would be the most efficent. Parse the input text once and create the matrix as you parse. You'd probably want a set of dictionaries:
SortedDictionary <int,int> count_of_single_letters; // key = single character
SortedDictionary <int,int> count_of_double_letters; // key = char1 + char2 * 32
SortedDictionary <int,int> count_of_triple_letters; // key = char1 + char2 * 32 + char3 * 32 * 32
SortedDictionary <int,int> count_of_quad_letters; // key = char1 + char2 * 32 + char3 * 32 * 32 + char4 * 32 * 32 * 32
Finally, a note on data types. You're using the decimal type. This is not an efficient type as there is no direct mapping to CPU native type and there is overhead in processing the data. Use a double instead, I think the precision will be sufficient. The most precise way will be to store the probability as two integers, the numerator and denominator and then do the division as late as possible.
The best approach here is to using sparse storage and pruning after each each 10000 character for example. Best storage strucutre in this case is prefix tree, it will allow fast calculation of probability, updating and sparse storage. You can find out more theory in this javadoc http://alias-i.com/lingpipe/docs/api/com/aliasi/lm/NGramProcessLM.html

Is there a simple method of converting an ordinal numeric string to its matching numeric value?

Does anyone know of a method to convert words like "first", "tenth" and "one hundredth" to their numeric equivalent?
Samples:
"first" -> 1,
"second" -> 2,
"tenth" -> 10,
"hundredth" -> 100
Any algorithm will suffice but I'm writing this in C#.
EDIT
It ain't pretty and only works with one word at a time but it suits my purposes. Maybe someone can improve it but I'm out of time.
public static int GetNumberFromOrdinalString(string inputString)
{
string[] ordinalNumberWords = { "", "first", "second", "third", "fourth", "fifth", "sixth", "seventh", "eighth", "ninth", "tenth", "eleventh", "twelfth", "thirteenth", "fourteenth", "fifteenth", "sixteenth", "seventeenth", "eighteenth", "nineteenth", "twentieth" };
string[] ordinalNumberWordsTens = { "", "tenth", "twentieth", "thirtieth", "fortieth", "fiftieth", "sixtieth", "seventieth", "eightieth", "ninetieth" };
string[] ordinalNumberWordsExtended = {"hundredth", "thousandth", "millionth", "billionth" };
if (inputString.IsNullOrEmpty() || inputString.Length < 5 || inputString.Contains(" ")) return 0;
if (ordinalNumberWords.Contains(inputString) || ordinalNumberWordsTens.Contains(inputString))
{
var outputMultiplier = ordinalNumberWords.Contains(inputString) ? 1 : 10;
var arrayToCheck = ordinalNumberWords.Contains(inputString) ? ordinalNumberWords : ordinalNumberWordsTens;
// Use the loop counter to get our output integer.
for (int x = 0; x < arrayToCheck.Count(); x++)
{
if (arrayToCheck[x] == inputString)
{
return x * outputMultiplier;
}
}
}
// Check if the number is one of our extended numbers and return the appropriate value.
if (ordinalNumberWordsExtended.Contains(inputString))
{
return inputString == ordinalNumberWordsExtended[0] ? 100 : inputString == ordinalNumberWordsExtended[1] ? 1000 : inputString == ordinalNumberWordsExtended[2] ? 1000000 : 1000000000;
}
return 0;
}
I've never given this much thought beyond I know the word "and" is supposed to be the transition from whole numbers to decimals. Like
One Hundred Ninety-Nine Dollars and Ten Cents
not
One Hundred and Ninety-Nine Dollars.
Anyways any potential solution would have to parse the input string, raise any exceptions or otherwise return the value.
But first you'd have to know "the rules" This seems to be very arbitrary and based on tradition but this gentleman seems as good a place as any to start:
Ask Dr. Math
I think you might end up having to map strings to values up to the maximum range you expect and then parse the string in order and place values as such. Since there's very little regular naming convention across and within order of magnitude, I don't think there's an elegant or easy way to parse a word to get its numeric value. Luckily, depending on the format, you probably only have to map every order of magnitude. For example, if you only expect numbers 0-100 and they are inputted as "ninety-nine" then you only need to map 0-9, then 10-100 in steps of 10.

Categories