I have some decimal data that I am pushing into a SharePoint list where it is to be viewed. I'd like to restrict the number of significant figures displayed in the result data based on my knowledge of the specific calculation. Sometimes it'll be 3, so 12345 will become 12300 and 0.012345 will become 0.0123. Occasionally it will be 4 or 5. Is there any convenient way to handle this?
See: RoundToSignificantFigures by "P Daddy".
I've combined his method with another one I liked.
Rounding to significant figures is a lot easier in T-SQL, where ROUND works on a rounding position rather than a number of decimal places, which is what .NET's Math.Round uses. In T-SQL you can round to a negative position, which rounds within the whole-number part, so no scaling is needed.
Also see this other thread. Pyrolistical's method is good.
The trailing zeros part of the problem seems like more of a string operation to me, so I included a ToString() extension method which will pad zeros if necessary.
using System;
using System.Globalization;
public static class Precision
{
// 2^-24
public const float FLOAT_EPSILON = 0.0000000596046448f;
// 2^-53
public const double DOUBLE_EPSILON = 0.00000000000000011102230246251565d;
public static bool AlmostEquals(this double a, double b, double epsilon = DOUBLE_EPSILON)
{
// ReSharper disable CompareOfFloatsByEqualityOperator
if (a == b)
{
return true;
}
// ReSharper restore CompareOfFloatsByEqualityOperator
return (System.Math.Abs(a - b) < epsilon);
}
public static bool AlmostEquals(this float a, float b, float epsilon = FLOAT_EPSILON)
{
// ReSharper disable CompareOfFloatsByEqualityOperator
if (a == b)
{
return true;
}
// ReSharper restore CompareOfFloatsByEqualityOperator
return (System.Math.Abs(a - b) < epsilon);
}
}
public static class SignificantDigits
{
public static double Round(this double value, int significantDigits)
{
int unneededRoundingPosition;
return RoundSignificantDigits(value, significantDigits, out unneededRoundingPosition);
}
public static string ToString(this double value, int significantDigits)
{
// this method will round and then append zeros if needed.
// i.e. if you round .002 to two significant figures, the resulting number should be .0020.
var currentInfo = CultureInfo.CurrentCulture.NumberFormat;
if (double.IsNaN(value))
{
return currentInfo.NaNSymbol;
}
if (double.IsPositiveInfinity(value))
{
return currentInfo.PositiveInfinitySymbol;
}
if (double.IsNegativeInfinity(value))
{
return currentInfo.NegativeInfinitySymbol;
}
int roundingPosition;
var roundedValue = RoundSignificantDigits(value, significantDigits, out roundingPosition);
// when rounding causes a cascading round affecting digits of greater significance,
// need to re-round to get a correct rounding position afterwards
// this fixes a bug where rounding 9.96 to 2 figures yields 10.0 instead of 10
RoundSignificantDigits(roundedValue, significantDigits, out roundingPosition);
if (Math.Abs(roundingPosition) > 9)
{
// use exponential notation format
// ReSharper disable FormatStringProblem
return string.Format(currentInfo, "{0:E" + (significantDigits - 1) + "}", roundedValue);
// ReSharper restore FormatStringProblem
}
// string.format is only needed with decimal numbers (whole numbers won't need to be padded with zeros to the right.)
// ReSharper disable FormatStringProblem
return roundingPosition > 0 ? string.Format(currentInfo, "{0:F" + roundingPosition + "}", roundedValue) : roundedValue.ToString(currentInfo);
// ReSharper restore FormatStringProblem
}
private static double RoundSignificantDigits(double value, int significantDigits, out int roundingPosition)
{
// this method will return a double value rounded to a number of significant figures.
// the significantDigits parameter must be between 1 and 15, inclusive.
roundingPosition = 0;
if (value.AlmostEquals(0d))
{
roundingPosition = significantDigits - 1;
return 0d;
}
if (double.IsNaN(value))
{
return double.NaN;
}
if (double.IsPositiveInfinity(value))
{
return double.PositiveInfinity;
}
if (double.IsNegativeInfinity(value))
{
return double.NegativeInfinity;
}
if (significantDigits < 1 || significantDigits > 15)
{
throw new ArgumentOutOfRangeException("significantDigits", significantDigits, "The significantDigits argument must be between 1 and 15.");
}
// The resulting rounding position will be negative for rounding at whole numbers, and positive for decimal places.
roundingPosition = significantDigits - 1 - (int)(Math.Floor(Math.Log10(Math.Abs(value))));
// try to use a rounding position directly, if no scale is needed.
// this is because the scale multiplication after the rounding can introduce error, although
// this only happens when you're dealing with really tiny numbers, e.g. 9.9e-14.
if (roundingPosition > 0 && roundingPosition < 16)
{
return Math.Round(value, roundingPosition, MidpointRounding.AwayFromZero);
}
// Shouldn't get here unless we need to scale it.
// Set the scaling value, for rounding whole numbers or decimals past 15 places
var scale = Math.Pow(10, Math.Ceiling(Math.Log10(Math.Abs(value))));
return Math.Round(value / scale, significantDigits, MidpointRounding.AwayFromZero) * scale;
}
}
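For comparison, here is a minimal sketch of just the rounding step, without the string padding. The class and method names are my own (hypothetical) and are not taken from the linked answers. It assumes the same idea as above: use Math.Round directly when the rounding position is a valid number of decimal places, and scale only when it is not.

```csharp
using System;

public static class SigFigSketch
{
    // Hypothetical helper for illustration; not part of the code above.
    public static double RoundToSignificantFigures(double value, int figures)
    {
        if (figures < 1 || figures > 15)
            throw new ArgumentOutOfRangeException(nameof(figures));
        if (value == 0d || double.IsNaN(value) || double.IsInfinity(value))
            return value;

        // Position of the leading digit: 4 for 12345, -2 for 0.012345.
        int magnitude = (int)Math.Floor(Math.Log10(Math.Abs(value)));
        // Decimal places to keep; negative means rounding left of the decimal point.
        int decimals = figures - 1 - magnitude;

        if (decimals >= 0 && decimals <= 15)
            return Math.Round(value, decimals, MidpointRounding.AwayFromZero);

        // Math.Round only accepts 0..15 decimal places, so scale the value instead.
        double scale = Math.Pow(10, -decimals);
        return Math.Round(value / scale, MidpointRounding.AwayFromZero) * scale;
    }
}
```

With this sketch, RoundToSignificantFigures(12345, 3) gives 12300 and RoundToSignificantFigures(0.012345, 3) gives 0.0123, matching the examples in the question.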
This might do the trick:
double Input1 = 1234567;
string Result1 = Convert.ToDouble(String.Format("{0:G3}",Input1)).ToString("R0");
double Input2 = 0.012345;
string Result2 = Convert.ToDouble(String.Format("{0:G3}", Input2)).ToString("R6");
Changing the G3 to G4 produces the oddest result though.
It appears to round up the significant digits?
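To make the behaviour of the "G" specifier easier to see, here is a small sketch that wraps the same round trip in a helper. The names are hypothetical, and it uses the invariant culture, which is an assumption on my part so that the decimal separator is predictable. As far as I can tell, the G4 result is ordinary rounding: the first four digits of 1234567 are followed by a 5, so the fourth digit goes up.

```csharp
using System;
using System.Globalization;

public static class GFormatSketch
{
    // Hypothetical helper: format with "G<figures>", then parse the text back.
    public static double RoundViaGFormat(double value, int figures)
    {
        // "G3" keeps 3 significant digits and switches to scientific
        // notation for large values, e.g. 1234567 becomes "1.23E+06".
        string text = value.ToString("G" + figures, CultureInfo.InvariantCulture);
        return double.Parse(text, NumberStyles.Float, CultureInfo.InvariantCulture);
    }
}
```

RoundViaGFormat(1234567, 3) yields 1230000, while RoundViaGFormat(1234567, 4) yields 1235000.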
I ended up snagging some code from http://ostermiller.org/utils/SignificantFigures.java.html. It was in Java, so I did a quick search/replace and some ReSharper reformatting to make it build as C#. It seems to work nicely for my significant figure needs. FWIW, I removed his Javadoc comments to make it more concise here, but the original code is documented quite nicely.
/*
* Copyright (C) 2002-2007 Stephen Ostermiller
* http://ostermiller.org/contact.pl?regarding=Java+Utilities
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* See COPYING.TXT for details.
*/
public class SignificantFigures
{
private String original;
private StringBuilder _digits;
private int mantissa = -1;
private bool sign = true;
private bool isZero = false;
private bool useScientificNotation = true;
public SignificantFigures(String number)
{
original = number;
Parse(original);
}
public SignificantFigures(double number)
{
original = Convert.ToString(number);
try
{
Parse(original);
}
catch (Exception)
{
_digits = null;
}
}
public bool UseScientificNotation
{
get { return useScientificNotation; }
set { useScientificNotation = value; }
}
public int GetNumberSignificantFigures()
{
if (_digits == null) return 0;
return _digits.Length;
}
public SignificantFigures SetLSD(int place)
{
SetLMSD(place, Int32.MinValue);
return this;
}
public SignificantFigures SetLMSD(int leastPlace, int mostPlace)
{
if (_digits != null && leastPlace != Int32.MinValue)
{
int significantFigures = _digits.Length;
int current = mantissa - significantFigures + 1;
int newLength = significantFigures - leastPlace + current;
if (newLength <= 0)
{
if (mostPlace == Int32.MinValue)
{
original = "NaN";
_digits = null;
}
else
{
newLength = mostPlace - leastPlace + 1;
_digits.Length = newLength;
mantissa = leastPlace;
for (int i = 0; i < newLength; i++)
{
_digits[i] = '0';
}
isZero = true;
sign = true;
}
}
else
{
_digits.Length = newLength;
for (int i = significantFigures; i < newLength; i++)
{
_digits[i] = '0';
}
}
}
return this;
}
public int GetLSD()
{
if (_digits == null) return Int32.MinValue;
return mantissa - _digits.Length + 1;
}
public int GetMSD()
{
if (_digits == null) return Int32.MinValue;
return mantissa + 1;
}
public override String ToString()
{
if (_digits == null) return original;
StringBuilder digits = new StringBuilder(this._digits.ToString());
int length = digits.Length;
if ((mantissa <= -4 || mantissa >= 7 ||
(mantissa >= length &&
digits[digits.Length - 1] == '0') ||
(isZero && mantissa != 0)) && useScientificNotation)
{
// use scientific notation.
if (length > 1)
{
digits.Insert(1, '.');
}
if (mantissa != 0)
{
digits.Append("E" + mantissa);
}
}
else if (mantissa <= -1)
{
digits.Insert(0, "0.");
for (int i = mantissa; i < -1; i++)
{
digits.Insert(2, '0');
}
}
else if (mantissa + 1 == length)
{
if (length > 1 && digits[digits.Length - 1] == '0')
{
digits.Append('.');
}
}
else if (mantissa < length)
{
digits.Insert(mantissa + 1, '.');
}
else
{
for (int i = length; i <= mantissa; i++)
{
digits.Append('0');
}
}
if (!sign)
{
digits.Insert(0, '-');
}
return digits.ToString();
}
public String ToScientificNotation()
{
if (_digits == null) return original;
StringBuilder digits = new StringBuilder(this._digits.ToString());
int length = digits.Length;
if (length > 1)
{
digits.Insert(1, '.');
}
if (mantissa != 0)
{
digits.Append("E" + mantissa);
}
if (!sign)
{
digits.Insert(0, '-');
}
return digits.ToString();
}
private const int INITIAL = 0;
private const int LEADZEROS = 1;
private const int MIDZEROS = 2;
private const int DIGITS = 3;
private const int LEADZEROSDOT = 4;
private const int DIGITSDOT = 5;
private const int MANTISSA = 6;
private const int MANTISSADIGIT = 7;
private void Parse(String number)
{
int length = number.Length;
_digits = new StringBuilder(length);
int state = INITIAL;
int mantissaStart = -1;
bool foundMantissaDigit = false;
// sometimes we don't know if a zero will be
// significant or not when it is encountered.
// keep track of the number of them so that
// they all can be made significant if we find
// out that they are.
int zeroCount = 0;
int leadZeroCount = 0;
for (int i = 0; i < length; i++)
{
char c = number[i];
switch (c)
{
case '.':
{
switch (state)
{
case INITIAL:
case LEADZEROS:
{
state = LEADZEROSDOT;
}
break;
case MIDZEROS:
{
// we now know that these zeros
// are more than just trailing place holders.
for (int j = 0; j < zeroCount; j++)
{
_digits.Append('0');
}
zeroCount = 0;
state = DIGITSDOT;
}
break;
case DIGITS:
{
state = DIGITSDOT;
}
break;
default:
{
throw new Exception(
"Unexpected character '" + c + "' at position " + i
);
}
}
}
break;
case '+':
{
switch (state)
{
case INITIAL:
{
sign = true;
state = LEADZEROS;
}
break;
case MANTISSA:
{
state = MANTISSADIGIT;
}
break;
default:
{
throw new Exception(
"Unexpected character '" + c + "' at position " + i
);
}
}
}
break;
case '-':
{
switch (state)
{
case INITIAL:
{
sign = false;
state = LEADZEROS;
}
break;
case MANTISSA:
{
state = MANTISSADIGIT;
}
break;
default:
{
throw new Exception(
"Unexpected character '" + c + "' at position " + i
);
}
}
}
break;
case '0':
{
switch (state)
{
case INITIAL:
case LEADZEROS:
{
// only significant if number
// is all zeros.
zeroCount++;
leadZeroCount++;
state = LEADZEROS;
}
break;
case MIDZEROS:
case DIGITS:
{
// only significant if followed
// by a decimal point or nonzero digit.
mantissa++;
zeroCount++;
state = MIDZEROS;
}
break;
case LEADZEROSDOT:
{
// only significant if number
// is all zeros.
mantissa--;
zeroCount++;
state = LEADZEROSDOT;
}
break;
case DIGITSDOT:
{
// non-leading zeros after
// a decimal point are always
// significant.
_digits.Append(c);
}
break;
case MANTISSA:
case MANTISSADIGIT:
{
foundMantissaDigit = true;
state = MANTISSADIGIT;
}
break;
default:
{
throw new Exception(
"Unexpected character '" + c + "' at position " + i
);
}
}
}
break;
case '1':
case '2':
case '3':
case '4':
case '5':
case '6':
case '7':
case '8':
case '9':
{
switch (state)
{
case INITIAL:
case LEADZEROS:
case DIGITS:
{
zeroCount = 0;
_digits.Append(c);
mantissa++;
state = DIGITS;
}
break;
case MIDZEROS:
{
// we now know that these zeros
// are more than just trailing place holders.
for (int j = 0; j < zeroCount; j++)
{
_digits.Append('0');
}
zeroCount = 0;
_digits.Append(c);
mantissa++;
state = DIGITS;
}
break;
case LEADZEROSDOT:
case DIGITSDOT:
{
zeroCount = 0;
_digits.Append(c);
state = DIGITSDOT;
}
break;
case MANTISSA:
case MANTISSADIGIT:
{
state = MANTISSADIGIT;
foundMantissaDigit = true;
}
break;
default:
{
throw new Exception(
"Unexpected character '" + c + "' at position " + i
);
}
}
}
break;
case 'E':
case 'e':
{
switch (state)
{
case INITIAL:
case LEADZEROS:
case DIGITS:
case LEADZEROSDOT:
case DIGITSDOT:
{
// record the starting point of the mantissa
// so we can do a substring to get it back later
mantissaStart = i + 1;
state = MANTISSA;
}
break;
default:
{
throw new Exception(
"Unexpected character '" + c + "' at position " + i
);
}
}
}
break;
default:
{
throw new Exception(
"Unexpected character '" + c + "' at position " + i
);
}
}
}
if (mantissaStart != -1)
{
// if we had found an 'E'
if (!foundMantissaDigit)
{
// we didn't actually find a mantissa to go with.
throw new Exception(
"No digits in mantissa."
);
}
// parse the mantissa.
mantissa += Convert.ToInt32(number.Substring(mantissaStart));
}
if (_digits.Length == 0)
{
if (zeroCount > 0)
{
// if nothing but zeros all zeros are significant.
for (int j = 0; j < zeroCount; j++)
{
_digits.Append('0');
}
mantissa += leadZeroCount;
isZero = true;
sign = true;
}
else
{
// a hack to catch some cases that we could catch
// by adding a ton of extra states. Things like:
// "e2" "+e2" "+." "." "+" etc.
throw new Exception(
"No digits in number."
);
}
}
}
public SignificantFigures SetNumberSignificantFigures(int significantFigures)
{
if (significantFigures <= 0)
throw new ArgumentException("Desired number of significant figures must be positive.");
if (_digits != null)
{
int length = _digits.Length;
if (length < significantFigures)
{
// number is not long enough, pad it with zeros.
for (int i = length; i < significantFigures; i++)
{
_digits.Append('0');
}
}
else if (length > significantFigures)
{
// number is too long chop some of it off with rounding.
bool addOne; // we need to round up if true.
char firstInSig = _digits[significantFigures];
if (firstInSig < '5')
{
// first non-significant digit less than five, round down.
addOne = false;
}
else if (firstInSig == '5')
{
// first non-significant digit equal to five
addOne = false;
for (int i = significantFigures + 1; !addOne && i < length; i++)
{
// if its followed by any non-zero digits, round up.
if (_digits[i] != '0')
{
addOne = true;
}
}
if (!addOne)
{
// if it was not followed by non-zero digits
// if the last significant digit is odd round up
// if the last significant digit is even round down
addOne = (_digits[significantFigures - 1] & 1) == 1;
}
}
else
{
// first non-significant digit greater than five, round up.
addOne = true;
}
// loop to add one (and carry a one if added to a nine)
// to the last significant digit
for (int i = significantFigures - 1; addOne && i >= 0; i--)
{
char digit = _digits[i];
if (digit < '9')
{
_digits[i] = (char) (digit + 1);
addOne = false;
}
else
{
_digits[i] = '0';
}
}
if (addOne)
{
// if the number was all nines
_digits.Insert(0, '1');
mantissa++;
}
// chop it to the correct number of figures.
_digits.Length = significantFigures;
}
}
return this;
}
public double ToDouble()
{
return Convert.ToDouble(original);
}
public static String Format(double number, int significantFigures)
{
SignificantFigures sf = new SignificantFigures(number);
sf.SetNumberSignificantFigures(significantFigures);
return sf.ToString();
}
}
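One thing to be aware of: SetNumberSignificantFigures above rounds an exact tie (a 5 followed only by zeros) to the nearest even digit, whereas the extension-method answer earlier uses MidpointRounding.AwayFromZero. The sketch below only illustrates the difference between those two tie rules with the built-in Math.Round on decimal; the class and method names are hypothetical.

```csharp
using System;

public static class MidpointSketch
{
    // Ties go to the nearest even digit ("banker's rounding").
    public static decimal HalfEven(decimal value, int decimals) =>
        Math.Round(value, decimals, MidpointRounding.ToEven);

    // Ties go to the digit further from zero.
    public static decimal HalfAwayFromZero(decimal value, int decimals) =>
        Math.Round(value, decimals, MidpointRounding.AwayFromZero);
}
```

For example, HalfEven(0.125m, 2) is 0.12 but HalfAwayFromZero(0.125m, 2) is 0.13, so the two answers can disagree on inputs that sit exactly on a tie.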
I have a shorter answer for calculating the significant figures of a number. Here is the code and the test program:
using System;
using System.Collections.Generic;
namespace ConsoleApplicationRound
{
class Program
{
static void Main(string[] args)
{
//char cDecimal = '.'; // for English cultures
char cDecimal = ','; // for German cultures
List<double> l_dValue = new List<double>();
ushort usSignificants = 5;
l_dValue.Add(0);
l_dValue.Add(0.000640589);
l_dValue.Add(-0.000640589);
l_dValue.Add(-123.405009);
l_dValue.Add(123.405009);
l_dValue.Add(-540);
l_dValue.Add(540);
l_dValue.Add(-540911);
l_dValue.Add(540911);
l_dValue.Add(-118.2);
l_dValue.Add(118.2);
l_dValue.Add(-118.18);
l_dValue.Add(118.18);
l_dValue.Add(-118.188);
l_dValue.Add(118.188);
foreach (double d in l_dValue)
{
Console.WriteLine("d = Maths.Round('" +
cDecimal + "', " + d + ", " + usSignificants +
") = " + Maths.Round(
cDecimal, d, usSignificants));
}
Console.Read();
}
}
}
The Maths class used is as follows:
using System;
using System.Text;
namespace ConsoleApplicationRound
{
class Maths
{
/// <summary>
/// Zeros used for padding the result on the right
/// </summary>
private static String m_strZeros = "000000000000000000000000000000000";
/// <summary>
/// The minus sign
/// </summary>
public const char m_cDASH = '-';
/// <summary>
/// Determines the number of digits before the decimal point
/// </summary>
/// <param name="cDecimal">
/// Language-specific decimal separator
/// </param>
/// <param name="strValue">
/// Value to be scrutinised
/// </param>
/// <returns>
/// Nr. of digits before the decimal point
/// </returns>
private static ushort NrOfDigitsBeforeDecimal(char cDecimal, String strValue)
{
short sDecimalPosition = (short)strValue.IndexOf(cDecimal);
ushort usSignificantDigits = 0;
if (sDecimalPosition >= 0)
{
strValue = strValue.Substring(0, sDecimalPosition + 1);
}
for (ushort us = 0; us < strValue.Length; us++)
{
if (strValue[us] != m_cDASH) usSignificantDigits++;
if (strValue[us] == cDecimal)
{
usSignificantDigits--;
break;
}
}
return usSignificantDigits;
}
/// <summary>
/// Rounds to a fixed number of significant digits
/// </summary>
/// <param name="d">
/// Number to be rounded
/// </param>
/// <param name="usSignificants">
/// Requested significant digits
/// </param>
/// <returns>
/// The rounded number
/// </returns>
public static String Round(char cDecimal,
double d,
ushort usSignificants)
{
StringBuilder value = new StringBuilder(Convert.ToString(d));
short sDecimalPosition = (short)value.ToString().IndexOf(cDecimal);
ushort usAfterDecimal = 0;
ushort usDigitsBeforeDecimalPoint =
NrOfDigitsBeforeDecimal(cDecimal, value.ToString());
if (usDigitsBeforeDecimalPoint == 1)
{
usAfterDecimal = (d == 0)
? usSignificants
: (ushort)(value.Length - sDecimalPosition - 2);
}
else
{
if (usSignificants >= usDigitsBeforeDecimalPoint)
{
usAfterDecimal =
(ushort)(usSignificants - usDigitsBeforeDecimalPoint);
}
else
{
double dPower = Math.Pow(10,
usDigitsBeforeDecimalPoint - usSignificants);
d = dPower*(long)(d/dPower);
}
}
double dRounded = Math.Round(d, usAfterDecimal);
StringBuilder result = new StringBuilder();
result.Append(dRounded);
ushort usDigits = (ushort)result.ToString().Replace(
Convert.ToString(cDecimal), "").Replace(
Convert.ToString(m_cDASH), "").Length;
// Add trailing zeros, if necessary:
if (usDigits < usSignificants)
{
if (usAfterDecimal != 0)
{
if (result.ToString().IndexOf(cDecimal) == -1)
{
result.Append(cDecimal);
}
int i = (d == 0) ? 0 : Math.Min(0, usDigits - usSignificants);
result.Append(m_strZeros.Substring(0, usAfterDecimal + i));
}
}
return result.ToString();
}
}
}
Any answer with a shorter code?
You can get an elegant bit perfect rounding by using the GetBits method on Decimal and leveraging BigInteger to perform masking.
Some utils
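// Requires: using System.Linq; using System.Numerics;
public static int CountDigits
(BigInteger number) => ((int)BigInteger.Log10(number))+1;
private static readonly BigInteger[] BigPowers10
= Enumerable.Range(0, 100)
.Select(v => BigInteger.Pow(10, v))
.ToArray();
// Assumption: BigUnitMask is used by the main function but was not shown;
// a 32-bit mask fits its use there (splitting the result into 32-bit words).
private static readonly BigInteger BigUnitMask = uint.MaxValue;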
The main function
public static decimal RoundToSignificantDigits
(this decimal num,
short n)
{
var bits = decimal.GetBits(num);
var u0 = unchecked((uint)bits[0]);
var u1 = unchecked((uint)bits[1]);
var u2 = unchecked((uint)bits[2]);
var i = new BigInteger(u0)
+ (new BigInteger(u1) << 32)
+ (new BigInteger(u2) << 64);
var d = CountDigits(i);
var delta = d - n;
if (delta < 0)
return num;
var scale = BigPowers10[delta];
var div = i/scale;
var rem = i%scale;
var up = rem > scale/2;
if (up)
div += 1;
var shifted = div*scale;
bits[0] =unchecked((int)(uint) (shifted & BigUnitMask));
bits[1] =unchecked((int)(uint) (shifted>>32 & BigUnitMask));
bits[2] =unchecked((int)(uint) (shifted>>64 & BigUnitMask));
return new decimal(bits);
}
test case 0
public void RoundToSignificantDigits()
{
WMath.RoundToSignificantDigits(0.0012345m, 2).Should().Be(0.0012m);
WMath.RoundToSignificantDigits(0.0012645m, 2).Should().Be(0.0013m);
WMath.RoundToSignificantDigits(0.040000000000000008, 6).Should().Be(0.04);
WMath.RoundToSignificantDigits(0.040000010000000008, 6).Should().Be(0.04);
WMath.RoundToSignificantDigits(0.040000100000000008, 6).Should().Be(0.0400001);
WMath.RoundToSignificantDigits(0.040000110000000008, 6).Should().Be(0.0400001);
WMath.RoundToSignificantDigits(0.20000000000000004, 6).Should().Be(0.2);
WMath.RoundToSignificantDigits(0.10000000000000002, 6).Should().Be(0.1);
WMath.RoundToSignificantDigits(0.0, 6).Should().Be(0.0);
}
test case 1
public void RoundToSigFigShouldWork()
{
1.2m.RoundToSignificantDigits(1).Should().Be(1m);
0.01235668m.RoundToSignificantDigits(3).Should().Be(0.0124m);
0.01m.RoundToSignificantDigits(3).Should().Be(0.01m);
1.23456789123456789123456789m.RoundToSignificantDigits(4)
.Should().Be(1.235m);
1.23456789123456789123456789m.RoundToSignificantDigits(16)
.Should().Be(1.234567891234568m);
1.23456789123456789123456789m.RoundToSignificantDigits(24)
.Should().Be(1.23456789123456789123457m);
1.23456789123456789123456789m.RoundToSignificantDigits(27)
.Should().Be(1.23456789123456789123456789m);
}
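In case the GetBits layout is unfamiliar, here is a small sketch of what the four integers contain. The helper name is hypothetical. Elements 0 to 2 hold the 96-bit integer mantissa, and element 3 holds the scale (the power of ten to divide by) in bits 16 to 23 and the sign in bit 31, which is why the function above can rewrite the first three elements and leave element 3 untouched.

```csharp
using System;

public static class DecimalBitsSketch
{
    // Hypothetical helper: returns the low mantissa word, the scale and the sign.
    public static (uint Low, int Scale, bool IsNegative) Inspect(decimal value)
    {
        int[] bits = decimal.GetBits(value);
        uint low = unchecked((uint)bits[0]);   // lowest 32 bits of the mantissa
        int scale = (bits[3] >> 16) & 0xFF;    // number of digits after the decimal point
        bool isNegative = bits[3] < 0;         // sign lives in the top bit of bits[3]
        return (low, scale, isNegative);
    }
}
```

Inspect(0.0012345m) reports a mantissa of 12345 with scale 7, so rounding to 2 significant digits only has to turn the mantissa into 12000.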
I found this article doing a quick search on it. Basically this one converts to a string and goes by the characters in that array one at a time, till it reached the max. significance. Will this work?
The following code doesn't quite meet the spec, since it doesn't try to round anything to the left of the decimal point. But it's simpler than anything else presented here (so far). I was quite surprised that C# doesn't have a built-in method to handle this.
static public string SignificantDigits(double d, int digits=10)
{
int magnitude = (d == 0.0) ? 0 : (int)Math.Floor(Math.Log10(Math.Abs(d))) + 1;
digits -= magnitude;
if (digits < 0)
digits = 0;
string fmt = "f" + digits.ToString();
return d.ToString(fmt);
}
This method is dead simple and works with any number, positive or negative, and only uses a single transcendental function (Log10). The only difference (which may/may-not matter) is that it will not round the integer component. This is perfect however for currency processing where you know the limits are within certain bounds, because you can use doubles for much faster processing than the dreadfully slow Decimal type.
public static double ToDecimal( this double x, int significantFigures = 15 ) {
// determine # of digits before & after the decimal
int digitsBeforeDecimal = (int)Math.Max( Math.Ceiling( Math.Log10( Math.Abs( x ) ) ), 0 ),
digitsAfterDecimal = Math.Max( significantFigures - digitsBeforeDecimal, 0 );
// round it off
return Math.Round( x, digitsAfterDecimal );
}
As I remember it, "significant figures" means the number of digits after the decimal separator, so 3 significant digits for 0.012345 would be 0.012 and not 0.0123, but that really doesn't matter for the solution.
I also understand that you want to zero out the last digits to a certain degree if the number is > 1. You write that 12345 would become 12300, but I'm not sure whether you want 123456 to become 1230000 or 123400. My solution does the latter. Instead of calculating the factor, you could of course use a small pre-initialized array if you only have a couple of variations.
private static string FormatToSignificantFigures(decimal number, int amount)
{
if (number > 1)
{
int factor = Factor(amount);
return ((int)(number/factor)*factor).ToString();
}
NumberFormatInfo nfi = new CultureInfo("en-US", false).NumberFormat;
nfi.NumberDecimalDigits = amount;
return(number.ToString("F", nfi));
}
private static int Factor(int x)
{
return DoCalcFactor(10, x-1);
}
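private static int DoCalcFactor(int x, int y)
{
// Returns x * 10^(y - 1) for y >= 1. Factor must be called with amount >= 2;
// otherwise y starts below 1 and the recursion never reaches its base case.
if (y == 1) return x;
return 10*DoCalcFactor(x, y - 1);
}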
Kind regards
Carsten
Related
I have 2 strings
string a = "foo bar";
string b = "bar foo";
and I want to detect the changes from a to b. What characters do I have to change, to get from a to b?
I think there must be an iteration over each character that detects whether it was added, removed or remained equal. So this is my expected result:
'f' Remove
'o' Remove
'o' Remove
' ' Remove
'b' Equal
'a' Equal
'r' Equal
' ' Add
'f' Add
'o' Add
'o' Add
class and enum for the result:
public enum Operation { Add,Equal,Remove };
public class Difference
{
public Operation op { get; set; }
public char c { get; set; }
}
Here is my solution so far, but it is not clear to me what the code for the "Remove" case has to look like:
public static List<Difference> CalculateDifferences(string left, string right)
{
int count = 0;
List<Difference> result = new List<Difference>();
foreach (char ch in left)
{
int index = right.IndexOf(ch, count);
if (index == count)
{
count++;
result.Add(new Difference() { c = ch, op = Operation.Equal });
}
else if (index > count)
{
string add = right.Substring(count, index - count);
result.AddRange(add.Select(x => new Difference() { c = x, op = Operation.Add }));
count += add.Length;
}
else
{
//Remove?
}
}
return result;
}
What does the code for removed characters have to look like?
Update - added a few more examples
example 1:
string a = "foobar";
string b = "fooar";
expected result:
'f' Equal
'o' Equal
'o' Equal
'b' Remove
'a' Equal
'r' Equal
example 2:
string a = "asdfghjk";
string b = "wsedrftr";
expected result:
'a' Remove
'w' Add
's' Equal
'e' Add
'd' Equal
'r' Add
'f' Equal
'g' Remove
'h' Remove
'j' Remove
'k' Remove
't' Add
'r' Add
Update:
Here is a comparison between Dmitry's and ingen's answer: https://dotnetfiddle.net/MJQDAO
You are looking for (minimum) edit distance / (minimum) edit sequence. You can find the theory of the process here:
https://web.stanford.edu/class/cs124/lec/med.pdf
Let's implement the (simplest) Levenshtein distance / sequence algorithm (for details see https://en.wikipedia.org/wiki/Levenshtein_distance). Let's start with the helper classes (I've changed your implementation of them a bit):
public enum EditOperationKind : byte {
None, // Nothing to do
Add, // Add new character
Edit, // Edit character into character (including char into itself)
Remove, // Delete existing character
};
public struct EditOperation {
public EditOperation(char valueFrom, char valueTo, EditOperationKind operation) {
ValueFrom = valueFrom;
ValueTo = valueTo;
Operation = valueFrom == valueTo ? EditOperationKind.None : operation;
}
public char ValueFrom { get; }
public char ValueTo {get ;}
public EditOperationKind Operation { get; }
public override string ToString() {
switch (Operation) {
case EditOperationKind.None:
return $"'{ValueTo}' Equal";
case EditOperationKind.Add:
return $"'{ValueTo}' Add";
case EditOperationKind.Remove:
return $"'{ValueFrom}' Remove";
case EditOperationKind.Edit:
return $"'{ValueFrom}' to '{ValueTo}' Edit";
default:
return "???";
}
}
}
As far as I can see from the examples provided, there is no edit operation, only add + remove; that's why I've set editCost = 2 with insertCost = 1 and removeCost = 1 (in case of a tie between insert + remove and edit, we choose insert + remove).
Now we are ready to implement the Levenshtein algorithm:
public static EditOperation[] EditSequence(
string source, string target,
int insertCost = 1, int removeCost = 1, int editCost = 2) {
if (null == source)
throw new ArgumentNullException("source");
else if (null == target)
throw new ArgumentNullException("target");
// Forward: building score matrix
// Best operation (among insert, update, delete) to perform
EditOperationKind[][] M = Enumerable
.Range(0, source.Length + 1)
.Select(line => new EditOperationKind[target.Length + 1])
.ToArray();
// Minimum cost so far
int[][] D = Enumerable
.Range(0, source.Length + 1)
.Select(line => new int[target.Length + 1])
.ToArray();
// Edge: all removes
for (int i = 1; i <= source.Length; ++i) {
M[i][0] = EditOperationKind.Remove;
D[i][0] = removeCost * i;
}
// Edge: all inserts
for (int i = 1; i <= target.Length; ++i) {
M[0][i] = EditOperationKind.Add;
D[0][i] = insertCost * i;
}
// Having fit N - 1, K - 1 characters let's fit N, K
for (int i = 1; i <= source.Length; ++i)
for (int j = 1; j <= target.Length; ++j) {
// here we choose the operation with the least cost
int insert = D[i][j - 1] + insertCost;
int delete = D[i - 1][j] + removeCost;
int edit = D[i - 1][j - 1] + (source[i - 1] == target[j - 1] ? 0 : editCost);
int min = Math.Min(Math.Min(insert, delete), edit);
if (min == insert)
M[i][j] = EditOperationKind.Add;
else if (min == delete)
M[i][j] = EditOperationKind.Remove;
else if (min == edit)
M[i][j] = EditOperationKind.Edit;
D[i][j] = min;
}
// Backward: knowing scores (D) and actions (M), let's build the edit sequence
List<EditOperation> result =
new List<EditOperation>(source.Length + target.Length);
for (int x = target.Length, y = source.Length; (x > 0) || (y > 0);) {
EditOperationKind op = M[y][x];
if (op == EditOperationKind.Add) {
x -= 1;
result.Add(new EditOperation('\0', target[x], op));
}
else if (op == EditOperationKind.Remove) {
y -= 1;
result.Add(new EditOperation(source[y], '\0', op));
}
else if (op == EditOperationKind.Edit) {
x -= 1;
y -= 1;
result.Add(new EditOperation(source[y], target[x], op));
}
else // Start of the matching (EditOperationKind.None)
break;
}
result.Reverse();
return result.ToArray();
}
Demo:
var sequence = EditSequence("asdfghjk", "wsedrftr");
Console.Write(string.Join(Environment.NewLine, sequence));
Outcome:
'a' Remove
'w' Add
's' Equal
'e' Add
'd' Equal
'r' Add
'f' Equal
'g' Remove
'h' Remove
'j' Remove
'k' Remove
't' Add
'r' Add
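If you only need the total cost and not the sequence, the forward pass on its own is enough. The following is a minimal sketch of that pass with the same default costs; the class and method names are hypothetical and it is not part of the implementation above.

```csharp
using System;

public static class EditDistanceSketch
{
    // Hypothetical helper: cost of the cheapest edit sequence from source to target.
    public static int EditDistance(string source, string target,
        int insertCost = 1, int removeCost = 1, int editCost = 2)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (target == null) throw new ArgumentNullException(nameof(target));

        // d[i, j] = minimum cost to turn the first i chars of source
        // into the first j chars of target.
        var d = new int[source.Length + 1, target.Length + 1];
        for (int i = 1; i <= source.Length; i++) d[i, 0] = i * removeCost;
        for (int j = 1; j <= target.Length; j++) d[0, j] = j * insertCost;

        for (int i = 1; i <= source.Length; i++)
        {
            for (int j = 1; j <= target.Length; j++)
            {
                int insert = d[i, j - 1] + insertCost;
                int remove = d[i - 1, j] + removeCost;
                // Matching characters cost nothing; otherwise pay for an edit.
                int edit = d[i - 1, j - 1] +
                    (source[i - 1] == target[j - 1] ? 0 : editCost);
                d[i, j] = Math.Min(Math.Min(insert, remove), edit);
            }
        }
        return d[source.Length, target.Length];
    }
}
```

For "asdfghjk" and "wsedrftr" this returns 10, which matches the five removes and five adds in the outcome above.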
I'll go out on a limb here and provide an algorithm that's not the most efficient, but is easy to reason about.
Let's cover some ground first:
1) Order matters
string before = "bar foo"
string after = "foo bar"
Even though "bar" and "foo" occur in both strings, "bar" will need to be removed and added again later. This also tells us it's the after string that gives us the order of chars we're interested in, we want "foo" first.
2) Order over count
Another way to look at it, is that some chars may never get their turn.
string before = "abracadabra"
string after = "bar bar"
Only the bold chars of "bar bar", get their say in "abracadabra". Even though we've got two b's in both strings, only the first one counts. By the time we get to the second b in "bar bar" the second b in "abracadabra" has already been passed, when we were looking for the first occurrence of 'r'.
3) Barriers
Barriers are the chars that exist in both strings, taking order and count into consideration. This already suggests a set might not be the most appropriate data structure, as we would lose count.
For an input
string before = "pinata"
string after = "accidental"
We get (pseudocode)
var barriers = { 'a', 't', 'a' }
"pinata"
"accidental"
Let's follow the execution flow:
'a' is the first barrier, it's also the first char of after so everything prepending the first 'a' in before can be removed. "pinata" -> "ata"
the second barrier is 't', it's not at the next position in our after string, so we can insert everything in between. "ata" -> "accidenta"
the third barrier 'a' is already at the next position, so we can move to the next barrier without doing any real work.
there are no more barriers, but our string length is still less than that of after, so there will be some post processing. "accidenta" -> "accidental"
Note 'i' and 'n' don't get to play, again, order over count.
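The barrier search described above can be sketched on its own, which makes it easy to check against the two examples. The class and method names here are hypothetical; the logic is the same left-to-right scan, always searching the before string from just past the previous match.

```csharp
using System;
using System.Collections.Generic;

public static class BarrierSketch
{
    // Hypothetical helper: chars of 'after' that can be matched, in order, in 'before'.
    public static List<char> FindBarriers(string before, string after)
    {
        var barriers = new List<char>();
        int index = 0; // first position in 'before' that is still available
        foreach (char c in after)
        {
            int match = before.IndexOf(c, index);
            if (match != -1)
            {
                barriers.Add(c);
                index = match + 1; // everything up to the match is now used up
            }
        }
        return barriers;
    }
}
```

FindBarriers("pinata", "accidental") gives 'a', 't', 'a', and FindBarriers("abracadabra", "bar bar") gives 'b', 'a', 'r', 'a'.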
Implementation
We've established that order and count matter, a Queue comes to mind.
static public List<Difference> CalculateDifferences(string before, string after)
{
List<Difference> result = new List<Difference>();
Queue<char> barriers = new Queue<char>();
#region Preprocessing
int index = 0;
for (int i = 0; i < after.Length; i++)
{
// Look for the first match starting at index
int match = before.IndexOf(after[i], index);
if (match != -1)
{
barriers.Enqueue(after[i]);
index = match + 1;
}
}
#endregion
#region Queue Processing
index = 0;
while (barriers.Any())
{
char barrier = barriers.Dequeue();
// Get the offset to the barrier in both strings,
// ignoring the part that's already been handled
int offsetBefore = before.IndexOf(barrier, index) - index;
int offsetAfter = after.IndexOf(barrier, index) - index;
// Remove prefix from 'before' string
if (offsetBefore > 0)
{
RemoveChars(before.Substring(index, offsetBefore), result);
before = before.Substring(offsetBefore);
}
// Insert prefix from 'after' string
if (offsetAfter > 0)
{
string substring = after.Substring(index, offsetAfter);
AddChars(substring, result);
before = before.Insert(index, substring);
index += substring.Length;
}
// Jump over the barrier
KeepChar(barrier, result);
index++;
}
#endregion
#region Post Queue processing
if (index < before.Length)
{
RemoveChars(before.Substring(index), result);
}
if (index < after.Length)
{
AddChars(after.Substring(index), result);
}
#endregion
return result;
}
static private void KeepChar(char barrier, List<Difference> result)
{
result.Add(new Difference()
{
c = barrier,
op = Operation.Equal
});
}
static private void AddChars(string substring, List<Difference> result)
{
result.AddRange(substring.Select(x => new Difference()
{
c = x,
op = Operation.Add
}));
}
static private void RemoveChars(string substring, List<Difference> result)
{
result.AddRange(substring.Select(x => new Difference()
{
c = x,
op = Operation.Remove
}));
}
I tested this with the 3 examples above, and it returns the expected result for each of them.
int flag = 0;
int flag_2 = 0;
string a = "asdfghjk";
string b = "wsedrftr";
char[] array_a = a.ToCharArray();
char[] array_b = b.ToCharArray();
for (int i = 0,j = 0, n= 0; i < array_b.Count(); i++)
{
//Execute 1 time until reach first equal character
if(i == 0 && a.Contains(array_b[0]))
{
while (array_a[n] != array_b[0])
{
Console.WriteLine(String.Concat(array_a[n], " : Remove"));
n++;
}
Console.WriteLine(String.Concat(array_a[n], " : Equal"));
n++;
}
else if(i == 0 && !a.Contains(array_b[0]))
{
Console.WriteLine(String.Concat(array_a[n], " : Remove"));
n++;
Console.WriteLine(String.Concat(array_b[0], " : Add"));
}
else
{
if(n < array_a.Count())
{
if (array_a[n] == array_b[i])
{
Console.WriteLine(String.Concat(array_a[n], " : Equal"));
n++;
}
else
{
flag = 0;
for (int z = n; z < array_a.Count(); z++)
{
if (array_a[z] == array_b[i])
{
flag = 1;
break;
}
}
if (flag == 0)
{
flag_2 = 0;
for (int aa = i; aa < array_b.Count(); aa++)
{
for(int bb = n; bb < array_a.Count(); bb++)
{
if (array_b[aa] == array_a[bb])
{
flag_2 = 1;
break;
}
}
}
if(flag_2 == 1)
{
Console.WriteLine(String.Concat(array_b[i], " : Add"));
}
else
{
for (int z = n; z < array_a.Count(); z++)
{
Console.WriteLine(String.Concat(array_a[z], " : Remove"));
n++;
}
Console.WriteLine(String.Concat(array_b[i], " : Add"));
}
}
else
{
Console.WriteLine(String.Concat(array_a[n], " : Remove"));
i--;
n++;
}
}
}
else
{
Console.WriteLine(String.Concat(array_b[i], " : Add"));
}
}
}//end for
MessageBox.Show("Done");
//OUTPUT CONSOLE:
/*
a : Remove
w : Add
s : Equal
e : Add
d : Equal
r : Add
f : Equal
g : Remove
h : Remove
j : Remove
k : Remove
t : Add
r : Add
*/
Here is another possible solution, with full, commented code.
Note, however, that the result for your first example is inverted:
class Program
{
enum CharState
{
Add,
Equal,
Remove
}
struct CharResult
{
public char c;
public CharState state;
}
static void Main(string[] args)
{
string a = "asdfghjk";
string b = "wsedrftr";
while (true)
{
Console.WriteLine("Enter string a (enter to quit) :");
a = Console.ReadLine();
if (a == string.Empty)
break;
Console.WriteLine("Enter string b :");
b = Console.ReadLine();
List<CharResult> result = calculate(a, b);
DisplayResults(result);
}
Console.WriteLine("Press a key to exit");
Console.ReadLine();
}
static List<CharResult> calculate(string a, string b)
{
List<CharResult> res = new List<CharResult>();
int i = 0, j = 0;
char[] array_a = a.ToCharArray();
char[] array_b = b.ToCharArray();
while (i < array_a.Length && j < array_b.Length)
{
//For the current char in a, we check for the equal in b
int index = b.IndexOf(array_a[i], j);
if (index < 0) //not found, this char should be removed
{
res.Add(new CharResult() { c = array_a[i], state = CharState.Remove });
i++;
}
else
{
//we add all the chars between B's current index and the index
while (j < index)
{
res.Add(new CharResult() { c = array_b[j], state = CharState.Add });
j++;
}
//then we say the current is the same
res.Add(new CharResult() { c = array_a[i], state = CharState.Equal });
i++;
j++;
}
}
while (i < array_a.Length)
{
//b is now empty, we remove the remains
res.Add(new CharResult() { c = array_a[i], state = CharState.Remove });
i++;
}
while (j < array_b.Length)
{
//a has been treated, we add the remains
res.Add(new CharResult() { c = array_b[j], state = CharState.Add });
j++;
}
return res;
}
static void DisplayResults(List<CharResult> results)
{
foreach (CharResult r in results)
{
Console.WriteLine($"'{r.c}' - {r.state}");
}
}
}
If you want a precise comparison between two strings, you should read up on and understand Levenshtein distance. With this algorithm you can calculate exactly how similar two strings are, and you can also backtrack through the algorithm to get the chain of changes that turns one string into the other. It is also an important metric in natural language processing.
It has other benefits too, though it takes some time to learn.
This link has a C# version of Levenshtein distance:
https://www.dotnetperls.com/levenshtein
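The backtracking idea mentioned above can be sketched as follows. This is only an illustration, not the code from the linked page: the class name EditScript, the method Diff and the '=' / '-' / '+' markers are made up for this example, and a substitution is reported as a removal followed by an addition.

```csharp
using System;
using System.Collections.Generic;

public static class EditScript
{
    // Returns one minimal edit script that turns 'before' into 'after'.
    // Op is '=' (keep), '-' (remove from before) or '+' (add from after).
    public static List<(char Op, char C)> Diff(string before, string after)
    {
        int n = before.Length, m = after.Length;
        var d = new int[n + 1, m + 1];

        // Standard Levenshtein table: d[i, j] is the distance between
        // the first i chars of before and the first j chars of after.
        for (int i = 0; i <= n; i++) d[i, 0] = i;
        for (int j = 0; j <= m; j++) d[0, j] = j;
        for (int i = 1; i <= n; i++)
        {
            for (int j = 1; j <= m; j++)
            {
                int cost = before[i - 1] == after[j - 1] ? 0 : 1;
                d[i, j] = Math.Min(Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1),
                                   d[i - 1, j - 1] + cost);
            }
        }

        // Walk back from the bottom-right corner, recording which move
        // produced each cell. The script is built in reverse order.
        var script = new List<(char Op, char C)>();
        int x = n, y = m;
        while (x > 0 || y > 0)
        {
            if (x > 0 && y > 0 && before[x - 1] == after[y - 1] && d[x, y] == d[x - 1, y - 1])
            {
                script.Add(('=', before[x - 1]));
                x--; y--;
            }
            else if (x > 0 && y > 0 && d[x, y] == d[x - 1, y - 1] + 1)
            {
                // Substitution: added in reverse so it reads "remove, then add".
                script.Add(('+', after[y - 1]));
                script.Add(('-', before[x - 1]));
                x--; y--;
            }
            else if (x > 0 && d[x, y] == d[x - 1, y] + 1)
            {
                script.Add(('-', before[x - 1]));
                x--;
            }
            else
            {
                script.Add(('+', after[y - 1]));
                y--;
            }
        }
        script.Reverse();
        return script;
    }
}
```

Reading the script while skipping the '-' entries gives back the after string, and skipping the '+' entries gives back the before string. Several minimal scripts can exist for the same pair of strings; this sketch returns just one of them.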
I have to generate a string like AAA0009. Once it reaches AAA0009, it should generate AAA0010 to AAA0019 and so on up to AAA9999, and when it reaches AAA9999, it should give AAB0000 to AAB9999 and so on up to ZZZ9999.
I want to use a static class and static variables so that it auto-increments on every hit.
I have tried a few things but got nowhere close, so please help me out. Thanks.
Thanks for being instructive. As I already said, I was trying, but anyway, you already want to put negative votes on it without even knowing the whole thing:
Code:
public class GenerateTicketNumber
{
private static int num1 = 0;
public static string ToBase36()
{
const string base36 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
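var sb = new StringBuilder(9);
// Work on a copy: dividing num1 itself would reset the counter to 0 on every call
int remaining = num1;
do
{
sb.Insert(0, base36[remaining % 36]);
remaining /= 36;
} while (remaining != 0);
var paddedString = "#T" + sb.ToString().PadLeft(8, '0');
num1 = num1 + 1;
return paddedString;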
}
}
Above is the code. It generates a sequence, but not the way I want. Anyway, I will use it, and thanks for the help.
Though there's already an accepted answer, I would like to share this one.
P.S. I do not claim that this is the best approach, but at my previous job we made something similar using Azure Table Storage, which is a NoSQL database (FYI), and it works.
1.) Create a table to store your running ticket number.
public class TicketNumber
{
public string Type { get; set; } // Maybe you want to have different types of ticket?
public string AlphaPrefix { get; set; }
public string NumericPrefix { get; set; }
public TicketNumber()
{
this.AlphaPrefix = "AAA";
this.NumericPrefix = "0001";
}
public void Increment()
{
int num = int.Parse(this.NumericPrefix);
if (num >= 9999) // roll over only after 9999 itself has been issued
{
num = 1;
int i = 2; // We are assuming that there are only 3 characters
bool isMax = this.AlphaPrefix == "ZZZ";
if (isMax)
{
this.AlphaPrefix = "AAA"; // reset
}
else
{
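while (this.AlphaPrefix[i] == 'Z')
{
i--;
}
char iChar = this.AlphaPrefix[i];
StringBuilder sb = new StringBuilder(this.AlphaPrefix);
sb[i] = (char)(iChar + 1);
// Carry: every 'Z' to the right of the incremented letter wraps back to 'A'
for (int j = i + 1; j < sb.Length; j++)
{
sb[j] = 'A';
}
this.AlphaPrefix = sb.ToString();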
}
}
else
{
num++;
}
this.NumericPrefix = num.ToString().PadLeft(4, '0');
}
public override string ToString()
{
return this.AlphaPrefix + this.NumericPrefix;
}
}
2.) Make sure you perform row-level locking and issue an error when it fails.
Here's the Oracle syntax:
SELECT * FROM TICKETNUMBER WHERE TYPE = 'TYPE' FOR UPDATE NOWAIT;
This query locks the row and returns an error if the row is currently locked by another session.
We need this to make sure that even if you have millions of users generating a ticket number, it will not mess up the sequence.
Just make sure to save the new ticket number before you perform a COMMIT.
I forgot the MSSQL version of this, but I recall using WITH (ROWLOCK) or something similar. Just google it.
3.) Working example:
static void Main()
{
TicketNumber ticketNumber = new TicketNumber();
ticketNumber.AlphaPrefix = "ZZZ";
ticketNumber.NumericPrefix = "9999";
for (int i = 0; i < 10; i++)
{
Console.WriteLine(ticketNumber);
ticketNumber.Increment();
}
Console.Read();
}
Output:
Looking at your code that you've provided, it seems that you're backing this with a number and just want to convert that to a more user-friendly text representation.
You could try something like this:
private static string ValueToId(int value)
{
var parts = new List<string>();
int numberPart = value % 10000;
parts.Add(numberPart.ToString("0000"));
value /= 10000;
for (int i = 0; i < 3 || value > 0; ++i)
{
parts.Add(((char)(65 + (value % 26))).ToString());
value /= 26;
}
return string.Join(string.Empty, parts.AsEnumerable().Reverse().ToArray());
}
It takes the lowest four decimal digits of the value and uses them as is, and then converts the remainder of the value into the characters A-Z.
So 9999 becomes AAA9999, 10000 becomes AAB0000, and 270000 becomes ABB0000.
If the number is big enough that it exceeds 3 characters, it will add more letters at the start.
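To tie this back to the "static class that auto-increments on every hit" part of the question, here is a minimal sketch. The names TicketSequence, Next and Format are invented for this example, and unlike the method above it throws past ZZZ9999 instead of growing the prefix. It also assumes a single process: the counter lives in memory only, so it restarts from AAA0000 when the application restarts.

```csharp
using System;
using System.Threading;

public static class TicketSequence
{
    // Starts at -1 so that the first call to Next() yields value 0 ("AAA0000").
    private static int _counter = -1;

    // Interlocked.Increment keeps concurrent callers from getting the same value.
    public static string Next()
    {
        return Format(Interlocked.Increment(ref _counter));
    }

    // Maps 0 -> "AAA0000", 9999 -> "AAA9999", 10000 -> "AAB0000", and so on.
    public static string Format(int value)
    {
        if (value < 0)
            throw new ArgumentOutOfRangeException(nameof(value));

        int number = value % 10000;   // the four decimal digits
        int letters = value / 10000;  // the rest is written in base 26 as A-Z

        var prefix = new char[3];
        for (int i = 2; i >= 0; i--)
        {
            prefix[i] = (char)('A' + letters % 26);
            letters /= 26;
        }

        if (letters > 0)
            throw new OverflowException("Value is past ZZZ9999.");

        return new string(prefix) + number.ToString("D4");
    }
}
```

Each call to TicketSequence.Next() returns the next id in the sequence. If the ids have to survive restarts or be shared between servers, the counter would need to live in a database instead.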
Here's an example of how you could go about implementing it
void Main()
{
string template = @"AAAA00";
var templateChars = template.ToCharArray();
for (int i = 0; i < 100000; i++)
{
templateChars = IncrementCharArray(templateChars);
Console.WriteLine(string.Join("",templateChars ));
}
}
public static char Increment(char val)
{
if(val == '9') return 'A';
if(val == 'Z') return '0';
return ++val;
}
public static char[] IncrementCharArray(char[] val)
{
if (val.All(chr => chr == 'Z'))
{
var newArray = new char[val.Length + 1];
for (int i = 0; i < newArray.Length; i++)
{
newArray[i] = '0';
}
return newArray;
}
int length = val.Length;
while (length > 0)
{
char lastVal = val[--length];
val[length] = Increment(lastVal);
if ( val[length] != '0') break;
}
return val;
}
If the string passed in already has 3 digits at the end, it should be returned unchanged. If it does not have 3 digits at the end, zeros need to be inserted before any trailing digits so that it ends with 3 digits.
I have put some logic in private static string stringCleaner(string inputString) to implement this, but it gives these errors:
Test:'A12' Expected:'A012' Exception:Index was outside the bounds of the array.
Test:'A12345' Expected:'A12345' Exception:Index was outside the bounds of the array.
Test:'A1B3' Expected:'A1B003' Exception:Index was outside the bounds of the array.
Test:'' Expected:'000' Exception:Object reference not set to an instance of an object.
Test:'' Expected:'000' Actual:'000' Result:Pass
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace ConvertToCamelCaseCS
{
class Program
{
static void Main(string[] args)
{
List<string[]> testValues = new List<string[]>()
{
new string[]{"A12","A012"},
new string[]{"A12345","A12345"},
new string[]{"A1B3","A1B003"},
new string[]{null, "000"},
new string[]{"", "000"}
};
foreach (string[] testValue in testValues)
{
testStringCleaner(testValue[0], testValue[1]);
}
Console.ReadLine();
}
private static void testStringCleaner(string inputString, string expectedString)
{
try
{
String actualString = stringCleaner(inputString);
String passOrFail = (actualString == expectedString) ? "Pass" : "Fail";
Console.WriteLine("Test:'{0}' Expected:'{1}' Actual:'{2}' Result:{3}", inputString, expectedString, actualString, passOrFail);
}
catch (Exception ex)
{
Console.WriteLine("Test:'{0}' Expected:'{1}' Exception:{2}", inputString, expectedString, ex.Message);
}
}
private static string stringCleaner(string inputString)
{
string result = inputString;
int lengthOfString = result.Length;
int changeIndex = 0;
if (lengthOfString == 0)
{
result = "000";
}
else
{
for (int i = lengthOfString; i >= lengthOfString - 2; i--)
{
char StrTOChar = (char)result[i];
int CharToInt = (int)StrTOChar;
if (CharToInt >= 65 && CharToInt <= 122)
{
changeIndex = i;
break;
}
}
if (lengthOfString == changeIndex + 3)
{
return result;
}
else
{
if (changeIndex == lengthOfString)
{
return result = result + "000";
}
else if (changeIndex + 1 == lengthOfString)
{
return result = result.Substring(0, changeIndex) + "00" + result.Substring(changeIndex + 1, lengthOfString);
}
else if(changeIndex+2==lengthOfString)
{
return result = result.Substring(0, changeIndex) + "0" + result.Substring(changeIndex + 1, lengthOfString);
}
}
}
return result;
}
}
}
You are overcomplicating this a lot from what I can tell among this somewhat confusing question and code.
I would use Substring to extract the last 3 characters, and then check that string from the back to see whether each character is a digit, using Char.IsDigit. Depending on where you run into a non-digit, you add the required number of zeros using simple string concatenation.
Perhaps try to rewrite your code from scratch now that you probably have a better idea of how to do this.
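As a rough sketch of that idea (a small variation: it counts all trailing digits rather than looking only at the last 3 characters, and it uses PadLeft instead of manual concatenation). The names TrailingDigits and PadTrailingDigits are made up for this example. Note that Char.IsDigit also accepts non-ASCII Unicode digits, which may or may not be what you want.

```csharp
using System;

public static class TrailingDigits
{
    // Ensures the string ends with at least minDigits digits by inserting
    // zeros in front of whatever digits are already at the end.
    public static string PadTrailingDigits(string input, int minDigits = 3)
    {
        string value = input ?? string.Empty; // treat null like an empty string

        // Walk back from the end while we are still looking at digits.
        int start = value.Length;
        while (start > 0 && char.IsDigit(value[start - 1]))
        {
            start--;
        }

        string prefix = value.Substring(0, start); // everything before the trailing digits
        string digits = value.Substring(start);    // the trailing digits (may be empty)

        // PadLeft leaves the digits untouched when there are already enough of them.
        return prefix + digits.PadLeft(minDigits, '0');
    }
}
```

With the test values from the question, "A12" becomes "A012", "A12345" stays "A12345", "A1B3" becomes "A1B003", and both null and "" become "000".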
char StrTOChar = (char)result[i];
Your problem is in this line. (Line: 60)
You used i, which starts from result.Length, and result[result.Length] is outside the bounds of the array. Valid indexes run from 0 to result.Length - 1, so the loop has to start at lengthOfString - 1.
Let's implement:
private static string stringCleaner(string value) {
// let's not hardcode magic values (3, "000", etc.) but have a constant
const int digits_at_least = 3;
// special case: null or empty string
if (string.IsNullOrEmpty(value))
return new string('0', digits_at_least);
int digits = 0;
// let's count digits starting from the end
// "digits < digits_at_least": do not loop once we have enough digits
// "value[i] >= '0' && value[i] <= '9'": we want the digits 0..9 only;
// char.IsDigit would also accept other Unicode digits (e.g. Persian ones)
for (int i = value.Length - 1; i >= 0 && digits < digits_at_least; --i)
if (value[i] >= '0' && value[i] <= '9')
digits += 1;
else
break;
if (digits >= digits_at_least) // we have enough digits, return as it is
return value;
else
return value.Substring(0, value.Length - digits) +
new string('0', digits_at_least - digits) + // inserting zeros
value.Substring(value.Length - digits);
}
Tests:
using System.Linq;
...
var testValues = new string[][] {
new string[]{"A12","A012"},
new string[]{"A12345","A12345"},
new string[]{"A1B3","A1B003"},
new string[]{null, "000"},
new string[]{"", "000"}
};
// Failed tests
var failed = testValues
.Where(test => test[1] != stringCleaner(test[0]))
.Select(test =>
$"stringCleaner ({test[0]}) == {stringCleaner(test[0])} expected {test[1]}");
string failedReport = string.Join(Environment.NewLine, failed);
// All failed tests
Console.WriteLine(failedReport);
// All tests and their results
var allTests = testValues
.Select(test => new {
argument = test[0],
expected = test[1],
actual = stringCleaner(test[0]),
})
.Select(test => $"{(test.expected == test.actual ? "passed" : $"failed: f({test.argument}) = {test.actual} expected {test.expected}")}");
string allReport = string.Join(Environment.NewLine, allTests);
Console.WriteLine(allReport);
Outcome (no failedReport and all tests passed):
passed
passed
passed
passed
passed
I have a C# class like so
internal class QueuedMinimumNumberFinder : ConcurrentQueue<int>
{
private readonly string _minString;
public QueuedMinimumNumberFinder(string number, int takeOutAmount)
{
if (number.Length < takeOutAmount)
{
throw new Exception("Error *");
}
var queueIndex = 0;
var queueAmount = number.Length - takeOutAmount;
var numQueue = new ConcurrentQueue<int>(number.ToCharArray().Where(m => (int) Char.GetNumericValue(m) != 0).Select(m=>(int)Char.GetNumericValue(m)).OrderBy(m=>m));
var zeroes = number.Length - numQueue.Count;
while (queueIndex < queueAmount)
{
int next;
if (queueIndex == 0)
{
numQueue.TryDequeue(out next);
Enqueue(next);
} else
{
if (zeroes > 0)
{
Enqueue(0);
zeroes--;
} else
{
numQueue.TryDequeue(out next);
Enqueue(next);
}
}
queueIndex++;
}
var builder = new StringBuilder();
while (Count > 0)
{
int next = 0;
TryDequeue(out next);
builder.Append(next.ToString());
}
_minString = builder.ToString();
}
public override string ToString() { return _minString; }
}
The point of the program is to find the minimum possible integer that can be made by taking out any x characters from a string (for example, if the string is 100023 and you take out any 3 characters, the minimum int created would be 100). My question is: is this the correct way to do this? Is there a better data structure that can be used for this problem?
First Edit:
Here's how it looks now
internal class QueuedMinimumNumberFinder
{
private readonly string _minString;
public QueuedMinimumNumberFinder(string number, int takeOutAmount)
{
var queue = new Queue<int>();
if (number.Length < takeOutAmount)
{
throw new Exception("Error *");
}
var queueIndex = 0;
var queueAmount = number.Length - takeOutAmount;
var numQueue = new List<int>(number.Where(m=>(int)Char.GetNumericValue(m)!=0).Select(m=>(int)Char.GetNumericValue(m))).ToList();
var zeroes = number.Length - numQueue.Count;
while (queueIndex < queueAmount)
{
if (queueIndex == 0)
{
var nextMin = numQueue.Min();
numQueue.Remove(nextMin);
queue.Enqueue(nextMin);
} else
{
if (zeroes > 1)
{
queue.Enqueue(0);
zeroes--;
} else
{
var nextMin = numQueue.Min();
numQueue.Remove(nextMin);
queue.Enqueue(nextMin);
}
}
queueIndex++;
}
var builder = new StringBuilder();
while (queue.Count > 0)
{
builder.Append(queue.Dequeue().ToString());
}
_minString = builder.ToString();
}
public override string ToString() { return _minString; }
}
A pretty simple and efficient implementation can be made, once you realize that your input string digits map to the domain of only 10 possible values: '0' .. '9'.
This can be encoded as the number of occurrences of a specific digit in your input string using a simple array of 10 integers: var digit_count = new int[10];
@MasterGillBates describes this idea in his answer.
You can then regard this array as your priority queue from which you can dequeue the characters you need by iteratively removing the lowest available character (decreasing its occurrence count in the array).
The code sample below provides an example implementation for this idea.
public static class MinNumberSolver
{
public static string GetMinString(string number, int takeOutAmount)
{
// "Add" the string by simply counting digit occurrence frequency.
var digit_count = new int[10];
foreach (var c in number)
if (char.IsDigit(c))
digit_count[c - '0']++;
// Now remove them one by one in lowest to highest order.
// For the first character we skip any potential leading 0s
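// The result keeps the digits that remain after taking out takeOutAmount of them
// (this assumes the input consists of digits only).
var selected = new char[number.Length - takeOutAmount];
if (selected.Length == 0)
return string.Empty;
var start_index = 1;
selected[0] = TakeLowest(digit_count, ref start_index);
// For the rest we start in digit order at '0' first.
start_index = 0;
for (var i = 1; i < selected.Length; i++)
selected[i] = TakeLowest(digit_count, ref start_index);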
// And return the result.
return new string(selected);
}
private static char TakeLowest(int[] digit_count, ref int start_index)
{
for (var i = start_index; i < digit_count.Length; i++)
{
if (digit_count[i] > 0)
{
start_index = ((--digit_count[i] > 0) ? i : i + 1);
return (char)('0' + i);
}
}
throw new InvalidDataException("Input string does not have sufficient digits");
}
}
Just keep a count of how many times each digit appears. An array of size 10 will do. Count[i] gives the count of digit i.
Then pick the smallest non-zero digit first, then keep picking the smallest remaining digit (zeros included) until your number is formed.
Here's my solution using LINQ:
public string MinimumNumberFinder(string number, int takeOutAmount)
{
var ordered = number.OrderBy(n => n);
var nonZero = ordered.SkipWhile(n => n == '0');
var zero = ordered.TakeWhile(n => n == '0');
var result = nonZero.Take(1)
.Concat(zero)
.Concat(nonZero.Skip(1))
.Take(number.Length - takeOutAmount);
return new string(result.ToArray());
}
You could place every integer into a list and find all possible sequences of these values. From the list of sequences, you could sort through taking only the sets which have the number of integers you want. From there, you can write a quick function which parses a sequence into an integer. Next, you could store all of your parsed sequences into an array or an other data structure and sort based on value, which will allow you to select the minimum number from the data structure. There may be simpler ways to do this, but this will definitely work and gives you options as far as how many digits you want your number to have.
If I'm understanding this correctly, why don't you just pick out your digits starting with the smallest digit greater than zero? Then pick out all the zeroes, then the remaining digits once all the zeroes are picked up. This all depends on the length of your end result.
In your example you have a 6 digit number and you want to pick out 3 digits. This means you'll only have 3 digits left. If it was a 10 digit number, then you would end up with a 7 digit number, etc...
So have an algorithm that knows the length of your starting number, how many digits you plan on removing, and the length of your ending number. Then just pick out the numbers.
This is just quick and dirty code:
string startingNumber = "9999903040404"; // "100023";
int numberOfCharactersToRemove = 3;
string endingNumber = string.Empty;
int endingNumberLength = startingNumber.Length - numberOfCharactersToRemove;
while (endingNumber.Length < endingNumberLength)
{
if (string.IsNullOrEmpty(endingNumber))
{
// Find the smallest digit in the starting number
for (int i = 1; i <= 9; i++)
{
if (startingNumber.Contains(i.ToString()))
{
endingNumber += i.ToString();
startingNumber = startingNumber.Remove(startingNumber.IndexOf(i.ToString()), 1);
break;
}
}
}
else if (startingNumber.Contains("0"))
{
// Add any zeroes
endingNumber += "0";
startingNumber = startingNumber.Remove(startingNumber.IndexOf("0"), 1);
}
else
{
// Add any remaining numbers from least to greatest
for (int i = 1; i <= 9; i++)
{
if (startingNumber.Contains(i.ToString()))
{
endingNumber += i.ToString();
startingNumber = startingNumber.Remove(startingNumber.IndexOf(i.ToString()), 1);
break;
}
}
}
}
Console.WriteLine(endingNumber);
100023 starting number resulted in 100 being the end result
9999903040404 starting number resulted in 3000044499 being the end result
Here's my version to fix this problem:
DESIGN:
You can sort your list using a binary tree. There are a lot of implementations; I picked this one.
Then you can keep track of the number of zeros you have in your string.
Finally you will end up with two lists: I named one SortedDigitsList and the other one ZeroDigitsList.
Perform a switch case to determine which last 3 digits should be returned.
Here's the complete code:
class MainProgram2
{
static void Main()
{
Tree theTree = new Tree();
Console.WriteLine("Please Enter the string you want to process:");
string input = Console.ReadLine();
foreach (char c in input)
{
// Check if it's a digit or not
if (c >= '0' && c <= '9')
{
theTree.Insert((int)Char.GetNumericValue(c));
}
}
//End of for each (char c in input)
Console.WriteLine("Inorder traversal resulting Tree Sort without the zeros");
theTree.Inorder(theTree.ReturnRoot());
Console.WriteLine(" ");
//Format the output depending on how many zeros you have
Console.WriteLine("The final 3 digits are");
switch (theTree.ZeroDigitsList.Count)
{
case 0:
{
Console.WriteLine("{0}{1}{2}", theTree.SortedDigitsList[0], theTree.SortedDigitsList[1], theTree.SortedDigitsList[2]);
break;
}
case 1:
{
Console.WriteLine("{0}{1}{2}", theTree.SortedDigitsList[0], 0, theTree.SortedDigitsList[1]);
break;
}
default:
{
Console.WriteLine("{0}{1}{2}", theTree.SortedDigitsList[0], 0, 0);
break;
}
}
Console.ReadLine();
}//End of Main()
}
class Node
{
public int item;
public Node leftChild;
public Node rightChild;
public void displayNode()
{
Console.Write("[");
Console.Write(item);
Console.Write("]");
}
}
class Tree
{
public List<int> SortedDigitsList { get; set; }
public List<int> ZeroDigitsList { get; set; }
public Node root;
public Tree()
{
root = null;
SortedDigitsList = new List<int>();
ZeroDigitsList = new List<int>();
}
public Node ReturnRoot()
{
return root;
}
public void Insert(int id)
{
Node newNode = new Node();
newNode.item = id;
if (root == null)
root = newNode;
else
{
Node current = root;
Node parent;
while (true)
{
parent = current;
if (id < current.item)
{
current = current.leftChild;
if (current == null)
{
parent.leftChild = newNode;
return;
}
}
else
{
current = current.rightChild;
if (current == null)
{
parent.rightChild = newNode;
return;
}
}
}
}
}
//public void Preorder(Node Root)
//{
// if (Root != null)
// {
// Console.Write(Root.item + " ");
// Preorder(Root.leftChild);
// Preorder(Root.rightChild);
// }
//}
public void Inorder(Node Root)
{
if (Root != null)
{
Inorder(Root.leftChild);
if (Root.item > 0)
{
SortedDigitsList.Add(Root.item);
Console.Write(Root.item + " ");
}
else
{
ZeroDigitsList.Add(Root.item);
}
Inorder(Root.rightChild);
}
}
}
Is there any simple algorithm to determine the likeliness of 2 names representing the same person?
I'm not asking for something on the level of what a Customs department might be using. Just a simple algorithm that would tell me if 'James T. Clark' is most likely the same name as 'J. Thomas Clark' or 'James Clerk'.
If there is an algorithm in C# that would be great, but I can translate from any language.
Sounds like you're looking for a phonetic algorithm, such as Soundex, NYSIIS, or Double Metaphone. The first is actually what several government departments use, and it is trivial to implement (with many implementations readily available). The second is a slightly more complicated and more precise version of the first. The last one works with some non-English names and alphabets.
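To give an idea of how little code Soundex needs, here is a minimal sketch of the common American variant. The class name SoundexSketch is made up, and it only handles the ASCII letters A-Z (anything else is skipped), so treat it as an illustration rather than a reference implementation.

```csharp
using System;
using System.Text;

public static class SoundexSketch
{
    // Code for each letter A..Z. '0' marks vowels and Y (not coded, but they
    // separate repeated codes); '-' marks H and W (ignored completely).
    //                            ABCDEFGHIJKLMNOPQRSTUVWXYZ
    private const string Codes = "0123012-02245501262301-202";

    public static string Encode(string name)
    {
        var result = new StringBuilder(4);
        char last = '0';

        foreach (char raw in name ?? string.Empty)
        {
            char c = char.ToUpperInvariant(raw);
            if (c < 'A' || c > 'Z')
                continue; // skip spaces, dots, non-ASCII letters, ...

            char code = Codes[c - 'A'];

            if (result.Length == 0)
            {
                // The first letter is kept as is, but its code still
                // suppresses an immediately following letter with the same code.
                result.Append(c);
                last = code;
                continue;
            }

            if (code == '-')
                continue; // H and W do not break up a run of equal codes

            if (code != '0' && code != last)
                result.Append(code);

            last = code; // a vowel resets 'last', so a repeated code counts again

            if (result.Length == 4)
                break;
        }

        // Pad short codes with zeros, e.g. "Lee" -> "L000".
        return result.Length == 0 ? string.Empty : result.ToString().PadRight(4, '0');
    }
}
```

With this sketch, "Clark" and "Clerk" both encode to C462, so the two last names from the question would be treated as sounding alike. Soundex is meant for single name parts, so a full name would have to be split into first, middle and last name before comparing the codes.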
Levenshtein distance is a definition of distance between two arbitrary strings. It gives you a distance of 0 between identical strings and non-zero between different strings, which might also be useful if you decide to make a custom algorithm.
Levenshtein is close, although maybe not exactly what you want.
I've faced a similar problem and tried to use Levenshtein distance first, but it did not work well for me. I came up with an algorithm that gives you a "similarity" value between two strings (a higher value means more similar strings, "1" for identical strings). This value is not very meaningful by itself (if not "1", it is always 0.5 or less), but it works quite well when you throw in the Hungarian algorithm to find matching pairs from two lists of strings.
Use like this:
PartialStringComparer cmp = new PartialStringComparer();
tbResult.Text = cmp.Compare(textBox1.Text, textBox2.Text).ToString();
The code behind:
public class SubstringRange {
string masterString;
public string MasterString {
get { return masterString; }
set { masterString = value; }
}
int start;
public int Start {
get { return start; }
set { start = value; }
}
int end;
public int End {
get { return end; }
set { end = value; }
}
public int Length {
get { return End - Start; }
set { End = Start + value;}
}
public bool IsValid {
get { return MasterString.Length >= End && End >= Start && Start >= 0; }
}
public string Contents {
get {
if(IsValid) {
return MasterString.Substring(Start, Length);
} else {
return "";
}
}
}
public bool OverlapsRange(SubstringRange range) {
return !(End < range.Start || Start > range.End);
}
public bool ContainsRange(SubstringRange range) {
return range.Start >= Start && range.End <= End;
}
public bool ExpandTo(string newContents) {
if(MasterString.Substring(Start).StartsWith(newContents, StringComparison.InvariantCultureIgnoreCase) && newContents.Length > Length) {
Length = newContents.Length;
return true;
} else {
return false;
}
}
}
public class SubstringRangeList: List<SubstringRange> {
string masterString;
public string MasterString {
get { return masterString; }
set { masterString = value; }
}
public SubstringRangeList(string masterString) {
this.MasterString = masterString;
}
public SubstringRange FindString(string s){
foreach(SubstringRange r in this){
if(r.Contents.Equals(s, StringComparison.InvariantCultureIgnoreCase))
return r;
}
return null;
}
public SubstringRange FindSubstring(string s){
foreach(SubstringRange r in this){
if(r.Contents.StartsWith(s, StringComparison.InvariantCultureIgnoreCase))
return r;
}
return null;
}
public bool ContainsRange(SubstringRange range) {
foreach(SubstringRange r in this) {
if(r.ContainsRange(range))
return true;
}
return false;
}
public bool AddSubstring(string substring) {
bool result = false;
foreach(SubstringRange r in this) {
if(r.ExpandTo(substring)) {
result = true;
}
}
if(FindSubstring(substring) == null) {
bool patternfound = true;
int start = 0;
while(patternfound){
patternfound = false;
start = MasterString.IndexOf(substring, start, StringComparison.InvariantCultureIgnoreCase);
patternfound = start != -1;
if(patternfound) {
SubstringRange r = new SubstringRange();
r.MasterString = this.MasterString;
r.Start = start++;
r.Length = substring.Length;
if(!ContainsRange(r)) {
this.Add(r);
result = true;
}
}
}
}
return result;
}
private static bool SubstringRangeMoreThanOneChar(SubstringRange range) {
return range.Length > 1;
}
public float Weight {
get {
if(MasterString.Length == 0 || Count == 0)
return 0;
float numerator = 0;
int denominator = 0;
foreach(SubstringRange r in this.FindAll(SubstringRangeMoreThanOneChar)) {
numerator += r.Length;
denominator++;
}
if(denominator == 0)
return 0;
return numerator / denominator / MasterString.Length;
}
}
public void RemoveOverlappingRanges() {
SubstringRangeList l = new SubstringRangeList(this.MasterString);
l.AddRange(this);//create a copy of this list
foreach(SubstringRange r in l) {
if(this.Contains(r) && this.ContainsRange(r)) {
Remove(r);//try to remove the range
if(!ContainsRange(r)) {//see if the list still contains "superset" of this range
Add(r);//if not, add it back
}
}
}
}
public void AddStringToCompare(string s) {
for(int start = 0; start < s.Length; start++) {
for(int len = 1; start + len <= s.Length; len++) {
string part = s.Substring(start, len);
if(!AddSubstring(part))
break;
}
}
RemoveOverlappingRanges();
}
}
public class PartialStringComparer {
public float Compare(string s1, string s2) {
SubstringRangeList srl1 = new SubstringRangeList(s1);
srl1.AddStringToCompare(s2);
SubstringRangeList srl2 = new SubstringRangeList(s2);
srl2.AddStringToCompare(s1);
return (srl1.Weight + srl2.Weight) / 2;
}
}
The Levenshtein distance one is much simpler (adapted from http://www.merriampark.com/ld.htm):
public class Distance {
/// <summary>
/// Compute Levenshtein distance
/// </summary>
/// <param name="s">String 1</param>
/// <param name="t">String 2</param>
/// <returns>Distance between the two strings.
/// The larger the number, the bigger the difference.
/// </returns>
public static int LD(string s, string t) {
int n = s.Length; //length of s
int m = t.Length; //length of t
int[,] d = new int[n + 1, m + 1]; // matrix
int cost; // cost
// Step 1
if(n == 0) return m;
if(m == 0) return n;
// Step 2
for(int i = 0; i <= n; d[i, 0] = i++) ;
for(int j = 0; j <= m; d[0, j] = j++) ;
// Step 3
for(int i = 1; i <= n; i++) {
//Step 4
for(int j = 1; j <= m; j++) {
// Step 5
cost = (t[j - 1] == s[i - 1]) ? 0 : 1;
// Step 6
d[i, j] = System.Math.Min(System.Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1), d[i - 1, j - 1] + cost);
}
}
// Step 7
return d[n, m];
}
}
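A raw distance is hard to compare across names of different lengths, so one option is to scale it into a 0..1 score. The sketch below is an assumption on my part rather than part of the adapted code: NameSimilarity, Distance and Similarity are made-up names, the distance is recomputed with two rows instead of the full matrix, and dividing by the longer length is just one possible normalization.

```csharp
using System;

public static class NameSimilarity
{
    // Levenshtein distance using two rows instead of the full matrix.
    public static int Distance(string s, string t)
    {
        var prev = new int[t.Length + 1];
        var curr = new int[t.Length + 1];
        for (int j = 0; j <= t.Length; j++)
            prev[j] = j;

        for (int i = 1; i <= s.Length; i++)
        {
            curr[0] = i;
            for (int j = 1; j <= t.Length; j++)
            {
                int cost = s[i - 1] == t[j - 1] ? 0 : 1;
                curr[j] = Math.Min(Math.Min(prev[j] + 1, curr[j - 1] + 1), prev[j - 1] + cost);
            }
            // The row just filled becomes the "previous" row of the next pass.
            var tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[t.Length];
    }

    // 1.0 for identical names (ignoring case and surrounding whitespace),
    // going down towards 0.0 as more edits are needed.
    public static double Similarity(string a, string b)
    {
        a = a.Trim().ToUpperInvariant();
        b = b.Trim().ToUpperInvariant();
        int longest = Math.Max(a.Length, b.Length);
        return longest == 0 ? 1.0 : 1.0 - (double)Distance(a, b) / longest;
    }
}
```

For example, "James Clark" and "James Clerk" differ by one substitution in 11 characters, which gives a score of about 0.91. Where to put the threshold for "probably the same person" is something you would have to tune on your own data.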
I doubt there is, considering even the Customs Department doesn't seem to have a satisfactory answer...
If there is a solution to this problem I seriously doubt it's a part of core C#. Off the top of my head, it would require a database of first, middle and last name frequencies, as well as account for initials, as in your example. This is fairly complex logic that relies on a database of information.
Seconding Levenshtein distance. What language do you want? I was able to find an implementation in C# on CodeProject pretty easily.
In an application I worked on, the Last name field was considered reliable.
So we presented all the records with the same last name to the user.
User could sort by the other fields to look for similar names.
This solution was good enough to greatly reduce the issue of users creating duplicate records.
Basically looks like the issue will require human judgement.