Split exponential number string representation into power and exponent - c#

I have strings which come from resources in exponential form like the following: 2⁴. I was wondering if there is a way to split this into:
var base = 2; //or even "2", this is also helpful since it can be parsed
and
var exponent = 4;
I have searched the internet and the MSDN Standard Numeric Format Strings page as well, but I was unable to find a solution for this case.

You can build a mapping from superscript digits to regular digits, then take the leading digits of the source as the base and map all remaining characters to get the exponent:
const string superscriptDigits = "⁰¹²³⁴⁵⁶⁷⁸⁹";
var digitToSuperscriptMapping = superscriptDigits.Select((c, i) => new { c, i })
.ToDictionary(item => item.c, item => item.i.ToString());
const string source = "23⁴⁴";
var baseString = new string(source.TakeWhile(char.IsDigit).ToArray());
var exponentString = string.Concat(source.SkipWhile(char.IsDigit).Select(c => digitToSuperscriptMapping[c]));
Now you can convert base and exponent to int.
Also you'll need to validate input before executing conversion code.
Or even without mapping:
var baseString = new string(source.TakeWhile(char.IsDigit).ToArray());
var exponentString = string.Concat(source.SkipWhile(char.IsDigit).Select(c => char.GetNumericValue(c).ToString()));
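The input validation mentioned above could look roughly like the following sketch. The class name SuperscriptNumber and the method TrySplit are hypothetical (they are not part of .NET), and the sketch assumes the base consists of plain ASCII digits followed only by superscript digits.

```csharp
using System;
using System.Linq;

public static class SuperscriptNumber
{
    // Superscript digits 0-9, ordered so that each character's index is its value.
    private const string SuperscriptDigits =
        "\u2070\u00b9\u00b2\u00b3\u2074\u2075\u2076\u2077\u2078\u2079";

    // Hypothetical helper: returns false instead of throwing on malformed input.
    public static bool TrySplit(string source, out int @base, out int exponent)
    {
        @base = 0;
        exponent = 0;
        if (string.IsNullOrEmpty(source))
            return false;

        // Leading ASCII digits form the base.
        string baseString = new string(source.TakeWhile(c => c >= '0' && c <= '9').ToArray());
        string rest = source.Substring(baseString.Length);

        // Both parts must be present, and the rest must contain superscript digits only.
        if (baseString.Length == 0 || rest.Length == 0 ||
            rest.Any(c => SuperscriptDigits.IndexOf(c) < 0))
            return false;

        // Map every superscript digit to its numeric value and join the values.
        string exponentString = string.Concat(rest.Select(c => SuperscriptDigits.IndexOf(c)));

        // TryParse also rejects values that do not fit into an int.
        return int.TryParse(baseString, out @base) && int.TryParse(exponentString, out exponent);
    }
}
```

With this helper, SuperscriptNumber.TrySplit("23⁴⁴", out var b, out var e) should succeed with b = 23 and e = 44, while an input such as "23x" is rejected.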

You can use a regular expression together with String.Normalize:
var value = "42⁴³";
var match = Regex.Match(value, @"(?<base>\d+)(?<exponent>[⁰¹²³⁴-⁹]+)");
var @base = int.Parse(match.Groups["base"].Value);
var exponent = int.Parse(match.Groups["exponent"].Value.Normalize(NormalizationForm.FormKD));
Console.WriteLine($"base: {@base}, exponent: {exponent}");

The way your exponent is formatted is called superscript in English.
You can find many questions related to this if you search with that keyword.
Digits in superscript are mapped in Unicode as:
0 -> \u2070
1 -> \u00b9
2 -> \u00b2
3 -> \u00b3
4 -> \u2074
5 -> \u2075
6 -> \u2076
7 -> \u2077
8 -> \u2078
9 -> \u2079
You can search for those values in your string:
List<char> superscriptDigits = new List<char>(){
'\u2070', '\u00b9', '\u00b2', '\u00b3', '\u2074',
'\u2075', '\u2076', '\u2077', '\u2078', '\u2079'};
//the rest of the string is the exponent. Join remaining chars.
var exponentChars = str.SkipWhile( ch => !superscriptDigits.Contains(ch) );
You get the idea.

You can use a simple regex (if your source is quite clean):
string value = "2⁴⁴";
string regex = @"(?<base>\d+)(?<exp>.*)";
var matches = Regex.Matches(value, regex);
int b;
int exp = 0;
int.TryParse(matches[0].Groups["base"].Value, out b);
foreach (char c in matches[0].Groups["exp"].Value)
{
exp = exp * 10 + expToInt(c.ToString());
}
Console.Out.WriteLine("base is : {0}, exponent is {1}", b, exp);
And expToInt (based on Unicode subscripts and superscripts):
public static int expToInt(string c)
{
switch (c)
{
case "\u2070":
return 0;
case "\u00b9":
return 1;
case "\u00b2":
return 2;
case "\u00b3":
return 3;
case "\u2074":
return 4;
case "\u2075":
return 5;
case "\u2076":
return 6;
case "\u2077":
return 7;
case "\u2078":
return 8;
case "\u2079":
return 9;
default:
throw new ArgumentOutOfRangeException();
}
}
This will output:
base is : 2, exponent is 44

Related

Hierarchical and numerical ordering of strings made from delimited integers (C#)

I have a list of folder names that represent chapters, subchapters, sections, paragraphs and lines in a specification. A small sample of these folders looks like the following.
1_1_1
1_1_12
1_1_2
1_2_1
1_2_1_3_1
1_2_2
I need to write a function that sorts these numerically while taking the hierarchical nesting into account. For instance, the correct output of sorting the above would be:
1_1_1
1_1_2
1_1_12
1_2_1
1_2_1_3_1
1_2_2
Since this is very much the same way version numbers are sorted I have tried the following code which worked until it attempts to process an input with more than 4 sections (i.e. 1_2_1_3_1)
private List<Resource> OrderResources(List<Resource> list)
{
return list.OrderBy(v => System.Version.Parse(v.Id.Replace('_', '.'))).ToList();
}
The error I get is
System.ArgumentException : Version string portion was too short or too long. (Parameter 'input')
Sorting is possible if you pad every number with leading zeros to a fixed width of n characters and then compare the resulting strings.
You can also use a long and shift each number n digits to the left, but then there will be a limit on the length of the number.
static void Main(string[] args)
{
var chapter = new List<string>();
chapter.Add("1_1_1");
chapter.Add("1_1_12");
chapter.Add("1_1_2");
chapter.Add("1_2_1");
chapter.Add("1_2_1_3_1");
chapter.Add("1_2_2");
var result = chapter.OrderBy(x=>SortCalc(x)).ToArray();
foreach (var s in result)
{
Console.WriteLine($"{s}->{SortCalc(s)}");
}
}
private static string SortCalc(string x, int count = 3)
{
var array = x.Split('_').ToList();
for (var index = 0; index < array.Count; index++)
{
var length = count - array[index].Length;
if (length <=0)
continue;
array[index] = new string('0', length)+ array[index];
}
var num = string.Join("", array);
return num;
}
Output will be
1_1_1->001001001
1_1_2->001001002
1_1_12->001001012
1_2_1->001002001
1_2_1_3_1->001002001003001
1_2_2->001002002
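The padded key above only works while every segment has at most count digits. As an alternative sketch (not the approach of the answer above), the segments can be compared directly as integers. SegmentComparer is a hypothetical name, and the code assumes every segment is a valid non-negative integer that fits into an int.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer: compares "1_2_10"-style IDs segment by segment as integers.
public sealed class SegmentComparer : IComparer<string>
{
    public int Compare(string x, string y)
    {
        // Assumption: both inputs are non-null and contain only digits and '_'.
        int[] a = x.Split('_').Select(s => int.Parse(s)).ToArray();
        int[] b = y.Split('_').Select(s => int.Parse(s)).ToArray();

        // The first differing segment decides the order.
        for (int i = 0; i < Math.Min(a.Length, b.Length); i++)
        {
            int result = a[i].CompareTo(b[i]);
            if (result != 0)
                return result;
        }

        // All shared segments are equal: the shorter (parent) ID comes first.
        return a.Length.CompareTo(b.Length);
    }
}
```

It can be used as chapter.OrderBy(x => x, new SegmentComparer()). The IDs are re-parsed on every comparison, which keeps the sketch short but is not the fastest option for large lists.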
I am posting this as a way to help the OP that has a problem with the solution posted at Natural Sort Order in C#
This answer is just an example on how to use that class described in that answer. So I am pretty close to a verbatim copy of the mentioned answer. But, in any case, please look at the comment in the question.
Here we go.
First the class NaturalStringComparer is taken in full from the answer linked above.
Next I add a facsimile of the class Resource used by the OP and some code to initialize it
public class Resource
{
public string ID { get; set; }
// ... other properties ...
}
void Main()
{
List<Resource> unordered = new List<Resource>
{
new Resource{ ID = "1_1_12"},
new Resource{ ID = "1_2_1_3_1"},
new Resource{ ID = "1_1_2"},
new Resource{ ID = "1_2_2"},
new Resource{ ID = "1_1_1"},
new Resource{ ID = "1_2_1"},
};
Now the call to the NaturalStringComparer is the following
var c = new NaturalStringComparer();
var r = unordered.OrderBy(u => u.ID, c);
And the results are
foreach(var x in r)
Console.WriteLine(x.ID);
---------
1_1_1
1_1_2
1_1_12
1_2_1
1_2_1_3_1
1_2_2
I provide also a benchmark to check the performances.
Stopwatch sw = new Stopwatch();
sw.Start();
for (int x = 0; x < 1000000; x++)
{
var c = new NaturalStringComparer();
var r = unordered.OrderBy(u => u.ID, c);
}
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds);
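I get 17 ms, which looks pretty good, although the loop never enumerates r, so the deferred OrderBy query is built but not actually executed.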
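To see how deferred execution affects a measurement like this, the following sketch counts comparer calls before and after the query is enumerated. CountingComparer and DeferredSortDemo are hypothetical names, and an ordinal string comparison stands in for NaturalStringComparer, which is not reproduced here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in comparer (hypothetical) that counts how often it is called.
public sealed class CountingComparer : IComparer<string>
{
    public int Calls { get; private set; }

    public int Compare(string x, string y)
    {
        Calls++;
        return string.CompareOrdinal(x, y);
    }
}

public static class DeferredSortDemo
{
    // Returns the comparer call counts before and after the query is enumerated.
    public static (int Before, int After) Run()
    {
        var ids = new List<string> { "1_1_12", "1_2_1_3_1", "1_1_2", "1_2_2" };
        var comparer = new CountingComparer();

        // OrderBy only builds the query; no comparison happens yet.
        var query = ids.OrderBy(id => id, comparer);
        int before = comparer.Calls;

        // Enumerating the query (here via ToList) is what triggers the sort.
        _ = query.ToList();
        return (before, comparer.Calls);
    }
}
```

Before is 0 and After is greater than 0, so a timing loop needs something like .ToList() inside it if the sort itself is to be measured.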
I decided to implement it using Linq as well:
using System;
using System.Collections.Generic;
using System.Linq;
class Program
{
static void Main(string[] args)
{
var chapter = new List<string>();
chapter.Add("1_5");
chapter.Add("1_4_1");
chapter.Add("1_1_1");
chapter.Add("1_1_12");
chapter.Add("1_1_2");
chapter.Add("1_2_1");
chapter.Add("1_2_1_3_1");
chapter.Add("1_2_2");
chapter.Add("1_2_22");
chapter.Add("1_2_21");
var result = chapter
.OrderBy(aX=> aX.Replace("_", "0"))
.ToArray();
foreach (var s in result)
{
Console.WriteLine($"{s} -> {s.Replace("_", "0")}");
}
}
}
For some extra speed, the implementation could also use Span<char> chars = stackalloc char[x.Length];. You can then skip the Split step and write it like this:
using System;
using System.Collections.Generic;
using System.Linq;
class Program
{
static void Main(string[] args)
{
var chapter = new List<string>();
chapter.Add("1_5");
chapter.Add("1_4_1");
chapter.Add("1_1_1");
chapter.Add("1_1_12");
chapter.Add("1_1_2");
chapter.Add("1_2_1");
chapter.Add("1_2_1_3_1");
chapter.Add("1_2_2");
chapter.Add("1_2_22");
chapter.Add("1_2_21");
var spanResult = chapter
.OrderBy(Sort)
.ToArray();
foreach (var s in spanResult)
{
Console.WriteLine($"{s} -> {Sort(s)}");
}
}
static string Sort(string aValue)
{
const char zero = '0';
const char underscore = '_';
Span<char> chars = stackalloc char[aValue.Length];
for (int i = 0; i < aValue.Length; i++)
{
if (aValue[i] == underscore)
{
if (i != aValue.Length - 1)
{
chars[i] = zero;
}
continue;
}
chars[i] = aValue[i];
}
return chars.ToString();
}
}
The output:
Linq
1 -> 1
1_ -> 10
1_1_1 -> 10101
1_1_12 -> 101012
1_1_2 -> 10102
1_2_1 -> 10201
1_2_1_3_1 -> 102010301
1_2_2 -> 10202
1_2_21 -> 102021
1_2_22 -> 102022
1_4_1 -> 10401
1_5 -> 105
Span
1 -> 1
1_ -> 1
1_1_1 -> 10101
1_1_12 -> 101012
1_1_2 -> 10102
1_2_1 -> 10201
1_2_1_3_1 -> 102010301
1_2_2 -> 10202
1_2_21 -> 102021
1_2_22 -> 102022
1_4_1 -> 10401
1_5 -> 105
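The assumption in my implementation is that _ is only used to separate hierarchy levels; otherwise the implementation would need to check for that, of course. Note that both variants compare the keys as text, so 1_1_12 sorts before 1_1_2 (as the output shows) rather than in the numeric order the question asks for.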
I have added the updated implementation in case the "assumption" is not correct.

How to return the amount of duplicate letters in a string

I am trying to get a user's input so that I can return how many duplicate characters they have.
This is how I got the input
Console.WriteLine("Input a word to reveal duplicate letters");
string input = Console.ReadLine();
For example, the code should return something like:
List of duplicate characters in String 'Programming'
g : 2
r : 2
m : 2
How do I find duplicate letters and count them in a string?
Yes, you can do this with GroupBy() from System.Linq: group the string by character value, then keep only the groups that contain more than one value, like so:
var word = "Hello World!";
var multipleChars = word.GroupBy(c => c).Where(group => group.Count() > 1);
foreach (var charGroup in multipleChars)
{
Console.WriteLine(charGroup.Key + " : " + charGroup.Count());
}
This version ignores case and excludes non-alphanumeric characters:
var sample = Console.ReadLine();
var letterCounter = sample.Where(char.IsLetterOrDigit)
.GroupBy(char.ToLower)
.Select(counter => new { Letter = counter.Key, Counter = counter.Count() })
.Where(c=>c.Counter>1);
foreach (var counter in letterCounter){
Console.WriteLine(String.Format("{0} = {1}", counter.Letter,counter.Counter));
}

using Func<string, bool>, how can I split on a character then count the length of those values within a function

Essentially, I need to split a string like "aaaaa.bbbbbbbb.cccccc" on the . and then count the length of the split values using a function.
Func<string, bool> length = f => f.Split(".").Length > 1;
Pretty much this, but instead of counting the length of the split array, I need to count how many letters each entry of the array has and see if they are over a certain length.
If you need a boolean answer, then it's one of the following (depending on whether you want at least one substring with length > 1 or all of them):
Func<string, bool> length1 = f => f.Split('.').Any(s => s.Length > 1);
Func<string, bool> length2 = f => f.Split('.').All(s => s.Length > 1);
I think you are trying to do this:
string input = "aaaaa.bbbbbbbb.cccccc";
var parts = input.Split('.');
var lengths = parts.Select(e=>e.Count());
If you mean per letter:
aaaaa.bbbbbbbb.cccccc -> a = 5, b = 8, c = 6
Func<string, IEnumerable<int>> length = f => f.Split('.').Select(a => a.Length);
From what I understand, you need to take a string, split it only by the delimiter '.', and then check whether the length of each split value is greater than 1.
If so, you can use the following:
Func<string, bool> length = str => str.Split('.').All(s => s.Length > 1);
This will first split the string by your delimiter, then iterate over all of the values to check whether their lengths are greater than 1.
Quick test case:
string test1 = "aaa.b.ccccccc";
string test2 = "aaaaaa.bbb.c";
string test3 = "aaa.bbb.ccccc";
Console.WriteLine(length(test1)); // false, as b is 1, not greater
Console.WriteLine(length(test2)); // false, for similar reasons
Console.WriteLine(length(test3)); // true, all are greater
Another way using a func delegate:
Func<string, Tuple<string[], int[]>> length = (str) => {
string[] stringParts = str.Split('.');
int[] countLetters = stringParts.Select(s => s.Length).ToArray();
return new Tuple<string[], int[]>(stringParts, countLetters);
};
string input = "aaaaa.bbbbbbbb.cccccc";
var res = length(input);
string[] strs = res.Item1;
int[] countLetter = res.Item2;
for (int i = 0; i < strs.Length; i++)
{
Console.WriteLine(strs[i]);
Console.WriteLine(countLetter[i]);
}
Output:
aaaaa
5
bbbbbbbb
8
cccccc
6
An extension method would perhaps be an easier way to do this.
How about writing a method that returns the list of strings that are longer than the length you specify, something like this:
IEnumerable<string> GetSplittedAboveLimit(string inputString,int limit)
{
var splitted = inputString.Split(".");
foreach(var input in splitted)
{
if(input.Length > limit)
{
yield return input;
}
}
}
What about this:
Func<string, List<int>> length = f => f.Split('.').Select(x=>x.Length).ToList();
And call the method like this:
string inputStr = "aaaaa.bbbbbbbb.cccccc";
Console.WriteLine(String.Join(",",length(inputStr))); // prints 5,8,6
Please note: the second type parameter of the Func denotes the return type. In this example I used List<int>, which is why I added .ToList() at the end of the code. You can change the return type accordingly. If you are OK with IEnumerable<int>, then
Func<string, IEnumerable<int>> length = f => f.Split('.').Select(x=>x.Length);
is enough.

system.version more than 3 decimal points c#

I am currently receiving the following error -
"Version string portion was too short or too long"
When using this statement -
records = records.OrderBy(r => new Version(r.RefNo)).ToList();
To order a list of strings called RefNo. It fails on 25.1.2.1.2, so I assume it is because it has four decimal points, as it works OK with three.
Is the max four decimal points for System.Version?
Thanks
A Version can only have 4 parts:
major, minor, build, and revision, in that order, and all separated by
periods.
That's why your approach fails. You could create an extension method that handles this case, for example:
public static Version TryParseVersion(this string versionString)
{
if (string.IsNullOrEmpty(versionString))
return null;
String[] tokens = versionString.Split('.');
if (tokens.Length < 2 || !tokens.All(t => t.All(char.IsDigit)))
return null;
if (tokens.Length > 4)
{
int maxVersionLength = tokens.Skip(4).Max(t => t.Length);
string normalizedRest = string.Concat(tokens.Skip(4).Select(t => t.PadLeft(maxVersionLength, '0')));
tokens[3] = $"{tokens[3].PadLeft(maxVersionLength, '0')}{normalizedRest}";
Array.Resize(ref tokens, 4);
}
versionString = string.Join(".", tokens);
bool valid = Version.TryParse(versionString, out Version v);
return valid ? v : null;
}
Now you can use it in this way:
records = records
.OrderBy(r => r.RefNo.TryParseVersion())
.ToList();
With your sample, this new version string will be parsed successfully: 25.1.2.12
See MSDN
Constructor public Version(string version)
A string containing the major, minor, build, and revision numbers,
where each number is delimited with a period character ('.').
Makes a total of 4 numbers.
This means the string is limited to 4 numbers; 5 lead to an error.
Also, the constructors that take ints as parameters only support 2 to 4 parameters.
Sorry for the late reply, but here is the finished extension method I used, with a couple of alterations:
public static Version ParseRefNo(this string refNoString)
{
if (string.IsNullOrEmpty(refNoString))
return null;
String[] tokens = refNoString.Split('.');
if (tokens.Length < 2 || !tokens.All(t => t.All(char.IsDigit)))
return null;
if (tokens.Length > 4)
{
int maxVersionLength = tokens.Skip(4).Max(t => t.Length);
string normalizedRest = string.Concat(tokens.Skip(4).Select(t => t.PadLeft(maxVersionLength, '0')));
tokens[3] = $"{tokens[3].PadLeft(maxVersionLength, '0')}{normalizedRest}";
Array.Resize(ref tokens, 4);
}
refNoString = string.Join(".", tokens);
Version v = null;
bool valid = Version.TryParse(refNoString, out v);
return valid ? v : null;
}

Using LINQ to parse the numbers from a string

Is it possible to write a query where we get all those characters that could be parsed into int from any given string?
For example we have a string like: "$%^DDFG 6 7 23 1"
Result must be "67231"
And even slightly harder: can we get only the first three numbers?
This will give you your string
string result = new String("y0urstr1ngW1thNumb3rs".
Where(x => Char.IsDigit(x)).ToArray());
And for the first 3 chars use .Take(3) before ToArray()
The following should work.
var myString = "$%^DDFG 6 7 23 1";
//note that this is still an IEnumerable object and will need
// conversion to int, or whatever type you want.
var myNumber = myString.Where(a=>char.IsNumber(a)).Take(3);
It's not clear if you want 23 to be considered a single number sequence, or 2 distinct numbers. My solution above assumes you want the final result to be 672
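If 23 should instead count as a single number, the digit runs can be matched as whole numbers. The following is a minimal sketch under that assumption; NumberExtractor and FirstNumbers are hypothetical names, and each run of digits is assumed to fit into an int.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class NumberExtractor
{
    // Hypothetical helper: returns the first `count` runs of ASCII digits as whole numbers.
    public static int[] FirstNumbers(string input, int count)
    {
        return Regex.Matches(input, "[0-9]+")
            .Cast<Match>()                     // MatchCollection -> IEnumerable<Match>
            .Take(count)                       // keep only the first `count` runs
            .Select(m => int.Parse(m.Value))   // throws OverflowException for very long runs
            .ToArray();
    }
}
```

For the sample string "$%^DDFG 6 7 23 1" and a count of 3, this yields 6, 7 and 23 rather than the characters 6, 7 and 2.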
public static string DigitsOnly(string strRawData)
{
return Regex.Replace(strRawData, "[^0-9]", "");
}
string testString = "$%^DDFG 6 7 23 1";
string cleaned = new string(testString.ToCharArray()
.Where(c => char.IsNumber(c)).Take(3).ToArray());
If you want to use a white list (not always numbers):
char[] acceptedChars = { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9' };
string cleaned = new string(testString.ToCharArray()
.Where(c => acceptedChars.Contains(c)).Take(3).ToArray());
How about something like this?
var yourstring = "$%^DDFG 6 7 23 1";
var selected = yourstring.ToCharArray().Where(c=> c >= '0' && c <= '9').Take(3);
var reduced = yourstring.Where(char.IsDigit).Take(3);
Regex:
private int ParseInput(string input)
{
System.Text.RegularExpressions.Regex r = new System.Text.RegularExpressions.Regex(@"\d+");
string valueString = string.Empty;
foreach (System.Text.RegularExpressions.Match match in r.Matches(input))
valueString += match.Value;
return Convert.ToInt32(valueString);
}
And even slightly harder: Can we get
only first three numbers?
private static int ParseInput(string input, int take)
{
System.Text.RegularExpressions.Regex r = new System.Text.RegularExpressions.Regex(@"\d+");
string valueString = string.Empty;
foreach (System.Text.RegularExpressions.Match match in r.Matches(input))
valueString += match.Value;
valueString = valueString.Substring(0, Math.Min(valueString.Length, take));
return Convert.ToInt32(valueString);
}
string strRawData = "12#$%33fgrt$%$5";
string[] arr = Regex.Split(strRawData, "[^0-9]");
int a1 = 0;
foreach (string value in arr)
{
Console.WriteLine("line no." + a1 + " =" + value);
a1++;
}
Output:
line no.0 =12
line no.1 =
line no.2 =
line no.3 =33
line no.4 =
line no.5 =
line no.6 =
line no.7 =
line no.8 =
line no.9 =
line no.10 =5
Press any key to continue . . .
