How to append character in long integer? - c#

I want to append a character to a long integer, using the below code:
if (strArrIds[1].Contains("CO"))
{
long rdb2 = Convert.ToInt64(strArrIds[1].Substring(strArrIds[1].Length - 1));
assessmentEntity.RatingType = rdb2;
}
If rdb2 = 5, I want to append a L to this value, like rdb2 = 5L.
Any ideas? Thanks in advance.

You can use long.Parse instead of Convert.ToInt64 to get the long; you won't need to append L to make it a long.
if (strArrIds[1].Contains("CO"))
{
long rdb2 = long.Parse(strArrIds[1].Substring(strArrIds[1].Length - 1));
assessmentEntity.RatingType = rdb2;
}

If you are processing a lot of these, and if this item will be used for further processing, you could consider turning this into a class which might be easier to manage. This would also allow you to more easily customise your ratings.
I'm thinking maybe a factory might serve you well also which could instantiate and return your assessment entity. You could then leverage a dependency injection strategy for any other functionality.
It's a little hard to tell what you require this for without a bit more context.
If this is a one-off, I would refactor to
assessmentEntity.RatingType = strArrIds[1].Contains("CO") ?
String.Concat(long.Parse(strArrIds[1].Substring(strArrIds[1].Length - 1)).ToString(), "L") :
"0N";
Assuming "0N" is some other default rating.

You do not need an L here. That is only for literals of type long appearing in the C# source.
You could do something like:
string str1 = strArrIds[1];
if (str1.Contains("CO"))
{
long rdb2 = str1[str1.Length - 1] - '0';
if (rdb2 < 0L || rdb2 > 9L)
throw new InvalidOperationException("Unexpected rdb2 value from str1=" + str1);
assessmentEntity.RatingType = rdb2;
}
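To illustrate the point made in the answers above: the L suffix exists only in source code, where it marks a literal as a long; a value parsed at run time is already a long. The following is a minimal sketch, assuming the ID string ends in a single digit. The class name RatingParser, the method name LastDigitAsLong and the sample ID "CO_5" are made up for illustration and do not come from the question.

```csharp
using System;

static class RatingParser
{
    // Hypothetical helper: reads the trailing digit of an ID such as "CO_5".
    public static long LastDigitAsLong(string id)
    {
        if (string.IsNullOrEmpty(id))
            throw new ArgumentException("ID must not be empty", "id");

        string lastChar = id.Substring(id.Length - 1);
        long value;
        if (!long.TryParse(lastChar, out value))
            throw new FormatException("ID does not end in a digit: " + id);

        return value; // already a long; no "L" is involved at run time
    }
}
```

With this sketch, RatingParser.LastDigitAsLong("CO_5") == 5L evaluates to true: the literal 5L and the parsed value are the same long.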

Related

Postfix Calculator

Making a console application in C sharp to solve expressions in postfix notation by utilizing a stack, such as:
Expression: 43+2*
Answer: 14
What I've done so far:
using System;
using System.Collections;
using System.Linq;
using System.Text;
namespace ConsoleApplication7
{
class Program
{
static void Main(string[] args)
{
string input = "23+";
int counter = 0;
Stack values = new Stack();
while (counter < input.Length)
{
int temp1,
temp2,
answer;
char x = char.Parse(input.Substring(counter, 1));
if ( );
else if (x == '+')
{
temp1 = (int)values.Pop();
temp2 = (int)values.Pop();
values.Push(answer = temp1 + temp2);
}
else if (x == '-')
{
temp1 = (int)values.Pop(); // right operand (pushed last)
temp2 = (int)values.Pop(); // left operand
values.Push(answer = temp2 - temp1);
}
else if (x == '*')
{
temp1 = (int)values.Pop();
temp2 = (int)values.Pop();
values.Push(answer = temp2 * temp1);
}
else if (x == '/')
{
temp1 = (int)values.Pop(); // divisor
temp2 = (int)values.Pop(); // dividend
values.Push(answer = temp2 / temp1);
}
counter++;
}
Console.WriteLine(values.Pop());
}
}
}
For the if statement, what can I use as a condition to check if x is an operand?
Is your example input 2, 3, + (which equals 5), or 23, + (which is invalid input)? I'm assuming the former. How, then, would you write two-digit numbers? Your current approach doesn't seem to support this. I think you shouldn't parse this char by char, but split it into separate components first, perhaps using a regex that recognizes numbers and punctuation. As a quick example: Regex.Matches("10 3+", @"(\d+|[\+\-\*/ ])") splits into 10, a space, 3, and +, which can be parsed and understood fairly easily with the code you already have. Spaces should be ignored; they're simply punctuation I picked to separate numbers so that you can have multi-digit numbers. Use int.TryParse (or double.TryParse, which requires a more complicated regex pattern; see Matching Floating Point Numbers for that pattern) to see if an input is a number.
You should use a Stack<int> to avoid casting and make it compile-time safe.
Surely this is wrong:
((int)Char.GetNumericValue(x) <= 0 && (int)Char.GetNumericValue(x) >= 0)
I think it should be
((int)Char.GetNumericValue(x) <= 9 && (int)Char.GetNumericValue(x) >= 0)
I really think this is more of a code review, but so be it. First: please separate some concerns. You baked everything into one big messy monster; think about the parts of the problem and put them into separate methods to start with.
Then: if you cannot solve the whole problem, make it smaller first. Let the user enter some kind of separator for the parts, or assume for now that they do; a space will do just fine.
You can think of how to handle operators without spaces pre/postfixed later.
So try parsing "2 3 +" instead of "23+" or "2 3+".
If you do this you can indeed just use String.Split to make your life much easier!
As to how you can recognize an operand: very easy. Try Double.TryParse; it will tell you if you passed it a valid number, and you don't have to waste your time parsing the numbers yourself.
Instead of using a while in there you should use a for or, even better, a foreach. Heck, you can even do this with LINQ and Enumerable.Aggregate and get FUNctional :D
And finally: don't use this if/then/else mess if a switch does the job.
You could say that there essentially are no operands. Even digits can be thought of as operators that multiply the top of the stack by 10 and add the digit value; accumulating a value over several digits as necessary. Then you just need an operator for seeding this by pushing a zero to the stack (perhaps a space character for that).
http://blogs.msdn.com/b/ashleyf/archive/2009/10/23/tinyrpn-calculator.aspx
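Several of the suggestions above (a typed Stack<int>, space-separated tokens via String.Split, TryParse to recognize operands, and a switch for the operators) can be combined into one small evaluator. This is only a sketch under those assumptions, not the asker's final program: it expects tokens to be separated by spaces and works on integers only. The class name Postfix and the method name Evaluate are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

static class Postfix
{
    // Evaluates a space-separated postfix expression such as "4 3 + 2 *".
    public static int Evaluate(string expression)
    {
        var values = new Stack<int>();
        string[] tokens = expression.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        foreach (string token in tokens)
        {
            int number;
            if (int.TryParse(token, out number))
            {
                values.Push(number); // the token is an operand
                continue;
            }

            if (values.Count < 2)
                throw new FormatException("Missing operand before '" + token + "'");

            int right = values.Pop(); // pushed last, so it is the right-hand operand
            int left = values.Pop();
            switch (token)
            {
                case "+": values.Push(left + right); break;
                case "-": values.Push(left - right); break;
                case "*": values.Push(left * right); break;
                case "/": values.Push(left / right); break; // integer division
                default: throw new FormatException("Unknown token '" + token + "'");
            }
        }

        if (values.Count != 1)
            throw new FormatException("Expression leaves " + values.Count + " values on the stack");
        return values.Pop();
    }
}
```

For the example from the question, Postfix.Evaluate("4 3 + 2 *") returns 14. Because operands are whole tokens rather than single characters, multi-digit input such as "10 3 -" also works and yields 7.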

Constantly Incrementing String

So, what I'm trying to do is something like this: (example)
a,b,c,d.. etc. aa,ab,ac.. etc. ba,bb,bc, etc.
So, this can essentially be explained as generally increasing and just printing all possible variations, starting at a. So far, I've been able to do it with one letter, starting out like this:
for (int i = 97; i <= 122; i++)
{
item = (char)i;
}
But, I'm unable to eventually add the second letter, third letter, and so forth. Is anyone able to provide input? Thanks.
Since there hasn't been a solution so far that would literally "increment a string", here is one that does:
static string Increment(string s) {
if (s.All(c => c == 'z')) {
return new string('a', s.Length + 1);
}
var res = s.ToCharArray();
var pos = res.Length - 1;
do {
if (res[pos] != 'z') {
res[pos]++;
break;
}
res[pos--] = 'a';
} while (true);
return new string(res);
}
The idea is simple: pretend that letters are your digits, and do an increment the way they teach in an elementary school. Start from the rightmost "digit", and increment it. If you hit a nine (which is 'z' in our system), move on to the prior digit; otherwise, you are done incrementing.
The obvious special case is when the "number" is composed entirely of nines. This is when your "counter" needs to roll to the next size up, and add a "digit". This special condition is checked at the beginning of the method: if the string is composed of N letters 'z', a string of N+1 letter 'a's is returned.
Here is a link to a quick demonstration of this code on ideone.
Each iteration of your for loop completely overwrites what is in "item"; the loop just assigns one character, (char)i, at a time.
If item is a string, use something like this:
item = "";
for (int i = 97; i <= 122; i++)
{
item += (char)i;
}
Something to the effect of:
public string IncrementString(string value)
{
if (string.IsNullOrEmpty(value)) return "a";
char last = value[value.Length - 1];
if (last.ToByte() == 122) // 122 is 'z'
return value + "a";
// replace the last character with its successor
return value.Substring(0, value.Length - 1) + (char)(last.ToByte() + 1);
}
You'll probably need to convert the char to a byte. That can be encapsulated in an extension method like static int ToByte(this char c).
StringBuilder is a better choice when building large numbers of strings, so you may want to consider using that instead of string concatenation.
Another way to look at this is that you want to count in base 26. The computer is very good at counting and since it always has to convert from base 2 (binary), which is the way it stores values, to base 10 (decimal--the number system you and I generally think in), converting to different number bases is also very easy.
There's a general base converter here https://stackoverflow.com/a/3265796/351385 which converts an array of bytes to an arbitrary base. Once you have a good understanding of number bases and can understand that code, it's a simple matter to create a base 26 counter that counts in binary, but converts to base 26 for display.
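The counting idea can be sketched without the general converter. One caveat: the sequence a..z, aa, ab, ... has no zero digit, so it is strictly a "bijective" base 26 rather than ordinary base 26; the sketch below handles that by subtracting 1 before taking each digit. The class name LetterCounter and the method name ToLetters are made up for illustration, and the 1-based numbering (1 maps to "a") is an assumption.

```csharp
using System;
using System.Text;

static class LetterCounter
{
    // Converts 1 -> "a", 26 -> "z", 27 -> "aa", 28 -> "ab", and so on.
    public static string ToLetters(long n)
    {
        if (n < 1) throw new ArgumentOutOfRangeException("n");

        var sb = new StringBuilder();
        while (n > 0)
        {
            n--;                                  // shift to 0-based so 'z' needs no carry
            sb.Insert(0, (char)('a' + n % 26));   // lowest "digit" goes on the left end last
            n /= 26;
        }
        return sb.ToString();
    }
}
```

Printing the whole sequence then becomes an ordinary integer loop, for example for (long i = 1; i <= 702; i++) Console.WriteLine(LetterCounter.ToLetters(i)); which runs from "a" through "zz".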

String Builder vs Lists

I am reading in multiple files with millions of lines, and I am creating a list of all line numbers that have a specific issue, for example a specific field that is left blank or contains an invalid value.
So my question is: what would be the most efficient data type to keep track of a list of numbers that could run to upwards of a million rows? Would using StringBuilder, a List, or something else be more efficient?
My end goal is to output a message like "Specific field is blank on 1-32, 40, 45, 47, 49-51", etc. So in the case of a StringBuilder, I would check the previous value; if the new number is only 1 more, I would change it from 1 to 1-2, and if the gap is more than one I would separate it with a comma. With the List, I would just add each number to the list and then combine them once the file has been completely read. However, in this case I could have multiple lists containing millions of numbers.
Here is the current code I am using to combine a list of numbers using StringBuilder:
string currentLine = sbCurrentLineNumbers.ToString();
string currentLineSub;
StringBuilder subCurrentLine = new StringBuilder();
StringBuilder subCurrentLineSub = new StringBuilder();
int indexLastSpace = currentLine.LastIndexOf(' ');
int indexLastDash = currentLine.LastIndexOf('-');
int currentStringInt = 0;
if (sbCurrentLineNumbers.Length == 0)
{
sbCurrentLineNumbers.Append(lineCount);
}
else if (indexLastSpace == -1 && indexLastDash == -1)
{
currentStringInt = Convert.ToInt32(currentLine);
if (currentStringInt == lineCount - 1)
sbCurrentLineNumbers.Append("-" + lineCount);
else
{
sbCurrentLineNumbers.Append(", " + lineCount);
commaCounter++;
}
}
else if (indexLastSpace > indexLastDash)
{
currentLineSub = currentLine.Substring(indexLastSpace);
currentStringInt = Convert.ToInt32(currentLineSub);
if (currentStringInt == lineCount - 1)
sbCurrentLineNumbers.Append("-" + lineCount);
else
{
sbCurrentLineNumbers.Append(", " + lineCount);
commaCounter++;
}
}
else if (indexLastSpace < indexLastDash)
{
currentLineSub = currentLine.Substring(indexLastDash + 1);
currentStringInt = Convert.ToInt32(currentLineSub);
string charOld = currentLineSub;
string charNew = lineCount.ToString();
if (currentStringInt == lineCount - 1)
sbCurrentLineNumbers.Replace(charOld, charNew);
else
{
sbCurrentLineNumbers.Append(", " + lineCount);
commaCounter++;
}
}
My end goal is to output a message like "Specific field is blank on 1-32, 40, 45, 47, 49-51"
If that's the end goal, no point in going through an intermediary representation such as a List<int> - just go with a StringBuilder. You will save on memory and CPU that way.
StringBuilder serves your purpose so stick with that, if you ever need the line numbers you can easily change the code then.
Depends on how you can / want to break the code up.
Given you are reading it in line order, not sure you need a list at all.
Your current desired output implies that you can't output anything until the file is completely scanned. The size of the file suggests a one-pass analysis phase would be a good idea as well, given that you are going to use buffered input as opposed to reading the entire thing into memory.
I'd be tempted to use an enum to describe the issue (e.g. Field??? is blank) and then use that as the key of a dictionary of string builders.
As a first thought, anyway.
Is your output supposed to be human readable? If so, you'll hit the limit of what is reasonable to read, long before you have any performance/memory issues from your data structure. Use whatever is easiest for you to work with.
If the output is supposed to be machine readable, then that output might suggest an appropriate data structure.
As others have pointed out, I would probably use StringBuilder. The List may have to resize many times; the new implementation of StringBuilder does not have to resize.
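One way to follow the StringBuilder advice without re-parsing the builder's text on every line is to keep the current run as two integers and only append when the run ends. This is a sketch, not the asker's code: it assumes line numbers arrive in ascending order, and the class name LineRangeBuilder is made up for illustration.

```csharp
using System;
using System.Text;

// Hypothetical helper: collects ascending line numbers as "1-3, 7, 9-10".
sealed class LineRangeBuilder
{
    private readonly StringBuilder text = new StringBuilder();
    private int rangeStart;
    private int rangeEnd;
    private bool hasOpenRange;

    public void Add(int lineNumber)
    {
        if (hasOpenRange && lineNumber == rangeEnd + 1)
        {
            rangeEnd = lineNumber; // extend the current run, nothing is written yet
            return;
        }
        if (hasOpenRange) AppendRange(text, rangeStart, rangeEnd);
        rangeStart = rangeEnd = lineNumber;
        hasOpenRange = true;
    }

    // Writes "start" or "start-end", preceded by ", " when needed.
    private static void AppendRange(StringBuilder sb, int start, int end)
    {
        if (sb.Length > 0) sb.Append(", ");
        sb.Append(start);
        if (end > start) sb.Append('-').Append(end);
    }

    public override string ToString()
    {
        // Work on a copy so that the pending run can still be extended afterwards.
        var result = new StringBuilder(text.ToString());
        if (hasOpenRange) AppendRange(result, rangeStart, rangeEnd);
        return result.ToString();
    }
}
```

Adding 1, 2, 3, 7, 9 and 10 in that order and calling ToString() gives "1-3, 7, 9-10". One builder per kind of issue (for instance as the values of the dictionary suggested above) keeps memory use proportional to the number of runs rather than the number of lines.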

Splitting strings at specific positions

I have a small problem here: I'm looking for a better way to split strings.
For example, I receive a string looking like this:
0000JHASDF+4429901234ALEXANDER
I know the pattern the string is built with, and I have an array of field lengths like this:
4,6,4,7,9
0000 - JHASDF - +442 - 9901234 - ALEXANDER
It is easy to split the whole thing up with the string Mid command, but it seems to be slow when I receive a file containing 8000 - 10000 datasets.
So, any suggestions on how I can make this faster to get the data into a List or an array of strings?
For example, does anyone know how to do this with Regex?
var lengths = new[] { 4, 6, 4, 7, 9 };
var parts = new string[lengths.Length];
// if you're not using .NET4 or above then use ReadAllLines rather than ReadLines
foreach (string line in File.ReadLines("YourFile.txt"))
{
int startPos = 0;
for (int i = 0; i < lengths.Length; i++)
{
parts[i] = line.Substring(startPos, lengths[i]);
startPos += lengths[i];
}
// do something with "parts" before moving on to the next line
}
Isn't mid a VB method?
// input is the received string
string firstPart = input.Substring(0, 4);
string secondPart = input.Substring(4, 6);
string thirdPart = input.Substring(10, 4);
//...
Perhaps something like this:
string[] SplitString(string s,int[] parts)
{
string[] result=new string[parts.Length];
int start=0;
for(int i=0;i<parts.Length;i++)
{
int len=parts[i];
result[i]=s.Substring(start, len);
start += len;
}
if(start!=s.Length)
throw new ArgumentException("String length doesn't match sum of part lengths");
return result;
}
(I didn't compile it, so it probably contains some minor errors)
As the Mid() function is VB, you could simply try
string.Substring(0, 4);
and so on.
The Regex Split Method would be a possibility, but since you don't have a specific delimiter in the string then I doubt it will be of any use and unlikely to be any faster.
String.Substring is also a possibility. You use it like: var myFirstString = fullString.Substring(0, 4)
I know this is late, but in the Microsoft.VisualBasic.FileIO namespace you can find the TextFieldParser class, which would do a better job of handling your issue. Here is a link to MSDN - https://msdn.microsoft.com/en-us/library/zezabash.aspx with an explanation. The code is in VB, but you can easily convert it to C#. You will need to add a reference to the Microsoft.VisualBasic assembly as well. Hope this helps anyone stumbling on this question in the future.
Here is what it would look like in vb for the questioner's issue:
Using Reader As New Microsoft.VisualBasic.FileIO.
TextFieldParser("C:\TestFolder\test.log")
Reader.TextFieldType =
Microsoft.VisualBasic.FileIO.FieldType.FixedWidth
Reader.SetFieldWidths(4, 6, 4, 7, 9)
Dim currentRow As String()
While Not Reader.EndOfData
Try
currentRow = Reader.ReadFields()
Dim currentField As String
For Each currentField In currentRow
MsgBox(currentField)
Next
Catch ex As Microsoft.VisualBasic.FileIO.MalformedLineException
MsgBox("Line " & ex.Message &
"is not valid and will be skipped.")
End Try
End While
End Using
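A C# conversion of the VB sample above might look like the following sketch. It assumes the project references the Microsoft.VisualBasic assembly, and it reads from a TextReader instead of a file path so that it can be tried without a file on disk. The class name FixedWidthReader and the method name ReadRows are made up for illustration; TextFieldParser, FieldType.FixedWidth, SetFieldWidths, EndOfData, ReadFields and MalformedLineException are the same members the VB code uses.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

static class FixedWidthReader
{
    // Splits every line from the reader into fields of the given widths.
    public static List<string[]> ReadRows(TextReader reader, params int[] widths)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(reader))
        {
            parser.TextFieldType = FieldType.FixedWidth;
            parser.SetFieldWidths(widths);

            while (!parser.EndOfData)
            {
                try
                {
                    rows.Add(parser.ReadFields());
                }
                catch (MalformedLineException ex)
                {
                    // Same policy as the VB sample: report the bad line and skip it.
                    Console.Error.WriteLine("Line " + ex.LineNumber + " is not valid and will be skipped.");
                }
            }
        }
        return rows;
    }
}
```

Calling FixedWidthReader.ReadRows(new StringReader("0000JHASDF+4429901234ALEXANDER"), 4, 6, 4, 7, 9) yields one row with the fields "0000", "JHASDF", "+442", "9901234" and "ALEXANDER". For a file, File.OpenText(path) could be passed as the reader.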

Interpolation in c# - performance problem

I need to resample big sets of data (a few hundred spectra, each containing a few thousand points) using simple linear interpolation.
I have created an interpolation method in C#, but it seems to be really slow for huge datasets.
How can I improve the performance of this code?
public static List<double> interpolate(IList<double> xItems, IList<double> yItems, IList<double> breaks)
{
double[] interpolated = new double[breaks.Count];
int id = 1;
int x = 0;
while(breaks[x] < xItems[0])
{
interpolated[x] = yItems[0];
x++;
}
double p, w;
// left border case - uphold the value
for (int i = x; i < breaks.Count; i++)
{
while (breaks[i] > xItems[id])
{
id++;
if (id > xItems.Count - 1)
{
id = xItems.Count - 1;
break;
}
}
System.Diagnostics.Debug.WriteLine(string.Format("i: {0}, id {1}", i, id));
if (id <= xItems.Count - 1)
{
if (id == xItems.Count - 1 && breaks[i] > xItems[id])
{
interpolated[i] = yItems[yItems.Count - 1];
}
else
{
w = xItems[id] - xItems[id - 1];
p = (breaks[i] - xItems[id - 1]) / w;
interpolated[i] = yItems[id - 1] + p * (yItems[id] - yItems[id - 1]);
}
}
else // right border case - uphold the value
{
interpolated[i] = yItems[yItems.Count - 1];
}
}
return interpolated.ToList();
}
Edit
Thanks, guys, for all your responses. What I wanted to achieve when I wrote this question was to get some general ideas about where I could improve the performance. I didn't expect any ready-made solutions, only some ideas. And you gave me what I wanted, thanks!
Before writing this question I thought about rewriting this code in C++, but after reading the comments on Will's answer it seems that the gain may be smaller than I expected.
Also, the code is so simple that there are no mighty code tricks to use here. Thanks to Petar for his attempt to optimize the code.
It seems that it all comes down to finding a good profiler, checking every line and subroutine, and trying to optimize that.
Thank you again for all the responses and for taking part in this discussion!
public static List<double> Interpolate(IList<double> xItems, IList<double> yItems, IList<double> breaks)
{
var a = xItems.ToArray();
var b = yItems.ToArray();
var aLimit = a.Length - 1;
var bLimit = b.Length - 1;
var interpolated = new double[breaks.Count];
var total = 0;
var initialValue = a[0];
while (breaks[total] < initialValue)
{
total++;
}
Array.Copy(b, 0, interpolated, 0, total);
int id = 1;
for (int i = total; i < breaks.Count; i++)
{
var breakValue = breaks[i];
while (breakValue > a[id])
{
id++;
if (id > aLimit)
{
id = aLimit;
break;
}
}
double value = b[bLimit];
if (id <= aLimit)
{
var currentValue = a[id];
var previousValue = a[id - 1];
if (id != aLimit || breakValue <= currentValue)
{
var w = currentValue - previousValue;
var p = (breakValue - previousValue) / w;
value = b[id - 1] + p * (b[id] - b[id - 1]);
}
}
interpolated[i] = value;
}
return interpolated.ToList();
}
I've cached some (const) values and used Array.Copy, but I think these are micro-optimizations that the compiler already makes in Release mode. However, you can try this version and see if it beats the original version of the code.
Instead of
interpolated.ToList()
which copies the whole array, you could compute the interpolated values directly in the final list (or return the array instead). This matters especially if the array or List is big enough to qualify for the large object heap.
Unlike the ordinary heap, the LOH is not compacted by the GC, which means that short-lived large objects are far more harmful than small ones.
Then again: 7000 doubles are approx. 56'000 bytes, which is below the large object threshold of 85'000 bytes.
Looks to me you've created an O(n^2) algorithm. You are searching for the interval, that's O(n), then probably apply it n times. You'll get a quick and cheap speed-up by taking advantage of the fact that the items are already ordered in the list. Use BinarySearch(), that's O(log(n)).
If still necessary, you should be able to do something speedier with the outer loop, what ever interval you found previously should make it easier to find the next one. But that code isn't in your snippet.
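The BinarySearch suggestion can be sketched for a single lookup as follows. This is an illustration under stated assumptions, not a drop-in replacement for the method in the question: the x values are assumed to be sorted ascending with no duplicates, both arrays have the same length, and values outside the sampled range hold the border value as in the original code. The class name LinearInterpolation and the method name At are made up for illustration.

```csharp
using System;

static class LinearInterpolation
{
    // xs must be sorted ascending with no duplicates; ys has the same length.
    public static double At(double[] xs, double[] ys, double x)
    {
        int last = xs.Length - 1;
        if (x <= xs[0]) return ys[0];          // left border: hold the first value
        if (x >= xs[last]) return ys[last];    // right border: hold the last value

        int index = Array.BinarySearch(xs, x);
        if (index >= 0) return ys[index];      // exact hit on a sample point

        int upper = ~index;                    // index of the first element greater than x
        int lower = upper - 1;
        double p = (x - xs[lower]) / (xs[upper] - xs[lower]);
        return ys[lower] + p * (ys[upper] - ys[lower]);
    }
}
```

For xs = {0, 10, 20} and ys = {0, 100, 300}, At(xs, ys, 15) returns 200. Whether this beats the forward scan in the question depends on the data: that scan never resets id, so when breaks is itself sorted it already visits each x item only once, and binary search mainly helps for unsorted or sparse lookups.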
I'd say profile the code and see where it spends its time, then you have somewhere to focus on.
ANTS is popular, but EQATEC is free, I think.
A few suggestions:
As others suggested, use a profiler to understand better where the time is spent.
The loop
while (breaks[x] < xItems[0])
could cause an exception if x grows bigger than the number of items in the "breaks" list. You should use something like
while (x < breaks.Count && breaks[x] < xItems[0])
But you might not need that loop at all. Why treat the first item as a special case? Just start with id=0 and handle the first point in the for(i) loop. I understand that id would start from 0 in this case, and [id-1] would be a negative index, but see if you can do something there.
If you want to optimize for speed then you sacrifice memory size, and vice versa. You usually cannot have both unless you come up with a really clever algorithm. In this case, it would mean calculating as much as you can outside the loops, storing those values in variables (extra memory) and using them later. For example, instead of always saying:
id = xItems.Count - 1;
You could say:
int lastXItemsIndex = xItems.Count-1;
...
id = lastXItemsIndex;
This is the same suggestion Petar Petrov made with aLimit, bLimit...
Next point: your loop (or the one Petar Petrov suggested):
while (breaks[i] > xItems[id])
{
id++;
if (id > xItems.Count - 1)
{
id = xItems.Count - 1;
break;
}
}
could probably be reduced to:
double currentBreak = breaks[i];
while (id <= lastXItemsIndex && currentBreak > xItems[id]) id++;
The last point I would add is to check whether there is some property of your samples that is special to your problem. For example, if xItems represents time, and you are sampling at regular intervals, then
w = xItems[id] - xItems[id - 1];
is constant, and you do not have to calculate it every time in the loop.
This is probably not often the case, but maybe your problem has some other property which you could use to improve performance.
Another idea is this: maybe you do not need double precision, "float" is probably faster because it is smaller.
Good luck
System.Diagnostics.Debug.WriteLine(string.Format("i: {0}, id {1}", i, id));
I hope it's a release build without DEBUG defined?
Otherwise, it might depend on what exactly those IList parameters are. It may be useful to store the Count value instead of accessing the property every time.
This is the kind of problem where you need to move over to native code.
