Splitting strings at specific positions - c#

I have a small problem: I'm looking for a better way to split strings.
For example, I receive a string that looks like this:
0000JHASDF+4429901234ALEXANDER
I know the pattern the string is built with, and I have an array of field lengths like this:
4,6,4,7,9
0000 - JHASDF - +442 - 9901234 - ALEXANDER
It is easy to split the whole thing up with the String Mid command, but it seems to be slow when I receive a file containing 8000 - 10000 datasets.
So, any suggestions on how I can make this faster and get the data into a List or an array of strings?
For example, does anyone know how to do this with RegEx?

var lengths = new[] { 4, 6, 4, 7, 9 };
var parts = new string[lengths.Length];
// if you're not using .NET4 or above then use ReadAllLines rather than ReadLines
foreach (string line in File.ReadLines("YourFile.txt"))
{
    int startPos = 0;
    for (int i = 0; i < lengths.Length; i++)
    {
        parts[i] = line.Substring(startPos, lengths[i]);
        startPos += lengths[i];
    }
    // do something with "parts" before moving on to the next line
}

Isn't mid a VB method?
// "input" stands for the string you want to split
string firstPart = input.Substring(0, 4);
string secondPart = input.Substring(4, 6);
string thirdPart = input.Substring(10, 4);
//...

Perhaps something like this:
string[] SplitString(string s, int[] parts)
{
    string[] result = new string[parts.Length];
    int start = 0;
    for (int i = 0; i < parts.Length; i++)
    {
        int len = parts[i];
        result[i] = s.Substring(start, len);
        start += len;
    }
    if (start != s.Length)
        throw new ArgumentException("String length doesn't match sum of part lengths");
    return result;
}
(I didn't compile it, so it probably contains some minor errors)

As the Mid() function is VB, you could simply call Substring on your string instead, for example
line.Substring(0, 4);
and so on.

The Regex.Split method would be a possibility, but since you don't have a specific delimiter in the string I doubt it will be of any use, and it is unlikely to be any faster.
String.Substring is also a possibility. You use it like this: var myFirstString = fullString.Substring(0, 4);
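Since the question asks about RegEx: a fixed-width layout can also be written as one capture group per field and applied with Regex.Match rather than Regex.Split. The following is only a sketch. The class and method names (FixedWidthRegex, Build, Split) are made up for illustration, and the widths 4, 6, 4, 7, 9 are the ones that fit the sample string. There is no reason to expect it to outperform plain Substring calls.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class FixedWidthRegex
{
    // Builds a pattern such as ^(.{4})(.{6})(.{4})(.{7})(.{9})$ from the field widths.
    public static Regex Build(int[] widths)
    {
        string pattern = "^" + string.Concat(widths.Select(w => "(.{" + w + "})")) + "$";
        return new Regex(pattern, RegexOptions.Compiled);
    }

    // Returns the captured fields, or an empty array when the line
    // does not have exactly the expected length.
    public static string[] Split(Regex regex, string line)
    {
        Match match = regex.Match(line);
        if (!match.Success)
            return Array.Empty<string>();

        // Group 0 is the whole match, so the fields start at group 1.
        var parts = new string[match.Groups.Count - 1];
        for (int i = 1; i < match.Groups.Count; i++)
            parts[i - 1] = match.Groups[i].Value;
        return parts;
    }
}
```

Build the Regex once with FixedWidthRegex.Build(new[] { 4, 6, 4, 7, 9 }) and reuse it for every line, so the pattern is not re-parsed 10000 times.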

I know this is late, but in the Microsoft.VisualBasic.FileIO namespace you can find the TextFieldParser class, which would do a better job of handling your issue. Here is a link to MSDN - https://msdn.microsoft.com/en-us/library/zezabash.aspx with an explanation. The code is in VB, but you can easily convert it to C#. You will need to add a reference to the Microsoft.VisualBasic assembly as well. Hope this helps anyone stumbling on this question in the future.
Here is what it would look like in VB for the questioner's issue:
Using Reader As New Microsoft.VisualBasic.FileIO.TextFieldParser("C:\TestFolder\test.log")
    Reader.TextFieldType = Microsoft.VisualBasic.FileIO.FieldType.FixedWidth
    Reader.SetFieldWidths(4, 6, 4, 7, 9)
    Dim currentRow As String()
    While Not Reader.EndOfData
        Try
            currentRow = Reader.ReadFields()
            Dim currentField As String
            For Each currentField In currentRow
                MsgBox(currentField)
            Next
        Catch ex As Microsoft.VisualBasic.FileIO.MalformedLineException
            MsgBox("Line " & ex.Message & " is not valid and will be skipped.")
        End Try
    End While
End Using
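A rough C# translation of the VB sample above might look like the following. It is a sketch, not a tested drop-in: FixedWidthReader and ReadRows are hypothetical names, the rows are collected into a list instead of being shown with MsgBox, and the project is assumed to reference the Microsoft.VisualBasic assembly.

```csharp
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class FixedWidthReader
{
    // Reads every row from the reader and splits it into fixed-width fields.
    public static List<string[]> ReadRows(TextReader source, params int[] widths)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(source))
        {
            parser.TextFieldType = FieldType.FixedWidth;
            parser.SetFieldWidths(widths);
            // Keep fields exactly as they appear; by default the parser trims whitespace.
            parser.TrimWhiteSpace = false;

            while (!parser.EndOfData)
            {
                try
                {
                    var fields = parser.ReadFields();
                    if (fields != null)
                        rows.Add(fields);
                }
                catch (MalformedLineException)
                {
                    // The line did not fit the declared widths; skip it, as the VB sample does.
                }
            }
        }
        return rows;
    }
}
```

For a file, something like FixedWidthReader.ReadRows(File.OpenText("YourFile.txt"), 4, 6, 4, 7, 9) should work; the parser disposes the reader when it is done.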

Related

C# - Read, Edit & Save FixedLength file

I need to read a fixed-length file, edit some data inside it and then save the file to some location. The little app that does all this should run every 2 hours.
This is the example of the file:
14000 US A111 78900
14000 US A222 78900
14000 US A222 78900
I need to look for data like A111 and A222 and replace, for example, every A111 with A555. I have tried using TextFieldParser but without any luck. This is my code. I am able to get the elements of the array, but I am not sure what to do next.
using (TextFieldParser parser = FileSystem.OpenTextFieldParser(sourceFile))
{
    parser.TextFieldType = FieldType.FixedWidth;
    parser.FieldWidths = new int[] { 6, 3, 5, 5 };
    while (!parser.EndOfData)
    {
        try
        {
            string[] fields = parser.ReadFields();
            foreach (var f in fields)
            {
                Console.WriteLine(f);
            }
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex.Message);
        }
    }
}
This is the solution suggested by Berkouz, but I am still having issues: the array items are not replaced in the output when it is saved to a file. The code:
string[] rows = File.ReadAllLines(sourceFile);
foreach (var row in rows)
{
    string[] elements = row.Split(' ');
    for (int i = 0; i < elements.Length; i++)
    {
        if (elements.GetValue(i).ToString() == "A111") {
            elements.SetValue("A555", i);
        }
    }
}
var destFile = targetPath.FullName + "\\" + "output.txt";
File.WriteAllLines(destFile, rows);
Note the line where rows[rowIndex] is assigned. Strings are immutable, so Replace and similar functions return a new value (as opposed to modifying their input), and you have to assign that value back to your data storage (whatever it may be; an array in this case).
var rows = File.ReadAllLines(sourceFile);
for (int rowIndex = 0; rowIndex != rows.Length; rowIndex++)
    rows[rowIndex] = rows[rowIndex].Replace("A111", "A555");
File.WriteAllLines(destFile, rows);
This looks like an XY problem. If this is a one-time thing, I suggest you use sed instead.
Invocation is simple: sed -e 's/A111/A555/g'
In case your file contents are more complex, you can use awk or Perl's regex features.
If this is in fact not a one-time thing and you want it written in C#, you can:
A) use System.IO.File.ReadAllLines(), split the text using string.Split(), replace the item you want using string.Replace() and write it back using WriteAllLines()
B) use a MemoryMappedFile. This way, you don't have to worry about writing anything. But it tends to get a little bit pointery, and you should be careful with BOMs.
There are a LOT of other ways; these two are the ends of the spectrum, from easy/slow/clean to fast/efficient/ugly code.
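One caveat with a plain Replace over the whole row is that it also changes A111 if it happens to appear in another column. A column-aware variant of option A could look like the sketch below. It assumes the widths 6, 3, 5, 5 from the question and that the new value is no wider than the column; FixedWidthEditor, ReplaceField and ReplaceInFile are hypothetical names.

```csharp
using System.IO;

public static class FixedWidthEditor
{
    // Replaces oldValue with newValue in one fixed-width column of a single row.
    // Rows too short to contain the column are returned unchanged.
    public static string ReplaceField(string row, int[] widths, int fieldIndex,
                                      string oldValue, string newValue)
    {
        int start = 0;
        for (int i = 0; i < fieldIndex; i++)
            start += widths[i];
        int width = widths[fieldIndex];

        if (row.Length < start + width)
            return row;

        string field = row.Substring(start, width);
        if (field.Trim() != oldValue)
            return row;

        // Pad the new value so the columns that follow keep their positions.
        // (Assumes newValue is not wider than the column.)
        string replacement = newValue.PadRight(width);
        return row.Substring(0, start) + replacement + row.Substring(start + width);
    }

    // Applies ReplaceField to every line of sourceFile and writes the result to destFile.
    public static void ReplaceInFile(string sourceFile, string destFile, int[] widths,
                                     int fieldIndex, string oldValue, string newValue)
    {
        string[] rows = File.ReadAllLines(sourceFile);
        for (int i = 0; i < rows.Length; i++)
            rows[i] = ReplaceField(rows[i], widths, fieldIndex, oldValue, newValue);
        File.WriteAllLines(destFile, rows);
    }
}
```

For the sample file, the call would be along the lines of FixedWidthEditor.ReplaceInFile(sourceFile, destFile, new[] { 6, 3, 5, 5 }, 2, "A111", "A555").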

How to append character in long integer?

I want to append a character to a long integer, using the code below:
if (strArrIds[1].Contains("CO"))
{
    long rdb2 = Convert.ToInt64(strArrIds[1].Substring(strArrIds[1].Length - 1));
    assessmentEntity.RatingType = rdb2;
}
If rdb2 = 5, I want to append an L to this value, like rdb2 = 5L.
Any ideas? Thanks in advance.
You can use long.Parse instead of Convert.ToInt64 to get the long, and you won't need to append an L to make it a long.
if (strArrIds[1].Contains("CO"))
{
    long rdb2 = long.Parse(strArrIds[1].Substring(strArrIds[1].Length - 1));
    assessmentEntity.RatingType = rdb2;
}
If you are processing a lot of these, and if this item will be used for further processing, you could consider turning this into a class, which might be easier to manage. This would also allow you to customise your ratings more easily.
I'm thinking a factory that instantiates and returns your assessment entity might serve you well too. You could then leverage a dependency injection strategy for any other functionality.
It's a little hard to tell what you require this for without a bit more context.
If this is a one-off, I would refactor to
assessmentEntity.RatingType = strArrIds[1].Contains("CO") ?
    String.Concat(long.Parse(strArrIds[1].Substring(strArrIds[1].Length - 1)).ToString(), "L") :
    "0N";
assuming "0N" is some other default rating.
You do not need an L here. That is only for literals of type long appearing in the C# source.
You could do something like:
string str1 = strArrIds[1];
if (str1.Contains("CO"))
{
    long rdb2 = str1[str1.Length - 1] - '0';
    if (rdb2 < 0L || rdb2 > 9L)
        throw new InvalidOperationException("Unexpected rdb2 value from str1=" + str1);
    assessmentEntity.RatingType = rdb2;
}
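To illustrate the point about the L suffix: it only tells the compiler that a literal in the source is a long, so a value parsed at run time is the same long with no suffix involved. A minimal sketch follows; the class name and the sample id "CO5" are invented for the example.

```csharp
using System;

public static class LongSuffixDemo
{
    // Parses the last character of an id such as "CO5" into a long.
    public static long ParseTrailingDigit(string id)
    {
        return long.Parse(id.Substring(id.Length - 1));
    }

    // A literal written with the L suffix and a parsed value are the same long.
    public static bool LiteralEqualsParsed()
    {
        long literal = 5L;                        // the suffix only matters to the compiler
        long parsed = ParseTrailingDigit("CO5");  // no suffix exists at run time
        return literal == parsed;
    }
}
```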

Postfix Calculator

I'm making a console application in C# that solves expressions in postfix notation by using a stack, such as:
Expression: 43+2*
Answer: 14
What I've done so far:
using System;
using System.Collections;
using System.Linq;
using System.Text;
namespace ConsoleApplication7
{
    class Program
    {
        static void Main(string[] args)
        {
            string input = "23+";
            int counter = 0;
            Stack values = new Stack();
            while (counter < input.Length)
            {
                int temp1,
                    temp2,
                    answer;
                char x = char.Parse(input.Substring(counter, 1));
                // TODO: condition that detects an operand
                if ( );
                else if (x == '+')
                {
                    temp1 = (int)values.Pop();
                    temp2 = (int)values.Pop();
                    values.Push(answer = temp1 + temp2);
                }
                else if (x == '-')
                {
                    // temp1 was pushed last, so it is the right-hand operand
                    temp1 = (int)values.Pop();
                    temp2 = (int)values.Pop();
                    values.Push(answer = temp2 - temp1);
                }
                else if (x == '*')
                {
                    temp1 = (int)values.Pop();
                    temp2 = (int)values.Pop();
                    values.Push(answer = temp1 * temp2);
                }
                else if (x == '/')
                {
                    temp1 = (int)values.Pop();
                    temp2 = (int)values.Pop();
                    values.Push(answer = temp2 / temp1);
                }
                counter++;
            }
            Console.WriteLine(values.Pop());
        }
    }
}
For the if statement, what can I use as a condition to check whether x is an operand?
Is your example input 2, 3, + (which equals 5), or 23, + (which is invalid input)? I'm assuming the former. How, then, would you write two-digit numbers? Your current approach doesn't seem to support this. I think you shouldn't parse this char by char, but split it into its separate components first, perhaps using a regex that recognizes numbers and punctuation. As a quick example: Regex.Matches("10 3+", @"(\d+|[\+\-\*/ ])") yields 10, a space, 3, and +, which can be parsed and understood fairly easily with the code you already have. (Spaces should be ignored; they're simply the punctuation I picked to separate numbers so that you can have multi-digit numbers.) Use int.TryParse (or double.TryParse, which requires a more complicated regex pattern; see Matching Floating Point Numbers for that pattern) to see whether a token is a number.
You should use a Stack<int> to avoid casting and make it compile-time safe.
Surely this is wrong:
((int)Char.GetNumericValue(x) <= 0 && (int)Char.GetNumericValue(x) >= 0)
I think it should be
((int)Char.GetNumericValue(x) <= 9 && (int)Char.GetNumericValue(x) >= 0)
I really think this is more of a code review, but so be it. First: please separate some concerns. You baked everything into one big messy monster; think about the parts of the problem and put them into separate methods to start with.
Then: if you cannot solve the whole problem, make it smaller first. Let the user enter some kind of separator for the parts, or assume for now that they do; a space will do just fine.
You can think about how to handle operators without spaces before or after them later.
So try parsing "2 3 +" instead of "23+" or "2 3+" ...
If you do this you can indeed just use String.Split to make your life much easier!
As to how you can recognize an operand: very easy. Try Double.TryParse; it will tell you whether you passed it a valid number, and you don't have to waste your time parsing the numbers yourself.
Instead of using a while in there you should use a for or, even better, a foreach. Heck, you can even do this with LINQ and Enumerable.Aggregate and get FUNctional :D
And finally ... don't use this if/then/else mess if a switch does the job ...
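Putting the suggestions above together (space-separated tokens, TryParse to recognize operands, a typed Stack<int>, a foreach and a switch), a minimal sketch might look like this. It uses integer arithmetic only, and the class and method names are made up for the example.

```csharp
using System;
using System.Collections.Generic;

public static class PostfixCalculator
{
    // Evaluates a space-separated postfix expression such as "4 3 + 2 *".
    public static int Evaluate(string expression)
    {
        var values = new Stack<int>();
        string[] tokens = expression.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        foreach (string token in tokens)
        {
            // Anything that parses as an integer is an operand.
            if (int.TryParse(token, out int number))
            {
                values.Push(number);
                continue;
            }

            if (values.Count < 2)
                throw new FormatException("Token '" + token + "' needs two operands.");

            // The right-hand operand was pushed last, so it is popped first.
            int right = values.Pop();
            int left = values.Pop();
            switch (token)
            {
                case "+": values.Push(left + right); break;
                case "-": values.Push(left - right); break;
                case "*": values.Push(left * right); break;
                case "/": values.Push(left / right); break;
                default: throw new FormatException("Unknown token '" + token + "'.");
            }
        }

        if (values.Count != 1)
            throw new FormatException("Expression does not reduce to a single value.");
        return values.Pop();
    }
}
```

With this, PostfixCalculator.Evaluate("4 3 + 2 *") gives 14, matching the example in the question once the input is space-separated.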
You could say that there essentially are no operands. Even digits can be thought of as operators that multiply the top of the stack by 10 and add the digit value; accumulating a value over several digits as necessary. Then you just need an operator for seeding this by pushing a zero to the stack (perhaps a space character for that).
http://blogs.msdn.com/b/ashleyf/archive/2009/10/23/tinyrpn-calculator.aspx

Initializing an integer array

string dosage = "2/3/5 mg";
string[] dosageStringArray = dosage.Split('/');
int[] dosageIntArray = null;
for (int i = 0; i <= dosageStringArray.Length; i++)
{
    if (i == dosageStringArray.Length)
    {
        string[] lastDigit = dosageStringArray[i].Split(' ');
        dosageIntArray[i] = Common.Utility.ConvertToInt(lastDigit[0]);
    }
    else
    {
        dosageIntArray[i] = Common.Utility.ConvertToInt(dosageStringArray[i]);
    }
}
I am getting the exception on this line: dosageIntArray[i] = Common.Utility.ConvertToInt(dosageStringArray[i]);
I am unable to resolve this issue and can't see where the problem is, but this line looks suspicious: int[] dosageIntArray = null;
Exception: Object reference not set to an instance of an object.
The biggest problem with your solution is not the missing array allocation, but rather how you'd parse the following string:
string dosage = "2/13/5 mg";
Since your problem is surely domain specific, this exact input may not arise, but some variation with a multi-digit number might.
The following solution splits the string on the forward slash, then removes any non-digits from the substrings before converting them to integers.
Regex digitsOnly = new Regex(@"[^\d]");
var array = dosage.Split('/')
    .Select(num => int.Parse(digitsOnly.Replace(num, string.Empty)))
    .ToArray();
Or whatever that looks like with the cuddly LINQ syntax.
You are looking for something like
int[] dosageIntArray = new int[dosageStringArray.Length];
You are trying to access a null array (dosageIntArray) here:
dosageIntArray[i] = Common.Utility.ConvertToInt(lastDigit[0]);
You need to initialize it before you can access it like that.
You have to allocate dosageIntArray like this:
int[] dosageIntArray = new int[dosageStringArray.Length];
Also, you have another bug in your code:
Index of last element of an array is Length - 1.
Your for statement should read as:
for (int i = 0; i < dosageStringArray.Length; i++)
or
for (int i = 0; i <= (dosageStringArray.Length - 1); i++)
The former is preferred and is the most common style you will see.
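With both fixes applied (the array is allocated up front and the loop stops at Length - 1), the original approach could be sketched as follows. DosageParser is a hypothetical name, and int.Parse stands in for the question's Common.Utility.ConvertToInt helper, whose behaviour is not shown.

```csharp
using System;

public static class DosageParser
{
    // Parses strings such as "2/3/5 mg" into { 2, 3, 5 }.
    public static int[] Parse(string dosage)
    {
        string[] parts = dosage.Split('/');
        // Allocate the result before writing to it; a null array cannot be indexed.
        int[] values = new int[parts.Length];

        // i < parts.Length, because the last valid index is Length - 1.
        for (int i = 0; i < parts.Length; i++)
        {
            string text = parts[i].Trim();
            if (i == parts.Length - 1)
            {
                // The last part carries the unit ("5 mg"), so keep only the number.
                text = text.Split(' ')[0];
            }
            values[i] = int.Parse(text);
        }
        return values;
    }
}
```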
I strongly recommend you use Lists instead of Arrays. You don't need to define the size of the List; just add items to it. It's very functional and much easier to use.
As an alternative approach:
var dosage = "2/3/5 mg";
int[] dosageIntArray = Regex.Matches(dosage, @"\d+")
    .Cast<Match>()
    .Select(m => int.Parse(m.Value))
    .ToArray();

what's the C# equivalent of string$ from basic

And is there an elegant linqy way to do it?
What I want to do is create a string of a given length, made up of repetitions of another string up to that length.
So for length 9 and input string "xxx" I get "xxxxxxxxx" (i.e. length 9).
For a non-integral multiple I'd like to truncate the line.
I can do this easily using loops and a StringBuilder, but I'm looking to see whether the language can express this idea easily.
(FYI I'm making Easter maths homework for my son)
No, nothing simple and elegant - you basically have to code this yourself.
You can construct a string from a number of repeated characters, but not from repeated strings,
i.e.
string s = new string('#', 6); // s = "######"
To do this with strings, you would need a loop to concatenate them, and the easiest would then be to use Substring to truncate to the desired final length - along the lines of:
string FillString(string text, int count)
{
    StringBuilder s = new StringBuilder();
    for (int i = 0; i <= count / text.Length; i++)
        s.Append(text);
    return s.ToString().Substring(0, count);
}
A possible solution using Enumerable.Repeat.
const int TargetLength = 10;
string pattern = "xxx";
int repeatCount = TargetLength / pattern.Length + 1;
string result = String.Concat(Enumerable.Repeat(pattern, repeatCount).ToArray());
result = result.Substring(0, TargetLength);
Console.WriteLine(result);
Console.WriteLine(result.Length);
My Linqy (;)) solution would be to create an extension method. LINQ is language-integrated query, so why abuse it? I'm pretty sure it's possible with LINQ's select, since you can create new (anonymous) objects, but why...?
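Such an extension method might look like the sketch below. The names StringRepeatExtensions and RepeatToLength are invented for the example, and it uses a StringBuilder rather than LINQ.

```csharp
using System;
using System.Text;

public static class StringRepeatExtensions
{
    // Repeats pattern until the result is exactly length characters long,
    // cutting the final repetition short when length is not a multiple.
    public static string RepeatToLength(this string pattern, int length)
    {
        if (string.IsNullOrEmpty(pattern))
            throw new ArgumentException("Pattern must not be empty.", nameof(pattern));
        if (length < 0)
            throw new ArgumentOutOfRangeException(nameof(length));

        var builder = new StringBuilder(length);
        while (builder.Length + pattern.Length <= length)
            builder.Append(pattern);

        // Append the leading part of the pattern to fill whatever is left.
        builder.Append(pattern, 0, length - builder.Length);
        return builder.ToString();
    }
}
```

Usage then reads as "xxx".RepeatToLength(9) for the case in the question, or "abc".RepeatToLength(10) for a truncated final repetition.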
