Searching a string file

Searching a string file - c#

This is an extract of 1 record in a text file:
"Aggie Ngwenya#15.99#FULL-TIME"
I need to check if the last bit after the second "#" is true.
In other words if it is "FULL-TIME" then I extract the rest of that line, if it is not true then ignore it and move on to the next line of text.
Any ideas?

Assuming the format is constant (each line has three parts delimited by '#'), you could do something like:
string[] parts = line.Split('#');
if (parts[2] == "FULL-TIME") DoStuff();
It is probably wise to add some error handling to react well to input that does not meet your expectation, e.g. check the length of parts, perhaps do a case insensitive compare if casing can be an issue.

Related

Search for an exact phrase and change

I have a problem with changing values into a string.
I have three values, e.g. a, aa, aaa
StringBuilder builders = new StringBuilder(string);
builders.Replace("a", "ab");
builders.Replace("aa", "bab");
builders.Replace("aaa", "bba");
string new_string = builders.ToString();
How to do it to find exactly the value of 'aaa' because for this value both the condition of 'a' and 'aa' is also fulfilled.

Just do the replaces from the longest substring to the shortest one:
builders.Replace("aaa", "bba");
builders.Replace("aa", "bab");
builders.Replace("a", "ab");

The easiest way is to replace the long tokens first.
It looks like your replacement strings also contain "a". So you might need to replace them with a temporary string (one that is unlikely to occur in the original text), and then convert to the target string after you've processed them all.
The only way to intelligently only replace "a" without replacing "aa" is to parse each character in the text to determine how long each token is. It's not that hard, but a lot more work than calling Replace().

C# parsing out data line by line by character location from txt file

Just looking to see what the best way to approach the following situation would be.
I am trying to make a small job that reads in a txt file which has a thousand or so lines;
Each line is about 40 characters long (mostly numbers, some letter identifiers).
I have used
DataTable txtCache = new DataTable();
txtCache.Columns.Add(new DataColumn("Column1"));
string[] lines = System.IO.File.ReadAllLines(FILEcheck.Properties.Settings.Default.filePath);
foreach (string line in lines)
{
txtCache.Rows.Add(line);
}
However, what I really want to do is a bit confusing and hard to explain so i'll do my best. An example of line is below:
5498494000584454684840}eD44448774V6468465 Z
In the beginning of that long string is a "84", and then a "58" a little bit later. I need to do a comparison on these two numbers. They could be anything, but only a few combinations are acceptable in the file. They will always be in the same spot and same amount of characters (so it will always be 2 numbers and always in the 4-5 location). So I want to have 3 columns. I want the full string in 1 column, and then the 2 individual smaller numbers in columns of themselves. I can then compare them later on, and if there is an issue, I can return the full string which caused the issue.
Is this possible? I am just not sure how to parse out a substring based on character location and then loading it into a datatable.
Any advice would be appreciated. Thank you,

You could create the columns for each of items you are looking to store (whole string, first number, second number), and then add a row for each of the lines in the input file. You could just use the substring method to parse out the two digit numbers and store them. To do your analysis, you could parse the numbers out from the strings, or whatever else you need to do.
lines[0].Substring(3,2) will give you "84" in your above example. If you want the int, you could use Int32.Parse(lines[0].Substring(3,2))
Substring reference: http://msdn.microsoft.com/en-us/library/aka44szs%28v=vs.110%29.aspx

How do I prevent a string from appearing in a result string when a set of child strings are concatenated to form the result string?

I have 5 strings, let's call them
EarthString
FireString
WindString
WaterString
HeartString
All of them can have varying length, any of them can be empty, or can be very long (but never null).
These 5 strings are very good friends, and every weekend they are concatenated to form a result string using this c# statement
ResultString = EarthString + FireString + WindString + WaterString + HeartString
Depending on the values of these strings, sometimes (only sometimes), ResultString will contain "Captain Planet" as a substring.
My question is, how do I manipulate each of the 5 strings before they are concatenated, so that when they are combined, "Captain Planet" will never appear as a substring in the resultant string?
The only way I can think of right now is to examine each character in each string, in sequential order, but that seems very tedious. Since each of the 5 good friends strings can be of any length, examining the characters individually will also require some kind of concatenation before we can determine whether any character need to be dropped.
Edit: The resultant string is a filtered version of the 5 strings concatenated together, all the other content remain the same except the "Captain Planet" string is dropped. Yes, i'm looking for a solution which allows the 5 strings to be manipulated before concatenation. (this is actually a simplification of a bigger programming problem i'm encountering). Thanks guys.

If you want to do it pre-concat you could
Assign the start and end of each string a numeric value based on the portion of "CaptainPlanet" they contein. Ex: if Air = "net the big captain" then it would get 3 for a start value and 7 for an end value. to determine if you could concat 2 values safely you would just check to see if the end of the left string + start of the right string were not equal to the total length of "CaptainPlanet". If you had very large strings this would allow you to inspect just the first x and last x characters of the string to compute the start/end value.
This solution doesn't account for short strings like ei air = "Cap" , earth ="tain" and fire="Planet". In that case you would need to have a special case for tokens that are shorter than the length of "CaptainPlanet" For those.

Is there a particular reason you can't just do this?
ResultString.Replace("CaptainPlanet", "x");

If it doesn't matter how many chars will be dropped, you can remove f.e. all 'C' in all strings.

The original answer cleared all of the strings, but as pointed out by J.Steen, there was already a formulation of the expected output. So there we go.
Run elementString.Replace("Captain Planet", "") on every substring.
Now you have to identify all the prefixes / suffixes of "Captain Planet" on each of the substrings, and keep that information so that it can be processed before contatenation. That is, e.g. if the substring ends with "Capt", then you should have an information that "substring contains at the end a prefix of the 4 first letters of 'Captain Planet'". You also have to consider the cases of complete substrings (e.g. one of the strings is "ptain Pla"). The problem also becomes more complex if any of the e.g. prefixes can be recursive or repeated (e.g. "CaptainCap" contains 2 kinds of valid prefixes for "CaptainCaptain", and "apt" can be found at two locations in the resulting string);
You process that information before concatenation so that the result string has the same thing as ResultString.Replace("Captain Planet", ""). Congratulations, you have made your program much more complex than necessary!
But in short, you cannot get both the result that you want (all of the substrings intact except for the combined result output) and do the processing wholly before the concatenation step.

Parsing a CSV file

We have an integration with another system that relies on passing CSV files back and forth (really old school).
The structure is generally:
ID, Name, PhoneNumber, comments, fathersname
1, tom, 555-1234, just some random text, bill
2, jill smith, 555-4234, other random text, richard
Every so often we see this:
3, jacked up, 999-1231, here
be dragons
amongst us, ted
The primary problem I care about is detecting that a line breaker (\n) occurs in the middle of the record when that is the record terminator.
Is there anyway I can preprocess this to reliably fix it?
Note that we have zero control over what the other system emits.

So you should be able to do something more or less like this:
for (int i = 0; i < lines.Count; i++)
{
var fields = lines[i].Split(',').ToList();
while (fields.Count < numFields)//here be dragons amonst us
{
i++;//include next line in this line
//check to make sure we haven't run out of lines.
//combine end of previous field with start of the next one,
//and add the line break back in.
var innerFields = lines[i].Split(',');
fields[fields.Count - 1] += "\n" + innerFields[0];
fields.AddRange(innerFields.Skip(1));
}
//we now know we have a "real" full line
processFields(fields);
}
(For simplicity I assumed all lines were read in at the start; I assume you could alter it to lazily fetch each line easily enough.)

Let me start and say that the CSV file in your example is invalid. If a line break occurs inside a string, it should be wrapped with double quote characters.
Now for the answer - In order to parse this invalid csv format you must do several assumptions. In this case I made 2 assumptions: 1) The ID column must be numeric 2) The comment field can not contain digits.
Based on these assumptions you can check the first character after the line break character. If it is digit, you assume its a new record. If not you should treat it as a continue value of the comment field.
I don't know if the second assumption is valid, if not, you can enhance the logic so it will cover the business rules of the system.
Good Luck!

Firstly I would recommend using a tool to manage reading and writing your csv files, I use the FileHelpers library which is great.
You can essentially type your records and it will do all the validation and such for you. Worth the effort.
To your question perhaps you can do some preprocessing on the file and use Regex to replace any line breaks with a space?
I do something similar (not with files but) try
line.Replace(Environment.NewLine, " ");
With FileHelpers you could write a custom converter to do this during processing, or hook into the BeforeRead event.

Replacing part of text in richtextbox

I need to compare a value in a string to what user typed in a richtextbox.
For example: if a richtextbox holds string rtbText = "aaaka" and I compare this to another variable string comparable = "ka"(I want it to compare backwards). I want the last 2 letters from rtbText (comparable has only 2 letters) to be replaced with something that was predetermined(doesn't really matter what).
So rtbText should look like this:
rtbText = "aaa(something)"
This doesn't really have to be compared it can just count letters in comparable and based on that it can remove 2 letters from rtbText and replace them with something else.
UPDATE:
Here is what I have:
int coLen = comparable.Length;
comparable = null;
TextPointer caretBack = rtb.CaretPosition.GetPositionAtOffset(coLen, LogicalDirection.Backward);
TextRange rtbText = new TextRange(rtb.CaretPosition, caretBack);
string text = rtbText.Text;
rtbText returns an empty string or I get an error for everything longer than 3 characters. What am I doing wrong?
Let me elaborate it a little bit further. I have a listbox that holds replacements for values that user types in rtb. The values(replacements) are coming from there, meaning that I don't really need to go through the whole text to check values. I just need to check the values right before caret. I am comparing these values to what I have stored in another variable (comparable).
Please let me know if you don't understand something.
I did my best to explain what needs to be done.
Thank you

You could use Regex.Replace.
// this replaces all occurances of "ka" with "Replacement"
Regex replace = new Regex("ka");
string result = replace.Replace("aaaka","Replacemenet");

gumenimeda, I had similar problems few weeks ago. I found my self doing the following (I asume you will have more than one occurance in the RichTextBox that you will need to change), note that I did it for Windows Forms where I have access directly to the Rtf text of the control, not quite sure if it will work well in your scenario:
I find all the occurancies of the string (using IndexOf for example) and store them in a List for example.
Sort the list in descending order (max index goes first, the one before him second, etc)
Start replacing the occurancies directly in the RichTextBox, by removing the characters I don't need and appending the characters I need.
The sorting in step 2 is necessary as we always want to start from the last occurance going up to the first. Starting from the first occurance or any other and going down will have an unpleasant surprise - if the length of the chunk you want to remove and the length of the chunk you want to append are different in length, the string will be modified and all other occurancies will be invalid (for example if the second occurance was in at position 12 and your new string is 2 characters longer than the original, it will become 14th). This is not an issue if we go from the last to the first occurance as the change in string will not affect the next occurance in the list).
Ofcourse I can not be sure that this is the fastest way that can be used to achieve the desired result. It's just what I came up with and what worked for me.
Good luck!

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.