Parsing data have blank array field showing - c#

I am parsing my data output, however, my data has return charicters in it (\n). So when I run my code, the array is built and one of the arrays (4) is blank data... I have tried using null, "", and " ". Would anyone know how I can prevent that last array from showing?
char[] returnChar= {'\n' };
string parseText = captcha;
string[] words = parseText.Split(returnChar);
int count = words.Length;
for (int i = 0; i < count; i++)
{
if (words[i] == null)
{
MessageBox.Show("This row is empty: " + i);
}
MessageBox.Show(words[i]);
}

When doing String.Split, define the second parameter - StringSplitOptions.
string[] words =
parseText.Split(returnChar, StringSplitOptions.RemoveEmptyEntries);
This way it will skip over empty elements.

Related

C#: Need to split a string into a string[] and keeping the delimiter (also a string) at the beginning of the string

I think I am too dumb to solve this problem...
I have some formulas which need to be "translated" from one syntax to another.
Let's say I have a formula that goes like that (it's a simple one, others have many "Ceilings" in it):
string formulaString = "If([Param1] = 0, 1, Ceiling([Param2] / 0.55) * [Param3])";
I need to replace "Ceiling()" with "Ceiling(; 1)" (basically, insert "; 1" before the ")").
My attempt is to split the fomulaString at "Ceiling(" so I am able to iterate through the string array and insert my string at the correct index (counting every "(" and ")" to get the right index)
What I have so far:
//splits correct, but loses "CEILING("
string[] parts = formulaString.Split(new[] { "CEILING(" }, StringSplitOptions.None);
//splits almost correct, "CEILING(" is in another group
string[] parts = Regex.Split(formulaString, #"(CEILING\()");
//splits almost every letter
string[] parts = Regex.Split(formulaString, #"(?=[(CEILING\()])");
When everything is done, I concat the string so I have my complete formula again.
What do I have to set as Regex pattern to achieve this sample? (Or any other method that will help me)
part1 = "If([Param1] = 0, 1, ";
part2 = "Ceiling([Param2] / 0.55) * [Param3])";
//part3 = next "CEILING(" in a longer formula and so on...
As I mention in a comment, you almost got it: (?=Ceiling). This is incomplete for your use case unfortunately.
I need to replace "Ceiling()" with "Ceiling(; 1)" (basically, insert "; 1" before the ")").
Depending on your regex engine (for example JS) this works:
string[] parts = Regex.Split(formulaString, #"(?<=Ceiling\([^)]*(?=\)))");
string modifiedFormula = String.join("; 1", parts);
The regex
(?<=Ceiling\([^)]*(?=\)))
(?<= ) Positive lookbehind
Ceiling\( Search for literal "Ceiling("
[^)] Match any char which is not ")" ..
* .. 0 or more times
(?=\)) Positive lookahead for ")", effectively making us stop before the ")"
This regex is a zero-assertion, therefore nothing is lost and it will cut your strings before the last ")" in every "Ceiling()".
This solution would break whenever you have nested "Ceiling()". Then your only solution would be writing your own parser for the same reasons why you can't parse markup with regex.
Regex.Replace(formulaString, #"(?<=Ceiling\()(.*?)(?=\))","$1; 1");
Note: This will not work for nested "Ceilings", but it does for Ceiling(), It will also not work fir Ceiling(AnotherFunc(x)). For that you need something like:
Regex.Replace(formulaString, #"(?<=Ceiling\()((.*\((?>[^()]+|(?1))*\))*|[^\)]*)(\))","$1; 1$3");
but I could not get that to work with .NET, only in JavaScript.
This is my solution:
private string ConvertCeiling(string formula)
{
int ceilingsCount = formula.CountOccurences("Ceiling(");
int startIndex = 0;
int bracketCounter;
for (int i = 0; i < ceilingsCount; i++)
{
startIndex = formula.IndexOf("Ceiling(", startIndex);
bracketCounter = 0;
for (int j = 0; j < formula.Length; j++)
{
if (j < startIndex) continue;
var c = formula[j];
if (c == '(')
{
bracketCounter++;
}
if (c == ')')
{
bracketCounter--;
if (bracketCounter == 0)
{
// found end
formula = formula.Insert(j, "; 1");
startIndex++;
break;
}
}
}
}
return formula;
}
And CountOccurence:
public static int CountOccurences(this string value, string parameter)
{
int counter = 0;
int startIndex = 0;
int indexOfCeiling;
do
{
indexOfCeiling = value.IndexOf(parameter, startIndex);
if (indexOfCeiling < 0)
{
break;
}
else
{
startIndex = indexOfCeiling + 1;
counter++;
}
} while (true);
return counter;
}

Iterating backwards through an char array after finding a known word

I've got a project I'm working on in C#. I've got two char array's. One is a sentence and one is a word. I've got to iterate through the sentence array until I find a word that matches the word that was turned into an word array what I'm wondering is once I find the word how do I iterate backwards through the sentence array at the point I found the word back through the same length as the word array?
Code :
String wordString = "(Four)";
String sentenceString = "Maybe the fowl of Uruguay repeaters (Four) will be found";
char[] wordArray = wordString.ToCharArray();
List<String> words = sentenceString.Split(' ').ToList<string>();
//This would be the part where I iterate through sentence
foreach (string sentence in sentArray)
{
//Here I would need to find where the string of (Four) and trim it and see if it equals the wordString.
if (sentence.Contains(wordString)
{
//At this point I would need to go back the length of wordString which happens to be four places but I'm not sure how to do this. And for each word I go back in the sentence I need to capture that in another string array.
}
I don't know if I'm being clear enough on this but if I'm not please feel free to ask.. Thank you in advanced. Also what this should return is "fowl of Uruguay repeaters". So basically the use case is for the number of letters in the parenthesis the logic should return the same number of words before the word in parenthesis.
our are you. I have few question concerned to this exercise.
If the Word (four) was in beginning it should not return? or return all string?
As the length of four is equal to 4 imagine if that word appear as the second word on sentence what it should return just the first word or return 4 words even including the (four) word.?
My solution is the laziest one I just see your question and decide to help.
My solution assumes it return all the word before the (four) if the length is bigger than the word before the (four) word.
My solution return empty string if the word (four) is in the beginning.
My solution return Length of (four) (4) words before (four) word.
ONCE AGAIN IT IS NOT MY BEST APPROACH.
I see the code bellow:
string wordString = "(Four)";
string sentenceString = "Maybe the fowl of Uruguay repeaters (Four) will be found";
//Additionally you can add splitoption to remove the empty word on split function bellow
//Just in case there are more space in sentence.
string[] splitedword = sentenceString.Split(' ');
int tempBackposition = 0;
int finalposition = 0;
for (int i = 0; i < splitedword.Length; i++)
{
if (splitedword[i].Contains(wordString))
{
finalposition = i;
break;
}
}
tempBackposition = finalposition - wordString.Replace("(","").Replace(")","").Length;
string output = "";
tempBackposition= tempBackposition<0?0:tempBackposition;
for (int i = tempBackposition; i < finalposition; i++)
{
output += splitedword[i] + " ";
}
Console.WriteLine(output);
Console.ReadLine();
If it's not what you want can you answer my questions on top? or help me to understand were it's wrong
int i ;
string outputString = (i=sentenceString.IndexOf(wordString))<0 ?
sentenceString : sentenceString.Substring(0,i) ;
var wordString = "(Four)";
int wordStringInt = 4; // Just do switch case to convert your string to int
var sentenceString = "Maybe the fowl of Uruguay repeaters (Four) will be found";
var sentenceStringArray = sentenceString.Split(' ').ToList();
int wordStringIndexInArray = sentenceStringArray.IndexOf(wordString) - 1;
var stringOutPut = "";
if (wordStringIndexInArray > 0 && wordStringIndexInArray > wordStringInt)
{
stringOutPut = "";
while (wordStringInt > 0)
{
stringOutPut = sentenceStringArray[wordStringInt] + " " + stringOutPut;
wordStringInt--;
}
}
What you are matching is kind of complex, so for a more general solution you could use regular expressions.
First we declare what we are searching for:
string word = "(Four)";
string sentence = "Maybe the fowl of Uruguay repeaters (Four) will be found";
We will then search for the words in this string using regular expressions. Since we don't want to match whitespace, and we need to know where each match actually starts and we need to know the word inside the parenthesis we tell it that we optionally want opening and ending parenthesis, but we also want the contents of those as a match:
var words = Regex.Matches(sentence, #"[\p{Ps}]*(?<Content>[\w]+)[\p{Pe}]*").Cast<Match>().ToList();
[\p{Ps}] means we want opening punctuation ([{ etc. while the * indicates zero or more.
Followed is a sub-capture called Content (specified by ?<Content>) with one or more word characters.
At the end we specify that we want zero or more ending punctuation.
We then need to find the word in the list of matches:
var item = words.Single(x => x.Value == word);
Then we need to find this item's index:
int index = words.IndexOf(item);
At this point we just need to know the length of the contents:
var length = item.Groups["Content"].Length;
This length we use to go back in the string 4 words
var start = words.Skip(index - length).First();
And now we have everything we need:
var result = sentence.Substring(start.Index, item.Index - start.Index);
Result should contain fowl of Uruguay repeaters.
edit: It may be a lot simpler to just figure out the count from the word rather than from the content. In that case the complete code should be the following:
string word = "(Four)";
string sentence = "Maybe the fowl of Uruguay repeaters (Four) will be found";
var wordMatch = Regex.Match(word, #"[\p{Ps}]*(?<Content>[\w]+)[\p{Pe}]*");
var length = wordMatch.Groups["Content"].Length;
var words = Regex.Matches(sentence, #"\S+").Cast<Match>().ToList();
var item = words.Single(x => x.Value == word);
int index = words.IndexOf(item);
var start = words.Skip(index - length).First();
var result = sentence.Substring(start.Index, item.Index - start.Index);
\S+ in this case means "match one or more non-whitespace character".
You should try something like the following, which uses Array.Copy after it finds the number word. You will still have to implement the ConvertToNum function correctly (it is hardcoded in for now), but this should be a quick and easy solution.
string[] GetWords()
{
string sentenceString = "Maybe the fowl of Uruguay repeaters (Four) will be found";
string[] words = sentenceString.Split();
int num = 0;
int i; // scope of i should remain outside the for loop
for (i = 0; i < words.Length; i++)
{
string word = words[i];
if (word.StartsWith("(") && word.EndsWith(")"))
{
num = ConvertToNum(word.Substring(1, word.Length - 1));
// converted the number word we found, so we break
break;
}
}
if (num == 0)
{
// no number word was found in the string - return empty array
return new string[0];
}
// do some extra checking if number word exceeds number of previous words
int startIndex = i - num;
// if it does - just start from index 0
startIndex = startIndex < 0 ? 0 : startIndex;
int length = i - startIndex;
string[] output = new string[length];
Array.Copy(words, startIndex, output, 0, length);
return output;
}
// Convert the number word to an integer
int ConvertToNum(string numberStr)
{
return 4; // you should implement this method correctly
}
See - Convert words (string) to Int, for help implementing the ConvertToNum solution. Obviously it could be simplified depending on the range of numbers you expect to deal with.
Here is my solution, im not using regex at all, for the sake of easy understanding:
static void Main() {
var wordString = "(Four)";
int wordStringLength = wordString.Replace("(","").Replace(")","").Length;
//4, because i'm assuming '(' and ')' doesn't count.
var sentenceString = "Maybe the fowl of Uruguay repeaters (Four) will be found";
//Transform into a list of words, ToList() to future use of IndexOf Method
var sentenceStringWords = sentenceString.Split(' ').ToList();
//Find the index of the word in the list of words
int wordIndex = sentenceStringWords.IndexOf(wordString);
//Get a subrange from the original list of words, going back x Times the legnth of word (in this case 4),
var wordsToConcat = sentenceStringWords.GetRange(wordIndex-wordStringLength, wordStringLength);
//Finally concat the output;
var outPut = string.Join(" ", wordsToConcat);
//Output: fowl of Uruguay repeaters
}
I have a solution for you:
string wordToMatch = "(Four)";
string sentence = "Maybe the fowl of Uruguay repeaters (Four) will be found";
if (sentence.Contains(wordToMatch))
{
int length = wordToMatch.Trim(new[] { '(', ')' }).Length;
int indexOfMatchedWord = sentence.IndexOf(wordToMatch);
string subString1 = sentence.Substring(0, indexOfMatchedWord);
string[] words = subString1.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
var reversed = words.Reverse().Take(length);
string result = string.Join(" ", reversed.Reverse());
Console.WriteLine(result);
Console.ReadLine();
}
Likely this can be improved regardng performance, but I have a feeling you don't care about that. Make sure you are using 'System.Linq'
I assumed empty returns when input is incomplete, feel free to correct me on that. Wasn't 100% clear in your post how this should be handled.
private string getPartialSentence(string sentence, string word)
{
if (string.IsNullOrEmpty(sentence) || string.IsNullOrEmpty(word))
return string.Empty;
int locationInSentence = sentence.IndexOf(word, StringComparison.Ordinal);
if (locationInSentence == -1)
return string.Empty;
string partialSentence = sentence.Substring(0, locationInSentence);
string[] words = partialSentence.Split(new[] {' '}, StringSplitOptions.RemoveEmptyEntries);
int nbWordsRequired = word.Replace("(", "").Replace(")", "").Length;
if (words.Count() >= nbWordsRequired)
return String.Join(" ", words.Skip(words.Count() - nbWordsRequired));
return String.Join(" ", words);
}
I went with using an enumeration and associated dictionary to pair the "(Four)" type strings with their integer values. You could just as easily (and maybe easier) go with a switch statement using
case "(Four)": { currentNumber = 4; };
I feel like the enum allows a little more flexibility though.
public enum NumberVerb
{
one = 1,
two = 2,
three = 3,
four = 4,
five = 5,
six = 6,
seven = 7,
eight = 8,
nine = 9,
ten = 10,
};
public static Dictionary<string, NumberVerb> m_Dictionary
{
get
{
Dictionary<string, NumberVerb> temp = new Dictionary<string, NumberVerb>();
temp.Add("(one)", NumberVerb.one);
temp.Add("(two)", NumberVerb.two);
temp.Add("(three)", NumberVerb.three);
temp.Add("(four)", NumberVerb.four);
temp.Add("(five)", NumberVerb.five);
temp.Add("(six)", NumberVerb.six);
temp.Add("(seven)", NumberVerb.seven);
temp.Add("(eight)", NumberVerb.eight);
temp.Add("(nine)", NumberVerb.nine);
temp.Add("(ten)", NumberVerb.ten);
return temp;
}
}
static void Main(string[] args)
{
string resultPhrase = "";
// Get the sentance that will be searched.
Console.WriteLine("Please enter the starting sentance:");
Console.WriteLine("(don't forget your keyword: ie '(Four)')");
string sentance = Console.ReadLine();
// Get the search word.
Console.WriteLine("Please enter the search keyword:");
string keyword = Console.ReadLine();
// Set the associated number of words to backwards-iterate.
int currentNumber = -1;
try
{
currentNumber = (int)m_Dictionary[keyword.ToLower()];
}
catch(KeyNotFoundException ex)
{
Console.WriteLine("The provided keyword was not found in the dictionary.");
}
// Search the sentance string for the keyword, and get the starting index.
Console.WriteLine("Searching for phrase...");
string[] words = sentance.Split(' ');
int searchResultIndex = -1;
for (int i = 0; (searchResultIndex == -1 && i < words.Length); i++)
{
if (words[i].Equals(keyword))
{
searchResultIndex = i;
}
}
// Handle the search results.
if (searchResultIndex == -1)
{
resultPhrase = "The keyword was not found.";
}
else if (searchResultIndex < currentNumber)
{
// Check the array boundaries with the given indexes.
resultPhrase = "Error: Out of bounds!";
}
else
{
// Get the preceding words.
for (int i = 0; i < currentNumber; i++)
{
resultPhrase = string.Format(" {0}{1}", words[searchResultIndex - 1 - i], resultPhrase);
}
}
// Display the preceding words.
Console.WriteLine(resultPhrase.Trim());
// Exit.
Console.ReadLine();
}

How to determine if a cell is empty in a .csv file?

I am trying to parse a column (File is composed of only 1 column filled with double numbers.) in a .csv file but C# throws me error when it encounters an empty cell.
{"Input string was not in a correct format."}
I want program to continue with the next cell when this happens. Is there a way?
Note: I tried
if(array[i] != null)
but this does not seem to work.
I use this block to read from .csv:
var column = new List<string>();
using (var rd = new StreamReader(#"pathofthecsvfile"))
{
while (!rd.EndOfStream)
{
var splits = rd.ReadLine().Split(';');
column.Add(splits[0]);
}
}
string[] arr = column.ToArray();
double[] array = new double[arr.Length];
//problem is in this block
for (int i = 0; i < array.Length; i++)
{
if (arr[i] != null)
{
array[i] = Convert.ToDouble(arr[i]);
}
}
I would check it when inserting:
if split[0] != null && split[0] != "" {
column.Add(splits[0]);
}
You are not validating the data you are loading into the array to see if it is in a valid format - Convert.ToDouble will fail for a number of reasons other than just null value and empty string.
If you are happy just to skip to next column, use a TryParse instead in the for loop:
double.TryParse(arr[i], out array[i]);

Issue with .NET String.Split

I'm attempting to parse a text file containing data that is being used on a remote FTP server. The data is delimited by an equals sign (=) and I'm attempting to load each row in to two columns in a DataGridView. The code I have written works fine except for when an equals character is thrown into the second column's value. When this happens, regardless of specifying the maximum count as being 2. I'd prefer not to change the delimiter if possible.
Here is the code that is being problematic:
dataGrid_FileContents.Rows.Clear();
char delimiter = '=';
StreamReader fileReader = new StreamReader(fileLocation);
String fileData = fileReader.ReadToEnd();
String[] rows = fileData.Split("\n".ToCharArray());
for(int i = 0; i < rows.Length; i++)
{
String str = rows[i];
String[] items = str.Split(new char[] { delimiter }, 1, StringSplitOptions.RemoveEmptyEntries);
if (items.Length == 2)
{
dataGrid_FileContents.Rows.Add(items[0], items[1]);
}
}
fileReader.Close();
And an example of the file being loaded:
boats=123
cats=234-f
cars==1
It works as intended for the first two rows and then ignores the last row as it ends up creating a String[] with 1 element and two String[]s with zero elements.
Try the following. It will capture the value before and after the first '=', correctly parsing the cars==1 scenario.
String[] items = str.Split(new char[] { delimiter }, 2, stringSplitOptions.None);
A different solution, if you want everything after the first equals then you could approach this problem using string.IndexOf
for(int i = 0; i < rows.Length; i++)
{
String str = rows[i];
int pos = str.IndexOf(delimiter);
if (pos != -1)
{
string first = str.Substring(0, pos-1);
string second = str.Substring(pos + 1);
dataGrid_FileContents.Rows.Add(first, second);
}
}
Just read all items delimeted by '=' in row.
Then iterate over items, and check, that item not empty, than use this prepared data to write
here illustrated snippet
http://dotnetfiddle.net/msVho2
and your snippet can be transformed to something like bellow
dataGrid_FileContents.Rows.Clear();
char delimiter = '=';
using(StreamReader fileReader = new StreamReader(fileLocation))
{
string[] data = new string[2];
while(true)
{
string row = fileReader.ReadLine();
if(row == null)
break;
string[] items = row.Split(delimiter);
int data_index = 0;
foreach(string item in items)
{
if(data_index >= data.Length)
{
//TODO: log warning
break;
}
if(!string.IsNullOrWhiteSpace(item))
{
data[data_index++] = item;
}
}
if(data_index < data.Length)
{
//TODO: log error, only 1 item in row
continue;
}
dataGrid_FileContents.Rows.Add(data[0], data[1]);
}
}

C# Split not working

I'm trying to read data from a file, split the data and save to array. The code works fine except for the split. It is returning a NullException.
Any help would be greatly appreciated.
public static void LoadHandData(CurrentHand[] handData, string fileName)
{
string input = ""; //temporary variable to hold one line of data
string[] cardData; //temporary array to hold data split from input
StreamReader readHand = new StreamReader(fileName);
for (int counter = 0; counter < handData.Length; counter++)
{
input = readHand.ReadLine(); //one record
cardData = input.Split(' '); //split record into fields
int index = 0;
handData[counter].cardSuit = Convert.ToChar(cardData[index++]);
handData[counter].cardValue = Convert.ToInt16(cardData[index++]);
}
readHand.Close();
}
As per the comments, you've only got one line of data. But look at your loop:
for (int counter = 0; counter < handData.Length; counter++)
{
input = readHand.ReadLine(); //one record
cardData = input.Split(' '); //split record into fields
int index = 0;
handData[counter].cardSuit = Convert.ToChar(cardData[index++]);
handData[counter].cardValue = Convert.ToInt16(cardData[index++]);
}
That's trying to read one line per hand. On the second iteration, ReadLine will return null, so when you call input.Split() you'll end up with the NullReferenceException you're seeing.
You need to read the line once and split it. Given that you've only got one line of text, you can just use File.ReadAllText to simplify things:
string input = File.ReadAllText(fileName);
string[] cardData = input.Split(' ');
for (int counter = 0; counter < handData.Length; counter++)
{
handData[counter].cardSuit = Convert.ToChar(cardData[counter * 2]);
handData[counter].cardValue = Convert.ToInt16(cardData[counter * 2 + 1]);
}

Categories