How to split long strings into fixed array items - c#

I have a dictionary object IDictionary<string, string> which only has the following items:
Item1, Items2 and Item3. Each item only has a maximum length of 50 characters.
I then have a list of words List<string>. I need a loop that will go through the words and add them to the dictionary starting at Item1, but before adding it to the dictionary the length needs to be checked. If the new item and the current item's length added together is greater than 50 characters then the word needs to move down to the next line (in this case Item2).
What is the best way to do this?

I'm not sure why this question got voted down so much, but perhaps the reason is you have a very clear algorithm already, so it should be trivial to get the C# code. As it stands, either you're either really inexperienced or really lazy. I'm going to assume the former.
Anyways, lets walk through the requirements.
1) "I then have a list of words List." you have this line in some form already.
List<string> words = GetListOfWords();
2) "go through the words and add them to the dictionary starting at Item1" -- I'd recommend a List instead of a dictionary, if you're going for a sequence of strings. Also, you'll need a temporary variable to store the contents of the current line, because you're really after adding a full line at a time.
var lines = new List<string>();
string currentLine = "";
3) "I need a loop that will go through the words"
foreach(var word in words) {
4) " If the new item and the current item's length added together is greater than 50 characters" -- +1 for the space.
if (currentLine.Length + word.Length + 1 > 50) {
5) "then the word needs to move down to the next line "
lines.Add(currentLine);
currentLine = word;
}
6) " go through the words and add them to the dictionary starting at Item1 " -- you didn't phrase this very clearly. What you meant was you want to join each word to the last line unless it would make the line exceed 50 characters.
else {
currentLine += " " + word;
}
}
lines.Add(currentLine); // the last unfinished line
and there you go.
If you absolutely need it as an IDictionary with 3 lines, just do
var dict = new Dictionary<string,string>();
for(int lineNum = 0; lineNum < 3; lineNum ++)
dict["Address"+lineNum] = lineNume < lines.Length ? lines[lineNum] : "";

I currently have this and was wondering if there was perhaps a better solution:
public IDictionary AddWordsToDictionary(IList words)
{
IDictionary addressParts = new Dictionary();
addressParts.Add("Address1", string.Empty);
addressParts.Add("Address2", string.Empty);
addressParts.Add("Address3", string.Empty);
int currentIndex = 1;
foreach (string word in words)
{
if (!string.IsNullOrEmpty(word))
{
string key = string.Concat("Address", currentIndex);
int space = 0;
string spaceChar = string.Empty;
if (!string.IsNullOrEmpty(addressParts[key]))
{
space = 1;
spaceChar = " ";
}
if (word.Length + space + addressParts[key].Length > MaxAddressLineLength)
{
currentIndex++;
key = string.Concat("Address", currentIndex);
space = 0;
spaceChar = string.Empty;
if (currentIndex > addressParts.Count)
{
break; // To big for all 3 elements so discard
}
}
addressParts[key] = string.Concat(addressParts[key], spaceChar, word);
}
}
return addressParts;
}

I would just do the following and do what you like with the resulting "lines"
static List<string> BuildLines(List<string> words, int lineLen)
{
List<string> lines = new List<string>();
string line = string.Empty;
foreach (string word in words)
{
if (string.IsNullOrEmpty(word)) continue;
if (line.Length + word.Length + 1 <= lineLen)
{
line += " " + word;
}
else
{
lines.Add(line);
line = word;
}
}
lines.Add(line);
return lines;
}

Related

Using a for loop to iterate from an array to a list

I have a text file that is divided up into many sections, each about 10 or so lines long. I'm reading in the file using File.ReadAllLines into an array, one line per element of the array, and I'm then I'm trying to parse each section of the file to bring back just some of the data. I'm storing the results in a list, and hoping to export the list to csv ultimately.
My for loop is giving me trouble, as it loops through the right amount of times, but only pulls the data from the first section of the text file each time rather than pulling the data from the first section and then moving on and pulling the data from the next section. I'm sure I'm doing something wrong either in my for loop or for each loop. Any clues to help me solve this would be much appreciated! Thanks
David
My code so far:
namespace ParseAndExport
{
class Program
{
static readonly string sourcefile = #"Path";
static void Main(string[] args)
{
string[] readInLines = File.ReadAllLines(sourcefile);
int counter = 0;
int holderCPStart = counter + 3;//Changed Paths will be an different number of lines each time, but will always start 3 lines after the startDiv
/*Need to find the start of the section and the end of the section and parse the bit in between.
* Also need to identify the blank line that occurs in each section as it is essentially a divider too.*/
int startDiv = Array.FindIndex(readInLines, counter, hyphens72);
int blankLine = Array.FindIndex(readInLines, startDiv, emptyElement);
int endDiv = Array.FindIndex(readInLines, counter + 1, hyphens72);
List<string> results = new List<string>();
//Test to see if FindIndexes work. Results should be 0, 7, 9 for 1st section of sourcefile
/*Console.WriteLine(startDiv);
Console.WriteLine(blankLine);
Console.WriteLine(endDiv);*/
//Check how long the file is so that for testing we know how long the while loop should run for
//Console.WriteLine(readInLines.Length);
//sourcefile has 5255 lines (elements) in the array
for (int i = 0; i <= readInLines.Length; i++)
{
if (i == startDiv)
{
results = (readInLines[i + 1].Split('|').Select(p => p.Trim()).ToList());
string holderCP = string.Join(Environment.NewLine, readInLines, holderCPStart, (blankLine - holderCPStart - 1)).Trim();
results.Add(holderCP);
string comment = string.Join(" ", readInLines, blankLine + 1, (endDiv - (blankLine + 1)));//in case the comment is more than one line long
results.Add(comment);
i = i + 1;
}
else
{
i = i + 1;
}
foreach (string result in results)
{
Console.WriteLine(result);
}
//csvcontent.AppendLine("Revision Number, Author, Date, Time, Count of Lines, Changed Paths, Comments");
/* foreach (string result in results)
{
for (int x = 0; x <= results.Count(); x++)
{
StringBuilder csvcontent = new StringBuilder();
csvcontent.AppendLine(results[x] + "," + results[x + 1] + "," + results[x + 2] + "," + results[x + 3] + "," + results[x + 4] + "," + results[x + 5]);
x = x + 6;
string csvpath = #"addressforcsvfile";
File.AppendAllText(csvpath, csvcontent.ToString());
}
}*/
}
Console.ReadKey();
}
private static bool hyphens72(String h)
{
if (h == "------------------------------------------------------------------------")
{
return true;
}
else
{
return false;
}
}
private static bool emptyElement(String ee)
{
if (ee == "")
{
return true;
}
else
{
return false;
}
}
}
}
It looks like you are trying to grab all of the lines in a file that are not "------" and put them into a list of strings.
You can try this:
var lineswithoutdashes = readInLines.Where(x => x != hyphens72).Select(x => x).ToList();
Now you can take this list and do the split with a '|' to extract the fields you wanted
The logic seems wrong. There are issues with the code in itself also. I am unsure what precisely you're trying to do. Anyway, a few hints that I hope will help:
The if (i == startDiv) checks to see if I equals startDiv. I assume the logic that happens when this condition is met, is what you refer to as "pulls the data from the first section". That's correct, given you only run this code when I equals startDiv.
You increase the counter I inside the for loop, which in itself also increases the counter i.
If the issue in 2. wouldn't exists then I'd suggest to not do the same operation "i = i + 1" in both the true and false conditions of the if (i == startDiv).
Given I assume this file might actually be massive, it's probably a good idea to not store it in memory, but just read the file line by line and process line by line. There's currently no obvious reason why you'd want to consume this amount of memory, unless it's because of the convenience of this API "File.ReadAllLines(sourcefile)". I wouldn't be too scared to read the file like this:
Try (BufferedReader br = new BufferedReader(new FileReader (file))) {
String line;
while ((line = br.readLine()) != null) {
// process the line.
}
}
You can skip the lines until you've passed where the line equals hyphens72.
Then for each line, you process the line with the code you provided in the true case of (i == startDiv), or at least, from what you described, this is what I assume you are trying to do.
int startDiv will return the line number that contains hyphens72.
So your current for loop will only copy to results for the single line that matches the calculated line number.
I guess you want to search the postion of startDiv in the current line?
const string hyphens72;
// loop over lines
for (var lineNumber = 0; lineNumber <= readInLines.Length; lineNumber++) {
string currentLine = readInLines[lineNumber];
int startDiv = currentLine.IndexOf(hyphens72);
// loop over characters in line
for (var charIndex = 0; charIndex < currentLine.Length; charIndex++) {
if (charIndex == startDiv) {
var currentCharacter = currentLine[charIndex];
// write to result ...
}
else {
continue; // skip this character
}
}
}
There are a several things which could be improved.
I would use ReadLines over File.ReadAllLines( because ReadAllLines reads all the lines at ones. ReadLines will stream it.
With the line results = (readInLines[i + 1].Split('|').Select(p => p.Trim()).ToList()); you're overwriting the previous results list. You'd better use results.AddRange() to add new results.
for (int i = 0; i <= readInLines.Length; i++) means when the length = 10 it will do 11 iterations. (1 too many) (remove the =)
Array.FindIndex(readInLines, counter, hyphens72); will do a scan. On large files it will take ages to completely read them and search in it. Try to touch a single line only ones.
I cannot test what you are doing, but here's a hint:
IEnumerable<string> readInLines = File.ReadLines(sourcefile);
bool started = false;
List<string> results = new List<string>();
foreach(var line in readInLines)
{
// skip empty lines
if(emptyElement(line))
continue;
// when dashes are found, flip a boolean to activate the reading mode.
if(hyphens72(line))
{
// flip state.. (start/end)
started != started;
}
if(started)
{
// I don't know what you are doing here precisely, do what you gotta do. ;-)
results.AddRange((line.Split('|').Select(p => p.Trim()).ToList()));
string holderCP = string.Join(Environment.NewLine, readInLines, holderCPStart, (blankLine - holderCPStart - 1)).Trim();
results.Add(holderCP);
string comment = string.Join(" ", readInLines, blankLine + 1, (endDiv - (blankLine + 1)));//in case the comment is more than one line long
results.Add(comment);
}
}
foreach (string result in results)
{
Console.WriteLine(result);
}
You might want to start with a class like this. I don't know whether each section begins with a row of hyphens, or if it's just in between. This should handle either scenario.
What this is going to do is take your giant list of strings (the lines in the file) and break it into chunks - each chunk is a set of lines (10 or so lines, according to your OP.)
The reason is that it's unnecessarily complicated to try to read the file, looking for the hyphens, and process the contents of the file at the same time. Instead, one class takes the input and breaks it into chunks. That's all it does.
Another class might read the file and pass its contents to this class to break them up. Then the output is the individual chunks of text.
Another class can then process those individual sections of 10 or so lines without having to worry about hyphens or what separates on chunk from another.
Now that each of these classes is doing its own thing, it's easier to write unit tests for each of them separately. You can test that your "processing" class receives an array of 10 or so lines and does whatever it's supposed to do with them.
public class TextSectionsParser
{
private readonly string _delimiter;
public TextSectionsParser(string delimiter)
{
_delimiter = delimiter;
}
public IEnumerable<IEnumerable<string>> ParseSections(IEnumerable<string> lines)
{
var result = new List<List<string>>();
var currentList = new List<string>();
foreach (var line in lines)
{
if (line == _delimiter)
{
if(currentList.Any())
result.Add(currentList);
currentList = new List<string>();
}
else
{
currentList.Add(line);
}
}
if (currentList.Any() && !result.Contains(currentList))
{
result.Add(currentList);
}
return result;
}
}

C# How to generate a new string based on multiple ranged index

Let's say I have a string like this one, left part is a word, right part is a collection of indices (single or range) used to reference furigana (phonetics) for kanjis in my word:
string myString = "子で子にならぬ時鳥,0:こ;2:こ;7-8:ほととぎす"
The pattern in detail:
word,<startIndex>(-<endIndex>):<furigana>
What would be the best way to achieve something like this (with a space in front of the kanji to mark which part is linked to the [furigana]):
子[こ]で 子[こ]にならぬ 時鳥[ほととぎす]
Edit: (thanks for your comments guys)
Here is what I wrote so far:
static void Main(string[] args)
{
string myString = "ABCDEF,1:test;3:test2";
//Split Kanjis / Indices
string[] tokens = myString.Split(',');
//Extract furigana indices
string[] indices = tokens[1].Split(';');
//Dictionnary to store furigana indices
Dictionary<string, string> furiganaIndices = new Dictionary<string, string>();
//Collect
foreach (string index in indices)
{
string[] splitIndex = index.Split(':');
furiganaIndices.Add(splitIndex[0], splitIndex[1]);
}
//Processing
string result = tokens[0] + ",";
for (int i = 0; i < tokens[0].Length; i++)
{
string currentIndex = i.ToString();
if (furiganaIndices.ContainsKey(currentIndex)) //add [furigana]
{
string currentFurigana = furiganaIndices[currentIndex].ToString();
result = result + " " + tokens[0].ElementAt(i) + string.Format("[{0}]", currentFurigana);
}
else //nothing to add
{
result = result + tokens[0].ElementAt(i);
}
}
File.AppendAllText(#"D:\test.txt", result + Environment.NewLine);
}
Result:
ABCDEF,A B[test]C D[test2]EF
I struggle to find a way to process ranged indices:
string myString = "ABCDEF,1:test;2-3:test2";
Result : ABCDEF,A B[test] CD[test2]EF
I don't have anything against manually manipulating strings per se. But given that you seem to have a regular pattern describing the inputs, it seems to me that a solution that uses regex would be more maintainable and readable. So with that in mind, here's an example program that takes that approach:
class Program
{
private const string _kinvalidFormatException = "Invalid format for edit specification";
private static readonly Regex
regex1 = new Regex(#"(?<word>[^,]+),(?<edit>(?:\d+)(?:-(?:\d+))?:(?:[^;]+);?)+", RegexOptions.Compiled),
regex2 = new Regex(#"(?<start>\d+)(?:-(?<end>\d+))?:(?<furigana>[^;]+);?", RegexOptions.Compiled);
static void Main(string[] args)
{
string myString = "子で子にならぬ時鳥,0:こ;2:こ;7-8:ほととぎす";
string result = EditString(myString);
}
private static string EditString(string myString)
{
Match editsMatch = regex1.Match(myString);
if (!editsMatch.Success)
{
throw new ArgumentException(_kinvalidFormatException);
}
int ichCur = 0;
string input = editsMatch.Groups["word"].Value;
StringBuilder text = new StringBuilder();
foreach (Capture capture in editsMatch.Groups["edit"].Captures)
{
Match oneEditMatch = regex2.Match(capture.Value);
if (!oneEditMatch.Success)
{
throw new ArgumentException(_kinvalidFormatException);
}
int start, end;
if (!int.TryParse(oneEditMatch.Groups["start"].Value, out start))
{
throw new ArgumentException(_kinvalidFormatException);
}
Group endGroup = oneEditMatch.Groups["end"];
if (endGroup.Success)
{
if (!int.TryParse(endGroup.Value, out end))
{
throw new ArgumentException(_kinvalidFormatException);
}
}
else
{
end = start;
}
text.Append(input.Substring(ichCur, start - ichCur));
if (text.Length > 0)
{
text.Append(' ');
}
ichCur = end + 1;
text.Append(input.Substring(start, ichCur - start));
text.Append(string.Format("[{0}]", oneEditMatch.Groups["furigana"]));
}
if (ichCur < input.Length)
{
text.Append(input.Substring(ichCur));
}
return text.ToString();
}
}
Notes:
This implementation assumes that the edit specifications will be listed in order and won't overlap. It makes no attempt to validate that part of the input; depending on where you are getting your input from you may want to add that. If it's valid for the specifications to be listed out of order, you can also extend the above to first store the edits in a list and sort the list by the start index before actually editing the string. (In similar fashion to the way the other proposed answer works; though, why they are using a dictionary instead of a simple list to store the individual edits, I have no idea…that seems arbitrarily complicated to me.)
I included basic input validation, throwing exceptions where failures occur in the pattern matching. A more user-friendly implementation would add more specific information to each exception, describing what part of the input actually was invalid.
The Regex class actually has a Replace() method, which allows for complete customization. The above could have been implemented that way, using Replace() and a MatchEvaluator to provide the replacement text, instead of just appending text to a StringBuilder. Which way to do it is mostly a matter of preference, though the MatchEvaluator might be preferred if you have a need for more flexible implementation options (i.e. if the exact format of the result can vary).
If you do choose to use the other proposed answer, I strongly recommend you use StringBuilder instead of simply concatenating onto the results variable. For short strings it won't matter much, but you should get into the habit of always using StringBuilder when you have a loop that is incrementally adding onto a string value, because for long string the performance implications of using concatenation can be very negative.
This should do it (and even handle ranged indices), based on the formatting of the input string you have-
using System;
using System.Collections.Generic;
public class stringParser
{
private struct IndexElements
{
public int start;
public int end;
public string value;
}
public static void Main()
{
//input string
string myString = "子で子にならぬ時鳥,0:こ;2:こ;7-8:ほととぎす";
int wordIndexSplit = myString.IndexOf(',');
string word = myString.Substring(0,wordIndexSplit);
string indices = myString.Substring(wordIndexSplit + 1);
string[] eachIndex = indices.Split(';');
Dictionary<int,IndexElements> index = new Dictionary<int,IndexElements>();
string[] elements;
IndexElements e;
int dash;
int n = 0;
int last = -1;
string results = "";
foreach (string s in eachIndex)
{
e = new IndexElements();
elements = s.Split(':');
if (elements[0].Contains("-"))
{
dash = elements[0].IndexOf('-');
e.start = int.Parse(elements[0].Substring(0,dash));
e.end = int.Parse(elements[0].Substring(dash + 1));
}
else
{
e.start = int.Parse(elements[0]);
e.end = e.start;
}
e.value = elements[1];
index.Add(n,e);
n++;
}
//this is the part that takes the "setup" from the parts above and forms the result string
//loop through each of the "indices" parsed above
for (int i = 0; i < index.Count; i++)
{
//if this is the first iteration through the loop, and the first "index" does not start
//at position 0, add the beginning characters before its start
if (last == -1 && index[i].start > 0)
{
results += word.Substring(0,index[i].start);
}
//if this is not the first iteration through the loop, and the previous iteration did
//not stop at the position directly before the start of the current iteration, add
//the intermediary chracters
else if (last != -1 && last + 1 != index[i].start)
{
results += word.Substring(last + 1,index[i].start - (last + 1));
}
//add the space before the "index" match, the actual match, and then the formatted "index"
results += " " + word.Substring(index[i].start,(index[i].end - index[i].start) + 1)
+ "[" + index[i].value + "]";
//remember the position of the ending for the next iteration
last = index[i].end;
}
//if the last "index" did not stop at the end of the input string, add the remaining characters
if (index[index.Keys.Count - 1].end + 1 < word.Length)
{
results += word.Substring(index[index.Keys.Count-1].end + 1);
}
//trimming spaces that may be left behind
results = results.Trim();
Console.WriteLine("INPUT - " + myString);
Console.WriteLine("OUTPUT - " + results);
Console.Read();
}
}
input - 子で子にならぬ時鳥,0:こ;2:こ;7-8:ほととぎす
output - 子[こ]で 子[こ]にならぬ 時鳥[ほととぎす]
Note that this should also work with characters the English alphabet if you wanted to use English instead-
input - iliketocodeverymuch,2:A;4-6:B;9-12:CDEFG
output - il i[A]k eto[B]co deve[CDEFG]rymuch

Iterating backwards through an char array after finding a known word

I've got a project I'm working on in C#. I've got two char array's. One is a sentence and one is a word. I've got to iterate through the sentence array until I find a word that matches the word that was turned into an word array what I'm wondering is once I find the word how do I iterate backwards through the sentence array at the point I found the word back through the same length as the word array?
Code :
String wordString = "(Four)";
String sentenceString = "Maybe the fowl of Uruguay repeaters (Four) will be found";
char[] wordArray = wordString.ToCharArray();
List<String> words = sentenceString.Split(' ').ToList<string>();
//This would be the part where I iterate through sentence
foreach (string sentence in sentArray)
{
//Here I would need to find where the string of (Four) and trim it and see if it equals the wordString.
if (sentence.Contains(wordString)
{
//At this point I would need to go back the length of wordString which happens to be four places but I'm not sure how to do this. And for each word I go back in the sentence I need to capture that in another string array.
}
I don't know if I'm being clear enough on this but if I'm not please feel free to ask.. Thank you in advanced. Also what this should return is "fowl of Uruguay repeaters". So basically the use case is for the number of letters in the parenthesis the logic should return the same number of words before the word in parenthesis.
our are you. I have few question concerned to this exercise.
If the Word (four) was in beginning it should not return? or return all string?
As the length of four is equal to 4 imagine if that word appear as the second word on sentence what it should return just the first word or return 4 words even including the (four) word.?
My solution is the laziest one I just see your question and decide to help.
My solution assumes it return all the word before the (four) if the length is bigger than the word before the (four) word.
My solution return empty string if the word (four) is in the beginning.
My solution return Length of (four) (4) words before (four) word.
ONCE AGAIN IT IS NOT MY BEST APPROACH.
I see the code bellow:
string wordString = "(Four)";
string sentenceString = "Maybe the fowl of Uruguay repeaters (Four) will be found";
//Additionally you can add splitoption to remove the empty word on split function bellow
//Just in case there are more space in sentence.
string[] splitedword = sentenceString.Split(' ');
int tempBackposition = 0;
int finalposition = 0;
for (int i = 0; i < splitedword.Length; i++)
{
if (splitedword[i].Contains(wordString))
{
finalposition = i;
break;
}
}
tempBackposition = finalposition - wordString.Replace("(","").Replace(")","").Length;
string output = "";
tempBackposition= tempBackposition<0?0:tempBackposition;
for (int i = tempBackposition; i < finalposition; i++)
{
output += splitedword[i] + " ";
}
Console.WriteLine(output);
Console.ReadLine();
If it's not what you want can you answer my questions on top? or help me to understand were it's wrong
int i ;
string outputString = (i=sentenceString.IndexOf(wordString))<0 ?
sentenceString : sentenceString.Substring(0,i) ;
var wordString = "(Four)";
int wordStringInt = 4; // Just do switch case to convert your string to int
var sentenceString = "Maybe the fowl of Uruguay repeaters (Four) will be found";
var sentenceStringArray = sentenceString.Split(' ').ToList();
int wordStringIndexInArray = sentenceStringArray.IndexOf(wordString) - 1;
var stringOutPut = "";
if (wordStringIndexInArray > 0 && wordStringIndexInArray > wordStringInt)
{
stringOutPut = "";
while (wordStringInt > 0)
{
stringOutPut = sentenceStringArray[wordStringInt] + " " + stringOutPut;
wordStringInt--;
}
}
What you are matching is kind of complex, so for a more general solution you could use regular expressions.
First we declare what we are searching for:
string word = "(Four)";
string sentence = "Maybe the fowl of Uruguay repeaters (Four) will be found";
We will then search for the words in this string using regular expressions. Since we don't want to match whitespace, and we need to know where each match actually starts and we need to know the word inside the parenthesis we tell it that we optionally want opening and ending parenthesis, but we also want the contents of those as a match:
var words = Regex.Matches(sentence, #"[\p{Ps}]*(?<Content>[\w]+)[\p{Pe}]*").Cast<Match>().ToList();
[\p{Ps}] means we want opening punctuation ([{ etc. while the * indicates zero or more.
Followed is a sub-capture called Content (specified by ?<Content>) with one or more word characters.
At the end we specify that we want zero or more ending punctuation.
We then need to find the word in the list of matches:
var item = words.Single(x => x.Value == word);
Then we need to find this item's index:
int index = words.IndexOf(item);
At this point we just need to know the length of the contents:
var length = item.Groups["Content"].Length;
This length we use to go back in the string 4 words
var start = words.Skip(index - length).First();
And now we have everything we need:
var result = sentence.Substring(start.Index, item.Index - start.Index);
Result should contain fowl of Uruguay repeaters.
edit: It may be a lot simpler to just figure out the count from the word rather than from the content. In that case the complete code should be the following:
string word = "(Four)";
string sentence = "Maybe the fowl of Uruguay repeaters (Four) will be found";
var wordMatch = Regex.Match(word, #"[\p{Ps}]*(?<Content>[\w]+)[\p{Pe}]*");
var length = wordMatch.Groups["Content"].Length;
var words = Regex.Matches(sentence, #"\S+").Cast<Match>().ToList();
var item = words.Single(x => x.Value == word);
int index = words.IndexOf(item);
var start = words.Skip(index - length).First();
var result = sentence.Substring(start.Index, item.Index - start.Index);
\S+ in this case means "match one or more non-whitespace character".
You should try something like the following, which uses Array.Copy after it finds the number word. You will still have to implement the ConvertToNum function correctly (it is hardcoded in for now), but this should be a quick and easy solution.
string[] GetWords()
{
string sentenceString = "Maybe the fowl of Uruguay repeaters (Four) will be found";
string[] words = sentenceString.Split();
int num = 0;
int i; // scope of i should remain outside the for loop
for (i = 0; i < words.Length; i++)
{
string word = words[i];
if (word.StartsWith("(") && word.EndsWith(")"))
{
num = ConvertToNum(word.Substring(1, word.Length - 1));
// converted the number word we found, so we break
break;
}
}
if (num == 0)
{
// no number word was found in the string - return empty array
return new string[0];
}
// do some extra checking if number word exceeds number of previous words
int startIndex = i - num;
// if it does - just start from index 0
startIndex = startIndex < 0 ? 0 : startIndex;
int length = i - startIndex;
string[] output = new string[length];
Array.Copy(words, startIndex, output, 0, length);
return output;
}
// Convert the number word to an integer
int ConvertToNum(string numberStr)
{
return 4; // you should implement this method correctly
}
See - Convert words (string) to Int, for help implementing the ConvertToNum solution. Obviously it could be simplified depending on the range of numbers you expect to deal with.
Here is my solution, im not using regex at all, for the sake of easy understanding:
static void Main() {
var wordString = "(Four)";
int wordStringLength = wordString.Replace("(","").Replace(")","").Length;
//4, because i'm assuming '(' and ')' doesn't count.
var sentenceString = "Maybe the fowl of Uruguay repeaters (Four) will be found";
//Transform into a list of words, ToList() to future use of IndexOf Method
var sentenceStringWords = sentenceString.Split(' ').ToList();
//Find the index of the word in the list of words
int wordIndex = sentenceStringWords.IndexOf(wordString);
//Get a subrange from the original list of words, going back x Times the legnth of word (in this case 4),
var wordsToConcat = sentenceStringWords.GetRange(wordIndex-wordStringLength, wordStringLength);
//Finally concat the output;
var outPut = string.Join(" ", wordsToConcat);
//Output: fowl of Uruguay repeaters
}
I have a solution for you:
string wordToMatch = "(Four)";
string sentence = "Maybe the fowl of Uruguay repeaters (Four) will be found";
if (sentence.Contains(wordToMatch))
{
int length = wordToMatch.Trim(new[] { '(', ')' }).Length;
int indexOfMatchedWord = sentence.IndexOf(wordToMatch);
string subString1 = sentence.Substring(0, indexOfMatchedWord);
string[] words = subString1.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
var reversed = words.Reverse().Take(length);
string result = string.Join(" ", reversed.Reverse());
Console.WriteLine(result);
Console.ReadLine();
}
Likely this can be improved regardng performance, but I have a feeling you don't care about that. Make sure you are using 'System.Linq'
I assumed empty returns when input is incomplete, feel free to correct me on that. Wasn't 100% clear in your post how this should be handled.
private string getPartialSentence(string sentence, string word)
{
if (string.IsNullOrEmpty(sentence) || string.IsNullOrEmpty(word))
return string.Empty;
int locationInSentence = sentence.IndexOf(word, StringComparison.Ordinal);
if (locationInSentence == -1)
return string.Empty;
string partialSentence = sentence.Substring(0, locationInSentence);
string[] words = partialSentence.Split(new[] {' '}, StringSplitOptions.RemoveEmptyEntries);
int nbWordsRequired = word.Replace("(", "").Replace(")", "").Length;
if (words.Count() >= nbWordsRequired)
return String.Join(" ", words.Skip(words.Count() - nbWordsRequired));
return String.Join(" ", words);
}
I went with using an enumeration and associated dictionary to pair the "(Four)" type strings with their integer values. You could just as easily (and maybe easier) go with a switch statement using
case "(Four)": { currentNumber = 4; };
I feel like the enum allows a little more flexibility though.
public enum NumberVerb
{
one = 1,
two = 2,
three = 3,
four = 4,
five = 5,
six = 6,
seven = 7,
eight = 8,
nine = 9,
ten = 10,
};
public static Dictionary<string, NumberVerb> m_Dictionary
{
get
{
Dictionary<string, NumberVerb> temp = new Dictionary<string, NumberVerb>();
temp.Add("(one)", NumberVerb.one);
temp.Add("(two)", NumberVerb.two);
temp.Add("(three)", NumberVerb.three);
temp.Add("(four)", NumberVerb.four);
temp.Add("(five)", NumberVerb.five);
temp.Add("(six)", NumberVerb.six);
temp.Add("(seven)", NumberVerb.seven);
temp.Add("(eight)", NumberVerb.eight);
temp.Add("(nine)", NumberVerb.nine);
temp.Add("(ten)", NumberVerb.ten);
return temp;
}
}
static void Main(string[] args)
{
string resultPhrase = "";
// Get the sentance that will be searched.
Console.WriteLine("Please enter the starting sentance:");
Console.WriteLine("(don't forget your keyword: ie '(Four)')");
string sentance = Console.ReadLine();
// Get the search word.
Console.WriteLine("Please enter the search keyword:");
string keyword = Console.ReadLine();
// Set the associated number of words to backwards-iterate.
int currentNumber = -1;
try
{
currentNumber = (int)m_Dictionary[keyword.ToLower()];
}
catch(KeyNotFoundException ex)
{
Console.WriteLine("The provided keyword was not found in the dictionary.");
}
// Search the sentance string for the keyword, and get the starting index.
Console.WriteLine("Searching for phrase...");
string[] words = sentance.Split(' ');
int searchResultIndex = -1;
for (int i = 0; (searchResultIndex == -1 && i < words.Length); i++)
{
if (words[i].Equals(keyword))
{
searchResultIndex = i;
}
}
// Handle the search results.
if (searchResultIndex == -1)
{
resultPhrase = "The keyword was not found.";
}
else if (searchResultIndex < currentNumber)
{
// Check the array boundaries with the given indexes.
resultPhrase = "Error: Out of bounds!";
}
else
{
// Get the preceding words.
for (int i = 0; i < currentNumber; i++)
{
resultPhrase = string.Format(" {0}{1}", words[searchResultIndex - 1 - i], resultPhrase);
}
}
// Display the preceding words.
Console.WriteLine(resultPhrase.Trim());
// Exit.
Console.ReadLine();
}

If a word in a foreach loop contains xxx, show previous word

If a word matches with a string I compare with, I want to see the previous word I checked. Sounds very hard to understand, so I hope the code explains it a bit better:
String line = "qwerty/asdfg/xxx/zxcvb";
string[] words = line.Split('/');
foreach (string word in words)
{
if (word.Contains("xxx"))
{
Console.WriteLine(word.Last);
Console.Writeline(word.Next); //It doenst work this way, but I hope you understand the idea
}
}
What you need to do is have a variable that maintains the current state of your search. In your case, it's pretty simple as all you want to remember is the last word checked:
string line = "qwerty/asdfg/xxx/zxcvb";
string[] words = line.Split('/');
string previousWord = "";
foreach (string word in words)
{
if (word.Contains("xxx"))
{
Console.WriteLine(previousWord);
}
previousWord = word;
}
Just store the last word you checked.
In addition, you should be using String.IndexOf() if you have any need for culture or case sensitivity awareness.
String line = "qwerty/asdfg/xxx/zxcvb";
string[] words = line.Split('/');
string previous = words.First();
string next = "";
foreach (string word in words)
{
var currentIndex = (Array.IndexOf(words, word));
if (currentIndex + 1 < words.Length)
{
next = words[currentIndex + 1];
}
if (word.IndexOf("xxx", StringComparison.CurrentCulture) > -1)
{
Console.WriteLine(previous);
if (next == word)
{
Console.WriteLine("The match was the last word and no next word is available.");
}
else
{
Console.WriteLine(next);
}
}
else
{
previous = word;
}
}
In this particular case, "asdfg" is outputted as the previous and "zxcvb" is next. However, if you are looking instead for "zxcvb" as the match, "The match was the last word and no next word is available." would be outputted.
Just use a for loop instead of a foreach
String line = "qwerty/asdfg/xxx/zxcvb";
string[] words = line.Split('/');
for (int i = 0; i < words.Length; i++)
{
if (words[i].Contains("xxx"))
{
// Check to make sure you to go beyond the first element
if (i - 1 > -1)
{
// Previous word
Console.WriteLine(words[i - 1]);
}
// Check to make sure you to go beyond the last element
if (i + 1 < words.Length)
{
// Next word
Console.WriteLine(words[i + 1]);
}
}
}
Console.ReadLine();
Results:
asdfg
zxcvb
Without Loop
You could also use Array.IndexOf()
String line = "qwerty/asdfg/xxx/zxcvb";
string[] words = line.Split('/');
int index = Array.IndexOf(words, "xxx");
// Check to make sure you to go beyond the first element
if (index - 1 > -1)
{
// Previous word
Console.WriteLine(words[index - 1]);
}
// Check to make sure you to go beyond the last element
if (index != -1 && index + 1 < words.Length)
{
// Next word
Console.WriteLine(words[index + 1]);
}
Console.ReadLine();
String line = "xxx/qwerty/asdfg/xxx/zxcvb";
string[] words = line.Split('/');
for (int i = 0; i < words.Length; i++)
{
if (words[i].Contains("xxx"))
{
Console.WriteLine(words[i]); //current word
if (i == 0)
Console.WriteLine("No Previous Word");
else
Console.WriteLine(words[i - 1]); //PreviousWord
if (i == words.Length - 1)
Console.WriteLine("No Next Word");
else
Console.WriteLine(words[i + 1]); //NextWord
}
}
words[words.IndexOf(word) - 1] should do the trick.
Edit: Forget this. Almo's is nicer and not prone to Exceptions.

Find and count latter pair from a string by alphabetical order

I am working on a small project which is in C# where I want to find and count the latter pairs which comes in alphabetical order by ignoring spaces and special characters.
e.g.
This is a absolutely easy.
Here my output should be
hi 1
ab 1
I refereed This post but not getting exact idea for pair latter count.
First I remove the spaces and special characters as you specified by simply going though the string and checking whether the current character is a letter:
private static string GetLetters(string s)
{
string newString = "";
foreach (var item in s)
{
if (char.IsLetter(item))
{
newString += item;
}
}
return newString;
}
Than I wrote a method which checks if the next letter is in alphabetical order using simple logic. I lower the character's case and check if the current character's ASCII code + 1 is equal to the next one's. If it is, of course they are the same:
private static string[] GetLetterPairsInAlphabeticalOrder(string s)
{
List<string> pairs = new List<string>();
for (int i = 0; i < s.Length - 1; i++)
{
if (char.ToLower(s[i]) + 1 == char.ToLower(s[i + 1]))
{
pairs.Add(s[i].ToString() + s[i+1].ToString());
}
}
return pairs.ToArray();
}
Here is how the main method will look like:
static void Main()
{
string s = "This is a absolutely easy.";
s = GetLetters(s);
string[] pairOfLetters = GetLetterPairsInAlphabeticalOrder(s);
foreach (var item in arr)
{
Console.WriteLine(item);
}
}
First, I would normalize the string to reduce confusion from special characters like this:
string str = "This is a absolutely easy.";
Regex rgx = new Regex("[^a-zA-Z]");
str = rgx.Replace(str, "");
str = str.ToLower();
Then, I would loop over all of the characters in the string and see if their neighbor is the next letter in the alphabet.
Dictionary<string, int> counts = new Dictionary<string, int>();
for (int i = 0; i < str.Length - 1; i++)
{
if (str[i+1] == (char)(str[i]+1))
{
string index = "" + str[i] + str[i+1];
if (!counts.ContainsKey(index))
counts.Add(index, 0);
counts[index]++;
}
}
Printing the counts from there is pretty straightforward.
foreach (string s in counts.Keys)
{
Console.WriteLine(s + " " + counts[s]);
}

Categories