Find and count latter pair from a string by alphabetical order

Find and count latter pair from a string by alphabetical order - c#

I am working on a small project which is in C# where I want to find and count the latter pairs which comes in alphabetical order by ignoring spaces and special characters.
e.g.
This is a absolutely easy.
Here my output should be
hi 1
ab 1
I refereed This post but not getting exact idea for pair latter count.

First I remove the spaces and special characters as you specified by simply going though the string and checking whether the current character is a letter:
private static string GetLetters(string s)
{
string newString = "";
foreach (var item in s)
{
if (char.IsLetter(item))
{
newString += item;
}
}
return newString;
}
Than I wrote a method which checks if the next letter is in alphabetical order using simple logic. I lower the character's case and check if the current character's ASCII code + 1 is equal to the next one's. If it is, of course they are the same:
private static string[] GetLetterPairsInAlphabeticalOrder(string s)
{
List<string> pairs = new List<string>();
for (int i = 0; i < s.Length - 1; i++)
{
if (char.ToLower(s[i]) + 1 == char.ToLower(s[i + 1]))
{
pairs.Add(s[i].ToString() + s[i+1].ToString());
}
}
return pairs.ToArray();
}
Here is how the main method will look like:
static void Main()
{
string s = "This is a absolutely easy.";
s = GetLetters(s);
string[] pairOfLetters = GetLetterPairsInAlphabeticalOrder(s);
foreach (var item in arr)
{
Console.WriteLine(item);
}
}

First, I would normalize the string to reduce confusion from special characters like this:
string str = "This is a absolutely easy.";
Regex rgx = new Regex("[^a-zA-Z]");
str = rgx.Replace(str, "");
str = str.ToLower();
Then, I would loop over all of the characters in the string and see if their neighbor is the next letter in the alphabet.
Dictionary<string, int> counts = new Dictionary<string, int>();
for (int i = 0; i < str.Length - 1; i++)
{
if (str[i+1] == (char)(str[i]+1))
{
string index = "" + str[i] + str[i+1];
if (!counts.ContainsKey(index))
counts.Add(index, 0);
counts[index]++;
}
}
Printing the counts from there is pretty straightforward.
foreach (string s in counts.Keys)
{
Console.WriteLine(s + " " + counts[s]);
}

Related

Get Occurrences of Letters in A String

I am trying to count occurrences of letters in a string and almost got the result using the below code snippet:
public static void GetNoofLetters()
{
string str = "AAAAABBCCCDDDD";
int count = 1;
char[] charVal = str.ToCharArray();
List<string> charCnt = new List<string>();
string concat = "";
//Getting each letters using foreach loop
foreach (var ch in charVal)
{
int index = charCnt.FindIndex(c => c.Contains(ch.ToString())); //Checks if there's any existing letter in the list
if(index >= 0) //If letter exists, then count and replace the last value
{
count++;
charCnt[charCnt.Count - 1] = count.ToString() + ch.ToString();
}
else
{
charCnt.Add(ch.ToString()); //If no matching letter exists, then add it to the list initially
count = 1;
}
}
foreach (var item in charCnt)
{
concat += item;
}
Console.WriteLine(concat.Trim());
}
The code works for the given input sample and returns output as: 5A2B3C4D. Simple is that.
But say I've the following input: Second input sample
string str = "AAAAABBCCCDDDDAA";
Expected output:
5A2B3C4D2A
With the above code that I've returns the output as follows:
5A2B3C6A
The above actually occurred for the below code snippet:
if(index >= 0) //If letter found, then count and replace the last value
{
count++;
charCnt[charCnt.Count - 1] = count.ToString() + ch.ToString();
}
Is there any better idea that I can resolve to get the expected output for the second input sample? I can understand, am close enough and may be missing something that's simple enough.
Code sample: Count Occurrences of Letters

Why don't we just loop over value and count? We can have two possibilities:
When character c doesn't equal to current (we have the different character) we should write down the previous sequence and start a new one
Otherwise, add 1 to count
Code:
private static string Compress(string value) {
if (string.IsNullOrEmpty(value))
return value;
char current = '\0';
int count = 0;
StringBuilder result = new StringBuilder(2 * value.Length);
foreach (char c in value) {
if (count != 0 && c != current) {
result.Append(count);
result.Append(current);
count = 0;
}
current = c;
count += 1;
}
result.Append(count);
result.Append(current);
return result.ToString();
}
Please, fiddle yourself

Well, I ended with the following code sample:
public static void Main()
{
string str = "AAAAABBCCCDDDDAABBBBAABB";
int count = 1;
char[] charVal = str.ToCharArray();
List<string> charCnt = new List<string>();
charCnt.Add("");
string concat = "";
//Getting each letters using foreach loop
foreach (var ch in charVal)
{
var lastItem = charCnt.LastOrDefault();
if (lastItem.EndsWith((ch.ToString()))) //If letter exists, then count and replace the last value
{
count++;
charCnt[charCnt.Count - 1] = count.ToString() + ch.ToString();
}
else
{
charCnt.Add(ch.ToString()); //If no matching letter exists, then add it to the list initially
count = 1;
}
}
foreach (var item in charCnt)
{
concat += item; //Concatenate items from the list
}
Console.WriteLine(concat.Trim());
}
Here's a woking sample: Get Occurrences of Letters in A String

How to stop String.Concat(); from removing whitespaces? [duplicate]

I would like to split a string with delimiters but keep the delimiters in the result.
How would I do this in C#?

If the split chars were ,, ., and ;, I'd try:
using System.Text.RegularExpressions;
...
string[] parts = Regex.Split(originalString, #"(?<=[.,;])")
(?<=PATTERN) is positive look-behind for PATTERN. It should match at any place where the preceding text fits PATTERN so there should be a match (and a split) after each occurrence of any of the characters.

If you want the delimiter to be its "own split", you can use Regex.Split e.g.:
string input = "plum-pear";
string pattern = "(-)";
string[] substrings = Regex.Split(input, pattern); // Split on hyphens
foreach (string match in substrings)
{
Console.WriteLine("'{0}'", match);
}
// The method writes the following to the console:
// 'plum'
// '-'
// 'pear'
So if you are looking for splitting a mathematical formula, you can use the following Regex
#"([*()\^\/]|(?<!E)[\+\-])"
This will ensure you can also use constants like 1E-02 and avoid having them split into 1E, - and 02
So:
Regex.Split("10E-02*x+sin(x)^2", #"([*()\^\/]|(?<!E)[\+\-])")
Yields:
10E-02
*
x
+
sin
(
x
)
^
2

Building off from BFree's answer, I had the same goal, but I wanted to split on an array of characters similar to the original Split method, and I also have multiple splits per string:
public static IEnumerable<string> SplitAndKeep(this string s, char[] delims)
{
int start = 0, index;
while ((index = s.IndexOfAny(delims, start)) != -1)
{
if(index-start > 0)
yield return s.Substring(start, index - start);
yield return s.Substring(index, 1);
start = index + 1;
}
if (start < s.Length)
{
yield return s.Substring(start);
}
}

Just in case anyone wants this answer aswell...
Instead of string[] parts = Regex.Split(originalString, #"(?<=[.,;])") you could use string[] parts = Regex.Split(originalString, #"(?=yourmatch)") where yourmatch is whatever your separator is.
Supposing the original string was
777- cat
777 - dog
777 - mouse
777 - rat
777 - wolf
Regex.Split(originalString, #"(?=777)") would return
777 - cat
777 - dog
and so on

This version does not use LINQ or Regex and so it's probably relatively efficient. I think it might be easier to use than the Regex because you don't have to worry about escaping special delimiters. It returns an IList<string> which is more efficient than always converting to an array. It's an extension method, which is convenient. You can pass in the delimiters as either an array or as multiple parameters.
/// <summary>
/// Splits the given string into a list of substrings, while outputting the splitting
/// delimiters (each in its own string) as well. It's just like String.Split() except
/// the delimiters are preserved. No empty strings are output.</summary>
/// <param name="s">String to parse. Can be null or empty.</param>
/// <param name="delimiters">The delimiting characters. Can be an empty array.</param>
/// <returns></returns>
public static IList<string> SplitAndKeepDelimiters(this string s, params char[] delimiters)
{
var parts = new List<string>();
if (!string.IsNullOrEmpty(s))
{
int iFirst = 0;
do
{
int iLast = s.IndexOfAny(delimiters, iFirst);
if (iLast >= 0)
{
if (iLast > iFirst)
parts.Add(s.Substring(iFirst, iLast - iFirst)); //part before the delimiter
parts.Add(new string(s[iLast], 1));//the delimiter
iFirst = iLast + 1;
continue;
}
//No delimiters were found, but at least one character remains. Add the rest and stop.
parts.Add(s.Substring(iFirst, s.Length - iFirst));
break;
} while (iFirst < s.Length);
}
return parts;
}
Some unit tests:
text = "[a link|http://www.google.com]";
result = text.SplitAndKeepDelimiters('[', '|', ']');
Assert.IsTrue(result.Count == 5);
Assert.AreEqual(result[0], "[");
Assert.AreEqual(result[1], "a link");
Assert.AreEqual(result[2], "|");
Assert.AreEqual(result[3], "http://www.google.com");
Assert.AreEqual(result[4], "]");

A lot of answers to this! One I knocked up to split by various strings (the original answer caters for just characters i.e. length of 1). This hasn't been fully tested.
public static IEnumerable<string> SplitAndKeep(string s, params string[] delims)
{
var rows = new List<string>() { s };
foreach (string delim in delims)//delimiter counter
{
for (int i = 0; i < rows.Count; i++)//row counter
{
int index = rows[i].IndexOf(delim);
if (index > -1
&& rows[i].Length > index + 1)
{
string leftPart = rows[i].Substring(0, index + delim.Length);
string rightPart = rows[i].Substring(index + delim.Length);
rows[i] = leftPart;
rows.Insert(i + 1, rightPart);
}
}
}
return rows;
}

This seems to work, but its not been tested much.
public static string[] SplitAndKeepSeparators(string value, char[] separators, StringSplitOptions splitOptions)
{
List<string> splitValues = new List<string>();
int itemStart = 0;
for (int pos = 0; pos < value.Length; pos++)
{
for (int sepIndex = 0; sepIndex < separators.Length; sepIndex++)
{
if (separators[sepIndex] == value[pos])
{
// add the section of string before the separator
// (unless its empty and we are discarding empty sections)
if (itemStart != pos || splitOptions == StringSplitOptions.None)
{
splitValues.Add(value.Substring(itemStart, pos - itemStart));
}
itemStart = pos + 1;
// add the separator
splitValues.Add(separators[sepIndex].ToString());
break;
}
}
}
// add anything after the final separator
// (unless its empty and we are discarding empty sections)
if (itemStart != value.Length || splitOptions == StringSplitOptions.None)
{
splitValues.Add(value.Substring(itemStart, value.Length - itemStart));
}
return splitValues.ToArray();
}

Recently I wrote an extension method do to this:
public static class StringExtensions
{
public static IEnumerable<string> SplitAndKeep(this string s, string seperator)
{
string[] obj = s.Split(new string[] { seperator }, StringSplitOptions.None);
for (int i = 0; i < obj.Length; i++)
{
string result = i == obj.Length - 1 ? obj[i] : obj[i] + seperator;
yield return result;
}
}
}

I'd say the easiest way to accomplish this (except for the argument Hans Kesting brought up) is to split the string the regular way, then iterate over the array and add the delimiter to every element but the last.

To avoid adding character to new line try this :
string[] substrings = Regex.Split(input,#"(?<=[-])");

result = originalString.Split(separator);
for(int i = 0; i < result.Length - 1; i++)
result[i] += separator;
(EDIT - this is a bad answer - I misread his question and didn't see that he was splitting by multiple characters.)
(EDIT - a correct LINQ version is awkward, since the separator shouldn't get concatenated onto the final string in the split array.)

Iterate through the string character by character (which is what regex does anyway.
When you find a splitter, then spin off a substring.
pseudo code
int hold, counter;
List<String> afterSplit;
string toSplit
for(hold = 0, counter = 0; counter < toSplit.Length; counter++)
{
if(toSplit[counter] = /*split charaters*/)
{
afterSplit.Add(toSplit.Substring(hold, counter));
hold = counter;
}
}
That's sort of C# but not really. Obviously, choose the appropriate function names.
Also, I think there might be an off-by-1 error in there.
But that will do what you're asking.

veggerby's answer modified to
have no string items in the list
have fixed string as delimiter like "ab" instead of single character
var delimiter = "ab";
var text = "ab33ab9ab"
var parts = Regex.Split(text, $#"({Regex.Escape(delimiter)})")
.Where(p => p != string.Empty)
.ToList();
// parts = "ab", "33", "ab", "9", "ab"
The Regex.Escape() is there just in case your delimiter contains characters which regex interprets as special pattern commands (like *, () and thus have to be escaped.

using System.Collections.Generic;
using System.Text.RegularExpressions;
namespace ConsoleApplication9
{
class Program
{
static void Main(string[] args)
{
string input = #"This;is:a.test";
char sep0 = ';', sep1 = ':', sep2 = '.';
string pattern = string.Format("[{0}{1}{2}]|[^{0}{1}{2}]+", sep0, sep1, sep2);
Regex regex = new Regex(pattern);
MatchCollection matches = regex.Matches(input);
List<string> parts=new List<string>();
foreach (Match match in matches)
{
parts.Add(match.ToString());
}
}
}
}

I wanted to do a multiline string like this but needed to keep the line breaks so I did this
string x =
#"line 1 {0}
line 2 {1}
";
foreach(var line in string.Format(x, "one", "two")
.Split("\n")
.Select(x => x.Contains('\r') ? x + '\n' : x)
.AsEnumerable()
) {
Console.Write(line);
}
yields
line 1 one
line 2 two

I came across same problem but with multiple delimiters. Here's my solution:
public static string[] SplitLeft(this string #this, char[] delimiters, int count)
{
var splits = new List<string>();
int next = -1;
while (splits.Count + 1 < count && (next = #this.IndexOfAny(delimiters, next + 1)) >= 0)
{
splits.Add(#this.Substring(0, next));
#this = new string(#this.Skip(next).ToArray());
}
splits.Add(#this);
return splits.ToArray();
}
Sample with separating CamelCase variable names:
var variableSplit = variableName.SplitLeft(
Enumerable.Range('A', 26).Select(i => (char)i).ToArray());

I wrote this code to split and keep delimiters:
private static string[] SplitKeepDelimiters(string toSplit, char[] delimiters, StringSplitOptions splitOptions = StringSplitOptions.None)
{
var tokens = new List<string>();
int idx = 0;
for (int i = 0; i < toSplit.Length; ++i)
{
if (delimiters.Contains(toSplit[i]))
{
tokens.Add(toSplit.Substring(idx, i - idx)); // token found
tokens.Add(toSplit[i].ToString()); // delimiter
idx = i + 1; // start idx for the next token
}
}
// last token
tokens.Add(toSplit.Substring(idx));
if (splitOptions == StringSplitOptions.RemoveEmptyEntries)
{
tokens = tokens.Where(token => token.Length > 0).ToList();
}
return tokens.ToArray();
}
Usage example:
string toSplit = "AAA,BBB,CCC;DD;,EE,";
char[] delimiters = new char[] {',', ';'};
string[] tokens = SplitKeepDelimiters(toSplit, delimiters, StringSplitOptions.RemoveEmptyEntries);
foreach (var token in tokens)
{
Console.WriteLine(token);
}

How to reverse an array of strings without changing the position of special characters in C#

I'm working on reversing a sentence. I'm able to do it. But I'm not sure, how to reverse the word without changing the special characters positions. I'm using regex but as soon as it finds the special characters it's stopping the reversal of the word.
Following is the code:
Console.WriteLine("Enter:");
string w = Console.ReadLine();
string rw = String.Empty;
String[] arr = w.Split(' ');
var regexItem = new Regex("^[a-zA-Z0-9]*$");
StringBuilder appendString = new StringBuilder();
for (int i = 0; i < arr.Length; i++)
{
char[] chararray = arr[i].ToCharArray();
for (int j = chararray.Length - 1; j >= 0; j--)
{
if (regexItem.IsMatch(rw))
{
rw = appendString.Append(chararray[j]).ToString();
}
}
sb.Append(' ');
}
Console.WriteLine(rw);
Console.ReadLine();
Example : Input
Marshall! Hello.
Expected output
llahsram! olleh.

A basic solution with regex and LINQ. Try it online.
public static void Main()
{
Console.WriteLine("Marshall! Hello.");
Console.WriteLine(Reverse("Marshall! Hello."));
}
public static string Reverse(string source)
{
// we split by groups to keep delimiters
var parts = Regex.Split(source, #"([^a-zA-Z0-9])");
// if we got a group of valid characters
var results = parts.Select(x => x.All(char.IsLetterOrDigit)
// we reverse it
? new string(x.Reverse().ToArray())
// or we keep the delimiters as it
: x);
// then we concat all of them
return string.Concat(results);
}
The same solution without LINQ. Try it online.
public static void Main()
{
Console.WriteLine("Marshall! Hello.");
Console.WriteLine(Reverse("Marshall! Hello."));
}
public static bool IsLettersOrDigits(string s)
{
foreach (var c in s)
{
if (!char.IsLetterOrDigit(c))
{
return false;
}
}
return true;
}
public static string Reverse(char[] s)
{
Array.Reverse(s);
return new string(s);
}
public static string Reverse(string source)
{
var parts = Regex.Split(source, #"([^a-zA-Z0-9])");
var results = new List<string>();
foreach(var x in parts)
{
results.Add(IsLettersOrDigits(x)
? Reverse(x.ToCharArray())
: x);
}
return string.Concat(results);
}

This is a solution without LINQ. I wasn't sure about what are considered special characters.
string sentence = "Marshall! Hello.";
List<string> words = sentence.Split(' ').ToList();
List<string> reversedWords = new List<string>();
foreach (string word in words)
{
char[] arr = new char[word.Length];
for( int i=0; i<word.Length; i++)
{
if(!Char.IsLetterOrDigit((word[i])))
{
for ( int x=0; x< i; x++)
{
arr[x] = arr[x + 1];
}
arr[i] = word[i];
}
else
{
arr[word.Length - 1 - i] = word[i];
}
}
reversedWords.Add(new string(arr));
}
string reversedSentence = string.Join(" ", reversedWords);
Console.WriteLine(reversedSentence);
And this is the output:
Updated Output = llahsraM! olleH.

Here is a non-regex version that does what you want:
var sentence = "Hello, john!";
var parts = sentence.Split(' ');
var reversed = new StringBuilder();
var charPositions = sentence.Select((c, idx) => new { Char = c, Index = idx })
.Where(_ => !char.IsLetterOrDigit(_.Char));
for (int i = 0; i < parts.Length; i++)
{
var chars = parts[i].ToCharArray();
for (int j = chars.Length - 1; j >= 0; j--)
{
if (char.IsLetterOrDigit(chars[j]))
{
reversed.Append(chars[j]);
}
}
}
foreach (var ch in charPositions)
{
reversed.Insert(ch.Index, ch.Char);
}
// olleH, nhoj!
Console.WriteLine(reversed.ToString());
Basically the trick is to remember the position of special (i.e. non letter or digit) characters and insert them at the end to those positions.

This solution is without LINQ and Regex. It may not be an efficient answer but working properly for small string values.
// This will reverse the string and special characters will just stay there.
public string ReverseString(string rString)
{
StringBuilder ss = new StringBuilder(rString);
int y = 0;
// The idea is to swap values. Like swapping first value with last one. It will keep swapping unless it reaches at the middle of the string where no swapping will be needed.
// This first loop is to detect first values.
for(int i=rString.Length-1;i>=0;i--)
{
// This condition is to check if the values is String or not. If it is not string then it is considered as special character which will just stay there at same old position.
if(Char.IsLetter(Convert.ToChar(rString.Substring(i,1))))
{
// This is second loop which is starting from end to swap values from end with first.
for (int k = y; k < rString.Length; k++)
{
// Again checking last values if values are string or not.
if (Char.IsLetter(Convert.ToChar(rString.Substring(k, 1))))
{
// This is swapping. So st1 is First value in that string
// st2 is the last item in that string
char st1 = Convert.ToChar(rString.Substring(k, 1));
char st2 = Convert.ToChar(rString.Substring(i, 1));
//This is swapping. So last item will go to first position and first item will go to last position, To make sure string is reversed.
// Remember when the string value is Special Character, swapping will move forward without swapping.
ss[rString.IndexOf(rString.Substring(i, 1))] = st1;
ss[rString.IndexOf(rString.Substring(k, 1))] = st2;
y++;
// When the swapping is done for first 2 items. The loop will stop to change the values.
break;
}
else
{
// This is just increment if value was Special character.
y++;
}
}
}
}
return ss.ToString();
}
Thanks!

C# How to generate a new string based on multiple ranged index

Let's say I have a string like this one, left part is a word, right part is a collection of indices (single or range) used to reference furigana (phonetics) for kanjis in my word:
string myString = "子で子にならぬ時鳥,0:こ;2:こ;7-8:ほととぎす"
The pattern in detail:
word,<startIndex>(-<endIndex>):<furigana>
What would be the best way to achieve something like this (with a space in front of the kanji to mark which part is linked to the [furigana]):
子[こ]で 子[こ]にならぬ 時鳥[ほととぎす]
Edit: (thanks for your comments guys)
Here is what I wrote so far:
static void Main(string[] args)
{
string myString = "ABCDEF,1:test;3:test2";
//Split Kanjis / Indices
string[] tokens = myString.Split(',');
//Extract furigana indices
string[] indices = tokens[1].Split(';');
//Dictionnary to store furigana indices
Dictionary<string, string> furiganaIndices = new Dictionary<string, string>();
//Collect
foreach (string index in indices)
{
string[] splitIndex = index.Split(':');
furiganaIndices.Add(splitIndex[0], splitIndex[1]);
}
//Processing
string result = tokens[0] + ",";
for (int i = 0; i < tokens[0].Length; i++)
{
string currentIndex = i.ToString();
if (furiganaIndices.ContainsKey(currentIndex)) //add [furigana]
{
string currentFurigana = furiganaIndices[currentIndex].ToString();
result = result + " " + tokens[0].ElementAt(i) + string.Format("[{0}]", currentFurigana);
}
else //nothing to add
{
result = result + tokens[0].ElementAt(i);
}
}
File.AppendAllText(#"D:\test.txt", result + Environment.NewLine);
}
Result:
ABCDEF,A B[test]C D[test2]EF
I struggle to find a way to process ranged indices:
string myString = "ABCDEF,1:test;2-3:test2";
Result : ABCDEF,A B[test] CD[test2]EF

I don't have anything against manually manipulating strings per se. But given that you seem to have a regular pattern describing the inputs, it seems to me that a solution that uses regex would be more maintainable and readable. So with that in mind, here's an example program that takes that approach:
class Program
{
private const string _kinvalidFormatException = "Invalid format for edit specification";
private static readonly Regex
regex1 = new Regex(#"(?<word>[^,]+),(?<edit>(?:\d+)(?:-(?:\d+))?:(?:[^;]+);?)+", RegexOptions.Compiled),
regex2 = new Regex(#"(?<start>\d+)(?:-(?<end>\d+))?:(?<furigana>[^;]+);?", RegexOptions.Compiled);
static void Main(string[] args)
{
string myString = "子で子にならぬ時鳥,0:こ;2:こ;7-8:ほととぎす";
string result = EditString(myString);
}
private static string EditString(string myString)
{
Match editsMatch = regex1.Match(myString);
if (!editsMatch.Success)
{
throw new ArgumentException(_kinvalidFormatException);
}
int ichCur = 0;
string input = editsMatch.Groups["word"].Value;
StringBuilder text = new StringBuilder();
foreach (Capture capture in editsMatch.Groups["edit"].Captures)
{
Match oneEditMatch = regex2.Match(capture.Value);
if (!oneEditMatch.Success)
{
throw new ArgumentException(_kinvalidFormatException);
}
int start, end;
if (!int.TryParse(oneEditMatch.Groups["start"].Value, out start))
{
throw new ArgumentException(_kinvalidFormatException);
}
Group endGroup = oneEditMatch.Groups["end"];
if (endGroup.Success)
{
if (!int.TryParse(endGroup.Value, out end))
{
throw new ArgumentException(_kinvalidFormatException);
}
}
else
{
end = start;
}
text.Append(input.Substring(ichCur, start - ichCur));
if (text.Length > 0)
{
text.Append(' ');
}
ichCur = end + 1;
text.Append(input.Substring(start, ichCur - start));
text.Append(string.Format("[{0}]", oneEditMatch.Groups["furigana"]));
}
if (ichCur < input.Length)
{
text.Append(input.Substring(ichCur));
}
return text.ToString();
}
}
Notes:
This implementation assumes that the edit specifications will be listed in order and won't overlap. It makes no attempt to validate that part of the input; depending on where you are getting your input from you may want to add that. If it's valid for the specifications to be listed out of order, you can also extend the above to first store the edits in a list and sort the list by the start index before actually editing the string. (In similar fashion to the way the other proposed answer works; though, why they are using a dictionary instead of a simple list to store the individual edits, I have no idea…that seems arbitrarily complicated to me.)
I included basic input validation, throwing exceptions where failures occur in the pattern matching. A more user-friendly implementation would add more specific information to each exception, describing what part of the input actually was invalid.
The Regex class actually has a Replace() method, which allows for complete customization. The above could have been implemented that way, using Replace() and a MatchEvaluator to provide the replacement text, instead of just appending text to a StringBuilder. Which way to do it is mostly a matter of preference, though the MatchEvaluator might be preferred if you have a need for more flexible implementation options (i.e. if the exact format of the result can vary).
If you do choose to use the other proposed answer, I strongly recommend you use StringBuilder instead of simply concatenating onto the results variable. For short strings it won't matter much, but you should get into the habit of always using StringBuilder when you have a loop that is incrementally adding onto a string value, because for long string the performance implications of using concatenation can be very negative.

This should do it (and even handle ranged indices), based on the formatting of the input string you have-
using System;
using System.Collections.Generic;
public class stringParser
{
private struct IndexElements
{
public int start;
public int end;
public string value;
}
public static void Main()
{
//input string
string myString = "子で子にならぬ時鳥,0:こ;2:こ;7-8:ほととぎす";
int wordIndexSplit = myString.IndexOf(',');
string word = myString.Substring(0,wordIndexSplit);
string indices = myString.Substring(wordIndexSplit + 1);
string[] eachIndex = indices.Split(';');
Dictionary<int,IndexElements> index = new Dictionary<int,IndexElements>();
string[] elements;
IndexElements e;
int dash;
int n = 0;
int last = -1;
string results = "";
foreach (string s in eachIndex)
{
e = new IndexElements();
elements = s.Split(':');
if (elements[0].Contains("-"))
{
dash = elements[0].IndexOf('-');
e.start = int.Parse(elements[0].Substring(0,dash));
e.end = int.Parse(elements[0].Substring(dash + 1));
}
else
{
e.start = int.Parse(elements[0]);
e.end = e.start;
}
e.value = elements[1];
index.Add(n,e);
n++;
}
//this is the part that takes the "setup" from the parts above and forms the result string
//loop through each of the "indices" parsed above
for (int i = 0; i < index.Count; i++)
{
//if this is the first iteration through the loop, and the first "index" does not start
//at position 0, add the beginning characters before its start
if (last == -1 && index[i].start > 0)
{
results += word.Substring(0,index[i].start);
}
//if this is not the first iteration through the loop, and the previous iteration did
//not stop at the position directly before the start of the current iteration, add
//the intermediary chracters
else if (last != -1 && last + 1 != index[i].start)
{
results += word.Substring(last + 1,index[i].start - (last + 1));
}
//add the space before the "index" match, the actual match, and then the formatted "index"
results += " " + word.Substring(index[i].start,(index[i].end - index[i].start) + 1)
+ "[" + index[i].value + "]";
//remember the position of the ending for the next iteration
last = index[i].end;
}
//if the last "index" did not stop at the end of the input string, add the remaining characters
if (index[index.Keys.Count - 1].end + 1 < word.Length)
{
results += word.Substring(index[index.Keys.Count-1].end + 1);
}
//trimming spaces that may be left behind
results = results.Trim();
Console.WriteLine("INPUT - " + myString);
Console.WriteLine("OUTPUT - " + results);
Console.Read();
}
}
input - 子で子にならぬ時鳥,0:こ;2:こ;7-8:ほととぎす
output - 子[こ]で 子[こ]にならぬ 時鳥[ほととぎす]
Note that this should also work with characters the English alphabet if you wanted to use English instead-
input - iliketocodeverymuch,2:A;4-6:B;9-12:CDEFG
output - il i[A]k eto[B]co deve[CDEFG]rymuch

How to split long strings into fixed array items

I have a dictionary object IDictionary<string, string> which only has the following items:
Item1, Items2 and Item3. Each item only has a maximum length of 50 characters.
I then have a list of words List<string>. I need a loop that will go through the words and add them to the dictionary starting at Item1, but before adding it to the dictionary the length needs to be checked. If the new item and the current item's length added together is greater than 50 characters then the word needs to move down to the next line (in this case Item2).
What is the best way to do this?

I'm not sure why this question got voted down so much, but perhaps the reason is you have a very clear algorithm already, so it should be trivial to get the C# code. As it stands, either you're either really inexperienced or really lazy. I'm going to assume the former.
Anyways, lets walk through the requirements.
1) "I then have a list of words List." you have this line in some form already.
List<string> words = GetListOfWords();
2) "go through the words and add them to the dictionary starting at Item1" -- I'd recommend a List instead of a dictionary, if you're going for a sequence of strings. Also, you'll need a temporary variable to store the contents of the current line, because you're really after adding a full line at a time.
var lines = new List<string>();
string currentLine = "";
3) "I need a loop that will go through the words"
foreach(var word in words) {
4) " If the new item and the current item's length added together is greater than 50 characters" -- +1 for the space.
if (currentLine.Length + word.Length + 1 > 50) {
5) "then the word needs to move down to the next line "
lines.Add(currentLine);
currentLine = word;
}
6) " go through the words and add them to the dictionary starting at Item1 " -- you didn't phrase this very clearly. What you meant was you want to join each word to the last line unless it would make the line exceed 50 characters.
else {
currentLine += " " + word;
}
}
lines.Add(currentLine); // the last unfinished line
and there you go.
If you absolutely need it as an IDictionary with 3 lines, just do
var dict = new Dictionary<string,string>();
for(int lineNum = 0; lineNum < 3; lineNum ++)
dict["Address"+lineNum] = lineNume < lines.Length ? lines[lineNum] : "";

I currently have this and was wondering if there was perhaps a better solution:
public IDictionary AddWordsToDictionary(IList words)
{
IDictionary addressParts = new Dictionary();
addressParts.Add("Address1", string.Empty);
addressParts.Add("Address2", string.Empty);
addressParts.Add("Address3", string.Empty);
int currentIndex = 1;
foreach (string word in words)
{
if (!string.IsNullOrEmpty(word))
{
string key = string.Concat("Address", currentIndex);
int space = 0;
string spaceChar = string.Empty;
if (!string.IsNullOrEmpty(addressParts[key]))
{
space = 1;
spaceChar = " ";
}
if (word.Length + space + addressParts[key].Length > MaxAddressLineLength)
{
currentIndex++;
key = string.Concat("Address", currentIndex);
space = 0;
spaceChar = string.Empty;
if (currentIndex > addressParts.Count)
{
break; // To big for all 3 elements so discard
}
}
addressParts[key] = string.Concat(addressParts[key], spaceChar, word);
}
}
return addressParts;
}

I would just do the following and do what you like with the resulting "lines"
static List<string> BuildLines(List<string> words, int lineLen)
{
List<string> lines = new List<string>();
string line = string.Empty;
foreach (string word in words)
{
if (string.IsNullOrEmpty(word)) continue;
if (line.Length + word.Length + 1 <= lineLen)
{
line += " " + word;
}
else
{
lines.Add(line);
line = word;
}
}
lines.Add(line);
return lines;
}

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

Find and count latter pair from a string by alphabetical order - c#

Related

Get Occurrences of Letters in A String

How to stop String.Concat(); from removing whitespaces? [duplicate]

How to reverse an array of strings without changing the position of special characters in C#

C# How to generate a new string based on multiple ranged index

How to split long strings into fixed array items

Categories

Resources