Find the longest repetition character in String - c#

Let's say i have a string like
string text = "hello dear";
Then I want to determine the longest repetition of coherented characters - in this case it would be ll. If there are more than one with the same count, take any.
I have tried to solve this with linq
char rchar = text.GroupBy(y => y).OrderByDescending(y => y.Count()).Select(x => x).First().Key;
int rcount = text.Where(x => x == rchar).Count();
string routput = new string(rchar, rcount);
But this returns ee. Am I on the right track?

Another solution using LINQ:
string text = "hello dear";
string longestRun = new string(text.Select((c, index) => text.Substring(index).TakeWhile(e => e == c))
.OrderByDescending(e => e.Count())
.First().ToArray());
Console.WriteLine(longestRun); // ll
It selects a sequence of substrings starting with the same repeating character, and creates the result string with the longest of them.

Although a regex or custom Linq extension is fine, if you don't mind "doing it the old way" you can achieve your result with two temporary variables and a classic foreach loop.
The logic is pretty simple, and runs in O(n). Loop through your string and compare the current character with the previous one.
If it's the same, increase your count. If it's different, reset it to 1.
Then, if your count is greater than the previous recorded max count, overwrite the rchar with your current character.
string text = "hello dear";
char rchar = text[0];
int rcount = 1;
int currentCount = 0;
char previousChar = char.MinValue;
foreach (char character in text)
{
if (character != previousChar)
{
currentCount = 1;
}
else
{
currentCount++;
}
if (currentCount >= rcount)
{
rchar = character;
rcount = currentCount;
}
previousChar = character;
}
string routput = new string(rchar, rcount);
It's indeed verbose, but gets the job done.

If you prefer doing things with LINQ, you can do it fairly simply with a LINQ extension:
var text = "hello dear";
var result = string.Join("",
text
.GroupAdjacentBy((l, r) => (l == r)) /* Create groups where the current character is
Is the same as the previous character */
.OrderByDescending(g => g.Count()) //Order by the group lengths
.First() //Take the first group (which is the longest run)
);
With this extension (taken from Use LINQ to group a sequence of numbers with no gaps):
public static class LinqExtensions
{
public static IEnumerable<IEnumerable<T>> GroupAdjacentBy<T>(this IEnumerable<T> source, Func<T, T, bool> predicate)
{
using (var e = source.GetEnumerator())
{
if (e.MoveNext())
{
var list = new List<T> { e.Current };
var pred = e.Current;
while (e.MoveNext())
{
if (predicate(pred, e.Current))
{
list.Add(e.Current);
}
else
{
yield return list;
list = new List<T> { e.Current };
}
pred = e.Current;
}
yield return list;
}
}
}
}
Might be a bit overkill - but I've found GroupAdjacentBy quite useful in other situations as well.

RegEx would be a option
string text = "hello dear";
string Result = string.IsNullOrEmpty(text) ? string.Empty : Regex.Matches(text, #"(.)\1*", RegexOptions.None).Cast<Match>().OrderByDescending(x => x.Length).First().Value;

Related

rearrange the characters of a string so that any two adjacent characters are not the same

How to rearrange the characters of a string so that any two adjacent characters are not the same? using c#
c#
Without using Hashmaps and Dictionary
I managed to find each element of the string, and the occurrence of each element.
This is what I've done so far
Using LINQ, you can gather the characters of the string, group them by duplicate characters, then pivot the groups and join then back into a string.
First, some extension methods to make Join easier:
public static class IEnumerableExt {
public static string Join(this IEnumerable<char> chars) => String.Concat(chars); // faster >= .Net Core 2.1
public static string Join(this IEnumerable<string> strings) => String.Concat(strings);
}
Then, an extension method to pivot an IEnumerable<IEnumerable<T>>:
public static class IEnumerableIEnumerableExt {
public static IEnumerable<IEnumerable<T>> Pivot<T>(this IEnumerable<IEnumerable<T>> src) {
var enums = src.Select(ie => ie.GetEnumerator()).ToList();
var hasMores = Enumerable.Range(0, enums.Count).Select(n => true).ToList();
for (; ; ) {
var oneGroup = new List<T>();
var hasMore = false;
for (int j1 = 0; j1 < enums.Count; ++j1) {
if (hasMores[j1]) {
hasMores[j1] = enums[j1].MoveNext();
hasMore = hasMore || hasMores[j1];
if (hasMores[j1])
oneGroup.Add(enums[j1].Current);
}
}
if (!hasMore)
break;
yield return oneGroup;
}
for (int j1 = 0; j1 < enums.Count; ++j1)
enums[j1].Dispose();
}
}
Finally, use these to solve your problem:
var s = "How to rearrange the characters of a string";
var tryAns = s.OrderBy(c => c)
.GroupBy(ch => ch)
.Pivot()
.Select(gch => gch.Join())
.Join();
var dupRE = new Regex(#"(.)\1", RegexOptions.Compiled);
var hasDups = dupRE.IsMatch(tryAns);
// tryAns will be " Hacefghinorstw aceghnorst aeort aert ar r "
// hasDups will be false
If the resulting string has two adjacent identical characters, then hasDups will be true.

How to get every possible combination base on the ranges in brackets?

Looking for the best way to take something like 1[a-C]3[1-6]07[R,E-G] and have it output a log that would look like the following — basically every possible combination base on the ranges in brackets.
1a3107R
1a3107E
1a3107F
1a3107G
1b3107R
1b3107E
1b3107F
1b3107G
1c3107R
1c3107E
1c3107F
1c3107G
all the way to 1C3607G.
Sorry for not being more technical about what I looking for, just not sure on the correct terms to explain.
Normally what we'd do to get all combinations is to put all our ranges into arrays, then use nested loops to loop through each array, and create a new item in the inner loop that gets added to our results.
But in order to do that here, we'd first need to write a method that can parse your range string and return a list of char values defined by the range. I've written a rudimentary one here, which works with your sample input but should have some validation added to ensure the input string is in the proper format:
public static List<char> GetRange(string input)
{
input = input.Replace("[", "").Replace("]", "");
var parts = input.Split(',');
var range = new List<char>();
foreach (var part in parts)
{
var ends = part.Split('-');
if (ends.Length == 1)
{
range.Add(ends[0][0]);
}
else if (char.IsDigit(ends[0][0]))
{
var start = Convert.ToInt32(ends[0][0]);
var end = Convert.ToInt32(ends[1][0]);
var count = end - start + 1;
range.AddRange(Enumerable.Range(start, count).Select(c => (char) c));
}
else
{
var start = (int) ends[0][0];
var last = (int) ends[1][0];
var end = last < start ? 'z' : last;
range.AddRange(Enumerable.Range(start, end - start + 1)
.Select(c => (char) c));
if (last < start)
{
range.AddRange(Enumerable.Range('A', last - 'A' + 1)
.Select(c => (char) c));
}
}
}
return range;
}
Now that we can get a range of values from a string like "[a-C]", we need a way to create nested loops for each range, and to build our list of values based on the input string.
One way to do this is to replace our input string with one that contains placeholders for each range, and then we can create a loop for each range, and on each iteration we can replace the placeholder for that range with a character from the range.
So we'll take an input like this: "1[a-C]3[1-6]07[R,E-G]", and turn it into this: "1{0}3{1}07{2}". Now we can create loops where we take the characters from the first range and create a new string for each one of them, replacing the {0} with the character. Then, for each one of those strings, we iterate over the second range and create a new string that replaces the {1} placeholder with a character from the second range, and so on and so on until we've created new strings for every possible combination.
public static List<string> GetCombinatins(string input)
{
// Sample input = "1[a-C]3[1-6]07[R,E-G]"
var inputWithPlaceholders = string.Empty; // This will become "1{0}3{1}07{2}"
var placeholder = 0;
var ranges = new List<List<char>>();
for (int i = 0; i < input.Length; i++)
{
// We've found a range start, so replace this with our
// placeholder '{n}' and add the range to our list of ranges
if (input[i] == '[')
{
inputWithPlaceholders += $"{{{placeholder++}}}";
var rangeEndIndex = input.IndexOf("]", i);
ranges.Add(GetRange(input.Substring(i, rangeEndIndex - i)));
i = rangeEndIndex;
}
else
{
inputWithPlaceholders += input[i];
}
}
if (ranges.Count == 0) return new List<string> {input};
// Add strings for the first range
var values = ranges.First().Select(chr =>
inputWithPlaceholders.Replace("{0}", chr.ToString())).ToList();
// Then continually add all combinations of other ranges
for (int i = 1; i < ranges.Count; i++)
{
values = values.SelectMany(value =>
ranges[i].Select(chr =>
value.Replace($"{{{i}}}", chr.ToString()))).ToList();
}
return values;
}
Now with these methods out of the way, we can create output of all our ranges quite easily:
static void Main()
{
Console.WriteLine(string.Join(", ", GetCombinatins("1[a-C]3[1-6]07[R,E-G]")));
GetKeyFromUser("\nPress any key to exit...");
}
Output
I would approach this problem in three stages. The first stage is to transform the source string to an IEnumerable of IEnumerable<string>.
static IEnumerable<IEnumerable<string>> ParseSourceToEnumerables(string source);
For example the source "1[A-C]3[1-6]07[R,E-G]" should be transformed to the 6 enumerables below:
"1"
"A", "B", "C"
"3"
"1", "2", "3", "4", "5", "6"
"07"
"R", "E", "F", "G"
Each literal inside the source has been transformed to an IEnumerable<string> containing a single string.
The second stage would be to create the Cartesian product of these enumerables.
static IEnumerable<IEnumerable<T>> CartesianProduct<T>(
IEnumerable<IEnumerable<T>> sequences)
The final (and easiest) stage would be to concatenate each one of the inner IEnumerable<string> of the Cartesian product to a single string. For example
the sequence "1", "A", "3", "1", "07", "R" to the string "1A3107R"
The hardest stage is the first one, because it involves parsing. Below is a partial implementation:
static IEnumerable<IEnumerable<string>> ParseSourceToEnumerables(string source)
{
var matches = Regex.Matches(source, #"\[(.*?)\]", RegexOptions.Singleline);
int previousIndex = 0;
foreach (Match match in matches)
{
var previousLiteral = source.Substring(
previousIndex, match.Index - previousIndex);
if (previousLiteral.Length > 0)
yield return Enumerable.Repeat(previousLiteral, 1);
yield return SinglePatternToEnumerable(match.Groups[1].Value);
previousIndex = match.Index + match.Length;
}
var lastLiteral = source.Substring(previousIndex, source.Length - previousIndex);
if (lastLiteral.Length > 0) yield return Enumerable.Repeat(lastLiteral, 1);
}
static IEnumerable<string> SinglePatternToEnumerable(string pattern)
{
// TODO
// Should transform the pattern "X,A-C,YZ"
// to the sequence ["X", "A", "B", "C", "YZ"]
}
The second stage is hard too, but solved. I just grabbed the implementation from Eric Lippert's blog.
static IEnumerable<IEnumerable<T>> CartesianProduct<T>(
IEnumerable<IEnumerable<T>> sequences)
{
IEnumerable<IEnumerable<T>> emptyProduct = new[] { Enumerable.Empty<T>() };
return sequences.Aggregate(
emptyProduct,
(accumulator, sequence) =>
accumulator.SelectMany(_ => sequence,
(accseq, item) => accseq.Append(item)) // .NET Framework 4.7.1
);
}
The final stage is just a call to String.Join.
var source = "1[A-C]3[1-6]07[R,E-G]";
var enumerables = ParseSourceToEnumerables(source);
var combinations = CartesianProduct(enumerables);
foreach (var combination in combinations)
{
Console.WriteLine($"Combination: {String.Join("", combination)}");
}

Check if string contains characters in certain order in C#r

I have a code that's working right now, but it doesn't check if the characters are in order, it only checks if they're there. How can I modify my code so the the characters 'gaoaf' are checked in that order in the string?
Console.WriteLine("5.feladat");
StreamWriter sw = new StreamWriter("keres.txt");
sw.WriteLine("gaoaf");
string s = "";
for (int i = 0; i < n; i++)
{
s = zadatok[i].nev+zadatok[i].cim;
if (s.Contains("g") && s.Contains("a") && s.Contains("o") && s.Contains("a") && s.Contains("f") )
{
sw.WriteLine(i);
sw.WriteLine(zadatok[i].nev + zadatok[i].cim);
}
}
sw.Close();
You can convert the letters into a pattern and use Regex:
var letters = "gaoaf";
var pattern = String.Join(".*",letters.AsEnumerable());
var hasletters = Regex.IsMatch(s, pattern, RegexOptions.IgnoreCase);
For those that needlessly avoid .*, you can also solve this with LINQ:
var ans = letters.Aggregate(0, (p, c) => p >= 0 ? s.IndexOf(c.ToString(), p, StringComparison.InvariantCultureIgnoreCase) : p) != -1;
If it is possible to have repeated adjacent letters, you need to complicate the LINQ solution slightly:
var ans = letters.Aggregate(0, (p, c) => {
if (p >= 0) {
var newp = s.IndexOf(c.ToString(), p, StringComparison.InvariantCultureIgnoreCase);
return newp >= 0 ? newp+1 : newp;
}
else
return p;
}) != -1;
Given the (ugly) machinations required to basically terminate Aggregate early, and given the (ugly and inefficient) syntax required to use an inline anonymous expression call to get rid of the temporary newp, I created some extensions to help, an Aggregate that can terminate early:
public static TAccum AggregateWhile<TAccum, T>(this IEnumerable<T> src, TAccum seed, Func<TAccum, T, TAccum> accumFn, Predicate<TAccum> whileFn) {
using (var e = src.GetEnumerator()) {
if (!e.MoveNext())
throw new Exception("At least one element required by AggregateWhile");
var ans = accumFn(seed, e.Current);
while (whileFn(ans) && e.MoveNext())
ans = accumFn(ans, e.Current);
return ans;
}
}
Now you can solve the problem fairly easily:
var ans2 = letters.AggregateWhile(-1,
(p, c) => s.IndexOf(c.ToString(), p+1, StringComparison.InvariantCultureIgnoreCase),
p => p >= 0
) != -1;
Why not something like this?
static bool CheckInOrder(string source, string charsToCheck)
{
int index = -1;
foreach (var c in charsToCheck)
{
index = source.IndexOf(c, index + 1);
if (index == -1)
return false;
}
return true;
}
Then you can use the function like this:
bool result = CheckInOrder("this is my source string", "gaoaf");
This should work because IndexOf returns -1 if a string isn't found, and it only starts scanning AFTER the previous match.

compare the characters in two strings

In C#, how do I compare the characters in two strings.
For example, let's say I have these two strings
"bc3231dsc" and "bc3462dsc"
How do I programically figure out the the strings
both start with "bc3" and end with "dsc"?
So the given would be two variables:
var1 = "bc3231dsc";
var2 = "bc3462dsc";
After comparing each characters from var1 to var2, I would want the output to be:
leftMatch = "bc3";
center1 = "231";
center2 = "462";
rightMatch = "dsc";
Conditions:
1. The strings will always be a length of 9 character.
2. The strings are not case sensitive.
The string class has 2 methods (StartsWith and Endwith) that you can use.
After reading your question and the already given answers i think there are some constraints are missing, which are maybe obvious to you, but not to the community. But maybe we can do a little guess work:
You'll have a bunch of string pairs that should be compared.
The two strings in each pair are of the same length or you are only interested by comparing the characters read simultaneously from left to right.
Get some kind of enumeration that tells me where each block starts and how long it is.
Due to the fact, that a string is only a enumeration of chars you could use LINQ here to get an idea of the matching characters like this:
private IEnumerable<bool> CommonChars(string first, string second)
{
if (first == null)
throw new ArgumentNullException("first");
if (second == null)
throw new ArgumentNullException("second");
var charsToCompare = first.Zip(second, (LeftChar, RightChar) => new { LeftChar, RightChar });
var matchingChars = charsToCompare.Select(pair => pair.LeftChar == pair.RightChar);
return matchingChars;
}
With this we can proceed and now find out how long each block of consecutive true and false flags are with this method:
private IEnumerable<Tuple<int, int>> Pack(IEnumerable<bool> source)
{
if (source == null)
throw new ArgumentNullException("source");
using (var iterator = source.GetEnumerator())
{
if (!iterator.MoveNext())
{
yield break;
}
bool current = iterator.Current;
int index = 0;
int length = 1;
while (iterator.MoveNext())
{
if(current != iterator.Current)
{
yield return Tuple.Create(index, length);
index += length;
length = 0;
}
current = iterator.Current;
length++;
}
yield return Tuple.Create(index, length);
}
}
Currently i don't know if there is an already existing LINQ function that provides the same functionality. As far as i have already read it should be possible with SelectMany() (cause in theory you can accomplish any LINQ task with this method), but as an adhoc implementation the above was easier (for me).
These functions could then be used in a way something like this:
var firstString = "bc3231dsc";
var secondString = "bc3462dsc";
var commonChars = CommonChars(firstString, secondString);
var packs = Pack(commonChars);
foreach (var item in packs)
{
Console.WriteLine("Left side: " + firstString.Substring(item.Item1, item.Item2));
Console.WriteLine("Right side: " + secondString.Substring(item.Item1, item.Item2));
Console.WriteLine();
}
Which would you then give this output:
Left side: bc3
Right side: bc3
Left side: 231
Right side: 462
Left side: dsc
Right side: dsc
The biggest drawback is in someway the usage of Tuple cause it leads to the ugly property names Item1 and Item2 which are far away from being instantly readable. But if it is really wanted you could introduce your own simple class holding two integers and has some rock-solid property names. Also currently the information is lost about if each block is shared by both strings or if they are different. But once again it should be fairly simply to get this information also into the tuple or your own class.
static void Main(string[] args)
{
string test1 = "bc3231dsc";
string tes2 = "bc3462dsc";
string firstmatch = GetMatch(test1, tes2, false);
string lasttmatch = GetMatch(test1, tes2, true);
string center1 = test1.Substring(firstmatch.Length, test1.Length -(firstmatch.Length + lasttmatch.Length)) ;
string center2 = test2.Substring(firstmatch.Length, test1.Length -(firstmatch.Length + lasttmatch.Length)) ;
}
public static string GetMatch(string fist, string second, bool isReverse)
{
if (isReverse)
{
fist = ReverseString(fist);
second = ReverseString(second);
}
StringBuilder builder = new StringBuilder();
char[] ar1 = fist.ToArray();
for (int i = 0; i < ar1.Length; i++)
{
if (fist.Length > i + 1 && ar1[i].Equals(second[i]))
{
builder.Append(ar1[i]);
}
else
{
break;
}
}
if (isReverse)
{
return ReverseString(builder.ToString());
}
return builder.ToString();
}
public static string ReverseString(string s)
{
char[] arr = s.ToCharArray();
Array.Reverse(arr);
return new string(arr);
}
Pseudo code of what you need..
int stringpos = 0
string resultstart = ""
while not end of string (either of the two)
{
if string1.substr(stringpos) == string1.substr(stringpos)
resultstart =resultstart + string1.substr(stringpos)
else
exit while
}
resultstart has you start string.. you can do the same going backwards...
Another solution you can use is Regular Expressions.
Regex re = new Regex("^bc3.*?dsc$");
String first = "bc3231dsc";
if(re.IsMatch(first)) {
//Act accordingly...
}
This gives you more flexibility when matching. The pattern above matches any string that starts in bc3 and ends in dsc with anything between except a linefeed. By changing .*? to \d, you could specify that you only want digits between the two fields. From there, the possibilities are endless.
using System;
using System.Text.RegularExpressions;
using System.Collections.Generic;
class Sample {
static public void Main(){
string s1 = "bc3231dsc";
string s2 = "bc3462dsc";
List<string> common_str = commonStrings(s1,s2);
foreach ( var s in common_str)
Console.WriteLine(s);
}
static public List<string> commonStrings(string s1, string s2){
int len = s1.Length;
char [] match_chars = new char[len];
for(var i = 0; i < len ; ++i)
match_chars[i] = (Char.ToLower(s1[i])==Char.ToLower(s2[i]))? '#' : '_';
string pat = new String(match_chars);
Regex regex = new Regex("(#+)", RegexOptions.Compiled);
List<string> result = new List<string>();
foreach (Match match in regex.Matches(pat))
result.Add(s1.Substring(match.Index, match.Length));
return result;
}
}
for UPDATE CONDITION
using System;
class Sample {
static public void Main(){
string s1 = "bc3231dsc";
string s2 = "bc3462dsc";
int len = 9;//s1.Length;//cond.1)
int l_pos = 0;
int r_pos = len;
for(int i=0;i<len && Char.ToLower(s1[i])==Char.ToLower(s2[i]);++i){
++l_pos;
}
for(int i=len-1;i>0 && Char.ToLower(s1[i])==Char.ToLower(s2[i]);--i){
--r_pos;
}
string leftMatch = s1.Substring(0,l_pos);
string center1 = s1.Substring(l_pos, r_pos - l_pos);
string center2 = s2.Substring(l_pos, r_pos - l_pos);
string rightMatch = s1.Substring(r_pos);
Console.Write(
"leftMatch = \"{0}\"\n" +
"center1 = \"{1}\"\n" +
"center2 = \"{2}\"\n" +
"rightMatch = \"{3}\"\n",leftMatch, center1, center2, rightMatch);
}
}

Eric Lippert's challenge "comma-quibbling", best answer?

I wanted to bring this challenge to the attention of the stackoverflow community. The original problem and answers are here. BTW, if you did not follow it before, you should try to read Eric's blog, it is pure wisdom.
Summary:
Write a function that takes a non-null IEnumerable and returns a string with the following characteristics:
If the sequence is empty the resulting string is "{}".
If the sequence is a single item "ABC" then the resulting string is "{ABC}".
If the sequence is the two item sequence "ABC", "DEF" then the resulting string is "{ABC and DEF}".
If the sequence has more than two items, say, "ABC", "DEF", "G", "H" then the resulting string is "{ABC, DEF, G and H}". (Note: no Oxford comma!)
As you can see even our very own Jon Skeet (yes, it is well known that he can be in two places at the same time) has posted a solution but his (IMHO) is not the most elegant although probably you can not beat its performance.
What do you think? There are pretty good options there. I really like one of the solutions that involves the select and aggregate methods (from Fernando Nicolet). Linq is very powerful and dedicating some time to challenges like this make you learn a lot. I twisted it a bit so it is a bit more performant and clear (by using Count and avoiding Reverse):
public static string CommaQuibbling(IEnumerable<string> items)
{
int last = items.Count() - 1;
Func<int, string> getSeparator = (i) => i == 0 ? string.Empty : (i == last ? " and " : ", ");
string answer = string.Empty;
return "{" + items.Select((s, i) => new { Index = i, Value = s })
.Aggregate(answer, (s, a) => s + getSeparator(a.Index) + a.Value) + "}";
}
Inefficient, but I think clear.
public static string CommaQuibbling(IEnumerable<string> items)
{
List<String> list = new List<String>(items);
if (list.Count == 0) { return "{}"; }
if (list.Count == 1) { return "{" + list[0] + "}"; }
String[] initial = list.GetRange(0, list.Count - 1).ToArray();
return "{" + String.Join(", ", initial) + " and " + list[list.Count - 1] + "}";
}
If I was maintaining the code, I'd prefer this to more clever versions.
How about this approach? Purely cumulative - no back-tracking, and only iterates once. For raw performance, I'm not sure you'll do better with LINQ etc, regardless of how "pretty" a LINQ answer might be.
using System;
using System.Collections.Generic;
using System.Text;
static class Program
{
public static string CommaQuibbling(IEnumerable<string> items)
{
StringBuilder sb = new StringBuilder('{');
using (var iter = items.GetEnumerator())
{
if (iter.MoveNext())
{ // first item can be appended directly
sb.Append(iter.Current);
if (iter.MoveNext())
{ // more than one; only add each
// term when we know there is another
string lastItem = iter.Current;
while (iter.MoveNext())
{ // middle term; use ", "
sb.Append(", ").Append(lastItem);
lastItem = iter.Current;
}
// add the final term; since we are on at least the
// second term, always use " and "
sb.Append(" and ").Append(lastItem);
}
}
}
return sb.Append('}').ToString();
}
static void Main()
{
Console.WriteLine(CommaQuibbling(new string[] { }));
Console.WriteLine(CommaQuibbling(new string[] { "ABC" }));
Console.WriteLine(CommaQuibbling(new string[] { "ABC", "DEF" }));
Console.WriteLine(CommaQuibbling(new string[] {
"ABC", "DEF", "G", "H" }));
}
}
If I was doing a lot with streams which required first/last information, I'd have thid extension:
[Flags]
public enum StreamPosition
{
First = 1, Last = 2
}
public static IEnumerable<R> MapWithPositions<T, R> (this IEnumerable<T> stream,
Func<StreamPosition, T, R> map)
{
using (var enumerator = stream.GetEnumerator ())
{
if (!enumerator.MoveNext ()) yield break ;
var cur = enumerator.Current ;
var flags = StreamPosition.First ;
while (true)
{
if (!enumerator.MoveNext ()) flags |= StreamPosition.Last ;
yield return map (flags, cur) ;
if ((flags & StreamPosition.Last) != 0) yield break ;
cur = enumerator.Current ;
flags = 0 ;
}
}
}
Then the simplest (not the quickest, that would need a couple more handy extension methods) solution will be:
public static string Quibble (IEnumerable<string> strings)
{
return "{" + String.Join ("", strings.MapWithPositions ((pos, item) => (
(pos & StreamPosition.First) != 0 ? "" :
pos == StreamPosition.Last ? " and " : ", ") + item)) + "}" ;
}
Here as a Python one liner
>>> f=lambda s:"{%s}"%", ".join(s)[::-1].replace(',','dna ',1)[::-1]
>>> f([])
'{}'
>>> f(["ABC"])
'{ABC}'
>>> f(["ABC","DEF"])
'{ABC and DEF}'
>>> f(["ABC","DEF","G","H"])
'{ABC, DEF, G and H}'
This version might be easier to understand
>>> f=lambda s:"{%s}"%" and ".join(s).replace(' and',',',len(s)-2)
>>> f([])
'{}'
>>> f(["ABC"])
'{ABC}'
>>> f(["ABC","DEF"])
'{ABC and DEF}'
>>> f(["ABC","DEF","G","H"])
'{ABC, DEF, G and H}'
Here's a simple F# solution, that only does one forward iteration:
let CommaQuibble items =
let sb = System.Text.StringBuilder("{")
// pp is 2 previous, p is previous
let pp,p = items |> Seq.fold (fun (pp:string option,p) s ->
if pp <> None then
sb.Append(pp.Value).Append(", ") |> ignore
(p, Some(s))) (None,None)
if pp <> None then
sb.Append(pp.Value).Append(" and ") |> ignore
if p <> None then
sb.Append(p.Value) |> ignore
sb.Append("}").ToString()
(EDIT: Turns out this is very similar to Skeet's.)
The test code:
let Test l =
printfn "%s" (CommaQuibble l)
Test []
Test ["ABC"]
Test ["ABC";"DEF"]
Test ["ABC";"DEF";"G"]
Test ["ABC";"DEF";"G";"H"]
Test ["ABC";null;"G";"H"]
I'm a fan of the serial comma: I eat, shoot, and leave.
I continually need a solution to this problem and have solved it in 3 languages (though not C#). I would adapt the following solution (in Lua, does not wrap answer in curly braces) by writing a concat method that works on any IEnumerable:
function commafy(t, andword)
andword = andword or 'and'
local n = #t -- number of elements in the numeration
if n == 1 then
return t[1]
elseif n == 2 then
return concat { t[1], ' ', andword, ' ', t[2] }
else
local last = t[n]
t[n] = andword .. ' ' .. t[n]
local answer = concat(t, ', ')
t[n] = last
return answer
end
end
This isn't brilliantly readable, but it scales well up to tens of millions of strings. I'm developing on an old Pentium 4 workstation and it does 1,000,000 strings of average length 8 in about 350ms.
public static string CreateLippertString(IEnumerable<string> strings)
{
char[] combinedString;
char[] commaSeparator = new char[] { ',', ' ' };
char[] andSeparator = new char[] { ' ', 'A', 'N', 'D', ' ' };
int totalLength = 2; //'{' and '}'
int numEntries = 0;
int currentEntry = 0;
int currentPosition = 0;
int secondToLast;
int last;
int commaLength= commaSeparator.Length;
int andLength = andSeparator.Length;
int cbComma = commaLength * sizeof(char);
int cbAnd = andLength * sizeof(char);
//calculate the sum of the lengths of the strings
foreach (string s in strings)
{
totalLength += s.Length;
++numEntries;
}
//add to the total length the length of the constant characters
if (numEntries >= 2)
totalLength += 5; // " AND "
if (numEntries > 2)
totalLength += (2 * (numEntries - 2)); // ", " between items
//setup some meta-variables to help later
secondToLast = numEntries - 2;
last = numEntries - 1;
//allocate the memory for the combined string
combinedString = new char[totalLength];
//set the first character to {
combinedString[0] = '{';
currentPosition = 1;
if (numEntries > 0)
{
//now copy each string into its place
foreach (string s in strings)
{
Buffer.BlockCopy(s.ToCharArray(), 0, combinedString, currentPosition * sizeof(char), s.Length * sizeof(char));
currentPosition += s.Length;
if (currentEntry == secondToLast)
{
Buffer.BlockCopy(andSeparator, 0, combinedString, currentPosition * sizeof(char), cbAnd);
currentPosition += andLength;
}
else if (currentEntry == last)
{
combinedString[currentPosition] = '}'; //set the last character to '}'
break; //don't bother making that last call to the enumerator
}
else if (currentEntry < secondToLast)
{
Buffer.BlockCopy(commaSeparator, 0, combinedString, currentPosition * sizeof(char), cbComma);
currentPosition += commaLength;
}
++currentEntry;
}
}
else
{
//set the last character to '}'
combinedString[1] = '}';
}
return new string(combinedString);
}
Another variant - separating punctuation and iteration logic for the sake of code clarity. And still thinking about perfomrance.
Works as requested with pure IEnumerable/string/ and strings in the list cannot be null.
public static string Concat(IEnumerable<string> strings)
{
return "{" + strings.reduce("", (acc, prev, cur, next) =>
acc.Append(punctuation(prev, cur, next)).Append(cur)) + "}";
}
private static string punctuation(string prev, string cur, string next)
{
if (null == prev || null == cur)
return "";
if (null == next)
return " and ";
return ", ";
}
private static string reduce(this IEnumerable<string> strings,
string acc, Func<StringBuilder, string, string, string, StringBuilder> func)
{
if (null == strings) return "";
var accumulatorBuilder = new StringBuilder(acc);
string cur = null;
string prev = null;
foreach (var next in strings)
{
func(accumulatorBuilder, prev, cur, next);
prev = cur;
cur = next;
}
func(accumulatorBuilder, prev, cur, null);
return accumulatorBuilder.ToString();
}
F# surely looks much better:
let rec reduce list =
match list with
| [] -> ""
| head::curr::[] -> head + " and " + curr
| head::curr::tail -> head + ", " + curr :: tail |> reduce
| head::[] -> head
let concat list = "{" + (list |> reduce ) + "}"
Disclaimer: I used this as an excuse to play around with new technologies, so my solutions don't really live up to the Eric's original demands for clarity and maintainability.
Naive Enumerator Solution
(I concede that the foreach variant of this is superior, as it doesn't require manually messing about with the enumerator.)
public static string NaiveConcatenate(IEnumerable<string> sequence)
{
StringBuilder sb = new StringBuilder();
sb.Append('{');
IEnumerator<string> enumerator = sequence.GetEnumerator();
if (enumerator.MoveNext())
{
string a = enumerator.Current;
if (!enumerator.MoveNext())
{
sb.Append(a);
}
else
{
string b = enumerator.Current;
while (enumerator.MoveNext())
{
sb.Append(a);
sb.Append(", ");
a = b;
b = enumerator.Current;
}
sb.AppendFormat("{0} and {1}", a, b);
}
}
sb.Append('}');
return sb.ToString();
}
Solution using LINQ
public static string ConcatenateWithLinq(IEnumerable<string> sequence)
{
return (from item in sequence select item)
.Aggregate(
new {sb = new StringBuilder("{"), a = (string) null, b = (string) null},
(s, x) =>
{
if (s.a != null)
{
s.sb.Append(s.a);
s.sb.Append(", ");
}
return new {s.sb, a = s.b, b = x};
},
(s) =>
{
if (s.b != null)
if (s.a != null)
s.sb.AppendFormat("{0} and {1}", s.a, s.b);
else
s.sb.Append(s.b);
s.sb.Append("}");
return s.sb.ToString();
});
}
Solution with TPL
This solution uses a producer-consumer queue to feed the input sequence to the processor, whilst keeping at least two elements buffered in the queue. Once the producer has reached the end of the input sequence, the last two elements can be processed with special treatment.
In hindsight there is no reason to have the consumer operate asynchronously, which would eliminate the need for a concurrent queue, but as I said previously, I was just using this as an excuse to play around with new technologies :-)
public static string ConcatenateWithTpl(IEnumerable<string> sequence)
{
var queue = new ConcurrentQueue<string>();
bool stop = false;
var consumer = Future.Create(
() =>
{
var sb = new StringBuilder("{");
while (!stop || queue.Count > 2)
{
string s;
if (queue.Count > 2 && queue.TryDequeue(out s))
sb.AppendFormat("{0}, ", s);
}
return sb;
});
// Producer
foreach (var item in sequence)
queue.Enqueue(item);
stop = true;
StringBuilder result = consumer.Value;
string a;
string b;
if (queue.TryDequeue(out a))
if (queue.TryDequeue(out b))
result.AppendFormat("{0} and {1}", a, b);
else
result.Append(a);
result.Append("}");
return result.ToString();
}
Unit tests elided for brevity.
Late entry:
public static string CommaQuibbling(IEnumerable<string> items)
{
string[] parts = items.ToArray();
StringBuilder result = new StringBuilder('{');
for (int i = 0; i < parts.Length; i++)
{
if (i > 0)
result.Append(i == parts.Length - 1 ? " and " : ", ");
result.Append(parts[i]);
}
return result.Append('}').ToString();
}
public static string CommaQuibbling(IEnumerable<string> items)
{
int count = items.Count();
string answer = string.Empty;
return "{" +
(count==0) ? "" :
( items[0] +
(count == 1 ? "" :
items.Range(1,count-1).
Aggregate(answer, (s,a)=> s += ", " + a) +
items.Range(count-1,1).
Aggregate(answer, (s,a)=> s += " AND " + a) ))+ "}";
}
It is implemented as,
if count == 0 , then return empty,
if count == 1 , then return only element,
if count > 1 , then take two ranges,
first 2nd element to 2nd last element
last element
Here's mine, but I realize it's pretty much like Marc's, some minor differences in the order of things, and I added unit-tests as well.
using System;
using NUnit.Framework;
using NUnit.Framework.Extensions;
using System.Collections.Generic;
using System.Text;
using NUnit.Framework.SyntaxHelpers;
namespace StringChallengeProject
{
[TestFixture]
public class StringChallenge
{
[RowTest]
[Row(new String[] { }, "{}")]
[Row(new[] { "ABC" }, "{ABC}")]
[Row(new[] { "ABC", "DEF" }, "{ABC and DEF}")]
[Row(new[] { "ABC", "DEF", "G", "H" }, "{ABC, DEF, G and H}")]
public void Test(String[] input, String expectedOutput)
{
Assert.That(FormatString(input), Is.EqualTo(expectedOutput));
}
//codesnippet:93458590-3182-11de-8c30-0800200c9a66
public static String FormatString(IEnumerable<String> input)
{
if (input == null)
return "{}";
using (var iterator = input.GetEnumerator())
{
// Guard-clause for empty source
if (!iterator.MoveNext())
return "{}";
// Take care of first value
var output = new StringBuilder();
output.Append('{').Append(iterator.Current);
// Grab next
if (iterator.MoveNext())
{
// Grab the next value, but don't process it
// we don't know whether to use comma or "and"
// until we've grabbed the next after it as well
String nextValue = iterator.Current;
while (iterator.MoveNext())
{
output.Append(", ");
output.Append(nextValue);
nextValue = iterator.Current;
}
output.Append(" and ");
output.Append(nextValue);
}
output.Append('}');
return output.ToString();
}
}
}
}
How about skipping complicated aggregation code and just cleaning up the string after you build it?
public static string CommaQuibbling(IEnumerable<string> items)
{
var aggregate = items.Aggregate<string, StringBuilder>(
new StringBuilder(),
(b,s) => b.AppendFormat(", {0}", s));
var trimmed = Regex.Replace(aggregate.ToString(), "^, ", string.Empty);
return string.Format(
"{{{0}}}",
Regex.Replace(trimmed,
", (?<last>[^,]*)$", #" and ${last}"));
}
UPDATED: This won't work with strings with commas, as pointed out in the comments. I tried some other variations, but without definite rules about what the strings can contain, I'm going to have real problems matching any possible last item with a regular expression, which makes this a nice lesson for me on their limitations.
I quite liked Jon's answer, but that's because it's much like how I approached the problem. Rather than specifically coding in the two variables, I implemented them inside of a FIFO queue.
It's strange because I just assumed that there would be 15 posts that all did exactly the same thing, but it looks like we were the only two to do it that way. Oh, looking at these answers, Marc Gravell's answer is quite close to the approach we used as well, but he's using two 'loops', rather than holding on to values.
But all those answers with LINQ and regex and joining arrays just seem like crazy-talk! :-)
I don't think that using a good old array is a restriction. Here is my version using an array and an extension method:
public static string CommaQuibbling(IEnumerable<string> list)
{
string[] array = list.ToArray();
if (array.Length == 0) return string.Empty.PutCurlyBraces();
if (array.Length == 1) return array[0].PutCurlyBraces();
string allExceptLast = string.Join(", ", array, 0, array.Length - 1);
string theLast = array[array.Length - 1];
return string.Format("{0} and {1}", allExceptLast, theLast)
.PutCurlyBraces();
}
public static string PutCurlyBraces(this string str)
{
return "{" + str + "}";
}
I am using an array because of the string.Join method and because if the possibility of accessing the last element via an index. The extension method is here because of DRY.
I think that the performance penalities come from the list.ToArray() and string.Join calls, but all in one I hope that piece of code is pleasent to read and maintain.
I think Linq provides fairly readable code. This version handles a million "ABC" in .89 seconds:
using System.Collections.Generic;
using System.Linq;
namespace CommaQuibbling
{
internal class Translator
{
public string Translate(IEnumerable<string> items)
{
return "{" + Join(items) + "}";
}
private static string Join(IEnumerable<string> items)
{
var leadingItems = LeadingItemsFrom(items);
var lastItem = LastItemFrom(items);
return JoinLeading(leadingItems) + lastItem;
}
private static IEnumerable<string> LeadingItemsFrom(IEnumerable<string> items)
{
return items.Reverse().Skip(1).Reverse();
}
private static string LastItemFrom(IEnumerable<string> items)
{
return items.LastOrDefault();
}
private static string JoinLeading(IEnumerable<string> items)
{
if (items.Any() == false) return "";
return string.Join(", ", items.ToArray()) + " and ";
}
}
}
You can use a foreach, without LINQ, delegates, closures, lists or arrays, and still have understandable code. Use a bool and a string, like so:
public static string CommaQuibbling(IEnumerable items)
{
StringBuilder sb = new StringBuilder("{");
bool empty = true;
string prev = null;
foreach (string s in items)
{
if (prev!=null)
{
if (!empty) sb.Append(", ");
else empty = false;
sb.Append(prev);
}
prev = s;
}
if (prev!=null)
{
if (!empty) sb.Append(" and ");
sb.Append(prev);
}
return sb.Append('}').ToString();
}
public static string CommaQuibbling(IEnumerable<string> items)
{
var itemArray = items.ToArray();
var commaSeparated = String.Join(", ", itemArray, 0, Math.Max(itemArray.Length - 1, 0));
if (commaSeparated.Length > 0) commaSeparated += " and ";
return "{" + commaSeparated + itemArray.LastOrDefault() + "}";
}
Here's my submission. Modified the signature a bit to make it more generic. Using .NET 4 features (String.Join() using IEnumerable<T>), otherwise works with .NET 3.5. Goal was to use LINQ with drastically simplified logic.
static string CommaQuibbling<T>(IEnumerable<T> items)
{
int count = items.Count();
var quibbled = items.Select((Item, index) => new { Item, Group = (count - index - 2) > 0})
.GroupBy(item => item.Group, item => item.Item)
.Select(g => g.Key
? String.Join(", ", g)
: String.Join(" and ", g));
return "{" + String.Join(", ", quibbled) + "}";
}
There's a couple non-C# answers, and the original post did ask for answers in any language, so I thought I'd show another way to do it that none of the C# programmers seems to have touched upon: a DSL!
(defun quibble-comma (words)
(format nil "~{~#[~;~a~;~a and ~a~:;~#{~a~#[~; and ~:;, ~]~}~]~}" words))
The astute will note that Common Lisp doesn't really have an IEnumerable<T> built-in, and hence FORMAT here will only work on a proper list. But if you made an IEnumerable, you certainly could extend FORMAT to work on that, as well. (Does Clojure have this?)
Also, anyone reading this who has taste (including Lisp programmers!) will probably be offended by the literal "~{~#[~;~a~;~a and ~a~:;~#{~a~#[~; and ~:;, ~]~}~]~}" there. I won't claim that FORMAT implements a good DSL, but I do believe that it is tremendously useful to have some powerful DSL for putting strings together. Regex is a powerful DSL for tearing strings apart, and string.Format is a DSL (kind of) for putting strings together but it's stupidly weak.
I think everybody writes these kind of things all the time. Why the heck isn't there some built-in universal tasteful DSL for this yet? I think the closest we have is "Perl", maybe.
Just for fun, using the new Zip extension method from C# 4.0:
private static string CommaQuibbling(IEnumerable<string> list)
{
IEnumerable<string> separators = GetSeparators(list.Count());
var finalList = list.Zip(separators, (w, s) => w + s);
return string.Concat("{", string.Join(string.Empty, finalList), "}");
}
private static IEnumerable<string> GetSeparators(int itemCount)
{
while (itemCount-- > 2)
yield return ", ";
if (itemCount == 1)
yield return " and ";
yield return string.Empty;
}
return String.Concat(
"{",
input.Length > 2 ?
String.Concat(
String.Join(", ", input.Take(input.Length - 1)),
" and ",
input.Last()) :
String.Join(" and ", input),
"}");
I have tried using foreach. Please let me know your opinions.
private static string CommaQuibble(IEnumerable<string> input)
{
var val = string.Concat(input.Process(
p => p,
p => string.Format(" and {0}", p),
p => string.Format(", {0}", p)));
return string.Format("{{{0}}}", val);
}
public static IEnumerable<T> Process<T>(this IEnumerable<T> input,
Func<T, T> firstItemFunc,
Func<T, T> lastItemFunc,
Func<T, T> otherItemFunc)
{
//break on empty sequence
if (!input.Any()) yield break;
//return first elem
var first = input.First();
yield return firstItemFunc(first);
//break if there was only one elem
var rest = input.Skip(1);
if (!rest.Any()) yield break;
//start looping the rest of the elements
T prevItem = first;
bool isFirstIteration = true;
foreach (var item in rest)
{
if (isFirstIteration) isFirstIteration = false;
else
{
yield return otherItemFunc(prevItem);
}
prevItem = item;
}
//last element
yield return lastItemFunc(prevItem);
}
Here are a couple of solutions and testing code written in Perl based on the replies at http://blogs.perl.org/users/brian_d_foy/2013/10/comma-quibbling-in-perl.html.
#!/usr/bin/perl
use 5.14.0;
use warnings;
use strict;
use Test::More qw{no_plan};
sub comma_quibbling1 {
my (#words) = #_;
return "" unless #words;
return $words[0] if #words == 1;
return join(", ", #words[0 .. $#words - 1]) . " and $words[-1]";
}
sub comma_quibbling2 {
return "" unless #_;
my $last = pop #_;
return $last unless #_;
return join(", ", #_) . " and $last";
}
is comma_quibbling1(qw{}), "", "1-0";
is comma_quibbling1(qw{one}), "one", "1-1";
is comma_quibbling1(qw{one two}), "one and two", "1-2";
is comma_quibbling1(qw{one two three}), "one, two and three", "1-3";
is comma_quibbling1(qw{one two three four}), "one, two, three and four", "1-4";
is comma_quibbling2(qw{}), "", "2-0";
is comma_quibbling2(qw{one}), "one", "2-1";
is comma_quibbling2(qw{one two}), "one and two", "2-2";
is comma_quibbling2(qw{one two three}), "one, two and three", "2-3";
is comma_quibbling2(qw{one two three four}), "one, two, three and four", "2-4";
It hasn't quite been a decade since the last post so here's my variation:
public static string CommaQuibbling(IEnumerable<string> items)
{
var text = new StringBuilder();
string sep = null;
int last_pos = items.Count();
int next_pos = 1;
foreach(string item in items)
{
text.Append($"{sep}{item}");
sep = ++next_pos < last_pos ? ", " : " and ";
}
return $"{{{text}}}";
}
In one statement:
public static string CommaQuibbling(IEnumerable<string> inputList)
{
return
String.Concat("{",
String.Join(null, inputList
.Select((iw, i) =>
(i == (inputList.Count() - 1)) ? $"{iw}" :
(i == (inputList.Count() - 2) ? $"{iw} and " : $"{iw}, "))
.ToArray()), "}");
}
In .NET Core we can leverage SkipLast and TakeLast.
public static string CommaQuibblify(IEnumerable<string> items)
{
var head = string.Join(", ", items.SkipLast(2).Append(""));
var tail = string.Join(" and ", items.TakeLast(2));
return '{' + head + tail + '}';
}
https://dotnetfiddle.net/X58qvZ

Categories