I wanted to bring this challenge to the attention of the stackoverflow community. The original problem and answers are here. BTW, if you did not follow it before, you should try to read Eric's blog, it is pure wisdom.
Summary:
Write a function that takes a non-null IEnumerable and returns a string with the following characteristics:
If the sequence is empty the resulting string is "{}".
If the sequence is a single item "ABC" then the resulting string is "{ABC}".
If the sequence is the two item sequence "ABC", "DEF" then the resulting string is "{ABC and DEF}".
If the sequence has more than two items, say, "ABC", "DEF", "G", "H" then the resulting string is "{ABC, DEF, G and H}". (Note: no Oxford comma!)
As you can see even our very own Jon Skeet (yes, it is well known that he can be in two places at the same time) has posted a solution but his (IMHO) is not the most elegant although probably you can not beat its performance.
What do you think? There are pretty good options there. I really like one of the solutions that involves the select and aggregate methods (from Fernando Nicolet). Linq is very powerful and dedicating some time to challenges like this make you learn a lot. I twisted it a bit so it is a bit more performant and clear (by using Count and avoiding Reverse):
public static string CommaQuibbling(IEnumerable<string> items)
{
int last = items.Count() - 1;
Func<int, string> getSeparator = (i) => i == 0 ? string.Empty : (i == last ? " and " : ", ");
string answer = string.Empty;
return "{" + items.Select((s, i) => new { Index = i, Value = s })
.Aggregate(answer, (s, a) => s + getSeparator(a.Index) + a.Value) + "}";
}
Inefficient, but I think clear.
public static string CommaQuibbling(IEnumerable<string> items)
{
List<String> list = new List<String>(items);
if (list.Count == 0) { return "{}"; }
if (list.Count == 1) { return "{" + list[0] + "}"; }
String[] initial = list.GetRange(0, list.Count - 1).ToArray();
return "{" + String.Join(", ", initial) + " and " + list[list.Count - 1] + "}";
}
If I was maintaining the code, I'd prefer this to more clever versions.
How about this approach? Purely cumulative - no back-tracking, and only iterates once. For raw performance, I'm not sure you'll do better with LINQ etc, regardless of how "pretty" a LINQ answer might be.
using System;
using System.Collections.Generic;
using System.Text;
static class Program
{
public static string CommaQuibbling(IEnumerable<string> items)
{
StringBuilder sb = new StringBuilder('{');
using (var iter = items.GetEnumerator())
{
if (iter.MoveNext())
{ // first item can be appended directly
sb.Append(iter.Current);
if (iter.MoveNext())
{ // more than one; only add each
// term when we know there is another
string lastItem = iter.Current;
while (iter.MoveNext())
{ // middle term; use ", "
sb.Append(", ").Append(lastItem);
lastItem = iter.Current;
}
// add the final term; since we are on at least the
// second term, always use " and "
sb.Append(" and ").Append(lastItem);
}
}
}
return sb.Append('}').ToString();
}
static void Main()
{
Console.WriteLine(CommaQuibbling(new string[] { }));
Console.WriteLine(CommaQuibbling(new string[] { "ABC" }));
Console.WriteLine(CommaQuibbling(new string[] { "ABC", "DEF" }));
Console.WriteLine(CommaQuibbling(new string[] {
"ABC", "DEF", "G", "H" }));
}
}
If I was doing a lot with streams which required first/last information, I'd have thid extension:
[Flags]
public enum StreamPosition
{
First = 1, Last = 2
}
public static IEnumerable<R> MapWithPositions<T, R> (this IEnumerable<T> stream,
Func<StreamPosition, T, R> map)
{
using (var enumerator = stream.GetEnumerator ())
{
if (!enumerator.MoveNext ()) yield break ;
var cur = enumerator.Current ;
var flags = StreamPosition.First ;
while (true)
{
if (!enumerator.MoveNext ()) flags |= StreamPosition.Last ;
yield return map (flags, cur) ;
if ((flags & StreamPosition.Last) != 0) yield break ;
cur = enumerator.Current ;
flags = 0 ;
}
}
}
Then the simplest (not the quickest, that would need a couple more handy extension methods) solution will be:
public static string Quibble (IEnumerable<string> strings)
{
return "{" + String.Join ("", strings.MapWithPositions ((pos, item) => (
(pos & StreamPosition.First) != 0 ? "" :
pos == StreamPosition.Last ? " and " : ", ") + item)) + "}" ;
}
Here as a Python one liner
>>> f=lambda s:"{%s}"%", ".join(s)[::-1].replace(',','dna ',1)[::-1]
>>> f([])
'{}'
>>> f(["ABC"])
'{ABC}'
>>> f(["ABC","DEF"])
'{ABC and DEF}'
>>> f(["ABC","DEF","G","H"])
'{ABC, DEF, G and H}'
This version might be easier to understand
>>> f=lambda s:"{%s}"%" and ".join(s).replace(' and',',',len(s)-2)
>>> f([])
'{}'
>>> f(["ABC"])
'{ABC}'
>>> f(["ABC","DEF"])
'{ABC and DEF}'
>>> f(["ABC","DEF","G","H"])
'{ABC, DEF, G and H}'
Here's a simple F# solution, that only does one forward iteration:
let CommaQuibble items =
let sb = System.Text.StringBuilder("{")
// pp is 2 previous, p is previous
let pp,p = items |> Seq.fold (fun (pp:string option,p) s ->
if pp <> None then
sb.Append(pp.Value).Append(", ") |> ignore
(p, Some(s))) (None,None)
if pp <> None then
sb.Append(pp.Value).Append(" and ") |> ignore
if p <> None then
sb.Append(p.Value) |> ignore
sb.Append("}").ToString()
(EDIT: Turns out this is very similar to Skeet's.)
The test code:
let Test l =
printfn "%s" (CommaQuibble l)
Test []
Test ["ABC"]
Test ["ABC";"DEF"]
Test ["ABC";"DEF";"G"]
Test ["ABC";"DEF";"G";"H"]
Test ["ABC";null;"G";"H"]
I'm a fan of the serial comma: I eat, shoot, and leave.
I continually need a solution to this problem and have solved it in 3 languages (though not C#). I would adapt the following solution (in Lua, does not wrap answer in curly braces) by writing a concat method that works on any IEnumerable:
function commafy(t, andword)
andword = andword or 'and'
local n = #t -- number of elements in the numeration
if n == 1 then
return t[1]
elseif n == 2 then
return concat { t[1], ' ', andword, ' ', t[2] }
else
local last = t[n]
t[n] = andword .. ' ' .. t[n]
local answer = concat(t, ', ')
t[n] = last
return answer
end
end
This isn't brilliantly readable, but it scales well up to tens of millions of strings. I'm developing on an old Pentium 4 workstation and it does 1,000,000 strings of average length 8 in about 350ms.
public static string CreateLippertString(IEnumerable<string> strings)
{
char[] combinedString;
char[] commaSeparator = new char[] { ',', ' ' };
char[] andSeparator = new char[] { ' ', 'A', 'N', 'D', ' ' };
int totalLength = 2; //'{' and '}'
int numEntries = 0;
int currentEntry = 0;
int currentPosition = 0;
int secondToLast;
int last;
int commaLength= commaSeparator.Length;
int andLength = andSeparator.Length;
int cbComma = commaLength * sizeof(char);
int cbAnd = andLength * sizeof(char);
//calculate the sum of the lengths of the strings
foreach (string s in strings)
{
totalLength += s.Length;
++numEntries;
}
//add to the total length the length of the constant characters
if (numEntries >= 2)
totalLength += 5; // " AND "
if (numEntries > 2)
totalLength += (2 * (numEntries - 2)); // ", " between items
//setup some meta-variables to help later
secondToLast = numEntries - 2;
last = numEntries - 1;
//allocate the memory for the combined string
combinedString = new char[totalLength];
//set the first character to {
combinedString[0] = '{';
currentPosition = 1;
if (numEntries > 0)
{
//now copy each string into its place
foreach (string s in strings)
{
Buffer.BlockCopy(s.ToCharArray(), 0, combinedString, currentPosition * sizeof(char), s.Length * sizeof(char));
currentPosition += s.Length;
if (currentEntry == secondToLast)
{
Buffer.BlockCopy(andSeparator, 0, combinedString, currentPosition * sizeof(char), cbAnd);
currentPosition += andLength;
}
else if (currentEntry == last)
{
combinedString[currentPosition] = '}'; //set the last character to '}'
break; //don't bother making that last call to the enumerator
}
else if (currentEntry < secondToLast)
{
Buffer.BlockCopy(commaSeparator, 0, combinedString, currentPosition * sizeof(char), cbComma);
currentPosition += commaLength;
}
++currentEntry;
}
}
else
{
//set the last character to '}'
combinedString[1] = '}';
}
return new string(combinedString);
}
Another variant - separating punctuation and iteration logic for the sake of code clarity. And still thinking about perfomrance.
Works as requested with pure IEnumerable/string/ and strings in the list cannot be null.
public static string Concat(IEnumerable<string> strings)
{
return "{" + strings.reduce("", (acc, prev, cur, next) =>
acc.Append(punctuation(prev, cur, next)).Append(cur)) + "}";
}
private static string punctuation(string prev, string cur, string next)
{
if (null == prev || null == cur)
return "";
if (null == next)
return " and ";
return ", ";
}
private static string reduce(this IEnumerable<string> strings,
string acc, Func<StringBuilder, string, string, string, StringBuilder> func)
{
if (null == strings) return "";
var accumulatorBuilder = new StringBuilder(acc);
string cur = null;
string prev = null;
foreach (var next in strings)
{
func(accumulatorBuilder, prev, cur, next);
prev = cur;
cur = next;
}
func(accumulatorBuilder, prev, cur, null);
return accumulatorBuilder.ToString();
}
F# surely looks much better:
let rec reduce list =
match list with
| [] -> ""
| head::curr::[] -> head + " and " + curr
| head::curr::tail -> head + ", " + curr :: tail |> reduce
| head::[] -> head
let concat list = "{" + (list |> reduce ) + "}"
Disclaimer: I used this as an excuse to play around with new technologies, so my solutions don't really live up to the Eric's original demands for clarity and maintainability.
Naive Enumerator Solution
(I concede that the foreach variant of this is superior, as it doesn't require manually messing about with the enumerator.)
public static string NaiveConcatenate(IEnumerable<string> sequence)
{
StringBuilder sb = new StringBuilder();
sb.Append('{');
IEnumerator<string> enumerator = sequence.GetEnumerator();
if (enumerator.MoveNext())
{
string a = enumerator.Current;
if (!enumerator.MoveNext())
{
sb.Append(a);
}
else
{
string b = enumerator.Current;
while (enumerator.MoveNext())
{
sb.Append(a);
sb.Append(", ");
a = b;
b = enumerator.Current;
}
sb.AppendFormat("{0} and {1}", a, b);
}
}
sb.Append('}');
return sb.ToString();
}
Solution using LINQ
public static string ConcatenateWithLinq(IEnumerable<string> sequence)
{
return (from item in sequence select item)
.Aggregate(
new {sb = new StringBuilder("{"), a = (string) null, b = (string) null},
(s, x) =>
{
if (s.a != null)
{
s.sb.Append(s.a);
s.sb.Append(", ");
}
return new {s.sb, a = s.b, b = x};
},
(s) =>
{
if (s.b != null)
if (s.a != null)
s.sb.AppendFormat("{0} and {1}", s.a, s.b);
else
s.sb.Append(s.b);
s.sb.Append("}");
return s.sb.ToString();
});
}
Solution with TPL
This solution uses a producer-consumer queue to feed the input sequence to the processor, whilst keeping at least two elements buffered in the queue. Once the producer has reached the end of the input sequence, the last two elements can be processed with special treatment.
In hindsight there is no reason to have the consumer operate asynchronously, which would eliminate the need for a concurrent queue, but as I said previously, I was just using this as an excuse to play around with new technologies :-)
public static string ConcatenateWithTpl(IEnumerable<string> sequence)
{
var queue = new ConcurrentQueue<string>();
bool stop = false;
var consumer = Future.Create(
() =>
{
var sb = new StringBuilder("{");
while (!stop || queue.Count > 2)
{
string s;
if (queue.Count > 2 && queue.TryDequeue(out s))
sb.AppendFormat("{0}, ", s);
}
return sb;
});
// Producer
foreach (var item in sequence)
queue.Enqueue(item);
stop = true;
StringBuilder result = consumer.Value;
string a;
string b;
if (queue.TryDequeue(out a))
if (queue.TryDequeue(out b))
result.AppendFormat("{0} and {1}", a, b);
else
result.Append(a);
result.Append("}");
return result.ToString();
}
Unit tests elided for brevity.
Late entry:
public static string CommaQuibbling(IEnumerable<string> items)
{
string[] parts = items.ToArray();
StringBuilder result = new StringBuilder('{');
for (int i = 0; i < parts.Length; i++)
{
if (i > 0)
result.Append(i == parts.Length - 1 ? " and " : ", ");
result.Append(parts[i]);
}
return result.Append('}').ToString();
}
public static string CommaQuibbling(IEnumerable<string> items)
{
int count = items.Count();
string answer = string.Empty;
return "{" +
(count==0) ? "" :
( items[0] +
(count == 1 ? "" :
items.Range(1,count-1).
Aggregate(answer, (s,a)=> s += ", " + a) +
items.Range(count-1,1).
Aggregate(answer, (s,a)=> s += " AND " + a) ))+ "}";
}
It is implemented as,
if count == 0 , then return empty,
if count == 1 , then return only element,
if count > 1 , then take two ranges,
first 2nd element to 2nd last element
last element
Here's mine, but I realize it's pretty much like Marc's, some minor differences in the order of things, and I added unit-tests as well.
using System;
using NUnit.Framework;
using NUnit.Framework.Extensions;
using System.Collections.Generic;
using System.Text;
using NUnit.Framework.SyntaxHelpers;
namespace StringChallengeProject
{
[TestFixture]
public class StringChallenge
{
[RowTest]
[Row(new String[] { }, "{}")]
[Row(new[] { "ABC" }, "{ABC}")]
[Row(new[] { "ABC", "DEF" }, "{ABC and DEF}")]
[Row(new[] { "ABC", "DEF", "G", "H" }, "{ABC, DEF, G and H}")]
public void Test(String[] input, String expectedOutput)
{
Assert.That(FormatString(input), Is.EqualTo(expectedOutput));
}
//codesnippet:93458590-3182-11de-8c30-0800200c9a66
public static String FormatString(IEnumerable<String> input)
{
if (input == null)
return "{}";
using (var iterator = input.GetEnumerator())
{
// Guard-clause for empty source
if (!iterator.MoveNext())
return "{}";
// Take care of first value
var output = new StringBuilder();
output.Append('{').Append(iterator.Current);
// Grab next
if (iterator.MoveNext())
{
// Grab the next value, but don't process it
// we don't know whether to use comma or "and"
// until we've grabbed the next after it as well
String nextValue = iterator.Current;
while (iterator.MoveNext())
{
output.Append(", ");
output.Append(nextValue);
nextValue = iterator.Current;
}
output.Append(" and ");
output.Append(nextValue);
}
output.Append('}');
return output.ToString();
}
}
}
}
How about skipping complicated aggregation code and just cleaning up the string after you build it?
public static string CommaQuibbling(IEnumerable<string> items)
{
var aggregate = items.Aggregate<string, StringBuilder>(
new StringBuilder(),
(b,s) => b.AppendFormat(", {0}", s));
var trimmed = Regex.Replace(aggregate.ToString(), "^, ", string.Empty);
return string.Format(
"{{{0}}}",
Regex.Replace(trimmed,
", (?<last>[^,]*)$", #" and ${last}"));
}
UPDATED: This won't work with strings with commas, as pointed out in the comments. I tried some other variations, but without definite rules about what the strings can contain, I'm going to have real problems matching any possible last item with a regular expression, which makes this a nice lesson for me on their limitations.
I quite liked Jon's answer, but that's because it's much like how I approached the problem. Rather than specifically coding in the two variables, I implemented them inside of a FIFO queue.
It's strange because I just assumed that there would be 15 posts that all did exactly the same thing, but it looks like we were the only two to do it that way. Oh, looking at these answers, Marc Gravell's answer is quite close to the approach we used as well, but he's using two 'loops', rather than holding on to values.
But all those answers with LINQ and regex and joining arrays just seem like crazy-talk! :-)
I don't think that using a good old array is a restriction. Here is my version using an array and an extension method:
public static string CommaQuibbling(IEnumerable<string> list)
{
string[] array = list.ToArray();
if (array.Length == 0) return string.Empty.PutCurlyBraces();
if (array.Length == 1) return array[0].PutCurlyBraces();
string allExceptLast = string.Join(", ", array, 0, array.Length - 1);
string theLast = array[array.Length - 1];
return string.Format("{0} and {1}", allExceptLast, theLast)
.PutCurlyBraces();
}
public static string PutCurlyBraces(this string str)
{
return "{" + str + "}";
}
I am using an array because of the string.Join method and because if the possibility of accessing the last element via an index. The extension method is here because of DRY.
I think that the performance penalities come from the list.ToArray() and string.Join calls, but all in one I hope that piece of code is pleasent to read and maintain.
I think Linq provides fairly readable code. This version handles a million "ABC" in .89 seconds:
using System.Collections.Generic;
using System.Linq;
namespace CommaQuibbling
{
internal class Translator
{
public string Translate(IEnumerable<string> items)
{
return "{" + Join(items) + "}";
}
private static string Join(IEnumerable<string> items)
{
var leadingItems = LeadingItemsFrom(items);
var lastItem = LastItemFrom(items);
return JoinLeading(leadingItems) + lastItem;
}
private static IEnumerable<string> LeadingItemsFrom(IEnumerable<string> items)
{
return items.Reverse().Skip(1).Reverse();
}
private static string LastItemFrom(IEnumerable<string> items)
{
return items.LastOrDefault();
}
private static string JoinLeading(IEnumerable<string> items)
{
if (items.Any() == false) return "";
return string.Join(", ", items.ToArray()) + " and ";
}
}
}
You can use a foreach, without LINQ, delegates, closures, lists or arrays, and still have understandable code. Use a bool and a string, like so:
public static string CommaQuibbling(IEnumerable items)
{
StringBuilder sb = new StringBuilder("{");
bool empty = true;
string prev = null;
foreach (string s in items)
{
if (prev!=null)
{
if (!empty) sb.Append(", ");
else empty = false;
sb.Append(prev);
}
prev = s;
}
if (prev!=null)
{
if (!empty) sb.Append(" and ");
sb.Append(prev);
}
return sb.Append('}').ToString();
}
public static string CommaQuibbling(IEnumerable<string> items)
{
var itemArray = items.ToArray();
var commaSeparated = String.Join(", ", itemArray, 0, Math.Max(itemArray.Length - 1, 0));
if (commaSeparated.Length > 0) commaSeparated += " and ";
return "{" + commaSeparated + itemArray.LastOrDefault() + "}";
}
Here's my submission. Modified the signature a bit to make it more generic. Using .NET 4 features (String.Join() using IEnumerable<T>), otherwise works with .NET 3.5. Goal was to use LINQ with drastically simplified logic.
static string CommaQuibbling<T>(IEnumerable<T> items)
{
int count = items.Count();
var quibbled = items.Select((Item, index) => new { Item, Group = (count - index - 2) > 0})
.GroupBy(item => item.Group, item => item.Item)
.Select(g => g.Key
? String.Join(", ", g)
: String.Join(" and ", g));
return "{" + String.Join(", ", quibbled) + "}";
}
There's a couple non-C# answers, and the original post did ask for answers in any language, so I thought I'd show another way to do it that none of the C# programmers seems to have touched upon: a DSL!
(defun quibble-comma (words)
(format nil "~{~#[~;~a~;~a and ~a~:;~#{~a~#[~; and ~:;, ~]~}~]~}" words))
The astute will note that Common Lisp doesn't really have an IEnumerable<T> built-in, and hence FORMAT here will only work on a proper list. But if you made an IEnumerable, you certainly could extend FORMAT to work on that, as well. (Does Clojure have this?)
Also, anyone reading this who has taste (including Lisp programmers!) will probably be offended by the literal "~{~#[~;~a~;~a and ~a~:;~#{~a~#[~; and ~:;, ~]~}~]~}" there. I won't claim that FORMAT implements a good DSL, but I do believe that it is tremendously useful to have some powerful DSL for putting strings together. Regex is a powerful DSL for tearing strings apart, and string.Format is a DSL (kind of) for putting strings together but it's stupidly weak.
I think everybody writes these kind of things all the time. Why the heck isn't there some built-in universal tasteful DSL for this yet? I think the closest we have is "Perl", maybe.
Just for fun, using the new Zip extension method from C# 4.0:
private static string CommaQuibbling(IEnumerable<string> list)
{
IEnumerable<string> separators = GetSeparators(list.Count());
var finalList = list.Zip(separators, (w, s) => w + s);
return string.Concat("{", string.Join(string.Empty, finalList), "}");
}
private static IEnumerable<string> GetSeparators(int itemCount)
{
while (itemCount-- > 2)
yield return ", ";
if (itemCount == 1)
yield return " and ";
yield return string.Empty;
}
return String.Concat(
"{",
input.Length > 2 ?
String.Concat(
String.Join(", ", input.Take(input.Length - 1)),
" and ",
input.Last()) :
String.Join(" and ", input),
"}");
I have tried using foreach. Please let me know your opinions.
private static string CommaQuibble(IEnumerable<string> input)
{
var val = string.Concat(input.Process(
p => p,
p => string.Format(" and {0}", p),
p => string.Format(", {0}", p)));
return string.Format("{{{0}}}", val);
}
public static IEnumerable<T> Process<T>(this IEnumerable<T> input,
Func<T, T> firstItemFunc,
Func<T, T> lastItemFunc,
Func<T, T> otherItemFunc)
{
//break on empty sequence
if (!input.Any()) yield break;
//return first elem
var first = input.First();
yield return firstItemFunc(first);
//break if there was only one elem
var rest = input.Skip(1);
if (!rest.Any()) yield break;
//start looping the rest of the elements
T prevItem = first;
bool isFirstIteration = true;
foreach (var item in rest)
{
if (isFirstIteration) isFirstIteration = false;
else
{
yield return otherItemFunc(prevItem);
}
prevItem = item;
}
//last element
yield return lastItemFunc(prevItem);
}
Here are a couple of solutions and testing code written in Perl based on the replies at http://blogs.perl.org/users/brian_d_foy/2013/10/comma-quibbling-in-perl.html.
#!/usr/bin/perl
use 5.14.0;
use warnings;
use strict;
use Test::More qw{no_plan};
sub comma_quibbling1 {
my (#words) = #_;
return "" unless #words;
return $words[0] if #words == 1;
return join(", ", #words[0 .. $#words - 1]) . " and $words[-1]";
}
sub comma_quibbling2 {
return "" unless #_;
my $last = pop #_;
return $last unless #_;
return join(", ", #_) . " and $last";
}
is comma_quibbling1(qw{}), "", "1-0";
is comma_quibbling1(qw{one}), "one", "1-1";
is comma_quibbling1(qw{one two}), "one and two", "1-2";
is comma_quibbling1(qw{one two three}), "one, two and three", "1-3";
is comma_quibbling1(qw{one two three four}), "one, two, three and four", "1-4";
is comma_quibbling2(qw{}), "", "2-0";
is comma_quibbling2(qw{one}), "one", "2-1";
is comma_quibbling2(qw{one two}), "one and two", "2-2";
is comma_quibbling2(qw{one two three}), "one, two and three", "2-3";
is comma_quibbling2(qw{one two three four}), "one, two, three and four", "2-4";
It hasn't quite been a decade since the last post so here's my variation:
public static string CommaQuibbling(IEnumerable<string> items)
{
var text = new StringBuilder();
string sep = null;
int last_pos = items.Count();
int next_pos = 1;
foreach(string item in items)
{
text.Append($"{sep}{item}");
sep = ++next_pos < last_pos ? ", " : " and ";
}
return $"{{{text}}}";
}
In one statement:
public static string CommaQuibbling(IEnumerable<string> inputList)
{
return
String.Concat("{",
String.Join(null, inputList
.Select((iw, i) =>
(i == (inputList.Count() - 1)) ? $"{iw}" :
(i == (inputList.Count() - 2) ? $"{iw} and " : $"{iw}, "))
.ToArray()), "}");
}
In .NET Core we can leverage SkipLast and TakeLast.
public static string CommaQuibblify(IEnumerable<string> items)
{
var head = string.Join(", ", items.SkipLast(2).Append(""));
var tail = string.Join(" and ", items.TakeLast(2));
return '{' + head + tail + '}';
}
https://dotnetfiddle.net/X58qvZ