Why am I getting an IndexOutOfRangeException here? - c#

I can't figure out why I'm getting it in the Where clause below.
using System;
using System.Linq;
public static class Extensions
{
/// <summary>
/// Removes consecutive characters,
/// e.g. "aaabcc" --> "abc"
/// </summary>
public static void RemoveDuplicates(this string s)
{
var arr = s.ToCharArray()
.Where((i,c) => (i > 0) ? (c != s[i - 1]) : true)
.ToArray();
s = new string(arr);
}
}
public class Program
{
public static void Main()
{
var str = "aaabcc";
str.RemoveDuplicates();
Console.WriteLine(str);
}
}
Also, is there a way to make this slightly more efficient and compact while still using LINQ?

You have the wrong order of parameters here:
.Where((i, c) => (i > 0) ? (c != s[i - 1]) : true)
should become:
.Where((c, i) => (i > 0) ? (c != s[i - 1]) : true)

The error is the (i, c) parameter order in your Where.
You are using the following Enumerable extension (see MSDN):
public static IEnumerable<TSource> Where<TSource>(
this IEnumerable<TSource> source,
Func<TSource, int, bool> predicate)
Note that the index in the Func is the second parameter.
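As a minimal sketch of the corrected call, the extension below passes the element first and the index second. The class and method names are hypothetical, not from the question. It also returns the result: strings are immutable, so assigning to the parameter inside a void method leaves the caller's variable unchanged.

```csharp
using System;
using System.Linq;

public static class StringDedupExtensions
{
    // Hypothetical name. Returns a new string instead of reassigning the parameter,
    // because the caller would never see that reassignment.
    public static string RemoveConsecutiveDuplicates(this string s)
    {
        if (string.IsNullOrEmpty(s)) return s;

        // Indexed Where overload: (element, index), in that order.
        // Keep the first character and every character that differs from its predecessor.
        return new string(s.Where((c, i) => i == 0 || c != s[i - 1]).ToArray());
    }
}
```

Called as "aaabcc".RemoveConsecutiveDuplicates(), this yields "abc".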
I think the fastest method would not use LINQ, but a plain string extension (it needs using System.Text; for StringBuilder):
public static string RemoveDuplicates(this string s)
{
if (String.IsNullOrEmpty(s)) return String.Empty; // optional: return null
var resultBuilder = new StringBuilder(s.Length);
resultBuilder.Append(s[0]);
for (int i = 1; i < s.Length; ++i)
{
// Append a character only when it differs from its predecessor.
if (s[i] != s[i - 1])
resultBuilder.Append(s[i]);
}
return resultBuilder.ToString();
}
However, if you really want to use LINQ, for instance because you want to chain other LINQ operators, you can mimic the above behaviour as follows. The query itself stays lazily evaluated until it is turned into a string:
public static string RemoveDuplicates(this string s)
{
if (String.IsNullOrEmpty(s)) return String.Empty;
// Keep the first character, then every later character that differs from its predecessor.
return String.Concat(s.Take(1)
.Concat(s.Skip(1)
.Where((c, i) => c != s[i])));
}
Note that because of the check that the string is not null or empty, I am certain there is a first character.
Because of the Skip(1), index 0 in the Where predicate belongs to s[1]. In general, index i belongs to s[i + 1], so s[i] is the character just before it.

You probably need to read the docs before using a method or a particular overload:
Type: System.Func<TSource, Int32, Boolean>
A function to test each source element for a condition; the second
parameter of the function represents the index of the source element
So your code should be something like this:
.Where((item, index) => (index > 0) ? (item != s[index - 1]) : true)

You could use the built-in Distinct method instead of your own method for this purpose:
var str = "aaabcca";
var result = string.Join("",str.ToCharArray().Distinct());
Result:
"abc"
Edit: If you want to remove only sequential duplicates, you could try this code instead:
var removesequential = string.Join("",str.Where((c, i) => i == 0 || c != str[i - 1]));
Result:
"abca"

Related

Looking to check for one's or zeros in a Bitstring

Does anyone have an idea how to check for ones or zeros in a Bitstring? The code below checks for ones and zeros in a string, but I would like to add an extension method that does the same thing. This way, I can call the method on the string itself without having to evaluate it separately first.
Currently, I have to do the check before I enter the Bitstring constructor.
string MustBeBitsInStringOnesOrZeros = "11001";
bool boTesting = Is1Or0(MustBeBitsInStringOnesOrZeros);
// I would like to add an extension to check for ones and zeros
// Example: MustBeBitsInStringOnesOrZeros.Is1Or0();
if (boTesting == true)
{
Bitstring a = new Bitstring(MustBeBitsInStringOnesOrZeros);
}
else
{
string b = MustBeBitsInStringOnesOrZeros;
}
private static bool Is1Or0(string stringBit)
{
// This function check each
// character in a string for "1"
// or "0".
bool results = false;
for (int i = 0; i < stringBit.Length; i++)
{
var x = stringBit[i];
if (x == '1' || x == '0')
{
results = true;
}
else
{
results = false;
break;
}
}
return results;
}
===
Modified to show results of Bassie's example from a sealed class.
Bassie,
Well, what I was trying to say was that I cannot place the method in the sealed class with the keyword 'this' in the method. So I created another class, but I have to use it a different way, and I wanted to use it the way you call it.
//I have to use it this way:
Bitstring OnesAndZeroCheck = new Bitstring(); // Bitstring is in a sealed class
Boolean g = OnesAndZeroCheck.IsBitstring2("1100111100011100101010101010101101010101010"); // Is in the sealed class
//but want to call it this way:
var successInput = "1101";
successInput.Is1Or0(); // true
If I understand you correctly, you could define your extension method like this:
public static class StringExtensions
{
public static bool Is1Or0(this string stringBit)
=> stringBit.All(c => c == '1' || c == '0');
}
And call with
var successInput = "1101";
successInput.Is1Or0(); // true
var failureInput = "1121";
failureInput.Is1Or0(); // false
From MSDN Enumerable.All:
Determines whether all elements of a sequence satisfy a condition.
This works because a string implements IEnumerable<char>, so when we call the Enumerable.All() extension method, the condition is checked against each individual char in the string.
Note that you will need to add using System.Linq; to the file that contains the extension method.
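One detail worth checking: Enumerable.All returns true for an empty sequence, so the extension above accepts "" (the loop in the question returns false for it) and throws on null. If those inputs should count as invalid, a variant might look like the sketch below. The class and method names are hypothetical.

```csharp
using System.Linq;

public static class BitStringExtensions
{
    // Hypothetical variant: null and empty strings are rejected explicitly,
    // because All() alone would report true for an empty string.
    public static bool IsNonEmptyBitString(this string value)
        => !string.IsNullOrEmpty(value) && value.All(c => c == '0' || c == '1');
}
```

Whether an empty string should count as a valid bitstring depends on what the Bitstring constructor accepts, so treat this as one possible choice.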
Tested in this video:
https://youtu.be/CgMFYctc3Ak
public static bool isBitstring(string s)
{
foreach (char c in s)
{
if (!(c >= '0' && c <= '1')) {
return false;
}
}
return true;
}
string str = "100000011100101010010101";
if (isBitstring(str))
{
Console.WriteLine("is Bitstring");
}
else
{
Console.WriteLine("is not Bitstring");
}

Check if string contains characters in certain order in C#

I have code that's working right now, but it doesn't check whether the characters are in order; it only checks that they're there. How can I modify my code so that the characters 'gaoaf' are checked in that order in the string?
Console.WriteLine("5.feladat");
StreamWriter sw = new StreamWriter("keres.txt");
sw.WriteLine("gaoaf");
string s = "";
for (int i = 0; i < n; i++)
{
s = zadatok[i].nev+zadatok[i].cim;
if (s.Contains("g") && s.Contains("a") && s.Contains("o") && s.Contains("a") && s.Contains("f") )
{
sw.WriteLine(i);
sw.WriteLine(zadatok[i].nev + zadatok[i].cim);
}
}
sw.Close();
You can convert the letters into a pattern and use Regex:
var letters = "gaoaf";
var pattern = String.Join(".*",letters.AsEnumerable());
var hasletters = Regex.IsMatch(s, pattern, RegexOptions.IgnoreCase);
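If the letters could ever include regex metacharacters such as '.' or '+', the joined pattern would change meaning. A hedged sketch of the same idea with each character escaped first follows. The class and method names are hypothetical, and RegexOptions.Singleline is an added assumption so that ".*" can also span line breaks.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class OrderedLetterMatcher
{
    // Hypothetical helper: true when every character of 'letters' occurs in 'text'
    // in the given order, ignoring case.
    public static bool ContainsInOrder(string text, string letters)
    {
        // Escape each character so that '.', '*', '+' and similar are matched literally.
        var pattern = string.Join(".*", letters.Select(c => Regex.Escape(c.ToString())));
        return Regex.IsMatch(text, pattern, RegexOptions.IgnoreCase | RegexOptions.Singleline);
    }
}
```

With this, ContainsInOrder("axb", "a.b") is false, whereas the unescaped pattern "a.*..*b" would match "axb".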
For those that needlessly avoid .*, you can also solve this with LINQ:
var ans = letters.Aggregate(0, (p, c) => p >= 0 ? s.IndexOf(c.ToString(), p, StringComparison.InvariantCultureIgnoreCase) : p) != -1;
If it is possible to have repeated adjacent letters, you need to complicate the LINQ solution slightly:
var ans = letters.Aggregate(0, (p, c) => {
if (p >= 0) {
var newp = s.IndexOf(c.ToString(), p, StringComparison.InvariantCultureIgnoreCase);
return newp >= 0 ? newp+1 : newp;
}
else
return p;
}) != -1;
Given the (ugly) machinations required to basically terminate Aggregate early, and given the (ugly and inefficient) syntax required to use an inline anonymous expression call to get rid of the temporary newp, I created some extensions to help, an Aggregate that can terminate early:
public static TAccum AggregateWhile<TAccum, T>(this IEnumerable<T> src, TAccum seed, Func<TAccum, T, TAccum> accumFn, Predicate<TAccum> whileFn) {
using (var e = src.GetEnumerator()) {
if (!e.MoveNext())
throw new Exception("At least one element required by AggregateWhile");
var ans = accumFn(seed, e.Current);
while (whileFn(ans) && e.MoveNext())
ans = accumFn(ans, e.Current);
return ans;
}
}
Now you can solve the problem fairly easily:
var ans2 = letters.AggregateWhile(-1,
(p, c) => s.IndexOf(c.ToString(), p+1, StringComparison.InvariantCultureIgnoreCase),
p => p >= 0
) != -1;
Why not something like this?
static bool CheckInOrder(string source, string charsToCheck)
{
int index = -1;
foreach (var c in charsToCheck)
{
index = source.IndexOf(c, index + 1);
if (index == -1)
return false;
}
return true;
}
Then you can use the function like this:
bool result = CheckInOrder("this is my source string", "gaoaf");
This should work because IndexOf returns -1 if the character isn't found, and each search only starts scanning AFTER the previous match.

How to convert a multiple rank array using ConvertAll()?

I want to use ConvertAll like this:
var sou = new[,] { { true, false, false }, { true, true, true } };
var tar = Array.ConvertAll<bool, int>(sou, x => (x ? 1 : 0));
but I got a compiler error:
cannot implicitly convert type bool[,] to bool[]
You could write a straightforward conversion extension:
public static class ArrayExtensions
{
public static TResult[,] ConvertAll<TSource, TResult>(this TSource[,] source, Func<TSource, TResult> projection)
{
if (source == null)
throw new ArgumentNullException("source");
if (projection == null)
throw new ArgumentNullException("projection");
var result = new TResult[source.GetLength(0), source.GetLength(1)];
for (int x = 0; x < source.GetLength(0); x++)
for (int y = 0; y < source.GetLength(1); y++)
result[x, y] = projection(source[x, y]);
return result;
}
}
Sample usage would look like this:
var tar = sou.ConvertAll(x => x ? 1 : 0);
The downside is that if you wanted to do any other transforms besides projection, you would be in a pickle.
Alternatively, if you want to be able to use LINQ operators on the sequence, you can do that easily with regular LINQ methods. However, you would still need a custom implementation to turn the sequence back into a 2D array:
public static T[,] To2DArray<T>(this IEnumerable<T> source, int rows, int columns)
{
if (source == null)
throw new ArgumentNullException("source");
if (rows < 0 || columns < 0)
throw new ArgumentException("rows and columns must be non-negative integers.");
var result = new T[rows, columns];
if (columns == 0 || rows == 0)
return result;
int column = 0, row = 0;
foreach (T element in source)
{
if (column >= columns)
{
column = 0;
if (++row >= rows)
throw new InvalidOperationException("Sequence elements do not fit the array.");
}
result[row, column++] = element;
}
return result;
}
This would allow a great deal more flexibility as you can operate on your source array as an IEnumerable{T} sequence.
Sample usage:
var tar = sou.Cast<bool>().Select(x => x ? 1 : 0).To2DArray(sou.GetLength(0), sou.GetLength(1));
Note that the initial cast is required to transform the sequence from IEnumerable paradigm to IEnumerable<T> paradigm since a multidimensional array does not implement the generic IEnumerable<T> interface. Most of the LINQ transforms only work on that.
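If your array is of unknown rank, you can use this extension method (which depends on the MoreLinq NuGet package). I'm sure it can be optimized a lot, but it works for me. Note that it writes the converted values back into the same array, so the converter's output must be storable in the array's element type.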
using MoreLinq;
using System;
using System.Collections.Generic;
using System.Linq;
public static class ArrayExtensions
{
public static Array ConvertAll<TOutput>(this Array array, Converter<object, TOutput> converter)
{
foreach (int[] indices in GenerateIndices(array))
{
array.SetValue(converter.Invoke(array.GetValue(indices)), indices);
}
return array;
}
private static IEnumerable<int[]> GenerateCartesianProductOfUpperBounds(IEnumerable<int> upperBounds, IEnumerable<int[]> existingCartesianProduct)
{
if (!upperBounds.Any())
return existingCartesianProduct;
var slice = upperBounds.Slice(0, upperBounds.Count() - 1);
var rangeOfIndices = Enumerable.Range(0, upperBounds.Last() + 1);
IEnumerable<int[]> newCartesianProduct;
if (existingCartesianProduct.Any())
newCartesianProduct = rangeOfIndices.Cartesian(existingCartesianProduct, (i, p1) => new[] { i }.Concat(p1).ToArray()).ToArray();
else
newCartesianProduct = rangeOfIndices.Select(i => new int[] { i }).ToArray();
return GenerateCartesianProductOfUpperBounds(slice, newCartesianProduct);
}
private static IEnumerable<int[]> GenerateIndices(Array array)
{
var upperBounds = Enumerable.Range(0, array.Rank).Select(r => array.GetUpperBound(r));
return GenerateCartesianProductOfUpperBounds(upperBounds, Array.Empty<int[]>());
}
}
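For comparison, here is a sketch of an arbitrary-rank conversion that needs no extra package and builds a new array of the target type instead of overwriting the source. The class and method names are hypothetical. It walks the indices like an odometer and uses only Array.CreateInstance, GetValue and SetValue.

```csharp
using System;

public static class ArrayRankExtensions
{
    // Hypothetical helper: converts every element of an array of any rank
    // into a new array of TOutput with the same shape.
    public static Array ConvertAllRanks<TInput, TOutput>(
        this Array source, Converter<TInput, TOutput> converter)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (converter == null) throw new ArgumentNullException(nameof(converter));

        int rank = source.Rank;
        var lengths = new int[rank];
        var lowerBounds = new int[rank];
        for (int d = 0; d < rank; d++)
        {
            lengths[d] = source.GetLength(d);
            lowerBounds[d] = source.GetLowerBound(d);
        }

        var result = Array.CreateInstance(typeof(TOutput), lengths, lowerBounds);
        if (source.Length == 0) return result;

        // Start at the lowest index in every dimension.
        var indices = (int[])lowerBounds.Clone();
        while (true)
        {
            result.SetValue(converter((TInput)source.GetValue(indices)), indices);

            // Advance the last dimension; on overflow reset it and carry to the previous one.
            int dim = rank - 1;
            while (dim >= 0 && ++indices[dim] > source.GetUpperBound(dim))
            {
                indices[dim] = lowerBounds[dim];
                dim--;
            }
            if (dim < 0) break; // every dimension has wrapped around: done
        }
        return result;
    }
}
```

For the array in the question, usage could look like var tar = (int[,])sou.ConvertAllRanks<bool, int>(x => x ? 1 : 0);. The type arguments have to be given explicitly because they cannot be inferred from the lambda.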

Linq: X objects in a row

I need help with a linq query that will return true if the list contains x objects in a row when the list is ordered by date.
so like this:
myList.InARow(x => x.Correct, 3)
would return true if there are 3 in a row with the property correct == true.
Not sure how to do this.
Using a GroupAdjacent extension, you can do:
var hasThreeConsecutiveCorrect
= myList.GroupAdjacent(item => item.Correct)
.Any(group => group.Key && group.Count() >= 3);
Here's another way with a Rollup extension (a cross between Select and Aggregate) that's somewhat more space-efficient:
var hasThreeConsecutiveCorrect
= myList.Rollup(0, (item, sum) => item.Correct ? (sum + 1) : 0)
.Contains(3);
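The Rollup extension is not shown in this answer. Going by the description (a cross between Select and Aggregate), it could be sketched as below. This implementation is an assumption, not the original extension: it yields the running accumulator after each element.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RollupExtensions
{
    // Assumed shape of Rollup: like Aggregate, but yields every intermediate
    // accumulator value instead of only the final one.
    public static IEnumerable<TResult> Rollup<TSource, TResult>(
        this IEnumerable<TSource> source,
        TResult seed,
        Func<TSource, TResult, TResult> projection)
    {
        TResult state = seed;
        foreach (TSource item in source)
        {
            state = projection(item, state);
            yield return state;
        }
    }
}
```

With this definition, the run lengths for the flags true, true, false, true, true, true are 1, 2, 0, 1, 2, 3, so Contains(3) reports a run of three. Because the sequence is lazy, Contains stops enumerating as soon as it sees a 3.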
There is nothing built into linq that handles this case easily. But it is a relatively simple matter to create your own extension method.
public static class EnumerableExtensions {
public static bool InARow<T>(this IEnumerable<T> list,
Predicate<T> filter, int length) {
int run = 0;
foreach (T element in list) {
if (filter(element)) {
if (++run >= length) return true;
}
else {
run = 0;
}
}
return false;
}
}
Updated:
myList.Aggregate(0,
(result, x) => (result >= 3) ? result : (x.Correct ? result + 1 : 0),
result => result >= 3);
Generalized version:
myList.Aggregate(0,
(result, x) => (result >= length) ? result : (filter(x) ? result + 1 : 0),
result => result >= length);

Improve performance of sorting files by extension

Given an array of file names, the simplest way to sort it by file extension is like this:
Array.Sort(fileNames,
(x, y) => Path.GetExtension(x).CompareTo(Path.GetExtension(y)));
The problem is that on a very long list (~800k items) it takes very long to sort, while sorting by the whole file name is a couple of seconds faster!
In theory, there is a way to optimize it: instead of using Path.GetExtension() and comparing the newly created extension-only strings, we can provide a Comparison that compares the existing file name strings starting from LastIndexOf('.'), without creating new strings.
Now, suppose I have found the LastIndexOf('.'). I want to reuse .NET's native StringComparer and apply it only to the part of the string after the LastIndexOf('.'), to preserve all culture considerations. I haven't found a way to do that.
Any ideas?
Edit:
With tanascius's idea of using the char.CompareTo() method, I came up with my Uber-Fast-File-Extension-Comparer; it now sorts by extension 3x faster! It is even faster than all the methods that use Path.GetExtension() in some manner. What do you think?
Edit 2:
I found that this implementation does not take culture into account, since char.CompareTo() does not consider culture, so this is not a perfect solution.
Any ideas?
public static int CompareExtensions(string filePath1, string filePath2)
{
if (filePath1 == null && filePath2 == null)
{
return 0;
}
else if (filePath1 == null)
{
return -1;
}
else if (filePath2 == null)
{
return 1;
}
int i = filePath1.LastIndexOf('.');
int j = filePath2.LastIndexOf('.');
if (i == -1)
{
i = filePath1.Length;
}
else
{
i++;
}
if (j == -1)
{
j = filePath2.Length;
}
else
{
j++;
}
for (; i < filePath1.Length && j < filePath2.Length; i++, j++)
{
int compareResults = filePath1[i].CompareTo(filePath2[j]);
if (compareResults != 0)
{
return compareResults;
}
}
if (i >= filePath1.Length && j >= filePath2.Length)
{
return 0;
}
else if (i >= filePath1.Length)
{
return -1;
}
else
{
return 1;
}
}
Create a new array that contains each of the filenames in ext.restofpath format (or some sort of pair/tuple format that can default sort on the extension without further transformation). Sort that, then convert it back.
This is faster because instead of having to retrieve the extension many times for each element (since you're doing something like N log N compares), you only do it once (and then move it back once).
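One way to sketch this idea without building combined strings is the Array.Sort overload that sorts a keys array and reorders a parallel items array alongside it. The class and method names below are hypothetical, and the default comparer is an assumption. Each extension is computed exactly once.

```csharp
using System;
using System.IO;

public static class ExtensionSort
{
    // Hypothetical helper: sorts fileNames in place by extension,
    // calling Path.GetExtension once per file name.
    public static void SortByExtension(string[] fileNames, StringComparer comparer = null)
    {
        if (fileNames == null) throw new ArgumentNullException(nameof(fileNames));

        var keys = new string[fileNames.Length];
        for (int i = 0; i < fileNames.Length; i++)
            keys[i] = Path.GetExtension(fileNames[i]);

        // Sorts 'keys' and applies the same reordering to 'fileNames'.
        Array.Sort(keys, fileNames, comparer ?? StringComparer.CurrentCulture);
    }
}
```

Array.Sort is not a stable sort, so files sharing an extension may end up in any relative order. The cost is one extra array of extension strings.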
Not the most memory efficient but the fastest according to my tests:
SortedDictionary<string, List<string>> dic = new SortedDictionary<string, List<string>>();
foreach (string fileName in fileNames)
{
string extension = Path.GetExtension(fileName);
List<string> list;
if (!dic.TryGetValue(extension, out list))
{
list = new List<string>();
dic.Add(extension, list);
}
list.Add(fileName);
}
string[] arr = dic.Values.SelectMany(v => v).ToArray();
Did a mini benchmark on 800k randomly generated 8.3 filenames:
Sorting items with Linq to Objects... 00:00:04.4592595
Sorting items with SortedDictionary... 00:00:02.4405325
Sorting items with Array.Sort... 00:00:06.6464205
You can write a comparer that compares each character of the extension; char has a CompareTo() method, too.
Basically, you loop until at least one string has no more chars left or one CompareTo() call returns a value != 0.
EDIT: In response to the edits of the OP
The performance of your comparer method can be significantly improved. See the following code. Additionally I added the line
string.Compare( filePath1[i].ToString(), filePath2[j].ToString(),
m_CultureInfo, m_CompareOptions );
to enable the use of CultureInfo and CompareOptions. However this slows down everything compared to a version using a plain char.CompareTo() (about factor 2). But, according to my own SO question this seems to be the way to go.
public sealed class ExtensionComparer : IComparer<string>
{
private readonly CultureInfo m_CultureInfo;
private readonly CompareOptions m_CompareOptions;
public ExtensionComparer() : this( CultureInfo.CurrentUICulture, CompareOptions.None ) {}
public ExtensionComparer( CultureInfo cultureInfo, CompareOptions compareOptions )
{
m_CultureInfo = cultureInfo;
m_CompareOptions = compareOptions;
}
public int Compare( string filePath1, string filePath2 )
{
if( filePath1 == null || filePath2 == null )
{
if( filePath1 != null )
{
return 1;
}
if( filePath2 != null )
{
return -1;
}
return 0;
}
var i = filePath1.LastIndexOf( '.' ) + 1;
var j = filePath2.LastIndexOf( '.' ) + 1;
if( i == 0 || j == 0 )
{
if( i != 0 )
{
return 1;
}
return j != 0 ? -1 : 0;
}
while( true )
{
if( i == filePath1.Length || j == filePath2.Length )
{
if( i != filePath1.Length )
{
return 1;
}
return j != filePath2.Length ? -1 : 0;
}
var compareResults = string.Compare( filePath1[i].ToString(), filePath2[j].ToString(), m_CultureInfo, m_CompareOptions );
//var compareResults = filePath1[i].CompareTo( filePath2[j] );
if( compareResults != 0 )
{
return compareResults;
}
i++;
j++;
}
}
}
Usage:
fileNames1.Sort( new ExtensionComparer( CultureInfo.GetCultureInfo( "sv-SE" ),
CompareOptions.StringSort ) );
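A possible alternative to comparing one character at a time: CompareInfo.Compare has an overload that takes an offset and a length for each string, so the extension parts can be compared with culture rules without allocating substrings. The sketch below is untimed, and its class name is hypothetical. Like the code above, it treats everything after the last '.' as the extension.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public sealed class SubstringExtensionComparer : IComparer<string>
{
    private readonly CompareInfo _compareInfo;
    private readonly CompareOptions _options;

    public SubstringExtensionComparer(CultureInfo culture, CompareOptions options)
    {
        if (culture == null) throw new ArgumentNullException(nameof(culture));
        _compareInfo = culture.CompareInfo;
        _options = options;
    }

    public int Compare(string x, string y)
    {
        if (ReferenceEquals(x, y)) return 0;
        if (x == null) return -1;
        if (y == null) return 1;

        int i = ExtensionStart(x);
        int j = ExtensionStart(y);

        // Culture-aware comparison of the two extension ranges, no new strings created.
        return _compareInfo.Compare(x, i, x.Length - i, y, j, y.Length - j, _options);
    }

    // Index just after the last '.', or the string length when there is no '.'.
    private static int ExtensionStart(string path)
    {
        int dot = path.LastIndexOf('.');
        return dot < 0 ? path.Length : dot + 1;
    }
}
```

Comparing whole ranges rather than single characters also lets the culture's rules for multi-character sequences apply, which a per-character loop cannot do.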
The main problem here is that you are calling Path.GetExtension multiple times for each path. If this is doing a quicksort, then you could expect Path.GetExtension to be called anywhere from log(n) to n times for each item in the list, where n is the number of items in your list. So you are going to want to cache the calls to Path.GetExtension.
If you were using LINQ, I would suggest something like this:
filenames.Select(n => new {name=n, ext=Path.GetExtension(n)})
.OrderBy(t => t.ext).ToArray();
This ensures that Path.GetExtension is only called once for each filename.
