C# List manipulation

If I have
List<String> text
how can I create a sub-list of all contiguous elements within a specific range, e.g.
List<String> subList = /* all elements within text bar the first 2 */
Also, are there any other List manipulation tips & tricks that might be useful?

This will work even without LINQ:
List<String> subList = text.GetRange(2, text.Count - 2);

subList = text.Skip(2).ToList();
Skip(n) returns an IEnumerable<> with all elements except the first n.
When you really need a list after that, ToList() converts it back.
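The two answers above differ in one edge case worth seeing side by side. The following is a minimal sketch, not a definitive implementation; the class and method names (SubListDemo, SkipFirst, RangeAfter) are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper names, used only for this illustration.
public static class SubListDemo
{
    // LINQ version: yields an empty list when text has fewer than n items.
    public static List<string> SkipFirst(List<string> text, int n)
    {
        return text.Skip(n).ToList();
    }

    // GetRange version: throws when text has fewer than n items,
    // because the requested count (text.Count - n) is negative.
    public static List<string> RangeAfter(List<string> text, int n)
    {
        return text.GetRange(n, text.Count - n);
    }

    public static void Main()
    {
        var text = new List<string> { "a", "b", "c", "d" };
        Console.WriteLine(string.Join(",", SkipFirst(text, 2)));  // c,d
        Console.WriteLine(string.Join(",", RangeAfter(text, 2))); // c,d
    }
}
```

Both calls return a new list, so later changes to subList do not affect text. If the input may be shorter than the number of skipped items, the Skip version is the more forgiving of the two.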

If you're on .NET 3.5, then there are a lot of new and interesting extension methods available on List. Just check out the section 'Extension methods' here: http://msdn.microsoft.com/en-us/library/d9hw1as6.aspx

Related

Removing list item from another list

I have a list with some elements and I want to remove elements from another list. An item should be removed if its value Contains (not equals) the value from another list.
One of the ways is to do this:
var MyList = new List<string> { ... };
var ToRemove = new List<string> { ... };
MyList.RemoveAll(_ => ToRemove.Any(_.Contains));
It works...
but, I have a LOT of lists (>1 million) and since the ToRemove can be sorted, it would make sense to use that in order to speed the process.
It's easy to make a loop that does it, but is there a way to do this with the sorted collections?
Update:
On 20k iterations on a text with our forbidden list, I get this:
Forbidden list as List -> 00:00:07.1993364
Forbidden list as HashSet -> 00:00:07.9749997
It's consistent after multiple runs, so the hashset is slower
Since this removes strings that contain strings from another list, a HashSet wouldn't be much help. Actually, not much would, unless you were looking for exact full matches or maintaining an index of all substrings (expensive, and AFAIK only SQL Server does this semi-efficiently outside the big-data realm).
If all you cared about was whether an item starts with one of the entries in 'ToRemove', sorting could help. Sort 'MyList' and, for each string in 'ToRemove', use a custom binary search to find any string starting with that string. Then RemoveAt forward from that index until the strings no longer start with it, and walk backwards from the index removing until they no longer start with it.
Well, sorting ToRemove may be beneficial because of the O(log n) complexity of binary search (you will need to rewrite _ => ToRemove.Any(_.Contains)). Note that binary search only finds exact or prefix matches, not arbitrary substrings.
Using a HashSet<string> instead of a List<string> for ToRemove is much faster only when you need exact matches, because finding an element in a hash set (using Contains) is an O(1) operation. It does not speed up a substring check, which still has to test every entry, and this matches the timings in the update above.
Also, using LinkedList<string> for MyList can potentially be beneficial, since removing a single item from a linked list is generally faster than removing it from an array-based list, where the following elements have to be shifted.
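To make the exact-match versus substring distinction concrete, here is a small sketch. The sample data and the names RemoveContainingDemo and RemoveContaining are invented for the illustration, and it assumes ordinal (case-sensitive) matching.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, used only for this illustration.
public static class RemoveContainingDemo
{
    // Removes every item that contains any forbidden term as a substring.
    // Returns the number of removed items.
    public static int RemoveContaining(List<string> items, IEnumerable<string> forbidden)
    {
        // Copy the terms once so they are not re-enumerated for every item.
        string[] terms = forbidden.ToArray();
        return items.RemoveAll(item => terms.Any(term => item.Contains(term)));
    }

    public static void Main()
    {
        var myList = new List<string> { "green apple", "banana", "pineapple", "cherry" };
        var toRemove = new HashSet<string> { "apple", "kiwi" };

        // Hash lookup is an exact match: no element of myList equals "apple".
        Console.WriteLine(myList.Any(toRemove.Contains)); // False

        // The substring test still has to try every term against every item.
        RemoveContaining(myList, toRemove);
        Console.WriteLine(string.Join(",", myList)); // banana,cherry
    }
}
```

The hash set is only iterated here, never probed, which is why swapping List for HashSet does not change the cost of the substring check.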

Not able to sort by Tag.AlbumArtists in c#

I'm trying to sort a group of mp3 files by artist and then by album. In the course of doing this I have come across two statements; one that works and one that doesn't. I was under the impression that to do both of the sorts I want to do all I had to do was to use them one after the other, but I can't do that because one of them doesn't work. See my code below:
foreach (string file in files)
{
TagLib.File fi = TagLib.File.Create(file);
listOfFiles.Add(fi);
}
List<TagLib.File> sortedByBand = listOfFiles.OrderBy(o =>
o.Tag.AlbumArtists).ToList();
List<TagLib.File> sortedBy = listOfFiles.OrderBy(o =>
o.Tag.AlbumArtists).ToList();
The list "sortedByBand" and its accompanying sort results in the following message: Additional information: At least one object must implement IComparable.
Thanks in advance for any and all help rendered.
It looks like AlbumArtists is some sort of collection (like an array of strings), which can't be compared to other instances, so listOfFiles can't be sorted by it.
In this case, you'll have to convert AlbumArtists to something that can be compared with other instances. For example, a string:
List<TagLib.File> sortedByBand = listOfFiles.OrderBy(o =>
String.Join(";", o.Tag.AlbumArtists)).ToList();
This, of course, means that an MP3 with the artists {"Pink Floyd", "U2"} will sort before an MP3 with {"U2", "Pink Floyd"}. To avoid this, you'd want to sort the artist list itself before converting it to a single string. Hope this helps!
I think your approach might not give correct results. You should not sort twice; LINQ offers the ThenBy extension for a secondary sort key. You are getting that error because you are using an array of strings as the sort key. I am not sure if it is possible, but based on the other suggestion you can try:
// sort the array inside the tag.
listOfFiles.ForEach(x => Array.Sort(x.Tag.AlbumArtists));
// if this does not work, try the other suggestion
List<TagLib.File> sortedByBand = listOfFiles.OrderBy(o => o.Tag.AlbumArtists.FirstOrDefault()).ToList();
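Putting the suggestions together, sorting by artist and then by album could look roughly like the sketch below. To keep it runnable without TagLib, it uses a made-up Track class with the same two properties; Track, TrackSortDemo and ArtistKey are assumptions for illustration, and the code assumes AlbumArtists is never null.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the tag data; not part of TagLib.
public class Track
{
    public string[] AlbumArtists { get; set; }
    public string Album { get; set; }
}

public static class TrackSortDemo
{
    // Builds a comparable key: artists sorted, then joined into one string,
    // so {"U2", "Pink Floyd"} and {"Pink Floyd", "U2"} produce the same key.
    public static string ArtistKey(Track t)
    {
        return string.Join(";", t.AlbumArtists.OrderBy(a => a, StringComparer.Ordinal));
    }

    // One OrderBy for the primary key, ThenBy for the secondary key.
    public static List<Track> SortByArtistThenAlbum(IEnumerable<Track> tracks)
    {
        return tracks
            .OrderBy(t => ArtistKey(t), StringComparer.Ordinal)
            .ThenBy(t => t.Album, StringComparer.Ordinal)
            .ToList();
    }

    public static void Main()
    {
        var tracks = new List<Track>
        {
            new Track { AlbumArtists = new[] { "U2" }, Album = "War" },
            new Track { AlbumArtists = new[] { "Pink Floyd" }, Album = "Wish You Were Here" },
            new Track { AlbumArtists = new[] { "Pink Floyd" }, Album = "Animals" },
        };
        foreach (Track t in SortByArtistThenAlbum(tracks))
            Console.WriteLine(ArtistKey(t) + " / " + t.Album);
    }
}
```

Calling OrderBy twice in a row would discard the first ordering as the primary key; ThenBy keeps the artist order and only breaks ties by album.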

refresh array element

I am using an array in which I am saving values one by one.
Now I have to delete one of the elements and refresh the array at the same time.
For example:
string[] arr = new string[25];
arr[0]="A";
arr[1]="B";
arr[2]="C"; and so on....
Now, after deleting the second element via arr[1]=null;
I want the refreshed array to look like this:
arr[0]="A";
arr[1]="C"; and so on....
Please help.
Thanks in advance.
It sounds like you should be using a List<string> rather than an array; this would give exactly the functionality you are describing.
Although arrays can be resized (thanks #Austin Brunkhorst), this is not "cheap" and you would need to move everything around yourself.
It should be noted that with lots of inserts and removes, Lists can get very inefficient, so you'd be better off with a LinkedList<string>. These have advantages and disadvantages. Google linked list for more info.
When you have a fixed amount of data you should use an array, but when you have a dynamic amount of data you should use List<>.
If you want to resize an array, you have to create a new one and copy all elements from the old one to the new one.
arr = arr.Where(s => s != null).ToArray();
If you used a List<string>, you could use methods like List.Remove or List.RemoveAt.
If you'll be adding/deleting entries at arbitrary positions in your collection a lot, you'd be better off using a LinkedList<string> instead.
Instead of an array you can go with a List:
List<int> list = new List<int>();
list.Add(2);
list.Add(3);
list.Add(5);
list.Add(7);
You will get more options like
Contains
Exists
IndexOf
For removing items you get functions like
Remove
ex: dogs.Remove("bulldog"); // Remove bulldog
RemoveAt
ex: list.RemoveAt(1);
RemoveAll
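Applied to the "A", "B", "C" example from the question, the two approaches suggested above might look like this. It is only a sketch; RemoveShiftDemo and Compact are names invented for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, used only for this illustration.
public static class RemoveShiftDemo
{
    // Array approach: builds a new, shorter array without the null slots.
    public static string[] Compact(string[] arr)
    {
        return arr.Where(s => s != null).ToArray();
    }

    public static void Main()
    {
        // List approach: RemoveAt shifts the later items down by one.
        var list = new List<string> { "A", "B", "C" };
        list.RemoveAt(1);
        Console.WriteLine(list[1]); // C

        // Array approach: null out the slot, then compact.
        string[] arr = { "A", "B", "C" };
        arr[1] = null;
        arr = Compact(arr);
        Console.WriteLine(arr[1]);     // C
        Console.WriteLine(arr.Length); // 2
    }
}
```

The list version changes the existing collection in place, while the array version allocates a new array on every compaction.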

finding all lines in a list that contain x or y?

Can I do this without looping through the whole list?
List<string> responseLines = new List<string>();
The list is then filled with around 300 lines of text.
Next I want to search the list and create a second list of all lines that either start with "abc" or contain "xyz".
I know I can do a foreach, but is there a better / faster way?
You could use LINQ. This is no different performance-wise to using foreach -- that's pretty much what it does behind the scenes -- but you might prefer the syntax:
var query = responseLines.Where(s => s.StartsWith("abc") || s.Contains("xyz"))
.ToList();
(If you're happy dealing with an IEnumerable<string> rather than List<string> then you can omit the final ToList call.)
var newList = (from line in responseLines
where line.StartsWith("abc") || line.Contains("xyz")
select line).ToList();
Try this:
List<string> responseLines = new List<string>();
List<string> myLines = responseLines.Where(line => line.StartsWith("abc", StringComparison.InvariantCultureIgnoreCase) || line.Contains("xyz")).ToList();
The StartsWith and Contains checks short-circuit: Contains is only evaluated if StartsWith is not satisfied. This still iterates the whole list, and there is no way to avoid that if you want to check every line, but it saves you from typing a foreach.
Use LINQ:
List<string> list = responseLines.Where(x => x.StartsWith("abc") || x.Contains("xyz")).ToList();
Unless you need all the text for some reason, it would be quicker to inspect each line at the time when you were generating the List and discard the ones that don't match without ever adding them.
This depends on how the List is loaded as well - that code is not shown. This would be effective if you were reading from a text file since then you could just use your LINQ query to operate directly on the input data using File.ReadLines as the source instead of the final List<string>.
var query = File.ReadLines("input.txt")
.Where(s => s.StartsWith("abc") || s.Contains("xyz"))
.ToList();
LINQ works well as far as offering you improved syntax for this sort of thing (See LukeH's answer for a good example), but it isn't any faster than iterating over it by hand.
If you need to do this operation often, you might want to come up with some kind of indexed data structure that watches for all "abc" or "xyz" strings as they come into the list, and can thereby use a faster algorithm for serving them up when asked, rather than iterating through the whole list.
If you don't have to do it often, it's probably a "premature optimization."
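One simple way to read the "watch the strings as they come in" idea is to test each line once, at insertion time, and keep the matches in a second list. The sketch below assumes that all additions go through a single Add method; FilteredLines is a hypothetical class name, and the "abc"/"xyz" tests are hard-coded from the question.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper, used only for this illustration.
public class FilteredLines
{
    private readonly List<string> all = new List<string>();
    private readonly List<string> matches = new List<string>();

    // Each line is tested exactly once, when it is added.
    public void Add(string line)
    {
        all.Add(line);
        if (line.StartsWith("abc", StringComparison.Ordinal) || line.Contains("xyz"))
            matches.Add(line);
    }

    public IReadOnlyList<string> All { get { return all; } }

    // Already computed, so reading it needs no further scan.
    public IReadOnlyList<string> Matches { get { return matches; } }
}

public static class FilteredLinesDemo
{
    public static void Main()
    {
        var lines = new FilteredLines();
        lines.Add("abcdef");
        lines.Add("hello");
        lines.Add("has xyz inside");
        Console.WriteLine(lines.Matches.Count); // 2
    }
}
```

The total work is the same as one pass over the list; it is just moved to insertion time, which pays off only when the matches are requested repeatedly.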
Quite simply, no algorithm can guarantee that you will never have to iterate through every item in the list. However, sorting the list before you search can reduce the work for the prefix test: lines starting with "abc" end up next to each other, so a binary search can find them without a full scan. The "xyz" test can match anywhere in a line, so it still has to examine every line.
If it's not practical to have the list pre-sorted by the time you need to search through it, then the only way to improve the speed of your search is to use a different data structure than a list, for example a binary search tree.

Best way to compare two large string lists, using C# and LINQ?

I have a large list (~110,000 strings), which I need to compare to a similarly sized list.
List A comes from one system.
List B comes from a SQL table (I can only read, no stored procs, etc.)
What is the best way to find which values are in list A but no longer exist in list B?
Is 100,000 strings a large number to be handled in an array?
Thanks
So you have two lists like so:
List<string> listA;
List<string> listB;
Then use Enumerable.Except:
List<string> except = listA.Except(listB).ToList();
Note that if you want to, say, ignore case:
List<string> except = listA.Except(listB, StringComparer.OrdinalIgnoreCase).ToList();
You can replace the last parameter with an IEqualityComparer<string> of your choosing.
With LINQ:
var missing = listA.Except(listB).ToList();
Out of interest, do you HAVE to use List<string>? Because in .NET 3.5 SP1, you can use HashSet and its ExceptWith method. To my understanding, HashSets are specifically optimized for comparisons between two sets.
List<string> A = //get from file
List<string> B = //get from db
var C = A.Except(B);
Stealing from this question, it looks like you could use the Except<T>() method.
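For comparison, here is a small sketch of the HashSet route next to Except. MissingDemo and MissingFrom are names invented for the illustration, and ordinal comparison is an assumption. Both approaches return each missing value once, even if it appears several times in list A.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, used only for this illustration.
public static class MissingDemo
{
    // Returns the distinct values of listA that do not occur in listB.
    public static List<string> MissingFrom(IEnumerable<string> listA, IEnumerable<string> listB)
    {
        var missing = new HashSet<string>(listA, StringComparer.Ordinal);
        missing.ExceptWith(listB); // removes every value that is also in listB
        return missing.ToList();
    }

    public static void Main()
    {
        var listA = new List<string> { "x", "y", "y", "z" };
        var listB = new List<string> { "y" };

        // LINQ version from the answers above.
        Console.WriteLine(listA.Except(listB).Count()); // 2

        // HashSet version; sorted here because set order is not guaranteed.
        List<string> missing = MissingFrom(listA, listB);
        missing.Sort(StringComparer.Ordinal);
        Console.WriteLine(string.Join(",", missing)); // x,z
    }
}
```

Either way the comparison is hash-based, so lists of roughly 100,000 strings are compared in a single pass over each list rather than by checking every pair.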
