I've got a collection of words, and I want to create a new collection from it in which each word is limited to 5 characters.
Input:
Car
Collection
Limited
stackoverflow
Output:
car
colle
limit
stack
word.Substring(0,5) throws an exception when the word is shorter than 5 characters.
word.Take(10) is not a good idea, either...
Any good ideas?
LINQ to Objects for this scenario? You can do a select like this:
from w in words
select new
{
Word = (w.Length > 5) ? w.Substring(0, 5) : w
};
Essentially, ?: gets you around this issue.
var words = new [] { "Car", "Collection", "Limited", "stackoverflow" };
IEnumerable<string> cropped = words.Select(word =>
word[0..Math.Min(5, word.Length)]);
Ranges are available in C# 8, otherwise you'll need to do:
IEnumerable<string> cropped = words.Select(word =>
word.Substring(0, Math.Min(5, word.Length)));
Something you can do, is
string partialText = text.Substring(0, Math.Min(text.Length, 5));
I believe the kind of answer you were looking for would look like this:
var x = new string[] {"car", "Collection", "Limited", "stackoverflow" };
var output = x.Select(word => String.Join("", word.Take(5).ToList()));
The variable "output" contains the result:
car
Colle
Limit
stack
and the string "car" doesn't throw an exception.
But while Join and Take(5) work, it's generally much simpler to use, as was suggested in another answer,
subString = word.Substring(0,Math.Min(5,word.Length));
The latter code is more human-readable and lightweight, though there is definitely a slight coolness factor to using Linq on a string to take the first five characters, without having to check the length of the string.
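If the cropping is needed in more than one place, the Substring/Math.Min idea from the answers above could be wrapped in an extension method. This is only a sketch: the names StringCropExtensions, Truncate and CropAll are made up for illustration, and passing null or empty strings through unchanged is an assumption, not something the question asks for.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StringCropExtensions
{
    // Hypothetical helper: returns at most maxLength leading characters.
    // Assumption: null and empty strings are returned unchanged.
    public static string Truncate(this string value, int maxLength)
    {
        if (maxLength < 0) throw new ArgumentOutOfRangeException(nameof(maxLength));
        if (string.IsNullOrEmpty(value)) return value;
        return value.Length <= maxLength ? value : value.Substring(0, maxLength);
    }

    // Applies Truncate lazily to every word in the sequence.
    public static IEnumerable<string> CropAll(this IEnumerable<string> words, int maxLength)
    {
        return words.Select(w => w.Truncate(maxLength));
    }
}
```

With the input from the question, words.CropAll(5) would yield "Car", "Colle", "Limit", "stack" (the original casing is kept).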
List<string> words = new List<string> { "abc", "acb", "cba", "cb a", "abf", "sea", "aba", "so ap", "a b c" };
string allowedChars = "abc";
I want to find the strings in the list that contain invalid characters. I tried IndexOfAny(), but IndexOfAny() does not check white space. How do I find the strings that contain invalid characters?
The result I want: "cb a", "abf", "sea", "so ap", "a b c"
var invalid = words.Where(w => !w.All(allowedChars.Contains));
It checks for every word if there is at least one char in it which is not contained in allowedChars.
This is the same but a little bit more verbose:
var invalid = words.Where(w => w.Any(c => !allowedChars.Contains(c)));
Both stop enumerating a word as soon as they find the first char that is not contained.
As Juharr has noted, if there are many characters allowed you could make it more efficient by using a HashSet<char> instead of the String:
var allowedCharSet = new HashSet<char>(allowedChars);
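To show how the HashSet<char> could plug into the query, here is a minimal sketch. The class and method names (AllowedCharFilter, FindInvalid) are invented for the example; the filtering itself is the same All/Contains check as above, only with the set built once up front.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class AllowedCharFilter
{
    // Hypothetical helper: returns the words containing at least one
    // character that is not in allowedChars.
    public static List<string> FindInvalid(IEnumerable<string> words, string allowedChars)
    {
        // Build the set once, so each character lookup is O(1) on average
        // instead of a linear scan of the allowedChars string.
        var allowedCharSet = new HashSet<char>(allowedChars);

        return words
            .Where(w => !w.All(c => allowedCharSet.Contains(c)))
            .ToList();
    }
}
```

For the list in the question and allowedChars = "abc", this returns "cb a", "abf", "sea", "so ap", "a b c", because the space character is not in the allowed set either.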
Returns all words that contain any other characters than allowedChars:
var result = words.Where(w => w.Except(allowedChars).Any());
While this gives the desired result, it was pointed out that this will, by calling Except, create a new Set for allowedChars for every word in the list. This might not give the best performance.
What I'm trying to do is split a string backwards, meaning right to left.
string startingString = "<span class=\"address\">Hoopeston,, IL 60942</span><br>";
What I would do normally is this.
string[] splitStarting = startingString.Split('>');
so my splitStarting[1] would = "Hoopeston,, IL 60942</span"
then I would do
string[] splitAgain = splitStarting[1].Split('<');
so splitAgain[0] would = "Hoopeston,, IL 60942"
Now this is what I want to do, I want to split by ' ' (a space) reversed for the last 2 instances of ' '.
For example my array would come back like so:
[0]="60942"
[1]="IL"
[2] = "Hoopeston,,"
To make this even harder I only ever want the first two reverse splits, so normally I would do something like this
string[] splitCityZip = splitAgain[0].Split(new char[] { ' ' }, 3);
but how would you do that backwards? The reason is that the city could have a two-word name, so an extra ' ' would break the city name.
Regular expression with named groups to make things so much simpler. No need to reverse strings. Just pluck out what you want.
var pattern = @">(?<city>.*) (?<state>.*) (?<zip>.*?)<";
var expression = new Regex(pattern);
Match m = expression.Match(startingString);
if (m.Success) {
Console.WriteLine("Zip: " + m.Groups["zip"].Value);
Console.WriteLine("State: " + m.Groups["state"].Value);
Console.WriteLine("City: " + m.Groups["city"].Value);
}
Should give the following results:
Found 1 match:
1. >Las Vegas,, IL 60942< has 3 groups:
1. Las Vegas,, (city)
2. IL (state)
3. 60942 (zip)
String literals for use in programs:
C#
@">(?<city>.*) (?<state>.*) (?<zip>.*?)<"
One possible solution - not optimal but easy to code - is to reverse the string, then to split that string using the "normal" function, then to reverse each of the individual split parts.
Another possible solution is to use regular expressions instead.
I think you should do it like this:
var s = splitAgain[0];
var zipCodeStart = s.LastIndexOf(' ');
var zipCode = s.Substring(zipCodeStart + 1);
s = s.Substring(0, zipCodeStart);
var stateStart = s.LastIndexOf(' ');
var state = s.Substring(stateStart + 1);
var city = s.Substring(0, stateStart );
var result = new [] {zipCode, state, city};
Result will contain what you requested.
If Split could do everything there would be so many overloads that it would become confusing.
Don't use split, just custom code it with substrings and lastIndexOf.
string str = "Hoopeston,, IL 60942";
string[] parts = new string[3];
int place = str.LastIndexOf(' ');
parts[0] = str.Substring(place+1);
int place2 = str.LastIndexOf(' ',place-1);
parts[1] = str.Substring(place2 + 1, place - place2 -1);
parts[2] = str.Substring(0, place2);
You can use a regular expression to get the three parts of the string inside the tag, and use LINQ extensions to get the strings in the right order.
Example:
string startingString = "<span class=\"address\">East St Louis,, IL 60942</span><br>";
string[] city =
Regex.Match(startingString, @"^.+>(.+) (\S+) (\S+?)<.+$")
.Groups.Cast<Group>().Skip(1)
.Select(g => g.Value)
.Reverse().ToArray();
Console.WriteLine(city[0]);
Console.WriteLine(city[1]);
Console.WriteLine(city[2]);
Output:
60942
IL
East St Louis,,
How about
using System.Linq;
...
splitAgain[0].Split(' ').Reverse().ToArray()
-edit-
OK, I missed the last part about multi-word cities; you can still use LINQ though:
splitAgain[0].Split(' ').Reverse().Take(2).ToArray()
would get you the
[0]="60942"
[1]="IL"
The city would not be included here though, you could still do the whole thing in one statement but it would be a little messy:
var elements = splitAgain[0].Split(' ');
var result = elements
.Reverse()
.Take(2)
.Concat( new[ ] { String.Join( " " , elements.Take( elements.Length - 2 ).ToArray( ) ) } )
.ToArray();
So we're
Splitting the string,
Reversing it,
Taking the two first elements (the last two originally)
Then we make a new array with a single string element, and make that string from the original array of elements minus the last 2 elements (state and zip code)
As I said, a little messy, but it will get you the array you want. If you don't need it to be an array of that format, you could obviously simplify the above code a little bit.
you could also do:
var result = new[ ]{
elements[elements.Length - 1], //last element
elements[elements.Length - 2], //second to last
String.Join( " " , elements.Take( elements.Length - 2 ).ToArray( ) ) //rebuild original string - 2 last elements
};
At first I thought you should use Array.Reverse() method, but I see now that it is the splitting on the ' ' (space) that is the issue.
Your first value could have a space in it (e.g. "New York"), so you don't want to split on every space.
If you know the string is only ever going to have 3 values in it, then you could use String.LastIndexOf(" ") and then use String.Substring() to trim that off and then do the same again to find the middle value and then you will be left with the first value, with or without spaces.
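The LastIndexOf/Substring approach described here can be generalised into a small "split from the right" helper. The following is a sketch under assumptions: the names RightSplit and SplitFromRight are made up, and the meaning of maxParts (mirroring the count argument of String.Split, but counted from the right) is a design choice of this example rather than anything built into .NET.

```csharp
using System;
using System.Collections.Generic;

public static class RightSplit
{
    // Hypothetical helper: splits off at most (maxParts - 1) pieces from the
    // right-hand side; whatever is left over becomes the final element.
    public static string[] SplitFromRight(string input, char separator, int maxParts)
    {
        if (input == null) throw new ArgumentNullException(nameof(input));
        if (maxParts < 1) throw new ArgumentOutOfRangeException(nameof(maxParts));

        var parts = new List<string>();
        string rest = input;

        while (parts.Count < maxParts - 1)
        {
            int index = rest.LastIndexOf(separator);
            if (index < 0) break;                 // no separator left
            parts.Add(rest.Substring(index + 1)); // piece after the last separator
            rest = rest.Substring(0, index);      // keep working on the left part
        }

        parts.Add(rest); // remainder, which may itself contain separators
        return parts.ToArray();
    }
}
```

Calling RightSplit.SplitFromRight("East St Louis,, IL 60942", ' ', 3) gives "60942", "IL", "East St Louis,,", so a multi-word city stays in one piece.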
Was facing similar issue with audio FileName conventions.
Followed this way: String to Array conversion, reverse and split, and reverse each part back to normal.
char[] addressInCharArray = fullAddress.ToCharArray();
Array.Reverse(addressInCharArray);
string[] parts = (new string(addressInCharArray)).Split(new char[] { ' ' }, 3);
string[] subAddress = new string[parts.Length];
int j = 0;
foreach (string part in parts)
{
addressInCharArray = part.ToCharArray();
Array.Reverse(addressInCharArray);
subAddress[j++] = new string(addressInCharArray);
}
I have a requirement to contract a string such as...
"Would you consider becoming a robot? You would be provided with a free annual oil change."
...to something much shorter but yet still humanly identifiable (it will need to be found from a select list - my current solution has users entering an arbitrary title for the sole purpose of selection)
I would like to extract only the portion of the string which forms a question (if possible) and then somehow reduce it to something like
WouldConsiderBecomingRobot
Are there any grammatical algorithms out there that might help me with this? I'm thinking there might be something that allows me to pick out just verbs and nouns.
As this is just to act as a key it doesn't have to be perfect; I'm not seeking to trivialise the inherent complexity of the English language.
Probably too simplistic, but I might be tempted to start with a list of "filler words":
var fillers = new[]{"you","I","am","the","a","are"};
Then extract everything before a questionmark (using regex, string mashing, whatever you fancy), yielding you "Would you consider becoming a robot".
Then go through the string extracting every word considered a filler.
var sentence = "Would you consider becoming a robot";
var newSentence = String.Join("", sentence.Split(' ').Where(w => !fillers.Contains(w)).ToArray());
// newSentence is "Wouldconsiderbecomingrobot".
Pascal casing each word would result in your desired string - I'll leave that as an exercise for the reader.
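For anyone who wants a starting point for the Pascal casing step, here is one possible sketch. The names TitleKey and Build are made up, the filler list is simply the one from this answer, and matching fillers case-insensitively is an assumption that differs slightly from the case-sensitive Contains above.

```csharp
using System;
using System.Linq;

public static class TitleKey
{
    // Filler list copied from the answer above; extend as needed.
    private static readonly string[] Fillers = { "you", "I", "am", "the", "a", "are" };

    // Hypothetical helper: drops filler words and Pascal-cases the rest.
    public static string Build(string sentence)
    {
        var words = sentence
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            // Assumption: fillers are matched ignoring case.
            .Where(w => !Fillers.Contains(w, StringComparer.OrdinalIgnoreCase))
            // Upper-case the first character, keep the rest as typed.
            .Select(w => char.ToUpperInvariant(w[0]) + w.Substring(1));

        return string.Concat(words);
    }
}
```

TitleKey.Build("Would you consider becoming a robot") returns "WouldConsiderBecomingRobot".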
Create a popular social media website. When users want to join or post comments, prompt them to solve a captcha. The captcha will consist of matching your shortened versions of the long strings to their full versions. Your shortening algorithm will be based on a neural net or genetic algorithm which is trained from the captcha results.
You can also sell advertising on the website.
I ended up creating the following extension method which does work surprisingly well. Thanks to Joe Blow for his excellent and effective suggestions:
public static string Contract(this string e, int maxLength)
{
if(e == null) return e;
int questionMarkIndex = e.IndexOf('?');
if (questionMarkIndex == -1)
questionMarkIndex = e.Length - 1;
int lastPeriodIndex = e.LastIndexOf('.', questionMarkIndex, 0);
string question = e.Substring(lastPeriodIndex != -1 ? lastPeriodIndex : 0, questionMarkIndex + 1).Trim();
var punctuation =
new [] {",", ".", "!", ";", ":", "/", "...", "...,", "-,", "(", ")", "{", "}", "[", "]","'","\""};
question = punctuation.Aggregate(question, (current, t) => current.Replace(t, ""));
IDictionary<string, bool> words = question.Split(' ').ToDictionary(x => x, x => false);
string mash = string.Empty;
while (words.Any(x => !x.Value) && mash.Length < maxLength)
{
int maxWordLength = words.Where(x => !x.Value).Max(x => x.Key.Length);
var pair = words.Where(x => !x.Value).Last(x => x.Key.Length == maxWordLength);
words.Remove(pair);
words.Add(new KeyValuePair<string, bool>(pair.Key, true));
mash = string.Join("", words.Where(x => x.Value)
.Select(x => x.Key.Capitalize()) // Capitalize() is not a built-in string method; presumably a custom extension not shown here
.ToArray()
);
}
return mash;
}
This contracts the following to 15 chars:
This does not have any prereqs - write an essay...: PrereqsWriteEssay
You've selected a car: YouveSelectedCar
I don't think there is any algorithm that can identify if each word of a string is a noun, adjective or whatever. The only solution would be to use a custom dictionary : just create a list of words that can't be identified as verbs or nouns (I, you, they, them, his, hers, of, a, the etc.).
Then you just have to keep all the words before the question mark that are not in the list.
It is just a workaround, and as you said, it is not perfect.
Hope this helps !
Welcome to the wonderful world of natural language processing. If you want to identify nouns and verbs, you will need a part of speech tagger.
I have a string array or ArrayList that is passed to my program in C#. Here are some examples of what those strings contain:
"Spr 2009"
"Sum 2006"
"Fall 2010"
"Fall 2007"
I want to be able to sort this array by the year and then the season. Is there a way to write a sorting function that sorts by the year and then the season? I know it would be easier if they were separate but I can't help what is being given to me.
You need to write a method which will compare any two strings in the appropriate way, and then you can just convert that method into a Comparison<string> delegate to pass into Array.Sort:
public static int CompareStrings(string s1, string s2)
{
// TODO: Comparison logic :)
}
...
string[] strings = { ... };
Array.Sort(strings, CompareStrings);
You can do the same thing with a generic list, too:
List<string> strings = ...;
strings.Sort(CompareStrings);
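As one idea for the comparison logic left as a TODO, a sketch could look like the following. It assumes every string has the form "<season> <year>" with a single space, and that the seasons sort as Spr, Sum, Fall; the class name TermComparer and the season order are assumptions, and an unrecognised season would sort before the known ones because Array.IndexOf returns -1 for it.

```csharp
using System;

public static class TermComparer
{
    // Assumed chronological order of the season labels within a year.
    private static readonly string[] Seasons = { "Spr", "Sum", "Fall" };

    // Compares by year first, then by position in the Seasons array.
    public static int CompareStrings(string s1, string s2)
    {
        string[] p1 = s1.Split(' ');
        string[] p2 = s2.Split(' ');

        int byYear = int.Parse(p1[1]).CompareTo(int.Parse(p2[1]));
        if (byYear != 0) return byYear;

        int season1 = Array.IndexOf(Seasons, p1[0]);
        int season2 = Array.IndexOf(Seasons, p2[0]);
        return season1.CompareTo(season2);
    }
}
```

Passing TermComparer.CompareStrings to Array.Sort for the sample data orders it as "Sum 2006", "Fall 2007", "Spr 2009", "Fall 2010".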
You could split the strings by the space character, convert both parts to integers and then use LINQ:
string[] seasons = new[] { "Spr", "Sum", "Fall", "Winter" };
string[] args = new[] { "Spr 2009", "Sum 2006", "Fall 2010", "Fall 2007" };
var result = from arg in args
let parts = arg.Split(' ')
let year = int.Parse(parts[1])
let season = Array.IndexOf(seasons, parts[0])
orderby year ascending, season ascending
select new { year, season };
You could always separate them. Create name-value-value triplets and work with them like that. Use Left and Right string functions if the data is formatted consistently. Then you sort on the year part first, and then the season part. Although Jon's idea seems really good, this is one idea of what to put in that method.
I believe what you're looking for is the StringComparer class.
var strings = new string[] {"Spr 2009", "Sum 2006", "Fall 2010", "Fall 2007"};
var sorted = strings.OrderBy(s =>
{
var parts = s.Split(' ');
double result = double.Parse(parts[1]);
switch(parts[0])
{
case "Spr":
result += .25;
break;
case "Sum":
result += .5;
break;
case "Fall":
result += .75;
break;
}
return result;
});
I also considered Array.Sort, which might be a little faster, but you also mentioned that sometimes these are ArrayLists.
I have a collection like this
List<int> {1,15,17,8,3};
how to get a flat string like "1-15-17-8-3" through LINQ query?
thank you
something like...
string mystring = string.Join("-", yourlist.Select(o => o.ToString()).ToArray());
(Edit: Now it's tested, and works fine)
You can write an extension method and then call .ToString("-") on your IEnumerable object type as shown here:
int[] intArray = { 1, 2, 3 };
Console.WriteLine(intArray.ToString(","));
// output 1,2,3
List<string> list = new List<string>{"a","b","c"};
Console.WriteLine(list.ToString("|"));
// output a|b|c
Examples of extension method implementation are here:
http://coolthingoftheday.blogspot.com/2008/09/todelimitedstring-using-linq-and.html
http://www.codemeit.com/linq/c-array-delimited-tostring.html
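In case those links go stale, here is a minimal sketch of what such an extension method might look like. It is not taken from the linked posts; the class name DelimitedStringExtensions is made up, and it simply delegates to string.Join, which accepts any IEnumerable<T> from .NET 4 onwards.

```csharp
using System.Collections.Generic;

public static class DelimitedStringExtensions
{
    // Extension overload of ToString taking a delimiter. It does not clash
    // with object.ToString() because that method takes no arguments.
    public static string ToString<T>(this IEnumerable<T> source, string delimiter)
    {
        // string.Join calls ToString() on each element for us.
        return string.Join<T>(delimiter, source);
    }
}
```

With this in scope, new[] { 1, 15, 17, 8, 3 }.ToString("-") returns "1-15-17-8-3". Naming the method something like ToDelimitedString instead would avoid any confusion with the parameterless ToString().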
Use Enumerable.Aggregate like so:
var intList = new[] {1,15,17,8,3};
string result = intList.Aggregate(string.Empty, (str, nextInt) => str + nextInt + "-");
This is the standard "LINQy" way of doing it - what you're wanting is the aggregate. You would use the same concept if you were coding in another language, say Python, where you would use reduce().
EDIT:
That will get you "1-15-17-8-3-". You can lop off the last character to get what you're describing, and you can do that inside of Aggregate(), if you'd like:
string result = intList.Aggregate(string.Empty, (str, nextInt) => str + nextInt + "-", str => str.Substring(0, str.Length - 1));
The first argument is the seed, the second is function that will perform the aggregation, and the third argument is your selector - it allows you to make a final change to the aggregated value - as an example, your aggregate could be a numeric value and you want return the value as a formatted string.
HTH,
-Charles
StringBuilder sb = new StringBuilder();
foreach(int i in collection)
{
sb.Append(i.ToString() + "-");
}
string result = sb.Length > 0 ? sb.ToString(0, sb.Length - 1) : string.Empty; // drop the trailing "-"
Something like this perhaps (off the top of my head that is!).
The best answer is given by Tim J.
If, however, you wanted a pure LINQ solution then try something like this (much more typing, and much less readable than Tim J's answer):
string yourString = yourList.Aggregate
(
new StringBuilder(),
(sb, x) => sb.Append(x).Append("-"),
sb => (sb.Length > 0) ? sb.ToString(0, sb.Length - 1) : ""
);
(This is a variation on Charles's answer, but uses a StringBuilder rather than string concatenation.)