I'm trying to reverse a sentence like the following:
The input:
my name is john. i am 23 years old.
The output:
.old years 23 am i .john is name my
I can't figure out how to move the dot from the end of a word to its start.
I tried using Split, but it always returns the dot at the end of the word.
string[] words = sentence.Split(' ');
Array.Reverse(words);
return string.Join(" ", words);
Add extra logic to move the period ('.') to the start of the word, like this:
var sentence ="my name is john. i am 23 years old."; //Input string
string[] words = sentence.Split(' '); //Split sentence into words
Array.Reverse(words); //Reverse the array of words
//If a word ends with '.', start the word with the period and trim it from the end.
var result = words.Select(x => x.EndsWith('.') ? $".{x.Trim('.')}" : x);
//^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^This was missing
Console.WriteLine(string.Join(" ", result));
Elegant one-liner approach suggested by @metroSmurf in the comment section:
var result = sentence.Split(' ') //Split the sentence with space as a delimiter.
.Reverse() //Use LINQ's Enumerable.Reverse to reverse the sequence. No need to use Array.Reverse()
.Select(x => x.EndsWith('.') ? $".{x.Trim('.')}" : x); //Apply same logic mentioned above.
You are reversing words separated by spaces (in old. the point is part of the word). If you want to reverse the points too, you need to treat them as words (separated by spaces):
public static class TextExtensions
{
public static string PointTrick(this string str) => str.Replace(".", " .");
public static string PointUntrick(this string str) => str.Replace(". ", ".");
public static string ReverseWords(this string str) => string.Join(" ", str.Split(" ").Reverse());
}
Those tests pass.
[TestClass]
public class SOTests
{
private string GetReversedWithPointTrick(string input) => input.PointTrick().ReverseWords().PointUntrick();
[TestMethod]
public void ReverseWordsTest()
{
var sut = "three two. one";
var expected = "one two. three";
var result = sut.ReverseWords();
Assert.AreEqual(expected, result);
}
[TestMethod]
public void ReverseWords_PointTrick_Test()
{
var sut = "my name is john. i am 23 years old.";
var expected = ".old years 23 am i .john is name my";
var result = GetReversedWithPointTrick(sut);
Assert.AreEqual(expected, result);
}
}
You can try a combination of LINQ and regular expressions:
using System.Linq;
using System.Text.RegularExpressions;
...
string source = "my name is john. i am 23 years old.";
// .old years 23 am i .john is name my
string result = string.Join(" ", Regex
.Matches(source, @"\S+")
.Cast<Match>()
.Select(m => Regex.Replace(m.Value, @"^(.*?)(\p{P}+)$", "$2$1"))
.Reverse());
Here we use two patterns: a simple one, \S+, which matches any run of non-whitespace characters. The next pattern, ^(.*?)(\p{P}+)$, is worth explaining:
^(.*?)(\p{P}+)$
here
^ - anchor, start of the string
(.*?) - group #1: any symbols, but as few as possible
(\p{P}+) - group #2: one or more punctuation symbols
$ - anchor, end of the string
and when matched we swap these groups: "$2$1"
Demo:
private static string Solve(string source) => string.Join(" ", Regex
.Matches(source, @"\S+")
.Cast<Match>()
.Select(m => Regex.Replace(m.Value, @"^(.*?)(\p{P}+)$", "$2$1"))
.Reverse());
...
string[] tests = new string[] {
"my name is john. i am 23 years old.",
"It's me, here am I!",
"Test...",
};
string report = string.Join(Environment.NewLine, tests
.Select(test => $"{test,-35} => {Solve(test)}"));
Console.Write(report);
Outcome:
my name is john. i am 23 years old. => .old years 23 am i .john is name my
It's me, here am I! => !I am here ,me It's
Test... => ...Test
Because someone always has to post a LINQ version of these things 😜
sentence.Split().Reverse().Select(w => w[^1] == '.' ? ('.' + w[..^1]) : w);
Split without any argument splits on whitespace, Reverse is a LINQ method that reverses the input, and then we just have a bit of logic that asks whether the last char is a dot (^1 is from the indices and ranges feature of C# 8, meaning "one from the end"). If it is, move it to the start (concat a dot plus all of the string up to 1 from the end); otherwise just output the word.
And all that remains is to string join it, which you know how to do: string.Join(" ", ...)
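Putting the pieces of this answer together, a minimal sketch could look like the following. The class and method names (SentenceReverser, ReverseWithDots) are made up for illustration, and it assumes you only care about a single trailing '.' per word and are on C# 8 or later. It also drops empty entries, so double spaces don't make w[^1] throw on an empty string.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of any library.
public static class SentenceReverser
{
    // Reverses the word order and moves one trailing '.' to the front of its word.
    public static string ReverseWithDots(string sentence)
    {
        var words = sentence
            // A null separator array means "split on any whitespace".
            .Split((char[])null, StringSplitOptions.RemoveEmptyEntries)
            // Move the dot first; the order of Select and Reverse does not matter here.
            .Select(w => w.Length > 1 && w[^1] == '.' ? "." + w[..^1] : w)
            .Reverse();

        return string.Join(" ", words);
    }
}
```

Calling SentenceReverser.ReverseWithDots("my name is john. i am 23 years old.") should give the output asked for in the question.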
Related
I'm trying to create a program that splits a string to an array then adds
to that array.
Splitting the string works but adding to the array is really putting up a
fight.
//here i create the text
string text = Console.ReadLine();
Console.WriteLine();
//Here i split my text to elements in an Array
var punctuation = text.Where(Char.IsPunctuation).Distinct().ToArray();
var words = text.Split().Select(x => x.Trim(punctuation));
//here i display the splitted string
foreach (string x in words)
{
Console.WriteLine(x);
}
//Here a try to add something to the Array
Array.words(ref words, words.Length + 1);
words[words.Length - 1] = "addThis";
//I try to display the updated array
foreach (var x in words)
{
Console.WriteLine(x);
}
/* Here are the error messages (flagged parts are marked |*error*|):
Array.|*words*|(ref words, words.|*Length*| + 1);
words[words.|*Length*| - 1] = "addThis";
'Array' does not contain a definition for 'words'
Does not contain a definition for 'Length'
Does not contain a definition for 'Length' */
Convert the IEnumerable to List:
var words = text.Split().Select(x => x.Trim(punctuation)).ToList();
Once it is a list, you can call Add
words.Add("addThis");
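For completeness, here is a small self-contained sketch of the same idea. WordList and SplitWords are names I made up, and dropping empty entries is my own addition; the point is only that Select returns a lazy IEnumerable<string> with no Length or indexer, while ToList() gives you something you can grow.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only.
public static class WordList
{
    // Splits on whitespace, trims the punctuation that occurs in the text,
    // and materializes the result as a List<string> so items can be added later.
    public static List<string> SplitWords(string text)
    {
        char[] punctuation = text.Where(c => char.IsPunctuation(c)).Distinct().ToArray();

        return text
            .Split((char[])null, StringSplitOptions.RemoveEmptyEntries)
            .Select(w => w.Trim(punctuation))
            .Where(w => w.Length > 0) // a token that was only punctuation becomes empty
            .ToList();
    }
}
```

After var words = WordList.SplitWords(text); you can call words.Add("addThis"); and use words.Count instead of Length.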
Technically, if you want to split on punctuation, I suggest Regex.Split instead of string.Split
using System.Text.RegularExpressions;
...
string text =
@"Text with punctuation: comma, full stop. Apostroph's and ""quotation?"" - ! Yes!";
var result = Regex.Split(text, @"\p{P}");
Console.Write(string.Join(Environment.NewLine, result));
Outcome:
Text with punctuation # Space is not a punctuation, 3 words combined
comma
full stop
Apostroph # apostroph ' is a punctuation, split as required
s and
quotation
Yes
If you want to add some items, I suggest LINQ's Concat() and .ToArray():
string text =
string[] words = Regex
.Split(text, @"\p{P}")
.Concat(new string[] {"addThis"})
.ToArray();
However, it seems that you want to extract words rather than split on punctuation, which you can do by matching these words:
using System.Linq;
using System.Text.RegularExpressions;
...
string text =
@"Text with punctuation: comma, full stop. Apostroph's and ""quotation?"" - ! Yes!";
string[] words = Regex
.Matches(text, @"[\p{L}']+") // Let a word be one or more letters or apostrophes
.Cast<Match>()
.Select(match => match.Value)
.Concat(new string[] { "addThis"})
.ToArray();
Console.Write(string.Join(Environment.NewLine, words));
Outcome:
Text
with
punctuation
comma
full
stop
Apostroph's
and
quotation
Yes
addThis
I have a string which I want to split in two. Usually it is a name, operator and a value. I'd like to split it into name and value. The name can be anything, the value too. What I have, is an array of operators and my idea is to use it as separators:
var input = "name>=2";
var separators = new string[]
{
">",
">=",
};
var result = input.Split(separators, StringSplitOptions.RemoveEmptyEntries);
The code above gives name and =2 as the result. But if I rearrange the separators so that >= comes first, like this:
var separators = new string[]
{
">=",
">",
};
That way I get name and 2, which is what I'm trying to achieve. Sadly, keeping the separators in a perfect order is a no-go for me. Also, my collection of separators is not immutable. So I'm thinking: could I split the string with longer separators given precedence over the shorter ones?
Thanks for help!
Here is a related question, explaining why such behaviour occurs in Split() method.
You can try several options. If you have a collection of separators, you can sort them into the right order before splitting:
using System.Linq;
...
var result = input.Split(
separators.OrderByDescending(item => item.Length).ToArray(), // longest first; Split needs a string[]
StringSplitOptions.RemoveEmptyEntries);
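Wrapped into a helper, the sorting idea might look like the sketch below. OperatorSplitter and SplitOnOperators are hypothetical names; the sketch relies on the behaviour described in the question, namely that when several separators match at the same position, the one that comes first in the array wins.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only.
public static class OperatorSplitter
{
    // Splits input on any of the separators, trying longer separators first
    // so that ">=" is preferred over ">" regardless of the caller's ordering.
    public static string[] SplitOnOperators(string input, IEnumerable<string> separators)
    {
        string[] longestFirst = separators
            .OrderByDescending(s => s.Length)
            .ToArray();

        return input.Split(longestFirst, StringSplitOptions.RemoveEmptyEntries);
    }
}
```

Since the sort happens on every call, the separators collection itself can stay unordered and mutable.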
You can try organizing all (including possible) separators into a single pattern, e.g.
[><=]+
here we split by the longest sequence of >, < and =
var result = Regex.Split(input, "[><=]+");
Demo:
using System.Linq;
using System.Text.RegularExpressions;
...
string[] tests = new string[] {
"name>123",
"name<4",
"name=78",
"name==other",
"name===other",
"name<>78",
"name<<=4",
"name=>name + 455",
"name>=456",
"a_b_c=d_e_f",
};
string report = string.Join(Environment.NewLine, tests
.Select(test => string.Join("; ", Regex.Split(test, "[><=]+"))));
Console.Write(report);
Outcome:
name; 123
name; 4
name; 78
name; other
name; other
name; 78
name; 4
name; name + 455
name; 456
a_b_c; d_e_f
You may try doing a regex split on an alternation which lists the longer >= first:
var input = "name>=2";
string[] parts = Regex.Split(input, "(?:>=|>)");
foreach (var item in parts)
{
Console.WriteLine(item);
}
This prints:
name
2
Note that had we split on (?:>|>=), the output would have been name and =2.
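If the separators live in a collection that changes at runtime, the alternation can be built on the fly. This is only a sketch under that assumption: OperatorPattern, BuildPattern and Split are names I invented, and Regex.Escape is there in case an operator contains regex metacharacters such as + or *.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class OperatorPattern
{
    // Builds a non-capturing alternation with the longest separators first,
    // e.g. { ">", ">=" } becomes "(?:>=|>)".
    public static string BuildPattern(IEnumerable<string> separators) =>
        "(?:" + string.Join("|", separators
            .OrderByDescending(s => s.Length)
            .Select(s => Regex.Escape(s))) + ")";

    // Splits the input on the generated pattern.
    public static string[] Split(string input, IEnumerable<string> separators) =>
        Regex.Split(input, BuildPattern(separators));
}
```

The group is non-capturing on purpose: Regex.Split includes the text of capturing groups in its result, which is not wanted here.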
I'm facing a problem with Regex.Split.
Here is my pattern
string[] words = Regex.Split(line, "[\\s,.;:/?!()\\-]+");
And this is text file:
ir KAS gi mus nugales.
jei! mes MIRTI NEBIJOM,
JEIGU mes nugalejom mirti
DZUKAS
My task is to find the last word in upper case; here is the code:
z = words.LastOrDefault(c => c.All(ch => char.IsUpper(ch)));
When there is some kind of delimiter at the end of the line, it just doesn't print z. When there is no delimiter (3rd and 4th lines), everything works fine.
Why does it happen?
Why not match the words (not split), and take the last one?
string source = @"ir KAS gi mus nugales.
jei!mes MIRTI NEBIJOM,
JEIGU mes nugalejom mirti
DZUKAS";
// or @"\b\p{Lu}+\b" depending on letters you want being selected out
string pattern = @"\b[A-Z]+\b";
string result = Regex
.Matches(source, pattern)
.OfType<Match>()
.Select(match => match.Value)
.LastOrDefault();
Edit: If I understand your requirements right (Regex.Split must be preserved, and you have to output the last all caps letters word per each line), you're looking for something like this:
var result = source
.Split(new string[] { Environment.NewLine }, StringSplitOptions.None)
.Select(line => Regex.Split(line, "[\\s,.;:/?!()\\-]+"))
.Select(words => words
.Where(word => word.Length > 0 && word.All(c => char.IsUpper(c)))
.LastOrDefault());
// You may want to filter out lines which don't have all-caps words:
// .Where(line => line != null);
Test
Console.Write(string.Join(Environment.NewLine, result));
Output
KAS
NEBIJOM
JEIGU
DZUKAS
Please note that .All(c => char.IsUpper(c)) is also true for an empty string; that's why we have to add the explicit word.Length > 0 check. So the problem you've faced is not with Regex but with LINQ (an empty string satisfies the .All(...) condition).
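To see the empty-string effect in isolation, here is a small sketch. UpperWordFinder and its two methods are hypothetical names; the pattern is the one from the question. When a line ends with a delimiter, Regex.Split returns a trailing empty string, and All is vacuously true for it.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class UpperWordFinder
{
    private const string Delimiters = "[\\s,.;:/?!()\\-]+";

    // Reproduces the question's behaviour: a trailing "" counts as "all upper".
    public static string NaiveLastUpperWord(string line) =>
        Regex.Split(line, Delimiters)
            .LastOrDefault(w => w.All(c => char.IsUpper(c)));

    // Same query, but empty entries are excluded explicitly.
    public static string LastUpperWord(string line) =>
        Regex.Split(line, Delimiters)
            .LastOrDefault(w => w.Length > 0 && w.All(c => char.IsUpper(c)));
}
```

For the line "jei! mes MIRTI NEBIJOM," the naive version returns an empty string (so nothing visible is printed), while the guarded version returns NEBIJOM.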
using System;
using System.Text.RegularExpressions;
namespace ConsoleApp
{
class Program
{
static void Main()
{
string s = @"ir KAS gi mus nugales.
jei!mes MIRTI NEBIJOM,
JEIGU mes nugalejom mirti
DZUKAS";
Match result = Regex.Match(s, "([A-Z]+)", RegexOptions.RightToLeft);
Console.WriteLine(result.Value);
Console.ReadKey();
}
}
}
From the question and comments it's hard to figure out what you want but I'll try to cover both cases.
If you're looking for the last uppercase word in the whole text, you can do something like this:
Regex r = new Regex("[,.;:/?!()\\-]+", RegexOptions.Multiline);
string result = r.Replace(source, string.Empty)
.Split(new[] { ' ', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries)
.LastOrDefault(word => word.All(c => char.IsUpper(c)));
If you want to find the last match from each line:
Regex r = new Regex("[,.;:/?!()\\-]+", RegexOptions.Multiline);
string[] result = r.Replace(source, string.Empty)
.Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries)
.Select(line => line
.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
.LastOrDefault(word => word.All(c => char.IsUpper(c))))
.ToArray();
EDIT:
I am new to C# and need to shorten a sentence of several words so that only the first character of each word remains. For example:
input : Bharat Electrical Limited => output : BEL
How do I accomplish this in C#?
Thanks in advance
Try
string sentence = "Bharat Electrical Limited";
var result = sentence.Split(' ').Aggregate("", (current, word) => current + word.Substring(0, 1));
EDIT: Here's a brief explanation:
sentence.Split(' ') splits the string into elements based on space (' ')
.Aggregate("", (current, word) => current + word.Substring(0, 1)); is a LINQ expression that iterates through every word retrieved above and appends the result of an operation on it to the accumulated string
word.Substring(0, 1) returns the first letter of every word
This is the sort of thing that's easily accomplished with a regular expression:
s = Regex.Replace(s, @"(\S)\S*\s*", "$1");
This effectively matches consecutive non-white space characters, followed by white space, and replaces the whole sequence by its first character.
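As a self-contained sketch of the regex approach (RegexAcronym and FromSentence are made-up names): the pattern only swallows whitespace that follows a word, so leading whitespace would survive. The Trim() call is my own addition to cover that case.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class RegexAcronym
{
    // Keeps the first non-whitespace character of every word and drops the rest,
    // including the whitespace between words.
    public static string FromSentence(string s) =>
        Regex.Replace(s.Trim(), @"(\S)\S*\s*", "$1");
}
```

RegexAcronym.FromSentence("Bharat Electrical Limited") gives BEL.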
You can do something like this -
string sentence = "Bharat Electrical Limited";
//Split the words
var letters = sentence.Split(new char[] {' '}, StringSplitOptions.RemoveEmptyEntries);
//Take the first letter of every word
var myAbbWord = letters.Aggregate(string.Empty, (current, letter) => current + letter.First());
myAbbWord should display BEL for you.
Here is the solution.
I hope it helps.
string str1 = "Bharat Electrical Limited";
var resultList = str1.Split(' ');
string result = resultList.Aggregate(String.Empty, (current, word) => current + word.First());
First you want to Split the string into words, then take the first letter of each word. You can do this with a simple foreach loop like the following:
string inputStr = "Bharat Electrical Limited";
List<char> firstChars = new List<char>();
foreach (string word in inputStr.Split(new char[]{' '},StringSplitOptions.RemoveEmptyEntries))
{
firstChars.Add(word[0]); // Collecting first chars of each word
}
string outputStr = String.Join("", firstChars);
And here is a shorter way to do the same:
string inputStr = "Bharat Electrical Limited";
string shortWord = String.Join("", inputStr.Split(new char[]{' '},StringSplitOptions.RemoveEmptyEntries).Select(x => x[0]));
If the first character of each word is not a capital letter, you can use either of the following options.
Convert the input to a title-cased sentence before performing the action.
For this you can use the following code:
inputStr = System.Threading.Thread.CurrentThread.CurrentCulture.TextInfo.ToTitleCase(inputStr.ToLower());
Or convert each character to uppercase while collecting it from the word.
This can be achieved by:
firstChars.Add(char.ToUpper(word[0])); // For the first case
.Select(x => char.ToUpper(x[0])) // For the second case
Here you can find a working example for all above mentioned cases
Simplest way is :
string inputStr = "Bharat Electrical Limited";
string result = new String(inputStr.Split(' ').Select(word => (word[0])).ToArray());
// BEL
You need to add using System.Linq; to your source file.
Logic is:
Split the string into an array of words (delimited by space), then project this array by selecting the first char of each string. The result is an array of the first characters. Then construct the result string using the String constructor overload that takes a char array.
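The logic above can be sketched as a small helper. Acronym and FromWords are hypothetical names, and two things are my own additions rather than part of this answer: empty entries are removed so that double spaces don't make word[0] throw, and each initial is upper-cased.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration only.
public static class Acronym
{
    // Split into words, project each word to its first character,
    // then build a string from the resulting char array.
    public static string FromWords(string input)
    {
        char[] initials = input
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(word => char.ToUpperInvariant(word[0]))
            .ToArray();

        return new string(initials);
    }
}
```

Acronym.FromWords("Bharat Electrical Limited") gives BEL, and an empty input gives an empty string.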
This might look more friendly to you:
string intput = "Bharat Electrical Limited";
string output = string.Join( "",intput.Split(new string[] {" "}, StringSplitOptions.RemoveEmptyEntries)
.Select(a => a.First()));
First split your input sentence on spaces, and then use the First() extension method on each string to get its first character.
Use this method
string inputStr = "Bharat Electrical Limited";
var arrayString = string.Join("", inputStr.Split(' ').Select(x => x[0]));
I have some code that tokenizes an equation input into a string array:
string infix = "( 5 + 2 ) * 3 + 4";
string[] tokens = tokenizer(infix, @"([\+\-\*\(\)\^\\])");
foreach (string s in tokens)
{
Console.WriteLine(s);
}
Now here is the tokenizer function:
public string[] tokenizer(string input, string splitExp)
{
string noWSpaceInput = Regex.Replace(input, @"\s", "");
Console.WriteLine(noWSpaceInput);
Regex RE = new Regex(splitExp);
return (RE.Split(noWSpaceInput));
}
When I run this, the input is split into tokens, but there is an empty string inserted before the parenthesis characters. How do I remove this?
//empty string here
(
5
+
2
//empty string here
)
*
3
+
4
I would just filter them out:
public string[] tokenizer(string input, string splitExp)
{
string noWSpaceInput = Regex.Replace(input, @"\s", "");
Console.WriteLine(noWSpaceInput);
Regex RE = new Regex(splitExp);
return (RE.Split(noWSpaceInput)).Where(x => !string.IsNullOrEmpty(x)).ToArray();
}
What you're seeing is because you have nothing and then a separator (i.e. at the beginning of the string there is a (), and then two separator characters next to one another (i.e. )* in the middle). This is by design.
As you may have found with String.Split, that method has an optional enum which you can give to have it remove any empty entries, however, there is no such parameter with regular expressions. In your specific case you could simply ignore any token with a length of 0.
foreach (string s in tokens.Where(tt => tt.Length > 0))
{
Console.WriteLine(s);
}
Well, one option would be to filter them out afterwards:
return RE.Split(noWSpaceInput).Where(x => !string.IsNullOrEmpty(x)).ToArray();
Try this (if you don't want to filter the result):
tokenizer(infix, @"(?=[-+*()^\\])|(?<=[-+*()^\\])");
Perl demo:
perl -E "say join ',', split /(?=[-+*()^])|(?<=[-+*()^])/, '(5+2)*3+4'"
(,5,+,2,),*,3,+,4
Although it would be better to use a match instead of a split in this case, IMO.
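A match-based tokenizer could look roughly like this. It is only a sketch: InfixTokenizer and Tokenize are invented names, and the token grammar is my assumption (unsigned integers or decimals, plus the operators + - * / ^ and parentheses), so adjust the pattern to whatever your equations really contain.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class InfixTokenizer
{
    // Matches tokens instead of splitting on them, so no empty strings can appear.
    // Whitespace, and any character the pattern does not know, is silently skipped.
    public static string[] Tokenize(string input) =>
        Regex.Matches(input, @"\d+(?:\.\d+)?|[-+*/^()]")
            .Cast<Match>()
            .Select(m => m.Value)
            .ToArray();
}
```

With this there is no need to strip whitespace first or to filter the result afterwards.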
I think you can use StringSplitOptions.RemoveEmptyEntries with Split:
static void Main(string[] args)
{
string infix = "( 5 + 2 ) * 3 + 4";
string[] results = infix.Split(" ".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
foreach (var result in results)
Console.WriteLine(result);
Console.ReadLine();
}