finding occurrences of string within a string - c#

What is the quickest and most efficient way of finding a string within another string?
For instance, I have this text:
"Hey #ronald and #tom where are we going this weekend"
I want to find the strings which start with "#".

You can use Regular expressions.
string test = "Hey #ronald and #tom where are we going this weekend";
Regex regex = new Regex(@"#\S+");
MatchCollection matches = regex.Matches(test);
foreach (Match match in matches)
{
Console.WriteLine(match.Value);
}
That will output:
#ronald
#tom

You need to use Regular Expressions:
string data = "Hey #ronald and #tom where are we going this weekend";
var result = Regex.Matches(data, @"#\w+");
foreach (var item in result)
{
Console.WriteLine(item);
}

Try this one:
string s = "Hey #ronald and #tom where are we going this weekend";
var list = s.Split(' ').Where(c => c.StartsWith("#"));

If you are after speed:
string source = "Hey #ronald and #tom where are we going this weekend";
int count = 0;
foreach (char c in source)
if (c == '#') count++;
If you want a one liner:
string source = "Hey #ronald and #tom where are we going this weekend";
var count = source.Count(c => c == '#');
See also the related question: How would you count occurrences of a string within a string?

string str = "hallo world";
int pos = str.IndexOf("wo", 0);
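To find every position rather than only the first, IndexOf can be called in a loop, restarting the search just past each hit. The following is a minimal sketch; the names OccurrenceFinder and FindAll are made up for illustration, and ordinal (culture-insensitive) comparison is an assumption.

```csharp
using System;
using System.Collections.Generic;

public static class OccurrenceFinder
{
    // Returns the start index of every occurrence of value in source.
    // Overlapping occurrences are included because the search resumes
    // one character after each hit.
    public static List<int> FindAll(string source, string value)
    {
        var positions = new List<int>();
        if (string.IsNullOrEmpty(source) || string.IsNullOrEmpty(value))
        {
            return positions;
        }

        int pos = source.IndexOf(value, 0, StringComparison.Ordinal);
        while (pos >= 0)
        {
            positions.Add(pos);
            // pos + 1 is at most source.Length, which IndexOf accepts.
            pos = source.IndexOf(value, pos + 1, StringComparison.Ordinal);
        }
        return positions;
    }
}
```

For the text from the question, OccurrenceFinder.FindAll(text, "#") returns the indices 4 and 16.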

Related

Getting numbers from a string with chars glued

I need to recover each number in a glued string.
For example, from these strings:
string test = "number1+3";
string test1 = "number 1+4";
I want to recover (1 and 3) and (1 and 4).
How can I do this?
CODE
string test= "number1+3";
List<int> res = new List<int>();
string[] digits = Regex.Split(test, @"\D+");
foreach (string value in digits)
{
int number;
if (int.TryParse(value, out number))
{
res.Add(number);
}
}
This regex should work:
string pattern = @"\d+";
string test = "number1+3";
foreach (Match match in Regex.Matches(test, pattern))
Console.WriteLine("Found '{0}' at position {1}",
match.Value, match.Index);
Note that if you intend to run the pattern many times, it is better for performance to create a Regex instance once and reuse it than to call the static method each time.
var res = new List<int>();
var regex = new Regex(@"\d+");
void addMatches(string text) {
foreach (Match match in regex.Matches(text))
{
int number = int.Parse(match.Value);
res.Add(number);
}
}
string test = "number1+3";
addMatches(test);
string test1 = "number 1+4";
addMatches(test1);
This calls for a regular expression:
(\d+)\+(\d+)
Match m = Regex.Match(input, @"(\d+)\+(\d+)");
string first = m.Groups[1].Captures[0].Value;
string second = m.Groups[2].Captures[0].Value;
An alternative to regular expressions:
string test = "number 1+4";
int[] numbers = test.Replace("number", string.Empty, StringComparison.InvariantCultureIgnoreCase)
.Trim()
.Split("+", StringSplitOptions.RemoveEmptyEntries)
.Select(x => Convert.ToInt32(x))
.ToArray();

How to check if a string contains all of the characters of a word

I wish to check whether a string contains all of the characters of a given word, for example:
var inputString = "this is just a simple text string";
And say I have the word:
var word = "ts";
Now it should pick out the words that contain both t and s:
this just string
This is what I am working on:
var names = Regex.Matches(inputString, @"\S+ts\S+", RegexOptions.IgnoreCase);
However, this does not give me back the words I want. If I had just a single character like t, it would give me back all of the words that contain t. If I had st instead of ts, it would give me back the word just.
Any idea how this can work?
Here is a LINQ solution which is easy on the eyes and more natural than regex.
var testString = "this is just a simple text string";
string[] words = testString.Split(' ');
var result = words.Where(w => "ts".All(w.Contains));
The result is:
this
just
string
You can use LINQ's Enumerable.All :
var input = "this is just a simple text string";
var token = "ts";
var results = input.Split().Where(str => token.All(c => str.Contains(c))).ToList();
foreach (var res in results)
Console.WriteLine(res);
Output:
// this
// just
// string
You can use this pattern.
(?=[^ ]*t)(?=[^ ]*s)[^ ]+
You can build the regex dynamically.
var inputString = "this is just a simple text string";
var word = "ts";
string pattern = "(?=[^ ]*{0})";
// Regex.Escape keeps characters such as '.' or '+' in the word from being read as regex syntax.
string regpattern = string.Join("", word.Select(x => string.Format(pattern, Regex.Escape(x.ToString())))) + "[^ ]+";
var matches = Regex.Matches(inputString, regpattern, RegexOptions.IgnoreCase);
Option without LINQ and Regex (just for fun):
string input = "this is just a simple text string";
char[] chars = { 't', 's' };
var array = input.Split();
List<string> result = new List<string>();
foreach(var word in array)
{
bool isValid = true;
foreach (var c in chars)
{
if (!word.Contains(c))
{
isValid = false;
break;
}
}
if(isValid) result.Add(word);
}

Extracting parts of a string c#

In C#, what would be the best way of splitting this sort of string?
%%x%%a,b,c,d
I want to end up with the value between the %% and another variable containing everything to the right of the second %%,
i.e. var x = "x"; var y = "a,b,c,d";
Here a,b,c,... could be an arbitrarily long comma-separated list. I need to extract the list and the value between the two double-percentage signs.
(To deal with the unbounded list, I thought of separating the string into %%x%% and a,b,c,d. At that point I can use something like this to get x.)
var tag = "%%";
var startTag = tag;
int startIndex = s.IndexOf(startTag) + startTag.Length;
int endIndex = s.IndexOf(tag, startIndex);
return s.Substring(startIndex, endIndex - startIndex);
Would the best approach be to use regex, or to use lots of IndexOf and Substring calls to do the extracting based on the static %% characters?
Given that what you want is "x,a,b,c,d" the Split() function is actually pretty powerful and regex would be overkill for this.
Here's an example:
string test = "%%x%%a,b,c,d";
string[] result = test.Split(new char[] { '%', ',' }, StringSplitOptions.RemoveEmptyEntries);
foreach (string s in result) {
Console.WriteLine(s);
}
Basically, we ask it to split by both '%' and ',' and ignore empty results (e.g. the result between "%%"). Here's the result:
x
a
b
c
d
To extract x:
If %% is always at the start:
string s = "%%x%%a,b,c,d,h";
s = s.Substring(2,s.LastIndexOf("%%")-2);
//Console.WriteLine(s);
Otherwise:
string s = "v,u,m,n,%%x%%a,b,c,d,h";
s = s.Substring(s.IndexOf("%%")+2,s.LastIndexOf("%%")-s.IndexOf("%%")-2);
//Console.WriteLine(s);
If you need to get them all at once, use this:
string s = "m,n,%%x%%a,b,c,d";
// Requires System.Linq; yields a List<char> of every character that is neither '%' nor ','.
var myList = s.Where(c => c != '%' && c != ',').ToList();
This'll let you do it all in one go:
string pattern = "^%%(.+?)%%(?:(.+?)(?:,|$))*$";
string input = "%%x%%a,b,c,d";
Match match = Regex.Match(input, pattern);
if (match.Success)
{
// "x"
string first = match.Groups[1].Value;
// { "a", "b", "c", "d" }
string[] repeated = match.Groups[2].Captures.Cast<Capture>()
.Select(c => c.Value).ToArray();
}
You can use char.IsLetter to get a list of all the letters:
string test = "%%x%%a,b,c,d";
var l = test.Where(c => char.IsLetter(c)).ToArray();
var output = string.Join(", ", l.OrderBy(c => c));
Since you want the value between the %% and everything after in separate variables and you don't need to parse the CSV, I think a RegEx solution would be your best choice.
var inputString = @"%%x%%a,b,c,d";
var regExPattern = @"^%%(?<x>.+)%%(?<csv>.+)$";
var match = Regex.Match(inputString, regExPattern);
foreach (var item in match.Groups)
{
Console.WriteLine(item);
}
The pattern has 2 named groups called x and csv, so rather than just looping, you can easily reference them by name and assign them to values:
var x = match.Groups["x"].Value;
var y = match.Groups["csv"].Value;

Retrieve String Containing Specific substring C#

I have output in string format like the following:
"ABCDED 0000A1.txt PQRSNT 12345"
I want to retrieve the substring(s) containing .txt in the above string, e.g. for the above it should return 0000A1.txt.
Thanks
You can either split the string at whitespace boundaries like it's already been suggested or repeatedly match the same regex like this:
var input = "ABCDED 0000A1.txt PQRSNT 12345 THE.txt FOO";
var match = Regex.Match (input, @"\b([\w\d]+\.txt)\b");
while (match.Success) {
Console.WriteLine ("TEST: {0}", match.Value);
match = match.NextMatch ();
}
Split will work if spaces are the separator. If you use other separators, you can add them as needed.
string input = "ABCDED 0000A1.txt PQRSNT 12345";
string filename = input.Split(' ').FirstOrDefault(f => System.IO.Path.HasExtension(f));
filename is "0000A1.txt", and this will work for any extension.
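As a rough sketch of the "add separators as needed" idea, the split can take several separator characters at once and return every token that ends in .txt instead of only the first. The separator set and the names TxtFinder and FindTxtNames are assumptions chosen for illustration.

```csharp
using System;
using System.Linq;

public static class TxtFinder
{
    // Splits input on several separator characters and returns every
    // token that ends with ".txt" (case-insensitive).
    public static string[] FindTxtNames(string input)
    {
        // Assumed separator set; extend it to match the real data.
        char[] separators = { ' ', '\t', ',', ';' };

        return input
            .Split(separators, StringSplitOptions.RemoveEmptyEntries)
            .Where(token => token.EndsWith(".txt", StringComparison.OrdinalIgnoreCase))
            .ToArray();
    }
}
```

Calling TxtFinder.FindTxtNames("ABCDED 0000A1.txt PQRSNT 12345") gives a one-element array containing "0000A1.txt".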
You can use a regex match in C# :)
Here is the code; plug it in and try it. Please comment.
string test = "afdkljfljalf dkfjd.txt lkjdfjdl";
string ffile = Regex.Match(test, @"([a-z0-9]+\.txt)").Groups[1].Value;
Console.WriteLine(ffile);
I did something like this:
string subString = "";
char period = '.';
char[] chArString;
int iSubStrIndex = 0;
if (myString != null)
{
chArString = new char[myString.Length];
chArString = myString.ToCharArray();
for (int i = 0; i < myString.Length; i ++)
{
if (chArString[i] == period)
iSubStrIndex = i;
}
subString = myString.Substring(iSubStrIndex);
}
Hope that helps.
First split your string into an array using
char[] whitespace = new char[] { ' ', '\t' };
string[] ssizes = myStr.Split(whitespace);
Then find .txt in the array...
// Find the first element containing ".txt".
//
string value1 = Array.Find(ssizes,
element => element.Contains(".txt"));
Now value1 will hold "0000A1.txt".
Happy coding.

How do I "cut" out part of a string with a regex?

I need to cut out and save/use part of a string in C#. I figure the best way to do this is by using Regex. My string looks like this:
"changed from 1 to 10".
I need a way to cut out the two numbers and use them elsewhere. What's a good way to do this?
Error checking left as an exercise...
Regex regex = new Regex( @"\d+" );
MatchCollection matches = regex.Matches( "changed from 1 to 10" );
int num1 = int.Parse( matches[0].Value );
int num2 = int.Parse( matches[1].Value );
Matching only exactly the string "changed from x to y":
string pattern = @"^changed from ([0-9]+) to ([0-9]+)$";
Regex r = new Regex(pattern);
Match m = r.Match(text);
if (m.Success) {
// Groups[0] is the whole match; the parenthesized groups start at index 1.
int fromValue = Convert.ToInt32(m.Groups[1].Value);
int toValue = Convert.ToInt32(m.Groups[2].Value);
// Do stuff
} else {
// Error, regex did not match
}
In your regex, put the fields you want to record in parentheses, and then use the Match.Groups property to extract the matched fields.
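A minimal sketch of that approach for the "changed from 1 to 10" string follows. The helper name TryParseChange and its out-parameter names are hypothetical, and the pattern assumes the phrase appears exactly in that wording.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ChangeParser
{
    // Extracts the two numbers from text containing "changed from X to Y".
    // Returns false when the phrase is missing or a number does not fit in an int.
    public static bool TryParseChange(string text, out int oldValue, out int newValue)
    {
        oldValue = 0;
        newValue = 0;

        // Each pair of parentheses is a capture group; Groups[0] is the whole match.
        Match m = Regex.Match(text, @"changed from (\d+) to (\d+)");
        if (!m.Success)
        {
            return false;
        }

        return int.TryParse(m.Groups[1].Value, out oldValue)
            && int.TryParse(m.Groups[2].Value, out newValue);
    }
}
```

With the input "changed from 1 to 10", the call returns true with oldValue 1 and newValue 10.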
Use named capture groups.
Regex r = new Regex(@"\D*(?<FirstNumber>[0-9]{1,2})\D+(?<SecondNumber>[0-9]{1,2})");
string input = "changed from 1 to 10";
string firstNumber = "";
string secondNumber = "";
MatchCollection joinMatches = r.Matches(input);
foreach (Match m in joinMatches)
{
firstNumber = m.Groups["FirstNumber"].Value;
secondNumber = m.Groups["SecondNumber"].Value;
}
Get Expresso to help you out; it has an export to C# option.
DISCLAIMER: the regex may need adjusting (my copy of Expresso expired :D)
Here is a code snippet that does almost what I wanted:
using System.Text.RegularExpressions;
string text = "changed from 1 to 10";
string pattern = @"\b(?<digit>\d+)\b";
Regex r = new Regex(pattern);
MatchCollection mc = r.Matches(text);
foreach (Match m in mc) {
CaptureCollection cc = m.Groups["digit"].Captures;
foreach (Capture c in cc){
Console.WriteLine((Convert.ToInt32(c.Value)));
}
}
