Compare two strings in c# containing random URL [closed]

I need to compare the strings below. The problem is that the URL in each string will be different every time, e.g.:
www.google.com
http://www.google.com
google.co.uk!
So Contains cannot match the strings, because the URLs do not match.
String1 = "This is my string http://www.google.co.uk and that was my url"
String2 = "this is my string google.gr and that was my url"
So I basically want to compare the contents of the strings minus the URL. Each string can contain different text each time, so looking for the URL at the same location each time will not work.
I have searched extensively on here for an answer to this problem, but I was unable to find a working solution.
Thanks in advance

Use a regular expression to remove the links:
String string1 = "This is my string http://www.google.co.uk and that was my url";
String string2 = "this is my string http://google.gr and that was";
Regex rxp = new Regex(@"http://[^\s]*");
String clean1 = rxp.Replace(string1, "");
String clean2 = rxp.Replace(string2, "");
Now you can compare clean1 with clean2. Of course, the regex above is just an example: it only removes URLs starting with "http://". You may need something more sophisticated, based on your real data.

Using regular expressions:
Regex regex = new Regex(@"\s((?:\S+)\.(?:\S+))");
string string1 = "This is my string http://www.google.co.uk and that was my url.";
string string2 = "this is my string google.gr and that was my url.";
var string1WithoutURI = regex.Replace(string1, ""); // Output: "This is my string and that was my url."
var string2WithoutURI = regex.Replace(string2, ""); // Output: "this is my string and that was my url."
// Regex.Replace(string1, @"\s((?:\S+)\.(?:\S+))", ""); // The static overload can be used too, to avoid declaring the regex.
// The sample strings differ in case ("This" vs "this"), so compare case-insensitively.
if (string.Equals(string1WithoutURI, string2WithoutURI, StringComparison.OrdinalIgnoreCase))
{
// Do what you want with the two strings
}
Explaining the regex \s((?:\S+)\.(?:\S+))
1. \s matches a single whitespace character (the one in front of the URL).
2. ((?:\S+)\.(?:\S+)) matches the URL up to the next whitespace character.
2.1. (?:\S+) matches one or more non-whitespace characters; the ?: makes the group non-capturing.
2.2. \. matches a literal ".", because a URL will always contain one.
2.3. (?:\S+) again matches one or more non-whitespace characters (non-capturing), to get everything after the dot.
That should do the trick...
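The two answers above can be combined into one helper. The following is only a sketch under stated assumptions: the class and method names are hypothetical, a "URL" is taken to be either a token starting with http:// or https:// or any whitespace-delimited token containing a dot, and the comparison ignores case. Tokens such as "e.g." in the middle of a sentence would also be stripped, so the pattern may need tightening for real data.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UrlInsensitiveCompare
{
    // Assumption: a URL is a token with an http(s) scheme, or any
    // whitespace-delimited token that contains a dot followed by more text.
    private static readonly Regex UrlLike =
        new Regex(@"https?://\S+|\S+\.\S+", RegexOptions.IgnoreCase);

    private static readonly Regex Whitespace = new Regex(@"\s+");

    // Replaces URL-like tokens with a space, then collapses runs of whitespace.
    public static string StripUrls(string text)
    {
        string withoutUrls = UrlLike.Replace(text, " ");
        return Whitespace.Replace(withoutUrls, " ").Trim();
    }

    // Compares the remaining text, ignoring case.
    public static bool SameTextIgnoringUrls(string a, string b)
    {
        return string.Equals(StripUrls(a), StripUrls(b), StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        string s1 = "This is my string http://www.google.co.uk and that was my url";
        string s2 = "this is my string google.gr and that was my url";
        Console.WriteLine(StripUrls(s1));                // This is my string and that was my url
        Console.WriteLine(SameTextIgnoringUrls(s1, s2)); // True
    }
}
```

Replacing each URL with a space and then collapsing whitespace means the helper does not depend on a space existing in front of the URL, so a URL at the very start of the string is handled as well.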

Related

Split string from particular regular expression in c#

I have a string like
"8/6/08mz: Last name corrected from Paniaguato Arevalo-Paniaguaas listed on bills/MR, email Shasta, 1132644 06/24/08jh:To Concentra/froi."
and I want to split it wherever a pattern like "8/6/08mz:" occurs, so that the result is the following:
"8/6/08mz: Last name corrected from Paniaguato Arevalo-Paniaguaas listed on bills/MR, email Shasta, 1132644"
"06/24/08jh:To Concentra/froi."
How can I do this in C#? Please help.

You could use Regex.Split() with a regular expression. Here is a rough one that matches the date-and-initials tag:
[0-9]{1,2}\/[0-9]{1,2}\/[0-9]{1,2}[a-z]{2}:
https://regex101.com/r/cEFbbZ/1
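Splitting directly on that pattern would consume the tag itself, so one option is to split on the whitespace in front of each tag by wrapping the pattern in a lookahead. This is a minimal sketch, not a definitive solution; the class and method names are hypothetical, and it assumes every tag is preceded by whitespace except the one at the start of the string.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NoteSplitter
{
    // Matches whitespace that is immediately followed by a tag such as "06/24/08jh:".
    // The lookahead (?=...) checks for the tag without consuming it.
    private static readonly Regex BeforeTag =
        new Regex(@"\s+(?=[0-9]{1,2}/[0-9]{1,2}/[0-9]{1,2}[a-z]{2}:)");

    public static string[] SplitNotes(string text)
    {
        return BeforeTag.Split(text);
    }

    public static void Main()
    {
        string s = "8/6/08mz: Last name corrected from Paniaguato Arevalo-Paniaguaas listed on bills/MR, email Shasta, 1132644 06/24/08jh:To Concentra/froi.";
        foreach (string part in SplitNotes(s))
        {
            Console.WriteLine(part);
        }
    }
}
```

Because the lookahead is zero-width, each piece keeps its own tag, and the first tag is left alone since nothing precedes it.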

You can verify that the string starts with what you want, then split on the space preceded by 7 digits:
if (s.StartsWith("8/6/08mz: ")) {
var ans = Regex.Split(s, @"(?<=[0-9]{7}) ");
}

Regexp - Delete the one word before XXX, remove XXX too [closed]

I need to remove the word XXX from a string, along with the one word before XXX.
How can I do that with a C# regex?

Do a single regex replacement:
string input = @"Hello World XXX Goodbye XXX Rabbit!";
Regex rgx = new Regex(@"\s*\w+\s+(?:XXX|xxx)"); // or maybe [Xx]{3}
string result = rgx.Replace(input, "", 1); // replace only the first match
Console.WriteLine(result);
Output:
Hello Goodbye XXX Rabbit!
This replacement targets XXX for removal only if it is preceded by a word (one character or more).
We can also make the search pattern case-insensitive by passing RegexOptions.IgnoreCase:
Regex rgx = new Regex(@"\s*\w+\s+XXX", RegexOptions.IgnoreCase);
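The replacement above can be wrapped in a small helper that removes either the first occurrence or all of them. This is a sketch under assumptions: the class, method, and parameter names are hypothetical, the marker is assumed to consist of word characters only (so that the trailing \b behaves as expected), and matching is case-insensitive.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MarkerRemoval
{
    // Removes the marker together with the single word in front of it.
    // Assumption: marker contains only word characters, so \b works at its end.
    public static string RemoveWordBeforeMarker(string input, string marker, bool firstOnly)
    {
        var rgx = new Regex(@"\s*\w+\s+" + Regex.Escape(marker) + @"\b", RegexOptions.IgnoreCase);
        // The instance overload with a count limits the number of replacements.
        return firstOnly ? rgx.Replace(input, "", 1) : rgx.Replace(input, "");
    }

    public static void Main()
    {
        string input = "Hello World XXX Goodbye XXX Rabbit!";
        Console.WriteLine(RemoveWordBeforeMarker(input, "XXX", true));  // Hello Goodbye XXX Rabbit!
        Console.WriteLine(RemoveWordBeforeMarker(input, "XXX", false)); // Hello Rabbit!
    }
}
```

The \b after the marker keeps longer words such as "XXXL" from being treated as the marker, and a marker with no word in front of it is left untouched.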

You can use the Replace method:
String s = "aaa bbb";
s = s.Replace("a", "");
// The example produces the following:
// The initial string: 'aaa bbb'
// The final string: ' bbb'
Or use a regex replacement:
string tmp = Regex.Replace(s, "[^0-9a-zA-Z]+", ""); // removes every run of non-alphanumeric characters

C# Extract part of the string that starts with specific letters

I have a string which I extract from an HTML document like this:
var elas = htmlDoc.DocumentNode.SelectSingleNode("//a[@class='a-size-small a-link-normal a-text-normal']");
if (elas != null)
{
//
_extractedString = elas.Attributes["href"].Value;
}
The href attribute contains this part of the string:
gp/offer-listing/B002755TC0/
I'm trying to extract the B002755TC0 value, but the string varies in length, so I cannot simply use the Substring method to extract it.
Instead, I was wondering whether there is a clever way to do this, perhaps by matching the beginning of the string against what I am searching for?
For example, I know for a fact that each href has the structure shown above, so I would simply match this keyword:
offer-listing/
So I would find this keyword and extract the part of the string that follows it (B002755TC0) up to the next "/" sign?
Can someone help me out with this?

This is a perfect job for a regular expression:
string text = "gp/offer-listing/B002755TC0/";
Regex pattern = new Regex(@"offer-listing/(\w+)/");
Match match = pattern.Match(text);
string whatYouAreLookingFor = match.Groups[1].Value;
Explanation: we just match the exact pattern you need:
'offer-listing/'
followed by one or more 'word characters' (letters, digits, and underscore),
followed by a slash.
The parentheses () mean 'capture this group' (so we can extract it later with match.Groups[1]).
EDIT: if you also want to extract from this: /dp/B01KRHBT9Q/
then you could use this pattern:
Regex pattern = new Regex(@"/(\w+)/$");
which will match both this string and the previous one. The $ stands for the end of the string, so this literally means:
capture the characters in between the last two slashes of the string
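When the href does not contain the expected structure, Match returns an unsuccessful match and Groups[1].Value is an empty string, so it can help to check Match.Success explicitly. The following sketch assumes a hypothetical helper name and returns null when nothing matches; the choice of null as the "not found" value is an assumption, not part of the answer above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OfferId
{
    private static readonly Regex Pattern = new Regex(@"offer-listing/(\w+)/");

    // Returns the captured identifier, or null when the href has a different shape.
    public static string TryExtract(string href)
    {
        Match match = Pattern.Match(href);
        return match.Success ? match.Groups[1].Value : null;
    }

    public static void Main()
    {
        Console.WriteLine(TryExtract("gp/offer-listing/B002755TC0/")); // B002755TC0
        Console.WriteLine(TryExtract("/dp/B01KRHBT9Q/") == null);      // True
    }
}
```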

Though there is already an accepted answer, I thought I would share another solution that does not use Regex. Find the position of your pattern in the input and add its length; the wanted text starts at that index. To find the end, search for the first "/" after the beginning of the wanted text:
string input = "gp/offer-listing/B002755TC0/";
string pat = "offer-listing/";
int beginning = input.IndexOf(pat) + pat.Length;
int end = input.IndexOf("/", beginning);
string result = input.Substring(beginning, end - beginning);
If your desired output is always the last piece, you could also use Split and take the last non-empty piece (Last() requires using System.Linq):
string result2 = input.Split(new string[]{"/"}, StringSplitOptions.RemoveEmptyEntries)
.Last();

C# - Count a specific word in richTextBox1 and send the result to label1 [duplicate]

I'm not sure whether this question is unique, but I couldn't find the answer I was looking for.
I simply need C# code that counts how many times a word appears in richTextBox1 and sends the result to label1.
Example:
label1.Text = how many times the word "house" appears in richTextBox1.
I know that I should try first, but believe me, I tried and failed. I am new to this, so I hope someone can show me.
Regards

I would solve it using regular expressions as follows:
using System.Text.RegularExpressions;
...
string searchstring = ....;
string input = richTextBox1.Text;
int count = Regex.Matches(input, searchstring, RegexOptions.IgnoreCase).Count;
label1.Text = count.ToString();
Please note that the code above only works for a single word containing the characters a-z, A-Z, 0-9 and _, because searchstring is interpreted as a regex pattern and any other character may be treated as a metacharacter.
EDIT No.1: If you want to match any exact sequence, escape the search string first:
using System.Text.RegularExpressions;
...
string searchstring = ....;
string input = richTextBox1.Text;
searchstring = Regex.Escape(searchstring);
int count = Regex.Matches(input, searchstring, RegexOptions.IgnoreCase).Count;
label1.Text = count.ToString();
The method above should match any UTF-16 character sequence and count it accordingly.
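Both versions above count substrings, so searching for "house" would also count "houses". If only whole words should be counted, one possible variation is to wrap the escaped search string in \b word boundaries. This is a sketch with a hypothetical helper name; it assumes the search word starts and ends with a word character, and it takes plain strings so that it does not depend on the form controls.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WordCounter
{
    // Counts case-insensitive whole-word occurrences of word in text.
    // Assumption: word begins and ends with a word character, so \b applies.
    public static int CountWord(string text, string word)
    {
        string pattern = @"\b" + Regex.Escape(word) + @"\b";
        return Regex.Matches(text, pattern, RegexOptions.IgnoreCase).Count;
    }

    public static void Main()
    {
        string text = "House, houses and a house.";
        Console.WriteLine(CountWord(text, "house")); // 2
    }
}
```

In the form, this could be called as label1.Text = WordCounter.CountWord(richTextBox1.Text, "house").ToString();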

Allow only A-I 0-9 and the symbols $# (RegEX)

I'm currently working on a student project in C#, and I want to check whether a string contains only the following characters:
A-I
0-9
$
#
The original string:
string rawData ="$A008B20130503C103804D00000000E1022F0080G0128H022I022#";
My code is as follows:
string regEXstring = @"^[A-I0-9$#]+$";
Regex regex = new Regex(regEXstring);
if (regex.IsMatch(rawData))
{
dataOK = true;
}
else
dataOK = false;
What am I doing wrong?

After fixing your rawdata/rawData typo, the code works fine. The dataOK variable becomes true with your example data, and false if one adds other characters to the string.
Judging from your example data, you can tighten the verification so that it also checks that:
the string starts with $
the string ends with #
the string contains entities that consist of a single letter followed by at least three digits
For that, use a pattern like:
string regEXstring = @"^\$([A-I]\d{3,})+#$";
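A stricter variant of that pattern is sketched below. The names are hypothetical, and the choices are assumptions rather than requirements from the question: \A and \z are used instead of ^ and $ because in .NET $ also matches before a trailing newline, and [0-9] is used instead of \d because \d matches any Unicode decimal digit by default.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FrameValidator
{
    // "$", then one or more fields (a letter A-I plus at least three ASCII digits), then "#".
    private static readonly Regex Frame = new Regex(@"\A\$(?:[A-I][0-9]{3,})+#\z");

    public static bool IsValid(string rawData) => Frame.IsMatch(rawData);

    public static void Main()
    {
        string rawData = "$A008B20130503C103804D00000000E1022F0080G0128H022I022#";
        Console.WriteLine(IsValid(rawData));        // True
        Console.WriteLine(IsValid(rawData + "\n")); // False
        Console.WriteLine(IsValid("$J000#"));       // False
    }
}
```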

