removing similar string from an array in c# - c#

Suppose i have array of strings as follows:
string[] array = new string[6];
array[0] = "http://www.s8wministries.org/general.php?id=35";
array[1] = "http://www.s8wministries.org/general.php?id=52";
array[2] = "http://www.ecogybiofuels.com/general.php?id=6";
array[3] = "http://www.stjohnsheriff.com/general.php?id=186";
array[4] = "http://www.stjohnsheriff.com/general.php?id=7";
array[5] = "http://www.bickellawfirm.com/general.php?id=1048";
Now I want to store only one similar occurrence of the string ie http://www.s8wministries.org/general.php?id=35 discarding any other string that has http://www.s8wministries.org and store it in another array.
Please how do I go about this?
my attempt is as follows:-
//remove similar string from array storing only one similar in another array
foreach (var olu in array)
{
string findThisString = olu.ToString();
string firstTen = findThisString.Substring(0, 15);
// See if substring is in the table.
int index1 = Array.IndexOf(array, firstTen); //substring is not in table
}

try this with List of string, so you have list of string containing URL, you can use URI class to compare domains:
for(int i = 0; i < strList.Length; i++)
{
Uri uriToCompare = new Uri(strArray[i]);
for(int j = i+1; j < strArray.Length; j++){
Uri uri = new Uri(strArray[j]);
if( uriToCompare.Host == uri.Host){
strList.RemoveAt(j);
}
}
}

Here is how I would approach this
Initialize a hashtable or a dictionary for holding domain names
Loop through each item
Do a string split operation with using '', '.', '/' etc as delimiters - find out the domain by parsing the parts.
Check if the domain name exists in the hashtable. If it does, discard the current entry. If it doesn't exist, insert into the hashtable and also add the current entry to a new list of your selected entries.
Another option would be to sort the entries alphabetically. Go through them one at a time. Select an entry with the domain name. Skip all the next entries with the same domain name. Select the next entry when the domain name changes again.

Let's say the result is to be stored in an array called unique_array and that your current array is called array. Pseudo-code follows:
bool found = false;
for(int i = 0; i < array_size; i++)
{ if(array[i] starts with "http://www.s8wministries.org")
{ if(found) continue;
found = true;
}
add array[i] to end of unique_array;
}

I would go the way of slightly more automation by creating a class that inherits IEqualityComparer (utilizing the great answer to this question):
public class PropertyComparer<T> : IEqualityComparer<T>
{
Func<T, T, bool> comparer;
public PropertyComparer<T>(Func<T, T, bool> comparer)
{
this.comparer = comparer;
}
public bool Equals(T a, T b)
{
return comparer(a, b);
}
public int GetHashCode(T a)
{
return a.GetHashCode();
}
}
Once you have that class - you can use Distinct like this:
var distinctArray = array.Select(s => new Uri(s)).Distinct(new PropertyComparer<Uri>((a, b) => a.Host == b.Host));
That leaves you with an array only containing distinct domains. It's an IEnumerable so you may want to .ToList() it or something, or revert it back to strings from Uris . But I think this method makes for much more readable code.

Please try below Code:
string[] array = new string[6];
array[0] = "http://www.s8wministries.org/general.php?id=35";
array[1] = "http://www.s8wministries.org/general.php?id=52";
array[2] = "http://www.ecogybiofuels.com/general.php?id=6";
array[3] = "http://www.stjohnsheriff.com/general.php?id=186";
array[4] = "http://www.stjohnsheriff.com/general.php?id=7";
array[5] = "http://www.bickellawfirm.com/general.php?id=1048";
var regex = #"http://www.[\w]+.[\w]+";
var distList = new List<string>();
var finalList = new List<string>();
foreach (string str in array)
{
Match match = Regex.Match(str, regex, RegexOptions.IgnoreCase);
if (match.Success)
{
var uniqueUrl = match.Groups[0].Value;
if (!distList.Contains(uniqueUrl))
{
distList.Add(uniqueUrl);
finalList.Add(str);
}
}
}
Here finalList contains the required list of URLs

Related

How can I split a string to store contents in two different arrays in c#?

The string I want to split is an array of strings.
the array contains strings like:
G1,Active
G2,Inactive
G3,Inactive
.
.
G24,Active
Now I want to store the G's in an array, and Active or Inactive in a different array. So far I have tried this which has successfully store all the G's part but I have lost the other part. I used Split fucntion but did not work so I have tried this.
int i = 0;
for(i = 0; i <= grids.Length; i++)
{
string temp = grids[i];
temp = temp.Replace(",", " ");
if (temp.Contains(' '))
{
int index = temp.IndexOf(' ');
grids[i] = temp.Substring(0, index);
}
//System.Console.WriteLine(temp);
}
Please help me how to achieve this goal. I am new to C#.
If I understand the problem correctly - we have an array of strings Eg:
arrayOfStrings[24] =
{
"G1,Active",
"G2,Inactive",
"G3,Active",
...
"G24,Active"
}
Now we want to split each item and store the g part in one array and the status into another.
Working with arrays the solution is to - traverse the arrayOfStrings.
Per each item in the arrayOfStrings we split it by ',' separator.
The Split operation will return another array of two elements the g part and the status - which will be stored respectively into distinct arrays (gArray and statusArray) for later retrieval. Those arrays will have a 1-to-1 relation.
Here is my implementation:
static string[] LoadArray()
{
return new string[]
{
"G1,Active",
"G2,Inactive",
"G3,Active",
"G4,Active",
"G5,Active",
"G6,Inactive",
"G7,Active",
"G8,Active",
"G9,Active",
"G10,Active",
"G11,Inactive",
"G12,Active",
"G13,Active",
"G14,Inactive",
"G15,Active",
"G16,Inactive",
"G17,Active",
"G18,Active",
"G19,Inactive",
"G20,Active",
"G21,Inactive",
"G22,Active",
"G23,Inactive",
"G24,Active"
};
}
static void Main(string[] args)
{
string[] myarrayOfStrings = LoadArray();
string[] gArray = new string[24];
string[] statusArray = new string[24];
int index = 0;
foreach (var item in myarrayOfStrings)
{
var arraySplit = item.Split(',');
gArray[index] = arraySplit[0];
statusArray[index] = arraySplit[1];
index++;
}
for (int i = 0; i < gArray.Length; i++)
{
Console.WriteLine("{0} has status : {1}", gArray[i] , statusArray[i]);
}
Console.ReadLine();
}
seems like you have a list of Gxx,Active my recomendation is first of all you split the string based on the space, which will give you the array previoulsy mentioned doing the next:
string text = "G1,Active G2,Inactive G3,Inactive G24,Active";
string[] splitedGItems = text.Split(" ");
So, now you have an array, and I strongly recommend you to use an object/Tuple/Dictionary depends of what suits you more in the entire scenario. for now i will use Dictionary as it seems to be key-value
Dictionary<string, string> GxListActiveInactive = new Dictionary<string, string>();
foreach(var singleGItems in splitedGItems)
{
string[] definition = singleGItems.Split(",");
GxListActiveInactive.Add(definition[0], definition[1]);
}
What im achiving in this code is create a collection which is key-value, now you have to search the G24 manually doing the next
string G24Value = GxListActiveInactive.FirstOrDefault(a => a.Key == "G24").Value;
just do it :
var splitedArray = YourStringArray.ToDictionary(x=>x.Split(',')[0],x=>x.Split(',')[1]);
var gArray = splitedArray.Keys;
var activeInactiveArray = splitedArray.Values;
I hope it will be useful
You can divide the string using Split; the first part should be the G's, while the second part will be "Active" or "Inactive".
int i;
string[] temp, activity = new string[grids.Length];
for(i = 0; i <= grids.Length; i++)
{
temp = grids[i].Split(',');
grids[i] = temp[0];
activity[i] = temp[1];
}

How to find the placement of a List within another List?

I am working with two lists. The first contains a large sequence of strings. The second contains a smaller list of strings. I need to find where the second list exists in the first list.
I worked with enumeration, and due to the large size of the data, this is very slow, I was hoping for a faster way.
List<string> first = new List<string>() { "AAA","BBB","CCC","DDD","EEE","FFF" };
List<string> second = new List<string>() { "CCC","DDD","EEE" };
int x = SomeMagic(first,second);
And I would need x to = 2.
Ok, here is my variant with old-good-for-each-loop:
private int SomeMagic(IEnumerable<string> source, IEnumerable<string> target)
{
/* Some obvious checks for `source` and `target` lenght / nullity are ommited */
// searched pattern
var pattern = target.ToArray();
// candidates in form `candidate index` -> `checked length`
var candidates = new Dictionary<int, int>();
// iteration index
var index = 0;
// so, lets the magic begin
foreach (var value in source)
{
// check candidates
foreach (var candidate in candidates.Keys.ToArray()) // <- we are going to change this collection
{
var checkedLength = candidates[candidate];
if (value == pattern[checkedLength]) // <- here `checkedLength` is used in sense `nextPositionToCheck`
{
// candidate has match next value
checkedLength += 1;
// check if we are done here
if (checkedLength == pattern.Length) return candidate; // <- exit point
candidates[candidate] = checkedLength;
}
else
// candidate has failed
candidates.Remove(candidate);
}
// check for new candidate
if (value == pattern[0])
candidates.Add(index, 1);
index++;
}
// we did everything we could
return -1;
}
We use dictionary of candidates to handle situations like:
var first = new List<string> { "AAA","BBB","CCC","CCC","CCC","CCC","EEE","FFF" };
var second = new List<string> { "CCC","CCC","CCC","EEE" };
If you are willing to use MoreLinq then consider using Window:
var windows = first.Window(second.Count);
var result = windows
.Select((subset, index) => new { subset, index = (int?)index })
.Where(z => Enumerable.SequenceEqual(second, z.subset))
.Select(z => z.index)
.FirstOrDefault();
Console.WriteLine(result);
Console.ReadLine();
Window will allow you to look at 'slices' of the data in chunks (based on the length of your second list). Then SequenceEqual can be used to see if the slice is equal to second. If it is, the index can be returned. If it doesn't find a match, null will be returned.
Implemented SomeMagic method as below, this will return -1 if no match found, else it will return the index of start element in first list.
private int SomeMagic(List<string> first, List<string> second)
{
if (first.Count < second.Count)
{
return -1;
}
for (int i = 0; i <= first.Count - second.Count; i++)
{
List<string> partialFirst = first.GetRange(i, second.Count);
if (Enumerable.SequenceEqual(partialFirst, second))
return i;
}
return -1;
}
you can use intersect extension method using the namepace System.Linq
var CommonList = Listfirst.Intersect(Listsecond)

How to get an index of a word in string array

I want to get the index of a word in a string array.
for example, the sentence I will input is 'I love you.'
I have words[1] = love, how can I get the position of 'love' is 1? I could do it but just inside the if state. I want to bring it outside. Please help me.
This is my code.
static void Main(string[] args)
{
Console.WriteLine("sentence: ");
string a = Console.ReadLine();
String[] words = a.Split(' ');
List<string> verbs = new List<string>();
verbs.Add("love");
int i = 0;
while (i < words.Length) {
foreach (string verb in verbs) {
if (words[i] == verb) {
int index = i;
Console.WriteLine(i);
}
} i++;
}
Console.ReadKey();
}
I could do it but just inside the if state. I want to bring it outside.
Your code identifies the index correctly, all you need to do now is storing it for use outside the loop.
Make a list of ints, and call Add on it for the matches that you identify:
var indexes = new List<int>();
while (i < words.Length) {
foreach (string verb in verbs) {
if (words[i] == verb) {
int index = i;
indexes.Add(i);
break;
}
}
i++;
}
You can replace the inner loop with a call of Contains method, and the outer loop with a for:
for (var i = 0 ; i != words.Length ; i++) {
if (verbs.Contains(words[i])) {
indexes.Add(i);
}
}
Finally, the whole sequence can be converted to a single LINQ query:
var indexes = words
.Select((w,i) => new {w,i})
.Where(p => verbs.Contains(p.w))
.Select(p => p.i)
.ToList();
Here is an example
var a = "I love you.";
var words = a.Split(' ');
var index = Array.IndexOf(words,"love");
Console.WriteLine(index);
private int GetWordIndex(string WordOrigin, string GetWord)
{
string[] words = WordOrigin.Split(' ');
int Index = Array.IndexOf(words, GetWord);
return Index;
}
assuming that you called the function as GetWordIndex("Hello C# World", "C#");, WordOrigin is Hello C# World and GetWord is C#
now according to the function:
string[] words = WordsOrigin.Split(' '); broke the string literal into an array of strings where the words would be split for every spaces in between them. so Hello C# World would then be broken down into Hello, C#, and World.
int Index = Array.IndexOf(words, GetWord); gets the Index of whatever GetWord is, according to the sample i provided, we are looking for the word C# from Hello C# World that is then splitted into an Array of String
return Index; simply returns whatever index it was located from

IComparer for string that checks if x starts with y

I've got an array of strings and I need to get all strings, that start with some 'prefix'. I wanna use Array.BinarySearch(). Is it possible? And how should I write a comparer if so?
No, you cannot use BinarySearch in this case. You could use Enumerable.Where instead:
Dim query = From str In array Where str.StartsWith("prefix")
or with (ugly in VB.NET) method synatx:
query = array.Where(Function(str) str.StartsWith("prefix"))
Edit: whoops, C#
var query = array.Where(s => s.StartsWith("prefix"));
Use ToArray if you want to create a new filtered array.
It's easy to create your own StartsWithComparer:
class StartsWithComparer : IComparer<string>
{
public int Compare(string a, string b) {
if(a.StartsWith(b)) {
return 0;
}
return a.CompareTo(b);
}
}
As others pointed out, this will only return one index. You can have a couple of helpers to return all items:
IEnumerable<string> GetBefore(IList<string> sorted, int foundIndex, string prefix) {
for(var i = foundIndex - 1; i >= 0; i--) {
if(sorted[i].StartsWith(prefix)) {
yield return sorted[i];
}
}
}
IEnumerable<string> GetCurrentAndAfter(IList<string> sorted, int foundIndex, string prefix) {
for(var i = foundIndex; i < sorted.Count; i++) {
if(sorted[i].StartsWith(prefix)) {
yield return sorted[i];
}
}
}
Then to use it:
var index = sorted.BinarySearch("asdf", new StartsWithComparer());
var previous = GetBefore(sorted, index, "asdf");
var currentAndAfter = GetCurrentAndAfter(sorted, index, "asdf");
You can wrap the whole thing in your own class, with a single method that returns all items that start with your prefix.

compare the characters in two strings

In C#, how do I compare the characters in two strings.
For example, let's say I have these two strings
"bc3231dsc" and "bc3462dsc"
How do I programically figure out the the strings
both start with "bc3" and end with "dsc"?
So the given would be two variables:
var1 = "bc3231dsc";
var2 = "bc3462dsc";
After comparing each characters from var1 to var2, I would want the output to be:
leftMatch = "bc3";
center1 = "231";
center2 = "462";
rightMatch = "dsc";
Conditions:
1. The strings will always be a length of 9 character.
2. The strings are not case sensitive.
The string class has 2 methods (StartsWith and Endwith) that you can use.
After reading your question and the already given answers i think there are some constraints are missing, which are maybe obvious to you, but not to the community. But maybe we can do a little guess work:
You'll have a bunch of string pairs that should be compared.
The two strings in each pair are of the same length or you are only interested by comparing the characters read simultaneously from left to right.
Get some kind of enumeration that tells me where each block starts and how long it is.
Due to the fact, that a string is only a enumeration of chars you could use LINQ here to get an idea of the matching characters like this:
private IEnumerable<bool> CommonChars(string first, string second)
{
if (first == null)
throw new ArgumentNullException("first");
if (second == null)
throw new ArgumentNullException("second");
var charsToCompare = first.Zip(second, (LeftChar, RightChar) => new { LeftChar, RightChar });
var matchingChars = charsToCompare.Select(pair => pair.LeftChar == pair.RightChar);
return matchingChars;
}
With this we can proceed and now find out how long each block of consecutive true and false flags are with this method:
private IEnumerable<Tuple<int, int>> Pack(IEnumerable<bool> source)
{
if (source == null)
throw new ArgumentNullException("source");
using (var iterator = source.GetEnumerator())
{
if (!iterator.MoveNext())
{
yield break;
}
bool current = iterator.Current;
int index = 0;
int length = 1;
while (iterator.MoveNext())
{
if(current != iterator.Current)
{
yield return Tuple.Create(index, length);
index += length;
length = 0;
}
current = iterator.Current;
length++;
}
yield return Tuple.Create(index, length);
}
}
Currently i don't know if there is an already existing LINQ function that provides the same functionality. As far as i have already read it should be possible with SelectMany() (cause in theory you can accomplish any LINQ task with this method), but as an adhoc implementation the above was easier (for me).
These functions could then be used in a way something like this:
var firstString = "bc3231dsc";
var secondString = "bc3462dsc";
var commonChars = CommonChars(firstString, secondString);
var packs = Pack(commonChars);
foreach (var item in packs)
{
Console.WriteLine("Left side: " + firstString.Substring(item.Item1, item.Item2));
Console.WriteLine("Right side: " + secondString.Substring(item.Item1, item.Item2));
Console.WriteLine();
}
Which would you then give this output:
Left side: bc3
Right side: bc3
Left side: 231
Right side: 462
Left side: dsc
Right side: dsc
The biggest drawback is in someway the usage of Tuple cause it leads to the ugly property names Item1 and Item2 which are far away from being instantly readable. But if it is really wanted you could introduce your own simple class holding two integers and has some rock-solid property names. Also currently the information is lost about if each block is shared by both strings or if they are different. But once again it should be fairly simply to get this information also into the tuple or your own class.
static void Main(string[] args)
{
string test1 = "bc3231dsc";
string tes2 = "bc3462dsc";
string firstmatch = GetMatch(test1, tes2, false);
string lasttmatch = GetMatch(test1, tes2, true);
string center1 = test1.Substring(firstmatch.Length, test1.Length -(firstmatch.Length + lasttmatch.Length)) ;
string center2 = test2.Substring(firstmatch.Length, test1.Length -(firstmatch.Length + lasttmatch.Length)) ;
}
public static string GetMatch(string fist, string second, bool isReverse)
{
if (isReverse)
{
fist = ReverseString(fist);
second = ReverseString(second);
}
StringBuilder builder = new StringBuilder();
char[] ar1 = fist.ToArray();
for (int i = 0; i < ar1.Length; i++)
{
if (fist.Length > i + 1 && ar1[i].Equals(second[i]))
{
builder.Append(ar1[i]);
}
else
{
break;
}
}
if (isReverse)
{
return ReverseString(builder.ToString());
}
return builder.ToString();
}
public static string ReverseString(string s)
{
char[] arr = s.ToCharArray();
Array.Reverse(arr);
return new string(arr);
}
Pseudo code of what you need..
int stringpos = 0
string resultstart = ""
while not end of string (either of the two)
{
if string1.substr(stringpos) == string1.substr(stringpos)
resultstart =resultstart + string1.substr(stringpos)
else
exit while
}
resultstart has you start string.. you can do the same going backwards...
Another solution you can use is Regular Expressions.
Regex re = new Regex("^bc3.*?dsc$");
String first = "bc3231dsc";
if(re.IsMatch(first)) {
//Act accordingly...
}
This gives you more flexibility when matching. The pattern above matches any string that starts in bc3 and ends in dsc with anything between except a linefeed. By changing .*? to \d, you could specify that you only want digits between the two fields. From there, the possibilities are endless.
using System;
using System.Text.RegularExpressions;
using System.Collections.Generic;
class Sample {
static public void Main(){
string s1 = "bc3231dsc";
string s2 = "bc3462dsc";
List<string> common_str = commonStrings(s1,s2);
foreach ( var s in common_str)
Console.WriteLine(s);
}
static public List<string> commonStrings(string s1, string s2){
int len = s1.Length;
char [] match_chars = new char[len];
for(var i = 0; i < len ; ++i)
match_chars[i] = (Char.ToLower(s1[i])==Char.ToLower(s2[i]))? '#' : '_';
string pat = new String(match_chars);
Regex regex = new Regex("(#+)", RegexOptions.Compiled);
List<string> result = new List<string>();
foreach (Match match in regex.Matches(pat))
result.Add(s1.Substring(match.Index, match.Length));
return result;
}
}
for UPDATE CONDITION
using System;
class Sample {
static public void Main(){
string s1 = "bc3231dsc";
string s2 = "bc3462dsc";
int len = 9;//s1.Length;//cond.1)
int l_pos = 0;
int r_pos = len;
for(int i=0;i<len && Char.ToLower(s1[i])==Char.ToLower(s2[i]);++i){
++l_pos;
}
for(int i=len-1;i>0 && Char.ToLower(s1[i])==Char.ToLower(s2[i]);--i){
--r_pos;
}
string leftMatch = s1.Substring(0,l_pos);
string center1 = s1.Substring(l_pos, r_pos - l_pos);
string center2 = s2.Substring(l_pos, r_pos - l_pos);
string rightMatch = s1.Substring(r_pos);
Console.Write(
"leftMatch = \"{0}\"\n" +
"center1 = \"{1}\"\n" +
"center2 = \"{2}\"\n" +
"rightMatch = \"{3}\"\n",leftMatch, center1, center2, rightMatch);
}
}

Categories