Search multiple column values with one search string - c#

I have this query where I want to return results if the search string provided yields data from the 'FirstName' and 'LastName' columns of my database. They each work individually, as I can get results for Firstname = 'john' and LastName = 'doe'. I also want to be able to pass in a search string of 'John doe' and get results. How would I implement this using .NET/LINQ?
snippet:
var query = _context.Players.Where(p => p.Firstname.Contains(searchString.ToLower().Trim()) || p.Lastname.Contains(searchString.ToLower().Trim()));

Use the Split function, like the following:
var parts = searchString.Split();
var query = _context.Players.Where(
p => p.Firstname.Contains(parts[0].ToLower().Trim())
|| p.Lastname.Contains(parts[1].ToLower().Trim()));
Extracted from the official docs:
If the separator parameter is null or contains no characters, white-space characters are assumed to be the delimiters.

Separating the input data is also convenient:
var parts = searchString.Split();
var partOne = parts[0].ToLower().Trim();
var partTwo = parts[1].ToLower().Trim();
var query = _context.Players.Where(
p => p.Firstname.Contains(partOne)
|| p.Lastname.Contains(partTwo));
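Both snippets above index parts[1], which throws when the search string is a single word such as 'john'. A minimal in-memory sketch that avoids this is shown below: it requires every search term to appear in either name. The Player class and the PlayerSearch.Search helper are hypothetical names made up for illustration, and this runs over LINQ to Objects; whether the same predicate translates to SQL depends on your provider.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the entity behind _context.Players.
public class Player
{
    public string Firstname { get; set; }
    public string Lastname { get; set; }
}

public static class PlayerSearch
{
    // Returns players for which every whitespace-separated term
    // occurs in the first name or in the last name (case-insensitive).
    public static IEnumerable<Player> Search(IEnumerable<Player> players, string searchString)
    {
        // A null separator array means "split on white space".
        string[] terms = (searchString ?? string.Empty)
            .ToLower()
            .Split((char[])null, StringSplitOptions.RemoveEmptyEntries);

        if (terms.Length == 0)
        {
            return Enumerable.Empty<Player>();
        }

        return players.Where(p => terms.All(t =>
            (p.Firstname ?? string.Empty).ToLower().Contains(t) ||
            (p.Lastname ?? string.Empty).ToLower().Contains(t)));
    }
}
```

With this shape, 'john', 'doe' and 'John doe' all go through the same code path, and the word order in the search string does not matter.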

I created an extension method class to isolate the functional parts of the search algorithm. It is based on regex pattern matching. You can give this one a try.
public class Person
{
public string FirstName { get; set; }
public string LastName { get; set; }
}
static class Extensions
{
public static string Sanitize(this string item)
{
// Strings are immutable, so return the cleaned copy instead of reassigning the parameter.
Regex rgx = new Regex("[^a-zA-Z0-9 -]");
return rgx.Replace(item, " ");
}
public static string GetPipedString(this string item)
{
StringBuilder builder = new StringBuilder();
item.Split(' ').ToList().ForEach(x => builder.Append('|').Append(x));
builder.Remove(0, 1);
return builder.ToString();
}
public static IEnumerable<Person> FindPlayers(this IEnumerable<Person> persons, string searchKey)
{
searchKey = searchKey.Sanitize();
// Build an alternation such as ^(?:john|doe)\w*$ from the space-separated words.
string pattern = string.Format(@"^(?:{0})\w*$", searchKey.GetPipedString());
return persons.Where(x => Regex.IsMatch(
string.Join(string.Empty,
new List<string>() { x.FirstName, x.LastName }),
pattern,
RegexOptions.IgnoreCase | RegexOptions.IgnorePatternWhitespace));
}
}
class Program
{
static void Main(string[] args)
{
/* Assuming peoples is the IEnumerable<Person>.
Anyways there is an explicit sanitization of the string to remove the non alphanum characters*/
var items = peoples.FindPlayers("ANY DATA SPACE SEPARATED").ToList();
}
}

Related

C# / Sorting a text file / IComparer / Custom sort

I have a text file that I want to be sorted.
Each line has a package name, a pipe and a version number.
Examples:
AutoFixture|4.15.0
Castle.Windsor.Lifestyles|0.3.0
I tried to use the default list.Sort() method but I obtained:
AutoFixture|4.15.0
Castle.Core|3.3.0
Castle.Windsor.Lifestyles|0.3.0
Castle.Windsor|3.3.0
FluentAssertions|5.10.3
Instead of
AutoFixture|4.15.0
Castle.Core|3.3.0
Castle.Windsor|3.3.0
Castle.Windsor.Lifestyles|0.3.0
FluentAssertions|5.10.3
As shown, I would like "Castle.Windsor" to appear before "Castle.Windsor.Lifestyles".
I'm pretty sure I have to use the IComparer but I can't find a way to get the shorter name first.
So far, I have created a custom sort like this, which is not working:
public class PackageComparer : IComparer<string>
{
// Assume that each line has the format: name|number
private readonly Regex packageRegEx = new Regex(@"[\w.]+\|[\d.]+", RegexOptions.Compiled);
public int Compare(string x, string y)
{
var firstPackage = this.packageRegEx.Match(x);
var firstLeft = firstPackage.Groups[1].Value;
var firstRight = firstPackage.Groups[2].Value;
var secondPackage = this.packageRegEx.Match(y);
var secondLeft = secondPackage.Groups[1].Value;
var secondRight = secondPackage.Groups[2].Value;
if (firstLeft < secondLeft)
{
return -1;
}
if (firstRight > secondLeft)
{
return 1;
}
return string.CompareOrdinal(firstSceneAlpha, secondSceneAlpha);
}
}
Well, you can use Linq, split by the pipe and order by the package name then by the versioning:
var input = @"AutoFixture|4.15.0
Castle.Core|3.3.0
Castle.Windsor.Lifestyles|0.3.0
Castle.Windsor|3.3.0
FluentAssertions|5.10.3
Castle.Core|3.1.0";
var list = input.Split(new string[]{"\r\n","\n"},StringSplitOptions.None).ToList();
list = list
.OrderBy(x => x.Split('|')[0])
.ThenBy(x => new Version(x.Split('|')[1]))
.ToList();
Outputs:
AutoFixture|4.15.0
Castle.Core|3.1.0
Castle.Core|3.3.0
Castle.Windsor|3.3.0
Castle.Windsor.Lifestyles|0.3.0
FluentAssertions|5.10.3
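Since the question asks for an IComparer, the same ordering (package name first, then version) can also be sketched as a comparer. This is only an illustration under the assumption that every line has the form name|version and that the version parses with System.Version. It uses an ordinal name comparison, which is case-sensitive.

```csharp
using System;
using System.Collections.Generic;

public class PackageComparer : IComparer<string>
{
    // Assumes each line has the format: name|version
    public int Compare(string x, string y)
    {
        string[] left = x.Split('|');
        string[] right = y.Split('|');

        // Compare only the package names, so the '|' separator never
        // takes part in the comparison and a shorter name sorts first.
        int byName = string.CompareOrdinal(left[0], right[0]);
        if (byName != 0)
        {
            return byName;
        }

        // Same package: order by version number, not by text.
        return new Version(left[1]).CompareTo(new Version(right[1]));
    }
}
```

It can then be passed to the method the question started with: list.Sort(new PackageComparer());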
You can do something like this:
public class YourClassName
{
public string PackageName { get; set; }
public string Pipe { get; set; }
public string Version { get; set; }
}
Load your data into a list to sort:
List<YourClassName> list = /* source of data */;
list = SortList<YourClassName>(list, "PackageName");
SortList Method:
public List<YourClassName> SortList<TKey>(List<YourClassName> list, string sortBy)
{
PropertyInfo property = list.GetType().GetGenericArguments()[0].GetProperty(sortBy);
return list.OrderBy(e => property.GetValue(e, null)).ToList<YourClassName>();
}

mongodb query: filter results by string attribute that is partly matched (filter string is subset of attribute string)

i want to query data from a mongoDB database and want to apply a filter to it. Stuff like this works fine:
var wantedAttributes = "word";
Collection.Find(Builders<MyModel>.Filter.Eq("Attributes", wantedAttributes)).ToList();
but only if my wantedAttributes match exactly to the Attributes field value in the db.
My usecase is that, the Attributes values are lists of strings, like:
word1, word2, word3
word2, word3, word1
word3, word1, word4
What I want is a way to match all entries that contain a given set of words, not necessarily in the same order. More words are allowed, but not fewer!
So if my wantedAttributes = word4 I want to get the third entry only, and if my wantedAttributes = word1,word2 I want the first and the second.
The wantedAttributes do not necessarily have to be a string of comma-separated words, but the database entries are.
What is the best way to achieve that?
Try this:
var wanteds = "word4";
var filter = Builders<MyModel>.Filter.Empty;
foreach (var wanted in wanteds.Split(','))
{
filter = filter & Builders<MyModel>.Filter.Where(m => m.Str.Contains(wanted.Trim()));
}
var models = collection.Find(filter).ToList();
My model:
class MyModel
{
[BsonElement("id")]
public int Id { get; set; }
[BsonElement("str")]
public string Str { get; set; }
}
You can customize the wanted string and string separator.
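One thing to keep in mind with a substring Contains is that "word1" is also found inside "word10". The matching rule the question describes (every wanted word present, extra words allowed, order irrelevant) can be sketched in memory as a whole-word set comparison. This is not a MongoDB query, just an illustration of the rule; AttributeMatcher and ContainsAll are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AttributeMatcher
{
    // Splits "word1, word2" into the set { "word1", "word2" }.
    private static HashSet<string> Tokens(string csv)
    {
        return new HashSet<string>(
            csv.Split(',').Select(s => s.Trim()).Where(s => s.Length > 0));
    }

    // True when every wanted word is among the stored words,
    // regardless of order; additional stored words are allowed.
    public static bool ContainsAll(string storedAttributes, string wantedAttributes)
    {
        return Tokens(storedAttributes).IsSupersetOf(Tokens(wantedAttributes));
    }
}
```

For example, ContainsAll("word3, word1, word4", "word4") is true, while ContainsAll("word3, word1, word4", "word1,word2") is false.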
Since you asked for the best way, I'd suggest storing the attributes as a string array instead of a string in the db and querying like so:
var wantedAttributes = new[] { "word1", "word2" };
var result = collections.AsQueryable<MyModel>()
.Where(m => wantedAttributes.All(a => m.Attributes.Contains(a)))
.ToList();
This way you can index the Attributes field and get lightning-fast results. The m.Str.Contains(wanted.Trim()) approach which @charles suggested will cause a regex match, which will not be able to use an index. Only prefixed regex queries can use indexes in MongoDB.
here's full test program:
using MongoDB.Entities;
using MongoDB.Entities.Core;
using System;
using System.Linq;
namespace StackOverFlow
{
public class MyModel : Entity
{
public string[] Attributes { get; set; }
}
public static class Program
{
private static void Main()
{
new DB("test");
new[] {
new MyModel{
Attributes = new[]{ "word1", "word2", "word3" }
},
new MyModel{
Attributes = new[]{ "word2", "word3", "word1" }
},
new MyModel{
Attributes = new[]{ "word3", "word1", "word4" }
}
}
.Save();
var wantedAttributes = new[] { "word1", "word2" };
var result = DB.Queryable<MyModel>()
.Where(m => wantedAttributes.All(a => m.Attributes.Contains(a)))
.ToList();
}
}
}

Interleave an array of email addresses avoiding items with same domain to be consecutive

I'm looking for an efficient way of sorting an array of email addresses to avoid items with the same domain to be consecutive, in C#.
Email addresses inside the array are already distinct and all of them are lower case.
Example:
Given an array with the following entries:
john.doe#domain1.com
jane_doe#domain1.com
patricksmith#domain2.com
erick.brown#domain3.com
I would like to obtain something similar to the following:
john.doe#domain1.com
patricksmith#domain2.com
jane_doe#domain1.com
erick.brown#domain3.com
With the help of an extension method (stolen from https://stackoverflow.com/a/27533369/172769), you can go like this:
List<string> emails = new List<string>();
emails.Add("john.doe#domain1.com");
emails.Add("jane_doe#domain1.com");
emails.Add("patricksmith#domain2.com");
emails.Add("erick.brown#domain3.com");
var q = emails.GroupBy(m => m.Split('#')[1]).Select(g => new List<string>(g)).Interleave();
The Interleave method is defined as:
public static IEnumerable<T> Interleave<T>(this IEnumerable<IEnumerable<T>> source )
{
var queues = source.Select(x => new Queue<T>(x)).ToList();
while (queues.Any(x => x.Any())) {
foreach (var queue in queues.Where(x => x.Any())) {
yield return queue.Dequeue();
}
}
}
So basically, we create groups based on the domain part of the email adresses, project (or Select) each group into a List<string>, and then "Interleave" those lists.
I have tested against your sample data, but more thorough testing might be needed to find edge cases.
DotNetFiddle snippet
Cheers
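To look for the edge cases mentioned in the answer above, a small checker can report whether any two neighbouring items share a key. The sketch below is generic on purpose: pass the same key selector that was used for the GroupBy (the domain part of the address). SequenceChecks and HasAdjacentDuplicates are hypothetical names introduced here, not part of the linked answer.

```csharp
using System;
using System.Collections.Generic;

public static class SequenceChecks
{
    // Returns true if two consecutive items produce the same key.
    public static bool HasAdjacentDuplicates<T, TKey>(
        this IEnumerable<T> items, Func<T, TKey> keySelector)
    {
        var comparer = EqualityComparer<TKey>.Default;
        bool first = true;
        TKey previous = default(TKey);

        foreach (T item in items)
        {
            TKey key = keySelector(item);
            if (!first && comparer.Equals(previous, key))
            {
                return true;
            }
            previous = key;
            first = false;
        }
        return false;
    }
}
```

Running it over inputs where one domain dominates shows quickly where round-robin interleaving has to place the same domain twice in a row.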
This will distribute them semi-evenly and attempt to avoid matching domains next to each other (although in certain lists that may be impossible). This answer will use OOP and Linq.
DotNetFiddle.Net Example
using System;
using System.Linq;
using System.Collections.Generic;
public class Program
{
public static void Main()
{
var seed = new List<string>()
{
"1#a.com",
"2#a.com",
"3#a.com",
"4#a.com",
"5#a.com",
"6#a.com",
"7#a.com",
"8#a.com",
"9#a.com",
"10#a.com",
"1#b.com",
"2#b.com",
"3#b.com",
"1#c.com",
"4#b.com",
"2#c.com",
"3#c.com",
"4#c.com"
};
var work = seed
// Create a list of EmailAddress objects
.Select(s => new EmailAddress(s)) // s.ToLowerCase() ?
// Group the list by Domain
.GroupBy(s => s.Domain)
// Create a List<EmailAddressGroup>
.Select(g => new EmailAddressGroup(g))
.ToList();
var currentDomain = string.Empty;
while(work.Count > 0)
{
// this list should not be the same domain we just used
var noDups = work.Where(w => w.Domain != currentDomain);
// if none exist we are done, or it can't be solved
if (noDups.Count() == 0)
{
break;
}
// find the first group with the most items
var workGroup = noDups.First(w => w.Count() == noDups.Max(g => g.Count()));
// get the email address and remove it from the group list
var workItem = workGroup.Remove();
// if the group is empty remove it from *work*
if (workGroup.Count() == 0)
{
work.Remove(workGroup);
Console.WriteLine("removed: " + workGroup.Domain);
}
Console.WriteLine(workItem.FullEmail);
// last domain looked at.
currentDomain = workItem.Domain;
}
Console.WriteLine("Cannot disperse email addresses affectively, left overs:");
foreach(var workGroup in work)
{
while(workGroup.Count() > 0)
{
var item = workGroup.Remove();
Console.WriteLine(item.FullEmail);
}
}
}
public class EmailAddress
{
public EmailAddress(string emailAddress)
{
// Additional Email Address Validation
var result = emailAddress.Split(new char[] {'#'}, StringSplitOptions.RemoveEmptyEntries)
.ToList();
if (result.Count() != 2)
{
throw new ArgumentException("emailAddress");
}
this.FullEmail = emailAddress;
this.Name = result[0];
this.Domain = result[1];
}
public string Name { get; private set; }
public string Domain { get; private set; }
public string FullEmail { get; private set; }
}
public class EmailAddressGroup
{
private List<EmailAddress> _emails;
public EmailAddressGroup(IEnumerable<EmailAddress> emails)
{
this._emails = emails.ToList();
this.Domain = emails.First().Domain;
}
public int Count()
{
return _emails.Count();
}
public string Domain { get; private set; }
public EmailAddress Remove()
{
var result = _emails.First();
_emails.Remove(result);
return result;
}
}
}
Output:
1#a.com
1#b.com
2#a.com
1#c.com
3#a.com
2#b.com
4#a.com
2#c.com
5#a.com
3#b.com
6#a.com
3#c.com
7#a.com
removed: b.com
4#b.com
8#a.com
removed: c.com
4#c.com
9#a.com
Cannot disperse email addresses affectively, left overs:
10#a.com
Something like this will spread them equally, but you will have the problems (=consecutive elements) at the end of the new list...
var list = new List<string>();
list.Add("john.doe#domain1.com");
list.Add("jane_doe#domain1.com");
list.Add("patricksmith#domain2.com");
list.Add("erick.brown#domain3.com");
var x = list.GroupBy(content => content.Split('#')[1]);
var newlist = new List<string>();
bool addedSomething=true;
int i = 0;
while (addedSomething) {
addedSomething = false;
foreach (var grp in x) {
if (grp.Count() > i) {
newlist.Add(grp.ElementAt(i));
addedSomething = true;
}
}
i++;
}
Edit: Added a high level description :)
What this code does is group each element by the domain, sort the groups by size in descending order (largest group first), project the elements of each group into a stack, and pop them off of each stack (always pop the next element off the largest stack with a different domain). If there is only a single stack left, then its contents are yielded.
This should make sure that all domains are distributed as evenly as possible.
MaxBy extension method from: https://stackoverflow.com/a/31560586/969962
private IEnumerable<string> GetNonConsecutiveEmails(List<string> list)
{
var emailAddresses = list.Distinct().Select(email => new EmailAddress { Email = email, Domain = email.Split('#')[1]}).ToArray();
var groups = emailAddresses
.GroupBy(addr => addr.Domain)
.Select (group => new { Domain = group.Key, EmailAddresses = new Stack<EmailAddress>(group)})
.ToList();
EmailAddress lastEmail = null;
while(groups.Any(g => g.EmailAddresses.Any()))
{
// Try and pick from the largest stack.
var stack = groups
.Where(g => (g.EmailAddresses.Any()) && (lastEmail == null ? true : lastEmail.Domain != g.Domain))
.MaxBy(g => g.EmailAddresses.Count);
// Null check to account for only 1 stack being left.
// If so, pop the elements off the remaining stack.
lastEmail = (stack ?? groups.First(g => g.EmailAddresses.Any())).EmailAddresses.Pop();
yield return lastEmail.Email;
}
}
class EmailAddress
{
public string Domain;
public string Email;
}
public static class Extensions
{
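public static T MaxBy<T,U>(this IEnumerable<T> data, Func<T,U> f) where U:IComparable
{
// Aggregate without a seed throws on an empty sequence; DefaultIfEmpty makes it
// return default(T) (null for reference types) so the caller's null check works.
return data.DefaultIfEmpty().Aggregate((i1, i2) => f(i1).CompareTo(f(i2))>0 ? i1 : i2);
}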
}
What I am trying to do here is to sort them first.
Then I re-arrange from a different end. I'm sure there're more efficient ways to do this but this is one easy way to do it.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace ConsoleApplication4
{
class Program
{
static void Main(string[] args)
{
String[] emails = { "john.doe#domain1.com", "jane_doe#domain1.com", "patricksmith#domain2.com", "erick.brown#domain3.com" };
var result = process(emails);
}
static String[] process(String[] emails)
{
String[] result = new String[emails.Length];
var comparer = new DomainComparer();
Array.Sort(emails, comparer);
// Use i <= j so the middle element of an odd-length array is not skipped.
for (int i = 0, j = emails.Length - 1, k = 0; i <= j; i++, j--, k += 2)
{
if (i == j)
result[k] = emails[i];
else
{
result[k] = emails[i];
result[k + 1] = emails[j];
}
}
return result;
}
}
public class DomainComparer : IComparer<string>
{
public int Compare(string left, string right)
{
int at_pos = left.IndexOf('#');
var left_domain = left.Substring(at_pos, left.Length - at_pos);
at_pos = right.IndexOf('#');
var right_domain = right.Substring(at_pos, right.Length - at_pos);
return String.Compare(left_domain, right_domain);
}
}
}

Searching Country by Country code

I am working on a search method, which will be called with Ajax, and updates a Webgrid in Mvc4.
The search will go through a list of Project objects, which contains some fields.
One of the fields is Country. And right now my code only checks if the input string contains the search string:
private bool StringStartWith(string input, string searchstring)
{
bool startwith = false;
var inputlist = new List<string>(input.ToLower().Split(' ').Distinct());
var searchList = new List<string>(searchstring.ToLower().Split(' '));
var count = (from inp in inputlist from sear in searchList where inp.StartsWith(sear) select inp).Count();
if (count == searchList.Count)
startwith = true;
return startwith;
}
But I also want to be able to search by country code. So if I write "DK", it should tell that it is equal to Denmark.
I hope I can get some help for it.
Thanks.
//UPDATE!!
iTURTEV answer helped me to make my method work as it should. I just had to update my method as shown here:
private bool InputStartWithSearch(string input, string searchstring)
{
// Drop any trailing spaces before splitting the search string into words.
searchstring = searchstring.TrimEnd(' ');
bool startwith = false;
var inputlist = new List<string>(input.ToLower().Split(' ').Distinct());
var searchList = new List<string>(searchstring.ToLower().Split(' '));
if (searchstring.Length == 2)
{
var countryCode = new RegionInfo(searchstring.ToUpper()).EnglishName;
if (inputlist.Any(country => country.ToLower().Equals(countryCode.ToLower())))
{
return true;
}
}
var count = (from inp in inputlist from sear in searchList where inp.StartsWith(sear) select inp).Count();
if (count == searchList.Count)
startwith = true;
return startwith;
}
Thanks a lot.
Maybe you can use RegionInfo:
// returns Bulgaria
new RegionInfo("BG").EnglishName;
Assuming:
public class Country {
public int Id { get; set; }
public string Name { get; set; }
public string IsoCode { get; set; }
}
Then:
return x.Countries.Where(q =>
q.Name != null && q.Name.ToLowerInvariant().Contains(text) ||
q.IsoCode != null && q.IsoCode.ToLowerInvariant().Contains(text));
This will return every Country that has text in its name or code. It's important to check for nulls unless you're using the [Required] data annotation. If you don't want the search to be case-insensitive, you can remove the .ToLowerInvariant() calls.
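Note that the comparison above is only case-insensitive if text itself is lower-cased as well. A self-contained in-memory sketch of the same filter follows; it assumes the Country class from this answer, while CountrySearch.Find and the needle variable are names made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Country
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string IsoCode { get; set; }
}

public static class CountrySearch
{
    // Returns every country whose name or ISO code contains the text,
    // ignoring case and skipping null properties.
    public static List<Country> Find(IEnumerable<Country> countries, string text)
    {
        // Lower-case the search text once so both sides compare equally.
        string needle = (text ?? string.Empty).Trim().ToLowerInvariant();

        return countries.Where(q =>
                (q.Name != null && q.Name.ToLowerInvariant().Contains(needle)) ||
                (q.IsoCode != null && q.IsoCode.ToLowerInvariant().Contains(needle)))
            .ToList();
    }
}
```

An empty search text matches every country that has a non-null name or code, so callers may want to handle that case separately.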

Returning table with CLR

I want to write a CLR procedure which takes a text and returns a table with all the words in this text. But I can't figure out how to return a table. Could you please tell me how?
[Microsoft.SqlServer.Server.SqlFunction]
public static WhatTypeShouldIWriteHere Function1(SqlString str)
{
string[] words = Regex.Split(str, @"\W+").Distinct().ToArray();
//how to return a table with one column of words?
}
Thank you for your help.
UPDATED: I need to do it for sql-2005
Here is a full blown sample. I got tired of searching for this myself and even though this is answered, I thought I would post this just to keep a fresh reference online.
using System;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
using System.Text.RegularExpressions;
using System.Collections;
using System.Collections.Generic;
public partial class UserDefinedFunctions {
[SqlFunction]
public static SqlBoolean RegexPatternMatch(string Input, string Pattern) {
return Regex.Match(Input, Pattern).Success ? new SqlBoolean(true) : new SqlBoolean(false);
}
[SqlFunction]
public static SqlString RegexGroupValue(string Input, string Pattern, int GroupNumber) {
Match m = Regex.Match(Input, Pattern);
SqlString value = m.Success ? m.Groups[GroupNumber].Value : null;
return value;
}
[SqlFunction(DataAccess = DataAccessKind.Read, FillRowMethodName = "FillMatches", TableDefinition = "GroupNumber int, MatchText nvarchar(4000)")]
public static IEnumerable RegexGroupValues(string Input, string Pattern) {
List<RegexMatch> GroupCollection = new List<RegexMatch>();
Match m = Regex.Match(Input, Pattern);
if (m.Success) {
for (int i = 0; i < m.Groups.Count; i++) {
GroupCollection.Add(new RegexMatch(i, m.Groups[i].Value));
}
}
return GroupCollection;
}
public static void FillMatches(object Group, out SqlInt32 GroupNumber, out SqlString MatchText) {
RegexMatch rm = (RegexMatch)Group;
GroupNumber = rm.GroupNumber;
MatchText = rm.MatchText;
}
private class RegexMatch {
public SqlInt32 GroupNumber { get; set; }
public SqlString MatchText { get; set; }
public RegexMatch(SqlInt32 group, SqlString match) {
this.GroupNumber = group;
this.MatchText = match;
}
}
};
You can return any list that implements an IEnumerable. Check this out.
[SqlFunction(DataAccess = DataAccessKind.Read, FillRowMethodName = "FillMatches", TableDefinition = "GroupNumber int, MatchText nvarchar(4000)")]
public static IEnumerable Findall(string Pattern, string Input)
{
List<RegexMatch> GroupCollection = new List<RegexMatch>();
Regex regex = new Regex(Pattern);
if (regex.Match(Input).Success)
{
int i = 0;
foreach (Match match in regex.Matches(Input))
{
GroupCollection.Add(new RegexMatch(i, match.Groups[0].Value));
i++;
}
}
return GroupCollection;
}
That was a slight alteration from the code by "Damon Drake"
This one does a findall instead of returning the first value found.
so
declare @txt varchar(100) = 'Race Stat 2017-2018 -(FINAL)';
select * from dbo.findall('(\d+)', @txt)
returns two rows: (0, 2017) and (1, 2018).
This is a new area of SQL Server; you should consult this article, which shows the syntax of a table-valued function. That is what you want to create.
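Applied to the original question (one column of distinct words), the same IEnumerable plus fill-row pattern could look roughly like the sketch below. WordFunctions, SplitWords and FillWord are hypothetical names. The SqlFunction attribute is shown only as a comment so that the sketch compiles outside a SQLCLR project, and LINQ is avoided on the assumption that SQL Server 2005 hosts an older .NET runtime.

```csharp
using System.Collections;
using System.Collections.Generic;
using System.Data.SqlTypes;
using System.Text.RegularExpressions;

public static class WordFunctions
{
    // In a SQLCLR project, decorate this method with:
    // [SqlFunction(FillRowMethodName = "FillWord", TableDefinition = "Word nvarchar(4000)")]
    public static IEnumerable SplitWords(SqlString str)
    {
        List<string> words = new List<string>();
        if (str.IsNull)
        {
            return words;
        }

        // Split on runs of non-word characters and keep each word once.
        foreach (string word in Regex.Split(str.Value, @"\W+"))
        {
            if (word.Length > 0 && !words.Contains(word))
            {
                words.Add(word);
            }
        }
        return words;
    }

    // Called once per item returned by SplitWords to fill one table row.
    public static void FillWord(object row, out SqlString word)
    {
        word = new SqlString((string)row);
    }
}
```

Each item yielded by the function becomes one row, and the fill-row method maps that item onto the columns named in TableDefinition.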
