c# mongodb case sensitive search - c#

I have a collection in which I store each user's email and password.
I obviously don't want to require users to type their email with exactly the same casing they used when they first registered.
I'm using the MongoDB 2.0 C# driver. I mention it because I have seen solutions written with regex queries, but I'm afraid I can't use them here.
My query looks like this:
var filter = Builders<ME_User>.Filter.And(
Builders<ME_User>.Filter.Eq(u => u.Email, email),
Builders<ME_User>.Filter.Eq(u => u.Password, password));
ME_User foundUser = null;
var options = new FindOptions<ME_User>
{
Limit = 1
};
using (var cursor = await manager.User.FindAsync(filter, options))
{
while (await cursor.MoveNextAsync())
{
var batch = cursor.Current;
foreach (ME_User user in batch)
foundUser = user;
}
}
Call me obsessive about tidiness, but I can't bring myself to store this data a second time in lower case and keep two copies of the same thing. I also want the email to be saved EXACTLY as the user entered it.

Filtering on string fields in MongoDB is case sensitive unless you use regular expressions. Why exactly can't you use regular expressions?
Your query can be rewritten like this:
var filter = Builders<ME_User>.Filter.And(
Builders<ME_User>.Filter.Regex(u => u.Email, new BsonRegularExpression("/^" + email + "$/i")),
Builders<ME_User>.Filter.Eq(u => u.Password, password));
Notice the "^" and "$" anchors, which force the pattern to match the whole field value rather than a substring, and, most importantly, the case-insensitive option at the end of the regular expression ("/i").
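One caveat with concatenating the email straight into the pattern: characters such as "." and "+" are regex operators, so the pattern may match more (or less) than intended. A minimal sketch of building an escaped, anchored pattern follows. The class and method names are hypothetical, and the local check uses .NET's regex engine, not MongoDB's, so treat it only as an illustration.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the MongoDB driver.
public static class EmailPattern
{
    // Builds an anchored pattern that matches the whole e-mail literally.
    // Regex.Escape neutralises characters such as '.' and '+', which are
    // common in e-mail addresses and would otherwise act as regex operators.
    public static string Exact(string email)
    {
        return "^" + Regex.Escape(email) + "$";
    }

    // Local sanity check of the pattern with .NET's regex engine.
    public static bool Matches(string stored, string typed)
    {
        return Regex.IsMatch(stored, Exact(typed), RegexOptions.IgnoreCase);
    }
}
```

The resulting string could then be handed to the driver as `new BsonRegularExpression(EmailPattern.Exact(email), "i")`, using the two-argument constructor that also appears further down this page.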
Another way would be a text search, which requires creating a text index and is case insensitive for the Latin alphabet: http://docs.mongodb.org/manual/reference/operator/query/text/#match-operation
In C#, you would use the Text filter:
var filter = Builders<ME_User>.Filter.And(
Builders<ME_User>.Filter.Text(email),
Builders<ME_User>.Filter.Eq(u => u.Password, password));
If you use a text index query in an OR clause, you will need to create an index on the Password field as well; otherwise the OR query will produce an error:
Other non-TEXT clauses under OR have to be indexed as well

I'd prefer using Linq.
var match = theCollection.AsQueryable().SingleOrDefault(x =>
x.Email.ToLower() == emailToSearchFor.ToLower());

For the C# driver 2.1 (MongoDB 3.0):
By default, the search is case sensitive.
To make it case insensitive, just add the "i" option to the BsonRegularExpression, as shown below:
var filter = Builders<Users>.Filter.Regex(k=>k.Name, new BsonRegularExpression(name, "i"));
var users = await _mongoDbContext.Users.Find(filter).ToListAsync();

Related

Interpolated strings stored in a variable [duplicate]

Can one store the template of a string in a variable and use interpolation on it?
var name = "Joe";
var template = "Hi {name}";
I then want to do something like:
var result = $template;
The reason is my templates will come from a database.
I guess that these strings will have always the same number of parameters, even if they can change. For example, today template is "Hi {name}", and tomorrow could be "Hello {name}".
Short answer: No, you cannot do what you have proposed.
Alternative 1: use the string.Format method.
You can store in your database something like this:
"Hi {0}"
Then, when you retrieve the string template from the db, you can write:
var template = "Hi {0}"; //retrieved from db
var name = "Joe";
var result = string.Format(template, name);
//now result is "Hi Joe"
With 2 parameters:
var name2a = "Mike";
var name2b = "John";
var template2 = "Hi {0} and {1}!"; //retrieved from db
var result2 = string.Format(template2, name2a, name2b);
//now result2 is "Hi Mike and John!"
Alternative 2: use a placeholder.
You can store in your database something like this:
"Hi {name}"
Then, when you retrieve the string template from the db, you can write:
var template = "Hi {name}"; //retrieved from db
var name = "Joe";
var result = template.Replace("{name}", name);
//now result is "Hi Joe"
With 2 parameters:
var name2a = "Mike";
var name2b = "John";
var template2 = "Hi {name2a} and {name2b}!"; //retrieved from db
var result2 = template2
.Replace("{name2a}", name2a)
.Replace("{name2b}", name2b);
//now result2 is "Hi Mike and John!"
Pay attention to which token you choose for your placeholders. Here I used surrounding curly brackets {}. You should pick something that is unlikely to collide with the rest of your text, and that depends entirely on your context.
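As a variation on the placeholder approach, the tokens can be replaced in a single pass with a regular expression, so that a substituted value containing braces is not scanned again. This is only a sketch: the `TemplateFiller` class is hypothetical, and it assumes placeholder names consist of letters, digits, or underscores. Unknown tokens are left untouched.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration.
public static class TemplateFiller
{
    // Matches tokens such as {name}: braces around letters, digits, or underscores.
    private static readonly Regex Token = new Regex(@"\{(\w+)\}");

    // Replaces every known token with its value; unknown tokens stay as they are.
    public static string Fill(string template, IDictionary<string, string> values)
    {
        return Token.Replace(template, m =>
        {
            string value;
            return values.TryGetValue(m.Groups[1].Value, out value) ? value : m.Value;
        });
    }
}
```

For example, `TemplateFiller.Fill("Hi {name}", values)` with `values["name"] = "Joe"` would produce "Hi Joe".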
This can be done as requested using dynamic compilation, such as through the Microsoft.CodeAnalysis.CSharp.Scripting package. For example:
var name = "Joe";
var template = "Hi {name}";
var result = await CSharpScript.EvaluateAsync<string>(
"var name = \"" + name + "\"; " +
"return $\"" + template + "\";");
Note that this approach is slow, and you'd need to add more logic to handle escaping of quotes (and injection attacks) within strings, but the above serves as a proof-of-concept.
No, you can't do that: an interpolated string needs the value of name at the point where the string literal is written, because the compiler resolves the placeholders at compile time. Consider using String.Format or String.Replace instead.
I just had the same need in my app so will share my solution using String.Replace(). If you're able to use LINQ then you can use the Aggregate method (which is a reducing function, if you're familiar with functional programming) combined with a Dictionary that provides the substitutions you want.
string template = "Hi, {name} {surname}";
Dictionary<string, string> substitutions = new Dictionary<string, string>() {
{ "name", "Joe" },
{ "surname", "Bloggs" },
};
string result = substitutions.Aggregate(template, (args, pair) =>
args.Replace($"{{{pair.Key}}}", pair.Value)
);
// result == "Hi, Joe Bloggs"
This works by starting with the template and then iterating over each item in the substitution dictionary, replacing the occurrences of each one. The result of one Replace() call is fed into the input to the next, until all substitutions are performed.
The {{{pair.Key}}} bit is just to escape the { and } used to find a placeholder.
This is pretty old now, but as I've just come across it it's new to me!
It's a bit overkill for what you need, but I have used Handlebars.NET for this sort of thing.
You can create quite complex templates and merge in hierarchical data structures for the context. There are rules for looping and conditional sections, partial template compositing, and even helper function extension points. It also handles many data types gracefully.
There's way too much to go into here, but a short example to illustrate...
var source = @"Hello {{Guest.FirstName}}{{#if Guest.Surname}} {{Guest.Surname}}{{/if}}!";
var template = Handlebars.Compile(source);
var rec = new {
// The cast is needed: a bare null cannot initialize an anonymous-type property.
Guest = new { FirstName = "Bob", Surname = (string)null }
};
var resultString = template(rec);
In this case the surname will only be included in the output if the value is not null or empty.
Now admittedly this is more complicated for users than simple string interpolation, but remember that you can still just use {{fieldName}} if you want to, just that you can do lots more as well.
This particular nuGet is a port of HandlebarsJs so it has a high degree of compatibility. HandlebarsJs is itself a port of Mustache - there are direct dotNet ports of Mustache but IMHO HandlebarsNET is the business.

DocumentDB filter an array by an array

I have a document that looks essentially like this:
{
"Name": "John Smith",
"Value": "SomethingIneed",
"Tags": ["Tag1", "Tag2", "Tag3"]
}
My goal is to write a query that finds all documents in my database whose Tags property contains all of the tags in a filter.
For example, in the case above, my query might be ["Tag1", "Tag3"]. I want all documents whose tags collection contains Tag1 AND Tag3.
I have done the following:
tried an All Contains type linq query
var tags = new List<string>() {"Test", "TestAccount"};
var req =
Client.CreateDocumentQuery<Contact>(UriFactory.CreateDocumentCollectionUri("db", "collection"))
.Where(x => x.Tags.All(y => tags.Contains(y)))
.ToList();
Created a user defined function (I couldn't get this to work at all)
var tagString = "'Test', 'TestAccount'";
var req =
Client.CreateDocumentQuery<Contact>(UriFactory.CreateDocumentCollectionUri("db", "collection"),
$"Select c.Name, c.Email, c.id from c WHERE udf.containsAll([${tagString}] , c.Tags)").ToList();
with containsAll defined as:
function arrayContainsAnotherArray(needle, haystack){
for(var i = 0; i < needle.length; i++){
if(haystack.indexOf(needle[i]) === -1)
return false;
}
return true;
}
Use System.Linq.Dynamic to create a predicate from a string
var query = new StringBuilder("ItemType = \"MyType\"");
if (search.CollectionValues.Any())
{
foreach (var searchCollectionValue in search.CollectionValues)
{
query.Append($" and Collection.Contains(\"{searchCollectionValue}\")");
}
}
The third approach (System.Linq.Dynamic) actually worked for me, but the query was very expensive (more than 2000 RUs on a collection of 10K documents) and I am getting throttled like crazy. The first iteration of my application must be able to support 10K results in the result set. How can I best query for a large number of results with an array of filters?
Thanks.
The UDF could be made to work but it would be a full table scan and so not recommended unless combined with other highly-selective criteria.
I believe the most performant (index-using) approach would be to split it into a series of AND statements. You could do this by building up your query string programmatically (being careful to fully escape any user-provided data for security reasons). The resulting query would look like:
SELECT *
FROM c
WHERE
ARRAY_CONTAINS(c.Tags, "Tag1") AND
ARRAY_CONTAINS(c.Tags, "Tag3")
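Building that query programmatically could look roughly like the sketch below. It uses parameter placeholders instead of string escaping, which sidesteps the injection concern. The `TagQuery` and `TagQueryBuilder` types and the `@tagN` parameter names are hypothetical; the text and name/value pairs would still have to be passed to the SDK's parameterized query types (for example `SqlQuerySpec` with a `SqlParameterCollection`), which is not shown here.

```csharp
using System.Collections.Generic;

// Hypothetical container for the generated query text and its parameters.
public sealed class TagQuery
{
    public string Text { get; set; }
    public Dictionary<string, string> Parameters { get; set; }
}

public static class TagQueryBuilder
{
    // Emits one ARRAY_CONTAINS clause per tag, joined with AND,
    // and records each tag value under its own parameter name.
    public static TagQuery Build(IList<string> tags)
    {
        var parameters = new Dictionary<string, string>();
        var clauses = new List<string>();
        for (int i = 0; i < tags.Count; i++)
        {
            string name = "@tag" + i;
            parameters[name] = tags[i];
            clauses.Add("ARRAY_CONTAINS(c.Tags, " + name + ")");
        }

        // With no tags there is nothing to filter on, so no WHERE clause is emitted.
        string where = clauses.Count == 0 ? "" : " WHERE " + string.Join(" AND ", clauses);
        return new TagQuery { Text = "SELECT * FROM c" + where, Parameters = parameters };
    }
}
```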

part of String contained in a List

Here's what I'm trying to do:
Create a list with some values from mysql.
Search this list with a variable ( I named it Existed )
If Existed contains a specific string, then do some actions.
Here's a sample of my list data:
List ( name users )
Facebook
Google
Yahoo
Strongman
Zombies
Stratovarius
If Existed inside users contains Strong, then perform some action.
My code so far is below. The problem is that it never enters the action and for some reason I believe it does not see "Strong" right.
List<string> users = dbm.FindManagers();
foreach (var Existed in users)
{
if (Existed.Contains(rName_Add_User_result))
{
dbm.AddSubuser(Existed, rName_result);
}
}
Can't reproduce. This works for me:
var rName_Add_User_result = " Strong ";
//List<string> users = dbm.FindManagers();
var users = new List<string>() {"Facebook", "Google", "Yahoo", "Strongman", "Zombies", "Stratovarius"};
foreach (var Existed in users.Where(u => u.ToUpper().Contains(rName_Add_User_result.ToUpper().Trim())))
{
//dbm.AddSubuser(Existed, rName_result);
Console.WriteLine(Existed);
}
Result:
Strongman
Not sure, but it could be because of case sensitivity. Try converting both strings to lower case before comparing:
if (Existed.ToLower().Contains(rName_Add_User_result.ToLower()))
{
dbm.AddSubuser(Existed, rName_result);
}
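Both suggestions above boil down to a case-insensitive, whitespace-tolerant substring test. A minimal sketch of that check without allocating upper- or lower-cased copies is shown here. `NameSearch` is a hypothetical helper, and the choice of ordinal comparison is an assumption: a culture-aware comparison may suit some data better.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration.
public static class NameSearch
{
    // Returns the names that contain the fragment, ignoring case and
    // surrounding whitespace in the fragment. An empty fragment matches
    // every non-null name, because IndexOf("") returns 0.
    public static List<string> FindContaining(IEnumerable<string> names, string fragment)
    {
        string needle = (fragment ?? string.Empty).Trim();
        return names
            .Where(n => n != null && n.IndexOf(needle, StringComparison.OrdinalIgnoreCase) >= 0)
            .ToList();
    }
}
```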

Keywords in Full Text Index Searching causing results to not come back

We are using Full Text Index Searching on a Company Name field.
We are using EF for the data layer, and I have been asked to not use stored procs.
Here is the method in my Data Access layer:
public Task<List<Company>> SearchByName(string searchText)
{
return DataContext.Company.SqlQuery(
"select CompanyId AS Id, * from Company.Company AS c where contains(c.Name, @SearchText)",
new SqlParameter("@SearchText", ParseSearchTextForMultiwordSearch(searchText)))
.ToListAsync();
}
We wanted to split the search text into words and then concatenate them for an AND search. This means that a query like "My Company" is actually searched against the index for the words "My" and "Company".
This code does the merging of terms for the select query above.
public string ParseSearchTextForMultiwordSearch(string searchText)
{
var words = GetValidSearchTerms(searchText);
var quotedWords = words.Select(x => string.Format("\"{0}*\"", x));
return string.Join(" AND ", quotedWords);
}
Everything works great until the search includes "keywords". So far we have found that including and, or, or not in a search returns 0 results. There is no error; there are simply no results.
Here is our method that "blacklists" certain words so they are left out of the search query.
private static List<string> GetValidSearchTerms(string searchText)
{
//AND and OR are keywords used by SQL Server Full Text Indexing.
var blacklist = new string[] {
"and",
"or",
"not"
};
//Filter them out here
var words = searchText.Split(' ');
var validWords = words.Where(x => !blacklist.Contains(x));
return validWords.ToList();
}
The problem is that we just discovered another "keyword" that seems to cause the issue: "Do" causes no results to come back. I can just add it to the blacklist, but as this list grows it is starting to feel like the wrong way to handle this.
Is there a better way to handle this?
EDIT:
A couple other scenarios
If I do not massage the search string at all, searching on the word "not" causes an error "Null or empty full-text predicate."
Same scenario, just applying the string as is, if I make a company "Company Do Not Delete", any versions of the string that have Do or Not in them return 0 results.
It's been a while since I posted this question and after a few iterations I came up with some search logic that works for our needs.
First off, we have a business rule that requires searches to support an ampersand. Full-text indexing seems to drop the &, which makes the returned results incorrect. So I had to special-case any search containing & to use a LIKE statement instead.
I left my code as above: it filters out a blacklist of words and tries a CONTAINS search. If that fails for any reason, I perform a FREETEXT search instead.
public async Task<List<Company>> SearchByName(string searchText)
{
var results = new List<Company>();
if (string.IsNullOrWhiteSpace(searchText))
return results;
if (searchText.IndexOf("&") >= 0)
{
var likeQuery = string.Format("%{0}%", searchText);
results = await DataContext.Company.SqlQuery("SELECT CompanyId AS Id, IsEligible AS IsReadOnly, *" +
" FROM Company.Company AS con" +
" WHERE con.Name LIKE @SearchText",
new SqlParameter("@SearchText", likeQuery))
.ToListAsync();
}
else
{
var terms = ParseSearchTextForMultiwordSearch(searchText);
if (string.IsNullOrWhiteSpace(terms))
return results;
// SqlQuery does not take any column mappings into account (https://entityframework.codeplex.com/workitem/233)
// So we have to manually map the columns in the select statement
var sqlQueryFormat = "SELECT CompanyId AS Id, IsEligible AS IsReadOnly, *" +
" FROM Company.Company AS con" +
" WHERE {0}(con.Name, @SearchText)";
var sqlQuery = string.Format(sqlQueryFormat, "CONTAINS");
var errored = false;
try
{
results = await DataContext.Company.SqlQuery(sqlQuery,
new SqlParameter("@SearchText", terms))
.ToListAsync();
}
catch
{
//catch the error but do nothing with it
errored = true;
}
//when the contains search fails due to some unknown error, use Freetext as a backup
if (errored)
{
sqlQuery = string.Format(sqlQueryFormat, "FREETEXT");
results = await DataContext.Company.SqlQuery(sqlQuery,
new SqlParameter("@SearchText", terms))
.ToListAsync();
.ToListAsync();
}
}
return results;
}
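The blacklist step from the question could also be tightened a little. In the original, the comparison is case sensitive, so "AND" or "Not" slip through, repeated spaces produce empty terms, and a double quote typed by the user would break the quoted CONTAINS term. The sketch below addresses those three points only; `SearchTerms` is a hypothetical class, and it does not try to enumerate the full-text stopwords configured on the server.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration.
public static class SearchTerms
{
    // Operators that CONTAINS would interpret instead of searching for.
    private static readonly HashSet<string> Blacklist =
        new HashSet<string>(new[] { "and", "or", "not" }, StringComparer.OrdinalIgnoreCase);

    // Splits on spaces, drops empty entries and double quotes,
    // and filters blacklisted words regardless of case.
    public static List<string> GetValid(string searchText)
    {
        return (searchText ?? string.Empty)
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(w => w.Replace("\"", string.Empty))
            .Where(w => w.Length > 0 && !Blacklist.Contains(w))
            .ToList();
    }

    // Builds the prefix-search condition, e.g. "My*" AND "Company*".
    public static string ToContainsCondition(string searchText)
    {
        return string.Join(" AND ", GetValid(searchText).Select(w => "\"" + w + "*\""));
    }
}
```

An empty result from `ToContainsCondition` signals that nothing searchable is left, which the caller would need to handle before running the query, as the answer above does with `IsNullOrWhiteSpace`.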

Apply rules to data

In my (C# + SQL Server) application, the user will be able to define rules over data such as:
ITEM_ID = 1
OR (ITEM_NAME LIKE 'something' AND ITEM_PRICE > 123
AND (ITEM_WEIGHT = 456 OR ITEM_HEIGHT < 789))
The set of items to validate will be different every time, but it will not be large. The number of rules, however, is high (say, 100000).
How can I check, with reasonable performance, which rules a given set of items satisfies?
This looks like your "rules" or conditions should be performed in C# instead.
If you are really going to feed 100,000 ORs and ANDs into the WHERE clause of your SQL statement, you are going to have a very hard time scaling your application. I can only imagine the mess of indexes you would need for any arbitrary set of 100,000 conditions, in every permutation, to perform well against the data set.
Instead, I would run a basic select query and read each row and filter it in C# instead. Then you can track which conditions/rules do and don't pass for each row by applying each rule individually and tracking pass/fail.
Of course, if you are querying a very large table, then performance could become an issue, but you stated that "The set of items to validate ... are not a huge number" so I assume it would be relatively quick to bring back all the data for the table and perform your rules in code, or apply some fundamental filtering up front, then more specific filtering back in code.
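The in-code filtering described above could be sketched as follows, assuming each rule has already been turned into a C# predicate (how the rule text is parsed into a predicate is out of scope here). The `Item`, `Rule`, and `RuleRunner` types are hypothetical and only model a few of the columns from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type with a subset of the columns from the question.
public sealed class Item
{
    public int Id { get; set; }
    public string Name { get; set; }
    public decimal Price { get; set; }
}

// Hypothetical rule: a label plus a predicate over one item.
public sealed class Rule
{
    public string Label { get; set; }
    public Func<Item, bool> Test { get; set; }
}

public static class RuleRunner
{
    // For every rule, lists the ids of the items that satisfy it.
    // Each rule is applied on its own, so pass/fail is tracked per rule.
    public static Dictionary<string, List<int>> Evaluate(
        IEnumerable<Item> items, IEnumerable<Rule> rules)
    {
        var itemList = items.ToList(); // enumerate the source only once
        return rules.ToDictionary(
            r => r.Label,
            r => itemList.Where(r.Test).Select(i => i.Id).ToList());
    }
}
```

Because the rules are independent of each other, the loop over rules is a natural candidate for parallel execution if 100,000 predicates turn out to be too slow sequentially.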
Out of curiosity, how are the users entering these "rules", like:
ITEM_ID = 1
OR (ITEM_NAME LIKE 'something' AND ITEM_PRICE > 123
AND (ITEM_WEIGHT = 456 OR ITEM_HEIGHT < 789))
Please please please tell me they aren't entering actual SQL queries (in text form) and you are just appending them together, like:
var sql = "select * from myTable where ";
foreach(var rule in rules)
sql += rule;
Maybe some kind of rule-builder UI that builds up these SQL-looking statements?
You could use some of Microsoft's own parsing engine for T-SQL.
You can find them in the assemblies Microsoft.Data.Schema.ScriptDom.dll and Microsoft.Data.Schema.ScriptDom.Sql.dll.
TSql100Parser parser = new TSql100Parser(false);
IList<ParseError> errors;
Expression expr = parser.ParseBooleanExpression(
new StringReader(condition),
out errors
);
if (errors != null && errors.Count > 0)
{
// Error handling
return;
}
If you don't get any errors, the string is a valid filter expression, though it might still contain harmful expressions.
If you wish, you could run the expression through your own visitor to detect any unwanted constructs (such as sub-queries). But be aware that you would have to override almost all of the 650 overloads, for both Visit(...) and ExplicitVisit(...). Partial classes would be good here.
When you are satisfied, you could then build a complete SELECT statement with all of the expressions:
var schemaObject = new SchemaObjectName();
schemaObject.Identifiers.Add(new Identifier {Value = "MyTable"});
var queryExpression = new QuerySpecification();
queryExpression.FromClauses.Add(
new SchemaObjectTableSource {SchemaObject = schemaObject});
// Add the expression from before (repeat as necessary)
Literal zeroLiteral = new Literal
{
LiteralType = LiteralType.Integer,
Value = "0",
};
Literal oneLiteral = new Literal
{
LiteralType = LiteralType.Integer,
Value = "1",
};
WhenClause whenClause = new WhenClause
{
WhenExpression = expr, // <-- here
ThenExpression = oneLiteral,
};
CaseExpression caseExpression = new CaseExpression
{
ElseExpression = zeroLiteral,
};
caseExpression.WhenClauses.Add(whenClause);
queryExpression.SelectElements.Add(caseExpression);
var selectStatement = new SelectStatement {QueryExpression = queryExpression};
... and turn it all back into a string:
var generator = new Sql100ScriptGenerator();
string query;
generator.GenerateScript(selectStatement, out query);
Console.WriteLine(query);
Output:
SELECT CASE WHEN ITEM_ID = 1
OR (ITEM_NAME LIKE 'something'
AND ITEM_PRICE > 123
AND (ITEM_WEIGHT = 456
OR ITEM_HEIGHT < 789)) THEN 1 ELSE 0 END
FROM MyTable
If this expression gets too large to handle, you could always split up the rules into chunks, to run a few at the time.
Though, to be allowed to redistribute the Microsoft.Data.Schema.ScriptDom.*.dll files, you have to own a licence of Visual Studio Team System (Is this included in at least VS Pro/Ultimate?)
Link: http://blogs.msdn.com/b/gertd/archive/2008/08/22/redist.aspx
