Phrase search on PostgreSQL 11 using Entity Framework

I've implemented Full Text on PostgreSQL 11 using an API made with .NET Core 3 and Entity Framework Core 3.
I'm able to filter with a single word, but as soon as I use two or more words it doesn't work anymore.
I've tried the proximity operator <->, and also a distance of 10 to 20 words (for example <10> between each word), but neither works: I don't get any results back.
Here is the code:
public async Task<List<QuestionListItemSearch>> SearchQuestions(string search)
{
search = search.Replace(" ", "<20>");
return await (from uiEnt in base.context.UserInfos
join efq in base.context.ExpertsForumQuestions on uiEnt.UserInfoId equals efq.UserInfoId
join efa in base.context.ExpertsForumAnswers on efq.ExpertsForumQuestionId equals efa.ExpertsForumQuestionId into ExpertsForum
from n in ExpertsForum.DefaultIfEmpty()
where efq.SearchVector.Matches(search)
group new { uiEnt, efq, n } by new {
uiEnt.FirstName,
uiEnt.LastName,
efq.ExpertsForumQuestionId,
efq.Title,
efq.Description,
efq.SubmissionDate,
SearchRanking = efq.SearchVector.Rank(EF.Functions.ToTsQuery(search)) } into newGroup
select new QuestionListItemSearch
{
ExpertsForumQuestionId = newGroup.Key.ExpertsForumQuestionId,
Title = newGroup.Key.Title,
Description = newGroup.Key.Description,
SubmissionDate = newGroup.Key.SubmissionDate,
FirstNameSubmitter = newGroup.Key.FirstName,
LastNameSubmitter = newGroup.Key.LastName,
NumberOfAnswers = newGroup.Max(x => x.n.ExpertsForumAnswerId) == null ? 0 : newGroup.Count(),
SearchRanking = newGroup.Key.SearchRanking
}).OrderByDescending(q => q.SearchRanking).ToListAsync();
}
I've also run the query directly against PostgreSQL in SQL, but I still can't get any results:
SELECT "Title", "SearchVector", ts_rank(c."SearchVector", to_tsquery('crop<10>rotation'))
FROM PUBLIC."ExpertsForumQuestions" c
WHERE (c."SearchVector" @@ to_tsquery('crop<10>rotation'))
Please note that the database has several records with both the words crop and rotation in the text. There is also a record containing the two words right next to each other: "crop rotation".
If I use the | operator instead of the proximity operator, it works.
Is there a simple way to perform a phrase search like in Microsoft SQL Server?
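One thing worth checking here: in PostgreSQL, <N> matches lexemes that are exactly N positions apart, not "within N", and <-> is the same as <1>. So 'crop<10>rotation' cannot match the adjacent words "crop rotation". A minimal sketch of building an adjacency query from user input follows; the class and method names are hypothetical, and it assumes stripping punctuation from each word is acceptable for your data.

```csharp
using System;
using System.Linq;

public static class PhraseQuery
{
    // Hypothetical helper: turns "crop  rotation" into "crop <-> rotation".
    public static string ToPhraseTsQuery(string search)
    {
        if (string.IsNullOrWhiteSpace(search)) return string.Empty;

        var words = search
            // A null separator array splits on any whitespace.
            .Split((char[])null, StringSplitOptions.RemoveEmptyEntries)
            // Keep letters and digits only, so tsquery operators in the input are dropped.
            .Select(w => new string(w.Where(c => char.IsLetterOrDigit(c)).ToArray()))
            .Where(w => w.Length > 0);

        // <-> requires each word to directly follow the previous one.
        return string.Join(" <-> ", words);
    }
}
```

The result could be passed to to_tsquery in place of the Replace(" ", "<20>") call. PostgreSQL also has phraseto_tsquery, which builds the same kind of query on the server; whether your EF Core provider version exposes it is something to verify against its documentation.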

Related

DocumentDB filter an array by an array

I have a document that looks essentially like this:
{
"Name": "John Smith",
"Value": "SomethingIneed",
"Tags": ["Tag1", "Tag2", "Tag3"]
}
My goal is to write a query that finds all documents in my database whose Tags property contains all of the tags in a filter.
For example, in the case above, my query might be ["Tag1", "Tag3"]. I want all documents whose tags collection contains Tag1 AND Tag3.
I have tried the following:
1. An All/Contains-type LINQ query:
var tags = new List<string>() {"Test", "TestAccount"};
var req =
Client.CreateDocumentQuery<Contact>(UriFactory.CreateDocumentCollectionUri("db", "collection"))
.Where(x => x.Tags.All(y => tags.Contains(y)))
.ToList();
2. A user-defined function (I couldn't get this to work at all):
var tagString = "'Test', 'TestAccount'";
var req =
Client.CreateDocumentQuery<Contact>(UriFactory.CreateDocumentCollectionUri("db", "collection"),
$"Select c.Name, c.Email, c.id from c WHERE udf.containsAll([${tagString}] , c.Tags)").ToList();
with containsAll defined as:
function arrayContainsAnotherArray(needle, haystack){
for(var i = 0; i < needle.length; i++){
if(haystack.indexOf(needle[i]) === -1)
return false;
}
return true;
}
3. System.Linq.Dynamic, to create a predicate from a string:
var query = new StringBuilder("ItemType = \"MyType\"");
if (search.CollectionValues.Any())
{
foreach (var searchCollectionValue in search.CollectionValues)
{
query.Append($" and Collection.Contains(\"{searchCollectionValue}\")");
}
}
Approach 3 actually worked for me, but the query was very expensive (more than 2000 RUs on a collection of 10K documents) and I am getting throttled like crazy. The first iteration of my application must be able to support result sets of 10K documents. How can I best query for a large number of results with an array of filters?
Thanks.

The UDF could be made to work, but it would be a full scan of the collection, so it is not recommended unless combined with other highly selective criteria.
I believe the most performant (index-using) approach would be to split it into a series of AND statements. You could do this by building up your query string programmatically (being careful to fully escape any user-provided data for security reasons). The resulting query would look like:
SELECT *
FROM c
WHERE
ARRAY_CONTAINS(c.Tags, "Tag1") AND
ARRAY_CONTAINS(c.Tags, "Tag3")
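A minimal sketch of generating that query for an arbitrary tag list, using named parameters instead of escaping values by hand. The helper name and the @tag0, @tag1 naming scheme are my own assumptions; the returned pairs would still need to be handed to whatever parameterized-query type your SDK version provides.

```csharp
using System;
using System.Collections.Generic;

public static class TagQuery
{
    // Hypothetical helper: returns the query text plus a name -> value map of parameters.
    public static (string Text, Dictionary<string, string> Parameters) Build(IEnumerable<string> tags)
    {
        var parameters = new Dictionary<string, string>();
        var clauses = new List<string>();

        foreach (var tag in tags)
        {
            // One parameter per tag, so user input never ends up in the query text.
            string name = "@tag" + parameters.Count;
            parameters[name] = tag;
            clauses.Add($"ARRAY_CONTAINS(c.Tags, {name})");
        }

        // With no tags there is nothing to filter on, so the WHERE clause is omitted.
        string where = clauses.Count == 0 ? "" : " WHERE " + string.Join(" AND ", clauses);
        return ("SELECT * FROM c" + where, parameters);
    }
}
```

For example, TagQuery.Build(new[] { "Tag1", "Tag3" }) produces one ARRAY_CONTAINS clause per tag, joined with AND, which is the shape of the query above.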

c# mongodb case sensitive search

I have a collection in which I store Email and password of user.
I obviously don't want to require the user to type his email with exactly the same casing as when he first registered.
I'm using the MongoDB 2.0 C# driver. I'm stressing this because I've seen solutions that use regex queries, but I'm afraid I can't use them here.
My query looks like this:
var filter = Builders<ME_User>.Filter.And(
Builders<ME_User>.Filter.Eq(u => u.Email, email),
Builders<ME_User>.Filter.Eq(u => u.Password, password));
ME_User foundUser = null;
var options = new FindOptions<ME_User>
{
Limit = 1
};
using (var cursor = await manager.User.FindAsync(filter, options))
{
while (await cursor.MoveNextAsync())
{
var batch = cursor.Current;
foreach (ME_User user in batch)
foundUser = user;
}
}
Call me obsessive about tidiness, but I can't bring myself to save this data a second time in lower case and keep two copies of the same thing. Also, I want the email to be saved EXACTLY as the user entered it.

Filtering on string fields in MongoDB is case sensitive unless you use regular expressions. Why exactly can't you use regular expressions?
Your query can be edited like this:
var filter = Builders<ME_User>.Filter.And(
Builders<ME_User>.Filter.Regex(u => u.Email, new BsonRegularExpression("/^" + Regex.Escape(email) + "$/i")),
Builders<ME_User>.Filter.Eq(u => u.Password, password));
Notice the "^" and "$" anchors, which make the pattern match the whole value, and, most importantly, the case-insensitive option at the end of the regular expression ("/i"). Regex.Escape (from System.Text.RegularExpressions) keeps characters such as "." in the email from being treated as wildcards.
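To see why the anchors and the escaping matter, here is a small sketch that only uses .NET's own Regex class, so it can be run without a database. The class and method names are hypothetical. The assumption is that the same pattern string is then passed to the two-argument BsonRegularExpression(pattern, "i") constructor that appears in another answer on this page.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EmailPattern
{
    // Hypothetical helper: an anchored pattern that matches the email literally.
    public static string ExactMatchPattern(string email)
    {
        // Regex.Escape turns "." into "\." so it no longer matches any character.
        return "^" + Regex.Escape(email) + "$";
    }

    // Checks a stored value against the pattern, ignoring case.
    public static bool Matches(string stored, string typed)
    {
        return Regex.IsMatch(stored, ExactMatchPattern(typed), RegexOptions.IgnoreCase);
    }
}
```

With this, "John.Smith@Example.com" matches the input "john.smith@example.com", while "johnXsmith@example.com" does not, which it would if the dot were left unescaped.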
Another way would be a text search, which requires the creation of a text index and is case insensitive for the Latin alphabet: http://docs.mongodb.org/manual/reference/operator/query/text/#match-operation
In C#, you would use the Text filter:
var filter = Builders<ME_User>.Filter.And(
Builders<ME_User>.Filter.Text(email),
Builders<ME_User>.Filter.Eq(u => u.Password, password));
If you use a text index query in an OR clause, you will need to create an index on the Password field as well; otherwise the OR query will produce an error:
Other non-TEXT clauses under OR have to be indexed as well

I'd prefer using Linq.
var match = theCollection.AsQueryable().SingleOrDefault(x =>
x.Email.ToLower() == emailToSearchFor.ToLower());

For the C# driver 2.1 (MongoDB 3.0):
By default, the match is case sensitive.
To make it case insensitive, just add the "i" option to the BsonRegularExpression, as shown below:
var filter = Builders<Users>.Filter.Regex(k=>k.Name, new BsonRegularExpression(name, "i"));
var users = await _mongoDbContext.Users.Find(filter).ToListAsync();

Using Telerik ORM it is throwing error when joining 2 tables from different contexts

I am trying to use a join in LINQ to join two tables that are in different contexts. When I join two tables in the same context it works, so I believe my join is OK, but when I join two tables that are in different contexts I get an error. Is it possible to join two tables that are from different contexts?
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using RmmDal.Contexts.RmmCrm;
using RmmDal.Contexts.LMS;
using Telerik.OpenAccess;
namespace ConsoleApplication_Test_ORM
{
class Program
{
static void Main(string[] args)
{
RmmDal.Contexts.RmmCrm.RmmCrmContext dbContextRmmCrm = new RmmDal.Contexts.RmmCrm.RmmCrmContext();
RmmDal.Contexts.LMS.LMS_000Context dbContextLMS = new RmmDal.Contexts.LMS.LMS_000Context();
try
{
Guid LeadId = new Guid("9EF2874C-D37F-4503-A3D8-1A73774BFBBC");
//This doesn't work, I think because it is using 2 separate Contexts
//I need this to work
var Leads1 = from lo in dbContextLMS.Tbl_Loan_Appls
join la in dbContextRmmCrm.LeadApplications
on lo.Appl_No equals la.Appl_No
select new
{
SSN = lo.Cust_SSN,
TDCCustID = lo.Cust_ID
};
//This works, I think because they are the same context
var Leads2 = from lo in dbContextLMS.Tbl_Loan_Appls
join la in dbContextLMS.Tbl_Customers
on lo.Cust_ID equals la.Cust_ID
select new
{
SSN = lo.Cust_SSN,
TDCCustID = lo.Cust_ID
};
var something = Leads1.FirstOrDefault();
var something2 = Leads1.FirstOrDefault();
}
catch (Exception ex)
{
throw ex;
}
}
}
}
Here is the error that is thrown:
An exception occured during the execution of 'Extent<RmmDal.Tbl_Loan_Appl>().Join(Extent<RmmDal.Contexts.RmmCrm.LeadApplication>(), lo => lo.Appl_No, la => la.Appl_No, (lo, la) => new <>f__AnonymousType0`2(SSN = lo.Cust_SSN, TDCCustID = lo.Cust_ID))'. Failure: Object reference not set to an instance of an object.
See InnerException for more details.
Complete Expression:
.Call System.Linq.Queryable.Join(
.Constant<Telerik.OpenAccess.Query.ExtentQueryImpl`1[RmmDal.Tbl_Loan_Appl]>(Extent<RmmDal.Tbl_Loan_Appl>()),
.Constant<Telerik.OpenAccess.Query.ExtentQueryImpl`1[RmmDal.Contexts.RmmCrm.LeadApplication]>(Extent<RmmDal.Contexts.RmmCrm.LeadApplication>()),
'(.Lambda #Lambda1<System.Func`2[RmmDal.Tbl_Loan_Appl,System.Int64]>),
'(.Lambda #Lambda2<System.Func`2[RmmDal.Contexts.RmmCrm.LeadApplication,System.Int64]>),
'(.Lambda #Lambda3<System.Func`3[RmmDal.Tbl_Loan_Appl,RmmDal.Contexts.RmmCrm.LeadApplication,<>f__AnonymousType0`2[System.String,System.Int64]]>))
.Lambda #Lambda1<System.Func`2[RmmDal.Tbl_Loan_Appl,System.Int64]>(RmmDal.Tbl_Loan_Appl $lo) {
$lo.Appl_No
}
.Lambda #Lambda2<System.Func`2[RmmDal.Contexts.RmmCrm.LeadApplication,System.Int64]>(RmmDal.Contexts.RmmCrm.LeadApplication $la)
{
$la.Appl_No
}
.Lambda #Lambda3<System.Func`3[RmmDal.Tbl_Loan_Appl,RmmDal.Contexts.RmmCrm.LeadApplication,<>f__AnonymousType0`2[System.String,System.Int64]]>(
RmmDal.Tbl_Loan_Appl $lo,
RmmDal.Contexts.RmmCrm.LeadApplication $la) {
.New <>f__AnonymousType0`2[System.String,System.Int64](
$lo.Cust_SSN,
$lo.Cust_ID)
}

Joining entities that come from two different contexts is not supported by design.
The only way to join these data sets is to use an in-memory join, as Trust me - I'm a Doctor suggested.
The brute-force method is to call .ToList() on both context endpoints and then use the in-memory data in the join query. This will be inefficient and problematic, since a lot of data will be loaded into memory and possibly discarded after the join is performed, so expect really bad performance.
A more efficient way is to page through the results from the left side and use the .Contains() method to filter out the "joined" records from the right side.
// Load a small page of leads into memory
var Leads1 = dbContextLMS.Tbl_Loan_Appls.Skip(0).Take(10).ToList();
// Collect the IDs of that page
var leadIds = Leads1.Select(l => l.Appl_No).ToList();
// Fetch only the matching applications
var applications = dbContextRmmCrm.LeadApplications
.Where(a => leadIds.Contains(a.Appl_No))
.Select(a => new { SSN = a.Cust_SSN, TDCCustID = a.Cust_ID });
Paging is required in order to work with a small subset of data so the .Contains() clause can be safely translated into an SQL IN clause. You will have to wrap the code snippet in a loop and increment the Skip() and Take() parameters accordingly.
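The loop described above could look roughly like the sketch below. It runs against plain in-memory collections so that it is self-contained; the tuple shapes and the PagedJoin name are hypothetical stand-ins for the two contexts, and with a real ORM the Where(...Contains...) line is the part that would be sent to the second database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PagedJoin
{
    // Hypothetical stand-in for the two contexts: loans on the left, application numbers on the right.
    public static List<(long ApplNo, string Ssn)> Join(
        IEnumerable<(long ApplNo, string Ssn)> loans,
        IEnumerable<long> applicationNos,
        int pageSize)
    {
        var results = new List<(long ApplNo, string Ssn)>();

        for (int skip = 0; ; skip += pageSize)
        {
            // A stable order is needed so that pages do not overlap or skip rows.
            var page = loans.OrderBy(l => l.ApplNo).Skip(skip).Take(pageSize).ToList();
            if (page.Count == 0) break;

            // The small ID list is what keeps the generated IN clause short.
            var ids = page.Select(l => l.ApplNo).ToList();
            var matching = new HashSet<long>(applicationNos.Where(a => ids.Contains(a)));

            // Keep only the left-side rows that have a partner on the right side.
            results.AddRange(page.Where(l => matching.Contains(l.ApplNo)));
        }

        return results;
    }
}
```

The page size is a trade-off: larger pages mean fewer round trips but a longer IN list per query.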

dynamic query to azure tables

I'm using azure table storage to store blog posts. Each blog post can have different tags.
So I'm going to have three different tables.
One which will store the blog posts.
One to store the tags
One that will store the relation between the tags and posts
So my question is as follows: is it possible to create dynamic search queries? I won't know until run time how many tags I want to search for.
As I understand it, you can only query Azure tables using LINQ. Or can I pass in a query string that I can change dynamically?
UPDATE
Here's some example data that's in the blog table
PartitionKey,RowKey,Timestamp,Content,FromUser,Tags
user1, 1, 2012-08-08 13:57:23, "Hello World", "root", "yellow,red"
blogTag table
PartitionKey,RowKey,Timestamp,TagId,TagName
"red", "red", 2012-08-08 11:40:29, 1, red
"yellow", "yellow", 2012-08-08 11:40:29, 2, yellow
relation table
PartitionKey,RowKey,Timestamp,DataId,TagId
1, 1, 2012-08-08 11:40:29, 1, 1
2, 1, 2012-08-08 13:57:23, 1, 2
One usage example for these tables is getting all blog posts with a certain tag.
First I have to query the TagId from the blogTag table.
After that I need to search the relation table for the DataId.
Lastly I need to search the blog table for the blog posts with that DataId.
I'm using LINQ to perform the query, and it looks like the following:
CloudTableQuery<DataTag> tagIds = (from e in ctx2.CreateQuery<DataTag>("datatags")
where e.PartitionKey == tags
select e).AsTableServiceQuery<DataTag>();
I tried Gaurav Mantri's suggestion of using a filter, and it works. But I'm worried about how efficient it will be, and about the limit of 15 discrete comparisons per query.

You can simply build the where clause as a string and pass it to the Where method, for example:
var whereClause = "(PartitionKey eq 'Key1') or (PartitionKey eq 'Key2')";
CloudStorageAccount storageAccount = CloudStorageAccount.Parse("AccountDetails");
CloudTableClient tableClient = storageAccount.CreateCloudTableClient();
CloudTable table = tableClient.GetTableReference("TableName");
table.CreateIfNotExists();
TableQuery<YourAzureTableEntity> query =
new TableQuery<YourAzureTableEntity>()
.Where(whereClause);
var list = table.ExecuteQuery(query).ToList();

I am facing exactly the same problem. I did find one solution, which I am pasting below:
public static IEnumerable<T> Get<T>(CloudStorageAccount storageAccount, string tableName, string filter)
{
string tableEndpoint = storageAccount.TableEndpoint.AbsoluteUri;
var tableServiceContext = new TableServiceContext(tableEndpoint, storageAccount.Credentials);
string query = string.Format("{0}{1}()?$filter={2}", tableEndpoint, tableName, filter);
var queryResponse = tableServiceContext.Execute<T>(new Uri(query)) as QueryOperationResponse<T>;
return queryResponse.ToList();
}
Basically it utilizes DataServiceContext's Execute(Uri) method: http://msdn.microsoft.com/en-us/library/cc646700.aspx.
You would need to specify the filter condition as you would do if you're invoking the query functionality through REST API (e.g. PartitionKey eq 'mypk' and RowKey ge 'myrk').
Not sure if this is the best solution :) Looking forward to comments on this.

It is possible, but it may not be a good idea. Adding multiple query parameters like that always results in a table scan. That's probably OK in a small table, but if your tables are going to be large it will be very slow. For large tables, you're better off running a separate query for each key combination.
That said, you can build a dynamic query with some LINQ magic. Here is the helper class I've used for that:
public class LinqBuilder
{
/// <summary>
/// Build a LINQ Expression that roughly matches the SQL IN() operator
/// </summary>
/// <param name="columnValues">The values to filter for</param>
/// <returns>An expression that can be passed to the LINQ .Where() method</returns>
public static Expression<Func<RowType, bool>> BuildListFilter<RowType, ColumnType>(string filterColumnName, IEnumerable<ColumnType> columnValues)
{
ParameterExpression rowParam = Expression.Parameter(typeof(RowType), "r");
MemberExpression column = Expression.Property(rowParam, filterColumnName);
BinaryExpression filter = null;
foreach (ColumnType columnValue in columnValues)
{
BinaryExpression newFilterClause = Expression.Equal(column, Expression.Constant(columnValue));
if (filter != null)
{
filter = Expression.Or(filter, newFilterClause);
}
else
{
filter = newFilterClause;
}
}
return Expression.Lambda<Func<RowType, bool>>(filter, rowParam);
}
public static Expression<Func<RowType, bool>> BuildComparisonFilter<RowType, ColumnType>(string filterColumnName, Func<MemberExpression, BinaryExpression> buildComparison)
{
ParameterExpression rowParam = Expression.Parameter(typeof(RowType), "r");
MemberExpression column = Expression.Property(rowParam, filterColumnName);
BinaryExpression filter = buildComparison(column);
return Expression.Lambda<Func<RowType, bool>>(filter, rowParam);
}
}
You would use it something like this (assuming the filtered column is a string):
var whereClause = LinqBuilder.BuildListFilter<MyRow, string>(queryColumnName, columnValues);
CloudTableQuery<MyRow> query = (from r in tableServiceContext.CreateQuery<MyRow>("MyTable")
where r.PartitionKey == partitionKey
select r)
.Where(whereClause) //Add in our multiple where clauses
.AsTableServiceQuery(); //Convert to table service query
var results = query.ToList();
Note also that the Table service enforces a maximum number of constraints per query. The documented maximum is 15 per query, but when I last tried this (which was some time ago) the actual maximum was 14.
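For anyone who wants to try the expression-building idea without a storage account, here is a compact, self-contained variant that can be compiled and run against in-memory objects. It is a sketch, not the helper above: the OrFilter and Post names are hypothetical, it uses Expression.OrElse, and it returns a constant false for an empty value list instead of failing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

public static class OrFilter
{
    // Builds r => r.Prop == v1 || r.Prop == v2 || ... for the given property name.
    public static Expression<Func<TRow, bool>> Build<TRow, TCol>(string property, IEnumerable<TCol> values)
    {
        ParameterExpression row = Expression.Parameter(typeof(TRow), "r");
        MemberExpression column = Expression.Property(row, property);

        Expression body = null;
        foreach (TCol value in values)
        {
            Expression clause = Expression.Equal(column, Expression.Constant(value, typeof(TCol)));
            body = body == null ? clause : Expression.OrElse(body, clause);
        }

        // No values means nothing can match.
        return Expression.Lambda<Func<TRow, bool>>(body ?? Expression.Constant(false), row);
    }
}

// Hypothetical row type used only to exercise the builder.
public class Post
{
    public string PartitionKey { get; set; }
}
```

Calling OrFilter.Build<Post, string>("PartitionKey", new[] { "red", "yellow" }).Compile() gives a delegate you can test locally before passing the uncompiled expression to a query provider. Whether a given provider translates OrElse is something to check separately.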

Building something like this in table storage is quite cumbersome, akin to forcing a square peg into a round hole.
Instead, you could consider using Blob storage to store your blog posts and Lucene.NET to implement your tag search. Lucene would also allow more complex searches like (Tag = "A" and Tag = "B" and Tag != "C"), and in addition would allow searching over the blog text itself, if you so choose.
http://code.msdn.microsoft.com/windowsazure/Azure-Library-for-83562538

Regex Replace to assist Orderby in LINQ

I'm using LINQ to SQL to pull records from a database, sort them by a string field, then perform some other work on them. Unfortunately the Name field that I'm sorting by comes out of the database like this
Name
ADAPT1
ADAPT10
ADAPT11
...
ADAPT2
ADAPT3
I'd like to sort the Name field in numerical order. Right now I'm using the Regex object to replace "ADAPT1" with "ADAPT01", etc. I then sort the records again using another LINQ query. The code I have for this looks like
var adaptationsUnsorted = from aun in dbContext.Adaptations
where aun.EventID == iep.EventID
select new Adaptation
{
StudentID = aun.StudentID,
EventID = aun.EventID,
Name = Regex.Replace(aun.Name,
@"ADAPT([0-9])$", @"ADAPT0$1"),
Value = aun.Value
};
var adaptationsSorted = from ast in adaptationsUnsorted
orderby ast.Name
select ast;
foreach(Adaptation adaptation in adaptationsSorted)
{
// do real work
}
The problem I have is that the foreach loop throws the exception
System.NotSupportedException was unhandled
Message="Method 'System.String Replace(System.String, System.String,
System.String)' has no supported translation to SQL."
Source="System.Data.Linq"
I'm also wondering if there's a cleaner way to do this with just one LINQ query. Any suggestions would be appreciated.

Force the hydration of the elements by enumerating the query (call ToList). From that point on, your operations will be against in-memory objects and those operations will not be translated into SQL.
List<Adaptation> result =
dbContext.Adaptations
.Where(aun => aun.EventID == iep.EventID)
.ToList();
result.ForEach(aun =>
aun.Name = Regex.Replace(aun.Name,
@"ADAPT([0-9])$", @"ADAPT0$1")
);
result = result.OrderBy(aun => aun.Name).ToList();

Implement an IComparer<string> with your logic:
var adaptationsUnsorted = from aun in dbContext.Adaptations
where aun.EventID == iep.EventID
select new Adaptation
{
StudentID = aun.StudentID,
EventID = aun.EventID,
Name = aun.Name,
Value = aun.Value
};
var adaptationsSorted = adaptationsUnsorted.ToList<Adaptation>().OrderBy(a => a.Name, new AdaptationComparer ());
foreach (Adaptation adaptation in adaptationsSorted)
{
// do real work
}
public class AdaptationComparer : IComparer<string>
{
public int Compare(string x, string y)
{
string x1 = Regex.Replace(x, @"ADAPT([0-9])$", @"ADAPT0$1");
string y1 = Regex.Replace(y, @"ADAPT([0-9])$", @"ADAPT0$1");
return Comparer<string>.Default.Compare(x1, y1);
}
}
I didn't test this code but it should do the job.
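Padding with one zero only helps while the numbers stay below 100. As an alternative sketch (not the code above), a comparer can split each name into a text prefix and a trailing number and compare the number numerically. The class name is hypothetical, and it assumes the trailing number fits in a long.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public sealed class TrailingNumberComparer : IComparer<string>
{
    // Group 1 is the prefix, group 2 the run of digits at the end of the name.
    private static readonly Regex Parts = new Regex(@"^(.*?)([0-9]+)$");

    // Names without a trailing number get -1, so they sort before numbered names with the same prefix.
    private static (string Prefix, long Number) Key(string s)
    {
        Match m = Parts.Match(s ?? "");
        return m.Success
            ? (m.Groups[1].Value, long.Parse(m.Groups[2].Value))
            : (s ?? "", -1L);
    }

    public int Compare(string x, string y)
    {
        var (px, nx) = Key(x);
        var (py, ny) = Key(y);

        // Compare the text part first, then the numeric part as a number.
        int byPrefix = string.CompareOrdinal(px, py);
        return byPrefix != 0 ? byPrefix : nx.CompareTo(ny);
    }
}
```

It plugs into the same place as the comparer above, for example adaptationsUnsorted.ToList().OrderBy(a => a.Name, new TrailingNumberComparer()), and the sorting still happens in memory rather than in SQL.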

I wonder if you can add a calculated+persisted+indexed field to the database, that does this for you. It would be fairly trivial to write a UDF that gets the value as an integer (just using string values), but then you can sort on this column at the database. This would allow you to use Skip and Take effectively, rather than constantly fetching all the data to the .NET code (which simply doesn't scale).
