How Lucene.net filter works - c#

I found a piece of code that adds a filter with Lucene.Net, but it came without a good explanation, so I am pasting it here to ask how it works.
List<SearchResults> Searchresults = new List<SearchResults>();
string indexFileLocation = @"C:\o";
Lucene.Net.Store.Directory dir = Lucene.Net.Store.FSDirectory.GetDirectory(indexFileLocation);
string[] searchfields = new string[] { "fname", "lname", "dob", "id"};
IndexSearcher indexSearcher = new IndexSearcher(dir);
Filter fil= new QueryWrapperFilter(new TermQuery( new Term(field, "5/12/1998")));
var hits = indexSearcher.Search(QueryMaker(searchString, searchfields), fil);
for (int i = 0; i < hits.Length(); i++)
{
SearchResults result = new SearchResults();
result.fname = hits.Doc(i).GetField("fname").StringValue();
result.lname = hits.Doc(i).GetField("lname").StringValue();
result.dob = hits.Doc(i).GetField("dob").StringValue();
result.id = hits.Doc(i).GetField("id").StringValue();
Searchresults.Add(result);
}
I need an explanation of the two lines below:
Filter fil= new QueryWrapperFilter(new TermQuery( new Term(field, "5/12/1998")));
var hits = indexSearcher.Search(QueryMaker(searchString, searchfields), fil);
I would just like to know: does Lucene first run the search, pull all the data, and then apply the filter, or does it pull data based on the filter from the beginning? Please guide. Thanks.

Lucene.Net runs both your search query and the filter query, and then "merges" the two results so that only documents matching both are returned. I believe the reason for doing it this way is that the filter's result can be cached: the same filter is more likely to be reused on the next search than the search query itself.
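The idea in this answer can be sketched without Lucene at all. The snippet below is only a toy model, not Lucene.Net's actual implementation: the `Dob` array, the `FilterCache` dictionary and both method names are made up for illustration. It shows a filter being turned into a bit set of permitted documents, that bit set being cached, and the query matches being kept only where the bit is set.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class FilterSketch
{
    // Toy index (hypothetical data): document ID -> value of the "dob" field.
    static readonly string[] Dob = { "5/12/1998", "1/1/2000", "5/12/1998", "7/4/1995" };

    // Cache of filter results, keyed by the filtered "dob" value.
    static readonly Dictionary<string, BitArray> FilterCache =
        new Dictionary<string, BitArray>();

    // Build (or reuse) the bit set of documents that the filter permits.
    public static BitArray GetFilterBits(string dobValue)
    {
        if (!FilterCache.TryGetValue(dobValue, out BitArray bits))
        {
            bits = new BitArray(Dob.Length);
            for (int doc = 0; doc < Dob.Length; doc++)
                bits[doc] = Dob[doc] == dobValue;
            FilterCache[dobValue] = bits;
        }
        return bits;
    }

    // Keep only the query matches whose bit is set in the filter's bit set.
    public static List<int> Search(IEnumerable<int> queryMatches, BitArray filterBits)
    {
        var hits = new List<int>();
        foreach (int doc in queryMatches)
            if (filterBits[doc]) hits.Add(doc);
        return hits;
    }
}
```

In real code, Lucene.Net has a CachingWrapperFilter class that can wrap a filter such as QueryWrapperFilter so that its result is reused across searches. Whether that pays off depends on how often the same filter recurs.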

Related

Searching for a Netsuite InboundShipment using SuiteTalk

I am trying to do a search to find out whether there is an existing InboundShipment in NetSuite with a given ExternalDocumentNumber.
The problem I am having is that the ExternalDocumentNumber is a string, but the InboundShipmentSearch seems to want a RecordRef array, and I do not know what value to create the RecordRef with. Here is my current code:
InboundShipmentSearchAdvanced isa = new InboundShipmentSearchAdvanced();
// isa.criteria.basic.externalDocumentNumber.searchValue =
InboundShipmentSearchBasic ts = new InboundShipmentSearchBasic();
Client.SearchPreferences.bodyFieldsOnly = false;
isa.criteria = new InboundShipmentSearch();
isa.criteria.basic = new InboundShipmentSearchBasic();
isa.criteria.basic.externalDocumentNumber = new SearchMultiSelectField();
isa.criteria.basic.externalDocumentNumber.@operator = SearchMultiSelectFieldOperator.anyOf;
List<RecordRef> rrlist = new List<RecordRef>();
RecordRef rr = new RecordRef();
rr.name = "HJ_InboundShip_1"; // I don't think this is what I need to prime the record ref.
rrlist.Add(rr);
isa.criteria.basic.externalDocumentNumber.searchValue = rrlist.ToArray();
The issue is that the value is a string and does not seem to relate to any linked record in the schema, so I don't know how to set up the RecordRef for the search. Does anyone have any idea what I would need to do?
RecordRefs are a way to define a record lookup for links to existing records, and they need to be instantiated with either the internalId or the externalId of the record. See SuiteAnswers id 10801.

How to create and use a HMM Dynamic Bayesian Network in Bayes Server?

I'm trying to build a prediction module implementing a Hidden Markov Model type DBN in Bayes Server 7 with C#. I managed to create the network structure, but I'm not sure if it's correct, because the documentation and examples are not very comprehensive. I also don't fully understand how prediction is meant to be done in code after training is complete.
Here is how my network creation and training code looks:
var Feature1 = new Variable("Feature1", VariableValueType.Continuous);
var Feature2 = new Variable("Feature2", VariableValueType.Continuous);
var Feature3 = new Variable("Feature3", VariableValueType.Continuous);
var nodeFeatures = new Node("Features", new Variable[] { Feature1, Feature2, Feature3 });
nodeFeatures.TemporalType = TemporalType.Temporal;
var nodeHypothesis = new Node(new Variable("Hypothesis", new string[] { "state1", "state2", "state3" }));
nodeHypothesis.TemporalType = TemporalType.Temporal;
// create network and add nodes
var network = new Network();
network.Nodes.Add(nodeHypothesis);
network.Nodes.Add(nodeFeatures);
// link the Hypothesis node to the Features node within each time slice
network.Links.Add(new Link(nodeHypothesis, nodeFeatures));
// add a temporal link of order 5. This links the Hypothesis node to itself in the next time slice
for (int order = 1; order <= 5; order++)
{
network.Links.Add(new Link(nodeHypothesis, nodeHypothesis, order));
}
var temporalDataReaderCommand = new DataTableDataReaderCommand(evidenceDataTable);
var temporalReaderOptions = new TemporalReaderOptions("CaseId", "Index", TimeValueType.Value);
// here we map variables to database columns
// in this case the variables and database columns have the same name
var temporalVariableReferences = new VariableReference[]
{
new VariableReference(Feature1, ColumnValueType.Value, Feature1.Name),
new VariableReference(Feature2, ColumnValueType.Value, Feature2.Name),
new VariableReference(Feature3, ColumnValueType.Value, Feature3.Name)
};
var evidenceReaderCommand = new EvidenceReaderCommand(
temporalDataReaderCommand,
temporalVariableReferences,
temporalReaderOptions);
// We will use the RelevanceTree algorithm here, as it is optimized for parameter learning
var learning = new ParameterLearning(network, new RelevanceTreeInferenceFactory());
var learningOptions = new ParameterLearningOptions();
// Run the learning algorithm
var result = learning.Learn(evidenceReaderCommand, learningOptions);
And this is my attempt at prediction:
// we will now perform some queries on the network
var inference = new RelevanceTreeInference(network);
var queryOptions = new RelevanceTreeQueryOptions();
var queryOutput = new RelevanceTreeQueryOutput();
int time = 0;
// query a probability variable
var queryHypothesis = new Table(nodeHypothesis, time);
inference.QueryDistributions.Add(queryHypothesis);
double[] inputRow = GetInput();
// set some temporal evidence
inference.Evidence.Set(Feature1, inputRow[0], time);
inference.Evidence.Set(Feature2, inputRow[1], time);
inference.Evidence.Set(Feature3, inputRow[2], time);
inference.Query(queryOptions, queryOutput);
int hypothesizedClassId;
var probability = queryHypothesis.GetMaxValue(out hypothesizedClassId);
Console.WriteLine("hypothesizedClassId = {0}, score = {1}", hypothesizedClassId, probability);
Here I'm not even sure how to "Unroll" the network properly to get the prediction and what value to assign to the variable "time". If someone can shed some light on how this toolkit works, I would greatly appreciate it. Thanks.
The code looks fine except for the network structure, which should look something like this for an HMM (the only change to your code is the links):
var Feature1 = new Variable("Feature1", VariableValueType.Continuous);
var Feature2 = new Variable("Feature2", VariableValueType.Continuous);
var Feature3 = new Variable("Feature3", VariableValueType.Continuous);
var nodeFeatures = new Node("Features", new Variable[] { Feature1, Feature2, Feature3 });
nodeFeatures.TemporalType = TemporalType.Temporal;
var nodeHypothesis = new Node(new Variable("Hypothesis", new string[] { "state1", "state2", "state3" }));
nodeHypothesis.TemporalType = TemporalType.Temporal;
// create network and add nodes
var network = new Network();
network.Nodes.Add(nodeHypothesis);
network.Nodes.Add(nodeFeatures);
// link the Hypothesis node to the Features node within each time slice
network.Links.Add(new Link(nodeHypothesis, nodeFeatures));
// An HMM also has an order 1 link on the latent node
network.Links.Add(new Link(nodeHypothesis, nodeHypothesis, 1));
It is also worth noting the following:
You can add multiple distributions to 'inference.QueryDistributions' and query them all at once
While it is perfectly valid to set evidence manually and then query, see EvidenceReader, DataReader and either DatabaseDataReader or DataTableDataReader, if you want to execute the query over multiple records.
Check out the TimeSeriesMode on ParameterLearningOptions
If you want the 'Most probable explanation' set queryOptions.Propagation = PropagationMethod.Max; // an extension of the Viterbi algorithm for HMMs
Check out the following link:
https://www.bayesserver.com/docs/modeling/time-series-model-types
A Hidden Markov model (as a Bayesian network) has a discrete latent variable and a number of child nodes. In Bayes Server you can combine multiple variables in a child node, much like a standard HMM. In Bayes Server you can also mix and match discrete/continuous nodes, handle missing data, and add additional structure (e.g. mixture of HMM, and many other exotic models).
Regarding prediction, once you have built the structure from the link above, there is a DBN prediction example at https://www.bayesserver.com/code/
(Note that you can predict an individual variable in the future (even if you have missing data), you can predict multiple variables (joint probability) in the future, you can predict how anomalous the time series is (log-likelihood) and for discrete (sequence) predictions you can predict the most probable sequence.)
If it is not clear, ping Bayes Server Support and they will add an example for you.

DocumentDB filter an array by an array

I have a document that looks essentially like this:
{
"Name": "John Smith",
"Value": "SomethingIneed",
"Tags": ["Tag1", "Tag2", "Tag3"]
}
My goal is to write a query that finds all documents in my database whose Tags property contains all of the tags in a filter.
For example, in the case above, my query might be ["Tag1", "Tag3"]. I want all documents whose tags collection contains Tag1 AND Tag3.
I have done the following:
1. Tried an All/Contains-type LINQ query:
var tags = new List<string>() {"Test", "TestAccount"};
var req =
Client.CreateDocumentQuery<Contact>(UriFactory.CreateDocumentCollectionUri("db", "collection"))
.Where(x => x.Tags.All(y => tags.Contains(y)))
.ToList();
2. Created a user-defined function (I couldn't get this to work at all):
var tagString = "'Test', 'TestAccount'";
var req =
Client.CreateDocumentQuery<Contact>(UriFactory.CreateDocumentCollectionUri("db", "collection"),
$"Select c.Name, c.Email, c.id from c WHERE udf.containsAll([${tagString}] , c.Tags)").ToList();
with containsAll defined as:
function arrayContainsAnotherArray(needle, haystack){
for(var i = 0; i < needle.length; i++){
if(haystack.indexOf(needle[i]) === -1)
return false;
}
return true;
}
3. Used System.Linq.Dynamic to create a predicate from a string:
var query = new StringBuilder("ItemType = \"MyType\"");
if (search.CollectionValues.Any())
{
foreach (var searchCollectionValue in search.CollectionValues)
{
query.Append($" and Collection.Contains(\"{searchCollectionValue}\")");
}
}
The third approach actually worked for me, but the query was very expensive (more than 2000 RUs on a collection of 10K documents) and I am getting throttled like crazy. The first iteration of my application must be able to support 10K results in the result set. How can I best query for a large number of results with an array of filters?
Thanks.
The UDF could be made to work but it would be a full table scan and so not recommended unless combined with other highly-selective criteria.
I believe the most performant (index-using) approach would be to split it into a series of AND statements. You could do this by building up your query string programmatically (being careful to fully escape any user-provided data for security reasons). The resulting query would look like:
SELECT *
FROM c
WHERE
ARRAY_CONTAINS(c.Tags, "Tag1") AND
ARRAY_CONTAINS(c.Tags, "Tag3")
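As a minimal sketch of building that query programmatically, the helper below emits one ARRAY_CONTAINS clause per tag and uses named parameters (@tag0, @tag1, ...) instead of pasting tag values into the text, which avoids the escaping problem altogether. `TagQueryBuilder` and `Build` are hypothetical names, and the helper deliberately depends on no SDK types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TagQueryBuilder
{
    // Returns the query text plus the (parameter name, tag value) pairs it refers to.
    public static (string Text, List<KeyValuePair<string, string>> Parameters) Build(
        IEnumerable<string> tags)
    {
        // One named parameter per tag: @tag0, @tag1, ...
        var parameters = tags
            .Select((tag, i) => new KeyValuePair<string, string>("@tag" + i, tag))
            .ToList();

        // No tags means no filter at all.
        if (parameters.Count == 0)
            return ("SELECT * FROM c", parameters);

        // AND together one ARRAY_CONTAINS clause per parameter.
        var clauses = parameters.Select(p => $"ARRAY_CONTAINS(c.Tags, {p.Key})");
        return ("SELECT * FROM c WHERE " + string.Join(" AND ", clauses), parameters);
    }
}
```

With the DocumentDB .NET SDK, the text and the name/value pairs would typically be handed over as a SqlQuerySpec with a SqlParameterCollection, so the tag values never become part of the query string.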

LINQ to SQL Intersect Query Failing

I have a Frequently Asked Question (FAQ) database with columns: id, Question, Answer, Category, and Keywords.
I want to take input from a user and search my database for matches. I ultimately want to take the search string and retrieve all records where any of the search string is found in either the Question or Keyword columns.
Since I'm relatively new to LINQ to SQL, I'm trying to get a single-column search working first and then move on to a two-column search. My code is still failing. I've seen other posts on similar topics, but the recommended solutions do not work exactly for my situation, and my attempts to tweak those solutions have failed.
Problematic portion of my code:
private FAQViewModel getSearchResultFAQs(string search)
{
FAQViewModel vm = new FAQViewModel();
vm.isSearch = true;
string[] searchTerms = search.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
//var ques = db.FAQs.Where(q => q.Keywords.Intersect(searchTerms).Any());
var ques = db.FAQs.Where(q => searchTerms.Any(t => q.Keywords.Contains(t)));
List<FAQ> list = new List<FAQ>();
foreach(var qs in ques)
{
FAQ f = new FAQ();
f.Question = qs.Question;
f.Answer = qs.Answer;
f.Category = qs.Category;
f.Keywords = qs.Keywords;
f.id = qs.id;
list.Add(f);
}
CategoryModel cm = new CategoryModel();
cm.faqs = list;
vm.faqs.Add(cm);
return vm;
}
This code fails. q.Keywords is underlined in red stating
string does not contain a definition for 'Intersect'
Any assistance would be greatly appreciated.
EDIT:
I commented out the bad line of code and used Gilad's first recommendation. I now get the following error:
"Local sequence cannot be used in LINQ to SQL implementations of query operators except the Contains operator."
It doesn't like the foreach codeblock:
foreach(var qs in ques)
I'm really out of my depth here, but it's so close to working.

Dynamic Search using Suitetalk

I am trying to create a C# application (Using suitetalk) that would allow us to search through Netsuite records. The record type will be specified dynamically. Please can you help?
I have checked the webservices and identified that SearchRecord class has many sub classes of type AccountSearch, ItemSearch, etc.
However, I wanted to do these searches dynamically.
AccountSearch acc = new AccountSearch();
SearchResult searchresult = new SearchResult();
searchresult = _service.search(acc);
The above code gives me the list of accounts. But, the AccountSearch is hardcoded here.
The piece of code below works.
SearchRecord search;
SearchRecord searchCriteria;
SearchRecordBasic searchBasicCriteria;
if(recordType.equals(RecordType.account)){
search = new AccountSearchAdvanced();
searchCriteria = new AccountSearch();
searchBasicCriteria = new AccountSearchBasic();
//set criteria on searchBasicCriteria
((AccountSearch) searchCriteria).setBasic((AccountSearchBasic) searchBasicCriteria);
((AccountSearchAdvanced) search).setCriteria((AccountSearch) searchCriteria);
}else if(recordType.equals(RecordType.customer)){
search = new CustomerSearchAdvanced();
searchCriteria = new CustomerSearch();
searchBasicCriteria = new CustomerSearchBasic();
//set criteria on searchBasicCriteria
((CustomerSearch) searchCriteria).setBasic((CustomerSearchBasic) searchBasicCriteria);
((CustomerSearchAdvanced) search).setCriteria((CustomerSearch) searchCriteria);
}else{
search = null;
}
if(search != null) _service.search(search);
But I think a better solution would be to create specific methods for each type of search. That way the code is more readable, plus you avoid all that casting. Then you would have to handle the returned RecordList for each specific record type.
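One way to picture the "specific method per search type" suggestion in C# is sketched below. The small classes at the top are only stand-ins for the SuiteTalk-generated proxy classes (their real definitions come from the web service reference, and the `criteria` and `basic` members are assumed to look like this). `SearchBuilders` and its methods are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Stand-ins for the generated SuiteTalk proxy classes (assumed shapes).
public enum RecordType { account, customer, vendor }
public abstract class SearchRecord { }
public class AccountSearchBasic : SearchRecord { }
public class AccountSearch : SearchRecord { public AccountSearchBasic basic; }
public class AccountSearchAdvanced : SearchRecord { public AccountSearch criteria; }
public class CustomerSearchBasic : SearchRecord { }
public class CustomerSearch : SearchRecord { public CustomerSearchBasic basic; }
public class CustomerSearchAdvanced : SearchRecord { public CustomerSearch criteria; }

public static class SearchBuilders
{
    // One strongly typed method per record type, so no casts are needed.
    public static AccountSearchAdvanced BuildAccountSearch()
    {
        var basic = new AccountSearchBasic(); // set account criteria here
        return new AccountSearchAdvanced { criteria = new AccountSearch { basic = basic } };
    }

    public static CustomerSearchAdvanced BuildCustomerSearch()
    {
        var basic = new CustomerSearchBasic(); // set customer criteria here
        return new CustomerSearchAdvanced { criteria = new CustomerSearch { basic = basic } };
    }

    // Lookup table that replaces the if/else chain for the dynamic case.
    static readonly Dictionary<RecordType, Func<SearchRecord>> Builders =
        new Dictionary<RecordType, Func<SearchRecord>>
        {
            { RecordType.account, () => BuildAccountSearch() },
            { RecordType.customer, () => BuildCustomerSearch() },
        };

    // Returns null for record types that have no builder registered.
    public static SearchRecord Build(RecordType type)
    {
        return Builders.TryGetValue(type, out var build) ? build() : null;
    }
}
```

Supporting a new record type then means adding one method and one dictionary entry. The caller would pass the result of `Build` to the service's search call only when it is not null.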
