lucene.net sort not working access violation - c#

I am trying to sort my results in Lucene.Net, but I keep getting this error:
An unhandled exception of type 'System.AccessViolationException' occurred in Search.dll
Additional information: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
I have tried setting Field.Index to both ANALYZED and NOT_ANALYZED, but no joy.
Analyzer analyzer = new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_29);
var parser = new QueryParser(Lucene.Net.Util.Version.LUCENE_29, "Title", analyzer);
Query query = parser.Parse(searchTerm.Trim() + "*");
var searcher = new IndexSearcher(directory, true);
var sortBy = new Lucene.Net.Search.Sort(new Lucene.Net.Search.SortField("Title", Lucene.Net.Search.SortField.STRING, true));
var filter = new QueryWrapperFilter(query);
// TopDocs topDocs3 = searcher.Search(query, filter, 500,sortBy);
// TopDocs topDocs = searcher.Search(query,500);
TopDocs topDocs2 = searcher.Search(query,null, 500, new Sort(new SortField("Title", SortField.STRING)));
var re = searcher.Search(query, null, 10, new Sort(new SortField("id", SortField.INT, true)));

I have encountered the same error when trying to order my search results with LUCENE_30. I must say I wrote this example in a hurry and it is not tested.
What I did was the following:
string sortText = Enum.GetName(typeof(SortableFields), sortBy);
SortField field = new SortField(sortText, SortField.STRING, sortDesc);
var sortByField = new Lucene.Net.Search.Sort(field);
// fillFields, trackDocScores, trackMaxScore and docsScoredInOrder are all false here
TopFieldCollector collector = Lucene.Net.Search.TopFieldCollector.Create(sortByField, MaxSearchResultsReturned, false, false, false, false);
var results = new List<SearchResult>(); // declared here so the return at the end compiles
using (Analyzer analyzer = new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_30))
{
    var queryParse = new QueryParser(Lucene.Net.Util.Version.LUCENE_30, IndexFields.FullText, analyzer);
    queryParse.AllowLeadingWildcard = true;
    Query query = queryParse.Parse(searchText);
    using (var searcher = new IndexSearcher(directory, true))
    {
        searcher.Search(query, collector);
        totalRows = collector.TotalHits;
        TopDocs matches = collector.TopDocs(skip, take);
        // convert results to known objects
        foreach (var item in matches.ScoreDocs)
        {
            int id = item.Doc;
            Document doc = searcher.Doc(id);
            SearchResult result = new SearchResult();
            result.ID = doc.GetField("ID").StringValue;
            results.Add(result);
        }
    }
}
return results;
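One further thing to check, which this thread does not confirm but which generally matters for string sorting in Lucene.Net: the field passed to SortField.STRING should contain a single un-tokenized term per document. A minimal sketch, assuming a hypothetical separate "TitleSort" field, of indexing an un-analyzed copy alongside the searchable field and sorting on it:
// Hypothetical "TitleSort" field: the analyzed "Title" stays searchable,
// while the NOT_ANALYZED copy gives the sort a single term per document.
var doc = new Document();
doc.Add(new Field("Title", title, Field.Store.YES, Field.Index.ANALYZED));
doc.Add(new Field("TitleSort", title, Field.Store.NO, Field.Index.NOT_ANALYZED));
writer.AddDocument(doc);
// At search time, sort on the un-tokenized copy rather than the analyzed field.
var sortByTitle = new Sort(new SortField("TitleSort", SortField.STRING, true));
TopDocs sorted = searcher.Search(query, null, 500, sortByTitle);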

Related

Scroll for all records ElasticSearch C# Nest

I have a query that should scroll through the matching records.
This is my NEST query in C# to get all the records. I have found many questions that solve this with different methods (for example LINQ), but I want to do it this way. Any suggestions or help would be appreciated.
string[] MERCHANTNO = MerchantId.Split(",");
var mustClause = new List<QueryContainer>();
var filterClause = new List<QueryContainer>();
var filters = new List<QueryContainer>();
filters.Add(new TermsQuery{
Field = new Field("MERCHANTNO"),
Terms = MERCHANTNO,
});
Logger.LogInformation(clsName, funcName, "Filter Clause is:", filters);
var SearchRequest = new SearchRequest<AcquirerDTO>(idxName) {
Size = 10000,
SearchType = Elasticsearch.Net.SearchType.QueryThenFetch,
Scroll = "5m",
Query = new BoolQuery { Must = filters }
};
var searchResponse = await _elasticClient.SearchAsync<AcquirerDTO>( SearchRequest );
The code to scroll through all the records you have in Elasticsearch is:
Filter
filters.Add(new TermsQuery {
Field = new Field("MERCHANTNO"), >>> Value needs to be searched
Terms = MERCHANTNO,
});
Date Range Filter
filterClause.Add(new DateRangeQuery {
Boost = 1.1,
Field = new Field("filedate"),
GreaterThanOrEqualTo = DateMath.Anchored(yesterday),
LessThanOrEqualTo = DateMath.Anchored(Today),
Format = "yyyy-MM-dd",
TimeZone = "+01:00"
});
Search Request for scrolling
var SearchRequest = new SearchRequest<AcquirerDTO>(idxName) {
From = 0,
Scroll = scrollTimeoutMinutes,
Size = scrollPageSize,
Query = new BoolQuery
{
Must = filters,
Filter = filterClause
}
};
var searchResponse = await _elasticClient.SearchAsync<AcquirerDTO>(SearchRequest);
// For debugging: pretty-print the JSON query that was actually sent to Elasticsearch
if (searchResponse.ApiCall.RequestBodyInBytes != null) {
    var requestJson = System.Text.Encoding.UTF8.GetString(searchResponse.ApiCall.RequestBodyInBytes);
    var JsonFormatQuery = JsonConvert.SerializeObject(JsonConvert.DeserializeObject(requestJson), Formatting.Indented);
}
This is the code for scrolling through all the results:
List<AcquirerDTO> results = new List<AcquirerDTO>();
if (searchResponse.Documents.Any())
results.AddRange(searchResponse.Documents);
string scrollid = searchResponse.ScrollId;
bool isScrollSetHasData = true;
while (isScrollSetHasData)
{
ISearchResponse<AcquirerDTO> loopingResponse = _elasticClient.Scroll<AcquirerDTO>(scrollTimeoutMinutes, scrollid);
if (loopingResponse.IsValid)
{
results.AddRange(loopingResponse.Documents);
scrollid = loopingResponse.ScrollId;
}
isScrollSetHasData = loopingResponse.Documents.Any();
}
var records = results;
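Once the loop has drained the scroll, it is worth releasing the scroll context on the cluster rather than letting it expire on its own. A short sketch, assuming the same _elasticClient and the scrollid variable from the loop above (treat the exact ClearScroll overload as an assumption for your NEST version):
// Free the server-side scroll context once all pages have been read.
var clearResponse = _elasticClient.ClearScroll(c => c.ScrollId(scrollid));
if (!clearResponse.IsValid)
{
    // The context also expires after the scroll timeout, so a failed clear is not fatal,
    // but it is worth logging.
    Logger.LogInformation(clsName, funcName, "ClearScroll failed:", clearResponse.DebugInformation);
}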

IList<T> return as a generic

I'm a beginner at coding, and I was trying to create a search engine, but there is a part I don't know how to solve: it should return an IList as a generic.
public IList<T> Search<T>(string textSearch)
{
IList<T> list = new List<T>();
var result = new DataTable();
using (Analyzer analyzer = new PanGuAnalyzer())
{
var queryParser = new QueryParser(Version.LUCENE_30, "FullText", analyzer);
queryParser.AllowLeadingWildcard = true;
var query = queryParser.Parse(textSearch);
var collector = TopScoreDocCollector.Create(1000, true);
Searcher.Search(query, collector);
var matches = collector.TopDocs().ScoreDocs;
result.Columns.Add("Title");
result.Columns.Add("Starring");
result.Columns.Add("ID");
foreach (var item in matches)
{
var id = item.Doc;
var doc = Searcher.Doc(id);
var row = result.NewRow();
row["Title"] = doc.GetField("Title").StringValue;
row["Starring"] = doc.GetField("Starring").StringValue;
row["ID"] = doc.GetField("ID").StringValue;
result.Rows.Add(row);
}
}
return result;
}
But in this code I couldn't return result; it says "Cannot implicitly convert type 'Data.DataTable' to 'Generic.IList'. An explicit conversion exists." How can I solve this?
I guess you don't actually want to support generics here, since it doesn't make sense and isn't possible in this form. If you have a class, for example Film, then return a List<Film>; you don't need the DataTable:
public IList<Film> SearchFilms(string textSearch)
{
IList<Film> list = new List<Film>();
using (Analyzer analyzer = new PanGuAnalyzer())
{
var queryParser = new QueryParser(Version.LUCENE_30, "FullText", analyzer);
queryParser.AllowLeadingWildcard = true;
var query = queryParser.Parse(textSearch);
var collector = TopScoreDocCollector.Create(1000, true);
Searcher.Search(query, collector);
var matches = collector.TopDocs().ScoreDocs;
foreach (var item in matches)
{
var film = new Film();
var id = item.Doc;
var doc = Searcher.Doc(id);
film.Title = doc.GetField("Title").StringValue;
film.Starring = doc.GetField("Starring").StringValue;
film.ID = doc.GetField("ID").StringValue;
list.Add(film);
}
}
return list;
}
Your return statement should be:
return result.AsEnumerable().ToList();
Don't forget to add the namespace:
using System.Linq;
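Note that AsEnumerable() yields DataRow objects, so that return gives you a List<DataRow> rather than a typed list. If you do keep the DataTable, here is a small sketch (assuming the Film class from the answer above and a reference to System.Data.DataSetExtensions) of projecting the rows into a typed list:
// Map each DataRow to a Film; Field<T>() comes from System.Data.DataSetExtensions.
List<Film> films = result.AsEnumerable()
    .Select(row => new Film
    {
        Title = row.Field<string>("Title"),
        Starring = row.Field<string>("Starring"),
        ID = row.Field<string>("ID")
    })
    .ToList();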

how to process hits on lucene 3.03

List<SearchResults> Searchresults = new List<SearchResults>();
// Specify the location where the index files are stored
string indexFileLocation = @"D:\Lucene.Net\Data\Persons";
Lucene.Net.Store.Directory dir = FSDirectory.Open(indexFileLocation);
// specify the search fields, lucene search in multiple fields
string[] searchfields = new string[] { "FirstName", "LastName", "DesigName", "CatagoryName" };
IndexSearcher indexSearcher = new IndexSearcher(dir);
// Making a boolean query for searching and get the searched hits
Query som = QueryMaker(searchString, searchfields);
int n = 1000;
TopDocs hits = indexSearcher.Search(som,null,n);
for (int i = 0; i <hits.TotalHits; i++)
{
SearchResults result = new SearchResults();
result.FirstName = hits.ScoreDocs.GetValue(i).ToString();
result.FirstName = hits.Doc.GetField("FirstName").StringValue();
result.LastName = hits.Doc(i).GetField("LastName").StringValue();
result.DesigName = hits.Doc(i).GetField("DesigName").StringValue();
result.Addres = hits.Doc(i).GetField("Addres").StringValue();
result.CatagoryName = hits.Doc(i).GetField("CatagoryName").StringValue();
Searchresults.Add(result);
}
I have table fields first name, last name, .... How can I process the hits to get the values from the search results?
I get an error that says TopDocs does not contain a definition for Doc.
Lean on the compiler: there is no property or method called Doc in the TopDocs class. The ScoreDocs property of TopDocs gives you the list of hits, each with a document number and a score. You need to use that document number to get the actual document: call the Doc method on IndexSearcher to fetch the document with that number, and then you can read the stored field data from it.
You can process the results like this:
foreach (var scoreDoc in hits.ScoreDocs)
{
var result = new SearchResults();
var doc = indexSearcher.Doc(scoreDoc.Doc);
result.FirstName = doc.GetField("FirstName").StringValue;
result.LastName = doc.GetField("LastName").StringValue;
result.DesigName = doc.GetField("DesigName").StringValue;
result.Addres = doc.GetField("Addres").StringValue;
result.CategoryName = doc.GetField("CategoryName").StringValue;
Searchresults.Add(result);
}
Or in more LINQ way:
var searchResults =
indexSearcher
.Search(som, null, n)
.ScoreDocs
.Select(scoreDoc => indexSearcher.Doc(scoreDoc.Doc))
.Select(doc =>
{
var result = new SearchResults();
result.FirstName = doc.GetField("FirstName").StringValue;
result.LastName = doc.GetField("LastName").StringValue;
result.DesigName = doc.GetField("DesigName").StringValue;
result.Addres = doc.GetField("Addres").StringValue;
result.CategoryName = doc.GetField("CategoryName").StringValue;
return result;
})
.ToList();
Separating the hit processing into its own method keeps the handling of matched documents clean, and if in future you want to highlight the matched documents you can easily embed the Lucene.Net highlighter in the getMatchedHits method (see the sketch after the code below).
List<SearchResults> Searchresults = new List<SearchResults>();
// Specify the location where the index files are stored
string indexFileLocation = @"D:\Lucene.Net\Data\Persons";
Lucene.Net.Store.Directory dir = FSDirectory.Open(indexFileLocation);
// specify the search fields, lucene search in multiple fields
string[] searchfields = new string[] { "FirstName", "LastName", "DesigName", "CatagoryName" };
IndexSearcher indexSearcher = new IndexSearcher(dir);
// Making a boolean query for searching and get the searched hits
Query som = QueryMaker(searchString, searchfields);
int n = 1000;
var hits = indexSearcher.Search(som,null,n).ScoreDocs;
Searchresults = getMatchedHits(hits,indexSearcher);
getMatchedHits method code:
public static List<SearchResults> getMatchedHits(ScoreDoc[] hits, IndexSearcher searcher)
{
List<SearchResults> list = new List<SearchResults>();
SearchResults obj;
try
{
for (int i = 0; i < hits.Length; i++)
{
// get the document from index
Document doc = searcher.Doc(hits[i].Doc);
string strFirstName = doc.Get("FirstName");
string strLastName = doc.Get("LastName");
string strDesigName = doc.Get("DesigName");
string strAddres = doc.Get("Addres");
string strCategoryName = doc.Get("CategoryName");
obj = new SearchResults();
obj.FirstName = strFirstName;
obj.LastName = strLastName;
obj.DesigName= strDesigName;
obj.Addres = strAddres;
obj.CategoryName = strCategoryName;
list.Add(obj);
}
return list;
}
catch (Exception ex)
{
return null; // or throw exception
}
}
Hope it helps!
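As mentioned above, highlighting can be embedded in getMatchedHits. A minimal sketch for the body of that loop, assuming the contrib highlighter (Lucene.Net.Search.Highlight) is referenced and that the original query (som) is passed into the method; the analyzer and the "FirstName" field are just the ones already used in this thread:
// Wrap matched terms in the stored FirstName value with <b> tags.
var formatter = new SimpleHTMLFormatter("<b>", "</b>");
var highlighter = new Highlighter(formatter, new QueryScorer(som));
string firstNameText = doc.Get("FirstName");
// GetBestFragment returns null when the field has no match, so fall back to the raw value.
string highlightedFirstName = highlighter.GetBestFragment(
    new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_30), "FirstName", firstNameText) ?? firstNameText;
obj.FirstName = highlightedFirstName;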

Field Boosting Doesn't Work/Effect Lucene.net

I'm trying to set boosts on document fields to make the search results more accurate, but as far as I can see it doesn't work.
Here is my code.
Indexing:
private static void _addToLuceneIndex(Datafile Datafile, IndexWriter writer)
{
// remove older index entry
var searchQuery = new TermQuery(new Term("Id", Datafile.article.Id.ToString()));
writer.DeleteDocuments(searchQuery);
// add new index entry
var doc = new Document();
var id = new Field("Id", Datafile.article.Id.ToString(), Field.Store.YES, Field.Index.NOT_ANALYZED);
var content = new Field("Content", Datafile.article.Content, Field.Store.YES, Field.Index.ANALYZED, Field.TermVector.WITH_POSITIONS);
content.Boost = 4;
var title = new Field("Title", Datafile.article.Title, Field.Store.YES, Field.Index.ANALYZED);
title.Boost = 6;
doc.Add(id);
doc.Add(content);
doc.Add(title);
foreach (var item in Datafile.article.Article_Tag)
{
var tmpta = new Field("Atid", item.Id.ToString(), Field.Store.YES, Field.Index.NOT_ANALYZED);
var tagname = new Field("Tagname", item.Tag.name, Field.Store.YES, Field.Index.ANALYZED);
tagname.Boost = 8;
doc.Add(tmpta);
doc.Add(tagname);
}
// add lucene fields mapped to db fields
// add entry to index
writer.AddDocument(doc);
}
I've used Luke.Net to see whether the fields were boosted, but they weren't: the boost still equals 1.0.
So I tried to run and test it, but the result disappointed me anyway.
Here is my search code.
Searching:
private static IEnumerable<Datafile> _search(string searchQuery, string searchField = "")
{
// validation
if (string.IsNullOrEmpty(searchQuery.Replace("*", "").Replace("?", "")))
return new List<Datafile>();
var indexReader = IndexReader.Open(Directory, false);
// set up lucene searcher
using (var searcher = new IndexSearcher(indexReader))
{
var hits_limit = 1000;
// search by single field
var enanalyzer = new SnowballAnalyzer(Version.LUCENE_30, "English");
var aranalyzer = new SnowballAnalyzer(Version.LUCENE_30, "Arabic");
string[] fields = new string[] { "Title", "Content", "Tagname" };
// Dictionary<string, float> boosts = new Dictionary<string, float>();
// boosts.Add("Title", 5);
// boosts.Add("Content", 3);
// boosts.Add("Tagname", 7);
var enparser = new MultiFieldQueryParser(Version.LUCENE_30, fields, enanalyzer);
var arparser = new MultiFieldQueryParser(Version.LUCENE_30, fields, aranalyzer);
var query = QueryModel(searchQuery, new QueryParser[] { enparser, arparser });
searcher.SetDefaultFieldSortScoring(true, false);
TopFieldCollector collector = TopFieldCollector.Create(new Sort(new SortField(null, SortField.SCORE, false), new SortField("Title", SortField.STRING, true), new SortField("Tagname", SortField.STRING, true), new SortField("Content", SortField.STRING, true)),
hits_limit,
false, // fillFields - not needed, we want score and doc only
true, // trackDocScores - need doc and score fields
true, // trackMaxScore - related to trackDocScores
false); // should docs be in docId order?
searcher.Search(query, collector);
var hits = collector.TopDocs().ScoreDocs;
var results = new List<Datafile>();
foreach (var hit in hits)
{
var doc = searcher.Doc(hit.Doc);
var df = _mapLuceneDocumentToData(doc);
df.score = hit.Score;
results.Add(df);
}
searcher.Dispose();
return results;
// search by multiple fields (ordered by RELEVANCE)
}
}
QueryModel Method:
private static Query QueryModel(string searchQuery, QueryParser[] parsers)
{
BooleanQuery query = new BooleanQuery();
searchQuery = "*" + searchQuery + "*";
foreach (var parser in parsers)
{
parser.AllowLeadingWildcard = true;
var thequery = parser.Parse(searchQuery);
query.Add(new BooleanClause(thequery, Occur.SHOULD));
}
return query;
}
I'm new to Lucene.Net. I love it, but I can't get my head around this problem.
PS:
I also want fuzzy matching, so that when the user enters:
city in russua he gets the same results as if he had entered: city in russia
I tried the FuzzyQuery class but it doesn't work anyway. Is it necessary to use the FuzzyQuery class to get that result?
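On the fuzzy-matching side, which the self-answer below does not cover, here is a short sketch of the two usual options in Lucene.Net 3.0.3: building a FuzzyQuery directly, or letting the query parser do it via the ~ operator. The "Content" field, the analyzer, and the 0.7 similarity value are illustrative only:
// Option 1: a FuzzyQuery built by hand; 0.7f is the minimum similarity.
var fuzzy = new FuzzyQuery(new Term("Content", "russua"), 0.7f);
// Option 2: the classic query parser understands term~similarity syntax.
var fuzzyParser = new QueryParser(Version.LUCENE_30, "Content", new StandardAnalyzer(Version.LUCENE_30));
Query parsedFuzzy = fuzzyParser.Parse("city in russua~0.7");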
Since no one answered my question and I have found a solution for this issue myself, I used search-time query boosting. Here is my code:
var QParser = new QueryParser(Version.LUCENE_30, "Content", analyzer);
QParser.AllowLeadingWildcard = true;
var query = QParser.Parse(searchQuery);
query.Boost = 7.0f;
return query;
You can use a BooleanQuery if you want to do an OR/AND search.
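For per-field boosting at search time (the approach the question's code had commented out), here is a sketch using the MultiFieldQueryParser overload that takes a boosts dictionary; the field names match the ones indexed above, but the weights are illustrative:
// Boost Tagname highest, then Title, then Content, at query time.
var boosts = new Dictionary<string, float>
{
    { "Title", 5f },
    { "Content", 3f },
    { "Tagname", 7f }
};
var boostedParser = new MultiFieldQueryParser(Version.LUCENE_30, new[] { "Title", "Content", "Tagname" }, analyzer, boosts);
Query boostedQuery = boostedParser.Parse(searchQuery);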

Lucene returns same exact search results no matter the search term

Here is my code
term = Server.UrlDecode(term);
string indexFileLocation = "C:\\lucene\\Index\\post";
Lucene.Net.Store.Directory dir =
Lucene.Net.Store.FSDirectory.GetDirectory(indexFileLocation, false);
//create an index searcher that will perform the search
Lucene.Net.Search.IndexSearcher searcher = new
Lucene.Net.Search.IndexSearcher(dir);
//build a query object
Lucene.Net.Index.Term searchTerm =
new Lucene.Net.Index.Term("post_title", term);
Lucene.Net.Analysis.Standard.StandardAnalyzer analyzer = new Lucene.Net.Analysis.Standard.StandardAnalyzer();
Lucene.Net.QueryParsers.QueryParser queryParser = new
Lucene.Net.QueryParsers.QueryParser("post_title", analyzer);
Lucene.Net.Search.Query query = queryParser.Parse(term);
//execute the query
Lucene.Net.Search.Hits hits = searcher.Search(query);
List<string> s = new List<string>();
for (int i = 0; i < hits.Length(); i++)
{
Lucene.Net.Documents.Document doc = hits.Doc(i);
s.Add(doc.Get("post_title_raw"));
}
ViewData["s"] = s;
Here is my indexing code:
//create post lucene index
LuceneType lt = new LuceneType();
lt.Analyzer = new Lucene.Net.Analysis.Standard.StandardAnalyzer();
lt.Writer = new Lucene.Net.Index.IndexWriter("C:/lucene/Index/post", lt.Analyzer, true);
using (var context = new MvcApplication1.Entity.test2Entities())
{
var posts = from p in context.post
where object.Equals(p.post_parentid, null) && p.post_isdeleted == false
let Answers = from a in context.post
where a.post_parentid == p.post_id
select new
{
a.post_description
}
let Comments = from c in context.comment
where c.post.post_id == p.post_id
select new
{
c.comment_text
}
select new
{
p,
Answers,
Comments
};
foreach (var post in posts)
{
//lets concate all the answers and comments
StringBuilder answersSB = new StringBuilder();
StringBuilder CommentsSB = new StringBuilder();
foreach (var answer in post.Answers)
answersSB.Append(answer.post_description);
foreach (var comment in post.Comments)
CommentsSB.Append(comment.comment_text);
//add rows
lt.Doc.Add(new Lucene.Net.Documents.Field(
"post_id",
post.p.post_id.ToString(),
Lucene.Net.Documents.Field.Store.YES,
Lucene.Net.Documents.Field.Index.UN_TOKENIZED
));
lt.Doc.Add(new Lucene.Net.Documents.Field(
"post_title",
new System.IO.StringReader(post.p.post_title)));
lt.Doc.Add(new Lucene.Net.Documents.Field(
"post_title_raw",
post.p.post_title,
Lucene.Net.Documents.Field.Store.YES,
Lucene.Net.Documents.Field.Index.UN_TOKENIZED));
lt.Doc.Add(new Lucene.Net.Documents.Field(
"post_titleslug",
post.p.post_titleslug,
Lucene.Net.Documents.Field.Store.YES,
Lucene.Net.Documents.Field.Index.UN_TOKENIZED));
lt.Doc.Add(new Lucene.Net.Documents.Field(
"post_tagtext",
new System.IO.StringReader(post.p.post_tagtext)));
lt.Doc.Add(new Lucene.Net.Documents.Field(
"post_tagtext",
post.p.post_tagtext,
Lucene.Net.Documents.Field.Store.YES,
Lucene.Net.Documents.Field.Index.UN_TOKENIZED));
lt.Doc.Add(new Lucene.Net.Documents.Field(
"post_description",
new System.IO.StringReader(post.p.post_description)));
lt.Doc.Add(new Lucene.Net.Documents.Field(
"post_description_raw",
post.p.post_description,
Lucene.Net.Documents.Field.Store.YES,
Lucene.Net.Documents.Field.Index.UN_TOKENIZED));
lt.Doc.Add(new Lucene.Net.Documents.Field(
"post_Answers",
new System.IO.StringReader(answersSB.ToString())));
lt.Doc.Add(new Lucene.Net.Documents.Field(
"post_Comments",
new System.IO.StringReader(CommentsSB.ToString())));
}
lt.Writer.AddDocument(lt.Doc);
lt.Writer.Optimize();
lt.Writer.Close();
Why does this return the same results for any search term?
Lucene.Net.Search.Query query = queryParser.Parse(term);
In the code above you have used term instead of searchterm.
Your code should look like this:
Lucene.Net.Search.Query query = queryParser.Parse(searchterm);
You can make a small alteration like the one below:
//build a query object
Lucene.Net.Index.Term searchTerm =
new Lucene.Net.Index.Term("post_title", term);
TermQuery tq = new TermQuery(searchTerm);
......
......
Lucene.Net.Search.Query query = tq;
Now there is no need for the parser.
If you still need the parser, you can change the line above to:
Lucene.Net.Search.Query query = queryParser.Parse(tq.ToString());
Hope this helps.
Not a direct answer, but get LUKE (it works with .NET indexes too) and open your index. Try its querier with the right type of analyzer. If that works, you know the problem is in your querying. If it doesn't, it could be in both the indexing and the querying, but at least this ought to get you on the right track.
