Why doesn't Lucene remove docs? - c#

I'm using Lucene.NET 2.3.1 with a MultiSearcher.
For testing purposes, I'm indexing a database with about 10,000 rows. I have two indexes, and for each row I pick one of them at random to insert into. This works correctly, but since Lucene doesn't have update capabilities, I have to test whether the row already exists (I have an Id field) and delete it first.
I have a List&lt;IndexModifier&gt;, a List&lt;IndexReader&gt;, and a List&lt;IndexSearcher&gt;, and each is populated with this code:
IndexModifier mod = new IndexModifier(path, new StandardAnalyzer(), create);
m_Modifiers.Add(mod);
m_Readers.Add(IndexReader.Open(path));
m_Searchers.Add(new IndexSearcher(path));
Now the delete code:
Hits results = m_Searcher.Search(new TermQuery(t));
for (int i = 0; i < results.Length(); i++)
{
    DocId = results.Id(i);
    Index = m_Searcher.SubSearcher(DocId);
    DocId = m_Searcher.SubDoc(DocId);
    m_Modifiers[Index].DeleteDocument(DocId);
}
The search is correct, and I'm getting results when the row exists. SubSearcher always returns 0 or 1; if Index is 0, SubDoc returns the same ID that was passed in, and if it's 1, it returns roughly that ID minus 5000 times the number of times I have indexed the DB. It seems as if nothing is being deleted.
Each time I index the database, I optimize and close the indices, and Luke says it has no pending deletions.
What could be the problem?

I am not sure what the end goal of this activity is, so pardon me if the following solution doesn't meet your requirements.
First, if you want to delete documents, you can use the IndexReader you have already created; an IndexModifier is not required.
Second, you need not find the subsearcher ID and the document ID within that subsearcher. You can just as well use the top-level MultiReader. I would write the equivalent Java code as follows:
IndexReader[] readers = new IndexReader[size];
// Initialize readers
MultiReader multiReader = new MultiReader(readers);
IndexSearcher searcher = new IndexSearcher(multiReader);
Hits results = searcher.search(new TermQuery(t));
for (int i = 0; i < results.length(); i++) {
    int docID = results.id(i);
    multiReader.deleteDocument(docID);
}
multiReader.commit(); // Check if this throws an exception.
multiReader.close();
searcher.close();
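Since the question is about Lucene.NET, the same idea translates roughly as below. This is an untested sketch assuming the Lucene.NET 2.x API surface (a MultiReader constructed from an IndexReader[], Hits.Length()/Id(), and Close() flushing pending deletions); size, t, and the reader initialization are placeholders from the question.

```csharp
IndexReader[] readers = new IndexReader[size];
// ... open one IndexReader per index directory ...
MultiReader multiReader = new MultiReader(readers);
IndexSearcher searcher = new IndexSearcher(multiReader);

Hits results = searcher.Search(new TermQuery(t));
for (int i = 0; i < results.Length(); i++)
{
    multiReader.DeleteDocument(results.Id(i));
}

// Closing the reader writes the pending deletions to the index.
searcher.Close();
multiReader.Close();
```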


Querying parse in unity project returns all blank data

I am implementing the Parse Unity SDK in order to have a high-score system. I run a query on my data to get the top ten players and their scores (it should be sorted by score). For some reason, when my code runs, I get a blank string for the name and 0 for the score, even though my data has real values in it.
Here is the query:
int[] scores = new int[10];
string[] names = new string[10];
int i = 0;
var query = ParseObject.GetQuery("HighScores").OrderByDescending("score").Limit(10);
query.FindAsync().ContinueWith(t =>
{
    IEnumerable<ParseObject> results = t.Result;
    foreach (var obj in results)
    {
        scores[i] = obj.Get<int>("score");
        names[i] = obj.Get<string>("playerName");
        i++;
    }
});
The class name is "HighScores" and I am trying to access the score ("score") and player name ("playerName") of each saved entry.
EDIT:
I found that zero results are being returned, so it must be something with the query. I don't see what could be wrong with it.
8/17/15
I still have not found out what is going on with my query. Any ideas?
It turns out that I was getting the data from the query all along; the query was fine. The actual issue was that I was writing my newly found scores to a string that was built before the query had finished fetching the data, since the query is an asynchronous call. Instead of doing it that way, I let the query run to completion, and once it finishes I set a static bool called finishedRunningQuery to true. Then in the Update() method I check: if (finishedRunningQuery), update the high-score text. This fixed the issue.
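The fix described above can be sketched with plain Tasks. This is a minimal stand-in, not the Parse SDK: the fake task below plays the role of FindAsync, and polling the flag stands in for checking it in Unity's Update() loop.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

int score = 0;
bool finishedRunningQuery = false;

// Stand-in for query.FindAsync(): a task that completes after a short delay.
Task<int> fakeQuery = Task.Run(() => { Thread.Sleep(100); return 42; });

fakeQuery.ContinueWith(t =>
{
    score = t.Result;             // runs only once the task has finished
    finishedRunningQuery = true;  // signal that the data is ready
});

Console.WriteLine(score);         // read too early: almost certainly still 0

// Equivalent of checking the flag in Update(): wait until it flips.
while (!finishedRunningQuery) Thread.Sleep(10);
Console.WriteLine(score);         // now 42
```

Reading `score` immediately after scheduling the continuation shows the original bug; only after the flag flips is the value safe to use.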

Is this a DDD rule?

Ok so I have a database-table with 1000 rows.
From these I need to randomly extract 4 entries.
This is a business rule. I could easily do the random selection in LINQ or SQL, but my Domain project must be independent, not referencing any other project.
So, to be DDD-clean, I should have a list there, load it with all 1000 rows, and randomly extract 4.
Is this ok? What if the db-table has 100k rows?
If the primary keys are sequential and uninterrupted, this approach yields large performance benefits for tables of 100k rows or beyond. Even if they are not sequential, you can check for a missing id and iterate a little to find the next one.
Basically, you first want to get a count of the table's rows:
var rowCount = db.DbSet<TableName>().Count(); //EF for pseudo
Then get 4 random numbers in that range:
var rand = new Random();
var randIds = Enumerable.Range(0, 4).Select(i => rand.Next(0, rowCount)).ToArray();
Then iterate through those, fetching the records by id. (This could also be done using Contains and the ids; I was not sure which is faster, but with only 4 ids, Find should execute quickly.)
var randomRecords = new List<TableName>();
foreach (int id in randIds)
{
    var match = db.DbSet<TableName>().Find(id);
    // If the id is not present, iterate a little to find the next one
    // (or provide a custom column for the exact record, as you indicate in comments).
    var nextId = id;
    while (match == null)
    {
        match = db.DbSet<TableName>().Find(++nextId);
    }
    randomRecords.Add(match);
}
Building on Travis's code, this should work for you. It gets a count of the records, generates 4 random numbers, then asks for the nth record in the table and adds it to the result list.
var rowCount = db.TableName.Count();
var rand = new Random();
var randIds = Enumerable.Range(0, 4).Select(i => rand.Next(0, rowCount));
var randomRecords = new List<TableName>();
foreach (int id in randIds)
{
    var match = db.TableName
        .OrderBy(x => x.id) // Order by the primary key -- I assumed id
        .Skip(id)
        .First();
    randomRecords.Add(match);
}
You could also do something like this, IF you have an auto-incrementing id field as the primary key. The caveat is that this isn't a fixed-time operation, since you can't be sure how many loops will be required:
var idMax = db.TableName.Max(t => t.id);
var rand = new Random();
var randomRecords = new List<TableName>();
while (randomRecords.Count < 4)
{
    var match = db.TableName.Find(rand.Next(1, idMax + 1));
    if (match != null)
        randomRecords.Add(match);
}
If you don't care about absolute randomness (this is very much not uniform; rows just after gaps in the id sequence are weighted more heavily than others), this is the fastest method, requiring only one database trip:
var idMax = db.TableName.Max(t => t.id);
var rand = new Random();
var randIds = Enumerable.Range(0, 4).Select(i => rand.Next(1, idMax + 1));
var query = db.TableName.Where(t => false); // empty seed query to Concat onto
foreach (int id in randIds)
{
    query = query.Concat(db.TableName.OrderBy(t => t.id).Where(t => t.id >= id).Take(1));
}
var randomRecords = query.ToList();
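The weighting caveat is easy to see with an in-memory stand-in for the table: a List&lt;int&gt; of hypothetical ids with a gap plays the role of db.TableName, and the `Where(t => t.id >= r).Take(1)` pattern becomes a `Where(...).Min()`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical id column with a gap: ids 3 through 9 are missing, so id 10
// is picked whenever the random number lands anywhere in 3..10.
var ids = new List<int> { 1, 2, 10, 11, 12 };
var rand = new Random(12345); // fixed seed so the demo is repeatable
var hits = new Dictionary<int, int>();

for (int trial = 0; trial < 10000; trial++)
{
    int r = rand.Next(1, ids.Max() + 1); // like rand.Next(1, idMax + 1)
    // In-memory stand-in for:
    //   db.TableName.OrderBy(t => t.id).Where(t => t.id >= r).Take(1)
    int picked = ids.Where(id => id >= r).Min();
    hits[picked] = hits.TryGetValue(picked, out var n) ? n + 1 : 1;
}

foreach (var e in hits.OrderBy(e => e.Key))
    Console.WriteLine($"id {e.Key} picked {e.Value} times");
// id 10 absorbs the whole 3..10 range, so it dominates the counts.
```

With this distribution of ids, id 10 is selected roughly eight times out of twelve, which is exactly the non-uniformity the answer warns about.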

List<List> confusion

Snippets of my code:
List<List<optionsSort>> stocks = new List<List<optionsSort>>();
optionsSort tempStock1 = new optionsSort();
List<optionsSort> stock = new List<optionsSort>();
Then some code:
for (int j = 1; j < optionsSt.Count; j++)
{
    if (optionsSt[j].isin == optionsSt[j - 1].isin)
    {
        tempStock1.name = optionsSt[j].name;
        tempStock1.date = optionsSt[j].date;
        tempStock1.strike = optionsSt[j].strike;
        tempStock1.size = optionsSt[j].size;
        tempStock1.isin = optionsSt[j].isin;
        tempStock1.callPut = optionsSt[j].callPut;
        stock.Add(tempStock1);
    }
    else
    {
        stocks.Add(stock);
        k = k + 1;
        stock.Clear();
        tempStock1.name = optionsSt[j].name;
        tempStock1.date = optionsSt[j].date;
        tempStock1.strike = optionsSt[j].strike;
        tempStock1.size = optionsSt[j].size;
        tempStock1.isin = optionsSt[j].isin;
        tempStock1.callPut = optionsSt[j].callPut;
        stock.Add(tempStock1);
    }
} // end for
Basically, I'm going through a large list to sort elements into groups in a new list named stocks.
The problem is that when I add all the elements contained in stock to stocks and then clear stock on the next line to start again, I also delete all the elements I have stored in stocks.
Any ideas? Do I have to index stocks, like stocks[i].Add(stock), so each block of similar stocks is an element in stocks?
Thanks for any help.
The problem is that List<T> objects, like all classes in .NET, are reference types. That means that every time you add stock to stocks you aren't adding a new list, you are only adding a reference to the same list in memory. So when you later call Clear, that is reflected both in your variable stock and in all other references in stocks.
You can resolve this by making a shallow copy of stock every time you add it to stocks:
stocks.Add(stock.ToList());
You're not creating a new list; you're using one list, filling it and clearing it repeatedly. Since your outer list contains that same list multiple times, every entry shows the same contents. So when you clear your list, you can no longer access the old contents, even if you try to access them through the outer list.
What you need to do is to change this line:
stock.Clear();
To this:
stock = new List<optionsSort>();
That is what you really meant. :)
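The reference-vs-copy behaviour both answers describe can be shown in a few lines; here List&lt;int&gt; stands in for List&lt;optionsSort&gt;.

```csharp
using System;
using System.Collections.Generic;

var stocks = new List<List<int>>();
var stock = new List<int> { 1, 2, 3 };

// Adding 'stock' stores a reference to the list, not a copy of its contents.
stocks.Add(stock);
stock.Clear(); // empties the very list that stocks[0] points to
Console.WriteLine(stocks[0].Count); // 0: the "saved" contents are gone

// Fix: add a shallow copy instead (stock.ToList() does the same thing).
stock = new List<int> { 1, 2, 3 };
stocks.Add(new List<int>(stock));
stock.Clear();
Console.WriteLine(stocks[1].Count); // 3: the copy is unaffected
```

Assigning `stock = new List<int>()` after each `stocks.Add(stock)`, as the second answer suggests, achieves the same result by leaving the old list untouched inside stocks.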

Finalise SQLite 3 statement

I'm developing a Metro app using the Windows 8 Release Preview and C# (VS 2012). I'm new to SQLite; I integrated SQLite 3.7.13 into my app and it is working fine. Observe my code below:
var dbPath = Path.Combine(Windows.Storage.ApplicationData.Current.LocalFolder.Path, "Test.db");
using (var db = new SQLite.SQLiteConnection(dbPath))
{
    var data = db.Table<tablename>().Where(tablename => tablename.uploaded_bool == false && tablename.Sid == 26);
    try
    {
        int iDataCount = data.Count();
        int id;
        if (iDataCount > 0)
        {
            for (int i = 0; i < iDataCount; i++)
            {
                Elements = data.ElementAt(i);
                id = Elements.id;
                /*
                    Doing some code
                */
            }
            int i = db.Delete<tablename>(new tablename() { Sid = 26 });
        }
    }
    catch (Exception ex)
    {
    }
}
where "Sid" is a column in my database, and with the number 26 I will get n rows.
So, using a for loop, I need to do some work, and after the loop I need to delete the records with Sid = 26. At this line:
int i = db.Delete<tablename>(new tablename() { Sid = 26 });
I'm getting an "unable to close due to unfinalised statements" exception. So my question is: how do I finalise the statement in SQLite 3? Apparently SQLite 3 has a finalize method for destroying previous DB calls, but I am not sure how to implement it. Please help me.
Under the covers sqlite-net does some amazing things in an attempt to manage queries and connections for you.
For example, the line
var data = db.Table<tablename>().Where(...)
Does not actually establish a connection or execute anything against the database. Instead, it creates an instance of a class called TableQuery which is enumerable.
When you call
int iDataCount = data.Count();
TableQuery actually executes
GenerateCommand("count(*)").ExecuteScalar<int>();
When you call
Elements = data.ElementAt(i);
TableQuery actually calls
return Skip(index).Take(1).First();
Take(1).First() eventually calls GetEnumerator, which compiles a SQLite command, executes it with a LIMIT 1 clause, and serializes the result back into your data class.
So, basically, every time you call data.ElementAt you are executing another query. This is different from standard .NET enumerations where you are just accessing an element in a collection or array.
I believe this is the root of your problem. I would recommend that instead of getting the count and using a for(i, ...) loop, you simply do foreach (tablename tn in data). This will cause all records to be fetched at once instead of record by record in the loop, and that alone may be enough to close the query and allow you to delete the records after the loop. If not, I recommend you create a collection and add each Sid to it during the loop; then, after the loop, go back and remove those Sids in another pass. Worst-case scenario, you could close the connection between the loops.
Hope that helps.
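The re-execution behaviour is easy to reproduce without SQLite at all: any deferred IEnumerable&lt;T&gt; shows it. The iterator below is a stand-in for sqlite-net's TableQuery, with the counter playing the role of a database round trip.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int executions = 0;

// A deferred sequence, like sqlite-net's TableQuery: nothing runs until enumerated.
IEnumerable<int> Query()
{
    executions++; // one "database round trip" per enumeration
    for (int i = 0; i < 5; i++)
        yield return i * 10;
}

var data = Query();

// ElementAt-style access: every call enumerates (i.e. queries) again.
for (int i = 0; i < 3; i++)
{
    int value = data.ElementAt(i);
}
Console.WriteLine(executions); // 3: one execution per ElementAt call

// foreach enumerates exactly once.
executions = 0;
foreach (int v in data) { }
Console.WriteLine(executions); // 1
```

This is why switching the for(i, ...) loop over ElementAt to a single foreach reduces the number of open statements against the database.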

split SortedList to multiple lists or arrays [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
How to split an array into a group of n elements each?
I believe I oversimplified this question, so I am editing it a bit. From within a .NET 3.5 console application I have a SortedList<string, string> that will contain an unknown number of key/value pairs. I will get this collection by reading in rows from a table within a Microsoft Word document. The user will then be able to add additional items to this collection. Once the user has finished adding to the collection, I need to write the collection back to a new Microsoft Word document.

The difficulty is that the items must be written back to the document in alphabetical order into a multicolumn table, first down the left side of the table and then down the right side, and since the output will likely be spread across multiple pages, I also need to keep the order across pages. So the first table on the first page may contain A through C on the left side and C through F on the right side; if the table exceeds the page, a new table is needed. The new table may contain F through I on the left side and L through O on the right. Since the table will likely span multiple pages, and I know the maximum number of rows per table per page, I can do the math to determine how many tables I will need overall. This image is representative of the output:
For the sake of brevity, if an output table can contain a maximum of 7 rows per page and 2 items per row, and I have 28 items, then I will need to write the output to 2 tables. Of course, I won't really know how many tables I need until I read in the data, so I can't simply hardcode the number of output tables.
What is the best way to take my SortedList and split it out into n collections in order to create the table structure described?
It is not necessary to split the list (if the only purpose is to write items in a table).
You can just iterate through the list and write row breaks in appropriate places.
for (int i = 0; i < sortedList.Count; i++)
{
    if (i % 3 == 0)
    {
        Console.Write("|"); // write beginning of the row
    }
    Console.Write(sortedList.Values[i].PadRight(10)); // write cell
    Console.Write("|"); // write cell divider
    if (i % 3 == 2)
    {
        Console.WriteLine(); // write end of the row
    }
}
// optional: write empty cells if sortedList.Count % 3 != 0
// optional: write end of the row if sortedList.Count % 3 != 2
You should extend your question by specifying what the output of your script is. If you want to write a table to the console, the above solution is probably the best. However, if you are using a rich user interface (such as WinForms or ASP.NET), you should use the built-in tools and controls to display data in a table.
I played with LINQ a little bit and came up with this solution. It creates a tree structure based on the "input parameters" (rowsPerPage and columnsPerPage). The columns on the last page may not all have the same size (the code can easily be fixed if that is a problem).
SortedList<string, string> sortedList ... // input sortedList
int rowsPerPage = 7;
int columnsPerPage = 2;
var result = from col in
                 (from i in sortedList.Select((item, index) => new { Item = item, Index = index })
                  group i by (i.Index / rowsPerPage) into g
                  select new { ColumnNumber = g.Key, Items = g })
             group col by (col.ColumnNumber / columnsPerPage) into page
             select new { PageNumber = page.Key, Columns = page };

foreach (var page in result)
{
    Console.WriteLine("Page no. {0}", page.PageNumber);
    foreach (var col in page.Columns)
    {
        Console.WriteLine("\tColumn no. {0}", col.ColumnNumber);
        foreach (var item in col.Items)
        {
            Console.WriteLine("\t\tItem key: {0}, value: {1}", item.Item.Key, item.Item.Value);
        }
    }
}
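A self-contained version of the same grouping, using a small made-up sample list and small page dimensions (method syntax instead of query syntax, but the index / rowsPerPage and columnNumber / columnsPerPage keys are identical), shows the shape of the tree it produces:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var sortedList = new SortedList<string, string>
{
    ["a"] = "1", ["b"] = "2", ["c"] = "3", ["d"] = "4",
    ["e"] = "5", ["f"] = "6", ["g"] = "7"
};
int rowsPerPage = 2;    // items per column
int columnsPerPage = 2; // columns per page

var result = sortedList
    .Select((item, index) => new { Item = item, Index = index })
    .GroupBy(i => i.Index / rowsPerPage)                 // split into columns
    .Select(g => new { ColumnNumber = g.Key, Items = g.ToList() })
    .GroupBy(col => col.ColumnNumber / columnsPerPage)   // split columns into pages
    .Select(p => new { PageNumber = p.Key, Columns = p.ToList() })
    .ToList();

foreach (var page in result)
{
    Console.WriteLine($"Page no. {page.PageNumber}");
    foreach (var col in page.Columns)
        Console.WriteLine($"\tColumn no. {col.ColumnNumber}: " +
            string.Join(", ", col.Items.Select(i => i.Item.Key)));
}
// 7 items -> 4 columns of up to 2 items -> 2 pages of up to 2 columns
```

Because the list is already sorted by key, this preserves the alphabetical down-the-left-then-down-the-right order the question asks for.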
