Ok so I have a database-table with 1000 rows.
From these I need to randomly extract 4 entries.
This is a business rule. I could easily do the random thing in LINQ or SQL. But my Domain project must be independent, not referencing any other project.
So I should have a list there, load it with all the 1000 rows and randomly extract 4 to be DDD-clean.
Is this ok? What if the db-table has 100k rows?
If the primary keys are sequential with no gaps, that yields large performance benefits for tables of 100k rows or more. Even if there are gaps, I believe you can check for a missing id and step forward a little until you reach one that exists.
Basically you are going to want to get a count of the table
var rowCount = db.Set<TableName>().Count(); // EF, treat as pseudocode
And then get 4 random numbers in that range
var rand = new Random();
var randIds = Enumerable.Range(0, 4).Select(i => rand.Next(0, rowCount)).ToArray();
And then iterate through those, getting the records by id. (This could also be done with a Contains on the ids; I was not sure which is faster, but with only 4 lookups Find should execute quickly.)
var randomRecords = new List<TableName>();
foreach (int id in randIds)
{
    // the foreach variable is read-only, so step through a copy
    var candidateId = id;
    var match = db.Set<TableName>().Find(candidateId);
    // if the id is not present, then you can iterate a little to find it
    // or provide a custom column for the exact record as you indicate in comments
    // (this assumes some id at or above the candidate exists)
    while (match == null)
    {
        match = db.Set<TableName>().Find(++candidateId);
    }
    randomRecords.Add(match);
}
Building on Travis's code, this should work for you. It basically gets a count of records, generates 4 random numbers, then asks for the nth record in the table and adds it to the result list.
var rowCount = db.TableName.Count();
var rand = new Random();
var randIds = Enumerable.Range(0,4).Select(i => rand.Next(0,rowCount));
var randomRecords = new List<TableName>();
foreach(int id in randIds)
{
var match = db.TableName
.OrderBy(x=>x.id) // Order by the primary key -- I assumed id
.Skip(id).First();
randomRecords.Add(match);
}
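The loop above can draw the same offset twice, which would return the same row twice. A minimal sketch of one way around that, assuming you want distinct rows: pick the offsets with a small helper that has no dependency on EF or SQL, so it could live in the Domain project. The names `RandomOffsets` and `PickDistinct` are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class RandomOffsets
{
    // Returns `count` distinct offsets in [0, rowCount),
    // or every offset when the table has fewer rows than requested.
    public static List<int> PickDistinct(int rowCount, int count, Random rng)
    {
        if (rowCount < 0) throw new ArgumentOutOfRangeException(nameof(rowCount));
        var wanted = Math.Min(count, rowCount);
        var chosen = new HashSet<int>();
        while (chosen.Count < wanted)
        {
            // HashSet.Add ignores values that are already present
            chosen.Add(rng.Next(0, rowCount));
        }
        return new List<int>(chosen);
    }
}
```

The result can replace randIds in the loop above, for example `RandomOffsets.PickDistinct(rowCount, 4, new Random())`. Retrying on collisions is cheap when the count is small compared to the table, which is the case for 4 out of 1000 or 100k.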
You could also do something like this, IF you have an autoincrementing id field that is the primary key. The caveat is this isn't a fixed time function since you aren't sure how many loops may be required:
var idMax = db.TableName.Max(t=>t.id);
var rand = new Random();
var randomRecords = new List<TableName>();
while(randomRecords.Count()<4)
{
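var match = db.TableName.Find(rand.Next(1, idMax + 1)); // upper bound is exclusive; ids assumed to start at 1
if(match!=null && !randomRecords.Contains(match)) // skip gaps and rows already picked
randomRecords.Add(match);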
}
If you don't care about uniform randomness, this is the fastest method, fetching all four records in a single query. Be aware that it is very much not uniform: a row that follows a gap in the ids is picked more often than the others.
var idMax = db.TableName.Max(t=>t.id);
var rand = new Random();
var randIds = Enumerable.Range(0,4).Select(i => rand.Next(1,idMax+1)); // upper bound is exclusive
var query=db.TableName.Where(t=>false);
foreach(int id in randIds)
{
query=query.Concat(db.TableName.OrderBy(t=>t.id).Where(t=>t.id>=id).Take(1));
}
var randomRecords=query.ToList();
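To get a feel for how uneven this last approach is, here is a small in-memory sketch. It assumes the random number r is drawn uniformly from 1 to the largest id inclusive and that the query returns the first row with id >= r. `GapBias` and `SelectionProbabilities` are hypothetical names, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GapBias
{
    // For each existing id, the chance that "first id >= r" selects it,
    // where r is uniform over 1..maxId (inclusive).
    public static Dictionary<int, double> SelectionProbabilities(IEnumerable<int> ids)
    {
        var sorted = ids.Distinct().OrderBy(id => id).ToList();
        var result = new Dictionary<int, double>();
        if (sorted.Count == 0) return result;

        double maxId = sorted[sorted.Count - 1];
        var previous = 0; // ids are assumed to start at 1 or higher
        foreach (var id in sorted)
        {
            // every r in (previous, id] lands on this id
            result[id] = (id - previous) / maxId;
            previous = id;
        }
        return result;
    }
}
```

With ids 1, 2, 3 and 10, the first three rows each get 1 chance in 10, while the row with id 10 absorbs the whole gap and gets 7 in 10. If the ids have no gaps, every row is equally likely.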
I have a Frame<int, string> which consists of OHLCV data. I'm calculating technical analysis indicators for that Frame, and since the first few records aren't accurate because they are at the very beginning, I have to remove them. How do I do that?
public override Frame<int, string> PopulateIndicators(Frame<int, string> dataFrame)
{
var candles = dataFrame.Rows.Select(kvp => new Candle
{
Timestamp = kvp.Value.GetAs<DateTime>("Timestamp"),
Open = kvp.Value.GetAs<decimal>("Open"),
High = kvp.Value.GetAs<decimal>("High"),
Low = kvp.Value.GetAs<decimal>("Low"),
Close = kvp.Value.GetAs<decimal>("Close"),
Volume = kvp.Value.GetAs<decimal>("Volume")
}).Observations.Select(e => e.Value).ToList<IOhlcv>();
// TODO: Truncate/remove the first 50 rows
dataFrame.AddColumn("Rsi", candles.Rsi(14));
return dataFrame;
}
Most operations in Deedle are expressed in terms of row keys, rather than indices. The idea behind this is that, if you work with ordered data, you should have some ordered row keys.
This means that this is easier to do based on row keys. However, if you have an ordered row index, you can get a key at a certain location and then use it for filtering in Where. I would try something like:
var firstKey = dataFrame.GetRowKeyAt(50);
var after50 = dataFrame.Where(kvp => kvp.Key >= firstKey); // >= keeps the row at index 50, so exactly the first 50 rows are dropped
I'm reading results from a JSON file inside the local project. It returns more than 4000 results. I want to get only a random number of results (between 500 and 1000) from that set.
var finalResultz = finalResults("All","All","All");//step one
Here it returns more than 4000 results. Then I put them into a list like this:
List<Results> searchOne = new List<Results>();//step two
foreach(var itms in finalResultz)
{
searchOne.Add(new Results
{
resultDestination = returnRegionName(itms.areaDestination),
mainImageurl = itms.testingImageUrl
});
}
ViewBag.requested = searchOne;
But I want to get only the number of results I described. I want to reduce the count in step one or in step two. How can I do that?
If you want a random count of results, you can just .Take() a random number of the records. First, you'll need a Random:
var random = new Random();
If you want between 500-1000, just get a random value in that range:
var count = random.Next(500, 1001);
Then you can take those records from the collection:
var newList = oldList.Take(count).ToList();
(Of course, you may want to make sure it contains that many records first.)
Note that this will take the first N records from the collection. So in order to take random records from anywhere in the collection, you'd need to shuffle (randomize) the collection before taking the records. There are a number of ways you can do that. One approach which may not be the absolute fastest but is generally "fast enough" for simplicity is to just sort by a GUID. So something like this:
var newList = oldList.OrderBy(x => Guid.NewGuid()).Take(count).ToList();
Or maybe use the randomizer again:
var newList = oldList.OrderBy(x => random.Next()).Take(count).ToList();
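If sorting the whole collection just to take part of it feels wasteful, a partial Fisher-Yates shuffle is one alternative. This is only a sketch under the assumption that the collection is already in memory; `RandomSample` and `Pick` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class RandomSample
{
    // Returns `count` items chosen uniformly at random, without repeats.
    // The source list is copied, so its order is left untouched.
    public static List<T> Pick<T>(IReadOnlyList<T> source, int count, Random rng)
    {
        var pool = new List<T>(source);
        var n = Math.Min(Math.Max(count, 0), pool.Count);
        for (var i = 0; i < n; i++)
        {
            // swap a random element from the not-yet-chosen tail into position i
            var j = rng.Next(i, pool.Count);
            (pool[i], pool[j]) = (pool[j], pool[i]);
        }
        return pool.GetRange(0, n);
    }
}
```

Usage would look like `var newList = RandomSample.Pick(oldList, count, random);`. It only does as many swaps as items requested, and it caps the count at the size of the collection, which also covers the "make sure it contains that many records" caveat.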
You can use Random class and Take() method to extract N elements.
// create new instance of random class
Random rnd = new Random();
// get the number of elements to retrieve, from 500 to 1000 (the upper bound of Next is exclusive)
var elementsCount = rnd.Next(500, 1001);
// order source collection by random numbers and then take N elements:
var searchOne = finalResultz.OrderBy(x => rnd.Next()).Take(elementsCount);
Say I have List<string> FontStyle containing the following
"a0.png",
"b0.png",
"b1.png",
"b2.png",
"b3.png",
"c0.png",
"c1.png",
"d0.png",
"d1.png",
"d2.png"
I want to randomly select a string from the list whose first character matches a certain character. For example, if the character is c, the method will return either c0.png or c1.png at random.
How do I do this using LINQ?
This should do the trick:
var random = new Random();
var list = new List<string> {
"a0.png",
"b0.png",
"b1.png",
"b2.png",
"b3.png",
"c0.png",
"c1.png",
"d0.png",
"d1.png",
"d2.png"
};
var startingChar = "d";
var filteredList = list.Where(s => s.StartsWith(startingChar)).ToList();
Console.WriteLine(filteredList.Count);
int index = random.Next(filteredList.Count);
Console.WriteLine(index);
var font = filteredList[index];
Console.WriteLine(font);
One thing to keep in mind with this solution: the smaller the filtered list is, the more often you will see the same value come up again, simply because there are fewer candidates to choose from.
Random random = ...;
var itemsStartingWithC = input
.Where(x => x.StartsWith("c"))
.ToList();
var randomItemStartingWithC =
itemsStartingWithC.ElementAt(random.Next(0, itemsStartingWithC.Count()));
The call to ToList isn't strictly necessary, but results in faster code in this instance. Without it, Count() will fully enumerate and ElementAt will need to enumerate to the randomly selected index.
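If you would rather not materialize the filtered list at all, one option is to pick the element during a single pass (reservoir sampling with a reservoir of one). This is a sketch, not a recommendation over ToList for small lists; `SinglePass` and `PickOne` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class SinglePass
{
    // Picks one matching element uniformly at random while enumerating
    // `source` only once. Returns default(T) when nothing matches.
    public static T PickOne<T>(IEnumerable<T> source, Func<T, bool> predicate, Random rng)
    {
        var seen = 0;
        var chosen = default(T);
        foreach (var item in source)
        {
            if (!predicate(item)) continue;
            seen++;
            // replace the current choice with probability 1/seen
            if (rng.Next(seen) == 0) chosen = item;
        }
        return chosen;
    }
}
```

For the font list this would be called as `SinglePass.PickOne(input, x => x.StartsWith("c"), random)`. The first match is always taken, and each later match replaces it with probability 1/seen, which works out to an equal chance for every match.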
How might I take two random records from a list using Linq?
Random rnd = new Random();
var sequence = Enumerable.Range(1, 2).Select(n => lst[rnd.Next(0, lst.Count)]).ToList();
For Linq-to-Objects and EF4 it's pretty simple
db.Users.OrderBy(r => Guid.NewGuid()).Take(2)
For Linq-to-SQL You can check this article
http://michaelmerrell.com/2010/03/randomize-result-orders-in-t-sql-and-linq-to-sql/
Add function Random mapped to SQL function NEWID to DataContext.
partial class DataContext
{
[Function(Name = "NEWID", IsComposable = true)]
public Guid Random()
{
throw new NotImplementedException();
}
}
Usage
var qry = from row in DataBase.Customers
where row.IsActive
select row;
int count = qry.Count();
int index = new Random().Next(count);
Customer cust = qry.Skip(index).FirstOrDefault();
There is no direct way. You can try this, not pretty though.
int randomRecord = new Random().Next() % List.Count(); //To make sure its valid index in list
var qData = List.Skip(randomRecord).Take(1);
var qValue = qData.ToList().First();
This is what ended up working for me, it ensures no duplicates are returned:
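public List<T> GetRandomItems<T>(List<T> items, int count = 3)
{
    var list = new List<T>();
    var rnd = new Random();
    // never ask for more items than exist, otherwise the loop could not finish
    // (this also assumes the items themselves are distinct)
    var wanted = Math.Min(count, items.Count);
    while (list.Count < wanted)
    {
        var index = rnd.Next(0, items.Count);
        if (!list.Contains(items[index]))
            list.Add(items[index]);
    }
    return list;
}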
Why do you want to use Linq to get two random records?
Create a Random instance and get two random numbers whose values are less than the length of the list.
List has an indexer, so doing List[index] is not costly.
Keep it simple. Always prefer readability. If you make things complicated, the programmers who are going to maintain your code will have a hard time.
I am just curious to know why exactly you want to do this in Linq. That just seems like overhead to me.
Am I missing something?
I'm using Lucene.NET 2.3.1 with a MultiSearcher.
For testing purposes, I'm indexing a database with about 10000 rows. I have two indexes and I select randomly which to insert each row in. This works correctly, but as Lucene doesn't have update capabilities, I have to test if the row exists (I have an Id field) and then delete it.
I have a List<IndexModifier> and a List<IndexReader>, and each is created with this code:
IndexModifier mod = new IndexModifier(path, new StandardAnalyzer(), create);
m_Modifiers.Add(mod);
m_Readers.Add(IndexReader.Open(path));
m_Searchers.Add(new IndexSearcher(path));
Now the delete code:
Hits results = m_Searcher.Search(new TermQuery(t));
for (int i = 0; i < results.Length(); i++)
{
DocId = results.Id(i);
Index = m_Searcher.SubSearcher(DocId);
DocId = m_Searcher.SubDoc(DocId);
m_Modifiers[Index].DeleteDocument(DocId);
}
The search is correct and I'm getting results when the row exists. SubSearcher returns always 0 or 1, if Index is 0, SubDoc returns the same ID passed, and if it's 1, then it returns around the number passed minus 5000 times the number of times I have indexed the DB. It seems as if it wasn't deleting anything.
Each time I index the database, I optimize and close the indices, and Luke says it has no pending deletions.
What could be the problem?
I am not sure what's the end goal of this activity, so pardon if the following solution doesn't meet your requirements.
First, if you want to delete documents, you can use IndexReader, which you have already created. IndexModifier is not required.
Second, you need not find the subsearcher ID and the document ID within that subsearcher. You can use the top-level MultiReader instead. I would write the equivalent Java code as follows.
IndexReader[] readers = new IndexReader[size];
// Initialize readers
MultiReader multiReader = new MultiReader(readers);
IndexSearcher searcher = new IndexSearcher(multiReader);
Hits results = searcher.search(new TermQuery(t));
for (int i = 0; i < results.length(); i++) {
int docID = results.id(i);
multiReader.deleteDocument(docID);
}
multiReader.commit(); // Check if this throws an exception.
multiReader.close();
searcher.close();