RavenDB - Sift through documents and get count using index/query - c#

I have City documents and Site documents. A city can have multiple sites, and each Site document carries its city's code. There are about 100 City documents and 10,000 Site documents in RavenDB.
City Document:
{
"CityCode": "NY",
"CityName": "New York"
}
Site Document:
{
"SiteName": "MOMA",
"CityCode": "NY"
}
Objective is to get a list of all cities and the number of sites for each like...
City Sites
NY 12
CH 33
BO 56
and so on....
I am doing this.
int countSites = session.Query<Site>()
.Count();
var SiteCityList = session.Query<Site>()
.Take(countSites)
.ToList()
.GroupBy(x => x.CityCode)
.OrderBy(x => x.Count())
.ToDictionary(x => x.Key, x => x.Count());
This does not return all the data in RavenDB. I only ever get 11 rows of counts, and even those counts are not accurate. What I want is a list of all 100 cities with the number of sites for each city (in the hundreds), as shown above. Thanks for the help.

Use a map/reduce index like this
public class CityCodeCount : AbstractIndexCreationTask<Site, CityCodeCount.ReduceResult>
{
public class ReduceResult
{
public string CityCode { get; set; }
public int Count { get; set; }
}
public CityCodeCount()
{
Map = sites => from site in sites
select new
{
site.CityCode,
Count = 1
};
Reduce = results => from result in results
group result by result.CityCode
into g
select new
{
CityCode = g.Key,
Count = g.Sum(x => x.Count)
};
}
}
You can then query it easily.
var results = documentSession.Query<CityCodeCount.ReduceResult, CityCodeCount>()
.ToList();
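To see what the index computes, here is a minimal in-memory sketch of the same map and reduce steps using plain LINQ. It does not touch RavenDB at all; the sample sites and the tuple shape are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data standing in for the Site documents.
var sites = new List<(string SiteName, string CityCode)>
{
    ("MOMA", "NY"),
    ("Central Park", "NY"),
    ("Millennium Park", "CH"),
};

// Map step: emit one (CityCode, Count = 1) entry per site.
var mapped = sites.Select(site => (site.CityCode, Count: 1));

// Reduce step: group the entries by CityCode and sum their counts.
var counts = mapped
    .GroupBy(entry => entry.CityCode)
    .ToDictionary(g => g.Key, g => g.Sum(entry => entry.Count));

foreach (var pair in counts)
    Console.WriteLine($"{pair.Key} {pair.Value}");
```

The difference with the real index is that RavenDB maintains the reduced results on the server, so the client never has to load all 10,000 Site documents.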

If you want an alternative way, you can take a look at Faceted Search
It gives you slightly more flexibility than Map/Reduce, but will only work when you want a Count of items (which you do in your case).

You've got 2 options:
Create a Map/Reduce index and query against it
Use Faceted (Aggregated) Search to query
To make a choice between the two, take into account that
A dedicated index is faster to query, but it has some footprint on storage and on performance when changed records are re-indexed (which can matter if you need immediate consistency and wait for non-stale indexes).
Facets are simpler to use when the fields are already covered by an existing index. I assume you understand the importance of static indexes and already have some.
While using a Map/Reduce index is straightforward and already covered by Daniel's answer, I provide below an example of using Facets:
var query = DbSession.Query<Site_IndexModel, Site_ForList>();
List<FacetValue> facetResults = (await query
.AggregateBy(builder => builder.ByField(s => s.CityCode ))
.ExecuteAsync()
).Single().Value.Values;
// Go through results, where each facetResult is { Range, Count } structure
return from result in facetResults
select new { CityCode = result.Range, Count = result.Count };
where
Site_ForList is an existing index for Site collection, which includes CityCode field
Site_IndexModel is the stored structure of Site_ForList index


How to order by more than one parameter with the same weight?

I'm trying to make a program that sorts objects by more than one parameter.
I need all the parameters to carry the same weight in the ordering. What functions do I need to use to get that result?
I tried OrderBy() followed by ThenBy(), but that orders by the first parameter first, so the parameters are not weighted equally.
values = File.ReadAllLines(filepath)
.Skip(1)
.Select(v => Fund.FromCsv(v))
.OrderByDescending(x => x.sharp)
.ThenBy(x=>x.yearlychange)
.ToList();
Take the stock market as an example: I want to order the stocks by their yield over the last year, but also by standard deviation. That way I get the stocks with the best combination of last year's yield and standard deviation. The top result won't have the best yield of all; it will be the best on the combined measure.
As you have already been told, this is not really a programming problem; it is more of an algorithm/domain problem. Nevertheless, once you have the algorithm, you can of course implement it like this (based on the example you gave in the comment):
void Main()
{
var funds = new List<Fund>();
funds.Add(new Fund() { Age = 18, Money = 100000 });
funds.Add(new Fund() { Age = 20, Money = 101000 });
//Here is normally your code with loading it from CSV
//File.ReadAllLines(filepath)
// .Skip(1)
// .Select(v => Fund.FromCsv(v))
var value = funds.OrderBy(t => t.SortingFactor);
}
public class Fund
{
public int Age { get; set; }
public decimal Money { get; set; }
public decimal SortingFactor
{
get
{
//Here is your domain algorithm you must sort out first (your example data would be)
return Money / Age;
}
}
}
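I'm not sure I fully understand your aim, but if Fund is not code you can modify, another option is to sort on a composite key. Note that an anonymous object (new { x.sharp, x.yearlychange }) does not implement IComparable, so OrderByDescending throws at runtime when the elements are compared; a value tuple does compare, element by element from left to right, e.g.
values = File.ReadAllLines(filepath)
.Skip(1)
.Select(v => Fund.FromCsv(v))
.OrderByDescending(x => (x.sharp, x.yearlychange))
.ToList();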
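One possible way to give both criteria equal weight is to rank the items by each criterion separately and then order by the sum of the two ranks. This is only a sketch of one domain rule among many, not something the answers above prescribe. The fund names, the figures, and the assumption that higher is better for both values are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical funds; higher is assumed to be better for both measures.
var funds = new List<(string Name, double Sharpe, double YearlyChange)>
{
    ("A", 1.2, 0.02),
    ("B", 0.8, 0.20),
    ("C", 1.1, 0.18),
    ("D", 0.9, 0.03),
};

// Position of each fund when sorted by a single criterion (0 = best).
var rankBySharpe = funds
    .OrderByDescending(f => f.Sharpe)
    .Select((f, i) => (f.Name, Rank: i))
    .ToDictionary(x => x.Name, x => x.Rank);

var rankByChange = funds
    .OrderByDescending(f => f.YearlyChange)
    .Select((f, i) => (f.Name, Rank: i))
    .ToDictionary(x => x.Name, x => x.Rank);

// Equal weight: order by the sum of the two positions (lower is better).
var combined = funds
    .OrderBy(f => rankBySharpe[f.Name] + rankByChange[f.Name])
    .Select(f => f.Name)
    .ToList();

Console.WriteLine(string.Join(", ", combined));
```

Because OrderBy is a stable sort, funds with the same rank sum keep their original relative order. Whether a rank sum, a normalized score, or something else is appropriate is a domain decision.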

LINQ Expression to Select objects by string property with maximum count of objects in its queue property without duplicates

I have a queue of Record objects as follows:
public class Record
{
public string TypeDesc { get; set; }
public Queue<Total> Totals { get; set; }
etc.....
}
I'm having trouble writing a LINQ expression to extract a subset that has only one of each TypeDesc but within each TypeDesc the one with the most Total objects in the Totals queue.
I'm not sure it matters but there is only one TypeDesc that has Total objects in the Totals queue property. All others the queue is empty. There are about 8 unique TypeDesc values.
Here's my attempt but the totals property is not available on "s".
var records = Records.Select(c => c.TypeDesc).Where(s => s.Totals.Count).Max().Distinct();
1. Group the records by their TypeDesc property.
2. For each group, select the one with the highest Totals.Count.
records.GroupBy(r => r.TypeDesc)
.Select(
g => g.Aggregate((acc, current) => current.Totals.Count > acc.Totals.Count
? current
: acc));
For complex queries like these, it's best to break the logic down a bit, to make the code more readable:
Func<IEnumerable<Record>, Record> mostTotals =
group => group.Aggregate(
(acc, current) => current.Totals.Count > acc.Totals.Count
? current
: acc);
// Use a new name: a local variable cannot be used in its own initializer
var result = records.GroupBy(r => r.TypeDesc)
.Select(mostTotals);
Step 2 is achieved by using Aggregate, which iterates through the records in that group, and uses an "accumulator" to keep track of the record with the highest Totals.Count at each iteration.
To simplify, the aggregation function is equivalent to this:
//for each group
//(Aggregate without a seed starts with the first element as the accumulator)
Record acc = null;
foreach(var current in group)
acc = acc == null || current.Totals.Count > acc.Totals.Count
? current
: acc;
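The grouping and aggregation above can be tried on a small data set. This is a minimal sketch that uses value tuples in place of the Record class and a Queue<int> in place of Queue<Total>; the sample values are made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical records: a tuple stands in for the Record class.
var records = new List<(string TypeDesc, Queue<int> Totals)>
{
    ("A", new Queue<int>()),
    ("A", new Queue<int>(new[] { 1, 2, 3 })),
    ("B", new Queue<int>()),
    ("A", new Queue<int>(new[] { 1 })),
};

// One record per TypeDesc: the one whose Totals queue holds the most items.
var best = records
    .GroupBy(r => r.TypeDesc)
    .Select(g => g.Aggregate(
        (acc, current) => current.Totals.Count > acc.Totals.Count ? current : acc))
    .ToList();

foreach (var r in best)
    Console.WriteLine($"{r.TypeDesc}: {r.Totals.Count} totals");
```

When several records in a group tie, the strict `>` comparison keeps the first one encountered.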

How to grab the index from a list using LINQ

I have a list I am populating with total sales from a team.
lstTeamSales.OrderBy(x => x.TotalSales);
This list has an int userID and a decimal totalSales.
I order it by totalSales. How can I at that point figure out the rank for the person logged in?
I know I can compare the person who is logged in by his userID to that of the userID in the list. If he is #3 in sales I need to return an int of his rank which would be Rank 3.
The question can be rephrased to "How do I get index of element in IEnumerable". Here is the answer: How to get index using LINQ?
Here is how to use it:
int rank = lstTeamSales.OrderBy(x => x.TotalSales).FindIndex(x => x.userID == currentUserID);
And this will be slightly more efficient than Select based approaches.
Update
It appears .FindIndex is not supported for LINQ. Any idea how to implement that functionality?
I may have figured it out; testing it now. I just added .ToList() after the OrderBy().
No-no-no-no! It kills the whole idea :( The idea is to add extension method FindIndex to IEnumerable. And then use it. See example:
static class FindIndexEnumerableExtension
{
public static int FindIndex<T>(this IEnumerable<T> items, Func<T, bool> predicate)
{
if (items == null) throw new ArgumentNullException("items");
if (predicate == null) throw new ArgumentNullException("predicate");
int retVal = 0;
foreach (var item in items)
{
if (predicate(item)) return retVal;
retVal++;
}
return -1;
}
}
class YourClass
{
void YourMethod()
{
int rank = lstTeamSales.OrderBy(x => x.TotalSales).FindIndex(x => x.UserID == currentUserID);
}
}
After you define the class FindIndexEnumerableExtension with the FindIndex extension method, you can use this method anywhere in your code. All you need to do is add a using directive for the namespace where FindIndexEnumerableExtension is defined. This is, basically, how LINQ works.
If you don't want to go with this solution then, at least, convert lstTeamSales to List before sorting it. And sort it using List<>.Sort() method.
You can use the Select extension that takes a Func<TSource, Int32, TResult> (or the Expression equivalent) like so:
var theUserId = /* the userId */;
lstTeamSales.OrderBy(x => x.TotalSales).Select((x, i) => new
{
x.UserId,
x.TotalSales,
Rank = i + 1
}).FirstOrDefault(x => x.UserId == theUserId);
This returns an object with the user ID, the total sales, and the rank for the given user. It returns null if no entry in the collection has UserId == theUserId.
The index (i in the example) is 0-based. Adjust as needed.
Given a list of total sales, lstTeamSales and a number representing the sales you wish to find the rank for, userSales, what you'll need is the number of total sales in lstTeamSales that exceed userSales. If it's rank you want, then you'd probably want to exclude ties in the rank (i.e. if the top two sales numbers are both 1000, then they'd both be ranked 1)
You can do this simply by projecting only the sales numbers with Select, remove ties with a Distinct call, then use Count:
lstTeamSales.Select(x => x.TotalSales).Distinct().Count(x => x > userSales)
That would give you the total number of sales that are higher than the current user. From there, the rank of the current user is one above that number:
var rank = 1 + lstTeamSales.Select(x => x.TotalSales).Distinct().Count(x => x > userSales);
The Select((item, index) => ...) form allows for this (as shown by Simon), however as DMac mentions you probably want to consider duplicates. To incorporate this in a Select, you could use GroupBy:
lstTeamSales
.OrderByDescending(x => x.TotalSales).GroupBy(x => x.TotalSales)
.Select((group, i) => new {
Rank = i + 1,
Users = group.Select(x => x.UserId)
})
This would provide you with a list of ranks along with the lists of users who have that rank. Or you could flatten this with SelectMany, to get each user with its rank:
lstTeamSales
.OrderByDescending(x => x.TotalSales).GroupBy(x => x.TotalSales)
.SelectMany((group, i) => group.Select(x => new {
Rank = i + 1,
User = x.UserId
}))
You could filter this sequence to find users, but if you only want to look up a specific user's rank, then DMac's solution is the most direct. The above would be more useful for example if you wanted to list the top 5 sellers (see Take).
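As a worked example of ranking with ties, the sketch below applies both the Distinct/Count approach and the GroupBy/SelectMany approach to a small list. The sales figures, the tuple shape, and the variable names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical team sales; users 1 and 2 are tied for first place.
var sales = new List<(int UserId, decimal TotalSales)>
{
    (1, 1000m),
    (2, 1000m),
    (3, 750m),
    (4, 500m),
};

int currentUserId = 3;
decimal userSales = sales.First(s => s.UserId == currentUserId).TotalSales;

// Rank of one user: 1 + number of distinct totals that exceed the user's total.
int rank = 1 + sales
    .Select(s => s.TotalSales)
    .Distinct()
    .Count(total => total > userSales);

// Rank of every user: tied totals fall into the same group and share a rank.
var ranked = sales
    .OrderByDescending(s => s.TotalSales)
    .GroupBy(s => s.TotalSales)
    .SelectMany((g, i) => g.Select(s => (s.UserId, Rank: i + 1)))
    .ToList();

Console.WriteLine($"User {currentUserId} has rank {rank}");
foreach (var r in ranked)
    Console.WriteLine($"User {r.UserId}: rank {r.Rank}");
```

Both approaches produce a dense ranking here: the two tied users share rank 1 and the next user gets rank 2, not 3.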

Sorting a list of objects with OrderByDescending

I have an object (KS), which holds ID and Title (which has a number as part of the Title).
All I'm trying to do is sort it into descending order. The object has:
ID Title
1 1 Outlook VPN
2 2 Outlook Access
3 4 Access VBA
4 3 Excel Automation
So when order by Title, it should read:
ID Title
3 4 Access VBA
4 3 Excel Automation
2 2 Outlook Access
1 1 Outlook VPN
The code I'm using to sort it is:
IEnumerable<KS> query = results.OrderByDescending(x => x.Title);
However, query still has the objects in the original order!
Is there something to do with having numbers at the start of Title that I'm missing?
EDIT
I've added the code from the controller for clarity:
[HttpPost]
// [ValidateAntiForgeryToken]
// id is a string of words eg: "outlook access vpn"
// I split the words and want to check the Title to see how many words appear
// Then sort by the most words found
public JsonResult Lookup(string id)
{
List<string> listOfSearch = id.Split(' ').ToList();
var results = db.KS.Where(x => listOfSearch.Any(item => x.Title.Contains(item)));
// search each result, and count how many of the search words in id are found
// then add the count to the start of Title
foreach (KS result in results)
{
result.KSId = 0;
foreach (string li in listOfSearch)
{
if (result.Title.ToLower().Contains(li.ToLower()))
{
result.KSId += 1;
}
}
result.Title = result.KSId.ToString() + " " + result.Title;
}
// sort the results based on the Title - which has number of words at the start
IEnumerable<KS> query = results.OrderByDescending(x => x.Title).ToList();
return Json(query, JsonRequestBehavior.AllowGet);
}
Here is a screenshot after query has been populated showing Titles in the order: 1, 2, 1, 1:
Model for the object if it helps is:
public class KS
{
public int KSId { get; set; }
public string KSSol { get; set; }
public string Title { get; set; }
public string Fix { get; set; }
}
As I said in a comment, put a .ToList() where you declare your results variable. That is:
var results = db.KS.Where(x => listOfSearch.Any(item => x.Title.Contains(item)))
.ToList();
If you don't do that, the foreach loop will modify objects that might not be the same as the objects you sort later, because the database query is run again each time you enumerate your IQueryable<>.
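The effect can be reproduced without a database. In this minimal sketch, an in-memory Select that creates a new object on every enumeration stands in for the database query, and a StringBuilder stands in for the mutable KS entity; both substitutions are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

var titles = new[] { "Outlook VPN", "Access VBA" };

// Deferred: every enumeration runs the Select again and builds fresh objects.
IEnumerable<StringBuilder> deferred = titles.Select(t => new StringBuilder(t));
foreach (var title in deferred)
    title.Insert(0, "1 ");              // modifies objects that are then discarded
string afterDeferred = deferred.First().ToString();

// Materialized: ToList() runs the query once and keeps the resulting objects.
List<StringBuilder> materialized = titles.Select(t => new StringBuilder(t)).ToList();
foreach (var title in materialized)
    title.Insert(0, "1 ");              // modifies the objects held by the list
string afterMaterialized = materialized.First().ToString();

Console.WriteLine(afterDeferred);       // the prefix is missing
Console.WriteLine(afterMaterialized);   // the prefix is present
```

Whether a real data provider hands back the same entity instances on a second enumeration depends on the provider, so calling ToList() once and working on that list removes the uncertainty.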
You can always just ignore the strange behavior and go the safe way:
List<KS> query = results.ToList();
query.Sort((a, b) => a.Whatever.CompareTo(b.Whatever));
return Json(query, blah);
I simply did this and it worked for me:
var sortedOrder = Query.OrderBy(b => b.Title.Substring(b.Title.IndexOf(" ")));
All I have done is take the Substring of the Title starting at the index of the first space when ordering the objects in the sequence. That way, OrderBy looks at the text of the title rather than the number at the beginning.
Old question, but maybe this will help someone using C#. I used the following expressions to sort a list of objects by their Quantity property in ascending or descending order. They can be modified to compare text, which is what the original question was about.
Ascending Order:
locationMaterials.Sort((x, y) => x.Quantity.CompareTo(y.Quantity));
Descending Order:
locationMaterials.Sort((x, y) => y.Quantity.CompareTo(x.Quantity));
You are missing .ToList()
IEnumerable<KS> query = results.OrderByDescending(x => x.Title).ToList();
results.OrderByDescending(x => x.Title) is a query, and it has no data.
ToList() forces the query to be executed.
[EDIT]
My answer assumes that your results has actually not been materialized, and that this is the source of your problem.

Storing the list items into a variable after filtering by using linq

I have a list of items and want to filter it by the distinct values of one column (i.e. Level). After filtering, I need to get the count and store it in an int variable.
Can anyone please help me?
**List**
public class Totalitems
{
public string ItemName;
public string ItemId;
public string ItemGroup;
public int Level;
}
Id= "123asd";
List<Totalitems> l_items = this.getslist(Id);
/*How to filter based on distinct level */
/* var filteredItems = (
from p in l_items
select p.Level)
.Distinct(); */
**Finally:**
//Stores the elements contained in the List into a variable
int totalItemsafterFiltering = l_FilteredItems.Count;
You want to use GroupBy for this task:
var numberOfDifferentLevels = l_items.GroupBy(x => x.Level).Count();
GroupBy is especially useful, if you want to do something with the actual elements in the group. For example, you might want to know how many items per level there are:
var itemsPerLevel = l_items.GroupBy(x => x.Level)
.Select(x => new { Level = x.Key,
NumberOfItems = x.Count() });
Another approach when you really only care about the number of distinct levels, is the following:
var numberOfDifferentLevels = l_items.Select(x => x.Level).Distinct().Count();
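Here is a small self-contained sketch of both approaches on sample data. The items are hypothetical and a tuple stands in for the Totalitems class.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical items: only the fields used below are included.
var items = new List<(string ItemName, int Level)>
{
    ("a", 1), ("b", 1), ("c", 2), ("d", 3), ("e", 3), ("f", 3),
};

// Number of distinct levels, computed two ways.
int viaGroupBy = items.GroupBy(x => x.Level).Count();
int viaDistinct = items.Select(x => x.Level).Distinct().Count();

// Number of items per level.
var itemsPerLevel = items
    .GroupBy(x => x.Level)
    .ToDictionary(g => g.Key, g => g.Count());

Console.WriteLine($"{viaGroupBy} distinct levels");
foreach (var pair in itemsPerLevel)
    Console.WriteLine($"Level {pair.Key}: {pair.Value} items");
```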
