How to read big amount of data in chunks? - c#

I have a project where I need to use an external WCF service that has a method that looks like this:
Items catalogItems = externalClient.getCatalogItems(auth, idCatalog, 1, 100);
After I call the getCatalogItems service method, I need to transform the returned array of items into a raw SOAP message like this:
Message response = Message.CreateMessage(MessageVersion.Default, ReplyAction_GetCatalogItems, catalogItems);
The last two parameters of the getCatalogItems service method determine the chunk of data that is returned by each call. For example, if there are 1050 records, they should be fetched in 11 calls: 10 chunks of 100 and one chunk of 50.
I understand that I should keep reading for as long as data is available. I have 2 questions:
How do I know where to continue reading? For example, if I have read the first portion of 100 records, how do I know the current position of the reader?
How do I know when I have reached the end?

One approach would be to make it the responsibility of the client to remember the state (i.e. the page number the client is currently on).
So you can change your method call to take page number and items per page parameters:
Items catalogItems = externalClient.getCatalogItems(auth, idCatalog, pageNumber, itemsPerPage);
The service can then select a set of items based on the pageNumber and itemsPerPage values, and it does not need to hold any state. (Note: this translates easily to a select query if you are using a database as the repository for the items.)
You can also alter the return value to include the total number of items, which tells the client when it has reached the end:
Example:
CatalogResponse response = externalClient.getCatalogItems(auth, idCatalog, pageNumber, itemsPerPage);
public class CatalogResponse
{
// total number of items in the catalog, across all pages
private int _totalItems;
// items on the requested page
private Items _items;
}
This also gives the client the flexibility to decide how many items to receive in each call, and lets the end user select a page size.
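The client-side loop described above could look roughly like the following. This is a minimal sketch, not the real WCF client: FakeCatalogClient, CatalogReader, the public TotalItems and Items properties, the string item type, and the 1-based pageNumber are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical response shape: one page of items plus the total item count.
public sealed class CatalogResponse
{
    public int TotalItems { get; set; }
    public List<string> Items { get; set; } = new List<string>();
}

// Stand-in for the external service; the real client is not shown in the question.
public sealed class FakeCatalogClient
{
    private readonly List<string> _all;

    public FakeCatalogClient(int count) =>
        _all = Enumerable.Range(1, count).Select(i => $"item-{i}").ToList();

    // pageNumber is assumed to be 1-based.
    public CatalogResponse GetCatalogItems(int pageNumber, int itemsPerPage) =>
        new CatalogResponse
        {
            TotalItems = _all.Count,
            Items = _all.Skip((pageNumber - 1) * itemsPerPage).Take(itemsPerPage).ToList()
        };
}

public static class CatalogReader
{
    // Requests page after page until every item reported by TotalItems has been read.
    public static List<string> ReadAll(FakeCatalogClient client, int itemsPerPage, out int calls)
    {
        var result = new List<string>();
        calls = 0;
        int pageNumber = 1; // the client, not the service, remembers the position
        while (true)
        {
            CatalogResponse page = client.GetCatalogItems(pageNumber, itemsPerPage);
            calls++;
            result.AddRange(page.Items);
            // Stop on an empty page (defensive) or once the reported total is reached.
            if (page.Items.Count == 0 || result.Count >= page.TotalItems) break;
            pageNumber++;
        }
        return result;
    }

    public static void Main()
    {
        List<string> items = ReadAll(new FakeCatalogClient(1050), 100, out int calls);
        Console.WriteLine($"{items.Count} items in {calls} calls"); // 1050 items in 11 calls
    }
}
```

If the service cannot be changed to return a total, the same loop can stop as soon as a page comes back with fewer items than itemsPerPage.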

Related

How many requests TableClient.Query<T> will make, if AsPages() is not used?

I have a question about Pageable<T> in C#. I have a table in Azure Table storage named Domains. I am using the Azure.Data.Tables NuGet package, and to query all the domains I am using this:
var domains = _localDomainTableClient
.Query<Domain>()
.AsPages()
.SelectMany(d => d.Values);
But there is something I don't understand. What if I use Query<T> without the AsPages method?
IEnumerable<Domain> domainsPAges = tableClient.Query<Domain>();
I know that AsPages() returns a collection of pages. For example, if I have 10000 items in the table, Query<Domain>().AsPages() should make 10 requests to the table and return 10 pages with 1000 items in each page (unless I changed the default value), but I don't understand what exactly happens if I don't use AsPages().
Example:
IEnumerable<Domain> domainsPAges = tableClient.Query<Domain>();
Query<Domain>() returns Pageable<T>, but does it make 10 requests to the table again, does it take elements until the memory overflows (4 MB by default), or does it take all the elements at once?
I checked the documentation, but I couldn't find what I needed:
A collection of values that may take multiple service requests to iterate over.
A collection of values retrieved in pages
What does that even mean?
Thanks for help.
How many requests Query will make, if AsPages() is not used?
TableClient.Query<T> is part of the Azure.Data.Tables SDK, which can target both Azure Table storage and the Azure Cosmos DB Table API.
Calling Query<T> does not retrieve all matching results in a single request. It returns a Pageable<T>, and the service requests are made lazily, one page at a time, as you enumerate it.
The exact number of requests depends on the size of the result set and the page size. The service can also return a partial page together with a continuation token, for example when the query crosses a partition boundary, so there may be more requests than the item count alone suggests.
If the AsPages() method is not used, the results are exposed as a single flattened enumerable of items, but the pages are still fetched behind the scenes as the enumeration advances. If the AsPages() method is used, the same results are exposed as a sequence of pages, each containing a subset of the query results. Enumerating everything costs the same number of requests either way.
What if I use Query<T> without the AsPages method?
If you use Query<T> without the AsPages method, you do not receive all the results of the query in a single batch.
The Query<T> method returns a Pageable<T>, which implements IEnumerable<T>. Enumerating it with foreach streams the items and fetches the next page only when the current one is exhausted, so roughly one page is held at a time. The entire result set ends up in memory only if you materialize it yourself, for example with ToList() or ToArray(), which can mean high memory usage for a large table.
AsPages() is useful when you need the page boundaries themselves, for example to process results in batches or to keep the continuation token of a page and resume later. For the synchronous Query<T> it returns an IEnumerable<Page<T>>, where each Page<T> holds the values of one service response. The asynchronous QueryAsync<T> returns an AsyncPageable<T>, whose AsPages() returns an IAsyncEnumerable<Page<T>>.
Query() returns Pageable<T>, but does it make 10 requests to the table again, does it take elements until the memory overflows (4 MB by default), or does it take all the elements at once?
It does not take all the elements at once. The requests are driven by how far you enumerate the Pageable<T>.
When you enumerate a pageable result, the service executes the query and returns only one page of results together with a continuation token. The remaining results are retrieved by subsequent requests as the enumeration reaches the following pages.
So if you call Query<Domain>() with a specific page size (the maxPerPage parameter), each request retrieves at most that many elements, not all the elements in the table.
The number of requests made to the table depends on the number of pages that need to be retrieved to return all the results for the query. In the 10000-item example with 1000 items per page, that is 10 requests, with or without AsPages().
The memory used depends on the size and number of the entities in each page and on how many of them you keep. If you process the items as they stream in, roughly one page is held at a time. If you collect everything into a list, memory grows with the whole result set.
For more information, refer to Query table entities and this SO thread.
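The lazy, page-by-page behaviour described above can be illustrated without the Azure SDK. The following is a simulation only: FakeTable, its Requests counter, and its Query and AsPages methods are hypothetical names that mimic the pattern with a C# iterator, they are not the Azure.Data.Tables API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simulation only: NOT the Azure SDK, just the same lazy paging pattern.
public sealed class FakeTable
{
    private readonly int _rowCount;

    // Counts the simulated round trips to the service.
    public int Requests { get; private set; }

    public FakeTable(int rowCount) => _rowCount = rowCount;

    // Each yielded list stands for one service response of at most pageSize rows.
    public IEnumerable<List<int>> AsPages(int pageSize = 1000)
    {
        for (int start = 0; start < _rowCount; start += pageSize)
        {
            Requests++; // one simulated request per page, made only when the page is needed
            yield return Enumerable.Range(start, Math.Min(pageSize, _rowCount - start)).ToList();
        }
    }

    // Flattened view: the pages are still fetched one by one as the consumer advances.
    public IEnumerable<int> Query(int pageSize = 1000) => AsPages(pageSize).SelectMany(p => p);
}

public static class LazyPagingDemo
{
    public static void Main()
    {
        var partial = new FakeTable(10_000);
        List<int> firstFive = partial.Query().Take(5).ToList();
        Console.WriteLine(partial.Requests); // 1: only the first page was requested

        var full = new FakeTable(10_000);
        int seen = 0;
        foreach (int row in full.Query()) seen++;
        Console.WriteLine($"{seen} rows, {full.Requests} requests"); // 10000 rows, 10 requests
    }
}
```

In this sketch, reading five items costs one request and reading everything costs ten, which is the same count that iterating AsPages() directly would produce.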

How to cache stuff only once in a timespan from a device in .NET core?

I am trying to implement simple caching in the backend (.NET Core) of my website. What I want to do is cache the number of times a GET request is made, but count it only once per device within some timespan.
What I mean is: if some (user) device sends a GET request to view data A, I want to count it as A requested once, but if the same person reloads the page within (for example) 2 minutes, I don't want to count it as a new request. After 2 minutes, if the same device requests data A again, I do want to count it.
What I am currently using is a simple MemoryCache. I have unique keys for each piece of data, and I count the number of times the data is requested as follows:
if (_cache.TryGetValue(cacheKey, out numberOfTimes))
{
_cache.Set(cacheKey, numberOfTimes + 1, cacheOptions);
}
else
{
_cache.Set(cacheKey, numberOfTimes, cacheOptions);
}
But I can't work out how to mark a request from the same (user) device within a certain timespan so that it is not counted. I would appreciate any help!
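One possible way to express the rule described above is to remember, per device and data key, when a request was last counted, and to skip counting while that time is still inside the window. The sketch below is only an illustration under assumptions: ViewCounter and RegisterView are hypothetical names, the device is assumed to be identified by some string (for example a cookie value), and plain dictionaries stand in for the cache.

```csharp
using System;
using System.Collections.Generic;

public sealed class ViewCounter
{
    private readonly TimeSpan _window;
    private readonly Func<DateTimeOffset> _clock; // injected so the window can be tested
    private readonly object _gate = new object();

    // When each (device, data) pair was last counted.
    private readonly Dictionary<(string Device, string Data), DateTimeOffset> _lastCounted =
        new Dictionary<(string Device, string Data), DateTimeOffset>();

    // Number of counted requests per data key.
    private readonly Dictionary<string, int> _counts = new Dictionary<string, int>();

    public ViewCounter(TimeSpan window, Func<DateTimeOffset> clock)
    {
        _window = window;
        _clock = clock;
    }

    // Returns the current count for dataKey after taking this request into account.
    public int RegisterView(string deviceId, string dataKey)
    {
        DateTimeOffset now = _clock();
        lock (_gate)
        {
            _counts.TryGetValue(dataKey, out int current);
            var key = (deviceId, dataKey);

            // Same device asked for the same data inside the window: do not count again.
            if (_lastCounted.TryGetValue(key, out DateTimeOffset last) && now - last < _window)
                return current;

            _lastCounted[key] = now;
            _counts[dataKey] = current + 1;
            return current + 1;
        }
    }
}

public static class ViewCounterDemo
{
    public static void Main()
    {
        DateTimeOffset now = DateTimeOffset.UnixEpoch;
        var counter = new ViewCounter(TimeSpan.FromMinutes(2), () => now);

        Console.WriteLine(counter.RegisterView("device-1", "A")); // 1
        now = now.AddMinutes(1);
        Console.WriteLine(counter.RegisterView("device-1", "A")); // 1 (reload inside the window)
        now = now.AddMinutes(2);
        Console.WriteLine(counter.RegisterView("device-1", "A")); // 2 (window has passed)
    }
}
```

This sketch never evicts old entries. With MemoryCache, the per-device marker could instead be stored as a cache entry whose AbsoluteExpirationRelativeToNow equals the window, so it disappears on its own and the request is counted only when the marker is absent.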

Server-Side Paging MVC 6.0

I have an MVC project with a WCF service.
When I display a list of data, I do not want to load everything from the database/service and do client-side paging. I want server-side paging instead. If I have 100 records and my page size is 10, then when a user clicks on page 1, only the first 10 records should be retrieved from the database, and if a user clicks on page 3, only the corresponding ten records should be retrieved.
I am not using Angular or Bootstrap.
Can someone guide me on how to do it?
public ActionResult Index(int pageNo = 1)
{
..
..
..
MyViewModel[] myViewModelListArray = MyService.GetData();
// When I create this PagedList, BLL.GetData has to retrieve all the records to show more than a single page number.
// But if BLL.GetData() is changed to retrieve a subset, then it only shows a single page number.
// What I want is to show the correct number of pages (if there are 50 records and pageSize is 10, then show
// pages 1,2,3,4,5) and only retrieve 10 records at a time.
PagedList<MyViewModel> pageList = new PagedList<MyViewModel>(myViewModelListArray, pageNo, pageSizeListing);
..
..
..
return View(pageList);
}
The best approach is to use the LINQ to Entities operators Skip and Take.
For example, to fetch a single page:
int items_per_page = 10;
MyViewModel[] myViewModelListArray = MyService.GetData().OrderBy(p => p.ID).Skip((pageNo - 1) * items_per_page).Take(items_per_page).ToArray();
NOTE: The data must be ordered so that the pages are consistent between requests (I ordered by ID here, which is an arbitrary choice). Also, some databases require an 'order by' clause in order to apply 'limit' or 'top' (which is how Take/Skip are implemented).
I wrote it that way because I don't know how you are retrieving the data.
But instead of retrieving the full list with GetData and then filtering it, it is better to include the pagination in the query inside GetData (so you don't retrieve unnecessary data).
Add page size and page number parameters to your service method and make the result an object that returns a TotalCount and a list of Items (Items being the items on the current page). Then you can use those values to create the PagedList.
Inside your business logic code you will run two queries: one for the count of items and one for the items on the page.
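The result object with TotalCount and Items could be sketched as follows. PagedResult<T>, Paging.GetPage and the in-memory data are illustrative assumptions, not part of the original project; with an Entity Framework IQueryable<T> the same Count, OrderBy, Skip and Take calls would be translated into the two database queries mentioned above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical return type for the service method: one page plus the total count.
public sealed class PagedResult<T>
{
    public int TotalCount { get; set; }
    public List<T> Items { get; set; } = new List<T>();
}

public static class Paging
{
    // pageNo is 1-based; orderBy keeps the pages stable between requests.
    public static PagedResult<T> GetPage<T, TKey>(
        IQueryable<T> source, Expression<Func<T, TKey>> orderBy, int pageNo, int pageSize)
    {
        if (pageNo < 1) throw new ArgumentOutOfRangeException(nameof(pageNo));
        if (pageSize < 1) throw new ArgumentOutOfRangeException(nameof(pageSize));

        return new PagedResult<T>
        {
            // Query 1: how many items exist in total.
            TotalCount = source.Count(),
            // Query 2: only the items of the requested page.
            Items = source.OrderBy(orderBy)
                          .Skip((pageNo - 1) * pageSize)
                          .Take(pageSize)
                          .ToList()
        };
    }

    public static void Main()
    {
        // In-memory stand-in for a database table with 50 records.
        IQueryable<int> records = Enumerable.Range(1, 50).AsQueryable();

        PagedResult<int> page = GetPage(records, r => r, pageNo: 3, pageSize: 10);
        int pageCount = (page.TotalCount + 10 - 1) / 10;

        // Page 3 holds records 21 to 30, and 50 records give 5 pages.
        Console.WriteLine($"{page.Items.First()}..{page.Items.Last()} of {page.TotalCount}, {pageCount} pages");
    }
}
```

The pager in the view then only needs TotalCount, the page size and the current page number, so no more than one page of records has to cross the service boundary.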
Also if you are starting the project now do yourself a favor and remove the useless WCF service from your architecture.

How to create pagination based on client.Search results Elasticsearch Nest

Is there any way to retrieve all the results from client.Search (I think this can be done using the scroll API) and paginate these results when displaying them? Is there an API in ES for doing so?
Or can it be done using From() and Size()?
For example: let's say I have 100,000 documents in the index, and when I search for a keyword it produces some 200 results. How can I use scroll, from and size to show them?
TIA
We use the from and size options to implement pagination for Elasticsearch results.
The code snippet can be something like below:
def query(page)
size = 10
page ||= 1
from = (page-1) * size
# elasticsearch query with from and size options
end
You may need to know the total number of results to render the pagination. You can get it without sending an additional count request: use the total field of the search response.
=== Updated
If you want to get the search results of the first page, then you can use query(1). If you want to get the search results of the second page, then you can use query(2) and so on.
The purpose of scroll is slightly different. Let's say you need to get all records of the search results and the number of results is very large (e.g. millions of results). If you retrieve all the data at once, it can cause memory problems or other issues because of the high load. In this case, you can use scroll to fetch the results step by step.
For pagination, you don't need to get all the data of the search results. You only need the data of a specific page. In this case, a plain query with the from and size options is enough, NOT scroll.
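Since the question is about C#, the page arithmetic from the snippet above can be written as below. This is a sketch of the calculation only: SearchPaging, From and TotalPages are hypothetical helper names, and the values they return would be passed to the from and size options of the search request and compared against the total reported in the response.

```csharp
using System;

public static class SearchPaging
{
    // Offset of the first hit on a 1-based page; pages below 1 are treated as page 1.
    public static int From(int page, int size) => (Math.Max(page, 1) - 1) * size;

    // Number of pages needed to show totalHits results, rounding up.
    public static int TotalPages(long totalHits, int size) =>
        (int)((totalHits + size - 1) / size);

    public static void Main()
    {
        const int size = 10;
        Console.WriteLine(From(1, size));         // 0
        Console.WriteLine(From(3, size));         // 20
        Console.WriteLine(TotalPages(200, size)); // 20
        Console.WriteLine(TotalPages(201, size)); // 21
    }
}
```

For the 200 results in the question this gives 20 pages of 10 hits. Note that Elasticsearch limits from + size through the index.max_result_window setting (10,000 by default), which is one more reason to reserve scroll for very deep result sets.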

How to limit the number of elements in a checkboxlist?

I have a CheckBoxList in my C#/ASP.NET project and I'm populating it from a DataTable that gets its data from a query to my database. The query returns a large amount of data, and I want to restrict the number of elements that are shown initially, before I filter the data (to, say, the top 1000). How would I go about doing this?
There are two places where you can limit the amount of data.
In the database (assuming you use SQL Server), you can modify the query to return only the top 1000 rows.
SELECT TOP 1000 * FROM SomeTable
Or you can limit the data after it arrives using LINQ.
var newData = dataTable.AsEnumerable().Take(1000);
I would prefer the first method, so you don't truck around useless data. But the second definitely works as well if you need the full data elsewhere.
You can use the generic Take extension method on IEnumerable<T>:
var data = someQuery.Exec();
var limitedData = data.Take(1000).ToArray();
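One detail of the LINQ variant is that Take on dataTable.AsEnumerable() yields DataRow objects rather than a DataTable. If the control is bound to a DataTable, the rows can be copied into a new one, roughly as in this sketch. DataTableLimiter, TakeRows and the sample column are assumptions made for illustration.

```csharp
using System;
using System.Data;
using System.Linq;

public static class DataTableLimiter
{
    // Returns a new DataTable holding at most maxRows rows from source.
    public static DataTable TakeRows(DataTable source, int maxRows)
    {
        var rows = source.AsEnumerable().Take(maxRows).ToList();

        // CopyToDataTable throws on an empty sequence, so fall back to an empty clone of the schema.
        return rows.Count == 0 ? source.Clone() : rows.CopyToDataTable();
    }

    public static void Main()
    {
        // Sample data standing in for the query result.
        var table = new DataTable();
        table.Columns.Add("Name", typeof(string));
        for (int i = 0; i < 2500; i++) table.Rows.Add($"row-{i}");

        DataTable limited = TakeRows(table, 1000);
        Console.WriteLine(limited.Rows.Count); // 1000
    }
}
```

The limited table can then be assigned as the data source of the list, while the full table stays available for later filtering.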
