IAsyncEnumerable and database queries

I have three controller methods returning IAsyncEnumerable of WeatherForecast.
The first one #1 uses SqlConnection and yields results from an async reader.
The second one #2 uses EF Core with the ability to use AsAsyncEnumerable extension.
The third one #3 uses EF Core and ToListAsync method.
I think the downside of #1 and #2 is that if I do something time-consuming inside the while or await foreach loop, the database connection stays open until the loop finishes. In scenario #3 I can iterate over the list with the connection already closed and do something else.
But, I don't know if IAsyncEnumerable makes sense at all for database queries. Are there any memory and performance issues? If I use IAsyncEnumerable for returning let's say HTTP request from API, then once a response is returned it's not in memory and I'm able to return the next one and so on. But what about the database, where is the whole table if I select all rows (with IAsyncEnumerable or ToListAsync)?
Maybe it's not a question for StackOverflow and I'm missing something big here.
#1
[HttpGet("db", Name = "GetWeatherForecastAsyncEnumerableDatabase")]
public async IAsyncEnumerable<WeatherForecast> GetAsyncEnumerableDatabase()
{
var connectionString = "";
await using var connection = new SqlConnection(connectionString);
string sql = "SELECT * FROM [dbo].[Table]";
await using SqlCommand command = new SqlCommand(sql, connection);
await connection.OpenAsync();
await using var dataReader = await command.ExecuteReaderAsync();
while (await dataReader.ReadAsync())
{
yield return new WeatherForecast
{
Date = Convert.ToDateTime(dataReader["Date"]),
Summary = Convert.ToString(dataReader["Summary"]),
TemperatureC = Convert.ToInt32(dataReader["TemperatureC"])
};
}
await connection.CloseAsync();
}
#2
[HttpGet("ef", Name = "GetWeatherForecastAsyncEnumerableEf")]
public async IAsyncEnumerable<WeatherForecast> GetAsyncEnumerableEf()
{
await using var dbContext = _dbContextFactory.CreateDbContext();
await foreach (var item in dbContext
.Tables
.AsNoTracking()
.AsAsyncEnumerable())
{
yield return new WeatherForecast
{
Date = item.Date,
Summary = item.Summary,
TemperatureC = item.TemperatureC
};
}
}
#3
[HttpGet("eflist", Name = "GetWeatherForecastAsyncEnumerableEfList")]
public async Task<IEnumerable<WeatherForecast>> GetAsyncEnumerableEfList()
{
await using var dbContext = _dbContextFactory.CreateDbContext();
var result = await dbContext
.Tables
.AsNoTracking()
.Select(item => new WeatherForecast
{
Date = item.Date,
Summary = item.Summary,
TemperatureC = item.TemperatureC
})
.ToListAsync();
return result;
}

Server-Side
If I only cared about the server, I'd go with a fourth option in .NET 6:
Use an injected DbContext, write a LINQ query and return the results with AsAsyncEnumerable() instead of ToListAsync():
public class WeatherForecastsController:ControllerBase
{
WeatherDbContext _dbContext;
public WeatherForecastsController(WeatherDbContext dbContext)
{
_dbContext=dbContext;
}
// Not marked async: the method returns the query's IAsyncEnumerable directly instead of yielding items.
public IAsyncEnumerable<WeatherForecast> GetAsync()
{
return _dbContext.Forecasts.AsNoTracking()
.Select(item => new WeatherForecast
{
Date = item.Date,
Summary = item.Summary,
TemperatureC = item.TemperatureC
})
.AsAsyncEnumerable();
}
}
A new Controller instance is created for every request, which means the DbContext will be around for as long as the request is being processed.
The [FromServices] attribute can be used to inject a DbContext into the action method directly. The behavior doesn't really change, the DbContext is still scoped to the request :
public IAsyncEnumerable<WeatherForecast> GetAsync([FromServices] WeatherDbContext dbContext)
{
...
}
ASP.NET Core will emit a JSON array but at least the elements will be sent to the caller as soon as they're available.
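To make the connection-lifetime concern from the question more concrete, here is a minimal sketch that doesn't touch a database at all. ReadRowsAsync, TakeFirstAsync and Log are hypothetical names, and the try/finally only stands in for opening and disposing a connection or data reader. It shows that an async iterator produces rows one at a time and that its cleanup runs as soon as the consumer stops iterating, not only after the last row.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class StreamingDemo
{
    // Records what the "data source" did, in order.
    public static List<string> Log { get; } = new();

    // Stand-in for reading rows from an open connection (assumption: no real database here).
    public static async IAsyncEnumerable<int> ReadRowsAsync(int count)
    {
        Log.Add("open");
        try
        {
            for (var i = 0; i < count; i++)
            {
                await Task.Yield();      // simulates waiting for the next row
                Log.Add($"row {i}");
                yield return i;          // only this row is handed to the consumer
            }
        }
        finally
        {
            Log.Add("close");            // runs when the consumer finishes or breaks out
        }
    }

    // Consumes only the first few rows and then stops.
    public static async Task<int> TakeFirstAsync(int wanted)
    {
        var seen = 0;
        await foreach (var row in ReadRowsAsync(1000))
        {
            seen++;
            if (seen == wanted) break;   // leaving the loop disposes the iterator
        }
        return seen;
    }
}
```

After awaiting StreamingDemo.TakeFirstAsync(3), Log contains open, row 0, row 1, row 2, close. The remaining 997 rows are never produced. The flip side is the one raised in the question: anything slow inside the loop body keeps the "connection" open for that long.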
Client-Side
By default, the client will still receive the entire JSON array before deserializing it.
One way to handle this in .NET 6 is to use DeserializeAsyncEnumerable to parse the response stream and emit items as they come:
using var stream = await client.GetStreamAsync(...);
var forecasts = JsonSerializer.DeserializeAsyncEnumerable<WeatherForecast>(stream, new JsonSerializerOptions
{
DefaultBufferSize = 128
});
await foreach(var forecast in forecasts)
{
...
}
The default buffer size is 16KB so a smaller one is needed if we want to receive objects as soon as possible.
This is a parser-specific solution though.
Use a streaming JSON response
A common workaround for this problem is streaming JSON, also known as JSON per line, Newline Delimited JSON, JSON-NL and so on. All the names refer to the same thing: sending a stream of unindented JSON objects separated by newlines. It's an old technique that many have tried to hijack and present as their own.
{ "Date": "2022-10-18", "Summary": "Blah", "TemperatureC": 18.5 }
{ "Date": "2022-10-18", "Summary": "Blah", "TemperatureC": 18.5 }
{ "Date": "2022-10-18", "Summary": "Blah", "TemperatureC": 18.5 }
Taken as a whole, that's not a valid JSON document, but many parsers can handle it. Even if a parser can't, we can simply read one line of text at a time and parse each line on its own.
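Reading one line at a time could look roughly like the sketch below. It uses System.Text.Json. NdJson.ReadAsync is a hypothetical helper, not a framework API, and the WeatherForecast record is an assumed shape that matches the sample lines.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.CompilerServices;
using System.Text.Json;
using System.Threading;
using System.Threading.Tasks;

// Assumed shape of one streamed object.
public record WeatherForecast(DateTime Date, string Summary, double TemperatureC);

public static class NdJson
{
    // Hypothetical helper: yields one object per non-empty line of the stream.
    public static async IAsyncEnumerable<T> ReadAsync<T>(
        Stream stream,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        using var reader = new StreamReader(stream);
        string? line;
        while ((line = await reader.ReadLineAsync()) is not null)
        {
            cancellationToken.ThrowIfCancellationRequested();
            if (string.IsNullOrWhiteSpace(line)) continue;   // tolerate blank lines

            // Each line is a complete JSON document, so it can be parsed on its own.
            var item = JsonSerializer.Deserialize<T>(line);
            if (item is not null) yield return item;
        }
    }
}
```

A caller would pass the HTTP response stream and consume the result with await foreach, handling each forecast as soon as its line has arrived.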
Use a different protocol
Even streaming JSON responses are a workaround. A plain HTTP response has no built-in notion of a cancellable server-side stream: short of dropping the connection, a client that only reads the first 3 items has no way to tell the server to stop sending the rest.
It's more efficient to use a protocol that does allow streaming. ASP.NET Core 6 offers two options:
Use ASP.NET Core SignalR with Server-side streaming to push data to clients over WebSockets
A gRPC Service that uses server-side streaming
In both cases the server sends objects to clients as soon as they're available. Clients can cancel the stream as needed.
In a SignalR hub, the code could return an IAsyncEnumerable or a Channel:
public class AsyncEnumerableHub : Hub
{
...
// Not marked async: the hub method returns the query's IAsyncEnumerable directly.
public IAsyncEnumerable<WeatherForecast> GetForecasts()
{
return _dbContext.Forecasts.AsNoTracking()
.Select(item => new WeatherForecast
{
Date = item.Date,
Summary = item.Summary,
TemperatureC = item.TemperatureC
})
.AsAsyncEnumerable();
}
}
In gRPC, the server method writes objects to a response stream :
public override async Task StreamingFromServer(ForecastRequest request,
IServerStreamWriter<ForecastResponse> responseStream, ServerCallContext context)
{
...
await foreach (var item in queryResults)
{
if (context.CancellationToken.IsCancellationRequested)
{
return;
}
await responseStream.WriteAsync(new ForecastResponse{Forecast=item});
}
}

Related

Entity Framework - improve query efficiency that retrieve lots of data

I have a database with lots of data - Excel file management.
The application manages objects when each object contains an Excel file (number of sheets, list of rows for each sheet).
The application contains a Data Grid and a list of sheets. When the user selects a revision number and a sheet name, the lines of that sheet are displayed.
The objects are built like this:
Version object contains list of Pages, each page contains list of PageLine.
What is the best way to retrieve the data?
For example, my PopulateGrid method :
public void PopulateGrid()
{
CurrentPageLineGridObjects.Clear();
PreviousPageLineGridObjects.Clear();
SetCurrentConnectorPageList();
// get current revision
CurrentPageLineGridObjects = CurrentCombinedPageList.Where(page => page.Name ==
PageNameSelected).FirstOrDefault().PageLines.ToList().ToObservablePageLineGridObjectCollection();
//get prev revision
RevisionCOMBINED prevRevCombined = pgroupDataService.GetRevisionCombinedForPGroup(((PGroup)PGroupSelected.Object).Id).Result;
// get pages and pagelines for revision eeprom and override.
List<Page> eepromPages =
revisionEEPROMDataService.GetEEPROMPages(prevRevCombined.RevisionEEPROM.Id).Result;
}
public async Task<List<Page>> GetEEPROMPages(int eepromRevId)
{
string[] includes = { "Pages", "Pages.PageLines" };
IEnumerable<RevisionEEPROM> list = (IEnumerable<RevisionEEPROM>)await dataService.GetAll(includes);
return list.Where(r => r.Id == eepromRevId).SelectMany(p => p.Pages).ToList();
}
public async Task<IEnumerable<T>> GetAll()
{
using (DeployToolDBContex contex = _contexFactory.CreateDbContext())
{
IEnumerable<T> entities = await contex.Set<T>().ToListAsync();
return entities;
}
}
As you can see I pull out all the version data along with all the Sheets and all the PageLines and only then filter by the given version key.
It takes me quite a while to load.
I would appreciate any advice.
I tried to use IQueryable:
public async Task<List<T>> GetQueryable(string[] includes = null)
{
using (DeployToolDBContex context = _contextFactory.CreateDbContext())
{
if (includes != null)
{
var query = context.Set<T>().AsQueryable();
foreach (var include in includes)
query = query.Include(include);
return query.ToList();
}
else
{
List<T> entities = await context.Set<T>().AsQueryable().ToListAsync();
return entities;
}
}
}

This is a terrible use of EF. For a start, code like this:
IEnumerable<RevisionEEPROM> list = (IEnumerable<RevisionEEPROM>)await dataService.GetAll(includes);
return list.Where(r => r.Id == eepromRevId).SelectMany(p => p.Pages).ToList();
You are fetching the entire table and associated includes (based on that includes array passed) into memory before filtering.
Given you are scoping the DbContext within that data service method with a using block, the best option would be to introduce a GetPagesForEepromRevision() method to fetch the pages for a given ID in your data service. Your Generic implementation for this Data Service should be a base class for these data services so that they can provide common functionality, but can be extended to support specific cases to optimize queries for each area. For instance if you have:
public class DataService<T>
{
public async Task<IEnumerable<T>> GetAll() {...}
// ...
}
extend it using:
public class EepromDataService : DataService<EEPROM>
{
public async Task<IEnumerable<Page>> GetPagesForEepromRevision(int eepromRevId)
{
using (DeployToolDBContext context = _contexFactory.CreateDbContext())
{
var pages = await context.Set<EEPROM>()
.Where(x => x.Id == eepromRevId)
.SelectMany(x => x.Pages)
.ToListAsync();
return pages;
}
}
}
So if your calling code was creating something like a var dataService = new DataService<EEPROM>(); this would change to var dataService = new EepromDataService();
The IQueryable option mentioned before:
public IQueryable<T> GetQueryable()
{
var query = _context.Set<T>().AsQueryable();
return query;
}
Then when you go to fetch your data:
var results = await dataService.GetQueryable()
.Where(r => r.Id == eepromRevId)
.SelectMany(r => r.Pages)
.ToListAsync();
return results;
This requires either a Unit of Work pattern which would scope the DbContext at the consumer level (eg: GetEEPROMPages method) or a shared dependency injected DbContext that spans both the caller where ToListAsync would be called as well as the data service. Since your example is scoping the DbContext inside the dataService with a using block that's probably a bigger change.
Overall, you need to review your use of asynchronous vs. synchronous calls. Other methods do things like:
RevisionCOMBINED prevRevCombined = pgroupDataService.GetRevisionCombinedForPGroup(((PGroup)PGroupSelected.Object).Id).Result;
Calling .Result like this is very bad practice. If you need to call async methods from within a synchronous method, there are proper ways to do it that ensure things like exception bubbling still work. For examples, see "How to call asynchronous method from synchronous method in C#?". If the code doesn't need to be asynchronous, leave it synchronous. async is not a silver "go faster" bullet; it makes the calling code more responsive, provided that code is actually written to use async the entire way (e.g. HTTP web requests in ASP.NET).
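One concrete difference is how exceptions surface. The sketch below uses hypothetical names (AsyncDemo, LoadAsync and the two Describe methods) and no real data service. It awaits a failing task in one method and calls .Result on it in another, then reports the exception type each caller sees.

```csharp
using System;
using System.Threading.Tasks;

public static class AsyncDemo
{
    // Stand-in for an async data-service call (assumption: no real I/O here).
    public static async Task<int> LoadAsync(bool fail)
    {
        await Task.Yield();
        if (fail) throw new InvalidOperationException("load failed");
        return 42;
    }

    // Awaiting rethrows the original exception type.
    public static async Task<string> DescribeAwaitAsync()
    {
        try { await LoadAsync(fail: true); return "no error"; }
        catch (Exception ex) { return ex.GetType().Name; }
    }

    // Blocking on .Result wraps the failure in an AggregateException.
    public static string DescribeResult()
    {
        try { _ = LoadAsync(fail: true).Result; return "no error"; }
        catch (Exception ex) { return ex.GetType().Name; }
    }
}
```

DescribeAwaitAsync returns "InvalidOperationException", while DescribeResult returns "AggregateException", so catch blocks written for the real exception type silently stop matching. In a UI application, blocking on .Result on the UI thread can also deadlock, which this console-style sketch does not show.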

How to use mongodb as real time database with ASP.NET Core 5.0 and Angular 10?

Our applications all run in Kubernetes. The backend is ASP.NET Core 5, the front end is Angular 10, and MongoDB (also running in Kubernetes) is configured with ReplicaSets: true.
I have implemented full CRUD operations for MongoDB. I would like to get real-time changes: when I insert or update something, it should appear in the front end automatically, without refreshing the page.
I tried the same approach for change operations using this
My code looks like:
// Create action (want to detect this and automatically update get method)
private readonly IMongoDatabase _mongoDatabase;
private readonly MongodbConfig _config;
public MongodbContext(MongodbConfig config)
{
_config = config;
var conventionPack = new ConventionPack {new CamelCaseElementNameConvention()};
ConventionRegistry.Register("camelCase", conventionPack, t => true);
var client = new MongoClient($"mongodb://{_config.Username}:{_config.Password}@{_config.Url}");
_mongoDatabase = client.GetDatabase(_config.Database);
}
public async Task Create(NotificationDto dto, string userId, CancellationToken cancellationToken)
{
var record = _mongoDatabase.GetCollection<NotificationDto>($"notifications-{userId}");
await record.InsertOneAsync(dto, cancellationToken: cancellationToken);
}
// Get collection
public async Task<List<NotificationDto>> Get(string userId, CancellationToken cancellationToken)
{
var record = _mongoDatabase.GetCollection<NotificationDto>($"notifications-{userId}");
FilterDefinition<NotificationDto> notifications = Builders<NotificationDto>.Filter.Eq("userId", userId);
var result = await record.Find(notifications).ToListAsync(cancellationToken);
return result;
}
My watch method for tracking changes
void Watch(string userId) // using BsonDocument instead of the DTO did not help
{
IMongoCollection<NotificationDto> collection =
_mongoDatabase.GetCollection<NotificationDto>($"notifications-{userId}");
ChangeStreamOptions options = new ChangeStreamOptions
{FullDocument = ChangeStreamFullDocumentOption.UpdateLookup};
var pipeline =
new EmptyPipelineDefinition<ChangeStreamDocument<NotificationDto>>().Match(
"{ operationType: { $in: [ 'replace', 'insert', 'update' ] } }");
var changeStream = collection.Watch(pipeline, options).ToEnumerable().GetEnumerator();
changeStream.MoveNext();
ChangeStreamDocument<NotificationDto> next = changeStream.Current;
changeStream.Dispose();
}
This class implements an interface. I tried adding a new method to the interface and calling Watch inside it (by default we cannot write the body directly in the interface), and then calling it from Program, from Startup, or directly inside the Get method. Every time it froze (endless spinning). I have no idea where Watch needs to be called, how it will detect changes, or how those changes reach the UI. We are sure our Bitnami MongoDB configuration is good and all CRUD operations work fine. I get no error while debugging, but execution gets stuck on the MoveNext() line and the page loads forever.
So the point is to detect changes in MongoDB, notify the API method that the UI uses, and see the changes there.

MassTransit - Wait for all activities to complete and then continue processing

If I have too many activities, will that block resources or cause request timeouts?
Here is my scenario:
I have an API controller which sends an Order request to a consumer. I use the Request/Response pattern to receive an ErrorMessage property from the consumer and respond based on that property: if it's null I want to return OK(), otherwise return BadRequest, or Ok with a message like "Product out of stock" to notify the client.
In my consumer, I have built a routing slip which has 2 activities:
CreateOrderActivity: Which creates an order with order details.
ReserveProductActivity: Which reduces the quantity of product in stock. If the product quantity < 0, I publish a message with an ErrorMessage back to the consumer and compensate the previous activity.
public async Task Consume(ConsumeContext<ProcessOrder> context)
{
try
{
if (!string.IsNullOrEmpty(context.Message.ErrorMessage))
{
await context.RespondAsync<OrderSubmitted>(new
{
context.Message.OrderId,
context.Message.ErrorMessage
});
return;
}
RoutingSlipBuilder builder = new RoutingSlipBuilder(context.Message.OrderId);
// get configs
var settings = new Settings(_configuration);
// Add activities
builder.AddActivity(settings.CreateOrderActivityName, settings.CreateOrderExecuteAddress);
builder.SetVariables(new { context.Message.OrderId, context.Message.Address, context.Message.CreatedDate, context.Message.OrderDetails });
builder.AddActivity(settings.ReserveProductActivityName, settings.ReserveProductExecuteAddress);
builder.SetVariables(new { context.Message.OrderDetails });
await context.Execute(builder.Build());
await context.RespondAsync<OrderSubmitted>(new
{
context.Message.OrderId
});
}
catch (Exception ex)
{
_log.LogError("Can not create Order {OrderId}", context.Message.OrderId);
throw new Exception(ex.Message);
}
}
Code for ReserveProductActivity:
public async Task<ExecutionResult> Execute(ExecuteContext<ReserveProductArguments> context)
{
var orderDetails = context.Arguments.OrderDetails;
foreach (var orderDetail in orderDetails)
{
var product = await _productRepository.GetByProductId(orderDetail.ProductId);
if (product == null) continue;
var quantity = product.SetQuantity(product.QuantityInStock - orderDetail.Quantity);
if (quantity < 0)
{
var errorMessage = "Out of stock.";
await context.Publish<ProcessOrder>(new
{
ErrorMessage = errorMessage
});
throw new RoutingSlipException(errorMessage);
}
await _productRepository.Update(product);
}
return context.Completed(new Log(orderDetails.Select(x => x.ProductId).ToList()));
}
The problem is this line in the consumer method: await context.Execute(builder.Build())
At first I thought it would build the routing slip and execute all activities before going to the next line, but it doesn't. Instead it immediately goes to the next line of code (which responds back to the controller) and only then executes the activities, which is not what I want. I need to check the product quantity in the 2nd activity first and, based on that, respond to the controller.
(Currently, it always responds to the controller first, on the line after builder.Build(). Then, if quantity < 0, it still reaches the very first if condition of the Consume method, but since it has already responded, I cannot trigger a response inside that if statement again.)
So in short, if the product is still available in the 2nd activity, I can send the response back as normal (which executes the code after context.Execute(builder.Build())). But if quantity < 0, in which case I publish back to the consumer method with an ErrorMessage, I would like it to jump to the very first if condition of the Consume method (if(!string.IsNullOrEmpty(context.Message.ErrorMessage)) ...) and notify the client based on the ErrorMessage.
Is there something wrong with this approach? How can I achieve something like this?
Thanks

It isn't documented, but it is possible to use a proxy to execute a routing slip and respond to the request with the result of the routing slip. You can see the details in the unit tests:
https://github.com/MassTransit/MassTransit/blob/master/tests/MassTransit.Tests/Courier/RequestRoutingSlip_Specs.cs#L20
You could create the proxy, which builds the routing slip and executes it, and the response proxy - both of which are then configured on a receive endpoint as .Instance consumers.
class RequestProxy :
RoutingSlipRequestProxy<Request>
{
protected override void BuildRoutingSlip(RoutingSlipBuilder builder, ConsumeContext<Request> request)
{
// get configs
var settings = new Settings(_configuration);
// Add activities
builder.AddActivity(settings.CreateOrderActivityName, settings.CreateOrderExecuteAddress);
// The ConsumeContext parameter is named request here, so read the message from it.
builder.SetVariables(new { request.Message.OrderId, request.Message.Address, request.Message.CreatedDate, request.Message.OrderDetails });
builder.AddActivity(settings.ReserveProductActivityName, settings.ReserveProductExecuteAddress);
builder.SetVariables(new { request.Message.OrderDetails });
}
}
class ResponseProxy :
RoutingSlipResponseProxy<Request, Response>
{
protected override Response CreateResponseMessage(ConsumeContext<RoutingSlipCompleted> context, Request request)
{
return new Response();
}
}
You could then call it from the consumer, or put the ordering logic in the proxy - whichever makes sense, and then use the request client from your controller to send the request and await the response.

Use LINQ (Skip and Take) methods with HttpClient.GetAsync method for improving performance?

I have used the following code to retrieve the content of a JSON feed and as you see I have used the paging techniques and Skip and Take methods like this:
[HttpGet("[action]")]
public async Task<myPaginatedReturnedData> MyMethod(int page)
{
int perPage = 10;
int start = (page - 1) * perPage;
using (HttpClient client = new HttpClient())
{
client.BaseAddress = new Uri("externalAPI");
MediaTypeWithQualityHeaderValue contentType =
new MediaTypeWithQualityHeaderValue("application/json");
client.DefaultRequestHeaders.Accept.Add(contentType);
HttpResponseMessage response = await client.GetAsync(client.BaseAddress);
string content = await response.Content.ReadAsStringAsync();
IEnumerable<myReturnedData> data =
JsonConvert.DeserializeObject<IEnumerable<myReturnedData>>(content);
myPaginatedReturnedData datasent = new myPaginatedReturnedData
{
Count = data.Count(),
myReturnedData = data.Skip(start).Take(perPage).ToList(),
};
return datasent;
}
}
My paging works fine, however I can't see any performance improvement and I know this is because every time I request a new page it calls the API again and again and after retrieving all contents, it filters it using Skip and Take methods, I am looking for a way to apply the Skip and Take methods with my HttpClient so that it only retrieves the needed records for every page. Is it possible? If so, how?

In order to apply the Take/Skip to the data retrieval, the server would have to know about them. You could do that with an IQueryable LINQ provider (see [1] to get an idea of how complex that is) or, better, by passing the appropriate values in the client.GetAsync call, something like
HttpResponseMessage response = await client.GetAsync(client.BaseAddress + $"?skip={start}&take={perPage}");
Of course, your server-side code has to interpret those skip and take parameters correctly; it's not automatic.
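As a rough sketch of both halves: the Paging class and its method names below are hypothetical, and the skip and take parameter names are only the ones used in the example URL above, so the external API would have to support them (or an equivalent) for this to help.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Paging
{
    // Translates a 1-based page number into skip/take values.
    public static (int Skip, int Take) ForPage(int page, int perPage)
        => ((Math.Max(page, 1) - 1) * perPage, perPage);

    // Client side: appends the values to the request URL.
    public static string BuildUrl(string baseUrl, int page, int perPage)
    {
        var (skip, take) = ForPage(page, perPage);
        var separator = baseUrl.Contains('?') ? "&" : "?";
        return $"{baseUrl}{separator}skip={skip}&take={take}";
    }

    // Server side: what the endpoint would do with the parsed values
    // before serializing, so only one page crosses the wire.
    public static List<T> Apply<T>(IEnumerable<T> source, int skip, int take)
        => source.Skip(Math.Max(skip, 0)).Take(Math.Max(take, 0)).ToList();
}
```

If the client still needs the total Count for its pager, the server has to return it alongside the page, since the client no longer sees the full data set.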
You might also want to look at OData (see [2]), but I have never actually used it in production; I just know it exists.
[1] https://msdn.microsoft.com/en-us/library/bb546158.aspx
[2] https://learn.microsoft.com/en-us/aspnet/web-api/overview/odata-support-in-aspnet-web-api/odata-v3/calling-an-odata-service-from-a-net-client

How to edit array to add keynames, from a httpclient api call in MVC?

I am very new to MVC and making api calls server side and need a little guidance. I have created a simple method to call an api to retrieve results in a JSON object:
apiController.cs (normal controller.cs file)
[HttpGet]
public JsonResult getDefaultStuff(string a = "abc") {
var url = "https://myapiurl";
var client = new HttpClient();
client.DefaultRequestHeaders.UserAgent.ParseAdd("Blah");
var response = client.GetStringAsync(url);
return Json(response, JsonRequestBehavior.AllowGet);
}
The results return in an array like this:
{Result: {examples: [[0000,6.121],[0000,1.122],[0000,9.172]]},"Id":81,"Exception":null,"Status":5,"IsCanceled":false,"IsCompleted":true,"CreationOptions":0,"AsyncState":null,"IsFaulted":false}
I need it to return with keynames like this :
{
"examples": [
{
"Keyname1": "45678",
"Keyname2": "1234"
},
{
"Keyname1": "14789",
"Keyname2": "1234"
},
{
"Keyname1": "12358",
"Keyname2": "4569"
}
]
}
Do I need to use IDictionary? I am unsure of the approach. Do I create a new object and then loop through each result, adding key names? An example would be much appreciated, or just the approach would be very helpful.

You can do the following:
By using the Json.Net nuget package first deserialize the response into, for example, an anonymous object:
var deserialized = JsonConvert.DeserializeAnonymousType(response, new
{
examples = new[] { new decimal[] { } }
});
Then transform this object into the new one that has the property structure you need:
var result = new
{
examples = deserialized.examples.Select(x => new
{
Keyname1 = x[0],
Keyname2 = x[1]
})
};
And return it to the client:
return Json(result, JsonRequestBehavior.AllowGet);
This is what your solution could roughly look like but you do have to keep several things in mind though:
Exception handling for deserialization
Checks for possible nulls in the deserialized object
Maybe also the safer way of retrieving values from the array to avoid possible out of range exceptions.
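To illustrate those three checks, here is a defensive sketch of the same transformation. Note that it swaps in System.Text.Json's JsonDocument instead of Json.NET, purely so the example needs no extra package. ExampleMapper and the Example record are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical target shape with the requested key names.
public record Example(decimal Keyname1, decimal Keyname2);

public static class ExampleMapper
{
    public static List<Example> Map(string json)
    {
        var result = new List<Example>();
        if (string.IsNullOrWhiteSpace(json)) return result;

        JsonDocument doc;
        try { doc = JsonDocument.Parse(json); }
        catch (JsonException) { return result; }      // malformed payload

        using (doc)
        {
            var root = doc.RootElement;
            if (root.ValueKind != JsonValueKind.Object
                || !root.TryGetProperty("examples", out var examples)
                || examples.ValueKind != JsonValueKind.Array)
            {
                return result;                          // missing or unexpected "examples"
            }

            foreach (var pair in examples.EnumerateArray())
            {
                // Skip entries that are not arrays with at least two elements.
                if (pair.ValueKind != JsonValueKind.Array || pair.GetArrayLength() < 2) continue;

                var first = pair[0];
                var second = pair[1];
                if (first.ValueKind == JsonValueKind.Number
                    && second.ValueKind == JsonValueKind.Number
                    && first.TryGetDecimal(out var k1)
                    && second.TryGetDecimal(out var k2))
                {
                    result.Add(new Example(k1, k2));
                }
            }
        }
        return result;
    }
}
```

This version returns an empty list on bad input and silently skips malformed entries. Whether to do that or to report an error to the caller is a design decision for your API.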
Also the GetStringAsync method is asynchronous and you should put an await keyword in front of it, but in order to do so you need to make your method async as well:
public async Task<JsonResult> getDefaultStuff(...)
If you don't have enough knowledge of asynchronous programming, here is the most advanced, in-depth and comprehensive video explaining it from top to bottom I have ever seen, so check it out whenever you find time...
