Consuming OData V3 results in OutOfMemoryException - c#

This seemingly simple example of paginated requests to an OData feed results in an OutOfMemoryException when calling Load(). The ODataService class is generated code, produced for an OData V3 feed by the Unchase OData Connected Service extension from the Visual Studio Marketplace.
using System;
using System.Collections.Generic;
using System.Data.Services.Client;
using System.Diagnostics;
using System.Linq;
using System.Net;
using System.Text.RegularExpressions;

namespace ODataToSqlMapper {
    internal class ODataGetter {
        public ODataService DataService { get; set; }

        public ODataGetter() {
            DataService = new ODataService(new Uri("my-url-here"));
            DataService.Credentials = CredentialCache.DefaultCredentials;
            DataService.Format.UseJson();
        }

        private void GetFiles(DataServiceQuery<File> query) {
            DataServiceQueryContinuation<File> token = null;
            Stopwatch timer = new Stopwatch();
            int loadedCount = 0;
            try {
                timer.Start();
                QueryOperationResponse<File> response = query.Execute() as QueryOperationResponse<File>;
                long totalCount = response.TotalCount;
                do {
                    Console.Write($"\rLoaded {loadedCount} / {totalCount} files ({(int)(100.0 * loadedCount / totalCount)} %) in {timer.ElapsedMilliseconds} ms. ");
                    if (token != null)
                        response = DataService.Execute<File>(token) as QueryOperationResponse<File>;
                    loadedCount += response.Count();
                }
                while ((token = response.GetContinuation()) != null);
            }
            catch (DataServiceQueryException ex) {
                throw new ApplicationException("An error occurred during query execution.", ex);
            }
        }

        internal void Load() {
            DataServiceQuery<File> query = DataService.Files.IncludeTotalCount() as DataServiceQuery<File>;
            GetFiles(query);
        }
    }
}
When taking a memory snapshot (see figure), I see that a lot of Uri objects (12.7 million of them) are taking up 575 MB of memory. I've tried splitting up my requests by modifying the query in Load() such that GetFiles() is called several times, however this doesn't seem to make any difference. Any ideas on how I can get around this?
I found this issue for a related library, which describes a similar problem. This was fixed in that particular library, but I can't use that, since it supports only OData V4 (and my feed is V3).
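One mitigation worth trying, sketched below: by default a DataServiceContext tracks every entity it materializes, retaining identity and edit-link Uris for each one, which would match the millions of Uri objects in the snapshot. Since this is a read-only export, the context can be switched to MergeOption.NoTracking so those Uris are never kept (an assumption on my part; I haven't verified it against this particular feed):

public ODataGetter() {
    DataService = new ODataService(new Uri("my-url-here"));
    DataService.Credentials = CredentialCache.DefaultCredentials;
    DataService.Format.UseJson();
    // Assumption: the export is read-only, so entity tracking is pure overhead.
    // NoTracking stops the context from caching identity/edit-link Uris per entity.
    DataService.MergeOption = MergeOption.NoTracking;
}

If tracking has to stay on, recreating the ODataService (a fresh DataServiceContext) for each batch of pages would at least let the Uris of earlier batches be collected.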

Related

.net core web application - what's using all this memory? Small enumerable vs big enumerable vs stream response

I'm trying to understand server memory usage when a web application sends a big response.
I generated a ~100 megabyte text file on disk just to act as some data to send. Note that I may not actually want to send a file back (i.e., I won't always want to use a FileStreamResult in practice). The file is just a source of data for my experiment.
I have exposed three different GET endpoints.
Small returns a tiny ActionResult<IEnumerable<string>>. This is my control.
Big returns the file contents as an ActionResult<IEnumerable<string>>.
Stream implements a custom IActionResult, and writes lines directly to the Response.Body.
When I run the web application I see the following results in visual studio diagnostic tools:
On startup, the process memory is 87 meg.
Hitting the /small url -> still ~87 meg.
Hitting the /big url -> quickly jumps to ~120 meg.
Hitting the /stream url -> slowly climbs to ~110 meg.
As far as I understand, Kestrel has a response buffer of 64 KB, and I think I have correctly decorated the methods to disable response caching. So, what is causing the server's memory consumption to increase so much, and is there something I can do to ensure that the server doesn't have to use all this extra memory? I was hoping I could truly "stream" results down to the client without having server memory usage climb so much.
Also, I'm pretty new to ASP.NET Core MVC web application APIs (what a mouthful!), so any other tips are welcome.
using System;
using System.IO;
using System.Collections.Generic;
using Microsoft.AspNetCore.Mvc;
using System.Threading.Tasks;

namespace WebApplication1.Controllers
{
    public static class EnumerableFile
    {
        public static IEnumerable<string> Lines
        {
            get
            {
                using (var sr = new StreamReader(@"g:\data.dat"))
                {
                    string s = null;
                    while ((s = sr.ReadLine()) != null)
                    {
                        yield return s;
                    }
                }
            }
        }
    }

    public class StreamResult : IActionResult
    {
        public Task ExecuteResultAsync(ActionContext context)
        {
            return Task.Run(
                () =>
                {
                    using (var sr = new StreamReader(@"g:\data.dat"))
                    {
                        string s = null;
                        while ((s = sr.ReadLine()) != null)
                        {
                            context.HttpContext.Response.Body.Write(System.Text.Encoding.UTF8.GetBytes(s));
                        }
                    }
                });
        }
    }

    [ApiController]
    public class ValuesController : ControllerBase
    {
        [Route("api/[controller]/small")]
        [ResponseCache(NoStore = true, Location = ResponseCacheLocation.None)]
        public ActionResult<IEnumerable<string>> Small()
        {
            return new string[] { "value1", "value2" };
        }

        [Route("api/[controller]/big")]
        [ResponseCache(NoStore = true, Location = ResponseCacheLocation.None)]
        public ActionResult<IEnumerable<string>> Big()
        {
            return Ok(EnumerableFile.Lines);
        }

        [Route("api/[controller]/stream")]
        [ResponseCache(NoStore = true, Location = ResponseCacheLocation.None)]
        public IActionResult Stream()
        {
            return new StreamResult();
        }
    }
}
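For comparison, the StreamResult above parks a thread-pool thread inside Task.Run and writes synchronously. An asynchronous variant (a sketch under the same assumptions, reading the same g:\data.dat test file) writes each line with WriteAsync, so the response is flushed incrementally without tying up a thread:

public class AsyncStreamResult : IActionResult
{
    public async Task ExecuteResultAsync(ActionContext context)
    {
        var body = context.HttpContext.Response.Body;
        using (var sr = new StreamReader(@"g:\data.dat"))
        {
            string s;
            while ((s = await sr.ReadLineAsync()) != null)
            {
                // Write line by line; nothing buffers the whole file in memory.
                var bytes = System.Text.Encoding.UTF8.GetBytes(s + "\n");
                await body.WriteAsync(bytes, 0, bytes.Length);
            }
        }
    }
}

This doesn't by itself explain the climb you measured, but it removes one variable (the blocked thread and the synchronous Write) from the experiment.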

Applying custom request options for a remote SPARQL connector in dotNetRdf

I'm trying to add custom headers to the HTTP requests a SPARQL endpoint connector issues. The connector can use a custom remote endpoint, which inherits an ApplyCustomRequestOptions method I can override. The documentation for that method says
[...] add any additional custom request options/headers to the request.
However, my overridden method is never called, so my custom options are never applied and I can't add the headers.
The following code works as expected, except that my ApplyCustomRequestOptions is never invoked:
using System;
using System.Net;
using VDS.RDF.Query;
using VDS.RDF.Storage;

class Program
{
    static void Main(string[] args)
    {
        var endpointUri = new Uri("https://query.wikidata.org/sparql");
        var endpoint = new CustomEndpoint(endpointUri);
        using (var connector = new SparqlConnector(endpoint))
        {
            var result = connector.Query("SELECT * WHERE {?s ?p ?o} LIMIT 1");
        }
    }
}

public class CustomEndpoint : SparqlRemoteEndpoint
{
    public CustomEndpoint(Uri endpointUri) : base(endpointUri) { }

    protected override void ApplyCustomRequestOptions(HttpWebRequest httpRequest)
    {
        // This is never executed.
        base.ApplyCustomRequestOptions(httpRequest);
        // Implementation omitted.
    }
}
Is this the correct way to use these methods? If it isn't, what is it?
BTW this is dotNetRdf 1.0.12, .NET 4.6.1. I've tried multiple SPARQL endpoints, multiple queries (SELECT & CONSTRUCT) and multiple invocations of SparqlConnector.Query.
This is a bug. I've found the problem, fixed it, and submitted a PR. You can track the status of the issue here: https://github.com/dotnetrdf/dotnetrdf/issues/103
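Until a release with the fix ships, a possible workaround (a sketch, and it assumes the bug is confined to SparqlConnector's query path rather than to the endpoint itself) is to query the endpoint directly; SparqlRemoteEndpoint builds its own HttpWebRequest, which should give the overridden ApplyCustomRequestOptions a chance to run:

var endpoint = new CustomEndpoint(new Uri("https://query.wikidata.org/sparql"));
// Query the endpoint directly instead of going through SparqlConnector.
var results = endpoint.QueryWithResultSet("SELECT * WHERE {?s ?p ?o} LIMIT 1");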

.NET async webservice call with a callback

We have a legacy VB6 application that uses an ASMX web service written in C# (.NET 4.5), which in turn uses a C#/.NET 4.5 library to execute some business logic. One of the library methods triggers a long-running database stored procedure, at the end of which we need to kick off another process that consumes the data generated by the stored procedure. Because one of the requirements is that control must return to the VB6 client immediately after calling the web service, the library method is async and takes an Action callback as a parameter; the web service defines the callback as an anonymous method and doesn't await the result of the library method call.
At a high level it looks like this:
using System;
using System.Data.SqlClient;
using System.Threading.Tasks;
using System.Web.Services;

namespace Sample
{
    [WebService(Namespace = "urn:Services")]
    [WebServiceBinding(ConformsTo = WsiProfiles.BasicProfile1_1)]
    public class MyWebService
    {
        [WebMethod]
        public string Request(string request)
        {
            // Step 1: Call the library method to generate data
            var lib = new MyLibrary();
            lib.GenerateDataAsync(() =>
            {
                // Step 2: Kick off a process that consumes the data created in Step 1
            });
            return "some kind of response";
        }
    }

    public class MyLibrary
    {
        public async Task GenerateDataAsync(Action onDoneCallback)
        {
            try
            {
                using (var cmd = new SqlCommand("MyStoredProc", new SqlConnection("my DB connection string")))
                {
                    cmd.CommandType = System.Data.CommandType.StoredProcedure;
                    cmd.CommandTimeout = 0;
                    cmd.Connection.Open();
                    // Asynchronously call the stored procedure.
                    await cmd.ExecuteNonQueryAsync().ConfigureAwait(false);
                    // Invoke the callback if it's provided.
                    if (onDoneCallback != null)
                        onDoneCallback.Invoke();
                }
            }
            catch (Exception ex)
            {
                // Handle errors...
            }
        }
    }
}
The above works in local tests, but when the code is deployed as a web service, Step 2 is never executed, even though the Step 1 stored procedure completes and generates the data.
Any idea what we are doing wrong?
It is dangerous to leave tasks running on IIS; the app domain may be shut down before the method completes, and that is likely what is happening to you. If you use HostingEnvironment.QueueBackgroundWorkItem, you can tell IIS that there is work in progress that needs to be kept running. This keeps the app domain alive for an extra 90 seconds (by default):
using System;
using System.Data.SqlClient;
using System.Threading.Tasks;
using System.Web.Hosting; // for HostingEnvironment
using System.Web.Services;

namespace Sample
{
    [WebService(Namespace = "urn:Services")]
    [WebServiceBinding(ConformsTo = WsiProfiles.BasicProfile1_1)]
    public class MyWebService
    {
        [WebMethod]
        public string Request(string request)
        {
            // Step 1: Call the library method to generate data
            var lib = new MyLibrary();
            HostingEnvironment.QueueBackgroundWorkItem((token) =>
                lib.GenerateDataAsync(() =>
                {
                    // Step 2: Kick off a process that consumes the data created in Step 1
                }));
            return "some kind of response";
        }
    }

    public class MyLibrary
    {
        public async Task GenerateDataAsync(Action onDoneCallback)
        {
            try
            {
                using (var cmd = new SqlCommand("MyStoredProc", new SqlConnection("my DB connection string")))
                {
                    cmd.CommandType = System.Data.CommandType.StoredProcedure;
                    cmd.CommandTimeout = 0;
                    cmd.Connection.Open();
                    // Asynchronously call the stored procedure.
                    await cmd.ExecuteNonQueryAsync().ConfigureAwait(false);
                    // Invoke the callback if it's provided.
                    if (onDoneCallback != null)
                        onDoneCallback();
                }
            }
            catch (Exception ex)
            {
                // Handle errors...
            }
        }
    }
}
If you want something more reliable than an extra 90 seconds, see the article "Fire and Forget on ASP.NET" by Stephen Cleary for some other options.
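One of the options that article covers is registering the background work with the host via IRegisteredObject (in System.Web.Hosting), so IIS notifies you when the app domain is about to go down instead of silently killing the work. A minimal sketch, reusing MyLibrary from above (the class name and wiring are mine, not from the article):

public class BackgroundDataGenerator : IRegisteredObject
{
    private readonly MyLibrary _lib = new MyLibrary();

    public BackgroundDataGenerator()
    {
        // Tell the hosting environment this object has work in flight.
        HostingEnvironment.RegisterObject(this);
    }

    public void Run(Action onDone)
    {
        Task.Run(async () =>
        {
            try { await _lib.GenerateDataAsync(onDone); }
            finally { HostingEnvironment.UnregisterObject(this); }
        });
    }

    public void Stop(bool immediate)
    {
        // Called by IIS on shutdown; a real implementation would
        // signal cancellation to the running work here.
        HostingEnvironment.UnregisterObject(this);
    }
}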
I have found a solution to my problem that involves the old-style (Begin/End) approach to asynchronous execution of code:
public void GenerateData(Action onDoneCallback)
{
    try
    {
        var cmd = new SqlCommand("MyStoredProc", new SqlConnection("my DB connection string"));
        cmd.CommandType = System.Data.CommandType.StoredProcedure;
        cmd.CommandTimeout = 0;
        cmd.Connection.Open();
        cmd.BeginExecuteNonQuery(
            (IAsyncResult result) =>
            {
                cmd.EndExecuteNonQuery(result);
                cmd.Dispose();
                // Invoke the callback if it's provided, ignoring any errors it may throw.
                var callback = result.AsyncState as Action;
                if (callback != null)
                    callback.Invoke();
            },
            onDoneCallback);
    }
    catch (Exception ex)
    {
        // Handle errors...
    }
}
The onDoneCallback action is passed to the BeginExecuteNonQuery method as the second (state) argument and is then retrieved and invoked inside the AsyncCallback (the first argument). This works like a charm both when debugging inside VS and when deployed to IIS.

Handling collection events with MongoDB C# driver (v2.0)

Playing with the new MongoDB driver (v2.0) has been quite challenging. Most of the examples you find on the web still refer to the legacy driver. The reference manual for v2.0 on the official Mongo site is "terse", to say the least.
I'm attempting to do a simple thing: detect when a collection has been changed in order to forward a C# event to my server application.
To do so, I found the following C# example (written against the legacy API), which I'm trying to convert to the new API.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading;
using MongoDB.Bson;
using MongoDB.Driver;
using MongoDB.Driver.Builders;

namespace TestTailableCursor {
    public static class Program {
        public static void Main(string[] args) {
            try {
                var server = MongoServer.Create("mongodb://localhost/?safe=true");
                var database = server["test"];
                if (database.CollectionExists("capped")) {
                    database.DropCollection("capped");
                }
                var collectionOptions = CollectionOptions.SetCapped(true).SetMaxDocuments(5).SetMaxSize(10000);
                var commandResult = database.CreateCollection("capped", collectionOptions);
                var collection = database["capped"];
                // To test the tailable cursor, manually insert documents into the test.capped collection
                // while this program is running and verify that they are echoed to the console window.
                // See http://www.mongodb.org/display/DOCS/Tailable+Cursors for the C++ version of this loop.
                BsonValue lastId = BsonMinKey.Value;
                while (true) {
                    var query = Query.GT("_id", lastId);
                    var cursor = collection.Find(query)
                        .SetFlags(QueryFlags.TailableCursor | QueryFlags.AwaitData)
                        .SetSortOrder("$natural");
                    using (var enumerator = (MongoCursorEnumerator<BsonDocument>)cursor.GetEnumerator()) {
                        while (true) {
                            if (enumerator.MoveNext()) {
                                var document = enumerator.Current;
                                lastId = document["_id"];
                                ProcessDocument(document);
                            } else {
                                if (enumerator.IsDead) {
                                    break;
                                }
                                if (!enumerator.IsServerAwaitCapable) {
                                    Thread.Sleep(TimeSpan.FromMilliseconds(100));
                                }
                            }
                        }
                    }
                }
            } catch (Exception ex) {
                Console.WriteLine("Unhandled exception:");
                Console.WriteLine(ex);
            }
            Console.WriteLine("Press Enter to continue");
            Console.ReadLine();
        }

        private static void ProcessDocument(BsonDocument document) {
            Console.WriteLine(document.ToJson());
        }
    }
}
A few (related) questions:
Is that the right approach with the new driver?
If so, how do I set collection options (like SetCapped in the example above)? The new API includes something called "CollectionSettings", which seems totally unrelated.
Is my only option to rely on the legacy driver?
Thanks for your help.
Is my only option to rely on the legacy driver?
No.
[...] how do I set collection options (like SetCap in the example above). The new API includes something called "CollectionSettings", which seems totally unrelated.
There's CreateCollectionOptions now. CollectionSettings is a setting for the driver, i.e. a way to specify default behavior per collection. CreateCollectionOptions can be used like this:
db.CreateCollectionAsync("capped", new CreateCollectionOptions
{ Capped = true, MaxDocuments = 5, MaxSize = 10000 }).Wait();
Is that the right approach with the new driver?
I think so; tailable cursors are a feature of the database, and avoiding polling always makes sense.
I converted the gist of the code and it appears to work on my machine™:
Be careful when using .Result and .Wait() in a web or UI application.
private static void ProcessDocument<T>(T document) where T : class
{
    Console.WriteLine(document.ToJson());
}

static async Task Watch<T>(IMongoCollection<T> collection) where T : class
{
    try {
        BsonValue lastId = BsonMinKey.Value;
        while (true) {
            var query = Builders<T>.Filter.Gt("_id", lastId);
            using (var cursor = await collection.FindAsync(query, new FindOptions<T> {
                CursorType = CursorType.TailableAwait,
                Sort = Builders<T>.Sort.Ascending("$natural") }))
            {
                while (await cursor.MoveNextAsync())
                {
                    var batch = cursor.Current;
                    foreach (var document in batch)
                    {
                        lastId = document.ToBsonDocument()["_id"];
                        ProcessDocument(document);
                    }
                }
            }
        }
    }
    catch (Exception ex) {
        Console.WriteLine("Unhandled exception:");
        Console.WriteLine(ex);
    }
}
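Wiring it up from a console app might look like this (a sketch; the connection string and collection name are taken from the question, and blocking with Wait() is acceptable here because there is no UI or ASP.NET context to deadlock):

static void Main(string[] args)
{
    var client = new MongoClient("mongodb://localhost");
    var db = client.GetDatabase("test");
    // Create the capped collection (assumes it doesn't exist yet).
    db.CreateCollectionAsync("capped", new CreateCollectionOptions
        { Capped = true, MaxDocuments = 5, MaxSize = 10000 }).Wait();
    Watch(db.GetCollection<BsonDocument>("capped")).Wait();
}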

Reading multiple XML files freezes applicaton

I'm trying to write an application for a game called Eve Online, where people mine planets for products. The app is basically a training exercise for me, and is supposed to tell people when their operations expire so they can reset them. Some players have 30 characters, with 120 planets and over 200 timed operations.
I have written code to store the characters in an SQLite DB; I then use each character's two verification codes to pull their planets and equipment. This info is spread across 2 separate XML documents: one with the planets, and another, for each planet, with the equipment on it. I have tried to solve getting it all in the following way:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Xml;

namespace PITimer
{
    class GetPlanets
    {
        private long KeyID, CharacterID;
        private string VCode;
        string CharName;

        public List<long> planets = new List<long>();
        public List<string> planetNames = new List<string>();
        public List<AllInfo> allInfo = new List<AllInfo>();

        public GetPlanets(long KeyID, long CharacterID, string VCode, string CharName)
        {
            this.KeyID = KeyID;
            this.CharacterID = CharacterID;
            this.VCode = VCode;
            this.CharName = CharName;
        }

        public async Task PullPlanets()
        {
            XmlReader lesern = XmlReader.Create("https://api.eveonline.com/char/PlanetaryColonies.xml.aspx?keyID=" + KeyID + "&vCode=" + VCode + "&characterID=" + CharacterID);
            while (lesern.Read())
            {
                long planet = 000000;
                string planetName;
                planet = Convert.ToInt64(lesern.GetAttribute("planetID"));
                planetName = lesern.GetAttribute("planetName");
                if ((planet != 000000) && (planetName != null))
                {
                    planets.Add(planet);
                    planetNames.Add(planetName);
                    await GetExpirationTimes(planet, planetName);
                }
            }
        }

        public async Task GetExpirationTimes(long planetID, string planetName)
        {
            string planet = planetID.ToString();
            XmlReader lesern = XmlReader.Create("https://api.eveonline.com/char/PlanetaryPins.xml.aspx?keyID=" + KeyID + "&vCode=" + VCode + "&characterID=" + CharacterID + "&planetID=" + planet);
            while (lesern.Read())
            {
                string expTime;
                expTime = lesern.GetAttribute("expiryTime");
                if ((expTime != null) && (expTime != "0001-01-01 00:00:00"))
                {
                    allInfo.Add(new AllInfo(CharName, planetName, Convert.ToDateTime(expTime)));
                }
            }
        }

        public List<long> ReturnPlanets()
        {
            PullPlanets();
            return planets;
        }

        public List<string> ReturnNames()
        {
            return planetNames;
        }

        public List<AllInfo> ReturnAllInfo()
        {
            return allInfo;
        }
    }
}
This does work and returns the data, which I feed into a ListView on Xamarin.Android. My problem is that it freezes the UI when it runs. I am testing with 2 characters, and the UI is sometimes frozen for several seconds; other times it runs fine. But with 30 characters I am going to run into trouble.
I have tried solving this with async and Task in every configuration I have been able to imagine. It's still either slow or not working at all. How do I set this up in a way that will work in an app?
It's also running a lighter check in a background service, which works, but I fear I will really strain the system.
The planet XML can be seen here: https://api.eveonline.com/char/PlanetaryColonies.xml.aspx?keyID=3060230&vCode=1ft0nDTRaXgVM6r0co9QhJUq3tC5hYErfBFrt7Skilk4181krBiIRVhshH1TzkDP&characterID=94304895
I pull the planet ID and from that run this XML:
https://api.eveonline.com/char//PlanetaryPins.xml.aspx?keyID=3060230&vCode=1ft0nDTRaXgVM6r0co9QhJUq3tC5hYErfBFrt7Skilk4181krBiIRVhshH1TzkDP&characterID=94304895&planetID=40175117
I have used XmlDocument before, but thought XmlReader would be better and faster for this.
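One likely culprit is visible in ReturnPlanets(): it calls the async PullPlanets() without awaiting it, and XmlReader.Create(url) performs blocking network I/O on whatever thread calls it (here, the UI thread). A sketch of a fully awaitable variant (untested; it downloads the XML with HttpClient first, then parses the string; GetExpirationTimes would need the same treatment):

public async Task<List<long>> PullPlanetsAsync()
{
    var url = "https://api.eveonline.com/char/PlanetaryColonies.xml.aspx?keyID="
              + KeyID + "&vCode=" + VCode + "&characterID=" + CharacterID;
    using (var http = new System.Net.Http.HttpClient())
    {
        // Download asynchronously instead of blocking inside XmlReader.Create(url).
        string xml = await http.GetStringAsync(url).ConfigureAwait(false);
        using (var lesern = XmlReader.Create(new System.IO.StringReader(xml)))
        {
            while (lesern.Read())
            {
                long planet = Convert.ToInt64(lesern.GetAttribute("planetID"));
                string planetName = lesern.GetAttribute("planetName");
                if ((planet != 0) && (planetName != null))
                {
                    planets.Add(planet);
                    planetNames.Add(planetName);
                    await GetExpirationTimes(planet, planetName);
                }
            }
        }
    }
    return planets;
}

The caller would then await the whole chain (e.g. var planets = await getter.PullPlanetsAsync();) instead of calling ReturnPlanets() synchronously.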
