C# WriteAsync a specific data size

I have a C# API that returns data from a DB and a frontend that renders that data in a table.
My approach was to read the data from the DB with a SqlDataReader, iterate through the reader adding each result to a list, and return that list to the frontend.
That seemed easy enough, until I received a massive query result. My solution was to return the data chunk by chunk, but I'm stuck. This is the code I'm working with:
var sqlCommand = db.InitializeSqlCommand(query);
try
{
    using (var reader = sqlCommand.ExecuteReader())
    {
        var results = new List<List<string>>();
        var headers = new List<string>();
        var rows = new List<string>();
        // The header row is sent once, with the first chunk.
        for (var i = 0; i < reader.FieldCount; i++)
        {
            headers.Add(reader.GetName(i));
        }
        results.Add(headers);
        while (reader.Read())
        {
            for (var i = 0; i < reader.FieldCount; i++)
            {
                rows.Add(reader[i].ToString());
            }
            results.Add(rows);
            var str = JsonConvert.SerializeObject(results);
            var buffer = Encoding.UTF8.GetBytes(str);
            //Thread.Sleep(1000);
            await outputStream.WriteAsync(buffer, 0, buffer.Length);
            rows.Clear();
            results.Clear();
            outputStream.Flush();
        }
    }
}
catch (HttpException ex)
{
    if (ex.ErrorCode == -2147023667) // The remote host closed the connection.
    {
    }
}
finally
{
    outputStream.Close();
    db.Dispose();
}
With this, I'm able to return data row by row (tested with the Thread.Sleep), but I'm stuck on how to return a specific amount, say 200 rows or 1000; it really should not matter.
Any idea on how to proceed?
Thanks in advance.
Mese.

I think controlling the query is the better way, since that determines what is fetched from the database. You can increase the OFFSET on every subsequent run. For example, after the ORDER BY clause add OFFSET 200 ROWS FETCH NEXT 200 ROWS ONLY to skip the first 200 rows and get the next 200.
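For reference, a parameterized version of that paged query (a minimal sketch; OFFSET/FETCH needs SQL Server 2012+, and the table, column, and parameter names here are placeholders, not from the original question):

using System.Data.SqlClient;

// Fetch one page of rows using OFFSET/FETCH.
static void ReadPage(SqlConnection connection, int offset, int pageSize)
{
    const string pagedQuery =
        "SELECT Id, Name FROM dbo.MyTable " +
        "ORDER BY Id " +
        "OFFSET @Offset ROWS FETCH NEXT @PageSize ROWS ONLY;";

    using (var command = new SqlCommand(pagedQuery, connection))
    {
        command.Parameters.AddWithValue("@Offset", offset);     // 0, 200, 400, ...
        command.Parameters.AddWithValue("@PageSize", pageSize); // e.g. 200
        using (var reader = command.ExecuteReader())
        {
            while (reader.Read())
            {
                // Process one page of rows here.
            }
        }
    }
}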
However, since you've mentioned that you have no control over the query, you can do something like the following to filter results on your end. The key trick is to enumerate the reader with Skip/Take, e.g. reader.Cast<IDataRecord>().Skip(offset).Take(200), to choose which rows to process (SqlDataReader has no AsEnumerable member, but it is enumerable, so LINQ's Cast<IDataRecord>() works). Update the input to Skip() on every iteration to process the data accordingly.
// offset decides how many rows to skip; the outer loop keeps fetching
// until no more data is present. offset -> 0, 200, 400, 600, ...
bool hasMoreData = true;
int offset = 0;
while (hasMoreData)
{
    // Create the SqlDataReader and do the other important operations here;
    // the query is re-executed for each pass.
    var page = reader.Cast<IDataRecord>().Skip(offset).Take(200).ToList();
    foreach (var row in page)
    {
        // Processing operations.
    }
    // An empty page means there are no more rows.
    if (page.Count == 0)
        hasMoreData = false;
    offset += 200;
}
Another thing to keep in mind: when you pull the data in batches, the query executes multiple times, and if a record is added or deleted in the meantime, the batches will not line up correctly. To get past this, you can do two things:
Validate the unique ID of every record against the unique IDs of already-fetched records, to make sure the same record isn't pulled twice (an edge case caused by additions/deletions; see the sketch after this list).
Add a buffer to your offset, such as
Skip(0).Take(100) // Pulls 0 - 100 records
Skip(90).Take(100) // Pulls 90 - 190 records (overlap of 10 to cater for additions/deletions)
Skip(180).Take(100) // Pulls 180 - 280 records (overlap of 10 to cater for additions/deletions)
and so on...
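A minimal sketch of that unique-ID check (assuming each record exposes an integer Id; the names are placeholders):

var seenIds = new HashSet<int>(); // IDs of records already processed

foreach (var row in page)
{
    // HashSet.Add returns false when the ID was already present, which
    // catches duplicates introduced by the overlapping offsets above.
    if (!seenIds.Add(row.Id))
        continue;

    // Process the new record here.
}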
Hope this helps!
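For completeness, the question's original streaming loop can also be batched directly, without touching the query: buffer the rows and write once per batch. A sketch based on the question's own code (reader, outputStream, and the JSON serialization are as in the question):

const int batchSize = 200;
var results = new List<List<string>>();
while (reader.Read())
{
    var row = new List<string>();
    for (var i = 0; i < reader.FieldCount; i++)
    {
        row.Add(reader[i].ToString());
    }
    results.Add(row);

    // One write per batchSize rows instead of one write per row.
    if (results.Count == batchSize)
    {
        var buffer = Encoding.UTF8.GetBytes(JsonConvert.SerializeObject(results));
        await outputStream.WriteAsync(buffer, 0, buffer.Length);
        results.Clear();
    }
}
// Write the final partial batch, if any.
if (results.Count > 0)
{
    var buffer = Encoding.UTF8.GetBytes(JsonConvert.SerializeObject(results));
    await outputStream.WriteAsync(buffer, 0, buffer.Length);
}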

Related

Iterating over Linq-to-Entities IEnumerable causes OutOfMemoryException

The part of the code I'm working on receives an
IEnumerable<T> items
where each item is a class whose properties mirror an MSSQL database table.
The database table has a total count of 953,664 rows.
The dataset in code is filtered down to a set of 284,360 rows.
The following code throws an OutOfMemoryException when the process reaches about 1.5 GB of allocated memory.
private static void Save<T>(IEnumerable<T> items, IList<IDataWriter> dataWriters, IEnumerable<PropertyColumn> columns) where T : MyTableClass
{
    foreach (var item in items)
    {
    }
}
The variable items is actually of type
IQueryable<MyTableClass>
I can't find anyone with the same setup, and the solutions I've found from others don't apply here.
I've also tried paging, using Skip and Take with a page size of 500, but that just takes a long time and ends up with the same result. It seems like objects aren't being released after each iteration. Why is that?
How can I rewrite this code to cope with a larger collection set?
Well, as Servy has already said, you didn't provide your full code, so I'll have to make some predictions. (Sorry for my English.)
If you get the exception in foreach (var item in items) even when you are using paging, then I guess something is wrong with the paging itself. I wrote a couple of examples to explain my idea.
In the first example, I suggest (just as a test) putting your filter inside the Save function.
private static void Save<T>(IQueryable<T> items, IList<IDataWriter> dataWriters, IEnumerable<PropertyColumn> columns) where T : MyTableClass
{
    int pageSize = 500; // Only 500 records will be loaded at a time.
    int currentStep = 0;
    while (true)
    {
        // Here we issue a new request to the database using our filter.
        var tempList = items.Where(yourFilter).Skip(currentStep * pageSize).Take(pageSize);
        foreach (var item in tempList)
        {
            // If you get an exception here, maybe something is wrong in your dataWriters or columns.
        }
        currentStep++;
        if (tempList.Count() == 0) // No records were loaded, so we can leave.
            break;
    }
}
The second example shows how to use paging without any changes to the Save function:
int pageSize = 500;
int currentStep = 0;
while (true)
{
    // Here we issue a new request to the database using our filter.
    var tempList = items.Where(yourFilter).Skip(currentStep * pageSize).Take(pageSize);
    Save(tempList, dataWriters, columns); // Call the saving function.
    currentStep++;
    if (tempList.Count() == 0)
        break;
}
Try both of them and you'll either resolve your problem or find the real place where the exception is raised.
By the way, another potential culprit is your dataWriters. I guess that's where you store all the data you receive from the database. Maybe you shouldn't keep all of it; estimate how much memory all those objects require.
P.S. And don't use while (true) in real code; it's just for the example. :)
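One more guess that the original answer doesn't cover: the question's title mentions Linq-to-Entities, and Entity Framework's change tracker keeps a reference to every entity it materializes, which would explain why objects aren't released between pages. A hedged sketch, assuming an EF 6-style API:

// AsNoTracking() tells EF not to register materialized entities with the
// change tracker, so each page becomes collectible once processed.
var tempList = items.AsNoTracking()
                    .Where(yourFilter)
                    .Skip(currentStep * pageSize)
                    .Take(pageSize);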

Parse.com Query only returning first 100

Hi, I was wondering why my Parse query is only returning 100 objects when there are over 3000 rows in the Parse DB. I am using this in a Xamarin.iOS application and it's only getting the first 99 objects back; any ideas or help is appreciated. And yes, I did debug the code; it's only retrieving the first 99 objects.
public async void populateFromParseLocalDB()
{
    var query = ParseObject.GetQuery("clinics");
    IEnumerable<ParseObject> results = await query.FindAsync();
    int i = 0; // was uninitialized in the original, which would not compile
    foreach (var record in results)
    {
        i++;
        Console.WriteLine("in for each");
        var name = record.Get<String>("Name");
        Console.WriteLine(name);
    }
    int mycount = i;
}
From the Parse docs:
You can limit the number of results by calling Limit. By default, results are limited to 100, but anything from 1 to 1000 is a valid limit.
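So to pull all ~3000 rows you have to raise the limit and page through the results. A minimal sketch, assuming the Parse .NET SDK's Limit/Skip query methods:

var allResults = new List<ParseObject>();
int skip = 0;
const int batchSize = 1000; // Parse's maximum limit per query

while (true)
{
    var query = ParseObject.GetQuery("clinics").Limit(batchSize).Skip(skip);
    var batch = (await query.FindAsync()).ToList();
    allResults.AddRange(batch);

    if (batch.Count < batchSize) // a short page means we've reached the end
        break;
    skip += batchSize;
}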

Alternative to Recordset Looping

Back in the day using ADO, we used GetRows() to pull back an array and loop through it, because it was faster than using rs.MoveNext to walk through the records. I'm writing an application that pulls back half a million rows and writes them out to a file. Pulling the data from SQL takes about 3 minutes, but writing it to a CSV takes another 12 minutes. From the looks of it, this is because I'm looping through a SqlDataReader. What is a faster alternative?
Keep in mind that I do not know what the SQL structure will look like, as this calls a reporting table that tells my application which query should be run. I looked at using LINQ and returning an array, but that would require knowing the structure, so it will not work.
Note: in the code below, the switch statement has many cases; to cut down on space, I removed all except one.
StringBuilder rowValue = new StringBuilder();
SqlDataReader reader = queryData.Execute(System.Data.CommandType.Text, sql, null);
// This handles multiple record sets.
while (reader.HasRows)
{
    // Write the header row. (Reset the builder so a previous record
    // set's last row doesn't leak into this header.)
    rowValue = new StringBuilder();
    for (int i = 0; i < reader.FieldCount; i++)
    {
        if (rowValue.Length > 0)
            rowValue.Append("\",\"");
        else
            rowValue.Append("\"");
        rowValue.Append(reader.GetName(i).Replace("\"", "'").Trim());
    }
    rowValue.Append("\"" + Environment.NewLine);
    File.AppendAllText(soureFile, rowValue.ToString());
    while (reader.Read())
    {
        rowValue = new StringBuilder();
        for (int i = 0; i < reader.FieldCount; i++)
        {
            String value = "";
            switch (reader.GetFieldType(i).Name.ToLower())
            {
                case "int16":
                    value = reader.IsDBNull(i) ? "NULL" : reader.GetInt16(i).ToString();
                    break;
            }
            if (rowValue.Length > 0)
                rowValue.Append("\",=\""); // separate items
            else
                rowValue.Append("\""); // first item of the row
            rowValue.Append(value.Replace("\"", "'").Trim());
        }
        rowValue.Append("\"" + Environment.NewLine); // end of the row
        File.AppendAllText(soureFile, rowValue.ToString());
    }
    // Next record set.
    reader.NextResult();
    if (reader.HasRows)
        File.AppendAllText(soureFile, Environment.NewLine);
}
reader.Close();
The problem here is almost certainly that you are calling File.AppendAllText() for every row. Since AppendAllText opens, writes, then closes the file every time it is called, it can get quite slow.
A better way would be either to use the AppendText() method or else an explicit StreamWriter.
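As a sketch, here is the same loop structure with a single StreamWriter held open for the whole export (only the writing calls change; soureFile and rowValue are from the question's code):

using (var writer = new StreamWriter(soureFile)) // opened once, not per row
{
    while (reader.HasRows)
    {
        // ... build rowValue for the header exactly as before ...
        writer.Write(rowValue.ToString());

        while (reader.Read())
        {
            // ... build rowValue for the data row exactly as before ...
            writer.Write(rowValue.ToString());
        }

        reader.NextResult();
        if (reader.HasRows)
            writer.Write(Environment.NewLine);
    }
} // Dispose flushes and closes the file once, at the end.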

Finalise SQLite 3 statement

I'm developing a Metro app using the Windows 8 Release Preview and C# (VS 2012). I'm new to SQLite; I integrated SQLite 3.7.13 in my app and it is working fine. Observe my code below:
var dbPath = Path.Combine(Windows.Storage.ApplicationData.Current.LocalFolder.Path, "Test.db");
using (var db = new SQLite.SQLiteConnection(dbPath))
{
    var data = db.Table<tablename>().Where(tablename => tablename.uploaded_bool == false && tablename.Sid == 26);
    try
    {
        int iDataCount = data.Count();
        int id;
        if (iDataCount > 0)
        {
            for (int i = 0; i < iDataCount; i++)
            {
                Elements = data.ElementAt(i);
                id = Elements.id;
                /*
                Doing some code
                */
            }
            int i = db.Delete<tablename>(new tablename() { Sid = 26 });
        }
    }
    catch (Exception ex)
    {
    }
}
where "Sid" is column in my database and with number "26" i will get n number of rows
So, using a for loop i need to do some code and after the for loop I need to delete records of Sid(26) in database, So at this line
int i = db.Delete<tablename>(new tablename() { Sid = 26 });
I'm getting unable to close due to unfinalised statements exception, So my question is how to finalise the statement in sqlite3,Apparently SQLite3 has a finalize method for destroying previous DB calls but I am not sure how to implement this. Please help me.
Under the covers sqlite-net does some amazing things in an attempt to manage queries and connections for you.
For example, the line
var data = db.Table<tablename>().Where(...)
Does not actually establish a connection or execute anything against the database. Instead, it creates an instance of a class called TableQuery which is enumerable.
When you call
int iDataCount = data.Count();
TableQuery actually executes
GenerateCommand("count(*)").ExecuteScalar<int>();
When you call
Elements = data.ElementAt(i);
TableQuery actually calls
return Skip(index).Take(1).First();
Take(1).First() eventually calls GetEnumerator, which compiles a SQLite command, executes it with a LIMIT of 1, and deserializes the result back into your data class.
So, basically, every time you call data.ElementAt you are executing another query. This is different from standard .NET enumerations, where you are just accessing an element in a collection or array.
I believe this is the root of your problem. I would recommend that instead of getting the count and using a for (i, ...) loop, you simply do foreach (tablename tn in data). This causes all records to be fetched with a single query instead of record by record in the loop. That alone may be enough to close the query and allow you to delete the records during the loop. If not, I recommend you create a collection and add each record's id to it during the loop; then, after the loop, go back and remove those records in a second pass, as sketched below. Worst-case scenario, you could close the connection between the loops.
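A minimal sketch of that approach, assuming the question's tablename class with id as its primary key and sqlite-net's Table/Delete API:

var pendingIds = new List<int>();

// One query; rows are fetched as the enumeration proceeds.
foreach (tablename tn in db.Table<tablename>()
                           .Where(t => t.uploaded_bool == false && t.Sid == 26))
{
    // Doing some code ...
    pendingIds.Add(tn.id);
}

// Second pass: the enumeration above has completed, so its statement is
// finalised and the deletes can run without the exception.
foreach (var id in pendingIds)
{
    db.Delete<tablename>(id); // delete by primary key
}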
Hope that helps.

Why doesn't Lucene remove docs?

I'm using Lucene.NET 2.3.1 with a MultiSearcher.
For testing purposes, I'm indexing a database with about 10,000 rows. I have two indexes, and I select randomly which one to insert each row into. This works correctly, but as Lucene doesn't have update capabilities, I have to test whether the row exists (I have an Id field) and then delete it.
I have a List<IndexModifier>, a List<IndexReader>, and a List<IndexSearcher>, and each is populated with this code:
IndexModifier mod = new IndexModifier(path, new StandardAnalyzer(), create);
m_Modifiers.Add(mod);
m_Readers.Add(IndexReader.Open(path));
m_Searchers.Add(new IndexSearcher(path));
Now the delete code:
Hits results = m_Searcher.Search(new TermQuery(t));
for (int i = 0; i < results.Length(); i++)
{
    DocId = results.Id(i);
    Index = m_Searcher.SubSearcher(DocId);
    DocId = m_Searcher.SubDoc(DocId);
    m_Modifiers[Index].DeleteDocument(DocId);
}
The search is correct and I'm getting results when the row exists. SubSearcher always returns 0 or 1; if Index is 0, SubDoc returns the same ID that was passed in, and if it's 1, it returns roughly the ID passed in minus 5000 times the number of times I have indexed the DB. It seems as if nothing is being deleted.
Each time I index the database, I optimize and close the indices, and Luke says there are no pending deletions.
What could be the problem?
I am not sure what the end goal of this activity is, so pardon me if the following solution doesn't meet your requirements.
First, if you want to delete documents, you can use the IndexReader you have already created; IndexModifier is not required.
Second, you need not find the subsearcher ID and the document ID in that subsearcher. You can just as well use a top-level MultiReader. I would write the equivalent Java code as follows:
IndexReader[] readers = new IndexReader[size];
// Initialize readers
MultiReader multiReader = new MultiReader(readers);
IndexSearcher searcher = new IndexSearcher(multiReader);
Hits results = searcher.search(new TermQuery(t));
for (int i = 0; i < results.length(); i++) {
    int docID = results.id(i);
    multiReader.deleteDocument(docID);
}
multiReader.commit(); // Check if this throws an exception.
multiReader.close();
searcher.close();
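Since the question uses Lucene.NET, a rough C# translation of the same idea might look like this (method casing assumed from Lucene.NET 2.x's conventions; closing the reader flushes pending deletions):

IndexReader[] readers = new IndexReader[size];
// Initialize readers here.
MultiReader multiReader = new MultiReader(readers);
IndexSearcher searcher = new IndexSearcher(multiReader);
Hits results = searcher.Search(new TermQuery(t));
for (int i = 0; i < results.Length(); i++)
{
    int docId = results.Id(i);
    multiReader.DeleteDocument(docId);
}
multiReader.Close(); // commits the deletions
searcher.Close();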
