I am trying to quickly load the most recent exceptions using ErrorStore in StackExchange.Exceptional in C#.
Is there any way I can get all of the exceptions since a recent date without having to load and sort all of the exceptions? How can I speed up my code?
My current code gets the 100 most recent exceptions, but it is very slow.
var applications = new List<string> { loc1, loc2 };
var errors = new List<Error>();
applications.ForEach(app => ErrorStore.Default.GetAll(errors, app));
return errors.OrderByDescending(m => m.CreationDate).Take(100);
Here is the documentation for ErrorStore.
Your code is probably slow because of your third line of code:
applications.ForEach(app => ErrorStore.Default.GetAll(errors, app));
Here is ErrorStore.GetAll
public int GetAll(List<Error> errors, string applicationName = null)
{
if (_isInRetry)
{
errors.AddRange(WriteQueue);
return errors.Count;
}
try { return GetAllErrors(errors, applicationName); }
catch (Exception ex) { BeginRetry(ex); }
return 0;
}
And here is one implementation of GetAllErrors
protected override int GetAllErrors(List<Error> errors, string applicationName = null)
{
using (var c = GetConnection())
{
errors.AddRange(c.Query<Error>(@"
Select Top (@max) *
From Exceptions
Where DeletionDate Is Null
And ApplicationName = @ApplicationName
Order By CreationDate Desc", new { max = _displayCount, ApplicationName = applicationName.IsNullOrEmptyReturn(ApplicationName) }));
}
return errors.Count;
}
The SQL query in GetAllErrors selects up to _displayCount exceptions per application, not just the 100 you ultimately need, and your code then sorts the combined list in memory. Suppose each of your two applications has n retrievable exceptions; your third line of code then loads on the order of f(n) = 2n rows just to keep 100.
To speed up your program, you can write a method similar to GetAllErrors but modify the SQL query to select only 100 exceptions from the error database; then even if there are a billion exceptions, the query stops after finding the first 100 that match.
You could also move the filtering on application names into SQL, so the database returns one already-ordered result set and you do not need to re-sort by creation date in C#.
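A minimal sketch of that idea, assuming the same Exceptions schema and GetConnection helper as in the snippets above (the method name GetRecentErrors is mine, not part of StackExchange.Exceptional):
public IEnumerable<Error> GetRecentErrors(string[] applications, int count = 100)
{
using (var c = GetConnection())
{
// Dapper expands the @Applications list into an IN clause, so both the
// application filter and the Top limit run in SQL rather than in C#.
return c.Query<Error>(@"
Select Top (@max) *
From Exceptions
Where DeletionDate Is Null
And ApplicationName In @Applications
Order By CreationDate Desc", new { max = count, Applications = applications });
}
}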
Related
I am trying to get data from SQL Server using Dapper. I have a requirement to export 460K records stored in an Azure SQL database. I decided to get the data in batches, so I am getting 10K records in each batch. I planned to fetch the records in parallel, so I added the async methods to a list of tasks and did Task.WhenAll. The code works fine when I run it locally, but after deploying to a k8s cluster I am getting a data-read issue for some records. I am new to multithreading and I don't know how to handle this issue. I tried taking a lock inside the method, but then the system crashes. Below is my code; it might be clumsy because I was trying many solutions to fix the issue.
for (int i = 0; i < numberOfPages; i++)
{
tableviewWithCondition.startRow = startRow;
resultData.Add(_tableviewRepository.GetTableviewRowsByPagination(tableviewExportCondition.TableviewName, modelMappingGroups, tableviewWithCondition.startRow, builder, pageSize, appName, i));
startRow += tableviewWithCondition.pageSize;
}
foreach(var task in resultData)
{
if (task != null)
{
dataToExport.AddRange(task.Result);
}
}
This is the method I implemented to get data from the Azure SQL database using Dapper.
public async Task<(IEnumerable<int> unprocessedData, IEnumerable<dynamic> rowData)> GetTableviewRowsByPagination(string tableName, IEnumerable<MappingGroup> tableviewAttributeDetails,
int startRow, SqlBuilder builder, int pageSize = 100, AppNameEnum appName = AppNameEnum.OptiSoil, int taskNumber = 1)
{
var _unitOfWork = _unitOfWorkServices.Build(appName.ToString());
List<int> unprocessedData = new List<int>();
try
{
var columns = tableviewAttributeDetails.Select(c => { return $"{c.mapping_group_value} [{c.attribute}]"; });
var joinedColumn = string.Join(",", columns);
builder.Select(joinedColumn);
var selector = builder.AddTemplate($"SELECT /**select**/ FROM {tableName} with (nolock) /**innerjoin**/ /**where**/ /**orderby**/ OFFSET {startRow} ROWS FETCH NEXT {(pageSize == 0 ? 100 : pageSize)} ROWS ONLY");
using (var connection = _unitOfWork.Connection)
{
connection.Open();
var data = await connection.QueryAsync(selector.RawSql, selector.Parameters);
Console.WriteLine($"data completed for task{taskNumber}");
return (unprocessedData, data);
}
}
catch(Exception ex)
{
Console.WriteLine($"Exception: {ex.Message}");
if (ex.InnerException != null)
Console.WriteLine($"InnerException: {ex.InnerException.Message}");
Console.WriteLine($"Error in fetching from row {startRow}");
unprocessedData.Add(startRow);
return (unprocessedData, null);
}
finally
{
_unitOfWork.Dispose();
}
}
The above code works fine locally, but on the server I get the issue below.
Exception: A transport-level error has occurred when sending the request to the server. (provider: TCP Provider, error: 35 - An internal exception was caught).
InnerException: The WriteAsync method cannot be called when another write operation is pending.
How do I avoid this issue when fetching data in parallel tasks?
You're using the same connection and trying to execute multiple commands over it (I'm assuming this because of the naming). Also, should you be disposing the unit of work?
Rather than:
using (var connection = _unitOfWork.Connection)
{
connection.Open();
var data = await connection.QueryAsync(selector.RawSql, selector.Parameters);
Console.WriteLine($"data completed for task{taskNumber}");
return (unprocessedData, data);
}
Create a new connection for each item, if this is what you truly want to do. I imagine (and this is an educated guess) that it's working locally because of timing.
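For illustration, a minimal sketch of one connection per query (the ConnectionString property on the unit of work is an assumption; use whatever your unit of work actually exposes):
// Each task opens its own SqlConnection, so no two parallel commands
// ever share a single socket.
using (var connection = new SqlConnection(_unitOfWork.ConnectionString))
{
await connection.OpenAsync();
var data = await connection.QueryAsync(selector.RawSql, selector.Parameters);
return (unprocessedData, data);
}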
Also look into Task.WhenAll; it's a better way to collect all the results. Rather than:
foreach(var task in resultData)
{
if (task != null)
{
dataToExport.AddRange(task.Result);
}
}
Calling .Result on a task is usually bad practice; prefer awaiting the whole set, as in the sketch below.
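A rough sketch of that, reusing the resultData list of tasks built in the question (the tuple member names match the method signature shown above):
// Await every page task at once instead of blocking on .Result.
var batches = await Task.WhenAll(resultData);
foreach (var (unprocessedData, rowData) in batches)
{
if (rowData != null)
dataToExport.AddRange(rowData);
}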
I'm trying to retrieve the most recent events from an event log. I found this answer, Read Event Log From Newest to Oldest, but it involves loading the whole log, which I wish to avoid.
So I tried going to the end of the log first, like this:
var eventLog = EventLog.GetEventLogs().OfType<EventLog>().Single(el => el.Log == "Security");
int logSize = eventLog.Entries.Count;
var lastScanTime = DateTime.Now.AddDays(-1);
for (int i = logSize - 1; i >= 0; i--)
{
var entry = eventLog.Entries[i];
if (entry.TimeWritten <= lastScanTime)
{
break;
}
entry.Dump();
}
This works for a while, but if I keep running it I start to get IndexOutOfRangeExceptions.
It seems that eventLog.Entries.Count no longer matches the number of entries, which you can check with the following code:
var eventLog = EventLog.GetEventLogs().OfType<EventLog>().Single(el => el.Log == "Security");
int size = eventLog.Entries.Count.Dump();
int size2 = eventLog.Entries.Cast<EventLogEntry>().ToList().Count.Dump();
The two size values return different results: Entries.Count keeps growing, but the number of entries in the list stops increasing. The count also matches what I see in the Windows Event Log viewer.
It's as if something in the .NET API breaks and no more events are available.
Has anybody seen this before, or have any fixes to get this approach to work?
Edit: I also found this: A mystery IndexOutOfRange Exception recurring when reading from a full EventLog, which seems like a similar problem but has no solution. I can't use events because I may be looking at activity on the machine from before my software was installed.
Edit: If I catch exceptions and put the results into a list I get a different number of events again:
var results = new List<EventLogEntry>();
for (int i = eventLog.Entries.Count - 1; i >= 0; i--)
{
try
{
var entry = eventLog.Entries[i];
results.Add(entry);
}
catch (Exception ex) { }
}
It has a lower count than Entries.Count, but it contains more recent log entries than the Entries collection, which seems to have just stopped at a certain point.
In case anyone else has the same issue: it seems to be the result of not always disposing of the EventLog object after reading from it.
Once this happens, it seems to break or corrupt the log in some way, and you can't get more recent data until you clear the file out.
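For illustration, a minimal sketch of the fix (EventLog implements IDisposable through Component, so a using block releases the handle after each read):
using (var eventLog = new EventLog("Security"))
{
var lastScanTime = DateTime.Now.AddDays(-1);
// Walk backwards from the newest entry; disposing the log between runs
// keeps Entries.Count consistent with the actual log.
for (int i = eventLog.Entries.Count - 1; i >= 0; i--)
{
var entry = eventLog.Entries[i];
if (entry.TimeWritten <= lastScanTime)
break;
// process entry...
}
}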
I have inherited a WCF web service application that requires much better error tracking. What we do is query data from one system (AcuODBC) and send that data to another system (Salesforce). This query can return tens of thousands of complex objects as a List<T>. We then process this List<T> in batches of 200 records at a time to map the fields to another object type, then send that batch to Salesforce. After this is completed, the next batch starts. Here's a brief example:
int intStart = 0, intEnd = 200;
//done in a loop, snipped for brevity
var leases = from i in trleases.GetAllLeases(branch).Skip(intStart).Take(intEnd)
select new sforceObject.SFDC_Lease() {
LeaseNumber = i.LeaseNumber.ToString(),
AccountNumber = i.LeaseCustomer,
Branch = i.Branch
(...) //about 150 properties
};
//do stuff with list and increment to next batch
intStart += 200;
However, the problem is that if one object has a bad field mapping (InvalidCastException), I would like to print the object that failed to a log.
Question
Is there any way I can decipher which object of the 200 threw the exception? I could forgo the batch concept that was given to me, but I'd rather avoid that if possible for performance reasons.
This should accomplish what you are looking for with very minor code changes:
int intStart = 0, intEnd = 200, count = 0;
List<SFDC_Lease> leases = new List<SFDC_Lease>();
//done in a loop, snipped for brevity
foreach(var i in trleases.GetAllLeases(branch).Skip(intStart).Take(intEnd)) {
try {
count++;
leases.Add(new sforceObject.SFDC_Lease() {
LeaseNumber = i.LeaseNumber.ToString(),
AccountNumber = i.LeaseCustomer,
Branch = i.Branch
(...) //about 150 properties
});
} catch (Exception ex) {
// you now have your culprit, either as 'i' or from the index 'count'
}
}
//do stuff with 'leases' and increment to next batch
intStart += 200;
I think you could use a flag in each property setter of the SFDC_Lease class, with a static property, like this:
public class SFDC_Lease
{
public static string LastPropertySet;
private string _leaseNumber;
public string LeaseNumber
{
get { return _leaseNumber; }
set
{
LastPropertySet = "LeaseNumber";
_leaseNumber = value;
}
}
}
Please, feel free to improve this design.
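For illustration, a hedged usage sketch of that design (MapLease is a hypothetical helper performing the ~150 assignments; note the static field is shared state, so this only works when batches are mapped on a single thread):
try
{
leases.Add(MapLease(i)); // MapLease: hypothetical mapping helper
}
catch (InvalidCastException ex)
{
// The static flag records which property was being assigned when the cast failed.
Console.WriteLine($"Mapping failed while setting {SFDC_Lease.LastPropertySet}: {ex.Message}");
}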
I need to run 4 stored procedures and create queries from the data received. There are about 5k items in the q container, so it is 20k executions of stored procedures. I use LINQ to SQL to connect to the DB and execute them, and it works just great with a normal foreach loop, but there is one problem: the code takes about an hour to complete. That is way too long, so I tried writing Parallel.ForEach instead of the normal foreach loop. The code crashes after a few iterations; I guess the LINQ connection just doesn't get on with Parallel. Any ideas how to run LINQ stored procedures on multiple threads?
var dataCollector = new EpmDataCollector();
Parallel.ForEach(q, history =>
{
try
{
var queriesBefore = dataCollector.GetQueries().Count;
var weight = dataCollector.CreateProjectQuery(history);//function executes stored procedure and creates queries from data received, then adds them to container (ConcurrentBag) in dataCollector
dataCollector.CreateHoursQuery(history);//like above
dataCollector.CreateCostQuery(history);//same
dataCollector.CreateIncomeQuery(history);//same
var log = ...
Global.log.Info(log);
//i++;
Interlocked.Increment(ref i);
if (i % 10 == 0)
{
//calculate and log estimation time
}
}
catch (Exception ex)
{
//catch code
}
});
System.Data.Linq.DataContext class is not thread safe.
Reference: https://msdn.microsoft.com/en-us/library/system.data.linq.datacontext(v=vs.110).aspx
Any public static members of this type are thread safe. Any instance members are not guaranteed to be thread safe.
That's why you have to create a new instance of DataContext within the ForEach loop.
Also, I'd rather look into SqlBulkCopy (https://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlbulkcopy(v=vs.110).aspx), which is specifically designed to handle thousands of inserts.
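For illustration, a minimal SqlBulkCopy sketch (the table name and DataTable are assumptions, not from the question):
// Stream the accumulated rows into the target table in one bulk operation.
using (var bulkCopy = new SqlBulkCopy(connectionString))
{
bulkCopy.DestinationTableName = "dbo.ProjectQueries"; // hypothetical target table
bulkCopy.BatchSize = 5000;
bulkCopy.WriteToServer(dataTable); // dataTable: rows collected by the workers
}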
Move the creation of your EpmDataCollector into the loop like so:
Parallel.ForEach(q, history =>
{
try
{
var dataCollector = new EpmDataCollector();
var queriesBefore = dataCollector.GetQueries().Count;
var weight = dataCollector.CreateProjectQuery(history);//function executes stored procedure and creates queries from data received, then adds them to container (ConcurrentBag) in dataCollector
dataCollector.CreateHoursQuery(history);//like above
dataCollector.CreateCostQuery(history);//same
dataCollector.CreateIncomeQuery(history);//same
var log = ...
Global.log.Info(log);
//i++;
Interlocked.Increment(ref i);
if (i % 10 == 0)
{
//calculate and log estimation time
}
}
catch (Exception ex)
{
//catch code
}
});
Seeking some advice, best practice etc...
Technology: C# .NET4.0, Winforms, 32 bit
I am seeking some advice on how I can best tackle large data processing in my C# Winforms application, which experiences high memory usage (working set) and the occasional OutOfMemoryException.
The problem is that we perform a large amount of data processing "in-memory" when a "shopping-basket" is opened. In simplistic terms, when a "shopping-basket" is loaded we perform the following calculations:
For each item in the "shopping-basket", retrieve its historical price going all the way back to the date the item first appeared in stock (could be two months, two years or two decades of data). Historical price data is retrieved from text files, over the internet, or from any format supported by a price plugin.
For each item, for each day since it first appeared in stock, calculate various metrics which build up a historical profile of each item in the shopping-basket.
The result is that we can potentially perform hundreds, thousands and/or millions of calculations depending upon the number of items in the "shopping-basket". If the basket contains too many items, we run the risk of hitting an "OutOfMemory" exception.
A couple of caveats:
This data needs to be calculated for each item in the "shopping-basket" and the data is kept until the "shopping-basket" is closed.
Even though we perform steps 1 and 2 in a background thread, speed is important, as the number of items in the "shopping-basket" can greatly affect overall calculation speed.
Memory is reclaimed by the .NET garbage collector when a "shopping-basket" is closed. We have profiled our application and ensured that all references are correctly disposed and closed when a basket is closed.
After all the calculations are completed, the resultant data is stored in an IDictionary. CalculatedData is a class object whose properties are individual metrics calculated by the above process.
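For illustration only, the storage described might look like this (the per-item string key is an assumption):
// One CalculatedData instance (the per-item metrics) per basket item.
IDictionary<string, CalculatedData> basketMetrics = new Dictionary<string, CalculatedData>();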
Some ideas I've thought about:
Obviously my main concern is to reduce the amount of memory being used by the calculations; however, the volume of memory used can only be reduced if I
1) reduce the number of metrics being calculated for each day, or
2) reduce the number of days used for the calculation.
Neither of these options is viable if we wish to fulfill our business requirements.
Memory Mapped Files
One idea has been to use memory-mapped files to store the data dictionary. Would this be possible/feasible, and how could we put it into place?
Use a temporary database
The idea is to use a separate (not in-memory) database which can be created for the life-cycle of the application. As "shopping-baskets" are opened we can persist the calculated data to the database for repeated use, alleviating the requirement to recalculate for the same "shopping-basket".
Are there any other alternatives that we should consider? What is best practice when it comes to calculations on large data and performing them outside of RAM?
Any advice is appreciated....
The easiest solution is a database, perhaps SQLite. Memory-mapped files don't automatically become dictionaries; you would have to code all the memory management yourself, and thereby fight the .NET GC system itself for ownership of the data.
If you're interested in trying the memory-mapped file approach, you can try it now. I wrote a small .NET package called MemMapCache that in essence creates a key/value database backed by memory-mapped files. It's a bit of a hacky concept, but the MemMapCache.exe process keeps all of the references to the memory-mapped files, so that if your application crashes you don't have to worry about losing the state of your cache.
It's very simple to use and you should be able to drop it in your code without too many modifications. Here is an example using it: https://github.com/jprichardson/MemMapCache/blob/master/TestMemMapCache/MemMapCacheTest.cs
Maybe it will at least help you figure out what you need to do for an actual solution.
Please let me know if you do end up using it. I'd be interested in your results.
However, long-term, I'd recommend Redis.
As an update for those stumbling upon this thread...
We ended up using SQLite as our caching solution. The SQLite database we employ exists separately from the main data store used by the application. We persist calculated data to the SQLite disk cache as it's required, and have code controlling cache invalidation etc. This was a suitable solution for us, as we were able to achieve write speeds of around 100,000 records per second.
For those interested, this is the code that controls inserts into the diskCache. Full credit for this code goes to JP Richardson (shown answering a question here) for his excellent blog post.
internal class SQLiteBulkInsert
{
#region Class Declarations
private SQLiteCommand m_cmd;
private SQLiteTransaction m_trans;
private readonly SQLiteConnection m_dbCon;
private readonly Dictionary<string, SQLiteParameter> m_parameters = new Dictionary<string, SQLiteParameter>();
private uint m_counter;
private readonly string m_beginInsertText;
#endregion
#region Constructor
public SQLiteBulkInsert(SQLiteConnection dbConnection, string tableName)
{
m_dbCon = dbConnection;
m_tableName = tableName;
var query = new StringBuilder(255);
query.Append("INSERT INTO ["); query.Append(tableName); query.Append("] (");
m_beginInsertText = query.ToString();
}
#endregion
#region Allow Bulk Insert
private bool m_allowBulkInsert = true;
public bool AllowBulkInsert { get { return m_allowBulkInsert; } set { m_allowBulkInsert = value; } }
#endregion
#region CommandText
public string CommandText
{
get
{
if(m_parameters.Count < 1) throw new SQLiteException("You must add at least one parameter.");
var sb = new StringBuilder(255);
sb.Append(m_beginInsertText);
foreach(var param in m_parameters.Keys)
{
sb.Append('[');
sb.Append(param);
sb.Append(']');
sb.Append(", ");
}
sb.Remove(sb.Length - 2, 2);
sb.Append(") VALUES (");
foreach(var param in m_parameters.Keys)
{
sb.Append(m_paramDelim);
sb.Append(param);
sb.Append(", ");
}
sb.Remove(sb.Length - 2, 2);
sb.Append(")");
return sb.ToString();
}
}
#endregion
#region Commit Max
private uint m_commitMax = 25000;
public uint CommitMax { get { return m_commitMax; } set { m_commitMax = value; } }
#endregion
#region Table Name
private readonly string m_tableName;
public string TableName { get { return m_tableName; } }
#endregion
#region Parameter Delimiter
private const string m_paramDelim = ":";
public string ParamDelimiter { get { return m_paramDelim; } }
#endregion
#region AddParameter
public void AddParameter(string name, DbType dbType)
{
var param = new SQLiteParameter(m_paramDelim + name, dbType);
m_parameters.Add(name, param);
}
#endregion
#region Flush
public void Flush()
{
try
{
if (m_trans != null) m_trans.Commit();
}
catch (Exception ex)
{
throw new Exception("Could not commit transaction. See InnerException for more details", ex);
}
finally
{
if (m_trans != null) m_trans.Dispose();
m_trans = null;
m_counter = 0;
}
}
#endregion
#region Insert
public void Insert(object[] paramValues)
{
if (paramValues.Length != m_parameters.Count)
throw new Exception("The values array count must be equal to the count of the number of parameters.");
m_counter++;
if (m_counter == 1)
{
if (m_allowBulkInsert) m_trans = m_dbCon.BeginTransaction();
m_cmd = m_dbCon.CreateCommand();
foreach (var par in m_parameters.Values)
m_cmd.Parameters.Add(par);
m_cmd.CommandText = CommandText;
}
var i = 0;
foreach (var par in m_parameters.Values)
{
par.Value = paramValues[i];
i++;
}
m_cmd.ExecuteNonQuery();
if (m_counter == m_commitMax)
{
try
{
if(m_trans != null) m_trans.Commit();
}
catch(Exception)
{ }
finally
{
if(m_trans != null)
{
m_trans.Dispose();
m_trans = null;
}
m_counter = 0;
}
}
}
#endregion
}
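A short usage sketch for the class above (table and column names are illustrative, not from the original post):
var bulk = new SQLiteBulkInsert(connection, "CalculatedData");
bulk.AddParameter("ItemId", DbType.String);
bulk.AddParameter("Metric", DbType.Double);
foreach (var row in rows)
bulk.Insert(new object[] { row.ItemId, row.Metric });
// Commit whatever remains in the final, partially filled batch.
bulk.Flush();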