SqlDependency triggers on uncommitted data - C#

We have an import / bulk copy batch that runs in the middle of the night, something like:
using (var tx = new TransactionScope(TransactionScopeOption.Required, TimeSpan.FromMinutes(1)))
{
//Delete existing
...
//Bulk copy to temp table
...
//Insert
...
tx.Complete();
}
We also have a cache that uses SqlDependency. But it already triggers on the Delete statement, leaving the cache empty for a few seconds between the delete and the reinsert. Can I configure SqlDependency to only listen to committed data?
SqlDependency code:
private IEnumerable<TEntity> RegisterAndFetch(IBusinessContext context)
{
var dependency = new SqlDependency();
dependency.OnChange += OnDependencyChanged;
try
{
CallContext.SetData("MS.SqlDependencyCookie", dependency.Id);
var refreshed = OnCacheRefresh(context, data.AsEnumerable());
var result = refreshed.ToArray();
System.Diagnostics.Debug.WriteLine("{0} - CacheChanged<{1}>: {2}", DateTime.Now, typeof(TEntity).Name, result.Length);
return result;
}
finally
{
CallContext.SetData("MS.SqlDependencyCookie", null);
}
}
OnDependencyChanged basically calls the RegisterAndFetch method above.
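For reference, a minimal sketch of what that handler could look like (the _context and _cachedData fields are hypothetical; the question does not show them):
private void OnDependencyChanged(object sender, SqlNotificationEventArgs e)
{
    // SqlDependency notifications fire only once, so detach the old handler
    // before registering a new dependency.
    ((SqlDependency)sender).OnChange -= OnDependencyChanged;
    // Re-register and reload the cached data (field names are illustrative).
    _cachedData = RegisterAndFetch(_context);
}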
Query:
protected override IEnumerable<BankHoliday> OnCacheRefresh(IBusinessContext context, IEnumerable<BankHoliday> currentData)
{
var firstOfMonth = DateTime.Now.FirstDayOfMonth(); // <-- Must be a constant in the query
return context.QueryOn<BankHoliday>().Where(bh => bh.Holiday >= firstOfMonth);
}

Related

TransactionScope with ReadUncommitted Isolation Level in a Recursive function locks table for first iteration and then releases table

I have a function that is being called recursively. In this function, a TransactionScope object with isolation level ReadUncommitted is being used. The strange behaviour is that the first time it hits the database it locks the table, and after the completion of the first iteration it releases the table and the behaviour is fine. Can anyone explain why? Please note that I'm using core ADO.NET, no ORM. My function looks like the following:
public void ExportTransactionsData()
{
var now = DateTime.Now;
var slash = ConfigurationReader.PathCharacterToUse;
var startId = 0L;
var endId = 0L;
var hasMoreRecords = false;
using (var transScope = UtilityMethods.TransactionScope(IsolationLevel.ReadUncommitted, TransactionScopeOption.Required))
{
using (var stream = _provider.GetTransactionsXml(out startId, out endId, out hasMoreRecords))
{
if (stream.Length > 0)
{
var filePath = $"{ConfigurationReader.TransactionFilePath}{slash}EZFareTransactions_{startId}_{endId}_{now.ToString("yyyyMMddTHHmmss")}.xml";
using (IUploadClient client = UploadClientFactory.Create(ConfigurationReader.UploadType))
{
client.Upload(filePath, stream);
}
}
}
transScope.Complete();
}
if (hasMoreRecords)
{
ExportTransactionsData();
}
}

ExecuteAsync() of Azure Table Storage failing to insert all the records

I am trying to insert 10000 records into Azure Table Storage. I am using ExecuteAsync() to achieve it, but somehow only around 7500 records are inserted and the rest are lost. I am purposely not using the await keyword because I don't want to wait for the result, just store the records in the table. Below is my code snippet.
private static async void ConfigureAzureStorageTable()
{
CloudStorageAccount storageAccount =
CloudStorageAccount.Parse(CloudConfigurationManager.GetSetting("StorageConnectionString"));
CloudTableClient tableClient = storageAccount.CreateCloudTableClient();
TableResult result = new TableResult();
CloudTable table = tableClient.GetTableReference("test");
table.CreateIfNotExists();
for (int i = 0; i < 10000; i++)
{
var verifyVariableEntityObject = new VerifyVariableEntity()
{
ConsumerId = String.Format("{0}", i),
Score = String.Format("{0}", i * 2 + 2),
PartitionKey = String.Format("{0}", i),
RowKey = String.Format("{0}", i * 2 + 2)
};
TableOperation insertOperation = TableOperation.Insert(verifyVariableEntityObject);
try
{
table.ExecuteAsync(insertOperation);
}
catch (Exception e)
{
Console.WriteLine(e.Message);
}
}
}
Is anything incorrect with the usage of the method?
You still want to await table.ExecuteAsync(). That will mean that ConfigureAzureStorageTable() returns control to the caller at that point, which can continue executing.
The way you have it in the question, ConfigureAzureStorageTable() is going to continue past the call to table.ExecuteAsync() and exit, and things like table will go out of scope, while the table.ExecuteAsync() task is still not complete.
There are plenty of caveats about using async void on SO and elsewhere that you will also need to consider. You could just as easily make your method async Task and not await it in the caller immediately, keeping the returned Task around for clean termination, etc.
Edit: one addition - you almost certainly want to use ConfigureAwait(false) on your await there, as you don't appear to need to preserve any context. This blog post has some guidelines on that and async in general.
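For illustration, a minimal sketch of the method reworked to return a Task and await each insert (same types and names as in the question; only the signature and the await are new):
private static async Task ConfigureAzureStorageTableAsync()
{
    CloudStorageAccount storageAccount =
        CloudStorageAccount.Parse(CloudConfigurationManager.GetSetting("StorageConnectionString"));
    CloudTableClient tableClient = storageAccount.CreateCloudTableClient();
    CloudTable table = tableClient.GetTableReference("test");
    table.CreateIfNotExists();
    for (int i = 0; i < 10000; i++)
    {
        var entity = new VerifyVariableEntity()
        {
            ConsumerId = String.Format("{0}", i),
            Score = String.Format("{0}", i * 2 + 2),
            PartitionKey = String.Format("{0}", i),
            RowKey = String.Format("{0}", i * 2 + 2)
        };
        // Awaiting keeps the method alive until the insert completes;
        // ConfigureAwait(false) because no synchronization context needs to be resumed.
        await table.ExecuteAsync(TableOperation.Insert(entity)).ConfigureAwait(false);
    }
}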
Based on your requirement, I have tested your scenario on my side using CloudTable.ExecuteAsync and CloudTable.ExecuteBatchAsync successfully. Here is my code snippet that groups the records into batch operations and inserts them into Azure Table Storage; you could refer to it.
Program.cs Main
class Program
{
static void Main(string[] args)
{
CloudStorageAccount storageAccount =
CloudStorageAccount.Parse(CloudConfigurationManager.GetSetting("StorageConnectionString"));
CloudTableClient tableClient = storageAccount.CreateCloudTableClient();
TableResult result = new TableResult();
CloudTable table = tableClient.GetTableReference("test");
table.CreateIfNotExists();
//Generate records to be inserted into Azure Table Storage
var entities = Enumerable.Range(1, 10000).Select(i => new VerifyVariableEntity()
{
ConsumerId = String.Format("{0}", i),
Score = String.Format("{0}", i * 2 + 2),
PartitionKey = String.Format("{0}", i),
RowKey = String.Format("{0}", i * 2 + 2)
});
//Group records by PartitionKey and prepare for executing batch operations
var batches = TableBatchHelper<VerifyVariableEntity>.GetBatches(entities);
//Execute batch operations in parallel
Parallel.ForEach(batches, new ParallelOptions()
{
MaxDegreeOfParallelism = 5
}, (batchOperation) =>
{
try
{
table.ExecuteBatch(batchOperation);
Console.WriteLine("Writing {0} records", batchOperation.Count);
}
catch (Exception ex)
{
Console.WriteLine("ExecuteBatch throw a exception:" + ex.Message);
}
});
Console.WriteLine("Done!");
Console.WriteLine("Press any key to exit...");
Console.ReadKey();
}
}
TableBatchHelper.cs
public class TableBatchHelper<T> where T : ITableEntity
{
const int batchMaxSize = 100;
public static IEnumerable<TableBatchOperation> GetBatches(IEnumerable<T> items)
{
var list = new List<TableBatchOperation>();
var partitionGroups = items.GroupBy(arg => arg.PartitionKey).ToArray();
foreach (var group in partitionGroups)
{
T[] groupList = group.ToArray();
int offSet = batchMaxSize;
T[] entities = groupList.Take(offSet).ToArray();
while (entities.Any())
{
var tableBatchOperation = new TableBatchOperation();
foreach (var entity in entities)
{
tableBatchOperation.Add(TableOperation.InsertOrReplace(entity));
}
list.Add(tableBatchOperation);
entities = groupList.Skip(offSet).Take(batchMaxSize).ToArray();
offSet += batchMaxSize;
}
}
return list;
}
}
Note: As mentioned in the official document about inserting a batch of entities:
A single batch operation can include up to 100 entities.
All entities in a single batch operation must have the same partition key.
In summary, please check whether this works on your side. Also, you could capture the detailed exception within your console application and use Fiddler to catch the failing HTTP requests when inserting records into Azure Table Storage.
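As a sketch of the "detailed exception" part, catching StorageException exposes the HTTP status and the storage error message (assuming the WindowsAzure.Storage client used above):
try
{
    table.ExecuteBatch(batchOperation);
}
catch (StorageException ex)
{
    // RequestInformation carries the HTTP status code and, when present,
    // the storage error message explaining why the batch was rejected.
    var info = ex.RequestInformation;
    Console.WriteLine("Status: {0}, Error: {1}",
        info.HttpStatusCode,
        info.ExtendedErrorInformation != null ? info.ExtendedErrorInformation.ErrorMessage : ex.Message);
}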
How about using a TableBatchOperation to run batches of N inserts at once?
private const int BatchSize = 100;
private static async void ConfigureAzureStorageTable()
{
CloudStorageAccount storageAccount =
CloudStorageAccount.Parse(CloudConfigurationManager.GetSetting("StorageConnectionString"));
CloudTableClient tableClient = storageAccount.CreateCloudTableClient();
TableResult result = new TableResult();
CloudTable table = tableClient.GetTableReference("test");
table.CreateIfNotExists();
var batchOperation = new TableBatchOperation();
for (int i = 0; i < 10000; i++)
{
var verifyVariableEntityObject = new VerifyVariableEntity()
{
ConsumerId = String.Format("{0}", i),
Score = String.Format("{0}", i * 2 + 2),
PartitionKey = String.Format("{0}", i),
RowKey = String.Format("{0}", i * 2 + 2)
};
TableOperation insertOperation = TableOperation.Insert(verifyVariableEntityObject);
batchOperation.Add(insertOperation);
if (batchOperation.Count >= BatchSize)
{
try
{
await table.ExecuteBatchAsync(batchOperation);
batchOperation = new TableBatchOperation();
}
catch (Exception e)
{
Console.WriteLine(e.Message);
}
}
}
if(batchOperation.Count > 0)
{
try
{
await table.ExecuteBatchAsync(batchOperation);
}
catch (Exception e)
{
Console.WriteLine(e.Message);
}
}
}
You can adjust BatchSize to what you need. Small disclaimer: I didn't try to run this, though it should work.
But I can't help but wonder why your function is async void? That should be reserved for event handlers and similar cases where you cannot choose the signature. In most cases you want to return a Task, because as it stands the caller cannot catch exceptions that occur in this function.
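A minimal sketch of the difference (names are illustrative, not from the question): with async Task the caller can await and observe failures; with async void it cannot.
private static async Task InsertBatchAsync(CloudTable table, TableBatchOperation batch)
{
    await table.ExecuteBatchAsync(batch);
}
private static async Task CallerAsync(CloudTable table, TableBatchOperation batch)
{
    try
    {
        // Exceptions thrown by the insert propagate to this try/catch.
        await InsertBatchAsync(table, batch);
    }
    catch (StorageException e)
    {
        // With an async void method this handler would never see the exception.
        Console.WriteLine(e.Message);
    }
}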
async void is not good practice unless it is an event handler.
https://msdn.microsoft.com/en-us/magazine/jj991977.aspx
If you plan to insert many records into Azure Table Storage, batch insert is your best bet.
https://msdn.microsoft.com/en-us/library/azure/microsoft.windowsazure.storage.table.tablebatchoperation.aspx
Keep in mind that it has a limit of 100 table operations per batch.
I had the same issue and fixed it by forcing ExecuteAsync to wait for the result before the method exits:
table.ExecuteAsync(insertOperation).GetAwaiter().GetResult()
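Inside the question's loop that would look roughly like this (a sketch; note that blocking on GetResult() makes the insert synchronous rather than truly asynchronous):
TableOperation insertOperation = TableOperation.Insert(verifyVariableEntityObject);
// Blocks the calling thread until the insert finishes, so the method no longer
// exits while operations are still in flight.
table.ExecuteAsync(insertOperation).GetAwaiter().GetResult();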

AutoCad NET use EntLast with asynchronous command

As I discovered in a previous question:
AutoCad Command Rejected "Undo" when using Application.Invoke()
it appears commands such as c:wd_insym2 (an AutoCAD Electrical command) cannot be sent synchronously, as they call additional commands such as Undo, causing them to fail.
However, I need to store the EntityID of the entity I have just created with the command, using either the Lisp (entlast) or Autodesk.AutoCad.Internal.Utils.EntLast(). Obviously if I send my command asynchronously this will not give me the correct result.
Maxence suggested using the doc.CommandEnded handler; however, I cannot imagine how this will fit into my program flow, as I need to execute each command individually and then store the new EntityID in a .NET variable.
Is there ANY way for me to either send such commands synchronously without running into reentrancy issues, or alternatively send commands asynchronously and wait for them to execute before continuing?
Have you tried Editor.CommandAsync (AutoCAD 2015 and later):
[CommandMethod("CMD1")]
public async void Command1()
{
Document doc = Application.DocumentManager.MdiActiveDocument;
Editor ed = doc.Editor;
await ed.CommandAsync("_CMD2");
ed.WriteMessage("Last entity handle: {0}", Utils.EntLast().Handle);
}
[CommandMethod("CMD2")]
public void Command2()
{
Document doc = Application.DocumentManager.MdiActiveDocument;
Database db = doc.Database;
using (Transaction tr = db.TransactionManager.StartTransaction())
{
var line = new Line(new Point3d(), new Point3d(10, 20, 30));
var currentSpace = (BlockTableRecord) tr.GetObject(db.CurrentSpaceId, OpenMode.ForWrite);
currentSpace.AppendEntity(line);
tr.AddNewlyCreatedDBObject(line, true);
tr.Commit();
}
}
If you want to do this in an older version of AutoCAD, it will be more complicated:
List<ObjectId> ids;
[CommandMethod("CMD1")]
public void Cmd1()
{
Document doc = Application.DocumentManager.MdiActiveDocument;
ids = new List<ObjectId>();
doc.CommandEnded += Doc_CommandEnded;
doc.SendStringToExecute("_CMD2 0 ", false, false, true);
}
private void Doc_CommandEnded(object sender, CommandEventArgs e)
{
if (e.GlobalCommandName != "CMD2") return;
ids.Add(Utils.EntLast());
var doc = (Document) sender;
if (ids.Count < 10)
{
double angle = ids.Count * Math.PI / 10;
doc.SendStringToExecute("_CMD2 " + Converter.AngleToString(angle) + "\n", false, false, true);
}
else
{
doc.CommandEnded -= Doc_CommandEnded;
doc.Editor.WriteMessage("\nHandles: {0}", string.Join(", ", ids.Select(id => id.Handle.ToString())));
}
}
[CommandMethod("CMD2")]
public void Cmd2()
{
Document doc = Application.DocumentManager.MdiActiveDocument;
Database db = doc.Database;
PromptDoubleResult pdr = doc.Editor.GetAngle("\nAngle: ");
if (pdr.Status == PromptStatus.Cancel) return;
using (Transaction tr = db.TransactionManager.StartTransaction())
{
var line = new Line(new Point3d(), new Point3d(Math.Cos(pdr.Value), Math.Sin(pdr.Value), 0));
var currentSpace = (BlockTableRecord) tr.GetObject(db.CurrentSpaceId, OpenMode.ForWrite);
currentSpace.AppendEntity(line);
tr.AddNewlyCreatedDBObject(line, true);
tr.Commit();
}
}

Threading issues with SQLite3 and C# async

I am trying to save some data to a SQLite3 database. If I do not use async, I can save the data without any problems. As soon as I try to use the following code however, I receive the following error:
{Unable to evaluate expression because the code is optimized or a native frame is on top of the call stack.}
From my UI, I invoke the following SyncDomainTablesAsync method:
private readonly IDataCoordinator _coordinator;
public Configuration(IDataCoordinator coordinator)
{
_coordinator = coordinator;
}
public async Task<int> SyncDomainTablesAsync(IProgress<string> progress, CancellationToken ct, DateTime? lastDateSynced=null, string tableName = null)
{
//Determine the different type of sync calls
// 1) Force Resync (Drop/Create Tables and Insert)
// 2) Auto Update
var domainTable = await GetDomainTablesAsync(progress,ct,lastDateSynced, tableName);
var items = domainTable.Items;
int processCount = await Task.Run<int>( async () =>
{
int p = 0;
progress.Report(String.Format("Syncing Configurations..."));
foreach (var item in items)
{
progress.Report(String.Format("Syncing {0} Information",item.Key));
var task = await SyncTableAsync(item.Value); // INVOKED BELOW
if (task) progress.Report(String.Format("Sync'd {0} {1} records", item.Value.Count,item.Key));
if (ct.IsCancellationRequested) goto Cancelled;
p += item.Value.Count;
}
Cancelled:
if (ct.IsCancellationRequested)
{
//Update Last Sync'd Records
progress.Report(String.Format("Canceling Configuration Sync..."));
ct.ThrowIfCancellationRequested();
}
else
progress.Report(String.Format("Syncing Configurations Compleleted"));
return p;
},ct);
return processCount;
}
private async Task<bool> SyncTableAsync(IEnumerable<object> items, bool includeRelationships = false)
{
try
{
//TODO: Replace with SaveObjects method
var i = await Task.Run(() => _coordinator.SaveObjects(items, includeRelationships));
if (i == 0)
return false;
}
catch(Exception ex)
{
return false;
}
return true;
}
The UI invokes the SyncDomainTablesAsync method. I then create a new Task and loop through the items that were returned from the GetDomainTablesAsync method. During each iteration I await until the SyncTableAsync method completes. Within SyncTableAsync I am calling the SaveObjects method of a class that implements my IDataCoordinator interface.
public override int SaveObjects(IEnumerable<object> items, Type underlyingType, bool saveRelationships = true)
{
int result = 0;
if (items == null)
throw new ArgumentNullException("Can not save collection of objects. The collection is null.");
else if (items.Count() == 0)
return 0;
// Check if table exists.
foreach (var item in items)
this.CreateTable(item.GetType(), saveRelationships);
using (SQLiteConnection connection = new SQLiteConnection(this.StorageContainerPath))
{
connection.BeginTransaction();
foreach (var item in items)
{
result += ProcessSave(item, saveRelationships, connection);
}
try
{
connection.Commit();
}
catch (SQLiteException ex)
{
connection.Rollback();
throw ex;
}
}
return result;
}
public override int CreateTable(Type type, bool createRelationalTables = false)
{
if (this.TableExists(type) == 1)
return 1;
using (SQLiteConnection cn = new SQLiteConnection(this.StorageContainerPath))
{
try
{
// Check if the Table attribute is used to specify a table name not matching that of the Type.Name property.
// If so, we generate a Sql Statement and create the table based on the attribute name.
//if (Attribute.IsDefined(type, typeof(TableAttribute)))
//{
// TableAttribute attribute = type.GetAttribute<TableAttribute>();
// Strongly typed to SQLiteCoordinator just to get a SqlQuery instance. The CreateCommand method will create a table based on 'type'
var query = new SqlQuery<SQLiteCoordinator>().CreateCommand(DataProviderTypes.Sqlite3, type);
query = query.TrimEnd(';') + ";";
cn.Execute(query);
//}
// Otherwise create the table using the Type.
//else
//{
// cn.CreateTable(type);
//}
// If we are to create relationship tables, we cascade through all relationship properties
// and create tables for them as well.
if (createRelationalTables)
{
this.CreateCascadingTables(type, cn);
}
}
catch (Exception ex)
{
return 0;
}
}
return 1;
}
The flow of the code goes
UI->SyncDomainTablesAsync->SyncTableAsync->SaveObjects->SaveTable(type)
The issue that I have is within SaveTable. If I just use SaveTable synchronously I have no issues; using it in my async method above causes a thread abort exception. The exception is thrown within the SQLite.cs file included with SQLite.net. The weird thing is that the table is created in the database, even though the exception is thrown. The error is thrown sometimes when the Prepare() function is called and the rest of the time when the SQLite3.Step() function is called.
public int ExecuteNonQuery ()
{
if (_conn.Trace) {
Debug.WriteLine ("Executing: " + this);
}
var r = SQLite3.Result.OK;
var stmt = Prepare (); // THROWS THE ERROR
r = SQLite3.Step(stmt); // THROWS THE ERROR
Finalize(stmt);
if (r == SQLite3.Result.Done) {
int rowsAffected = SQLite3.Changes (_conn.Handle);
return rowsAffected;
} else if (r == SQLite3.Result.Error) {
string msg = SQLite3.GetErrmsg (_conn.Handle);
throw SQLiteException.New (r, msg);
} else {
throw SQLiteException.New (r, r.ToString ());
}
}
I assume that because my foreach statement awaits the return of SyncTableAsync, none of the threads are closed. I am also getting a system transaction critical exception that says "attempting to access a unloaded app domain".
Am I using await/async incorrectly with Sqlite3, or is this an issue with Sqlite3 that I am not aware of?
Attached is a photo of the Parallel Stacks window and the exception.
EDIT
When I try to run the code above as well in unit tests, the unit tests process never dies. I have to exit Visual Studio in order to get the process to die. I am assuming something in SQLite.dll is grabbing a hold of the process when the exception is thrown and not letting go, but I am not sure.
EDIT 2
I can modify the initial method SyncDomainTablesAsync to the following and the code runs without error. The issue is my use of async and await I believe.
public async Task<int> SyncDomainTablesAsync(IProgress<string> progress, CancellationToken ct, DateTime? lastDateSynced=null, string tableName = null)
{
var domainTable = await GetDomainTablesAsync(progress,ct,lastDateSynced, tableName);
var items = domainTable.Items;
foreach (var item in items)
{
_coordinator.SaveObjects(item.Value, typeof(object), true);
}
return 1;
}

Windows Azure - Cleaning Up The WADLogsTable

I've read conflicting information as to whether or not the WADLogsTable table used by the DiagnosticMonitor in Windows Azure will automatically prune old log entries.
I'm guessing it doesn't, and will instead grow forever - costing me money. :)
If that's the case, does anybody have a good code sample as to how to clear out old log entries from this table manually? Perhaps based on timestamp? I'd run this code from a worker role periodically.
The data in tables created by Windows Azure Diagnostics isn't deleted automatically.
However, Windows Azure PowerShell Cmdlets contain cmdlets specifically for this case.
PS D:\> help Clear-WindowsAzureLog
NAME
Clear-WindowsAzureLog
SYNOPSIS
Removes Windows Azure trace log data from a storage account.
SYNTAX
Clear-WindowsAzureLog [-DeploymentId <String>] [-From <DateTime>] [-To <DateTime>] [-StorageAccountName <String>] [-StorageAccountKey <String>] [-UseDevelopmentStorage] [-StorageAccountCredentials <StorageCredentialsAccountAndKey>] [<CommonParameters>]
Clear-WindowsAzureLog [-DeploymentId <String>] [-FromUtc <DateTime>] [-ToUtc <DateTime>] [-StorageAccountName <String>] [-StorageAccountKey <String>] [-UseDevelopmentStorage] [-StorageAccountCredentials <StorageCredentialsAccountAndKey>] [<CommonParameters>]
You need to specify the -ToUtc parameter, and all logs before that date will be deleted.
If the cleanup task needs to be performed on Azure from within the worker role, the cmdlets' C# code can be reused. The PowerShell cmdlets are published under the permissive MS Public License.
Basically, only 3 files are needed, with no other external dependencies: DiagnosticsOperationException.cs, WadTableExtensions.cs, WadTableServiceEntity.cs.
An updated version of Chriseyre2000's function. It performs much better in cases where you need to delete many thousands of records: it searches by PartitionKey and processes the work in chunks, step by step. And remember that the best choice is to run it near the storage (in a cloud service).
public static void TruncateDiagnostics(CloudStorageAccount storageAccount,
DateTime startDateTime, DateTime finishDateTime, Func<DateTime,DateTime> stepFunction)
{
var cloudTable = storageAccount.CreateCloudTableClient().GetTableReference("WADLogsTable");
var query = new TableQuery();
var dt = startDateTime;
while (true)
{
dt = stepFunction(dt);
if (dt>finishDateTime)
break;
var l = dt.Ticks;
string partitionKey = "0" + l;
query.FilterString = TableQuery.GenerateFilterCondition("PartitionKey", QueryComparisons.LessThan, partitionKey);
query.Select(new string[] {});
var items = cloudTable.ExecuteQuery(query).ToList();
const int chunkSize = 200;
var chunkedList = new List<List<DynamicTableEntity>>();
int index = 0;
while (index < items.Count)
{
var count = items.Count - index > chunkSize ? chunkSize : items.Count - index;
chunkedList.Add(items.GetRange(index, count));
index += chunkSize;
}
foreach (var chunk in chunkedList)
{
var batches = new Dictionary<string, TableBatchOperation>();
foreach (var entity in chunk)
{
var tableOperation = TableOperation.Delete(entity);
if (batches.ContainsKey(entity.PartitionKey))
batches[entity.PartitionKey].Add(tableOperation);
else
batches.Add(entity.PartitionKey, new TableBatchOperation {tableOperation});
}
foreach (var batch in batches.Values)
cloudTable.ExecuteBatch(batch);
}
}
}
You could just do it based on the timestamp but that would be very inefficient since the whole table would need to be scanned. Here is a code sample that might help where the partition key is generated to prevent a "full" table scan. http://blogs.msdn.com/b/avkashchauhan/archive/2011/06/24/linq-code-to-query-windows-azure-wadlogstable-to-get-rows-which-are-stored-after-a-specific-datetime.aspx
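A minimal sketch of such a PartitionKey-based filter, assuming the WADLogsTable convention of partition keys being "0" followed by the tick count (the same convention the answers below rely on):
// Entities older than the threshold are selected via PartitionKey instead of Timestamp,
// which avoids a full table scan.
DateTime keepThreshold = DateTime.UtcNow.AddDays(-7);
var query = new TableQuery
{
    FilterString = TableQuery.GenerateFilterCondition(
        "PartitionKey", QueryComparisons.LessThan, "0" + keepThreshold.Ticks)
};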
Here is a solution that truncates based upon a timestamp (tested against SDK 2.0).
It does use a table scan to get the data, but if run, say, once per day it would not be too painful:
/// <summary>
/// TruncateDiagnostics(storageAccount, DateTime.Now.AddHours(-1));
/// </summary>
/// <param name="storageAccount"></param>
/// <param name="keepThreshold"></param>
public void TruncateDiagnostics(CloudStorageAccount storageAccount, DateTime keepThreshold)
{
try
{
CloudTableClient tableClient = storageAccount.CreateCloudTableClient();
CloudTable cloudTable = tableClient.GetTableReference("WADLogsTable");
TableQuery query = new TableQuery();
query.FilterString = string.Format("Timestamp lt datetime'{0:yyyy-MM-ddTHH:mm:ss}'", keepThreshold);
var items = cloudTable.ExecuteQuery(query).ToList();
Dictionary<string, TableBatchOperation> batches = new Dictionary<string, TableBatchOperation>();
foreach (var entity in items)
{
TableOperation tableOperation = TableOperation.Delete(entity);
if (!batches.ContainsKey(entity.PartitionKey))
{
batches.Add(entity.PartitionKey, new TableBatchOperation());
}
batches[entity.PartitionKey].Add(tableOperation);
}
foreach (var batch in batches.Values)
{
cloudTable.ExecuteBatch(batch);
}
}
catch (Exception ex)
{
Trace.TraceError(string.Format("Truncate WADLogsTable exception {0}", ex), "Error");
}
}
Here's my slightly different version of Chriseyre2000's solution, using asynchronous operations and PartitionKey querying. It's designed to run continuously within a Worker Role in my case. This one may be a bit easier on memory if you have a lot of entries to clean up.
static class LogHelper
{
/// <summary>
/// Periodically run a cleanup task for log data, asynchronously
/// </summary>
public static async void TruncateDiagnosticsAsync()
{
while ( true )
{
try
{
// Retrieve storage account from connection-string
CloudStorageAccount storageAccount = CloudStorageAccount.Parse(
CloudConfigurationManager.GetSetting( "CloudStorageConnectionString" ) );
CloudTableClient tableClient = storageAccount.CreateCloudTableClient();
CloudTable cloudTable = tableClient.GetTableReference( "WADLogsTable" );
// keep a weeks worth of logs
DateTime keepThreshold = DateTime.UtcNow.AddDays( -7 );
// do this until we run out of items
while ( true )
{
TableQuery query = new TableQuery();
query.FilterString = string.Format( "PartitionKey lt '0{0}'", keepThreshold.Ticks );
var items = cloudTable.ExecuteQuery( query ).Take( 1000 );
if ( items.Count() == 0 )
break;
Dictionary<string, TableBatchOperation> batches = new Dictionary<string, TableBatchOperation>();
foreach ( var entity in items )
{
TableOperation tableOperation = TableOperation.Delete( entity );
// need a new batch?
if ( !batches.ContainsKey( entity.PartitionKey ) )
batches.Add( entity.PartitionKey, new TableBatchOperation() );
// can have only 100 per batch
if ( batches[entity.PartitionKey].Count < 100)
batches[entity.PartitionKey].Add( tableOperation );
}
// execute!
foreach ( var batch in batches.Values )
await cloudTable.ExecuteBatchAsync( batch );
Trace.TraceInformation( "WADLogsTable truncated: " + query.FilterString );
}
}
catch ( Exception ex )
{
Trace.TraceError( "Truncate WADLogsTable exception {0}", ex.Message );
}
// run this once per day
await Task.Delay( TimeSpan.FromDays( 1 ) );
}
}
}
To start the process, just call this from the OnStart method in your worker role.
// start the periodic cleanup
LogHelper.TruncateDiagnosticsAsync();
If you don't care about any of the contents, just delete the table. Azure Diagnostics will just recreate it.
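A sketch of that approach, assuming the same storage account setup used in the snippets above:
CloudStorageAccount storageAccount = CloudStorageAccount.Parse(
    CloudConfigurationManager.GetSetting("CloudStorageConnectionString"));
CloudTableClient tableClient = storageAccount.CreateCloudTableClient();
// Drop the whole table; Azure Diagnostics recreates it the next time it writes logs.
tableClient.GetTableReference("WADLogsTable").DeleteIfExists();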
Slightly updated Chriseyre2000's code:
- using ExecuteQuerySegmented instead of ExecuteQuery
- observing the TableBatchOperation limit of 100 operations
- purging all Azure diagnostics tables
public static void TruncateAllAzureTables(CloudStorageAccount storageAccount, DateTime keepThreshold)
{
TruncateAzureTable(storageAccount, "WADLogsTable", keepThreshold);
TruncateAzureTable(storageAccount, "WADCrashDump", keepThreshold);
TruncateAzureTable(storageAccount, "WADDiagnosticInfrastructureLogsTable", keepThreshold);
TruncateAzureTable(storageAccount, "WADPerformanceCountersTable", keepThreshold);
TruncateAzureTable(storageAccount, "WADWindowsEventLogsTable", keepThreshold);
}
public static void TruncateAzureTable(CloudStorageAccount storageAccount, string aTableName, DateTime keepThreshold)
{
const int maxOperationsInBatch = 100;
var tableClient = storageAccount.CreateCloudTableClient();
var cloudTable = tableClient.GetTableReference(aTableName);
var query = new TableQuery { FilterString = $"Timestamp lt datetime'{keepThreshold:yyyy-MM-ddTHH:mm:ss}'" };
TableContinuationToken continuationToken = null;
do
{
var queryResult = cloudTable.ExecuteQuerySegmented(query, continuationToken);
continuationToken = queryResult.ContinuationToken;
var items = queryResult.ToList();
var batches = new Dictionary<string, List<TableBatchOperation>>();
foreach (var entity in items)
{
var tableOperation = TableOperation.Delete(entity);
if (!batches.TryGetValue(entity.PartitionKey, out var batchOperationList))
{
batchOperationList = new List<TableBatchOperation>();
batches.Add(entity.PartitionKey, batchOperationList);
}
var batchOperation = batchOperationList.FirstOrDefault(bo => bo.Count < maxOperationsInBatch);
if (batchOperation == null)
{
batchOperation = new TableBatchOperation();
batchOperationList.Add(batchOperation);
}
batchOperation.Add(tableOperation);
}
foreach (var batch in batches.Values.SelectMany(l => l))
{
cloudTable.ExecuteBatch(batch);
}
} while (continuationToken != null);
}
