Implementing connection retry policy on failure to connect with database - c#

My database is hosted in the cloud (Azure), so I sometimes get network-related errors like this:
A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: Named Pipes Provider, error: 40 - Could not open a connection to SQL Server)
I have decided to use Polly to retry the connection attempt after a delay, but I am not sure whether I am using it correctly:
public class AddOperation
{
    public void Start()
    {
        using (var processor = new MyProcessor())
        {
            for (int i = 0; i < 2; i++)
            {
                if (i == 0)
                {
                    var connection = new SqlConnection("Connection string 1");
                    processor.Process(connection);
                }
                else
                {
                    var connection = new SqlConnection("Connection string 2");
                    processor.Process(connection);
                }
            }
        }
    }
}

public class MyProcessor : IDisposable
{
    public void Process(DbConnection cn)
    {
        using (var cmd = cn.CreateCommand())
        {
            cmd.CommandText = "query";
            cmd.CommandTimeout = 1800;
            RetryPolicy retryPolicy = Policy
                .Handle<DbException>()
                .WaitAndRetry(new[]
                {
                    TimeSpan.FromSeconds(3),
                    TimeSpan.FromSeconds(6),
                    TimeSpan.FromSeconds(9)
                });
            retryPolicy.Execute(() => ConnectionManager.OpenConnection(cn));
            using (var reader = cmd.ExecuteReader(CommandBehavior.CloseConnection))
            {
                //code
            }
        }
    }
}

public class ConnectionManager
{
    public static void OpenConnection(DbConnection cn)
    {
        try
        {
            cn.Open();
            return;
        }
        catch (DbException)
        {
            // Rethrow with "throw;" so the original stack trace is preserved.
            throw;
        }
    }
}
As per my understanding, Polly will work something like this:
1st attempt : Wait 3 seconds then call ConnectionManager.OpenConnection(cn) again
2nd attempt : Wait 6 seconds then call ConnectionManager.OpenConnection(cn) again on DbException
3rd attempt : Wait 9 seconds then call ConnectionManager.OpenConnection(cn) again on DbException
But what if DbException occurs again? Will it process or send to my catch clause wrapping up Process method?
I am not sure whether I have understood it properly and implemented it correctly.
I will appreciate any help :)

Re:
what if DbException occurs again? Will [Polly] process or send to my catch clause wrapping up Process method?
The Polly wiki for Retry states:
If the action throws a handled exception, the policy:
Counts the exception
Checks whether another retry is permitted.
If not, the exception is rethrown and the policy terminates.
A simple example can demonstrate this.
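The example the answer refers to did not survive on this page, so here is a minimal sketch of what such a demonstration could look like. It assumes the Polly NuGet package and its classic `Policy.Handle<T>().WaitAndRetry(...)` API, as used in the question. The `RetryDemo` class, the always-failing action, and the short delays are made up for illustration. The action is invoked once immediately and then once more after each configured delay. When the last retry also fails, the exception leaves `Execute` and reaches the caller's `catch`.

```csharp
using System;
using Polly; // NuGet package "Polly" (assumed to be referenced)

public static class RetryDemo
{
    // Runs an action that always fails under a WaitAndRetry policy and reports
    // how often it was invoked and whether the exception escaped the policy.
    public static (int Attempts, bool Propagated) Run(params TimeSpan[] delays)
    {
        int attempts = 0;

        // Hypothetical stand-in for ConnectionManager.OpenConnection(cn).
        void AlwaysFail()
        {
            attempts++;
            Console.WriteLine($"Attempt {attempts}");
            throw new InvalidOperationException("simulated connection failure");
        }

        var policy = Policy
            .Handle<InvalidOperationException>()
            .WaitAndRetry(delays); // one retry per configured delay

        try
        {
            policy.Execute(AlwaysFail);
            return (attempts, false);
        }
        catch (InvalidOperationException)
        {
            // Reached only after the initial call and every retry have failed.
            return (attempts, true);
        }
    }

    public static void Main()
    {
        var result = Run(
            TimeSpan.FromMilliseconds(100),
            TimeSpan.FromMilliseconds(200),
            TimeSpan.FromMilliseconds(300));
        Console.WriteLine($"Attempts: {result.Attempts}, propagated: {result.Propagated}");
    }
}
```

With three delays configured, the action runs four times in total (the initial call plus three retries), and the fourth failure is rethrown to the caller. Applied to the question's code, a `DbException` on the final retry would therefore propagate out of `Process` to whatever `catch` wraps that call.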

Related

Transient Fault Retry logic best practices

Friends, I have a question about implementing a simple retry policy around the execution of the SQL command.
My question is: should the retry loop encapsulate the construction of the connection and transaction, or should it live inside the connection?
For example:
private void RetryLogSave(DynamicParameters parameters, int retries = 3)
{
    int tries = 0;
    using (var connection = new SqlConnection(_connectionString))
    {
        connection.Open();
        using (var transaction = connection.BeginTransaction())
        {
            var logItemCommand = new CommandDefinition(commandText: Constants.InsertLogItem,
                parameters: parameters, transaction: transaction, commandType: System.Data.CommandType.Text);
            do
            {
                try
                {
                    tries++;
                    connection.Execute(logItemCommand);
                    transaction.Commit();
                    break;
                }
                catch (Exception exc)
                {
                    if (tries == retries)
                    {
                        transaction.Rollback();
                        throw exc;
                    }
                    Task.Delay(100 * tries).Wait();
                }
            }
            while (true);
        }
    }
}
Is what I've done here appropriate and acceptable, or should the retry logic live on the outside of the SqlConnection construction?
Formalizing my comments as an answer.
should the retry logic live on the outside of the SqlConnection construction?
Yes. If you keep the connection open while retrying, you waste resources: someone else could be using it while you wait N seconds for the retry. Opening and closing connections is usually (for most ODBC drivers) implemented on top of a connection pooling mechanism, so you do not actually close the connection; you let it go back to the pool to be reused by someone else. Keeping connections open during retries forces the system to create more and more new physical connections (because they are not returned to the pool), and eventually your SQL Server will be exhausted.
Regarding the retry mechanism: to avoid reinventing the wheel, I usually use the Polly library.
You can define a static class somewhere with a list of your policies:
public static class MyPolices
{
    // Retry, waiting a specified duration between each retry
    public static Policy RetryPolicy = Policy
        .Handle<Exception>() // can be more specific like SqlException
        .WaitAndRetry(new[]
        {
            TimeSpan.FromSeconds(1),
            TimeSpan.FromSeconds(2),
            TimeSpan.FromSeconds(3)
        });
}
Then, simplify your method to
private void LogSave(DynamicParameters parameters)
{
    using (var connection = new SqlConnection(_connectionString))
    {
        connection.Open();
        using (var transaction = connection.BeginTransaction())
        {
            // make sure to not forget to dispose your command
            var logItemCommand = new CommandDefinition(commandText: Constants.InsertLogItem,
                parameters: parameters, transaction: transaction, commandType: System.Data.CommandType.Text);
            try
            {
                // not sure if conn.Execute is your extension method?
                connection.Execute(logItemCommand);
                transaction.Commit();
            }
            catch (Exception)
            {
                transaction.Rollback();
                throw;
            }
        }
    }
}
and call it like this
MyPolices.RetryPolicy.Execute(() => LogSave(parameters));
This approach makes your code more SOLID by keeping the retry logic isolated.

How to manage a network down and avoid error with PKCS11Interop

Using PKCS11Interop on Safenet HSMs, I got this error:
"Method C_OpenSession returned 2147484548"
According to my documentation, the error is CKR_SMS_ERROR: "General error from secure messaging system - probably caused by HSM failure or network failure".
This confirms that the problem happens when connectivity is lost.
The problem is that when this happens, the service cannot resume communication once connectivity is back, until I manually restart the service that manages HSM access.
When the service starts, I call this:
private Pkcs11 _pkcs11 = null;
private Slot _slot = null;
private Session _session = null;

public async void InitPkcs11()
{
    try
    {
        _pkcs11 = new Pkcs11(pathCryptoki, Inter_Settings.AppType);
        _slot = Inter_Helpers.GetUsableSlot(_pkcs11, nSlot);
        _session = _slot.OpenSession(SessionType.ReadOnly);
        _session.Login(CKU.CKU_USER, Inter_Settings.NormalUserPin);
    }
    catch (Exception e)
    {
        ...
    }
}
When I have to use the HSM, I call something like:
using (var LocalSession = _slot.OpenSession(SessionType.ReadOnly))
{
    ...
}
And when communication fails due to a loss of connectivity, I call a function to reset the connection and try to change the slot:
private bool switching = false;

public async void SwitchSlot()
{
    try
    {
        if (!switching)
        {
            switching = true;
            if (nSlot == 0)
            {
                nSlot = 2;
            }
            else
            {
                nSlot = 0;
            }
            _session.Logout();
            _slot.CloseAllSessions();
            _pkcs11.Dispose();
            InitPkcs11();
            switching = false;
        }
    }
    catch (Exception e)
    {
        ...
    }
}
But this last snippet doesn't work as expected: it tries to change the slot, but it always fails to communicate with the HSM (after a network outage). If I restart the service manually (once connectivity is back), it works like a charm. So I'm sure I'm doing something wrong in the SwitchSlot function when I try to close the _session and open a new one.
Do you see any errors or misunderstandings here?

A transport-level error has occurred (provider: TCP Provider, error: 0 - The semaphore timeout period has expired.)

I have read both these SO questions and the MS Docs:
Unexplained SQL errors in production environment - possibly network related
"The semaphore timeout period has expired" SQL Azure
https://learn.microsoft.com/en-us/azure/sql-database/sql-database-connectivity-issues#retry-logic-for-transient-errors
I have the same error. My connection string did not contain any of these: ConnectRetryCount, ConnectRetryInterval, or Connection Timeout.
This is a method of my DB class:
public DataTable ExecuteSqlCommand(SqlCommand com)
{
    var retryStrategy = new Incremental(5, TimeSpan.FromSeconds(1), TimeSpan.FromSeconds(2));
    var retryPolicy = new RetryPolicy<SqlDatabaseTransientErrorDetectionStrategy>(retryStrategy);
    SqlConnection con = new SqlConnection(ConfigurationManager.ConnectionStrings["DataContext"].ToString());
    com.Connection = con;
    SqlDataAdapter da = new SqlDataAdapter(com);
    DataTable dt = new DataTable();
    try
    {
        retryPolicy.ExecuteAction(() =>
        {
            con.Open();
            da.Fill(dt);
        });
    }
    catch (Exception e)
    {
        var telemetry = new TelemetryClient(); // app insights (azure)
        telemetry.TrackException(e);
    }
    finally
    {
        con.Close();
    }
    return dt;
}
So what would be better? Remove the retry stuff from my code and use the attributes in my connection string? Let the framework do the work? Or is my current retry code sufficient? I have the feeling that the enterprise lib and retry stuff is obsolete, but cannot find a good source to confirm my thoughts.
I am using 4.7 and also have EF 6.2, but most queries are just SqlCommands using the code from above.
Make sure the user/login you use on your connection string has access to the master database on your Azure SQL Database server also, not only to the user database. That will provide faster connections and timeouts may disappear.
Using the SQL Azure Execution Strategy may help you with this issue.
public class MyConfiguration : DbConfiguration
{
    public MyConfiguration()
    {
        SetExecutionStrategy("System.Data.SqlClient", () => new SqlAzureExecutionStrategy());
    }
}
To control the maximum retry count and the maximum delay between retries, pass them to the constructor:
public class MyConfiguration : DbConfiguration
{
    public MyConfiguration()
    {
        SetExecutionStrategy(
            "System.Data.SqlClient",
            () => new SqlAzureExecutionStrategy(1, TimeSpan.FromSeconds(30)));
    }
}
For more information about SQL Azure Execution Strategy please visit this URL.
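For reference, the connection-string settings named in the question (ConnectRetryCount, ConnectRetryInterval, Connection Timeout) can also be set through `SqlConnectionStringBuilder`. This is a minimal sketch, not a recommendation of particular values: the `ConnectionStrings` helper, the server and database names, and the numbers are placeholders, and authentication settings are left out. As far as I know, these settings govern how the client establishes and re-establishes connections, not whether a failed command is re-executed, so they would complement an execution strategy or retry policy rather than replace it.

```csharp
using System;
using System.Data.SqlClient; // part of .NET Framework; a NuGet package on .NET Core and later

public static class ConnectionStrings
{
    // Builds a connection string with the client-side connection retry settings.
    // All values below are illustrative placeholders.
    public static string Build(string server, string database)
    {
        var builder = new SqlConnectionStringBuilder
        {
            DataSource = server,
            InitialCatalog = database,
            ConnectRetryCount = 3,     // number of reconnection attempts
            ConnectRetryInterval = 10, // seconds between reconnection attempts
            ConnectTimeout = 30        // seconds to wait while establishing a connection
        };
        return builder.ConnectionString;
    }
}
```

Usage would look like `new SqlConnection(ConnectionStrings.Build("myserver.database.windows.net", "MyDb"))`, where both names are hypothetical.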

C# MySql error: An established connection was aborted by the software in your host machine

I have a strange problem when working with MySql in C#. When executing a command, I sometimes (not always) get an "An established connection was aborted by the software in your host machine" error. I cannot solve this problem and can't find any solution on the web.
As a workaround I developed a simple ReConnection class. When this exception occurs I simply call the ReOpen() method. It works perfectly with ADO.NET, and I have been using this trick for more than two years.
public class ReConnection : DbConnection
{
    private readonly int _reconnectAttempts;
    private readonly int _reconnectInterval;
    private bool _isOpened;
    private int _reConnectSessions;
    private readonly object _syncRoot = new object();
    private readonly DbCommand _pingCommand;
    private readonly DbConnection _innerConnection;

    public ReConnection(DbConnection innerConnection,
        int reconnectAttempts = 4, int reconnectInterval = 1000)
    {
        _innerConnection = innerConnection;
        _pingCommand = _innerConnection.CreateCommand();
        _pingCommand.CommandText = "SELECT 1 FROM DUAL";
        _reconnectAttempts = reconnectAttempts;
        _reconnectInterval = reconnectInterval;
    }

    public override void Open()
    {
        _innerConnection.Open();
        _isOpened = true;
    }

    public override void Close()
    {
        _isOpened = false;
        while (_reConnectSessions > 0)
            Thread.Sleep(100);
        _innerConnection.Close();
    }

    internal bool ReOpen()
    {
        var restored = false;
        lock (_syncRoot)
        {
            _reConnectSessions++;
            var retries = _reconnectAttempts;
            try
            {
                _pingCommand.ExecuteNonQuery();
                restored = true;
            }
            catch (Exception)
            {
                while (_isOpened &&
                       ((retries > 0) || (_reconnectAttempts == -1)) &&
                       !restored)
                {
                    if ((retries < _reconnectAttempts) || (_reconnectAttempts == -1))
                        Thread.Sleep(_reconnectInterval);
                    try
                    {
                        _innerConnection.Close();
                        try
                        {
                            _innerConnection.Open();
                        }
                        catch { }
                        // The exception occurs here: Unable to write data to the transport connection: An established connection was aborted by the software in your host machine.
                        _innerConnection.Close(); // closing the corrupted connection
                        _innerConnection.Open(); // opening the new connection
                        // At this point a transport connection that data can be written to suddenly appears
                        _pingCommand.ExecuteNonQuery();
                        restored = true;
                    }
                    catch (Exception) { }
                    retries--;
                }
            }
            // Assumed (not in the original post): balance the increment above so Close() is not blocked forever.
            _reConnectSessions--;
        }
        return restored;
    }
}
But recently I developed an app with Entity Framework, and a similar exception occurs there as well. I can't apply the previous workaround to Entity Framework.
Has anyone encountered this problem with MySql?
What is the actual reason behind this exception?
(Screenshots omitted: where the error occurs; error details.)
The connection expires and is terminated by the host after some time. This is managed by MySQL, so there is nothing you can do about it in code.
If a particular query takes a long time to execute (some of my queries take ~20 minutes to return), you can set the timeouts via the connection string: just add "ConnectionIdleTimeout=1800;Command Timeout=0;" to your connection string.
The connection idle timeout stops the MySQL server from killing the connection until the given time has elapsed (setting it to 0 is not recommended; otherwise the host will eventually be swamped with open connections and report an error about exceeding the maximum number of open connections), while Command Timeout=0 stops any command from timing out by itself. A combination of the two should help with the error.
However, I personally don't like to cache the DbConnection, given the timeout behavior on the server side. I prefer to rely solely on the server to time out the connections, and on the C# side I usually just use:
using MySqlConnection connection = new MySqlConnection("yourConnectionString");
connection.Open();

Some tricky quick way to validate oracle db connection

My WCF service needs to check whether a connection is currently available and whether we can work with it. We have many remote databases. Their connections are sometimes unreliable and can't be used to query data or do anything else.
So, for example, this is a regular connection string that we use:
User Id=user;Password=P#ssw0rd;Data Source=NVDB1;Connection Timeout=30
Here is the service method used for getting the live databases:
public List<string> GetAliveDBs(string city)
{
    if (String.IsNullOrEmpty(city))
        return null;

    List<string> cityDbs = (from l in alldbs
                            where !String.IsNullOrEmpty(l.Value.city) && l.Value.city.ToUpper() == city.ToUpper()
                            select l.Value.connString).ToList();

    // There are no databases for this city
    if (cityDbs.Count == 0)
        return null;

    // Probe all candidates in parallel and keep only those that respond.
    // cityDbs holds plain connection strings, and removing items from a list
    // while Parallel.ForEach enumerates it is not safe, so filter instead.
    return cityDbs.AsParallel().Where(cs => IsConnectionAlive(cs)).ToList();
}

static public bool IsConnectionAlive(string connectionString)
{
    using (OracleConnection c = new OracleConnection(connectionString))
    {
        try
        {
            c.Open();
            return (c.State == ConnectionState.Open) && c.Ping();
        }
        catch (Exception)
        {
            return false;
        }
    }
}
I use devart components to communicate with Oracle DB.
Hope for your help, guys! Thanks in advance!
Try just executing a very low cost operation that should work no matter what schema you are connected to, e.g.
SELECT 1
(that statement works on MS SQL and MySQL; Oracle releases before 23c require a FROM clause, so use SELECT 1 FROM DUAL there).
If you get the result you expect (in this case one row, with one column, containing a "1") then the connection is valid.
At least one connection pool manager uses this strategy to validate connections periodically.
UPDATE:
Here's a SQL Server version of your method. You can probably just replace "Sql" with "Oracle".
static public bool IsConnectionAlive(string connectionString)
{
    try
    {
        using (SqlConnection conn = new SqlConnection(connectionString))
        {
            conn.Open();
            using (SqlCommand cmd = new SqlCommand("SELECT 1", conn))
            {
                int result = (int)cmd.ExecuteScalar();
                return (result == 1);
            }
        }
    }
    catch (Exception ex)
    {
        // You need to decide what to do here... e.g. does a malformed connection string mean the "connection isn't alive"?
        // Maybe return false, maybe log the error and re-throw the exception?
        throw;
    }
}
If the goal is to simply determine if a server lives at the IP Address or host name then I'd recommend Ping (no 3 way handshake and has less overhead than a UDP message). You can use the System.Net.NetworkInformation.Ping class (see its documentation for an example) for this.
If you're looking to prove that there is actually something listening on the common Oracle port, I would suggest using either the System.Net.Sockets.TcpClient or System.Net.Sockets.Socket class (their documentation also provides examples) to provide this.
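The two checks described above could be sketched roughly as follows. This is an illustration rather than a definitive implementation: the `Reachability` class, its method names, and the default timeouts are hypothetical, and the port to probe depends on how the listener is configured (1521 is Oracle's default).

```csharp
using System;
using System.Net.NetworkInformation;
using System.Net.Sockets;

public static class Reachability
{
    // ICMP echo: true if the host answers within timeoutMs.
    public static bool HostResponds(string host, int timeoutMs = 2000)
    {
        try
        {
            using (var ping = new Ping())
            {
                PingReply reply = ping.Send(host, timeoutMs);
                return reply.Status == IPStatus.Success;
            }
        }
        catch (PingException)
        {
            // For example, the host name could not be resolved.
            return false;
        }
    }

    // TCP connect: true if something accepts connections on host:port within timeoutMs.
    public static bool PortIsOpen(string host, int port, int timeoutMs = 2000)
    {
        try
        {
            using (var client = new TcpClient())
            {
                var connect = client.ConnectAsync(host, port);
                return connect.Wait(timeoutMs) && client.Connected;
            }
        }
        catch (AggregateException)
        {
            // Wait() wraps the SocketException of a refused or failed connection.
            return false;
        }
        catch (SocketException)
        {
            return false;
        }
    }
}
```

A call such as `Reachability.PortIsOpen("dbhost", 1521)` (host name hypothetical) only shows that something is listening on that port. It does not prove that the database accepts logins or can run queries, so it is a cheaper but weaker check than opening a real connection.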
The simplest way to do this (by far) is to just open a connection using the Oracle API for C#. There is a very good tutorial that includes code here. It covers more than just the connection but you should be able to strip out the connection portion from the rest to fit your needs.
Oracle has products and software specifically for helping maintain high availability, which can have dead connections removed from your connection pool through a setting called HA Events=true on the connection string. Your Oracle DBA will need to determine whether your installation supports it.
