Retry the request if the first one failed - c#

I am using an XML web service in my web app, and sometimes the remote server fails to respond in time. I came up with the idea of re-sending the request if the first attempt fails. To prevent an endless loop, I want to limit the number of attempts to 2. I would like an opinion on whether what I have done below is OK and would work as I expect.
public class ScEngine
{
private int _attemptcount = 0;
public int attemptcount
{
get
{
return _attemptcount;
}
set
{
_attemptcount = value;
}
}
public DataSet GetStat(string q, string job)
{
try
{
//snip....
attemptcount += attemptcount;
return ds;
}
catch
{
if (attemptcount>=2)
{
return null;
}
else
{
return GetStat(q, job);
}
}
}
}

public class ScEngine
{
public DataSet GetStat(string q, string job)
{
int attemptCount = 0; // must be initialized before it is read in the loop condition
while(attemptCount < 2)
{
try
{
attemptCount++;
var ds = ...//web service call
return ds;
}
catch {}
}
//log the error
return null;
}
}

You forgot to increment the attemptcount. Plus, if there's any error on the second run, it will not be caught (thus, becomes an unhandled exception).

I wouldn't recurse in order to retry. Also, I wouldn't catch and ignore all exceptions. I'd learn which exceptions indicate an error that should be retried, and would catch those. You will be ignoring serious errors, as your code stands.
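As a minimal sketch of that advice (a loop instead of recursion, and retrying only on exceptions judged transient), something like the following could work. The names Retry, Execute and IsTimeout are illustrative, not from the question, and treating timeouts as the only retryable failure is an assumption you would adjust for your service.

```csharp
using System;
using System.Net;
using System.Threading;

public static class Retry
{
    // Runs 'operation' up to 'maxAttempts' times. Only exceptions accepted by
    // 'isTransient' trigger a retry; anything else propagates immediately.
    public static T Execute<T>(Func<T> operation, int maxAttempts, TimeSpan delay,
                               Func<Exception, bool> isTransient)
    {
        if (operation == null) throw new ArgumentNullException(nameof(operation));
        if (isTransient == null) throw new ArgumentNullException(nameof(isTransient));
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return operation();
            }
            catch (Exception ex) when (attempt < maxAttempts && isTransient(ex))
            {
                // Transient failure with attempts left: wait, then loop again.
                // On the last attempt the filter is false, so the exception
                // reaches the caller instead of being swallowed.
                Thread.Sleep(delay);
            }
        }
    }

    // Assumption: a timeout is the only failure worth retrying here.
    public static bool IsTimeout(Exception ex)
    {
        var web = ex as WebException;
        return ex is TimeoutException
            || (web != null && web.Status == WebExceptionStatus.Timeout);
    }
}
```

A call might look like Retry.Execute(() => CallService(q, job), 2, TimeSpan.FromSeconds(1), Retry.IsTimeout), where CallService is a hypothetical wrapper around the web service call.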

You don't want to solve it this way. You will just put more load on the servers and cause more timeouts.
You can increase the web service timeout via httpRuntime. Web services typically return a lot of data in one call, so I find myself doing this pretty frequently. Don't forget to increase how long the client is willing to wait on the client side.

Here's a version that doesn't use recursion but achieves the same result. It also includes a delay so you can give the server time to recover if it hiccups.
/// <summary>
/// The maximum number of attempts to use before giving up on an update, delete or create
/// </summary>
private const int MAX_ATTEMPTS = 2;
/// <summary>
/// Attempts to execute the specified delegate with the specified arguments.
/// </summary>
/// <param name="operation">The operation to attempt.</param>
/// <param name="arguments">The arguments to provide to the operation.</param>
/// <returns>The result of the operation if there are any.</returns>
public static object attemptOperation(Delegate operation, params object[] arguments)
{
//attempt the operation using the default max attempts
return attemptOperation(MAX_ATTEMPTS, operation, arguments);
}
/// <summary>
/// Use for creating a random delay between retry attempts.
/// </summary>
private static Random random = new Random();
/// <summary>
/// Attempts to execute the specified delegate with the specified arguments.
/// </summary>
/// <param name="operation">The operation to attempt.</param>
/// <param name="arguments">The arguments to provide to the operation.</param>
/// <param name="maxAttempts">The number of times to attempt the operation before giving up.</param>
/// <returns>The result of the operation if there are any.</returns>
public static object attemptOperation(int maxAttempts, Delegate operation, params object [] arguments)
{
//set our initial attempt count
int attemptCount = 1;
//set the default result
object result = null;
//we've not succeeded yet
bool success = false;
//keep trying until we get a result
while (success == false)
{
try
{
//attempt the operation and get the result
result = operation.DynamicInvoke(arguments);
//we succeeded if there wasn't an exception
success = true;
}
catch
{
//if we've got to the max attempts and still have an error, give up and rethrow it
if (attemptCount++ == maxAttempts)
{
//propagate the exception
throw;
}
else
{
//create a random delay in milliseconds
int randomDelayMilliseconds = random.Next(1000, 5000);
//sleep for the specified amount of milliseconds
System.Threading.Thread.Sleep(randomDelayMilliseconds);
}
}
}
//return the result
return result;
}

Related

What does the operationTimeout in ISessionClient.AcceptMessageSessionAsync actually do?

Context: I have some code that's creating a message session for a particular session, using
ISessionClient.Task<IMessageSession> AcceptMessageSessionAsync(string sessionId, TimeSpan operationTimeout);
Question: What does the operationTimeout in AcceptMessageSessionAsync do? I tried setting it to one minute but, after a minute, nothing happened. Does this timeout just set a property that I need to check myself? Shouldn't a SessionLockLostException fire?
Code Sample:
var session = await sessionClient.AcceptMessageSessionAsync(0, TimeSpan.FromMinutes(1));
var gotSession = true;
if (gotSession)
{
while (!session.IsClosedOrClosing)
{
try
{
Message message = await session.ReceiveAsync(TimeSpan.FromMinutes(2));
if (message != null)
{
await session.CompleteAsync(message.SystemProperties.LockToken);
}
else
{
await session.CloseAsync();
}
}
}
}
OperationTimeout in AcceptMessageSessionAsync is the amount of time the call should wait to fetch the next session.
Here is the complete implementation of the AcceptMessageSessionAsync method:
/// <summary>
/// Gets a particular session object identified by <paramref name="sessionId"/> that can be used to receive messages for that sessionId.
/// </summary>
/// <param name="sessionId">The sessionId present in all its messages.</param>
/// <param name="operationTimeout">Amount of time for which the call should wait to fetch the next session.</param>
/// <remarks>All plugins registered on <see cref="SessionClient"/> will be applied to each <see cref="MessageSession"/> that is accepted.
/// Individual sessions can further register additional plugins.</remarks>
public async Task<IMessageSession> AcceptMessageSessionAsync(string sessionId, TimeSpan operationTimeout)
{
this.ThrowIfClosed();
MessagingEventSource.Log.AmqpSessionClientAcceptMessageSessionStart(
this.ClientId,
this.EntityPath,
this.ReceiveMode,
this.PrefetchCount,
sessionId);
bool isDiagnosticSourceEnabled = ServiceBusDiagnosticSource.IsEnabled();
Activity activity = isDiagnosticSourceEnabled ? this.diagnosticSource.AcceptMessageSessionStart(sessionId) : null;
Task acceptMessageSessionTask = null;
var session = new MessageSession(
this.EntityPath,
this.EntityType,
this.ReceiveMode,
this.ServiceBusConnection,
this.CbsTokenProvider,
this.RetryPolicy,
this.PrefetchCount,
sessionId,
true);
try
{
acceptMessageSessionTask = this.RetryPolicy.RunOperation(
() => session.GetSessionReceiverLinkAsync(operationTimeout),
operationTimeout);
await acceptMessageSessionTask.ConfigureAwait(false);
}
catch (Exception exception)
{
if (isDiagnosticSourceEnabled)
{
this.diagnosticSource.ReportException(exception);
}
MessagingEventSource.Log.AmqpSessionClientAcceptMessageSessionException(
this.ClientId,
this.EntityPath,
exception);
await session.CloseAsync().ConfigureAwait(false);
throw AmqpExceptionHelper.GetClientException(exception);
}
finally
{
this.diagnosticSource.AcceptMessageSessionStop(activity, session.SessionId, acceptMessageSessionTask?.Status);
}
MessagingEventSource.Log.AmqpSessionClientAcceptMessageSessionStop(
this.ClientId,
this.EntityPath,
session.SessionIdInternal);
session.UpdateClientId(ClientEntity.GenerateClientId(nameof(MessageSession), $"{this.EntityPath}_{session.SessionId}"));
// Register plugins on the message session.
foreach (var serviceBusPlugin in this.RegisteredPlugins)
{
session.RegisterPlugin(serviceBusPlugin);
}
return session;
}
You can find the complete sample at the link below:
https://github.com/Azure/azure-service-bus-dotnet/blob/dev/src/Microsoft.Azure.ServiceBus/SessionClient.cs
Hope it helps.

How to pass through pages using selenium and c#

This is most likely a logic error in my code.
What I'm trying to do is go to the relevant page of my website (in this case, the linked page) and collect the data with my GDataPicker method.
Where I need help: I use the following code to check whether the Next button exists on the page and to collect the data from each page, but it always gives me the same error:
OpenQA.Selenium.StaleElementReferenceException: 'stale element reference: element is not attached to the page document
(Session info: chrome=58.0.3029.110)
(Driver info: chromedriver=2.30.477700 (0057494ad8732195794a7b32078424f92a5fce41),platform=Windows NT 10.0.15063 x86_64)'
I think this is probably because I don't update my NextButtonElement.
Code:
Boolean ElementDisplayed;
try
{
Gdriver.Navigate().GoToUrl("http://www.codigo-postal.pt/");
IWebElement searchInput1 = Gdriver.FindElement(By.Id("cp4"));
searchInput1.SendKeys("4710");//4730
IWebElement searchInput2 = Gdriver.FindElement(By.ClassName("cp3"));
searchInput2.SendKeys("");//324
searchInput2.SendKeys(OpenQA.Selenium.Keys.Enter);
IWebElement NextButtonElement = Gdriver.FindElement(By.XPath("/html/body/div[4]/div/div/div[2]/ul/li[13]/a"));
GDataPicker();
while (ElementDisplayed = NextButtonElement.Displayed)
{
GDataPicker();
Gdriver.Manage().Timeouts().ImplicitlyWait(TimeSpan.FromSeconds(2000));
NextButtonElement.SendKeys(OpenQA.Selenium.Keys.Enter);
}
}
catch (NoSuchElementException i)
{
ElementDisplayed = false;
GDataPicker();
}
I can't help you with C#; however, a StaleElementReferenceException occurs when the element reference you hold is no longer attached to the DOM, typically because the page changed and the element was replaced by an identical one. What I would do is catch that exception and find the element again:
catch (StaleElementReferenceException i)
{
IWebElement NextButtonElement = Gdriver.FindElement(By.XPath("/html/body/div[4]/div/div/div[2]/ul/li[13]/a"));
}
http://www.seleniumhq.org/exceptions/stale_element_reference.jsp
I would use ExpectedConditions.ElementToBeClickable with the dynamic wait feature selenium has.
var wait = new WebDriverWait(Gdriver, TimeSpan.FromSeconds(5));
IWebElement NextButtonElement = wait.Until(ExpectedConditions.ElementToBeClickable(By.XPath("/html/body/div[4]/div/div/div[2]/ul/li[13]/a")));
ExpectedConditions.ElementToBeClickable does exactly what you want it to do, wait a little bit until the element is displayed and not stale.
/// <summary>
/// An expectation for checking an element is visible and enabled such that you
/// can click it.
/// </summary>
/// <param name="locator">The locator used to find the element.</param>
/// <returns>The <see cref="IWebElement"/> once it is located and clickable (visible and enabled).</returns>
public static Func<IWebDriver, IWebElement> ElementToBeClickable(By locator)
{
return (driver) =>
{
var element = ElementIfVisible(driver.FindElement(locator));
try
{
if (element != null && element.Enabled)
{
return element;
}
else
{
return null;
}
}
catch (StaleElementReferenceException)
{
return null;
}
};
}
From https://github.com/SeleniumHQ/selenium/blob/master/dotnet/src/support/UI/ExpectedConditions.cs

Write to Windows Application Event Log without event source registration

Is there a way to write to this event log:
Or at least, some other Windows default log, where I don't have to register an event source?
Yes, there is a way to write to the event log you are looking for. You don't need to create a new source; simply use an existing one, which often has the same name as the event log itself and, in some cases such as the Application event log, is accessible without administrative privileges*.
*In other cases you cannot access the log directly; the Security event log, for example, is only accessible to the operating system.
I used this code to write directly to the event log Application:
using (EventLog eventLog = new EventLog("Application"))
{
eventLog.Source = "Application";
eventLog.WriteEntry("Log message example", EventLogEntryType.Information, 101, 1);
}
As you can see, the EventLog source is the same as the EventLog's name. The reason for this can be found in Event Sources # Windows Dev Center (I bolded the part which refers to the source name):
Each log in the Eventlog key contains subkeys called event sources. The event source is the name of the software that logs the event. It is often the name of the application or the name of a subcomponent of the application if the application is large. You can add a maximum of 16,384 event sources to the registry.
You can use the EventLog class, as explained in How to: Write to the Application Event Log (Visual C#):
var appLog = new EventLog("Application");
appLog.Source = "MySource";
appLog.WriteEntry("Test log message");
However, you'll need to configure this source "MySource" using administrative privileges:
Use WriteEvent and WriteEntry to write events to an event log. You must specify an event source to write events; you must create and configure the event source before writing the first entry with the source.
As stated in MSDN (e.g. https://msdn.microsoft.com/en-us/library/system.diagnostics.eventlog(v=vs.110).aspx ), checking for a non-existent source and creating a source both require admin privileges.
It is however possible to use the source "Application" without.
In my test under Windows 2012 Server r2, I however get the following log entry using "Application" source:
The description for Event ID xxxx from source Application cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.
If the event originated on another computer, the display information had to be saved with the event.
The following information was included with the event:
{my event entry message}
the message resource is present but the message is not found in the string/message table
I defined the following method to create the source:
private string CreateEventSource(string currentAppName)
{
string eventSource = currentAppName;
bool sourceExists;
try
{
// searching for the source throws a security exception ONLY if it does not exist!
sourceExists = EventLog.SourceExists(eventSource);
if (!sourceExists)
{ // no exception so far means the user has admin privileges
EventLog.CreateEventSource(eventSource, "Application");
}
}
catch (SecurityException)
{
eventSource = "Application";
}
return eventSource;
}
I am calling it with currentAppName = AppDomain.CurrentDomain.FriendlyName
It might be possible to use the EventLogPermission class instead of this try/catch but not sure we can avoid the catch.
It is also possible to create the source externally, e.g in elevated Powershell:
New-EventLog -LogName Application -Source MyApp
Then, using 'MyApp' in the method above will NOT generate exception and the EventLog can be created with that source.
This is the logger class that I use. The private Log() method has EventLog.WriteEntry() in it, which is how you actually write to the event log. I'm including all of this code here because it's handy. In addition to logging, this class will also make sure the message isn't too long to write to the event log (it will truncate the message). If the message was too long, you'd get an exception. The caller can also specify the source. If the caller doesn't, this class will get the source. Hope it helps.
By the way, you can get an ObjectDumper from the web. I didn't want to post all that here. I got mine from here: C:\Program Files (x86)\Microsoft Visual Studio 10.0\Samples\1033\CSharpSamples.zip\LinqSamples\ObjectDumper
using System;
using System.Diagnostics;
using System.Diagnostics.CodeAnalysis;
using System.Globalization;
using System.Linq;
using System.Reflection;
using Xanico.Core.Utilities;
namespace Xanico.Core
{
/// <summary>
/// Logging operations
/// </summary>
public static class Logger
{
// Note: The actual limit is higher than this, but different Microsoft operating systems actually have
// different limits. So just use 30,000 to be safe.
private const int MaxEventLogEntryLength = 30000;
/// <summary>
/// Gets or sets the source/caller. When logging, this logger class will attempt to get the
/// name of the executing/entry assembly and use that as the source when writing to a log.
/// In some cases, this class can't get the name of the executing assembly. This only seems
/// to happen though when the caller is in a separate domain created by its caller. So,
/// unless you're in that situation, there is no reason to set this. However, if there is
/// any reason that the source isn't being correctly logged, just set it here when your
/// process starts.
/// </summary>
public static string Source { get; set; }
/// <summary>
/// Logs the message, but only if debug logging is true.
/// </summary>
/// <param name="message">The message.</param>
/// <param name="debugLoggingEnabled">if set to <c>true</c> [debug logging enabled].</param>
/// <param name="source">The name of the app/process calling the logging method. If not provided,
/// an attempt will be made to get the name of the calling process.</param>
public static void LogDebug(string message, bool debugLoggingEnabled, string source = "")
{
if (debugLoggingEnabled == false) { return; }
Log(message, EventLogEntryType.Information, source);
}
/// <summary>
/// Logs the information.
/// </summary>
/// <param name="message">The message.</param>
/// <param name="source">The name of the app/process calling the logging method. If not provided,
/// an attempt will be made to get the name of the calling process.</param>
public static void LogInformation(string message, string source = "")
{
Log(message, EventLogEntryType.Information, source);
}
/// <summary>
/// Logs the warning.
/// </summary>
/// <param name="message">The message.</param>
/// <param name="source">The name of the app/process calling the logging method. If not provided,
/// an attempt will be made to get the name of the calling process.</param>
public static void LogWarning(string message, string source = "")
{
Log(message, EventLogEntryType.Warning, source);
}
/// <summary>
/// Logs the exception.
/// </summary>
/// <param name="ex">The ex.</param>
/// <param name="source">The name of the app/process calling the logging method. If not provided,
/// an attempt will be made to get the name of the calling process.</param>
public static void LogException(Exception ex, string source = "")
{
if (ex == null) { throw new ArgumentNullException("ex"); }
if (Environment.UserInteractive)
{
Console.WriteLine(ex.ToString());
}
Log(ex.ToString(), EventLogEntryType.Error, source);
}
/// <summary>
/// Recursively gets the properties and values of an object and dumps that to the log.
/// </summary>
/// <param name="theObject">The object to log</param>
[SuppressMessage("Microsoft.Globalization", "CA1303:Do not pass literals as localized parameters", MessageId = "Xanico.Core.Logger.Log(System.String,System.Diagnostics.EventLogEntryType,System.String)")]
[SuppressMessage("Microsoft.Naming", "CA1720:IdentifiersShouldNotContainTypeNames", MessageId = "object")]
public static void LogObjectDump(object theObject, string objectName, string source = "")
{
const int objectDepth = 5;
string objectDump = ObjectDumper.GetObjectDump(theObject, objectDepth);
string prefix = string.Format(CultureInfo.CurrentCulture,
"{0} object dump:{1}",
objectName,
Environment.NewLine);
Log(prefix + objectDump, EventLogEntryType.Warning, source);
}
private static void Log(string message, EventLogEntryType entryType, string source)
{
// Note: I got an error that the security log was inaccessible. To get around it, I ran the app as administrator
// just once, then I could run it from within VS.
if (string.IsNullOrWhiteSpace(source))
{
source = GetSource();
}
string possiblyTruncatedMessage = EnsureLogMessageLimit(message);
EventLog.WriteEntry(source, possiblyTruncatedMessage, entryType);
// If we're running a console app, also write the message to the console window.
if (Environment.UserInteractive)
{
Console.WriteLine(message);
}
}
private static string GetSource()
{
// If the caller has explicitly set a source value, just use it.
if (!string.IsNullOrWhiteSpace(Source)) { return Source; }
try
{
var assembly = Assembly.GetEntryAssembly();
// GetEntryAssembly() can return null when called in the context of a unit test project.
// That can also happen when called from an app hosted in IIS, or even a windows service.
if (assembly == null)
{
assembly = Assembly.GetExecutingAssembly();
}
if (assembly == null)
{
// From http://stackoverflow.com/a/14165787/279516:
assembly = new StackTrace().GetFrames().Last().GetMethod().Module.Assembly;
}
if (assembly == null) { return "Unknown"; }
return assembly.GetName().Name;
}
catch
{
return "Unknown";
}
}
// Ensures that the log message entry text length does not exceed the event log viewer maximum length of 32766 characters.
private static string EnsureLogMessageLimit(string logMessage)
{
if (logMessage.Length > MaxEventLogEntryLength)
{
string truncateWarningText = string.Format(CultureInfo.CurrentCulture, "... | Log Message Truncated [ Limit: {0} ]", MaxEventLogEntryLength);
// Set the message to the max minus enough room to add the truncate warning.
logMessage = logMessage.Substring(0, MaxEventLogEntryLength - truncateWarningText.Length);
logMessage = string.Format(CultureInfo.CurrentCulture, "{0}{1}", logMessage, truncateWarningText);
}
return logMessage;
}
}
}
Try:
System.Diagnostics.EventLog appLog = new System.Diagnostics.EventLog();
appLog.Source = "This Application's Name";
appLog.WriteEntry("An entry to the Application event log.");

What are some methods I can use to detect robots?

Just because software is automated doesn't mean it will abide by your robots.txt.
What are some methods available to detect when someone is crawling or DDOSing your website? Assume your site has 100s of 1000s of pages and is worth crawling or DDOSing.
Here's a dumb idea I had that probably doesn't work: give each user a cookie with a unique value, and use the cookie to know when someone is making second/third/etc requests. This probably doesn't work because crawlers probably don't accept cookies, and thus in this scheme a robot will look like a new user with each request.
Does anyone have better ideas?
You could put links in your pages that are not visible, or clickable by end-users. Many bots just follow all links. Once someone requests one of those links you almost certainly have a crawler/robot.
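A rough sketch of the hidden-link idea follows. The class name TrapLinkDetector and the trap path are made up for illustration, and identifying a client by IP address is an assumption. Listing the trap path as disallowed in robots.txt is a common companion step, so that well-behaved crawlers never request it.

```csharp
using System;
using System.Collections.Concurrent;

public class TrapLinkDetector
{
    private readonly string trapPath;

    // Clients that have requested the trap link, with the time they were flagged.
    private readonly ConcurrentDictionary<string, DateTime> flagged =
        new ConcurrentDictionary<string, DateTime>(StringComparer.Ordinal);

    public TrapLinkDetector(string trapPath)
    {
        if (string.IsNullOrEmpty(trapPath)) throw new ArgumentException("trapPath is required", nameof(trapPath));
        this.trapPath = trapPath;
    }

    // Call once per incoming request. Returns true if this client has ever
    // requested the trap link (including on this request).
    public bool Observe(string clientIp, string requestPath)
    {
        if (string.Equals(requestPath, trapPath, StringComparison.OrdinalIgnoreCase))
        {
            flagged[clientIp] = DateTime.UtcNow;
        }
        return flagged.ContainsKey(clientIp);
    }
}
```

The page itself would contain an anchor pointing at the trap path that is hidden from human visitors, for example via CSS. What to do with a flagged client (block, throttle, log) is left to the caller.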
Project Honey Pot keeps a list of 'bad' bots.
Here's a class I wrote to contact their web-service. You'll have to modify it some since I have a couple proprietary libs in it, but mostly it should be good to go. Sometimes their service sends back errors, but it does help cut down on some of the bad traffic.
using System;
using System.Linq;
using System.Net;
using System.Xml.Linq;
using SeaRisenLib2.Text;
using XmlLib;
/// <summary>
/// Summary description for HoneyPot
/// </summary>
public class HoneyPot
{
private const string KEY = "blacklistkey"; // blacklist key - need to register at httpbl.org to get it
private const string HTTPBL = "dnsbl.httpbl.org"; // blacklist lookup host
public HoneyPot()
{
}
public static Score GetScore_ByIP(string ip)
{
string sendMsg = "", receiveMsg = "";
int errorCount = 0; // track where in try/catch we fail for debugging
try
{
// for testing: ip = "188.143.232.31";
//ip = "173.242.116.72";
if ("127.0.0.1" == ip) return null; // localhost development computer
IPAddress address;
if (!IPAddress.TryParse(ip, out address))
throw new Exception("Invalid IP address to HoneyPot.GetScore_ByIP:" + ip);
errorCount++; // 1
string reverseIP = ip.ToArray('.').Reverse().ToStringCSV(".");
sendMsg = string.Format("{0}.{1}.{2}", KEY, reverseIP, HTTPBL);
errorCount++; // 2
//IPHostEntry value = Dns.GetHostByName(sendMsg);
IPHostEntry value = Dns.GetHostEntry(sendMsg);
errorCount++; // 3
address = value.AddressList[0];
errorCount++; // 4
receiveMsg = address.ToString();
errorCount++; // 5
int[] ipArray = receiveMsg.ToArray('.').Select(s => Convert.ToInt32(s)).ToArray();
errorCount++; // 6
if (127 != ipArray[0]) // error
throw new Exception("HoneyPot error");
errorCount++; // 7
Score score = new Score()
{
DaysSinceLastSeen = ipArray[1],
Threat = ipArray[2],
BotType = ipArray[3]
};
errorCount++; // 8
return score;
}
catch (Exception ex)
{
Log.Using("VisitorLog/HoneyPotErrors", log =>
{
log.SetString("IPrequest", ip);
log.SetString("SendMsg", sendMsg, XmlFile.ELEMENT);
log.SetString("RecvMsg", receiveMsg, XmlFile.ELEMENT);
log.SetString("Exception", ex.Message, XmlFile.ELEMENT);
log.SetString("ErrorCount", errorCount.ToString());
});
}
return null;
}
// Bitwise values
public enum BotTypeEnum : int
{
SearchEngine = 0,
Suspicious = 1,
Harvester = 2,
CommentSpammer = 4
}
public class Score
{
public Score()
{
BotType = -1;
DaysSinceLastSeen = -1;
Threat = -1;
}
public int DaysSinceLastSeen { get; internal set; }
public int Threat { get; internal set; }
/// <summary>
/// Use BotTypeEnum to understand value.
/// </summary>
public int BotType { get; internal set; }
/// <summary>
/// Convert HoneyPot Score values to String (DaysSinceLastSeen.Threat.BotType)
/// </summary>
/// <returns></returns>
public override string ToString()
{
return string.Format("{0}.{1}.{2}",
DaysSinceLastSeen,
Threat,
BotType);
}
public static explicit operator XElement(Score score)
{
XElement xpot = new XElement("HoneyPot");
if (null != score)
{
if (score.DaysSinceLastSeen >= 0)
xpot.SetString("Days", score.DaysSinceLastSeen);
if (score.Threat >= 0)
xpot.SetString("Threat", score.Threat);
if (score.BotType >= 0)
xpot.SetString("Type", score.BotType);
foreach (BotTypeEnum t in Enum.GetValues(typeof(BotTypeEnum)))
{
// Log enum values as string for each bitwise value represented in score.BotType
int value = (int)t;
if ((value == score.BotType) || ((value & score.BotType) > 0))
xpot.GetCategory(t.ToString());
}
}
return xpot;
}
public static explicit operator Score(XElement xpot)
{
Score score = null;
if (null != xpot)
score = new Score()
{
DaysSinceLastSeen = xpot.GetInt("Days"),
Threat = xpot.GetInt("Threat"),
BotType = xpot.GetInt("Type")
};
return score;
}
}
/// <summary>
/// Log score value to HoneyPot child Element (if score not null).
/// </summary>
/// <param name="score"></param>
/// <param name="parent"></param>
public static void LogScore(HoneyPot.Score score, XElement parent)
{
if ((null != score) && (null != parent))
{
parent.Add((XElement)score);
}
}
}
Though technically it would not "detect" bot crawlers, I have an interesting way to block them. My method is to create an IIS filter or Apache plug-in. What you would do is encrypt all your html, asp, php, etc. pages. The only page not encrypted would be the index page. The index page simply installs a cookie with an encrypted public key, then redirects to a second index page. The IIS filter or Apache plug-in would then check each visitor to make sure they have this cookie. If they do, the filter would decrypt the requested page, then pass the page on to the web server for processing.
This method would allow a normal visitor to view your web pages, but if a bot that rejects cookies tried to read your pages, they would all be encrypted.
A blacklist is probably not a good way of doing it, it would be better to have a whitelist of known bots that are allowed to make over a certain amount of hits per second. If someone not on that whitelist makes too many hits per second, start dropping their connections for a few seconds. This will help prevent ddosing, and still let unknown bots scan your site (although a lot slower than the ones you think matter).
You can keep a log of the offenders and see who are breaking the rules repeatedly :)
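The whitelist-plus-throttle idea could be sketched roughly like this. HitRateLimiter, its parameters, and keying clients by a string such as an IP address are all assumptions for illustration; the clock is passed in so the behaviour is easy to test.

```csharp
using System;
using System.Collections.Generic;

public class HitRateLimiter
{
    private class ClientState
    {
        public DateTime WindowStart;
        public int Hits;
        public DateTime BlockedUntil; // defaults to DateTime.MinValue (not blocked)
    }

    private readonly int maxHitsPerSecond;
    private readonly TimeSpan penalty;
    private readonly HashSet<string> whitelist;
    private readonly Dictionary<string, ClientState> clients = new Dictionary<string, ClientState>();
    private readonly object sync = new object();

    public HitRateLimiter(int maxHitsPerSecond, TimeSpan penalty, IEnumerable<string> whitelist)
    {
        this.maxHitsPerSecond = maxHitsPerSecond;
        this.penalty = penalty;
        this.whitelist = new HashSet<string>(whitelist, StringComparer.Ordinal);
    }

    // Returns true if the request should be served, false if it should be dropped.
    public bool Allow(string client, DateTime now)
    {
        // Whitelisted bots are never throttled.
        if (whitelist.Contains(client)) return true;

        lock (sync)
        {
            ClientState state;
            if (!clients.TryGetValue(client, out state))
            {
                state = new ClientState { WindowStart = now };
                clients[client] = state; // note: entries are never evicted in this sketch
            }

            // Still serving a penalty: drop the request.
            if (now < state.BlockedUntil) return false;

            // Start a fresh one-second counting window when the old one has expired.
            if (now - state.WindowStart >= TimeSpan.FromSeconds(1))
            {
                state.WindowStart = now;
                state.Hits = 0;
            }

            state.Hits++;
            if (state.Hits > maxHitsPerSecond)
            {
                state.BlockedUntil = now + penalty;
                return false;
            }
            return true;
        }
    }
}
```

A production version would also need to evict idle clients, and could log each time a client is blocked to build the list of repeat offenders mentioned above.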

Better solution to multithreading riddle?

Here's the task: I need to lock based on a filename. There can be up to a million different filenames. (This is used for large-scale disk-based caching).
I want low memory usage and low lookup times, which means I need a GC'd lock dictionary. (Only in-use locks can be present in the dict).
The callback action can take minutes to complete, so a global lock is unacceptable. High throughput is critical.
I've posted my current solution below, but I'm unhappy with the complexity.
EDIT: Please do not post solutions that are not 100% correct. For example, a solution which permits a lock to be removed from the dictionary between the 'get lock object' phase and the 'lock' phase is NOT correct, whether or not it is an 'accepted' design pattern or not.
Is there a more elegant solution than this?
Thanks!
[EDIT: I updated my code to use looping vs. recursion based on RobV's suggestion]
[EDIT: Updated the code again to allow 'timeouts' and a simpler calling pattern. This will probably be the final code I use. Still the same basic algorithm as in the original post.]
[EDIT: Updated code again to deal with exceptions inside callback without orphaning lock objects]
public delegate void LockCallback();
/// <summary>
/// Provides locking based on a string key.
/// Locks are local to the LockProvider instance.
/// The class handles disposing of unused locks. Generally used for
/// coordinating writes to files (of which there can be millions).
/// Only keeps key/lock pairs in memory which are in use.
/// Thread-safe.
/// </summary>
public class LockProvider {
/// <summary>
/// The only objects in this collection should be for open files.
/// </summary>
protected Dictionary<String, Object> locks =
new Dictionary<string, object>(StringComparer.Ordinal);
/// <summary>
/// Synchronization object for modifications to the 'locks' dictionary
/// </summary>
protected object createLock = new object();
/// <summary>
/// Attempts to execute the 'success' callback inside a lock based on 'key'. If successful, returns true.
/// If the lock cannot be acquired within 'timeoutMs', returns false.
/// In a worst-case scenario, it could take up to twice as long as 'timeoutMs' to return false.
/// </summary>
/// <param name="key"></param>
/// <param name="timeoutMs"></param>
/// <param name="success"></param>
public bool TryExecute(string key, int timeoutMs, LockCallback success){
//Record when we started. We don't want an infinite loop.
DateTime startedAt = DateTime.UtcNow;
// Tracks whether the lock acquired is still correct
bool validLock = true;
// The lock corresponding to 'key'
object itemLock = null;
try {
//We have to loop until we get a valid lock and it stays valid until we lock it.
do {
// 1) Creation/acquire phase
lock (createLock) {
// We have to lock on dictionary writes, since otherwise
// two locks for the same file could be created and assigned
// at the same time. (i.e, between TryGetValue and the assignment)
if (!locks.TryGetValue(key, out itemLock))
locks[key] = itemLock = new Object(); //make a new lock!
}
// Loophole (part 1):
// Right here - this is where another thread (executing part 2) could remove 'itemLock'
// from the dictionary, and potentially, yet another thread could
// insert a new value for 'itemLock' into the dictionary... etc, etc..
// 2) Execute phase
if (System.Threading.Monitor.TryEnter(itemLock, timeoutMs)) {
try {
// May take minutes to acquire this lock.
// Trying to detect an occurrence of loophole above
// Check that itemLock still exists and matches the dictionary
lock (createLock) {
object newLock = null;
validLock = locks.TryGetValue(key, out newLock);
validLock = validLock && newLock == itemLock;
}
// Only run the callback if the lock is valid
if (validLock) {
success(); // Extremely long-running callback, perhaps throwing exceptions
return true;
}
} finally {
System.Threading.Monitor.Exit(itemLock);//release lock
}
} else {
validLock = false; //So the finally clause doesn't try to clean up the lock, someone else will do that.
return false; //Someone else had the lock, they can clean it up.
}
//Are we out of time, still having an invalid lock?
if (!validLock && Math.Abs(DateTime.UtcNow.Subtract(startedAt).TotalMilliseconds) > timeoutMs) {
//We failed to get a valid lock in time.
return false;
}
// If we had an invalid lock, we have to try everything over again.
} while (!validLock);
} finally {
if (validLock) {
// Loophole (part 2). When loophole part 1 and 2 cross paths,
// a lock object may be removed before being used, and be orphaned
// 3) Cleanup phase - Attempt cleanup of lock objects so we don't
// have a *very* large and slow dictionary.
lock (createLock) {
// TryEnter() fails instead of waiting.
// A normal lock would cause a deadlock with phase 2.
// Specifying a timeout would add great and pointless overhead.
// Whoever has the lock will clean it up also.
if (System.Threading.Monitor.TryEnter(itemLock)) {
try {
// It succeeds, so no-one else is working on it
// (but may be preparing to, see loophole)
// Only remove the lock object if it
// still exists in the dictionary as-is
object existingLock = null;
if (locks.TryGetValue(key, out existingLock)
&& existingLock == itemLock)
locks.Remove(key);
} finally {
// Release the lock
System.Threading.Monitor.Exit(itemLock);
}
}
}
}
}
// Ideally the only objects in 'locks' will be open operations now.
return true;
}
}
Usage example
LockProvider p = new LockProvider();
bool success = p.TryExecute("filename",1000,delegate(){
//This code executes within the lock
});
Depending on what you are doing with the files (you mention disk-based caching, so I assume reads as well as writes), I would suggest trying something based on ReaderWriterLock. If you can upgrade to .NET 3.5, use ReaderWriterLockSlim instead, as it performs much better.
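A minimal sketch of the ReaderWriterLockSlim idea, assuming a single lock guards the whole cache directory (per-file locking would still need a key-to-lock map like the one in the question). The class name CacheFileAccess and its method names are made up for illustration.

```csharp
using System;
using System.IO;
using System.Threading;

// Hypothetical helper: many concurrent readers, one exclusive writer.
public class CacheFileAccess
{
    private readonly ReaderWriterLockSlim rwLock = new ReaderWriterLockSlim();

    // Returns null if the read lock could not be taken within timeoutMs.
    public byte[] TryRead(string path, int timeoutMs)
    {
        if (!rwLock.TryEnterReadLock(timeoutMs)) return null;
        try
        {
            return File.ReadAllBytes(path);
        }
        finally
        {
            rwLock.ExitReadLock();
        }
    }

    // Returns false if the write lock could not be taken within timeoutMs.
    public bool TryWrite(string path, byte[] data, int timeoutMs)
    {
        if (!rwLock.TryEnterWriteLock(timeoutMs)) return false;
        try
        {
            File.WriteAllBytes(path, data);
            return true;
        }
        finally
        {
            rwLock.ExitWriteLock();
        }
    }
}
```

Readers do not block each other here, which is the main gain over a plain lock when cache hits far outnumber writes.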
As a general step towards eliminating the potentially endless recursion in your example, change the first part of the code to the following:
do
{
// 1) Creation/acquire phase
lock (createLock){
// We have to lock on dictionary writes, since otherwise
// two locks for the same file could be created and assigned
// at the same time. (i.e, between TryGetValue and the assignment)
if (!locks.TryGetValue(key, out itemLock))
locks[key] = itemLock = new Object(); //make a new lock!
}
// Loophole (part 1):
// Right here - this is where another thread could remove 'itemLock'
// from the dictionary, and potentially, yet another thread could
// insert a new value for 'itemLock' into the dictionary... etc, etc..
// 2) Execute phase
lock(itemLock){
// May take minutes to acquire this lock.
// Real version would specify a timeout and a failure callback.
// Trying to detect an occurrence of the loophole above
// Check that itemLock still exists and matches the dictionary
lock(createLock){
object newLock = null;
validLock = locks.TryGetValue(key, out newLock);
validLock = validLock && newLock == itemLock;
}
// Only run the callback if the lock is valid
if (validLock) callback(); // Extremely long-running callback.
}
// If we had an invalid lock, we have to try everything over again.
} while (!validLock);
This replaces your recursion with a loop, which avoids any chance of a StackOverflowException caused by endless recursion.
That solution looks brittle and complex. Running public callbacks inside locks is bad practice. Why not let LockProvider return some sort of 'lock' object, so that consumers take the lock themselves? This separates the locking of the locks dictionary from the execution. It might look like this:
public class LockProvider
{
private readonly object globalLock = new object();
private readonly Dictionary<String, Locker> locks =
new Dictionary<string, Locker>(StringComparer.Ordinal);
public IDisposable Enter(string key)
{
Locker locker;
lock (this.globalLock)
{
if (!this.locks.TryGetValue(key, out locker))
{
this.locks[key] = locker = new Locker(this, key);
}
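// Increase wait count inside the global lock.
Interlocked.Increment(ref locker.WaitCount);
}
// Call Enter and decrease wait count outside the
// global lock (to prevent deadlocks).
locker.Enter();
// Only one thread holds a given locker at a time, but this decrement
// can still race with an increment made under the global lock by
// another thread, so both updates go through Interlocked.
Interlocked.Decrement(ref locker.WaitCount);
return locker;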
}
private sealed class Locker : IDisposable
{
private readonly LockProvider provider;
private readonly string key;
private object keyLock = new object();
public int WaitCount;
public Locker(LockProvider provider, string key)
{
this.provider = provider;
this.key = key;
}
public void Enter()
{
Monitor.Enter(this.keyLock);
}
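public void Dispose()
{
// This Locker is shared by every thread waiting on the same key,
// so keyLock must not be set to null here: the next holder would
// then skip its own Exit() and never release the monitor.
this.Exit();
}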
private void Exit()
{
lock (this.provider.globalLock)
{
try
{
// Remove the key before releasing the lock, but
// only when no threads are waiting (because they
// will have a reference to this locker).
if (this.WaitCount == 0)
{
this.provider.locks.Remove(this.key);
}
}
finally
{
// Release the keyLock inside the globalLock.
Monitor.Exit(this.keyLock);
}
}
}
}
}
And the LockProvider can be used as follows:
public class Consumer
{
private readonly LockProvider provider = new LockProvider();
public void DoStufOnFile(string fileName)
{
using (this.provider.Enter(fileName))
{
// Long running operation on file here.
}
}
}
Note that Monitor.Enter is called before we enter the try statement (the using block), which means that in certain host environments (such as ASP.NET and SQL Server) locks may never be released when an asynchronous exception happens. Hosts like ASP.NET and SQL Server aggressively kill threads when timeouts occur. Rewriting this so that Monitor.Enter is called inside the try is a bit tricky, though.
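For reference, .NET 4 added a Monitor.Enter(object, ref bool) overload that is intended for exactly this: the acquire sits inside the try and the flag records whether the lock was really taken. The sketch below shows only that primitive; the SafeLock.Run helper is hypothetical, and wiring the flag through the Locker class above is not shown.

```csharp
using System;
using System.Threading;

public static class SafeLock
{
    // Hypothetical helper: runs 'action' while holding 'sync'.
    public static void Run(object sync, Action action)
    {
        bool lockTaken = false;
        try
        {
            // lockTaken is set to true only if the lock was acquired,
            // so the finally block never exits a lock it does not hold.
            Monitor.Enter(sync, ref lockTaken);
            action();
        }
        finally
        {
            if (lockTaken) Monitor.Exit(sync);
        }
    }
}
```

This is the same shape the C# 4 compiler generates for a lock statement.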
I hope this helps.
Could you not simply use a named Mutex, with the name derived from your filename?
Although not a lightweight synchronization primitive, it's simpler than managing your own synchronized dictionary.
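A rough sketch of the named Mutex approach, under a few assumptions: the mutex name is derived by hashing the path (so it contains no backslashes and stays short), and the "DiskCache_" prefix, FileMutex class and method names are invented for the example. Paths that differ only in case would get different names here, so normalize the path first if that matters on your file system.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;
using System.Threading;

public static class FileMutex
{
    // Hypothetical naming scheme: hash the path into a fixed-length name.
    public static string NameFor(string path)
    {
        using (SHA256 sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(path));
            return "DiskCache_" + BitConverter.ToString(hash).Replace("-", "");
        }
    }

    // Returns false if the mutex could not be acquired within timeoutMs.
    public static bool TryExecute(string path, int timeoutMs, Action action)
    {
        using (Mutex mutex = new Mutex(false, NameFor(path)))
        {
            bool acquired;
            try
            {
                acquired = mutex.WaitOne(timeoutMs);
            }
            catch (AbandonedMutexException)
            {
                // A previous owner exited without releasing; we own it now.
                acquired = true;
            }
            if (!acquired) return false;
            try
            {
                action();
                return true;
            }
            finally
            {
                // Must be called on the thread that acquired the mutex.
                mutex.ReleaseMutex();
            }
        }
    }
}
```

The operating system discards the mutex once the last handle is closed, so there is no dictionary to clean up.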
However, if you really do want to do it this way, I'd have thought the following implementation looks simpler. You need a synchronized dictionary: either the .NET 4 ConcurrentDictionary or your own implementation if you're on .NET 3.5 or lower.
object myLock = new object();
lock(myLock)
{
object otherLock = null;
while(otherLock != myLock)
{
// Returns the existing lock for 'key', or adds and returns myLock
otherLock = lockDictionary.GetOrAdd(key, myLock);
if (otherLock != myLock)
{
// Another thread has a lock in the dictionary; wait for it
if (!Monitor.TryEnter(otherLock, timeoutMs))
{
// Another thread still has a lock after a timeout
failure();
return;
}
else
{
// The other thread finished; release and try to add ours again
Monitor.Exit(otherLock);
}
}
}
// We've successfully added myLock to the dictionary
try
{
// Do our stuff
success();
}
finally
{
// ConcurrentDictionary has no Remove; use TryRemove instead
object removed;
lockDictionary.TryRemove(key, out removed);
}
}
There doesn't seem to be an elegant way to do this in .NET, although I have improved the algorithm thanks to @RobV's suggestion of a loop. Here is the final solution I settled on.
It is immune to the 'orphaned reference' bug that seems to be typical of the standard pattern followed by @Steven's answer.
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading;
namespace ImageResizer.Plugins.DiskCache {
public delegate void LockCallback();
/// <summary>
/// Provides locking based on a string key.
/// Locks are local to the LockProvider instance.
/// The class handles disposing of unused locks. Generally used for
/// coordinating writes to files (of which there can be millions).
/// Only keeps key/lock pairs in memory which are in use.
/// Thread-safe.
/// </summary>
public class LockProvider {
/// <summary>
/// The only objects in this collection should be for open files.
/// </summary>
protected Dictionary<String, Object> locks =
new Dictionary<string, object>(StringComparer.Ordinal);
/// <summary>
/// Synchronization object for modifications to the 'locks' dictionary
/// </summary>
protected object createLock = new object();
/// <summary>
/// Attempts to execute the 'success' callback inside a lock based on 'key'. If successful, returns true.
/// If the lock cannot be acquired within 'timeoutMs', returns false.
/// In a worst-case scenario, it could take up to twice as long as 'timeoutMs' to return false.
/// </summary>
/// <param name="key">The string (for example a file name) to lock on.</param>
/// <param name="timeoutMs">How long to wait for the lock, in milliseconds.</param>
/// <param name="success">The callback to run while the lock is held.</param>
public bool TryExecute(string key, int timeoutMs, LockCallback success){
//Record when we started. We don't want an infinite loop.
DateTime startedAt = DateTime.UtcNow;
// Tracks whether the lock acquired is still correct
bool validLock = true;
// The lock corresponding to 'key'
object itemLock = null;
try {
//We have to loop until we get a valid lock and it stays valid until we lock it.
do {
// 1) Creation/acquire phase
lock (createLock) {
// We have to lock on dictionary writes, since otherwise
// two locks for the same file could be created and assigned
// at the same time. (i.e, between TryGetValue and the assignment)
if (!locks.TryGetValue(key, out itemLock))
locks[key] = itemLock = new Object(); //make a new lock!
}
// Loophole (part 1):
// Right here - this is where another thread (executing part 2) could remove 'itemLock'
// from the dictionary, and potentially, yet another thread could
// insert a new value for 'itemLock' into the dictionary... etc, etc..
// 2) Execute phase
if (System.Threading.Monitor.TryEnter(itemLock, timeoutMs)) {
try {
// May take minutes to acquire this lock.
// Trying to detect an occurrence of the loophole above
// Check that itemLock still exists and matches the dictionary
lock (createLock) {
object newLock = null;
validLock = locks.TryGetValue(key, out newLock);
validLock = validLock && newLock == itemLock;
}
// Only run the callback if the lock is valid
if (validLock) {
success(); // Extremely long-running callback, perhaps throwing exceptions
return true;
}
} finally {
System.Threading.Monitor.Exit(itemLock);//release lock
}
} else {
validLock = false; //So the finally clause doesn't try to clean up the lock, someone else will do that.
return false; //Someone else had the lock, they can clean it up.
}
//Are we out of time, still having an invalid lock?
if (!validLock && Math.Abs(DateTime.UtcNow.Subtract(startedAt).TotalMilliseconds) > timeoutMs) {
//We failed to get a valid lock in time.
return false;
}
// If we had an invalid lock, we have to try everything over again.
} while (!validLock);
} finally {
if (validLock) {
// Loophole (part 2). When loophole part 1 and 2 cross paths,
// a lock object may be removed before being used, and be orphaned.
// 3) Cleanup phase - Attempt cleanup of lock objects so we don't
// have a *very* large and slow dictionary.
lock (createLock) {
// TryEnter() fails instead of waiting.
// A normal lock would cause a deadlock with phase 2.
// Specifying a timeout would add great and pointless overhead.
// Whoever has the lock will clean it up also.
if (System.Threading.Monitor.TryEnter(itemLock)) {
try {
// It succeeds, so no-one else is working on it
// (but may be preparing to, see loophole)
// Only remove the lock object if it
// still exists in the dictionary as-is
object existingLock = null;
if (locks.TryGetValue(key, out existingLock)
&& existingLock == itemLock)
locks.Remove(key);
} finally {
// Release the lock
System.Threading.Monitor.Exit(itemLock);
}
}
}
}
}
// Ideally the only objects in 'locks' will be open operations now.
return true;
}
}
}
Consuming this code is very simple:
LockProvider p = new LockProvider();
bool success = p.TryExecute("filename",1000,delegate(){
//This code executes within the lock
});
