Profiling C# application without using external tools

I have a C# (dotnet core) application that gets input from the queue and processes it. The processing time is determined by many factors. One of them is the request itself.
My code looks something like this:
public static void Main()
{
DoWork(new TaskInformation("abc"));
}
public static void DoWork(TaskInformation input)
{
InnerWork1(input);
InnerWork2(input);
}
private static void InnerWork1(TaskInformation input)
{
// Running code here.
Thread.Sleep(1000);
}
private static void InnerWork2(TaskInformation input)
{
// Running code here.
Thread.Sleep(2000);
}
In order to improve the performance of my application, I want to develop a small tool that will run and measure execution time. If the time is above some threshold, it will do something.
So, what I want to do is "wrap" my inner functions (InnerWork1/2) with a Stopwatch automatically, without adding the timing code to each function by hand.
The way I'll turn on this profiling feature should be something like:
public static void Main()
{
RunWithProfiling(DoWork, new TaskInformation("abc"));
}
So, in this case the expected profiling result will be:
DoWork - took 3 seconds
InnerWork1 - took 1 second
InnerWork2 - took 2 seconds
Questions:
Is it possible to implement a mechanism like this? Maybe with reflection?
Is it possible to run a C# profiler from code? Something like "the code will profile itself".

You can use a delegate to pass your method to a RunWithProfiling method and then measure its invocation time with a Stopwatch. In my example I will use Action:
private static void RunWithProfiling(Action method)
{
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
method.Invoke();
stopwatch.Stop();
// TotalSeconds is the whole elapsed time; Seconds is only the 0-59 seconds component.
Console.WriteLine($"{method.Method.Name} - took {stopwatch.Elapsed.TotalSeconds} second(s).");
}
and usage
public static void DoWork()
{
RunWithProfiling(InnerWork1);
RunWithProfiling(InnerWork2);
}
You can, for example, return a long (the elapsed milliseconds) from the RunWithProfiling method and then sum the results to get the whole execution time.
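A minimal sketch of that idea follows. The Profiler and Worker class names are hypothetical, and the short Thread.Sleep calls only stand in for real work; note that the total is the sum of the inner timings, so it excludes any time DoWork spends outside the wrapped calls.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class Profiler
{
    // Hypothetical variant of RunWithProfiling: runs the method, prints its timing,
    // and returns the elapsed milliseconds so the caller can add the results up.
    public static long RunWithProfiling(Action method)
    {
        Stopwatch stopwatch = Stopwatch.StartNew();
        method();
        stopwatch.Stop();
        Console.WriteLine($"{method.Method.Name} - took {stopwatch.Elapsed.TotalSeconds:F3} second(s).");
        return stopwatch.ElapsedMilliseconds;
    }
}

public static class Worker
{
    // Sums the inner timings to approximate the time spent in DoWork.
    public static long DoWork()
    {
        long totalMs = 0;
        totalMs += Profiler.RunWithProfiling(InnerWork1);
        totalMs += Profiler.RunWithProfiling(InnerWork2);
        Console.WriteLine($"DoWork - took {totalMs / 1000.0:F3} second(s).");
        return totalMs;
    }

    // Placeholder workloads (assumed for illustration).
    private static void InnerWork1() { Thread.Sleep(100); }
    private static void InnerWork2() { Thread.Sleep(200); }
}
```

Calling Worker.DoWork() prints one line per inner method and then the summed total.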

Related

C# - Total Seconds expended since beginning of app use - data type [duplicate]

I am porting an app from ActionScript 3.0 (Flex) to C# (WPF). AS3.0 has a handy utility called getTimer() which returns the time since the Flash virtual machine started, in milliseconds. I searched in C# through classes such as
DateTime
DispatcherTimer
System.Diagnostics.Process
System.Diagnostics.Stopwatch
but found nothing like this. It seems a very basic feature to me. For example, Unity3D, which runs on Mono, has something similar. Am I missing some utility here?
Thanks in advance.
Process.GetCurrentProcess().StartTime is your friend.
So, to get the elapsed time since start:
DateTime.UtcNow - Process.GetCurrentProcess().StartTime.ToUniversalTime()
Alternatively, if you need higher resolution, System.Diagnostics.Stopwatch might be preferable. If so, start a stopwatch when your app starts:
Stopwatch sw = Stopwatch.StartNew();
then query the sw.Elapsed property during your execution run.
public static class Runtime
{
static Runtime()
{
var ThisProcess = System.Diagnostics.Process.GetCurrentProcess();
LastSystemTime = (long)(System.DateTime.Now - ThisProcess.StartTime).TotalMilliseconds;
ThisProcess.Dispose();
StopWatch = new System.Diagnostics.Stopwatch();
StopWatch.Start();
}
private static long LastSystemTime;
private static System.Diagnostics.Stopwatch StopWatch;
//Public.
public static long CurrentRuntime { get { return StopWatch.ElapsedMilliseconds + LastSystemTime; } }
}
Then call Runtime.CurrentRuntime to get the current program's runtime in milliseconds.
Note: You can replace TotalMilliseconds/ElapsedMilliseconds with any other time metric you need.

C# are field reads guaranteed to be reliable (fresh) when using multithreading?

Background
My colleague thinks reads in multithreaded C# are reliable and will always give you the current, fresh value of a field, but I've always used locks because I was sure I'd experienced problems otherwise.
I spent some time googling and reading articles, but I mustn't be able to provide google with correct search input, because I didn't find exactly what I was after.
So I wrote the below program without locks in an attempt to prove why that's bad.
Question
Assuming the below is a valid test, the results show that the reads aren't reliable/fresh.
Can someone explain what this is caused by? (reordering, staleness or something else)?
And link me to official Microsoft documentation/section explaining why this happens and what is the recommended solution?
If the below isn't a valid test, what would be?
Program
There are two threads: one calls SetA and the other calls SetB. If the reads are unreliable without locks, then intermittently Foo's field "c" will be false.
using System;
using System.Threading.Tasks;
namespace SetASetBTestAB
{
class Program
{
class Foo
{
public bool a;
public bool b;
public bool c;
public void SetA()
{
a = true;
TestAB();
}
public void SetB()
{
b = true;
TestAB();
}
public void TestAB()
{
if (a && b)
{
c = true;
}
}
}
static void Main(string[] args)
{
int timesCWasFalse = 0;
for (int i = 0; i < 100000; i++)
{
var f = new Foo();
var t1 = Task.Run(() => f.SetA());
var t2 = Task.Run(() => f.SetB());
Task.WaitAll(t1, t2);
if (!f.c)
{
timesCWasFalse++;
}
}
Console.WriteLine($"timesCWasFalse: {timesCWasFalse}");
Console.WriteLine("Finished. Press Enter to exit");
Console.ReadLine();
}
}
}
Output
Release mode. Intel Core i7 6700HQ:
Run 1: timesCWasFalse: 8
Run 2: timesCWasFalse: 10
Of course it is not fresh. The average CPU nowadays has three levels of cache between each core's registers and the RAM, and it can take quite some time for a write to one cache to propagate to all of them.
And then there is the JIT compiler. Part of its job is dead code detection, and one of the first things it will do is cut out "useless" variables. For example, this code tries to force an OutOfMemoryException by running into the 2 GiB limit on 32-bit systems:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace OOM_32_forced
{
class Program
{
static void Main(string[] args)
{
//each short is 2 bytes, Int32.MaxValue is 2^31 - 1.
//So this will require just under 2^32 bytes (4 GiB), well over the 2 GiB limit
short[] Array = new short[Int32.MaxValue];
/*need to actually access that array
Otherwise JIT compiler and optimisations will just skip
the array definition and creation */
foreach (short value in Array)
Console.WriteLine(value);
}
}
}
The thing is that if you cut out the output code, there is a decent chance that the JIT will remove the variable Array, including the instruction that instantiates it. The JIT has a decent chance of reducing this program to doing nothing at all at runtime.
volatile first of all prevents the JIT from doing any optimisations on that value. And it might even have some effect on how the CPU processes it.
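The question mentions guarding shared fields with locks. A minimal sketch of that approach, assuming a hypothetical LockedFoo class that mirrors the question's Foo, could look like the following. Because each write and the check that follows it run under the same lock, whichever thread enters second is guaranteed to see the other thread's write.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical lock-protected counterpart of the question's Foo class.
public class LockedFoo
{
    private readonly object _gate = new object();
    private bool _a;
    private bool _b;
    private bool _c;

    public bool C
    {
        get { lock (_gate) { return _c; } }
    }

    public void SetA()
    {
        lock (_gate) { _a = true; TestAB(); }
    }

    public void SetB()
    {
        lock (_gate) { _b = true; TestAB(); }
    }

    // Only called while _gate is held.
    private void TestAB()
    {
        if (_a && _b) { _c = true; }
    }
}

public static class LockedFooDemo
{
    // Repeats the question's experiment and counts the runs where c stayed false.
    public static int CountFailures(int iterations)
    {
        int timesCWasFalse = 0;
        for (int i = 0; i < iterations; i++)
        {
            var f = new LockedFoo();
            Task t1 = Task.Run(() => f.SetA());
            Task t2 = Task.Run(() => f.SetB());
            Task.WaitAll(t1, t2);
            if (!f.C) { timesCWasFalse++; }
        }
        return timesCWasFalse;
    }
}
```

With the lock in place, LockedFooDemo.CountFailures(100000) should return 0.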

System Uptime & MemoryBarrier

I need a robust way of getting system uptime, and ended up using something as follows.
Added some comments to help people read it. I cannot use Tasks as this has to run in a .NET 3.5 application.
// This is a structure, can't be marked as volatile
// need to implement MemoryBarrier manually as appropriate
private static TimeSpan _uptime;
private static TimeSpan GetUptime()
{
// Try and set the Uptime using perf counters
var uptimeThread = new Thread(GetPerformanceCounterUptime);
uptimeThread.Start();
// If our thread hasn't finished in 5 seconds, perf counters are broken
if (!uptimeThread.Join(5 * 1000))
{
// Kill the thread and use Environment.TickCount
uptimeThread.Abort();
_uptime = TimeSpan.FromMilliseconds(
Environment.TickCount & Int32.MaxValue);
}
Thread.MemoryBarrier();
return _uptime;
}
// This sets the System uptime using the perf counters
// this gives the best result but on a system with corrupt perf counters
// it can freeze
private static void GetPerformanceCounterUptime()
{
using (var uptime = new PerformanceCounter("System", "System Up Time"))
{
uptime.NextValue();
_uptime = TimeSpan.FromSeconds(uptime.NextValue());
}
}
The part I am struggling with is where should Thread.MemoryBarrier() be placed?
I am placing it before reading the value, but either the current thread or a different thread could have written to it. Does the above look correct?
Edit, Answer based on Daniel
This is what I ended up implementing, thank you both for the insight.
private static TimeSpan _uptime;
private static TimeSpan GetUptime()
{
var uptimeThread = new Thread(GetPerformanceCounterUptime);
uptimeThread.Start();
if (uptimeThread.Join(5*1000))
{
return _uptime;
}
else
{
uptimeThread.Abort();
return TimeSpan.FromMilliseconds(
Environment.TickCount & Int32.MaxValue);
}
}
private static void GetPerformanceCounterUptime()
{
using (var uptime = new PerformanceCounter("System", "System Up Time"))
{
uptime.NextValue();
_uptime = TimeSpan.FromSeconds(uptime.NextValue());
}
}
Edit 2
Updated based on Bob's comments.
private static DateTimeOffset _uptime;
private static DateTimeOffset GetUptime()
{
var uptimeThread = new Thread(GetPerformanceCounterUptime);
uptimeThread.Start();
if (uptimeThread.Join(5*1000))
{
return _uptime;
}
else
{
uptimeThread.Abort();
return DateTimeOffset.Now.Subtract(TimeSpan.FromMilliseconds(
Environment.TickCount & Int32.MaxValue));
}
}
private static void GetPerformanceCounterUptime()
{
if (_uptime != default(DateTimeOffset))
{
return;
}
using (var uptime = new PerformanceCounter("System", "System Up Time"))
{
uptime.NextValue();
_uptime = DateTimeOffset.Now.Subtract(
TimeSpan.FromSeconds(uptime.NextValue()));
}
}
Thread.Join already ensures that writes performed by the uptimeThread are visible on the main thread. You don't need any explicit memory barrier. (without the synchronization performed by Join, you'd need barriers on both threads - after the write and before the read)
However, there's a potential problem with your code: writing to a TimeSpan struct isn't atomic, and the main thread and the uptimeThread may write to it at the same time (Thread.Abort just signals abortion, but doesn't wait for the thread to finish aborting), causing a torn write.
My solution would be to not use the field at all when aborting. Also, multiple concurrent calls to GetUptime() may cause the same problem, so you should use an instance field instead.
private static TimeSpan GetUptime()
{
// Try and set the Uptime using perf counters
var helper = new Helper();
var uptimeThread = new Thread(helper.GetPerformanceCounterUptime);
uptimeThread.Start();
// If our thread hasn't finished in 5 seconds, perf counters are broken
if (uptimeThread.Join(5 * 1000))
{
return helper._uptime;
} else {
// Kill the thread and use Environment.TickCount
uptimeThread.Abort();
return TimeSpan.FromMilliseconds(
Environment.TickCount & Int32.MaxValue);
}
}
class Helper
{
internal TimeSpan _uptime;
// This sets the System uptime using the perf counters
// this gives the best result but on a system with corrupt perf counters
// it can freeze
internal void GetPerformanceCounterUptime()
{
using (var uptime = new PerformanceCounter("System", "System Up Time"))
{
uptime.NextValue();
_uptime = TimeSpan.FromSeconds(uptime.NextValue());
}
}
}
However, I'm not sure if aborting the performance counter thread will work correctly at all - Thread.Abort() only aborts managed code execution. If the code is hanging within a Windows API call, the thread will keep running.
AFAIK writes in .NET are volatile, so the only place where you would need a memory fence would be before each read, since they are subject to reordering and/or caching. To quote from a post by Joe Duffy:
For reference, here are the rules as I have come to understand them
stated as simply as I can:
Rule 1: Data dependence among loads and stores is never violated.
Rule 2: All stores have release semantics, i.e. no load or store may move after one.
Rule 3: All volatile loads are acquire, i.e. no load or store may move before one.
Rule 4: No loads and stores may ever cross a full-barrier.
Rule 5: Loads and stores to the heap may never be introduced.
Rule 6: Loads and stores may only be deleted when coalescing adjacent loads and
stores from/to the same location.
Note that by this definition, non-volatile loads are not required to
have any sort of barrier associated with them. So loads may be freely
reordered, and writes may move after them (though not before, due to
Rule 2). With this model, the only true case where you’d truly need
the strength of a full-barrier provided by Rule 4 is to prevent
reordering in the case where a store is followed by a volatile load.
Without the barrier, the instructions may reorder.
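The store-followed-by-load case described in the quote can be sketched as follows. The FencedFlags type and its member names are hypothetical, chosen only for illustration: each thread stores its own flag and then loads the other thread's flag, with Thread.MemoryBarrier() in between so that the store cannot be reordered after the load.

```csharp
using System;
using System.Threading;

// Hypothetical type: each thread stores its own flag, then loads the other one.
public class FencedFlags
{
    private bool _a;
    private bool _b;
    public bool SawBoth;

    public void SetA()
    {
        _a = true;
        Thread.MemoryBarrier(); // full fence between the store above and the load below
        if (_b) { SawBoth = true; }
    }

    public void SetB()
    {
        _b = true;
        Thread.MemoryBarrier(); // full fence between the store above and the load below
        if (_a) { SawBoth = true; }
    }
}

public static class FencedFlagsDemo
{
    // Counts the runs in which neither thread observed the other's store.
    public static int CountMisses(int iterations)
    {
        int misses = 0;
        for (int i = 0; i < iterations; i++)
        {
            var flags = new FencedFlags();
            var t1 = new Thread(flags.SetA);
            var t2 = new Thread(flags.SetB);
            t1.Start();
            t2.Start();
            t1.Join(); // Join makes the threads' writes visible to this thread
            t2.Join();
            if (!flags.SawBoth) { misses++; }
        }
        return misses;
    }
}
```

With a full fence on both sides, at least one of the two threads must observe the other's store, so CountMisses should return 0. Without the barriers, both loads may execute before the other thread's store becomes visible.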

VS 2010 Load Tests Results with custom counters

I am new to load testing (and to testing in general) with Visual Studio 2010 and I am dealing with several problems.
My question is: is there any way to add a custom test variable to the Load Test Results?
I have the following UnitTest:
[TestMethod]
public void Test()
{
Stopwatch testTimer = new Stopwatch();
testTimer.Start();
httpClient.SendRequest();
testTimer.Stop();
double requestDelay = testTimer.Elapsed.TotalSeconds;
}
This UnitTest is used by many LoadTests and I want to add the requestDelay variable to the Load Test Result so I can get Min, Max and Avg values like all the other Load Test Counters (e.g. Test Response Time).
Is that possible?
Using the link from @Pritam Karmakar's comment and the walkthroughs at the end of my post, I finally managed to find a solution.
First I created a Load Test Plug-In and used the LoadTestStarting event to create my custom counter category and add all my counters to it:
void m_loadTest_LoadTestStarting(object sender, System.EventArgs e)
{
// Delete the category if already exists
if (PerformanceCounterCategory.Exists("CustomCounterSet"))
{
PerformanceCounterCategory.Delete("CustomCounterSet");
}
//Create the Counters collection and add my custom counters
CounterCreationDataCollection counters = new CounterCreationDataCollection();
counters.Add(new CounterCreationData(Counters.RequestDelayTime.ToString(), "Keeps the actual request delay time", PerformanceCounterType.AverageCount64));
// .... Add the rest counters
// Create the custom counter category
PerformanceCounterCategory.Create("CustomCounterSet", "Custom Performance Counters", PerformanceCounterCategoryType.MultiInstance, counters);
}
Then, in the LoadTest editor I right-clicked on the Agent CounterSet and selected Add Counters... In the Pick Performance Counters window I chose my performance category and added my counters to the CounterSet so the Load Test will gather their data.
Finally, every UnitTest creates instances of the Counters in the ClassInitialize method and then it updates the counters at the proper step:
[TestClass]
public class UnitTest1
{
// Must be static because it is assigned in the static ClassInitialize method
static PerformanceCounter RequestDelayTime;
[ClassInitialize]
public static void ClassInitialize(TestContext TestContext)
{
// Create the instances of the counters for the current test
RequestDelayTime = new PerformanceCounter("CustomCounterSet", "RequestDelayTime", "UnitTest1", false);
// .... Add the rest counters instances
}
[TestCleanup]
public void CleanUp()
{
RequestDelayTime.RawValue = 0;
RequestDelayTime.EndInit();
RequestDelayTime.RemoveInstance();
RequestDelayTime.Dispose();
}
[TestMethod]
public void TestMethod1()
{
// ... Testing
// update counters
RequestDelayTime.IncrementBy(time);
// ... Continue Testing
}
}
Links:
Creating Performance Counters Programmatically
Setting Performance Counters
Including unit test variable values in load test results
I think what you actually need is to use:
[TestMethod]
public void Test()
{
TestContext.BeginTimer("mytimer");
httpClient.SendRequest();
TestContext.EndTimer("mytimer");
}
You can find good documentation here.
Interesting question. Never tried this, but I have an idea.
Create three class-level properties for MAX, MIN and AVG, and update those values during each test. Then write the final values once the entire load test has executed, via a ClassCleanup or AssemblyCleanup test attribute. You will have to run the load test for 1-2 minutes and see which attribute method gets called at the end. You can then write those final values to a flat file on a local drive via a TextWriter.
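A rough sketch of that idea follows, assuming a hypothetical DelayStats helper (not part of the Visual Studio load test API). The lock is there because load tests run test methods concurrently. The statistics are per process, so with several agents each one would write its own file.

```csharp
using System;
using System.Globalization;
using System.IO;

// Hypothetical helper that accumulates min/max/average of recorded delays.
public static class DelayStats
{
    private static readonly object Gate = new object();
    private static double _min = double.MaxValue;
    private static double _max = double.MinValue;
    private static double _sum;
    private static long _count;

    // Call once per test run with the measured delay in seconds.
    public static void Record(double seconds)
    {
        lock (Gate)
        {
            if (seconds < _min) { _min = seconds; }
            if (seconds > _max) { _max = seconds; }
            _sum += seconds;
            _count++;
        }
    }

    // Formats the current statistics using the invariant culture.
    public static string Summary()
    {
        lock (Gate)
        {
            if (_count == 0) { return "no samples"; }
            return string.Format(CultureInfo.InvariantCulture,
                "count={0} min={1:F3} max={2:F3} avg={3:F3}",
                _count, _min, _max, _sum / _count);
        }
    }

    // Writes the summary to a flat file, e.g. from a cleanup method.
    public static void WriteTo(string path)
    {
        File.WriteAllText(path, Summary());
    }
}
```

The test method would call DelayStats.Record(requestDelay) after stopping its timer, and a method marked [ClassCleanup] or [AssemblyCleanup] would call DelayStats.WriteTo with a path of your choosing.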

C# object creation much slower than constructor call

For the life of me, I can't figure out this performance hit in my code. I have a container object where I measure how long it takes to run the constructor (object below), with the timing code in the public constructor:
public class WorkUnit : IWorkUnit
{
private JobInformation m_JobInfo;
private MetaInfo m_MetaInfo;
private IPreProcJobInfo m_PreprocDetails;
readonly private Guid m_ID;
private Guid m_ParentID;
private Guid m_MasterJobID;
private string m_ErrorLog = string.Empty;
private PriorityKeeper m_Priority;
private WorkUnitClassification m_Classification;
private IJobPayload m_CachedPayload;
private IJobLogger m_Logger;
private EventHandler<JobEventArgs> m_IsFinished;
private ReaderWriterLockSlim m_Lock;
public WorkUnit(string InputXML, Guid JobID, IJobLogger Logger)
{
DateTime overstarttime = DateTime.Now;
try
{
....Do Stuff....
}
catch(XmlException e)
{...}
catch(Exception e)
{
...
throw;
}
double time = (DateTime.Now - overstarttime).TotalMilliseconds;
Console.WriteLine("{0}", time);
}
/// <summary>
/// Private Constructor used to create children
/// </summary>
private WorkUnit(Guid MasterID, Guid ParentID, WorkUnitClassification Classification, PriorityKeeper Keeper)
{...}
[OnDeserializing()]
private void OnDeserialize(StreamingContext s)
{...}
public PriorityKeeper PriorityKey
{...}
public bool HasError
{...}
public bool Processing
{...}
public bool Splittable
{...}
public IEnumerable<IWorkUnit> Split(int RequestedPieces, int Bonds)
{...}
public void Merge(IResponse finishedjob)
{...}
public ReadOnlyCollection<IWorkUnit> Children
{...}
public bool IsMasterJob
{...}
public Guid MasterJobID
{...}
public Guid ID
{...}
public Guid ParentID
{...}
public EnumPriority Priority
{...}
public void ChangePriority(EnumPriority priority)
{...}
public string ErrorLog
{...}
public IMetaInfo MetaData
{...}
public IJobPayload GetProcessingInfo()
{... }
public IResponseGenerator GetResponseGenerator()
{... }
}
Now, I'm measuring the total time it takes to create the object as
DateTime starttime = DateTime.Now;
var test = new WorkUnit(temp, JobID, m_JobLogger);
double finished = (DateTime.Now - starttime).TotalMilliseconds;
and I'm consistently getting the following performance numbers -
Constructor time - 47 ms
Object creation time - 387 ms
47 ms is acceptable, 387 is really bad. Taking out the logging negligibly changes these numbers. Does anyone have any idea why this is taking so long? My system is VS 2008 SP1, targeting .NET 3.5 SP1 on Windows XP. I would appreciate any explanation. I'll be breaking out the profiler shortly, but I feel that it won't be able to delve into the level I need to explain this behavior. Thanks for any help.
EDIT: I'm running in release
Are you sure what you're seeing is the object creation time and not the effects of the CLR starting up?
Try running the test 50 times in a loop and ignoring the first result.
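A minimal sketch of such a loop follows, assuming a hypothetical WarmupTiming helper. It times each run with a Stopwatch and averages everything except the first run, which typically includes one-off costs such as JIT compilation.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper for repeated timing with the first (warm-up) run excluded.
public static class WarmupTiming
{
    // Runs the action the given number of times and returns each run's duration in milliseconds.
    public static double[] Measure(Action action, int iterations)
    {
        var timings = new double[iterations];
        for (int i = 0; i < iterations; i++)
        {
            Stopwatch stopwatch = Stopwatch.StartNew();
            action();
            stopwatch.Stop();
            timings[i] = stopwatch.Elapsed.TotalMilliseconds;
        }
        return timings;
    }

    // Averages all runs except the first one.
    public static double AverageExcludingFirst(double[] timings)
    {
        if (timings.Length < 2)
        {
            throw new ArgumentException("Need at least two runs.", "timings");
        }
        double sum = 0;
        for (int i = 1; i < timings.Length; i++)
        {
            sum += timings[i];
        }
        return sum / (timings.Length - 1);
    }
}
```

For the code in the question, the action passed to Measure would be something like () => { var unit = new WorkUnit(temp, JobID, m_JobLogger); } with 50 iterations. Comparing the first entry of the result with the average of the rest shows how much of the 387 ms is a one-time cost.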
Steve,
Here are a couple of things to consider:
Switch from using DateTime to using a Stopwatch. It is much more accurate for these types of situations.
Stop writing out to the console during the timing process. The IO is going to be significant, and impact your timings.
Make sure you're running in a release/optimized build, and not running under the Visual Studio Test Host. If you run from a default VS, switch to Release, build, and use Ctrl+F5 (instead of just F5) to run.
Given your timings, I'm guessing #2 is your issue. Visual Studio adds a lot of "hooks" that dramatically impact perf. timings when running inside of Visual Studio.
First, use the Stopwatch class to measure time instead. The resolution of the system time is far too low to give accurate results.
Try to create more than one instance of the class. The first time the assembly might not be JIT:ed, which of course takes some time.
Time to bring out Red Gate Performance Profiler. Instead of asking us to guess what the issue might be...download a trial and let it tell you EXACTLY where your issue is.
Profilers are great tools. Any developer should be familiar with how to utilize them to pinpoint performance issues.
The question contains its own answer; there's more to instantiating an object than just running its constructor. When you call new you're asking the runtime to allocate space for an object, handle whatever internal bookkeeping the runtime needs, call the constructors for each base type (in this case, just object), and finally call your constructor.
When you measure the total instantiation time you're measuring all of that; when you time the constructor alone you're only measuring a part. If the numbers didn't differ, that would be a cause for concern.
As others have suggested, first and foremost, definitely switch to using System.Diagnostics.Stopwatch:
public WorkUnit(string InputXML, Guid JobID, IJobLogger Logger, out TimeSpan elapsed)
{
Stopwatch constructionStopwatch = Stopwatch.StartNew();
// constructor logic
constructionStopwatch.Stop();
elapsed = constructionStopwatch.Elapsed;
}
And then:
TimeSpan constructionTime = TimeSpan.Zero;
Stopwatch creationStopwatch = Stopwatch.StartNew();
var test = new WorkUnit(temp, JobID, m_JobLogger, out constructionTime);
creationStopwatch.Stop();
TimeSpan creationTime = creationStopwatch.Elapsed;
double constructionMs = constructionTime.TotalMilliseconds;
double creationMs = creationTime.TotalMilliseconds;
The reason I advise switching to TimeSpan objects instead of doing something like (DateTime.Now - startTime).TotalMilliseconds is that, although it should make very little difference, technically in the latter case you are first getting the time and then, still inside the constructor, getting the TotalMilliseconds property, which I am almost certain is a calculated value. That means there's actually a step between checking the time in your constructor and checking the time immediately afterward. Really, this should be basically negligible, but it's good to cover all your bases.
Do you know that the Console.WriteLine in the constructor is throwing off your timing hugely? Any IO operation will throw off these timings.
If you want real numbers, store the durations in a global somewhere then print them out after you have recorded everything.
