I've been working on a DICOM viewer, and it currently works correctly. Converting the slices to a 3D model originally took about 3-4 minutes; after some profiling and optimization, it now takes less than a minute. However, there is something weird about the result of the last performance test I ran:
Why is a brace consuming that much performance?
EDIT: the closing brace consumes 1.5% too, so I'm pretty sure it doesn't mean the whole function.
I've recently made a desktop application that communicates with a device at my job.
The general idea of the app is to give the device (oven) a "set temperature" command and after every 10 seconds check the current temperature and display it on a graph using livecharts.
This application is required to run for multiple days at a time, and I think I have a memory leak.
What I experience is the application not responding for a while; then it becomes responsive and adds a single log entry (via the LogTemp function) about once every 1-2 minutes. It should be once every 10 seconds.
Edit: Just read the lines before this edit and I think I was not too clear. It works as it is supposed to work the first few hours. Noticed the performance took a hit only after 24 hours.
After 24 hours of running I came back to find it is using over 800 MB of RAM and it kept growing by the second.
I suspect it MAY have something to do with livecharts but I am not sure by any means. (it ends up with 8640 points of data after 24 hours)
I have no issue disclosing my code and have even minimized the amount of code needed to be shown to around 200 lines in total which are split to a few different functions, but if anyone heard of such an issue with livecharts and/or can suggest a different type of graph library I'd be more than happy to swap it out.
Actually, here's the code, lmao:
https://pastebin.com/YBn5CuD6
Another thing I thought of: could it be that I am adding too many rows to the ListView? We're talking about 8640 rows in 24 hours. Might that have something to do with it?
Sorry for the long post, thank you in advance.
For anyone interested: I lowered the sample rate, which determines how many points end up on the LiveCharts chart, and it now runs very smoothly, even after 3 days of running.
(RAM usage was also around 90 MB, which is to be expected from my application)
I believe that was the issue, so I'll mark this as an answer for the next person googling LiveCharts memory-leak.
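The fix described above (plotting fewer points) can be sketched independently of any chart library. ChartBuffer below is a hypothetical helper, not part of LiveCharts: it keeps only every Nth reading, and it also caps the number of retained points, which is an extra safeguard the post does not mention. The parameter names keepEvery and maxPoints are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: thins out incoming readings and bounds how many
// points are kept for plotting, so the chart data cannot grow forever.
public sealed class ChartBuffer
{
    private readonly Queue<double> _points = new Queue<double>();
    private readonly int _keepEvery;
    private readonly int _maxPoints;
    private long _seen;

    public ChartBuffer(int keepEvery, int maxPoints)
    {
        if (keepEvery <= 0) throw new ArgumentOutOfRangeException(nameof(keepEvery));
        if (maxPoints <= 0) throw new ArgumentOutOfRangeException(nameof(maxPoints));
        _keepEvery = keepEvery;
        _maxPoints = maxPoints;
    }

    // Returns true when the reading was kept for plotting.
    public bool Offer(double reading)
    {
        bool keep = _seen % _keepEvery == 0;
        _seen++;
        if (!keep) return false;

        _points.Enqueue(reading);
        if (_points.Count > _maxPoints) _points.Dequeue(); // drop the oldest point
        return true;
    }

    // Copy of the points currently retained, oldest first.
    public double[] Snapshot() => _points.ToArray();
}
```

With a reading every 10 seconds, new ChartBuffer(6, 1440) would keep one point per minute and at most one day of history; the chart would then be fed from Snapshot() or from the readings for which Offer returned true.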
There is a problem I'm currently experiencing with Parallel.For.
I'm trying to run a batch of ~1000 pretty long-running tasks (each running for about 120 seconds), and my problem is that during the last 10-20 iterations the processing becomes single-threaded.
Pseudocode:
Parallel.For(0, 1000,
    (i, loopState) =>
    {
        // Do lots of work
        Thread.Sleep(120_000);
    });
Do you guys know why this is happening, and how can this issue be fixed?
I think I figured out the problem, or at least found a solution that makes it go away.
Sadly, the example code in my question was not 100% representative of my code: in my real code I had used the overload with localInit and localFinally. I thought it didn't matter, but it seems it does.
My guess is that with that overload, the internal implementation partitioned my loop iterations and statically assigned each partition to a separate thread. Since the elements took wildly different amounts of time to process, one partition took a lot longer than the others, hence the single-threading at the end.
The fix was to use the overload that is referenced in my question, which doesn't seem to suffer from this problem.
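If the partitioning guess above is right, another way to avoid one thread being stuck with a long tail of work is to hand out items one at a time. The sketch below is not the fix the author used; it is an alternative technique based on Partitioner.Create with EnumerablePartitionerOptions.NoBuffering from System.Collections.Concurrent. LoadBalancedLoop and Run are hypothetical names.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

public static class LoadBalancedLoop
{
    // Runs `work` once for every index in [0, count).
    // NoBuffering makes each worker take a single index at a time, so a
    // slow item does not hold back a pre-assigned chunk of other items.
    public static void Run(int count, Action<int> work)
    {
        var oneAtATime = Partitioner.Create(
            Enumerable.Range(0, count),
            EnumerablePartitionerOptions.NoBuffering);

        Parallel.ForEach(oneAtATime, work);
    }
}
```

This trades a little more synchronization per item for better balancing, which is usually a good trade when each item runs for seconds or minutes.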
I'm using a parallel for loop in my code to run a long running process on a large number of entities (12,000).
The process parses a string, goes through a number of input files (I've read that given the number of IO based things the benefits of threading could be questionable, but it seems to have sped things up elsewhere) and outputs a matched result.
Initially, the process goes quite quickly - however it ends up slowing to a crawl. It's possible that it's just hit a number of particularly tricky input data, but this seems unlikely looking closer at things.
Within the loop, I added some debug code that prints "Started Processing: " and "Finished Processing: " when it begins/ends an iteration and then wrote a program that pairs a start and a finish, initially in order to find which ID was causing a crash.
However, looking at the number of unmatched IDs, it looks like the program is processing in excess of 400 different entities at once. Given the large amount of IO, this seems like it could be the source of the issue.
So my questions are:
Am I interpreting the unmatched IDs properly, or is there some clever stuff going on behind the scenes that I'm missing, or even something obvious?
If you'd agree what I've spotted is correct, how can I limit the number it spins off and does at once?
I realise this is perhaps a somewhat unorthodox question and may be tricky to answer given there is no code, but any help is appreciated and if there's any more info you'd like, let me know in the comments.
Without seeing some code, I can guess at the answers to your questions:
Unmatched IDs indicate to me that the thread that is processing that data is being de-prioritized. This could be due to IO or the thread pool trying to optimize, however it seems like if you are strongly IO bound then that is most likely your issue.
I would take a look at Parallel.For, specifically using ParallelOptions.MaxDegreeOfParallelism to limit the maximum number of concurrent tasks to a reasonable number. I would suggest trial and error to determine the optimum degree of parallelism, starting around the number of processor cores you have.
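A minimal sketch of that suggestion, assuming the per-entity work can be expressed as an Action<int>. BoundedLoop and Run are hypothetical names; the peak counter is only there to show that the limit holds and is not needed in real code.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class BoundedLoop
{
    // Runs `work` for each index in [0, count) with at most `maxParallel`
    // iterations in flight. Returns the highest concurrency observed.
    public static int Run(int count, int maxParallel, Action<int> work)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };
        int running = 0, peak = 0;

        Parallel.For(0, count, options, i =>
        {
            int now = Interlocked.Increment(ref running);

            // Raise `peak` to `now` if it is a new maximum (lock-free retry loop).
            int seen;
            while (now > (seen = Volatile.Read(ref peak)) &&
                   Interlocked.CompareExchange(ref peak, now, seen) != seen) { }

            try { work(i); }
            finally { Interlocked.Decrement(ref running); }
        });

        return peak;
    }
}
```

A reasonable starting point is BoundedLoop.Run(12000, Environment.ProcessorCount, ProcessEntity), where ProcessEntity stands in for the real work, then adjusting the limit up or down while measuring.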
Good luck!
Let me start by confirming that it is indeed a very bad idea to read 2 files at the same time from a hard drive (at least until the majority of hard drives out there are SSDs), let alone whatever number your whole thing is using.
Parallelism serves to optimize processing by using a resource that is actually parallelizable, which is CPU power. If your parallelized process reads from a hard drive, then you're losing most of the benefit.
And even then, CPU power is not infinitely parallelizable. A normal desktop CPU has the capacity to run up to 10 threads at the same time (it depends on the model, obviously, but that's the order of magnitude).
So two things
first, I am going to make the assumption that your entities use all your files, but your files are not too big to be loaded into memory. If that's the case, you should read your files into objects (i.e. into memory), then parallelize the processing of your entities using those objects. If not, you're basically relying on your hard drive's cache to avoid rereading your files every time you need them, and your hard drive's cache is far smaller than your memory (1000-fold).
second, you shouldn't be running an unbounded Parallel.For on 12,000 items that block on IO. Parallel.For does not create one thread per item, but when iterations block, the thread pool keeps injecting more threads, so you can end up with hundreds of iterations in flight at once. That is worse than 10 threads because of the overhead every extra thread adds, and your CPU will not benefit from it at all since it cannot run more than about 10 threads at a time.
You should probably use a more efficient method, which is the IEnumerable<T>.AsParallel() extension (it comes with .NET 4.0). At runtime, it determines how many threads to run, then divides your enumerable among them. Basically, it does the job for you, but it creates a big overhead too, so it's only useful if the processing of one element is actually costly for the CPU.
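The two points above can be sketched together: read every input file once, then run the CPU-bound matching over the in-memory data with AsParallel(). The matching rule here (counting lines that contain the entity text) is purely hypothetical, since the question does not show the real one, and InMemoryMatcher, LoadAll and Match are assumed names. Entities are assumed to be distinct, because they are used as dictionary keys.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class InMemoryMatcher
{
    // Reads every file once, up front, so the parallel part never touches the disk.
    public static Dictionary<string, string[]> LoadAll(IEnumerable<string> paths) =>
        paths.ToDictionary(p => p, p => File.ReadAllLines(p));

    // Hypothetical matching rule: for each entity, count the lines that contain it.
    // PLINQ picks the degree of parallelism (by default, the processor count).
    public static Dictionary<string, int> Match(
        IEnumerable<string> entities,
        IReadOnlyDictionary<string, string[]> files)
    {
        return entities
            .AsParallel()
            .Select(e => new
            {
                Entity = e,
                Hits = files.Values.Sum(lines => lines.Count(l => l.Contains(e)))
            })
            .ToDictionary(x => x.Entity, x => x.Hits);
    }
}
```

The dictionary is only read inside the query, never modified, which is what makes sharing it across threads safe here.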
From my experience, using anything parallel should always be evaluated against not using it in real-life, i.e. by actually profiling your application. Don't assume it's going to work better.
I inherited an app. It gets data from 4 views (each with an XML file in it) in chunks of 1000 records, then writes each chunk to an XML file. All of this is split up by a type parameter that has 9 possible values. That means that, in the worst case, there will be 36 connections to the database (one per type/view combination) for each batch of 1000.
The real data will consist of 90,000 lines, which in this case means fetching up to 1000 lines from the database 900 to 936 times.
Now I am wondering what advantage there would be in reading all the data into the app and having the app work from that to write the 900+ files.
1000 lines is about 800MB, 90,000 lines is approx 81GB of data being transferred.
The code would have to be rewritten if we read it all at once, and although that would make more sense, this is a one-time job. After the 90,000 lines, we will never use this code again. Is it worth spending 2 or 3 hours rewriting code that works in order to reduce the number of connections this way?
If it's a one-time thing then why spend any effort at all optimizing it? Answer: no.
Let me add, though, in answer to your general question of what advantage does a big query have over lots of small ones: probably none. If you run a huge query you are leaving a lot of magic up to the middleware, it may or may not work well.
While having 36 simultaneous connections isn't optimal either, it's probably better than running a query that could return 80 gigabytes of data. The ideal solution (if you had to use this code more than once) would be to rewrite it to get the data in chunks but not leave lots of connections open simultaneously.
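The "chunks, but one connection at a time" idea can be sketched without committing to a database library. In the sketch below, fetchPage is a hypothetical callback that would open a connection, read one page, and close the connection before returning; writeChunk would write one XML file. ChunkedExport and Run are assumed names, and nothing here reflects the inherited app's actual code.

```csharp
using System;
using System.Collections.Generic;

public static class ChunkedExport
{
    // Pulls pages sequentially until the source is exhausted and hands each
    // page to `writeChunk`. Only one fetch is ever in progress.
    // Returns the number of chunks written.
    public static int Run<T>(
        Func<int, int, IReadOnlyList<T>> fetchPage,   // (offset, size) -> rows
        Action<int, IReadOnlyList<T>> writeChunk,     // (chunkIndex, rows)
        int pageSize = 1000)
    {
        int chunks = 0;
        for (int offset = 0; ; offset += pageSize)
        {
            IReadOnlyList<T> page = fetchPage(offset, pageSize);
            if (page.Count == 0) break;               // nothing left

            writeChunk(chunks++, page);
            if (page.Count < pageSize) break;         // short page: last one
        }
        return chunks;
    }
}
```

Running this once per type/view combination, one after another, keeps memory bounded by a single page while never holding more than one connection open.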
Does the code work already? If it does, then I wouldn't spend time rewriting it. You run the risk of introducing bugs into the code. Since you will use this once and never again, it doesn't seem worth the effort.
If we are talking SQL Server, the biggest disadvantage of a large query (a single batch) over many small ones (note the opposite sense to the question you are asking) is that there can only be one query plan per batch.
If it's a one-time job I'd say no. Many times I have done things that I normally wouldn't (cursors), but ONLY because it was a one-time job.
Ask yourself if it makes sense to spend 2 to 3 hours on something that already works and that you will never use again. There are obviously other factors to take into account, though. For example, will this lock up your production database for 2-3 hours?
If there are no disastrous side effects I'd say use what you have.
I am trying to make the loading part of a C# program faster. Currently it takes like 15 seconds to load up.
At first glance, the things done during loading include constructing many 3rd-party UI components, loading layout files, XML files, DLLs and resource files, reflection, waiting for WndProc, etc.
I used something really simple to see how long a given part takes,
i.e. setting a breakpoint on a double that holds the total milliseconds of a TimeSpan, which is the difference between DateTime.Now at the start and DateTime.Now at the end.
Trying that a few times gives me something like
11s 13s 12s 12s 7s 11s 12s 11s 7s 13s 7s.. (Usually 12s, but 7s sometimes)
If I add SuspendLayout and BeginUpdate like hell, call things in reflection once instead of many times, and reduce some redundant redundant computation redundancy, the times are like 3s 4s 3s 4s 3s 10s 4s 4s 3s 4s 10s 3s 10s.... (Usually 4s, but 10s sometimes)
In both cases, the times are not consistent; they look more like a bimodal distribution. That makes me unsure whether my changes to the code are really making it faster.
So I would like to know what will cause such result.
Debug mode?
The "C# has to compile/interpret the code the 1st time it runs, but the following times will be faster" thing?
The waiting of WndProc message?
The reflections? PropertyInfo? Reflection.Assembly?
Loading files? XML? DLL? resource file?
UI Layouts?
(There is certainly no internet/network/database access in that part)
Thanks.
Profiling by stopping in the debugger is not a reliable way to get timings, as you've discovered.
Profiling by writing times to a log works fine, although why do all this by hand when you can just launch the program in dotTrace? (Free trial, fully functional).
Another thing that works when you don't have access to a profiler is what I call the binary approach: look at what happens in the code and try to disable about half of it by commenting it out. Note the effect on the running time. If it appears significant, repeat this process with half of that half, and so on recursively until you narrow in on the most significant piece of work. The difficulty is in simulating the side effects of the missing code so that the remaining code can still work, so this is still harder than using a debugger, but it can be quicker than adding a lot of manual time logging, because the binary approach lets you zero in on the slowest place in logarithmic time.
Raymond Chen's advice is good here. When people ask him "How can I make my application start up faster?" he says "Do less stuff."
(And ALWAYS profile the release build - profiling the debug build is generally a wasted effort).
Profile it. You can use EQATEC; it's free.
Well, the best thing is to run your application through a profiler and see what the bottlenecks are. I've personally used dotTrace, there are plenty of others you can find on the web.
Debug mode turns off a lot of JIT optimizations, so apps will run a lot slower than release builds. Whatever the mode, JITting has to happen, so I'd discount that as a significant factor. Time to read files from disk can vary based on the OS's caching mechanism, and whether you're doing a cold start or a warm start.
If you have to use timers to profile, I'd suggest repeating the experiment a large number of times and taking the average.
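A minimal sketch of "repeat and average", using Stopwatch rather than DateTime.Now because it is designed for measuring elapsed time. RepeatTimer and Measure are hypothetical names. This times an in-process action; for whole-application startup you would instead launch the program repeatedly and log each run.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

public static class RepeatTimer
{
    // Runs `action` `runs` times and reports mean, min and max in milliseconds.
    // Min and max are returned too because a bimodal result hides behind a mean.
    public static (double Mean, double Min, double Max) Measure(Action action, int runs)
    {
        if (runs <= 0) throw new ArgumentOutOfRangeException(nameof(runs));

        var ms = new double[runs];
        for (int i = 0; i < runs; i++)
        {
            var sw = Stopwatch.StartNew();
            action();
            sw.Stop();
            ms[i] = sw.Elapsed.TotalMilliseconds;
        }
        return (ms.Average(), ms.Min(), ms.Max());
    }
}
```

If Min and Max differ as much as the 3s versus 10s figures in the question, looking at the individual runs is more informative than the average alone.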
Profiling your code is definitely the best way to identify which areas are taking the longest to run.
As for the other part of your question, about the inconsistent timings: timings in a multitasking OS are inherently inconsistent, and working with managed code throws the garbage collector into the equation too. It could be that the GC is kicking in during your timing, which will obviously slow things down.
If you want to try to get a "purer" timing, try putting a GC collect before you start your timers; this way it is less likely to start during your timed section. Do remember to remove the GC collect call afterwards, as second-guessing when the GC should run normally results in poorer performance.
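A sketch of that measurement pattern, for diagnostics only. CleanTimer and Time are hypothetical names; the collect / wait for finalizers / collect sequence is a common way to start from a settled heap, and it does not prevent a collection from happening inside the timed section if the action allocates heavily.

```csharp
using System;
using System.Diagnostics;

public static class CleanTimer
{
    // Forces a full collection before timing so that garbage left over from
    // earlier work is less likely to trigger a GC inside the timed section.
    public static TimeSpan Time(Action action)
    {
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        var sw = Stopwatch.StartNew();
        action();
        sw.Stop();
        return sw.Elapsed;
    }
}
```

Keeping the forced collection inside a helper like this makes it easy to delete along with the rest of the timing code once the measurements are done.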
Apart from the obvious (profiling), which will tell you precisely where time is being spent, there are some other points that spring to mind:
To get reasonable timing results with the approach you are using, run a release build of your program, and have it dump the timing results to a file (e.g. with Trace.WriteLine). Timing a debug version will give you spurious results. When running the timing tests, quit all other applications (including your debugger) to minimise the load on your computer and get more consistent results. Run the program many times and look at the average timings. Finally, bear in mind that Windows caches a lot of stuff, so the first run will be slow and subsequent runs will be much faster. This will at least give you a more consistent basis to tell whether your improvements are making a significant difference.
Don't try and optimise code that shouldn't be run in the first place - Can you defer any of the init tasks? You may find that some of the work can simply be removed from the init sequence. e.g. if you are loading a data file, check whether it is needed immediately - if not, then you could load it the first time it is needed instead of during program startup.
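The "load it the first time it is needed" idea maps directly onto Lazy<T>. In this sketch, AppData and its loader are hypothetical stand-ins for whatever data file the application reads at startup.

```csharp
using System;

// Hypothetical wrapper around a data file that used to be loaded at startup.
public sealed class AppData
{
    private readonly Lazy<string[]> _lines;

    // `loader` is not called here; it runs on first access to Lines.
    public AppData(Func<string[]> loader)
    {
        _lines = new Lazy<string[]>(loader);
    }

    public bool IsLoaded => _lines.IsValueCreated;

    // First access pays the loading cost; later accesses reuse the result.
    public string[] Lines => _lines.Value;
}
```

For example, new AppData(() => System.IO.File.ReadAllLines(path)) costs almost nothing during startup, and the file is only read if some code path actually touches Lines.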