I'm planning on creating a live analytics page for my website - a bit like Google Analytics, but with real live data which will change as new users load a page on my site, etc.
The main site is/will be written using ASP.NET/C# as the back end with an MS SQL database, and the front end will support things like JavaScript (jQuery), CSS3 and HTML5 (if required).
I was wondering what methods I can use to build the live analytics in terms of: how to get the data onto the analytics page, what efficient graphing I can use, and how to store the data with fast input/output.
The first thing that came to my mind is to use Node.js - could I use this to achieve a live analytics page? Is it a good idea? Are there any better alternatives? Any drawbacks with this?
Would I need a C# Application running on a server to use Node.js to send/receive all the data to and from the website?
Would using an MS SQL database be fast enough? Would I need to store all the data live, or could I store it in chunks every X seconds/minutes? (Which would be more efficient?)
[A diagram illustrating my initial thoughts on the matter was attached here.]
Edit:
I'm going to be using this system over multiple sites; I could be getting anywhere from 10 hits at a time to around 1,000,000 (highly unlikely, but still possible). I want to be able to scale this system and adapt it to the environment it's in.
It really depends on how "real time" the realtime data needs to be. For example, I made this recently:
http://www.reed.co.uk/labs/realtime/
Which shows job applications coming into the system. Obviously there is way too much going on during busy periods to actually be querying the main database in real time - so what we do is query a sliding "window" and cache it on the server - this is a chunk of the last 5 minutes' worth of events.
We then play this back to the user as if it's happening "now". Accepting a little latency as part of an SLA (where the users don't really care) can make the whole system vastly more scalable.
[EDIT- further explanation]
The data is just retrieved from a basic stored procedure call - naturally, a big system like Reed has hundreds of transactions per second - so we can't keep hitting the main cluster for every user.
All we do is make sure we have a current window, in this case the last 5 minutes of data cached on the server. When a client comes to the site, we get that last 5 minutes of data and play it back like it's happening right now - the end user is none the wiser - but what it means is that all clients are reading off the cache. Once the cache is 5 minutes old, we invalidate it and start again. This means a maximum of 1 DB hit every five minutes - thus making the system vastly more scalable (not that it really needs to be - it's just for fun, really).
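To make the pattern concrete, here's a minimal C# sketch of that sliding-window cache using System.Runtime.Caching; the loader delegate stands in for whatever stored-procedure call actually fetches the last 5 minutes of events (names and shapes here are illustrative, not Reed's actual code):

using System;
using System.Runtime.Caching;

public static class WindowCache
{
    private static readonly object Sync = new object();

    // Returns the cached window, hitting the database at most once per
    // five minutes no matter how many clients are connected.
    public static T GetWindow<T>(Func<T> loadFromDatabase) where T : class
    {
        var cache = MemoryCache.Default;
        var window = cache.Get("event-window") as T;
        if (window != null)
            return window;

        lock (Sync)
        {
            window = cache.Get("event-window") as T;
            if (window == null)
            {
                window = loadFromDatabase();   // e.g. the stored procedure call
                cache.Set("event-window", window, DateTimeOffset.UtcNow.AddMinutes(5));
            }
        }
        return window;
    }
}

Every client reads from the cache; only the first request after the five-minute expiry pays for a database round trip.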
Just so you are aware, Google Analytics already offers live user tracking. When inside the dashboard of a site on Google Analytics, click the Home button on the top bar and then the Real-Time button on the left bar. Considering the design work and quality of this service, it may be a better option than attempting to recreate it. If you do choose to proceed and create your own, you can at least use their service as a benchmark for the desired features.
Using APIs like the Google Charts API (https://developers.google.com/chart/) would be a good approach to displaying the output of your stored data, with decreased development time. If you provide more information on the number of hits you expect and the scale of the server this software will be hosted on, it will be easier to give you answers to the speed questions.
I am a software engineer intern for a manufacturing company and they want me to develop an application for the company. They are leaning towards a web application; however, I wish to know whether a desktop application would better fit the job. Therefore, I have been googling and looking through Stack Overflow to find out what the pros and cons of desktop applications versus web applications are. The following is essentially what I found:
Quick disclaimer: I have a background in C# and WPF, so I am a bit biased as it would be easier for me to develop a desktop application. I have no web experience, so there is nothing I can really talk about in that area, which is why I wish to know more about whether this application is better suited as a web application or a desktop application. I am absolutely open to learning PHP and web development to expand my abilities. I have started (a bit) looking into developing the web application using PHP 7 with the Laravel framework.
Pros of desktop applications:
Typically faster than Web Applications (Assuming web application will perform complex queries, calculations, etc, and not just display markups)
Development of GUIs is faster
More secure as desktop applications are private by default.
There are more available controls, allowing for a richer and more interactive experience for the user (or at least, these controls are easier/faster to implement in desktop applications compared to a web-based application)
Can take advantage of user hardware.
Cons of desktop applications:
Use/deployment is limited by system (However, this should not be a problem because all our systems are Windows based.)
Updates and installation must be manually implemented.
If every client desktop gets a database connection, scaling is not good as the database suffers from heavy load. (However, this probably will not be the case since we won't have more than 500 users.)
Pros of web applications:
Cross platform (No need to deal with different operating systems) so it is easily portable
Development is quick and easy
Deployment is easy as updates are automatic and server side.
Large community support and available frameworks.
Cons of web applications:
Larger overhead (Applications tend to be slower due to need to transmit data across the internet).
Need to deal with different browsers. JavaScript most likely needs to be tweaked to be perfect on one web platform (Chrome, Firefox, etc.) and will not be perfect on the others. (However, this is not that big of a deal.)
Security is an issue since data will be public.
Please let me know if any of the above is outdated (most of the posts I found were from 2011 or prior) or wrong. Also, if there is any other pro/con to consider.
Moving on to the application description....
Background on the company: We build and process dozens of different parts every day. For each type of part, after X amount of the part is processed, a sample needs to be taken for inspection. So for example, part Y has 3 samples taken every 120 minutes to be inspected (Because the machine typically finishes processing X amounts in 120 minutes). The inspection results (measurement data) are then stored in the database (MySQL database).
General summary of the application's purpose:
View the schematics of all the parts we design (We store all the schematics as pdfs on a network drive, so this is simply just pulling up the specific pdf requested from the drive and displaying it onto the application).
View/update the status of all the machines in the company (What parts are they working on, are they online/offline, etc). A certain user (Inspector) will use this application to update machine status/information. Then another user (Operator) will use the application to view the statuses.
Monitor part inspections. So, for every machine and part being processed, there will be a timer to let an Operator user know when a certain part needs to be submitted for inspection. Upon part submission, an Inspector will then receive a notification to inspect the part, and after letting the application know that they completed the inspection, the timer will restart to let the Operator know when the next time is that they need to submit a part.
The application will calculate statistical data (For example, Cpk values) from the part measurements obtained from inspection results and display the statistical data along with a graph/chart.
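For reference, the Cpk calculation itself is just a formula over the sample measurements, Cpk = min((USL - mean)/(3s), (mean - LSL)/(3s)) where s is the sample standard deviation; a rough C# sketch, with the spec limits passed in as illustrative parameters:

using System;
using System.Collections.Generic;
using System.Linq;

public static class ProcessCapability
{
    // Cpk = min((USL - mean) / (3 * sigma), (mean - LSL) / (3 * sigma)),
    // where sigma is the sample standard deviation of the measurements.
    public static double Cpk(IEnumerable<double> measurements, double lowerSpec, double upperSpec)
    {
        var values = measurements.ToList();
        double mean = values.Average();
        double sigma = Math.Sqrt(values.Sum(v => (v - mean) * (v - mean)) / (values.Count - 1));

        double cpu = (upperSpec - mean) / (3 * sigma);
        double cpl = (mean - lowerSpec) / (3 * sigma);
        return Math.Min(cpu, cpl);
    }
}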
I hope I explained all of this clearly enough. Some other things to note: from my understanding, the users will not need remote access. This application will pretty much only be used on the company site. Also, the original reason the company wanted a web application was that operators will be using a tablet for the application, and the tablets they acquired were originally Android-based. However, they decided to switch to Windows Surface tablets, so WPF applications are now a possibility.
With all of this being said, I am really looking for input on what route people with more experience would recommend. I am still in college so please forgive my lack of knowledge/experience. What else should I be thinking about when deciding between a web application and desktop application?
Here are some of the pages I have seen while pondering this topic:
Advantages of web applications over desktop applications
https://www.quora.com/How-much-different-is-it-to-build-a-web-application-vs-a-desktop-application
https://www.quora.com/What-are-the-advantages-and-disadvantages-of-web-based-application-development-vs-desktop-application-development
There were more Stack Overflow pages, but the ones listed above pretty much cover everything that the other pages stated.
EDIT: Seems like the web application is winning so far (not that I mind at all; I am actually excited to develop a web application based on what I am hearing). Is there anyone who would rather do a desktop application? If so, why?
I'm inherently biased against web apps. They're difficult to get right due to browsers, and they're typically insecure (by accident, though). The platform sucks (JavaScript and the bazillion libraries from random people/orgs), and "everything is a string". I could go on.
However it's undeniably the best platform for reaching a wide, public audience and allowing continual updates.
In a corporate environment the advantages do tend to go away, but not entirely. Updates, for example, can generally be achieved by storing all your .exe and DLL files in a shared directory. As you say, you can build a much richer UI quicker and cheaper using the Windows platform.
With regards to your architecture, something that has worked for me in a similar situation is to have a Windows front end, but also have the guts of the business logic, data access (connection pooling) and processing off on a stateless web server (or two) accessed from the UI via Web Services (protocol of your choice - I prefer SOAP due to WCF and WSDL but plenty of folks won't).
This allows for centralised data access and a place to put your one-off batch jobs or calculations that can then be shared. It also has the advantage that if you need to do something really intensive, not every client machine has to have that capability.
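To make that concrete, here's a bare-bones sketch of the kind of WCF contract I mean (all the names are made up for illustration; the WPF client would generate a proxy from the WSDL and never talk to the database directly):

using System.Runtime.Serialization;
using System.ServiceModel;

[ServiceContract]
public interface IInspectionService
{
    [OperationContract]
    MachineStatus GetMachineStatus(int machineId);

    [OperationContract]
    void SubmitInspectionResult(InspectionResult result);
}

// Plain data contracts carried over the wire between client and middle tier.
[DataContract]
public class MachineStatus
{
    [DataMember] public int MachineId { get; set; }
    [DataMember] public string CurrentPart { get; set; }
    [DataMember] public bool IsOnline { get; set; }
}

[DataContract]
public class InspectionResult
{
    [DataMember] public int PartId { get; set; }
    [DataMember] public double[] Measurements { get; set; }
}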
Your situation seems to fit this model but without a lot of insider knowledge it's primarily opinion, but possibly one to consider.
Sounds like assembly or similar company work-process monitoring to me.
If I had to build this application, the first thing I would do is research whether the functions you want are possible and easy to develop with the programming language you will use.
For example, if I chose to develop it as a web application, then:
Larger overhead (Applications tend to be slower due to need to transmit data across the internet).
You can use an intranet and a well-specced server.
Need to deal with different browsers. Javascript most likely needs to be tweaked to be perfect on one web platform (Chrome, Firefox, etc) and will not be perfect on the others. (However, this is not that big of a deal).
Then set a standard browser for working in your workplace.
View the schematics of all the parts we design (We store all the schematics as pdfs on a network drive, so this is simply just pulling up the specific pdf requested from the drive and displaying it onto the application).
You can upload the PDFs to a server and view them within the browser using a PDF viewer library like PDF.js or a similar plugin.
View/update the status of all the machines in the company (What parts are they working on, are they online/offline, etc). A certain user (Inspector) will use this application to update machine status/information. Then another user (Operator) will use the application to view the statuses.
Does each machine have an IP address?
Can you ping the machines to determine whether a machine is online or not, to ease the task? (See the sketch after these questions.)
If not, then what is the inspector's schedule for inspecting the machines?
Of course, the inspector can log in to the system and then update the machine status manually using the web application.
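If the machines do have IP addresses, a background check on the server could be as simple as this C# sketch (the host address is illustrative):

using System;
using System.Net.NetworkInformation;

public static class MachineMonitor
{
    // Returns true if the machine answers a ping within the timeout.
    public static bool IsOnline(string hostOrIp, int timeoutMs = 1000)
    {
        try
        {
            using (var ping = new Ping())
            {
                PingReply reply = ping.Send(hostOrIp, timeoutMs);
                return reply != null && reply.Status == IPStatus.Success;
            }
        }
        catch (PingException)
        {
            // Host name could not be resolved or the network is unavailable.
            return false;
        }
    }
}

// Usage (illustrative address):
// bool online = MachineMonitor.IsOnline("192.168.1.42");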
Monitor part inspections. So, for every machine and part being processed, there will be a timer to let an Operator user know when a certain part needs to be submitted for inspection. Upon part submission, an inspector will then receive a notification to inspect the part, and after letting the application know that they completed the inspection, the timer will restart to let the operator know when the next time is that they need to submit a part.
This one sounds like a scheduling mechanism to ensure quality. You can make a timer with jQuery and use AJAX to send a notification to the operator with specific data about the part that needs to be inspected.
The application will calculate statistical data (For example, Cpk values) from the part measurements obtained from inspection results and display the statistical data along with a graph/chart.
This one depends on your statistical formulas; you can use a charting library such as Highcharts for the graphs.
The second step, after you ensure your chosen programming language is able to accomplish the tasks you want, is to design the database structure.
Quote by Linus Torvalds :
"Bad programmers worry about the code. Good programmers worry about data structures and their relationships."
Have a nice day, and good luck with deciding after giving it some good thought, to avoid development problems in the future.
How much traffic is heavy traffic? What are the best resources for learning about heavy-traffic web site development? For example, what are the approaches?
There are a lot of principles that apply to any web site, irrelevant of the underlying stack:
use HTTP caching facilities. For one, there is the user agent cache. Second, the entire web backbone is full of proxies that can cache your requests, so use this to full advantage. A request that doesn't even land on your server adds 0 to your load; you can't optimize better than that :)
corollary to the point above, use CDNs (Content Delivery Network, like CloudFront) for your static content. CSS, JPG, JS, static HTML and many more resources can be served from a CDN, thus saving the web server from an HTTP request.
second corollary to the first point: add expiration caching hints to your dynamic content. Even a short cache lifetime like 10 seconds will save a lot of hits that will instead be served from all the proxies sitting between the client and the server.
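In ASP.NET that can be as simple as a few lines in the page code-behind; a minimal sketch (the 10-second lifetime is just an example):

using System;
using System.Web;

public partial class LiveStatsPage : System.Web.UI.Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        // Allow browsers and intermediate proxies to cache this response for 10 seconds.
        Response.Cache.SetCacheability(HttpCacheability.Public);
        Response.Cache.SetExpires(DateTime.UtcNow.AddSeconds(10));
        Response.Cache.SetMaxAge(TimeSpan.FromSeconds(10));
    }
}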
Minimize the number of HTTP requests. Seems basic, but it is probably the most overlooked optimization available. In fact, Yahoo best practices put this as the topmost optimization, see Best Practices for Speeding Up Your Web Site. Here is their best practices list:
Minimize HTTP Requests
Use a Content Delivery Network
Add an Expires or a Cache-Control Header
Gzip Components
... (the list is quite long actually, just read the link above)
Now, after you have eliminated as many of the superfluous hits as possible, you are still left with optimizing whatever requests actually hit your server. Once your ASP code starts to run, everything will pale in comparison with the database requests:
reduce the number of DB calls per page. The best optimization possible is, obviously, not to make the request to the DB at all to start with. Some say 4 reads and 1 write per page are the most a high-load server should handle, others say one DB call per page, still others say 10 calls per page is OK. The point is that fewer is always better than more, and writes are significantly more costly than reads. Review your UI design; perhaps that hit count in the corner of the page that nobody sees doesn't need to be that accurate...
Make sure every single DB request you send to the SQL server is optimized. Look at each and every query plan, make sure you have proper covering indexes in place, make sure you don't do any table scans, review your clustered index design strategy, review all your IO load, storage design, etc. Really, there is no shortcut you can take here; you have to analyze and optimize the heck out of your database, it will be your choking point.
eliminate contention. Don't have readers wait for writers. For your stack, SNAPSHOT ISOLATION is a must.
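A quick C# sketch of opting into snapshot isolation via System.Transactions (the connection string and table are placeholders; the database must first have ALLOW_SNAPSHOT_ISOLATION enabled via ALTER DATABASE):

using System;
using System.Data.SqlClient;
using System.Transactions;

class ReportQuery
{
    // Readers running under snapshot isolation see a consistent version of
    // the data without blocking (or being blocked by) concurrent writers.
    static void Main()
    {
        var options = new TransactionOptions { IsolationLevel = IsolationLevel.Snapshot };
        using (var scope = new TransactionScope(TransactionScopeOption.Required, options))
        using (var conn = new SqlConnection("Data Source=.;Initial Catalog=Analytics;Integrated Security=True")) // illustrative connection string
        using (var cmd = new SqlCommand("SELECT COUNT(*) FROM dbo.PageHits", conn)) // illustrative table
        {
            conn.Open();
            Console.WriteLine(cmd.ExecuteScalar());
            scope.Complete();
        }
    }
}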
cache results. And usually this is where the cookie crumbles. Designing a good cache is actually quite hard to pull off. I would recommend you watch the Facebook SOCC keynote: Building Facebook: Performance at Massive Scale. Somewhere around slide 47 they show what a typical internal Facebook API looks like:
cache_get (
$ids,
'cache_function',
$cache_params,
'db_function',
$db_params);
Everything is requested from a cache and, if not found, requested from their MySQL back end. You probably won't start with 60,000 servers though :)
On the SQL Server stack the best caching strategy is one based on Query Notifications. You can almost mix it with LINQ...
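A rough sketch of a Query Notification-backed cache using SqlDependency (table/column names are illustrative; the query must follow the notification rules - explicit column list, two-part table names - and Service Broker must be enabled on the database):

using System;
using System.Data;
using System.Data.SqlClient;

public class NotifiedCache
{
    private readonly string _connectionString;
    private DataTable _cached;   // dropped automatically when the underlying data changes

    public NotifiedCache(string connectionString)
    {
        _connectionString = connectionString;
        SqlDependency.Start(_connectionString);   // one-time setup per connection string
    }

    public DataTable GetTopPages()
    {
        if (_cached != null) return _cached;

        using (var conn = new SqlConnection(_connectionString))
        using (var cmd = new SqlCommand("SELECT PageId, HitCount FROM dbo.PageStats", conn))
        {
            var dependency = new SqlDependency(cmd);
            dependency.OnChange += (s, e) => _cached = null;   // invalidate the cache on change

            conn.Open();
            var table = new DataTable();
            table.Load(cmd.ExecuteReader());
            _cached = table;
            return table;
        }
    }
}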
I will define heavy traffic as traffic which triggers resource intensive work. Meaning, if one web request triggers multiple sql calls, or they all calculate pi with a lot of decimals, then it is heavy.
If you are returning static html, then your bandwidth is more of an issue than what a good server today can handle (more or less).
The principles are the same no matter if you use MVC or not when it comes to optimize for speed.
Having a decoupled architecture makes it easier to scale by adding more servers etc.
Use a repository pattern for data retrieval (makes adding a cache easier).
Cache data which is expensive to query.
Data to be written could be written through a cache, so that the client doesn't have to wait for the actual database commit.
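A minimal sketch of the repository-plus-cache idea in C# (interface and type names are made up for illustration):

using System;
using System.Runtime.Caching;

// The rest of the application depends only on this abstraction,
// so a cache (or a different data store) can be slotted in later.
public interface IProductRepository
{
    Product GetById(int id);
}

public class CachedProductRepository : IProductRepository
{
    private readonly IProductRepository _inner;     // e.g. the SQL-backed repository
    private readonly MemoryCache _cache = MemoryCache.Default;

    public CachedProductRepository(IProductRepository inner)
    {
        _inner = inner;
    }

    public Product GetById(int id)
    {
        string key = "product:" + id;
        var cached = _cache.Get(key) as Product;
        if (cached != null) return cached;

        var product = _inner.GetById(id);           // the expensive query happens here
        _cache.Set(key, product, DateTimeOffset.UtcNow.AddMinutes(5));
        return product;
    }
}

public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
}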
There are probably more ground rules as well. Maybe you can say something about the architecture of your application and how much load you need to plan for?
MSDN has some resources on this. This particular article is out of date, but is a start.
I would suggest also not limiting yourself to reading about the MVC stack: many principles are cross-platform.
Greetings,
I've been working on a C#.NET app that interacts with a data logger. The user can query and obtain logs for a specified time period, and view plots of the data. Typically a new data log is created every minute and stores a measurement for a few parameters. To get meaningful information out of the logger, a reasonable number of logs need to be acquired - data for at least a few days. The hardware interface is a UART to USB module on the device, which restricts transfers to a maximum of about 30 logs/second. This becomes quite slow when reading in the data acquired over a number of days/weeks.
What I would like to do is improve the perceived performance for the user. I realize that with the hardware speed limitation the user will have to wait for the full download cycle at least the first time they acquire a larger set of data. My goal is to cache all data seen by the app, so that it can be obtained faster if ever requested again. The approach I have been considering is to use a light database, like SqlServerCe, that can store the data logs as they are received. I am then hoping to first search the cache prior to querying a device for logs. The cache would be updated with any logs obtained by the request that were not already cached.
Finally my question - would you consider this to be a good approach? Are there any better alternatives you can think of? I've tried to search SO and Google for reinforcement of the idea, but I mostly run into discussions of web request/content caching.
Thanks for any feedback!
Seems like a very reasonable approach. Personally I'd go with SQL CE for storage, make sure you index the column holding the datetime of the record, then use TableDirect on the index for getting and inserting data so it's blazing fast. Since your data is already chronological there's no need to get any slow SQL query processor involved, just seek to the date (or the end) and roll forward with a SqlCeResultSet. You'll end up being speed limited only by I/O. I profiled doing really, really similar stuff on a project and found TableDirect with SQLCE was just as fast as a flat binary file.
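Something along these lines (assuming a Logs table with a datetime index named IX_Logs_Timestamp; table, index, and column ordinals are illustrative):

using System;
using System.Data;
using System.Data.SqlServerCe;

public class LogCache
{
    private readonly string _connectionString;   // e.g. "Data Source=logcache.sdf"

    public LogCache(string connectionString)
    {
        _connectionString = connectionString;
    }

    // Reads cached logs from startTime onwards by seeking on the datetime index,
    // avoiding the SQL query processor entirely.
    public void ReadFrom(DateTime startTime, Action<DateTime, double> onLog)
    {
        using (var conn = new SqlCeConnection(_connectionString))
        using (SqlCeCommand cmd = conn.CreateCommand())
        {
            conn.Open();
            cmd.CommandType = CommandType.TableDirect;
            cmd.CommandText = "Logs";
            cmd.IndexName = "IX_Logs_Timestamp";
            cmd.SetRange(DbRangeOptions.InclusiveStart, new object[] { startTime }, null);

            using (SqlCeResultSet rs = cmd.ExecuteResultSet(ResultSetOptions.Scrollable))
            {
                while (rs.Read())
                {
                    onLog(rs.GetDateTime(0), rs.GetDouble(1));   // timestamp, measurement
                }
            }
        }
    }
}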
I think you're on the right track wanting to store it locally in some queryable form.
I'd strongly recommend SQLite. There's a .NET class here.
I am trying to work out how to calculate the latency of requests through a web-app (Javascript) to a .net webservice.
Currently I am essentially trying to sync both client and server time, so that when hitting the web service I can look at the offset (which would accurately show the 'up' latency).
The problem is that when you sync the times, you have to factor in latency for that also. So currently I am timing the sync request (round trip) and dividing by 2, in an attempt to get the 'up' latency, and then modifying the sync accordingly.
This works on the assumption that latency is symmetrical, which it isn't. Does anyone know a procedure that would be able to determine specifically the up/down latency of a JS HTTP request to a .NET service? If it needs to involve multiple handshakes that's fine, whatever is as accurate as possible.
Thanks!!
I think this is a tough one - or impossible, to be honest.
There are probably a lot of things you can do to come more or less close to what you want. I can see two ways to tackle the problem:
Use something like NTP to synchronize the clocks and use absolute timestamps. This would be fairly easy but is of course only possible if you control both, server and client (which you probably do not).
Try to make an educated guess :) This would be along the lines what you are doing now. Maybe ping could be of some assistance in any way?
The following article might provide some additional idea(s): A Stream-based Time Synchronization Technique For Networked Computer Games.
Mainly it suggests to make multiple measurements and discard "outliers". But in the end it is not that far from your current implementation, if I understand correctly.
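For what it's worth, a sketch of that multi-sample approach in C# (getServerTime stands in for whatever hypothetical call fetches the server's timestamp over the wire):

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class ClockSync
{
    // Takes several samples, keeps the ones with the smallest round-trip time
    // (least likely to be skewed by queuing), and averages their offsets.
    public static TimeSpan EstimateOffset(Func<DateTime> getServerTime, int samples = 10)
    {
        var measurements = new List<(double rttMs, double offsetMs)>();

        for (int i = 0; i < samples; i++)
        {
            var sw = Stopwatch.StartNew();
            DateTime clientSend = DateTime.UtcNow;
            DateTime serverTime = getServerTime();   // hypothetical web service call
            sw.Stop();

            double rtt = sw.Elapsed.TotalMilliseconds;
            // Assume symmetric latency: the server timestamp was taken ~rtt/2 after send.
            double offset = (serverTime - clientSend).TotalMilliseconds - rtt / 2.0;
            measurements.Add((rtt, offset));
        }

        // Discard outliers: keep the half of the samples with the lowest RTT.
        var best = measurements.OrderBy(m => m.rttMs).Take(samples / 2).ToList();
        return TimeSpan.FromMilliseconds(best.Average(m => m.offsetMs));
    }
}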
Otherwise there is some academic material available for a more theoretical approach (by first reading some stuff, I mean). These are some things I found: Time Synchronization in Ad Hoc Networks and A clock-sampling mutual network time-synchronization algorithm for wireless ad hoc networks. Or you could have a look at the NTP-Protocol.
I have not read those though :)
When writing ASP.NET pages, what signs do you look for that your page is making too many roundtrips to a database or server?
(This is a general question but I say ASP.NET as the majority of my coding is on the web side of things).
How much is too much? The €1M question! Profile. Then profile. If your app is spending most of its time doing data access, you have a problem (and should look at a sql trace). If it is spending most of its time drawing the UI, then (assuming your view isn't doing data access) you should probably look elsewhere first...
Round trips are more relevant to latency than the total quantity of data being moved, so it really does make sense to optimize for them. The usual way is to use stored procedures that do multiple steps, perhaps even returning multiple result sets.
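For example, a single stored procedure call can return several result sets that the page consumes in one round trip; a sketch (the procedure name and columns are illustrative):

using System.Data;
using System.Data.SqlClient;

public static class PageData
{
    // One round trip returns both the customer header and their orders.
    public static void LoadCustomerPage(string connectionString, int customerId)
    {
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand("dbo.GetCustomerPageData", conn))
        {
            cmd.CommandType = CommandType.StoredProcedure;
            cmd.Parameters.AddWithValue("@CustomerId", customerId);

            conn.Open();
            using (SqlDataReader reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                {
                    // first result set: customer details
                }

                reader.NextResult();
                while (reader.Read())
                {
                    // second result set: the customer's orders
                }
            }
        }
    }
}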
What I do is I look at the ASP performance counters and SQL performance counters. To get an accurate measurement you must ensure that there is no random noise activity on the SQL Server (ie. import batches running unrelated to the web site).
The relevant counters I look at are:
SQL Statistics/Batch requests/sec: This indicates exactly how many Transact-SQL batches the server receives. It can be, in most cases, equated 1:1 with the number of round trips from the web site to SQL.
Databases/Transactions/sec: this counter is instanced per database, so I can quickly see in which database there is 'activity'. This way I can correlate the web site data round trips (i.e. my app-logic requests, which go to the app database) with the ASP session state and user stuff (which goes to the ASP session db or tempdb).
Databases/Write Transaction/sec: This I correlate with the counters above (transaction per second) so I can get a feel of the read-to-write ratio the site is doing.
ASP.NET Applications/Requests/sec: With this counter I can get the number of requests/sec the site is seeing. Correlated with the number of SQL Batch Requests/sec it gives a good indication of the average number of round-trips per request.
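The same counters can also be sampled from code with System.Diagnostics; a rough sketch (the exact category/counter/instance names depend on the SQL Server instance and site, so treat these strings as placeholders):

using System;
using System.Diagnostics;

class CounterSnapshot
{
    static void Main()
    {
        // Placeholder category/counter/instance names; adjust for your installation.
        var sqlBatches = new PerformanceCounter("SQLServer:SQL Statistics", "Batch Requests/sec");
        var aspRequests = new PerformanceCounter("ASP.NET Applications", "Requests/Sec", "__Total__");

        // Rate counters need two samples to produce a meaningful value.
        sqlBatches.NextValue();
        aspRequests.NextValue();
        System.Threading.Thread.Sleep(1000);

        float batchesPerSec = sqlBatches.NextValue();
        float requestsPerSec = aspRequests.NextValue();

        Console.WriteLine("SQL batches/sec: {0}, ASP.NET requests/sec: {1}", batchesPerSec, requestsPerSec);
        Console.WriteLine("Approx. round trips per request: {0:F1}", batchesPerSec / Math.Max(requestsPerSec, 1));
    }
}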
The next thing to measure is usually trying to get a feel for where the time is spent in the request. On my own projects I make abundant use of performance counters that I publish myself, so it is really easy to measure. But I'm not always so lucky as to clean up only my own mess... Profiling is usually not an option for me because most of the time I troubleshoot live production systems that I cannot instrument.
My approach is to try to sort out the SQL side of things first, since it's easy to find the relevant statistics for execution times in SQL: SQL keeps a nice aggregated statistic ready to look at in sys.dm_exec_query_stats. I can also use Profiler to measure execution duration in real time. With some analysis of these collected numbers, knowing the normal request pattern of the most visited pages, you can give a pretty good estimate of the total time spent in SQL per web request. If this time adds up to nearly all the time it takes a request to serve the page, then you have your answer.
And to answer the original question title: to reduce the number of round trips, you make fewer requests. Seriously. First, cache what is appropriate to cache; I guess that is obvious. Second, reduce complexity: don't display unnecessary data on each page, cache and display stale data when you can get away with it, and hide details on secondary navigation panels.
If you feel that the problem is the number of round-trips per se as opposed to the number of requests (ie. you would benefit tremendously from batching multiple requests in one round-trip), then you should somehow measure that the round-trip overhead is what's killing you. With connection pooling on a normal network connection this is usually not the most important factor.
And finally you should look if everything that can be done in sets is done in sets. If you have some half-brained ORM that retrieves objects one at a time from an ID keyset, get rid of it.
I know that this may sound reiterative, but client-server round trips depend on how much program logic is located on each side of the connection.
First thing to check is validation: you always have to validate and sanitize your input on the server side, but that does not mean you cannot do it on the client side too, reducing round trips that are used only to check input.
Second: what can you do on the client side to reduce server-side load? There are calculations that you can check or make on the client side. There is also AJAX, which can be used to load only the portion of the page that is changing.
Third: can you delegate work to another server? If your server is too loaded, why not use web services or simply delegate some of the logic to another server?
As Mark wrote: how much is too much? It is up to you and your budget.
When writing ASP.NET pages, what signs do you look for that your page is making too many roundtrips to a database or server?
Of course it all depends and you have to profile. However, here are some indicators; they do not necessarily mean there is a problem, but they often point to one:
Page is taking a very long time to render locally.
Read this question: Slow response-time cheat sheet , In particular this link
To render the page you need more than 30 round trips. I pulled that number out of my hat, but assuming a round trip is taking about 3.5ms then 30 round trips will kick you over the 100ms guideline (before any other kind of processing).
All the queries involved in rendering the page are heavily optimized and do not take longer than a millisecond or two to execute. There are no operations that require lots of CPU cycles that execute every time you render the page.
Data access is abstracted away and not cached in any kind of way. If, for example, GetCustomer calls the DAL which in turn issues a query, and your page is asking for 100 Customer objects which are not retrieved in a batch, you are probably in trouble.
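To illustrate the batching point, a sketch of fetching all 100 customers in a single round trip (table, column, and type names are made up):

using System.Collections.Generic;
using System.Data.SqlClient;
using System.Linq;

public static class CustomerDal
{
    // One query with an IN list instead of 100 separate GetCustomer calls.
    public static List<Customer> GetCustomers(string connectionString, IList<int> ids)
    {
        var customers = new List<Customer>();
        if (ids.Count == 0) return customers;

        // Build "@p0, @p1, ..." so every id is still passed as a parameter.
        string paramList = string.Join(", ", ids.Select((_, i) => "@p" + i));
        string sql = "SELECT Id, Name FROM dbo.Customers WHERE Id IN (" + paramList + ")";

        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(sql, conn))
        {
            for (int i = 0; i < ids.Count; i++)
                cmd.Parameters.AddWithValue("@p" + i, ids[i]);

            conn.Open();
            using (var reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                    customers.Add(new Customer { Id = reader.GetInt32(0), Name = reader.GetString(1) });
            }
        }
        return customers;
    }
}

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}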