This is a common question, but Googling it mostly turns up noise.
As data volumes get higher and higher, along with processing power and cloud capabilities, we are witnessing a growing need for fast data transfer technologies capable of unleashing the power of all this available data by spreading, moving, and sharing it across different servers and clients.
In our case, we are recording real-time binary data (50 GB a day) and we need to upload it to / download it from subscribers every day (yes, all of it is needed locally by each subscriber server, for computing and various data analysis tasks).
So, to put it shortly: what choices are available today to transfer many gigabytes of data really fast between remote Windows servers (VPS, cloud) over a fairly consistent bandwidth (dedicated optic fiber aside)?
This is an open question. Every idea is welcome, whatever the protocol.
The challenge of sending and receiving data over the network is multi-fold.
Network bandwidth is the most limiting factor, and there is hardly anything you can do about it at the application level (except occasionally compressing the data, and even then the compression ratio determines the gain). So a faster network is the first choice.
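If your binary data does compress well, squeezing it before transfer is one of the few application-level wins available. Below is a minimal sketch using .NET's built-in GZipStream; the file paths are placeholders, and whether this helps at all depends entirely on how compressible your recordings actually are.

    using System.IO;
    using System.IO.Compression;

    class Compressor
    {
        // Compress a file with GZip before sending it over the wire.
        // The gain depends entirely on the compressibility of your data.
        static void CompressFile(string inputPath, string outputPath)
        {
            using (FileStream input = File.OpenRead(inputPath))
            using (FileStream output = File.Create(outputPath))
            using (GZipStream gzip = new GZipStream(output, CompressionMode.Compress))
            {
                input.CopyTo(gzip); // streams the data; no need to load 50 GB into memory
            }
        }
    }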
Certain protocols are better suited to transferring certain types of files/data. For example, HTTP is a text-based protocol, not really suited to binary and large content. But since it is the most popular web protocol, and binary content still needs to be sent over the wire, techniques like encoding and chunking have evolved. HTTP is really not the choice if your data is on the order of gigabytes, which is your case.
FTP is the most popular protocol used to transfer files over the network, and it is specifically designed for transferring files. There are several extensions of FTP, like GridFTP and bbFTP, which are designed specifically for large data transfers.
BitTorrent is another option that can be explored. Facebook uses BitTorrent to push binaries to its servers (tens of thousands of them).
Your problem is very open-ended and I am limited by my experience :). Here is a link I found that deals with large data transfers. Hope this helps you.
There is a company that has about 5000 terminals around the city that serve public needs. These terminals run in-house software that needs to be updated every week or two. The software consists of executable code, images, configuration files and page templates. The size of the periodic update varies and is about 25-30 MB on average. Terminals are connected to the internet via GPRS.
This company has a server that provides dynamic data to these terminals. When it comes to updates, the server cannot handle all terminals at once. It takes 1-2 weeks to update all of them. There is a need to decrease the update time as much as possible, however the company cannot afford additional server resources.
What external resources can be used to decrease the update time? The software can be encrypted and making it publicly available is not a problem. Is Google Drive an option? If not, what free or low cost resources are there on the internet to upload the software and let the terminals download it simultaneously? (5000 terminals, 25-30 MB at the same time, using standard protocols - ftp, http, etc...)
Google for "free webhosting" in your country. Offering a HTTP server with resumable chunked download service and resumable secured FTP download service is very common and cheap from the software infrastructure point of view.
On the terminal-side you can then use something like cURL.
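If you would rather keep the logic inside the terminals' own code instead of shelling out to cURL, the same resumable behaviour can be sketched in C# with an HTTP Range request. This is a minimal illustration, assuming the hosting server honours Range headers; the URL and paths are placeholders.

    using System.IO;
    using System.Net;

    class ResumableDownloader
    {
        // Resume a download from where a previous attempt left off by
        // sending an HTTP Range header (hypothetical update-package URL).
        static void Resume(string url, string localPath)
        {
            long existing = File.Exists(localPath) ? new FileInfo(localPath).Length : 0;

            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
            if (existing > 0)
                request.AddRange(existing); // ask the server for the remaining bytes only

            using (WebResponse response = request.GetResponse())
            using (Stream body = response.GetResponseStream())
            using (FileStream file = new FileStream(localPath, FileMode.Append))
            {
                body.CopyTo(file);
            }
        }
    }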
The problem you describe is also the territory of Content Delivery Networks (CDNs), so Google for CDN providers in your country as well.
In order to reduce download volumes, you could distribute update packages in the form of a binary diff (delta patch) instead of a full-size update image.
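A full binary diff tool is one way to do that; a simpler, file-level variant (my suggestion, not something the answer above spells out) is to publish a hash manifest and let each terminal fetch only the files whose hashes changed. A minimal sketch:

    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Security.Cryptography;

    class Manifest
    {
        // Build a filename -> SHA-256 map for an install directory. A terminal
        // compares its own manifest against the server's and fetches only files
        // whose hashes differ - a delta update at file granularity.
        static Dictionary<string, string> Build(string root)
        {
            var manifest = new Dictionary<string, string>();
            using (SHA256 sha = SHA256.Create())
            {
                foreach (string path in Directory.GetFiles(root, "*", SearchOption.AllDirectories))
                {
                    using (FileStream stream = File.OpenRead(path))
                        manifest[path.Substring(root.Length)] =
                            Convert.ToBase64String(sha.ComputeHash(stream));
                }
            }
            return manifest;
        }
    }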
..This company has a server..When it comes to updates (every week or two), the server cannot handle all terminals..however the company cannot afford additional server resources..
This scenario can be solved by hiring Software as a Service (SaaS) somewhere "in the cloud" only when it is needed. The company would not have to invest massively in server resources, just pay a fee every week or two.
What are the upsides to using an FTP server (hosted by a third party) to transfer (and maybe store) files when compared to just sending through email? The language of choice is C#.
Email looks easier to implement, and if it went to Gmail then server hosting and upkeep would not be a worry. However, I am not experienced with FTP servers and don't know how big of a deal setup and upkeep are for them. All that is being sent is a bunch of text files, most likely each under 1 MB. Security is not a big deal at this point, but I am curious which is more secure without a lot of extra setup work.
Emailing means you have no guarantee that the file is received at the other end, or received in a timely manner. Maybe this is not important for you? Emailing would certainly be easier to program than FTP.
On the other hand, if you use one of the many FTP libraries available for .NET, then you have complete control. You could include the library in a C# Windows service to do the transferring seamlessly for you, including exception (error) processing and notification.
Personally, I'd take the opportunity to learn about FTP (it's easy). You would of course need an FTP service set up on your server. All part of the learning.
I don't know your specific use case, but it sounds like FTP is more appropriate than email for transferring and storing files. I mean, it is called the "File Transfer Protocol" for a reason ;) The upside of FTP over email is that it is designed for files, while email is designed for messages - automating the management of file attachments in email will be more difficult.
Setting up an FTP server is not difficult. Check out FileZilla:
https://filezilla-project.org/download.php?type=server
Sending files via FTP with C# is not difficult either. Here is a question on that:
Upload file on ftp
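For a flavour of what that looks like, here is a minimal upload sketch using the FtpWebRequest class built into .NET. The host name, credentials, and paths are placeholders:

    using System;
    using System.IO;
    using System.Net;

    class FtpUploader
    {
        // Upload a local file with the FTP support built into .NET.
        static void Upload(string localPath)
        {
            FtpWebRequest request = (FtpWebRequest)WebRequest.Create(
                "ftp://ftp.example.com/reports/" + Path.GetFileName(localPath));
            request.Method = WebRequestMethods.Ftp.UploadFile;
            request.Credentials = new NetworkCredential("user", "password");

            using (Stream source = File.OpenRead(localPath))
            using (Stream destination = request.GetRequestStream())
            {
                source.CopyTo(destination);
            }

            using (FtpWebResponse response = (FtpWebResponse)request.GetResponse())
            {
                // e.g. "226 Transfer complete." on success
                Console.WriteLine(response.StatusDescription);
            }
        }
    }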
BTW, again without knowing your requirements, there are also cloud services like Dropbox and Box.com that have APIs that might be even more appropriate for you.
I am currently working on a social networking application that needs to be highly scalable.
I have been reading about the publish/subscribe pattern (message bus) and am struggling to understand the proper use-case scenarios - when would this be appropriate, and when would it be overkill?
For example:
There are several areas of the site where users enter information on a form that needs to be saved to the database;
When the database save occurs, one or more email notifications must be sent to users.
Also, for the save scenarios, I would like to give the user friendly messages letting them know their data has been saved once the saving process completes, if I were to go with the pub/sub approach.
How would I return success/fail messages back to the UI after a specific task completes?
Which scenarios are ideal candidates for pub/sub pattern? It seems to be overkill for basic form database saving.
Of your two scenarios, the latter is a possible candidate for being implemented with a bus. The rule is: the more complex or longer the processing, the higher the probability it won't scale when processed synchronously. Sometimes it is not even a matter of the number of concurrent requests, but of the amount of memory each request consumes.
Suppose your server has 8 GB of memory and you have 10 concurrent users, each taking 50 megabytes of RAM. Your server handles this easily. However, when more users come, the processing time suddenly stops scaling linearly. This is because concurrent requests start hitting virtual memory, which is a great deal slower than physical memory.
And this is where the bus comes into play. A bus lets you throttle concurrent requests by queuing them. Your subscribers take requests and handle them one by one, but because the number of subscribers is fixed, you have control over resource usage.
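The throttling idea can be shown without committing to any particular bus product. A minimal in-process sketch, assuming .NET's BlockingCollection as the queue and a fixed pool of subscriber tasks:

    using System;
    using System.Collections.Concurrent;
    using System.Threading.Tasks;

    class ThrottlingQueue
    {
        static readonly BlockingCollection<string> Queue = new BlockingCollection<string>();

        static void Main()
        {
            const int subscriberCount = 4; // fixed: this caps concurrent resource usage

            for (int i = 0; i < subscriberCount; i++)
            {
                Task.Factory.StartNew(() =>
                {
                    foreach (string request in Queue.GetConsumingEnumerable())
                        Handle(request); // one request at a time per subscriber
                }, TaskCreationOptions.LongRunning);
            }

            // Publishers just enqueue; they never spawn unbounded work.
            Queue.Add("generate-accounting-report");
            Console.ReadLine(); // keep the sketch alive while subscribers work
        }

        static void Handle(string request)
        {
            Console.WriteLine("Processing " + request);
        }
    }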
Sending emails - what else? Well, for example, we queue all requests that involve reporting / document generation. We have observed that some specific documents are generated in short, specific time spans (for example, accounting reports at the end of each month), and because a lot of data is processed, we used to suffer a complete paralysis of our servers.
Instead, having a queue only means that users have to wait a little longer for their documents, but the responsiveness of the server farm stays under control.
Answering your second question: because of the asynchronous, detached nature of processing implemented with message buses, you usually make the UI actively ask whether or not the processing is done. It is not the server that pushes the processing status to the UI; rather, the UI asks, and asks, and asks, and suddenly it learns that the processing is complete. This scales well, whereas maintaining a two-way connection to push the notification back to the client can be expensive with a large number of users.
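As a rough illustration of that polling loop, here is a sketch of the client side; the status URL and the "done" marker are my own placeholders, not part of any particular framework:

    using System;
    using System.Net;
    using System.Threading;

    class StatusPoller
    {
        // The UI asks repeatedly whether processing has finished instead of
        // holding a push connection open.
        static void WaitForCompletion(string jobId)
        {
            using (var client = new WebClient())
            {
                while (true)
                {
                    string status = client.DownloadString(
                        "http://example.com/jobs/" + jobId + "/status");
                    if (status == "done")
                        break;

                    Thread.Sleep(2000); // back off between polls to keep server load bounded
                }
            }
            Console.WriteLine("Job " + jobId + " complete - update the UI now.");
        }
    }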
I suppose there is no definite answer to your question. IMHO, nobody can evaluate the performance of a design pattern in the abstract. Perhaps someone could compare it with another design pattern, but even then the comparison would be unsafe at best. Performance depends on the actual implementation, which can vary between different implementations of the same design pattern. If you want to evaluate the performance of a software module, you have to build it and then profile it. As Steve McConnell suggests in his legendary book, make no decisions regarding performance without profiling.
Regarding the specific pattern and your scenarios, I would suggest avoiding it. The publish-subscribe pattern is typically used when the subscribers do not want to receive all published messages, but rather only those satisfying some specific criteria (e.g., belonging to a specific kind of message). Therefore, I would not suggest using it for your scenarios.
I would also suggest looking at the Observer pattern. I suppose you could find many more references online.
Hope I helped!
I'm a beginner programmer trying to keep a list of objects stored on a server so that clients can connect and view/edit this list.
I will try to explain as best I can what my set up is.
Server:
This will hold a "Main List" that contains all the objects of that list type. When a Client wants an update this list needs to be passed to the client so they can read it.
Client:
The clients will read the updated list from the server, and when they make a change to the list, this change needs to be sent to the server.
Right now I'm thinking I'll make it so that only one client can edit the list at any time, and since items can be added/removed/changed/reordered, I think it would be best to just have the client send its list to the server to replace the server's copy. That way, since only one client will be editing the list at a time, the list should stay consistent.
My problem is that I can't find a reasonably simple way to send a list of objects over the network. Currently I might be able to pull it off by converting one object at a time to XML and back, but since it's a list, that requires a great deal of code. I'm hoping someone knows an easier way to move a list of objects across the network, or to convert a list of objects to a string and back again.
An easy example of what I'm doing: imagine pictures on a field that people can click and move, so I need to keep track of the order of the images, their x/y positions, and the image names. That is a rough example.
Please let me know what you think, any help would be appreciated. I'm a beginner so I apologize for any incorrect terminology.
Thank you.
Text Protocol
XML
JSON
HTML/SOAP
XML works fine, especially if you're just learning. It's simple, and you've got enough to learn without worrying too much about efficiency at this point. Networking is a big subject!
JSON is a little less verbose than XML and just as readable. I don't think C# has built-in JSON serialization, so you'd need a third-party library such as Json.NET.
A web service would be easy. The amount of information exchanged would be greater, but it would be more scalable, as web servers are pretty optimized. However, I do not suggest creating a simple web service, as it will teach you less than working directly with sockets.
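To show how little code the XML route takes, here is a minimal sketch with .NET's built-in XmlSerializer. The Picture class is a hypothetical stand-in for the asker's movable images:

    using System.Collections.Generic;
    using System.IO;
    using System.Xml.Serialization;

    // Matches the asker's example: an image on a field with a position.
    public class Picture
    {
        public string ImageName { get; set; }
        public int X { get; set; }
        public int Y { get; set; }
    }

    class XmlTransport
    {
        static readonly XmlSerializer Serializer = new XmlSerializer(typeof(List<Picture>));

        // Turn the whole list into a string you can send over a socket...
        public static string ToXml(List<Picture> pictures)
        {
            using (var writer = new StringWriter())
            {
                Serializer.Serialize(writer, pictures);
                return writer.ToString();
            }
        }

        // ...and rebuild it on the other side.
        public static List<Picture> FromXml(string xml)
        {
            using (var reader = new StringReader(xml))
                return (List<Picture>)Serializer.Deserialize(reader);
        }
    }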
Binary Protocol
Built in serialization
Third-party (Protobuf or Thrift)
Serialize by hand
If the data needs to be sent/received fast, or you want a more scalable server, I suggest a binary protocol. Binary protocols are compact but difficult to read.
If all clients and servers are C#-based, just use the built-in serialization C# offers. It is more verbose than other serialization solutions, but it's the easiest. Again, as you are learning, I suggest using built-in serialization if you need a binary protocol.
If not all the clients and servers are C#-based, use a library like Google Protocol Buffers or Apache Thrift, which offer a way to serialize objects in a binary format across different languages pretty easily and very efficiently.
The last solution would be to serialize by hand. It will be the fastest, but inflexible, difficult, tedious, and hard to maintain. I do not suggest it.
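For completeness, here is what the built-in binary route looks like with BinaryFormatter, which is what "built-in serialization" generally meant in .NET at the time. (Note: BinaryFormatter has since been deprecated as insecure for untrusted input, so treat this purely as a learning sketch.)

    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Runtime.Serialization.Formatters.Binary;

    [Serializable] // required by the built-in binary serializer
    public class Item
    {
        public string Name;
        public int X;
        public int Y;
    }

    class BinaryTransport
    {
        // List<Item> -> bytes, ready to write to a socket.
        static byte[] ToBytes(List<Item> items)
        {
            using (var stream = new MemoryStream())
            {
                new BinaryFormatter().Serialize(stream, items);
                return stream.ToArray();
            }
        }

        // Bytes read from the socket -> List<Item>.
        static List<Item> FromBytes(byte[] data)
        {
            using (var stream = new MemoryStream(data))
                return (List<Item>)new BinaryFormatter().Deserialize(stream);
        }
    }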
Don't send the whole list back and forth. Send only updates.
The server should hold the compiled list, and should also hold a list of all changes made since the last time the whole list was snapshotted.
Updates are:
deletes
edits
additions
If you don't have much data, you can use simple XML for everything.
When received, every update should get a serial number from the server. That way, each client knows its last update number and can poll for new updates as they become available.
Upon connect, client should:
request a snapshot from the server (receiving the serial number of the last update included in the snapshot)
Later on, client should:
poll regularly for new updates and store those updates in its local list
Upon any change on the client, client should:
pack the change and send it to the server
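Putting the pieces together, here is a minimal sketch of the server-side update log and serial numbering described above. All names are hypothetical, and locking is omitted for brevity:

    using System.Collections.Generic;
    using System.Linq;

    public enum ChangeKind { Add, Edit, Delete }

    // One change to the shared list, stamped by the server.
    public class Update
    {
        public long Serial;      // assigned by the server, strictly increasing
        public ChangeKind Kind;
        public string ItemId;
        public string Payload;   // e.g. the XML of the changed item
    }

    public class UpdateLog
    {
        private readonly List<Update> _log = new List<Update>();
        private long _nextSerial = 1;

        // Server side: stamp and store an incoming change.
        public long Append(Update update)
        {
            update.Serial = _nextSerial++;
            _log.Add(update);
            return update.Serial;
        }

        // Client side: "give me everything after the last serial I have seen".
        public List<Update> Since(long lastSeenSerial)
        {
            return _log.Where(u => u.Serial > lastSeenSerial).ToList();
        }
    }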
Implementing this will be both challenging and fun. And will surely create some more questions here :)
Overview
I am sending messages back and forth between a client (an Android phone) and a server (Windows Server). Using a persistent connection over TCP, which protocol would be the best solution? I am looking at performance, scalability, message size, and battery life. The messages must arrive at the destination in order and must not be duplicated.
MQTT
This seems like the better solution, but there seem to be few examples of large implementations with lots of users. I am not sure if I can integrate this into the Windows server, or if it would have to be another application or server running alongside. Finally, there seems to be a lack of information on it in general.
XMPP
This seems to have lots of implementations, examples, and even a book :). However, its main purpose seems to be instant-messaging clients and things like Google Talk. Would this be an optimal solution for messaging between server and client? I know XMPP is currently mostly used in client-to-server-to-client architectures.
Please correct me if I am wrong and thanks in advance for any guidance.
It depends on what you are trying to do and what hardware you are running.
MQTT has very low keep-alive traffic. XMPP is an IM protocol, and has much, much higher overhead from handling presence messages between all the clients.
If you are constrained to a small memory footprint, having to run an XML parser may make the use of XMPP impossible.
Keep in mind that MQTT stands for MQ Telemetry Transport, i.e., it is a transport protocol and does not define the payload format at all - you will have to supply that. XMPP is an instant-messaging protocol that carefully defines all the message formats and requires that all messages be XML.
In addition to all this: MQTT is a publish/subscribe protocol, while XMPP is an instant-messaging protocol that can be extended (using XEP-0060) to support publish/subscribe. You need to consider this when you architect your system.
We are finding MQTT to be the quiet achiever. Your mileage may vary.
It all depends ...
Track down the recent announcement by LinkedIn where they discuss their use of MQTT in their mobile app.
Cheers
Mark
(BTW Andy was slightly off in his reference to us. We are at the Centre for Educational Innovation & Technology (CEIT), The University of Queensland, Brisbane, Australia)
I think that, in short, the MQTT advantages over XMPP are (see the sketch after this list):
Throughput capacity: less overhead, more lightweight
Binary vs plain text
QoS in place (Fire-and-forget, At-least-once and Exactly-once)
Pub/Sub in place (XMPP requires extension XEP-0060)
No need for an XML parser
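To make the pub/sub model concrete, here is a sketch using the open-source M2Mqtt client library for C# - one possible client, not one the answers above prescribe, and the API may differ between library versions. The broker host, client id, and topics are placeholders.

    using System;
    using System.Text;
    using uPLibrary.Networking.M2Mqtt;
    using uPLibrary.Networking.M2Mqtt.Messages;

    class MqttDemo
    {
        static void Main()
        {
            var client = new MqttClient("broker.example.com");
            client.MqttMsgPublishReceived += (sender, e) =>
                Console.WriteLine(e.Topic + ": " + Encoding.UTF8.GetString(e.Message));

            client.Connect("android-client-42");

            // QoS 2 ("exactly once") matches the in-order, no-duplicates requirement.
            client.Subscribe(new[] { "devices/42/inbox" },
                             new[] { MqttMsgBase.QOS_LEVEL_EXACTLY_ONCE });
            client.Publish("devices/42/outbox",
                           Encoding.UTF8.GetBytes("hello"),
                           MqttMsgBase.QOS_LEVEL_EXACTLY_ONCE,
                           false); // retain flag

            Console.ReadLine();
        }
    }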
I think you are probably correct in your assessment of XMPP: it is primarily a chat-oriented protocol, and it is also quite heavyweight and uses XML extensively, making it verbose. I know that folks at CEIT at the Uni of Brisbane have specifically studied the differences and optimal uses of the two protocols. MQTT is very lightweight and low-power - it has been used for telemetry and sensor applications for over 10 years and has been deployed on a very large scale by IBM and partners. Folks are now finding that a simple protocol like this is ideal for mobile development.
What exactly are you looking to achieve? The mqtt.org site aims to provide good links to content. There are also IRC channels and mailing lists about it. How can we help?