We have a distributed system system built on erlang with one server node and hundreds of client nodes(the systems are distributed over the intra-net). We have a requirement that all the client nodes will connect to the server node and try to download some file(mostly the same file will be accessed by all client nodes) simultaneously by using sftp. The steps we follow for downloading the file is:
Establish ssh sftp connection between the server node and client node using function call as below:
ssh_sftp:start_channel/2 .
Then reads the file by doing function call as below:
ssh_sftp:read_file/2
The problem what we are facing is that when the number of clients are more then it is observed that few client nodes are failing to establish connection between server node. i.e. the ssh_sftp:start_channel/2 function call is failing.
Can somebody please explain me;
Is there any limitation for the number of sftp session what we can establish in a single system ?
What are the possible reason because of which the connection request fails ?
Is there anything wrong we are doing in this approach ?
Is there any better solution by which we can guarantee that all client nodes will be able to connect to server and will be able to download the file.
Observation: We tried to connect 25 client nodes to the server; during the first try only 2 nodes failed to connect and on the second try 5 nodes failed to connect. Why this random behavior ?.
I think I can answer some question below (fix me if I wrong):
Erlang is really strong when you use this concurrent, so your issue here is power of your physical hardware (server).
I don't really know what the issue in your project but my project about telecom can easy handle a thousand call which 2 process handle each call. 1 process is the master process handle the session (connection) and each other watching and handle error master process, so that be can NOT failed the connection
Related
I have put together an architecture that at high level is best described below
Five node docker swarm cluster
Have say 5 instances of my dockerized micro service running one copy on each of the swarm nodes
The service offers functionality via REST end points
One such functionality is downloads and they work perfectly, I wrote some code in Scala/Play framerwork, dockerized the service and deployed it.
I also know that since I use swarm , it internally does LB per request for me.
I have some questions on WebSocket and how load balancer does not ruin things during download.
I start a 5GB file download and it works. I am using HTTP stream or chunked I guess it does not matter. Now my question is once my REST end point for download is hit, the TCP connection remains open and since it is open until the server closes the connection, it is due to this that the swarm load balancing does not interfere? In short, each time a client requests a HTTP call, swarm load balances it but once the TCP socket is established as in case of specific download example, the request is served by one node as the connection is not re-stablished during the download process?
If a client opens a web socket, it will hit one of the nodes of swarm where the service is running and the websocket connection since it is open, the same service instance will push the notifications?
If for some reason the websocket dies, a new connection might be established by client but the request might end up on some other service instance and will remain like that until a new connection is again established?
Are above 3 points correct in my understanding? Is there some reading material/blogs I can find more on elaborating this?
Maybe using nginx like proxy LB, ip_hash mode
Specifies that a group should use a load balancing method where requests are distributed between servers based on client IP addresses. The first three octets of the client IPv4 address, or the entire IPv6 address, are used as a hashing key. The method ensures that requests from the same client will always be passed to the same server except when this server is unavailable. In the latter case client requests will be passed to another server. Most probably, it will always be the same server as well.
http://nginx.org/en/docs/http/ngx_http_upstream_module.html#ip_hash
I want to create containers which start small webservices. Developers of our team should then upload small images which contain different services. A main backend system then uses these services.
My problem is: When a developer uploads a new service, how does the backend service know there is a new service it can use? Beforehand, when service X should be used and there was no service for that functionality, it returned just a simple message. When there is a service uploaded to do X, the main backend should use that service. But how does it know it is there and should be used?
You can add some notification to your small web service. But what do you do when the service will be down unexpectedly or network will be down at the short time? You need add some logic to clients for refreshing connection.
And docker recommends to use this way.
The problem of waiting for a database (for example) to be ready is
really just a subset of a much larger problem of distributed systems.
In production, your database could become unavailable or move hosts at
any time. Your application needs to be resilient to these types of
failures.
To handle this, your application should attempt to re-establish a
connection to the database after a failure. If the application retries
the connection, it should eventually be able to connect to the
database.
The best solution is to perform this check in your application code,
both at startup and whenever a connection is lost for any reason.
I am presently working on a client-server solution to transfer files to another machine via a socket network connection. Since I intend to do some evaluation on the receiving end as well I am assuming that I will need to have some kind of client or server programme running there, too.
I am fairly new to the whole client-server thing and therefore have the following elementary question:
My present understanding is that client and server will be two independent programmes running on two different machines. How would one typically ensure that the communication partner (i.e., the server when sending from a client and the client when sending from a server) is actually up and running on the remote machine that I want to transfer a file to?
So far, I have been looking into the following options:
In the sending programme include an ssh access to the remote
machine and start an instance of the receiving programme on the
remote machine.
Have the receiving programme run as a demon process on the remote
machine. This would mean that the receiving programme should always
be running on the remote machine. However, how would I know whether
the process has crashed or has been shut down for some reason and
how would one recover from that without option 1) above?
So, my main question is: Are there any additional options that might be worth considering?
Thanks for your view on this!
Depending on how your client server messages are setup, a ping (I don't mean the ICMP ping, but the basic idea) message, where the server can respond with "I am alive" would help. This way at least you know the server end is running.
It is not uncommon in production environments using these that monitoring systems are put in place. Other options worth considering - xinet.d scripts - stuff that gets started on incoming connections.
There probably new ways to achieve the automatic start/restart or start on connection of this with systemd/systemctl but I am not familiar enough with them to give you the specifics.
A somewhat crude, but effective means may be a cron job that periodically runs a script to enforce keeping the service up.
so I'm making an iOS app, but this is more of a general networking question.
So what I have is one phone that acts as the server and then a bunch of phones connect to the phone as the client. Basically it's a game/music sharer.
It's kind of hard to really get into the semantics of it, but that isn't important.
What is important is that the server and client are repeatedly sending each other commands and positions rapidly over a TCP connection, and sometimes the client wants to send the server a music file (4MB usually) to play as the music.
The problem I initially encountered was that when sending the large file, it would hang the sending of commands from the client to the server.
My naive solution was to create another socket to connect to the server to send the file to the server, the server would check the IP of the new socket, and if it has the IP of an existing connection then it would just tie it to that connection, receive the file, and then disconnect the socket.
But the problem with this is that it takes a 1-2 second delay for the socket to connect, and I'm aware that there are man-in-the-middle attacks that can occur.
Is there a more elegant solution to this problem?
I would not call your solution naive, this is largely how FTP works, separating data and control paths is a good design pattern in my view.
I wouldn't worry about the man in the middle thing. If you wanted, you could add a command to the client that it responds to over the data connection with a secret the server supplies, this would let you associate the connections without using the ip addressing.
If the delay is a problem then why not establish both connections at the start, the overhead of a few tcp connections on an operating system is not usually significant.
You could also use the two connections for both commands and data, alternating between them. Since both the server and client know when a connection is busy they can choose to use the idle one. The advantage of this is that it will keep both connections busy to ensure they are both known to be working.
You probably should also use a different thread for each socket but I suspect you are doing this since it won't work too well without it.
I'm looking to detect local connection loss. Is there a mean to do that, as with the events on the Corelabs components ?
Thanks
EDIT:
Sorry, I'm going to try to be more specific:
I'm currently designing a prototype using datasnap 2009. So I've got a thin client, a stateless server app and a database server.
What I would be able to do is to detect and handle connection loss (internet connectivity) between the client and the server app to handle it appropriately, ie: Display an informative error message to the user or to detect a server shutdown to silently redirect on another app server.
In 2-tier I used to manage that with ODAC components, the TOraSession have some events to handle this issues.
Normally there is no event fired when a connection is broken, unless a statement is fired against the database. This is because there is no way of knowing a connection loss unless there is some sort of is-alive pinging going on.
Many frameworks check if a connection is still valid by doing a very small query against the server. Could be getting the time from a server. Especially in a connection pooling environment.
You can implement a connection checking function in your application in some of the database events (beforeexecute?). Or make a timer that checks every 10 seconds.
Spawn a thread on the client which periodically sends some RPC 'Ping' or 'Heartbeat' commands to the server.
if this fails, the client knows that something happened to the connection
if the server does not hear the client anymore for some time period (for example, two times the heartbeat interval), he can conclude that the client disconnected, however this requires a stateful server (and your design is stateless so it would require event processing in a secondary system, which could be fed through a message queue)