NOTE: I'll use the ssh_sftp channel as an example here, but I've noticed the same behaviour when using different channels.
After starting a channel:
{ok, ChannelPid} = ssh_sftp:start_channel(State#state.cm),
(where cm is my Connection Manager), I'm performing an operation through the channel. Say:
ssh_sftp:write_file(ChannelPid, FilePath, Content),
Then, I'm stopping the channel:
ssh_sftp:stop_channel(ChannelPid),
Since, as far as I know, the channel is implemented as a gen_server, I was expecting the requests to be serialized.
Well, after a bit of tracing, I've noticed that the channel is somehow stopped before the file write completes and its result is sent back through the channel. As a consequence, the response is never delivered, since the channel doesn't exist anymore.
If I don't stop the channel explicitly, everything works fine and the file write (or any other operation performed through the channel) completes correctly. But I would prefer not to leave channels open. On the other hand, I would also prefer not to implement my own receive handler to wait for the result before the channel can be stopped.
I'm probably missing something trivial here. Do you have any idea why this is happening and/or how I could fix it?
To repeat, ssh_sftp is just an example. I'm using my own channels, implemented using the existing channels in the Erlang SSH application as a template.
As you can see in ssh_sftp.erl, it forcefully kills the channel after a 5-second timeout with exit(Pid, kill), which interrupts the process regardless of whether it is processing something or not.
Related quote from the erlang man page:
If Reason is the atom kill, that is if exit(Pid, kill) is called, an untrappable exit signal is sent to Pid which will unconditionally exit with exit reason killed.
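You can see that semantics in a minimal sketch (nothing ssh-specific; the function name is made up for illustration):

kill_demo() ->
    Pid = spawn(fun() ->
                    process_flag(trap_exit, true),
                    receive Msg -> io:format("trapped: ~p~n", [Msg]) end
                end),
    exit(Pid, kill),               % untrappable kill signal
    timer:sleep(10),               % give the signal time to arrive
    false = is_process_alive(Pid). % the process is gone; nothing was printed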
I had a similar issue with ssh_connection:exec/4. The problem is that these ssh sibling modules (ssh_connection, ssh_sftp, etc.) all appear to behave asynchronously, so closing the channel, or the SSH connection itself, shuts down the ongoing action.
The options are:
1) Do not close the connection: this may lead to a resource leak, which is the subject of my question here.
2) After the SFTP operation, introduce a monitoring function that waits by polling the file you are transferring on the remote server (a checksum check). This can be based on ssh_connection:exec, polling the file you are transferring; once the checksum matches what you expect, you can free the main module (see the sketch below).
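A rough sketch of option 2 (hedged: the remote sha256sum command, the 5-second timeouts and the retry interval are assumptions about your environment, ExpectedSum is a binary, and the trailing eof/exit_status/closed channel messages are left for the caller to flush):

wait_for_upload(ConnRef, RemotePath, ExpectedSum) ->
    {ok, Chan} = ssh_connection:session_channel(ConnRef, 5000),
    success = ssh_connection:exec(ConnRef, Chan,
                                  "sha256sum " ++ RemotePath, 5000),
    receive
        {ssh_cm, ConnRef, {data, Chan, 0, Output}} ->
            case binary:match(Output, ExpectedSum) of
                nomatch ->
                    timer:sleep(500),   % transfer still in progress; retry
                    wait_for_upload(ConnRef, RemotePath, ExpectedSum);
                _Found ->
                    ok                  % checksum matches; safe to clean up
            end
    after 5000 ->
        {error, timeout}
    end.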
Related
I'm writing an Erlang application that requires actively polling some remote resources, and I want the process that does the polling to fit into the OTP supervision trees and support all the standard facilities like proper termination, hot code reloading, etc.
However, the two default behaviours, gen_server and gen_fsm, seem to support only callback-based operation. I could abuse gen_server to do this through calls to self, or abuse gen_fsm by having a single state that always loops to itself with a timeout of 0, but I'm not sure that's safe (i.e. that it doesn't exhaust the stack or accumulate unread messages in the mailbox).
I could make my process into a special process and write all that handling myself, but that effectively makes me reimplement the Erlang equivalent of the wheel.
So is there a behavior for code like this?
loop(State) ->
    NewState = do_stuff(State), % without waiting to be called
    loop(NewState).
And if not, is there a safe way to trick the default behaviours into doing this without exhausting the stack or accumulating messages over time?
The standard way of doing that in Erlang is by using erlang:send_after/3. See this SO answer and also this example implementation.
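For completeness, a minimal sketch of that pattern (the module name, interval and do_stuff/1 are placeholders for your own code):

-module(poller).
-behaviour(gen_server).
-export([start_link/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2, code_change/3]).

-define(INTERVAL, 1000). % poll once a second

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

init([]) ->
    erlang:send_after(?INTERVAL, self(), poll), % schedule the first poll
    {ok, initial_state}.

handle_info(poll, State) ->
    NewState = do_stuff(State),                 % the actual polling work
    erlang:send_after(?INTERVAL, self(), poll), % schedule the next poll
    {noreply, NewState};
handle_info(_Other, State) ->
    {noreply, State}.

handle_call(_Req, _From, State) -> {reply, ok, State}.
handle_cast(_Msg, State) -> {noreply, State}.
terminate(_Reason, _State) -> ok.
code_change(_OldVsn, State, _Extra) -> {ok, State}.

do_stuff(State) ->
    State. % placeholder: poll your remote resource here

Unlike a zero timeout, send_after keeps the process idle between polls, so the mailbox stays empty and ordinary calls are served normally.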
Is it possible that you could employ an essentially non-OTP-compliant process? Although to be a good OTP citizen you do ideally want to make your long-running processes gen_servers and gen_fsms, sometimes you have to look beyond the standard-issue rule book and consider why the rules exist.
What if, for example, your supervisor starts your gen_server, and your gen_server spawns another process (let's call it the active_poll process), with the two linked so that they share fate (if one dies, the other dies)? The active_poll process is now indirectly supervised by the supervisor that spawned the gen_server: if it dies, so will the gen_server, and both will be restarted. The only problem you really have to solve now is code upgrade, but this is not too difficult: your gen_server gets a code_change callback when the code is to be upgraded, and it can simply send a message to the active_poll process, which then makes a fully qualified function call, and bingo, it's running the new code (sketched below).
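A sketch of the relevant callbacks (active_poll/1 and do_stuff/1 are illustrative names; active_poll/1 must be exported so the fully qualified call picks up the new code):

init([]) ->
    Poller = spawn_link(fun() -> active_poll(initial_state) end),
    {ok, Poller}.

active_poll(State) ->
    NewState = do_stuff(State),                  % the actual polling work
    receive
        upgrade -> ?MODULE:active_poll(NewState) % fully qualified: new code
    after 0 ->
        active_poll(NewState)                    % local call: keep looping
    end.

code_change(_OldVsn, Poller, _Extra) ->
    Poller ! upgrade, % nudge the poller into the new module version
    {ok, Poller}.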
If this doesn't suit you for some reason and/or you MUST use gen_server/gen_fsm/similar directly...
I'm not sure that writing a 'special process' really gives you very much. Even if you wrote a special process correctly, such that it is in theory compliant with the OTP design principles, it could still be ineffective in practice if it blocks or busy-waits in a loop somewhere and doesn't invoke sys when it should. So you really have at most a small optimisation over using gen_server/gen_fsm with a zero timeout (or with an async message handler that does the polling and sends a message to self to trigger the next poll).
If whatever you are doing to actively poll can block (a blocking socket read, for example), you are in real trouble, as a gen_server, gen_fsm or special process will all be prevented from fulfilling their usual obligations (which they would normally meet either because the callback returns, in the case of gen_server/gen_fsm, or because receive is called and the sys module invoked explicitly, in the case of a special process).
If what you are doing to actively poll is non-blocking, though, you can do it. But if you poll without any delay, it effectively becomes a busy wait. (It's not quite one, because the loop will include a receive call somewhere, which means the process will yield, giving the scheduler a voluntary opportunity to run other processes, but it's not far off, and it will still be a relative CPU hog.) If you can have a 1 ms delay between each poll, that makes a world of difference versus polling as rapidly as you can. It's not ideal, but if you MUST, it'll work. So use a timeout (as big as you can without it becoming a problem), or have an async message handler that does the polling and sends a message to self to trigger the next poll; the timeout variant is sketched below.
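A sketch of the timeout variant (assuming do_stuff/1 never blocks; note that any other incoming message cancels a pending gen_server timeout, so every callback return value must re-arm it):

init([]) ->
    {ok, initial_state, 1}. % a 1 ms inactivity timeout arms the first poll

handle_info(timeout, State) ->
    NewState = do_stuff(State), % one non-blocking poll
    {noreply, NewState, 1}.     % re-arm the 1 ms timeout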
Most server frameworks/examples using sockets and I/O completion ports deliver notifications in a way whose purpose I couldn't completely figure out.
Upon a read, packets are processed; usually they are reordered to work around thread-scheduling issues that would otherwise process packets out of order, even though IOCP itself ensures a FIFO queue.
The problem arises when a socket is closed, gracefully or because of an error. In both situations I have seen (again due to the OS thread scheduler) that the close notification may be delivered to the application (e.g. an HTTP server using the framework) before the notification for data previously read.
I think the close notification should be queued in such a way that the application receives it after the previous reads.
Is there an intended use behind this in most of the code I've seen, or could the behaviour I'm suggesting be correct depending on the situation?
What you suggest makes sense, and I would imagine that any code that handles graceful close (a read returning 0 bytes) would do so by processing it after any preceding successful read. Errors coming out of GetQueuedCompletionStatus(), such as connection-reset errors, are harder to integrate into the receive flow, as they occur out of band as far as the receive data is concerned. Your question's a bit vague and depends very much on the code you're using and how you (or the people who wrote that code) want to handle these things. There is no single correct way, IMHO; a sketch of where each case surfaces is below.
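As a rough illustration only (one worker-loop iteration; how you map 'ov' back to your per-socket state is up to your framework):

#include <winsock2.h>
#include <windows.h>

void worker_iteration(HANDLE iocp) {
    DWORD bytes = 0;
    ULONG_PTR key = 0;
    OVERLAPPED *ov = NULL;

    BOOL ok = GetQueuedCompletionStatus(iocp, &bytes, &key, &ov, INFINITE);
    if (!ok && ov == NULL) {
        /* the dequeue itself failed (e.g. the port was closed) */
    } else if (!ok) {
        /* a failed I/O operation (e.g. connection reset): delivered as a
           completion but out of band relative to buffered read data */
    } else if (bytes == 0) {
        /* graceful close: queued after the reads that preceded it, so
           handle it after the data already dequeued */
    } else {
        /* normal read completion: 'bytes' bytes are ready in the buffer
           associated with 'ov' */
    }
}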
I am trying to use LuaSocket to connect to an IRC channel and send and receive messages within my game (Wolfenstein: Enemy Territory, if that helps).
Right now I am able to do all of that, with one problem: once I set it to listen for a message, it basically locks up. I have a fallback command, so if I type stoplisten in IRC it just stops the script, and I can see it got all the messages, but the game itself is locked up while waiting for them.
Any ideas on how I would do this without freezing the game? I have just recently learned a little about coroutines, so I do not know if I am using them correctly.
I should also note I have access to a run-frame function which runs every millisecond, if that helps (though normally it is throttled like this: if math.mod(currentTime, 50) ~= 0 then return end).
Here is the part in my code: http://pastebin.com/j1gCqm4R
(I wasn't going to re-indent all my code just to post it here, so I just put it on pastebin.)
Your problem is that all sockets are blocking by default, which means they will halt ('block') the current thread of execution (in this case, your game) until they either get the desired result or time out.
The solution is non-blocking sockets. Invoke :settimeout(0) on your client socket object, and all future :send(...) and :receive(...) calls will return immediately, having either succeeded or timed out.
The LuaSocket reference contains the full details, but you will have to modify your code either to handle the 'timeout' failure state or to add calls to socket.select() to make sure that you only use sockets that are 'ready' to be used. A sketch follows.
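A hedged sketch of the timeout-handling approach (client is your connected LuaSocket TCP object; handleIrcLine is a stand-in for your own message handler); call pollIrc from the run-frame function you mentioned:

client:settimeout(0) -- make every send/receive return immediately

function pollIrc()
    while true do
        local line, err = client:receive("*l")
        if line then
            handleIrcLine(line) -- process one complete IRC line
        elseif err == "timeout" then
            break -- no more data this frame; try again next frame
        else
            print("socket error: " .. err) -- e.g. "closed"
            break
        end
    end
end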
I am currently working on a hobby OS, specifically the ATA driver. I am having some issues with PIO data-in commands with interrupts. I am trying to execute the READ MULTIPLE command to read multiple sectors from the drive, block by block, with an interrupt firing for each block.
If I request a read of 4 blocks (1 sector per block), I expect to get 4 interrupts, one for each data block. Upon receiving the 4th interrupt I can identify that I've transferred all the data and update my request structure accordingly. However, in VirtualBox I've found that after the last data block has been transferred I receive yet another interrupt (STATUS = 0x50: READY, OVERLAPPED MODE SERVICE REQ). I can simply read the STATUS register to clear it, but I don't think I should ever receive this 5th interrupt according to the specs.
So what is the proper way to acknowledge an interrupt issued by an ATA device?
In this example I issue a READ MULTIPLE command, and then my ISR does the following:
Disable CPU interrupts, set nIEN
Read a single data block (not sector!) from the DATA register
If all data has been read, read the STATUS register to clear the 'extra' interrupt
Exit by clearing nIEN and sending EOIs to both the master and slave PICs
The ATA specs for the PIO data-in command protocol don't indicate that you need to read the Status register, so I assumed that when I receive an interrupt, all I have to do is follow the protocol and finish by sending the EOIs to the PICs. As for setting/clearing nIEN: in dealing with VirtualBox, I've found that if I don't do this, I don't receive any interrupts past the first one. So I set nIEN when entering the ISR, then clear it before I leave. I'd have thought that wouldn't have any effect, but it must be related to reading/writing that specific register.
This always happens to me: I post a question I've been struggling with, only to find the answer myself shortly after.
The ATA-6 spec I've been referencing has this one line in the PIO data-in section (9.5):
When in this state, the host shall read the device Status register.
With ATA, reading the Status register has a side effect: it clears a pending interrupt. I knew this, but I hadn't read this part carefully before. The spec doesn't mention why you should read the register; it just states it exactly as above.
The important part is how this works with the interrupt handler. After issuing a PIO data-in command, once INTRQ is asserted, you simply read the Status register once to clear the interrupt, then continue to process the interrupt and return as normal (just sending EOIs to the PICs). What had me confused is that none of the documentation I read mentioned exactly how this should work with interrupts (receive an INTRQ, read Status, process the interrupt); most online guides only deal with polled I/O.
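For illustration, a hedged C sketch of that flow, assuming x86 port I/O, the conventional primary-bus Status port at 0x1F7, and IRQ14 routed through the slave 8259 PIC (your kernel's port helpers and constants will differ):

#include <stdint.h>

static inline uint8_t inb(uint16_t port) {
    uint8_t v;
    __asm__ volatile ("inb %1, %0" : "=a"(v) : "Nd"(port));
    return v;
}

static inline void outb(uint16_t port, uint8_t v) {
    __asm__ volatile ("outb %0, %1" : : "a"(v), "Nd"(port));
}

#define ATA_PRIMARY_STATUS 0x1F7 /* primary-bus Status register */
#define PIC1_CMD           0x20
#define PIC2_CMD           0xA0
#define PIC_EOI            0x20

void ata_irq_handler(void) {
    /* Reading Status acknowledges the interrupt on the device side. */
    uint8_t status = inb(ATA_PRIMARY_STATUS);
    (void)status;

    /* ... transfer the next data block from the Data register here ... */

    /* Acknowledge on the controller side: EOI to both PICs, since
       IRQ14 arrives via the slave. */
    outb(PIC2_CMD, PIC_EOI);
    outb(PIC1_CMD, PIC_EOI);
}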
This is one of the difficulties with low-level programming: key details, such as needing to read the Status register in the ISR, are often glossed over. This one was left as a single line in the protocol description. Call me picky, but I would have expected more emphasis on this point.
My application is basically a content-based router which routes MMS events.
The logger I am using is the one that comes with the OTP framework in SASL mode, error_logger.
The issue is:
I am using a client (written in Java) to generate MMS events with default values. This client can send a high load of events in multiple threads.
I am sending 100 events in 10 threads (each thread sending 10 MMS events) to my router written in Erlang/OTP.
The problem is that when such a high load is received by my router, my logger hangs, i.e. it stops updating my log file. The router, however, is still able to route the events.
The possible explanations I have come up with are:
A scheduling problem in Erlang when such a high load of events is received (a separate process is spawned for each event).
A very unlikely deadlock state.
The fact that I am sending events in multiple threads rather than sequentially. But I guess a router will be connected to multiple service-provider boxes, so sending events in threads seemed realistic.
Can anybody help me in demystifying the problem?
You already have a good answer, but I'll add to the discussion.
The error_logger by default uses cached write operations to disk. So one possibility is that you don't really notice this under low load, but under high load your writes get stuck in the cache for a while.
On a side note: there should be no problem having multiple threads doing calls to Erlang.
Another way of testing this is to add your own report handler to error_logger and see what happens, for instance printing to the shell or something else that is "fast"; a minimal handler is sketched below.
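A minimal sketch of such a handler (the module name and output format are illustrative); attach it with error_logger:add_report_handler(shell_logger):

-module(shell_logger).
-behaviour(gen_event).
-export([init/1, handle_event/2, handle_call/2, handle_info/2,
         terminate/2, code_change/3]).

init(_Args) -> {ok, []}.

handle_event(Event, State) ->
    io:format("error_logger event: ~p~n", [Event]), % fast shell output
    {ok, State}.

handle_call(_Req, State) -> {ok, ok, State}.
handle_info(_Info, State) -> {ok, State}.
terminate(_Arg, _State) -> ok.
code_change(_OldVsn, State, _Extra) -> {ok, State}.

If events keep appearing in the shell while the log file stops updating, the bottleneck is the file-writing handler rather than error_logger itself.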
Which version of Erlang are you using? Prior to R14A (R13B4 maybe?), there was a performance penalty when you invoked a selective receive when the message queue contained a lot of messages. This behaviour meant that in a process that receives lots of messages (error_logger being the canonical example), if it was barely keeping up with the load then a small spike in load could cause the cost of processing to spike up and stay there as the new processing cost was higher than the process could bear. This problem has been solved in R14A.
Secondly, why are you sending a high volume of events/calls/logs to a text logger? Formatting strings for output to a human-readable log file is a lot more expensive than using a binary disk_log, for instance. Reducing the cost of logging will help, but reducing the volume of logs will help even more. Maybe investigate exactly why you need to log these things and see if you can't record them another (less expensive) way; a disk_log sketch follows.
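For instance, logging the raw event terms with disk_log might look like this (the log name, file path and size limits are examples; EventData stands for whatever you currently format into the text log):

{ok, Log} = disk_log:open([{name, mms_events},
                           {file, "/var/log/mms_events.LOG"},
                           {format, internal},          % store raw terms
                           {type, wrap},                % rotate automatically
                           {size, {10*1024*1024, 8}}]), % 8 files x 10 MB
ok = disk_log:log(Log, {mms_event, os:timestamp(), EventData}),
%% read the terms back later with disk_log:chunk/2 instead of parsing text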
Problems with error_logger are often symptoms of some other overload problem. Try looking at the message-queue sizes for all your processes when this problem occurs and see if something else is backed up too. The following Erlang shell code might help:
%% processes that exit between the two calls return undefined from
%% process_info/2 and are skipped by the generator's pattern match
[{P, Len} || P <- erlang:processes(),
             {message_queue_len, Len} <- [process_info(P, message_queue_len)]].