Elixir mix app: cannot understand application.ex clearly - Erlang

I'm trying to work with Elixir, and it's a bit hard to understand what application.ex does.
defmodule PluralsightTweet.Application do
  # See http://elixir-lang.org/docs/stable/elixir/Application.html
  # for more information on OTP Applications
  @moduledoc false

  use Application

  def start(_type, _args) do
    import Supervisor.Spec, warn: false

    # Define workers and child supervisors to be supervised
    children = [
      # Starts a worker by calling: PluralsightTweet.Worker.start_link(arg1, arg2, arg3)
      worker(PluralsightTweet.TweetServer, [])
    ]

    # See http://elixir-lang.org/docs/stable/elixir/Supervisor.html
    # for other strategies and supported options
    opts = [strategy: :one_for_one, name: PluralsightTweet.Supervisor]
    process = Supervisor.start_link(children, opts)

    PluralsightTweet.Scheduler.schedule_file(
      "* * * * *",
      Path.join("#{:code.priv_dir(:pluralsight_tweet)}", "sample.txt")
    )

    process
  end
end
I'm following the Pluralsight Elixir tutorial. This is a scheduler that reads text from a file and tweets it every minute. The task succeeds, but I don't have a crystal-clear idea of the process. Can someone please explain what happens inside application.ex when it runs as a supervisor app?

use Application
This line means the current module is the entry point of an application. Such a module can be configured in mix.exs to be started as a unit.
# Inside mix.exs
def application do
  [
    extra_applications: [:logger],
    mod: {PluralsightTweet.Application, []} # <-- this line
  ]
end
The start function
This function is the callback invoked when the application starts up. You can think of it as the main function in some other languages.
import Supervisor.Spec, warn: false
It just lets you omit the module name when you call worker, supervisor and supervise. The warn: false part suppresses the compiler warning that would otherwise appear if you don't call any of those functions.
children = [worker(PluralsightTweet.TweetServer, [])]
This line specifies the child processes that your application supervises. Note that at this point, the child processes are not spawned yet.
The worker(mod, args) just defines a worker spec that will be started later. The args will be passed to the mod's start_link function when starting the worker.
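Under the hood, worker/2 just builds a plain tuple describing the child. A quick illustration (the tuple shape below is what the old Supervisor.Spec API produces, as far as I recall; check it in iex if in doubt):

import Supervisor.Spec
worker(PluralsightTweet.TweetServer, [])
# => {PluralsightTweet.TweetServer,
#     {PluralsightTweet.TweetServer, :start_link, []},
#     :permanent, 5000, :worker, [PluralsightTweet.TweetServer]}

That is: a child id, the MFA used to start it, the :permanent restart policy, a 5000 ms shutdown timeout, the :worker type, and the callback modules.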
opts = [strategy: :one_for_one, name: PluralsightTweet.Supervisor]
The supervisor options.
See the strategies documentation for the meaning of strategy: :one_for_one and the other strategies.
Since you have only one worker, all strategies except :simple_one_for_one work pretty much the same.
The tricky part is name: PluralsightTweet.Supervisor. You may wonder where the module PluralsightTweet.Supervisor came from. In fact, it is NOT a module. It's only an atom, :"Elixir.PluralsightTweet.Supervisor", which serves as the registered name of the supervisor process.
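You can convince yourself of that in iex -S mix (the pid value below is illustrative):

iex> is_atom(PluralsightTweet.Supervisor)
true
iex> Process.whereis(PluralsightTweet.Supervisor)
#PID<0.123.0>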
Supervisor.start_link(children, opts)
Now the supervisor process and its child processes are spawned.
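Once start/2 has returned, you can inspect the running tree through that registered name (again in iex -S mix, with illustrative pids):

iex> Supervisor.which_children(PluralsightTweet.Supervisor)
[{PluralsightTweet.TweetServer, #PID<0.124.0>, :worker, [PluralsightTweet.TweetServer]}]
iex> Supervisor.count_children(PluralsightTweet.Supervisor)
%{active: 1, specs: 1, supervisors: 0, workers: 1}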


How to distribute supervised gen_server workers?

Hi, I want to implement distributed caches as an exercise. The cache module is based on gen_server. The caches are started by a CacheSupervisor module. At first I tried running it all on one node, which worked well. Now I am trying to distribute my caches across two nodes, which live in two open console windows on my laptop.
PS:
While writing this question I realised that I had forgotten to connect my third window to the other nodes. I fixed that, but I am still getting the original error.
After connecting my nodes, I call CacheSupervisor.start_link() in my third window, which results in the following error message.
Error:
** (EXIT from #PID<0.112.0>) shutdown: failed to start child: :de
    ** (EXIT) an exception was raised:
        ** (ArgumentError) argument error
            erlang.erl:2619: :erlang.spawn(:"node1@DELL_XPS", {:ok, #PID<0.128.0>})
            (stdlib) supervisor.erl:365: :supervisor.do_start_child/2
            (stdlib) supervisor.erl:348: :supervisor.start_children/3
            (stdlib) supervisor.erl:314: :supervisor.init_children/2
            (stdlib) gen_server.erl:328: :gen_server.init_it/6
            (stdlib) proc_lib.erl:247: :proc_lib.init_p_do_apply/3
I am guessing that the error indicates that the :gen_server.start_link(..) inside start_link(name) of my Cache module resolves to {:ok, #PID<0.128.0>}, which seems to be incorrect, but I have no idea where else to put the Node.spawn().
Module Cache:
defmodule Cache do
  use GenServer

  def handle_cast({:put, url, page}, {pages, size}) do
    new_pages = Dict.put(pages, url, page)
    new_size = size + byte_size(page)
    {:noreply, {new_pages, new_size}}
  end

  def handle_call({:get, url}, _from, {pages, size}) do
    {:reply, pages[url], {pages, size}}
  end

  def handle_call({:size}, _from, {pages, size}) do
    {:reply, size, {pages, size}}
  end

  def start_link(name) do
    IO.puts(elem(name, 0))
    Node.spawn(String.to_atom(elem(name, 0)),
      :gen_server.start_link({:local, elem(name, 1)}, __MODULE__, {HashDict.new, 0}, []))
  end

  def put(name, url, page) do
    :gen_server.cast(name, {:put, url, page})
  end

  def get(name, url) do
    :gen_server.call(name, {:get, url})
  end

  def size(name) do
    :gen_server.call(name, {:size})
  end
end
Module CacheSupervisor:
defmodule CacheSupervisor do
  use Supervisor

  def init(_args) do
    workers = Enum.map(
      [{"node1@DELL_XPS", :de}, {"node1@DELL_XPS", :edu}, {"node2@DELL_XPS", :com},
       {"node2@DELL_XPS", :it}, {"node2@DELL_XPS", :rest}],
      fn(n) -> worker(Cache, [n], id: elem(n, 1)) end)

    supervise(workers, strategy: :one_for_one)
  end

  def start_link() do
    :supervisor.start_link(__MODULE__, [])
  end
end
:erlang.spawn(:"node1@DELL_XPS", {:ok, #PID<0.128.0>})
:erlang.spawn/2 is the same function as Node.spawn/2. It expects a node name (which you have provided) and a function. Your :gen_server.start_link call returned {:ok, pid} as it should, but since a tuple can't be treated as a function, Node.spawn/2 crashes.
I would not recommend spawning processes on separate nodes like this. If the remote node goes down, not only will you lose that node in your cluster, but you will also have to deal with the fallout from all your spawned processes. This will result in an app that is more brittle than it would otherwise be. If you want to have your cache GenServers running on multiple nodes, I'd suggest running the application you are building on multiple nodes and having an instance of your CacheSupervisor on each node. Each CacheSupervisor then starts up its own GenServers underneath it. This is more robust because if a node goes down, the remaining nodes will be unaffected. Of course your application logic will need to take this into account: losing a node could mean losing cache data or client connections. See this answer for more details: How does an Erlang gen_server start_link a gen_server on another node?
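For example, a per-node layout might look roughly like this (a sketch only; it assumes Cache.start_link/1 is changed to register locally under a plain atom name, and none of this is code from the question):

defmodule CacheSupervisor do
  use Supervisor

  def start_link() do
    Supervisor.start_link(__MODULE__, [], name: __MODULE__)
  end

  def init(_args) do
    # Each node supervises only its own, locally registered caches.
    workers = for name <- [:de, :edu, :com, :it, :rest] do
      worker(Cache, [name], id: name)
    end
    supervise(workers, strategy: :one_for_one)
  end
end

# Run CacheSupervisor.start_link() on every node. Callers can then reach a
# cache on another node using the {name, node} addressing gen_server supports:
# :gen_server.call({:de, :"node1@DELL_XPS"}, {:get, url})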
If you really, really want to spawn a process on a remote node like this, you could do:
:erlang.spawn_link(:"node1@DELL_XPS", fn ->
  {:ok, _pid} = :gen_server.start_link({:local, elem(name, 1)}, __MODULE__, {HashDict.new, 0}, [])
  receive do
    # Block forever
    :exit -> :ok
  end
end)
Note that you must use spawn_link, as supervisors expect to be linked to their children. If the supervisor is not linked it will not know when the child crashes and won't be able to restart the process.

Creating a multiprocess cache in Elixir

I'm reading 7 Concurrency Models in 7 Weeks, and the author asks the reader to implement a
cache that distributes cache entries across multiple actors according
to a hash function. Create a supervisor that starts multiple cache
actors and routes incoming messages to the appropriate cache worker.
What action should this supervisor take if one of the cache workers
fails?
defmodule CacheSupervisor do
  def start(number_of_caches) do
    spawn(__MODULE__, :init, [number_of_caches])
  end

  def init(number_of_caches) do
    Process.flag(:trap_exit, true)
    pids = Enum.into(Enum.map(0..number_of_caches-1, fn(id) ->
      {id, Cache.start}
    end), %{})
    loop(pids, number_of_caches)
  end

  def loop(pids, number_of_caches) do
    receive do
      {:put, url, page} ->
        send(compute_cache_pid(pids, url, number_of_caches), {:put, url, page})
        loop(pids, number_of_caches)
      {:get, sender, ref, url} ->
        send(compute_cache_pid(pids, url, number_of_caches), {:get, sender, ref, url})
        loop(pids, number_of_caches)
      {:EXIT, pid, reason} ->
        IO.puts("Cache #{inspect pid} failed with reason #{inspect reason} - restarting it")
        loop(repair(pids, pid), number_of_caches)
    end
  end

  def repair(pids, pid) do
    # not clear how to repair, since we don't know the index of the failed process
  end

  def compute_cache_pid(pids, url, number_of_caches) do
    pids[rem(:erlang.phash2(url), number_of_caches)]
  end
end
When a process fails, the supervisor only gets its pid. Using just the pid, I can't tell which bucket needs a new process. I could probably create a second map of pid -> bucket_index, but it seems ugly.
Can I somehow attach additional attributes when I spawn a process? Can I read those attributes when process exits?
What would be the best way to solve it (without OTP)?
This implementation also makes the supervisor too "smart". As I understand it, supervisors should be simple and only care about restarting failed processes.
I'm not sure how to separate the restarting and routing logic into two separate processes: when the supervisor restarts some process, the router should be aware of it and have up-to-date pids.
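For what it's worth, the pid -> bucket_index map mentioned above is less ugly than it sounds. A minimal sketch (the function names and shapes here are my assumptions, not a definitive design; loop/2 would carry both maps and call repair/3 on :EXIT):

def init(number_of_caches) do
  Process.flag(:trap_exit, true)
  pids = Enum.into(0..(number_of_caches - 1), %{}, fn id -> {id, Cache.start} end)
  # Inverted map, so the pid in an :EXIT message can be traced back to its bucket.
  buckets = Enum.into(pids, %{}, fn {id, pid} -> {pid, id} end)
  loop(pids, buckets, number_of_caches)
end

def repair(pids, buckets, dead_pid) do
  id = buckets[dead_pid]   # recover the bucket index of the failed process
  new_pid = Cache.start    # start a replacement cache for that bucket
  {Map.put(pids, id, new_pid),
   buckets |> Map.delete(dead_pid) |> Map.put(new_pid, id)}
end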

Number of threads used by rails puma

I have a Rails application running with the Puma server. Is there any way to see how many threads the application is currently using?
I was wondering about the same thing a while ago and came upon this issue. The author included the code they ended up using to collect those stats:
module PumaThreadLogger
  def initialize *args
    ret = super *args
    Thread.new do
      while true
        # Every X seconds, write out what the state of this dyno is in a format that Librato understands.
        sleep 5
        thread_count = 0
        backlog = 0
        waiting = 0
        # I don't do the logging or string stuff inside of the mutex. I want to get out of there as fast as possible
        @mutex.synchronize {
          thread_count = @workers.size
          backlog = @todo.size
          waiting = @waiting
        }
        # For some reason, even a single Puma server (not clustered) has two booted ThreadPools.
        # One of them is empty, and the other is actually doing work
        # The check above ignores the empty one
        if (thread_count > 0)
          # It might be cool if we knew the Puma worker index for this worker, but that didn't look easy to me.
          # The good news: By using the PID we can differentiate two different workers on two different dynos with the same name
          # (which might happen if one is shutting down and the other is starting)
          source_name = "#{Process.pid}"
          # If we have a dyno name, prepend it to the source to make it easier to group in the log output
          dyno_name = ENV['DYNO']
          if (dyno_name)
            source_name = "#{dyno_name}." + source_name
          end
          msg = "source=#{source_name} "
          msg += "sample#puma.backlog=#{backlog} sample#puma.active_connections=#{thread_count - waiting} sample#puma.total_threads=#{thread_count}"
          Rails.logger.info msg
        end
      end
    end
    ret
  end
end

module Puma
  class ThreadPool
    prepend PumaThreadLogger
  end
end
This code contains logic that is specific to Heroku, but the core of collecting @workers.size and logging it will work in any environment.

Running code asynchronously inside pollers

In my Ruby script I am using the celluloid-zmq gem, where I am trying to run evaluate_response asynchronously inside pollers using:
async.evaluate_response(socket.read_multipart)
But if I remove sleep from the loop, somehow that doesn't work out: execution never reaches the evaluate_response method. If I put sleep inside the loop, it works perfectly.
require 'celluloid/zmq'

Celluloid::ZMQ.init

module Celluloid
  module ZMQ
    class Socket
      def socket
        @socket
      end
    end
  end
end

class Indefinite
  include Celluloid::ZMQ

  ## Readers
  attr_reader :dealersock, :pullsock, :pollers

  def initialize
    prepare_dealersock and prepare_pullsock and prepare_pollers
  end

  ## prepare DEALER SOCK
  def prepare_dealersock
    @dealersock = DealerSocket.new
    @dealersock.identity = "IDENTITY"
    @dealersock.connect("tcp://localhost:20482")
  end

  ## prepare PULL SOCK
  def prepare_pullsock
    @pullsock = PullSocket.new
    @pullsock.connect("tcp://localhost:20483")
  end

  ## prepare the Pollers
  def prepare_pollers
    @pollers = ZMQ::Poller.new
    @pollers.register_readable(dealersock.socket)
    @pollers.register_readable(pullsock.socket)
  end

  def run!
    loop do
      pollers.poll ## this is a blocking operation, but never mind, we need it
      pollers.readables.each do |socket|
        ## socket.read_multipart is a blocking call; this should give Celluloid the chance to run other tasks in the meantime.
        async.evaluate_response(socket.read_multipart)
      end
      ## If you remove the sleep, the async evaluate_response is never executed.
      ## sleep 0.2
    end
  end

  def evaluate_response(message)
    ## Hmmm, the code just never reaches here
    puts "got message: #{message}"
    ...
  end
end

## Code is invoked like this
Indefinite.new.run!
Any idea why this is happening?
The question was 100% changed, so my previous answer does not help.
Now, the issues are...
ZMQ::Poller is not part of Celluloid::ZMQ
You are directly using the ffi-rzmq bindings rather than the Celluloid::ZMQ wrapping, which provides evented and threaded handling of the socket(s). It would be best to make multiple actors, one per socket, or to just use Celluloid::ZMQ directly in one actor, rather than undermining it.
Your actor never gets time to work with the response
This part makes it a duplicate of:
Celluloid async inside ruby blocks does not work
The best answer is to use after or every, not loop, which is dominating your actor.
You need to either:
Move evaluate_response to another actor.
Move each socket to its own actor.
This code needs to be broken up into several actors to work properly, with a main sleep at the end of the program. But before all that, try using after or every instead of loop.

Dynamic Ruby Daemon Management

I have a Ruby process that listens on a given device. I would like to spin instances of it up and down for different devices from a Rails app. Everything I can find for Ruby daemons seems to be based around a set number of daemons running, or around background processing with message queues.
Should I just be doing this with Kernel.spawn and storing the PIDs in the database? It seems a bit hacky, but if there isn't an existing framework that lets me bring daemons up and down, it seems I may not have much choice.
Instead of spawning another script and keeping the PIDs in the database, you can do it all within the same script, using fork, and keeping the PIDs in memory. Here's a sample script: you add and delete "worker instances" by typing the commands "add" and "del" in the console, and exit with "quit":
@pids = []
@counter = 0

def add_process
  @pids.push(Process.fork {
    loop do
      puts "Hello from worker ##{@counter}"
      sleep 1
    end
  })
  @counter += 1
end

def del_process
  return false if @pids.empty?
  pid = @pids.pop
  Process.kill('SIGTERM', pid)
  true
end

def kill_all
  while del_process
  end
end

while cmd = gets.chomp
  case cmd.downcase
  when 'quit'
    kill_all
    exit
  when 'add'
    add_process
  when 'del'
    del_process
  end
end
Of course, this is just an example; for sending commands and/or monitoring instances, you can replace this simple gets loop with a small Sinatra app, a socket interface, named pipes, etc.
