I have set up a dev network consisting of 4 VPs using PBFT consensus.
I am trying to test the behaviour of the VPs when one of them is down.
Step one
While the 4 VPs are running, I deployed a chaincode (chaincode_example02).
Checking localhost:7050/chain -> return 2
Step two
I shut down one of the VPs using docker stop containerID.
Now when I execute an invoke transaction and recheck the chain length:
localhost:7050/chain -> it still returns 2.
Step three
I restart the VP (from step 2), and then I see that the invoke transaction (from step 2) is executed automatically and the chain size is now 3.
localhost:7050/chain -> now returns 3.
My understanding is that with 4 VPs using PBFT consensus, we have tolerance for 1 faulty VP. If that is the case, then the invoke transaction should have been executed in step 2.
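The arithmetic behind that expectation can be sketched in a few lines of Python. This is the generic PBFT calculation (n >= 3f + 1 replicas, quorum of 2f + 1 matching messages), not Fabric's actual implementation, and the function names are made up for illustration.

```python
def max_faulty(n: int) -> int:
    """Largest f that satisfies n >= 3f + 1."""
    return (n - 1) // 3


def quorum(n: int) -> int:
    """Matching messages needed to make progress: 2f + 1."""
    return 2 * max_faulty(n) + 1


def can_make_progress(n: int, live: int) -> bool:
    """True if the live replicas alone can still form a quorum."""
    return live >= quorum(n)


if __name__ == "__main__":
    n = 4
    print("tolerated faults:", max_faulty(n))
    print("quorum size:", quorum(n))
    print("progress with 3 live VPs:", can_make_progress(n, 3))
```

By this calculation, 3 live VPs out of 4 still form a quorum, which is why I expected the invoke to commit in step 2.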
Can someone please advise if that is the expected result and why?
Thanks in advance
If I have an Orchestrator that calls multiple sub-orchestrators, can I safely use a single Durable Entity to share common data across the primary and sub-orchestrators without violating Durable Function determinism rules? I think this is legit, but want to make sure I'm not missing something. Thoughts? Thanks.
Yes, durable entities are safe to use across multiple orchestrators with the OrchestrationTrigger binding. With this binding, it will only read the entity one time and then put it in a table to be used on subsequent runs so it is deterministic. It will also guarantee that all operations are processed in order across multiple orchestrators because they operate on queues and only process one operation at a time.
But as with any distributed system working on the same data, it is prone to race conditions between asynchronous operations. This must be considered when developing.
ex. a counter with an initial value of 5
Orch1 -> Get -> returns 5, committed value is now 5
Orch2 -> Get -> returns 5, committed value is now 5
Orch1 -> Set 5 + 1 -> committed value is now 6
Orch2 -> Set 5 + 1 -> committed value is now 6
Increment instead, or get and increment in one operation:
Orch1 -> GetAndIncrement 1 -> returns 5, committed value is now 6
Orch2 -> GetAndIncrement 1 -> returns 6, committed value is now 7
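The two traces above can be reproduced with a small, framework-free Python sketch. CounterEntity is a toy stand-in that handles one operation at a time, not the Durable Functions API, and all names here are hypothetical.

```python
class CounterEntity:
    """Toy entity: every method call is one serialized operation."""

    def __init__(self, value: int) -> None:
        self.value = value

    def get(self) -> int:
        return self.value

    def set(self, value: int) -> None:
        self.value = value

    def get_and_increment(self, amount: int) -> int:
        # Read and update happen inside a single operation.
        previous = self.value
        self.value += amount
        return previous


def interleaved_get_then_set(start: int) -> int:
    """Both orchestrators read before either writes: one update is lost."""
    counter = CounterEntity(start)
    seen_by_orch1 = counter.get()
    seen_by_orch2 = counter.get()
    counter.set(seen_by_orch1 + 1)
    counter.set(seen_by_orch2 + 1)
    return counter.value


def single_operation_increment(start: int) -> tuple:
    """Each orchestrator does its read and update as one operation."""
    counter = CounterEntity(start)
    first = counter.get_and_increment(1)
    second = counter.get_and_increment(1)
    return first, second, counter.value


if __name__ == "__main__":
    print(interleaved_get_then_set(5))
    print(single_operation_increment(5))
```

Starting from 5, the get-then-set interleaving ends at 6, while the single-operation version ends at 7.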
Note: If this entity is also accessed by a normal function with ReadEntityStateAsync, that read returns the current committed state and is not sequenced with the queued operations. This is because it reads the state directly from the storage table instead of going through the queue.
ex. value is 5
Orch1 -> GetAndIncrement 1 -> returns 5, committed value is now 6
Func1 -> ReadState -> depending on how close to the last operation the read is made, it may return 5 or 6.
I'm trying to set up a dataflow streaming pipeline in python. I have quite some experience with batch pipelines. Our basic architecture looks like this:
The first step is doing some basic processing and takes about 2 seconds per message to get to the windowing. We are using sliding windows of 3 seconds and 3 second interval (might change later so we have overlapping windows). As last step we have the SOG prediction that takes about 15ish seconds to process and which is clearly our bottleneck transform.
The issue we seem to face is that the workload is perfectly distributed over our workers before the windowing, but the most important transform is not distributed at all. All the windows are processed one at a time, seemingly on 1 worker, while we have 50 available.
The logs show that the SOG prediction step produces an output once every 15ish seconds, which should not be the case if the windows were processed across more workers. This builds up huge latency over time, which we don't want. With 1 minute of messages, we have a latency of 5 minutes for the last window. If distribution worked, this should only be around 15 seconds (the SOG prediction time). So at this point we are clueless.
Does anyone see if there is something wrong with our code or how to prevent/circumvent this?
It seems like this is something happening in the internals of google cloud dataflow. Does this also occur in java streaming pipelines?
In batch mode, everything works fine. There, one could try a reshuffle to make sure no fusion etc. occurs. But that is not possible after windowing in streaming.
args = parse_arguments(sys.argv if argv is None else argv)
pipeline_options = get_pipeline_options(project=args.project_id,
                                        job_name='XX',
                                        num_workers=args.workers,
                                        max_num_workers=MAX_NUM_WORKERS,
                                        disk_size_gb=DISK_SIZE_GB,
                                        local=args.local,
                                        streaming=args.streaming)
pipeline = beam.Pipeline(options=pipeline_options)
# Build pipeline
# pylint: disable=C0330
if args.streaming:
    frames = (pipeline | 'ReadFromPubsub' >> beam.io.ReadFromPubSub(
        subscription=SUBSCRIPTION_PATH,
        with_attributes=True,
        timestamp_attribute='timestamp'
    ))
frame_tpl = frames | 'CreateFrameTuples' >> beam.Map(
    create_frame_tuples_fn)
crops = frame_tpl | 'MakeCrops' >> beam.Map(make_crops_fn, NR_CROPS)
bboxs = crops | 'bounding boxes tfserv' >> beam.Map(
    pred_bbox_tfserv_fn, SERVER_URL)
sliding_windows = bboxs | 'Window' >> beam.WindowInto(
    beam.window.SlidingWindows(
        FEATURE_WINDOWS['goal']['window_size'],
        FEATURE_WINDOWS['goal']['window_interval']),
    trigger=AfterCount(30),
    accumulation_mode=AccumulationMode.DISCARDING)
# GROUPBYKEY (per match)
group_per_match = sliding_windows | 'Group' >> beam.GroupByKey()
_ = group_per_match | 'LogPerMatch' >> beam.Map(lambda x: logging.info(
    "window per match per timewindow: # %s, %s", str(len(x[1])), x[1][0][
        'timestamp']))
sog = sliding_windows | 'Predict SOG' >> beam.Map(predict_sog_fn,
                                                  SERVER_URL_INCEPTION,
                                                  SERVER_URL_SOG)
pipeline.run().wait_until_finish()
In Beam the unit of parallelism is the key: all the windows for a given key will be produced on the same machine. However, if you have 50+ keys they should get distributed among all workers.
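To illustrate why the number of distinct keys matters, here is a rough pure-Python sketch. The hash-modulo assignment is only a stand-in I am assuming for illustration (Dataflow's real work distribution is more involved), and the key names are invented.

```python
import zlib
from collections import Counter


def assign_worker(key: str, num_workers: int) -> int:
    """Stand-in for key-based work assignment: stable hash modulo workers."""
    return zlib.crc32(key.encode("utf-8")) % num_workers


def workers_used(keys, num_workers: int) -> Counter:
    """Count how many elements land on each worker."""
    return Counter(assign_worker(key, num_workers) for key in keys)


if __name__ == "__main__":
    # 100 windows that all share one key end up on a single worker.
    single_key = workers_used(["match-1"] * 100, 50)
    # 100 windows with distinct keys spread over many workers.
    many_keys = workers_used(["match-%d" % i for i in range(100)], 50)
    print("workers used with one key:", len(single_key))
    print("workers used with 100 keys:", len(many_keys))
```

If every element in your pipeline carries the same match key, this is the first case: the grouped step has nothing to parallelize over.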
You mentioned that you were unable to add a Reshuffle in streaming. This should be possible; if you're getting errors please file a bug at https://issues.apache.org/jira/projects/BEAM/issues . Does re-windowing into GlobalWindows make the issue with reshuffling go away?
It looks like you do not necessarily need GroupByKey, because you are always grouping on the same key. You could perhaps use CombineGlobally to append all the elements inside the window instead of the GroupByKey (with always the same key).
combined = values | beam.CombineGlobally(append_fn).without_defaults()
combined | beam.ParDo(PostProcessFn())
I am not sure how the load distribution works when using CombineGlobally but since it does not process key,value pairs I would expect another mechanism to do the load distribution.
I wanted to send a message to a process after a delay, and discovered erlang:send_after/4.
When looking at the docs it looked like this is exactly what I wanted:
erlang:send_after(Time, Dest, Msg, Options) -> TimerRef
Starts a timer. When the timer expires, the message Msg is sent to the
process identified by Dest.
However, it doesn't seem to work when the destination is running on another node: it tells me one of the arguments is bad.
1> P = spawn('node@host', module, function, [Arg]).
<10585.83.0>
2> erlang:send_after(1000, P, {123}).
** exception error: bad argument
in function erlang:send_after/3
called as erlang:send_after(1000,<10585.83.0>,{123})
Doing the same thing with timer:send_after/3 appears to work fine:
1> P = spawn('node@host', module, function, [Arg]).
<10101.10.0>
2> timer:send_after(1000, P, {123}).
{ok,{-576458842589535,#Ref<0.1843049418.1937244161.31646>}}
And, the docs for timer:send_after/3 state almost the same thing as the erlang version:
send_after(Time, Pid, Message) -> {ok, TRef} | {error, Reason}
Evaluates Pid ! Message after Time milliseconds.
So the question is, why do these two functions, which on the face of it do the same thing, behave differently? Is erlang:send_after broken, or mis-advertised? Or maybe timer:send_after isn't doing what I think it is?
TL;DR
Your assumption is correct: these are intended to do the same thing, but are implemented differently.
Discussion
Things in the timer module such as timer:send_after/2,3 work through the gen_server that defines that as a service. Like any other service, this one can get overloaded if you assign a really huge number of tasks (timers to track) to it.
erlang:send_after/3,4, on the other hand, is a BIF implemented directly within the runtime and therefore has access to system primitives like the hardware timer. If you have a ton of timers this is definitely the way to go. In most programs you won't notice the difference, though.
There is actually a note about this in the Erlang Efficiency Guide:
3.1 Timer Module
Creating timers using erlang:send_after/3 and erlang:start_timer/3 , is much more efficient than using the timers provided by the timer module in STDLIB. The timer module uses a separate process to manage the timers. That process can easily become overloaded if many processes create and cancel timers frequently (especially when using the SMP emulator).
The functions in the timer module that do not manage timers (such as timer:tc/3 or timer:sleep/1), do not call the timer-server process and are therefore harmless.
A workaround
A workaround to gain the efficiency of the BIF without the same-node restriction is to have a process of your own that does nothing but wait for a message to forward to another node:
-module(foo_forward).
-export([start/0, send_after/3, cancel/1]).

% Obviously this is an example only. You would want to write this to
% be compliant with proc_lib, write a proper init/N and integrate with
% OTP. Note that this snippet is missing the OTP service functions
% (system_continue/3, system_terminate/4 and so on).

% Start the forwarder on the local node and register it under the module name.
start() ->
    Parent = self(),
    Pid = spawn(fun() -> loop(Parent, [], none) end),
    true = register(?MODULE, Pid),
    Pid.

% The BIF timer targets the local forwarder, which relays to Dest on expiry.
send_after(Time, Dest, Message) ->
    erlang:send_after(Time, ?MODULE, {forward, Dest, Message}).

cancel(TimerRef) ->
    erlang:cancel_timer(TimerRef).

loop(Parent, Debug, State) ->
    receive
        {forward, Dest, Message} ->
            Dest ! Message,
            loop(Parent, Debug, State);
        {system, From, Request} ->
            sys:handle_system_msg(Request, From, Parent, ?MODULE, Debug, State);
        Unexpected ->
            ok = error_logger:warning_msg("Received message: ~tp~n", [Unexpected]),
            loop(Parent, Debug, State)
    end.
The above example is a bit shallow, but hopefully it expresses the point. It should be possible to get the efficiency of the BIF erlang:send_after/3,4 and still send messages across nodes, while keeping the freedom to cancel a message using erlang:cancel_timer/1.
But why?
The puzzle (and bug) is why erlang:send_after/3,4 does not want to work across nodes. The example you provided above looks a bit odd as the first assignment to P was the Pid <10101.10.0>, but the crashed call was reported as <10585.83.0> -- clearly not the same.
For the moment I do not know why erlang:send_after/3,4 doesn't work, but I can say with confidence that the mechanism of operation between the two is not the same. I'll look into it, but I imagine that the BIF version is actually doing some funny business within the runtime to gain efficiency and as a result signalling the target process by directly updating its mailbox instead of actually sending an Erlang message on the higher Erlang-to-Erlang level.
Maybe it is good that we have both, but this should definitely be clearly marked in the docs, and it evidently is not (I just checked).
There is some difference in timeout order if you have many timers.
The example below shows erlang:send_after does not guarantee order, but
timer:send_after does.
1> A = lists:seq(1,10).
[1,2,3,4,5,6,7,8,9,10]
2> [erlang:send_after(100, self(), X) || X <- A].
...
3> flush().
Shell got 2
Shell got 3
Shell got 4
Shell got 5
Shell got 6
Shell got 7
Shell got 8
Shell got 9
Shell got 10
Shell got 1
ok
4> [timer:send_after(100, self(), X) || X <- A].
...
5> flush().
Shell got 1
Shell got 2
Shell got 3
Shell got 4
Shell got 5
Shell got 6
Shell got 7
Shell got 8
Shell got 9
Shell got 10
ok
I have a question concerning net use. Is there a way to delete multiple mapped drives connected to a single host? For example, I want to delete drives X and Y in one go using only one command. Is there a way to delete both X and Y, since they share the same "hostname"? If I am at all unclear, please let me know. The following is an instance of what I am describing.
Status       Local     Remote                    Network
OK           X:        \\hostname\dir1           Microsoft Windows Network
OK           Y:        \\hostname\dir2           Microsoft Windows Network
OK           Z:        \\hostname2\dir1          Microsoft Windows Network
The command completed successfully.
I know the net use /delete * command will delete all, but I want to save drive Z. Again, using only one command. Any ideas?
Assuming that you want to use a command such as
unmap x y
then unmap.bat could be
@echo off
setlocal
:loop
set target=%1
if defined target net use %1: /delete &shift&goto loop
This should do the task. %1 is the first parameter; shift moves the parameters down so %2 becomes %1, etc. The loop ends once no parameters are left and target is no longer defined.
I am interested in benchmarking different parts of my program for speed. I have tried using info(statistics) and erlang:now().
I need to know down to the microsecond what the average speed is. I don't know why I am having trouble with a script I wrote.
It should be able to start anywhere and end anywhere. I ran into a problem when I tried starting it on a process that may be running up to four times in parallel.
Is there anyone who already has a solution to this issue?
EDIT:
Willing to give a bounty if someone can provide a script to do it. It needs to span multiple processes, though. I cannot accept a function like timer, at least in the implementations I have seen. It only traverses one process, and even then some major editing is necessary for a full test of a full program. Hope I made it clear enough.
Here's how to use eprof, likely the easiest solution for you:
First you need to start it, like most applications out there:
23> eprof:start().
{ok,<0.95.0>}
Eprof supports two profiling modes. You can call it and ask it to profile a certain function, but we can't use that because other processes will mess everything up. We need to manually start it profiling and tell it when to stop (this is why you won't have an easy script, by the way).
24> eprof:start_profiling([self()]).
profiling
This tells eprof to profile everything that will be run and spawned from the shell. New processes will be included here. I will run some arbitrary multiprocessing function I have, which spawns about 4 processes communicating with each other for a few seconds:
25> trade_calls:main_ab().
Spawned Carl: <0.99.0>
Spawned Jim: <0.101.0>
<0.100.0>
Jim: asking user <0.99.0> for a trade
Carl: <0.101.0> asked for a trade negotiation
Carl: accepting negotiation
Jim: starting negotiation
... <snip> ...
We can now tell eprof to stop profiling once the function is done running.
26> eprof:stop_profiling().
profiling_stopped
And we want the logs. Eprof will print them to screen by default. You can ask it to also log to a file with eprof:log(File). Then you can tell it to analyze the results. We tell it to collapse the run time from all processes into a single table with the option total (see the manual for more options):
27> eprof:analyze(total).
FUNCTION                    CALLS      %  TIME  [uS / CALLS]
--------                    -----    ---  ----  [----------]
io:o_request/3                 46   0.00     0  [      0.00]
io:columns/0                    2   0.00     0  [      0.00]
io:columns/1                    2   0.00     0  [      0.00]
io:format/1                     4   0.00     0  [      0.00]
io:format/2                    46   0.00     0  [      0.00]
io:request/2                   48   0.00     0  [      0.00]
...
erlang:atom_to_list/1           5   0.00     0  [      0.00]
io:format/3                    46  16.67  1000  [     21.74]
erl_eval:bindings/1             4  16.67  1000  [    250.00]
dict:store_bkt_val/3          400  16.67  1000  [      2.50]
dict:store/3                  114  50.00  3000  [     26.32]
And you can see that most of the time (50%) is spent in dict:store/3. 16.67% is taken in outputting the result, and another 16.67% is taken by erl_eval (this is what you get by running short functions in the shell -- parsing them takes longer than running them).
You can then start going from there. That's the basics of profiling run times with Erlang. Handle with care, eprof can be quite a load on a production system or for functions that run for too long. Especially on a production system.
You can use eprof or fprof.
The normal way to do this is with timer:tc. Here is a good explanation.
I can recommend you this tool: https://github.com/virtan/eep
You will get something like this https://raw.github.com/virtan/eep/master/doc/sshot1.png as a result.
Step-by-step instructions for profiling all processes on a running system:
On target system:
1> eep:start_file_tracing("file_name"), timer:sleep(20000), eep:stop_tracing().
$ scp -C $PWD/file_name.trace desktop:
On desktop:
1> eep:convert_tracing("file_name").
$ kcachegrind callgrind.out.file_name