How can I track the bandwidth of transfers within a Dask cluster?

I would like to perform some networking benchmarks on my Dask cluster. What is the best way to get detailed information about recent transfers?

Dashboard
Many people use Dask's dashboard for this and watch for the presence of red bars in the task stream plot.
get_task_stream
However, if you're running benchmarks, then watching a live plot during the computation may not be what you're after. You can get the same information programmatically with the dask.distributed.get_task_stream context manager.
>>> from dask.distributed import get_task_stream
>>> with get_task_stream() as ts:
...     x.compute()
>>> ts.data
[...]
This will include information about both computations and data transfers, so you'll have to sift through them a bit.
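For example, here is one way to pull just the transfer phases out of ts.data and total the time they spent. This is a hedged sketch: each record's startstops entries mark a task's compute and transfer phases, but their exact layout has changed across distributed versions (older releases used (action, start, stop) tuples, newer ones use dicts), so check what your version returns.

from dask.distributed import get_task_stream

with get_task_stream() as ts:
    x.compute()  # x is whatever Dask collection you are benchmarking

# Collect the duration of every 'transfer' phase, tolerating both
# the tuple and the dict forms of startstops entries.
transfer_seconds = []
for record in ts.data:
    for ss in record.get('startstops', ()):
        if isinstance(ss, dict) and ss.get('action') == 'transfer':
            transfer_seconds.append(ss['stop'] - ss['start'])
        elif isinstance(ss, tuple) and ss[0] == 'transfer':
            transfer_seconds.append(ss[2] - ss[1])

print(sum(transfer_seconds), 'seconds spent in transfers')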
transfer_logs
Also, as of the time of writing this, Dask workers maintain logs of every transfer in the Worker.incoming_transfer_log and Worker.outgoing_transfer_log attributes. You could use the Client.run method to get these.
>>> client.run(lambda dask_worker: dask_worker.incoming_transfer_log)
{'tcp://192.168.1.191:50637': deque([]),
 'tcp://192.168.1.191:50638': deque([]),
 'tcp://192.168.1.191:50640': deque([
     {'start': 1558119113.3489196,
      'stop': 1558119113.4012725,
      'middle': 1558119113.375096,
      'duration': 0.0523529052734375,
      'keys': {"('dataframe-sum-chunk-5b219ece79c8315870694c0e17df68ee', 0, 27, 0)": 463941,
               "('dataframe-sum-chunk-5b219ece79c8315870694c0e17df68ee', 0, 23, 0)": 464477,
               "('dataframe-sum-chunk-5b219ece79c8315870694c0e17df68ee', 0, 7, 0)": 463708,
               "('dataframe-sum-chunk-5b219ece79c8315870694c0e17df68ee', 0, 15, 0)": 464091,
               "('dataframe-sum-chunk-5b219ece79c8315870694c0e17df68ee', 0, 19, 0)": 464826,
               "('dataframe-sum-chunk-5b219ece79c8315870694c0e17df68ee', 0, 3, 0)": 463847,
               "('dataframe-sum-chunk-5b219ece79c8315870694c0e17df68ee', 0, 11, 0)": 464200},
      'total': 3249090,
      'bandwidth': 62061312.22384144,
      'who': 'tcp://192.168.1.191:50642'},
     {'start': 1558119113.3484848,
      'stop': 1558119113.4085395,
      'middle': 1558119113.3785121,
      'duration': 0.060054779052734375,
      'keys': {"('dataframe-sum-chunk-5b219ece79c8315870694c0e17df68ee', 0, 12, 0)": 463485,
               "('dataframe-sum-chunk-5b219ece79c8315870694c0e17df68ee', 0, 24, 0)": 464183,
               "('dataframe-sum-chunk-5b219ece79c8315870694c0e17df68ee', 0, 16, 0)": 464061,
               "('dataframe-sum-chunk-5b219ece79c8315870694c0e17df68ee', 0, 4, 0)": 464161,
               "('dataframe-sum-chunk-5b219ece79c8315870694c0e17df68ee', 0, 8, 0)": 463925,
               "('dataframe-sum-chunk-5b219ece79c8315870694c0e17df68ee', 0, 0, 0)": 464214,
               "('dataframe-sum-chunk-5b219ece79c8315870694c0e17df68ee', 0, 20, 0)": 464070},
      'total': 3248099,
      'bandwidth': 54085604.03074382,
      'who': 'tcp://192.168.1.191:50637'},
     ...
This is keyed by worker, and gives the start/stop times, total bytes, bandwidth, and sender/recipient of every transfer. This solution uses internal API though, and so could change at any time.
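With that caveat, here is a small sketch of how these logs could be aggregated into an average incoming bandwidth per worker, using only the fields visible in the dump above:

def average_bandwidth(dask_worker):
    # Sum the bytes and wall-clock time over all recorded incoming transfers.
    log = dask_worker.incoming_transfer_log
    total_bytes = sum(entry['total'] for entry in log)
    total_time = sum(entry['duration'] for entry in log)
    return total_bytes / total_time if total_time else 0.0  # bytes per second

bandwidths = client.run(average_bandwidth)  # {worker address: bytes per second}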

Related

Getting fractions of seconds difference instead of equal times when comparing dates with TimeCop

I've TimeCop installed and using the travel: option with my tests but my tests seem to be failing when I know they shouldn't be. I thought it was my code but it seems that I'm getting fractions of a second added somewhere which is causing dates that should be equal to not be.
Given the following rspec test:
it 'Testing TimeCop', travel: Time.new(2021, 10, 5, 9, 0, 0, '-07:00') do
puts "Time.now: #{Time.now}"
puts "Time.new(2021, 10, 5, 9, 0, 0, '-07:00')}: #{Time.new(2021, 10, 5, 9, 0, 0, '-07:00')}"
puts "Time.now == Time.new(2021, 10, 5, 9, 0, 0, '-07:00')}: #{Time.now == Time.new(2021, 10, 5, 9, 0, 0, '-07:00')}"
puts "Time.now - Time.new(2021, 10, 5, 9, 0, 0, '-07:00')}: #{Time.now - Time.new(2021, 10, 5, 9, 0, 0, '-07:00')}"
end
I'm getting the following output:
Time.now: 2021-10-05 09:00:00 -0700
Time.new(2021, 10, 5, 9, 0, 0, '-07:00')}: 2021-10-05 09:00:00 -0700
Time.now == Time.new(2021, 10, 5, 9, 0, 0, '-07:00')}: false
Time.now - Time.new(2021, 10, 5, 9, 0, 0, '-07:00')}: 0.0004161418930646181
As you can see, these times are not equal; there seems to be a 4/10,000ths-of-a-second discrepancy between the two. I don't know what's going on here. Is there something I'm doing wrong with TimeCop, or is this a bug?
This is how Timecop#travel works and it is the expected behavior.
Setting a time to travel to sets the clock to that timestamp but lets time keep moving. Because you have multiple Time.now calls in your test, and time moved on between those calls, those instances of time cannot be equal.
That means that in your test you must allow small differences between two time instances, for example like this:
expect(Time.now).to be_within(1.second).of Time.now
Or you can freeze the time like Christian already mentioned in the comments. Freezing the time means that you set the current time to a specific time and it will not move on. For example like this:
before do
  Timecop.freeze(Time.new(2021, 10, 5, 9, 0, 0, '-07:00'))
end

after do
  Timecop.return
end

it "Testing TimeCop" do
  time = Time.new
  sleep 10
  expect(time).to eq Time.now
end

glTexImage2D fail on opengles3.0 context

I am implementing a native WebGL context compatible with HTML5.
Currently I support the WebGL 1.0 APIs.
On iOS I create the EAGLContext with kEAGLRenderingAPIOpenGLES3. Other GL calls work fine, but
glTexImage2D(3553, 0, 6408, 144, 108, 0, 6408, 5126, null), glError()=1282
This call fails.
If I change EAGLContext to opengles2.0, everything works fine.
My question is: all parameter values to glTexImage2D are the same, so why does this call fail if I create the context as ES 3.0 but succeed if the context is ES 2.0?
These are the gl calls dumped. The only difference is that when I create the EAGLContext using GLES3 api level, there is a glError 1282. If the context is created using GLES2 api level, everything works fine.
The first two glTexImage2D calls use GL_UNSIGNED_BYTE; the failed one uses GL_FLOAT. But an ES 3.0 context should support GL_FLOAT.
17:26:24.683200 Will setup FBOs.
17:26:24.684360 Setup FBOs done.
17:26:24.694778 glCreateTexture()=1
17:26:24.694981 glBindTexture(3553, 1)
17:26:24.695079 glTexParameteri(3553, 10242, 10497)
17:26:24.695142 glTexParameteri(3553, 10243, 10497)
17:26:24.695266 glTexParameteri(3553, 10241, 9985)
17:26:24.695313 glTexParameteri(3553, 10240, 9729)
17:26:24.695414 glTexParameterf(3553, 34046, 1.000000)
17:26:24.695414 glTexImage2D(3553, 0, 6408, 2, 2, 0, 6408, 5121, null)
17:26:24.695414 glTexImage2D(3553, 1, 6408, 1, 1, 0, 6408, 5121, null)
17:26:24.696141 [Buf:GL_UNSIGNED_BYTE:u8] 16, 16, 1
17:26:24.696961 glTexImage2D(3553, 0, 6408, 2, 2, 0, 6408, 5121, [16])
17:26:24.697674 glGenBuffers()=1
17:26:24.697862 glGenBuffers()=2
17:26:24.702478 glGenBuffers()=3
17:26:24.702547 glGenBuffers()=4
17:26:24.702675 glGenBuffers()=5
17:26:24.702734 glGenBuffers()=6
17:26:24.722429 glGenBuffers()=7
17:26:24.722589 glBindBuffer(34962, 7)
17:26:24.722697 glBufferData(34962, [65536], null, 35048)
17:26:24.722758 glGenBuffers()=8
17:26:24.722806 glBindBuffer(34962, 8)
17:26:24.722862 glBufferData(34962, [65536], null, 35048)
17:26:24.723104 createVertexArrayOES(1)
17:26:24.723690 glGenBuffers()=9
17:26:24.723743 glBindBuffer(34962, 9)
17:26:24.723799 glBufferData(34962, [2304000], null, 35048)
17:26:24.723985 glGenBuffers()=10
17:26:24.724068 glBindBuffer(34963, 10)
17:26:24.724120 glBufferData(34963, [64000], null, 35048)
17:26:24.724120 glCreateTexture()=2
17:26:24.747552 glBindTexture(3553, 2)
17:26:24.747625 glTexParameteri(3553, 10242, 33071)
17:26:24.747680 glTexParameteri(3553, 10243, 33071)
17:26:24.747733 glTexParameteri(3553, 10241, 9729)
17:26:24.747778 glTexParameteri(3553, 10240, 9729)
17:26:24.747842 glTexParameterf(3553, 34046, 1.000000)
17:26:24.747842 glTexImage2D(3553, 0, 6408, 144, 108, 0, 6408, 5126, null), glError()=1282
17:26:24.748000 glTexParameteri(3553, 10241, 9728)
17:26:24.748048 glTexParameteri(3553, 10240, 9728)
17:26:24.748120 glTexParameteri(3553, 10242, 33071)
17:26:24.748189 glTexParameteri(3553, 10243, 33071)
17:26:24.748266 glTexParameterf(3553, 34046, 1.000000)
The error is because JS passed an invalid combination of internal format/format/type.
glTexImage2D(3553, 0, 6408, 144, 108, 0, 6408, 5126, null), glError()=1282
is actually glTexImage2D(3553, 0, GL_RGBA, 144, 108, 0, GL_RGBA, GL_FLOAT, null)
This combination is not valid according to https://www.khronos.org/registry/OpenGL-Refpages/es3.0/html/glTexImage2D.xhtml.
The interesting thing is that in an iOS ES 2.0 context this combination is valid (ES 2.0 accepts GL_FLOAT textures through the OES_texture_float extension, which keeps the unsized GL_RGBA internal format).
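For what it's worth, the likely fix under ES 3.0 is to request a sized float internal format instead of the unsized GL_RGBA. Assuming GL_RGBA32F (34836) is what the texture is meant to hold, the call would become

glTexImage2D(3553, 0, 34836, 144, 108, 0, 6408, 5126, null)

i.e. glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA32F, 144, 108, 0, GL_RGBA, GL_FLOAT, null), one of the valid combinations in the ES 3.0 table linked above. Note that GL_RGBA32F is not filterable in core ES 3.0, so the GL_LINEAR filtering set earlier in the log would also need attention; GL_RGBA16F (34842) may be the safer choice if half precision is acceptable.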

Data shuffling for Image Classification

I want to develop a CNN model to identify 24 hand signs in American Sign Language. I created a custom dataset that contains 3,000 images for each hand sign, i.e. 72,000 images in the entire dataset.
For training the model, I would use an 80-20 dataset split (2,400 images per hand sign in the training set and 600 images per hand sign in the validation set).
My question is:
Should I randomly shuffle the images when creating the dataset? And why?
Based on my previous experience, it led to validation loss being lower than training loss and validation accuracy higher than training accuracy. Check this link.
Random shuffling of data is a standard procedure in all machine learning pipelines, and image classification is not an exception; its purpose is to break possible biases during data preparation - e.g. putting all the cat images first and then the dog ones in a cat/dog classification dataset.
Take for example the famous iris dataset:
from sklearn.datasets import load_iris
X, y = load_iris(return_X_y=True)
y
# result:
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])
As you can clearly see, the dataset has been prepared in such a way that the first 50 samples are all of label 0, the next 50 of label 1, and the last 50 of label 2. Try to perform a 5-fold cross validation in such a dataset without shuffling and you'll find most of your folds containing only a single label; try a 3-fold CV, and all your folds will include only one label. Bad... BTW, it's not just a theoretical possibility, it has actually happened.
Even if no such bias exists, shuffling never hurts, so we do it always just to be on the safe side (you never know...).
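To see the fold problem concretely, here is a minimal sketch using scikit-learn's KFold on the unshuffled iris labels:

from sklearn.datasets import load_iris
from sklearn.model_selection import KFold
import numpy as np

X, y = load_iris(return_X_y=True)

# Without shuffling, each 3-fold test fold is a contiguous block of 50
# samples, i.e. exactly one class: the model is tested only on a label
# it has never seen in training.
for train_idx, test_idx in KFold(n_splits=3).split(X):
    print(np.unique(y[test_idx]))   # [0], then [1], then [2]

# With shuffling, every fold mixes all three classes.
for train_idx, test_idx in KFold(n_splits=3, shuffle=True, random_state=0).split(X):
    print(np.unique(y[test_idx]))   # [0 1 2] each time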
Based on my previous experience, it led to validation loss being lower than training loss and validation accuracy higher than training accuracy. Check this link.
As noted in the answer there, it is highly unlikely that this was due to shuffling. Data shuffling is not anything sophisticated - essentially, it is just the equivalent of shuffling a deck of cards; it may have happened once that you insisted on "better" shuffling and subsequently you ended up with a straight flush hand, but obviously this was not due to the "better" shuffling of the cards.
Here is my two cents on the topic.
First of all, make sure to extract a test set that has an equal number of samples for each hand sign (hand sign #1 - 500 samples, hand sign #2 - 500 samples, and so on).
I think this is referred to as stratified sampling.
When it comes to the training set, there is no huge mistake in shuffling the entire set. However, when splitting the training set into training and validation sets, make sure that the validation set is a good enough representation of the test set; see the sketch below.
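As a concrete sketch of such a split (X and y here are placeholders for your image array and integer hand-sign labels), scikit-learn's train_test_split can shuffle and stratify in one step:

from sklearn.model_selection import train_test_split

# stratify=y keeps all 24 classes equally represented in both splits;
# shuffling is on by default and required whenever stratify is used.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)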
One of my personal experiences with shuffling:
After splitting the training set into training and validation sets, the validation set turned out to be very easy to predict. Therefore, I saw good learning metric values. However, the performance of the model on the test set was horrible.

Client request for tensorflow serving gives error "Attempting to use uninitialized value fully_connected/biases"

I created an LSTM RNN model for text classification in TensorFlow and exported the SavedModel successfully. I tested the model using the SavedModel CLI and everything seems to be working fine. However, I am now trying to create a client that can make a request and get a result. I have been following this TensorFlow Serving inception example (more specifically inception_client.py) for reference. This works well with the inception model, but I am not sure how to change the request for my own model. How exactly should I change the request?
My signature and saving the model:
# Build the signature_def_map.
classification_signature = signature_def_utils.build_signature_def(
    inputs={signature_constants.CLASSIFY_INPUTS: classification_inputs},
    outputs={
        signature_constants.CLASSIFY_OUTPUT_CLASSES:
            classification_outputs_classes,
    },
    method_name=signature_constants.CLASSIFY_METHOD_NAME)

legacy_init_op = tf.group(
    tf.tables_initializer(), name='legacy_init_op')

# add the sigs to the servable
builder.add_meta_graph_and_variables(
    sess, [tag_constants.SERVING],
    signature_def_map={
        signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY:
            classification_signature
    },
    assets_collection=tf.get_collection(tf.GraphKeys.ASSET_FILEPATHS),
    legacy_init_op=tf.group(assign_filename_op))

print("added meta graph and variables")
builder.save()
print("model saved")
The model takes in inputs_ as the input, which is a list of lists of numbers (e.g. [[1,3,4,5,2]]).
inputs_ = tf.placeholder(tf.int32, [None, None], name="input_ints")
How I am using the savedModel CLI (returns right results):
$ saved_model_cli run --dir ./python2_SavedModelFinalInputInts --tag_set serve --signature_def 'serving_default' --input_exprs inputs='[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2634, 758, 938, 579, 1868, 1894, 24, 651, 572, 32, 1847, 232]]'
More information about the savedModel:
$ saved_model_cli show --dir ./python2_prediction_SavedModelFinalInputInts --all

MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['inputs'] tensor_info:
        dtype: DT_INT32
        shape: (-1, -1)
        name: inputs/input_ints:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['outputs'] tensor_info:
        dtype: DT_FLOAT
        shape: (1, 1)
        name: predictions/fully_connected/Sigmoid:0
  Method name is: tensorflow/serving/predict
How I am trying to create a request in the client code:
request1 = predict_pb2.PredictRequest()
request1.model_spec.name = 'mnist'
request1.model_spec.signature_name = signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY
request1.inputs[signature_constants.PREDICT_INPUTS].CopyFrom(tf.contrib.util.make_tensor_proto(input_nums, shape=[1,100],dtype=tf.int32))
response = stub.Predict(request1,1.0)
result_dict = { 'Analyst Rating': str(response.message) }
return jsonify(result_dict)
I am getting the following error:
[2017-11-29 19:03:29,318] ERROR in app: Exception on /analyst_rating [POST]
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1612, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1598, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/usr/local/lib/python2.7/dist-packages/flask_restful/__init__.py", line 480, in wrapper
    resp = resource(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/flask/views.py", line 84, in view
    return self.dispatch_request(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/flask_restful/__init__.py", line 595, in dispatch_request
    resp = meth(*args, **kwargs)
  File "restApi.py", line 91, in post
    response = stub.Predict(request,1)
  File "/usr/local/lib/python2.7/dist-packages/grpc/beta/_client_adaptations.py", line 309, in __call__
    self._request_serializer, self._response_deserializer)
  File "/usr/local/lib/python2.7/dist-packages/grpc/beta/_client_adaptations.py", line 195, in _blocking_unary_unary
    raise _abortion_error(rpc_error_call)
AbortionError: AbortionError(code=StatusCode.FAILED_PRECONDITION, details="Attempting to use uninitialized value fully_connected/biases
  [[Node: fully_connected/biases/read = Identity[T=DT_FLOAT, _class=["loc:#fully_connected/biases"], _output_shapes=[[1]], _device="/job:localhost/replica:0/task:0/cpu:0"](fully_connected/biases)]]")
127.0.0.1 - - [29/Nov/2017 19:03:29] "POST /analyst_rating HTTP/1.1" 500 -
{"message": "Internal Server Error"}
Update:
Changing the signature of the model from a classification signature to a prediction signature seemed to work: the server now returns results. I also changed the legacy_init_op argument to the legacy_init_op defined from tf.tables_initializer(), instead of the assign_filename_op I had initially been using for assets organization.
prediction_signature = (
    tf.saved_model.signature_def_utils.build_signature_def(
        inputs={signature_constants.PREDICT_INPUTS: prediction_inputs},
        outputs={signature_constants.PREDICT_OUTPUTS: prediction_outputs},
        method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME))

legacy_init_op = tf.group(tf.tables_initializer(), name='legacy_init_op')

# add the sigs to the servable
builder.add_meta_graph_and_variables(
    sess, [tag_constants.SERVING],
    signature_def_map={
        signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY:
            prediction_signature
    },
    # assets_collection=tf.get_collection(tf.GraphKeys.ASSET_FILEPATHS),
    legacy_init_op=legacy_init_op)
I am not entirely sure what the client request should look like for a model with a classification signature, or why that signature was not working.
(If anyone has an explanation, I will select that as the correct answer.)
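For reference, here is a minimal sketch of what a matching client request might look like against the prediction signature above, using the same beta gRPC API that appears in the traceback. The model name 'my_model' and the host/port are placeholders; the 'inputs'/'outputs' keys come from the SignatureDef shown earlier.

from grpc.beta import implementations
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2
import tensorflow as tf

channel = implementations.insecure_channel('localhost', 9000)
stub = prediction_service_pb2.beta_create_PredictionService_stub(channel)

request = predict_pb2.PredictRequest()
request.model_spec.name = 'my_model'          # must match the name the server was started with
request.model_spec.signature_name = 'serving_default'

# input_nums is the list of token ids, as in the CLI example above.
request.inputs['inputs'].CopyFrom(
    tf.contrib.util.make_tensor_proto(input_nums, shape=[1, len(input_nums)], dtype=tf.int32))

response = stub.Predict(request, 10.0)        # 10-second timeout
score = response.outputs['outputs'].float_val[0]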

AS2 Final Fantasy type Inventory

Edit: since I wasn't clear originally: I'm having trouble working out an AS2 inventory with stacking consumables and equippable weapons/armor. My code below shows the arrays holding the values; I've been working on this for years but haven't found a viable way to make it work. I'm looking for help/ideas on how I can make an array hold these values and make them stack.
Since about 2008, I've been working on a Flash RPG. While I'm making headway, my biggest roadblock is the lack of an item system, which is becoming increasingly hard to create in AS2 as all of the great coders slowly refuse to touch Flash anymore. It's far too late to start over in a different language; my main file is well over 500 MB as a .fla.
Honestly, I'd pay whoever can figure this out, because this game has a few other issues I'd like help with after this is solved. I'll throw down $50 USD to whoever can make a good item system with these requirements:
Has to be able to draw from a database of different item types (consumable, weapon, armor, accessory)
Consumables must be able to "stack" (weapons, armor, and accessories don't need to stack)
Consumables must be able to directly affect HP/MP values (heal and the like)
Weapons, armour, and accessories must have values attached to them so they can affect the character's stats when equipped (e.g. strength +10)
Items must be able to be bought and sold
Items must be able to be displayed as text (movieclips for each item are not really necessary)
Pretty much this below: (screenshot of a Final Fantasy-style item/inventory menu)
File size is not an issue whatsoever; these items must be able to be rendered in game.
Literally, if possible, use Final Fantasy 7 (or any FF game) as a basis for how the item system should work. If you can get it close to that, I'll pay you immediately. This is the code I was working on:
//ITEM SYSTEM
//this array will hold all items, inside of it, hopefully to be called inside of itself
//0.......1............2........3......4.... array element values
//name, description, effect, price, item type
_global.itemlist = [
//CONSUMABLES
["Sweetroot", "Restores 50 HP", 50, 100, "consumable"],
["Sugarroot", "Restores 100 HP", 100, 500, "consumable"],
["Candyroot", "Fully restores HP", 9999, 5000, "consumable"],
["Bitterroot", "Restores 50 MP", 50, 200, "consumable"],
["Sourroot", "Restores 100 MP", 100, 600, "consumable"],
["Tartroot", "Fully restores MP", 9999, 10000, "consumable"]
];
//0.......1............2....3....4...5....6.....7. array element values
//name, description, user, str, sta, def, mst, item type
_global.weaponlist = [
//WEAPONS
["Secred", "A mysterious zanbato with seals on the blade", "azul", 50, 0, 25, 10, "weapon"],
["Illuminite", "A thin weapon with a diamond rod for focusing light", "aseru", 50, 10, 10, 50, "weapon"],
["Fists", "A bare handed attack", "malforn", 50, 0, 55, 0, "weapon"],
["Kazeshini", "An oddly shaped blade for quick attacks ", "vayle", 20, 10, 25, 10, "weapon"],
["Yamato", "A katana with a blue hilt and a lightweight blade", "aoizi", 50, 20, 35, 50, "weapon"],
["Saleria", "A heavy scythe made from chenre absorbing crystal", "laava", 50, 20, 35, 50, "weapon"]
];
//0.......1............2....3....4...5....6...7 array element values
//name, description, str, sta, def, mst, price, item type
_global.armorlist = [
["Clothes", "light clothing", 0, 10, 10, 0, 500, "armor"],
["Armor", "plated armor", 0, 20, 20, 0, 5000,"armor"],
["Chain", "a thick chain for attack or defense", 20, 10, 10, 0, 500,"armor"],
["Shawl", "thin fabric worn by women", 0, 0, 5, 10, 7000,"armor"],
["Bodysuit", "a thick full body suit", 10, 20, 20, 0, 7500,"armor"],
["Spiked Araments", "armor covered in spikes, increases attack strength", 30, 0, 20, 0, 8000,"armor"]
];
//0.......1............2....3....4...5....6...7 array element values
//name, description, str, sta, def, mst, price, item type
_global.acclist = [
["Blade Ring", "raises attack", 30, 0, 0, 0, 500, "accessory"],
["Vigor Ring", "raises stamina", 0, 30, 0, 0, 500, "accessory"],
["Shield Ring", "raises defense", 0, 0, 30, 0, 500, "accessory"],
["Chenre Ring", "raises mental", 0, 0, 0, 30, 500, "accessory"],
["Knife Ring", "greatly raises attack", 120, 0, 0, 0, 50000, "accessory"],
["Vehemence Ring", "greatly raises stamina", 0, 120, 0, 0, 50000, "accessory"],
["Aegis Ring", "greatly raises defense", 0, 0, 120, 0, 50000, "accessory"],
["Aura Ring", "greatly raises mental", 0, 0, 0, 120, 50000, "accessory"]
];
Get an item's name with itemlist[0][0], its description with itemlist[0][1], and its effect with itemlist[0][2].
Example: if itemlist[0] is being addressed, then itemlist[0][0] = "Sweetroot" (the name), itemlist[0][1] = "Restores 50 HP" (the description), and so on; trace(_global.itemlist[0][0]); should trace Sweetroot.
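If it helps, here is a short, language-agnostic sketch (written in Python, since AS2 tooling is scarce now) of one way to get stacking on top of definition tables like those above: keep one lookup table keyed by item name, and a separate inventory that only maps names to counts. In AS2 the same shape works with _global Objects. All names below are illustrative, not an existing API.

# Definition table: one entry per item, keyed by name.
itemlist = {
    "Sweetroot": {"desc": "Restores 50 HP", "effect": 50, "price": 100, "type": "consumable"},
    "Secred":    {"desc": "A mysterious zanbato with seals on the blade",
                  "type": "weapon", "str": 50, "sta": 0, "def": 25, "mst": 10},
}

inventory = {}  # name -> quantity; "stacking" is just incrementing a count

def add_item(name, qty=1):
    inventory[name] = inventory.get(name, 0) + qty

def use_consumable(name, character):
    item = itemlist[name]
    if inventory.get(name, 0) > 0 and item["type"] == "consumable":
        character["hp"] = min(character["hp"] + item["effect"], character["max_hp"])
        inventory[name] -= 1  # shrink the stack; drop the entry at 0 if you prefer

hero = {"hp": 40, "max_hp": 120}
add_item("Sweetroot", 3)           # three pickups -> one stack of 3
use_consumable("Sweetroot", hero)  # hp 40 -> 90, stack drops to 2

Buying and selling are then just add_item/removal plus a gold check against the price field, and equipping a weapon means copying its stat fields onto the character's stats.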
