How do I make the Dask dashboard work properly?

I am running everything on AWS EC2. In one Jupyter notebook process, I do:
from dask.distributed import Client, LocalCluster
cluster = LocalCluster(scheduler_port=5272, dashboard_address=5273, memory_limit='4GB')
# client = Client(cluster)
to start a cluster on a specific port. Then, in another process, I connect to it with:
client = Client('tcp://127.0.0.1:5272')
To check the dashboard, I open the following in Safari:
http://ec2-1-123-12-123.compute-1.amazonaws.com:5273/status
I can see the dashboard, but when I run some big processing tasks there is no activity on it; it still shows little CPU usage. Meanwhile, htop shows huge CPU usage while my ddf.compute() is running... Did I miss anything?
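For reference, a minimal sketch of the two-process setup described above. The ports and address are the ones from the question; the string form of dashboard_address and the input files are assumptions:
# Process 1: start the cluster (scheduler on 5272, dashboard on 5273)
from dask.distributed import LocalCluster
cluster = LocalCluster(scheduler_port=5272, dashboard_address=':5273', memory_limit='4GB')

# Process 2: connect a client; it becomes the default scheduler
import dask.dataframe as dd
from dask.distributed import Client
client = Client('tcp://127.0.0.1:5272')
ddf = dd.read_csv('data-*.csv')  # hypothetical input files
ddf.sum().compute()              # this work should show up on the dashboard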

Related

Does it make sense to cluster NodeJs (in order to take advantage of multiple CPUs) if it will be deployed with an orchestration tool like Kubernetes?

Right now I am struggling to debug a clustered NodeJs application running on Docker. I found this link, with this information in it:
Remember, Node.js is still single-threaded in most cases, so even on a
single server you’ll likely want to spin up multiple container
replicas to take advantage of multiple CPU’s
So does this mean that clustering a NodeJs app is pointless when it is meant to be deployed on Kubernetes?
EDIT: I should also say that by clustering I mean forking workers with cluster.fork(), and that the goal of the application is to build a simple REST API serving high-traffic load.
The short answer is yes.
Containers behave like lightweight VMs, and Kubernetes is the orchestration tool that manages all the running containers, checking health, allocating resources, balancing load, and so on.
So, if you are running your Node application in a container under an orchestration tool like Kubernetes, clustering is moot: each container will use one CPU, or a partial CPU, depending on how you have it configured. Each additional container essentially places a new replica in rotation, and Kubernetes directs traffic to each.
Clustering Node really comes into play when using tools like PM2. Say you have a beefy server with 8 CPUs: Node can only use one CPU per instance, so tools like PM2 set up a cluster and route traffic across the running instances.
One thing to keep in mind, though, is that your application needs to be cluster- or container-ready. Nothing should be stored on the ephemeral disk, since that data is lost with each container restart; likewise, in a cluster there is no guarantee the same folders will be available to each running instance, and if you cluster across multiple servers you are asking for trouble :D (this is where an object store like S3 comes into play).

Confusion regarding cluster scheduler and single machine distributed scheduler

In the code below, why does dd.read_csv run on the cluster? I thought I would need something like client.read_csv for the work to run on the cluster.
import dask.dataframe as dd
from dask.distributed import Client

client = Client('10.31.32.34:8786')
df = dd.read_csv('file.csv', blocksize=10e7)
df.compute()
Is it the case that once I create a Client object, all API calls will run on the cluster?
The command dd.read_csv('file.csv', blocksize=1e8) will generate many pd.read_csv(...) calls, each of which will run on your Dask workers. Each task will look for the file.csv file, seek to some location within that file determined by your blocksize, and read those bytes to create a pandas dataframe. The file.csv file should therefore be accessible to every worker.
It is common to keep such files on some universally available storage, like a network file system, a database, or a cloud object store.
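As a rough illustration, using the file name from the question and a hypothetical ~1 GB file, blocksize=1e8 (100 MB) yields roughly 10 partitions, i.e. roughly 10 pd.read_csv tasks distributed to the workers:
import dask.dataframe as dd
df = dd.read_csv('file.csv', blocksize=1e8)
print(df.npartitions)  # roughly 10 for a ~1 GB file; one pd.read_csv task per partition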
In addition to the first answer:
Yes, creating a Client connected to a distributed cluster makes it the default scheduler for all following Dask work. You can, however, specify where you would like work to run, as follows.
For a specific compute:
df.compute(scheduler='threads')
For a block of code:
import dask
with dask.config.set(scheduler='threads'):
    df.compute()
Until further notice:
dask.config.set(scheduler='threads')
df.compute()
See http://dask.pydata.org/en/latest/scheduling.html
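Putting the two answers together, a minimal sketch (the scheduler address and file name are the ones from the question):
import dask.dataframe as dd
from dask.distributed import Client

client = Client('10.31.32.34:8786')           # becomes the default scheduler
df = dd.read_csv('file.csv', blocksize=10e7)

df.compute()                     # runs on the distributed cluster
df.compute(scheduler='threads')  # overrides the default; runs in local threads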

Can you shut down your computer when your model is being trained on Google Cloud?

When you train a model via a Jupyter notebook on a Google Cloud instance, do you have to keep your computer on until the model finishes training? All the computation is done in the cloud, but the code and the notebook are still in your browser, so I was a bit curious.
Thanks a lot
This is actually more about Jupyter's behaviour than about running on a Google Cloud instance.
If your process has not been terminated in your VM instance, then the kernel is still active, and even though you close your browser, whatever you are running will still be running. You should therefore be able to access the notebook again and see all variables that had already been defined; however, you will not see any output that was printed to the notebook in the meantime (if any). In case you need to close your window and want to log events, see some of the suggestions in the following post:
Keep Jupyter notebook running after closing browser tab
This GitHub issue thread could also be useful.

How to monitor windows service and process with zabbix

I am new to Zabbix and am using version 3.4. I have installed the server on Linux and want to monitor and check the status of a Windows service using the Windows agent.
I got the status of the service using the key below:
service.info[<serviceName>,state]
It returns the proper status of the service. Now I want to check how much CPU and how much memory the process utilizes.
I tried some keys, but they do not return proper values:
perf_counter[\Process(<processName>)\% User Time] // CPU utilization by the process
proc_info[<processName>,wkset] // memory utilized by the process
system.cpu.util[,system,avg5] // total CPU utilization
vm.memory.size[available] // total available RAM
None of the above works properly. I tried other keys as well, but the agent logs say they are unsupported. I checked the forum and searched Google but found nothing.
Usually there isn't a direct Windows service -> specific process match.
A service may spawn N processes for its internals and can also spawn additional processes to manage incoming connections, log requests, and so on.
Think about a classic httpd server: you should find at least one master process, various pre-forked server processes, and php/php-fpm processes for current requests.
Regarding the keys you provided, what do you mean by "not working properly"?
You can refer to Zabbix documentation for Windows-specific items for the exact syntax of the items and the meaning of the return values.
You can use the Zabbix item for average CPU utilization over 5 minutes:
system.cpu.util[,,avg5]
This will give you the average CPU usage over 5 minutes on the Windows server. You can then create an appropriate trigger for it.

Neo4j enterprise cluster master not fully utilizing CPU

We have a Neo4j v3.0.4 Enterprise cluster running on an AWS machine with 16 cores, and when we issue a lot of requests to it, it seems to utilize at most ~40% of the CPU (looking at the box with htop, it only seems to use 6 cores). Disk and network IO on the box look negligible during the test.
Screen capture of the CPU profile (the flat part is when we hit it with load).
Requests are routed to the DB via a cluster of Spring Boot apps using Spring Data Neo4j 4, and from our investigations those servers do not appear to form any bottleneck in terms of memory, CPU, or network IO.
We are currently NOT using Bolt, nor causal clustering; however, we are planning to move toward both. In the interim, though, is there anything that could cause this type of behavior? Could our DB be misconfigured? Could this be a JVM-level problem?
Any advice is much appreciated - thanks!
