System died when I ran edgetpu_demo - google-coral

I got to step 7 of
https://coral.ai/docs/dev-board/get-started/#run-demo
After I ran the command edgetpu_demo --stream, the console showed the messages below and then the system rebooted.
INFO:edgetpuvision.streaming.server:Listening on ports tcp: 4665, web: 4664, annexb: 4666
INFO:edgetpuvision.streaming.server:New web connection from 192.168.1.6:62609
INFO:edgetpuvision.streaming.server:Number of active clients: 1
INFO:edgetpuvision.streaming.server:[192.168.1.6:62609] Rx thread finished
INFO:edgetpuvision.streaming.server:[192.168.1.6:62609] Tx thread finished
INFO:edgetpuvision.streaming.server:New web connection from 192.168.1.6:62610
INFO:edgetpuvision.streaming.server:Number of active clients: 2
INFO:edgetpuvision.streaming.server:[192.168.1.6:62609] Stopping...
INFO:edgetpuvision.streaming.server:[192.168.1.6:62609] Stopped.
INFO:edgetpuvision.streaming.server:Number of active clients: 1
INFO:edgetpuvision.streaming.server:New web connection from 192.168.1.6:62611
INFO:edgetpuvision.streaming.server:Number of active clients: 2
INFO:edgetpuvision.streaming.server:[192.168.1.6:62610] Rx thread finished
INFO:edgetpuvision.streaming.server:New web connection from 192.168.1.6:62612
INFO:edgetpuvision.streaming.server:Number of active clients: 3
INFO:edgetpuvision.streaming.server:[192.168.1.6:62611] Rx thread finished
Do I need to update some modules, such as GStreamer?

A board reboot can occur if the board is not getting enough power. Please make sure to boot the board with a power adapter rated for at least 2.1-3 A.

Related

uwsgi log format: seeing uwsgi req: N/M where N > M

I have a uwsgi process running a Flask application. There is haproxy (running in HTTP mode) sitting between the client and the application.
I am seeing an occasional haproxy termination state of "SD--", with Tc = 0, Tr = -1, and a returned HTTP code of -1. This means that haproxy encountered an explicit TCP disconnection from the uwsgi server.
Looking at the uwsgi logs, I found that the server was processing other requests normally at the same time, but the affected request never reached the server.
The only strange thing about the uwsgi logs at that point in time is that the number of requests managed by the current uwsgi worker is greater than the total number of requests managed by the whole uwsgi app,
like this:
[pid: 22759|app: 0|req: **47188**/**47178**] * POST * => generated 84 bytes in 970 msecs (HTTP/1.1 200) 2 headers in 71 bytes (3 switches on core 98)
I am wondering whether this is abnormal, and in what scenarios these counters could end up like this.
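For what it's worth, here is a minimal sketch for scanning the logs for more of these lines (it assumes the standard uwsgi request-log prefix shown above, without the bold markers):

import re
import sys

# Matches the "[pid: ...|app: ...|req: N/M]" prefix of uwsgi request-log lines.
REQ_RE = re.compile(r"\[pid: (\d+)\|app: (\d+)\|req: (\d+)/(\d+)\]")

def find_anomalies(path):
    """Yield (line_no, worker_count, app_total) where worker_count > app_total."""
    with open(path) as fh:
        for line_no, line in enumerate(fh, 1):
            m = REQ_RE.search(line)
            if not m:
                continue
            worker_count, app_total = int(m.group(3)), int(m.group(4))
            if worker_count > app_total:
                yield line_no, worker_count, app_total

if __name__ == "__main__":
    # Usage: python find_req_anomalies.py uwsgi.log
    for line_no, worker, total in find_anomalies(sys.argv[1]):
        print(f"line {line_no}: worker req counter {worker} > app total {total}")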

Q: Apache Geode - Unable to reconnect a node after OS patching: "15 seconds have elapsed while waiting for replies"

I have a cluster consisting of 4 nodes in total, 3 servers and 1 management node, all working properly.
At the beginning of the month we planned to patch the OS, and we started with the first server node using this procedure:
Stop service
OS patching
Server restart
Start service
The service on the first patched node, named "serverA", fails to restart with this error:
Log entries from the cluster join:
serverA:
| INFO | region-dm-12 | ache.geode.internal.tcp.Connection | --> Connection: shared=true ordered=false failed to connect to peer 10.237.110.195( Server serverB:9993):1024 because: java.net.ConnectException: Connection timed out (Connection timed out)
| WARN | region-dm-12 | ache.geode.internal.tcp.Connection | --> Connection: Attempting reconnect to peer 10.237.110.195( Server serverB:9993):1024
ServerMgmt:
| WARN | pool-3-thread-1 | tributed.internal.ReplyProcessor21 | --> 15 seconds have elapsed while waiting for replies: <CreateRegionProcessor$CreateRegionReplyProcessor 44180 waiting for 1 replies from [10.237.110.194( Server serverA:632):1024]> on 10.237.110.225( Management:6033):1024 whose current membership list is: [[10.237.110.196( Server serverC:16805):1024, 10.237.110.225( Management:6033):1024, 10.237.110.195( Server serverB:9993):1024, 10.237.110.194( Server serverA:632):1024]]
The connection between the systems was verified with tcpdump; UDP traffic on port 1024 is flowing fine.
We have tried redeploying the service and making numerous attempts but we always get the same error during startup.
Any suggestions? Thank you.
Marco.
To get this error message, I think serverA was probably able to send UDP messages to serverB but is failing to create a TCP connection. It's hard to say why, though - a firewall issue, some TCP configuration issue, ... ?
Check whether serverB has anything interesting in its logs. Since you are using tcpdump, you should be watching for that TCP connection to serverB:9993, since it looks like that is what failed.
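If it helps, a plain socket connect from serverA is a quick way to test that TCP path outside of Geode. A minimal sketch: the address is serverB's from the logs above, and the port follows my reading of the log line, so adjust it if your setup differs:

import socket

# Assumption: serverB's address from the logs above; the port is the one this
# answer reads as the failing TCP target (adjust if needed).
HOST, PORT = "10.237.110.195", 9993

try:
    with socket.create_connection((HOST, PORT), timeout=15):
        print(f"TCP connect to {HOST}:{PORT} succeeded")
except OSError as exc:
    print(f"TCP connect to {HOST}:{PORT} failed: {exc}")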
There is no firewall between the systems. We analyzed the network connection again during startup of node A, and we can see that communication can be established between all systems. What we detected, however, is that on port 2323, which is configured as the locator port, the node sends packets to the B and C nodes but only receives packets back from the C node, not from the B node. This is, for us, another sign that the B node has an issue. Is there a way to check our assumption from the B node?
A node ip .194
B node ip .195
C node ip .196
Management ip .225
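One way to check that assumption from node B itself is to run a similar connectivity test from B. A minimal sketch: 2323 is the locator port mentioned above and the addresses are the hosts listed here; skip any host that does not actually run a locator:

import socket

LOCATOR_PORT = 2323  # locator port mentioned above

# Run this on node B. First verify B's own locator is listening, then check
# that B can reach the locator port on the other hosts.
TARGETS = {
    "B (local locator)": "10.237.110.195",
    "A": "10.237.110.194",
    "C": "10.237.110.196",
    "Management": "10.237.110.225",
}

for name, host in TARGETS.items():
    try:
        with socket.create_connection((host, LOCATOR_PORT), timeout=10):
            print(f"{name} {host}:{LOCATOR_PORT} reachable")
    except OSError as exc:
        print(f"{name} {host}:{LOCATOR_PORT} failed: {exc}")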

Why are Google Pipeline VM instances hanging indefinitely?

I am using Dockerflow to run parallel tasks through the Google Pipelines API on Google Cloud Platform. I started a single-step task running 1389 VMs in parallel and found that 233 of the VMs were apparently doing nothing and hanging indefinitely.
I did a spot check of the serial console output and repeatedly saw the VMs running into "Getting controller config failed" errors.
When I tried logging into the VMs I received the error: "Connection Failed. We are unable to connect to the VM on port 22".
I am wondering why my VM instances are hanging, and if there is something I can do to avoid running into these issues.
I've included a snippet of the serial console output below:
startupscript: +++ readlink -f /usr/share/google-genomics/startup.sh
startupscript: ++ dirname /usr/share/google-genomics/startup.sh
startupscript: + cd /usr/share/google-genomics
startupscript: + ./controller --operation_id <id> --validation_token <token> --base_path https://genomics.googleapis.com
create controller[2905]: Getting controller config
create controller[2905]: Getting controller config failed, will retry: Get <link>: Get <service_account_token_link>: net/http: timeout awaiting response headers
create controller[2905]: Getting controller config failed, will retry: Get <link>: dial tcp 74.125.26.95:443: i/o timeout
collectd[2342]: write_gcm: Asking metadata server for auth token
collectd[2342]: write_gcm: curl_easy_perform() failed: Couldn't connect to server
collectd[2342]: write_gcm: Error -1 from wg_curl_get_or_post
collectd[2342]: write_gcm: wg_transmit_unique_segment failed.
collectd[2342]: write_gcm: wg_transmit_unique_segments failed. Flushing.
There was a temporary networking issue in us-east1-b. All 3 of the VMs above were in us-east1-b. These minor incidents do not appear on https://status.cloud.google.com/
Serial console output for a successful run looks like:
A Feb 21 19:05:06 ggp-5629907348021283130 startupscript: + ./controller --operation_id --validation_token --base_path https://autopush-genomics.sandbox.googleapis.com
A Feb 21 19:05:06 ggp-5629907348021283130 create controller[2689]: Getting controller config
A Feb 21 19:05:36 ggp-5629907348021283130 create controller[2689]: Getting controller config failed, will retry: Get https://genomics.googleapis.com/v1alpha2/pipelines:getControllerConfig?alt=json&operationId=&validationToken=: dial tcp 173.194.212.81:443: i/o timeout
A Feb 21 19:05:43 ggp-5629907348021283130 controller[2689]: Switching to status: pulling-image
A Feb 21 19:05:43 ggp-5629907348021283130 controller[2689]: Calling SetOperationStatus(pulling-image)
A Feb 21 19:05:44 ggp-5629907348021283130 controller[2689]: SetOperationStatus(pulling-image) succeeded
The "Getting controller config failed, will retry" is fine. It succeeded upon retry. The "SetOperationStatus(pulling-image) succeeded" indicates networking is working.
In theory, you can submit any number of jobs to Pipelines API and the API will take care of queueing.
If these temporary networking hiccups become common, we may consider changing Pipelines API to somehow detect and retry.
There may have been a temporary networking issue. Can you give me some failed operation IDs (or failed VM names)?
Have you tried again since then? Can you reproduce the problem?
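If it happens again, one quick check from an affected VM is to probe the metadata server directly, since that is the same call that is timing out in the controller and collectd lines above. A minimal sketch: the endpoint is the standard GCE metadata token URL, and the retry count is arbitrary:

import time
import urllib.request

# Standard GCE metadata endpoint for a service-account token; this is the
# request that "Getting controller config" and collectd's write_gcm depend on.
TOKEN_URL = ("http://metadata.google.internal/computeMetadata/v1/"
             "instance/service-accounts/default/token")

def probe(retries=5, delay=10):
    req = urllib.request.Request(TOKEN_URL, headers={"Metadata-Flavor": "Google"})
    for attempt in range(1, retries + 1):
        try:
            with urllib.request.urlopen(req, timeout=10) as resp:
                print(f"attempt {attempt}: HTTP {resp.status}, token fetch OK")
                return True
        except Exception as exc:  # timeouts, connection resets, etc.
            print(f"attempt {attempt}: failed ({exc})")
            time.sleep(delay)
    return False

if __name__ == "__main__":
    probe()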

How does YAWS handle concurrent users

I wish to know which code is executed in YAWS every time a new client uses its web server.
First, to understand how YAWS handles concurrent users, I tried the following .yaws page:
io:format("~nProcess Identifier: ~p Port: ~p Client: ~p YAWS pid: ~p ~n",[self(), A#arg.clisock, A#arg.client_ip_port, A#arg.pid]).
which should print the PID, port and IP of each client. I opened this page in the same browser (Firefox) in two distinct tabs, and this was printed:
Process Identifier: <0.65.0> Port: #Port<0.1211> Client: {{127,0,0,1},60451} YAWS pid: <0.65.0>
Process Identifier: <0.65.0> Port: #Port<0.1211> Client: {{127,0,0,1},60451} YAWS pid: <0.65.0>
For some reason, the same port and PID are returned (hence, YAWS isn't creating a new port or new PID for each client).
When I tried this in Chrome, this was printed:
Process Identifier: <0.71.0> Port: #Port<0.2998> Client: {{127,0,0,1},60543} YAWS pid: <0.71.0>
Process Identifier: <0.71.0> Port: #Port<0.2998> Client: {{127,0,0,1},60543} YAWS pid: <0.71.0>
So why is YAWS not opening a new port or PID for each tab in the same browser?
Also, back to the original question: where, and in which code, does YAWS spawn a new PID or open a new port?
Thanks
Unless you're sure that your browsers open new HTTP connections for each tab, you're not really testing what you think you're testing. Instead, try this from a command line:
curl http://yaws_host:yaws_port/path/to/your/yaws/page.yaws
curl http://yaws_host:yaws_port/path/to/your/yaws/page.yaws
Yes, run it twice, as that is guaranteed to use two separate connections. You will then see that Yaws uses two distinct Erlang processes and TCP connections to handle the two requests:
Process Identifier: <0.59.0> Port: #Port<0.1181> Client: {{127,0,0,1},64977} YAWS pid: <0.59.0>
Process Identifier: <0.64.0> Port: #Port<0.3268> Client: {{127,0,0,1},64978} YAWS pid: <0.64.0>
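If you'd rather drive the test from a script than from curl, here is a minimal Python sketch (host, port and path are placeholders) that first sends two requests over one persistent connection - which Yaws serves from a single process, like your browser tabs - and then over two separate connections:

import http.client

# Hypothetical host/port for the Yaws server and the page from the question.
HOST, PORT, PATH = "localhost", 8080, "/page.yaws"

# Two requests over ONE persistent connection: Yaws handles both in the same
# Erlang process, matching what the two browser tabs showed.
conn = http.client.HTTPConnection(HOST, PORT)
for _ in range(2):
    conn.request("GET", PATH)
    conn.getresponse().read()
conn.close()

# Two requests over SEPARATE connections: each gets its own Erlang process,
# like the two curl invocations above.
for _ in range(2):
    c = http.client.HTTPConnection(HOST, PORT)
    c.request("GET", PATH)
    c.getresponse().read()
    c.close()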
As for where the Yaws code for dealing with connections resides, you can look in yaws_server.erl, in particular at the acceptor/1 function which launches processes to accept connections and the do_listen/2 function which opens sockets for listening.

Timeout Exception trying to connect to Memcached server on AppHarbor

I've been using the Enyim Memcached Client for .NET, trying to connect to a server running on AppHarbor. The relevant parts of my configuration file look like this:
<enyim.com>
  <log factory="Enyim.Caching.DiagnosticsLogFactory, Enyim.Caching" />
  <memcached protocol="Binary">
    <servers>
      <add address="8d593f28-37d7-4c4f-a702-aa7687a85ea1.memcacher.com" port="11211" />
    </servers>
    <authentication
      type="Enyim.Caching.Memcached.PlainTextAuthenticator, Enyim.Caching"
      userName="changed to post on stack overflow"
      password="changed to post on stack overflow"
      zone="AUTHZ"
    />
  </memcached>
</enyim.com>
My connection keeps timing out. Any idea what's going on here? Here are the logs from the Enyim client:
2012-01-21 18:56:08 [ERROR] 7 Enyim.Caching.Memcached.MemcachedNode.InternalPoolImpl - Could not init pool. - System.TimeoutException: Could not connect to 50.19.210.46:11211
at Enyim.Caching.Memcached.PooledSocket.ConnectWithTimeout(Socket socket, IPEndPoint endpoint, Int32 timeout)
at Enyim.Caching.Memcached.PooledSocket..ctor(IPEndPoint endpoint, TimeSpan connectionTimeout, TimeSpan receiveTimeout)
at Enyim.Caching.Memcached.MemcachedNode.CreateSocket()
at Enyim.Caching.Memcached.Protocol.Binary.BinaryNode.CreateSocket()
at Enyim.Caching.Memcached.MemcachedNode.InternalPoolImpl.CreateSocket()
at Enyim.Caching.Memcached.MemcachedNode.InternalPoolImpl.InitPool()
2012-01-21 18:56:08 [DEBUG] 7 Enyim.Caching.Memcached.MemcachedNode.InternalPoolImpl - Mark as dead was requested for 50.19.210.46:11211
2012-01-21 18:56:08 [DEBUG] 7 Enyim.Caching.Memcached.MemcachedNode.InternalPoolImpl - FailurePolicy.ShouldFail(): True
2012-01-21 18:56:08 [WARN] 7 Enyim.Caching.Memcached.MemcachedNode.InternalPoolImpl - Marking node 50.19.210.46:11211 as dead
2012-01-21 18:56:08 [DEBUG] 7 Enyim.Caching.Memcached.DefaultServerPool - Node 50.19.210.46:11211 is dead.
2012-01-21 18:56:08 [DEBUG] 7 Enyim.Caching.Memcached.DefaultServerPool - Starting the recovery timer.
2012-01-21 18:56:08 [DEBUG] 7 Enyim.Caching.Memcached.DefaultServerPool - Timer started.
2012-01-21 18:56:08 [DEBUG] 7 Enyim.Caching.Memcached.MemcachedNode.InternalPoolImpl - Acquiring stream from pool. 50.19.210.46:11211
2012-01-21 18:56:08 [DEBUG] 7 Enyim.Caching.Memcached.MemcachedNode.InternalPoolImpl - Pool is dead or disposed, returning null. 50.19.210.46:11211
UPDATE:
It turns out the reason I can't connect to the memcached server is that it's only accessible from AppHarbor's environment. So, for anyone else who runs across this: you need to use a local memcached service when developing locally, then simply change the credentials when deploying (which AppHarbor actually does for you automatically). Problem resolved.
AppHarbor Memcacher buckets are only accessible from AppHarbor application servers. The documentation has been amended to clearly reflect this.
You should use a locally installed memcached server for testing.
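When you switch to a local memcached for development, a quick way to confirm the daemon is reachable before pointing the Enyim config at it is to send the text-protocol "version" command. A minimal sketch; the host and port assume a default local install:

import socket

HOST, PORT = "127.0.0.1", 11211  # default local memcached address

# Send the text-protocol "version" command and print the reply,
# e.g. "VERSION 1.6.x" if a local memcached is listening.
with socket.create_connection((HOST, PORT), timeout=5) as sock:
    sock.sendall(b"version\r\n")
    print(sock.recv(1024).decode().strip())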

Resources