Keras running in Docker is very slow and crashes - ValueError: Feature my_feature is not in features dictionary

I can run a Keras neural net locally on my Windows 10 laptop without problems.
But the same code running in Docker is extremely slow and always crashes with this error:
ValueError: Feature my_feature is not in features dictionary.
The feature that is not found is always the target feature.
There are version differences between the laptop and the container, but I'm not convinced they have any bearing on this.
Laptop
Windows 10 Enterprise 64bit
Intel Core i7-7820HQ @ 2.90GHz
16GB RAM
Python 3.6.5 (v3.6.5:f59c0932b4, Mar 28 2018, 17:00:18) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
λ pip list | grep tensorflow
tensorflow 2.0.0
tensorflow-estimator 2.0.1
λ pip list | grep pandas
pandas 0.23.3
pandas-ml 0.6.1
λ pip list | grep numpy
numpy 1.17.4
Docker
# cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 9 (stretch)"
NAME="Debian GNU/Linux"
VERSION_ID="9"
VERSION="9 (stretch)"
VERSION_CODENAME=stretch
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"
Python 3.6.10 (default, Apr 23 2020, 15:40:23)
[GCC 6.3.0 20170516] on linux
Type "help", "copyright", "credits" or "license" for more information.
root@modelbuilder:~# pip list | grep tensorflow
tensorflow 2.3.0
tensorflow-estimator 2.3.0
root@modelbuilder:~# pip list | grep pandas
pandas 0.24.0
pandas-ml 0.6.1
root@modelbuilder:~# pip list | grep numpy
numpy 1.19.2
I verified the points mentioned in the related question "ValueError: Feature not in features dictionary":
the target is not being fed into the feature columns, the features correspond, etc. A mistake of that kind would also fail locally.
Any help will be much appreciated.

I figured this out.
Crash issue:
I had created a feature column for the target by mistake, so I removed the target from the features used to build the feature columns.
Slow Docker:
The code was calling model.fit() over and over (many times).
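The crash fix described above can be sketched roughly as follows. This is a plain-Python illustration only, not the original code: the helper name split_features_and_target and the column names are hypothetical, and in a real project the resulting feature names would be handed to whatever builds the feature columns.

```python
def split_features_and_target(column_names, target_name):
    """Return the feature names with the target excluded, plus the target name."""
    if target_name not in column_names:
        raise KeyError(f"target {target_name!r} is not among the columns")
    # Only these names should get a feature column; the target is the label.
    feature_names = [name for name in column_names if name != target_name]
    return feature_names, target_name


# Hypothetical column names, for illustration only.
columns = ["age", "income", "my_feature"]
features, target = split_features_and_target(columns, "my_feature")
print(features)  # ['age', 'income']
print(target)    # my_feature
```

For the slowness, the equivalent check is simply to confirm that model.fit() is called once per training run and not inside an outer loop.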

Related

How to set up Raspberry Pi Buster and Intel NCS2 and OpenVINO with OpenCV trackers

Is there a definitive set of instructions to implement OpenCV trackers with OpenVINO and the now-obsolete NCS2 on a RPi 4b - Buster?
My understanding is that the last OpenVINO release to support the NCS2 was v2020.3.
I attempted to cross-compile using:
https://github.com/opencv/opencv/wiki/Intel-OpenVINO-backend#raspbian-buster
After installing opencv/opencv-contrib 4.5.5 from source:
$ python3
Python 3.7.3 (default, Oct 31 2022, 14:04:00)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
>>> cv2.__version__
'4.5.5'
>>> tracker = cv2.TrackerCSRT_create()
>>>
However, in a test.py script I have:
...
import cv2
net = cv2.dnn.readNetFromCaffe(_weights, _model)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_MYRIAD)
...
detections = net.forward()
I get the error relating to DNN_TARGET_MYRIAD:
cv2.error: OpenCV(4.5.5) /home/pi/opencv/modules/dnn/src/dnn.cpp:1414: error: (-215:Assertion failed) preferableBackend != DNN_BACKEND_OPENCV || preferableTarget == DNN_TARGET_CPU || preferableTarget == DNN_TARGET_OPENCL || preferableTarget == DNN_TARGET_OPENCL_FP16 in function 'setUpNet'
I then used this to install OpenVINO:
https://docs.openvino.ai/latest/openvino_docs_install_guides_installing_openvino_raspbian.html
but using this version of OpenVINO (as the last to support the NCS2):
https://storage.openvinotoolkit.org/repositories/openvino/packages/2020.3/l_openvino_toolkit_runtime...
I exported the paths to the new post cross-compiled opencv_install directory:
$ export PYTHONPATH=/home/pi/Desktop/opencv_install/lib/python2.7/dist-packages/:$PYTHONPATH
$ export PYTHONPATH=/home/pi/Desktop/opencv_install/lib/python3.7/site-packages/:$PYTHONPATH
$ export LD_LIBRARY_PATH=/home/pi/Desktop/opencv_install/lib/:$LD_LIBRARY_PATH
I set up the NCS2 with no errors:
$ sudo usermod -a -G users "$(whoami)"
$ sh /opt/intel/openvino_2020.3/install_dependencies/install_NCS_udev_rules.sh
then:
$ source /opt/intel/openvino_2020.3/bin/setupvars.sh
and then checked:
$ python3
Python 3.7.3 (default, Oct 31 2022, 14:04:00)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
>>> cv2.__version__
'4.3.0-openvino-2020.3.0'
>>> tracker = cv2.TrackerCSRT_create()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: module 'cv2' has no attribute 'TrackerCSRT_create'
>>>
If I open a new terminal, run $ source /opt/intel/openvino_2020.3/bin/setupvars.sh,
and then run a test.py script:
...
import cv2
net = cv2.dnn.readNetFromCaffe(_weights, _model)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_MYRIAD)
...
detections = net.forward()
...
I get a segmentation fault error.
So far I have not edited any of the setup scripts.
Thanks for any help! I'd like to put this NCS2 to work.
Generally, if you are able to run an OpenVINO demo with the NCS2 after following this installation guide, then you should be able to use that OpenCV functionality (provided that you have installed the correct OpenCV build).
It's recommended to use recent OpenVINO and OpenCV versions.
As indicated in the OpenVINO System Requirements, the current recommended OpenCV version is 4.5.
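The two interpreter sessions in the question report different cv2 versions (4.5.5 before sourcing setupvars.sh, 4.3.0-openvino-2020.3.0 after), which suggests that two different OpenCV builds are being imported. A minimal sketch for checking which build is active is shown below. The helper describe_cv2 is hypothetical, and a stand-in object replaces the real module so the snippet runs without OpenCV installed.

```python
import types


def describe_cv2(cv2_module):
    """Summarise which OpenCV build was imported and what it offers."""
    return {
        "version": getattr(cv2_module, "__version__", "unknown"),
        "file": getattr(cv2_module, "__file__", "unknown"),
        # Missing in builds that were compiled without this tracker.
        "has_csrt_tracker": hasattr(cv2_module, "TrackerCSRT_create"),
    }


# Stand-in for the OpenVINO-bundled build seen in the question; on a real
# system you would `import cv2` and call describe_cv2(cv2) instead.
fake_cv2 = types.SimpleNamespace(
    __version__="4.3.0-openvino-2020.3.0",
    __file__="/hypothetical/path/cv2.so",
)
report = describe_cv2(fake_cv2)
print(report)
```

Running the real check once in each terminal (with and without setupvars.sh sourced) should show whether PYTHONPATH is switching between the self-built OpenCV and the one bundled with OpenVINO.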

glob module is referenced from the system package instead of the Python venv

When I import glob in a Python venv environment, it resolves to the system package and not the virtual environment, even though the pandas module is resolved against the virtual environment.
I am using Python 3.8, and I created a virtual environment using python venv:
cd trial_3
python3 -m venv trial_3_env
When I use the glob module (which I haven't installed in the environment), no error is thrown; instead, the glob module from the system packages is used.
The terminal output below shows this:
(trial_3_env) anitta@vinjohn:~/Desktop/Study_Data_Engineering/virtualenv_trial/trial_3$ pip freeze
numpy==1.23.4
pyspark==3.3.0
python-dateutil==2.8.2
pytz==2022.6
six==1.16.0
(trial_3_env) anitta@vinjohn:~/Desktop/Study_Data_Engineering/virtualenv_trial/trial_3$ python3
Python 3.8.10 (default, Jun 22 2022, 20:18:18)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import glob
>>> glob.__file__
'/usr/lib/python3.8/glob.py'
>>>
I checked this behavior with the pandas module, and it works as expected: the import throws an error when pandas has not been installed.
(trial_3_env) anitta@vinjohn:~/Desktop/Study_Data_Engineering/virtualenv_trial/trial_3$ python3
Python 3.8.10 (default, Jun 22 2022, 20:18:18)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
>>> import pandas
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'pandas'
>>>
Could someone explain the cause of glob's behavior, and whether this can occur for other modules as well?
Thanks in advance!
@ChrisD's and @sinoroc's answers helped me. glob is part of the standard library. The standard library modules of a venv Python interpreter are loaded from the system Python installation, and the venv folder does not contain its own copy of the standard library.
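The explanation above can be checked from inside the interpreter. The following is a small sketch, assuming a standard CPython layout; the helper name module_origin is made up for this example. It compares a module's file path against the directories reported by sysconfig.

```python
import glob
import os
import sys
import sysconfig


def module_origin(module):
    """Classify a module as 'site-packages', 'stdlib' or 'other' by its file path."""
    file = getattr(module, "__file__", None)
    if not file:
        return "other"  # e.g. built-in modules have no file
    path = os.path.realpath(file)
    paths = sysconfig.get_paths()
    # Check site-packages first: on some layouts it sits inside the stdlib directory.
    for key in ("purelib", "platlib"):
        if path.startswith(os.path.realpath(paths[key]) + os.sep):
            return "site-packages"
    if path.startswith(os.path.realpath(paths["stdlib"]) + os.sep):
        return "stdlib"
    return "other"


in_venv = sys.prefix != sys.base_prefix
print("running inside a venv:", in_venv)
print("glob comes from:", module_origin(glob))
```

Third-party packages such as pandas would be reported as site-packages, which is the part a venv isolates.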

Issue installing OpenCV 4.1.2 on Jetson Nano. import cv2, No module named 'cv2'

I installed OpenCV 4.1.2 from source with CUDA support and had no issues. I then created a symbolic link from OpenCV's installation directory to my virtualenv:
ln -s /usr/local/lib/python3.6/site-packages/cv2/python3.6/cv2.cpython-36m-aarch64-linux-gnu.so cv2.so
I am having an issue with import cv2:
$ python
Python 3.6.9 (default, Nov 7 2019, 10:44:02)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'cv2'
>>>
I checked the site-packages directory and I can see cv2.so, so I am obviously missing something.
In my view the main issue is that the link into my virtualenv is not working; the installation itself works when I test it from the installation directory:
/usr/local/lib/python3.6/site-packages/cv2/python-3.6$ python
Python 3.6.9 (default, Nov 7 2019, 10:44:02)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
>>>
Issue solved; it was a very small mistake.
I had renamed the file from
cv2.cpython-36m-aarch64-linux-gnu.so to cv2.so
and then realized the problem was with one of the folders in the link path. This does the trick:
ln -s /usr/local/lib/python3.6/site-packages/cv2/python-3.6/cv2.so cv2.so
Notice it's python-3.6, not python3.6, after cv2.
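Because ln -s succeeds even when the target path does not exist, a mistyped folder name like this one produces a dangling link without any warning. A minimal sketch for detecting that case follows; the file names are placeholders created in a temporary directory, not the real OpenCV paths.

```python
import os
import tempfile


def is_dangling_symlink(path):
    """True if path is a symbolic link whose target does not exist."""
    # os.path.exists() follows the link, so it is False for a broken link.
    return os.path.islink(path) and not os.path.exists(path)


with tempfile.TemporaryDirectory() as tmp:
    good_target = os.path.join(tmp, "real_module.so")
    open(good_target, "w").close()

    good_link = os.path.join(tmp, "good.so")
    bad_link = os.path.join(tmp, "bad.so")
    os.symlink(good_target, good_link)
    # The target directory is misspelled on purpose, so this link is broken.
    os.symlink(os.path.join(tmp, "missing_dir", "real_module.so"), bad_link)

    good_is_dangling = is_dangling_symlink(good_link)
    bad_is_dangling = is_dangling_symlink(bad_link)

print(good_is_dangling)  # False
print(bad_is_dangling)   # True
```

Calling is_dangling_symlink on the cv2.so inside the virtualenv's site-packages would have flagged the original link.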

Installing TensorFlow-GPU

I am trying to install tensorflow-gpu. The problem is that I have the nvidia-375.82 driver, while TensorFlow expects 375.66.
When I got this error:
ImportError: libnvidia-fatbinaryloader.so.375.66: cannot open shared object file: No such file or directory
I tried to create a symbolic link:
sudo ln -s /usr/lib/nvidia-375/libnvidia-fatbinaryloader.so.375.82 /usr/lib/nvidia-375/libnvidia-fatbinaryloader.so.375.66
This avoids the ImportError, but nothing more. If I try to run something like
import tensorflow as tf
# Creates a graph.
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print(sess.run(c))
the result is computed on the CPU, and this is printed:
2017-10-07 15:56:03.329769: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-10-07 15:56:03.329832: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-10-07 15:56:03.329850: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-10-07 15:56:03.329864: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-10-07 15:56:03.329878: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-10-07 15:56:03.429055: E tensorflow/stream_executor/cuda/cuda_driver.cc:406] failed call to cuInit: CUDA_ERROR_NO_DEVICE
2017-10-07 15:56:03.429198: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:158] retrieving CUDA diagnostic information for host: sklert-new-comp
2017-10-07 15:56:03.429226: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:165] hostname: sklert-new-comp
2017-10-07 15:56:03.429317: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:189] libcuda reported version is: 375.66.0
2017-10-07 15:56:03.429384: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:369] driver version file contents: """NVRM version: NVIDIA UNIX x86_64 Kernel Module 375.82 Wed Jul 19 21:16:49 PDT 2017
GCC version: gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4)
"""
2017-10-07 15:56:03.429446: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:193] kernel reported version is: 375.82.0
2017-10-07 15:56:03.429473: E tensorflow/stream_executor/cuda/cuda_diagnostics.cc:303] kernel version 375.82.0 does not match DSO version 375.66.0 -- cannot find working devices in this configuration
Device mapping: no known devices.
2017-10-07 15:56:03.430336: I tensorflow/core/common_runtime/direct_session.cc:300] Device mapping:
MatMul: (MatMul): /job:localhost/replica:0/task:0/cpu:0
2017-10-07 15:56:03.467133: I tensorflow/core/common_runtime/simple_placer.cc:872] MatMul: (MatMul)/job:localhost/replica:0/task:0/cpu:0
b: (Const): /job:localhost/replica:0/task:0/cpu:0
2017-10-07 15:56:03.467201: I tensorflow/core/common_runtime/simple_placer.cc:872] b: (Const)/job:localhost/replica:0/task:0/cpu:0
a: (Const): /job:localhost/replica:0/task:0/cpu:0
2017-10-07 15:56:03.467226: I tensorflow/core/common_runtime/simple_placer.cc:872] a: (Const)/job:localhost/replica:0/task:0/cpu:0
[[ 22. 28.]
[ 49. 64.]]
Is there any way to use tensorflow with gpu without downgrading?
...
It seems that the problem is not in TensorFlow but in the NVIDIA drivers:
sudo dmesg | grep NVRM
[ 1.267417] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 375.82 Wed Jul 19 21:16:49 PDT 2017 (using threaded interrupts)
[ 108.803115] NVRM: API mismatch: the client has the version 375.66, but
NVRM: this kernel module has the version 375.82. Please
NVRM: make sure that this kernel module and all NVIDIA driver
NVRM: components have the same version.
[ 1419.021917] NVRM: API mismatch: the client has the version 375.66, but
NVRM: this kernel module has the version 375.82. Please
NVRM: make sure that this kernel module and all NVIDIA driver
NVRM: components have the same version.
Some driver libraries still have a different version:
locate 375.66
/usr/lib/i386-linux-gnu/libcuda.so.375.66
/usr/lib/i386-linux-gnu/libnvidia-opencl.so.375.66
/usr/lib/nvidia-375/libnvidia-fatbinaryloader.so.375.66
/usr/lib/x86_64-linux-gnu/libcuda.so.375.66
/usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.375.66
/usr/lib32/nvidia-375/libnvidia-fatbinaryloader.so.375.66
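The mismatch reported in the logs can be checked mechanically. The sketch below uses strings copied from the output above; the two helper functions are hypothetical names introduced here. On a real system the first string would typically come from the driver version file that TensorFlow prints (usually /proc/driver/nvidia/version, which is an assumption about this machine).

```python
import re


def kernel_module_version(nvrm_text):
    """Extract the driver version from an 'NVRM version' line."""
    match = re.search(r"Kernel Module\s+(\d+(?:\.\d+)+)", nvrm_text)
    return match.group(1) if match else None


def library_version(filename):
    """Extract the version suffix from a name like libcuda.so.375.66."""
    match = re.search(r"\.so\.(\d+(?:\.\d+)+)$", filename)
    return match.group(1) if match else None


# Strings copied from the output in the question.
nvrm_line = ("NVRM version: NVIDIA UNIX x86_64 Kernel Module 375.82 "
             "Wed Jul 19 21:16:49 PDT 2017")
libraries = [
    "/usr/lib/x86_64-linux-gnu/libcuda.so.375.66",
    "/usr/lib/nvidia-375/libnvidia-fatbinaryloader.so.375.66",
]

kernel = kernel_module_version(nvrm_line)
stale = [lib for lib in libraries if library_version(lib) != kernel]
print(kernel)  # 375.82
print(stale)   # both libraries, since they are still 375.66
```

Any library listed as stale belongs to a different driver release than the loaded kernel module, which is the condition the NVRM "API mismatch" message complains about.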

Python3 utf8 codecs not decoding as expected in Docker ubuntu:trusty

What really bugs me is that the Python on my laptop and the Python inside Docker's ubuntu:trusty image print different results with their codecs. What is the reason for that?
For example, python3 on my laptop (Ubuntu, trusty):
Python 3.4.3 (default, Apr 14 2015, 14:16:55)
[GCC 4.8.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import codecs
>>> codecs.decode(b'\xe2\x80\x99','utf8')
'’'
>>>
python3 on Docker ubuntu:latest:
Python 3.4.0 (default, Apr 11 2014, 13:05:11)
[GCC 4.8.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import codecs
>>> codecs.decode(b'\xe2\x80\x99','utf8')
'\u2019'
>>>
Can I make the python3 codecs on Docker's ubuntu:trusty decode b'\xe2\x80\x99' as '’'?
The following illustrates what was happening and how to fix it:
root@df329ec1fe88:/# python3
Python 3.4.0 (default, Apr 11 2014, 13:05:11)
[GCC 4.8.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import codecs
>>> codecs.decode(b'\xe2\x80\x99','utf8')
'\u2019'
>>> exit()
root@df329ec1fe88:/# locale -a
C
C.UTF-8
POSIX
root@df329ec1fe88:/# locale
LANG=
LANGUAGE=
LC_CTYPE="POSIX"
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=
root@df329ec1fe88:/# sudo locale-gen "en_US.UTF-8"
Generating locales...
en_US.UTF-8... done
Generation complete.
root@df329ec1fe88:/# sudo dpkg-reconfigure locales
Generating locales...
en_US.UTF-8... up-to-date
Generation complete.
root@df329ec1fe88:/# echo "export LC_ALL=en_US.utf8" >> ~/.bashrc
root@df329ec1fe88:/# echo "export LANG=en_US.utf8" >> ~/.bashrc
root@df329ec1fe88:/# echo "export LANGUAGE=en_US.utf8" >> ~/.bashrc
root@df329ec1fe88:/# source ~/.bashrc
root@df329ec1fe88:/# locale
LANG=en_US.utf8
LANGUAGE=en_US.utf8
LC_CTYPE="en_US.utf8"
LC_NUMERIC="en_US.utf8"
LC_TIME="en_US.utf8"
LC_COLLATE="en_US.utf8"
LC_MONETARY="en_US.utf8"
LC_MESSAGES="en_US.utf8"
LC_PAPER="en_US.utf8"
LC_NAME="en_US.utf8"
LC_ADDRESS="en_US.utf8"
LC_TELEPHONE="en_US.utf8"
LC_MEASUREMENT="en_US.utf8"
LC_IDENTIFICATION="en_US.utf8"
LC_ALL=en_US.utf8
root@df329ec1fe88:/# python3
Python 3.4.0 (default, Apr 11 2014, 13:05:11)
[GCC 4.8.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import codecs
>>> codecs.decode(b'\xe2\x80\x99','utf8')
'’'
>>> exit()
root@df329ec1fe88:/#
You could then commit this container as a new image for future use, or you could automate this process in your Dockerfile. Basically, add the following lines (note that exports appended to ~/.bashrc only take effect in interactive bash shells):
RUN locale-gen "en_US.UTF-8"
RUN dpkg-reconfigure locales
RUN echo "export LC_ALL=en_US.utf8" >> ~/.bashrc
RUN echo "export LANG=en_US.utf8" >> ~/.bashrc
RUN echo "export LANGUAGE=en_US.utf8" >> ~/.bashrc
This sounds like a locale configuration issue. Python could be behaving differently in the two locations because the terminal sessions it's running in are configured differently.
Check your locale settings on your Ubuntu Docker machine to see that you're in a UTF-8 locale in your terminal session. In particular, see if you've been switched over to C for your CTYPE. (I've seen that on servers before, though don't know why it happens.) That could make a difference as to whether the Python console considers it a printable character and thus whether to display it as itself or an escape sequence. This would affect other terminal programs, too.
I was able to reproduce this behavior in Python 3.4.0 on OS X by fiddling with the locale settings.
[# in ~]
$ locale
LANG="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL=
[# in ~]
$ python3.4
Python 3.4.0 (v3.4.0:04f714765c13, Mar 15 2014, 23:02:41)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import codecs
>>> codecs.decode(b'\xe2\x80\x99','utf8')
'’'
>>> quit()
[# in ~]
$ LC_CTYPE=C python3.4
Python 3.4.0 (v3.4.0:04f714765c13, Mar 15 2014, 23:02:41)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import codecs
>>> codecs.decode(b'\xe2\x80\x99','utf8')
'\u2019'
>>> quit()
If it's your locale settings doing it, you need to either set up your rc files on the Docker Ubuntu instance to configure your locale to be the appropriate UTF-8 locale for you, or get your locale settings to propagate through SSH or whatever connection method you're using, in order to configure your remote terminal session there. Propagating your locale through connections may make more sense because it could fix it for other servers or accounts you connect to as well.
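The difference between the two sessions can be imitated without changing any locale. This is a rough model, not a description of the exact code path on either machine: it assumes the interactive prompt falls back to backslash escapes when the output encoding cannot represent a character, as documented for sys.displayhook. The function name console_repr is made up for the example.

```python
def console_repr(value, stdout_encoding):
    """Approximate how the interactive prompt would show repr(value)."""
    text = repr(value)
    # Characters the encoding cannot represent become backslash escapes.
    return text.encode(stdout_encoding, "backslashreplace").decode(stdout_encoding)


decoded = b"\xe2\x80\x99".decode("utf8")
ascii_view = console_repr(decoded, "ascii")
utf8_view = console_repr(decoded, "utf-8")

print(ascii_view)                  # '\u2019'
print(utf8_view == repr(decoded))  # True
```

In both cases the decoded string is the same single character; only the way the console displays it changes. Checking sys.stdout.encoding in each environment shows which case applies.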
