Docker Spark 3.0.0 pyspark py4j.protocol.Py4JError - docker

I created a Docker image with Spark 3.0.0 that is meant to run PySpark from a Jupyter notebook. The issue appears when I run the image locally and test the following script:
import os
from pyspark import SparkContext, SparkConf
from pyspark.sql import SparkSession
print("*** START ***")
sparkConf = SparkConf()
sc = SparkContext(conf=sparkConf)
rdd = sc.parallelize(range(100000000))
print(rdd.sum())
print("*** DONE ***")
I get the following error:
Traceback (most recent call last):
File "test.py", line 9, in <module>
sc = SparkContext(conf=sparkConf)
File "/usr/local/lib/python3.7/dist-packages/pyspark/context.py", line 136, in __init__
conf, jsc, profiler_cls)
File "/usr/local/lib/python3.7/dist-packages/pyspark/context.py", line 213, in _do_init
self._encryption_enabled = self._jvm.PythonUtils.getEncryptionEnabled(self._jsc)
File "/usr/local/lib/python3.7/dist-packages/py4j/java_gateway.py", line 1487, in __getattr__
"{0}.{1} does not exist in the JVM".format(self._fqn, name))
py4j.protocol.Py4JError: org.apache.spark.api.python.PythonUtils.getEncryptionEnabled does not exist in the JVM
I've tried using findspark and reinstalling py4j with pip on the image, but nothing works, and I can't find any answers other than using findspark. Has anyone else solved this issue with Spark 3.0.0?

You are probably mixing different versions of PySpark and Spark.
See my complete answer here:
https://stackoverflow.com/a/66927923/14954327
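As a quick sanity check, here is a minimal sketch (illustrative, not from the linked answer) that prints the pip-installed PySpark version and the Spark installation the container points at; the two should match, e.g. both 3.0.0:
import os
import pyspark
# Version of the pip-installed PySpark package
print("pyspark:", pyspark.__version__)
# Spark installation the container points at, if SPARK_HOME is set;
# its version (see the RELEASE file in that directory) must match pyspark.__version__
print("SPARK_HOME:", os.environ.get("SPARK_HOME"))
# If they differ, pin them together, e.g.: pip install pyspark==3.0.0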

Related

Pyspark running in docker container cannot write file

I have a Docker container running PySpark, Hadoop and all the required dependencies. I am using spark-submit to query MinIO and I want to write the output DataFrame to a file. Reading the file works but writing does not. If I execute Python in that container and try to create a file at the same path, it works.
Am I missing some Spark configuration?
This is the error I get:
File "/usr/local/spark/python/lib/pyspark.zip/pyspark/sql/readwriter.py", line 1109, in save
File "/usr/local/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 1304, in __call__
File "/usr/local/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 111, in deco
File "/usr/local/spark/python/lib/py4j-0.10.9-src.zip/py4j/protocol.py", line 326, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o38.save
: java.net.ConnectException: Call From 10d3463d04ce/10.0.1.132 to localhost:9000 failed on connection exception:
java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
Relevant code:
spark = SparkSession.builder.getOrCreate()
spark_context = spark.sparkContext
spark_context._jsc.hadoopConfiguration().set('fs.s3a.access.key', 'minio')
spark_context._jsc.hadoopConfiguration().set('fs.s3a.secret.key', AWS_SECRET_ACCESS_KEY)
spark_context._jsc.hadoopConfiguration().set('fs.s3a.path.style.access', 'true')
spark_context._jsc.hadoopConfiguration().set('fs.s3a.impl', 'org.apache.hadoop.fs.s3a.S3AFileSystem')
spark_context._jsc.hadoopConfiguration().set('fs.s3a.endpoint', AWS_S3_ENDPOINT)
spark_context._jsc.hadoopConfiguration().set('fs.s3a.connection.ssl.enabled', 'false')
df = spark.sql(query)
df.show()  # this works perfectly fine
df.coalesce(1).write.format('json').save(output_path)  # here I get the error
The solution was to prepend file:// to output_path.
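For illustration, a minimal sketch of that fix (the local path is hypothetical). Without a scheme, Hadoop resolves the path against fs.defaultFS, which here points at a non-existent HDFS on localhost:9000; the file:// prefix forces the local filesystem instead:
output_path = 'file:///app/output/result.json'  # hypothetical path inside the container
df.coalesce(1).write.format('json').save(output_path)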

Airflow DockerOperator unable to mount tmp directory correctly

I am trying to run a simple Python script inside a Docker container, scheduled with Airflow.
I have followed the instructions here: Airflow init.
My .env file:
AIRFLOW_UID=1000
AIRFLOW_GID=0
The docker-compose.yaml is based on the default docker-compose.yaml. I had to add /var/run/docker.sock:/var/run/docker.sock as an additional volume to run Docker inside Docker, as shown below.
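A sketch of the relevant volumes section after that change (the first three entries come from the default compose file; adjust paths as needed):
volumes:
  - ./dags:/opt/airflow/dags
  - ./logs:/opt/airflow/logs
  - ./plugins:/opt/airflow/plugins
  - /var/run/docker.sock:/var/run/docker.sock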
My DAG is configured as follows:
""" this is an example dag """
from datetime import timedelta
from airflow import DAG
from airflow.operators.docker_operator import DockerOperator
from airflow.utils.dates import days_ago
from docker.types import Mount
default_args = {
'owner': 'airflow',
'depends_on_past': False,
'email': ['info#foo.com'],
'email_on_failure': True,
'email_on_retry': False,
'retries': 10,
'retry_delay': timedelta(minutes=5),
}
with DAG(
'msg_europe_etl',
default_args=default_args,
description='Process MSG_EUROPE ETL',
schedule_interval=timedelta(minutes=15),
start_date=days_ago(0),
tags=['satellite_data'],
) as dag:
download_and_store = DockerOperator(
task_id='download_and_store',
image='satellite_image:latest',
auto_remove=True,
api_version='1.41',
mounts=[Mount(source='/home/archive_1/archive/satellite_data',
target='/app/data'),
Mount(source='/home/dlassahn/projects/forecast-system/meteoIntelligence-satellite',
target='/app')],
command="python3 src/scripts.py download_satellite_images "
"{{ (execution_date - macros.timedelta(hours=4)).strftime('%Y-%m-%d %H:%M') }} "
"'msg_europe' ",
)
download_and_store
The Airflow log:
[2021-08-03 17:23:58,691] {docker.py:231} INFO - Starting docker container from image satellite_image:latest
[2021-08-03 17:23:58,702] {taskinstance.py:1501} ERROR - Task failed with exception
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.6/site-packages/docker/api/client.py", line 268, in _raise_for_status
response.raise_for_status()
File "/home/airflow/.local/lib/python3.6/site-packages/requests/models.py", line 943, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: http+docker://localhost/v1.41/containers/create
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.6/site-packages/airflow/models/taskinstance.py", line 1157, in _run_raw_task
self._prepare_and_execute_task_with_callbacks(context, task)
File "/home/airflow/.local/lib/python3.6/site-packages/airflow/models/taskinstance.py", line 1331, in _prepare_and_execute_task_with_callbacks
result = self._execute_task(context, task_copy)
File "/home/airflow/.local/lib/python3.6/site-packages/airflow/models/taskinstance.py", line 1361, in _execute_task
result = task_copy.execute(context=context)
File "/home/airflow/.local/lib/python3.6/site-packages/airflow/providers/docker/operators/docker.py", line 319, in execute
return self._run_image()
File "/home/airflow/.local/lib/python3.6/site-packages/airflow/providers/docker/operators/docker.py", line 258, in _run_image
tty=self.tty,
File "/home/airflow/.local/lib/python3.6/site-packages/docker/api/container.py", line 430, in create_container
return self.create_container_from_config(config, name)
File "/home/airflow/.local/lib/python3.6/site-packages/docker/api/container.py", line 441, in create_container_from_config
return self._result(res, True)
File "/home/airflow/.local/lib/python3.6/site-packages/docker/api/client.py", line 274, in _result
self._raise_for_status(response)
File "/home/airflow/.local/lib/python3.6/site-packages/docker/api/client.py", line 270, in _raise_for_status
raise create_api_error_from_http_exception(e)
File "/home/airflow/.local/lib/python3.6/site-packages/docker/errors.py", line 31, in create_api_error_from_http_exception
raise cls(e, response=response, explanation=explanation)
docker.errors.APIError: 400 Client Error for http+docker://localhost/v1.41/containers/create: Bad Request ("invalid mount config for type "bind": bind source path does not exist: /tmp/airflowtmp037k87u6")
Trying to set mount_tmp_dir=False yields a DAG ImportError because of an unknown keyword argument mount_tmp_dir (this might be an issue with the documentation).
Nevertheless I do not know how to configure the tmp directory correctly.
My Airflow Version: 2.1.2
There was a bug in Docker Provider 2.0.0 which prevented DockerOperator from running with a Docker-in-Docker setup.
You need to upgrade to the latest Docker Provider, 2.1.0:
https://airflow.apache.org/docs/apache-airflow-providers-docker/stable/index.html#id1
You can do it by extending the image as described in https://airflow.apache.org/docs/docker-stack/build.html#extending-the-image with, for example, this Dockerfile:
FROM apache/airflow
RUN pip install --no-cache-dir apache-airflow-providers-docker==2.1.0
In this case the operator will work out of the box in "fallback" mode (with a warning message), but you can also disable the mount that causes the problem. More explanation from https://airflow.apache.org/docs/apache-airflow-providers-docker/stable/_api/airflow/providers/docker/operators/docker/index.html:
By default, a temporary directory is created on the host and mounted into the container to allow storing files that together exceed the default disk size of 10GB in a container. The path to the mounted directory can be accessed via the environment variable AIRFLOW_TMP_DIR.
If the volume cannot be mounted, a warning is printed and an attempt is made to execute the docker command without the temporary folder mounted. This is to make it work by default with a remote docker engine, or when you run a docker-in-docker solution and the temporary directory is not shared with the docker engine. The warning is printed in the logs in this case.
If you know you run DockerOperator with a remote engine or via docker-in-docker, you should set the mount_tmp_dir parameter to False. In this case, you can still use the mounts parameter to mount already existing named volumes in your Docker Engine to achieve a similar capability, where you can store files exceeding the default disk size of the container.
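As an illustration of that last point, a minimal sketch of the task from the question with the temporary-directory mount disabled (assumes Docker Provider 2.1.0+ and the imports from the DAG above; the command is abbreviated):
download_and_store = DockerOperator(
    task_id='download_and_store',
    image='satellite_image:latest',
    auto_remove=True,
    mount_tmp_dir=False,  # skip the host temp-dir bind that fails with docker-in-docker
    mounts=[Mount(source='/home/archive_1/archive/satellite_data', target='/app/data')],
    command="python3 src/scripts.py download_satellite_images ...",  # abbreviated
)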
I had the same issue, and all the "recommended" ways of solving it here, including setting up the mount_tmp_dir parameter as described, just led to other errors. The one solution that helped me was wrapping the code invoked by Docker with a VPN connection (this hack was taken from another Docker-powered DAG that used a VPN and worked well).
So the final solution looks like:
#!/bin/bash
connect_to_vpn.sh &
sleep 10
python3 my_func.py
sleep 10
stop_vpn.sh
wait -n
exit $?
To connect to the VPN I used openconnect. The tool can be installed with apt install and supports the AnyConnect protocol (which was my crucial requirement).

Cannot use sqlite with the LocalExecutor [Airflow]

I am trying to restart the Airflow scheduler using the following command:
airflow scheduler
I am using Docker. I opened a CLI inside my Airflow container, and that is where I ran this command.
It throws an exception:
Traceback (most recent call last):
File "/usr/local/bin/airflow", line 25, in <module>
from airflow.configuration import conf
File "/usr/local/lib/python3.6/site-packages/airflow/__init__.py", line 31, in <module>
from airflow.utils.log.logging_mixin import LoggingMixin
File "/usr/local/lib/python3.6/site-packages/airflow/utils/__init__.py", line 24, in <module>
from .decorators import apply_defaults as _apply_defaults
File "/usr/local/lib/python3.6/site-packages/airflow/utils/decorators.py", line 36, in <module>
from airflow import settings
File "/usr/local/lib/python3.6/site-packages/airflow/settings.py", line 37, in <module>
from airflow.configuration import conf, AIRFLOW_HOME, WEBSERVER_CONFIG # NOQA F401
File "/usr/local/lib/python3.6/site-packages/airflow/configuration.py", line 731, in <module>
conf.read(AIRFLOW_CONFIG)
File "/usr/local/lib/python3.6/site-packages/airflow/configuration.py", line 421, in read
self._validate()
File "/usr/local/lib/python3.6/site-packages/airflow/configuration.py", line 213, in _validate
self._validate_config_dependencies()
File "/usr/local/lib/python3.6/site-packages/airflow/configuration.py", line 247, in _validate_config_dependencies
self.get('core', 'executor')))
airflow.exceptions.AirflowConfigException: error: cannot use sqlite with the LocalExecutor
I am looking for any way to restart the airflow scheduler.
This is expected.
Since SQLite doesn't support multiple connections, it can only be used with the SequentialExecutor. This is also explained in the docs.
If you want to use the LocalExecutor, please set MySQL or PostgreSQL as the backend.
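For illustration, the relevant airflow.cfg keys might look like this (the connection string is a placeholder; for the Airflow version shown in this traceback they live under [core]):
[core]
executor = LocalExecutor
sql_alchemy_conn = postgresql+psycopg2://airflow:airflow@postgres:5432/airflow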

Error while running bitbake to create a ROS package for BeagleBone

I am trying to compile the hello world program from the ROS tutorials for the BeagleBone Black using bitbake. I am using an Ubuntu PC and have set up the workspace as described in the user manual provided in the vmayoral GitHub link.
I have modified the local.conf file in the /build/conf folder; its contents look like this:
DL_DIR = "${OEBASE}/sources"
BBFILES = "${OEBASE}/openembedded/recipes/*/*.bb"
ASSUME_PROVIDED += "help2man-native"
PREFERRED_PROVIDERS = "virtual/qte:qte virtual/libqpe:libqpe-opie"
PREFERRED_PROVIDERS += " virtual/libsdl:libsdl-x11"
PREFERRED_PROVIDERS += " virtual/${TARGET_PREFIX}gcc-initial:gcc-cross-initial"
PREFERRED_PROVIDERS += " virtual/${TARGET_PREFIX}gcc-intermediate:gcc-cross-intermediate"
PREFERRED_PROVIDERS += " virtual/${TARGET_PREFIX}gcc:gcc-cross"
PREFERRED_PROVIDERS += " virtual/${TARGET_PREFIX}g++:gcc-cross"
MACHINE = "beaglebone"
DISTRO = "angstrom-2008.1"
IMAGE_FSTYPES = "jffs2 tar"
BBINCLUDELOGS = "yes"
The bitbake recipe is as below:
DESCRIPTION = "Beginner_tutorials, talker/listener ROS package"
SECTION = "devel"
LICENSE = "MIT"
LIC_FILES_CHKSUM = "file://package.xml;beginline=16;endline=16;md5=05c8b019cf5b0834bc5e547a14f26ca3"
DEPENDS = "roscpp catkin rospy std-msgs"
RDEPENDS = "roscpp rospy std-msgs"
SRC_URI = "git://github.com/vmayoral/beginner_tutorials.git"
SRCREV = "${AUTOREV}"
PV = "1.0.0+gitr${SRCPV}"
S = "${WORKDIR}/git"
inherit catkin
When I run bitbake test.bb from the oe/build folder I get the following error:
ERROR: Traceback (most recent call last):
File "/home/srijit/oe/bitbake/lib/bb/cookerdata.py", line 175, in wrapped
return func(fn, *args)
File "/home/srijit/oe/bitbake/lib/bb/cookerdata.py", line 185, in parse_config_file
return bb.parse.handle(fn, data, include)
File "/home/srijit/oe/bitbake/lib/bb/parse/__init__.py", line 107, in handle
return h['handle'](fn, data, include)
File "/home/srijit/oe/bitbake/lib/bb/parse/parse_py/ConfHandler.py", line 145, in handle
feeder(lineno, s, abs_fn, statements)
File "/home/srijit/oe/bitbake/lib/bb/parse/parse_py/ConfHandler.py", line 182, in feeder
raise ParseError("unparsed line: '%s'" % s, fn, lineno)
ParseError: ParseError at /home/srijit/oe/openembedded/conf/bitbake.conf:377: unparsed line: 'IMAGE_EXTRA_SPACE = 10240'
ERROR: Unable to parse conf/bitbake.conf: ParseError at /home/srijit/oe/openembedded/conf/bitbake.conf:377: unparsed line: 'IMAGE_EXTRA_SPACE = 10240'
I don't know what to do. Thanks in advance for the help.
After more searching on Google, I found here that we can't use the latest bitbake with openembedded-classic. So I tried bitbake 1.10 and this error went away, but now I have a new error:
Unknown Event: <bb.event.NoProvider instance at 0x7f05e40ee248>
ERROR: Nothing PROVIDES 'mobile-unit.bb'
Command execution failed: Traceback (most recent call last):
File "/home/srijit/oe/bitbake/lib/bb/command.py", line 88, in runAsyncCommand commandmethod(self.cmds_async, self, options)
File "/home/srijit/oe/bitbake/lib/bb/command.py", line 174, in buildTargets command.cooker.buildTargets(pkgs_to_build, task)
File "/home/srijit/oe/bitbake/lib/bb/cooker.py", line 782, in buildTargets
taskdata.add_provider(localdata, self.status, k)
File "/home/srijit/oe/bitbake/lib/bb/taskdata.py", line 354, in add_provider
self.add_provider_internal(cfgData, dataCache, item)
File "/home/srijit/oe/bitbake/lib/bb/taskdata.py", line 383, in add_provider_internal
raise bb.providers.NoProvider(item)
NoProvider: mobile-unit.bb
Finally I solved the issue; I thought it would be helpful for somebody else. The main problem was my limited understanding of the meta-ros layer and how it works, plus the overall (mis)direction in installing ROS on the BBB. I was trying to compile beagle-ros against the Angstrom distribution that came with the BBB, and that was the problem.
Instead, I downloaded the latest Angstrom distribution source on my Ubuntu PC and compiled it for the BBB as described here, with a few tweaks here and there.
Then flash that Angstrom distribution to an SD card and boot the BBB from it.
Then follow the instructions here to compile the beagle-ros layer and ROS packages, using the same bitbake setup you used for Angstrom, as discussed here and here.
Finally, copy the compiled ipk files to the BBB and install them using opkg; you can then run them on the BBB.

Python 3: running a script from one user works but from another doesn't?

When I run this script as the jenkins user (Linux Mint) I get the error below, but running it as my own user works. The jenkins user was created by the Jenkins service. I have installed virtualenv.
import unittest
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait

DRIVER = None

def getOrCreateWebdriver():
    global DRIVER
    DRIVER = DRIVER or webdriver.Firefox()
    return DRIVER

class LoginTest(unittest.TestCase):
    def setUp(self):
        self.browser = getOrCreateWebdriver()

    def test_Loggin(self):
        pass
        browser = self.browser

    def tearDown(self):
        self.browser.close()

if __name__ == '__main__':
    unittest.main(verbosity=2)
When I run this script as the jenkins user, I get this error:
test_Loggin (__main__.LoginTest) ... ERROR
/usr/lib/python3.4/unittest/case.py:602: ResourceWarning: unclosed file <_io.BufferedWriter name='/dev/null'>
outcome.errors.clear()
======================================================================
ERROR: test_Loggin (__main__.LoginTest)
----------------------------------------------------------------------
Traceback (most recent call last):
File "Test.py", line 16, in setUp
self.browser = getOrCreateWebdriver()
File "Test.py", line 10, in getOrCreateWebdriver
DRIVER = DRIVER or webdriver.Firefox()
File "/usr/local/lib/python3.4/dist-packages/selenium/webdriver/firefox/webdriver.py", line 64, in __init__
self.binary, timeout),
File "/usr/local/lib/python3.4/dist-packages/selenium/webdriver/firefox/extension_connection.py", line 51, in __init__
self.binary.launch_browser(self.profile)
File "/usr/local/lib/python3.4/dist-packages/selenium/webdriver/firefox/firefox_binary.py", line 70, in launch_browser
self._wait_until_connectable()
File "/usr/local/lib/python3.4/dist-packages/selenium/webdriver/firefox/firefox_binary.py", line 100, in _wait_until_connectable
raise WebDriverException("The browser appears to have exited "
selenium.common.exceptions.WebDriverException: Message: The browser appears to have exited before we could connect. If you specified a log_file in the FirefoxBinary constructor, check it for details.
While logged in as yourself, run echo $DISPLAY and note the display info it prints. Then, when you log in as the jenkins service user, run xhost + and then DISPLAY=[display-info]; export DISPLAY (display-info is what echo $DISPLAY printed; the square brackets should not be included in the command).
Hopefully this works. I don't have a similar environment to test with; I'm just recalling what I did quite some time back.