install %%R cell magic in docker from jupyter docker stack


I tried installing the datascience jupyter docker image (tag 45b8529a6bfc, last update Feb 14, 2019) from docker stacks. My entire dockerfile:
FROM jupyter/datascience-notebook:45b8529a6bfc
USER $NB_UID
When I open a new Jupyter notebook with an R kernel, the notebook works fine. When I try a %%R cell magic in an ipython notebook, it doesn't work:
%%R
3+4
UsageError: Cell magic `%%R` not found.
I went through various Stack Overflow answers and internet searches and tried installing rpy2 (it was already installed). Nothing worked.
Suggestions?

Load the jupyter extension before you try to use it:
%load_ext rpy2.ipython

I tried %load_ext rpy2.ipython as suggested by @lgautier, and got the error message No module named 'simplegeneric'. Once I pip installed simplegeneric, everything worked, and I don't need the load_ext statement.
Not sure why the dockerfile doesn't install simplegeneric, but there you have it.
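A quick way to spot a missing dependency like this before loading the extension is to ask Python which of the expected modules it can find. This is only a sketch: the helper name missing_modules is made up for illustration, and the module list is an assumption based on the error message above, not the full dependency list of rpy2.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that the import system cannot find."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when the parent package of a dotted name is absent,
            # or when an already-imported module has no usable __spec__.
            found = False
        if not found:
            missing.append(name)
    return missing


# Assumed list: rpy2 itself plus the module named in the error message.
print(missing_modules(["rpy2", "simplegeneric"]))
```

Anything it prints can then be installed with pip from the same environment the notebook kernel uses.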

Related

How to start graphframes on spark on pyspark on jupyter on docker?

Been playing with pyspark on jupyter all day with no issues. Just by using the docker image jupyter/pyspark-notebook, 90% of everything I need is packaged (YAY!)
I would like to start exploring using GraphFrames, which sits on top of GraphX which sits on top of Spark. Has anyone gotten this combination to work?
Essentially, according to the documentation, I just need to pass "--packages graphframes:xxyyzz" when running pyspark to download and run graphframes. The problem is that jupyter is already running as soon as the container comes up.
I've tried passing the "--packages" line as an environment variable (-e) for both JUYPTER_SPARK_OPTS and SPARK_OPTS when running docker run, and that didn't work. I found that I can do pip install graphframes from a terminal, which gets me part of the way: the Python libraries are installed, but the Java ones are not (java.lang.ClassNotFoundException: org.graphframes.GraphFramePythonAPI).
The image specifics documentation does not appear to offer any insights on how to deploy a Spark Package to the image.
Is there a certain place to throw the graphframes .jar? Is there a command to install a spark package post-docker? Is there a magic argument to docker run that would install this?
I bet there's a really simple answer to this. Or am I in high cotton here?
References:
No module named graphframes Jupyter Notebook
How do I run pyspark with jupyter notebook?

So the answer was quite simple:
Following the gist here, we simply need to tell jupyter to add the --packages line to PYSPARK_SUBMIT_ARGS by putting something like this at the top of the notebook. Spark goes out and installs the package when grabbing the context:
import os
os.environ['PYSPARK_SUBMIT_ARGS'] = '--packages graphframes:graphframes:0.8.1-spark3.0-s_2.12 pyspark-shell'
Keep an eye on the versions available for the graphframes package. For now, that means graphframes 0.8.1 on spark 3.0 on scala 2.12.
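If you need more than one package, the same environment variable can be built from a list. A minimal sketch, assuming it runs before any Spark context is created in the notebook; the helper name set_spark_packages is made up for illustration, and it relies on --packages accepting a comma-separated list of Maven coordinates:

```python
import os


def set_spark_packages(coordinates):
    """Set PYSPARK_SUBMIT_ARGS so that pyspark pulls the given packages.

    `coordinates` is a list of "group:artifact:version" strings.
    """
    args = "--packages {} pyspark-shell".format(",".join(coordinates))
    os.environ["PYSPARK_SUBMIT_ARGS"] = args
    return args


# Same coordinate as in the answer above; adjust to your Spark/Scala versions.
print(set_spark_packages(["graphframes:graphframes:0.8.1-spark3.0-s_2.12"]))
```

Setting the variable after the context already exists has no effect on that context, so restart the kernel if you change the list.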

Plotly shows blank graphs in AWS Sagemaker JupyterLab

Background: I am new to the Python world and am using Plotly for creating basic graphs in Python. I am using AWS Sagemaker's JupyterLab for creating the python scripts.
Issue: I have been trying to run the basic code examples from Plotly's website, but even those return blank graphs.
Resolution steps I have tried:
pip installed plotly version 4.6.0
Steps mentioned on https://plotly.com/python/getting-started/ for JupyterLab support have already been executed
Code Example:
import plotly.graph_objects as go
fig = go.Figure(data=go.Bar(y=[2, 3, 1]))
fig.show()

I recently had the same issue. A simple change suggested here helped me. I know this is a temporary workaround until a proper fix is found.
# fig = go.Figure()
fig = go.FigureWidget()  # replace with this
# fig.show()
fig  # remove .show()

Sagemaker notebook instances are, for some reason, using jupyterlab==1.2.21 (as of Jan 2022). You can verify that by running pip freeze | grep lab from the terminal or !pip freeze | grep lab from a notebook.
According to the documentation, you'll need to install the following jupyterlab extensions (which are not needed if sagemaker was running jupyterlab 3):
jupyterlab-plotly
@jupyter-widgets/jupyterlab-manager
You can install those on an up-and-running instance by running
jupyter labextension install jupyterlab-plotly@5.5.0 @jupyter-widgets/jupyterlab-manager
in the terminal or in a notebook (prefixed with ! if you are running it from the notebook, of course). Notice that the jupyterlab-plotly extension version (here 5.5.0) should match the plotly version you are installing; mismatches may cause issues. In this case my plotly version is 5.5.0, so that's also the jupyterlab-plotly version I've installed.
If you need, like I did, to have it ready upon spinning up a notebook instance, you'll need to:
Create a lifecycle script
To it, add:
PATH=$PATH:/home/ec2-user/anaconda3/envs/JupyterSystemEnv/bin - To make sure nodejs, which is needed for the extension installation, is on the path
pip install plotly==5.5.0 - To ensure a specific version
jupyter labextension install jupyterlab-plotly@5.5.0 @jupyter-widgets/jupyterlab-manager - To ensure the same version
Of course, you can change the version to the most up-to-date one.
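To keep the two versions in step, the install command can be derived from whatever plotly version is actually installed. This is a rough sketch, not part of the original answer: the helper names are made up, it assumes Python 3.8+ for importlib.metadata, and it only prints the command rather than running it.

```python
from importlib import metadata


def labextension_install_command(plotly_version):
    """Build the labextension install command pinned to `plotly_version`."""
    return (
        "jupyter labextension install "
        f"jupyterlab-plotly@{plotly_version} "
        "@jupyter-widgets/jupyterlab-manager"
    )


def installed_plotly_version():
    """Return the installed plotly version, or None if plotly is absent."""
    try:
        return metadata.version("plotly")
    except metadata.PackageNotFoundError:
        return None


# Fall back to the version used in this answer when plotly is not installed.
version = installed_plotly_version() or "5.5.0"
print(labextension_install_command(version))
```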

I think the documentation is not up to par. You now need to install the jupyterlab-plotly extension.
jupyter labextension install jupyterlab-plotly
UPDATE
I followed a mix of instructions here and here.
First, enable the Extension Manager in jupyter-lab,
then from the terminal
conda install -c conda-forge "nbformat" "ipywidgets>=7.5" -y
jupyter labextension install jupyterlab-plotly
jupyter labextension install @jupyter-widgets/jupyterlab-manager plotlywidget
And within your environment
conda install nbformat

Pytorch errors: "received an invalid combination of arguments" in Jupyter Notebook

I'm trying to learn Pytorch, but whenever I try an online tutorial (e.g. https://pytorch.org/tutorials/beginner/blitz/tensor_tutorial.html#sphx-glr-beginner-blitz-tensor-tutorial-py), I get errors when running certain functions, but only in Jupyter Notebook.
When running
x = torch.empty(5, 3)
I get an error:
module 'torch' has no attribute 'empty'
Furthermore, when running
x = torch.zeros(5, 3, dtype=torch.long)
I get the error:
module 'torch' has no attribute 'long'
Some other functions work fine like:
x = torch.rand(5, 3)
But generally, most code I try to run seems to run into an error really quickly. I couldn't find any resolution online.
When I go into my docker container and simply run python in the shell, I can run these lines just fine with no errors.
I'm running pytorch in a Docker image that I extended from a fastai image, as it already included things like jupyter notebook and pytorch. I used anaconda to update everything, and committed it to a new image for myself.
I have absolutely no idea what the issue could be. I've tried updating packages through anaconda, pip, aptitude in my docker container, and making sure to commit my changes, but nothing seems to work. I also tried creating a new kernel with python 3.7 as I noticed that my Jupyter Notebook only runs in 3.6.4, and when I run python in the shell it is at 3.7.
I've also tried getting different docker images and extending them with what I need, but all images that I've tried have had errors with anaconda where it gets stuck on "Solving environment" step.

OK, so the fix for me was to update pytorch through conda using the following command
conda update pytorch
If it's not installed yet, I've gotten it to work in other environments by simply installing it through conda
conda install pytorch
Kind of stupid that I didn't try this earlier, but I was confused about the difference between conda and pip.
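Since the notebook and the shell were on different Python versions, it can help to compare what each one actually uses. A small sketch, assuming Python 3.8+ (for importlib.metadata) and that the package is registered under the distribution name torch; run it once in a notebook cell and once in the shell's python, then compare the output:

```python
import sys
from importlib import metadata


def environment_report(packages=("torch",)):
    """Collect the interpreter path, Python version and package versions."""
    report = {
        "executable": sys.executable,
        "python": sys.version.split()[0],
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None  # not installed for this interpreter
    return report


print(environment_report())
```

If the executable paths or torch versions differ, the kernel is picking up another environment than the one you updated.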

How to install wkhtmltopdf for docker

I'm using wkhtmltopdf for nodejs and followed the instructions for the Windows installation (and added it to PATH after installation). When I start my app through bash, it works just fine, as it should. I manage to convert html to pdf.
But it doesn't work when I'm using docker; it's as if it doesn't even exist. I'm assuming there is some other way to install it for docker, or some way to add PATH to docker? Any other ideas or hints?
And before you say it, I've been googling it and looking for images and installations for docker, and none helped. Got one that you know works?

Anyway, for all the others who find themselves in the same pickle: I was trying to use wkhtmltopdf within a docker container while wkhtmltopdf was only installed and executable in the host system (windows/linux) environment, not in the actual docker environment. After updating the dockerfile to install wkhtmltopdf automatically during the build, I also had to SET THE PATH. For a linux docker image, something like this:
cp wkhtmltox/bin/* /usr/local/bin/ &&
That made everything work just as it should.
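To confirm that the binary is really visible inside the container, a PATH lookup is enough. A minimal sketch, assuming Python happens to be available in the image (the app itself is nodejs, so this is only a diagnostic); the helper name find_binary is made up for illustration:

```python
import shutil


def find_binary(name="wkhtmltopdf"):
    """Return the full path of `name` if it is on PATH, else None."""
    return shutil.which(name)


path = find_binary()
print(path if path else "wkhtmltopdf is not on PATH in this environment")
```

Run it inside the container (for example through docker exec), not on the host, since the two have separate filesystems and PATH settings.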

dockerfile: vim (compiled python), vim-ipython, and ipython notebook

I would like to build a Dockerfile in linux which
1. compiles vim with python
2. installs python stack (such as numpy, scipy, ipython, etc)
3. creates ssl certificate for ipython-notebook, to view the notebooks on host machine
It seemed straightforward enough. But I have run into problems despite a variety of approaches, such as linking separate containers, using anaconda, as well as with a single unified image vs separate layers, or creating a user or running all as a root.
Simply installing vim as root does not activate the pathogen bundle vim-ipython. Creating a user allows pathogen bundles to install (i.e. nerdtree works), but :IPython throws an error.
:IPython failed
^-- failed '' not found .
I've tried the above with no layers (one large Dockerfile), and with different layers for the python stack, vim, and the ipython notebook.
Dockerfile
What am I not seeing here?
What does the ^-- failed '' not found refer to?
I've tried running the ipython notebook using --no-browser & and then running vim, or running two shells on the same container, but I can't get past this error.

Here is a working Dockerfile for anyone trying to get vim-ipython working in Docker.
issues:
a user/shared home is needed for vim, despite the runtimepath in .vimrc pointing to pathogen/bundle
%connect_info >> required with containers
I am running as root; not sure why vim required a USER to install packages, but changing to USER would throw errors with CMD
--best
