What setting in Spyder should I change so that when I submit this code, I get identical-looking output for both print calls?
import pandas as pd
df = pd.DataFrame({'a': [1, 2, 3], 'b': [2, 3, 4]})
df.shape
print df.shape
I suspect that df.shape isn't exactly the same as print df.shape, but on any other installation of Spyder that I have used, both calls behave the same. I am using Python 2.7.12, conda 4.1.11 and pandas 0.18.1.
Currently, print df.shape correctly prints the shape of df in the console, while df.shape displays the shape of df in a bigger, dark font in the console.
A picture is worth a thousand words; what I am getting currently looks like this:
I am trying to use PIL to show an image. I know that I can use other modules to do that. I am working on Google Colab, but I can't figure out why PIL is not showing the output image.
%matplotlib inline
import numpy as np
from PIL import Image
im = Image.open('/content/drive/My Drive/images-process.jpeg')
print(im.width, im.height, im.mode, im.format, type(im))
im.show()
output: 739 415 RGB JPEG <class 'PIL.JpegImagePlugin.JpegImageFile'>
Instead of
im.show()
Try just
im
Colab should try to display it on its own. See example notebook
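In other words, something like this at the end of a cell (a minimal sketch; the path is the one from the question):
from PIL import Image

im = Image.open('/content/drive/My Drive/images-process.jpeg')
im  # as the last expression of a Colab cell, this renders the image inline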
Use
display(im)
instead of im.show() or im.
When using these options after multiple statements or in a loop, a bare im won't work.
After you open an image (which you have done using Image.open()), try converting it with im.convert() to whichever mode the image is in, then do display(im).
It will work.
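A minimal sketch of that flow (the path is taken from the question; display is available by default in Colab/Jupyter notebooks):
from PIL import Image
from IPython.display import display

im = Image.open('/content/drive/My Drive/images-process.jpeg')
im = im.convert('RGB')  # convert to whichever mode the image is in
display(im)             # works inside loops and after other statements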
I have successfully used map_blocks a few times on dask arrays. I'm now trying to deploy a numba function to act on each block, and to act and change one of the inputs.
The numba function takes in two numpy arrays and updates the second one; this is then returned in the return statement to make it available to map_blocks as a result.
The function works fine on a numpy array, but python just crashes when calling it from map_blocks. numba functions that do not act on an input array behave normally (although it is difficult to get them to do anything useful in this case).
Is this a known limitation? A bug? Am I using it wrong?!
Update
I've finally boiled it down to a reproducible example with a trivial numba function, and I get a clearer idea of the problem. However, I'm still unclear on how to resolve the issue. Here's the code:
import numpy as np
from numba import jit, float64, int64
from dask.distributed import Client, LocalCluster
import dask.array as da
cluster=LocalCluster()
c=Client(cluster)
size=int(1e5)
a=np.arange(size,dtype='float64')
b=np.zeros((size,),dtype='float64')
dista=da.from_array(a,chunks=size//4)
distb=da.from_array(b,chunks=size//4)
@jit(float64[:](float64[:], float64[:]))
def crasher(x, y):
    for i in range(x.shape[0]):
        y[i] = x[i] * 2
    return y

distc = da.map_blocks(crasher, dista, distb, dtype='float64')
c = distc.compute()  # it all crashes at this point
And I now get a more comprehensible error rather than just a straight up crash:
TypeError: No matching definition for argument type(s) readonly array(float64, 1d, C), readonly array(float64, 1d, C)
So if numba is receiving numpy arrays with write=False set, how do you get numba to do any useful work? You can't put an array creation line in the numba function, and you can't feed it writeable arrays.
Any views on how to achieve this?
Here is a version of your code with array creation, which runs fine in numba nopython mode:
import numpy as np
from numba import jit, float64, int64
from dask.distributed import Client, LocalCluster
import dask.array as da
cluster=LocalCluster()
c=Client(cluster)
size=int(1e5)
a=np.arange(size,dtype='float64')
dista=da.from_array(a,chunks=size//4)
@jit(nopython=True)
def crasher(x):
    y = np.empty_like(x)
    for i in range(x.shape[0]):
        y[i] = x[i] * 2
    return y
distc=da.map_blocks(crasher,dista,dtype='float64')
c=distc.compute()
Note the y = np.empty_like(x) line. See the list of numpy functions supported in nopython mode in the numba documentation.
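As a quick sanity check (not from the original answer, just an illustration assuming the snippet above has run), the computed result should simply be the doubled input:
assert np.allclose(c, a * 2)  # every element of the result is twice the corresponding input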
I am using dask to read a csv file. However, I couldn't apply or compute any operation on it because of this error:
Do you have any ideas what this error is all about and how to fix it?
When reading a csv file in dask, this error comes up when dask does not recognize the correct dtype of the columns.
For example, we read a csv file using dask as follows:
import dask.dataframe as dd
df = dd.read_csv(r'\data\file.txt', sep='\t', header='infer')
This prompts the error mentioned above.
To solve this problem, as suggested by @mrocklin in this comment, https://github.com/dask/dask/issues/1166, we need to determine the dtype of the columns. We can do this by reading the csv file in pandas, identifying the data types, and passing them as an argument when reading the csv with dask:
import pandas as pd

df_pd = pd.read_csv(r'\data\file.txt', sep='\t', header='infer')
dt = df_pd.dtypes.to_dict()
df = dd.read_csv(r'\data\file.txt', sep='\t', header='infer', dtype=dt)
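If the file is too large to read fully with pandas, a variant of the same idea (a sketch, not from the original answer) is to infer the dtypes from only a sample of rows, assuming that sample is representative of the whole file:
import pandas as pd
import dask.dataframe as dd

sample = pd.read_csv(r'\data\file.txt', sep='\t', nrows=10000)  # read a small sample only
dt = sample.dtypes.to_dict()                                    # e.g. {'a': dtype('int64'), ...}
df = dd.read_csv(r'\data\file.txt', sep='\t', dtype=dt)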
I've got scanned image files that I perform some preprocessing on and get them looking something like this:
My phone's ZBar app can read this QR code fine, but zbarimg seems to be unable to figure it out. I've tried all sorts of things in ImageMagick to make it smoother (-smooth, -morphology) but even with slightly better-looking results, zbarimg still comes up blank.
Why would my phone's ZBar be so much better than my computer's (zbar-0.10)? Is there anything I can do to get zbarimg to read this successfully?
You can try morphological closing.
Python code:
# -*- coding: utf-8 -*-
import qrtools
import cv2
import numpy as np
imgPath = "Fdnm1.png"
img = cv2.imread(imgPath, 0)                                # read the image as grayscale
kernel = np.ones((5, 5), np.uint8)                          # 5x5 structuring element
processed = cv2.morphologyEx(img, cv2.MORPH_CLOSE, kernel)  # morphological closing
cv2.imwrite('test.png', processed)
d = qrtools.QR(filename='test.png')
d.decode()
print d.data
Result:
1MB24
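If qrtools isn't available, a similar check can be done with the pyzbar package (my own substitution, not part of the original answer; pyzbar is a separate install on top of the zbar library):
from pyzbar.pyzbar import decode
from PIL import Image

results = decode(Image.open('test.png'))  # returns a list of decoded symbols
print(results[0].data if results else 'no QR code found')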
When I load ipython with any one of:
ipython qtconsole
ipython qtconsole --pylab
ipython qtconsole --pylab inline
The output buffer only holds the last 500 lines. To see this run:
for x in range(0, 501):
    print x
Is there a configuration option for this?
I've tried adjusting --cache-size but this does not seem to make a difference.
Quickly:
ipython qtconsole --IPythonWidget.buffer_size=1000
Or you can set it permanently by adding:
c.IPythonWidget.buffer_size=1000
in your ipython config file.
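For the file-based option, the config in question is the qtconsole config of your IPython profile. Assuming the default profile location, something like this should work (the exact set of files generated varies by IPython version):
ipython profile create
Then edit ~/.ipython/profile_default/ipython_qtconsole_config.py (creating it if it doesn't exist) and add:
c.IPythonWidget.buffer_size = 1000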
For discovering this sort of thing, a helpful trick is:
ipython qtconsole --help-all | grep PATTERN
For instance, you already had 'buffer', so:
$> ipython qtconsole --help-all | grep -C 3 buffer
...
--IPythonWidget.buffer_size=<Integer>
Default: 500
The maximum number of lines of text before truncation. Specifying a non-
positive number disables text truncation (not recommended).
If IPython used a different name than you expected and that first search turned up nothing, you could grep for 500 instead, since you know the default value you want to change; that would also find the relevant config option.
The accepted answer is no longer correct if you are using Jupyter. Instead, the command line option should be:
jupyter qtconsole --ConsoleWidget.buffer_size=5000
You can choose whatever value you want, just make it larger than the default of 500.
If you want to make this permanent, go to your home directory - C:\Users\username, /Users/username, or /home/username - then go into the .jupyter folder (create it if it doesn't exist), then create the file jupyter_qtconsole_config.py and open it up in your favorite editor. Add the following line:
c.ConsoleWidget.buffer_size=5000
Again, the number can be anything, just as long as it is an integer larger than 500. Don't worry that c isn't defined in this particular file, it is already defined elsewhere in the startup machinery.
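If you'd rather not create the file by hand, Jupyter apps can generate a commented template for you (to my knowledge this writes jupyter_qtconsole_config.py into the .jupyter folder):
jupyter qtconsole --generate-config
Then uncomment or add the buffer_size line as above.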
Thanks to @firescape for the pointer in the right direction.