When I load ipython with any one of:
ipython qtconsole
ipython qtconsole --pylab
ipython qtconsole --pylab inline
The output buffer only holds the last 500 lines. To see this run:
for x in range(0, 501):
...: print x
Is there a configuration option for this?
I've tried adjusting --cache-size but this does not seem to make a difference.
Quickly:
ipython qtconsole --IPythonWidget.buffer_size=1000
Or you can set it permanently by adding:
c.IPythonWidget.buffer_size=1000
in your ipython config file.
For discovering this sort of thing, a helpful trick is:
ipython qtconsole --help-all | grep PATTERN
For instance, you already had 'buffer', so:
$> ipython qtconsole --help-all | grep -C 3 buffer
...
--IPythonWidget.buffer_size=<Integer>
Default: 500
The maximum number of lines of text before truncation. Specifying a non-
positive number disables text truncation (not recommended).
If IPython had used a different name than you expected and that first search turned up nothing, you could grep for 500 instead, since you already knew the current value you wanted to change. That would also find the relevant option.
The accepted answer is no longer correct if you are using Jupyter. Instead, the command line option should be:
jupyter qtconsole --ConsoleWidget.buffer_size=5000
You can choose whatever value you want, just make it larger than the default of 500.
If you want to make this permanent, go to your home directory - C:\Users\username, /Users/username, or /home/username - then go into the .jupyter folder (create it if it doesn't exist), then create the file jupyter_qtconsole_config.py and open it up in your favorite editor. Add the following line:
c.ConsoleWidget.buffer_size=5000
Again, the number can be anything, as long as it is an integer larger than 500. Don't worry that c isn't defined in this particular file; it is already defined elsewhere in the startup machinery.
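If you would rather script the permanent setup described above, here is a minimal sketch. It assumes the config file lives at ~/.jupyter/jupyter_qtconsole_config.py as described; the helper name set_buffer_size and its arguments are made up for illustration and are not part of Jupyter.

```python
from pathlib import Path


def set_buffer_size(size, config_dir=None):
    """Append a ConsoleWidget.buffer_size line to jupyter_qtconsole_config.py.

    config_dir defaults to ~/.jupyter. The function name and arguments are
    illustrative only; Jupyter itself provides nothing by this name.
    """
    config_dir = Path(config_dir) if config_dir else Path.home() / ".jupyter"
    config_dir.mkdir(parents=True, exist_ok=True)  # create the folder if missing
    config_file = config_dir / "jupyter_qtconsole_config.py"
    line = "c.ConsoleWidget.buffer_size = {}\n".format(int(size))
    with config_file.open("a") as fh:  # append so existing settings survive
        fh.write(line)
    return config_file
```

Calling set_buffer_size(5000) once should be enough; running it repeatedly just appends the same line again, and the last assignment in the file wins.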
Thanks to @firescape for the pointer in the right direction.
The Spyder IDE logs the commands from the console in ~/.config/spyder-py3/history.py. However, it only stores about 1000 lines of history. How do I increase this limit?
I followed the advice in this post and increased the buffer size to 5000. However, that does not increase the length of the history file (I think it changes only the console buffer size).
In short, I am looking for the equivalent of HISTFILESIZE in .bashrc. I do not mind even if the buffer size remains short (say 500, the default size), i.e. I don't care about HISTSIZE equivalent of .bashrc.
P.S. there is another empty file in the config path: ~/.config/spyder-py3/history_internal.py . Don't know if that matters.
I am relatively new to machine learning/python/ubuntu.
I have a set of images in .jpg format where half contain a feature I want caffe to learn and half don't. I'm having trouble in finding a way to convert them to the required lmdb format.
I have the necessary text input files.
My question is can anyone provide a step by step guide on how to use convert_imageset.cpp in the ubuntu terminal?
Thanks
A quick guide to Caffe's convert_imageset
Build
First thing you must do is build caffe and caffe's tools (convert_imageset is one of these tools).
After installing caffe and running make, make sure you ran make tools as well.
Verify that a binary file convert_imageset is created in $CAFFE_ROOT/build/tools.
Prepare your data
Images: put all images in a folder (I'll call it here /path/to/jpegs/).
Labels: create a text file (e.g., /path/to/labels/train.txt) with one line per input image. For example:
img_0000.jpeg 1
img_0001.jpeg 0
img_0002.jpeg 0
In this example the first image is labeled 1 while the other two are labeled 0.
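If you don't have the label file yet, it can be generated with a few lines of Python. This is only a sketch: the function names are hypothetical, and it assumes you can supply the names of the images that contain the feature (label 1), with everything else labeled 0.

```python
import os


def label_lines(image_names, positive_names):
    """Return 'filename label' lines: 1 for names in positive_names, else 0."""
    positives = set(positive_names)
    return ["{} {}".format(name, 1 if name in positives else 0)
            for name in sorted(image_names)]


def write_label_file(image_dir, positive_names, out_path):
    """List the .jpg/.jpeg files in image_dir and write one label line each."""
    names = [n for n in os.listdir(image_dir)
             if n.lower().endswith((".jpg", ".jpeg"))]
    with open(out_path, "w") as fh:
        fh.write("\n".join(label_lines(names, positive_names)) + "\n")
```

For example, label_lines(["img_0001.jpeg", "img_0000.jpeg"], ["img_0000.jpeg"]) gives the first two lines of the listing above.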
Convert the dataset
Run the binary in shell
~$ GLOG_logtostderr=1 $CAFFE_ROOT/build/tools/convert_imageset \
--resize_height=200 --resize_width=200 --shuffle \
/path/to/jpegs/ \
/path/to/labels/train.txt \
/path/to/lmdb/train_lmdb
Command line explained:
Setting GLOG_logtostderr to 1 before calling convert_imageset tells the logging mechanism to redirect log messages to stderr.
--resize_height and --resize_width resize all input images to the same size, 200x200.
--shuffle randomly changes the order of the images, so the order in the /path/to/labels/train.txt file is not preserved.
These are followed by the path to the images folder, the labels text file and the output name. Note that the output must not exist before you call convert_imageset, otherwise you'll get a scary error message.
Other flags that might be useful:
--backend - allows you to choose between an lmdb dataset or levelDB.
--gray - convert all images to gray scale.
--encoded and --encoded_type - keep image data in encoded (jpg/png) compressed form in the database.
--help - shows some help, see all relevant flags under Flags from tools/convert_imageset.cpp
You can check out $CAFFE_ROOT/examples/imagenet/convert_imagenet.sh
for an example how to use convert_imageset.
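If you prefer to drive the conversion from Python, the shell command above can be assembled and launched with subprocess. This is a rough sketch, not part of Caffe: the function names are made up, only the flags already shown above are used, and the check for an existing output directory simply mirrors the note about the error message.

```python
import os
import subprocess


def convert_imageset_cmd(caffe_root, image_dir, label_file, out_db,
                         height=200, width=200, shuffle=True):
    """Build the argument list for the convert_imageset call shown above."""
    cmd = [os.path.join(caffe_root, "build", "tools", "convert_imageset"),
           "--resize_height={}".format(height),
           "--resize_width={}".format(width)]
    if shuffle:
        cmd.append("--shuffle")
    cmd += [image_dir, label_file, out_db]
    return cmd


def run_convert(cmd):
    """Run the command with GLOG_logtostderr=1; refuse to reuse an output path."""
    if os.path.exists(cmd[-1]):
        raise RuntimeError("output database already exists: " + cmd[-1])
    env = dict(os.environ, GLOG_logtostderr="1")
    return subprocess.call(cmd, env=env)
```

Keep the trailing slash on the images folder, as in the shell example, since the file names from the label file are appended to it.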
I'm using ImageMagick (with Wand in Python) to convert images and to get thumbnails from them. However, I noticed that I need to verify whether a file is an image or not ahead of time. Should I do this with Identify?
I would assume that checking the integrity of a file requires the whole file to be read into memory. Is it better to just try to convert the file, and treat an error as a sign that the file wasn't good?
seems like you answered your own question
$ ls -l *.png
-rw-r--r-- 1 jsp jsp 526254 Jul 20 12:10 image.png
-rw-r--r-- 1 jsp jsp 10000 Jul 20 12:12 image_with_error.png
$ identify image.png &> /dev/null; echo $?
0
$ identify image_with_error.png &> /dev/null; echo $?
0
$ convert image.png /dev/null &> /dev/null ; echo $?
0
$ convert image_with_error.png /dev/null &> /dev/null ; echo $?
1
If you specify the -regard-warnings flag with the ImageMagick identify tool
magick identify -regard-warnings myimage.jpg
it will throw an error if there are any warnings about the file. This is good for checking images, and seems to be a lot faster than using verbose.
If you use Python, you can also consider the Pillow module.
In my experiments, I have used both the Python Pillow module (PIL) and the ImageMagick wrapper Wand (for psd, xcf formats) in order to detect broken images; the original answer with code snippets is here.
Update:
I also implemented this solution in my Python script here on GitHub.
I also verified that damaged files (jpg) frequently are not 'broken' images, i.e., a damaged picture file sometimes remains a legitimate picture file: the original image is lost or altered, but you are still able to load it.
End Update
I quote the full answer for completeness:
You can use Python Pillow(PIL) module, with most image formats, to check if a file is a valid and intact image file.
If you also aim to detect broken images, @Nadia Alramli correctly suggests the im.verify() method, but this does not detect all possible image defects, e.g., im.verify does not detect truncated images (which most viewers load with a greyed area).
Pillow is able to detect these types of defects too, but you have to apply an image manipulation or an image decode/recode in order to trigger the check. In the end I suggest using this code:
from PIL import Image

try:
    im = Image.open(filename)
    im.verify()  # I also run verify; I don't know whether it catches other kinds of defects
    im.close()  # reopening is necessary in my case
    im = Image.open(filename)
    im.transpose(Image.FLIP_LEFT_RIGHT)  # forces the pixel data to be decoded
    im.close()
except Exception:
    # manage exceptions here
    pass
In case of image defects this code will raise an exception.
Please consider that im.verify is about 100 times faster than performing the image manipulation (and I think that flip is one of the cheaper transformations).
With this code you are going to verify a set of images at about 10 MBytes/sec (using single thread of a modern 2.5Ghz x86_64 CPU).
For the other formats psd,xcf,.. you can use Imagemagick wrapper Wand, the code is as follows:
import wand.image

im = wand.image.Image(filename=filename)
im.flip()  # flip is a method, so it has to be called
im.close()
But, from my experiments Wand does not detect truncated images, I think it loads lacking parts as greyed area without prompting.
I read that ImageMagick has an external command, identify, that could do the job, but I have not found a way to invoke it programmatically and I have not tested this route.
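One way to call identify from Python would be through subprocess, checking the exit status as in the shell examples elsewhere in this thread. This is an untested sketch: the function names are hypothetical, it assumes an identify binary on the PATH (on newer installs the command may be magick identify), and whether -regard-warnings catches a given kind of damage depends on the format and the ImageMagick version.

```python
import subprocess


def identify_cmd(path):
    """Argument list for ImageMagick's identify with warnings treated as errors."""
    return ["identify", "-regard-warnings", path]


def looks_intact(path, run=subprocess.call):
    """Return True if identify exits with status 0 for this file.

    The run parameter exists only so the call can be replaced in tests.
    """
    status = run(identify_cmd(path),
                 stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return status == 0
```

Passing the arguments as a list also means file names with spaces need no extra quoting.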
I suggest always performing a preliminary check first: verifying that the file size is not zero (or very small) is very cheap:
import os

statfile = os.stat(filename)
filesize = statfile.st_size
if filesize == 0:
    # manage here the 'faulty image' case
    pass
Here's another solution using identify, but without convert:
identify -verbose *.png 2>&1 | grep "corrupt image"
identify: corrupt image 'image_with_error.png' @ error/png.c/ReadPNGImage/4051.
i use identify:
$ identify image.tif
00000005.tif TIFF 4741x6981 4741x6981+0+0 8-bit DirectClass 4.471MB 0.000u 0:00.010
$ echo $?
When I use Apple's Automator simply to scale down a bunch of images, Automator also reduces the quality of the files (jpg) and they get blurry.
How can I prevent this? Are there settings that I can take control of?
Edit:
Or are there any other tools that do the same job but without affecting the image quality?
If you want to have finer control over the amount of JPEG compression, as kopischke said you'll have to use the sips utility, which can be used in a shell script. Here's how you would do that in Automator:
First get the files and the compression setting:
The Ask for Text action should not accept any input (right-click on it, select "Ignore Input").
Make sure that the first Get Value of Variable action is not accepting any input (right-click on it, select "Ignore Input"), and that the second Get Value of Variable takes its input from the first. This creates an array that is then passed on to the shell script. The first item in the array is the compression level that was given to the Automator script. The second is the list of files that the script will run the sips command on.
In the options on the top of the Run Shell Script action, select "/bin/bash" as the Shell and select "as arguments" for Pass Input. Then paste this code:
itemNumber=0
compressionLevel=0
for file in "$@"
do
    if [ "$itemNumber" = "0" ]; then
        compressionLevel=$file
    else
        echo "Processing $file"
        filename="$file"
        sips -s format jpeg -s formatOptions $compressionLevel "$file" --out "${filename%.*}.jpg"
    fi
    ((itemNumber=itemNumber+1))
done
((itemNumber=itemNumber-1))
osascript -e "tell app \"Automator\" to display dialog \"${itemNumber} Files Converted\" buttons {\"OK\"}"
If you click on Results at the bottom, it'll tell you what file it's currently working on. Have fun compressing!
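For anyone who would rather call sips from Python than from a shell action, the same invocation can be sketched as follows. The function names are hypothetical, and only the sips options already used in the script above appear here.

```python
import os
import subprocess


def sips_jpeg_cmd(path, compression_level):
    """Build the sips call from the shell script above for a single file."""
    out_path = os.path.splitext(path)[0] + ".jpg"
    return ["sips", "-s", "format", "jpeg",
            "-s", "formatOptions", str(compression_level),
            path, "--out", out_path]


def convert_all(paths, compression_level):
    """Run sips on every path and return how many calls exited with status 0."""
    converted = 0
    for path in paths:
        if subprocess.call(sips_jpeg_cmd(path, compression_level)) == 0:
            converted += 1
    return converted
```

Because each argument is a separate list item, file names containing spaces are passed through unchanged.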
Automator’s “Crop Images” and “Scale Images” actions have no quality settings; as is often the case with Automator, simplicity trumps configurability. However, there is another way to access CoreImage’s image manipulation facilities without resorting to Cocoa programming: the Scriptable Image Processing System, which makes image processing functions available to the shell via the sips utility. You can fiddle with the most minute settings using this, but as it is a bit arcane in handling, you might be better served with the second way, AppleScript via Image Events, a scriptable faceless background application provided by OS X. There are crop and scale commands, and the option of specifying a compression level when saving as a JPEG with
save <image> as JPEG with compression level (low|medium|high)
Use a “Run AppleScript” action instead of your “Crop” / “Scale” one and wrap the Image Events commands in a tell application "Image Events" block, and you should be set. For instance, to scale the image to half its size and save as a JPEG in best quality, overwriting the original:
on run {input, parameters}
    set output to {}
    repeat with aPath in input
        tell application "Image Events"
            set aPicture to open aPath
            try
                scale aPicture by factor 0.5
                set end of output to save aPicture as JPEG with compression level low
            on error errorMessage
                log errorMessage
            end try
            close aPicture
        end tell
    end repeat
    return output -- next action processes edited files.
end run
For other scales, adjust the factor accordingly (1 = 100 %, .5 = 50 %, .25 = 25 % etc.); for a crop, replace scale aPicture by factor X with crop aPicture to {width, height}. Mac OS X Automation has good tutorials on the usage of both scale and crop.
Eric's code is just brilliant. It gets most jobs done.
But if an image's filename contains a space, this workflow will not work (the space breaks the shell script when it runs sips).
There is a simple solution for this: add a "Rename Finder Items" action to the workflow
and replace spaces with "_" or anything you like.
Then it's good to go.
Comment from '20
I changed the script into a quick action, without any prompts (for compression as well as confirmation). It duplicates the file and renames the original version to _original. I also included nyam's solution for the 'space' problem.
You can download the workflow file here: http://mobilejournalism.blog/files/Compress%2080%20percent.workflow.zip (file is zipped, because otherwise it will be recognized as a folder instead of workflow file)
Hopefully this is useful for anyone searching for a solution like this (like I did an hour ago).
Comment from '17
To avoid "space" problem, it's smarter to change IFS than renaming.
Back up current IFS and change it to \n only. And restore original IFS after the processing loop.
ORG_IFS=$IFS
IFS=$'\n'
for file in $@
do
    ...
done
IFS=$ORG_IFS
I am using libsvm for binary classification. I wanted to try grid.py, as it is said to improve results. I ran this script on five files in separate terminals, and it has been running for more than 12 hours.
This is the state of my 5 terminals now:
[root@localhost tools]# python grid.py sarts_nonarts_feat.txt>grid_arts.txt
Warning: empty z range [61.3997:61.3997], adjusting to [60.7857:62.0137]
line 2: warning: Cannot contour non grid data. Please use "set dgrid3d".
Warning: empty z range [61.3997:61.3997], adjusting to [60.7857:62.0137]
line 4: warning: Cannot contour non grid data. Please use "set dgrid3d".
[root@localhost tools]# python grid.py sgames_nongames_feat.txt>grid_games.txt
Warning: empty z range [64.5867:64.5867], adjusting to [63.9408:65.2326]
line 2: warning: Cannot contour non grid data. Please use "set dgrid3d".
Warning: empty z range [64.5867:64.5867], adjusting to [63.9408:65.2326]
line 4: warning: Cannot contour non grid data. Please use "set dgrid3d".
[root@localhost tools]# python grid.py sref_nonref_feat.txt>grid_ref.txt
Warning: empty z range [62.4602:62.4602], adjusting to [61.8356:63.0848]
line 2: warning: Cannot contour non grid data. Please use "set dgrid3d".
Warning: empty z range [62.4602:62.4602], adjusting to [61.8356:63.0848]
line 4: warning: Cannot contour non grid data. Please use "set dgrid3d".
[root@localhost tools]# python grid.py sbiz_nonbiz_feat.txt>grid_biz.txt
Warning: empty z range [67.9762:67.9762], adjusting to [67.2964:68.656]
line 2: warning: Cannot contour non grid data. Please use "set dgrid3d".
Warning: empty z range [67.9762:67.9762], adjusting to [67.2964:68.656]
line 4: warning: Cannot contour non grid data. Please use "set dgrid3d".
[root@localhost tools]# python grid.py snews_nonnews_feat.txt>grid_news.txt
Wrong input format at line 494
Traceback (most recent call last):
File "grid.py", line 223, in run
if rate is None: raise "get no rate"
TypeError: exceptions must be classes or instances, not str
I had redirected the outputs to files, but for now those files contain nothing.
The following files were created:
sbiz_nonbiz_feat.txt.out
sbiz_nonbiz_feat.txt.png
sarts_nonarts_feat.txt.out
sarts_nonarts_feat.txt.png
sgames_nongames_feat.txt.out
sgames_nongames_feat.txt.png
sref_nonref_feat.txt.out
sref_nonref_feat.txt.png
snews_nonnews_feat.txt.out (--> is empty )
There's just one line of information in the .out files.
The .png files are gnuplot plots.
But I don't understand what the gnuplot output and the warnings above mean. Should I re-run the scripts?
Can anyone please tell me how much time this script might take if each input file contains about 144000 lines?
Thanks and regards
Your data is huge: 144,000 lines. So this will take some time. I have used data as large as yours and it took up to a week to finish. If you are using images, which I suppose you are given the amount of data, try resizing your images before creating the data. You should get approximately the same results with the images resized.
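To get a rough feel for why it takes so long: grid.py trains one model per (C, gamma) pair and per cross-validation fold. The sketch below just counts those training runs. It assumes the default ranges from grid.py's usage text as I remember them (-log2c -5,15,2, -log2g 3,-15,-2, 5-fold cross validation); check your copy of grid.py, because the defaults and the helper names here are assumptions.

```python
def grid_values(begin, end, step):
    """Values from begin to end (inclusive) in increments of step."""
    values = []
    v = begin
    while (step > 0 and v <= end) or (step < 0 and v >= end):
        values.append(v)
        v += step
    return values


def count_trainings(log2c=(-5, 15, 2), log2g=(3, -15, -2), folds=5):
    """Number of SVM trainings for a full grid with n-fold cross validation."""
    n_c = len(grid_values(*log2c))
    n_g = len(grid_values(*log2g))
    return n_c * n_g * folds


print(count_trainings())  # 11 values of C * 10 values of gamma * 5 folds = 550
```

With those assumed defaults, each of your five input files means 550 separate trainings on most of the 144,000 lines, so a coarser grid or a subsample of the data shortens the run considerably.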
The libSVM faq speaks to your question:
Q: Why grid.py/easy.py sometimes generates the following warning message?
Warning: empty z range [62.5:62.5], adjusting to [61.875:63.125]
Notice: cannot contour non grid data!
Nothing is wrong and please disregard the message. It is from gnuplot when drawing the contour.
As a side note, you can parallelize your grid.py operations. The libSVM tools directory README file has this to say on the matter:
Parallel grid search
You can conduct a parallel grid search by dispatching jobs to a
cluster of computers which share the same file system. First, you add
machine names in grid.py:
ssh_workers = ["linux1", "linux5", "linux5"]
and then setup your ssh so that the authentication works without
asking a password.
The same machine (e.g., linux5 here) can be listed more than once if
it has multiple CPUs or has more RAM. If the local machine is the
best, you can also enlarge the nr_local_worker. For example:
nr_local_worker = 2
In my Ubuntu 10.04 installation grid.py is actually /usr/bin/svm-grid.py
I guess grid.py is trying to find the optimal value for C (or Nu)?
I don't have an answer for the amount of time it will take, but you might want to try this SVM library, even though it's an R package: svmpath.
As described on that page there, it will compute the entire "regularization path" for a two class SVM classifier in about as much time as it takes to train an SVM using one value of your penalty param C (or Nu).
So, instead of training and doing cross validation for an SVM with a value x for your C parameter, then doing all of that again for value x+1 for C, x+2, etc., you can just train the SVM once, then query its predictive performance for different values of C post-facto, so to speak.
Change:
if rate is None: raise "get no rate"
in line 223 in grid.py to:
if rate is None: raise ValueError("get no rate")
Also, try adding:
gnuplot.write("set dgrid3d\n")
after this line in grid.py:
gnuplot.write("set contour\n")
This should fix your warnings and errors, but I am not sure if it will work, since grid.py seems to think your data has no rate.
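The "Wrong input format at line 494" message for the news file suggests the data itself is malformed, which would explain why no rate comes back. Here is a small sketch for locating such lines before starting a long grid search. It assumes the usual "label index:value index:value ..." layout with ascending integer indices; the function names are made up, and the check may be stricter or looser than LIBSVM's own parser (as far as I know, the LIBSVM tools directory also ships a checkdata.py for this purpose).

```python
def is_valid_libsvm_line(line):
    """Check one line for the 'label index:value ...' layout."""
    parts = line.split()
    if not parts:
        return False
    try:
        float(parts[0])  # the label must be numeric
        last_index = -1
        for item in parts[1:]:
            index_text, value_text = item.split(":")
            index = int(index_text)
            float(value_text)
            if index <= last_index:  # indices must be strictly ascending
                return False
            last_index = index
    except ValueError:
        return False
    return True


def bad_lines(path):
    """Return the 1-based numbers of the lines that fail the check."""
    with open(path) as fh:
        return [n for n, line in enumerate(fh, 1)
                if not is_valid_libsvm_line(line)]
```

Running bad_lines("snews_nonnews_feat.txt") should then point at line 494 or whatever else is wrong in that file.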