Caret doesn't run in parallel - r-caret

Actually parallelizing caret depends on the R, caret, and doMC packages, as described in Parallelizing Caret code.
Does anyone work with a similar environment to mine? What is the latest R version in which caret parallelization works correctly?
> sessionInfo()
R version 3.2.1 (2015-06-18)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.2 LTS
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=C LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] caret_6.0-52 ggplot2_1.0.1 lattice_0.20-31 doMC_1.3.3 iterators_1.0.7 foreach_1.4.2 RStudioAMI_0.2
loaded via a namespace (and not attached):
[1] Rcpp_0.12.1 magrittr_1.5 splines_3.2.1 MASS_7.3-41 munsell_0.4.2 colorspace_1.2-6
[7] minqa_1.2.4 car_2.1-0 stringr_1.0.0 plyr_1.8.3 tools_3.2.1 pbkrtest_0.4-2
[13] nnet_7.3-9 grid_3.2.1 gtable_0.1.2 nlme_3.1-120 mgcv_1.8-6 quantreg_5.19
[19] MatrixModels_0.4-1 gtools_3.5.0 lme4_1.1-9 digest_0.6.8 Matrix_1.2-0 nloptr_1.0.4
[25] reshape2_1.4.1 codetools_0.2-11 stringi_0.5-5 BradleyTerry2_1.0-6 scales_0.3.0 stats4_3.2.1
[31] SparseM_1.7 brglm_0.5-9 proto_0.3-10
Update 1:
My code follows:
library(doMC) ; registerDoMC(cores=4)
library(caret)
classification_formula <- as.formula(paste("target" ,"~",
paste(names(m_input_data)[!names(m_input_data)=='target'],collapse="+")))
CVfolds <- 2
CVreps <- 5
ma_control <- trainControl(method = "repeatedcv",
number = CVfolds,
repeats = CVreps ,
returnResamp = "final" ,
classProbs = T,
summaryFunction = twoClassSummary,
allowParallel = TRUE,verboseIter = TRUE)
rf_tuneGrid = expand.grid(mtry = seq(2,32, length.out = 6))
rf <- train(classification_formula , data = m_input_data , method = "rf", metric="ROC" ,trControl = ma_control, tuneGrid = rf_tuneGrid , ntree = 101)
Update 2:
When I run the script from the command line, only one core is working.
When I run the script from RStudio, the parallelization works, since I see 4
processes via top. But a second after that, this error happens:
Error in names(resamples) <- gsub("^\\.", "", names(resamples)) :
attempt to set an attribute on NULL
Update 4:
It seems the problem was that the R session had been terminated. Each time I started the AWS instance, I was running the R code without refreshing the R engine. Now, each time I refresh the RStudio browser, I do Session -> Restart R, and it seems to run.
I am now checking whether the same holds when running the script from the Ubuntu command line.
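One quick sanity check after such a restart (a sketch using foreach's introspection helpers, not something from the original post) is to confirm that the doMC backend is actually registered in the current session before train() is called:

library(doMC)
library(foreach)

registerDoMC(cores = 4)
getDoParName()      # should print "doMC"
getDoParWorkers()   # should print 4; a value of 1 means train() will run serially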
Generally it runs but never finishes. Caret parallelizes on the data level, meaning it can process each resample on a different process. But if each resample is still big (100,000 / 2 (number of folds = 2) rows x 2,000 features), it can be hard for each processor unit to finish. Am I right?
I think the parallelism must also happen on the algorithm level, i.e. each algorithm run should itself use several cores. Is such an algorithm implementation available in caret?
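For what it's worth, here is a hedged sketch of algorithm-level parallelism via ranger, which multithreads each forest fit through num.threads (this assumes a caret version that supports method = "ranger", and reuses classification_formula, m_input_data and ma_control from the question; it is not the method = "rf" setup above):

library(caret)
library(ranger)

# Note: if a doMC backend is also registered, resample-level and ranger-level
# threads can oversubscribe the CPU; consider allowParallel = FALSE in that case.
rf_ranger <- train(classification_formula,
                   data        = m_input_data,
                   method      = "ranger",
                   metric      = "ROC",
                   trControl   = ma_control,
                   tuneLength  = 3,
                   num.trees   = 101,
                   num.threads = 4)   # threads used inside each forest fit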

I have the latest release for Linux platforms, R version 3.2.2 (2015-08-14, Fire Safety), and parallelization works fine. Can you provide your code that does not work in parallel?
> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.3 LTS
locale:
[1] LC_CTYPE=en_CA.UTF-8 LC_NUMERIC=C LC_TIME=en_CA.UTF-8 LC_COLLATE=en_CA.UTF-8
[5] LC_MONETARY=en_CA.UTF-8 LC_MESSAGES=en_CA.UTF-8 LC_PAPER=en_CA.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] kernlab_0.9-22 doMC_1.3.3 iterators_1.0.7 foreach_1.4.2 caret_6.0-52 ggplot2_1.0.1 lattice_0.20-33
loaded via a namespace (and not attached):
[1] Rcpp_0.12.0 compiler_3.2.2 nloptr_1.0.4 plyr_1.8.3 tools_3.2.2 digest_0.6.8
[7] lme4_1.1-9 nlme_3.1-122 gtable_0.1.2 mgcv_1.8-7 Matrix_1.2-2 brglm_0.5-9
[13] SparseM_1.7 proto_0.3-10 BradleyTerry2_1.0-6 stringr_1.0.0 gtools_3.5.0 MatrixModels_0.4-1
[19] stats4_3.2.2 grid_3.2.2 nnet_7.3-10 minqa_1.2.4 reshape2_1.4.1 car_2.0-26
[25] magrittr_1.5 scales_0.3.0 codetools_0.2-11 MASS_7.3-43 splines_3.2.2 pbkrtest_0.4-2
[31] colorspace_1.2-6 quantreg_5.18 stringi_0.5-5 munsell_0.4.2
I've used your code for the BreastCancer dataset on my local machine and it worked in parallel without any problem. I am using RStudio Version 0.98.1103.
library(caret)
library(mlbench)
data(BreastCancer)
library(doMC)
registerDoMC(cores=2)
classification_formula <- as.formula(paste("Class" ,"~",
paste(names(BreastCancer)[!names(BreastCancer)=='Class'],collapse="+")))
CVfolds <- 2
CVreps <- 5
ma_control <- trainControl(method = "repeatedcv",
number = CVfolds,
repeats = CVreps ,
returnResamp = "final" ,
classProbs = T,
summaryFunction = twoClassSummary,
allowParallel = TRUE,verboseIter = TRUE)
rf_tuneGrid = expand.grid(mtry = seq(2,32, length.out = 6))
#Notice, it might be easier just to use Class~.
#instead of classification_formula
rf <- train(classification_formula ,
data = BreastCancer ,
method = "rf",
metric="ROC" ,
trControl = ma_control,
tuneGrid = rf_tuneGrid ,
ntree = 101)
> rf
Random Forest
699 samples
10 predictors
2 classes: 'benign', 'malignant'
No pre-processing
Resampling: Cross-Validated (2 fold, repeated 5 times)
Summary of sample sizes: 341, 342, 342, 341, 342, 341, ...
Resampling results across tuning parameters:
mtry ROC Sens Spec ROC SD Sens SD Spec SD
2 0.9867820 1.0000000 0.0000000 0.005007691 0.000000000 0.000000000
8 0.9899107 0.9549550 0.9640196 0.002243649 0.006714919 0.017247716
14 0.9907072 0.9558559 0.9631933 0.003028258 0.012345228 0.008019979
20 0.9909514 0.9635135 0.9556513 0.003268291 0.006864342 0.010471005
26 0.9911480 0.9630631 0.9539706 0.003384987 0.005113930 0.010628533
32 0.9911485 0.9657658 0.9522969 0.002973508 0.004842197 0.004090206
ROC was used to select the optimal model using the largest value.
The final value used for the model was mtry = 32.
>

Related

onnxruntime inference is way slower than pytorch on GPU

I was comparing the inference times for an input using PyTorch and ONNX Runtime, and I find that ONNX Runtime is actually slower on GPU while being significantly faster on CPU.
I was trying this on Windows 10.
ONNX Runtime installed from source - ONNX Runtime version: 1.11.0 (onnx version 1.10.1)
Python version - 3.8.12
CUDA/cuDNN version - cuda version 11.5, cudnn version 8.2
GPU model and memory - Quadro M2000M, 4 GB
Relevant code -
import torch
from torchvision import models
import onnxruntime # to inference ONNX models, we use the ONNX Runtime
import onnx
import os
import time
batch_size = 1
total_samples = 1000
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
def convert_to_onnx(resnet):
    resnet.eval()
    dummy_input = (torch.randn(batch_size, 3, 224, 224, device=device)).to(device=device)
    input_names = ['input']
    output_names = ['output']
    torch.onnx.export(resnet,
                      dummy_input,
                      "resnet18.onnx",
                      verbose=True,
                      opset_version=13,
                      input_names=input_names,
                      output_names=output_names,
                      export_params=True,
                      do_constant_folding=True,
                      dynamic_axes={'input': {0: 'batch_size'},    # variable length axes
                                    'output': {0: 'batch_size'}})
def infer_pytorch(resnet):
    print('Pytorch Inference')
    print('==========================')
    print()

    x = torch.randn((batch_size, 3, 224, 224))
    x = x.to(device=device)

    latency = []
    for i in range(total_samples):
        t0 = time.time()
        resnet.eval()
        with torch.no_grad():
            out = resnet(x)
        latency.append(time.time() - t0)

    print('Number of runs:', len(latency))
    print("Average PyTorch {} Inference time = {} ms".format(device.type, format(sum(latency) * 1000 / len(latency), '.2f')))
def to_numpy(tensor):
    return tensor.detach().cpu().numpy() if tensor.requires_grad else tensor.cpu().numpy()
def infer_onnxruntime():
    print('Onnxruntime Inference')
    print('==========================')
    print()

    onnx_model = onnx.load("resnet18.onnx")
    onnx.checker.check_model(onnx_model)

    # Input
    x = torch.randn((batch_size, 3, 224, 224))
    x = x.to(device=device)
    x = to_numpy(x)

    so = onnxruntime.SessionOptions()
    so.execution_mode = onnxruntime.ExecutionMode.ORT_SEQUENTIAL
    so.graph_optimization_level = onnxruntime.GraphOptimizationLevel.ORT_ENABLE_ALL

    exproviders = ['CUDAExecutionProvider', 'CPUExecutionProvider']

    model_onnx_path = os.path.join(".", "resnet18.onnx")
    ort_session = onnxruntime.InferenceSession(model_onnx_path, so, providers=exproviders)

    options = ort_session.get_provider_options()
    cuda_options = options['CUDAExecutionProvider']
    cuda_options['cudnn_conv_use_max_workspace'] = '1'
    ort_session.set_providers(['CUDAExecutionProvider'], [cuda_options])

    # IOBinding
    input_names = ort_session.get_inputs()[0].name
    output_names = ort_session.get_outputs()[0].name
    io_binding = ort_session.io_binding()
    io_binding.bind_cpu_input(input_names, x)
    io_binding.bind_output(output_names, device)

    # warm up run
    ort_session.run_with_iobinding(io_binding)
    ort_outs = io_binding.copy_outputs_to_cpu()

    latency = []
    for i in range(total_samples):
        t0 = time.time()
        ort_session.run_with_iobinding(io_binding)
        latency.append(time.time() - t0)
        ort_outs = io_binding.copy_outputs_to_cpu()

    print('Number of runs:', len(latency))
    print("Average onnxruntime {} Inference time = {} ms".format(device.type, format(sum(latency) * 1000 / len(latency), '.2f')))
if __name__ == '__main__':
    torch.cuda.empty_cache()
    resnet = (models.resnet18(pretrained=True)).to(device=device)
    convert_to_onnx(resnet)
    infer_onnxruntime()
    infer_pytorch(resnet)
Output
If run on CPU,
Average onnxruntime cpu Inference time = 18.48 ms
Average PyTorch cpu Inference time = 51.74 ms
but, if run on GPU, I see
Average onnxruntime cuda Inference time = 47.89 ms
Average PyTorch cuda Inference time = 8.94 ms
If I change the graph optimization level to onnxruntime.GraphOptimizationLevel.ORT_DISABLE_ALL, I see some improvement in inference time on GPU, but it's still slower than PyTorch.
I use IO binding for the input tensor (a NumPy array), and the nodes of the model are on the GPU.
Further, during the processing for onnxruntime, I print device usage stats and I see this -
Using device: cuda:0
GPU Device name: Quadro M2000M
Memory Usage:
Allocated: 0.1 GB
Cached: 0.1 GB
So, GPU device is being used.
Further, I have used the resnet18.onnx model from the ModelZoo to see if it is a converted-model issue, but I get the same results.
What am I doing wrong or missing here?
When calculating inference time, exclude from the loop all code that should be run only once, such as resnet.eval() (see the sketch after the timings below).
Please include the imports in the example:
import torch
from torchvision import models
import onnxruntime # to inference ONNX models, we use the ONNX Runtime
import onnx
import os
import time
After running your example on GPU only, I found that the times differ only by about 2x, so the speed difference may be caused by framework characteristics. For more details, explore ONNX conversion optimization.
Onnxruntime Inference
==========================
Number of runs: 1000
Average onnxruntime cuda Inference time = 4.76 ms
Pytorch Inference
==========================
Number of runs: 1000
Average PyTorch cuda Inference time = 2.27 ms
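For reference, a minimal sketch of the PyTorch timing loop with the one-off calls hoisted out, as suggested above (resnet, x, device and total_samples are the names from the question's script):

import time
import torch

resnet.eval()                          # run once, not inside the timed loop
latency = []
with torch.no_grad():                  # set up once as well
    for _ in range(total_samples):
        t0 = time.time()
        out = resnet(x)
        latency.append(time.time() - t0)

print('Number of runs:', len(latency))
print('Average PyTorch {} inference time = {:.2f} ms'.format(
    device.type, sum(latency) * 1000 / len(latency)))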
For CPU you don't need to use IO binding; it is required only for GPU.
Also, don't change the session options, as onnxruntime selects the best options by default.
The following things may help to speed up the GPU:
Make sure to install onnxruntime-gpu, which comes with prebuilt CUDA EP and TensorRT EP.
You are currently binding the inputs and outputs to the CPU. When using onnxruntime with the CUDA EP, you should bind them to the GPU (to avoid copying inputs/outputs between CPU and GPU); refer here.
I suggest you use the io_binding.bind_input() method instead of io_binding.bind_cpu_input().
for i in range(total_samples):
    t0 = time.time()
    ort_session.run_with_iobinding(io_binding)
    latency.append(time.time() - t0)
->  ort_outs = io_binding.copy_outputs_to_cpu()
Copying the output from GPU to CPU on every iteration, 1000 times, drops the performance.
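A hedged sketch of what that could look like with the input and output kept on the GPU (this reuses ort_session, x, input_names, output_names and total_samples from the question's script, and assumes onnxruntime-gpu; bind_ortvalue_input() is one way to do it, bind_input() with a raw device pointer is another):

import time
import onnxruntime

# Put the NumPy input into GPU memory once, outside the timed loop.
x_ortvalue = onnxruntime.OrtValue.ortvalue_from_numpy(x, 'cuda', 0)

io_binding = ort_session.io_binding()
io_binding.bind_ortvalue_input(input_names, x_ortvalue)
io_binding.bind_output(output_names, 'cuda')   # let ORT allocate the output on the GPU

ort_session.run_with_iobinding(io_binding)     # warm-up run

latency = []
for _ in range(total_samples):
    t0 = time.time()
    ort_session.run_with_iobinding(io_binding)
    latency.append(time.time() - t0)

# Copy the result back to the CPU only once, after timing.
ort_outs = io_binding.copy_outputs_to_cpu()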

Calculate average gap size in time series by extracting data from imputeTS functions

I need to calculate the average gap size of a univariate time-series data set. The imputeTS package generates plots using these data. Is it possible to extract the 'gap size' and the 'number of occurrences' from either statsNA or ggplot_na_gapsize?
Or is there any other way to find the average size of gaps in a time-series data set?
(You could use tsNH4 data set from the imputeTS package)
(This is my first time asking questions here and I'm fairly new to 'r')
At the moment, with the CRAN version of imputeTS, you can get the average gap size only indirectly, with some extra work. One such route is sketched below.
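A minimal sketch of that indirect route using only base R on the NA indicator (no imputeTS internals; the numbers should match the 'Number of Gaps' and 'Average Gap Size' that statsNA reports further down):

library(imputeTS)

# Run-length encode the NA indicator and keep only the runs of TRUE (the gaps).
na_runs   <- rle(as.vector(is.na(tsNH4)))
gap_sizes <- na_runs$lengths[na_runs$values]

length(gap_sizes)   # number of gaps
mean(gap_sizes)     # average gap size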
But I made a quick update to the development version on GitHub.
Now you can also get the average gap size with the statsNA function.
To use it, you have to install the new version from GitHub first (since it is not on CRAN yet):
library("devtools")
install_github("SteffenMoritz/imputeTS")
If you do not have "devtools" installed, then also install this library at the very beginning
install.packages("devtools")
Afterwards just use the imputeTS package as usual.
library("imputeTS")
#Example with the tsNH4 dataset
statsNA(tsNH4)
This will now print you the following:
> statsNA(tsNH4)
[1] "Length of time series:"
[1] 4552
[1] "-------------------------"
[1] "Number of Missing Values:"
[1] 883
[1] "-------------------------"
[1] "Percentage of Missing Values:"
[1] "19.4%"
[1] "-------------------------"
[1] "Number of Gaps:"
[1] 155
[1] "-------------------------"
[1] "Average Gap Size:"
[1] 5.696774
[1] "-------------------------"
[1] "Stats for Bins"
[1] " Bin 1 (1138 values from 1 to 1138) : 233 NAs (20.5%)"
[1] " Bin 2 (1138 values from 1139 to 2276) : 433 NAs (38%)"
[1] " Bin 3 (1138 values from 2277 to 3414) : 135 NAs (11.9%)"
[1] " Bin 4 (1138 values from 3415 to 4552) : 82 NAs (7.21%)"
[1] "-------------------------"
[1] "Longest NA gap (series of consecutive NAs)"
[1] "157 in a row"
[1] "-------------------------"
[1] "Most frequent gap size (series of consecutive NA series)"
[1] "1 NA in a row (occuring 68 times)"
[1] "-------------------------"
[1] "Gap size accounting for most NAs"
[1] "157 NA in a row (occuring 1 times, making up for overall 157 NAs)"
As you can see, 'Number of Gaps' and 'Average Gap Size' are newly added to the output.
You can also access the output as a variable:
library("imputeTS")
# To actually get an output object, set print_only to FALSE
out <- statsNA(tsNH4, print_only = F)
# Average gap size
out$average_size_na_gaps
# Number of Gaps
out$number_na_gaps
#Number of NAs
out$number_NAs
The updates will also be in the next CRAN release (thanks for the suggestion).
Just be a little bit careful, since it is a development version and thus not as thoroughly tested as the CRAN version.

TypeError("Tensor is unhashable if Tensor equality is enabled. " K.learning_phase(): 0

I am porting a Keras, Tensorflow, and OpenCV script to TF2 and Keras 2 and have run into a problem. I am getting an error on K.learning_phase(): 0.
The error happens in this code section.
def detect_image(self, image):
    if self.model_image_size != (None, None):
        assert self.model_image_size[0] % 32 == 0, 'Multiples of 32 required'
        assert self.model_image_size[1] % 32 == 0, 'Multiples of 32 required'
        boxed_image = image_preporcess(np.copy(image), tuple(reversed(self.model_image_size)))
        image_data = boxed_image

    out_boxes, out_scores, out_classes = self.sess.run(
        [self.boxes, self.scores, self.classes],
        feed_dict={
            self.yolo_model.input: image_data,
            self.input_image_shape: [image.shape[0], image.shape[1]],
            K.learning_phase(): 0})
Here is a gist with the full code:
https://gist.github.com/robisen1/31976de17af9e752c6ba8d1dd0e08906
Traceback (most recent call last):
File "webcam_detect.py", line 188, in <module>
r_image, ObjectsList = yolo.detect_image(frame)
File "webcam_detect.py", line 110, in detect_image
K.learning_phase(): 0
File "C:\Anaconda3\envs\simplecv\lib\site-packages\tensorflow_core\python\framework\ops.py", line 705, in __hash__
raise TypeError("Tensor is unhashable if Tensor equality is enabled. "
TypeError: Tensor is unhashable if Tensor equality is enabled. Instead, use tensor.experimental_ref() as the key.
(simplecv) PS C:\dev\lacv\yolov3\yolov3ct>
I am not sure what is going on. I would appreciate any insights.
You are trying to use TensorFlow 1.x code, which works in graph mode, whereas TensorFlow 2.x works in eager mode. TensorFlow 1.x requires users to manually stitch together an abstract syntax tree (the graph) by making tf.* API calls, and then to manually compile it by passing a set of output tensors and input tensors to a session.run() call. TensorFlow 2.0 executes eagerly (like Python normally does), and in 2.0, graphs and sessions should feel like implementation details.
The error is due to the version difference. If you are using a session in TF2, then you need to use the compatibility APIs, and the same goes for other operations. Also, in TF2 it is tf.keras.backend.learning_phase.
I would recommend going through the guide Migrate your TensorFlow 1 code to TensorFlow 2.
For example, the code below throws an error similar to the one you are facing:
import tensorflow as tf
print(tf.__version__)
x = tf.constant(5)
y = tf.constant(10)
z = tf.constant(20)
# This will show same error.
tensor_set = {x, y, z}
tensor_dict = {x: 'five', y: 'ten', z: 'twenty'}
Output -
2.2.0
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-3-509b2d8d7ab1> in <module>()
6
7 # This will show same error.
----> 8 tensor_set = {x, y, z}
9 tensor_dict = {x: 'five', y: 'ten', z: 'twenty'}
10
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py in __hash__(self)
724 if (Tensor._USE_EQUALITY and executing_eagerly_outside_functions() and
725 (g is None or g.building_function)):
--> 726 raise TypeError("Tensor is unhashable. "
727 "Instead, use tensor.ref() as the key.")
728 else:
TypeError: Tensor is unhashable. Instead, use tensor.ref() as the key.
But the code below will fix the issue:
import tensorflow as tf
print(tf.__version__)
x = tf.constant(5)
y = tf.constant(10)
z = tf.constant(20)
#This solves the issue
tensor_set = {x.experimental_ref(), y.experimental_ref(), z.experimental_ref()}
tensor_dict = {x.experimental_ref(): 'five', y.experimental_ref(): 'ten', z.experimental_ref(): 'twenty'}
Output -
2.2.0
WARNING:tensorflow:From <ipython-input-4-05e379e669d9>:12: Tensor.experimental_ref (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use ref() instead.
If you are still facing the error, then kindly share the reproducible code for the error like above. Will be happy to help you.
Hope this answers your question. Happy Learning.
Try disabling eager execution with tf.compat.v1.disable_eager_execution():
from tensorflow.compat.v1 import disable_eager_execution
disable_eager_execution()
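A minimal, self-contained sketch (a toy graph, not the asker's YOLO code) of why this helps: once eager execution is disabled, K.learning_phase() is a graph tensor again and can be used as a feed_dict key without the hashability error:

import numpy as np
import tensorflow as tf
from tensorflow.compat.v1 import disable_eager_execution

disable_eager_execution()                      # back to TF1-style graph mode
K = tf.keras.backend

x = tf.compat.v1.placeholder(tf.float32, shape=(None, 3), name='x')
y = x * 2.0

with tf.compat.v1.Session() as sess:
    out = sess.run(y, feed_dict={
        x: np.ones((2, 3), dtype=np.float32),
        K.learning_phase(): 0,                 # usable as a dict key again
    })
    print(out)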

Planet NDVI calculation: ModuleNotFoundError: No module named 'rasterio'

I'm performing NDVI calculation on a Planet Scope 4 band image as per Planet's documentation
The following block of code is what I wrote:
Extract band data from original image in working directory:

import rasterio
import numpy

image_file = "20170430_194027_0c82_3B_AnalyticMS"

with rasterio.open(image_file) as src:
    band_red = src.read(3)
with rasterio.open(image_file) as src:
    band_nir = src.read(4)

from xml.dom import minidom

xmldoc = minidom.parse("20170430_194027_0c82_3B_AnalyticMS_metadata")
nodes = xmldoc.getElementsByTagName("ps:bandSpecificMetadata")

Extract TOA correction coefficients from metadata file in directory:

TOA_coeffs = {}
for node in nodes:
    bn = node.getElementsByTagName("ps:bandNumber")[0].firstChild.data
    if bn in ['1', '2', '3', '4']:
        i = int(bn)
        value = node.getElementsByTagName("ps:ReflectanceCoefficient")[0].firstChild.data
        TOA_coeffs[i] = float(value)

Calculate NDVI and save file:

band_red = band_red * TOA_coeffs[3]
band_nir = band_nir * TOA_coeffs[4]

numpy.seterr(divide='ignore', invalid='ignore')
NDVI = (band_nir.astype(float) - band_red.astype(float)) / (band_nir + band_red)
numpy.nanmin(NDVI), numpy.nanmax(NDVI)

kwargs = src.meta
kwargs.update(dtype=rasterio.float32, count=1)

with rasterio.open('ndvi.tif', 'w', **kwargs) as dst:
    dst.write_band(1, NDVI.astype(rasterio.float32))

Add symbology and plot color bar:

import matplotlib.pyplot as plt
import matplotlib.colors as colors

class MidpointNormalize(colors.Normalize):
    def __init__(self, vmin=None, vmax=None, midpoint=None, clip=False):
        self.midpoint = midpoint
        colors.Normalize.__init__(self, vmin, vmax, clip)
    def __call__(self, value, clip=None):
        x, y = [self.vmin, self.midpoint, self.vmax], [0, 0.5, 1]
        return numpy.ma.masked_array(numpy.interp(value, x, y), numpy.isnan(value))

min = numpy.nanmin(NDVI)
max = numpy.nanmax(NDVI)
mid = 0.1

fig = plt.figure(figsize=(20, 10))
ax = fig.add_subplot(111)

cmap = plt.cm.RdYlGn
cax = ax.imshow(NDVI, cmap=cmap, clim=(min, max),
                norm=MidpointNormalize(midpoint=mid, vmin=min, vmax=max))

ax.axis('off')
ax.set_title('NDVI_test', fontsize=18, fontweight='bold')

cbar = fig.colorbar(cax, orientation='horizontal', shrink=0.65)

fig.savefig("output/NDVI_test.png", dpi=200, bbox_inches='tight', pad_inches=0.7)
plt.show()

Plot histogram for NDVI pixel value distribution:

fig2 = plt.figure(figsize=(10, 10))
ax = fig2.add_subplot(111)

plt.title("NDVI Histogram", fontsize=18, fontweight='bold')
plt.xlabel("NDVI values", fontsize=14)
plt.ylabel("# pixels", fontsize=14)

x = NDVI[~numpy.isnan(NDVI)]
numBins = 20
ax.hist(x, numBins, color='green', alpha=0.8)

fig2.savefig("output/ndvi-histogram.png", dpi=200, bbox_inches='tight', pad_inches=0.7)
plt.show()
Alas, the execution of the script is cut short at the beginning of the code:
File "C:/Users/David/Desktop/ArcGIS files/Planet Labs/2017.6_Luis_Bedin_Bolivia/planet_order_58311/20170430_194027_0c82/TOA_correction_NDVI.py", line 8, in <module>
import rasterio
ModuleNotFoundError: No module named 'rasterio'
So I decide to install rasterio, that should solve the problem:
C:\Users\David\Desktop\ArcGIS files\Planet Labs\2017.6_Luis_Bedin_Bolivia\planet_order_58311\20170430_194027_0c82>pip install rasterio
Collecting rasterio
Using cached rasterio-0.36.0.tar.gz
Requirement already satisfied: affine in c:\users\david\anaconda3\lib\site-packages (from rasterio)
Requirement already satisfied: cligj in c:\users\david\anaconda3\lib\site-packages (from rasterio)
Requirement already satisfied: numpy in c:\users\david\anaconda3\lib\site-packages (from rasterio)
Requirement already satisfied: snuggs in c:\users\david\anaconda3\lib\site-packages (from rasterio)
Requirement already satisfied: click-plugins in c:\users\david\anaconda3\lib\site-packages (from rasterio)
What I interpret from this is that rasterio is already installed. How can this be, if the Python console tells me there is no module named rasterio? The output from the console also says Microsoft Visual C++ is required. Upon further research I found this user's solution. I tried it, but the console also tells me that rasterio is already installed:
(envpythonfs) C:\Users\David\Desktop\ArcGIS files\Planet Labs\2017.6_Luis_Bedin_Bolivia\planet_order_58311\20170430_194027_0c82>conda install rasterio gdal
Fetching package metadata .............
Solving package specifications: .
# All requested packages already installed.
# packages in environment at C:\Users\David\Anaconda3\envs\envpythonfs:
#
I'm creating the script using Spyder 3.1.2 with Python 3.6 on a Windows 10 64-bit machine.
I think pip is not the best way to go for making sure dependencies are handled appropriately. Since you're already using anaconda, I would suggest:
conda install rasterio -c conda-forge/label/dev
Note that installing from the dev labeled version is not the long term solution (see https://github.com/conda-forge/rasterio-feedstock/pull/36).
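As a quick check of my own (a sketch, not part of the answer above), you can confirm which interpreter Spyder is actually running and whether rasterio is importable from it, since pip and conda may have installed into a different environment than the one Spyder uses:

import sys
print(sys.executable)        # path of the Python interpreter Spyder is using

import rasterio
print(rasterio.__version__)  # fails with ModuleNotFoundError if this env lacks rasterio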

Convergence error for development version of lme4 - after R3.1.0

I am having the same problem as a previous post from dmartin, but the solution presented there has not been working for my dataset.
I am trying to fit:
model<-glmer(nb~habitat*stigmatype+(1|sitecode/stigmaspecies),
family=Gamma(link=log))
Warning message:
In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 0.00436052 (tol = 0.001)
I upgraded my R version to R 3.1.0 for Windows (32/64 bit) in order to run the glmmADMB package as a way to apply a post hoc test on the interaction factors.
Before that, I was using glmer in a previous R version, which was working fine, at least for glmer, and gave me the following output:
> summary(nbnew)
Generalized linear mixed model fit by maximum likelihood ['glmerMod']
Family: Gamma ( log )
Formula: n ~ habitat * stigmatype + (1 | sitecode/stigmaspecies)
AIC BIC logLik deviance
3030.101 3066.737 -1506.050 3012.101
Random effects:
Groups Name Variance Std.Dev.
stigmaspecies:sitecode (Intercept) 5.209e+00 2.2822436
sitecode (Intercept) 2.498e-07 0.0004998
Residual 2.070e+00 1.4388273
Number of obs: 433, groups: stigmaspecies:sitecode, 109; sitecode, 20
Fixed effects:
Estimate Std. Error t value Pr(>|z|)
(Intercept) 2.3824 0.4080 5.839 5.26e-09 ***
habitatnon-invaded -1.8270 0.6425 -2.843 0.00446 **
stigmatypesemidry -1.7531 0.7573 -2.315 0.02061 *
stigmatypewet -1.7210 0.8944 -1.924 0.05434 .
habitatnon-invaded:stigmatypesemidry 2.0774 1.1440 1.816 0.06938 .
habitatnon-invaded:stigmatypewet 1.3120 1.4741 0.890 0.37346
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) hbttn- stgmtyps stgmtypw hbttnn-nvdd:stgmtyps
hbttnn-nvdd -0.635
stgmtypsmdr -0.539 0.342
stigmatypwt -0.456 0.290 0.246
hbttnn-nvdd:stgmtyps 0.357 -0.562 -0.662 -0.163
hbttnn-nvdd:stgmtypw 0.277 -0.436 -0.149 -0.607 0.245
Since I am interested in the differences between each level of habitat and stigma type, as well as in the interactions, I applied glht from multcomp:
model<-glmer(log(nb+1)~habitat*stigmatype+
(1|sitecode/stigmaspecies),
family=Gamma(link=log))
av <- anova(model)
nb.habstigma <- interaction(nb$habitat, nb$stigmatype, drop = T)
m1 <- glmer(nb ~ nb.habstigma + (1|sitecode/stigmaspecies), family = Gamma(link = log))
stigmatest<-glht(m1, linfct = mcp(nb.habstigma = "Tukey"))
and:
Error: pwrssUpdate did not converge in (30) iterations
From here, I switched to the latest R version to install glmmADMB, and got the message:
Warning message:
In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 0.00436052 (tol = 0.001)
I followed the instructions from Ben Bolker (his response to dmartin) and tried to refit with
control = glmerControl(optimizer = "bobyqa")
but got:
Warning messages:
1: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 52.2329 (tol = 0.001)
2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge: degenerate Hessian with 1 negative eigenvalues
Any ideas, please?
Thank you!
