I am losing my mind with py2neo: node in the graph but not in the graph - py2neo

This is a graph that contains network devices.
Preexisting nodes were added earlier, and now I am trying to add more nodes and relationships to the graph. What am I doing wrong? At the bottom of the second block of code the error message says the node does not belong to this graph, but as you can see the node is listed as present.
matcher = NodeMatcher(db)
nodes = matcher.match()
for node in nodes:
    print(node)
node1 = matcher.match(name="mxxx103")
print(node1)
node2 = matcher.match(name='mxxxcvss01')
print(node2)
for rel in db.relationships.match((node1, node2)):
    print(rel)
And the output when running the above code
(_9787:Device {model: 'ASR1000', name: 'mxxx103', scanned: 'Yes'})
(_9788:Device {model: 'ASR1000', name: 'lxxx100', scanned: 'Yes'})
(_9789:Device {model: 'ASR1000', name: 'mxxx100', scanned: 'Yes'})
(_9790:Device {model: 'ASR1000', name: 'txxx100', scanned: 'Yes'})
(_9791:Device {model: 'ASR1000', name: 'mxxx101', scanned: 'Yes'})
(_9792:Device {model: 'ASR1000', name: 'mxxx102', scanned: 'Yes'})
(_9793:Device {model: 'ASR1000', name: 'txxx101', scanned: 'Yes'})
(_9794:Device {model: 'ASR1000', name: 'lxxx101', scanned: 'Yes'})
(_9795:Device {model: 'ASR1000', name: 'cxxx100', scanned: 'Yes'})
(_9796:Device {model: 'ASR1000', name: 'cxxx101', scanned: 'Yes'})
(_9797:Device {capabilities: 'R S I', model: 'WS-C4500X', name: 'mxxxcvss01'})
<py2neo.matching.NodeMatch object at 0x02CCB870>
<py2neo.matching.NodeMatch object at 0x02CCBCD0>
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-135-653e282922e7> in <module>
7 node2=matcher.match(name='mxxxcvss01')
8 print(node2)
----> 9 for rel in db.relationships.match((node1,node2)):
10 print (rel)
C:\Utils\WPy3.6 -32-Qt5\python-3.6.7\lib\site-packages\py2neo\matching.py in __iter__(self)
266 """ Iterate through all matching relationships.
267 """
--> 268 query, parameters = self._query_and_parameters()
269 for record in self.graph.run(query, parameters):
270 yield record[0]
C:\Utils\WPy3.6 -32-Qt5\python-3.6.7\lib\site-packages\py2neo\matching.py in _query_and_parameters(self, count)
311 if len(self._nodes) >= 1 and self._nodes[0] is not None:
312 start_node = Node.cast(self._nodes[0])
--> 313 verify_node(start_node)
314 clauses.append("MATCH (a) WHERE id(a) = {x}")
315 parameters["x"] = start_node.identity
C:\Utils\WPy3.6 -32-Qt5\python-3.6.7\lib\site-packages\py2neo\matching.py in verify_node(n)
288 def verify_node(n):
289 if n.graph != self.graph:
--> 290 raise ValueError("Node %r does not belong to this graph" % n)
291 if n.identity is None:
292 raise ValueError("Node %r is not bound to a graph" % n)
ValueError: Node ({model: 'ASR1000', name: 'mxxx103', scanned: 'Yes'}) does not belong to this graph

OK, I managed to find the mistake; it seems I need to look more carefully at what each method returns and at the data types py2neo uses.
My mistake was believing that matcher.match() returns a Node. That is not the case: it returns a NodeMatch. The code below worked.
matcher = NodeMatcher(db)
nodes = matcher.match()
for node in nodes:
    print(node)

node1 = matcher.match(name="mdc103")
list(node1)
node2 = matcher.match(name='mdccvss01')
list(node2)
type(node1)

node1 = db.evaluate('MATCH (x) WHERE x.name="mxxx103" RETURN(x)')
print(node1)
node2 = db.evaluate('MATCH (x) WHERE x.name="mxxxcvss01" RETURN(x)')
print(node2)
for rel in db.relationships.match((node1, node2)):
    print(rel)
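For what it's worth, the same thing can be done without dropping down to Cypher: a NodeMatch also exposes a first() method, which runs the query and returns the first matching, graph-bound Node (or None). A minimal, untested sketch along those lines, assuming the same db connection and the Device nodes shown above:

from py2neo import Graph
from py2neo.matching import NodeMatcher

db = Graph()  # assumes the same connection used in the snippets above
matcher = NodeMatcher(db)

# match() returns a lazy NodeMatch, not a Node; first() executes the query
# and returns the first matching Node bound to the graph (or None).
node1 = matcher.match("Device", name="mxxx103").first()
node2 = matcher.match("Device", name="mxxxcvss01").first()

if node1 is not None and node2 is not None:
    for rel in db.relationships.match((node1, node2)):
        print(rel)

Either way, the key point is that relationships.match() needs Node objects that are already bound to the graph, which is exactly what the ValueError was complaining about.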

Related

Error creating new user or granting for user permissions on Rancher

I'm having problems with creating an account on Rancher. When creating a new account I get the following error:
Internal error occurred: failed calling webhook "rancherauth.cattle.io": Post "https://rancher-webhook.cattle-system.svc:443/v1/webhook/validation?timeout=10s"
Detail:
Internal error occurred: failed calling webhook "rancherauth.cattle.io":
Post "https://rancher-webhook.cattle-system.svc:443/v1/webhook/validation?timeout=10s":
dial tcp 10.43.163.117:443: connect: connection refused
I'm using Rancher version v2.5.13.
Thank you,
Peter
This solved the problem for me.
It looks like the deployment rancher-webhook in namespace cattle-system was removed for some reason.
You need to go to cluster local ==> project system ==> namespace cattle-system and check whether it is still there.
If the deployment rancher-webhook does not exist, you can recreate it by importing the YAML contents from another Rancher instance (use Import YAML from the Rancher menu, top-right corner), or you will have to reinstall Rancher to get the rancher-webhook deployment back.
This is the YAML file which I use:
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "2"
    meta.helm.sh/release-name: rancher-webhook
    meta.helm.sh/release-namespace: cattle-system
  generation: 2
  labels:
    app.kubernetes.io/managed-by: Helm
  managedFields:
  - apiVersion: apps/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:meta.helm.sh/release-name: {}
          f:meta.helm.sh/release-namespace: {}
        f:labels:
          .: {}
          f:app.kubernetes.io/managed-by: {}
      f:spec:
        f:progressDeadlineSeconds: {}
        f:replicas: {}
        f:revisionHistoryLimit: {}
        f:selector:
          f:matchLabels:
            .: {}
            f:app: {}
        f:strategy:
          f:rollingUpdate:
            .: {}
            f:maxSurge: {}
            f:maxUnavailable: {}
          f:type: {}
        f:template:
          f:metadata:
            f:labels:
              .: {}
              f:app: {}
          f:spec:
            f:containers:
              k:{"name":"rancher-webhook"}:
                .: {}
                f:env:
                  .: {}
                  k:{"name":"NAMESPACE"}:
                    .: {}
                    f:name: {}
                    f:valueFrom:
                      .: {}
                      f:fieldRef:
                        .: {}
                        f:apiVersion: {}
                        f:fieldPath: {}
                f:image: {}
                f:imagePullPolicy: {}
                f:name: {}
                f:ports:
                  .: {}
                  k:{"containerPort":9443,"protocol":"TCP"}:
                    .: {}
                    f:containerPort: {}
                    f:name: {}
                    f:protocol: {}
                f:resources: {}
                f:terminationMessagePath: {}
                f:terminationMessagePolicy: {}
            f:dnsPolicy: {}
            f:restartPolicy: {}
            f:schedulerName: {}
            f:securityContext: {}
            f:serviceAccount: {}
            f:serviceAccountName: {}
            f:terminationGracePeriodSeconds: {}
    manager: Go-http-client
    operation: Update
    time: "2021-07-22T19:25:06Z"
  - apiVersion: apps/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          f:deployment.kubernetes.io/revision: {}
      f:status:
        f:availableReplicas: {}
        f:conditions:
          .: {}
          k:{"type":"Available"}:
            .: {}
            f:lastTransitionTime: {}
            f:lastUpdateTime: {}
            f:message: {}
            f:reason: {}
            f:status: {}
            f:type: {}
          k:{"type":"Progressing"}:
            .: {}
            f:lastTransitionTime: {}
            f:lastUpdateTime: {}
            f:message: {}
            f:reason: {}
            f:status: {}
            f:type: {}
        f:observedGeneration: {}
        f:readyReplicas: {}
        f:replicas: {}
        f:updatedReplicas: {}
    manager: k3s
    operation: Update
    time: "2022-06-23T03:38:49Z"
  name: rancher-webhook
  namespace: cattle-system
  resourceVersion: "291873445"
  selfLink: /apis/apps/v1/namespaces/cattle-system/deployments/rancher-webhook
  uid: 9c9d68eb-1b0d-4371-9d02-a733c22d036c
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: rancher-webhook
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: rancher-webhook
    spec:
      containers:
      - env:
        - name: NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        image: rancher/rancher-webhook:v0.1.4
        imagePullPolicy: IfNotPresent
        name: rancher-webhook
        ports:
        - containerPort: 9443
          name: https
          protocol: TCP
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: rancher-webhook
      serviceAccountName: rancher-webhook
      terminationGracePeriodSeconds: 30
Note: if you copied the YAML of the rancher-webhook deployment from another Rancher instance, remove the status section of the YAML file.
Thanks!

kubeflow pipeline Failed to execute component: unable to get pipeline with PipelineName

Installed following https://github.com/kubeflow/manifests at v1.4.1
KFP version: 1.7.0
KFP SDK version: build version dev_local
k3s Kubernetes 1.19
I use the demo example to add a pipeline:
kfp 1.8.10
kfp-pipeline-spec 0.1.13
kfp-server-api 1.7.1
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import kfp
import kfp.dsl as dsl
from kfp.v2.dsl import component
from kfp import compiler


@component(
    base_image="library/python:3.7"
)
def add(a: float, b: float) -> float:
    '''Calculates sum of two arguments'''
    return a + b


@dsl.pipeline(
    name='v2add',
    description='An example pipeline that performs addition calculations.',
    # pipeline_root='gs://my-pipeline-root/example-pipeline'
)
def add_pipeline(a: float = 1, b: float = 7):
    add_task = add(a, b)


compiler.Compiler(
    mode=kfp.dsl.PipelineExecutionMode.V2_COMPATIBLE,
    launcher_image='library/gcr.io/ml-pipeline/kfp-launcher:1.8.7'
).compile(pipeline_func=add_pipeline, package_path='pipeline.yaml')
I upload the pipeline.yaml and start a run, and get this error.
Logs:
I1231 10:12:23.830486 1 launcher.go:144] PipelineRoot defaults to "minio://mlpipeline/v2/artifacts".
I1231 10:12:23.830866 1 cache.go:143] Cannot detect ml-pipeline in the same namespace, default to ml-pipeline.kubeflow:8887 as KFP endpoint.
I1231 10:12:23.830880 1 cache.go:120] Connecting to cache endpoint ml-pipeline.kubeflow:8887
F1231 10:12:23.832000 1 main.go:50] Failed to execute component: unable to get pipeline with PipelineName "pipeline/v2add" PipelineRunID "7e2bdeeb-aa6f-4109-a508-63a1be22267c": Failed GetContextByTypeAndName(type="system.Pipeline", name="pipeline/v2add")
pod
kind: Pod
apiVersion: v1
metadata:
name: v2add-rzrht-37236994
namespace: kubeflow-user-example-com
selfLink: /api/v1/namespaces/kubeflow-user-example-com/pods/v2add-rzrht-37236994
uid: 3ceb73e5-80b5-4844-8cc8-8f2bf61319d2
resourceVersion: '28824661'
creationTimestamp: '2021-12-31T10:12:21Z'
labels:
pipeline/runid: 7e2bdeeb-aa6f-4109-a508-63a1be22267c
pipelines.kubeflow.org/cache_enabled: 'true'
pipelines.kubeflow.org/enable_caching: 'true'
pipelines.kubeflow.org/kfp_sdk_version: 1.8.10
pipelines.kubeflow.org/pipeline-sdk-type: kfp
pipelines.kubeflow.org/v2_component: 'true'
workflows.argoproj.io/completed: 'true'
workflows.argoproj.io/workflow: v2add-rzrht
annotations:
pipelines.kubeflow.org/arguments.parameters: '{"a": "1", "b": "7"}'
pipelines.kubeflow.org/component_ref: '{}'
pipelines.kubeflow.org/v2_component: 'true'
sidecar.istio.io/inject: 'false'
workflows.argoproj.io/node-name: v2add-rzrht.add
workflows.argoproj.io/outputs: >-
{"artifacts":[{"name":"add-Output","path":"/tmp/outputs/Output/data"},{"name":"main-logs","s3":{"key":"artifacts/v2add-rzrht/2021/12/31/v2add-rzrht-37236994/main.log"}}]}
workflows.argoproj.io/template: >-
{"name":"add","inputs":{"parameters":[{"name":"a","value":"1"},{"name":"b","value":"7"},{"name":"pipeline-name","value":"pipeline/v2add"},{"name":"pipeline-root","value":""}]},"outputs":{"artifacts":[{"name":"add-Output","path":"/tmp/outputs/Output/data"}]},"metadata":{"annotations":{"pipelines.kubeflow.org/arguments.parameters":"{\"a\":
\"1\", \"b\":
\"7\"}","pipelines.kubeflow.org/component_ref":"{}","pipelines.kubeflow.org/v2_component":"true","sidecar.istio.io/inject":"false"},"labels":{"pipelines.kubeflow.org/cache_enabled":"true","pipelines.kubeflow.org/enable_caching":"true","pipelines.kubeflow.org/kfp_sdk_version":"1.8.10","pipelines.kubeflow.org/pipeline-sdk-type":"kfp","pipelines.kubeflow.org/v2_component":"true"}},"container":{"name":"","image":"library/python:3.7","command":["/kfp-launcher/launch","--mlmd_server_address","$(METADATA_GRPC_SERVICE_HOST)","--mlmd_server_port","$(METADATA_GRPC_SERVICE_PORT)","--runtime_info_json","$(KFP_V2_RUNTIME_INFO)","--container_image","$(KFP_V2_IMAGE)","--task_name","add","--pipeline_name","pipeline/v2add","--run_id","$(KFP_RUN_ID)","--run_resource","workflows.argoproj.io/$(WORKFLOW_ID)","--namespace","$(KFP_NAMESPACE)","--pod_name","$(KFP_POD_NAME)","--pod_uid","$(KFP_POD_UID)","--pipeline_root","","--enable_caching","$(ENABLE_CACHING)","--","a=1","b=7","--"],"args":["sh","-c","\nif
! [ -x \"$(command -v pip)\" ]; then\n python3 -m ensurepip || python3
-m ensurepip --user || apt-get install
python3-pip\nfi\n\nPIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install
--quiet --no-warn-script-location 'kfp==1.8.10' \u0026\u0026 \"$0\"
\"$#\"\n","sh","-ec","program_path=$(mktemp -d)\nprintf \"%s\" \"$0\"
\u003e \"$program_path/ephemeral_component.py\"\npython3 -m
kfp.v2.components.executor_main
--component_module_path
\"$program_path/ephemeral_component.py\"
\"$#\"\n","\nimport kfp\nfrom kfp.v2 import dsl\nfrom kfp.v2.dsl import
*\nfrom typing import *\n\ndef add(a: float, b: float) -\u003e float:\n
'''Calculates sum of two arguments'''\n return a +
b\n\n","--executor_input","{{$}}","--function_to_execute","add"],"envFrom":[{"configMapRef":{"name":"metadata-grpc-configmap","optional":true}}],"env":[{"name":"KFP_POD_NAME","valueFrom":{"fieldRef":{"fieldPath":"metadata.name"}}},{"name":"KFP_POD_UID","valueFrom":{"fieldRef":{"fieldPath":"metadata.uid"}}},{"name":"KFP_NAMESPACE","valueFrom":{"fieldRef":{"fieldPath":"metadata.namespace"}}},{"name":"WORKFLOW_ID","valueFrom":{"fieldRef":{"fieldPath":"metadata.labels['workflows.argoproj.io/workflow']"}}},{"name":"KFP_RUN_ID","valueFrom":{"fieldRef":{"fieldPath":"metadata.labels['pipeline/runid']"}}},{"name":"ENABLE_CACHING","valueFrom":{"fieldRef":{"fieldPath":"metadata.labels['pipelines.kubeflow.org/enable_caching']"}}},{"name":"KFP_V2_IMAGE","value":"library/python:3.7"},{"name":"KFP_V2_RUNTIME_INFO","value":"{\"inputParameters\":
{\"a\": {\"type\": \"DOUBLE\"}, \"b\": {\"type\": \"DOUBLE\"}},
\"inputArtifacts\": {}, \"outputParameters\": {\"Output\": {\"type\":
\"DOUBLE\", \"path\": \"/tmp/outputs/Output/data\"}}, \"outputArtifacts\":
{}}"}],"resources":{},"volumeMounts":[{"name":"kfp-launcher","mountPath":"/kfp-launcher"}]},"volumes":[{"name":"kfp-launcher"}],"initContainers":[{"name":"kfp-launcher","image":"library/gcr.io/ml-pipeline/kfp-launcher:1.8.7","command":["launcher","--copy","/kfp-launcher/launch"],"resources":{},"mirrorVolumeMounts":true}],"archiveLocation":{"archiveLogs":true,"s3":{"endpoint":"minio-service.kubeflow:9000","bucket":"mlpipeline","insecure":true,"accessKeySecret":{"name":"mlpipeline-minio-artifact","key":"accesskey"},"secretKeySecret":{"name":"mlpipeline-minio-artifact","key":"secretkey"},"key":"artifacts/v2add-rzrht/2021/12/31/v2add-rzrht-37236994"}}}
ownerReferences:
- apiVersion: argoproj.io/v1alpha1
kind: Workflow
name: v2add-rzrht
uid: 9a806b04-d5fa-49eb-9e46-7502bc3e7ac5
controller: true
blockOwnerDeletion: true
managedFields:
- manager: workflow-controller
operation: Update
apiVersion: v1
time: '2021-12-31T10:12:21Z'
fieldsType: FieldsV1
fieldsV1:
'f:metadata':
'f:annotations':
.: {}
'f:pipelines.kubeflow.org/arguments.parameters': {}
'f:pipelines.kubeflow.org/component_ref': {}
'f:pipelines.kubeflow.org/v2_component': {}
'f:sidecar.istio.io/inject': {}
'f:workflows.argoproj.io/node-name': {}
'f:workflows.argoproj.io/template': {}
'f:labels':
.: {}
'f:pipeline/runid': {}
'f:pipelines.kubeflow.org/cache_enabled': {}
'f:pipelines.kubeflow.org/enable_caching': {}
'f:pipelines.kubeflow.org/kfp_sdk_version': {}
'f:pipelines.kubeflow.org/pipeline-sdk-type': {}
'f:pipelines.kubeflow.org/v2_component': {}
'f:workflows.argoproj.io/completed': {}
'f:workflows.argoproj.io/workflow': {}
'f:ownerReferences':
.: {}
'k:{"uid":"9a806b04-d5fa-49eb-9e46-7502bc3e7ac5"}':
.: {}
'f:apiVersion': {}
'f:blockOwnerDeletion': {}
'f:controller': {}
'f:kind': {}
'f:name': {}
'f:uid': {}
'f:spec':
'f:containers':
'k:{"name":"main"}':
.: {}
'f:args': {}
'f:command': {}
'f:env':
.: {}
'k:{"name":"ARGO_CONTAINER_NAME"}':
.: {}
'f:name': {}
'f:value': {}
'k:{"name":"ARGO_INCLUDE_SCRIPT_OUTPUT"}':
.: {}
'f:name': {}
'f:value': {}
'k:{"name":"ENABLE_CACHING"}':
.: {}
'f:name': {}
'f:valueFrom':
.: {}
'f:fieldRef':
.: {}
'f:apiVersion': {}
'f:fieldPath': {}
'k:{"name":"KFP_NAMESPACE"}':
.: {}
'f:name': {}
'f:valueFrom':
.: {}
'f:fieldRef':
.: {}
'f:apiVersion': {}
'f:fieldPath': {}
'k:{"name":"KFP_POD_NAME"}':
.: {}
'f:name': {}
'f:valueFrom':
.: {}
'f:fieldRef':
.: {}
'f:apiVersion': {}
'f:fieldPath': {}
'k:{"name":"KFP_POD_UID"}':
.: {}
'f:name': {}
'f:valueFrom':
.: {}
'f:fieldRef':
.: {}
'f:apiVersion': {}
'f:fieldPath': {}
'k:{"name":"KFP_RUN_ID"}':
.: {}
'f:name': {}
'f:valueFrom':
.: {}
'f:fieldRef':
.: {}
'f:apiVersion': {}
'f:fieldPath': {}
'k:{"name":"KFP_V2_IMAGE"}':
.: {}
'f:name': {}
'f:value': {}
'k:{"name":"KFP_V2_RUNTIME_INFO"}':
.: {}
'f:name': {}
'f:value': {}
'k:{"name":"WORKFLOW_ID"}':
.: {}
'f:name': {}
'f:valueFrom':
.: {}
'f:fieldRef':
.: {}
'f:apiVersion': {}
'f:fieldPath': {}
'f:envFrom': {}
'f:image': {}
'f:imagePullPolicy': {}
'f:name': {}
'f:resources': {}
'f:terminationMessagePath': {}
'f:terminationMessagePolicy': {}
'f:volumeMounts':
.: {}
'k:{"mountPath":"/kfp-launcher"}':
.: {}
'f:mountPath': {}
'f:name': {}
'k:{"name":"wait"}':
.: {}
'f:command': {}
'f:env':
.: {}
'k:{"name":"ARGO_CONTAINER_NAME"}':
.: {}
'f:name': {}
'f:value': {}
'k:{"name":"ARGO_CONTAINER_RUNTIME_EXECUTOR"}':
.: {}
'f:name': {}
'f:value': {}
'k:{"name":"ARGO_INCLUDE_SCRIPT_OUTPUT"}':
.: {}
'f:name': {}
'f:value': {}
'k:{"name":"ARGO_POD_NAME"}':
.: {}
'f:name': {}
'f:valueFrom':
.: {}
'f:fieldRef':
.: {}
'f:apiVersion': {}
'f:fieldPath': {}
'k:{"name":"GODEBUG"}':
.: {}
'f:name': {}
'f:value': {}
'f:image': {}
'f:imagePullPolicy': {}
'f:name': {}
'f:resources':
.: {}
'f:limits':
.: {}
'f:cpu': {}
'f:memory': {}
'f:requests':
.: {}
'f:cpu': {}
'f:memory': {}
'f:terminationMessagePath': {}
'f:terminationMessagePolicy': {}
'f:volumeMounts':
.: {}
'k:{"mountPath":"/argo/podmetadata"}':
.: {}
'f:mountPath': {}
'f:name': {}
'k:{"mountPath":"/argo/secret/mlpipeline-minio-artifact"}':
.: {}
'f:mountPath': {}
'f:name': {}
'f:readOnly': {}
'k:{"mountPath":"/mainctrfs/kfp-launcher"}':
.: {}
'f:mountPath': {}
'f:name': {}
'k:{"mountPath":"/var/run/docker.sock"}':
.: {}
'f:mountPath': {}
'f:name': {}
'f:readOnly': {}
'f:dnsPolicy': {}
'f:enableServiceLinks': {}
'f:initContainers':
.: {}
'k:{"name":"kfp-launcher"}':
.: {}
'f:command': {}
'f:env':
.: {}
'k:{"name":"ARGO_CONTAINER_NAME"}':
.: {}
'f:name': {}
'f:value': {}
'k:{"name":"ARGO_INCLUDE_SCRIPT_OUTPUT"}':
.: {}
'f:name': {}
'f:value': {}
'f:image': {}
'f:imagePullPolicy': {}
'f:name': {}
'f:resources': {}
'f:terminationMessagePath': {}
'f:terminationMessagePolicy': {}
'f:volumeMounts':
.: {}
'k:{"mountPath":"/kfp-launcher"}':
.: {}
'f:mountPath': {}
'f:name': {}
'f:restartPolicy': {}
'f:schedulerName': {}
'f:securityContext': {}
'f:serviceAccount': {}
'f:serviceAccountName': {}
'f:terminationGracePeriodSeconds': {}
'f:volumes':
.: {}
'k:{"name":"docker-sock"}':
.: {}
'f:hostPath':
.: {}
'f:path': {}
'f:type': {}
'f:name': {}
'k:{"name":"kfp-launcher"}':
.: {}
'f:emptyDir': {}
'f:name': {}
'k:{"name":"mlpipeline-minio-artifact"}':
.: {}
'f:name': {}
'f:secret':
.: {}
'f:defaultMode': {}
'f:items': {}
'f:secretName': {}
'k:{"name":"podmetadata"}':
.: {}
'f:downwardAPI':
.: {}
'f:defaultMode': {}
'f:items': {}
'f:name': {}
- manager: k3s
operation: Update
apiVersion: v1
time: '2021-12-31T10:12:24Z'
fieldsType: FieldsV1
fieldsV1:
'f:status':
'f:conditions':
'k:{"type":"ContainersReady"}':
.: {}
'f:lastProbeTime': {}
'f:lastTransitionTime': {}
'f:message': {}
'f:reason': {}
'f:status': {}
'f:type': {}
'k:{"type":"Initialized"}':
.: {}
'f:lastProbeTime': {}
'f:lastTransitionTime': {}
'f:status': {}
'f:type': {}
'k:{"type":"Ready"}':
.: {}
'f:lastProbeTime': {}
'f:lastTransitionTime': {}
'f:message': {}
'f:reason': {}
'f:status': {}
'f:type': {}
'f:containerStatuses': {}
'f:hostIP': {}
'f:initContainerStatuses': {}
'f:phase': {}
'f:podIP': {}
'f:podIPs':
.: {}
'k:{"ip":"10.42.0.101"}':
.: {}
'f:ip': {}
'f:startTime': {}
- manager: argoexec
operation: Update
apiVersion: v1
time: '2021-12-31T10:12:25Z'
fieldsType: FieldsV1
fieldsV1:
'f:metadata':
'f:annotations':
'f:workflows.argoproj.io/outputs': {}
status:
phase: Failed
conditions:
- type: Initialized
status: 'True'
lastProbeTime: null
lastTransitionTime: '2021-12-31T10:12:23Z'
- type: Ready
status: 'False'
lastProbeTime: null
lastTransitionTime: '2021-12-31T10:12:21Z'
reason: ContainersNotReady
message: 'containers with unready status: [wait main]'
- type: ContainersReady
status: 'False'
lastProbeTime: null
lastTransitionTime: '2021-12-31T10:12:21Z'
reason: ContainersNotReady
message: 'containers with unready status: [wait main]'
- type: PodScheduled
status: 'True'
lastProbeTime: null
lastTransitionTime: '2021-12-31T10:12:21Z'
hostIP: 10.19.64.214
podIP: 10.42.0.101
podIPs:
- ip: 10.42.0.101
startTime: '2021-12-31T10:12:21Z'
initContainerStatuses:
- name: kfp-launcher
state:
terminated:
exitCode: 0
reason: Completed
startedAt: '2021-12-31T10:12:22Z'
finishedAt: '2021-12-31T10:12:22Z'
containerID: >-
docker://fbf8b39a3bab8065b54e9a3b25a678e07e0880ef61f9e78abe92f9fa205a73c4
lastState: {}
ready: true
restartCount: 0
image: 'library/gcr.io/ml-pipeline/kfp-launcher:1.8.7'
imageID: >-
docker-pullable://library/gcr.io/ml-pipeline/kfp-launcher#sha256:8b3f14d468a41c319e95ef4047b7823c64480fd1980c3d5b369c8412afbc684f
containerID: >-
docker://fbf8b39a3bab8065b54e9a3b25a678e07e0880ef61f9e78abe92f9fa205a73c4
containerStatuses:
- name: main
state:
terminated:
exitCode: 1
reason: Error
startedAt: '2021-12-31T10:12:23Z'
finishedAt: '2021-12-31T10:12:23Z'
containerID: >-
docker://26faae59907e5a4207960ee9d15d9d350587c5be7db31c3e8f0ec97e72c6d2cf
lastState: {}
ready: false
restartCount: 0
image: 'python:3.7'
imageID: >-
docker-pullable://python#sha256:3908249ce6b2d28284e3610b07bf406c3035bc2e3ce328711a2b42e1c5a75fc1
containerID: >-
docker://26faae59907e5a4207960ee9d15d9d350587c5be7db31c3e8f0ec97e72c6d2cf
started: false
- name: wait
state:
terminated:
exitCode: 1
reason: Error
message: >-
path /tmp/outputs/Output/data does not exist in archive
/tmp/argo/outputs/artifacts/add-Output.tgz
startedAt: '2021-12-31T10:12:23Z'
finishedAt: '2021-12-31T10:12:25Z'
containerID: >-
docker://66b6306eb81ac2abb1fbf2609d7375a00f92891f1c827680a45962cbb1ec3c0a
lastState: {}
ready: false
restartCount: 0
image: 'library/gcr.io/ml-pipeline/argoexec:v3.1.6-patch-license-compliance'
imageID: >-
docker-pullable://library/gcr.io/ml-pipeline/argoexec#sha256:44cf8455a51aa5b961d1a86f65e39adf5ffca9bdcd33a745c3b79f430b7439e0
containerID: >-
docker://66b6306eb81ac2abb1fbf2609d7375a00f92891f1c827680a45962cbb1ec3c0a
started: false
qosClass: Burstable
spec:
volumes:
- name: podmetadata
downwardAPI:
items:
- path: annotations
fieldRef:
apiVersion: v1
fieldPath: metadata.annotations
defaultMode: 420
- name: docker-sock
hostPath:
path: /var/run/docker.sock
type: Socket
- name: kfp-launcher
emptyDir: {}
- name: mlpipeline-minio-artifact
secret:
secretName: mlpipeline-minio-artifact
items:
- key: accesskey
path: accesskey
- key: secretkey
path: secretkey
defaultMode: 420
- name: default-editor-token-8lmfr
secret:
secretName: default-editor-token-8lmfr
defaultMode: 420
initContainers:
- name: kfp-launcher
image: 'library/gcr.io/ml-pipeline/kfp-launcher:1.8.7'
command:
- launcher
- '--copy'
- /kfp-launcher/launch
env:
- name: ARGO_CONTAINER_NAME
value: kfp-launcher
- name: ARGO_INCLUDE_SCRIPT_OUTPUT
value: 'false'
resources: {}
volumeMounts:
- name: kfp-launcher
mountPath: /kfp-launcher
- name: default-editor-token-8lmfr
readOnly: true
mountPath: /var/run/secrets/kubernetes.io/serviceaccount
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
imagePullPolicy: IfNotPresent
containers:
- name: wait
image: 'library/gcr.io/ml-pipeline/argoexec:v3.1.6-patch-license-compliance'
command:
- argoexec
- wait
- '--loglevel'
- info
env:
- name: ARGO_POD_NAME
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.name
- name: ARGO_CONTAINER_RUNTIME_EXECUTOR
value: docker
- name: GODEBUG
value: x509ignoreCN=0
- name: ARGO_CONTAINER_NAME
value: wait
- name: ARGO_INCLUDE_SCRIPT_OUTPUT
value: 'false'
resources:
limits:
cpu: 500m
memory: 512Mi
requests:
cpu: 10m
memory: 32Mi
volumeMounts:
- name: podmetadata
mountPath: /argo/podmetadata
- name: docker-sock
readOnly: true
mountPath: /var/run/docker.sock
- name: mlpipeline-minio-artifact
readOnly: true
mountPath: /argo/secret/mlpipeline-minio-artifact
- name: kfp-launcher
mountPath: /mainctrfs/kfp-launcher
- name: default-editor-token-8lmfr
readOnly: true
mountPath: /var/run/secrets/kubernetes.io/serviceaccount
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
imagePullPolicy: IfNotPresent
- name: main
image: 'library/python:3.7'
command:
- /kfp-launcher/launch
- '--mlmd_server_address'
- $(METADATA_GRPC_SERVICE_HOST)
- '--mlmd_server_port'
- $(METADATA_GRPC_SERVICE_PORT)
- '--runtime_info_json'
- $(KFP_V2_RUNTIME_INFO)
- '--container_image'
- $(KFP_V2_IMAGE)
- '--task_name'
- add
- '--pipeline_name'
- pipeline/v2add
- '--run_id'
- $(KFP_RUN_ID)
- '--run_resource'
- workflows.argoproj.io/$(WORKFLOW_ID)
- '--namespace'
- $(KFP_NAMESPACE)
- '--pod_name'
- $(KFP_POD_NAME)
- '--pod_uid'
- $(KFP_POD_UID)
- '--pipeline_root'
- ''
- '--enable_caching'
- $(ENABLE_CACHING)
- '--'
- a=1
- b=7
- '--'
args:
- sh
- '-c'
- >
if ! [ -x "$(command -v pip)" ]; then
python3 -m ensurepip || python3 -m ensurepip --user || apt-get install python3-pip
fi
PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet
--no-warn-script-location 'kfp==1.8.10' && "$0" "$#"
- sh
- '-ec'
- >
program_path=$(mktemp -d)
printf "%s" "$0" > "$program_path/ephemeral_component.py"
python3 -m kfp.v2.components.executor_main
--component_module_path
"$program_path/ephemeral_component.py" "$#"
- |+
import kfp
from kfp.v2 import dsl
from kfp.v2.dsl import *
from typing import *
def add(a: float, b: float) -> float:
'''Calculates sum of two arguments'''
return a + b
- '--executor_input'
- '{{$}}'
- '--function_to_execute'
- add
envFrom:
- configMapRef:
name: metadata-grpc-configmap
optional: true
env:
- name: KFP_POD_NAME
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.name
- name: KFP_POD_UID
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.uid
- name: KFP_NAMESPACE
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
- name: WORKFLOW_ID
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: 'metadata.labels[''workflows.argoproj.io/workflow'']'
- name: KFP_RUN_ID
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: 'metadata.labels[''pipeline/runid'']'
- name: ENABLE_CACHING
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: 'metadata.labels[''pipelines.kubeflow.org/enable_caching'']'
- name: KFP_V2_IMAGE
value: 'library/python:3.7'
- name: KFP_V2_RUNTIME_INFO
value: >-
{"inputParameters": {"a": {"type": "DOUBLE"}, "b": {"type":
"DOUBLE"}}, "inputArtifacts": {}, "outputParameters": {"Output":
{"type": "DOUBLE", "path": "/tmp/outputs/Output/data"}},
"outputArtifacts": {}}
- name: ARGO_CONTAINER_NAME
value: main
- name: ARGO_INCLUDE_SCRIPT_OUTPUT
value: 'false'
resources: {}
volumeMounts:
- name: kfp-launcher
mountPath: /kfp-launcher
- name: default-editor-token-8lmfr
readOnly: true
mountPath: /var/run/secrets/kubernetes.io/serviceaccount
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
imagePullPolicy: IfNotPresent
restartPolicy: Never
terminationGracePeriodSeconds: 30
dnsPolicy: ClusterFirst
serviceAccountName: default-editor
serviceAccount: default-editor
nodeName: iz1bb01rvtheuakv3h25ntz
securityContext: {}
schedulerName: default-scheduler
tolerations:
- key: node.kubernetes.io/not-ready
operator: Exists
effect: NoExecute
tolerationSeconds: 300
- key: node.kubernetes.io/unreachable
operator: Exists
effect: NoExecute
tolerationSeconds: 300
priority: 0
enableServiceLinks: true
preemptionPolicy: PreemptLowerPriority
I don't know why it can't find the PipelineName.

left outer join query in Informix

I have two queries from which I expected to get the same result.
First one:
select d.serialno, d.location, d.ticket, d.vehicle, o.name, d.parrivaltime, e.etype
from deliveries d left outer join events e
on d.serialno=e.eserialno and d.client=e.eclient and d.carrier=e.ecarrier and d.location=e.elocation
left outer join operators o
on d.client=o.client AND d.driver=o.code AND d.carrier=o.carrier AND d.location=o.location
where d.statusmessage='CURRENT' and d.scheduleddate BETWEEN '08/02/2017' AND '08/03/2017' and d.dropno!=0 and d.location in ('MIAMID') AND (e.etype=107 or e.etype=108)
order by d.location, d.dropno, e.etype;
Second one:
select d.serialno, d.location, d.ticket, d.vehicle, o.name, d.parrivaltime, e.etype
from deliveries d, outer events e, outer operators o
where d.serialno=e.eserialno and d.client=e.eclient and d.carrier=e.ecarrier and d.location=e.elocation
AND d.client=o.client and d.driver=o.code and d.carrier=o.carrier and d.location=o.location
AND d.statusmessage='CURRENT' and d.scheduleddate BETWEEN '08/02/2017' AND '08/03/2017' and d.dropno!=0 and d.location in ('MIAMID') AND (e.etype=107 or e.etype=108)
order by d.location, d.dropno, e.etype;
However, the first query returns 1044 records, while the second query returns 876 records.
I also checked the explain.out file, shown below, but I still cannot figure out why the record counts differ.
QUERY: (OPTIMIZATION TIMESTAMP: 09-06-2017 11:27:22)
------
select d.serialno, d.location, d.ticket, d.vehicle, o.name, d.parrivaltime, e.etype from deliveries d, outer events e, outer operators o where d.serialno=e.eserialno and d.client=e.eclient and d.carrier=e.ecarrier and d.location=e.elocation AND d.client=o.client AND d.driver=o.code AND d.carrier=o.carrier AND d.location=o.location AND d.statusmessage='CURRENT' and d.scheduleddate BETWEEN '08/02/2017' AND '08/03/2017' and d.dropno!=0 and d.location in ('MIAMID') AND (e.etype=107 or e.etype=108) order by d.location, d.dropno, e.etype
Estimated Cost: 304
Estimated # of Rows Returned: 71
Temporary Files Required For: Order By
1) jlong.d: INDEX PATH
Filters: (jlong.d.statusmessage = 'CURRENT' AND jlong.d.dropno != 0 )
(1) Index Name: informix.delivcapac1idx
Index Keys: scheduleddate location client customername (Serial, fragments: ALL)
Index Self Join Keys (scheduleddate )
Lower bound: jlong.d.scheduleddate >= 08/02/2017
Upper bound: jlong.d.scheduleddate <= 08/03/2017
Lower Index Filter: jlong.d.scheduleddate = jlong.d.scheduleddate AND jlong.d.location = 'MIAMID'
2) jlong.o: INDEX PATH
(1) Index Name: informix. 126_300
Index Keys: client carrier location code (Serial, fragments: ALL)
Lower Index Filter: (((jlong.d.driver = jlong.o.code AND jlong.d.location = jlong.o.location ) AND jlong.d.carrier = jlong.o.carrier ) AND jlong.d.client = jlong.o.client )
NESTED LOOP JOIN
3) jlong.e: INDEX PATH
Filters: ((((jlong.d.carrier = jlong.e.ecarrier AND (jlong.e.etype = 107 OR jlong.e.etype = 108 ) ) AND jlong.d.location = jlong.e.elocation ) AND jlong.d.client = jlong.e.eclient ) AND jlong.e.elocation = 'MIAMID' )
(1) Index Name: informix.ix154_17
Index Keys: eserialno (Serial, fragments: ALL)
Lower Index Filter: jlong.d.serialno = jlong.e.eserialno
NESTED LOOP JOIN
Query statistics:
-----------------
Table map :
----------------------------
Internal name Table name
----------------------------
t1 d
t2 o
t3 e
type table rows_prod est_rows rows_scan time est_cost
-------------------------------------------------------------------
scan t1 603 71 742 00:00.00 122
type table rows_prod est_rows rows_scan time est_cost
-------------------------------------------------------------------
scan t2 603 741 603 00:00.00 1
type rows_prod est_rows time est_cost
-------------------------------------------------
nljoin 603 71 00:00.00 158
type table rows_prod est_rows rows_scan time est_cost
-------------------------------------------------------------------
scan t3 876 1597 1729 00:00.00 2
type rows_prod est_rows time est_cost
-------------------------------------------------
nljoin 1044 71 00:00.01 291
type rows_sort est_rows rows_cons time est_cost
------------------------------------------------------------
sort 1044 71 1044 00:00.01 14
QUERY: (OPTIMIZATION TIMESTAMP: 09-06-2017 11:27:29)
------
select d.serialno, d.location, d.ticket, d.vehicle, o.name, d.parrivaltime, e.etype from deliveries d left outer join events e on d.serialno=e.eserialno and d.client=e.eclient and d.carrier=e.ecarrier and d.location=e.elocation left outer join operators o on d.client=o.client AND d.driver=o.code AND d.carrier=o.carrier AND d.location=o.location where d.statusmessage='CURRENT' and d.scheduleddate BETWEEN '08/02/2017' AND '08/03/2017' and d.dropno!=0 and d.location in ('MIAMID') AND (e.etype=107 or e.etype=108) order by d.location, d.dropno, e.etype
Estimated Cost: 254
Estimated # of Rows Returned: 1
Temporary Files Required For: Order By
1) jlong.d: INDEX PATH
Filters: (jlong.d.statusmessage = 'CURRENT' AND jlong.d.dropno != 0 )
(1) Index Name: informix.delivcapac1idx
Index Keys: scheduleddate location client customername (Serial, fragments: ALL)
Index Self Join Keys (scheduleddate )
Lower bound: jlong.d.scheduleddate >= 08/02/2017
Upper bound: jlong.d.scheduleddate <= 08/03/2017
Lower Index Filter: jlong.d.scheduleddate = jlong.d.scheduleddate AND jlong.d.location = 'MIAMID'
2) jlong.e: INDEX PATH
Filters: ((jlong.e.etype = 107 OR jlong.e.etype = 108 ) AND jlong.e.elocation = 'MIAMID' )
(1) Index Name: informix.ix154_17
Index Keys: eserialno (Serial, fragments: ALL)
Lower Index Filter: jlong.d.serialno = jlong.e.eserialno
ON-Filters:(((jlong.d.serialno = jlong.e.eserialno AND jlong.d.client = jlong.e.eclient ) AND jlong.d.carrier = jlong.e.ecarrier ) AND jlong.d.location = jlong.e.elocation )
NESTED LOOP JOIN
3) jlong.o: INDEX PATH
(1) Index Name: informix. 126_300
Index Keys: client carrier location code (Serial, fragments: ALL)
Lower Index Filter: (((jlong.d.client = jlong.o.client AND jlong.d.driver = jlong.o.code ) AND jlong.d.carrier = jlong.o.carrier ) AND jlong.d.location = jlong.o.location )
ON-Filters:(((jlong.d.client = jlong.o.client AND jlong.d.driver = jlong.o.code ) AND jlong.d.carrier = jlong.o.carrier ) AND jlong.d.location = jlong.o.location )
NESTED LOOP JOIN(LEFT OUTER JOIN)
Query statistics:
-----------------
Table map :
----------------------------
Internal name Table name
----------------------------
t1 d
t2 e
t3 o
type table rows_prod est_rows rows_scan time est_cost
-------------------------------------------------------------------
scan t1 603 71 742 00:00.00 122
type table rows_prod est_rows rows_scan time est_cost
-------------------------------------------------------------------
scan t2 1752 1597 1729 00:00.00 2
type rows_prod est_rows time est_cost
-------------------------------------------------
nljoin 876 1 00:00.00 254
type table rows_prod est_rows rows_scan time est_cost
-------------------------------------------------------------------
scan t3 1752 9862 876 00:00.00 1
type rows_prod est_rows time est_cost
-------------------------------------------------
nljoin 876 1 00:00.01 254
type rows_sort est_rows rows_cons time est_cost
------------------------------------------------------------
sort 876 1 876 00:00.01 0
Can anybody help analyze the explain file and explain the reason for the different output?
Thanks

interrogate a Ruby array of hashes

I need to group people by age in Ruby. I have their date of birth, and a method which returns their age in years. So a solution like this works.
case
when (0..15).cover?(age_years)
  'child'
when (16..24).cover?(age_years)
  '16 to 24'
when (25..34).cover?(age_years)
  '25 to 34'
when (35..44).cover?(age_years)
  '35 to 44'
when (45..54).cover?(age_years)
  '45 to 54'
when (55..64).cover?(age_years)
  '55 to 64'
when age_years > 64
  'really old'
else
  'unknown'
end
However, I am trying to learn Ruby and am looking for a more elegant solution. I thought about putting the age_ranges into an array of hashes like this...
age_ranges = [{ name: 'child', min_age: 0, max_age: 15 },
              { name: '16 to 24', min_age: 16, max_age: 24 }]
but I am at a loss as to how to interrogate this data to return the correct name when age_years falls within the appropriate range. I could even use a range, like this:
age_ranges = [{ name: 'child', age_range: '0..15' },
              { name: '16 to 24', age_range: '16..24' }]
which looks neater, but I have no idea whether I have written gibberish, as I don't know how to extract the name when the age in years matches.
Can someone point me in the right direction?
Now that you have a map of age names and ranges (note I used a range, not a string, as the value of age_range), you want to search the age_ranges array of hashes for the entry whose age_range value includes the age:
def age_ranges
  [
    { name: 'child', age_range: 0..15 },
    { name: '16 to 24', age_range: 16..24 }
  ]
end

def find_age(age)
  age_ranges.find { |hash| hash[:age_range].include?(age) }[:name]
end

find_age(12)
#=> "child"
find_age(17)
#=> "16 to 24"
Note that [:name] will fail if find returns nil (meaning no match was found).
To overcome this, either add an infinite range as the last one in the array (I'd prefer this one, because it is simpler):
def age_ranges
  [
    { name: 'child', age_range: 0..15 },
    { name: '16 to 24', age_range: 16..24 },
    { name: 'unknown', age_range: 25..Float::INFINITY }
  ]
end
Or handle it while fetching the age in the find_age method:
def find_age(age)
  age_ranges.each_with_object('unknown') { |hash, _| break hash[:name] if hash[:age_range].include?(age) }
end
Also, make sure to handle the negative numbers passed to the method (since age_ranges do not cover negatives):
def find_age(age)
  return 'Age can not be less than 0' if age.negative?
  age_ranges.find { |hash| hash[:age_range].include?(age) }[:name]
end
P.S. After all these "note/make sure" I want to say that #mudasobwa's answer is the simplest way to go about it :)
Use Range#=== (triple equals) directly, as it is supposed to be used:
case age_years
when 0..15 then 'child'
when 16..24 then '16 to 24'
when 25..34 then '25 to 34'
when 35..44 then '35 to 44'
when 45..54 then '45 to 54'
when 55..64 then '55 to 64'
when 64..Float::INFINITY then 'really old' # or when 64.method(:<).to_proc
else 'unknown'
end
To make the case statement accept floats, one should use triple-dot ranges:
case age_years
when 0...16 then 'child'
when 16...25 then '16 to 24'
when 25...35 then '25 to 34'
when 35...45 then '35 to 44'
when 45...55 then '45 to 54'
when 55...64 then '55 to 64'
when 64..Float::INFINITY then 'really old' # or when 64.method(:<).to_proc
else 'unknown'
end
Here's how I'd do it, to avoid code repetition between 16 and 64:
def age_range(age, offset=4, span=10, lowest_age=16)
  i = ((age-offset-1)/span).to_i
  min = [i*span+offset+1, lowest_age].max
  max = (i+1)*span + offset
  "#{min} to #{max}"
end

def age_description(age)
  case age
  when 0...16 then 'child'
  when 16..64 then age_range(age)
  when 64..999 then 'really old'
  else 'unknown'
  end
end

(0..99).each do |age|
  puts "%s (%s)" % [age_description(age), age]
end
It outputs:
child (0)
child (1)
child (2)
child (3)
child (4)
child (5)
child (6)
child (7)
child (8)
child (9)
child (10)
child (11)
child (12)
child (13)
child (14)
child (15)
16 to 24 (16)
16 to 24 (17)
16 to 24 (18)
16 to 24 (19)
16 to 24 (20)
16 to 24 (21)
16 to 24 (22)
16 to 24 (23)
16 to 24 (24)
25 to 34 (25)
25 to 34 (26)
25 to 34 (27)
25 to 34 (28)
25 to 34 (29)
25 to 34 (30)
25 to 34 (31)
25 to 34 (32)
25 to 34 (33)
25 to 34 (34)
35 to 44 (35)
...
As a bonus, it also works with Floats (e.g. 15.9 and 16.0).

Dask: ValueError: Integer column has NA values

I tried to use dask and found something that appears to be a bug in dask.dataframe.read_csv.
import os
import dask.dataframe as dd

types = {'id': 'int16', 'Semana': 'uint8', 'Agencia_ID': 'uint16', 'Canal_ID': 'uint8',
         'Ruta_SAK': 'uint16', 'Cliente_ID': 'float32', 'Producto_ID': 'float32'}
name_map = {'Semana': 'week', 'Agencia_ID': 'agency', 'Canal_ID': 'channel',
            'Ruta_SAK': 'route', 'Cliente_ID': 'client', 'Producto_ID': 'prod'}

test = dd.read_csv(os.path.join(datadir, 'test.csv'), usecols=types.keys(), dtype=types)
test = test.rename(columns=name_map)
gives:
ValueError: Integer column has NA values in column 1
However, the same pandas read_csv operation completes fine and does not yield any NA:
import os
import pandas as pd

types = {'id': 'int16', 'Semana': 'uint8', 'Agencia_ID': 'uint16', 'Canal_ID': 'uint8',
         'Ruta_SAK': 'uint16', 'Cliente_ID': 'float32', 'Producto_ID': 'float32'}
name_map = {'Semana': 'week', 'Agencia_ID': 'agency', 'Canal_ID': 'channel',
            'Ruta_SAK': 'route', 'Cliente_ID': 'client', 'Producto_ID': 'prod'}

test = pd.read_csv(os.path.join(datadir, 'test.csv'), usecols=types.keys(), dtype=types)
test = test.rename(columns=name_map)
test.isnull().any()
id False
week False
agency False
channel False
route False
client False
prod False
dtype: bool
Should I consider this to be an established bug and raise a JIRA for it?
Full traceback:
ValueError Traceback (most recent call last)
in ()
4 'Ruta_SAK': 'route', 'Cliente_ID': 'client', 'Producto_ID': 'prod'}
5
----> 6 test = dd.read_csv(os.path.join(datadir, 'test.csv'), usecols=types.keys(), dtype=types)
7 test = test.rename(columns=name_map)
D:\PROGLANG\Anaconda2\lib\site-packages\dask\dataframe\csv.pyc in read_csv(filename, blocksize, chunkbytes, collection, lineterminator, compression, sample, enforce, storage_options, **kwargs)
195 else:
196 header = sample.split(b_lineterminator)[0] + b_lineterminator
--> 197 head = pd.read_csv(BytesIO(sample), **kwargs)
198
199 df = read_csv_from_bytes(values, header, head, kwargs,
D:\PROGLANG\Anaconda2\lib\site-packages\pandas\io\parsers.pyc in parser_f(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, escapechar, comment, encoding, dialect, tupleize_cols, error_bad_lines, warn_bad_lines, skip_footer, doublequote, delim_whitespace, as_recarray, compact_ints, use_unsigned, low_memory, buffer_lines, memory_map, float_precision)
560 skip_blank_lines=skip_blank_lines)
561
--> 562 return _read(filepath_or_buffer, kwds)
563
564 parser_f.name = name
D:\PROGLANG\Anaconda2\lib\site-packages\pandas\io\parsers.pyc in _read(filepath_or_buffer, kwds)
323 return parser
324
--> 325 return parser.read()
326
327 _parser_defaults = {
D:\PROGLANG\Anaconda2\lib\site-packages\pandas\io\parsers.pyc in read(self, nrows)
813 raise ValueError('skip_footer not supported for iteration')
814
--> 815 ret = self._engine.read(nrows)
816
817 if self.options.get('as_recarray'):
D:\PROGLANG\Anaconda2\lib\site-packages\pandas\io\parsers.pyc in read(self, nrows)
1312 def read(self, nrows=None):
1313 try:
-> 1314 data = self._reader.read(nrows)
1315 except StopIteration:
1316 if self._first_chunk:
pandas\parser.pyx in pandas.parser.TextReader.read (pandas\parser.c:8748)()
pandas\parser.pyx in pandas.parser.TextReader._read_low_memory (pandas\parser.c:9003)()
pandas\parser.pyx in pandas.parser.TextReader._read_rows (pandas\parser.c:10022)()
pandas\parser.pyx in pandas.parser.TextReader._convert_column_data (pandas\parser.c:11397)()
pandas\parser.pyx in pandas.parser.TextReader._convert_tokens (pandas\parser.c:12093)()
pandas\parser.pyx in pandas.parser.TextReader._convert_with_dtype (pandas\parser.c:13057)()
ValueError: Integer column has NA values in column 1
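For what it's worth, the traceback shows the failure inside head = pd.read_csv(BytesIO(sample), **kwargs), i.e. while pandas parses the small byte sample that dask uses to build its meta DataFrame, not while reading the full file. A minimal, untested workaround sketch under that assumption (column names copied from the snippet above): read the strict integer columns as floats, which tolerate NaN, and downcast once the data has actually been loaded.

import os
import dask.dataframe as dd

# Read every column with a dtype that tolerates NaN in the sampled bytes,
# then cast back to the intended integer types afterwards.
read_types = {'id': 'float32', 'Semana': 'float32', 'Agencia_ID': 'float32',
              'Canal_ID': 'float32', 'Ruta_SAK': 'float32',
              'Cliente_ID': 'float32', 'Producto_ID': 'float32'}
int_types = {'id': 'int16', 'Semana': 'uint8', 'Agencia_ID': 'uint16',
             'Canal_ID': 'uint8', 'Ruta_SAK': 'uint16'}

test = dd.read_csv(os.path.join(datadir, 'test.csv'),
                   usecols=read_types.keys(), dtype=read_types)
test = test.astype(int_types)  # only safe if the full data really has no missing values

Passing a larger sample to dd.read_csv (one of the parameters visible in the traceback signature) is another knob that is sometimes suggested for sampling-related inference problems.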
