Can't find Avro schema in Divolte and Kafka Docker - avro

I have three Docker containers: Kafka, Divolte, and StreamSets (https://github.com/divolte/docker-divolte), started with docker-compose up. I want to convert the topic messages to Avro files. I created a pipeline in StreamSets and pasted the Avro schema, but got an error:
com.streamsets.pipeline.api.base.OnRecordErrorException: KAFKA_37 - Cannot parse record from message 'divolte::3::0': java.io.IOException: Invalid int encoding
at com.streamsets.pipeline.stage.origin.multikafka.MultiKafkaSource$MultiTopicCallable.createRecord(MultiKafkaSource.java:192)
at com.streamsets.pipeline.stage.origin.multikafka.MultiKafkaSource$MultiTopicCallable.sendBatch(MultiKafkaSource.java:158)
at com.streamsets.pipeline.stage.origin.multikafka.MultiKafkaSource$MultiTopicCallable.call(MultiKafkaSource.java:135)
at com.streamsets.pipeline.stage.origin.multikafka.MultiKafkaSource$MultiTopicCallable.call(MultiKafkaSource.java:79)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Invalid int encoding
I read that the problem is an incorrect Avro schema. Could you tell me where I can find the correct Avro schema for this? I can't find it in the containers or on GitHub.

Looks like it might be in the Divolte GitHub repo, at https://github.com/divolte/divolte-schema/blob/master/src/main/resources/DefaultEventRecord.avsc
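To verify that this schema matches what Divolte actually publishes, you can try decoding one raw message with the Avro library directly. Below is a minimal sketch with some assumptions: DefaultEventRecord.avsc has been saved locally, the raw value bytes of a single Kafka message have been dumped to message.bin, and Divolte publishes each event as a single binary-encoded Avro record without schema-registry framing. If this decodes cleanly, the same schema should work in the StreamSets Kafka origin.

import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.DecoderFactory;

public class DivolteMessageCheck {
    public static void main(String[] args) throws IOException {
        // Parse the schema file saved from the divolte-schema repo
        // (the file names here are assumptions; adjust to your layout).
        Schema schema = new Schema.Parser().parse(new File("DefaultEventRecord.avsc"));

        // Raw value bytes of one Kafka message, dumped beforehand,
        // e.g. with a small consumer program.
        byte[] value = Files.readAllBytes(Paths.get("message.bin"));

        // Decode as a single schemaless binary Avro record.
        GenericDatumReader<GenericRecord> reader = new GenericDatumReader<>(schema);
        BinaryDecoder decoder = DecoderFactory.get().binaryDecoder(value, null);
        GenericRecord event = reader.read(null, decoder);
        System.out.println(event);
    }
}

If this throws the same "Invalid int encoding", the schema and the on-wire bytes genuinely disagree; if it succeeds, the problem is more likely in how the StreamSets origin is configured.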

Related

Unable to access file .vmtx (Jenkins plugin vSphere Cloud)

I am using Jenkins version 2.236 (but also tried 2.263.1), vSphere Plugin version 2.24, and vSphere Client version 6.7.0.
When I start my job and go to the Jenkins logs, I see the error below.
The .vmtx file is available in the vSphere client (I can download it, delete it, and move it by hand with no problem), and I can also clone a VM from the template manually, but the plugin can't get access to it and fails to clone it, with this error:
Please help, how do I avoid this problem?
Unexpected exception encountered while provisioning agent dynamic3fofguaybl6yiwkjfe7jk5zm9
com.vmware.vim25.CannotAccessVmConfig
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at java.lang.Class.newInstance(Class.java:442)
at com.vmware.vim25.ws.XmlGenDom.fromXml(XmlGenDom.java:253)
at com.vmware.vim25.ws.XmlGenDom.fromXml(XmlGenDom.java:363)
at com.vmware.vim25.ws.XmlGenDom.fromXml(XmlGenDom.java:363)
at com.vmware.vim25.ws.XmlGenDom.fromXml(XmlGenDom.java:363)
at com.vmware.vim25.ws.XmlGenDom.fromXml(XmlGenDom.java:356)
at com.vmware.vim25.ws.XmlGenDom.fromXML(XmlGenDom.java:233)
at com.vmware.vim25.ws.XmlGenDom.fromXML(XmlGenDom.java:124)
at com.vmware.vim25.ws.SoapClient.unMarshall(SoapClient.java:253)
at com.vmware.vim25.ws.WSClient.invoke(WSClient.java:96)
at com.vmware.vim25.ws.VimStub.retrieveProperties(VimStub.java:106)
at com.vmware.vim25.mo.PropertyCollector.retrieveProperties(PropertyCollector.java:98)
at com.vmware.vim25.mo.ManagedObject.retrieveObjectProperties(ManagedObject.java:146)
at com.vmware.vim25.mo.ManagedObject.getCurrentProperty(ManagedObject.java:167)
at com.vmware.vim25.mo.Task.getTaskInfo(Task.java:51)
Caused: org.jenkinsci.plugins.vsphere.tools.VSphereException: vSphere Error: Couldn't clone "ubuntu-18.04-dynamic-node-for-test". Clone task ended with status error.
Unable to access the virtual machine configuration: Unable to access file [esxi-datastore_nvme] ubuntu-18.04-dynamic-node-for-test/ubuntu-18.04-dynamic-node-for-test.vmtx
at org.jenkinsci.plugins.vsphere.tools.VSphere.newVSphereException(VSphere.java:1151)
at org.jenkinsci.plugins.vsphere.tools.VSphere.cloneOrDeployVm(VSphere.java:287)
at org.jenkinsci.plugins.vSphereCloudSlaveTemplate.provision(vSphereCloudSlaveTemplate.java:428)
at org.jenkinsci.plugins.vSphereCloudSlaveTemplate.provision(vSphereCloudSlaveTemplate.java:403)
at org.jenkinsci.plugins.vSphereCloud$VSpherePlannedNode.provisionNewNode(vSphereCloud.java:534)
at org.jenkinsci.plugins.vSphereCloud$VSpherePlannedNode.access$100(vSphereCloud.java:496)
at org.jenkinsci.plugins.vSphereCloud$VSpherePlannedNode$1.call(vSphereCloud.java:510)
at org.jenkinsci.plugins.vSphereCloud$VSpherePlannedNode$1.call(vSphereCloud.java:506)
at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:71)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

Cassandra Digest Mismatch Error

I see the following message in Cassandra's debug.log frequently and sometimes before losing nodes in the cluster. Any ideas on what the message means, and how to fix the underlying issue?
DEBUG [ReadRepairStage:9346] 2017-11-06 22:29:46,135 ReadCallback.java:242 - Digest mismatch:
org.apache.cassandra.service.DigestMismatchException: Mismatch for key DecoratedKey(-8713145541289520569, 00114c65616465722f6d61737465722f352e3100000364633100) (408c7e13eea38efc9429366038cbe4a3 vs 8ce8acece0966903ac590d3229099398)
at org.apache.cassandra.service.DigestResolver.compareResponses(DigestResolver.java:92) ~[cassandra-all-3.11.0.1900.jar:3.11.0.1900]
at org.apache.cassandra.service.ReadCallback$AsyncRepairRunner.run(ReadCallback.java:233) ~[cassandra-all-3.11.0.1900.jar:3.11.0.1900]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_151]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_151]
at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81) [cassandra-all-3.11.0.1900.jar:3.11.0.1900]
at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_151]
Here are the details of the Cassandra cluster:
4 node cluster
Each is an AWS instance of type m4.2xlarge
Each has an io1 volume with 20000 IOPS
All on same VPC, with 10.0.0.x private IP addresses
DataStax Enterprise Server 5.1.5
I think these are harmless messages from read repair noticing different data on different nodes, and probably not the cause of your nodes going down. See the more detailed answer to this question from last year: Datastax Mismatch for Key Issue

How to handle "an unhandled error caused by the Dataflow SDK" (corrupted gz as input)

Is there a way to deal with "an unhandled error caused by the Dataflow SDK"?
Specifically, we have a Dataflow job that takes a list of gz files (in GCS) as input, and produces some output.
Once in a while one of the gz files may be corrupted, and the job fails because of it.
We are wondering if there is a way to handle this -- specifically, we want to make it so that the job would ignore such corrupted file(s) and proceed.
It is not clear if we can catch an exception thrown due to the corrupted gz file or not (because it appears that it is handled in Dataflow SDK itself, causing it to fail).
(For Google Dataflow team: Here is a specific dataflow job id: 2017-04-02_05_08_20-5491890758767473661.)
Update: Here's the stack trace we got from the logging UI.
(778029c78ed61ff2): java.io.IOException: Failed to advance reader of source: StaticValueProvider{value=gs://aaa.gz} range [0, 9223372036854775807)
at com.google.cloud.dataflow.sdk.runners.worker.WorkerCustomSources$BoundedReaderIterator.advance(WorkerCustomSources.java:544)
at com.google.cloud.dataflow.sdk.util.common.worker.ReadOperation$SynchronizedReaderIterator.advance(ReadOperation.java:425)
at com.google.cloud.dataflow.sdk.util.common.worker.ReadOperation.runReadLoop(ReadOperation.java:217)
at com.google.cloud.dataflow.sdk.util.common.worker.ReadOperation.start(ReadOperation.java:182)
at com.google.cloud.dataflow.sdk.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:69)
at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorker.executeWork(DataflowWorker.java:284)
at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorker.doWork(DataflowWorker.java:220)
at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorker.getAndPerformWork(DataflowWorker.java:170)
at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerHarness$WorkerThread.doWork(DataflowWorkerHarness.java:192)
at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerHarness$WorkerThread.call(DataflowWorkerHarness.java:172)
at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerHarness$WorkerThread.call(DataflowWorkerHarness.java:159)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.EOFException
at org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream.read(GzipCompressorInputStream.java:278)
at java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385)
at com.google.cloud.dataflow.sdk.io.TextIO$TextSource$TextBasedReader.tryToEnsureNumberOfBytesInBuffer(TextIO.java:1077)
at com.google.cloud.dataflow.sdk.io.TextIO$TextSource$TextBasedReader.findSeparatorBounds(TextIO.java:1011)
at com.google.cloud.dataflow.sdk.io.TextIO$TextSource$TextBasedReader.readNextRecord(TextIO.java:1043)
at com.google.cloud.dataflow.sdk.io.CompressedSource$CompressedReader.readNextRecord(CompressedSource.java:482)
at com.google.cloud.dataflow.sdk.io.FileBasedSource$FileBasedReader.advanceImpl(FileBasedSource.java:536)
at com.google.cloud.dataflow.sdk.io.OffsetBasedSource$OffsetBasedReader.advance(OffsetBasedSource.java:287)
at com.google.cloud.dataflow.sdk.runners.worker.WorkerCustomSources$BoundedReaderIterator.advance(WorkerCustomSources.java:541)
... 14 more
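Since the SDK handles the read loop itself here, one workaround (a sketch, not an official SDK feature) is to pre-validate every archive before launching the job and drop the corrupted ones from the input list. The snippet below assumes the files are staged somewhere java.nio can reach them; for objects in GCS you would stream through the google-cloud-storage client instead.

import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.GZIPInputStream;
import java.util.zip.ZipException;

public class GzPreflight {
    // Returns true only if the whole archive decompresses without error.
    static boolean isReadable(Path gz) {
        byte[] buf = new byte[64 * 1024];
        try (InputStream in = new GZIPInputStream(Files.newInputStream(gz))) {
            while (in.read(buf) != -1) {
                // Drain the stream; a truncated file throws EOFException here.
            }
            return true;
        } catch (EOFException | ZipException e) {
            return false; // truncated or not a valid gzip archive
        } catch (IOException e) {
            return false; // other I/O failure: treat as unreadable
        }
    }

    public static void main(String[] args) {
        Path file = Paths.get(args[0]);
        System.out.println(file + (isReadable(file) ? ": OK" : ": corrupted, skip"));
    }
}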

Jenkins download plugin analysis-core

I am trying to download the Static Analysis Utilities plugin on Jenkins v1.618. I keep getting a failure. I have tried several times since yesterday, but got the same result. The exception is as follows:
hudson.util.IOException2: Failed to download from http://updates.jenkins-ci.org/download/plugins/analysis-core/1.72/analysis-core.hpi (redirected to: http://ftp-nyc.osuosl.org/pub/jenkins/plugins/analysis-core/1.72/analysis-core.hpi)
at hudson.model.UpdateCenter$UpdateCenterConfiguration.download(UpdateCenter.java:797)
at hudson.model.UpdateCenter$DownloadJob._run(UpdateCenter.java:1148)
at hudson.model.UpdateCenter$InstallationJob._run(UpdateCenter.java:1309)
at hudson.model.UpdateCenter$DownloadJob.run(UpdateCenter.java:1126)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at hudson.remoting.AtmostOneThreadExecutor$Worker.run(AtmostOneThreadExecutor.java:110)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Inconsistent file length: expected 3186580 but only got 6836
at hudson.model.UpdateCenter$UpdateCenterConfiguration.download(UpdateCenter.java:784)
... 7 more
Any suggestions?

Flume twitter config error

I am trying to extract Twitter data using Flume, but I am getting the following error:
15/04/08 23:16:36 ERROR node.PollingPropertiesFileConfigurationProvider: Unhandled error
java.lang.NoSuchMethodError: twitter4j.conf.Configuration.isStallWarningsEnabled()Z
at twitter4j.TwitterStreamImpl.<init>(TwitterStreamImpl.java:60)
at twitter4j.TwitterStreamFactory.<clinit>(TwitterStreamFactory.java:40)
at com.cloudera.flume.source.TwitterSource.<init>(TwitterSource.java:64)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
at java.lang.Class.newInstance(Class.java:433)
at org.apache.flume.source.DefaultSourceFactory.create(DefaultSourceFactory.java:42)
at org.apache.flume.node.AbstractConfigurationProvider.loadSources(AbstractConfigurationProvider.java:327)
at org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:102)
at org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:140)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
I used the flume-sources-1.0-SNAPSHOT.jar from Cloudera. The TwitterAgent runs, but with the above-mentioned error.
Is there any workaround for it?
Thanks in advance.
This is a dependency error: the flume-sources library expects a version of Twitter4j that isn't present, hence the NoSuchMethodError. I would suggest pulling the right versions, which would be
1.6.0-SNAPSHOT for the Twitter source and 3.0.3 for Twitter4j. You should consult Flume's pom.xml, which has all the version information you need.
Note that you should use the most current versions possible, as old implementations will not work; Twitter has broken its old APIs in the meantime.
Hope this helps.
This is an issue with the fully qualified class name in your Agent.conf file.
In older versions, the class name is com.cloudera.flume.source.TwitterSource.
In the latest version of Flume, the TwitterSource ships with Flume itself and there is no need to download it separately.
The class name has changed to org.apache.flume.source.twitter.TwitterSource.
Change the class name carefully and it will work for you; see the config sketch below.
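For reference, here is a minimal Agent.conf sketch using the new class name. This is a hedged example, not taken from the question: the agent name TwitterAgent, the channel name MemChannel, and the credential placeholders are all assumptions, and the sink configuration is omitted; consult the Flume user guide for the full set of TwitterSource properties.

# Hypothetical agent named TwitterAgent with a single in-memory channel.
TwitterAgent.sources = Twitter
TwitterAgent.channels = MemChannel
TwitterAgent.sources.Twitter.type = org.apache.flume.source.twitter.TwitterSource
TwitterAgent.sources.Twitter.channels = MemChannel
# OAuth credentials are placeholders; supply your own application keys.
TwitterAgent.sources.Twitter.consumerKey = <consumer-key>
TwitterAgent.sources.Twitter.consumerSecret = <consumer-secret>
TwitterAgent.sources.Twitter.accessToken = <access-token>
TwitterAgent.sources.Twitter.accessTokenSecret = <access-token-secret>
TwitterAgent.channels.MemChannel.type = memory
# Sink configuration (e.g. an HDFS sink) omitted for brevity.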
