Google Cloud streaming Dataflow: Error while fetching side input

Sometimes I get the below exception while running a streaming Dataflow job:
java.lang.RuntimeException: Exception while fetching side input:
at com.google.cloud.dataflow.sdk.runners.worker.StateFetcher.fetchSideInput(StateFetcher.java:184)
at com.google.cloud.dataflow.sdk.runners.worker.StreamingModeExecutionContext.fetchSideInput(StreamingModeExecutionContext.java:175)
at com.google.cloud.dataflow.sdk.runners.worker.StreamingModeExecutionContext.access$400(StreamingModeExecutionContext.java:56)
at com.google.cloud.dataflow.sdk.runners.worker.StreamingModeExecutionContext$StepContext.issueSideInputFetch(StreamingModeExecutionContext.java:401)
at com.google.cloud.dataflow.sdk.runners.worker.StreamingSideInputFetcher.getReadyWindows(StreamingSideInputFetcher.java:135)
at com.google.cloud.dataflow.sdk.runners.worker.StreamingSideInputDoFnRunner.startBundle(StreamingSideInputDoFnRunner.java:49)
at com.google.cloud.dataflow.sdk.runners.worker.SimpleParDoFn.reallyStartBundle(SimpleParDoFn.java:175)
at com.google.cloud.dataflow.sdk.runners.worker.SimpleParDoFn.startBundle(SimpleParDoFn.java:117)
at com.google.cloud.dataflow.sdk.runners.worker.ForwardingParDoFn.startBundle(ForwardingParDoFn.java:36)
at com.google.cloud.dataflow.sdk.util.common.worker.ParDoOperation.start(ParDoOperation.java:45)
at com.google.cloud.dataflow.sdk.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:69)
at com.google.cloud.dataflow.sdk.runners.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:719)
at com.google.cloud.dataflow.sdk.runners.worker.StreamingDataflowWorker.access$600(StreamingDataflowWorker.java:95)
at com.google.cloud.dataflow.sdk.runners.worker.StreamingDataflowWorker$6.run(StreamingDataflowWorker.java:538)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: com.google.cloud.dataflow.worker.repackaged.com.google.common.util.concurrent.UncheckedExecutionException: java.lang.IllegalArgumentException: Duplicate values for 2059
at com.google.cloud.dataflow.worker.repackaged.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2207)
at com.google.cloud.dataflow.worker.repackaged.com.google.common.cache.LocalCache.get(LocalCache.java:3953)
at com.google.cloud.dataflow.worker.repackaged.com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4790)
at com.google.cloud.dataflow.sdk.runners.worker.StateFetcher.fetchSideInput(StateFetcher.java:175)
... 16 more
Caused by: java.lang.IllegalArgumentException: Duplicate values for 2059
at com.google.cloud.dataflow.sdk.util.PCollectionViews$MapPCollectionView.fromElements(PCollectionViews.java:291)
at com.google.cloud.dataflow.sdk.util.PCollectionViews$MapPCollectionView.fromElements(PCollectionViews.java:273)
at com.google.cloud.dataflow.sdk.util.PCollectionViews$PCollectionViewBase.fromIterableInternal(PCollectionViews.java:368)
at com.google.cloud.dataflow.sdk.runners.worker.StateFetcher$2.call(StateFetcher.java:152)
at com.google.cloud.dataflow.sdk.runners.worker.StateFetcher$2.call(StateFetcher.java:104)
at com.google.cloud.dataflow.worker.repackaged.com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4793)
at com.google.cloud.dataflow.worker.repackaged.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3542)
at com.google.cloud.dataflow.worker.repackaged.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2323)
at com.google.cloud.dataflow.worker.repackaged.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2286)
at com.google.cloud.dataflow.worker.repackaged.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2201)
... 19 more
Dataflow worker machine type: n1-standard-4
Worker cache memory (MB): 2048
Dataflow main input: Pub/Sub subscription
I am creating the side input from BT and passing it to multiple transformations. The size of my side input is less than 100 MB.
Thanks.

That error indicates that multiple values with the same key (2059) have been encountered, which violates the expectations for a Map-valued side input. This can happen in streaming, especially if you trigger the same value multiple times. If you instead use a Multimap, it should allow you to retrieve all of the values associated with a given key.
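In the Dataflow SDK the question is using, that means swapping View.asMap() for View.asMultimap() when building the side input. A minimal sketch, with hypothetical key/value types since the original pipeline code isn't shown:

import java.util.Map;
import com.google.cloud.dataflow.sdk.transforms.View;
import com.google.cloud.dataflow.sdk.values.KV;
import com.google.cloud.dataflow.sdk.values.PCollection;
import com.google.cloud.dataflow.sdk.values.PCollectionView;

// View.asMap() requires exactly one value per key per window, so firing
// key 2059 twice produces "Duplicate values for 2059". View.asMultimap()
// keeps every value fired for a key instead.
static PCollectionView<Map<Integer, Iterable<String>>> buildSideInput(
    PCollection<KV<Integer, String>> kvs) {
  return kvs.apply(View.<Integer, String>asMultimap());
}

In a DoFn that consumes the view, c.sideInput(view).get(key) then returns an Iterable of all values seen for that key rather than failing on duplicates.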

Related

neo4j admin import: error in import: Requested index -1, but length is 1000000

I have a set of CSVs that I have been able to use with LOAD CSV to create a database. This set is the small version (1 GB) of a much larger data set (120 GB) I intend to load into neo4j using admin import. I am trying to run the admin import on the smaller dataset first, since I have already successfully created a graph with that data; I assume that if I can get the admin import to run for the small version, it will hopefully run without problems for the large dataset. I've read through the admin import instructions and I've set up header files. The import loads the nodes just fine but ends up failing with the relationship files. Can anyone help me understand what is happening here so that I can figure out how to fix it? I've tried just removing the file and its associated nodes, but this only results in the same error being thrown from the next file in the relationships list.
IMPORT FAILED in 9s 121ms.
Data statistics is not available.
Peak memory usage: 1.015GiB
Error in input data
Caused by:ERROR in input
data source: BufferedCharSeeker[source:/var/lib/neo4j/import/rel_cchg_dimcchg.csv, position:3861455, line:77614]
in field: :START_ID(cchg-ID):1
for header: [:START_ID(cchg-ID), :END_ID(dim_cchg-ID), :TYPE]
raw field value: 106715432018-09-010.01.00.0
original error: Requested index -1, but length is 1000000
org.neo4j.internal.batchimport.input.InputException: ERROR in input
data source: BufferedCharSeeker[source:/var/lib/neo4j/import/rel_cchg_dimcchg.csv, position:3861455, line:77614]
in field: :START_ID(cchg-ID):1
for header: [:START_ID(cchg-ID), :END_ID(dim_cchg-ID), :TYPE]
raw field value: 106715432018-09-010.01.00.0
original error: Requested index -1, but length is 1000000
at org.neo4j.internal.batchimport.input.csv.CsvInputParser.next(CsvInputParser.java:234)
at org.neo4j.internal.batchimport.input.csv.LazyCsvInputChunk.next(LazyCsvInputChunk.java:98)
at org.neo4j.internal.batchimport.input.csv.CsvInputChunkProxy.next(CsvInputChunkProxy.java:75)
at org.neo4j.internal.batchimport.ExhaustingEntityImporterRunnable.run(ExhaustingEntityImporterRunnable.java:57)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
at org.neo4j.internal.helpers.NamedThreadFactory$2.run(NamedThreadFactory.java:110)
Caused by: java.lang.ArrayIndexOutOfBoundsException: Requested index -1, but length is 1000000
at org.neo4j.internal.batchimport.cache.OffHeapRegularNumberArray.addressOf(OffHeapRegularNumberArray.java:42)
at org.neo4j.internal.batchimport.cache.OffHeapLongArray.get(OffHeapLongArray.java:43)
at org.neo4j.internal.batchimport.cache.DynamicLongArray.get(DynamicLongArray.java:46)
at org.neo4j.internal.batchimport.cache.idmapping.string.EncodingIdMapper.dataValue(EncodingIdMapper.java:767)
at org.neo4j.internal.batchimport.cache.idmapping.string.EncodingIdMapper.findFromEIdRange(EncodingIdMapper.java:802)
at org.neo4j.internal.batchimport.cache.idmapping.string.EncodingIdMapper.binarySearch(EncodingIdMapper.java:750)
at org.neo4j.internal.batchimport.cache.idmapping.string.EncodingIdMapper.binarySearch(EncodingIdMapper.java:305)
at org.neo4j.internal.batchimport.cache.idmapping.string.EncodingIdMapper.get(EncodingIdMapper.java:205)
at org.neo4j.internal.batchimport.RelationshipImporter.nodeId(RelationshipImporter.java:134)
at org.neo4j.internal.batchimport.RelationshipImporter.startId(RelationshipImporter.java:109)
at org.neo4j.internal.batchimport.input.InputEntityVisitor$Delegate.startId(InputEntityVisitor.java:228)
at org.neo4j.internal.batchimport.input.csv.CsvInputParser.next(CsvInputParser.java:117)
... 9 more
The error is actually quite explicit: have a look at line 77614 in rel_cchg_dimcchg.csv. It's usually caused by an incorrect endpoint id. For example, if the END_ID is supposed to be a number but is something like 4171;4172;4173;4174;4175;4176, this will raise the InputException error.
One would assume that --skip-bad-relationships would ignore these issues, but it doesn't. So the only remedy is to ensure that all START_ID/END_IDs are correct (i.e. the right data type and format).
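A raw field value like 106715432018-09-010.01.00.0 looks like several columns run together, which usually means a delimiter or quoting problem on that row. As a rough pre-flight check, this sketch (assuming comma delimiters and no quoted fields, which won't hold for every CSV) flags rows whose field count differs from the header before re-running the import:

import java.io.BufferedReader;
import java.nio.file.Files;
import java.nio.file.Paths;

public class RelationshipCsvCheck {
    public static void main(String[] args) throws Exception {
        try (BufferedReader r = Files.newBufferedReader(
                Paths.get("/var/lib/neo4j/import/rel_cchg_dimcchg.csv"))) {
            int expected = r.readLine().split(",", -1).length; // header width
            String line;
            for (int n = 2; (line = r.readLine()) != null; n++) {
                int got = line.split(",", -1).length;
                if (got != expected) {
                    System.out.printf("line %d: expected %d fields, got %d%n", n, expected, got);
                }
            }
        }
    }
}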

Avro "not open" exception when writing generic records using Apache Beam

I am using AvroIO.<MyCustomType>writeCustomTypeToGenericRecords() to write generic records to GCS inside a streaming Dataflow job. For the first few minutes all seems to work fine; however, after around 10 minutes, the job starts throwing the following error:
java.lang.RuntimeException: org.apache.beam.sdk.util.UserCodeException: org.apache.avro.AvroRuntimeException: not open
com.google.cloud.dataflow.worker.GroupAlsoByWindowsParDoFn$1.output(GroupAlsoByWindowsParDoFn.java:183)
com.google.cloud.dataflow.worker.GroupAlsoByWindowFnRunner$1.outputWindowedValue(GroupAlsoByWindowFnRunner.java:102)
org.apache.beam.runners.core.ReduceFnRunner.lambda$onTrigger$1(ReduceFnRunner.java:1057)
org.apache.beam.runners.core.ReduceFnContextFactory$OnTriggerContextImpl.output(ReduceFnContextFactory.java:438)
org.apache.beam.runners.core.SystemReduceFn.onTrigger(SystemReduceFn.java:125)
org.apache.beam.runners.core.ReduceFnRunner.onTrigger(ReduceFnRunner.java:1060)
org.apache.beam.runners.core.ReduceFnRunner.onTimers(ReduceFnRunner.java:768)
com.google.cloud.dataflow.worker.StreamingGroupAlsoByWindowViaWindowSetFn.processElement(StreamingGroupAlsoByWindowViaWindowSetFn.java:95)
com.google.cloud.dataflow.worker.StreamingGroupAlsoByWindowViaWindowSetFn.processElement(StreamingGroupAlsoByWindowViaWindowSetFn.java:42)
com.google.cloud.dataflow.worker.GroupAlsoByWindowFnRunner.invokeProcessElement(GroupAlsoByWindowFnRunner.java:115)
com.google.cloud.dataflow.worker.GroupAlsoByWindowFnRunner.processElement(GroupAlsoByWindowFnRunner.java:73)
org.apache.beam.runners.core.LateDataDroppingDoFnRunner.processElement(LateDataDroppingDoFnRunner.java:80)
com.google.cloud.dataflow.worker.GroupAlsoByWindowsParDoFn.processElement(GroupAlsoByWindowsParDoFn.java:133)
com.google.cloud.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:43)
com.google.cloud.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:48)
com.google.cloud.dataflow.worker.util.common.worker.ReadOperation.runReadLoop(ReadOperation.java:200)
com.google.cloud.dataflow.worker.util.common.worker.ReadOperation.start(ReadOperation.java:158)
com.google.cloud.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:75)
com.google.cloud.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1227)
com.google.cloud.dataflow.worker.StreamingDataflowWorker.access$1000(StreamingDataflowWorker.java:136)
com.google.cloud.dataflow.worker.StreamingDataflowWorker$6.run(StreamingDataflowWorker.java:966)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.beam.sdk.util.UserCodeException: org.apache.avro.AvroRuntimeException: not open
org.apache.beam.sdk.util.UserCodeException.wrap(UserCodeException.java:34)
org.apache.beam.sdk.io.WriteFiles$WriteShardsIntoTempFilesFn$DoFnInvoker.invokeProcessElement(Unknown Source)
org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:275)
org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:237)
com.google.cloud.dataflow.worker.StreamingSideInputDoFnRunner.processElement(StreamingSideInputDoFnRunner.java:72)
com.google.cloud.dataflow.worker.SimpleParDoFn.processElement(SimpleParDoFn.java:324)
com.google.cloud.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:43)
com.google.cloud.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:48)
com.google.cloud.dataflow.worker.GroupAlsoByWindowsParDoFn$1.output(GroupAlsoByWindowsParDoFn.java:181)
com.google.cloud.dataflow.worker.GroupAlsoByWindowFnRunner$1.outputWindowedValue(GroupAlsoByWindowFnRunner.java:102)
org.apache.beam.runners.core.ReduceFnRunner.lambda$onTrigger$1(ReduceFnRunner.java:1057)
org.apache.beam.runners.core.ReduceFnContextFactory$OnTriggerContextImpl.output(ReduceFnContextFactory.java:438)
org.apache.beam.runners.core.SystemReduceFn.onTrigger(SystemReduceFn.java:125)
org.apache.beam.runners.core.ReduceFnRunner.onTrigger(ReduceFnRunner.java:1060)
org.apache.beam.runners.core.ReduceFnRunner.onTimers(ReduceFnRunner.java:768)
com.google.cloud.dataflow.worker.StreamingGroupAlsoByWindowViaWindowSetFn.processElement(StreamingGroupAlsoByWindowViaWindowSetFn.java:95)
com.google.cloud.dataflow.worker.StreamingGroupAlsoByWindowViaWindowSetFn.processElement(StreamingGroupAlsoByWindowViaWindowSetFn.java:42)
com.google.cloud.dataflow.worker.GroupAlsoByWindowFnRunner.invokeProcessElement(GroupAlsoByWindowFnRunner.java:115)
com.google.cloud.dataflow.worker.GroupAlsoByWindowFnRunner.processElement(GroupAlsoByWindowFnRunner.java:73)
org.apache.beam.runners.core.LateDataDroppingDoFnRunner.processElement(LateDataDroppingDoFnRunner.java:80)
com.google.cloud.dataflow.worker.GroupAlsoByWindowsParDoFn.processElement(GroupAlsoByWindowsParDoFn.java:133)
com.google.cloud.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:43)
com.google.cloud.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:48)
com.google.cloud.dataflow.worker.util.common.worker.ReadOperation.runReadLoop(ReadOperation.java:200)
com.google.cloud.dataflow.worker.util.common.worker.ReadOperation.start(ReadOperation.java:158)
com.google.cloud.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:75)
com.google.cloud.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1227)
com.google.cloud.dataflow.worker.StreamingDataflowWorker.access$1000(StreamingDataflowWorker.java:136)
com.google.cloud.dataflow.worker.StreamingDataflowWorker$6.run(StreamingDataflowWorker.java:966)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.avro.AvroRuntimeException: not open
org.apache.avro.file.DataFileWriter.assertOpen(DataFileWriter.java:82)
org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:299)
org.apache.beam.sdk.io.AvroSink$AvroWriter.write(AvroSink.java:123)
org.apache.beam.sdk.io.WriteFiles.writeOrClose(WriteFiles.java:550)
org.apache.beam.sdk.io.WriteFiles.access$1000(WriteFiles.java:112)
org.apache.beam.sdk.io.WriteFiles$WriteShardsIntoTempFilesFn.processElement(WriteFiles.java:718)
The Dataflow job continues to run fine though. Just to give some background about the streaming job: it pulls messages from Pub/Sub, creates a fixed window of 5 minutes with a trigger of 10,000 messages (whichever comes first), processes the messages, and finally writes to a GCS bucket, whereby each specific type of message goes to a specific folder based on the message type, using .to(new AvroEventDynamicDestinations(avroBaseDir, schemaView)).
UPDATE 1: Looking at the timestamps of this error, it seems to come up at an exact interval of 10 seconds, so 6 per minute.
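For reference, the windowing described above corresponds roughly to the following Beam setup (the element type and exact trigger are assumptions, since the actual pipeline code isn't shown):

import org.joda.time.Duration;
import org.apache.beam.sdk.transforms.windowing.AfterPane;
import org.apache.beam.sdk.transforms.windowing.AfterWatermark;
import org.apache.beam.sdk.transforms.windowing.FixedWindows;
import org.apache.beam.sdk.transforms.windowing.Window;

static <T> Window<T> fiveMinuteWindows() {
    // 5-minute fixed windows that also fire early once 10,000 elements
    // arrive in a pane, i.e. "whichever comes first".
    return Window.<T>into(FixedWindows.of(Duration.standardMinutes(5)))
        .triggering(AfterWatermark.pastEndOfWindow()
            .withEarlyFirings(AfterPane.elementCountAtLeast(10000)))
        .withAllowedLateness(Duration.ZERO)
        .discardingFiredPanes();
}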
I had exactly the same exception. My problem came from a wrong schema, a null schema to be exact (not found by the schema registry).
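If a null schema is the cause, failing fast at pipeline-construction time is cheaper than hitting "not open" mid-stream. A sketch under that assumption (the output prefix is a hypothetical placeholder):

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericRecord;
import org.apache.beam.sdk.io.AvroIO;
import org.apache.beam.sdk.values.PCollection;

static void writeAvro(PCollection<GenericRecord> records, Schema schema) {
  if (schema == null) {
    // A null schema otherwise surfaces mid-stream as "AvroRuntimeException: not open".
    throw new IllegalStateException("Schema not found in registry");
  }
  records.apply("WriteAvro",
      AvroIO.writeGenericRecords(schema)
          .to("gs://my-bucket/avro/out")  // hypothetical output prefix
          .withWindowedWrites()           // needed for unbounded (streaming) input
          .withNumShards(10)
          .withSuffix(".avro"));
}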

Sonarqube 6.7 - Fail to read ISSUES.LOCATIONS, com.google.protobuf.InvalidProtocolBufferException

I'm having the below issue after upgrading to SonarQube 6.7. I'm using the docker image for SonarQube, and I just ran the migration scripts as suggested by the SonarQube UI.
Can someone suggest a fix?
Thank you
2018.05.15 06:58:42 INFO ce[AWNimASHSOFQrk0D-LC8][o.s.c.t.CeWorkerImpl] Execute task | project=myproject:develop | type=REPORT | id=AWNimASHSOFQrk0D-LC8 | submitter=jenkins
2018.05.15 06:59:52 ERROR ce[AWNimASHSOFQrk0D-LC8][o.s.c.t.CeWorkerImpl] Failed to execute task AWNimASHSOFQrk0D-LC8
org.sonar.server.computation.task.projectanalysis.component.VisitException: Visit of Component {key=myproject:develop,type=PROJECT} failed
at org.sonar.server.computation.task.projectanalysis.component.VisitException.rethrowOrWrap(VisitException.java:44)
at org.sonar.server.computation.task.projectanalysis.component.VisitorsCrawler.visit(VisitorsCrawler.java:74)
at org.sonar.server.computation.task.projectanalysis.step.ExecuteVisitorsStep.execute(ExecuteVisitorsStep.java:51)
at org.sonar.server.computation.task.step.ComputationStepExecutor.executeSteps(ComputationStepExecutor.java:64)
at org.sonar.server.computation.task.step.ComputationStepExecutor.execute(ComputationStepExecutor.java:52)
at org.sonar.server.computation.task.projectanalysis.taskprocessor.ReportTaskProcessor.process(ReportTaskProcessor.java:73)
at org.sonar.ce.taskprocessor.CeWorkerImpl.executeTask(CeWorkerImpl.java:134)
at org.sonar.ce.taskprocessor.CeWorkerImpl.findAndProcessTask(CeWorkerImpl.java:97)
at org.sonar.ce.taskprocessor.CeWorkerImpl.withCustomizedThreadName(CeWorkerImpl.java:81)
at org.sonar.ce.taskprocessor.CeWorkerImpl.call(CeWorkerImpl.java:73)
at org.sonar.ce.taskprocessor.CeWorkerImpl.call(CeWorkerImpl.java:43)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.ibatis.exceptions.PersistenceException:
Error querying database. Cause: java.lang.IllegalStateException: Fail to read ISSUES.LOCATIONS [KEE=AV9aIkjMZ-jmCeoEg-IR]
The error may exist in org.sonar.db.issue.IssueMapper
The error may involve org.sonar.db.issue.IssueMapper.scrollNonClosedByComponentUuid
The error occurred while handling results
SQL: select i.id, i.kee as kee, i.rule_id as ruleId, i.severity as severity, i.manual_severity as manualSeverity, i.message as message, i.line as line, i.locations as locations, i.gap as gap, i.effort as effort, i.status as status, i.resolution as resolution, i.checksum as checksum, i.assignee as assignee, i.author_login as authorLogin, i.tags as tagsString, i.issue_attributes as issueAttributes, i.issue_creation_date as issueCreationTime, i.issue_update_date as issueUpdateTime, i.issue_close_date as issueCloseTime, i.created_at as createdAt, i.updated_at as updatedAt, r.plugin_rule_key as ruleKey, r.plugin_name as ruleRepo, r.language as language, p.kee as componentKey, i.component_uuid as componentUuid, p.module_uuid as moduleUuid, p.module_uuid_path as moduleUuidPath, p.path as filePath, root.kee as projectKey, i.project_uuid as projectUuid, i.issue_type as type from issues i inner join rules r on r.id=i.rule_id inner join projects p on p.uuid=i.component_uuid inner join projects root on root.uuid=i.project_uuid where i.component_uuid = ? and i.status <> 'CLOSED'
Cause: java.lang.IllegalStateException: Fail to read ISSUES.LOCATIONS [KEE=AV9aIkjMZ-jmCeoEg-IR]
at org.apache.ibatis.exceptions.ExceptionFactory.wrapException(ExceptionFactory.java:30)
at org.apache.ibatis.session.defaults.DefaultSqlSession.select(DefaultSqlSession.java:172)
at org.apache.ibatis.session.defaults.DefaultSqlSession.select(DefaultSqlSession.java:158)
at org.apache.ibatis.binding.MapperMethod.executeWithResultHandler(MapperMethod.java:126)
at org.apache.ibatis.binding.MapperMethod.execute(MapperMethod.java:72)
at org.apache.ibatis.binding.MapperProxy.invoke(MapperProxy.java:59)
at com.sun.proxy.$Proxy42.scrollNonClosedByComponentUuid(Unknown Source)
at org.sonar.server.computation.task.projectanalysis.issue.ComponentIssuesLoader.loadForComponentUuid(ComponentIssuesLoader.java:73)
at org.sonar.server.computation.task.projectanalysis.issue.ComponentIssuesLoader.loadForComponentUuid(ComponentIssuesLoader.java:51)
at org.sonar.server.computation.task.projectanalysis.issue.CloseIssuesOnRemovedComponentsVisitor.closeIssuesForDeletedComponentUuids(CloseIssuesOnRemovedComponentsVisitor.java:60)
at org.sonar.server.computation.task.projectanalysis.issue.CloseIssuesOnRemovedComponentsVisitor.visitProject(CloseIssuesOnRemovedComponentsVisitor.java:53)
at org.sonar.server.computation.task.projectanalysis.component.TypeAwareVisitorWrapper.visitProject(TypeAwareVisitorWrapper.java:47)
at org.sonar.server.computation.task.projectanalysis.component.VisitorsCrawler.visitNode(VisitorsCrawler.java:120)
at org.sonar.server.computation.task.projectanalysis.component.VisitorsCrawler.visitImpl(VisitorsCrawler.java:100)
at org.sonar.server.computation.task.projectanalysis.component.VisitorsCrawler.visit(VisitorsCrawler.java:72)
... 17 common frames omitted
Caused by: java.lang.IllegalStateException: Fail to read ISSUES.LOCATIONS [KEE=AV9aIkjMZ-jmCeoEg-IR]
at org.sonar.db.issue.IssueDto.parseLocations(IssueDto.java:652)
at org.sonar.db.issue.IssueDto.toDefaultIssue(IssueDto.java:721)
at org.sonar.server.computation.task.projectanalysis.issue.ComponentIssuesLoader.lambda$loadForComponentUuid$1(ComponentIssuesLoader.java:74)
at org.apache.ibatis.executor.resultset.DefaultResultSetHandler.callResultHandler(DefaultResultSetHandler.java:363)
at org.apache.ibatis.executor.resultset.DefaultResultSetHandler.storeObject(DefaultResultSetHandler.java:356)
at org.apache.ibatis.executor.resultset.DefaultResultSetHandler.handleRowValuesForSimpleResultMap(DefaultResultSetHandler.java:348)
at org.apache.ibatis.executor.resultset.DefaultResultSetHandler.handleRowValues(DefaultResultSetHandler.java:322)
at org.apache.ibatis.executor.resultset.DefaultResultSetHandler.handleResultSet(DefaultResultSetHandler.java:298)
at org.apache.ibatis.executor.resultset.DefaultResultSetHandler.handleResultSets(DefaultResultSetHandler.java:192)
at org.apache.ibatis.executor.statement.PreparedStatementHandler.query(PreparedStatementHandler.java:64)
at org.apache.ibatis.executor.statement.RoutingStatementHandler.query(RoutingStatementHandler.java:79)
at org.apache.ibatis.executor.ReuseExecutor.doQuery(ReuseExecutor.java:60)
at org.apache.ibatis.executor.BaseExecutor.queryFromDatabase(BaseExecutor.java:324)
at org.apache.ibatis.executor.BaseExecutor.query(BaseExecutor.java:156)
at org.apache.ibatis.executor.CachingExecutor.query(CachingExecutor.java:109)
at org.apache.ibatis.executor.CachingExecutor.query(CachingExecutor.java:83)
at org.apache.ibatis.session.defaults.DefaultSqlSession.select(DefaultSqlSession.java:170)
... 30 common frames omitted
Caused by: com.google.protobuf.InvalidProtocolBufferException: While parsing a protocol message, the input ended unexpectedly in the middle of a field. This could mean either that the input has been truncated or that an embedded message misreported its own length.
at com.google.protobuf.InvalidProtocolBufferException.truncatedMessage(InvalidProtocolBufferException.java:70)
at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:1068)
at com.google.protobuf.CodedInputStream.readRawByte(CodedInputStream.java:1135)
at com.google.protobuf.CodedInputStream.readRawVarint64SlowPath(CodedInputStream.java:778)
at com.google.protobuf.CodedInputStream.readRawVarint32(CodedInputStream.java:637)
at com.google.protobuf.CodedInputStream.readInt32(CodedInputStream.java:348)
at org.sonar.db.protobuf.DbCommons$TextRange.<init>(DbCommons.java:149)
at org.sonar.db.protobuf.DbCommons$TextRange.<init>(DbCommons.java:90)
at org.sonar.db.protobuf.DbCommons$TextRange$1.parsePartialFrom(DbCommons.java:750)
at org.sonar.db.protobuf.DbCommons$TextRange$1.parsePartialFrom(DbCommons.java:744)
at com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:495)
at org.sonar.db.protobuf.DbIssues$Locations.<init>(DbIssues.java:99)
at org.sonar.db.protobuf.DbIssues$Locations.<init>(DbIssues.java:55)
at org.sonar.db.protobuf.DbIssues$Locations$1.parsePartialFrom(DbIssues.java:852)
at org.sonar.db.protobuf.DbIssues$Locations$1.parsePartialFrom(DbIssues.java:846)
at com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:137)
at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:169)
at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:180)
at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:185)
at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:49)
at org.sonar.db.protobuf.DbIssues$Locations.parseFrom(DbIssues.java:253)
at org.sonar.db.issue.IssueDto.parseLocations(IssueDto.java:650)
... 46 common frames omitted
2018.05.15 06:59:52 ERROR ce[AWNimASHSOFQrk0D-LC8][o.s.c.t.CeWorkerImpl] Executed task | project=myproject:develop | type=REPORT | id=AWNimASHSOFQrk0D-LC8 | submitter=jenkins | time=70255ms
Actually I solved my problem by executing this query on the sonar schema. I guess it was definitely related to the version upgrade... not a clean solution, but still OK for me as I didn't need to keep historical data:
DELETE FROM issues WHERE status <> 'CLOSED';
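Since that DELETE is irreversible, a cautious variant (assuming direct SQL access to the same schema and a database backup taken first) is to count the affected rows before removing them:

-- how many non-closed issues would be removed
SELECT COUNT(*) FROM issues WHERE status <> 'CLOSED';
-- then, after backing up the database:
DELETE FROM issues WHERE status <> 'CLOSED';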

Getting timeout exception while connecting to Jira to fetch issues

I am getting the below error in the logs while fetching Jira issues with a particular JQL query. I have set a scheduler to run that JQL every 30 minutes to sync the retrieved issues to another tool.
java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.util.concurrent.TimeoutException
at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:294)
at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:267)
at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:96)
at com.google.common.util.concurrent.ForwardingFuture.get(ForwardingFuture.java:69)
at com.asurion.Autopilot.main(Autopilot.java:104)
Caused by: java.lang.RuntimeException: java.util.concurrent.TimeoutException
at com.google.common.base.Throwables.propagate(Throwables.java:160)
at com.atlassian.httpclient.apache.httpcomponents.DefaultHttpClient$3.apply(DefaultHttpClient.java:256)
at com.atlassian.httpclient.apache.httpcomponents.DefaultHttpClient$3.apply(DefaultHttpClient.java:249)
at com.atlassian.util.concurrent.Promises$Of$2.apply(Promises.java:276)
at com.atlassian.util.concurrent.Promises$Of$2.apply(Promises.java:272)
at com.atlassian.util.concurrent.Promises$2.onFailure(Promises.java:167)
at com.google.common.util.concurrent.Futures$6.run(Futures.java:801)
at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:262)
at com.google.common.util.concurrent.ExecutionList$RunnableExecutorPair.execute(ExecutionList.java:149)
at com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:134)
at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:193)
at com.google.common.util.concurrent.SettableFuture.setException(SettableFuture.java:68)
at com.atlassian.httpclient.apache.httpcomponents.SettableFuturePromiseHttpPromiseAsyncClient$1$3.run(SettableFuturePromiseHttpPromiseAsyncClient.java:73)
at com.atlassian.httpclient.apache.httpcomponents.SettableFuturePromiseHttpPromiseAsyncClient$ThreadLocalDelegateRunnable$1.run(SettableFuturePromiseHttpPromiseAsyncClient.java:197)
at com.atlassian.httpclient.apache.httpcomponents.SettableFuturePromiseHttpPromiseAsyncClient.runInContext(SettableFuturePromiseHttpPromiseAsyncClient.java:90)
at com.atlassian.httpclient.apache.httpcomponents.SettableFuturePromiseHttpPromiseAsyncClient$ThreadLocalDelegateRunnable.run(SettableFuturePromiseHttpPromiseAsyncClient.java:192)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Caused by: java.util.concurrent.TimeoutException
at com.atlassian.httpclient.apache.httpcomponents.SettableFuturePromiseHttpPromiseAsyncClient$1.doCancelled(SettableFuturePromiseHttpPromiseAsyncClient.java:67)
at com.atlassian.httpclient.apache.httpcomponents.SettableFuturePromiseHttpPromiseAsyncClient$ThreadLocalContextAwareFutureCallback$3.run(SettableFuturePromiseHttpPromiseAsyncClient.java:152)
at com.atlassian.httpclient.apache.httpcomponents.SettableFuturePromiseHttpPromiseAsyncClient.runInContext(SettableFuturePromiseHttpPromiseAsyncClient.java:90)
at com.atlassian.httpclient.apache.httpcomponents.SettableFuturePromiseHttpPromiseAsyncClient$ThreadLocalContextAwareFutureCallback.cancelled(SettableFuturePromiseHttpPromiseAsyncClient.java:147)
at org.apache.http.impl.client.cache.CachingHttpAsyncClient$2.cancelled(CachingHttpAsyncClient.java:636)
at org.apache.http.concurrent.BasicFuture.cancel(BasicFuture.java:150)
at org.apache.http.impl.nio.client.DefaultResultCallback.cancelled(DefaultResultCallback.java:57)
at org.apache.http.impl.nio.client.DefaultAsyncRequestDirector.cancel(DefaultAsyncRequestDirector.java:533)
at org.apache.http.impl.nio.client.DefaultAsyncRequestDirector$1.abortConnection(DefaultAsyncRequestDirector.java:222)
at org.apache.http.client.methods.HttpRequestBase.cleanup(HttpRequestBase.java:137)
at org.apache.http.client.methods.HttpRequestBase.abort(HttpRequestBase.java:151)
at com.atlassian.httpclient.base.RequestKiller$RequestEntry.abort(RequestKiller.java:98)
at com.atlassian.httpclient.base.RequestKiller.run(RequestKiller.java:56)
... 1 more
This error occurs frequently but not continuously. Sometimes it works; many times it doesn't.
Jira Version: 7.1.4#71008
Jira REST Java Client Version: 2.0.0-m2
JIRA REST java-client-core Version: 2.0.0-m25
Please let me know if you need more details.
Thanks in advance.
Regards,
Tushar
Looks like a timeout on your request, which comes and goes depending on server load. Can you streamline or break up the query, set a longer timeout, or put other activity on the server on hold during the sync?
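If the server can't be tuned, a client-side workaround is to retry with backoff instead of failing on the first TimeoutException. A minimal JDK-only sketch (fetchIssuesAsync is a hypothetical stand-in for the future being awaited at Autopilot.java:104; the HTTP-level timeout itself is configured in the client, not here):

import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Retry the JQL fetch a few times, waiting longer per attempt than the
// default before giving up.
static <T> T fetchWithRetry(Callable<Future<T>> fetchIssuesAsync, int attempts) throws Exception {
    for (int i = 1; ; i++) {
        Future<T> f = fetchIssuesAsync.call();
        try {
            return f.get(120, TimeUnit.SECONDS);  // generous per-attempt wait
        } catch (TimeoutException | ExecutionException e) {
            f.cancel(true);
            if (i == attempts) throw e;           // out of retries, propagate
            Thread.sleep(1000L * i * i);          // simple quadratic backoff
        }
    }
}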
There was some issue with the Jira server. We don't have access to it, but after a few days the issue seems to be gone. Not sure what happened on the Jira side.

PredictionIO train error: tokens must not be empty

I am tinkering with PredictionIO to build a custom classification engine. I have done this before without issues, but for the current dataset pio train is giving me the error "tokens must not be empty". I have edited Datasource.scala to map the dataset's fields to the engine. A line from my dataset is below:
{"event": "ticket", "eventTime": "2015-02-16T05:22:13.477+0000", "entityType": "content","entityId": 365,"properties":{"text": "Request to reset svn credentials","label": "Linux/Admin Task" }}
I can import data and build the engine without any issues, and I am getting a set of observations too. The error is pasted below:
[INFO] [Remoting] Starting remoting
[INFO] [Remoting] Remoting started; listening on addresses :[akka.tcp://sparkDriver#192.168.61.44:50713]
[INFO] [Engine$] EngineWorkflow.train
[INFO] [Engine$] DataSource: org.template.textclassification.DataSource#4fb64e14
[INFO] [Engine$] Preparator: org.template.textclassification.Preparator#5c4cc644
[INFO] [Engine$] AlgorithmList: List(org.template.textclassification.NBAlgorithm#62b6c045)
[INFO] [Engine$] Data sanity check is off.
[ERROR] [Executor] Exception in task 0.0 in stage 2.0 (TID 2)
[WARN] [TaskSetManager] Lost task 0.0 in stage 2.0 (TID 2, localhost): java.lang.IllegalArgumentException: tokens must not be empty
at opennlp.tools.util.StringList.<init>(StringList.java:61)
at org.template.textclassification.PreparedData.org$template$textclassification$PreparedData$$hash(Preparator.scala:71)
at org.template.textclassification.PreparedData$$anonfun$2.apply(Preparator.scala:113)
at org.template.textclassification.PreparedData$$anonfun$2.apply(Preparator.scala:113)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:202)
at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:56)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:64)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
[ERROR] [TaskSetManager] Task 0 in stage 2.0 failed 1 times; aborting job
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 2.0 failed 1 times, most recent failure: Lost task 0.0 in stage 2.0 (TID 2, localhost): java.lang.IllegalArgumentException: tokens must not be empty
at opennlp.tools.util.StringList.<init>(StringList.java:61)
at org.template.textclassification.PreparedData.org$template$textclassification$PreparedData$$hash(Preparator.scala:71)
at org.template.textclassification.PreparedData$$anonfun$2.apply(Preparator.scala:113)
at org.template.textclassification.PreparedData$$anonfun$2.apply(Preparator.scala:113)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:202)
at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:56)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:64)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1204)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1193)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1192)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1192)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:693)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1393)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1354)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
The problem is with the dataset. I tried splitting the dataset into parts and training; training completed for that subset and no errors were reported. How can I know which line in the dataset produces the error? It would be very helpful if this feature were in PredictionIO.
So this is something that happens when you feed an empty Array[String] into OpenNLP's StringList constructor. Try modifying the function hash in PreparedData as follows:
private def hash(tokenList: Array[String]): HashMap[String, Double] = {
  try {
    // Initialize an NGramModel from the OpenNLP tools library,
    // and add the list of allowable tokens to the n-gram model.
    val model: NGramModel = new NGramModel()
    model.add(new StringList(tokenList: _*), nMin, nMax)

    val map: HashMap[String, Double] = HashMap(
      model.iterator.map(
        x => (x.toString, model.getCount(x).toDouble)
      ).toSeq: _*
    )

    // Divide by the total number of n-grams in the document
    // to obtain n-gram frequency.
    val mapSum = map.values.sum
    map.map(e => (e._1, e._2 / mapSum))
  } catch {
    // An empty tokenList makes the StringList constructor throw
    // "tokens must not be empty"; fall back to a trivial map instead.
    case e: IllegalArgumentException => HashMap("" -> 0.0)
  }
}
I've only encountered this issue in the prediction stage, and so you can see this is actually implemented in the models' predict methods. I'll update this right now, and put it in a new version release. Thank you for the catch and feedback!
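For anyone wanting to see the underlying behaviour in isolation, this tiny Java program (assuming the OpenNLP tools jar is on the classpath) reproduces the exact exception from the stack trace:

import opennlp.tools.util.StringList;

public class EmptyTokensRepro {
    public static void main(String[] args) {
        // Throws java.lang.IllegalArgumentException: tokens must not be empty,
        // the same failure as feeding an empty Array[String] into hash above.
        new StringList(new String[0]);
    }
}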
