PredictionIO train error: "tokens must not be empty"

I am tinkering with PredictionIO to build a custom classification engine. I have done this before without issues, but for the current dataset pio train is giving me the error "tokens must not be empty". I have edited DataSource.scala to map the fields in the dataset to the engine. A line from my dataset is as below:
{"event": "ticket", "eventTime": "2015-02-16T05:22:13.477+0000", "entityType": "content","entityId": 365,"properties":{"text": "Request to reset svn credentials","label": "Linux/Admin Task" }}
I can import the data and build the engine without any issues, and I am getting a set of observations too. The error is pasted below:
[INFO] [Remoting] Starting remoting
[INFO] [Remoting] Remoting started; listening on addresses :[akka.tcp://sparkDriver#192.168.61.44:50713]
[INFO] [Engine$] EngineWorkflow.train
[INFO] [Engine$] DataSource: org.template.textclassification.DataSource#4fb64e14
[INFO] [Engine$] Preparator: org.template.textclassification.Preparator#5c4cc644
[INFO] [Engine$] AlgorithmList: List(org.template.textclassification.NBAlgorithm#62b6c045)
[INFO] [Engine$] Data sanity check is off.
[ERROR] [Executor] Exception in task 0.0 in stage 2.0 (TID 2)
[WARN] [TaskSetManager] Lost task 0.0 in stage 2.0 (TID 2, localhost): java.lang.IllegalArgumentException: tokens must not be empty
at opennlp.tools.util.StringList.<init>(StringList.java:61)
at org.template.textclassification.PreparedData.org$template$textclassification$PreparedData$$hash(Preparator.scala:71)
at org.template.textclassification.PreparedData$$anonfun$2.apply(Preparator.scala:113)
at org.template.textclassification.PreparedData$$anonfun$2.apply(Preparator.scala:113)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:202)
at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:56)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:64)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
[ERROR] [TaskSetManager] Task 0 in stage 2.0 failed 1 times; aborting job
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 2.0 failed 1 times, most recent failure: Lost task 0.0 in stage 2.0 (TID 2, localhost): java.lang.IllegalArgumentException: tokens must not be empty
at opennlp.tools.util.StringList.<init>(StringList.java:61)
at org.template.textclassification.PreparedData.org$template$textclassification$PreparedData$$hash(Preparator.scala:71)
at org.template.textclassification.PreparedData$$anonfun$2.apply(Preparator.scala:113)
at org.template.textclassification.PreparedData$$anonfun$2.apply(Preparator.scala:113)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:202)
at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:56)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:64)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1204)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1193)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1192)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1192)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:693)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1393)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1354)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
The problem is with the dataset. I tried splitting the dataset into parts and training on them; training completed for part of the data and no errors were reported. How can I know which line in the dataset produces the error? It would be very helpful if PredictionIO had such a feature.

So this is something that happens when you feed an empty Array[String] to OpenNLP's StringList constructor. Try modifying the function hash in PreparedData as follows:
private def hash(tokenList: Array[String]): HashMap[String, Double] = {
  // Initialize an NGramModel from the OpenNLP tools library,
  // and add the list of allowable tokens to the n-gram model.
  try {
    val model: NGramModel = new NGramModel()
    model.add(new StringList(tokenList: _*), nMin, nMax)

    val map: HashMap[String, Double] = HashMap(
      model.iterator.map(
        x => (x.toString, model.getCount(x).toDouble)
      ).toSeq: _*
    )

    val mapSum = map.values.sum

    // Divide by the total number of n-grams in the document
    // to obtain n-gram frequency.
    map.map(e => (e._1, e._2 / mapSum))
  } catch {
    // An empty token array makes the StringList constructor throw;
    // fall back to a dummy, effectively empty feature map.
    case e: IllegalArgumentException => HashMap("" -> 0.0)
  }
}
I've only encountered this issue in the prediction stage, which is why you can see this guard is actually implemented in the models' predict methods. I'll update this right now and put it in a new version release. Thank you for the catch and the feedback!
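To narrow down which dataset lines trigger this, one option is to scan the exported events file outside of PredictionIO. Below is a rough standalone sketch (the file path, object name, and the regex-based field extraction are assumptions, not part of the template) that flags records whose "text" property is missing or blank, since those are what produce an empty token array:
import scala.io.Source

object FindEmptyTextRecords {
  // Naive extraction of the "text" value; assumes the value contains no escaped quotes.
  private val TextField = "\"text\"\\s*:\\s*\"([^\"]*)\"".r

  def main(args: Array[String]): Unit = {
    val path = if (args.nonEmpty) args(0) else "data/events.json" // hypothetical path
    val source = Source.fromFile(path)
    try {
      source.getLines().zipWithIndex.foreach { case (line, idx) =>
        val text = TextField.findFirstMatchIn(line).map(_.group(1)).getOrElse("")
        if (text.trim.isEmpty)
          println(s"Line ${idx + 1} has an empty or missing text field: $line")
      }
    } finally {
      source.close()
    }
  }
}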

Related

Integration of an EfficientNet TensorFlow Lite model with iOS Swift

I am performing object detection in iOS using an EfficientNet .tflite model that we trained with TensorFlow's Model Maker library. I am integrating this model into the official ObjectDetection sample application, but during integration I get this error:
2022-05-27 16:47:09.401469+0400 ObjectDetection[13225:488954] Initialized TensorFlow Lite runtime.
Failed to invoke the interpreter with error: Provided data count 270000 must match the required count 307200.
Failed to invoke the interpreter with error: Provided data count 270000 must match the required count 307200.
Failed to invoke the interpreter with error: Provided data count 270000 must match the required count 307200.
Failed to invoke the interpreter with error: Provided data count 270000 must match the required count 307200.
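For reference, both counts line up with square RGB input buffers (width * height * 3 channels), which suggests the model expects 320x320 input while the sample pipeline is feeding 300x300 frames. A quick sanity check (a hypothetical helper, not part of the sample app):
object InputSizeCheck {
  // Element count of a square RGB input tensor: side * side * channels.
  def elementCount(side: Int, channels: Int = 3): Int = side * side * channels

  def main(args: Array[String]): Unit = {
    println(elementCount(300)) // 270000 -> what the app currently provides
    println(elementCount(320)) // 307200 -> what the interpreter requires
  }
}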
So I changed the code in ModelDataHandler.swift to the following:
let batchSize = 1
let inputChannels = 3
let inputWidth = 320
let inputHeight = 320
However, when I tried to run the app, it crashed and gave the following error:
2022-05-28 00:05:40.099286+0400 ObjectDetection[13413:503427] Initialized TensorFlow Lite runtime.
Swift/ContiguousArrayBuffer.swift:575: Fatal error: Index out of range
2022-05-28 00:05:43.025412+0400 ObjectDetection[13413:503623] Swift/ContiguousArrayBuffer.swift:575: Fatal error: Index out of range
The error is thrown from the code in ModelDataHandler.swift, line 203.
I really appreciate any help.

BI Publisher burst results return "No data available"

When I run a version 11 BI Publisher report whose report and bursting queries use two database links, the output after submitting the report returns "No data available". The requested output format is Excel (xlsx).
Below are the entries found in the bipublisher.log file on our Windows server. I need help determining why we are not receiving the output.
[2019-08-02T11:23:52.932-05:00] [bi_server1] [WARNING] [] [oracle.xdo] [tid: 23] [userId: ] [ecid: 617a7b4a2b313a71:-7f2f4396:16bede5b1b0:-8000-0000000000314602,0] [APP: bipublisher#11.1.1] Context: 0, code: U9KP7Q94, message: Path not found (/users/dclay/Case Management/Reports/Case Status Report 7 TEST.xdo/_report.xdo)
[2019-08-02T11:23:52.932-05:00] [bi_server1] [WARNING] [] [oracle.xdo] [tid: 23] [userId: ] [ecid: 617a7b4a2b313a71:-7f2f4396:16bede5b1b0:-8000-0000000000314602,0] [APP: bipublisher#11.1.1] Context: 1, code: U9KP7Q94, message: Path not found (/users/dclay/Case Management/Reports/Case Status Report 7 TEST.xdo)
[2019-08-02T11:23:52.932-05:00] [bi_server1] [WARNING] [] [oracle.xdo] [tid: 23] [userId: ] [ecid: 617a7b4a2b313a71:-7f2f4396:16bede5b1b0:-8000-0000000000314602,0] [APP: bipublisher#11.1.1] User (dclay) with session id: bulm672onqi9vtjcld9804jbq8hk8gio2rji12q is looking for object in biee path: /users/dclay/Case Management/Reports/Case Status Report 7 TEST.xdo/_report.xdo[[
Object Error [Context: 1, code: U9KP7Q94, message: Path not found (/users/dclay/Case Management/Reports/Case Status Report 7 TEST.xdo)]
Object found [path: /users/dclay/Case Management/Reports, type: 0]
]]
The report runs without issue if we do not burst. Bursting does not cause an external error message, but it does not burst the data.
We expect to see our reports burst by center and id.
::NO_DATA_TO_PROCESS::
I had the same problem.
First I checked the report: I scheduled it with bursting unticked, and it worked; I got an email.
That means the queries are OK, so the bursting definition must be wrong.
I found it: it was the delivery data set on the Bursting page. It still had G_1. I had renamed G_1 to a meaningful name, but the bursting definition still referenced G_1, which took me a long time to see.

How to fine-tune the log level for the SonarQube Gradle plugin

I'm using the SonarQube plugin (version 2.6.1) for Gradle (version 4.7) and have the problem that a lot of unimportant log output is written while running the Sonar analysis on my CI server.
Is there a way to fine-tune the log level for this plugin?
I checked the documentation, but the only setting related to the log output I found was the "verbose" JVM argument, which I'm not using anyway (I guess the default is false, so this shouldn't be turned on for me).
EDIT: Here are some examples of the output I would like to get rid of:
Some huge exception stack traces during the FindBugs analysis (this one is shortened; I didn't want to post the whole stack trace, it's really huge):
16:23:34.993 ERROR - Unable to create symbol table for : /opt/workspace/pipeline-1/src/main/java/com/SomeClass.java
java.lang.NullPointerException: null
at org.sonar.java.resolve.TypeAndReferenceSolver.getSymbolOfMemberSelectExpression(TypeAndReferenceSolver.java:232) ~[java-squid-2.5.1.jar:na]
at org.sonar.java.resolve.TypeAndReferenceSolver.resolveAs(TypeAndReferenceSolver.java:200) ~[java-squid-2.5.1.jar:na]
at org.sonar.java.resolve.TypeAndReferenceSolver.resolveAs(TypeAndReferenceSolver.java:182) ~[java-squid-2.5.1.jar:na]
at...
Stack traces from PMD:
16:23:37.206 ERROR - Fail to execute PMD. Following file is ignored: /opt/workspace/pipeline-1/src/main/java/com/SomeClass.java
java.lang.RuntimeException: null
at org.objectweb.asm.MethodVisitor.visitParameter(Unknown Source) ~[asm-5.0.3.jar:5.0.3]
at org.objectweb.asm.ClassReader.b(Unknown Source) ~[asm-5.0.3.jar:5.0.3]
at org.objectweb.asm.ClassReader.accept(Unknown Source) ~[asm-5.0.3.jar:5.0.3]
at org.objectweb.asm.ClassReader.accept(Unknown Source) ~[asm-5.0.3.jar:5.0.3]
at net.sourceforge.pmd.lang.java.typeresolution.PMDASMClassLoader.getImportedClasses(PMDASMClassLoader.java:77) ~[pmd-java-5.2.1.jar:na]...
Lots of irrelevant warnings like these:
16:23:38.638 WARN - /opt/workspace/pipeline-1/src/main/java/com/SomeClass.java: Got an exception - expecting EOF, found '}'
/opt/workspace/pipeline-1/src/main/java/com/SomeClass.java:28:5: expecting RCURLY, found 'default'
16:23:38.655 WARN - /opt/workspace/pipeline-1/src/main/java/com/SomeClass.java: Got an exception - expecting EOF, found 'someVariable'
I don't know exactly what is causing these problems, but since both my app and the results of the Sonar analysis look OK, I would like to get rid of this log output, since it only pollutes my logs on Jenkins and makes them unreadable.
There are the properties sonar.log.level and sonar.verbose; for example:
allprojects {
  sonarqube {
    properties {
      // property "sonar.log.level", "INFO"
      // property "sonar.verbose", "true"
      property "sonar.log.level", "TRACE"
    }
  }
}
See the analysis parameters documentation.

SonarQube 6.7: Fail to read ISSUES.LOCATIONS, com.google.protobuf.InvalidProtocolBufferException

I'm having the issue below after upgrading to SonarQube 6.7. I'm using the Docker image for SonarQube, and I just ran the migration scripts as suggested by the SonarQube UI.
Can someone suggest a fix?
Thank you.
2018.05.15 06:58:42 INFO ce[AWNimASHSOFQrk0D-LC8][o.s.c.t.CeWorkerImpl] Execute task | project=myproject:develop | type=REPORT | id=AWNimASHSOFQrk0D-LC8 | submitter=jenkins
2018.05.15 06:59:52 ERROR ce[AWNimASHSOFQrk0D-LC8][o.s.c.t.CeWorkerImpl] Failed to execute task AWNimASHSOFQrk0D-LC8
org.sonar.server.computation.task.projectanalysis.component.VisitException: Visit of Component {key=myproject:develop,type=PROJECT} failed
at org.sonar.server.computation.task.projectanalysis.component.VisitException.rethrowOrWrap(VisitException.java:44)
at org.sonar.server.computation.task.projectanalysis.component.VisitorsCrawler.visit(VisitorsCrawler.java:74)
at org.sonar.server.computation.task.projectanalysis.step.ExecuteVisitorsStep.execute(ExecuteVisitorsStep.java:51)
at org.sonar.server.computation.task.step.ComputationStepExecutor.executeSteps(ComputationStepExecutor.java:64)
at org.sonar.server.computation.task.step.ComputationStepExecutor.execute(ComputationStepExecutor.java:52)
at org.sonar.server.computation.task.projectanalysis.taskprocessor.ReportTaskProcessor.process(ReportTaskProcessor.java:73)
at org.sonar.ce.taskprocessor.CeWorkerImpl.executeTask(CeWorkerImpl.java:134)
at org.sonar.ce.taskprocessor.CeWorkerImpl.findAndProcessTask(CeWorkerImpl.java:97)
at org.sonar.ce.taskprocessor.CeWorkerImpl.withCustomizedThreadName(CeWorkerImpl.java:81)
at org.sonar.ce.taskprocessor.CeWorkerImpl.call(CeWorkerImpl.java:73)
at org.sonar.ce.taskprocessor.CeWorkerImpl.call(CeWorkerImpl.java:43)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.ibatis.exceptions.PersistenceException:
Error querying database. Cause: java.lang.IllegalStateException: Fail to read ISSUES.LOCATIONS [KEE=AV9aIkjMZ-jmCeoEg-IR]
The error may exist in org.sonar.db.issue.IssueMapper
The error may involve org.sonar.db.issue.IssueMapper.scrollNonClosedByComponentUuid
The error occurred while handling results
SQL: select i.id, i.kee as kee, i.rule_id as ruleId, i.severity as severity, i.manual_severity as manualSeverity, i.message as message, i.line as line, i.locations as locations, i.gap as gap, i.effort as effort, i.status as status, i.resolution as resolution, i.checksum as checksum, i.assignee as assignee, i.author_login as authorLogin, i.tags as tagsString, i.issue_attributes as issueAttributes, i.issue_creation_date as issueCreationTime, i.issue_update_date as issueUpdateTime, i.issue_close_date as issueCloseTime, i.created_at as createdAt, i.updated_at as updatedAt, r.plugin_rule_key as ruleKey, r.plugin_name as ruleRepo, r.language as language, p.kee as componentKey, i.component_uuid as componentUuid, p.module_uuid as moduleUuid, p.module_uuid_path as moduleUuidPath, p.path as filePath, root.kee as projectKey, i.project_uuid as projectUuid, i.issue_type as type from issues i inner join rules r on r.id=i.rule_id inner join projects p on p.uuid=i.component_uuid inner join projects root on root.uuid=i.project_uuid where i.component_uuid = ? and i.status <> 'CLOSED'
Cause: java.lang.IllegalStateException: Fail to read ISSUES.LOCATIONS [KEE=AV9aIkjMZ-jmCeoEg-IR]
at org.apache.ibatis.exceptions.ExceptionFactory.wrapException(ExceptionFactory.java:30)
at org.apache.ibatis.session.defaults.DefaultSqlSession.select(DefaultSqlSession.java:172)
at org.apache.ibatis.session.defaults.DefaultSqlSession.select(DefaultSqlSession.java:158)
at org.apache.ibatis.binding.MapperMethod.executeWithResultHandler(MapperMethod.java:126)
at org.apache.ibatis.binding.MapperMethod.execute(MapperMethod.java:72)
at org.apache.ibatis.binding.MapperProxy.invoke(MapperProxy.java:59)
at com.sun.proxy.$Proxy42.scrollNonClosedByComponentUuid(Unknown Source)
at org.sonar.server.computation.task.projectanalysis.issue.ComponentIssuesLoader.loadForComponentUuid(ComponentIssuesLoader.java:73)
at org.sonar.server.computation.task.projectanalysis.issue.ComponentIssuesLoader.loadForComponentUuid(ComponentIssuesLoader.java:51)
at org.sonar.server.computation.task.projectanalysis.issue.CloseIssuesOnRemovedComponentsVisitor.closeIssuesForDeletedComponentUuids(CloseIssuesOnRemovedComponentsVisitor.java:60)
at org.sonar.server.computation.task.projectanalysis.issue.CloseIssuesOnRemovedComponentsVisitor.visitProject(CloseIssuesOnRemovedComponentsVisitor.java:53)
at org.sonar.server.computation.task.projectanalysis.component.TypeAwareVisitorWrapper.visitProject(TypeAwareVisitorWrapper.java:47)
at org.sonar.server.computation.task.projectanalysis.component.VisitorsCrawler.visitNode(VisitorsCrawler.java:120)
at org.sonar.server.computation.task.projectanalysis.component.VisitorsCrawler.visitImpl(VisitorsCrawler.java:100)
at org.sonar.server.computation.task.projectanalysis.component.VisitorsCrawler.visit(VisitorsCrawler.java:72)
... 17 common frames omitted
Caused by: java.lang.IllegalStateException: Fail to read ISSUES.LOCATIONS [KEE=AV9aIkjMZ-jmCeoEg-IR]
at org.sonar.db.issue.IssueDto.parseLocations(IssueDto.java:652)
at org.sonar.db.issue.IssueDto.toDefaultIssue(IssueDto.java:721)
at org.sonar.server.computation.task.projectanalysis.issue.ComponentIssuesLoader.lambda$loadForComponentUuid$1(ComponentIssuesLoader.java:74)
at org.apache.ibatis.executor.resultset.DefaultResultSetHandler.callResultHandler(DefaultResultSetHandler.java:363)
at org.apache.ibatis.executor.resultset.DefaultResultSetHandler.storeObject(DefaultResultSetHandler.java:356)
at org.apache.ibatis.executor.resultset.DefaultResultSetHandler.handleRowValuesForSimpleResultMap(DefaultResultSetHandler.java:348)
at org.apache.ibatis.executor.resultset.DefaultResultSetHandler.handleRowValues(DefaultResultSetHandler.java:322)
at org.apache.ibatis.executor.resultset.DefaultResultSetHandler.handleResultSet(DefaultResultSetHandler.java:298)
at org.apache.ibatis.executor.resultset.DefaultResultSetHandler.handleResultSets(DefaultResultSetHandler.java:192)
at org.apache.ibatis.executor.statement.PreparedStatementHandler.query(PreparedStatementHandler.java:64)
at org.apache.ibatis.executor.statement.RoutingStatementHandler.query(RoutingStatementHandler.java:79)
at org.apache.ibatis.executor.ReuseExecutor.doQuery(ReuseExecutor.java:60)
at org.apache.ibatis.executor.BaseExecutor.queryFromDatabase(BaseExecutor.java:324)
at org.apache.ibatis.executor.BaseExecutor.query(BaseExecutor.java:156)
at org.apache.ibatis.executor.CachingExecutor.query(CachingExecutor.java:109)
at org.apache.ibatis.executor.CachingExecutor.query(CachingExecutor.java:83)
at org.apache.ibatis.session.defaults.DefaultSqlSession.select(DefaultSqlSession.java:170)
... 30 common frames omitted
Caused by: com.google.protobuf.InvalidProtocolBufferException: While parsing a protocol message, the input ended unexpectedly in the middle of a field. This could mean either that the input has been truncated or that an embedded message misreported its own length.
at com.google.protobuf.InvalidProtocolBufferException.truncatedMessage(InvalidProtocolBufferException.java:70)
at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:1068)
at com.google.protobuf.CodedInputStream.readRawByte(CodedInputStream.java:1135)
at com.google.protobuf.CodedInputStream.readRawVarint64SlowPath(CodedInputStream.java:778)
at com.google.protobuf.CodedInputStream.readRawVarint32(CodedInputStream.java:637)
at com.google.protobuf.CodedInputStream.readInt32(CodedInputStream.java:348)
at org.sonar.db.protobuf.DbCommons$TextRange.<init>(DbCommons.java:149)
at org.sonar.db.protobuf.DbCommons$TextRange.<init>(DbCommons.java:90)
at org.sonar.db.protobuf.DbCommons$TextRange$1.parsePartialFrom(DbCommons.java:750)
at org.sonar.db.protobuf.DbCommons$TextRange$1.parsePartialFrom(DbCommons.java:744)
at com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:495)
at org.sonar.db.protobuf.DbIssues$Locations.<init>(DbIssues.java:99)
at org.sonar.db.protobuf.DbIssues$Locations.<init>(DbIssues.java:55)
at org.sonar.db.protobuf.DbIssues$Locations$1.parsePartialFrom(DbIssues.java:852)
at org.sonar.db.protobuf.DbIssues$Locations$1.parsePartialFrom(DbIssues.java:846)
at com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:137)
at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:169)
at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:180)
at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:185)
at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:49)
at org.sonar.db.protobuf.DbIssues$Locations.parseFrom(DbIssues.java:253)
at org.sonar.db.issue.IssueDto.parseLocations(IssueDto.java:650)
... 46 common frames omitted
2018.05.15 06:59:52 ERROR ce[AWNimASHSOFQrk0D-LC8][o.s.c.t.CeWorkerImpl] Executed task | project=myproject:develop | type=REPORT | id=AWNimASHSOFQrk0D-LC8 | submitter=jenkins | time=70255ms
Actually, I solved my problem by executing this query on the Sonar schema. I guess it was related to the version upgrade... not a clean solution, but still OK for me as I didn't need to keep historical data:
delete from issues where STATUS != 'CLOSED';

Google Cloud streaming Dataflow: Error while fetching side input

Sometimes I am getting the below exception while running a streaming Dataflow job:
exception: "java.lang.RuntimeException: Exception while fetching side input:
at com.google.cloud.dataflow.sdk.runners.worker.StateFetcher.fetchSideInput(StateFetcher.java:184)
at com.google.cloud.dataflow.sdk.runners.worker.StreamingModeExecutionContext.fetchSideInput(StreamingModeExecutionContext.java:175)
at com.google.cloud.dataflow.sdk.runners.worker.StreamingModeExecutionContext.access$400(StreamingModeExecutionContext.java:56)
at com.google.cloud.dataflow.sdk.runners.worker.StreamingModeExecutionContext$StepContext.issueSideInputFetch(StreamingModeExecutionContext.java:401)
at com.google.cloud.dataflow.sdk.runners.worker.StreamingSideInputFetcher.getReadyWindows(StreamingSideInputFetcher.java:135)
at com.google.cloud.dataflow.sdk.runners.worker.StreamingSideInputDoFnRunner.startBundle(StreamingSideInputDoFnRunner.java:49)
at com.google.cloud.dataflow.sdk.runners.worker.SimpleParDoFn.reallyStartBundle(SimpleParDoFn.java:175)
at com.google.cloud.dataflow.sdk.runners.worker.SimpleParDoFn.startBundle(SimpleParDoFn.java:117)
at com.google.cloud.dataflow.sdk.runners.worker.ForwardingParDoFn.startBundle(ForwardingParDoFn.java:36)
at com.google.cloud.dataflow.sdk.util.common.worker.ParDoOperation.start(ParDoOperation.java:45)
at com.google.cloud.dataflow.sdk.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:69)
at com.google.cloud.dataflow.sdk.runners.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:719)
at com.google.cloud.dataflow.sdk.runners.worker.StreamingDataflowWorker.access$600(StreamingDataflowWorker.java:95)
at com.google.cloud.dataflow.sdk.runners.worker.StreamingDataflowWorker$6.run(StreamingDataflowWorker.java:538)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: com.google.cloud.dataflow.worker.repackaged.com.google.common.util.concurrent.UncheckedExecutionException: java.lang.IllegalArgumentException: Duplicate values for 2059
at com.google.cloud.dataflow.worker.repackaged.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2207)
at com.google.cloud.dataflow.worker.repackaged.com.google.common.cache.LocalCache.get(LocalCache.java:3953)
at com.google.cloud.dataflow.worker.repackaged.com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4790)
at com.google.cloud.dataflow.sdk.runners.worker.StateFetcher.fetchSideInput(StateFetcher.java:175)
... 16 more
Caused by: java.lang.IllegalArgumentException: Duplicate values for 2059
at com.google.cloud.dataflow.sdk.util.PCollectionViews$MapPCollectionView.fromElements(PCollectionViews.java:291)
at com.google.cloud.dataflow.sdk.util.PCollectionViews$MapPCollectionView.fromElements(PCollectionViews.java:273)
at com.google.cloud.dataflow.sdk.util.PCollectionViews$PCollectionViewBase.fromIterableInternal(PCollectionViews.java:368)
at com.google.cloud.dataflow.sdk.runners.worker.StateFetcher$2.call(StateFetcher.java:152)
at com.google.cloud.dataflow.sdk.runners.worker.StateFetcher$2.call(StateFetcher.java:104)
at com.google.cloud.dataflow.worker.repackaged.com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4793)
at com.google.cloud.dataflow.worker.repackaged.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3542)
at com.google.cloud.dataflow.worker.repackaged.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2323)
at com.google.cloud.dataflow.worker.repackaged.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2286)
at com.google.cloud.dataflow.worker.repackaged.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2201)
... 19 more
Dataflow worker machine type : n1-standard-4
Worker cache memory Mb : 2048
Dataflow main Input : PubSub subscription.
I am creating the side input from BT and passing this side input to multiple transformations. The size of my side input is less than 100 MB.
Thanks.
That error indicates that multiple values with the same key (2059) have been encountered, which violates the expectations for a Map-valued side input. This can happen in streaming especially if you trigger the same value multiple times. If you instead use a Multimap, it should allow you to retrieve all of the values associated with a given key.
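For illustration, here is a minimal Scala sketch of the two view types, assuming the Dataflow 1.x SDK that appears in the stack trace (where View.asMultimap() is available); the method and collection names are hypothetical:
import com.google.cloud.dataflow.sdk.transforms.View
import com.google.cloud.dataflow.sdk.values.{KV, PCollection, PCollectionView}
import java.lang.{Iterable => JIterable}
import java.util.{Map => JMap}

object SideInputViews {
  // View.asMap() builds a strict map view and fails with
  // "IllegalArgumentException: Duplicate values for <key>" if the same key
  // shows up more than once, e.g. when an element is re-triggered in streaming.
  def strictMapView(sideData: PCollection[KV[String, String]])
      : PCollectionView[JMap[String, String]] =
    sideData.apply(View.asMap[String, String]())

  // View.asMultimap() keeps every value observed for a key, so duplicate
  // (re-triggered) keys do not abort the side-input fetch.
  def multimapView(sideData: PCollection[KV[String, String]])
      : PCollectionView[JMap[String, JIterable[String]]] =
    sideData.apply(View.asMultimap[String, String]())
}
Inside the DoFn, the multimap view is read the same way as the map view (c.sideInput(view)); each key then maps to an Iterable of all its values.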
