Split a Flux into a Mono head and Flux tail - project-reactor

I want to split a Flux into two parts: A Mono for the 1st element (the head), and a Flux for everything else (the tail).
The base Flux should not be re-subscribed to in the course of this process.
Example of what DOESN'T work:
final Flux<Integer> baseFlux = Flux.range(0, 3).log();
final Mono<Integer> head = baseFlux.next();
final Flux<Integer> tail = baseFlux.skip(1L);
assertThat(head.block()).isEqualTo(0);
assertThat(tail.collectList().block()).isEqualTo(Arrays.asList(1, 2));
The log for this looks something like the following, and as you can see, the base Flux is subscribed to twice:
[main] DEBUG reactor.util.Loggers$LoggerFactory - Using Slf4j logging framework
[main] INFO reactor.Flux.Range.1 - | onSubscribe([Synchronous Fuseable] FluxRange.RangeSubscription)
[main] INFO reactor.Flux.Range.1 - | request(unbounded)
[main] INFO reactor.Flux.Range.1 - | onNext(0)
[main] INFO reactor.Flux.Range.1 - | cancel()
[main] INFO reactor.Flux.Range.1 - | onSubscribe([Synchronous Fuseable] FluxRange.RangeSubscription)
[main] INFO reactor.Flux.Range.1 - | request(unbounded)
[main] INFO reactor.Flux.Range.1 - | onNext(0)
[main] INFO reactor.Flux.Range.1 - | onNext(1)
[main] INFO reactor.Flux.Range.1 - | onNext(2)
[main] INFO reactor.Flux.Range.1 - | onComplete()
[main] INFO reactor.Flux.Range.1 - | request(1)
My actual case is that my base Flux contains the lines of a CSV file, with the first line being the header of the file, which is needed to parse all subsequent lines. The base Flux can only be subscribed to once, as it is based on an InputStream.
The only somewhat related resource I found for this is this question, but I found it unsuitable for my needs.

Thanks to a suggestion offered in the comments, I was able to devise the following solution:
final Flux<Integer> baseFlux = Flux.range(0, 3).log();
final Flux<? extends Tuple2<? extends Integer, Integer>> zipped = baseFlux
.switchOnFirst((signal, flux) -> (signal.hasValue()
? Flux.zip(Flux.just(signal.get()).repeat(), flux.skip(1L))
: Flux.empty()));
final List<? extends Tuple2<? extends Integer, Integer>> list = zipped.collectList().block();
assertThat(list.stream().map(Tuple2::getT1)).isEqualTo(Arrays.asList(0, 0));
assertThat(list.stream().map(Tuple2::getT2)).isEqualTo(Arrays.asList(1, 2));
It transforms the base Flux after the first element by zipping that element, repeated, with the tail of the original flux. And it subscribes to the baseFlux only once.
I'm not sure this is the best solution, as it creates a lot of Tuple2 objects which will eventually be GC'd, compared to a solution with a stateful ("hot") flux based on the baseFlux that keeps the original subscription alive.
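For the CSV case specifically, the same switchOnFirst pattern can be applied without materializing Tuple2 pairs, by capturing the header from the first signal and parsing every remaining line against it inside the transformer. This is only a sketch under the assumption that the file arrives as a Flux<String> of lines and that a naive comma split is enough for parsing (a real CSV parser would replace the split calls):
final Flux<String> lines = Flux.just("id,name", "1,alice", "2,bob"); // stand-in for the lines read from the InputStream
final Flux<Map<String, String>> rows = lines.switchOnFirst((signal, flux) -> {
    if (!signal.hasValue()) {
        return Flux.empty(); // empty file: no header, so no rows
    }
    final String[] header = signal.get().split(",");
    // flux replays the first element, so skip the header line itself
    return flux.skip(1L).map(line -> {
        final String[] cells = line.split(",");
        final Map<String, String> row = new LinkedHashMap<>();
        for (int i = 0; i < header.length && i < cells.length; i++) {
            row.put(header[i], cells[i]);
        }
        return row;
    });
});
rows.subscribe(System.out::println); // prints {id=1, name=alice} then {id=2, name=bob}
Like the Tuple2 version, this subscribes to the source only once (it needs java.util.LinkedHashMap, java.util.Map and reactor.core.publisher.Flux imports).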

Related

xwork2.ActionSupport looping after application startup

I have a Struts 2.5.22 app that runs OK when deployed to a Kubernetes/Docker Tomcat, but it constantly loops through the log output shown below. Any ideas why it's doing this, and how to stop it?
10:32:32.884 [http-nio-8080-exec-6] DEBUG com.opensymphony.xwork2.ognl.SecurityMemberAccess - Checking access for [target: com.opensymphony.xwork2.ActionSupport#7f0a56a, member: public java.util.Locale com.opensymphony.xwork2.ActionSupport.getLocale(), property: locale]
10:32:32.884 [http-nio-8080-exec-6] DEBUG com.opensymphony.xwork2.ognl.SecurityMemberAccess - Checking access for [target: com.opensymphony.xwork2.ActionSupport#7f0a56a, member: public java.util.Locale com.opensymphony.xwork2.ActionSupport.getLocale(), property: locale]
10:32:32.883 [http-nio-8080-exec-6] DEBUG com.opensymphony.xwork2.conversion.impl.InstantiatingNullHandler - Entering nullPropertyValue [target=[com.opensymphony.xwork2.ActionSupport#7f0a56a, com.opensymphony.xwork2.DefaultTextProvider#4c9fe77e], property=org]
10:32:32.882 [http-nio-8080-exec-6] DEBUG com.opensymphony.xwork2.conversion.impl.InstantiatingNullHandler - Entering nullPropertyValue [target=[com.opensymphony.xwork2.ActionSupport#7f0a56a, com.opensymphony.xwork2.DefaultTextProvider#4c9fe77e], property=helpKey]

Flux.zip method not emitting all elements

I am working with Reactive Streams and Publishers (Mono and Flux), and I am combining the two publishers using the zip and zipWith methods of Flux as follows:
Flux<String> flux1 = Flux.just(" {1} ","{2} ","{3} ","{4} " );
Flux<String> flux2 = Flux.just(" |A|"," |B| "," |C| ");
Flux.zip(flux1, flux2,
(itemflux1, itemflux2) -> "[ "+itemflux1 + ":"+ itemflux2 + " ] " )
.subscribe(System.out::print);
and here is the output:
[ {1} : |A| ] [ {2} : |B| ] [ {3} : |C| ]
As flux1 has four elements and flux2 has three elements, the fourth element of flux1 gets lost. And when I tried to print the logs of the flux, there was no information about what happened to the fourth element.
Here is the statement for printing logs:
Flux.zip(flux1, flux2,
(itemflux1, itemflux2) -> "[ "+itemflux1 + ":"+ itemflux2 + " ] " ).log()
.subscribe(System.out::print);
and here is the console logs with using log method:
[info] onSubscribe(FluxZip.ZipCoordinator)
[info] request(unbounded)
[info] onNext([ {1} : |A| ] )
[ {1} : |A| ] [info] onNext([ {2} : |B| ] )
[ {2} : |B| ] [info] onNext([ {3} : |C| ] )
[ {3} : |C| ] [info] onComplete()
From the documentation of the zip method, I got:
The operator will continue doing so until any of the sources completes. Errors will immediately be forwarded. This "Step-Merge" processing is especially useful in Scatter-Gather scenarios.
But in my case, it did not log any error and did not log any message about the lost element.
How can I get information about the lost element?
Please suggest.
zip/zipWith will output as many pairs as there are elements in the shortest Flux. It cancels the longer Flux upon termination of the shorter one, which should be visible if you put the log() on the source Flux instead of the zipped one.
This is demonstrated by this snippet (which is tuned to show 1-by-1 requests and run as a unit test, hence the hide()/zipWith(..., 1) and blockLast()):
@Test
public void test() {
    Flux<Integer> flux1 = Flux.range(1, 4).hide().log("\tFLUX 1");
    Flux<Integer> flux2 = Flux.range(10, 2).hide().log("\tFlux 2");
    flux1.zipWith(flux2, 1)
        .log("zipped")
        .blockLast();
}
Which outputs:
11:57:21.072 [main] INFO zipped - onSubscribe(FluxZip.ZipCoordinator)
11:57:21.077 [main] INFO zipped - request(unbounded)
11:57:21.079 [main] INFO FLUX 1 - onSubscribe(FluxHide.HideSubscriber)
11:57:21.079 [main] INFO FLUX 1 - request(1)
11:57:21.079 [main] INFO FLUX 1 - onNext(1)
11:57:21.079 [main] INFO Flux 2 - onSubscribe(FluxHide.HideSubscriber)
11:57:21.080 [main] INFO Flux 2 - request(1)
11:57:21.080 [main] INFO Flux 2 - onNext(10)
11:57:21.080 [main] INFO zipped - onNext([1,10])
11:57:21.080 [main] INFO FLUX 1 - request(1)
11:57:21.080 [main] INFO FLUX 1 - onNext(2)
11:57:21.080 [main] INFO Flux 2 - request(1)
11:57:21.080 [main] INFO Flux 2 - onNext(11)
11:57:21.080 [main] INFO zipped - onNext([2,11])
11:57:21.080 [main] INFO FLUX 1 - request(1)
11:57:21.080 [main] INFO FLUX 1 - onNext(3)
11:57:21.080 [main] INFO Flux 2 - request(1)
11:57:21.080 [main] INFO Flux 2 - onComplete()
11:57:21.081 [main] INFO FLUX 1 - cancel() <----- HERE
11:57:21.081 [main] INFO Flux 2 - cancel()
11:57:21.081 [main] INFO zipped - onComplete()
This is the "until any of the sources completes" part.
This is the expected behavior of this operator.
With Flux.zip, one of the provided Flux might be an infinite one; a common example of that is zipping a Flux of data with a Flux.interval(Duration duration) instance (which is infinite).
If you're stuck in this situation, it probably means that you need to use a different operator.
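As an illustration of the Flux.interval case mentioned above (not taken from the original answer), zipping a data Flux with Flux.interval is a common way to emit one element per tick; the interval is infinite, so the zip completes when the data Flux does:
Flux<String> data = Flux.just("a", "b", "c");
Flux.zip(data, Flux.interval(Duration.ofMillis(100)))
    .map(Tuple2::getT1)
    .subscribe(System.out::println); // prints a, b, c roughly 100 ms apart
// Flux.interval runs on a parallel Scheduler, so a plain main method would need
// to block or sleep long enough to see the output before the JVM exits.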
To explain what the documentation says:
"Errors will immediately be forwarded." --> means that if there is an error in the combinator function, it is immediately forwarded. You can check this by making one of the entries in either Flux null.
There is no way to get the lost element, because the stream is not read further once one of the streams has ended. I hope that is clear. If you really want the last element of the stream, try other operators.
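If you genuinely need every element of the longer Flux to appear in the output, one workaround (a sketch, not behaviour the zip operator provides by itself) is to pad the shorter source with a repeated placeholder, so the zip only completes when the longer source does:
Flux<String> flux1 = Flux.just("{1}", "{2}", "{3}", "{4}");
Flux<String> flux2 = Flux.just("|A|", "|B|", "|C|")
    .concatWith(Flux.just("<missing>").repeat()); // placeholder for the elements flux2 does not have
Flux.zip(flux1, flux2, (a, b) -> "[ " + a + " : " + b + " ] ")
    .subscribe(System.out::print);
// [ {1} : |A| ] [ {2} : |B| ] [ {3} : |C| ] [ {4} : <missing> ]
Whether a placeholder makes sense is use-case specific; often switching to a different operator really is the better answer.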

Unable to retrieve XForms engine state

I am working with Orbeon XForms 4.7 on eXist-db 2.2. We've always had session problems where a session would expire in eXist-db after a given point in time, but where Orbeon retains a session token and 'thinks' it is still logged on. This may or may not be at the heart of what I see in the logging:
2016-11-07 15:06:27,643 INFO ProcessorService - /xforms-server - Received request
2016-11-07 15:06:27,650 INFO ProcessorService - /xforms-server - Timing: 7
2016-11-07 15:06:29,368 INFO ProcessorService - /xforms-server - Received request
2016-11-07 15:06:29,372 ERROR XFormsServer - Unable to retrieve XForms engine state. Please reload the current page. Note that you will lose any unsaved changes. {}
2016-11-07 15:06:29,372 INFO ProcessorService - /xforms-server - Timing: 4
2016-11-07 15:06:29,373 ERROR ProcessorService -
+----------------------------------------------------------------------------------------------------------------------+
|An Error has Occurred |
|----------------------------------------------------------------------------------------------------------------------|
|Unable to retrieve XForms engine state. Please reload the current page. Note that you will lose any unsaved changes. |
|----------------------------------------------------------------------------------------------------------------------|
|Application Call Stack |
|----------------------------------------------------------------------------------------------------------------------|
|oxf:/config/prologue-servlet.xpl |executing processor | 36|
|······················································································································|
|element=<p:processor name="oxf:pipeline">[...]</p:processor> |
|name ={http://www.orbeon.com/oxf/processors}pipeline |
|----------------------------------------------------------------------------------------------------------------------|
|oxf:/ops/xforms/xforms-server.xpl |reading processor output | 43|
|······················································································································|
|element=<p:output name="response" id="xforms-response"/> |
|name =response |
|id =xforms-response |
|----------------------------------------------------------------------------------------------------------------------|
|----------------------------------------------------------------------------------------------------------------------|
|Exception: org.orbeon.oxf.common.OXFException |
|----------------------------------------------------------------------------------------------------------------------|
|org.orbeon.oxf.xforms.state.XFormsStateManager |createDocumentFromStore |XFormsStateManager.java | 489|
|org.orbeon.oxf.xforms.state.XFormsStateManager |findOrRestoreDocument |XFormsStateManager.java | 448|
|org.orbeon.oxf.xforms.state.XFormsStateManager |beforeUpdate |XFormsStateManager.java | 328|
|org.orbeon.oxf.xforms.processor.XFormsServer |doIt |XFormsServer.java | 169|
|org.orbeon.oxf.xforms.processor.XFormsServer |access$000 |XFormsServer.java | 59|
|org.orbeon.oxf.xforms.processor.XFormsServer$1 |readImpl |XFormsServer.java | 85|
|essor.impl.ProcessorOutputImpl$TopLevelOutputFilter|read |ProcessorOutputImpl.java | 257|
|org.orbeon.oxf.processor.impl.ProcessorOutputImpl |read |ProcessorOutputImpl.java | 394|
|----------------------------------------------------------------------------------------------------------------------|
|Exception: org.orbeon.oxf.common.ValidationException |
|----------------------------------------------------------------------------------------------------------------------|
|org.orbeon.oxf.common.OrbeonLocationException$ |wrapException |OrbeonLocationException.scala | 60|
|org.orbeon.oxf.common.OrbeonLocationException |wrapException |OrbeonLocationException.scala | |
This error comes up every so many seconds. There is no mention of which XForms form triggers it, so a targeted search is out of the question.
Does anyone know how to interpret this error and possibly how to prevent it? Note that there's no error on the user side at all, and everything appears to work fine. This is purely server side.

PredictionIO train error tokens must not be empty

I am tinkering with PredictionIO to build a custom classification engine. I have done this before without issues, but for the current dataset, pio train is giving me the error "tokens must not be empty". I have edited DataSource.scala to tell the engine which fields of the dataset to use. A line from my dataset is as below:
{"event": "ticket", "eventTime": "2015-02-16T05:22:13.477+0000", "entityType": "content","entityId": 365,"properties":{"text": "Request to reset svn credentials","label": "Linux/Admin Task" }}
I can import data and build the engine without any issues, and I am getting a set of observations too. The error is pasted below:
[INFO] [Remoting] Starting remoting
[INFO] [Remoting] Remoting started; listening on addresses :[akka.tcp://sparkDriver#192.168.61.44:50713]
[INFO] [Engine$] EngineWorkflow.train
[INFO] [Engine$] DataSource: org.template.textclassification.DataSource#4fb64e14
[INFO] [Engine$] Preparator: org.template.textclassification.Preparator#5c4cc644
[INFO] [Engine$] AlgorithmList: List(org.template.textclassification.NBAlgorithm#62b6c045)
[INFO] [Engine$] Data sanity check is off.
[ERROR] [Executor] Exception in task 0.0 in stage 2.0 (TID 2)
[WARN] [TaskSetManager] Lost task 0.0 in stage 2.0 (TID 2, localhost): java.lang.IllegalArgumentException: tokens must not be empty
at opennlp.tools.util.StringList.<init>(StringList.java:61)
at org.template.textclassification.PreparedData.org$template$textclassification$PreparedData$$hash(Preparator.scala:71)
at org.template.textclassification.PreparedData$$anonfun$2.apply(Preparator.scala:113)
at org.template.textclassification.PreparedData$$anonfun$2.apply(Preparator.scala:113)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:202)
at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:56)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:64)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
[ERROR] [TaskSetManager] Task 0 in stage 2.0 failed 1 times; aborting job
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 2.0 failed 1 times, most recent failure: Lost task 0.0 in stage 2.0 (TID 2, localhost): java.lang.IllegalArgumentException: tokens must not be empty
at opennlp.tools.util.StringList.<init>(StringList.java:61)
at org.template.textclassification.PreparedData.org$template$textclassification$PreparedData$$hash(Preparator.scala:71)
at org.template.textclassification.PreparedData$$anonfun$2.apply(Preparator.scala:113)
at org.template.textclassification.PreparedData$$anonfun$2.apply(Preparator.scala:113)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:202)
at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:56)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:64)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1204)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1193)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1192)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1192)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:693)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1393)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1354)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
The problem is with the dataset. I tried splitting the dataset into parts and training on them; training completed for those parts and no errors were reported. How can I find out which line in the dataset produces the error? It would be very helpful if this feature were in PredictionIO.
So this is something that happens when you feed an empty Array[String] to OpenNLP's StringList constructor. Try modifying the hash function in PreparedData as follows:
private def hash (tokenList : Array[String]): HashMap[String, Double] = {
  // Initialize an NGramModel from the OpenNLP tools library,
  // and add the list of allowable tokens to the n-gram model.
  try {
    val model : NGramModel = new NGramModel()
    model.add(new StringList(tokenList: _*), nMin, nMax)
    val map : HashMap[String, Double] = HashMap(
      model.iterator.map(
        x => (x.toString, model.getCount(x).toDouble)
      ).toSeq : _*
    )
    val mapSum = map.values.sum
    // Divide by the total number of n-grams in the document
    // to obtain n-gram frequency.
    map.map(e => (e._1, e._2 / mapSum))
  } catch {
    // An empty token array makes the StringList constructor throw;
    // fall back to an empty-feature map instead.
    case (e : IllegalArgumentException) => HashMap("" -> 0.0)
  }
}
I've only encountered this issue in the prediction stage, and so you can see this is actually implemented in the models' predict methods. I'll update this right now, and put it in a new version release. Thank you for the catch and feedback!

Grails 2.3.6 Scaffolded index page throws ArrayIndexOutOfBoundsException

I have a Grails application which is failing at runtime in a cryptic way
(cryptic to me anyway)
with an ArrayIndexOutOfBoundsException when I visit the scaffolded /imca2/imcaReferral/index.
* now edited to put solution at end *
There are about a dozen domain classes.
I haven't got round to worrying about the UI yet so the controllers are all dynamically scaffolded.
All the other controllers work OK.
This Controller:
package com.ubergen
class ImcaReferralController {
    def scaffold = ImcaReferral
}
For this Domain:
package com.ubergen
class ImcaReferral {
    private def todayDate = new Date()
    String advocacyReferenceNum = ""
    [snip a lot of code]
    String toString() {
        "${this.advocacyReferenceNum}: ${this.client?this.client:'-'}${this.referralIssue?', '+this.referralIssue:''}"
    }
}
(I don't want to post the domain class here as it's huge.)
Produces this stacktrace:
|Server running. Browse to http://localhost:8080/imca2
| Error 2014-03-12 18:48:24,935 [http-bio-8080-exec-3] ERROR errors.GrailsExceptionResolver - ArrayIndexOutOfBoundsException occurred when processing request: [GET] /imca2/imcaReferral/index
0. Stacktrace follows:
Message: 0
Line | Method
->> 55 | <init> in grails.orm.PagedResultList
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
| 15 | $tt__index in com.ubergen.ImcaReferralController
| 191 | doFilter . in grails.plugin.cache.web.filter.PageFragmentCachingFilter
| 63 | doFilter in grails.plugin.cache.web.filter.AbstractFilter
| 1146 | runWorker in java.util.concurrent.ThreadPoolExecutor
| 615 | run in java.util.concurrent.ThreadPoolExecutor$Worker
^ 701 | run . . . in java.lang.Thread
Cleaning and (re)compiling make no difference.
The domain class is being used during bootstrapping to push data successfully into the database, so it works to that extent.
I can run the application from the command line instead of from inside eclipse/STS. The same error is thrown.
run-app --noreloading makes no difference either (clutching at straws now). And run-war also produces the same error.
run-app --verbose shows:
| Error 2014-03-12 19:58:37,745 [http-bio-8080-exec-1] ERROR errors.GrailsExceptionResolver - ArrayIndexOutOfBoundsException occurred when processing request: [GET] /imca2/imcaReferral/index
0. Stacktrace follows:
java.lang.ArrayIndexOutOfBoundsException: 0
at org.hibernate.criterion.Order.toSqlString(Order.java:73)
at org.hibernate.loader.criteria.CriteriaQueryTranslator.getOrderBy(CriteriaQueryTranslator.java:394)
[snip]
at grails.orm.PagedResultList.<init>(PagedResultList.java:55)
[snip]
at com.ubergen.ImcaReferral.list(ImcaReferral.groovy)
[snip]
at com.ubergen.ImcaReferralController.$tt__index(script1394654146228610896735.groovy:15)
[snip]
So the index page calls the domain's list(), and this is a problem in some way, but not in enough of a way that it gets mentioned in the stacktrace.
Where should I look first for the problem?
Versions:
ubuntu 10.04
eclipse / SpringToolSuite 3.4.0
grails 2.3.6
groovy 2.1.9 (for both project and workspace)
Update 13/03/2014
I followed Joe's suggestions (below) and found that the problem is indeed in the ImcaReferral.list() method.
In the grails console simply running:
package com.ubergen
ImcaReferral.withTransaction { status ->
    ImcaReferral.list()
}
Returns
java.lang.ArrayIndexOutOfBoundsException: 0
at org.hibernate.criterion.Order.toSqlString(Order.java:73)
at org.hibernate.loader.criteria.CriteriaQueryTranslator.getOrderBy(CriteriaQueryTranslator.java:394)
[snip]
at com.ubergen.ImcaReferral.list(ImcaReferral.groovy)
Looking at the domain's sort order information: BINGO! It's incorrectly defined; there are two competing definitions of how to sort the domain.
I commented out the erroneous sort order information:
package com.ubergen
class ImcaReferral {
    ...
    static hasMany = [challenges:Challenge]
    static mapping = {
        ...
        sort dateReceived:'asc'
        // sort challenges:'challengeRoute' // *** ERROR ***
    }
}
and (after restarting the console) the call to list works fine and returns an empty array.
Correcting the sort order of the child records:
package com.ubergen
class ImcaReferral {
    ...
    static hasMany = [challenges:Challenge]
    static mapping = {
        ...
        sort dateReceived:'asc'
        challenges sort: 'challengeRoute', order: 'asc' // *** CORRECT ***
    }
}
Fixes the problem. The scaffolding now works.
Conclusions
Trust the full stacktrace even if it's rather verbose. It shows the classes and methods to look at.
Learn to use the console:
grails -reloading console
Read your code more carefully!
You could try generating the static scaffolding and see if you get a different result. You could also try running the list in an integration test to see what happens.
