I'm trying to aggregate (per key) a streaming data source in Apache Beam (via Scio) using a stateful DoFn (using @ProcessElement with @StateId ValueState elements). I thought this would be the most appropriate approach for the problem I'm trying to solve. The requirements are:
for a given key, records are aggregated (essentially summed) across all time - I don't care about previously computed aggregates, just the most recent
keys may be evicted from the state (state.clear()) based on certain conditions that I control
Every 5 minutes, regardless of whether any new keys were seen, all keys that haven't been evicted from the state should be output
Given that this is a streaming pipeline that will run indefinitely, using combinePerKey over a global window with accumulating fired panes seems like it will keep growing its memory footprint and the amount of data it has to process over time, so I'd like to avoid it. Additionally, when testing this out, (maybe as expected) it simply appends the newly computed aggregates to the output along with the historical input, rather than emitting only the latest value for each key.
My thought was that using a stateful DoFn would simply allow me to output all of the global state up until now(), but it seems this isn't a trivial solution. I've seen hints at using timers to artificially execute callbacks for this, as well as potentially using a slowly-growing side input map (How to solve Duplicate values exception when I create PCollectionView<Map<String,String>>) and somehow flushing it, but that would essentially require iterating over all values in the map rather than joining on it.
I feel like I might be overlooking something simple to get this working. I'm relatively new to many of the windowing and timer concepts in Beam, so I'm looking for any advice on how to solve this. Thanks!
You are right that a stateful DoFn should help you here. This is a basic sketch of what you can do. Note that it only outputs the sum, without the key. It may not be exactly what you want, but it should help you move forward.
import org.apache.beam.sdk.coders.VarIntCoder;
import org.apache.beam.sdk.state.CombiningState;
import org.apache.beam.sdk.state.StateSpec;
import org.apache.beam.sdk.state.StateSpecs;
import org.apache.beam.sdk.state.TimeDomain;
import org.apache.beam.sdk.state.Timer;
import org.apache.beam.sdk.state.TimerSpec;
import org.apache.beam.sdk.state.TimerSpecs;
import org.apache.beam.sdk.state.ValueState;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.Sum;
import org.apache.beam.sdk.values.KV;
import org.joda.time.Duration;

class CombiningEmittingFn extends DoFn<KV<Integer, Integer>, Integer> {

    @TimerId("emitter")
    private final TimerSpec emitterSpec = TimerSpecs.timer(TimeDomain.PROCESSING_TIME);

    @StateId("done")
    private final StateSpec<ValueState<Boolean>> doneState = StateSpecs.value();

    @StateId("agg")
    private final StateSpec<CombiningState<Integer, int[], Integer>> aggSpec =
        StateSpecs.combining(
            Sum.ofIntegers().getAccumulatorCoder(null, VarIntCoder.of()), Sum.ofIntegers());

    @ProcessElement
    public void processElement(ProcessContext c,
                               @StateId("agg") CombiningState<Integer, int[], Integer> aggState,
                               @StateId("done") ValueState<Boolean> doneState,
                               @TimerId("emitter") Timer emitterTimer) throws Exception {
        if (SOME_CONDITION) { // the eviction condition that you control
            aggState.clear();
            doneState.write(true);
        } else {
            aggState.add(c.element().getValue());
            emitterTimer.align(Duration.standardMinutes(5)).setRelative();
        }
    }

    @OnTimer("emitter")
    public void onEmit(OnTimerContext context,
                       @StateId("agg") CombiningState<Integer, int[], Integer> aggState,
                       @StateId("done") ValueState<Boolean> doneState,
                       @TimerId("emitter") Timer emitterTimer) {
        Boolean isDone = doneState.read();
        if (isDone != null && isDone) {
            // Key was evicted; stop re-arming the timer.
            return;
        } else {
            context.output(aggState.read());
            // Set the timer to emit again
            emitterTimer.align(Duration.standardMinutes(5)).setRelative();
        }
    }
}
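If it helps, a toy wiring of the sketch might look like this (a hedged example of mine, with a bounded source purely for illustration; your real use case would be an unbounded one, and since the timer is processing-time based, output only appears once the five-minute boundary passes):
// Toy wiring only: Create.of stands in for a real unbounded source.
Pipeline p = Pipeline.create();
p.apply(Create.of(KV.of(1, 10), KV.of(1, 32), KV.of(2, 7)))
    .apply(ParDo.of(new CombiningEmittingFn()));
p.run().waitUntilFinish();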
Happy to iterate with you on something that'll work.
@Pablo was indeed correct that a stateful DoFn and timers are useful in this scenario. Here is the code I was able to get working.
Stateful DoFn
// DomainState is a custom case class I'm using
type DoFnT = DoFn[KV[String, DomainState], KV[String, DomainState]]

class StatefulDoFn extends DoFnT {

  @StateId("key")
  private val keySpec = StateSpecs.value[String]()

  @StateId("domainState")
  private val domainStateSpec = StateSpecs.value[DomainState]()

  @TimerId("loopingTimer")
  private val loopingTimer: TimerSpec = TimerSpecs.timer(TimeDomain.EVENT_TIME)

  @ProcessElement
  def process(
      context: DoFnT#ProcessContext,
      @StateId("key") stateKey: ValueState[String],
      @StateId("domainState") stateValue: ValueState[DomainState],
      @TimerId("loopingTimer") loopingTimer: Timer): Unit = {

    // ... logic to create key/value from potentially null values

    if (keepState(value)) {
      loopingTimer.align(Duration.standardMinutes(5)).setRelative()
      stateKey.write(key)
      stateValue.write(value)

      if (flushState(value)) {
        context.output(KV.of(key, value))
      }
    } else {
      stateValue.clear()
    }
  }

  @OnTimer("loopingTimer")
  def onLoopingTimer(
      context: DoFnT#OnTimerContext,
      @StateId("key") stateKey: ValueState[String],
      @StateId("domainState") stateValue: ValueState[DomainState],
      @TimerId("loopingTimer") loopingTimer: Timer): Unit = {

    // ... logic to create key/value checking for nulls

    if (keepState(value)) {
      loopingTimer.align(Duration.standardMinutes(5)).setRelative()

      if (flushState(value)) {
        context.output(KV.of(key, value))
      }
    }
  }
}
With pipeline
sc
  .pubsubSubscription(...)
  .keyBy(...)
  .withGlobalWindow()
  .applyPerKeyDoFn(new StatefulDoFn())
  .withFixedWindows(
    duration = Duration.standardMinutes(5),
    options = WindowOptions(
      accumulationMode = DISCARDING_FIRED_PANES,
      trigger = AfterWatermark.pastEndOfWindow(),
      allowedLateness = Duration.ZERO,
      // Only take the latest per key during a window
      timestampCombiner = TimestampCombiner.END_OF_WINDOW
    ))
  .reduceByKey(mostRecentEvent())
  .saveAsCustomOutput(TextIO.write()...)
Related
This question is a follow-on after such a great answer: Is there a way to upload jars for a dataflow job so we don't have to serialize everything?
This made me realize 'ok, what I want is injection with no serialization so that I can mock and test'.
Our current method requires our apis/mocks to be serializable, BUT THEN I have to put static fields in the mock because it gets serialized and deserialized, creating a new instance that dataflow uses.
My colleague pointed out that perhaps this needs to be a sink and that is treated differently? <- We may try that later and update but we are not sure right now.
My desire is from the top to replace the apis with mocks during testing. Does someone have an example for this?
Here is our bootstrap code that does not know whether it is in production or inside a feature test. We test end-to-end results with no Apache Beam imports in our tests, meaning we can swap to any tech if we want to pivot and keep all our tests. Not only that, we catch way more integration bugs and can refactor without rewriting tests, since the contracts we test are customer ones we can't easily change.
public class App {
    private Pipeline pipeline;
    private RosterFileTransform transform;

    @Inject
    public App(Pipeline pipeline, RosterFileTransform transform) {
        this.pipeline = pipeline;
        this.transform = transform;
    }

    public void start() {
        pipeline.apply(transform);
        pipeline.run();
    }
}
Notice that everything we do is Guice-injection based, so the Pipeline may be the direct runner or not. I may need to modify this class to pass things through :( but anything that works for now would be great.
The function I am trying to get our api (and its mock and impl) into with no serialization is this:
private class ValidRecordPublisher extends DoFn<Validated<PractitionerDataRecord>, String> {
    @ProcessElement
    public void processElement(@Element Validated<PractitionerDataRecord> element) {
        microServiceApi.writeRecord(element.getValue());
    }
}
I am not sure how to pass in microServiceApi in a way that avoids serialization. I would also be ok with delayed creation after deserialization, using a Guice Provider provider; with provider.get(), if there is a solution there too.
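To make the delayed-creation idea concrete, this is roughly the hypothetical shape I have in mind (whether the Provider itself can survive serialization is exactly the part I don't know):
import com.google.inject.Provider;

// Hypothetical sketch: delay creation until after deserialization on the
// worker, so the API itself never has to be Serializable.
private class ValidRecordPublisher extends DoFn<Validated<PractitionerDataRecord>, String> {
    private final Provider<MicroServiceApi> provider; // would have to be serializable somehow
    private transient MicroServiceApi microServiceApi;

    ValidRecordPublisher(Provider<MicroServiceApi> provider) {
        this.provider = provider;
    }

    @ProcessElement
    public void processElement(@Element Validated<PractitionerDataRecord> element) {
        // Lazily create the API on the worker, after deserialization.
        if (microServiceApi == null) {
            microServiceApi = provider.get();
        }
        microServiceApi.writeRecord(element.getValue());
    }
}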
Solved in such a way that mocks no longer need statics or serialization anymore, by one single class bridging the world of dataflow (in prod and in test) like so:
NOTE: There is additional magic-ness we have in our company that passes headers through from service to service and through dataflow, and some of that is in there, which you can ignore (i.e. the RouterRequest request = Current.request();). So for anyone else, they will have to pass projectId into getInstance each time.
public abstract class DataflowClientFactory implements Serializable {
    private static final Logger log = LoggerFactory.getLogger(DataflowClientFactory.class);

    public static final String PROJECT_KEY = "projectKey";

    private transient static Injector injector;
    private transient static Module overrides;
    private static int counter = 0;

    public DataflowClientFactory() {
        counter++;
        log.info("creating again(usually due to deserialization). counter=" + counter);
    }

    public static void injectOverrides(Module dfOverrides) {
        overrides = dfOverrides;
    }

    private synchronized void initialize(String project) {
        if (injector != null)
            return;

        /********************************************
         * The hardest part is this piece since this is specific to each Dataflow
         * so each project subclasses DataflowClientFactory
         * This solution is the best ONLY in the fact of time crunch and it works
         * decently for end to end testing without developers needing fancy
         * wrappers around mocks anymore.
         ***/
        Module module = loadProjectModule();
        Module modules = Modules.combine(module, new OrderlyDataflowModule(project));

        if (overrides != null) {
            modules = Modules.override(modules).with(overrides);
        }

        injector = Guice.createInjector(modules);
    }

    protected abstract Module loadProjectModule();

    public <T> T getInstance(Class<T> clazz) {
        if (!Current.isContextSet()) {
            throw new IllegalStateException("Someone on the stack is extending DoFn instead of OrderlyDoFn so you need to fix that first");
        }
        RouterRequest request = Current.request();
        String project = (String) request.requestState.get(PROJECT_KEY);
        initialize(project);
        return injector.getInstance(clazz);
    }
}
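For illustration, a DoFn on a worker might then use the factory roughly like this (a sketch with assumed names; MicroServiceApi stands in for whatever interface the project module binds):
// Sketch only: the factory is Serializable and travels with the DoFn;
// the injector is rebuilt lazily on each worker inside getInstance().
private class ValidRecordPublisher extends DoFn<Validated<PractitionerDataRecord>, String> {
    private final DataflowClientFactory factory; // e.g. the project's subclass
    private transient MicroServiceApi microServiceApi;

    ValidRecordPublisher(DataflowClientFactory factory) {
        this.factory = factory;
    }

    @ProcessElement
    public void processElement(@Element Validated<PractitionerDataRecord> element) {
        // Resolves the real impl in prod and the mock under test.
        if (microServiceApi == null) {
            microServiceApi = factory.getInstance(MicroServiceApi.class);
        }
        microServiceApi.writeRecord(element.getValue());
    }
}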
I suppose this may not be what you're looking for, but your use case makes me think of using factory objects. They may depend on the pipeline options that you pass (i.e. your PipelineOptions object), or on some other configuration object.
Perhaps something like this:
class MicroserviceApiClientFactory implements Serializable {
    // Capture plain values at construction time; PipelineOptions itself
    // is not serializable, but these primitives/Strings are.
    private final boolean mockMicroserviceApi;
    private final String microserviceEndpoint;

    MicroserviceApiClientFactory(PipelineOptions options) {
        MySpecialOptions specialOpts = options.as(MySpecialOptions.class);
        this.mockMicroserviceApi = specialOpts.getMockMicroserviceApi();
        this.microserviceEndpoint = specialOpts.getMicroserviceEndpoint();
    }

    public MicroserviceApiClient getClient() {
        if (mockMicroserviceApi) {
            return new MockedMicroserviceApiClient(...); // Or whatever
        } else {
            return new MicroserviceApiClient(microserviceEndpoint); // Or whatever parameters it needs
        }
    }
}
And for your DoFns and any other execution-time objects that need it, you would pass the factory:
private class ValidRecordPublisher extends DoFn<Validated<PractitionerDataRecord>, String> {
    private final MicroserviceApiClientFactory msFactory;
    private transient MicroserviceApiClient microServiceApi;

    ValidRecordPublisher(MicroserviceApiClientFactory msFactory) {
        this.msFactory = msFactory;
    }

    @ProcessElement
    public void processElement(@Element Validated<PractitionerDataRecord> element) {
        if (microServiceApi == null) microServiceApi = msFactory.getClient();
        microServiceApi.writeRecord(element.getValue());
    }
}
This should allow you to encapsulate the mocking functionality into a single class that lazily creates your mock or your client at pipeline execution time.
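For example, wiring it together at pipeline-construction time could look roughly like this (hypothetical names, reusing the sketches above):
// validatedRecords is assumed to be a PCollection<Validated<PractitionerDataRecord>>.
// The factory captures what it needs from the options up front, then
// travels to the workers inside the serialized DoFn.
MicroserviceApiClientFactory msFactory = new MicroserviceApiClientFactory(options);
validatedRecords.apply(ParDo.of(new ValidRecordPublisher(msFactory)));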
Let me know if this matches what you want somewhat, or if we should try to iterate further.
I have no experience with Guice, so I don't know if Guice configurations can easily pass the boundary between pipeline construction and pipeline execution (serialization / submitting JARs / etc.).
Should this be a sink? Maybe, if you have an external service and you're writing to it, you can write a PTransform that takes care of it - but the question of how you inject various dependencies will remain.
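If you do try the sink route, a minimal sketch of such a PTransform might look like this (again just a sketch building on the factory idea above; the dependency question is unchanged):
// A write "sink" as a composite PTransform: the dependency is still
// injected via the factory, just hidden behind a single transform.
class WriteToMicroservice extends PTransform<PCollection<Validated<PractitionerDataRecord>>, PDone> {
    private final MicroserviceApiClientFactory msFactory;

    WriteToMicroservice(MicroserviceApiClientFactory msFactory) {
        this.msFactory = msFactory;
    }

    @Override
    public PDone expand(PCollection<Validated<PractitionerDataRecord>> input) {
        input.apply(ParDo.of(new ValidRecordPublisher(msFactory)));
        return PDone.in(input.getPipeline());
    }
}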
I am working on a project where we are migrating a massive number (more than 12,000) of views to Hadoop/Impala from Oracle. I have written a small Java utility to extract view DDL from Oracle and would like to use ANTLR4 to traverse the AST and generate an Impala-compatible view DDL statement.
Most of the work is relatively simple; it only involves rewriting some Oracle-specific syntax quirks to Impala style. However, I am facing an issue where I am not sure I have the best answer yet: we have a number of special cases where values from a date field are extracted in multiple nested function calls. For example, the following extracts the day from a Date field:
TO_NUMBER(TO_CHAR(d.R_DATE, 'DD'))
I have an ANTLR4 grammar declared for Oracle SQL, and hence get the visitor callback when it reaches TO_NUMBER and TO_CHAR as well, but I would like to have special handling for this particular case.
Is there no other way than implementing the handler method for the outer function and then resorting to manual traversal of the nested structure to see whether it matches the special case?
I have something like this in the generated Visitor class:
@Override
public String visitNumber_function(PlSqlParser.Number_functionContext ctx) {
    // FIXME: seems to be dodgy code, can it be improved?
    String functionName = ctx.name.getText();
    if (functionName.equalsIgnoreCase("TO_NUMBER")) {
        final int childCount = ctx.getChildCount();
        if (childCount == 4) {
            final int functionNameIndex = 0;
            final int openRoundBracketIndex = 1;
            final int encapsulatedValueIndex = 2;
            final int closeRoundBracketIndex = 3;

            ParseTree encapsulated = ctx.getChild(encapsulatedValueIndex);
            if (encapsulated instanceof TerminalNode) {
                throw new IllegalStateException("TerminalNode is found at: " + encapsulatedValueIndex);
            }
            String customDateConversionOrNullOnOtherType =
                customDateConversionFromToNumberAndNestedToChar(encapsulated);
            if (customDateConversionOrNullOnOtherType != null) {
                // the child node contained our expected child element, so return the converted value
                return customDateConversionOrNullOnOtherType;
            }
            // otherwise the child was something unexpected, signalled by null
            // so simply fall-back to the default handler
        }
    }
    // some other numeric function, default handling
    return super.visitNumber_function(ctx);
}

private String customDateConversionFromToNumberAndNestedToChar(ParseTree parseTree) {
    // ...
}
For anyone hitting the same issue, the way to go seems to be:
Changing the grammar definition and introducing custom sub-types for the encapsulated expression of the nested function. Then it is possible to hook into the processing at precisely the desired location of the parse tree.
Using a second custom ParseTreeVisitor that captures the values of the function call and delegates the processing of the rest of the sub-tree back to the main, "outer" ParseTreeVisitor.
Once the second custom ParseTreeVisitor has finished visiting all the sub-parse-trees, I had the context information I required, and the whole sub-tree was visited properly.
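As a rough sketch of the second-visitor idea (the To_char_functionContext type and the expression() accessors are assumptions that depend entirely on how your grammar labels the sub-rules):
// Sketch: a small visitor that only captures the nested TO_CHAR call and
// hands everything else back to the main visitor.
class ToCharCapturingVisitor extends PlSqlParserBaseVisitor<String> {
    private final PlSqlParserBaseVisitor<String> mainVisitor;
    String capturedDateExpr; // e.g. "d.R_DATE"
    String capturedFormat;   // e.g. "'DD'"

    ToCharCapturingVisitor(PlSqlParserBaseVisitor<String> mainVisitor) {
        this.mainVisitor = mainVisitor;
    }

    @Override
    public String visitTo_char_function(PlSqlParser.To_char_functionContext ctx) {
        // Capture the pieces needed for the Impala rewrite; delegate the
        // date expression itself back to the main visitor.
        capturedDateExpr = ctx.expression(0).accept(mainVisitor);
        capturedFormat = ctx.expression(1).getText();
        return null;
    }
}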
I have some initial state in my application and a few policies that decorate this state with reactively fetched data (each policy's Mono returns a new instance of the state with additional data). Eventually I get a fully decorated state.
It basically looks like this:
public interface Policy {
Mono<State> apply(State currentState);
}
Usage for a fixed number of policies would look like this:
Flux.just(baseState)
    .flatMap(firstPolicy::apply)
    .flatMap(secondPolicy::apply)
    ...
    .subscribe();
It basically means that the entry state for each Mono is the result of accumulating the initial state and each of that Mono's predecessors.
In my case the number of policies is not fixed; it comes from another layer of the application as a collection of objects that implement the Policy interface.
Is there any way to achieve a similar result as in the given code (with the two flatMap calls), but for an unknown number of policies? I have tried Flux's reduce method, but it works only if the policy returns a value, not a Mono.
This seems difficult because you're streaming your baseState, then trying to do an arbitrary number of flatMap() calls on that. There's nothing inherently wrong with using a loop to achieve this, but I like to avoid that unless absolutely necessary, as it breaks the natural reactive flow of the code.
If you instead iterate and reduce the policies into a single policy, then the flatMap() call becomes trivial:
Flux.fromIterable(policies)
    .reduce((p1, p2) -> s -> p1.apply(s).flatMap(p2::apply))
    .flatMap(p -> p.apply(baseState))
    .subscribe();
If you're able to edit your Policy interface, I'd strongly suggest adding a static combine() method to reference in your reduce() call to make that more readable:
interface Policy {
    Mono<State> apply(State currentState);

    public static Policy combine(Policy p1, Policy p2) {
        return s -> p1.apply(s).flatMap(p2::apply);
    }
}
The Flux then becomes much more descriptive and less verbose:
Flux.fromIterable(policies)
    .reduce(Policy::combine)
    .flatMap(p -> p.apply(baseState))
    .subscribe();
As a complete demonstration, swapping out your State for a String to keep it shorter:
interface Policy {
    Mono<String> apply(String currentState);

    public static Policy combine(Policy p1, Policy p2) {
        return s -> p1.apply(s).flatMap(p2::apply);
    }
}

public static void main(String[] args) {
    List<Policy> policies = new ArrayList<>();
    policies.add(x -> Mono.just("blah " + x));
    policies.add(x -> Mono.just("foo " + x));

    String baseState = "bar";

    Flux.fromIterable(policies)
        .reduce(Policy::combine)
        .flatMap(p -> p.apply(baseState))
        .subscribe(System.out::println); //Prints "foo blah bar"
}
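One edge case worth noting: if the policies collection can be empty, reduce() completes empty, so flatMap() never runs and nothing is emitted. A small tweak using an identity policy as the default handles that:
Flux.fromIterable(policies)
    .reduce(Policy::combine)
    .defaultIfEmpty(Mono::just) // identity policy: s -> Mono.just(s)
    .flatMap(p -> p.apply(baseState))
    .subscribe(System.out::println); // with no policies, prints "bar"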
If I understand the problem correctly, then the most simple solution is to use a regular for loop:
Flux<State> flux = Flux.just(baseState);
for (Policy policy : policies)
{
    flux = flux.flatMap(policy::apply);
}
flux.subscribe();
Also, note that if you have just a single baseState you can use Mono instead of Flux.
UPDATE:
If you are concerned about breaking the flow, you can extract the for loop into a method and apply it via transform operator:
Flux.just(baseState)
    .transform(this::applyPolicies)
    .subscribe();
private Publisher<State> applyPolicies(Flux<State> originalFlux)
{
    Flux<State> newFlux = originalFlux;
    for (Policy policy : policies)
    {
        newFlux = newFlux.flatMap(policy::apply);
    }
    return newFlux;
}
I want to get the current user location to update nearby stores based on the latitude/longitude inside the URL,
but I can't figure out how to share data between two different classes.
I want to make it work something like AppConfig.latitude = _position.latitude;. I tried several methods, including an InheritedWidget that I found on Stack Overflow and YouTube, but it still doesn't work. I'm definitely missing something.
When I use a bloc, I have no clue how to update the data inside class AppConfig with the bloc. Can it be done simply using setState? I spent the whole day yesterday Googling this problem. Please guide me to the right approach.
class _CurrentLocationState extends State<CurrentLocation> {
  Position _position;

  Future<void> _initPlatformState() async {
    Position position;
    try {
      final Geolocator geolocator = Geolocator()
      ...
      setState(() {
        _position = position;
        // print(${_position.latitude})  -> 35.9341...
        // print(${_position.longitude}) -> -117.0912...

        // I want to make it work something like this:
        AppConfig.latitude = _position.latitude;
        AppConfig.longitude = _position.longitude;

        // this is how I tried with bloc:
        latLongBloc.getUserLocation(LatLng(position.latitude, position.longitude));
      });

// latitude/longitude need to be updated with the current user location
abstract class AppConfig {
  static const double latitude = 0;
  static const double longitude = 0;
  static const List<String> storeName = ['starbucks'];
}

// I need to use AppConfig.latitude for the url in the Repository class
class Repository {
  ...
  Future<List<Business>> getBusinesses() async {
    String webAddress =
        "https://api.yelp.com/v3/businesses/search?latitude=${AppConfig.latitude}&longitude=${AppConfig.longitude}&term=${AppConfig.storeName}";
    ...
  }
}
This is my bloc.dart file:
class LatLongBloc {
  StreamController _getUserLocationStreamController = StreamController<LatLng>();

  Stream get getUserLocationStream => _getUserLocationStreamController.stream;

  dispose() {
    _getUserLocationStreamController.close();
  }

  getUserLocation(LatLng userLatLong) {
    _getUserLocationStreamController.sink.add(userLatLong);
  }
}

final latLongBloc = LatLongBloc();
You want to share state between classes/widgets, right? There are also other state management patterns, like ScopedModel or Redux. Each pattern has its pros and cons, but you don't have to use BLoC if you don't understand it.
I would recommend using ScopedModel because it's quite easy to understand, in my opinion. Your data/state is in a central place and can be accessed by using ScopedModel. If you don't like this approach, then try Redux or other patterns :)
Hope it helped you :D
Yours Glup3
I have the following EPL module which successfully deploys:
module context;
import events.*;
import configDemo.*;
import annotations.*;
import main.*;
import subscribers.*;
import listeners.*;
@Name('schemaCreator')
create schema InitEvent(firstStock String, secondStock String, bias double);

@Name('createSchemaEvent')
create schema TickEvent as TickEvent;

@Name('contextCreator')
create context TwoStocksContext
    initiated by InitEvent as initEvent;

@Name('compareStocks')
@Description('Compare the difference between two different stocks and make a decision')
@Subscriber('subscribers.MySubscriber')
context TwoStocksContext
select * from TickEvent
match_recognize (
    measures A.currentPrice as a_currentPrice, B.currentPrice as b_currentPrice,
             A.stockCode as a_stockCode, B.stockCode as b_stockCode
    pattern (A C* B)
    define
        A as A.stockCode = context.initEvent.firstStock,
        B as A.currentPrice - B.currentPrice >= context.initEvent.bias and
             B.stockCode = context.initEvent.secondStock
);
I have a problem with the listeners/subscribers. According to my checks and debugging, the classes don't have any problems, the annotations work, and they are attached to the statement upon deployment, yet neither of them receives any updates from the events.
This is my subscriber; I simply want to print that something has been received:
package subscribers;

import java.util.Map;

public class MySubscriber {
    public void update(Map row) {
        System.out.println("got it");
    }
}
I previously had the same module without any context partitions, and then the subscribers worked without a problem. After I added the context, it stopped.
So far I have tried:
Checking if the statement has any subscriber/listener attached (it does)
Checking their names
Removing the annotations and setting them manually within Java code after deployment (same thing - they attach, I can retrieve their names, but they still don't receive updates)
Debugging the subscriber class. The program either doesn't go there at all to stop at a breakpoint, or I get an error (a missing line number attribute error - "can't place a breakpoint there" - which I tried to fix to no avail)
Any idea what could cause this, or what is the best way to set a subscriber on a statement which has context partitions?
This is a continuation of a previous problem which was solved here - Creating instances of Esper's epl
EDIT: The events being sent, both in the format I use them and in the EPL online tool format:
I first get the pair to be followed from the user:
System.out.println("First stock:");
String first = scanner.nextLine();
System.out.println("Second stock:");
String second = scanner.nextLine();
System.out.println("Difference:");
double diff = scanner.nextDouble();
InitEvent init = new InitEvent(first, second, diff);
After that I have an engine thread that continuously sends events, but before it starts, an InitEvent is sent, as such:
@Override
public void run() {
    runtime.sendEvent(initEvent);

    while (contSimulation) {
        TickEvent tick1 = new TickEvent(Math.random() * 100, "YAH");
        runtime.sendEvent(tick1);

        TickEvent tick2 = new TickEvent(Math.random() * 100, "GOO");
        runtime.sendEvent(tick2);

        TickEvent tick3 = new TickEvent(Math.random() * 100, "IBM");
        runtime.sendEvent(tick3);

        TickEvent tick4 = new TickEvent(Math.random() * 100, "MIC");
        runtime.sendEvent(tick4);

        try {
            TimeUnit.SECONDS.sleep(1);
        } catch (InterruptedException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }

        latch.countDown();
    }
}
I haven't used the online tool before but I think I got it working. This is the module text:
module context;

create schema InitEvent(firstStock String, secondStock String, bias double);
create schema TickEvent(currentPrice double, stockCode String);

create context TwoStocksContext
    initiated by InitEvent as initEvent;

context TwoStocksContext
select * from TickEvent
match_recognize (
    measures A.currentPrice as a_currentPrice, B.currentPrice as b_currentPrice,
             A.stockCode as a_stockCode, B.stockCode as b_stockCode
    pattern (A C* B)
    define
        A as A.stockCode = context.initEvent.firstStock,
        B as A.currentPrice - B.currentPrice >= context.initEvent.bias and
             B.stockCode = context.initEvent.secondStock
);
And the sequence of events:
InitEvent={firstStock='YAH', secondStock = 'GOO', bias=5}
TickEvent={currentPrice=55.6, stockCode='YAH'}
TickEvent={currentPrice=50.4, stockCode='GOO'}
TickEvent={currentPrice=30.8, stockCode='MIC'}
TickEvent={currentPrice=24.9, stockCode='APP'}
TickEvent={currentPrice=51.6, stockCode='YAH'}
TickEvent={currentPrice=45.8, stockCode='GOO'}
TickEvent={currentPrice=32.8, stockCode='MIC'}
TickEvent={currentPrice=28.9, stockCode='APP'}
The result I get using them:
At: 2001-01-01 08:00:00.000
Statement: Stmt-4
Insert
Stmt-4-output={a_currentPrice=55.6, b_currentPrice=50.4, a_stockCode='YAH',
b_stockCode='GOO'}
At: 2001-01-01 08:00:00.000
Statement: Stmt-4
Insert
Stmt-4-output={a_currentPrice=51.6, b_currentPrice=45.8, a_stockCode='YAH',
b_stockCode='GOO'}
If I make the second set of events have a difference of less than 5 between YAH/GOO, I only get output from the first pair, which makes sense. This is, I think, what it is supposed to do.
In case needed, these two methods read and process the annotations of the EPL module (I didn't write them myself; they are taken from the coinTrader Context class that can be found here - https://github.com/timolson/cointrader/blob/master/src/main/java/org/cryptocoinpartners/module/Context.java):
private static Object getSubscriber(String className) throws Exception {
    Class<?> cl = Class.forName(className);
    return cl.newInstance();
}

private static void processAnnotations(EPStatement statement) throws Exception {
    Annotation[] annotations = statement.getAnnotations();
    for (Annotation annotation : annotations) {
        if (annotation instanceof Subscriber) {
            Subscriber subscriber = (Subscriber) annotation;
            Object obj = getSubscriber(subscriber.className());
            System.out.println(subscriber.className());
            statement.setSubscriber(obj);
        } else if (annotation instanceof Listeners) {
            Listeners listeners = (Listeners) annotation;
            for (String className : listeners.classNames()) {
                Class<?> cl = Class.forName(className);
                Object obj = cl.newInstance();
                if (obj instanceof StatementAwareUpdateListener) {
                    statement.addListener((StatementAwareUpdateListener) obj);
                } else {
                    statement.addListener((UpdateListener) obj);
                }
            }
        }
    }
}
Well, after a month of struggle I finally solved it. In case anyone has a similar problem in the future, here's where the problem was. The EPL worked fine in the online tool but not in my code. Eventually, I figured out the initial events weren't firing, hence the context partitions weren't being created, and as a result the subscribers and listeners did not receive any updates.
My mistake was that I fired a POJO InitEvent, but the event the context was using was created within the EPL module via create schema. I don't know what I was thinking; it makes sense now that it didn't work. The events I fire within Java aren't the events that the context uses.
My solution was only within the EPL. Since I couldn't figure out whether I can fire events from Java that are created within the module, I created a schema which is populated by my POJO, and that stream is then used by the context, as such:
@Name('schemaCreator')
create schema StartEvent(firstStock string, secondStock string, difference double);

@Name('insertInitEvent')
insert into StartEvent
select * from InitEvent;
All else remains the same, as well as the Java code.