Google Cloud Dataflow Custom Keys with common functionality - google-cloud-dataflow

We are using the Dataflow Java SDK and we have an increasing number of custom key classes that are almost the same.
I would like to have them extend a common abstract class however the Dataflow SDK seems to try to instantiate the abstract class causing an InstantiationException.
Caused by: java.lang.RuntimeException: java.lang.InstantiationException
at org.apache.avro.specific.SpecificData.newInstance(SpecificData.java:316)
at org.apache.avro.specific.SpecificData.newRecord(SpecificData.java:332)
at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:173)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:151)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
at com.google.cloud.dataflow.sdk.coders.AvroCoder.decode(AvroCoder.java:242)
at com.google.cloud.dataflow.sdk.coders.KvCoder.decode(KvCoder.java:97)
at com.google.cloud.dataflow.sdk.coders.KvCoder.decode(KvCoder.java:42)
at com.google.cloud.dataflow.sdk.util.CoderUtils.decodeFromSafeStream(CoderUtils.java:156)
at com.google.cloud.dataflow.sdk.util.CoderUtils.decodeFromByteArray(CoderUtils.java:139)
at com.google.cloud.dataflow.sdk.util.CoderUtils.decodeFromByteArray(CoderUtils.java:133)
at com.google.cloud.dataflow.sdk.util.MutationDetectors$CodedValueMutationDetector.<init>(MutationDetectors.java:108)
at com.google.cloud.dataflow.sdk.util.MutationDetectors.forValueWithCoder(MutationDetectors.java:45)
at com.google.cloud.dataflow.sdk.transforms.ParDo$ImmutabilityCheckingOutputManager.output(ParDo.java:1218)
at com.google.cloud.dataflow.sdk.util.DoFnRunner$DoFnContext.outputWindowedValue(DoFnRunner.java:329)
at com.google.cloud.dataflow.sdk.util.DoFnRunner$DoFnProcessContext.output(DoFnRunner.java:483)
at com.telstra.cdf.rmr.model.pardos.ParDoAbstractCampaignUAKeyExtractor.processElement(ParDoAbstractCampaignUAKeyExtractor.java:5
here is our abstract class,
#DefaultCoder(AvroCoder.class)
public abstract class SuperClassKey {
public SuperClassKey(){}
public abstract double getSomeValue();
}
and this is the sub class
#DefaultCoder(AvroCoder.class)
public class SubClassKey extends SuperClassKey {
public String foo;
public SubClassKey() {
}
public SubClassKey(String foo){
this.foo = foo;
}
#Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
SubClassKey that = (SubClassKey) o;
if (!foo.equals(that.foo)) return false;
return true;
}
#Override
public int hashCode() {
return foo.hashCode();
}
#Override
public double getSomeValue() {
return foo;
}
}
I have also tried using an interface without success.
Is it possible to have a common abstract class or interface between Keys?

The issue is likely from using a PCollection<SuperClassKey> instead of PCollection<SubClassKey>. The PCollection needs to be typed with a concrete class. The coder can be explicitly specified with .setCoder(AvroCoder.of(SubClassKey.class)) if type inference is not sufficient.

In my canse, i changed the Coder class, example:
Before:
AvroIO.parseGenericRecords(RecordConverter::convert)
.withCoder(AvroCoder.of(Struct.class)).from(...)
After:
AvroIO.parseGenericRecords(RecordConverter::convert)
.withCoder(SerializableCoder.of(Struct.class)).from(...)

Related

Dependency Injection in Apache Storm topology

Little background: I am working on a topology using Apache Storm, I thought why not use dependency injection in it, but I was not sure how it will behave on cluster environment when topology deployed to cluster. I started looking for answers on if DI is good option to use in Storm topologies, I came across some threads about Apache Spark where it was mentioned serialization is going to be problem and saw some responses for apache storm along the same lines. So finally I decided to write a sample topology with google guice to see what happens.
I wrote a sample topology with two bolts, and used google guice to injects dependencies. First bolt emits a tick tuple, then first bolt creates message, bolt prints the message on log and call some classes which does the same. Then this message is emitted to second bolt and same printing logic there as well.
First Bolt
public class FirstBolt extends BaseRichBolt {
private OutputCollector collector;
private static int count = 0;
private FirstInjectClass firstInjectClass;
#Override
public void prepare(Map map, TopologyContext topologyContext, OutputCollector outputCollector) {
collector = outputCollector;
Injector injector = Guice.createInjector(new Module());
firstInjectClass = injector.getInstance(FirstInjectClass.class);
}
#Override
public void execute(Tuple tuple) {
count++;
String message = "Message count "+count;
firstInjectClass.printMessage(message);
log.error(message);
collector.emit("TO_SECOND_BOLT", new Values(message));
collector.ack(tuple);
}
#Override
public void declareOutputFields(OutputFieldsDeclarer outputFieldsDeclarer) {
outputFieldsDeclarer.declareStream("TO_SECOND_BOLT", new Fields("MESSAGE"));
}
#Override
public Map<String, Object> getComponentConfiguration() {
Config conf = new Config();
conf.put(Config.TOPOLOGY_TICK_TUPLE_FREQ_SECS, 10);
return conf;
}
}
Second Bolt
public class SecondBolt extends BaseRichBolt {
private OutputCollector collector;
private SecondInjectClass secondInjectClass;
#Override
public void prepare(Map map, TopologyContext topologyContext, OutputCollector outputCollector) {
collector = outputCollector;
Injector injector = Guice.createInjector(new Module());
secondInjectClass = injector.getInstance(SecondInjectClass.class);
}
#Override
public void execute(Tuple tuple) {
String message = (String) tuple.getValue(0);
secondInjectClass.printMessage(message);
log.error("SecondBolt {}",message);
collector.ack(tuple);
}
#Override
public void declareOutputFields(OutputFieldsDeclarer outputFieldsDeclarer) {
}
}
Class in which dependencies are injected
public class FirstInjectClass {
FirstInterface firstInterface;
private final String prepend = "FirstInjectClass";
#Inject
public FirstInjectClass(FirstInterface firstInterface) {
this.firstInterface = firstInterface;
}
public void printMessage(String message){
log.error("{} {}", prepend, message);
firstInterface.printMethod(message);
}
}
Interface used for binding
public interface FirstInterface {
void printMethod(String message);
}
Implementation of interface
public class FirstInterfaceImpl implements FirstInterface{
private final String prepend = "FirstInterfaceImpl";
public void printMethod(String message){
log.error("{} {}", prepend, message);
}
}
Same way another class that receives dependency via DI
public class SecondInjectClass {
SecondInterface secondInterface;
private final String prepend = "SecondInjectClass";
#Inject
public SecondInjectClass(SecondInterface secondInterface) {
this.secondInterface = secondInterface;
}
public void printMessage(String message){
log.error("{} {}", prepend, message);
secondInterface.printMethod(message);
}
}
another interface for binding
public interface SecondInterface {
void printMethod(String message);
}
implementation of second interface
public class SecondInterfaceImpl implements SecondInterface{
private final String prepend = "SecondInterfaceImpl";
public void printMethod(String message){
log.error("{} {}", prepend, message);
}
}
Module Class
public class Module extends AbstractModule {
#Override
protected void configure() {
bind(FirstInterface.class).to(FirstInterfaceImpl.class);
bind(SecondInterface.class).to(SecondInterfaceImpl.class);
}
}
Nothing fancy here, just two bolts and couple of classes for DI. I deployed it on server and it works just fine. The catch/problem though is that I have to initialize Injector in each bolt which makes me question what is side effect of it going to be?
This implementation is simple, just 2 bolts.. what if I have more bolts? what impact it would create on topology if I have to initialize Injector in all bolts?
If I try to initialize Injector outside prepare method I get error for serialization.

Dagger 2: how to change provided dependencies at runtime

In order to learn Dagger 2 i decided to rewrite my application but I'm stuck with finding the proper solution for the following problem.
For the purpose of this example let's assume we have an interface called Mode:
public interface Mode {
Object1 obj1();
//some other methods providing objects for app
}
and two implementations:
NormalMode and DemoMode.
Mode is stored in singleton so it could be accessed from anywhere within application.
public enum ModeManager {
INSTANCE,;
private Mode mode;
public Mode mode() {
if (mode == null)
mode = new NormalMode();
return mode;
}
public void mode(Mode mode) { //to switch modules at runtime
this.mode = mode;
}
}
The NormalMode is switched to DemoMode at runtime (let's say, when user clickcs on background couple of times)
public void backgroundClicked5Times(){
ModeManager.INSTANCE.mode(new DemoMode());
//from now on every object that uses Mode will get Demo implementations, great!
}
So first I got rid of the singleton and defined Modes as Dagger 2 modules:
#Module
public class NormalModeModule {
#Provides
public Object1 provideObject1() {
return new NormalObject1();
}
}
#Module
public class DemoModeModule {
#Provides
public Object1 provideObject1() {
return new DemoObject1();
}
}
Now in the method backgroundClicked5Times instead of dealing with singleton I would like to replace NormalModeModule with DemoModeModule in DAG so the other classes that need Object1 would get a DemoObject1 implementation from now on.
How can I do that in Dagger?
Thanks in advance.
Maybe you can consider using multibindings?
#Module
public class NormalModeModule {
#Provides
#IntoMap
#StringKey("normal")
public Object1 provideObject1() {
return new NormalObject1();
}
}
#Module
public class DemoModeModule {
#Provides
#IntoMap
#StringKey("demo")
public Object1 provideObject1() {
return new DemoObject1();
}
}
and when using Mode:
#Inject
Map<String, Mode> modes;
//or you perfer lazy initialization:
Map<String, Provider<Mode>> modes;
public void backgroundClicked5Times(){
ModeManager.INSTANCE.mode(modes.get("demo"));
//if you are using Provider:
ModeManager.INSTANCE.mode(modes.get("demo").get());
//from now on every object that uses Mode will get Demo implementations, great!
}
Having experimented with dagger for a while I came up with solution that seems to be working well in my use case.
Define class that will hold state information about mode
public class Conf {
public Mode mode;
public Conf(Mode mode) {
this.mode = mode;
}
public enum Mode {
NORMAL, DEMO
}
}
Provide singleton instance of Conf in Module
#Module
public class ConfModule {
#Provides
#Singleton
Conf provideConf() {
return new Conf(Conf.Mode.NORMAL);
}
}
Add module to AppComponent
#Singleton
#Component(modules = {AppModule.class, ConfModule.class})
public interface AppComponent {
//...
}
Define modules that provide different objects based on Mode
#Module
public class Object1Module {
#Provides
Object1 provideObject1(Conf conf) {
if (conf.mode == Conf.Mode.NORMAL)
return new NormalObject1();
else
return new DemoObject1();
}
}
To switch mode at runtime simply inject Conf object and modify it:
public class MyActivity extends Activity {
#Inject Conf conf;
//...
public void backgroundClicked5Times(){
conf.mode = Conf.Mode.DEMO;
//if you have dagger objects in this class that depend on Mode
//execute inject() once more to refresh them
}
}

Binding between an Object and a SimpleIntegerProperty

I have a combo box over my GUI in JavaFX.
This Combo Box is composed of a complex type elements :
public class DureeChoiceBoxElement extends ObservableValueBase<DureeChoiceBoxElement> {
private IntegerProperty duree;
#Override
public String toString() {
return duree.get() + " an";
}
}
I want to map (or bind) the selected complex element with my model which contains the simple type :
public class Pel {
private IntegerProperty duree = new SimpleIntegerProperty(1);
public Property<Number> dureeProperty() {
return duree;
}
public void setDuree(Integer duree) {
this.duree.setValue(duree);
}
public Integer getDuree() {
return duree.getValue();
}
}
How to do it ?
I tried in the controller with :
public class PelController {
#FXML
private ChoiceBox<DureeChoiceBoxElement> duree;
//etc..
pel.dureeProperty().bind(createElapsedBindingByBindingsAPI2(duree.getValue()));
/*
* #return an ObjectBinding of immutable TimeElapsed objects for the player
*/
private ObjectBinding<Property<Number>> createElapsedBindingByBindingsAPI2(
final DureeChoiceBoxElement dureeChoiceBoxElement) {
return Bindings.createObjectBinding(new Callable<Property<Number>>() {
#Override
public IntegerProperty call() throws Exception {
return dureeChoiceBoxElement.dureeProperty();
}
}, dureeChoiceBoxElement.dureeProperty());
}
}
But it doesn't work (even not compile). I want to say that "Bind this simple property to this complex Object calling the method I give you through the method named "createElapsedBindingByBindingsAPI2(..)".
It is logical read but I didn't managed to make it works anyway.
That's poor ....
Any help please :).
Example that (obviously) works with legacy code style (Swing coding) :
duree.getSelectionModel().selectedItemProperty().addListener(new ChangeListener<DureeChoiceBoxElement>() {
#Override
public void changed(ObservableValue<? extends DureeChoiceBoxElement> observable,
DureeChoiceBoxElement oldValue, DureeChoiceBoxElement newValue) {
// changement durée
log.debug("Durée sélectionnée : {}", duree.getSelectionModel().getSelectedItem().getDuree());
log.debug("Durée bindée ? : {}", pel.getDuree());
pel.setDuree(duree.getSelectionModel().getSelectedItem().getDuree());
}
});
Like this my model is set to selected item. But it implies some boilerplate code. Any better idea based on high level bindings of JavaFX ?

Dependency Injection of Primitive Types (Decided at Runtime) With HK2

So basically, I have a situation where I want to inject primitive types into a class (i.e. a String and an Integer). You can think of a URL and port number for an application as example inputs. I have three components:
Now say I have a class, which does take in these params:
public class PrimitiveParamsDIExample {
private String a;
private Integer b;
public PrimitiveParamsDIExample(String a, Integer b) {
this.a = a;
this.b = b;
}
}
So my question here is simple. How do I inject a and b into class PrimitiveParamsDIExample?
In general, this is also asking how to inject parameters that are decided on runtime as well. If I have a and b above, read from STDIN or from an input file, they're obviously going to be different from run to run.
All the more, how do I do the above within the HK2 framework?
EDIT[02/23/15]: #jwells131313, I tried your idea, but I'm getting the following error (this one for the String param; similar one for int):
org.glassfish.hk2.api.UnsatisfiedDependencyException: There was no object available for injection at Injectee(requiredType=String,parent=PrimitiveParamsDIExample,qualifiers
I set up classes exactly as you did in your answer. I also overrode the toString() method to print both variables a and b in PrimitiveParamsDIExample. Then, I added the following in my Hk2Module class:
public class Hk2Module extends AbstractBinder {
private Properties properties;
public Hk2Module(Properties properties){
this.properties = properties;
}
#Override
protected void configure() {
bindFactory(StringAFactory.class).to(String.class).in(RequestScoped.class);
bindFactory(IntegerBFactory.class).to(Integer.class).in(RequestScoped.class);
bind(PrimitiveParamsDIExample.class).to(PrimitiveParamsDIExample.class).in(Singleton.class);
}
}
So now, I created a test class as follows:
#RunWith(JUnit4.class)
public class TestPrimitiveParamsDIExample extends Hk2Setup {
private PrimitiveParamsDIExample example;
#Before
public void setup() throws IOException {
super.setupHk2();
//example = new PrimitiveParamsDIExample();
example = serviceLocator.getService(PrimitiveParamsDIExample.class);
}
#Test
public void testPrimitiveParamsDI() {
System.out.println(example.toString());
}
}
where, Hk2Setup is as follows:
public class Hk2Setup extends TestCase{
// the name of the resource containing the default configuration properties
private static final String DEFAULT_PROPERTIES = "defaults.properties";
protected Properties config = null;
protected ServiceLocator serviceLocator;
public void setupHk2() throws IOException{
config = new Properties();
Reader defaults = Resources.asCharSource(Resources.getResource(DEFAULT_PROPERTIES), Charsets.UTF_8).openBufferedStream();
load(config, defaults);
ApplicationHandler handler = new ApplicationHandler(new MyMainApplication(config));
final ServiceLocator locator = handler.getServiceLocator();
serviceLocator = locator;
}
private static void load(Properties p, Reader r) throws IOException {
try {
p.load(r);
} finally {
Closeables.close(r, false);
}
}
}
So somewhere, the wiring is messed up for me to get an UnsatisfiedDependencyException. What have I not correctly wired up?
Thanks!
There are two ways to do this, but one isn't documented yet (though it is available... I guess I need to work on documentation again...)
I'll go through the first way here.
Basically, you can use the HK2 Factory.
Generally when you start producing Strings and ints and long and scalars like this you qualify them, so lets start with two qualifiers:
#Retention(RUNTIME)
#Target( { TYPE, METHOD, FIELD, PARAMETER })
#javax.inject.Qualifier
public #interface A {}
and
#Retention(RUNTIME)
#Target( { TYPE, METHOD, FIELD, PARAMETER })
#javax.inject.Qualifier
public #interface B {}
then write your factories:
#Singleton // or whatever scope you want
public class StringAFactory implements Factory<String> {
#PerLookup // or whatever scope, maybe this checks the timestamp?
#A // Your qualifier
public String provide() {
// Write your code to get your value...
return whatever;
}
public void dispose(String instance) {
// Probably do nothing...
}
}
and for the Integer:
#Singleton // or whatever scope you want
public class IntegerBFactory implements Factory<Integer> {
#PerLookup // or whatever scope, maybe this checks the timestamp?
#B // Your qualifier
public Integer provide() {
// Write your code to get your value...
return whatever;
}
public void dispose(String instance) {
// Probably do nothing...
}
}
Now lets re-do your original class to accept these values:
public class PrimitiveParamsDIExample {
private String a;
private int b;
#Inject
public PrimitiveParamsDIExample(#A String a, #B int b) {
this.a = a;
this.b = b;
}
}
Note I changed Integer to int, well... just because I can. You can also just use field injection or method injection in the same way. Here is field injection, method injection is an exercise for the reader:
public class PrimitiveParamsDIExample {
#Inject #A
private String a;
#Inject #B
private int b;
public PrimitiveParamsDIExample() {
}
}
There are several ways to bind factories.
In a binder: bindFactory
Using automatic class analysis: addClasses
An EDSL outside a binder: buildFactory

More than one singleton instance in Guice injector

We got a Jetty/Jersey application. We are converting it to use Guice for DI. The problem: We need more than one instance of a Singleton classes. The catch: The number of instances is determined dynamically from a configuration file. Therefore we cant use annotations for different instances.
final InjectedClass instance = injector.getInstance(InjectedClass.class);
This is the standard syntax of the injector. I need something like
final String key = getKey();
final InjectedClass instance = injector.getInstance(InjectedClass.class, key);
There is a way to get an instance from a Guice Key.class
final InjectedClass instance = injector.getInstance(Key.get(InjectedClass.class, <Annotation>);
but the problem is that I need some dynamic annotation, not predefined one.
You could try to use Provider, or #Provides method that would have map of all instances already created. When the number of instances is reached number defained in config file, you wont create any new instances, instead you return old instance from map.
For example something like this could help you.
public class MyObjectProvider implements Provider<MyObject> {
private final Injector inj;
private int counter;
private final int maxNum = 5;
private List<MyObject> myObjPool = new ArrayList<MyObject>();
#Inject
public MyObjectProvider(Injector inj) {
this.connection = connection;
}
public MyObject get() {
counter = counter+1%maxNum;
if(myObjPool.size()=<maxNum) {
MyObject myobj = inj.getInstance(MyObject.class);
myObjPool.add(myobj);
return myobj;
} else {
return myObjPool.get(counter);
}
}
}
P.S.
I wrote this from my head so maybe it does not compile, this is just an idea.
You can solve this by creating a factory. In my example I have used the guice extension called multibindings
interface InjectedClassFactory {
public InjectedClass get(String key);
}
class InjectedClass {}
class InjectedClassFactoryImpl implements InjectedClassFactory{
private final Map<String, InjectedClass> instances;
#Inject
InjectedClassFactoryImpl(Map<String, InjectedClass> instances) {
this.instances = instances;
}
#Override
public InjectedClass get(String key) {
return instances.get(key);
}
}
class MyModule extends AbstractModule {
#Override
protected void configure() {
MapBinder<String, InjectedClass> mapBinder =
MapBinder.newMapBinder(binder(), String.class, InjectedClass.class);
//read you config file and retrieve the keys
mapBinder.addBinding("key1").to(InjectedClass.class).in(Singleton.class);
mapBinder.addBinding("key2").to(InjectedClass.class).in(Singleton.class);
}
}

Resources