How do I code an ant Task that takes an arbitrary Mapper?

Generally speaking, any ant task which accepts a <mapper> will also accept several tags designating particular mappers: <identitymapper>, <regexmapper>, etc.
But if you're writing your own task, you are supposed to supply a method for each possible tag that may appear inside it, and you don't want to add separate addConfiguredMapper(), addConfiguredIdentityMapper(), addConfiguredRegexMapper(), etc. methods. How do you set up a custom ant Task to accept any arbitrary Mapper, specified either by the general <mapper> tag or by the tag for a particular mapper type?

These are the two methods you will need to supply:
public Mapper createMapper() throws BuildException;
public void add(FileNameMapper fileNameMapper);
Take a look at the Copy task in the ant source distribution to see how these are implemented.
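For illustration, here is a minimal sketch of a custom task wiring these two methods together, modeled on what Copy does (the class name and the execute() body are made up for this example):

import org.apache.tools.ant.BuildException;
import org.apache.tools.ant.Task;
import org.apache.tools.ant.types.Mapper;
import org.apache.tools.ant.util.FileNameMapper;

public class MyMapperTask extends Task {
    private Mapper mapperElement;

    // Called for the general <mapper> tag.
    public Mapper createMapper() throws BuildException {
        if (mapperElement != null) {
            throw new BuildException("Cannot define more than one mapper", getLocation());
        }
        mapperElement = new Mapper(getProject());
        return mapperElement;
    }

    // Called for <identitymapper>, <regexmapper>, etc.: ant routes any nested
    // element implementing FileNameMapper through this method.
    public void add(FileNameMapper fileNameMapper) {
        createMapper().add(fileNameMapper);
    }

    @Override
    public void execute() throws BuildException {
        if (mapperElement != null) {
            // getImplementation() resolves whichever mapper was configured.
            FileNameMapper mapper = mapperElement.getImplementation();
            String[] mapped = mapper.mapFileName("input.txt");
            log("input.txt maps to: " + (mapped == null ? "nothing" : mapped[0]));
        }
    }
}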

Related

Is there a way to upload jars for a dataflow job so we don't have to serialize everything?

As an example, I remember that in Hadoop I could either make classes serializable or give a path to the JARs needed for my job; I had both options. I am wondering if the same is true for a Dataflow job, so that all the clients we use can be packaged in JAR files for all the workers.
In our case, we have MicroserviceApi, a generated client, etc., and we would prefer to output to that downstream microservice without having to make it serializable.
Is there a method to do this?
First, let me clarify a few things about serialization
When you add implements Serializable to a class in Java, you make it such that object instances of that class can be serialized (not the class itself). The destination JVM needs to have access to the class to be able to understand the serialized instances that you send to it. So, in fact you always need to provide the JAR for it.
Beam has code to automatically find all JARs in your classpath and upload them to Dataflow or whatever runner you're using, so if the JAR is in your classpath, then you don't need to worry about it (if you're using Maven/Gradle and specifying it as a dependency, then you're most likely fine).
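If you ever need to stage a JAR explicitly (for example, one that is not on the classpath at launch time), the filesToStage pipeline option can also be set by hand. A minimal sketch, assuming the Dataflow runner; the JAR paths are placeholders:

import java.util.Arrays;
import org.apache.beam.runners.dataflow.options.DataflowPipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

public class StageJarsExample {
    public static void main(String[] args) {
        DataflowPipelineOptions options =
            PipelineOptionsFactory.fromArgs(args).as(DataflowPipelineOptions.class);
        // Note: setting filesToStage replaces the auto-detected classpath list,
        // so everything the workers need must be listed here.
        options.setFilesToStage(
            Arrays.asList("target/my-pipeline-bundled.jar", "libs/my-client.jar"));
    }
}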
Now, how can I use a class in Beam if it's not serializable?
In Beam, the important part is to figure out where and when the different parts of your pipeline code will execute. Some things execute at pipeline construction time and some things execute at pipeline running time.
Things that run at construction time
Constructors for all your classes (DoFns, PTransforms, etc)
The expand method of your PTransforms
Things that run at execution time
For your DoFns: the @ProcessElement, @StartBundle, @FinishBundle, @Setup and @Teardown methods.
If your class does not implement Serializable but you want to access it at execution time, then you need to create it at execution time. So, suppose that you have a DoFn:
class MyDoFnImplementation extends DoFn<String, String> {
    // All members of the object need to be serializable. String is easily serializable.
    String config;

    // Your MicroserviceApi is *not* serializable, so you can mark it as transient.
    // The transient keyword ensures that Java will ignore the member when serializing.
    transient MicroserviceApi client;

    public MyDoFnImplementation(String configuration) {
        // This code runs at *construction time*.
        // Anything you create here needs to be serialized and sent to the runner.
        this.config = configuration;
    }

    @ProcessElement
    public void process(ProcessContext c) {
        // This code runs at execution time. You can create your object here.
        // The null check ensures it is only created once.
        // You could also create it in a @Setup or @StartBundle method.
        if (client == null) client = new MicroserviceApi(this.config);
    }
}
By ensuring that objects are created at execution time, you can avoid the need to make them serializable - but your configuration needs to be serializable.
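Applying that DoFn then looks like any other; only the String configuration crosses the serialization boundary. A hypothetical usage (the input path and endpoint are made up):

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.ParDo;

public class Main {
    public static void main(String[] args) {
        PipelineOptions options = PipelineOptionsFactory.fromArgs(args).create();
        Pipeline p = Pipeline.create(options);
        p.apply(TextIO.read().from("gs://my-bucket/input.txt"))
         // The constructor runs here, at construction time; the client
         // itself is only built on the workers, at execution time.
         .apply(ParDo.of(new MyDoFnImplementation("https://my-service/endpoint")));
        p.run();
    }
}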

Find implementation of interface method

I'd like to visualize method chains of our codebase (which method invokes which method) starting from a given method with jqassistant.
For normal method calls the following Cypher query works. workupNotification is the method I am starting from:
MATCH (begin:Method {name: "workupNotification"}) -[:INVOKES*1..20]-> (method:Method) WHERE not method:Constructor and exists(method.name) RETURN begin, method
But many of the method calls in our software are calls to interfaces whose implementation is not known at the call site (SOA with Dependency Inversion).
serviceRegistry.getService(MyServiceInterface.class).serviceMethod();
How can I select the implementation of this method? (There are two classes implementing each interface: one is automatically generated (a proxy), the other is the one I'm interested in.)
You need to do what the JVM does for you at runtime: resolve virtual method invocations. There's a predefined jQAssistant concept that propagates INVOKES relations to implementing sub-classes: java:InvokesOverriddenMethod. You can either reference it as a required concept from one of your own rules or apply it from the command line, e.g. with Maven:
mvn jqassistant:analyze -Djqassistant.concepts=java:InvokesOverriddenMethod
The rule is documented in the manual, see http://buschmais.github.io/jqassistant/doc/1.6.0/#java:InvokesOverriddenMethod
(The name of the concept is not intuitive; it would be better to replace it with something like java:VirtualInvokes.)
That concept has since been deprecated. As of version 1.9.0, this command line should be used instead:
mvn jqassistant:analyze -Djqassistant.concepts=java:VirtualInvokes
http://jqassistant.github.io/jqassistant/doc/1.8.0/#java:VirtualInvokes
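Assuming the applied concept materializes VIRTUAL_INVOKES relationships between callers and candidate implementations, as described in the linked manual, the original query can then traverse both relationship types. A sketch (untested):

MATCH (begin:Method {name: "workupNotification"}) -[:INVOKES|VIRTUAL_INVOKES*1..20]-> (method:Method)
WHERE not method:Constructor and exists(method.name)
RETURN begin, method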

Unable to run multiple Pipelines in desired order by creating template in Apache Beam

I have two separate Pipelines, say 'P1' and 'P2'. As per my requirement, I need to run P2 only after P1 has completely finished its execution, and I need to get this entire operation done through a single template.
Basically, a template gets created the moment a run() call is reached, say p1.run().
So as far as I can see, I would need to handle the two Pipelines using two different templates, but that would not satisfy my strict order-based execution requirement.
The other way I could think of is calling p1.run() inside a ParDo of P2 and making P2's run() wait until P1's run() finishes. I tried this but got stuck at the exception given below.
java.io.NotSerializableException: PipelineOptions objects are not serializable and should not be embedded into transforms (did you capture a PipelineOptions object in a field or in an anonymous class?). Instead, if you're using a DoFn, access PipelineOptions at runtime via ProcessContext/StartBundleContext/FinishBundleContext.getPipelineOptions(), or pre-extract necessary fields from PipelineOptions at pipeline construction time.
Is it not possible at all to call run() on a pipeline inside a transform (say, a ParDo) of another pipeline?
If that is the case, how can I satisfy my requirement of running two different pipelines in sequence through a single template?
A template can contain only a single pipeline. In order to sequence the execution of two separate pipelines each of which is a template, you'll need to schedule them externally, e.g. via some workflow management system (such as what Anuj mentioned, or Airflow, or something else - you might draw some inspiration from this post for example).
We are aware of the need for better sequencing primitives in Beam within a single pipeline, but do not have a concrete design yet.
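For completeness: outside of templates, a plain driver program can enforce the ordering itself by blocking on the first pipeline's result. A sketch of that approach (not applicable to template-based launches):

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.PipelineResult;

public class SequentialRunner {
    public static void runInOrder(Pipeline p1, Pipeline p2) {
        PipelineResult firstResult = p1.run();
        // Block until P1 is done, and fail fast if it did not succeed.
        PipelineResult.State state = firstResult.waitUntilFinish();
        if (state != PipelineResult.State.DONE) {
            throw new RuntimeException("P1 finished in state " + state + "; not starting P2");
        }
        p2.run();
    }
}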

JBehave: set a mapping between @BeforeStory/@BeforeScenario and individual stories/scenarios

Looking at the code of JBehave, I noticed that all the @BeforeStory/@BeforeScenario methods run for all the stories/scenarios in the purview of JBehave. There doesn't seem to be any one-to-one correspondence between stories/scenarios and @BeforeStory/@BeforeScenario annotations. Please correct me if I am wrong. If this is the expected behavior, how can I get a one-to-one mapping of @BeforeStory/@BeforeScenario methods to stories and scenarios?
For your reference, this is what I am doing:
For each text story there is a corresponding *Story.java, which extends a SuperStory.java, which in turn extends JUnitStories. Moreover, there is a *Steps.java corresponding to each text story as well; the *Steps.java classes are injected from a common Spring bean inside SuperStory.java. Apart from this, there is a single LifeCycleSteps extending PerStoriesWebDriverSteps.
What I am looking to achieve:
I want to configure my @BeforeStory/@BeforeScenario methods in such a way that executing story "x" runs only the @BeforeStory/@BeforeScenario methods of the steps class for "x".
Since JBehave is a BDD framework, you should design your stories so that you only need a @BeforeScenario. Each step inside a scenario runs in sequence, so any prerequisites should be handled in a @Given step or in the @BeforeScenario (see the sketch below). Other, non-BDD testing frameworks need a @Before because every test is meant to be designed so that it can run without any other dependencies.
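As an illustration of that layout (the class and step names are made up), story-specific prerequisites live in @Given steps so they only run where the story text asks for them:

import org.jbehave.core.annotations.BeforeScenario;
import org.jbehave.core.annotations.Given;

public class XSteps {
    @BeforeScenario
    public void resetState() {
        // Runs before every scenario executed while this steps class is
        // registered, so keep it limited to cheap, generic cleanup.
    }

    @Given("a configured service")
    public void givenConfiguredService() {
        // A story-specific prerequisite, expressed as a step so that it
        // only runs in scenarios whose text contains this Given.
    }
}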

passing parameters to dependent ant target

I have two ant files:
mainBuild.xml
subBuild.xml
subBuild.xml is imported into mainBuild.xml. One of the targets in mainBuild depends on a target from subBuild. I need to pass an argument to that dependent target. I don't want to use the <antcall> or <ant> tasks, as I need some properties from the
You can define the arguments in a properties file and then read that file in ant like this:
<property file="build.start.properties"/>
All properties in the file will be imported into ant and become available as ant properties, which you can use in both mainBuild.xml and subBuild.xml.
Macros are one way to have reusable code in ant: you can call them with different arguments each time (see the sketch below). Reusing targets (parameterized via properties) may not be desirable, as ant properties are immutable.
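A minimal sketch of a macro with per-call arguments (the macro name and attributes are made up):

<macrodef name="deploy">
    <attribute name="env"/>
    <attribute name="dir" default="build"/>
    <sequential>
        <echo message="Deploying @{dir} to @{env}"/>
    </sequential>
</macrodef>

<deploy env="staging"/>
<deploy env="production" dir="dist"/>

Unlike properties, the @{...} attributes are expanded fresh on every call, which is why macros sidestep the immutability problem.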
