Migrating from Dataflow 2.5.0 SDK to Beam 2.13 release - google-cloud-dataflow

I got an error message saying Dataflow 2.5 (Java) is the last supported release and I should use Beam. Is there a migration guide? I can find Dataflow 1.x to 2.x but not Dataflow to Beam.
For example, DataflowPipelineOptions doesn't seem to be installed if you use the maven archetype suggested in the Beam documentation.
Specifically:
import org.apache.beam.runners.dataflow.options.DataflowPipelineOptions
is not found when I use the pom.xml generated using:
mvn archetype:generate \
-DarchetypeGroupId=org.apache.beam \
-DarchetypeArtifactId=beam-sdks-java-maven-archetypes-starter \
-DarchetypeVersion=2.13.0 \
-DgroupId=com.myexample \
-DartifactId=newpackage \
-Dversion="1.1" \
-DinteractiveMode=false
even after adding:
<dependency>
<groupId>org.apache.beam</groupId>
<artifactId>beam-runners-google-cloud-dataflow-java</artifactId>
<version>2.13.0</version>
<scope>runtime</scope>
</dependency>
to the generated pom.xml.

You need a few additional Google Cloud dependencies in your pom.xml in order to run your Beam pipeline on Dataflow. Things worked for me after I added:
<dependency>
<groupId>org.apache.beam</groupId>
<artifactId>beam-runners-google-cloud-dataflow-java</artifactId>
<version>${beam.version}</version>
<scope>runtime</scope>
</dependency>
<dependency>
<groupId>org.apache.beam</groupId>
<artifactId>beam-sdks-java-io-google-cloud-platform</artifactId>
<version>${beam.version}</version>
<exclusions>
<exclusion>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.beam</groupId>
<artifactId>beam-sdks-java-extensions-google-cloud-platform-core</artifactId>
<version>${beam.version}</version>
</dependency>
<dependency>
<groupId>org.apache.beam</groupId>
<artifactId>beam-sdks-java-extensions-protobuf</artifactId>
<version>${beam.version}</version>
</dependency>
<dependency>
<groupId>org.apache.beam</groupId>
<artifactId>beam-sdks-java-io-jdbc</artifactId>
<version>${beam.version}</version>
</dependency>
In addition, you may need to add a few more parameters to your startup script. I had to add:
--gcpTempLocation=gs://$BUCKET/tmp
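One detail worth checking against the question: Maven's runtime scope keeps an artifact off the compile classpath, so an import of DataflowPipelineOptions cannot resolve while the runner is declared with <scope>runtime</scope>. A minimal sketch, assuming your code references Dataflow classes directly and that your pom.xml does not already define beam.version (the value below is simply the version from the question):

```xml
<!-- Sketch only: these elements go inside the <project> element of pom.xml. -->
<properties>
  <!-- Assumption: beam.version is not defined elsewhere in the pom. -->
  <beam.version>2.13.0</beam.version>
</properties>

<dependencies>
  <!-- No <scope> element, so Maven uses the default compile scope and
       org.apache.beam.runners.dataflow.options.DataflowPipelineOptions
       is visible to the compiler. -->
  <dependency>
    <groupId>org.apache.beam</groupId>
    <artifactId>beam-runners-google-cloud-dataflow-java</artifactId>
    <version>${beam.version}</version>
  </dependency>
</dependencies>
```

If the pipeline only selects the runner by name (for example --runner=DataflowRunner) and never imports its classes, runtime scope is sufficient.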

This blog post may be helpful. A user here described their migration.
I believe the package renaming (com.google.cloud.dataflow to org.apache.beam) and the new class/method signatures are already in place if you're on a Dataflow 2.x SDK.
So I think the migration should be straightforward in this case. Please try removing the Dataflow SDK and introducing org.apache.beam at the latest version. It may work without modification. You could also try org.apache.beam at 2.5 first, then upgrade to 2.13 and see whether that goes smoothly as well.

Related

Smaller deps for spring stomp websocket client

I am using spring stomp websocket on a Java client using StandardWebSocketClient, much like this example:
https://www.baeldung.com/websockets-api-java-spring-client
Using spring-boot-starter-websocket maven dep adds too many deps that are not needed on the client, so I came up with:
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-websocket</artifactId>
<exclusions>
<exclusion>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.tomcat.embed</groupId>
<artifactId>tomcat-embed-websocket</artifactId>
<version>${tomcat.embed.websocket.version}</version>
<scope>compile</scope>
<exclusions>
<exclusion>
<artifactId>tomcat-annotations-api</artifactId>
<groupId>org.apache.tomcat</groupId>
</exclusion>
</exclusions>
</dependency>
The tomcat-embed-websocket is to provide a javax.websocket implementation. But it's quite big as it adds tomcat-embed-core.
I tried to use Tyrus but don't know which deps to add.
Would Tyrus be smaller? How to add only client deps?
The Tyrus standalone client bundle might be the best thing to start with:
https://repo1.maven.org/maven2/org/glassfish/tyrus/bundles/tyrus-standalone-client/2.0.1/
I'm not sure how big tomcat-embed-core is, but the Tyrus bundle isn't a small library either: it is about 2.2 MB.
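For reference, the repository path above corresponds to Maven coordinates like the following. This is a sketch read off that path, not a tested setup. In particular, Tyrus 2.x implements the jakarta.websocket API, so a client that still compiles against javax.websocket (as in the question) would likely need a 1.x release of the same artifact instead.

```xml
<!-- Sketch only: coordinates derived from the repository URL above. -->
<dependency>
  <groupId>org.glassfish.tyrus.bundles</groupId>
  <artifactId>tyrus-standalone-client</artifactId>
  <!-- 2.0.1 is the version in the link; it targets jakarta.websocket.
       Assumption: pick a 1.x version if your Spring version uses javax.websocket. -->
  <version>2.0.1</version>
</dependency>
```

With this in place, the tomcat-embed-websocket dependency from the question should no longer be needed as the WebSocket implementation.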

Monitoring custom Stream apps in Spring Cloud data flow

I am trying out SCDF and its monitoring with Prometheus and Grafana. I followed the available documentation and was able to deploy the sample stream and see its metrics in Grafana.
I have created a stream with a custom stream app (other than the supplied RabbitMQ starter apps).
Stream:
http | participant | log
However, I cannot see the metrics of the "participant" application in Grafana, although I can see the metrics of the http and log apps.
I added the properties below to application.properties:
management.endpoint.metrics.enabled=true
management.endpoints.web.exposure.include=*
management.endpoint.prometheus.enabled=true
management.metrics.export.prometheus.enabled=true
spring.cloud.streamapp.security.enabled=false
I added the dependencies below:
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<!-- https://mvnrepository.com/artifact/org.springframework.cloud.stream.app/app-starters-common -->
<!--<dependency>-->
<!--<groupId>org.springframework.cloud.stream.app</groupId>-->
<!--<artifactId>app-starters-common</artifactId>-->
<!--<version>2.1.1.RELEASE</version>-->
<!--<type>pom</type>-->
<!--</dependency>-->
<dependency>
<groupId>io.micrometer</groupId>
<artifactId>micrometer-registry-prometheus</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-amqp</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-data-jpa</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-stream</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-stream-binder-rabbit</artifactId>
</dependency>
After adding the org.springframework.cloud.stream.app:app-starters-common dependency, localhost:<port>/ opens a login page.
I think you need the app-starters-micrometer-common dependency, which adds some of the Micrometer tags to your app. This dependency is intended to be used by the Spring Cloud Stream app starters, and I believe you can use it in your custom application as well.
I came across this question while looking for an answer to the same problem.
Here is my working configuration:
pom.xml
<dependency>
<groupId>org.springframework.cloud.stream.app</groupId>
<artifactId>app-starters-micrometer-common</artifactId>
<version>2.1.2.RELEASE</version>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
<groupId>io.micrometer</groupId>
<artifactId>micrometer-registry-prometheus</artifactId>
</dependency>
<dependency>
<groupId>io.micrometer</groupId>
<artifactId>micrometer-registry-jmx</artifactId>
</dependency>
Change the application properties to enable Prometheus and disable security (the login page):
management.endpoints.web.exposure.include=*
management.metrics.export.prometheus.enabled=true
# This one removes security (the login page), which was added automatically by the app-starters-micrometer-common dependency.
spring.autoconfigure.exclude=org.springframework.boot.autoconfigure.security.servlet.SecurityAutoConfiguration, org.springframework.boot.actuate.autoconfigure.security.servlet.ManagementWebSecurityAutoConfiguration
Alternatively, exclude the unwanted dependencies. I excluded config-client and a few others since I don't need them in my application:
<dependency>
<groupId>org.springframework.cloud.stream.app</groupId>
<artifactId>app-starters-micrometer-common</artifactId>
<version>2.1.2.RELEASE</version>
<exclusions>
<exclusion>
<artifactId>spring-security-config</artifactId>
<groupId>org.springframework.security</groupId>
</exclusion>
<exclusion>
<artifactId>spring-cloud-services-starter-config-client</artifactId>
<groupId>io.pivotal.spring.cloud</groupId>
</exclusion>
<exclusion>
<artifactId>*</artifactId>
<groupId>org.springframework.boot</groupId>
</exclusion>
</exclusions>
</dependency>
With the more recent Data Flow 2.3.x, you need to add the following dependencies to your processor:
<dependency>
<groupId>org.springframework.cloud.stream.app</groupId>
<artifactId>app-starters-micrometer-common</artifactId>
<version>2.1.2.RELEASE</version>
</dependency>
<dependency>
<groupId>io.micrometer</groupId>
<artifactId>micrometer-registry-prometheus</artifactId>
</dependency>
<dependency>
<groupId>io.micrometer.prometheus</groupId>
<artifactId>prometheus-rsocket-spring</artifactId>
<version>0.9.0</version>
</dependency>
The app-starters-micrometer-common dependency injects Data Flow-specific tags, such as stream.name, application.name and application.type, all of which the dashboard uses to aggregate the required metrics.
In addition, you can follow the instructions in the sample projects, which show how to build custom Source, Processor and Sink apps with Prometheus monitoring enabled:
https://github.com/spring-cloud/spring-cloud-dataflow-samples/tree/master/monitoring-samples/stream-apps

Making dropwizard 1.0.5 work with Powermock

How can I make PowerMock work with Dropwizard version 1.0.5? I have tried including all kinds of PowerMock versions in my project, and each time I get a different kind of error.
For example when I use:
<dependency>
<groupId>org.powermock</groupId>
<artifactId>powermock-module-junit4</artifactId>
<version>1.6.1</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.powermock</groupId>
<artifactId>powermock-api-mockito</artifactId>
<version>1.6.1</version>
<scope>test</scope>
</dependency>
I get:
java.lang.AbstractMethodError: org.powermock.api.mockito.internal.mockmaker.PowerMockMaker.isTypeMockable(Ljava/lang/Class;)Lorg/mockito/plugins/MockMaker$TypeMockability;
at org.mockito.internal.util.MockUtil.typeMockabilityOf(MockUtil.java:26)
Using version 1.5.4 gives me:
org.powermock.reflect.exceptions.FieldNotFoundException: Field 'fTestClass' was not found in class org.junit.internal.runners.MethodValidator.
I have even tried to use version 1.7.3 and <artifactId>powermock-api-mockito2</artifactId>
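My test class is as simple as this:
import org.junit.Assert;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.mockito.Mockito;
import org.powermock.api.mockito.PowerMockito;
import org.powermock.core.classloader.annotations.PrepareForTest;
import org.powermock.modules.junit4.PowerMockRunner;

@RunWith(PowerMockRunner.class)
@PrepareForTest(MyStaticMethodClass.class)
public class TestStaticMethods {
    @Test
    public void testMyStatic() {
        // Replace the static methods of MyStaticMethodClass with mocks.
        PowerMockito.mockStatic(MyStaticMethodClass.class);
        Mockito.when(MyStaticMethodClass.getString()).thenReturn("Hello World");
        String result = MyStaticMethodClass.getString();
        Assert.assertEquals("Hello World", result);
    }
}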
I have looked into the PowerMock documentation (https://github.com/powermock/powermock/wiki/Mockito-Maven); my JUnit version is 4.12.
I have the following external libraries
Are they fetched from this dependency?
<dependency>
<groupId>io.dropwizard</groupId>
<artifactId>dropwizard-testing</artifactId>
<scope>test</scope>
</dependency>
I tried to exclude them, but they don't disappear. I am using IntelliJ as my IDE.
Is it because of these libraries that there might be some conflicting initializations of the testing environment?
EDIT 1
OK, so I have tried creating a small Java project with nothing other than the following dependencies:
<dependency>
<groupId>org.powermock</groupId>
<artifactId>powermock-module-junit4</artifactId>
<version>1.7.3</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.powermock</groupId>
<artifactId>powermock-api-mockito</artifactId>
<version>1.7.3</version>
<scope>test</scope>
</dependency>
my libraries are:
My test file is exactly the same as above, and it works perfectly fine. So I guess it has something to do with Dropwizard?
I have created a simple project using Dropwizard and PowerMock, and the tests ran successfully with all of the PowerMock versions mentioned (1.6.1, 1.7.3 and 1.5.4), both in IntelliJ and with Maven.
Having said that, it is strange that the dropwizard-testing artifact pulls in different versions of Mockito (1.10.8 for mockito-all and 2.0.54-beta for mockito-core). You can exclude the mockito-core dependency from the dropwizard-testing artifact, which will at least ensure that there are no conflicting versions of Mockito.
<dependency>
<groupId>io.dropwizard</groupId>
<artifactId>dropwizard-testing</artifactId>
<version>${dropwizard.version}</version>
<scope>test</scope>
<exclusions>
<exclusion>
<groupId>org.mockito</groupId>
<artifactId>mockito-core</artifactId>
</exclusion>
</exclusions>
</dependency>
I have also tested with Dropwizard versions 1.1.7 and 1.2.4, and both worked fine for me.
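The exclusion snippet above refers to a ${dropwizard.version} property. A minimal sketch of defining it, assuming your pom.xml does not already manage the Dropwizard version through a parent or BOM (the value is the version from the question):

```xml
<!-- Sketch only: goes inside the <project> element of pom.xml. -->
<properties>
  <!-- Assumption: no parent or BOM already sets the Dropwizard version. -->
  <dropwizard.version>1.0.5</dropwizard.version>
</properties>
```

If a BOM already manages the version, the <version> element in the dropwizard-testing dependency can simply be dropped, as in the question.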

google cloud dataflow sdk - dependencies issue

I added the Dataflow dependency to the project. The project builds, but on startup (using Jetty) I get a runtime exception:
<dependency>
<groupId>com.google.cloud.dataflow</groupId>
<artifactId>google-cloud-dataflow-java-sdk-all</artifactId>
<version>1.9.0</version>
</dependency>
caused by: java.lang.ClassNotFoundException: com.google.auth.http.HttpTransportFactory
at org.codehaus.plexus.classworlds.strategy.SelfFirstStrategy.loadClass(SelfFirstStrategy.java:50)
at org.codehaus.plexus.classworlds.realm.ClassRealm.unsynchronizedLoadClass(ClassRealm.java:259)
at org.codehaus.plexus.classworlds.realm.ClassRealm.loadClass(ClassRealm.java:235)
at org.codehaus.plexus.classworlds.realm.ClassRealm.loadClass(ClassRealm.java:227)
at org.eclipse.jetty.webapp.WebAppClassLoader.loadClass(WebAppClassLoader.java:487)
at org.eclipse.jetty.webapp.WebAppClassLoader.loadClass(WebAppClassLoader.java:428)
If I remove the dependency, startup has no issues.
Any idea why the Dataflow dependency is causing the startup error?
I added an exclusion for the conflicting dependency, and it works:
<dependency>
<groupId>com.google.cloud.dataflow</groupId>
<artifactId>google-cloud-dataflow-java-sdk-all</artifactId>
<version>1.9.0</version>
<exclusions>
<exclusion>
<groupId>com.google.auth</groupId>
<artifactId>google-auth-library-oauth2-http</artifactId>
</exclusion>
</exclusions>
</dependency>

SDN 4 - InProcessServer broken in snapshot build

Since about a week ago, running tests with InProcessServer on 4.0.0.BUILD-SNAPSHOT results in the following exception:
Caused by: java.lang.NoClassDefFoundError: org/neo4j/ogm/testutil/TestServer
at org.springframework.data.neo4j.server.InProcessServer.<init>(InProcessServer.java:25) ~[spring-data-neo4j-4.0.0.BUILD-SNAPSHOT-tests.jar:na]
at com.ninjasquare.server.test.integration.IntegrationTestConfig.neo4jServer(IntegrationTestConfig.java:43) ~[test-classes/:na]
Switching the test dependency back to 4.0.0.M1 resolves the issue:
<dependency>
<groupId>org.springframework.data</groupId>
<artifactId>spring-data-neo4j</artifactId>
<version>4.0.0.M1</version>
<type>test-jar</type>
</dependency>
I assume it's something to do with some refactoring work on SDN4/OGM?
Thanks.
Yes, in recent snapshots, the OGM has been separated from SDN. You'll need to include these two dependencies now to use the test utilities.
<dependency>
<groupId>org.neo4j</groupId>
<artifactId>neo4j-ogm</artifactId>
<version>1.1.0</version>
<type>test-jar</type>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.neo4j.test</groupId>
<artifactId>neo4j-harness</artifactId>
<version>${neo4j.version}</version>
<scope>test</scope>
</dependency>

Resources