Can I use third party libraries with Cloud Dataflow? - google-cloud-dataflow

Does Cloud Dataflow allow you to use third-party library jar files? How about non-Java libraries?
Kaz

Yes, you can use third-party library files just fine. By default, when you run your Dataflow main program to submit your job, Dataflow analyzes your classpath, uploads any jars it sees, and adds them to the classpath of the workers.
If you need more control, you can use the command-line option --filesToStage to specify additional files to stage on the workers.
Another common technique is to build a single bundled jar that contains all your dependencies. One way to build a bundled jar is to use a Maven plugin like Shade.
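As a rough illustration of the bundled-jar approach, a pom.xml fragment along these lines binds the Shade plugin's shade goal to the package phase. It is a minimal sketch rather than a Dataflow-specific recipe, and the plugin version shown is an assumption: use whatever version your build already pins.

```xml
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <!-- Assumed version; replace with the one your project uses. -->
      <version>3.5.1</version>
      <executions>
        <execution>
          <!-- Build the bundled jar as part of "mvn package". -->
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```

With this in place, mvn package should produce a single jar containing the project classes together with their runtime dependencies.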

Related

Build Beam pipelines using Bazel (with DataflowRunner)

I use Bazel to build my Beam pipeline. The pipeline works well with the DirectRunner, but I have trouble managing dependencies with the DataflowRunner: Python cannot find local dependencies (e.g. those generated by py_library). Is there any way to hint Dataflow to use the Python binary (the py_binary zip file) in the worker container to resolve the issue?
Thanks,
Please see here for more details on setting up dependencies for the Python SDK on Dataflow. If you are using a local dependency, you should probably look into developing a Python package and passing it with the --extra_package option, or developing a custom container.

add local jars to ivy build script

I am working on an existing application that uses ivy to manage dependencies, and the source comes with ivy.xml and ivysettings.xml files. I am trying to add my own jar to the build. What would be the easiest way to do this?
I tried adding a dependency to ivy.xml and I am not sure how to configure the repository directories. Maybe there are easy ways to do this? Any quick and dirty way will do.
The filesystem resolver in conjunction with the chain resolver should help you, assuming that you can modify the ivysettings.xml that you just inherited.
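A minimal sketch of that idea, assuming you are free to edit ivysettings.xml: a chain resolver that checks a local directory first and then falls back to a public Maven repository. The resolver names and the local-lib directory are hypothetical, and the ibiblio entry merely stands in for whatever resolvers the inherited file already defines.

```xml
<ivysettings>
  <settings defaultResolver="default-chain"/>
  <resolvers>
    <!-- Try each resolver in order and stop at the first match. -->
    <chain name="default-chain" returnFirst="true">
      <!-- Hypothetical directory next to ivysettings.xml holding your own jars,
           named like mylib-1.0.jar. -->
      <filesystem name="local-jars">
        <artifact pattern="${ivy.settings.dir}/local-lib/[artifact]-[revision].[ext]"/>
      </filesystem>
      <!-- Placeholder for the existing third-party resolver(s). -->
      <ibiblio name="central" m2compatible="true"/>
    </chain>
  </resolvers>
</ivysettings>
```

A jar dropped into local-lib as mylib-1.0.jar could then be referenced from ivy.xml with a dependency element whose name is mylib and whose rev is 1.0.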
You can store your jars locally on your machine under your Local Ivy cache or your Shared Ivy cache. I believe these are $HOME/.ivy2/local and $HOME/.ivy2/shared, and they use the same format as the $HOME/.ivy2/cache directory. If you use the <ivy:publish/> Ant task to push your local jars to your local repository, they'll be accessible to all of your projects.
However, I recommend biting the bullet and doing things ...what's the technical term? oh yeah... The correct way.
Go ahead and set up a project-wide Ivy/Maven repository where you can fetch your local jars the same way you fetch your third-party jars. This way, there is no difference between your local jars and the third-party jars you're using. No one has to think about where a particular jar is located or adjust their Ivy configuration to get one jar or another.
Download either Nexus or Artifactory. You can set these repositories up so that all the third-party jars and your local jars are available as if they're all stored in the same server. You can even add in other jar repositories that are not centrally located.
I recommend Loughran's book Ant in Action. It has an excellent chapter on using Ivy. You can also look at my ivy.dir to see how I configure Ivy, so it's easily accessible to all of our projects.

Include ant libs from within the build file

My problem is the following:
I would like to use the propertyregex task in ant. The project I am working on is built on various different servers and I don't want to configure every server (i.e. install the ant-nodeps.jar on each one). The source needs to include everything that is not installed on the system by default.
So now I would need to add the ant-nodeps.jar to the ant classpath from within the build file. Does somebody know how to do that?
Cheers,
Robert
The propertyregex task is part of ant-contrib and can be installed as part of your build using Apache Ivy.
Check out the following example, which demonstrates how to download and use the "for" task (also from the ant-contrib project):
Problems getting my ANT builds to work after OS upgrade
The one downside is that ivy does not come pre-packaged with ANT, so the following answer has a tip on how to bootstrap your ANT builds. Once ivy is started it can be used to pull down everything else your build needs.
Ivy fails to resolve a dependency, unable to find cause
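For the propertyregex case specifically, a build file might look roughly like the sketch below. It assumes Ivy is already available to Ant (see the bootstrap tip above) and that the ant-contrib 1.0b3 artifact can be resolved from your configured repository. The target, path, and property names are hypothetical.

```xml
<project name="demo" default="show-major" xmlns:ivy="antlib:org.apache.ivy.ant">

  <target name="init">
    <!-- Resolve ant-contrib through Ivy and expose it as an Ant path. -->
    <ivy:cachepath pathid="contrib.path" inline="true"
                   organisation="ant-contrib" module="ant-contrib" revision="1.0b3"/>
    <!-- Register the ant-contrib tasks, including propertyregex. -->
    <taskdef resource="net/sf/antcontrib/antlib.xml" classpathref="contrib.path"/>
  </target>

  <target name="show-major" depends="init">
    <property name="version" value="2.7.1"/>
    <!-- Capture the leading number of ${version} into the "major" property. -->
    <propertyregex property="major" input="${version}"
                   regexp="^(\d+)\..*" select="\1"/>
    <echo message="major version: ${major}"/>
  </target>

</project>
```

Nothing has to be installed on the build servers beyond Ant and Ivy themselves, since the ant-contrib jar is fetched into the Ivy cache on first use.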
Update
While I understand your requirement to have no change on the target platforms, it's a very difficult problem to solve if you must also match several old versions of the build software. I have found incompatibilities between the latest ANT and 5-year-old versions like 1.7 (ANT 1.6.5 is now 8 years old....)
What I do is install a very limited number of ANT versions on my Jenkins slave nodes. Build jobs can then only choose from these and then use ivy to download all other 3rd party software dependencies (This setup emulates how you'd manage a set of Maven projects).
I suspect you're using ANT to run your deployments? If that is the case I would suggest switching to something like Groovy, which can be deployed as a single jar file and can pull down dependencies on the fly, using Grape.

How do you integrate ivy with MSbuild

What approach has worked well for you combining IVY + msbuild?
Our goal is to integrate IVY into the C#/C++ build process for dependency resolution and publishing. We have tried adding it to custom tasks at the beginning and end of the build, and we have tried wrapping the msbuild calls with ant + apache-ant-dotnet.
Other options might be gradle, buildr, rake.
What do you use?
Thanks
Peter
Most build technologies can use libraries found in a local directory. I'd suggest using the command-line ivy program to populate this, at the start of your build:
java -jar ivy.jar -ivy ivy.xml -settings ivysettings.xml -retrieve "lib/[conf]/[artifact].[ext]"
Your dependencies are listed in a standard ivy file called ivy.xml. The protocol, location and layout of your remote repository are described in ivysettings.xml.
The advantage of this approach (as opposed to switching to Gradle, etc) is that you're not trying to replace your existing build tool. Ivy is solely concerned with managing dependencies.
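To make the retrieve pattern above concrete, here is a small ivy.xml sketch. The organisation, module, and dependency names are all hypothetical. The point is only that each configuration declared here becomes a lib/[conf] subdirectory when the command-line retrieve runs.

```xml
<ivy-module version="2.0">
  <!-- Hypothetical identity of the project being built. -->
  <info organisation="com.example" module="my-dotnet-app"/>

  <configurations>
    <conf name="compile" description="Libraries needed to build"/>
    <conf name="test" extends="compile" description="Extra libraries for tests"/>
  </configurations>

  <dependencies>
    <!-- Hypothetical dependencies published to your own repository. -->
    <dependency org="com.example" name="core-lib" rev="1.2.0" conf="compile->default"/>
    <dependency org="com.example" name="test-helpers" rev="0.9.0" conf="test->default"/>
  </dependencies>
</ivy-module>
```

With this file, the retrieve command shown earlier would place the core-lib artifact under lib/compile and the test-helpers artifact under lib/test, where msbuild can reference them as ordinary local files.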
My team has been using Ivy for .NET for a couple of years very successfully. I know several more that give it a vote of confidence.
Use it standalone. Wrap calls into msbuild tasks. No need to use Ant integration.

Cleanup Antscript

Are there any tools available that allow the cleanup of a ant script?
I especially need to remove unnecessary jar files... The Ant script I have to clean has more than 500 entries and has grown uncontrolled over time.
There's no automated way of cleaning up jar files. You can look at the various import statements in your Java code, but they merely name the classes to import and not the jars themselves. Even if you can determine that a particular class is served by jarA.jar, it could be that jarA.jar is dependent upon jarB.jar.
You can even start removing jar files one at a time to see what breaks your build. That can be somewhat automated, especially if you specify your classpath via fileset instead of each specific jar. However, what if you actually need a jar for runtime, and not for the build?
My suggestion is to use Ant with Ivy. Ivy gives you the same Maven jar dependency capabilities without converting your project to Maven.
Take a look at Ivy and see how it works with Ant. Then, if possible, ask your developers to determine exactly what jars they need and what versions of those jars they need. You will have to help them. You might have to go through the jars in your repository and attempt to figure out what versions of the jars are in your repository.
You don't have to worry about jars that other jars depend upon. Ivy will take care of that for you. What you need are simply the jars that your developers depend upon directly, and they should know because they're the ones who use the import statements in their programs to pull in a particular dependency.
Once you've determined the primary jars (and revisions) you need, you can easily convert your build.xml files to take advantage of Ivy's jar dependency system. Once you've done that, you can remove all the jars from your source repository since Ant with Ivy will download the required ones from the Internet based Maven repository system.
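A converted build.xml could then be as small as the following sketch. It assumes Ivy is installed for Ant and that an ivy.xml next to the build file declares a configuration named compile. The directory, target, and path names are hypothetical.

```xml
<project name="app" default="compile" xmlns:ivy="antlib:org.apache.ivy.ant">

  <target name="resolve">
    <!-- Download the declared jars (and their transitive dependencies) into lib/<conf>/. -->
    <ivy:retrieve pattern="lib/[conf]/[artifact]-[revision].[ext]"/>
  </target>

  <target name="compile" depends="resolve">
    <!-- One fileset replaces the long hand-maintained list of jar entries. -->
    <path id="compile.path">
      <fileset dir="lib/compile" includes="*.jar"/>
    </path>
    <mkdir dir="build/classes"/>
    <javac srcdir="src" destdir="build/classes"
           classpathref="compile.path" includeantruntime="false"/>
  </target>

</project>
```

Because the lib directory is regenerated on every resolve, it can be excluded from version control once the build works this way.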
