How do I manage multi-project dependencies (directed acyclic graph) with IBM RAD ant? - ant

I am working on an ant script to build Java projects developed with IBM RAD 7.5.
The ant script calls the IBM RAD ant extension API. I am using the projectSetImport task to load the project set file (*.psf) into memory, and then calling another task to compile the projects listed by projectSetImport.
The problem is that the projects listed in the psf file are not ordered by project dependency, so the compile fails because the dependency order is wrong.
Is there any API or method to manage the dependency order automatically? The psf files I am handling are quite big, with 200+ projects in each file, and they are constantly changing (e.g. some projects get removed and some new projects are added each week).
Here is a more detailed description of the problem:
The project dependency is like:
1) project A depends on B and D.
2) project B depends on C
3) project E depends on F
A -> B -> C
A -> D
E -> F
The sample.psf file just lists all the projects:
A
B
C
D
E
F
The ant script loads sample.psf, which gives the project list [A, B, C, D, E, F], and then builds the projects in that list order.
The build fails at A, because A needs B and D to be built first.
My current solution is to rebuild the sample.psf manually, e.g.
sample.psf file:
C
B
D
A
F
E
but this is hard to maintain, because there are 200+ projects in a psf file and they are constantly changing.
One way to attack this issue is to write a parser that reads the .project file for each project; the dependency projects are listed in its "projects" tag. Then implement a directed-acyclic-graph algorithm to reorder the projects by dependency. This approach might be overkill. This must be a common issue for teams building IBM Java projects; is there an existing solution?

Finally, I wrote some Python code to compute the dependency order. The logic is listed below:
1) Read the psf file into a list. The psf file is an XML file, and the project names are inside its tags.
2) For each project in the list, go to the project source code and read the .project file and the .classpath file; these two files contain the dependency projects.
3) For the .project file (XML), fetch the project names from the "project" tags; for the .classpath file, fetch the classpathentry elements with attribute kind='src'.
4) Now you have a [source] -> [dependent_project_list] map; build a directed acyclic graph from it (see the attached code).
5) Load each [source] -> [dependent_project] edge into the AdjecentListDigraph and call topoSort() to return the dependency order.
6) Generate a new, ordered psf file.
/////////////////////// dap_graph.py/////////////////////////////
# -*- coding: utf-8 -*-
'''Use a directed acyclic graph to calculate the dependency order'''


class Vertex:
    def __init__(self, name):
        self._name = name
        self.visited = True


class InValidDigraphError(RuntimeError):
    def __init__(self, arg):
        self.args = arg


class AdjecentListDigraph:
    '''represent a directed graph by an adjacency list'''

    def __init__(self):
        '''use a table to store edges,
        the key is the vertex name, the value is a vertex list
        '''
        self._edge_table = {}
        self._vertex_name_set = set()

    def __addVertex(self, vertex_name):
        self._vertex_name_set.add(vertex_name)

    def addEdge(self, start_vertex, end_vertex):
        if not self._edge_table.has_key(start_vertex._name):
            self._edge_table[start_vertex._name] = []
        self._edge_table[start_vertex._name].append(end_vertex)
        # populate vertex set
        self.__addVertex(start_vertex._name)
        self.__addVertex(end_vertex._name)

    def getNextLeaf(self, vertex_name_set, edge_table):
        '''pick a vertex which has no outgoing edge and return its name.
        algorithm:
            for v in vertex_set:
                get vertexes not in edge_table.keys(),
                i.e. vertexes whose dependency list is empty
        '''
        print 'TODO: validate this is a connected tree'
        leaf_set = vertex_name_set - set(edge_table.keys())
        if len(leaf_set) == 0:
            if len(edge_table) > 0:
                raise InValidDigraphError("Error: Cyclic directed graph")
        else:
            vertex_name = leaf_set.pop()
            vertex_name_set.remove(vertex_name)
            # remove any occurrence of vertex_name in edge_table
            for key, vertex_list in edge_table.items():
                if vertex_name in vertex_list:
                    vertex_list.remove(vertex_name)
                # remove the vertex which has no end vertex from edge_table
                if len(vertex_list) == 0:
                    del edge_table[key]
            return vertex_name

    def topoSort(self):
        '''topological sort, return list of vertex. Throw error if it is
        a cyclic graph'''
        sorted_vertex = []
        edge_table = self.dumpEdges()
        vertex_name_set = set(self.dumpVertexes())
        while len(vertex_name_set) > 0:
            next_vertex = self.getNextLeaf(vertex_name_set, edge_table)
            sorted_vertex.append(next_vertex)
        return sorted_vertex

    def dumpEdges(self):
        '''return a copy of the edge table for debugging'''
        edge_table = {}
        for key in self._edge_table:
            edge_table[key] = [v._name for v in self._edge_table[key]]
        return edge_table

    def dumpVertexes(self):
        return self._vertex_name_set
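For illustration, a quick usage sketch of the graph on the A..F example from the question (assuming the code above is saved as dap_graph.py):

from dap_graph import Vertex, AdjecentListDigraph

# project -> projects it depends on, taken from the example above
dependency_map = {'A': ['B', 'D'], 'B': ['C'], 'E': ['F']}

graph = AdjecentListDigraph()
for source, dependencies in dependency_map.items():
    for dependency in dependencies:
        # an edge points from a project to a project it depends on
        graph.addEdge(Vertex(source), Vertex(dependency))

# dependencies come out before their dependents; ties are in arbitrary order,
# e.g. ['C', 'D', 'F', 'B', 'E', 'A']
print graph.topoSort()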
//////////////////////projects_loader.py///////////////////////
# -*- coding: utf-8 -*-
'''
This module loads the dependencies of every project listed in the psf file and
computes the directed acyclic graph.
Dependencies are loaded into a map structured as below:
    dependency_map{"project_A": set(A1, A2, A3),
                   "A1": set(B1, B2, B3)}
The algorithm is:
    1) read the project list from the psf file
    2) call readProjectDependency(project_name) for each project
'''
import os
import xml.dom.minidom
from utils.setting import configuration


class ProjectsLoader:

    def __init__(self, application_name):
        self.dependency_map = {}
        self.source_dir = configuration.get('Build', 'base.dir')
        self.application_name = application_name
        self.src_filter_list = configuration.getCollection('psf',
                                                           'src.filter.list')

    def loadDependenciesFromProjects(self, project_list):
        for project_name in project_list:
            self.readProjectDependency(project_name)

    def readProjectDependency(self, project_name):
        project_path = self.source_dir + '\\' + self.application_name + '\\'\
            + project_name
        project_file_path = os.path.join(project_path, '.project')
        projects_from_project_file = self.readProjectFile(project_file_path)
        classpath_file_path = os.path.join(project_path, '.classpath')
        projects_from_classpath_file = self.\
            readClasspathFile(classpath_file_path)
        projects = (projects_from_project_file | projects_from_classpath_file)
        if self.dependency_map.has_key(project_name):
            self.dependency_map[project_name] |= projects
        else:
            self.dependency_map[project_name] = projects

    def loadDependencyByProjectName(self, project_name):
        project_path = self.source_dir + '\\' + self.application_name + '\\'\
            + project_name
        project_file_path = os.path.join(project_path, '.project')
        projects_from_project_file = self.readProjectFile(project_file_path)
        classpath_file_path = os.path.join(project_path, '.classpath')
        projects_from_classpath_file = self.\
            readClasspathFile(classpath_file_path)
        # union of the two sets of dependencies
        projects = list(projects_from_project_file
                        | projects_from_classpath_file)
        self.dependency_map[project_name] = projects
        for project in projects:
            self.loadDependencyByProjectName(project)

    def readProjectFile(self, project_file_path):
        DOMTree = xml.dom.minidom.parse(project_file_path)
        projects = DOMTree.documentElement.getElementsByTagName('project')
        return set([project.childNodes[0].data for project in projects])

    def readClasspathFile(self, classpath_file_path):
        dependency_projects = set([])
        if os.path.isfile(classpath_file_path):
            DOMTree = xml.dom.minidom.parse(classpath_file_path)
            projects = DOMTree.documentElement.\
                getElementsByTagName('classpathentry')
            for project in projects:
                if project.hasAttribute('kind') and project.getAttribute\
                        ('kind') == 'src' and project.hasAttribute('path') and \
                        project.getAttribute('path') not in self.src_filter_list:
                    project_name = project.getAttribute('path').lstrip('/')
                    dependency_projects.add(project_name)
        return dependency_projects

    def getDependencyMap(self):
        return self.dependency_map
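And a rough sketch of the glue that produces the ordered list the new psf file is generated from (buildOrderedProjectList is just an illustrative name; the psf parsing and psf writing steps are omitted here):

from dap_graph import Vertex, AdjecentListDigraph
from projects_loader import ProjectsLoader

def buildOrderedProjectList(application_name, project_list):
    # 1) read every project's .project/.classpath into dependency_map
    loader = ProjectsLoader(application_name)
    loader.loadDependenciesFromProjects(project_list)

    # 2) load the [source] -> [dependent_project] edges into the digraph
    graph = AdjecentListDigraph()
    for source, dependencies in loader.getDependencyMap().items():
        for dependency in dependencies:
            graph.addEdge(Vertex(source), Vertex(dependency))

    # 3) topological sort: dependencies first
    ordered = graph.topoSort()

    # projects with no edges at all never enter the graph, so append them
    ordered += [p for p in project_list if p not in ordered]
    return ordered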

Related

Using DASK to read files and write to NEO4J in PYTHON

I am having trouble parallelizing code that reads some files and writes to neo4j.
I am using dask to parallelize the process_language_files function (3rd cell from the bottom).
I try to explain the code below, listing out the functions (First 3 cells).
The errors are printed at the end (Last 2 cells).
I am also listing environments and package versions at the end.
If I remove dask.delayed and run this code sequentially, it works perfectly well.
Thank you for your help. :)
==========================================================================
Some functions to work with neo4j.
from neo4j import GraphDatabase
from tqdm import tqdm

def get_driver(uri_scheme='bolt', host='localhost', port='7687', username='neo4j', password=''):
    """Get a neo4j driver."""
    connection_uri = "{uri_scheme}://{host}:{port}".format(uri_scheme=uri_scheme, host=host, port=port)
    auth = (username, password)
    driver = GraphDatabase.driver(connection_uri, auth=auth)
    return driver

def format_raw_res(raw_res):
    """Parse neo4j results"""
    res = []
    for r in raw_res:
        res.append(r)
    return res

def run_bulk_query(query_list, driver):
    """Run a list of neo4j queries in a session."""
    results = []
    with driver.session() as session:
        for query in tqdm(query_list):
            raw_res = session.run(query)
            res = format_raw_res(raw_res)
            results.append({'query': query, 'result': res})
    return results
global_driver = get_driver(uri_scheme='bolt', host='localhost', port='8687', username='neo4j', password='abc123')  # neo4j driver object
This is how we create a dask client to parallelize.
from dask.distributed import Client
client = Client(threads_per_worker=4, n_workers=1)
The functions that the main code is calling.
import sys
import time
import json
import pandas as pd
import dask

def add_nodes(nodes_list, language_code):
    """Returns a list of strings. Each string is a cypher query to add a node to neo4j."""
    list_of_create_strings = []
    create_string_template = """CREATE (:LABEL {{node_id:{node_id}}})"""
    for index, node in nodes_list.iterrows():
        create_string = create_string_template.format(node_id=node['new_id'])
        list_of_create_strings.append(create_string)
    return list_of_create_strings

def add_relations(relations_list, language_code):
    """Returns a list of strings. Each string is a cypher query to add a relationship to neo4j."""
    list_of_create_strings = []
    create_string_template = """
        MATCH (a),(b) WHERE a.node_id = {source} AND b.node_id = {target}
        MERGE (a)-[r:KNOWS {{ relationship_id:{edge_id} }}]-(b)"""
    for index, relations in relations_list.iterrows():
        create_string = create_string_template.format(
            source=relations['from'], target=relations['to'],
            edge_id=''+str(relations['from'])+'-'+str(relations['to']))
        list_of_create_strings.append(create_string)
    return list_of_create_strings

def add_data(language_code, edges, features, targets, driver):
    """Add nodes and relationships to neo4j"""
    add_nodes_cypher = add_nodes(targets, language_code)  # Returns a list of strings. Each string is a cypher query to add a node to neo4j.
    node_results = run_bulk_query(add_nodes_cypher, driver)  # Runs each string in the above list in a neo4j session.
    add_relations_cypher = add_relations(edges, language_code)  # Returns a list of strings. Each string is a cypher query to add a relationship to neo4j.
    relations_results = run_bulk_query(add_relations_cypher, driver)  # Runs each string in the above list in a neo4j session.
    # Saving some metadata
    results = {
        "nodes": {"results": node_results, "length": len(add_nodes_cypher),},
        "relations": {"results": relations_results, "length": len(add_relations_cypher),},
    }
    return results

def load_data(language_code):
    """Load data from files"""
    # Saving file names to variables
    edges_filename = './edges.csv'
    features_filename = './features.json'
    target_filename = './target.csv'
    # Loading data from the file names
    edges = helper.read_csv(edges_filename)
    features = helper.read_json(features_filename)
    targets = helper.read_csv(target_filename)
    # Saving some metadata
    results = {
        "edges": {"length": len(edges),},
        "features": {"length": len(features),},
        "targets": {"length": len(targets),},
    }
    return edges, features, targets, results
The main code.
def process_language_files(language_code, driver):
    """Reads files, creates cypher queries to add nodes and relationships, runs cypher query in a neo4j session."""
    edges, features, targets, reading_results = load_data(language_code)  # Read files.
    writing_results = add_data(language_code, edges, features, targets, driver)  # Convert files to nodes and relationships and add to neo4j in a neo4j session.
    return {"reading_results": reading_results, "writing_results": writing_results}  # Return some metadata

# Execution starts here
res = []
for index, language_code in enumerate(['ENGLISH', 'FRENCH']):
    lazy_result = dask.delayed(process_language_files)(language_code, global_driver)
    res.append(lazy_result)
Result from res. These are dask delayed objects.
print(*res)
Delayed('process_language_files-a73f4a9d-6ffa-4295-8803-7fe09849c068') Delayed('process_language_files-c88fbd4f-e8c1-40c0-b143-eda41a209862')
The errors. Even if I use dask.compute(), I get similar errors.
futures = dask.persist(*res)
AttributeError Traceback (most recent call last)
~/Code/miniconda3/envs/MVDS/lib/python3.6/site-packages/distributed/protocol/pickle.py in dumps(x, buffer_callback, protocol)
48 buffers.clear()
---> 49 result = pickle.dumps(x, **dump_kwargs)
50 if len(result) < 1000:
AttributeError: Can't pickle local object 'BoltPool.open.<locals>.opener'
==========================================================================
# Name                Version      Build          Channel
dask                  2020.12.0    pyhd8ed1ab_0   conda-forge
jupyterlab            3.0.3        pyhd8ed1ab_0   conda-forge
neo4j-python-driver   4.2.1        pyh7fcb38b_0   conda-forge
python                3.9.1        hdb3f193_2
You are getting this error because you are trying to share the driver object amongst your workers.
The driver object contains private data about the connection, data that does not make sense outside the process (and is also not serializable).
It is like opening a file somewhere and sharing the file descriptor somewhere else: it won't work, because the file number makes sense only within the process that generated it.
If you want your workers to access the database or any other network resource, you should give them the directions to connect to the resource.
In your case, you should not pass global_driver as a parameter, but rather the connection parameters, and let each worker call get_driver to get its own driver.
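For example, something along these lines (untested sketch; it reuses your get_driver, load_data and add_data from above):

# sketch: hand each delayed task the connection settings, not the driver itself
def process_language_files(language_code, connection_params):
    driver = get_driver(**connection_params)   # each worker builds its own driver
    try:
        edges, features, targets, reading_results = load_data(language_code)
        writing_results = add_data(language_code, edges, features, targets, driver)
    finally:
        driver.close()
    return {"reading_results": reading_results, "writing_results": writing_results}

connection_params = {'uri_scheme': 'bolt', 'host': 'localhost', 'port': '8687',
                     'username': 'neo4j', 'password': 'abc123'}

res = []
for language_code in ['ENGLISH', 'FRENCH']:
    res.append(dask.delayed(process_language_files)(language_code, connection_params))
futures = dask.persist(*res)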

Conditionally create a Bazel rule based on --config

I'm working on a problem in which I only want to create a particular rule if a certain Bazel config has been specified (via '--config'). We have been using Bazel since 0.11 and have a bunch of build infrastructure that works around former limitations in Bazel. I am incrementally porting us up to newer versions. One of the features that was missing was compiler transitions, and so we rolled our own using configs and some external scripts.
My first attempt at solving my problem looks like this:
load("#rules_cc//cc:defs.bzl", "cc_library")
# use this with a select to pick targets to include/exclude based on config
# see __build_if_role for an example
def noop_impl(ctx):
pass
noop = rule(
implementation = noop_impl,
attrs = {
"deps": attr.label_list(),
},
)
def __sanitize(config):
if len(config) > 2 and config[:2] == "//":
config = config[2:]
return config.replace(":", "_").replace("/", "_")
def build_if_config(**kwargs):
config = kwargs['config']
kwargs.pop('config')
name = kwargs['name'] + '_' + __sanitize(config)
binary_target_name = kwargs['name']
kwargs['name'] = binary_target_name
cc_library(**kwargs)
noop(
name = name,
deps = select({
config: [ binary_target_name ],
"//conditions:default": [],
})
)
This almost gets me there, but the problem is that if I want to build a library as an output, then it becomes an intermediate dependency, and therefore gets deleted or never built.
For example, if I do this:
build_if_config(
    name="some_lib",
    srcs=[ "foo.c" ],
    config="//:my_config",
)
and then I run
bazel build --config my_config //:some_lib
Then libsome_lib.a does not make it to bazel-out, although if I define it using cc_library, then it does.
Is there a way that I can just create the appropriate rule directly in the macro instead of creating a noop rule and using a select? Or another mechanism?
Thanks in advance for your help!
As I noted in my comment, I was misunderstanding how Bazel figures out its dependencies. The "create a file" section of the Rules Tutorial explains some of the details, and I followed along with it for some of my solution.
Basically, the problem was not that the built files were not sticking around; it was that they were never getting built. Bazel did not know to look in the deps variable and build those things: it seems I had to create an action which uses the deps, and then register that action by returning a (list of) DefaultInfo.
Below is my new noop_impl function
def noop_impl(ctx):
    if len(ctx.attr.deps) == 0:
        return None

    # ctx.attr has the attributes of this rule
    dep = ctx.attr.deps[0]

    # DefaultInfo is apparently some sort of globally available
    # class that can be used to index Target objects
    infile = dep[DefaultInfo].files.to_list()[0]

    outfile = ctx.actions.declare_file('lib' + ctx.label.name + '.a')
    ctx.actions.run_shell(
        inputs = [infile],
        outputs = [outfile],
        command = "cp %s %s" % (infile.path, outfile.path),
    )

    # we can also instantiate a DefaultInfo to indicate what output
    # we provide
    return [DefaultInfo(files = depset([outfile]))]

Multiple outputs from one input based on features

I would like to build many outputs based on the same input, e.g. a hex and a binary from an elf.
I will do this multiple times, in different places in the wscript, so I'd like to wrap it in a feature.
Ideally something like:
bld(features="hex", source="output.elf")
bld(features="bin", source="output.elf")
How would I go about implementing this?
If your elf files always have the same extension, you can simply use that:
# untested, naive code
from waflib import TaskGen

@TaskGen.extension('.elf')
def process_elf(self, node):  # <- self = task gen, node is the current input node
    if "bin" in self.features:
        bin_node = node.change_ext('.bin')
        self.create_task('make_bin_task', node, bin_node)
    if "hex" in self.features:
        hex_node = node.change_ext('.hex')
        self.create_task('make_hex_task', node, hex_node)
If not, you have to define the features you want like that:
from waflib import TaskGen

@TaskGen.feature("hex", "bin")        # <- attach method to features hex AND bin
@TaskGen.before('process_source')
def transform_source(self):           # <- here self = task generator
    self.inputs = self.to_nodes(getattr(self, 'source', []))
    self.meths.remove('process_source')  # <- to disable the standard process_source

@TaskGen.feature("hex")               # <- attach method to feature hex
@TaskGen.after('transform_source')
def process_hex(self):
    for i in self.inputs:
        self.create_task("make_hex_task", i, i.change_ext(".hex"))

@TaskGen.feature("bin")               # <- attach method to feature bin
@TaskGen.after('transform_source')
def process_bin(self):
    for i in self.inputs:
        self.create_task("make_bin_task", i, i.change_ext(".bin"))
You have to write the two tasks make_hex_task and make_bin_task. You should put all this in a separate python file and make a "plugin".
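For instance, the two tasks can be as simple as the following (untested sketch; it assumes objcopy has been found at configure time, e.g. with conf.find_program('objcopy', var='OBJCOPY')):

from waflib import Task

class make_hex_task(Task.Task):
    # convert the elf input (${SRC}) to an Intel hex output (${TGT})
    run_str = '${OBJCOPY} -O ihex ${SRC} ${TGT}'
    color = 'CYAN'

class make_bin_task(Task.Task):
    # convert the elf input (${SRC}) to a raw binary output (${TGT})
    run_str = '${OBJCOPY} -O binary ${SRC} ${TGT}'
    color = 'CYAN'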
You can also define a "shortcut" to call:
def build(bld):
    bld.make_bin(source = "output.elf")
    bld.make_hex(source = "output.elf")
    bld(features = "hex bin", source = "output.elf")  # when both needed in the same place
Like that:
from waflib.Configure import conf

@conf
def make_bin(self, *k, **kw):  # <- here self = build context
    kw["features"] = "bin"     # <- you can add bin to existing features kw
    return self(*k, **kw)

@conf
def make_hex(self, *k, **kw):
    kw["features"] = "hex"
    return self(*k, **kw)

Patching class cprogram to accept two targets

I have implemented a custom C compiler tool, but at the last step (linking) I am struggling to get it working. The linker produces two output files: one is the binary, and the second one is a file with additional information.
Normally you would have a wscript with something like this:
def configure(cnf):
    cnf.load('my_compiler_c')

def build(bld):
    bld(features='c cprogram', source='main.c', target='app.bbin')
And I could fake a second target like this
from waflib.Tools.ccroot import link_task  # link_task comes from waf's ccroot module

class cprogram(link_task):
    run_str = (
        "${LINK_CC} ${CFLAGS} ${OTHERFLAGS} "
        "${INFO_FILE}${TGT[0].relpath()+'.abc'} "  # TGT[0] + some string concatenation will be the app.bbin.abc file
        "${CCLNK_TGT_F}${TGT[0].relpath()} "       # TGT[0] is the app.bbin file
        "${CCLNK_SRC_F}${SRC} ${STLIB_MARKER} ${STLIBPATH_ST:STLIBPATH} "
        "${CSTLIB_ST:CSTLIB} ${STLIB_ST:STLIB} ${LIBPATH_ST:LIBPATH} ${LIB_ST:LIB} ${LDFLAGS}"
    )
    ext_out = [".bbin"]
    vars = ["LINKDEPS"]
But of course, with this hacky implementation waf does not know about the second target and rebuilds will not be triggered when app.bbin.abc is missing.
So how do I correctly pass two or more targets to the cprogram class?
Well, you just have to tell waf that you need two targets:
def configure(cnf):
    cnf.load('my_compiler_c')

def build(bld):
    bld(features='c cprogram', source='main.c', target=['app.bbin', 'app.bbin.abc'])
As I suppose you don't want to type two targets, you can use an alias to build your task generator:
# Naive, non-tested code.
from waflib.Configure import conf

@conf
def myprogram(bld, *k, **kw):
    kw['features'] = "c cprogram"
    add_my_abc_target_to_target(kw)  # I'm lazy
    return bld(*k, **kw)
You call:
def build(bld):
    bld.myprogram(source='main.c', target='app.bbin')
Note: You can put all your code in a plugin, to have clean wscripts:
def configure(cnf):
    cnf.load('myprogram')  # loads my_c_compiler and myprogram alias

def build(bld):
    bld.myprogram(source='main.c', target='app.bbin')
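For completeness, the helper the alias glosses over could look roughly like this (add_my_abc_target_to_target is just the placeholder name used above, and string targets are assumed):

def add_my_abc_target_to_target(kw):
    # turn target into a list and append the companion .abc file
    # next to the primary binary
    target = kw.get('target')
    if isinstance(target, str):
        target = [target]
    kw['target'] = list(target) + [target[0] + '.abc']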

Getting reverse dependencies of a module in Ivy

We are using Ivy for storing our binaries and managing dependencies.
With the purpose of managing the impact of changes in modules, we would need to gather this information from the repository:
Given a module name, organization, branch and revision, obtain all modules that are directly or transitively dependent on that module (with branch and revision). Particularly interesting are the impacted "top-level" (application) modules.
Is there any tool suitable for this task? Otherwise, what would you suggest to solve it?
I've tried the repreport task without much success, as it doesn't seem appropriate to browse the dependencies in reverse.
We have Jenkins, and we build a bunch of jars that other applications depend upon. We use a Maven repository for storing these jars. In Jenkins, a developer can take a particular jar build and promote that jar into our Maven repo.
The problem is that such a jar can break some of the applications that use it. Therefore, I want to be able to rebuild those projects whenever a jar is promoted to our Maven repo. The list produced by the script below gives me the names of the Jenkins projects that depend upon a particular jar.
Now, our projects were built with ant before Ivy, so I created several macros that helped developers use Ivy. We have replaced the <jar> task with a <jar.macro/> task. It is like the <jar> task except that it takes the ivy.xml, converts it into a Maven pom.xml and embeds it in the jar. The script looks at the build.xml for these <jar.macro> tasks so it knows the names of the jars each project builds. You'll probably have to munge this to fit your own setup.
The following Perl script goes through our Subversion projects and looks for the ones with an ivy.xml. It goes through the build.xml, sees what jars are built, associates them with the project, and then goes through the ivy.xml to see what projects each one depends upon.
You're welcome to use it.
#! /usr/bin/env perl
#
use warnings;
use strict;
use autodie;
use feature qw(say);
use XML::Simple;
use Data::Dumper;
use File::Find;
use Pod::Usage;
use constant {
SVN_REPO => 'http://svn/rsvp',
IVY_XML_FILE => 'ivy.xml',
SVN => '/usr/local/bin/svn',
};
#
# These are projects we don't want to include, but
# they have 'ivy.xml' in them anyway
#
# These are projects with an ivy.xml, but we don't want them
use constant BAD_PROJECTS => qw(
...
);
my %bad_projects = map { $_ => 1 } BAD_PROJECTS;
#
# Find the branch or use trunk (First parameter used
#
my $branch = shift; # This is the branch to search on
if ( not defined $branch ) {
$branch = "trunk";
}
my $branch_url;
if ( $branch eq "trunk" ) {
$branch_url = $branch;
}
else {
$branch_url = "branches/$branch";
}
#
# Use "svn ls" to find all the projects that have an ivy.xml
#
open my $project_fh, "-|", "@{[SVN]} ls @{[SVN_REPO]}/$branch_url";
my %ivy_projects;
say "FINDING IVY PROJECTS";
while ( my $svn_project_name = <$project_fh> ) {
chomp $svn_project_name;
$svn_project_name =~ s|/$||; # Remove the trailing slash
next if exists $bad_projects{$svn_project_name};
#
# See if an ivy.xml file exists in this project via "svn ls"
#
my $svn_ivy_project_url = SVN_REPO . "/$branch_url/$svn_project_name";
my $ivy_file = "$svn_ivy_project_url/ivy.xml";
my $error = system qq( @{[SVN]} ls $ivy_file > /dev/null 2>&1 );
next if $error; # No ivy.xml
say " " x 4 . "Ivy Project: $svn_ivy_project_url";
#
# Ivy project exists. Create a new "project" object to store all the info
#
my $project = Local::Project->new($svn_ivy_project_url);
my $ivy_xml = qx( @{[SVN]} cat $svn_ivy_project_url/ivy.xml );
$project->Ivy_xml( $ivy_xml );
my $build_xml = qx( @{[SVN]} cat $svn_ivy_project_url/build.xml );
$project->Build_xml( $build_xml );
$ivy_projects{ $svn_project_name } = $project;
}
#
# Go through build.xml files and look for all jar.macro tasks. Go through
# these and map the Ivy Artifact Name to the project that builds it.
# The Ivy Artifact Name could be from the ivy.xml file. However, if
# the parameter pom.artifact.name exists, the Ivy Artifact Name will be that.
#
my %jars_to_project_name;
for my $svn_project ( sort keys %ivy_projects ) {
my $project = $ivy_projects{$svn_project};
my $url = $project->Url;
my $build_ref = $project->Build_ref;
my $ivy_ref = $project->Ivy_ref;
my $build_xml_project_name = $build_ref->{name};
say qq(Parsing build.xml of "$svn_project");
#
# Go through all targets looking for jar.macro tasks
#
for my $target ( keys %{ $build_ref->{target} } ) {
next unless $build_ref->{target}->{$target}->{"jar.macro"};
#
# Contains a Jar Macro Task: This could be an array reference?
#
my @jar_macros;
my $jar_macro_task = $build_ref->{target}->{$target}->{"jar.macro"};
if ( ref $jar_macro_task eq "ARRAY" ) {
@jar_macros = @{ $jar_macro_task };
} else {
@jar_macros = ( $jar_macro_task );
}
for my $jar_macro ( @jar_macros ) {
#
# If there is no "pom.artifact.name" in the jar.macro
# task, we need to use the name of the module in the
# ivy.xml file. If pom.artifact.name does exist, we will
# use that. We also need to find out if the name contains
# "${ant.project.name}". If it does, we need to replace that
# name with the name of the build.xml project entity name.
#
my $ivy_jar_name;
if ( not exists $jar_macro->{"pom.artifact.name"} ) {
$ivy_jar_name = $ivy_ref->{info}->{module};
}
else { # Name of jar is in the jar.macro task
$ivy_jar_name = $jar_macro->{"pom.artifact.name"};
my $ant_project_name = $build_ref->{name};
$ivy_jar_name =~ s/\${ant\.project\.name}/$build_xml_project_name/;
}
$jars_to_project_name{$ivy_jar_name} = $svn_project
}
}
}
#
# At this point, we now have all of the information in the ivy.xml file
# and the mapping of artifact name to the project they're in in Subversion.
#
# Now, we need to go through the ivy.xml files, find all com.travelclick
# artifact dependencies, and map them back to the SVN projects.
#
#
# A Hashes of Arrays. This will be keyed by BASE JAR svn project. The array
# will be a list of all the other projects that depend upon that BASE JAR svn
# project.
#
my %project_dependencies;
say "MAPPING IVY.XML back to the dependent projects.";
for my $project ( sort keys %ivy_projects ) {
say "On $project";
my $ivy_ref = $ivy_projects{$project}->Ivy_ref;
my $dependencies_ref = $ivy_ref->{dependencies}->{dependency};
for my $dependency ( sort keys %{ $dependencies_ref } ) {
next unless exists $dependencies_ref->{$dependency}->{org};
next unless $dependencies_ref->{$dependency}->{org} eq 'com.travelclick';
#
# This is a TravelClick Dependency. Map this back to the SVN Project
# which produced this jar.
#
# We now have the SVN project that contained the dependent jar and the
# svn project that mentions that jar in the ivy.xml project.
#
my $svn_project = $jars_to_project_name{$dependency};
next if not $svn_project;
if ( not exists $project_dependencies{$svn_project} ) {
$project_dependencies{$svn_project} = {};
}
$project_dependencies{$svn_project}->{$project} = 1;
}
}
for my $project ( sort { lc $a cmp lc $b } keys %project_dependencies ) {
printf "%-30.30s", "$project - ";
say join ( "-$branch,", sort { lc $a cmp lc $b } keys %{ $project_dependencies{$project} } ) . "-$branch";
}
package Local::Project;
use XML::Simple;
sub new {
my $class = shift;
my $project_url = shift;
my $self = {};
bless $self, $class;
$self->Url($project_url);
return $self;
}
sub Url {
my $self = shift;
my $url = shift;
if ( defined $url ) {
$self->{URL} = $url;
}
return $self->{URL};
}
sub Ivy_xml {
my $self = shift;
my $ivy_xml = shift;
if ( defined $ivy_xml ) {
$self->{IVY_XML} = $ivy_xml;
$self->Ivy_ref($ivy_xml); #Generate the ref structure while you're at it.
}
return $self->{IVY_XML};
}
sub Build_xml {
my $self = shift;
my $build_xml = shift;
if ( defined $build_xml ) {
$self->{BUILD_XML} = $build_xml;
$self->Build_ref($build_xml); #Generate the ref structure while you're at it.
}
return $self->{BUILD_XML};
}
sub Ivy_ref {
my $self = shift;
my $ivy_xml = shift;
if ( defined $ivy_xml ) {
$self->{IVY_REF} = XMLin("$ivy_xml");
}
return $self->{IVY_REF};
}
sub Build_ref {
my $self = shift;
my $build_xml = shift;
if ( defined $build_xml ) {
$self->{BUILD_REF} = XMLin("$build_xml");
}
return $self->{BUILD_REF};
}
=pod
=head1 NAME
find_dependencies.pl
=head1 SYNOPSIS
$ find_dependencies.pl [ <branch> ]
Where:
=over 4
=item *
<branch> - Name of the branch. If not given, it is assumed to be trunk.
=back
=head1 DESCRIPTION
This program goes through the Subversion repository looking for
projects that have an C<ivy.xml> file in them. If this file is found,
the project is parsed to see what jars produced by TravelClick it is
dependent upon, and what jars it produces.
Once all of the Subversion projects are parsed. The jars are listed
along with their dependent projects.
The purpose of this program is to build a list of projects that should
be automatically rebuilt when a Jar is promoted to the Maven repository.
=head1 PERL MODULE DEPENDENCIES
This project is dependent upon the following Perl modules that must be
installed before this program can be executed:
=over 4
=item XML::Simple
=back
This project must also be executed on Perl 5.12 or greater.
=head1 BUGS
=over 4
=item *
The list assumes Jenkins projects are named after the Subversion
project with the branch or trunk tacked on to the end.
=back
