JUNG Pagerank edge weight issue - jung

I am new to Gremlin and I am trying to implement the pagerank algorithm with edge weights from JUNG. These are the steps I have taken. I have 2.0.0.0 version of Gremlin installed. I have created a .graphml file using the iGraph package in R, which I am loading into gremlin.
import edu.uci.ics.jung.algorithms.scoring.PageRank
g1 = new TinkerGraph()
g1.loadGraphML('file path.graphml')
My g1 graph has the following edge attributes:
g1.E.map
==>{weight=1, freq=1}
==>{weight=1, freq=1}
==>{weight=2, freq=2}
==>{weight=1, freq=1}
==>{weight=1, freq=1}
==>{weight=1, freq=1}
==>{weight=1, freq=1}
==>{weight=1, freq=1}
==>{weight=2, freq=2}
gremlin> g1.V.map
==>{name=a}
==>{name=b}
==>{name=c}
==>{name=d}
==>{name=e}
==>{name=f}
==>{name=g}
==>{name=h}
==>{name=i}
==>{name=k}
j = new GraphJung(g1)
t = new EdgeWeightTransformer("weight",true, false)
pr = new PageRank<Vertex,Edge>(j, t, 0.15d)
pr.evaluate()
j.getVertices().collect{[it, pr.getVertexScore(it)]}
However, my results are
==>[v[n1], 0.046875]
==>[v[n0], 0.046875]
==>[v[n5], 0.046875]
==>[v[n4], 0.046875]
==>[v[n3], 0.046875]
==>[v[n2], 0.046875]
==>[v[n9], 0.046875]
==>[v[n8], 0.046875]
==>[v[n7], 0.046875]
==>[v[n6], 0.046875]
which are incorrect. Can someone please help me understand what is wrong in the code? I also tried to check the effect of the transformer on the edge weights of j with the following:
j.getEdges().t.
I get NULLS when I do this.
But I know there are weights associated with these edges as when I run:
j.getEdges().collect{[it, it.weight]}
I get the following results:
==>[e[3][n1-_default->n5], 1]
==>[e[2][n0-_default->n4], 1]
==>[e[1][n0-_default->n3], 2]
==>[e[0][n0-_default->n2], 1]
==>[e[7][n1-_default->n8], 1]
==>[e[6][n1-_default->n7], 1]
==>[e[5][n1-_default->n6], 1]
==>[e[4][n1-_default->n1], 1]
==>[e[8][n1-_default->n9], 2]
Finally, I am unable to create automatic keys for my vertices. I tried
g1.createAutoIndex('test', Vertex.class, ['name'] as Set)
And got the following error:
No signature of method: groovy.lang.MissingMethodException.createAutoIndex() is applicable for argument types: () values: []
Thank you

I also had a tough time finding out how to implement PageRank on a graph with weighted edges using JUNG.
Below is the pseudocode; you should use GrepCode to see the detailed implementation of PageRank.
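for (Edge e : edges) { // Edge is a user-defined class
    graph.addEdge(edgeCount, e.getStart(), e.getEnd()); // the integer edgeCount is the edge id
    map.put(edgeCount, e.getWeight()); // map is a HashMap keyed by edge id
    edgeCount++;
}
Transformer edge_weights = MapTransformer.getInstance(map); // Key step!
// The edge type is Integer here because edges were added under their integer id
PageRank<Vertex, Integer> ranker = new PageRank<Vertex, Integer>(graph, edge_weights, alpha);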
I would recommend you look this example:
https://github.com/lintool/Cloud9/blob/master/src/dist/edu/umd/cloud9/example/pagerank/SequentialPageRank.java
You can modify based on this example using my pseudo code.
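To make the role of the edge-weight map concrete, here is a minimal sketch of weighted PageRank in plain Python. It is not JUNG's implementation: the function name, the fixed iteration count, and the uniform handling of vertices without outgoing edges are all assumptions chosen for illustration. As with the 0.15 passed in the question, alpha is treated as the random-jump probability, and each vertex's outgoing weights are normalized so that they sum to 1.

```python
def weighted_pagerank(edges, alpha=0.15, iterations=100):
    """Power-iteration PageRank over (source, target, weight) triples.

    alpha is the random-jump probability. Hypothetical helper for
    illustration only; this is not how JUNG implements PageRank.
    """
    nodes = sorted({n for s, t, _ in edges for n in (s, t)})
    # Total outgoing weight per vertex, used to normalize edge weights.
    out_weight = {n: 0.0 for n in nodes}
    for s, _, w in edges:
        out_weight[s] += w

    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Assumption: rank held by vertices with no outgoing edges
        # is spread uniformly over all vertices.
        dangling = sum(rank[n] for n in nodes if out_weight[n] == 0.0)
        base = (alpha + (1.0 - alpha) * dangling) / len(nodes)
        new_rank = {n: base for n in nodes}
        for s, t, w in edges:
            # A vertex passes rank along each edge in proportion to its weight.
            new_rank[t] += (1.0 - alpha) * rank[s] * w / out_weight[s]
        rank = new_rank
    return rank


edges = [("a", "b", 1.0), ("a", "c", 2.0), ("b", "a", 1.0), ("c", "a", 1.0)]
ranks = weighted_pagerank(edges)
for node, score in sorted(ranks.items()):
    print(node, round(score, 4))
```

In this toy graph the edge a->c carries twice the weight of a->b, so c ends up ranked above b. If every vertex comes out with the same score, as in the question, one thing worth checking is whether the weights actually reach the algorithm.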

Related

Subgraph isomorphism (or even set membership) in Z3?

I'm trying to find a way to encode a sort of basic subgraph isomorphism in Z3 (preferably z3py). While I know there are papers on this in the abstract, finding any mechanism to do it has eluded me even for very trivial cases, because I'm very new to Z3 in general!
Suppose you have just about the most basic subgraph with nodes (0,1,2) and edges (0,1) with node 2 off on its own, and the supergraph has nodes (0,1,2) and edges (1,2) with node 0 off on its own. You could map the nodes of the subgraph into the supergraph with
0->1,
1->2,
2->0
...as one possible mapping that would satisfy "if these two nodes are connected in the subgraph, their mapped nodes are connected in the supergraph"
So okay :) I tried
from networkx import Graph
from networkx.linalg.graphmatrix import adjacency_matrix
subgraph = Graph()
subgraph.add_nodes_from([0,1,2])
subgraph.add_edges_from([(0,1)])
supergraph = Graph()
supergraph.add_nodes_from([0,1,2])
supergraph.add_edges_from([(1,2)])
s = Solver()
assignments = [Int(f'n{node}') for node in subgraph.nodes]
# each bit assignment in the subgraph belongs to one in the supergraph
assignment_constraint = [ And(assignments[i] >= 0, assignments[i] <= max(supergraph.nodes)) for i in subgraph.nodes ]
# subgraph bits can't be assigned to the same supergraph bits
assignment_distinct = [ Distinct([assignments[i] for i in subgraph.nodes])]
which just gets me as far as "each assignment from subgraph to supergraph should map a node in the subgraph to some node in the supergraph and no two subgraph nodes can be assigned to the same supergraph node"
...but then I get stuck because I keep thinking along the lines of
for edge in subgraph.edges:
    s.add( (assignments[edge[0]], assignments[edge[1]]) in supergraph.edges )
...but of course that doesn't work because pythonically those aren't the right sort of keys so that's always false or broken.
So how does one approach that? I can add constraints like "this_var == 1" but get very confused on things like checking membership, ie
>>> assignments[0] == 1.0
n0 == 1 # so that's OK then
>>> assignments[0] in [1.0, 2.0, 3.0]
False # woops, that fails horribly
and I feel like I'm missing a very basic "frame of mind" thing here.
It is relatively straightforward to encode subgraph isomorphism in z3, pretty much along the lines of how you described. However, this encoding is unlikely to scale to large graphs. As you no doubt know, subgraph isomorphism is NP-complete in general, and this encoding will cause z3 to simply enumerate all possibilities and thus will blow up exponentially.
Having said that, here's a straightforward encoding:
from z3 import *
# Subgraph, number of nodes and edges.
# Nodes will be named implicitly from 0 to noOfNodesA - 1
noOfNodesA = 3
edgesA = [(0, 1)]
# Supergraph:
noOfNodesB = 3
edgesB = [(1, 2)]
# Mapping of subgraph nodes to supergraph nodes:
mapping = Array('Map', IntSort(), IntSort())
s = Solver()
# Check that elt is between low and high, inclusive
def InRange(elt, low, high):
    return And(low <= elt, elt <= high)
# Check that (x, y) is in the list
def Contains(x, y, lst):
    return Or([And(x == x1, y == y1) for x1, y1 in lst])
# Make sure mapping is into the supergraph
s.add(And([InRange(Select(mapping, n1), 0, noOfNodesB-1) for n1 in range(noOfNodesA)]))
# Make sure we map nodes to distinct nodes
s.add(Distinct([Select(mapping, n1) for n1 in range(noOfNodesA)]))
# Make sure edges are preserved:
for x, y in edgesA:
    s.add(Contains(Select(mapping, x), Select(mapping, y), edgesB))
# Solve:
r = s.check()
if r == sat:
    m = s.model()
    for x in range(noOfNodesA):
        print ("%s -> %s" % (x, m.evaluate(Select(mapping, x))))
else:
    print ("Solver said: %s" % r)
I've added comments along the way, so hopefully you should be able to read the code through; feel free to ask specific questions.
When I run this, I get:
$ python a.py
0 -> 1
1 -> 2
2 -> 0
which finds exactly the mapping you alluded to in your question.
Best of luck!
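For small graphs, the same three constraints (images stay inside the supergraph, images are distinct, edges are preserved) can be cross-checked without a solver. The following is a brute-force sketch in plain Python, not a Z3 encoding; the function name is hypothetical, and unlike the ordered-pair check in Contains above, it treats edges as undirected, which is an assumption based on the question's use of networkx Graph.

```python
from itertools import permutations


def find_subgraph_mappings(num_sub_nodes, sub_edges, num_super_nodes, super_edges):
    """Enumerate injective node mappings that preserve every subgraph edge.

    Hypothetical helper for illustration; edges are treated as undirected.
    """
    super_edge_set = {frozenset(edge) for edge in super_edges}
    mappings = []
    # permutations() yields distinct images, so range and distinctness
    # are satisfied by construction; only edge preservation is checked.
    for image in permutations(range(num_super_nodes), num_sub_nodes):
        if all(frozenset((image[x], image[y])) in super_edge_set
               for x, y in sub_edges):
            mappings.append(dict(enumerate(image)))
    return mappings


mappings = find_subgraph_mappings(3, [(0, 1)], 3, [(1, 2)])
for m in mappings:
    print(m)
```

This enumerates every candidate assignment, so it is only practical for tiny graphs, but it gives a reference answer to compare the solver's model against.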

Is there a way to get the final system of equations sent by cvxpy to the solver?

If I understand correctly, cvxpy converts our high-level problem description to the standard canonical form before it is sent to a solver.
By the standard form I mean the form that can be used for the descent algorithms, so, for instance, it would convert all the absolute values in the objective to be a difference of two positive numbers with some new constraints, etc.
I am wondering if it's possible to see what the reduction looks like for a problem I specify in cvxpy.
For instance, let's say I have the following problem:
import numpy as np
import cvxpy as cp
x = cp.Variable(2)
L = np.asarray([[1,2],[2,3]])
P = L.T @ L
constraints = []
constraints.append(x >= [-10, -10])
constraints.append(x <= [10, 10])
obj = cp.Minimize(cp.quad_form(x, P) - [1, 2] * x)
prob = cp.Problem(obj, constraints)
prob.solve(), prob.solver_stats.solver_name
(-0.24999999999999453, 'OSQP')
So, I would like to see the actual arguments (P, q, A, l, u) being sent to the OSQP solver https://github.com/oxfordcontrol/osqp-python/blob/master/module/interface.py#L278
Any help is greatly appreciated!
From looking at the documentation, it seems you can do this using the command get_problem_data as follows:
data, chain, inverse_data = prob.get_problem_data(prob.solver_stats.solver_name)
I have not tried it, and the documentation says its output depends on the particular solver and the solver chain, but it may help you!
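For reference when reading whatever data comes back, the arguments named in the question (P, q, A, l, u) are the pieces of the quadratic program form that OSQP documents:

```latex
\begin{aligned}
\min_{x} \quad & \tfrac{1}{2}\, x^{\top} P x + q^{\top} x \\
\text{s.t.} \quad & l \le A x \le u
\end{aligned}
```

Note the factor of 1/2: an objective written as x^T P x - [1, 2] x corresponds to passing 2P as the quadratic term and -[1, 2] as q, and the box constraints correspond to A = I with l = -10 and u = 10. Whether cvxpy passes the problem in exactly this shape is an assumption; the reduction may rescale terms or introduce auxiliary variables.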

Using quantile in Flux (Julia) in loss function

I am trying to use quantile in a loss function passed to train! (for some robustness, like least trimmed squares), but it mutates the array, and Zygote throws the error "Mutating arrays is not supported", coming from sort!. Below is a simple example (the content does not make sense, of course):
using Flux, StatsBase
xdata = randn(2, 100)
ydata = randn(100)
model = Chain(Dense(2,10), Dense(10, 1))
function trimmedLoss(x,y; trimFrac=0.05f0)
yhat = model(x)
absRes = abs.(yhat .- y) |> vec
trimVal = quantile(absRes, 1.f0-trimFrac)
s = sum(ifelse.(absRes .> trimVal, 0.f0 , absRes ))/(length(absRes)*(1.f0-trimFrac))
#s = sum(absRes)/length(absRes) # using this and commenting out the two above works (no surprise)
end
println(trimmedLoss(xdata, ydata)) #works ok
Flux.train!(trimmedLoss, params(model), zip([xdata], [ydata]), ADAM())
println(trimmedLoss(xdata, ydata)) #changed loss?
This is all in Flux 0.10 with Julia 1.2
Thanks in advance for any hints or workaround!
Ideally, we'd define a custom adjoint for quantile so that this works out of the box. (Feel free to open an issue to remind us to do this.)
In the meantime there's a quick workaround. It's actually the sorting that causes trouble here, so if you do quantile(xs, p, sorted=true) it'll work. Obviously this requires xs to be sorted to get correct results, so you might need to use quantile(sort(xs), ...).
Depending on your Zygote version you might also need an adjoint for sort. That one's pretty easy:
julia> using Zygote: @adjoint
julia> @adjoint function sort(x)
         p = sortperm(x)
         x[p], x̄ -> (x̄[invperm(p)],)
       end
julia> gradient(x -> quantile(sort(x), 0.5, sorted=true), [1, 2, 3, 3])
([0.0, 0.5, 0.5, 0.0],)
We'll make that built-in in the next Zygote release, but for now if you add that to your script it'll get your code working.
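The idea behind that adjoint is that sorting only permutes the entries, so the gradient of the sorted output is routed back through the inverse permutation. Below is a minimal sketch of the same logic in plain Python rather than Julia; the function name and the example input are made up for illustration, and the incoming gradient assumes the quantity of interest is the median of four values (the average of the two middle sorted entries).

```python
def sort_with_pullback(x):
    """Return sorted values plus a function mapping output gradients to input gradients.

    Hypothetical helper mirroring the sort adjoint above; not part of any library.
    """
    # perm[k] is the original position of the k-th smallest value (like sortperm).
    perm = sorted(range(len(x)), key=lambda i: x[i])
    sorted_x = [x[i] for i in perm]

    def pullback(grad_sorted):
        # Route each sorted-position gradient back to the original position.
        grad_x = [0.0] * len(x)
        for sorted_pos, original_pos in enumerate(perm):
            grad_x[original_pos] = grad_sorted[sorted_pos]
        return grad_x

    return sorted_x, pullback


x = [3.0, 1.0, 2.0, 3.0]
sorted_x, pullback = sort_with_pullback(x)
# The median of four sorted values averages the two middle entries,
# so its gradient with respect to the sorted vector is [0, 0.5, 0.5, 0].
grad_x = pullback([0.0, 0.5, 0.5, 0.0])
print(sorted_x, grad_x)
```

The gradient lands on the entries holding the values 2.0 and the first 3.0, because those are the inputs that end up in the two middle sorted positions.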

Translating an optimization problem from CVX to CVXPY?

I am attempting to translate a semidefinite programming problem from CVX to CVXPY as described here. My attempt follows:
import cvxpy as cvx
import numpy as np
c = [0, 1]
n = len(c)
# Create optimization variables.
f = cvx.Variable((n, n), hermitian=True)
# Create constraints.
constraints = [f >> 0]
for k in range(1, n):
indices = [(i * n) + i - (n - k) for i in range(n - k, n)]
constraints += [cvx.sum(cvx.vec(f)[indices]) == c[n - k]]
# Form objective.
obj = cvx.Maximize(c[0] - cvx.trace(f))
# Form and solve problem.
prob = cvx.Problem(obj, constraints)
sol = prob.solve()
print(sol)
print(f.value)
The issue here is that when I take the coefficients of the Fourier series and translate them into the array c it fails on complex values. I think this is due to a discrepancy between the maximize function of CVX and CVXPY. I'm not sure what CVX is maximizing, since the trace of the matrix is a complex value. As pointed out below the trace is real since the matrix is Hermitian, but the code still fails. Can someone with CVXPY knowledge clear this up?
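When debugging a translation like this, it can help to confirm which entries of f the flat indices actually select. The sketch below decodes them in plain Python, assuming that cvx.vec flattens in column-major order (as the CVXPY documentation describes); the helper name is hypothetical.

```python
def decode_flat_indices(n, k):
    """Map the flat indices used in the constraint loop to (row, col) pairs.

    Assumes column-major flattening, i.e. flat = col * n + row.
    Hypothetical helper for inspection only.
    """
    indices = [(i * n) + i - (n - k) for i in range(n - k, n)]
    return [(flat % n, flat // n) for flat in indices]


for k in range(1, 3):
    print(k, decode_flat_indices(3, k))
```

Under that assumption, for n = 3 the loop value k = 2 selects the first superdiagonal and k = 1 selects the single entry in the top-right corner, so each constraint sums the diagonal at offset n - k and sets it equal to c[n - k].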

How tf.gradients work in TensorFlow

Given the following linear model, I would like to get the gradient vector with respect to W and b.
# tf Graph Input
X = tf.placeholder("float")
Y = tf.placeholder("float")
# Set model weights
W = tf.Variable(rng.randn(), name="weight")
b = tf.Variable(rng.randn(), name="bias")
# Construct a linear model
pred = tf.add(tf.mul(X, W), b)
# Mean squared error
cost = tf.reduce_sum(tf.pow(pred-Y, 2))/(2*n_samples)
However, if I try something like the following, where cost is a function cost(x, y, w, b) and I only want the gradients with respect to w and b:
grads = tf.gradients(cost, tf.all_variables())
my placeholders (X and Y) will also be included.
Even if I do get a gradient for [x, y, w, b], how do I know which element belongs to which parameter, since the result is just a list with no names indicating which parameter each derivative was taken with respect to?
In this question I'm using parts of this code and I build on this question.
Quoting the docs for tf.gradients
Constructs symbolic partial derivatives of sum of ys w.r.t. x in xs.
So, this should work:
dc_dw, dc_db = tf.gradients(cost, [W, b])
Here, tf.gradients() returns the gradient of cost wrt each tensor in the second argument as a list in the same order.
Read tf.gradients for more information.
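To illustrate the ordering convention without TensorFlow, here is a small plain-Python sketch that computes the two partial derivatives of the same mean squared error cost by hand and returns them in the order [W, b]. The function names and the sample data are made up for illustration.

```python
def cost(xs, ys, w, b):
    """Mean squared error of the linear model x * w + b, halved as in the question."""
    n = len(xs)
    return sum((x * w + b - y) ** 2 for x, y in zip(xs, ys)) / (2 * n)


def gradients(xs, ys, w, b):
    """Return [d cost / d w, d cost / d b], in the same order as the parameters."""
    n = len(xs)
    residuals = [x * w + b - y for x, y in zip(xs, ys)]
    dc_dw = sum(r * x for r, x in zip(residuals, xs)) / n
    dc_db = sum(residuals) / n
    return [dc_dw, dc_db]


xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
# Unpacking follows the order of the parameter list, just like [W, b] above.
dc_dw, dc_db = gradients(xs, ys, 1.0, 0.0)
print(dc_dw, dc_db)
```

Because the result list lines up position by position with the list of parameters, no names are needed to tell the derivatives apart.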

Resources