I have a Neo4j graph (created using Gremlin), and I'd like to query it through Gremlin as well; however, Cypher queries on the graph do not seem to work:
import org.apache.tinkerpop.gremlin
import gremlin.neo4j.structure.Neo4jGraph
import gremlin.tinkergraph.structure.TinkerGraph
import gremlin.hadoop.structure.HadoopGraph
import org.apache.commons.configuration.Configuration
trait Graph[G] {
  def create(location: String, args: Configuration = null): G
}

object Grapher {
  implicit val createNeo4j = new Graph[Neo4jGraph] {
    def create(location: String, args: Configuration = null) =
      if (args != null) Neo4jGraph.open(args) else Neo4jGraph.open(location)
  }
  implicit val createTinkerGraph = new Graph[TinkerGraph] {
    def create(location: String, args: Configuration = null) =
      if (args != null) TinkerGraph.open(args) else TinkerGraph.open()
  }
  implicit val createHadoopGraph = new Graph[HadoopGraph] {
    def create(location: String, args: Configuration = null) =
      if (args != null) HadoopGraph.open(args) else HadoopGraph.open(location)
  }
}

object GraphSyntax {
  def createGraph[G](location: String, args: Configuration = null)(implicit graph: Graph[G]) =
    graph.create(location, args)
}
This is how I try to execute the query:
import Grapher._
import GraphSyntax._
val graph = createGraph[Neo4jGraph](fileName)
// org.apache.tinkerpop.gremlin.neo4j.structure.Neo4jGraph = neo4jgraph[community single [/media/ixaxaar/Source/src/telperion/core/neo4j.db]]
graph.cypher("match (n) return n limit 10").toList
// java.util.List[Nothing] = []
If I load the graph into a Neo4j server, the same query works in the Neo4j web console:
match (n) return n limit 10
I'm using the following libs:
final val Gremlin = "3.2.1"
final val Neo4jTinkerpop = "0.4-3.0.3"
val gremlinCore = "org.apache.tinkerpop" % "gremlin-core" % Version.Gremlin
val gremlinGiraph = "org.apache.tinkerpop" % "giraph-gremlin" % Version.Gremlin
val gremlinNeo4j = "org.apache.tinkerpop" % "neo4j-gremlin" % Version.Gremlin
val hadoopGremlin = "org.apache.tinkerpop" % "hadoop-gremlin" % Version.Gremlin
val tinkergraphGremlin = "org.apache.tinkerpop" % "tinkergraph-gremlin" % Version.Gremlin
val neo4jTinkerpop = "org.neo4j" % "neo4j-tinkerpop-api-impl" % Version.Neo4jTinkerpop
Very sorry for this: it turns out I was opening a new graph (fileName was incorrect).
scala> graph.cypher("match (n) return n limit 10").toList
res3: java.util.List[Nothing] = [{n=v[0]}, {n=v[1]}, {n=v[2]}, {n=v[3]}, {n=v[4]}, {n=v[5]}, {n=v[6]}, {n=v[7]}, {n=v[8]}, {n=v[9]}]
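For anyone hitting the same symptom: Neo4jGraph.open(location) happily creates a brand-new, empty database when nothing exists at that path, so an empty Cypher result can simply mean the path is wrong. A minimal guard (a sketch; openExisting is a hypothetical helper):
import java.nio.file.{Files, Paths}

// Fail fast on a bad path instead of silently opening a fresh, empty graph.
def openExisting(fileName: String): Neo4jGraph = {
  require(Files.exists(Paths.get(fileName)), s"no Neo4j database at $fileName")
  Neo4jGraph.open(fileName)
}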
final_map = ["/7amd64-Aug2022.1":"2022-08-09","/7amd64-Oct2022.1":"2022-10-12","/7":"2022-11-08","/7amd64-Jul2022.1":"2022-07-12","/7amd64":"2022-11-08","/7amd64-June2022.1":"2022-06-14","/7amd64-beta":"2022-11-08","/7amd64-Sep2022.1":"2022-09-14","/7amd64-Nov2022.1":"2022-11-08","/_uploads":"2022-11-08"]
Jenkins Pipeline (below is the code I have, which is not working):
result = final_map.sort { a,b -> a.value <=> b.value }
echo "Output: ${result}"
I'm expecting to sort the map by date (the value).
You can use a custom comparator for this. Check the following Groovy code.
final_map = ["/7amd64-Aug2022.1":"2022-08-09","/7amd64-Oct2022.1":"2022-10-12","/7":"2022-11-08","/7amd64-Jul2022.1":"2022-07-12","/7amd64":"2022-11-08","/7amd64-June2022.1":"2022-06-14","/7amd64-beta":"2022-11-08","/7amd64-Sep2022.1":"2022-09-14","/7amd64-Nov2022.1":"2022-11-08","/_uploads":"2022-11-08"]
// Map.sort() returns a new sorted map and leaves the original unchanged,
// so assign the result before printing.
final_map = final_map.sort { s1, s2 ->
    def s1Date = new Date(s1.value.replace('-', '/'))
    def s2Date = new Date(s2.value.replace('-', '/'))
    if (s1Date.before(s2Date)) {
        return -1
    } else {
        return 1
    }
}
println final_map
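Since the dates are already in ISO yyyy-MM-dd form, lexicographic order matches chronological order, so no Date parsing is strictly needed; note, though, that in a Jenkins Pipeline a sort with a closure generally has to run inside a @NonCPS method. A minimal sketch (the helper name sortByDate is mine):
@NonCPS
def sortByDate(Map m) {
    // ISO yyyy-MM-dd strings sort in date order, so plain string comparison works
    m.sort { a, b -> a.value <=> b.value }
}

echo "Output: ${sortByDate(final_map)}"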
I'm trying to convert a predicted RasterFrameLayer in RasterFrames into a GeoTiff file after training a machine learning model.
When using the demo data Elkton-VA from RasterFrames, it works fine.
But when using a cropped Sentinel-2A tif with an NDVI index (normalized from -1000 to 1000), it fails with a NullPointerException in the toRaster step.
It feels like it's due to the NoData values outside the ROI.
The test data is here: geojson and log.
GeoTrellis version: 3.3.0
RasterFrames version: 0.9.0
import geotrellis.proj4.LatLng
import geotrellis.raster._
import geotrellis.raster.io.geotiff.{MultibandGeoTiff, SinglebandGeoTiff}
import geotrellis.raster.io.geotiff.reader.GeoTiffReader
import geotrellis.raster.render.{ColorRamp, ColorRamps, Png}
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.classification.DecisionTreeClassifier
import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.ml.tuning.{CrossValidator, ParamGridBuilder}
import org.apache.spark.sql._
import org.locationtech.rasterframes._
import org.locationtech.rasterframes.ml.{NoDataFilter, TileExploder}
object ClassificiationRaster extends App {

  def readTiff(name: String) = GeoTiffReader.readSingleband(getClass.getResource(s"/$name").getPath)

  def readMtbTiff(name: String): MultibandGeoTiff = GeoTiffReader.readMultiband(getClass.getResource(s"/$name").getPath)

  implicit val spark = SparkSession.builder()
    .master("local[*]")
    .appName(getClass.getName)
    .withKryoSerialization
    .getOrCreate()
    .withRasterFrames

  import spark.implicits._

  val filenamePattern = "xiangfuqu_202003_mask_%s.tif"
  val bandNumbers = "ndvi".split(",").toSeq
  val bandColNames = bandNumbers.map(b ⇒ s"band_$b").toArray
  val tileSize = 256

  val joinedRF: RasterFrameLayer = bandNumbers
    .map { b ⇒ (b, filenamePattern.format(b)) }
    .map { case (b, f) ⇒ (b, readTiff(f)) }
    .map { case (b, t) ⇒ t.projectedRaster.toLayer(tileSize, tileSize, s"band_$b") }
    .reduce(_ spatialJoin _)
    .withCRS()
    .withExtent()

  val tlm = joinedRF.tileLayerMetadata.left.get
  // println(tlm.totalDimensions.cols)
  // println(tlm.totalDimensions.rows)
  joinedRF.printSchema()

  val targetCol = "label"

  val geojsonPath = "/Users/ethan/work/data/L2a10m4326/zds/test.geojson"
  spark.sparkContext.addFile(geojsonPath)
  import org.locationtech.rasterframes.datasource.geojson._

  val jsonDF: DataFrame = spark.read.geojson.load(geojsonPath)
  val label_df: DataFrame = jsonDF
    .select($"CLASS_ID", st_reproject($"geometry", LatLng, LatLng).alias("geometry"))
    .hint("broadcast")

  val df_joined = joinedRF.join(label_df, st_intersects(st_geometry($"extent"), $"geometry"))
    .withColumn("dims", rf_dimensions($"band_ndvi"))

  val df_labeled: DataFrame = df_joined.withColumn(
    "label",
    rf_rasterize($"geometry", st_geometry($"extent"), $"CLASS_ID", $"dims.cols", $"dims.rows")
  )

  df_labeled.printSchema()

  val tmp = df_labeled.filter(rf_tile_sum($"label") > 0).cache()

  val exploder = new TileExploder()
  val noDataFilter = new NoDataFilter().setInputCols(bandColNames :+ targetCol)

  val assembler = new VectorAssembler()
    .setInputCols(bandColNames)
    .setOutputCol("features")

  val classifier = new DecisionTreeClassifier()
    .setLabelCol(targetCol)
    .setFeaturesCol(assembler.getOutputCol)

  val pipeline = new Pipeline()
    .setStages(Array(exploder, noDataFilter, assembler, classifier))

  val evaluator = new MulticlassClassificationEvaluator()
    .setLabelCol(targetCol)
    .setPredictionCol("prediction")
    .setMetricName("f1")

  val paramGrid = new ParamGridBuilder()
    //.addGrid(classifier.maxDepth, Array(1, 2, 3, 4))
    .build()

  val trainer = new CrossValidator()
    .setEstimator(pipeline)
    .setEvaluator(evaluator)
    .setEstimatorParamMaps(paramGrid)
    .setNumFolds(4)

  val model = trainer.fit(tmp)

  val metrics = model.getEstimatorParamMaps
    .map(_.toSeq.map(p ⇒ s"${p.param.name} = ${p.value}"))
    .map(_.mkString(", "))
    .zip(model.avgMetrics)
  metrics.toSeq.toDF("params", "metric").show(false)

  val scored = model.bestModel.transform(joinedRF)

  scored.groupBy($"prediction" as "class").count().show
  scored.show(20)

  val retiled: DataFrame = scored.groupBy($"crs", $"extent").agg(
    rf_assemble_tile(
      $"column_index", $"row_index", $"prediction",
      tlm.tileCols, tlm.tileRows, IntConstantNoDataCellType
    )
  )

  val rf: RasterFrameLayer = retiled.toLayer(tlm)

  val raster: ProjectedRaster[Tile] = rf.toRaster($"prediction", 5848, 4189)

  SinglebandGeoTiff(raster.tile, tlm.extent, tlm.crs).write("/Users/ethan/project/IdeaProjects/learn/spark_ml_learn.git/src/main/resources/easy_b1.tif")

  val clusterColors = ColorRamp(
    ColorRamps.Viridis.toColorMap((0 until 1).toArray).colors
  )

  // val pngBytes = retiled.select(rf_render_png($"prediction", clusterColors)).first // It can output the png.
  // retiled.tile.renderPng(clusterColors).write("/Users/ethan/project/IdeaProjects/learn/spark_ml_learn.git/src/main/resources/classified2.png")
  // Png(pngBytes).write("/Users/ethan/project/IdeaProjects/learn/spark_ml_learn.git/src/main/resources/classified2.png")

  spark.stop()
}
I suspect there is a bug in the way the toLayer extension method works; I will follow up with a bug report to the RasterFrames project. That will take a little more effort, I suspect.
Here is a possible workaround that is a little lower level. In this case it results in 25 non-overlapping GeoTiffs written out.
import geotrellis.store.hadoop.{SerializableConfiguration, _}
import geotrellis.spark.Implicits._
import org.apache.hadoop.fs.Path

// Need this to write local files from spark
val hconf = SerializableConfiguration(spark.sparkContext.hadoopConfiguration)

ContextRDD(
  rf.toTileLayerRDD($"prediction")
    .left.get
    .filter {
      case (_: SpatialKey, null) ⇒ false // remove any null Tiles
      case _ ⇒ true
    },
  tlm)
  .regrid(1024) // Regrid the Tiles so that they are 1024 x 1024
  .toGeoTiffs()
  .foreach { case (sk: SpatialKey, gt: SinglebandGeoTiff) ⇒
    val path = new Path(new Path("file:///tmp/output"), s"${sk.col}_${sk.row}.tif")
    gt.write(path, hconf.value)
  }
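If a single output file is needed instead of 25 tiles, the same RDD can in principle be stitched into one raster on the driver first (a sketch under the same imports, additionally assuming GeoTrellis's stitch implicits are in scope via geotrellis.spark._; this collects the whole layer into driver memory, so it only suits modest layer sizes):
import geotrellis.spark._

// Mosaic the prediction layer into a single tile on the driver, then write one GeoTiff.
val stitched: Raster[Tile] = ContextRDD(
  rf.toTileLayerRDD($"prediction").left.get
    .filter { case (_, tile) ⇒ tile != null },
  tlm
).stitch()

SinglebandGeoTiff(stitched.tile, stitched.extent, tlm.crs).write("/tmp/output/prediction.tif")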
In my case, the input is a List<List<Float>> (a list of word-representation vectors), and the output is a single Double per sequence.
So I build the following structure (first index: example number; second: sentence item number; third: word-vector element number): http://pastebin.com/KGdjwnki
And the output: http://pastebin.com/fY8zrxEL
But when I pass one of these (http://pastebin.com/wvFFC4Hw) to model.output, I get the vector [0.25, 0.24, 0.25, 0.25], not a single value.
What could be wrong? The code is attached (in Kotlin); classCount is one.
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork
import org.deeplearning4j.nn.conf.NeuralNetConfiguration.Builder
import org.deeplearning4j.nn.api.OptimizationAlgorithm
import org.deeplearning4j.nn.conf.Updater
import org.deeplearning4j.nn.weights.WeightInit
import org.deeplearning4j.nn.conf.layers.GravesLSTM
import org.deeplearning4j.nn.conf.layers.RnnOutputLayer
import org.deeplearning4j.nn.conf.BackpropType
import org.nd4j.linalg.api.ndarray.INDArray
import org.nd4j.linalg.cpu.nativecpu.NDArray
import org.nd4j.linalg.indexing.NDArrayIndex
import org.nd4j.linalg.factory.Nd4j
import org.nd4j.linalg.lossfunctions.LossFunctions
import java.util.*
class ClassifierNetwork(wordVectorSize: Int, classCount: Int) {

    data class Dimension(val x: Array<Int>, val y: Array<Int>)

    val model: MultiLayerNetwork
    val optimization = OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT
    val iterations = 1
    val learningRate = 0.1
    val rmsDecay = 0.95
    val seed = 12345
    val l2 = 0.001
    val weightInit = WeightInit.XAVIER
    val updater = Updater.RMSPROP
    val backpropType = BackpropType.TruncatedBPTT
    val tbpttLength = 50
    val epochs = 50

    var dimensions = Dimension(intArrayOf(0).toTypedArray(), intArrayOf(0).toTypedArray())

    init {
        val baseConfiguration = Builder().optimizationAlgo(optimization)
            .iterations(iterations).learningRate(learningRate).rmsDecay(rmsDecay).seed(seed).regularization(true).l2(l2)
            .weightInit(weightInit).updater(updater)
            .list()
        baseConfiguration.layer(0, GravesLSTM.Builder().nIn(wordVectorSize).nOut(64).activation("tanh").build())
        baseConfiguration.layer(1, GravesLSTM.Builder().nIn(64).nOut(32).activation("tanh").build())
        baseConfiguration.layer(2, GravesLSTM.Builder().nIn(32).nOut(16).activation("tanh").build())
        baseConfiguration.layer(3, RnnOutputLayer.Builder().lossFunction(LossFunctions.LossFunction.MCXENT)
            .activation("softmax").weightInit(WeightInit.XAVIER).nIn(16).nOut(classCount).build())
        val cfg = baseConfiguration.build()!!
        cfg.backpropType = backpropType
        cfg.tbpttBackLength = tbpttLength
        cfg.tbpttFwdLength = tbpttLength
        cfg.isPretrain = false
        cfg.isBackprop = true
        model = MultiLayerNetwork(cfg)
    }
    private fun dataDimensions(x: List<List<Array<Double>>>, y: List<Array<Double>>): Dimension {
        assert(x.size == y.size)
        val exampleCount = x.size
        assert(x.size > 0)
        val sentenceLength = x[0].size
        assert(sentenceLength > 0)
        val wordVectorLength = x[0][0].size
        assert(wordVectorLength > 0)
        val classCount = y[0].size
        assert(classCount > 0)
        return Dimension(
            intArrayOf(exampleCount, wordVectorLength, sentenceLength).toTypedArray(),
            intArrayOf(exampleCount, classCount).toTypedArray()
        )
    }

    data class Fits(val x: INDArray, val y: INDArray)

    private fun fitConversion(x: List<List<Array<Double>>>, y: List<Array<Double>>): Fits {
        val dim = dataDimensions(x, y)
        val xItems = ArrayList<INDArray>()
        for (i in 0..dim.x[0] - 1) {
            val itemList = ArrayList<DoubleArray>()
            for (j in 0..dim.x[1] - 1) {
                var rowList = ArrayList<Double>()
                for (k in 0..dim.x[2] - 1) {
                    rowList.add(x[i][k][j])
                }
                itemList.add(rowList.toTypedArray().toDoubleArray())
            }
            xItems.add(Nd4j.create(itemList.toTypedArray()))
        }
        val xFits = Nd4j.create(xItems, dim.x.toIntArray(), 'c')
        val yItems = ArrayList<DoubleArray>()
        for (i in 0..y.size - 1) {
            yItems.add(y[i].toDoubleArray())
        }
        val yFits = Nd4j.create(yItems.toTypedArray())
        return Fits(xFits, yFits)
    }

    private fun error(epoch: Int, x: List<List<Array<Double>>>, y: List<Array<Double>>) {
        var totalDiff = 0.0
        for (i in 0..x.size - 1) {
            val source = x[i]
            val result = y[i]
            val realResult = predict(source)
            var diff = 0.0
            for (j in 0..result.size - 1) {
                val elementDiff = result[j] - realResult[j]
                diff += Math.pow(elementDiff, 2.0)
            }
            diff = Math.sqrt(diff)
            totalDiff += Math.pow(diff, 2.0)
        }
        totalDiff = Math.sqrt(totalDiff)
        print("Epoch ")
        print(epoch)
        print(", diff ")
        println(totalDiff)
    }

    fun train(x: List<List<Array<Double>>>, y: List<Array<Double>>) {
        dimensions = dataDimensions(x, y)
        val (xFit, yFit) = fitConversion(x, y)
        for (i in 0..epochs - 1) {
            model.input = xFit
            model.labels = yFit
            model.fit()
            error(i + 1, x, y)
        }
    }
    fun predict(x: List<Array<Double>>): Array<Double> {
        val xList = ArrayList<DoubleArray>()
        for (i in 0..dimensions.x[1] - 1) {
            var row = ArrayList<Double>()
            for (j in 0..dimensions.x[2] - 1) {
                row.add(x[j][i])
            }
            xList.add(row.toDoubleArray())
        }
        val xItem = Nd4j.create(xList.toTypedArray())
        val y = model.output(xItem)
        val result = ArrayList<Double>()
        // Copy the network output into the result element by element.
        for (j in 0..y.length() - 1) {
            result.add(y.getDouble(j))
        }
        return result.toTypedArray()
    }
}
Update: it seems the following example has a similar task, so I'll check it later and post a solution: https://github.com/deeplearning4j/dl4j-0.4-examples/blob/master/dl4j-examples/src/main/java/org/deeplearning4j/examples/recurrent/word2vecsentiment/Word2VecSentimentRNN.java
LSTM input/output can only be rank 3; see:
http://deeplearning4j.org/usingrnns
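To make the required layout concrete, here is a minimal sketch (in the question's Kotlin, reusing its wordVectorSize, sentenceLength, and classCount names) of rank-3 feature and label arrays with shape [miniBatchSize, size, timeSeriesLength]:
import org.nd4j.linalg.factory.Nd4j

// Features: one example, wordVectorSize values per step, sentenceLength time steps
val features = Nd4j.zeros(1, wordVectorSize, sentenceLength)
// Labels for an RnnOutputLayer are rank 3 as well: [miniBatch, nOut, timeSteps]
val labels = Nd4j.zeros(1, classCount, sentenceLength)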
Next to the recommendation to post this in the very active Gitter channel, and Adam's hint to check out the great documentation (which explains how to set up rank-3 input and output), I want to point out a few other things in your code, as I was struggling with similar problems:
Check out the basic example in examples/recurrent/basic/BasicRNNExample.java; there you can see that for an RNN you don't use model.output(xItem) but model.rnnTimeStep(xItem).
With a class count of one you seem to be performing regression; for that, also check out the regression example at examples/feedforward/regression/RegressionSum.java and the documentation. There you can see that you should use "identity" as the activation function. "softmax" actually normalizes the output to sum to one (see the glossary), so if you have just one output it will always emit 1 (at least it did for my problem).
Not sure if I understand your requirements correctly, but if you want a single output (that is, to predict a number, i.e. regression), you usually go with identity activation and MSE loss. You've used softmax, which is usually used in classification.
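As an illustration, here is a minimal sketch (assuming the same string-based DL4J builder API the question's code uses) of an output layer configured for single-value regression instead:
import org.deeplearning4j.nn.conf.layers.RnnOutputLayer
import org.nd4j.linalg.lossfunctions.LossFunctions

// Identity activation + MSE loss, one output unit. A softmax over a single
// unit always emits 1.0, which is why the original setup cannot regress.
val regressionOutput = RnnOutputLayer.Builder()
    .lossFunction(LossFunctions.LossFunction.MSE)
    .activation("identity")
    .nIn(16)
    .nOut(1)
    .build()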
I am not sure if this is the right place to post this, but I have been trying to code up a simple decision tree class for a while and am getting lost at various points.
Specifically, I'm not sure what kind of data structure would represent a recursive tree that uses (feature, value) pairs as nodes (a sketch of one option follows the code below).
from collections import Counter

import numpy as np


class DecisionTree():
    def entropy(self, data):
        # data is a 2-D numpy array whose last column holds the class labels.
        # An empty/singleton region is treated as pure (sentinel value 1,
        # which create_tree uses as its stopping condition).
        if len(data) <= 1:
            return 1
        target_col = data[:, -1]
        size = float(len(target_col))
        classes = Counter(target_col)
        # a single class is also pure
        if len(classes) == 1:
            return 1
        probs = [count / size for count in classes.values()]
        return np.sum([-p * np.log(p) for p in probs])

    def what_to_split_on(self, data):
        # Return the (feature, value) pair with the largest information gain,
        # or None if no split improves on the base entropy.
        split_feature = -1
        split_val = None
        best_gain = 0.0
        base_entropy = self.entropy(data)
        for f, feature in enumerate(data.T):
            unique_vals = list(set(feature))
            for val in unique_vals:
                left, right = self.split(data, f, val)  # split() needs the data too
                prop_left = float(len(left)) / (len(left) + len(right))
                prop_right = 1 - prop_left
                gain = base_entropy - prop_left * self.entropy(left) - prop_right * self.entropy(right)
                if gain > best_gain:
                    best_gain = gain
                    split_feature = f
                    split_val = val
        if split_feature != -1:
            return split_feature, split_val
        return None

    def split(self, data, f, val):
        left = np.array([row for row in data if row[f] == val])
        right = np.array([row for row in data if row[f] != val])
        return left, right

    def create_tree(self, data):
        if self.entropy(data) == 1:
            return None
        best = self.what_to_split_on(data)
        if best is None:
            return None
        feature, value = best
        dt = Tree(feature, value)  # Tree is the node structure the question asks about
        left_child = np.array([row for row in data if row[feature] == value])
        # the right child takes the complement rows (row[feature] != value)
        right_child = np.array([row for row in data if row[feature] != value])
        dt.insert_left(self.create_tree(left_child))
        dt.insert_right(self.create_tree(right_child))
        return dt
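On the data-structure question itself: a common shape (a sketch with hypothetical names, not the only option) is a small node class that stores the (feature, value) pair plus two children, with leaves carrying a prediction instead:
class Node:
    """One tree node: a (feature, value) split, or a leaf with a prediction."""
    def __init__(self, feature=None, value=None, prediction=None):
        self.feature = feature        # column index tested at this node
        self.value = value            # value compared against row[feature]
        self.prediction = prediction  # set only on leaves
        self.left = None              # subtree where row[feature] == value
        self.right = None             # subtree where row[feature] != value

    def is_leaf(self):
        return self.prediction is not None

    def predict(self, row):
        # Walk down until a leaf is reached.
        if self.is_leaf():
            return self.prediction
        branch = self.left if row[self.feature] == self.value else self.right
        return branch.predict(row)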
I use this but I can't connect to the database:
conn1.Open();
using (OracleCommand crtCommand = new OracleCommand())
To connect to Oracle from an application:
string command = "Enter your command";
OracleConnection orclecon;
orclecon = new OracleConnection(connection);
orclecon.Open();
Use this for SELECT commands:
DataSet ds = new DataSet();
using (OracleDataAdapter Oda = new OracleDataAdapter(command, orclecon))
{
    Oda.Fill(ds);
}
And use this for INSERT/UPDATE/DELETE commands:
// For an Oracle command (insert, update, delete): return true if the number
// of affected rows is > 0, otherwise return false.
using (OracleCommand orclcommand = new OracleCommand(command, orclecon))
{
    int n = orclcommand.ExecuteNonQuery();
    if (n > 0)
        return true;
    else
        return false;
}
orclecon.Close();
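Putting the pieces together, here is a consolidated sketch (the connection string is a placeholder, and the OracleConnection/OracleCommand types come from whichever Oracle ADO.NET provider your project already references); wrapping the connection in using guarantees it is closed even if an exception is thrown:
string connection = "Data Source=MyOracleDb;User Id=myUser;Password=myPassword;";

public bool ExecuteCommand(string command)
{
    using (OracleConnection orclecon = new OracleConnection(connection))
    {
        orclecon.Open();
        using (OracleCommand orclcommand = new OracleCommand(command, orclecon))
        {
            // true when at least one row was affected
            return orclcommand.ExecuteNonQuery() > 0;
        }
    } // connection closed and disposed here
}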
To make it dynamic, use this:
string sentence = "";
string formatprototype = ""; // This will hold the string to be formatted.
string output = "";

public void SearchString()
{
    string pattern = @".*[ ]+?[\""]{1}(?<String>[a-zA-Z0-9_]*)[\""]{1}[ ]+?MINVALUE[ ]*(?<MinValue>[-?\d]*)[ ]*MAXVALUE[ ]*(?<MaxValue>[\d]*)[ ]+?[INCREMENT]*[ ]+?[BY]*[ ]+?(?<IncrementBy>[\d]*)[ ]+?[START]*[ ]+?[WITH]*[ ]+?(?<StartWith>[\d]*)[ ]+?[CACHE]*[ ]+?(?<Cache>[\d]*)\s+?";
    Regex regex = new Regex(pattern);
    Match match = regex.Match(sentence);
    Group @string = match.Groups[1];
    Group minvalue = match.Groups[2];
    Group maxvalue = match.Groups[3];
    Group incrementby = match.Groups[4];
    Group startswith = match.Groups[5];
    Group cache = match.Groups[6];
    formatprototype = @"CREATE SEQUENCE ""{0}"" MINVALUE {1} MAXVALUE {2} INCREMENT BY {3} START WITH {4} CACHE {5} NOORDER NOCYCLE";
    if (minvalue.Value.StartsWith("-"))
    {
        output = string.Format(formatprototype, @string, minvalue, maxvalue, incrementby, maxvalue, cache);
    }
    else if (!minvalue.Value.StartsWith("-"))
    {
        output = string.Format(formatprototype, @string, minvalue, maxvalue, incrementby, minvalue, cache);
    }
    MessageBox.Show(output);
}
Assume that SearchString() is the function in which you are doing this. Make sure to assign each string extracted from the database to sentence. Try it and reply whether it worked or not.
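For example (a hypothetical sequence DDL assigned to sentence before the call; note the trailing space, which the pattern's final \s+? expects):
sentence = @"CREATE SEQUENCE ""ORDER_SEQ"" MINVALUE 1 MAXVALUE 999999 INCREMENT BY 1 START WITH 100 CACHE 20 ";
SearchString(); // shows the rebuilt CREATE SEQUENCE statement in a MessageBox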