How to load Esper queries from a config file?

I am just getting started with Esper and used http://coffeeonesugar.wordpress.com/2009/07/21/getting-started-with-esper-in-5-minutes/
as a quick-start reference. Could you point me to how to load EPL queries from a config file instead of hardcoding them?

Esper has a deployment admin API that can read EPL, for example from a file, and deploy it.
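A minimal sketch of that approach, assuming the Esper 5.x API and a hypothetical queries.epl module file (plain text containing your EPL statements, separated by semicolons):

import java.io.File;

import com.espertech.esper.client.EPServiceProvider;
import com.espertech.esper.client.EPServiceProviderManager;
import com.espertech.esper.client.deploy.DeploymentOptions;
import com.espertech.esper.client.deploy.EPDeploymentAdmin;
import com.espertech.esper.client.deploy.Module;

public class DeployFromFile {
    public static void main(String[] args) throws Exception {
        // Default engine instance; engine configuration omitted for brevity.
        EPServiceProvider epService = EPServiceProviderManager.getDefaultProvider();

        // Read an EPL module file and deploy it.
        EPDeploymentAdmin deployAdmin = epService.getEPAdministrator().getDeploymentAdmin();
        Module module = deployAdmin.read(new File("queries.epl")); // file name is an assumption
        deployAdmin.deploy(module, new DeploymentOptions());
    }
}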

Related

BigQueryIO creates one file per input line, is this correct?

I'm new to Apache Beam and I'm developing a pipeline that reads rows with JdbcIO and writes them to BigQueryIO. I'm converting the rows to Avro files with withAvroFormatFunction, but it creates a new file for each row returned by JdbcIO; the same happens with withFormatFunction and JSON files.
Running locally with the DirectRunner is very slow because it uploads a lot of files to Google Cloud Storage. Will this approach scale on Google Dataflow? Is there a better way to deal with it?
Thanks
In BigQueryIO there is an option, withNumFileShards, which controls the number of files generated when using BigQuery load jobs.
From the documentation:
Control how many file shards are written when using BigQuery load jobs. Applicable only when also setting withTriggeringFrequency(org.joda.time.Duration).
You can test your process by setting the value to 1 to see whether only one large file gets created.
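A hedged sketch of the write side with those options set (the table spec, frequency, and dispositions are placeholders, and writeTableRows() is used here for brevity rather than the format functions from the question; per the docs quoted above, this combination applies to load-job writes of unbounded input):

import com.google.api.services.bigquery.model.TableRow;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.CreateDisposition;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.WriteDisposition;
import org.apache.beam.sdk.values.PCollection;
import org.joda.time.Duration;

public class WriteWithFileShards {
    // "rows" stands in for the PCollection<TableRow> built earlier in your pipeline.
    static void writeToBigQuery(PCollection<TableRow> rows) {
        rows.apply("WriteToBigQuery",
            BigQueryIO.writeTableRows()
                .to("my-project:my_dataset.my_table")                  // placeholder table spec
                .withMethod(BigQueryIO.Write.Method.FILE_LOADS)        // use BigQuery load jobs
                .withTriggeringFrequency(Duration.standardMinutes(5))  // required for withNumFileShards;
                                                                       // applies to unbounded (streaming) input
                .withNumFileShards(1)                                  // try 1 to verify a single file per load
                .withWriteDisposition(WriteDisposition.WRITE_APPEND)
                .withCreateDisposition(CreateDisposition.CREATE_NEVER)); // table assumed to exist
    }
}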
BigQueryIO will commit results to BigQuery for each bundle. The DirectRunner is known to be a bit inefficient about bundling. It never combines bundles. So whatever bundling is provided by a source is propagated to the sink. You can try using other runners such as Flink, Spark, or Dataflow. The in-process open source runners are about as easy to use as the direct runner. Just change --runner=DirectRunner to --runner=FlinkRunner and the default settings will run in local embedded mode.

GrapheneDB Triggers Issue (https://app.graphenedb.com/)

I'm new to the Neo4j platform. Our team is working on migrating from SQL to graph. We were able to enable APOC triggers in our local environment, but after deploying to a GrapheneDB instance we could not find any configuration option or remote shell access to configure APOC triggers. I contacted their support, but there has been no reply. We are in a bit of a pickle here, so it would be a huge help if someone could provide a solution. We just want to add the single configuration line in the settings that enables triggers.
Looking at the documentation, it looks like you can enable APOC using these steps: https://docs.graphenedb.com/docs/stored-procedures#section-adding-apoc

running queries using a bash script

I need to run Neo4j on Kali Linux, start it from a bash script, and run my Cypher queries. Is that possible? If so, could you please tell me how? I didn't find anything about this. All I have done is this:
sudo apt-get install neo4j
which installed Neo4j on my Kali system. What should I do next?
To further clarify my question: I have a bash script that produces a .csv file, and I want to use that .csv file to create a graph in Neo4j. Is there a way, after creating the .csv file, for the same bash script to run Neo4j and create the graph using the query I have written for the .csv file?
Neo4j includes cypher-shell, a command line tool that you can use to connect to Neo4j and execute queries.
Rather than invoking it interactively, you can execute Cypher directly when issuing the command to run cypher-shell, pipe in a file of Cypher commands to execute, and supply parameters to use when executing the Cypher. Provided the CSV file is in an accessible location (it should be in the import folder under your Neo4j home folder), you can pass the file name as a parameter and use that parameter in the Cypher query you give to cypher-shell.
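A minimal sketch of that flow, assuming cypher-shell 3.x/4.x defaults; the paths, credentials, and the Person label are placeholders:

#!/bin/bash
# Generate the CSV with your existing logic, then place it in Neo4j's import folder.
./generate_data.sh > /var/lib/neo4j/import/people.csv

# Pipe Cypher into cypher-shell non-interactively, passing the file name as a parameter.
cypher-shell -u neo4j -p 'yourPassword' <<'EOF'
:param csvFile => 'file:///people.csv'
LOAD CSV WITH HEADERS FROM $csvFile AS row
CREATE (:Person {name: row.name, age: toInteger(row.age)});
EOF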
Since you seem really lost, a good start is always to have a look at the documentation. Everything you could possibly wonder about is explained there. Try searching the internet a bit when something seems wrong.
"Cypher is a declarative query language for graphs. For more details, see the Cypher Manual. The recommended way of programmatically interacting with the database is either through the official drivers or using the Java API." (From the doc, Introduction
Apparently, you don't even need to use bash, which is old school and not very handy unless you already know very well what you are doing. You can use the Java API instead if you know that language.
You will even find how to use the drivers from the popular JavaScript!
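For the Java route, a minimal sketch using the official Neo4j Java driver (4.x package names; the Bolt URL, credentials, and label are assumptions):

import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Session;

public class LoadCsvFromJava {
    public static void main(String[] args) {
        // Connection details are placeholders; adjust to your instance.
        try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                 AuthTokens.basic("neo4j", "yourPassword"));
             Session session = driver.session()) {
            // The CSV must be readable by the server (import folder by default).
            session.run("LOAD CSV WITH HEADERS FROM 'file:///people.csv' AS row "
                      + "CREATE (:Person {name: row.name})");
        }
    }
}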
You should make your question more precise.

Jenkins job to read data from SQL DB

I'm new to Jenkins. I have a task where I need to create a Jenkins job to automate builds of certain projects. The build job parameters are going to be stored in a SQL database, so the job has to query the DB, load the data, and perform the build.
How can this be done? Examples would be greatly appreciated.
You have to transform the data from the available source into the format expected by the destination.
Here your source data is in the DB and you want to use it in Jenkins.
There might be numerous ways, but an efficient way of reading the data is the EnvInject plugin.
If you can provide the data to the EnvInject plugin in properties-file format, it becomes available as environment variables, and you can use those variables in the job's configuration.
The EnvInject plugin can read the properties file from the Jenkins job workspace; you provide its path in the Properties File Path input.
To read the data from the source and make it available as a properties file:
you can either write an executable that queries the DB, or use an API your application provides to download the properties data.
Either way, it has to run before the SCM step; for this you have to use a pre-SCM step.
Get the data and inject it in the pre-SCM step only, so that it is available as environment variables, as in the sketch below.
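A rough sketch of such a pre-SCM shell step (the database host, schema, table, and property names are all hypothetical), writing a properties file for EnvInject to pick up:

#!/bin/bash
# Hypothetical pre-SCM shell step: query the build parameters from MySQL
# and write them out as a properties file for the EnvInject plugin.
mysql -h db-host -u jenkins -p"$DB_PASSWORD" builds -N -B \
  -e "SELECT CONCAT(param_name, '=', param_value) FROM build_params WHERE project = 'myproject';" \
  > "$WORKSPACE/build.properties"

# build.properties then looks like, for example:
#   GIT_BRANCH=release/1.2
#   BUILD_PROFILE=production
# Point EnvInject's "Properties File Path" at build.properties and these
# become environment variables for the rest of the job.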
This is one approach to give you the gist to get started; while implementing it you may find other ideas that fit your requirements better.

How to configure Telegraf to send a folder size to InfluxDB

I am having a hard time understanding and using the disk plugin.
I want to emit the folder size for /var/lib/influxdb/hh/, and I am trying to figure out how to wire this using the disk plugin.
I have tried the following:
set up a new configuration file for Telegraf: vi /etc/telegraf/telegraf.d/influxdbdata_telegraf.conf
added this configuration:
[[inputs.disk]]
mount_points = ["/var/lib/influxdb/hh/"]
Now when I query the data from InfluxDB I don't see this measurement, though I do see the other measurements.
I am definitely confused about the usage of the disk plugin. I could write an exec plugin and be done with it, but I wanted to hold off and try to get this working with an existing plugin. (I am pretty sure some existing plugin covers this and I am just missing it.)
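For reference, a hedged sketch of the exec-plugin route mentioned above, in case the disk plugin does not end up fitting; the script path and measurement name are invented, not an official recipe:

# Sketch of an exec-plugin fallback; adjust paths and names to your setup.
[[inputs.exec]]
  # folder_size.sh should print InfluxDB line protocol, for example:
  #   echo "folder_size,path=/var/lib/influxdb/hh bytes=$(du -sb /var/lib/influxdb/hh | cut -f1)i"
  commands = ["/etc/telegraf/scripts/folder_size.sh"]
  timeout = "15s"
  interval = "5m"
  data_format = "influx"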
