Scala import issue with PredictionIO Universal Recommender integration test - mahout

I've tried to get the Universal Recommender template set up per the instructions in the UR Quickstart, but I'm getting an import error. Is there a dependency or step that I missed?
[ERROR] [Engine$] [error] \import org.apache.mahout.math.cf.{DownsamplableCrossOccurrenceDataset, SimilarityAnalysis}
[ERROR] [Engine$] [error] ^
[ERROR] [Engine$] [error] one error found
The relevant block in my build.sbt file is as follows:
libraryDependencies ++= Seq(
  "org.apache.predictionio" %% "apache-predictionio-core" % pioVersion % "provided",
  "org.apache.predictionio" %% "apache-predictionio-data-elasticsearch1" % pioVersion % "provided",
  "org.apache.spark" %% "spark-core" % "1.4.0" % "provided",
  "org.apache.spark" %% "spark-mllib" % "1.4.0" % "provided",
  "org.xerial.snappy" % "snappy-java" % "1.1.1.7",
  // Mahout's Spark libs
  "org.apache.mahout" %% "mahout-math-scala" % mahoutVersion,
  "org.apache.mahout" %% "mahout-spark" % mahoutVersion
    exclude("org.apache.spark", "spark-core_2.10"),
  "org.apache.mahout" % "mahout-math" % mahoutVersion,
  "org.apache.mahout" % "mahout-hdfs" % mahoutVersion
    exclude("com.thoughtworks.xstream", "xstream")
    exclude("org.apache.hadoop", "hadoop-client"),
  //"org.apache.hbase" % "hbase-client" % "0.98.5-hadoop2" % "provided",

Please disregard. I'm not familiar with Scala syntax, so I didn't realize that the stray backslash in front of the import statement in the source file was the problem. Removing it fixed the build.
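For anyone hitting the same error: the offending line should read, with the backslash removed:

import org.apache.mahout.math.cf.{DownsamplableCrossOccurrenceDataset, SimilarityAnalysis}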

Consider this answer as an alternative way to build the PIO-UR engine.
Actually, I also struggled with these dependency issues. What I am describing is not a fix for the error itself, but it will give you a working PredictionIO UR engine:
use the Docker image for the Universal Recommender template.
Use this link to get the PIO-UR Docker image.
If you are not familiar with Docker, use the links below:
INSTALLATIONS:
Docker for Mac
Docker for Windows
For Ubuntu, use the automated script: curl -sSL https://get.docker.com/ | sh
Then use the image from the Git repository above for the UR template. With Docker, you don't need to struggle with the dependencies. The README file in the Git repository is really helpful; use it as a guide. You can get the PIO engine up and running with three simple commands, as sketched below.
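As a rough sketch of what that workflow looks like (the image name below is a placeholder, since the actual name is in the linked repository's README; pio build/train/deploy is the standard PredictionIO engine lifecycle):

docker pull <pio-ur-image>             # placeholder; see the linked repository
docker run -it <pio-ur-image> bash     # start a shell inside the container
pio build && pio train && pio deploy   # standard PredictionIO lifecycle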

Data Flow Flex Runner Fails on start

I created a simple Beam pipeline that looks like this:

import json

import apache_beam as beam
from apache_beam.io import ReadFromText
from apache_beam.transforms.sql import SqlTransform

with beam.Pipeline(options=options) as pipeline:
    (pipeline
     | 'readFile' >> ReadFromText(..)
     | 'parseJson' >> beam.Map(json.loads)
     | 'convertToRow' >> beam.Map(lambda x: beam.Row(a=str(x['a'])))
     | 'sql' >> SqlTransform(""" SELECT a FROM PCOLLECTION """)
     | 'print' >> beam.Map(print)
    )
and followed the code examples provided here: https://cloud.google.com/dataflow/docs/guides/templates/using-flex-templates to be able to run it as a flex template. This is the error that I see:
FileNotFoundError: [Errno 2] No such file or directory: 'java': 'java'
My (very limited) understanding is that the Docker image created as part of the flex template just launches the job on Dataflow, so I don't quite understand why it's complaining that java is not present. Any leads would be greatly appreciated.
It looks like you need to install Java into the container image, adding something like https://github.com/dockerfile/java/blob/master/openjdk-7-jre/Dockerfile#L11 to the Dockerfile. SqlTransform is a cross-language transform: it starts a Java expansion service while the job is being constructed, so the launcher container needs a JRE even though the pipeline itself runs on Dataflow.
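A minimal sketch of that addition, assuming the Python launcher base image from the flex-templates guide (the exact JRE package is an assumption; any Java 8+ runtime should work):

FROM gcr.io/dataflow-templates-base/python3-template-launcher-base

# SqlTransform starts a Java expansion service during job construction,
# so the launcher image needs a JRE on its PATH
RUN apt-get update \
    && apt-get install -y --no-install-recommends default-jre \
    && rm -rf /var/lib/apt/lists/*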

Rule in snakemake using singularity: unterminated quoted string

I'm running a Snakemake pipeline that, for one specific rule, loads a container:
rule counts:
    params:
        transcriptome=os.environ["INDEX"],
        outdir=(os.environ["OUTDIR"] + "/counts/"),
        indir=(os.environ["INDIR"] + "{sample}"),
        name=lambda wildcards: SAMPLES[wildcards.sample]
    output:
        (os.environ["OUTDIR"] + "counts/" + "{sample}" + "/outs/web_summary.html")
    container:
        "docker://marcusczi/cellranger_clean"
    shell:
        """
        cellranger count --id={wildcards.sample} --transcriptome={params.transcriptome} --fastqs={params.indir} --sample={params.name}
        mkdir -p {params.outdir}
        mv ./{wildcards.sample}/ {params.outdir}
        """
A dry run looks fine, and I'm sure the rule itself works (I tried it without the container). However, when I run it with the container I get this error:
Activating singularity image /some/path/.snakemake/singularity/c288fbc3fef5771f055a688c6678c24d.simg
/bin/sh: syntax error: unterminated quoted string
[ 1.228141] reboot: Power down
It then waits for the missing output files, and fails.
I think the answer to this situation might be related to this previous question, but I have tried everything I can think of in terms of escaping characters (except for the wildcards and variables within curly brackets, because I'm guessing those should be fine, and if not, why am I even using Snakemake? :-( ). The paths for the directories I'm using are valid and exist, and the name and the "sample" wildcard have the shape "sample_123", nothing fancy.
It's also worth saying that there are no single or double quotes in any of these variables.
Thank you!!
Software and OS:
I am on macOS Catalina 10.15.5, running Snakemake 5.20.1, and I have been using the beta version of Singularity for macOS (3.3.0-rc.1.658.g7427b73f1.dirty).
Running Singularity outside Snakemake:
I tried running the container with Singularity outside Snakemake; the software that I'm trying to run starts, but then complains that there is no disk space left (which is not true). I'm invoking it as sudo singularity run -B "$(pwd):$(pwd)" docker://marcusczi/cellranger_clean
I think this latest error means either 1) I'm not running Singularity the way I should, or 2) it's a false report of what is happening, since cellranger (the software I'm trying to run) often has misleading error messages.
Minimal reproducible example:
If you install Snakemake, you should be able to reproduce my error by running snakemake -j1 --use-singularity in the same directory as the Snakefile below.
Snakefile:
rule all:
    input:
        "output.txt"

rule counts:
    output:
        "output.txt"
    container:
        "docker://marcusczi/cellranger_clean"
    shell:
        """
        cellranger count --help
        echo "hurray!" > {output}
        """

snakemake: MissingOutputException within docker

I am trying to run a pipeline inside a Docker container using Snakemake. I am having a problem getting the sortmerna tool to produce the {sample}_merged_sorted_mRNA and {sample}_merged_sorted outputs from the control_merged.fq and treated_merged.fq input files.
Here is my Snakefile:
SAMPLES = ["control", "treated"]

for smp in SAMPLES:
    print("Sample " + smp + " will be processed")

rule final:
    input:
        expand('/output/{sample}_merged.fq', sample=SAMPLES),
        expand('/output/{sample}_merged_sorted', sample=SAMPLES),
        expand('/output/{sample}_merged_sorted_mRNA', sample=SAMPLES),

rule sortmerna:
    input:
        '/output/{sample}_merged.fq',
    output:
        merged_file='/output/{sample}_merged_sorted_mRNA',
        merged_sorted='/output/{sample}_merged_sorted',
    message: """---SORTING---"""
    shell:
        '''
        sortmerna --ref /usr/share/sortmerna/rRNA_databases/silva-bac-23s-id98.fasta,/usr/share/sortmerna/rRNA_databases/index/silva-bac-23s-id98: --reads {input} --paired_in -a 16 --log --fastx --aligned {output.merged_file} --other {output.merged_sorted} -v
        '''
When running this I get:
Waiting at most 5 seconds for missing files.
MissingOutputException in line 57 of /input/Snakefile:
Missing files after 5 seconds:
/output/control_merged_sorted_mRNA
/output/control_merged_sorted
This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait.
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /input/.snakemake/log/2018-11-05T091643.911334.snakemake.log
I tried to increase the latency with --latency-wait but I get the same result. The funny thing is that the two output files control_merged_sorted_mRNA.fq and control_merged_sorted.fq are produced, but the program fails and exits. The version of Snakemake is 5.3.0. Any help?
Snakemake fails because the outputs described by the rule sortmerna are not produced. This is not a latency problem; it is a problem with your outputs.
Your rule sortmerna expects as output:
/output/control_merged_sorted_mRNA
and
/output/control_merged_sorted
but the program you are using (I know nothing about sortmerna) is apparently producing
/output/control_merged_sorted_mRNA.fq
and
/output/control_merged_sorted.fq
Check whether the values you pass to the --aligned and --other options must be the real names of the files produced, or only basenames to which the program adds a .fq suffix. If you are in the latter case, I suggest you use:
rule final:
    input:
        expand('/output/{sample}_merged.fq', sample=SAMPLES),
        expand('/output/{sample}_merged_sorted.fq', sample=SAMPLES),
        expand('/output/{sample}_merged_sorted_mRNA.fq', sample=SAMPLES),

rule sortmerna:
    input:
        '/output/{sample}_merged.fq',
    output:
        merged_file='/output/{sample}_merged_sorted_mRNA.fq',
        merged_sorted='/output/{sample}_merged_sorted.fq'
    params:
        merged_file_basename='/output/{sample}_merged_sorted_mRNA',
        merged_sorted_basename='/output/{sample}_merged_sorted'
    message: """---SORTING---"""
    shell:
        """
        sortmerna --ref /usr/share/sortmerna/rRNA_databases/silva-bac-23s-id98.fasta,/usr/share/sortmerna/rRNA_databases/index/silva-bac-23s-id98: --reads {input} --paired_in -a 16 --log --fastx --aligned {params.merged_file_basename} --other {params.merged_sorted_basename} -v
        """

Integrating silhouette v5.0 with Play 2.6

I'm trying to integrate Silhouette v5.0 with Play 2.6, but I get an error while running the app. Here are my build.sbt file and the error. Thanks for any help.
val buildVersion = "0.0.4"

version := buildVersion

resolvers += "Sonatype Staging" at "https://oss.sonatype.org/content/repositories/staging/"

scalaVersion := "2.12.4"

libraryDependencies ++= Seq(
  guice,
  "com.typesafe.play" %% "play-iteratees" % "2.6.1",
  "org.reactivemongo" %% "play2-reactivemongo" % "0.12.4-fix26",
  "io.swagger" %% "swagger-play2" % "1.6.0",
  "org.webjars" % "swagger-ui" % "3.2.2",
  "com.mohiva" %% "play-silhouette" % "5.0.0",
  "com.mohiva" %% "play-silhouette-password-bcrypt" % "5.0.0",
  "com.mohiva" %% "play-silhouette-crypto-jca" % "5.0.0",
  "com.mohiva" %% "play-silhouette-persistence" % "5.0.0",
  "com.mohiva" %% "play-silhouette-testkit" % "5.0.0" % "test"
)
Here is the error:
Unexpected exception
CreationException: Unable to create injector, see the following errors:
1) Error injecting method, java.lang.NoSuchMethodError: play.api.ApplicationLoader$Context.lifecycle()Lplay/api/inject/DefaultApplicationLifecycle;
at com.google.inject.util.Providers$GuicifiedProviderWithDependencies.initialize(Providers.java:149)
at play.modules.reactivemongo.ReactiveMongoModule.$anonfun$apiBindings$1(ReactiveMongoModule.scala:25):
Binding(interface play.modules.reactivemongo.ReactiveMongoApi to ProviderTarget(play.modules.reactivemongo.ReactiveMongoProvider@4f6aa416)) (via modules: com.google.inject.util.Modules$OverrideModule -> play.api.inject.guice.GuiceableModuleConversions$$anon$1)
2) Error injecting method, java.lang.NoSuchMethodError: play.api.ApplicationLoader$Context.lifecycle()Lplay/api/inject/DefaultApplicationLifecycle;
at com.google.inject.util.Providers$GuicifiedProviderWithDependencies.initialize(Providers.java:149)
at play.modules.reactivemongo.ReactiveMongoModule.$anonfun$apiBindings$1(ReactiveMongoModule.scala:22):
Binding(interface play.modules.reactivemongo.ReactiveMongoApi qualified with QualifierInstance(@play.modules.reactivemongo.NamedDatabase(value=default)) to ProviderTarget(play.modules.reactivemongo.ReactiveMongoProvider@4f6aa416)) (via modules: com.google.inject.util.Modules$OverrideModule -> play.api.inject.guice.GuiceableModuleConversions$$anon$1)
2 errors
I was also stuck on Play authentication with 2.6. Finding no solution, I made one from scratch using Deadbolt2, with the help of my senior developer.
Here's the link to the Git repository. Please star/fork it if you find it helpful.
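For what it's worth, the stack trace points at play.modules.reactivemongo.ReactiveMongoModule, and a NoSuchMethodError on ApplicationLoader$Context usually means a module compiled against an older Play. A hedged sketch of a dependency change worth trying (the "-play26" version string is an assumption; check the play2-reactivemongo releases for the right one):

// assumption: use a play2-reactivemongo artifact built against Play 2.6
"org.reactivemongo" %% "play2-reactivemongo" % "0.12.6-play26"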

Monitoring URLs with Nagios

I'm trying to monitor actual URLs, and not only hosts, with Nagios, as I operate a shared server with several websites, and I don't think it's enough to just monitor the basic HTTP service (I'm including at the very bottom of this question a small explanation of what I'm envisioning).
(Side note: please note that I have Nagios installed and running inside a chroot on a CentOS system. I built Nagios from source, and have used yum to install into this root all the dependencies needed, etc.)
I first found check_url, but after installing it into /usr/lib/nagios/libexec, I kept getting a "return code of 255 is out of bounds" error. That's when I decided to start writing this question (but wait! There's another plugin I decided to try first!).
After reviewing this question, which had almost exactly the same problem I'm having with check_url, I decided to open a new question on the subject because:
a) I'm not using NRPE with this check
b) I tried the suggestions made in the earlier question to which I linked, but none of them worked. For example:
./check_url some-domain.com | echo $0
returns "0" (which indicates the check was successful)
I then followed the debugging instructions on Nagios Support to create a temp script called debug_check_url, and put the following in it (to then be called by my command definition):
#!/bin/sh
# log each invocation and its arguments, then run the real plugin
echo `date` >> /tmp/debug_check_url_plugin
echo $* >> /tmp/debug_check_url_plugin
/usr/local/nagios/libexec/check_url $*
Assuming I'm not in "debugging mode", my command definition for running check_url is as follows (inside command.cfg):
# 'check_url' command definition
define command{
    command_name    check_url
    command_line    $USER1$/check_url $url$
}
(Incidentally, you can also view what I was using in my service config file at the very bottom of this question)
Before publishing this question, however, I decided to give it one more shot at figuring out a solution. I found the check_url_status plugin, and decided to give that one a try. To do that, here's what I did:
mkdir /usr/lib/nagios/libexec/check_url_status/
Downloaded both check_url_status and utils.pm
Per the user comment / review on the check_url_status plugin page, I changed "lib" to the proper directory of /usr/lib/nagios/libexec/.
Ran the following: ./check_url_status -U some-domain.com
When I ran the above command, I kept getting the following error:
bash-4.1# ./check_url_status -U mydomain.com
Can't locate utils.pm in @INC (@INC contains: /usr/lib/nagios/libexec/ /usr/local/lib/perl5 /usr/local/share/perl5 /usr/lib/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib/perl5 /usr/share/perl5) at ./check_url_status line 34.
BEGIN failed--compilation aborted at ./check_url_status line 34.
So at this point, I give up and have a couple of questions:
Which of these two plugins would you recommend, check_url or check_url_status?
(After reading the description of check_url_status, I feel that this one might be the better choice. Your thoughts?)
And how would I fix my problem with whichever plugin you recommend?
At the beginning of this question, I mentioned I would include a small explanation of what I'm envisioning. I have a file called services.cfg which is where I have all of my service definitions located (imagine that!).
The following is a snippet of my service definition file, which I wrote to use check_url (because at that time, I thought everything worked). I'll build a service for each URL I want to monitor:
###
# Monitoring Individual URLs...
#
###
define service{
    host_name               {my-shared-web-server}
    service_description     URL: somedomain.com
    check_command           check_url!somedomain.com
    max_check_attempts      5
    check_interval          3
    retry_interval          1
    check_period            24x7
    notification_interval   30
    notification_period     workhours
}
I was making things WAY too complicated.
The built-in check_http plugin, installed by default, can accomplish what I wanted and more. Here's how I accomplished this:
My Service Definition:
define service{
    host_name               myers
    service_description     URL: my-url.com
    check_command           check_http_url!http://my-url.com
    max_check_attempts      5
    check_interval          3
    retry_interval          1
    check_period            24x7
    notification_interval   30
    notification_period     workhours
}
My Command Definition:
define command{
    command_name    check_http_url
    command_line    $USER1$/check_http -I $HOSTADDRESS$ -u $ARG1$
}
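To sanity-check the command outside Nagios, you can run the plugin by hand with the macros filled in; a sketch, where the address is a placeholder for your server's:

# manual test: $HOSTADDRESS$ becomes the server's address, $ARG1$ the URL
/usr/local/nagios/libexec/check_http -I 192.0.2.10 -u http://my-url.com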
The better way to monitor URLs is by using WebInject, which can be used with Nagios.
The problem below is because Perl can't find the utils package; utils.pm ships with the Nagios plugins, so install it or point the plugin at its location:
bash-4.1# ./check_url_status -U mydomain.com Can't locate utils.pm in @INC (@INC contains:
You can write a script plugin. It is easy; you only have to check the URL with something like:
`curl -Is $URL -k | grep HTTP | cut -d ' ' -f2`
$URL is what you pass to the script as a parameter.
Then check the result: if the code is greater than 399 you have a problem, else everything is OK. Then set the right exit code and message for Nagios.
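A minimal sketch of such a plugin, using the curl line above and the standard Nagios exit codes (0 = OK, 2 = CRITICAL, 3 = UNKNOWN); the script name is hypothetical:

#!/bin/sh
# hypothetical check_url_code plugin: pass the URL as the first argument
URL="$1"
CODE=$(curl -Is "$URL" -k | grep HTTP | cut -d ' ' -f2)
if [ -z "$CODE" ]; then
    echo "UNKNOWN: no HTTP response from $URL"
    exit 3
elif [ "$CODE" -gt 399 ]; then
    echo "CRITICAL: $URL returned HTTP $CODE"
    exit 2
else
    echo "OK: $URL returned HTTP $CODE"
    exit 0
fi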
