I was testing a simple test class, shown below, using the library-provided MySQL test container when the container startup failed:
import com.dimafeng.testcontainers.{ForAllTestContainer, MySQLContainer}
import org.scalatest.FlatSpec

class Test extends FlatSpec with ForAllTestContainer {
  override val container = MySQLContainer()

  it should "temp" in {
    assert(1 == 1)
  }
}
The stack trace for the error is shown below:
Exception encountered when invoking run on a nested suite - Container startup failed
org.testcontainers.containers.ContainerLaunchException: Container startup failed
at org.testcontainers.containers.GenericContainer.doStart(GenericContainer.java:322)
at org.testcontainers.containers.GenericContainer.start(GenericContainer.java:302)
at com.dimafeng.testcontainers.SingleContainer.start(Container.scala:46)
at com.dimafeng.testcontainers.ForAllTestContainer.run(ForAllTestContainer.scala:17)
at com.dimafeng.testcontainers.ForAllTestContainer.run$(ForAllTestContainer.scala:13)
Caused by: org.testcontainers.containers.ContainerFetchException: Can't get Docker image: RemoteDockerImage(imageNameFuture=java.util.concurrent.CompletableFuture@18920cc[Completed normally], imagePullPolicy=DefaultPullPolicy(), dockerClient=LazyDockerClient.INSTANCE)
at org.testcontainers.containers.GenericContainer.getDockerImageName(GenericContainer.java:1265)
at org.testcontainers.containers.GenericContainer.logger(GenericContainer.java:600)
at org.testcontainers.containers.GenericContainer.doStart(GenericContainer.java:311)
... 18 more
Caused by: java.util.NoSuchElementException: No value present
at java.util.Optional.get(Optional.java:135)
at org.testcontainers.utility.ResourceReaper.start(ResourceReaper.java:103)
at org.testcontainers.DockerClientFactory.client(DockerClientFactory.java:155)
at org.testcontainers.LazyDockerClient.getDockerClient(LazyDockerClient.java:14)
at org.testcontainers.LazyDockerClient.listImagesCmd(LazyDockerClient.java:12)
at org.testcontainers.images.LocalImagesCache.maybeInitCache(LocalImagesCache.java:68)
at org.testcontainers.images.LocalImagesCache.get(LocalImagesCache.java:32)
at org.testcontainers.images.AbstractImagePullPolicy.shouldPull(AbstractImagePullPolicy.java:18)
at org.testcontainers.images.RemoteDockerImage.resolve(RemoteDockerImage.java:62)
at org.testcontainers.images.RemoteDockerImage.resolve(RemoteDockerImage.java:25)
at org.testcontainers.utility.LazyFuture.getResolvedValue(LazyFuture.java:20)
at org.testcontainers.utility.LazyFuture.get(LazyFuture.java:27)
at org.testcontainers.containers.GenericContainer.getDockerImageName(GenericContainer.java:1263)
... 20 more
I am trying to encrypt a password with Maven by following this link - Generate settings-security.xml file for maven password encryption. After creating the master password and the settings-security.xml file, running the command mvn --encrypt-password '!12345' gives me the following error:
[ERROR] Error executing Maven.
[ERROR] org.codehaus.plexus.util.xml.pull.XmlPullParserException: start tag unexpected character { (position: TEXT seen <settingsSecurity>\n<master{... #2:9)
[ERROR] Caused by: start tag unexpected character { (position: TEXT seen <settingsSecurity>\n<master{... #2:9)
abhinashkumarjha@C02DP5F7MD6R ~ % mvn --encrypt-password '!Abhi#090342'
[ERROR] Error executing Maven.
[ERROR] org.codehaus.plexus.util.xml.pull.XmlPullParserException: start tag unexpected character { (position: START_TAG seen <settingsSecurity><master{... #1:26)
[ERROR] Caused by: start tag unexpected character { (position: START_TAG seen <settingsSecurity><master{... #1:26)
My settings-security.xml looks like this:
<settingsSecurity>
<master{42xI34HcwGIH/t9Bhr5P4ctsVIjOtvPO81b2eb9uYWY=}</master>
</settingsSecurity>
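For comparison, the format shown in the Maven password-encryption guide uses a complete <master> start tag (reproduced from the documentation with a placeholder value, not my file):
<settingsSecurity>
  <master>{encrypted-master-password}</master>
</settingsSecurity>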
Thanks in Advance!!
I have a simple Dataflow job for testing that ran successfully with apache-beam 2.1.0. The code looks something like this:
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

import org.apache.beam.runners.dataflow.DataflowRunner;
import org.apache.beam.runners.dataflow.options.DataflowPipelineOptions;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.coders.StringUtf8Coder;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;

public static void main(String[] args) throws Exception {
    DataflowPipelineOptions dataflowOptions = PipelineOptionsFactory.as(DataflowPipelineOptions.class);
    dataflowOptions.setProject("MY_PROJECT_ID");
    dataflowOptions.setStagingLocation("gs://MY_STAGING_LOC");
    dataflowOptions.setTempLocation("gs://MY_TEMP_LOC");
    dataflowOptions.setFilesToStage(Collections.singletonList("MY_LOCAL_JAR_FILE.jar"));
    dataflowOptions.setRunner(DataflowRunner.class);
    dataflowOptions.setNetwork("SOME_NETWORK");
    dataflowOptions.setSubnetwork("regions/SOME_REGION/subnetworks/SOME_SUBNETWORK");
    dataflowOptions.setZone("SOME_ZONE");

    Pipeline p = Pipeline.create(dataflowOptions);
    List<String> LINES = Arrays.asList("foobar");
    p.apply(Create.of(LINES)).setCoder(StringUtf8Coder.of());
    p.run().waitUntilFinish();
}
However, when I migrate to apache-beam 2.4.0, I immediately get the following error when trying to submit a Dataflow job via the CLI:
Exception in thread "main" java.lang.RuntimeException: Error while staging packages
at org.apache.beam.runners.dataflow.util.PackageUtil.stageClasspathElements(PackageUtil.java:396)
at org.apache.beam.runners.dataflow.util.PackageUtil.stageClasspathElements(PackageUtil.java:273)
at org.apache.beam.runners.dataflow.util.GcsStager.stageFiles(GcsStager.java:76)
at org.apache.beam.runners.dataflow.util.GcsStager.stageDefaultFiles(GcsStager.java:64)
at org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:661)
at org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:174)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:311)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:297)
at com.company.app.App.main(App.java:48)
Caused by: java.io.IOException: Error executing batch GCS request
at org.apache.beam.sdk.util.GcsUtil.executeBatches(GcsUtil.java:607)
at org.apache.beam.sdk.util.GcsUtil.getObjects(GcsUtil.java:339)
at org.apache.beam.sdk.extensions.gcp.storage.GcsFileSystem.matchNonGlobs(GcsFileSystem.java:216)
at org.apache.beam.sdk.extensions.gcp.storage.GcsFileSystem.match(GcsFileSystem.java:85)
at org.apache.beam.sdk.io.FileSystems.match(FileSystems.java:123)
at org.apache.beam.sdk.io.FileSystems.matchSingleFileSpec(FileSystems.java:188)
at org.apache.beam.runners.dataflow.util.PackageUtil.alreadyStaged(PackageUtil.java:160)
at org.apache.beam.runners.dataflow.util.PackageUtil.stagePackageSynchronously(PackageUtil.java:184)
at org.apache.beam.runners.dataflow.util.PackageUtil.lambda$stagePackage$1(PackageUtil.java:174)
at org.apache.beam.sdk.util.MoreFutures.lambda$supplyAsync$0(MoreFutures.java:101)
at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1626)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.ExecutionException: com.google.api.client.http.HttpResponseException: 404 Not Found
...
I haven't changed any configuration settings.
Debugging the code further, it is failing on a POST request to https://www.googleapis.com/null.
It looks like a bug that was fixed in the dev branch on Feb 13. Hopefully the fix will be released soon:
Original Issue: https://github.com/google/google-api-java-client/issues/1073
Flawed Fix: https://github.com/google/google-api-java-client/pull/1087
Corrected Fix: https://github.com/google/google-api-java-client/pull/1096
You're hitting this issue: https://github.com/GoogleCloudPlatform/DataflowJavaSDK/issues/607
To fix, add the following if using Gradle:
compile(group: 'com.google.api-client', name: 'google-api-client', version: '1.22.0') {
    force = true
}
Or Maven:
<dependency>
  <groupId>com.google.api-client</groupId>
  <artifactId>google-api-client</artifactId>
  <version>[1.22.0]</version>
</dependency>
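To confirm that the pinned 1.22.0 version is what actually ends up on the classpath, the dependency report can be inspected; these are standard commands using the coordinates from the snippets above:
mvn dependency:tree -Dincludes=com.google.api-client:google-api-client
gradle dependencies --configuration compile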
I need to upload a file:
def newTeam(String nameTeam) {
    render '123 ' + nameTeam
    if (request instanceof MultipartHttpServletRequest) {
        MultipartHttpServletRequest mpr = (MultipartHttpServletRequest) request
        CommonsMultipartFile f = (CommonsMultipartFile) mpr.getFile("myFile")
    }
}
I get this error:
2017-04-11 23:22:37.416 ERROR --- [ Thread-12]
grails.boot.GrailsApp : Compilation Error: startup
failed: General error during class generation:
java.lang.NoClassDefFoundError: Unable to load class
org.springframework.web.multipart.commons.CommonsMultipartFile due to
missing dependency Lorg/apache/commons/fileupload/FileItem;
java.lang.RuntimeException: java.lang.NoClassDefFoundError: Unable to
load class
org.springframework.web.multipart.commons.CommonsMultipartFile due to
missing dependency Lorg/apache/commons/fileupload/FileItem; at
more....
This is the fix, but I do not know how to use it. I would have to write the DSL code in my resources.groovy.
Try adding the following to build.gradle:
dependencies {
    ....
    compile 'commons-fileupload:commons-fileupload:1.3.2'
}
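If the multipart resolver also needs to be declared explicitly in resources.groovy, as the question mentions, a minimal sketch could look like the following; multipartResolver is the standard Spring bean name, and the size limit is only an example value:
// grails-app/conf/spring/resources.groovy
import org.springframework.web.multipart.commons.CommonsMultipartResolver

beans = {
    // Use Commons FileUpload for multipart requests; 10 MB is an example limit.
    multipartResolver(CommonsMultipartResolver) {
        maxUploadSize = 10 * 1024 * 1024
    }
}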
I'm following this quick start after starting this ready-to-use PredictionIO Amazon EC2 instance. After running these commands, it fails at pio train:
pio app new MyTextApp
pio import --appid 1 --input data/stopwords.json
pio import --appid 1 --input data/emails.json
pio build
pio train
...
Data set is empty, make sure event fields match imported data.
Exception in thread "main" java.lang.IllegalStateException: Haven't seen any document yet.
at org.apache.spark.mllib.feature.IDF$DocumentFrequencyAggregator.idf(IDF.scala:132)
at org.apache.spark.mllib.feature.IDF.fit(IDF.scala:56)
at uk.co.news.PreparedData.<init>(Preparator.scala:70)
at uk.co.news.Preparator.prepare(Preparator.scala:47)
at uk.co.news.Preparator.prepare(Preparator.scala:43)
Since there is no error when running the command to import emails, I don't understand why the data set is still empty. I double-checked the emails.json file and the data is indeed there. This is the result of running
pio import --appid 1 --input data/emails.json
ubuntu@ip-172-31-0-60:~/pio-textclassification$ pio import --appid 1 --input data/emails.json
[INFO] [Runner$] Submission command: /opt/spark-1.4.1-bin-hadoop2.6/bin/spark-submit --class io.prediction.tools.imprt.FileToEvents --files file:/opt/PredictionIO/conf/log4j.properties --driver-class-path /opt/PredictionIO/conf file:/opt/PredictionIO/lib/pio-assembly-0.9.4.jar --appid 1 --input file:/home/ubuntu/pio-textclassification/data/emails.json --env PIO_ENV_LOADED=1,PIO_STORAGE_REPOSITORIES_METADATA_NAME=pio_meta,PIO_FS_BASEDIR=/home/ubuntu/.pio_store,PIO_HOME=/opt/PredictionIO,PIO_FS_ENGINESDIR=/home/ubuntu/.pio_store/engines,PIO_STORAGE_SOURCES_PGSQL_URL=jdbc:postgresql://localhost/pio,PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=PGSQL,PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=PGSQL,PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=pio_event,PIO_STORAGE_SOURCES_PGSQL_PASSWORD=pio,PIO_STORAGE_SOURCES_PGSQL_TYPE=jdbc,PIO_FS_TMPDIR=/home/ubuntu/.pio_store/tmp,PIO_STORAGE_SOURCES_PGSQL_USERNAME=pio,PIO_STORAGE_REPOSITORIES_MODELDATA_NAME=pio_model,PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=PGSQL,PIO_CONF_DIR=/opt/PredictionIO/conf
[INFO] [Remoting] Starting remoting
[INFO] [Remoting] Remoting started; listening on addresses :[akka.tcp://sparkDriver@172.31.0.60:49257]
[INFO] [FileToEvents$] Events are imported.
[INFO] [FileToEvents$] Done.
EDIT:
pio build --verbose
showed an exception that was being swallowed. The problem is with the database connection, but it's still not clear what is wrong since parts of the exception are being replaced with "..."
[DEBUG] [ConnectionPool$] Registered connection pool : ConnectionPool(url:jdbc:postgresql://localhost/pio, user:pio) using factory : <default>
[DEBUG] [ConnectionPool$] Registered singleton connection pool : ConnectionPool(url:jdbc:postgresql://localhost/pio, user:pio)
[DEBUG] [StatementExecutor$$anon$1] SQL execution completed
[SQL Execution]
create table if not exists pio_meta_enginemanifests ( id varchar(100) not null primary key, version text not null, engineName text not null, description text, files text not null, engineFactory text not null); (10 ms)
[Stack Trace]
...
io.prediction.data.storage.jdbc.JDBCEngineManifests$$anonfun$1.apply(JDBCEngineManifests.scala:37)
io.prediction.data.storage.jdbc.JDBCEngineManifests$$anonfun$1.apply(JDBCEngineManifests.scala:29)
scalikejdbc.DBConnection$class.autoCommit(DBConnection.scala:222)
scalikejdbc.DB.autoCommit(DB.scala:60)
scalikejdbc.DB$$anonfun$autoCommit$1.apply(DB.scala:215)
scalikejdbc.DB$$anonfun$autoCommit$1.apply(DB.scala:214)
scalikejdbc.LoanPattern$class.using(LoanPattern.scala:18)
scalikejdbc.DB$.using(DB.scala:138)
scalikejdbc.DB$.autoCommit(DB.scala:214)
io.prediction.data.storage.jdbc.JDBCEngineManifests.<init>(JDBCEngineManifests.scala:29)
sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
java.lang.reflect.Constructor.newInstance(Constructor.java:526)
io.prediction.data.storage.Storage$.getDataObject(Storage.scala:293)
...
[INFO] [RegisterEngine$] Registering engine JmhjlGoEjJuKXhXpY70MbEkuGHMuOZzL 8ccd38126d56ed48adaa9f85547131467f7629f7
[DEBUG] [StatementExecutor$$anon$1] SQL execution completed
[SQL Execution]
update pio_meta_enginemanifests set engineName = 'pio-textclassification', description = 'pio-autogen-manifest', files = 'file:/home/ubuntu/pio-textclassification/target/scala-2.10/uk.co.news-assembly-0.1-SNAPSHOT-deps.jar... (192)', engineFactory = '' where id = 'JmhjlGoEjJuKXhXpY70MbEkuGHMuOZzL' and version = '8ccd38126d56ed48adaa9f85547131467f7629f7'; (3 ms)
[Stack Trace]
...
io.prediction.data.storage.jdbc.JDBCEngineManifests$$anonfun$7.apply(JDBCEngineManifests.scala:85)
io.prediction.data.storage.jdbc.JDBCEngineManifests$$anonfun$7.apply(JDBCEngineManifests.scala:78)
scalikejdbc.DBConnection$$anonfun$3.apply(DBConnection.scala:297)
scalikejdbc.DBConnection$class.scalikejdbc$DBConnection$$rollbackIfThrowable(DBConnection.scala:274)
scalikejdbc.DBConnection$class.localTx(DBConnection.scala:295)
scalikejdbc.DB.localTx(DB.scala:60)
scalikejdbc.DB$.localTx(DB.scala:257)
io.prediction.data.storage.jdbc.JDBCEngineManifests.update(JDBCEngineManifests.scala:78)
io.prediction.tools.RegisterEngine$.registerEngine(RegisterEngine.scala:50)
io.prediction.tools.console.Console$.build(Console.scala:813)
io.prediction.tools.console.Console$$anonfun$main$1.apply(Console.scala:698)
io.prediction.tools.console.Console$$anonfun$main$1.apply(Console.scala:684)
scala.Option.map(Option.scala:145)
io.prediction.tools.console.Console$.main(Console.scala:684)
io.prediction.tools.console.Console.main(Console.scala)
...
[DEBUG] [StatementExecutor$$anon$1] SQL execution completed
[SQL Execution]
INSERT INTO pio_meta_enginemanifests VALUES( 'JmhjlGoEjJuKXhXpY70MbEkuGHMuOZzL', '8ccd38126d56ed48adaa9f85547131467f7629f7', 'pio-textclassification', 'pio-autogen-manifest', 'file:/home/ubuntu/pio-textclassification/target/scala-2.10/uk.co.news-assembly-0.1-SNAPSHOT-deps.jar... (192)', ''); (1 ms)
[Stack Trace]
...
io.prediction.data.storage.jdbc.JDBCEngineManifests$$anonfun$2.apply(JDBCEngineManifests.scala:48)
io.prediction.data.storage.jdbc.JDBCEngineManifests$$anonfun$2.apply(JDBCEngineManifests.scala:40)
scalikejdbc.DBConnection$$anonfun$3.apply(DBConnection.scala:297)
scalikejdbc.DBConnection$class.scalikejdbc$DBConnection$$rollbackIfThrowable(DBConnection.scala:274)
scalikejdbc.DBConnection$class.localTx(DBConnection.scala:295)
scalikejdbc.DB.localTx(DB.scala:60)
scalikejdbc.DB$.localTx(DB.scala:257)
io.prediction.data.storage.jdbc.JDBCEngineManifests.insert(JDBCEngineManifests.scala:40)
io.prediction.data.storage.jdbc.JDBCEngineManifests.update(JDBCEngineManifests.scala:89)
io.prediction.tools.RegisterEngine$.registerEngine(RegisterEngine.scala:50)
io.prediction.tools.console.Console$.build(Console.scala:813)
io.prediction.tools.console.Console$$anonfun$main$1.apply(Console.scala:698)
io.prediction.tools.console.Console$$anonfun$main$1.apply(Console.scala:684)
scala.Option.map(Option.scala:145)
io.prediction.tools.console.Console$.main(Console.scala:684)
...
[INFO] [Console$] Your engine is ready for training.
A few things to check:
Does "pio app list" show MyTextApp has appId 1?
Download https://github.com/yipjustin/pio-event-distribution-checker and change engine.json so that appId reads 1, then run "pio build" and "pio train" to see whether the data is actually imported.
P.S. There is a Google group (https://groups.google.com/forum/#!forum/predictionio-user) where your question will be answered more quickly by the community of PredictionIO users.
The solution was to change DataSource.scala to match the schema of the emails.json file before running pio build.
This is the only method I had to change in the file:
private def readEventData(sc: SparkContext): RDD[Observation] = {
  // Get an RDD of events.
  PEventStore.find(
    appName = dsp.appName,
    entityType = Some("content"),
    eventNames = Some(List("e-mail"))
  )(sc).map(e => {
    // Convert the collected RDD of events to an RDD of Observation objects.
    val label: String = e.properties.get[String]("label")
    Observation(
      if (label == "spam") 1.0 else 0.0,
      e.properties.get[String]("text"),
      label
    )
  }).cache
}
I had to change the previous values to "content", "e-mail" and "spam".
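For reference, an event line in emails.json that matches this reader would look roughly like the one below; only the event name, entityType, and the label/text properties are implied by the code above, while the entityId and eventTime are made-up example values:
{"event": "e-mail", "entityType": "content", "entityId": "1", "properties": {"label": "spam", "text": "Claim your prize now"}, "eventTime": "2015-06-08T16:45:00.000Z"}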