NoSuchMethodError when using Apache Tika

The following error is encountered when I extract the metadata of a JPEG image using Apache Tika:
java.lang.NoSuchMethodError: com.adobe.xmp.properties.XMPPropertyInfo.getValue()Ljava/lang/Object;
at com.drew.metadata.xmp.XmpReader.extract(Unknown Source)
at com.drew.imaging.jpeg.JpegMetadataReader.extractMetadataFromJpegSegmentReader(Unknown Source)
at com.drew.imaging.jpeg.JpegMetadataReader.readMetadata(Unknown Source)
at org.apache.tika.parser.image.ImageMetadataExtractor.parseJpeg(ImageMetadataExtractor.java:91)
at org.apache.tika.parser.jpeg.JpegParser.parse(JpegParser.java:56)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
Tika version being used: 1.4
What is the cause of the error?
Also note that metadata is extracted correctly by the API for images that don't contain any XMP metadata. The error occurs only for images that have XMP metadata.
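For reference, here is a minimal sketch (file name hypothetical) of the kind of extraction call that hits this code path:
import java.io.FileInputStream;
import java.io.InputStream;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.sax.BodyContentHandler;

public class JpegMetadataDemo {
    public static void main(String[] args) throws Exception {
        AutoDetectParser parser = new AutoDetectParser();
        Metadata metadata = new Metadata();
        // "photo.jpg" is a placeholder; any JPEG with embedded XMP triggers the error
        try (InputStream stream = new FileInputStream("photo.jpg")) {
            parser.parse(stream, new BodyContentHandler(), metadata, new ParseContext());
        }
        for (String name : metadata.names()) {
            System.out.println(name + " = " + metadata.get(name));
        }
    }
}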

Apache Tika uses the metadata-extractor library to extract metadata from image files.
The most likely cause is having both Tika and a standalone copy of metadata-extractor on the classpath. The standalone metadata-extractor binary may have been built against a different version of the XMPCore library than the one Tika uses.
Solution: remove the standalone metadata-extractor JAR from your classpath.
The underlying incompatibility between the XMPCore versions used by the two projects remains unresolved:
https://code.google.com/p/metadata-extractor/issues/detail?id=55
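If the duplicate metadata-extractor copy arrives transitively through some other dependency (the com.example:imaging-helper artifact below is purely hypothetical; substitute whichever dependency actually drags it in), a Maven exclusion is one way to drop it:
<dependency>
    <groupId>com.example</groupId>
    <artifactId>imaging-helper</artifactId>
    <version>1.0</version>
    <exclusions>
        <exclusion>
            <!-- let Tika's own metadata-extractor version win -->
            <groupId>com.drewnoakes</groupId>
            <artifactId>metadata-extractor</artifactId>
        </exclusion>
    </exclusions>
</dependency>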

Related

How to update a Python library that's already present in GCP Dataflow

I am using Avro version 1.11.0 for parsing an Avro file and decoding it. We have a custom requirement, so I am not able to use ReadFromAvro. When trying this with Dataflow, a dependency issue arises because avro-python3 version 1.8.2 is already available. The problem is the class TimestampMillisSchema, which is not present in avro-python3; it fails stating "Attribute TimestampMillisSchema not found in avro.schema".
I then tried passing a requirements file with avro==1.11.0, but then the Dataflow job was not able to start, giving the error "Error syncing pod", which seems to be caused by dependency conflicts.
Any idea/help on how this should be resolved?
Thanks

EMR 6 Beta with Docker Support has S3 Access Issue

I am exploring the new EMR 6.0.0 with Docker support in order to decide whether we want to use it. One of our projects is written in Scala 2.11, but EMR 6.0.0 comes with Spark built with Scala 2.12. So I switched to trying 6.0.0-beta, which is Spark 2.4.3 built with Scala 2.11. If it works on 6.0.0-beta, then we will upgrade our code to Scala 2.12 and use 6.0.0.
A few issues I am having when I try to run my Scala Spark job:
When it tries to read Parquet from S3, I get the error: java.lang.RuntimeException: Cannot create temp dirs: [/mnt/s3]
When I try to make an API call over HTTPS, I get the error: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target.
When it tries to read files from S3, I get the error: Class com.amazon.ws.emr.hadoop.fs.EmrFileSystem not found. I was able to hack around this one by passing the path via --jars. Maybe not the best solution.
I am guessing there must be something I need to set either during bootstrap or in the Docker file.
Can someone please help? Thanks!
I figured out the S3 issue. In the beta version, /mnt/s3 is not mounted into the container with read and write permissions.
So I needed to add "docker.allowed.rw-mounts" to the container-executor configuration, like below:
docker.allowed.rw-mounts=/etc/passwd,/mnt/s3
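For context, the docker properties live in the [docker] section of container-executor.cfg (on EMR typically under /etc/hadoop/conf; the exact path is an assumption here), so the relevant fragment would look roughly like:
[docker]
# allow containers to mount /mnt/s3 read-write, in addition to the defaults
docker.allowed.rw-mounts=/etc/passwd,/mnt/s3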

SonarQube 7 unable to start due to Elasticsearch binary not found

I have set up SonarQube 7; the problem is I am getting the below error while starting it:
2018.03.27 08:02:15 ERROR app[][o.s.a.p.SQProcess] Fail to launch process [es]
java.lang.IllegalStateException: Cannot find elasticsearch binary
at org.sonar.application.command.CommandFactoryImpl.createEsInstallation(CommandFactoryImpl.java:111)
at org.sonar.application.command.CommandFactoryImpl.createEsCommandForUnix(CommandFactoryImpl.java:80)
at org.sonar.application.command.CommandFactoryImpl.createEsCommand(CommandFactoryImpl.java:76)
at org.sonar.application.SchedulerImpl$$Lambda$12/1128486197.get(Unknown Source)
at org.sonar.application.SchedulerImpl.lambda$tryToStartProcess$2(SchedulerImpl.java:153)
at org.sonar.application.SchedulerImpl$$Lambda$13/1288526896.get(Unknown Source)
at org.sonar.application.process.SQProcess.start(SQProcess.java:68)
at org.sonar.application.SchedulerImpl.tryToStart(SchedulerImpl.java:160)
at org.sonar.application.SchedulerImpl.tryToStartProcess(SchedulerImpl.java:152)
at org.sonar.application.SchedulerImpl.tryToStartEs(SchedulerImpl.java:110)
at org.sonar.application.SchedulerImpl.tryToStartAll(SchedulerImpl.java:102)
at org.sonar.application.SchedulerImpl.schedule(SchedulerImpl.java:98)
at org.sonar.application.App.start(App.java:64)
I don't know why it's unable to find the Elasticsearch binary, as it's already located inside the installation directory.
Is there anywhere I have to mention its path inside any config file of SonarQube 7?
This is a new installation and I am not finding any solution anywhere.
Thanks for your help.
I had the same error on my Gentoo Linux system. None of the 6.7.x versions worked; all failed with the identical error (Cannot find elasticsearch binary).
I downloaded a fresh zip file from the SonarQube website (https://www.sonarqube.org/downloads/). The zip file contains an elasticsearch subdirectory, but this folder was missing from my original installation (installed by Gentoo emerge using the Godin overlay: https://data.gpo.zugaina.org/godin/dev-util/sonarqube-bin/).
Copying the elasticsearch folder into my original installation solved the problem.
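In other words, something like this (paths hypothetical; adjust to your unpacked zip and install locations):
cp -r /tmp/sonarqube-7.0/elasticsearch /opt/sonarqube/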
I found this pull request fixing the ebuild for sonarqube-bin:
https://github.com/Godin/gentoo-overlay/commit/d52d7491e10d3589832bf4785edb29caf9dd4012

How to run swagger-codegen for OpenAPI 3.0.0

It looks like official Swagger support for the OpenAPI specification v3 is near release (https://blog.readme.io/an-example-filled-guide-to-swagger-3-2/), and swagger-codegen has 3.0.0 support developed and passing some level of testing on the 3.0.0 branch (https://github.com/swagger-api/swagger-codegen).
I have an OpenAPI 3.0 spec (generated from my existing 2.0 spec via https://github.com/mermade/swagger2openapi; the output looks good).
Is there an easy way to run swagger-codegen without having to package the JAR myself?
This is the single result I found: https://oss.sonatype.org/content/repositories/snapshots/io/swagger/swagger-codegen-cli/3.0.0-SNAPSHOT/ but running it seems to be broken (judging from the output, possibly it is actually running 2.0, not 3.0.0?):
[main] INFO io.swagger.parser.Swagger20Parser - reading from /input/myspec.openapi3.json
[main] INFO io.swagger.codegen.ignore.CodegenIgnoreProcessor - No .swagger-codegen-ignore file found.
Exception in thread "main" java.lang.RuntimeException: missing swagger input or config!
at io.swagger.codegen.DefaultGenerator.generate(DefaultGenerator.java:685)
at io.swagger.codegen.cmd.Generate.run(Generate.java:285)
at io.swagger.codegen.SwaggerCodegen.main(SwaggerCodegen.java:35)
It looks like the swagger-codegen repo has a somewhat supported way to run a Docker container after you build; I'm just hoping/guessing there is a supported way to do this without needing to compile locally, as I need to set this up in several places.
OpenAPI Generator (founded by top contributors of Swagger Codegen) supports both OpenAPI specification v2 and v3.
You can use the Docker images, the Java JAR (SNAPSHOT), Homebrew, or npm to give it a try.
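For example, with the Docker image (the spec file name comes from the question above; the generator name and output directory are just examples):
docker run --rm -v ${PWD}:/local openapitools/openapi-generator-cli generate \
    -i /local/myspec.openapi3.json \
    -g java \
    -o /local/out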
For more information about OpenAPI Generator, please refer to the project's README
If you need any help, please open an issue and we'll look into it.
UPDATE: the first stable version, 3.0.0, has been released: https://github.com/OpenAPITools/openapi-generator/releases/tag/v3.0.0
Swagger-codegen 3.0.0 snapshots now include a limited number of targets for code generation from OpenAPI 3.0 definitions. https://github.com/swagger-api/swagger-codegen/issues/6598#issuecomment-333428808
There is an alternative, experimental implementation of the codegen engine, written in Node.js and using the original swagger-codegen 2.x templates: https://github.com/mermade/openapi-codegen. If your language is not yet supported, a config file just needs to be created for it, mapping the template files to outputs.

Add-on giving NoClassDefFoundError in Vaadin

I came across this add-on that converts the screen content into a PDF file. However, when I add these lines of code:
PdfFromComponent factory = new PdfFromComponent();
factory.export(contentcity);
I get this error message:
javax.servlet.ServletException: com.vaadin.server.ServiceException: java.lang.NoClassDefFoundError: com/itextpdf/text/DocumentException
with root cause:
com.vaadin.server.ServiceException: java.lang.NoClassDefFoundError: com/itextpdf/text/DocumentException
I already added the JAR file to the library and compiled the widgetset, but the error persists. Can someone briefly explain to me how to deal with this?
You need to add the following Maven dependency to your project:
<dependency>
    <groupId>com.itextpdf</groupId>
    <artifactId>itextpdf</artifactId>
    <version>5.5.6</version>
</dependency>
For some strange reason the author does not ship a POM file with his add-on, so he could not mark this as a dependency.
Unfortunately, the JVM also throws a NoClassDefFoundError when there is more than one version of a class on the classpath. It could be that you have more than one iText JAR in your classpath. Check whether the add-on you are using bundles the iText JAR, and double-check that you are not including the JAR more than once.
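If you are unsure which JAR wins, a quick diagnostic sketch (plain Java, not part of the add-on) prints where the class is actually loaded from:
// Prints the location of the JAR that provides DocumentException at runtime
Class<?> clazz = Class.forName("com.itextpdf.text.DocumentException");
System.out.println(clazz.getProtectionDomain().getCodeSource().getLocation());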
