Apache Tika: Setting the classpath for OpenNLP models on tika-server

I can't seem to set the classpath for the tika-server so that the opennlp models are detected correctly.
I've followed the instructions here:
https://wiki.apache.org/tika/TikaAndNER
(substituting the server jar for the app jar, seeing as that looked like it contained everything required)
I have created the following folder structure:
tika
`-- tika-ner-resources
    `-- org
        `-- apache
            `-- tika
                `-- parser
                    `-- ner
                        `-- opennlp
                            |-- ner-location.bin
                            |-- ner-organization.bin
                            `-- ner-person.bin
Running:
java -classpath tika/tika-ner-resources -jar tika-server-1.18.jar --config /etc/tika-config.xml -enableUnsecureFeatures -h 0.0.0.0
and issuing
curl -v -XPUT --data-binary @test.pdf http://localhost:9998/tika --header "Accept: text/plain" --header "Content-Type: application/pdf"
results in
INFO going to load, instantiate and bind the instance of org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
WARN Couldn't find model from org/apache/tika/parser/ner/opennlp/ner-location.bin using class loader
INFO LOCATION NER : Available for service ? false
WARN Couldn't find model from org/apache/tika/parser/ner/opennlp/ner-organization.bin using class loader
INFO ORGANIZATION NER : Available for service ? false
WARN Couldn't find model from org/apache/tika/parser/ner/opennlp/ner-date.bin using class loader
INFO DATE NER : Available for service ? false
WARN Couldn't find model from org/apache/tika/parser/ner/opennlp/ner-money.bin using class loader
INFO MONEY NER : Available for service ? false
WARN Couldn't find model from org/apache/tika/parser/ner/opennlp/ner-person.bin using class loader
INFO PERSON NER : Available for service ? false
WARN Couldn't find model from org/apache/tika/parser/ner/opennlp/ner-percentage.bin using class loader
INFO PERCENT NER : Available for service ? false
WARN Couldn't find model from org/apache/tika/parser/ner/opennlp/ner-time.bin using class loader
INFO TIME NER : Available for service ? false
INFO org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser is available ? false
INFO going to load, instantiate and bind the instance of org.apache.tika.parser.ner.regex.RegexNERecogniser
INFO org.apache.tika.parser.ner.regex.RegexNERecogniser is available ? false
INFO Number of NERecognisers in chain 0
The only thing that seems to work is re-packing the jar by adding the contents of the tika/tika-ner-resources directory (i.e. org/blah/blah/*.bin). The curl command then executes without any issues. I've also tried almost every combination of setting the classpath.
Does anyone have any ideas?

For anyone else having issues, the following command worked for me: remove -jar and specify the TikaServerCli main class manually. This matters because java ignores -classpath (and the CLASSPATH environment variable) whenever -jar is used, so the models directory never makes it onto the classpath otherwise.
java -classpath tika/tika-ner-resources/:tika-server-1.18.jar org.apache.tika.server.TikaServerCli --config /etc/tika-config.xml -enableUnsecureFeatures -h 0.0.0.0
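The repacking workaround from the question can also be scripted instead of done by hand; a minimal sketch using the JDK's jar tool, assuming the directory layout shown above:
# update the server jar in place, adding org/apache/tika/parser/ner/opennlp/*.bin
# so the default class loader finds the models even when -jar is used
jar uf tika-server-1.18.jar -C tika/tika-ner-resources org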

Related

ERROR: cannot launch node of type [darknet_ros/darknet_ros]:

dishita@dishita-VirtualBox:~/catkin_ws/src/darknet_ros$ roslaunch darknet_ros darknet_ros.launch
... logging to /home/dishita/.ros/log/a54fc4ec-3828-11ed-8e10-2d44a183ac97/roslaunch-dishita-VirtualBox-7714.log
Checking log directory for disk usage. This may take a while.
Press Ctrl-C to interrupt
Done checking log file disk usage. Usage is <1GB.
started roslaunch server http://dishita-VirtualBox:37933/
SUMMARY
PARAMETERS
/darknet_ros/actions/camera_reading/name: /darknet_ros/chec...
/darknet_ros/config_path: /home/dishita/cat...
/darknet_ros/image_view/enable_console_output: True
/darknet_ros/image_view/enable_opencv: True
/darknet_ros/image_view/wait_key_delay: 1
/darknet_ros/publishers/bounding_boxes/latch: False
/darknet_ros/publishers/bounding_boxes/queue_size: 1
/darknet_ros/publishers/bounding_boxes/topic: /darknet_ros/boun...
/darknet_ros/publishers/detection_image/latch: True
/darknet_ros/publishers/detection_image/queue_size: 1
/darknet_ros/publishers/detection_image/topic: /darknet_ros/dete...
/darknet_ros/publishers/object_detector/latch: False
/darknet_ros/publishers/object_detector/queue_size: 1
/darknet_ros/publishers/object_detector/topic: /darknet_ros/foun...
/darknet_ros/subscribers/camera_reading/queue_size: 1
/darknet_ros/subscribers/camera_reading/topic: /webcam/image_raw
/darknet_ros/weights_path: /home/dishita/cat...
/darknet_ros/yolo_model/config_file/name: yolov2-tiny.cfg
/darknet_ros/yolo_model/detection_classes/names: ['person', 'bicyc...
/darknet_ros/yolo_model/threshold/value: 0.3
/darknet_ros/yolo_model/weight_file/name: yolov2-tiny.weights
/rosdistro: noetic
/rosversion: 1.15.14
NODES
/
darknet_ros (darknet_ros/darknet_ros)
auto-starting new master
process[master]: started with pid [7722]
ROS_MASTER_URI=http://localhost:11311
setting /run_id to a54fc4ec-3828-11ed-8e10-2d44a183ac97
process[rosout-1]: started with pid [7732]
started core service [/rosout]
ERROR: cannot launch node of type [darknet_ros/darknet_ros]: Cannot locate node of type [darknet_ros] in package [darknet_ros]. Make sure file exists in package path and permission is set to executable (chmod +x)
I've sourced the bash file and made the file executable using chmod +x ~/catkin_ws/src/darknet_ros, but I am still getting this error. Help me out.
You have to build the darknet_ros repo again. Use:
catkin build darknet_ros
Also source the setup file again.
chmod +x ~/catkin_ws/src/darknet_ros is not enough to make the ROS node executable: it targets the package directory, not the node itself.
You have to locate the source file of the node, which is probably darknet_ros/ros/yolo_object_detector_node.cpp, and make that file executable.
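Putting the two answers together, a minimal rebuild-and-relaunch sequence might look like this (assuming a catkin_tools workspace at ~/catkin_ws, as in the question):
cd ~/catkin_ws
catkin build darknet_ros      # rebuild so the node binary exists in the package path
source devel/setup.bash       # re-source so ROS can resolve the package
roslaunch darknet_ros darknet_ros.launch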

Ansible connection to Docker engine on macOS (Apple Silicon)

I'm trying to connect to my local Docker engine running on macOS (M1 chip) in order to create a dynamic inventory.
I've created a hosts file with the following config (I made sure the docker_containers plugin is properly installed):
plugin: community.docker.docker_containers
docker_host: "unix://Users/ME/.docker/run/docker-cli-api.sock"
Then I run ansible-inventory --graph -i ./hosts/hosts-docker-local.yaml.
But I'm getting the following error:
[WARNING]: * Failed to parse /Users/ME/Projects/ansible-test/hosts/hosts-docker-local.yaml with auto plugin: inventory source '/Users/ME/Projects/ansible-test/hosts/hosts-docker-local.yaml' could not be
verified by inventory plugin 'community.docker.docker_containers'
[WARNING]: * Failed to parse /Users/ME/Projects/ansible-test/hosts/hosts-docker-local.yaml with yaml plugin: Plugin configuration YAML file, not YAML inventory
[WARNING]: * Failed to parse /Users/ME/Projects/ansible-test/hosts/hosts-docker-local.yaml with ini plugin: Invalid host pattern 'plugin:' supplied, ending in ':' is not allowed, this character is reserved to
provide a port.
[WARNING]: Unable to parse /Users/ME/Projects/ansible-test/hosts/hosts-docker-local.yaml as an inventory source
[WARNING]: No inventory was parsed, only implicit localhost is available
@all:
  |--@ungrouped:
I tried
ansible-doc -t inventory -l | grep docker
community.docker.docker_containers Ansible dynamic inv...
community.docker.docker_machine Docker Machine inve...
community.docker.docker_swarm Ansible dynamic inv...
but somehow if I do this
ansible localhost -i ./hosts/hosts-docker-local.yaml -m community.docker.docker_containers
It complains
localhost | FAILED! => {
"msg": "The module community.docker.docker_containers was not found in configured module paths"
}
Maybe something is wrong with my module path, or something weird with macOS? (I installed Ansible with brew.)
The inventory file must end in docker.yaml, as pointed out by @Zeitounator. From the docs:
Uses a YAML configuration file that ends with docker.[yml|yaml].
https://docs.ansible.com/ansible/latest/collections/community/docker/docker_containers_inventory.html#synopsis
Note also that community.docker.docker_containers is an inventory plugin, not a module, which is why invoking it with ansible -m fails with "module ... was not found".
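Given the paths in the question, the fix can be as simple as renaming the inventory file so it matches that pattern and re-running against the new name:
mv hosts/hosts-docker-local.yaml hosts/hosts-docker-local.docker.yaml
ansible-inventory --graph -i ./hosts/hosts-docker-local.docker.yaml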

Failed to run Docker with exported model from GCP Vision AutoML

I trained an image classification model using GCP AutoML Vision and I would like to deploy it in my own web app using Docker. Following the tutorial from GCP, I exported my AutoML Vision model to a saved_model.pb and managed to copy it to my local drive. When I try to run the Docker image with the command below, there's an error. The error message follows the command:
sudo docker run --rm --name ${CONTAINER_NAME} -p ${PORT}:8501 -v ${YOUR_MODEL_PATH}:/tmp/mounted_model/0001 -t ${CPU_DOCKER_GCR_PATH}
2020-03-18 06:52:52.851811: I tensorflow_serving/model_servers/server_core.cc:462] Adding/updating models.
2020-03-18 06:52:52.851825: I tensorflow_serving/model_servers/server_core.cc:559] (Re-)adding model: default
2020-03-18 06:52:52.859873: I tensorflow_serving/core/basic_manager.cc:739] Successfully reserved resources to load servable {name: default version: 1}
2020-03-18 06:52:52.859923: I tensorflow_serving/core/loader_harness.cc:66] Approving load for servable version {name: default version: 1}
2020-03-18 06:52:52.859938: I tensorflow_serving/core/loader_harness.cc:74] Loading servable version {name: default version: 1}
2020-03-18 06:52:52.860387: I external/org_tensorflow/tensorflow/contrib/session_bundle/bundle_shim.cc:363] Attempting to load native SavedModelBundle in bundle-shim from: /tmp/mounted_model/0001
2020-03-18 06:52:52.860426: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:31] Reading SavedModel from: /tmp/mounted_model/0001
2020-03-18 06:52:52.861256: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:54] Reading meta graph with tags { serve }
2020-03-18 06:52:52.861345: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:310] SavedModel load for tags { serve }; Status: fail. Took 916 microseconds.
2020-03-18 06:52:52.861357: E tensorflow_serving/util/retrier.cc:37] Loading servable: {name: default version: 1} failed: Not found: Could not find meta graph def matching supplied tags: { serve }. To inspect available tag-sets in the SavedModel, please use the SavedModel CLI: `saved_model_cli`
I did some research online; it seems the problem lies in how the model was exported, and GCP does not offer any related options when exporting the model. I could really use the help, thanks guys.
Seems that the model doesn't have a graph corresponding to the serve tag.
I found a similar issue reported on the TensorFlow GitHub page. To inspect the available tag-sets in a SavedModel, you can use the SavedModel CLI:
$ saved_model_cli show --dir ./modelDir
I also found guidance on adding the serve tag to a model from TensorFlow Hub; it seems that using transfer learning could help you export or save the model with the serve tag.
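Once a tag-set does show up, the signatures under it can be inspected the same way; a hedged example using the same illustrative ./modelDir path (the --tag_set value must be one of the tag-sets reported by the command above):
$ saved_model_cli show --dir ./modelDir --tag_set serve --signature_def serving_default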

SaltStack: getting No top file or master_tops data matches found

I am new to SaltStack, following some tutorials, and trying to execute state.apply, but I get the error below:
# salt "host2" state.apply
host2:
----------
ID: states
Function: no.None
Result: False
Comment: No Top file or external nodes data matches found
Started:
Duration:
Changes:
Summary for host2
------------
Succeeded: 0
Failed: 1
------------
Total states run: 1
I am able to test.ping the host successfully.
Here is the directory structure:
/etc/salt/srv/salt/states
|-- top.sls
`-- installations
    `-- init.sls
The file_roots entry in the master config:
file_roots:
  base:
    - /srv/salt/states
top.sls:
base:
  '*':
    - installations
init.sls:
install_apache:
  pkg.installed:
    - name: apache2
You need to change the path to your states, or move them to the path set in file_roots.
The file_roots option is where you should place your files; you should have the following tree:
# tree /srv/salt/
/srv/salt/
|-- installations
|   `-- init.sls
`-- top.sls
Or you could change your file_roots, but I wouldn't do it, since /srv/salt/ seems to be a sort of "standard".
Have a look at the tutorials, if you haven't already: https://docs.saltstack.com/en/getstarted/fundamentals/
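One way to follow the first suggestion, assuming the file_roots value and the layout from the question:
# move the states into the path configured in file_roots
mkdir -p /srv/salt/states
cp -r /etc/salt/srv/salt/states/* /srv/salt/states/
salt 'host2' state.apply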
I changed the file_roots to:
file_roots:
  base:
    - /etc/salt/srv/salt/states
and it works for me. Looks like it wasn't picking up the path correctly.

Flume agentSink "Unable to load output format plugin class"

I'm getting the following error and I have no idea why. If I change the sink to "console", it works fine. I'm just trying to recreate an example from the Flume documentation, except across two different nodes. This is using CDH3.
2011-10-20 17:41:13,046 [main] WARN text.FormatFactory: Unable to load output format plugin class - Class not found
2011-10-20 17:41:13,065 [main] INFO agent.FlumeNode: Loading spec from command line: 'foo:console|agentSink("somehost",35853);'
2011-10-20 17:41:13,228 [main] WARN agent.FlumeNode: Caught exception loading node:null
I'm trying to run flume as such:
flume node_nowatch -1 -s -n foo -c 'foo:console|agentSink("somehost",35853);'
Thanks in advance.