I'm using Apache Velocity in front of LaTeX. The # and $ escape characters conflict with LaTeX. I want to replace # with %% and $ with ## to avoid the conflicts. A simple string replace on the source file is not a good solution because I have to use things like #parse and #include, and the parsed/included files should also be able to use the modified escape characters. Is there a configuration option for this?
You can use a custom resource loader to modify files loaded by #parse:
VelocityEngine engine = new VelocityEngine();
Properties props = new Properties();
props.put("resource.loader", "customloader");
props.put("customloader.resource.loader.class", CustomLoader.class.getName());
engine.init(props);
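public static class CustomLoader extends FileResourceLoader {
    @Override
    public InputStream getResourceStream(String name) throws ResourceNotFoundException {
        InputStream original = super.getResourceStream(name);
        // TODO: read `original`, rewrite the custom escape characters, and return the modified stream
        InputStream modified = original; // placeholder: passes the template through unchanged
        return modified;
    }
}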
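One way the TODO above could be filled in is sketched below. It is only a sketch: the class name EscapeTranslator is made up for illustration, the templates are assumed to be UTF-8, and it does nothing about literal # and $ (or a literal %%, which is a valid LaTeX comment) that the LaTeX source itself needs, so those would still have to be handled separately.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Velocity.
class EscapeTranslator {

    // Rewrites the substitute markers to Velocity's own: "##" -> "$" and "%%" -> "#".
    static String translate(String template) {
        // "##" is handled first; otherwise the "##" produced from "%%%%" would be rewritten again.
        return template.replace("##", "$").replace("%%", "#");
    }

    // Reads the whole stream (assumed UTF-8), translates it, and returns a new stream.
    static InputStream translate(InputStream original) throws IOException {
        try (InputStream in = original) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            String text = new String(buffer.toByteArray(), StandardCharsets.UTF_8);
            return new ByteArrayInputStream(translate(text).getBytes(StandardCharsets.UTF_8));
        }
    }
}
```

Inside getResourceStream you would then return the translated stream instead of the original one, wrapping the IOException as needed.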
I have text that is already tokenized, sentence-split, and POS-tagged.
I would like to use CoreNLP to additionally annotate lemmas (lemma), named entities (ner), constituency and dependency parses (parse), and coreferences (dcoref).
Is there a combination of commandline options and option file specifications that makes this possible from the command line?
According to this question, I can ask the parser to view whitespace as delimiting tokens, and newlines as delimiting sentences by adding this to my properties file:
tokenize.whitespace = true
ssplit.eolonly = true
This works well, so all that remains is to specify to CoreNLP that I would like to provide POS tags too.
When using the Stanford Parser standing alone, it seems to be possible to have it use existing POS tags, but copying that syntax to the invocation of CoreNLP doesn't seem to work. For example, this does not work:
java -cp *:./* -Xmx2g edu.stanford.nlp.pipeline.StanfordCoreNLP -props my-properties-file -outputFormat xml -outputDirectory my-output-dir -sentences newline -tokenized -tagSeparator / -tokenizerFactory edu.stanford.nlp.process.WhitespaceTokenizer -tokenizerMethod newCoreLabelTokenizerFactory -file my-annotated-text.txt
While this question covers programmatic invocation, I'm invoking CoreNLP from the command line as part of a larger system, so I'm really asking whether this can be achieved with command-line options.
I don't think this is possible with command line options.
If you want, you could instead write a custom annotator and include it in your pipeline.
Here is some sample code:
package edu.stanford.nlp.pipeline;
import edu.stanford.nlp.ling.*;
import java.util.*;
// Note: to be loaded as a custom annotator, this class also has to implement
// edu.stanford.nlp.pipeline.Annotator; its requirement methods differ between
// CoreNLP versions and are not shown here.
public class ProvidedPOSTaggerAnnotator {
  public String tagSeparator;
  public ProvidedPOSTaggerAnnotator(String annotatorName, Properties props) {
    tagSeparator = props.getProperty(annotatorName + ".tagSeparator", "_");
  }
  public void annotate(Annotation annotation) {
    for (CoreLabel token : annotation.get(CoreAnnotations.TokensAnnotation.class)) {
      String tagged = token.word();
      // split at the last separator so that words containing the separator stay intact
      int sep = tagged.lastIndexOf(tagSeparator);
      if (sep < 0) {
        continue; // no tag attached: leave the token unchanged
      }
      // set the word with the POS tag removed
      token.set(CoreAnnotations.TextAnnotation.class, tagged.substring(0, sep));
      // set the POS
      token.set(CoreAnnotations.PartOfSpeechAnnotation.class, tagged.substring(sep + tagSeparator.length()));
    }
  }
}
This should work if you provide your tokens with the POS tags attached, separated by "_". You can change the separator with the forcedpos.tagSeparator property.
If you add customAnnotatorClass.forcedpos = edu.stanford.nlp.pipeline.ProvidedPOSTaggerAnnotator
to the properties file, include the above class in your CLASSPATH, and then include "forcedpos" in your list of annotators after "tokenize", you should be able to pass in your own POS tags.
I may clean this up some more and actually include it in future releases for people!
I have not had time to actually test this code. If you try it out and find errors, please let me know and I'll fix it!
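The annotator above takes whatever follows the last separator as the tag and everything before it as the word. That step can be tried on its own without CoreNLP. The class and method names below are made up for illustration, and returning null for an untagged token is just one possible choice:

```java
// Hypothetical stand-alone sketch of the word/tag split; not part of CoreNLP.
class TagSplitSketch {

    // Returns {word, tag}; the tag is null when the token carries no separator.
    static String[] splitTagged(String token, String separator) {
        int idx = token.lastIndexOf(separator);
        if (idx < 0) {
            return new String[] { token, null };
        }
        return new String[] { token.substring(0, idx), token.substring(idx + separator.length()) };
    }

    public static void main(String[] args) {
        String[] parts = splitTagged("well_known_JJ", "_");
        System.out.println(parts[0] + " -> " + parts[1]); // prints "well_known -> JJ"
    }
}
```

Splitting at the last separator means a word that itself contains the separator keeps all of its parts.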
I'm working with a maven plugin that is using plexus-archiver in order to create a zip file.
Basically, I'm getting the component injected by Sisu, then I'm traversing a specified fileSet and registering the files that are required:
zipArchiver.addFile(from_file, to_file);
The zip file is generated properly.
But I need to include an extra-field for the file mime-type in some of those files that are being added to the zip.
How can I do that with plexus-archiver?
It seems that the current plexus-archiver (3.0) doesn't support extra fields.
I had to hack a bit in order to keep using plexus-archiver.
The solution was to extend ZipArchiver class and override the method initZipOutputStream that provides an object from ZipArchiveOutputStream class.
With it I could create the entry and its extra-field:
@Override
protected void initZipOutputStream(ZipArchiveOutputStream pZOut)
throws ArchiverException, IOException {
super.initZipOutputStream(pZOut);
ZipArchiveEntry ae = new ZipArchiveEntry(pFile,
pFile.getName());
ZipExtraField zef = new ContentTypeExtraField(
Constants.MIME_STRING);
ae.addExtraField(zef);
pZOut.putArchiveEntry(ae);
pZOut.write(content);
pZOut.closeArchiveEntry();
}
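If it helps to see what an extra field looks like on its own, here is a minimal sketch that uses only java.util.zip instead of plexus-archiver. A ZIP extra field is a 2-byte header ID, a 2-byte data length (both little-endian) and then the data. The header ID 0x4D54 and all names below are made up for illustration; they are not what ContentTypeExtraField above uses.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical sketch: stores a MIME type in a ZIP extra field with plain java.util.zip.
class MimeExtraFieldSketch {
    // Made-up header ID, chosen only for this example.
    static final short HEADER_ID = (short) 0x4D54;

    // Layout: header ID (2 bytes), data size (2 bytes), data; all little-endian.
    static byte[] encodeExtra(String mimeType) {
        byte[] data = mimeType.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(4 + data.length).order(ByteOrder.LITTLE_ENDIAN);
        buf.putShort(HEADER_ID).putShort((short) data.length).put(data);
        return buf.array();
    }

    // Walks the extra-field records and returns the MIME type, or null if absent.
    static String decodeExtra(byte[] extra) {
        ByteBuffer buf = ByteBuffer.wrap(extra).order(ByteOrder.LITTLE_ENDIAN);
        while (buf.remaining() >= 4) {
            short id = buf.getShort();
            int size = buf.getShort() & 0xFFFF;
            if (size > buf.remaining()) {
                break; // malformed record
            }
            byte[] data = new byte[size];
            buf.get(data);
            if (id == HEADER_ID) {
                return new String(data, StandardCharsets.UTF_8);
            }
        }
        return null;
    }

    // Writes a one-entry zip whose entry carries the MIME type as an extra field.
    static byte[] zipWithMime(String name, byte[] content, String mimeType) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            ZipEntry entry = new ZipEntry(name);
            entry.setExtra(encodeExtra(mimeType));
            zip.putNextEntry(entry);
            zip.write(content);
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }

    // Reads the first entry back and extracts the MIME type from its extra data.
    static String readMime(byte[] zipBytes) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry = zip.getNextEntry();
            if (entry == null || entry.getExtra() == null) {
                return null;
            }
            return decodeExtra(entry.getExtra());
        }
    }
}
```

With plexus-archiver you would still go through the ZipArchiveEntry shown above; this only illustrates the byte layout that ends up in the archive.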
I am new to javac. I have installed the JDK and set the path to make it work. I have already written several test programs and they worked.
Let's say I have a java file called Read.java and a text file called Numbers.txt
I have already changed to the directory where the files are, and I enter the command
javac Read.java
then
java Read < input.txt
The problem is: how can I set up the Read.java program to receive the input.txt file?
I know you can read the file from the program itself without redirection. But I want to learn how you can read a file using redirection.
Java's main method looks something like:
public static void main(String[] args)
{
// method body
}
args is an array of parameters that the user can pass to the program - the first parameter would be args[0], the second args[1] and so on.
To receive the input text file, you can have the user type java Read input.txt. input.txt will be the first parameter, and so you can access it by using args[0] in your main method.
A simple example of command line arguments:
public static void main(String[] args)
{
String input = args[0];
System.out.println("You entered: " + input);
}
You can run this by typing java ProgramName hello, and the output will be You entered: hello.
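If you want the same program to work both ways, one possible sketch is to read the file named in args[0] when an argument is given and fall back to standard input otherwise. The class name ReadArgOrStdin is made up for illustration:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical sketch: "java ReadArgOrStdin input.txt" and "java ReadArgOrStdin < input.txt" both work.
class ReadArgOrStdin {

    // Opens the file named by the first argument, or standard input if there is none.
    static BufferedReader open(String[] args) throws IOException {
        if (args.length > 0) {
            return Files.newBufferedReader(Paths.get(args[0]));
        }
        return new BufferedReader(new InputStreamReader(System.in));
    }

    public static void main(String[] args) throws IOException {
        try (BufferedReader in = open(args)) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}
```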
You need to read from standard input:
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
public class IORedirection {
public static void main(String[] args) throws IOException {
BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
String line;
while((line = in.readLine()) != null){
System.out.println(line);
}
}
}
> echo "hello stdin" | java IORedirection
> hello stdin
how I can set Read.java program to receive the input.txt file? I know you can read the file from the program itself without redirection. But I want to learn how you can read a file using redirection.
There are several ways to get input to your program.
This isn't about "Java", but rather what are the ways for the caller to write data to "standard input" (or "stdin"). Within any Java program, you can read stdin with System.in.
So, use System.in within your program, and then use a pipe (|) or a redirect (<). Below are two working examples from an answer I posted on a related question:
% cat input.txt | java SystemInExample.java
% java SystemInExample.java < input.txt
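For a file of numbers like the one in the question, a program on the receiving end of the pipe or redirect might look like the sketch below. It is not the SystemInExample from the linked answer; the class name SumFromStdin and the choice to skip non-numeric tokens are assumptions made for illustration.

```java
import java.io.InputStream;
import java.util.Scanner;

// Hypothetical sketch: sums the whole numbers arriving on standard input.
class SumFromStdin {

    // Adds up every whitespace-separated whole number in the stream; other tokens are skipped.
    static long sum(InputStream in) {
        long total = 0;
        Scanner scanner = new Scanner(in);
        while (scanner.hasNext()) {
            if (scanner.hasNextLong()) {
                total += scanner.nextLong();
            } else {
                scanner.next(); // not a number: discard the token
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // System.in is whatever the shell connected: a pipe, a redirected file, or the keyboard.
        System.out.println(sum(System.in));
    }
}
```

The program itself never sees the file name; the shell opens the file and hands its contents over as standard input.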
I'm looking for the right way to externalize the settings in my server Dart application.
In Java, the common way would be a properties file. Is there something similar in Dart?
You can just use a Dart script for your settings. No point in using a different format if there is no specific reason.
With a simple import you have it available in a typed way.
When the Resource class is implemented, I would just use a JSON file that is deployed with my program.
You could use global variables, for example:
String DB_URL = 'localhost:5432/mydb';
String DB_PASS = 'my_pass';
Then you could create a different configuration file for every environment. For example, for production you could create a production_config.dart which could contain:
loadConfig() {
DB_URL = '123.123.123.123:5432/mydb';
DB_PASS = 'my_prod_pass';
}
Then in your main function you could call production_config.loadConfig if environment is production, for example:
import 'production_config.dart' as prodConfig;
main(List<String> args) {
var ENV = getEnvFromArgs(args);
if(ENV == 'PROD') {
prodConfig.loadConfig();
}
//do other stuff here
}
That way, if you want to switch from development to production, you only need to pass an argument to your Dart program, for example:
dart myprogram.dart -env=PROD
The advantages of this approach are that you don't need to create a separate properties, JSON or YAML file, and you don't need to parse one. Furthermore, the properties are type-checked.
I like putting configuration in a Dart class like what Günter Zöchbauer was talking about, but there is also the option of using the safe_config package. With this you enter the values in a yaml file. Quoting from the docs:
You define a subclass of Configuration with those properties:
class ApplicationConfiguration extends Configuration {
ApplicationConfiguration(String fileName) :
super.fromFile(File(fileName));
int port;
String serverHeader;
}
Your YAML file should contain those two, case-sensitive keys:
port: 8000
serverHeader: booyah/1
To read your configuration file:
var config = new ApplicationConfiguration("config.yaml");
print("${config.port}"); // -> 8000
print("${config.serverHeader}"); // -> "booyah/1"
See also an example from a setup in Aqueduct.
main() {
var env = const String.fromEnvironment("ENV", defaultValue: "local");
print("Env === " + env);
}
Pass the environment as an option when running the Dart app:
pub serve --port=9002 --define ENV=dev
References:
http://blog.sethladd.com/2013/12/compile-time-dead-code-elimination-with.html
https://github.com/dart-lang/sdk/issues/27998
I am working on a legacy project based on ZF1 which uses the ISO-8859-1 charset. The server's default encoding is also ISO. New modules should be implemented using ZF2. How can the default encoding, e.g. for escapers, be set globally to something other than UTF-8 in ZF2?
If you use escapers directly in your modules, this will be a problem. If you use only the view helpers, there is an option to set the encoding.
Every escaper view helper (EscapeCss, EscapeHtml and so on) extends Zend\View\Helper\Escaper\AbstractHelper. This class has a method setEncoding(). Because the encoding is not shared between helper instances, you must set it on each helper individually.
For example, you can set the correct encoding during bootstrap. Say you have your Application module:
<?php
namespace Application;
use Zend\Mvc\MvcEvent;
class Module
{
public function onBootstrap(MvcEvent $e)
{
$app = $e->getApplication();
$sm = $app->getServiceManager();
$manager = $sm->get('ViewHelperManager');
$plugins = array('escapehtml', 'escapehtmlattr', 'escapejs', 'escapecss', 'escapeurl');
$encoding = 'ISO-8859-1';
foreach ($plugins as $name) {
$plugin = $manager->get($name);
$plugin->setEncoding($encoding);
}
}
}
This should correct all plugins to the ISO-8859-1 encoding. If any of your modules, or any 3rd party modules, use the escaper view helpers, the ISO-8859-1 encoding will be used.