    import java.util.ArrayList;
    import java.util.List;
    import net.sf.ehcache.pool.sizeof.UnsafeSizeOf;

    public class EhCacheTest {
        private List<Double> testList = new ArrayList<Double>();

        public static void main(String[] args) {
            EhCacheTest test = new EhCacheTest();
            for (int i = 0; i < 1000; i++) {
                test.addItem(1.0);
                System.out.println(new UnsafeSizeOf().sizeOf(test));
            }
        }

        public void addItem(double a) {
            testList.add(a);
        }
    }
I used UnsafeSizeOf to calculate the size of the object 'test'.
No matter how many doubles I add to the list, the size of 'test' is always 16 bytes.
Because of this, the maxBytesLocalHeap parameter is useless to me.
Given your code snippet, this is totally expected.
See the javadoc of the method SizeOf.sizeOf: it clearly states that it sizes the instance without navigating the object graph.
What you want to use is SizeOf.deepSizeOf, which will navigate the object graph.
However, this test is flawed, because you risk comparing that specific SizeOf implementation against another one: Ehcache uses multiple implementations, depending on what's available in a specific environment.
If you really want to confirm that byte sizing on heap works, you are better off filling a cache, finding out the size it reports through statistics, and then taking a heap dump to see how much memory is effectively used.
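To illustrate why a measurement that does not navigate the object graph stays constant, here is a minimal sketch that uses only the JDK. It is not Ehcache's implementation and it counts objects rather than bytes. Holder and GraphWalk are hypothetical names introduced for this example; Holder stands in for EhCacheTest by owning a single reference field.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.Set;

// Hypothetical stand-in for EhCacheTest: the instance itself holds one reference.
class Holder {
    Object[] items = new Object[0];

    void add(Object item) {
        items = Arrays.copyOf(items, items.length + 1);
        items[items.length - 1] = item;
    }
}

class GraphWalk {
    // Counts the distinct objects reachable from a non-null root, root included.
    // Objects of JDK classes (names starting with "java.") are treated as leaves.
    static int countReachable(Object root) {
        Set<Object> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Object> pending = new ArrayDeque<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            Object current = pending.pop();
            if (!seen.add(current)) {
                continue; // already visited
            }
            Class<?> type = current.getClass();
            if (type.isArray()) {
                if (!type.getComponentType().isPrimitive()) {
                    for (Object element : (Object[]) current) {
                        if (element != null) pending.push(element);
                    }
                }
            } else if (!type.getName().startsWith("java.")) {
                pushReferenceFields(current, type, pending);
            }
        }
        return seen.size();
    }

    // Queues every non-null, non-static reference field of 'owner'.
    private static void pushReferenceFields(Object owner, Class<?> type, Deque<Object> pending) {
        for (Class<?> c = type; c != null && !c.getName().startsWith("java."); c = c.getSuperclass()) {
            for (Field field : c.getDeclaredFields()) {
                if (Modifier.isStatic(field.getModifiers()) || field.getType().isPrimitive()) {
                    continue;
                }
                try {
                    field.setAccessible(true);
                    Object value = field.get(owner);
                    if (value != null) pending.push(value);
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
    }
}
```

The Holder instance on its own is always exactly one object, however many items are added, which mirrors the constant 16 bytes in the question. Only the graph walk grows with the contents.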
According to the Apache Beam documentation, the recommended way to write simple sources is by using Read transforms and ParDo. Unfortunately, the Apache Beam docs have let me down here.
I'm trying to write a simple unbounded data source which emits events using a ParDo but the compiler keeps complaining about the input type of the DoFn object:
message: 'The method apply(PTransform<? super PBegin,OutputT>) in the type PBegin is not applicable for the arguments (ParDo.SingleOutput<PBegin,Event>)'
My attempt:
    public class TestIO extends PTransform<PBegin, PCollection<Event>> {
        @Override
        public PCollection<Event> expand(PBegin input) {
            return input.apply(ParDo.of(new ReadFn()));
        }

        private static class ReadFn extends DoFn<PBegin, Event> {
            @ProcessElement
            public void process(@TimerId("poll") Timer pollTimer) {
                Event testEvent = new Event(...);
                //custom logic, this can happen infinitely
                for(...) {
                    context.output(testEvent);
                }
            }
        }
    }
A DoFn performs element-wise processing. As written, ParDo.of(new ReadFn()) will have type PTransform<PCollection<PBegin>, PCollection<Event>>. Specifically, the ReadFn indicates it takes an element of type PBegin and returns 0 or more elements of type Event.
Instead, you should use an actual Read operation. There are a variety provided. You can also use Create if you have a specific set of in-memory collections to use.
If you need to create a custom source you should use the Read transform. Since you're using timers, you likely want to create an Unbounded Source (a stream of elements).
So, I have 2 partitions in a step which writes into a database. I want to record the number of rows written in each partition, get the sum, and print it to the log.
I was thinking of using a static variable in the Writer and use Step Context/Job Context to get it in afterStep() of the Step Listener. However when I tried it I got null. I am able to get these values in close() of the Reader.
Is this the right way to go about it? Or should I use Partition Collector/Reducer/ Analyzer?
I am using Java Batch in WebSphere Liberty, and I am developing in Eclipse.
I was thinking of using a static variable in the Writer and use Step Context/Job Context to get it in afterStep() of the Step Listener. However when I tried it I got null.
The ItemWriter might already be destroyed at this point, but I'm not sure.
Is this the right way to go about it?
Yes, it should be good enough. However, you need to ensure the total row count is shared for all partitions because the batch runtime maintains a StepContext clone per partition. You should rather use JobContext.
I think using PartitionCollector and PartitionAnalyzer is a good choice, too. The PartitionCollector interface has a method collectPartitionData() to collect data coming from its partition. Once collected, the batch runtime passes this data to the PartitionAnalyzer to analyze it. Note that there are:
N PartitionCollector per step (1 per partition)
N StepContext per step (1 per partition)
1 PartitionAnalyzer per step
The number of records written can be passed via StepContext's transientUserData. Since the StepContext is reserved for its own step-partition, the transient user data won't be overwritten by another partition.
Here's the implementation:
MyItemWriter:
    @Inject
    private StepContext stepContext;

    @Override
    public void writeItems(List<Object> items) throws Exception {
        // ...
        Object userData = stepContext.getTransientUserData();
        stepContext.setTransientUserData(partRowCount);
    }
MyPartitionCollector:
    @Inject
    private StepContext stepContext;

    @Override
    public Serializable collectPartitionData() throws Exception {
        // get transient user data
        Object userData = stepContext.getTransientUserData();
        int partRowCount = userData != null ? (int) userData : 0;
        return partRowCount;
    }
MyPartitionAnalyzer:
    private int rowCount = 0;

    @Override
    public void analyzeCollectorData(Serializable fromCollector) throws Exception {
        rowCount += (int) fromCollector;
        System.out.printf("%d rows processed (all partitions).%n", rowCount);
    }
Reference : JSR352 v1.0 Final Release.pdf
Let me offer a bit of an alternative on the accepted answer and add some comments.
PartitionAnalyzer variant - Use analyzeStatus() method
Another technique would be to use analyzeStatus which only gets called at the end of each entire partition, and is passed the partition-level exit status.
    public void analyzeStatus(BatchStatus batchStatus, String exitStatus)
In contrast, the above answer using analyzeCollectorData gets called at the end of each chunk on each partition.
E.g.
    public class MyItemWriteListener extends AbstractItemWriteListener {
        @Inject
        StepContext stepCtx;

        @Override
        public void afterWrite(List<Object> items) throws Exception {
            // update 'newCount' based on items.size()
            stepCtx.setExitStatus(Integer.toString(newCount));
        }
    }
Obviously this only works if you weren't using the exit status for some other purpose. You can set the exit status from any artifact (though this freedom might be one more thing to have to keep track of).
Comments
The API is designed to facilitate an implementation dispatching individual partitions across JVMs, (e.g. in Liberty you can see this here.) But using a static ties you to a single JVM, so it's not a recommended approach.
Also note that both the JobContext and the StepContext are implemented in the "thread-local"-like fashion we see in batch.
To the best of my knowledge, RollingFileAppender in log4j2 will not roll over at the specified time (say, at the end of an hour), but at the first log event that arrives after the time threshold has been exceeded.
Is there a way to trigger an event, that on one hand will cause the file to roll over, and on another - will not append to the log (or will append something trivial, like an empty string)?
No, there isn't any (built-in) way to do this. There are no background threads monitoring rollover time.
You could create a log4j2 plugin that implements org.apache.logging.log4j.core.appender.rolling.TriggeringPolicy. (See the built-in TimeBasedTriggeringPolicy and SizeBasedTriggeringPolicy classes for sample code.)
If you configure your custom triggering policy, log4j2 will check for every log event whether it should trigger a rollover (so take care when implementing the isTriggeringEvent method to avoid impacting performance). Note that for your custom plugin to be picked up, you need to specify the package of your class in the packages attribute of the Configuration element of your log4j2.xml file.
Finally, if this works well for you and you think your solution may be useful to others too, consider contributing your custom triggering policy back to the log4j2 code base.
Following Remko's idea, I wrote the following code, and it's working.
    package com.stony;

    import org.apache.logging.log4j.core.LogEvent;
    import org.apache.logging.log4j.core.appender.rolling.*;
    import org.apache.logging.log4j.core.config.plugins.Plugin;
    import org.apache.logging.log4j.core.config.plugins.PluginFactory;

    @Plugin(name = "ForceTriggerPolicy", category = "Core")
    public class ForceTriggerPolicy implements TriggeringPolicy {
        private static boolean isRolling;

        @Override
        public void initialize(RollingFileManager arg0) {
            setRolling(false);
        }

        @Override
        public boolean isTriggeringEvent(LogEvent arg0) {
            return isRolling();
        }

        public static boolean isRolling() {
            return isRolling;
        }

        public static void setRolling(boolean _isRolling) {
            isRolling = _isRolling;
        }

        @PluginFactory
        public static ForceTriggerPolicy createPolicy(){
            return new ForceTriggerPolicy();
        }
    }
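For reference, a configuration that wires in the policy above might look roughly like the following log4j2.xml fragment. This is a sketch, not a tested setup: the appender name, file paths and layout pattern are placeholders, and only the packages attribute (com.stony) and the ForceTriggerPolicy element name are taken from the code above.

```xml
<Configuration status="warn" packages="com.stony">
  <Appenders>
    <!-- fileName and filePattern are placeholder paths -->
    <RollingFile name="RollingFile" fileName="logs/app.log"
                 filePattern="logs/app-%d{yyyy-MM-dd-HH}-%i.log">
      <PatternLayout pattern="%d %p %c{1.} [%t] %m%n"/>
      <Policies>
        <!-- element name matches the name given in the @Plugin annotation -->
        <ForceTriggerPolicy/>
      </Policies>
    </RollingFile>
  </Appenders>
  <Loggers>
    <Root level="info">
      <AppenderRef ref="RollingFile"/>
    </Root>
  </Loggers>
</Configuration>
```

With this in place, calling ForceTriggerPolicy.setRolling(true) should make the next log event trigger a rollover. Note that the class as written never resets the flag itself, so the caller presumably has to call setRolling(false) afterwards.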
If you have access to the RollingFileAppender object, you could do something like:
    rollingFileAppender.getManager().rollover();
Here you can see the manager class:
https://github.com/apache/logging-log4j2/blob/d368e294d631e79119caa985656d0ec571bd24f5/log4j-core/src/main/java/org/apache/logging/log4j/core/appender/rolling/RollingFileManager.java
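Because log4j2 does not monitor the rollover time in a background thread, one option is to schedule the trigger yourself. The following is a minimal sketch using only the JDK. HourlyRolloverScheduler is a hypothetical helper, and the Runnable passed to start() is where a call such as rollingFileAppender.getManager().rollover() would go.

```java
import java.time.Duration;
import java.time.ZonedDateTime;
import java.time.temporal.ChronoUnit;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class HourlyRolloverScheduler {
    // Milliseconds from 'now' until the start of the next full hour.
    static long millisUntilNextHour(ZonedDateTime now) {
        ZonedDateTime nextHour = now.truncatedTo(ChronoUnit.HOURS).plusHours(1);
        return Duration.between(now, nextHour).toMillis();
    }

    // Runs 'rolloverAction' at the next full hour and then once per hour.
    // If the action throws, the executor suppresses all later runs,
    // so the action should handle its own exceptions.
    static ScheduledExecutorService start(Runnable rolloverAction) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "log-rollover");
            t.setDaemon(true); // do not keep the JVM alive just for rollovers
            return t;
        });
        scheduler.scheduleAtFixedRate(rolloverAction,
                millisUntilNextHour(ZonedDateTime.now()),
                TimeUnit.HOURS.toMillis(1),
                TimeUnit.MILLISECONDS);
        return scheduler;
    }
}
```

A call such as HourlyRolloverScheduler.start(() -> rollingFileAppender.getManager().rollover()) would then roll the file on the hour even when no log event arrives, assuming the appender reference is available at that point.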
I've written a SAX parser for XML files that contain a number of different tags. For performance reasons, I chose SAX over DOM, and I'm glad I did because it is fast and works well. The only issue I currently have is that the main class (which extends DefaultHandler) is a bit large and not very easy on the eyes. It contains a huge if/else-if block where I check the tag name, with some nested ifs for reading specific attributes. This block is located in the startElement method.
Is there any nice clean way to split this up? I would like to have a main class which reads the files, and then a Handler for every tag. In this Tag Handler, I'd like to read the attributes for that tag, do something with them, and then go back to the main handler to read the next tag which again gets redirected to the appropriate handler.
My main handler also has a few global Collection variables, which gather information regarding all the documents I parse with it. Ideally, I would be able to add something to those collections from the Tag Handlers.
A code example would be very helpful, if possible. I read something on this site about a Handler Stack, but without code example I was not able to reproduce it.
Thanks in advance :)
I suggest setting up a chain of SAX filters. A SAX filter is just like any other SAX Handler, except that it has another SAX handler to pass events into when it's done. They're frequently used to perform a sequence of transformations to an XML stream, but they can also be used to factor things the way you want.
You don't mention the language you're using, but you mention DefaultHandler so I'll assume Java. The first thing to do is to code up your filters. In Java, you do this by implementing XMLFilter (or, more simply, by subclassing XMLFilterImpl):
    import java.util.Collection;
    import org.xml.sax.Attributes;
    import org.xml.sax.SAXException;
    import org.xml.sax.helpers.XMLFilterImpl;

    public class TagOneFilter extends XMLFilterImpl {
        private Collection<Object> collectionOfStuff;

        public TagOneFilter(Collection<Object> collectionOfStuff) {
            this.collectionOfStuff = collectionOfStuff;
        }

        @Override
        public void startElement(String uri, String localName, String qName,
                Attributes atts) throws SAXException {
            if ("tagOne".equals(qName)) {
                // Interrogate the parameters and update collectionOfStuff
            }
            // Pass the event to downstream filters.
            if (getContentHandler() != null)
                getContentHandler().startElement(uri, localName, qName, atts);
        }
    }
Next, your main class, which instantiates all of the filters and chains them together.
    import java.util.ArrayList;
    import java.util.Collection;
    import org.xml.sax.XMLReader;
    import org.xml.sax.helpers.XMLReaderFactory;

    public class Driver {
        public static void main(String[] args) throws Exception {
            Collection<Object> collectionOfStuff = new ArrayList<Object>();
            XMLReader parser = XMLReaderFactory.createXMLReader();
            TagOneFilter tagOneFilter = new TagOneFilter(collectionOfStuff);
            tagOneFilter.setParent(parser);
            TagTwoFilter tagTwoFilter = new TagTwoFilter(collectionOfStuff);
            tagTwoFilter.setParent(tagOneFilter);
            // Call parse() on the tail of the filter chain. This will finish
            // tying the filters together before executing the parse at the
            // XMLReader at the beginning.
            tagTwoFilter.parse(args[0]);
            // Now do something interesting with your collectionOfStuff.
        }
    }
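For something that can be run directly, here is a compact, self-contained variant of the same idea. It is a sketch under a few assumptions: a single hypothetical TagFilter class is parameterized by tag name instead of writing one class per tag, every tag carries an "id" attribute, the XML comes from an in-memory string, and the reader is obtained through SAXParserFactory because XMLReaderFactory is deprecated in newer JDKs.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

// Records the "id" attribute of every element named tagName into a shared list.
class TagFilter extends XMLFilterImpl {
    private final String tagName;
    private final List<String> collected;

    TagFilter(String tagName, List<String> collected) {
        this.tagName = tagName;
        this.collected = collected;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        if (tagName.equals(qName)) {
            collected.add(tagName + ":" + atts.getValue("id"));
        }
        // XMLFilterImpl forwards the event to the next handler in the chain, if any.
        super.startElement(uri, localName, qName, atts);
    }
}

class FilterChainDemo {
    // Parses the given XML through a two-filter chain and returns what the filters collected.
    static List<String> collect(String xml) {
        List<String> collected = new ArrayList<>();
        try {
            XMLReader parser = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            TagFilter tagOne = new TagFilter("tagOne", collected);
            tagOne.setParent(parser);
            TagFilter tagTwo = new TagFilter("tagTwo", collected);
            tagTwo.setParent(tagOne);
            // Parsing from the tail of the chain wires each filter to its parent.
            tagTwo.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return collected;
    }

    public static void main(String[] args) {
        String xml = "<root><tagOne id=\"a\"/><tagTwo id=\"b\"/><tagOne id=\"c\"/></root>";
        System.out.println(collect(xml)); // prints [tagOne:a, tagTwo:b, tagOne:c]
    }
}
```

Each filter sees every event, handles only its own tag, and shares the collection passed to its constructor, which keeps the per-tag logic out of one large if/else block.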
For testing purposes I'm trying to design a way to verify that the results of statistical tests are identical across versions, platforms and such. There are a lot of things going on that involve ints, nums, dates, Strings and more inside our collections of Objects.
In the end I want to 'know' that the whole set of instantiated objects sum to the same value (by just doing something like adding the checkSum of all internal properties).
I can write low level code for each internal value to return a checkSum but I was thinking that perhaps something like this already exists.
Thanks!
_swarmii
This sounds like you should be using the serialization library (install via Pub).
Here's a simple example to get you started:
    import 'dart:io';
    import 'package:serialization/serialization.dart';

    class Address {
      String street;
      int number;
    }

    main() {
      var address = new Address()
        ..number = 5
        ..street = 'Luumut';
      var serialization = new Serialization()
        ..addRuleFor(address);
      Map output = serialization.write(address, new SimpleJsonFormat());
      print(output);
    }
Then depending on what you want to do exactly, I'm sure you can fine tune the code for your purpose.