Field and values connection in Storm

I have a fundamental question about Storm. I understand some of the basics. For example, I have a main class containing this code:
...
TopologyBuilder builder = new TopologyBuilder();
builder.setSpout(SENTENCE_SPOUT_ID, new SentenceSpout());
builder.setBolt(SPLIT_BOLT_ID, new SplitSentenceBolt()).shuffleGrouping(SENTENCE_SPOUT_ID);
builder.setBolt(COUNT_BOLT_ID, new WordCountBolt(), 3).fieldsGrouping(SPLIT_BOLT_ID, new Fields("word"));
builder.setBolt(REPORT_BOLT_ID, new ReportBolt()).globalGrouping(COUNT_BOLT_ID);
...
I understand that the 1st argument (e.g. SENTENCE_SPOUT_ID) is the id of the bolt/spout, which is used to declare the connections between components. The 2nd argument (e.g. new SentenceSpout()) is the spout or bolt instance that we add to the topology. The 3rd argument is the parallelism hint, i.e. how many executors (and, by default, tasks) we want for that bolt or spout.
Then we use .fieldsGrouping, .shuffleGrouping, etc. to specify the type of grouping. Inside the parentheses, the 1st argument is the id of the bolt/spout that the input comes from, and the 2nd (e.g. new Fields("word")) determines the fields that we group by.
Inside the code of one of the bolts:
public class SplitSentenceBolt extends BaseRichBolt{
private OutputCollector collector;
public void prepare(Map config, TopologyContext context, OutputCollector collector) {
this.collector = collector;
}
public void execute(Tuple tuple) {
this.collector.emit(a, new Values(word, time, name));
}
public void declareOutputFields(OutputFieldsDeclarer declarer) {
declarer.declare(new Fields("word"));
}
}
In this.collector.emit(a, new Values(word, time, name)); a is the stream id and Values(...) holds the elements of the tuple.
In declarer.declare(new Fields("word")); word must be one of the previous values. Am I right about all of the above?
So my question is: must word in declarer.declare(new Fields("word")); be the same as word in this.collector.emit(a, new Values(word, time, name)); and the same as word in builder.setBolt(COUNT_BOLT_ID, new WordCountBolt(), 3).fieldsGrouping(SPLIT_BOLT_ID, new Fields("word"));?

The number and order of the fields you declare in declareOutputFields should match the fields you emit.
Two changes I'd recommend:
For now use the default stream by omitting the first parameter: collector.emit(new Values(word, time, name));
Make sure you declare the same number of fields: declarer.declare(new Fields("word", "time", "name"));
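To make the link between the three places more concrete, here is a plain-Java model (deliberately not Storm's own classes) of how declared field names are paired with emitted values by position, and how a fields grouping could pick a task from the value stored under the grouping field. TupleModel, byField and chooseTask are hypothetical names, and the hash-modulo routing is an assumption that only approximates what Storm does internally. The point it illustrates: the name passed to fieldsGrouping has to be one of the names declared by the upstream bolt, while the Java variable names passed to Values do not matter.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model, not Storm code: pairs declared field names with emitted values.
final class TupleModel {

    // Pair the i-th declared name with the i-th emitted value.
    static Map<String, Object> byField(List<String> declaredFields, List<Object> emittedValues) {
        if (declaredFields.size() != emittedValues.size()) {
            throw new IllegalArgumentException("declared " + declaredFields.size()
                    + " fields but emitted " + emittedValues.size() + " values");
        }
        Map<String, Object> tuple = new LinkedHashMap<>();
        for (int i = 0; i < declaredFields.size(); i++) {
            tuple.put(declaredFields.get(i), emittedValues.get(i));
        }
        return tuple;
    }

    // Assumed routing rule: hash the grouping field's value onto one of numTasks tasks.
    static int chooseTask(Map<String, Object> tuple, String groupingField, int numTasks) {
        if (!tuple.containsKey(groupingField)) {
            throw new IllegalArgumentException("field not declared upstream: " + groupingField);
        }
        return Math.floorMod(tuple.get(groupingField).hashCode(), numTasks);
    }

    public static void main(String[] args) {
        List<String> fields = Arrays.asList("word", "time", "name");
        Map<String, Object> t1 = byField(fields, Arrays.<Object>asList("storm", 1L, "a"));
        Map<String, Object> t2 = byField(fields, Arrays.<Object>asList("storm", 2L, "b"));
        // Tuples carrying the same "word" value are routed to the same task.
        System.out.println(chooseTask(t1, "word", 3) == chooseTask(t2, "word", 3));
    }
}
```

In this model, declaring only "word" while emitting three values fails fast, which mirrors the advice above to keep the number and order of declared fields in line with what is emitted.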

Related

Vaadin data Binder - ComboBox issues

Later edit: I noticed that returning one of the options from the ValueProvider's apply method makes the check mark appear, but the previous selection still appears to be shown too. That is, if the current and previous values are distinct, two check marks are shown.
I am having trouble with ComboBox binding. I cannot get com.vaadin.flow.data.binder.Binder to properly select an option inside the combobox, i.e. tick the check mark in the dropdown.
My binder is "generic": I am using it with a Map, and I provide dynamic getters/setters for various map keys. So consider a Binder<Map>, where one of the properties inside the Map holds a Person's id.
ComboBox<Person> combobox = new ComboBox<>("Person");
List<Person> options = fetchPersons();
combobox.setItems(options);
combobox.setItemLabelGenerator(new ItemLabelGenerator<Person>() {
@Override
public String apply(final Person p) {
return p.getName();
}
});
binder.bind(combobox, new ValueProvider<Map, Person>() {
@Override
public Person apply(final Map p) {
return new Person((Long)p.get("id"), (String)p.get("name"));
}
}, new Setter<Map, Person>() {
@Override
public void accept(final Map bean, final Person p) {
bean.put("name", p.getName());
}
});
I wonder what I could be doing wrong...
Later edit: Adding a screenshot for the Status ComboBox which has a String for caption and Integer for value.

Your problem is that you are creating a new instance in your binding, which does not work. You probably have some other bean (I will call it Bean here) in which Person is a property. So you want to use a Binder of type Bean to bind the ComboBox to that property, which is a Person, and then populate your form with the Bean by using e.g. binder.readBean(bean). By the way, using Java 8 syntax makes your code much less verbose.
Bean bean = fetchBean();
Binder<Bean> binder = new Binder<>();
ComboBox<Person> combobox = new ComboBox<>("Person");
List<Person> options = fetchPersons();
combobox.setItems(options);
combobox.setItemLabelGenerator(Person::getName);
binder.forField(combobox).bind(Bean::getPerson, Bean::setPerson);
binder.readBean(bean);
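One likely reason a freshly constructed Person is not shown as selected is that the new instance is not equal to any of the items passed to setItems. The sketch below uses only plain Java to show the effect, under the assumption that the component matches the bound value against its items via equals and hashCode. The Person class here is a hypothetical stand-in for yours, and SelectionDemo is just a name chosen for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in for the question's Person; equality is based on id only.
final class Person {
    private final Long id;
    private final String name;

    Person(Long id, String name) {
        this.id = id;
        this.name = name;
    }

    Long getId() { return id; }
    String getName() { return name; }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof Person)) return false;
        return Objects.equals(id, ((Person) other).id);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(id);
    }
}

final class SelectionDemo {
    public static void main(String[] args) {
        List<Person> options = Arrays.asList(new Person(1L, "Ann"), new Person(2L, "Bob"));
        // A new instance with the same id, like the one built in the ValueProvider.
        Person fromMap = new Person(2L, "Bob");
        // With id-based equals/hashCode, the new instance matches an existing option.
        System.out.println(options.contains(fromMap));
    }
}
```

Without the two overrides, contains would fall back to Object's identity comparison and report no match for the new instance, which is consistent with the missing check mark described in the question.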

How to fix object set in grid?

In my application I have a class like this:
public class Team {
private Country teamId;
private Set<Player> playerSet;
private Set<Player> substitutes;
private Set<Coach> coachSet;
}
When I instantiate a grid like this:
Grid<Team> grid = new Grid<>(Team.class);
and fill it with the result of allTeam() from the database, it shows the raw object representation for playerSet and coachSet.
What I want is to show just the players' names and the coaches' names, concatenated with ", " or "\n".
Any idea how I can do that? As a beginner, I find it complicated.

I see three options.
The first option is the one you already found yourself: concatenate their names in a single String. This can be done like this:
grid.addColumn(team -> {
Set<String> coachNames = new HashSet<>();
for (Coach coach : team.getCoaches()){
coachNames.add(coach.getName());
}
return String.join(", ", coachNames);
});
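Because a HashSet has no defined iteration order, the joined names may come out in a different order from one row or render to the next. Here is a minimal sketch of the same concatenation with sorted output, using only java.util.stream. The Coach class below is a hypothetical stand-in with just a name, and NameJoiner is a helper name made up for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical stand-in for the question's Coach entity.
final class Coach {
    private final String name;
    Coach(String name) { this.name = name; }
    String getName() { return name; }
}

final class NameJoiner {
    // Join coach names alphabetically so the cell text is stable across renders.
    static String joinNames(Set<Coach> coaches) {
        return coaches.stream()
                .map(Coach::getName)
                .sorted()
                .collect(Collectors.joining(", "));
    }

    public static void main(String[] args) {
        Set<Coach> coaches = new HashSet<>(Arrays.asList(new Coach("Zoe"), new Coach("Adam")));
        System.out.println(joinNames(coaches)); // prints "Adam, Zoe"
    }
}
```

In the grid, such a helper could be called from the column's value provider, for example grid.addColumn(team -> NameJoiner.joinNames(team.getCoaches())), assuming the getter is named as in the snippet above.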
The second one would be to make use of the Grid item Detail - you could show a coaches grid in the item details. Since you want to display both coaches and players, this option is probably not the best but I wanted to mention the possibility. (Placing two grids inside the item details is possible, but quite strange. Not optimal user experience.)
grid.setItemDetailsRenderer(new ComponentRenderer<>(team -> {
Grid<Coach> coachGrid = new Grid<>(Coach.class);
coachGrid.setItems(team.getCoaches());
return coachGrid;
}));
A third option would be to have the team grid on one side of the view and, on the other side, show details of the item selected in the team grid. You can have a separate Grid for the coaches, one for the players, and one for the substitutes. You could also implement this team detail layout as a separate view if you wish. The more complicated your Team object gets, with more sets, collections and other related properties, the more appealing this option becomes, as it is quite scalable and expandable.
grid.addSelectionListener(event -> {
if(event.getFirstSelectedItem().isPresent()){
buildTeamDetails(event.getFirstSelectedItem().get());
}
});
private void buildTeamDetails(Team team){
// build your team detail layouts here
}

You can configure which columns are shown in the grid by using grid.removeAllColumns() and then adding all columns you want to have in the grid with grid.addColumn(). Within addColumn() you can create a renderer that defines how the fields (coachName and playerSet) are displayed in the grid.
Let's have a class Team like
public class Team {
private String coachName;
private Set<Player> playerSet;
private Set<Object> objects;
//getters and setters
}
and a class Player like
public class Player {
private String firstName;
private String lastName;
// getters and setters
}
Now you want to only have coach and player names in the grid. So (in my example) for coachName we can just use the field's getter and we can create a comma separated String for the playerSet with java streams easily.
Configure the grid like:
grid.setItems(team);
grid.removeAllColumns();
grid.addColumn(new TextRenderer<>((ItemLabelGenerator<Team>) Team::getCoachName))
.setHeader("Coach");
grid.addColumn(new TextRenderer<>((ItemLabelGenerator<Team>) team1 -> team1.getPlayerSet().stream()
.map(player1 -> player1.getFirstName() + " " + player1.getLastName())
.collect(Collectors.joining(", "))))
.setHeader("Players")
.setFlexGrow(1);
Then the result looks like:

How do I make View's asList() sortable in Google Dataflow SDK?

We have a problem making the result of the asList() method sorted.
We thought we could do this by extending the View class and overriding the asList method, but View has a private constructor, so this is not possible.
Our other attempt was to fork the Google Dataflow code on GitHub and modify the PCollectionViews class to return a sorted list by using the Collections.sort method, as shown in the code snippet below:
@Override
protected List<T> fromElements(Iterable<WindowedValue<T>> contents) {
Iterable<T> itr = Iterables.transform(
contents,
new Function<WindowedValue<T>, T>() {
@SuppressWarnings("unchecked")
@Override
public T apply(WindowedValue<T> input){
return input.getValue();
}
});
LOG.info("#### About to start sorting the list !");
List<T> tempList = new ArrayList<T>();
for (T element : itr) {
tempList.add(element);
}
Collections.sort((List<? extends Comparable>) tempList);
LOG.info("##### List should now be sorted !");
return ImmutableList.copyOf(tempList);
}
Note that we are now sorting the list.
This seemed to work when run with the DirectPipelineRunner, but when we tried the BlockingDataflowPipelineRunner, the code change did not appear to be executed.
Note: we actually recompiled the Dataflow SDK and used it in our project, but this did not work.
How can we achieve this, i.e. get a sorted list from the asList method call?

The classes in PCollectionViews are not intended for extension. Only the primitive view types provided by View.asSingleton, View.asIterable, View.asMap, and View.asMultimap are supported.
To obtain a sorted list from a PCollectionView, you'll need to sort it after you have read it. The following code demonstrates the pattern.
// Assume you have some PCollection
PCollection<MyComparable> myPC = ...;
// Prepare it for side input as a list
final PCollectionView<List<MyComparable>> myView = myPC.apply(View.asList());
// Side input the list and sort it
someOtherValue.apply(
ParDo.withSideInputs(myView).of(
new DoFn<A, B>() {
@Override
public void processElement(ProcessContext ctx) {
List<MyComparable> tempList =
Lists.newArrayList(ctx.sideInput(myView));
Collections.sort(tempList);
// do whatever you want with sorted list
}
}));
Of course, you may not want to sort it repeatedly, depending on the cost of sorting vs the cost of materializing it as a new PCollection, so you can output this value and read it as a new side input without difficulty:
// Side input the list, sort it, and put it in a PCollection
PCollection<List<MyComparable>> sortedSingleton = Create.<Void>of(null).apply(
ParDo.withSideInputs(myView).of(
new DoFn<Void, List<MyComparable>>() {
@Override
public void processElement(ProcessContext ctx) {
List<MyComparable> tempList =
Lists.newArrayList(ctx.sideInput(myView));
Collections.sort(tempList);
ctx.output(tempList);
}
}));
// Prepare the sorted list for side input as a singleton
final PCollectionView<List<MyComparable>> sortedView =
sortedSingleton.apply(View.asSingleton());
someOtherValue.apply(
ParDo.withSideInputs(sortedView).of(
new DoFn<A, B>() {
@Override
public void processElement(ProcessContext ctx) {
... ctx.sideInput(sortedView) ...
// do whatever you want with sorted list
}
}));
You may also be interested in the unsupported sorter contrib module for doing larger sorts using both memory and local disk.

We tried to do it the way Ken Knowles suggested. There is a problem for large datasets. If tempList is large (so that sorting takes measurable time, since it is proportional to n * log n) and there are millions of elements in the "someOtherValue" PCollection, then we are unnecessarily re-sorting the same list millions of times. We should be able to sort ONCE and FIRST, before passing the list to the DoFn applied to someOtherValue.

SDN4 or neo4j-ogm performances issue

I wrote some simple Java code and ran into bad performance with SDN4 that I did not have with SDN3. I suspect that the depth parameter of the repository find methods does not work exactly the way it should. Let me explain the problem.
Here are my Java classes (it is just an example), from which I removed getters, setters, constructors, etc.
The first class is 'Element':
@NodeEntity
public class Element {
@GraphId
private Long id;
private int age;
private String uuid;
@Relationship(type = "HAS_VALUE", direction = Relationship.OUTGOING)
private Set<Value> values = new HashSet<Value>();
The second one is 'Attribute':
@NodeEntity
public class Attribute {
@GraphId
private Long id;
@Relationship(type = "HAS_PROPERTIES", direction = Relationship.OUTGOING)
private Set<HasInterProperties> properties;
The 'Value' class lets my users set a value on an Element for a specific Attribute:
@RelationshipEntity(type = "HAS_VALUE")
public class Value {
@GraphId
private Long id;
@StartNode
Element element;
@EndNode
Attribute attribute;
private Integer value;
private String uuid;
public Value() {
}
public Value(Element element, Attribute attribute, Integer value) {
this.element = element;
this.attribute = attribute;
this.value = value;
this.element.getValues().add(this);
this.uuid = UUID.randomUUID().toString();
}
The 'Element' class really needs to know its values, but the 'Attribute' class does not care about values at all.
An attribute has references to the InternationalizedProperties class, which looks like this:
@NodeEntity
public class InternationalizedProperties {
@GraphId
private Long id;
private String name;
The relationship entity between an Attribute and its InternationalizedProperties looks like this:
@RelationshipEntity(type = "HAS_PROPERTIES")
public class HasInterProperties {
@GraphId
private Long id;
@StartNode
private Attribute attribute;
@EndNode
private InternationalizedProperties properties;
private String locale;
I then created a small main method to create two attributes and 10000 elements. All my elements have a specific value for the first attribute but no values for the second one (there is no relation between them). Both attributes have two different InternationalizedProperties. Here is a sample:
public static void main(String[] args) {
ApplicationContext context = new ClassPathXmlApplicationContext("spring/*.xml");
Session session = context.getBean(Session.class);
session.query("START n=node(*) OPTIONAL MATCH n-[r]-() WHERE ID(n) <> 0 DELETE n,r", new HashMap<String, Object>());
ElementRepository elementRepository = context.getBean(ElementRepository.class);
AttributeRepository attributeRepository = context.getBean(AttributeRepository.class);
InternationalizedPropertiesRepository internationalizedPropertiesRepository = context.getBean(InternationalizedPropertiesRepository.class);
HasInterPropertiesRepository hasInterPropertiesRepository = context.getBean(HasInterPropertiesRepository.class);
//Creation of an attribute object with two internationalized properties
Attribute att = new Attribute();
attributeRepository.save(att);
InternationalizedProperties p1 = new InternationalizedProperties();
p1.setName("bonjour");
internationalizedPropertiesRepository.save(p1);
InternationalizedProperties p2 = new InternationalizedProperties();
p2.setName("hello");
internationalizedPropertiesRepository.save(p2);
hasInterPropertiesRepository.save(new HasInterProperties(att, p1, "fr"));
hasInterPropertiesRepository.save(new HasInterProperties(att, p2, "en"));
LOGGER.info("First attribut id is {}", att.getId());
//Creation of 10000 elements having a different value on the same attribute
for(int i = 0; i< 10000; i++) {
Element elt = new Element();
new Value(elt, att, i);
elementRepository.save(elt);
if(i%50 == 0) {
LOGGER.info("{} elements created. Last element created with id {}", i+1, elt.getId());
}
}
//Another attribut without any values from element.
Attribute att2 = new Attribute();
attributeRepository.save(att2);
InternationalizedProperties p12 = new InternationalizedProperties();
p12.setName("bonjour");
internationalizedPropertiesRepository.save(p12);
InternationalizedProperties p22 = new InternationalizedProperties();
p22.setName("hello");
internationalizedPropertiesRepository.save(p22);
hasInterPropertiesRepository.save(new HasInterProperties(att2, p12, "fr"));
hasInterPropertiesRepository.save(new HasInterProperties(att2, p22, "en"));
LOGGER.info("Second attribut id is {}", att2.getId());
Finally, in another main method, I fetch the first attribute and the second one several times:
private static void getFirstAttribute(AttributeRepository attributeRepository) {
StopWatch st = new StopWatch();
st.start();
Attribute attribute = attributeRepository.findOne(25283L, 1);
LOGGER.info("time to get attribute (some element have values on it) is {}ms", st.getTime());
}
private static void getSecondAttribute(AttributeRepository attributeRepository) {
StopWatch st = new StopWatch();
st.start();
Attribute attribute2 = attributeRepository.findOne(26286L, 1);
LOGGER.info("time to get attribute (no element have values on it) is {}ms", st.getTime());
}
public static void main(String[] args) {
ApplicationContext context = new ClassPathXmlApplicationContext("spring/*.xml");
AttributeRepository attributeRepository = context.getBean(AttributeRepository.class);
getFirstAttribute(attributeRepository);
getSecondAttribute(attributeRepository);
getFirstAttribute(attributeRepository);
getSecondAttribute(attributeRepository);
getFirstAttribute(attributeRepository);
getSecondAttribute(attributeRepository);
getFirstAttribute(attributeRepository);
getSecondAttribute(attributeRepository);
}
Here are the logs of this execution :
time to get attribute (some element have values on it) is 2983ms
time to get attribute (no element have values on it) is 4ms
time to get attribute (some element have values on it) is 1196ms
time to get attribute (no element have values on it) is 2ms
time to get attribute (some element have values on it) is 1192ms
time to get attribute (no element have values on it) is 3ms
time to get attribute (some element have values on it) is 1194ms
time to get attribute (no element have values on it) is 3ms
Getting the second attribute (and its internationalized properties, thanks to depth=1) is very quick, but getting the first one remains very slow. I know that there are many relationships (exactly 10000) pointing to the first attribute, but when I want to get an attribute with its internationalized properties, I clearly do not want to get all the values pointing to it (since no Set<Value> is declared on the Attribute class).
That is why I think there is a performance problem here. Or maybe I am doing something wrong?
Thanks for your help.

When loading data from the graph we don't currently analyse how your domain model is wired together, so we may potentially bring back related nodes that you do not require. These will then be discarded if they are not mappable in your domain, but if there are many of them, it could potentially impact response times.
There are two reasons for this approach.
It is obviously much simpler to create generic queries to any depth than it would be to dynamically analyse your domain model to any arbitrary depth and generate custom queries on the fly; it is also much simpler to analyse and prove the correctness of generic queries.
We want to preserve the capability to support polymorphic domain models in the future, where we don't necessarily know what's in the database from one day to the next, but we want to adapt our domain model hydration according to what we find.
In this case I would suggest writing a custom query to load the Attribute objects, to ensure you don't bring back all the unwanted relationships.

PrimeFaces DataTable: (Multi)selection does not work for dynamically built tables giving only nulls

I'm working with multiple row selection to let a user delete the selected records. According to the PDF documentation and the ShowCase Labs, I must use code like the following when translated to Java:
final DataTable dataTable = new DataTable();
...
// (1)
dataTable.setSelectionMode("multiple");
// (2)
dataTable.setValueExpression("selection", createValueExpression(DbeBean.class, "selection", Object[].class));
// (3)
dataTable.setValueExpression("rowKey", createValueExpression("#{" + VARIABLE + ".indexKey}", Object.class));
...
final ClientBehaviorHolder dataTableAsHolder = dataTable;
...
// (4)
dataTableAsHolder.addClientBehavior("rowSelect", createAjaxBehavior(createMethodExpression(metaData.controllerBeanType, "onRowSelect", void.class, new Class<?>[] {SelectEvent.class})));
(1) multiple - This line enables multiple selection; it works fine visually at the front end.
(2) selection - This is invoked: #{dbeBean.selection} is really bound, and only the setter public void setSelection(T[] selection) is invoked.
(3) rowKey - This is invoked and works fine: getIndexKey() is called and returns the expected result.
(4) rowSelect - This event handler is invoked too: DbeBean.onRowSelect(SelectEvent e).
I also use a lazy data model (I don't really believe it is the reason, but who knows? By the way, it returns List<T> although setSelection() requires T[]; why is that?):
public abstract class AbstractLazyDataSource<T extends IIndexable<K>, K> extends LazyDataModel<T> {
...
@Override
public final List<T> load(int first, int pageSize, String sortField, SortOrder sortOrder, Map<String, String> filters) {
...
final IResultContainer<T> resultContainer = getData(querySpecifier);
final List<T> data = resultContainer.getData();
setRowCount(resultContainer.getTotalEntitiesCount());
return getPage(data, first, pageSize);
}
...
@Override
public final K getRowKey(T object) {
return object.getIndexKey(); // T instanceof IIndexable<K>, have to return a unique ID
}
...
However, the handlers do not work as expected. Please help me understand why (2) DbeBean.setSelection(T[] selection) and (4) DbeBean.onRowSelect(SelectEvent e) receive only null: T[] selection = null and e.getObject() = null, respectively. What am I doing wrong?
Thanks in advance.
PrimeFaces 3.2
Mojarra 2.1.7

I've got it to work: I simply removed the rowKey property during the dynamic p:dataTable creation (DataTable) and overrode getRowData in the lazy data model. Now it works.
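A rough, framework-free sketch of what such a getRowData override might look like: it resolves the row key sent back by the client against the rows of the most recently loaded page. PageLookup and Row are hypothetical names, and the class deliberately does not extend PrimeFaces' LazyDataModel, so treat it as an illustration of the lookup only, not as the code actually used in the fix.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical row type exposing a unique key, like IIndexable in the question.
final class Row {
    private final long indexKey;
    private final String label;
    Row(long indexKey, String label) { this.indexKey = indexKey; this.label = label; }
    long getIndexKey() { return indexKey; }
    String getLabel() { return label; }
}

// Keeps the last loaded page and resolves row keys against it.
final class PageLookup {
    private List<Row> currentPage = new ArrayList<>();

    // Would be called at the end of load(...) with the page being returned.
    void rememberPage(List<Row> page) {
        currentPage = new ArrayList<>(page);
    }

    // Mirrors the idea of getRowData(String rowKey): return the matching row or null.
    Row getRowData(String rowKey) {
        for (Row row : currentPage) {
            if (String.valueOf(row.getIndexKey()).equals(rowKey)) {
                return row;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        PageLookup lookup = new PageLookup();
        lookup.rememberPage(Arrays.asList(new Row(7L, "seven"), new Row(8L, "eight")));
        System.out.println(lookup.getRowData("8").getLabel()); // prints "eight"
    }
}
```

The linear scan is fine for a single page of rows; if the key cannot be found on the current page, returning null makes the miss visible instead of handing back a wrong row.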
