How to transform to Entity Attribute Value (EAV) using Spoon Row Normaliser

I am trying to use Spoon (Pentaho Data Integration) to change data that is in typical row format to Entity Attribute Value format.
My source data is as follows:
My Normaliser is set up as follows:
And here are the results:
Why is the value for the CONDITION_START_DATE and CONDITION_STOP_DATE in the string_value column instead of the date_value column?
According to this documentation
Fieldname: Name of the fields to normalize
Type: Give a string to classify the field.
New field: You can give one or more fields where the new value should be transferred to.

Please check the "Normalizing multiple rows in a single step" section in http://wiki.pentaho.com/display/EAI/Row+Normaliser. According to this, you should have a group of fields with the same Type (pr_sl -> Product1, pr1_nr -> Product1); only in that case can you get multiple fields in the output (pr_sl -> Product Sales, pr1_nr -> Product Number).
In your case you can convert the dates to strings, use the Row Normaliser with a single new field, and then use a Formula step, for example:
And then convert date_value back to a date.
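A minimal sketch of that date-to-string round trip outside Spoon, assuming ISO-formatted dates (class, method, and format names are illustrative, not part of Pentaho):

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Sketch of the workaround: dates are converted to strings before the
// Row Normaliser (so every normalized value lands in string_value), and
// the EAV consumer parses date-typed attributes back afterwards.
public class EavDateRoundTrip {
    static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("yyyy-MM-dd");

    // Before normalizing: Date -> String
    static String toStringValue(LocalDate d) {
        return d.format(FMT);
    }

    // After normalizing: String -> Date for attributes typed as "date"
    static LocalDate toDateValue(String s) {
        return LocalDate.parse(s, FMT);
    }

    public static void main(String[] args) {
        LocalDate start = LocalDate.of(2020, 1, 15);
        String asString = toStringValue(start); // what string_value would hold
        LocalDate back = toDateValue(asString); // what date_value should hold
        System.out.println(asString + " -> " + back);
    }
}
```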

Related

Sorting rows in a Grid column of String type based on corresponding integer values

I have the following structure for a Grid and want to know how to sort a column based on the integer values of the strings. The data provider cannot easily be changed, so I have to sort with some kind of intermediary step:
Grid<String[]> grid = new Grid<>();
...
grid.addColumn(str -> str[columnIndex]).setHeader("sample").setKey("integercolumn").setSortable(true);
...
GridSortOrder<String> order = new GridSortOrder<>(grid.getColumnByKey("integercolumn"), SortDirection.DESCENDING);
grid.sort(Arrays.asList(order));
This sorts two-digit numbers properly, but not one-digit or three-or-more-digit numbers.
You can define a custom comparator on the column that is used for sorting its values. In your case the comparator needs to extract the column-specific value from the array, and then convert it to an int:
grid.addColumn(row -> row[columnIndex])
.setHeader("sample")
.setKey("integercolumn")
.setSortable(true)
.setComparator(row -> Integer.parseInt(row[columnIndex]));
See https://vaadin.com/docs/latest/components/grid/#specifying-the-sort-property.
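The underlying issue is that plain string ordering is lexicographic, so "9" sorts after "10". A minimal plain-Java sketch of the numeric comparison the comparator above performs (class and method names are illustrative):

```java
import java.util.Comparator;
import java.util.List;

// Demonstrates why parsing to int fixes the sort: strings compare
// character by character, while Integer.parseInt restores numeric order.
public class NumericSortSketch {
    static List<String> sortNumeric(List<String> values) {
        return values.stream()
                .sorted(Comparator.comparingInt(Integer::parseInt))
                .toList();
    }

    public static void main(String[] args) {
        // Lexicographic order would be [10, 100, 9]; numeric order is correct:
        System.out.println(sortNumeric(List.of("9", "100", "10"))); // [9, 10, 100]
    }
}
```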

Deleting columns from influx DB using flux command line

Is there any way to delete columns of an Influx time series? We accidentally injected data using the wrong data type (int instead of float).
Alternatively, is there a way to change the data type instead?
Unfortunately, there is no way to delete a "column" (i.e. a tag or a field) from an Influx measurement so far. Here's the feature request for that but there is no ETA yet.
Three workarounds:
use SELECT INTO to copy the desirable data into a different measurement, excluding the undesirable "columns". e.g.:
SELECT desirableTag1, desirableTag2, desirableField1, desirableField2 INTO new_measurement FROM measurement
use CAST operations to "change the data type" from float to int. e.g.:
SELECT desirableTag1, desirableTag2, desirableField1, desirableField2, undesiredableTag3::integer, undesiredableField3::integer INTO new_measurement FROM measurement
"Update" the data with insert statement, which will overwrite the data with the same timestamp, same tags, same field keys. Keep all other things equal, except that the "columns" that you would like to update. To make the data in integer data type, remember to put a trailing i on the number. Example: 42i. e.g.:
insert measurement,desirableTag1=v1 desirableField1=fv1,desirableField2=fv2,undesirableField1=42i 1505799797664800000
insert measurement,desirableTag1=v21 desirableField1=fv21,desirableField2=fv22,undesirableField1=43i 1505799797664800000
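The type-suffix rules used in the inserts above can be sketched as a tiny value formatter. This is an illustrative helper, not part of any Influx client library:

```java
// Sketch: formatting a single line protocol field value. Integers take a
// trailing 'i', strings are double-quoted, floats and booleans are written
// as-is. Purely illustrative of the wire format.
public class LineProtocolValue {
    static String format(Object v) {
        if (v instanceof Integer || v instanceof Long) {
            return v + "i";               // integer field: trailing i
        }
        if (v instanceof String s) {
            return "\"" + s + "\"";       // string field: quoted
        }
        return String.valueOf(v);          // float or boolean: bare
    }

    public static void main(String[] args) {
        System.out.println(format(42));      // 42i
        System.out.println(format("Kelvin")); // "Kelvin"
        System.out.println(format(1.5));      // 1.5
    }
}
```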

One measurement - three datatypes

I have a Line Protocol like this:
Measurement1,Valuetype=Act_value,metric=Max,dt=Int value=200i 1553537228984000000
Measurement1,Valuetype=Act_value,metric=Lower_bound,dt=Int value=25i 1553537228987000000
Measurement1,Valuetype=Act_value,metric=Min,dt=Int value=10i 1553537228994000000
Measurement1,Valuetype=Act_value,metric=Upper_limit,dt=Int value=222i 1553537228997000000
Measurement1,Valuetype=Act_value,metric=Lower_limit,dt=Int value=0i 1553537229004000000
Measurement1,Valuetype=Act_value,metric=Simulation,dt=bool value=False 1553537229007000000
Measurement1,Valuetype=Act_value,metric=Value,dt=Int value=69i 1553537229014000000
Measurement1,Valuetype=Act_value,metric=Percentage,dt=Int value=31i 1553537229017000000
Measurement1,Valuetype=Set_value,metric=Upper_limit,dt=Int value=222i 1553537229024000000
Measurement1,Valuetype=Set_value,metric=Lower_limit,dt=Int value=0i 1553537229028000000
Measurement1,Valuetype=Set_value,metric=Unit,dt=string value="Kelvin" 1553537229035000000
Measurement1,Valuetype=Set_value,metric=Value,dt=Int value=222i 1553537229038000000
Measurement1,Valuetype=Set_value,metric=Percentage,dt=Int value=0i 1553537229045000000
I need to insert multiple lines at once. The issue is likely that I insert integers, booleans and strings into the same measurement. It worked when I created separate measurements, e.g. Measurement1_Int, Measurement1_bool, Measurement1_string. In the above configuration I get an error.
I have the following questions:
1. Is there any way to save values of different data types to one table/measurement?
2. If yes, how do I need to adjust my line protocol?
3. Would it work if I wrote the three data types separately but still into the same table?
If you can afford to assign the same timestamp to all metrics within a measurement data point, the best variant would be to use the metric name as the field name in the InfluxDB record:
Measurement1,Valuetype=Act_value Max=200i,Lower_bound=25i,Min=10i,Upper_limit=222i,Lower_limit=0i,Simulation=False,Value=69i,Percentage=31i 1553537228984000000
Otherwise you can still use the metric name as the field name, but the fields missing at each timestamp will have null values:
Measurement1,Valuetype=Set_value Upper_limit=222i 1553537229024000000
Measurement1,Valuetype=Set_value Lower_limit=0i 1553537229028000000
Measurement1,Valuetype=Set_value Unit="Kelvin" 1553537229035000000
Measurement1,Valuetype=Set_value Value=222i 1553537229038000000
Measurement1,Valuetype=Set_value Percentage=0i 1553537229045000000
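The first variant above collapses several per-metric rows into one record with one field per metric. That assembly can be sketched as plain string building (illustrative only, not a client library):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Sketch: building one line protocol record from metrics that share a
// timestamp, with each metric name used as a field key. Values are
// assumed to be pre-formatted (e.g. "200i" for integers).
public class CombinedLine {
    static String build(String measurement, String tagSet,
                        Map<String, String> fields, long timestamp) {
        String fieldSet = fields.entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining(","));
        return measurement + "," + tagSet + " " + fieldSet + " " + timestamp;
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("Max", "200i");
        fields.put("Min", "10i");
        System.out.println(
            build("Measurement1", "Valuetype=Act_value", fields, 1553537228984000000L));
    }
}
```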

Get Rapidminer to transpose/pivot a single attribute/column in a table

I have a table that looks like the following:
ID City Code
"1005AE" "Oakland" "Value1"
"1006BR" "St.Louis" "Value2"
"102AC" "Miami" "Value1"
"103AE" "Denver" "Value3"
And I want to transpose/pivot the Code examples/values into column attributes like this:
ID City Value1 Value2 Value3
"1005" "Oakland" 1 0 0
"1006" "St.Louis" 0 1 0
"1012" "Miami" 1 0 0
"1030" "Denver" 0 0 1
Note that the ID field contains numeric values encoded as strings because RapidMiner had trouble importing bigint data types. That is a separate issue I need to fix, but my focus here is pivoting or transposing the data.
I read through a few different Stack Overflow posts, listed below. They suggested the Pivot or Transpose operators. I tried both, but I am getting either a huge table that creates City as a dummy variable as well, or just some subset of the attribute columns.
How can I set the rows to be the attributes and columns the samples in rapidminer?
Rapidminer data transpose equivalent to melt in R
Any suggestions would be appreciated.
In pivoting, the group attribute parameter dictates how many rows there will be and the index attribute parameter dictates what the last part of the name of new attributes will be. The first part of the name of each new attribute is driven by any other regular attributes that are neither group nor index and the value within the cell is the value found in the original example set.
This means you have to:
1. Create a new attribute with a constant value of 1; use Generate Attributes for this.
2. Set the role of the ID attribute to ID so that it is no longer a regular attribute; use Set Role for this.
3. In the Pivot operator, set the group attribute to City and the index attribute to Code.
4. Set missing values to 0; use Replace Missing Values for this.
5. Rename the attributes to match what you want; use Rename for this.
After step 3 the result is already close to what you want; steps 4 and 5 finish it off.
You will have to join the result back to the original since the pivot operation loses the ID.
You can find a worked example here http://rapidminernotes.blogspot.co.uk/2011/05/worked-example-using-pivot-operator.html
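The indicator columns the Pivot step produces can be sketched outside RapidMiner: each distinct Code value becomes a column holding 1 where the row had that code and 0 otherwise (class and method names below are illustrative):

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch of the pivot/one-hot logic: for one input row's Code value,
// produce the indicator columns over all known Code values.
public class PivotSketch {
    static Map<String, Integer> indicators(String code, List<String> allCodes) {
        Map<String, Integer> row = new LinkedHashMap<>();
        for (String c : allCodes) {
            row.put(c, c.equals(code) ? 1 : 0); // 1 only for the matching code
        }
        return row;
    }

    public static void main(String[] args) {
        // "Oakland" has Code=Value1, so Value1=1 and the rest are 0:
        System.out.println(indicators("Value1", List.of("Value1", "Value2", "Value3")));
    }
}
```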

Which column type do I need for the input "06/09/2011 05:25" on the model?

I'm using this datetimepicker, http://trentrichardson.com/examples/timepicker/, and I get this datetime input: 06/09/2011 05:25. Now I want to save the input in the database so I can use it later. Which column type do I need to set up in the database?
I'm using this code for getting the input from users:
f.text_field :status_date
I tried the types date and datetime but both of them did not work (the column was nil when submitting).
thanks,
Gal
