Sorting rows in a Grid column of String type based on corresponding integer values - vaadin

I have the following structure for a Grid and want to know how to sort a column based on the integer values of its Strings. The data provider cannot easily be changed, so I have to sort with some kind of intermediate step:
Grid<String[]> grid = new Grid<>();
...
grid.addColumn(str -> str[columnIndex]).setHeader("sample").setKey("integercolumn").setSortable(true);
...
GridSortOrder<String[]> order = new GridSortOrder<>(grid.getColumnByKey("integercolumn"), SortDirection.DESCENDING);
grid.sort(Arrays.asList(order));
This sorts two-digit numbers properly, but not one-digit or 3+-digit numbers (the values are being compared as strings).

You can define a custom comparator on the column that is used for sorting its values. In your case the comparator needs to extract the column-specific value from the array, and then convert it to an int:
grid.addColumn(row -> row[columnIndex])
    .setHeader("sample")
    .setKey("integercolumn")
    .setSortable(true)
    .setComparator(row -> Integer.parseInt(row[columnIndex]));
See https://vaadin.com/docs/latest/components/grid/#specifying-the-sort-property.
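If some cells can be blank or non-numeric, Integer.parseInt will throw while sorting. A minimal defensive variant of the same key extractor (the fallback value here is an assumption; adjust it to your data):
grid.addColumn(row -> row[columnIndex])
    .setHeader("sample")
    .setKey("integercolumn")
    .setSortable(true)
    .setComparator(row -> {
        String cell = row[columnIndex];
        try {
            // Assumption: missing or unparsable cells sort before all real numbers.
            return (cell == null || cell.trim().isEmpty())
                    ? Integer.MIN_VALUE
                    : Integer.parseInt(cell.trim());
        } catch (NumberFormatException e) {
            return Integer.MIN_VALUE;
        }
    });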

Related

Writing variable-length sequence to a compound array

I am using compound datatypes with h5py, with some elements being variable-length arrays. I can't find a way to set an item. The following MWE shows six different ways to do that (sequential indexing, which would not work in h5py anyway; fused indexing; read-modify-commit for columns/rows), none of which works.
What is the correct way? And why does h5py say Cannot change data-type for object array when writing an integer list to an int32 list?
import h5py
import numpy as np

with h5py.File('/tmp/test-vla.h5', 'w') as h5:
    dt = np.dtype([('a', h5py.vlen_dtype(np.dtype('int32')))])
    dset = h5.create_dataset('test', (5,), dtype=dt)
    dset['a'][2] = [1,2,3]  # does not write the value back
    dset[2]['a'] = [1,2,3]  # does not write the value back
    dset['a',2] = [1,2,3]   # Cannot change data-type for object array
    dset[2,'a'] = [1,2,3]   # Cannot change data-type for object array
    tmp = dset['a']; tmp[2] = [1,2,3]; dset['a'] = tmp  # Cannot change data-type for object array
    tmp = dset[2]; tmp['a'] = [1,2,3]; dset[2] = tmp    # 'list' object has no attribute 'dtype'
When working with compound datasets, I've discovered it's best to add all row data in a single statement. I tweaked your code to show how to add 3 rows of data (each of a different length). Note how I: 1) define the row of data with a tuple; 2) define the list of integers with np.array(); and 3) don't reference the field name ['a'].
with h5py.File('test-vla.h5', 'w') as h5:
    dt = np.dtype([('a', h5py.vlen_dtype(np.dtype('int32')))])
    dset = h5.create_dataset('test', (5,), dtype=dt)
    print(dset.dtype, dset.shape)
    dset[0] = ( np.array([0,1,2]), )
    dset[1] = ( np.array([1,2,3,4]), )
    dset[2] = ( np.array([0,1,2,3,4]), )
For more info, take a look at this post on the HDF Group Forum under HDF5 Ancillary Tools / h5py:
Compound datatype with int, float and array of floats
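To confirm the variable-length values round-trip, the rows can be read back field by field; a small sketch along the same lines as the snippet above (same file and field names):
import h5py

with h5py.File('test-vla.h5', 'r') as h5:
    dset = h5['test']
    for i in range(3):
        # Each element of field 'a' is an int32 array with its own length.
        print(i, dset[i]['a'])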

How do I remove rows based on a comma-separated list of values in a Power BI parameter in Power Query?

I have a list of data with a title column (among many other columns) and I have a Power BI parameter that has, for example, a value of "a,b,c". What I want to do is loop through the parameter's values and remove any rows that begin with those characters.
For example:
Title
a
b
c
d
Should become
Title
d
This comma-separated list could have one value or it could have twenty. I know that I can turn the parameter into a list by using
parameterList = Text.Split(<parameter-name>,",")
but then I am unsure how to continue to use that to filter on. For one value I would just use
#"Filtered Rows" = Table.SelectRows(#"Table", each Text.StartsWith([key], <value-to-filter-on>))
but that only allows one value.
EDIT: I may have worded my original question poorly. The comma-separated values in the parameterList can be any number of characters (e.g.: a,abcd,foo,bar) and I want to see if the value in [key] starts with that string of characters.
Try using List.Contains to check whether the starting character is in the parameter list.
each List.Contains(parameterList, Text.Start([key], 1))
Edit: Since you've changed the requirement, try this:
Table.SelectRows(
    #"Table",
    (C) => not List.AnyTrue(
        List.Transform(
            parameterList,
            each Text.StartsWith(C[key], _)
        )
    )
)
For each row, this transforms the parameterList into a list of true/false values by checking if the current key starts with each text string in the list. If any are true, then List.AnyTrue returns true and we choose not to select that row.
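Putting this together with the Text.Split from the question, a sketch of the whole query could look like the following (the parameter name ParameterTitles, the previous step #"Changed Type", and the column [Title] are assumptions; substitute your own):
let
    // Assumption: ParameterTitles is the text parameter, e.g. "a,abcd,foo,bar".
    parameterList = Text.Split(ParameterTitles, ","),
    #"Filtered Rows" = Table.SelectRows(
        #"Changed Type",
        (C) => not List.AnyTrue(
            List.Transform(parameterList, each Text.StartsWith(C[Title], _))
        )
    )
in
    #"Filtered Rows"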
Since you want to filter out all the values from the parameter, you can use something like:
= Table.SelectRows(#"Changed Type", each List.Contains(Parameter1,Text.Start([Title],1))=false)
Another way to do this would be to create a custom column in the table, which has the first character of title:
= Table.AddColumn(#"Changed Type", "FirstChar", each Text.Start([Title],1))
and then use this field in the filter step:
= Table.SelectRows(#"Added Custom", each List.Contains(Parameter1,[FirstChar])=false)
I tested this with a small sample set and it seems to be running fine. You can test both and see if it helps with the performance. If you are still facing performance issues, it would probably be easier if you could share the pbix file.
This seems to work fairly well:
= List.Select(Source[Title], each Text.Contains(Parameter1,Text.Start(_,1))=false)
Replace Source with the name of your table and Parameter1 with the name of your Parameter.

Active Record array query - to check records that are present in an array

I have an Objective model which has an attribute called labels whose value is of the array data type. I need to query all the Objectives whose labels attribute has values that are present in some particular array.
For Example:
I have an array
a = ["textile", "blazer"]
the Objective.labels may have values such as ["textile", "ramen"]
I need to return all objectives that might have either "textile" or "blazer" as one of their labels array values
I tried the following:
Objective.where("labels #> ARRAY[?]::varchar[]", ["textile"])
This returns some records. Now when I try
Objective.where("labels #> ARRAY[?]::varchar[]", ["textile", "Blazer"])
I expect it to return all Objectives that contain at least one of the labels array values, textile or blazer.
However, it returns an empty array. Any solutions?
Try the && overlap operator.
overlap (have elements in common)
Objective.where("labels && ARRAY[?]::varchar[]", ["textile", "Blazer"])
If you have many rows, a GIN index can speed it up.
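If you do add one, a rough sketch of the migration (the table name and Rails version are assumptions based on the model above):
# Assumes a Postgres array column `labels` on the `objectives` table.
class AddGinIndexToObjectivesLabels < ActiveRecord::Migration[7.0]
  def change
    add_index :objectives, :labels, using: :gin
  end
end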

Excluding one column from forEach

I'm using the following expression to return an md5 hash of a concatenation of all values in a row.
md5(forEach(row.columnNames,cn,if(isNull(cells[cn]),"",cells[cn].value)).join("|"))
This is to create an easy index for identifying duplicates (I do not wish to remove them at this stage). However, I've just realised that because one of the columns contains the unique index for the data set, I cannot hash every column as the inclusion of this column will obviously make every hash unique! (duh)
Is there a way to exclude a nominated column from the forEach loop? A sort of forEach except this...
Thanks
Assuming the column you want to exclude is the first one, you can subset row.columnNames like this:
md5(forEach(row.columnNames.slice(1),cn,if(isNull(cells[cn]),"",cells[cn].value)).join("|"))
If you prefer to exclude a column by its name (for example, "ID"), you should use filter():
md5(forEach(filter(row.columnNames, v, v!="ID"),cn,if(isNull(cells[cn]),"",cells[cn].value)).join("|"))
Similarly, you can also use filter() to include/exclude column names based on conditions (here: exclude columns that contain a capital "C" in their name):
filter(row.columnNames, v, v.contains("C")==false)
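The same idea extends to excluding several columns at once with and(); a sketch assuming two hypothetical column names, "ID" and "Updated":
md5(forEach(filter(row.columnNames, v, and(v != "ID", v != "Updated")), cn, if(isNull(cells[cn]), "", cells[cn].value)).join("|"))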

How to transform to Entity Attribute Value (EAV) using Spoon Normalise

I am trying to use Spoon (Pentaho Data Integration) to change data that is in typical row format to Entity Attribute Value format.
My source data is as follows:
My Normaliser is set up as follows:
And here are the results:
Why is the value for the CONDITION_START_DATE and CONDITION_STOP_DATE in the string_value column instead of the date_value column?
According to this documentation
Fieldname: Name of the fields to normalize
Type: Give a string to classify the field.
New field: You can give one or more fields where the new value should be transferred to.
Please check the Normalizing multiple rows in a single step section in http://wiki.pentaho.com/display/EAI/Row+Normaliser. According to this, you should have a group of fields with the same Type (pr_sl -> Product1, pr1_nr -> Product1); only in this case can you get multiple fields in the output (pr_sl -> Product Sales, pr1_nr -> Product Number).
In your case you can convert the dates to strings, use the Row Normaliser with a single new field and a formula step, and then convert date_value back to a date afterwards.
