How to split a string column in two in Flux (InfluxDB) - influxdb

I have a column of #datatype string which is called names and contains the following info for each row:
ABV,BVA
BAC,DWA
ZZA,DSW
...
My question is how can I split (by the comma ,) this column into two columns with names (names_1 and names_2), such that I will get something like this:
names_1 names_2
ABV BVA
BAC DWA
ZZA DSW
...
I tried strings.split() but it only works on a single string. So maybe I need a way to apply this code to a whole column:
import "strings"
data
|> map (fn:(r) => strings.split(v: r.names, t: ","))

I think you might be looking for something like this:
import "experimental/array"
import "strings"
array.from(rows: [{value: "A,B"}])
|> map(fn: (r) => {
parts = strings.split(v: r.value, t: ",")
return {first: parts[0], second: parts[1]}
})
The array.from() call can be replaced with from() to read data from InfluxDB. The map function expects a record to be returned. If some rows might not have two values after the split, you can also do:
import "experimental/array"
import "strings"
array.from(rows: [{value: "A,B"}])
|> map(fn: (r) => {
parts = strings.split(v: r.value, t: ",")
return if length(arr: parts) > 1 then
{first: parts[0], second: parts[1]}
else {first: parts[0], second: ""}
})
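For comparison, the same "take the first two parts, fall back to an empty string" logic can be sketched outside Flux. This is only an illustration in Rust (not Flux, and not something InfluxDB runs); split_pair is a made-up helper name:

```rust
/// Splits `value` on commas and returns the first two parts.
/// A missing second part becomes "", mirroring the Flux fallback above.
/// Any parts after the second are ignored.
fn split_pair(value: &str) -> (&str, &str) {
    let mut parts = value.split(',');
    // `split` always yields at least one item, even for an empty string.
    let first = parts.next().unwrap_or("");
    let second = parts.next().unwrap_or("");
    (first, second)
}
```

Applied to every row, this gives one (names_1, names_2) pair per input string, which is what map() does per record in Flux.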

Your original attempt was almost fine.
This way it should split the column:
import "strings"
data
|> map (fn:(r) => ({
r with
names_1: strings.split(v: r.names, t: ",")[0],
names_2: strings.split(v: r.names, t: ",")[1]
}))

Related

Parsec sepBy Haskell

I wrote a function and it compiles, but I'm not sure if it works the way I intend it to or how to call it in the terminal. Essentially, I want to take a string, like ("age",5),("age",6) and make it into a list of tuples [("age1",5)...]. I am trying to write a function to separate the items at the commas, and either I am just not sure how to call it in the terminal or I did it wrong.
items :: Parser (String,Integer) -> Parser [(String,Integer)]
items p = do { p <- sepBy strToTup (char ",");
return p }
I'm not sure what you want and I don't know what Parser is.
Starting from such a string:
thestring = "(\"age\",5),(\"age\",6),(\"age\",7)"
I would firstly remove the outer commas with a regular expression method:
import Text.Regex
rgx = mkRegex "\\),\\("
thestring' = subRegex rgx thestring ")("
This gives:
>>> thestring'
"(\"age\",5)(\"age\",6)(\"age\",7)"
Then I would split:
import Data.List.Split
thelist = split (startsWith "(") thestring'
which gives:
>>> thelist
["(\"age\",5)","(\"age\",6)","(\"age\",7)"]
This is what you want, if I correctly understood.
That's probably not the best way. Since all the elements of the final list have form ("age", X) you could extract all numbers (I don't know but it should not be difficult) and then it would be easy to get the final list. Maybe better.
Apologies if this has nothing to do with your question.
Edit
JFF ("just for fun"), another way:
import Data.Char (isDigit)
import Data.List.Split
thestring = "(\"age\",15),(\"age\",6),(\"age\",7)"
ages = (split . dropBlanks . dropDelims . whenElt) (not . isDigit) thestring
map (\age -> "(age," ++ age ++ ")") ages
-- result: ["(age,15)","(age,6)","(age,7)"]
Or rather:
>>> map (\age -> ("age",age)) ages
[("age","15"),("age","6"),("age","7")]
Or if you want integers:
>>> map (\age -> ("age", read age :: Int)) ages
[("age",15),("age",6),("age",7)]
Or if you want age1, age2, ...:
import Data.List.Index
imap (\i age -> ("age" ++ show (i+1), read age :: Int)) ages
-- result: [("age1",15),("age2",6),("age3",7)]
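None of the above uses a parser combinator, so here is a rough sketch of what sepBy does conceptually: parse one item, then keep parsing items for as long as a separator follows. It is hand-written in Rust rather than Parsec, purely to show the control flow. item and items are hypothetical names, and unlike Parsec's sepBy this version also insists that the whole input is consumed.

```rust
/// Parses one item of the form ("name",123).
/// Returns the parsed pair together with the unparsed remainder.
fn item(input: &str) -> Option<((String, i64), &str)> {
    let rest = input.strip_prefix("(\"")?;
    let (name, rest) = rest.split_once("\",")?;
    let (digits, rest) = rest.split_once(')')?;
    Some(((name.to_string(), digits.parse().ok()?), rest))
}

/// Parses zero or more items separated by commas.
/// Returns None on malformed input or trailing garbage.
fn items(mut input: &str) -> Option<Vec<(String, i64)>> {
    let mut out = Vec::new();
    if input.is_empty() {
        return Some(out); // zero items is allowed, as with sepBy
    }
    loop {
        let (pair, rest) = item(input)?;
        out.push(pair);
        match rest.strip_prefix(',') {
            // A separator was found, so another item must follow.
            Some(next) => input = next,
            // No separator: succeed only if nothing is left over.
            None => return if rest.is_empty() { Some(out) } else { None },
        }
    }
}
```

Renaming the results to age1, age2, ... is then a separate step over the returned list, as in the imap example above.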

How to get the output for several sequential nom parsers when the input is a &str?

This question is almost identical to Capture the entire contiguous matched input with nom, but I have to parse UTF-8 text as input (&str) not just bytes (&[u8]). I am trying to get the whole match for several parsers:
named!(parse <&str, &str>,
recognize!(
chain!(
is_not_s!(".") ~
tag_s!(".") ~
is_not_s!( "./ \r\n\t" ),
|| {}
)
)
);
And it causes this error:
no method named "offset" found for type "&str" in the current scope
Is the only way to do this to switch to &[u8] as input and then do map_res!?
There's an Offset trait implementation for &str that will be available in the next version of nom. There is no planned release date yet for nom 2.0, so in the meantime, you can copy the implementation in your code:
use nom::Offset;
impl Offset for str {
fn offset(&self, second: &Self) -> usize {
let fst = self.as_ptr();
let snd = second.as_ptr();
snd as usize - fst as usize
}
}
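To see why pointer subtraction is enough here, this is a small standalone sketch of the same idea without nom. The free functions offset and consumed are illustrative names, not part of nom's API, and the sketch assumes the second slice really is a subslice of the first (otherwise the result is meaningless and the subtraction may overflow).

```rust
/// Byte offset of `inner` from the start of `outer`.
/// Assumption: `inner` points into the same string as `outer`.
fn offset(outer: &str, inner: &str) -> usize {
    inner.as_ptr() as usize - outer.as_ptr() as usize
}

/// The part of `input` that has been consumed, given the unparsed `rest`.
/// This is the kind of slice a "recognize"-style combinator hands back.
fn consumed<'a>(input: &'a str, rest: &'a str) -> &'a str {
    &input[..offset(input, rest)]
}
```

The offset is counted in bytes, not characters, which is exactly what slicing a &str needs, so multi-byte UTF-8 input works as long as rest starts on a character boundary.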

Map over values of one column

I want to map over the values of the Title column of my dataframe.
The solution I came up with is the following:
df.Columns.[ [ "Title"; "Amount" ] ]
|> Frame.mapCols(fun k s ->
if k = "Title"
then s |> Series.mapValues (string >> someModif >> box)
else s.Observations |> Series)
Since s is of type ObjectSeries<_> I have to cast it to string, modify it then box it back.
Is there a recommended way to map over the values of a single column?
Another option would be to add a TitleMapped column with:
df?TitleMapped <- df?Title |> Series.mapValues (...your mapping fn...)
...and then throw away the Title column with df |> Frame.dropCol "Title" (or not bother if you don't care whether it stays or not).
Or, if you don't like the "imperativeness" of <-, you can do something like:
df?Title
|> Series.mapValues (...your mapping fn...)
|> fun x -> Frame( ["Title"], [x] )
|> Frame.join JoinKind.Left (df |> Frame.dropCol "Title")
You can use GetColumn:
df.GetColumn<string>("Title")
|> Series.mapValues(someModif)
Or in more F#-style:
df
|> Frame.getCol "Title"
|> Series.mapValues(string >> someModif)
In some cases, you may want to map over values of a specific column and keep that mapped column in the frame. Supposing we have a frame called someFrame with 2 columns (Col1 and Col2) and we want to transform Col1 (for example, Col1 + Col2), what I usually do is:
someFrame
|> Frame.replaceCol "Col1"
(Frame.mapRowValues (fun row ->
row.GetAs<float>("Col1") + row.GetAs<float>("Col2"))
someFrame)
If you want to create a new column instead of replacing it, all you have to do is to change the "replaceCol" method for "addCol" and choose a new name for the column instead of "Col1" of the given example. I don't know if this is the most efficient way, but it worked for me so far.

Fields with common names in different records

I have some records with similar fields, like this:
-define(COMMON_FIELDS, common1, common2, common3).
-record(item1, a, b, c, ?COMMON_FIELDS).
-record(item2, x, y, z, ?COMMON_FIELDS).
But later I need to write similar code for every record:
Record#item1.common1,
Record#item1.common2,
Record#item1.common3
and:
Record#item2.common1,
Record#item2.common2,
Record#item2.common3
Is there a way to write one function to access the same fields in different records?
1) Pattern matching in multiple function clauses:
-module(x1).
-export([read/1]).
-define(COMMON_FIELDS, common1, common2, common3).
-record(item1, {x, ?COMMON_FIELDS}). %Note that you defined your records incorrectly.
-record(item2, {y, ?COMMON_FIELDS}).
read(#item1{common1=C1, common2=C2, common3=C3} = _Item) ->
io:format("~p, ~p, ~p~n", [C1, C2, C3]);
read(#item2{common1=C1, common2=C2, common3=C3} = _Item) ->
io:format("~p, ~p, ~p~n", [C1, C2, C3]).
...
25> c(x1).
{ok,x1}
26> rr(x1).
[item1,item2]
27> A = #item1{x=10, common1="hello", common2="world", common3="goodbye"}.
#item1{x = 10,common1 = "hello",common2 = "world",
common3 = "goodbye"}
28> B = #item2{y=20, common1="goodbye", common2="mars", common3="hello"}.
#item2{y = 20,common1 = "goodbye",common2 = "mars",
common3 = "hello"}
29> x1:read(A).
"hello", "world", "goodbye"
ok
30> x1:read(B).
"goodbye", "mars", "hello"
ok
Note the export statement--it's a list of length 1, i.e. the module exports one function. The output shows that the read() function can read records of either type.
2) A case statement:
If for some reason, by stating one function you mean one function clause, you can do this:
read(Item) ->
case Item of
#item1{common1=C1, common2=C2, common3=C3} -> true;
#item2{common1=C1, common2=C2, common3=C3} -> true
end,
io:format("~p, ~p, ~p~n", [C1, C2, C3]).
You can use the exprecs parse transform from parse_trans.
-module(parse).
-compile({parse_transform, exprecs}).
-record(item1, {x, common1, common2}).
-record(item2, {y, common1, common2}).
-export_records([item1, item2]).
-export([f/0]).
f() ->
R1 = #item1{x=1, common1=foo1, common2=bar1},
R2 = #item2{y=2, common1=foo2, common2=bar2},
['#get-'(Field, Rec) || Field <- [common1, common2], Rec <- [R1, R2]].
...
1> c(parse).
{ok,parse}
2> parse:f().
[foo1,foo2,bar1,bar2]
It might make sense to factor out the common fields into a single field in each record, containing a record with all the common data or even a tuple. Then refactor your code so that all common processing happens in its own function.
You still need to pattern match every top level record to get the common sub record. But somewhere you probably want to do the processing specific to each record kind and there you can already match out the common field.
-record(common, {c1, c2, c3}).
-record(item1, {a, b, c, com}).
...
process_item(#item1{a=A, b=B, c=C, com=Com}) ->
process_abc(A, B, C),
process_common(Com),
...;
process_item(#item2{x=X, y=Y ...
Data structures like this might also be an indication to use the new Map data type instead of records.

Merge multiple lists of data together by common ID in F#

I have multiple lists of data from 4 different sources with a common set of IDs that I would like to merge together, based on ID, basically ending up with a new list, one for each ID and a single entry for each source.
The objects in the output list from each of the 4 sources look something like this:
type data = {ID : int; value : decimal;}
so, for example I would have:
let sourceA = [data1, data2, data3];
let sourceB = [data1, data2, data3];
let sourceC = [data1, data2, data3];
let sourceD = [data1, data2, data3];
(I realize this code is not valid, just trying to give a basic idea... the lists are actually pulled and generated from a database)
I would then like to take sourceA, sourceB, sourceC and sourceD and process them into a list containing objects something like this:
type dataByID = {ID : int; valueA : decimal; valueB : decimal; valueC : decimal; valueD : decimal; }
...so that I can then print them out in a CSV, with the first column being the ID and columns 2 - 5 being data from sources A - D corresponding to the ID in that row.
I'm totally new to F#, so what would be the best way to process this data so that I match up all the source data values by ID??
It seems that you could simply concatenate all the lists and then use Seq.groupBy to get a list that contains unique IDs in the input lists and all values associated with the ID. This can be done using something like:
let data =
[ data1; data2; data3; data4 ] // Create list of lists of items
|> Seq.concat // Concatenate to get a single list of items
|> Seq.groupBy (fun d -> d.ID) // Group elements by ID
seq { for id, values in data ->
// ID is the id and values is a sequence with all values
// (that come from any data source) }
If you want to associate the source (whether it was data1, data2, etc...) with the value then you can first use a map operation to add an index of the data source:
let addIndex i data =
data |> Seq.map (fun v -> i, v)
let data =
[ addIndex 1 data1;
addIndex 2 data2;
addIndex 3 data3;
addIndex 4 data4 ]
|> Seq.concat
|> Seq.groupBy (fun (index, d) -> d.ID)
Now data also contains the index of the data source (from 1 to 4), so when iterating over the values you can use the index to find out which data source an item comes from. An even nicer version can be written using Seq.mapi to iterate over the list of data sources and add the index to all the values automatically (note that Seq.mapi indexes start at 0, not 1):
let data =
[ data1; data2; data3; data4 ]
|> Seq.mapi (fun index data -> addIndex index data)
|> Seq.concat
|> Seq.groupBy (fun (index, d) -> d.ID)
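The last step, turning the grouped values into one row per ID with a slot per source, is not shown above. Here is a rough sketch of that step, written in Rust rather than F# just to make the shape of the result concrete. Data and merge_by_id are made-up names, the value is an f64 instead of a decimal, and a missing entry is represented as None (an assumption, since the question does not say what should happen when a source lacks an ID).

```rust
use std::collections::BTreeMap;

/// One input row; mirrors the question's `data` record.
#[derive(Debug, Clone, PartialEq)]
struct Data {
    id: i32,
    value: f64,
}

/// Merges rows from several sources by id.
/// Slot `i` of each result row holds the value from source `i`,
/// or None when that source has no entry for the id.
/// If a source repeats an id, the last value wins.
fn merge_by_id(sources: &[Vec<Data>]) -> BTreeMap<i32, Vec<Option<f64>>> {
    let mut merged = BTreeMap::new();
    for (index, source) in sources.iter().enumerate() {
        for d in source {
            let slots = merged
                .entry(d.id)
                .or_insert_with(|| vec![None; sources.len()]);
            slots[index] = Some(d.value);
        }
    }
    merged
}
```

Because a BTreeMap iterates in key order, writing the result out as CSV gives rows sorted by ID, with one column per source after the ID column.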
