processing a variable size map in pig

processing a variable size map in pig - parsing

I have a data set that is incoming as
(str,[[40,74],[50,75],[60,73],[70,43]])
and I need to be able to get this in the output variable using pig:
str, 40, 74
str , 50, 75
str, 60, 73
str, 70, 43
and this could be variable set of elements.
Tried with tokenizing and then flatten, but that doesn't help as it creates token using comma. and end up being this way..
str , {([[40), (74]), ... }
Would any one suggestions on if I could use built in functions or write a UDF for this.
many thanks,
Ana

You will need to write a custom UDF to parse this. Assuming your data does not get more complicated than this, you can probably get away with a quick, shallow method of parsing using String.split with delimiter "],[".

Related

How to stop parsing other opts if past opts failed in nom.rs?

Currently i have 2 methods: parse_int and parse_string. They return either one number or one string, respectively. For example, if i have the parser tuple((parse_int, parse_string)), then it will convert " 123 sdf" to (123, "sdf").
My actual parser looks like tuple((parse_string, opt(parse_int), opt(parse_string))). It will succesfully convert " dsf 123 sdf" to ("dfs", Some(234), Some("sdf")), which is good. But if I enter "dfs sdf" as input, then we get ("dfs", None, Some("sdf"))). And I want it to return to me ("dfs", None, None)), because sdf could not be converted to int. For a better understanding, I am making a command args parser.

Use tuple() inside opt(), to tie parse_int and parse_string together:
tuple((parse_string, opt(tuple((parse_int, opt(parse_string))))))
It will produce (&str, Option<(int, Option<&str>)>) instead of (&str, Option<int>, Option<&str>). You can use map() to flatten it if desired.

Other ways to call/eval dynamic strings in Lua?

I am working with a third party device which has some implementation of Lua, and communicates in BACnet. The documentation is pretty janky, not providing any sort of help for any more advanced programming ideas. It's simply, "This is how you set variables...". So, I am trying to just figure it out, and hoping you all can help.
I need to set a long list of variables to certain values. I have a userdata 'ME', with a bunch of variables named MVXX (e.g. - MV21, MV98, MV56, etc).
(This is all kind of background for BACnet.) Variables in BACnet all have 17 'priorities', i.e., every BACnet variable is actually a sort of list of 17 values, with priority 16 being the default. So, typically, if I were to say ME.MV12 = 23, that would set MV12's priority-16 to the desired value of 23.
However, I need to set priority 17. I can do this in the provided Lua implementation, by saying ME.MV12_PV[17] = 23. I can set any of the priorities I want by indexing that PV. (Corollaries - what is PV? What is the underscore? How do I get to these objects? Or are they just interpreted from Lua to some function in C on the backend?)
All this being said, I need to make that variable name dynamic, so that i can set whichever value I need to set, based on some other code. I have made several attempts.
This tells me the object(MV12_PV[17]) does not exist:
x = 12
ME["MV" .. x .. "_PV[17]"] = 23
But this works fine, setting priority 16 to 23:
x = 12
ME["MV" .. x] = 23
I was trying to attempt some sort of what I think is called an evaluation, or eval. But, this just prints out function followed by some random 8 digit number:
x = 12
test = assert(loadstring("MV" .. x .. "_PV[17] = 23"))
print(test)
Any help? Apologies if I am unclear - tbh, I am so far behind the 8-ball I am pretty much grabbing at straws.

Underscores can be part of Lua identifiers (variable and function names). They are just part of the variable name (like letters are) and aren't a special Lua operator like [ and ] are.
In the expression ME.MV12_PV[17] we have ME being an object with a bunch of fields, ME.MV12_PV being an array stored in the "MV12_PV" field of that object and ME.MV12_PV[17] is the 17th slot in that array.
If you want to access fields dynamically, the thing to know is that accessing a field with dot notation in Lua is equivalent to using bracket notation and passing in the field name as a string:
-- The following are all equivalent:
x.foo
x["foo"]
local fieldname = "foo"
x[fieldname]
So in your case you might want to try doing something like this:
local n = 12
ME["MV"..n.."_PV"][17] = 23

BACnet "Commmandable" Objects (e.g. Binary Output, Analog Output, and o[tionally Binary Value, Analog Value and a handful of others) actually have 16 priorities (1-16). The "17th" you are referring to may be the "Relinquish Default", a value that is used if all 16 priorities are set to NULL or "Relinquished".
Perhaps your system will allow you to write to a BACnet Property called "Relinquish Default".

torch FloatTensor toString method?

I have a torch.FloatTensor that is 2 rows and 200 columns. I want to print the lines to a text file. Is there a toString() method for the torch.FloatTensor? If not, how do you print the line to the file? Thanks.
I can convert the FloatTensor into a Lua table:
local line = a[1]
local table = {}
for i=1,line:size(1) do
table[i] = line[i]
end
Is there an easy way to convert the Lua table to a string, so I can write it to file? Thanks!

There is a built-in table conversion called torch.totable. If what you want to do is save and load your tensor, then it's even easier to use Torch native serialization like torch.save('file.txt', tensor, 'ascii').

matlab indexing into nameless matrix [duplicate]

For example, if I want to read the middle value from magic(5), I can do so like this:
M = magic(5);
value = M(3,3);
to get value == 13. I'd like to be able to do something like one of these:
value = magic(5)(3,3);
value = (magic(5))(3,3);
to dispense with the intermediate variable. However, MATLAB complains about Unbalanced or unexpected parenthesis or bracket on the first parenthesis before the 3.
Is it possible to read values from an array/matrix without first assigning it to a variable?

It actually is possible to do what you want, but you have to use the functional form of the indexing operator. When you perform an indexing operation using (), you are actually making a call to the subsref function. So, even though you can't do this:
value = magic(5)(3, 3);
You can do this:
value = subsref(magic(5), struct('type', '()', 'subs', {{3, 3}}));
Ugly, but possible. ;)
In general, you just have to change the indexing step to a function call so you don't have two sets of parentheses immediately following one another. Another way to do this would be to define your own anonymous function to do the subscripted indexing. For example:
subindex = #(A, r, c) A(r, c); % An anonymous function for 2-D indexing
value = subindex(magic(5), 3, 3); % Use the function to index the matrix
However, when all is said and done the temporary local variable solution is much more readable, and definitely what I would suggest.

There was just good blog post on Loren on the Art of Matlab a couple days ago with a couple gems that might help. In particular, using helper functions like:
paren = #(x, varargin) x(varargin{:});
curly = #(x, varargin) x{varargin{:}};
where paren() can be used like
paren(magic(5), 3, 3);
would return
ans = 16
I would also surmise that this will be faster than gnovice's answer, but I haven't checked (Use the profiler!!!). That being said, you also have to include these function definitions somewhere. I personally have made them independent functions in my path, because they are super useful.
These functions and others are now available in the Functional Programming Constructs add-on which is available through the MATLAB Add-On Explorer or on the File Exchange.

How do you feel about using undocumented features:
>> builtin('_paren', magic(5), 3, 3) %# M(3,3)
ans =
13
or for cell arrays:
>> builtin('_brace', num2cell(magic(5)), 3, 3) %# C{3,3}
ans =
13
Just like magic :)
UPDATE:
Bad news, the above hack doesn't work anymore in R2015b! That's fine, it was undocumented functionality and we cannot rely on it as a supported feature :)
For those wondering where to find this type of thing, look in the folder fullfile(matlabroot,'bin','registry'). There's a bunch of XML files there that list all kinds of goodies. Be warned that calling some of these functions directly can easily crash your MATLAB session.

At least in MATLAB 2013a you can use getfield like:
a=rand(5);
getfield(a,{1,2}) % etc
to get the element at (1,2)

unfortunately syntax like magic(5)(3,3) is not supported by matlab. you need to use temporary intermediate variables. you can free up the memory after use, e.g.
tmp = magic(3);
myVar = tmp(3,3);
clear tmp

Note that if you compare running times with the standard way (asign the result and then access entries), they are exactly the same.
subs=#(M,i,j) M(i,j);
>> for nit=1:10;tic;subs(magic(100),1:10,1:10);tlap(nit)=toc;end;mean(tlap)
ans =
0.0103
>> for nit=1:10,tic;M=magic(100); M(1:10,1:10);tlap(nit)=toc;end;mean(tlap)
ans =
0.0101
To my opinion, the bottom line is : MATLAB does not have pointers, you have to live with it.

It could be more simple if you make a new function:
function [ element ] = getElem( matrix, index1, index2 )
element = matrix(index1, index2);
end
and then use it:
value = getElem(magic(5), 3, 3);

Your initial notation is the most concise way to do this:
M = magic(5); %create
value = M(3,3); % extract useful data
clear M; %free memory
If you are doing this in a loop you can just reassign M every time and ignore the clear statement as well.

To complement Amro's answer, you can use feval instead of builtin. There is no difference, really, unless you try to overload the operator function:
BUILTIN(...) is the same as FEVAL(...) except that it will call the
original built-in version of the function even if an overloaded one
exists (for this to work, you must never overload
BUILTIN).
>> feval('_paren', magic(5), 3, 3) % M(3,3)
ans =
13
>> feval('_brace', num2cell(magic(5)), 3, 3) % C{3,3}
ans =
13
What's interesting is that feval seems to be just a tiny bit quicker than builtin (by ~3.5%), at least in Matlab 2013b, which is weird given that feval needs to check if the function is overloaded, unlike builtin:
>> tic; for i=1:1e6, feval('_paren', magic(5), 3, 3); end; toc;
Elapsed time is 49.904117 seconds.
>> tic; for i=1:1e6, builtin('_paren', magic(5), 3, 3); end; toc;
Elapsed time is 51.485339 seconds.

Custom Array Functions in Open Office Calc

Could someone please tell me how to write a custom function in Open Office Basic to be used in Open Office Calc and that returns an array of values. An example of one such built-in function is MINVERSE. I need to write a custom function that populates a range of cells in much the same way.
Help would be much appreciated.

Yay, I just figured it out: all you do is return an array from your macro, BUT you also have to press Ctrl+Shift+Enter when typing in the cell formula to call your function (which is also the case when working with other arrays in calc). Here's an example:
Function MakeArray
Dim ret(2,2)
ret(0,0) = 1
ret(1,0) = 2
ret(0,1) = 3
ret(1,1) = 4
MakeArray = ret
End Function

FWIW, damjan's MakeArray function returns a Variant containing an array, I think. (The type returned by MakeArray is unspecified, so it defaults to Variant. A Variant is a container with a descriptive header, apparently cast as needed by the interpreter.)
Almost, but not quite, the same thing as returning an array. According to http://www.cpearson.com/excel/passingandreturningarrays.htm, Microsoft did not introduce the ability to return an array until 2000. His example [ LoadNumbers(Low As Long, High As Long) As Long()] does not compile in OO, flagging a syntax error on the parens following Long. It appears that OO's Basic emulates the pre-2k VBA.

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart

processing a variable size map in pig - parsing

You will need to write a custom UDF to parse this. Assuming your data does not get more complicated than this, you can probably get away with a quick, shallow method of parsing using String.split with delimiter "],[".

Related

How to stop parsing other opts if past opts failed in nom.rs?

Other ways to call/eval dynamic strings in Lua?

torch FloatTensor toString method?

matlab indexing into nameless matrix [duplicate]

Custom Array Functions in Open Office Calc

Categories

Resources