CGPDFObject get ID - iOS

How can I obtain the ID of a CGPDFObject?
I have this dictionary in my PDF:
3 0 obj
<< /Type /Pages /MediaBox [0 0 612 792] /Count 5 /Kids [ 2 0 R 9 0 R 15 0 R
21 0 R 27 0 R ] >>
endobj
which I obtain using:
CGPDFDictionaryRef pdfPagesObjectRef;
CGPDFDictionaryGetDictionary(pdfCatalogueRef, "Pages", &pdfPagesObjectRef);
Now I am aware of CGPDFDictionaryApplyFunction to get the key/value pairs inside the dictionary. But how can I get the object's own ID and generation number? (In this case 3 and 0.)
EDIT:
Why do I need this information? I am trying to add text annotations to the file. In my understanding, there is no "high level" way to do this in iOS; you have to append a new section (xref table, overridden objects, trailer, etc.) by hand.
Therefore I have to get the IDs and generation numbers of the objects I want to override, and of those referenced in the objects I override (e.g. /Resources and /Contents in an overridden page).
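As far as I know, CGPDF never exposes indirect object numbers, so this approach has to fall back to the raw file at some point. Purely as an illustrative sketch (the file name and helper are hypothetical, not an established API), scanning the raw bytes for the "N G obj" markers recovers the object numbers, generation numbers and byte offsets that an appended xref section needs:

import re

def list_objects(pdf_bytes):
    # Naive sketch: returns (object number, generation number, byte offset).
    # The regex can also match inside content streams, so a real
    # implementation would have to skip stream data.
    pattern = re.compile(rb'(\d+)\s+(\d+)\s+obj')
    return [(int(m.group(1)), int(m.group(2)), m.start())
            for m in pattern.finditer(pdf_bytes)]

with open("document.pdf", "rb") as f:   # hypothetical file name
    for num, gen, offset in list_objects(f.read()):
        print(num, gen, offset)         # e.g. 3 0 <offset of "3 0 obj">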

Problems filling an array with numbers

I created an array and wanted to fill it with numbers. I used a loop, but it spoils the previous items when it goes to the next one:
create mass 2 2 * CELLS ALLOT
: [!] ( value index array -- ) + ! ;
: show
4 0 DO mass I + ? LOOP ;
: fill
4 0 DO I I mass [!] show CR LOOP ;
fill
3 3 mass !
show
...so the show word gives me this, step by step:
0 0 0 0
256 1 0 0
131328 513 2 0
50462976 197121 770 3
Moreover, after 3 3 mass ! the show word gives me this:
3 0 0 0 ok
I don't understand how to work with arrays, what happens in my loop, or why after 3 3 mass ! I don't get what I got in the loop. Please help.
(I realize the Forth section is all my questions now... sorry.)
The + word in [!] and show simply adds the index as a plain number to the address, producing a new address that is not aligned to the cell size. That is why you damage the contents of mass, and also why show does not display them correctly.
Without changing the stack effects of your words much, the fix could look like this:
create mass 2 2 * CELLS ALLOT
: [!] ( value index array -- ) swap cells + ! ;
: show
4 0 DO mass I cells + ? LOOP ;
: fill
4 0 DO I I mass [!] show CR LOOP ;
fill
3 3 mass [!]
show
Note the cells word, which transforms the index into a correctly sized offset into the array.
Edit: in the last manual assignment, use your [!] word, or transform the index 3 into the correct offset, as mentioned by @dave_thompson_085 in the comments.
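Not part of the original answer, but to make the damage concrete: a small Python sketch (assuming 32-bit little-endian cells, which is what the printed values suggest) that mimics the original byte-offset store and read reproduces exactly the values from the question:

import struct

mass = bytearray(4 * 4)                  # four 4-byte cells, all zero

def broken_store(value, index):
    # the original [!]: address + index in bytes, then store one full cell
    struct.pack_into("<i", mass, index, value)

def broken_show():
    # the original show: reads full cells at byte offsets 0, 1, 2, 3
    print([struct.unpack_from("<i", mass, i)[0] for i in range(4)])

for i in range(4):
    broken_store(i, i)
    broken_show()

# prints:
# [0, 0, 0, 0]
# [256, 1, 0, 0]
# [131328, 513, 2, 0]
# [50462976, 197121, 770, 3]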

Hypothesis function space in decision trees

I am reading the book "Artificial Intelligence" by Stuart Russell and Peter Norvig (Chapter 18). The following paragraph is from the decision trees context.
For a wide variety of problems, the decision tree format yields a
nice, concise result. But some functions cannot be represented
concisely. For example, the majority function, which returns true if
and only if more than half of the inputs are true, requires an
exponentially large decision tree.
In other words, decision trees are good for some kinds of functions
and bad for others. Is there any kind of representation that is
efficient for all kinds of functions? Unfortunately, the answer is no.
We can show this in a general way. Consider the set of all Boolean
functions on "n" attributes. How many different functions are in this
set? This is just the number of different truth tables that we can
write down, because the function is defined by its truth table.
A truth table over "n" attributes has 2^n rows, one for each
combination of values of the attributes.
We can consider the “answer” column of the table as a 2^n-bit number
that defines the function. That means there are (2^(2^n)) different
functions (and there will be more than that number of trees, since
more than one tree can compute the same function). This is a scary
number. For example, with just the ten Boolean attributes of our
restaurant problem there are 2^1024 or about 10^308 different
functions to choose from.
What does the author mean by the "answer" column of the table being a 2^n-bit number that defines the function?
How did the author derive 2^(2^n) different functions?
Please elaborate on the above questions, preferably with a simple example such as n = 3.
Consider a general truth table for a 3-input function, where the result for each triple is also a Boolean (1 or 0), represented by the variables i through p:
A B C f(a,b,c)
0 0 0 i
0 0 1 j
0 1 0 k
0 1 1 l
1 0 0 m
1 0 1 n
1 1 0 o
1 1 1 p
We can now represent any function on three variables as an 8-bit number, ijklmnop. For instance, and is 00000001; or is 01111111; one_hot (exactly one input True) is 01101000.
For 3 variables, you have 2^3 = 8 bits in the "answer", the complete function definition. Since there are 8 bits in the "answer", there are 2^8 = 256 possible functions we can define. In general, n attributes give a 2^n-bit answer column, and therefore 2^(2^n) possible functions.
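To make the counting argument concrete, here is a short Python sketch (mine, not the book's) that enumerates every Boolean function on n = 3 attributes by treating the answer column as an 8-bit number:

from itertools import product

n = 3
rows = list(product([0, 1], repeat=n))        # the 2^n rows of the truth table

functions = []
for answer in range(2 ** (2 ** n)):           # every possible answer column
    # bit k of 'answer' is the function's output for row k
    table = {row: (answer >> k) & 1 for k, row in enumerate(rows)}
    functions.append(table)

print(len(functions))                         # 256, i.e. 2^(2^3)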
Does that make the picture clearer for you?
More detail on an example function
You simply (once you see the pattern) make the eight bits correspond to the entries in the table. For instance, the table for one_hot looks like this:
A B C f(a,b,c)
0 0 0 0
0 0 1 1
0 1 0 1
0 1 1 0
1 0 0 1
1 0 1 0
1 1 0 0
1 1 1 0
Reading down the "answer" column, labeled f(a,b,c), you get the 8-bit sequence 01101000. That 8-bit number is sufficient to completely define the function: the rows listing all the combinations of a, b, c are in a fixed (numerical) sequence.
You can write any such function in a template format:
def and_(a, b, c):               # 'and' is a reserved word in Python, hence the underscore
    and_def = '00000001'
    index = 4*a + 2*b + 1*c      # row number in the truth table
    return int(and_def[index])
Now, if we generalize this to any 3-input binary function:
def bin_func(a, b, c, func_def):
    return int(func_def[4*a + 2*b + 1*c])
If you wish, you can further generalize the template for a list of inputs: concatenate the bits and use that integer as the index into the func_def string.
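A possible sketch of that generalization (my naming, not from the original answer):

def n_ary_func(inputs, func_def):
    # concatenate the input bits, most significant first, into an index
    index = 0
    for bit in inputs:
        index = index * 2 + bit
    return int(func_def[index])

print(n_ary_func([0, 1, 0], '01101000'))   # 1, because exactly one input is True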
Does that clear it up?

How to equalize the number of rows per unit in an SPSS file

I have a file with a different number of rows for every "unit", and I'd like all the units to have the same number of rows, by adding the right number of empty rows per unit in the data.
For example:
data list list/ unit serial someData.
begin data.
1 1 54
2 1 57
2 2 87
2 3 91
3 1 17
3 2 43
end data.
What I'd like to get is this:
1 1 54
1 2 .
1 3 .
2 1 57
2 2 87
2 3 91
3 1 17
3 2 43
3 3 .
I've worked with simple workarounds, for example casestovars => varstocases (keeping nulls), or preparing a base file with all the lines with unit names and serials, and then matching it with the data file so I end up with all the lines and all the data.
Could anyone suggest a more direct (elegant / efficient / simple) approach?
Thanks!
Cartesian product is what you require here.
Using your example data and the STATS CARTPROD custom extension command (which you need to download), you can solve it as below:
data list list/ unit serial someData.
begin data.
1 1 54
2 1 57
2 2 87
2 3 91
3 1 17
3 2 43
end data.
DATASET NAME ds0.
DATASET ACTIVATE ds0.
STATS CARTPROD VAR1=unit VAR2=serial /SAVE OUTFILE="C:\Temp\dsCart".
SORT CASES BY unit serial.
MATCH FILES FILE=* /BY unit serial /FIRST=Primary.
SELECT IF Primary.
MATCH FILES FILE=* /FILE=ds0 /BY unit serial /DROP=Primary.
EXE.
I'm not sure how efficient this Custom Extension Command is so you may want to experiment with different flavours of using STATS CARTPROD. An alternative approach would be to create two datasets (left and right) with your unique unit and serial values and then process these through the STATS CARTPROD command.
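For readers who just want to see the shape of the operation outside SPSS, here is a plain Python sketch of the same idea: build the full unit-by-serial grid with a Cartesian product, then look each cell up in the data, leaving missing cells empty:

from itertools import product

data = {(1, 1): 54, (2, 1): 57, (2, 2): 87, (2, 3): 91, (3, 1): 17, (3, 2): 43}

units   = sorted({u for u, s in data})
serials = sorted({s for u, s in data})

for unit, serial in product(units, serials):              # the Cartesian product
    print(unit, serial, data.get((unit, serial), '.'))    # '.' = missing value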
You already mentioned it: creating a base file with all the lines with unit names and serials, and then matching it with the data file would be a simple approach. I'd like to outline this one here for other readers.
So for the question's example, you would create the base data set like this:
INPUT PROGRAM.
LOOP #i = 1 to 3. /* 3 = maximum value of unit.
LOOP #j = 1 to 3. /* 3 = maximum value of serial.
COMPUTE unit = #i.
COMPUTE serial = #j.
END CASE.
END LOOP.
END LOOP.
END FILE.
END INPUT PROGRAM.
DATASET NAME base.
EXECUTE.
The data set will look like this.
unit serial
1 1
1 2
1 3
2 1
2 2
2 3
3 1
3 2
3 3
The following MATCH FILES command will produce the desired result:
MATCH FILES
/FILE base
/FILE data1
/BY unit serial.
If you want the code to be more flexible regarding the maximum values of "unit" and "serial", you can make use of the Python extension:
BEGIN PROGRAM.
import spss, spssdata
# list of variable names
variables = ["unit", "serial"]
#fetch variable data
data = spssdata.Spssdata(variables).fetchall()
# get maximum of 'unit' and 'serial'
maxunit = max([int(i[0]) for i in data])
maxserial = max([int(i[1]) for i in data])
# create base data set
spss.Submit('''
INPUT PROGRAM.
LOOP #i = 1 to {maxu}.
LOOP #j = 1 to {maxs}.
COMPUTE unit = #i.
COMPUTE serial = #j.
END CASE.
END LOOP.
END LOOP.
END FILE.
END INPUT PROGRAM.
DATASET NAME base.
EXECUTE.
'''.format(maxu=maxunit, maxs=maxserial))
END PROGRAM.

Functional impact of declaring local variables via function parameters

In writing some one-off Lua code for an answer, I found myself code golfing to fit a function on a single line. While this code did not fit on one line...
foo=function(a,b) local c=bob; some_code_using_c; return c; end
...I realized that I could just make it fit by converting it to:
foo=function(a,b,c) c=bob; some_code_using_c; return c; end
Are there any performance or functional implications of using a function parameter to declare a function-local variable (assuming I know that a third argument will never be passed to the function) instead of using local? Do the two techniques ever behave differently?
Note: I included semicolons in the above for clarity of concept and to aid those who do not know Lua's handling of whitespace. I am aware that they are not necessary; if you follow the link above you will see that the actual code does not use them.
Edit: Based on @Oka's answer, I compared the bytecode generated by these two functions, in separate files:
function foo(a,b)
  local c
  return function() c=a+b+c end
end

function foo(a,b,c)
  -- this line intentionally blank
  return function() c=a+b+c end
end
Ignoring addresses, the bytecode reports are identical (except for the number of parameters listed for each function).
You can go ahead and look at the Lua bytecode generated by using luac -l -l -p my_file.lua, comparing instruction sets and register layouts.
On my machine:
function foo (a, b)
  local c = a * b
  return c + 2
end

function bar (a, b, c)
  c = a * b
  return c + 2
end
Produces:
function <f.lua:1,4> (4 instructions at 0x80048fe0)
2 params, 4 slots, 0 upvalues, 3 locals, 1 constant, 0 functions
1 [2] MUL 2 0 1
2 [3] ADD 3 2 -1 ; - 2
3 [3] RETURN 3 2
4 [4] RETURN 0 1
constants (1) for 0x80048fe0:
1 2
locals (3) for 0x80048fe0:
0 a 1 5
1 b 1 5
2 c 2 5
upvalues (0) for 0x80048fe0:
function <f.lua:6,9> (4 instructions at 0x800492b8)
3 params, 4 slots, 0 upvalues, 3 locals, 1 constant, 0 functions
1 [7] MUL 2 0 1
2 [8] ADD 3 2 -1 ; - 2
3 [8] RETURN 3 2
4 [9] RETURN 0 1
constants (1) for 0x800492b8:
1 2
locals (3) for 0x800492b8:
0 a 1 5
1 b 1 5
2 c 1 5
upvalues (0) for 0x800492b8:
Not very much difference, is there? If I'm not mistaken, there's just a slightly different declaration location specified for each c, plus the difference in the number of params, as one might expect.

Return multiple columns / a dataframe in Deedle based on row-wise mapping

I want to look at each row in a frame and construct multiple columns for a new frame based on values in that row.
The final result should be a frame that has the columns of the original frame plus the new columns.
I have a solution but I wonder if there is a better one. I think the best way to explain the desired behavior is with an example. I'm using Deedle's titanic data set:
#r #"F:\aolney\research_projects\braintrust\code\QualtricsToR\packages\Deedle.1.2.3\lib\net40\Deedle.dll";;
#r #"F:\aolney\research_projects\braintrust\code\QualtricsToR\packages\FSharp.Charting.0.90.12\lib\net40\FSharp.Charting.dll";;
#r #"F:\aolney\research_projects\braintrust\code\QualtricsToR\packages\FSharp.Data.2.2.2\lib\net40\FSharp.Data.dll";;
open System
open FSharp.Data
open Deedle
open FSharp.Charting;;
#load #"F:\aolney\research_projects\braintrust\code\QualtricsToR\packages\FSharp.Charting.0.90.12\FSharp.Charting.fsx";;
#load #"F:\aolney\research_projects\braintrust\code\QualtricsToR\packages\Deedle.1.2.3\Deedle.fsx";;
let titanic = Frame.ReadCsv(#"C:\Users\aolne_000\Downloads\titanic.csv");;
This is what that frame looks like:
val titanic : Frame<int,string> =
PassengerId Survived Pclass Name Sex Age SibSp Parch Ticket Fare Cabin Embarked
0 -> 1 False 3 Braund, Mr. Owen Harris male 22 1 0 A/5 21171 7.25 S
1 -> 2 True 1 Cumings, Mrs. John Bradley (Florence Briggs Thayer) female 38 1 0 PC 17599 71.2833 C85 C
My approach grabs each row, uses some selection logic, and then returns a new row value as a dictionary. Then I use Deedle's expansion operation to convert the values in this dictionary to new columns.
titanic?test <- titanic |> Frame.mapRowValues( fun x -> if x.GetAs<int>("Pclass") > 1 then dict ["A", 1; "B", 2] else dict ["A", 2 ; "B", 1] );;
titanic |> Frame.expandCols ["test"];;
This gives the following new frame:
PassengerId Survived Pclass Name Sex Age SibSp Parch Ticket Fare Cabin Embarked test.A test.B
0 -> 1 False 3 Braund, Mr. Owen Harris male 22 1 0 A/5 21171 7.25 S 1 2
1 -> 2 True 1 Cumings, Mrs. John Bradley (Florence Briggs Thayer) female 38 1 0 PC 17599 71.2833 C85 C 2 1
Note the last two columns are test.A and test.B. Effectively this approach creates a new frame (A and B) and then joins the frame to the existing frame.
This is fine for my use case, but it is probably confusing for others to read. It also forces a prefix (e.g. "test") onto the final columns, which isn't desirable.
Is there a way to append the new values to the end of the row series represented in the code above by x?
I find your approach quite elegant and clever. Because the new series shares the index with the original frame, it is also going to be pretty fast. So, I think your solution may actually be better than the alternative option (but I have not measured this).
Anyway, the other option would be to return new rows from your Frame.mapRowValues call - so for each row, we return the original row together with the additional columns.
titanic
|> Frame.mapRowValues (fun x ->
    let add =
      if x.GetAs<int>("Pclass") > 1 then series ["A", box 1; "B", box 2]
      else series ["A", box 2; "B", box 1]
    Series.merge x add)
|> Frame.ofRows
