std.json.parse and memory management - zig

I have the following zig code (simplified):
const allocator = ...;
const Entry = struct {
    name: []u8,
};
fn list() !Entry {
    var entries = try std.json.parse([]Entry, ...);
    defer std.json.parseFree([]Entry, entries, ...);
    return entries[0];
}
fn main() !void {
    const e = try list();
    ...
}
Everything is (de)allocated with allocator.
I'd like to return entries[0] and recycle everything else. The problem is that parseFree here recycles everything, including entries[0], so it seems I can't use this function.
So what's the most effective way to do that without copying? What if Entry is a big structure and I want to return just one of its fields, say Entry.name (and, again, recycle everything else)?

The quickest way of achieving what you want is to not use parseFree but instead:
const first = entries[0].name;
for (entries[1..]) |e| allocator.free(e.name); // frees the other entries' strings
allocator.free(entries); // frees the array of structs
return Entry{ .name = first };
This is fairly straightforward, but it relies on knowing how std.json allocates memory, and the implementation might one day allocate things differently, causing the code above to break.
Because of that, my recommendation would be to just copy the string and only resort to this type of extra complexity once you have demonstrated that avoiding that copy brings perceivable benefits.

Related

Object methods and fields co-exist in Lua library

EDIT: Turns out this is not something possible with Lua, having the __index method AND methods like class instance methods. It's either or.
Trying to get my Lua interface to work where both fields and instance methods are supported. It seems that by manipulating the initialization, I can only get the functions (_f), or the methods (_m) to work, not both.
I feel like it's something really simple I'm just missing.
How I initialize the library:
void PushUserdata(const void *data, const char *metatable)
{
    const void **wrapped_ptr = (const void**)lua_newuserdata(l, sizeof(const void*));
    *wrapped_ptr = data;
    luaL_getmetatable(l, metatable);
    lua_setmetatable(l, -2);
}
static int player_head(lua_State *L)
{
    if (!Player::playerHead)
        lua_pushnil(L);
    else
        PushUserdata(Player::playerHead, "player");
    return 1;
}
static int playerget(lua_State *L)
{
    Player *player = *CHECKPLAYER(L, 1); // Get double pointer and dereference to get real pointer
    const char *field = luaL_checkstring(L, 2);
    if (!strcmp(field, "next"))
    {
        if (!player->next)
            lua_pushnil(L);
        else
            PushUserdata(player->next, "player");
    }
    else if (!strcmp(field, "prev"))
    {
        if (!player->prev)
            lua_pushnil(L);
        else
            PushUserdata(player->prev, "player");
    }
    else if (!strcmp(field, "obj"))
    {
        if (!player->obj)
            lua_pushnil(L);
        else
            PushUserdata(player->obj, "wobj");
    }
    else if (!strcmp(field, "AddCollisionObjHook")) // This ends up here if __index is in the table below...
    {
    }
    else
        return 0;
    return 1;
}
static const struct luaL_Reg playerlib_f[] = {
    {"head", player_head},
    {"AddPreThinker", AddPreThinker},
    {"AddPostThinker", AddPostThinker},
    {NULL, NULL}
};
static const struct luaL_Reg playerlib_m[] = {
    {"__tostring", player2string},
    {"__index", playerget},
    {"__newindex", playerset},
    {"AddCollisionObjHook", AddCollisionObjHook},
    {NULL, NULL}
};
int Lua_PlayerLib(lua_State *L)
{
    luaL_newmetatable(L, "player");
    lua_pushvalue(L, -1); // duplicates the metatable...but why?
    luaL_setfuncs(L, playerlib_m, 0);
    luaL_newlib(L, playerlib_f, 0);
    lua_setglobal(L, "player");
    return 1;
}
Lua script:
me = playerlib.head()
me:AddCollisionObjHook(playerHitObj)
Error message:
Warning: [string "postload.lua"]: attempt to call method 'AddCollisionObjHook' (a nil value)
'me' is absolutely a valid non-nil value.
What you're trying to do is possible, but not in the way you're trying to do it.
I think it's worth reviewing how method calls and metatables/metamethods work, and what your code, as written, is actually doing. The tl;dr is:
method calls are just normal field lookups
metatables, and the metamethods they contain, are operator overloads, not method definitions
if you're implementing this for userdata you need an __index metamethod that can handle both field and method lookups
First of all, Lua has no inbuilt distinction between "methods" and "fields". You may find it convenient to differentiate the two when organizing your code, but as far as the Lua language is concerned, methods and fields are the same thing. A method is just a field where the key is a valid Lua identifier and the value is a function.
So, when you write something like me:AddCollisionObjHook(playerHitObj), what actually happens is something like:
local self = me
local method = self["AddCollisionObjHook"]
method(self, playerHitObj)
(Two notes on this:
no actual new locals are created; this all happens in the internals of the Lua interpreter.
self["AddCollisionObjHook"] and self.AddCollisionObjHook are two ways of writing the same thing; the latter is just a convenient shortcut for the former.)
So, how does that self["AddCollisionObjHook"] lookup work? The same way any other field lookup works. The Lua Manual goes into detail on this, including pseudocode, but the part relevant to your code is:
-- We're looking up self[key] but self is userdata, not table
local mt = getmetatable(self)
if mt and type(mt.__index) == 'function' then
  -- user provided an __index function
  return mt.__index(self, key)
elseif mt and mt.__index ~= nil then
  -- user provided an __index table (or table-like object)
  -- retry the lookup using it
  return mt.__index[key]
else
  -- no metatable, or metatable lacks __index metamethod
  error(...) -- don't know how to do field lookup on this type!
end
Note that at no point in this process are fields other than __index looked up in the metatable. The metatable exists only to tell Lua how to implement operators for types that don't normally have them; in this case the field lookup ("index") operator ([], and its aliases . and :) for a specific kind of userdata. It's entirely down to __index itself to handle the actual process of turning field names into values, either by being a table that the lookup can be retried with or by being a function that can return the associated value.
So, this brings us to the answer of how to support both (settable) fields and (callable) methods:
__newindex needs to understand how to set fields
__index needs to understand how to return both field values and method implementations
This is because, from Lua's perspective, field lookups and method lookups are the same operation, so __index gets used for both.
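The dispatch order described above (methods first, then fields, through a single lookup hook) can be modelled outside Lua. The following is only an analogy, sketched in JavaScript: the Proxy `get` trap stands in for the __index function and the `set` trap for __newindex. The names `methods`, `wrapPlayer` and `describe` are hypothetical and not part of the question's code.

```javascript
// Hypothetical stand-in for the table of instance methods.
const methods = {
  describe() { return `player ${this.name}`; },
};

function wrapPlayer(raw) {
  return new Proxy(raw, {
    // Plays the role of __index: one hook serves both kinds of lookup.
    get(target, key) {
      // 1. Method lookup comes first, so fields cannot shadow methods.
      if (Object.prototype.hasOwnProperty.call(methods, key)) return methods[key];
      // 2. Otherwise fall back to plain field data (undefined if absent).
      return target[key];
    },
    // Plays the role of a __newindex that refuses to overwrite methods.
    set(target, key, value) {
      if (Object.prototype.hasOwnProperty.call(methods, key)) {
        throw new TypeError(`cannot overwrite method '${String(key)}'`);
      }
      target[key] = value;
      return true;
    },
  });
}

const me = wrapPlayer({ name: "p1", next: null });
console.log(me.name);        // field read: p1
console.log(me.describe());  // method call through the same hook: player p1
```

As in Lua, the call `me.describe()` is just a lookup of the key "describe" followed by a call with `me` as the receiver.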
In light of that, how should we organize the code to support both, and how can we restructure your code to work? There's lots of ways to do this, although for the purposes of this answer I'm going to make a few assumptions:
fields are stored entirely C-side, with no corresponding data to manage in Lua
methods cannot be overwritten by Lua code
metamethods are stored separately from instance methods
The last one is not strictly necessary; in fact it's very common to store both metamethods and instance methods in the same table (I usually do this myself). However, I think this also tends to engender confusion among Lua newbies about the distinction between them, so in the interest of making the code as clear as possible, I'm separating them out in this answer.
With that in mind, let's rework the setup code. I've looked at your edits to try to reconstruct what you originally had in mind.
static int playerget(lua_State *L)
{
    Player *player = *CHECKPLAYER(L, 1);
    const char *field = luaL_checkstring(L, 2);
    // Check if it's a method, by getting the method table
    // and then seeing if the key exists in it.
    // This code can be re-used (or factored out into its own function)
    // at the start of playerset() to raise an error if the lua code tries
    // to overwrite a method.
    lua_getfield(L, LUA_REGISTRYINDEX, "player-methods");
    lua_getfield(L, -1, field);
    if (!lua_isnil(L, -1)) {
        // Lookup in methods table successful, so return the method impl, which
        // is now on top of the stack
        return 1;
    } else {
        // No method, so clean up the stack of both the nil value and the
        // table of methods we got it from.
        lua_pop(L, 2);
    }
    if (!strcmp(field, "next"))
        // ... code for reading fields rather than methods goes here ... //
}
// Functions that are part of the player library rather than tied to any
// one player instance.
static const struct luaL_Reg playerlib_api[] = {
    // player.head() -> returns the first player
    {"head", player_head},
    {NULL, NULL}
};
// Metamethods defining legal operators on player-type objects.
static const struct luaL_Reg playerlib_metamethods[] = {
    // Overrides the tostring() library function
    {"__tostring", player2string},
    // Adds support for the table read operators:
    // t[k], t.k, and t:k(...)
    {"__index", playerget},
    // Adds support for the table write operators:
    // t[k]=v and t.k=v
    {"__newindex", playerset},
    {NULL, NULL}
};
// Instance methods for player-type objects.
static const struct luaL_Reg playerlib_methods[] = {
    // player_obj:AddCollisionObjHook(hook)
    {"AddCollisionObjHook", AddCollisionObjHook},
    // player_obj:AddPreThinker(thinker)
    {"AddPreThinker", AddPreThinker},
    // player_obj:AddPostThinker(thinker)
    {"AddPostThinker", AddPostThinker},
    {NULL, NULL}
};
int Lua_PlayerLib(lua_State *L)
{
    // Create the metatable and fill it with the stuff from playerlib_metamethods.
    // Every time a player object is pushed into Lua (via player_head() or similar)
    // this metatable will get attached to it, allowing lua to see the __index,
    // __newindex, and __tostring metamethods for it.
    luaL_newmetatable(L, "player");
    luaL_setfuncs(L, playerlib_metamethods, 0);
    lua_pop(L, 1);
    // Create the method table and fill it.
    // We push the key we're going to be storing it in the registry under,
    // then the table itself, then store it into the registry.
    // (luaL_newlib is a two-argument macro: the state and the function list.)
    lua_pushliteral(L, "player-methods");
    luaL_newlib(L, playerlib_methods);
    lua_settable(L, LUA_REGISTRYINDEX);
    // Initialize the `player` library with the API functions.
    luaL_newlib(L, playerlib_api);
    // Set that table as the value of the global "player".
    // This also pops it, so we duplicate it first...
    lua_pushvalue(L, -1);
    lua_setglobal(L, "player");
    // ...so that we can also return it, so that constructs like
    // local player = require 'player'; work properly.
    return 1;
}
Breaking it down, this gives us three tables:
player, which holds the actual library API like player.head()
REGISTRY["player"], which holds the metatable shared by all player objects
  __tostring, which is invoked for prettyprinting
  __newindex, which is invoked for field writes
  __index, which is invoked for field reads (including method lookup!)
REGISTRY["player-methods"], which holds all the instance methods
  __index looks here for method implementations
As noted above, I've kept the table of metamethods and the table of methods separate in the hopes of minimizing conceptual confusion; idiomatic code would probably store all the methods and metamethods together, and use luaL_getmetafield() at the start of playerset() and playerget() to do method lookup.

How to modify a function's internal variables at runtime and pass it to another function?

Functions in Dart are first-class objects, allowing you to pass them to other objects or functions.
void main() {
  var shout = (msg) => ' ${msg.toUpperCase()} ';
  print(shout("yo"));
}
This made me wonder if there was a way to modify a function at run time, just like an object, prior to passing it to something else. For example:
int add(int input) {
  return input + 2;
}
If I wanted to make the function a generic addition function, then I would do:
int add(int input, int increment) {
  return input + increment;
}
But then the problem would be that the object I am passing the function to would need to specify the increment. I would like to pass the add function to another object, with the increment specified at run time, and declared within the function body so that the increment cannot be changed by the recipient of the function object.
The answer seems to be to use a lexical closure.
From here: https://dart.dev/guides/language/language-tour#built-in-types
A closure is a function object that has access to variables in its
lexical scope, even when the function is used outside of its original
scope.
Functions can close over variables defined in surrounding scopes. In
the following example, makeAdder() captures the variable addBy.
Wherever the returned function goes, it remembers addBy.
/// Returns a function that adds [addBy] to the
/// function's argument.
Function makeAdder(int addBy) {
  return (int i) => addBy + i;
}
void main() {
  // Create a function that adds 2.
  var add2 = makeAdder(2);
  // Create a function that adds 4.
  var add4 = makeAdder(4);
  assert(add2(3) == 5);
  assert(add4(3) == 7);
}
In the above cases, we pass 2 or 4 into the makeAdder function. The makeAdder function uses the parameter to create and return a function object that can be passed to other objects.
You most likely don't need to modify a closure; you just need the ability to create customized closures.
The latter is simple:
int Function(int) makeAdder(int increment) => (int value) => value + increment;
...
foo(makeAdder(1)); // Adds 1.
foo(makeAdder(4)); // Adds 4.
You can't change which variables a closure is referencing, but you can change their values ... if you can access the variable. For local variables, that's actually hard.
Mutating state which makes an existing closure change behavior can sometimes be appropriate, but those functions should be very precise about how they change and where they are being used. For a function like add which is used for its behavior, changing the behavior is rarely a good idea. It's better to replace the closure in the specific places that need to change behavior, and not risk changing the behavior in other places which happen to depend on the same closure. Otherwise it becomes very important to control where the closure actually flows.
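To make the trade-off concrete, here is a minimal sketch, written in JavaScript rather than Dart, that contrasts mutating shared state with replacing a closure in one place. The names `consumerA` through `consumerD` and `sharedIncrement` are hypothetical stand-ins for "places that depend on the closure".

```javascript
// Factory for independent closures, analogous to makeAdder above.
const makeAdder = (increment) => (value) => value + increment;

// Shared mutable state: every holder of `sharedAdd` sees the change.
let sharedIncrement = 1;
const sharedAdd = (value) => value + sharedIncrement;

const consumerA = { add: sharedAdd };
const consumerB = { add: sharedAdd };
sharedIncrement = 2; // intended for A only, but B changes too
console.log(consumerA.add(10), consumerB.add(10)); // 12 12

// Replacing the closure: only the consumer that is reassigned changes.
const consumerC = { add: makeAdder(1) };
const consumerD = { add: makeAdder(1) };
consumerC.add = makeAdder(2);
console.log(consumerC.add(10), consumerD.add(10)); // 12 11
```

In the second half, nothing that consumerD holds can be affected by the change made for consumerC, which is the property the paragraph above argues for.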
If you still want to change the behavior of an existing closure, you need to change a variable that it depends on.
Globals are easy:
int increment = 1;
int globalAdd(int value) => value + increment;
...
foo(globalAdd); // Adds 1.
increment = 2;
foo(globalAdd); // Adds 2.
I really can't recommend mutating global variables. It scales rather badly. You have no control over anything.
Another option is to use an instance variable to hold the modifiable value.
class MakeAdder {
  int increment = 1;
  int instanceAdd(int value) => value + increment;
}
...
var makeAdder = MakeAdder();
var adder = makeAdder.instanceAdd;
...
foo(adder); // Adds 1.
makeAdder.increment = 2;
foo(adder); // Adds 2.
That gives you much more control over who can access the increment variable. You can create multiple independent mutable adders without them stepping on each other's toes.
To modify a local variable, you need someone to give you access to it, from inside the function where the variable is visible.
int Function(int) makeAdder(void Function(void Function(int)) setIncrementCallback) {
  var increment = 1;
  setIncrementCallback((v) {
    increment = v;
  });
  return (value) => value + increment;
}
...
late void Function(int) setIncrement; // `late` (null-safe Dart): assigned by the callback below
int Function(int) localAdd = makeAdder((inc) { setIncrement = inc; });
...
foo(localAdd); // Adds 1.
setIncrement(2);
foo(localAdd); // Adds 2.
This is one way of passing back a way to modify the local increment variable.
It's almost always far too complicated an approach for what it gives you; I'd go with the instance variable instead.
Often, the instance variable will actually represent something in your model, some state which can meaningfully change, and then it becomes predictable and understandable when and how the state of the entire model changes, including the functions referring to that model.
Using partial function application
You can use a partial function application to bind arguments to functions.
If you have something like:
int add(int input, int increment) => input + increment;
and want to pass it to another function that expects to supply fewer arguments:
int foo(int Function(int input) applyIncrement) => applyIncrement(10);
then you could do:
foo((input) => add(input, 2)); // `increment` is fixed to 2
foo((input) => add(input, 4)); // `increment` is fixed to 4
Using callable objects
Another approach would be to make a callable object:
class Adder {
  int increment = 0;
  int call(int input) => input + increment;
}
which could be used with the same foo function above:
var adder = Adder()..increment = 2;
print(foo(adder)); // Prints: 12
adder.increment = 4;
print(foo(adder)); // Prints: 14

Accessing <objc/runtime.h> from Cycript

I want to be able to use associated objects and ISA swizzling, but I can't figure out how to import objc/runtime.h for use with Cycript. I have tried in both the console and in .js files but no luck.
Ideally I'd like to figure out how to include frameworks as well.
It seems like a subset of runtime.h is included by default in the Cycript environment. For example, class_copyMethodList and objc_getClass work without any added effort.
var count = new new Type(@encode(int));
var methods = class_copyMethodList(objc_getClass("NSObject"), count);
However, objc_setAssociatedObject and objc_getAssociatedObject are not defined:
objc_getAssociatedObject(someVar, "asdf")
#ReferenceError: Can't find variable: objc_getAssociatedObject
After a lot of searching, I realized the answer was right under my nose. limneos's weak_classdump uses the runtime to do its dump, and Cycript's tutorial shows how to grab C functions.
The solution I ended up with is this:
function setAssociatedObject(someObject, someValue, constVoidPointer) {
    // Look up the C function at runtime and give it a callable signature
    var SetAssociatedObject = @encode(void(id, const void*, id, unsigned long))(dlsym(RTLD_DEFAULT, "objc_setAssociatedObject"))
    SetAssociatedObject(someObject, constVoidPointer, someValue, 1) // 1 = OBJC_ASSOCIATION_RETAIN_NONATOMIC
}
function getAssociatedObject(someObject, constVoidPointer) {
    var GetAssociatedObject = @encode(id(id, const void*))(dlsym(RTLD_DEFAULT, "objc_getAssociatedObject"))
    return GetAssociatedObject(someObject, constVoidPointer)
}
It is used like this:
// create void pointer (probably should be a global variable for later retrieval)
voidPtr = new new Type(@encode(const void))
someVar = [[NSObject alloc] init]
setAssociatedObject(someVar, @[@"hello", @"world"], voidPtr)
getAssociatedObject(someVar, voidPtr)
// spits out @["hello","world"]

Any max and argmax?

I want to use max and argmax on collections. I saw an issue for max but not argmax and it looks different to what I had in mind. Everything here is also applicable to min and argmin. Example of code with equivalent behaviour (minus error handling):
import 'dart:html';
import 'dart:math' as math;
void main() {
  final nums = [3, 1, 2];
  final animalNames = ['cat', 'turtle', 'sheep'];
  // fold is the current name of the "reduce with an initial value" operation
  final highest = nums.fold(nums[0], (stored, curr) => math.max(stored, curr));
  final longest = animalNames.fold(animalNames[0], longerString);
  print('highest: $highest');
  print('longest: $longest');
}
String longerString(final String first, final String second) {
  if (first.length < second.length) {
    return second;
  } else {
    return first;
  }
}
I've been searching the API but haven't found anything like:
final highest = nums.max;
final longest = animalNames.argmax((name) => name.length);
Similar to Ruby max and max_by.
Questions:
Are there API calls like these already (under some name I haven't checked)?
Are there any plans to make them?
Should I raise an issue?
I don't think there is, but here's a pretty trivial implementation based on your example:
animalNames.fold("", (prev, cur) => prev.length > cur.length ? prev : cur);
This is faster than sorting if you only want to look it up once, but this can get a little unwieldy if you have complex logic for your argmax.
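One way to keep the argmax logic from getting unwieldy is to factor the fold into a helper that takes a key function. The sketch below is in JavaScript rather than Dart, and `maxBy` is a hypothetical helper name, not an existing library call; it assumes keys can be compared with `>` and that the first of several tied items should win.

```javascript
// Returns the item whose key(item) is largest; throws on an empty list.
function maxBy(items, key) {
  if (items.length === 0) throw new RangeError("maxBy of empty list");
  // Single pass: keep the current best, replace it only on a strictly larger key.
  return items.reduce((best, cur) => (key(cur) > key(best) ? cur : best));
}

const nums = [3, 1, 2];
const animalNames = ["cat", "turtle", "sheep"];

console.log(maxBy(nums, (n) => n));                    // 3
console.log(maxBy(animalNames, (name) => name.length)); // turtle
```

For brevity the key function is re-evaluated for the current best on every step; if computing the key is expensive, the helper could track the best key alongside the best item instead.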
In one of the Dartisans videos, one of the devs mentioned how they're trying to make a lot of the common patterns easy, so I think something like this would have a pretty good chance of making it into the standard library if you make a good enough case for it.

Firefox extension javascript module: what happens to unexported symbols?

I'm just starting to write my first FF extension using javascript modules (rather than trying an XPCOM component) but I'm fuzzy on what happens when a jsm is loaded.
Q: Does the file scope act as a closure for non-exported symbols, or are unreferenced symbols simply garbage collected?
For an example, could/should a module be written as follows?
//modules/myModule.js
var EXPORTED_SYMBOLS = [ 'foo', 'bar' ];
var data;
function foo(){
    return data;
}
function bar(newData){
    data = newData;
}
importing it somewhere else as:
var aNS = {};
Components.utils.import("resource://myext/myModule.js", aNS);
aNS.bar('it works?');
alert(aNS.foo()); //alert: 'it works?'
Even if a module can be written this way, is there a good reason not to?
It acts like a closure, but only if you're referencing it somewhere. In my extension I have something like:
var EXPORTED_SYMBOLS = ['foo'];
let cacheService = Components.classes["@mozilla.org/network/cache-service;1"]
                             .getService(Components.interfaces.nsICacheService);
let foo = {
    svc : cacheService,
    dosomethingwithit : function(){this.svc.somemethod();}
};
So because it is referenced by foo.svc, my cacheService is alive and well. If I weren't referencing it anywhere, it would have been garbage collected, which is no surprise: if it's not used, who cares?
But now, thinking a bit more about it, I'm wondering why I did it this way. It doesn't really make much sense, or difference. I could have had something like:
var EXPORTED_SYMBOLS = ['foo'];
function something(){
    this.svc = Components.classes["@mozilla.org/network/cache-service;1"]
                         .getService(Components.interfaces.nsICacheService);
}
let foo = new something();
I think I just liked the looks of the first approach more.
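The closure behaviour asked about in the question can also be tried outside Firefox. This is a rough sketch in plain JavaScript, under the assumption that a function scope is a fair stand-in for a module's file scope: the IIFE body plays the role of the .jsm file, and the returned object plays the role of the names listed in EXPORTED_SYMBOLS. `myModule` is a hypothetical name.

```javascript
// Stand-in for the module file: `data` is not exported, but it stays alive
// because the exported functions foo and bar still reference it.
const myModule = (function () {
  let data;
  function foo() { return data; }
  function bar(newData) { data = newData; }
  return { foo, bar };
})();

myModule.bar("it works?");
console.log(myModule.foo());      // it works?
console.log("data" in myModule);  // false: only reachable through foo/bar
```

A value that nothing exported refers to would have no remaining references once the scope finishes running, which matches the garbage-collection remark above.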
