Vala string processing corrupts memory. Why and how to avoid? - glib

I'm not sure whether I'm misusing Vala or GLib.Regex, because I'm new to both. I've created a minimal example that reproduces the error. From the following code, I'd expect it to print a INPUTX b six times, prefixed alternately with source and result:
public class Test
{
public static void run( string src )
{
var regex = new Regex( "INPUT[0-9]" );
for( int i = 0; i < 3; ++i )
{
stdout.printf( @"-- source: $src\n" );
src = regex.replace( src, -1, 0, "value" );
stdout.printf( @"-- result: $src\n\n" );
}
}
public static void main()
{
Test.run( "a INPUTX b" );
}
}
I wrote this code based on the example in the docs. However, after compiling with valac Test.vala --pkg glib-2.0 and running, I get:
-- source: a INPUTX b
-- result: a INPUTX b
-- source: -- source:
-- result: N�
-- source: -- source:
-- result: PN�
What am I doing wrong?

After looking into the generated C code, I concluded that this is rather a Vala-related issue: Vala puts a g_free at the end of the loop's body, which frees the memory returned by g_regex_replace while src still refers to it. But why does Vala do that?
The reason is that "arguments are, by default, unowned."
Hence, when we assign the string returned by regex.replace to the unowned string src, that reference is "not recorded in the object", and the Vala compiler considers it safe to free. It is not quite clear, though, why this happens specifically at the end of the loop's body.
So the straightforward solution is to declare the src argument as owned, that is, to change the signature to run( owned string src ).

Consider this (nonsense) code:
string foo (string s)
{
return s;
}
void run (string src)
{
var regex = new Regex( "INPUT[0-9]" );
for( int i = 0; i < 3; ++i )
{
stdout.printf( @"-- source: $src\n" );
//src = regex.replace( src, -1, 0, "value" );
src = foo (src);
stdout.printf( @"-- result: $src\n\n" );
}
}
void main ()
{
run( "a INPUTX b" );
}
The Vala compiler (rightfully) complains:
test.vala:13.2-13.16: error: Invalid assignment from owned expression to unowned variable
src = foo (src);
^^^^^^^^^^^^^^^
So there must be something different for methods from vapi files, since it allows the call to Regex.replace ().
I smell a bug somewhere (either in the compiler or the vapi), but I'm not sure.

Related

recursive functions within the class

The code below worked fine for me as a standalone function:
chunkBySize(List list, int size) => list.isEmpty
? list
: ([list.take(size)]..addAll(chunkBySize(list.skip(size), size)));
And I was able to call it smoothly as:
void main() {
var list = new List();
list.addAll([1,2,3]);
print(chunkBySize(list, 2));
}
But when I tried to use it inside a class it failed, so I was forced to write it as follows:
import 'dart:collection';
class functionalList<E> extends ListBase<E> {
final List<E> l = []; // or List l = new List();
functionalList();
.
.
.
chunkBySize(int size) => _chunkBySize(l, size);
_chunkBySize(List list, int size) => list.isEmpty
? list
: ([list.take(size)]..addAll(_chunkBySize(list.skip(size), size)));
}
and was able to call it by:
void main() {
var list = new functionalList();
list.addAll([1,2,3]);
print(list.chunkBySize(2));
}
Is there a way to simplify it within the class body, i.e. to replace the two declarations below with a single one?
chunkBySize(int size) => _chunkBySize(l, size);
_chunkBySize(List list, int size) => list.isEmpty
? list
: ([list.take(size)]..addAll(_chunkBySize(list.skip(size), size)));
You can declare a local helper function inside the chunkBySize method.
class FunctionalList<E> extends ListBase<E> {
...
List<Iterable<E>> chunkBySize(int size) {
List<Iterable<E>> helper(Iterable<E> list, int size) => list.isEmpty
? const []
: ([list.take(size)]..addAll(helper(list.skip(size), size)));
return helper(this, size);
}
}
I don't find this particularly more readable than having an external helper function, but in this case it actually gives you better typing.
I'd probably write the function as follows (min comes from dart:math):
List<Iterable<E>> chunkBySize(int size) {
List<Iterable<E>> result = [];
for (int i = 0; i < this.length; i += size) {
result.add(this.getRange(i, min(i + size, this.length)));
}
return result;
}
This is both easier to read and more efficient.
The recursive function that repeatedly calls addAll has time complexity that is quadratic in the length of the original list, because it keeps copying the elements from one list to another.
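For comparison, the same linear-time idea can be sketched in Java rather than Dart. This is only an illustration under assumptions: the class name Chunker is hypothetical, and subList stands in for getRange in the loop above, returning a view of the range instead of copying elements.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, for illustration only.
final class Chunker {
    // Splits list into consecutive views of at most size elements.
    // Each element is visited once, so the work is linear in list.size().
    static <E> List<List<E>> chunkBySize(List<E> list, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        List<List<E>> result = new ArrayList<>();
        for (int i = 0; i < list.size(); i += size) {
            // subList returns a view backed by list; nothing is copied here.
            result.add(list.subList(i, Math.min(i + size, list.size())));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(chunkBySize(Arrays.asList(1, 2, 3), 2)); // prints [[1, 2], [3]]
    }
}
```

Because the chunks are views, later changes to the original list show through them; copy each chunk if independent lists are needed.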

How to keep track of a variable with Clang's static analyzer?

Suppose I'm working with the following C snippet:
void inc(int *num) {(*num)++;}
void dec(int *num) {(*num)--;}
void f(int var) {
inc(&var);
dec(&var);
}
By using a static analyzer, I want to be able to tell whether the value of var changed during the function's execution. I know I have to keep its state on my own (that's the point of writing a Clang checker), but I'm having trouble getting a unique reference to this variable.
For example: if I use the following API
void MySimpleChecker::checkPostCall(const CallEvent &Call,
CheckerContext &C) const {
SymbolRef MyArg = Call.getArgSVal(0).getAsSymbol();
}
I'd expect it to return a pointer to this symbol's representation in my checker's context. However, MyArg always ends up null when I use it this way. This happens for both inc and dec, in both the pre-call and post-call callbacks.
What am I missing here? What concepts did I get wrong?
Note: I'm currently reading the Clang CFE Internals Manual and I've read the excellent How to Write a Checker in 24 Hours material. I still couldn't find my answer so far.
Interpretation of question
Specifically, you want to count the calls to inc and dec applied to each variable and report when they do not balance for some path in a function.
Generally, you want to know how to associate an abstract value, here a number, with a program variable, and be able to update and query that value along each execution path.
High-level answer
Whereas the tutorial checker SimpleStreamChecker.cpp associates an abstract value with the value stored in a variable, here we want to associate an abstract value with the variable itself. That is what IteratorChecker.cpp does when tracking containers, so I based my solution on it.
Within the static analyzer's abstract state, each variable is represented by a MemRegion object. So the first step is to make a map where MemRegion is the key:
REGISTER_MAP_WITH_PROGRAMSTATE(TrackVarMap, MemRegion const *, int)
Next, when we have an SVal that corresponds to a pointer to a variable, we can use SVal::getAsRegion to get the corresponding MemRegion. For instance, given a CallEvent, call, with a first argument that is a pointer, we can do:
if (MemRegion const *region = call.getArgSVal(0).getAsRegion()) {
to get the region that the pointer points at.
Then, we can access our map using that region as its key:
state = state->set<TrackVarMap>(region, newValue);
Finally, in checkDeadSymbols, we use SymbolReaper::isLiveRegion to detect when a region (variable) is going out of scope:
const TrackVarMapTy &Map = state->get<TrackVarMap>();
for (auto const &I : Map) {
MemRegion const *region = I.first;
int delta = I.second;
if (SymReaper.isLiveRegion(region) || (delta==0))
continue; // Not dead, or unchanged; skip.
Complete example
To demonstrate, here is a complete checker that reports unbalanced use of inc and dec:
// TrackVarChecker.cpp
// https://stackoverflow.com/questions/23448540/how-to-keep-track-of-a-variable-with-clangs-static-analyzer
#include "clang/StaticAnalyzer/Checkers/BuiltinCheckerRegistration.h"
#include "clang/StaticAnalyzer/Core/BugReporter/BugType.h"
#include "clang/StaticAnalyzer/Core/Checker.h"
#include "clang/StaticAnalyzer/Core/CheckerManager.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/ProgramStateTrait.h"
using namespace clang;
using namespace ento;
namespace {
class TrackVarChecker
: public Checker< check::PostCall,
check::DeadSymbols >
{
mutable IdentifierInfo *II_inc, *II_dec;
mutable std::unique_ptr<BuiltinBug> BT_modified;
public:
TrackVarChecker() : II_inc(nullptr), II_dec(nullptr) {}
void checkPostCall(CallEvent const &Call, CheckerContext &C) const;
void checkDeadSymbols(SymbolReaper &SymReaper, CheckerContext &C) const;
};
} // end anonymous namespace
// Map from memory region corresponding to a variable (that is, the
// variable itself, not its current value) to the difference between its
// current and original value.
REGISTER_MAP_WITH_PROGRAMSTATE(TrackVarMap, MemRegion const *, int)
void TrackVarChecker::checkPostCall(CallEvent const &call, CheckerContext &C) const
{
const FunctionDecl *FD = dyn_cast_or_null<FunctionDecl>(call.getDecl());
if (!FD || FD->getKind() != Decl::Function) {
return;
}
ASTContext &Ctx = C.getASTContext();
if (!II_inc) {
II_inc = &Ctx.Idents.get("inc");
}
if (!II_dec) {
II_dec = &Ctx.Idents.get("dec");
}
if (FD->getIdentifier() == II_inc || FD->getIdentifier() == II_dec) {
// We expect the argument to be a pointer. Get the memory region
// that the pointer points at.
if (MemRegion const *region = call.getArgSVal(0).getAsRegion()) {
// Increment the associated value, creating it first if needed.
ProgramStateRef state = C.getState();
int delta = (FD->getIdentifier() == II_inc)? +1 : -1;
int const *curp = state->get<TrackVarMap>(region);
int newValue = (curp? *curp : 0) + delta;
state = state->set<TrackVarMap>(region, newValue);
C.addTransition(state);
}
}
}
void TrackVarChecker::checkDeadSymbols(
SymbolReaper &SymReaper, CheckerContext &C) const
{
ProgramStateRef state = C.getState();
const TrackVarMapTy &Map = state->get<TrackVarMap>();
for (auto const &I : Map) {
// Check for a memory region (variable) going out of scope that has
// a non-zero delta.
MemRegion const *region = I.first;
int delta = I.second;
if (SymReaper.isLiveRegion(region) || (delta==0)) {
continue; // Not dead, or unchanged; skip.
}
//llvm::errs() << region << " dead with delta " << delta << "\n";
if (ExplodedNode *N = C.generateNonFatalErrorNode()) {
if (!BT_modified) {
BT_modified.reset(
new BuiltinBug(this, "Delta not zero",
"Variable changed from its original value."));
}
C.emitReport(llvm::make_unique<BugReport>(
*BT_modified, BT_modified->getDescription(), N));
}
}
}
void ento::registerTrackVarChecker(CheckerManager &mgr) {
mgr.registerChecker<TrackVarChecker>();
}
bool ento::shouldRegisterTrackVarChecker(const LangOptions &LO) {
return true;
}
To hook this in to the rest of Clang, add entries to:
clang/include/clang/StaticAnalyzer/Checkers/Checkers.td and
clang/lib/StaticAnalyzer/Checkers/CMakeLists.txt
Example input to test it:
// trackvar.c
// Test for TrackVarChecker.
// The behavior of these functions is hardcoded in the checker.
void inc(int *num);
void dec(int *num);
void call_inc(int var) {
inc(&var);
} // reported
void call_inc_dec(int var) {
inc(&var);
dec(&var);
} // NOT reported
void if_inc(int var) {
if (var > 2) {
inc(&var);
}
} // reported
void indirect_inc(int val) {
int *p = &val;
inc(p);
} // reported
Sample run:
$ gcc -E -o trackvar.i trackvar.c
$ ~/bld/llvm-project/build/bin/clang -cc1 -analyze -analyzer-checker=alpha.core.TrackVar trackvar.i
trackvar.c:10:1: warning: Variable changed from its original value
}
^
trackvar.c:21:1: warning: Variable changed from its original value
}
^
trackvar.c:26:1: warning: Variable changed from its original value
}
^
3 warnings generated.
I think you missed the check that this call event is a call to your function inc/dec. You should have something like
void MySimpleChecker::checkPostCall(const CallEvent &Call,
CheckerContext &C) const {
const IdentifierInfo* callee = Call.getCalleeIdentifier();
if (!callee)
return; // e.g. a call through a function pointer has no identifier
if (callee->getName() == "inc" || callee->getName() == "dec") {
SymbolRef MyArg = Call.getArgSVal(0).getAsSymbol();
}
}

Code substitution for DSL using ANTLR

The DSL I'm working on allows users to define a 'complete text substitution' variable. When parsing the code, we then need to look up the value of the variable and start parsing again from that code.
The substitution can be very simple (single constants) or entire statements or code blocks.
This is a mock grammar which I hope illustrates my point.
grammar a;
entry
: (set_variable
| print_line)*
;
set_variable
: 'SET' ID '=' STRING_CONSTANT ';'
;
print_line
: 'PRINT' ID ';'
;
STRING_CONSTANT: '\'' ('\'\'' | ~('\''))* '\'' ;
ID: [a-z][a-zA-Z0-9_]* ;
VARIABLE: '&' ID;
BLANK: [ \t\n\r]+ -> channel(HIDDEN) ;
Then the following statements, executed consecutively, should be valid:
SET foo = 'Hello world!';
PRINT foo;
SET bar = 'foo;';
PRINT &bar // should be interpreted as 'PRINT foo;'
SET baz = 'PRINT foo; PRINT'; // one complete statement and one incomplete statement
&baz foo; // should be interpreted as 'PRINT foo; PRINT foo;'
Any time the & variable token is discovered, we immediately switch to interpreting the value of that variable instead. As above, this can mean that you set up the code in such a way that it is invalid on its own, full of half-statements that are only completed when the value is just right. The variables can be redefined at any point in the text.
Strictly speaking the current language definition doesn't disallow nesting &vars inside each other, but the current parsing doesn't handle this and I would not be upset if it wasn't allowed.
Currently I'm building an interpreter using a visitor, but this one I'm stuck on.
How can I build a lexer/parser/interpreter which will allow me to do this? Thanks for any help!
So I have found one solution to the issue. I think it could be better - as it potentially does a lot of array copying - but at least it works for now.
EDIT: I was wrong before, and my solution would consume ANY & that it found, including those in valid locations such as inside string constants. This seems like a better solution:
First, I extended the InputStream so that it is able to rewrite the input stream when a & is encountered. This unfortunately involves copying the array, which I can maybe resolve in the future:
MacroInputStream.java
package preprocessor;
import java.util.HashMap;
import org.antlr.v4.runtime.ANTLRInputStream;
public class MacroInputStream extends ANTLRInputStream {
private HashMap<String, String> map;
public MacroInputStream(String s, HashMap<String, String> map) {
super(s);
this.map = map;
}
public void rewrite(int startIndex, int stopIndex, String replaceText) {
int length = stopIndex-startIndex+1;
char[] replData = replaceText.toCharArray();
if (replData.length == length) {
for (int i = 0; i < length; i++) data[startIndex+i] = replData[i];
} else {
char[] newData = new char[data.length+replData.length-length];
System.arraycopy(data, 0, newData, 0, startIndex);
System.arraycopy(replData, 0, newData, startIndex, replData.length);
System.arraycopy(data, stopIndex+1, newData, startIndex+replData.length, data.length-(stopIndex+1));
data = newData;
n = data.length;
}
}
}
Secondly, I extended the Lexer so that when a VARIABLE token is encountered, the rewrite method above is called:
MacroGrammarLexer.java
package language;
import language.DSL_GrammarLexer;
import preprocessor.MacroInputStream;
import org.antlr.v4.runtime.Token;
import java.util.HashMap;
public class MacroGrammarLexer extends DSL_GrammarLexer {
private HashMap<String, String> map;
public MacroGrammarLexer(MacroInputStream input, HashMap<String, String> map) {
super(input);
this.map = map;
// TODO Auto-generated constructor stub
}
private MacroInputStream getInput() {
return (MacroInputStream) _input;
}
@Override
public Token nextToken() {
Token t = super.nextToken();
if (t.getType() == VARIABLE) {
System.out.println("Encountered token " + t.getText()+" ===> rewriting!!!");
getInput().rewrite(t.getStartIndex(), t.getStopIndex(),
map.get(t.getText().substring(1)));
getInput().seek(t.getStartIndex()); // reset input stream to previous
return super.nextToken();
}
return t;
}
}
Lastly, I modified the generated parser to set the variables at the time of parsing:
DSL_GrammarParser.java
...
...
HashMap<String, String> map; // same map as before, passed as a new argument.
...
...
public final SetContext set() throws RecognitionException {
SetContext _localctx = new SetContext(_ctx, getState());
enterRule(_localctx, 130, RULE_set);
try {
enterOuterAlt(_localctx, 1);
{
String vname = null; String vval = null; // set up variables
setState(1215); match(SET);
setState(1216); vname = variable_name().getText(); // set vname
setState(1217); match(EQUALS);
setState(1218); vval = string_constant().getText(); // set vval
System.out.println("Found SET " + vname +" = " + vval+";");
map.put(vname, vval);
}
}
catch (RecognitionException re) {
_localctx.exception = re;
_errHandler.reportError(this, re);
_errHandler.recover(this, re);
}
finally {
exitRule();
}
return _localctx;
}
...
...
Unfortunately this method is final so this will make maintenance a bit more difficult, but it works for now.
The standard pattern for handling your requirements is to implement a symbol table. The simplest form is a key:value store. In your visitor, add variable declarations as they are encountered, and read out the values as variable references are encountered.
As described, your DSL does not define a scoping requirement on the variables declared. If you do require scoped variables, then use a stack of key:value stores, pushing and popping on scope entry and exit.
See this related StackOverflow answer.
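The stack of key:value stores described above could look roughly like this in Java. It is a minimal sketch, not ANTLR API: ScopedSymbolTable and its methods are hypothetical names, and values are kept as plain strings, as in the SET statements of the question.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of ANTLR: a stack of key:value stores.
final class ScopedSymbolTable {
    // The innermost scope sits at the head of the deque.
    private final Deque<Map<String, String>> scopes = new ArrayDeque<>();

    ScopedSymbolTable() {
        scopes.push(new HashMap<>()); // global scope
    }

    void enterScope() {
        scopes.push(new HashMap<>());
    }

    void exitScope() {
        if (scopes.size() == 1) {
            throw new IllegalStateException("cannot exit the global scope");
        }
        scopes.pop();
    }

    // Declares or redefines a variable in the innermost scope.
    void define(String name, String value) {
        scopes.peek().put(name, value);
    }

    // Looks a name up from the innermost scope outwards; null if undefined.
    String resolve(String name) {
        for (Map<String, String> scope : scopes) {
            String value = scope.get(name);
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ScopedSymbolTable table = new ScopedSymbolTable();
        table.define("foo", "Hello world!");
        table.enterScope();
        table.define("foo", "shadowed");
        System.out.println(table.resolve("foo")); // prints shadowed
        table.exitScope();
        System.out.println(table.resolve("foo")); // prints Hello world!
    }
}
```

In a visitor, the method handling set_variable would call define and the one handling a variable reference would call resolve; enterScope and exitScope only matter if the DSL gains block scoping.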
Separately, since your strings may contain commands, you can simply parse the contents as part of your initial parse. That is, expand your grammar with a rule that includes the full set of valid contents:
set_variable
: 'SET' ID '=' stringLiteral ';'
;
stringLiteral:
Quote Quote? (
( set_variable
| print_line
| VARIABLE
| ID
)
| STRING_CONSTANT // redefine without the quotes
)
Quote
;

Unit test hanging when using lexical scoping and generics with extends

The behavior seems to be related to the presence of 'extends' as shown with unit test below:
typedef dynamic GetFromThing<T extends Thing>(T target);
typedef GetFromThing<T> DefGetFromThing<T extends Thing>(dynamic def);
typedef dynamic GetFromT<T>(T target);
typedef GetFromT<T> DefGetFromT<T>(dynamic def);
class Thing {
int value;
}
class Test {
static final GetFromThing<Thing> fromThingSimple = (Thing target) {
return target.value;
};
static final DefGetFromThing<Thing> fromThing = (dynamic def) {
return (target) => null;
};
static final DefGetFromT<int> fromInt = (dynamic def) {
return (target) => null;
};
}
main() {
test('this works', () {
var temp1 = Test.fromThingSimple(new Thing());
});
test('this works too', () {
var temp = Test.fromInt(10);
});
test('should let me call lexically closed functions', () {
var temp = Test.fromThing(10); // <-- causes test to hang
});
}
The fact that the VM hangs is clearly a bug. The code is legal. The fact that typedefs describe function types and can be generic whereas function types themselves are never generic is not an issue in principle (though it might be for the implementation).
I find it very interesting that type parameters in typedefs work without some kind of warning or error, since Dart doesn't have generic methods.
You may very well have come across two bugs here: the first is that there are no errors, and the second is that the VM hangs.

Assign function/method to variable in Dart

Does Dart support the concept of variable functions/methods, that is, calling a function or method by a name stored in a variable?
For example in PHP this can be done not only for methods:
// With functions...
function foo()
{
echo 'Running foo...';
}
$function = 'foo';
$function();
// With classes...
public static function factory($view)
{
$class = 'View_' . ucfirst($view);
return new $class();
}
I did not find it in the language tour or the API docs. Are there other ways to do something like this?
To store the name of a function in a variable and call it later, you will have to wait until reflection arrives in Dart (or get creative with noSuchMethod). You can, however, store functions directly in variables, as in JavaScript:
main() {
var f = (String s) => print(s);
f("hello world");
}
and even declare them inline, which comes in handy if you are doing recursion:
main() {
g(int i) {
if(i > 0) {
print("$i is larger than zero");
g(i-1);
} else {
print("zero or negative");
}
}
g(10);
}
The stored functions can then be passed around to other functions:
main() {
var function;
function = (String s) => print(s);
doWork(function);
}
doWork(f(String s)) {
f("hello world");
}
The following example may give a broader picture of assigning a function to a variable and of passing a closure as an argument to another function.
void main() {
// a closure assigned to a variable
var fun = (int n) => n * 2;
// the result of calling newFunc (defined below) with that closure
var newFuncResult = newFunc(9, fun);
print(newFuncResult); // Output: 27
}
// A function with two parameters: an int and a closure
int newFunc(int a, fun) {
int x = a;
int y = fun(x);
return x + y;
}

Resources