I am looking into calculating the cyclomatic complexity of Java methods using Rascal.
One approach would be:
get the AST of the method
use the visit pattern on this tree
count the following keywords, each of which increases the CC by one: case, catch, do, while, if, for, foreach
Another approach uses graph theory and the formula e - n + 2, where e is the number of edges and n the number of nodes of the control flow graph.
Both e and n can be obtained quite easily using Rascal functionality. My problem is how to go about constructing the control flow graph. I found the following module:
analysis::flow::ControlFlow, which seems to be a step in the right direction, but I am totally lost on where to go from there.
The easiest way is indeed counting the forking nodes on the AST.
In our publication, which shows that SLOC and CC do not correlate strongly with each other (preprint), we also shared our Rascal code to calculate CC (see Figure 2).
Here is the code extracted from the article. First create the file's AST with M3 and search for all methods/code blocks in the file. For each method, call this Rascal function, which visits the AST and counts certain nodes.
int calcCC(Statement impl) {
    int result = 1;
    visit (impl) {
        case \if(_,_) : result += 1;
        case \if(_,_,_) : result += 1;
        case \case(_) : result += 1;
        case \do(_,_) : result += 1;
        case \while(_,_) : result += 1;
        case \for(_,_,_) : result += 1;
        case \for(_,_,_,_) : result += 1;
        case \foreach(_,_,_) : result += 1;
        case \catch(_,_): result += 1;
        case \conditional(_,_,_): result += 1;
        case \infix(_,"&&",_) : result += 1;
        case \infix(_,"||",_) : result += 1;
    }
    return result;
}
For constructing control flow graphs in Rascal, there is a paper that explains how to do it in pure Rascal and how to raise the abstraction level using a declarative language called DCFlow: http://link.springer.com/chapter/10.1007%2F978-3-319-11245-9_18
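For the graph-based formula from the question, the arithmetic itself is small once a control flow graph exists. The sketch below is in Python rather than Rascal and does not use analysis::flow::ControlFlow; the dictionary-based graph representation and the function name cyclomatic_complexity are assumptions made for illustration.

```python
def cyclomatic_complexity(cfg):
    """Return e - n + 2 for a control flow graph given as {node: [successors]}."""
    # Collect every node, including ones that only appear as successors.
    nodes = set(cfg)
    for successors in cfg.values():
        nodes.update(successors)
    # Each entry in a successor list is one directed edge.
    edges = sum(len(successors) for successors in cfg.values())
    return edges - len(nodes) + 2


# Hypothetical CFG for: if (c) { a(); } else { b(); } return;
if_else_cfg = {
    "entry": ["cond"],
    "cond": ["then", "else"],
    "then": ["exit"],
    "else": ["exit"],
    "exit": [],
}

print(cyclomatic_complexity(if_else_cfg))
```

For this single if/else the graph has 5 edges and 5 nodes, so the result agrees with the node-counting approach (1 plus one forking node).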
I understand other approaches, such as using a stack or reversing the second half of the linked list. But what is wrong with my approach?
/**
 * Definition for singly-linked list.
 * public class ListNode {
 *     int val;
 *     ListNode next;
 *     ListNode() {}
 *     ListNode(int val) { this.val = val; }
 *     ListNode(int val, ListNode next) { this.val = val; this.next = next; }
 * }
 */
class Solution {
    public boolean isPalindrome(ListNode head) {
        if(head.next==null){return true;}
        while(head!=null){
            ListNode ptr=head, preptr=head;
            while(ptr.next!=null){ptr=ptr.next;}
            if(ptr==head){break;}
            while(preptr.next.next!=null){preptr=preptr.next;}
            if(head.val==ptr.val){
                preptr.next=null;
                head=head.next;
            }
            else{return false;}
        }
        return true;
    }
}
The following can be said about your solution:
It fails with an exception if head is null. To avoid that, you could just remove the first if statement. That case does not need separate handling: when the list is a single node, the first iteration executes the break, so you get true as the return value. Without that if statement, you no longer access head.next when head is null.
It mutates the given list. This is not very nice. The caller will not expect this to happen, and may need the original list for other purposes even after this call to isPalindrome.
It is slow. Its time complexity is quadratic. If this is part of a coding challenge, then the test data may be large, and the execution of your function may then exceed the allotted time.
Using a stack is indeed a solution, but it feels like cheating: then you might as well convert the whole list to an array and test whether the array is a palindrome using its direct addressing capabilities.
You can do this with just the list as follows:
1. Count the number of nodes in the list.
2. Use that to identify the first node of the second half of the list. If the number of nodes is odd, let this be the node after the center node.
3. Apply a list reversal algorithm on that second half. Now you have two shorter lists.
4. Check whether the values in those two lists are equal (ignore the center node if there was one). Remember the outcome (false or true).
5. Repeat step 3 so the reversal is rolled back, and the list is back in its original state.
6. Return the result that was found in step 4.
This takes linear time, and so for larger lists, this should outperform your solution.
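The steps above could be sketched as follows. This is in Python rather than Java, and the ListNode class, the helper names, and the from_list/to_list conveniences are assumptions for illustration, not part of the original problem statement.

```python
class ListNode:
    def __init__(self, val, next=None):
        self.val = val
        self.next = next


def reverse(head):
    """Reverse a singly linked list in place and return its new head."""
    prev = None
    while head is not None:
        nxt = head.next
        head.next = prev
        prev = head
        head = nxt
    return prev


def is_palindrome(head):
    # Step 1: count the nodes.
    n = 0
    node = head
    while node is not None:
        n += 1
        node = node.next
    if n < 2:
        return True

    # Step 2: walk to the last node of the first half
    # (this is the center node when n is odd).
    tail = head
    for _ in range((n + 1) // 2 - 1):
        tail = tail.next

    # Step 3: reverse the second half.
    second = reverse(tail.next)

    # Step 4: compare the two halves and remember the outcome.
    result = True
    left, right = head, second
    while right is not None:
        if left.val != right.val:
            result = False
            break
        left, right = left.next, right.next

    # Step 5: roll back the reversal so the caller's list is unchanged.
    tail.next = reverse(second)

    # Step 6: return the remembered outcome.
    return result


def from_list(values):
    """Build a linked list from a Python list (helper for the demo)."""
    head = None
    for value in reversed(values):
        head = ListNode(value, head)
    return head


def to_list(head):
    """Collect the values of a linked list into a Python list."""
    values = []
    while head is not None:
        values.append(head.val)
        head = head.next
    return values


demo = from_list([1, 2, 3, 2, 1])
print(is_palindrome(demo), to_list(demo))
```

Each node is visited a constant number of times, and the list is restored before returning, which addresses both the mutation and the quadratic running time noted above.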
I am using Python to insert *Include, Input=file.inp into the step load definition section, in order to apply a pressure boundary condition on nodes. Here is my script; however, the line ends up in the part-level section. I am wondering how to control the insert position using Python. Thanks
def GetKeywordPosition(myModel, blockPrefix, occurrence=1):
    # Return the index of the occurrence-th keyword block starting with blockPrefix
    if blockPrefix == '':
        return len(myModel.keywordBlock.sieBlocks)+1
    pos = 0
    foundCount = 0
    for block in myModel.keywordBlock.sieBlocks:
        # Case-insensitive comparison of the block's leading characters
        if block[0:len(blockPrefix)].lower() == blockPrefix.lower():
            foundCount = foundCount + 1
            if foundCount >= occurrence:
                return pos
        pos=pos+1
    return -1  # prefix not found

position = GetKeywordPosition(myModel, '*step')+24
myModel.keywordBlock.synchVersions(storeNodesAndElements=False)
myModel.keywordBlock.insert(position, "\n*INCLUDE, INPUT=file.inp")
You can use the re module. This should work:
import re
# Get keywordBlock object
kw_block = myModel.keywordBlock
kw_block.synchVersions(storeNodesAndElements=False)
sie_blocks = kw_block.sieBlocks
# Define keywords for the search (don't forget to escape special symbols with '\')
kw_list = [r'\*Step, name="My Step"']
# Find index
idx = 0
for kw in kw_list:
    r = re.compile(kw)
    # First block at or after idx that matches the pattern
    full_str = [s for s in sie_blocks[idx:] if r.match(s)][0]
    idx += sie_blocks[idx:].index(full_str)
UPD: Some explanations as requested
As keywords in the .inp file could be somewhat repetitive, the main idea here is to create a "search route", where the last pattern in the list will correspond to a place where you want to make your modifications (for example, if you want to find the "*End" keyword after a specific "*Instance" keyword).
So we proceed iteratively through our "search route" == list of search patterns:
Compile the regex expression;
Find the first appearance of the pattern in the sie_blocks starting from the index idx;
Update the idx so the next search is performed from this point.
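To make the search route concrete without an Abaqus session, here is a minimal sketch that runs the same idea on a plain list of strings standing in for sieBlocks. The function name find_block_index and the sample keyword blocks are hypothetical.

```python
import re


def find_block_index(blocks, patterns):
    """Follow a list of regex patterns through blocks; return the final index."""
    idx = 0
    for pattern in patterns:
        regex = re.compile(pattern, re.IGNORECASE)
        # Search only from the position reached by the previous pattern.
        for offset, block in enumerate(blocks[idx:]):
            if regex.match(block):
                idx += offset
                break
        else:
            raise ValueError("pattern not found: %r" % pattern)
    return idx


# Hypothetical stand-in for myModel.keywordBlock.sieBlocks
blocks = [
    "*Heading",
    "*Part, name=Part-1",
    "*End Part",
    "*Instance, name=Part-1-1, part=Part-1",
    "*End Instance",
    '*Step, name="My Step"',
    "*Static",
    "*End Step",
]

# Route: first the specific instance, then the "*End Instance" that follows it.
print(find_block_index(blocks, [r"\*Instance, name=Part-1-1", r"\*End Instance"]))
# Route with a single pattern: the named step.
print(find_block_index(blocks, [r'\*Step, name="My Step"']))
```

The returned index could then be used as the position argument of keywordBlock.insert, as in the question's script.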
Hope this will help
I'm trying to use Dafny with (unsigned) bitvectors (following this post).
The following simplified example (permalink) works fine, but when I change to bv32, I get:
Unexpected prover response: timeout
Is it a bug, or an expected performance gap between the two?
Here is the code to make this post self-contained:
method bitvectors()
{
    var a : bv16 := 0;
    // var a : bv32 := 0;
    ghost var atag := a;
    while (a<0xFFFF)
    // while (a<0xFFFFFFFF)
        invariant atag < 0xFFFF
        //invariant atag < 0xFFFFFFFF
    {
        atag := a;
        a := a+1;
    }
}
I'm hoping someone else has a better answer... but basically this is why I stay away from bitvectors :)
I did a little bit of digging, and it seems that on this particular example Dafny gets stuck in the termination check for the loop. At the Boogie level, comparing bitvectors involves converting them to mathematical integers, and then to real numbers, and then comparing those. It's pretty common for solvers to have trouble with these conversion functions, because they cut across different theories.
Sorry I couldn't be more helpful.
I have encountered a strange behavior of the torch.mm function in Lua/Torch. Here is a simple program that demonstrates the problem.
iteration = 0;
a = torch.Tensor(2, 2);
b = torch.Tensor(2, 2);
prod = torch.Tensor(2,2);
a:zero();
b:zero();
repeat
    prod = torch.mm(a,b);
    ent = prod[{2,1}];
    iteration = iteration + 1;
until ent ~= ent
print ("error at iteration " .. iteration);
print (prod);
The program consists of one loop, in which the program multiplies two zero 2x2 matrices and tests if entry ent of the product matrix is equal to nan. It seems that the program should run forever since the product should always be equal to 0, and hence ent should be 0. However, the program prints:
error at iteration 548
0.000000 0.000000
nan nan
[torch.DoubleTensor of size 2x2]
Why is this happening?
Update:
The problem disappears if I replace prod = torch.mm(a,b) with torch.mm(prod,a,b), which suggests that something is wrong with the memory allocation.
My version of Torch was compiled without BLAS & LAPACK libraries. After I recompiled torch with OpenBLAS, the problem disappeared. However, I am still interested in its cause.
The part of code that auto-generates the Lua wrapper for torch.mm can be found here.
When you write prod = torch.mm(a,b) within your loop it corresponds to the following C code behind the scenes (generated by this wrapper thanks to cwrap):
/* this is the tensor that will hold the results */
arg1 = THDoubleTensor_new();
THDoubleTensor_resize2d(arg1, arg5->size[0], arg6->size[1]);
arg3 = arg1;
/* .... */
luaT_pushudata(L, arg1, "torch.DoubleTensor");
/* effective matrix multiplication operation that will fill arg1 */
THDoubleTensor_addmm(arg1,arg2,arg3,arg4,arg5,arg6);
So:
a new result tensor is created and resized with the proper dimensions,
but this new tensor is NOT initialized, i.e. there is no calloc or explicit fill here so it points to junk memory and could contain NaN-s,
this tensor is pushed on the stack so as to be available on the Lua side as the return value.
The last point means that this returned tensor is different from the initial prod one (i.e. within the loop, prod shadows the initial value).
On the other hand calling torch.mm(prod,a,b) does use your initial prod tensor to store the results (behind the scenes there is no need to create a dedicated tensor in that case). Since in your code snippet you do not initialize / fill it with given values it could also contain junk.
In both cases the core operation is a gemm multiplication of the form C = beta * C + alpha * A * B, with beta=0 and alpha=1. The naive implementation looks like this:
real *a_ = a;
for(i = 0; i < m; i++)
{
    real *b_ = b;
    for(j = 0; j < n; j++)
    {
        real sum = 0;
        for(l = 0; l < k; l++)
            sum += a_[l*lda]*b_[l];
        b_ += ldb;
        /*
         * WARNING: beta*c[j*ldc+i] could give NaN even if beta=0
         * if the other operand c[j*ldc+i] is NaN!
         */
        c[j*ldc+i] = beta*c[j*ldc+i]+alpha*sum;
    }
    a_++;
}
Comments are mine.
So:
with torch.mm(a,b): at each iteration, a new result tensor is created without being initialized (it could contain NaN-s). So every iteration presents a risk of returning NaN-s (see above warning),
with torch.mm(prod,a,b): there is the same risk since you did not initialize the prod tensor. BUT: this risk only exists at the first iteration of the repeat / until loop, since right after that prod is filled with 0-s and re-used for the subsequent iterations.
So this is why you do not observe a problem here (it is less frequent).
In case 1: this should be improved at the Torch level, i.e. make sure the wrapper initializes the output (e.g. with THDoubleTensor_fill(arg1, 0);).
In case 2: you should initialize prod initially and use the torch.mm(prod,a,b) construct to avoid any NaN problem.
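The warning in the C comment can be reproduced in isolation. The following Python sketch (function names are hypothetical, and it is not Torch code) shows that multiplying NaN by a zero beta still yields NaN, together with one possible guard that skips reading c when beta is zero. The guard is only an illustration and is not necessarily how Torch addressed the issue.

```python
import math


def gemm_entry_naive(beta, c, alpha, s):
    """One output entry computed exactly like the naive loop: beta*c + alpha*s."""
    return beta * c + alpha * s


def gemm_entry_guarded(beta, c, alpha, s):
    """Same entry, but the old value of c is ignored when beta is zero."""
    if beta == 0:
        return alpha * s
    return beta * c + alpha * s


junk = float("nan")  # stands in for uninitialized memory that happens to hold NaN

naive = gemm_entry_naive(0.0, junk, 1.0, 0.0)
guarded = gemm_entry_guarded(0.0, junk, 1.0, 0.0)

print(math.isnan(naive))  # 0 * NaN is NaN under IEEE-754, so the NaN survives
print(guarded)            # the junk value is never read
```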
--
EDIT: this problem is now fixed (see this pull request).
In one of my SMT programs, I use a real term. I need to bound the precision of the real number to increase efficiency, since an almost infinite number of solutions are possible for this number, although only 5 or 6 digits after the decimal point are necessary. For example, the real number can take the following values, which are all the same if we take only the first seven digits after the decimal point.
1197325/13631488 = 0.087835238530......
19157213/218103808 = 0.087835298134......
153257613/1744830464 = 0.087835245980......
1226060865/13958643712 = 0.087835243186......
I want the SMT solver to consider all four of these numbers as a single number (so that the search space is reduced). Is there any way to control the precision of the real number?
I tried to solve this problem programmatically (using the Z3 .NET API), as shown below. Here DelBP[j] is a real term.
{
    BoolExpr[] _Exprs = new BoolExpr[nBuses];
    for (j = 1; j <= nBuses; j++)
    {
        _Exprs[j - 1] = z3.MkEq(DelBP[j], z3.MkDiv(z3.MkInt2Real(DelBP_A[j]), z3.MkInt2Real(DelBP_B[j])));
    }
    BoolExpr Expr = z3.MkAnd(_Exprs);
    s.Assert(Expr);
    tw.WriteLine("(assert {0})", Expr.ToString());
}
{
    BoolExpr[] _Exprs = new BoolExpr[nBuses];
    for (j = 1; j <= nBuses; j++)
    {
        _Exprs[j - 1] = z3.MkAnd(z3.MkGe(DelBP_A[j], z3.MkInt(1)),
                                 z3.MkLe(DelBP_A[j], z3.MkInt(10000)));
    }
    BoolExpr Expr = z3.MkAnd(_Exprs);
    s.Assert(Expr);
    tw.WriteLine("(assert {0})", Expr.ToString());
}
{
    BoolExpr[] _Exprs = new BoolExpr[nBuses];
    for (j = 1; j <= nBuses; j++)
    {
        _Exprs[j - 1] = z3.MkAnd(z3.MkGe(DelBP_B[j], z3.MkInt(1)),
                                 z3.MkLe(DelBP_B[j], z3.MkInt(10000)));
    }
    BoolExpr Expr = z3.MkAnd(_Exprs);
    s.Assert(Expr);
    tw.WriteLine("(assert {0})", Expr.ToString());
}
However, it did not work. Can anyone help me to solve this problem? Thank you in advance.
If you feel the need to control the "precision" of real-numbers, then that strongly suggests Real is not the correct domain for your problem. Some ideas, depending on what you're really trying to do:
If 6 digits past the decimal point is all you care about, then you might get away with using plain Integers, multiplying everything by 1e6 and restricting all variables to be less than 1e6; or some other similar transformation.
Keep in mind that Z3 has support for IEEE-floating point numbers these days, which are by definition of limited precision. So you can use those if your domain is truly the floating-point numbers as prescribed by IEEE-754.
If you're trying to generate "successive" results, i.e., by solving the problem, then adding the constraint that the result should be different from the previous one, and calling Z3 again; then you can consider adding a constraint that says the new result should differ from the old by more than 1e-6 in absolute value.
Whether any of this applies depends on the precise problem you're trying to solve. If you can share some more of your problem, people might be able to come up with other ideas. But the first choice should be figuring out if Real is really the domain you want to work with.
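As a rough illustration of the first idea, the sketch below scales each rational by 1e6 and keeps only the integer part. It uses plain Python integers rather than the Z3 API, and the names SCALE and to_fixed are assumptions for illustration.

```python
SCALE = 10**6  # keep 6 digits after the decimal point


def to_fixed(numerator, denominator, scale=SCALE):
    """Map numerator/denominator to an integer by truncating after 6 digits."""
    # Pure integer arithmetic: floor(numerator * scale / denominator)
    return numerator * scale // denominator


# The four fractions from the question
values = [
    (1197325, 13631488),
    (19157213, 218103808),
    (153257613, 1744830464),
    (1226060865, 13958643712),
]

fixed = [to_fixed(a, b) for a, b in values]
print(fixed)
```

All four fractions collapse to the same integer, which is the effect the question asks for: in an Integer encoding they are a single candidate value rather than four distinct reals.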