Find available graphics card memory using Fortran - memory

I am using GlobalMemoryStatusEX in order to find out the amount of memory in my system.
Is there a similar way to find the amount of memory on my graphics card?
Here is a piece of my code :
use kernel32
use ifwinty
implicit none
type(T_MEMORYSTATUSEX) :: status
integer(8) :: RetVal
status%dwLength = sizeof(status)
RetVal = GlobalMemoryStatusEX(status)
write(*,*) 'Memory Available =',status%ullAvailPhys
I am using Intel Visual Fortran 2010 on Windows 7 x64.
Thank you!

Since you tagged this question with CUDA, I'll offer a CUDA answer, though I'm not sure it makes sense in your environment.
I haven't tested this on IVF, but it works with gfortran and PGI Fortran (Linux). You can use the Fortran iso_c_binding module, available in many implementations, to call routines from the CUDA runtime API library directly from Fortran code. One of those routines is cudaMemGetInfo, which reports the free and total device memory in bytes.
Here's a fully worked example of calling it from gfortran (on Linux):
$ cat cuda_mem.f90
!=======================================================================================================================
!Interface to cuda C subroutines
!=======================================================================================================================
module cuda_rt
use iso_c_binding
interface
!
integer (c_int) function cudaMemGetInfo(fre, tot) bind(C, name="cudaMemGetInfo")
use iso_c_binding
implicit none
type(c_ptr),value :: fre
type(c_ptr),value :: tot
end function cudaMemGetInfo
!
end interface
end module cuda_rt
!=======================================================================================================================
program main
!=======================================================================================================================
use iso_c_binding
use cuda_rt
type(c_ptr) :: cpfre, cptot
integer*8, target :: freemem, totmem
integer*4 :: stat
freemem = 0
totmem = 0
cpfre = c_loc(freemem)
cptot = c_loc(totmem)
stat = cudaMemGetInfo(cpfre, cptot)
if (stat .ne. 0 ) then
write (*,*)
write (*, '(A, I2)') " CUDA error: ", stat
write (*,*)
stop
end if
write (*, '(A, I10)') " free: ", freemem
write (*, '(A, I10)') " total: ", totmem
write (*,*)
end program main
$ gfortran -O3 cuda_mem.f90 -L/usr/local/cuda/lib64 -lcudart -o cuda_mem
$ ./cuda_mem
free: 2755256320
total: 2817982464
$
On Windows, you would need a properly installed CUDA environment (which presumes Visual Studio). You would then locate cudart.lib in that installation and link against it. I'm not 100% sure this would link successfully with IVF, since I don't know whether IVF links libraries the same way Visual Studio does.

Related

Is it possible to use clang to produce RISC V assembly without linking?

I am trying to learn more about compilers and RISC V assembly was specifically designed to be easy to learn and teach. I am interested in compiling some simple C code to assembly using clang for the purpose of understanding the semantics. I'm planning on using venus to step through the assembly and the source code does NOT actually need to be fully compiled to machine code in order to run on a real machine.
I want to avoid compiler optimizations so I can see what I've actually instructed the processor to do.
I don't actually need the program to compile to machine code--I just want the assembly.
I don't want to worry about linking to the system library because this code doesn't actually need to run
The code does not make any explicit use of system calls and so I think a std lib should not be required
This answer seems to indicate that clang definitely can compile to RISC V targets, but it requires having a version of the OS's standard library built for RISC V.
This answer indicates that some form of cross-compiling is necessary, but again I don't need to fully compile the code to machine instructions so this should not apply if I'm understanding correctly.
Use clang -S to stop after generating an assembly file:
$ cat foo.c
int main() { return 2+2; }
$ clang -target riscv64 -S foo.c
$ cat foo.s
.text
.attribute 4, 16
.attribute 5, "rv64i2p0_m2p0_a2p0_c2p0"
.file "foo.c"
.globl main
.p2align 1
.type main,@function
main:
addi sp, sp, -32
sd ra, 24(sp)
sd s0, 16(sp)
addi s0, sp, 32
li a0, 0
sw a0, -20(s0)
li a0, 4
ld ra, 24(sp)
ld s0, 16(sp)
addi sp, sp, 32
ret
.Lfunc_end0:
.size main, .Lfunc_end0-main
.ident "Ubuntu clang version 14.0.0-1ubuntu1"
.section ".note.GNU-stack","",@progbits
.addrsig
You can also use Compiler Explorer conveniently online.
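To look at something richer than a constant return, a function with a local variable and a loop makes the unoptimized loads, stores, and branches visible. The file below is a hypothetical input (the name sum_to is an illustrative assumption, not from the question). It deliberately has no #include lines, so generating assembly with clang -target riscv64 -O0 -S should not need any RISC-V C library headers.

```c
/* Hypothetical example input; sum_to is an illustrative name.
 * No #include lines, so no target C library headers are required. */
int sum_to(int n)
{
    int total = 0;                /* kept in a stack slot when unoptimized */
    for (int i = 1; i <= n; i++)  /* one compare-and-branch per iteration */
        total += i;
    return total;
}
```

Saved as, say, sum.c, this can be processed the same way as foo.c above; the resulting sum.s can then be stepped through in a simulator.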

Fortran DLL in Delphi

I am trying to compile some very old Fortran procedures into a DLL so that I can use them from Delphi. Although the Fortran code is not very large (750-800 lines), its structure is very complicated, with dozens of GOTO statements, and translating it is not easy (I tried to port it, but I failed).
Although I am new to Fortran and not very experienced in calling DLLs, I gradually managed to overcome all the difficulties but one: calling a Fortran subroutine with multiple dynamic arrays. Here’s a simple example that I created:
SUBROUTINE MYSUB1( NoEquations, INTARR1 )
!DEC$ ATTRIBUTES DLLEXPORT::MYSUB1
!DEC$ ATTRIBUTES C, REFERENCE, ALIAS:'MYSUB1' :: MYSUB1
C
C***************************************************************
C
INTEGER NoEquations, I
INTEGER INTARR1(*)
C
C***************************************************************
C
DO 100, I=1,NoEquations
INTARR1(I) = I
100 CONTINUE
RETURN
C
END
SUBROUTINE MYSUB2( NoEquations, INTARR1, INTARR2 )
!DEC$ ATTRIBUTES DLLEXPORT::MYSUB2
!DEC$ ATTRIBUTES C, REFERENCE, ALIAS:'MYSUB2' :: MYSUB2
C
C***************************************************************
C
INTEGER NoEquations, I
INTEGER INTARR1(*)
INTEGER INTARR2(*)
C
C***************************************************************
C
DO 100, I=1,NoEquations
INTARR2(I) = INTARR1(I)
100 CONTINUE
RETURN
C
END
I compile the Fortran code with mingw-w64 with the following command:
gfortran -shared -mrtd -fno-underscoring -o simple.dll simple.f
And I declare the procedure from within Delphi with:
procedure mysub1(var NoEquations: integer; var INTARR1 : array of integer); stdcall; external 'simple.dll';
procedure mysub2(var NoEquations: integer; var INTARR1,INTARR2: array of integer); stdcall; external 'simple.dll';
The Delphi program compiles correctly, but when I run it, mysub1 works and updates INTARR1, while mysub2 gives me an Access Violation. Apparently the second dynamic array confuses the compiler, but I do not know how to make it understand.
Thanks in advance
I do not know Delphi, but here is what you can do to create a DLL accessible from the C language. I hope you find it helpful. Here is your F77 code modified to make it interoperable via iso_c_binding module features of Fortran:
SUBROUTINE MYSUB1(NoEquations,INTARR1) bind(C,name="MYSUB1")
!DEC$ ATTRIBUTES DLLEXPORT :: MYSUB1
use, intrinsic :: iso_c_binding, only: IK => c_int32_t
integer(IK), intent(in), value :: NoEquations
integer(IK), intent(out) :: INTARR1(NoEquations)
integer :: I
DO 100, I=1,NoEquations
INTARR1(I) = I
100 CONTINUE
RETURN
END SUBROUTINE MYSUB1
C***************************************************************
C***************************************************************
SUBROUTINE MYSUB2(NoEquations,INTARR1,INTARR2)
+bind(C,name="MYSUB2")
!DEC$ ATTRIBUTES DLLEXPORT :: MYSUB2
use, intrinsic :: iso_c_binding, only: IK => c_int32_t
integer(IK), intent(in), value :: NoEquations
integer(IK), intent(in) :: INTARR1(NoEquations)
integer(IK), intent(out) :: INTARR2(NoEquations)
integer :: I
DO 100, I=1,NoEquations
INTARR2(I) = INTARR1(I)
100 CONTINUE
RETURN
END SUBROUTINE MYSUB2
Note the many subtle but important changes that I have made to your code to make it interoperable with C:
The bind(C,name="MYSUB1") fixes the subroutine's name so you do not need the extra second compiler directives (which I have now removed from your code).
Also, note the value attribute which tells the compiler to pass by value as is done in C.
Also, note that I define the kinds of integers in the interfaces of the subroutine as c_int32_t to be compatible with the C processor's integer kinds.
Also, note that I have converted your assumed-size INTARR1 and INTARR2 arrays to explicit-shape arrays INTARR1(NoEquations), INTARR2(NoEquations).
Since you have the Intel !DEC$ compiler directives in your code, I assume that you are using the Intel Fortran compiler on Windows. Note that I removed your second lines of directives in both Fortran subroutines.
Now, assuming that you store the above code in a file named mysubs.F, then compiling this file via the Intel Fortran compiler ifort on the Intel-Windows command prompt as in the following command,
ifort mysubs.F /dll
will generate a DLL named mysubs.dll in the current folder and print the following message on the screen,
Intel(R) Visual Fortran Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 19.1.1.216 Build 20200306
Copyright (C) 1985-2020 Intel Corporation. All rights reserved.
Microsoft (R) Incremental Linker Version 14.16.27027.1
Copyright (C) Microsoft Corporation. All rights reserved.
-out:mysubs.dll
-dll
-implib:mysubs.lib
mysubs.obj
Creating library mysubs.lib and object mysubs.exp
To test this DLL, you can try the following C code stored in main.c,
#include <stdio.h>
#include <stdint.h>
#include <string.h>
void MYSUB1(int32_t, int32_t []);
void MYSUB2(int32_t, int32_t [], int32_t []);
int main(int argc, char *argv[])
{
const int32_t NoEquations = 5;
int32_t INTARR1[NoEquations];
int32_t INTARR2[NoEquations];
int loop;
// C rules for argument passing apply here
MYSUB1(NoEquations,INTARR1);
printf("\nINTARR1:\n"); for(loop = 0; loop < NoEquations; loop++) printf("%d ", INTARR1[loop]);
MYSUB2(NoEquations,INTARR1,INTARR2);
printf("\nINTARR2:\n"); for(loop = 0; loop < NoEquations; loop++) printf("%d ", INTARR2[loop]);
return 0;
}
Compiling this C code with the Intel C compiler,
icl main.c -c
prints the following on screen,
Intel(R) C++ Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 19.1.1.216 Build 20200306
Copyright (C) 1985-2020 Intel Corporation. All rights reserved.
main.c
and generates the C main object file. Finally, link the C object file with the Fortran DLL library to generate the executable via the following command,
icl main.obj mysubs.lib -o main.exe
which prints the following on screen,
Intel(R) C++ Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 19.1.1.216 Build 20200306
Copyright (C) 1985-2020 Intel Corporation. All rights reserved.
Microsoft (R) Incremental Linker Version 14.16.27027.1
Copyright (C) Microsoft Corporation. All rights reserved.
-out:main.exe
main.obj
mysubs.lib
Testing the DLL call. Simply call the generated executable main.exe,
main.exe
which prints on screen,
INTARR1:
1 2 3 4 5
INTARR2:
1 2 3 4 5
Now, to call this DLL from Delphi, simply assume that you are calling the C functions with the prototypes specified in the C main code. That's all. No further dealing with Fortran from inside Delphi.
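To make that calling contract concrete, here is a minimal C sketch of stand-in functions with the same prototypes as the bind(C) subroutines. The names mysub1_c and mysub2_c are hypothetical and exist only for illustration; the point is that the count arrives by value and each array arrives as a plain pointer to 32-bit integers, with no extra hidden arguments.

```c
#include <stdint.h>

/* Hypothetical C stand-in for MYSUB1: n is passed by value,
 * arr1 is a pointer to the first of n 32-bit integers. */
void mysub1_c(int32_t n, int32_t *arr1)
{
    for (int32_t i = 0; i < n; i++)
        arr1[i] = i + 1;          /* Fortran INTARR1(I) = I, with I from 1 */
}

/* Hypothetical C stand-in for MYSUB2: copies n elements of arr1 to arr2. */
void mysub2_c(int32_t n, const int32_t *arr1, int32_t *arr2)
{
    for (int32_t i = 0; i < n; i++)
        arr2[i] = arr1[i];        /* Fortran INTARR2(I) = INTARR1(I) */
}
```

Whatever declaration is used on the calling side needs to pass exactly these arguments: one 32-bit integer by value, followed by one pointer per array.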
Final advice:
Fortran has powerful standard interoperability features, like the ones I have added to your F77 code, that can easily connect almost any Fortran code to any language (via C).
Stay away from FORTRAN 77, which is almost half a century old; even Fortran 90 is already more than three decades old. The latest Fortran standard was released in 2018 and, together with Fortran 2008, makes Fortran an extremely powerful, high-level, fast, natively vectorized, concurrent, shared, and distributed parallel programming language for numerical computing.

DTrace built-in variable stackdepth always returns 0

I have recently been using DTrace to analyze my iOS app.
Everything goes well except when I try to use the built-in variable stackdepth.
I read the documentation here, which introduces the built-in variable stackdepth.
So I wrote some D code:
pid$target:::entry
{
self->entry_times[probefunc] = timestamp;
}
pid$target:::return
{
printf ("-----------------------------------\n");
this->delta_time = timestamp - self->entry_times[probefunc];
printf ("%s\n", probefunc);
printf ("stackDepth %d\n", stackdepth);
printf ("%d---%d\n", this->delta_time, epid);
ustack();
printf ("-----------------------------------\n");
}
I run it with sudo dtrace -s temp.d -c ./simple.out. The ustack() function works well, but stackdepth always appears to be 0.
I tried both on my iOS app and on a simple C program.
Does anybody know what's going on?
And how can I get the stack depth when the probe fires?
You want to use ustackdepth -- the user-land stack depth.
The stackdepth variable refers to the kernel thread stack depth; the ustackdepth variable refers to the user-land thread stack depth. When the traced program is executing in user-land, stackdepth will (should!) always be 0.
ustackdepth is calculated using the same logic as is used to walk the user-land stack as with ustack() (just as stackdepth and stack() use similar logic for the kernel stack).
This seems like a bug in the Mac / iOS implementation of DTrace to me.
However, since you're already probing every function entry and return, you could just keep a new variable self->depth and do ++ in the :::entry probe and -- in the :::return probe. This doesn't work quite right if you run it against optimized code, because any tail-call-optimized functions may look like they enter but never return. To solve that, you can turn off optimizations.
Also, because what you're doing looks a lot like this, I thought maybe you would be interested in the -F option:
Coalesce trace output by identifying function entry and return.
Function entry probe reports are indented and their output is prefixed
with ->. Function return probe reports are unindented and their output
is prefixed with <-.
The normal script to use with -F is something like:
pid$target::some_function:entry { self->trace = 1 }
pid$target:::entry /self->trace/ {}
pid$target:::return /self->trace/ {}
pid$target::some_function:return { self->trace = 0 }
Where some_function is the function whose execution you want to be printed. The output shows a textual call graph for that execution:
-> some_function
-> another_function
-> malloc
<- malloc
<- another_function
-> yet_another_function
-> strcmp
<- strcmp
-> malloc
<- malloc
<- yet_another_function
<- some_function

Fortran deallocate array but memory not released to OS

I have the following question: how do I deallocate array memory inside a derived type? For example, in a%b%c,
how do I deallocate c? The specific problem is below (the compilers I tried are gfortran from GCC 4.4.7 and ifort 18.0.1; the OS is Linux):
module grist_domain_types
implicit none
public :: aaa
type bbb
real (8), allocatable :: c(:)
end type bbb
type aaa
type(bbb), allocatable :: b(:)
end type aaa
end module grist_domain_types
program main
use grist_domain_types
type(aaa) :: a
integer(4) :: time,i
time=20
allocate(a%b(1:100000000))
call sleep(time)!--------------1
do i=1,100000000
allocate(a%b(i)%c(1:1))
enddo
call sleep(time)!--------------2
do i=1,100000000
deallocate(a%b(i)%c)
enddo
call sleep(time)!--------------3
deallocate(a%b)
call sleep(time)!--------------4
end program
First, I compile the program with "gfortran main.F90 -o main" and run it. Then I use top -p processID to watch the memory. At point 1 the memory is 4.5G. At point 2 it is 7.5G. At point 3 it is still 7.5G (I expected 4.5G). At point 4 it is 3G (I expected 0G or close to it). So deallocate(a%b(i)%c) does not seem to work. However, valgrind reports that all of the program's memory is deallocated. The problem happens with both ifort and gfortran. How can this be explained? If I allocate many c arrays this way, the program eventually crashes due to insufficient memory. How can I solve it?
Take a look at this post from the Intel forum. There are two important pieces of information in it:
(From Doctor Fortran):
When you do a DEALLOCATE, the memory that was allocated returns to the pool used by the memory allocator (on Linux and OS X this is the same as C's malloc/free). The memory is not released back to the OS - it is very rare that this would even be possible. What often happens is that the pattern of allocations and deallocations causes virtual memory to be fragmented, so that while the total available space may be high, there may not be sufficient contiguous space to allocate a large item. Unlike with disks, there is no way to "defrag" memory.
(From Jim Dempsey)
See if you can deallocate the memory in the reverse order in which it was allocated. This can reduce memory fragmentation.
You may also refer to this other Intel post:
During the program run, the Fortran runtime library will manage your heap. Yes, if data is DEALLOCATED, the runtime may choose to wait to release that memory. It's an optimization - if you do another ALLOCATE with the same size it will just reuse those pages. If the heap starts to run low, it will do some collection but not until it's absolutely necessary.
Also, let me add something: check whether other objects are created dynamically in scope, such as automatic arrays or temporary array copies. They could be holding memory that is freed only when they go out of scope.
Summing up, even if 'top' says the memory is still in use, you should start to worry only if your program starts to crash or if Valgrind shows something weird.
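The behaviour described in the quotes can be observed directly at the C level. The sketch below assumes Linux (it reads /proc/self/statm) and, as the first quote states, that DEALLOCATE ends up in the same allocator as C's free. The function names resident_kb and churn are hypothetical. On glibc it also calls malloc_trim(0), a glibc extension that asks the allocator to return free memory to the OS where it can; whether that helps a particular Fortran program is an assumption to test, not a guarantee.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#ifdef __GLIBC__
#include <malloc.h>   /* malloc_trim is a glibc extension */
#endif

/* Resident set size in KiB from /proc/self/statm (Linux only); -1 on failure. */
long resident_kb(void)
{
    long size_pages = 0, resident_pages = 0;
    FILE *f = fopen("/proc/self/statm", "r");
    if (f == NULL)
        return -1;
    int fields = fscanf(f, "%ld %ld", &size_pages, &resident_pages);
    fclose(f);
    if (fields != 2)
        return -1;
    return resident_pages * (sysconf(_SC_PAGESIZE) / 1024);
}

/* Allocate `count` one-element blocks (like a%b(i)%c(1:1)), free them all,
 * and print the resident size at each stage. Returns 0 on success. */
int churn(size_t count)
{
    double **blocks = malloc(count * sizeof *blocks);
    if (blocks == NULL)
        return -1;
    size_t made = 0;
    for (; made < count; made++) {
        blocks[made] = malloc(sizeof **blocks);
        if (blocks[made] == NULL)
            break;
        *blocks[made] = (double)made;   /* touch it so the page is resident */
    }
    printf("after allocate:    %ld KiB\n", resident_kb());
    for (size_t i = 0; i < made; i++)
        free(blocks[i]);
    free(blocks);
    printf("after free:        %ld KiB\n", resident_kb());
#ifdef __GLIBC__
    malloc_trim(0);                     /* ask glibc to release free pages */
    printf("after malloc_trim: %ld KiB\n", resident_kb());
#endif
    return made == count ? 0 : -1;
}
```

Calling churn with a large count from a small main and comparing the printed figures shows whether freed memory is handed back on a given system; the numbers depend on the allocator, its tuning, and the OS.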
I modified your program (to see where it was up to) and ran it on Windows 7 with gfortran 7.2.0. It does not show the memory retention you report: memory reverts to 13 MB. Contrary to my comment, memory demand did not change during initialisation of c.
module grist_domain_types
implicit none
public :: aaa
type bbb
real (8), allocatable :: c(:)
end type bbb
type aaa
type(bbb), allocatable :: b(:)
end type aaa
end module grist_domain_types
program main
use grist_domain_types
type(aaa) :: a
integer(4),parameter :: million = 1000000
integer(4) :: n = 100*million
integer(4) :: time = 5, i, pass
do pass = 1,5
write (*,*) ' go #', pass
allocate(a%b(1:n))
write (*,*) 'allocate b'
call sleep(time)!--------------1
write (*,*) ' go'
do i=1,n
allocate(a%b(i)%c(1:1))
enddo
write (*,*) 'allocate c'
call sleep(time)!--------------2
write (*,*) ' go'
do i=1,n
a%b(i)%c = real(i)
enddo
write (*,*) 'use c'
call sleep(time)!--------------2a
write (*,*) ' go'
do i=1,n
deallocate(a%b(i)%c)
enddo
write (*,*) 'deallocate c'
call sleep(time)!--------------3
write (*,*) ' go'
deallocate(a%b)
write (*,*) 'deallocate b'
call sleep(time)!--------------4
end do
write (*,*) ' done : exit ?'
read (*,*) i
end program
edit: I have wrapped the test in a do pass loop to repeat the memory demands. This shows no memory leakage for this Fortran program. I use Task Manager to check memory usage, both for this program and for the O/S. Your particular O/S and Fortran compiler may behave differently.

gfortran 4.8.0 bug? Return type mismatch of function

I just used gfortran 4.1.2 and gfortran 4.8.0 to compile the following simple code:
function foo(a, b) result(res)
integer, intent(in) :: a, b
integer res
res = a+b
end function foo
program test
integer a, b, c
c = foo(a, b)
end program test
gfortran 4.1.2 succeeds, but gfortran 4.8.0 gives the weird error:
test.F90:14.11:
c = foo(a, b)
1
Error: Return type mismatch of function 'foo' at (1) (REAL(4)/INTEGER(4))
Any idea?
There is a bug in your code, namely that you don't specify the return type of the function foo in the main program. Per the Fortran implicit typing rules it thus gets a type of default real.
You should (1) always use 'implicit none' and, if at all possible, (2) use modules or contained procedures, which give you explicit interfaces.
The reason why GFortran 4.1 doesn't report this error is that older versions of GFortran always functioned in a 'procedure at a time' mode; thus the compiler is happily oblivious to any other functions in the same file. Newer versions work in 'whole file' mode (default since 4.6) where the compiler 'sees' all the procedures in a file at a time. This allows the compiler to catch errors such as the one in your code, and also provides some optimization opportunities.
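The implicit typing rule behind this error is simple enough to write down as a small function. The sketch below, in C, is only an illustration of the rule (the name implicit_type is hypothetical): an undeclared name starting with I through N is default INTEGER, and any other letter gives default REAL, which is why foo is taken to be REAL in the main program.

```c
#include <ctype.h>

/* Illustrative encoding of Fortran's default implicit typing rule.
 * Returns 'I' for default INTEGER, 'R' for default REAL,
 * and '?' if the name does not start with a letter.
 * Assumes an ASCII-compatible character set. */
char implicit_type(const char *name)
{
    int c = tolower((unsigned char)name[0]);
    if (!isalpha(c))
        return '?';
    return (c >= 'i' && c <= 'n') ? 'I' : 'R';
}
```

With 'implicit none' in effect the rule is switched off, and the undeclared foo would be rejected outright instead of silently becoming REAL.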
