I have run my simple tensorflow.js app on Chrome (Windows 10), Android, and iOS, and it works. But when I try to run it on MS Edge (Windows 10) I get this error:
Failed to create D3D shaders.
index.ts (67,1)
SCRIPT5022: Failed to link vertex and fragment shaders.
The error occurs when I am trying to make a prediction (so the GPU is used):
function predict() {
  var cData = ctx.getImageData(0, 0, canvas.width, canvas.height);
  var cdata = cData.data;
  for (var i = 0; i < cdata.length; i += 4) { // to grayscale
    cdata[i] = (cdata[i] + cdata[i + 1] + cdata[i + 2]) / 3;
  }
  var x = tf.browser.fromPixels(cData, 1).asType('float32'); // keep only one channel
  x = tf.image.resizeNearestNeighbor(x, [28, 28]); // resize
  x = x.expandDims();
  x = x.div(255);
  var prediction;
  tf.tidy(() => {
    const output = model.predict(x);
    const axis = 1;
    prediction = Array.from(output.argMax(axis).dataSync());
    preds = output.arraySync();
  });
}
The printout on the console:
C:\fakepath(114,28-43): warning X3556: integer divides may be much slower, try using uints if possible.
C:\fakepath(115,29-36): warning X3556: integer divides may be much slower, try using uints if possible.
C:\fakepath(106,7-48): error X3531: can't unroll loops marked with loop attribute
C:\fakepath(114,28-43): warning X3556: integer divides may be much slower, try using uints if possible.
C:\fakepath(115,29-36): warning X3556: integer divides may be much slower, try using uints if possible.
C:\fakepath(126,2-29): warning X3550: array reference cannot be used as an l-value; not natively addressable, forcing loop to unroll
C:\fakepath(126,2-29): error X3500: array reference cannot be used as an l-value; not natively addressable
C:\fakepath(106,7-48): error X3511: forced to unroll loop, but unrolling failed.
C:\fakepath(104,7-48): error X3511: forced to unroll loop, but unrolling failed.
Warning: D3D shader compilation failed with default flags. (ps_5_0)
Retrying with skip validation
C:\fakepath(114,28-43): warning X3556: integer divides may be much slower, try using uints if possible.
C:\fakepath(115,29-36): warning X3556: integer divides may be much slower, try using uints if possible.
C:\fakepath(126,2-29): warning X3550: array reference cannot be used as an l-value; not natively addressable, forcing loop to unroll
C:\fakepath(126,2-29): error X3500: array reference cannot be used as an l-value; not natively addressable
C:\fakepath(106,7-48): error X3511: forced to unroll loop, but unrolling failed.
C:\fakepath(104,7-48): error X3511: forced to unroll loop, but unrolling failed.
Warning: D3D shader compilation failed with skip validation flags. (ps_5_0)
Retrying with skip optimization
C:\fakepath(114,28-43): warning X3556: integer divides may be much slower, try using uints if possible.
C:\fakepath(115,29-36): warning X3556: integer divides may be much slower, try using uints if possible.
C:\fakepath(126,2-29): warning X3550: array reference cannot be used as an l-value; not natively addressable, forcing loop to unroll
C:\fakepath(126,2-29): error X3500: array reference cannot be used as an l-value; not natively addressable
C:\fakepath(106,7-48): error X3511: forced to unroll loop, but unrolling failed.
C:\fakepath(104,7-48): error X3511: forced to unroll loop, but unrolling failed.
Warning: D3D shader compilation failed with skip optimization flags. (ps_5_0)
Failed to create D3D shaders.
webgl_util.ts (155,5)
SCRIPT5022: Failed to link vertex and fragment shaders.
Is it a problem with some browser setting? Does tensorflow.js support Edge? I assume it does. I am using tfjs 1.0.
I upgraded face-api.min.js to the 0.22.2 version and the error is gone.
Here is the source for the latest version:
https://github.com/justadudewhohacks/face-api.js/
I came across this strange phenomenon while playing with unsafe Rust. I think this code should cause a segmentation fault, but it does not. Am I missing something?
I tried to point a pointer at a variable with a shorter lifetime and then dereference it.
// function that sets a pointer to a variable with a shorter lifetime
unsafe fn what(p: &mut *const i32) {
    let a = 2;
    *p = &a;
    //let addr = *p; // I will talk about this later
    println!("inside: {}", **p);
}
fn main() {
    let mut p: *const i32 = 0 as *const i32;
    unsafe {
        what(&mut p);
        // I thought this line would make a segfault because 'a' goes out of scope at the end of the function making the address invalid
        println!("segfault? {}", *p);
        // Even more unsettling: I can increment the address and still dereference it.
        p = ((p as usize) + 1) as *const i32;
        println!("I'm definitely missing something: {}", *p);
    }
}
This program outputs:
inside: 2
segfault? {random number around 20000. probably uninitialized memory but why?}
I'm definitely missing something: {uninitialized memory}
If I uncomment the line
let addr = *p;
the second row becomes
segfault? 2
Why is there no segfault? Can the compiler extend the lifetime of a or the address p points at for safety? Am I missing some basic information about pointers in Rust?
This isn't unique to Rust. See:
Why doesn't the following code produce a segmentation fault?
Accessing an array out of bounds gives no error, why?
Can a local variable's memory be accessed outside its scope?
TL;DR: you have lied to the compiler. Your unsafe block does not uphold the requirements of safe code. This means you have created undefined behavior and the program is allowed to do whatever it wants. That could mean:
it crashes (such as via a segfault)
it runs perfectly
it erases your hard drive
it empties your bank account
it generates nasal demons
it eats your laundry
etc.
Segfaults are never a guaranteed outcome. A segmentation fault occurs when you access memory that is outside of the chunk of memory your thread/process has. The memory of the stack is well inside of that chunk, so it's unlikely to trigger the case.
See also:
Why doesn't this Rust program crash?
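For contrast with the dangling pointer in the question, here is a minimal sketch of two ways to get the value out of the function without undefined behavior. The helper names what_by_value and what_boxed are made up for illustration; the assumption is that the goal is only to hand the number 2 back to the caller.

```rust
// Hypothetical helper: return the value itself, so nothing can dangle.
fn what_by_value() -> i32 {
    let a = 2;
    a
}

// Hypothetical helper: put the value on the heap, so it outlives the call.
fn what_boxed() -> Box<i32> {
    Box::new(2)
}

fn main() {
    let v = what_by_value();
    let b = what_boxed();

    // A raw pointer into the heap allocation owned by `b`.
    let p: *const i32 = &*b;

    // Sound: `b` is still alive here, so `p` points at valid memory.
    let through_ptr = unsafe { *p };

    println!("by value: {}, through pointer: {}", v, through_ptr);
}
```

In both variants the pointee is guaranteed to be alive at the point of the dereference, which is exactly the guarantee the original unsafe block failed to uphold.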
I have a finite-difference code for wave propagation. Because there are many temporary mixed-derivative terms, I allocate a single temporary buffer and split it into chunks, one per derivative term, for memory efficiency. The code looks like this:
Wrk = malloc(2*(4*nxe*(2*ne+1) + 15*nxe)*sizeof(float));
In the computing function:
float *dudz = Wrk + NE;
float *dqdz = dudz + nxe;
for (int i=ix0_1; i<ixh_1; i++)
dudz [i] = hdzi*(u[i+nxe]-u[i-nxe]);
The problem is that the code runs fine when built with Intel compiler 12, but it blows up when built with Intel compiler 13 or 14.
Intel compilers 12, 13 and 14 all optimize the code above by vectorizing the loops. If I defeat the optimization under Intel compiler 13 and 14 by declaring
volatile float *dudz = Wrk + NE;
the code also runs fine, although slower.
I would greatly appreciate any advice.
Thank you so much,
gqchen
I'm trying to obtain how much free memory I have on the device. To do this I call the cuda function cuMemGetInfo from a fortran code, but it returns negative values for the free amount of memory, so there's clearly something wrong.
Does anyone know how I can do that?
Thanks
EDIT:
Sorry, my question was not very clear. I'm using OpenACC in Fortran and I call the C++ cuda function cudaMemGetInfo. I finally fixed the code: the problem was indeed the kind of the variables I was using. Switching to size_t fixed everything. This is the interface in Fortran that I'm using:
interface
  subroutine get_dev_mem(total,free) bind(C,name="get_dev_mem")
    use iso_c_binding
    integer(kind=c_size_t)::total,free
  end subroutine get_dev_mem
end interface
and this is the CUDA code:
#include <cuda.h>
#include <cuda_runtime.h>
extern "C" {
  void get_dev_mem(size_t& total, size_t& free)
  {
    cuMemGetInfo(&free, &total);
  }
}
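As an aside on why the kind of the variables matters: a minimal sketch, in Rust, of how a byte count can turn negative when it is read through a signed 32-bit integer. That the original variables were default 32-bit integers is an assumption (the post does not say), and the 3,200,000,000-byte figure and the helper name as_signed_32 are made up for illustration.

```rust
// Models reading a byte count through a signed 32-bit integer.
// This is an assumption about what went wrong, not something the post confirms.
fn as_signed_32(bytes: u64) -> i32 {
    // Keep only the low 32 bits, then reinterpret them as signed.
    bytes as u32 as i32
}

fn main() {
    // Hypothetical amount of free device memory, roughly 3 GB.
    let free_bytes: u64 = 3_200_000_000;

    println!("as 64-bit unsigned: {}", free_bytes);
    println!("as 32-bit signed:   {}", as_signed_32(free_bytes));
}
```

Any count of 2^31 bytes or more no longer fits in a signed 32-bit integer, which would be consistent with the fix of switching to a size_t-sized kind.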
There's one last question: I pushed an array onto the GPU and checked its size using cuMemGetInfo, then I computed its size by counting the number of bytes, but the two answers differ. Why? In the first case it is 3052 MB, in the latter 3051 MB. Could this difference of 1 MB be the size of the array descriptor? Here is the code that I used:
integer, parameter:: long = selected_int_kind(12)
integer(kind=c_size_t) :: total, free1,free2
real(8), dimension(:),allocatable::a
integer(kind=long)::N, eight, four
allocate(a(four*N))
!some OpenACC stuff in order to init the gpu
call get_dev_mem(total,free1)
!$acc data copy(a)
call get_dev_mem(total,free2)
print *,"size a in the gpu = ",(free1-free2)/1024/1024, " mb"
print *,"size a in theory = ", (eight*four*N)/1024/1024, " mb"
!$acc end data
deallocate(a)
Right, so, like commenters have suggested, we're not sure exactly what you're running, but filling in the missing details by guessing, here's a shot:
Most CUDA API calls return a status code (or error code if you will); this is true both in C/C++ and in Fortran, as we can see in the Portland Group's CUDA Fortran Manual:
Most of the runtime API routines are integer functions that return an error code; they return a value of zero if the call was successful, and a nonzero value if there was an error. To interpret the error codes, refer to “Error Handling,” on page 48.
This is the case for cudaMemGetInfo() specifically:
integer function cudaMemGetInfo( free, total )
integer(kind=cuda_count_kind) :: free, total
The two integers for free and total are cuda_count_kind, which if I am not mistaken are effectively unsigned... anyway, I would guess that what you're getting is an error code. Have a look at the Error Handling section on page 48 of the manual.
I'm doing ZigZag encoding on 32bit integers with Dart. This is the source code that I'm using:
int _encodeZigZag(int instance) => (instance << 1) ^ (instance >> 31);
int _decodeZigZag(int instance) => (instance >> 1) ^ (-(instance & 1));
The code works as expected in the DartVM.
But in dart2js the _decodeZigZag function returns invalid results if I input negative numbers. For example, -10 is encoded to 19 and should be decoded back to -10, but it is decoded to 4294967286. If I run (instance >> 1) ^ (-(instance & 1)) in the JavaScript console of Chrome, I get the expected result of -10. To me, that means JavaScript should be able to run this operation properly with its number model.
But Dart2Js generates the following JavaScript, which looks different from the code I tested in the console:
return ($.JSNumber_methods.$shr(instance, 1) ^ -(instance & 1)) >>> 0;
Why does Dart2Js add an unsigned right shift by 0 to the function? Without the shift, the result would be as expected.
Now I'm wondering, is it a bug in the Dart2Js compiler or the expected result? Is there a way to force Dart2Js to output the right javascript code?
Or is my Dart code wrong?
PS: Also tested splitting up the XOR into other operations, but Dart2Js is still adding the right shift:
final a = -(instance & 1);
final b = (instance >> 1);
return (a & -b) | (-a & b);
Results in:
a = -(instance & 1);
b = $.JSNumber_methods.$shr(instance, 1);
return (a & -b | -a & b) >>> 0;
For efficiency reasons dart2js compiles Dart numbers to JS numbers. JS, however, only provides one number type: doubles. Furthermore bit-operations in JS are always truncated to 32 bits.
In many cases (like cryptography) it is easier to deal with unsigned 32 bits, so dart2js compiles bit-operations so that their result is an unsigned 32 bit number.
Neither choice (signed or unsigned) is perfect. Initially dart2js compiled to signed 32 bits, and this was only changed when we tripped over it too frequently. As your code demonstrates, this doesn't remove the problem, it just shifts it to different (hopefully less frequent) use-cases.
Non-compliant number semantics have been a long-standing bug in dart2js, but fixing it will take time and potentially slow down the resulting code. In the short-term future Dart developers (compiling to JS) need to know about this restriction and work around it.
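The signed versus unsigned distinction described above can be made concrete with a small sketch. It is written in Rust rather than Dart because Rust has explicit 32-bit signed and unsigned types; the function names are made up for illustration, and js_unsigned only models the effect of the trailing >>> 0 in the generated JavaScript, it is not dart2js code.

```rust
// ZigZag-encode a signed 32-bit integer into an unsigned one.
fn encode_zigzag(n: i32) -> u32 {
    // `n >> 31` is an arithmetic shift: all ones for negative n, zero otherwise.
    ((n << 1) ^ (n >> 31)) as u32
}

// Decode back to a signed 32-bit integer.
fn decode_zigzag(z: u32) -> i32 {
    ((z >> 1) as i32) ^ -((z & 1) as i32)
}

// Models `>>> 0`: the same 32 bits, read as an unsigned number.
fn js_unsigned(x: i32) -> u32 {
    x as u32
}

fn main() {
    let encoded = encode_zigzag(-10);
    let decoded = decode_zigzag(encoded);

    println!("encoded:            {}", encoded);
    println!("decoded (signed):   {}", decoded);
    println!("decoded (unsigned): {}", js_unsigned(decoded));
}
```

The bit pattern is identical in both readings; -10 and 4294967286 differ only in whether the top bit is treated as a sign bit, which is why the result looks correct in the browser console but not after the unsigned conversion.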
Looks like I found equivalent code that outputs the right result. The unit tests pass for both the Dart VM and dart2js, and I will use it for now.
int _decodeZigZag(int instance) => ((instance & 1) == 1 ? -(instance >> 1) - 1 : (instance >> 1));
Dart2Js is not adding a shift this time. I would still be interested in the reason for this behavior.
I'm writing a program in FORTRAN that is a bit special. I can only use integer variables, and as you know, with these you get an overflow when you try to calculate a factorial greater than 12 or 13. So I wrote this program to avoid the problem:
http://lendricheolfiles.webs.com/codigo.txt
But something very strange is happening. The program calculates the factorial correctly 4 or 5 times and then gives a memory overflow message. I'm using Windows 8 and I don't know whether that is the cause of the failure or whether I've simply done something wrong.
Thanks.
Try compiling with run-time subscript checking. In Fortran, segmentation faults are generally caused either by subscript errors or by mismatches between actual and dummy arguments (i.e., between the arguments in the call to a procedure and the arguments as declared in the procedure). I'll make a wild guess from glancing at your code that you have a subscript error; let the compiler find it for you by turning on run-time subscript checking. Most Fortran compilers offer this as a compilation option.
P.S. You can also do calculations like this by using already written packages, e.g., the arbitrary precision arithmetic software of David Bailey, et al., available in Fortran 90 at http://crd-legacy.lbl.gov/~dhbailey/mpdist/
M.S.B.'s answer has the gist of your problem: your array indices go out of bounds at a couple of places.
In three loops, cifra - 1 == 0 is out of bounds:
do cifra=ncifras,1,-1
   factor(1,cifra-1) = factor(1,cifra)/10 ! factor is (1:2, 1:ncifras)
   factor(1,cifra) = mod(factor(1,cifra),10)
enddo
! :
! Same here:
do cifra=ncifras,1,-1
   factor(2,cifra-1) = factor(2,cifra)/10
   factor(2,cifra) = mod(factor(2,cifra),10)
enddo
!:
do cifra=ncifras,1,-1
   sumaprovisional(cifra-1) = sumaprovisional(cifra-1)+(sumaprovisional(cifra)/10)
   sumaprovisional(cifra) = mod(sumaprovisional(cifra),10)
enddo
In the next case, the value of cifra - (fila - 1) goes out of bounds:
do fila=1,nfilas
   do cifra=1,ncifras
      ! Out of bounds for all cifra < fila:
      sumando(fila,cifra-(fila-1)) = factor(1,cifra)*factor(2,ncifras-(fila-1))
   enddo
   sumaprovisional = sumaprovisional+sumando(fila,:)
enddo
You should be fine if you rewrite the first three loops as do cifra = ncifras, 2, -1 and the inner loop of the other case as do cifra = fila, ncifras. Also, in the example program you posted, you first have to allocate resultado properly before passing it to the subroutine.
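The bounds fix for the carry loops can be illustrated with a minimal sketch. It is written in Rust, not Fortran, because Rust checks indices at run time by default; it is not a port of the posted program. The names propagate_carries and factorial_digits are made up, digits are stored most significant first, and the sketch assumes width is at least 1 and large enough to hold the result.

```rust
// Normalise a digit array (most significant digit first) so each entry is 0..=9.
// The loop stops at index 1, so `i - 1` never falls below the first element,
// mirroring `do cifra = ncifras, 2, -1` in 1-based Fortran.
fn propagate_carries(digits: &mut [u32]) {
    for i in (1..digits.len()).rev() {
        digits[i - 1] += digits[i] / 10;
        digits[i] %= 10;
    }
}

// Compute n! as an array of `width` decimal digits.
// Assumes width >= 1 and that n! fits in `width` digits.
fn factorial_digits(n: u32, width: usize) -> Vec<u32> {
    let mut digits = vec![0u32; width];
    digits[width - 1] = 1; // start from 1
    for k in 2..=n {
        // Multiply every digit by k, then push the excess to the left.
        for d in digits.iter_mut() {
            *d *= k;
        }
        propagate_carries(&mut digits);
    }
    digits
}

fn main() {
    let digits = factorial_digits(13, 12);
    let text: String = digits.iter().map(|d| d.to_string()).collect();
    println!("13! = {}", text);
}
```

Had the loop run down to the first index, the access to the element before it would abort with an index-out-of-bounds panic, which is the same kind of error that run-time subscript checking reports in Fortran.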