I solved Project Euler #4 in Rust. There is one line of code that took me around 30 minutes to solve. When I remove the line, I get: thread 'main' panicked at 'attempt to multiply with overflow' Explanation is in the code:(see on playground)
The weird thing is, rev is 0, but when I try: rev=0; in the place that I have marked as 'problem here', it will solve the issue even though the value is same. Why is that? I have checked and this is not a duplicate question. I also didn't know what to write in the title since this is an uncommon error.
//Task: Find the largest palindrome made from the product of two 3-digit numbers.
fn main(){
let mut pal;//palindrome
let mut ram;//a second number that's equal to palindrome, to copy it's digits.
let mut rev=0;//reversed palindrome
let mut lar=0;//largest palindrome
for ln in 100..1000{//ln=left number } left number * right number = palindrome
for rn in 100..1000{//rn=right number} ^
pal=ln*rn;// |
ram=pal;
//-----------------Problem here-----------------
//rev=ram%10;//when this line is commented, it gives:
//thread 'main' panicked at 'attempt to multiply with overflow', why_overflow_pe4.rs:13:17
while ram>0{//getting the last digit of ram for the first digit of rev
//and continuing until ram=0 and rev is reversed.
rev*=10;
ram/=10;
rev+=ram%10;
}
rev/=10;
if pal==rev && pal>lar{//if rev=pal and our palindrome is larger than previous largest
//make the largest palindrome current palindrome
lar=pal;
}
}
}
println!("{}",lar);
}
-Thank you!
Inspecting the contents before the multiplication of rev may help: rev has the value 1010102010, and multiplying it by 10 would result in a number that is too large (i.e. would require too many bits) to be held by rev.
If you uncomment the rev=ram%10, rev will surely be less than 10, so multiplying it by 10 results in at most 100.
You can tell the data type size using u8, u16, u32, u64, u128 et al., but even u128 will overflow. Thus, you could either adapt your algorithm or go with a data type supporting more than 128 bits.
Related
I'm looking for an SSE Bitwise OR between components of same vector. (Editor's note: this is potentially an X-Y problem, see below for the real comparison logic.)
I am porting some SIMD logic from SPU intrinsics. It has an instruction
spu_orx(a)
Which according to the docs
spu_orx: OR word across d = spu_orx(a) The four word elements of
vector a are logically Ored. The result is returned in word element 0
of vector d. All other elements (1,2,3) of d are assigned a value of
zero.
How can I do that with SSE 2 - 4 involving minimum instruction? _mm_or_ps is what I got here.
UPDATE:
Here is the scenario from SPU based code:
qword res = spu_orx(spu_or(spu_fcgt(x, y), spu_fcgt(z, w)))
So it first ORs two 'greater' comparisons, then ORs its result.
Later couples of those results are ANDed to get final comparison value.
This is effectively doing (A||B||C||D||E||F||G||H) && (I||J||K||L||M||N||O||P) && ... where A..D are the 4x 32-bit elements of the fcgt(x,y) and so on.
Obviously vertical _mm_or_ps of _mm_cmp_ps results is a good way to reduce down to 1 vector, but then what? Shuffle + OR, or something else?
UPDATE 1
Regarding "but then what?"
I perform
qword res = spu_orx(spu_or(spu_fcgt(x, y), spu_fcgt(z, w)))
On SPU it goes like this:
qword aRes = si_and(res, res1);
qword aRes1 = si_and(aRes, res2);
qword aRes2 = si_and(aRes1 , res3);
return si_to_uint(aRes2 );
several times on different inputs,then AND those all into a single result,which is finally cast to integer 0 or 1 (false/true test)
SSE4.1 PTEST bool any_nonzero = !_mm_testz_si128(v,v);
That would be a good way to horizontal OR + booleanize a vector into a 0/1 integer. It will compile to multiple instructions, and ptest same,same is 2 uops on its own. But once you have the result as a scalar integer, scalar AND is even cheaper than any vector instruction, and you can branch on the result directly because it sets integer flags.
#include <immintrin.h>
bool any_nonzero_bit(__m128i v) {
return !_mm_testz_si128(v,v);
}
On Godbolt with gcc9.1 -O3 -march=nehalem:
any_nonzero(long long __vector(2)):
ptest xmm0, xmm0 # 2 uops
setne al # 1 uop with false dep on old value of RAX
ret
This is only 3 uops on Intel for a horizontal OR into a single bit in an integer register. AMD Ryzen ptest is only 1 uop so it's even better.
The only risk here is if gcc or clang creates false dependencies by not xor-zeroing eax before doing a setcc into AL. Usually gcc is pretty fanatical about spending extra uops to break false dependencies so I don't know why it doesn't here. (I did check with -march=skylake and -mtune=generic in case it was relying on Nehalem partial-register renaming for -march=nehalem. Even -march=znver1 didn't get it to xor-zero EAX before the ptest.)
It would be nice if we could avoid the _mm_or_ps and have PTEST do all the work. But even if we consider inverting the comparisons, the vertical-AND / horizontal-OR behaviour doesn't let us check something about all 8 elements of 2 vectors, or about any of those 8 elements.
e.g. Can PTEST be used to test if two registers are both zero or some other condition?
// NOT USEFUL
// 1 if all the vertical pairs AND to zero.
// but 0 if even one vertical AND result is non-zero
_mm_testz_si128( _mm_castps_si128(_mm_cmpngt_ps(x,y)),
_mm_castps_si128(_mm_cmpngt_ps(z,w)));
I mention this only to rule it out and save you the trouble of considering this optimization idea. (#chtz suggested it in comments. Inverting the comparison is a good idea that can be useful for other ways of doing things.)
Without SSE4.1 / delaying the horizontal OR
We might be able to delay horizontal ORing / booleanizing until after combining some results from multiple vectors. This makes combining more expensive (imul or something), but saves 2 uops in the vector -> integer stage vs. PTEST.
x86 has cheap vector mask->integer bitmap with _mm_movemask_ps. Especially if you ultimately want to branch on the result, this might be a good idea. (But x86 doesn't have a || instruction that booleanizes its inputs either so you can't just & the movemask results).
One thing you can do is integer multiply movemask results: x * y is non-zero iff both inputs are non-zero. Unlike x & y which can be false for 0b0101 &0b1010for example. (Our inputs are 4-bit movemask results andunsigned` is 32-bit so we have some room before we overflow). AMD Bulldozer family has an integer multiply that isn't fully pipelined so this could be a bottleneck on old AMD CPUs. Using just 32-bit integers is also good for some low-power CPUs with slow 64-bit multiply.
This might be good if throughput is more of a bottleneck than latency, although movmskps can only run on one port.
I'm not sure if there are any cheaper integer operations that let us recover the logical-AND result later. Adding doesn't work; the result is non-zero even if only one of the inputs was non-zero. Concatenating the bits together (shift+or) is also of course like an OR if we eventually just test for any non-zero bit. We can't just bitwise AND because 2 & 1 == 0, unlike 2 && 1.
Keeping it in the vector domain
Horizontal OR of 4 elements takes multiple steps.
The obvious way is _mm_movehl_ps + OR, then another shuffle+OR. (See Fastest way to do horizontal float vector sum on x86 but replace _mm_add_ps with _mm_or_ps)
But since we don't actually need an exact bitwise-OR when our inputs are compare results, we just care if any element is non-zero. We can and should think of the vectors as integer, and look at integer instructions like 64-bit element ==. One 64-bit element covers/aliases two 32-bit elements.
__m128i cmp = _mm_castps_si128(cmpps_result); // reinterpret: zero instructions
// SSE4.1 pcmpeqq 64-bit integer elements
__m128i cmp64 = _mm_cmpeq_epi64(cmp, _mm_setzero_si128()); // -1 if both elements were zero, otherwise 0
__m128i swap = _mm_shuffle_epi32(cmp64, _MM_SHUFFLE(1,0, 3,2)); // copy and swap, no movdqa instruction needed even without AVX
__m128i bothzero = _mm_and_si128(cmp64, swap); // both halves have the full result
After this logical inversion, ORing together multiple bothzero results will give you the AND of multiple conditions you're looking for.
Alternatively, SSE4.1 _mm_minpos_epu16(cmp64) (phminposuw) will tell us in 1 uop (but 5 cycle latency) if either qword is zero. It will place either 0 or 0xFFFF in the lowest word (16 bits) of the result in this case.
If we inverted the original compares, we could use phminposuw on that (without pcmpeqq) to check if any are zero. So basically a horizontal AND across the whole vector. (Assuming that it's elements of 0 / -1). I think that's a useful result for inverted inputs. (And saves us from using _mm_xor_si128 to flip the bits).
An alternative to pcmpeqq (_mm_cmpeq_epi64) would be SSE2 psadbw against a zeroed vector to get 0 or non-zero results in the bottom of each 64-bit element. It won't be a mask, though, it's 0xFF * 8. Still, it's always that or 0 so you can still AND it. And it doesn't invert.
HI I know there have been may question about this here but I wasn't able to find a detailed enough answer, Wikipedia has two examples of ISIN and how is their checksum calculated.
The part of calculation that I'm struggling with is
Multiply the group containing the rightmost character
The way I understand this statement is:
Iterate through each character from right to left
once you stumble upon a character rather than digit record its position
if the position is an even number double all numeric values in even position
if the position is an odd number double all numeric values in odd position
My understanding has to be wrong because there are at least two problems:
Every ISIN starts with two character country code so position of rightmost character is always the first character
If you omit the first two characters then there is no explanation as to what to do with ISINs that are made up of all numbers (except for first two characters)
Note
isin.org contains even less information on verifying ISINs, they even use the same example as Wikipedia.
I agree with you; the definition on Wikipedia is not the clearest I have seen.
There's a piece of text just before the two examples that explains when one or the other algorithm should be used:
Since the NSIN element can be any alpha numeric sequence (9 characters), an odd number of letters will result in an even number of digits and an even number of letters will result in an odd number of digits. For an odd number of digits, the approach in the first example is used. For an even number of digits, the approach in the second example is used
The NSIN is identical to the ISIN, excluding the first two letters and the last digit; so if the ISIN is US0378331005 the NSIN is 037833100.
So, if you want to verify the checksum digit of US0378331005, you'll have to use the "first algorithm" because there are 9 digits in the NSIN. Conversely, if you want to check AU0000XVGZA3 you're going to use the "second algorithm" because the NSIN contains 4 digits.
As to the "first" and "second" algorithms, they're identical, with the only exception that in the former you'll multiply by 2 the group of odd digits, whereas in the latter you'll multiply by 2 the group of even digits.
Now, the good news is, you can get away without this overcomplicated algorithm.
You can, instead:
Take the ISIN except the last digit (which you'll want to verify)
Convert all letters to numbers, so to obtain a list of digits
Reverse the list of digits
All the digits in an odd position are doubled and their digits summed again if the result is >= 10
All the digits in an even position are taken as they are
Sum all the digits, take the modulo, subtract the result from 0 and take the absolute value
The only tricky step is #4. Let's clarify it with a mini-example.
Suppose the digits in an odd position are 4, 0, 7.
You'll double them and get: 8, 0, 14.
8 is not >= 10, so we take it as it is. Ditto for 0. 14 is >= 10, so we sum its digits again: 1+4=5.
The result of step #4 in this mini-example is, therefore: 8, 0, 5.
A minimal, working implementation in Python could look like this:
import string
isin = 'US4581401001'
def digit_sum(n):
return (n // 10) + (n % 10)
alphabet = {letter: value for (value, letter) in
enumerate(''.join(str(n) for n in range(10)) + string.ascii_uppercase)}
isin_to_digits = ''.join(str(d) for d in (alphabet[v] for v in isin[:-1]))
isin_sum = 0
for (i, c) in enumerate(reversed(isin_to_digits), 1):
if i % 2 == 1:
isin_sum += digit_sum(2*int(c))
else:
isin_sum += int(c)
checksum_digit = abs(- isin_sum % 10)
assert int(isin[-1]) == checksum_digit
Or, more crammed, just for functional fun:
checksum_digit = abs( - sum(digit_sum(2*int(c)) if i % 2 == 1 else int(c)
for (i, c) in enumerate(
reversed(''.join(str(d) for d in (alphabet[v] for v in isin[:-1]))), 1)) % 10)
Okay I need to redraw the pascal's triangle and explain the Fibonacci sequence embedded in it.. And i need to observe over 12 rows of the triangle (which ends on the number 144 in the fibonacci sequence) -- I understand this part as i am just explaining how each row diagonally forms the sum of the Fibonacci numbers.
But I need to use the fact that the rth number in the nth row of the triangle is
C(n, r) = n!/r! n-r!
This last part is whats confusing me.. How can i use C(n,r) to explain the Fibonacci sequence in the triangle??
Please Help. Thanks
Consider the following problem :
In how many ways can you go up a ladder of n steps if you can take either a single step at a time or 2 steps at a time?
Solution 1 : Let's construct a recurrence relation for this problem. It's pretty clear that the recurrence would be something like this : a(n) = a(n-1) + a(n-2); where a(1)=1 and a(2)=2
Thus, the answer for n would be the (n+1)th fibonacci term.
Solution 2 : Each unique way of climbing up the ladder corresponds to a unique sequence of 1's and 2's which adds up to n. The number of such sequences thus would be our answer. Let's start counting such sequences :
Number of sequences without a 2 = $ {n \choose 0 } $.
Number of sequences with one 2 = $ {n-1 \choose 1 } $.
.
.
.
and so on.
In case of even n, the last term would be $ {n/2 \choose n/2 } $.
And for odd n, it would be $ {(n+1)/2 \choose (n-1)/2 } $.
As you can see, These are the diagonal terms in a pascal's triangle.
As these two solutions compute the same result, hence they must be equal. Thus we get the relation between Fibonacci numbers and the diagonals of a pascals triangle.
Refer the link
http://ms.appliedprobability.org/data/files/Articles%2033/33-1-5.pdf
for anymore doubts.
in fibonacci series let's assume nth fibonacci term is T. F(n)=T. but i want to write a a program that will take T as input and return n that means which term is it in the series( taken that T always will be a fibonacci number. )i want to find if there lies an efficient way to find it.
The easy way would be to simply start generating Fibonacci numbers until F(i) == T, which has a complexity of O(T) if implemented correctly (read: not recursively). This method also allows you to make sure T is a valid Fibonacci number.
If T is guaranteed to be a valid Fibonacci number, you can use approximation rules:
Formula
It looks complicated, but it's not. The point is: from a certain point on, the ratio of F(i+1)/F(i) becomes a constant value. Since we're not generating Fibonacci Numbers but are merely finding the "index", we can drop most of it and just realize the following:
breakpoint := f(T)
Any f(i) where i > T = f(i-1)*Ratio = f(T) * Ratio^(i-T)
We can get the reverse by simply taking Log(N, R), R being Ratio. By adjusting for the inaccuracy for early numbers, we don't even have to select a breakpoint (if you do: it's ~ correct for i > 17).
The Ratio is, approximately, 1.618034. Taking the log(1.618034) of 6765 (= F(20)), we get a value of 18.3277. The accuracy remains the same for any higher Fibonacci numbers, so simply rounding down and adding 2 gives us the exact Fibonacci "rank" (provided that F(1) = F(2) = 1).
The first step is to implement fib numbers in a non-recursive way such as
fib1=0;fib2=1;
for(i=startIndex;i<stopIndex;i++)
{
if(fib1<fib2)
{
fib1+=fib2;
if(fib1=T) return i;
if(fib1>T) return -1;
}
else
{
fib2+=fib1;
if(fib2=T) return i;
if(fib2>t) return -1;
}
}
Here startIndex would be set to 3 stopIndex would be set to 10000 or so. To cut down in the iteration, you can also select 2 seed number that are sequential fib numbers further down the sequence. startIndex is then set to the next index and do the computation with an appropriate adjustment to the stopIndex. I would suggest breaking the sequence up in several section depending on machine performance and the maximum expected input to minimize the run time.
I have many text files of this format
....
<snip>
'FOP' 0.19 1 24 1 25 7 8 /
'FOP' 0.18 1 24 1 25 9 11 /
/
TURX
560231
300244
70029
200250
645257
800191
900333
600334
770291
300335
220287
110262 /
SUBTRACT
'TURX' 'TURY'/
</snip>
......
where the portions I snipped off contain other various data in various formats. The file format is inconsistent (machine generated), the only thing one is assured of is the keyword TURX which may appear more than once. If it appears alone on one line, then the next few lines will contain numbers that I need to fetch into an array. The last number will have a space then a forward slash (/). I can then use this array in other operations afterwards.
How do I "search" or parse a file of unknown format in fortran, and how do I get a loop to fetch the rest of the data, please? I am really new to this and I HAVE to use fortran. Thanks.
Fortran 95 / 2003 have a lot of string and file handling features that make this easier.
For example, this code fragment to process a file of unknown length:
use iso_fortran_env
character (len=100) :: line
integer :: ReadCode
ReadLoop: do
read (75, '(A)', iostat=ReadCode ) line
if ( ReadCode /= 0 ) then
if ( ReadCode == iostat_end ) then
exit ReadLoop
else
write ( *, '( / "Error reading file: ", I0 )' ) ReadCode
stop
end if
end if
! code to process the line ....
end do ReadLoop
Then the "process the line" code can contain several sections depending on a logical variable "Have_TURX". If Have_TRUX is false you are "seeking" ... test whether the line contains "TURX". You could use a plain "==" if TURX is always at the start of the string, or for more generality you could use the intrinsic function "index" to test whether the string "line" contains TURX.
Once the program is in the mode Have_TRUX is true, then you use "internal I/O" to read the numeric value from the string. Since the integers have varying lengths and are left-justified, the easiest way is to use "list-directed I/O": combining these:
read (line, *) integer_variable
Then you could use the intrinsic function "index" again to test whether the string also contains a slash, in which case you change Have_TRUX to false and end reading mode.
If you need to put the numbers into an array, it might be necessary to read the file twice, or to backspace the file, because you will have to allocate the array, and you can't do that until you know the size of the array. Or you could pop the numbers into a linked list, then when you hit the slash allocate the array and fill it from the linked list. Or if there is a known maximum number of values you could use a temporary array, then transfer the numbers to an allocatable output array. This is assuming that you want the output argument of the subroutine be an allocatable array of the correct length, and the it returns one group of numbers per call:
integer, dimension (:), allocatable, intent (out) :: numbers
allocate (numbers (1: HowMany) )
P.S. There is a brief summary of the language features at http://en.wikipedia.org/wiki/Fortran_95_language_features and the gfortran manual has a summary of the intrinsic procedures, from which you can see what built in functions are available for string handling.
I'll give you a nudge in the right direction so that you can finish your project.
Some basics:
Do/While as you'll need some sort of loop
structure to loop through the file
and then over the numbers. There's
no for loop in Fortran, so use this
type.
Read
to read the strings.
To start you need something like this:
program readlines
implicit none
character (len=30) :: rdline
integer,dimension(1000) :: array
! This sets up a character array with 30 positions and an integer array with 1000
!
open(18,file='fileread.txt')
do
read(18,*) rdline
if (trim(rdline).eq.'TURX') exit !loop until the trimmed off portion matches TURX
end do
See this thread for way to turn your strings into integers.
Final edit: Looks like MSB has got most of what I just found out. The iostat argument of the read is the key to it. See this site for a sample program.
Here was my final way around it.
PROGRAM fetchnumbers
implicit none
character (len=50) ::line, numdata
logical ::is_numeric
integer ::I,iost,iost2,counter=0,number
integer, parameter :: long = selected_int_kind(10)
integer, dimension(1000)::numbers !Can the number of numbers be up to 1000?
open(20,file='inputfile.txt') !assuming file is in the same location as program
ReadLoop: do
read(20,*,iostat=iost) line !read data line by line
if (iost .LT. 0) exit !end of file reached before TURX was found
if (len_trim(line)==0) cycle ReadLoop !ignore empty lines
if (index(line, 'TURX').EQ.1) then !prepare to begin capturing
GetNumbers: do
read(20, *,iostat=iost2)numdata !read in the numbers one by one
if (.NOT.is_numeric(numdata)) exit !no more numbers to read
if (iost2 .LT. 0) exit !end of file reached while fetching numbers
read (numdata,*) number !read string value into a number
counter = counter + 1
Storeloop: do I =1,counter
if (I<counter) cycle StoreLoop
numbers(counter)=number !storing data into array
end do StoreLoop
end do GetNumbers
end if
end do ReadLoop
write(*,*) "Numbers are:"
do I=1,counter
write(*,'(I14)') numbers(I)
end do
END PROGRAM fetchnumbers
FUNCTION is_numeric(string)
IMPLICIT NONE
CHARACTER(len=*), INTENT(IN) :: string
LOGICAL :: is_numeric
REAL :: x
INTEGER :: e
is_numeric = .FALSE.
READ(string,*,IOSTAT=e) x
IF (e == 0) is_numeric = .TRUE.
END FUNCTION is_numeric