Multiply bytes to produce 16-bits, without shifting - sse

Still learning the art of SIMD, I have a question: I have two registers of packed 8-bit values that I'd like to multiply-add with _mm_maddubs_epi16 (pmaddubsw) to obtain a register of packed 16-bit values.
I know that these bytes will always produce a number less than 256, so I'd like to avoid wasting the remaining 8 bits. For instance, _mm_maddubs_epi16(v1, v2) should write its result in r where XX is, not where it would normally go (denoted with __).
v1 (04, 00, 0e, 00, 04, 00, 04, 00, 0a, 00, 0f, 00, 05, 00, 01, 00)
v2 (04, 00, 0e, 00, 04, 00, 04, 00, 0a, 00, 0f, 00, 05, 00, 01, 00)
r (__, XX, __, XX, __, XX, __, XX, __, XX, __, XX, __, XX, __, XX)
Can I do this without shifting the result?
PS. I don't have a nice processor, I am limited to AVX instructions.

In your vector diagram, is the highest element at the left or the right? Are the XX locations in the most or least significant byte of the pmaddubsw result?
To get results in the low byte of a word, from inputs in the high byte of each word:
Use _mm_mulhi_epu16 so you're effectively doing (v1 << 8) * (v2 << 8) >> 16, producing the result in the opposite byte from the input words. Since you say the product is strictly less than 256, you'll get an 8-bit result in the low byte of each 16-bit word.
(If your inputs are signed, use _mm_mulhi_epi16, but then a negative result would be sign-extended to the full 16 bits.)
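The arithmetic behind the _mm_mulhi_epu16 suggestion can be checked with a small scalar model. This is only a sketch in Python, not SIMD code: the helper mulhi_epu16 is a hypothetical stand-in for one 16-bit lane of the intrinsic, and the byte values are taken from the question's v1 and v2.

```python
def mulhi_epu16(a, b):
    """Model one 16-bit lane of _mm_mulhi_epu16: the high half of the 32-bit product."""
    return ((a * b) >> 16) & 0xFFFF

# Byte values from the question; each one sits in the HIGH byte of a 16-bit word,
# so the word value is (byte << 8).
v1_bytes = [0x04, 0x0E, 0x04, 0x04, 0x0A, 0x0F, 0x05, 0x01]
v2_bytes = list(v1_bytes)

# (a << 8) * (b << 8) == (a * b) << 16, so the high 16 bits are exactly a * b.
results = [mulhi_epu16(a << 8, b << 8) for a, b in zip(v1_bytes, v2_bytes)]

for a, b, r in zip(v1_bytes, v2_bytes, results):
    print(f"{a:#04x} * {b:#04x} -> word {r:#06x}")
```

Because every product here is below 256, each result word has its value in the low byte and zero in the high byte.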
To get results in the high byte of a word, from inputs in the low byte
You'll need to change how you load / create one of the inputs so instead of
MSB LSB | MSB LSB
v1_lo (00, 04, 00, 0e, 00, 04, 00, 04, 00, 0a, 00, 0f, 00, 05, 00, 01)
element# 15 14 13 12 ... 0
you have this (both diagrams use Intel's notation, where the leftmost element is the highest-numbered, so vector shifts like _mm_slli_si128 shift bytes to the left in the diagram):
MSB LSB | MSB LSB
v1_hi (04, 00, 0e, 00, 04, 00, 04, 00, 0a, 00, 0f, 00, 05, 00, 01, 00)
element# 15 14 13 12 ... 0
With v2 still having its non-zero bytes in the low half of each word element, simply use _mm_mullo_epi16(v1_hi, v2), and you'll get (v1 * v2) << 8 for free.
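A scalar model of the low-half multiply shows why the product lands in the high byte. This is a Python sketch, not SIMD code: mullo_epi16 is a hypothetical stand-in for one lane of _mm_mullo_epi16, and it assumes one input has its value in the high byte and the other in the low byte of each word, as the section heading describes.

```python
def mullo_epi16(a, b):
    """Model one 16-bit lane of _mm_mullo_epi16: the low half of the product."""
    return (a * b) & 0xFFFF

byte_values = [0x04, 0x0E, 0x04, 0x04, 0x0A, 0x0F, 0x05, 0x01]

v1_hi = [b << 8 for b in byte_values]  # value in the high byte, low byte zero
v2_lo = list(byte_values)              # value in the low byte, high byte zero

words = [mullo_epi16(a, b) for a, b in zip(v1_hi, v2_lo)]

# (a << 8) * b == (a * b) << 8; with a * b < 256 nothing overflows the word.
high_bytes = [w >> 8 for w in words]
low_bytes = [w & 0xFF for w in words]
print(high_bytes, low_bytes)
```

If both inputs had their values in the high byte, the product would be shifted left by 16 and the low half would always be zero, which is why one operand has to stay in the low byte.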
If you're already unpacking bytes with zeros to obtain v1 and v2, then unpack the other way. If you were using pmovzx (_mm_cvtepu8_epi16), then switch to using _mm_unpacklo_epi8(_mm_setzero_si128(), packed_v1 ).
If you were loading these vectors from memory in this already-zero-padded form, use an unaligned load offset by 1 byte so the zeros end up in the opposite location.
If what you really want is to start with input bytes that aren't already unpacked with zeros, I don't think you can avoid that. Or if you're masking instead of unpacking (to save shuffle-port throughput by using _mm_and_si128 instead), you're probably going to need a shift somewhere. You can shift instead of masking one way, though, using v1_hi = _mm_slli_epi16(v, 8): a left-shift by 8 with word granularity will leave the low byte zeroed.

Shift v1 or v2 and then use _mm_mullo_epi16().
Possible XY Problem? My guess is that _mm_unpacklo_epi8() and _mm_packus_epi16() may be useful for you.


Savitzky - Golay filter for 2D Matrices

I am doing some research about implementing a Savitzky-Golay filter for images. As far as I have read, the main application for this filter is signal processing, e.g. for smoothing audio files.
The idea is to fit a polynomial through a defined neighbourhood around a point P(i) and to set this point P to its new value P_new(i) = polynomial(i).
The problem in 2D-space is - in my opinion - that there is not only one direction to do the fitting. You can use different "directions" to find a polynomial. Like for
[51 52 11 33 34]
[41 42 12 24 01]
[01 02 PP 03 04]
[21 23 13 43 44]
[31 32 14 53 54]
It could be:
[01 02 PP 03 04], (horizontal)
[11 12 PP 23 24], (vertical)
[51 42 PP 43 54], (diagonal)
[41 42 PP 43 44], (semi-diagonal?)
but also
[41 02 PP 03 44], (semi-diagonal as well)
(see my illustration)
So my question is: Does the Savitzky-Golay filter even make sense in 2D space, and if yes, is there any defined generalized form of this filter for higher dimensions and larger filter masks?
Thank you !
A first option is to use SG filtering in a separable way, i.e. filtering once along the horizontal rows, then a second time along the vertical columns.
A second option is to rewrite the equations with a bivariate polynomial (bicubic f.i.) and solve for the coefficients by least-squares.
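The separable option can be sketched in plain Python. The 5-point kernel (-3, 12, 17, 12, -3)/35 is the standard Savitzky-Golay smoothing kernel for a quadratic or cubic fit. The function names and the choice to leave border samples untouched are assumptions made for this illustration.

```python
# 5-point Savitzky-Golay smoothing weights (quadratic/cubic fit).
SG5 = [-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35]

def smooth_1d(values, kernel=SG5):
    """Smooth interior samples; border samples are left unchanged (an assumption)."""
    half = len(kernel) // 2
    out = list(values)
    for i in range(half, len(values) - half):
        out[i] = sum(k * values[i + j - half] for j, k in enumerate(kernel))
    return out

def smooth_2d_separable(image):
    """Apply the 1D filter along every row, then along every column."""
    rows = [smooth_1d(row) for row in image]
    cols = [smooth_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]  # transpose back to row-major

# A low-order polynomial surface should pass through the filter unchanged.
image = [[float(x * x + 2 * y + x * y) for x in range(7)] for y in range(7)]
smoothed = smooth_2d_separable(image)
```

Running the filter along rows and then columns does not cover diagonal neighbourhoods explicitly. The bivariate least-squares fit from the second option is the way to treat all directions at once.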

How can I convert a GUID into a byte array in Ruby?

In order to save data traffic we want to send our GUIDs as an array of bytes instead of as a string (using Google Protocol Buffers).
How can I convert a string representation of a GUID in Ruby to an array of bytes:
Example:
Guid: 35918bc9-196d-40ea-9779-889d79b753f0
=> Result: C9 8B 91 35 6D 19 EA 40 97 79 88 9D 79 B7 53 F0
In .NET this seems to be natively implemented:
http://msdn.microsoft.com/en-us/library/system.guid.tobytearray%28v=vs.110%29.aspx
Your example GUID is in a Microsoft specific format. From Wikipedia:
Other systems, notably Microsoft's marshalling of UUIDs in their COM/OLE libraries, use a mixed-endian format, whereby the first three components of the UUID are little-endian, and the last two are big-endian.
So in order to get that result, we have to move the bytes around a little. Specifically, we have to change the endianness of the first three components. Let's start by breaking the GUID string apart:
guid = '35918bc9-196d-40ea-9779-889d79b753f0'
parts = guid.split('-')
#=> ["35918bc9", "196d", "40ea", "9779", "889d79b753f0"]
We can convert these hex-strings to binary via:
mixed_endian = parts.pack('H* H* H* H* H*')
#=> "5\x91\x8B\xC9\x19m@\xEA\x97y\x88\x9Dy\xB7S\xF0"
Next let's swap the first three parts:
big_endian = mixed_endian.unpack('L< S< S< a*').pack('L> S> S> a*')
#=> "\xC9\x8B\x915m\x19\xEA@\x97y\x88\x9Dy\xB7S\xF0"
L denotes a 32-bit unsigned integer (1st component)
S denotes a 16-bit unsigned integer (2nd and 3rd component)
< and > denote little-endian and big-endian, respectively
a* treats the remaining bytes as an arbitrary binary string (we don't have to convert these); unlike A*, it does not strip trailing spaces or null bytes when unpacking
If you prefer an array of bytes instead of a binary string, you'd just use:
big_endian.bytes
#=> [201, 139, 145, 53, 109, 25, 234, 64, 151, 121, 136, 157, 121, 183, 83, 240]
PS: if your actual GUID isn't Microsoft specific, you can skip the swapping part.
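As a cross-check outside Ruby, the same byte order can be reproduced in Python. This sketch uses the standard library's uuid.UUID.bytes_le, which returns the first three fields in little-endian order, plus a manual struct round trip that mirrors the unpack/pack step. The variable names are chosen only for this illustration.

```python
import struct
import uuid

guid = "35918bc9-196d-40ea-9779-889d79b753f0"

# Standard library: first three fields little-endian, the rest unchanged.
ms_bytes = uuid.UUID(guid).bytes_le

# Manual equivalent: read the fields big-endian, write them back little-endian.
raw = bytes.fromhex(guid.replace("-", ""))
field1, field2, field3, rest = struct.unpack(">IHH8s", raw)
manual = struct.pack("<IHH8s", field1, field2, field3, rest)

hex_view = " ".join(f"{b:02X}" for b in ms_bytes)
print(hex_view)
```

Both routes should agree with the byte sequence given in the question, which helps confirm that only the first three components are swapped.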

How to compute memory displacement in assembly?

I've been working on yasm assembly language and I generated a listing file that contains the following. I need help understanding how the memory displacement is computed in the first column. Thanks in advance.
1 %line 1+1 memory.asm
2 [section .data]
3 00000000 04000000 a dd 4
4 00000004 CDCC8C40 b dd 4.4
5 00000008 00000000<rept> c times 10 dd 0
6 00000030 01000200 d dw 1, 2
7 00000034 FB e db 0xfb
8 00000035 68656C6C6F20776F72- f db "hello world", 0
9 00000035 6C6400
The assembler produces bytes (data or machine code), starting at some start address (here 0) and laying them out one after another. So first a dd 4 produces 4 bytes of data, 04 00 00 00, filling addresses 0, 1, 2 and 3. The next free slot is at address 4. That is where b dd 4.4 goes, again 4 bytes long, so c starts at 8. c times 10 dd 0 is 40 bytes long, so 8+40 = 48 (0x30) is the next free slot. d dw 1, 2 emits two 2-byte words, putting e at 0x34, and the single byte of e puts f at 0x35.
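The running total can be reproduced with a short Python sketch. It is an illustration of the bookkeeping, not of how yasm works internally: each entry is encoded with struct using little-endian layouts that correspond to dd, dw and db, and the offset of each label is the number of bytes emitted before it.

```python
import struct

# (label, encoded bytes) in source order, little-endian as on x86.
items = [
    ("a", struct.pack("<i", 4)),         # dd 4          -> 4 bytes
    ("b", struct.pack("<f", 4.4)),       # dd 4.4        -> 4 bytes (float32)
    ("c", struct.pack("<i", 0) * 10),    # times 10 dd 0 -> 40 bytes
    ("d", struct.pack("<2h", 1, 2)),     # dw 1, 2       -> 4 bytes
    ("e", bytes([0xFB])),                # db 0xfb       -> 1 byte
    ("f", b"hello world\x00"),           # db "hello world", 0 -> 12 bytes
]

offsets = {}
position = 0
for label, data in items:
    offsets[label] = position   # the label's displacement is the current position
    position += len(data)       # advance by the size of what was emitted

for label, data in items:
    print(f"{offsets[label]:08X} {data[:4].hex().upper():<8} {label}")
```

The printed offsets match the first column of the listing, and the encoded bytes for a and b match its second column.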

How to create an "on/off" graph with HighCharts?

I've read the documentation quite a few times, but I just can't seem to find a way to make a graph like this. Perhaps it's because I don't know what it's called, so I'm not even sure what to look for. Let me try to explain what I'm trying to do.
Normally if you have a series of points like this:
3 May, 5:00 PM ---> 0
3 May, 5:20 PM ---> 3
4 May, 5:00 PM ---> 0
4 May, 5:20 PM ---> 3
If you make a standard LINE GRAPH, Highcharts will plot the values as INCREASING gradually between the two points. So I end up with this:
But the problem is, the values being shown are actually values changing at a point in time. In other words, what I want is this:
And even more importantly, it seems the spacing between times isn't correct. You'll notice that it creates a perfect zigzag, even though the time between the first and second points is 20 minutes (5 PM to 5:20 PM), and the time between the second and third points is 23 hours and 40 minutes (3 May 5:20 PM to 4 May 5 PM). So what I really want is this:
Any idea what a graph like this is called?
Any idea how to make it using HighCharts?
UPDATE
The only solution I can think of right now, is to fake points between the real points. so for example if the value is 0 at 5PM and turns to 3 at 5:20 PM, then I will add 19 points in between these two. So at 5:01 I will make it 0, and 5:02 I will also make it 0, and 5:03 etc. Until 5:19. But even this method will result in a SLIGHTLY skewed line going up from 5:19 to 5:20. Which is what I'm actually trying to avoid.
Any ideas?
UPDATE 2
The "step : left" solution has definitely solved half of my problem, but for some reason I still have this:
You should now see that even though I have steps, they are not quite making the expected spacing. For 17:13 on 5 May, I expect the graph to be closer to the 6 May mark, than to the 5 May mark.
Any ideas as to why this is happening?
UPDATE 3
I created a JSFiddle for my problem: https://jsfiddle.net/coderama/ubz7m0Lh/4/
UPDATE 4
Based on wergeld's input, it seems using "ordinal" on the x axis is the way to go --> http://api.highcharts.com/highstock#xAxis.ordinal
But it produces a pretty weird graph: https://jsfiddle.net/coderama/6tz8h53x/1/
I'll keep looking, but at least it feels like there's progress being made!
What you are looking for is the step option. You can set up something like:
$(function() {
  $('#container').highcharts({
    title: {
      text: 'Step line types, with null values in the series'
    },
    xAxis: {
      type: 'datetime',
      tickInterval: 86400000 // one day in milliseconds
    },
    series: [{
      // Date.UTC months are zero-based, so 4 means May.
      // Plain 4 and 0 replace 04 and 00, which are legacy octal literals
      // and a syntax error in strict mode.
      data: [
        [Date.UTC(2016, 4, 3, 17, 0), 0],
        [Date.UTC(2016, 4, 3, 20, 0), 3],
        [Date.UTC(2016, 4, 4, 17, 0), 0],
        [Date.UTC(2016, 4, 5, 18, 0), 3],
        [Date.UTC(2016, 4, 5, 19, 0), 0],
        [Date.UTC(2016, 4, 6, 20, 0), 3],
        [Date.UTC(2016, 4, 7, 17, 0), 0]
      ],
      step: 'left'
    }]
  });
});
The step parameter tells Highcharts how to go from your given point to the next point.
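To picture what a stepped line does with the data, here is a small Python sketch. It is not Highcharts code and makes no claim about the library's internals. It assumes the behaviour the question asks for, where a value holds until the next timestamp and then changes instantly. The function name and the use of minutes as x values are choices made for this illustration.

```python
def step_path(points):
    """Expand (x, y) points into the vertices of a stepped line.

    Assumption: each value is held until the next point's x, then jumps.
    """
    path = []
    for i, (x, y) in enumerate(points):
        if i > 0:
            previous_y = points[i - 1][1]
            path.append((x, previous_y))  # horizontal run up to the new x
        path.append((x, y))               # vertical jump to the new value
    return path

# Minutes since 3 May 17:00: 17:00 -> 0, 17:20 -> 3, next day 17:00 -> 0, 17:20 -> 3.
points = [(0, 0), (20, 3), (1440, 0), (1460, 3)]
vertices = step_path(points)
print(vertices)
```

This is also why faking intermediate points is unnecessary: the extra vertex is placed at exactly the new timestamp, so there is no slanted segment between 5:19 and 5:20.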

SPSS automated variable labels with leading zeros

What I am struggling with is adapting one of the syntax macros from http://www.spsstools.net/.
It was intended to change the labels of "many-many" variables that do not have leading zeros, but my variables do have them:
DATA LIST LIST /id.
BEGIN DATA
1
END DATA.
NUMERIC set01sub1 TO set01sub4.
* but the intended variable names are set01sub01 TO set01sub04 (with leading zeros and going over 10).
SET MPRINT=yes.
DEFINE !label (lab=!TOKENS(1) /stem=!TOKENS(1) /nb1=!TOKENS(1) /nb2=!TOKENS(1))
!DO !cnt=!nb1 !TO !nb2
!LET !var=!CONCAT(!stem,!cnt)
!LET !labe=!QUOTE(!CONCAT(!UNQUOTE(!lab),!cnt))
VARIABLE LABEL !var !labe.
!DOEND.
!ENDDEFINE.
!label lab='Set 1, subset ' stem=set01sub nb1=1 nb2=4.
I was very naive and I have tried to use !STRING(...,N2):
!LET !labe=!QUOTE(!CONCAT(!UNQUOTE(!lab),!STRING(!cnt,N2)))
but, this didn't work as expected
my variables are
subID
rvnAns_s01m01 TO rvnAns_s01m12
rvnAns_s02m01 TO rvnAns_s02m36
rvnAns_s03m01 TO rvnAns_s03m36
rvnEva_s01m01 TO rvnEva_s01m12
rvnEva_s02m01 TO rvnEva_s02m36
rvnEva_s03m01 TO rvnEva_s03m36
and the intended labels are:
"Subject ID"
"RAPM, Series 01, Matrix 01 answer"
"RAPM, Series 01, Matrix 02 answer"
...
"RAPM, Series 01, Matrix 12 answer"
"RAPM, Series 02, Matrix 01 answer"
"RAPM, Series 02, Matrix 02 answer"
...
"RAPM, Series 02, Matrix 36 answer"
"RAPM, Series 03, Matrix 01 answer"
"RAPM, Series 03, Matrix 02 answer"
...
"RAPM, Series 03, Matrix 36 answer"
and
"RAPM, Series 01, Matrix 01 answer evaluation"
"RAPM, Series 01, Matrix 02 answer evaluation"
...
"RAPM, Series 01, Matrix 12 answer evaluation"
"RAPM, Series 02, Matrix 01 answer evaluation"
"RAPM, Series 02, Matrix 02 answer evaluation"
...
"RAPM, Series 02, Matrix 36 answer evaluation"
"RAPM, Series 03, Matrix 01 answer evaluation"
"RAPM, Series 03, Matrix 02 answer evaluation"
...
"RAPM, Series 03, Matrix 36 answer evaluation"
I would be very grateful for any help or suggestions on how to achieve such result.
If you install the Python Essentials via the SPSS Community website (www.ibm.com/developerworks/spssdevcentral), the following program will convert the variable names.
It makes three assumptions:
1) None of the names has a form with just a leading zero, e.g., x0y1. (That could be addressed with a little more complexity.)
2) None of the renames will result in a name collision.
3) None of the expanded names will exceed the maximum length for a name (64 bytes).
Explanation below the program.
begin program.
import spss, re

for v in range(spss.GetVariableCount()):
    vname = spss.GetVariableName(v)
    # Insert a 0 before a lone nonzero digit that follows a nondigit;
    # (?!\d) leaves multi-digit numbers such as 10 alone.
    vnamenew = re.sub(r"(\D)([1-9])(?!\d)", r"\g<1>0\g<2>", vname)
    if vname != vnamenew:
        spss.Submit("rename variables (%s=%s)" % (vname, vnamenew))
        print("%s -> %s" % (vname, vnamenew))
end program.
This program iterates through all the variable names. For each one, it looks for every occurrence of a nondigit followed by a lone nonzero digit, replaces it with nondigit-0-digit, and then generates and runs a rename variables command.
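The renaming pattern can be tried outside SPSS before any variables are renamed. This Python sketch uses only the re module. The helper name and the sample variable names are made up for illustration, and the negative lookahead (?!\d) is a variation added here so that numbers of 10 and above are not padded a second time.

```python
import re

# A nondigit, then a single nonzero digit that is not followed by another digit.
PATTERN = re.compile(r"(\D)([1-9])(?!\d)")

def pad_single_digits(name):
    """Insert a leading zero before every lone nonzero digit in a name."""
    return PATTERN.sub(r"\g<1>0\g<2>", name)

names = ["set01sub1", "set01sub9", "set01sub10", "id"]
renamed = [pad_single_digits(n) for n in names]

for old, new in zip(names, renamed):
    print(old, "->", new)
```

Without the lookahead, the two-group pattern would also match the first digit of 10 and turn set01sub10 into set01sub010.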
As I said in my comment on crossvalidated, your code does work for the given sample if you supply the stem token with the leading zero, e.g. stem=set01sub0.
If you have ranges that go past 9, though, I presume you won't want a leading zero for values of 10 and above. Below is an example of the macro that uses conditional evaluation to prepend a leading zero for values below 10. If your counters can go higher still (e.g. into the 100s, so that single digits need two leading zeros), this would need to be amended.
DATA LIST LIST /id.
BEGIN DATA
1
END DATA.
NUMERIC set01sub01 TO set01sub15.
DEFINE !label (lab=!TOKENS(1) /stem=!TOKENS(1) /nb1=!TOKENS(1) /nb2=!TOKENS(1))
!DO !cnt=!nb1 !TO !nb2
!IF (!LENGTH(!cnt) = 1) !THEN
!LET !cnt0 = !CONCAT("0",!cnt)
!ELSE
!LET !cnt0 = !cnt
!IFEND
!LET !var=!CONCAT(!stem,!cnt0)
!LET !labe=!QUOTE(!CONCAT(!UNQUOTE(!lab),!cnt0))
VARIABLE LABEL !var !labe.
!DOEND.
!ENDDEFINE.
PRESERVE.
SET MPRINT ON.
!label lab='Set 1, subset ' stem=set01sub nb1=1 nb2=15.
RESTORE.
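The conditional padding in the macro can be mirrored in a few lines of Python to preview which variable names and labels it would generate. This is only a sketch of the macro's logic, not SPSS syntax. The function name make_labels is hypothetical, and its parameters follow the macro arguments lab, stem, nb1 and nb2.

```python
def make_labels(lab, stem, nb1, nb2):
    """Return {variable name: label}, padding one-digit counters with a zero."""
    labels = {}
    for cnt in range(nb1, nb2 + 1):
        cnt0 = str(cnt)
        if len(cnt0) == 1:       # same test as !LENGTH(!cnt) = 1 in the macro
            cnt0 = "0" + cnt0
        labels[stem + cnt0] = lab + cnt0
    return labels

labels = make_labels("Set 1, subset ", "set01sub", 1, 15)
for name, label in labels.items():
    print(name, "->", label)
```

For counters that can reach three digits, a fixed-width format such as f"{cnt:03d}" would be the Python equivalent of amending the macro for two leading zeros.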
