I'm stuck on exercise 3.56 in SICP. The problem goes like this:
Exercise 3.56. A famous problem, first raised by R. Hamming, is to enumerate, in ascending order with no repetitions, all positive integers with no prime factors other than 2, 3, or 5. One obvious way to do this is to simply test each integer in turn to see whether it has any factors other than 2, 3, and 5. But this is very inefficient, since, as the integers get larger, fewer and fewer of them fit the requirement. As an alternative, let us call the required stream of numbers S and notice the following facts about it.
S begins with 1.
The elements of (scale-stream S 2) are also elements of S.
The same is true for (scale-stream S 3) and (scale-stream 5 S).
These are all the elements of S.
Now all we have to do is combine elements from these sources. For this we define a procedure merge that combines two ordered streams into one ordered result stream, eliminating repetitions:
(define (merge s1 s2)
  (cond ((stream-null? s1) s2)
        ((stream-null? s2) s1)
        (else
         (let ((s1car (stream-car s1))
               (s2car (stream-car s2)))
           (cond ((< s1car s2car)
                  (cons-stream s1car (merge (stream-cdr s1) s2)))
                 ((> s1car s2car)
                  (cons-stream s2car (merge s1 (stream-cdr s2))))
                 (else
                  (cons-stream s1car
                               (merge (stream-cdr s1)
                                      (stream-cdr s2)))))))))
Then the required stream may be constructed with merge, as follows:
(define S (cons-stream 1 (merge <??> <??>)))
Fill in the missing expressions in the places marked above.
Before this particular problem, I've been able to visualize and understand these implicit stream definitions using a signal processing block diagram with the original stream being fed back to the procedure.
But I've hit a wall with this particular problem. I've looked up the solution, but I'm finding it impossible to visualize how it would work, either in my head or on paper.
Is there a trick for understanding and coming up with solutions to this sort of problem?
This is the solution that works:
(define S
  (cons-stream 1 (merge (scale-stream S 2)
                        (merge (scale-stream S 3)
                               (scale-stream S 5)))))
Thanks in advance.
As a matter of proper naming, merge shouldn't be removing duplicates: its name suggests it is part of mergesort, which ought to preserve them. Union is a better name for this operation. It treats its arguments as sets, represented here by increasing lists of unique numbers, and it ought to preserve that constraint by removing the duplicates, which can only arise when the same number comes from both of its arguments.
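To make the naming point concrete, here is a minimal sketch in Python (used purely for illustration, not Scheme). `heapq.merge` from the standard library behaves like the merge step of mergesort and keeps duplicates, while the hypothetical helper `union` below does what the book's merge does on increasing, duplicate-free inputs.

```python
import heapq

def union(xs, ys):
    """Combine two increasing, duplicate-free iterables into one increasing
    sequence, emitting a value only once when it appears in both."""
    xs, ys = iter(xs), iter(ys)
    done = object()                      # sentinel marking an exhausted input
    x, y = next(xs, done), next(ys, done)
    while x is not done and y is not done:
        if x < y:
            yield x
            x = next(xs, done)
        elif y < x:
            yield y
            y = next(ys, done)
        else:                            # equal heads: emit once, advance both
            yield x
            x, y = next(xs, done), next(ys, done)
    while x is not done:                 # drain whichever input is left
        yield x
        x = next(xs, done)
    while y is not done:
        yield y
        y = next(ys, done)

print(list(heapq.merge([1, 2, 3], [2, 3, 4])))  # [1, 2, 2, 3, 3, 4]
print(list(union([1, 2, 3], [2, 3, 4])))        # [1, 2, 3, 4]
```

Note that this sketch looks at the heads of both inputs before producing anything, so it is only meant to show the difference in duplicate handling, not the laziness the stream version relies on.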
Back to the problem itself, let's write it symbolically as
S235 = {1} ∪ 2*S235 ∪ 3*S235 ∪ 5*S235
Premature implementation is the mother of all evil! (wait, what?) We won't yet try to establish exactly how those ∪s do their job, or even in which order. Or even how many terms there are:
S23 = {1} ∪ 2*S23 ∪ 3*S23
or even
S2 = {1} ∪ 2*S2
Now this looks simple enough. We can even fake-implement the union of A and B here simply as first taking all the elements of A, and then those of B. And it will work just fine here, because there's only one element in this ∪'s left input:
 {1} ----∪-->--->--S₂--.--->S₂
        /               \
        \______*2_______/
          ---<----<---
How does this work? 1 enters the ∪ combiner and exits it first, unconditionally. (NB: this discovered requirement is important, for if ∪ had to examine both of its arguments right away we'd get ourselves an infinite loop, a black hole in Haskell argot.) It is then split in two by the . splitter: the first copy of 1 continues on to the output point, while the second copy goes back through the *2 multiplier. The resulting 2 re-enters the ∪, this time on the right, unopposed by anything on the left (which is by now empty), and continues on in the same fashion. So 2 goes to the output point, then 4, then 8, and so on.
To put it differently, S₂ contains all elements of {1}; plus all elements of {1} that went through the *2 multiplier once; and twice; and three times; and so on and so forth -- all the powers of 2 in increasing order:
S2 = {1} ∪ 2*{1} ∪ 2*2*{1}                ;; == {1, 2, 4, 8, 16, 32, ...}
                 ∪ 2*2*2*{1}
                 ∪ 2*2*2*2*{1}
                 ∪ ..........
The two S₂'s in the diagram are the same because whatever we siphon from it at the splitter point does not affect it.
Wasn't this fun?
So how do we go about adding the multiples of 3 to it? One way to do it is
S23 = S2 ∪ 3*S23
 {1} ----∪-->--->--S₂--.---S₂----∪-->--->--S₂₃--.--->S₂₃
        /               \       /                \
        \______*2_______/       \______*3________/
          ---<----<---             ---<----<---
Here 1 from S₂ enters the second ∪ combiner and proceeds to the output point S₂₃ as well as back through the *3 multiplier, turning into 3. Now the second ∪ has 2,4,8,... and 3,... as its inputs; 2 goes through as well as turning into 6. Next, ∪ has 4,8,16,... and 3,6,...; 3 goes through. Next, 4; etc., etc., and so on and so forth.
Thus all elements of S₂ are part of S₂₃, but so are also all elements of S₂ that went through the *3 multiplier once, and twice, etc., -- all the powers of 2 and 3 multiplied together, in increasing order:
S23 = S2 ∪ 3*S2 ∪ 3*3*S2                ;; = S2 ∪ 3*( S2 ∪ 3*S2
                ∪ 3*3*3*S2              ;;               ∪ 3*3*S2
                ∪ 3*3*3*3*S2            ;;               ∪ 3*3*3*S2
                ∪ ..........            ;;               ∪ ........ )   !!
Why the increasing order? How? Why, that is the responsibility of ∪! Hello, another discovered requirement. Whatever enters it on either side, it must produce the smaller element before the larger one.
And what is it to do in the event the two are equal? Do we even need to concern ourselves with this question in this here scheme? Can this ever happen, here?
It can't. And so we can implement the ∪ here as a merge, not as a union (but remember the first discovered requirement! -- is it still valid? needed? with the addition of new cases). Merge ought to be more efficient than union as it doesn't concern itself with the case of equals.
And for the multiples of 5 also? We continue, as
S235 = S23 ∪ 5*S235
 {1} ----∪-->--->--S₂--.---S₂----∪-->--->--S₂₃--.---S₂₃----∪-->--->--S₂₃₅--.--->S₂₃₅
        /               \       /                \        /                 \
        \______*2_______/       \______*3________/        \_______*5________/
          ---<----<---             ---<----<---              ---<----<---
Does this describe the code from the book? _______
Does this describe code which is about twice as fast as the code from the book? _______
Why is it about twice as fast as the code from the book? _______
Does this answer your question? _______
Does this help you answer your question? _______
(fill in the blanks).
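The three-stage diagram can be tried out with a small Python sketch. The helper name `feedback_merge` is hypothetical, and the sketch keeps every emitted element in a list to play the role of the feedback loop, which is a simplification of what lazy streams do. It relies on the two requirements discovered above: the first element is passed through before the feedback input is examined, and equal heads are assumed never to occur.

```python
from itertools import islice

def feedback_merge(base, k):
    """Yield T = base ∪ k*T in increasing order, for an increasing `base`.

    Assumes the two inputs never have equal heads (true for the
    S2 -> S23 -> S235 cascade), so a plain merge is enough.
    """
    base = iter(base)
    out = []                 # everything emitted so far (the loop's memory)
    j = 0                    # index in `out` of the next element to send through *k
    b = next(base, None)     # head of the left input, None once exhausted
    while True:
        # Head of the feedback input; it does not exist until something was emitted.
        fb = k * out[j] if j < len(out) else None
        if b is not None and (fb is None or b < fb):
            x, b = b, next(base, None)
        elif fb is not None:
            x, j = fb, j + 1
        else:
            return           # both inputs empty (only if `base` was empty)
        out.append(x)
        yield x

def s235():
    s2 = feedback_merge([1], 2)      # S2   = {1} ∪ 2*S2
    s23 = feedback_merge(s2, 3)      # S23  = S2  ∪ 3*S23
    return feedback_merge(s23, 5)    # S235 = S23 ∪ 5*S235

print(list(islice(s235(), 11)))      # [1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15]
```

Because `out` is never trimmed, memory grows with the number of elements produced; that is acceptable for experimenting with the diagram but is not how one would write a long-running generator.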
See also:
New state of the art in unlimited generation of Hamming sequence
And the signal processing block diagram for the book's code is:
                                   1 --->---\
                                              cons-stream ->-- S ---.---> S
    /----->---->--- *2 --->---\             /                       |
   /                             union ->--/                       /
  .-->-- *3 -->--\            /                                   /
  |                union ->--/                                   /
  .-->-- *5 -->--/                                              /
  \                                                            /
   \__________<__________<__________<_________<_______________/
where the duplicates-removing "union" is called merge in the book.
This is my best attempt to visualize it. But I do struggle; it feels like a snake with three heads eating its own tail.
If we say the values of the stream S are s0, s1, s2, ..., then
initially we only know the first value, s0.
          s0   s1   s2   s3   s4   s5   s6   s7   s8   s9   s10
S =       1    ?    ?    ?    ?    ?    ?    ?    ?    ?    ?
But we do know the three scale-streams will be producing multiples of
these values, on demand:
          s0   s1   s2   s3   s4   s5   s6   s7   s8   s9   s10
S =       1    ?    ?    ?    ?    ?    ?    ?    ?    ?    ?
scale-2:  2*1  2*?  2*?  2*?  2*?  2*?  2*?  2*?  2*?  2*?  2*?
scale-3:  3*1  3*?  3*?  3*?  3*?  3*?  3*?  3*?  3*?  3*?  3*?
scale-5:  5*1  5*?  5*?  5*?  5*?  5*?  5*?  5*?  5*?  5*?  5*?
________________________________________________________________
Merge will initially select the lowest of the numbers at the heads of
these three streams, forcing their calculation in the process:
          s0   s1   s2   s3   s4   s5   s6   s7   s8   s9   s10
S =       1    ?    ?    ?    ?    ?    ?    ?    ?    ?    ?
scale-2:  [2]  2*?  2*?  2*?  2*?  2*?  2*?  2*?  2*?  2*?  2*?
scale-3:  3    3*?  3*?  3*?  3*?  3*?  3*?  3*?  3*?  3*?  3*?
scale-5:  5    5*?  5*?  5*?  5*?  5*?  5*?  5*?  5*?  5*?  5*?
________________________________________________________________
So s1 will now have the value 2:
          s0   s1   s2   s3   s4   s5   s6   s7   s8   s9   s10
S =       1    [2]  ?    ?    ?    ?    ?    ?    ?    ?    ?
scale-2:       2*2  2*?  2*?  2*?  2*?  2*?  2*?  2*?  2*?  2*?
scale-3:  3    3*?  3*?  3*?  3*?  3*?  3*?  3*?  3*?  3*?  3*?
scale-5:  5    5*?  5*?  5*?  5*?  5*?  5*?  5*?  5*?  5*?  5*?
________________________________________________________________
Merge will now select 3 as the minimum of 4, 3, and 5:
          s0   s1   s2   s3   s4   s5   s6   s7   s8   s9   s10
S =       1    2    ?    ?    ?    ?    ?    ?    ?    ?    ?
scale-2:       4    2*?  2*?  2*?  2*?  2*?  2*?  2*?  2*?  2*?
scale-3:  [3]  3*?  3*?  3*?  3*?  3*?  3*?  3*?  3*?  3*?  3*?
scale-5:  5    5*?  5*?  5*?  5*?  5*?  5*?  5*?  5*?  5*?  5*?
________________________________________________________________
and will put it into the next slot in the result stream S, s2:
          s0   s1   s2   s3   s4   s5   s6   s7   s8   s9   s10
S =       1    2    [3]  ?    ?    ?    ?    ?    ?    ?    ?
scale-2:       4    2*?  2*?  2*?  2*?  2*?  2*?  2*?  2*?  2*?
scale-3:       3*2  3*?  3*?  3*?  3*?  3*?  3*?  3*?  3*?  3*?
scale-5:  5    5*?  5*?  5*?  5*?  5*?  5*?  5*?  5*?  5*?  5*?
________________________________________________________________
Scale-2's head is selected again:
          s0   s1   s2   s3   s4   s5   s6   s7   s8   s9   s10
S =       1    2    3    [4]  ?    ?    ?    ?    ?    ?    ?
scale-2:            2*3  2*?  2*?  2*?  2*?  2*?  2*?  2*?  2*?
scale-3:       6    3*?  3*?  3*?  3*?  3*?  3*?  3*?  3*?  3*?
scale-5:  5    5*?  5*?  5*?  5*?  5*?  5*?  5*?  5*?  5*?  5*?
________________________________________________________________
And then 5 is selected from scale-5 and placed in the result:
          s0   s1   s2   s3   s4   s5   s6   s7   s8   s9   s10
S =       1    2    3    4    [5]  ?    ?    ?    ?    ?    ?
scale-2:            6    2*?  2*?  2*?  2*?  2*?  2*?  2*?  2*?
scale-3:       6    3*?  3*?  3*?  3*?  3*?  3*?  3*?  3*?  3*?
scale-5:       5*2  5*?  5*?  5*?  5*?  5*?  5*?  5*?  5*?  5*?
________________________________________________________________
Two streams have 6 at their head, both are consumed but only one 6
is placed in the result:
          s0   s1   s2   s3   s4   s5   s6   s7   s8   s9   s10
S =       1    2    3    4    5    [6]  ?    ?    ?    ?    ?
scale-2:                 2*4  2*?  2*?  2*?  2*?  2*?  2*?  2*?
scale-3:            3*3  3*?  3*?  3*?  3*?  3*?  3*?  3*?  3*?
scale-5:       10   5*?  5*?  5*?  5*?  5*?  5*?  5*?  5*?  5*?
________________________________________________________________
And a few more iterations:
          s0   s1   s2   s3   s4   s5   s6   s7   s8   s9   s10
S =       1    2    3    4    5    6    [8]  ?    ?    ?    ?
scale-2:                      2*5  2*?  2*?  2*?  2*?  2*?  2*?
scale-3:            9    3*?  3*?  3*?  3*?  3*?  3*?  3*?  3*?
scale-5:       10   5*?  5*?  5*?  5*?  5*?  5*?  5*?  5*?  5*?
________________________________________________________________
          s0   s1   s2   s3   s4   s5   s6   s7   s8   s9   s10
S =       1    2    3    4    5    6    8    [9]  ?    ?    ?
scale-2:                      10   2*?  2*?  2*?  2*?  2*?  2*?
scale-3:                 3*4  3*?  3*?  3*?  3*?  3*?  3*?  3*?
scale-5:       10   5*?  5*?  5*?  5*?  5*?  5*?  5*?  5*?  5*?
________________________________________________________________
          s0   s1   s2   s3   s4   s5   s6   s7   s8   s9   s10
S =       1    2    3    4    5    6    8    9    [10] ?    ?
scale-2:                           2*6  2*?  2*?  2*?  2*?  2*?
scale-3:                 12   3*?  3*?  3*?  3*?  3*?  3*?  3*?
scale-5:            5*3  5*?  5*?  5*?  5*?  5*?  5*?  5*?  5*?
________________________________________________________________
          s0   s1   s2   s3   s4   s5   s6   s7   s8   s9   s10
S =       1    2    3    4    5    6    8    9    10   [12] ?
scale-2:                                2*8  2*?  2*?  2*?  2*?
scale-3:                      3*5  3*?  3*?  3*?  3*?  3*?  3*?
scale-5:            15   5*?  5*?  5*?  5*?  5*?  5*?  5*?  5*?
________________________________________________________________
          s0   s1   s2   s3   s4   s5   s6   s7   s8   s9   s10
S =       1    2    3    4    5    6    8    9    10   12   [15]
scale-2:                                16   2*?  2*?  2*?  2*?
scale-3:                           3*6  3*?  3*?  3*?  3*?  3*?
scale-5:                 5*4  5*?  5*?  5*?  5*?  5*?  5*?  5*?
________________________________________________________________
So perhaps it's more like a snake with one head taking alternate bites from its three tails.
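The trace above can be reproduced with a short Python sketch (Python here only for illustration; the function name `hamming` and the three index variables are my own). Each index marks the head of one scaled stream, that is, which element of S that stream will multiply next.

```python
def hamming(n):
    """Return the first n elements of S, following the trace above."""
    s = [1]                      # s0 is known up front
    i2 = i3 = i5 = 0             # heads of scale-2, scale-3 and scale-5
    while len(s) < n:
        c2, c3, c5 = 2 * s[i2], 3 * s[i3], 5 * s[i5]
        nxt = min(c2, c3, c5)    # merge picks the smallest head
        s.append(nxt)
        # Every stream whose head equals the chosen value is consumed,
        # which is how a duplicate such as 6 = 2*3 = 3*2 appears only once.
        if nxt == c2:
            i2 += 1
        if nxt == c3:
            i3 += 1
        if nxt == c5:
            i5 += 1
    return s

print(hamming(11))   # [1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15]
```

The three `if` statements (rather than `elif`) correspond to the step where two streams have 6 at their head and both are advanced.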
I have the following script and I have problems with printing and
saving. Any ideas or help welcome.
import pandas as pd
import numpy as np

df = pd.DataFrame({'A': [3513, 3514, 3517],
                   'B': ['lname1', 'lname2', 'lname3'],
                   'C': ['fname1', 'fname2', 'fname3'],
                   }, index=np.arange(3, dtype=int))

def vamos(df):
    for x in df.index:
        s = (df.loc[x, 'A'])
        digits = list(map(int, str(s)))
        Sum = (sum(digits))
        df = df.assign(column_2=(Sum))
        df['column_3'] = 20 - Sum
        print(df)
        df.to_excel("book_Sum.xlsx")

if __name__ == '__main__':
    vamos(df)
This is what I get with print(df):
A B C column_2 column_3
0 3513 lname1 fname1 12 8
1 3514 lname2 fname2 12 8
2 3517 lname3 fname3 12 8
A B C column_2 column_3
0 3513 lname1 fname1 13 7
1 3514 lname2 fname2 13 7
2 3517 lname3 fname3 13 7
A B C column_2 column_3
0 3513 lname1 fname1 16 4
1 3514 lname2 fname2 16 4
2 3517 lname3 fname3 16 4
And this is what I get when I save to Excel with df.to_excel("book_Sum.xlsx"):
A B C column_2 column_3
0 3513 lname1 fname1 16 4
1 3514 lname2 fname2 16 4
2 3517 lname3 fname3 16 4
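The output above is consistent with each pass of the loop assigning a single scalar to the whole column: `df.assign(column_2=Sum)` broadcasts the current digit sum to every row, so each print shows one value repeated and only the last one (16) ends up in the file. A minimal sketch of the per-row idea, written without pandas so it runs on its own (the helper `digit_sum` and the list-of-dicts layout are illustrative assumptions, not the original script):

```python
def digit_sum(n):
    """Sum of the decimal digits of a non-negative integer."""
    return sum(int(ch) for ch in str(n))

rows = [
    {"A": 3513, "B": "lname1", "C": "fname1"},
    {"A": 3514, "B": "lname2", "C": "fname2"},
    {"A": 3517, "B": "lname3", "C": "fname3"},
]

# Compute the values row by row instead of overwriting the whole column.
for row in rows:
    row["column_2"] = digit_sum(row["A"])
    row["column_3"] = 20 - row["column_2"]

for row in rows:
    print(row["A"], row["column_2"], row["column_3"])
# 3513 12 8
# 3514 13 7
# 3517 16 4
```

With the DataFrame from the question, the same idea would presumably be a single `df['A'].apply(digit_sum)` assigned to `column_2`, followed by one print and one `to_excel` call after the columns are filled.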
function OnEvent(event, arg)
    --OutputLogMessage("Event: "..event.." Arg: "..arg.."\n")
    -- MP value: 34390, 31219
    -- CC: 43405, 15670
    -- OK: 36029, 36017
    --if IsMouseButtonPressed(2) then
    --    x, y = GetMousePosition()
    --    OutputLogMessage("Mouse is at %d, %d\n", x, y)
    --end
    i = 61
    while IsMouseButtonPressed(5) do
        MoveMouseTo(34390, 31219)
        Sleep(100)
        PressAndReleaseMouseButton(1)
        Sleep(100)
        PressAndReleaseKey("%d", i)
        i = i + 1
        OutputLogMessage(i)
        Sleep(100)
        PressAndReleaseKey(28)
        Sleep(100)
        MoveMouseTo(43405, 15670)
        PressAndReleaseMouseButton(1)
        Sleep(300)
    end
end
I'm trying to convert an integer to a string. I've tried
PressAndReleaseKey(tostring(i))
But that didn't work.
I also tried
PressAndReleaseKey(i)
But that just turns it into a scancode which is "F3" btw.
What I'm trying to do is get it to click on something, type out "61", then increase the number by 1 each time it runs through the cycle.
Try
i = 0x61; -- hexadecimal!!!
PressAndReleaseKey(string.char(i))
That should probably do what you want.
Another approach is to use the function PressAndReleaseHidKey(). It has more consistent codes for keyboard keys: 4-29 are for the letters a-z, and 30-39 are for the digits 1234567890.
For example, PressAndReleaseHidKey(4) is the same as PressAndReleaseKey("a").
The complete list of arguments accepted by PressAndReleaseHidKey, PressHidKey, ReleaseHidKey:
hidcode  description
0x00     Reserved (no event indicated) (not a physical key)
0x01     Keyboard ErrorRollOver (not a physical key)
0x02     Keyboard POSTFail (not a physical key)
0x03     Keyboard ErrorUndefined (not a physical key)
0x04     Keyboard a and A
0x05     Keyboard b and B
0x06     Keyboard c and C
0x07     Keyboard d and D
0x08     Keyboard e and E
0x09     Keyboard f and F
0x0A     Keyboard g and G
0x0B     Keyboard h and H
0x0C     Keyboard i and I
0x0D     Keyboard j and J
0x0E     Keyboard k and K
0x0F     Keyboard l and L
0x10     Keyboard m and M
0x11     Keyboard n and N
0x12     Keyboard o and O
0x13     Keyboard p and P
0x14     Keyboard q and Q
0x15     Keyboard r and R
0x16     Keyboard s and S
0x17     Keyboard t and T
0x18     Keyboard u and U
0x19     Keyboard v and V
0x1A     Keyboard w and W
0x1B     Keyboard x and X
0x1C     Keyboard y and Y
0x1D     Keyboard z and Z
0x1E     Keyboard 1 and !
0x1F     Keyboard 2 and @
0x20     Keyboard 3 and #
0x21     Keyboard 4 and $
0x22     Keyboard 5 and %
0x23     Keyboard 6 and ^
0x24     Keyboard 7 and &
0x25     Keyboard 8 and *
0x26     Keyboard 9 and (
0x27     Keyboard 0 and )
0x28     Keyboard Return (ENTER)
0x29     Keyboard ESCAPE
0x2A     Keyboard Backspace
0x2B     Keyboard Tab
0x2C     Keyboard Spacebar
0x2D     Keyboard - and _
0x2E     Keyboard = and +
0x2F     Keyboard [ and {
0x30     Keyboard ] and }
0x31     Keyboard \ and |
0x32     Keyboard Non-US # and ~
0x33     Keyboard ; and :
0x34     Keyboard ' and "
0x35     Keyboard Grave Accent and Tilde
0x36     Keyboard , and <
0x37     Keyboard . and >
0x38     Keyboard / and ?
0x39     Keyboard Caps Lock
0x3A     Keyboard F1
0x3B     Keyboard F2
0x3C     Keyboard F3
0x3D     Keyboard F4
0x3E     Keyboard F5
0x3F     Keyboard F6
0x40     Keyboard F7
0x41     Keyboard F8
0x42     Keyboard F9
0x43     Keyboard F10
0x44     Keyboard F11
0x45     Keyboard F12
0x46     Keyboard PrintScreen
0x47     Keyboard Scroll Lock
0x48     Keyboard Pause
0x49     Keyboard Insert
0x4A     Keyboard Home
0x4B     Keyboard PageUp
0x4C     Keyboard Delete
0x4D     Keyboard End
0x4E     Keyboard PageDown
0x4F     Keyboard RightArrow
0x50     Keyboard LeftArrow
0x51     Keyboard DownArrow
0x52     Keyboard UpArrow
0x53     Keypad Num Lock and Clear
0x54     Keypad /
0x55     Keypad *
0x56     Keypad -
0x57     Keypad +
0x58     Keypad ENTER
0x59     Keypad 1 and End
0x5A     Keypad 2 and Down Arrow
0x5B     Keypad 3 and PageDn
0x5C     Keypad 4 and Left Arrow
0x5D     Keypad 5
0x5E     Keypad 6 and Right Arrow
0x5F     Keypad 7 and Home
0x60     Keypad 8 and Up Arrow
0x61     Keypad 9 and PageUp
0x62     Keypad 0 and Insert
0x63     Keypad . and Delete
0x64     Keyboard Non-US \ and |
0x65     Keyboard Application (context menu key)
0x66     Keyboard Power (not a physical key)
0x67     Keypad =
0x68     Keyboard F13
0x69     Keyboard F14
0x6A     Keyboard F15
0x6B     Keyboard F16
0x6C     Keyboard F17
0x6D     Keyboard F18
0x6E     Keyboard F19
0x6F     Keyboard F20
0x70     Keyboard F21
0x71     Keyboard F22
0x72     Keyboard F23
0x73     Keyboard F24
0x74     Keyboard Execute
0x75     Keyboard Help
0x76     Keyboard Menu
0x77     Keyboard Select
0x78     Keyboard Stop
0x79     Keyboard Again
0x7A     Keyboard Undo
0x7B     Keyboard Cut
0x7C     Keyboard Copy
0x7D     Keyboard Paste
0x7E     Keyboard Find
0x7F     Keyboard Mute
0x80     Keyboard Volume Up
0x81     Keyboard Volume Down
0x82     Keyboard Locking Caps Lock
0x83     Keyboard Locking Num Lock
0x84     Keyboard Locking Scroll Lock
0x85     Keypad Comma (Brazilian ".")
0x86     Keypad Equal Sign (on AS/400 keyboard)
0x87     Keyboard International1 (Brazilian "/" and "?", Kanji)
0x88     Keyboard International2
0x89     Keyboard International3
0x8A     Keyboard International4
0x8B     Keyboard International5
0x8C     Keyboard International6
0x8D     Keyboard International7 (Double-byte/Single-byte)
0x8E     Keyboard International8
0x8F     Keyboard International9
0x90     Keyboard LANG1 (Hangul/English, Korean)
0x91     Keyboard LANG2 (Hanja, Korean)
0x92     Keyboard LANG3 (Katakana, Japanese)
0x93     Keyboard LANG4 (Hiragana, Japanese)
0x94     Keyboard LANG5 (Zenkaku/Hankaku, Japanese)
0x95     Keyboard LANG6
0x96     Keyboard LANG7
0x97     Keyboard LANG8
0x98     Keyboard LANG9
0x99     Keyboard Alternate Erase
0x9A     Keyboard SysReq/Attention
0x9B     Keyboard Cancel
0x9C     Keyboard Clear
0x9D     Keyboard Prior
0x9E     Keyboard Return
0x9F     Keyboard Separator
0xA0     Keyboard Out
0xA1     Keyboard Oper
0xA2     Keyboard Clear/Again
0xA3     Keyboard CrSel/Props
0xA4     Keyboard ExSel
0xA5     (Reserved)
0xA6     (Reserved)
0xA7     (Reserved)
0xA8     (Reserved)
0xA9     (Reserved)
0xAA     (Reserved)
0xAB     (Reserved)
0xAC     (Reserved)
0xAD     (Reserved)
0xAE     (Reserved)
0xAF     (Reserved)
0xB0     Keypad 00
0xB1     Keypad 000
0xB2     Thousands separator (locale-dependent symbol)
0xB3     Decimal Separator (locale-dependent symbol)
0xB4     Currency Unit (locale-dependent symbol)
0xB5     Currency Sub-unit (locale-dependent symbol)
0xB6     Keypad (
0xB7     Keypad )
0xB8     Keypad {
0xB9     Keypad }
0xBA     Keypad Tab
0xBB     Keypad Backspace
0xBC     Keypad A
0xBD     Keypad B
0xBE     Keypad C
0xBF     Keypad D
0xC0     Keypad E
0xC1     Keypad F
0xC2     Keypad XOR
0xC3     Keypad ^
0xC4     Keypad %
0xC5     Keypad <
0xC6     Keypad >
0xC7     Keypad &
0xC8     Keypad &&
0xC9     Keypad |
0xCA     Keypad ||
0xCB     Keypad :
0xCC     Keypad #
0xCD     Keypad Space
0xCE     Keypad @
0xCF     Keypad !
0xD0     Keypad Memory Store
0xD1     Keypad Memory Recall
0xD2     Keypad Memory Clear
0xD3     Keypad Memory Add
0xD4     Keypad Memory Subtract
0xD5     Keypad Memory Multiply
0xD6     Keypad Memory Divide
0xD7     Keypad +/-
0xD8     Keypad Clear
0xD9     Keypad Clear Entry
0xDA     Keypad Binary
0xDB     Keypad Octal
0xDC     Keypad Decimal
0xDD     Keypad Hexadecimal
0xDE     (Reserved)
0xDF     (Reserved)
0xE0     Keyboard LeftControl
0xE1     Keyboard LeftShift
0xE2     Keyboard LeftAlt
0xE3     Keyboard Left GUI (Left Win)
0xE4     Keyboard RightControl
0xE5     Keyboard RightShift
0xE6     Keyboard RightAlt
0xE7     Keyboard Right GUI (Right Win)
If you want to type "61":
i = 61
for key in tostring(i):gmatch"%d" do
    PressAndReleaseKey(key)
    Sleep(100)
end
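If you go the HID route instead, each digit of the number has to be turned into its own code. A small sketch of that mapping, in Python purely to illustrate the arithmetic (the mouse scripts themselves are Lua, and the helper name `hid_codes_for_number` is made up): digits 1-9 occupy 0x1E-0x26 and 0 comes last at 0x27, as in the list of HID codes above.

```python
def hid_codes_for_number(n):
    """Return the HID usage codes for the decimal digits of a non-negative integer."""
    codes = []
    for ch in str(n):
        d = int(ch)
        # 1..9 map to 0x1E..0x26; 0 sits after 9, at 0x27.
        codes.append(0x27 if d == 0 else 0x1E + d - 1)
    return codes

print([hex(c) for c in hid_codes_for_number(61)])   # ['0x23', '0x1e']
```

In the Lua script, the same per-digit loop would then pass each code to PressAndReleaseHidKey instead of passing the digit character to PressAndReleaseKey.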
I was trying to train a very straightforward (I thought) NN model with PyTorch and skorch, but the bad performance really baffles me, so it would be great if you have any insight into this.
The problem is something like this: there are five objects, A, B, C,
D, E (labeled by their fingerprints, e.g. (0, 0) is A, (0.2, 0.5) is B,
etc.), each corresponding to a number, and the task is to find
which number each one corresponds to. The training data is a list of
"collections" and their corresponding sums, for example: [A, A, A, B, B]
== [(0,0), (0,0), (0,0), (0.2,0.5), (0.2, 0.5)] --> 15, [B, C, D, E] == [(0.2,0.5), (0.5,0.8), (0.3,0.9), (1,1)] --> 30, .... Note that the number of objects in a collection is not constant.
There is no noise or anything, so it's just a linear system that can be solved directly. So I would have thought this would be very easy for a NN to figure out. I'm actually using this example as a sanity check for a more complicated problem, but was surprised that the NN couldn't even solve this.
Now I'm just trying to pinpoint exactly where it went wrong. The model definition seems to be right and the data input is right. Is the bad performance due to bad training, or are NNs just bad at these things?
Here is the model definition:
class NN(nn.Module):
    def __init__(
        self,
        input_dim,
        num_nodes,
        num_layers,
        batchnorm=False,
        activation=Tanh,
    ):
        super(SingleNN, self).__init__()
        self.get_forces = get_forces
        self.activation_fn = activation
        self.model = MLP(
            n_input_nodes=input_dim,
            n_layers=num_layers,
            n_hidden_size=num_nodes,
            activation=activation,
            batchnorm=batchnorm,
        )

    def forward(self, batch):
        if isinstance(batch, list):
            batch = batch[0]
        with torch.enable_grad():
            fingerprints = batch.fingerprint.float()
            fingerprints.requires_grad = True
            # index of the current "collection" in the training list
            idx = batch.idx
            sorted_idx = torch.unique_consecutive(idx)
            o = self.model(fingerprints)
            total = scatter(o, idx, dim=0)[sorted_idx]
            return total

    @property
    def num_params(self):
        return sum(p.numel() for p in self.parameters())

class MLP(nn.Module):
    def __init__(
        self,
        n_input_nodes,
        n_layers,
        n_hidden_size,
        activation,
        batchnorm,
        n_output_nodes=1,
    ):
        super(MLP, self).__init__()
        if isinstance(n_hidden_size, int):
            n_hidden_size = [n_hidden_size] * (n_layers)
        self.n_neurons = [n_input_nodes] + n_hidden_size + [n_output_nodes]
        self.activation = activation
        layers = []
        for _ in range(n_layers - 1):
            layers.append(nn.Linear(self.n_neurons[_], self.n_neurons[_ + 1]))
            layers.append(activation())
            if batchnorm:
                layers.append(nn.BatchNorm1d(self.n_neurons[_ + 1]))
        layers.append(nn.Linear(self.n_neurons[-2], self.n_neurons[-1]))
        self.model_net = nn.Sequential(*layers)

    def forward(self, inputs):
        return self.model_net(inputs)
and the skorch part is straightforward
model = NN(2, 100, 2)
net = NeuralNetRegressor(
    module=model,
    ...
)
net.fit(train_dataset, None)
For a test run, the dataset looks like the following (16 collections in total):
[[0.7484336 0.5656401]
[0. 0. ]
[0. 0. ]
[0. 0. ]]
[[1. 1.]
[0. 0.]
[0. 0.]]
[[0.51311415 0.67012525]
[0.51311415 0.67012525]
[0. 0. ]
[0. 0. ]]
[[0.51311415 0.67012525]
[0.7484336 0.5656401 ]
[0. 0. ]]
[[0.51311415 0.67012525]
[1. 1. ]
[0. 0. ]
[0. 0. ]]
[[0.51311415 0.67012525]
[0.51311415 0.67012525]
[0. 0. ]
[0. 0. ]
[0. 0. ]
[0. 0. ]
[0. 0. ]
[0. 0. ]]
[[0.51311415 0.67012525]
[1. 1. ]
[0. 0. ]
[0. 0. ]
[0. 0. ]
[0. 0. ]]
....
with corresponding total:
[10, 11, 14, 14, 17, 18, ...]
It's easy to tell which objects are in a collection, and how many of each, just by eyeballing it.
The training process looks like this:
epoch train_energy_mae train_loss cp dur
------- ------------------ ------------ ---- ------
1 4.9852 0.5425 + 0.1486
2 16.3659 4.2273 0.0382
3 6.6945 0.7403 0.0025
4 7.9199 1.2694 0.0024
5 12.0389 2.4982 0.0024
6 9.9942 1.8391 0.0024
7 5.6733 0.7528 0.0024
8 5.7007 0.5166 0.0024
9 7.8929 1.0641 0.0024
10 9.2560 1.4663 0.0024
11 8.5545 1.2562 0.0024
12 6.7690 0.7589 0.0024
13 5.3769 0.4806 0.0024
14 5.1117 0.6009 0.0024
15 6.2685 0.8831 0.0024
....
290 5.1899 0.4750 0.0024
291 5.1899 0.4750 0.0024
292 5.1899 0.4750 0.0024
293 5.1899 0.4750 0.0024
294 5.1899 0.4750 0.0025
295 5.1899 0.4750 0.0025
296 5.1899 0.4750 0.0025
297 5.1899 0.4750 0.0025
298 5.1899 0.4750 0.0025
299 5.1899 0.4750 0.0025
300 5.1899 0.4750 0.0025
301 5.1899 0.4750 0.0024
302 5.1899 0.4750 0.0025
303 5.1899 0.4750 0.0024
304 5.1899 0.4750 0.0024
305 5.1899 0.4750 0.0025
306 5.1899 0.4750 0.0024
307 5.1899 0.4750 0.0025
You can see that it just stopped training after a while.
I can confirm that the NN does give different result for different fingerprint, but somehow the final predicted value is just never good enough.
I have tried different NN sizes, learning rates, batch sizes, and activation functions (tanh, relu, etc.), and none of them seems to help. Do you have any insight into this? Is there anything I did wrong or could try, or are NNs just bad at this kind of task?
First thing I've noticed: super(SingleNN, self).__init__() should be super(NN, self).__init__() instead. Change that and let me know if you still get any errors.
I have JPG images and inputsvgdraw, a Flash tool for image annotation (http://www.mainada.net/inputdraw).
As in Gimp or Picasa, when I crop an image I want to save only the cropped part as a new image. I need a function that takes an SVG path (representing the new image borders) as a parameter and, based on that, creates the new image.
svg data:
<svg version="1.1" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 488 325"><g fill="none" stroke-miterlimit="6" stroke-linecap="round" stroke-linejoin="round"><path d="M 307 97 l 0 -1 l -2 -1 l -10 -2 l -20 -1 l -25 5 l -22 9 l -10 9 l 0 9 l 2 12 l 16 18 l 25 11 l 25 5 l 17 -1 l 6 -4 l 3 -7 l -1 -12 l -6 -16 l -7 -13 l -11 -12 l -11 -14 l -9 -5" opacity="1" stroke="rgb(170,37,34)" stroke-width="5"/></g></svg>.
How can I extract the part of the image inside the circled region and save it as a new image? Is there any library that handles this?
Is there a way to convert this data into something a Python image library can use to crop the image?
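As a starting point, the path data can be turned into absolute polygon points with plain Python. This is only a sketch under stated assumptions: it handles just the `M` and `l` commands that appear in the sample path, each followed by exactly one coordinate pair, and it assumes the SVG viewBox matches the image size in pixels. The function names are hypothetical.

```python
import re

def path_to_points(d):
    """Convert an SVG path using only M/m/L/l commands into absolute (x, y) points."""
    tokens = re.findall(r"[MmLl]|-?\d+(?:\.\d+)?", d)
    points = []
    x = y = 0.0
    for i in range(0, len(tokens), 3):
        cmd = tokens[i]
        if cmd not in "MmLl":
            raise ValueError("expected a command letter, got %r" % cmd)
        a, b = float(tokens[i + 1]), float(tokens[i + 2])
        if cmd in "ML":          # upper case: absolute coordinates
            x, y = a, b
        else:                    # lower case: relative to the current point
            x, y = x + a, y + b
        points.append((x, y))
    return points

def bounding_box(points):
    """Return (left, top, right, bottom) of the polygon."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))

pts = path_to_points("M 307 97 l 0 -1 l -2 -1")
print(pts)                 # [(307.0, 97.0), (307.0, 96.0), (305.0, 95.0)]
print(bounding_box(pts))   # (305.0, 95.0, 307.0, 97.0)
```

From there, an image library such as Pillow could presumably crop to the bounding box with `Image.crop` and use the points as a polygon mask via `ImageDraw`, but that part is not shown or tested here.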
I'd like to get some insight about how constant memory is allocated (using CUDA 4.2). I know that the total available constant memory is 64KB. But when is this memory actually allocated on the device? Does this limit apply to each kernel, to each CUDA context, or to the whole application?
Let's say there are several kernels in a .cu file, each using less than 64K constant memory. But the total constant memory usage is more than 64K. Is it possible to call these kernels sequentially? What happens if they are called concurrently using different streams?
What happens if there is a large CUDA dynamic library with lots of kernels each using different amounts of constant memory?
What happens if there are two applications each requiring more than half of the available constant memory? The first application runs fine, but when will the second app fail? At app start, at cudaMemcpyToSymbol() calls or at kernel execution?
Parallel Thread Execution ISA Version 3.1 section 5.1.3 discusses constant banks.
Constant memory is restricted in size, currently limited to 64KB which
can be used to hold statically-sized constant variables. There is an
additional 640KB of constant memory, organized as ten independent 64KB
regions. The driver may allocate and initialize constant buffers in
these regions and pass pointers to the buffers as kernel function
parameters. Since the ten regions are not contiguous, the driver
must ensure that constant buffers are allocated so that each buffer
fits entirely within a 64KB region and does not span a region
boundary.
A simple program can be used to illustrate the use of constant memory.
__constant__ int kd_p1;
__constant__ short kd_p2;
__constant__ char kd_p3;
__constant__ double kd_p4;
__constant__ float kd_floats[8];
__global__ void parameters(int p1, short p2, char p3, double p4, int* pp1, short* pp2, char* pp3, double* pp4)
{
    *pp1 = p1;
    *pp2 = p2;
    *pp3 = p3;
    *pp4 = p4;
    return;
}
__global__ void constants(int* pp1, short* pp2, char* pp3, double* pp4)
{
    *pp1 = kd_p1;
    *pp2 = kd_p2;
    *pp3 = kd_p3;
    *pp4 = kd_p4;
    return;
}
Compile this for compute_30, sm_30 and run cuobjdump -sass <executable or obj> to disassemble it. You should see:
Fatbin elf code:
================
arch = sm_30
code version = [1,6]
producer = cuda
host = windows
compile_size = 32bit
identifier = c:/dev/constant_banks/kernel.cu
code for sm_30
Function : _Z10parametersiscdPiPsPcPd
/*0008*/ /*0x10005de428004001*/ MOV R1, c [0x0] [0x44]; // stack pointer
/*0010*/ /*0x40001de428004005*/ MOV R0, c [0x0] [0x150]; // pp1
/*0018*/ /*0x50009de428004005*/ MOV R2, c [0x0] [0x154]; // pp2
/*0020*/ /*0x0001dde428004005*/ MOV R7, c [0x0] [0x140]; // p1
/*0028*/ /*0x13f0dc4614000005*/ LDC.U16 R3, c [0x0] [0x144]; // p2
/*0030*/ /*0x60011de428004005*/ MOV R4, c [0x0] [0x158]; // pp3
/*0038*/ /*0x70019de428004005*/ MOV R6, c [0x0] [0x15c]; // pp4
/*0048*/ /*0x20021de428004005*/ MOV R8, c [0x0] [0x148]; // p4
/*0050*/ /*0x30025de428004005*/ MOV R9, c [0x0] [0x14c]; // p4
/*0058*/ /*0x1bf15c0614000005*/ LDC.U8 R5, c [0x0] [0x146]; // p3
/*0060*/ /*0x0001dc8590000000*/ ST [R0], R7; // *pp1 = p1
/*0068*/ /*0x0020dc4590000000*/ ST.U16 [R2], R3; // *pp2 = p2
/*0070*/ /*0x00415c0590000000*/ ST.U8 [R4], R5; // *pp3 = p3
/*0078*/ /*0x00621ca590000000*/ ST.64 [R6], R8; // *pp4 = p4
/*0088*/ /*0x00001de780000000*/ EXIT;
/*0090*/ /*0xe0001de74003ffff*/ BRA 0x90;
/*0098*/ /*0x00001de440000000*/ NOP CC.T;
/*00a0*/ /*0x00001de440000000*/ NOP CC.T;
/*00a8*/ /*0x00001de440000000*/ NOP CC.T;
/*00b0*/ /*0x00001de440000000*/ NOP CC.T;
/*00b8*/ /*0x00001de440000000*/ NOP CC.T;
...........................................
Function : _Z9constantsPiPsPcPd
/*0008*/ /*0x10005de428004001*/ MOV R1, c [0x0] [0x44]; // stack pointer
/*0010*/ /*0x00001de428004005*/ MOV R0, c [0x0] [0x140]; // pp1
/*0018*/ /*0x10009de428004005*/ MOV R2, c [0x0] [0x144]; // pp2
/*0020*/ /*0x0001dde428004c00*/ MOV R7, c [0x3] [0x0]; // kd_p1
/*0028*/ /*0x13f0dc4614000c00*/ LDC.U16 R3, c [0x3] [0x4]; // kd_p2
/*0030*/ /*0x20011de428004005*/ MOV R4, c [0x0] [0x148]; // pp3
/*0038*/ /*0x30019de428004005*/ MOV R6, c [0x0] [0x14c]; // pp4
/*0048*/ /*0x20021de428004c00*/ MOV R8, c [0x3] [0x8]; // kd_p4
/*0050*/ /*0x30025de428004c00*/ MOV R9, c [0x3] [0xc]; // kd_p4
/*0058*/ /*0x1bf15c0614000c00*/ LDC.U8 R5, c [0x3] [0x6]; // kd_p3
/*0060*/ /*0x0001dc8590000000*/ ST [R0], R7;
/*0068*/ /*0x0020dc4590000000*/ ST.U16 [R2], R3;
/*0070*/ /*0x00415c0590000000*/ ST.U8 [R4], R5;
/*0078*/ /*0x00621ca590000000*/ ST.64 [R6], R8;
/*0088*/ /*0x00001de780000000*/ EXIT;
/*0090*/ /*0xe0001de74003ffff*/ BRA 0x90;
/*0098*/ /*0x00001de440000000*/ NOP CC.T;
/*00a0*/ /*0x00001de440000000*/ NOP CC.T;
/*00a8*/ /*0x00001de440000000*/ NOP CC.T;
/*00b0*/ /*0x00001de440000000*/ NOP CC.T;
/*00b8*/ /*0x00001de440000000*/ NOP CC.T;
.....................................
I added annotations to the right of the SASS.
On sm_30 you can see that kernel parameters are passed in constant bank 0, starting at offset 0x140.
User-defined __constant__ variables are placed in constant bank 3.
If you execute cuobjdump --dump-elf <executable or obj> you can find other interesting constant information.
32bit elf: abi=6, sm=30, flags = 0x1e011e
Sections:
Index Offset Size ES Align Type Flags Link Info Name
1 34 142 0 1 STRTAB 0 0 0 .shstrtab
2 176 19b 0 1 STRTAB 0 0 0 .strtab
3 314 d0 10 4 SYMTAB 0 2 a .symtab
4 3e4 50 0 4 CUDA_INFO 0 3 b .nv.info._Z9constantsPiPsPcPd
5 434 30 0 4 CUDA_INFO 0 3 0 .nv.info
6 464 90 0 4 CUDA_INFO 0 3 a .nv.info._Z10parametersiscdPiPsPcPd
7 4f4 160 0 4 PROGBITS 2 0 a .nv.constant0._Z10parametersiscdPiPsPcPd
8 654 150 0 4 PROGBITS 2 0 b .nv.constant0._Z9constantsPiPsPcPd
9 7a8 30 0 8 PROGBITS 2 0 0 .nv.constant3
a 7d8 c0 0 4 PROGBITS 6 3 a00000b .text._Z10parametersiscdPiPsPcPd
b 898 c0 0 4 PROGBITS 6 3 a00000c .text._Z9constantsPiPsPcPd
.section .strtab
.section .shstrtab
.section .symtab
index value size info other shndx name
0 0 0 0 0 0 (null)
1 0 0 3 0 a .text._Z10parametersiscdPiPsPcPd
2 0 0 3 0 7 .nv.constant0._Z10parametersiscdPiPsPcPd
3 0 0 3 0 b .text._Z9constantsPiPsPcPd
4 0 0 3 0 8 .nv.constant0._Z9constantsPiPsPcPd
5 0 0 3 0 9 .nv.constant3
6 0 4 1 0 9 kd_p1
7 4 2 1 0 9 kd_p2
8 6 1 1 0 9 kd_p3
9 8 8 1 0 9 kd_p4
10 16 32 1 0 9 kd_floats
11 0 192 12 10 a _Z10parametersiscdPiPsPcPd
12 0 192 12 10 b _Z9constantsPiPsPcPd
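The offsets in the dump can be cross-checked with a few lines of arithmetic. The sketch below is in Python and rests on an assumption: that variables are laid out in declaration order with natural alignment, and that pointers are 4 bytes in this 32-bit build. That rule happens to reproduce the numbers above, but it is not something the dump itself guarantees. The helper name `layout` is hypothetical.

```python
def layout(symbols, base=0):
    """Assign offsets in declaration order, aligning each symbol naturally.

    symbols: list of (name, size_in_bytes, alignment_in_bytes).
    Returns (offsets_by_name, end_offset).
    """
    offset = base
    table = {}
    for name, size, align in symbols:
        offset = (offset + align - 1) // align * align   # round up to alignment
        table[name] = offset
        offset += size
    return table, offset

# User __constant__ variables (constant bank 3).
user = [("kd_p1", 4, 4), ("kd_p2", 2, 2), ("kd_p3", 1, 1),
        ("kd_p4", 8, 8), ("kd_floats", 32, 4)]
table, end = layout(user)
print(table)      # {'kd_p1': 0, 'kd_p2': 4, 'kd_p3': 6, 'kd_p4': 8, 'kd_floats': 16}
print(hex(end))   # 0x30, the size of .nv.constant3

# Parameters of the `parameters` kernel (constant bank 0, starting at 0x140).
params = [("p1", 4, 4), ("p2", 2, 2), ("p3", 1, 1), ("p4", 8, 8),
          ("pp1", 4, 4), ("pp2", 4, 4), ("pp3", 4, 4), ("pp4", 4, 4)]
ptable, pend = layout(params, base=0x140)
print({k: hex(v) for k, v in ptable.items()})
print(hex(pend))  # 0x160, the size of .nv.constant0 for that kernel
```

The computed parameter offsets (0x140, 0x144, 0x146, 0x148, then 0x150 through 0x15c for the pointers) match the `c [0x0] [...]` operands in the SASS listing for the parameters kernel.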
The kernel parameter constant bank is versioned per launch so that concurrent kernels can be executed. The compiler and user constants are per CUmodule. It is the responsibility of the developer to manage the coherency of this data. For example, the developer has to ensure that a value written with cudaMemcpyToSymbol is updated in a safe manner.