Pipeline hazard handling with store - memory

Consider the following execution of instructions in a 5-stage pipeline
(IF - ID - EX - MEM - WB)
where "SD N(R2), R1" means store data from register R1 to memory position M[N+R2], "ADD R3, R1, R2" performs the operation R1 + R2 and stores the result in R3, and NOP is a bubble.
From what I understand, registers are read in the ID stage.
So, if I have the following instructions:
I1: SD 0(R2), R6
NOP
I2: ADD R3, R1, R2
then execution proceeds as follows (I hope the diagram is clear):
         R2 is read
             ^       Store M[0+R2] <- R6
             ^            ^
I1:  | IF  | ID  | EX  | MEM | WB  |
NOP:       |/////|/////|/////|/////|/////|
I2:              | IF  | ID  | EX  | MEM | WB  |
                         v
                     R2 is read
Is there a hazard in the 4th cycle, when I1 is in the MEM stage and I2 is in the ID stage, because both instructions want to access R2 at the same time?
Or is there no hazard, since R2 is only read in the ID stage and is therefore not accessed in the MEM stage?

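All registers are read in the ID stage, so there is no hazard here: I1 read R2 (and R6) in its own ID stage in cycle 2, and its MEM stage only uses the address already computed in EX. In cycle 4, I2 is the only instruction touching the register file.
It does mean that an instruction has to stall in ID if a register it wants to read has not yet been written by an earlier instruction. That's where "bypassing" (forwarding) can help.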
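A minimal sketch of that timing argument, assuming an ideal 5-stage pipeline with no stalls and assuming that the register file is read only in ID. The names `stage_at`, `program` and `register_reads` are made up for this illustration:

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def stage_at(issue_cycle, cycle):
    """Stage occupied in `cycle` by an instruction fetched in `issue_cycle`, or None."""
    index = cycle - issue_cycle
    return STAGES[index] if 0 <= index < len(STAGES) else None

# (name, cycle of IF, registers read in ID). The NOP is fetched in cycle 2,
# so I2 is fetched in cycle 3.
program = [
    ("I1", 1, {"R2", "R6"}),  # SD 0(R2), R6
    ("I2", 3, {"R1", "R2"}),  # ADD R3, R1, R2
]

def register_reads(cycle):
    """Map each instruction that is in ID during `cycle` to the registers it reads."""
    return {name: regs for name, issue, regs in program
            if stage_at(issue, cycle) == "ID"}

for cycle in range(1, 8):
    stages = {name: stage_at(issue, cycle) for name, issue, _ in program}
    print(cycle, stages, register_reads(cycle))
```

In cycle 4 the sketch reports I1 in MEM and I2 in ID, with I2 as the only reader of the register file.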

Related

Writing the production rules of this finite state machine

Consider the following state diagram over the alphabet {0,1}, which accepts an input string if it contains two consecutive 0's or two consecutive 1's:
01001 --> Accept
101 --> Reject
How would I write the production rules to show this? Is it just:
D -> C0 | B1 | D0 | D1
C -> A0 | B0
B -> A1 | C1
And if so, how would the terminals (0,1) be differentiated from the states (A,B,C)? And should the state go before or after the input symbol? That is, should it be A1 or 1A, for example?

The grammar you suggest has no A: it's not a non-terminal because it has no production rules, and it's not a terminal because it's not present in the input. You could make that work by writing, for example, C → 0 | B 0, but a more general solution is to make A into a non-terminal using an ε-rule:
A → ε
and then
C → A 0 | B 0
Writing B0 without a space is misleading, because it looks like a single thing. But it's two grammatical symbols: a non-terminal (B) and a terminal (0).
With those modifications, your grammar is fine. It's a left linear grammar, in which each state's productions come from its in-transitions and the epsilon production belongs to the initial state. A right linear grammar can also be constructed from the FSA by considering out-transitions rather than in-transitions. In this version, the epsilon production corresponds to final states rather than initial states.
A → 1 B | 0 C
B → 0 C | 1 D
C → 1 B | 0 D
D → 0 D | 1 D | ε
If it's not obvious why the FSM corresponds to these two grammars, it's probably worth grabbing a pad of paper and constructing a derivation with each grammar for a few sample sentences. Compare the derivations you produce with the progress through the FSM for the same input.
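If pen and paper are not at hand, the same comparison can be sketched in code. This is only an illustration for the right linear grammar above; the names `RIGHT_LINEAR`, `NULLABLE` and `derive` are invented here. Because the automaton is deterministic, each non-terminal has at most one production per input symbol, so the derivation can be built in a single left-to-right pass that mirrors the walk through the FSM.

```python
# Right linear grammar from the answer: each production X -> a Y is stored as (a, Y).
RIGHT_LINEAR = {
    "A": [("1", "B"), ("0", "C")],
    "B": [("0", "C"), ("1", "D")],
    "C": [("1", "B"), ("0", "D")],
    "D": [("0", "D"), ("1", "D")],
}
NULLABLE = {"D"}  # non-terminals with an epsilon production (the final state)

def derive(text):
    """Return the sentential forms of a derivation of `text`, or None if there is none."""
    state, consumed, steps = "A", "", ["A"]
    for symbol in text:
        target = next((t for s, t in RIGHT_LINEAR[state] if s == symbol), None)
        if target is None:
            return None
        consumed += symbol
        state = target
        steps.append(consumed + state)
    if state not in NULLABLE:
        return None
    steps.append(consumed)  # apply the epsilon production to finish
    return steps

print(derive("01001"))
print(derive("101"))
```

The non-terminal at the end of each sentential form is exactly the FSM state reached after reading the terminals before it.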

NASM memory not being accessed correctly?

So I am trying to print a simple hello world string using NASM in real mode. As you might be able to tell by the org 0000:7C00 define, it is a test bootloader. For some reason or another though, 'Hello World' is not being printed correctly. Tried in VirtualBox and real hardware.
When run, it prints a bunch of random shapes and figures that bear no resemblance to real letters, let alone 'Hello World'. I suspect my segment registers are not set up properly, because moving the definition of MESSAGE around changed the values that were printed. I looked at this question:
Simple NASM "boot program" not accessing memory correctly?
But there were no answers there to my problem, and I do set up ds to be 0. Any ideas what's going on?
Also worth noting: I am assembling it into a flat binary. It prints 'L' at the end so that I know everything that was supposed to print before it worked. Or, I guess in this case, didn't.
BITS 16
org 0x0000:7C00
start:
mov ax, 0
mov ds, ax
mov es, ax
mov fs, ax
mov gs, ax
mov ss, ax ;Puts 0 into the segment pointer, we are using real memory.
mov sp, 0000:7C00 ;Moves 7C00 into the stack pointer, so that all data <7C00 is stack.
call print_string ;Calls print string.
jmp Exit
;Prints the test string for now.
print_string:
mov si, MESSAGE
.nextChar:
mov ah, 0x0E
mov al, [si]
cmp al, 0x0
je .end
int 10h
add si, 1
jmp .nextChar
.end:
ret
MESSAGE db "Hello world!", 0
Exit:
mov ah, 0x0E
mov al, 'L'
int 10h
times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s
dw 0xAA55 ; The standard PC boot signature

Try this to really stop the program instead of falling through and executing garbage (the padding bytes) after the last int 10h:
Exit:
mov ah, 0x0E
mov al, 'L'
int 10h
EndlessLoop:
jmp EndlessLoop

Experimental OS in assembly - can't show a character on the screen (pmode)

I hope there's some experienced assembly/os developer here, even if my problem is not a huge one.
I am trying to play with assembly and create a small operating system. In fact, what I want is a boot-loader and a second boot-loader that activates pmode and displays a single char on the screen, using the video memory (not with interrupts, evidently).
I am using VirtualBox to emulate the code, which I paste manually inside a VHD disk (two sectors of code)
First, my code:
boot.asm
This is the first boot-loader
bits 16
org 0
mov al, dl
jmp 07c0h:Start
Start:
cli
push ax
mov ax, cs
mov ds, ax
mov es, ax
pop ax
sti
jmp ReadDisk
ReadDisk:
call ResetDisk
mov bx, 0x1000
mov es, bx
mov bx, 0x0000
mov dl, al
mov ah, 0x02
mov al, 0x01
mov ch, 0x00
mov cl, 0x02
mov dh, 0x00
int 0x13
jc ReadDisk
jmp 0x1000:0x0000
ResetDisk:
mov ah, 0x00
mov dl, al
int 0x13
jc ResetDisk
ret
times 510 - ($ - $$) db 0
dw 0xAA55
boot2.asm
This is the second boot-loader, pasted on the second sector (next 512 bytes)
bits 16
org 0
jmp 0x1000:Start
InstallGDT:
cli
pusha
lgdt [GDT]
sti
popa
ret
StartGDT:
dd 0
dd 0
dw 0ffffh
dw 0
db 0
db 10011010b
db 11001111b
db 0
dw 0ffffh
dw 0
db 0
db 10010010b
db 11001111b
db 0
StopGDT:
GDT:
dw StopGDT - StartGDT - 1
dd StartGDT + 10000h
OpenA20:
cli
pusha
call WaitInput
mov al, 0xad
out 0x64, al
call WaitInput
mov al, 0xd0
out 0x64, al
call WaitInput
in al, 0x60
push eax
call WaitInput
mov al, 0xd1
out 0x64, al
call WaitInput
pop eax
or al, 2
out 0x60, al
call WaitInput
mov al, 0xae
out 0x64, al
call WaitInput
popa
sti
ret
WaitInput:
in al, 0x64
test al, 2
jnz WaitInput
ret
WaitOutput:
in al, 0x64
test al, 1
jz WaitOutput
ret
Start:
cli
xor ax, ax
mov ds, ax
mov es, ax
mov ax, 0x9000
mov ss, ax
mov sp, 0xffff
sti
call InstallGDT
call OpenA20
ProtectedMode:
cli
mov eax, cr0
or eax, 1
mov cr0, eax
jmp 08h:ShowChar
bits 32
ShowChar:
mov ax, 0x10
mov ds, ax
mov ss, ax
mov es, ax
mov esp, 90000h
pusha ; save registers
mov edi, 0xB8000
mov bl, '.'
mov dl, bl ; Get character
mov dh, 63 ; the character attribute
mov word [edi], dx ; write to video display
popa
cli
hlt
So, I assemble this code and paste the binary into the VHD, then run the system in VirtualBox. I can see that it enters pmode correctly, the A20 gate is enabled, and the GDTR contains a memory address (I have no idea whether it is the correct one). Here is the part of the log file that may be of interest:
00:00:07.852082 ****************** Guest state at power off ******************
00:00:07.852088 Guest CPUM (VCPU 0) state:
00:00:07.852096 eax=00000011 ebx=00000000 ecx=00010002 edx=00000080 esi=0000f4a0 edi=0000fff0
00:00:07.852102 eip=0000016d esp=0000ffff ebp=00000000 iopl=0 nv up di pl zr na po nc
00:00:07.852108 cs={1000 base=0000000000010000 limit=0000ffff flags=0000009b} dr0=00000000 dr1=00000000
00:00:07.852118 ds={0000 base=0000000000000000 limit=0000ffff flags=00000093} dr2=00000000 dr3=00000000
00:00:07.852124 es={0000 base=0000000000000000 limit=0000ffff flags=00000093} dr4=00000000 dr5=00000000
00:00:07.852129 fs={0000 base=0000000000000000 limit=0000ffff flags=00000093} dr6=ffff0ff0 dr7=00000400
00:00:07.852136 gs={0000 base=0000000000000000 limit=0000ffff flags=00000093} cr0=00000011 cr2=00000000
00:00:07.852141 ss={9000 base=0000000000090000 limit=0000ffff flags=00000093} cr3=00000000 cr4=00000000
00:00:07.852148 gdtr=0000000000539fc0:003d idtr=0000000000000000:ffff eflags=00000006
00:00:07.852155 ldtr={0000 base=00000000 limit=0000ffff flags=00000082}
00:00:07.852158 tr ={0000 base=00000000 limit=0000ffff flags=0000008b}
00:00:07.852162 SysEnter={cs=0000 eip=00000000 esp=00000000}
00:00:07.852166 FCW=037f FSW=0000 FTW=0000 FOP=0000 MXCSR=00001f80 MXCSR_MASK=0000ffff
00:00:07.852172 FPUIP=00000000 CS=0000 Rsrvd1=0000 FPUDP=00000000 DS=0000 Rsvrd2=0000
00:00:07.852177 ST(0)=FPR0={0000'00000000'00000000} t0 +0.0000000000000000000000 ^ 0
00:00:07.852185 ST(1)=FPR1={0000'00000000'00000000} t0 +0.0000000000000000000000 ^ 0
00:00:07.852193 ST(2)=FPR2={0000'00000000'00000000} t0 +0.0000000000000000000000 ^ 0
00:00:07.852201 ST(3)=FPR3={0000'00000000'00000000} t0 +0.0000000000000000000000 ^ 0
00:00:07.852209 ST(4)=FPR4={0000'00000000'00000000} t0 +0.0000000000000000000000 ^ 0
00:00:07.852222 ST(5)=FPR5={0000'00000000'00000000} t0 +0.0000000000000000000000 ^ 0
00:00:07.852229 ST(6)=FPR6={0000'00000000'00000000} t0 +0.0000000000000000000000 ^ 0
00:00:07.852236 ST(7)=FPR7={0000'00000000'00000000} t0 +0.0000000000000000000000 ^ 0
00:00:07.852244 XMM0 =00000000'00000000'00000000'00000000 XMM1 =00000000'00000000'00000000'00000000
00:00:07.852253 XMM2 =00000000'00000000'00000000'00000000 XMM3 =00000000'00000000'00000000'00000000
00:00:07.852262 XMM4 =00000000'00000000'00000000'00000000 XMM5 =00000000'00000000'00000000'00000000
00:00:07.852270 XMM6 =00000000'00000000'00000000'00000000 XMM7 =00000000'00000000'00000000'00000000
00:00:07.852280 XMM8 =00000000'00000000'00000000'00000000 XMM9 =00000000'00000000'00000000'00000000
00:00:07.852287 XMM10=00000000'00000000'00000000'00000000 XMM11=00000000'00000000'00000000'00000000
00:00:07.852295 XMM12=00000000'00000000'00000000'00000000 XMM13=00000000'00000000'00000000'00000000
00:00:07.852302 XMM14=00000000'00000000'00000000'00000000 XMM15=00000000'00000000'00000000'00000000
00:00:07.852310 EFER =0000000000000000
00:00:07.852312 PAT =0007040600070406
00:00:07.852316 STAR =0000000000000000
00:00:07.852318 CSTAR =0000000000000000
00:00:07.852320 LSTAR =0000000000000000
00:00:07.852322 SFMASK =0000000000000000
00:00:07.852324 KERNELGSBASE =0000000000000000
00:00:07.852327 ***
00:00:07.852334 Guest paging mode: Protected (changed 5 times), A20 enabled (changed 2 times)
So, this is the status of the processor at the end of the test.
The problem is that, I cannot see the character on the screen. This can be a problem related to memory (I must admit I'm not so good at memory addressing), like wrong content in segment register, or it can be related to the manner in which I am trying to use the video memory in order to show that character, but it may be something else. What do you think is wrong? Thanks so much!
Update
The problem is related to memory addressing. The ShowChar instructions are not executed; I verified this in the log file. What I know is that everything executes correctly up to this line:
jmp 08h:ShowChar
So, this might be related to wrong segment registers, wrong GDTR or something else related to memory addressing.
Update
I changed GDT, to be a linear address instead of a segment:offset one, but still not seeing the character. The problem is that I can't figure out the origin of the problem, because I can't verify if the GDT is correct. I can see the content of all the registers, but how could I know that the GDTR (which at the moment is 0000000000ff53f0:00e9) is correct? I'm just supposing that the ShowChar function is not executed because of a wrong GDT, but just a supposition.

The problem is that, despite all your work to make the character and attribute available in DX:
mov bl, '.'
mov dl, bl ; Get character
mov dh, CHAR_ATTRIB ; the character attribute
you end up writing word 63 into the screen buffer:
mov word [edi], 63 ; write to video display
which is a question mark with zero attributes (63 = 0x003F: character byte 0x3F '?', attribute byte 0x00), i.e. a black question mark on a black background.

I'm not very experienced with this, but...
GDT:
dw StopGDT - StartGDT - 1
dd StartGDT
Doesn't this need to be an "absolute" (not seg:offs) address? Since you've loaded it at segment 1000h, I would expect dd StartGDT + 10000h to be right here. No?

Here is a working minimalist bootloader that switches to protected mode and prints an "X" to VGA text memory, using Qemu (so there is no need to read the disk).
[org 0x7C00]
cli
lgdt [gdt_descriptor]
; Enter PM
mov eax, cr0
or eax, 0x1
mov cr0, eax
; 1 GDT entry is 8B, code segment is 2nd entry (after null entry), so
; jump to code segment at 0x08 and load init_pm from there
jmp 0x8:init_pm
[bits 32]
init_pm :
; Data segment is the 3rd entry in the GDT (index 2), so load ds with the selector 2*8 = 0x10
mov ax, 0x10
mov ds, ax
mov ss, ax
mov es, ax
mov fs, ax
mov gs, ax
;Print a cyan X
;Note that this is written over whatever Qemu had already printed on the screen
mov al, 'X'
mov ah, 3 ; cyan
mov edx, 0xb8004
mov [edx], ax
jmp $
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
[bits 16]
GDT:
;null :
dd 0x0
dd 0x0
;code :
dw 0xffff ;Limit
dw 0x0 ;Base
db 0x0 ;Base
db 0b10011010 ;1st flag, Type flag
db 0b11001111 ;2nd flag, Limit
db 0x0 ;Base
;data :
dw 0xffff
dw 0x0
db 0x0
db 0b10010010
db 0b11001111
db 0x0
gdt_descriptor :
dw $ - GDT - 1 ;16-bit size
dd GDT ;32-bit start address
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; Bootsector padding
times 510-($-$$) db 0
dw 0xaa55
Then, do:
nasm boot.asm
qemu boot
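To check what the descriptor bytes in a GDT like the one above actually describe, they can be decoded offline. The following is a rough sketch, not part of the bootloader: `decode_descriptor` and `selector` are names made up here, and only the fields used in this answer are decoded.

```python
import struct

def decode_descriptor(raw):
    """Decode the base, limit and a few flags of an 8-byte segment descriptor."""
    limit_low, base_low, base_mid, access, flags_limit, base_high = struct.unpack("<HHBBBB", raw)
    base = base_low | (base_mid << 16) | (base_high << 24)
    limit = limit_low | ((flags_limit & 0x0F) << 16)
    page_granular = bool(flags_limit & 0x80)   # G bit: limit counted in 4 KiB units
    size = (limit + 1) << 12 if page_granular else limit + 1
    return {
        "base": base,
        "size": size,
        "present": bool(access & 0x80),
        "executable": bool(access & 0x08),
        "default_32bit": bool(flags_limit & 0x40),
    }

def selector(index):
    """Selector for GDT entry `index` (table indicator and RPL left at 0)."""
    return index * 8

# The code and data entries exactly as they are laid out with dw/db above.
CODE = bytes([0xFF, 0xFF, 0x00, 0x00, 0x00, 0b10011010, 0b11001111, 0x00])
DATA = bytes([0xFF, 0xFF, 0x00, 0x00, 0x00, 0b10010010, 0b11001111, 0x00])

print(hex(selector(1)), decode_descriptor(CODE))
print(hex(selector(2)), decode_descriptor(DATA))
```

Both entries decode to flat 4 GiB segments with base 0, which is why the far jump uses selector 0x08 and the data registers are loaded with 0x10.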

To write to video memory in standard VGA text mode, you write data starting at memory address 0xB8000 (753664 in decimal). To draw a simple white-on-black character, OR the character value with 15 << 8 so that you get a 16-bit unsigned short (low byte: character, high byte: attribute). You then write these 16 bits to that memory location to draw the character.
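As a quick sanity check of that arithmetic, here is a small sketch. The helper names `vga_cell` and `cell_address` are invented for illustration, and the 80-column layout with two bytes per cell is an assumption about the usual 80x25 text mode.

```python
VGA_TEXT_BASE = 0xB8000
COLUMNS = 80  # assumption: standard 80x25 text mode

def vga_cell(char, attribute=0x0F):
    """16-bit cell value: attribute in the high byte, character code in the low byte."""
    return ((attribute & 0xFF) << 8) | (ord(char) & 0xFF)

def cell_address(row, col):
    """Address of the cell at (row, col); each cell occupies two bytes."""
    return VGA_TEXT_BASE + 2 * (row * COLUMNS + col)

print(hex(vga_cell("A")))        # white on black
print(hex(vga_cell(".", 63)))    # the '.' with attribute 63 from the question
print(hex(cell_address(0, 2)))   # third cell of the first row
```

Writing the value from `vga_cell` as a 16-bit word at the address from `cell_address` is what `mov word [edi], dx` does in the question's code.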

The problem is your use of the ORG directive and mixing up real mode and protected mode addressing schemes. You are right about your 32 bit code not being executed. When the CPU executes this code:
jmp 08h:ShowChar
It jumps to somewhere in the currently loaded Interrupt Vector Table, at the beginning of memory instead of your 32 bit code. Why? Because the base of your defined code segment is 0, and you told your assembler to resolve addresses relative to 0:
Org 0
Thus the CPU is actually jumping to an address that is numerically equal to (0 + the offset of the first instruction of your ShowChar code) (i.e. Code Segment Base + Offset)
To rectify this issue, change:
Org 0
Into
Org 0x10000
You might expect to have to change your segment registers to match, but the segment registers you originally set were incorrect for the origin directive you originally specified and are valid once the origin directive is changed as above, so no further changes need to be made. As a side note, the incorrect origin directive also explains why your GDT address appeared to be garbage: your pointer to the GDT parameters (the 'GDT' label) was actually pointing somewhere in the beginning of the Interrupt Vector Table, and that is what your lgdt instruction loaded.
Anyway, simply changing the origin directive as shown above should fix the problem.
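The address arithmetic behind this can be sketched as follows. The offset of ShowChar inside the binary is not known from the question, so `SHOWCHAR_FILE_OFFSET` is a purely hypothetical value, and the function names are made up for this illustration.

```python
IVT_END = 0x400  # the real-mode Interrupt Vector Table occupies 0x000-0x3FF

def real_mode_linear(segment, offset):
    """Real mode: linear address = segment * 16 + offset."""
    return (segment << 4) + offset

def protected_mode_linear(descriptor_base, offset):
    """Protected mode: linear address = descriptor base + offset."""
    return descriptor_base + offset

LOAD_SEGMENT = 0x1000          # where the first stage loads the second sector
SHOWCHAR_FILE_OFFSET = 0x0170  # hypothetical position of ShowChar in the binary

def far_jump_target(org):
    """Where 'jmp 08h:ShowChar' lands: the assembler emits org + file offset,
    and the CPU adds the code descriptor's base, which is 0."""
    return protected_mode_linear(0, org + SHOWCHAR_FILE_OFFSET)

actual_location = real_mode_linear(LOAD_SEGMENT, SHOWCHAR_FILE_OFFSET)
print(hex(actual_location))           # where the code really is
print(hex(far_jump_target(0)))        # with org 0: inside the IVT
print(hex(far_jump_target(0x10000)))  # with org 0x10000: the real location
```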
By the way, your code looks awfully similar to the code over at http://www.brokenthorn.com/Resources/OSDev8.html
Especially the demo code provided at the bottom of the page of
http://www.brokenthorn.com/Resources/OSDev10.html
Interesting..

How to implement a left recursion eliminator?

How can I implement an eliminator for this?
A := AB |
AC |
D |
E ;

This is an example of so called immediate left recursion, and is removed like this:
A := DA' |
EA' ;
A' := ε |
BA' |
CA' ;
The basic idea is to first note that when parsing an A you will necessarily start with a D or an E. After the D or E you will either end (the tail is ε) or continue (if we're in an AB or AC construction).
The actual algorithm works like this:
For any left-recursive production like this: A -> A a1 | ... | A ak | b1 | b2 | ... | bm replace the production with A -> b1 A' | b2 A' | ... | bm A' and add the production A' -> ε | a1 A' | ... | ak A'.
See Wikipedia: Left Recursion for more information on the elimination algorithm (including elimination of indirect left recursion).
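The replacement rule described above could be implemented along these lines. This is a minimal sketch for immediate left recursion only; the function name and the representation of a production as a list of symbol names (with `[]` standing for ε) are choices made for this example.

```python
def eliminate_immediate_left_recursion(nonterminal, alternatives):
    """Rewrite A -> A a1 | ... | A ak | b1 | ... | bm into
    A -> b1 A' | ... | bm A'  and  A' -> ε | a1 A' | ... | ak A'."""
    # Tails a1..ak of the left-recursive alternatives (the leading A is dropped).
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == nonterminal]
    # Alternatives b1..bm that do not start with A.
    others = [alt for alt in alternatives if not alt or alt[0] != nonterminal]
    if not recursive:
        return {nonterminal: [list(alt) for alt in alternatives]}
    tail = nonterminal + "'"
    return {
        nonterminal: [list(beta) + [tail] for beta in others],
        tail: [[]] + [list(alpha) + [tail] for alpha in recursive],
    }

grammar = eliminate_immediate_left_recursion(
    "A", [["A", "B"], ["A", "C"], ["D"], ["E"]]
)
for head, alts in grammar.items():
    print(head, ":=", " | ".join(" ".join(alt) or "ε" for alt in alts))
```

Applied to the grammar in the question, it produces the same A and A' rules as shown at the top of this answer. Indirect left recursion needs the additional ordering step described in the linked article and is not handled here.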

Another form available is:
A := (D | E) (B | C)*
The mechanics of doing it are about the same, but some parsers might handle that form better. Also consider what it will take to munge the action rules along with the grammar itself: the other form requires the factoring tool to generate a new type for the A' rule to return, whereas this form doesn't.
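To illustrate the point about action rules, here is a hand-written sketch of a parser for the repetition form. It assumes the input is already a list of token names, and `parse_A` is a made-up name. The loop builds the same left-nested result that the original left-recursive rules AB and AC would have produced, without any separate type for A'.

```python
def parse_A(tokens):
    """Parse A := (D | E) (B | C)* and return a left-nested tuple tree."""
    if not tokens or tokens[0] not in ("D", "E"):
        raise ValueError("expected D or E at the start")
    tree = tokens[0]
    for token in tokens[1:]:
        if token not in ("B", "C"):
            raise ValueError(f"unexpected token {token!r}")
        # Same action as for the original rule A := A B | A C.
        tree = (tree, token)
    return tree

print(parse_A(["D", "B", "C"]))
```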

difference between top down and bottom up parsing techniques?

I guess the same logic is applied in both of them, i.e replacing the matched strings with the corresponding non-terminal elements as provided in the production rules.
Why do they categorize LL as top down and LR as bottom-up?

Bottom-up parsing:
Bottom-up parsing (also known as shift-reduce parsing) is a strategy for analyzing unknown data relationships that attempts to identify the most fundamental units first, and then to infer higher-order structures from them. It attempts to build trees upward toward the start symbol.
Top-down parsing:
Top-down parsing is a strategy of analyzing unknown data relationships by hypothesizing general parse tree structures and then considering whether the known fundamental structures are compatible with the hypothesis.

Top-down parsing
Involves generating the string starting from the first non-terminal (the start symbol).
Examples: recursive descent parsing, non-recursive descent parsing, LL parsing, etc.
Left-recursive grammars, and grammars that still need left factoring, do not work.
Backtracking may occur.
Uses leftmost derivation.
Things Of Interest Blog
The difference between top-down parsing and bottom-up parsing
Given a formal grammar and a string produced by that grammar, parsing is figuring out the production process for that string.
In the case of the context-free grammars, the production process takes the form of a parse tree. Before we begin, we always know two things about the parse tree: the root node, which is the initial symbol from which the string was originally derived, and the leaf nodes, which are all the characters of the string in order. What we don't know is the layout of nodes and branches between them.
For example, if the string is acddf, we know this much already:
    S
   /|\
   ???
| | | | |
a c d d f
Example grammar for use in this article
S → xyz | aBC
B → c | cd
C → eg | df
Bottom-up parsing
This approach is not unlike solving a jigsaw puzzle. We start at the bottom of the parse tree with individual characters. We then use the rules to connect the characters together into larger tokens as we go. At the end of the string, everything should have been combined into a single big S, and S should be the only thing we have left. If not, it's necessary to backtrack and try combining tokens in different ways.
With bottom-up parsing, we typically maintain a stack, which is the list of characters and tokens we've seen so far. At each step, we shift a new character onto the stack, and then reduce as far as possible by combining characters into larger tokens.
Example
String is acddf.
Steps
ε can't be reduced
a can't be reduced
ac can be reduced, as follows:
  reduce ac to aB
    aB can't be reduced
    aBd can't be reduced
    aBdd can't be reduced
    aBddf can be reduced, as follows:
      reduce aBddf to aBdC
        aBdC can't be reduced
        End of string. Stack is aBdC, not S. Failure! Must backtrack.
    aBddf can't be reduced
ac can't be reduced
acd can be reduced, as follows:
  reduce acd to aB
    aB can't be reduced
    aBd can't be reduced
    aBdf can be reduced, as follows:
      reduce aBdf to aBC
        aBC can be reduced, as follows:
          reduce aBC to S
            End of string. Stack is S. Success!
Parse trees
|
a

| |
a c

  B
| |
a c

  B
| | |
a c d

  B
| | | |
a c d d

  B
| | | | |
a c d d f

  B   C
| | | |\
a c d d f

| |
a c

| | |
a c d

    B
| /|
a c d

    B
| /| |
a c d d

    B
| /| | |
a c d d f

    B C
| /| |\
a c d d f

    S
   /|\
  / | |
 /  B C
| /| |\
a c d d f
Example 2
If all combinations fail, then the string cannot be parsed.
String is acdg.
Steps
ε can't be reduced
a can't be reduced
ac can be reduced, as follows:
  reduce ac to aB
    aB can't be reduced
    aBd can't be reduced
    aBdg can't be reduced
    End of string. Stack is aBdg, not S. Failure! Must backtrack.
ac can't be reduced
acd can be reduced, as follows:
  reduce acd to aB
    aB can't be reduced
    aBg can't be reduced
    End of string. Stack is aBg, not S. Failure! Must backtrack.
acd can't be reduced
acdg can't be reduced
End of string. Stack is acdg, not S. No backtracking is possible. Failure!
Parse trees
|
a

| |
a c

  B
| |
a c

  B
| | |
a c d

  B
| | | |
a c d g

| |
a c

| | |
a c d

    B
| /|
a c d

    B
| /| |
a c d g

| | |
a c d

| | | |
a c d g
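The bottom-up procedure from this section can be sketched as a small backtracking shift-reduce recognizer. This is only an illustration of the idea, not how production LR parsers work (they use tables to avoid backtracking). The names `GRAMMAR` and `parse` are made up here, and every symbol is assumed to be a single character, as in the article's grammar.

```python
# Example grammar from the article, as (left-hand side, right-hand side) pairs.
GRAMMAR = [
    ("S", "xyz"), ("S", "aBC"),
    ("B", "c"), ("B", "cd"),
    ("C", "eg"), ("C", "df"),
]

def parse(stack, rest):
    """True if some sequence of shifts and reductions leaves exactly 'S' on the stack."""
    # Option 1: reduce any right-hand side found on top of the stack.
    for lhs, rhs in GRAMMAR:
        if stack.endswith(rhs):
            if parse(stack[: -len(rhs)] + lhs, rest):
                return True
            # That reduction led to failure: backtrack and try something else.
    # Option 2: leave the stack as it is and shift the next character.
    if rest:
        return parse(stack + rest[0], rest[1:])
    # End of string: succeed only if everything was combined into S.
    return stack == "S"

print(parse("", "acddf"))
print(parse("", "acdg"))
```

For acddf the recognizer first tries reducing c to B straight away, fails exactly as in the first example, backtracks, and then succeeds by reducing cd to B instead.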
Top-down parsing
For this approach we assume that the string matches S and look at the internal logical implications of this assumption. For example, the fact that the string matches S logically implies that either (1) the string matches xyz or (2) the string matches aBC. If we know that (1) is not true, then (2) must be true. But (2) has its own further logical implications. These must be examined as far as necessary to prove the base assertion.
Example
String is acddf.
Steps
Assertion 1: acddf matches S:
  Assertion 2: acddf matches xyz:
    Assertion is false. Try another.
  Assertion 2: acddf matches aBC i.e. cddf matches BC:
    Assertion 3: cddf matches cC i.e. ddf matches C:
      Assertion 4: ddf matches eg:
        False.
      Assertion 4: ddf matches df:
        False.
      Assertion 3 is false. Try another.
    Assertion 3: cddf matches cdC i.e. df matches C:
      Assertion 4: df matches eg:
        False.
      Assertion 4: df matches df:
        Assertion 4 is true.
      Assertion 3 is true.
    Assertion 2 is true.
  Assertion 1 is true. Success!
Parse trees
    S
    |

    S
   /|\
  a B C
    | |

    S
   /|\
  a B C
    | |
    c

    S
   /|\
  a B C
   /| |
  c d

    S
   /|\
  a B C
   /| |\
  c d d f
Example 2
If, after following every logical lead, we can't prove the basic hypothesis ("The string matches S") then the string cannot be parsed.
String is acdg.
Steps
Assertion 1: acdg matches S:
  Assertion 2: acdg matches xyz:
    False.
  Assertion 2: acdg matches aBC i.e. cdg matches BC:
    Assertion 3: cdg matches cC i.e. dg matches C:
      Assertion 4: dg matches eg:
        False.
      Assertion 4: dg matches df:
        False.
      False.
    Assertion 3: cdg matches cdC i.e. g matches C:
      Assertion 4: g matches eg:
        False.
      Assertion 4: g matches df:
        False.
      False.
    False.
  Assertion 1 is false. Failure!
Parse trees
    S
    |

    S
   /|\
  a B C
    | |

    S
   /|\
  a B C
    | |
    c

    S
   /|\
  a B C
   /| |
  c d
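The chain of assertions in this section maps quite directly onto a recursive function. The following is a minimal backtracking sketch, assuming single-character symbols as in the article's grammar; `GRAMMAR` and `matches` are names invented for this example.

```python
# Example grammar from the article: non-terminal -> list of alternatives.
GRAMMAR = {
    "S": ["xyz", "aBC"],
    "B": ["c", "cd"],
    "C": ["eg", "df"],
}

def matches(symbols, text):
    """Assert that `text` can be derived from the symbol sequence `symbols`."""
    if not symbols:
        return text == ""
    head, rest = symbols[0], symbols[1:]
    if head in GRAMMAR:
        # Non-terminal: the assertion holds if it holds for any alternative.
        return any(matches(alternative + rest, text) for alternative in GRAMMAR[head])
    # Terminal: it must be the next character of the text.
    return text.startswith(head) and matches(rest, text[1:])

print(matches("S", "acddf"))
print(matches("S", "acdg"))
```

Each recursive call corresponds to one assertion in the step lists above, and a `False` result from one alternative is what triggers "Try another".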
Why left-recursion is a problem for top-down parsers
If our rules were left-recursive, for example something like this:
S → Sb
Then notice how our algorithm behaves:
Steps
Assertion 1: acddf matches S:
  Assertion 2: acddf matches Sb:
    Assertion 3: acddf matches Sbb:
      Assertion 4: acddf matches Sbbb:
        ...and so on forever
Parse trees
S
|

S
|\
S b
|

S
|\
S b
|\
S b
|

S
|\
S b
|\
S b
|\
S b
|

...
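The runaway behaviour can be reproduced with a naive top-down matcher. In this sketch the grammar S → Sb | a is a hypothetical example (the article only gives S → Sb; the alternative a is added here so that the grammar can derive something), and the `max_depth` guard is an artificial limit so that the demonstration stops instead of recursing until the interpreter gives up.

```python
GRAMMAR = {"S": ["Sb", "a"]}  # hypothetical left-recursive grammar: S -> Sb | a

def matches(symbols, text, depth=0, max_depth=50):
    """Naive top-down matcher that gives up after `max_depth` nested assertions."""
    if depth > max_depth:
        raise RecursionError(f"gave up after {max_depth} nested assertions")
    if not symbols:
        return text == ""
    head, rest = symbols[0], symbols[1:]
    if head in GRAMMAR:
        return any(matches(alt + rest, text, depth + 1, max_depth)
                   for alt in GRAMMAR[head])
    return text.startswith(head) and matches(rest, text[1:], depth + 1, max_depth)

try:
    matches("S", "ab")
except RecursionError as error:
    print("left recursion:", error)
```

The first alternative, Sb, is tried before any input is consumed, so the matcher keeps expanding S into Sb, Sbb, Sbbb and so on, and never reaches the alternative a.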
