The "Big Golf" Microarchitecture
Allowed memory combinations:
* Any two loads
* Any two stores with different addresses (n.b. LLC is limited to 1 eviction per cycle)
* Any load with any younger store
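A minimal sketch of the pairing rules above, assuming each memory operation carries a hypothetical age field (smaller means older) and a word address. The names MemOp and can_issue_together are illustrative only, and the LLC eviction limit is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemOp:
    is_store: bool
    addr: int  # 12-bit word address
    age: int   # hypothetical sequence number; smaller means older


def can_issue_together(a: MemOp, b: MemOp) -> bool:
    """Return True if the pair matches one of the allowed combinations."""
    if not a.is_store and not b.is_store:
        return True                  # any two loads
    if a.is_store and b.is_store:
        return a.addr != b.addr      # two stores need different addresses
    load, store = (a, b) if b.is_store else (b, a)
    return store.age > load.age      # the store must be younger than the load
```

Under this reading, a store paired with a younger load is rejected, so the load cannot slip past an older store in the same cycle.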
Instruction opcodes:
0 AND logical AND from memory to accumulator
1 TAD Two's-complement ADd from memory to accumulator
2 ISZ Increment and Skip if Zero
3 DCA Deposit and Clear Accumulator
4 JMS JuMp Subroutine
5 JMP JuMP
6 IOT In-Out Transfer (device accesses)
7 OPR microsequenced OPeRations (miscellaneous, like clear/rotate/etc)
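For reference, a sketch of how a Decode stage might split a 12-bit instruction word, using the standard PDP-8 memory-reference layout (bits 0-2 opcode, bit 3 indirect, bit 4 current page, bits 5-11 offset, with bit 0 as the most significant bit). The function name and return shape are assumptions made for illustration.

```python
MNEMONICS = ("AND", "TAD", "ISZ", "DCA", "JMS", "JMP", "IOT", "OPR")


def decode(word: int, pc: int):
    """Return (mnemonic, operand address or None, indirect flag)."""
    opcode = (word >> 9) & 0o7
    if opcode >= 6:
        # IOT and OPR carry no memory operand.
        return MNEMONICS[opcode], None, False
    indirect = bool(word & 0o400)       # bit 3
    current_page = bool(word & 0o200)   # bit 4
    offset = word & 0o177               # bits 5-11
    # Page bit set: use the page of the instruction itself; else page zero.
    page = pc & 0o7600 if current_page else 0
    return MNEMONICS[opcode], page | offset, indirect
```

For example, decode(0o1234, 0o0200) gives ("TAD", 0o234, False): a direct TAD from offset 034 on the current page.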
Memory transactions:       Opcodes that do it (direct, then indirect):
* Fetch instruction        01234567  01234567
* Indirect address load              0123
* Autoincrement store                0123
* Execution load           012       012 45
* Execution store            234       234
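The table can be written down as data. This sketch makes two assumptions that the notes do not state outright: the lone "0123" entries belong to the indirect column, and the autoincrement store only fires when the pointer sits in the PDP-8 auto-index locations 010-017 (octal). TABLE and transactions are hypothetical names.

```python
ALL = set(range(8))

# (transaction, direct opcodes, indirect opcodes), in table order.
# Placing the lone "0123" rows in the indirect column is an assumption.
TABLE = [
    ("fetch",                 ALL,       ALL),
    ("indirect_address_load", set(),     {0, 1, 2, 3}),
    ("autoincrement_store",   set(),     {0, 1, 2, 3}),
    ("execution_load",        {0, 1, 2}, {0, 1, 2, 4, 5}),
    ("execution_store",       {2, 3, 4}, {2, 3, 4}),
]


def transactions(opcode, indirect=False, pointer_addr=None):
    """List the memory transactions one instruction performs."""
    result = []
    for name, direct_ops, indirect_ops in TABLE:
        if opcode not in (indirect_ops if indirect else direct_ops):
            continue
        # Assumption: autoincrement applies only to pointers at 0o10..0o17.
        if name == "autoincrement_store" and not (
            pointer_addr is not None and 0o10 <= pointer_addr <= 0o17
        ):
            continue
        result.append(name)
    return result
```

With these assumptions, an indirect DCA through location 010 performs four transactions (fetch, indirect address load, autoincrement store, execution store), which is the worst case the arbitration has to handle.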
┌─────┐       ┌──────┐               ┌────┐
│Fetch├──────►│Decode│             ┌►│Exec│
└─────┘       └──────┘             │ └────┘
                                   │
next_pc    ┌──init_indirect_load   │ init_execution_store
           │  init_execution_load──┤ retire
           │  init_execution_store │
           │  retire               │
           │  rubberband_stall(1/2)│
           │                       │
           │ ┌───────┐             │
           └►│Autoinc│             │
             └───────┘             │
                                   │
           ┌──init_autoinc_store   │
           │  init_execution_load──┤
           │  init_execution_store │
           │  retire               │
           │                       │
           │ ┌─────┐               │
           └►│Indir│               │
             └─────┘               │
                                   │
              init_execution_load──┘
              init_execution_store
              retire
Possible arbitration techniques:
* Rubberband stalling in Decode + positional arbitration
* Age/address/operation comparison without rubberbanding
  * Longer clock cycles, or
  * Extra cycle
What to do with cache misses?
* Stall the entire pipeline to maintain simpler ordering constraints?
* If only loads miss, allow everything else to proceed?
* Always allow Fetch to proceed?
Separate logic is needed to detect self-modifying code (SMC) clobbers *anyway*.
OPR opcodes:
"group 1"
 _0___1___2_ _3_ _4_ _5_ _6_ _7_ _8_ _9_ _10 _11
|           |   |   |   |   |   |RAR|RAL| 0 |   |
| 1   1   1 | 0 |CLA|CLL|CMA|CML|RTR|RTL| 1 |IAC|
|___|___|___|___|___|___|___|___|___|___|___|___|
CLA CLear Accumulator
CLL CLear Link
CMA CoMplement Accumulator
CML CoMplement Link
RAR Rotate Accumulator Right (if bit 10 is 0)
RAL Rotate Accumulator Left (if bit 10 is 0)
RTR Rotate (Twice) accumulator and link Right (if bit 10 is 1)
RTL Rotate (Twice) accumulator and link Left (if bit 10 is 1)
IAC Increment ACcumulator
BSW Byte Swap word in accumulator (if bits 8 and 9 are 0, and bit 10 is 1)
Logical order of operations:
CLA, CLL
CMA, CML
IAC
RAR, RAL, RTR, RTL, BSW
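The ordering above can be sketched as a pure function on (link, accumulator). The bit masks follow the standard PDP-8 encoding (bit 0 is the most significant bit), the rotates are treated as 13-bit rotates through the link, and a carry out of IAC flips the link. The function name is hypothetical, and setting both rotate bits at once is rejected because the notes do not define it.

```python
def opr_group1(word: int, link: int, ac: int):
    """Apply a group 1 OPR word; return the new (link, ac)."""
    if word & 0o7400 != 0o7000:
        raise ValueError("not a group 1 OPR instruction")
    if word & 0o200:
        ac = 0                      # CLA
    if word & 0o100:
        link = 0                    # CLL
    if word & 0o040:
        ac ^= 0o7777                # CMA
    if word & 0o020:
        link ^= 1                   # CML
    lac = (link << 12) | ac         # 13-bit link:accumulator
    if word & 0o001:
        lac = (lac + 1) & 0o17777   # IAC; carry out of AC flips the link
    rar, ral, twice = word & 0o010, word & 0o004, word & 0o002
    if rar and ral:
        raise ValueError("RAR and RAL together are not modeled here")
    count = 2 if twice else 1
    if rar:
        for _ in range(count):
            lac = (lac >> 1) | ((lac & 1) << 12)
    elif ral:
        for _ in range(count):
            lac = ((lac << 1) & 0o17777) | (lac >> 12)
    elif twice:
        # BSW: swap the two 6-bit halves of AC, link untouched.
        lac = (lac & 0o10000) | ((lac & 0o77) << 6) | ((lac >> 6) & 0o77)
    return lac >> 12, lac & 0o7777
```

For example, CMA IAC (7041) negates the accumulator: opr_group1(0o7041, 0, 0o0001) returns (0, 0o7777).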
"group 2"
 _0___1___2_ _3_ _4_ _5_ _6_ _7_ _8_ _9_ _10 _11
|           |   |   |SMA|SZA|SNL| 0 |   |   |   |
| 1   1   1 | 1 |CLA|SPA|SNA|SZL| 1 |OSR|HLT| 0 |
|___|___|___|___|___|___|___|___|___|___|___|___|
SMA Skip on Minus Accumulator (skip if high bit of accumulator is set) (if bit 8 is 0)
SPA Skip on Plus Accumulator (skip if high bit of accumulator is clear) (if bit 8 is 1)
SZA Skip on Zero Accumulator (if bit 8 is 0)
SNA Skip on Nonzero Accumulator (if bit 8 is 1)
SNL Skip on Nonzero Link (if bit 8 is 0)
SZL Skip on Zero Link (if bit 8 is 1)
OSR bitwise Or Switch Register into accumulator
HLT HaLT processor
CLA CLear Accumulator
Logical order of operations:
SMA, SZA, SNL
SPA, SNA, SZL
CLA
OSR, HLT
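A sketch of the group 2 ordering: the skip tests read the accumulator before CLA takes effect, then OSR ORs in the switches. It follows the usual PDP-8 rule that the selected conditions are ORed when bit 8 is 0 and the result is inverted when bit 8 is 1 (so bit 8 alone is an unconditional skip). The function name and the switches parameter are illustrative.

```python
def opr_group2(word: int, link: int, ac: int, switches: int = 0):
    """Apply a group 2 OPR word; return (skip, new ac, halt)."""
    if word & 0o7401 != 0o7400:
        raise ValueError("not a group 2 OPR instruction")
    # Skip conditions are evaluated on the old accumulator and link.
    any_true = bool(
        (word & 0o100 and ac & 0o4000)   # SMA: high bit set
        or (word & 0o040 and ac == 0)    # SZA
        or (word & 0o020 and link)       # SNL
    )
    # Bit 8 reverses the sense: SPA, SNA, SZL (all must hold).
    skip = any_true != bool(word & 0o010)
    if word & 0o200:
        ac = 0                           # CLA
    if word & 0o004:
        ac |= switches & 0o7777          # OSR
    halt = bool(word & 0o002)            # HLT
    return skip, ac, halt
```

For example, SZA CLA (7640) on a zero accumulator returns (True, 0, False), and CLA OSR (7604) loads the switch register into the accumulator.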
"mq"
 _0___1___2_ _3_ _4_ _5_ _6_ _7_ _8_ _9_ _10 _11
|           |   |   |   |   |   |   |   |   |   |
| 1   1   1 | 1 |CLA|MQA|   |MQL|   |   |   | 1 |
|___|___|___|___|___|___|___|___|___|___|___|___|
CLA CLear Accumulator
MQL MQ Loads from Accumulator
MQA bitwise Or MQ into Accumulator
Bits 6, 8, 9, and 10 are used for extended arithmetic instructions;
see https://homepage.divms.uiowa.edu/~jones/pdp8/refcard/74.html
Logical order of operations:
CLA
MQA, MQL (simultaneous parallel assignment)
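The "simultaneous parallel assignment" can be sketched by computing both new values from the old ones. This assumes the standard PDP-8 behavior that MQL also clears the accumulator (not stated in the notes above), so MQA and MQL together swap the two registers. The extended arithmetic bits are ignored, and the function name is hypothetical.

```python
def opr_mq(word: int, ac: int, mq: int):
    """Apply the CLA/MQA/MQL bits of an "mq" OPR word; return (ac, mq)."""
    if word & 0o7401 != 0o7401:
        raise ValueError("not an mq-group OPR instruction")
    if word & 0o200:
        ac = 0                                   # CLA happens first
    mqa = bool(word & 0o100)
    mql = bool(word & 0o020)
    # Both results are computed from the values as they stand after CLA.
    new_ac = (0 if mql else ac) | (mq if mqa else 0)
    new_mq = ac if mql else mq
    return new_ac, new_mq
```

For example, opr_mq(0o7521, 0o1111, 0o2222) returns (0o2222, 0o1111), the swap case.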