0. writeback cache
    [✔️] Cache dirty bits
    [✔️] Cache evicting dirty data on fills that would replace
    [✔️] Cache not immediately forwarding writes
    [✔️] Fix mem_cache to actually instantiate memory correctly
    [✔️] Run Quartus in Windows to generate a Verilog template for manual instantiation of RAM blocks
    [X] Try to use asynchronous clears for reset instead of occupying a port for SETS cycles NOPE
    [✔️] Need at least one port capable of read-before-write
    [✔️] Maybe don't need a second port if the first port can make write optional
    [✔️] We might need to split our accesses across two cycles
    [X] If so, can we infer the correct logic without explicit instantiation of the megafunction? NOPE
    [X] Can we do asynchronous clear without explicit instantiation of the megafunction? NOPE
    [✔️] Copy from said template into mem_cache.sv instead of trying to use inference
    ---
    [ ] Arbiter sending snoops to caches in response to CLI writes
    [ ] Cache updating itself to clean state for write snoops
    ---
    [ ] Arbiter sending snoops to caches in response to CLI reads
    [ ] Arbiter waiting for snoop responses from caches for CLI reads
    [ ] Arbiter sending correct data for CLI reads (snoop responses in preference over RAM response)
    [ ] Cache sending snoop responses for read snoops
    ---
    [ ] Cache forwarding snoops upstream
    [ ] Core updating itself for write snoops (no-op)
    [ ] Core sending snoop responses for read snoops (always no data)
1. pipelining that works with SMC / start working on minhdl version of the core
2. write an SPI or I2C master on the FPGA to sample analog inputs
3. support wider-than-single-word cache lines
4. add global shared cache
5. ethernet
∞. fix hello.pal (in noncpu)
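
The write-back behaviour checked off under item 0 (dirty bits, evicting dirty data on a replacing fill, not forwarding writes immediately) can be sketched as a small behavioural model. This is only an illustration in Python, not the logic in mem_cache.sv: the class name, the direct-mapped organisation, the single-word lines, and the dict standing in for RAM are all assumptions, and snooping is left out.

```python
class WritebackCache:
    """Behavioural sketch of a direct-mapped write-back cache (hypothetical model)."""

    def __init__(self, sets, ram):
        self.sets = sets              # number of single-word lines (assumption)
        self.ram = ram                # dict address -> word, stands in for RAM
        self.valid = [False] * sets
        self.dirty = [False] * sets
        self.tags = [0] * sets
        self.data = [0] * sets

    def _claim(self, addr, fetch):
        """Make the line for addr resident, evicting a dirty victim if needed."""
        index, tag = addr % self.sets, addr // self.sets
        if self.valid[index] and self.tags[index] == tag:
            return index                              # hit: nothing to do
        if self.valid[index] and self.dirty[index]:
            victim = self.tags[index] * self.sets + index
            self.ram[victim] = self.data[index]       # write back dirty data
        self.valid[index], self.dirty[index] = True, False
        self.tags[index] = tag
        if fetch:
            self.data[index] = self.ram.get(addr, 0)  # fill from RAM
        return index

    def read(self, addr):
        return self.data[self._claim(addr, fetch=True)]

    def write(self, addr, value):
        # A one-word line is fully overwritten, so no fill is needed on a miss.
        index = self._claim(addr, fetch=False)
        self.data[index] = value
        self.dirty[index] = True                      # not forwarded to RAM yet
```

In this model a write only reaches RAM when a later access maps to the same set with a different tag, which is the "evicting dirty data on fills that would replace" case.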
Command block:
               +--------------+   +----------------+
         +-----| Echo Arbiter |<--| Result Printer |<-------------------------------+
         |     +--------------+   +----------------+                                |
         |             ^                                                            |
         v             |                                                            |
     +--------+   +----------------+                                                |
     | UART 0 |-->| Command Parser |------------+          RAM block:               |
     +--------+   +----------------+            |                                   |
                                                |          +----------------+       |
                                                |          | Off-chip DRAM  |       |
                                                |          +----------------+       |
PDP-8 block:                                    |                ^    |             |
                                                |                |    v             |
+--------+   +-------+                          |          +----------------+       |
| UART 1 |<->| PDP-8 |                          v          | RAM Controller |       |
+--------+   +-------+                   +-------------+   +----------------+   +---------------+
             | Cache |------------------>| Mem Arbiter |-->| Another Cache  |-->| Mem Broadcast |
             +-------+                   +-------------+   +----------------+   +---------------+
                 ^                         ^                                    | |
                 |                         |                                    | |
                 +--------------------------------------------------------------+ |
                                           |                                      |
    *                                      |                                      |
    * (PDP-8s are replicated)              |                                      |
    *                                      |                                      |
                                           |                                      |
+--------+   +-------+                     |                                      |
| UART N |<->| PDP-8 |                     |                                      |
+--------+   +-------+                     |                                      |
             | Cache |---------------------+                                      |
             +-------+                                                            |
                 ^                                                                |
                 |                                                                |
                 +----------------------------------------------------------------+
Echo Arbiter: Trivial priority arbiter (input echo has priority over the result printer)
Mem Arbiter: Adds clog2(1 + number of PDP-8s) tag bits indicating which channel was selected
    For inputs coming from a PDP-8 as opposed to the command parser, add the appropriate prefix to the memory address
Mem Broadcast: Removes clog2(1 + number of PDP-8s) tag bits to determine which channel to send to
    For outputs going to a PDP-8 as opposed to the result printer, strip excess address bits
Note that the mem arbiter and broadcast have to be as wide as (number of
PDP-8s)+1, so they will wind up being bottlenecks. The memory protocol allows
arbitrary stalls, so multi-cycle arbitration is possible.
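
The tagging done by the mem arbiter and undone by the mem broadcast can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: channel 0 being the command parser, PDP-8 k using channel k, a 15-bit per-PDP-8 address, and the prefix simply being the PDP-8 index placed above those bits. The function names are hypothetical.

```python
PDP8_ADDR_BITS = 15                      # assumed width of a PDP-8's own address
PDP8_ADDR_MASK = (1 << PDP8_ADDR_BITS) - 1


def clog2(n):
    """Ceiling of log2(n) for n >= 1, i.e. bits needed to number n channels."""
    return (n - 1).bit_length()


def tag_bits(num_pdp8s):
    """Tag width: one channel per PDP-8 plus one for the command parser."""
    return clog2(1 + num_pdp8s)


def arbiter_tag(channel, addr, num_pdp8s):
    """Return (tag, global_addr) for a request arriving on the given channel."""
    if not 0 <= channel <= num_pdp8s:
        raise ValueError("no such channel")
    if channel == 0:
        return channel, addr             # command parser already uses full addresses
    prefix = channel - 1                 # assumed prefix: index of the PDP-8
    return channel, (prefix << PDP8_ADDR_BITS) | (addr & PDP8_ADDR_MASK)


def broadcast_route(tag, global_addr):
    """Return (channel, addr) for a response, stripping the prefix for PDP-8s."""
    if tag == 0:
        return 0, global_addr            # result printer sees the full address
    return tag, global_addr & PDP8_ADDR_MASK
```

With four PDP-8s this gives `tag_bits(4) == 3`, and a response routed through `broadcast_route` with the tag from `arbiter_tag` comes back to the same channel with its original local address.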
Not shown is the front panel interface.