Overview
- Digital building blocks
- Gates, multiplexers, decoders, registers, arithmetic circuits, counters, memory arrays, logic arrays
- Demonstrate:
- Hierarchy
- Modularity
- Regularity
Adder
1-Bit Adder
Multi-Bit Adder/Carry Propagate Adder (CPA)
Ripple-Carry Adder
- Slow
- Delay: $$t_{ripple} = N t_{FA}$$
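A minimal sketch of the ripple-carry idea, assuming bits are stored LSB first in Python lists. The helper names (`full_adder`, `ripple_carry_add`, `to_bits`, `from_bits`) are illustrative and not from the notes. Each stage waits for the carry of the stage before it, which is why the delay grows linearly with N.

```python
def full_adder(a, b, cin):
    """1-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    return s, cout


def ripple_carry_add(a_bits, b_bits, cin=0):
    """Add two equal-length bit lists (LSB first) by chaining full adders."""
    if len(a_bits) != len(b_bits):
        raise ValueError("operands must have the same width")
    sum_bits = []
    carry = cin
    for a, b in zip(a_bits, b_bits):
        # The carry out of this column feeds the next column
        s, carry = full_adder(a, b, carry)
        sum_bits.append(s)
    return sum_bits, carry


def to_bits(x, n):
    """Integer to n-bit list, LSB first."""
    return [(x >> i) & 1 for i in range(n)]


def from_bits(bits):
    """Bit list (LSB first) back to an integer."""
    return sum(b << i for i, b in enumerate(bits))
```

For example, `ripple_carry_add(to_bits(11, 4), to_bits(6, 4))` yields the 4-bit sum plus a separate carry-out bit.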
Carry-Lookahead Adder
- Fast
- PGK signals
P (propagate)
: A + B, $$C_{out} = C_{in}$$

G (generate)
: A · B, $$C_{out} = 1$$

K (kill)
: ~A · ~B, $$C_{out} = 0$$

- Carry out $$C_i$$ dependent on $$C_{i-1}$$
- $$C_i = A_i B_i + (A_i + B_i) C_{i-1} = G_i + P_i C_{i-1}$$
- Compute $$G_i$$ and $$P_i$$ for all columns
- Compute $$G$$ and $$P$$ for all k-bit blocks
- $$G_{i:j} = G_{i:k} + P_{i:k} G_{k-1:j}$$
- $$P_{i:j} = P_{i:k} P_{k-1:j}$$
- Propagate $$C_{in}$$ through all k-bit blocks
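- $$C_{out} = C_k = G_{k:0} + P_{k:0} C_{in}$$
- $$S_k = P_k \oplus G_{k-1:-1}$$, where $$P_k$$ is taken as $$A_k \oplus B_k$$ for the sum and $$G_{k-1:-1}$$ is the carry into column $$k$$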
- Delay
- $$t_{CLA} = t_{pg} + t_{pg\_block} + (\frac{N}{k} - 1) t_{AND\_OR} + k t_{FA}$$
- $$t_{pg}$$: delay to generate $$G_i$$ and $$P_i$$
- $$t_{pg\_block}$$: delay to generate all group $$G_{i:j}$$, $$P_{i:j}$$
- $$t_{AND\_OR}$$: delay from $$C_{in}$$ to $$C_{out}$$ in a k-bit block
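The carry-lookahead steps and the delay formula can be sketched in Python as follows. This is a behavioral model only, not a gate-level design. The function names and the default widths (`n=16`, `k=4`) are assumptions made for illustration. The block signals are built with the single-column form of the recurrence, $$G_{i:j} = G_i + P_i G_{i-1:j}$$.

```python
def cla_add(a, b, cin=0, n=16, k=4):
    """Add two n-bit integers using k-bit lookahead blocks.

    Returns (sum, carry_out).
    """
    if n % k != 0:
        raise ValueError("n must be a multiple of the block size k")
    # Column signals: G_i = A_i B_i, P_i = A_i + B_i
    g = [(a >> i) & (b >> i) & 1 for i in range(n)]
    p = [((a >> i) | (b >> i)) & 1 for i in range(n)]
    carry_in = [0] * n  # carry_in[i] is the carry into column i
    block_cin = cin
    for lo in range(0, n, k):
        # Block generate/propagate, accumulated from the low column upward
        G, P = g[lo], p[lo]
        for i in range(lo + 1, lo + k):
            G = g[i] | (p[i] & G)
            P = p[i] & P
        # Inside the block, carries ripple from the block carry-in
        c = block_cin
        for i in range(lo, lo + k):
            carry_in[i] = c
            c = g[i] | (p[i] & c)
        # The block carry-out comes directly from the block signals
        block_cin = G | (P & block_cin)
    total = 0
    for i in range(n):
        # Sum bit uses the XOR of the operands with the incoming carry
        total |= (((a >> i) ^ (b >> i) ^ carry_in[i]) & 1) << i
    return total, block_cin


def t_ripple(n, t_fa):
    """Ripple-carry delay: N full-adder delays."""
    return n * t_fa


def t_cla(n, k, t_pg, t_pg_block, t_and_or, t_fa):
    """Carry-lookahead delay with k-bit blocks."""
    return t_pg + t_pg_block + (n // k - 1) * t_and_or + k * t_fa
```

With purely illustrative delays in picoseconds (`t_fa=300`, `t_pg=100`, `t_pg_block=600`, `t_and_or=200`), a 32-bit adder with 4-bit blocks gives `t_cla(32, 4, 100, 600, 200, 300)` against `t_ripple(32, 300)`, which shows where the lookahead saving comes from.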
Prefix
- Faster
Subtractor
- Flip the bits of $$B$$, then add with a carry-in of 1: $$A - B = A + \overline{B} + 1$$
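A short sketch of two's complement subtraction built on an adder, assuming an `n`-bit datapath. The function name `subtract` and the default width are illustrative.

```python
def subtract(a, b, n=8):
    """Compute (a - b) mod 2**n as a + ~b + 1, the way an adder-based subtractor does."""
    mask = (1 << n) - 1
    not_b = ~b & mask           # invert every bit of b within n bits
    return (a + not_b + 1) & mask  # the +1 plays the role of carry-in = 1
```

A negative result appears in two's complement form, so `subtract(3, 5)` returns the 8-bit pattern for -2.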
Comparator
Equality
Less Than
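- Subtract: compute $$A - B$$
- Look at sign: $$A < B$$ if the result is negative (provided the subtraction does not overflow)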
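A sketch of a signed less-than comparator built from a subtractor, assuming `n`-bit two's complement operands. The overflow correction is an addition to the "subtract and look at sign" recipe: without it, the sign bit alone gives the wrong answer when $$A - B$$ overflows. The function names are illustrative.

```python
def to_signed(x, n):
    """Interpret the low n bits of x as a two's complement number."""
    x &= (1 << n) - 1
    return x - (1 << n) if x >> (n - 1) else x


def less_than_signed(a, b, n=8):
    """Return True if a < b, treating both as n-bit two's complement values."""
    mask = (1 << n) - 1
    a &= mask
    b &= mask
    diff = (a + (~b & mask) + 1) & mask   # A - B via the adder
    sign = diff >> (n - 1)                # sign bit of the difference
    sign_a, sign_b = a >> (n - 1), b >> (n - 1)
    # Subtraction overflows when the operand signs differ and the
    # result sign differs from the sign of A
    overflow = int(sign_a != sign_b and sign != sign_a)
    return bool(sign ^ overflow)
```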
ALU
Shifter & Rotator
- Logical shifter
11001 >> 2 = 00110
- Arithmetic shifter
11001 >>> 2 = 11110
- Rotator
11001 ROR 2 = 01110
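The three examples above can be checked with a small sketch that assumes an `n`-bit word held in a Python integer. The function names `lsr`, `asr`, and `ror` are illustrative.

```python
def lsr(x, k, n):
    """Logical shift right: vacated high bits are filled with 0."""
    mask = (1 << n) - 1
    return (x & mask) >> k


def asr(x, k, n):
    """Arithmetic shift right: vacated high bits copy the sign bit."""
    mask = (1 << n) - 1
    x &= mask
    k = min(k, n)
    shifted = x >> k
    if x >> (n - 1):
        # Fill the top k bits with 1s when the sign bit is set
        shifted |= (mask << (n - k)) & mask
    return shifted


def ror(x, k, n):
    """Rotate right: bits shifted off the low end re-enter at the high end."""
    mask = (1 << n) - 1
    x &= mask
    k %= n
    return ((x >> k) | (x << (n - k))) & mask
```

Calling `lsr(0b11001, 2, 5)`, `asr(0b11001, 2, 5)`, and `ror(0b11001, 2, 5)` reproduces the three results listed above.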
Multiplier & Divider
- Form the partial products
- Multiply each single digit of the multiplier by the multiplicand
- Shift each partial product to its column position
- Sum the shifted partial products
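A minimal shift-and-add sketch of the steps above for unsigned `n`-bit operands. In binary, multiplying by a single multiplier digit reduces to an AND with that bit. The function name `multiply` and the default width are assumptions.

```python
def multiply(a, b, n=4):
    """Unsigned n x n multiplication by summing shifted partial products."""
    product = 0
    for i in range(n):
        bit = (b >> i) & 1
        partial = a if bit else 0   # multiplicand AND one multiplier bit
        product += partial << i     # shift to column i, then accumulate
    return product
```

An `n` x `n` multiplier produces a result of up to `2n` bits, so the product is not truncated here.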
4 x 4 Multiplier
4 x 4 Divider
    # A = B * Q + R
    # A / B = Q + R / B
    R' = 0
    for i = N-1 to 0:
        R = (R' << 1) + A[i]
        D = R - B
        if D < 0:
            Q[i] = 0
            R' = R
        else:
            Q[i] = 1
            R' = D
    R = R'
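The division loop can be written as runnable Python along these lines. This sketch assumes unsigned `n`-bit operands and takes the final remainder from the last partial remainder R'. The function name `divide` and the variable names are illustrative.

```python
def divide(a, b, n=4):
    """Unsigned restoring-style division: returns (quotient, remainder)."""
    if b == 0:
        raise ZeroDivisionError("divisor must be nonzero")
    q = 0
    r_prev = 0                      # R' in the pseudocode
    for i in range(n - 1, -1, -1):
        # Shift the partial remainder left and bring down bit i of A
        r = (r_prev << 1) | ((a >> i) & 1)
        d = r - b
        if d < 0:
            r_prev = r              # R < B: quotient bit stays 0
        else:
            q |= 1 << i             # R >= B: quotient bit is 1
            r_prev = d
    return q, r_prev
```

The result satisfies `a == b * q + r` with `0 <= r < b`, matching the identity stated at the top of the pseudocode.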
Counter & Shift Register
Counter
- Digital clock displays
- Program counter
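As a behavioral sketch, an N-bit counter can be modeled as a register that holds its value plus one on each clock edge and wraps around at $$2^N$$. The class name and the synchronous `reset` input are assumptions for illustration.

```python
class Counter:
    """N-bit binary counter that increments on every clock edge."""

    def __init__(self, n):
        self.n = n
        self.q = 0

    def clock(self, reset=False):
        """Advance one clock edge; returns the new count."""
        if reset:
            self.q = 0
        else:
            self.q = (self.q + 1) % (1 << self.n)  # wraps after 2**n - 1
        return self.q
```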
Shift Register
- Serial-to-parallel converter
Shift Register with Parallel Load
Load = 1
: normal N-bit register

Load = 0
: shift register
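A behavioral sketch of a shift register with parallel load, following the Load convention above. The class name, the list-based state (`q[0]` is the stage fed by the serial input), and the signal names `d` and `s_in` are assumptions.

```python
class ShiftRegister:
    """N-bit shift register with parallel load."""

    def __init__(self, n):
        self.n = n
        self.q = [0] * n  # q[0] receives the serial input

    def clock(self, load, d=None, s_in=0):
        """Advance one clock edge; returns the serial output (last stage)."""
        if load:
            # Load = 1: behave as a normal N-bit register
            if d is None or len(d) != self.n:
                raise ValueError("parallel data must have exactly n bits")
            self.q = list(d)
        else:
            # Load = 0: shift every bit one stage along
            self.q = [s_in] + self.q[:-1]
        return self.q[-1]
```

Shifting in N bits with Load = 0 and then reading `q` acts as a serial-to-parallel converter. Loading with Load = 1 and then shifting out does the reverse.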
Memory Array
Types
- Flip-flop
- Random access memory (RAM)
- Dynamic random access memory (DRAM)
- Static random access memory (SRAM)
- Read only memory (ROM)
N address bits, M data bits
- Depth: number of rows (number of words), $$2^N$$
- Width: number of columns (size of each word), $$M$$ bits
- Array size: depth × width = $$2^N \times M$$ bits
- Wordline
- Enables reading or writing the word at the selected address
- Only one wordline high at once
- Bitline
- Carries the data being read or written
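A small behavioral model of a memory array, assuming N address bits select one of $$2^N$$ rows of M bits each. The class and method names are illustrative. The `wordlines` method shows the one-hot decode: exactly one wordline is high for any address.

```python
class MemoryArray:
    """2**N-word by M-bit memory array."""

    def __init__(self, n_addr_bits, m_data_bits):
        self.depth = 1 << n_addr_bits     # number of rows (words)
        self.width = m_data_bits          # number of columns (bits per word)
        self.rows = [0] * self.depth

    @property
    def size_bits(self):
        return self.depth * self.width

    def wordlines(self, addr):
        """One-hot wordline vector produced by the address decoder."""
        return [int(i == addr) for i in range(self.depth)]

    def read(self, addr):
        return self.rows[addr]

    def write(self, addr, data):
        # Keep only the M bits that fit on the bitlines
        self.rows[addr] = data & ((1 << self.width) - 1)
```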
RAM
- Volatile: loses data when power off
- Read/write fast
- e.g. main memory in computer
- Random access: any word is reached with the same delay, in contrast to sequential-access memories, e.g. a tape recorder
DRAM
- Data stored on capacitor
- nMOS transistor as switch
- Needs to be refreshed periodically
- Charge leakage
- Read destroys the stored value
- Read
- Bitline precharged to $$\frac{V_{DD}}{2}$$
- Raise wordline, capacitor shares charge with bitline
- Sense $$\Delta V$$
- Write
- Bitline driven high/low
- Raise wordline, voltage forced into capacitor
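The read step relies on charge sharing between the cell capacitor and the bitline. A rough sketch, assuming ideal capacitors and charge conservation, gives $$\Delta V = (V_{cell} - \frac{V_{DD}}{2}) \frac{C_{cell}}{C_{cell} + C_{bitline}}$$. The function name and the capacitance values in the usage note are illustrative assumptions, not figures from the notes.

```python
def bitline_swing(v_cell, v_dd, c_cell, c_bitline):
    """Voltage change on a bitline precharged to V_DD/2 after the wordline rises."""
    v_pre = v_dd / 2
    # Charge conservation: total charge before sharing equals total charge after
    v_final = (c_cell * v_cell + c_bitline * v_pre) / (c_cell + c_bitline)
    return v_final - v_pre
```

With an assumed 30 fF cell and 300 fF bitline at `v_dd = 1.0`, a stored 1 raises the bitline slightly and a stored 0 lowers it by the same amount. The small swing is why a sense amplifier is needed, and because the cell ends up near the bitline voltage, the read is destructive.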
SRAM
- No periodic refresh needed: cross-coupled inverters continuously restore the stored value
- Noise tolerant
6T SRAM Cell
- Read
- Precharge both bitlines
- Raise wordline
- One of the two bitlines will be pulled down by the cell
- Read stability: A must not flip
- N1 >> N2
- N3 >> N4
- Write
- Drive one bitline high, the other low
- Raise wordline
- Bitlines overpower cell with new value
- Writability: must overpower feedback inverter
- N2 >> P1
- N4 >> P2
Comparison
- Memory latency and throughput also depend on memory size
- Larger memories tend to be slower than smaller ones if all else is the same
- Best memory type
- Speed, cost, power tradeoff
Memory Type | Transistors per Bit Cell | Latency | Description |
---|---|---|---|
Flip-flop | ~20 | Fast | Data immediately available; more area, more power, higher cost |
SRAM | 6 | Medium | |
DRAM | 1 (+ 1 capacitor) | Slow | Bitline not actively driven by a transistor (need to wait for charge to come from capacitors); lower throughput since must refresh data periodically |
ROM
- Nonvolatile: retains data when power off
- Read fast, writing slow/impossible
- e.g. flash memory, thumb drives, digital cameras
- $$Data_2 = A_1 \oplus A_0$$
- $$Data_1 = \overline{A_1} + A_0$$
- $$Data_0 = \overline{A_1}\overline{A_0}$$
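The three equations above describe a ROM used as a lookup table with a 2-bit address and 3-bit data. A short sketch that tabulates the stored words, with the function name `rom_contents` chosen for illustration:

```python
def rom_contents():
    """Return the 4 stored words (Data_2 Data_1 Data_0) indexed by address A1 A0."""
    rom = []
    for addr in range(4):
        a1, a0 = (addr >> 1) & 1, addr & 1
        d2 = a1 ^ a0                 # Data_2 = A1 XOR A0
        d1 = (1 - a1) | a0           # Data_1 = NOT A1 OR A0
        d0 = (1 - a1) & (1 - a0)     # Data_0 = NOT A1 AND NOT A0
        rom.append((d2 << 2) | (d1 << 1) | d0)
    return rom


for addr, word in enumerate(rom_contents()):
    print(f"address {addr:02b} -> data {word:03b}")
```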
Multi-Ported Memory
- 3-ported memory
- 2 read ports
- 1 write port
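A behavioral sketch of a 3-ported memory such as a register file, assuming two combinational read ports and one clocked write port. The port names (`a1`, `a2`, `a3`, `wd3`, `we3`) and the class name are hypothetical labels chosen for this example.

```python
class ThreePortMemory:
    """Memory with 2 read ports and 1 write port."""

    def __init__(self, n_addr_bits, m_data_bits):
        self.mask = (1 << m_data_bits) - 1
        self.rows = [0] * (1 << n_addr_bits)

    def read(self, a1, a2):
        """Both read ports return data in the same cycle."""
        return self.rows[a1], self.rows[a2]

    def clock(self, a3, wd3, we3):
        """On a clock edge, write wd3 to address a3 if write enable is set."""
        if we3:
            self.rows[a3] = wd3 & self.mask
```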
Memory HDL
FPGA
Device | Performance | Cost of Development | Cost of Manufacture | Time to Market |
---|---|---|---|---|
ASIC | Best | High | Low | Slow |
FPGA | Better | Medium | High | Fast |
DSP | Good | Low | Medium | Very fast |
- FPGA: Field Programmable Gate Array
- 2-D arrays of programmable logic cells
- ASIC: Application-Specific Integrated Circuit
- Design optimized
- Cannot change functionality once manufactured
- DSP: Digital Signal Processor
- Specialized microprocessor with an architecture optimized for the operational needs of digital signal processing
- Can use C/C++ to develop
FPGA
- LE: logic element
- IOE: input/output element
- Programmable interconnection
Xilinx Spartan FPGA
Configurable Logic Block
- Components
- 3 lookup tables (LUTs)
- F-, G-, H-LUT
- Configurable multiplexers
- Registers
- Configuration
- LUT
- Multiplexer
- Functions performed
- Up to 2 combinational and/or 2 registered functions
- Each function of 4 to 9 variables
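A lookup table implements any function of its inputs by storing the truth table and using the inputs as an address. The sketch below assumes a 4-input LUT. The helper names `make_lut` and `lut_eval` are illustrative and do not correspond to any vendor tool.

```python
def make_lut(func, k=4):
    """Configure a k-input LUT: store func's output for every input combination."""
    table = []
    for i in range(1 << k):
        inputs = [(i >> j) & 1 for j in range(k)]  # inputs[0] is the LSB of the address
        table.append(func(*inputs) & 1)
    return table


def lut_eval(table, inputs):
    """Evaluate the LUT: the inputs form the address of the stored bit."""
    index = sum(bit << j for j, bit in enumerate(inputs))
    return table[index]


# Example configuration: Y = (A AND B) OR (C XOR D)
table = make_lut(lambda a, b, c, d: (a & b) | (c ^ d))
```

Reconfiguring the FPGA amounts to loading a different table. The evaluation hardware stays the same.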
Design Flow
- Create schematic or HDL description
- Design synthesized onto FPGA
- Synthesis tool determines how the LUTs, multiplexers, and routing channels are configured
- Configuration downloaded to FPGA