ELEC 4601 Notes
von Neumann
• Single memory for programs AND data, simpler design
• Access to instructions and data cannot be performed at the same time
Harvard architecture
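• Two separate memories for storing data and program, more complex. Ex: ARM Cortex-M4 (x86 is von Neumann to the programmer, although its separate instruction and data caches behave in a Harvard-like way)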
• Allows access to instructions and data at the same time
x86
• EIP (Extended Instruction Pointer) is the program counter; it advances from
one instruction to the next and always points to the next instruction to
execute
• Register Categories
o General Purpose - 8 registers, 16 bits each (or 32 bits if there is an E in front of the name)
▪ AX/EAX - accumulator
• used for all I/O operations that require data to be
input or output
▪ ECX - counter
• Stores loop counter
▪ EDX - data
• Used to hold the I/O port address
▪ EBX - base
• Base address for table lookups (the XLAT translate instruction)
▪ ESP
• Reserved as the stack pointer; points to the top of the stack within the stack segment (SS)
• PUSH, POP, CALL, and RET use it
▪ EBP
• Used to access elements on the stack relative to a
fixed point on the stack rather than top of stack
▪ ESI
▪ EDI
▪ Available for addressing calculations and for the results of
most arithmetic and logical calculations
o Segment - 6 registers
▪ CS - code segment
• CS selects the segment containing the currently executing sequence of
instructions; the IP
shows the offset within the segment
▪ DS/ES/FS/GS – data segments
▪ SS stack segment
▪ Each of the registers specifies a kind of segment: code, data, or stack
o Stack
▪ Portion of memory where a function's working data is kept, including
arguments, local variables, and return addresses
▪ POP decreases the size of the stack
• Increments ESP
▪ PUSH increases the size of the stack (the stack grows toward lower addresses; the heap grows the opposite way)
• Decrements ESP
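The PUSH/POP behaviour above can be sketched with a toy model. This is a minimal illustration only, assuming 32-bit entries and a made-up starting ESP of 0x1000; the `Stack` class and its names are hypothetical, not part of any x86 tool.

```python
class Stack:
    """Toy model of the x86 stack: grows downward, 4 bytes per entry."""

    def __init__(self, esp=0x1000):
        self.esp = esp   # hypothetical initial top-of-stack offset
        self.mem = {}    # sparse memory: address -> 32-bit value

    def push(self, value):
        self.esp -= 4                            # PUSH decrements ESP first...
        self.mem[self.esp] = value & 0xFFFFFFFF  # ...then stores at [ESP]

    def pop(self):
        value = self.mem[self.esp]  # POP reads from [ESP]...
        self.esp += 4               # ...then increments ESP
        return value


s = Stack()
s.push(0x11)
s.push(0x22)
esp_after_pushes = s.esp   # two pushes moved ESP down by 8 bytes
first = s.pop()            # last value pushed comes off first
second = s.pop()
```

After the two pops ESP is back at its starting value, which is why every PUSH in a routine needs a matching POP before RET.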
• Status flags
o Used to make decisions about branches and keep track of carries and arithmetic overflow
o x86 uses the EFLAGS register to store the status flags
o 3 flag groups, status, control, system
o CF (carry flag)
Port-mapped I/O
• The control bus signals that activate the I/O devices are separate from those that activate memory devices,
which protects the I/O devices from common software bugs
• The instruction set architecture (ISA) defines the interface between program/software and the hardware for a specific architecture
• Instructions are executed as a well-known sequence of steps called
microinstructions, which are stored in the control unit
Daisy Chain
• Implemented in hardware
• If one or more devices interrupt, the CPU saves its state and generates an interrupt acknowledge to the highest-
priority device; if that device accepts it, it provides the means for the processor to find the interrupt vector address, otherwise
the acknowledge is passed to the next device
Polling
• Handled by software
• CPU responds to an interrupt by executing one general ISR, then the CPU asks each device (e.g. each ADC) in order if it
generated the interrupt
• Priorities are determined by the order in which devices get polled
• Very slow; the time required to poll can exceed the time to service
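A rough sketch of the polling scheme described above, assuming three hypothetical devices named `adc0` to `adc2`; the list order stands in for the polling order, and the function is illustrative rather than real driver code.

```python
def general_isr(status):
    """Poll devices in list order; return the first requester and the poll count."""
    polls = 0
    for name, requesting in status:
        polls += 1                  # every check costs CPU time
        if requesting:
            return name, polls      # earlier in the list == higher priority
    return None, polls


# adc1 and adc2 both request service; adc1 wins because it is polled first.
status = [("adc0", False), ("adc1", True), ("adc2", True)]
serviced, polls = general_isr(status)
```

The poll count grows with the position of the requesting device, which is the source of the slowness noted above.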
Glitches
o When the transfer count reaches 0, the DMA controller stops transferring data and asserts the interrupt line to the
CPU
• Cycle stealing- when DMA controller takes bus cycles away from the CPU
• Burst Mode – when DMA transfers a block of data before returning bus control to CPU
ISA Bus – super slow, used for slow I/O devices like mice
Higher width
Bus arbiters are used to organize bus sharing since multiple devices can be attempting to use a bus at the
same time
• ARM
• Aims at getting one instruction executed every cycle
• Simpler instructions
• More suitable for real time, low power
Embedded Processors
Cortex M4
• Supports sleep modes, in which the CPU stays in a low-power state until an event wakes it up
o The core is put to sleep with the Wait For Interrupt / Wait For Event instructions (WFI,
WFE); the wakeup interrupt controller (WIC) keeps an eye out for interrupt requests, and when one is detected the WIC informs the power management unit to
power up
• Enhanced math instructions and an enhanced divide instruction; it can perform two 16-bit MACs at the
same time
• Memory protection unit
• Advanced busses
• Debug support
• Registers
Cortex-M4
• Cortex-M processors provide a nested vectored interrupt controller (NVIC) for interrupt handling,
similar to the PIC in x86
o The NVIC controls the interrupt vector table (IVT), how interrupts are enabled/disabled, and how the related
registers are set/reset
• At reset, all interrupts are disabled and given a priority value of 0, so before using them you must
o Set up priority levels
o Enable the interrupt on the desired pins
The setup of the interrupts will likely be done by the startup code, which is provided by the manufacturer
PRIMASK – register used when all interrupts need to be disabled to perform a time-critical task (NMI and HardFault are not affected)
BASEPRI – disables interrupts whose priority is the same as or lower than a set level (a value of 0 turns the masking off)
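The effect of the two mask registers can be sketched as a small decision function. This is a simplified model under stated assumptions: it uses the Cortex-M convention that a lower priority number means a higher priority, ignores NMI and HardFault, and ignores how priority bits are packed into the real registers. The function name `is_blocked` is hypothetical.

```python
def is_blocked(priority, primask=0, basepri=0):
    """Return True if an interrupt with this priority value is masked.

    Lower numeric value = higher priority (Cortex-M convention).
    """
    if primask & 1:
        return True    # PRIMASK set: every configurable interrupt is masked
    if basepri != 0 and priority >= basepri:
        return True    # same or lower priority than the BASEPRI level
    return False       # BASEPRI == 0 means no priority masking


blocked_by_primask = is_blocked(3, primask=1)
blocked_by_basepri = is_blocked(5, basepri=4)
allowed_high_priority = not is_blocked(3, basepri=4)
```

BASEPRI is the finer tool: it keeps urgent interrupts alive during a critical section, while PRIMASK stops everything configurable.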
Interrupt latency – the time from when the processor receives the request to when the interrupt handler
starts to run
• If CPU and I/O device speeds are close, programmed I/O may be better than interrupt-based I/O because the
context switch overhead (stacking and unstacking) is saved
o Info travels only one direction on the bus at any given time
• Multi-master and multi slave
VDD     Rp
3.3 V   3.3 kΩ
5 V     4.7 kΩ
• As Rp goes down, power consumption goes up
o Can't just make Rp huge, because the bus capacitance then charges too slowly: the RC delay becomes too large and
limits speed
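The pull-up trade-off above can be put into numbers with a short sketch. The bus capacitance of 100 pF is an assumed value for illustration (it is not from these notes), and the function name is hypothetical.

```python
BUS_CAPACITANCE_F = 100e-12   # ASSUMED bus capacitance (100 pF), not from the notes


def pullup_tradeoff(vdd, rp, cb=BUS_CAPACITANCE_F):
    """Return (current, power, tau) for a pull-up resistor rp on a line held low."""
    current_low = vdd / rp      # current through Rp while a device pulls the line low
    power_low = vdd * vdd / rp  # power burned in Rp while the line is low
    tau = rp * cb               # RC time constant of the rising edge
    return current_low, power_low, tau


# The two rows of the table above, plus a 10x larger resistor for comparison.
i_33, p_33, tau_33 = pullup_tradeoff(3.3, 3300)
i_50, p_50, tau_50 = pullup_tradeoff(5.0, 4700)
i_big, p_big, tau_big = pullup_tradeoff(3.3, 33000)
```

A 10x larger Rp cuts the power by 10x but also makes the rising edge 10x slower, which is the speed limit described above.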
• Data is sent most significant bit first
o No 2 devices can have the same address on bus
o After every 8 bits, either the master or the slave must send an acknowledge bit (ACK) (whichever
isn’t sending data at the moment)
BUS ARBITRATION
• What happens when multiple masters all try to send data at the same time? There’s a system in
place that depends on which master pulls SDA low first
o Putting a 1 on the bus just releases the line, so when another master drives a 0, the 0 wins and the others let
it proceed
o Master 1: 1101010 gives up at bit 4 because it sends a 1 while Master 2 pulls the line
low
o Master 2: 1100101 wins, and keeps the bus until a stop condition is sent
o Master 3: 1111010 gives up at bit 3 because it sends a 1 while the other two pull the line low
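The three-master example above can be replayed with a small simulation. This is a sketch only: it models SDA as a wired-AND line and assumes all masters start in the same clock cycle; the function `arbitrate` is hypothetical.

```python
def arbitrate(masters):
    """masters: dict name -> bit string, sent first bit first.

    Returns (winner, {loser: 1-based bit position where it backed off}).
    """
    active = set(masters)
    lost_at = {}
    n_bits = len(next(iter(masters.values())))
    for i in range(n_bits):
        # Wired-AND: the bus reads 0 if any active master drives a 0.
        bus = min(int(masters[m][i]) for m in active)
        for m in sorted(active):
            if int(masters[m][i]) != bus:   # sent a 1 but read back a 0
                lost_at[m] = i + 1
                active.discard(m)
    winner = active.pop() if len(active) == 1 else None
    return winner, lost_at


winner, lost_at = arbitrate({"M1": "1101010", "M2": "1100101", "M3": "1111010"})
```

No data is corrupted during arbitration, because the winner never sees a bus value that differs from what it sent.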
CLOCK STRETCHING
Waveform
• When slaves see SDA pulled low while SCL is high, that’s their signal to start listening (START condition)
o If no one is pulling the bus lines low, they default to high due to the pull-up resistors
o The I2C spec requires that the SDA line never transitions while SCL is high, with two exceptions:
▪ When the master takes SDA from low to high while the clock is high, it signals the STOP
condition, which returns SDA to idle
▪ When SDA goes from high to low while the clock is high, it marks the start of a data frame (start bit)
• In the 7-bit slave address, the first 4 bits are specified in the data sheet and the last 3 are specified by the user
• After recognizing it is being addressed, the slave needs to know whether it’s a write (0) (master to
slave data transfer) or a read (1) (slave to master data transfer) request
• To respond with an ACK signal, the slave pulls SDA low; if no slave pulls it low, it’s a no-acknowledge (NACK)
• To change from read to write, the master must send a new START (a STOP followed by a START, or a repeated START) and address the slave again
NOTE: whether the slave sends data to the master or the master sends data to the slave, the ACK signal is sent by the receiver, the opposite
way to the data
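The address and R/W bit described above travel together as one byte, which can be sketched as follows. The fixed prefix `0b1010` and user bits `0b000` are hypothetical example values, not taken from any particular data sheet; both helper functions are illustrative.

```python
def build_address(fixed4, user3):
    """Combine the 4 data-sheet bits and the 3 user-selected bits into a 7-bit address."""
    return ((fixed4 & 0xF) << 3) | (user3 & 0x7)


def address_byte(addr7, read):
    """First byte after START: 7-bit address followed by the R/W bit (1 = read)."""
    if not 0 <= addr7 <= 0x7F:
        raise ValueError("7-bit address expected")
    return (addr7 << 1) | (1 if read else 0)


addr = build_address(0b1010, 0b000)          # hypothetical device address
write_byte = address_byte(addr, read=False)  # master-to-slave transfer
read_byte = address_byte(addr, read=True)    # slave-to-master transfer
```

With three user-selectable bits, up to eight devices of the same type can share one bus without an address clash.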
Memory Organization
• CPU Memory
o Is the collection of registers inside CPU
o Very fast since they are manufactured within the same silicon die /chip
o Very expensive
• Main Memory
o System RAM and ROM, larger and less expensive but slower
o Static-RAM
▪ Data is held until power is removed
▪ 1 memory cell of SRAM consists of 6 or more transistors; cells are organized into arrays
▪ Advantages: as fast as typical CPUs, so mostly used as cache memory
▪ Disadvantages: more expensive and less dense than DRAM, so smaller in capacity than
main memory
▪ Steps of read cycle
• Place address to be read on address bus
• Ensure chip is active by making CS low
• Activate the OE (output enable) pin so the chip drives the data out
• Required data appears on the data bus
• TAA – read access time: the time from the instant the address is placed on
the address bus to the point when the required data is available on the
data bus
• TRC – read cycle time: the minimum time between two read cycles
▪ Steps of write cycle
• Place the address to be written to on the address bus
• Ensure chip is activated by making CS low
• Place data to be written on the data bus
• Activate the WR line; only then is the data on the bus taken as valid and written
o Dynamic Ram
▪ Each cell consists of a single transistor and a capacitor; the charge on the capacitor decides whether the cell stores
1 or 0
▪ Read cycle
• The row address is placed on the address pins and latched
• The row address strobe (RAS) is activated
• Row address decoder selects the proper row
• The column address is latched
• Column address strobe (CAS) is activated
• CAS also acts as the output enable, so once the row and column have stabilized the data is placed on the data
bus
• The selected data is available at the output buffers
• CAS and RAS must return to their previous state
• This entire sequence is asynchronous
▪ TRAC – the time from when the RAS signal is activated to when the data is available on the
bus
o Typically refreshed every 64 ms, done by activating each row
▪ More columns and fewer rows means fewer refresh cycles
o Sensing the small capacitor charge is challenging due to noise from coupling capacitance
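The refresh budget above can be worked out with a short sketch. The 64 ms period is from the notes; the row counts are assumed example values, and `refresh_interval_us` is a hypothetical helper.

```python
REFRESH_PERIOD_US = 64_000   # 64 ms refresh period from the notes, in microseconds


def refresh_interval_us(rows):
    """Average time between row refreshes if rows are refreshed evenly."""
    return REFRESH_PERIOD_US / rows


# ASSUMED row counts, chosen only to illustrate the trade-off.
interval_8k_rows = refresh_interval_us(8192)
interval_4k_rows = refresh_interval_us(4096)
```

Halving the number of rows doubles the time available between refresh cycles, which is why an array with more columns and fewer rows needs fewer of them.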
DECODING
• Linear decoding
o No hardware decoder; CPU address pins are connected directly to the memory chips
o 2 major problems
▪ Multiple addresses can point to the same location
▪ Gaps in the address space
• full decoding
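The two problems of linear decoding can be seen in a small model. Everything specific here is an assumption for illustration: a 16-bit address bus, a memory chip with 10 address pins (A0 to A9), and address line A15 wired straight to its chip select.

```python
CHIP_ADDR_BITS = 10   # ASSUMED: chip uses A0..A9
CS_LINE = 15          # ASSUMED: A15 wired directly to chip select (active high here)


def decode(addr):
    """Return the location inside the chip, or None if the chip is not selected."""
    if ((addr >> CS_LINE) & 1) == 0:
        return None                               # gap: nothing responds here
    return addr & ((1 << CHIP_ADDR_BITS) - 1)     # A10..A14 are ignored


same_location = decode(0x8000) == decode(0x8400)  # two addresses, one cell
gap = decode(0x0000)                              # unselected address
aliases = sum(1 for a in range(1 << 16) if decode(a) == 0)
```

Full decoding removes the aliases by feeding every unused address line into the chip-select logic.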
Cache Memory
• Direct mapping
o The main memory address is divided into a tag and an index (the tag allows for x rollovers of the index range; each cache entry holds the tag in
front followed by the data)
o The cache only stores the tag and data, not the index
o 1- when the CPU wants access to main memory, the index part of the address is used to access the cache first
o 2- the tag part of the address is used to check if the desired memory location is in the cache
o 3- if the tag matches, the data is grabbed – CACHE HIT
o 4- if not, the CPU reads from main memory and saves the info to the cache for future use – CACHE MISS
o Simple and less expensive
o Disadvantages: slow due to the tag comparison; multiple addresses with the same index and different tags
cannot be saved at the same time
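The four lookup steps above can be sketched as a toy direct-mapped cache. This is an illustration under stated assumptions: 1024 cache locations (10 index bits), one word per entry, and main memory modelled as a plain dictionary. The class name is hypothetical.

```python
INDEX_BITS = 10                      # ASSUMED: 1024 cache locations
INDEX_MASK = (1 << INDEX_BITS) - 1


class DirectMappedCache:
    def __init__(self):
        self.lines = {}              # index -> (tag, data); the index itself is not stored

    def read(self, addr, main_memory):
        index = addr & INDEX_MASK    # step 1: index selects the cache entry
        tag = addr >> INDEX_BITS     # step 2: tag identifies which address is cached
        entry = self.lines.get(index)
        if entry is not None and entry[0] == tag:
            return entry[1], "hit"   # step 3: tags match
        data = main_memory[addr]     # step 4: fetch from main memory...
        self.lines[index] = (tag, data)   # ...and keep it for next time
        return data, "miss"


memory = {0x0004: 0xAA, 0x0404: 0xBB}   # same index (4), different tags (0 and 1)
cache = DirectMappedCache()
results = [cache.read(a, memory)[1] for a in (0x0004, 0x0004, 0x0404, 0x0004)]
```

The last access misses even though 0x0004 was cached earlier, because 0x0404 shares its index and evicted it; this is the conflict listed as a disadvantage above.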
• Fully associative mapping
o The tag is the full physical memory address
o Uses associative memory, in which all entries can be searched simultaneously
o Stores the full memory address and contents in the cache
o Disadvantage: very expensive
o Where to place the tag and data from main memory when the cache is full?
▪ Random replacement
▪ Least recently used
• Set associative mapping
o Cache is divided into sets, one for each index
o Removes some hardware
o Each set can be searched associatively
▪ The same index can hold multiple data records with different tags
▪ The set selected by the index is searched using the fast associative memory technique
o WIDTH OF THE ASSOCIATIVE MEMORY MEANS ONLY THE TAG WIDTH
o Index width + tag width = address lines
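The width relation above can be checked for the three mapping schemes with a small helper. The 16 address lines and the 4-way organisation are assumed example values, and `field_widths` is a hypothetical function; cache sizes are taken to be powers of two.

```python
def field_widths(address_lines, cache_locations, ways=1):
    """Return (index_bits, tag_bits) for a cache with the given organisation."""
    sets = cache_locations // ways
    index_bits = sets.bit_length() - 1      # log2 of a power of two
    tag_bits = address_lines - index_bits   # index width + tag width = address lines
    return index_bits, tag_bits


direct = field_widths(16, 1024)               # direct mapping: one entry per index
four_way = field_widths(16, 1024, ways=4)     # set associative: 256 sets of 4
fully = field_widths(16, 1024, ways=1024)     # fully associative: a single set
```

In the fully associative case the index disappears and the tag is the whole address, so the cache size no longer depends on the address width.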
• MOSI – Master out slave in
• MISO – master in slave out
• SS- slave select
o Multiple GPIO pins can be used as SS pins to select from multiple available slaves, but only
ONE slave can be active at any time
o Slaves not being used will have MISO as tristate, avoiding bus contention
o When SS goes high to low, it initiates the transfer, must transition for EVERY transfer
o SS in the master enables the SPI slave for transfer
o SS in the slave, if high, makes the device ignore the clock and keep MISO in high impedance
• SCLK - serial clock
• Max data rate is ½ of the clock rate
Mosi-
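The MOSI/MISO exchange above can be pictured as two shift registers swapping contents. This is a sketch under assumptions not stated in the notes: 8-bit transfers, most significant bit first, and one bit moved in each direction per clock. The function `spi_transfer` is hypothetical.

```python
def spi_transfer(master_byte, slave_byte):
    """Clock 8 bits in both directions; return (master_reg, slave_reg) afterwards."""
    m, s = master_byte & 0xFF, slave_byte & 0xFF
    for _ in range(8):                  # one SCLK cycle per bit
        mosi = (m >> 7) & 1             # master shifts its MSB out on MOSI
        miso = (s >> 7) & 1             # slave shifts its MSB out on MISO
        m = ((m << 1) & 0xFF) | miso    # master shifts the MISO bit in
        s = ((s << 1) & 0xFF) | mosi    # slave shifts the MOSI bit in
    return m, s


master_after, slave_after = spi_transfer(0xA5, 0x3C)
```

After eight clocks the two registers have swapped contents, so every SPI write is also a read.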
Exam highlights
o EXC_RETURN
o Code
▪ Description of each opcode
▪ But actually memorize pop and push
o Stack operation
▪ How stack operates with commands
▪ Increases or decreases
o Interrupt
Clock stretching - when the slave holds the clock low until it's ready
10% - memory
10% - cache
99% for sure on exam – decoding question like the one on the quiz
How many index bits do you need for a cache with 1024 locations to store things?
10 – direct mapping
Using associative memory, what's the correlation between the size of the cache and the size of the address?
No relationship
Set associative
Microprocessor vs microcontroller
Noise margin?
Interrupt vs subroutine