ELEC 4601 Notes

Uploaded by

kolagdeshpande
Notes - exam only

assembly language (Carleton University)


Downloaded by Alok Deshpande (kolagdeshpande@gmail.com)
von Neumann
• Single memory for programs AND data; simpler design
• Instructions and data cannot be accessed at the same time

Harvard architecture
• Two separate memories for storing data and program; more complex. Ex: ARM Cortex-M cores. (x86 is von Neumann at the main-memory level, although it uses separate instruction and data caches.)
• Allows access to instructions and data at the same time

Timing and control


Instruction cycle – fetching, decoding, and executing one instruction; takes one to five read/write operations (machine cycles)

Machine cycle – time period of each memory or I/O operation

• Each machine cycle consists of a number of clock periods called T states
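To make the relationship between instruction cycle, machine cycles, and T states concrete, here is a minimal sketch. The T-state counts and the 5 MHz clock are made-up example values, not figures for any particular processor.

```python
def instruction_time_ns(t_states_per_machine_cycle, clock_hz):
    """Time for one instruction, in nanoseconds.

    t_states_per_machine_cycle: list with the number of T states
    (clock periods) in each machine cycle of the instruction.
    """
    total_t_states = sum(t_states_per_machine_cycle)
    return total_t_states * 1e9 / clock_hz


# Hypothetical instruction: opcode fetch (4 T states),
# one memory read (3 T states), one memory write (3 T states).
example_ns = instruction_time_ns([4, 3, 3], clock_hz=5_000_000)
print(f"{example_ns:.0f} ns")
```

With these assumed numbers the instruction needs 10 T states of 200 ns each.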


State change -> valid state -> tri-state

x86
• EIP (Extended Instruction Pointer) is the program counter: it points to the
next instruction to execute and advances from one instruction to the
next
• Register Categories
o General Purpose - 8 registers, 16 bits each (or 32 bits with the E prefix, e.g. AX vs. EAX)
▪ EAX - accumulator
• Used for all I/O operations that require data to be
input or output
▪ ECX - counter
• Stores the loop counter
▪ EDX - data
• Used to hold the address of an I/O port
▪ EBX - base
• Base address for table translation (XLAT)
▪ ESP
• Stack pointer: an offset within the stack segment (SS) that points to the top of the stack
• PUSH, POP, CALL, and RET use it
▪ EBP
• Used to access elements on the stack relative to a
fixed point on the stack rather than the top of the stack
▪ ESI
▪ EDI
▪ Available for addressing calculations and for the results of
most arithmetic and logical calculations
o Segment - 6 registers
▪ CS - code segment
• CS selects the segment that contains the currently executing sequence of
instructions; the IP
gives the offset within that segment
▪ DS/ES/FS/GS – data segments
▪ SS - stack segment
▪ Each of the registers specifies a kind of segment: code, data, or stack
o Stack
▪ Portion of memory that holds each function’s arguments, local variables,
and return addresses
▪ POP decreases the size of the stack
• ESP increments
▪ PUSH increases the size of the stack (the stack grows toward lower addresses; the heap behaves the opposite way)
• ESP decrements
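A small sketch of the PUSH/POP behaviour described above, assuming 32-bit (4-byte) stack entries. The `Stack32` class and the starting address 0x1000 are made up for illustration; this is a model, not real x86 code.

```python
class Stack32:
    """Toy model of a descending 32-bit stack."""

    def __init__(self, top=0x1000):
        self.esp = top      # stack pointer (hypothetical start address)
        self.mem = {}       # address -> 32-bit value

    def push(self, value):
        self.esp -= 4                          # PUSH: ESP decrements first
        self.mem[self.esp] = value & 0xFFFFFFFF

    def pop(self):
        value = self.mem.pop(self.esp)         # read the top of the stack
        self.esp += 4                          # POP: ESP increments
        return value


s = Stack32()
s.push(0x11)
s.push(0x22)
print(hex(s.esp), hex(s.pop()), hex(s.esp))
```

The last value pushed is the first one popped, and ESP ends up back where it started once every PUSH has a matching POP.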
• Status flags
o Used to make decisions about branches and to keep track of carries and arithmetic overflow
o x86 stores the status flags in the EFLAGS register
o 3 flag groups: status, control, system
o CF (carry flag)
▪ Carry out generated by the last arithmetic operation. Indicates overflow in unsigned
math; also used for propagating the carry between words
o ZF (zero flag)
▪ Result of last operation was zero
o SF (sign flag)
▪ Result of last operation was negative
o OF (overflow flag)
▪ Overflow of two’s complement arithmetic
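The four flags can be checked by hand on a small example. The sketch below models an 8-bit ADD (8 bits only to keep the numbers short; the same rules apply at 16 and 32 bits). The function name `add8` is made up for this note.

```python
def add8(a, b):
    """Add two 8-bit values and return (result, flags) like an x86 ADD."""
    raw = (a & 0xFF) + (b & 0xFF)
    result = raw & 0xFF
    flags = {
        "CF": raw > 0xFF,              # carry out of bit 7 (unsigned overflow)
        "ZF": result == 0,             # result is zero
        "SF": bool(result & 0x80),     # sign bit of the result
        # Signed overflow: operands have the same sign,
        # but the result has a different sign.
        "OF": bool(~(a ^ b) & (a ^ result) & 0x80),
    }
    return result, flags


print(add8(0x7F, 0x01))  # 127 + 1 overflows in two's complement
print(add8(0xFF, 0x01))  # 255 + 1 overflows in unsigned math
```

Note that CF and OF are independent: 0x7F + 0x01 sets OF but not CF, while 0xFF + 0x01 sets CF but not OF.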

Instruction Execution Timing


Instruction Fetch Timing

1. With the first rising edge of the clock
a. PC register content is placed on the Address Bus
b. And READ signal is activated
2. With the first falling edge, instruction is fetched into IR
3. By second rising edge of the clock, control unit would have decoded and recognized the type of
instruction
4. With the second rising edge of the clock, control unit will begin executing the instruction

READ Instruction Timing

1. With the first rising edge of the clock
a. PC register content is placed on the Address Bus
b. And READ signal is activated
2. With the first falling edge, instruction is fetched into IR
3. By second rising edge of the clock, control unit would have decoded and recognized the type of
instruction
4. With the second rising edge of the clock, control unit will
a. Put memory address in MAR
b. Activate READ signal
5. With second falling edge, data becomes available on the data bus lines ready to be used by CPU

WRITE Instruction Timing

1. With the first rising edge of the clock
a. PC register content is placed on the Address Bus
b. And READ signal is activated
2. With the first falling edge, instruction is fetched into IR
3. By second rising edge of the clock, control unit would have decoded and recognized the type of
instruction
4. With the second rising edge of the clock, control unit will
a. Put memory address in MAR
b. Activate WRITE signal
5. With second falling edge, data becomes available on the data bus lines ready to be written to
MEMORY whenever it is ready


Port-mapped I/O

• The control bus signals that activate the I/O devices are separate from those that activate memory devices, which
protects the I/O devices from common software bugs

Instruction set – the list of commands that a microprocessor is designed to execute

ISA (Industry Standard Architecture)

• Includes the bus structure, which operates at speeds of 8 MHz to 8.33 MHz, in both 8-bit and 16-bit versions
• Specific functions common to all ISA systems are Direct Memory Access, interrupts, keyboard, real-time
clock (RTC), configuration RAM, and ISA timers

ISA (Instruction Set Architecture)

• Defines the interface between the program/software and the hardware for a specific architecture
• Instructions are executed as well-known sequences of steps called
microinstructions, which are stored in the control unit

Daisy Chain

• Implemented in hardware
• If one or more devices interrupt, the CPU saves its state and sends an interrupt acknowledge to the highest-
priority device. If that device accepts it, it provides the means for the processor to find the interrupt vector address; otherwise
the acknowledge is passed on to the next device

Polling

• Handled by software
• The CPU responds to an interrupt by executing one general ISR, then asks each device (e.g. each ADC) in order whether it
generated the interrupt
• Priorities are determined by the order in which the devices get polled
• Very slow; the time required to poll can exceed the time to service
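A rough sketch of software polling inside a single general ISR. The device names are invented; the point is only that the position in the polling list is the priority, and that the number of checks grows with the number of devices.

```python
def poll_for_source(devices):
    """Return (name, checks) for the first device with a pending interrupt.

    devices: list of (name, pending) tuples in polling order.
    The earlier a device appears, the higher its effective priority.
    """
    checks = 0
    for name, pending in devices:
        checks += 1                 # each check costs CPU time
        if pending:
            return name, checks
    return None, checks


# Hypothetical devices: adc1 and uart both request service at once.
print(poll_for_source([("adc0", False), ("adc1", True), ("uart", True)]))
```

Here `adc1` is serviced before `uart` only because it is polled first. A daisy chain makes the same decision in hardware, by passing the acknowledge signal down the chain.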

Glitches


DMA (Direct Memory Access)


• Interrupts are expensive time-wise, so we go back to programmed I/O, but a DMA controller
with direct access to the bus does the transfer instead of the CPU.
• The chip has at least 4 registers:
o 1st contains the memory address to be read or written
o 2nd contains the count of how many bytes or words are to be transferred
o 3rd specifies the device number or I/O space address to use
o 4th specifies the direction: whether to read from or write to the I/O device
• Writing steps
o EX: write a block of 32 bytes from memory address 100 to terminal 4 (device 4)
o 1st reg is 100, 2nd is 32, 3rd is 4, and 4th is 1 (code for write)
o DMA makes a bus request to read byte 100 from memory; after receiving it, it makes an I/O request to write the byte to
device 4
o After both, DMA increments the address register by 1, decrements the count by 1, and repeats until count
= 0
o When the count reaches 0, the DMA controller stops transferring data and asserts the interrupt line to
the CPU
• Cycle stealing – when the DMA controller takes bus cycles away from the CPU
• Burst mode – when DMA transfers a whole block of data before returning bus control to the CPU
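The writing example above (32 bytes from address 100 to device 4) can be traced with a small simulation. This is only a sketch of the register bookkeeping; the function name and the `sink` list standing in for the terminal are made up.

```python
def dma_write_block(memory, start_addr, count, device, sink):
    """Simulate a DMA transfer from memory to an I/O device.

    Returns (address register, count register, interrupt asserted)
    as they stand when the transfer is finished.
    """
    addr_reg, count_reg = start_addr, count
    while count_reg > 0:
        byte = memory[addr_reg]        # bus request: read one byte from memory
        sink.append((device, byte))    # I/O request: write the byte to the device
        addr_reg += 1                  # next memory address
        count_reg -= 1                 # one fewer byte left to move
    interrupt = True                   # count reached 0: interrupt the CPU
    return addr_reg, count_reg, interrupt


memory = {100 + i: i for i in range(32)}   # hypothetical memory contents
terminal = []
print(dma_write_block(memory, 100, 32, 4, terminal))
```

After the loop the address register holds 132 and the count register holds 0, which is the point where the controller asserts the interrupt line.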

Clock skew – the spread in clock arrival times across the circuit (the same clock edge reaches different parts at different times)

Clock jitter – delay that varies from one clock cycle to another

Memory bus – connects CPU and memory/cache, very fast

ISA bus – super slow, used for slow I/O devices like mice

PCI (Peripheral Component Interconnect) – faster rates and higher width

Bus arbiters are used to organize bus sharing since multiple devices can be attempting to use a bus at the
same time

• Centralized: daisy chain using an arbiter
• Decentralized: no arbiter

RISC (Reduced Instruction Set Computers)

• Ex: ARM (armle)
• Aims at getting one instruction executed every cycle
• Simpler instructions
• More suitable for real-time, low-power systems

CISC (Complex Instruction Set Computer)

• CISC performs longer, more complex commands, originally to save memory when RAM was expensive.
• More suitable for PCs, servers, cloud, and network infrastructure: basically high-power systems

Embedded Processors

• Microcontrollers are used rather than microprocessors; programs are written in C

Cortex M4

• Supports sleep modes: the CPU stays in a low-power state until an event wakes it up
o The core goes to sleep with Wait For Interrupt / Wait For Event (WFI,
WFE); the wakeup interrupt controller (WIC) keeps an eye out for requests, and when one is detected the WIC informs the power management unit to
power up
• Enhanced math instructions and an enhanced divide instruction; it can perform two 16-bit MACs at the
same time
• Memory protection unit
• Advanced busses
• Debug support
• Registers
o 16 registers, each 32 bits wide
o R0-R12 - general purpose registers (low R0-R7, high R8-R12)
o R13 - stack pointer (SP)
o R15 - program counter (PC)
▪ Holds the address of the current instruction
o R14 - link register (LR)
▪ Used to store the return address of a subroutine or a function call. The program
counter loads the value from LR after the function is finished
• A good microprocessor uses a multiplexer to switch between a 32-bit instruction set and a 16-bit
Thumb instruction set
o Thumb 1 - 16-bit
o Thumb 2 - mix of ARM (32-bit) and Thumb 1 (16-bit) instructions

Cortex-M4

• Configured to respond to:
o External events
▪ From I/O peripherals
▪ Asynchronous, not related to what code is being executed
o Internal events
▪ For example SysTick, which is a timer interrupt
▪ Synchronous, result of a specific instruction executing
o RESET is highest priority (-3), followed by Non-Maskable Interrupt (-2), then the IRQ handlers
▪ The vector table in memory starts at address 0; each entry sits at exception number * 4
▪ Each interrupt vector’s least significant bit is set to 1 to indicate Thumb state
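A quick sketch of the two vector table rules above: where an entry lives (exception number * 4) and how the Thumb bit is separated from the handler address. The exception numbers used (SysTick = 15, first external IRQ = 16) are the standard Cortex-M ones; the vector value 0x08000199 is a made-up example.

```python
def vector_entry_address(exception_number, table_base=0x00000000):
    """Address of the vector table entry for a given exception number."""
    return table_base + 4 * exception_number


def is_thumb(vector_value):
    """Bit 0 of the stored vector marks Thumb state."""
    return bool(vector_value & 1)


def handler_address(vector_value):
    """Actual handler address: the stored vector with bit 0 cleared."""
    return vector_value & ~1


print(hex(vector_entry_address(15)))   # SysTick entry
print(hex(vector_entry_address(16)))   # first external interrupt (IRQ0)
print(is_thumb(0x08000199), hex(handler_address(0x08000199)))
```

Entry 0 of the table is not a handler: it holds the initial value of the main stack pointer.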

• Cortex-M processors provide a nested vectored interrupt controller (NVIC) for interrupt handling,
similar to the PIC in x86
o The NVIC controls the interrupt vector table (IVT) and how to enable/disable interrupts and set/reset the
related registers

• At reset, all interrupts are disabled and given a priority value of 0, so before using one you must
o Set up priority levels
o Enable the interrupt on the desired pins
o Enable the interrupt in the NVIC
o Set the address of the ISR in the NVIC

The setup of the interrupts will likely be done by the startup code, which is provided by the manufacturer

• The name of the ISR in your code must match the name in the C library

Interrupt handling scenario SLIDE 138

1. Acceptance of interrupt if and only if
• Processor is running
• Interrupt is enabled
• Interrupt has higher priority than the current ISR (if any)
• Interrupt isn’t blocked by an exception masking register
• Thread mode – normal user application
• Handler mode – Main Stack Pointer is used when an interrupt is accepted by the system
2. Interrupt entrance sequence
• Stacking – a number of registers, including the return address
• Fetching – the interrupt/exception vector (can happen in parallel with stacking)
• Fetching – the instructions of the interrupt handler
• Update NVIC registers, CPU registers, and other important registers
3. Interrupt handler execution
• Carry out services within the handler
• If a higher-priority exception occurs, the new interrupt is accepted and the current one is suspended
(nested interrupt)
i. If lower priority, the current one finishes and the next one is serviced after
• At end of ISR, EXC_RETURN (loaded into the PC to trigger unstacking and the return)

PRIMASK – register used when all interrupts (except NMI and HardFault) need to be disabled to perform a time-critical task

BASEPRI – disables interrupts with the same or lower priority than a set level (priority value numerically equal or higher)

Interrupt latency – the time from when the processor receives the request to when the interrupt handler
starts to run

• If CPU and I/O device speeds are close, programmed I/O may be better than interrupt-based I/O because
the context switch overhead (stacking/unstacking) is saved
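The acceptance conditions and the two masking registers can be combined into one check. This is a simplified model: it covers only interrupts with configurable priority, ignores priority grouping, and uses the Cortex-M convention that a lower number means higher priority. The function name is made up.

```python
def is_accepted(irq_priority, enabled, primask, basepri, active_priority=None):
    """Decide whether a pending interrupt is taken right now.

    irq_priority:    priority value of the pending interrupt (lower = more urgent)
    primask:         True if PRIMASK is set (all configurable interrupts masked)
    basepri:         0 disables the BASEPRI mask; otherwise interrupts with
                     priority value >= basepri are blocked
    active_priority: priority of the ISR currently running, or None
    """
    if not enabled or primask:
        return False
    if basepri != 0 and irq_priority >= basepri:
        return False
    if active_priority is not None and irq_priority >= active_priority:
        return False                # not urgent enough to preempt: stays pending
    return True


print(is_accepted(3, True, False, basepri=4))                     # passes BASEPRI
print(is_accepted(5, True, False, basepri=4))                     # blocked by BASEPRI
print(is_accepted(0, True, False, basepri=0, active_priority=1))  # nested interrupt
```

An interrupt that fails the check is not lost; it stays pending and is taken once the mask is cleared or the current ISR finishes.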

I2C – Inter-Integrated Circuit communication

• Invented by Philips in 1982
• Serial communications protocol – frame based
o Bits are sent one by one, not in parallel
o Serial because it needs less PCB routing and a lower pin count, so it is cheaper
• Half duplex
o Info travels in only one direction on the bus at any given time
• Multi-master and multi-slave

BASIC CONCEPT OF MASTER SLAVE

• Two devices connected by 2 wires
o 1 is for the clock signal and is called the serial clock line (SCL)
o 2 is for carrying data, the serial data line (SDA)
o Master always supplies the clock
o Slave basically waits for its address to appear on the bus
o The only other thing needed is a pull-up resistor on each line

Or if you have multiple masters and slaves


Reminder on how pull up resistors even work…

• When A is low, µ1 is off, so SDA is pulled high through the pull-up resistor
• If A is high, µ1 is on, SDA is shorted to GND, and SDA is low

Typical values for RP and VDD

VDD      RP
3.3 V    3.3 kΩ
5 V      4.7 kΩ

• As Rp goes down, power consumption goes up
o Can’t just make Rp huge, because then the RC delay with the bus capacitance is too large and
limits speed
• Data is sent most significant bit first
o No 2 devices can have the same address on the bus
o After every 8 bits, either master or slave must send an acknowledge bit (ACK) (whichever
isn’t sending data at the moment)
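The Rp trade-off can be put in numbers. The sketch below uses the usual RC charging model, with rise time measured from 30% to 70% of VDD, which gives t = Rp * Cb * ln(0.7/0.3). The 100 pF bus capacitance is an assumed example value, and the function names are made up.

```python
import math


def rise_time_s(rp_ohms, bus_capacitance_f):
    """SDA/SCL rise time (30% to 70% of VDD) for an RC-charged line."""
    return rp_ohms * bus_capacitance_f * math.log(0.7 / 0.3)


def pullup_power_w(vdd, rp_ohms):
    """Power burned in the pull-up while the line is held low."""
    return vdd ** 2 / rp_ohms


CB = 100e-12  # assumed bus capacitance: 100 pF
for rp in (1_000, 4_700, 47_000):
    print(rp, f"{rise_time_s(rp, CB) * 1e9:.0f} ns",
          f"{pullup_power_w(5.0, rp) * 1e3:.2f} mW")
```

A smaller Rp gives faster edges but wastes more power every time the line is low; a larger Rp saves power but the slow rising edge limits the bus speed.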

BUS ARBITRATION

• What happens when multiple masters all try to send data at the same time? There’s a system in
place that depends on which master pulls SDA low first
o A 1 on the bus means nobody is driving it (the pull-up holds it high), so when someone drives a 0 the others let
them do it
o Master 1: 1101010 -> gives up at bit 4 because it sends a 1 while M2
pulls the line low
o Master 2: 1100101 -> winner; it keeps the bus until a stop bit is sent
o Master 3: 1111010 -> gives up at bit 3 because it sends a 1 while the other two send lows.
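The three-master example above can be replayed with a wired-AND model of SDA: the bus is low if any active master sends a 0, and a master that sends a 1 but reads back a 0 drops out. The function name is made up; the bit patterns are the ones from the example.

```python
def arbitrate(masters):
    """Simulate I2C arbitration.

    masters: dict of name -> bit string (MSB first, all the same length).
    Returns (list of masters still on the bus, dict of name -> bit number lost at).
    """
    active = set(masters)
    lost_at = {}
    length = len(next(iter(masters.values())))
    for i in range(length):
        # Wired-AND: any '0' from an active master pulls SDA low.
        bus = min(masters[m][i] for m in active)
        for m in sorted(active):
            if masters[m][i] != bus:     # sent 1, saw 0: back off
                lost_at[m] = i + 1       # bit positions counted from 1
        active -= set(lost_at)
    return sorted(active), lost_at


print(arbitrate({"M1": "1101010", "M2": "1100101", "M3": "1111010"}))
```

Nothing on the bus is corrupted by the contest: the winner never notices it, because the bus always showed exactly the bits the winner was sending.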


CLOCK STRETCHING

• Process to slow down communication, using the SCL line
• Temporarily, a slave device overrides the clock line and pulls it low; this is done when the slave needs
more time to supply data. The master sees this and waits until SCL is no longer pulled low
• Advantages:
o Low pin count
o ACK allows for error handling
o Multi-master and multi-slave
o Slow and fast modes
• Disadvantages:
o Frame overhead reduces data speed
o Half duplex
o Hardware gets more complex as the # of masters and slaves increases

Waveform

• When slaves see SDA pulled low while SCL is high, that’s their signal to start listening (START condition)
o If no one is pulling the bus lines low, they default to high due to the pull-up resistors
o I2C specs enforce that the SDA line should never transition while SCL is high, except:
▪ When SDA goes from low to high while the clock is high, which is the STOP
signal and returns SDA to idle
▪ When SDA goes from high to low while the clock is high, which marks the start of a data frame (start bit)
• Of the 7 address bits, the first 4 are specified in the data sheet and the last 3 are specified by the user

• After recognizing it is being addressed, the slave needs to know if it’s a write (0) (master-to-
slave data transfer) or a read (1) (slave-to-master data transfer) request
• To respond with an ACK signal, the slave pulls SDA low; if no slave pulls it low, it’s a no-acknowledge (NACK)
• To change from read to write, the master must send a new START (a repeated START, or a STOP followed by a START) and address the slave again

NOTE: whether the slave sends data to the master or the master sends data to the slave, the ACK signal is sent the opposite
way

Idle state is both lines high because it saves power
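When reading a waveform it helps to know what the first byte after START looks like: the 7-bit address followed by the R/W bit, sent MSB first. A small sketch follows; the address 0x50 is just an example value and the function names are made up.

```python
def address_byte(addr7, read):
    """Build the first byte after START: 7-bit address + R/W bit."""
    if not 0 <= addr7 <= 0x7F:
        raise ValueError("I2C address must fit in 7 bits")
    return (addr7 << 1) | (1 if read else 0)   # write = 0, read = 1


def bits_msb_first(byte):
    """Bits in the order they appear on SDA."""
    return [(byte >> i) & 1 for i in range(7, -1, -1)]


write_byte = address_byte(0x50, read=False)
read_byte = address_byte(0x50, read=True)
print(hex(write_byte), bits_msb_first(write_byte))
print(hex(read_byte), bits_msb_first(read_byte))
```

The ninth clock after these 8 bits is the ACK slot, where the addressed slave pulls SDA low.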

Memory Organization

• CPU Memory
o Is the collection of registers inside the CPU
o Very fast since they are manufactured on the same silicon die/chip
o Very expensive
• Main Memory
o System RAM and ROM; larger and less expensive, but slower
o Static RAM (SRAM)
▪ Data is held until power is removed
▪ 1 memory cell of SRAM consists of 6 or more transistors; cells are organized into arrays
▪ Advantages: nearly as fast as typical CPUs, so mostly used as cache memory
▪ Disadvantages: more expensive and less dense than DRAM, so smaller in capacity than
main memory
▪ Steps of read cycle
• Place the address to be read on the address bus
• Ensure the chip is active by making CS low
• Activate the OE pin so the chip drives the data out
• Required data appears on the data bus
• TAA – read access time, the time from the instant the address is placed on
the address bus to the point when the required data is available on the
data bus
• TRC – read cycle time, which is the minimum time between two read cycles
▪ Steps of write cycle
• Place the address to be written to on the address bus
• Ensure the chip is activated by making CS low
• Place the data to be written on the data bus
• Activate the WR line; only then is the data taken as valid and written

o Dynamic RAM (DRAM)
▪ Each cell consists of a single transistor and a capacitor; the capacitor’s charge decides whether the cell stores
1 or 0
▪ Read cycle
• The row address is placed on the address pins and is latched
• The row address strobe (RAS) is activated
• Row address decoder selects the proper row
• The column address is latched
• Column address strobe (CAS) is activated
• CAS also acts as the output enable, so once the row and column have stabilized the data is placed on the data
bus
• The selected data is available at the output buffers
• CAS and RAS must return to their previous state
• This entire thing is asynchronous
▪ TRAC – the time from when the RAS signal is activated to when the data is available on the
bus
o Typical refresh period is 64 ms; refreshing is done by activating each row
▪ More columns and fewer rows means fewer refresh cycles
o Sensing the small capacitor charge is challenging due to noise from coupling capacitance
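The link between row count and refresh work is simple arithmetic: every row must be activated once per refresh period, so the time budget per row is the period divided by the number of rows. The row counts below are example values.

```python
def row_refresh_interval_us(refresh_period_ms, rows):
    """Average time between row refreshes, in microseconds."""
    return refresh_period_ms * 1000.0 / rows


for rows in (4096, 8192):          # example row counts
    print(rows, row_refresh_interval_us(64, rows), "us per row")
```

Halving the number of rows (with more columns per row) halves the number of refresh cycles needed in each 64 ms window.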

Feature             DRAM               SRAM
Storage circuit     Capacitor          Flip-flop
Transfer speed      Slower than CPU    Almost same as CPU
Latency             Higher             Lower
Density             Higher             Lower
Power consumption   Lower              Higher
Cost                Cheap              Expensive
o Access time: CPU registers < cache < main memory < secondary memory
o DDR RAM (double data rate)
▪ Transfers data on both the rising and the falling clock edge
o Read Only Memory (ROM)
▪ Doesn’t lose contents when power is turned off
▪ EPROM – erasable and programmable ROMs whose contents can be erased by UV
radiation
▪ EEPROM – electrically erasable (SSD, USB, phone)
• Flash ROM – special type of EEPROM that can be erased and
reprogrammed in blocks instead of one byte at a time
• Secondary Memory
o Hard disks: larger and cheaper, but much slower
DECODING

• Linear decoding
o No hardware decoder; CPU address pins are connected directly to the memory chips
o 2 major problems
▪ Multiple addresses can point to the same location
▪ Gaps in address space
• Full decoding
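The aliasing problem of linear decoding can be shown with a toy system. Everything about the setup is assumed for illustration: a 16-bit address bus, one 4 KiB chip wired to A0-A11, and address line A15 used directly as its chip select, with A12-A14 left unconnected.

```python
ADDRESS_BITS = 16       # assumed CPU address bus width
CHIP_ADDR_BITS = 12     # assumed 4 KiB chip: uses A0-A11
SELECT_LINE = 15        # assumed: A15 wired straight to chip select


def chip_selected(addr):
    """Linear decoding: the chip is selected whenever A15 is 1."""
    return bool((addr >> SELECT_LINE) & 1)


def chip_offset(addr):
    """Location inside the chip: only the low address lines reach it."""
    return addr & ((1 << CHIP_ADDR_BITS) - 1)


def aliases(offset):
    """All CPU addresses that land on the same chip location."""
    return [a for a in range(1 << ADDRESS_BITS)
            if chip_selected(a) and chip_offset(a) == offset]


print([hex(a) for a in aliases(0x123)])
```

Because A12-A14 are ignored, each chip location answers to 8 different CPU addresses. Full decoding feeds those unused lines into a decoder so that each location has exactly one address.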
Cache Memory

• Locality of reference – addresses generated by the microprocessor
have a tendency to cluster around a small region of main memory
• The CPU can execute the same instructions of a loop from a fast on-chip
memory instead of reading them from main memory
• Separate caches for instructions and data, because they have very
different address access patterns
• Going down the memory hierarchy: larger and larger storage, but slower access
• A cache is a small, fast memory with special control hardware that permits it to
handle a significant proportion of the accesses required by the CPU

• Spatial locality (location of memory addresses)
o Observation that memory locations with addresses numerically close to a recently accessed
memory location are likely to be accessed soon. Caches bring in more than the
requested data since the neighbouring data will likely be needed.
o Line size is the number of words in the block fetched
• Temporal locality (recently used instructions and operands)
o Save more instructions than what is needed right now, and discard instructions that haven’t been
accessed recently (least recently used replacement)
• There’s always something in the cache; when memory is referenced, the controller circuit checks if it’s in
the cache to save a trip to main memory
• Cache hit – the referenced location is found in the cache
• Direct mapping – easiest
o Main memory address is divided into Tag and Index (the tag allows for x rollovers of the index; each cache entry holds the tag in
front, followed by the data)
o Cache only stores tag and data, not index
o 1- when the CPU wants access to main memory, the index part of the address is used to access the cache first
o 2- the tag part of the address is used to check if the desired memory location is in the cache
o 3- if the tag matches, data is grabbed – CACHE HIT
o 4- if not, the CPU reads from main memory and saves the info to the cache for future use – CACHE MISS
o Simple and less expensive
o Dis: slowed by the tag comparison; multiple addresses with the same index and different tags
cannot be saved at the same time
• Fully associative mapping
o Tag is the physical memory address
o Uses associative memory that can be searched simultaneously
o Stores the full memory address and contents in cache
o Dis: very expensive
o Where to place tag and data from main memory?
▪ Random replacement
▪ Least recently used
• Set associative mapping
o Cache is divided into sets, one for each index
o Removes some hardware
o Each set can be searched associatively
▪ Same index can hold multiple data records with different tags
▪ Within the indexed set, tags are searched using the fast associative memory technique
o WIDTH OF ASSOCIATIVE MEMORY MEANS ONLY TAG
o Index width + tag width = address lines
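The direct-mapping steps (index first, then tag compare, then hit or miss) can be traced with a toy cache. This is a sketch with one word per line; the class name and the memory contents are made up. With 1024 locations the index is 10 bits, and the tag is whatever is left of the address.

```python
class DirectMappedCache:
    """Toy direct-mapped cache: one (tag, data) entry per index."""

    def __init__(self, index_bits):
        self.index_bits = index_bits
        self.lines = {}                     # index -> (tag, data)

    def split(self, addr):
        """Split an address into (tag, index)."""
        index = addr & ((1 << self.index_bits) - 1)
        tag = addr >> self.index_bits
        return tag, index

    def read(self, addr, main_memory):
        """Return (data, hit)."""
        tag, index = self.split(addr)
        line = self.lines.get(index)        # 1: index selects the cache line
        if line is not None and line[0] == tag:
            return line[1], True            # 2-3: tag matches, cache hit
        data = main_memory[addr]            # 4: miss, go to main memory
        self.lines[index] = (tag, data)     #    and save for future use
        return data, False


cache = DirectMappedCache(index_bits=10)    # 1024 locations -> 10 index bits
memory = {0x005: "a", 0x405: "b"}           # same index (5), different tags
for addr in (0x005, 0x005, 0x405, 0x005):
    print(hex(addr), cache.read(addr, memory))
```

The last access misses even though 0x005 was cached earlier: 0x405 shares index 5 and evicted it, which is exactly the direct-mapping disadvantage listed above. A set-associative cache would let both tags live under the same index.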

SPI – Serial Peripheral Interface

• SPI is a bi-directional interface (full duplex)
• 4 wires: 3 for clock and data (SCLK, MOSI, MISO) and 1 for chip select (SS)
• Speedier than I2C, synchronous, multi-master, and NO pull-up resistors
• MOSI – master out, slave in
• MISO – master in, slave out
• SS – slave select
o Multiple GPIO pins can be used as SS pins to select from multiple available slaves, but only
ONE slave can be active at any time
o Slaves not being used will have MISO tri-stated, avoiding bus contention
o When SS goes high to low, it initiates the transfer; it must transition for EVERY transfer
o SS in the master enables the SPI slave for transfer
o SS in the slave, if high, makes the device ignore the clock and keep MISO in high impedance
• SCLK – serial clock
• Max data rate is ½ of the clock rate
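Full duplex in SPI means every clock moves one bit in each direction at once. A common way to picture it is two 8-bit shift registers connected in a ring through MOSI and MISO. The sketch below assumes MSB-first shifting, which these notes do not specify, and the function name is made up.

```python
def spi_exchange(master_byte, slave_byte):
    """Simulate one 8-clock SPI transfer (assumed MSB first).

    Returns (byte now held by the master, byte now held by the slave).
    """
    master, slave = master_byte & 0xFF, slave_byte & 0xFF
    for _ in range(8):                          # one loop pass per SCLK cycle
        mosi = (master >> 7) & 1                # master drives its top bit out
        miso = (slave >> 7) & 1                 # slave drives its top bit out
        master = ((master << 1) & 0xFF) | miso  # master shifts in MISO
        slave = ((slave << 1) & 0xFF) | mosi    # slave shifts in MOSI
    return master, slave


print([hex(b) for b in spi_exchange(0xA5, 0x3C)])
```

After 8 clocks the two bytes have swapped places, so a master that only wants to read still has to send something (often a dummy byte).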

Example problem: figure 8.1 is on the exam

Which is MOSI and which is MISO, and why

MOSI -

MISO - the bottom one is MISO: it is high impedance (flat), then switches on

Exam highlights

10% fill in the blank – covering all material

• SPI stands for… etc.
• Memorizing stuff

10% Multiple choice – covering all material

20% explain something

• Partial point time
• I.e. if you want a microprocessor system that reacts when the power is out
o Non-maskable interrupt
o Lab questions/answers


o Exc_return
o Code
▪ Description of each opcode
▪ But actually memorize pop and push
o Stack operation
▪ How stack operates with commands
▪ Increases or decreases
o Interrupt

10% - I2C – interpret a waveform, answer some short questions

Clock stretching – when the slave holds the clock low until it’s ready

10% - SPI – reminder

10% - memory

10% - cache

Compare SPI to I2C: look at the first 2 slides of each

If you need to transmit super fast, what do you choose?

When to use EEPROM, ROM, etc.

99% for sure on exam – decoding question like the one on the quiz

How to add more memory

How many index bits do you need for a cache with 1024 locations to store things

10 – direct mapping

Using associative memory, what’s the correlation between size of cache and size of address

No relationship

Set associative

Due to locality of ____ it’s advantageous to grab chunks of addresses

20% random question section all arm

Cortex memory mapping (page 232)

Something on glitches and hazards


Microprocessor vs microcontroller

WFI on lab 3 lots of arrows

Noise margin?

Cortex interrupt like on midterm

Short sequences like what happens when arm calls a subroutine

Interrupt vs subroutine
