Unit-5 Cortex and OMAP Processors (MICROCONTROLLERS)
The ARM Cortex™ Family
ARM Cortex A Series - Applications CPUs focused on the execution of complex OS and user applications
Cortex-A9 (x1-4)
Cortex-A8
ARM Cortex M Series - Microcontroller cores focused on very cost sensitive, deterministic, interrupt driven environments
Cortex-M3
SC300
Cortex-M1
Cortex-M0
[Figure: Cortex cores ordered by performance]
Introduction to the ARM® Cortex™-M Architecture
CORTEX-M3
Introduction to Cortex-M3 Processor
Cortex-M3 Architecture
Harvard bus architecture
3-stage pipeline with branch speculation
Integrated bus matrix
Configurable nested vectored interrupt controller (NVIC)
Advanced configurable debug and trace components
* Cortex-M3 Release 2
Cortex-M3 Processor Overview
Central Core: 1.25 DMIPS/MHz, 1 Cycle Multiply, Hardware Divide
Integrated Nested Vectored Interrupt Controller and SYSTICK Timer
Wake-Up Interrupt Controller: for Low Power Stand by Operation
Debug Access Port: JTAG or Serial Wire
Instrumentation Trace Macrocell (ITM) for Data Trace via Single Wire Output
2x AHB-Lite Buses: I_CODE (Instruction Code Bus) and D_CODE (Data / Coefficients Code Bus)
1x AHB-Lite Bus: SYSTEM (SRAM & Fast Peripherals)
1x APB Bus: ARM Peripheral Bus (Internal & Slow Peripherals)
ARM Cortex-M3 - Designed for Performance
High Performance
High efficiency processor core – 1.25 DMIPS/MHz
Advanced instructions for data manipulation
Single Cycle Multiply
Hardware Division
Bit Field Manipulation
[Chart: relative DMIPS/MHz]
ARM Cortex-M3 - Designed for Low Power
Cortex-M3 features architected support for sleep states
Enables ultra low-power standby operation
Critical for extended life battery based applications
Includes very low gate count Wake-Up Interrupt Controller (WIC)
ARM Cortex-M3 - Designed for Robustness
Cortex-M3 supports design of robust applications
Processor Modes
Separation of main and interrupt code
Privilege Levels
Separation of RTOS and user application
NMI
Inform processor of critical events
SYSTICK
Protected system timer for pre-empting RTOS
Fixed Memory Map
MPU*
Separation of user application tasks
Fault Robust Observation Interface*
IEC61508 standard SIL3 certification
*optional
[Figure: fault-robust components (fRFMEA analysis, fRCPU) around the ARM Cortex-M3 core]
CoreSight™ Debug & Trace
ARM CoreSight is a complete on-chip debug and real-time trace solution for the entire system-on-chip (SoC)
Configurable to adapt for market requirements
Debug components can be configured or even removed (check device manufacturer data sheet)
CORTEX-M1
ARM Cortex-M1
Soft processor for FPGA
Upwards compatible with Cortex family on
ASIC/ASSP/MCU
3 stage pipeline
Delivers 0.8 DMIPS/MHz
Capable of up to 200MHz
ARM Cortex-M1 Processor Features
A 3-stage, 32-bit RISC processor
Highly configurable to enable design trade-offs
Retains the same programmers model for software simplicity
Configurable debug
JTAG or reduced pin-count SWD interface
Full – 2 watchpoints, 4 breakpoints
Small – 1 watchpoint, 2 breakpoints
None – removable for cost reduction and security
ARM Cortex-M1 Processor Features (2)
Integrated Interrupt Controller
Fast interrupt response
Configurable for 1, 8, 16 or 32 interrupts
Software programmed priority levels (1-4)
Non-Maskable Interrupt
Multiplier
Fast option uses FPGA DSP blocks
Small option uses adder to save area, can use DSP blocks
Program function is the same with either - no need for software modifications
ARM Cortex-M1 Speed and Area
Results below are to give a guideline for MHz and area
The nature of FPGA implementation means results may change per system
The results will also change as tools and FPGA evolve
ARM Cortex-M1 Instruction Set
Cortex-M1 implements an ISA based primarily on Thumb
The high density 16-bit ISA introduced in ARM7TDMI
Cortex-M1 includes a few Thumb-2 system instructions
To allow operation in Thumb state only
Enables upward binary compatibility with Cortex-M3
Cortex-M1 ISA
Thumb (user code, compiler generated):
ADC ADD ADR AND ASR B BIC BL BX CMN CMP EOR
LDM LDR LDRB LDRH LDRSB LDRSH LSL LSR MOV MUL MVN NEG
ORR POP PUSH ROR RSB SBC STM STR STRB STRH SUB SVC
TST BKPT BLX CPS CPY REV REV16 REVSH SXTB SXTH UXTB UXTH
Thumb-2 (OS & system):
NOP SEV WFE WFI YIELD DMB DSB ISB MRS MSR
CORTEX-M0
ARM Cortex-M0 Processor
The smallest, lowest power ARM processor ever
A third of the area of ARM7TDMI-S
85 μW/MHz, 12K gates *
ARM Cortex-M0 Processor Features
Cortex-M0 RTL is configurable
Tune for your application
Check device manufacturer data sheet
Consistent programmer’s model
Software compatibility
All tools remain compatible
Integrated Interrupt Controller (NVIC)
1, 8, 16, 24 or 32 interrupts
Multiplier options
Fast or small (1 or 32 cycle)
Optional OS extensions
SYSTICK Timer
PendSV (Pending System Call)
Configurable debug
4 or 2 breakpoints, 2 or 1 watchpoints
JTAG or SWD interface
Energy Efficiency
Cortex-M0 designed for excellent power efficiency
Significantly less activity required to match 8/16-bit device performance
Fast interrupt response minimizes time in active state
Lower energy for an identical task
[Figure: power versus time plots comparing the energy used for the same task]
Architected Sleep States
Cortex-M0 processor supports ultra low-power standby implementation
Critical for extended life battery-based applications
Includes very low gate count Wake-Up Interrupt Controller (WIC)
20
ARM Cortex-M0 Instruction Set
Cortex-M0 implements an ISA based primarily on Thumb
The high density 16-bit ISA introduced in ARM7TDMI
Cortex-M0 includes a few Thumb-2 system instructions
To allow operation in Thumb state only
Enables binary upwards compatible with Cortex-M3
Thumb Thumb-2
User code, compiler generated OS & system
ADC ADD ADR AND ASR B NOP
BIC BL BX CMN CMP SEV WFE
EOR LDM LDR LDRB LDRH LDRSB WFI YIELD
LDRSH LSL LSR MOV MUL MVN DMB
NEG ORR POP PUSH ROR RSB DSB
SBC STM STR STRB STRH SUB ISB
SVC TST BKPT BLX CPS CPY MRS
REV REV16 REVSH SXTB SXTH UXTB MSR
UXTH
Cortex-M0 ISA
21
Introduction to the ARM® Cortex™-M Architecture
CORTEX-M - ARCHITECTURE
ARM Cortex-M - Designed for Ease of Use
Virtually everything can be written in C/C++
ARM Cortex-M Processors
Cortex-M family optimised for deeply embedded microcontroller and low-power applications
Processor Mode
Handler Mode
Used to handle exceptions. The processor returns to Thread mode when it has finished exception processing.
Thread Mode
Used to execute application software. The processor enters Thread mode when it comes out of reset.
Privilege Levels
Unprivileged
The software:
has limited access to the MSR and MRS instructions, and cannot use the CPS instruction
cannot access the system timer, NVIC, or system control block
might have restricted access to memory or peripherals.
Unprivileged software executes at the unprivileged level
Privileged
The software can use all the instructions and has access to all resources.
Register File
All registers are 32-bit wide
13 general purpose registers (R0 - R12)
R0 - R7 are accessible by any instruction
R8 - R12 are accessible to a few 16-bit instructions and to all 32-bit instructions
3 registers with special meaning/usage
R13 - Stack Pointer (SP), 2 banked copies - Main (MSP) and Process (PSP)
R14 - Link Register (LR)
R15 - Program Counter (PC)
Special-purpose registers
PSR
PRIMASK
FAULTMASK*
BASEPRI*
CONTROL
*Not available in Cortex-M0 / Cortex-M1
[Figure: register bank; R13 (MSP) is available in Thread and Handler mode, R13 (PSP) in Thread mode]
Program Status Registers
PSR - Program Status Register combines
APSR - Application Program Status Register
Negative, Zero, Carry and oVerflow flags, plus the Q sticky saturation flag
IPSR - Interrupt Program Status Register
Exception Number – Indicates which exception the processor is handling
EPSR - Execution Program Status Register
ICI/IT – Interrupt Continuable Instruction, IF-THEN instruction status
Thumb – Always 1
CPU instructions MSR and MRS allow access (together or separately)
For example: MSR PSR, r0
MRS r1, IPSR
Register fields:
APSR: N Z C V Q
IPSR: EXCEPTION NUMBER
EPSR: ICI/IT, T, ICI/IT
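The way the three views share one 32-bit word can be sketched in C. This is a host-side illustration only: the bit positions (N, Z, C, V, Q in bits 31 to 27, T in bit 24, exception number in bits 8 to 0) are taken from the ARMv7-M architecture rather than from the slide, and the names `psr_fields_t` and `psr_decode` are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded view of a combined PSR value (hypothetical helper type). */
typedef struct {
    bool n, z, c, v, q;        /* APSR: condition flags and sticky saturation */
    bool thumb;                /* EPSR: T bit, always 1 on Cortex-M */
    uint16_t exception_number; /* IPSR: 0 means Thread mode */
} psr_fields_t;

/* Split a raw PSR word into its APSR, EPSR and IPSR fields. */
psr_fields_t psr_decode(uint32_t psr)
{
    psr_fields_t f;
    f.n = ((psr >> 31) & 1u) != 0u;
    f.z = ((psr >> 30) & 1u) != 0u;
    f.c = ((psr >> 29) & 1u) != 0u;
    f.v = ((psr >> 28) & 1u) != 0u;
    f.q = ((psr >> 27) & 1u) != 0u;
    f.thumb = ((psr >> 24) & 1u) != 0u;
    f.exception_number = (uint16_t)(psr & 0x1FFu);
    return f;
}
```

On a target, the raw value would come from an `MRS` read; here it is simply passed in as a number.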
Memory Access
The Cortex-M3 has an internal Harvard architecture
Separate instruction and data interfaces
In a typical system:
Instructions stored in Code space (in Flash)
SRAM accessed across System bus
Memory Map
Linear 4GB memory map
Fixed map required to host system components and simplify implementation
Bus Matrix partitions memory access via the AHB and PPB buses
Region start addresses, from the top of the map (FFFFFFFF) down:
E0100000 Vendor Specific
E0040000 External PPB (APB)
E0000000 SCS + NVIC
40000000 Peripheral, ½GB (BB)
20000000 RAM, ½GB (EX + BB)
00000000 Code ROM/RAM, ½GB (HX)
EX = Code execution support
HX = High performance code execution support
BB = Bit banding support
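Because the map is fixed, an address can be classified with a few comparisons. The sketch below is a host-side illustration, not vendor code. The external RAM and external device boundaries at 0x60000000 and 0xA0000000 are assumptions taken from the ARMv7-M memory map, and the enum and function names are hypothetical.

```c
#include <stdint.h>

/* Regions of the fixed Cortex-M3 memory map (hypothetical names). */
typedef enum {
    REGION_CODE,            /* 0x00000000: Code ROM/RAM, 1/2 GB */
    REGION_SRAM,            /* 0x20000000: RAM, 1/2 GB */
    REGION_PERIPHERAL,      /* 0x40000000: Peripheral, 1/2 GB */
    REGION_EXTERNAL_RAM,    /* 0x60000000: assumed from ARMv7-M */
    REGION_EXTERNAL_DEVICE, /* 0xA0000000: assumed from ARMv7-M */
    REGION_PPB,             /* 0xE0000000: SCS + NVIC and external PPB */
    REGION_VENDOR           /* 0xE0100000: vendor specific */
} region_t;

/* Return the region that contains addr. */
region_t region_of(uint32_t addr)
{
    if (addr < 0x20000000u) return REGION_CODE;
    if (addr < 0x40000000u) return REGION_SRAM;
    if (addr < 0x60000000u) return REGION_PERIPHERAL;
    if (addr < 0xA0000000u) return REGION_EXTERNAL_RAM;
    if (addr < 0xE0000000u) return REGION_EXTERNAL_DEVICE;
    if (addr < 0xE0100000u) return REGION_PPB;
    return REGION_VENDOR;
}
```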
Thumb-2 Technology
Thumb-2 technology was introduced with the ARMv6T2 architecture and is the basis of the ARMv7-M instruction set
[Figure: ARM, Thumb and Thumb-2 instruction sets]
Software Compatibility
Cortex-M0/M1 implement an ISA based primarily on Thumb
The high-density 16-bit instruction set introduced in ARM7TDMI
Instruction Set Comparison
ADC ADD ADR AND ASR B CLZ
Nested Vectored Interrupt Controller
NVIC is a core peripheral
Consistent between Cortex-M cores
Tailored towards fast and efficient interrupt handling
Number of interrupts can be configured by device manufacturer
1 ... 240 interrupt channels for M3
1 ... 32 interrupt channels for M0
Each peripheral has its own interrupt vector(s)
8 – 256 interrupt priorities (1 - 4 Cortex M1/M0)
Configured via memory-mapped control registers
Exceptions/interrupts processed in Handler mode
Supervisor privilege
Interruptible LDM/STM (and PUSH/POP) for low interrupt latency
Continued on return from interrupt
Non Maskable Interrupt (NMI)
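The "8 – 256 interrupt priorities" range corresponds to a device implementing between 3 and 8 priority bits. The following C sketch shows the arithmetic, assuming the usual Cortex-M convention that the implemented bits occupy the most significant end of each 8-bit priority field. The function names are hypothetical and the code only computes values, it does not touch NVIC registers.

```c
#include <stdint.h>

/* Number of distinct priority levels for a given number of implemented bits. */
unsigned nvic_priority_levels(unsigned prio_bits)
{
    return 1u << prio_bits;
}

/* Value to place in an 8-bit priority field for a logical priority level.
 * The implemented bits are left-aligned, so the level is shifted up. */
uint8_t nvic_priority_byte(uint8_t level, unsigned prio_bits)
{
    return (uint8_t)(((unsigned)level << (8u - prio_bits)) & 0xFFu);
}
```

For example, with 3 implemented bits, level 5 is stored as 0xA0. A lower numeric value means a higher priority.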
Nested Vectored Interrupt Controller
When an interrupt occurs:
In parallel, the processor state is saved over the SYSTEM bus
[Figure: Cortex-M3 bus structure (ICODE AHB, DCODE AHB, SYSTEM AHB, APB) with the Debug, Bit-Bander, Aligner and Patch blocks, shown beside the memory map, which here also shows External RAM, 1GB, at 60000000 (EX) and a boundary at A0000000]
Vector Table
ARMv7-M architecture implements a relocatable vector table
Contains initial stack pointer value
Contains the address of RESET handler
Contains the address of the function to execute for a particular handler
The first sixteen entries are special with the others mapping to specific interrupts
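The table layout can be sketched as address arithmetic in C: each entry is one 32-bit word, entry 0 holds the initial stack pointer, entry 1 the reset handler address, and external interrupt n uses exception number 16 + n. The function names and the example base address are hypothetical.

```c
#include <stdint.h>

/* External interrupts start after the sixteen system entries. */
#define FIRST_EXTERNAL_EXCEPTION 16u

/* Exception number used by external interrupt line irq. */
uint32_t exception_number_of_irq(uint32_t irq)
{
    return FIRST_EXTERNAL_EXCEPTION + irq;
}

/* Address of the vector table word for a given exception number.
 * table_base is wherever the (relocatable) table currently lives. */
uint32_t vector_address(uint32_t table_base, uint32_t exception_number)
{
    return table_base + 4u * exception_number;
}
```

With the table at address 0, the reset vector is read from 0x00000004, and interrupt #239 (exception 255) from 0x000003FC.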
Exception & Pre-emption Ordering
Exception handling order is defined by programmable priority
Reset, Non Maskable Interrupt (NMI) and Hard Fault have predefined pre-emption.
NVIC catches exceptions and pre-empts current task based on priority
…
255 Interrupt #239 Programmable External Interrupt #239
Exception Model
Exceptions cause current machine state to be stacked
Stacked registers conform to Embedded Application Binary Interface (EABI)
Exception handlers are trivial as register manipulation is carried out in hardware
No assembler code required
Simple ‘C’ interrupt service routines
void MY_IRQHandler(void) { /* my handler */ }
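What the hardware stacks on exception entry can be pictured as a C struct. This is a sketch of the basic frame on a core without a floating-point unit: eight words, with R0 at the lowest address, which is why a plain C function can serve as the handler. The type and function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Registers pushed automatically on exception entry, lowest address first. */
typedef struct {
    uint32_t r0, r1, r2, r3; /* argument / scratch registers */
    uint32_t r12;            /* scratch register */
    uint32_t lr;             /* link register of the interrupted code */
    uint32_t pc;             /* return address */
    uint32_t xpsr;           /* program status register */
} exception_frame_t;

/* Copy the eight stacked words found at sp into a frame structure. */
exception_frame_t read_exception_frame(const uint32_t *sp)
{
    exception_frame_t frame;
    memcpy(&frame, sp, sizeof frame);
    return frame;
}
```

A fault handler could use a helper like this to report the stacked PC of the instruction that was interrupted.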
Interrupt Response – Tail Chaining
[Figure: timing of IRQ1 (highest priority) and IRQ2]
ARM7TDMI interrupt handling: Push (26 cycles), ISR 1, Pop (16 cycles), Push (26 cycles), ISR 2, Pop (16 cycles)
ARM7TDMI
26 cycles from IRQ1 to ISR1 (up to 42 cycles if in LSM)
42 cycles from ISR1 exit to ISR2 entry
16 cycles to return from ISR2
Cortex-M3
12 cycles from IRQ1 to ISR1 (Interruptible/Continual LSM)
6 cycles from ISR1 exit to ISR2 entry
12 cycles to return from ISR2
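The cycle counts quoted for this slide can be totalled for a run of back-to-back interrupts. The C sketch below is only arithmetic on those figures (26/16 cycles for ARM7TDMI push/pop, 12/6/12 for Cortex-M3 entry, tail-chain and exit), and the function names are hypothetical. It ignores the time spent inside the handlers themselves.

```c
/* Cycle counts as quoted on the slide. */
enum {
    ARM7_PUSH = 26, ARM7_POP = 16,
    CM3_ENTRY = 12, CM3_TAIL_CHAIN = 6, CM3_EXIT = 12
};

/* ARM7TDMI: every handler pays a full push and a full pop. */
unsigned arm7_overhead_cycles(unsigned handlers)
{
    return handlers * (unsigned)(ARM7_PUSH + ARM7_POP);
}

/* Cortex-M3: one entry, one exit, and a tail-chain between handlers. */
unsigned cm3_overhead_cycles(unsigned handlers)
{
    if (handlers == 0u) {
        return 0u;
    }
    return (unsigned)CM3_ENTRY
         + (handlers - 1u) * (unsigned)CM3_TAIL_CHAIN
         + (unsigned)CM3_EXIT;
}
```

For the two-interrupt case on the slide this gives 84 cycles of overhead on ARM7TDMI against 30 on Cortex-M3.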
Interrupt Response – Late Arriving
[Figure: timing of IRQ1 (highest priority) arriving after IRQ2]
ARM7TDMI interrupt handling: Push (26 cycles), Push (26 cycles), ISR 1, Pop (16 cycles), ISR 2, Pop (16 cycles)
Cortex-M3 interrupt handling: Push (12 cycles), ISR 1, Tail-Chaining (6 cycles), ISR 2, Pop (12 cycles)
ARM7TDMI
26 cycles to ISR2 entered
Immediately pre-empted by IRQ1
Additional 26 cycles to enter ISR1.
ISR 1 completes
Additional 16 cycles return to ISR2.
Cortex-M3
12 cycles to ISR entry
Parallel stacking & instruction fetch
Target ISR may be changed until last cycle (PC is set)
When IRQ1 occurs new target ISR set
Interrupt Response – Pop Pre-emption
[Figure: timing of IRQ2 arriving while ISR 1 is returning]
ARM7TDMI interrupt handling: ISR 1, Pop (16 cycles), Push (26 cycles), ISR 2, Pop (16 cycles)
Cortex-M3 interrupt handling: ISR 1, Abandon Pop (1-12 cycles), Tail-Chaining (6 cycles), ISR 2, Pop (12 cycles)
ARM7TDMI
Load multiple not interruptible
Core must complete the recovery of the stack then re-stack to enter the ISR
Cortex-M3
Hardware un-stacking interruptible
If interrupted only 6 cycles required to enter ISR2
SYSTICK Timer
Simple 24 Bit down counter
4 Registers
CTRL - Control and Status
LOAD – Reload Value
VAL – Current Value
CALIB - Calibration Value
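Since the counter is 24 bits wide and counts from the reload value down to zero, one period lasts reload + 1 clock cycles. A minimal C sketch of choosing the reload value follows. The function name and the 72 MHz clock in the usage note are assumptions for illustration, and the code only computes the value, it does not write the LOAD register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Largest value that fits in the 24-bit reload register. */
#define SYSTICK_MAX_RELOAD 0x00FFFFFFu

/* Compute the reload value for tick_hz interrupts per second.
 * Returns false if the request cannot be met by a 24-bit counter. */
bool systick_reload_for(uint32_t clock_hz, uint32_t tick_hz, uint32_t *reload)
{
    if (tick_hz == 0u || clock_hz < tick_hz) {
        return false;
    }
    uint32_t cycles_per_tick = clock_hz / tick_hz;
    if (cycles_per_tick - 1u > SYSTICK_MAX_RELOAD) {
        return false; /* period too long for 24 bits */
    }
    *reload = cycles_per_tick - 1u; /* counter runs reload..0 inclusive */
    return true;
}
```

With a hypothetical 72 MHz clock, a 1 ms tick needs a reload value of 71999, while a 1 s tick is rejected because 71999999 does not fit in 24 bits.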
Atomic Bit Manipulation
Internal and peripheral data represented as individual bits without the processing overhead normally associated with this type of action.
[Figure: Cortex-M3 memory map in which each 32-bit word of alias memory maps to a single bit (0 or 1) of real memory]
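The mapping from a bit in real memory to its alias word is simple address arithmetic, sketched below in C. The base addresses (0x20000000 and 0x40000000 for the bit-band regions, 0x22000000 and 0x42000000 for their aliases) are the standard Cortex-M3 values and are assumptions here, since the slide does not list them. The function names are hypothetical.

```c
#include <stdint.h>

#define SRAM_BITBAND_BASE    0x20000000u
#define SRAM_BITBAND_ALIAS   0x22000000u
#define PERIPH_BITBAND_BASE  0x40000000u
#define PERIPH_BITBAND_ALIAS 0x42000000u

/* Each byte of the region expands to 32 alias bytes, each bit to one word. */
static uint32_t bitband_alias(uint32_t region_base, uint32_t alias_base,
                              uint32_t byte_addr, unsigned bit)
{
    return alias_base + (byte_addr - region_base) * 32u + bit * 4u;
}

/* Alias word address for a bit (0..7) of a byte in the SRAM bit-band region. */
uint32_t sram_bit_alias(uint32_t byte_addr, unsigned bit)
{
    return bitband_alias(SRAM_BITBAND_BASE, SRAM_BITBAND_ALIAS, byte_addr, bit);
}

/* Alias word address for a bit (0..7) of a byte in the peripheral region. */
uint32_t periph_bit_alias(uint32_t byte_addr, unsigned bit)
{
    return bitband_alias(PERIPH_BITBAND_BASE, PERIPH_BITBAND_ALIAS, byte_addr, bit);
}
```

On a target, writing 1 or 0 to the returned address sets or clears the single bit, which avoids a read-modify-write sequence in software.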
Bit Manipulation – BFI / BFC
Insert or clear any number of adjacent bits anywhere in a register
Ideal for modifying or stripping packet headers
BFI - Bit Field Insert
BFC - Bit Field Clear
1. Packet comes in
Packet stored in R1: 1144 88CC
2. Want to insert data in front of packet
Contents of R0: 3355 AACC
Example: R0[15:0] into R1[31:16]
BFI R1, R0, #16, #16
3. Insertion is executed by hardware immediately
Resulting packet in R1: AACC 88CC
Note: BFC will clear ‘n’ adjacent bits in a register, starting at bit ‘m’.
e.g.: BFC R0, #4, #8 will clear 8 bits starting from bit 4 (bits 4 to 11).
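The effect of the two instructions can be modelled in portable C. This is a software sketch of the behaviour, not how the hardware is implemented, and the function names are hypothetical. It assumes `lsb` is below 32 and that `lsb + width` does not exceed 32, as the instruction encodings require.

```c
#include <stdint.h>

/* Mask with 'width' ones starting at bit 'lsb'. */
uint32_t field_mask(unsigned lsb, unsigned width)
{
    uint32_t ones = (width >= 32u) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return ones << lsb;
}

/* Model of BFI Rd, Rn, #lsb, #width: copy Rn[width-1:0] into Rd at lsb. */
uint32_t bfi(uint32_t rd, uint32_t rn, unsigned lsb, unsigned width)
{
    uint32_t mask = field_mask(lsb, width);
    return (rd & ~mask) | ((rn << lsb) & mask);
}

/* Model of BFC Rd, #lsb, #width: clear 'width' bits of Rd starting at lsb. */
uint32_t bfc(uint32_t rd, unsigned lsb, unsigned width)
{
    return rd & ~field_mask(lsb, width);
}
```

Applied to the packet example, `bfi(0x114488CC, 0x3355AACC, 16, 16)` yields `0xAACC88CC`, matching the resulting packet in R1.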
Memory Protection Unit (MPU)
MPU provides access control for various memory regions
OMAP Processor
OMAP (Open Multimedia Applications Platform) is a family of image/video
processors developed by Texas Instruments. They are proprietary systems
on chip (SoCs) for portable and mobile multimedia applications. OMAP devices
generally include a general-purpose ARM architecture processor core plus one or
more specialized co-processors. Earlier OMAP variants commonly featured a
variant of the Texas Instruments TMS320 series digital signal processor.
OMAP family
The OMAP family consists of three product groups classified by performance and
intended application:
OMAP Architecture
OMAP Processor
Used in smartphones. These devices are powerful enough to run significant
operating systems such as Linux, Android and Symbian, support connectivity to
personal computers, and support various audio and video applications.
Basic multimedia application processors
Marketed only to handset manufacturers. They are highly integrated for use
in very low-cost cell phones.
Advantages
High performance
Low power consumption
PC like web browsing
Faster user interfaces
More flexibility
Full HD 1080p30 multi standard video encode or decode
High security
Less time to market
Slim and light weight designs
Applications
Industrial automation
Medical appliances
Automotive
Mobile phones
Multimedia/gaming applications
Consumer