
Introduction to ARM Architecture and the Working Principle of Its Modules

Updated: Apr 7

ARM originally stood for Acorn RISC Machine and was later renamed Advanced RISC Machine. It is one of the most widely used and licensed processor architectures in the world. Acorn Computers, founded in Cambridge in 1978, began the ARM project in 1983 and produced the first working ARM RISC processor, the ARM1, in 1985. ARM processors are widely used in portable and embedded devices such as digital cameras, mobile phones, home networking systems, and wireless communication devices because they combine low power consumption, efficient performance, and compact design.



ARM Architecture

ARM is based on RISC (Reduced Instruction Set Computing) architecture and is most commonly encountered in microcontrollers as a 32-bit architecture. It first shipped in a commercial product, the Acorn Archimedes, in 1987 and was later licensed by many semiconductor manufacturers such as STMicroelectronics, Motorola, and others. Over time, the ARM architecture evolved through multiple versions such as ARMv1, ARMv2, and later families, each with its own strengths and limitations.



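The ARM Cortex family is one of the most important ARM processor families. It was introduced with the ARMv7 architecture, although some Cortex cores implement other architecture versions (for example, the Cortex-M0 implements ARMv6-M). It is divided into three major subfamilies: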

  • ARM Cortex-A series

  • ARM Cortex-R series

  • ARM Cortex-M series

Main Components of ARM Architecture

The ARM architecture mainly includes the following components:

  • Arithmetic Logic Unit (ALU)

  • Booth Multiplier

  • Barrel Shifter

  • Control Unit

  • Register File

In addition to these, ARM processors also include a Program Status Register, which stores the processor flags Z (zero), S (sign, called N for negative in ARM documentation), V (overflow), and C (carry), along with mode bits and interrupt control bits. Other special registers include the instruction register, the memory data registers for read and write, and the memory address register. A priority encoder is used during multiple load and store operations to identify which register in the register file should be loaded or stored next. Several multiplexers control processor bus operations. Each architectural block can be modeled behaviorally, which makes the design easier to optimize and maintain.
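
The priority encoder's role during multiple load and store operations can be illustrated with a small behavioral sketch. It assumes a 16-bit register list with one bit per register and that the lowest-numbered pending register is selected first, which is how ARM orders LDM/STM transfers. The function names are hypothetical and not taken from any particular implementation.

```python
def next_register(reg_list):
    """Return the lowest-numbered register set in a 16-bit list, or None.

    Assumption: bit n of reg_list stands for register Rn.
    """
    reg_list &= 0xFFFF
    if reg_list == 0:
        return None
    # reg_list & -reg_list isolates the lowest set bit.
    return (reg_list & -reg_list).bit_length() - 1


def transfer_order(reg_list):
    """List the registers in the order the encoder would select them."""
    reg_list &= 0xFFFF
    order = []
    while reg_list:
        reg = next_register(reg_list)
        order.append(reg)
        reg_list &= ~(1 << reg)  # clear the register just transferred
    return order


if __name__ == "__main__":
    # Register list naming R1, R2, R4 and R15.
    print(transfer_order(0b1000000000010110))
```

Each pass clears the bit that was just serviced, so the encoder naturally steps through the list one register per transfer.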


Arithmetic Logic Unit (ALU)

The ALU has two 32-bit inputs. One input comes from the register file, and the other comes from the shifter. The ALU updates the status flags based on its output:

  • V flag is updated from the overflow output

  • C flag is updated from the carry output

  • S flag is taken from the most significant bit (bit 31) of the ALU output

  • Z flag is generated by NORing all bits of the ALU output, so it is set only when the result is zero

The ALU uses a 4-bit function bus, which allows up to 16 different operations to be implemented.
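
As a rough illustration of the flag rules above, the following sketch models a single ALU operation, a 32-bit addition, and derives the four flags from its result. The function name and the dictionary layout are assumptions made for this example; the other operations selected by the 4-bit function bus are not modeled.

```python
MASK32 = 0xFFFFFFFF


def alu_add(a, b, carry_in=0):
    """Add two 32-bit values and return (result, flags)."""
    a &= MASK32
    b &= MASK32
    total = a + b + carry_in
    result = total & MASK32

    flags = {
        # C: carry out of bit 31.
        "C": (total >> 32) & 1,
        # S: most significant bit of the result.
        "S": (result >> 31) & 1,
        # Z: NOR of all result bits, i.e. 1 only when the result is zero.
        "Z": int(result == 0),
        # V: operands share a sign but the result's sign differs.
        "V": ((~(a ^ b) & (a ^ result)) >> 31) & 1,
    }
    return result, flags


if __name__ == "__main__":
    print(alu_add(0x7FFFFFFF, 1))  # signed overflow case
    print(alu_add(0xFFFFFFFF, 1))  # wraps to zero with carry out
```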


Booth Multiplier

The Booth algorithm is a multiplication technique for 2's complement numbers. It handles positive and negative operands uniformly, with no separate sign correction. It also improves efficiency by skipping continuous runs of 0s or 1s in the multiplier: an addition or subtraction is needed only where a run begins or ends. This can significantly speed up multiplication. According to the described implementation, the multiplication operation completes in 16 clock cycles.
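
The run-skipping idea can be seen in a minimal sketch of radix-2 Booth recoding. This is an illustration only: it examines one multiplier bit per step and does not attempt to model the 16-cycle timing mentioned above. The function name and the returned operation count are assumptions made for the example.

```python
def booth_multiply(multiplicand, multiplier, bits=32):
    """Multiply two signed integers with radix-2 Booth recoding.

    The multiplier must fit in `bits` bits of 2's complement.
    Returns (product, number_of_add_or_subtract_steps).
    """
    q = multiplier & ((1 << bits) - 1)  # 2's complement bit pattern
    product = 0
    prev = 0  # implicit bit to the right of bit 0
    ops = 0
    for i in range(bits):
        cur = (q >> i) & 1
        if cur == 1 and prev == 0:
            # A run of 1s starts here: subtract the shifted multiplicand.
            product -= multiplicand << i
            ops += 1
        elif cur == 0 and prev == 1:
            # A run of 1s ended just below: add the shifted multiplicand.
            product += multiplicand << i
            ops += 1
        # Inside a run of 0s or 1s nothing is added or subtracted.
        prev = cur
    return product, ops


if __name__ == "__main__":
    print(booth_multiply(5, 255))  # eight 1 bits, only two operations
    print(booth_multiply(7, -3))
```

A multiplier of 255 contains eight consecutive 1 bits, yet the sketch performs only one subtraction at the start of the run and one addition just past its end.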


Barrel Shifter

The barrel shifter takes a 32-bit input, which may come from the register file or from immediate data. The operation of the shifter is controlled by fields from the instruction register. The shift field determines the type of shift to perform, such as:

  • Logical left shift

  • Logical right shift

  • Arithmetic right shift

  • Rotate right

The shift amount may come either from an immediate field in the instruction or from the lower 6 bits of a register in the register file. The shift_val input bus is 6 bits wide because a shift of 32 cannot be encoded in 5 bits; this allows shifts of up to 32 bits. The shifttype input is encoded as:

  • 00 for logical shift left

  • 01 for logical shift right

  • 10 for arithmetic shift right

  • 11 for rotate right

The barrel shifter is mainly built using multiplexers.
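
The shifttype encoding above can be sketched behaviorally as follows. The function name is hypothetical, the carry output is not modeled, and rejecting shift amounts above 32 is an assumption of this example, since the text only states that shifts of up to 32 bits are supported.

```python
MASK32 = 0xFFFFFFFF


def barrel_shift(value, shift_type, shift_val):
    """Shift a 32-bit value; shift_type uses the 2-bit encoding above."""
    value &= MASK32
    if not 0 <= shift_val <= 32:
        raise ValueError("shift_val must be in the range 0..32")

    if shift_type == 0b00:  # logical shift left
        return (value << shift_val) & MASK32
    if shift_type == 0b01:  # logical shift right
        return value >> shift_val
    if shift_type == 0b10:  # arithmetic shift right (sign bit replicated)
        signed = value - (1 << 32) if value & 0x80000000 else value
        return (signed >> shift_val) & MASK32
    if shift_type == 0b11:  # rotate right
        n = shift_val % 32
        return ((value >> n) | (value << (32 - n))) & MASK32
    raise ValueError("shift_type must be a 2-bit code")


if __name__ == "__main__":
    for code in (0b00, 0b01, 0b10, 0b11):
        print(format(code, "02b"), hex(barrel_shift(0x80000001, code, 1)))
```

In hardware the same selection is made by layers of multiplexers rather than by conditional statements, which is why the shifter completes in a single pass regardless of the shift amount.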


Control Unit

The control unit is the heart of the processor and is responsible for supervising the operation of the entire system. Its design is one of the most important aspects of the processor architecture. It is often implemented as a combinational circuit, but in this case it is described as a simple state machine. The processor timing is also handled by the control unit. Signals generated by the control unit are connected to all processor components to coordinate and control their operations.


ARM7 Functional Diagram

The final aspect to understand is how the ARM7 processor is utilized and how the chip is structured. The processor interfaces with a variety of signals, including input, output, and control (supervisory) signals, which collectively manage and regulate its overall operation.



ARM Microcontroller Register Modes

ARM follows a load-store architecture, meaning the core cannot operate directly on memory. Data must first be loaded into registers, processed there, and then written back to memory. Classic ARM cores such as the ARM7TDMI include 37 registers, of which 31 are general-purpose registers and 6 are status registers. (The Cortex-M3 uses a simpler register set with only Thread and Handler modes.) Classic ARM processors use several processing modes:

  • User Mode

  • FIQ Mode

  • IRQ Mode

  • SVC Mode

  • Undefined Mode

  • Abort Mode

  • System Mode

  • Monitor Mode (only on cores with the Security Extensions)


Description of the Modes

User Mode:

This is the normal operating mode. It has the fewest available registers, no SPSR, and limited access to CPSR.

FIQ and IRQ Modes:

These are interrupt modes. FIQ is used for fast interrupts, while IRQ is used for standard interrupts. In addition to its own SP and LR, FIQ mode banks five more registers (R8 to R12), so a fast interrupt handler can start work without first saving those registers to the stack.

SVC Mode:

Supervisor mode is entered on reset and when a software interrupt instruction is executed. It is typically the mode in which startup code and operating system kernels run.

Undefined Mode:

This mode is entered when the processor tries to execute an instruction it does not recognize (an undefined instruction).
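
The figure of 37 registers quoted earlier follows directly from register banking, which the sketch below tallies. It assumes the classic mode set without Monitor mode; System mode shares the User registers and so adds nothing. The table and function names are hypothetical.

```python
# Registers that each exception mode replaces with its own private copy.
BANKED = {
    "FIQ": ["R8", "R9", "R10", "R11", "R12", "R13", "R14"],
    "IRQ": ["R13", "R14"],
    "SVC": ["R13", "R14"],
    "Abort": ["R13", "R14"],
    "Undefined": ["R13", "R14"],
}


def physical_register(mode, name):
    """Return which physical register a register name refers to in a mode."""
    if name in BANKED.get(mode, []):
        return f"{name}_{mode.lower()}"
    return name  # shared with User mode


def count_registers():
    """Return (general_purpose, status) register counts."""
    general = 16 + sum(len(regs) for regs in BANKED.values())  # R0-R15 plus banked
    status = 1 + len(BANKED)  # CPSR plus one SPSR per exception mode
    return general, status


if __name__ == "__main__":
    print(count_registers())
    print(physical_register("FIQ", "R8"), physical_register("IRQ", "R8"))
```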


THUMB and THUMB-2 Modes

In THUMB mode, instructions are encoded in 16 bits while still operating on 32-bit data. This improves code density and can improve performance on systems with narrow or slow memory. In THUMB-2 mode, instructions can be either 16-bit or 32-bit, providing a good balance between compact code and high performance. The ARM Cortex-M3 uses only THUMB-2 instructions.
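
How a THUMB-2 decoder tells 16-bit from 32-bit instructions can be sketched briefly. In the THUMB-2 encoding, a halfword whose top five bits are 11101, 11110, or 11111 is the first half of a 32-bit instruction; any other halfword is a complete 16-bit instruction. The function names below are hypothetical.

```python
def thumb2_length(first_halfword):
    """Return the instruction length in bytes (2 or 4)."""
    top5 = (first_halfword >> 11) & 0x1F
    return 4 if top5 in (0b11101, 0b11110, 0b11111) else 2


def split_instructions(halfwords):
    """Group a sequence of halfwords into individual instructions."""
    instructions = []
    i = 0
    while i < len(halfwords):
        count = thumb2_length(halfwords[i]) // 2  # halfwords consumed
        instructions.append(tuple(halfwords[i:i + count]))
        i += count
    return instructions


if __name__ == "__main__":
    # A 16-bit instruction, a 32-bit instruction, then another 16-bit one.
    stream = [0x4770, 0xF000, 0xF800, 0xBF00]
    print(split_instructions(stream))
```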

Some registers are reserved for specific purposes in each mode. These include:

  • Stack Pointer (SP)

  • Link Register (LR)

  • Program Counter (PC)

  • Current Program Status Register (CPSR)

  • Saved Program Status Register (SPSR)

The CPSR and SPSR store control and status bits such as the operating mode, interrupt disable flags, and ALU status flags. A classic ARM core executes in either 32-bit ARM state or THUMB state, selected by the T bit in the CPSR.
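
As a rough sketch of how these fields are packed, the following decodes a CPSR value using the classic ARM layout: flags in bits 31 to 28, the I, F, and T bits in bits 7 to 5, and the mode in bits 4 to 0. The function name and the output keys are assumptions made for this example. The sign flag appears under its ARM name N.

```python
MODES = {
    0b10000: "User",
    0b10001: "FIQ",
    0b10010: "IRQ",
    0b10011: "SVC",
    0b10110: "Monitor",
    0b10111: "Abort",
    0b11011: "Undefined",
    0b11111: "System",
}


def decode_cpsr(cpsr):
    """Split a 32-bit CPSR value into its main fields."""
    return {
        "N": (cpsr >> 31) & 1,  # sign (negative) flag
        "Z": (cpsr >> 30) & 1,  # zero flag
        "C": (cpsr >> 29) & 1,  # carry flag
        "V": (cpsr >> 28) & 1,  # overflow flag
        "irq_disabled": bool(cpsr & 0x80),  # I bit
        "fiq_disabled": bool(cpsr & 0x40),  # F bit
        "thumb": bool(cpsr & 0x20),         # T bit
        "mode": MODES.get(cpsr & 0x1F, "reserved"),
    }


if __name__ == "__main__":
    print(decode_cpsr(0x600000D3))
```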


ARM Cortex Microcontroller Programming

Today, many microcontroller manufacturers offer 32-bit microcontrollers based on the ARM Cortex-M3 core, and embedded system developers increasingly prefer these controllers for modern applications. ARM microcontrollers support both low-level and high-level programming languages. Older microcontroller architectures often had limited memory and lower performance, which made high-level programming more difficult. ARM microcontrollers, however, can run at 100 MHz or higher, making them well suited to high-level languages and more advanced software development.



ARM microcontrollers are commonly programmed using IDEs such as:

  • Keil uVision3

  • Keil uVision4

  • Coocox

While 8-bit microcontrollers process data 8 bits at a time, ARM Cortex-M devices have 32-bit registers and a 32-bit data path, and they execute a mix of 16-bit and 32-bit THUMB-2 instructions.


Additional Uses and Features of Cortex Processors

Cortex-M processors such as the Cortex-M3 offer many important features:

  • Reduced Instruction Set Computing (RISC) design

  • 32-bit high-performance CPU

  • Compact 3-stage pipeline

  • THUMB-2 technology

  • Efficient combination of 16-bit and 32-bit instructions

  • High performance with low power usage

  • Support for development tools and RTOS

  • CoreSight debug and trace support

  • JTAG or 2-pin Serial Wire Debug connections

  • Support for multi-processor systems

  • Low-power sleep modes

  • Software-controlled power management

  • Multiple power domains

  • Nested Vectored Interrupt Controller (NVIC)

  • Low-latency and low-jitter interrupt response

  • No need for assembly language programming in many cases
