Chapter 4. HAL Interfaces

These are definitions that characterize the properties of the base architecture that are used to compile the portable parts of the kernel. They are concerned with such things a portable type definitions, endianness, and labeling.

These definitions are supplied by the cyg/hal/basetype.h header file which is supplied by the architecture HAL. It is included automatically by cyg/infra/cyg_type.h.

4.1.1. Byte order

CYG_BYTEORDER: This defines the byte order of the target and must be set to either CYG_LSBFIRST or CYG_MSBFIRST.

4.1.2. Label Translation

CYG_LABEL_NAME(name): This is a wrapper used in some C and C++ files which use labels defined in assembly code or the linker script. It need only be defined if the default implementation in cyg/infra/cyg_type.h, which passes the name argument unaltered, is inadequate. It should be paired with CYG_LABEL_DEFN().
CYG_LABEL_DEFN(name): This is a wrapper used in assembler sources and linker scripts which define labels. It need only be defined if the default implementation in cyg/infra/cyg_type.h, which passes the name argument unaltered, is inadequate. The most usual alternative definition of this macro prepends an underscore to the label name.

4.1.3. Base types

    cyg_halint8
    cyg_halint16
    cyg_halint32
    cyg_halint64
    cyg_halcount8
    cyg_halcount16
    cyg_halcount32
    cyg_halcount64
    cyg_halbool

These macros define the C base types that should be used to define variables of the given size. They only need to be defined if the default types specified in cyg/infra/cyg_type.h cannot be used. Note that these are only the base types, they will be composed with signed and unsigned to form full type specifications.

4.1.4. Atomic types

    cyg_halatomic CYG_ATOMIC

These types are guaranteed to be read or written in a single uninterruptible operation. It is architecture defined what size this type is, but it will be at least a byte.

4.2. Architecture Characterization

4.2.1. Register Save Format
4.2.2. Thread Context Initialization
4.2.3. Thread Context Switching
4.2.4. Bit indexing
4.2.5. Idle thread activity
4.2.6. Reorder barrier
4.2.7. Breakpoint support
4.2.8. GDB support
4.2.9. Setjmp and longjmp support
4.2.10. Stack Sizes
4.2.11. Address Translation
4.2.12. Global Pointer

These are definition that are related to the basic architecture of the CPU. These include the CPU context save format, context switching, bit twiddling, breakpoints, stack sizes and address translation.

Most of these definition are found in cyg/hal/hal_arch.h. This file is supplied by the architecture HAL. If there are variant or platform specific definitions then these will be found in cyg/hal/var_arch.h or cyg/hal/plf_arch.h. These files are include automatically by this header, so need not be included explicitly.

4.2.1. Register Save Format

typedef struct HAL_SavedRegisters
{
    /* architecture-dependent list of registers to be saved */
} HAL_SavedRegisters;

This structure describes the layout of a saved machine state on the stack. Such states are saved during thread context switches, interrupts and exceptions. Different quantities of state may be saved during each of these, but usually a thread context state is a subset of the interrupt state which is itself a subset of an exception state. For debugging purposes, the same structure is used for all three purposes, but where these states are significantly different, this structure may contain a union of the three states.

4.2.2. Thread Context Initialization

HAL_THREAD_INIT_CONTEXT( sp, arg, entry, id )

This macro initializes a thread's context so that it may be switched to by HAL_THREAD_SWITCH_CONTEXT(). The arguments are:

sp: A location containing the current value of the thread's stack pointer. This should be a variable or a structure field. The SP value will be read out of here and an adjusted value written back.
arg: A value that is passed as the first argument to the entry point function.
entry: The address of an entry point function. This will be called according the C calling conventions, and the value of arg will be passed as the first argument. This function should have the following type signature void entry(CYG_ADDRWORD arg).
id: A thread id value. This is only used for debugging purposes, it is ORed into the initialization pattern for unused registers and may be used to help identify the thread from its register dump. The least significant 16 bits of this value should be zero to allow space for a register identifier.

4.2.3. Thread Context Switching

HAL_THREAD_LOAD_CONTEXT( to )
HAL_THREAD_SWITCH_CONTEXT( from, to )

These macros implement the thread switch code. The arguments are:

from: A pointer to a location where the stack pointer of the current thread will be stored.
to: A pointer to a location from where the stack pointer of the next thread will be read.

For HAL_THREAD_LOAD_CONTEXT() the current CPU state is discarded and the state of the destination thread is loaded. This is only used once, to load the first thread when the scheduler is started.

For HAL_THREAD_SWITCH_CONTEXT() the state of the current thread is saved onto its stack, using the current value of the stack pointer, and the address of the saved state placed in *from. The value in *to is then read and the state of the new thread is loaded from it.

While these two operations may be implemented with inline assembler, they are normally implemented as calls to assembly code functions in the HAL. There are two advantages to doing it this way. First, the return link of the call provides a convenient PC value to be used in the saved context. Second, the calling conventions mean that the compiler will have already saved the caller-saved registers before the call, so the HAL need only save the callee-saved registers.

The implementation of HAL_THREAD_SWITCH_CONTEXT() saves the current CPU state on the stack, including the current interrupt state (or at least the register that contains it). For debugging purposes it is useful to save the entire register set, but for performance only the ABI-defined callee-saved registers need be saved. If it is implemented, the option CYGDBG_HAL_COMMON_CONTEXT_SAVE_MINIMUM controls how many registers are saved.

The implementation of HAL_THREAD_LOAD_CONTEXT() loads a thread context, destroying the current context. With a little care this can be implemented by sharing code with HAL_THREAD_SWITCH_CONTEXT(). To load a thread context simply requires the saved registers to be restored from the stack and a jump or return made back to the saved PC.

Note that interrupts are not disabled during this process, any interrupts that occur will be delivered onto the stack to which the current CPU stack pointer points. Hence the stack pointer should never be invalid, or loaded with a value that might cause the saved state to become corrupted by an interrupt. However, the current interrupt state is saved and restored as part of the thread context. If a thread disables interrupts and does something to cause a context switch, interrupts may be re-enabled on switching to another thread. Interrupts will be disabled again when the original thread regains control.

4.2.4. Bit indexing

HAL_LSBIT_INDEX( index, mask )
HAL_MSBIT_INDEX( index, mask )

These macros place in index the bit index of the least significant bit in mask. Some architectures have instruction level support for one or other of these operations. If no architectural support is available, then these macros may call C functions to do the job.

4.2.5. Idle thread activity

HAL_IDLE_THREAD_ACTION( count )

It may be necessary under some circumstances for the HAL to execute code in the kernel idle thread's loop. An example might be to execute a processor halt instruction. This macro provides a portable way of doing this. The argument is a copy of the idle thread's loop counter, and may be used to trigger actions at longer intervals than every loop.

4.2.6. Reorder barrier

HAL_REORDER_BARRIER()

When optimizing the compiler can reorder code. In some parts of multi-threaded systems, where the order of actions is vital, this can sometimes cause problems. This macro may be inserted into places where reordering should not happen and prevents code being migrated across it by the compiler optimizer. It should be placed between statements that must be executed in the order written in the code.

4.2.7. Breakpoint support

HAL_BREAKPOINT( label )
HAL_BREAKINST
HAL_BREAKINST_SIZE

These macros provide support for breakpoints.

HAL_BREAKPOINT() executes a breakpoint instruction. The label is defined at the breakpoint instruction so that exception code can detect which breakpoint was executed.

HAL_BREAKINST contains the breakpoint instruction code as an integer value. HAL_BREAKINST_SIZE is the size of that breakpoint instruction in bytes. Together these may be used to place a breakpoint in any code.

4.2.8. GDB support

HAL_THREAD_GET_SAVED_REGISTERS( sp, regs )
HAL_GET_GDB_REGISTERS( regval, regs )
HAL_SET_GDB_REGISTERS( regs, regval )

These macros provide support for interfacing GDB to the HAL.

HAL_THREAD_GET_SAVED_REGISTERS() extracts a pointer to a HAL_SavedRegisters structure from a stack pointer value. The stack pointer passed in should be the value saved by the thread context macros. The macro will assign a pointer to the HAL_SavedRegisters structure to the variable passed as the second argument.

HAL_GET_GDB_REGISTERS() translates a register state as saved by the HAL and into a register dump in the format expected by GDB. It takes a pointer to a HAL_SavedRegisters structure in the regs argument and a pointer to the memory to contain the GDB register dump in the regval argument.

HAL_SET_GDB_REGISTERS() translates a GDB format register dump into a the format expected by the HAL. It takes a pointer to the memory containing the GDB register dump in the regval argument and a pointer to a HAL_SavedRegisters structure in the regs argument.

4.2.9. Setjmp and longjmp support

CYGARC_JMP_BUF_SIZE
hal_jmp_buf[CYGARC_JMP_BUF_SIZE]
hal_setjmp( hal_jmp_buf env )
hal_longjmp( hal_jmp_buf env, int val )

These functions provide support for the C setjmp() and longjmp() functions. Refer to the C library for further information.

4.2.10. Stack Sizes

CYGNUM_HAL_STACK_SIZE_MINIMUM
CYGNUM_HAL_STACK_SIZE_TYPICAL

The values of these macros define the minimum and typical sizes of thread stacks.

CYGNUM_HAL_STACK_SIZE_MINIMUM defines the minimum size of a thread stack. This is enough for the thread to function correctly within eCos and allows it to take interrupts and context switches. There should also be enough space for a simple thread entry function to execute and call basic kernel operations on objects like mutexes and semaphores. However there will not be enough room for much more than this. When creating stacks for their own threads, applications should determine the stack usage needed for application purposes and then add CYGNUM_HAL_STACK_SIZE_MINIMUM.

CYGNUM_HAL_STACK_SIZE_TYPICAL is a reasonable increment over CYGNUM_HAL_STACK_SIZE_MINIMUM, usually about 1kB. This should be adequate for most modest thread needs. Only threads that need to define significant amounts of local data, or have very deep call trees should need to use a larger stack size.

4.2.11. Address Translation

CYGARC_CACHED_ADDRESS(addr)
CYGARC_UNCACHED_ADDRESS(addr)
CYGARC_PHYSICAL_ADDRESS(addr)

These macros provide address translation between different views of memory. In many architectures a given memory location may be visible at different addresses in both cached and uncached forms. It is also possible that the MMU or some other address translation unit in the CPU presents memory to the program at a different virtual address to its physical address on the bus.

CYGARC_CACHED_ADDRESS() translates the given address to its location in cached memory. This is typically where the application will access the memory.

CYGARC_UNCACHED_ADDRESS() translates the given address to its location in uncached memory. This is typically where device drivers will access the memory to avoid cache problems. It may additionally be necessary for the cache to be flushed before the contents of this location is fully valid.

CYGARC_PHYSICAL_ADDRESS() translates the given address to its location in the physical address space. This is typically the address that needs to be passed to device hardware such as a DMA engine, Ethernet device or PCI bus bridge. The physical address may not be directly accessible to the program, it may be re-mapped by address translation.

4.2.12. Global Pointer

CYGARC_HAL_SAVE_GP()
CYGARC_HAL_RESTORE_GP()

These macros insert code to save and restore any global data pointer that the ABI uses. These are necessary when switching context between two eCos instances - for example between an eCos application and RedBoot.

4.3. Interrupt Handling

4.3.1. Vector numbers
4.3.2. Interrupt state control
4.3.3. ISR and VSR management
4.3.4. Interrupt controller management

These interfaces contain definitions related to interrupt handling. They include definitions of exception and interrupt numbers, interrupt enabling and masking.

These definitions are normally found in cyg/hal/hal_intr.h. This file is supplied by the architecture HAL. Any variant or platform specific definitions will be found in cyg/hal/var_intr.h, cyg/hal/plf_intr.h or cyg/hal/hal_platform_ints.h in the variant or platform HAL, depending on the exact target. These files are include automatically by this header, so need not be included explicitly.

4.3.1. Vector numbers

CYGNUM_HAL_VECTOR_XXXX
CYGNUM_HAL_VSR_MIN
CYGNUM_HAL_VSR_MAX
CYGNUM_HAL_VSR_COUNT

CYGNUM_HAL_INTERRUPT_XXXX
CYGNUM_HAL_ISR_MIN
CYGNUM_HAL_ISR_MAX
CYGNUM_HAL_ISR_COUNT

CYGNUM_HAL_EXCEPTION_XXXX
CYGNUM_HAL_EXCEPTION_MIN
CYGNUM_HAL_EXCEPTION_MAX
CYGNUM_HAL_EXCEPTION_COUNT

All possible VSR, interrupt and exception vectors are specified here, together with maximum and minimum values for range checking. While the VSR and exception numbers will be defined in this file, the interrupt numbers will normally be defined in the variant or platform HAL file that is included by this header.

There are two ranges of numbers, those for the vector service routines and those for the interrupt service routines. The relationship between these two ranges is undefined, and no equivalence should be assumed if vectors from the two ranges coincide.

The VSR vectors correspond to the set of exception vectors that can be delivered by the CPU architecture, many of these will be internal exception traps. The ISR vectors correspond to the set of external interrupts that can be delivered and are usually determined by extra decoding of the interrupt controller by the interrupt VSR.

Where a CPU supports synchronous exceptions, the range of such exceptions allowed are defined by CYGNUM_HAL_EXCEPTION_MIN and CYGNUM_HAL_EXCEPTION_MAX. The CYGNUM_HAL_EXCEPTION_XXXX definitions are standard names used by target independent code to test for the presence of particular exceptions in the architecture. The actual exception numbers will normally correspond to the VSR exception range. In future other exceptions generated by the system software (such as stack overflow) may be added.

CYGNUM_HAL_ISR_COUNT, CYGNUM_HAL_VSR_COUNT and CYGNUM_HAL_EXCEPTION_COUNT define the number of ISRs, VSRs and EXCEPTIONs respectively for the purposes of defining arrays etc. There might be a translation from the supplied vector numbers into array offsets. Hence CYGNUM_HAL_XXX_COUNT may not simply be CYGNUM_HAL_XXX_MAX - CYGNUM_HAL_XXX_MIN or CYGNUM_HAL_XXX_MAX+1.

4.3.2. Interrupt state control

CYG_INTERRUPT_STATE
HAL_DISABLE_INTERRUPTS( old )
HAL_RESTORE_INTERRUPTS( old )
HAL_ENABLE_INTERRUPTS()
HAL_QUERY_INTERRUPTS( state )

These macros provide control over the state of the CPUs interrupt mask mechanism. They should normally manipulate a CPU status register to enable and disable interrupt delivery. They should not access an interrupt controller.

CYG_INTERRUPT_STATE is a data type that should be used to store the interrupt state returned by HAL_DISABLE_INTERRUPTS() and HAL_QUERY_INTERRUPTS() and passed to HAL_RESTORE_INTERRUPTS().

HAL_DISABLE_INTERRUPTS() disables the delivery of interrupts and stores the original state of the interrupt mask in the variable passed in the old argument.

HAL_RESTORE_INTERRUPTS() restores the state of the interrupt mask to that recorded in old.

HAL_ENABLE_INTERRUPTS() simply enables interrupts regardless of the current state of the mask.

HAL_QUERY_INTERRUPTS() stores the state of the interrupt mask in the variable passed in the state argument. The state stored here should also be capable of being passed to HAL_RESTORE_INTERRUPTS() at a later point.

It is at the HAL implementer‚s discretion exactly which interrupts are masked by this mechanism. Where a CPU has more than one interrupt type that may be masked separately (e.g. the ARM's IRQ and FIQ) only those that can raise DSRs need to be masked here. A separate architecture specific mechanism may then be used to control the other interrupt types.

4.3.3. ISR and VSR management

HAL_INTERRUPT_IN_USE( vector, state )
HAL_INTERRUPT_ATTACH( vector, isr, data, object )
HAL_INTERRUPT_DETACH( vector, isr )
HAL_VSR_SET( vector, vsr, poldvsr )
HAL_VSR_GET( vector, pvsr )
HAL_VSR_SET_TO_ECOS_HANDLER( vector, poldvsr )

These macros manage the attachment of interrupt and vector service routines to interrupt and exception vectors respectively.

HAL_INTERRUPT_IN_USE() tests the state of the supplied interrupt vector and sets the value of the state parameter to either 1 or 0 depending on whether there is already an ISR attached to the vector. The HAL will only allow one ISR to be attached to each vector, so it is a good idea to use this function before using HAL_INTERRUPT_ATTACH().

HAL_INTERRUPT_ATTACH() attaches the ISR, data pointer and object pointer to the given vector. When an interrupt occurs on this vector the ISR is called using the C calling convention and the vector number and data pointer are passed to it as the first and second arguments respectively.

HAL_INTERRUPT_DETACH() detaches the ISR from the vector.

HAL_VSR_SET() replaces the VSR attached to the vector with the replacement supplied in vsr. The old VSR is returned in the location pointed to by pvsr. On some platforms, possibly only in certain configurations, the table of VSRs will be in read-only memory. If so then this macro should be left undefined.

HAL_VSR_GET() assigns a copy of the VSR to the location pointed to by pvsr.

HAL_VSR_SET_TO_ECOS_HANDLER() ensures that the VSR for a specific exception is pointing at the eCos exception VSR and not one for RedBoot or some other ROM monitor. The default when running under RedBoot is for exceptions to be handled by RedBoot and passed to GDB. This macro diverts the exception to eCos so that it may be handled by application code. The arguments are the VSR vector to be replaces, and a location in which to store the old VSR pointer, so that it may be replaced at a later point. On some platforms, possibly only in certain configurations, the table of VSRs will be in read-only memory. If so then this macro should be left undefined.

4.3.4. Interrupt controller management

HAL_INTERRUPT_MASK( vector )
HAL_INTERRUPT_UNMASK( vector )
HAL_INTERRUPT_ACKNOWLEDGE( vector )
HAL_INTERRUPT_CONFIGURE( vector, level, up )
HAL_INTERRUPT_SET_LEVEL( vector, level )

These macros exert control over any prioritized interrupt controller that is present. If no priority controller exists, then these macros should be empty.

Note

These macros may not be reentrant, so care should be taken to prevent them being called while interrupts are enabled. This means that they can be safely used in initialization code before interrupts are enabled, and in ISRs. In DSRs, ASRs and thread code, however, interrupts must be disabled before these macros are called. Here is an example for use in a DSR where the interrupt source is unmasked after data processing:

…
HAL_DISABLE_INTERRUPTS(old);
HAL_INTERRUPT_UNMASK(CYGNUM_HAL_INTERRUPT_ETH);
HAL_RESTORE_INTERRUPTS(old);
…

HAL_INTERRUPT_MASK() causes the interrupt associated with the given vector to be blocked.

HAL_INTERRUPT_UNMASK() causes the interrupt associated with the given vector to be unblocked.

HAL_INTERRUPT_ACKNOWLEDGE() acknowledges the current interrupt from the given vector. This is usually executed from the ISR for this vector when it is prepared to allow further interrupts. Most interrupt controllers need some form of acknowledge action before the next interrupt is allowed through. Executing this macro may cause another interrupt to be delivered. Whether this interrupts the current code depends on the state of the CPU interrupt mask.

HAL_INTERRUPT_CONFIGURE() provides control over how an interrupt signal is detected. The arguments are:

vector: The interrupt vector to be configured.
level: Set to true if the interrupt is detected by level, and false if it is edge triggered.
up: If the interrupt is set to level detect, then if this is true it is detected by a high signal level, and if false by a low signal level. If the interrupt is set to edge triggered, then if this is true it is triggered by a rising edge and if false by a falling edge.

HAL_INTERRUPT_SET_LEVEL() provides control over the hardware priority of the interrupt. The arguments are:

vector: The interrupt whose level is to be set.
level: The priority level to which the interrupt is to set. In some architectures the masking of an interrupt is achieved by changing its priority level. Hence this function, HAL_INTERRUPT_MASK() and HAL_INTERRUPT_UNMASK() may interfere with each other.

4.4. Clocks and Timers

4.4.1. Clock Control
4.4.2. Microsecond Delay
4.4.3. Clock Frequency Definition

These interfaces contain definitions related to clock and timer handling. They include interfaces to initialize and read a clock for generating regular interrupts, definitions for setting the frequency of the clock, and support for short timed delays.

4.4.1. Clock Control

HAL_CLOCK_INITIALIZE( period )
HAL_CLOCK_RESET( vector, period )
HAL_CLOCK_READ( pvalue )

These macros provide control over a clock or timer device that may be used by the kernel to provide time-out, delay and scheduling services. The clock is assumed to be implemented by some form of counter that is incremented or decremented by some external source and which raises an interrupt when it reaches a predetermined value.

HAL_CLOCK_INITIALIZE() initializes the timer device to interrupt at the given period. The period is essentially the value used to initialize the timer counter and must be calculated from the timer frequency and the desired interrupt rate. The timer device should generate an interrupt every period cycles.

HAL_CLOCK_RESET() re-initializes the timer to provoke the next interrupt. This macro is only really necessary when the timer device needs to be reset in some way after each interrupt.

HAL_CLOCK_READ() reads the current value of the timer counter and puts the value in the location pointed to by pvalue. The value stored will always be the number of timer cycles since the last interrupt, and hence ranges between zero and the initial period value. If this is a count-down cyclic timer, some arithmetic may be necessary to generate this value.

4.4.2. Microsecond Delay

HAL_DELAY_US(us)

This macro provides a busy loop delay for the given number of microseconds. It is intended mainly for controlling hardware that needs short delays between operations. Code which needs longer delays, of the order of milliseconds, should instead use higher-level functions such as cyg_thread_delay. The macro implementation should be thread-safe. It can also be used in ISRs or DSRs, although such usage is undesirable because of the impact on interrupt and dispatch latency.

The macro should never delay for less than the specified amount of time. It may delay for somewhat longer, although since the macro uses a busy loop this is a waste of CPU cycles. Of course the code invoking HAL_DELAY_US may get interrupted or timesliced, in which case the delay may be much longer than intended. If this is unacceptable then the calling code must take preventative action such as disabling interrupts or locking the scheduler.

There are three main ways of implementing the macro:

a counting loop, typically written in inline assembler, using an outer loop for the microseconds and an inner loop that consumes approximately 1us. This implementation is automatically thread-safe and does not impose any dependencies on the rest of the system, for example it does not depend on the system clock having been started. However it assumes that the CPU clock speed is known at compile-time or can be easily determined at run-time.
monitor one of the hardware clocks, usually the system clock. Usually this clock ticks at a rate independent of the CPU so calibration is easier. However the implementation relies on the system clock having been started, and assumes that no other code is manipulating the clock hardware. There can also be complications when the system clock wraps around.
a combination of the previous two. The system clock is used during system initialization to determine the CPU clock speed, and the result is then used to calibrate a counting loop. This has the disadvantage of significantly increasing the system startup time, which may be unacceptable to some applications. There are also complications if the system startup code normally runs with the cache disabled because the instruction cache will greatly affect any calibration loop.

4.4.3. Clock Frequency Definition

CYGNUM_HAL_RTC_NUMERATOR
CYGNUM_HAL_RTC_DENOMINATOR
CYGNUM_HAL_RTC_PERIOD

These macros are defined in the CDL for each platform and supply the necessary parameters to specify the frequency at which the clock interrupts. These parameters are usually found in the CDL definitions for the target platform, or in some cases the CPU variant.

CYGNUM_HAL_RTC_NUMERATOR and CYGNUM_HAL_RTC_DENOMINATOR specify the resolution of the clock interrupt. This resolution involves two separate values, the numerator and the denominator. The result of dividing the numerator by the denominator should correspond to the number of nanoseconds between clock interrupts. For example a numerator of 1000000000 and a denominator of 100 means that there are 10000000 nanoseconds (or 10 milliseconds) between clock interrupts. Expressing the resolution as a fraction minimizes clock drift even for frequencies that cannot be expressed as a simple integer. For example a frequency of 60Hz corresponds to a clock resolution of 16666666.66… nanoseconds. This can be expressed accurately as 1000000000 over 60.

CYGNUM_HAL_RTC_PERIOD specifies the exact value used to initialize the clock hardware, it is the value passed as a parameter to HAL_CLOCK_INITIALIZE() and HAL_CLOCK_RESET(). The exact meaning of the value and the range of legal values therefore depends on the target hardware, and the hardware documentation should be consulted for further details.

The default values for these macros in all HALs are calculated to give a clock interrupt frequency of 100Hz, or 10ms between interrupts. To change the clock frequency, the period needs to be changed, and the resolution needs to be adjusted accordingly. As an example consider the i386 PC target. The default values for these macros are:

CYGNUM_HAL_RTC_NUMERATOR     1000000000
CYGNUM_HAL_RTC_DENOMINATOR   100
CYGNUM_HAL_RTC_PERIOD        11932

To change to, say, a 200Hz clock the period needs to be halved to 5966, and to compensate the denominator needs to be doubled to 200. To change to a 1KHz interrupt rate change the period to 1193 and the denominator to 1000.

Some HALs make this process a little easier by deriving the period arithmetically from the denominator. This calculation may also involve the CPU clock frequency and possibly other factors. For example in the ARM AT91 variant HAL the period is defined by the following expression:

((CYGNUM_HAL_ARM_AT91_CLOCK_SPEED/32) / CYGNUM_HAL_RTC_DENOMINATOR)

In this case it is not necessary to change the period at all, just change the denominator to select the desired clock frequency. However, note that for certain choices of frequency, rounding errors in this calculation may result in a small clock drift over time. This is usually negligible, but if perfect accuracy is required, it may be necessary to adjust the frequency or period by hand.

4.5. HAL I/O

4.5.1. Register address
4.5.2. Register read
4.5.3. Register write

This section contains definitions for supporting access to device control registers in an architecture neutral fashion.

These definitions are normally found in the header file cyg/hal/hal_io.h. This file itself contains macros that are generic to the architecture. If there are variant or platform specific IO access macros then these will be found in cyg/hal/var_io.h and cyg/hal/plf_io.h in the variant or platform HALs respectively. These files are included automatically by this header, so need not be included explicitly.

This header (or more likely cyg/hal/plf_io.h) also defines the PCI access macros. For more information on these see the eCos PCI library reference documentation.

4.5.1. Register address

HAL_IO_REGISTER

This type is used to store the address of an I/O register. It will normally be a memory address, an integer port address or an offset into an I/O space. More complex architectures may need to code an address space plus offset pair into a single word, or may represent it as a structure.

Values of variables and constants of this type will usually be supplied by configuration mechanisms or in target specific headers.

4.5.2. Register read

HAL_READ_XXX( register, value )
HAL_READ_XXX_VECTOR( register, buffer, count, stride )

These macros support the reading of I/O registers in various sizes. The XXX component of the name may be UINT8, UINT16, UINT32.

HAL_READ_XXX() reads the appropriately sized value from the register and stores it in the variable passed as the second argument.

HAL_READ_XXX_VECTOR() reads count values of the appropriate size into buffer. The stride controls how the pointer advances through the register space. A stride of zero will read the same register repeatedly, and a stride of one will read adjacent registers of the given size. Greater strides will step by larger amounts, to allow for sparsely mapped registers for example.

4.5.3. Register write

HAL_WRITE_XXX( register, value )
HAL_WRITE_XXX_VECTOR( register, buffer,count, stride )

These macros support the writing of I/O registers in various sizes. The XXX component of the name may be UINT8, UINT16, UINT32.

HAL_WRITE_XXX() writes the appropriately sized value from the variable passed as the second argument stored it in the register.

HAL_WRITE_XXX_VECTOR() writes count values of the appropriate size from buffer. The stride controls how the pointer advances through the register space. A stride of zero will write the same register repeatedly, and a stride of one will write adjacent registers of the given size. Greater strides will step by larger amounts, to allow for sparsely mapped registers for example.

4.6. HAL Unique-ID

4.6.1. HAL_UNIQUE_ID_LEN
4.6.2. HAL_UNIQUE_ID

This section contains definitions for supporting the optional Unique-ID access in an architecture neutral fashion. Not all variants, or platforms, will provide a mechanism for accessing device-specific Unique-ID data, in which case the macros as documented in this section will not be defined.

The required definitions are normally referenced via the header file cyg/hal/hal_io.h. This file itself contains macros that are generic to the configured architecture. If there are variant or platform specific Unique-ID access macros then these will be found in cyg/hal/var_io.h and cyg/hal/plf_io.h in the variant or platform HALs respectively. These files are included automatically by the architecture header, so need not be included explicitly.

4.6.1. HAL_UNIQUE_ID_LEN

HAL_UNIQUE_ID_LEN( CYG_WORD32 maxlen )

This macro, when defined, provides a mechanism for ascertaining the maximum number of bytes of Unique-ID data available.

For most implementations this macro will return a build-time constant value in the passed maxlen parameter, but some systems may have a run-time calculated limit.

4.6.2. HAL_UNIQUE_ID

HAL_UNIQUE_ID( CYG_BYTE *buffer, CYG_WORD32 bufflen )

T his macro, when defined, provides a mechanism for filling the passed buffer parameter with upto bufflen bytes of Unique-ID data. If the implementation provides fewer that bufflen bytes of unique information then only the available data will be copied to the destination buffer.

The use of returned Unique-ID data is application specific, but examples may include use for USB device serial# identification, Ethernet MAC addresses, cryptography seeds, etc.

When a valid buffer parameter is passed then the bufflen parameter indicates how many bytes are available. This bufflen size may be less than the total amount of Unique-ID information available, with the variant/platform implementation only copying the requested amount.

	Note
	This “always-copy” model is used since it allows the same API to be used for systems where the ID information is not held in CPU addressable memory, or where multiple sources are concatenated by an implementation to provide a larger Unique-ID value.

4.7. Cache Control

4.7.1. Cache Dimensions
4.7.2. Global Cache Control
4.7.3. Cache Line Control

This section contains definitions for supporting control of the caches on the CPU.

These definitions are usually found in the header file cyg/hal/hal_cache.h. This file may be defined in the architecture, variant or platform HAL, depending on where the caches are implemented for the target. Often there will be a generic implementation of the cache control macros in the architecture HAL with the ability to override or undefine them in the variant or platform HAL. Even when the implementation of the cache macros is in the architecture HAL, the cache dimensions will be defined in the variant or platform HAL. As with other files, the variant or platform specific definitions are usually found in cyg/hal/var_cache.h and cyg/hal/plf_cache.h respectively. These files are include automatically by this header, so need not be included explicitly.

There are versions of the macros defined here for both the Data and Instruction caches. these are distinguished by the use of either DCACHE or ICACHE in the macro names. Some architectures have a unified cache, where both data and instruction share the same cache. In these cases the control macros use UCACHE and the DCACHE and ICACHE macros will just be calls to the UCACHE version. In the following descriptions, XCACHE is used to stand for any of these. Where there are issues specific to a particular cache, this will be explained in the text.

There might be target specific restrictions on the use of some of the macros which it is the user's responsibility to comply with. Such restrictions are documented in the header file with the macro definition.

Note that destructive cache macros should be used with caution. Preceding a cache invalidation with a cache synchronization is not safe in itself since an interrupt may happen after the synchronization but before the invalidation. This might cause the state of dirty data lines created during the interrupt to be lost.

Depending on the architecture's capabilities, it may be possible to temporarily disable the cache while doing the synchronization and invalidation which solves the problem (no new data would be cached during an interrupt). Otherwise it is necessary to disable interrupts while manipulating the cache which may take a long time.

Some platform HALs now support a pair of cache state query macros: HAL_ICACHE_IS_ENABLED( x ) and HAL_DCACHE_IS_ENABLED( x ) which set the argument to true if the instruction or data cache is enabled, respectively. Like most cache control macros, these are optional, because the capabilities of different targets and boards can vary considerably. Code which uses them, if it is to be considered portable, should test for their existence first by means of #ifdef. Be sure to include <cyg/hal/hal_cache.h> in order to do this test and (maybe) use the macros.

4.7.1. Cache Dimensions

HAL_XCACHE_SIZE
HAL_XCACHE_LINE_SIZE
HAL_XCACHE_WAYS
HAL_XCACHE_SETS

These macros define the size and dimensions of the Instruction and Data caches.

HAL_XCACHE_SIZE: Defines the total size of the cache in bytes.
HAL_XCACHE_LINE_SIZE: Defines the cache line size in bytes.
HAL_XCACHE_WAYS: Defines the number of ways in each set and defines its level of associativity. This would be 1 for a direct mapped cache, 2 for a 2-way cache, 4 for 4-way and so on.
HAL_XCACHE_SETS: Defines the number of sets in the cache, and is calculated from the previous values.

4.7.2. Global Cache Control

HAL_XCACHE_ENABLE()
HAL_XCACHE_DISABLE()
HAL_XCACHE_INVALIDATE_ALL()
HAL_XCACHE_SYNC()
HAL_XCACHE_BURST_SIZE( size )
HAL_DCACHE_WRITE_MODE( mode )
HAL_XCACHE_LOCK( base, size )
HAL_XCACHE_UNLOCK( base, size )
HAL_XCACHE_UNLOCK_ALL()

These macros affect the state of the entire cache, or a large part of it.

HAL_XCACHE_ENABLE() and HAL_XCACHE_DISABLE()

Enable and disable the cache.

HAL_XCACHE_INVALIDATE_ALL()

Causes the entire contents of the cache to be invalidated. Depending on the hardware, this may require the cache to be disabled during the invalidation process. If so, the implementation must use HAL_XCACHE_IS_ENABLED() to save and restore the previous state.

Note

If this macro is called after HAL_XCACHE_SYNC() with the intention of clearing the cache (invalidating the cache after writing dirty data back to memory), you must prevent interrupts from happening between the two calls:

…
HAL_DISABLE_INTERRUPTS(old);
HAL_XCACHE_SYNC();
HAL_XCACHE_INVALIDATE_ALL();
HAL_RESTORE_INTERRUPTS(old);
…

Since the operation may take a very long time, real-time responsiveness could be affected, so only do this when it is absolutely required and you know the delay will not interfere with the operation of drivers or the application.

HAL_XCACHE_SYNC()

Causes the contents of the cache to be brought into synchronization with the contents of memory. In some implementations this may be equivalent to HAL_XCACHE_INVALIDATE_ALL().

HAL_XCACHE_BURST_SIZE()

Allows the size of cache to/from memory bursts to be controlled. This macro will only be defined if this functionality is available.

HAL_DCACHE_WRITE_MODE()

Controls the way in which data cache lines are written back to memory. There will be definitions for the possible modes. Typical definitions are HAL_DCACHE_WRITEBACK_MODE and HAL_DCACHE_WRITETHRU_MODE. This macro will only be defined if this functionality is available.

HAL_XCACHE_LOCK()

Causes data to be locked into the cache. The base and size arguments define the memory region that will be locked into the cache. It is architecture dependent whether more than one locked region is allowed at any one time, and whether this operation causes the cache to cease acting as a cache for addresses outside the region during the duration of the lock. This macro will only be defined if this functionality is available.

HAL_XCACHE_UNLOCK()

Cancels the locking of the memory region given. This should normally correspond to a region supplied in a matching lock call. This macro will only be defined if this functionality is available.

HAL_XCACHE_UNLOCK_ALL()

Cancels all existing locked memory regions. This may be required as part of the cache initialization on some architectures. This macro will only be defined if this functionality is available.

4.7.3. Cache Line Control

HAL_DCACHE_ALLOCATE( base , size )
HAL_DCACHE_FLUSH( base , size )
HAL_XCACHE_INVALIDATE( base , size )
HAL_DCACHE_STORE( base , size )
HAL_DCACHE_READ_HINT( base , size )
HAL_DCACHE_WRITE_HINT( base , size )
HAL_DCACHE_ZERO( base , size )

All of these macros apply a cache operation to all cache lines that match the memory address region defined by the base and size arguments. These macros will only be defined if the described functionality is available. Also, it is not guaranteed that the cache function will only be applied to just the described regions, in some architectures it may be applied to the whole cache.

HAL_DCACHE_ALLOCATE(): Allocates lines in the cache for the given region without reading their contents from memory, hence the contents of the lines is undefined. This is useful for preallocating lines which are to be completely overwritten, for example in a block copy operation.
HAL_DCACHE_FLUSH(): Invalidates all cache lines in the region after writing any dirty lines to memory.
HAL_XCACHE_INVALIDATE(): Invalidates all cache lines in the region. Any dirty lines are invalidated without being written to memory.
HAL_DCACHE_STORE(): Writes all dirty lines in the region to memory, but does not invalidate any lines.
HAL_DCACHE_READ_HINT(): Hints to the cache that the region is going to be read from in the near future. This may cause the region to be speculatively read into the cache.
HAL_DCACHE_WRITE_HINT(): Hints to the cache that the region is going to be written to in the near future. This may have the identical behavior to HAL_DCACHE_READ_HINT().
HAL_DCACHE_ZERO(): Allocates and zeroes lines in the cache for the given region without reading memory. This is useful if a large area of memory is to be cleared.

4.8. Linker Scripts

When an eCos application is linked it must be done under the control of a linker script. This script defines the memory areas, addresses and sized, into which the code and data are to be put, and allocates the various sections generated by the compiler to these.

The linker script actually used is in lib/target.ld in the install directory. This is actually manufactured out of two other files: a base linker script and an .ldi file that was generated by the memory layout tool.

The base linker script is usually supplied either by the architecture HAL or the variant HAL. It consists of a set of linker script fragments, in the form of C preprocessor macros, that define the major output sections to be generated by the link operation. The .ldi file, which is #include'ed by the base linker script, uses these macro definitions to assign the output sections to the required memory areas and link addresses.

The .ldi file is supplied by the platform HAL, and contains knowledge of the memory layout of the target platform. These files generally conform to a standard naming convention, each file being of the form:

pkgconf/mlt_<architecture>_<variant>_<platform>_<startup>.ldi

where <architecture>, <variant> and <platform> are the respective HAL package names and <startup> is the startup type which is usually one of ROM, RAM or ROMRAM.

In addition to the .ldi file, there is also a congruously name .h file. This may be used by the application to access information defined in the .ldi file. Specifically it contains the memory layout defined there, together with any additional section names defined by the user. Examples of the latter are heap areas or PCI bus memory access windows.

The .ldi is manufactured by the Memory Layout Tool (MLT). The MLT saves the memory configuration into a file named

include/pkgconf/mlt_<architecture>_<variant>_<platform>_<startup>.mlt

in the platform HAL. This file is used by the MLT to manufacture both the .ldi and .h files. Users should beware that direct edits the either of these files may be overwritten if the MLT is run and regenerates them from the .mlt file.

The names of the .ldi and .h files are defined by macro definitions in pkgconf/system.h. These are CYGHWR_MEMORY_LAYOUT_LDI and CYGHWR_MEMORY_LAYOUT_H respectively. While there will be little need for the application to refer to the .ldi file directly, it may include the .h file as follows:

#include CYGHWR_MEMORY_LAYOUT_H

4.9. Diagnostic Support

The HAL provides support for low level diagnostic IO. This is particularly useful during early development as an aid to bringing up a new platform. Usually this diagnostic channel is a UART or some other serial IO device, but it may equally be a a memory buffer, a simulator supported output channel, a ROM emulator virtual UART, and LCD panel, a memory mapped video buffer or any other output device.

HAL_DIAG_INIT() performs any initialization required on the device being used to generate diagnostic output. This may include, for a UART, setting baud rate, and stop, parity and character bits. For other devices it may include initializing a controller or establishing contact with a remote device.

HAL_DIAG_WRITE_CHAR(c) writes the character supplied to the diagnostic output device.

HAL_DIAG_READ_CHAR(c) reads a character from the diagnostic device into the supplied variable. This is not supported for all diagnostic devices.

These macros are defined in the header file cyg/hal/hal_diag.h. This file is usually supplied by the variant or platform HAL, depending on where the IO device being used is located. For example for on-chip UARTs it would be in the variant HAL, but for a board-level LCD panel it would be in the platform HAL.

4.10. SMP Support

4.10.1. Target Hardware Limitations
4.10.2. HAL Support

eCos contains support for limited Symmetric Multi-Processing (SMP). This is only available on selected architectures and platforms.

4.10.1. Target Hardware Limitations

To allow a reasonable implementation of SMP, and to reduce the disruption to the existing source base, a number of assumptions have been made about the features of the target hardware.

Modest multiprocessing. The typical number of CPUs supported is two to four, with an upper limit around eight. While there are no inherent limits in the code, hardware and algorithmic limitations will probably become significant beyond this point.
SMP synchronization support. The hardware must supply a mechanism to allow software on two CPUs to synchronize. This is normally provided as part of the instruction set in the form of test-and-set, compare-and-swap or load-link/store-conditional instructions. An alternative approach is the provision of hardware semaphore registers which can be used to serialize implementations of these operations. Whatever hardware facilities are available, they are used in eCos to implement spinlocks.
Coherent caches. It is assumed that no extra effort will be required to access shared memory from any processor. This means that either there are no caches, they are shared by all processors, or are maintained in a coherent state by the hardware. It would be too disruptive to the eCos sources if every memory access had to be bracketed by cache load/flush operations. Any hardware that requires this is not supported.
Uniform addressing. It is assumed that all memory that is shared between CPUs is addressed at the same location from all CPUs. Like non-coherent caches, dealing with CPU-specific address translation is considered too disruptive to the eCos source base. This does not, however, preclude systems with non-uniform access costs for different CPUs.
Uniform device addressing. As with access to memory, it is assumed that all devices are equally accessible to all CPUs. Since device access is often made from thread contexts, it is not possible to restrict access to device control registers to certain CPUs.
Interrupt routing. The target hardware must have an interrupt controller that can route interrupts to specific CPUs. It is acceptable for all interrupts to be delivered to just one CPU, or for some interrupts to be bound to specific CPUs, or for some interrupts to be local to each CPU. At present dynamic routing, where a different CPU may be chosen each time an interrupt is delivered, is not supported. ECos cannot support hardware where all interrupts are delivered to all CPUs simultaneously with the expectation that software will resolve any conflicts.
Inter-CPU interrupts. A mechanism to allow one CPU to interrupt another is needed. This is necessary so that events on one CPU can cause rescheduling on other CPUs.
CPU Identifiers. Code running on a CPU must be able to determine which CPU it is running on. The CPU Id is usually provided either in a CPU status register, or in a register associated with the inter-CPU interrupt delivery subsystem. ECos expects CPU Ids to be small positive integers, although alternative representations, such as bitmaps, can be converted relatively easily. Complex mechanisms for getting the CPU Id cannot be supported. Getting the CPU Id must be a cheap operation, since it is done often, and in performance critical places such as interrupt handlers and the scheduler.

4.10.2. HAL Support

SMP support in any platform depends on the HAL supplying the appropriate operations. All HAL SMP support is defined in the cyg/hal/hal_smp.h header. Variant and platform specific definitions will be in cyg/hal/var_smp.h and cyg/hal/plf_smp.h respectively. These files are include automatically by this header, so need not be included explicitly.

SMP support falls into a number of functional groups.

4.10.2.1. CPU Control

This group consists of descriptive and control macros for managing the CPUs in an SMP system.

HAL_SMP_CPU_TYPE: A type that can contain a CPU id. A CPU id is usually a small integer that is used to index arrays of variables that are managed on an per-CPU basis.
HAL_SMP_CPU_MASK: A type that can contain a bitmask of all CPUs in the system. In this mask, bit n corresponds to CPU n.
HAL_SMP_CPU_COUNT: The maximum number of CPUs that can be supported. This is used to provide the size of any arrays that have an element per CPU.
HAL_SMP_CPU_MAX: The maximum possible CPU ID. This should normally be one less that HAL_SMP_CPU_COUNT when CPU IDs are indexed from 0.
HAL_SMP_CPU_THIS(): Returns the CPU id of the current CPU.
HAL_SMP_CPU_NONE: A value that does not match any real CPU id. This is uses where a CPU type variable must be set to a null value.
HAL_SMP_CPU_MASK_ALL: A value for the HAL_SMP_CPU_MASK type that has a bit set for each CPU supported. This value can be derived from HAL_SMP_CPU_COUNT. For platforms where only a subset of the available cores will be used for SMP this bitmask can be defined to reflect the active set of SMP CPUs.
HAL_SMP_CPU_BOOT: This manifest allows the boot CPU for SMP configurations to be configured.
HAL_SMP_CPU_START( cpu ): Starts the given CPU executing at a defined HAL entry point. After performing any HAL level initialization, the CPU calls up into the kernel at cyg_kernel_cpu_startup().
HAL_SMP_CPU_RESCHEDULE_INTERRUPT( cpu, wait ): Sends the CPU a reschedule interrupt, and if wait is non-zero, waits for an acknowledgment. The interrupted CPU should call cyg_scheduler_set_need_reschedule() in its DSR to cause the reschedule to occur.
HAL_SMP_CPU_TIMESLICE_INTERRUPT( cpu, wait ): Sends the CPU a timeslice interrupt, and if wait is non-zero, waits for an acknowledgment. The interrupted CPU should call cyg_scheduler_timeslice_cpu() to cause the timeslice event to be processed.

4.10.2.2. Test-and-set Support

Test-and-set is the foundation of the SMP synchronization mechanisms.

HAL_TAS_TYPE: The type for all test-and-set variables. The test-and-set macros only support operations on a single bit (usually the least significant bit) of this location. This allows for maximum flexibility in the implementation.
HAL_TAS_SET( tas, oldb ): Performs a test and set operation on the location tas. oldb will contain true if the location was already set, and false if it was clear.
HAL_TAS_CLEAR( tas, oldb ): Performs a test and clear operation on the location tas. oldb will contain true if the location was already set, and false if it was clear.

4.10.2.3. Spinlocks

Spinlocks provide inter-CPU locking. Normally they will be implemented on top of the test-and-set mechanism above, but may also be implemented by other means if, for example, the hardware has more direct support for spinlocks.

HAL_SPINLOCK_TYPE: The type for all spinlock variables.
HAL_SPINLOCK_INIT_CLEAR: A value that may be assigned to a spinlock variable to initialize it to clear.
HAL_SPINLOCK_INIT_SET: A value that may be assigned to a spinlock variable to initialize it to set.
HAL_SPINLOCK_INIT( lock, val ): A macro to initialize a spinlock at runtime. The current state of the spinlock is set according to val: zero for clear, non-zero for set.
HAL_SPINLOCK_SPIN( lock ): The caller spins in a busy loop waiting for the lock to become clear. It then sets it and continues. This is all handled atomically, so that there are no race conditions between CPUs.
HAL_SPINLOCK_CLEAR( lock ): The caller clears the lock. One of any waiting spinners will then be able to proceed.
HAL_SPINLOCK_TRY( lock, val ): Attempts to set the lock. The value put in val will be true if the lock was claimed successfully, and false if it was not.
HAL_SPINLOCK_TEST( lock, val ): Tests the current value of the lock. The value put in val will be true if the lock is claimed and false of it is clear.

4.10.2.4. Scheduler Lock

The scheduler lock is the main protection for all kernel data structures. By default the kernel implements the scheduler lock itself using a spinlock. However, if spinlocks cannot be supported by the hardware, or there is a more efficient implementation available, the HAL may provide macros to implement the scheduler lock.

HAL_SMP_SCHEDLOCK_DATA_TYPE: A data type, possibly a structure, that contains any data items needed by the scheduler lock implementation. A variable of this type will be instantiated as a static member of the Cyg_Scheduler_SchedLock class and passed to all the following macros.
HAL_SMP_SCHEDLOCK_INIT( lock, data ): Initialize the scheduler lock. The lock argument is the scheduler lock counter and the data argument is a variable of HAL_SMP_SCHEDLOCK_DATA_TYPE type.
HAL_SMP_SCHEDLOCK_INC( lock, data ): Increment the scheduler lock. The first increment of the lock from zero to one for any CPU may cause it to wait until the lock is zeroed by another CPU. Subsequent increments should be less expensive since this CPU already holds the lock.
HAL_SMP_SCHEDLOCK_ZERO( lock, data ): Zero the scheduler lock. This operation will also clear the lock so that other CPUs may claim it.
HAL_SMP_SCHEDLOCK_SET( lock, data, new ): Set the lock to a different value, in new. This is only called when the lock is already known to be owned by the current CPU. It is never called to zero the lock, or to increment it from zero.

4.10.2.5. Interrupt Routing

The routing of interrupts to different CPUs is supported by two new interfaces in hal_intr.h.

Once an interrupt has been routed to a new CPU, the existing vector masking and configuration operations should take account of the CPU routing. For example, if the operation is not invoked on the destination CPU itself, then the HAL may need to arrange to transfer the operation to the destination CPU for correct application.

HAL_INTERRUPT_SET_CPU( vector, mask ): Route the interrupt for the given vector to any of the CPUs whose bit is set in mask.
HAL_INTERRUPT_GET_CPU( vector, mask ): Set mask to the set of CPUs to which this vector is routed.


Chapter 3. General principles		Chapter 5. Exception Handling

2025-01-10

Open Publication License