The Linux kernel has long (at least since v2.1.23) contained a clever
and well-optimised mechanism for calling initialisation code in
drivers. It’s clever because its workings are largely hidden from the
driver developer, and well optimised because the memory containing the
initialisation code is released once initialisation is complete. This
post explores how the mechanism works.
We’ll start by seeing how driver developers make use of this
functionality. The following code comes from
v2.6.27.6/drivers/net/smc911x.c, the driver for a common Ethernet
chipset.
    static int __init smc911x_init(void)
    {
            return platform_driver_register(&smc911x_driver);
    }
    ...
    module_init(smc911x_init);
The smc911x_init function can be considered the entry point into the
driver. Of particular interest are the __init macro and the static
declaration. The __init macro marks the function as being required only
during initialisation. Once initialisation has been performed, the
kernel will remove this function and release its memory.
The module_init macro is used to tell the kernel where the
initialisation entry point to the module lives, i.e. what function to
call at ‘start of day’. In a typical driver you will often see many
functions marked with the __init macro, but only a single module_init
declaration.
Even though we are expecting the kernel to call smc911x_init at
‘start of day’ we have marked it as static and that is OK (we will see
later how the function is called). This is a particular strength of the
init call mechanism, as it reduces the number of public symbols and
the coupling between driver modules and other parts of the kernel.
The init call mechanism also provides a means of recovering memory
used by initialisation data. Such data can be ‘tagged’ with the
__initdata macro.
With the above code in place, at an appropriate time during start-up,
the kernel will call the smc911x_init function, and once it has been
executed its memory will be released. You can see this in the output
from kernel boot (e.g. dmesg); for example, an x86 machine may print
the following:
    Freeing unused kernel memory: 386k freed
This means that 386k of memory that previously contained initialisation code and data has now been freed.
OK – So we’ve seen how the mechanism is used, let’s now take a closer
look and see how it works under the hood. A quick ‘grep’ reveals that
the __init macro is defined in include/linux/init.h:
    #define __init __section(.init.text) __cold
And the __section and __cold macros are defined in the include/linux/compiler*.h files:
    /* compiler.h */
    #define __section(S) __attribute__ ((__section__(#S)))
    /* compiler-gcc4.h */
    #define __cold __attribute__ ((cold))
And when we expand it out we get:
    #define __init __attribute__((__section__(".init.text"))) __attribute__ ((cold))
Thus, when the __init macro is used, a number of GCC attributes are
added to the function declaration. In the case of a different compiler,
the compiler.h file will ensure the macros expand out to whatever is
necessary for the relevant compiler. The cold attribute is a relatively
new GCC attribute and has existed since GCC 4.3. Its purpose is to mark
the function as one that is rarely used, which results in the compiler
optimising the function for size instead of speed. What we are really
interested in here is the ‘section’ attribute. The __init macro uses
this attribute to tell the compiler to put the text for this function
in a special section named “.init.text”. The purpose is to put all
initialisation functions in a single ELF section, so that they can be
removed as one block after initialisation has been performed.
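To get a feel for the section attribute outside the kernel, here is a small userspace sketch. It is not kernel code: DEMO_INIT, DEMO_INITDATA and the demo_* names are made up for illustration, and it assumes GCC or Clang targeting ELF with a GNU-compatible linker, which synthesises __start_NAME and __stop_NAME symbols for any section whose name is a valid C identifier. The helper functions check that the tagged function and data really ended up inside their named sections.

```c
#include <stdint.h>

/* Hypothetical userspace stand-ins for the kernel's __init and __initdata.
 * "used" keeps the symbols even if the optimiser thinks they are dead. */
#define DEMO_INIT \
        __attribute__((__used__, __noinline__, __section__("demo_init_text")))
#define DEMO_INITDATA \
        __attribute__((__used__, __section__("demo_init_data")))

/* Boundary symbols provided by the linker (an assumption about the
 * toolchain: GNU ld, gold and lld do this for C-identifier section names). */
extern const char __start_demo_init_text[], __stop_demo_init_text[];
extern const char __start_demo_init_data[], __stop_demo_init_data[];

/* Data that would only be needed during initialisation. */
static int demo_defaults[2] DEMO_INITDATA = { 3, 4 };

/* A function that would only be needed during initialisation. */
static int DEMO_INIT demo_setup(void)
{
        return demo_defaults[0] + demo_defaults[1];
}

/* Returns 1 if addr lies inside [start, stop). */
static int demo_in_range(uintptr_t addr, const char *start, const char *stop)
{
        return addr >= (uintptr_t)start && addr < (uintptr_t)stop;
}

int demo_setup_is_in_init_text(void)
{
        return demo_in_range((uintptr_t)&demo_setup,
                             __start_demo_init_text, __stop_demo_init_text);
}

int demo_defaults_are_in_init_data(void)
{
        return demo_in_range((uintptr_t)demo_defaults,
                             __start_demo_init_data, __stop_demo_init_data);
}

int demo_run_setup(void)
{
        return demo_setup();
}
```

Unlike the kernel, nothing here ever frees the sections; the sketch only shows that the attribute gathers the tagged code and data into contiguous, separately named regions of the binary.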
So what does module_init do? Its exact functionality depends on whether
the module in question is built-in or compiled as a loadable module. For
the purpose of this post, we’ll just be looking at built-in modules.
Back to include/linux/init.h:
    #define module_init(x) __initcall(x);
    #define __initcall(fn) device_initcall(fn)
    #define device_initcall(fn) __define_initcall("6", fn, 6)
    #define __define_initcall(level, fn, id) \
            static initcall_t __initcall_##fn##id __used \
            __attribute__((__section__(".initcall" level ".init"))) = fn
So another load of macros that result in yet another GCC attribute!
    #define module_init(x) static initcall_t __initcall_x6 __used \
            __attribute__((__section__(".initcall6.init"))) = x;
And for clarity, let’s expand out the module_init macro as seen in our Ethernet driver:
    static initcall_t __initcall_smc911x_init6 __used
            __attribute__((__section__(".initcall6.init"))) = smc911x_init;
So module_init in the context of a built-in driver results in the
declaration of a uniquely named function pointer that points to our
entry point. In addition, the macro ensures the function pointer is
located in a special section of the ELF – we’ll see why shortly.
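The same registration trick can be tried in userspace. The sketch below is an analogue, not the kernel’s implementation: DEMO_MODULE_INIT, demo_initcall_t, the demo_initcalls section and the demo_eth_* names are all hypothetical, and it again assumes GCC or Clang on an ELF target whose linker provides __start_/__stop_ symbols for sections named like C identifiers. Note that demo_eth_init is static, yet it can still be reached through the pointer stored in the section.

```c
#include <stddef.h>

typedef int (*demo_initcall_t)(void);

/* Hypothetical analogue of module_init() for a built-in driver: define a
 * uniquely named, static function pointer and place it in a named section. */
#define DEMO_MODULE_INIT(fn) \
        static demo_initcall_t __demo_initcall_##fn \
        __attribute__((__used__, __section__("demo_initcalls"))) = fn

/* Linker-provided bounds of the section (toolchain assumption, see above). */
extern demo_initcall_t __start_demo_initcalls[], __stop_demo_initcalls[];

static int demo_eth_ready;

/* A static entry point: no public symbol is exported for it. */
static int demo_eth_init(void)
{
        demo_eth_ready = 1;
        return 0;
}
DEMO_MODULE_INIT(demo_eth_init);

/* Number of function pointers that were registered in the section. */
size_t demo_registered_count(void)
{
        return (size_t)(__stop_demo_initcalls - __start_demo_initcalls);
}

/* Invoke the first registered entry point via the section, not by name. */
int demo_call_first(void)
{
        return __start_demo_initcalls[0]();
}

int demo_eth_is_ready(void)
{
        return demo_eth_ready;
}
```

The caller never names demo_eth_init; it only knows where the section starts. That is what keeps the coupling between a driver and the start-up code so low.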
So at present we have ensured that all our initialisation code is
stored in the .init.text section (data tagged with __initdata goes to
.init.data), and that each module has a function pointer for its point
of entry, which has a unique name and is also stored in a special
section of the resulting ELF. In addition, at link time the
include/asm-generic/vmlinux.lds.h and arch/*/kernel/vmlinux.lds.S
scripts ensure that labels/symbols surround the start and end of these
sections: __early_initcall_end and __initcall_end mark the start and
end of the function pointers, and __init_begin and __init_end mark the
start and end of the init region that contains .init.text.
Finally we are in a position to see how these functions get called and
how they are eventually freed. During kernel start-up a function called
do_initcalls in init/main.c is called. It is shown below.
    static void __init do_initcalls(void)
    {
            initcall_t *call;

            for (call = __early_initcall_end; call < __initcall_end; call++)
                    do_one_initcall(*call);
            ...
    }
The purpose of this loop is to execute each of the init functions set
up by the module_init macros. This is achieved with a simple for loop
and a pointer (call) that walks over the function pointers. Initially
call points to the label at the start of our function pointers’ ELF
section, and it is incremented by the size of one function pointer
(sizeof(initcall_t)) until the end of the ELF section is reached. At
each step the function pointer is dereferenced and invoked, and the
init function is thus executed.
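The loop, and the way the level number in the section name controls ordering, can be sketched in userspace as well. Everything named demo_* here is hypothetical. One difference from the kernel is worth stating: the kernel’s linker script lays the .initcallN.init sections out back to back so a single loop covers them all, whereas this sketch has no linker script and so walks each level’s section separately, relying on the linker-generated __start_/__stop_ symbols (GCC or Clang on ELF assumed).

```c
#include <stddef.h>

typedef int (*demo_initcall_t)(void);

/* Hypothetical analogue of __define_initcall(): the level is stringified
 * into the section name and pasted into the pointer's name. */
#define DEMO_DEFINE_INITCALL(level, fn) \
        static demo_initcall_t __demo_initcall_##fn##level \
        __attribute__((__used__, __section__("demo_initcall" #level))) = fn

#define demo_core_initcall(fn)   DEMO_DEFINE_INITCALL(1, fn)
#define demo_device_initcall(fn) DEMO_DEFINE_INITCALL(6, fn)

extern demo_initcall_t __start_demo_initcall1[], __stop_demo_initcall1[];
extern demo_initcall_t __start_demo_initcall6[], __stop_demo_initcall6[];

/* Records the order in which the init functions ran. */
static char demo_log[8];
static size_t demo_log_len;

static void demo_record(char tag)
{
        if (demo_log_len < sizeof(demo_log) - 1)
                demo_log[demo_log_len++] = tag;
}

/* Registered first in the source, but at the later 'device' level. */
static int demo_driver_init(void) { demo_record('D'); return 0; }
demo_device_initcall(demo_driver_init);

/* Registered second, but at the earlier 'core' level. */
static int demo_bus_init(void) { demo_record('B'); return 0; }
demo_core_initcall(demo_bus_init);

/* Same shape as the kernel loop: step through the function pointers
 * between two boundary symbols and invoke each one. */
static int demo_run_level(demo_initcall_t *start, demo_initcall_t *end)
{
        demo_initcall_t *call;
        int ran = 0;

        for (call = start; call < end; call++) {
                (*call)();
                ran++;
        }
        return ran;
}

/* Runs every registered initcall, lowest level first; returns the count. */
int demo_do_initcalls(void)
{
        int ran = 0;

        ran += demo_run_level(__start_demo_initcall1, __stop_demo_initcall1);
        ran += demo_run_level(__start_demo_initcall6, __stop_demo_initcall6);
        return ran;
}

const char *demo_initcall_log(void)
{
        return demo_log;
}
```

Although the device-level function is registered first in the source, the core-level one runs first, because the order is decided by which section the pointer lives in rather than by where the macro appears.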
Once initialisation is complete, a function found in the architecture
specific code named free_initmem is used to release the memory pages
taken up by the initialisation functions and data. The exact nature of
the function depends on the architecture.
So in a nutshell the kernel makes clever use of GCC attributes to
ensure that initialisation functions and pointers to them are stored
in unique sections of the ELF. Initialisation code at kernel start-up
then iterates through these function pointers and executes them in turn.
Finally once all init code has been executed the entire ELF section
(.init.text) is freed for use!