Wednesday, January 15, 2014

Init Call Mechanism in the Linux Kernel



The Linux Kernel has for a long time (at least since v2.1.23) contained a clever and well optimised mechanism for calling initialisation code in drivers. It’s clever because its functionality is largely abstracted from the driver developer and well optimised because after initialisation, memory containing the initialisation code is released. This post explores how the mechanism works.
We’ll start by seeing how driver developers make use of this functionality; the following code has come from v2.6.27.6/drivers/net/smc911x.c and is the driver for a common Ethernet chipset.
static int __init smc911x_init(void)
{
        return platform_driver_register(&smc911x_driver);
}
...
module_init(smc911x_init);

The smc911x_init function can be considered as the entry point into the driver. Of particular interest are the __init macro and the static declaration. The __init macro marks the function as being required only during initialisation. Once initialisation is complete the kernel will remove this function and release its memory. The module_init macro tells the kernel where the initialisation entry point to the module lives, i.e. what function to call at ‘start of day’. In a typical driver you will often see many functions marked with the __init macro, but only a single module_init declaration.
Even though we are expecting the kernel to call smc911x_init at ‘start of day’, we have marked it as static, and that is OK (we will see later how the function is called). This is a particular strength of the init call mechanism, as it reduces the number of public symbols and reduces the coupling between driver modules and other parts of the kernel.
The init call mechanism also provides a means of recovering the memory used by initialisation data. Such data can be ‘tagged’ with the __initdata macro.
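To make the idea of ‘tagging’ data concrete, here is a minimal userspace sketch rather than kernel code. It assumes GCC or Clang with GNU ld (or lld) on an ELF platform such as Linux. The names demo_initdata, demo_init_data, demo_default_irq and demo_banner are all hypothetical; the only borrowed idea is that a section attribute groups the tagged variables together so that they can later be treated as one block.

```c
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's __initdata macro:
 * everything tagged with it lands in one ELF section, "demo_init_data". */
#define demo_initdata __attribute__((__section__("demo_init_data"), __used__))

/* Assumption: GNU ld and lld define __start_<name> and __stop_<name> for
 * any section whose name is a valid C identifier. */
extern char __start_demo_init_data[];
extern char __stop_demo_init_data[];

/* Two pieces of "initialisation only" data. */
static int demo_default_irq demo_initdata = 5;
static char demo_banner[16] demo_initdata = "demo driver";

/* Number of bytes occupied by everything tagged demo_initdata. */
size_t demo_initdata_size(void)
{
    return (size_t)(__stop_demo_init_data - __start_demo_init_data);
}

/* Tagged data is still ordinary data while it exists. */
int demo_read_irq(void)
{
    return demo_default_irq;
}
```

Because the tagged variables sit next to each other, the kernel can release them with a single range operation instead of tracking each variable individually. This sketch only measures the range; it does not free anything.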
With the above code in place, at an appropriate time during start-up the kernel will call the smc911x_init function, and once initialisation has finished its memory will be released. You can see this in the output from kernel boot (e.g. dmesg); for example, an x86 machine may print the following:
Freeing unused kernel memory: 386k freed
Which means that 386k of memory that previously contained initialisation code and data has now been freed.
OK – So we’ve seen how the mechanism is used, let’s now take a closer look and see how it works under the hood. A quick ‘grep’ reveals that the __init macro is defined in include/linux/init.h:
#define __init      __section(.init.text) __cold
And the __section and __cold macros are defined in the include/linux/compiler*.h files:
compiler.h:      #define __section(S)  __attribute__ ((__section__(#S)))
compiler-gcc4.h: #define __cold        __attribute__ ((cold))
And when we expand it out we get:
#define __init __attribute__((__section__(".init.text"))) __attribute__ ((cold))
Thus, when the __init macro is used, a number of GCC attributes are added to the function declaration. In the case of a different compiler, the compiler.h file will ensure the macros expand out to whatever is necessary for that compiler. The cold attribute is a relatively new GCC attribute and has existed since GCC 4.3. Its purpose is to mark the function as one that is rarely used, which results in the compiler optimising the function for size instead of speed. What we are really interested in here is the ‘section’ attribute. The __init macro uses this attribute to tell the compiler to put the text for this function in a special section named “.init.text”. The purpose is to put all initialisation functions in a single ELF section, so that they can be removed as one block after initialisation has been performed.
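The effect of the section attribute can be observed outside the kernel. The following is a minimal userspace sketch, assuming GCC or Clang with GNU ld (or lld) on an ELF platform. The macro demo_init, the section name demo_init_text and the functions are hypothetical names chosen for illustration. The sketch relies on the linker defining __start_ and __stop_ symbols for sections whose names are valid C identifiers, which is why it does not reuse the kernel's “.init.text” name.

```c
#include <stdint.h>

/* Hypothetical userspace analogue of __init: place the function in a
 * dedicated section and mark it as rarely executed. */
#define demo_init __attribute__((__section__("demo_init_text"), __cold__))

/* Assumption: provided automatically by GNU ld / lld for this section. */
extern char __start_demo_init_text[];
extern char __stop_demo_init_text[];

/* Same declaration shape as "static int __init smc911x_init(void)". */
static int demo_init demo_setup(void)
{
    return 42;
}

/* The function is called like any other function. */
int demo_run_setup(void)
{
    return demo_setup();
}

/* Returns 1 if demo_setup's code lies inside the demo_init_text section.
 * Converting a function pointer to an integer is implementation-defined
 * in ISO C, but is supported by GCC and Clang. */
int demo_setup_is_in_init_section(void)
{
    uintptr_t addr = (uintptr_t)demo_setup;

    return addr >= (uintptr_t)__start_demo_init_text &&
           addr <  (uintptr_t)__stop_demo_init_text;
}
```

After compiling, running objdump -h on the resulting binary should list demo_init_text as a separate section alongside .text.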
So what does module_init do? Its exact functionality depends on whether the module in question is built-in or compiled as a loadable module. For the purpose of this post, we’ll just be looking at built-in modules. Back to include/linux/init.h:
#define module_init(x) __initcall(x);
#define __initcall(fn) device_initcall(fn)
#define device_initcall(fn) __define_initcall("6", fn, 6)
#define __define_initcall(level, fn, id) \
           static initcall_t __initcall_##fn##id __used \
           __attribute__ ((__section__(".initcall" level ".init"))) = fn
So another load of macros that result in yet another GCC attribute!
#define module_init(x) static initcall_t __initcall_##x##6 __used \
                       __attribute__ ((__section__(".initcall6.init"))) = x;
And for clarity, let’s expand the module_init macro as seen in our Ethernet driver:
static initcall_t __initcall_smc911x_init6 __used
                  __attribute__ ((__section__(".initcall6.init"))) = smc911x_init;
So module_init in the context of a built-in driver results in declaring a function pointer with a unique name to our point of entry. In addition the macro ensures the function pointer is located in a special section of the ELF – we’ll see why shortly.
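The token pasting and the static entry point can be tried in userspace. This is a minimal sketch, assuming GCC or Clang on an ELF platform. The macros demo_define_initcall and demo_module_init, the section name and the driver function demo_eth_init are hypothetical and only mirror the shape of the kernel macros above.

```c
/* Same shape as the kernel's initcall_t: pointer to "int fn(void)". */
typedef int (*initcall_t)(void);

/* Hypothetical analogue of __define_initcall: paste fn and id into a
 * unique variable name and store the pointer in a per-level section. */
#define demo_define_initcall(level, fn, id) \
    static initcall_t demo_initcall_##fn##id \
    __attribute__((__used__, __section__("demo_initcall" level))) = fn

/* Hypothetical analogue of module_init for a built-in driver (level 6). */
#define demo_module_init(fn) demo_define_initcall("6", fn, 6)

static int demo_calls;

/* The entry point is static: no public symbol is exported for it. */
static int demo_eth_init(void)
{
    demo_calls++;
    return 0;
}

/* Expands to a variable named demo_initcall_demo_eth_init6. */
demo_module_init(demo_eth_init);

/* Calls the entry point through the generated pointer and returns how
 * many times it has run so far. */
int demo_call_registered(void)
{
    demo_initcall_demo_eth_init6();
    return demo_calls;
}
```

The pointer variable is the only reference to demo_eth_init that the rest of the program needs, which is why the function itself can stay static.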
So at present we have ensured that all our initialisation code is stored in the .init.text section (data tagged with __initdata goes into a corresponding .init.data section), and that each module has a function pointer for its point of entry, which has a unique name and is also stored in a special section of the resulting ELF. In addition, at link time the include/asm-generic/vmlinux.lds.h and arch/*/kernel/vmlinux.lds.S scripts ensure that labels/symbols surround the start and end of these sections: __early_initcall_end and __initcall_end mark the start and end of the function pointers, and __init_begin and __init_end mark the start and end of the init region that holds .init.text.
Finally we are in a position to see how these functions get called and how they are eventually freed. During kernel start-up a function called do_initcalls in init/main.c is called; the start of it is shown below.
static void __init do_initcalls(void)
{
     initcall_t *call;

     for (call = __early_initcall_end; call < __initcall_end; call++)
          do_one_initcall(*call);
     ...
}
The purpose of this loop is to execute each of the init functions registered by the module_init macros. This is achieved with a simple for loop over the function pointers. Initially, call points to the label at the start of the function pointer section; it is then incremented by the size of one function pointer (sizeof(initcall_t)) until the end of the section is reached. At each step the function pointer that call refers to is passed to do_one_initcall, and the init function is thus executed.
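The loop can be reproduced in userspace as a minimal sketch, assuming GCC or Clang with GNU ld (or lld) on an ELF platform. Here the linker-generated __start_demo_initcalls and __stop_demo_initcalls symbols play the role that __early_initcall_end and __initcall_end play in the kernel's linker script. The macro demo_initcall and the functions demo_a_init and demo_b_init are hypothetical.

```c
/* Same shape as the kernel's initcall_t. */
typedef int (*initcall_t)(void);

/* Hypothetical registration macro: store a pointer to fn in the
 * "demo_initcalls" section under a unique variable name. */
#define demo_initcall(fn) \
    static initcall_t demo_initcall_##fn \
    __attribute__((__used__, __section__("demo_initcalls"))) = fn

/* Assumption: GNU ld / lld define these because the section name is a
 * valid C identifier; together they bracket the array of pointers. */
extern initcall_t __start_demo_initcalls[];
extern initcall_t __stop_demo_initcalls[];

static int demo_ran;  /* bit mask recording which init functions ran */

static int demo_a_init(void) { demo_ran |= 1; return 0; }
static int demo_b_init(void) { demo_ran |= 2; return 0; }

demo_initcall(demo_a_init);
demo_initcall(demo_b_init);

/* Walk the section exactly as do_initcalls does: call++ advances by
 * sizeof(initcall_t). Returns the number of functions invoked. */
int demo_do_initcalls(void)
{
    initcall_t *call;
    int count = 0;

    for (call = __start_demo_initcalls; call < __stop_demo_initcalls; call++) {
        (*call)();
        count++;
    }
    return count;
}

int demo_ran_mask(void)
{
    return demo_ran;
}
```

Nothing in demo_do_initcalls names the individual init functions; adding another demo_initcall line elsewhere in the program is enough for it to be picked up by the loop.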
Once initialisation is complete, a function found in the architecture specific code named free_initmem is used to release the memory pages taken up by the initialisation functions and data. The exact nature of the function depends on the architecture.
So in a nutshell the kernel makes clever use of GCC attributes to ensure that initialisation functions and pointers to them are stored in unique sections of the ELF. Initialisation code at kernel start up then iterates through these function pointers and executes them in turn. Finally once all init code has been executed the entire ELF section (.init.text) is freed for use!
