Libtune API design

Version 0.7
2005/09/22

Nadia Derbey
Nadia.Derbey@bull.net



1. Overview
2. Installing an application
    2.1. Checking the requirements
        2.1.1. Installation requirements
            2.1.1.1. Hardware requirements
            2.1.1.2. Distribution requirements
            2.1.1.3. Software requirements
            2.1.1.4. Communication requirements
        2.1.2. Hyper-threading support
        2.1.3. Memory requirements
        2.1.4. Swap requirements
        2.1.5. Disk requirements
        2.1.6. File system requirements
    2.2. Tuning the system-wide parameters
        2.2.1. Shared memory parameters
        2.2.2. Message queues parameters
        2.2.3. Semaphores parameters
        2.2.4. File system parameters
        2.2.5. Network parameters
        2.2.6. Memory parameters
    2.3. Setting the per-user limits
    2.4. Setting the per-process parameters
    2.5. Getting statistics
        2.5.1. System-wide statistics
        2.5.2. Per-process statistics
3. Summary
4. The tunables database
    4.1. The tunables database generation
    4.2. Accessing the database
5. Portability issues
    5.1. Determining the current distribution
    5.2. Source files hierarchy
        5.2.1. Include directory
            5.2.1.1. Directories contents
            5.2.1.2. Files contents
        5.2.2. lib and distro-lib directories
            5.2.2.1. Directories contents
            5.2.2.2. Files contents
        5.2.3. Building process
        5.2.4. Installation process

6. The libtune API
    6.1. Alternative
    6.2. Access rights and required privileges
    6.3. Getting information (tun_get)
        6.3.1. Parameters
        6.3.2. Returned values
    6.4. Setting information (tun_set)
        6.4.1. Parameters
        6.4.2. Returned values
    6.5. Locating information (tun_locate)
        6.5.1. Parameters
        6.5.2. Returned values
    6.6. Getting the key word for a location (tun_get_kwd)
        6.6.1. Parameters
        6.6.2. Returned values
    6.7. Getting all the valid keywords (tun_lst_all)
        6.7.1. Parameters
        6.7.2. Returned values
    6.8. Getting help information (tun_help)
        6.8.1. Parameters
        6.8.2. Returned values
    6.9. Updating TUNDB_D (tun_update)
        6.9.1. Parameters
        6.9.2. Returned values
7. Deliverables





1. Overview

Access to kernel tunables, system information and resource consumption is needed throughout the life cycle of an application, starting from its installation. This access is usually implemented through installation and supervision scripts. Unfortunately, the following issues have been identified:
  1. These scripts are rarely portable, since they have to get, set and change values represented by objects that may change from one distribution to another, or even from one release to the next within the same distribution.
  2. There are multiple ways of accessing the kernel configuration and tunables: procfs, sysfs, existing syscalls or library routines, etc.
This raises the need for a standard, well-defined API to manipulate the kernel configuration and tunables, for software products to rely on.
The goal of this design is to define a standard API that unifies the various ways Linux developers have to access kernel tunables, system information and resource consumption. The libtune API should be built on top of the existing mechanisms, instead of replacing them, in order to maintain backward compatibility. As seen above, this API will be useful during the whole life of an application.
In the following chapters, we first present what is done today when an application is installed and while it is running. Then, the libtune API is presented.

2. Installing an application

Generally, installing an application can be divided into the following four tasks:
  1. Check the requirements
  2. Tune the system-wide parameters (kernel, vm, file system, networking, etc)
  3. Set the per-user limits
  4. Set the per-process parameters
The checking (1) part can be considered as a read-only task, while the tune (2) and set (3, 4) parts are fully write tasks (though a read part can be added in order to check that the parameters have been correctly set).
The following chapters detail the information that should be checked or set during an application installation. They also describe where that information can be found on the machine. All the information has been collected in the following documents:

2.1. Checking the requirements

2.1.1. Installation requirements

2.1.1.1. Hardware requirements
This part describes the hardware the product is supported on, or has been tested on. The two must be distinguished: supported hardware is a mandatory requirement, while tested hardware is advisory only.

Requirement | Where to get the information from
CPU type the product is supported on | /proc/cpuinfo
CPU architecture (32 / 64 bits) |
machine model |
disk model | /sys/bus/scsi/devices/0:0:0:0 for example; info extraction can be built upon libsysfs

2.1.1.2. Distribution requirements
Requirement | Where to get the information from
kernel version | /proc/sys/kernel/osrelease: take the first 3 numbers ($(VERSION).$(PATCHLEVEL).$(SUBLEVEL)), e.g. in "2.6.9-1.667smp" only take "2.6.9"
distribution | /proc/version (see the note below)
compiler version | /proc/version (see the note below)

Note on /proc/version: this file contains the string linux_banner, which is built as follows:
"Linux version " UTS_RELEASE " (" LINUX_COMPILE_BY "@" LINUX_COMPILE_HOST ") (" LINUX_COMPILER ") " UTS_VERSION "\n";
The LINUX_COMPILER variable (obtained from $(CC) -v) should contain the distro level, although it is not known whether this is always true:
gcc -v | tail -1
gcc version 3.4.2 20041017 (Red Hat 3.4.2-6.fc3)
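
The kernel version extraction described above can be sketched as follows. This is only an illustration: kernel_base_version() is a hypothetical helper name, not part of libtune, and it works on a string that the caller has already read from /proc/sys/kernel/osrelease.

```c
#include <stdio.h>

/*
 * Copy the VERSION.PATCHLEVEL.SUBLEVEL prefix of an osrelease string
 * (e.g. "2.6.9" out of "2.6.9-1.667smp") into out.
 * Returns 0 on success, -1 if the string does not start with three
 * dot-separated numbers or if out is too small.
 * Hypothetical helper, not part of libtune.
 */
int kernel_base_version(const char *osrelease, char *out, size_t out_sz)
{
    int version, patchlevel, sublevel;
    int n;

    if (sscanf(osrelease, "%d.%d.%d", &version, &patchlevel, &sublevel) != 3)
        return -1;
    n = snprintf(out, out_sz, "%d.%d.%d", version, patchlevel, sublevel);
    if (n < 0 || (size_t)n >= out_sz)
        return -1;
    return 0;
}
```
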
2.1.1.3. Software requirements
2.1.1.4. Communication requirements

2.1.2. Hyper-threading support

Requirement | Where to get the information from
Number of sibling CPUs on the same physical CPU, for architectures which use hyper-threading | /proc/cpuinfo (siblings)
Package id of a logical CPU (if hyper-threading) | /proc/cpuinfo (physical id)

2.1.3. Memory requirements

Requirement | Expressed in | Where to get the information from
Memory space | KB or MB | /proc/meminfo (MemTotal)
Size of a huge page | kB | /proc/meminfo (Hugepagesize)
Total number of huge pages | # of pages | /proc/meminfo (HugePages_Total)
Total number of available huge pages | # of pages | /proc/meminfo (HugePages_Free)
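
As an illustration of how such values can be extracted, here is a minimal sketch that looks up one key in a buffer holding text in the /proc/meminfo layout ("Key:   value [kB]", one key per line). meminfo_value() is a hypothetical helper name; reading the file into the buffer is left to the caller.

```c
#include <stdio.h>
#include <string.h>

/*
 * Search text (in /proc/meminfo layout) for a line starting with
 * "key:" and parse the number that follows it.
 * Returns 0 and stores the number in *value on success, -1 otherwise.
 * Hypothetical helper, not part of libtune.
 */
int meminfo_value(const char *text, const char *key, unsigned long *value)
{
    size_t klen = strlen(key);
    const char *line = text;

    while (line != NULL && *line != '\0') {
        /* The ':' check avoids matching "Mem" against "MemTotal" */
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            if (sscanf(line + klen + 1, "%lu", value) == 1)
                return 0;
            return -1;
        }
        /* Move to the start of the next line, if any */
        line = strchr(line, '\n');
        if (line != NULL)
            line++;
    }
    return -1;
}
```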

2.1.4. Swap requirements

Requirement | Expressed in | Where to get the information from
Swap space | KB or MB or percentage of the memory space | /proc/meminfo (SwapTotal)

2.1.5. Disk requirements

Requirement | Where to get the information from
Free disk space | statfs()
Disk size limitation (max size supported for GPFS) | /proc/partitions
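
The free disk space check through statfs() can be sketched as follows. free_disk_kb() is a hypothetical helper name; choosing f_bavail (blocks available to unprivileged users) rather than f_bfree is an assumption about what an installer would want to check.

```c
#include <sys/vfs.h>

/*
 * Return the space available to unprivileged users on the file system
 * that holds path, in kB, or -1 on error (errno is set by statfs()).
 * Hypothetical helper, not part of libtune.
 */
long long free_disk_kb(const char *path)
{
    struct statfs fs;

    if (statfs(path, &fs) != 0)
        return -1;
    /* f_bavail: free blocks available to non-root users */
    return (long long)fs.f_bavail * (long long)fs.f_bsize / 1024;
}
```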

2.1.6. File system requirements

Max file system size

2.2. Tuning the system-wide parameters

2.2.1. Shared memory parameters

Parameter | Where to set / get the value
Maximum number of shm segment ids | /proc/sys/kernel/shmmni
Maximum shm segment size (in bytes) | /proc/sys/kernel/shmmax
Maximum number of shm segment pages system-wide | /proc/sys/kernel/shmall
Minimum shm segment size (in bytes) | ipcs -lm, or shmctl(IPC_INFO) (shmmin field of the returned struct shminfo)

2.2.2. Message queues parameters

Parameter | Where to set the value
Maximum size of a message queue | /proc/sys/kernel/msgmnb
Maximum number of message queue ids | /proc/sys/kernel/msgmni

2.2.3. Semaphores parameters

Parameter | Where to set the value
Max number of semaphore identifiers (semmni) | /proc/sys/kernel/sem (4th field)
Max number of semaphores per id (semmsl) | /proc/sys/kernel/sem (1st field)
Max number of semaphores in system (semmns = semmni * semmsl) | /proc/sys/kernel/sem (2nd field)
Max number of operations per semop call (semopm) | /proc/sys/kernel/sem (3rd field)
All previous parameters together | /proc/sys/kernel/sem
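
Since the four semaphore limits share a single pseudo-file, reading one of them means picking a field within the line. The sketch below illustrates this; struct sem_limits, parse_sem_line() and read_sem_limits() are hypothetical names, not part of libtune.

```c
#include <stdio.h>

/* Field order of /proc/sys/kernel/sem, as listed in the table above */
struct sem_limits {
    int semmsl;   /* 1st field: max number of semaphores per id */
    int semmns;   /* 2nd field: max number of semaphores in system */
    int semopm;   /* 3rd field: max number of operations per semop call */
    int semmni;   /* 4th field: max number of semaphore identifiers */
};

/* Parse the single line found in /proc/sys/kernel/sem; 0 on success. */
int parse_sem_line(const char *line, struct sem_limits *out)
{
    if (sscanf(line, "%d %d %d %d", &out->semmsl, &out->semmns,
               &out->semopm, &out->semmni) != 4)
        return -1;
    return 0;
}

/* Read and parse the pseudo-file itself; 0 on success, -1 on error. */
int read_sem_limits(struct sem_limits *out)
{
    char line[128];
    FILE *f = fopen("/proc/sys/kernel/sem", "r");
    int rc = -1;

    if (f == NULL)
        return -1;
    if (fgets(line, sizeof(line), f) != NULL)
        rc = parse_sem_line(line, out);
    fclose(f);
    return rc;
}
```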

2.2.4. File system parameters

Parameter | Where to set the value
Maximum number of file handles | /proc/sys/fs/file-max

2.2.5. Network parameters

Parameter | Where to set the value
Maximum receive buffer size | /proc/sys/net/core/rmem_max
Maximum send buffer size | /proc/sys/net/core/wmem_max
Maximum number of received packets that will be processed before resulting in congestion | /proc/sys/net/core/netdev_max_backlog
Minimum / default / maximum memory size of the TCP receive buffers | /proc/sys/net/ipv4/tcp_rmem
Minimum / default / maximum memory size of the TCP send buffers | /proc/sys/net/ipv4/tcp_wmem
Enable TCP to negotiate the use of window scaling (> 64K buffers) with the other end during connection setup | /proc/sys/net/ipv4/tcp_window_scaling
Timeout for a FIN packet before the socket is forcibly closed | /proc/sys/net/ipv4/tcp_fin_timeout
Maximum number of queued connection requests which have still not received an ACK from the connecting client | /proc/sys/net/ipv4/tcp_max_syn_backlog
Allow TIME-WAIT sockets to be reused for new connections | /proc/sys/net/ipv4/tcp_tw_reuse
Local port range that is used by TCP and UDP to choose the local port | /proc/sys/net/ipv4/ip_local_port_range
MTU size | ifconfig, or /proc/sys/net/ipv6/conf/<interface>/mtu

2.2.6. Memory parameters

Parameter | Expressed in | Where to set the value
Kernel policy for memory allocation | 0, 1, 2 | /proc/sys/vm/overcommit_memory
Percentage of the memory that should be added to the swap to determine the maximum address space that is allowed to be committed | Percentage of the physical memory | /proc/sys/vm/overcommit_ratio
Swappiness | [0, 100] | /proc/sys/vm/swappiness
Number of configured huge pages | Number of pages | /proc/sys/vm/nr_hugepages

2.3. Setting the per-user limits

Parameter | Default value | How to set the value (command level) | How to set the value (syscall level)
Number of open file descriptors (hard limit) | _SC_OPEN_MAX | ulimit -Hn (note: should be set to less than /proc/sys/fs/file-max) | setrlimit(RLIMIT_NOFILE)
Number of open file descriptors (soft limit) | _SC_OPEN_MAX | ulimit -Sn | setrlimit(RLIMIT_NOFILE)
Maximum number of processes (hard limit) | _SC_CHILD_MAX | ulimit -Hu | setrlimit(RLIMIT_NPROC)
Maximum number of processes (soft limit) | _SC_CHILD_MAX | ulimit -Su | setrlimit(RLIMIT_NPROC)
Maximum stack size (soft limit) | _STK_LIM | ulimit -Ss | setrlimit(RLIMIT_STACK)
Maximum size of the data segment (soft limit) | unlimited (RLIM_INFINITY) | ulimit -Sd | setrlimit(RLIMIT_DATA)
Maximum size of virtual memory (soft limit) | (datalim / 1024L) + (stacklim / 1024L) | ulimit -Sv | setrlimit(RLIMIT_AS)
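
At the syscall level, setting a soft limit is a getrlimit() / setrlimit() pair. The sketch below shows this for the number of open file descriptors; set_nofile_soft_limit() is a hypothetical helper name, and capping the request at the hard limit is a choice made for this example.

```c
#include <sys/resource.h>

/*
 * Set the soft limit on open file descriptors to wanted, capped by
 * the current hard limit (an unprivileged process cannot exceed it).
 * Returns 0 on success, -1 on error (errno is set).
 * Hypothetical helper, not part of libtune.
 */
int set_nofile_soft_limit(rlim_t wanted)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    if (rl.rlim_max != RLIM_INFINITY && wanted > rl.rlim_max)
        wanted = rl.rlim_max;
    rl.rlim_cur = wanted;   /* the hard limit rl.rlim_max is left untouched */
    return setrlimit(RLIMIT_NOFILE, &rl);
}
```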

2.4. Setting the per-process parameters

Parameter | Where to set the value
Base address for shared libraries (specific to some distros, like RH) | /proc/<pid>/mapped_base

2.5. Getting statistics

After the application installation has completed and while it is running, there is a need to get some system statistics about resource consumptions and utilization, in order to appropriately tune the kernel parameters.

2.5.1. System-wide statistics

These statistics are usually taken from /proc. Only the /proc entries that have not yet been presented above are described below:

Parameter | Where to get the value
Time spent in user mode | /proc/stat (column 1 + column 2)
Time spent in kernel mode | /proc/stat (column 3)
Time spent idle | /proc/stat (column 4)
Number of context switches | /proc/stat (ctxt)
Free memory | /proc/meminfo (MemFree)
Total amount of swap space | /proc/meminfo (SwapTotal)
Free swap area | /proc/meminfo (SwapFree)
Network device statistics | /proc/net/dev
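
The column arithmetic for the CPU times can be sketched as follows, on the aggregated "cpu" line of /proc/stat (the first line of the file). struct cpu_times and parse_cpu_line() are hypothetical names; the caller is assumed to have read the line already.

```c
#include <stdio.h>
#include <string.h>

struct cpu_times {
    unsigned long long user;    /* column 1 + column 2 (user + nice) */
    unsigned long long kernel;  /* column 3 */
    unsigned long long idle;    /* column 4 */
};

/*
 * Parse the aggregated "cpu" line of /proc/stat.
 * Per-CPU lines ("cpu0", "cpu1", ...) are rejected.
 * Returns 0 on success, -1 otherwise.
 */
int parse_cpu_line(const char *line, struct cpu_times *out)
{
    unsigned long long user, nice, sys, idle;

    if (strncmp(line, "cpu ", 4) != 0)
        return -1;
    if (sscanf(line + 4, "%llu %llu %llu %llu",
               &user, &nice, &sys, &idle) != 4)
        return -1;
    out->user = user + nice;
    out->kernel = sys;
    out->idle = idle;
    return 0;
}
```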

2.5.2. Per-process statistics

Parameter | Where to get the value
Status information about a given process | /proc/<pid>/stat

3. Summary

From what has been presented in the 1st chapters of this paper, libtune should be able to set and get information in the following ways:

4. The tunables database

We saw in the previous chapter that invoking the libtune API to set or get information is equivalent to one of:
Performing these operations is done with the support of what we will call the "tunables database" (TUNDB). This is actually an array, indexed by the data that will be accessed. Each entry of TUNDB is described as follows:

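The following table shows an example of how to fill the TUNDB for each case presented in the previous chapter. For each example, it gives the characteristics of the underlying object (data format, name scope, contents scope), followed by the attributes, the location, the field number or line delimiter, and the strategy routine:

Example: min shm segment size
    Underlying object: existing routine; name scope: system wide; contents scope: N/A
    Attributes: N/A
    Location: N/A; field number or line delimiter: N/A
    Strategy routine: strat_shmmin

Example: max number of shm segment ids
    Underlying object: entire file contents; name scope: system wide; contents scope: system wide
    Attributes: ATTR_NAME_SYS_WIDE | ATTR_CONT_SYS_WIDE
    Location: /proc/sys/kernel/shmmni; field number or line delimiter: N/A
    Strategy routine: strat_file

Example: shared libraries base address for a process
    Underlying object: entire file contents; name scope: per process; contents scope: system wide
    Attributes: ATTR_NAME_PER_PROC | ATTR_CONT_SYS_WIDE
    Location: /proc/%d/mapped_base; field number or line delimiter: N/A
    Strategy routine: strat_file

Example: MTU size
    Underlying object: entire file contents; name scope: per network interface; contents scope: system wide
    Attributes: ATTR_NAME_PER_NETINT | ATTR_CONT_SYS_WIDE
    Location: /proc/sys/net/ipv6/conf/%s/mtu; field number or line delimiter: N/A
    Strategy routine: strat_file

Example: max number of semaphores per id
    Underlying object: single line; name scope: system wide; contents scope: system wide
    Attributes: ATTR_NAME_SYS_WIDE | ATTR_CONT_SYS_WIDE
    Location: /proc/sys/kernel/sem; field number or line delimiter: 1
    Strategy routine: strat_sub_file_line

Example: ppid of a process
    Underlying object: single line; name scope: per process; contents scope: system wide
    Attributes: ATTR_NAME_PER_PROC | ATTR_CONT_SYS_WIDE
    Location: /proc/%d/stat; field number or line delimiter: 4
    Strategy routine: strat_sub_file_line

Example: number of receive packets for a network interface
    Underlying object: one line of data per single object; name scope: system wide; contents scope: per network interface
    Attributes: ATTR_NAME_SYS_WIDE | ATTR_CONT_PER_NETINT
    Location: /proc/net/dev; field number or line delimiter: 2
    Strategy routine: strat_sub_file_lines

Example: total swap space
    Underlying object: several lines, each line containing a fixed string followed by the corresponding value; name scope: system wide; contents scope: system wide
    Attributes: ATTR_NAME_SYS_WIDE | ATTR_CONT_SYS_WIDE
    Location: /proc/meminfo; field number or line delimiter: "SwapTotal"
    Strategy routine: strat_sub_file_block

Example: sleep average time of a process
    Underlying object: several lines, each line containing a fixed string followed by the corresponding value; name scope: per process; contents scope: system wide
    Attributes: ATTR_NAME_PER_PROC | ATTR_CONT_SYS_WIDE
    Location: /proc/%d/status; field number or line delimiter: "SleepAVG"
    Strategy routine: strat_sub_file_block

Example: package id of a cpu
    Underlying object: block of data for each object; name scope: system wide; contents scope: per CPU
    Attributes: ATTR_NAME_SYS_WIDE | ATTR_CONT_PER_CPU
    Location: /proc/cpuinfo; field number or line delimiter: "physical id"
    Strategy routine: strat_sub_file_block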

Note: files with a variable location should be present only once in the database: it is not reasonable to fill a new entry for the same pseudo-file each time it is associated with a different identifier. For example, in the case of the per-process pseudo-files (/proc/<pid>/*), we would need at least PID_MAX times about 20 entries. This is the reason why a printf format has been chosen for the file location.
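
To illustrate the printf-format location, here is a minimal sketch that resolves the location of a per-process entry for a given pid. The struct layout, the flag values and the function name tundb_resolve_pid() are assumptions made for this example only; the attribute names and the location strings are the ones used in the examples above.

```c
#include <stdio.h>

/* Attribute names as used above; the numeric values are assumed */
#define ATTR_NAME_SYS_WIDE  0x01
#define ATTR_NAME_PER_PROC  0x02
#define ATTR_CONT_SYS_WIDE  0x10

/* Hypothetical TUNDB entry, reduced to the fields needed here */
struct tundb_entry {
    unsigned int attributes;
    const char *location;   /* printf format if the name is per process */
    int field;              /* field number, 0 when not applicable */
};

/*
 * Build the actual pseudo-file name for entry e and process pid.
 * For system-wide names the location is copied unchanged.
 * Returns 0 on success, -1 if out is too small.
 */
int tundb_resolve_pid(const struct tundb_entry *e, int pid,
                      char *out, size_t out_sz)
{
    int n;

    if (e->attributes & ATTR_NAME_PER_PROC)
        n = snprintf(out, out_sz, e->location, pid);
    else
        n = snprintf(out, out_sz, "%s", e->location);
    return (n < 0 || (size_t)n >= out_sz) ? -1 : 0;
}
```

With this layout, a single entry such as { ATTR_NAME_PER_PROC | ATTR_CONT_SYS_WIDE, "/proc/%d/stat", 4 } serves every process on the system.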

4.1. The tunables database generation

The objects accessed through the TUNDB database presented in the previous chapter can be classified into the following 2 classes:
This classification leads us to split the TUNDB database into the following "sub-databases":
The following discussion only concerns TUNDB_S6, the database that is to be dynamically generated. In order to generate TUNDB_S6, we have to choose between:
  1. making each newly registered pseudo-file under /sys or /proc register itself in TUNDB_S6.
    • procfs pseudo-files: create_proc_entry() can be changed to fill TUNDB_S6 with each newly created pseudo-file. And remove_proc_entry() can be changed to do the reverse operation.
    • sysfs pseudo-files: sysfs_create() can be changed to fill TUNDB_S6 with each newly created pseudo-file, and sysfs_hash_and_remove() can be changed to do the reverse operation.
    The advantage of this solution is that the database is always in a state that reflects that of the underlying pseudo-filesystems. Its drawback is that it requires a change at the kernel level.
  2. scanning the procfs and sysfs pseudo-filesystems in order to discover their tree structure and fill TUNDB_S6 with the set of pseudo-files.
    The advantage of this solution is that it can be developed completely outside the kernel. Its drawback is that there are cases where the keywords will not perfectly reflect the tree structure of the pseudo-filesystems, even if the scan is done periodically. For example, drivers or modules that generate their own files under /proc may not be loaded at the time the scan is done; the corresponding files would then not be manageable by libtune until the next scan. On the other hand, adding and removing files from the pseudo-filesystems is not a frequent operation, except for files associated with running processes, and the names of those files are present in TUNDB_S6 from the very beginning, in their variable format (/proc/%d/cmdline). So such a latency seems acceptable.
So the second solution is the one that will be kept. It will be implemented through a daemon that will periodically scan the pseudo-filesystems and update TUNDB_S6 according to the results of its scan. Actually, TUNDB_S6 is split into:

The structure of TUNDB is shown in the following scheme:




The exchanges between the daemon and the library to update TUNDB_D are summarized in the following scheme:




To prevent the contents of TUNDB_D from disappearing as soon as libtune is unloaded, they are actually stored in a file (/var/tuned/tundb_d) that is mapped by the library when needed. This file is initialized by a binary (libtuninit) after libtune has been installed.
When the tuned daemon is first started, it asks the library to map /var/tuned/tundb_d into TUNDB_D.
Then the daemon periodically scans the /proc and /sys pseudo-filesystems (ignoring the /proc/<pid>/* files).
If it detects that one or more files have been added since it was last woken up, it asks the library to add the corresponding entries to TUNDB_D. If it detects that one or more files have been removed since it was last woken up, it asks the library to remove the corresponding entries from TUNDB_D.
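
The comparison between two successive scans can be sketched as follows. This is only one possible approach, assuming that each scan produces a sorted list of pseudo-file names; struct scan_diff, diff_scans() and MAX_CHANGES are hypothetical names, and the actual requests sent to the library are not shown.

```c
#include <stddef.h>
#include <string.h>

#define MAX_CHANGES 64   /* arbitrary capacity chosen for this sketch */

struct scan_diff {
    const char *added[MAX_CHANGES];     /* present only in the new scan */
    size_t n_added;
    const char *removed[MAX_CHANGES];   /* present only in the old scan */
    size_t n_removed;
};

/*
 * Compare two sorted lists of pseudo-file names, walking both lists in
 * parallel. Changes beyond MAX_CHANGES are silently dropped.
 */
void diff_scans(const char *const *prev, size_t n_prev,
                const char *const *curr, size_t n_curr,
                struct scan_diff *out)
{
    size_t i = 0, j = 0;

    out->n_added = 0;
    out->n_removed = 0;
    while (i < n_prev || j < n_curr) {
        int cmp;

        if (i == n_prev)
            cmp = 1;            /* only new names are left */
        else if (j == n_curr)
            cmp = -1;           /* only old names are left */
        else
            cmp = strcmp(prev[i], curr[j]);

        if (cmp == 0) {         /* unchanged file */
            i++;
            j++;
        } else if (cmp < 0) {   /* file removed since the last scan */
            if (out->n_removed < MAX_CHANGES)
                out->removed[out->n_removed++] = prev[i];
            i++;
        } else {                /* file added since the last scan */
            if (out->n_added < MAX_CHANGES)
                out->added[out->n_added++] = curr[j];
            j++;
        }
    }
}
```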

4.2. Accessing the database

In order for an application to get or set system information, it should address an entry in TUNDB through its index. This index enables the library to find all the information needed to get or set the addressed tunable (attributes, file location if any, associated strategy routines). Addressing a TUNDB entry is straightforward for the static part (TUNDB_S1 to TUNDB_S6): the indexes can easily be defined as constants in a documented header file.
Then, since the static part of TUNDB is actually made of 6 distinct arrays, we need to convert the documented constant into an index in the appropriate array. This is done as follows:
  1. The maximum number of indexes into each TUNDB array is the same (TUNDB_MAX = 0x400)
  2. An array called TUNLIMITS contains the following information for each TUNDB array:
  3. When a predefined documented keyword is referenced:
    1. it is first divided by TUNDB_MAX. The quotient identifies the array where the corresponding entry should be found.
    2. the value of the first index for this array is subtracted from the keyword value. This gives us the actual index in the array to access the needed information.
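
The conversion steps above can be sketched as follows. TUNDB_MAX comes from the text; struct tunlimits (reduced here to a first and a last index per array) and tun_keyword_to_index() are hypothetical names used for illustration.

```c
#include <stddef.h>

#define TUNDB_MAX 0x400   /* maximum number of indexes per TUNDB array */

/* Hypothetical TUNLIMITS entry: index range in use for one TUNDB array */
struct tunlimits {
    int first;   /* first keyword value of this array */
    int last;    /* last keyword value currently defined */
};

/*
 * Convert a documented keyword into (array number, index in that array).
 * Returns 0 on success, -1 if the keyword does not address a valid entry.
 */
int tun_keyword_to_index(const struct tunlimits *limits, size_t n_arrays,
                         int keyword, size_t *array, size_t *index)
{
    size_t a;

    if (keyword < 0)
        return -1;
    a = (size_t)keyword / TUNDB_MAX;          /* step 1: which array */
    if (a >= n_arrays)
        return -1;
    if (keyword < limits[a].first || keyword > limits[a].last)
        return -1;
    *array = a;
    *index = (size_t)(keyword - limits[a].first);   /* step 2: offset */
    return 0;
}
```
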
The following scheme summarizes the process that has just been described:



TUNDB_D, for its part, is an automatically generated part of TUNDB. For this array, each new index is obtained by incrementing the last existing index in TUNDB. Moreover, in order for applications to know which index to use for which file, a set of commands and API interfaces should be implemented to query information from the database (note that a help field exists for each entry in the database).

5. Portability issues

We saw in the overview of this document that the main problem for installation or supervision scripts is a portability issue:
Portability across distros and versions within distros will be supported through:
  1. a hierarchy that contains the distro and version dependent source files. During the build phase, the distro / family / version will be determined by a script. Depending on the result, the appropriate files will be taken into account for the compilation.
  2. 6 new databases for the distro / family dependent tunables (the first and last indexes of these arrays are integrated into TUNLIMITS):

Database | contains | Distro-independent corresponding database
TUNDB_S1_1 | the set of distro-dependent tunables that enables access to static objects (various strategy routines) | TUNDB_S1
TUNDB_S2_1 | the set of distro-dependent tunables that enables access to parts of dynamic objects through strat_get_subfile_line / strat_set_subfile_line: gets / sets a field within a line inside files that are made of a single line. | TUNDB_S2
TUNDB_S3_1 | the set of distro-dependent tunables that enables access to parts of dynamic objects through strat_get_subfile_lines / strat_set_subfile_lines: gets / sets a field within a line inside files that are made of several lines, each line containing a given identifier | TUNDB_S3
TUNDB_S4_1 | the set of distro-dependent tunables that enables access to parts of dynamic objects through strat_get_subfile_block / strat_set_subfile_block: gets / sets a value in files made of a single block of data. This block of data is made of several lines that contain an identifier and the actual value. | TUNDB_S4
TUNDB_S5_1 | the set of distro-dependent tunables that enables access to parts of dynamic objects through strat_get_subfile_blocks / strat_set_subfile_blocks: gets / sets a value in files made of several blocks of data. Each block of data is identified by a unique identifier, and it is made of several lines that contain an identifier and the actual value. | TUNDB_S5
TUNDB_S6_1 | the set of distro-dependent tunables that enables access to the entire contents of dynamic objects | TUNDB_S6

5.1. Determining the current distribution

Most Linux distributions maintain a release information file in the /etc directory. Here is the list of files for some well-known Linux distributions:

Distribution | Release files
Novell SuSE | /etc/SuSE-release
Red Hat | /etc/redhat-release, /etc/redhat_version
Fedora | /etc/fedora-release
Debian | /etc/debian_release, /etc/debian_version
Mandrake | /etc/mandrake-release
Gentoo | /etc/gentoo-release

This design (and the corresponding developments) will be limited to Red Hat and SuSE distributions, but it can easily be extended in the future if needed. Within a distribution, the family can be determined by scanning the /etc/*-release file. The family release can be determined by scanning the same file.
Another piece of information needed to know which source files should be used to build the library is the kernel version. It can be obtained with the "uname" command.

A new script is developed to get the distribution-related information. It has the following syntax:
get_distro [ -d | -f | -r | -h ]
        -d: get distro name
        -f: get family name within the distro
        -r: get the release name within the family
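
The distro detection is done by a script in this design; the C sketch below only illustrates the underlying test (which release file exists) for the two supported distributions. detect_distro() and the release_files table are hypothetical names, and the returned strings follow the distro-<distro> directory names used in the sources. The directory is passed as a parameter so that the function can be tried on a directory other than /etc.

```c
#include <stdio.h>
#include <unistd.h>

struct release_file {
    const char *distro;   /* name used in the distro-<distro> directories */
    const char *file;     /* release file expected in the /etc directory */
};

/* Only the distributions covered by this design */
static const struct release_file release_files[] = {
    { "SUSE",   "SuSE-release" },
    { "REDHAT", "redhat-release" },
};

/*
 * Return the distro whose release file exists under etc_dir,
 * or NULL if none of the known files is found.
 */
const char *detect_distro(const char *etc_dir)
{
    char path[256];
    size_t i;

    for (i = 0; i < sizeof(release_files) / sizeof(release_files[0]); i++) {
        int n = snprintf(path, sizeof(path), "%s/%s",
                         etc_dir, release_files[i].file);

        if (n < 0 || (size_t)n >= sizeof(path))
            continue;   /* path too long: skip this candidate */
        if (access(path, F_OK) == 0)
            return release_files[i].distro;
    }
    return NULL;
}
```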

5.2. Source files hierarchy

The non-common source files are kept in a tree sorted by kernel release, then by distribution, then by family, and finally by release within the family.

The following table shows the Linux kernel releases supported by RH and SuSE distributions:

Kernel | SuSE distros | Associated path within the sources | Red Hat distros | Associated path within the sources
2.6.5 | SLES9 | base-2.6.5/distro-SUSE/SLES/9 | FC2 | base-2.6.5/distro-REDHAT/FC/2
2.6.9 | | | FC3 | base-2.6.9/distro-REDHAT/FC/3
2.6.9 | | | RHEL4 | base-2.6.9/distro-REDHAT/RHEL/4
2.6.11 | (Open) SuSE 9.3 | base-2.6.11/distro-SUSE/SL/9.3 | FC4 | base-2.6.11/distro-REDHAT/FC/4
2.6.13 | (Open) SuSE 10.0 | base-2.6.13/distro-SUSE/SL/13 | FC5 | base-2.6.13/distro-REDHAT/FC/5

Note: the 2.6.13 kernel is not supported yet: Open SuSE 10.0 has not been released yet, and Fedora Core 5 is still unstable.

In what follows, we present the sources hierarchy. For the sake of simplicity, the presented tree is restricted to:
Files hierarchy:

Makefile
include:
   ---> Makefile
   ---> libtune.h
            #include <base/libtune_base.h>
   ---> libtune_priv.h
            #include <base/libtune_priv_base.h>
   ---> base (link to base-2.6.9)
   ---> base-2.6.9:
           ---> libtune_base.h
                         #include <distro/libtune_distro.h>
           ---> libtune_priv_base.h
                         #include <distro/libtune_priv_distro.h>
           ---> distro (link to distro-REDHAT/FC/3)
           ---> distro-REDHAT:
                   ---> FC:
                           ---> 3:
                                   ---> libtune_distro.h
                                   ---> libtune_priv_distro.h
                   ---> RHEL:
                           ---> 4:
                                   ---> libtune_distro.h
                                   ---> libtune_priv_distro.h

distro-lib:
   ---> base-2.6.9:
           ---> Makefile
           ---> libtundb_base.c
           ---> libtuninit_base.c
           ---> distro-REDHAT:
                   ---> FC:
                           ---> 3:
                                   ---> Makefile
                                   ---> libtuninit_distro.c
                                   ---> libtundb_distro.c
                   ---> RHEL:
                           ---> 4:
                                   ---> Makefile
                                   ---> libtuninit_distro.c
                                   ---> libtundb_distro.c
lib:
   ---> Makefile
   ---> libtundb.c
   ---> libtuninit.c

5.2.1. include directory

5.2.1.1. Directories contents
Directory | contains | will be linked to
include | the common include files, i.e. kernel / distro independent | N/A
include/base-2.6.9 | the include files that are kernel release dependent | include/base
include/base-2.6.9/distro-REDHAT/FC/3 | the include files that are specific to the Fedora Core 3 distribution | include/base/distro

5.2.1.2. Files contents
File | contains | includes | installed under
include/libtune.h | the prototypes for the libtune interfaces. This is the only part that is kernel / distro independent | <base/libtune_base.h> | /usr/include
include/base-2.6.9/libtune_base.h | the constant definitions for the tunables specific to the 2.6.9 standard kernel release. These are the tunables that are not distro / family dependent | <distro/libtune_distro.h> | /usr/include/base
include/base-2.6.9/distro-REDHAT/FC/3/libtune_distro.h | the constant definitions for the tunables specific to the Fedora Core 3 distribution. These are the tunables that are distro / family dependent | N/A | /usr/include/base/distro
include/libtune_priv.h | private definitions for the libtune library that are kernel / distro independent: structures, various internal constants, internal function prototypes | <base/libtune_priv_base.h> | Not installed
include/base-2.6.9/libtune_priv_base.h | the constant definitions that are specific to the 2.6.9 standard kernel release: help strings and file locations for the tunables that are defined in include/base-2.6.9/libtune_base.h | <distro/libtune_priv_distro.h> | Not installed
include/base-2.6.9/distro-REDHAT/FC/3/libtune_priv_distro.h | the constant definitions that are specific to the Fedora Core 3 distribution: help strings and file locations for the tunables defined in include/base-2.6.9/distro-REDHAT/FC/3/libtune_distro.h | N/A | Not installed

5.2.2. lib and distro-lib directories

5.2.2.1. Directories contents
Directory | contains
lib | the kernel / distro independent source files
distro-lib/base-2.6.9 | the source files that are kernel release dependent
distro-lib/base-2.6.9/distro-REDHAT/FC/3 | the source files that are specific to the Fedora Core 3 distribution

5.2.2.2. Files contents
File | contains
lib/libtundb.c | The initialization of TUNDB_D and TUNLIMITS (to NULL). This is the only part that is kernel / distro independent
distro-lib/base-2.6.9/libtundb_base.c | The initialization of TUNDB_S1 to TUNDB_S6. These are the arrays that are not distro / family dependent
distro-lib/base-2.6.9/distro-REDHAT/FC/3/libtundb_distro.c | The initialization of TUNDB_S1_1 to TUNDB_S6_1. These are the tunables that are distro / family dependent
lib/libtuninit.c | The storage of TUNLIMITS into the mapped files
distro-lib/base-2.6.9/libtuninit_base.c | The initialization of TUNLIMITS for the indexes that are not distro / family dependent (i.e. for the indexes that correspond to TUNDB_S1 to TUNDB_S6).
distro-lib/base-2.6.9/distro-REDHAT/FC/3/libtuninit_distro.c | The initialization of TUNLIMITS for the indexes that are distro / family dependent (i.e. for the indexes that correspond to TUNDB_S1_1 to TUNDB_S6_1)

5.2.3. Building process

The build process is the following (the examples assume that we want to build the library for Fedora Core 3):
  1. The top-level Makefile determines the current kernel / distribution / family / release (by calling uname and get_distro). This information is used to know where to take the sources from: base-<kernel>/distro-<distro>/<family>/<release> (ex: base-2.6.9/distro-REDHAT/FC/3). The Makefile can also be called with the needed variables set explicitly, to enable cross compilation (ex: make KVERSION=2.6.9 DISTRO=REDHAT FAMILY=FC FRELEASE=3).
  2. Then the Makefile initiates a preparation phase:
    1. it creates a symbolic link from include/base to include/base-<kernel> (ex: include/base-2.6.9)
    2. it creates a symbolic link from include/base/distro to include/base/distro-<distro>/<family>/<release> (ex: include/base/distro-REDHAT/FC/3)
  3. Then it builds the following:
    1. the libtuninit command, from lib/libtuninit.c, distro-lib/base-<kernel>/libtuninit_base.c and distro-lib/base-<kernel>/distro-<distro>/<family>/<release>/libtuninit_distro.c (ex. distro-lib/base-2.6.9/libtuninit_base.c and distro-lib/base-2.6.9/distro-REDHAT/FC/3/libtuninit_distro.c)
    2. the object files needed for the libtune library, from lib/*.c, distro-lib/base-<kernel>/libtundb_base.c and distro-lib/base-<kernel>/distro-<distro>/<family>/<release>/libtundb_distro.c (ex. distro-lib/base-2.6.9/libtundb_base.c and distro-lib/base-2.6.9/distro-REDHAT/FC/3/libtundb_distro.c)
    3. the libtune library, from lib/*.o, distro-lib/base-<kernel>/libtundb_base.o and distro-lib/base-<kernel>/distro-<distro>/<family>/<release>/libtundb_distro.o (ex. distro-lib/base-2.6.9/libtundb_base.o and distro-lib/base-2.6.9/distro-REDHAT/FC/3/libtundb_distro.o)

5.2.4. Installation process

The installation process is the following (the examples assume that we want to install the library for Fedora Core 3):
  1. The top-level Makefile determines the current kernel / distribution / family / release (by calling uname and get_distro). This information is used to know where to take the sources from: base-<kernel>/distro-<distro>/<family>/<release> (ex: base-2.6.9/distro-REDHAT/FC/3). The Makefile can also be called with the needed variables set explicitly, to enable cross compilation (ex: make KVERSION=2.6.9 DISTRO=REDHAT FAMILY=FC FRELEASE=3).
  2. Then the Makefile does the following:
    1. it creates a symbolic link from include/base to include/base-<kernel> (ex: include/base-2.6.9)
    2. it creates a symbolic link from include/base/distro to include/base/distro-<distro>/<family>/<release> (ex: include/base/distro-REDHAT/FC/3)
  3. Then it installs the header files:
    1. it creates the following directories:
      1. /usr/include/base
      2. /usr/include/base/distro
    2. it copies include/libtune.h under /usr/include
    3. it copies include/base/libtune_base.h under the /usr/include/base directory (remember that include/base is actually a link to include/base-<kernel> - ex. include/base-2.6.9)
    4. it copies include/base/distro/libtune_distro.h under the /usr/include/base/distro directory (remember that include/base/distro is actually a link to include/base-<kernel>/distro-<distro>/<family>/<release> - ex: include/base-2.6.9/distro-REDHAT/FC/3)

6. The libtune API

The API that comes out from the previous chapters is quite simple:
Since it is not POSIX compliant, this API is not intended to be integrated into the glibc: it will be a completely separate API.

6.1. Alternative

Looking at the limited number of actions performed during an installation, an alternative to the proposed API would be to define a single entry point per wanted action. For example:
This would be feasible as long as the number of actions remains under a reasonable limit. But the problem with this kind of API is that it has to be enhanced each time a new action is needed. The tun_set() / tun_get() solution is the most generic one, which is why we kept it: it can be used not only during application installation but also, for example, by a daemon in charge of periodically collecting statistics and adjusting the kernel parameters based on its observations.

6.2. Access rights and required privileges

The query interfaces are not restricted to specific users.

The interfaces used to get or set information, for their part, are subject to the same access rights as the underlying object.

6.3. Getting information (tun_get)

size_t tun_get(int keyword, void *identifier, char **out_buff, size_t *out_sz)

6.3.1. Parameters

This routine takes the following parameters:

6.3.2. Returned values

On success, tun_get() returns the number of characters read, including the terminating null character, but not including the EOF character. This value can be used to handle embedded null characters in the data read.
tun_get() returns -1 on failure, and errno is set.
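
Since tun_get() is declared as returning a size_t, which is unsigned, a caller has to compare the result with (size_t) -1 to detect a failure. The sketch below illustrates such a check. It does not use the real library: demo_tun_get() is a hypothetical stand-in with the same prototype, DEMO_KWD_SHMMAX is an invented keyword, and the buffer handling (the routine allocates or grows *out_buff and updates *out_sz, in the manner of getline()) is an assumption, not something this document states.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_KWD_SHMMAX 1   /* hypothetical keyword, not a real TUNDB index */

/* Hypothetical stand-in for tun_get(): always reports the value "33554432". */
size_t demo_tun_get(int keyword, void *identifier,
                    char **out_buff, size_t *out_sz)
{
    static const char value[] = "33554432";

    (void)identifier;   /* unused in this stand-in */
    if (keyword != DEMO_KWD_SHMMAX || out_buff == NULL || out_sz == NULL) {
        errno = EINVAL;
        return (size_t)-1;
    }
    /* Assumed getline()-like behaviour: grow the caller's buffer if needed. */
    if (*out_buff == NULL || *out_sz < sizeof value) {
        char *p = realloc(*out_buff, sizeof value);
        if (p == NULL) {
            errno = ENOMEM;
            return (size_t)-1;
        }
        *out_buff = p;
        *out_sz = sizeof value;
    }
    memcpy(*out_buff, value, sizeof value);
    return sizeof value;    /* count includes the terminating null */
}

/* Caller-side check: returns 0 and stores the length, or -1 on failure. */
int demo_read(int keyword, char **buf, size_t *sz, size_t *len)
{
    size_t n = demo_tun_get(keyword, NULL, buf, sz);

    if (n == (size_t)-1)    /* failure: errno has been set */
        return -1;
    *len = n;
    return 0;
}
```

The caller starts with a NULL buffer and a size of 0, and frees the buffer once it is done with it.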

6.4. Setting information (tun_set)

size_t tun_set(int keyword, void *identifier, char *in_buff, size_t in_sz, char **out_buff, size_t *out_sz)

6.4.1. Parameters

This routine takes the following parameters:

6.4.2. Returned values

On success, tun_set() returns the number of output characters, including the terminating null character, but not including the EOF character. This value can be used to handle embedded null characters in the output data.
tun_set() returns -1 on failure, and errno is set.

6.5. Locating information (tun_locate)

This routine can be used, given an index into the TUNDB database, to locate the underlying pseudo-file.
This routine is only meaningful for indexes that correspond to information that is managed through pseudo files.

int tun_locate(int keyword, char **location, int *loc_sz)

6.5.1. Parameters

This routine takes the following parameters:

6.5.2. Returned values

6.6. Getting the key word for a location (tun_get_kwd)

This is the reverse operation of the preceding one: given a location, it returns the associated TUNDB index (to use for example in a set / get operation). It also returns a string that gives the corresponding constant name. Actually, since many indexes may have the same location in TUNDB, it is recommended to call this routine several times in order to get back all the possible indexes and constant names that correspond to the same location.
Example of entries that have the same underlying locations:

char *tun_get_kwd(char *location, int *keyword)

6.6.1. Parameters

This routine takes the following parameters:

6.6.2. Returned values

If successful, tun_get_kwd() returns a pointer to the next constant name for the specified location (this string is malloc'ed by the library and should then be freed by the calling application), or NULL if the location is not found anymore (in that case, *keyword is not updated).
Note: if the location corresponds to a keyword that has been generated by the daemon, an empty string is returned ("").
On failure, ((char *) -1) is returned, *keyword is not updated and errno is set.
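
The four possible outcomes described above have to be tested in a specific order, since ((char *) -1) must never be dereferenced. A minimal sketch of that classification follows. The enum and the function name are hypothetical and are not part of libtune.

```c
#include <stddef.h>

/* Hypothetical classification of a tun_get_kwd() return value. */
enum demo_kwd_result {
    DEMO_KWD_ERROR,     /* ((char *) -1): failure, errno is set            */
    DEMO_KWD_END,       /* NULL: no further entry for this location        */
    DEMO_KWD_DAEMON,    /* "": keyword generated by the daemon             */
    DEMO_KWD_NAME       /* non-empty constant name, to be freed by caller  */
};

enum demo_kwd_result demo_classify_kwd(const char *ret)
{
    if (ret == (const char *)-1)    /* test first: not a valid pointer */
        return DEMO_KWD_ERROR;
    if (ret == NULL)
        return DEMO_KWD_END;
    if (ret[0] == '\0')
        return DEMO_KWD_DAEMON;
    return DEMO_KWD_NAME;
}
```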

6.7. Getting all the valid key words (tun_lst_all)

This interface can be used to get all the valid tunables present in the TUNDB database: given a keyword, it returns the next valid one in TUNDB. Combined with tun_help() (see below), it is useful for finding out which tunables are covered by the libtune API.

int tun_lst_all(int *keyword)

6.7.1. Parameters

This routine takes the following parameters:

6.7.2. Returned values

If successful, tun_lst_all() returns 0 and *keyword is filled with the next valid tunable value.
If there are no more valid tunables, TUN_ILLEGAL is returned.
On failure, -1 is returned, *keyword is not updated and errno is set.
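
An enumeration loop built on these three return values could look like the sketch below. It does not call the real library: demo_lst_all() is a hypothetical stand-in that walks a small fixed list of keywords, the value chosen for DEMO_TUN_ILLEGAL is invented, and the starting keyword passed by the caller is an assumption, since this document does not state how the first call is made.

```c
#include <errno.h>
#include <stddef.h>

#define DEMO_TUN_ILLEGAL (-2)   /* hypothetical value for TUN_ILLEGAL */

/* Hypothetical set of valid keywords, in increasing order. */
static const int demo_valid[] = { 3, 7, 12 };

/* Stand-in for tun_lst_all(): moves *keyword to the next valid keyword. */
int demo_lst_all(int *keyword)
{
    size_t i;

    if (keyword == NULL) {
        errno = EINVAL;
        return -1;
    }
    for (i = 0; i < sizeof demo_valid / sizeof demo_valid[0]; i++) {
        if (demo_valid[i] > *keyword) {
            *keyword = demo_valid[i];
            return 0;
        }
    }
    return DEMO_TUN_ILLEGAL;    /* no more valid tunables */
}

/* Counts the valid keywords that follow 'start'; returns -1 on failure. */
int demo_count_tunables(int start)
{
    int kwd = start;
    int count = 0;
    int rc;

    while ((rc = demo_lst_all(&kwd)) == 0)
        count++;                /* kwd now holds a valid keyword */
    return rc == DEMO_TUN_ILLEGAL ? count : -1;
}
```

Inside the loop, kwd could be passed to tun_help() to print the description of each tunable.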

6.8. Getting help information (tun_help)

This routine can be used, given an index into the TUNDB database, to return the corresponding help string.

int tun_help(int keyword, char **help, int *help_sz)

6.8.1. Parameters

This routine takes the following parameters:

6.8.2. Returned values

6.9. Updating TUNDB_D (tun_update)

This routine is for internal use only: it is for use by the tuned daemon. It is called to initialize, update and clean the TUNDB_D array (dynamic part of TUNDB).

int tun_update(int cmd, char *fname)

6.9.1. Parameters

This routine takes the following parameters:

6.9.2. Returned values

The returned values depend on the requested action, as follows:

cmd          returned value on success                             returned value on failure
TUN_INIT     0                                                     -1
TUN_CLEAN    0                                                     -1
TUN_ADD      >= 0 (associated keyword = D_IDX_FIRST + new index)   -1
TUN_REMOVE   0                                                     0
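
The relation between the value returned for TUN_ADD and the associated keyword can be sketched as follows, assuming that the returned value is the new index into TUNDB_D. The value given to DEMO_D_IDX_FIRST is invented for the illustration, and both function names are hypothetical.

```c
#define DEMO_D_IDX_FIRST 1000   /* hypothetical value for D_IDX_FIRST */

/* Maps a TUN_ADD return value (new index, or -1) to its keyword. */
int demo_keyword_from_add(int add_ret)
{
    if (add_ret < 0)
        return -1;              /* the TUN_ADD request failed */
    return DEMO_D_IDX_FIRST + add_ret;
}

/* Reverse mapping: keyword of a dynamic entry back to its index. */
int demo_index_from_keyword(int keyword)
{
    if (keyword < DEMO_D_IDX_FIRST)
        return -1;              /* not in the dynamic part of TUNDB */
    return keyword - DEMO_D_IDX_FIRST;
}
```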

7. Deliverables

These are the remaining phases for the libtune API: