AIX/LINUX for System Administrators: AIX boot process

As a system administrator, you should have a general understanding of the boot

process. This knowledge is useful for solving problems that can prevent a system

from booting properly. These problems can be either software or hardware related. We

also recommend that you be familiar with the hardware configuration of your

system.

Booting involves the following steps:

_ The initial step in booting a system is the Power On Self Test (POST). Its

purpose is to verify that the basic hardware is in a functional state. The

memory, keyboard, communication, and audio devices are also initialized.

You can see an image for each of these devices displayed on the screen. It is

during this step that you can press a function key to choose a different boot

list. The LED values displayed during this phase are model specific. Both

hardware and software problems can prevent the system from booting.

_ System Read Only Storage (ROS) is specific to each system type. It is

necessary for AIX 5L Version 5.3 to boot, but it does not build the data

structures required for booting. It will locate and load bootstrap code. System

ROS contains generic boot information and is operating system independent.

_ Software ROS (also named bootstrap) forms an IPL control block, which is

compatible with AIX 5L Version 5.3, that takes control and builds AIX 5L

specific boot information. A special file system located in memory and named

the RAMFS file system is created. Software ROS then locates, loads, and

turns control over to the AIX 5L boot logical volume (BLV). Software ROS is

AIX 5L information based on machine type and is responsible for completing

machine preparation to enable it to start the AIX 5L kernel.

_ A complete list of files that are part of the BLV can be obtained from the

/usr/lib/boot directory. The most important components are the following:

– The AIX 5L kernel

– Boot commands called during the boot process, such as bootinfo and

cfgmgr

– A reduced version of the ODM. Many devices need to be configured

before hd4 is made available, so their corresponding methods have to be

stored in the BLV. These devices are marked as base in PdDv.

– The rc.boot script

Note: Old systems based on the Micro Channel Architecture (MCA) execute an additional step

before this, the so-called Built In Self Test (BIST). This step is no longer

required for systems based on the PCI architecture.

_ The AIX 5L kernel is loaded and takes control. The system will display 0299

on the LED panel. All previous codes are hardware-related. The kernel will

complete the boot process by configuring devices and starting the init

process. LED codes displayed during this stage will be generic AIX 5L codes.

_ So far, the system has tested the hardware, found a BLV, created the RAMFS,

and started the init process from the BLV. The rootvg has not yet been

activated. From now on, the rc.boot script will be called three times, and is

passed a different parameter each time.

Boot phase 1

During this phase, the following steps are taken:

_ The init process started from RAMFS executes the boot script rc.boot 1. If the

init process fails for some reason, code c06 is shown on the LED display.

_ At this stage, the restbase command is called to copy a partial image of ODM

from the BLV into the RAMFS. If this operation is successful, the LED display

shows 510; otherwise, LED code 548 is shown.

_ After this, the cfgmgr -f command reads the Config_Rules class from the

reduced ODM. In this class, devices with the attribute phase=1 are

considered base devices. Base devices are all devices that are necessary to

access rootvg. For example, if the rootvg is located on a hard disk, all devices

starting from the motherboard up to the disk will have to be initialized. The

corresponding methods are called so that rootvg can be activated in boot

phase 2.

_ At the end of boot phase 1, the bootinfo -b command is called to determine

the last boot device. At this stage, the LED shows 511.

Boot phase 2

In boot phase 2, the rc.boot script is passed the parameter 2.

During this phase, the following steps are taken:

_ The rootvg volume group is varied on with the special version of the varyonvg

command named the ipl_varyon command. If this command is successful,

the system displays 517; otherwise, one of the following LED codes will

appear: 552, 554, or 556, and the boot process is halted.

_ Root file system hd4 is checked using the fsck -f command. This will verify

whether the file system was unmounted cleanly before the last shutdown. If

this command fails, the system will display code 555.

_ The root file system (/dev/hd4) is mounted on a temporary mount point (/mnt)

in RAMFS. If this fails, 557 will appear in the LED display.

96 IBM Eserver p5 and pSeries Administration and Support for AIX 5L V5.3

_ The /usr file system is verified using the fsck -f command and then

mounted. If this operation fails, the LED 518 appears.

_ The /var file system is verified using the fsck -f command and then

mounted. The copycore command checks if a dump occurred. If it did, it is

copied from the default dump device, /dev/hd6, to the default copy directory,

/var/adm/ras. Afterwards, /var is unmounted.

_ The primary paging space from rootvg, /dev/hd6, will be activated.

_ The mergedev process is called and all /dev files from the RAM file system

are copied onto disk.

_ All customized ODM files from the RAM file system are copied to disk. Both

ODM versions from hd4 and hd5 are now synchronized.

_ Finally, the root file system from rootvg (disk) is mounted over the root file

system from the RAMFS. The mount points for the rootvg file systems

become available. Now, the /var and /usr file systems from the rootvg are

mounted again on their ordinary mount points.

There is no console available at this stage, so all boot messages will be copied to

alog. The alog command maintains and manages logs.

Boot phase 3

After phase 2 is completed, rootvg is activated and the following steps are taken:

_ The /etc/init process is started. It reads the /etc/inittab file and calls rc.boot with

argument 3.

_ The /tmp file system is mounted.

_ The rootvg is synchronized by calling the syncvg command and launching it

as a background process. As a result, all stale partitions from rootvg are

updated. At this stage, the LED code 553 is shown.

_ At this stage, the cfgmgr command is called; if the system is booted in normal

mode, the cfgmgr command is called with option -p2; if the system is booted

in service mode, the cfgmgr command is called with option -p3. The cfgmgr

command reads the Config_Rules class from the ODM and calls all methods

corresponding to either phase=2 or phase=3. All other devices that are not

base devices are configured at this time.

_ Next, the console is configured by calling the cfgcon command. After the

configuration of the console, boot messages are sent to the console if no

STDOUT redirection is made. However, all missed messages can be found in

/var/adm/ras/conslog. LED codes that can be displayed at this time are:

– c31: Console not yet configured. Provides instructions to select console.

– c32: Console is an LFT terminal.

– c33: Console is a TTY.

– c34: Console is a file on the disk.

_ Finally, the synchronization of the ODM in the BLV with the ODM from the /

(root) file system is done by the savebase command.

_ The syncd daemon and errdemon are started.

_ The LED display is turned off.

_ If the file /etc/nologin exists, it will be removed.

_ If there are devices marked as missing in CuDv, a message is displayed on

the console.

_ The message System initialization completed is sent to the console. The

execution of rc.boot has completed. The init process will continue processing

the next command from /etc/inittab.
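
The LED codes named in the three boot phases can be gathered into a small lookup, which may help when a display stalls on one value. The following is a minimal sketch in POSIX shell and is not part of AIX: the function name led_meaning is hypothetical, and the table holds only the codes and meanings stated in the text above.

```shell
#!/bin/sh
# Hypothetical helper (not an AIX command): translate an LED code into the
# meaning given in this article. Codes not mentioned in the text are unknown.
led_meaning() {
    case "$1" in
        0299)        echo "kernel loaded and in control" ;;
        c06)         echo "phase 1: init process failed" ;;
        510)         echo "phase 1: restbase copied the ODM image into RAMFS" ;;
        548)         echo "phase 1: restbase failed" ;;
        511)         echo "phase 1: bootinfo -b determining the last boot device" ;;
        517)         echo "phase 2: ipl_varyon succeeded" ;;
        552|554|556) echo "phase 2: ipl_varyon failed, boot halted" ;;
        555)         echo "phase 2: fsck of hd4 failed" ;;
        557)         echo "phase 2: mount of hd4 failed" ;;
        518)         echo "phase 2: fsck or mount of /usr failed" ;;
        553)         echo "phase 3: syncvg of rootvg running" ;;
        c31)         echo "phase 3: console not yet configured" ;;
        c32)         echo "phase 3: console is an LFT terminal" ;;
        c33)         echo "phase 3: console is a TTY" ;;
        c34)         echo "phase 3: console is a file on the disk" ;;
        *)           echo "unknown code: $1"; return 1 ;;
    esac
}

# Example: look up one of the ipl_varyon failure codes.
led_meaning 554
```

Because unknown codes return a non-zero status, the function can also be used in a condition to decide whether a code needs to be looked up in the model-specific documentation.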

The /etc/inittab file

The /etc/inittab file controls the initialization process.

The /etc/inittab file supplies the script to the init command's role as a general

process dispatcher. The process that constitutes the majority of the init

command's process dispatching activities is the /etc/getty line process, which

initiates individual terminal lines. Other processes typically dispatched by the

init command are daemons and the shell.

The /etc/inittab file is composed of entries that are position-dependent and have

the following format:

Identifier:RunLevel:Action:Command

Each entry is delimited by a newline character. A backslash (\) preceding a

newline character indicates the continuation of an entry. There are no limits

(other than maximum entry size) on the number of entries in the /etc/inittab file.

The maximum entry size is 1024 characters.
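
The four-field layout can be illustrated by splitting a record on its colons. This is a sketch only: parse_inittab_entry is a hypothetical helper, not an AIX command. It relies on the shell's read builtin assigning the rest of the line to the last variable, so colons inside the command field are preserved.

```shell
#!/bin/sh
# Hypothetical helper: print the four fields of one inittab record.
# IFS=: makes read split on colons only; the last variable receives
# everything after the third colon, including any further colons.
parse_inittab_entry() {
    printf '%s\n' "$1" | {
        IFS=: read -r it_id it_level it_action it_command
        printf 'Identifier=%s\n' "$it_id"
        printf 'RunLevel=%s\n'   "$it_level"
        printf 'Action=%s\n'     "$it_action"
        printf 'Command=%s\n'    "$it_command"
    }
}

# Example record taken from the mkitab example later in this article.
parse_inittab_entry 'xcmd:2:respawn:find / -type f > /dev/null 2>&1'
```

An empty run-level field (two adjacent colons) is reported as an empty RunLevel value rather than being skipped.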

The entry fields are:

Identifier A one to fourteen character field that uniquely identifies an

object.

RunLevel The run level at which this entry can be processed.

The run level has the following attributes:

– Run levels effectively correspond to a configuration of

processes in the system.

– Each process started by the init command is assigned one

or more run levels in which it can exist.

– Run levels are represented by the numbers 0 through 9. For

example, if the system is in run level 1, only those entries

with a 1 in the run-level field are started.

– When you request the init command to change run levels,

all processes without a matching entry in the run-level field

for the target run level receive a warning signal (SIGTERM).

There is a 20-second grace period before processes are

forcibly terminated by the kill signal (SIGKILL).

– The run-level field can define multiple run levels for a process

by selecting more than one run level in any combination from

0 through 9. If no run level is specified, the process is

assumed to be valid at all run levels.

– There are four other values that appear in the run-level field,

even though they are not true run levels: a, b, c and h.

Entries that have these characters in the run level field are

processed only when the telinit command requests them

to be run (regardless of the current run level of the system).

They differ from run levels in that the init command can

never enter run level a, b, c, or h. Also, a request for the

execution of any of these processes does not change the

current run level. Furthermore, a process started by an a, b,

or c command is not killed when the init command

changes levels. They are only killed if their line in the

/etc/inittab file is marked off in the action field, their line is

deleted entirely from /etc/inittab, or the init command goes

into single-user mode.

Action Tells the init command how to treat the process specified in the

command field. The following actions are recognized by the init

command:

respawn If the process does not exist, start the process. Do

not wait for its termination (continue scanning the

/etc/inittab file). Restart the process when it dies. If

the process exists, do nothing and continue

scanning the /etc/inittab file.

wait When the init command enters the run level that

matches the entry's run level, start the process and

wait for its termination. All subsequent reads of the

/etc/inittab file, while the init command is in the

same run level, will cause the init command to

ignore this entry.

once When the init command enters a run level that

matches the entry's run level, start the process, and

do not wait for termination. When it dies, do not

restart the process. When the system enters a new

run level, and the process is still running from a

previous run level change, the program will not be

restarted.

boot Process the entry only during system boot, which is

when the init command reads the /etc/inittab file

during system startup. Start the process, do not wait

for its termination, and when it dies, do not restart

the process. In order for the instruction to be

meaningful, the run level should be the default or it

must match the init command's run level at boot

time. This action is useful for an initialization function

following a hardware reboot of the system.

bootwait Process the entry the first time that the init

command goes from single-user to multi-user state

after the system is booted. Start the process, wait for

its termination, and when it dies, do not restart the

process. If the initdefault is 2, run the process right

after boot.

powerfail Execute the process associated with this entry only

when the init command receives a power fail signal

(SIGPWR).

powerwait Execute the process associated with this entry only

when the init command receives a power fail signal

(SIGPWR), and wait until it terminates before

continuing to process the /etc/inittab file.

off If the process associated with this entry is currently

running, send the warning signal (SIGTERM), and

wait 20 seconds before terminating the process with

the kill signal (SIGKILL). If the process is not

running, ignore this entry.

ondemand Functionally identical to respawn, except this action

applies to the a, b, or c values, not to run levels.

initdefault An entry with this action is only scanned when the

init command is initially invoked. The init

command uses this entry, if it exists, to determine

which run level to enter initially. It does this by taking

the highest run level specified in the run-level field

and using that as its initial state. If the run level field

is empty, this is interpreted as 0123456789;

therefore, the init command enters run level 9.

Additionally, if the init command does not find an

initdefault entry in the /etc/inittab file, it requests an

initial run level from the user at boot time.

sysinit Entries of this type are executed before the init

command tries to access the console before login. It

is expected that this entry will only be used to

initialize devices on which the init command might

try to ask the run level question. These entries are

executed and waited for before continuing.

Command A shell command to execute. The entire command field is

prefixed with exec and passed to a forked sh as the sh -c exec

command. Any legal sh command syntax can appear in this field.

Comments can be inserted with the # comment syntax.
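
Two of the run-level rules described above lend themselves to a short illustration: an empty run-level field is valid at every run level, and initdefault takes the highest run level listed. The sketch below assumes the run-level field is handled as a plain string of characters. The function names entry_runs_at and initdefault_level are hypothetical and do not exist on AIX.

```shell
#!/bin/sh
# Hypothetical helper: succeed if an entry whose run-level field is $1
# is valid at run level $2. An empty field means "all run levels".
entry_runs_at() {
    case "$1" in
        "")     return 0 ;;
        *"$2"*) return 0 ;;
        *)      return 1 ;;
    esac
}

# Hypothetical helper: print the run level that an initdefault entry with
# run-level field $1 would select, that is, the highest digit present.
# An empty field is treated as 0123456789, as described in the text.
initdefault_level() {
    idl_field=${1:-0123456789}
    idl_max=
    for idl_digit in 0 1 2 3 4 5 6 7 8 9; do
        case "$idl_field" in
            *"$idl_digit"*) idl_max=$idl_digit ;;
        esac
    done
    echo "$idl_max"
}

# Examples
entry_runs_at 23456789 2 && echo "entry is started at run level 2"
initdefault_level 235
```

The a, b, c, and h values are deliberately left out of initdefault_level, because the init command can never enter them.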

The getty command overwrites the output of any commands that appear before

it in the /etc/inittab file. To record the output of these commands to the boot log,

pipe their output to the alog -tboot command.
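
As an illustration of this advice, the sketch below composes a record whose command sends both stdout and stderr through alog -tboot and on to /dev/console. The helper name logged_record, the identifier mytask, and the script path /usr/local/bin/mytask.sh are all hypothetical. The function only prints the record and does not modify /etc/inittab.

```shell
#!/bin/sh
# Hypothetical helper: build an inittab record (Identifier, RunLevel,
# Action, Command) whose output is piped to the boot log via alog -tboot.
logged_record() {
    printf '%s:%s:%s:%s 2>&1 | alog -tboot > /dev/console\n' \
        "$1" "$2" "$3" "$4"
}

# Print a record for a hypothetical script that runs once at run level 2.
logged_record mytask 2 once /usr/local/bin/mytask.sh
```

On an AIX system, the printed string could then be reviewed and passed to the mkitab command described below.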

The stdin, stdout, and stderr file descriptors may not be available while the init

command is processing inittab entries. Any entries writing to stdout or stderr may

not work predictably unless they redirect their output to a file or to /dev/console.

The following commands are the only supported methods for modifying the

records in the /etc/inittab file:

mkitab Adds records to the /etc/inittab file.

lsitab Lists records in the /etc/inittab file.

chitab Changes records in the /etc/inittab file.

rmitab Removes records from the /etc/inittab file.

For example, suppose you want to add a record to the /etc/inittab file that runs the find

command at run level 2 and restarts it each time it finishes:

1. Run the ps command and display only those processes that contain the word

find:

# ps -ef | grep find

root 19750 13964 0 10:47:23 pts/0 0:00 grep find

2. Add a record named xcmd to the /etc/inittab file using the mkitab command:

# mkitab "xcmd:2:respawn:find / -type f > /dev/null 2>&1"

3. Show the new record with the lsitab command:

# lsitab xcmd

xcmd:2:respawn:find / -type f > /dev/null 2>&1

4. Display the processes:

# ps -ef | grep find

root 25462 1 6 10:56:58 - 0:00 find / -type f

root 28002 13964 0 10:57:00 pts/0 0:00 grep find

5. Cancel the find command process:

# kill 25462

6. Display the processes:

# ps -ef | grep find

root 23538 13964 0 10:58:24 pts/0 0:00 grep find

root 28966 1 4 10:58:21 - 0:00 find / -type f

Since the action field is configured as respawn, a new process (28966, in this

example) is started each time its predecessor finishes.

The process will continue re-spawning, unless you change the action field, for

example:

1. Change the action field on the record xcmd from respawn to once:

# chitab "xcmd:2:once:find / -type f > /dev/null 2>&1"

2. Display the processes:

# ps -ef | grep find

root 20378 13964 0 11:07:20 pts/0 0:00 grep find

root 28970 1 4 11:05:46 - 0:03 find / -type f

3. Cancel the find command process:

# kill 28970

4. Display the processes:

# ps -ef | grep find

root 28972 13964 0 11:07:33 pts/0 0:00 grep find

To delete this record from the /etc/inittab file, use the rmitab command. For

example:

# rmitab xcmd

# lsitab xcmd

Order of the /etc/inittab entries

The base process entries in the /etc/inittab file are ordered as follows:

1. initdefault

2. sysinit

3. Powerfailure Detection (powerfail)

4. Multiuser check (rc)

5. /etc/firstboot (fbcheck)

6. System Resource Controller (srcmstr)

7. Start TCP/IP daemons (rctcpip)

8. Start NFS daemons (rcnfs)

9. cron

10. pb cleanup (piobe)

11. getty for the console (cons)

The System Resource Controller (SRC) has to be started near the beginning of

the /etc/inittab file since the SRC daemon is needed to start other processes.

Since NFS requires TCP/IP daemons to run correctly, TCP/IP daemons are

started ahead of the NFS daemons. The entries in the /etc/inittab file are ordered

according to dependencies, meaning that if a process (process2) requires that

another process (process1) be present for it to operate normally, then an entry

for process1 comes before an entry for process2 in the /etc/inittab file.
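
The dependency rule can be checked mechanically by comparing the positions of two identifiers. The following is a sketch under stated assumptions: comes_before is a hypothetical helper, and sample_inittab prints abbreviated sample records rather than the contents of a real /etc/inittab file.

```shell
#!/bin/sh
# Hypothetical helper: read inittab records on stdin and succeed only if
# identifier $1 appears on an earlier line than identifier $2.
comes_before() {
    awk -F: -v first="$1" -v second="$2" '
        $1 == first  && !nfirst  { nfirst  = NR }
        $1 == second && !nsecond { nsecond = NR }
        END { exit !(nfirst && nsecond && nfirst < nsecond) }
    '
}

# Abbreviated sample data (an assumption made for illustration only).
sample_inittab() {
cat <<'EOF'
init:2:initdefault:
srcmstr:23456789:respawn:/usr/sbin/srcmstr
rctcpip:23456789:wait:/etc/rc.tcpip
rcnfs:23456789:wait:/etc/rc.nfs
EOF
}

# TCP/IP must be started ahead of NFS.
sample_inittab | comes_before rctcpip rcnfs && echo "rctcpip precedes rcnfs"
```

On a live system the same check could be fed from /etc/inittab itself. When a new record has to follow a specific existing one, the mkitab command's -i flag is believed to place it after the named identifier; confirm this against the mkitab documentation for your AIX level.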

Saturday, 9 March 2013
