As a system administrator, you should have a general understanding of the boot process. This knowledge is useful for solving problems that can prevent a system from booting properly. These problems can be software or hardware related. We also recommend that you be familiar with the hardware configuration of your system.

Booting involves the following steps:
- The initial step in booting a system is the Power On Self Test (POST). Its purpose is to verify that the basic hardware is in a functional state. The memory, keyboard, communication, and audio devices are also initialized. You can see an image for each of these devices displayed on the screen. It is during this step that you can press a function key to choose a different boot list. The LED values displayed during this phase are model specific. Both hardware and software problems can prevent the system from booting.
- System Read Only Storage (ROS) is specific to each system type. It is necessary for AIX 5L Version 5.3 to boot, but it does not build the data structures required for booting. It locates and loads the bootstrap code. System ROS contains generic boot information and is operating system independent.
- Software ROS (also named bootstrap) forms an IPL control block, which is compatible with AIX 5L Version 5.3, that takes control and builds AIX 5L specific boot information. A special file system located in memory and named the RAMFS file system is created. Software ROS then locates, loads, and turns control over to the AIX 5L boot logical volume (BLV). Software ROS is AIX 5L information based on machine type and is responsible for completing machine preparation to enable it to start the AIX 5L kernel.
- A complete list of files that are part of the BLV can be obtained from the /usr/lib/boot directory. The most important components are the following:
  – The AIX 5L kernel
  – Boot commands called during the boot process, such as bootinfo and cfgmgr
  – A reduced version of the ODM. Many devices need to be configured before hd4 is made available, so their corresponding methods have to be stored in the BLV. These devices are marked as base in PdDv.
  – The rc.boot script
Note: Old systems based on the MCA architecture execute an additional step before the POST, the so-called Built In Self Test (BIST). This step is no longer required for systems based on the PCI architecture.
- The AIX 5L kernel is loaded and takes control. The system displays 0299 on the LED panel. All previous codes are hardware-related. The kernel completes the boot process by configuring devices and starting the init process. LED codes displayed during this stage are generic AIX 5L codes.
- So far, the system has tested the hardware, found a BLV, created the RAMFS, and started the init process from the BLV. The rootvg has not yet been activated. From now on, the rc.boot script is called three times and is passed a different parameter each time.

Boot phase 1

During this phase, the following steps are taken:
- The init process started from RAMFS executes the boot script rc.boot 1. If the init process fails for some reason, code c06 is shown on the LED display.
- At this stage, the restbase command is called to copy a partial image of the ODM from the BLV into the RAMFS. If this operation is successful, the LED display shows 510; otherwise, LED code 548 is shown.
- After this, the cfgmgr -f command reads the Config_Rules class from the reduced ODM. In this class, devices with the attribute phase=1 are considered base devices. Base devices are all devices that are necessary to access rootvg. For example, if the rootvg is located on a hard disk, all devices starting from the motherboard up to the disk have to be initialized. The corresponding methods are called so that rootvg can be activated in boot phase 2.
- At the end of boot phase 1, the bootinfo -b command is called to determine the last boot device. At this stage, the LED shows 511.

Boot phase 2

In boot phase 2, the rc.boot script is passed the parameter 2. During this phase, the following steps are taken:
- The rootvg volume group is varied on with a special version of the varyonvg command named ipl_varyon. If this command is successful, the system displays 517; otherwise, one of the following LED codes appears: 552, 554, or 556, and the boot process is halted.
- The root file system hd4 is checked using the fsck -f command. This verifies whether the file system was unmounted cleanly before the last shutdown. If this command fails, the system displays code 555.
- The root file system (/dev/hd4) is mounted on a temporary mount point (/mnt) in RAMFS. If this fails, 557 appears in the LED display.
96 IBM Eserver p5 and pSeries Administration and Support for AIX 5L
V5.3
- The /usr file system is verified using the fsck -f command and then mounted. If this operation fails, LED code 518 appears.
- The /var file system is verified using the fsck -f command and then mounted. The copycore command checks whether a dump occurred. If it did, the dump is copied from the default dump device, /dev/hd6, to the default copy directory, /var/adm/ras. Afterwards, /var is unmounted.
- The primary paging space from rootvg, /dev/hd6, is activated.
- The mergedev process is called and all /dev files from the RAM file system are copied onto disk.
- All customized ODM files from the RAM file system are copied to disk. Both ODM versions from hd4 and hd5 are now synchronized.
- Finally, the root file system from rootvg (disk) is mounted over the root file system from the RAMFS. The mount points for the rootvg file systems become available. Now, the /var and /usr file systems from the rootvg are mounted again on their ordinary mount points.
There is no console available at this stage, so all boot messages are copied to alog. The alog command maintains and manages logs.

Boot phase 3

After phase 2 is completed, rootvg is activated and the following steps are taken:
- The /etc/init process is started. It reads the /etc/inittab file and calls rc.boot with argument 3.
- The /tmp file system is mounted.
- The rootvg is synchronized by calling the syncvg command and launching it as a background process. As a result, all stale partitions from rootvg are updated. At this stage, the LED code 553 is shown.
- At this stage, the cfgmgr command is called. If the system is booted in normal mode, it is called with option -p2; if the system is booted in service mode, it is called with option -p3. The cfgmgr command reads the Config_Rules class from the ODM and calls all methods corresponding to either phase=2 or phase=3. All other devices that are not base devices are configured at this time.
- Next, the console is configured by calling the cfgcon command. After the configuration of the console, boot messages are sent to the console if no STDOUT redirection is made. However, all missed messages can be found in /var/adm/ras/conslog. LED codes that can be displayed at this time are:
  – c31: Console not yet configured. Provides instructions to select console.
  – c32: Console is an LFT terminal.
  – c33: Console is a TTY.
  – c34: Console is a file on the disk.
- Finally, the synchronization of the ODM in the BLV with the ODM from the / (root) file system is done by the savebase command.
- The syncd daemon and errdemon are started.
- The LED display is turned off.
- If the file /etc/nologin exists, it is removed.
- If there are devices marked as missing in CuDv, a message is displayed on the console.
- The message System initialization completed is sent to the console. The execution of rc.boot has completed. Process init continues processing the next command from /etc/inittab.
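
When a boot stalls, the code left on the LED panel indicates which step was in progress. The following is a minimal sketch that collects the codes named in the three boot phases above into one lookup. The function name led_meaning is hypothetical (it is not an AIX command), and the descriptions only restate the text above; codes not mentioned here are reported as unknown.

```shell
#!/bin/sh
# led_meaning: hypothetical helper, not part of AIX.
# Maps an LED code mentioned in the boot walkthrough to a short description.
led_meaning() {
    case "$1" in
        0299)        echo "kernel loaded and in control" ;;
        c06)         echo "phase 1: init process failed" ;;
        510)         echo "phase 1: restbase succeeded" ;;
        548)         echo "phase 1: restbase failed" ;;
        511)         echo "phase 1: bootinfo -b determines last boot device" ;;
        517)         echo "phase 2: ipl_varyon succeeded" ;;
        552|554|556) echo "phase 2: ipl_varyon failed, boot halted" ;;
        555)         echo "phase 2: fsck of hd4 failed" ;;
        557)         echo "phase 2: mount of hd4 failed" ;;
        518)         echo "phase 2: fsck or mount of /usr failed" ;;
        553)         echo "phase 3: syncvg of rootvg running in background" ;;
        c31)         echo "phase 3: console not yet configured" ;;
        c32)         echo "phase 3: console is an LFT terminal" ;;
        c33)         echo "phase 3: console is a TTY" ;;
        c34)         echo "phase 3: console is a file on the disk" ;;
        *)           echo "unknown: not covered in this section"; return 1 ;;
    esac
}

led_meaning 552
```

A code in the 552, 554, or 556 group points at rootvg activation, so the disks and the BLV are the first things to check.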

The /etc/inittab file

The /etc/inittab file controls the initialization process.

The /etc/inittab file supplies the script to the init command's role as a general process dispatcher. The process that constitutes the majority of the init command's process dispatching activities is the /etc/getty line process, which initiates individual terminal lines. Other processes typically dispatched by the init command are daemons and the shell.

The /etc/inittab file is composed of entries that are position-dependent and have the following format:

Identifier:RunLevel:Action:Command

Each entry is delimited by a newline character. A backslash (\) preceding a newline character indicates the continuation of an entry. There are no limits (other than maximum entry size) on the number of entries in the /etc/inittab file. The maximum entry size is 1024 characters.
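
Because the Command field can itself contain colons, a script that reads an entry should split only on the first three. The following is a minimal sketch of that idea. The function name parse_inittab_entry and the it_* variable names are hypothetical, and continuation lines ending in a backslash are not handled.

```shell
#!/bin/sh
# parse_inittab_entry: hypothetical helper, not part of AIX.
# Splits one entry into Identifier, RunLevel, Action, and Command.
parse_inittab_entry() {
    # With IFS set to ":", read fills the first three variables and
    # leaves the remainder, including any further colons, in the last.
    IFS=: read -r it_id it_level it_action it_command <<EOF
$1
EOF
}

parse_inittab_entry 'xcmd:2:respawn:find / -type f > /dev/null 2>&1'
echo "identifier=$it_id runlevel=$it_level action=$it_action"
echo "command=$it_command"
```

An empty RunLevel field (two adjacent colons) leaves it_level empty rather than shifting the later fields.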

The entry fields are:

Identifier  A one to fourteen character field that uniquely identifies an object.
RunLevel  The run level at which this entry can be processed. The run level has the following attributes:
  – Run levels effectively correspond to a configuration of processes in the system.
  – Each process started by the init command is assigned one or more run levels in which it can exist.
  – Run levels are represented by the numbers 0 through 9. For example, if the system is in run level 1, only those entries with a 1 in the run-level field are started.
  – When you request the init command to change run levels, all processes without a matching entry in the run-level field for the target run level receive a warning signal (SIGTERM). There is a 20-second grace period before processes are forcibly terminated by the kill signal (SIGKILL).
  – The run-level field can define multiple run levels for a process by selecting more than one run level in any combination from 0 through 9. If no run level is specified, the process is assumed to be valid at all run levels.
  – There are four other values that appear in the run-level field, even though they are not true run levels: a, b, c, and h. Entries that have these characters in the run-level field are processed only when the telinit command requests them to be run (regardless of the current run level of the system). They differ from run levels in that the init command can never enter run level a, b, c, or h. Also, a request for the execution of any of these processes does not change the current run level. Furthermore, a process started by an a, b, or c command is not killed when the init command changes levels. Such processes are killed only if their line in the /etc/inittab file is marked off in the action field, their line is deleted entirely from /etc/inittab, or the init command goes into single-user mode.
Action  Tells the init command how to treat the process specified in the Command field. The following actions are recognized by the init command:
  respawn  If the process does not exist, start the process. Do not wait for its termination (continue scanning the /etc/inittab file). Restart the process when it dies. If the process exists, do nothing and continue scanning the /etc/inittab file.
  wait  When the init command enters the run level that matches the entry's run level, start the process and wait for its termination. All subsequent reads of the /etc/inittab file, while the init command is in the same run level, cause the init command to ignore this entry.
  once  When the init command enters a run level that matches the entry's run level, start the process, and do not wait for termination. When it dies, do not restart the process. When the system enters a new run level, and the process is still running from a previous run level change, the program is not restarted.
  boot  Process the entry only during system boot, which is when the init command reads the /etc/inittab file during system startup. Start the process, do not wait for its termination, and when it dies, do not restart the process. In order for the instruction to be meaningful, the run level should be the default or it must match the init command's run level at boot time. This action is useful for an initialization function following a hardware reboot of the system.
  bootwait  Process the entry the first time that the init command goes from single-user to multi-user state after the system is booted. Start the process, wait for its termination, and when it dies, do not restart the process. If the initdefault is 2, run the process right after boot.
  powerfail  Execute the process associated with this entry only when the init command receives a power fail signal (SIGPWR).
  powerwait  Execute the process associated with this entry only when the init command receives a power fail signal (SIGPWR), and wait until it terminates before continuing to process the /etc/inittab file.
  off  If the process associated with this entry is currently running, send the warning signal (SIGTERM), and wait 20 seconds before terminating the process with the kill signal (SIGKILL). If the process is not running, ignore this entry.
  ondemand  Functionally identical to respawn, except this action applies to the a, b, or c values, not to run levels.
  initdefault  An entry with this action is only scanned when the init command is initially invoked. The init command uses this entry, if it exists, to determine which run level to enter initially. It does this by taking the highest run level specified in the run-level field and using that as its initial state. If the run-level field is empty, this is interpreted as 0123456789; therefore, the init command enters run level 9. Additionally, if the init command does not find an initdefault entry in the /etc/inittab file, it requests an initial run level from the user at boot time.
  sysinit  Entries of this type are executed before the init command tries to access the console before login. It is expected that this entry will only be used to initialize devices on which the init command might try to ask the run level question. These entries are executed and waited for before continuing.
Command  A shell command to execute. The entire command field is prefixed with exec and passed to a forked sh as the sh -c exec command. Any legal sh command syntax can appear in this field. Comments can be inserted with the # comment syntax.
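
Two of the rules above are easy to misread: an empty run-level field matches every run level, and initdefault picks the highest level listed. The following is a minimal sketch of both rules. The function names runlevel_matches and initdefault_level are hypothetical, and the special values a, b, c, and h are deliberately not handled, since they are processed only on a telinit request.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the RunLevel rules; not part of AIX.

# Succeeds if an entry whose run-level field is $1 applies at level $2.
# An empty field means the entry is valid at all run levels.
runlevel_matches() {
    case "$1" in
        "")     return 0 ;;
        *"$2"*) return 0 ;;
        *)      return 1 ;;
    esac
}

# Prints the initial run level for an initdefault entry: the highest
# digit in the field, or 9 when the field is empty (read as 0123456789).
initdefault_level() {
    field=${1:-0123456789}
    for level in 9 8 7 6 5 4 3 2 1 0; do
        case "$field" in
            *"$level"*) echo "$level"; return 0 ;;
        esac
    done
    return 1
}

runlevel_matches 23456789 2 && echo "entry runs at level 2"
initdefault_level 2
initdefault_level ""
```

The last call prints 9, which is why leaving the run-level field of an initdefault entry empty is rarely what an administrator intends.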

The getty command overwrites the output of any commands that appear before it in the /etc/inittab file. To record the output of these commands to the boot log, pipe their output to the alog -tboot command.

The stdin, stdout, and stderr file descriptors may not be available while the init command is processing inittab entries. Any entries writing to stdout or stderr may not work predictably unless they redirect their output to a file or to /dev/console.

The following commands are the only supported methods for modifying the records in the /etc/inittab file:

mkitab  Adds records to the /etc/inittab file.
lsitab  Lists records in the /etc/inittab file.
chitab  Changes records in the /etc/inittab file.
rmitab  Removes records from the /etc/inittab file.
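
Before handing a record to one of these commands, a script might check it against the limits stated earlier: four colon-separated fields, an identifier of one to fourteen characters, and at most 1024 characters overall. The following is a minimal sketch of such a pre-check. The function name valid_inittab_record is hypothetical, and the check says nothing about whether the action or command is meaningful.

```shell
#!/bin/sh
# valid_inittab_record: hypothetical pre-check, not part of AIX.
# Succeeds if $1 has the shape Identifier:RunLevel:Action:Command
# and respects the size limits described in the text.
valid_inittab_record() {
    record=$1
    id=${record%%:*}    # everything before the first colon

    case "$record" in
        *:*:*:*) ;;     # at least three colons, so four fields
        *) echo "expected Identifier:RunLevel:Action:Command" >&2
           return 1 ;;
    esac
    if [ "${#id}" -lt 1 ] || [ "${#id}" -gt 14 ]; then
        echo "identifier must be 1 to 14 characters" >&2
        return 1
    fi
    if [ "${#record}" -gt 1024 ]; then
        echo "entry exceeds 1024 characters" >&2
        return 1
    fi
    return 0
}

valid_inittab_record 'xcmd:2:respawn:find / -type f > /dev/null 2>&1' \
    && echo "record looks well-formed"
```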

For example, suppose you want to add a record to the /etc/inittab file that runs the find command at run level 2 and restarts it each time it finishes:
1. Run the ps command and display only those processes that contain the word find:
   # ps -ef | grep find
   root 19750 13964 0 10:47:23 pts/0 0:00 grep find
   #
2. Add a record named xcmd to /etc/inittab using the mkitab command:
   # mkitab "xcmd:2:respawn:find / -type f > /dev/null 2>&1"
3. Show the new record with the lsitab command:
   # lsitab xcmd
   xcmd:2:respawn:find / -type f > /dev/null 2>&1
   #
4. Display the processes:
   # ps -ef | grep find
   root 25462 1 6 10:56:58 - 0:00 find / -type f
   root 28002 13964 0 10:57:00 pts/0 0:00 grep find
   #
5. Cancel the find command process:
   # kill 25462
6. Display the processes:
   # ps -ef | grep find
   root 23538 13964 0 10:58:24 pts/0 0:00 grep find
   root 28966 1 4 10:58:21 - 0:00 find / -type f
   #

Since the action field is configured as respawn, a new process (28966, in this example) is started each time its predecessor finishes.

The process continues respawning unless you change the action field, for example:

1. Change the action field of the record xcmd from respawn to once:
   # chitab "xcmd:2:once:find / -type f > /dev/null 2>&1"
2. Display the processes:
   # ps -ef | grep find
   root 20378 13964 0 11:07:20 pts/0 0:00 grep find
   root 28970 1 4 11:05:46 - 0:03 find / -type f
3. Cancel the find command process:
   # kill 28970
4. Display the processes:
   # ps -ef | grep find
   root 28972 13964 0 11:07:33 pts/0 0:00 grep find
   #

To delete this record from the /etc/inittab file, use the rmitab command. For example:

   # rmitab xcmd
   # lsitab xcmd
   #

Order of the /etc/inittab entries

The base process entries in the /etc/inittab file are ordered as follows:

1. initdefault
2. sysinit
3. Powerfailure Detection (powerfail)
4. Multiuser check (rc)
5. /etc/firstboot (fbcheck)
6. System Resource Controller (srcmstr)
7. Start TCP/IP daemons (rctcpip)
8. Start NFS daemons (rcnfs)
9. cron
10. pb cleanup (piobe)
11. getty for the console (cons)

The System Resource Controller (SRC) has to be started near the beginning of the /etc/inittab file, since the SRC daemon is needed to start other processes. Since NFS requires TCP/IP daemons to run correctly, TCP/IP daemons are started ahead of the NFS daemons. The entries in the /etc/inittab file are ordered according to dependencies: if a process (process2) requires that another process (process1) be present for it to operate normally, then the entry for process1 comes before the entry for process2 in the /etc/inittab file.
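
The dependency rule can be checked mechanically after editing the file. The following is a minimal sketch that reads inittab-style lines on standard input and reports whether one identifier is listed before another. The function name comes_before is hypothetical, and the sample entries are illustrative placeholders rather than a copy of a real /etc/inittab.

```shell
#!/bin/sh
# comes_before: hypothetical helper, not part of AIX.
# Reads inittab-style lines on stdin and succeeds only if the entry with
# identifier $1 appears before the entry with identifier $2.
comes_before() {
    awk -F: -v first="$1" -v second="$2" '
        $1 == first  && !pos1 { pos1 = NR }
        $1 == second && !pos2 { pos2 = NR }
        END { exit !(pos1 && pos2 && pos1 < pos2) }
    '
}

# Sample entries; the command fields are placeholders for illustration.
sample='srcmstr:23456789:respawn:/usr/sbin/srcmstr
rctcpip:23456789:wait:/etc/rc.tcpip
rcnfs:23456789:wait:/etc/rc.nfs'

if printf '%s\n' "$sample" | comes_before rctcpip rcnfs; then
    echo "rctcpip is listed before rcnfs"
fi
```

On a live system the same function could be fed the output of lsitab -a instead of the sample text. The check also fails when either identifier is missing, which is usually the safer outcome.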